
To build Heritrix in Eclipse


This guide uses Heritrix 1.14.4 (the latest release as of May 10, 2010).

1. First, download the following two archives from http://sourceforge.net/projects/archive-crawler/:
heritrix-1.14.4.zip
heritrix-1.14.4-src.zip

2. Create a Java project in Eclipse, then extract both archives:
heritrix-1.14.4.zip
heritrix-1.14.4-src.zip

3. From the extracted heritrix-1.14.4-src.zip, copy the three folders com, org, and st under src/java into the project's src folder.
4. From the extracted heritrix-1.14.4-src.zip, copy the conf folder under src into the project root directory.
5. From the extracted heritrix-1.14.4-src.zip, copy the lib folder into the project root directory.
6. From the extracted heritrix-1.14.4-src.zip, copy the file tlds-alpha-by-domain.txt from src/resources/org/archive/util into the project's org.archive.util package.
7. From the extracted heritrix-1.14.4.zip, copy the webapps folder into the project root directory.
If you give the folder a name other than webapps, change getWarsdir() in Heritrix.java accordingly:

    /**
     * @throws IOException
     * @return Returns the directory under which reside the WAR files
     * we're to load into the servlet container.
     */
    public static File getWarsdir()
    throws IOException {
        return getSubDir("webapps");
    }
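After steps 3 to 7 the project root should contain src/com, src/org, src/st, conf, lib, and webapps. The following is a minimal sketch for checking that layout before the first run. The class name LayoutCheckSketch and its methods are made up for this illustration and are not part of Heritrix.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Heritrix: reports which of the
// folders created in steps 3 to 7 are missing from a project root.
final class LayoutCheckSketch {
    // Relative paths the copy steps are expected to produce.
    static final String[] EXPECTED = {
        "src/com", "src/org", "src/st", "conf", "lib", "webapps"
    };

    /** Returns the expected folders that do not exist under projectRoot. */
    static List<String> missing(File projectRoot) {
        List<String> result = new ArrayList<>();
        for (String rel : EXPECTED) {
            if (!new File(projectRoot, rel).isDirectory()) {
                result.add(rel);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pass the Eclipse project directory as the first argument;
        // the current directory is used otherwise.
        File root = new File(args.length > 0 ? args[0] : ".");
        System.out.println("Missing folders: " + missing(root));
    }
}
```

An empty list means every folder from the steps above is in place. If you renamed webapps, adjust the EXPECTED array to match.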


  


8. Edit the configuration: open heritrix.properties in the conf folder and set the following:

    # Set the admin user name and password
    heritrix.cmdline.admin = admin:admin
    # Set the port
    heritrix.cmdline.port = 8080
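To sanity-check the two values before starting Heritrix, they can be read with the standard java.util.Properties class. This is only a sketch of such a check and not how Heritrix itself reads the file. The class HeritrixPropsSketch and its methods are hypothetical names, and the user:password split is an assumption based on the admin:admin value above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Heritrix.
final class HeritrixPropsSketch {
    /** Parses properties text using the standard java.util.Properties rules. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Splits a "user:password" value; returns null if it is malformed. */
    static String[] splitLogin(String value) {
        if (value == null) {
            return null;
        }
        int colon = value.indexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            return null;
        }
        return new String[] {
            value.substring(0, colon).trim(),
            value.substring(colon + 1).trim()
        };
    }

    public static void main(String[] args) {
        Properties p = parse(
            "# Set the admin user name and password\n"
            + "heritrix.cmdline.admin = admin:admin\n"
            + "# Set the port\n"
            + "heritrix.cmdline.port = 8080\n");
        String[] login = splitLogin(p.getProperty("heritrix.cmdline.admin"));
        int port = Integer.parseInt(p.getProperty("heritrix.cmdline.port").trim());
        System.out.println("user=" + login[0] + " port=" + port);
    }
}
```

Note that comment lines in a .properties file start with # or !. A line starting with // is parsed as an ordinary key.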

  


9. Add all the jar files in the lib folder to the project's build path.
10. Right-click org.archive.crawler.Heritrix.java, open its run configuration, and select the Classpath tab.
Select User Entries, then click Advanced.
Select Add Folders and add the conf folder.
Click Run to start Heritrix.

    05:22:32.875 EVENT  Starting Jetty/4.2.23
    05:22:32.937 WARN!! Delete existing temp dir C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\Jetty_127_0_0_1_8080__ for WebApplicationContext[/,jar:file:/D:/workspace/jcjcd/heritrixDemo/webapps/admin.war!/]
    05:22:33.062 EVENT  Started WebApplicationContext[/,Heritrix Console]
    05:22:33.156 EVENT  Started SocketListener on 127.0.0.1:8080
    05:22:33.156 EVENT  Started org.mortbay.jetty.Server@1f6f0bf
    Heritrix version: @VERSION@
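If you want to confirm from code that the console is really listening on the address shown in the log, a plain socket connection is enough. The sketch below uses only java.net. The class name ConsolePortCheckSketch is hypothetical, and the host and port are taken from the log output above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Heritrix.
final class ConsolePortCheckSketch {
    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // 127.0.0.1:8080 matches the SocketListener line in the log.
        System.out.println("Console reachable: " + isListening("127.0.0.1", 8080, 1000));
    }
}
```

This only shows that something accepts connections on the port. It does not verify that the listener is the Heritrix console.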

  


This completes the Heritrix setup in Eclipse.

Now we can create a job for testing.


1. Open http://127.0.0.1:8080 in your browser and log in with the user name and password set in the configuration file.
2. Next, create a job: select Jobs in the navigation menu, then under Create New Job select With defaults.

3. Fill in the name, the description, and the URL to crawl.
4. Select Modules. Here we save the crawl results as a mirror of the site; the default writer stores them in compressed files instead. Under Select Writers, remove org.archive.crawler.writer.ARCWriterProcessor and add org.archive.crawler.writer.MirrorWriterProcessor.
5. Select Settings and go to the bottom of the page. Many items can be set here, such as the maximum number of threads and timeouts.
Two items under http-headers must be set:
user-agent: Mozilla/5.0 (compatible; heritrix/@VERSION@ +PROJECT_URL_HERE)
from: CONTACT_EMAIL_ADDRESS_HERE

I simply replaced @VERSION@ with the Heritrix version, replaced PROJECT_URL_HERE with an http:// URL using my local IP, and entered an arbitrary email address for CONTACT_EMAIL_ADDRESS_HERE. Once the configuration is complete, select Submit job.
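The placeholder substitution described above can be sketched in a few lines of Java. The class HttpHeaderSketch is a hypothetical name. The two regular expressions are my assumption about the general shape Heritrix expects (a +http URL inside parentheses, and an email-like address), so treat them as a rough pre-check and not as the exact rule Heritrix applies. The URL and email in main are placeholders.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Heritrix.
final class HttpHeaderSketch {
    // Default user-agent template shown in the settings page.
    static final String TEMPLATE =
        "Mozilla/5.0 (compatible; heritrix/@VERSION@ +PROJECT_URL_HERE)";

    // Assumed shapes: a "+http(s)://host.tld" URL inside parentheses,
    // and "name@host.tld" for the from header.
    static final Pattern USER_AGENT_SHAPE =
        Pattern.compile("\\S+.*\\(.*\\+https?://\\S+\\.\\S+.*\\).*");
    static final Pattern FROM_SHAPE = Pattern.compile("\\S+@\\S+\\.\\S+");

    /** Replaces both placeholders in the template. */
    static String fillUserAgent(String version, String projectUrl) {
        return TEMPLATE.replace("@VERSION@", version)
                       .replace("PROJECT_URL_HERE", projectUrl);
    }

    /** Returns true if both header values match the assumed shapes. */
    static boolean looksValid(String userAgent, String from) {
        return USER_AGENT_SHAPE.matcher(userAgent).matches()
            && FROM_SHAPE.matcher(from).matches();
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        String userAgent = fillUserAgent("1.14.4", "http://127.0.0.1");
        System.out.println(userAgent);
        System.out.println(looksValid(userAgent, "crawler@example.com"));
    }
}
```

With the placeholders left untouched, looksValid returns false, which is a quick way to catch a job that was submitted without editing these two headers.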


6. Go to Console and click Start to begin crawling the job.
When the crawl completes, the results can be found in the jobs folder under the project.
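To see what the crawl actually wrote, you can walk the jobs folder recursively. The sketch below uses only java.io.File. The class name MirrorListingSketch is hypothetical, and the default path "jobs" assumes the program is run from the project root.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Heritrix.
final class MirrorListingSketch {
    /** Recursively collects all regular files under dir. */
    static List<File> listFiles(File dir) {
        List<File> found = new ArrayList<>();
        // listFiles() returns null if dir is missing or not a directory.
        File[] children = dir.listFiles();
        if (children == null) {
            return found;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                found.addAll(listFiles(child));
            } else {
                found.add(child);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumes the working directory is the Eclipse project root.
        File jobs = new File(args.length > 0 ? args[0] : "jobs");
        for (File f : listFiles(jobs)) {
            System.out.println(f.getPath());
        }
    }
}
```

An empty result means either the folder does not exist at that path or the crawl has not written anything yet.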


Source: http://www.codeweblog.com/to-build-heritrix-in-eclipse/
