Softpedia
 



    Apache Nutch 2.1 - Changelog


    What's new in Apache Nutch 2.0:

    · Renamed HTMLParseFilter to ParseFilter.
    · Removed remaining robots/IP blocking code in lib-http.
    · Ported logging to SLF4J.
    · External parser supports encoding attribute.
    · Ivy configuration settings don't include Gora.
    · Injector should add the metadata before calling injectedScore.
    · Ported Nutch benchmark to Nutchbase.
    · Added parse-html back.
    · MoreIndexingFilter missing date format.
    · Timeout for Parser.
    · Retry interval in crawl date is set to 0.
    · Generate log output for Solr indexer and dedup.
    · Improved NutchConfiguration.
    · SolrDeleteDuplicates needs to clone the SolrRecord objects.
    · Native hadoop libs not available through maven.
    · Separate the build and runtime environments.
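
    The "Timeout for Parser" entry above gives no implementation detail. As a rough illustration of the general idea only (not Nutch's actual code, which is written in Java), a parser call can be given a time budget by running it in a worker and waiting a bounded time for the result. The names parse_with_timeout, slow_parse and timeout_s are hypothetical and chosen for this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def parse_with_timeout(parse_fn, content, timeout_s):
    """Run parse_fn(content), giving up after timeout_s seconds.

    Returns the parse result, or None if the parser did not finish in time.
    Hypothetical helper for illustration; not part of Nutch.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(parse_fn, content)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The caller stops waiting; the worker thread itself is not killed
        # and keeps running until parse_fn returns.
        return None
    finally:
        # Do not block here waiting for a slow parser to finish.
        executor.shutdown(wait=False)


def slow_parse(content):
    """Stand-in for a parser that hangs on a malformed document."""
    time.sleep(0.3)
    return content.upper()


if __name__ == "__main__":
    print(parse_with_timeout(str.upper, "fast document", timeout_s=1.0))
    print(parse_with_timeout(slow_parse, "slow document", timeout_s=0.05))
```

    Note that this only bounds how long the caller waits. A thread cannot be forcibly stopped this way, so a parser that never returns would still occupy its worker.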



    What's new in Apache Nutch 1.5:

    · This release brings upgrades to several major components, including Tika 1.1 and Hadoop 1.0.0, improvements to the LinkRank and WebGraph elements, and a number of new plugins covering blacklisting, filtering and parsing, among others.



    What's new in Apache Nutch 1.4:

    · Added Solr 4x (trunk) example schema.
    · Added '/runtime' to svn ignore.
    · application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple MIME types for plugin.xml.
    · Fixed parse-tika and parse-html to use relative URL resolution per RFC 3986.
    · Upgraded to Tika 0.10. NOTE: Tika's new RTF parser may ignore more text in malformed documents than previously; see TIKA-748 for details.
    · Added Sonar targets to Ant build.xml.
    · Upgraded SolrJ to version 3.4.0.
    · Ant pmd target is broken.
    · Upgraded Solr schema to version 1.4.
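
    The RFC 3986 entry above concerns how relative links found in a page are turned into absolute URLs. A minimal sketch of that resolution, using Python's standard urllib.parse.urljoin rather than Nutch's own Java code, is shown here. The base URL and relative references come from the examples in RFC 3986 section 5.4.1; the helper name resolve_outlinks is hypothetical.

```python
from urllib.parse import urljoin

# Base URI used by the reference-resolution examples in RFC 3986, section 5.4.1.
BASE = "http://a/b/c/d;p?q"


def resolve_outlinks(base, hrefs):
    """Resolve each relative reference in hrefs against base.

    Hypothetical helper for illustration; not part of Nutch.
    """
    return [urljoin(base, href) for href in hrefs]


if __name__ == "__main__":
    examples = ["g", "../g", "?y", "#s", "//g"]
    for ref, target in zip(examples, resolve_outlinks(BASE, examples)):
        print(f"{ref!r:8} -> {target}")
```

    The query-only reference "?y" is the interesting case: under RFC 3986 it keeps the last path segment of the base (d;p) and replaces only the query, whereas resolvers following the older RFC 2396 rules may drop that segment.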



    What's new in Apache Nutch 1.3:

    · This release includes several improvements: improved RSS parsing support, tighter integration with Apache Tika, external parsing support, improved language identification, and a source release tarball that is an order of magnitude smaller (only about 2 MB).



    What's new in Apache Nutch 1.2:

    · Made the index-more plugin configurable.
    · Configurable file protocol parent directory crawling.
    · Timeout for Parser.
    · Website is still Lucene branded.
    · Retry interval in crawl date is set to 0.



    What's new in Apache Nutch 1.0:

    · Allow parsers to return multiple Parse objects.
    · Removed redundant commons-logging jar from ontology plugin.
    · Bug in SegmentReader causes infinite loop.
    · Scoring filter should distribute score to all outlinks at once.
    · Reduce number of warnings in nutch core.



