PHPCrawl can be used to write search crawlers (spiders) that mine Web pages for various information.
PHPCrawl acquires information it was configured to fetch and passes it to more powerful apps for further processing.
Here are some key features of "PHPCrawl":
· Filters for URL and Content-Type data
· Define ways to handle cookies
· Define ways to handle robots.txt files
· Limit its activity in various ways
· Multi-processing modes
Requirements:
· PHP 5 or higher
· PHP with OpenSSL support
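As a rough illustration of how the filtering, cookie, robots.txt and limit options listed above fit together, here is a minimal sketch modelled on the library's bundled example script. It assumes the PHPCrawl archive is unpacked so that libs/PHPCrawler.class.php can be included; the class name MyCrawler, the start URL and the 1 MB traffic limit are placeholders chosen for this example, not values taken from the listing.

```php
<?php
// Assumption: PHPCrawl is unpacked next to this script.
include("libs/PHPCrawler.class.php");

// Hypothetical subclass name; the library calls handleDocumentInfo()
// once for every document the crawler requests.
class MyCrawler extends PHPCrawler
{
    function handleDocumentInfo($DocInfo)
    {
        // Pass the fetched data on for further processing; here it is only printed.
        echo "Page requested: " . $DocInfo->url
           . " (" . $DocInfo->http_status_code . ")\n";
        if ($DocInfo->received == true) {
            echo "Content received: " . $DocInfo->bytes_received . " bytes\n";
        }
    }
}

$crawler = new MyCrawler();

// Placeholder start URL.
$crawler->setURL("www.example.com");

// Content-Type filter: only receive HTML documents.
$crawler->addContentTypeReceiveRule("#text/html#");

// URL filter: do not follow links to common image files.
$crawler->addURLFilterRule('#\.(jpg|jpeg|gif|png)$#i');

// Cookie handling: store cookies and send them back like a browser would.
$crawler->enableCookieHandling(true);

// robots.txt handling: respect the rules published by the site.
$crawler->obeyRobotsTxt(true);

// Activity limit: stop after roughly 1 MB of received traffic (placeholder value).
$crawler->setTrafficLimit(1000 * 1024);

// Start crawling in the default single-process mode.
$crawler->go();

// Summarize what happened during the run.
$report = $crawler->getProcessReport();
echo "Links followed: " . $report->links_followed . "\n";
echo "Documents received: " . $report->files_received . "\n";
echo "Bytes received: " . $report->bytes_received . " bytes\n";
```

Both filter rules are ordinary Perl-compatible regular expressions, so they can be tightened or loosened without touching the rest of the setup. The script is meant to be run from the command line and needs network access to the target site.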
What's New in This Release:
· Links that are partially URL-encoded and partially not are now rebuilt and encoded correctly.
· Removed an unnecessary debug var_dump() from PHPCrawlerRobotsTxtParser.class.php.
· Server Name Indication (SNI) in TLS/SSL now works correctly.
· "base-href" tags in websites are interpreted correctly again.