Hello all:
It is release time again, and time to list everything that is new and improved.
- Fixed a bug in IsDisallowed.cs where Disallowed* wasn't being processed properly under certain conditions.
- Added Web\Test.aspx for use in performance testing and crawl verification.
- Fixed a bug in EXIFExtractor.cs that would throw a NullReferenceException (NRE) under certain conditions.
- Added Utilities.Web functions for resetting the ASP.NET web servers as well as IIS.
- Updated ManageLuceneDotNetIndexes.cs to use Lucene.Net 2.9.1.2.
- Added CrawlMode (BreadthFirst or DepthFirst) to the Crawler.
- Added additional tests for Crawler.cs.
- Added IEnumerable to PriorityQueue.cs, enabling foreach.
- Added HttpWebRequestRetries, allowing arachnode.net (AN) to retry unresponsive WebPages at the end of the normal crawl cycle.
- Changed the 'Platform Target' to x86 for DEBUG to allow 'Edit and Continue' on x64-based systems.
- Updated API Documentation using the latest version of NDoc.
- Added 'Renderer', allowing complete rendering of AJAX/JavaScript-based sites and content, and allowing dynamic DOM interaction.
- Added EngineActions for populating CrawlRequests from alternate sources.
- Added Storable.cs to illustrate how to use the new AStorable functionality.
- Updated the Lucene.Net Highlighter functionality to provide better results when search text is found in HTML tags or SCRIPT.
- Added CustomManageLuceneDotNetIndexes.cs, illustrating how to implement custom fields in the Lucene.Net indexes.
- Added DiscoveryChain.cs, allowing explicit illustration of how a Discovery was ultimately found.
- Improved Lucene.Net indexing speed and AutoCommit functionality.
- Added ExceptionSeverity designation for configuration exceptions.
- Improved the Service, and added facilities to recrawl a seed list of AbsoluteUris from the Service.
- Added additional Application Log events for Engine State changes.
- Added helper code to Console\Program.cs to enhance the DEMO experience.
- Added SiteCrawler\Actions\Renderer.cs, illustrating how to interact with the DOM.
- Improved Cache handling in WebClient.cs.
- Improved Cookie handling in WebClient.cs, allowing the currently logged-on user's cookies to be submitted with each HttpWebRequest.
- Improved Cache/Discovery handling under low-memory conditions.
- Added DYNAMIC enabling/disabling of CrawlActions/CrawlRules/EngineActions at crawl time.
- Implemented AStorable.cs, which allows for selective storage of content while still allowing crawling to proceed.
- Improved Cache handling when AN cannot locate cached content due to user configuration error or data loss.
- Improved ContentType handling when HttpWebRequests return malformed ContentTypes.
- Improved RegEx handling for HyperLinks.
- Improved Encoding detection when HttpWebRequests return an improper set of Encoding detection attributes.
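
For anyone wondering what the IEnumerable change to PriorityQueue.cs means in practice, here is a minimal sketch. It is not the actual arachnode.net code: SketchPriorityQueue, its Enqueue signature, the "lower number first" ordering, and the example.com URLs are all assumptions made up for illustration. It only shows the general idea that implementing IEnumerable<T> is what makes foreach possible.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for illustration; NOT arachnode.net's PriorityQueue.cs.
public class SketchPriorityQueue<T> : IEnumerable<T>
{
    // Assumption: a lower number means a higher priority.
    // Items that share a priority are kept in insertion order.
    private readonly SortedDictionary<double, Queue<T>> _buckets =
        new SortedDictionary<double, Queue<T>>();

    public void Enqueue(T item, double priority)
    {
        Queue<T> bucket;
        if (!_buckets.TryGetValue(priority, out bucket))
        {
            bucket = new Queue<T>();
            _buckets[priority] = bucket;
        }
        bucket.Enqueue(item);
    }

    // Implementing IEnumerable<T> is what lets callers use foreach.
    // Enumeration walks the buckets in ascending priority order
    // and does not remove anything from the queue.
    public IEnumerator<T> GetEnumerator()
    {
        foreach (KeyValuePair<double, Queue<T>> pair in _buckets)
        {
            foreach (T item in pair.Value)
            {
                yield return item;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class Program
{
    public static void Main()
    {
        SketchPriorityQueue<string> queue = new SketchPriorityQueue<string>();
        queue.Enqueue("http://example.com/b", 2);
        queue.Enqueue("http://example.com/a", 1);

        // Works because SketchPriorityQueue<T> implements IEnumerable<T>.
        foreach (string absoluteUri in queue)
        {
            Console.WriteLine(absoluteUri);
        }
    }
}
```

In this sketch, enumerating is read-only, so a foreach can be used to inspect what is waiting in the queue without dequeuing it. Whether the real PriorityQueue.cs behaves the same way is something to confirm against the source.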
Posted Thu, Jun 17 2010 4:53 PM by arachnode.net