When I have SaveDiscoveredWebPagesToDisk and SaveDiscoveredFilesToDisk both set to false, I regularly get the exception below in the Exceptions table (I believe for every CrawlRequest). It appears that Arachnode tries to add a file to the Lucene.net index even though it isn't saving the files to the hard drive.
AbsoluteUri1: http://localhost/Default.aspx
AbsoluteUri2: http://localhost/Default.aspx
HelpLink: NULL
Message: value cannot be null
Source: Lucene.Net
StackTrace:
at Lucene.Net.Documents.Field..ctor(String name, Boolean internName, String value_Renamed, Store store, Index index, TermVector termVector)
at Lucene.Net.Documents.Field..ctor(String name, String value_Renamed, Store store, Index index, TermVector termVector)
at Lucene.Net.Documents.Field..ctor(String name, String value_Renamed, Store store, Index index)
at Arachnode.Plugins.CrawlActions.ManageLuceneDotNetIndexes.CreateDocument(Document document, Int64 discoveryID, DiscoveryType discoveryType, String absoluteUri, String contentToIndex, Int32 codePage, String fullTextIndexType, String discoveryPath, Int32 threadNumber) in C:\Inetpub\wwwroot\Stateside-WebCrawler-v2.5\Arachnode\Plugins\CrawlActions\ManageLuceneDotNetIndexes.cs:line 450
at Arachnode.Plugins.CrawlActions.ManageLuceneDotNetIndexes.PerformAction(CrawlRequest crawlRequest, ArachnodeDAO arachnodeDAO) in C:\Inetpub\wwwroot\Stateside-WebCrawler-v2.5\Arachnode\Plugins\CrawlActions\ManageLuceneDotNetIndexes.cs:line 414
at Arachnode.SiteCrawler.Managers.ActionManager.PerformCrawlActions(CrawlRequest crawlRequest, CrawlActionType crawlActionType, ArachnodeDAO arachnodeDAO) in C:\Inetpub\wwwroot\Stateside-WebCrawler-v2.5\Arachnode\SiteCrawler\Managers\ActionManager.cs:line 291
Would turning off the ManageLuceneDotNetIndexes CrawlAction resolve this? Assuming that I intend to use Arachnode to crawl and do custom processing after each CrawlRequest is complete, I shouldn't need that feature, I think.
If this is correct, I incorrectly labelled this as a bug.
Yes, you are correct: turning off the ManageLuceneDotNetIndexes CrawlAction will fix the exception.
The Lucene.NET indexing needs to be able to pull the file from disk so that the web project will function properly.
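If someone would rather keep the action enabled and just skip indexing when nothing was saved, a guard along these lines is one option. This is only a rough sketch: IndexingGuard and ShouldIndex are hypothetical names, not part of Arachnode or Lucene.Net, and it assumes the null value in the trace is the content that would otherwise have been read back from disk.

```csharp
using System;

// Hypothetical helper, not part of Arachnode or Lucene.Net.
static class IndexingGuard
{
    // Returns true only when saving to disk is enabled and there is
    // non-empty content to hand to the indexer.
    public static bool ShouldIndex(bool savedToDisk, string contentToIndex)
    {
        return savedToDisk && !string.IsNullOrEmpty(contentToIndex);
    }
}

static class Program
{
    static void Main()
    {
        // Saving to disk is off and no content is available: skip indexing.
        Console.WriteLine(IndexingGuard.ShouldIndex(false, null));          // False

        // Saving to disk is on and content is present: safe to index.
        Console.WriteLine(IndexingGuard.ShouldIndex(true, "<html></html>")); // True
    }
}
```

A check like this would have to run before the Field is constructed, since the trace shows the Field constructor is what rejects the null value.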
Thanks!