arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

"value cannot be null" Exception Message


Top 50 Contributor
11 Posts
bscott posted on Wed, May 25 2011 10:10 AM

When I have SaveDiscoveredWebPagesToDisk and SaveDiscoveredFilesToDisk set to false, the exception below shows up regularly in the Exceptions table (I believe once for every CrawlRequest). It appears that the crawler tries to add a file to the Lucene.NET index even though it isn't saving files to the hard drive.

AbsoluteUri1: http://localhost/Default.aspx

AbsoluteUri2: http://localhost/Default.aspx

HelpLink:  NULL

Message: value cannot be null

Source: Lucene.Net

StackTrace:

at Lucene.Net.Documents.Field..ctor(String name, Boolean internName, String value_Renamed, Store store, Index index, TermVector termVector)    

at Lucene.Net.Documents.Field..ctor(String name, String value_Renamed, Store store, Index index, TermVector termVector)

at Lucene.Net.Documents.Field..ctor(String name, String value_Renamed, Store store, Index index)

at Arachnode.Plugins.CrawlActions.ManageLuceneDotNetIndexes.CreateDocument(Document document, Int64 discoveryID, DiscoveryType discoveryType, String absoluteUri, String contentToIndex, Int32 codePage, String fullTextIndexType, String discoveryPath, Int32 threadNumber) in C:\Inetpub\wwwroot\Stateside-WebCrawler-v2.5\Arachnode\Plugins\CrawlActions\ManageLuceneDotNetIndexes.cs:line 450    

at Arachnode.Plugins.CrawlActions.ManageLuceneDotNetIndexes.PerformAction(CrawlRequest crawlRequest, ArachnodeDAO arachnodeDAO) in C:\Inetpub\wwwroot\Stateside-WebCrawler-v2.5\Arachnode\Plugins\CrawlActions\ManageLuceneDotNetIndexes.cs:line 414    

at Arachnode.SiteCrawler.Managers.ActionManager.PerformCrawlActions(CrawlRequest crawlRequest, CrawlActionType crawlActionType, ArachnodeDAO arachnodeDAO) in C:\Inetpub\wwwroot\Stateside-WebCrawler-v2.5\Arachnode\SiteCrawler\Managers\ActionManager.cs:line 291


Answered (Verified) Verified Answer

Top 50 Contributor
11 Posts
Answered (Verified) bscott replied on Wed, May 25 2011 10:43 AM
Verified by bscott

Would turning off the ManageLuceneDotNetIndexes CrawlAction resolve this? I intend to use arachnode.net to crawl and then do custom processing after each CrawlRequest completes, so I don't think I need that feature.

If this is correct, I incorrectly labelled this as a bug.

All Replies

Top 10 Contributor
1,905 Posts

It would. You are correct: turning off the ManageLuceneDotNetIndexes CrawlAction will fix the exception.

The Lucene.NET indexing needs to be able to pull the file from disk so that the web project will function properly.
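
If you want to keep an indexing step enabled while not saving to disk, one option is to skip the document whenever there is nothing to index. The sketch below only illustrates that idea and is not arachnode.net code: `IndexingGuard`, `HasIndexableContent`, `TryIndex` and the `addToIndex` callback are hypothetical names. It assumes that the null value in the stack trace is the content handed to the Lucene.Net `Field` constructor, which the trace suggests but does not prove.

```csharp
using System;

// Hypothetical helper: not part of arachnode.net or Lucene.NET.
public static class IndexingGuard
{
    // True only when there is text that can safely be handed to a field
    // constructor that rejects null values. Empty strings are also skipped,
    // which is a choice made for this sketch.
    public static bool HasIndexableContent(string contentToIndex)
    {
        return !string.IsNullOrEmpty(contentToIndex);
    }

    // Calls addToIndex only when content is present. Returns false when the
    // document was skipped because there was nothing to index.
    public static bool TryIndex(string absoluteUri, string contentToIndex,
                                Action<string, string> addToIndex)
    {
        if (addToIndex == null)
        {
            throw new ArgumentNullException("addToIndex");
        }

        if (!HasIndexableContent(contentToIndex))
        {
            Console.WriteLine("Skipping index entry for " + absoluteUri + ": no content to index.");
            return false;
        }

        addToIndex(absoluteUri, contentToIndex);
        return true;
    }
}
```

A caller would pass whatever actually builds the index entry as `addToIndex`, so the field is only constructed when the content is non-null.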

Thanks!

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet


copyright 2004-2017, arachnode.net LLC