
UriClassificationType does not exist in current context

Answered (Verified) | 1 Reply | 2 Followers

jon555 posted on Tue, Sep 1 2009 7:25 AM

In the Console project, the test crawl's new CrawlRequest fails with "UriClassificationType does not exist in the current context."

#region crawl test code

_crawler = new Crawler();

if (_crawler.QueryProcessor != null)
{
    _crawler.QueryProcessor.OnQuerySuccessfullyProcessed += QueryProcessor_OnQuerySuccessfullyProcessed;
}

if (_crawler.Engine != null)
{
    _crawler.Engine.OnQuerySuccessfullySubmitted += Engine_OnQuerySuccessfullySubmitted;
    _crawler.Engine.OnCrawlCompleted += Engine_OnCrawlCompleted;

    //Run [arachnode_usp_arachnode.net_RESET_DATABASE] to completely reset the database, if desired.
    //ArachnodeDAO arachnodeDAO = new ArachnodeDAO();
    //arachnodeDAO.ExecuteSql("EXEC [dbo].[arachnode_usp_arachnode.net_RESET_DATABASE]");

    //You can logically OR the UriClassificationTypes to set what a CrawlRequest crawls.
    //This example crawls only the 'arachnode.net' domain, crawls only .aspx pages (WebPages), and processes/stores only discoveries (EmailAddresses, HyperLinks, Files, Images) that match the submitted CrawlRequest's FileExtension.
    //Setting the Depth to int.MaxValue means crawl the first page, and then int.MaxValue - 1 hops away from the initial CrawlRequest AbsoluteUri - so, the entire site.
    //You can stop a Crawl and the Crawl will be saved to the CrawlRequests table.
    _crawler.Crawl(new CrawlRequest(new Discovery("http://arachnode.net/Default.aspx"), int.MaxValue, UriClassificationType.Domain | UriClassificationType.FileExtension, UriClassificationType.Domain | UriClassificationType.FileExtension, 1));

    _stopwatch.Start();

    _crawler.Engine.Start();
}

System.Console.ReadLine();

#endregion
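For reference, the comments above describe how the UriClassificationType flags combine. A minimal variant of the same call, using only the members that appear in this thread (Domain and FileExtension) and the same CrawlRequest constructor, is sketched below; treat it as illustrative rather than release-specific guidance.

//Sketch only: same constructor as the call above, but both UriClassificationType
//arguments (the crawl restriction and the discovery processing/storage restriction,
//per the comments above) are set to Domain only, so no FileExtension filter is
//applied. The final argument of 1 is kept as in the original call.
_crawler.Crawl(new CrawlRequest(new Discovery("http://arachnode.net/Default.aspx"), int.MaxValue, UriClassificationType.Domain, UriClassificationType.Domain, 1));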

Verified Answer


Do you know how to add a reference to an object? The Console project's using statements were likely cleaned after the Crawl statement was commented out.
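For anyone else hitting this: since the using statements were likely removed, restoring the directive (or fully qualifying the enum) with the correct namespace resolves the error. The namespace below is a placeholder, not necessarily the one in your release; right-click UriClassificationType in Visual Studio and use Resolve / Go To Definition to find the real one.

//Placeholder namespace -- confirm the actual namespace that declares
//UriClassificationType in your copy of the solution before using it.
using Arachnode.SiteCrawler.Value.Enums;
//Or fully qualify the enum at the call site with the namespace found above.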

Mike

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet
