arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL 2005/2008/CE

crawl URL Error

TileCheng posted on Tue, Feb 17 2009 9:31 PM

When I crawl a URL (e.g. http://www.modernfurniture.net.cn/shownew.asp?id=93), I receive the error "Disallowed by Address".

How can I resolve it? Thanks.

Verified Answer

Verified by arachnode.net

See this post: http://arachnode.net/forums/t/127.aspx

Also, check the DisallowedAbsoluteUris table. The page may have registered an error on an earlier crawl and is therefore no longer allowed to be crawled.

Also, look at IsDisallowed.cs.  This code is the core of the Address.cs CrawlRule.
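
As a rough sketch of the table check suggested above, queries along these lines could be run against the arachnode.net database. Only the table names come from this thread; the dbo schema and the row limit are assumptions, and no column names are used because the column layout isn't shown here.

```sql
-- Assumption: the tables live in the default dbo schema.
-- List some rows from the table of disallowed URLs and look for the
-- URL that was rejected and the reason recorded for it.
SELECT TOP 100 *
FROM dbo.DisallowedAbsoluteUris;

-- Check whether the crawl of that URL also logged an exception.
SELECT TOP 100 *
FROM dbo.Exceptions;
```

If the URL shows up in DisallowedAbsoluteUris, the recorded row should indicate why the Address rule rejected it.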

Thanks!
Mike

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet


copyright 2004-2013, arachnode.net LLC