arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE or MongoDB/RavenDB/Hadoop

no robots.txt

Roel posted on Mon, Mar 16 2009 11:19 AM

Hi,

When there is no robots.txt, will arachnode crawl the page?

I'm getting a 'no robots.txt' exception, and no URLs are showing up.

Thanks!

Roel

Verified Answer

Verified by arachnode.net

arachnode.net will crawl the site if no robots.txt file is present.
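To illustrate the general convention behind that statement (a missing robots.txt means nothing is disallowed), here is a minimal Python sketch using the standard library's `urllib.robotparser`. It is not arachnode.net code and says nothing about how the RobotsDotText rule is implemented; the helper name `is_allowed` and the example.com URLs are hypothetical, and an empty list of lines stands in for a site with no robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, user_agent="*"):
    """Return True if `url` may be fetched under the given robots.txt lines.

    `is_allowed` is a hypothetical helper for illustration only.
    An empty list stands in for a site that has no robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network access
    return parser.can_fetch(user_agent, url)


# No robots.txt at all: no rules exist, so every URL is allowed.
no_robots = []

# A robots.txt that blocks one directory for all user agents.
with_rules = ["User-agent: *", "Disallow: /private/"]

if __name__ == "__main__":
    print(is_allowed(no_robots, "http://example.com/page"))
    print(is_allowed(with_rules, "http://example.com/private/x"))
    print(is_allowed(with_rules, "http://example.com/public"))
```

If a crawler reports an error or returns no URLs when the file is absent, that points to a problem in how the missing file is handled, not to the site blocking the crawl.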

We have a build of 1.1 in review right now. The latest check-in should be viable if you're running a release from SourceForge or CodePlex.

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet

All Replies

Roel replied on Mon, Mar 16 2009 11:22 AM

Solved!!

In CrawlRules, set isenabled to false for

Arachnode.SiteCrawler.Rules.RobotsDotText


copyright 2004-2017, arachnode.net LLC