arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop


Browse Site by Tags

Showing related tags and posts across the entire site.
  • Re: How to restrict crawl to single domain?

    OK, how about this: if you want to crawl 500 domains, configure arachnode.net to restrict Crawls to only those 500 domains, as the posts above describe. Then make sure your settings in Application.config are set as shown. The Crawl process works like this if you have the settings...
    Posted to General Questions by arachnode.net on Fri, Feb 13 2009
  • Re: How to restrict crawl to single domain?

    After all of the CrawlRequests are crawled, if you have CreateCrawlRequestsFromDatabaseHyperLinks set to true, then arachnode.net will create CrawlRequests from those AbsoluteUris. CrawlRequests are processed in a particular order. a.) CrawlRequests...
    Posted to General Questions by arachnode.net on Wed, Feb 11 2009
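
The domain restriction discussed in the posts above can be illustrated with a small, generic sketch. This is not arachnode.net's API or configuration: the function names (`host_of`, `is_allowed`, `filter_discovered`) and the example domains are hypothetical, and Python is used only to keep the example short. It assumes a hyperlink is in scope when its host is one of the allowed domains or a subdomain of one.

```python
from urllib.parse import urlsplit


def host_of(absolute_uri):
    """Return the lowercased host of an absolute URI, or '' if it has none."""
    return (urlsplit(absolute_uri).hostname or "").lower()


def is_allowed(absolute_uri, allowed_domains):
    """True if the URI's host is an allowed domain or a subdomain of one."""
    host = host_of(absolute_uri)
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def filter_discovered(absolute_uris, allowed_domains):
    """Keep only discovered hyperlinks that stay inside the allowed domains."""
    return [u for u in absolute_uris if is_allowed(u, allowed_domains)]


# Hypothetical allow list and discovered hyperlinks, for illustration only.
allowed = {"example.com", "example.org"}
discovered = [
    "http://example.com/a",
    "http://blog.example.org/post",
    "http://notexample.com/",
    "mailto:someone@example.com",
]

print(filter_discovered(discovered, allowed))
```

Matching on the parsed host rather than on a substring of the whole URI keeps look-alike hosts such as `notexample.com` out of the crawl, and URIs without a host (for example `mailto:` links) are dropped.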

copyright 2004-2017, arachnode.net LLC