arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Completely Open Source @ GitHub

Browse Site by Tags

Showing related tags and posts across the entire site.
  • Crawl Restriction and dbo.Domain table

    Hi There - I have two questions: 1. I am trying to figure out why the crawler is not populating the dbo.domain and dbo.domain_discoveries tables. I suspect it's one of the application settings, but I have not been able to figure out which one. 2. How can a crawl request be restricted to a specific URL and...
    Posted to General Questions by rlink12 on Wed, Nov 13 2013
  • 404 is returned because trailing slash is not used

    When I crawl this site: http://www.jenkinskling.com the following response is returned: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <title>R a z o r B a l l</title> <meta http-equiv="Content-Type" content="text...
    Posted to General Questions by canuckbbp on Thu, Nov 10 2011
  • Operations: selecting and managing a collection of crawl requests

    Hi Mike and all, I'm evaluating AN for a client and my goal is to crawl a collection of URIs on a regular schedule. Essentially I will run AN as a service and schedule crawls for various sites by specifying those site URIs to AN. My understanding is that the table CrawlRequests is working memory...
    Posted to General Questions by jamesy on Wed, Jul 6 2011
  • crawling specific web sites for tag words

    Thanks a lot for this open source venture. I am trying to come up with a system that crawls specific sites (maybe 4 or 5) for specific tag words. As per the requirement, I would have to keep the crawl restricted to within the web site. Is it possible to achieve this with arachnode.net? I have downloaded &...
    Posted to General Questions by dbs2000 on Fri, Jul 31 2009
copyright 2004-2017, arachnode.net LLC