arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Completely Open Source @ GitHub

Browse Site by Tags

Showing related tags and posts across the entire site.
  • Re: Simulating POST Request

    This is what you want: http://stackoverflow.com/questions/5401501/how-to-post-data-to-specific-url-using-webclient-in-c-sharp You would likely want to craft this code in a plugin, which you could call externally (from Program.cs) to retrieve your first set of CrawlRequests to crawl. You can use Fiddler...
    Posted to General Questions by arachnode.net on Fri, Nov 16 2012
  • Re: Write with arachnode.net

    Try the ID instead? "ctl00_PersonalizationManager1_WebPartManager1_wp961475462_wp927486962_dvwComment_txtAddedBy" But that does look correct. One thing I did notice when I was toying with the idea of adding a form submission plugin to AN was that many of the sites that I attempted to interact...
    Posted to Feature Requests by arachnode.net on Fri, Apr 23 2010
  • Re: Console disappear/shutdown - memory issues

    You won't believe this:

        internal void ProcessCrawlRequest(CrawlRequest crawlRequest, bool obeyCrawlRules, bool executeCrawlActions)
        {
            if (crawlRequest.Discovery.Uri.AbsolutePath != "/robots.txt")
            {
                crawlRequest.WebClient.Method = "HEAD";
            }
            else
            {
                //always get the robots.txt file...
    Posted to Bug Reports by arachnode.net on Fri, Sep 11 2009
  • Re: Prevent Re-Crawl If No Date Change?

    I believe the WebClient already takes this into account, FWIW, but I need to check to be sure.
    Posted to General Questions by arachnode.net on Mon, Aug 3 2009
copyright 2004-2017, arachnode.net LLC