arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Completely Open Source @ GitHub


Browse Site by Tags

Showing related tags and posts across the entire site.
  • Crawl Restriction and dbo.Domain table

    Hi There - I have two questions: 1. I am trying to figure out why the crawler is not populating the dbo.domain and dbo.domain_discoveries tables. I suspect it's one of the application settings, but I have not been able to figure out which one. 2. How can a crawl request be restricted to a specific URL and...
    Posted to General Questions by rlink12 on Wed, Nov 13 2013
  • Operations: selecting and managing a collection of crawl requests

    Hi Mike and all, I'm evaluating AN for a client and my goal is to crawl a collection of URIs on a regular schedule. Essentially I will run AN as a service and schedule crawls for various sites by specifying those site URIs to AN. My understanding is that the table CrawlRequests is working memory...
    Posted to General Questions by jamesy on Wed, Jul 6 2011
  • Remove Crawl/Discovery Results

    How do I remove the results from a previous crawl/discovery so they are no longer part of the search results in the Web Application? I deleted the records from WebPages, Discoveries and Files where the absolute URI includes the site to be deleted, but the search (in the web app) is still returning the unwanted...
    Posted to General Questions by rlink12 on Wed, Sep 8 2010
  • Re: Application.config

    It looks at the database table 'Configuration'... Find ConfigurationManager.cs to see the relationship between ApplicationSettings and the configuration table. Mike
    Posted to General Questions by arachnode.net on Mon, Jun 1 2009
  • Re: Not getting any results from Lucene

    Is the ManageLuceneDotNetIndexes.cs code being called after each CrawlRequest? Set a breakpoint at the function 'PerformAction'. If the breakpoint doesn't hit, either the CrawlAction isn't enabled in CrawlActions.config or something else is wrong. Is the CrawlAction enabled? It looks...
    Posted to General Questions by arachnode.net on Sat, Apr 11 2009

copyright 2004-2017, arachnode.net LLC