arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE or MongoDB/RavenDB/Hadoop

Browse Forum Posts by Tags

Showing related tags and posts for the General Questions forum.
  • Crawl Restriction and dbo.Domain table

    Hi there - I have two questions: 1. I am trying to figure out why the crawler is not populating the dbo.domain and dbo.domain_discoveries tables. I suspect it's one of the application settings, but I have not been able to figure out which one. 2. How can a crawl request be restricted to a specific URL and...
    Posted to General Questions (Forum) by rlink12 on Wed, Nov 13 2013
  • Operations: selecting and managing a collection of crawl requests

    Hi Mike and all, I'm evaluating AN for a client and my goal is to crawl a collection of URIs on a regular schedule. Essentially I will run AN as a service and schedule crawls for various sites by specifying those site URIs to AN. My understanding is that the table CrawlRequests is working memory...
    Posted to General Questions (Forum) by jamesy on Wed, Jul 6 2011
  • Remove Crawl/Discovery Results

    How do I remove the results from a previous crawl/discovery so they are no longer part of the search results in the Web Application? I deleted the records from WebPages, Discoveries and Files where the absolute URI includes the site to be deleted, but the search (in the web app) is still returning the unwanted...
    Posted to General Questions (Forum) by rlink12 on Wed, Sep 8 2010
  • Re: Application.config

    It looks at the database table 'Configuration'... Find ConfigurationManager.cs to see the relationship between ApplicationSettings and the configuration table. Mike
    Posted to General Questions (Forum) by arachnode.net on Mon, Jun 1 2009
  • Re: Not getting any results from Lucene

    Is the ManageLuceneDotNetIndexes.cs code being called after each CrawlRequest? Set a breakpoint at the function 'PerformAction'. If the breakpoint doesn't hit, either the CrawlAction isn't enabled in CrawlActions.config or something else is wrong. Is the CrawlAction enabled? It looks...
    Posted to General Questions (Forum) by arachnode.net on Sat, Apr 11 2009
  • Re: Not getting any results from Lucene

    Double-check the Lucene.NET index with Luke: http://www.getopt.org/luke/ How did you close the console application? Did you click the close button or press Ctrl-C? Is the CrawlAction enabled in CrawlActions.config? Also, did you grab a release or the latest from SVN? And I've also set luceneDotNetIndexDirectory...
    Posted to General Questions (Forum) by arachnode.net on Fri, Apr 10 2009

copyright 2004-2017, arachnode.net LLC