arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Search

  • Service Control Flow Question

    Mike, I'm looking at Engine_CrawlCompleted in the service and trying to understand the logic. I expected the service to go dormant until CrawlRequests.txt is touched. Instead, it appears to pause and resume over and over, a few times a second. Here's the cycle from the event log: Engine_CrawlCompleted
    Posted to General Questions (Forum) by jamesy on Sun, Jul 10 2011
  • Re: Service Installation Help

    I found the culprit in Crawler.cs, line 79: _arachnodeDAO = new ArachnodeDAO(ApplicationSettings.ConnectionString, true, true); The static initialization below (service code, line 48) fails because ApplicationSettings.ConnectionString above is not set properly: private static readonly Crawler _crawler = new Crawler(CrawlMode.BreadthFirstByPriority,
    Posted to Installation (Forum) by jamesy on Sun, Jul 10 2011
  • Re: Service Installation Help

    On Windows Server 2008, I tried changing the permissions on the file directories, database, and service account. I always get the system error: Error 1053: The arachnode.net service failed to start due to the following error: The service did not respond to the start or control request in a timely fashion. The reason I think it is security is because
    Posted to Installation (Forum) by jamesy on Sun, Jul 10 2011
  • Service Installation Help

    Mike, I have tried for a couple of days to install and run as a service on Windows 7 and Server 2008, and it doesn't work for me. I can install the service without a problem, but when I start the service it times out. This happens whether I create the service with installutil or sc.exe. I tried something called nssm that installs and runs the service, but
    Posted to Installation (Forum) by jamesy on Sun, Jul 10 2011
  • SQL full text search/index vs. Lucene

    I see that AN is generating both a SQL full-text search index and a Lucene index. In what scenario would I use SQL full-text search, and in what scenario Lucene? Thanks, James
    Posted to General Questions (Forum) by jamesy on Wed, Jul 6 2011
  • What is the "MOST_POPULAR" Algorithm

    I'm looking for a description of the algorithm that is used to calculate all the tables titled "MOST_POPULAR", e.g. <XXXXXX>_MOST_POPULAR_ABSOLUTEURIS_BY_ABSOLUTEURIS. Also, it seems like this table is missing for web pages: WebPages_MOST_POPULAR_ABSOLUTEURIS_BY_ABSOLUTEURIS. My ultimate goal is to figure out "most popular"
    Posted to General Questions (Forum) by jamesy on Wed, Jul 6 2011
  • Operations: selecting and managing a collection of crawl requests

    Hi Mike and all, I'm evaluating AN for a client and my goal is to crawl a collection of URIs on a regular schedule. Essentially I will run AN as a service and schedule crawls for various sites by specifying those site URIs to AN. My understanding is that the table CrawlRequests is working memory for AN, so it looks like I would need to create my
    Posted to General Questions (Forum) by jamesy on Wed, Jul 6 2011
Page 1 of 1 (7 items)

copyright 2004-2017, arachnode.net LLC