arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL 2005/2008/CE

Browse Site by Tags

Showing related tags and posts across the entire site.
  • 404 is returned because trailing slash is not used

    When I crawl this site, http://www.jenkinskling.com, the following response is returned: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <title>R a z o r B a l l</title> <meta http-equiv="Content-Type" content="text...
    Posted to General Questions by canuckbbp on Thu, Nov 10 2011
  • Re: Where to enter the initial crawl request?

    Simon - See Program.cs (in the 'crawl test code' region): _crawler.Crawl(new CrawlRequest(new Discovery("http://fark.com"), int.MaxValue, UriClassificationType.None, UriClassificationType.None, 1)); The warnings you see are debug artifacts that I have left behind. As long as the...
    Posted to General Questions by arachnode.net on Sun, Nov 8 2009
  • Re: INSERT INTO [arachnode.net].[dbo].[CrawlRequests]

    The check constraint is quite restrictive because certain functionality depends on the AbsoluteUris submitted being correct. Try 'http://kkk.net/' - what is this site, anyway? Makes me nervous... A better way would be to use the API to submit the CrawlRequests you want. Mike (I'm off to putter...
    Posted to General Questions by arachnode.net on Sun, Aug 2 2009

copyright 2004-2013, arachnode.net LLC