arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2005/2008/CE

Browse Forum Posts by Tags

Showing related tags and posts for the General Questions forum.
  • Re: Docs with examples?

    Open cfg.Configuration and set both 'ExtractWebPageMetaData' and 'InsertWebPageMetaData' to 'true'. Then run a crawl and examine 'WebPages_MetaData'. -Mike
    Posted to General Questions (Forum) by arachnode.net on Mon, Jun 14 2010
  • Re: Plugin help

    arachnode.net already supports the HtmlAgilityPack. However, the HtmlAgilityPack is a HUGE memory hog and has an extremely negative impact on the crawl rate. If you can avoid it, don't use it. If you have to use it, change the configuration setting for 'ExtractWebPageMetaData'...
    Posted to General Questions (Forum) by arachnode.net on Fri, Aug 7 2009

copyright 2004-2014, arachnode.net LLC