arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE or MongoDB/RavenDB/Hadoop

Browse Forum Posts by Tags

Showing related tags and posts for the Forums application.
  • Restrict Domains

    Hello again. I am the user who asked this question: http://arachnode.net/forums/t/1445.aspx I just bought a license and still have the same question. Here it is again: I would like to restrict part of a domain from crawling. I am trying to crawl a site and a "site-map like page"...
    Posted to General Questions (Forum) by David Rodecker on Mon, Sep 20 2010
  • Restrict part of a domain from a crawl

    I would like to restrict part of a domain from crawling. I am trying to crawl a site, and a "site-map like page" has many links that go to many different parts of the site (e.g., the footer and header of the page contain many different types of links that go to various pages that are not of...
    Posted to General Questions (Forum) by bvandrunen on Sun, Sep 19 2010
  • Re: Server Errors that being take care by the website

    Megetron, I have a situation where I build lists of particular web sites I want to crawl. I do it via a SQL script that submits the crawl requests, but it may give you some ideas. In the SQL I just do something like this: declare @id bigint; declare @url varchar(255); declare @depth int; declare @restrictcrawl...
    Posted to General Questions (Forum) by Kevin on Sat, Sep 5 2009
  • Re: What are these fields used for and mean?

    They correspond to the enum: namespace Arachnode.SiteCrawler.Value.Enums { [Flags] public enum UriClassificationType : byte { None = 0, Domain = 1, Extension = 2, FileExtension = 4, Host = 8, Scheme = 16 } } RestrictCrawlTo means that the Crawl won't crawl WebPages that aren't the same UriClassificationType...
    Posted to General Questions (Forum) by arachnode.net on Thu, Jul 23 2009

copyright 2004-2017, arachnode.net LLC