arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE, MongoDB, RavenDB, or Hadoop


Search Results

  • Re: Credential Cache

    Thanks Mike! I'll follow the links and try it again.
    Posted to General Questions (Forum) by Anat on Thu, Jul 1 2010
  • Re: Credential Cache

    Thanks Mike! I did try it, but it's still not working for some reason. My code (inside GetWebResponse) looks like the following (a corrected sketch follows this list): CredentialCache cc = new CredentialCache(); cc.Add(new Uri(absoluteUri), "Digest", new NetworkCredential("myUser", "myPassword", "MyUriDomain")); HttpWebRequest.Credentials = cc;
    Posted to General Questions (Forum) by Anat on Sun, Jun 20 2010
  • Re: What new feature should I finish first?

    Thanks! I've enabled the CrawlAction in cfg.CrawlActions. Not sure it makes much difference, but it seems that some of my URIs now look like -> http://anonymouse.org/cgi-bin/anon-www.cgi/http://anonymouse.org/cgi-bin/anon-www.cgi/http://www.........(rest of my URI) Have you seen this happen before? (A guard against this double wrapping is sketched after this list.)
    Posted to Feature Requests (Forum) by Anat on Sun, Jun 20 2010
  • Re: What new feature should I finish first?

    Hi Mike, Any news on the Anonymizer plug-in? Is it implemented in a newer version, and if so, how do I use it? Thanks!
    Posted to Feature Requests (Forum) by Anat on Sun, Jun 20 2010
  • Re: Credential Cache

    Hi Mike, Regarding the issue of using the CredentialCache to crawl sites that require user/password authentication: I know this is already available in the release, but the last thing I found was you saying you hadn't tried it yet... So I was wondering whether you have since, and whether you have any code examples for using it? Thanks!
    Posted to General Questions (Forum) by Anat on Sun, Jun 20 2010
  • Re: Disallowed Directories

    Hey Mike, Any news on the Disallowed Directories feature mentioned above? Thanks.
    Posted to Feature Requests (Forum) by Anat on Mon, Feb 8 2010
  • Disallowed Directories

    Hi Mike, Following our conversation, I think it would be a good idea to add a feature that limits the crawl to a specific directory (a path-prefix check along these lines is sketched after this list). For example, if I have a crawl request with "http://edition.cnn.com/US/" as the URI, I want only results from that directory down (i.e. http://edition.cnn.com/2009/US/12/11/.....) but not from anywhere in this
    Posted to Feature Requests (Forum) by Anat on Sun, Dec 13 2009
  • Re: Questions about the Templater etc'

    Hi Mike, Thanks for your last answer and advice. I'm currently trying to understand how creating a CrawlRule can help me filter result web pages while crawling (see the scope-rule sketch after this list). I've searched and read the posts I found about rules in the forum, but I still haven't completely figured out how to define one. Can you please give an example of the basic steps needed
    Posted to Plugins (Forum) by Anat on Thu, Nov 5 2009
  • Questions about the Templater etc'

    Hi Mike, Thanks for the quick reply by mail. As I said, I already have the environment up and running - crawling my sites and searching. My main goal is to be able to have a list of sites that the user can enter and save (which I assume can be saved in the CrawlRequest table, plus enabling CreateCrawlRequestsFromDatabaseFiles? - a seeding sketch follows this list) and also have a list of
    Posted to Plugins (Forum) by Anat on Thu, Sep 3 2009
Page 1 of 1 (9 items)
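
On the CredentialCache posts above: the quoted snippet assigns to the HttpWebRequest type rather than to a request instance, which does not compile in C#. Below is a minimal sketch of Digest authentication against a request instance, using only plain System.Net; it is not arachnode.net-specific, and the URI, user name, password, and domain values are placeholders.

    using System;
    using System.Net;

    public static class DigestFetchExample
    {
        public static void Fetch(string absoluteUri)
        {
            // Map the Digest scheme for this URI to a set of credentials.
            // User, password, and domain are placeholder values.
            CredentialCache cc = new CredentialCache();
            cc.Add(new Uri(absoluteUri), "Digest",
                   new NetworkCredential("myUser", "myPassword", "MyUriDomain"));

            // Assign the cache to a concrete request instance, not to the
            // HttpWebRequest type as in the post above.
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(absoluteUri);
            request.Credentials = cc;
            request.PreAuthenticate = true; // after the first 401 challenge, send Authorization proactively

            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                Console.WriteLine("{0}: {1}", response.StatusCode, absoluteUri);
            }
        }
    }

Inside an arachnode.net GetWebResponse override, the same assignment would go on whatever request object that override exposes.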
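
On the doubled anonymouse.org prefixes above: the symptom suggests discovered URIs are re-wrapped by the anonymizer on every pass. A hedged sketch of an idempotent wrapper follows; the prefix constant is taken from the URIs in the post, but how the actual plug-in rewrites URIs is an assumption.

    using System;

    public static class AnonymizerWrapper
    {
        // The proxy prefix visible in the doubled URIs above.
        private const string Prefix = "http://anonymouse.org/cgi-bin/anon-www.cgi/";

        // Wrap a URI for anonymized fetching; wrapping twice is a no-op.
        public static string Wrap(string absoluteUri)
        {
            if (absoluteUri.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            {
                return absoluteUri; // already wrapped
            }
            return Prefix + absoluteUri;
        }
    }

Idempotence is exactly the property the doubled URIs in the post are missing: Wrap(Wrap(uri)) == Wrap(uri).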
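
On the Disallowed Directories request and the CrawlRule question above: the real arachnode.net rule base class is not reproduced here, so this is only a sketch of the core check, assuming a rule ultimately answers "is this discovered URI in scope?". The class and method names are hypothetical; only the Uri handling is standard .NET.

    using System;

    // Hypothetical shape of a scope-limiting rule; the actual arachnode.net
    // CrawlRule base class and its override points may differ.
    public class DirectoryScopeRule
    {
        private readonly Uri _root;

        public DirectoryScopeRule(string rootUri) // e.g. "http://edition.cnn.com/US/"
        {
            _root = new Uri(rootUri);
        }

        // Allow only URIs on the same host whose path begins under the root directory.
        public bool IsAllowed(string absoluteUri)
        {
            Uri candidate;
            if (!Uri.TryCreate(absoluteUri, UriKind.Absolute, out candidate))
            {
                return false;
            }
            return string.Equals(candidate.Host, _root.Host, StringComparison.OrdinalIgnoreCase)
                && candidate.AbsolutePath.StartsWith(_root.AbsolutePath, StringComparison.OrdinalIgnoreCase);
        }
    }

Note that the example URI in the post, http://edition.cnn.com/2009/US/12/11/....., would fail a strict prefix check against /US/ because the date segment comes first; matching that pattern would need a looser test, such as checking that the path contains the /US/ segment.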
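
On the Templater question above: how user-entered sites become crawl requests depends on the arachnode.net schema and the CreateCrawlRequestsFromDatabaseFiles setting, neither of which is documented here, so the sketch below only illustrates the general pattern of seeding a crawler from a saved list. The table name, column names, depth value, and connection string are assumptions, not the real schema.

    using System.Collections.Generic;
    using System.Data.SqlClient;

    public static class CrawlRequestSeeder
    {
        // Insert one row per user-saved site. "CrawlRequests", its columns,
        // and the depth of 2 are hypothetical; verify against the real schema.
        public static void Seed(string connectionString, IEnumerable<string> siteUris)
        {
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                connection.Open();
                foreach (string uri in siteUris)
                {
                    using (SqlCommand command = new SqlCommand(
                        "INSERT INTO CrawlRequests (AbsoluteUri, Depth) VALUES (@uri, @depth)",
                        connection))
                    {
                        command.Parameters.AddWithValue("@uri", uri);
                        command.Parameters.AddWithValue("@depth", 2);
                        command.ExecuteNonQuery();
                    }
                }
            }
        }
    }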

Copyright 2004-2017, arachnode.net LLC