arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL 2005/2008/CE

Blogs

  • Performance Tools

    http://www.passmark.com/download/pt_download.htm https://www.google.com/webhp?sourceid=chrome-instant&rlz=1C1CHFX_enUS531US531&ion=1&ie=UTF-8#rlz=1C1CHFX_enUS531US531&sclient=psy-ab&q=byte+to+megabits&oq=byte+to+megabits&gs_l...
    Posted to arachnode.net : blog by arachnode.net on Thu, May 16 2013
  • One Billion WebPages+

    Yes, that is over 1 Billion!
    Posted to arachnode.net : blog by arachnode.net on Fri, Apr 12 2013
  • Common Crawler Challenges

    There are a number of choices for .NET crawlers. A few of them are extremely well architected, following TDD practices and the best patterns for ultimate extensibility. While most are excellent starts, none but arachnode.net provides sliding-window caching...
    Posted to arachnode.net : blog by arachnode.net on Sun, Mar 24 2013
  • How to set web proxy in arachnode code.

    If you are running AN (the arachnode crawler) in a network secured by a proxy server, you may need to specify the proxy server settings in the AN code. You can do this as follows: go to the SiteCrawler project in the AN code solution, then to Components...
    Posted to Using arachnode.net by Abhishek Gahlout on Wed, Mar 20 2013
  • Binary Search Tree Challenge

    #region
    using System;
    using System.Collections.Generic;
    #endregion
    namespace BinaryTreeSearch
    {
        public class TreeNode<T>
        {
            public TreeNode<T> Parent { get; set; }
            public TreeNode<T> Left { get; set; }
            public TreeNode<T> Right...
    Posted to Classic CS Programming Challenges by arachnode.net on Mon, Feb 18 2013
  • arachnode.net Help

    In summary: If you need help, I will help you directly. Skype: arachnodedotnet (chat) and TeamViewer.com (shared desktop). Recently, I have received a few questions along the lines of, "Hey! arachnode.net seems like a great product but I'm having trouble getting started...
    Posted to arachnode.net : blog by arachnode.net on Wed, Feb 13 2013
  • Programming Challenge 2

    Question: You are given an array of n integers (both positive and negative). Find the first continuous sequence of integers with the largest sum. Example 1: Input: {-7, 4, -2, 5, 3, -6, 8, -9} : Answer: {4, -2, 5, 3, -6, 8} Example 2: Input {5, -3, -4...
    Posted to arachnode.net : blog by arachnode.net on Thu, Feb 7 2013
  • Programming Challenge 1

    Question: A solution consists of four balls from a set of four different colors. The user tries to guess the solution. If they guess the right color for the right spot, record it as being in the correct 'Location'. If it's the right color, but the wrong...
    Posted to arachnode.net : blog by arachnode.net on Thu, Feb 7 2013
  • How to use the arachnode.net Plugins with AN.Next.

    This example uses the most commonly used plugin, ManageLuceneDotNetIndexes.cs. Many plugins require that a Dictionary<string, string> be passed to AssignSettings(...), and this applies to ManageLuceneDotNetIndexes.cs. These settings correspond to...
    Posted to arachnode.net : blog by arachnode.net on Sat, Dec 29 2012
  • NCrawler

    NCrawler is an extremely well written application, following intelligent programming standards and organizational practices. Esben Carlsen, the author, is undoubtedly a gifted programmer and has demonstrated a level of programming proficiency...
    Posted to arachnode.net : blog by arachnode.net on Mon, Dec 3 2012
  • Common Questions

    I frequently receive mail/posts about how to do something similar to a request received this morning: (1) Start an initial crawl of 1 website to retrieve page content, meta title, keywords, description, etc., and outbound links to other pages on the site...
    Posted to arachnode.net : blog by arachnode.net on Thu, Sep 20 2012
  • http://www.primordialcomputers.com - Custom PC's

    Primordial Computers is owned by a good friend and long-time user of arachnode.net. http://primordialcomputers.com From their site: Primordial Computers delivers the best custom computer experience by redefining the industry standards one computer at...
    Posted to arachnode.net : blog by arachnode.net on Wed, Feb 29 2012
  • Crawl facebook.com, twitter.com and linkedin.com...

    First, download Fiddler Web Proxy if you haven't already, and start Fiddler. http://www.fiddler2.com/fiddler2/ Second, log in to the site you want to crawl using a browser. Find the cookie value passed to the site. In this case, it's LinkedIn...
    Posted to arachnode.net : blog by arachnode.net on Thu, Jun 2 2011
  • Performance Counters.

    arachnode.net provides performance counters in three categories.
    1.) ArachnodeDAO (how is the Database performing)
    2.) Cache (how is the Cache performing)
    3.) Crawl (how is the Crawl performing)
    Use these counters to monitor crawl performance, establish...
    Posted to arachnode.net : blog by arachnode.net on Thu, May 5 2011
  • Browse.aspx and Cache.aspx

    I recently had a question concerning whether AN could cache a webpage and use local content to browse a site in an offline or semi-offline manner. The demo crawl that is presented at http://arachnode.net/content/Search.aspx downloaded the...
    Posted to arachnode.net : blog by arachnode.net on Thu, Apr 28 2011

copyright 2004-2013, arachnode.net LLC