arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE or MongoDB/RavenDB/Hadoop

  • Re: Control arachnode configurations

    Speaking of installation instructions, I think there is room for improvement. What I ran into while working in Visual Studio: when I add a connection string to the project in the 'Properties' -> 'Settings' tab, it silently prepends something like the project's namespace to the connection string name. So there is no way to set the arachnode connection string this
    Posted to General Questions (Forum) by victor on Fri, Jan 20 2017
  • Re: Control arachnode configurations

    Figured it out: the key is to set up the connection string properly in the App.config file of the project that you run. Something like this: <connectionStrings> <add name="arachnode_net_ConnectionString" connectionString="Data Source=LALALA-PC\SQLEXPRESS;Initial Catalog=arachnode.net;Integrated Security=True;Connection Timeout=3600; Max
    Posted to General Questions (Forum) by victor on Thu, Jan 19 2017
  • Re: Control arachnode configurations

    This is what I get when starting the engine: The database connection works well, because I get this exception written into the Exceptions table. I can't really figure out where I'm missing the point.
    Posted to General Questions (Forum) by victor on Wed, Jan 18 2017
  • Adding custom field to index

    Is it possible to add a custom field to the index when crawling? It would make it easier to match website domains to entities in my system when performing search queries later. Thank you, Vic
    Posted to General Questions (Forum) by victor on Tue, Jan 17 2017
  • Re: Lucene index is empty after finishing crawl

    OK, got it :) And again, thanks a lot!
    Posted to General Questions (Forum) by victor on Thu, Jan 12 2017
  • Re: Control arachnode configurations

    Hmm. Could you advise on where to debug it? I have restored and reset the database, but the app still throws an exception if it does not find the XML config files.
    Posted to General Questions (Forum) by victor on Thu, Jan 12 2017
  • Re: Lucene index is empty after finishing crawl

    OK, it seems that Lucene needs access to the downloaded files on disk, and when I disable storing files on disk, Lucene cannot produce index files. Is that true? What if I don't want to store images and other documents?
    Posted to General Questions (Forum) by victor on Thu, Jan 12 2017
  • Lucene index is empty after finishing crawl

    I'm seeing this output at the end of the crawl. But the Lucene index is still empty (only a few bytes in size), and searching through the Web project always returns 0 rows. I turned off saving any files to disk (no images, no web pages, no files); can this be the reason? Could you advise some steps to investigate this?
    Posted to General Questions (Forum) by victor on Thu, Jan 12 2017
  • Re: Control arachnode configurations

    I'm not sure which overrides you are talking about. Is it overriding one of crawler.ApplicationSettings? If so, which one is responsible for simulating the database? Yes, I'm actually working on creating my own console crawler in my main project solution, using the AN Console app from SVN as a base. I'm connecting to the AN database, restored from .bak as it was
    Posted to General Questions (Forum) by victor on Thu, Jan 12 2017
  • Updating indexes

    How does arachnode keep the index up to date? When I run the crawler with a list of websites a second time after the initial crawl, what actually happens? Does AN update the index for websites that were already crawled?
    Posted to General Questions (Forum) by victor on Wed, Jan 11 2017

copyright 2004-2017, arachnode.net LLC