arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Control arachnode configurations

Answered (Verified) This post has 2 verified answers | 14 Replies | 2 Followers

Top 25 Contributor
23 Posts
victor posted on Wed, Jan 11 2017 1:57 PM

What is the proper way to control arachnode configuration?

We have different configuration tables in the AN database. Still, when I call crawler.Start(), it tries to load disallowed domains, disallowed URIs, and some other config from .xml files on disk.

I'm writing my own console app for crawling, so I don't have any .xml config files. Could you advise how to handle this properly?

Thank you :)

All Replies

Top 10 Contributor
1,905 Posts

The .xml configs are present if you're using one of the database overrides to simulate database presence.

Have you tried customizing Console\Program.cs? This file has everything you need.

The .xml files will be there if you use the DB, and then can easily be used for your non-DB applications.

Are you using the DEMO or the code from SVN?

Mike

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet

Top 25 Contributor
23 Posts
victor replied on Thu, Jan 12 2017 7:41 AM

Not sure which overrides you're talking about. Is it overriding one of the crawler.ApplicationSettings? If so, which one is responsible for simulating the DB?

Yeah, actually I'm creating my own console crawler in my main project solution, using the AN Console app from SVN as a base. I'm connecting to the AN database, restored from the .bak as described in the installation guide. So I'm not sure why my console app is looking for any .xml config files.

Sorry if any questions are really simple :)

Thanks a lot!

vic

Top 10 Contributor
1,905 Posts

If you're using a non-database storage system, then you'll need something to mimic the configuration tables in SQL; thus, the .xml files.

Top 10 Contributor
1,905 Posts
Verified by arachnode.net

The .xml files should only be used if the DB cannot be found.  Check your build directory.

Top 25 Contributor
23 Posts
victor replied on Thu, Jan 12 2017 5:39 PM

Hmm... Could you maybe advise where to debug this? I have restored and reset the database, but the app still throws an exception if it does not find the .xml config files.

Top 10 Contributor
1,905 Posts

Screenshot(s), specific exception(s)?

Top 25 Contributor
23 Posts
victor replied on Wed, Jan 18 2017 8:28 AM

This is what I get, when starting engine:

The database connection works, because this exception gets written into the Exceptions table. I can't really figure out where I'm going wrong.

Top 10 Contributor
1,905 Posts

 

What is your ConnectionString, here?

Does it match here:

If dbo.DisallowedAbsoluteUris.xml isn't on disk, then it looks like you've never been able to connect to the DB to create the file.

OK,
Mike

Top 25 Contributor
23 Posts
Answered (Verified) victor replied on Thu, Jan 19 2017 10:01 AM
Verified by arachnode.net

Figured it out: you have to properly set up the connection string in the App.config file of the project that you run. Something like this:

    <connectionStrings>
      <add name="arachnode_net_ConnectionString"
           connectionString="Data Source=LALALA-PC\SQLEXPRESS;Initial Catalog=arachnode.net;Integrated Security=True;Connection Timeout=3600; Max Pool Size=100000;"
           providerName="System.Data.SqlClient" />
    </connectionStrings>

Don't hardcode the connection string into your code. 

Top 10 Contributor
1,905 Posts

Yes!

Was this not clear in the instructions? (Do they need to be improved?)

Mike

Top 25 Contributor
23 Posts
victor replied on Fri, Jan 20 2017 9:27 AM

If we're talking about the installation instructions, I think there is room for improvement. Here is what I ran into in Visual Studio: when I add a connection string to the project through the 'properties' -> 'settings' tab, it silently prefixes the connection string name with something like the project's namespace. So there is no way to set the arachnode connection string this way; you have to write it into the App.config file yourself.
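
To illustrate the naming issue described above, here is a sketch of the two kinds of entry side by side. The "MyCrawler" namespace is a hypothetical project name, and the prefixed form is an assumption based on how the Visual Studio settings designer typically names connection strings (Namespace.Properties.Settings.Name). The connection string value is the one quoted earlier in this thread.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <connectionStrings>
    <!-- Typical result of adding the string through the Settings tab.
         "MyCrawler" is a hypothetical project namespace. A lookup by the
         plain name arachnode_net_ConnectionString does not match this entry. -->
    <add name="MyCrawler.Properties.Settings.arachnode_net_ConnectionString"
         connectionString="Data Source=LALALA-PC\SQLEXPRESS;Initial Catalog=arachnode.net;Integrated Security=True;"
         providerName="System.Data.SqlClient" />

    <!-- Hand-written entry with the exact name used in this thread. -->
    <add name="arachnode_net_ConnectionString"
         connectionString="Data Source=LALALA-PC\SQLEXPRESS;Initial Catalog=arachnode.net;Integrated Security=True;"
         providerName="System.Data.SqlClient" />
  </connectionStrings>
</configuration>
```

Only the second entry carries the plain name, which is why the hand-written form is the one that worked here.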

The other thing to make clear is that you have to set the connection string in the App.config of the project that you actually build and run, because ConfigurationManager, which the Configuration project uses, reads that file for connection strings.
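
One way to confirm which connection strings the running executable can actually see is a small startup check like the sketch below. It is not part of arachnode.net: the class and method names (ConnectionStringCheck, Diagnose) are hypothetical, and the only assumption about arachnode.net is the connection string name quoted in this thread. It uses the standard System.Configuration API, so the project needs a reference to the System.Configuration assembly.

```csharp
using System;
using System.Configuration; // needs a reference to the System.Configuration assembly

// Hypothetical helper, not part of arachnode.net.
public static class ConnectionStringCheck
{
    // Name taken from the App.config example in this thread.
    public const string ExpectedName = "arachnode_net_ConnectionString";

    // Returns null when the entry is present; otherwise returns a message
    // listing the names that are actually visible in the given collection.
    public static string Diagnose(ConnectionStringSettingsCollection available, string expectedName)
    {
        if (available[expectedName] != null)
        {
            return null;
        }

        string message = "'" + expectedName + "' not found. Visible names:";
        foreach (ConnectionStringSettings candidate in available)
        {
            message += Environment.NewLine + "  " + candidate.Name;
        }
        return message;
    }

    public static int Main()
    {
        // ConfigurationManager reads the config file of the executable being run,
        // not the config files of the class libraries it references.
        string problem = Diagnose(ConfigurationManager.ConnectionStrings, ExpectedName);
        if (problem != null)
        {
            Console.Error.WriteLine(problem);
            return 1;
        }

        Console.WriteLine("Found '" + ExpectedName + "'.");
        return 0;
    }
}
```

If the name shows up only with a namespace prefix, or not at all, the App.config of the startup project is the place to look.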

I got no notifications in Visual Studio about changing the connection string. I also could not follow the instructions for checking the connection string for the Functions project, because I see no 'database' tab in the project's properties, and I could not find the 'Functions.csproj.user' file. So it was all pretty confusing for an inexperienced .NET user like me.

It is also worth mentioning that anyone who uses the arachnode DLLs as a dependency in their own projects should also put the connection string in the App.config file of the project that they build and run.

Top 10 Contributor
1,905 Posts

OK - I will take a look this weekend.

I am planning to update the DEMO (a good number of fixes and enhancements) to VS2015+.

Thank you for your feedback.

Mike

Top 25 Contributor
23 Posts
victor replied on Sun, Jan 22 2017 5:04 PM

No problem, thank you too! Will you announce the update, when released?

Top 10 Contributor
1,905 Posts

Other things came up this weekend.

BTW, the DEMO update and instructions update won't affect you in any way...  crawl away!

Mike


copyright 2004-2017, arachnode.net LLC