arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Completely Open Source @ GitHub

Crawling in background of System

posted on Thu, Feb 17 2011 7:31 AM

Hi,
arachnode.net is very good.
I want to crawl in the background and update automatically when a virtual directory changes.
I see a Service in the demo, but I don't know its workflow.

Please suggest some ways to approach this task!

Thanks!

All Replies

Top 10 Contributor
1,905 Posts

http://msdn.microsoft.com/en-us/library/system.diagnostics.processpriorityclass.idle.aspx

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet

replied on Fri, Feb 18 2011 6:58 AM

I ran the Service in the demo successfully, but I don't know how to make this Service crawl a virtual directory.
Please give some advice!
Thanks!

Top 10 Contributor
1,905 Posts

You have to create a CrawlRequest for the virtual directory, which is just an AbsoluteUri.

replied on Fri, Feb 18 2011 9:13 AM

Running the console works fine. Now I want to use the Service to crawl in the background. Can you help me?

Thanks!

Top 10 Contributor
1,905 Posts

I am not completely sure I know what you want to do. You want to use the Service to crawl a virtual directory? Submit a CrawlRequest via CrawlRequests.txt (search the code for CrawlRequests.txt). AN is not a FileSystemWatcher, so it doesn't just 'come alive' and know when to start crawling, although you could program one into the Service.

Do you know completely what you want to do?

replied on Fri, Feb 18 2011 9:39 AM

OK Thanks!

I want to use the Service to crawl a virtual directory, and my application should update automatically when the virtual directory changes. Everything should run in the background. I will buy arachnode.net if it can solve this task.

I hope you can help me!

Thanks!

Top 10 Contributor
1,905 Posts

You will want to add a FileSystemWatcher to watch the actual directory: http://msdn.microsoft.com/en-us/library/system.io.filesystemwatcher(v=VS.90).aspx. The Service will always be running; the FileSystemWatcher will watch the actual directory (not the virtual directory) and instruct the Crawler to start when something changes. The first link I sent shows how to run the Service at background priority.
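
A minimal sketch of the FileSystemWatcher approach described above, using only the standard System.IO API. The class name DirectoryCrawlTrigger and the onChange callback are hypothetical names introduced for illustration; they are not part of arachnode.net. How the callback actually starts the Crawler (for example, by submitting a CrawlRequest) is left to the caller.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of arachnode.net): watches a physical
// directory and invokes a callback whenever its contents change.
public static class DirectoryCrawlTrigger
{
    public static FileSystemWatcher Watch(string physicalPath, Action<string> onChange)
    {
        var watcher = new FileSystemWatcher(physicalPath)
        {
            // Also react to changes in subfolders of the watched directory.
            IncludeSubdirectories = true,
            NotifyFilter = NotifyFilters.FileName
                         | NotifyFilters.DirectoryName
                         | NotifyFilters.LastWrite
        };

        // Each event passes the affected path to the caller's callback.
        // The callback is an assumption: plug in whatever starts your crawl.
        watcher.Created += (sender, e) => onChange(e.FullPath);
        watcher.Changed += (sender, e) => onChange(e.FullPath);
        watcher.Deleted += (sender, e) => onChange(e.FullPath);
        watcher.Renamed += (sender, e) => onChange(e.FullPath);

        // No events are raised until this is switched on.
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

In a Windows Service, something like this would probably be called once from OnStart with the physical folder that the virtual directory maps to, keeping the returned watcher in a field so it is not garbage collected. A single save can raise several Changed events, so the callback may need to debounce before starting a crawl.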

This is how to do what you want. How long have you been programming?

replied on Mon, Feb 21 2011 7:52 PM

Your guide is very specific!

Thanks!

 

replied on Wed, Feb 23 2011 3:32 AM

Hi Mike!

The first link you gave me, http://msdn.microsoft.com/en-us/library/system.diagnostics.processpriorityclass.idle.aspx, returns "Page Not Found".

Please give me another way to solve this task!

Thanks!

Top 10 Contributor
1,905 Posts

Try this: http://msdn.microsoft.com/en-us/library/system.diagnostics.process.priorityclass(v=VS.90).aspx
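
For reference, the Process.PriorityClass property linked above can be set from inside the running process. The following is a sketch only, assuming the Service wants to lower its own priority at startup; the class and method names (BackgroundPriority, LowerToIdle) are hypothetical and not part of arachnode.net.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of arachnode.net): lowers the priority of
// the current process so crawling yields CPU time to other work.
public static class BackgroundPriority
{
    public static ProcessPriorityClass LowerToIdle()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            // Idle is the lowest priority class: the process's threads run
            // only when the system has nothing else to schedule.
            current.PriorityClass = ProcessPriorityClass.Idle;

            // Discard cached values, then read the priority back.
            current.Refresh();
            return current.PriorityClass;
        }
    }
}
```

ProcessPriorityClass.BelowNormal is a gentler alternative if Idle starves the crawl too much on a busy machine.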

copyright 2004-2017, arachnode.net LLC