arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL 2005/2008/CE

Cleaning up data


hsaritas posted on Fri, Dec 16 2011 3:42 PM

 

    Hi, I hope you're doing well, 

    What is the best policy for cleaning up the DB, file system, and Lucene index on a regular basis, assuming that we do not need to persist the data on the arachnode.net server?

    Thanks in advance.

 

Verified Answer

Verified by hsaritas

Hi there!

Just run the '[dbo].[arachnode_usp_arachnode.net_RESET_DATABASE]' stored procedure. This resets the database to its original state and removes all user-generated data.

If you run the Crawler in DEMO mode, you can elect to reset the directories (Reset Directories: y), which deletes all files in 'ConsoleOutputLogsDirectory', 'Downloaded[Files/Images/WebPages]Directory', and 'LuceneDotNetIndexDirectory'. Or, delete the directories manually using whatever method you prefer.
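If you want to script the manual directory cleanup for a scheduled job, something along these lines might work. This is only a sketch in Python, not part of arachnode.net: the function name `reset_directories` is made up for illustration, and you would pass in whatever paths your own configuration uses for the log, download, and Lucene index directories. It is probably safest to run it while the crawler is stopped, since open index or log files may not be deletable.

```python
import shutil
from pathlib import Path


def reset_directories(directories):
    """Delete everything inside each directory but keep the directory itself.

    Hypothetical helper: 'directories' is whatever list of paths you
    configured for console logs, downloaded content, and the Lucene index.
    Returns the number of top-level entries removed.
    """
    removed = 0
    for directory in map(Path, directories):
        if not directory.is_dir():
            continue  # skip paths that do not exist on this machine
        for entry in directory.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # remove a subdirectory and its contents
            else:
                entry.unlink()  # remove a plain file or a symlink
            removed += 1
    return removed
```

You would call it with your own paths, for example `reset_directories([logs_dir, downloads_dir, index_dir])`, where those three names stand in for the values in your configuration. Keeping the directories themselves, rather than deleting them, means the crawler does not need to recreate them on the next run.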

That's it.

Thanks!
Mike 

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet


copyright 2004-2013, arachnode.net LLC