Hello guys!
Congratulations, your project is incredible...
I'd like to know if it is possible to build a web shopping price-comparison site using arachnode, like http://www.pricegrabber.com/... Is it recommended?
Thanks!
The answer is yes.
If you want to crawl a specific list of domains here's what you need to do:
1.) Insert your intended Domains into the DisallowedDomains table and set the column value for 'IsDisallowed' to True.
2.) Delete all rows from the DisallowedWords table. The words in this table filter out adults-only content. Since you know which specific sites you want to crawl, the filter can go. And, since we'll need to negate the Address CrawlRule, these rows have to be deleted, or else we'll only get content from PriceGrabber.com that is adults-only, which will likely be 2 pages. (Yes, it's possible to crawl only adults-only content...)
3.) Set the value for negateIsDisallowed in the Address CrawlRule in CrawlActions.config to True.
4.) Insert your starting domains into the CrawlRequests table.
5.) Start crawling.
Then, slice and dice the incoming data however you please. Do you need additional information?
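One way to picture why steps 1 to 3 turn the disallow list into an allow list, and why leftover DisallowedWords rows would get in the way, is the small Python model below. It is only a sketch under stated assumptions: the function names, the "empty list means the check is off" behaviour, and the way the two checks combine are all hypothetical, and arachnode.net's actual Address CrawlRule (written in C#) may work differently.

```python
from urllib.parse import urlparse


def host_matches(url, domains):
    """True when the URL's host is one of the listed domains or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


def contains_word(text, words):
    """True when the page text contains any listed word (case-insensitive)."""
    lowered = text.lower()
    return any(w.lower() in lowered for w in words)


def should_crawl(url, text, domains, words, negate):
    """Hypothetical model of a disallow rule with an optional negation flag."""
    # Assumption: an empty list means that check is switched off.
    checks = []
    if domains:
        checks.append(host_matches(url, domains))
    if words:
        checks.append(contains_word(text, words))
    if negate:
        # Negated: a page must match every active list to be crawled.
        return all(checks)
    # Not negated: a page matching any active list is rejected.
    return not any(checks)


domains = {"pricegrabber.com"}
page = "cheap laptop deals"

# Words list emptied (step 2) and rule negated (step 3): only listed domains pass.
print(should_crawl("http://www.pricegrabber.com/p/1", page, domains, set(), True))
print(should_crawl("http://example.com/", page, domains, set(), True))

# Words list left in place: an ordinary product page no longer passes.
print(should_crawl("http://www.pricegrabber.com/p/1", page, domains, {"adult"}, True))
```

In this model the first call prints True and the other two print False, which matches the behaviour described in step 2: with the word list still populated, only pages containing those words would be crawled.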
An open source .NET web crawler written in C# using SQL 2008.
Join the arachnode.net group on Facebook: http://www.facebook.com/groups.php?ref=sb#/group.php?gid=166721755872
Twitter: http://twitter.com/arachnode_net
arachnode.net provides custom crawling and contracting resources. Please ask.
http://bit.ly/TOFX4
Thanks! We're always looking for people to use our code and help make it better, so, please do!!!
I am super swamped with work, but I will answer your question as best I can.
Until I can break away from work to answer properly, tell me: what exactly do you want to do with a web shopping pricing site?
-Mike
I'd like to build a site to compare product prices, so I need a bot to get the prices/products from several sites.
Something similar to http://www.pricegrabber.com
Thanks for the help
WOW
That's amazing!
Thanks man!!!!!!!!!!!
I'll work on that... If I need something, I'll ask you...
:)
Please do. I'm working on a few enhancements as requested by other users, so please report any issues you find, etc.
You're welcome,
-Mike
Hello...
I'm starting to use arachnode... Just one initial question... Is it possible to use VS2008? Any major problems?
thx
Great!
There shouldn't be any problems using the complete solution if you install SQL 2008 as well. SQL 2008 is required to use the Analysis Services and Integration Services projects. Let me know what you find, if you can't use the Functions project, etc.
Heads up: If you're planning on using the lucene.net indexing functionality, be sure to get the latest from the SVN repository. I've been working on this code and optimizing it a great deal. It's worth your while to get the latest over the tag-1.0 version.
Thanks!
But I'm a bit confused by arachnode...
I'm able to run the Console example and create Lucene indexes...
But I don't know how to start solving my problem...
If you could help me with some samples, that would be great. Maybe a little test that gets the product NAMES and PRICES from www.buy.com and shows them on another page...
Any help is appreciated... Thanks!
Gotcha.
1.) How much coding experience do you have?
2.) How much experience do you have with regular expressions?
3.) Have you taken a look at ManageLuceneDotNetIndexes.cs? You'll want to write a Plugin for arachnode.net that will strip out the information you require.
Take a look and get back to me.
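To give a feel for the regular-expression part of that job, here is a minimal sketch. It is written in Python rather than as an arachnode.net plugin (those are C#), and everything in it is an assumption for illustration: the sample markup, the class names, and the `extract_products` helper are hypothetical and do not reflect how www.buy.com or any real shop structures its pages.

```python
import re
from decimal import Decimal

# Hypothetical markup; a real shop page will differ and needs its own patterns.
SAMPLE_HTML = """
<div class="product">
  <span class="name">USB Cable 2m</span>
  <span class="price">$4.99</span>
</div>
<div class="product">
  <span class="name">Gaming Laptop</span>
  <span class="price">$1,019.50</span>
</div>
"""

# One name span followed (after optional whitespace) by one dollar price span.
PRODUCT_PATTERN = re.compile(
    r'<span class="name">(?P<name>[^<]+)</span>\s*'
    r'<span class="price">\$(?P<price>[\d,]+\.\d{2})</span>'
)


def extract_products(html):
    """Return (name, price) pairs found in the HTML, with prices as Decimal."""
    products = []
    for match in PRODUCT_PATTERN.finditer(html):
        name = match.group("name").strip()
        # Drop thousands separators before converting to a number.
        price = Decimal(match.group("price").replace(",", ""))
        products.append((name, price))
    return products


for name, price in extract_products(SAMPLE_HTML):
    print(name, price)
```

A pattern like this is tied to one site's markup, so each shop you crawl would likely need its own expression, and the patterns will need maintenance whenever a site changes its layout.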
Hi,
I am brand new to arachnode and have a few questions which I thought I could put in this thread. I have followed this guide and http://arachnode.net/forums/t/44.aspx, but I receive an error when trying to run the application (compiling works fine):
Error: Cannot deploy. There is no database connection specified. To correct this error, add a database connection using the project properties.
In the function properties I have added the database and tested that the connection works properly, so what have I missed?
Thanks in advance
Hi there!
http://arachnode.net/forums/t/94.aspx
http://arachnode.net/forums/t/44.aspx
Do either of these threads help?
Are there connection strings present for the DataSource project in Properties > Settings...?
I found out that I forgot the types config database. But now I receive a different error in the types project:
The type 'Domain' already exists, or you do not have permission to create it. Types
One thing I do not understand is why you have different SqlConnection objects. Why not have a global one that you reuse in the different objects? The current approach seems like double work, in my opinion.
You can safely remove the Types project from the Solution.
Multiple connection strings are an oversight on my part. There will be one ConnectionString location in the next release.
Thanks, but that just created a new error:
(null reference)
"Value cannot be null.\r\nParameter name: value"
The stack trace is:
at System.Boolean.Parse(String value)
at Arachnode.Configuration.ApplicationSettings.get_ClassifyAbsoluteUris() in I:\Archanode\source\Configuration\ApplicationSettings.cs:line 122
at Arachnode.Console.Program..cctor() in I:\Archanode\source\Console\Program.cs:line 24