Hi.
I've tried to figure it out, but I think it will be better to ask here:
What is the difference between a table and its discovery table? (e.g. what is the difference between the Images table and the Images_Discoveries table?)
To make things more general, what is the purpose of a discovery in the crawler?
Thanks, Sagie
A Discovery is anything the crawler can discover. Check the database table 'DiscoveryTypes' for the full list.
[Discovery]_Discoveries is used to store those 'me too' references. A billion pages will point to 'http://google.com', but we only store the string 'http://google.com' once, and then a billion integer references to the 'Discovery'.
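To make the "store the string once, then store integer references" idea concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is only an illustration: the column names (ID, AbsoluteUri, WebPageID, ImageID) and the record_discovery helper are assumptions made up for this example, not arachnode.net's actual schema or code, which is C# on SQL Server.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
# The real arachnode.net tables and columns may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Images (
        ID INTEGER PRIMARY KEY,
        AbsoluteUri TEXT NOT NULL UNIQUE      -- the string is stored once
    );
    CREATE TABLE Images_Discoveries (
        WebPageID INTEGER NOT NULL,           -- page that referenced the image
        ImageID INTEGER NOT NULL REFERENCES Images(ID)  -- integer reference
    );
""")

def record_discovery(conn, web_page_id, uri):
    """Store the URI once, then add one integer 'me too' reference to it."""
    # Insert the string only if it has not been seen before.
    conn.execute("INSERT OR IGNORE INTO Images (AbsoluteUri) VALUES (?)", (uri,))
    image_id = conn.execute(
        "SELECT ID FROM Images WHERE AbsoluteUri = ?", (uri,)
    ).fetchone()[0]
    # Every referencing page gets its own row, but it holds only integers.
    conn.execute(
        "INSERT INTO Images_Discoveries (WebPageID, ImageID) VALUES (?, ?)",
        (web_page_id, image_id),
    )
    return image_id

# Three different pages reference the same image URL.
for web_page_id in (1, 2, 3):
    record_discovery(conn, web_page_id, "http://google.com/logo.gif")

uri_count = conn.execute("SELECT COUNT(*) FROM Images").fetchone()[0]
ref_count = conn.execute("SELECT COUNT(*) FROM Images_Discoveries").fetchone()[0]
print(uri_count, ref_count)
```

In this sketch the main table grows with the number of distinct URLs, while the _Discoveries table grows with the number of references, and each reference row costs only a couple of integers instead of a repeated string.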
Always glad to help! Mike
An open source .NET web crawler written in C# using SQL 2008.