arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Completely Open Source @ GitHub


Downloads

Name
Comments
WebCrawlingSurvey
Lee
Mining of Massive Datasets [comp sci] - A. Rajaraman, J. Ullman (Cambridge, 2011) WW.pdf
Balancing volume, quality and freshness in web crawling
Baeza-Yates, R. and Castillo, C. (2002). Balancing volume, quality and freshness in web crawling...
Breadth-first crawling yields high-quality pages
Marc Najork and Janet L. Wiener. Breadth-first crawling yields high-quality pages. In Proceedings...
Crawling a Country: Better Strategies than Breadth-First...
Baeza-Yates, R., Castillo, C., Marin, M. and Rodriguez, A. (2005). Crawling a Country: Better Strategies...
Design and implementation of a distributed crawler and filtering...
Zeinalipour-Yazti, D. and Dikaiakos, M. D. (2002). Design and implementation of a distributed crawler...
Design and implementation of a high performance distributed...
Shkapenyuk, V. and Suel, T. (2002). Design and implementation of a high performance distributed...
Do your worst to make the best: Paradoxical effects in pagerank...
Boldi, P., Santini, M., and Vigna, S. (2004b). Do your worst to make the best: Paradoxical effects...
Focused crawling using context graphs
Diligenti, M., Coetzee, F., Lawrence, S., Giles, C. L., and Gori, M. (2000). Focused crawling using...
Mercator: A scalable, extensible Web crawler
Heydon, A. and Najork, M. (1999). Mercator: A scalable, extensible Web crawler. World Wide Web,...
Modeling and managing content changes in text databases
Ipeirotis, P., Ntoulas, A., Cho, J., Gravano, L. (2005) Modeling and managing content changes in...
Synchronizing a database to improve freshness
Cho, J. and Garcia-Molina, H. (2000). Synchronizing a database to improve freshness. In Proceedings...

copyright 2004-2017, arachnode.net LLC