arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Crawl specific URL with specific start path

InvestisDev posted on Fri, Mar 2 2012 11:04 PM

Hello,

We have a URL:

http://www.xyz.com/en/home.aspx - English language

and http://www.xyz.com/fi-FI/home.aspx - German language

As shown above, the language code comes right after the host name. In this case, when we crawl the site starting from "http://www.xyz.com/en/home.aspx", the crawl should not include the pages under "http://www.xyz.com/fi-FI/home.aspx". This matters because the home page contains a link that lets the visitor change the language of the site, so the crawler would otherwise follow it into the other language.
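To make the requirement concrete, here is a minimal sketch of the rule we are after. It is plain Python for illustration only, not arachnode.net's API: the function name `in_crawl_scope` is hypothetical, and it assumes the language code is always the first path segment after the host name.

```python
from urllib.parse import urlsplit


def in_crawl_scope(start_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url is on the same host as start_url and
    shares its first path segment (assumed to be the language code)."""
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)

    # A different host is always out of scope.
    if start.hostname != candidate.hostname:
        return False

    # Collect non-empty path segments, e.g. ["en", "home.aspx"].
    start_segments = [s for s in start.path.split("/") if s]
    candidate_segments = [s for s in candidate.path.split("/") if s]

    # Without a leading segment on both URLs there is nothing to compare.
    if not start_segments or not candidate_segments:
        return False

    # Compare case-insensitively (an assumption: "fi-FI" and "fi-fi"
    # are treated as the same section).
    return start_segments[0].lower() == candidate_segments[0].lower()
```

With the start URL "http://www.xyz.com/en/home.aspx", this check accepts other pages under /en/ and rejects "http://www.xyz.com/fi-FI/home.aspx", which is the behaviour we would like from the crawler.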

Let me know if you need further details.

Thanks,

copyright 2004-2017, arachnode.net LLC