-
OK, I think I've gotten past the configuration and deployed the project, but I don't know how to configure version 1.2. I want to create crawl requests from all resources (image, hyperlink, file, database, etc.), and I want to crawl the domain entirely, as you know. Also, I don't want to allow namedAnchors and query strings in the crawl. I have no other
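To make the named-anchor and query-string requirement concrete, here is a minimal sketch of stripping both from a URI before it is queued. It is written in Python with the standard library rather than against the arachnode.net API, and the function name `strip_fragment_and_query` is a hypothetical helper, not something the crawler provides.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_fragment_and_query(uri: str) -> str:
    """Return the URI without its query string and #fragment (named anchor).

    Hypothetical helper for illustration only; it is not part of arachnode.net.
    """
    parts = urlsplit(uri)
    # Keep scheme, host and path; blank out the query and fragment components.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    print(strip_fragment_and_query("http://example.com/a/b.html?x=1#top"))
```

Dropping the query string can merge pages that really are different (for example, paginated listings), so this is only appropriate when that loss is acceptable for the site being crawled.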
-
OK, I've got a fresh 1.2 and built it. Now I get: Error 1 UNSAFE ASSEMBLY permission was denied on object 'server', database 'master'. Functions
-
Maybe I have some fault in the depth configuration. I did not touch the "Arachnode.SiteCrawler.Rules.Depth" rule. How should I configure this?
-
By the way, I am currently running version 1.1.3444.26904.
-
Thank you for your reply. 1. I have just a few 404s, mainly for pictures, and a couple of XML parsing errors. I am currently crawling and there are 100 exceptions, 7000 hyperlinks (I assume some of them represent the same URI, like .../download and .../download/) and just 400 webpages. The crawl is continuing with no problems. But the problem is
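One way to check the suspicion about .../download and .../download/ is to count hyperlinks after collapsing the trailing slash. The sketch below is plain Python, not arachnode.net code; `canonical` and `count_distinct` are hypothetical names, and treating the two forms as the same resource is an assumption that a given server may not honor.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical(uri: str) -> str:
    """Normalize a URI so that '/download' and '/download/' compare equal.

    Assumption: a trailing slash does not change which resource is served.
    """
    parts = urlsplit(uri)
    # Drop trailing slashes, but keep a bare "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    # Scheme and host are case-insensitive; the fragment is discarded.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def count_distinct(uris) -> int:
    """Count URIs that remain distinct after normalization."""
    return len({canonical(u) for u in uris})


if __name__ == "__main__":
    links = [
        "http://example.com/download",
        "http://example.com/download/",
        "http://example.com/",
    ]
    print(count_distinct(links))
```

Comparing this count with the raw hyperlink total gives a rough idea of how many of the stored hyperlinks are duplicates of this kind.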
-
Hi, I am trying to crawl a big domain completely; it has approx. 1M pages (or maybe more). I started the crawl with maximum depth "int.MaxValue" and restrictToUriHost. Then I configured everything for negateIsDisallowedForAbsoluteUri="true". But when the crawl completed I had only 3000 pages and 33000 hyperlinks in the
-
Thanks a lot! After granting the right permissions on ms, I struggled a little, but then I added <appSettings> <add key="luceneDotNetIndexDirectory" value="C:\LuceneDotNetIndex"/> <add key="cacheTimeoutInMinutes" value="0"/> <add key="maximumPageTitleLength" value="64"
-
Hi, thanks to your help, I have managed to restrict my crawl to a specific domain and completed it. But when I was running the crawling app (Program), the Visual Studio Development Host would start and I was able to search the indexes from there. Since I have concluded my crawling, for the moment, I need to deploy the project
-
I think this post is not valid for version 1.1, because Arachnode.SiteCrawler.Rules.Address no longer exists. So what is the actual solution for restricting a crawl to a specific domain? For now I will try setting negateIsDisallowed=true on Arachnode.SiteCrawler.Rules.AbsoluteUri and post whether it works.
-
After my post on the installation instructions, I managed to deploy the project on VS2008 & SQL2008. I don't think I have any problems with SQL Server, and I can build the solution and run the test program under the Console project with no errors. But I have 2 questions: As commented in the test program, crawling always starts from `arachnode.net`.