I compiled the whole VS solution. I get a few warnings, but no errors.
There are two types of warnings:
1. 'xyz' method is obsolete, use 'abc' method. These come from the Lucene module.
2. The other warning is from Functions.csproj.user and is schema related. For reference, I have pasted this warning below:
I set up the console project as the startup project, but on F5 nothing launches and VS says 'deployment failed'.
Can anyone help?
I am using VS 2010 (I don't have VS 2008); I don't think that should make any difference, since backward compatibility is usually ensured in such dev tools.
Warning 2 The element 'Project' in namespace 'http://schemas.microsoft.com/developer/msbuild/2003' has incomplete content. List of possible elements expected: 'PropertyGroup, ItemGroup, ItemDefinitionGroup, Choose, UsingTask, ProjectExtensions, Target, Import' in namespace 'http://schemas.microsoft.com/developer/msbuild/2003'. D:\Arachnode\LatestRelease2.5\Functions\Functions.csproj.user 13 3 Functions
1.) They will, but this table is really for the crawler to use. (See the code sketch after this list for the supported route.)
2.) You should comment out what you want. You have the option to set configuration in the DB or in code.
3.) Check out the reset DB procedure to completely reset the DB, and make a SQL backup to save your data. If the crawl completes, the DB will be in 'clean slate' mode and you can crawl again. See the options that prompt you every time you start the console - those help you reset the crawler.
4.) Don't change that table. :)
5.) http://arachnode.net/search/SearchResults.aspx?q=negateisdisallowed - Which 'Order', in which config table? There are more than a few.
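To illustrate 1.) and 4.): rather than inserting rows into the CrawlRequests table or editing UriClassificationType, submit CrawlRequests in code the way the console does. A rough sketch, from memory of Console\Program.cs - the namespaces, enum members, and constructor arguments below are approximations, not the definitive API, so verify them against the source in your release:

// Hedged sketch of submitting a CrawlRequest in code, modeled on the
// console project. Type names and signatures are approximations of the
// AN 2.5 API - check Console\Program.cs for the authoritative version.
using Arachnode.SiteCrawler;              // assumed namespace
using Arachnode.SiteCrawler.Value;        // assumed namespace
using Arachnode.SiteCrawler.Value.Enums;  // assumed namespace

class CrawlRequestExample
{
    static void Main()
    {
        Crawler crawler = new Crawler(CrawlMode.BreadthFirstByPriority, false);

        // The UriClassificationType parameters restrict how far the crawl
        // may wander from the seed; 'Host' (assumed member) should confine
        // the crawl to a single site.
        crawler.Crawl(new CrawlRequest(
            new Discovery("http://example.com/"), // seed AbsoluteUri
            2,                                    // crawl depth
            UriClassificationType.Host,           // restrict the crawl to...
            UriClassificationType.Host));         // ...the seed's host

        crawler.Engine.Start(); // begin processing submitted CrawlRequests
    }
}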
I can answer this post further in a few hours.
1.) Warnings are just that - warnings. Lucene frequently marks methods obsolete and then removes them in the next minor version. Most of these warnings are already addressed in the code on my desktop and can be safely ignored.
2.) This I don't know because I use 2008 for AN.
For your failed deployment: it sounds like a VS 2008 bug still exists in 2010. Restart VS and restart SQL Server, and the project will deploy (it does in 2008). If it doesn't, look at the properties of the Functions project and ensure you have a valid DB connection string.
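If restarting doesn't fix it, here is a quick way to rule the connection string in or out - a throwaway C# check using plain ADO.NET. The connection string name "arachnode_net" is an assumption on my part; use whatever name your .config file actually defines (and add a reference to System.Configuration):

using System;
using System.Configuration;
using System.Data.SqlClient;

class ConnectionCheck
{
    static void Main()
    {
        // Reads the connection string from the app/web .config file.
        // "arachnode_net" is a placeholder name - substitute your own.
        string connectionString =
            ConfigurationManager.ConnectionStrings["arachnode_net"].ConnectionString;

        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            connection.Open(); // throws immediately if SQL Server is unreachable
            Console.WriteLine("Connected to database: " + connection.Database);
        }
    }
}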
Do either of these posts help with the 'Functions' warning?
OK, I got it working! I ran the console and did the search test.
I see there are three WebDev servers running:
1. search - works fine
2. administration: the page shows fine, but when I click to view a table's contents, it says "A SQL connection couldn't be made, named pipes, etc. ..." I have TCP, named pipes, etc. enabled in SQL Server Configuration Manager.
Did I miss adding a connection string anywhere?
3. crawl.aspx - how do I use this? It only asks for a URL.
- How can I give the starting URL to crawl? Say I want to crawl a given website (and no links outside that website) - can I configure this? Where?
- I hope crawling a single large website won't get my IP banned?
- And if I don't want images etc., only the HTML text, where do I configure this?
- Finally, how do I access the raw downloaded HTML? Is it in the DB?
- And can I insert some custom parsing code when a page is downloaded, so that I can store it my own way? Or maybe, once the crawl is done, I can process the whole batch.
2.) The Administration project may not be supported in 2010. Check the web.config.
3.) This is a beta and there isn't any documentation on it. Step into the code and see what it does! There are plenty of comments in the source.
You will want to read these sections completely:
These sections will answer most of your questions. Get back to me when you have, OK?
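One pointer on the raw HTML question while you read: if page source is being stored in the database (there is a configuration setting that controls this), post-crawl batch processing is just a query away. A sketch - the table and column names here are assumptions, so check the schema in your install before running it:

using System;
using System.Data.SqlClient;

class PostCrawlProcessing
{
    static void Main()
    {
        // Your arachnode connection string goes here.
        string connectionString = "...";

        using (SqlConnection connection = new SqlConnection(connectionString))
        // dbo.WebPages / AbsoluteUri / Source are assumed names - verify
        // them against the actual AN schema.
        using (SqlCommand command = new SqlCommand(
            "SELECT AbsoluteUri, Source FROM dbo.WebPages", connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    string absoluteUri = reader.GetString(0);
                    // Parse/store each page however you like here.
                    Console.WriteLine(absoluteUri);
                }
            }
        }
    }
}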
Thanks, Mike, for the help!
OK, so I went through most of the documentation and got a pretty good idea of the basic workings.
My remaining questions, which need your input, are:
1. Can I give CrawlRequests directly by adding entries to the table? Will those get picked up even during an ongoing crawl?
2. Do I need to run the console every time I want to crawl? Program.cs resets a lot of configuration settings - should I comment those lines out if I want to control everything through config table modifications?
3. After every crawl, how do I consolidate/back up my data before another crawl? Which tables, at minimum, should I clear to get a clean slate for the next crawl?
4. I see in Program.cs that if I give a new CrawlRequest, I can hardcode my restrictions in code. Is there then any need to modify the UriClassificationType table?
Lastly, for now...
What is 'Order' in the config.. table, and what does 'NegateIsDisallowed' mean?