Hello everyone,
I have set up arachnode.net, and it works fine for en-us pages (both crawl and search work). But for non-en-us pages I see two issues:
1. The snippet shown on the search results page does not display non-en-us characters correctly. When I click through from the results to the actual page, the content displays correctly, so this is not a browser problem.
2. When I search with a non-en-us query, I usually get no results.
Any ideas what is wrong?
Thanks in advance,
George
Which site is giving trouble?
I will very likely check in Version 1.3 today.
Mike
An open source .NET web crawler written in C# using SQL 2008.
Join the arachnode.net group on Facebook: http://www.facebook.com/groups.php?ref=sb#/group.php?gid=166721755872
Twitter: http://twitter.com/arachnode_net
arachnode.net provides custom crawling and contracting resources. Please ask.
http://bit.ly/TOFX4
Non-English characters may not be supported by the Lucene.Net StandardAnalyzer. Additionally, I'm opening all cached files as UTF-8, which obviously doesn't work in all cases.
http://www.aspfree.com/c/a/BrainDump/Working-with-Lucene-dot-Net/2/
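To illustrate the UTF-8 point, here is a minimal sketch of why always decoding cached bytes as UTF-8 garbles some pages, and how honoring the charset declared in the Content-Type header can avoid it. This is not arachnode.net code: it is written in Python for brevity, and the helper names `charset_from_content_type` and `decode_cached_page` are hypothetical, as is the sample GB2312 page.

```python
import re

def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header value, or the default."""
    match = re.search(r"charset\s*=\s*[\"']?([\w.:-]+)", content_type or "", re.IGNORECASE)
    return match.group(1).lower() if match else default

def decode_cached_page(raw_bytes, content_type):
    """Decode cached bytes with the declared charset, falling back to UTF-8."""
    charset = charset_from_content_type(content_type)
    try:
        return raw_bytes.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or incorrect charset label: fall back to UTF-8 and
        # substitute a replacement character for undecodable bytes.
        return raw_bytes.decode("utf-8", errors="replace")

# Hypothetical cached page served as GB2312 rather than UTF-8.
page_bytes = "中文搜索".encode("gb2312")

always_utf8 = page_bytes.decode("utf-8", errors="replace")
declared = decode_cached_page(page_bytes, "text/html; charset=gb2312")

print(always_utf8 == "中文搜索")  # False: the text is garbled
print(declared == "中文搜索")     # True: the text round-trips
```

A real crawler would probably also need to check a `<meta>` charset declaration or a byte-order mark, since the header is often missing or wrong.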
Add a feature request/bug report? I'm planning on taking two weeks away from the code, likely starting today.
This is fixed in Version 1.3