arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop

Content Relevancy

Kronkite posted on Tue, Mar 8 2011 9:29 PM

The search engine (SE) returns data as XML in the following format:

<Document>
  <AbsoluteUri>http://www.amazon.com/rss/bestsellers/pc/541966/ref=pd_ts_rss_link</AbsoluteUri>
  <Created>2011-01-10T00:00:00</Created>
  <DiscoveryID>668</DiscoveryID>
  <DiscoveryPath>c:\applications\arachnode.net\console\bin\debug\downloadedfiles\http\www\amazon\com\rss\bestsellers\pc\541966\_13067891405923497595228200139105217215116.xml</DiscoveryPath>
  <Domain>amazon.com</Domain>
  <Extension>com</Extension>
  <Host>amazon.com</Host>
  <Scheme>http</Scheme>
  <Score>45233.18</Score>
  <Strength>999967</Strength>
  <Summary>& Accessories list for authoritative information on this product's current rank.)]]> #2: 3 Pack of Premium Crystal Clear Screen Protectors for Apple <B>iPad</B> top ...</Summary>
  <Title>Amazon.com: Bestsellers in Electronics &gt; Computers &amp; Accessories</Title>
  <Updated xsi:nil="true" />
</Document>

Is the “relevancy” issue covered somewhere? What do the Score, Strength, DiscoveryID, DiscoveryPath and Updated tags mean? Are they documented anywhere?
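
For anyone who wants to look at these fields programmatically while the question is open, here is a minimal sketch in Python that reads them out of one <Document> element. It is only an illustration, not arachnode.net code: the helper name read_document is hypothetical, the sample is a trimmed copy of the XML above (the Summary and Updated elements are left out because, as pasted, they would not parse on their own), and the int/float types are assumptions based on how the values look. It says nothing about what Score or Strength actually mean.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <Document> element from the post. Only the URI and the
# numeric fields are kept; this is an assumption-laden sample, not a full
# search response.
SAMPLE = """<Document>
  <AbsoluteUri>http://www.amazon.com/rss/bestsellers/pc/541966/ref=pd_ts_rss_link</AbsoluteUri>
  <DiscoveryID>668</DiscoveryID>
  <Score>45233.18</Score>
  <Strength>999967</Strength>
</Document>"""


def read_document(xml_text):
    """Return selected fields of one <Document> as a dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        "AbsoluteUri": root.findtext("AbsoluteUri"),
        "DiscoveryID": int(root.findtext("DiscoveryID")),  # assumed to be an integer
        "Score": float(root.findtext("Score")),            # assumed to be a float
        "Strength": int(root.findtext("Strength")),        # assumed to be an integer
    }


doc = read_document(SAMPLE)
print(doc["Score"], doc["Strength"])
```

With the values in a dict, results can be sorted or compared by Score or Strength to see how the two numbers behave across different queries.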


copyright 2004-2017, arachnode.net LLC