arachnode.net
An Open Source C# web crawler with Lucene.NET search using SQL Server 2008/2012/2014/2016/CE
An Open Source C# web crawler with Lucene.NET search using MongoDB/RavenDB/Hadoop


Is it possible to get specific data from APIs?


Top 25 Contributor
25 Posts
Dinesh posted on Tue, Jun 18 2013 9:14 AM

Hi All,

I would like to get specific content from the Facebook API. Is it possible to get specific data from APIs? If so, what approach should I take with the AN crawler?

 

Verified Answer

Top 10 Contributor
1,905 Posts
Verified by arachnode.net

Yes, this is possible.  FB's GraphAPI accepts standard HTTP WebRequests.

Look at 'EnableJavaScript.sql' in the 'Database' solution folder (at the top of the solution).  It is relevant because GraphAPI returns JavaScript/JSON.

Approaches, depending on your business requirements:

1.) Submit FB AbsoluteUris as straight CrawlRequests.

2.) Submit FB AbsoluteUris through a Plugin.
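
To illustrate the point that GraphAPI accepts standard HTTP WebRequests, here is a minimal sketch of a plain GET against a Graph API AbsoluteUri. It does not use arachnode.net's CrawlRequest or Plugin classes, and the class and method names (GraphApiRequestSketch, BuildUri, Fetch) are hypothetical. It also assumes the endpoint answers without an access token, which may not hold for the current Graph API.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper, not part of arachnode.net.
public static class GraphApiRequestSketch
{
    // Builds the Graph API AbsoluteUri for a page name, e.g. "amazon".
    public static Uri BuildUri(string pageName)
    {
        return new Uri("https://graph.facebook.com/" + Uri.EscapeDataString(pageName));
    }

    // Issues a plain HTTP GET and returns the response body as text.
    // Assumption: the response body is UTF-8 encoded JSON.
    public static string Fetch(Uri uri)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = "GET";

        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The same AbsoluteUri that BuildUri produces is what would be submitted to the crawler under either approach; Fetch is only there to show that nothing beyond an ordinary web request is involved.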

 

 

For best service when you require assistance:

  1. Check the DisallowedAbsoluteUris and Exceptions tables first.
  2. Cut and paste actual exceptions from the Exceptions table.
  3. Include screenshots.

Skype: arachnodedotnet

All Replies


Top 25 Contributor
25 Posts
Dinesh replied on Tue, Jun 18 2013 1:24 PM

Hi Mike,

Basically, we want to grab the id and description of a company from the FB GraphAPI, which returns its response in JSON format. I took the following steps to grab the id and description.

1. Executed the SQL statements in EnableJavaScript.sql.

2. Passed an absolute URI such as https://graph.facebook.com/amazon as a CrawlRequest from programs.cs.

3. We don't want to load the page into an HtmlAgilityPack HTML document and parse it, because the GraphAPI itself already gives us the required data in JSON format.

4. We don't see any data in decoded.html in the performAction method, since it's not an HTML page. So, how should I access the id and description in the performAction method?


Top 10 Contributor
1,905 Posts

Use Encoding.ASCII.GetString(crawlRequest.Data).
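
As a sketch of how the decoded bytes could then be turned into the id and description, assuming crawlRequest.Data is a byte[] holding the raw JSON response: the example below decodes the bytes and reads the two fields. The class name GraphApiResponseParser is hypothetical, and System.Text.Json is an assumption (it needs .NET Core 3.0+ or the System.Text.Json NuGet package; any JSON parser would do). UTF-8 is swapped in for ASCII here so that non-ASCII characters in a description survive; for plain-ASCII payloads the two decode identically.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical helper, not part of arachnode.net.
public static class GraphApiResponseParser
{
    // Decodes the raw response bytes and reads the "id" and "description"
    // fields from the top-level JSON object. A missing field yields null.
    public static (string Id, string Description) Parse(byte[] data)
    {
        // Assumption: the payload is UTF-8 encoded JSON.
        string json = Encoding.UTF8.GetString(data);

        using (JsonDocument document = JsonDocument.Parse(json))
        {
            JsonElement root = document.RootElement;

            string id = root.TryGetProperty("id", out JsonElement idElement)
                ? idElement.ToString()
                : null;

            string description = root.TryGetProperty("description", out JsonElement descriptionElement)
                ? descriptionElement.ToString()
                : null;

            return (id, description);
        }
    }
}
```

Inside a plugin this would be called as GraphApiResponseParser.Parse(crawlRequest.Data). Note that Parse throws if the bytes are not valid JSON or if the top-level value is not an object, so a caller may want to catch those cases for non-JSON pages.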


Top 25 Contributor
25 Posts
Dinesh replied on Tue, Jun 18 2013 2:10 PM

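Can't we use something like the code below?

var client = new FacebookClient();
dynamic me = client.Get("amazon");
// The Graph API returns ids as strings, so read the id as a string rather than an int.
string id = me.id;
string desc = me.description;

Of course, it seems this goes against the AN approach.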

Source: http://facebooksdk.net/docs/web/getting-started/

Top 10 Contributor
1,905 Posts

Don't know.  Never used.



copyright 2004-2017, arachnode.net LLC