<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://arachnode.net/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tag 'docmanager'</title><link>http://arachnode.net/search/SearchResults.aspx?s=7&amp;o=DateDescending&amp;tag=docmanager&amp;orTags=0</link><description>Search results matching tag 'docmanager'</description><dc:language>en-US</dc:language><generator>CommunityServer 2008.5 SP1 (Debug Build: 31106.3070)</generator><item><title>Managing Files</title><link>http://arachnode.net/forums/p/1678/15520.aspx#15520</link><pubDate>Mon, 23 May 2011 14:32:13 GMT</pubDate><guid isPermaLink="false">a2478770-777f-41ab-83b8-a21ff47ebb1f:15520</guid><dc:creator>bscott</dc:creator><description>&lt;p&gt;I need access to the contents of documents in the &lt;b&gt;CrawlRequestCompleted&lt;/b&gt; event.&amp;nbsp; For PDFs, I&amp;#39;m able to get this easily from the byte array attached to the &lt;b&gt;CrawlRequest&lt;/b&gt; using the &lt;b&gt;PDFManager&amp;#39;s GetText(byte[])&lt;/b&gt; method.&amp;nbsp; For Office documents, the best I&amp;#39;ve been able to do so far in the API is to use the &lt;b&gt;DOCManager&amp;#39;s GetText(string discoveryPath)&lt;/b&gt; method, which I believe reads the file from the hard drive.&amp;nbsp; Ideally, though, I would prefer not to save the files to the hard drive at all.&lt;/p&gt;
&lt;p&gt;These are the approaches I&amp;#39;m considering:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Delete each file after I&amp;#39;m done processing it&lt;/li&gt;
&lt;li&gt;Find another way to read the document contents&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Which would you recommend, and how would you recommend I go about it?&amp;nbsp; I&amp;#39;ve had trouble figuring out how to do the former through the API.&amp;nbsp; For the latter, I imagine I can find a solution from outside Arachnode, but I was hoping there might be something built in.&lt;/p&gt;</description></item></channel></rss>