-
Open cfg.Configuration and set both 'ExtractWebPageMetaData' and 'InsertWebPageMetaData' to 'true'. Then crawl, and examine 'WebPages_MetaData' once the crawl has run. -Mike
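
For anyone who would rather script the change than edit the table by hand, here is a rough T-SQL sketch of the steps above. The column names [Key] and [Value] are assumptions about how cfg.Configuration is laid out and are not confirmed anywhere in this thread, so check them against your own database before running anything.

```sql
-- Sketch only. ASSUMPTION: cfg.Configuration stores one setting per row,
-- with the setting name in [Key] and its value in [Value].
-- Verify the actual column names in your database first.
UPDATE cfg.Configuration
SET [Value] = 'true'
WHERE [Key] IN ('ExtractWebPageMetaData', 'InsertWebPageMetaData');

-- After the next crawl, look at what was stored.
-- (Add the schema prefix your installation uses, if any.)
SELECT TOP (10) *
FROM WebPages_MetaData;
```

If the table is laid out differently in your version, the same idea applies: flip both settings to 'true', crawl, then query 'WebPages_MetaData'.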
-
arachnode.net already supports the HtmlAgilityPack. However, the HtmlAgilityPack is a HUGE memory hog and has an extremely negative impact on the crawl rate. If you can avoid it, don't use it. If you have to use it, change the configuration setting for 'ExtractWebPageMetaData'...