Umbraco 7.11.1 GatheringNode()
Hi:
I have installed CogWorks Cogworks.ExamineFileIndexer version 1.0.3 which allows me to index PDF and MS Office documents. As such I have set up my indexer as follows
And in my SearchEventHandler class (public class SearchEventHandler : IApplicationEventHandler) I have created event handlers:
var pdfIndexer =
    (UmbracoMediaFileIndexer)ExamineManager.Instance.IndexProviderCollection[@"PDFIndexer"];
pdfIndexer.GatheringNodeData += HandlePDFGatheringNodeData;
pdfIndexer.DocumentWriting += HandlePDFDocumentWriting;
But can anyone tell me why UmbracoContext.Current.ContentCache is null in HandlePDFGatheringNodeData()?
And more importantly, how can I get access to the content within a PDF, XLS, or DOCX file?
For me, the only fields available in that handler are ID, nodeName, updateDate, writerName, path, nodeTypeAlias and parentID.
Note: Oddly enough, in HandlePDFDocumentWriting() I have access to 43 fields, one of which is FileTextContent. However, when I use this event to add that field to my index, Luke doesn't show it.
Thanks
Tom