Using Examine with doc and docx files
I have implemented a custom indexing / search function in a 4.7 site with Examine. I'm very impressed with how easy it was to index and search site content and PDF files.
My client has a fairly large number of Word documents (both .doc and .docx) that I would like to index and search as well.
I have searched the forum and Wiki but haven't really found any solution for indexing Word documents.
Has anybody achieved this?
Mikael,
You would need to create your own custom indexer. Take a look at the PDF indexer code and make use of IFilter to extract the Word content. The old umbSearch has some code showing how to extract Word content; see http://umbracoext.codeplex.com/SourceControl/changeset/view/56317#58384 and look in umbSearch/msOfficeFilter. You could rip that code out and put it in your indexer.
Regards
Ismail
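For the .docx half of the problem, the text-extraction step can be sketched without IFilter, since a .docx file is a zip archive whose main body lives in word/document.xml. The snippet below is only an illustration in Python, not the umbSearch code and not something Examine can load directly (a real indexer would be written in C#). The function name extract_docx_text is made up for this example, and the older binary .doc format is not handled at all.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper: it reads only the main document part, so
    headers, footers, footnotes and comments are ignored.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; its visible text sits in <w:t> elements.
    for para in root.iter(f"{{{W_NS}}}p"):
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

The returned string is the kind of value a custom indexer would store in a searchable field, in the same way the PDF indexer stores the text it extracts from a PDF.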
Thanks Ismail.
I will look at your suggestions.
Where does the code for the PDF indexer live?
Mikael
Mikael,
See http://examine.codeplex.com/SourceControl/changeset/view/82bd120bcbf1#UmbracoExamine.PDF%2fPDFIndexer.cs. There should also be some docs on the Examine CodePlex page on how to create your own indexer, but copying PDFIndexer should get you going. Once you have done that, you just need to update the Examine config settings and point them to your indexer to create your own Word doc index.
Regards
Ismail
Hi Guys,
I have implemented PDF search using Umbraco Examine on my site. My client now also wants to search .doc and .docx files. Can I do this with Umbraco Examine? I have done a lot of googling but haven't found a good solution. If anyone has a solution for searching .doc and .docx files, could you please share it or point me to some links?
Thanks
Bobby
Hi Bobby, did you have any luck with this? Ismail has been very kind and pointed me in the direction of a package he has created, but I would be interested to hear if you have found anything.
Craig,
What's the problem you are trying to solve?
Regards
Ismail