
  • Mikael Mørup 297 posts 326 karma points
    Jan 27, 2012 @ 10:20
    Mikael Mørup
    0

    Using Examine with doc and docx files

    I have implemented a custom indexing / search function in a 4.7 site with Examine. I'm very impressed with how easy it was to index and search site content and PDF files.

    My client has a fairly large number of Word documents (both .doc and .docx) that I would like to index and search as well.

    I have searched the forum and wiki but haven't really found any solution for indexing Word documents.

    Has anybody achieved this?

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Jan 27, 2012 @ 10:53
    Ismail Mayat
    1

    Mikael,

    You would need to create your own custom indexer. Take a look at the PDF indexer code and use IFilter to extract the Word content. The old umbSearch has some code showing how to extract Word content: see http://umbracoext.codeplex.com/SourceControl/changeset/view/56317#58384 and look in umbSearch/msOfficeFilter. You could lift that code out and put it in your indexer.

    Regards

     

    Ismail
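
For readers who only want to see what the text extraction step involves: the sketch below is not the IFilter approach described above and is not Examine code. It is a minimal Python illustration, assuming only that a .docx file is a ZIP archive whose main body is stored in word/document.xml. The function name extract_docx_text is hypothetical, and the older binary .doc format is not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration; headers, footers and
    footnotes live in other parts of the archive and are ignored.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; its visible text sits in <w:t> elements.
    for para in root.iter(f"{{{W_NS}}}p"):
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

The returned string is the kind of plain text a custom indexer would then hand to the index as a searchable field.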

  • Mikael Mørup 297 posts 326 karma points
    Jan 27, 2012 @ 11:00
    Mikael Mørup
    0

    Thanks Ismail.

    I will look at your suggestions.

    Where does the code for the PDF indexer live?

     

    Mikael

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Jan 27, 2012 @ 11:06
    Ismail Mayat
    1

    Mikael,

    See http://examine.codeplex.com/SourceControl/changeset/view/82bd120bcbf1#UmbracoExamine.PDF%2fPDFIndexer.cs. There should also be some docs on the Examine CodePlex page on how to create your own indexer, but copying PDFIndexer should get you going. Once you have done that, you just need to update the Examine config settings and point them at your indexer to create your own Word document index.

     

    Regards

     

    Ismail

  • Bobby 43 posts 63 karma points
    May 11, 2012 @ 13:28
    Bobby
    0

    Hi Guys,

     

    I have implemented PDF search using Umbraco Examine on my site. My client now wants to search .doc and .docx files as well. Can I do this with Umbraco Examine? I have done a lot of googling but haven't found a good solution. If anyone has a solution for searching .doc and .docx files, could you please share it or point me to some links?

     

    Thanks 

    Bobby

  • Craig Cronin 304 posts 503 karma points
    Oct 03, 2012 @ 09:31
    Craig Cronin
    0

    Hi Bobby, did you have any luck with this? Ismail has been very kind and pointed me in the direction of a package he has created, but I would be interested to hear if you have found anything.

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Oct 04, 2012 @ 09:26
    Ismail Mayat
    0

    Craig,

    What's the problem you are trying to solve?

    regards

    Ismail
