  • Connie DeCinko 931 posts 1159 karma points
    Jan 26, 2011 @ 20:15

    Search PDFs?

    Can XSLTSearch search within PDFs as well as my content nodes?


  • Ismail Mayat 4511 posts 10058 karma points MVP 2x admin c-trib
    Jan 27, 2011 @ 10:59

    Not out of the box. You could hack it together, and I think Doug has something in the pipeline to do it. Here is what you could do: add an extra tab to the File media type with a property called pdfcontent, of type multi-line text box. Add an action handler that, on save, checks whether the item is a file and whether that file is a PDF.

    Then, using PDFsharp or iTextSharp, extract the PDF content and copy it to the pdfcontent field. Update your XSLTSearch to also search that field in media. This is all theory; I have not tried it myself, but it may be possible. I would, however, recommend using Examine.
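
    The on-save step described above could be sketched roughly as follows. This is Python for illustration only, not Umbraco's action handler API: `on_media_save`, the dictionary-shaped media item, and the `extract_text` callback (standing in for whatever PDFsharp or iTextSharp would do) are all hypothetical names.

```python
from pathlib import Path
from typing import Callable, Dict

# Every PDF file begins with this header.
PDF_MAGIC = b"%PDF-"


def is_pdf(path: str, data: bytes) -> bool:
    """True when the file has a .pdf extension and starts with the PDF header."""
    return Path(path).suffix.lower() == ".pdf" and data.startswith(PDF_MAGIC)


def on_media_save(
    item: Dict[str, object],
    extract_text: Callable[[bytes], str],
) -> Dict[str, object]:
    """Hypothetical save hook: copy extracted PDF text into 'pdfcontent'.

    `item` is an assumed stand-in for a media item, with 'path' and 'data' keys.
    `extract_text` is an assumed stand-in for a real PDF text extractor.
    """
    path = str(item.get("path", ""))
    data = item.get("data", b"")
    if isinstance(data, bytes) and is_pdf(path, data):
        item["pdfcontent"] = extract_text(data)
    return item
```

    Non-PDF media passes through unchanged, so the search only sees a pdfcontent value on items where extraction actually ran.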



  • Douglas Robar 3570 posts 4671 karma points MVP 6x admin c-trib
    Jan 27, 2011 @ 16:10

    Ismail is correct, and his approach is what I'd do, though I'd make shadow pages in the content tree containing the extracted PDF text and a pointer to the original media file. Media isn't searched by XSLTsearch, and adding it with GetMedia() calls would be quite slow.

    Ismail is also correct that I would recommend Examine in this case. It is more challenging to set up initially than XSLTsearch, but it is a fantastic engine that is worth the effort to learn to use well.

