  • Raluca Dumitru 33 posts 174 karma points
    Jun 12, 2020 @ 09:52

    UmbracoExamine.Pdf search implementation

    Hi everyone

    I want to implement a PDF search using UmbracoExamine.Pdf, but I don't really know how to do it. I have used the class from this blog, but it doesn't work. The documentation I found on Examine PDF is for a multi-index searcher, but I don't want that. This is what my code looks like so far:

        [HttpPost]
        public ActionResult pdfSearch(string searchTerm)
        {
            var test = querySearchIndex(searchTerm);
            return Json(test);
        }

        private ISearchResults querySearchIndex(string searchTerm)
        {
            if (ExamineManager.Instance.TryGetIndex("PDFIndex", out var index))
            {
                ISearcher searcher = index.GetSearcher();
                IQuery query = searcher.CreateQuery(null, BooleanOperation.And);
                string searchFields = "fileTextContent";
                IBooleanOperation terms = query.GroupedOr(searchFields.Split(','), searchTerm);
                return terms.Execute();
            }
            else
            {
                throw new InvalidOperationException($"No Index found with name PDFIndex");
            }
        }

    What am I doing wrong? Thanks

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Jun 12, 2020 @ 11:27

    Raluca,

    I am assuming you have installed that package. In the Umbraco backoffice, in the Examine dashboard under Settings, do you see the PDF index, and are you able to search for items in that index? Does it find your PDF?

    This is what it looks like for my PDF index; you can see I have 32 items:

    [image: screenshot of the PDF index in the Examine dashboard]

    One thing to note is that the PDF indexer uses iTextSharp, which does not always extract PDF content.

  • Raluca Dumitru 33 posts 174 karma points
    Jun 12, 2020 @ 11:36

    Hi Ismail

    Yes, I've installed the package, and my backoffice looks like yours. My search also returns results there.

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Jun 12, 2020 @ 11:39

    Raluca,

    In your code, before calling terms.Execute(), can you call searcher.ToString() or terms.ToString()? One of those will give you the generated Lucene query. I need to see that to get a handle on what's going on.

    Regards

    Ismail
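
    A minimal sketch of that check, assuming the Examine 1.x namespaces that ship with Umbraco 8 and reusing the index and field names from the original post. The class name PdfSearchDebug, the method name SearchAndLogQuery, and the use of Debug.WriteLine as the output channel are illustrative assumptions, not part of the package:

```csharp
using System;
using System.Diagnostics;
using Examine;
using Examine.Search;

// Hypothetical helper for illustration only; not part of UmbracoExamine.Pdf.
public static class PdfSearchDebug
{
    public static ISearchResults SearchAndLogQuery(string searchTerm)
    {
        // "PDFIndex" is the index name used in the original post.
        if (!ExamineManager.Instance.TryGetIndex("PDFIndex", out var index))
        {
            throw new InvalidOperationException("No index found with name PDFIndex");
        }

        ISearcher searcher = index.GetSearcher();
        IQuery query = searcher.CreateQuery(null, BooleanOperation.And);
        IBooleanOperation terms = query.GroupedOr(new[] { "fileTextContent" }, searchTerm);

        // Write the generated query to the debug output before executing it.
        Debug.WriteLine(terms.ToString());

        return terms.Execute();
    }
}
```

    Setting a breakpoint on the Execute() line and inspecting terms in the debugger would serve the same purpose.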

  • Raluca Dumitru 33 posts 174 karma points
    Jun 12, 2020 @ 13:34

    Hi Ismail

    searcher.ToString() comes up null, and terms.ToString() is:

    { Category: , LuceneQuery: +(fileTextContent:protection) }
    

    'Protection' is my search term.

    Regards

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Jun 12, 2020 @ 13:57

    Raluca,

    OK, so the query itself looks fine to me. One more thing to try: can you download Luke (lukeall-3.5.0.jar) from https://code.google.com/archive/p/luke/downloads? You will need to have Java installed on your machine. Then run that jar, open the index with it, and in the Search tab paste:

    +(fileTextContent:protection)

    [image: screenshot of the Luke search tab]

    Use the standard analyser and see if you get any results.
