  • David Milor 2 posts 72 karma points
    Jun 27, 2018 @ 11:07

    Strange results from a Lucene search

    Hi all,

    I've been working on extending an existing Umbraco 7.8.1 site. This site has a global search (global in the sense that the search form is on every page and searches every published document).

    I've added a new section into the content tree and one of the requirements was a local search for the new section that only returned published content that was in the new section.

    I just call the normal search and then filter out any unwanted pages, which works very nicely indeed.

    However I've got a problem with the results. I have one page in the new section with its Name set to "An afternoon of foo bar" and it also has a property with the exact same value which is used to populate the title on a Hero component. I can see both of these properties & their values in the search index.

    If I search for "afternoon" I get one result as expected. If I search for "an afternoon" I get zero results. The word "afternoon" does not appear in any other properties on that page. Lucene can only be getting it from the two properties mentioned.

    Why can't Lucene find "an afternoon"? I know from working with Lucene on previous non-Umbraco projects that common words (such as "it", "and", "the") are counted as noise and discarded as search terms. Shouldn't this reduce the search term to "afternoon" and return the same result as manually searching for "afternoon"?

  • Frans de Jong 548 posts 1840 karma points MVP 3x c-trib
    Jun 28, 2018 @ 19:40

    I don't know the answer to this specific question, but in the past the following link helped me a lot:

    Examine search documentation

  • David Milor 2 posts 72 karma points
    Jul 02, 2018 @ 08:11

    Thank you Frans. I'll check it out.

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    Jul 02, 2018 @ 09:13

    David,

    Can you look through the source code and see which searcher is being used to do the search? Then look in the Examine settings config file for that searcher to see which analyser is being used. Can you also check which analyser is being used for indexing?

    If the analyser for both indexing and searching is the standard analyser, then the word "an", which is an English stop word, will be removed both by the indexer and by the query parser, so your query should get matches. If the analyser is a stop-word analyser, then your query will be a phrase query: it will try to match your query exactly, and if the words are not adjacent it will not match. In that case you would need to do a grouped OR when building your query.
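
    To make the analyser point concrete, here is a rough Python sketch of what could be happening. It is a toy model, not Lucene or Examine: the stop-word list, the two analyser functions and the phrase matcher are all made up for illustration, and it assumes (unconfirmed) that the index and the query are analysed differently in your setup.

```python
# Toy model only: none of these names come from Lucene or Examine.
STOP_WORDS = {"a", "an", "and", "it", "of", "the"}  # assumed, abbreviated list


def drop_stop_words(text):
    """Lowercase, split on whitespace and discard stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def keep_all_words(text):
    """Lowercase and split on whitespace, keeping every word."""
    return text.lower().split()


def phrase_match(indexed_tokens, query_tokens):
    """True if query_tokens appear as one contiguous run in indexed_tokens."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        indexed_tokens[i:i + n] == query_tokens
        for i in range(len(indexed_tokens) - n + 1)
    )


def search(title, query, index_analyser, query_analyser):
    """Analyse the title and the query separately, then phrase-match them."""
    return phrase_match(index_analyser(title), query_analyser(query))


title = "An afternoon of foo bar"

# Same analyser on both sides: "an" is dropped twice, so the phrase matches.
same = search(title, "an afternoon", drop_stop_words, drop_stop_words)

# Mismatch: "an" was dropped from the index but kept in the query,
# so the phrase "an afternoon" can never be found.
mismatch = search(title, "an afternoon", drop_stop_words, keep_all_words)

print(same, mismatch)
```

    If your config turns out to look like the second case, that would explain why "afternoon" finds the page and "an afternoon" does not. Real Lucene also tracks token positions, which this sketch ignores.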

    Could you also write out the generated query so we can see what's going on? Look at your search code: where you call .Search you will be passing in a criteria / query object. Call .ToString on that object and it will give you the generated Lucene query.

    Regards

    Ismail
