
  • Andreas Kristensen 71 posts 285 karma points c-trib
    May 18, 2020 @ 07:37
    Andreas Kristensen
    0

    "The" seems to be ignored in search

    I have a site with a lot of nodes whose names contain "the double".

    If I search for the term "the double", I get results for "the", but no results containing "the double".

    If I then search for "double", I get the expected results that contain "the double" in the node name.

    What gives?

  • Marc Goodson 2155 posts 14408 karma points MVP 9x c-trib
    May 20, 2020 @ 22:45
    Marc Goodson
    100

    Hi Andreas

    Examine is a wrapper around Lucene, and the Lucene standard analyzer has a list of English 'stop words' that it doesn't index for efficiency reasons, I guess because they are so frequent that a search for 'a' or 'it' or 'the' would be meaningless. So when you search for 'the double', the 'the' part is ignored...

    The full list of stop words is in this code sample:

    http://alvinalexander.com/java/jwarehouse/lucene/src/java/org/apache/lucene/analysis/StopAnalyzer.java.shtml

    There is an explanation of the issue here in the ezSearch repo, for V7

    https://github.com/umco/umbraco-ezsearch/issues/23

    where one workaround is to remove the stop words from the search terms...

    otherwise you could use a different analyser...
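
A rough sketch of the first workaround (stripping stop words from the search terms before they reach the index) might look like the following. It is written in Python purely for illustration rather than as Examine or Lucene code. The function name `strip_stop_words` is hypothetical, and the word list is what I believe the default English stop word set in the linked StopAnalyzer contains, so check it against the Lucene version you are actually running.

```python
import re

# Assumed copy of Lucene's default English stop word set (from StopAnalyzer).
# Verify against the Lucene version your Examine install ships with.
LUCENE_ENGLISH_STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
})


def strip_stop_words(query: str) -> str:
    """Return the query with stop words removed (hypothetical helper).

    Terms are lower-cased and split on non-word characters. If every term
    is a stop word, the original terms are returned unchanged so the
    caller never ends up with an empty query.
    """
    terms = re.findall(r"\w+", query.lower())
    kept = [t for t in terms if t not in LUCENE_ENGLISH_STOP_WORDS]
    return " ".join(kept if kept else terms)


if __name__ == "__main__":
    print(strip_stop_words("the double"))
```

With this in place, a search for "the double" would be sent to the index as just "double", which is the query that already returns the expected nodes. It does not make "the" searchable; for that you would still need a different analyser.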

    regards

    Marc
