
  • Richard Brookes 34 posts 57 karma points
    Nov 04, 2014 @ 21:48

    Not finding common terms

    I am not sure if this is the correct place to post this issue, because it feels like it could be a more general Examine issue, but let's see:

    I am using ezSearch and have come across the following strange problem. Let's say I have a node with the name "This is not my cat". A search for "this is not my cat" returns no results.

    If I search for just the term "cat", then my item does appear in the search results.

    On further investigation, it appears that a whole series of words cause this problem: words like "and", "the", "it", etc.

    If I change the name of my node to "thisx isx notx myx cat", then my search for "this is not my cat" does return the correct item. So it finds these common words when they are a fragment of a longer word.

    My Examine settings are all just the defaults.

    If anybody can shed any light on this, it would be much appreciated.

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    Nov 05, 2014 @ 10:45

    Richard,

    ezSearch uses Examine, which under the hood uses Lucene.Net. In Lucene.Net you have the concept of analyzers, and different ones do different things. ezSearch uses the ExternalIndexer, which uses the StandardAnalyzer. This analyzer removes English stop words both at the point of indexing and at the point of searching; see http://www.codeproject.com/Articles/32175/Lucene-Net-Text-Analysis for more information. In your example the text

    "This is not my cat"

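    will end up in the index as "my cat" ("this", "is" and "not" are on the default English stop word list; "my" is not). When you search using the same phrase, it should just search on the remaining terms and work, because the query analyzer is also the StandardAnalyzer and it will strip the same stop words.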
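
    To make the stop word behaviour concrete, here is a rough sketch in Python. It is not Lucene.Net code and not what ezSearch actually runs; it only imitates the lowercase, tokenise and drop-stop-words steps. The stop list is an assumption, written from memory of Lucene's default English set, so check it against the list in your Lucene.Net version. The function name analyse is made up for the illustration. Note that with this list "my" is not a stop word, so it survives next to "cat".

```python
import re

# Assumption: approximates Lucene's default English stop set (from memory).
# Verify against the stop list shipped with your Lucene.Net version.
ENGLISH_STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
})


def analyse(text):
    """Hypothetical stand-in for an analyzer: lowercase the text,
    split it into word tokens, and drop any token on the stop list."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in ENGLISH_STOP_WORDS]


# The node name from the question: the stop words are gone before indexing.
print(analyse("This is not my cat"))

# Whole tokens are compared, so "thisx" or "isx" never match a stop word.
print(analyse("thisx isx notx myx cat"))
```

    The second call shows why the renamed node keeps every word: the filter compares whole tokens, so a stop word that is only a fragment of a longer token is never removed.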

    Regards

    Ismail
