  • nickornotto 402 posts 905 karma points
    May 10, 2019 @ 01:07

    Is Examine search ignoring certain keywords, or is my query wrong?

    I am building an Examine search in Umbraco that filters by keyword, country code, or city, but some searches return wrong results.

    For example, I search for countryCode = IN and startCity = New Delhi.

    I pass the required (MUST) properties as a dictionary:

    Dictionary<string, string> matchProperties = new Dictionary<string, string> { { "countryCode", "IN" }, { "startCity", "New Delhi" } };
    

    But Examine builds the query as:

    SearchIndexType: "content", LuceneQuery: {+(__NodeTypeAlias:product +startCity:"new delhi" -umbracoNaviHide:1) +__IndexType:content}
    

    evidently ignoring the countryCode value IN.

    The same happens if I search only by country:

    Dictionary<string, string> matchProperties = new Dictionary<string, string> { { "countryCode", "IN" } };
    

    I receive the following query:

    SearchIndexType: "content", LuceneQuery: {+(__NodeTypeAlias:product -umbracoNaviHide:1) +__IndexType:content}
    

    Here I am getting wrong results, because the search returns all product nodes instead of only those with countryCode IN.

    If I search by keyword (this uses the OR properties, so it should return products where any of the fields contains the keyword), I get:

    SearchIndexType: "content", LuceneQuery: {+(__NodeTypeAlias:product heading:danakil tags:danakil description:danakil -umbracoNaviHide:1) +__IndexType:content}
    

    It again returns all nodes, which is wrong: only a few nodes (up to 20) contain this specific word, case-sensitive or not.

    This is how I am building my query:

        // Builds the query from the required (AND) and optional (OR) field matches, then runs it on a worker thread.
        protected async Task<List<SearchResult>> SearchCriteriaResultAsync(Examine.Providers.BaseSearchProvider searcher, ISearchCriteria searchCriteria, string contentAliasToMatch, bool excludeHidden, Dictionary<string, string> matchProperties, Dictionary<string, string> matchOrProperties)
        {
            // Start from the document type; every further clause is chained onto this.
            IBooleanOperation query = searchCriteria.NodeTypeAlias(contentAliasToMatch);
    
            // Required (MUST) fields.
            if (matchProperties != null && matchProperties.Any())
            {
                foreach (var item in matchProperties)
                {
                    query = query.And().Field(item.Key, item.Value);
                }
            }
    
            // Optional (OR) fields, used for the keyword search.
            if (matchOrProperties != null && matchOrProperties.Any())
            {
                foreach (var item in matchOrProperties)
                {
                    query = query.Or().Field(item.Key, item.Value);
                }
            }
    
            // Exclude nodes hidden from navigation.
            if (excludeHidden)
            {
                query = query.Not().Field("umbracoNaviHide", "1");
            }
    
            return await Task.Run(() => searcher.Search(query.Compile()).ToList());
        }
    

    And my indexer and searcher:

        <add name="PublishedProductsIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
             supportUnpublished="false"
             supportProtected="true"
             interval="10"
             analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>
    
    
    
        <add name="PublishedProductsSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
             analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>
    
    
    
    <IndexSet SetName="PublishedProductsIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/PublishedProducts/" IndexParentId="1049">
        <IndexAttributeFields>
            <add Name="id" />
            <add Name="nodeName" />
            <add Name="updateDate" />
            <add Name="writerName" />
            <add Name="path" />
            <add Name="email" />
            <add Name="nodeTypeAlias" />
            <add Name="parentID" />
        </IndexAttributeFields>
        <IncludeNodeTypes>
            <add Name="product"/>
        </IncludeNodeTypes>
    </IndexSet>
    

    Is Examine ignoring certain keywords, or is my query wrong? If it is wrong, how should I fix it?

    I am using Umbraco 7.6, if this matters.

  • Marc Goodson 2155 posts 14406 karma points MVP 9x c-trib
    May 18, 2019 @ 08:41

    Hi manila

    Examine is a wrapper around Lucene, and Lucene is a highly performant text search engine. Depending on the analyzer used, some common English words are ignored. Why take up space indexing words like 'it' or 'of' that nobody would search for?

    These are referred to as 'stop words', and for the standard analyzer they are the following words:

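    final List<String> stopWords = Arrays.asList(
      "a", "an", "and", "are", "as", "at", "be", "but", "by",
      "for", "if", "in", "into", "is", "it",
      "no", "not", "of", "on", "or", "such",
      "that", "the", "their", "then", "there", "these",
      "they", "this", "to", "was", "will", "with"
    );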

    You'll notice "in" is on the list, which is why your query fails for searches in India: the value IN is lowercased to "in" by the analyzer and then ignored as a stop word.
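    To make the effect concrete, here is a rough sketch of what the analyzer does to each search value. It is plain C# and does not call Lucene.Net at all: the class name `StopWordDemo` and the method `Analyze` are made up for illustration, and the tokenising is a crude approximation of the standard analyzer (split, lowercase, drop stop words), assuming the default English stop word list.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical stand-in for the analyzer's behaviour; NOT Lucene.Net code.
    public static class StopWordDemo
    {
        // Assumed to match the default English stop words of the standard analyzer.
        private static readonly HashSet<string> StopWords = new HashSet<string>
        {
            "a", "an", "and", "are", "as", "at", "be", "but", "by",
            "for", "if", "in", "into", "is", "it",
            "no", "not", "of", "on", "or", "such",
            "that", "the", "their", "then", "there", "these",
            "they", "this", "to", "was", "will", "with"
        };

        // Crude approximation: split on whitespace and simple punctuation,
        // lowercase each token, then drop any token that is a stop word.
        public static List<string> Analyze(string text)
        {
            return text
                .Split(new[] { ' ', '\t', ',', '.', ';', ':' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(token => token.ToLowerInvariant())
                .Where(token => !StopWords.Contains(token))
                .ToList();
        }

        public static void Main()
        {
            // "IN" becomes "in", which is a stop word, so no tokens are left.
            Console.WriteLine("[" + string.Join("|", Analyze("IN")) + "]");
            // Neither "new" nor "delhi" is a stop word, so both survive.
            Console.WriteLine("[" + string.Join("|", Analyze("New Delhi")) + "]");
            // A full country name is not a stop word either.
            Console.WriteLine("[" + string.Join("|", Analyze("India")) + "]");
        }
    }
    ```

    With no tokens left for countryCode, there is nothing to search on for that field, which would explain why the clause is missing from the generated query while startCity:"new delhi" is still there.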

    There is some info here about how you could modify this list:

    https://our.umbraco.com/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search#comment-95336

    Alternatively, I guess you could store more text in the index to identify the country.

    regards

    Marc
