  • Pickels 75 posts 108 karma points
    Aug 29, 2010 @ 16:46
    Pickels
    0

    Examine: Leading wildcards

    Hello,

    If I understand correctly, leading wildcards are not allowed in Lucene/Examine because of performance issues. But unlike English, Dutch joins words into compounds, so 'door handle' becomes 'doorhandle'.

    How can I make sure that somebody searching for 'handle' also finds 'doorhandle'?

  • Pickels 75 posts 108 karma points
    Aug 29, 2010 @ 18:03
    Pickels
    0

    Or does anybody know how to turn on QueryParser.SetAllowLeadingWildcard? Since it's a small website, performance won't be a big issue anyway.

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Aug 30, 2010 @ 01:07
    Aaron Powell
    0

    You would have to create your own searcher (you can inherit the default one) and customize the ISearchCriteria implementation.

    If you're indexing non-English content I recommend that you use a different analyzer, although I'm not sure if there is a Dutch one.

  • Martin Lingstuyl 202 posts 379 karma points
    Jan 26, 2011 @ 10:42
    Martin Lingstuyl
    0

    Hi,

    I'm running into the same problem.

    How did you solve it, Pickels?

    Martin

  • Pickels 75 posts 108 karma points
    Jan 26, 2011 @ 11:42
    Pickels
    0

    For that project we didn't fix it and just used the default.

  • Martin Lingstuyl 202 posts 379 karma points
    Jan 26, 2011 @ 15:46
    Martin Lingstuyl
    0

    Slace, could you say something about how to go about that? ("You would have to create your own searcher (you can inherit the default one) and customize the ISearchCriteria implementation.")

    I'm having the same problem here. I can't understand why the leading wildcard story is so hard to get right. Every end user searching for 'handle' will expect to find 'doorhandle' as well.

    Thanks.

    Martin

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Jan 26, 2011 @ 21:51
    Aaron Powell
    0

    Here's the Lucene info about wildcards: http://wiki.apache.org/lucene-java/LuceneFAQ#What_wildcard_search_support_is_available_from_Lucene.3F

    With Examine you need to create your own Searcher, and when the QueryParser is created you need to enable prefix wildcards. This is part of the FluentAPI, so you have to customize that as well.

    There's a reason this isn't exposed: Lucene doesn't recommend doing it, and if you want to do it you have to dig pretty deep.

  • Pickels 75 posts 108 karma points
    Mar 03, 2011 @ 13:53
    Pickels
    0

    I noticed that there is now an option enableLeadingWildcards="true". Is this new, and does it mean that Examine now supports leading wildcards?

    Because when I do:

    criteria.RawQuery(String.Format("nodeName:*{0}*", SearchTerm));

    I still get:

    '*' or '?' not allowed as first character in WildcardQuery

  • Pierre-Yves Savard 1 post 71 karma points
    Feb 15, 2019 @ 20:20
    Pierre-Yves Savard
    0

    Be aware that the correct setting to use is enableLeadingWildcard="true"

    Notice that there is no "s" at the end of the config attribute ;)

    I strongly suggest looking at the config file linked by Shannon Deminick later in this thread:

    https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web.UI/config/ExamineSettings.config#L42
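
    For reference, the entry at that line should look roughly like the following. This is a sketch reproduced from memory rather than copied from the file, so treat the searcher name and the analyzer as assumptions and check them against the linked config:

    ```xml
    <!-- ExamineSettings.config, inside <ExamineSearchProviders><providers> -->
    <!-- Searcher name and analyzer are assumed; the attribute is enableLeadingWildcard, singular -->
    <add name="InternalMemberSearcher"
         type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
         enableLeadingWildcard="true"/>
    ```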

  • Shannon Deminick 1524 posts 5270 karma points MVP 2x
    Mar 03, 2011 @ 14:08
    Shannon Deminick
    0

    Are you sure that's not just on the members searcher?

  • Pickels 75 posts 108 karma points
    Mar 03, 2011 @ 14:17
    Pickels
    0

    Ah so I see. I guess it's not available on UmbracoExamineSearcher?

  • Shannon Deminick 1524 posts 5270 karma points MVP 2x
    Mar 03, 2011 @ 14:21
    Shannon Deminick
    0

    Pretty sure that's correct (it's been a while since I've been in the source). As Aaron mentioned, it's not implemented because Lucene doesn't recommend it. I don't want to be blamed for bringing down people's sites because of overusing leading wildcards :) It would be fairly simple to make your own searcher: just inherit from UmbracoExamineSearcher. You'll have to check out the source to figure out what to override, but I don't think it would be very hard.

  • Pickels 75 posts 108 karma points
    Mar 03, 2011 @ 14:42
    Pickels
    0

    I will check it out for sure this time. I should also do some research on how other Dutch websites use Lucene.

  • Shannon Deminick 1524 posts 5270 karma points MVP 2x
    Mar 09, 2011 @ 00:32
    Shannon Deminick
    0

    I'm just doing some updates on the codebase of Examine and noticed (forgot) that the enableLeadingWildcard parameter is available on the base class of LuceneSearcher which means that it should be available on all searchers, not just member searcher.
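
    If that holds, enabling it on a content searcher should just be a matter of adding the same attribute to that searcher's entry in ExamineSettings.config. A minimal sketch, assuming a searcher named ExternalSearcher that uses the standard analyzer (both are assumptions, so substitute whatever your own config declares):

    ```xml
    <!-- Hypothetical content searcher entry; only the last attribute is the point here -->
    <add name="ExternalSearcher"
         type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
         analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"
         enableLeadingWildcard="true"/>
    ```

    With that in place, a raw query such as nodeName:*handle* should no longer be rejected for starting with a wildcard.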

  • Nuno Lourenço 4 posts 30 karma points
    May 25, 2011 @ 20:10
    Nuno Lourenço
    5

    I've managed to do it by creating my own searcher:

    public class MyUmbracoExamineSearcher : UmbracoExamineSearcher
    {
        public MyUmbracoExamineSearcher() : base()
        {
            // Allow queries that start with '*' or '?'
            this.EnableLeadingWildcards = true;
        }
    }

    Defining it on the settings:

          <add name="CustomSearchSearcher" 
               type="MyNamespace.MyUmbracoExamineSearcher, MyNamespace"
               analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"
               enableLeadingWildcards="true"/>

    When I try to search with something like:

    var searchProvider = ExamineManager.Instance.SearchProviderCollection["CustomSearchSearcher"];
    var searchCriteria = searchProvider.CreateSearchCriteria();
    IBooleanOperation query = searchCriteria.NodeName("*" + keyword.MultipleCharacterWildcard())
                                            .Or()
                                            .Field("categoryDescription", "*" + keyword.MultipleCharacterWildcard())
                                            .Or()
                                            .Field("postContent", "*" + keyword.MultipleCharacterWildcard());
    searchProvider.Search(query.Compile());

    This does not return as many results as I was expecting, but changing the query from the Fluent API to a simple raw version works fine :)

    searchProvider.Search(searchCriteria.RawQuery(string.Format("nodeName:*{0}* categoryDescription:*{0}* postContent:*{0}*", keyword)));

    And that made the road so much nicer :)

    Cheers guys,

    Hope this helped.

  • Tobias Lopez 64 posts 210 karma points
    Aug 10, 2015 @ 13:55
    Tobias Lopez
    0

    Hi

    Does this also work on the newer Umbraco version (7.2.4)?

    I'm getting a YSOD: the provider has to inherit Examine.Providers.BaseIndexProvider

    thanks

  • nojaf 91 posts 300 karma points
    Jun 04, 2014 @ 15:50
    nojaf
    0

    Hello Nuno,

    I've tried your approach but can't seem to get it working. My code is the following:

            // Wrap every word of the search string in leading and trailing wildcards.
            string[] terms = searchString.Split(' ').Select(x => string.Format("*{0}*", x)).ToArray();

            var searchFields = new List<string>();
            var searchTerms = new List<string>();

            // Pair each term with every field (fields is declared elsewhere).
            foreach (var t in terms)
            {
                searchTerms.AddRange(fields.Select(_ => t));
                searchFields.AddRange(fields);
            }

            // Pass our lists to GroupedOr, compile and execute the search.
            BooleanQuery.SetMaxClauseCount(99999);
            var query = ExamineManager.Instance.CreateSearchCriteria().GroupedOr(searchFields, searchTerms.ToArray());
            var search = ExamineManager.Instance.Search(query.Compile());


    Does anyone see a major hole in this?

    Thanks

  • Carola 4 posts 79 karma points
    Sep 30, 2015 @ 12:45
    Carola
    0

    Hello,

    did anyone find a solution to the leading wildcard problem?

    I tried using the QueryParser and set leading wildcards like this:

    queryParser.SetAllowLeadingWildcard(true);
    

    and handing it to the searcher, but I just can't get it to work. Whatever I try, I get the same frustrating exception from Examine, telling me that * or ? are not allowed as the first character in a wildcard search. I don't know what to do... Is there a search provider other than Examine that allows leading wildcards?

  • Shannon Deminick 1524 posts 5270 karma points MVP 2x
    Sep 30, 2015 @ 12:47
    Shannon Deminick
    2

    Leading wildcards work, it's part of the core and you can just set this with configuration. For example, we have this enabled by default for the members searcher:

    https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web.UI/config/ExamineSettings.config#L42

  • Carola 4 posts 79 karma points
    Sep 30, 2015 @ 13:06
    Carola
    2

    Thanks, I set the option to true for all my searchers in the Examine settings, but still no luck with the search on my website. I used this article https://our.umbraco.org/documentation/Reference/Searching/Examine/overview-explanation as a guide, using a raw Lucene query. Maybe I have to use the fluent API to get the leading wildcards working?

    Edit: oh my, this is so embarrassing. I found my mistake in ExamineSettings.config: I wrote enableLeadingWildcards="true" instead of enableLeadingWildcard="true". Once I removed the "s" from "wildcards", everything worked as it should, with no errors and no exceptions...

    Thank you very much for your help!
