I've tried your approach but can't seem to get it working.
My code is the following:
// Wrap every search word in wildcards: "door handle" -> "*door*", "*handle*".
string[] terms = searchString.Split(' ').Select(x => string.Format("*{0}*", x)).ToArray();
var searchFields = new List<string>();
var searchTerms = new List<string>();
foreach (var t in terms)
{
    // Pair each term with every field (fields is assumed to hold the index field names to search).
    searchTerms.AddRange(fields.Select(_ => t));
    searchFields.AddRange(fields);
}
// Raise Lucene's clause limit, since wildcard queries can expand into many clauses.
BooleanQuery.SetMaxClauseCount(99999);
// Pass our lists to GroupedOr, compile and execute the search.
var query = ExamineManager.Instance.CreateSearchCriteria().GroupedOr(searchFields, searchTerms.ToArray());
var search = ExamineManager.Instance.Search(query.Compile());
Did anyone find a solution to the leading wildcard problem?
I tried using the QueryParser and set leading wildcards like this:
queryParser.SetAllowLeadingWildcard(true);
and handing it to the searcher, but I just can't get it to work. Whatever I try, I get the same frustrating exception from Examine, telling me that * or ? are not allowed as first character in a wildcard search.
I don't know what to do...
Is there another search provider than Examine, which allows leading wildcards?
Examine: Leading wildcards
Hello,
If I understand correctly, leading wildcards are not allowed in Lucene/Examine because of performance issues. But unlike in English, in Dutch we like to join words together, so 'door handle' becomes 'doorhandle'.
So I wonder: how can I make sure that somebody who searches for 'handle' also finds 'doorhandle'?
Or does anybody know how to turn on QueryParser.SetAllowLeadingWildcard? Since it's a small website, performance won't be a big issue anyway.
You would have to create your own searcher (you can inherit the default one) and customize the ISearchCriteria implementation.
If you're indexing non-English content I recommend that you use a different analyzer, although I'm not sure if there is a Dutch one.
Hi,
running into the same problem.
How did you solve it, pickels?
martin
For that project we didn't fix it and just used the default.
Slace, could you say something on how to go about that? (You would have to create your own searcher (you can inherit the default one) and customize the ISearchCriteria implementation.)
Having the same problem here. I cannot understand why the leading wildcard story is so hard to get right. Every end user searching for 'handle' will expect to also find 'doorhandle'.
thanks.
Martin
Here's the Lucene info about wildcards: http://wiki.apache.org/lucene-java/LuceneFAQ#What_wildcard_search_support_is_available_from_Lucene.3F
With Examine you need to create your own Searcher, and when the QueryParser is created you need to enable leading (prefix) wildcards. This is part of the Fluent API, so you have to customize that as well.
There's a reason this isn't exposed: Lucene doesn't recommend doing it, and if you want to do it you have to dig pretty deep.
I noticed that there is now an option enableLeadingWildcards="true". Is this new, and does this mean that Examine now supports leading wildcards?
Because when I do:
I still get:
'*' or '?' not allowed as first character in WildcardQuery
Be aware that the correct setting to use is
enableLeadingWildcard="true"
Notice that there is no "s" at the end of the config variable ;)
I strongly suggest looking at the config file suggested by Shannon Deminick later in this post:
https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web.UI/config/ExamineSettings.config#L42
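For anyone who wants to see where the attribute goes, here is a minimal sketch of a searcher entry in ExamineSettings.config. The provider name and type are assumptions based on a default Umbraco 7 install; keep whatever attributes your own entry already has and only add enableLeadingWildcard.

```xml
<!-- config/ExamineSettings.config (fragment, not a complete file) -->
<ExamineSearchProviders defaultProvider="ExternalSearcher">
  <providers>
    <!-- Assumed default provider entry; the only attribute that matters here
         is enableLeadingWildcard (singular, no trailing "s"). -->
    <add name="ExternalSearcher"
         type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
         enableLeadingWildcard="true" />
  </providers>
</ExamineSearchProviders>
```

With the attribute spelled this way, a query term that starts with * or ? should no longer trigger the "not allowed as first character in WildcardQuery" exception. Going by the reports in this thread, the misspelled enableLeadingWildcards has no effect.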
Are you sure that's not just on the members searcher?
Ah so I see. I guess it's not available on UmbracoExamineSearcher?
Pretty sure that's correct (it's been a while since I've been in the source). As Aaron mentioned, it's not implemented because Lucene doesn't recommend it. I don't want to be blamed for bringing down people's sites because of overusing leading wildcards :) It would be fairly simple to make your own searcher: just inherit from the UmbracoExamineSearcher. You'll have to check out the source to figure out what to override... I don't think it would be very hard.
I will check it out for sure this time. I should also do some research on how other Dutch websites use Lucene.
I'm just doing some updates on the Examine codebase and noticed (I had forgotten) that the enableLeadingWildcard parameter is available on the LuceneSearcher base class, which means that it should be available on all searchers, not just the members searcher.
I've managed to do it by creating my own searcher:
Defining it on the settings:
When I try to search with something like:
This does not return as many results as I was expecting, but changing the query from the Fluent API to a simple Raw version works fine :)
And that made the road so much nicer :)
Cheers guys,
Hope this helped.
Hi
Does this also work on the newer Umbraco version (7.2.4)?
I'm getting a YSOD: the provider has to inherit Examine.Providers.BaseIndexProvider
thanks
Hello Nuno,
I've tried your approach but can't seem to get it working. My code is the following:
Does anyone see a major hole in this?
Thanks
Hello,
Did anyone find a solution to the leading wildcard problem?
I tried using the QueryParser and set leading wildcards like this:
and handing it to the searcher, but I just can't get it to work. Whatever I try, I get the same frustrating exception from Examine, telling me that * or ? are not allowed as first character in a wildcard search. I don't know what to do... Is there another search provider than Examine, which allows leading wildcards?
Leading wildcards work, it's part of the core and you can just set this with configuration. For example, we have this enabled by default for the members searcher:
https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web.UI/config/ExamineSettings.config#L42
Thanks, I set the option to true for all my searchers in the examine settings, but still no luck with the search on my website. I used this article https://our.umbraco.org/documentation/Reference/Searching/Examine/overview-explanation as a guide, using a raw lucene query. Maybe I have to use the fluent API to get the leading wildcards working?
Edit: oh my, this is so embarrassing. Found my mistake in the ExamineSettings.config. I wrote enableLeadingWildcards(true) instead of enableLeadingWildcard(true). So that was all, removed the "s" from wildcards and everything worked as it should, no errors, no exceptions...
Thank you very much for your help!