Examine - How to combine wildcards with custom analyzer?
Hello,
In Examine I have added an index and a search procedure that uses wildcards, so that when the user enters only 'Belg', for example, all entries for 'België' are returned as well.
Here's the code:
if (!String.IsNullOrEmpty(SearchView.SearchTerm))
{
    foreach (string token in Helper.Tokenize(SearchView.SearchTerm))
    {
        Criteria.GroupedOr(SearchView.SearchFields.ToArray(), token.MultipleCharacterWildcard());
    }
}
SearchView.Results = Searcher.Search(Criteria).ToList().OrderByDescending(x => x.Score);
This works fine. But now I want to add a custom analyzer that replaces all special characters; for instance, 'ë' becomes 'e'. That way, whether the user enters 'België' or 'Belgie', the same results are returned.
This means that the search term the user has entered needs to be converted in the same way as the indexed values. I can do this by using a QueryParser. Since a QueryParser returns a full query, I cannot use the GroupedOr method anymore, because that only works on strings.
So I make a raw query and pass that to my searcher:
MultiFieldQueryParser queryParser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_29, SearchView.SearchFields.ToArray(), new TechKeywordAnalyzer());
if (!String.IsNullOrEmpty(SearchView.SearchTerm))
{
    foreach (string token in Helper.Tokenize(SearchView.SearchTerm))
    {
        // Parse each token with the custom analyzer and mark it as required (+) in the raw query.
        Query.AppendFormat("+({0}) ", queryParser.Parse(token));
    }
}
SearchView.Results = Searcher.Search(Criteria.RawQuery(Query.ToString())).ToList().OrderByDescending(x => x.Score);
This also works, but I cannot use the wildcard search in this one!
When I change the parameter token to token.MultipleCharacterWildcard() in the Query.AppendFormat call, nothing is passed to the analyzer...
I guess the answer should be in the first code snippet. When I look in the debugger I can see that Criteria (which is an ISearchCriteria) itself contains a QueryParser with my custom analyzer. I expected that the token I pass to this Criteria in the GroupedOr method would be parsed through this QueryParser, but it seems that is not the case.
Does anyone have a solution for this problem?
thanks,
Frans
Frans,
I came across this when wildcarding with non-English languages. You need to create your own ASCII folding filter that folds the accented characters to their ASCII equivalents. See https://gist.github.com/ismailmayat/83715613236db7ae8742a180f8d3abed
See the comments in that gist. Basically, you need to fold first and then add the wildcard.
Regards
Ismail
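The "fold first, then add the wildcard" order described above can be sketched in plain .NET. This is only an illustration and not the code from the gist: it swaps in Unicode normalization for a Lucene ASCII folding filter, and the names WildcardFolding, Fold and FoldThenWildcard are hypothetical. Unicode decomposition only covers characters that split into a base letter plus a combining mark (such as 'ë'), so characters like 'ø' or 'ß' would still need a real folding filter.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class WildcardFolding
{
    // Strips combining diacritical marks, so "België" becomes "Belgie".
    public static string Fold(string input)
    {
        // FormD splits an accented character into base letter + combining mark.
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var folded = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Keep everything except the combining marks themselves.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                folded.Append(c);
            }
        }
        return folded.ToString().Normalize(NormalizationForm.FormC);
    }

    // Folds the token first and only then appends the wildcard,
    // because the wildcard term itself is not run through the analyzer.
    public static string FoldThenWildcard(string token)
    {
        return Fold(token) + "*";
    }
}
```

With Examine, the folded token would presumably be handed to the first snippet instead of the raw one, for example Criteria.GroupedOr(SearchView.SearchFields.ToArray(), WildcardFolding.Fold(token).MultipleCharacterWildcard()). This assumes the index was built with an analyzer that folds the same characters.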
Hi Ismail,
This works, thanks!
I see that one of the differences from my code is that you pass string.Empty as the field parameter to the new QueryParser. I did not know that was allowed, so in my code the parser returned a full query including the field name. That was the reason I had to use a raw query in the second snippet. Now I can use the ISearchCriteria again, which I prefer.
thanks,
Frans
I touch on this in the Examine course; in fact, just last Friday Dennis Spijkerboer asked me about this. He will be running the Examine course in the Netherlands, I think.
Regards
Ismail