Hyphens in Lucene search (Examine)
Hi! I have a Lucene index named DownloadIndex; the index has the following fields:
Most of the items in the index have an articleNumber that contains a hyphen, for example: x-way
The problem is that when we use the StandardAnalyzer and search for x-way, we get 250 results:
Lucene Query:
When analyzing the query you can see that everything after the hyphen is removed and the search term is only "x"; that's why we get 250 results.
I've also tried the Whitespace analyzer and the Keyword analyzer, but without luck.
I also tried a wildcard search, which results in the following query:
But that doesn't help.
Any ideas?
Hi Cimplex
Yes, I think this is a thing with Lucene and hyphens, and also underscores...
I ran into the issue while talking with Jeavon about his PR here:
https://github.com/umbraco/Umbraco-CMS/pull/6579
This involves searching for Guids in Umbraco that contain hyphens.
Shannon found this Stack Overflow suggestion: https://stackoverflow.com/questions/16858880/java-lucene-search-query-hyphens-with-wildcards/
In the end, the short-term fix was to remove the hyphens from the data in the searchable feed when it was indexed, and then to remove any hyphens from the search text.
Essentially, if you handle the TransformingIndexValue event:
https://our.umbraco.com/Documentation/Reference/Searching/Examine/examine-events
You could add a new field to the Examine index called 'StrippedArticleNumber', which has any hyphens or underscores removed, so X-Way would be stored as XWay. When somebody searches for 'X-Way', you would again strip the hyphen from the search term and send 'XWay' instead, which should match the article...
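To make the idea concrete, here is a rough, language-neutral sketch in Python of the normalisation itself. It does not use the Examine or Lucene APIs at all: the helper names (strip_separators, build_index_fields, build_search_term) are made up for illustration, and in a real site the index-time part would sit inside your C# event handler. The only point is that the same stripping runs on both the indexed value and the search text.

```python
import re

# Hypothetical helpers for illustration only; none of this is Examine or Lucene API.
_SEPARATORS = re.compile(r"[-_]")


def strip_separators(value: str) -> str:
    """Remove hyphens and underscores, so 'X-Way' becomes 'XWay'."""
    return _SEPARATORS.sub("", value)


def build_index_fields(article_number: str) -> dict:
    """Index time: keep the original value and add a stripped copy next to it."""
    return {
        "articleNumber": article_number,
        "StrippedArticleNumber": strip_separators(article_number),
    }


def build_search_term(user_input: str) -> str:
    """Search time: apply the same stripping to what the user typed."""
    return strip_separators(user_input.strip())


if __name__ == "__main__":
    fields = build_index_fields("X-Way")
    term = build_search_term("X-Way")
    # Both sides were normalised the same way, so they now compare equal.
    print(fields["StrippedArticleNumber"] == term)
```

The search would then be run against StrippedArticleNumber rather than articleNumber, while the original field stays available for display.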
regards
Marc