How to set Examine "Separator Characters"?
Hi,
I am searching the umbracoFile property with Examine. For example, my file is media/2747/MyDocument-Hello.pdf
When I search for Hello* the item is found. When I search for MyDocument* the item is not found. When I search for media/2747/MyDocument* the item is found.
This must be because the indexer treats the "-" as a separator character, but not the "/".
Does anybody know how I can configure this?
Kind regards, Stephan
By default, content going into the external index is run through the standard analyser while indexing. I think it replaces the "-" with a space and then tokenises on spaces, so the file name ends up split into separate tokens.
What is the context for your search? Is it to find documents based on the document name? If so, search on the name field rather than the umbracoFile field.
Regards
Ismail
Hi Ismail,
You're right, searching the name property is simpler, but I was concerned that its value can differ from the file name.
I recently changed the Umbraco core so that the name property is not modified after upload (see here: https://our.umbraco.com/forum/using-umbraco-and-getting-started/101202-keep-original-filename-in-filesystem-for-uploaded-file#comment-317819).
But stop words and tokenization will also prevent files from being found when users enter the exact file name, including the dot and extension.
I will keep on testing. Maybe Examine just isn't the right tool for searching file names.
Kind regards, Stephan
During indexing you could take the file name and inject it into a separate field with all non-alphanumeric characters stripped out, say by regex. Then you search on that field, but make sure the search term has its non-alphanumeric characters stripped in the same way.
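A minimal sketch of that stripping step, assuming plain .NET only. The class and method names (`FileNameSearchKey`, `Normalize`, `FromMediaPath`) are made up for illustration, and the code that writes the value into the index is left out, since how you hook into indexing depends on your Umbraco and Examine version.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Umbraco or Examine.
public static class FileNameSearchKey
{
    // Matches every character that is not an ASCII letter or digit.
    // Assumption: file names are ASCII; widen the pattern if you need other alphabets.
    private static readonly Regex NonAlphanumeric =
        new Regex("[^a-zA-Z0-9]", RegexOptions.Compiled);

    // Strips non-alphanumerics and lowercases the rest.
    // Use this on the search term before building the query.
    public static string Normalize(string text)
    {
        return NonAlphanumeric
            .Replace(text ?? string.Empty, string.Empty)
            .ToLowerInvariant();
    }

    // Index-time value: takes only the file name from the umbracoFile path,
    // then normalizes it the same way as the search term.
    public static string FromMediaPath(string umbracoFile)
    {
        return Normalize(Path.GetFileName(umbracoFile ?? string.Empty));
    }
}
```

With this, `FileNameSearchKey.FromMediaPath("media/2747/MyDocument-Hello.pdf")` gives `mydocumenthellopdf`, and `FileNameSearchKey.Normalize("MyDocument-Hello.pdf")` gives the same string, so an exact file name including the extension lines up with the indexed value. `Normalize("MyDocument")` gives `mydocument`, which is a prefix of the indexed value, so a wildcard search on the new field should match as well.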
Modifying core is not a good idea because when you upgrade your changes will be lost.