I'm using Examine with a custom data set from an external source, and I'm trying to query a category field to find an exact phrase (which happens to include a stop word).
In my ExamineLuceneIndexSet I've created two fields: _category, which stores the categories as a comma-separated list, and category, which stores the same categories with the commas replaced by spaces.
I'm then running a filtered search whereby the user can search by a combination of term (text input), brand (checkbox list: "Mornobit", "Durndrop") and/or category (checkbox list: "Bug and Tar", "Headlights", "Scratch Repair").
{{ SearchIndexType: , LuceneQuery: (category:"bug and tar") }}
No results
WhitespaceAnalyzer
{{ SearchIndexType: , LuceneQuery: (category:"Bug and Tar") }}
No results
Ideally, when I search for "Bug and Tar", I need to find that exact phrase. I know it exists because, when querying in the Examine Management dashboard, I can see _category: Solutions,Bug and Tar and category: Solutions Bug and Tar for some entries.
I'd really appreciate any advice on what I need to do to get this search to work.
UPDATE
After playing around with various settings, I updated my CustomSearcher as follows
Have you tried the Examine indexer tool in the back office to see whether you get results for your search, or variations of it, i.e. to make sure the index is built and contains that value?
If you are passing the exact category values to the search anyway, then I would say the easiest thing to do would be to insert a field for each category value during DocumentWriting and set it to be not analyzed.
Something like this:
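// Requires: using System; using System.Linq; using Lucene.Net.Documents;
private void MyIndexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
{
    // Use whichever source field holds the comma-separated values
    // (in the question that is "_category"; "category" is the space-separated copy).
    if (e.Fields.ContainsKey("category"))
    {
        var facetValue = e.Fields["category"];
        var values = facetValue.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries).ToList();
        foreach (var value in values)
        {
            // One untokenized field instance per category, so "Bug and Tar" is indexed as a single term.
            e.Document.Add(new Field("categoryNonAnalyzed", value.Trim(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        }
    }
}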
Then you can use the categoryNonAnalyzed field to search on.
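As a rough sketch of what searching that field could look like at the Lucene level: a TermQuery compares the value against the indexed term as-is, with no analyzer involved, so the stop word and the casing survive. CategoryQueries and ExactCategory are hypothetical names used only for illustration, and how you hand the query to your Examine searcher depends on your setup, so that part is left out.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class CategoryQueries
{
    // Hypothetical helper: builds a query matching one whole, unanalyzed category value.
    public static Query ExactCategory(string category)
    {
        // No analyzer runs on the value, so spaces, stop words and casing
        // must match exactly what was written during DocumentWriting.
        return new TermQuery(new Term("categoryNonAnalyzed", category));
    }
}
```

Note that the value has to be passed exactly as it was indexed ("Bug and Tar", not "bug and tar"), because a not-analyzed field is not lowercased.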
private static void CustomIndexer_DocumentWriting(object sender, Examine.LuceneEngine.DocumentWritingEventArgs e)
{
    // Split the comma-separated "_category" value into individual categories.
    var values = e.Fields["_category"].Split(new[] {","}, StringSplitOptions.RemoveEmptyEntries).ToList();
    foreach (var value in values)
    {
        // Add each category as its own not-analyzed "categoryList" field instance.
        e.Document.Add(new Field("categoryList", value.Trim(), Field.Store.YES, Field.Index.NOT_ANALYZED));
    }
}
I know one of the items has two categories selected, "Headlights" and "Bug and Tar". When using the Examine Management dashboard, I'm seeing the following
categoryList: Headlights
Does the field name need to be unique for each value?
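To rule out the splitting itself as the cause, the split-and-trim step from the handler above can be run on its own, outside of Examine. This is only a sketch; CategorySplitter is a hypothetical helper name, not part of Examine or Lucene.Net.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CategorySplitter
{
    // Splits a comma-separated category string into trimmed, non-empty values.
    public static List<string> Split(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw))
        {
            return new List<string>();
        }

        return raw.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                  .Select(v => v.Trim())        // drop padding around each value
                  .Where(v => v.Length > 0)     // skip entries that were only whitespace
                  .ToList();
    }
}
```

For example, CategorySplitter.Split("Headlights,Bug and Tar") gives two values, "Headlights" and "Bug and Tar", so both should be handed to e.Document.Add.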
Fire up Luke, or even in the back office, try the following Lucene query:
category:bug category:tar
You are doing a phrase search with
category:"bug tar"
and in your index you have "bug and tar" stored, although the indexed tokens will be "bug tar" when using the standard analyser, since it drops the stop word "and".
Another thing: I'm not sure why you are ordering by score when the results are already ordered by score. Also, does searchProviderCollection.Search("*", true) do anything? It's the first time I have seen that.
I was getting an error when no term, brand or category had been supplied, so after playing around for a bit I found that I could perform a search to grab everything by using searchProviderCollection.Search("*", true).
Not sure if this is the best practice, but it is giving the expected results.
Thanks Ismail. I'm assuming that the DocumentWriting event is available for a custom data set. I'll have a play around and see what I can come up with.
Examine filtered search for exact phrase
I'm using Examine with a custom data set from an external source, and I'm trying to query a category field to find an exact phrase (which happens to include a stop word).
Below is my Examine configuration
ExamineSettings.config
ExamineIndex.config
In my ExamineLuceneIndexSet I've created two fields: _category, which stores the categories as a comma-separated list, and category, which stores the same categories with the commas replaced by spaces.
I'm then running a filtered search whereby the user can search by a combination of term (text input), brand (checkbox list: "Mornobit", "Durndrop") and/or category (checkbox list: "Bug and Tar", "Headlights", "Scratch Repair").
Below is the code that I'm currently using.
Using the above code I'm getting the following results using various analyzers
StandardAnalyzer (this removes stop words)
SimpleAnalyzer
WhitespaceAnalyzer
Ideally, when I search for "Bug and Tar", I need to find that exact phrase. I know it exists because, when querying in the Examine Management dashboard, I can see _category: Solutions,Bug and Tar and category: Solutions Bug and Tar for some entries.
I'd really appreciate any advice on what I need to do to get this search to work.
UPDATE
After playing around with various settings, I updated my CustomSearcher as follows
Now the search looks like
I still need to do further testing with various filter combinations, but I feel that I have stumbled upon a possible solution.
Have you tried the Examine indexer tool in the back office to see whether you get results for your search, or variations of it, i.e. to make sure the index is built and contains that value?
Ravi
Yes Ravi, I used Examine Management in the back office to confirm the values are present, and tested.
Hey Sean,
If you are passing the exact category values to the search anyway, then I would say the easiest thing to do would be to insert a field for each category value during DocumentWriting and set it to be not analyzed. Something like this:
Then you can use the categoryNonAnalyzed field to search on.
Hey Tom
I've setup the following
I know one of the items has two categories selected, "Headlights" and "Bug and Tar". When using the Examine Management dashboard, I'm seeing the following
categoryList: Headlights
Does the field name need to be unique for each value?
Sean,
Fire up Luke, or even in the back office, try the following Lucene query:
You are doing a phrase search with
and in your index you have "bug and tar" stored, although the indexed tokens will be "bug tar" when using the standard analyser, since it drops the stop word "and".
Another thing: I'm not sure why you are ordering by score when the results are already ordered by score. Also, does searchProviderCollection.Search("*", true) do anything? It's the first time I have seen that.
Regards
Ismail
I was getting an error when no term, brand or category had been supplied, so after playing around for a bit I found that I could perform a search to grab everything by using searchProviderCollection.Search("*", true).
Not sure if this is the best practice, but it is giving the expected results.
Following on from Tom's suggestion, see https://our.umbraco.org/forum/extending-umbraco-and-using-the-api/89730-multi-value-index-search-on-tag-field for multi-value fields, which is effectively what you have.
Thanks Ismail. I'm assuming that the DocumentWriting event is available for a custom data set. I'll have a play around and see what I can come up with.