  • Mark 255 posts 612 karma points
    Feb 20, 2015 @ 01:32
    Mark
    0

    Examine Raw Query - Can I stop ISearchCriteria.RawQuery from changing it?

    Umbraco 7.2.1

    Indexer:

    <add name="MySearchIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine" />

    Searcher:

    <add name="MySearchSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine" />

    Umbraco Examine Fluent API is not performing the search I need when running it against the new index I've created.

    So as a workaround I've worked out that the raw query I need to run is:

    +(region:2323 region:2324) AND +lowerPricePadded:[0000000000 TO 1000000000] AND +upperPricePadded:[0000000000 TO 1000000000] AND +bedroomsPadded:[0001 TO 1000]

    I've tested this query using the searcher via the Examine Management dashboard. It returns the results I expect.

    So, I am trying to run the above raw query using the code found here

    The problem is, the query that the RawQuery function returns is:

    (+(region:2323 region:2324) +lowerPricePadded:[0000000000 TO 1000000000] +upperPricePadded:[0000000000 TO 1000000000] +bedroomsPadded:[0001 TO 1000])

    This is obviously different from the query I need to run. Is it possible to submit the query that I want to submit, and not the one Examine is regurgitating?

  • Mark 255 posts 612 karma points
    Feb 20, 2015 @ 01:50
    Mark
    0

    This is the code I'm using:

    StringBuilder rawQuery = new StringBuilder();

    // regions
    rawQuery.Append("+(");
    foreach (var region in regions.Split(','))
    {
        rawQuery.AppendFormat("region:{0} ", region);
    }
    rawQuery.Remove(rawQuery.Length - 1, 1); // drop the trailing space
    rawQuery.Append(')');

    // lower price
    if (!string.IsNullOrEmpty(lowerPrice))
    {
        rawQuery.Append(" AND +lowerPricePadded:[");
        rawQuery.Append(lowerPrice.PadLeft(10, '0'));
        rawQuery.Append(" TO 1000000000]");
    }

    // upper price
    if (!string.IsNullOrEmpty(upperPrice))
    {
        rawQuery.Append(" AND +upperPricePadded:[0000000000 TO ");
        rawQuery.Append(upperPrice.PadLeft(10, '0'));
        rawQuery.Append("]");
    }

    // bedrooms
    if (!string.IsNullOrEmpty(bedrooms))
    {
        rawQuery.Append(" AND +bedroomsPadded:[");
        rawQuery.Append(bedrooms.PadLeft(4, '0'));
        rawQuery.Append(" TO 1000]");
    }

    // filters
    if (!string.IsNullOrEmpty(filters))
    {
        rawQuery.Append(" AND +(");
        foreach (var filter in filters.Split(','))
        {
            rawQuery.AppendFormat("filtersSpaced:{0} ", filter);
        }
        rawQuery.Remove(rawQuery.Length - 1, 1); // drop the trailing space
        rawQuery.Append(')');
    }

    // do the search
    var searcher = ExamineManager.Instance.SearchProviderCollection["MySearchSearcher"];

    var query = searcher.CreateSearchCriteria(BooleanOperation.Or).RawQuery(rawQuery.ToString());

    var results = searcher.Search(query);

  • Mark 255 posts 612 karma points
    Feb 20, 2015 @ 09:52
    Mark
    0

    Perhaps I need to use specific analyzers to ensure Examine doesn't change the raw query. Does anyone know?

  • Mark 255 posts 612 karma points
    Feb 20, 2015 @ 12:32
    Mark
    0

    Anyone?

  • Syntaxis 4 posts 24 karma points
    Aug 28, 2015 @ 11:03
    Syntaxis
    0

    Hi Mark,

    Did you resolve this issue? I have the same problem. We require a query with "AND", and the Fluent API does not seem to add it in any way we have tried. So we have tried RawQuery in the same way as you, and we hit the same issue: it strips out the "AND".

  • Shannon Deminick 1526 posts 5272 karma points MVP 2x
    Aug 31, 2015 @ 07:24
    Shannon Deminick
    102

    This is because the query parser in Lucene changes it. The term "AND" is synonymous with a prefixed "+"; similarly, "OR" is synonymous with " " (a space).

    This document is slightly misleading:

    https://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Boolean%20operators

    Because the AND and OR operators get converted. It does mention this in a bit of an abstract way:

    This is equivalent to an intersection using sets

    This explains it a little better and is actually how the query parser works:

    https://lucene.apache.org/core/4_7_0/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html

    You can see how your queries are directly translated using Luke:

    [Screenshot: the query as translated in Luke]

    So instead of thinking in terms of "AND" and "OR", think in terms of how Lucene actually does its query parsing, which is with "MUST" and "SHOULD", and then combine this with parentheses.

    For example, your original query could be rewritten as:

     +(region:2323 region:2324) +(lowerPricePadded:[0000000000 TO 1000000000]) +(upperPricePadded:[0000000000 TO 1000000000]) +(bedroomsPadded:[0001 TO 1000])
    

    Which equals:

    • Must be in region 2323 OR 2324
    • Must have lowerPricePadded between 0 -> 1000000000
    • Must have upperPricePadded between 0 -> 1000000000
    • Must have bedroomsPadded between 1 -> 1000

    So there is no reason for "AND" here: each one of these items MUST match something for a document to be in the result, therefore "AND" is implied.
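
    As a rough illustration of the point above, here is a minimal sketch of building the raw query from "+"-prefixed (MUST) clauses joined by spaces, with no "AND" at all. The class and method names (RawQuerySketch, Build) are hypothetical and not part of Examine; the field names and padding widths are taken from the query in this thread. It assumes the values are plain numeric strings, so no Lucene escaping is done.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical helper, not part of Examine: builds a raw Lucene query
    // string where every top-level clause is prefixed with "+" (MUST).
    public static class RawQuerySketch
    {
        public static string Build(IEnumerable<string> regions, string lowerPrice,
                                   string upperPrice, string bedrooms)
        {
            var clauses = new List<string>();

            // Terms inside the group have no prefix, so they are SHOULD ("OR"),
            // while the group as a whole is MUST.
            var regionTerms = regions.Select(r => "region:" + r.Trim());
            clauses.Add("+(" + string.Join(" ", regionTerms) + ")");

            if (!string.IsNullOrEmpty(lowerPrice))
                clauses.Add("+lowerPricePadded:[" + lowerPrice.PadLeft(10, '0') + " TO 1000000000]");

            if (!string.IsNullOrEmpty(upperPrice))
                clauses.Add("+upperPricePadded:[0000000000 TO " + upperPrice.PadLeft(10, '0') + "]");

            if (!string.IsNullOrEmpty(bedrooms))
                clauses.Add("+bedroomsPadded:[" + bedrooms.PadLeft(4, '0') + " TO 1000]");

            // A single space between clauses is enough; "+" already makes each one required.
            return string.Join(" ", clauses);
        }
    }
    ```

    The resulting string could then be passed to RawQuery(...) in the same way as the code earlier in this thread.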

  • Syntaxis 4 posts 24 karma points
    Sep 01, 2015 @ 15:51
    Syntaxis
    0

    Hi Shannon,

    Thanks for your reply to this thread.

    We have done some more tests with our data. Our queries are now working as expected on the front end, but we are still experiencing something strange with the back office [Lucene Search] box.

    The following code returns our expected results on the front end of our website:

    search.RawQuery("+newsItemType:news +(newsCategories:1140 newsCategories:1141) +archiveYear:2015");
    

    This returns the three expected test records. However, if we paste the same raw query into the [Lucene Search] box in the Examine Management Dashboard:

    +newsItemType:news +(newsCategories:1140 newsCategories:1141) +archiveYear:2015
    

    We then receive 5 results: four items that have been tagged as a "news" type and one additional item that has been tagged with the year 2015.

    As you can see below:

    Score   Id  Values
    2.42338729  1170    __IndexType: content __NodeId: 1170 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1154,1170 archiveYear: 2015 id: 1170 newsCategories: 1140 1141 newsItemType: News previewDescription: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam facilisis nibh lorem, in varius purus fringilla nec. Phasellus sed pulvinar massa. Nulla eget erat eget sapien lobortis eleifend. Vestibulum ultricies sed sem sit amet lacinia. Nunc faucibus arcu quis mollis tincidunt. Sed vitae malesuada massa. Quisque rutrum et tellus eget convallis. Ut blandit metus massa, a luctus ante egestas a. Nam lobortis, velit id congue venenatis, dui felis congue libero, eget lacinia turpis dui ac odio. previewTitle: News item 2v updateDate: 20150828125312000
    2.10719061  1161    __IndexType: content __NodeId: 1161 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1162,1161 archiveYear: 2015 id: 1161 newsCategories: 1140 1141 1173 newsItemType: News previewDescription: Sed sagittis blandit quam ac ullamcorper. Donec sit amet dictum ante. Vestibulum laoreet facilisis ex ut auctor. Maecenas id imperdiet massa. Vivamus suscipit euismod magna in interdum. Cras rutrum, libero sed porttitor venenatis, orci dolor pretium magna, tempor ultrices augue diam eget nunc. Donec sed rutrum nunc. Suspendisse commodo eros libero, a auctor neque dapibus in. Quisque vitae iaculis nisi. Nullam accumsan libero et elementum convallis. previewTitle: This product announcement updateDate: 20150828113600000
    1.38691664  1155    __IndexType: content __NodeId: 1155 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1162,1155 archiveYear: 2015 id: 1155 newsCategories: 1141 newsItemType: News previewDescription: Test Item event previewTitle: Test Item event updateDate: 20150828113601000
    0.561602533 1183    __Icon: icon-newspaper __IndexType: content __NodeId: 1183 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1162,1183 archiveYear: 2015 id: 1183 newsCategories: 1176 newsItemType: News previewDescription: News Item Another Test previewTitle: News Item Another Test updateDate: 20150901110257000
    0.09885396  1169    __Icon: icon-newspaper __IndexType: content __NodeId: 1169 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1162,1169 archiveYear: 2015 id: 1169 newsCategories: 1142 newsItemType: Events previewDescription: A new event previewTitle: A new event updateDate: 20150901105703000
    

    You should be able to see from the above data that rows 4 and 5 have newsCategories values (1176 and 1142) that should have been filtered out of this search result.

    However, if we enter the query in the following format, the results are returned as expected.

    +(+newsItemType:news AND +(newsCategories:1140 newsCategories:1141) AND +archiveYear:2015)
    

    The results:

    Score   Id  Values
    2.42338729  1170    __IndexType: content __NodeId: 1170 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1154,1170 archiveYear: 2015 id: 1170 newsCategories: 1140 1141 newsItemType: News previewDescription: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam facilisis nibh lorem, in varius purus fringilla nec. Phasellus sed pulvinar massa. Nulla eget erat eget sapien lobortis eleifend. Vestibulum ultricies sed sem sit amet lacinia. Nunc faucibus arcu quis mollis tincidunt. Sed vitae malesuada massa. Quisque rutrum et tellus eget convallis. Ut blandit metus massa, a luctus ante egestas a. Nam lobortis, velit id congue venenatis, dui felis congue libero, eget lacinia turpis dui ac odio. previewTitle: News item 2v updateDate: 20150828125312000
    2.10719061  1161    __IndexType: content __NodeId: 1161 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1162,1161 archiveYear: 2015 id: 1161 newsCategories: 1140 1141 1173 newsItemType: News previewDescription: Sed sagittis blandit quam ac ullamcorper. Donec sit amet dictum ante. Vestibulum laoreet facilisis ex ut auctor. Maecenas id imperdiet massa. Vivamus suscipit euismod magna in interdum. Cras rutrum, libero sed porttitor venenatis, orci dolor pretium magna, tempor ultrices augue diam eget nunc. Donec sed rutrum nunc. Suspendisse commodo eros libero, a auctor neque dapibus in. Quisque vitae iaculis nisi. Nullam accumsan libero et elementum convallis. previewTitle: This product announcement updateDate: 20150828113600000
    1.38691652  1155    __IndexType: content __NodeId: 1155 __NodeTypeAlias: newsitem __Path: -1,1056,1136,1153,1162,1155 archiveYear: 2015 id: 1155 newsCategories: 1141 newsItemType: News previewDescription: Test Item event previewTitle: Test Item event updateDate: 20150828113601000
    

    So it seems that there is something very odd going on with the Lucene Search box, as though it is perhaps compiling the search query down to a raw query, but this does not explain why our raw query does not seem to be working?

    Any ideas would be greatly appreciated so we fully understand how this is working :)

  • Shannon Deminick 1526 posts 5272 karma points MVP 2x
    Sep 02, 2015 @ 09:04
    Shannon Deminick
    0

    It's a bug!!

    It's actually a bug with the Examine dashboard GET request not encoding strings, so it's stripping off the "+" signs. We'll fix for 7.3:

    http://issues.umbraco.org/issue/U4-7055
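
    To make the effect concrete, here is a small sketch using only standard .NET calls (Uri.EscapeDataString and WebUtility.UrlDecode). It is not the dashboard's actual code, and the class and method names are hypothetical. It shows that standard URL decoding turns an unencoded "+" in a query string into a space, while encoding the value first lets it survive the round trip. With the "+" prefixes gone, each clause would presumably be parsed as optional (SHOULD), which is consistent with the 5 results reported above.

    ```csharp
    using System;
    using System.Net;

    // Hypothetical illustration, not the Examine dashboard code.
    public static class PlusEncodingSketch
    {
        // What the server-side decoder yields if the raw query is put
        // into the URL without encoding: every '+' becomes a space.
        public static string DecodeUnencoded(string rawQuery)
        {
            return WebUtility.UrlDecode(rawQuery);
        }

        // Encoding first ('+' becomes "%2B") preserves the query.
        public static string RoundTrip(string rawQuery)
        {
            string encoded = Uri.EscapeDataString(rawQuery);
            return WebUtility.UrlDecode(encoded);
        }
    }
    ```

    Calling DecodeUnencoded on the query from this thread gives a string with no "+" signs left, whereas RoundTrip returns the original query unchanged.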

  • Syntaxis 4 posts 24 karma points
    Sep 02, 2015 @ 09:29
    Syntaxis
    0

    Thanks Shannon, thought it probably was, so glad I'm not going crazy! :)
