  • Dan Diplo 1490 posts 5790 karma points MVP 4x c-trib
    Jan 09, 2020 @ 16:43
    Dan Diplo
    0

    Raw Lucene Date Range Query not working in Examine (Umbraco 8)

    For Examine experts!

    I have a strange issue with Examine in Umbraco 8. If I use the fluent API to create a RangeQuery using dates then it works, but if I execute the same query as a raw query then it doesn't. To provide an example:

    if (ExamineManager.Instance.TryGetIndex(global::Umbraco.Core.Constants.UmbracoIndexes.ExternalIndexName, out var index))
    {
        var searcher = index.GetSearcher();
        var query = searcher.CreateQuery("content").RangeQuery<DateTime>(new string[] { "createDate" }, DateTime.Now.AddDays(-37), DateTime.Now);
    
        var results = query.Execute();
    
        // results.TotalItemCount = 2;
    }
    

    So the above query returns 2 results, which is correct. It just returns items with a create date between the two specified dates, as expected.

    However, if I take the raw Lucene query from the above code (eg. via query.ToString()) I get this:

    +(createDate:[637109872908600000 TO 637141840908600000])
    

    (Note the date ticks values will change depending on what the current date is - this is just an example).

    Now if I run this in the back-office Examine Management dashboard it returns every node, even if I amend it just to search content. So my query would actually be:

    +__IndexType:content +createDate:[637109872908600000 TO 637141840908600000]
    

    This still returns every content node in the site.

    In fact, even this query below returns every node when I would expect it to return none (as the 'from' date is after the 'to' date).

    +__IndexType:content +createDate:[999999999999999999 TO 637143565034990000]
    

    I've run these queries in the back-office Examine Management dashboard and also via Luke.Net, to the same effect. It's as if the range query is ignored.

    So my question is why does the date range query work when constructed via the fluent API, but not if constructed manually as a raw query?

    Is there any way to get the raw query to work? (I'm building a rather complex query via a raw Lucene query and it works fine apart from this date range part.) Note I'm searching the built-in createDate field, not a custom field, and I can see the date is being stored as ticks in the index. I'm using the Standard Analyser, too.

  • Matt Brailsford 2473 posts 12075 karma points MVP 7x c-trib
    Jan 10, 2020 @ 12:08
    Matt Brailsford
    0

    I don't really have an answer, but I looked at 2 different indexes: one from v7 (on which I just ran a date range query in Luke fine) and one from v8 (on which I was also unable to run a date range query in Luke: no results). It looks like the fields are indexed differently now, so I wonder if that has something to do with it.

    In v7, Luke shows the createDate flags as ISV but in v8 they are now ITSf0

    I'm no expert in this stuff, but it could be a clue.

    The person you really want to get involved is Ismail 😁

  • Matt Brailsford 2473 posts 12075 karma points MVP 7x c-trib
    Jan 10, 2020 @ 12:12
    Matt Brailsford
    0

    I'm also guessing the Fluent API might capture some metadata about the field type and pass that into the query in some other way, so it's not immediately obvious from ToString-ing the query.

  • Dan Diplo 1490 posts 5790 karma points MVP 4x c-trib
    Jan 10, 2020 @ 13:04
    Dan Diplo
    0

    Thanks, Matt! Yeah, dates are definitely stored differently in 8 than 7. In 8 they are stored as rounded ticks - which is essentially just a long integer. So in essence the range query just treats them as a numeric range, which you would have thought would be relatively simple.... You can see how Shannon does it in the Examine source code.
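
    To sanity-check what those tick values mean, here is a small sketch in plain .NET (no Examine or Lucene involved) that decodes the two bounds from the example raw query above. The class name TicksDemo is made up for illustration.

    ```csharp
    using System;

    public static class TicksDemo
    {
        public static void Main()
        {
            // Bounds copied from the example raw query earlier in the thread.
            long fromTicks = 637109872908600000L;
            long toTicks = 637141840908600000L;

            // A tick is 100 nanoseconds, counted from 0001-01-01 00:00:00.
            var from = new DateTime(fromTicks);
            var to = new DateTime(toTicks);

            Console.WriteLine(from.ToString("o"));
            Console.WriteLine(to.ToString("o"));

            // The gap between the bounds is 37 days, matching AddDays(-37).
            Console.WriteLine((to - from).TotalDays);
        }
    }
    ```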

    Like you say, maybe Shannon does something else to make this work beyond what is visible in the raw query. I would ping Ismail, but don't like singling out people - I'm sure he gets enough hassle with Examine :) But thanks for checking, I appreciate it.

  • Shannon Deminick 1498 posts 5073 karma points hq
    Jan 13, 2020 @ 06:36
    Shannon Deminick
    0

    It's because the Lucene query parser parses ranges ONLY as string ranges, not as numerical or date ranges. This isn't a bug in Examine per se but one in Lucene, though I know how to work around it.

    You can follow the issue here https://github.com/Shazwazza/Examine/issues/133

  • Dan Diplo 1490 posts 5790 karma points MVP 4x c-trib
    Jan 13, 2020 @ 19:21
    Dan Diplo
    0

    Thanks, Shannon, I was wondering if it was something like that.

    How do you work around it in Examine, then? I looked through the source code but couldn't really pick up anything. Presumably at some point your API has to turn everything into a raw Lucene query, so how do you get it to treat a range as numeric? Thanks!

  • Shannon Deminick 1498 posts 5073 karma points hq
    Jan 15, 2020 @ 01:00
    Shannon Deminick
    100

    Hey Dan,

    I already have the fix in for Examine locally, just haven't pushed it yet. But it goes here: https://github.com/Shazwazza/Examine/blob/master/src/Examine/LuceneEngine/Search/CustomMultiFieldQueryParser.cs

    "Presumably at some point your API has to turn everything into a raw Lucene query, so how do you get it to treat a range as numeric?"

    Actually it's the reverse of this :) Sure, Lucene can accept a query as a string, but internally it works with query objects. Examine creates these query objects directly and doesn't build up a string query. When you pass a string query to Lucene, it uses a QueryParser to break that string down into objects.
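
    A rough sketch of that difference, assuming Lucene.Net 3.0.3 is referenced. The class name QueryObjectsDemo and the default field "nodeName" are arbitrary choices for illustration, and this is not how Examine itself is written; it only contrasts a query built as an object with one parsed from a string.

    ```csharp
    using System;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.QueryParsers;
    using Lucene.Net.Search;
    using Version = Lucene.Net.Util.Version;

    public static class QueryObjectsDemo
    {
        public static void Main()
        {
            long lower = DateTime.Now.AddDays(-37).Ticks;
            long upper = DateTime.Now.Ticks;

            // Built directly as an object: the numeric intent is explicit.
            Query direct = NumericRangeQuery.NewLongRange("createDate", lower, upper, true, true);

            // Built from a string: the QueryParser decides which Query type to create.
            var analyzer = new StandardAnalyzer(Version.LUCENE_30);
            var parser = new QueryParser(Version.LUCENE_30, "nodeName", analyzer);
            Query parsed = parser.Parse("createDate:[" + lower + " TO " + upper + "]");

            // Printing the runtime types shows what each route produced.
            Console.WriteLine(direct.GetType().Name);
            Console.WriteLine(parsed.GetType().Name);
        }
    }
    ```

    Both queries can print a similar-looking string via ToString(), which is why the raw query text alone does not reveal the difference.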

    You can see the Lucene method here for converting a Range query string to a RangeQuery object https://github.com/apache/lucenenet/blob/3.0.3/src/core/QueryParser/QueryParser.cs#L743

    That method tries to parse dates but not numbers, and then ends up calling NewRangeQuery: https://github.com/apache/lucenenet/blob/3.0.3/src/core/QueryParser/QueryParser.cs#L891

    As you can see, it just returns a TermRangeQuery. It doesn't try to check for numbers or anything else that would require a NumericRangeQuery.

    So the fix is to override NewRangeQuery and detect numeric values.
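
    A minimal sketch of what such an override could look like against Lucene.Net 3.0.3. This is not the actual Examine fix: the class name NumericAwareQueryParser and the longFields parameter are hypothetical, and it assumes the listed fields were indexed as numeric long values with the default precision step.

    ```csharp
    using System.Collections.Generic;
    using System.Globalization;
    using Lucene.Net.Analysis;
    using Lucene.Net.QueryParsers;
    using Lucene.Net.Search;
    using Version = Lucene.Net.Util.Version;

    // Hypothetical parser that turns ranges on known long fields into numeric ranges.
    public class NumericAwareQueryParser : QueryParser
    {
        private readonly HashSet<string> _longFields;

        public NumericAwareQueryParser(Version matchVersion, string defaultField,
            Analyzer analyzer, IEnumerable<string> longFields)
            : base(matchVersion, defaultField, analyzer)
        {
            _longFields = new HashSet<string>(longFields);
        }

        protected override Query NewRangeQuery(string field, string part1, string part2, bool inclusive)
        {
            // Only switch to a numeric range when the field is known to hold longs
            // and both bounds parse cleanly as longs.
            if (_longFields.Contains(field)
                && long.TryParse(part1, NumberStyles.Integer, CultureInfo.InvariantCulture, out long lower)
                && long.TryParse(part2, NumberStyles.Integer, CultureInfo.InvariantCulture, out long upper))
            {
                // Assumption: default precision step, which must match how the field was indexed.
                return NumericRangeQuery.NewLongRange(field, lower, upper, inclusive, inclusive);
            }

            // Anything else keeps the stock behaviour (a TermRangeQuery).
            return base.NewRangeQuery(field, part1, part2, inclusive);
        }
    }
    ```

    It would be used like any other QueryParser, for example new NumericAwareQueryParser(Version.LUCENE_30, "nodeName", analyzer, new[] { "createDate" }).Parse(rawQuery), where rawQuery is the raw Lucene string. Passing the field list in explicitly avoids guessing a field's type from the shape of its values.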

  • Dan Diplo 1490 posts 5790 karma points MVP 4x c-trib
    Jan 15, 2020 @ 08:50
    Dan Diplo
    0

    Thanks, Shannon! Makes sense about the raw query part; I'd just never considered that. Will dig into it and see what I can do! Great to have Examine abstract some of these issues away :) Thanks.

  • Shannon Deminick 1498 posts 5073 karma points hq
    Jan 15, 2020 @ 01:03
    Shannon Deminick
    0

    The fix I have locally isn't complete though, so I've marked the issue as help-wanted (up for grabs) if anyone wants to take a stab at it.
