  • Steve Temple 63 posts 324 karma points MVP 4x c-trib
    Sep 23, 2016 @ 12:00

    Searching for multiple word phrases using Examine

    I'm trying to search for a two-word phrase using Examine. I'm building my query like this:

    query.And().Field("articleTag", tag.ToLower().Escape());
    

    This creates a query that looks like this:

    +(articleTag:two words)
    

    I'd expect it to be:

    +(articleTag:"two words") 
    

    I can get it to output that by using:

    query.And().Field("articleTag", $"\"{tag.ToLower()}\"");
    

    Running that query in Luke returns ~200 results, but running it through Examine returns 0.

    The field being searched is indexed and stored but not tokenized, which rules out issues around tokenization.

    I can't see how to get that to work via Examine. I have a lot of experience with Lucene.Net, so I always find it a bit tricky to make Examine behave like direct access to Lucene.Net.

    Does anyone know how to search for a phrase?

    Steve

  • Steve Temple 63 posts 324 karma points MVP 4x c-trib
    Sep 23, 2016 @ 12:38

    To help anyone else with the same issue: I've figured out a way of doing this. I guess writing about it helped me come up with a solution.

    I created this class:

    // Requires: using Examine.SearchCriteria;
    public class ExactPhraseExamineValue : IExamineValue
    {
      public ExactPhraseExamineValue(string phrase)
      {
        Examineness = Examineness.Escaped;
        // Wrap the phrase in double quotes so the query contains it as a single phrase
        Value = $"\"{phrase}\"";
        Level = 1;
      }
    
      public Examineness Examineness { get; }
      public float Level { get; }
      public string Value { get; }
    }
    

    Then do this:

    query = query.And().Field("articleTag", new ExactPhraseExamineValue(tag.ToLower()));
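
    One caveat with the class above: a tag that itself contains a double quote or a backslash would end up producing a malformed phrase. Below is a minimal sketch of a quoting helper, assuming Lucene's usual backslash escaping also applies inside quoted phrases. `ExactPhrase.Quote` is a made-up name for illustration, not part of Examine or Lucene.Net:

    ```csharp
    using System;

    public static class ExactPhrase
    {
        // Hypothetical helper (not an Examine API): wraps a phrase in double quotes,
        // backslash-escaping any backslashes and double quotes inside it first.
        public static string Quote(string phrase)
        {
            if (phrase == null) throw new ArgumentNullException(nameof(phrase));

            // Escape backslashes before quotes so the escapes added for quotes
            // are not themselves doubled.
            var escaped = phrase.Replace("\\", "\\\\").Replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
    }
    ```

    If that assumption holds for your Lucene.Net version, the constructor could set `Value = ExactPhrase.Quote(phrase);` instead of building the string inline.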
    

    I guess this would only work if the value in the index isn't tokenized, and Lucene tokenizes everything by default. To make exact matches easier, I take all the tags out of the tags field and save each one as its own non-tokenized field:

    // Requires: using Examine.LuceneEngine; using Lucene.Net.Documents;
    public static void Indexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
    {
      var document = e.Document;
      if (e.Fields["__NodeTypeAlias"].EndsWith("Article"))
      {
        // Split the comma-separated tags and add each one as its own non-tokenized field
        var tags = e.Fields["articleTags"].Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
        foreach (var tag in tags)
        {
          document.Add(new Field("articleTag", tag.ToLower(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        }
      }
    }
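
    Because a non-tokenized field only matches on the exact stored string, the value written at index time and the value searched for need to be normalized the same way. A stray space after a comma, for example, would be stored as part of the tag. Here is a rough sketch of a shared normalizer, under the assumption that tags arrive as a comma-separated string. `TagNormalizer` is a hypothetical name, and the trimming is an addition that the handler above doesn't do:

    ```csharp
    using System;
    using System.Linq;

    public static class TagNormalizer
    {
        // Hypothetical helper: splits a comma-separated tag list, trims whitespace,
        // lower-cases each tag and drops entries that end up empty.
        public static string[] Split(string csv)
        {
            return (csv ?? string.Empty)
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(t => t.Trim().ToLowerInvariant())
                .Where(t => t.Length > 0)
                .ToArray();
        }
    }
    ```

    For example, `TagNormalizer.Split("News, Two Words,,")` gives `news` and `two words`. This sketch uses `ToLowerInvariant()` rather than `ToLower()`; whichever you pick, use the same call when indexing and when building the query.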
    