  • Steve Temple 54 posts 288 karma points MVP c-trib
    Sep 23, 2016 @ 12:00

    Searching for multiple word phrases using Examine

    I'm trying to search for a two-word phrase using Examine. I'm building my query like this:

    query.And().Field("articleTag", tag.ToLower().Escape());
    

    This creates a query that looks like this:

    +(articleTag:two words)
    

    I'd expect it to be:

    +(articleTag:"two words") 
    

    I can get it to output that by using:

    query.And().Field("articleTag", $"\"{tag.ToLower()}\"");
    

    Running that query in Luke returns ~200 results, but running it through Examine returns 0.

    The field I'm searching is indexed and stored but deliberately not tokenized, to rule out tokenization issues.

    I can't see how to get this to work via Examine. I have a lot of experience with Lucene.Net, so I always find it a bit tricky to make Examine behave like direct access to Lucene.Net.

    Does anyone know how to search for a phrase?

    Steve

  • Steve Temple 54 posts 288 karma points MVP c-trib
    Sep 23, 2016 @ 12:38

    To help anyone else with the same issue: I've figured out a way of doing this. I guess writing about it helped me come up with a solution.

    I created this class:

    public class ExactPhraseExamineValue : IExamineValue
    {
      public ExactPhraseExamineValue(string phrase)
      {
        Examineness = Examineness.Escaped;
        // Wrap the phrase in double quotes so it reaches the query as one quoted phrase.
        Value = $"\"{phrase}\"";
        Level = 1;
      }
    
      public Examineness Examineness { get; }
      public float Level { get; }
      public string Value { get; }
    }
    

    Then do this:

    query = query.And().Field("articleTag", new ExactPhraseExamineValue(tag.ToLower()));
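
    One caveat: the class wraps the phrase in quotes as-is, so a tag that itself contains a double quote or a backslash would produce a malformed phrase. A minimal sketch of one way to guard against that, assuming Lucene's classic query syntax where a backslash escapes a quote inside a phrase. `PhraseQuoting` and `QuotePhrase` are hypothetical names for illustration, not part of Examine:

    ```csharp
    using System;

    public static class PhraseQuoting
    {
        // Hypothetical helper (not part of Examine or Lucene.Net).
        // Escapes backslashes first, then double quotes, then wraps the
        // result in double quotes to form a quoted phrase.
        public static string QuotePhrase(string phrase)
        {
            if (phrase == null) throw new ArgumentNullException(nameof(phrase));
            var escaped = phrase.Replace("\\", "\\\\").Replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
    }
    ```

    If this fits your case, the constructor could set `Value = PhraseQuoting.QuotePhrase(phrase);` instead of the interpolated string. I haven't checked how Examine treats escaped characters in an `Escaped` value, so treat this as a starting point.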
    

    I guess this would only work if your value in the index isn't tokenized. I do the following, which takes all the tags out of the tags field and saves each one as a separate non-tokenized field. That makes it easier to search for exact matches, as Lucene will tokenize everything by default:

    public static void Indexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
    {
      var document = e.Document;
      // Only process document types whose alias ends with "Article".
      if (e.Fields["__NodeTypeAlias"].EndsWith("Article"))
      {
        var tags = e.Fields["articleTags"].Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
        foreach (var tag in tags)
        {
          // Add each tag as its own non-analyzed field so the whole tag is indexed as a single term.
          document.Add(new Field("articleTag", tag.ToLower(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        }
      }
    }
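
    Because a non-analyzed field only matches the exact stored string, the tag has to be normalized the same way at index time and at query time. A rough sketch of a shared helper, assuming the tags field may contain spaces after the commas (I haven't confirmed that it does). `TagNormalizer` is a hypothetical name, not something from Examine or Umbraco:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class TagNormalizer
    {
        // Hypothetical helper: trims and lower-cases a single tag.
        // Use the same method when indexing and when building the query.
        public static string Normalize(string tag)
        {
            return tag.Trim().ToLowerInvariant();
        }

        // Splits a comma-separated tag string into distinct, normalized tags,
        // dropping entries that are empty after trimming.
        public static List<string> Split(string rawTags)
        {
            if (string.IsNullOrWhiteSpace(rawTags)) return new List<string>();
            return rawTags
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(Normalize)
                .Where(t => t.Length > 0)
                .Distinct()
                .ToList();
        }
    }
    ```

    The indexing handler could then loop over `TagNormalizer.Split(e.Fields["articleTags"])`, and the search side could pass `TagNormalizer.Normalize(tag)` into `ExactPhraseExamineValue`, so both sides agree on casing and whitespace.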
    