  • trfletch 598 posts 604 karma points
    May 28, 2013 @ 18:17

    Querying against a comma separated list of IDs with examine and Lucene.net

    Hi,

    I have an Umbraco 4.11 website and I am trying to build a custom Examine (Lucene) index. Each node has a property holding a comma-separated list of category IDs that I want to add to the index. I am trying to do pretty much exactly what is explained in the following article:

    http://stackoverflow.com/questions/5124183/querying-against-a-comma-separated-list-of-ids-with-examine-and-lucene-net

    The problem I have is when I am trying to create the index using the following code:

                // Loop through articles
                foreach (var a in articles)
                {
                    yield return new SimpleDataSet()
                    {
                        NodeDefinition = new Examine.IndexedNode()
                        {
                            NodeId = a.Id,
                            Type = "Article"
                            
                        },
                        RowData = new Dictionary<string, string>()
                        {
                            {"Name", a.Name},
                            {"Url", a.NiceUrl},
                            {"Category", "1234"},
                            {"Category", "5678"}
                        }
                    };
                }

    I received the following error:

    An item with the same key has already been added.

    Does anyone know how I can get multiple categories for each article into my Examine index?

    Regards Tony

  • Aaron 14 posts 34 karma points
    Nov 13, 2013 @ 23:39

    Hey Tony,

    I am sure you have solved this by now, but I came across this post while working on a similar problem and thought I would share my findings.

    When you create a custom indexer derived from BaseUmbracoIndexer, you can access the Lucene document in the OnDocumentWriting override:

    protected override void OnDocumentWriting(DocumentWritingEventArgs docArgs)
    {
        var currentNode = _nodeFactoryFacade.GetNode(docArgs.NodeId);
    
        var categories = currentNode.GetProperty("categories").Value;
        if (!string.IsNullOrEmpty(categories))
        {
            var categoryNodeIdsXml = XElement.Parse(categories);
            var categoryNodeIds = categoryNodeIdsXml.Descendants("nodeId");
            foreach (var categoryNodeId in categoryNodeIds)
            {
                docArgs.Document.Add(new Field("categories", categoryNodeId.Value, Field.Store.YES, Field.Index.ANALYZED));
            }
        }
    
    
        base.OnDocumentWriting(docArgs);
    }
    

    However, the issue is now with searching: when you search, you get the same duplicate key exception. I worked around this by concatenating my values, separated by a pipe (|).

    Blogged about my findings if you want any more detail.

    http://blog.gravypower.net/examine-indexing-and-searching-with-multypart-properties/

    Hope that helps

  • Ismail Mayat 4511 posts 10092 karma points MVP 2x admin c-trib
    Nov 14, 2013 @ 08:51

    Guys,

    You can inject the categories into the index as a space-separated list, which can then be searched. I do something similar when I want to do path-based queries, e.g. all nodes that have parent 1234. The path is stored as CSV and is not tokenised, so I create a new field using GatheringNodeData and inject the path into it as a space-separated list, and it all works nicely. BTW, I did a whole series of blog posts on Examine, over 10 posts covering different hints and tips.

    Regards

    Ismail
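
    A minimal sketch of the space-separated idea described above, for later readers. It is plain C# with no Examine dependency: the dictionary stands in for the fields collection that a GatheringNodeData handler would modify, and the field name "searchableCategories" and the helper class are hypothetical, not taken from this thread.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class CategoryFieldHelper
    {
        // Turns "1234,5678" into "1234 5678" so that an analyzer splitting on
        // whitespace sees each ID as its own token. Empty entries are dropped.
        public static string ToSpaceSeparated(string csv)
        {
            if (string.IsNullOrWhiteSpace(csv))
            {
                return string.Empty;
            }

            var ids = csv.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                         .Select(id => id.Trim())
                         .Where(id => id.Length > 0);

            return string.Join(" ", ids);
        }

        // Adds a hypothetical "searchableCategories" field alongside the
        // original "categories" field; does nothing if that field is missing.
        public static void AddSearchableCategories(IDictionary<string, string> fields)
        {
            string csv;
            if (fields.TryGetValue("categories", out csv))
            {
                fields["searchableCategories"] = ToSpaceSeparated(csv);
            }
        }
    }
    ```

    Writing to a separate field leaves the original comma-separated value untouched, and a single field value also avoids the duplicate key error from the first post.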

  • Matt 76 posts 280 karma points
    Nov 15, 2013 @ 08:12

    Ismail - I'm using your GatheringNodeData technique to inject the data into the index, replacing "," with ", ". However, my issue is that incorrect results are being returned.

    Given this data:
    Blog Entry 1 has categories (array of tags) set to "symptom, something else" 
    Blog Entry 2 has categories set to "another symptom, something else"
    Blog Entry 3 has categories set to "symptom, something else" 

    When searching for:
    something else - get 3 results back - Correct 
    another symptom - get 1 result back - Correct
    symptom - get 3 results back - Incorrect, should be only 2 results

    I know what is happening and kind of why it is happening, but I don't know how to fix it - any ideas? Can I separate with pipes and then include those in the search when I pass in the categories?

    Regards,
    Matt

  • Aaron 14 posts 34 karma points
    Nov 15, 2013 @ 09:14

    Hey Matt,

    I would say the issue is that Lucene is doing a text search and is picking up the word "symptom" out of the "another symptom" category. I have been adding the ID of the node that represents each category instead; this way the name does not affect the search.

    There could be an issue when a node ID contains the ID of another category; I would have to look into that a bit more. Using IDs would solve your issue, though. Have a look at my blog post above to see what I ended up doing when dealing with categories.

    Hope that helps.

    Aaron
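
    To illustrate the difference between names and IDs, here is a rough sketch in plain C#. It does not use Lucene: Tokenize is only a crude stand-in for an analyzer (lowercase, split on spaces and commas), and the category IDs 1101, 1102 and 1103 are made up for the example.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class TokenMatchDemo
    {
        // Crude stand-in for an analyzer: lowercases the value and splits it
        // on spaces and commas, so every word becomes a separate token.
        public static HashSet<string> Tokenize(string value)
        {
            var tokens = value.ToLowerInvariant()
                              .Split(new[] { ' ', ',' }, StringSplitOptions.RemoveEmptyEntries);
            return new HashSet<string>(tokens);
        }

        // Counts the field values that contain the term as a whole token.
        public static int CountMatches(IEnumerable<string> fieldValues, string term)
        {
            var wanted = term.ToLowerInvariant();
            return fieldValues.Count(v => Tokenize(v).Contains(wanted));
        }

        public static void Main()
        {
            // Matt's three blog entries, indexed by category name.
            var byName = new[]
            {
                "symptom, something else",
                "another symptom, something else",
                "symptom, something else"
            };

            // The same entries indexed by hypothetical category IDs:
            // 1101 = symptom, 1102 = something else, 1103 = another symptom.
            var byId = new[] { "1101 1102", "1103 1102", "1101 1102" };

            Console.WriteLine(CountMatches(byName, "symptom")); // 3: "another symptom" also has the token "symptom"
            Console.WriteLine(CountMatches(byId, "1101"));      // 2: only the entries tagged with 1101
        }
    }
    ```

    In this sketch a term only matches a whole token, so searching for 110 would not match 1101. Whether that holds in a real index depends on the analyzer and on the kind of query used (a wildcard query, for instance, would behave differently).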

  • Matt 76 posts 280 karma points
    Nov 15, 2013 @ 09:20

    Thanks Aaron - I owe you a beer if you're ever in Seattle :-)

    Nice, simple solution, and it also solved my other issue of categories having stop words in them.

    Regards,
    Matt 

  • Aaron 14 posts 34 karma points
    Nov 16, 2013 @ 01:02

    No worries Matt, glad I can help. I find that when I put things into writing I understand them more.

    Aaron
