
  • Lars-Erik Aabech 349 posts 1100 karma points MVP 8x c-trib
    Jul 07, 2016 @ 13:32

    Analyzing path and composition worthy of core?

    Hi guys,

    To be able to present our users with relevant search results, we use a mixture of elaborate techniques. (yeah, right)

    The most important ones are filtering by path and by composition type. With the built-in indexer, we have to build a big graph of compositions to find the inheriting nodeTypeAliases, and we have to search the path by wildcard.

    Instead of doing that, we just analyze the path and composition. So we have this custom analyzer that delegates to this "ISearchDataGatherer":

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Examine;
    using Umbraco.Core;
    using Umbraco.Core.Models;
    
    public class PathAndCompositionSearchDataGatherer : ISearchDataGatherer
    {
        // Content types already resolved, keyed by id, so each type is looked up only once.
        private readonly Dictionary<int, IContentTypeComposition> contentTypes = new Dictionary<int, IContentTypeComposition>();
    
        public void GatherNodeData(IndexingNodeDataEventArgs e)
        {
            // Store the comma-separated path as space-separated ids so each ancestor id becomes its own term.
            var pathAtt = e.Node.Attribute("path");
            var path = pathAtt != null ? pathAtt.Value : "";
            e.Fields.Add("analyzedPath", path.Replace(',', ' '));
    
            var contentTypeAtt = e.Node.Attribute("nodeType");
            var contentTypeId = contentTypeAtt != null ? Convert.ToInt32(contentTypeAtt.Value) : -1;
            var composition = "";
            if (contentTypeId > -1)
            {
                IContentTypeComposition contentType;
                if (!contentTypes.TryGetValue(contentTypeId, out contentType))
                {
                    // Try content types first, then media types. A miss is cached as null to avoid repeated lookups.
                    var service = ApplicationContext.Current.Services.ContentTypeService;
                    contentType =
                        (IContentTypeComposition) service.GetContentType(contentTypeId) ??
                        (IContentTypeComposition) service.GetMediaType(contentTypeId);
                    contentTypes.Add(contentTypeId, contentType);
                }
    
                // Guard against an unknown type id; previously this threw a NullReferenceException.
                if (contentType != null)
                {
                    composition = String.Join(" ", contentType.CompositionAliases().Union(new[] {contentType.Alias}));
                }
            }
            e.Fields.Add("analyzedComposition", composition);
        }
    }
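
    As a rough illustration of why this helps at query time (a sketch only, not part of the gatherer): once the path is stored as space-separated ids, a descendant filter becomes an exact term match on the ancestor id instead of a wildcard, and a composition filter becomes a term match on the alias. The SearchQuerySketch class, its method names, and the ids and aliases below are hypothetical. The output is a raw Lucene query string, and it assumes both fields are analyzed so that whitespace separates terms, that OR is the default operator, and that the aliases need no escaping.

    ```csharp
    using System;

    public static class SearchQuerySketch
    {
        // Mirrors what the gatherer stores: "-1,1050,1062" becomes "-1 1050 1062".
        public static string ToAnalyzedPath(string path)
        {
            return (path ?? "").Replace(',', ' ');
        }

        // Builds a raw Lucene query that requires the given ancestor id in the path
        // and, if any aliases are passed, at least one of them in the composition field.
        public static string BuildFilter(int ancestorId, params string[] compositionAliases)
        {
            var query = "+analyzedPath:" + ancestorId;
            if (compositionAliases != null && compositionAliases.Length > 0)
            {
                query += " +analyzedComposition:(" + string.Join(" ", compositionAliases) + ")";
            }
            return query;
        }
    }
    ```

    For example, SearchQuerySketch.BuildFilter(1062, "seoComposition") gives "+analyzedPath:1062 +analyzedComposition:(seoComposition)", which matches every node below 1062 whose type uses that composition, with no graph of inheriting aliases to build first.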
    

    Is this something that would be worth pulling into the UmbracoContentIndexer?

    I'll make the PR, just want to know if it's useful enough in the core.

  • Shannon Deminick 1526 posts 5272 karma points MVP 3x
    Jul 08, 2016 @ 09:03

    The main issue I see with this is performance: for every single node that is indexed, and for every indexer, you are going to be querying the database and really, really, really hoping that you're going to get a cached result back.

    To do this 'correctly', this data would be part of the main lookup when re-indexing a node or when rebuilding the index. The issue with that is the Examine v1.0 limitation with the silly XML instance (but you know that you can work around that too by adding additional XML attributes if necessary). v2.0 will be much better suited because we can add any info we want up-front.

  • Lars-Erik Aabech 349 posts 1100 karma points MVP 8x c-trib
    Jul 08, 2016 @ 09:07

    I could possibly have hacked up some content type cache that gets filled before any indexing starts, but I see the challenges involved in making this work well on any site.
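
    A minimal sketch of what such a warm-up cache might look like, assuming all content and media types can be handed over once before indexing starts. CompositionCache and Preload are hypothetical names. In Umbraco 7 the input could presumably come from the ContentTypeService (GetAllContentTypes and GetAllMediaTypes, as far as I know), but here it is reduced to plain tuples of id, alias, and composition aliases so the sketch stands alone. It does not handle invalidation when a content type changes, which is a large part of the challenge.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class CompositionCache
    {
        private readonly Dictionary<int, string> compositionsById = new Dictionary<int, string>();

        // Call once before indexing. Each tuple is (content type id, alias, composition aliases).
        public void Preload(IEnumerable<Tuple<int, string, string[]>> contentTypes)
        {
            compositionsById.Clear();
            foreach (var type in contentTypes)
            {
                // Same shape as the gatherer's field: composition aliases plus the type's own alias.
                var aliases = (type.Item3 ?? new string[0]).Union(new[] { type.Item2 });
                compositionsById[type.Item1] = string.Join(" ", aliases);
            }
        }

        // Per-node lookup: a dictionary hit only, never a database query.
        public string GetAnalyzedComposition(int contentTypeId)
        {
            string composition;
            return compositionsById.TryGetValue(contentTypeId, out composition) ? composition : "";
        }
    }
    ```

    The gatherer would then call GetAnalyzedComposition per node instead of going to the ContentTypeService, so the cost is paid once per rebuild rather than once per node per indexer.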

    But I'll happily hold off on this PR until 8.0/2.0. :)
