analyzing path and composition worthy of core - Contributing to Umbraco CMS


Lars-Erik Aabech 349 posts 1100 karma points MVP 8x c-trib

Jul 07, 2016 @ 13:32

Analyzing path and composition worthy of core?

Hi guys,

To be able to present our users with relevant search results, we use a mixture of elaborate techniques. (yeah, right)

The most important ones are filtering by path and by composition type. With the built-in indexer, we have to build a big graph of compositions to find the inheriting nodeTypeAliases, and we have to search the path by wildcard.

Instead of doing that, we just analyze the path and composition at index time. So we have this custom analyzer that delegates to this "ISearchDataGatherer":

using System;
using System.Collections.Generic;
using System.Linq;
using Examine;
using Umbraco.Core;
using Umbraco.Core.Models;

public class PathAndCompositionSearchDataGatherer : ISearchDataGatherer
{
    // Caches content types by id so each type is only fetched from the service once.
    private readonly Dictionary<int, IContentTypeComposition> contentTypes = new Dictionary<int, IContentTypeComposition>();

    public void GatherNodeData(IndexingNodeDataEventArgs e)
    {
        // Store the path with spaces instead of commas so each ancestor id becomes its own term.
        var pathAtt = e.Node.Attribute("path");
        var path = pathAtt != null ? pathAtt.Value : "";
        e.Fields.Add("analyzedPath", path.Replace(',', ' '));

        var contentTypeAtt = e.Node.Attribute("nodeType");
        var contentTypeId = contentTypeAtt != null ? Convert.ToInt32(contentTypeAtt.Value) : -1;
        var composition = "";
        if (contentTypeId > -1)
        {
            IContentTypeComposition contentType;
            if (!contentTypes.TryGetValue(contentTypeId, out contentType))
            {
                // Not cached yet: try content types first, then media types.
                contentType =
                    (IContentTypeComposition)
                        ApplicationContext.Current.Services.ContentTypeService.GetContentType(contentTypeId) ??
                    (IContentTypeComposition)
                        ApplicationContext.Current.Services.ContentTypeService.GetMediaType(contentTypeId);
                contentTypes.Add(contentTypeId, contentType);
            }

            // Guard against ids that match neither a content type nor a media type.
            if (contentType != null)
            {
                composition = String.Join(" ", contentType.CompositionAliases().Union(new[] {contentType.Alias}));
            }
        }
        e.Fields.Add("analyzedComposition", composition);
    }
}

Is this something that would be worth pulling into the UmbracoContentIndexer?

I'll make the PR, just want to know if it's useful enough in the core.
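
To illustrate the path half of the gatherer: once the commas are replaced with spaces, a whitespace-splitting analyzer yields one term per ancestor id, so "is this node under 1050?" becomes an exact term match rather than a wildcard on the raw path. The following is only a minimal sketch of that idea without Lucene or Examine. The class name `PathTerms` and its methods are hypothetical, and it assumes the field is indexed with an analyzer that splits on whitespace.

```csharp
using System;
using System.Collections.Generic;

public static class PathTerms
{
    // Mirrors the gatherer: "-1,1050,1062" becomes "-1 1050 1062".
    public static string ToAnalyzedPath(string path)
    {
        return (path ?? "").Replace(',', ' ');
    }

    // Stand-in for a whitespace tokenizer: one term per ancestor id.
    public static HashSet<string> Tokenize(string analyzedPath)
    {
        return new HashSet<string>(
            analyzedPath.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }

    // A node is at or below ancestorId when that id is one of its path terms.
    public static bool IsAtOrBelow(string path, int ancestorId)
    {
        return Tokenize(ToAnalyzedPath(path)).Contains(ancestorId.ToString());
    }
}
```

Because whole ids are compared, an id such as 105 does not accidentally match a node whose path contains 1050, which is something a naive prefix wildcard has to guard against.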


Shannon Deminick 1526 posts 5272 karma points MVP 3x

Jul 08, 2016 @ 09:03


The main issue I see with this is performance. It means that for every single node that gets indexed, for every indexer, you are going to be querying the database, and really really really hoping that you're going to get a cached result back.

To do this 'correctly', this data would be part of the main lookup when re-indexing a node or when rebuilding the index. The issue with that is the Examine v1.0 limitation of the silly XML instance (but you know that you can work around that too by adding additional XML attributes if necessary). v2.0 will be much better suited because we can add any info we want up front.

Lars-Erik Aabech 349 posts 1100 karma points MVP 8x c-trib

Jul 08, 2016 @ 09:07


I could possibly have hacked up some content type cache before any indexing, but I see the challenges involved in making this work well on any site.

But I'll happily wait with taking on this PR until 8.0/2.0. :)
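
For reference, the "content type cache before any indexing" idea could look roughly like the sketch below: the composition string for every type is computed once up front, so the per-node step is a plain dictionary lookup with no service or database call. This is only an illustration under assumptions. `TypeInfo` and `CompositionCache` are hypothetical stand-ins, not Umbraco or Examine types, and in a real site the list would presumably be loaded from the content type service and rebuilt whenever content types change.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the parts of a content type the gatherer needs.
public class TypeInfo
{
    public int Id;
    public string Alias;
    public string[] CompositionAliases = new string[0];
}

public class CompositionCache
{
    private readonly Dictionary<int, string> compositionsById;

    // Build once, before indexing starts, from all content and media types.
    public CompositionCache(IEnumerable<TypeInfo> allTypes)
    {
        compositionsById = allTypes.ToDictionary(
            t => t.Id,
            t => String.Join(" ", t.CompositionAliases.Union(new[] { t.Alias })));
    }

    // Per-node lookup: no service or database call; unknown ids give "".
    public string GetComposition(int contentTypeId)
    {
        string composition;
        return compositionsById.TryGetValue(contentTypeId, out composition) ? composition : "";
    }
}
```

Keeping such a cache in sync with content type edits is the hard part, which is one of the challenges of making it work well on any site.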
