Exclude node from Examine index

Peter S 169 posts 587 karma points

Nov 05, 2014 @ 16:24

Hi,

Is it possible to exclude a particular node ID from the Examine index? I'd like something like <ExcludeNodeID>. Or is there some other way to achieve this?

Valerie 67 posts 163 karma points

Nov 06, 2014 @ 10:14

I've not tried this, but IndexingNodeDataEventArgs, which you have access to in the GatheringNodeData event, has a Cancel property that you can set on a node-by-node basis. Maybe this will work?

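using Examine;
using Umbraco.Core;

public class ExamineEvents : ApplicationEventHandler
{
    public ExamineEvents()
    {
        // Hook into the indexer's GatheringNodeData event.
        ExamineManager.Instance.IndexProviderCollection["MainSiteIndexer"].GatheringNodeData += GatheringMainSiteNodeData;
    }

    private void GatheringMainSiteNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        // Untested: cancel only the node to exclude (1234 is a placeholder ID).
        // Setting e.Cancel = true unconditionally would cancel every node.
        e.Cancel = e.NodeId == 1234;
    }
}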

Mark 255 posts 612 karma points

Oct 06, 2015 @ 10:05

I've not tried Cancel either, but if it works as expected, another option would be to add a page property of type True/False and set it to true on pages you want excluded. Then add that property to the index and check its value in the GatheringNodeData event. This avoids having to keep a list of excluded IDs anywhere (e.g. in appSettings).
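A minimal sketch of the property check described above, assuming the True/False property has the hypothetical alias excludeFromSearch, that it has been added to the index, and that a checked value is stored as "1". The decision is kept in a plain helper so it can be exercised without Umbraco; the trailing comment shows where it would be called. Replies further down this thread report that e.Cancel has no effect in GatheringNodeData on later 7.x versions, so the handler may need DeleteFromIndex instead.

```csharp
using System.Collections.Generic;

public static class ExclusionRules
{
    // Hypothetical alias of the True/False property; not taken from this thread.
    public const string ExcludeAlias = "excludeFromSearch";

    // Returns true when the gathered fields mark the node as excluded.
    // Assumes a checked True/False property is indexed as "1".
    public static bool ShouldExclude(IDictionary<string, string> fields)
    {
        string value;
        return fields != null
            && fields.TryGetValue(ExcludeAlias, out value)
            && value == "1";
    }
}

// Inside a GatheringNodeData handler (e is IndexingNodeDataEventArgs):
//     if (ExclusionRules.ShouldExclude(e.Fields)) { e.Cancel = true; }
```

Keeping the rule in one place means the same check can be reused if the handler later has to switch from e.Cancel to another mechanism.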

Michael Nielsen 155 posts 812 karma points

Oct 06, 2015 @ 13:31

If the particular node has its own document type, you could exclude it from your index based on that.
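As a rough sketch, excluding by document type would be done in the index set in config/ExamineIndex.config. The SetName, IndexPath and the alias hiddenPage below are placeholders, not values from this thread:

```xml
<IndexSet SetName="ExternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/External/">
  <!-- Nodes of these document types are left out of the index. -->
  <ExcludeNodeTypes>
    <add Name="hiddenPage" />
  </ExcludeNodeTypes>
</IndexSet>
```

The index would need to be rebuilt for the change to apply to nodes that are already indexed.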

Or you could exclude it in your search.

For example, the query below finds related blog items based on a tags property and excludes the current page from the results. The excluded ID could be any node ID.

string[] relatedTags = CurrentPage.tags.ToString().Split(',');
string searchProvider = CurrentPage.Site().Name.ToString().Replace(".","").ToLower() + "BlogSearcher";

var criteria = ExamineManager.Instance.SearchProviderCollection[searchProvider].CreateSearchCriteria();            
var crawl = criteria.GroupedOr(new string[] {"tags"}, relatedTags).Not().Field("id", CurrentPage.Id.ToString()).Compile();
ISearchResults results = ExamineManager.Instance.SearchProviderCollection[searchProvider].Search(crawl);

Christian Palm 278 posts 273 karma points

Feb 08, 2016 @ 16:22

I can confirm that you can use e.Cancel to keep a node out of the index. Note that the handler below is attached to the NodeIndexing event rather than GatheringNodeData.

using Examine;
using Examine.Providers;
using Umbraco.Core;

namespace CPalm.Search
{
    public class ExamineEvents : IApplicationEventHandler
    {
        public void OnApplicationInitialized(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
        {
            BaseIndexProvider provider = ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"];
            if (provider != null)
            {
                provider.NodeIndexing += NodeIndexing;
            }
        }

        private void NodeIndexing(object sender, IndexingNodeEventArgs e)
        {
            e.Cancel = e.NodeId == 1000;
            //var node = new Node(e.NodeId);
            //e.Cancel = node.NodeTypeAlias != "TextPage";
        }
        public void OnApplicationStarting(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext) { }

        public void OnApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext) { }
    }
}

Mark Bowser 273 posts 860 karma points c-trib

Oct 31, 2016 @ 17:31

This doesn't seem to work in Umbraco 7.5.3. I'm trying to work around a bug in Umbraco 7.4.3 - 7.5.3 where Umbraco incorrectly indexes unpublished content: http://issues.umbraco.org/issue/U4-8481

This is what my code looks like right now. I've verified with breakpoints that e.Cancel = true is set for a lot of nodes, yet after reindexing those nodes are still in the index.

internal static void SupportFaq_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
{
    if (UmbracoContext.Current == null)
    {
        var dummyHttpContext = new HttpContextWrapper(new HttpContext(new SimpleWorkerRequest("default.aspx", "", new StringWriter())));
        UmbracoContext.EnsureContext(dummyHttpContext, ApplicationContext.Current,
            new WebSecurity(dummyHttpContext, ApplicationContext.Current), UmbracoConfig.For.UmbracoSettings(), UrlProviderResolver.Current.Providers, false, null);
    }

    var umbracoHelper = new UmbracoHelper(UmbracoContext.Current);

    IPublishedContent currentNode = umbracoHelper.TypedContent(e.NodeId);
    if (currentNode == null)
    {
        // Unindex the node if it was unpublished. This is due to an umbraco issue. Issue Tracker: U4-8481
        LogHelper.Warn<SupportFaqExamineEvents>($"Node with id: {e.NodeId} is not published.");
        e.Cancel = true;
        return;
    }
    FormatSupportFaqQuestionRankField(e);
    FormatSupportFaqQuestionSearchField(e);
}

What version did you all have this working in? Any thoughts?

Mark Bowser 273 posts 860 karma points c-trib

Oct 31, 2016 @ 17:46

I figured it out. In Umbraco 7.5.3, you need to use

ExamineManager.Instance.IndexProviderCollection["SupportFAQIndexer"].DeleteFromIndex(e.NodeId.ToString());

instead of

e.Cancel = true;

Marcin Zajkowski 112 posts 585 karma points MVP 7x c-trib

Sep 13, 2018 @ 10:03

So, it doesn't work anymore? We've just discovered this in our work...

Nicholas Westby 2054 posts 7104 karma points c-trib

Jun 17, 2019 @ 21:08

Bit annoying that IndexingNodeDataEventArgs.Cancel doesn't work anymore. I'm using BaseIndexProvider.DeleteFromIndex in Umbraco 7.14.0 and it seems to work, mostly. Full code here (minus some extraneous bits):

// Namespaces.
using Examine;
using Umbraco.Core;

/// <summary>
/// Assists with indexing Umbraco content for search functionality.
/// </summary>
public class ExamineIndexer
{

    #region Methods

    /// <summary>
    /// Adds content to an Examine index.
    /// </summary>
    public static void ExamineContentIndexer(object sender, IndexingNodeDataEventArgs e)
    {

        // Variables.
        var skipIndex = false;

        // Document containing the "Do Not Index" field will not be added to the index.
        if (e.Fields.ContainsKey("doNotIndex"))
        {
            skipIndex = true;
        }

        // Skip pages that have no template.
        var template = e.Fields["template"];
        if (template == "0")
        {
            skipIndex = true;
        }

        // Skip indexing the current node (i.e., exclude it from the index)?
        if (skipIndex)
        {
            // It seems like e.Cancel doesn't work anymore, so now we do DeleteFromIndex instead.
            // See: https://our.umbraco.com/forum/umbraco-7/using-umbraco-7/57871-Exclude-node-from-Examine-index#comment-258903
            ExamineManager.Instance.IndexProviderCollection["CustomExternalIndexer"]
                .DeleteFromIndex(e.NodeId.ToString());
            return;
        }

        // Variables.
        var contentService = ApplicationContext.Current.Services.ContentService;
        var page = contentService.GetById(e.NodeId);

        // Check to see if the node is valid.
        if (page == null)
        {
            skipIndex = true;
        }

        // Check to see if it has a published version.
        if (!skipIndex && !(page?.HasPublishedVersion).GetValueOrDefault(false))
        {
            skipIndex = true;
        }

        // Skip indexing the current node (i.e., exclude it from the index)?
        if (skipIndex)
        {
            // It seems like e.Cancel doesn't work anymore, so now we do DeleteFromIndex instead.
            // See: https://our.umbraco.com/forum/umbraco-7/using-umbraco-7/57871-Exclude-node-from-Examine-index#comment-258903
            ExamineManager.Instance.IndexProviderCollection["CustomExternalIndexer"]
                .DeleteFromIndex(e.NodeId.ToString());
            return;
        }

    }

    #endregion

}

When I say it "mostly" works, I mean that it seems to exclude the pages from the index, but when I go to the Examine section to check on the index, I see a few hundred "deleted" nodes (after clicking "Rebuild index"):

[Screenshot: Indexer Information]

I guess this leaves the Examine index in a slightly dirty state, which is probably not the worst thing in the world, but doesn't seem ideal.
