  • Peter S 169 posts 587 karma points
    Nov 05, 2014 @ 16:24
    Peter S
    2

    Exclude node from Examine index

    Hi,

    Is it possible to exclude a particular node ID from the Examine index? I'd like something like <ExcludeNodeID>. Or is there some other way to achieve this?

  • Valerie 67 posts 163 karma points
    Nov 06, 2014 @ 10:14
    Valerie
    1

    I've not tried this, but there is a "Cancel" property on the IndexingNodeDataEventArgs argument that you have access to in the GatheringNodeData handler, which you can set on a node-by-node basis. Maybe this will work?

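    using Examine;
    using Umbraco.Core;

    public class ExamineEvents : ApplicationEventHandler
    {
        public ExamineEvents()
        {
            // Subscribe to the event raised while field data is gathered for each node.
            ExamineManager.Instance.IndexProviderCollection["MainSiteIndexer"].GatheringNodeData += GatheringMainSiteNodeData;
        }

        private void GatheringMainSiteNodeData(object sender, IndexingNodeDataEventArgs e)
        {
            // As written this cancels every node; add a condition (e.g., on e.NodeId)
            // to target only the node you want excluded.
            e.Cancel = true;
        }
    }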

  • Mark 255 posts 612 karma points
    Oct 06, 2015 @ 10:05
    Mark
    0

    I've not tried Cancel either, but if it works as expected, another option could be to add a page property of type True/False and set it to true if you want the page excluded. Then add that property to the index and check its value in the GatheringNodeData event. This would avoid the need to keep a list of excluded IDs anywhere (e.g., appSettings).
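
    A minimal sketch of that flag check, kept free of Umbraco types so it can be tried in isolation. The property alias "excludeFromIndex" and the IndexExclusion helper are hypothetical names, and the sketch assumes a ticked True/False property ends up in the index fields as "1":

    ```csharp
    using System.Collections.Generic;

    public static class IndexExclusion
    {
        // Hypothetical alias of the True/False property added to the document type.
        public const string DefaultFlagAlias = "excludeFromIndex";

        // Returns true only when the flag field is present and set to "1".
        // Assumption: a ticked True/False property is indexed as the string "1".
        public static bool ShouldExclude(IDictionary<string, string> fields, string flagAlias = DefaultFlagAlias)
        {
            if (fields == null)
            {
                return false;
            }

            string value;
            return fields.TryGetValue(flagAlias, out value) && value == "1";
        }
    }
    ```

    Inside an indexing handler this could be called as IndexExclusion.ShouldExclude(e.Fields), with the result deciding whether the node gets excluded.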

  • Michael Nielsen 153 posts 810 karma points
    Oct 06, 2015 @ 13:31
    Michael Nielsen
    0

    If the particular node has its own document type, you could exclude it from your index based on that.

    Or you could exclude it in your search.

    For example, the following finds related blog items based on a tags property on the items and excludes the current page from the results, but the excluded ID could be any node ID.

    string[] relatedTags = CurrentPage.tags.ToString().Split(',');
    string searchProvider = CurrentPage.Site().Name.ToString().Replace(".","").ToLower() + "BlogSearcher";
    
    var criteria = ExamineManager.Instance.SearchProviderCollection[searchProvider].CreateSearchCriteria();            
    var crawl = criteria.GroupedOr(new string[] {"tags"}, relatedTags).Not().Field("id", CurrentPage.Id.ToString()).Compile();
    ISearchResults results = ExamineManager.Instance.SearchProviderCollection[searchProvider].Search(crawl);
    
  • Christian Palm 278 posts 273 karma points
    Feb 08, 2016 @ 16:22
    Christian Palm
    2

    I can confirm that you can use e.Cancel to ensure a node does not get into the index.

    using Examine;
    using Examine.Providers;
    using Umbraco.Core;
    
    namespace CPalm.Search
    {
        public class ExamineEvents : IApplicationEventHandler
        {
            public void OnApplicationInitialized(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
            {
                BaseIndexProvider provider = ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"];
                if (provider != null)
                {
                    provider.NodeIndexing += NodeIndexing;
                }
            }
    
            private void NodeIndexing(object sender, IndexingNodeEventArgs e)
            {
                e.Cancel = e.NodeId == 1000;
                //var node = new Node(e.NodeId);
                //e.Cancel = node.NodeTypeAlias != "TextPage";
            }
            public void OnApplicationStarting(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext) { }
    
            public void OnApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext) { }
        }
    }
    
  • Mark Bowser 273 posts 860 karma points c-trib
    Oct 31, 2016 @ 17:31
    Mark Bowser
    0

    This doesn't seem to work in Umbraco 7.5.3. I'm trying to work around a bug in Umbraco 7.4.3 - 7.5.3 where Umbraco incorrectly indexes unpublished content. http://issues.umbraco.org/issue/U4-8481

    This is what my code looks like right now. I've verified with breakpoints that e.Cancel = true; is set for a lot of nodes, yet after reindexing those nodes are still indexed.

    internal static void SupportFaq_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        if (UmbracoContext.Current == null)
        {
            var dummyHttpContext = new HttpContextWrapper(new HttpContext(new SimpleWorkerRequest("default.aspx", "", new StringWriter())));
            UmbracoContext.EnsureContext(dummyHttpContext, ApplicationContext.Current,
                new WebSecurity(dummyHttpContext, ApplicationContext.Current), UmbracoConfig.For.UmbracoSettings(), UrlProviderResolver.Current.Providers, false, null);
        }
    
        var umbracoHelper = new UmbracoHelper(UmbracoContext.Current);
    
        IPublishedContent currentNode = null;
        currentNode = umbracoHelper.TypedContent(e.NodeId);
        if (currentNode == null)
        {
            // Unindex the node if it was unpublished. This is due to an umbraco issue. Issue Tracker: U4-8481
            LogHelper.Warn<SupportFaqExamineEvents>($"Node with id: {e.NodeId} is not published.");
            e.Cancel = true;
            return;
        }
        FormatSupportFaqQuestionRankField(e);
        FormatSupportFaqQuestionSearchField(e);
    }
    

    What version did you all have this working in? Any thoughts?

  • Mark Bowser 273 posts 860 karma points c-trib
    Oct 31, 2016 @ 17:46
    Mark Bowser
    1

    I figured it out. In Umbraco 7.5.3, you need to use

    ExamineManager.Instance.IndexProviderCollection["SupportFAQIndexer"].DeleteFromIndex(e.NodeId.ToString());
    

    instead of

    e.Cancel = true;
    
  • Marcin Zajkowski 112 posts 585 karma points MVP 7x c-trib
    Sep 13, 2018 @ 10:03
    Marcin Zajkowski
    0

    So, it doesn't work anymore? We've just discovered it in our work now...

  • Nicholas Westby 2054 posts 7103 karma points c-trib
    Jun 17, 2019 @ 21:08
    Nicholas Westby
    0

    Bit annoying that IndexingNodeDataEventArgs.Cancel doesn't work anymore. I'm using BaseIndexProvider.DeleteFromIndex in Umbraco 7.14.0 and it seems to work, mostly. Full code here (minus some extraneous bits):

    // Namespaces.
    using Examine;
    using Umbraco.Core;
    
    /// <summary>
    /// Assists with indexing Umbraco content for search functionality.
    /// </summary>
    public class ExamineIndexer
    {
    
        #region Methods
    
        /// <summary>
        /// Adds content to an Examine index.
        /// </summary>
        public static void ExamineContentIndexer(object sender, IndexingNodeDataEventArgs e)
        {
    
            // Variables.
            var skipIndex = false;
    
            // Document containing the "Do Not Index" field will not be added to the index.
            if (e.Fields.ContainsKey("doNotIndex"))
            {
                skipIndex = true;
            }
    
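            // Skip pages that have no template. TryGetValue avoids a KeyNotFoundException
            // if the "template" field is ever missing from the gathered fields.
            string template;
            if (e.Fields.TryGetValue("template", out template) && template == "0")
            {
                skipIndex = true;
            }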
    
            // Skip indexing the current node (i.e., exclude it from the index)?
            if (skipIndex)
            {
                // It seems like e.Cancel doesn't work anymore, so now we do DeleteFromIndex instead.
                // See: https://our.umbraco.com/forum/umbraco-7/using-umbraco-7/57871-Exclude-node-from-Examine-index#comment-258903
                ExamineManager.Instance.IndexProviderCollection["CustomExternalIndexer"]
                    .DeleteFromIndex(e.NodeId.ToString());
                return;
            }
    
            // Variables.
            var contentService = ApplicationContext.Current.Services.ContentService;
            var page = contentService.GetById(e.NodeId);
    
            // Check to see if the node is valid.
            if (page == null)
            {
                skipIndex = true;
            }
    
            // Check to see if it has a published version.
            if (!skipIndex && !(page?.HasPublishedVersion).GetValueOrDefault(false))
            {
                skipIndex = true;
            }
    
            // Skip indexing the current node (i.e., exclude it from the index)?
            if (skipIndex)
            {
                // It seems like e.Cancel doesn't work anymore, so now we do DeleteFromIndex instead.
                // See: https://our.umbraco.com/forum/umbraco-7/using-umbraco-7/57871-Exclude-node-from-Examine-index#comment-258903
                ExamineManager.Instance.IndexProviderCollection["CustomExternalIndexer"]
                    .DeleteFromIndex(e.NodeId.ToString());
                return;
            }
    
        }
    
        #endregion
    
    }
    

    When I say it "mostly" works, I mean that it seems to exclude the pages from the index, but when I go to the Examine section to check on the index, I see a few hundred "deleted" nodes (after clicking "Rebuild index"):

    [Screenshot: Indexer Information]

    I guess this leaves the Examine index in a slightly dirty state, which is probably not the worst thing in the world, but doesn't seem ideal.
