Is it possible to exclude a particular node ID from the Examine index? I'd like something like <ExcludeNodeID>. Or is there some other way to achieve this?
I've not tried this, but there is a "Cancel" property on the IndexingNodeDataEventArgs argument that you have access to in the GatheringNodeData event, and you can set it on a node-by-node basis. Maybe this will work?
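public class ExamineEvents : ApplicationEventHandler
{
    public ExamineEvents()
    {
        ExamineManager.Instance.IndexProviderCollection["MainSiteIndexer"].GatheringNodeData += GatheringMainSiteNodeData;
    }

    private void GatheringMainSiteNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        // Untested: as written this cancels every node; check e.NodeId first to target specific nodes.
        e.Cancel = true;
    }
}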
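To cancel only particular nodes, the handler would need to compare e.NodeId against a set of IDs. Here is a minimal sketch of that lookup, assuming the IDs are kept as a comma-separated string (for example in an appSettings value). The class name ExcludedNodeIds and the storage format are hypothetical and not part of Umbraco or Examine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: turns a comma-separated list such as "1234, 5678" into a set of node IDs.
public static class ExcludedNodeIds
{
    public static HashSet<int> Parse(string csv)
    {
        var ids = new HashSet<int>();
        if (string.IsNullOrWhiteSpace(csv))
        {
            return ids;
        }

        foreach (var part in csv.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int id;
            // Entries that are not valid integers are skipped silently.
            if (int.TryParse(part.Trim(), out id))
            {
                ids.Add(id);
            }
        }

        return ids;
    }
}
```

Inside the GatheringNodeData handler, the check would then be something like: if (excluded.Contains(e.NodeId)) { e.Cancel = true; }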
I've not tried Cancel either, but if it works as expected, another option could be to add a page property of type True/False and set it to true on pages you want excluded. Then add that property to the index and check its value in the GatheringNodeData event. This would avoid the need to keep a list of excluded IDs anywhere (e.g. appSettings).
If the particular node has its own document type, you could exclude it from your index based on that.
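The two suggestions above (a True/False flag on the page, or the document type) both come down to inspecting the gathered fields before deciding whether to skip a node. Here is a rough sketch of that decision as a plain helper. The flag alias "excludeFromIndex" is hypothetical, and the sketch assumes the document type alias is available in a "nodeTypeAlias" field and that a ticked True/False property is indexed as "1"; verify both against your own index.

```csharp
using System;
using System.Collections.Generic;

public static class IndexExclusionRules
{
    // Hypothetical property alias for the True/False flag.
    public const string ExcludeFlagField = "excludeFromIndex";

    // Assumed name of the field holding the document type alias.
    public const string NodeTypeAliasField = "nodeTypeAlias";

    // Returns true when the node is flagged for exclusion or belongs to an excluded document type.
    public static bool ShouldExclude(IDictionary<string, string> fields, ISet<string> excludedDocTypes)
    {
        string flag;
        if (fields.TryGetValue(ExcludeFlagField, out flag) &&
            (flag == "1" || string.Equals(flag, "true", StringComparison.OrdinalIgnoreCase)))
        {
            return true;
        }

        string alias;
        return fields.TryGetValue(NodeTypeAliasField, out alias) && excludedDocTypes.Contains(alias);
    }
}
```

In a GatheringNodeData handler this could be called with e.Fields, followed by whichever exclusion mechanism works in your Umbraco version.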
Or you could exclude it in your search.
Like this example, where related blog items are found based on a tags property on the items, and the current page is excluded, but this could be any nodeId.
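A sketch of the search-time approach, written as a raw Lucene query string so it stays self-contained. It assumes the tags are indexed in a field called "tags" and that Examine stores the node ID in a "__NodeId" field; check both against your own index. The class name RelatedContentQuery is hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RelatedContentQuery
{
    // Assumed name of the field Examine uses for the node ID.
    public const string NodeIdField = "__NodeId";

    // Builds a query that requires at least one matching tag and prohibits the given node ID.
    public static string Build(IEnumerable<string> tags, int excludedNodeId)
    {
        var clauses = (tags ?? Enumerable.Empty<string>())
            .Where(t => !string.IsNullOrWhiteSpace(t))
            // Double quotes are dropped; other Lucene special characters are not handled here.
            .Select(t => "tags:\"" + t.Trim().Replace("\"", "") + "\"")
            .ToList();

        if (clauses.Count == 0)
        {
            return string.Empty;
        }

        return "+(" + string.Join(" ", clauses) + ") -" + NodeIdField + ":" + excludedNodeId;
    }
}
```

The resulting string could then be passed to the searcher as a raw query, with the current page's ID (or any other node ID) as the excluded value.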
This doesn't seem to work in Umbraco 7.5.3. I'm trying to work around a bug in Umbraco 7.4.3 - 7.5.3 where Umbraco incorrectly indexes unpublished content. http://issues.umbraco.org/issue/U4-8481
This is what my code looks like right now. I've verified with breakpoints that e.Cancel = true; is set for a lot of nodes, yet after reindexing those nodes are still indexed.
internal static void SupportFaq_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
{
    if (UmbracoContext.Current == null)
    {
        var dummyHttpContext = new HttpContextWrapper(new HttpContext(new SimpleWorkerRequest("default.aspx", "", new StringWriter())));
        UmbracoContext.EnsureContext(dummyHttpContext, ApplicationContext.Current,
            new WebSecurity(dummyHttpContext, ApplicationContext.Current), UmbracoConfig.For.UmbracoSettings(), UrlProviderResolver.Current.Providers, false, null);
    }

    var umbracoHelper = new UmbracoHelper(UmbracoContext.Current);
    IPublishedContent currentNode = umbracoHelper.TypedContent(e.NodeId);
    if (currentNode == null)
    {
        // Unindex the node if it was unpublished. This is due to an Umbraco issue. Issue Tracker: U4-8481
        LogHelper.Warn<SupportFaqExamineEvents>($"Node with id: {e.NodeId} is not published.");
        e.Cancel = true;
        return;
    }

    FormatSupportFaqQuestionRankField(e);
    FormatSupportFaqQuestionSearchField(e);
}
What version did you all have this working in? Any thoughts?
Bit annoying that IndexingNodeDataEventArgs.Cancel doesn't work anymore. I'm using BaseIndexProvider.DeleteFromIndex in Umbraco 7.14.0 and it seems to work, mostly. Full code here (minus some extraneous bits):
// Namespaces.
using Examine;
using Umbraco.Core;

/// <summary>
/// Assists with indexing Umbraco content for search functionality.
/// </summary>
public class ExamineIndexer
{
    #region Methods

    /// <summary>
    /// Adds content to an Examine index.
    /// </summary>
    public static void ExamineContentIndexer(object sender, IndexingNodeDataEventArgs e)
    {
        // Variables.
        var skipIndex = false;

        // Document containing the "Do Not Index" field will not be added to the index.
        if (e.Fields.ContainsKey("doNotIndex"))
        {
            skipIndex = true;
        }

        // Skip pages that have no template.
        var template = e.Fields["template"];
        if (template == "0")
        {
            skipIndex = true;
        }

        // Skip indexing the current node (i.e., exclude it from the index)?
        if (skipIndex)
        {
            // It seems like e.Cancel doesn't work anymore, so now we do DeleteFromIndex instead.
            // See: https://our.umbraco.com/forum/umbraco-7/using-umbraco-7/57871-Exclude-node-from-Examine-index#comment-258903
            ExamineManager.Instance.IndexProviderCollection["CustomExternalIndexer"]
                .DeleteFromIndex(e.NodeId.ToString());
            return;
        }

        // Variables.
        var contentService = ApplicationContext.Current.Services.ContentService;
        var page = contentService.GetById(e.NodeId);

        // Check to see if the node is valid.
        if (page == null)
        {
            skipIndex = true;
        }

        // Check to see if it has a published version.
        if (!skipIndex && !(page?.HasPublishedVersion).GetValueOrDefault(false))
        {
            skipIndex = true;
        }

        // Skip indexing the current node (i.e., exclude it from the index)?
        if (skipIndex)
        {
            // It seems like e.Cancel doesn't work anymore, so now we do DeleteFromIndex instead.
            // See: https://our.umbraco.com/forum/umbraco-7/using-umbraco-7/57871-Exclude-node-from-Examine-index#comment-258903
            ExamineManager.Instance.IndexProviderCollection["CustomExternalIndexer"]
                .DeleteFromIndex(e.NodeId.ToString());
            return;
        }
    }

    #endregion
}
When I say it "mostly" works, I mean that it seems to exclude the pages from the index, but when I go to the Examine section to check on the index, I see a few hundred "deleted" nodes (after clicking "Rebuild index"):
I guess this leaves the Examine index in a slightly dirty state, which is probably not the worst thing in the world, but doesn't seem ideal.
Exclude node from Examine index
public class ExamineEvents : ApplicationEventHandler
{
    public ExamineEvents()
    {
        ExamineManager.Instance.IndexProviderCollection["MainSiteIndexer"].GatheringNodeData += GatheringMainSiteNodeData;
    }

    private void GatheringMainSiteNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        e.Cancel = true;
    }
}
I can confirm that you can use e.Cancel to ensure a node does not get into the index.
I figured it out. In Umbraco 7.5.3, you need to use
instead of
So, it doesn't work anymore? We've just discovered it in our work now...