Hi, I'm trying to index multi-level nested content without any success. I can break out the values one level down, but because my nested content has multiple levels, the second level is written to the index as a single JSON string. Any help would be much appreciated.
The code looks like this:
using Examine;
using Umbraco.Core;
using System.Linq;
using Newtonsoft.Json;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

namespace Custom.Controllers
{
    public static class StringExtensions
    {
        public static bool DetectIsJson(this string input)
        {
            input = input.Trim();
            if (input.StartsWith("{") && input.EndsWith("}"))
                return true;
            if (input.StartsWith("["))
                return input.EndsWith("]");
            else
                return false;
        }
    }

    public class ExamineJsonIndexer : ApplicationEventHandler
    {
        protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
        {
            base.ApplicationStarted(umbracoApplication, applicationContext);

            // Declare the indexers that should index JSON properties
            var jsonIndexer = ExamineManager.Instance.IndexProviderCollection["KundUnikumContentIndexer"];
            if (jsonIndexer != null)
            {
                jsonIndexer.GatheringNodeData += (sender, e) =>
                {
                    // Extract JSON properties
                    var fieldKeys = e.Fields.Keys.ToArray();
                    foreach (var key in fieldKeys)
                    {
                        var value = e.Fields[key];

                        // Check if index field is JSON data
                        if (value.DetectIsJson())
                        {
                            IndexNestedObject(e.Fields, JsonConvert.DeserializeObject(value), key);
                        }
                    }
                };
            }
        }

        private void IndexNestedObject(Dictionary<string, string> fields, object obj, string prefix)
        {
            var objType = obj.GetType();
            if (objType == typeof(JObject))
            {
                var jObj = obj as JObject;
                if (jObj != null)
                {
                    foreach (var kvp in jObj)
                    {
                        var propKey = prefix + "___jobject___" + kvp.Key;
                        var valueType = kvp.Value.GetType();
                        if (typeof(JContainer).IsAssignableFrom(valueType))
                        {
                            IndexNestedObject(fields, kvp.Value, propKey);
                        }
                        else
                        {
                            fields.Add(propKey, kvp.Value.ToString().StripHtml());
                        }
                    }
                }
            }
            else if (objType == typeof(JArray))
            {
                var jArr = obj as JArray;
                if (jArr != null)
                {
                    for (var i = 0; i < jArr.Count; i++)
                    {
                        var itm = jArr[i];
                        var propKey = prefix + "_jarray_" + i;
                        var valueType = itm.GetType();
                        if (typeof(JContainer).IsAssignableFrom(valueType))
                        {
                            IndexNestedObject(fields, itm, propKey);
                        }
                        else
                        {
                            fields.Add(propKey, itm.ToString().StripHtml());
                        }
                    }
                }
            }
        }
    }
}
Index multi level nested content
Hi David,
You might like to have a look at Look again, it now does this out of the box with a custom Examine indexer/searcher (unfortunately the docs are a little out of date and are being worked on, but happy to help out if I can...)
Or if you want to roll your own, then the code from this source line might be useful, along with the extension method GetFlatDetachedDescendants.
Hi Hendy, been a long time since the last time :D
Anyway, how does your code index the JSON data? I can't find any "JSON checking" in your code. Or is this done in some referenced library?
Best regards David
Hi David,
I think I'm going about it slightly differently - it doesn't parse JSON. Instead it requests the nested content items as IEnumerable<IPublishedContent> (recursively) and indexes each one as a new Lucene document, so the idea is that each nested content item will be indexed in the same way as a regular node - might spark some ideas?
Could you do the same kind of thing: avoid JSON parsing, recurse the collections of IPublishedContent to flatten them out, and then store the values as custom fields on the 'hosting' node?
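To make that suggestion a bit more concrete, here is a minimal sketch of the "recurse and flatten onto the hosting node" idea. It is not the code from Look; the class name PublishedContentFlattener, the field naming scheme and the remainingDepth limit are all assumptions made for illustration. It also assumes the Umbraco 7 API, where a Nested Content property value is returned as IPublishedContent or IEnumerable<IPublishedContent>.

```csharp
using System.Collections.Generic;
using Umbraco.Core.Models;

// Hypothetical helper, not part of Umbraco, Examine or Look.
public static class PublishedContentFlattener
{
    // Copies the properties of 'item' into 'fields', recursing into any
    // property whose value is itself published content (e.g. Nested Content).
    // 'remainingDepth' is an assumed safety limit so that pickers pointing
    // at other nodes cannot cause unbounded recursion.
    public static void Flatten(
        IDictionary<string, string> fields,
        IPublishedContent item,
        string prefix,
        int remainingDepth)
    {
        if (item == null || remainingDepth < 0) return;

        foreach (var property in item.Properties)
        {
            if (!property.HasValue) continue;

            var value = property.Value;
            if (value == null) continue;

            // Assumed naming scheme: prefix_alias[_index]_alias...
            var key = prefix + "_" + property.PropertyTypeAlias;

            // A list of nested items: recurse into each one by position.
            var children = value as IEnumerable<IPublishedContent>;
            if (children != null)
            {
                var index = 0;
                foreach (var child in children)
                {
                    Flatten(fields, child, key + "_" + index, remainingDepth - 1);
                    index++;
                }
                continue;
            }

            // A single nested item.
            var single = value as IPublishedContent;
            if (single != null)
            {
                Flatten(fields, single, key, remainingDepth - 1);
                continue;
            }

            // A plain value: store its string form on the hosting node.
            fields[key] = value.ToString();
        }
    }
}
```

Inside a GatheringNodeData handler this could be called as PublishedContentFlattener.Flatten(e.Fields, node, "nested", 5), where node is the IPublishedContent for e.NodeId. Getting hold of that node (for example via UmbracoHelper.TypedContent) needs an UmbracoContext, which may not be available when indexing runs on a background thread, so that part would need checking in your own setup.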
I will try that Hendy. I feel like I'm kind of out of my comfort zone here... but I will certainly try.
Is there anything else I need to consider apart from the links you referred to? Any changes to the Examine settings or the index?
Thanks /David
Found an updated version of Matt Brailsford's snippet that handles nested content inside nested content.
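The updated snippet itself is not reproduced in this thread. As a rough sketch of one way to handle the case from the original question, the version below assumes that the inner level is stored as a JSON string inside the outer JSON, so a string value that looks like JSON is parsed and flattened again instead of being indexed as one string. The class name NestedJsonFlattener is made up for this example, the key format is copied from the code above, and the StripHtml() call from the original is left out to keep the sketch free of Umbraco references.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical helper; only Json.NET is required.
public static class NestedJsonFlattener
{
    public static void Flatten(IDictionary<string, string> fields, JToken token, string prefix)
    {
        switch (token.Type)
        {
            case JTokenType.Object:
                foreach (var prop in ((JObject)token).Properties())
                {
                    Flatten(fields, prop.Value, prefix + "___jobject___" + prop.Name);
                }
                break;

            case JTokenType.Array:
                var array = (JArray)token;
                for (var i = 0; i < array.Count; i++)
                {
                    Flatten(fields, array[i], prefix + "_jarray_" + i);
                }
                break;

            case JTokenType.String:
                var text = (string)token;
                JToken inner;
                if (TryParseContainer(text, out inner))
                {
                    // Assumption: a string that parses as a JSON object or
                    // array is an inner nested content value, so recurse.
                    Flatten(fields, inner, prefix);
                }
                else
                {
                    fields[prefix] = text;
                }
                break;

            case JTokenType.Null:
                // Nothing to index.
                break;

            default:
                // Numbers, booleans, dates and so on.
                fields[prefix] = token.ToString();
                break;
        }
    }

    // Returns true only if 'text' looks like, and parses as, a JSON object or array.
    private static bool TryParseContainer(string text, out JToken result)
    {
        result = null;
        if (string.IsNullOrWhiteSpace(text)) return false;

        var trimmed = text.Trim();
        var looksLikeJson =
            (trimmed.StartsWith("{") && trimmed.EndsWith("}")) ||
            (trimmed.StartsWith("[") && trimmed.EndsWith("]"));
        if (!looksLikeJson) return false;

        try
        {
            result = JToken.Parse(trimmed);
            return true;
        }
        catch (JsonReaderException)
        {
            // Looked like JSON but was not; treat it as plain text.
            return false;
        }
    }
}
```

In the GatheringNodeData handler above, the call to IndexNestedObject could then be swapped for NestedJsonFlattener.Flatten(e.Fields, JToken.Parse(value), key). The sketch assigns with fields[key] = ... rather than fields.Add(...), so a repeated key overwrites the earlier value instead of throwing.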
Hi!
I successfully used this snippet to index nested content on a member, thank you!!!
Can someone help me to find the code to filter members by a specific property of the nested content?
Thank you!
S