Unable to find search results for multiple archetypes?
Dear all,
As per my earlier cry for help via the Twitterverse, I am expanding on the 140 chars to hopefully explain my predicament!
TL;DR ~ We have multiple types of Archetype in use, each embedded in multiple pages as "modules" with different content, and a page may carry more than one module "type". However, nothing at all is shown for the archetypes in the results page!
I need to be able to find a search term in the Archetype and show a preview of either the parent page it's embedded in or the Archetype module the term is found in, depending on the layout of the page.
After very productive conversations with @Rob Baty-Barr and scouring the forums, I've found that the question is certainly not a new one, yet no matter which of the suggested solutions I test, I still get nothing from the archetypes in the results page.
To illustrate the issue, the searchable content is in the back office examine results as you can see here:
However the same cannot be said for the front end search results:
The results will not show the archetype, JSON or otherwise.
So, here is the latest edition of the ideas I've investigated to render the results:
@helper RenderContentResult(SearchViewModel model, IPublishedContent result)
{
    <div class="ezsearch-result">
        <h2><a href="@result.Url">@result.Name</a></h2>
        @foreach (var field in model.PreviewFields.Where(result.HasValue))
        {
            var archeModel = result.GetPropertyValue(field) as ArchetypeModel;
            if (archeModel != null)
            {
                foreach (var pageBuilderFs in archeModel)
                {
                    if (!pageBuilderFs.Disabled)
                    {
                        @*Pass in only one archetype to test:*@
                        <p>@Highlight(Truncate(Umbraco.StripHtml(@Html.RenderArchetypePartials(result.GetPropertyValue<ArchetypeModel>("NG_MOD__CallToActionModule"))), model.PreviewLength), model.SearchTerms)</p>
                    }
                    break;
                }
            }
            else
            {
                <p>@Highlight(Truncate(Umbraco.StripHtml(result.GetPropertyValue(field).ToString()), model.PreviewLength), model.SearchTerms)</p>
            }
            break;
        }
    </div>
}
And the code that extends the GatheringNodeData event:
protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
{
    base.ApplicationStarted(umbracoApplication, applicationContext);
    ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"].GatheringNodeData += NGSearchGatheringNodeDataMethod;
}
private void NGSearchGatheringNodeDataMethod(object sender, IndexingNodeDataEventArgs nodeData)
{
    try
    {
        // Captured before the lowercasing below, so the switch still sees the original casing
        var nodeTypeAliasField = nodeData.Fields["nodeTypeAlias"];
        // Create searchable path
        if (nodeData.Fields.ContainsKey("path"))
        {
            nodeData.Fields["searchPath"] = nodeData.Fields["path"].Replace(',', ' ');
        }
        // Extract the filename from media items
        if (nodeData.Fields.ContainsKey("umbracoFile"))
        {
            nodeData.Fields["umbracoFileName"] = Path.GetFileName(nodeData.Fields["umbracoFile"]);
        }
        // Lowercase all the fields for case insensitive searching
        // (this also lowercases and HTML-decodes the raw archetype JSON)
        var keys = nodeData.Fields.Keys.ToList();
        foreach (var key in keys)
        {
            nodeData.Fields[key] = HttpUtility.HtmlDecode(nodeData.Fields[key].ToLower(CultureInfo.InvariantCulture));
        }
        string module = null;
        string fields = null;
        string nestingBox = null;
        string nestedFields = null;
        switch (nodeTypeAliasField)
        {
            case "NG_MOD_CallToAction":
                module = "NG_MOD__CallToActionModule";
                fields = "sectionLabel, sectionDescription";
                nestingBox = "callToActionItemCollection";
                nestedFields = "label,copyText";
                break;
            case "NG_MOD_FactoidCarousel":
                module = "NG_MOD__FactoidCarouselModule";
                nestingBox = "items";
                nestedFields = "shortTextDescription";
                break;
            case "NG_MOD_GridCarousel":
                module = "NG_MOD__GridCarouselModule";
                nestingBox = "items";
                nestedFields = "shortTextDescription";
                break;
            case "NG_MOD_GeneralContent":
                module = "NG_MOD__GeneralContentModule";
                fields = "callToActionTitle, callToActionText, mainContentText";
                break;
            case "NG_MOD_Resources":
                module = "NG_MOD__ResourcesModule";
                fields = "resourcesLabel, resourcesDescription";
                nestingBox = "resourcesItemCollection";
                nestedFields = "resourceTitle, resourceDescription";
                break;
            case "NG_MOD_Timeline":
                module = "NG_MOD__TimelineCarouselModule";
                nestingBox = "timelineItem";
                nestedFields = "datebyline, dateContent";
                break;
            default:
                break;
        }
        // module is the name of the property alias for the archetype
        if (module == null || !nodeData.Fields.ContainsKey(module)) return;
        var archetypModuleValueAsString = nodeData.Fields[module];
        if (string.IsNullOrEmpty(archetypModuleValueAsString)) return;
        var archetype = JsonConvert.DeserializeObject<ArchetypeModel>(archetypModuleValueAsString);
        foreach (var fieldset in archetype)
        {
            // NOTE: _index is declared outside this method (not shown) and is never reset here
            _index++;
            if (fields != null)
            {
                foreach (
                    var field in
                        fields.Split(',')
                            .Select(p => p.Trim().ToLower(CultureInfo.InvariantCulture))
                            .Where(p => !string.IsNullOrWhiteSpace(p)).ToList())
                {
                    if (!fieldset.HasValue(field)) continue;
                    var value = fieldset.GetValue<string>(field);
                    // Intended to split the CamelCaseString to hyphenated-lower-case;
                    // field was lowercased above, so there are no capitals left to split on
                    var fieldName = System.Text.RegularExpressions.Regex.Replace(field, "([^^])([A-Z])", "$1-$2");
                    nodeData.Fields.Add(string.Format("archetype-{0}-{1}", fieldName, _index), value);
                }
            }
            if (nestingBox != null)
            {
                var step = 0;
                // Skip fieldsets with no nested JSON rather than letting the deserializer throw
                var nestedJson = fieldset.GetValue(nestingBox);
                if (string.IsNullOrEmpty(nestedJson)) continue;
                var nestedItems = JsonConvert.DeserializeObject<ArchetypeModel>(nestedJson);
                foreach (var item in nestedItems)
                {
                    step++;
                    foreach (
                        var nest in
                            nestedFields.Split(',')
                                .Select(p => p.Trim().ToLower(CultureInfo.InvariantCulture))
                                .Where(p => !string.IsNullOrWhiteSpace(p)))
                    {
                        if (!item.HasValue(nest)) continue;
                        var value = item.GetValue<string>(nest);
                        // Same caveat as above: nest is already lowercase at this point
                        var fieldName = System.Text.RegularExpressions.Regex.Replace(nest, "([^^])([A-Z])", "$1-$2");
                        nodeData.Fields.Add(string.Format("archetype-{0}-{2}_{1}", fieldName, _index, step), value);
                    }
                }
            }
        }
        // Stuff all the fields into a single field for easier searching
        var combinedFields = new StringBuilder();
        foreach (var keyValuePair in nodeData.Fields)
        {
            combinedFields.AppendLine(keyValuePair.Value);
        }
        //this duplicates the contents - removed
        // nodeData.Fields.Add("contents", combinedFields.ToString());
        Console.Write(_index.ToString() + nodeTypeAliasField);
    }
    catch (Exception ex)
    {
        LogHelper.Error<Exception>(ex.Message, ex);
    }
}
As we can see in the first screenshot above the content is most certainly in the search index, however, I can't even get a hit on the archetype to attempt to render a preview in the results page!
Why not?
Thank you for reading this far, and if you've got any ideas I'm open to suggestions!
By the way, our little one is sleeping peacefully and we've checked on him a few times so far, thank you for your rapid responses expressing your concern with his welfare earlier - this is what makes Umbraco the community to be in!
I've never been successful at searching within a JSON-based field, so maybe someone else can suggest a way to do that, but what I have had success with is flattening the JSON into individual fields (or you can merge it into one big field if you just need to search across the JSON generally).
This looks something like this:
public class Bootstrap : ApplicationEventHandler
{
    protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"].GatheringNodeData += (sender, args) =>
        {
            ...
            // Extract JSON properties (iterate over a copy of the keys, as new fields are added below)
            var fieldKeys = args.Fields.Keys.ToArray();
            foreach (var key in fieldKeys)
            {
                var value = args.Fields[key];
                if (value.DetectIsJson())
                {
                    IndexNestedObject(args.Fields, JsonConvert.DeserializeObject(value), key);
                }
            }
            ...
        };
    }
    // Recursively walks the parsed JSON and adds one index field per leaf value,
    // named <prefix>_<property> for objects and <prefix>_<position> for arrays
    private void IndexNestedObject(Dictionary<string, string> fields, object obj, string prefix)
    {
        var objType = obj.GetType();
        if (objType == typeof(JObject))
        {
            var jObj = obj as JObject;
            if (jObj != null)
            {
                foreach (var kvp in jObj)
                {
                    var propKey = prefix + "_" + kvp.Key;
                    var valueType = kvp.Value.GetType();
                    if (typeof(JContainer).IsAssignableFrom(valueType))
                    {
                        IndexNestedObject(fields, kvp.Value, propKey);
                    }
                    else
                    {
                        fields.Add(propKey, kvp.Value.ToString().StripHtml());
                    }
                }
            }
        }
        else if (objType == typeof(JArray))
        {
            var jArr = obj as JArray;
            if (jArr != null)
            {
                for (var i = 0; i < jArr.Count; i++)
                {
                    var itm = jArr[i];
                    var propKey = prefix + "_" + i;
                    var valueType = itm.GetType();
                    if (typeof(JContainer).IsAssignableFrom(valueType))
                    {
                        IndexNestedObject(fields, itm, propKey);
                    }
                    else
                    {
                        fields.Add(propKey, itm.ToString().StripHtml());
                    }
                }
            }
        }
    }
}
With the DetectIsJson extension method being:
public static class StringExtensions
{
    public static bool DetectIsJson(this string input)
    {
        input = input.Trim();
        if (input.StartsWith("{") && input.EndsWith("}"))
            return true;
        if (input.StartsWith("["))
            return input.EndsWith("]");
        return false;
    }
}
This should allow you to search across the data, but it'll make it tricky to reload that data back in, so your best bet in your view is to reload the entity from the content cache. Another drawback is that you won't be able to identify where in the content the keyword was, in case you wanted to highlight/preview the matching content, but these are the trade-offs.
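For the "one big field" variant mentioned above, a minimal sketch might look like the following. It assumes the flattened values are already in the fields dictionary under a common key prefix; the class, the method and the field name `archetypeText` are hypothetical and not part of Examine or ezSearch. The point of a single fixed name is that a search which only queries a configured list of field names has just one extra name to include.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombinedFieldSketch
{
    // Hypothetical field name; pick anything that doesn't clash with a real property alias.
    public const string CombinedFieldName = "archetypeText";

    // Joins the values of every field whose key starts with the given prefix
    // into one space-separated field, so a search only has to target one name.
    public static void AddCombinedField(IDictionary<string, string> fields, string prefix)
    {
        var values = fields
            .Where(kvp => kvp.Key != CombinedFieldName
                          && kvp.Key.StartsWith(prefix, StringComparison.Ordinal))
            .OrderBy(kvp => kvp.Key, StringComparer.Ordinal)
            .Select(kvp => kvp.Value)
            .Where(v => !string.IsNullOrWhiteSpace(v))
            .ToList();

        if (values.Count == 0) return;

        // Indexer assignment overwrites an existing entry, whereas Add() would throw.
        fields[CombinedFieldName] = string.Join(" ", values);
    }
}
```

Inside a GatheringNodeData handler this would be called once, after the flattening step, with the event's Fields dictionary and whatever prefix the flattened keys share.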
Thank you so much for the info, I'm slightly confused as to how my code isn't doing exactly what your code is doing when I can see the nested content from the archetypes in the examine results in the back office?
Out of curiosity, as the "parent" node that the modules are held under has the umbracoNaviHide property on by default, could this have anything to do with the fact that the search results aren't showing anything from the "child" modules?
To illustrate, a page structure could be as follows:
page
  modules
    archetype
    archetype
    archetype
With the module node having the umbracoNaviHide property selected.
Hence, if the path is then hidden the search won't go any deeper ...
Quite possibly as ezSearch by default checks umbracoNaviHide to see if a doc should be hidden from search. You can change that via the "hideFromSearchField" macro param though:
I've set the param to "" in the render macro call I've used for the site's results page alongside un-setting the umbracoNaviHide on the parent node as well as ensuring the same property is NOT set on the modules but still no luck?
Interestingly, as I've rebuilt the index after each modification, the number of fields is continuing to grow instead of keeping the same number, any thoughts on what might cause this; or maybe a way to read the examine index files in the temp folder:
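One possible explanation for the growing field count, assuming `_index` in the handler above is a class-level counter that is never reset (its declaration isn't shown): because the counter is baked into every field name, each rebuild would mint names that have never been used before. A hedged sketch of position-based naming instead; the class and method names are hypothetical, and plain dictionaries stand in for Archetype fieldsets:

```csharp
using System.Collections.Generic;

public static class FieldNamingSketch
{
    // Builds index field names from positions that are local to one node,
    // so re-indexing the same content always produces the same names.
    public static Dictionary<string, string> NameFields(
        List<Dictionary<string, string>> fieldsets)
    {
        var result = new Dictionary<string, string>();
        for (var i = 0; i < fieldsets.Count; i++)
        {
            foreach (var property in fieldsets[i])
            {
                // i + 1 is the fieldset's position on this node, not a running total.
                var name = string.Format("archetype-{0}-{1}", property.Key, i + 1);
                result[name] = property.Value;
            }
        }
        return result;
    }
}
```

Calling this twice on the same input gives identical keys, which is the property a shared running counter does not have.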
Finally, if we could just get a hit on the archetype's "grandparent" to show that that's the page with the search term then I'm pretty sure I can find a field on that page to fill the preview from!
Thanks for walking through this with me, as Tesco says, "every little helps"!
PS. Thanks for showing me how to highlight a line in GitHub, I never knew that trick!
Thank you, Luke is brilliant! A little hard to get used to in the beginning but hey, that's what Google is for eh? ;-)
While Luke shows me that the data is definitely in the index and it finds exactly the same results as the back office Lucene search results do, I still cannot get a hit on the (grand)parent node for a match?
To reiterate, a page structure could be as follows:
page
  module manager
    archetype container
      archetype item
    archetype container
      archetype item
      archetype item
      archetype item
So, after investigating things even more, and talking the whole process out with @Bob, we've determined that in order to have the nested modular data show in the search results we have to link them directly to the grandparent node as appended content actually on the grandparent node - this is the new brick wall!
I've found a couple links on Our outlining similar situations but the answers I've found are for earlier versions of Umbraco?
I'm also not sure when to trigger the content realignment: in the GatheringNodeData event or in the individual node's Publish event. I'm tending towards the latter, although this would mean republishing all the modules and content - better to do this now all the same!
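To make the "append the module content to the grandparent" idea concrete, here is a rough sketch of the roll-up step on its own. `SketchNode` is a hypothetical stand-in for a content node and is not an Umbraco type; in a real handler the descendants and their text would have to come from Umbraco's content service or cache, which is not shown here.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a content node (page, module manager, module, ...).
public class SketchNode
{
    public string Name;
    public string Text; // searchable text already extracted for this node; may be null
    public List<SketchNode> Children = new List<SketchNode>();
}

public static class RollUpSketch
{
    // Collects the text of every descendant of the page, depth first,
    // so it can be stored in a single field on the page's own index entry.
    public static string CollectDescendantText(SketchNode page)
    {
        var builder = new StringBuilder();
        Append(page.Children, builder);
        return builder.ToString().Trim();
    }

    private static void Append(IEnumerable<SketchNode> nodes, StringBuilder builder)
    {
        foreach (var node in nodes)
        {
            if (!string.IsNullOrWhiteSpace(node.Text))
            {
                builder.Append(node.Text).Append(' ');
            }
            Append(node.Children, builder);
        }
    }
}
```

If the roll-up runs while the page itself is being indexed, the search hit lands on the page rather than on a hidden module. The catch is that the page's entry only changes when the page is re-indexed, so publishing a module on its own would presumably still need to trigger a re-index of its page.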
#h5yr!
Matt
Absolute legend!!! This has just saved me a lot of time! Cheers for sharing this
@Matt,
Thoughts?
Hi Jon,
Quite possibly as ezSearch by default checks umbracoNaviHide to see if a doc should be hidden from search. You can change that via the "hideFromSearchField" macro param though:
https://github.com/mattbrailsford/ezSearch/blob/master/Src/Our.Umbraco.ezSearch/Web/UI/Views/MacroPartials/ezSearch.cshtml#L24
Matt
You'll want to use Luke to examine the Lucene indexes. It's vital for seeing what is actually going on in your index:
http://www.getopt.org/luke/
Matt
@Matt,
Thoughts? Is this the right path to wander?