ezSearch returns json in Preview for Grid properties
In Umbraco 7.2 you might have a Grid datatype in use. If you search on one of those doctype properties (suppose your grid is on a document type property with the alias bodyText, so the default ezSearch macro searches it without further configuration), you'll see results like this with ezSearch 1.2:
Products
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Tibi hoc incredibile, quod beatissimum. Res tota, Torquate, non
doctorum hominum, velle post mortem epulis celebrari memoriam sui
nominis. In eo enim positum est id, quod dicimus esse…
There will need to be some way to handle the display of grid data in search results preview.
(Perhaps the core needs a StripJson() function in addition to StripHtml()?)
cheers,
doug.
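To make the StripJson() idea concrete, here is a minimal sketch of what such a helper might look like. GridPreview and StripJson are hypothetical names, not part of Umbraco or ezSearch. It assumes Json.NET (Newtonsoft.Json, which Umbraco 7 ships with) and assumes the grid JSON keeps its text in string-valued "value" properties; other grid editors may store their data differently.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class GridPreview
{
    // Hypothetical helper: returns readable text for a stored property value.
    public static string StripJson(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw)) return string.Empty;

        JToken root;
        try { root = JToken.Parse(raw); }
        catch (JsonReaderException) { return raw; } // not JSON, leave it alone

        var container = root as JContainer;
        if (container == null) return raw; // a bare JSON scalar such as 42

        // Assumption: the text we want sits in string-valued "value" properties
        // (headlines, rich text). Objects such as media pickers are skipped.
        var parts = container.Descendants()
            .OfType<JProperty>()
            .Where(p => p.Name == "value" && p.Value.Type == JTokenType.String)
            .Select(p => (string)p.Value);

        // Replace markup with spaces, then collapse runs of whitespace.
        var text = Regex.Replace(string.Join(" ", parts), "<[^>]+>", " ");
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

In a search results view this would be called on the field value before truncating it for the preview. Values that are not JSON pass through unchanged, so it should be harmless on ordinary text properties.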
I second that!
Hi Douglas Robar,
How did you solve this issue? Did you find a solution? Please help me solve this.
From what I've found, it seems to be a fundamental issue with both the default Umbraco 7.2 install (which is lame) and ezSearch.
There are 2 work-arounds:
1) Use this package instead (NOT ezSearch) https://our.umbraco.org/projects/website-utilities/full-text-search/
2) Use this library https://gist.github.com/abjerner/bdd89e0788d274ec5a33 and spend a day or two screwing around trying to get all the code working (with or without ezSearch, I don't know)
Full Text Search is pretty awesome... but it does not support hiding pages from results without manually reindexing the site every time a node is updated/created.
I've tested in 7.3.8 and, no matter what settings I use, hidden pages show in results. The add-on says that it uses umbracoSearchHide as the default field (which I use on all my sites too), but hidden pages still show. If I go to the developer tab and manually reindex, the hidden pages disappear (which is good). If I modify one of those hidden pages, or create a new page with hidden checked, the pages show in the results again until I manually reindex.
I don't know if it's a bug or if I'm doing something wrong.
Either way, ezSearch does not work on pages with the Grid Datatype at all. I've had scenarios where it will not show results even if the nodeName matches exactly.
Hi,
You are correct that Full Text Search doesn't pick up on values like umbracoSearchHide or umbracoNaviHide.
But just put this code in the App_Code folder and it works.
    using Umbraco.Core;
    using Umbraco.Core.Events;
    using Umbraco.Core.Models;
    using Umbraco.Core.Services;

    public class MyApplicationEvents : ApplicationEventHandler
    {
        protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
        {
            ContentService.Saving += ContentService_Saving; // raised before the content has been saved
            ContentService.Saved += ContentService_Saving;  // raised after the content has been saved
            base.ApplicationStarted(umbracoApplication, applicationContext);
        }

        private void ContentService_Saving(IContentService contentService, SaveEventArgs<IContent> e)
        {
            // [the body of this handler was cut off in the original post]
        }
    }
The fundamental problem is caused by storing complex data inside a single document type property. This is especially evident with the grid editor: a single property may have images, headings, rich text content, and much else. Lucene (and thus Examine and ezSearch) stores the data of each property separately, so for ordinary properties you can easily specify which ones are important and which aren't, which should be used as the source for preview text, etc.
There needs to be some way to do the same sort of thing, but for each element inside the grid (or other complex json-based data type), so that the individual elements can be understood/ranked/searched/displayed properly. This doesn't yet exist. The approach shown by abjerner is a good one. If we make complex data types to put data into Umbraco we need some way to get it back out. That may require some effort, at least for now.
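For anyone wondering what handling it in code looks like, here is a rough sketch of the general idea: hook Examine's GatheringNodeData event and replace the raw grid JSON with plain text before it reaches the index. This is not abjerner's actual code. GridIndexingEvents is a hypothetical class name, and the sketch assumes the default "ExternalIndexer" index, a grid property with the alias bodyText, and that the text sits in string-valued "value" properties of the grid JSON.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using Examine;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using Umbraco.Core;

// Hypothetical handler; put it in App_Code or a class library.
public class GridIndexingEvents : ApplicationEventHandler
{
    protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        // Assumption: the search runs against the default external index.
        ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"]
            .GatheringNodeData += OnGatheringNodeData;
    }

    private static void OnGatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        if (e.IndexType != UmbracoExamine.IndexTypes.Content) return;

        // Assumption: the grid lives on a property with the alias "bodyText".
        string raw;
        if (!e.Fields.TryGetValue("bodyText", out raw)) return;
        if (string.IsNullOrWhiteSpace(raw) || !raw.TrimStart().StartsWith("{")) return;

        JObject grid;
        try { grid = JObject.Parse(raw); }
        catch (JsonReaderException) { return; } // leave non-JSON values untouched

        // Collect the string-valued "value" properties (headlines, rich text).
        var parts = grid.Descendants()
            .OfType<JProperty>()
            .Where(p => p.Name == "value" && p.Value.Type == JTokenType.String)
            .Select(p => (string)p.Value);

        // Replace markup with spaces, collapse whitespace, overwrite the field.
        var text = Regex.Replace(string.Join(" ", parts), "<[^>]+>", " ");
        e.Fields["bodyText"] = Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

Overwriting bodyText keeps the default ezSearch configuration working. Writing the text to a separate field instead would leave the original value in the index, at the cost of pointing the search and preview fields at the new name. Either way, the site needs a reindex before existing pages pick up the change.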
My dream would be that now that Examine is part of the core, the document type UI might be extended to allow flagging the elements in complex data types so that they can be searched, ranked, and displayed properly. Until some sort of UI is available we'll have to handle it manually in code, probably as abjerner suggests.
Alternatively, you could create a new Lucene index that crawls the live site. This also has complications, though, and the Full Text Search package is an option.
It seems to me that for "professional" sites (if I can put it that way), some work is going to be needed to make search work well with complex data types. It's a matter of deciding for each project which approach will give the best results for the least work.
cheers,
doug.
Hey Doug
Yup, agree that Examine should have some way to manage complex data in a reasonable way. It might not be perfect in all cases, but we could then provide API endpoints on top of this, so developers can implement their own "Provide data for search index" format.
As discussed at the retreat and at Codegarden, nested complex content inside property editors is a challenge we want to prioritise, as it will become more and more normal to do these things. (JSON all the things)
So we will find a managed way to handle complex, nested data and a way to index and search it for most cases, and provide a hook for the remaining cases.
I have a site which would suit the grid for content population, but I need it to be searchable.
Is there any progress on this? What is the current best practice for a searchable grid?
The core team have done some good work on this. It isn't specific to ezSearch but to the way Examine/Lucene indexes the grid (and other complex JSON data types). http://issues.umbraco.org/issue/U4-7295
It will be released with Umbraco version 7.5.0, but it's already in the beta if you want to try it out.
cheers,
doug
Thanks Doug
Hey all,
Sorry to unearth such an old thread, but I wondered if there's anything 'special' required to get grid content to be indexed without JSON these days? I'm using Umbraco 7.12.3 and ezSearch 1.2 to index content generated using DocTypeGridEditor, but the search preview text is coming out with all the JSON.
I saw the core commit around 7.5 which does some smart-indexing of grid content — perhaps it just doesn't play ball with DocTypeGridEditor?
If anyone knows how to index DocTypeGridEditor data without having to go through the whole manually-coded gatheringNodeData indexing I'd be most grateful for any pointers.
Thanks