How to search content when nodes consist of complex content structure?
In an Umbraco 13 solution I want to implement a search that searches in nodes (pages in the frontend) that consist of content from these "sources":
1. Normal properties like text, text areas and rich text editors on the nodes themselves
2. Block list element properties, where the elements also have different document types attached with properties like text, text areas and rich text editors
3. Blocks in rich text editors (these blocks also have properties like text, text areas and rich text editors)
4. Multi node tree pickers that get content from other content nodes with normal properties and rich text editor blocks
I guess that bullet #1 is straightforward with the default Umbraco index and search functionality.
But how about bullets #2, #3 and #4? That includes handling that a rich text editor can contain a block, and inside that block there can be a rich text property that also contains a block (and so on, recursively).
I get the sense that it might be much easier to crawl the site from the frontend and index the crawled content by URL.
So my questions are these:
Should I use Umbraco's standard indexing that indexes "from the inside" or should I crawl the site "from the outside"?
I am also checking out Algolia. It seems like quite a cool search engine, but as I read their docs I need to make a "mirror" of the frontend in a JSON structure. So I will have to maintain two structures of content, but I think I will be OK with that.
You can write a search resolver that uses the Umbraco indexes to search for the passed search term, starting from the parent node of all pages. The resolver uses the aliases of all the fields you want to include in the search. You can use the hint below for this:
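    // Assumes _examineManager is an injected IExamineManager and that
    // UmbracoIndexes resolves to Umbraco's Constants.UmbracoIndexes.
    var canGetIndex = _examineManager.TryGetIndex(UmbracoIndexes.ExternalIndexName, out IIndex index);
    if (canGetIndex)
    {
        var searcher = index.Searcher;
        // Match the search term against the "title" field of content nodes.
        var searchResult = searcher.CreateQuery(IndexTypes.Content)
            .Field("title", searchTerm)
            .Execute();
    }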
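A field query like the one above only matches text that is stored under that field, so for the "from the inside" route the nested block content has to end up in some indexed field first. One possible approach, sketched here under assumptions and not something the posters in this thread describe, is to flatten all nested text into one string per page. `ContentNode` and `SearchTextFlattener` are hypothetical names standing in for Umbraco's page and block models; reading the real block values and writing the string into the index are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a page or a block: its own text values plus nested blocks.
public sealed record ContentNode(IReadOnlyList<string> Texts, IReadOnlyList<ContentNode> Children);

public static class SearchTextFlattener
{
    // Walks the tree depth-first and joins every text value into one string.
    public static string Flatten(ContentNode node, int maxDepth = 10)
    {
        var builder = new StringBuilder();
        Append(node, builder, maxDepth);
        return builder.ToString().Trim();
    }

    private static void Append(ContentNode node, StringBuilder builder, int depthLeft)
    {
        foreach (var text in node.Texts)
        {
            if (!string.IsNullOrWhiteSpace(text))
                builder.Append(text.Trim()).Append(' ');
        }

        // Stop descending once the depth budget is used up (guards against runaway nesting).
        if (depthLeft == 0) return;

        foreach (var child in node.Children)
            Append(child, builder, depthLeft - 1);
    }
}
```

The recursion handles "a block inside a rich text editor inside a block" to any depth up to `maxDepth`; the limit of 10 is an arbitrary assumption.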
In case of "from the inside"; how?
In case of "from the outside"; how?
You can use Full Text Search to index and search the rendered frontend output: https://marketplace.umbraco.com/package/our.umbraco.fulltextsearch
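To illustrate what indexing "from the outside" boils down to, here is a minimal sketch that reduces a page's rendered HTML to plain text. It is not taken from the package linked above, and `RenderedPageText` is a hypothetical name. Regex-based stripping is crude, so a real HTML parser would be more robust.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class RenderedPageText
{
    // Reduces rendered HTML to plain text suitable for a full-text field.
    public static string Extract(string html)
    {
        // Drop script and style elements together with their contents.
        var withoutCode = Regex.Replace(html, @"<(script|style)\b[^>]*>.*?</\1\s*>", " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        // Replace every remaining tag with a space so adjacent words stay separated.
        var withoutTags = Regex.Replace(withoutCode, @"<[^>]+>", " ");
        // Decode entities such as &amp; and collapse runs of whitespace.
        var decoded = WebUtility.HtmlDecode(withoutTags);
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

The HTML itself could be fetched per URL with `HttpClient.GetStringAsync`. Because the text comes from the rendered page, blocks, nested rich text and picked nodes are included without any knowledge of the content structure.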
Cool, thanks. I will look into that.
I have marked Søren Kottal's answer as the solution since it answers my question of how it's possible to crawl an Umbraco site from the frontend.
I plan, though, to use Algolia since it has a lot of great features that I would like to take advantage of.
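For the JSON "mirror", a minimal sketch of what one flat record per page could look like. Algolia identifies each record by an `objectID` attribute; the other field names, the `PageRecord` type and the choice of value for `objectID` are assumptions for illustration, and sending the record to Algolia is not shown.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical flat record for one page; only the objectID attribute name comes from Algolia.
public sealed record PageRecord(
    [property: JsonPropertyName("objectID")] string ObjectId,
    [property: JsonPropertyName("url")] string Url,
    [property: JsonPropertyName("title")] string Title,
    [property: JsonPropertyName("content")] string Content);

public static class PageRecordJson
{
    // Serializes one page record to the JSON shape that would be mirrored to the search engine.
    public static string Serialize(PageRecord record) => JsonSerializer.Serialize(record);
}
```

The `content` field would hold the page's searchable text in one string, however it is gathered, so the second structure stays a flat projection of each page and not a full copy of the content tree.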
You can write a search resolver that uses the Umbraco indexes to search for the passed search term, starting from the parent node of all pages. The resolver uses the aliases of all the fields you want to include in the search. You can use the below hint for this:
But that solution will not crawl the site, right (i.e. "see" it from the frontend)?
No, it will not crawl the site from the frontend. You pass the search term from the frontend to the search resolver, and you should create a SearchResolverApi for that.