General question on implementing custom filtered search using Umbraco Examine.
I have a simple site with three DocTypes (branch, page and home) and would like to create a filtered search on content under a certain node path, i.e. home -> branch 1 -> pages. The search should return only content found under this path, and nothing from branch 2, branch 3....
Or should I be creating DocTypes for my branches and then creating a custom index for each DocType I want to use for filtered search?
Each Umbraco page has a 'Path' property, which is a comma-delimited list of its parent and ancestor ids all the way up to the top of the content tree, e.g.
-1,1138,1073,1098
(where -1 is the root of the site, 1098 is the id of the Current page, and 1138 and 1073 are the ids of the pages directly above)
So, knowing this, if you wanted your search to return results only from a certain branch, you could filter on this 'path' property to return only results whose path begins with -1,1138
With this approach, you'd only need one index.
(If your branches were different languages, though, it would make sense to create separate indexes with a different Lucene analyser for each language. You would then use the IndexParentId property on the index in examine.config to restrict results to your branch.)
Now the complication... I think (although it might have been resolved, as it's a while since I've written this kind of search) that Examine/Lucene gets thrown by the commas in the path when you search for 'part' of the path to do this kind of filtering. The way I work around this is to add a new field called 'searchablePath' to the index when Examine indexes a page; this is the Path property with the commas replaced by spaces.
eg -1 1138 1073 1098
This then allows you to filter by -1 1138 to restrict the search to this part of the content tree. Without the commas it works!
Anyway, you add this searchablePath property by tapping into the Examine GatheringNodeData event:
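To make the idea concrete without involving Examine at all, here is a small plain C# sketch of the conversion and of a branch check on the converted value. The helper names ToSearchablePath and IsUnderBranch are made up for this illustration and are not part of Umbraco or Examine.

```csharp
using System;
using System.Linq;

public static class PathSketch
{
    // Hypothetical helper: "-1,1138,1073,1098" becomes "-1 1138 1073 1098"
    public static string ToSearchablePath(string path)
    {
        return path.Replace(",", " ");
    }

    // Hypothetical helper: true when branchId appears as a whole id in the path,
    // so 113 does not match 1138
    public static bool IsUnderBranch(string searchablePath, int branchId)
    {
        return searchablePath
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Contains(branchId.ToString());
    }

    public static void Main()
    {
        var searchable = ToSearchablePath("-1,1138,1073,1098");
        Console.WriteLine(searchable);                       // -1 1138 1073 1098
        Console.WriteLine(IsUnderBranch(searchable, 1138));  // True
        Console.WriteLine(IsUnderBranch(searchable, 2000));  // False
    }
}
```

Note that the check also matches the branch page itself, since a page's own id is the last entry in its path.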
var externalIndexSet = ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"];
externalIndexSet.GatheringNodeData += ExternalIndexProvider_GatheringNodeData;
and you would add the new searchablePath property like so:
private void ExternalIndexProvider_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
{
    //make a searchable path field in the index
    if (e.IndexType == IndexTypes.Content)
    {
        //grab the current path from the Fields collection (skip the item if it has none)
        string path;
        if (e.Fields.TryGetValue("path", out path))
        {
            //let's get rid of those commas!
            //assign rather than Add, so an existing searchablePath entry can't throw
            e.Fields["searchablePath"] = path.Replace(",", " ");
        }
    }
}
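For reference, this is roughly how the two snippets above might sit together in one class. It is a sketch that assumes Umbraco 7, where classes deriving from ApplicationEventHandler are picked up at startup (the thread mentions wiring this up in the application started event). The class name SearchablePathEvents is made up.

```csharp
using Examine;
using Umbraco.Core;
using UmbracoExamine;

// Hypothetical class name; assumes Umbraco 7's ApplicationEventHandler startup hook
public class SearchablePathEvents : ApplicationEventHandler
{
    protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        // subscribe once, when the application starts
        var indexer = ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"];
        indexer.GatheringNodeData += ExternalIndexProvider_GatheringNodeData;
    }

    private void ExternalIndexProvider_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        // only content items get the extra field
        if (e.IndexType != IndexTypes.Content) return;

        string path;
        if (e.Fields.TryGetValue("path", out path))
        {
            // same conversion as above: commas become spaces
            e.Fields["searchablePath"] = path.Replace(",", " ");
        }
    }
}
```

Pages that are already in the index only pick up the new field once they are re-indexed, so a rebuild of the index would be needed after deploying this.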
Now when constructing your filter, you can insist the results have the 'path' of the branch you are searching in their searchablePath property eg:
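The example that originally followed is not shown here, so what follows is only a sketch of what such a filter could look like, not necessarily what was posted. It assumes the Examine fluent API that ships with Umbraco 7, and that the searchablePath field is analysed so each id becomes its own term. The class name, method name and the branch id are made up for illustration.

```csharp
using Examine;
using Examine.SearchCriteria;

public static class BranchSearch
{
    // Hypothetical helper: returns content items that have branchId somewhere in their searchablePath
    public static ISearchResults SearchBranch(string branchId)
    {
        var searcher = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
        var criteria = searcher.CreateSearchCriteria();

        // content items only...
        IBooleanOperation filter = criteria.Field("__IndexType", "content");

        // ...that sit under the given branch (e.g. "1138")
        filter = filter.And().Field("searchablePath", branchId);

        return searcher.Search(filter.Compile());
    }
}
```

Calling BranchSearch.SearchBranch("1138") would then be expected to return only items under (and including) page 1138; further And() clauses for the search terms can be chained on before Compile().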
I can make sense of your code and see what you are doing. I may make use of it further down the line.
I'm just getting the hang of Examine and following the documentation, from GitHub and on here.
I'm trying to make use of ExamineManager and can't get the following basic code to work.
var query = Request.QueryString["query"];
var searcher = Examine.ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
// the boolean parameter is whether to use wildcards when searching.
var searchResults = searcher.Search(query, true).OrderByDescending("createDate");
if(searchResults.Any())
{
    <ul>
        @foreach (var result in searchResults)
        {
            <li>
                <a href="@result.Url">@result.Name</a>
            </li>
        }
    </ul>
}
VS IntelliSense is highlighting a few issues with the code:-
The suggested approach is to send the ordering request along with the search criteria, which enables Lucene to handle the sorting for you.
So something like this might get you results...
using System;
using System.Linq;
using Examine;
using Examine.LuceneEngine.SearchCriteria;

//"term" is the text the user searched for (assumed here to come from the query string)
var term = Request.QueryString["query"] ?? string.Empty;
var searcher = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
var criteria = searcher.CreateSearchCriteria();
//build up a filter, starting with content only
Examine.SearchCriteria.IBooleanOperation filter = criteria.GroupedOr(new string[] { "__IndexType" }, "content");
//split search terms on spaces
var terms = term.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
//filter out English stop words (it, and, of...) and anything shorter than three characters
terms = terms.Where(x => x.Length > 2 && !Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET.Contains(x.ToLowerInvariant())).ToArray();
//search on main fields of your Umbraco documents that might contain the search term
filter = filter.And().GroupedOr(new string[] { "nodeName", "bodyText", "title", "whateverDescription", "authorName" }, terms);
//add the ordering
filter = filter.And().OrderByDescending(new string[] { "createDate" });
//do the search!
var searchResult = searcher.Search(filter.Compile());
The search results won't include the Url property, but you can obtain that by using UmbracoHelper.Url(IdOfContentItem).
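As a rough illustration of that lookup, the sketch below pairs each result's node name with its url. It assumes Umbraco 7's UmbracoHelper and Examine's ISearchResults; the class and method names are made up, and "nodeName" is simply the field used in the search above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Examine;
using Umbraco.Web;

public static class SearchResultLinks
{
    // Hypothetical helper: returns (node name, url) pairs for the given results
    public static IEnumerable<KeyValuePair<string, string>> ToLinks(ISearchResults results, UmbracoHelper umbraco)
    {
        return results.Select(r => new KeyValuePair<string, string>(
            // the name comes from the index, falling back to empty if the field is missing
            r.Fields.ContainsKey("nodeName") ? r.Fields["nodeName"] : string.Empty,
            // the url is looked up from the content id
            umbraco.Url(r.Id)));
    }
}
```

In a view, this could be called as SearchResultLinks.ToLinks(searchResult, Umbraco) and looped over to render the list of links.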
This approach gives you the most flexibility/complexity when defining your search, but if you have a small site and are after quick results then
var searchResults = Umbraco.TypedSearch(filter.Compile());
Examine best practices
Dear Umbraco team
General question on implementing custom filtered search using Umbraco Examine.
I have a simple site with three DocTypes (branch, page and home) and would like to create a filtered search on content under a certain node path, i.e. home -> branch 1 -> pages. The search should return only content found under this path, and nothing from branch 2, branch 3....
Or should I be creating DocTypes for my branches and then creating a custom index for each DocType I want to use for filtered search?
Dibs
Hi Dibs
Each Umbraco page has a 'Path' property, which is a comma-delimited list of its parent and ancestor ids all the way up to the top of the content tree, e.g.
-1,1138,1073,1098
(where -1 is the root of the site, 1098 is the id of the Current page, and 1138 and 1073 are the ids of the pages directly above)
So, knowing this, if you wanted your search to return results only from a certain branch, you could filter on this 'path' property to return only results whose path begins with -1,1138
With this approach, you'd only need one index.
(If your branches were different languages, though, it would make sense to create separate indexes with a different Lucene analyser for each language. You would then use the IndexParentId property on the index in examine.config to restrict results to your branch.)
Now the complication... I think (although it might have been resolved, as it's a while since I've written this kind of search) that Examine/Lucene gets thrown by the commas in the path when you search for 'part' of the path to do this kind of filtering. The way I work around this is to add a new field called 'searchablePath' to the index when Examine indexes a page; this is the Path property with the commas replaced by spaces.
eg -1 1138 1073 1098
This then allows you to filter by -1 1138 to restrict the search to this part of the content tree. Without the commas it works!
Anyway, you add this searchablePath property by tapping into the Examine GatheringNodeData event:
(this would be inserted in the OnApplicationStarted event - see https://our.umbraco.com/documentation/reference/events/application-startup)
and you would add the new searchablePath property like so:
Now when constructing your filter, you can insist the results have the 'path' of the branch you are searching in their searchablePath property eg:
a bit of a faff! but once it's in place works well.
regards
Marc
cheers Marc for reply
I can make sense of your code and see what you are doing. I may make use of it further down the line.
I'm just getting the hang of Examine and following the documentation, from GitHub and on here.
I'm trying to make use of ExamineManager and can't get the following basic code to work.
VS IntelliSense is highlighting a few issues with the code:-
Any advice appreciated
Dibs
Hi Dibs
I think those docs might be a little out of date...
... if you look at the Examine documentation itself:
https://github.com/Shazwazza/Examine/wiki/Sorting-results
The suggested approach is to send the ordering request along with the search criteria, which enables Lucene to handle the sorting for you. So something like this might get you results...
The search results won't include the Url property, but you can obtain that by using UmbracoHelper.Url(IdOfContentItem).
This approach gives you the most flexibility/complexity when defining your search, but if you have a small site and are after quick results then
will return an IEnumerable<IPublishedContent>
I hear Marc
I have been going through the Github docs and starting to get my head round Examine.
Thanks for your input, much appreciated.
I'll close off this query and say: read the documentation on GitHub : )
Solution for all things Examine: read https://github.com/Shazwazza/Examine
Dibs