Creating Examine queries without specifying fields.
In a bid to make a generic method for handling Examine queries I have created the following method.
/// <summary>
/// Searches the Examine index for nodes containing the given term.
/// </summary>
/// <param name="query">
/// The query to search for.
/// </param>
/// <param name="fields">
/// A list of fields within a node to search.
/// </param>
/// <param name="searchProvider">
/// The name of the search provider.
/// </param>
/// <param name="searchType">
/// The search type.
/// </param>
/// <param name="booleanOperation">
/// The Boolean operation which determines what kind of operators to
/// apply to the search.
/// Defaults to OR.
/// </param>
/// <returns>
/// The <see cref="ISearchResults"/> containing the search results,
/// or null when the query is empty.
/// </returns>
public static ISearchResults Investigate(
    string query,
    IEnumerable<string> fields,
    string searchProvider,
    SearchType searchType,
    BooleanOperation booleanOperation = BooleanOperation.Or)
{
    ISearchResults results = null;

    if (!string.IsNullOrWhiteSpace(query))
    {
        // Look up the search provider by name.
        BaseSearchProvider baseSearchProvider =
            ExamineManager.Instance.SearchProviderCollection[searchProvider];

        ISearchCriteria criteria =
            baseSearchProvider.CreateSearchCriteria(searchType.ToDescription(), booleanOperation)
                .GroupedOr(fields, query.Fuzzy().Value)
                .Compile();

        results = baseSearchProvider.Search(criteria);
    }

    return results;
}
Thanks for that, I've copied your class over here since links never last forever.
using System;
using System.Text;
using System.Xml.Linq;
using Examine;
using Umbraco_Site_Extensions.Helpers;
using umbraco.BusinessLogic;
using UmbracoExamine;

namespace Umbraco_Site_Extensions.examineExtensions
{
    public class ExamineEvents : ApplicationBase
    {
        public ExamineEvents()
        {
            ExamineManager.Instance.IndexProviderCollection[Constants.CogWorksIndexer]
                .GatheringNodeData += ExamineEvents_GatheringNodeData;
        }

        void ExamineEvents_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
        {
            AddToContentsField(e);
        }

        /// <summary>
        /// munge into one field
        /// </summary>
        /// <param name="e"></param>
        private void AddToContentsField(IndexingNodeDataEventArgs e)
        {
            var fields = e.Fields;
            var combinedFields = new StringBuilder();

            foreach (var keyValuePair in fields)
            {
                combinedFields.AppendLine(keyValuePair.Value);
            }

            e.Fields.Add("contents", combinedFields.ToString());
        }
    }
}
will cause the compiler to generate a query as follows.
{+(+(+searchable:"1077")) +__IndexType:content}
SearchType is an enum representing the UmbracoExamine.IndexTypes static class, which cannot be passed as a constrained parameter. (Incidentally, why isn't that class an enum, since it acts like one?)
My issue is this: in the config files I can set exactly which fields to index for my doctypes, which lets me constrain my index to the bare minimum. I would therefore expect to be able to pass just my query to the instance of ISearchCriteria, without having to specify again which fields to search, and have it search all indexed fields.
Something like:
Unfortunately I cannot find anything to allow me to do that. Is there something obvious I have missed?
For some reason there is no IntelliSense for many of the fluent API methods, so I am coding blind.
James, what I typically do is use the GatheringNodeData event to munge all the content into one field called contents, and then search on that. See http://thecogworks.co.uk/blog/posts/2012/november/examiness-hints-and-tips-from-the-trenches-part-2/ for more info. If you add any new fields to the config that's fine, as the next publish will add them to the contents field.
Regards
Ismail
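For readers following along, the search side of this suggestion might look roughly like the sketch below. It assumes the combined field is named "contents" (as in the class posted in this thread) and uses only the CreateSearchCriteria, Field, Compile and Search calls from the Examine fluent API; the ContentsSearch class and its method name are made up for illustration.

```csharp
using Examine;
using Examine.SearchCriteria;

public static class ContentsSearch
{
    // Assumed name of the combined field; it must match whatever the
    // GatheringNodeData handler writes into the index.
    public const string ContentsField = "contents";

    // Hypothetical helper: searches only the combined field, so the caller
    // never has to list individual field names.
    public static ISearchResults Search(string query, string searchProvider)
    {
        if (string.IsNullOrWhiteSpace(query))
        {
            return null;
        }

        var searcher = ExamineManager.Instance.SearchProviderCollection[searchProvider];

        // Match the query against the combined field only.
        ISearchCriteria criteria = searcher.CreateSearchCriteria()
            .Field(ContentsField, query)
            .Compile();

        return searcher.Search(criteria);
    }
}
```

The blank-query guard mirrors the Investigate method from the original question, so an empty search box returns null rather than hitting the index.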
Thanks for that, I've copied your class over here since links never last forever.
Ok... I can see what your class is doing. How do I go about implementing it? Where do I declare the constructor?
Also, ApplicationBase is marked as obsolete in versions 4.8+. According to these docs, ApplicationEventHandler should be used instead.
Just have the class included in your project somewhere. It will get picked up by Umbraco automagically and you should be fine.
James,
You can update it to use ApplicationEventHandler; just wire up the GatheringNodeData event there. Both will work.
Regards
Ismail
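A minimal sketch of what the ApplicationEventHandler version could look like, assuming Umbraco 6.1+ or 7 where ApplicationStarted has the signature shown. The indexer name "ExternalIndexer" and the class name ExamineStartup are placeholders; use the indexer name from your own ExamineSettings.config. Unlike the class posted earlier, this variant assigns the field by key, so re-running it does not throw on a duplicate key.

```csharp
using System.Collections.Generic;
using System.Text;
using Examine;
using Umbraco.Core;

public class ExamineStartup : ApplicationEventHandler
{
    // Placeholder indexer name; replace with the one from your config.
    private const string IndexerName = "ExternalIndexer";

    protected override void ApplicationStarted(
        UmbracoApplicationBase umbracoApplication,
        ApplicationContext applicationContext)
    {
        // Wire up the event once, at application start.
        ExamineManager.Instance.IndexProviderCollection[IndexerName]
            .GatheringNodeData += (sender, e) => AddContentsField(e.Fields);
    }

    // Combines every gathered field value into a single "contents" field.
    public static void AddContentsField(IDictionary<string, string> fields)
    {
        var combined = new StringBuilder();

        foreach (var pair in fields)
        {
            // Skip a previously combined value so it is not folded in twice.
            if (pair.Key == "contents")
            {
                continue;
            }

            combined.AppendLine(pair.Value);
        }

        // Assign by key (after the loop) rather than Add, to avoid a
        // duplicate-key exception.
        fields["contents"] = combined.ToString();
    }
}
```

As noted above, the class only needs to be compiled into the project; Umbraco discovers ApplicationEventHandler subclasses at startup.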
BTW,
I recommend http://our.umbraco.org/projects/website-utilities/ezsearch
Regards
Ismail
Aces... I got it working. Big thanks to both of you! :)
I'll need to check if I'm allowed, as I'd like to publish a gist of what I have done. I have enhanced the method so that I can work with multiple IndexProviders.
Works an absolute treat now.
Great to hear James :)
Remember to mark Ismail's post as the accepted answer :)
Nice work guys!
Just for some added information, if you don't specify any fields in your config then Examine will index everything which is why when you search it wants you to specify fields.
The solution posted is ideal for what you wanted, where you have specified explicit fields in your config and just want to search them all. This method is also useful when you want to include child node data inside a single index entry (i.e. if a child node just contains data but is not actually a template itself).
I may have found an issue with the given solution, I could be mistaken though.
Say I want to search for a node by matching more than one property within it, e.g. path and nodeName.
Using code like this.
will cause the compiler to generate a query as follows.
Obviously that's incorrect. It looks like the query compiler is skipping duplicate field keys when generating the query. Other than explicitly dictating the fields, there seems to be no way to ensure that the compiler checks the field for both queries.
Am I wrong?
James,
I use the munged field for full-text search; then, if I want to do anything specific on a field like path or any other property, I add those fields to the query explicitly. BTW, path is not tokenised, so you will need to add a new searchable path field using the GatheringNodeData event. See the blog post; there is a section on it.
Regards
Ismail
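A rough sketch of the searchable path idea, assuming the path is stored as a comma-separated list of node ids (for example "-1,1050,1077") and that it is among the fields gathered for the node. The field name "searchablePath" and the helper class are illustrative; the approach is simply to replace the commas with spaces so that each id can be matched as its own term.

```csharp
using System.Collections.Generic;

public static class SearchablePath
{
    // Illustrative field names; "path" is the standard Umbraco attribute,
    // "searchablePath" is whatever name you choose to query on.
    public const string SourceField = "path";
    public const string TargetField = "searchablePath";

    // Intended to be called from a GatheringNodeData handler with e.Fields.
    public static void AddTo(IDictionary<string, string> fields)
    {
        string path;

        if (fields.TryGetValue(SourceField, out path) && !string.IsNullOrEmpty(path))
        {
            // "-1,1050,1077" becomes "-1 1050 1077".
            fields[TargetField] = path.Replace(",", " ");
        }
    }
}
```

If path is not listed among the indexed attribute fields, the value would need to be read from the node itself instead; the helper above leaves the fields untouched in that case.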
Thanks Ismail,
I've actually written code to normalize paths etc. in the startup event. I've amended my method to accept an optional IEnumerable<string> and that has fixed it.
Many Thanks!
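The amended method itself was not posted, so the following is only a guess at the shape of the idea: fall back to the combined "contents" field when no explicit fields are supplied, and otherwise search the fields the caller asked for. The SearchFields class and Resolve method are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SearchFields
{
    // Assumed name of the combined field written at index time.
    public const string ContentsField = "contents";

    // Returns the explicit fields when any are given, otherwise the
    // combined field, so the result can be passed straight to GroupedOr.
    public static IList<string> Resolve(IEnumerable<string> fields = null)
    {
        var explicitFields = (fields ?? Enumerable.Empty<string>())
            .Where(f => !string.IsNullOrWhiteSpace(f))
            .Distinct()
            .ToList();

        return explicitFields.Count > 0
            ? explicitFields
            : new List<string> { ContentsField };
    }
}
```

With something like this in place, a full-text search can call the method with no fields at all, while a targeted search (say on path and nodeName) passes them explicitly.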
I worked through this and was able to implement it using the ApplicationEventHandler, but my "contents" field is only bringing in the <IndexAttributeFields> specified in the config, and ignores any of the <IndexUserFields> I've specified in the config.
I'm able to look at the index with Luke, and it indexes the <IndexUserFields> fields on their own. But when I step through the GatheringNodeData event, I'm not seeing them come through in the GatheringNodeData args. Not sure what I am missing here.
Old post but i fixed it like this:
Just get all the fields and search through that!
Edwin,
If you have a lot of doctypes and properties that could get very expensive.
Regards
Ismail
Ismail,
But I don't want to make a custom index set for every doctype. If it becomes slow I will cache it. Thanks for the advice!
Regards, Ed
Edwin,
No need to make a custom index set. What you can do is use the GatheringNodeData event to add a new field, call it selectAll, and set its value to 1. Then you can query on it with .Field("selectAll", "1"); that will pull everything back and it will be lightning quick.
Regards
Ismail
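As a sketch of this marker-field approach: the helper below adds the marker at index time and queries on it at search time. The field name and value come from the post above; the SelectAll class and its method names are illustrative, and the query uses the Field method of the Examine fluent API.

```csharp
using System.Collections.Generic;
using Examine;
using Examine.SearchCriteria;

public static class SelectAll
{
    public const string FieldName = "selectAll";
    public const string FieldValue = "1";

    // Call from a GatheringNodeData handler with e.Fields so that every
    // indexed document carries the same marker.
    public static void AddMarker(IDictionary<string, string> fields)
    {
        fields[FieldName] = FieldValue;
    }

    // Returns every document that carries the marker, which mimics a
    // "select *" over the index.
    public static ISearchResults GetAll(string searchProvider)
    {
        var searcher = ExamineManager.Instance.SearchProviderCollection[searchProvider];

        ISearchCriteria criteria = searcher.CreateSearchCriteria()
            .Field(FieldName, FieldValue)
            .Compile();

        return searcher.Search(criteria);
    }
}
```

Documents indexed before the handler was added will not have the marker, so the index would need rebuilding before GetAll returns everything.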
Are you referring to this?
http://our.umbraco.org/forum/developers/extending-umbraco/11667-GatheringNodeData-examine-event
If I understand it right, you concatenate all the fields and place the result in one field?
Edwin,
No, I mean select * type queries; see http://thecogworks.co.uk/blog/posts/2012/november/examiness-hints-and-tips-from-the-trenches-part-2/. In Lucene you cannot search on blank fields, and you cannot do a select * to select all documents. What you can do is add an arbitrary field, call it whatever you like, give it whatever value, and add it to all documents. You can then run a query on that field, and because every document has it you will get all the documents back, so in effect you are mimicking a select *.
Regards
Ismail