How to create a fast search using a custom Examine Index?
We have inherited the development of a client site which has the most poorly built search on the planet. Whoever built it, you should hang your head in shame ;)
The more content the client adds to the site, the slower the search will become as it currently uses a huge number of nested queries, returning multiple IPublishedContent nodes and then filtering them in .NET before performing yet more SQL queries.
A relatively simple search takes 800+ SQL queries, and a more complex search can run to 10k or more SQL queries and will time out.
The Umbraco node structure looks like this:
Topics > Sub Topics > AnswerBoolean
Topics > Sub Topics > AnswerString
Topics > Sub Topics > AnswerNumber
Each of the above Answer nodes has a single Answer property which is either string, number or boolean.
In case this is not already bad enough, this is where it really gets nasty....
Each of the Answer nodes is named with a country name, e.g. UK.
In total there are something like 140 possible answers per Country and about 80 countries.
The search form allows a user to select multiple countries and multiple sub topics and optionally enter search text.
The response currently returns a partial view.
It also shows the contact details for a person associated with the country and "associated links"
So all in all, it's a monster of a query, and the way it has been written, the more options you choose the more horrible the query gets. If you selected all the options it would currently try to do approximately 11k queries!
What I would like to do is flatten this data and insert it into a custom Examine index which I assume would be the fastest method to then search this content?
If there is anyone out there who can point me to good online articles or explain how to create such an index, it would be much appreciated!
Definitely a beer at #CG16 if you help me get this sorted today!
Each row of data we return will have up to 140 fields, but in the current structure each of those fields is a property on a content node.
So as an example:
Topics > Sub Topics "ST1" > AnswerBoolean ( UK ) > "my text"
Topics > Sub Topics "ST2" > AnswerString ( UK ) > "another bit"
Topics > Sub Topics "ST3" > AnswerNumber ( UK ) > "last one"
This would all be in one searchable record and if it was SQL I'd want the data out like:
ST1, ST2, ST3
"my text", "another bit", "last one"
Not sure quite how this maps across to the Examine result, but hopefully you get the idea :)
It gets more complex because for each Sub Topic in the results the search looks at the ST answer and returns related news that has that ST node ID, so I guess we will need to use another index to return this info (maybe the standard indexes?).
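To make the "one searchable record per country" idea concrete, here is a minimal, language-neutral sketch in Python. It is not Examine code: the tuple layout, the function names `flatten_by_country` and `search`, and the extra France row are assumptions made purely for illustration. It only shows the shape of the flattened data and how a country / sub topic / text filter could run against it.

```python
from collections import defaultdict

# Hypothetical input: one tuple per Answer node (sub_topic, country, answer).
# The UK rows mirror the example above; the France row is invented.
answer_nodes = [
    ("ST1", "UK", "my text"),
    ("ST2", "UK", "another bit"),
    ("ST3", "UK", "last one"),
    ("ST1", "France", "autre texte"),
]


def flatten_by_country(nodes):
    """Group Answer nodes into one flat record per country.

    Each record maps sub topic -> answer, stored as a string so that
    string, number and boolean answers can share one field type.
    """
    records = defaultdict(dict)
    for sub_topic, country, answer in nodes:
        records[country][sub_topic] = str(answer)
    return dict(records)


def search(records, countries, sub_topics, text=""):
    """Filter flat records by country and sub topic, optionally by text.

    The text match is a case-insensitive substring test; an empty
    search text matches every answer.
    """
    needle = text.lower()
    results = {}
    for country in countries:
        fields = records.get(country, {})
        hits = {
            st: fields[st]
            for st in sub_topics
            if st in fields and needle in fields[st].lower()
        }
        if hits:
            results[country] = hits
    return results


records = flatten_by_country(answer_nodes)
print(records["UK"])
print(search(records, ["UK"], ["ST1", "ST3"], "last"))
```

With the data in this shape, a search touches one record per selected country instead of walking every Answer node, which is the effect a flattened index document would be expected to have.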
Chris,
If it's v7, then for a quick way to get your index up, use https://github.com/rsoeteman/ExamineDB
Thanks Ismail,
I will take a look at that one, like you said it's probably a good pointer in the right direction.
Cheers,
Chris
Chris,
Rereading the post, the actual data is in Umbraco as nodes, right? If so, ignore my last post; that is for indexing databases. In your case you have Umbraco nodes that you want to search on.
So you will need to use the GatheringNodeData event to massage your data, and then you can query on it. I am assuming the country is a home node at the top level? If so, you can inject a searchable path and use that as a country filter. There are some examples on Our, mostly posted by me, on how to do this.
Regards
Ismail
Hi Ismail,
Unfortunately your assumption is wrong about the country, each of the child nodes is named with the country name and it has a "country picker" on the node that is currently being used.
It's a pig's ear of an implementation... and I stress... not built by me!
I think your SQL route might actually be the way for us to go. I think we need to turn things on their head so we have a single Examine result per country which contains all the current answers.
Cheers,
Chris
Chris,
SQL route? But your data is in Umbraco? If it is in Umbraco you will already have the data in the index; it might need a bit of tweaking so you can execute the searches you want.
Again, the GatheringNodeData approach holds: instead of injecting the country from the path, you can look at the country set by the picker. It should be a CSV list of IDs; however, you will need to space-separate them in the index so that they are searchable.
Regards
Ismail
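As an illustration of the space-separation step mentioned above, here is a small Python sketch. It is not the Examine API: the function name `to_searchable_ids` and the node IDs are hypothetical. The assumption behind it is that a whitespace-tokenising analyzer would index each ID as its own term, whereas a comma-joined value may end up as a single token, depending on the analyzer in use.

```python
def to_searchable_ids(csv_ids):
    """Turn a comma-separated ID list into a space-separated one.

    Stray whitespace and empty entries are dropped, so "1071,1085, 1102"
    and "1071,,1085,1102" both produce clean output.
    """
    parts = [part.strip() for part in csv_ids.split(",")]
    return " ".join(part for part in parts if part)


# Hypothetical picker value holding three node IDs.
print(to_searchable_ids("1071,1085, 1102"))
```

The same transformation would be applied to the picker value at indexing time, before the field is written to the index.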
Chris,
Get Luke and crack open your index; you will see most of the data is there, searchable and flattened.
Regards
Ismail