I am working on a large members area and need to be able to provide Search functionality across the members.
I have written a user control that works, but wondered if implementing an equivalent to the XSLT search would provide faster search results. An added bonus is the weighted results that XSLT search provides.
I have managed to export member data to an XML file so this part is all good.
I wondered if anyone can advise how to alter the XSLT search to read an external file rather than "/data/umbraco.config"?
Does it require some custom written / compiled code, or can you "tweak" the XSLT search file directly and hard code a file path to a custom XML file?
The second question is... how can I modify XSLTsearch to search inside this file?
This is more challenging because XSLTsearch is expecting an xml schema that matches that used by umbraco, with <node> and <data> elements. If you were clever in how you create your membership data to follow the same schema you would need very few changes to XSLTsearch to handle it. Primarily, you'd update the $source variable in XSLTsearch to use the data from the document() function rather than using $currentPage or umbraco.library:GetXmlAll().
If you don't mirror the xml schema then you'll also need to update the various select="" statements to work with your schema. This isn't impossible by any means but it would take a bit of time. Hopefully the comments in the code of XSLTsearch will guide you along your way.
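As a rough sketch of the change described above (not XSLTsearch's actual code), the source could be pointed at the exported file and then searched much as before. The file path ../data/members.xml, the search parameter, and the assumption that the export mirrors umbraco's <node> / <data> layout are all hypothetical; XSLTsearch's own variable definitions may look different.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Hypothetical search term; a real macro would read it from the request -->
  <xsl:param name="search" select="'smith'"/>

  <!-- Assumption: exported member file, resolved relative to this stylesheet -->
  <xsl:variable name="source" select="document('../data/members.xml')"/>

  <!-- XSLT 1.0 has no upper-case() function, so translate() is used instead -->
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="searchUpper" select="translate($search, $lower, $upper)"/>

  <xsl:template match="/">
    <!-- Every <node> element in the external file is a candidate -->
    <xsl:variable name="possibleNodes" select="$source/descendant-or-self::node"/>

    <!-- Keep nodes where any <data> field contains the term, ignoring case -->
    <xsl:variable name="hits"
        select="$possibleNodes[data[contains(translate(., $lower, $upper), $searchUpper)]]"/>

    <ul>
      <xsl:for-each select="$hits">
        <li><xsl:value-of select="@nodeName"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

If the export does not follow the <node> / <data> layout, the two select expressions are the places that would need to change.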
The unspoken question is.... should I use XSLTsearch for this, or something else?
I would probably use something else. Though I don't know your exact configuration or what you're trying to accomplish, I'd look into the membership api in umbraco and probably build a small .net macro that can query your live membership data without needing to export the data to an xml file and modify XSLTsearch. I suspect it would be less work and perform better in the long run.
Thanks for that detailed answer - really appreciated.
In terms of searching my XML file I had realised that I'd either have to match the xml structure (data, alias, etc) to simplify the conversion or be prepared to invest a bit of time testing / debugging - that is the fun part (providing you make progress as you go).
I had already created a user control that works to our satisfaction, but we weren't totally happy with the time it takes to generate the results, hence experimenting with your XSLT - it works so well and quickly on the live site that we thought it worth investing some time to try and adapt it to our needs.
Further to the above I have been asked to expand on our "situation" with a view to seeing if anyone can provide further comment / feedback.
The anticipated number of members is in excess of 20,000 and so we wondered if the performance of an XSLT / XML solution on those volumes would be worse than that of straight SQL queries / stored procedure queries.
Also as part of the XSLT search functionality we intend providing a group of checkboxes (e.g. I want to search on people who are involved with Beer and Wine and Spirits). Has anyone dealt with dynamic search criteria and parsed the data as part of an XSLT?
The more I think things through the more I am leaning back to SQL queries within a user control, but would appreciate anyone's thoughts.
This is a stored proc I use to spit out members. I don't have thousands of members so not sure if it would scale very well :)
create table myXML (xm_data xml)

insert into MyXML (xm_data)
select cmsContentXml.xml
from cmsContentXML
inner join cmsContent c on c.nodeid = cmsContentXml.nodeid
where c.contentType = 1071

select
    tab.col.value('@nodeName','varchar(50)') as 'ID',
    tab.col.value('@email','varchar(50)') as 'Email',
    tab.col.value('data[@alias][1]','varchar(50)') + ' ' + tab.col.value('data[@alias][2]','varchar(50)') as "Name",
    tab.col.value('data[@alias][1]','varchar(50)') as 'First Name',
    tab.col.value('data[@alias][2]','varchar(50)') as 'Last Name',
    tab.col.value('data[@alias][3]','varchar(50)') as 'Home Phone',
    tab.col.value('data[@alias][4]','varchar(50)') as 'Mobile Phone',
    tab.col.value('data[@alias][5]','varchar(50)') as 'Address',
    tab.col.value('data[@alias][6]','varchar(50)') as 'AvartarMediaID',
    tab.col.value('data[@alias][7]','varchar(50)') as 'ShowOnProfile'
from myXML
cross apply xm_data.nodes('node') tab (col)
Hi skiltz - thanks for the code snippet - to date I've not gone to the extent of a Stored Procedure but your code is much leaner than mine - an example of which is below.
CAST(cmsContentXml.xml as xml).value('(node/data[@alias="usrRoles"])[1]', 'nvarchar(MAX)')
Maybe changing to your format will provide some speed improvements - worth trying I guess . . .
No worries, let me know how you get on. I'm going to be moving a site with 30,000 members to Umbraco shortly, so this stuff is of a lot of interest to me.
I have just successfully adapted the XSLT search to search an external XML file.
In a bare bones testing format I am searching across 3,000 "node" records with 3 data fields on each node - effectively 9,000 fields being checked.
The search is blisteringly fast - far faster than my previous SQL query.
So based on the initial testing I feel confident we can implement this solution across a larger dataset - our member base is likely to be around 30,000 too.
Feel free to check my profile and send me an email (temporarily listed my email address there) - I will gladly share what I know / have learnt to date.
Glad you've got this working and that XSLTsearch is up to the task. As you say, xslt is blisteringly fast! I will note that being a 'brute force' search there is a more or less linear increase in time to execute as the data set grows. With 10x the members to search you'll see search times that are in the neighborhood of 10x longer.
Though, if you can quickly eliminate a bunch of items from the search by using the checkboxes to filter out various groups you'll get better performance.
I'd love to hear more about what you've done. You've got my email, or you can contact me through my website.
As per previous posts I now have modified the XSLT search to read an external XML file of exported member data.
My search page is to have a textbox (same as standard XSLT search) but then also a series of checkboxes to apply additional filters to the search, e.g. list of companies. These companies are stored as standard nodes within Umbraco and I am successfully displaying them on the search form.
Then when the search form is submitted I populate a hidden text field (via JavaScript) with a comma-separated list of the selected checkboxes (e.g. ,1234,4567,6789,). I'm not sure if the commas are needed at both ends, but I noted what was being done within the XSLT for previewFields, etc.!
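On the question of commas at both ends: they matter if the list is later tested with contains(), because without them one ID can match inside a longer one. A minimal sketch of the difference, using the example IDs above and a made-up ID 123:

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- No wrapping commas: 123 is found inside 1234, a false match -->
    <xsl:value-of select="contains('1234,4567,6789', '123')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Commas around both the list and the ID: only whole IDs match -->
    <xsl:value-of select="contains(',1234,4567,6789,', concat(',', '123', ','))"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The last ID in the list still matches thanks to the trailing comma -->
    <xsl:value-of select="contains(',1234,4567,6789,', concat(',', '6789', ','))"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this prints true, false and true on separate lines.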
I can then successfully access the form data by using:
My problem is now how to filter the node set on the checkbox selections.
Initially I am trying to pass the node set to a template to do the looping / filtering but I cannot get that to work - my excerpt of code is below (bolded code is as per the original XSLT search):
<!-- reduce the number of nodes for applying all the functions in the next step -->
<xsl:variable name="possibleNodes" select="$items/descendant-or-self::node"/>

<!-- Call template 'companyMatches' to find nodes that match selections -->
<xsl:variable name="possibleNodesCompanyMatches">
  <xsl:call-template name="companyMatches">
    <xsl:with-param name="nodeSetOne" select="$possibleNodes"/>
  </xsl:call-template>
</xsl:variable>

<!-- generate a string of a semicolon-delimited list of all @id's of the matching nodes -->
<xsl:variable name="matchedNodesIdList">
  <xsl:call-template name="booleanAndMatchedNodes">
    <xsl:with-param name="yetPossibleNodes" select="$possibleNodesCompanyMatches"/>
    <xsl:with-param name="searchTermList" select="concat($searchUpper, ' ')"/>
  </xsl:call-template>
</xsl:variable>
. . .
<xsl:template name="companyMatches">
  <xsl:param name="nodeSetOne"/>
  <xsl:variable name="matchBGNodes" select="$nodeSetOne"/>
  <xsl:value-of select="$matchBGNodes" />
</xsl:template>
All I am initially trying to do is pass the node set through the template without applying any filtering, but I get the following:
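The error output itself is not preserved in the thread, so the following is only a guess at the cause. In XSLT 1.0, a variable that is filled by its body (rather than by a select attribute) holds a result tree fragment, not a node set, and xsl:value-of emits only the text of the first node. A minimal sketch of passing nodes through a template and getting a usable node set back, assuming a processor that supports the Microsoft node-set() extension (the msxml prefix is simply a choice of prefix bound to that namespace):

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxml="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="msxml">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="possibleNodes" select="//node"/>

    <!-- Filled by its body, so this is a result tree fragment -->
    <xsl:variable name="filteredFragment">
      <xsl:call-template name="companyMatches">
        <xsl:with-param name="nodeSetOne" select="$possibleNodes"/>
      </xsl:call-template>
    </xsl:variable>

    <!-- Convert the fragment to a node set before using it in a path -->
    <xsl:variable name="filteredNodes"
        select="msxml:node-set($filteredFragment)/node"/>

    <xsl:value-of select="count($filteredNodes)"/>
  </xsl:template>

  <xsl:template name="companyMatches">
    <xsl:param name="nodeSetOne"/>
    <!-- copy-of copies the nodes themselves, including their subtrees -->
    <xsl:copy-of select="$nodeSetOne"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that the copied nodes are new nodes, detached from the original document, so nested <node> elements get copied more than once and ancestor lookups no longer work. Building a list of IDs and filtering the original node set avoids that.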
Further to the above - I have just solved the problem.
I have replicated the "booleanAndMatchedNodes" template within the XSLT search and applied my own custom filtering within the template on the relevant comma separated data/alias.
The part of the whole equation I was missing was the following:
<!-- get the actual matching nodes as a nodeset -->
<xsl:variable name="matchedNodes" select="$possibleNodes[contains($matchedNodesIdList, concat(';', concat(@id, ';')))]" />
The above takes the returned list of node IDs and filters the full node set to only return the matched nodes.
Like pieces in a jigsaw - it now all fits together perfectly!
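For anyone following along, here is a rough sketch of how the two steps might fit together. It is not the actual modified XSLTsearch code: the data alias "companies" (a comma-separated list of company IDs on each member), the selected parameter and the anyIdMatches template are all made up for illustration.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical checkbox selection, wrapped in commas as on the form -->
  <xsl:param name="selected" select="',1234,6789,'"/>

  <xsl:template match="/">
    <xsl:variable name="possibleNodes" select="//node"/>

    <!-- Step 1: build a ';id;id;' string of nodes matching any selected company -->
    <xsl:variable name="matchedNodesIdList">
      <xsl:text>;</xsl:text>
      <xsl:for-each select="$possibleNodes">
        <xsl:variable name="isMatch">
          <xsl:call-template name="anyIdMatches">
            <xsl:with-param name="ids" select="substring-after($selected, ',')"/>
            <xsl:with-param name="field"
                select="concat(',', data[@alias='companies'], ',')"/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:if test="$isMatch = 'true'">
          <xsl:value-of select="concat(@id, ';')"/>
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>

    <!-- Step 2: filter the original node set by that list of IDs -->
    <xsl:variable name="matchedNodes"
        select="$possibleNodes[contains($matchedNodesIdList, concat(';', concat(@id, ';')))]"/>

    <xsl:for-each select="$matchedNodes">
      <xsl:value-of select="@nodeName"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Walks 'id,id,...,' recursively; outputs 'true' once any ID is in $field -->
  <xsl:template name="anyIdMatches">
    <xsl:param name="ids"/>
    <xsl:param name="field"/>
    <xsl:variable name="first" select="substring-before($ids, ',')"/>
    <xsl:choose>
      <xsl:when test="$first = ''">false</xsl:when>
      <xsl:when test="contains($field, concat(',', $first, ','))">true</xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="anyIdMatches">
          <xsl:with-param name="ids" select="substring-after($ids, ',')"/>
          <xsl:with-param name="field" select="$field"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Because step 2 filters $possibleNodes itself, the matched nodes are still the original nodes rather than copies. With an empty selection this sketch matches nothing, so a real search form would probably want to skip the company filter in that case.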
Modify XSLT Search to Read External XML File
Nigel,
Check if umbraco.library.GetXmlDocumentByUrl(url) instead of umbraco.library.GetXmlNodeById(nodeId) can be of any help to you.
Cheers,
/Dirk
Hi Dirk - thanks for the feedback.
That sounds like a good idea - will work on it tomorrow and if I manage to implement a solution will post details on this thread.
Thanks again.
Nigel
Hi, Nigel,
There are two questions here.
The first one is... how can I read the content of an external xml file? This is very simple using the document() function. Here's an example...
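The example from the original post is not preserved here. As a stand-in, this is a minimal sketch of reading an external file with document(); the file name members.xml, its location relative to the stylesheet, and the <node> elements with a nodeName attribute are assumptions.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Assumption: a relative path is resolved against this stylesheet's location -->
  <xsl:variable name="members" select="document('../data/members.xml')"/>

  <xsl:template match="/">
    <ul>
      <!-- List every <node> element found anywhere in the external file -->
      <xsl:for-each select="$members//node">
        <li><xsl:value-of select="@nodeName"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

How the path resolves, and whether document() is enabled at all, depends on how the host application loads the stylesheet, so it is worth testing with a small file first.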
Hope this helps.
cheers,
doug.
You could do a search on Lucene. Like so http://forum.umbraco.org/yaf_postst8410_Member-Search.aspx
Thanks from a fellow kiwi for your input.
Nigel
Is anyone able to shed light on what I am doing wrong ?
Thanks
Nigel
Umbraco / XSLT rocks !
Wahooo! Well done.
cheers,
doug.