Searching inside nForum posts shows HTML tags
I'm trying to pull search words from multiple sources, including nForum posts. Unfortunately, all the HTML tags show up in the search preview results whenever a result comes from an nForum post. This is presumably because the posted content is stored with encoded characters (e.g. &lt;p&gt; instead of <p>), so the tags show up as literal HTML code when previewed.
Is there a workaround or solution for this without affecting the rest of the search results?
Hi, Joey,
Ahh, yes, I can imagine that's exactly what's happening... nForum isn't storing the HTML markup the same way that Umbraco does. I strip HTML from the search fields, but if the tags are already encoded that won't work.
What you could do is modify the xsltsearch.xslt file to first 'encode' the field (but only if it is an nForum document type and an nForum field) and then strip the html as usual. Basically, use an xsl:choose statement to check the field and encode+htmlstrip or just htmlstrip as appropriate. The line you'd want to augment in xsltsearch.xslt is:
Or you might check with Lee Messenger (author of nForum) to see if there's a way to stop the encoding of markup when a post is created and only encode on the display side of things when it is output (to avoid potential security and XSS concerns).
Hope that's helpful.
cheers,
doug.
Thanks for the reply Doug.
The problem is that I need to decode the values back into tags, not encode them. And unfortunately there's no umbraco.library:HtmlDecode. The only approach I can think of is disable-output-escaping="yes", which at least outputs the text with the tags, but I then need to store that result in a variable in order to use the StripHtml function, and declaring a variable that uses both disable-output-escaping and a previous variable just throws an error in the XSLT file. I also tried passing the output-escaped text as a parameter to a template call, but that had no effect on decoding it.
Short of extending the umbraco library with my own custom function to decode HTML (which I'd rather not do), I'm not sure how else to approach this.
EDIT: *sigh* Never mind. Given client deadlines, I went ahead and downloaded the 4.7.1.1 source code and added an HtmlDecode method to the umbraco.library class. Replaced the DLL and got it working. Not sure why the developers didn't put in a decoder to begin with.
EDIT 2: Aaaaand replacing the old umbraco.dll completely broke the backoffice. Why is the GetMacro() method missing? They're both builds from 4.7.1.1... ugh. Nothing is ever easy... back to where I was now.
You don't need to rebuild Umbraco; you can just make your own XSLT extension method. Details and a how-to are at http://blog.percipientstudios.com/2010/11/12/make-an-app_code-xslt-extension-for-umbraco.aspx
It's exactly how I add some capabilities to XSLTsearch (in the /app_code/xsltsearch.cs file).
cheers,
doug.
Thanks again. I learned something new today. Didn't put 2+2 together that I could call methods from App_Code class files in XSLT.
I just added a static "decodehtml" method to the XSLTSearch.cs file and am calling it unconditionally, just to get it working for now. (I know it's not efficient, but at least it works OK, and none of the non-forum search results appear to be negatively affected.)
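For anyone landing here later, a minimal sketch of what a method like that could look like, assuming it simply wraps System.Web.HttpUtility.HtmlDecode. The class name HtmlDecodeSketch is hypothetical and only there to keep the example self-contained; in the setup described above the method would sit in the existing class in /App_Code/XSLTSearch.cs, and it still has to be exposed to XSLT the way the linked blog post describes.

```csharp
using System.Web;

// Hypothetical class name, used only so the sketch compiles on its own.
// It is a normal (non-static) class with a static method, the same shape
// as umbraco.library.
public class HtmlDecodeSketch
{
    // Turns HTML-encoded text such as "&lt;p&gt;Hello&lt;/p&gt;" back into
    // markup ("<p>Hello</p>") so a later StripHtml call can remove the tags.
    public static string decodehtml(string encodedText)
    {
        // Fields with no content may arrive as null or an empty string.
        if (string.IsNullOrEmpty(encodedText))
        {
            return string.Empty;
        }

        // HttpUtility.HtmlDecode lives in System.Web, which an ASP.NET
        // site already references.
        return HttpUtility.HtmlDecode(encodedText);
    }
}
```

Text that contains no entities passes through unchanged, which would be consistent with the non-forum results not being affected when the method is called unconditionally.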
If I wanted to do a conditional statement later on, would I put it in the "displayFieldText" template? Could I use the item parameter to check for doctype "ForumTopic" or "ForumPost"? I don't use XSLT enough to remember its intricacies...
(Can't wait to switch to v5 and Razor when it'll be feasible later on... never been a big fan of this XSLT stuff)