  • Sander Houttekier 114 posts 163 karma points
    Sep 24, 2009 @ 17:42
    Sander Houttekier
    0

    rendering xml from richtext content shows newline even with striphtml

    Hi,

    we have an xml document that is rendered through xslt.

    this is how we create the xml node for description:

           <description><xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text><xsl:value-of disable-output-escaping="yes" select="umbraco.library:StripHtml($currentPage/data [@alias = 'textPicture'])"/><xsl:text disable-output-escaping="yes">]]&gt;</xsl:text></description>

    as you can see, we use the striphtml function in the umbraco library to strip all html tags from the description field.

    yet when this xslt is placed (through a macro) on a template, the page source shows several empty lines where the <br/> tags should have been. i suspect that striphtml replaces <br/> tags with \n (newline) characters.

    i suspect this because running the replaceLineBreaks function on that value gives us a whole lot of extra <br/> tags

     

    does anyone have a solution for removing these? preferably in xslt, but if nothing else is possible we can probably add an xslt extension that replaces them.

    best regards
    Sander

  • Lachlann 344 posts 626 karma points
    Sep 24, 2009 @ 17:56
    Lachlann
    0

    Perhaps you could edit the tinyMCE config file to remove the unwanted HTML tags, rather than doing it in XSLT?

     

    L

  • Sander Houttekier 114 posts 163 karma points
    Sep 24, 2009 @ 18:20
    Sander Houttekier
    0

    no, i don't think so. the data from that rich text box is used in the site, and is perfect there

    we only want to pass that same data through xml to a flash, and that flash needs to get it without html markup

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    Sep 25, 2009 @ 11:25
    Douglas Robar
    0

    Sounds like you should write your own stripHtml() function since you want precise control over the output. You could do this as in-line c# or vb or javascript in your xslt file (javascript example below), or you could write your own xslt extension.

    Here's how I did it in some xslt once. You'd need to modify it but I think you get the idea.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:msxml="urn:schemas-microsoft-com:xslt"
        xmlns:msxsl="urn:schemas-microsoft-com:xslt"
        xmlns:umbraco.library="urn:umbraco.library"
        xmlns:ps="urn:percipientstudios.xsl"
        exclude-result-prefixes="msxml msxsl umbraco.library ps">

    <xsl:param name="currentPage"/>

    <xsl:template match="/">
        <!-- string() makes sure the script receives a plain string rather than a node-set -->
        <xsl:value-of select="ps:stripHTML(string($currentPage/data[@alias = 'bodyText']))"/>
    </xsl:template>

    <msxsl:script language="JavaScript" implements-prefix="ps">
    <![CDATA[
    // strip all HTML tags, as well as all text within OPTION tags
    function stripHTML(oldString) {
        var newString = "";
        var inTag = false;
        var skipContents = false;
        for(var i = 0; i < oldString.length; i++) {
            if(oldString.charAt(i) == '<') {
                inTag = true;
                skipContents = (oldString.substr(i+1, 6).toUpperCase()=='OPTION');
            }
            if(oldString.charAt(i) == '>') {
                inTag = false;
            }else if(!inTag && !skipContents) {
                newString += oldString.charAt(i);
            }
        }
        return newString;
    }
    ]]>
    </msxsl:script>

    </xsl:stylesheet>

     

    You might want to replace BR's with a space rather than just removing them entirely from the xml, so that sentences don't run together without a space between them as might otherwise happen. Same would go for closing DIV and P tags I would think, but you'll want to do a bit of testing to be sure.
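
The idea of turning line-breaking tags into spaces could be sketched roughly as below. This is only an illustration, not the function from the stylesheet above: the name `stripHTMLKeepSpaces` is made up, it uses regular expressions instead of the character loop, and it does not skip the text inside OPTION tags.

```javascript
// Hypothetical variant of stripHTML: line-breaking tags become a single space.
function stripHTMLKeepSpaces(oldString) {
    // Turn <br>, <br/>, </p> and </div> into a space so words do not run together.
    var spaced = String(oldString).replace(/<br\s*\/?>|<\/(p|div)\s*>/gi, ' ');
    // Remove every remaining tag.
    var stripped = spaced.replace(/<[^>]*>/g, '');
    // Collapse runs of whitespace (including \r and \n), then trim both ends.
    return stripped.replace(/\s+/g, ' ').replace(/^ | $/g, '');
}
```

Because the last step also collapses \r and \n, a function along these lines should not leave the empty lines described in the first post, though that would need testing against real rich text content.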

    cheers,
    doug.

  • Josh Townson 67 posts 162 karma points
    Sep 25, 2009 @ 14:05
    Josh Townson
    1

    Umbraco has a library function called replace - this ought to be useful for getting rid of newline characters. I use it to remove \r line breaks like so:

    umbraco.library:Replace($string, '&#xD;', '')

    If you want to get rid of the \n, then the xml entity is &#xA; which would be:

    umbraco.library:Replace($string, '&#xA;', '')

    Not sure if you can do both of them together, but I don't think combining them would be a problem
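
On combining the two: nesting one Replace call inside the other should work, since the inner call just returns a string, and the standard XPath function translate($string, '&#xD;&#xA;', '') drops both characters in one go. If the cleanup lived in an inline script block instead, a one-pass version might look like this sketch (the name `removeLineBreaks` is hypothetical, not part of umbraco.library):

```javascript
// Hypothetical helper: remove carriage returns (\r) and line feeds (\n) in one pass.
function removeLineBreaks(text) {
    return String(text).replace(/[\r\n]/g, '');
}
```

Note that this removes the characters outright, so words separated only by a line break will end up joined together.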

  • Sander Houttekier 114 posts 163 karma points
    Sep 25, 2009 @ 14:52
    Sander Houttekier
    0

    tried josh townson's solution first since it seemed so easy...
    and it seems to do exactly what we need it to do.

     

    so, for a quick fix i will thank josh townson

    i will however take a deeper look into douglas's solution, as i'm not familiar with including other languages inside xslt,
    it looks very interesting

    thank you both for the efforts!

    Sander
