
  • Stefan 117 posts 215 karma points
    Nov 23, 2011 @ 10:17
    Stefan
    0

    Strip certain html tags

    Hi.

    I'm in a situation where I would like to strip certain HTML tags from the output of bodyText with XSLT. More precisely, it's the header tags I don't want to show.

    This is the compromise I could come up with.

                <p class="description">
                  <span class="description">
                    <xsl:value-of select="umbraco.library:StripHtml(umbraco.library:TruncateString(bodyText,250,'...'))" />
                  </span>
                </p>

    An example could be something like this:

    <h2>Header</h2><p>Paragraph text</p>

    Which should be:

    Paragraph text

    By the way, I did find this thread, but I wonder if it can be done in an easier way?
    http://our.umbraco.org/forum/developers/xslt/10272-Remove-attributes-from-html-tags-in-xslt

     

    Thanks in advance!

  • Lee Kelleher 4020 posts 15802 karma points MVP 13x admin c-trib
    Nov 23, 2011 @ 10:52
    Lee Kelleher
    0

    Hi Stefan,

    If you happen to be using uComponents, then you could try the XML XsltExtension method called Parse().  This will take the HTML from your 'bodyText' property and convert it to XML.

    <xsl:variable name="html" select="ucomponents.xml.Parse(bodyText)" />

    Then you can use the variable to select the XML (HTML) nodes that you want...

    <xsl:value-of select="$html/p" />

    Cheers, Lee.

  • Rodion Novoselov 694 posts 859 karma points
    Nov 23, 2011 @ 10:52
    Rodion Novoselov
    0

    Hmmm. My first idea:

    #some-content-wrapper h2 {
        display: none;
    }

    :-)

  • Stefan 117 posts 215 karma points
    Nov 23, 2011 @ 16:15
    Stefan
    0

    Thank you for your replies!

    Lee, I think uComponents is the way to do it. I have installed the package, but I can't get it to work.
    I have registered the extension in xsltExtensions.config as:

    <ext assembly="uComponents.Core" type="uComponents.Core.XsltExtensions.Xml" alias="ucomponents.xml" />

    and added the following prefix attributes to the xsl:stylesheet element:

    xmlns:ucomponents.xml="urn:ucomponents.xml"
    ucomponents.xml

    The error I'm getting when trying to save my xslt file is:

    System.Xml.Xsl.XslLoadException: 'ucomponents.xml.Parse()' is an unknown
    XSLT function. An error occurred at C:\Users\Stefan\Documents\My Web 
    Sites\CD\xslt\634576611715742106_temp.xslt(15,1).
    at System.Xml.Xsl.XslCompiledTransform.LoadInternal(Object stylesheet, XsltSettings settings, XmlResolver stylesheetResolver)
    at umbraco.presentation.webservices.codeEditorSave.SaveXslt(String 
    fileName, String oldName, String fileContents, Boolean ignoreDebugging)

    What am I missing?

    Rodion, I didn't even think of using CSS for that - keeping that in mind will prove useful in other situations, but unfortunately it can't be done in this situation :(

  • Lee Kelleher 4020 posts 15802 karma points MVP 13x admin c-trib
    Nov 23, 2011 @ 16:20
    Lee Kelleher
    0

    Hi Stefan,

    Sorry, it was a typo in my example (I was coding by hand) ... it should be:

    <xsl:variable name="html" select="ucomponents.xml:Parse(bodyText)" />

    (I'd put a period "." instead of a colon ":" - doh!)
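
    For reference, a minimal sketch of a complete stylesheet with the corrected call. It assumes the extension is registered under the alias ucomponents.xml (as in the xsltExtensions.config line quoted above) and that the property is called bodyText; adjust both to your setup.

    ```xslt
    <?xml version="1.0" encoding="utf-8" ?>
    <xsl:stylesheet
        version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:ucomponents.xml="urn:ucomponents.xml"
        exclude-result-prefixes="ucomponents.xml"
    >

        <xsl:output method="xml" omit-xml-declaration="yes" />

        <xsl:param name="currentPage" />

        <xsl:template match="/">
            <!-- prefix:function, with a colon; the prefix is the namespace declared above -->
            <xsl:variable name="html" select="ucomponents.xml:Parse($currentPage/bodyText)" />
            <!-- Dump the parsed nodes so you can see what came back -->
            <xsl:copy-of select="$html" />
        </xsl:template>

    </xsl:stylesheet>
    ```

    The part before the colon must match a namespace prefix declared on the xsl:stylesheet element; with a period the processor reads the whole thing as one unknown function name.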

    Cheers, Lee.

  • Stefan 117 posts 215 karma points
    Nov 23, 2011 @ 16:49
    Stefan
    0

    Well, that's what happens when you (= me) copy-paste without paying attention...!

    I'm getting some strange errors that I can't interpret.

    I have put this textarea right after the beginning of a for-each loop for testing purposes:

    <textarea><xsl:copy-of select="ucomponents.xml:Parse(bodyText)" /></textarea>

    When bodyText only contains a paragraph with text inside (let's say <p>This is a test</p>),
    everything works fine and
    <p>This is a test</p> shows up in the textarea.

    When bodyText contains any other html inside the <p></p> tags, I get an error saying:

    <Exception Type="System.Xml.XmlException">
        <Message>There are multiple root elements. Line 4, position 2.</Message>
        <StackTrace>
            <Frame>System.Xml.XmlTextReaderImpl.Throw(Exception e)</Frame>
            <Frame>System.Xml.XmlTextReaderImpl.Throw(String res, String arg)</Frame>
            <Frame>System.Xml.XmlTextReaderImpl.ParseDocumentContent()</Frame>
            <Frame>System.Xml.XmlTextReaderImpl.Read()</Frame>
            <Frame>System.Xml.XPath.XPathDocument.LoadFromReader(XmlReader reader, XmlSpace space)</Frame>
            <Frame>System.Xml.XPath.XPathDocument..ctor(TextReader textReader)</Frame>
            <Frame>uComponents.Core.XsltExtensions.Xml.Parse(String xml)</Frame>
        </StackTrace>
    </Exception>

    Do you have any clues about what's causing that?

    Thanks again!

  • Stefan 117 posts 215 karma points
    Nov 23, 2011 @ 17:02
    Stefan
    0

    I have just learned that it's because bodyText contains more than one root element.

    Can I overcome this in any way, and still strip all tags other than the paragraphs?

    For example, this will fail on line 4, position 2:

    <bodyText>
    <p>Paragraph text paragraph text paragraph text paragraph text
    paragraph text paragraph text paragraph text...</p>

    <ul class="bullet-rt">
    <li>Test list 1</li>
    <li>Test list 2</li>
    </ul>
    </bodyText>

     

  • Lee Kelleher 4020 posts 15802 karma points MVP 13x admin c-trib
    Nov 23, 2011 @ 17:43
    Lee Kelleher
    0

    Hi Stefan,

    Ah yes, it must be valid XML, so would need a single root tag... try this:

    <textarea><xsl:copy-of select="ucomponents.xml:Parse(concat('&lt;html&gt;', bodyText, '&lt;/html&gt;'))" /></textarea>

    It's a little bit hacky, but I had to encode the angle brackets :-$

    Cheers, Lee.

  • Chriztian Steinmeier 2798 posts 8788 karma points MVP 7x admin c-trib
    Nov 23, 2011 @ 17:50
    Chriztian Steinmeier
    0

    Hi guys,

    I'll just chip in with another gotcha you might run into (sorry Lee, I KNOW I should have submitted bugs long ago for these :-)

    - bodyText may at some point contain the dreaded &nbsp; non-breaking space, and THAT will wreak havoc again...

    I've wrapped up most of this into a nice little include file that I use - it's available as a Gist for now: https://gist.github.com/1171897
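
    One possible way to guard against both problems inline, as a sketch only (the Gist may well do it differently): replace the literal &nbsp; with the numeric character reference &#160;, which the XML parser understands without a DTD, and wrap the result in a single root element before parsing. This assumes umbraco.library:Replace and the ucomponents.xml:Parse extension are both available; the <html> wrapper name is arbitrary, and the $html/html/p path assumes Parse() returns the document node.

    ```xslt
    <?xml version="1.0" encoding="utf-8" ?>
    <xsl:stylesheet
        version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:umbraco.library="urn:umbraco.library"
        xmlns:ucomponents.xml="urn:ucomponents.xml"
        exclude-result-prefixes="umbraco.library ucomponents.xml"
    >

        <xsl:output method="xml" omit-xml-declaration="yes" />

        <xsl:param name="currentPage" />

        <xsl:template match="/">
            <!-- '&amp;nbsp;' is the literal text &nbsp; and is swapped for the numeric reference -->
            <xsl:variable name="safe" select="umbraco.library:Replace($currentPage/bodyText, '&amp;nbsp;', '&amp;#160;')" />
            <!-- Wrap in one root element so the parser does not see multiple roots -->
            <xsl:variable name="html" select="ucomponents.xml:Parse(concat('&lt;html&gt;', $safe, '&lt;/html&gt;'))" />
            <!-- Output only the paragraphs, as plain text inside fresh p elements -->
            <xsl:for-each select="$html/html/p">
                <p><xsl:value-of select="." /></p>
            </xsl:for-each>
        </xsl:template>

    </xsl:stylesheet>
    ```

    Other named entities an editor might emit would need the same treatment, so this only covers the &nbsp; case mentioned here.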

    /Chriztian

  • Lee Kelleher 4020 posts 15802 karma points MVP 13x admin c-trib
    Nov 23, 2011 @ 17:53
    Lee Kelleher
    0

    @Chriztian: With the next (major) version of uComponents (v4.x) I'm planning on using HtmlAgilityPack to parse the HTML - that should handle all the quirks much better!  In the meantime, any bugs, etc ... CodePlex me! (oooo how rude! LOL)

  • Lee Kelleher 4020 posts 15802 karma points MVP 13x admin c-trib
    Nov 23, 2011 @ 17:56
    Lee Kelleher
    0

    @Chriztian: Forgot to say - about your gist snippet ... the "EditorContent" entity is very very clever and cool!

  • Chriztian Steinmeier 2798 posts 8788 karma points MVP 7x admin c-trib
    Nov 23, 2011 @ 18:03
    Chriztian Steinmeier
    1

    Hi Lee,

    Now look - I just went and reported TWO issues in the same day (even same hour :-). "How do you like them apples?"

    Thanks!

    /Chriztian

  • Lee Kelleher 4020 posts 15802 karma points MVP 13x admin c-trib
    Nov 23, 2011 @ 18:45
    Lee Kelleher
    0

    oooh I like apples!

  • Stefan 117 posts 215 karma points
    Nov 23, 2011 @ 19:27
    Stefan
    0

    Thanks again for your replies!

    And Chriztian, you were right, the &nbsp; sure wreaked havoc again!

    I have added the XSLT file, but because of my lack of experience with templates in XSLT, I can't figure out how to include it :/

  • Chriztian Steinmeier 2798 posts 8788 karma points MVP 7x admin c-trib
    Nov 23, 2011 @ 19:42
    Chriztian Steinmeier
    0

    Hi Stefan,

    OK - here's a complete sample that should get you going:

    <?xml version="1.0" encoding="utf-8" ?>
    <xsl:stylesheet
        version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:umb="urn:umbraco.library"
        exclude-result-prefixes="umb"
    >
    
        <xsl:output method="xml" indent="yes" omit-xml-declaration="yes" />
    
        <xsl:param name="currentPage" />
    
        <xsl:template match="/">
            <div class="maincontent">
                <xsl:apply-templates select="$currentPage/bodyText" mode="WYSIWYG" />
            </div>
        </xsl:template>
    
        <!-- Call the cavalry -->
        <xsl:include href="_WYSIWYG.xslt" />
    
    </xsl:stylesheet>

    The crucial line is the <xsl:apply-templates select="$currentPage/bodyText" mode="WYSIWYG" /> one, which tells the processor to use the entry template in the _WYSIWYG.xslt file (because that template also has mode="WYSIWYG" specified).

    From there, you can add templates for specific things, e.g. you wanted to skip the <h2>'s - just add an empty template for them then:

    <xsl:template match="h2" /><!-- Sorry, no room for you... -->
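
    To try the empty-template idea in isolation, here is a small self-contained sketch. The identity template in it is only a stand-in for whatever _WYSIWYG.xslt provides (its contents are not shown in this thread), and it runs against a plain <bodyText> element instead of $currentPage. The match="h2" template wins for h2 elements because a name test has a higher default priority (0) than * (-0.5).

    ```xslt
    <?xml version="1.0" encoding="utf-8" ?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <xsl:output method="xml" omit-xml-declaration="yes" />

        <!-- Start with the content of bodyText, skipping the bodyText element itself -->
        <xsl:template match="/">
            <xsl:apply-templates select="bodyText/* | bodyText/text()" />
        </xsl:template>

        <!-- Stand-in identity template: copy elements (with attributes) and text as they are -->
        <xsl:template match="* | text()">
            <xsl:copy>
                <xsl:copy-of select="@*" />
                <xsl:apply-templates select="* | text()" />
            </xsl:copy>
        </xsl:template>

        <!-- Empty template: h2 elements and everything inside them produce no output -->
        <xsl:template match="h2" />

    </xsl:stylesheet>
    ```

    Run against <bodyText><h2>Header</h2><p>Paragraph text</p></bodyText>, this outputs <p>Paragraph text</p>.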

    /Chriztian

  • Stefan 117 posts 215 karma points
    Nov 23, 2011 @ 23:51
    Stefan
    0

    Thank you yet again :-)

    Unfortunately I'm still left in the dark with a few questions - which I hope you will answer.

    1. What can I do to make the _WYSIWYG.xslt skip every html tag but the paragraphs? Now it will only skip <h2> and include everything else (images, lists etc.).
    2. What about stripping paragraph classes?
    3. How can I apply the template on bodyText in conjunction with umbraco.library:TruncateString?
    4. Before using the xslt include, I tried using uComponents as suggested by Lee.
    Can the two solutions be used together (for example to take care of &nbsp; when using the Parse() function from uComponents) to prevent this from failing?

    <xsl:variable name="html" select="ucomponents.xml:Parse(concat('&lt;html&gt;', bodyText, '&lt;/html&gt;'))" />
    <xsl:value-of select="umbraco.library:TruncateString($html/html/p,500,'...')" />

    Sorry for asking all these questions, but templates and XSLT extensions are pretty new to me. Hopefully these questions will prove useful for others too!

    PS: Learning a lot of useful stuff right now :-)

  • Stefan 117 posts 215 karma points
    Nov 24, 2011 @ 21:13
    Stefan
    0

    Anyone?

  • Chriztian Steinmeier 2798 posts 8788 karma points MVP 7x admin c-trib
    Nov 24, 2011 @ 21:44
    Chriztian Steinmeier
    0

    Hi Stefan,

    Thanks for the nudge :-)

    Here goes:

    1. One way to do this is to replace the Identity Template (match="* | text()") with a new template that basically just bypasses elements and text - then add another one for those elements you *do* want to copy:

    <xsl:template match="*">
        <xsl:apply-templates select="*" />
    </xsl:template>
    
    <xsl:template match="p | strong">
        <xsl:copy>
            <xsl:apply-templates />
        </xsl:copy>
    </xsl:template>
    

    2. Already solved with the above: <xsl:copy> copies the element itself but not its attributes, so the class attributes are dropped.

    3. That's rather tricky - the template with mode="WYSIWYG.excerpt" tries to do a similar thing by selecting only the first paragraph - but it needs tweaking to your particular situation.

    4. The _WYSIWYG.xslt already takes care of those two issues (multiple root elements and the &nbsp; thing) if you're running it via the <xsl:apply-templates ... mode="WYSIWYG" /> line from my previous answer.
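
    The two templates from point 1 can be tried on their own with the sketch below. It is a standalone stylesheet run against a plain <bodyText> element, not the actual _WYSIWYG.xslt, so mode attributes and the Umbraco parameters are left out on purpose.

    ```xslt
    <?xml version="1.0" encoding="utf-8" ?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <xsl:output method="xml" omit-xml-declaration="yes" />

        <!-- Default: do not copy the element, only keep looking at its child elements -->
        <xsl:template match="*">
            <xsl:apply-templates select="*" />
        </xsl:template>

        <!-- Whitelist: copy these elements without attributes, then process their children -->
        <xsl:template match="p | strong">
            <xsl:copy>
                <xsl:apply-templates />
            </xsl:copy>
        </xsl:template>

    </xsl:stylesheet>
    ```

    Run against <bodyText><h2>Header</h2><p class="intro">Paragraph <strong>text</strong></p><ul><li>Item</li></ul></bodyText>, this outputs <p>Paragraph <strong>text</strong></p>. Note that text inside a non-whitelisted element within a paragraph (an <em>, say) is dropped as well, because the default template only follows child elements.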

    Let us know how it goes!

    /Chriztian

  • Ashley Andersen 45 posts 88 karma points
    Sep 25, 2013 @ 20:35
    Ashley Andersen
    0

    I know this is an old topic and I apologize. But I am using this for our client's mobile site due to the design. Everything works except the instances where we have macros in the RTE. These are unavoidable due to the client's design restrictions and desire for control.

    1. Is there a way I can render the RTE content fully before parsing it in my macro?
    2. If not, can I target it to be excluded as well?

    Maybe I do not understand the protocol, but currently all pages except those with macros work fine; the pages with macros throw this error:

    Unexpected end of file while parsing PI has occurred. Line 6, position 613.
    System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
    System.Xml.XmlTextReaderImpl.ParsePIValue(Int32& outStartPos, Int32& outEndPos)
    System.Xml.XmlTextReaderImpl.ParsePI(StringBuilder piInDtdStringBuilder)
    System.Xml.XmlTextReaderImpl.ParseElementContent()
    System.Xml.XPath.XPathDocument.LoadFromReader(XmlReader reader, XmlSpace space)
    System.Xml.XPath.XPathDocument..ctor(TextReader textReader)
    uComponents.XsltExtensions.Xml.ParseXml(String xml, String xpath)
    
  • Chriztian Steinmeier 2798 posts 8788 karma points MVP 7x admin c-trib
    Sep 25, 2013 @ 21:09
    Chriztian Steinmeier
    0

    Hi Ashley,

    I've had the same problem once in a while and I just dug out one of the "solutions" I've been using - basically, I sacrifice the WYSIWYG handling when there's a macro on the page, which of course is a call you can only make when you know your solution well.

    Here goes:

    <!-- Let's make a variable for this -->
    <xsl:variable name="macroStart" select="'&lt;?UMBRACO_MACRO '" />
    
    <!-- Any macros on the page? -->
    <xsl:if test="contains($currentPage/bodyText, $macroStart)">
        <xsl:value-of select="umbraco.library:RenderMacroContent($currentPage/bodyText, $currentPage/@id)" disable-output-escaping="yes" />
    </xsl:if>
    
    <!-- Otherwise, handle WYSIWYG content... -->
    <xsl:apply-templates select="$currentPage/bodyText[normalize-space()][not(contains(., $macroStart))]" mode="WYSIWYG" />
    

    (Yes, I know about the <xsl:choose> construct - I just try not to use it for simple stuff like this A/B case :-)
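
    For comparison, the same A/B decision written with <xsl:choose>, as a sketch. It assumes the _WYSIWYG.xslt include from the Gist sits next to the stylesheet and that the content property is called bodyText.

    ```xslt
    <?xml version="1.0" encoding="utf-8" ?>
    <xsl:stylesheet
        version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:umbraco.library="urn:umbraco.library"
        exclude-result-prefixes="umbraco.library"
    >

        <xsl:output method="xml" omit-xml-declaration="yes" />

        <xsl:param name="currentPage" />

        <!-- Marker that identifies an embedded macro in the stored RTE content -->
        <xsl:variable name="macroStart" select="'&lt;?UMBRACO_MACRO '" />

        <xsl:template match="/">
            <xsl:choose>
                <!-- Macros present: let Umbraco render them and skip the WYSIWYG handling -->
                <xsl:when test="contains($currentPage/bodyText, $macroStart)">
                    <xsl:value-of select="umbraco.library:RenderMacroContent($currentPage/bodyText, $currentPage/@id)" disable-output-escaping="yes" />
                </xsl:when>
                <!-- No macros: parse and filter the content as before -->
                <xsl:otherwise>
                    <xsl:apply-templates select="$currentPage/bodyText[normalize-space()]" mode="WYSIWYG" />
                </xsl:otherwise>
            </xsl:choose>
        </xsl:template>

        <xsl:include href="_WYSIWYG.xslt" />

    </xsl:stylesheet>
    ```

    Both forms behave the same; the choose version just makes the either/or explicit at the cost of a few more lines.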

    Hope it helps,

    /Chriztian

  • Ashley Andersen 45 posts 88 karma points
    Sep 25, 2013 @ 21:14
    Ashley Andersen
    0

    I was afraid of that but it makes sense. Thank you!

  • Chriztian Steinmeier 2798 posts 8788 karma points MVP 7x admin c-trib
    Sep 25, 2013 @ 21:42
    Chriztian Steinmeier
    0

    Come to think of it - it should actually be possible to have the _WYSIWYG.xslt handle this automatically, by detecting the macro(s) and then using RenderMacroContent() first - Hmmmm???!!... (evil laughing ensues :-)

    /Chriztian
