
  • ThomasBrunbjerg 90 posts 182 karma points
    Aug 29, 2017 @ 11:20

    Removing <span> and <p> tags from shortened RTE content

    I need to display roughly the first 200 characters of some content entered in an RTE. So far I have done this with a function that truncates whatever string you pass in to a desired length. It currently removes the <p> tag that appears at the start by letting the substring start at index 3, but sometimes other tags are added as well.

    public static string TruncateAtWord(string input, int length)
    {
        if (input == null || input.Length < length)
            return input;
        int iNextSpace = input.LastIndexOf(" ", length, StringComparison.Ordinal);
        return string.Format("{0}...", input.Substring(3, (iNextSpace > 0) ? iNextSpace : length).Trim());
    }
    

    This works for the leading <p> tag, but it does not account for the other tags the RTE adds, which are included in the string returned by ToString().

    The function itself is called like this:

    @TruncateAtWord(item.GetPropertyValue("sommerhusFuldBeskrivelse").ToString(), 200)
    

    How can I take into account the extra tags that the RTE creates behind the scenes when I shorten my content? Is there a way for the RTE to not emit these tags, since I already have the text wrapped in a <p> tag?

  • Laurence Gillian 600 posts 1219 karma points
    Aug 30, 2017 @ 14:12

    Html Agility Pack can be used for this purpose, see: https://stackoverflow.com/questions/12787449/html-agility-pack-removing-unwanted-tags-without-removing-content

    However, it may be easier, and less likely to break in the middle of the night, to add a separate field for the excerpt that uses the textarea property editor rather than the rich text editor.

  • Paul Griffiths 370 posts 1021 karma points
    Aug 31, 2017 @ 18:44

    Hi Thomas,

    Whenever I need to strip the HTML out of RTE output, I tend to use the following helper method, passing in the value of the RTE property (fetched by its alias on the doc type).

       library.StripHtml()
    

    If I want to truncate the content to a certain number of characters, I use something like this:

    @{
        var contentSnippet = Umbraco.Truncate(library.StripHtml(Model.Content.GetPropertyValue<string>("mainContent")), 120, true);
    }
    

    and then output the truncated snippet without HTML:

    <p>@contentSnippet </p>
    

    Hopefully that is what you were trying to achieve?

    Thanks

    Paul

  • David Armitage 510 posts 2081 karma points
    Sep 01, 2017 @ 04:17

    Hi Guys,

    Here are a few helper methods. I usually add these as string extension methods.

    I haven't used them in a long time so please give them a good test.

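    using System.Text.RegularExpressions;

    // Removes every tag-like substring (anything between < and >, including across newlines).
    public static string StripHTML(string htmlString)
    {
        string pattern = @"<(.|\n)*?>";

        return Regex.Replace(htmlString, pattern, string.Empty);
    }

    // Returns the inner content of the first <p>...</p> pair the pattern matches,
    // or an empty string if there is no match.
    // Note: '.' does not match newlines here, and a <p> tag with attributes will not match.
    public static string HtmlGetFirstParagraph(string htmlString)
    {
        Match m = Regex.Match(htmlString, @"<p>\s*(.+?)\s*</p>");
        if (m.Success)
        {
            return m.Groups[1].Value;
        }
        else
        {
            return string.Empty;
        }
    }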
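
    To tie this back to the original question, here is a minimal sketch that strips the tags first and only then truncates at a word boundary, so there is no need to hard-code Substring(3, ...). The class name ExcerptHelper and the method names StripTags and Excerpt are made up for illustration; they are not part of Umbraco. It is regex-based, so it assumes reasonably tidy RTE markup (for example, no ">" inside attribute values). For anything messier, an HTML parser such as Html Agility Pack, mentioned earlier in the thread, is probably the safer route.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not an Umbraco API.
public static class ExcerptHelper
{
    // Turns RTE markup into plain text: block boundaries become spaces,
    // all other tags are dropped, entities are decoded, whitespace is collapsed.
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
            return string.Empty;

        // Line breaks and closing block tags become a space so adjacent paragraphs do not fuse.
        string text = Regex.Replace(html, @"<br\s*/?>|</(p|div|li|h[1-6])\s*>", " ", RegexOptions.IgnoreCase);

        // Every remaining tag is removed without leaving a gap.
        text = Regex.Replace(text, @"<[^>]*>", string.Empty);

        // Decode entities such as &amp; and &nbsp; before collapsing whitespace.
        text = WebUtility.HtmlDecode(text);
        return Regex.Replace(text, @"\s+", " ").Trim();
    }

    // Strips tags, then cuts at the last space at or before maxLength
    // (assumed to be non-negative) and appends an ellipsis.
    public static string Excerpt(string html, int maxLength)
    {
        string text = StripTags(html);
        if (text.Length <= maxLength)
            return text;

        int lastSpace = text.LastIndexOf(' ', maxLength);
        int cut = lastSpace > 0 ? lastSpace : maxLength;
        return text.Substring(0, cut).TrimEnd() + "...";
    }
}
```

    In a view this could presumably be called as @ExcerptHelper.Excerpt(item.GetPropertyValue<string>("sommerhusFuldBeskrivelse"), 200), assuming the property value comes back as an HTML string. Because the length is measured after the tags are removed, the 200 characters count only visible text.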
    