
  • Jacob Polden 67 posts 177 karma points
    May 03, 2012 @ 16:19

    Special Characters in HTML to XML

    I'm trying to parse an HTML Google Doc into Umbraco by loading it as XML. However, HTML has some special characters that XML doesn't understand, so the conversion throws an exception. (The characters are listed here: http://www.degraeve.com/reference/specialcharacters.php)

    The conversion code is shown below:

     

        string GDocURL = uploadedFile.Content.AbsoluteUri + "&exportFormat=html&format=html";
        logger.Info("The GDocURL is " + GDocURL);

        try
        {
            // Download the Google Doc as HTML.
            Stream uploadedFileStream = service.Query(new Uri(GDocURL));

            // Parse the HTML as XML. No DTD is loaded, so HTML-only
            // entities such as &nbsp; are undeclared at this point.
            XmlDocument doc = new XmlDocument();
            doc.Load(uploadedFileStream);

            logger.Info("The outer xml is " + doc.OuterXml);
            return doc;
        }
        catch (XmlException)
        {
            // doc.Load throws here when it meets an entity XML does not define.
            throw;
        }
  • Simon steed 376 posts 688 karma points
    May 04, 2012 @ 15:05

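    Just use the numeric entity code in its place (for example, &#160; instead of &nbsp;), since XML accepts numeric codes without any declaration. The following site has a fuller reference of them for you: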

    http://www.elizabethcastro.com/html/extras/entities.html

    Simon
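
    As a rough illustration of that suggestion, here is a minimal sketch that swaps named HTML entities for numeric ones before the text reaches XmlDocument. The class and method names (HtmlEntityFixer, ReplaceNamedEntities, LoadHtmlAsXml) are made up for this example, and it assumes the export is UTF-8 and is otherwise well-formed XML (closed tags, quoted attributes); it only deals with the entity problem.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical helper, not part of Umbraco or the Google Data API.
public static class HtmlEntityFixer
{
    // The five entities XML predefines; these must be left alone.
    private static readonly string[] XmlPredefined = { "amp", "lt", "gt", "quot", "apos" };

    // Matches named references such as &nbsp; or &eacute; (not numeric ones like &#160;).
    private static readonly Regex NamedEntity = new Regex(@"&([A-Za-z][A-Za-z0-9]*);");

    public static string ReplaceNamedEntities(string html)
    {
        return NamedEntity.Replace(html, match =>
        {
            string name = match.Groups[1].Value;
            if (Array.IndexOf(XmlPredefined, name) >= 0)
                return match.Value;

            // Let the framework resolve the name to its character(s).
            string decoded = WebUtility.HtmlDecode(match.Value);
            if (decoded == match.Value)
                return match.Value; // unknown name: leave it untouched

            // Emit one numeric reference per Unicode code point.
            var sb = new StringBuilder();
            for (int i = 0; i < decoded.Length; i++)
            {
                int codePoint = char.ConvertToUtf32(decoded, i);
                if (char.IsHighSurrogate(decoded[i])) i++; // skip the low surrogate
                sb.Append("&#")
                  .Append(codePoint.ToString(CultureInfo.InvariantCulture))
                  .Append(';');
            }
            return sb.ToString();
        });
    }

    // Assumption: the stream holds UTF-8 text.
    public static XmlDocument LoadHtmlAsXml(Stream htmlStream)
    {
        string html;
        using (var reader = new StreamReader(htmlStream, Encoding.UTF8))
        {
            html = reader.ReadToEnd();
        }

        var doc = new XmlDocument();
        doc.LoadXml(ReplaceNamedEntities(html));
        return doc;
    }
}
```

    In the code from the first post, this would mean calling HtmlEntityFixer.LoadHtmlAsXml(uploadedFileStream) instead of doc.Load(uploadedFileStream). If the exported HTML is not well-formed XML in other ways, an HTML parser would likely be a better fit than this kind of text substitution.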
