I'm trying to import an HTML Google Doc into Umbraco by loading it as XML. However, the HTML contains some special characters that the XML parser doesn't understand, and it throws an exception during the conversion. (The characters are listed here: http://www.degraeve.com/reference/specialcharacters.php)
The conversion code is shown below:
//XmlReader reader = null;
string GDocURL = uploadedFile.Content.AbsoluteUri + "&exportFormat=html&format=html";
logger.Info("The GDocURL is " + GDocURL);
try
{
    // Download the document exported as HTML.
    Stream uploadedFileStream = service.Query(new Uri(GDocURL));
    // Load the exported HTML into an XmlDocument (this is the step that parses it as XML).
    XmlDocument doc = new XmlDocument();
    doc.Load(uploadedFileStream);
    logger.Info("The outer xml is " + doc.OuterXml);
    return doc;
}
Special Characters in HTML to XML
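Just use the numeric entity code in its place (for example, &#160; instead of &nbsp;), since XML only predefines &amp;, &lt;, &gt;, &quot; and &apos;. The following site has a fuller reference of the codes: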
http://www.elizabethcastro.com/html/extras/entities.html
Simon
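One way to apply the suggestion above is to rewrite the named entities before the text reaches the XML parser. The following is only a minimal sketch: the class name HtmlEntityFixer, the method ToNumericEntities and the small lookup table are hypothetical and not part of Umbraco or the Google API, and the table would need extending with whatever entities actually appear in the exported documents.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml;

public static class HtmlEntityFixer
{
    // Hypothetical, deliberately small lookup table (entity name -> code point).
    // Extend it with the entities that occur in your own documents.
    private static readonly Dictionary<string, int> NamedEntities =
        new Dictionary<string, int>
        {
            { "nbsp", 160 }, { "copy", 169 }, { "eacute", 233 },
            { "ndash", 8211 }, { "mdash", 8212 }, { "rsquo", 8217 }
        };

    // Replaces known HTML named entities with numeric character references.
    // The five entities predefined by XML (amp, lt, gt, quot, apos) are not in
    // the table, so they and any unknown names are left untouched.
    public static string ToNumericEntities(string html)
    {
        return Regex.Replace(html, @"&([A-Za-z][A-Za-z0-9]*);", match =>
        {
            int codePoint;
            return NamedEntities.TryGetValue(match.Groups[1].Value, out codePoint)
                ? "&#" + codePoint + ";"
                : match.Value;
        });
    }

    public static void Main()
    {
        // Sample input made up for illustration.
        string html = "<p>Caf&eacute;&nbsp;&amp; bar</p>";

        var doc = new XmlDocument();
        doc.LoadXml(ToNumericEntities(html)); // parses once the entities are numeric
        Console.WriteLine(doc.DocumentElement.InnerText);
    }
}
```

To use this with the code in the question, the downloaded stream would first have to be read into a string (for example with a StreamReader) and then passed to LoadXml instead of Load. If the exported HTML is not well-formed XML in other ways, such as unclosed tags, this step alone will probably not be enough.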