Google Sitemap Issue
Hello,
I'm going around in circles with this issue. As far as I can tell my output is correct, but Google keeps throwing an incorrect namespace error at me. I've been trying to debug this for the last 3 days and can't work out what is wrong with the namespace!
The output is viewable here...
http://wearearchitects.co.uk/google-sitemap.aspx
The XSLT which produces this is...
[code]
[/code]
The output from http://wearearchitects.co.uk/google-sitemap.aspx doesn't look right to me - especially the xmlns="" part. Google supports sitemaps in the universal Sitemaps.org format - see http://sitemaps.org/protocol.php. This format is used by other search indexes too - just output like this:
[code]
<?xml version="1.0" encoding="UTF-8"?>
[/code]
Don't stick any other extraneous information in the XML. Also make sure you are sending the content type of text/xml in the header. I've created sitemaps in this format before (though not with Umbraco) and they work fine with Google.
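For illustration, here is a minimal sketch of the shape that format takes: a single default namespace on urlset, nothing extra, and a UTF-8 declaration. It is written in Python rather than XSLT purely so it can be run on its own. The function name build_sitemap and the example.com URL are made up for the example; the namespace URI is the one from the Sitemaps.org 0.9 protocol.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the Sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a UTF-8 encoded sitemap (bytes) listing the given URLs.

    Hypothetical helper for illustration only.
    """
    # Register the sitemap namespace as the default one, so it is
    # serialized as xmlns="..." on the root instead of a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        # Every child is created in the same namespace as urlset,
        # so no xmlns="" reset is ever needed.
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc")
        loc.text = url  # ElementTree escapes &, < and > when serializing
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_sitemap(["http://www.example.com/"]).decode("utf-8"))
```

The point to carry over to the XSLT is that urlset, url and loc all live in the same namespace, so the serializer only has to declare it once on the root.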
Ah, you're on to something there. If I remove the xmlns, the random empty xmlns attributes disappear...
So I got thinking: we're declaring the namespace for all children, not specific ones (:ns), so in theory this should work fine... so what's going on? It's because the namespace isn't defined in the XSLT header, so I think it essentially says: well, that's all well and good applying this to all urlset children, however I don't exist. Hence it throws an empty xmlns...
I might be completely wrong! ;)
* NEW CODE *
[code]
[/code]
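One way to see why the empty xmlns="" matters: it takes an element back out of the sitemap namespace, so a namespace-aware consumer no longer sees it as a sitemap url or loc. The sketch below is a quick check for that, assuming Python's standard ElementTree parser; the function name and the two sample documents are invented for the example and are not the real output from the site.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the Sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def elements_outside_namespace(xml_text, namespace=SITEMAP_NS):
    """Return the tags of all elements that are not in the given namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + namespace + "}"
    # ElementTree stores namespaced tags as "{uri}localname".
    return [el.tag for el in root.iter() if not el.tag.startswith(prefix)]


# Hypothetical sample: children reset to "no namespace" via xmlns="".
broken = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    '<url xmlns=""><loc>http://www.example.com/</loc></url>'
    "</urlset>"
)

# Hypothetical sample: children inherit the default namespace from urlset.
fixed = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    print(elements_outside_namespace(broken))  # ['url', 'loc']
    print(elements_outside_namespace(fixed))   # []
```

In the broken sample both url and loc fall outside the namespace, because loc inherits the reset from its parent.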
My two cents:
1. I think you should add the utf-8 information via [code]
[/code]
2. The protocol definition states that all entities have to be escaped. I saw a URL with a single quote. So, on the namespace discussion, it could be that the error message from Google really means: this file isn't valid against the schema for that namespace because of the entity escaping.
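As a rough sketch of the escaping in point 2, assuming the raw URL has not been escaped already: the protocol asks for the ampersand, single quote, double quote, less-than and greater-than characters to be entity-escaped. The function name and the sample URL below are invented for the example.

```python
from xml.sax.saxutils import escape

# escape() already handles &, < and >; the two quote characters
# are supplied as extra replacements.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_sitemap_url(url):
    """Entity-escape a raw (not yet escaped) URL for use inside <loc>."""
    # The ampersand is replaced first, so the entities added for the
    # quotes are not escaped a second time.
    return escape(url, EXTRA_ENTITIES)


if __name__ == "__main__":
    print(escape_sitemap_url("http://www.example.com/o'brien?a=1&b=2"))
    # http://www.example.com/o&apos;brien?a=1&amp;b=2
```

This only covers entity escaping. Running it over a URL that is already escaped would double-escape the ampersands.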
just my thoughts,
Thomas
Very much so, I'd tried to add the
Good point on the entities, I'm working on fixing that now.
Thanks for your perspective, Lau
Firstly, the XML produced by the translation engine/page render method will be UTF-16 and will include xmlns="" if you don't maintain the namespace declarations for all of the child nodes.
Achieving ACTUAL UTF-8 encoding would involve pumping the response through an XmlTextWriter instead of the default page render. Otherwise, using the common method of appending your own declaration, you will have unexpected problems whenever your UTF-16 and UTF-8 encodings differ.
Remember that passing validation doesn't mean you've done it right, it just means that you've passed the rules of the validation.
Edit: I've confirmed you will not have a problem using the utf-8 declaration. This is because, although the HTML encoded bit is utf-16, it wraps around a properly utf-8 encoded chunk of XML from the XSLT translation engine. If you ask me, it's a bit silly, but it will work. I still can't figure out why the encoding is valid utf-8 when the XSLT engine outputs utf-16 in the declaration. Perhaps this is a bug in the translation engine?
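The general rule behind all this is that the encoding named in the declaration has to match the bytes that are actually sent. A small sketch of that check, assuming Python's standard expat-based parser (other parsers may be more or less forgiving); the helper name and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET


def parses_as_declared(data):
    """Return True if the bytes parse under the encoding their declaration names."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample body with one non-ASCII character (e with acute accent).
body = "<urlset><url><loc>http://www.example.com/caf\u00e9</loc></url></urlset>"

# Declaration and bytes agree: both are UTF-8.
good = ('<?xml version="1.0" encoding="UTF-8"?>' + body).encode("utf-8")

# Declaration claims UTF-16, but the bytes on the wire are UTF-8.
mismatched = ('<?xml version="1.0" encoding="utf-16"?>' + body).encode("utf-8")

if __name__ == "__main__":
    print(parses_as_declared(good))
    print(parses_as_declared(mismatched))
```

That would fit what was observed above: if the bytes that reach the client really are UTF-8, a utf-8 declaration is the one that matches them, whatever the engine writes by default.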