How to set the sitemap and robots.txt for multiple domains in one Umbraco installation
Hi,
Does anybody know how to set this properly in umbraco?
Currently, this is how my robots.txt looks:
But I'm getting some crawl errors when I check Webmaster Tools, so if anyone has an idea how to do this it would be greatly appreciated.
What crawl errors?
One of the sitemap files would be dismissed because it can't match the domain, so that's one expected error.
Why don't you generate the robots.txt file dynamically? A normal page that outputs through XSLT or Razor, plus some URL rewriting, should do the trick. Then you can have a clean, relevant robots.txt for each URL you use to access the same site.
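For example, here is a minimal sketch of what such a dynamically generated robots.txt could look like as a Razor macro, assuming you rewrite requests for /robots.txt to the page hosting the macro, and assuming your sitemap files follow a `<firstpartofhostname>_sitemap.xml` naming convention (both the rewrite rule and the naming convention are assumptions here, not something Umbraco provides out of the box):

```razor
@* Hypothetical Razor macro served at /robots.txt via URL rewriting.
   It reads the host the request came in on and emits a sitemap line
   for that domain only, so each domain gets its own relevant robots.txt.
   The "<host>_sitemap.xml" naming pattern is an assumption. *@
@{
    Response.ContentType = "text/plain";
    var host = Request.Url.Host.Replace("www.", "");
    var sitemapName = host.Split('.')[0] + "_sitemap.xml";
}
User-agent: *
Disallow: /umbraco/
Sitemap: http://www.@(host)/@(sitemapName)
```

With this in place, a request to http://www.florahotelsindia.com/robots.txt would list only the florahotelsindia sitemap, and each of your other domains would likewise see only its own.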
The sitemap.xml files themselves don't seem to contain errors:
Target: http://www.florahotelsindia.com/florahotelsindia_sitemap.xml
(Real name: http://www.florahotelsindia.com/florahotelsindia_sitemap.xml
Length: 43134 bytes
Last Modified: Mon, 03 Oct 2011 14:13:41 GMT
Server: Microsoft-IIS/7.5)
docElt: {http://www.sitemaps.org/schemas/sitemap/0.9}urlset
Validation was strict, starting with type [Anonymous]
schemaLocs: http://www.sitemaps.org/schemas/sitemap/0.9 -> http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
The schema(s) used for schema-validation had no errors
No schema-validity problems were found in the target
I'm getting a 500 error.
If I check the URL that is reported, it works fine, so I'm confused by this.
I don't know how to generate that robots.txt file. Or should I just remove the sitemap info from the robots.txt file?
Another thing: I can see that Google is crawling the virtual paths of our site.
I use umbracoUrlName and umbracoUrlAlias to fix the URLs of our site, but I can see that Google is still crawling the "Link to document" URLs.
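Note that umbracoUrlAlias only adds an alternative URL; the original path stays reachable, so crawlers can still index it. One common mitigation (not specific to Umbraco) is to emit a canonical link in each page's head so crawlers consolidate the duplicate URLs. A sketch, assuming a Razor macro placed in your master template's head, and using Umbraco's `umbraco.library.NiceUrl` helper (the exact placement depends on your templates):

```razor
@* Hypothetical head snippet: point crawlers at the friendly URL
   so the internal/virtual path is not treated as a separate page. *@
<link rel="canonical" href="http://@(Request.Url.Host)@umbraco.library.NiceUrl(Model.Id)" />
```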
This did the trick for me... quick and easy.
https://our.umbraco.org/projects/website-utilities/cultiv-dynamicrobots/