Auto redirect /NodeId to /NiceUrl
I found out (or rather, had forgotten) that you can access a page with the URL /NodeId (using folder URLs).
Please tell me there is a way to avoid this, as it presents a huge duplicate content issue.
I've searched but have not found any solution that automatically redirects the node ID URLs to the real URLs.
Hi Michael
In my experience this is not necessarily an issue if no one links to the pages, since the search bots then have no links to follow to them and therefore won't index them.
But if the accident has happened, you should be able to install and use the 301 infocaster redirect package, which you can find here: http://our.umbraco.org/projects/developer-tools/301-url-tracker. Then you can set up the pages that the unwanted page should 301 redirect to instead.
Hope this helps.
/Jan
Hi Jan
It's true that the damage is only done once Google indexes the pages with node IDs, but in my opinion a CMS should never allow two URLs for the same page, no matter what.
Unfortunately, a junior developer was not aware of this and did not use library.niceurl or node.url in his razor code, so we have some sites where pages with node IDs have been indexed by Google.
I know I could redirect the affected pages, but I am looking for an automated solution, also to avoid these things in the future.
And the 301 Url tracker package does not seem to work in this case.
I entered the nodeid as url, but nothing happened.
Hi Michael
Did the package install without any errors? Have you checked that some infocaster tables were created in the database?
/Jan
It won't work with the Url Tracker (thanks for suggesting it Jan ;-) ), because the Response status has to be 404 for the Url Tracker to do something...
The HttpModule only kicks in when umbraco didn't find a page to render. Maybe you can write your own HttpModule?
Hi Jan
Yes and no... The package was already installed, but I know there are some issues with the package in v6, so we've also stopped installing it on new sites.
I could also set up a URLRewrite to the /404 page for all node ID URLs.
But as I said, I'm looking for a more automated solution. Something that could be used on future sites to avoid this problem.
I really hope the Umbraco team will change this behaviour in the core.
@Michael
The new version (2.x) works great in umbraco v4.6 and up, which includes v6.x as well :-)
I agree nodeId URLs are unwanted and should be disabled by default. Have you considered creating an issue ticket?
Well the solution for me was to add this to UrlRewriting.config
<add name="nodeidrewrite" virtualUrl="^~/([0-9]{4})/?(.*)" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="~/404" redirectMode="Permanent" ignoreCase="true" />
Node ID URLs will now return a 404 page; at least that solves the duplicate content issue.
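To see which paths the pattern in that rule actually catches, here is a minimal Python sketch. It is only a regex check, not part of the Umbraco site: it assumes the site runs at the application root (so the leading "~" is dropped), and the sample paths are made up for illustration.

```python
import re

# Pattern copied from the rule above, with "~" dropped on the assumption
# that the site runs at the application root. ignoreCase="true" is
# mirrored with re.IGNORECASE.
NODE_ID_RULE = re.compile(r"^/([0-9]{4})/?(.*)", re.IGNORECASE)


def is_caught(path: str) -> bool:
    """Return True if the rewrite pattern would match this request path."""
    return NODE_ID_RULE.match(path) is not None


# Hypothetical request paths, purely for illustration.
samples = ["/1234", "/1234/", "/12345", "/123", "/2013/news", "/about-us"]
for path in samples:
    print(path, is_caught(path))
```

Two things stand out when running this: a node ID with fewer than four digits (such as /123) is not matched, and an ordinary URL whose first segment starts with four digits (such as /2013/news) is matched as well. Whether either case matters depends on the node IDs and URL names used on the site.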
And I have created an issue ticket.
http://issues.umbraco.org/issue/U4-2499
kipusoep: I can see that you made it even better. I have some upgrading to do now :-)