Umbraco Id Page Crawling/Indexing
Hi there, I'm not sure this is the right place to ask, but I'm hoping someone here can offer some guidance on a crawling issue I'm having with an Umbraco site.
We have a number of URLs that have been indexed using the Umbraco page ID in the URL, i.e. 'https://www.domain.co.uk/1234/', but the actual URL is 'https://www.domain.co.uk/the-is-the-page-url/'.
To make things worse, the Umbraco ID version is the one that's actually indexed.
Any thoughts on this would be greatly appreciated. If I have missed anything important that would help in understanding the issue, please let me know.
Many thanks, David
Hi David
As far as I know, URLs can only be indexed if there is a link to them on the site.
So you probably have links somewhere on the site whose href attribute uses the Umbraco page ID instead of the page URL. Can you check which pages have this problem? I think we can fix it.
And it's not a problem with Umbraco; it looks like a developer or editor made a mistake.
Thanks
Alex
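One way to run the check Alex describes is to scan each page's HTML for links whose path is only a number. The following is a minimal Python sketch, not an Umbraco feature: the function name `find_id_links`, the sample markup, and the assumption that an ID-style URL has a digits-only path (such as `/1234/`) are all illustrative.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: an ID-style URL has a path made only of digits, e.g. /1234/ or /1234
ID_PATH = re.compile(r"^/\d+/?$")


class IdLinkFinder(HTMLParser):
    """Collects href values from <a> tags whose path looks like a node ID."""

    def __init__(self):
        super().__init__()
        self.id_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Compare only the path, so absolute and relative links both work
            if name == "href" and value and ID_PATH.match(urlparse(value).path):
                self.id_links.append(value)


def find_id_links(html):
    """Return every link in the given HTML that points at a digits-only path."""
    finder = IdLinkFinder()
    finder.feed(html)
    return finder.id_links


# Hypothetical markup for illustration only
sample = """
<a href="/1234/">By ID</a>
<a href="https://www.domain.co.uk/the-is-the-page-url/">By name</a>
<a href="https://www.domain.co.uk/5678">Also by ID</a>
"""
print(find_id_links(sample))
```

Running this over the rendered HTML of each page would show which pages contain ID-based links, which in turn points to the template or content that renders them.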
Hi David
Did you solve the issue? Please share the outcome with us.
Alex
Hi Alex,
I have forwarded your thoughts to my developer and will update in due course.
Many thanks, David
David,
What is doing the crawling?
Regards
Ismail
Hi Ismail,
Google is crawling some of these URLs and has indexed them.
Thanks, David
David,
Do you have an XML sitemap and a Google Webmaster Tools account? If not, create an XML sitemap, submit it to Google, and use that to index the site. This way you can control what Google indexes.
Regards
Ismail
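For reference, an XML sitemap of the kind Ismail mentions is a `urlset` document in the sitemaps.org namespace with one `loc` entry per page. Below is a minimal Python sketch that builds one from a list of friendly URLs; the helper name `build_sitemap` and the URL list are assumptions for illustration, and a real Umbraco site would more likely generate this from its content tree.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing only the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Each page gets a <url> element holding a single <loc>
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page list: only the friendly URLs, never the ID-based ones
print(build_sitemap(["https://www.domain.co.uk/the-is-the-page-url/"]))
```

The point of the sketch is that only the URLs you want indexed should appear in the file you submit.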
Hi David
I think this problem is easy to track down: just find which pages contain these links and which part of the code renders them.
Alex
Problem solved.
There were some additional sitemaps that listed these pages, which were not part of the main sitemap. I'm not sure why a previous developer set it up this way, or why he didn't check his work.
I have also been advised that the Umbraco ID URLs are no longer accessible.
Thanks for all your guidance on this, Alex and Ismail.
Regards, David
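Since the cause turned out to be extra sitemaps listing the ID-based URLs, a quick audit of every sitemap file can catch this kind of entry. The following is a rough Python sketch under stated assumptions: the function name `id_urls_in_sitemap` and the sample sitemap are hypothetical, and an ID-style URL is assumed to have a digits-only path.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace prefix mapping for the sitemaps.org protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumption: an ID-style URL has a path made only of digits, e.g. /1234/
ID_PATH = re.compile(r"^/\d+/?$")


def id_urls_in_sitemap(xml_text):
    """Return the <loc> URLs in a sitemap whose path is only a number."""
    root = ET.fromstring(xml_text)
    locs = (loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text)
    return [url for url in locs if ID_PATH.match(urlparse(url).path)]


# Hypothetical sitemap content for illustration only
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.domain.co.uk/1234/</loc></url>
  <url><loc>https://www.domain.co.uk/the-is-the-page-url/</loc></url>
</urlset>"""
print(id_urls_in_sitemap(sample_sitemap))
```

Any URL this returns is a candidate for removal from the sitemap before it is resubmitted.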
You are welcome, David.
Glad that we helped!
Alex