Robots.txt with custom path
Using SeoChecker 1.10 for umbraco 7
I have a specific need. I have a custom route registered that maps /mypage/uniqueid to an Umbraco page, /myumbracopage.
If someone accesses /myumbracopage, a normal page is displayed; if /mypage/uniqueid is accessed, the same page is displayed but loaded with additional info. The problem is that these pages are being crawled, and I want to add a Disallow rule for /mypage to robots.txt.
Is there any way I can add custom rules to the generated robots.txt?
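For reference, the rule I'm after would be the standard robots.txt syntax, something like this (the path is my route prefix from above):

```
User-agent: *
Disallow: /mypage/
```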
I also tried adding my own robots.txt in the root folder of the site (where web.config is stored) to see if I could provide a manual file. I read in the docs that this was a possibility, but SeoChecker seems to generate its own anyway. Is this a feature available after 1.10? Or should the file go in some other folder?
You shouldn't put that in robots.txt. By putting it in robots.txt, you are basically telling everyone who didn't know about that secret that there is something here they shouldn't see.
A robots.txt is essentially a public list of all the things you don't want people to look at. But by listing things there, you are also telling people that those things exist - and shady people will take advantage of that.
Instead, you should add a robots meta tag to the specific page, telling robots not to index it.
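For example, something like this in the page's head section (standard robots meta tag syntax):

```
<meta name="robots" content="noindex, nofollow">
```

Unlike a robots.txt entry, this only reaches crawlers that actually fetch the page, so it doesn't advertise the URL to anyone browsing your robots.txt.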
Good point. First of all, I did get it to load a custom robots.txt (I just needed to trigger a recompile by saving something, I think).
The thing is, in this case it's not much of a secret. It's a personal webshop that requires a passcode. All I want is for Google not to index it and surface lots (up to 60k) of unique pages.
The robots meta tags may be the way to go anyhow, so great tip, thanks!
You can also put the nofollow attribute on all your links to the page; that should save you some of your crawl budget.
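For example (standard rel attribute on the anchor; the href here is just the route pattern from the original question):

```
<a href="/mypage/uniqueid" rel="nofollow">My page</a>
```

Note that nofollow is only a hint to crawlers not to follow the link, so it complements the noindex meta tag rather than replacing it.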