  • Josh Olson 79 posts 207 karma points
    Jan 28, 2015 @ 12:00

    Keep Google (and other search engines) out of my /media/

    I am trying to figure out a way to keep Google and other search engines out of my /media/ section. The scenario I am looking at is as follows:

    I have lots of editors using the backend and adding lots of content that needs to be publicly available (images, documents, etc).

    I also have lots of members using the frontend and uploading photos and other documents that need to be not public.

    All of this stuff ends up in the /media/ folder mixed together, because that is just the way that Umbraco do (à la Ze Frank - True Facts). The things uploaded by members are custom properties of the upload type, and no public pages link to those uploaded images or documents (there are pages, but they are protected through role-based protection), so a crawl of the site will not turn up those links. Additionally, I have added the restriction to my robots.txt file, but not all search engines respect that file.

    This is where I run out of ideas. I would like to make sure that a.) no search engines index the things that are in the /media/ folder and b.) the media uploaded through the frontend to custom member properties is not accessible to the outside world (except via the pages set up with the role-based protection).
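
    For goal a.), one option I have seen mentioned alongside robots.txt is sending an X-Robots-Tag response header on everything served from /media/. Below is a minimal, language-neutral sketch of that rule in Python, not Umbraco code; the function name robots_headers and the case-insensitive match are my own assumptions. Like robots.txt, the header only helps with crawlers that choose to honor it.

```python
def robots_headers(path: str, prefix: str = "/media/") -> dict:
    """Return extra response headers for a request path (hypothetical helper)."""
    # Assumption: paths are matched case-insensitively, as is common on
    # Windows/IIS-hosted sites.
    if path.lower().startswith(prefix.lower()):
        # Ask compliant crawlers not to index the file or follow links in it.
        return {"X-Robots-Tag": "noindex, nofollow"}
    # Anything outside the prefix gets no extra headers.
    return {}
```

    As far as I understand it, a crawler has to be allowed to fetch a URL before it can see this header, so a robots.txt Disallow on the same path can stop the header from ever being read.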

    I am pretty sure that things are not being indexed, but I have no way of preventing someone who just happens to know the full URL (including the filename) from accessing the saved media. I have looked at Media Protect for Umbraco and, while it is awesome, I don't think it will work. From what I can tell, it can only protect folders/nodes in the media section that are available via the Umbraco backend. I basically need the reverse, where everything is restricted by default except what is in the backend media section.
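
    To make the "restricted by default" idea concrete, here is a rough sketch of the decision I have in mind, written in Python purely for illustration. Everything in it is hypothetical: may_serve, the public_media set (assumed to be built from the paths of items in the backend media section) and the role names are not part of Umbraco or Media Protect.

```python
from typing import Iterable, Set


def may_serve(path: str,
              public_media: Set[str],
              member_roles: Iterable[str],
              allowed_roles: Set[str]) -> bool:
    """Default-deny check for files under /media/ (hypothetical sketch)."""
    if not path.startswith("/media/"):
        # Not a media request; this rule does not apply.
        return True
    if path in public_media:
        # Explicitly allow-listed, e.g. uploaded by an editor in the backend.
        return True
    # Everything else under /media/ needs a member with an allowed role.
    return any(role in allowed_roles for role in member_roles)
```

    With this shape, a member upload that nobody has allow-listed stays private even if someone guesses the full URL, because an anonymous request carries no roles.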

    Thanks in advance for any suggestions!

    Cheers, Josh

  • David Armitage 505 posts 2073 karma points
    Jan 24, 2020 @ 03:49

    Hi,

    Did you find a solution for this?

    Also, how well have you found robots.txt to work? I have the same problem: thousands of PDF files that I don't want indexed by Google.

    Did you find this works in most cases?

        User-agent: *
        Disallow: /media/
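
    For what it's worth, here is a small sketch that checks how a rule-following crawler would read those two lines, using Python's standard urllib.robotparser module. The helper name is_allowed and the sample paths are made up for illustration. It only models crawlers that obey robots.txt and says nothing about ones that ignore it.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted above, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /media/
"""


def is_allowed(url_path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if a compliant crawler may fetch url_path (hypothetical helper)."""
    parser = RobotFileParser()
    # Parse the rules from memory instead of fetching them over HTTP.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url_path)


print(is_allowed("/media/1234/report.pdf"))
print(is_allowed("/about/"))
```

    My understanding is that Disallow stops compliant crawlers from fetching the files, but it does not by itself remove URLs that are already indexed or that are linked from other sites.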
