Finding a PDF
Hi
Is there a way to search by node number in the Umbraco back office, so that I can search for 6524?
media/6524/filename.pdf
The reason I ask is that Google has indexed a file but the search result returns a 404 in Umbraco.
When I search on the filename it is not appearing. It is not in the recycle bin either.
Thanks Graham
Hi Graham
Confusingly, the ID in the media URL isn't the node ID of the media item. If it were, you could just open any media item in the back office, change the ID in the query string to that number, and load the errant item.
So how do you find the media item in Umbraco?
If you use the search in the back office, only the 'name' field in Umbraco is searched. So if you know what the name was for the file, search for that; if the file was dragged in to upload, the name may be the filename.
If not, you can search the raw Examine indexes. In the Developer section, choose the Examine tab, then locate and expand the index searcher. You can then search here for the filename, and if it finds a match, the first column of the search result will contain the Umbraco ID of the media item, which you can substitute into the query string to load the item in the back office.
(There is a PR in to allow searching by media filename in the back office, but that won't help you now.)
If you can't find the item in the internal index, later versions of Umbraco 7 have a database table called 'umbracoMedia' that stores the path, which you could search in.
If the media item has been deleted from the media section, it will be in the recycle bin, but the file will still be accessible to the outside world via the direct URL. So you'd need to delete the item from the recycle bin to remove it from disk.
....happy hunting!
Regards
Marc
Hi
Thanks so much! I'll have a look into all of these.
Just to follow up on one thing you are saying: I do find a correlation for some other PDF media files, where the node number in the back end is the same as the number in the published URL.
Is there a way that I can create a redirect so that the offending "missing pdf" that Google is finding could just redirect to where I know it is?
I have a certain level of access to Umbraco (I have the Developer tab) but might need to go to the digital agency we work with.
Thanks again Graham
Hi Graham
The 'number' in the URL is calculated from the folders that already exist in the media folder on disk. What Umbraco (in V7) does is scan the folders on disk to find the highest-numbered folder, and the next media upload then gets that highest folder number + 1 in its URL.
Whereas the unique Umbraco node ID of an item is allocated across content, media, members etc., so it would not be sequential. (I know it's not intuitive! I think it's to ensure uniqueness and to avoid clashes when saving files to disk; in V8 the numbers have been replaced by GUIDs to ensure uniqueness.) But it's a pain that there is no correlation!
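To illustrate the folder numbering described above, here is a minimal Python sketch. It is not Umbraco's actual code (which is C#); the function name, the media_root parameter, and the choice to return 1 when no numbered folders exist are all assumptions made for the example.

```python
import os


def next_media_folder_number(media_root):
    """Return the highest numeric folder name under media_root, plus one.

    Hypothetical helper for illustration only. Folders whose names are not
    plain digits, and anything that is not a directory, are ignored.
    Returning 1 for a media folder with no numbered folders is an
    assumption of this sketch, not documented Umbraco behaviour.
    """
    numbered = [
        int(entry.name)
        for entry in os.scandir(media_root)
        if entry.is_dir() and entry.name.isascii() and entry.name.isdigit()
    ]
    return max(numbered, default=0) + 1
```

So if the media folder on disk already contains folders 6523 and 6524, the next upload would go into 6525, whatever node ID the new media item is given in the database.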
With regard to a redirect: if the file exists on disk, the direct request for the file is a static request and will be served by the web server before any redirect package or Umbraco logic can redirect it. So if you can't track down the file or remove it from disk, you could set up an IIS redirect rule in the web.config to handle the redirect and prevent the file being served. If you can delete the file on disk, then an existing redirects package such as Skybrud Redirects could redirect to the new media URL, or you could write a custom IContentFinder, or you might already have something set up on your site to handle redirects like this.
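For the IIS option, a rule along these lines could go in the web.config. This is only a sketch: it assumes the IIS URL Rewrite module is installed on the server, and the rule name and the target URL are placeholders made up for the example (the real new location of the PDF would need to be filled in).

```xml
<!-- Goes inside the existing <system.webServer> element of web.config. -->
<rewrite>
  <rules>
    <!-- Rule name is arbitrary; chosen here for illustration. -->
    <rule name="Redirect old PDF" stopProcessing="true">
      <!-- The match pattern is tested against the path without the leading slash. -->
      <match url="^media/6524/filename\.pdf$" />
      <!-- Placeholder target: replace NEW-FOLDER with the folder the file actually lives in. -->
      <action type="Redirect" url="/media/NEW-FOLDER/filename.pdf" redirectType="Permanent" />
    </rule>
  </rules>
</rewrite>
```

If the site already has a rewrite section, only the rule element needs adding to the existing rules list. It may well be one for the agency to put in, as it means editing the web.config on the server.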
Finally, you can ask Google via Webmaster Tools to forget the indexed URL, if it contains sensitive information and you need it gone ASAP! https://support.google.com/webmasters/answer/1663419?hl=en
regards
Marc
Thanks so much for the great advice Marc!