Media import queries: what's the MediaService like these days?
I'm in the middle of doing a huge rebuild from an old 4.5 site to v7.3 (woot!). Last step is getting all the old media ported over.
I've got a list of media that I know the content on the new site is actually linking to (which means it's actually live media, so it allows me to prune off some of the thousands of files they appear not to need anymore).
Next step is importing it all in. I'm OK with doing this but wanted to see if anything has moved on lately with the new MediaService. I've got the source from Simon's Import Media package and ripped the guts out of that, but most of it is aimed at v4 stuff and APIs. Before I crack on with that I wondered if the newer v7 APIs can handle stuff like making thumbnails, setting width, height etc. for me, or do I still need to write all that logic myself (or steal it from Import Media at least, then beat it into shape).
I'd be rather amazed if I can't just say MediaServer.SaveAs( MediaTypes.Image, blah, blah, blah) and have it just handle all that crud for me, but looking at the docs I suspect it won't :(
Well, I had to just get stuck into this one. If anyone follows in my wake, here is what I did. In short I used lots of intermediate XML files to make sure I was doing it "right", which also allowed me to quickly re-run a single step rather than having to re-run one "mega job", for speedy hacking. Additionally I used lots of quick and dirty Hashtables to act as very fast lookups for just about everything.
I used Xenu link checker to spider the dev site and index everything. It produces a report that lists everything broken and spits this out into a HUGE HTML page. I hacked this around in a text editor to get all the pages missing media into an XML format. That gave me something to work with.
Then I used that to spit out another XML doc with all the unique media references (media can of course be referenced on multiple pages).
Then I simply used a WebClient request to grab each media item from the live site and save it to disk. I chose not to care what folder each file was in and saved them all into the same folder. I did do a cheeky Replace of "\" with "~" (tilde) so I could recover the folder structure later if needed, but mainly so I knew I would get a unique file name and not overwrite anything.
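For anyone wanting to copy the flattening trick, here is a minimal sketch of it. The class and method names (MediaFileNames, Flatten, Restore) are made up for illustration, and it assumes that none of the original file or folder names contain a "~" already; it also treats both "/" and "\" as separators, since the paths may come from URLs or from disk.

```csharp
using System;

public static class MediaFileNames
{
    // Turns a site-relative media path into a single flat file name by
    // swapping every path separator for '~'. Assumes '~' does not occur
    // in any of the original names.
    public static string Flatten(string mediaPath)
    {
        return mediaPath
            .TrimStart('/', '\\')
            .Replace('/', '~')
            .Replace('\\', '~');
    }

    // Reverses Flatten, giving back a '/'-separated, site-relative path.
    public static string Restore(string flatName)
    {
        return "/" + flatName.Replace('~', '/');
    }
}
```

Because the whole folder path ends up in the name, two files called logo.png in different media folders cannot collide in the single download folder.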
Then I took the guts of Import Media (https://importmedia4umbraco.codeplex.com/) and used bits of it to loop over the XML file of unique media and import it into Umbraco. Magically I found that v7 now makes the thumbnails for you so no need to mess with that bit of it (great to see). For ease, media could only be one of two types, an Image or a File. I created two folders on the dev site via the Umbraco backoffice, made a note of their IDs and hard-coded these into my code: an image went in the "Misc Images" folder and a file (pdf, doc, docx, etc.) went into the "Misc Docs" folder.
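The Image-or-File decision can be sketched roughly like this. The folder ids (1101 and 1102) and the list of image extensions are placeholders, not values from the actual import; "Image" and "File" are the default Umbraco media type aliases.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MediaClassifier
{
    // Hypothetical ids of the two backoffice folders; use your own.
    public const int MiscImagesFolderId = 1101;
    public const int MiscDocsFolderId = 1102;

    // Assumed set of extensions to treat as images; compared case-insensitively.
    private static readonly HashSet<string> ImageExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".jpg", ".jpeg", ".png", ".gif", ".bmp"
        };

    public static bool IsImage(string fileName)
    {
        return ImageExtensions.Contains(Path.GetExtension(fileName));
    }

    // Media type alias to create the item with.
    public static string MediaTypeAlias(string fileName)
    {
        return IsImage(fileName) ? "Image" : "File";
    }

    // Parent folder the item should be created under.
    public static int ParentFolderId(string fileName)
    {
        return IsImage(fileName) ? MiscImagesFolderId : MiscDocsFolderId;
    }
}
```

Anything that is not a recognised image extension falls through to "File", which matches the two-types-only simplification above.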
After each import I made a note of the new Id for that media item and added it to the XML doc of unique media, which meant I now had a lookup of old id to new id for the next step.
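A rough sketch of that old-id-to-new-id bookkeeping, using LINQ to XML. The XML shape (a root element containing media elements with oldId and newId attributes) is an assumption, as the real file format was not posted, and it uses a generic Dictionary where the original used Hashtables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MediaIdMap
{
    // Stamps the new Umbraco id onto the <media oldId="..."> element
    // for the item that has just been imported.
    public static void RecordNewId(XDocument doc, int oldId, int newId)
    {
        XElement item = doc.Root.Elements("media")
            .First(e => (int)e.Attribute("oldId") == oldId);
        item.SetAttributeValue("newId", newId);
    }

    // Builds the old id -> new id lookup from every element that has
    // already been imported (i.e. already carries a newId attribute).
    public static Dictionary<int, int> BuildLookup(XDocument doc)
    {
        return doc.Root.Elements("media")
            .Where(e => e.Attribute("newId") != null)
            .ToDictionary(
                e => (int)e.Attribute("oldId"),
                e => (int)e.Attribute("newId"));
    }
}
```

Saving the document after each RecordNewId call means an interrupted import can be resumed from the items that still have no newId.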
Next part is linking all the new media to the content that references it; that's part two, which I'm on with now.
First off - glad that my old project was of some use to you! Not sure my response is now of use to you but will share in case...
I'm also interested in answers to this thread as I recently inherited a site with a HUGE amount of oversized media (many image files are up to 30MB and at crazy resolutions) and there is a total of 80,000 files in the media directory!
I've had to write a script to scale the images down and optimise them in batches starting from the largest but have had to set the new umbracoBytes, umbracoWidth & umbracoHeight properties myself. I'm also not sure that the process updates the thumbnail but note your comment...
"Magically I found that v7 now makes the thumbnails for you"
Interesting - I wonder if that is just when you create new media, rather than when updating existing media.
One thing I did initially find confusing was the inability to get at the media item URL when working with IMedia, so I stopped using it and reverted to IPublishedContent.
Similar to you I built a way that allowed me to work in small batches to test the results before working through the media in larger batches. I made a quick plug-in that produced a report and an API Controller that allowed me to process them one at a time via the report interface or blast through larger batches of images. It basically copied each image to a backup file and reverted it in the case of an error or removed it in the case of success. I'm still working through the images!
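The backup-then-revert step described above might look something like the sketch below. It is not the actual plug-in code: the class name, the ".bak" suffix and the Action<string> callback standing in for the resize/optimise work are all assumptions.

```csharp
using System;
using System.IO;

public static class SafeFileProcessor
{
    // Copies the file to a backup, runs the processing step, then either
    // removes the backup (success) or restores the original from it (error).
    // Returns true when processing succeeded.
    public static bool ProcessWithBackup(string path, Action<string> process)
    {
        string backupPath = path + ".bak";
        File.Copy(path, backupPath, true);
        try
        {
            process(path);
            File.Delete(backupPath);
            return true;
        }
        catch (Exception)
        {
            // Put the untouched original back and clean up the backup.
            File.Copy(backupPath, path, true);
            File.Delete(backupPath);
            return false;
        }
    }
}
```

Returning a bool rather than rethrowing lets a batch run carry on past a bad file and report the failures at the end.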
P.S. As part of your upgrade are you starting from a fresh v7 site and importing content etc from the v4 site?
Maybe a little late but you can do it as follows as well (will automatically fill the properties too):
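var mediaService = Services.MediaService;
// Figure out a way to get the path and the file name
var path = "D:\\test.png";
var fileName = Path.GetFileName(path); // needs using System.IO;
// -1 is the root of the media section; "Image" is the media type alias
var mediaImage = mediaService.CreateMedia(fileName, -1, "Image");
// Hand the file to Umbraco as a stream, then save once the upload value is set
using (Stream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    mediaImage.SetValue("umbracoFile", fileName, stream);
}
mediaService.Save(mediaImage);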
Make sure to add using Umbraco.Core.Models; as well to get to the SetValue extension that accepts a Stream.
Hey Simon,
I'm using a clean v7 build and importing from v4. A lot of your stuff is using the v4 API but it all still works; I've updated a few bits here and there.
Regarding the thumbnails I have to confess that I "assumed" that it creates thumbnails simply because when I view the media I've created (and set the umbracoFile prop to point to the full size image) in the backoffice media section the image appears as a thumbnail. But that is not actually the case, there is no _thumb.jpg created for each one.
Do we actually "need" the thumbnails in v7? If so I'm going to have to run through them all again and create them I guess :(
Ideas, tips and insights welcome.
Ta!
Pete
What do you need for the generation of the thumbnails?
Would GetCropUrl be any good, so you only have to change your views?
For creating, I'm sure you've seen it in the docs, but the CreateMedia method is what I've recently used.
The work I am doing is actually in a v6 site but the thumbnails already exist so not overly bothered but handy to know anyway.
The _thumb files were mostly just a convenience thing, definitely don't need them for anything any more. Hurrah for legacy!
Ohh that's good to know. I hoped so as, like I said, the backend seemed to render just fine without them. In that case I will skip them for now and concentrate on finishing this import off (ahhh the joys of managing as well as coding...everything takes 5 times as long!)