  • Jason Espin 368 posts 1335 karma points
    Jan 25, 2016 @ 15:30
    Jason Espin
    0

    Querying Umbraco for a specific Media node - Best Practice

    Hi all,

    Just a best practice question really.

    I have an external web service that uses the content and media services in Umbraco to create page content and associated media items.

    The process runs nightly but can also run on-demand with a user triggering it in Umbraco using a dashboard.

    With regards to media, each page created by the process has an associated media folder in the media section. If the media folder already exists in the tree, it should be used to store child elements. If one does not exist, a new media folder should be created and child elements stored beneath it.

    The process I have works, but I am looking to improve it and make it faster and more robust, as I wrote it two years ago when I first started developing in Umbraco.

    Essentially I am doing the following:

    1. Run an Examine query against a custom index. This quickly pulls the id of the media folder, which I then pass to the media service to get an IMedia object to work with.

    2. If the above fails, I use the Umbraco typed media helper, in conjunction with LINQ, to search for the specific folder I am looking for.

    3. As a final stage, if all else has failed, I use the media service along with LINQ to look for the specific media node in the database directly.
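    The three-stage fallback described above can be sketched as a chain of lookups tried in order. This is only an illustration, not the original implementation: FallbackLookup, FindFolderId and the Func-based sources are hypothetical names standing in for the Examine query, the typed media helper and the media service, and treating an exception as a miss is a simplifying assumption.

```csharp
using System;
using System.Collections.Generic;

public static class FallbackLookup
{
    // Tries each source in order and returns the first id found.
    // Each source is a hypothetical stand-in for one lookup stage
    // (index, cache, database); none of them is a real Umbraco API.
    public static int? FindFolderId(string folderCode, IList<Func<string, int?>> sources)
    {
        foreach (var source in sources)
        {
            try
            {
                var id = source(folderCode);
                if (id.HasValue)
                {
                    return id;
                }
            }
            catch (Exception)
            {
                // Assumption: a failing stage is treated like a miss,
                // so the next stage gets a chance to answer.
            }
        }

        // No stage knew the folder; the caller would create it here.
        return null;
    }
}

public static class FallbackProgram
{
    public static void Main()
    {
        var sources = new List<Func<string, int?>>
        {
            code => null,                                        // stage 1: index miss
            code => { throw new InvalidOperationException(); },  // stage 2: fails
            code => code == "hotel42" ? 1234 : (int?)null        // stage 3: found
        };

        Console.WriteLine(FallbackLookup.FindFolderId("hotel42", sources));
    }
}
```

    Ordering the sources from cheapest to most expensive keeps the slower stages out of the common path.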

    In theory, this should provide the fastest results, as a Lucene index should be fairly rapid and, failing that, the information would come from the XML cache. Ultimately I want to cut down on calls made to the database. In practice, however, I am finding that using the media service directly is actually faster than using the XML cache.

    Example 1: Query using the Umbraco helper

    Stopwatch xmlStopwatch = new Stopwatch();
    xmlStopwatch.Start();
    IPublishedContent xmlMediaObject = Umbraco.TypedMediaAtRoot().DescendantsOrSelf(umbraco_node.ContentType.Alias.ToLower() + "Images").Where(m => m.GetPropertyValue<string>("folderCode") == umbraco_node.ContentType.Alias.ToLower() + itemCode).FirstOrDefault();
    xmlStopwatch.Stop();
    TimeSpan xmlTimespan = xmlStopwatch.Elapsed;
    

    Result : xmlTimespan = 00:01:10:003


    Example 2: Query using the Media Service

    Stopwatch databaseStopwatch = new Stopwatch();
    databaseStopwatch.Start();
    
    IMedia rootMedia = ms.GetRootMedia().Where(m => m.ContentType.Alias == umbraco_node.ContentType.Alias.ToLower() + "Images").FirstOrDefault();
    if (rootMedia != null)
    {
       databaseMediaObject = rootMedia.Children().Where(c => c.GetValue<string>("folderCode") == umbraco_node.ContentType.Alias.ToLower() + itemCode).FirstOrDefault();
    }
    
    databaseStopwatch.Stop();
    TimeSpan databaseTimespan = databaseStopwatch.Elapsed;
    

    Result : databaseTimespan = 00:00:00:311


    This is running on localhost on my development machine but as you can see, accessing the database directly is much faster than accessing the XML cache.

    I am aware that for this particular project there are a lot of media folders (approaching 150 as we speak) but generally I am looking to see what other people would do in this situation.

    Remember this is just a fallback as I am initially searching using Examine but it would be great to get the view of other developers and their combined experience.

    Thanks in advance.

  • Shannon Deminick 1526 posts 5272 karma points MVP 3x
    Jan 25, 2016 @ 16:31
    Shannon Deminick
    2

    Hi,

    Here are a few notes/tips:

    • There is no XML media cache
    • The media cache lives entirely in Lucene/Examine
    • So long as your indexes are in sync (which they should always be), you should always use TypedMedia. This uses the media cache, which is Lucene

    Your queries above are not comparable. First thing to note, this code:

     Umbraco.TypedMediaAtRoot().DescendantsOrSelf(umbraco_node.ContentType.Alias.ToLower() + "Images").Where(m => m.GetPropertyValue<string>("folderCode") == umbraco_node.ContentType.Alias.ToLower() + itemCode).FirstOrDefault();
    

    Is the most inefficient way you could query media. This is essentially going to look up/retrieve every single media item in your media library, load it into memory and then run a LINQ statement on it.

    Your db query doesn't do this. It looks up all root items and then looks at the children of the matching root. It does not iterate deeper than that, so it is not loading every media item in your library.

    If you want to be efficient, put the media that you are looking for (i.e. things that have 'folderCode' equal to something) in a media folder. Then you can just look up this folder and find the results in its children, or its descendants if you need to. You shouldn't need to query your entire media library for a single item.
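    To make the difference concrete, here is a minimal, self-contained sketch that counts how many nodes each strategy touches on a toy tree. It does not use any Umbraco API: Node, TraversalDemo and the library layout (three root folders, 50 item folders each, 20 images per folder) are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for a media node; not an Umbraco type.
public sealed class Node
{
    public string Alias { get; private set; }
    public string FolderCode { get; private set; }
    public List<Node> Children { get; private set; }

    public Node(string alias, string folderCode = null)
    {
        Alias = alias;
        FolderCode = folderCode;
        Children = new List<Node>();
    }
}

public static class TraversalDemo
{
    // Number of nodes touched since the counter was last reset.
    public static int Visited;

    // Walks a node and everything beneath it, counting each node.
    public static IEnumerable<Node> DescendantsOrSelf(Node node)
    {
        Visited++;
        yield return node;
        foreach (var child in node.Children)
            foreach (var descendant in DescendantsOrSelf(child))
                yield return descendant;
    }

    // Passes nodes through unchanged, counting each one touched.
    public static IEnumerable<Node> Touch(IEnumerable<Node> nodes)
    {
        foreach (var node in nodes) { Visited++; yield return node; }
    }

    public static List<Node> BuildLibrary(int folders, int imagesPerFolder)
    {
        var roots = new List<Node>();
        foreach (var type in new[] { "package", "hotel", "activity" })
        {
            var root = new Node(type + "Images");
            for (int i = 0; i < folders; i++)
            {
                var folder = new Node(type + "Images", type + i);
                for (int j = 0; j < imagesPerFolder; j++)
                    folder.Children.Add(new Node("Image"));
                root.Children.Add(folder);
            }
            roots.Add(root);
        }
        return roots;
    }

    // Whole-library scan: walks descendants until the code matches.
    public static Node FindDeep(List<Node> roots, string folderCode)
    {
        return roots.SelectMany(DescendantsOrSelf)
                    .FirstOrDefault(n => n.FolderCode == folderCode);
    }

    // Targeted lookup: pick one root, then look only at its children.
    public static Node FindShallow(List<Node> roots, string rootAlias, string folderCode)
    {
        var root = Touch(roots).FirstOrDefault(n => n.Alias == rootAlias);
        return root == null
            ? null
            : Touch(root.Children).FirstOrDefault(n => n.FolderCode == folderCode);
    }
}

public static class TraversalProgram
{
    public static void Main()
    {
        var roots = TraversalDemo.BuildLibrary(50, 20);

        TraversalDemo.Visited = 0;
        TraversalDemo.FindDeep(roots, "hotel49");
        Console.WriteLine("Whole-library scan touched: " + TraversalDemo.Visited);

        TraversalDemo.Visited = 0;
        TraversalDemo.FindShallow(roots, "hotelImages", "hotel49");
        Console.WriteLine("Root + children lookup touched: " + TraversalDemo.Visited);
    }
}
```

    The deep scan has to pass through every image under every folder it meets before reaching the target, while the targeted lookup only ever touches root items and one set of children.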

    So to re-cap:

    • Ensure nothing is interfering with your indexes being populated correctly
    • TypedMedia will always be faster ... so long as you are not querying your entire media library to look for a single item
    • If you want to be efficient with large media sets, you need to organize your media so that it's easily looked up/targeted
  • Jason Espin 368 posts 1335 karma points
    Jan 25, 2016 @ 16:51
    Jason Espin
    0

    Hi Shannon,

    Thanks for your response.

    In this case, we are talking about folders. I have a folder for each of my page types (packageImages, hotelImages, activityImages) which then contain images.

    In this instance, I am first looking for the correct folder, which should be the parent of the images I wish to create or update, and then I do the image processing elsewhere.

    I cannot query a specific media item at the point you mention because I do not have the media id with which to access it using TypedMedia. This is why I have to go through this process to ensure that I have an ID / Node that I can use.

    Forgive me if I have misunderstood but do you mean processing it in this way would be more fitting:

    IPublishedContent luceneMediaParent = Umbraco.TypedMediaAtRoot().Where(m => m.ContentType.Alias == umbraco_node.ContentType.Alias.ToLower() + "Images").FirstOrDefault();
    // Guard against a missing parent folder before reading its children.
    IPublishedContent luceneChildElement = luceneMediaParent == null
        ? null
        : luceneMediaParent.Children().Where(x => x.GetPropertyValue<string>("folderCode") == umbraco_node.ContentType.Alias.ToLower() + itemCode).FirstOrDefault();
    if (luceneChildElement != null)
    {
        try
        {
            databaseMediaObject = ms.GetById(luceneChildElement.Id);
        }
        catch (NullReferenceException ex)
        {
            Log.Error("Media folder id " + luceneChildElement.Id + " exists in the media cache but is not present in the database. ", ex);
        }
    }
    
  • Shannon Deminick 1526 posts 5272 karma points MVP 3x
    Jan 26, 2016 @ 09:43
    Shannon Deminick
    101

    Hi, yes, that would be a better approach since now you are simply looking at your root nodes and selecting one, then looking at its children and selecting one. Previously, you were iterating the entire media library.

    But if you already know the Id of the folder that contains umbraco_node.ContentType.Alias.ToLower() + "Images", then just look up that node by Id and then query its children. This will be faster again.

    You should not require a database lookup. All media should be in your index; if this is not the case, then something is interfering with the media getting stored into your index.

  • Jason Espin 368 posts 1335 karma points
    Jan 26, 2016 @ 14:08
    Jason Espin
    0

    Hi, thanks again for your response. I wasn't aware before that media was not subject to the same XML caching methods as content. I guess this is something that should maybe be given a little more focus in the certification courses, as this is the first I've heard of it.

    Whilst I completely understand it would be best to know the specific ID of the parent folder in question, in this case the site is a product for a client, and therefore we have to have safeguards in place to account for user error such as accidentally deleting the parent folder. Therefore, in the extended version of my code above, if no node exists in the root that matches umbraco_node.ContentType.Alias.ToLower() + "Images", we go ahead and create one to ensure that there is somewhere for the children to go. This means that the id of this media folder could change if that were to occur.

  • Shannon Deminick 1526 posts 5272 karma points MVP 3x
    Jan 26, 2016 @ 14:13
    Shannon Deminick
    0

    Sure, I can pass along your request to the trainers about explaining where media is cached. The umbraco.config file in ~/App_Data is the XML cache file and, as you can tell, there is no media in there.

    Not knowing the ids is fine, though you could easily cache them in runtime cache once found to save a media lookup the next time any page is executed.
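    One way to cache the ids once found is sketched below. This is a rough illustration under stated assumptions, not Umbraco's own runtime cache: FolderIdCache and its lookup delegate are hypothetical, and a plain ConcurrentDictionary stands in for whatever cache the site actually uses. Invalidate covers the case mentioned earlier where a folder is deleted and recreated with a new id.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical in-memory cache of folder ids keyed by folder code.
public sealed class FolderIdCache
{
    private readonly ConcurrentDictionary<string, int> _ids =
        new ConcurrentDictionary<string, int>();

    // Stand-in for the real media lookup; returns null when not found.
    private readonly Func<string, int?> _lookup;

    public FolderIdCache(Func<string, int?> lookup)
    {
        if (lookup == null) throw new ArgumentNullException("lookup");
        _lookup = lookup;
    }

    // Returns the cached id, or runs the lookup once and remembers a hit.
    public int? GetFolderId(string folderCode)
    {
        int cached;
        if (_ids.TryGetValue(folderCode, out cached))
        {
            return cached;
        }

        var found = _lookup(folderCode);
        if (found.HasValue)
        {
            _ids[folderCode] = found.Value;
        }
        // Misses are not cached, so a folder created later is still found.
        return found;
    }

    // Call this when a folder is deleted or recreated so a stale id
    // is not handed out.
    public void Invalidate(string folderCode)
    {
        int removed;
        _ids.TryRemove(folderCode, out removed);
    }
}

public static class FolderCacheProgram
{
    public static void Main()
    {
        int lookups = 0;
        var cache = new FolderIdCache(code =>
        {
            lookups++;
            return code == "hotel42" ? 1234 : (int?)null;
        });

        cache.GetFolderId("hotel42");
        cache.GetFolderId("hotel42");
        Console.WriteLine("Lookups performed: " + lookups);
    }
}
```

    Only the first request for a given folder code pays for the media lookup; later requests are answered from memory until the entry is invalidated.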
