Copied to clipboard

Flag this post as spam?

This post will be reported to the moderators as potential spam to be looked at


  • Mark 116 posts 248 karma points
    Oct 05, 2015 @ 14:09
    Mark
    0

    Listing unreferenced media items

    Hi All,

    Does anyone know of a package or even just a DB query that can list out all media items that are not referenced in properties of either content nodes or other media nodes.

    I found: https://our.umbraco.org/forum/developers/extending-umbraco/23713-How-to-find-unused-files-under-the-media-folder#comment-88363 but this returns a list of all media items in the database. I can use that to clear of files that aren't used, but I want to clear out from the db media items that aren't referenced anywhere, and I don't fancy doing it by hand..

    Thanks in advance..

    Mark

  • Mark Bowser 261 posts 846 karma points c-trib
    Oct 05, 2015 @ 16:27
    Mark Bowser
    0

    This is something that should get you started. The gotcha is that media can be linked to in rich text editors or custom data types, so you can't just search for the Media Id in the dataInt column of the cmsPropertyData table. You also need to search the dataNvarchar and dataNtext columns. I'm just using a LIKE, so you might get false positives. For instance, if you have a media node with an id of 1234 that is never selected, and you have a node that selects media with an id of 12345, your media that is never selected won't be detected by this query.

    USE my_umbraco_database
    
    DECLARE @latestOnly bit = 1
    
    DECLARE @selectedMedia TABLE (id int)
    
    -- first, get all media that is probably selected
    INSERT INTO @selectedMedia
    SELECT DISTINCT m.id
    FROM cmsPropertyData pd
    INNER JOIN cmsPropertyType pt
        ON pt.id = pd.propertytypeid
    INNER JOIN umbracoNode n
        ON n.id = pd.contentNodeId
    INNER JOIN cmsContentVersion cv
        ON cv.VersionId = pd.versionId
    INNER JOIN (SELECT cv_inner.ContentId
                      ,MAX(cv_inner.VersionDate) VersionDate
                FROM cmsContentVersion cv_inner
                GROUP BY cv_inner.ContentId) cv_top
        ON cv_top.ContentId = cv.ContentId AND (@latestOnly = 0 OR cv_top.VersionDate = cv.VersionDate)
    INNER JOIN umbracoNode m
        ON pd.dataInt = m.id
        OR pd.dataNvarchar LIKE '%' + CAST(m.Id AS VARCHAR) + '%'
        OR pd.dataNtext LIKE '%' + CAST(m.Id AS VARCHAR) + '%'
    INNER JOIN cmsContent c
        ON c.nodeId = m.id
    INNER JOIN cmsContentType ct
        ON ct.nodeId = c.contentType
    WHERE m.nodeObjectType = 'B796F64C-1F99-4FFB-B886-4BF4BC011A9C'
      AND ct.alias <> 'Folder'
    
    -- next, find all the media that isn't selected
    SELECT *
    FROM umbracoNode m
    INNER JOIN cmsContent c
        ON c.nodeId = m.id
    INNER JOIN cmsContentType ct
        ON ct.nodeId = c.contentType
    WHERE m.nodeObjectType = 'B796F64C-1F99-4FFB-B886-4BF4BC011A9C'
      AND ct.alias <> 'Folder'
      AND m.id NOT IN (SELECT id
                       FROM @selectedMedia)
    

    This query takes a while to run. I'm sure if more thought was applied, this query could be a lot more efficient. If you make improvements, I'd love to see them. I have a client that wants this in the near future as well :)

    Hope this helps.

  • Mark Bowser 261 posts 846 karma points c-trib
    Oct 05, 2015 @ 16:30
    Mark Bowser
    0

    The latest only is a flag that will let you search on only the latest version of each node or on every version of a node. If you put @latestOnly = 0, you will find nodes that have never been selected.

  • kleptbit 37 posts 92 karma points
    Nov 01, 2015 @ 21:14
    kleptbit
    0

    This is working in 6.0.5 too, thanks.

Please Sign in or register to post replies

Write your reply to:

Draft