Does anyone know of a package or even just a DB query that can list out all media items that are not referenced in properties of either content nodes or other media nodes.
This is something that should get you started. The gotcha is that media can be linked to in rich text editors or custom data types, so you can't just search for the Media Id in the dataInt column of the cmsPropertyData table. You also need to search the dataNvarchar and dataNtext columns. I'm just using a LIKE, so you might get false positives. For instance, if you have a media node with an id of 1234 that is never selected, and you have a node that selects media with an id of 12345, your media that is never selected won't be detected by this query.
USE my_umbraco_database
DECLARE @latestOnly bit = 1
DECLARE @selectedMedia TABLE (id int)
-- first, get all media that is probably selected
INSERT INTO @selectedMedia
SELECT DISTINCT m.id
FROM cmsPropertyData pd
INNER JOIN cmsPropertyType pt
ON pt.id = pd.propertytypeid
INNER JOIN umbracoNode n
ON n.id = pd.contentNodeId
INNER JOIN cmsContentVersion cv
ON cv.VersionId = pd.versionId
INNER JOIN (SELECT cv_inner.ContentId
,MAX(cv_inner.VersionDate) VersionDate
FROM cmsContentVersion cv_inner
GROUP BY cv_inner.ContentId) cv_top
ON cv_top.ContentId = cv.ContentId AND (@latestOnly = 0 OR cv_top.VersionDate = cv.VersionDate)
INNER JOIN umbracoNode m
ON pd.dataInt = m.id
OR pd.dataNvarchar LIKE '%' + CAST(m.Id AS VARCHAR) + '%'
OR pd.dataNtext LIKE '%' + CAST(m.Id AS VARCHAR) + '%'
INNER JOIN cmsContent c
ON c.nodeId = m.id
INNER JOIN cmsContentType ct
ON ct.nodeId = c.contentType
WHERE m.nodeObjectType = 'B796F64C-1F99-4FFB-B886-4BF4BC011A9C'
AND ct.alias <> 'Folder'
-- next, find all the media that isn't selected
SELECT *
FROM umbracoNode m
INNER JOIN cmsContent c
ON c.nodeId = m.id
INNER JOIN cmsContentType ct
ON ct.nodeId = c.contentType
WHERE m.nodeObjectType = 'B796F64C-1F99-4FFB-B886-4BF4BC011A9C'
AND ct.alias <> 'Folder'
AND m.id NOT IN (SELECT id
FROM @selectedMedia)
This query takes a while to run. I'm sure if more thought was applied, this query could be a lot more efficient. If you make improvements, I'd love to see them. I have a client that wants this in the near future as well :)
The latest only is a flag that will let you search on only the latest version of each node or on every version of a node. If you put @latestOnly = 0, you will find nodes that have never been selected.
Listing unreferenced media items
Hi All,
Does anyone know of a package or even just a DB query that can list out all media items that are not referenced in properties of either content nodes or other media nodes.
I found: https://our.umbraco.org/forum/developers/extending-umbraco/23713-How-to-find-unused-files-under-the-media-folder#comment-88363 but this returns a list of all media items in the database. I can use that to clear of files that aren't used, but I want to clear out from the db media items that aren't referenced anywhere, and I don't fancy doing it by hand..
Thanks in advance..
Mark
This is something that should get you started. The gotcha is that media can be linked to in rich text editors or custom data types, so you can't just search for the Media Id in the
dataInt
column of thecmsPropertyData
table. You also need to search thedataNvarchar
anddataNtext
columns. I'm just using a LIKE, so you might get false positives. For instance, if you have a media node with an id of1234
that is never selected, and you have a node that selects media with an id of12345
, your media that is never selected won't be detected by this query.This query takes a while to run. I'm sure if more thought was applied, this query could be a lot more efficient. If you make improvements, I'd love to see them. I have a client that wants this in the near future as well :)
Hope this helps.
The latest only is a flag that will let you search on only the latest version of each node or on every version of a node. If you put
@latestOnly = 0
, you will find nodes that have never been selected.This is working in 6.0.5 too, thanks.
In 7.15.3, I found node unique id in the rich text editor's dataNtext column.
Replaced this part
with that
so it covers substring id concern.
is working on a reply...