In Umbraco 7 we got used to delete old versions with some SQL statements. But these statements don't work anymore in Umbraco 8, since the database structure changed. I tried to use the DeleteVersions method of the ContentService, but it doesn't work:
// Get all nodes with old versions
static readonly string sqlForOldVersions =
@"SELECT distinct [nodeId] FROM[umbracoContentVersion]
where[current] <> 1";
...
List<int> ids = new List<int>();
using (var reader = handler.Execute( sqlForOldVersions, true ))
while (reader.Read())
{
ids.Add( reader.GetInt32( 0 ) );
}
foreach (var id in ids)
{
this.contentService.DeleteVersions( id, dt );
}
It throws with the following error:
The primary key value cannot be deleted because references to this key
still exist. [ Foreign key constraint name =
FK_umbracoDocumentVersion_umbracoContentVersion_id ]
Does anybody have any insights regarding this issue?
There should definitely be a housekeeping functionality to delete old versions. The API only allows to delete old versions of a given content. But how do we determine, which contents have old versions?
The SQL query in my sample code above is a quick hack.
Why isn't there an API like
DeleteAllContentVersionsBefore(DateTime dt)?
This should have been part of Umbraco since the v4 days… ;-)
That does look like a bug in v8. That foreign key was missing before, so it didn't matter which order things were deleted. I think it might also need a tweak to how DeleteVersions decides which versions can be deleted.
For deleting old versions of all content, this SQL should work for v8
SELECT cv.id
INTO #toDelete
FROM umbracoDocumentVersion dv
INNER JOIN umbracoContentVersion cv ON dv.id = cv.id
WHERE cv.[current] != 1 AND dv.published != 1 AND cv.VersionDate < 'xxxx-xx-xx'
DELETE FROM umbracoPropertyData WHERE versionId IN (select id from #toDelete)
DELETE FROM umbracoDocumentVersion WHERE id IN (select id from #toDelete)
DELETE FROM umbracoContentVersion WHERE id IN (select id from #toDelete)
DROP TABLE #toDelete
Adding a ContentService.DeleteAllVersionsBefore(DateTime dt) would be nice, though that SQL only works for SQL Server. We'd need some other solution for SQL CE.
This is great. I've been looking for any kind of help with compacting an Umbraco 8 database; removing old versions, pruning the log, etc. If you happen to have any other "clean up" scripts to share please do!
Note if you have multiple language variations of your content the query above won't work because it doesn't cleanup table:
[umbracoContentVersionCultureVariation]
I'll let you know when I have a version which clean this as well.
SELECT cv.id
INTO #toDelete
FROM umbracoDocumentVersion dv
INNER JOIN umbracoContentVersion cv ON dv.id = cv.id
WHERE cv.nodeId = @nodeId AND cv.[current] != 1 AND dv.published != 1 AND cv.VersionDate < @toDate
DELETE FROM umbracoPropertyData WHERE versionId IN (select id from #toDelete)
DELETE FROM umbracoDocumentVersion WHERE id IN (select id from #toDelete)
DELETE FROM umbracoContentVersionCultureVariation WHERE versionId IN (select id from #toDelete)
DELETE FROM umbracoContentVersion WHERE id IN (select id from #toDelete)
DROP TABLE #toDelete
I've used your query (removed the "cv.nodeId = @nodeId AND" part) and it worked perfectly.
I noticed when clicking on the Info it still reported the historical versions, it got this information from the Log. So adding this statement completely cleaned it.
TRUNCATE TABLE umbracoLog
P.S. disclaimer, it never hurts to create a backup before doing any of this....
I run an hourly syncronization from an ERP system to Umbraco 8, updating only content that have changed compared to the record from the ERP, but within the last year I discovered that my DB has grown to about 12 GB (for a relatively simple site with a product DB of about 600 items).
I have now been running your script above, Remko, in date interval execution chunks of about 10 days from site launch to today, (otherwise MS SQL Management studio stalled), deleting millions of entries per 10 day interval. Now the size is down to about 500 mb and running.
Still, when I do a report on the DB, the size of the tables dbo.umbracoAudit and dbo.umbracoPropertyData seems very high.
Do any of you know if it is possible to turn off audit trails or versions for dynamically created and maintained content?
@Matthew, thanks for notifying us there is also a package for this.
@Thomsen, when SQL Server allocates spaces it doesn't automatically deallocate it. Think doing a full backup will clear the transaction log giving back a lot of space. Right clicking a database after this and select Shrink Database further releases space (think it reserves the actual used space + 10%).
Not sure what you can and can't delete from umbracoAudit and umbracoPropertyData when you find this out please share.
Delete old versions
In Umbraco 7 we got used to delete old versions with some SQL statements. But these statements don't work anymore in Umbraco 8, since the database structure changed. I tried to use the DeleteVersions method of the ContentService, but it doesn't work:
It throws with the following error:
Does anybody have any insights regarding this issue?
If the contentservice is throwing that error I assume it's a bug in V8.
There should definitely be a housekeeping functionality to delete old versions. The API only allows to delete old versions of a given content. But how do we determine, which contents have old versions?
The SQL query in my sample code above is a quick hack.
Why isn't there an API like
This should have been part of Umbraco since the v4 days… ;-)
I stumbled upon this one as well.. Did you figure out how to fix this?
That does look like a bug in v8. That foreign key was missing before, so it didn't matter which order things were deleted. I think it might also need a tweak to how
DeleteVersions
decides which versions can be deleted.For deleting old versions of all content, this SQL should work for v8
Adding a
ContentService.DeleteAllVersionsBefore(DateTime dt)
would be nice, though that SQL only works for SQL Server. We'd need some other solution for SQL CE.This is great. I've been looking for any kind of help with compacting an Umbraco 8 database; removing old versions, pruning the log, etc. If you happen to have any other "clean up" scripts to share please do!
Note if you have multiple language variations of your content the query above won't work because it doesn't cleanup table: [umbracoContentVersionCultureVariation]
I'll let you know when I have a version which clean this as well.
Already made that one:
Thanks Remco ! ..
I've used your query (removed the "cv.nodeId = @nodeId AND" part) and it worked perfectly.
I noticed when clicking on the Info it still reported the historical versions, it got this information from the Log. So adding this statement completely cleaned it.
TRUNCATE TABLE umbracoLog
P.S. disclaimer, it never hurts to create a backup before doing any of this....
Ok, thanks.
Think you'd better do:
DELETE FROM [dbo].[umbracoLog] WHERE DateStamp < @toDate
this then. Because the versions you remove are also selected by their date.
Thank you, this thread was incredibly helpful!
I run an hourly syncronization from an ERP system to Umbraco 8, updating only content that have changed compared to the record from the ERP, but within the last year I discovered that my DB has grown to about 12 GB (for a relatively simple site with a product DB of about 600 items).
I have now been running your script above, Remko, in date interval execution chunks of about 10 days from site launch to today, (otherwise MS SQL Management studio stalled), deleting millions of entries per 10 day interval. Now the size is down to about 500 mb and running.
Still, when I do a report on the DB, the size of the tables dbo.umbracoAudit and dbo.umbracoPropertyData seems very high.
Do any of you know if it is possible to turn off audit trails or versions for dynamically created and maintained content?
Best regards!
There is a package that deletes old version - https://our.umbraco.com/packages/website-utilities/unversion/
Maybe interesting to compare the SQL maybe it's missing something we can all benefit from, or it goes further :)
Matt
@Matthew, thanks for notifying us there is also a package for this.
@Thomsen, when SQL Server allocates spaces it doesn't automatically deallocate it. Think doing a full backup will clear the transaction log giving back a lot of space. Right clicking a database after this and select Shrink Database further releases space (think it reserves the actual used space + 10%).
Not sure what you can and can't delete from umbracoAudit and umbracoPropertyData when you find this out please share.
is working on a reply...