Regenerating cmsContentXml table
Hello,
I would like to regenerate the content in the cmsContentXml table that is used to create the umbraco.config file for the frontend cached data.
My understanding of the content lifecycle is:
1) Tabular Data in Multiple tables
2) cmsContentXml table
3) umbraco.config
When I click the root "Content" node and choose "Republish all Nodes" it just takes the cmsContentXml table and rebuilds the umbraco.config file.
What I would like is to perform the process that compiles the tabular data in the rest of the database into the cmsContentXml table. This happens when you click the "Save and Publish" button on a node. I would have expected it to happen with the "Republish all Nodes" button, but it doesn't.
Any ideas?
Hi
I am experiencing the same problem. The tabular data is not consistent with the content of cmsContentXml. In my case, some nodes are shown in the Sort dialog (which uses the tabular data) but not in the content tree (which uses umbraco.config, built from the cmsContentXml table).
Any help is much appreciated.
Hi Stephan,
I had the same problem and found this article:
http://our.umbraco.org/wiki/reference/api-cheatsheet/publishing-and-republishing
In the article, there is an umbraco URL that will rebuild the XML cached data:
http://YOURDOMAIN/Umbraco/dialogs/republish.aspx?xml=true and then click "republish".
Could you please give this a try?
Hi
Does anyone know whether the cmsContentXml table ever gets rebuilt automatically by the system and, if so, when this happens?
On a site I am running, it looks as if this table started rebuilding but then died 40% of the way through, meaning that a load of content disappeared from the site...
Cheers
Paul,
I'm not an expert on this, but tried some things out and looked at a few of my sites.
As far as I can see, the XML file ("umbraco.config" in the "App_Data" folder) is only changed when content is published. On my sites the file's last-modified dates range from a few days to more than a month old. When I publish, the last-modified date of the file is updated.
You can also see this in the SQL-table "umbracoLog" with this query:
SELECT *
FROM umbracoLog
WHERE logComment LIKE '%xml%'
ORDER BY datestamp DESC
You will see lines with "Xml saved in xxx".
But that's not your question: you want to know if the SQL-table is being updated.
You can use this SQL-query to check the date your SQL-table has changed (but replace "umbracodb" with your own database name):
SELECT OBJECT_NAME(OBJECT_ID) AS name, last_user_update, *
FROM sys.dm_db_index_usage_stats
WHERE OBJECT_NAME(OBJECT_ID) = 'cmsContentXml'
AND database_id = DB_ID( 'umbracodb')
As far as I can see, neither the table nor the file is updated automatically. But maybe that also depends on the version of Umbraco you have.
There are three levels of caching in Umbraco:
- The cmsContentXML database table.
- Optionally, the \app_data\umbraco.config file. Its purpose is to let the in-memory XML document be regenerated quickly; you can switch this off.
- An in-memory XML document (identical to umbraco.config).
As you correctly say, the data lives in a load of related tables. When you publish a node, Umbraco will try to update that single node in all of the above.
To rebuild the XML cache from cmsContentXML, right-click the top content node and republish.
To rebuild cmsContentXML and both the file and in-memory XML caches, go to /umbraco/dialogs/republish.aspx?xml=true
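To make the difference between the two rebuilds concrete, here is a toy model in Python. It is not Umbraco code: the class, its method names and the dictionaries standing in for the tabular data, cmsContentXml and the XML cache are all hypothetical, and every node is treated as published. It only illustrates why a right-click republish cannot repair a table that has lost rows.

```python
# Toy model of the three layers described above; NOT Umbraco code.
# All names (ToySite, source_tables, content_xml, xml_cache) are hypothetical.

def render_xml(node_id, name):
    """Build a stand-in for the XML fragment stored per published node."""
    return f'<node id="{node_id}" nodeName="{name}" />'


class ToySite:
    def __init__(self, nodes):
        self.source_tables = dict(nodes)  # tabular data (id -> name)
        self.content_xml = {}             # stand-in for cmsContentXml
        self.xml_cache = {}               # stand-in for umbraco.config / in-memory XML

    def publish_node(self, node_id):
        """'Save and Publish': update that single node in every layer."""
        xml = render_xml(node_id, self.source_tables[node_id])
        self.content_xml[node_id] = xml
        self.xml_cache[node_id] = xml

    def republish_from_table(self):
        """Right-click republish: rebuild the XML cache from the table only."""
        self.xml_cache = dict(self.content_xml)

    def republish_xml_true(self):
        """republish.aspx?xml=true: rebuild the table, then the cache."""
        self.content_xml = {
            node_id: render_xml(node_id, name)
            for node_id, name in self.source_tables.items()
        }
        self.republish_from_table()


site = ToySite({1: "Home", 2: "About", 3: "News"})
for node_id in (1, 2, 3):
    site.publish_node(node_id)

# Simulate the table losing a row while the tabular data stays intact.
del site.content_xml[3]

site.republish_from_table()
after_right_click = sorted(site.xml_cache)    # ids present after right-click republish

site.republish_xml_true()
after_full_rebuild = sorted(site.xml_cache)   # ids present after the full rebuild

print(after_right_click, after_full_rebuild)
```

In this model the node that went missing from the table only comes back after the full rebuild, which matches the behaviour described in the original question.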
@Thomas:
Paul asked about automatic rebuilding of one of the caching levels. Any info on that?
As far as I know, older versions of Umbraco rebuilt the umbraco.config file after a restart of the application. Is this still happening in version 4.7?
Hi - there's no regenerating of any of the XML caches on app restart that I can see (in 4.7).
On publish of one node, the in-memory XML and its file backup are not regenerated; only the relevant node is updated.
The in-memory XML and its file system backup are regenerated from the cmsContentXML table at various points (this happens fairly quickly), e.g. on any republish from the right-click menu (if you choose all sub pages), on moving or sorting nodes, adding new document types, or installing packages.
There are only two places where the cmsContentXML SQL table is regenerated: 1) you do the dialogs republish.aspx?xml=true mentioned above, or 2) you rename the alias of an existing document type (as these aliases are used in the XML, I guess it makes sense that you have to regenerate everything, since searching through the XML blob in the table would be painful and slow).
Note that the update routine for cmsContentXML (RePublishAll()) starts by truncating the table and then loops through a data reader, inserting rows. So if anything went wrong or timed out during this, you would indeed end up with a half-empty table (as Paul mentioned).
Look for "Error generating xml: " in the umbracoLog table. The only way to fix this is to do the dialog republish with xml=true.
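The half-empty table is what you would expect from any "clear, then insert row by row" routine that is interrupted partway through. The sketch below reproduces that pattern with Python's sqlite3 module against an in-memory table; it is an illustration only, not the Umbraco implementation. The table layout, the simulated timeout and the `rebuild` helper are assumptions, and the transactional variant is shown purely as a contrast, not as something RePublishAll() is known to do.

```python
import sqlite3


def make_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so every
    # statement is committed immediately unless we issue BEGIN ourselves.
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.execute("CREATE TABLE cmsContentXml (nodeId INTEGER PRIMARY KEY, xml TEXT)")
    db.executemany(
        "INSERT INTO cmsContentXml VALUES (?, ?)",
        [(i, f"<node id='{i}' />") for i in range(1, 11)],
    )
    return db


def rows_to_insert(fail_after):
    """Yield rebuilt rows, raising partway through to mimic a timeout."""
    for i in range(1, 11):
        if i > fail_after:
            raise TimeoutError("simulated timeout during rebuild")
        yield (i, f"<node id='{i}' />")


def rebuild(db, fail_after, use_transaction):
    """Clear the table and re-insert rows, optionally inside a transaction."""
    if use_transaction:
        db.execute("BEGIN")
    try:
        db.execute("DELETE FROM cmsContentXml")  # SQLite has no TRUNCATE
        for row in rows_to_insert(fail_after):
            db.execute("INSERT INTO cmsContentXml VALUES (?, ?)", row)
        if use_transaction:
            db.execute("COMMIT")
    except TimeoutError:
        if use_transaction:
            db.execute("ROLLBACK")  # restore the rows that existed before


def count(db):
    return db.execute("SELECT COUNT(*) FROM cmsContentXml").fetchone()[0]


plain = make_db()
rebuild(plain, fail_after=4, use_transaction=False)
rows_without_transaction = count(plain)

guarded = make_db()
rebuild(guarded, fail_after=4, use_transaction=True)
rows_with_transaction = count(guarded)

print(rows_without_transaction, rows_with_transaction)
```

Without a transaction, only the rows inserted before the failure remain; with one, the rollback leaves the previous contents in place.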
Hi - quickly, further to the above: with very large content databases I have seen problems with timeouts during the republish (it takes about 15 minutes on the site I'm currently working on). One quick option is either to go into the web.config and increase the script timeout for your site, or to find the /umbraco/dialogs/republish.aspx page in your Umbraco installation and add an inline script runat="server" block that sets the script timeout on page load and resets it on page prerender (so you have a larger timeout for the duration of the click event handler).
Thomas: As I understand your posts, you can eventually end up with an Xml cache on disk/memory that is out of synch with the database and in this case you have to do something by hand as it won't be fixed by any automatic processes. Is this correct?
Hi - yes, the two can be out of synch, but you will probably never see this on small or medium sites, or on large ones running on reliable hardware.
If you do encounter it, the solution is: 1) try republishing using the right-click on the top "content" node; if that's still no good, 2) do the dialogs/republish.aspx as mentioned.
As I said I cannot see anywhere in the source where any cache is reset or regenerated on app restart, save of course for the in-memory, which will reload from the disk umbraco.config if this is enabled (see ContinouslyUpdateXmlDiskCache setting), or the cmsContentXML table if not.
I found that cmsContentXml had gotten out of sync with the actual content on the site, and so obviously the XML cache on disk and in memory was out of sync too.
We never actually nailed down what caused them to get out of sync, but it happened during the installation of a package. I couldn't see anything in the source code to suggest that cmsContentXml is rebuilt during installation of packages, but I can't find any other explanation for why half the records in this table would disappear.
I have found that on application restart, if there is no cache on disk, it will be regenerated (i.e. delete umbraco.config, restart the app pool, and it will be there again), so there must surely at least be some check on restart to see if this file exists?
If you have "ContinouslyUpdateXmlDiskCache" set to true (which I believe is the default), then it will continuously update the XML cache on disk - so if the file is missing, it will be regenerated along with the in-memory cache.
I am currently experiencing an issue related to this topic, which I have described in a separate topic here:
https://our.umbraco.org/forum/developers/extending-umbraco/71633-bulk-content-updates-not-appearing-in-backoffice
If anyone could give advice on this it would be appreciated.
I have the same problem on my websites. One of them works, the other does not.
Take a look at umbracoLog to see whether, after republishing the XML, you have an entry like this:
Be careful, because the republish.aspx dialog displays the green success message even when XML generation fails.
Take a look at this code from republish.aspx.cs:
and the RePublishAll method from ContentService:
The Boolean result means nothing to republish.aspx.cs.
Errors should appear in App_Data/Logs/UmbracoTraceLog.txt
Hey, this should be marked as answer!
Background: "App_Data/umbraco.config" is regenerated at web start, and its content matches the DB table [cmsContentXml].
To reproduce the scenario, I manually created an error situation by deleting all rows in the DB table [cmsContentXml] and then starting the web. I then checked the content in the table and in umbraco.config: both were in sync and in the expected error state.
Then I invoked ContentService.RePublishAll() programmatically by creating an API (or any entry point you want, just code it). The whole DB table [cmsContentXml] was then regenerated.
To verify that the regenerated rows in [cmsContentXml] are the expected content (i.e. published and non-trashed content), I also compared the total row count in [cmsContentXml] with the result of the following SQL:
SELECT COUNT(0)
FROM cmsDocument
INNER JOIN umbracoNode ON umbracoNode.id = cmsDocument.nodeId AND umbracoNode.trashed = 0
AND published = 1
They are the same, so it works!
Restarting the web at this point re-syncs the umbraco.config cache file.
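The row-count comparison described above can be tried out on a mock database before running it against a real site. The sketch below uses Python's sqlite3 module with three tiny stand-in tables; the columns are cut down to the ones the check needs and the sample rows are invented for illustration, so treat the schema as an assumption rather than the real Umbraco one.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Minimal stand-ins for the real tables (assumed, reduced columns).
    CREATE TABLE umbracoNode (id INTEGER PRIMARY KEY, trashed INTEGER);
    CREATE TABLE cmsDocument (nodeId INTEGER, published INTEGER);
    CREATE TABLE cmsContentXml (nodeId INTEGER PRIMARY KEY, xml TEXT);

    -- Node 1: published. Node 2: unpublished. Node 3: published but trashed.
    INSERT INTO umbracoNode VALUES (1, 0), (2, 0), (3, 1);
    INSERT INTO cmsDocument VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO cmsContentXml VALUES (1, '<node id="1" />');
""")

# Rows that should be in cmsContentXml: published and not trashed.
expected = db.execute("""
    SELECT COUNT(0)
    FROM cmsDocument
    INNER JOIN umbracoNode
        ON umbracoNode.id = cmsDocument.nodeId
       AND umbracoNode.trashed = 0
       AND published = 1
""").fetchone()[0]

# Rows that actually are in cmsContentXml.
actual = db.execute("SELECT COUNT(0) FROM cmsContentXml").fetchone()[0]

in_sync = expected == actual
print(expected, actual, in_sync)
```

Matching counts do not prove that every row is correct, only that nothing is obviously missing, so this is best treated as a quick sanity check after a rebuild.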
Thanks @wtct
I used the /umbraco/dialogs/republish.aspx?xml=true URL, which worked great, but can you just truncate the cmsContentXml table?
And if you do, what steps do you have to take to get the table rebuilt?
Are there any other "gotchas"?
Thanks