Umbraco 5 document and data types
I'd like to start this topic because of a discussion we had at the Umbraco Benelux meetup.
Currently, in Umbraco 4, document types and data types (metadata) are saved in the database, which seems unnecessary. It would be more logical to store them as XML. That way the actual data (nodes and content) stays in the database and the metadata lives in XML files. Data and metadata are then more clearly separated, which makes them easier to maintain, and it also becomes easier to move changes from a test instance of Umbraco to a live instance (which is currently a pain in the ass).
It shouldn't be hard to save document and data types as XML, because this data is already written to an XML file when you create a package.
This would be a great feature for Umbraco 5!
I've also created a work item for this on CodePlex. Please vote for it!
http://umbraco.codeplex.com/WorkItem/View.aspx?WorkItemId=25809
Hmm, I thought I could start a discussion about this. I'd like to hear other people's opinions on it.
I'm not really sure what the benefit would be, other than having the metadata easier to edit and export?
If it was stored in XML, would it not be harder/slower to implement inherited data and document types? And maintaining the relationships between them would be harder, as there's no relational integrity checking per se in XML, which there is in SQL. What about recovery of files in case of errors etc as well?
I could be wrong of course! I'd normally store that kind of data in the database in my own projects, as it's data rather than settings (which I would store as XML).
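To make the integrity concern concrete, here is a minimal Python sketch of the kind of check that would have to be written by hand if document types lived in XML. The element and attribute names (`DocumentTypes`, `DocumentType`, `alias`, `parent`) are invented for illustration and are not Umbraco's actual package schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: these element and attribute names are assumptions
# made for this sketch, not Umbraco's real schema.
DOC_TYPES_XML = """
<DocumentTypes>
  <DocumentType alias="BasePage" />
  <DocumentType alias="NewsPage" parent="BasePage" />
  <DocumentType alias="BlogPost" parent="ArticlePage" />
</DocumentTypes>
"""


def find_broken_parents(xml_text):
    """Return (alias, parent) pairs whose parent alias is not defined."""
    root = ET.fromstring(xml_text)
    doc_types = root.findall("DocumentType")
    known = {dt.get("alias") for dt in doc_types}
    return [
        (dt.get("alias"), dt.get("parent"))
        for dt in doc_types
        if dt.get("parent") is not None and dt.get("parent") not in known
    ]


print(find_broken_parents(DOC_TYPES_XML))
```

In SQL a foreign key would reject the dangling `ArticlePage` reference at write time. With XML, a check like this has to be run explicitly, and something has to decide what happens when it fails.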
The following statements are my opinion and don't reflect that of the Umbraco Core team. (Just thought I'd add this disclaimer before I'm held accountable for anything in 12 months' time :P)
Easier to maintain
How? You've now got two places where the 'structure' is handled: in the DB (it's got to have some kind of knowledge) and in an XML file. When you update a document type, it has to write an XML file and modify the database (the property needs to be 'added' to the existing data, at least in some form). And what if the database is offline? Currently you can't make any changes then, but with XML you'd be able to modify the file without the DB being there, so you'd need rollbacks.
It shouldn't be hard to save (it) as XML
It's not hard at the moment, in fact it's very easy, and already built ;)
Actually, storing the document types as a physical file is something that's been talked about already, even in the context of v4.x, and I've never been in favor of it, as I can't see any benefit. What is the ultimate gain? Document types and data types are a "database" concept. A data type isn't truly used on the front end (either via XSLT or .NET); you actually use the ToString()-style output of the data type's value. So having them "cached" doesn't really have any benefit.
Additionally, it can pose some odd problems, such as permissions. These files become another thing that needs to be considered when setting up the file system permissions. Sure, we'd be storing them in the umbracoData folder (which could ideally be App_Data), but have a quick look on the forums and you'll see how frequently the problem of permissions comes up :P
And here's another potential problem, direct-editing of the XML file. People are more likely to modify a raw text file than they are to dive into the database and edit a series of tables. Sure it's a "do at your own risk" kind of thing, but you give people the option and it is often taken (this is one of my arguments against XSLT, it's a lot easier to edit an XSLT file on the live environment and fix a bug than it is to edit a .NET file, compile, and then upload!). Should it be handled, if so how? If not, what do you tell someone who's edited it and it's not "worked as expected"?
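For what it's worth, detecting a direct edit is the easy half of that question. Below is a minimal Python sketch, assuming the CMS records a fingerprint of each file whenever it writes one; the function names are hypothetical, and nothing here reflects how Umbraco actually behaves.

```python
import hashlib


def fingerprint(xml_text):
    """Hash of the file content, recorded whenever the CMS itself saves it."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def edited_outside_cms(current_xml, stored_fingerprint):
    """True if the file no longer matches what the CMS last wrote."""
    return fingerprint(current_xml) != stored_fingerprint


# Hypothetical document type file content, invented for this sketch.
saved = '<DocumentType alias="NewsPage" />'
stored = fingerprint(saved)

print(edited_outside_cms(saved, stored))
print(edited_outside_cms(saved.replace("NewsPage", "NewsPages"), stored))
```

A check like this only says that something changed. Whether to accept the edit, reject it, or roll it back is still the open question raised above.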
Thanks for your feedback, slace.
You are right about the maintaining part. What I meant by maintaining is the case where you have two instances of one website: a test version and a live version. When you update your code, you try it on the test version first, and if everything works you update the live version. This is currently very hard if you updated or created some document and data types but don't want to replace the live database (because the live version holds actual data, not test data). If document and data types were also stored as XML at a physical path, just like templates, you could simply upload those files and the live version would be updated without overwriting the actual database.
Currently it's really hard to sync two instances of Umbraco, and even Courier has its limitations when document and data types are updated.
In my opinion only content (actual data) should be stored in the database, and data that doesn't need to be there (metadata such as document types and data types) should be stored as XML.
Hope it still happens in Umbraco 5 :). If that's not the case are you going to change the way document and data types work in Umbraco 4?
I don't know what you mean by 'hard'. I haven't had any problems updating document types across environments. It can be a tedious task, but the packager and the export/import engine both work fine and can 'upgrade' document types, as they do alias matching.
Content is a different kettle of fish though, and I've yet to see a CMS that handles it well.
But your scenario ignores my final point: how do you expect Umbraco to deal with an XML file being modified outside of Umbraco itself? You said you would be "uploading the file" to 'upgrade', but you can't expect that it is only ever uploaded via the web UI. It's a physical file, so you can update it from the file system, can you not ;)
I have very different ideas on how the document type/ data type structure should go in v5, but it's something I'd have to raise in the core before it's openly discussed :P
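As a rough illustration of the alias matching mentioned above, here is a Python sketch of an 'upgrade' that merges an incoming document type's properties into an existing one by alias. The data shape and the function name are assumptions made for this example. It is not the packager's actual code, and it deliberately leaves out removals, since how the real importer treats those isn't stated in this thread.

```python
def upgrade_document_type(existing, incoming):
    """Merge incoming properties into existing ones, matching on alias.

    Properties with a known alias are updated in place; unknown aliases
    are appended. Nothing is ever removed in this sketch.
    """
    merged = {prop["alias"]: dict(prop) for prop in existing}
    for prop in incoming:
        merged.setdefault(prop["alias"], {}).update(prop)
    return list(merged.values())


# Hypothetical property lists for a live and a test environment.
live = [
    {"alias": "title", "type": "Textstring"},
    {"alias": "body", "type": "Textstring"},
]
test = [
    {"alias": "body", "type": "Richtext editor"},
    {"alias": "summary", "type": "Textstring"},
]

for prop in upgrade_document_type(live, test):
    print(prop["alias"], "->", prop["type"])
```

Here `body` is matched by alias and updated, `summary` is added, and `title` is left alone, which is why existing content attached to unchanged properties survives an upgrade of this kind.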
To the admin: Could this topic be moved to http://our.umbraco.org/forum/core/umbraco-5-general-discussion? Now might be a good time to start this discussion again :).