Can I run slow admin-routines in the background?
Hi! I experience(d) two problems on an Umbraco site. Most likely it's not because of Umbraco, but rather I should tweak or optimize stuff.
First, the site demanded more memory than was available to run smoothly. It was installed on a VPS with 768 MB; when we opened too many UI pages at the same time, w3svc ate all our memory and stuff crashed. That's been taken care of by a move to a physical server with more memory.
The other problem remains: some admin functions take a long time to complete and lock up the site while they run. One particular example: I have a function that creates some 300 documents (nodes) once every day. It takes forever. Can I run this stuff in the background somehow?
The stuff that takes time is Document.MakeNew (6 seconds) and umbraco.library.PublishSingleNode (4 seconds) for each single document. What's going on under the hood?
I will dig deeper into this, but if someone has advice for me I would be very happy.
Thanks a lot!
I decided to add another question regarding that slow behaviour for MakeNew & PublishSingleNode since my main question here is about running admin routines in the background. http://our.umbraco.org/forum/ourumb-dev-forum/bugs/2538-Bug-or-a-messed-up-db
It is true that the Document API (well, CMSNode really) suffers from sub-optimal performance, and I've already assigned myself tasks to try to address this for v4.1.
The problem is (mainly) caused by the length property accessor of the children collection. When you do Document.MakeNew it has to set the sort order of the new node, which is the current length of the parent's children + 1. Because of how CMSNode.Children works the performance is very poor, again something that I'm planning on addressing (if possible).
If you want to run it as an async task I'd suggest running it as a scheduled task (Umbraco has an inbuilt scheduler, but it's prone to errors if the AppPool resets or goes to sleep). The best way to do a scheduled task is to have a console application or Windows Service which pings a Web Service on your Umbraco install that then creates the documents.
It's ugly, but it'd work to produce the async operation.
Your only other option would be going directly to the database and creating the nodes yourself, but I strongly recommend against doing that.
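For what it's worth, the console-application side of that suggestion could look roughly like the sketch below. Everything named here is an assumption made for illustration: the `ImportPinger` class, the endpoint URL and the choice of an HTTP POST are invented, and the web service that actually creates the documents inside Umbraco is not shown.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical console-side helper. It only triggers the import; the actual
// Document.MakeNew and publish calls stay inside the Umbraco web application.
public static class ImportPinger
{
    // Calls the (assumed) web service endpoint and reports whether it answered
    // with a 2xx status. Network failures and timeouts are reported as false.
    public static async Task<bool> TriggerImportAsync(Uri endpoint, TimeSpan timeout)
    {
        using (var client = new HttpClient { Timeout = timeout })
        {
            try
            {
                // Empty POST body: the request is just a "start now" signal.
                using (HttpResponseMessage response =
                    await client.PostAsync(endpoint, new StringContent(string.Empty)))
                {
                    return response.IsSuccessStatusCode;
                }
            }
            catch (HttpRequestException)
            {
                return false; // DNS failure, connection refused, host unreachable
            }
            catch (OperationCanceledException)
            {
                return false; // the request ran longer than the timeout
            }
        }
    }
}
```

A Windows scheduled task could run a small console program that calls `ImportPinger.TriggerImportAsync(new Uri("http://example.com/ImportService.asmx/Run"), TimeSpan.FromHours(1))`, where the URL is a placeholder. At roughly 10 seconds per document (6 for MakeNew plus 4 for publishing), 300 documents come to about 50 minutes, so either the timeout has to be generous or the service should start the work and return straight away.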
Ok, so optimization is coming up, that's good news! Thanks a lot to you and all Umbraco contributors!
So fewer nodes would mean faster performance? I have read about that, but in my case it seems something else is also slowing things down. I read of a really node-intense site (12,000+) which added 1 node per second. Still slow, but much faster than my site. Wonder why. Also, even if I delete all documents under my particular parent node, inserting nodes is still just as slow. Can it be deleted nodes / old versions of nodes that slow things down?
Perhaps I'll stay with the slowness for now and wait for 4.1. Yes, inserting into the db is something I stay away from as long as I can.
Async tasks: ok, does that mean that if I run my admin operations with the scheduler, the users will keep a fair share of the performance while the scheduled task runs?
Cheers, J
Fewer nodes in a particular folder would provide better performance, because when you create a node the following line of code is run:
int sortOrder = new CMSNode(parentId).Children.Length + 1;
(Well, it's a little different, but that's the gist of it.)
The problem comes from the fact that the Children property is horrid when it comes to performance. It executes the following (I'll pseudocode it out so it makes more sense):
SELECT umbracoNode.id FROM umbracoNode WHERE umbracoNode.parentId = CURRENT_ID
FOR EACH (result in DB) LIST_OF_ID.ADD(result)
FOR EACH (id in LIST_OF_ID) ARRAY_OF_CHILDREN.ADD(new CMSNode(id))
So you can start seeing the problem. When you have hundreds of nodes as children and you add a new one it will execute the above. If it is possible to split the children into sub folders of, say, 20 nodes each, you'll probably find creation speeds up a bit. But that may introduce more problems than it solves.
The reason I recommend a web service running async is that it can be done when there is down time on the site (i.e. early morning) so that there is very little impact on the CMS users.
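To put rough numbers on that, here is a toy model rather than Umbraco code. It assumes every `new CMSNode(id)` costs one database round trip (that is my reading of the pseudocode, not something confirmed from the source), and `FakeNodeStore` and `SortOrderDemo` are hypothetical names. The count-only variant is just one possible alternative for comparison, not necessarily what v4.1 will do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for the node table; it counts simulated database round trips.
public sealed class FakeNodeStore
{
    private readonly Dictionary<int, int> parentOf = new Dictionary<int, int>(); // node id -> parent id

    public int Queries { get; private set; }

    public void Add(int id, int parentId) { parentOf[id] = parentId; }

    // Simulates: SELECT id FROM umbracoNode WHERE parentId = @parentId
    public List<int> SelectChildIds(int parentId)
    {
        Queries++;
        return parentOf.Where(p => p.Value == parentId).Select(p => p.Key).ToList();
    }

    // Simulates loading one node by id (assumed: one round trip per node).
    public int LoadNode(int id)
    {
        Queries++;
        return id;
    }

    // Simulates: SELECT COUNT(*) FROM umbracoNode WHERE parentId = @parentId
    public int CountChildren(int parentId)
    {
        Queries++;
        return parentOf.Count(p => p.Value == parentId);
    }
}

public static class SortOrderDemo
{
    // Mirrors the pseudocode: fetch the ids, build every child, then take the length.
    public static int NextSortOrderViaChildren(FakeNodeStore store, int parentId)
    {
        List<int> ids = store.SelectChildIds(parentId);
        var children = new List<int>();
        foreach (int id in ids)
        {
            children.Add(store.LoadNode(id));
        }
        return children.Count + 1;
    }

    // Alternative for comparison: ask the store only for the count.
    public static int NextSortOrderViaCount(FakeNodeStore store, int parentId)
    {
        return store.CountChildren(parentId) + 1;
    }
}
```

Under these assumptions a parent with 300 children costs 301 round trips per new node on the first path and 1 on the second, while a parent with only 20 children costs 21 on the first path, which is why smaller folders help.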
Ok, thanks, that's really helpful information, but I still don't get it entirely. Fewer nodes in a particular folder, yes. Here is what I do: I have a node that's called Companies. Under that there are supposed to be some hundred company nodes as children. For my test I deleted all children so it's completely empty. After that I ran my function that does MakeNew ... Publish for each company. And even for the first created company node it takes 6 seconds for my Document.MakeNew().
So, even when the many-children-under-one-node problem is taken good care of, I guess I still might have a problem?
Ok, seeing is believing: I moved the Companies folder to the root and suddenly children are created like lightning (well, compared to before). I'll run my stuff at night and I wish you and the team the best of luck with the 4.1 version.
Cheers! / Jonas
Although it's probably way too late in your build to suggest this, I would say that you should look into moving the data out of Umbraco Documents and into a custom database.
With the work that was done for Umbraco 4.0 it's very easy to create a custom tree, or a complete custom application (or module, depending on what you call them) which could read from an external data source. There's a good amount of documentation on the wiki for doing so.
Ah, ok, it's not too late. Interesting, does that mean I can relatively easily create a tree with content that is presented as regular Umbraco nodes (searchable with xsltsearch and presented with nice URLs)?
Very cool indeed!
"In version 4, the structure is built by serializing a new object called XmlTree and XmlTreeNode. The developer does not have to worry about the serialization process, only how to create XmlTreeNode’s and adding them to the XmlTree."
http://our.umbraco.org/wiki/reference/api-cheatsheet/tree-api---to-create-custom-treesapplications
Does that mean that when I add a custom tree, it will be included in the Umbraco content XML document (and therefore searchable etc.)? And also, will it even be visible from within the UI? How will that work: will the items only be visible and not editable (unless I add edit functions myself)?
No, the data won't be in the Umbraco XML. The tree API is purely for creating a UI for data within the Umbraco admin section; the data isn't really Umbraco data, it's just made to look like it.
If you wanted it to be searchable you'd either have to write XSLT extensions which are called from the XSLT Search, or you would need to implement a custom indexer (Lucene, for example) which reads both sources in.
Also, they would be read-only unless you explicitly create a UI for it (well, they wouldn't even be read-only unless you create the read-only UI as well :P)
The problem I had with slow publishing was that my SMTP settings were wrong, and notifications stalled waiting for a non-existent server... http://our.umbraco.org/forum/using/ui-questions/3459-Publish-(all)-stops-at-646-out-of-658-pages-
:-)
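In case it helps anyone chasing the same symptom, a quick way to rule this out is to check whether the configured mail host accepts a TCP connection at all. The sketch below is a generic probe and has nothing Umbraco-specific in it. `SmtpProbe` is a hypothetical name, and the host and port you pass in are whatever your own mail settings contain.

```csharp
using System;
using System.Net.Sockets;

public static class SmtpProbe
{
    // Returns true if a TCP connection to host:port opens within the timeout.
    // It does not speak SMTP; it only shows whether something is listening.
    public static bool CanConnect(string host, int port, TimeSpan timeout)
    {
        try
        {
            using (var client = new TcpClient())
            {
                // ConnectAsync plus Wait gives an upper bound on how long we block.
                return client.ConnectAsync(host, port).Wait(timeout) && client.Connected;
            }
        }
        catch (AggregateException)
        {
            return false; // wraps the SocketException raised by the async connect
        }
        catch (SocketException)
        {
            return false; // name resolution or connection failure
        }
    }
}
```

Calling `SmtpProbe.CanConnect("mail.example.com", 25, TimeSpan.FromSeconds(3))` with your real host name gives a quick yes or no before you start timing publish operations.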