How does the Umbraco XML Document work with Multiple sites
Hi
I have a multisite implementation of Umbraco and I'm a little worried about the XML document. I mean the document that's exposed as currentPage: when I examine its root (by selecting the ancestor at level 1), it shows the XML for the whole site.
Now if I have a site with thousands of pages, I presume this XML document is going to be very big indeed. If I have multiple sites under one implementation, each with thousands of documents, when (if ever) will it become unmanageable?
How is this document stored on the server? Is it cached? Any thoughts on this would be helpful, as I'm doing a feasibility study and need this answered.
thanks,
Hi Carl,
The XML document itself is stored in data/umbraco.config (it's named .config so that IIS won't serve it, which prevents people from downloading it). Umbraco also keeps a copy cached in memory to make things quicker.
In general, it's only advised to set up related sites (usually for the same company) under one instance because, as you've seen, it is possible to access the content of all sites from within any given site.
I can't say I have worked on any really large sites yet, but there are a few posts around, where I'm sure people are handling tens of thousands of pages and Umbraco still holds up well.
What exactly is your worry with the XML doc? Just size? or privacy?
Matt
Hi Carl,
This post may also be of interest to you:
http://our.umbraco.org/forum/getting-started/installing-umbraco/4505-Umbraco-Nodes-plus-How-Many-is-Too-Many
Matt
We're running a multisite installation (32 sites at the moment) that contains about 35,000 nodes right now. The umbraco.config is 25MB and we have a lot of properties.
It's running really fast with no cache turned on. Just make sure that you write efficient queries (avoiding "//") and don't make too many recursive calls if they're not really needed.
Add umbDebugShowTrace=true to your querystring to see the timings of your XSLT files; it's very handy for finding out whether there are any performance problems.
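For the feasibility study, the 25MB / 35,000-node figures above can be turned into a rough size estimate. This is only a back-of-the-envelope sketch in Python: it assumes the file grows linearly with node count and that other sites carry a similar number of properties per node, which may well not hold. The function names are made up for illustration.

```python
def bytes_per_node(file_size_mb, node_count):
    """Average bytes of umbraco.config per published node."""
    return file_size_mb * 1024 * 1024 / node_count


def projected_size_mb(node_count, per_node_bytes):
    """Projected file size in MB, assuming linear growth (an assumption)."""
    return node_count * per_node_bytes / (1024 * 1024)


# Figures quoted in this thread: 25MB for about 35,000 nodes.
per_node = bytes_per_node(25, 35_000)
print(round(per_node))                                  # 749 bytes per node
print(round(projected_size_mb(100_000, per_node), 1))   # 71.4 MB at 100,000 nodes
```

So at roughly 750 bytes per node, even 100,000 nodes would land somewhere around 70MB on disk under these assumptions. The in-memory copy will usually be larger than the file, so treat this as a lower bound rather than a RAM estimate.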
Hi both
Matt, my main concern is size and performance really, but thanks for the resources - that's a lot of help
Sebastiaan, it's good to know about your site (which is an AMAZING looking site, by the way). Could you please give me a bit more information about what you mean by avoiding //? My understanding is that // will start the XPath query from the root node. Is this correct? Could you provide an example if it's not difficult?
Thanks again - I love Umbraco, and the community is what makes it magic. You guys rock
p.s - Matt - your desktop uploader is fantastic, just what I needed for this project as we'll be dealing with dealerships who want to upload lots of photos of their cars in one go. Perfect timing with the release
Carl: Thanks! It's been a long time in the works.
Yes, that's exactly what I meant; it seemed to be related to some performance issues I had early on. I switched to using this to get to the root node (in Umbraco 4.0.x; I'm not familiar with the 4.5 XML schema yet):
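The expression itself is not shown in this copy of the thread. As far as I know, the usual form in the legacy (pre-4.5) schema is something like $currentPage/ancestor-or-self::node[@level=1] instead of //node[@level=1], but that is an assumption and not necessarily what was posted. To illustrate why the difference matters, here is a small Python sketch (not Umbraco code) on a synthetic document: a "//"-style query tests every node in every site, while walking up from the current page touches only its ancestors. The element and attribute names mimic the legacy schema, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_document(site_count, pages_per_site):
    """Build a synthetic multisite tree: <root><node level=1><node level=2/>..."""
    root = ET.Element("root")
    for s in range(site_count):
        site = ET.SubElement(root, "node", {"id": f"site{s}", "level": "1"})
        for p in range(pages_per_site):
            ET.SubElement(site, "node", {"id": f"site{s}-page{p}", "level": "2"})
    return root


def build_parent_map(root):
    """ElementTree has no parent pointers, so record them once up front.
    (A real DOM already knows each node's parent.)"""
    return {child: parent for parent in root.iter() for child in parent}


def site_roots_by_scan(root):
    """Like //node[@level=1]: every node in the document is tested."""
    visited, matches = 0, []
    for node in root.iter("node"):
        visited += 1
        if node.get("level") == "1":
            matches.append(node)
    return matches, visited


def site_root_by_ancestors(page, parent_of):
    """Like ancestor-or-self::node[@level=1]: walk up from the current page."""
    visited, node = 1, page
    while node is not None and node.get("level") != "1":
        node = parent_of.get(node)
        visited += 1
    return node, visited


doc = build_document(32, 100)
parent_of = build_parent_map(doc)
current_page = doc.find(".//node[@id='site5-page42']")

matches, scanned = site_roots_by_scan(doc)
print(len(matches), scanned)      # 32 3232

site, walked = site_root_by_ancestors(current_page, parent_of)
print(site.get("id"), walked)     # site5 2
```

Note the second difference as well: the scan returns the roots of all 32 sites, whereas the ancestor walk returns only the root of the site the current page belongs to, which is usually what you want in a multisite setup.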
Hi Sebastiaan
Can I ask what server hardware you are using to host a site of this size, and what kind of traffic you are getting? I need to plan for deployment and am completely in the dark regarding the kind of hardware I'd need to throw at this. Are we talking load-balanced servers or single servers?
My current hosting environment is a clustered SQL server, which should be fine to use, and I would look at creating dedicated server(s) to host my new site, but I'm not sure how much to spend on them.
thanks for your help
We're using something called a "virtual private server", which is just a virtual machine that our webhosting provider has set up. I don't know what actual hardware the VM is running on, but we get a 2.5GHz processor with 1GB of RAM (Windows Server 2008 Web Edition SP2).
We run about 30 other sites on that same server, alongside this installation that has 32 sites in it.
For the launch last month we beefed up the server to 4 CPUs and 4GB of RAM. However, we've never had a load higher than 20%, even at the launch (about 15,000 visits on the first day, if I'm not mistaken).
It might be good to look at the time it takes to render a page in Umbraco. When you add umbDebugShowTrace=true to the querystring of your pages, you get a long list of timings, and the total .NET processing time is listed at the bottom. All of our pages stay below 0.2 seconds, and that is with caching completely turned off, so we could probably optimize further if it's really needed. Caching can eat memory though, so beware of too much per-page caching.
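If you want to check many pages this way, it helps to add the parameter programmatically so that pages which already have a querystring are handled correctly. A minimal Python sketch using only the standard library; the function name and the example.com URLs are placeholders, not anything from our setup:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_trace(url):
    """Return url with umbDebugShowTrace=true set exactly once in the querystring."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier umbDebugShowTrace value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "umbDebugShowTrace"
    ]
    query.append(("umbDebugShowTrace", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_trace("http://example.com/cars.aspx"))
# http://example.com/cars.aspx?umbDebugShowTrace=true
print(with_trace("http://example.com/cars.aspx?page=2"))
# http://example.com/cars.aspx?page=2&umbDebugShowTrace=true
```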
One other thing to keep in mind: I am performing a lot of AfterSave events (in the backend). Every time a content item is saved or published, I do see a CPU peak.
Working with multiple content editors used to be difficult, as the code for these events was not very optimized and a lot needed to happen automatically. Now, with this 4 CPU set-up, it's ideal: saving and publishing doesn't take a noticeable amount of time any more.
And one last thing: we did load testing (before and after the upgrade). Before the upgrade, at about 50 page refreshes per second (without browser cache, so a LOT of requests, as we have many small images that need to be fetched as well), the server started to max out on CPU usage, mainly due to the processing of the many SQL queries. After the upgrade, it started to max out at about 150 page refreshes per second. This second time the site was still relatively fast, so it could take some more load, but I didn't have the time to fully max it out.
All in all I have been very, very pleased and surprised by the amazing performance; it is much better than I could've hoped for.
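For hardware planning, the load-test numbers above can be compared with expected traffic. The Python sketch below is rough arithmetic only: the 15,000 visits per day and 150 page refreshes per second come from this thread, but the pages-per-visit and peak-factor values are pure assumptions that you should replace with your own figures.

```python
SECONDS_PER_DAY = 86_400


def average_page_rate(visits_per_day, pages_per_visit):
    """Average page requests per second over a whole day."""
    return visits_per_day * pages_per_visit / SECONDS_PER_DAY


def headroom(capacity_per_second, average_rate, peak_factor):
    """How many times the assumed peak load fits into the measured capacity."""
    return capacity_per_second / (average_rate * peak_factor)


# 5 pages per visit and a 10x peak-over-average factor are assumptions.
rate = average_page_rate(15_000, 5)
print(round(rate, 2))                    # 0.87 pages per second on average
print(round(headroom(150, rate, 10)))    # 17 times the assumed peak
```

Under these assumptions a single server of this size has plenty of headroom, which matches the load never going above 20% at launch.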