
  • Ben Ellis 5 posts 35 karma points
    Apr 09, 2015 @ 18:07
    Ben Ellis
    0

    Scalability and Performance

    I'm sure Umbraco scales well, since it has stood the test of time, but I'd like to know whether the node hierarchy suffers performance penalties when the tree holds a large amount of content, and if not, why not.

    The database schema stores the hierarchy as a parentId column with a non-clustered index. I've seen this approach before, and it has always performed badly, particularly for deeply nested trees. Is Umbraco using a trick to avoid this issue? Or are content trees typically not deep or big enough for this to be a problem?

    If users have experienced problems with this, I wouldn't mind contributing some alternative ways of storing hierarchy relations as I've had some success with dealing with this in legacy systems in the past.
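
    To make the concern concrete, here is a minimal sketch comparing a parentId walk with one common alternative, a stored "materialized path" per node. The table, column names and sample rows are hypothetical and are not Umbraco's actual schema; SQLite is used only so the example is self-contained.

```python
import sqlite3

# Hypothetical table for illustration only; NOT Umbraco's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER, path TEXT, name TEXT)"
)
conn.execute("CREATE INDEX ix_node_parent ON node (parent_id)")
conn.execute("CREATE INDEX ix_node_path ON node (path)")

# path holds the ids from the root down to the node itself.
rows = [
    (1, None, "1", "Home"),
    (2, 1, "1,2", "Products"),
    (3, 2, "1,2,3", "Widgets"),
    (4, 3, "1,2,3,4", "Blue widget"),
    (5, 1, "1,5", "About"),
]
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)


def descendants_by_parent_id(root_id):
    """Walk the tree one level at a time through parent_id (recursive CTE)."""
    sql = """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM node WHERE parent_id = ?
            UNION ALL
            SELECT n.id FROM node n JOIN subtree s ON n.parent_id = s.id
        )
        SELECT id FROM subtree ORDER BY id
    """
    return [r[0] for r in conn.execute(sql, (root_id,))]


def descendants_by_path(root_id):
    """Fetch the same subtree with a single prefix match on the stored path."""
    (root_path,) = conn.execute(
        "SELECT path FROM node WHERE id = ?", (root_id,)
    ).fetchone()
    sql = "SELECT id FROM node WHERE path LIKE ? ORDER BY id"
    return [r[0] for r in conn.execute(sql, (root_path + ",%",))]


print(descendants_by_parent_id(2), descendants_by_path(2))
```

    Both functions return the same ids. The difference is that the parentId version needs one join step per tree level, while the path version is a single filter, at the cost of having to rewrite the path of every descendant when a node is moved.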

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 09, 2015 @ 19:41
    Nicholas Westby
    0

    When you publish content, it gets stored to an XML cache file in "App_Data\umbraco.config" (though, that path is configurable, so may not be the same in some installations).

    Typically, that is where content comes from when rendering pages (rather than hitting the database). Also, the content service is a further abstraction that allows content to be stored in memory, which offers a further speed improvement.

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 09, 2015 @ 19:43
    Nicholas Westby
    0

    Correction: the content service hits the database. It's the UmbracoHelper class that gets data from the XML file.

  • Ben Ellis 5 posts 35 karma points
    Apr 10, 2015 @ 22:10
    Ben Ellis
    0

    So if there is a large amount of content it's all stored in memory? I'm assuming media and large binary data isn't stored in memory though?

    I saw a couple of posts on the forum about too much memory being used. Is this likely due to incorrect usage of Umbraco?

     

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 11, 2015 @ 05:04
    Nicholas Westby
    0

    Yep, all content [in the content tree] is stored in memory (that includes content like rich text and booleans). Large binary data (media) is stored on the file system (i.e., it's not loaded into memory, aside from when IIS serves it up to the client).

    I have posted about "too much" memory being used myself, actually (see https://our.umbraco.org/forum/core/general/61448-Memory-Climbs-2GB-in-11-Hours-in-Umbraco-718 ). Turns out it's just because the .NET garbage collector (running under IIS) intentionally avoids collecting when it doesn't absolutely have to (from what I read, this is for performance reasons). So it's not really anything to do with Umbraco.

    For what it's worth, I remember hearing somebody from the Umbraco core team saying that Umbraco should be able to scale up to around 250,000 content nodes. No idea where they got that from, but it should at least be a good indicator of the order of magnitude of data Umbraco can handle.

  • Giorgio 6 posts 46 karma points
    Apr 12, 2015 @ 17:08
    Giorgio
    0

    How big will the umbraco.config be with 250,000 nodes?

    From estimates I've done of the data that might be added to our website, we'll have 120,000 nodes, and our umbraco.config might reach 1.7 GB.

    Will it survive?
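
    For a rough sense of scale, the figures above can be turned into a per-node size and extrapolated linearly. This is only a back-of-the-envelope sketch: it assumes 1 GB = 10^9 bytes and that file size grows in proportion to node count, which real content may not do.

```python
GB = 1e9  # assumption: decimal gigabytes


def bytes_per_node(total_bytes, node_count):
    """Average size of one node in the cache file."""
    return total_bytes / node_count


def projected_size_gb(total_bytes, node_count, target_nodes):
    """Linear extrapolation of the file size to a different node count."""
    return bytes_per_node(total_bytes, node_count) * target_nodes / GB


# Figures from the estimate above: 120,000 nodes in about 1.7 GB.
per_node = bytes_per_node(1.7 * GB, 120_000)
at_250k = projected_size_gb(1.7 * GB, 120_000, 250_000)

print(f"{per_node / 1000:.1f} KB per node")
print(f"{at_250k:.2f} GB at 250,000 nodes")
```

    That works out to roughly 14 KB per node, and about 3.5 GB if the same content profile were scaled to 250,000 nodes.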

  • Alex Skrypnyk 5833 posts 22225 karma points MVP 4x admin c-trib
    Apr 12, 2015 @ 18:00
    Alex Skrypnyk
    0

    Giorgio, it depends on your site. If it has good caching and good code, it won't be a problem. Try using [OutputCache(Duration = 60)]; it will make the site faster and put less load on the server.

    http://stefantsov.com/2014/march/umbraco-7-mvc-performance#.VSqVZvmsXs4

    Another great helper is @Html.CachedPartial(). We use it often; it can be dynamic and is very flexible.

    Thanks

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 13, 2015 @ 03:08
    Nicholas Westby
    0

    I have never worked on a site with that many nodes, so I wouldn't be able to guess what problems you might run into. You could always test it if you like (e.g., by using the ContentService to create a bunch of nodes and see what happens).

    If you do have that many nodes, it's probably time to start thinking about storing some of that content in a database rather than in Umbraco. For example, at a past job, we stored all of our "product" information in a database (with tens of thousands of products and multiple languages and regions, products accounted for the vast majority of pages on the site).

    Something to keep in mind with a site that has that many nodes and an umbraco.config file that large is that the file gets regenerated on each publish operation. Having to write 1.7GB to disk on every publish seems like it'd be a very expensive operation. One way to get around that (that I haven't tried) is to set "ContinouslyUpdateXmlDiskCache" to false in umbracoSettings.config: https://our.umbraco.org/documentation/Using-Umbraco/Config-files/umbracoSettings/

  • Giorgio 6 posts 46 karma points
    Apr 13, 2015 @ 10:04
    Giorgio
    0

    Alex, we already have all of those caching mechanisms in place, but I'm worried about eventually having to parse a very large umbraco.config.

     

    Nicholas, one of the dilemmas we're facing is the tradeoff between parsing a large umbraco.config and creating connections to the database.

    In your experience, was creating the connections an expensive and time-consuming task, or a relatively swift one?

    If I may ask, what kind of database were you using? And was it local or located on another server?

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 13, 2015 @ 21:47
    Nicholas Westby
    0

    In my case, the database was the "cheaper" option (in terms of time to implement) because all of our data was already in a database (SQL Server). The database was on a different server in the same network. Keep in mind this product database was entirely separate from the Umbraco database. The database didn't significantly slow down page load times.

    The tricky bit would be to create an interface in Umbraco to edit that data. We already had a tool to edit product data, so we didn't need to build anything in Umbraco.

    With a project of the size you are talking about, however, I'd expect you would have the budget to build such an interface.

  • Wing 17 posts 39 karma points
    Apr 21, 2015 @ 10:31
    Wing
    0

    For minor custom database management, DEWD is an excellent package: https://our.umbraco.org/projects/developer-tools/dewd

    Easy to create overviews + edit data.

    Note: it's only for v6 right now. The developer seems to be inactive.

  • MuirisOG 378 posts 1277 karma points
    Apr 21, 2015 @ 11:10
    MuirisOG
    0

    Is it necessary to keep umbraco.config when you consider how fast data can be delivered from a database?

    Data extracted from databases (e.g. events, press releases, product databases) can be cached using .NET controls, and even caching for a small time period can improve speed.
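
    The "cache for a small time period" idea is independent of .NET, so here is a minimal language-neutral sketch of it in Python. The class name, the 60-second lifetime and the loader function are all made up for illustration; a real site would use whatever caching facility its platform provides.

```python
import time


class TtlCache:
    """Keeps each loaded value for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry can be tested
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, calling loader() only when it has expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()  # the expensive call, e.g. a database query
        self._entries[key] = (now + self._ttl, value)
        return value


def load_press_releases():
    # Hypothetical stand-in for a real database query.
    return ["Release A", "Release B"]


cache = TtlCache(ttl_seconds=60)
first = cache.get_or_load("press-releases", load_press_releases)
second = cache.get_or_load("press-releases", load_press_releases)  # served from memory
```

    Even a short lifetime like this means a busy page triggers at most one query per key per minute, whatever the traffic.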

    Our existing CMS solution uses SQL Server and we haven't had any performance issues when getting web content data from the database. (I hope I haven't jinxed everything!)

    It just seems like unnecessary duplication to me.

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 21, 2015 @ 16:23
    Nicholas Westby
    0

    DEWD looks neat; thanks for that.

    Umbraco has seemed really slow to me when accessing content directly from the database. However, that may have been on something like Azure, where the database was on a different machine. If the database is on the same machine, it might be fast.

  • Alex Skrypnyk 5833 posts 22225 karma points MVP 4x admin c-trib
    Apr 21, 2015 @ 16:57
    Alex Skrypnyk
    1

    Hi Nicholas,

    Have you tried using Examine? If you create your own index, you won't need to get data from the database or umbraco.config. Take a look at how https://our.umbraco.org/projects/website-utilities/ezsearch works. You can do the search and traversal logic in Lucene and then render from the XML cache or the database; it will be faster.

    Thanks, Alex

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 21, 2015 @ 18:04
    Nicholas Westby
    0

    I haven't used Examine for the purpose of getting content (I've only used it for general site searches), though that might work for scenarios that require large amounts of data and fast performance (at the cost of extra code and configuration complexity). Good tip, thanks.

  • MuirisOG 378 posts 1277 karma points
    Apr 24, 2015 @ 11:38
    MuirisOG
    0

    I'm in the process of migrating our old site to Umbraco.

    The existing site has about 6,500 pages (the site belongs to a local authority) and after my first import, the umbraco.config file grew to about 29.5MB.

    I killed the back-office after the initial import because every page was under the root.

    After phase 2, i.e. moving pages back into the tree-structure that came with the old CMS, the back-office is back to normal.

    It's early days, but I am already looking at ideas to improve performance.

    For instance, after the import, I had to move the site homepage to the root of the site, which in turn seems to have dragged everything else along with it. This one move has taken 40 minutes and is still running, updating the umbraco.config file as it goes (although this will be a once-off move).

    I'll look at changing the setting for 'ContinouslyUpdateXmlDiskCache' before my next run.

  • Alex Skrypnyk 5833 posts 22225 karma points MVP 4x admin c-trib
    Apr 24, 2015 @ 11:41
    Alex Skrypnyk
    0

    Hi MuirisOG,

    Yes, I think you have to disable ContinouslyUpdateXmlDiskCache. It's a good feature for small sites, but not for one like yours. What version of Umbraco are you using? Could you show more of your settings? Do you have an SSD on the server?

    Thanks

  • MuirisOG 378 posts 1277 karma points
    Apr 24, 2015 @ 12:17
    MuirisOG
    0

    At the moment, I'm using

    • Umbraco 7.2.4 (but in a position to upgrade easily as we're not live yet)
    • SQL Server 2012 (Express for now, but will upgrade once the imported site is fully functional)
    • IIS 8 on Windows Server 2012 (on a virtual environment)

    I'm not sure what you mean by SSD.

  • Nicholas Westby 1985 posts 6777 karma points c-trib
    Apr 24, 2015 @ 16:10
    Nicholas Westby
    0

    An SSD is a type of storage drive that is much faster than a traditional hard drive: http://en.wikipedia.org/wiki/Solid-state_drive

    By the way, if you are manipulating a lot of content programmatically, you may also consider disabling the Examine indexing, then re-enabling it and performing a reindex when you're all done. Keep in mind that media may not work while the index is disabled.
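
    The general pattern here (switch off per-item indexing, do the bulk work, then rebuild once) can be sketched like this. ToyIndex and the functions around it are hypothetical and have nothing to do with Examine's real API; they only show the shape of the idea.

```python
from contextlib import contextmanager


class ToyIndex:
    """Stand-in for a search index that normally updates on every save."""

    def __init__(self):
        self.enabled = True
        self.documents = {}
        self.index_operations = 0

    def on_saved(self, store, node_id):
        """Per-item update; skipped while indexing is disabled."""
        if self.enabled:
            self.documents[node_id] = store[node_id]
            self.index_operations += 1

    def rebuild(self, store):
        """Full reindex from the content store in one pass."""
        self.documents = dict(store)
        self.index_operations += 1


@contextmanager
def indexing_suspended(index, store):
    """Disable per-item indexing, then re-enable and rebuild once at the end."""
    index.enabled = False
    try:
        yield
    finally:
        index.enabled = True
        index.rebuild(store)


def bulk_import(store, index, items):
    with indexing_suspended(index, store):
        for node_id, content in items:
            store[node_id] = content
            index.on_saved(store, node_id)  # no-op while suspended


store, index = {}, ToyIndex()
bulk_import(store, index, [(i, f"page {i}") for i in range(1000)])
print(index.index_operations)
```

    The try/finally matters: if the import fails halfway, indexing is still switched back on and the index is rebuilt from whatever was saved.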

  • Dan White 206 posts 510 karma points c-trib
    Aug 04, 2015 @ 22:48
    Dan White
    0

    If I remember correctly, the umbraco.config is only used for filling the in-memory cache; it is not used for serving up the front-end. Having it does slow down publish times on very large sites. You can disable it by setting ContinouslyUpdateXmlDiskCache to false.

    I think the trade-off is slower load times after an app pool restart, since Umbraco hits the database instead of the XML file to replenish the cache.
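
    Here is a toy model of that trade-off, as I understand it. It is a sketch under stated assumptions, not a description of Umbraco's internals: the ContentCache class, its JSON snapshot file and the load_from_database callback are all invented for the example.

```python
import json
import os
import tempfile


class ContentCache:
    """In-memory cache with an optional on-disk snapshot (toy model only)."""

    def __init__(self, snapshot_path, load_from_database, write_snapshot_on_publish=True):
        self.snapshot_path = snapshot_path
        self.write_snapshot_on_publish = write_snapshot_on_publish
        self.disk_writes = 0
        if os.path.exists(snapshot_path):
            # Warm start: fill memory from the file, no database round trip.
            with open(snapshot_path, encoding="utf-8") as f:
                self.nodes = json.load(f)
            self.startup_source = "snapshot"
        else:
            # Cold start: rebuild everything from the database.
            self.nodes = load_from_database()
            self.startup_source = "database"

    def publish(self, node_id, content):
        """Update memory; rewrite the whole snapshot only if enabled."""
        self.nodes[str(node_id)] = content
        if self.write_snapshot_on_publish:
            with open(self.snapshot_path, "w", encoding="utf-8") as f:
                json.dump(self.nodes, f)
            self.disk_writes += 1


def fake_database():
    # Hypothetical stand-in for loading all published content.
    return {"1": "Home"}


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "snapshot.json")
    cache = ContentCache(path, fake_database, write_snapshot_on_publish=False)
    cache.publish(2, "About")  # fast: memory only, nothing written to disk
    restarted = ContentCache(path, fake_database, write_snapshot_on_publish=False)
    print(cache.disk_writes, restarted.startup_source)
```

    With snapshot writes on, every publish pays for rewriting the whole file but a restart can load from it; with them off, publishes are cheap and the restart has to go back to the database.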
