
  • Nicolae Laslo 1 post 71 karma points
    May 07, 2021 @ 08:33

    Accommodating large number of items

    Hi

    Are there any limitations or performance issues when hosting a large number of items in the backoffice?

    I'm talking about up to 1 million items.

    We want to recommend Umbraco to a customer who has a large number of items / pages.

    Thanks very much

    Nicolae

  • Marc Goodson 1529 posts 10230 karma points MVP 6x c-trib
    May 08, 2021 @ 10:14

    Hi Nicolae

    Yes, no, it depends!

    Lots of different factors will combine, e.g. hosting, site traffic, and the nature of the million items (news articles, or something else).

    Also, do the million items all need to be editable by editors on a day-to-day basis, or is there the concept of an 'archive' of articles that need to be published but perhaps aren't ever edited again?

    You can see @saif posted about this a while back:

    https://our.umbraco.com/forum/extending-umbraco-and-using-the-api/98791-how-to-handle-and-archive-large-amount-of-news-and-still-be-able-to-get-them-umbraco-8

    It might be worth following up on that thread and asking him what he ended up implementing.

    In V7, I worked on a similar-sized site, which was based on news articles (it had articles going back to 1997, and was introducing new articles at a rate of 30-40 a week!).

    We implemented an 'archived content' mechanism:

    https://moriyama.co.uk/about-us/news/blog-the-need-for-archived-content-in-umbraco-and-how-to-do-it/

    where only recent content was 'published' and older content was served from a 'database' archive table, keeping the Umbraco published cache small, but making all articles 'still available', and using Azure Search across 'everything' to make it searchable.
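
To make the archive idea concrete, here is a minimal sketch of the lookup order it implies: check the small published set first, then fall back to the archive table. It is not the Moriyama implementation and does not use any Umbraco API. The `archive` table, its columns, and the `get_article` helper are all hypothetical names chosen for illustration, and SQLite stands in for whatever database holds the archive.

```python
import sqlite3


def make_archive(rows):
    """Build an in-memory stand-in for the archive database table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE archive (url TEXT PRIMARY KEY, body TEXT)")
    db.executemany("INSERT INTO archive VALUES (?, ?)", rows)
    return db


def get_article(url, published, archive_db):
    """Return (source, body), checking the small published cache before the archive."""
    if url in published:
        return ("published", published[url])
    row = archive_db.execute(
        "SELECT body FROM archive WHERE url = ?", (url,)
    ).fetchone()
    if row is not None:
        return ("archive", row[0])
    return None  # not found anywhere: the caller would serve a 404


# Recent content lives in the (small) published cache; old content in the archive.
published = {"/news/2021/launch": "Recent article"}
archive_db = make_archive([("/news/1997/first", "Old article")])
```

The point of the ordering is that the published cache only ever holds recent items, so its size stays bounded no matter how far back the archive goes.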

    In V8, the published cache was re-architected to move away from a big XML file stored in memory, and in theory it can perform much better if the server has the memory. However, it's much harder to perform this 'archive' trick (as unpublished content is also stored in the published cache!).

    Anyway, in your situation, I'd probably recommend a quick development spike. Create a new V8 install and use the ContentService to populate it with generic example content loosely based on the site you are planning. Create 100,000 content items, publish the site to your planned hosting environment, and then load test for the traffic you are expecting. If all is good, add another 100,000 and repeat. You'll then be able to report back to the client the point at which problems begin to occur.

    If you are using Azure or a similar cloud instance for hosting, you can then scale up the environments and try again, and get an idea of the hosting costs for this kind of site. Additionally, you'll be able to test some of the observations Saif made about 'changing document type properties' when there is lots of content, to determine how viable this amount of content is in a vanilla Umbraco, and whether or not you'd need to pull a similar trick to the archive one above.
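
The add-then-load-test loop described above could be driven by a small script along these lines. This is only a sketch of the procedure: `add_items` and `load_test` are hypothetical callables you would supply (for example, one that triggers a ContentService import and one that runs your load-testing tool and returns a latency in milliseconds), and the 500 ms budget is an arbitrary placeholder.

```python
def find_breaking_point(add_items, load_test, step=100_000,
                        limit=1_000_000, budget_ms=500.0):
    """Grow the site in batches and load test after each batch.

    Returns (item_count_where_budget_was_exceeded_or_None, all_measurements).
    """
    total = 0
    results = []
    while total < limit:
        add_items(step)          # hypothetical: create and publish `step` items
        total += step
        latency = load_test()    # hypothetical: measured latency in ms
        results.append((total, latency))
        if latency > budget_ms:
            return total, results
    return None, results


class FakeSite:
    """Stand-in for a real install, with latency growing linearly (made-up model)."""

    def __init__(self):
        self.count = 0

    def add_items(self, n):
        self.count += n

    def load_test(self):
        return self.count / 1000.0  # 1 ms per 1,000 items, purely illustrative


site = FakeSite()
breaking_point, measurements = find_breaking_point(site.add_items, site.load_test)
```

With real measurements plugged in, the list of `(item_count, latency)` pairs is what you would report back to the client, and you would rerun the loop after scaling the hosting up.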

    regards

    marc
