  • shinsuke nakayama 109 posts 250 karma points
    Sep 18, 2018 @ 02:01
    shinsuke nakayama
    0

    Testing with millions of pages - optimising the backoffice

    Hi guys

    I'm currently prototyping Umbraco CMS for a site with potentially millions of active pages. (The prototype currently has 400,000 pages and has already hit resource issues.)

    I have already set XmlCacheEnabled to False, since we will have a third-party caching server in front of Umbraco.

    Using Umbraco 7.12.2

    I have a site structure that looks like this

    • page 1
      • Page 1.1
        • Page 1.1.1
        • Page 1.1.2
        • Page 1.1.3
        • Page ...
      • Page 1.2
      • Page 1.3
      • ...
    • page 2
    • ...

    Page 1.1 contains 9,000 descendant pages (sub, sub-sub, and sub-sub-sub pages).

    I've noticed that in the backoffice, if I click on Page 1.1, it loads all the sub pages into memory. As a result, it loads really slowly when I click on "Page 1.1" (memory usage goes up to 4,000-5,000 MB), but loads fast when I click on the sub pages. (This is an issue when I log in as Admin and click on the Home node or a second-level node.)

    Is there any way of not loading the sub pages when I click on the parent node or any other page?

    Thank you

    Shinsuke

  • Matthew Wise 271 posts 1373 karma points MVP 4x c-trib
    Sep 18, 2018 @ 07:18
    Matthew Wise
    1

    Hi Shinsuke,

    You can set the parent node to be a list view, which means it no longer expands in the tree but instead shows its children in a paged list view.

    You would then need to add a list view property to the child pages as well, so you can see their children too.

    Hope this helps

    Matt

  • shinsuke nakayama 109 posts 250 karma points
    Sep 18, 2018 @ 07:23
    shinsuke nakayama
    0

    Hi Matt,

    Thank you for the reply.

    That's a good idea, let me give it a try. I'll have to create new Document Types and convert the existing pages, so it might take some time, but I'll keep this thread updated.

    Cheers

  • shinsuke nakayama 109 posts 250 karma points
    Sep 20, 2018 @ 03:41
    shinsuke nakayama
    0

    Hi Matt,

    I had to re-create the website with 400,000 pages, because the other one stopped responding for some reason and I couldn't debug it.

    Anyhow, using the list view seems to have worked for the backoffice. However, I have the same issue with the front end (client facing). When I load the home page for the first time, I have to wait a long time and then I get an error message.

    IIS Worker Process: 3,075.8 MB

    SQL Server: Around 3,000 MB

    This is a log:

     2018-09-20 13:28:28,631 [P2212/D2/T1] INFO  Umbraco.Core.CoreBootManager - Umbraco 7.12.2 application starting on EC2AMAZ-KPL4I55
     2018-09-20 13:28:28,644 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Determining hash of code files on disk
     2018-09-20 13:28:28,653 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Hash determined (took 9ms)
     2018-09-20 13:28:28,706 [P2212/D2/T1] INFO  Umbraco.Core.MainDom - Acquiring MainDom...
     2018-09-20 13:28:28,706 [P2212/D2/T1] INFO  Umbraco.Core.MainDom - Acquired MainDom.
     2018-09-20 13:28:28,709 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
     2018-09-20 13:28:28,754 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 44ms)
     2018-09-20 13:28:28,754 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IApplicationStartupHandler
     2018-09-20 13:28:28,755 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IApplicationStartupHandler (took 1ms)
     2018-09-20 13:28:28,785 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
     2018-09-20 13:28:28,785 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 0ms)
     2018-09-20 13:28:28,786 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Core.PropertyEditors.IPropertyEditorValueConverter
     2018-09-20 13:28:28,786 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Core.PropertyEditors.IPropertyEditorValueConverter (took 0ms)
     2018-09-20 13:28:28,786 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
     2018-09-20 13:28:28,786 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 0ms)
     2018-09-20 13:28:28,787 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Core.PropertyEditors.IPropertyValueConverter
     2018-09-20 13:28:28,789 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Core.PropertyEditors.IPropertyValueConverter (took 1ms)
     2018-09-20 13:28:28,795 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
     2018-09-20 13:28:28,795 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 0ms)
     2018-09-20 13:28:28,795 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Web.Mvc.SurfaceController
     2018-09-20 13:28:28,796 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Web.Mvc.SurfaceController (took 0ms)
     2018-09-20 13:28:28,796 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving umbraco.interfaces.IDiscoverable
     2018-09-20 13:28:28,796 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved umbraco.interfaces.IDiscoverable (took 0ms)
     2018-09-20 13:28:28,796 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Web.WebApi.UmbracoApiController
     2018-09-20 13:28:28,797 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Web.WebApi.UmbracoApiController (took 0ms)
     2018-09-20 13:28:30,399 [P2212/D2/T1] INFO  Umbraco.Core.DatabaseContext - CanConnect = True
     2018-09-20 13:28:30,525 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolving Umbraco.Core.Models.PublishedContent.PublishedContentModel
     2018-09-20 13:28:30,528 [P2212/D2/T1] INFO  Umbraco.Core.PluginManager - Resolved Umbraco.Core.Models.PublishedContent.PublishedContentModel (took 2ms)
     2018-09-20 13:28:30,589 [P2212/D2/T1] INFO  Umbraco.Web.Cache.CacheRefresherEventHandler - Initializing Umbraco internal event handlers for cache refreshing
     2018-09-20 13:28:30,707 [P2212/D2/T1] INFO  Umbraco.Web.Search.ExamineEvents - Initializing Examine and binding to business logic events
     2018-09-20 13:28:30,707 [P2212/D2/T1] INFO  Umbraco.Web.Search.ExamineEvents - Adding examine event handlers for index providers: 3
     2018-09-20 13:28:30,747 [P2212/D2/T1] INFO  Umbraco.Core.CoreBootManager - Umbraco application startup complete (took 2193ms)
     2018-09-20 13:28:31,075 [P2212/D2/T10] INFO  Umbraco.Core.Sync.ApplicationUrlHelper - New ApplicationUrl detected: http://##.##.##.###:8011/umbraco
     2018-09-20 13:28:31,075 [P2212/D2/T10] INFO  Umbraco.Core.Sync.ApplicationUrlHelper - ApplicationUrl: http://##.##.##.###:8011/umbraco (UmbracoModule request)
     2018-09-20 13:28:31,399 [P2212/D2/T10] INFO  umbraco.content - Loading content from database...
     2018-09-20 13:30:28,468 [P2212/D2/T10] ERROR Umbraco.Core.Persistence.UmbracoDatabase - Exception (06ac2b31).
    The thread has been aborted, because the request has timed out.
    System.Threading.ThreadAbortException: Thread was being aborted.
       at SNIReadSyncOverAsync(SNI_ConnWrapper* , SNI_Packet** , Int32 )
       at SNINativeMethodWrapper.SNIReadSyncOverAsync(SafeHandle pConn, IntPtr& packet, Int32 timeout)
       at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
       at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
       at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
       at System.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Byte[] buff, Int32 offset, Int32 len, Int32& totalRead)
       at System.Data.SqlClient.TdsParserStateObject.TryReadString(Int32 length, String& value)
       at System.Data.SqlClient.TdsParser.TryReadSqlStringValue(SqlBuffer value, Byte type, Int32 length, Encoding encoding, Boolean isPlp, TdsParserStateObject stateObj)
       at System.Data.SqlClient.TdsParser.TryReadSqlValue(SqlBuffer value, SqlMetaDataPriv md, Int32 length, TdsParserStateObject stateObj, SqlCommandColumnEncryptionSetting columnEncryptionOverride, String columnName)
       at System.Data.SqlClient.SqlDataReader.TryReadColumnInternal(Int32 i, Boolean readHeaderOnly)
       at System.Data.SqlClient.SqlDataReader.TryReadColumn(Int32 i, Boolean setTimeout, Boolean allowPartiallyReadColumn)
       at System.Data.SqlClient.SqlDataReader.GetValueInternal(Int32 i)
       at System.Data.SqlClient.SqlDataReader.GetValue(Int32 i)
       at petapoco_factory_7(IDataReader )
       at Umbraco.Core.Persistence.Database.<Query>d__74`1.MoveNext()
     2018-09-20 13:30:28,474 [P2212/D2/T10] ERROR umbraco.content - Error Republishing
    System.InvalidOperationException: Internal connection fatal error. Error state: 15, Token : 115
       at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
       at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
       at System.Data.SqlClient.TdsParser.DrainData(TdsParserStateObject stateObj)
       at System.Data.SqlClient.SqlInternalConnectionTds.ValidateConnectionForExecute(SqlCommand command)
       at System.Data.SqlClient.SqlInternalTransaction.Rollback()
       at System.Data.SqlClient.SqlTransaction.Rollback()
       at Umbraco.Core.Persistence.Database.CleanupTransaction()
       at Umbraco.Core.Persistence.Database.AbortTransaction()
       at Umbraco.Core.Scoping.Scope.DisposeLastScope()
       at Umbraco.Core.Scoping.Scope.Dispose()
       at Umbraco.Core.Persistence.UnitOfWork.ScopeUnitOfWork.DisposeResources()
       at Umbraco.Core.DisposableObjectSlim.Dispose(Boolean disposing)
       at Umbraco.Core.DisposableObjectSlim.Dispose()
       at Umbraco.Core.Services.ContentService.BuildXmlCache()
       at umbraco.content.LoadContentFromDatabase()
    

    Is there any way to not load all the pages on the client-facing side? That is, to retrieve the data from the database instead of caching it in memory.

    Thank you

    Shinsuke

  • Dave Woestenborghs 3504 posts 12133 karma points MVP 8x admin c-trib
    Sep 20, 2018 @ 06:26
    Dave Woestenborghs
    0

    Hi Shinsuke,

    This is probably because you disabled the XML cache.

    Normally Umbraco loads content for the front end from the XML cache.

    If that is not active, it needs to hit the database.

    Could you try with the XML cache on?

    Dave

  • shinsuke nakayama 109 posts 250 karma points
    Sep 20, 2018 @ 21:07
    shinsuke nakayama
    0

    Hi Dave,

    The reason I disabled the XML cache was that the file size was reaching 3 GB and saving content was really slow. (This might have been because I wasn't using list views at the time.)

    Let me try turning it on again and republishing.

    However, is there any way of retrieving only the data for the requested page, so that hitting the home page doesn't load all the pages?

    Does Umbraco have a limit on how many pages it can publish?

  • Dave Woestenborghs 3504 posts 12133 karma points MVP 8x admin c-trib
    Sep 21, 2018 @ 07:34
    Dave Woestenborghs
    0

    Can you post the code of your homepage?

    Maybe something in there is causing the slowdown.

    Dave

  • shinsuke nakayama 109 posts 250 karma points
    Sep 21, 2018 @ 08:28
    shinsuke nakayama
    0

    Hi Dave,

    I just recreated the servers (web and DB) to mimic the production environment, and I'm running the script now to populate the pages. I'll probably have to leave the script running overnight.

    The code is straight out of the box; I haven't done any coding yet. I'm just testing the page capacity. My homepage cshtml looks like this, and no controller is used.

    @inherits Umbraco.Web.Mvc.UmbracoTemplatePage<ContentModels.Home>
    @using ContentModels = Umbraco.Web.PublishedContentModels;
    @{
        Layout = null;
    }
    <h1>Home</h1>
    

    However, what I'll be doing in the future is hijacking the route with a controller and returning JSON data. The front end will then use that data to render the page.
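
    A minimal sketch of the route hijacking described above, assuming Umbraco 7's RenderMvcController convention (a controller named after the document type alias, here assumed to be "Home"). The Demo.Controllers namespace and the shape of the JSON payload are made up for illustration. Note that model.Content is still served from Umbraco's published content cache, so this by itself would not be expected to change how much content gets loaded at startup.

    ```csharp
    using System.Web.Mvc;
    using Umbraco.Web.Models;
    using Umbraco.Web.Mvc;

    namespace Demo.Controllers // hypothetical namespace
    {
        // Umbraco 7 routes requests for the "Home" document type to this class
        // because it is named <DocTypeAlias>Controller and inherits RenderMvcController.
        public class HomeController : RenderMvcController
        {
            public override ActionResult Index(RenderModel model)
            {
                // IPublishedContent for the requested page only.
                var page = model.Content;

                // Hypothetical payload: only this page's own fields are read,
                // with no Children or Descendants traversal.
                var payload = new
                {
                    id = page.Id,
                    name = page.Name,
                    url = page.Url,
                    updated = page.UpdateDate
                };

                // AllowGet is needed because MVC blocks JSON on GET by default.
                return Json(payload, JsonRequestBehavior.AllowGet);
            }
        }
    }
    ```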

    I'll make this demo site publicly accessible, once I finish populating the pages.

  • shinsuke nakayama 109 posts 250 karma points
    Sep 24, 2018 @ 02:05
    shinsuke nakayama
    0

    Hi guys,

    Update on this: I have created 2 websites pointing to the same database for testing purposes.

    1. XmlCacheEnabled - False: On the first website I turned off the cache because the umbraco.config file was growing past 2 GB. However, it seems all the pages were then held in memory (it used 6 GB of memory and then hung), so this is probably not a good option.

    2. XmlCacheEnabled - True: On the second website I generated 1,500,000 pages, but umbraco.config was only 200 MB. When I looked into the file, a lot of content was missing. I went into the backoffice and republished some of the pages, but that didn't update umbraco.config. When I looked at the page's URL, it said "This document is published but is not in the cache". I then ran "/Umbraco/dialogs/republish.aspx?xml=true" and left it for 8 hours, but the system hung and I couldn't view the website. CPU usage was at 0%.

    I'm thinking of writing a controller that goes through all the pages and publishes them one by one.

    However, would I need to do this every time?

  • shinsuke nakayama 109 posts 250 karma points
    Sep 25, 2018 @ 07:45
    shinsuke nakayama
    0

    Hi guys,

    Another update on this. I have upgraded our web server and it seems to be working quite well. The admin is a bit slow, but it's still usable. The only thing is that it uses a lot of memory.

    • umbraco.config is 5.9GB
    • approx 1.5 million active pages
    • avg memory usage on the web server is 18~27GB

    This is my current Server spec:

    • Web Server
      • Amazon EC2 (t2.2xlarge)
      • vcpu: 8
      • Memory: 32GB
    • DB Server
      • Amazon RDS (db.t2.xlarge)
      • cpu: 4
      • Memory: 16GB

    On average, the web server uses 18 GB to 27 GB of memory. I think it uses a lot of memory because Umbraco keeps all the pages in memory so it can easily write them to umbraco.config.
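
    As a rough back-of-the-envelope check on the figures above (5.9 GB umbraco.config, 18 to 27 GB of memory, 1.5 million pages), the average footprint per page can be worked out as below. The class and method names are hypothetical, and 1 GB is taken as 1024 * 1024 KB.

    ```csharp
    using System;

    public static class PerPageFootprint // hypothetical helper, not part of Umbraco
    {
        // Average kilobytes per page for a total given in gigabytes.
        public static double KbPerPage(double totalGb, int pages)
        {
            return totalGb * 1024 * 1024 / pages;
        }

        public static void Main()
        {
            const int pages = 1500000; // approx. number of active pages

            Console.WriteLine("XML cache:     {0:F1} KB per page", KbPerPage(5.9, pages));
            Console.WriteLine("Memory (low):  {0:F1} KB per page", KbPerPage(18, pages));
            Console.WriteLine("Memory (high): {0:F1} KB per page", KbPerPage(27, pages));
        }
    }
    ```

    Averaged over all pages, that is roughly 4 KB of XML and roughly 13 to 19 KB of web server memory per page.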

    Does anyone know of any way to reduce the memory usage?

  • Matthew Wise 271 posts 1373 karma points MVP 4x c-trib
    Sep 25, 2018 @ 07:47
    Matthew Wise
    0

    Hi Shinsuke,

    One option depending on the pages you have would be to "Archive" some of the pages. There is a great blog post on this approach here: https://www.moriyama.co.uk/about-us/news/blog-the-need-for-archived-content-in-umbraco-and-how-to-do-it/

    Matt

  • Dave Woestenborghs 3504 posts 12133 karma points MVP 8x admin c-trib
    Sep 25, 2018 @ 09:18
    Dave Woestenborghs
    0

    Hi Shinsuke,

    I was about to post the same blog post as Matthew did.

    I hardly ever see a site with 1.5 million pages that are all actively visited. A lot of older content may only get visited once a month. The approach in this blog post is a good solution for that.

    Dave

  • shinsuke nakayama 109 posts 250 karma points
    Sep 25, 2018 @ 09:31
    shinsuke nakayama
    0

    Hi guys,

    Thank you for the feedback. I have gone through this blog post as well. However, we will most likely end up with millions of records. I tried using Examine to search 1.5 million records and it was extremely fast, so I might create another layer on top of Umbraco.

    This is most likely one of the most data-heavy websites I've worked on (in my 20 years of programming experience). I initially recommended a custom CMS, but after I showed the Umbraco backoffice as an example, they loved how flexible it was.

    Let me think some more about the architecture of this website (or web application).

    I'll keep you guys updated.
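
    For reference, a minimal sketch of the kind of Examine lookup mentioned above, assuming the Examine version bundled with Umbraco 7 and its default "ExternalSearcher" provider. The PageSearch class, the FindIds method, and the choice of the nodeName field are hypothetical, and the search term would need escaping or validation before real use.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;
    using Examine;

    public static class PageSearch // hypothetical helper, not part of Umbraco
    {
        // Returns the ids of up to `max` content nodes of the given document type
        // whose name matches `term`, read from the Lucene index, not the XML cache.
        public static IEnumerable<int> FindIds(string docTypeAlias, string term, int max)
        {
            // "ExternalSearcher" is the default searcher name in a stock Umbraco 7 install.
            var searcher = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];

            // Restrict the query to content (as opposed to media) entries.
            var criteria = searcher.CreateSearchCriteria("content");

            var query = criteria
                .NodeTypeAlias(docTypeAlias)
                .And()
                .Field("nodeName", term)
                .Compile();

            return searcher.Search(query).Take(max).Select(result => result.Id);
        }
    }
    ```

    The ids returned this way could then be used to fetch only the pages that are actually needed.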
