Copied to clipboard

Flag this post as spam?

This post will be reported to the moderators as potential spam to be looked at


  • Keith 35 posts 161 karma points
    28 days ago
    Keith
    0

    "Save" button loading forever on Azure App Service

    Hi All,

    I have a strange issue when I click the "Save" button to save a document type in the Backoffice. When the site is deployed to azure app service, it shows the loading icon for an extremely long time (i dont know exactly, but it feels like minutes). But if I run out of patients and refresh the window after, say 10 seconds, the changes have actually been persisted even though the button hasn't stopped loading.

    When I inspect the network tab, I can see that the PostSave call is eventually getting a gateway timeout on the azure app service, which enables the save button again and shows a back office error. But again, the changes are actually persisted.

    When I run the site locally this does not happen. Even when I point my local environment at the same Azure database that the app service uses. So I don't think its a database performance issue.

    There is also another behaviour I have seen on application startup that might be related. Locally, the application starts in a few seconds (even when pointed at the Azure database the app service uses). But on the Azure App Service itself, it takes several minutes to start. At a guess, Id say between 5 and 10 minutes.

    Any ideas what could be causing this?

    FYI, I am using the latest version 9, but this has been happening for some time through all the various RCs and nightly builds

  • SteveV 48 posts 212 karma points
    27 days ago
    SteveV
    0
    • Which App Service Plan are you using?
    • Do you see a CPU or memory spike on that plan when saving the document type?
    • How many documents are using this document type?
    • Which Azure SQL Database type are you using? (DTU or B, S0, S1, ...)
    • Do you see anything in the logfiles?
  • Per Bolmstedt 44 posts 216 karma points
    27 days ago
    Per Bolmstedt
    0
    • Are you load balanced (i.e. multi-Umbraco-instance)? And even if not, have you configured the server role(s) correctly (single/publisher/subscriber)?

    • Are you multi-Azure-app-instance?

    • Are you using Azure SQL or SQL Server or something else?

    • Are you using uSync?

    • Which model builder mode are you on?

    We haven't deployed a v9 site to Azure yet (only v8), but we run all our Umbracos as load balanced Azure App Services (using SQL Server on an Azure VM on the same internal network) and don't have this problem.

  • Keith 35 posts 161 karma points
    27 days ago
    Keith
    0

    Thanks for the replies guys. Sorry if I didn't give enough info:

    Which App Service Plan are you using?

    This is our test environment. Its running P2V2 (7GB memory, 420ACU)

    Do you see a CPU or memory spike on that plan when saving the document type?

    No, I cant see a CPU spike when saving, nor anything significant while starting up either.

    How many documents are using this document type?

    It happens on all document types as far as I can tell. Some with as few as 1 document using it. Many of my document types are elements used within blocklists. Not sure if that would make a difference?

    Which Azure SQL Database type are you using? (DTU or B, S0, S1, ...)

    S4 and utilisation never goes above 3 or 4%

    Do you see anything in the logfiles?

    Nothing. But ill try increase the log level and see if I see anything else.

    Are you load balanced (i.e. multi-Umbraco-instance)? And even if not, have you configured the server role(s) correctly (single/publisher/subscriber)?

    Are you multi-Azure-app-instance?

    Are you using Azure SQL or SQL Server or something else?

    I'm running this on an app service and the app service plan is set not to scale for the moment, so it will only be one instance, but still, you make an interesting point, there are extra network layers that I wont have locally. Let me have a read of the multi-intance documentation and see if there is anything in that.

    Are you using uSync?

    No

    Which model builder mode are you on?

          "ModelsBuilder": {
        "Enable": false,
        "ModelsMode": "Nothing"
      }
    

    using SQL Server on an Azure VM on the same internal network)

    Yeah, i have had problems with DTU throttling on Azure SQL Databases in the past when running complex queries, which you dont see on SQL server running on a VM. But I see no evidence of that here because all the utilisation charts are very low.

  • Per Bolmstedt 44 posts 216 karma points
    27 days ago
    Per Bolmstedt
    0

    And just to clarify, you are using the same Umbraco instance for Admin (i.e. Umbraco) and Front-end?

  • Keith 35 posts 161 karma points
    27 days ago
    Keith
    0

    Yes.

    But perhaps I have caused an issue by connecting to the Azure Db from my local environment? I mean, I occasionally connect to the "test environment" azure database from my local machine. So effectively there are 2 instances of the Backoffice using the same Db. I havent had a chance to read the docs about multi-instance, but could that be an issue? But in saying that, if it was an issue, you would expect to see weird behaviour on both Backoffices, but I dont. Just the one hosted on Azure app service.

    Its running on Linux too, and my local machine is Windows.

  • Keith 35 posts 161 karma points
    22 days ago
    Keith
    0

    Hi All,

    When I increase the log level on the server, I get the following when I save a document type:

    Invoking refresher {RefresherType} on local server for message type RefreshByPayload","@l":"Debug","RefresherType":"Umbraco.Cms.Core.Cache.ContentTypeCacheRefresher
    Notified {ChangeTypes} for {ItemType} {ItemId}","@l":"Debug","ChangeTypes":"RefreshOther","ItemType":"IContentType","ItemId":18143,"SourceContext":"Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService","ActionId":"3013ab77-c65d-42c5-a887-07da177d116c","ActionName":"Umbraco.Cms.Web.BackOffice.Controllers.ContentTypeController.PostSave (Umbraco.Web.BackOffice)","RequestId":"0HMC7BR45ABD3:00000016","RequestPath":"/umbraco/backoffice/umbracoapi/contenttype/PostSave
    Notified {ChangeTypes} for content {ContentId}","@l":"Debug","ChangeTypes":"RefreshAll","ContentId":0,"SourceContext":"Umbraco.Cms.Infrastructure.PublishedCache.PublishedSnapshotService","ActionId":"3013ab77-c65d-42c5-a887-07da177d116c","ActionName":"Umbraco.Cms.Web.BackOffice.Controllers.ContentTypeController.PostSave (Umbraco.Web.BackOffice)","RequestId":"0HMC7BR45ABD3:00000016","RequestPath":"/umbraco/backoffice/umbracoapi/contenttype/PostSave
    {StartMessage} [Timing {TimingId}]","StartMessage":"Loading content from database"
    

    Which seems to then trigger a rake load of these log entries (over 5000, so i think its every node on the site):

    Set {kit.Node.Id} with parent {kit.Node.ParentContentId}","@l":"Debug","kit.Node.Id":2117,"kit.Node.ParentContentId":-1,"SourceContext":"ContentStore"
    

    Which, on the face of it seems a bit fishy. But according to the timestamps of the logs and the timing messages, it takes less than a second to complete:

    "{EndMessage} ({Duration}ms) [Timing {TimingId}]","EndMessage":"Completed.","Duration":741,"
    

    And Id assume that these kinds of thing don't run interactively anyway, right? I mean, I assume if there is some sort of cache re-building going on, it wouldn't block the back office requests.

  • Per Bolmstedt 44 posts 216 karma points
    22 days ago
    Per Bolmstedt
    0

    It does sound unrelated.

    So,

    • Saving is actually instant (you could verify this, right? For example in the logs seeing the "save" take place normally when clicking save)
    • Web UI hangs because PostSave never gets a response from the Umbraco back office APIs, and gets a "gateway timeout" when ARR(?) has timed out

    Can you show the "gateway timeout" error, and any strange requests/response before/after? This looks like a communication failure between the web UI and PostSave, so something strictly Azure-infrastructure-related, or web app configuration related.

  • Keith 35 posts 161 karma points
    22 days ago
    Keith
    1

    Thanks for your response again.

    Yeah, correct. If I wait a couple of seconds and then reload the browser after clicking save, the changes have persisted.

    It does feel like a network-y type of issue to me. I cant find any evidence of a performance bottleneck. I have not changed any settings on the app services relating to networking. Its an off-the-shelf .net 5 linux app service, with the deployment centre hooked up to a repo. Thats it.

    There are no other network requests going on when I get the gateway timeout. Except this request that polls, but again, this seems normal:

    /umbraco/backoffice/umbracoapi/authentication/GetRemainingTimeoutSeconds

    The response itself is also pretty uneventful

    enter image description here

    We have a couple of custom middlewares. I'm going to recheck them for any silly typos with async/await.

    I might also try deploy to a windows server and see if its the same behaviour

  • Keith 35 posts 161 karma points
    19 days ago
    Keith
    0

    So I deployed the same code to a Windows appservice and I get the same behaviour.

    I think the next step might be to deploy a fresh V9 install, with no customisations and see if that has the same behaviour.

  • Keith 35 posts 161 karma points
    1 week ago
    Keith
    0

    I think I have discovered the issue. I have been working locally a lot more lately, migrating a large amount of data onto the new Umbraco site and I started seeing this behaviour locally too.

    Locally, I was able to wait for the server to eventually respond (I obviously don't get the gateway timeout locally) and I could see that our earlier assumptions about the cache building being in a background job were incorrect. The sever eventually responded after the cache building has finished (after like 3 or 4 minutes).

    So I started having a look at the source code in this area and I found this comment in PublishedSnapshotService.cs

                // In the case of ModelsMode.InMemoryAuto generated models - we actually need to refresh all of the content and the media
                // see https://github.com/umbraco/Umbraco-CMS/issues/5671
                // The underlying issue is that in ModelsMode.InMemoryAuto mode the IAutoPublishedModelFactory will re-compile all of the classes/models
                // into a new DLL for the application which includes both content types and media types.
                // Since the models in the cache are based on these actual classes, all of the objects in the cache need to be updated
                // to use the newest version of the class.
                // NOTE: Ideally this can be run on background threads here which would prevent blocking the UI
                // as is the case when saving a content type. Initially one would think that it won't be any different
                // between running this here or in another background thread immediately after with regards to how the
                // UI will respond because we already know between calling `WithSafeLiveFactoryReset` to reset the generated models
                // and this code here, that many front-end requests could be attempted to be processed. If that is the case, those pages are going to get a
                // model binding error and our ModelBindingExceptionFilter is going to to its magic to reload those pages so the end user is none the wiser.
                // So whether or not this executes 'here' or on a background thread immediately wouldn't seem to make any difference except that we can return
                // execution to the UI sooner.
                // BUT!... there is a difference IIRC. There is still execution logic that continues after this call on this thread with the cache refreshers
                // and those cache refreshers need to have the up-to-date data since other user cache refreshers will be expecting the data to be 'live'. If
                // we ran this on a background thread then those cache refreshers are going to not get 'live' data when they query the content cache which
                // they require.
    

    originating from this fix:

    https://github.com/umbraco/Umbraco-CMS/pull/5767/commits/126c4cbd46de303bd3cf6ca095d68216d16f914a

    So, it seems when you have models builder set to "InMemoryAuto", the caches are rebuilt on the interactive thread. And because I have over 100k documents, its taking several minutes to do this.

    It sounds like the simple solution is to change the models builder configuration to another option than "InMemoryAuto", which I will try. But peeking around at the code, Im not sure it will change anything.

    The caches are rebuild if this is true:

                if (_publishedModelFactory.IsLiveFactoryEnabled())
    

    and the source of this method is:

        public static bool IsLiveFactoryEnabled(this IPublishedModelFactory factory)
        {
            if (factory is IAutoPublishedModelFactory liveFactory)
            {
                return liveFactory.Enabled;
            }
    
            // if it's not ILivePublishedModelFactory we can't determine if it's enabled or not so return true
            return true;
        }
    

    So according to this, it will always return true, unless the factory is an instance of IAutoPublishedModelFactory. In which case it returns factory.Enabled.

    BUT

    It looks like there is only one implementation of this interface:

    InMemoryModelFactory
    

    and for factory.Enabled

    it uses this:

    public bool Enabled => _config.ModelsMode == ModelsMode.InMemoryAuto;
    

    And it is only used if "InMemoryAuto" is enabled:

                if (config.ModelsMode == ModelsMode.InMemoryAuto)
                {
                    return factory.GetRequiredService<InMemoryModelFactory>();
                }
                else
                {
                    return factory.CreateDefaultPublishedModelFactory();
                }
    

    So, basically, this is always going to happen, regardless of the models builder config.

    Because if i dont set it to "inMemoryAuto" i fall into the else block from earlier "// if it's not ILivePublishedModelFactory we can't determine if it's enabled or not so return true" and if I DO set it to "inMemoryAuto", true is always returned.

    I hope I am missing something, but I'll try this out and see what the behaviour is in testing.

  • Per Bolmstedt 44 posts 216 karma points
    1 week ago
    Per Bolmstedt
    0

    So, just to be clear, when you save a document type in development it takes 3-4 minutes with 100k documents? That sounds insane. I've never seen document type save times above a few seconds.

    Why would the number of documents matter? It seems like only the amount of document types should matter? How many document types do you have?

    Perhaps there is something in your solution that increases the cache rebuild time dramatically?

    Other than that, fantastic investigation. Following with great curiousity. :) And it's disappointing that Occam's razor strikes again; the timeout occurred because the save actually did time out.

  • Keith 35 posts 161 karma points
    1 week ago
    Keith
    0

    I think the reason the number of documents matters is because when it forces a cache refresh, it queries all the content, all >100k records, creates a new IPublishedContent for each of them and then saves them all into the content cache. Then it does the same for the media.

    The document type save works in a second or two, like you say. It does actually persist the changes. But it looks like it triggers a cache rebuild (which take a few minutes) before it returns a response back to the browser.

  • Keith 35 posts 161 karma points
    1 week ago
    Keith
    0

    I changed to another models builder mode: SourceCodeAuto

    Same behaviour.

    I figure that the implementation of the IPublishedModelFactory that the site uses needs to implement IAutoPublishedModelFactory but also return false for the Enabled property of the IAutoPublishedModelFactory interface, if I want to turn off this auto cache re-build (see the logic in the posts above).

    There is no implementation like this in the source code. Its my opinion that if I set

     "ModelsBuilder": {
        "ModelsMode": "Nothing"
      }
    

    I should get an implementation injected as i described above. But i dont. I get the NoopPublishedModelFactory which does not implement IAutoPublishedModelFactory.

    So I made my own one:

    public class NoopAutoPublishedModelFactory : IAutoPublishedModelFactory
    {
        public object SyncRoot { get; } = new object();
    
        public bool Enabled => false;
    
        public IPublishedElement CreateModel(IPublishedElement element) => element;
    
        public IList CreateModelList(string alias) => new List<IPublishedElement>();
    
        public Type MapModelType(Type type) => typeof(IPublishedElement);
    
        public void Reset()
        {
        }
    }
    

    and added it to the serivces after removing all other implementations:

        public static IUmbracoBuilder AddNoopAutoPublishedModelFactory(this IUmbracoBuilder builder)
        {
            while (builder.Services.Any(descriptor => descriptor.ServiceType == typeof(IPublishedModelFactory)))
            {
                var serviceDescriptor = builder.Services.FirstOrDefault(descriptor => descriptor.ServiceType == typeof(IPublishedModelFactory));
    
                builder.Services.Remove(serviceDescriptor);
            }
    
    
            builder.Services.AddSingleton<IPublishedModelFactory, NoopAutoPublishedModelFactory>();
    
            return builder;
        }
    

    Now, when I save a doc type, it doesnt take 3 to 4 minutes and spew out hundreds of thousands of logs, it takes about 25 seconds. and the logs look like this:

    [00:20:22 VRB] Begin request [15212170-e218-42b4-856e-1c4848fb00b1]: /umbraco/backoffice/umbracoapi/contenttype/PostSave
    [00:20:49 DBG] Invoking refresher Umbraco.Cms.Core.Cache.ContentTypeCacheRefresher on local server for message type RefreshByPayload
    [00:20:49 DBG] Notified RefreshOther for IContentType 16009    
    [00:20:49 VRB] End Request [15212170-e218-42b4-856e-1c4848fb00b1]: /umbraco/backoffice/umbracoapi/contenttype/PostSave (27282.8581ms)
    

    I can live with this, if it is taking 25 seconds-ish to refresh a content type cache.

    But I'm now out of my depth here in terms of what this is actually doing under the hood.

    My gut feeling is it is fine, because I am not using the strongly typed models and I want to turn models builder off anyway. But I would love to know if anyone has worked on this aspect of Umbraco before, if they had any insight on this?

  • Keith 35 posts 161 karma points
    1 week ago
    Keith
    0

    I raised an issue for this, because I believe it is a bug. I understand that the whole cache needs to be rebuilt when the models are all recompiled, but when models builder is set to Nothing, the full cache rebuild should not be required, but it is still happening.

    https://github.com/umbraco/Umbraco-CMS/issues/11409

Please Sign in or register to post replies

Write your reply to:

Draft