Load-balanced Umbraco restarts when publishing content under high load
When traffic is high, (re)publishing any page often causes all 3 application pools (master + 2 read nodes) to restart.
This causes the load balancer (which monitors ping.aspx) to mark all our nodes as offline, which results in a very bad user experience. We are having to tell editors not to use the Back Office during high traffic times.
We can't reproduce this on our (single) dev environment. I can see that App_Data\umbraco.config drops to 0 bytes and then returns to its normal size - maybe this is by design.
I don't know how to debug this issue. We've had several experienced Umbraco developers look at this, and they don't know either.
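One way to gather more evidence is to record when umbraco.config is actually empty and compare those times with the load balancer's outage times. Below is a minimal sketch in Python, not anything Umbraco provides: the function names, the polling interval and the idea of running it alongside the site are all assumptions made for illustration.

```python
import os
import time
from datetime import datetime


def sample_file_size(path, samples=5, interval_seconds=1.0):
    """Poll a file's size and return a list of (timestamp, size) readings.

    size is None when the file cannot be read (missing or inaccessible).
    """
    readings = []
    for i in range(samples):
        try:
            size = os.path.getsize(path)
        except OSError:
            size = None
        readings.append((datetime.now(), size))
        if i < samples - 1:
            time.sleep(interval_seconds)
    return readings


def zero_byte_windows(readings):
    """Return (start, end) pairs covering runs of 0-byte readings.

    A run that is still open at the last reading is not reported.
    """
    windows = []
    start = None
    for timestamp, size in readings:
        if size == 0 and start is None:
            start = timestamp
        elif size != 0 and start is not None:
            windows.append((start, timestamp))
            start = None
    return windows
```

For example, `zero_byte_windows(sample_file_size(path, samples=600))` would cover roughly ten minutes around a publish, where `path` is the full path to App_Data\umbraco.config on the node being watched.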
We're using Umbraco 7.4.3. Architecture diagram below.
I think this event corresponds to most of the outages:
2016-05-16 01:48:07,638 [P9432/D2/T1] INFO Umbraco.Core.DatabaseContext - CanConnect = True
2016-05-16 01:48:30,497 [P7384/D2/T4] INFO Umbraco.Core.UmbracoApplicationBase - Application shutdown. Details: HostingEnvironment
_shutDownMessage=HostingEnvironment initiated shutdown
HostingEnvironment caused shutdown
_shutDownStack= at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at System.Web.Hosting.HostingEnvironment.InitiateShutdownInternal()
at System.Web.Hosting.HostingEnvironment.InitiateShutdownWithoutDemand()
at System.Web.Hosting.PipelineRuntime.StopProcessing()
Double-check that you have KB3052480 installed. This is the exact behaviour I have seen on unpatched 2012 R2 servers with an umbraco.config large enough to trigger the bug, since the file lives in App_Data.
I have configured the above App Pool settings and triple-checked the patch install, but the issue is still happening.
When new nodes are published under load, umbraco.config drops to 0 bytes and the application pool stops responding on all three nodes. The website becomes unavailable because the load balancer marks all three servers as offline.
I have checked the logs on the master and read-only servers, but I do not see any errors. I just see:
2016-08-02 23:03:12,845 [P3004/D5/T63] INFO umbraco.content - Save Xml to file...
2016-08-02 23:04:59,691 [P3004/D5/T60] INFO umbraco.content - Saved Xml to file.
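The gap between those two entries is worth quantifying. Here is a minimal sketch in Python that parses the timestamps from the two lines above and reports how long the XML save took. The helper names are made up for illustration, and it assumes every line starts with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp as these do.

```python
from datetime import datetime

# The two log lines quoted in the post.
LOG_LINES = [
    "2016-08-02 23:03:12,845 [P3004/D5/T63] INFO umbraco.content - Save Xml to file...",
    "2016-08-02 23:04:59,691 [P3004/D5/T60] INFO umbraco.content - Saved Xml to file.",
]


def parse_timestamp(line):
    """Parse the 23-character timestamp at the start of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def save_durations_seconds(lines):
    """Pair each 'Save Xml to file...' with the next 'Saved Xml to file.'."""
    durations = []
    start = None
    for line in lines:
        if "Save Xml to file..." in line:
            start = parse_timestamp(line)
        elif "Saved Xml to file." in line and start is not None:
            durations.append((parse_timestamp(line) - start).total_seconds())
            start = None
    return durations


print(save_durations_seconds(LOG_LINES))
```

For these two lines the save took about 106.8 seconds, so the cache file was being rewritten for nearly two minutes.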
By the way, when this happens, the Umbraco Back Office on the master node remains fine (since it does not use the cache). The read-only Back Offices do go down (the application pool is busy).
I checked the Windows Event Log and it does not seem that the application pool is recycling - it's just busy rebuilding the 200MB cache each time.
Are your servers running Windows Server 2012 R2, and how big is your umbraco.config?
Yes, 2012 R2 and umbraco.config is 181MB
I confirmed that this fix for App Pool restarts is installed, along with all optional Windows updates.
Is IIS configured to recycle the application pool when memory exceeds some specified limit?
Are there any errors in the Umbraco log prior to the "HostingEnvironment initiated shutdown" message?
Also, you may want to try disabling rapid fail protection for the application pool in IIS:
Hi guys,
Any ideas?