Umbraco fails to boot even after database connection is warmed up. It seems the failure to boot is "cached"?
We have an Umbraco installation that takes around a minute to boot. Sometimes Umbraco will successfully boot. But some of the time Umbraco will fail to boot with the "failed to boot" screen and the logs will point to database time-outs as the reason.
However, even after the database is verifiably warmed up and rapidly accessible from the site (checked using a diagnostic page outside of Umbraco), Umbraco still shows the "failed to boot" screen immediately on page refresh, and the logs still point to database time-outs as the reason. Only now these time-outs are happening immediately.
So it seems Umbraco is "caching" the failure to boot, including the reason for boot failure. Is this correct? Is there a way to turn off this "caching" behavior, so that subsequent page refreshes would result in a successful boot?
If not, is there a way to increase the time-outs during boot, specifically the database time-out, to stop the boot failures occurring in the first place?
Or is there a different way to try to get Umbraco to boot after a "failed to boot"?
The Umbraco connection string is verified to work, and sometimes Umbraco will boot after a restart.
The installation is load-balanced admin/front setup running on Azure Web Apps. We've never had any problems with this type of setup before.
The same solution works in the production environment. This is the test environment with roughly the same server sizing.
The only thing special about this Umbraco solution is that it takes a long time to start, probably because of custom start-up code or similar.
It's not a connection string issue; Umbraco sometimes successfully boots using the same configuration files, but often fails to boot.
We have a diagnostic page in the solution that uses the Umbraco connection string to query the database, and this diagnostic page can successfully retrieve data as soon as the page can load. But at this point, Umbraco has already failed to boot.
So, it sounds to me like the time-out probably isn't a connection time out, but a query/processing time out.
When a site starts up, it has to rebuild the NuCache and, I think, it rebuilds the Examine indexes as well to ensure that they are fully up to date.
With sites that have large amounts of content/media/members, this process can reportedly time out. In my experience, though, the process is still ongoing and is hogging all the resources while it runs. If you give it half an hour, does it eventually start?
Have you also checked resource utilisation on your db when the timeouts are occurring?
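One rough way to check the connection-vs-query question from the logs alone is to count which kind of SqlClient time-out message appears. The sketch below assumes the usual `System.Data.SqlClient` message prefixes ("Execution Timeout Expired" for a command that ran too long, "Connection Timeout Expired" for a connection that could not be opened); the function name and the sample text are only for illustration.

```python
import re
from collections import Counter

# Assumed message prefixes from System.Data.SqlClient; matched case-insensitively.
PATTERNS = {
    "execution": re.compile(r"Execution Timeout Expired", re.IGNORECASE),
    "connection": re.compile(r"Connection Timeout Expired", re.IGNORECASE),
}


def classify_timeouts(log_text):
    """Count log lines per time-out kind (hypothetical helper, not an Umbraco API)."""
    counts = Counter()
    for line in log_text.splitlines():
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[kind] += 1
    return counts


# Sample line shaped like the SqlException text quoted in this thread.
sample = (
    "-> System.Data.SqlClient.SqlException: Execution Timeout Expired. "
    "The timeout period elapsed prior to completion of the operation "
    "or the server is not responding."
)
print(dict(classify_timeouts(sample)))
```

If only "execution" entries show up, that would point at a slow or blocked query during boot rather than at the connection itself.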
The site does not consume any resources once the "Umbraco failed to boot" message appears.
I'm not sure any database timeouts are actually occurring.
After the attempted boot (which takes a little over a minute), the site (both /umbraco and /) immediately starts returning the "Umbraco failed to boot" message.
This is what is written (and only this) to the Umbraco log on each attempted page load:
Umbraco.Core.Exceptions.BootFailedException: Boot failed: Umbraco cannot run. See Umbraco's log file for more details.
-> Umbraco.Core.Exceptions.BootFailedException: Boot failed.
-> System.Data.SqlClient.SqlException: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
...
at Umbraco.Core.Persistence.FaultHandling.FaultHandlingDbCommand.ExecuteScalar() in D:\\a\\1\\s\\src\\Umbraco.Core\\Persistence\\FaultHandling\\RetryDbConnection.cs:line 214
...
at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService..ctor(PublishedSnapshotServiceOptions options, IMainDom mainDom, IRuntimeState runtime, ServiceContext serviceContext, IPublishedContentTypeFactory publishedContentTypeFactory, IdkMap idkMap, IPublishedSnapshotAccessor publishedSnapshotAccessor, IVariationContextAccessor variationContextAccessor, IProfilingLogger logger, IScopeProvider scopeProvider, IDocumentRepository documentRepository, IMediaRepository mediaRepository, IMemberRepository memberRepository, IDefaultCultureAccessor defaultCultureAccessor, IDataSource dataSource, IGlobalSettings globalSettings, IEntityXmlSerializer entitySerializer, IPublishedModelFactory publishedModelFactory, UrlSegmentProviderCollection urlSegmentProviders) in D:\\a\\1\\s\\src\\Umbraco.Web\\PublishedCache\\NuCache\\PublishedSnapshotService.cs:line 149
...
at Umbraco.Core.Composing.LightInject.LightInjectContainer.GetInstance(Type type) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 111
at Umbraco.Core.Composing.ComponentCollectionBuilder.CreateItem(IFactory factory, Type itemType) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\ComponentCollectionBuilder.cs:line 33
at Umbraco.Core.Composing.CollectionBuilderBase`3.<>c__DisplayClass10_0.<CreateItems>b__0(Type x) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 103
at System.Linq.Enumerable.WhereSelectArrayIterator`2.MoveNext()
at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
at Umbraco.Core.Composing.CollectionBuilderBase`3.CreateItems(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 102
at Umbraco.Core.Composing.ComponentCollectionBuilder.CreateItems(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\ComponentCollectionBuilder.cs:line 25
at Umbraco.Core.Composing.CollectionBuilderBase`3.CreateCollection(IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\CollectionBuilderBase.cs:line 120
at Umbraco.Core.Composing.LightInject.LightInjectContainer.<>c__DisplayClass20_0`1.<Register>b__0(IServiceFactory f) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 172
at DynamicMethod(Object[] )
at LightInject.ServiceContainer.<>c__DisplayClass150_0.<WrapAsFuncDelegate>b__0() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3798
at LightInject.ServiceContainer.<>c__DisplayClass198_0.<EmitLifetime>b__1() in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4657
at LightInject.PerContainerLifetime.GetInstance(Func`1 createInstance, Scope scope) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 6169
at LightInject.ServiceContainer.EmitLifetime(ServiceRegistration serviceRegistration, Action`1 emitMethod, IEmitter emitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4656
at LightInject.ServiceContainer.<>c__DisplayClass197_0.<ResolveEmitMethod>b__1(IEmitter methodSkeleton) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4649
at LightInject.ServiceContainer.<>c__DisplayClass153_0.<CreateEmitMethodWrapper>b__0(IEmitter ms) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3856
at LightInject.ServiceContainer.CreateDynamicMethodDelegate(Action`1 serviceEmitter) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3776
at LightInject.ServiceContainer.CreateDelegate(Type serviceType, String serviceName, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4743
at LightInject.ServiceContainer.CreateDefaultDelegate(Type serviceType, Boolean throwError) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 4705
at LightInject.ServiceContainer.GetInstance(Type serviceType) in C:\\projects\\lightinject\\src\\LightInject\\LightInject.cs:line 3437
at Umbraco.Core.Composing.LightInject.LightInjectContainer.GetInstance(Type type) in D:\\a\\1\\s\\src\\Umbraco.Core\\Composing\\LightInject\\LightInjectContainer.cs:line 111
at Umbraco.Core.FactoryExtensions.GetInstance[T](IFactory factory) in D:\\a\\1\\s\\src\\Umbraco.Core\\FactoryExtensions.cs:line 23
at Umbraco.Core.Runtime.CoreRuntime.Boot(IRegister register, DisposableTimer timer) in D:\\a\\1\\s\\src\\Umbraco.Core\\Runtime\\CoreRuntime.cs:line 204
-> System.ComponentModel.Win32Exception: The wait operation timed out
at Umbraco.Core.Exceptions.BootFailedException.Rethrow(BootFailedException bootFailedException) in D:\\a\\1\\s\\src\\Umbraco.Core\\Exceptions\\BootFailedException.cs:line 80
at Umbraco.Web.UmbracoInjectedModule.<>c.<Init>b__18_0(Object sender, EventArgs args) in D:\\a\\1\\s\\src\\Umbraco.Web\\UmbracoInjectedModule.cs:line 367
at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.<>c__DisplayClass285_0.<ExecuteStepImpl>b__0()
at System.Web.HttpApplication.StepInvoker.Invoke(Action executionStep)
at System.Web.HttpApplication.StepInvoker.<>c__DisplayClass4_0.<Invoke>b__0()
at Microsoft.AspNet.TelemetryCorrelation.TelemetryCorrelationHttpModule.OnExecuteRequestStep(HttpContextBase context, Action step)
at System.Web.HttpApplication.<>c__DisplayClass284_0.<OnExecuteRequestStep>b__0(Action nextStepAction)
at System.Web.HttpApplication.StepInvoker.Invoke(Action executionStep)
at System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep step)
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
These two lines seem to be the culprit:
at Umbraco.Core.Exceptions.BootFailedException.Rethrow(BootFailedException bootFailedException) in D:\\a\\1\\s\\src\\Umbraco.Core\\Exceptions\\BootFailedException.cs:line 80
at Umbraco.Web.UmbracoInjectedModule.<>c.<Init>b__18_0
So it seems that as soon as UmbracoInjectedModule<T>.Init() is called, the boot-failed exception is thrown, and the rest is just old hat. Perhaps it is a module misbehaving? How would I know which module?
If the problem is indeed one of the "injected modules", does anyone have any ideas on how to troubleshoot further? Or how to force a re-init, so that the "boot failed" exception stops getting rethrown?
What solution did you end up with? Can you share it here?
I have the same problem and can't find a solution within Umbraco or the core itself. The connection is not dropped on its own; it is dropped when regular updates are applied to the server. At that point the app pools are started, but since the first 'ping' to SQL is unsuccessful, the site still shows an error even once SQL is back up. Then I have to manually recycle the app pools. The sysadmin suggested scripting a command that recycles all app pools after each update, once a certain timeout has passed. It's an ugly hack, but it's worth a try...
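For what it's worth, the "recycle after a timeout" hack could look roughly like the sketch below: probe the site a few times and only run a restart command if every probe fails. Everything here is an assumption for illustration (the function names, the retry counts, and treating any non-2xx/3xx response as "boot failed"); the restart command itself is whatever your host provides.

```python
import subprocess
import time
import urllib.request


def site_is_healthy(url, timeout=10.0):
    """Return True if the URL answers with a 2xx/3xx status.

    urlopen raises HTTPError (an OSError subclass) for 4xx/5xx responses,
    so an error page counts as unhealthy here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        return False


def run_restart(command):
    """Run the host-specific restart command (hypothetical; supplied by the caller)."""
    subprocess.run(command, check=True)


def watch_once(probe, restart, retries=3, delay=30.0, sleep=time.sleep):
    """Probe up to `retries` times; call `restart` only if every probe fails.

    Returns True if the site was healthy, False if a restart was triggered.
    """
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            sleep(delay)  # give a slow boot a chance before probing again
    restart()
    return False
```

A scheduled task could call something like `watch_once(lambda: site_is_healthy("https://example.test/"), lambda: run_restart([...]))`, where the command list is, for example, an `appcmd recycle apppool` call on IIS or an `az webapp restart` call on Azure Web Apps. Neither has been tested against the setups described in this thread.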
We are blaming this on some sort of corruption of the Umbraco database which caused a seemingly infinite loop when booting. So a query/processing timeout on the surface. But definitely something in the database was the problem. We recreated the database from a different installation + uSync.
We've seen a similar issue: when the database server is unavailable, boot fails, and even once the database is restored/back online, the Umbraco site will not boot until it is manually restarted (cached error).
I'm guessing that a BootFailedException is terminal, meaning boot is not re-attempted until a restart.
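That guess matches the stack trace earlier in the thread, where every request ends in `BootFailedException.Rethrow(...)`. A toy model of that behaviour is sketched below: boot runs once, the failure is stored, and each later request rethrows the stored failure without touching the database again. This is not Umbraco's code; the class and method names are made up for the illustration.

```python
class BootFailedError(Exception):
    """Raised on every request once the single boot attempt has failed."""


class Runtime:
    """Toy runtime that boots exactly once and latches any boot failure."""

    def __init__(self, boot_action):
        self._boot_error = None
        try:
            boot_action()  # the only boot attempt for this process lifetime
        except Exception as exc:
            self._boot_error = exc  # remembered until the process restarts

    def handle_request(self, path):
        if self._boot_error is not None:
            # No retry: the original failure is rethrown immediately.
            raise BootFailedError("Boot failed") from self._boot_error
        return f"200 OK {path}"


boot_attempts = []


def failing_boot():
    boot_attempts.append(1)
    raise TimeoutError("Execution Timeout Expired")


runtime = Runtime(failing_boot)
for _ in range(3):
    try:
        runtime.handle_request("/")
    except BootFailedError as error:
        print(type(error.__cause__).__name__, "attempts so far:", len(boot_attempts))
```

In this model the time-out is reported on every request even though only one boot attempt ever ran, which would explain why the "time-outs" appear to happen instantly after the first failure and why only a restart clears them.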
We've also noticed that if there's any hiccup with the SQL connection, Umbraco takes itself down and refuses to come online without a restart.
You can get into the /umbraco admin area, demonstrating that the website is in fact online, and able to connect to the database to verify your user connection, but the public site itself won't come back online without a restart.
Here's a log from one of our staging sites, where there's virtually 0 traffic. Slight network hiccup 5 days ago, and the site has been down ever since.
2022-04-08 16:02:44,469 [P9068/D2/T42] INFO umbraco.BusinessLogic.Log - Log scrubbed. Removed all items older than 2022-04-07 16:02:43
2022-04-08 16:25:50,907 [P9068/D2/T69] ERROR Umbraco.Core.UmbracoApplicationBase - An unhandled exception occurred
System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
at System.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction(TransactionRequest transactionRequest, String name, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
at System.Data.SqlClient.SqlInternalConnection.BeginSqlTransaction(IsolationLevel iso, String transactionName, Boolean shouldReconnect)
at System.Data.SqlClient.SqlConnection.BeginTransaction(IsolationLevel iso, String transactionName)
at System.Data.SqlClient.SqlConnection.BeginDbTransaction(IsolationLevel isolationLevel)
at System.Data.Common.DbConnection.System.Data.IDbConnection.BeginTransaction(IsolationLevel isolationLevel)
at Umbraco.Core.Persistence.Database.OpenSharedConnection()
at Umbraco.Core.Persistence.Database.<Query>d__74`1.MoveNext()
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
at Umbraco.Core.Persistence.Database.Fetch[T](String sql, Object[] args)
at Umbraco.Core.Persistence.Database.Fetch[T](Sql sql)
at Umbraco.Core.Sync.DatabaseServerMessenger.ProcessDatabaseInstructions()
at Umbraco.Core.Sync.DatabaseServerMessenger.Sync()
at Umbraco.Web.BatchedDatabaseServerMessenger.UmbracoModule_RouteAttempt(Object sender, RoutableAttemptEventArgs e)
at Umbraco.Web.UmbracoModule.OnRouteAttempt(RoutableAttemptEventArgs args)
at Umbraco.Web.UmbracoModule.ProcessRequest(HttpContextBase httpContext)
at Umbraco.Web.UmbracoModule.<Init>b__12_3(Object sender, EventArgs e)
at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep step)
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
ClientConnectionId:ffffffff-ffff-ffff-ffff-ffffffffffff
Error Number:-2,State:0,Class:11
ClientConnectionId before routing:ffffffff-ffff-ffff-ffff-ffffffffffff
Routing Destination:ffffffffffff.tr20000.eastus1-a.worker.database.windows.net,11011
2022-04-08 20:02:45,476 [P9068/D2/T87] INFO umbraco.BusinessLogic.Log - Log scrubbed. Removed all items older than 2022-04-07 20:02:44
Umbraco 8.13.
Hello, check whether the App_Data folder is included (containing the Umbraco.sdf file).
It seems to be a connection string issue. (If you are debugging your application, you should first exclude the App_Data folder from your project.)
Thanks.
Thanks.
The site does not recover after a while.
We are still facing this same problem. Does anyone have any tips on how to proceed with troubleshooting?