Distributed Cache Failure on Azure Load Balanced Environment
Hello,
I'm finding myself in the situation where our cache fails to update on any of the slave machines.
I have a basic setup of 1 Master and 2 Slave machines set up in Azure, running Umbraco 7.6.5 with the recommended configuration. I have no issues with small changes of up to 20 nodes, but I'm experiencing problems when going towards larger numbers (300 and up).
There are 2 main issues.
I am 'greeted' by this particular error: 'DISTRIBUTED CACHE IS NOT UPDATED. Failed to execute instructions ({0}: \"{1}\"). Instruction is being skipped/ignored'. It mostly appears when something similar to the entry below is written to the umbracoCacheInstructions table:
[{"RefreshType":3,"RefresherId":"27ab3022-3dfa-47b6-9119-5945bc88fd66","GuidId":"00000000-0000-0000-0000-000000000000","IntId":0,"JsonIds":"[7694,7695, several hundred other ids ,18159,17583]","JsonPayload":null}].
This usually happens on 'Publish with children' action on a bigger node.
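For anyone trying to confirm that an oversized instruction is the trigger, a small script can count the ids carried by a row. This is only a rough sketch in Python, assuming the JSON text has been copied out of the table by hand; the function name `count_instruction_ids` and the shortened sample are made up for illustration. Note that `JsonIds` is a JSON array stored as a string, so it has to be decoded a second time.

```python
import json


def count_instruction_ids(raw_instructions: str) -> list:
    """Return (RefresherId, number of ids) for each instruction in one row."""
    results = []
    for instruction in json.loads(raw_instructions):
        json_ids = instruction.get("JsonIds")
        # JsonIds is itself a JSON array encoded as a string: decode it again.
        ids = json.loads(json_ids) if json_ids else []
        results.append((instruction["RefresherId"], len(ids)))
    return results


# Shortened, hypothetical sample; a real row would carry hundreds of ids.
sample = (
    '[{"RefreshType":3,'
    '"RefresherId":"27ab3022-3dfa-47b6-9119-5945bc88fd66",'
    '"GuidId":"00000000-0000-0000-0000-000000000000","IntId":0,'
    '"JsonIds":"[7694,7695,18159,17583]","JsonPayload":null}]'
)

print(count_instruction_ids(sample))
```

A row reporting several hundred ids for a single refresher would match the 'Publish with children' case described here.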
The above would be less of an issue if the instruction were actually skipped, but for whatever reason the application keeps picking up this kind of instruction and failing on it, which prevents any subsequent change in the back office from being reflected on the actual websites.
The only solutions found to work so far are restarting the applications altogether or cold booting them, but this is not a viable option (during high traffic or night hours).
Is this a known issue or perhaps I am missing something too obvious? I know something similar, regarding the instruction count, was supposed to be fixed in recent versions. Any suggestion or advice is highly appreciated.
Best Regards,
Vlad A.
EDIT:
This is as far as I could track the issue up to this point.
2017-11-24 10:36:19,179 [P4736/D7/T35] ERROR Umbraco.Core.Sync.DatabaseServerMessenger - DISTRIBUTED CACHE IS NOT UPDATED. Failed to execute instructions (id: 834, instruction count: 1). Instruction is being skipped/ignored
The thread has been aborted, because the request has timed out.
A minidump was created in AppData/MiniDump
System.Threading.ThreadAbortException: Thread was being aborted.
at System.Collections.ArrayList.setCapacity(Int32 value)
at System.Collections.ArrayList.EnsureCapacity(Int32 min)
at System.Collections.ArrayList.Add(Object value)
at System.Xml.XmlNamedNodeMap.AddNode(XmlNode node)
at System.Xml.XmlAttributeCollection.InternalAppendAttribute(XmlAttribute node)
at System.Xml.XmlDocument.ImportAttributes(XmlNode fromElem, XmlNode toElem)
at System.Xml.XmlDocument.ImportNodeInternal(XmlNode node, Boolean deep)
at System.Xml.XmlDocument.ImportChildren(XmlNode fromNode, XmlNode toNode, Boolean deep)
at System.Xml.XmlDocument.ImportNodeInternal(XmlNode node, Boolean deep)
at System.Xml.XmlDocument.ImportChildren(XmlNode fromNode, XmlNode toNode, Boolean deep)
at System.Xml.XmlDocument.ImportNodeInternal(XmlNode node, Boolean deep)
at System.Xml.XmlDocument.ImportChildren(XmlNode fromNode, XmlNode toNode, Boolean deep)
at System.Xml.XmlDocument.ImportNodeInternal(XmlNode node, Boolean deep)
at System.Xml.XmlDocument.ImportChildren(XmlNode fromNode, XmlNode toNode, Boolean deep)
at System.Xml.XmlDocument.ImportNodeInternal(XmlNode node, Boolean deep)
at System.Xml.XmlDocument.ImportChildren(XmlNode fromNode, XmlNode toNode, Boolean deep)
at System.Xml.XmlDocument.ImportNodeInternal(XmlNode node, Boolean deep)
at System.Xml.XmlDocument.ImportChildren(XmlNode fromNode, XmlNode toNode, Boolean deep)
at System.Xml.XmlDocument.CloneNode(Boolean deep)
at umbraco.SafeXmlReaderWriter.Clone(XmlDocument xml)
at umbraco.SafeXmlReaderWriter.Get(IScopeProviderInternal scopeProvider, AsyncLock xmlLock, XmlDocument xml, Action`1 refresh, Action`2 apply, Boolean writer)
at umbraco.content.GetSafeXmlWriter()
at umbraco.content.UpdateDocumentCache(Document d)
at Umbraco.Web.Cache.PageCacheRefresher.Refresh(Int32 id)
at Umbraco.Core.Sync.DatabaseServerMessenger.RefreshByIds(Guid uniqueIdentifier, String jsonIds)
at Umbraco.Core.Sync.DatabaseServerMessenger.NotifyRefreshers(IEnumerable`1 instructions, HashSet`1 processed)
at Umbraco.Core.Sync.DatabaseServerMessenger.ProcessDatabaseInstructions(IReadOnlyCollection`1 instructionBatch, CacheInstructionDto dto, HashSet`1 processed, Int32& lastId)
Sounds like your situation was the same as ours. This was happening on our websites when a user does either of these:
The above actions will write an instruction with hundreds/thousands of IDs to refresh.
This will be fixed in v7.8.0: http://issues.umbraco.org/issue/U4-10150
You could either wait for the release or implement the fix on your website, basically making DatabaseServerMessenger run on the slave instances as a recurring task instead of being triggered by a page request.
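To illustrate the general idea (not the actual Umbraco change in U4-10150), here is a minimal, language-neutral sketch in Python of running a sync step on a recurring background thread instead of inside a page request. The class `RecurringSync` and its parameters are hypothetical names for illustration only. Because the work no longer runs inside a request, it is not subject to the request timeout that aborted the thread in the stack trace above.

```python
import threading
import time


class RecurringSync:
    """Runs `sync` every `interval` seconds on a background thread."""

    def __init__(self, sync, interval):
        self._sync = sync
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # wait() returns True once stop() has been called, which ends the loop.
        while not self._stop.wait(self._interval):
            try:
                self._sync()
            except Exception as exc:
                # One failed run must not kill the recurring task.
                print(f"sync failed: {exc}")


# Hypothetical usage: the lambda stands in for "process pending instructions".
calls = []
worker = RecurringSync(lambda: calls.append(time.monotonic()), interval=0.01)
worker.start()
time.sleep(0.2)
worker.stop()
print(f"sync ran {len(calls)} time(s)")
```

In a real site the equivalent would have to be written against the Umbraco and .NET APIs; the sketch only shows the scheduling pattern.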
Hi Leo,
Thanks for the info. I figured this was something in the works but I'm ashamed I couldn't find the direct link myself. In any case, we disabled the "publish with all subpages" option for now for normal users since it's not really a necessity, but it's good to have your suggestion at hand just in case.
Again, thanks for giving me peace of mind, this was starting to get to me.
Regards, Vlad A.