Urgent: load balancing on Umbraco 7, Examine index corruption issues
Hi
This relates to a new site we pushed live at the weekend for a major train company.
They are running a load balanced setup with DFS replication from web2 to web1 (one-way push). The Temp folder is excluded from replication, and content is now pushed over to web1 properly.
However, overnight web2 displayed errors about corrupt Examine indexes. An excerpt of the error is shown below.
This one seems to be related to Media (indeed it's the slideshow code that has failed) using TypedMedia. We've seen this type of error before and ended up not using TypedMedia for this very reason; however, I'm concerned that this is happening more often, and on quite a large and busy website.
Any ideas as to:
a) what causes the examine media or other indexes to get corrupted?
b) how we can handle this so if it happens, the indexes self rebuild?
c) stop it happening again
Obviously this is very urgent for us at the moment, any help would be appreciated.
Regards
Simon
2014-11-12 13:04:15,400 [46] ERROR Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache - [Thread 7] Could not load data from Examine index for media
System.IO.FileNotFoundException: Could not find file 'D:\WebRoot\App_Data\TEMP\ExamineIndexes\Internal\Index\_25.cfs'.
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
at Lucene.Net.Index.DirectoryReader.DoReopenNoWriter(Boolean openReadOnly, IndexCommit commit)
at Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen)
at Examine.LuceneEngine.Providers.LuceneSearcher.GetSearchFields()
at UmbracoExamine.UmbracoExamineSearcher.GetSearchFields()
at UmbracoExamine.UmbracoExamineSearcher.CreateSearchCriteria(String type, BooleanOperation defaultOperation)
at Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache.GetUmbracoMedia(Int32 id)
2014-11-13 09:21:55,226 [7] ERROR Umbraco.Core.UmbracoApplicationBase - [Thread 29] An unhandled exception occurred
System.ApplicationException: Could not create an index searcher with the supplied lucene directory ---> System.IO.FileNotFoundException: no segments* file found in Lucene.Net.Store.SimpleFSDirectory@D:\WebRoot\App_Data\TEMP\ExamineIndexes\Internal\Index lockFactory=Lucene.Net.Store.NativeFSLockFactory: files:
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
at Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen)
--- End of inner exception stack trace ---
at Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen)
at Examine.LuceneEngine.Providers.LuceneSearcher.GetSearchFields()
at UmbracoExamine.UmbracoExamineSearcher.GetSearchFields()
at UmbracoExamine.UmbracoExamineSearcher.CreateSearchCriteria(String type, BooleanOperation defaultOperation)
at Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache.GetUmbracoMedia(Int32 id)
at Umbraco.Web.PublishedCache.ContextualPublishedCache`1.GetById(Boolean preview, Int32 contentId)
at Umbraco.Web.PublishedContentQuery.TypedDocumentById(Int32 id, ContextualPublishedCache cache)
at ASP._Page_Views_Partials_Homepage_Slideshow_cshtml.Execute() in D:\WebRoot\Views\Partials\Homepage-Slideshow.cshtml:line 12
at System.Web.WebPages.WebPageBase.ExecutePageHierarchy()
at System.Web.Mvc.WebViewPage.ExecutePageHierarchy()
at System.Web.WebPages.WebPageBase.ExecutePageHierarchy(WebPageContext pageContext, TextWriter writer, WebPageRenderingBase startPage)
at Umbraco.Core.Profiling.ProfilingView.Render(ViewContext viewContext, TextWriter writer)
at System.Web.Mvc.Html.PartialExtensions.Partial(HtmlHelper htmlHelper, String partialViewName, Object model, ViewDataDictionary viewData)
at ASP._Page_Views_Home_cshtml.Execute() in D:\WebRoot\Views\Home.cshtml:line 6
at System.Web.WebPages.WebPageBase.ExecutePageHierarchy()
at System.Web.Mvc.WebViewPage.ExecutePageHierarchy()
at System.Web.WebPages.WebPageBase.ExecutePageHierarchy(WebPageContext pageContext, TextWriter writer, WebPageRenderingBase startPage)
at Umbraco.Core.Profiling.ProfilingView.Render(ViewContext viewContext, TextWriter writer)
at System.Web.Mvc.ViewResultBase.ExecuteResult(ControllerContext context)
at System.Web.Mvc.ControllerActionInvoker.c__DisplayClass1a.b__17()
at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultFilter(IResultFilter filter, ResultExecutingContext preContext, Func`1 continuation)
at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultFilter(IResultFilter filter, ResultExecutingContext preContext, Func`1 continuation)
at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultWithFilters(ControllerContext controllerContext, IList`1 filters, ActionResult actionResult)
at System.Web.Mvc.Async.AsyncControllerActionInvoker.c__DisplayClass25.b__22(IAsyncResult asyncResult)
at System.Web.Mvc.Controller.c__DisplayClass1d.b__18(IAsyncResult asyncResult)
at System.Web.Mvc.Async.AsyncResultWrapper.c__DisplayClass4.b__3(IAsyncResult ar)
at System.Web.Mvc.Controller.EndExecuteCore(IAsyncResult asyncResult)
at System.Web.Mvc.Async.AsyncResultWrapper.c__DisplayClass4.b__3(IAsyncResult ar)
at System.Web.Mvc.MvcHandler.c__DisplayClass8.b__3(IAsyncResult asyncResult)
at System.Web.Mvc.Async.AsyncResultWrapper.c__DisplayClass4.b__3(IAsyncResult ar)
at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
Just experienced this issue on a 7.2.1 site that is not running in a load balanced environment. The error also originates from a line accessing a media item, though this one uses Umbraco.Media("id").
This is very disturbing, since the page permanently dies with a YSOD once the Lucene index file is gone.
Did you ever find a solution, or a reported issue, for this error?
Jesper
Nope, afraid not, but the suspicion was that the host did not set up the load balancing replication correctly. Once we set it up again, we did not get this error.
Si
Getting these errors on a site we just pushed on version 7.2.4, not using load balancing (running on AWS). useTempStorage is "true" in the Examine config.
MESSAGE:
Could not load data from Examine index for media
System.IO.FileNotFoundException: Could not find file 'D:\local\Temporary ASP.NET Files\root\23ac92a0\b8b0329b\App_Data\TEMP\ExamineIndexes\Internal\segments_e69'.
File name: 'D:\local\Temporary ASP.NET Files\root\23ac92a0\b8b0329b\App_Data\TEMP\ExamineIndexes\Internal\segments_e69'
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
at Lucene.Net.Index.SegmentInfos.Read(Directory directory)
at Lucene.Net.Index.DirectoryReader.IsCurrent()
at Examine.LuceneEngine.LuceneExtensions.GetReaderStatus(IndexReader reader)
at Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen)
at UmbracoExamine.UmbracoExamineSearcher.GetSearchFields()
at UmbracoExamine.UmbracoExamineSearcher.CreateSearchCriteria(String type, BooleanOperation defaultOperation)
at Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache.GetUmbracoMedia(Int32 id)
MESSAGE:
Provider=InternalIndexer, NodeId=-1
System.Exception: Error indexing queue items,Could not find file 'D:\local\Temporary ASP.NET Files\root\23ac92a0\b8b0329b\App_Data\TEMP\ExamineIndexes\Internal\_e8s.fnm'., IndexSet: InternalIndexSet
Just got this error this morning after we noticed the client's site was throwing a server error. The message was logged 7892 times since yesterday.
System.ApplicationException: Could not create an index searcher with the supplied lucene directory ---> Lucene.Net.Index.CorruptIndexException: doc counts differ for segment _e8u: fieldsReader shows 2 but segmentInfo shows 1
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
at Lucene.Net.Index.DirectoryReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, IndexCommit commit, Boolean readOnly, Int32 termInfosIndexDivisor)
at Lucene.Net.Index.IndexReader.Open(Directory directory, IndexDeletionPolicy deletionPolicy, Boolean readOnly)
at Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen)
--- End of inner exception stack trace ---
at Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen)
at UmbracoExamine.UmbracoExamineSearcher.GetSearchFields()
at UmbracoExamine.UmbracoExamineSearcher.CreateSearchCriteria(String type, BooleanOperation defaultOperation)
at Umbraco.Web.PublishedCache.XmlPublishedCache.PublishedMediaCache.GetUmbracoMedia(Int32 id)
at Umbraco.Web.PublishedCache.ContextualPublishedCache`1.GetById(Boolean preview, Int32 contentId)
at Umbraco.Web.PublishedContentQuery.TypedDocumentById(Int32 id, ContextualPublishedCache cache)
at Umbraco.Web.UmbracoHelper.TypedMedia(Object id)
at Our.Umbraco.PropertyConverters.MediaPickerPropertyConverter.ConvertSourceToObject(PublishedPropertyType propertyType, Object source, Boolean preview)
at System.Lazy`1.CreateValue()
at System.Lazy`1.LazyInitValue()
at Umbraco.Web.PublishedCache.XmlPublishedCache.XmlPublishedProperty.get_Value()
at Umbraco.Web.PublishedPropertyExtension.GetValue[T](IPublishedProperty property, Boolean withDefaultValue, T defaultValue)
at Umbraco.Web.PublishedContentExtensions.GetPropertyValue[T](IPublishedContent content, String alias, Boolean recurse)
at MMGY.Common.Umbraco.Mappers.MMGYMapper.MapImage(IUmbracoMapper mapper, IPub
I rebuilt/reindexed and now the site's up again. This has to be related to media indexing, but I'm not familiar enough with Lucene to know where to start.
Setting useTempStorage="LocalOnly" (was "Sync") seems to have resolved the issue. When using the "Sync" option, this appears to happen when AWS spins up a new instance. Thoughts? I assumed that the current Examine indexes would be copied/synced to local temp storage from the existing indexes in ~/App_Data/Temp...
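For anyone reading along, here is a rough sketch of where that attribute usually sits. It assumes a stock Umbraco 7 ~/config/ExamineSettings.config; the provider names, types and other attributes shown are what a default install typically contains and may differ in your file, so treat them as assumptions. Only the useTempStorage value is the change being discussed, and it is normally set on both the indexer and the matching searcher.

```xml
<!-- Fragment of config/ExamineSettings.config (assumed default layout, other providers omitted) -->
<ExamineIndexProviders>
  <providers>
    <!-- useTempStorage was "Sync"; "LocalOnly" is the value reported to work above -->
    <add name="InternalIndexer"
         type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
         supportUnpublished="true"
         supportProtected="true"
         useTempStorage="LocalOnly"
         analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" />
  </providers>
</ExamineIndexProviders>

<ExamineSearchProviders defaultProvider="ExternalSearcher">
  <providers>
    <!-- keep the searcher on the same setting as its indexer -->
    <add name="InternalSearcher"
         type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
         useTempStorage="LocalOnly"
         analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" />
  </providers>
</ExamineSearchProviders>
```

Changing this file recycles the app, and the index will most likely need a rebuild afterwards, as described in the post above.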
@Daniel, if you are not load balancing, why do you have a useTempStorage option in the first place?
I'm assuming you are not using https://github.com/Shazwazza/UmbracoExamine.TempStorage and are just using this option now that it is part of the core?
You are correct that Examine will copy the indexes from the existing ones to local temp storage, and then whenever indexes are written it writes to both locations. Unless AWS uses remote file stores, this option is irrelevant for you. If this is reproducible all of the time then I will investigate; it might be an AWS thing.
@Simon, without knowing the cause of the issue it's pretty hard to answer your questions. When it comes to load balancing there are so many moving parts, and everyone sets up their environment differently in one way or another. The error you are getting would indicate that you have multiple servers, writers or threads writing to the same index. If you can replicate the issue, please report how.
Hi Shannon - thanks, but we sorted it by setting up DFS properly (the host had it configured wrong). I did update the post on the 5th, but forgot to mark it as solved, doh!
Just happened across this on an Umbraco 7.1.8 site that isn't load balanced (it's running in a VM on AWS).