  • Kieron McIntyre 116 posts 359 karma points
    Feb 14, 2012 @ 14:18

    Examine Azure: There is no Lucene index in the folder

    I have a multi-territorial, multi-language site using Azure Accelerator (with multiple instances) and also using Examine Azure. Unfortunately I cannot get the search to work consistently. I have one search index per territory, so 15 indexes. The Examine configuration is correct, and indexes are being populated - but not consistently. 

    I am using Luke to view the indexes offline and I can see that often the indexes are missing files, e.g. when opening one index I got the error:

    D:\Index\segments_6z (The system cannot find the file specified)

    Ultimately, on the published website I am getting the following exception, which ties in with the error above:

     

    [WebException: The remote server returned an error: (404) Not Found.]
       System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult) +3230441
       Microsoft.WindowsAzure.StorageClient.EventHelper.ProcessWebResponse(WebRequest req, IAsyncResult asyncResult, EventHandler`1 handler, Object sender) +80
    
    [StorageClientException: The specified blob does not exist.]
       Microsoft.WindowsAzure.StorageClient.Tasks.Task`1.get_Result() +96
       Microsoft.WindowsAzure.StorageClient.Tasks.Task`1.ExecuteAndWait() +271
       Microsoft.WindowsAzure.StorageClient.CloudBlob.FetchAttributes(BlobRequestOptions options) +213
       Lucene.Net.Store.Azure.AzureDirectory.OpenInput(String name) +66
    
    [FileNotFoundException: _5_1.del]
       Lucene.Net.Index.FindSegmentsFile.Run(IndexCommit commit) +1661
       Lucene.Net.Search.IndexSearcher..ctor(Directory path, Boolean readOnly) +74
       Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen) +197
    
    [ApplicationException: There is no Lucene index in the folder: C:\Resources\directory\d5ab035dd0d34185b2a19a34f9ff83f2.WebRole.Sites\xxx.com\umbraco\webservices\Examine-Ui-Content-Deutschland\Index to create an IndexSearcher on]
       Examine.LuceneEngine.Providers.LuceneSearcher.ValidateSearcher(Boolean forceReopen) +909
       Examine.LuceneEngine.Providers.LuceneSearcher.GetSearchFields() +19
       UmbracoExamine.UmbracoExamineSearcher.GetSearchFields() +11
       UmbracoExamine.UmbracoExamineSearcher.CreateSearchCriteria(String type, BooleanOperation defaultOperation) +38
       PPG.ApplicationServices.Search.SiteSearchService.ListResults(SearchConfig config, SiteSearchCriteria criteria, Int32& totalItemCount) +162
       usercontrols_search_ResultList.Page_Load(Object sender, EventArgs e) in c:\Resources\Directory\d5ab035dd0d34185b2a19a34f9ff83f2.WebRole.Sites\xxx.com\usercontrols\search\ResultList.ascx.cs:32
       System.Web.Util.CalliHelper.EventArgFunctionCaller(IntPtr fp, Object o, Object t, EventArgs e) +25
       System.Web.UI.Control.LoadRecursive() +71
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Control.LoadRecursive() +190
       System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +3064

     

    The thing that is really confusing me is the line "There is no Lucene index in the folder ...". The index is not stored at this address (indexes are stored as containers within blob storage).

    Can anyone help?

  • Shannon Deminick 1525 posts 5271 karma points MVP 2x
    Feb 16, 2012 @ 18:23

    Hi Kieron,

    We store any errors encountered by Examine on Azure in another blob storage folder as txt files (just like the Accelerator does). You might be able to find some information in there.

    I've not actually seen this issue, but I personally haven't used Azure as extensively as you have. The error looks like it's coming from a 3rd party library that Examine on Azure uses, which is found here:

    http://code.msdn.microsoft.com/windowsazure/Azure-Library-for-83562538

    The error looks to be coming from this call:

    Lucene.Net.Store.Azure.AzureDirectory.OpenInput(String name) +66

    The error message from Examine should of course be changed, since it's not a folder; the message text just wasn't updated for use on Azure. It should simply say: No Lucene index found. The actual error that is causing that message is "The specified blob does not exist", which is coming from that 3rd party library.

    This error could be caused by Examine doing something it shouldn't with this 3rd party library, by the 3rd party library itself having issues, or by something going on in Azure... I'm definitely no Azure expert, so unfortunately I can't say off the top of my head what that might be.

    Perhaps what is occurring is an issue with the implementation of the 3rd party AzureDirectory library. Its documentation makes two points that contradict each other. One is:

    "To be more concrete: you can have 1..N worker roles adding documents to an index, and 1..N searcher webroles searching over the catalog in near real time."

    which, when I was writing this provider, I took to mean that you can have 1..N web nodes writing to the AzureDirectory. But then near the bottom it states:

    "The index can only be updated by one process at a time, so it makes sense to push all Add/Update/Delete operations through an indexing role. The obvious way to do that is to have an Azure queue which feeds a stream of objects to be indexed to a worker role which maintains updating the index. "

    Again, I'm no Azure expert, but I'm assuming that means to 'push' all operations to a single node? (Though having a single point of failure defeats the purpose of Azure.)
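
    As a rough illustration of the "indexing role" pattern that quote describes, here is a minimal sketch in Python. It is not Examine or AzureDirectory code: an in-process queue stands in for the Azure queue, a dict stands in for the Lucene index, and the name run_single_writer is made up for the example. The point is only that many producers can request changes while exactly one writer applies them.

```python
import queue
import threading


def run_single_writer(operations_by_node):
    """Funnel add/delete requests from several producers through one writer.

    operations_by_node: one list per (hypothetical) web node, each holding
    tuples of the form (op, doc_id, body) where op is "add" or "delete".
    Returns the final state of the stand-in index.
    """
    work = queue.Queue()   # stand-in for the Azure queue
    index = {}             # stand-in for the Lucene index: doc id -> body

    def writer():
        # The only thread that ever mutates `index`.
        while True:
            item = work.get()
            if item is None:   # sentinel: no more work will arrive
                break
            op, doc_id, body = item
            if op == "add":
                index[doc_id] = body
            elif op == "delete":
                index.pop(doc_id, None)

    def producer(ops):
        # A web node never touches the index; it only enqueues requests.
        for op in ops:
            work.put(op)

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()

    producers = [
        threading.Thread(target=producer, args=(ops,))
        for ops in operations_by_node
    ]
    for thread in producers:
        thread.start()
    for thread in producers:
        thread.join()

    work.put(None)         # all producers are done, so stop the writer
    writer_thread.join()
    return index
```

    The trade-off is the one mentioned above: the single writer avoids concurrent updates, but it also becomes the one component that everything depends on.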

    This provider assumes that multiple web nodes can write to the AzureDirectory.

    Another possibility is that this is a caching issue. The AzureDirectory doesn't write directly to blob storage; it writes to a local cache and then pushes its updates to blob storage. So what might be occurring is, for example: web node 'A' creates the index and writes some documents to it, and before web node 'A' pushes its changes to blob storage, web node 'B' tries to search the index. Web node 'B' doesn't have the index in its local cache, so it goes to retrieve it from blob storage but can't find it yet.
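
    To make that timing window concrete, here is a small Python model of it. Everything in it is an assumption for illustration: BlobStore and CachedDirectory are made-up classes that mimic "write to a local cache, push to shared storage later", and they are not the AzureDirectory implementation.

```python
class BlobStore:
    """Stand-in for the shared blob container."""

    def __init__(self):
        self.blobs = {}


class CachedDirectory:
    """Hypothetical directory that caches locally and pushes to the store later."""

    def __init__(self, store):
        self.store = store
        self.local = {}

    def write(self, name, data):
        # The write lands in this node's local cache only.
        self.local[name] = data

    def flush(self):
        # Only now do other nodes get a chance to see the files.
        self.store.blobs.update(self.local)

    def open_input(self, name):
        # Serve from the local cache when possible.
        if name in self.local:
            return self.local[name]
        # Otherwise fall back to shared storage, which may not have it yet.
        if name not in self.store.blobs:
            raise FileNotFoundError(name)
        data = self.store.blobs[name]
        self.local[name] = data
        return data
```

    With two CachedDirectory objects sharing one BlobStore, a write on node 'A' followed by open_input on node 'B' raises FileNotFoundError until node 'A' calls flush, which is the same shape as the "specified blob does not exist" failure in the stack trace.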

    The error "There is no Lucene index in the folder" is thrown in the ValidateSearcher method in Examine's LuceneSearcher class. At the beginning of this method a call is made to EnsureIndex, which ensures that the index actually exists. Since no error is thrown during that operation, we can assume that the index gets created. When this occurs a call is made to GetLuceneDirectory(), for which the Azure provider always returns a new AzureDirectory instance... this could be the issue. So EnsureIndex is called on web node 'B' and it creates the index on itself, but then further on in the ValidateSearcher method another call is made to GetLuceneDirectory(), which returns a new instance of AzureDirectory... but this time it's creating a searcher and might go and look directly in blob storage instead of at the index created in memory, since it's a new instance of AzureDirectory.

    In theory, the AzureLuceneSearcher should only be returning the same instance of AzureDirectory anyway (AzureLuceneIndexer already adheres to this)... so this could potentially be your issue. In any case, I've made this change in the codebase, so you can download the source from the 'default' branch and compile it with the change. Please let me know whether or not this fixes the issue and we can go from there.
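
    The difference between "a new directory instance on every call" and "one shared instance" can be shown with a tiny Python model. This is only a sketch of the idea, not the actual change in the Examine source; InMemoryDirectory, NewInstancePerCall, SharedInstance and ensure_then_search are all invented names.

```python
class InMemoryDirectory:
    """Stand-in for a directory whose state lives only in this object."""

    def __init__(self):
        self.files = set()


class NewInstancePerCall:
    """Models a provider that builds a fresh directory on every request."""

    def get_directory(self):
        return InMemoryDirectory()


class SharedInstance:
    """Models a provider that creates the directory once and reuses it."""

    def __init__(self):
        self._directory = None

    def get_directory(self):
        if self._directory is None:
            self._directory = InMemoryDirectory()
        return self._directory


def ensure_then_search(provider):
    """Create the index via one call, then look for it via a second call."""
    # Step 1: the "ensure index" step writes a segments file.
    provider.get_directory().files.add("segments_1")
    # Step 2: the "create searcher" step asks for the directory again.
    return "segments_1" in provider.get_directory().files
```

    In this model ensure_then_search returns False for NewInstancePerCall, because the second call gets an empty directory, and True for SharedInstance, because both steps see the same object.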

    Sorry for the lengthy reply :P

     

     

     

  • Kieron McIntyre 116 posts 359 karma points
    Feb 16, 2012 @ 21:15

    Wow, thanks for the reply! I did find a solution, but as with many things it came from simplifying the picture a little.

    Within Umbraco I have multiple territories (each a root node), each with multiple language sites. Each territory had its own Lucene index. I have an Umbraco event that (upon indexing of a published node) adds the language site's node id to the indexed document. This meant that I could filter all results by language and each territory could have its own separate index.

    Purely as an exercise in simplifying the configuration, I made all territories use the same index. This was simply to test that the configuration was correct. By doing this the problems seemed to disappear, and when I thought about it, each territory didn't actually need its own index. It was just me trying to de-couple elements.

    Of course this doesn't explain why the problem occurred, but I will look into it when I get a chance.
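
    For anyone wanting to picture the shared-index setup, here is a language-neutral sketch in Python. It is not the actual Umbraco event handler or an Examine query: the field names languageSiteId and bodyText, and both function names, are made up for the example. It only shows the idea of tagging each document at index time and filtering on that tag at search time.

```python
def tag_with_language(document, language_site_id):
    """Return a copy of the document with the language site's node id added.

    This mirrors what the indexing event does: every document put into the
    shared index carries the id of the language site it belongs to.
    """
    tagged = dict(document)
    tagged["languageSiteId"] = str(language_site_id)
    return tagged


def search(index, term, language_site_id):
    """Return documents from one shared index that match both term and language."""
    wanted = str(language_site_id)
    needle = term.lower()
    return [
        doc
        for doc in index
        if doc.get("languageSiteId") == wanted
        and needle in doc.get("bodyText", "").lower()
    ]
```

    With this in place, all territories can share one list of documents, and a search for one language site simply never sees documents tagged with another site's id.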

    Thanks for the help.

  • Shannon Deminick 1525 posts 5271 karma points MVP 2x
    Feb 17, 2012 @ 02:56

    Great, very glad you got it working in the end, though I do still think that the fix I pushed up last night is a required change, so I'll be keeping it moving forward. If you run into more issues similar to this one, please try the latest codebase on the default branch, as it may fix them.

    Cheers!

  • Kieron McIntyre 116 posts 359 karma points
    Feb 17, 2012 @ 15:51

    Thanks, I'll download the code and get up and running! Thanks for your help.

  • Kieron McIntyre 116 posts 359 karma points
    Feb 23, 2012 @ 11:37

    Just an update on this: the issue did not go away. To clarify, in addition to the above issue I was getting a lot of the following errors logged:

    [UmbracoExamine] (InternalIndexer)Cannot index queue items, the index is currently locked,, IndexSet: InternalIndexSet. NodeId: -1
    [UmbracoExamine] (InternalIndexer)Cannot index queue items, the index is currently locked. NodeId: -1
    [UmbracoExamine] (InternalMemberIndexer)Cannot index queue items, the index doesn't exist!,, IndexSet: InternalMemberIndexSet. NodeId: -1

     

    This is the situation thus far:

    I am using the Umbraco Accelerator with multiple instances of Umbraco on Azure and each instance was using SQL Session State. I upgraded the session state to use Azure's AppFabric and the problem seemed to occur much, much less often, but obviously once is too often.

    I changed the Accelerator configuration so it only ran one instance of Umbraco and now the problem has disappeared completely.

    I can only deduce that Umbraco.Azure was tripping over because multiple instances of Umbraco were all trying to update the same shared session-state data while a thread was using the session state and trying to write to the index. In other words, only one thread can write to the index, but if the session state changes whilst the index is being written to, could that be causing the index to become corrupted?

    Either way, the solution seems to be to run a single instance of Umbraco, but that makes it almost redundant to use the Accelerator.

    Any further help would be greatly appreciated as I will have to revisit this :)

  • Lennart Stoop 304 posts 842 karma points
    Feb 23, 2012 @ 11:58

    Hi Kieron,

    I don't think SQL session state is used in the indexing process, as indexes are stored in blob storage and the local cache Shannon refers to is held in the application's memory. Shannon does make some good arguments about what is potentially causing the issue.

    Did you manage to upgrade Examine with the latest code changes Shannon provided?

    http://examine.codeplex.com/SourceControl/changeset/changes/a4bed45c97d2

     

    Grtz

    L

  • Kieron McIntyre 116 posts 359 karma points
    Feb 23, 2012 @ 12:56

    Hi Lennart,

    Yes, I did upgrade to the latest code changes. OK, I misunderstood (or misread) and thought that Examine.Azure used the session; it actually makes sense that it wouldn't.
