
  • iNETZO 134 posts 497 karma points c-trib
    Mar 04, 2020 @ 19:26

    Examine indexing issues and high CPU

    Hi,

    We're using Umbraco 8.2.2 for a large site. Every time we restart the application pool we have to reindex all Examine indexes, because we use Examine to find the latest news items that meet certain conditions.

    If we don't reindex manually after starting the site, this content does not show up. Sometimes the indexes seem to become corrupt, which causes high CPU usage. If we reindex again, performance returns to normal.

    Often (a few times a day) we get errors like these:

    [Error] Error indexing queue items
    System.ArgumentNullException: Value cannot be null.
    Parameter name: key
       at System.ThrowHelper.ThrowArgumentNullException(ExceptionArgument argument)
       at System.Collections.Generic.Dictionary`2.Remove(TKey key)
       at Umbraco.Examine.ValueSetValidator.Validate(ValueSet valueSet) in C:\Umbraco-CMS-8-8.2\src\Umbraco.Examine\ValueSetValidator.cs:line 85
       at Examine.Providers.BaseIndexProvider.<IndexItems>b__12_0(ValueSet x) in C:\projects\examine-qvx04\src\Examine\Providers\BaseIndexProvider.cs:line 76
       at System.Linq.Enumerable.WhereSelectListIterator`2.MoveNext()
       at Examine.LuceneEngine.Providers.LuceneIndex.ForceProcessQueueItems(Boolean block) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 879
    

    and these:

    [Error] Error indexing queue items
    System.IndexOutOfRangeException: Index was outside the bounds of the array.
       at System.Collections.Generic.Dictionary`2.CopyTo(KeyValuePair`2[] array, Int32 index)
       at Examine.LuceneEngine.Providers.LuceneIndex.CopyDictionary(IDictionary`2 d) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 654
       at Examine.LuceneEngine.Providers.LuceneIndex.AddDocument(Document doc, ValueSet valueSet, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 666
       at Examine.LuceneEngine.Providers.LuceneIndex.ProcessIndexQueueItem(IndexOperation op, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 1201
       at Examine.LuceneEngine.Providers.LuceneIndex.ProcessQueueItem(IndexOperation item, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 1020
       at Examine.LuceneEngine.Providers.LuceneIndex.ForceProcessQueueItems(Boolean block) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 879
    2020-03-04 20:10:33.635 +01:00 [Error] Error indexing queue items
    System.IndexOutOfRangeException: Index was outside the bounds of the array.
       at System.Collections.Generic.Dictionary`2.CopyTo(KeyValuePair`2[] array, Int32 index)
       at Examine.LuceneEngine.Providers.LuceneIndex.CopyDictionary(IDictionary`2 d) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 654
       at Examine.LuceneEngine.Providers.LuceneIndex.AddDocument(Document doc, ValueSet valueSet, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 666
       at Examine.LuceneEngine.Providers.LuceneIndex.ProcessIndexQueueItem(IndexOperation op, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 1201
       at Examine.LuceneEngine.Providers.LuceneIndex.ProcessQueueItem(IndexOperation item, IndexWriter writer) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 1020
       at Examine.LuceneEngine.Providers.LuceneIndex.ForceProcessQueueItems(Boolean block) in C:\projects\examine-qvx04\src\Examine\LuceneEngine\Providers\LuceneIndex.cs:line 879
    

    Unfortunately, the index name is not mentioned in the error message.

    Is there a way to find out which nodes might cause these indexing problems?

    Best regards,

    iNETZO

  • Alex Skrypnyk 6176 posts 24187 karma points MVP 8x admin c-trib
    Mar 05, 2020 @ 23:57

    Hi iNETZO

    Do you have some custom index events?

    Alex

  • Stefan Kip 1614 posts 4131 karma points c-trib
    May 27, 2021 @ 14:43

    We have the exact same issues with a site on v8.12.
    And yes, there's a custom index. Do you know what might be causing this @alex?

  • Shannon Deminick 1526 posts 5272 karma points MVP 3x
    May 27, 2021 @ 15:03

    Make sure you are using the latest examine version.

    "Every time we restart the application pool we have to reindex all Examine indexes because we're using Examine to find the latest news items that are meeting some conditions."

    This sounds bad. You shouldn't be rebuilding indexes all the time, since this is expensive. Also, Umbraco tries to manage index rebuilding too, so it's possible you are rebuilding indexes at the same time as Umbraco and/or too early during startup.

  • Stefan Kip 1614 posts 4131 karma points c-trib
    May 27, 2021 @ 15:08

    Thanks for the quick reply Shannon! We're on 1.2.0, does 1.2.1 contain important fixes regarding this?
    We're having the same issue as described here: https://github.com/umbraco/Umbraco-CMS/issues/8766
    We also see the exceptions posted by the topic starter. It must be related to the custom index we have, but what is wrong with it? 😬

  • Shannon Deminick 1526 posts 5272 karma points MVP 3x
    May 27, 2021 @ 15:33

    The fewer indexes the better. There is typically no reason to create extra indexes unless you are doing something extremely custom. Why don't the built-in ones suit your needs? Is this a custom index for custom data or Umbraco data? If it's Umbraco data then you should consider why you need an extra custom index in the first place.

    You might be affected by this issue https://github.com/umbraco/Umbraco-CMS/issues/8893#issuecomment-819480825 . There's an open PR that is being reviewed and worked on.

    But all that is an entirely different issue than the one described here.

    Are you also force rebuilding indexes on startup? It's hard to help without steps to replicate.

  • Stefan Kip 1614 posts 4131 karma points c-trib
    May 27, 2021 @ 15:40

    Well, I've been switching back and forth between a custom index and adding fields to the existing Umbraco members index.
    With the default members index I was running into this issue: https://our.umbraco.com/forum/using-umbraco-and-getting-started/105368-customize-examine-index-not-being-used-on-boot
    So that's why I went back to a custom index. It's also much faster to build than the default members index, I'm not sure why though.

    I'm not rebuilding on startup or similar, just a custom index. This is the complete code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Examine;
    using Examine.LuceneEngine;
    using Examine.Providers;
    using Umbraco.Core;
    using Umbraco.Core.Composing;
    using Umbraco.Core.Logging;
    using Umbraco.Examine;
    using Umbraco.Web.Search;
    
    namespace ClientName.Core.Components
    {
        public class ClientNameMembersIndexComposer : ComponentComposer<ClientNameMembersIndexComponent>, IUserComposer
        {
            public override void Compose(Composition composition)
            {
                composition.RegisterUnique<MemberIndexCreator>();
    
                base.Compose(composition);
            }
        }
    
        public class ClientNameMembersIndexComponent : IComponent
        {
            private readonly IExamineManager _examineManager;
            private readonly MemberIndexCreator _memberIndexCreator;
            private readonly ILogger _logger;
            private BaseIndexProvider _indexProvider;
    
            public ClientNameMembersIndexComponent(IExamineManager examineManager, MemberIndexCreator memberIndexCreator, ILogger logger)
            {
                _examineManager = examineManager;
                _memberIndexCreator = memberIndexCreator;
                _logger = logger;
            }
    
            public void Initialize()
            {
                foreach (var newIndex in _memberIndexCreator.Create())
                {
                    _examineManager.AddIndex(newIndex);
                }
    
                const string clientNameMembersIndexName = Models.Constants.ClientNameMembersIndexName;
                if (!_examineManager.TryGetIndex(clientNameMembersIndexName, out var index))
                {
                    _logger.Error<ClientNameMembersIndexComponent>($"Index {clientNameMembersIndexName} not found");
                    return;
                }
    
                //we need to cast because BaseIndexProvider contains the TransformingIndexValues event
                if (!(index is BaseIndexProvider indexProvider))
                    throw new InvalidOperationException("Could not cast");
    
                _indexProvider = indexProvider;
    
                _indexProvider.TransformingIndexValues += IndexProviderOnTransformingIndexValues;
            }
    
            public void Terminate()
            {
                if (_indexProvider != null)
                {
                    _indexProvider.TransformingIndexValues -= IndexProviderOnTransformingIndexValues;
                }
            }
    
            private static void IndexProviderOnTransformingIndexValues(object sender, IndexingItemEventArgs e)
            {
                if (e.ValueSet.Category != IndexTypes.Member || !e.ValueSet.Values.ContainsKey("zipcode")) return;
    
                var zipcodeField = e.ValueSet.Values.Single(x => x.Key == "zipcode");
                e.ValueSet.Set("zipcode", zipcodeField.Value.Single().ToString().Replace(" ", string.Empty));
            }
        }
    
        public class MemberIndexCreator : LuceneIndexCreator, IUmbracoIndexesCreator
        {
            private readonly IProfilingLogger _profilingLogger;
    
            public MemberIndexCreator(IProfilingLogger profilingLogger)
            {
                _profilingLogger = profilingLogger;
            }
    
            public override IEnumerable<IIndex> Create()
            {
                var fields = new List<string>
                {
                    "nodeName",
                    UmbracoExamineIndex.NodeKeyFieldName,
                    "email",
                    "city",
                    "zipcode",
                    "houseNumber",
                    "houseNumberAddition",
                    "residenceType"
                };
    
                var index = new UmbracoMemberIndex(Models.Constants.ClientNameMembersIndexName,
                    new UmbracoFieldDefinitionCollection(),
                    CreateFileSystemLuceneDirectory("ClientNameMembers"),
                    new CultureInvariantWhitespaceAnalyzer(),
                    _profilingLogger,
                    new MemberValueSetValidator(new[] { "Member" }, null, fields, null));
    
                index.FieldDefinitionCollection.TryAdd(new FieldDefinition("houseNumber", FieldDefinitionTypes.Integer));
    
                return new[] { index };
            }
        }
    }
    
  • Shannon Deminick 1526 posts 5272 karma points MVP 3x
    May 27, 2021 @ 16:09

    "It's also much faster to build than the default members index, I'm not sure why though."

    Hrm, not too sure about this. When indexes are rebuilt and populated, they are all populated from the same populator. So when the member index is rebuilt / populated - your index is populated at the same time with the same data. This is probably why it seems "much" faster. But it's occurring in parallel with the same data. That is how populators work in v8 so that the same data doesn't have to keep being looked up to populate indexes.

    The CopyDictionary issue listed here is possibly caused by multiple indexes using the same values (i.e. 2x member indexes), with one of them transforming values, so that the same ValueSet is modified while it is being indexed. There's a note about this in Examine: https://github.com/Shazwazza/Examine/blob/1e2fd060b71767aae7fa9f5cc8f67129b79ff5c7/src/Examine/LuceneEngine/Providers/LuceneIndex.cs#L751 which should be resolved one way or another ... but I think that fix should be in Umbraco, to ensure that the ValueSet passed to each index during population is not the same reference.

    In any case, I don't think you'd have that problem if you don't have a duplicated index using the same values.
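
    To make the shared-reference problem concrete, here is a minimal sketch in plain C#. It deliberately does not use Examine's or Umbraco's types: `ValueCopy`, `DeepCopy` and `WithoutZipcodeSpaces` are hypothetical names, and a plain `Dictionary<string, List<object>>` stands in for the values a ValueSet carries. The idea is only that each consumer works on its own copy, so a transformation meant for one index cannot change data that another index is reading.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of Examine or Umbraco.
public static class ValueCopy
{
    // Builds a new dictionary with a new list per key, so edits to the copy
    // never touch the source dictionary or its lists.
    public static Dictionary<string, List<object>> DeepCopy(
        IDictionary<string, List<object>> source)
    {
        var copy = new Dictionary<string, List<object>>(source.Count);
        foreach (var pair in source)
        {
            copy[pair.Key] = new List<object>(pair.Value);
        }
        return copy;
    }

    // Mirrors the zipcode clean-up from the handler in this thread,
    // but applies it to a private copy instead of the shared values.
    public static Dictionary<string, List<object>> WithoutZipcodeSpaces(
        IDictionary<string, List<object>> shared)
    {
        var own = DeepCopy(shared);
        if (own.TryGetValue("zipcode", out var values)
            && values.Count == 1
            && values[0] != null)
        {
            own["zipcode"] = new List<object>
            {
                values[0].ToString().Replace(" ", string.Empty)
            };
        }
        return own;
    }
}
```

    Calling `ValueCopy.WithoutZipcodeSpaces(shared)` returns a cleaned copy and leaves `shared` untouched. Note the limit of this sketch: the copy has to be made before the values are handed to anything running concurrently. Copying a dictionary while another thread is still writing to it would still race, which is why the fix arguably belongs where the values are produced rather than in an event handler.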
