
  • Alastair Todd 44 posts 142 karma points
    Jul 09, 2016 @ 07:07

    Examine Index not reindexing well

    I am importing documents via the Content API. It takes hours, but when it finishes the Examine index contains all of the documents (I am checking with Luke).

    When I then rebuild the index, only about 20% of the documents make it in.

    No error in the logs.

    How do I debug that?

    I do have a GatheringNodeData event handler, which I've commented out for now.

    This is the IndexSet config:

    <IndexSet SetName="PostsIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/{machinename}/Posts/">
        <IndexAttributeFields>
          <add Name="id" />
          <add Name="nodeName" />
          <add Name="writerName" />
          <add Name="parentID" />
          <add Name="urlName" />
        </IndexAttributeFields>
        <IndexUserFields>
          <add Name="postImage" />
          <add Name="postImageUrl" />
          <add Name="lowResImageUrl" />
          <add Name="postDate" EnableSorting="true" Type="DateTime" />
          <add Name="postDateFormatted" />
          <add Name="title" />
          <add Name="categories" />
          <add Name="primaryCategoryName" />
          <add Name="primaryCategorySlug" />
          <add Name="displayCategories" />
          <add Name="searchableCategories" />
          <add Name="rawHtml" />
          <add Name="rawVideoHtml" />
          <add Name="html" />
          <add Name="videoHtml" />
          <add Name="videoUrl" />
          <add Name="postType" />
          <add Name="youtubeId" />
          <add Name="metaDescription" />
          <add Name="isPublished" Type="Boolean" />
          <add Name="creatorId" />
        </IndexUserFields>
        <IncludeNodeTypes>
          <add Name="post"/>
        </IncludeNodeTypes>
    </IndexSet>
    

    That configuration produces 909 documents in the index. What is strange, and I only found this through sheer trial and error, is that if I remove the IncludeNodeTypes element I get 12,534 documents, more than 6,000 of which have a "__NodeTypeAlias" of "post" (according to Luke).

    I can probably work with that, since I can filter on posts in my queries, but still...

    Any ideas?

  • Alastair Todd 44 posts 142 karma points
    Jul 09, 2016 @ 07:34

    The index was set up to support unpublished content.

    The bad mix seems to be that setting combined with specifying IncludeNodeTypes.

    Are the two mutually exclusive?
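
    For anyone reading later: the unpublished-content setting lives on the index provider in ExamineSettings.config, not on the IndexSet itself. Below is a rough sketch of the kind of provider entry I mean. The provider name PostsIndexer is made up for illustration and the attribute values are assumptions, not a copy of my actual config; only the set name PostsIndexSet comes from the config above.

    ```xml
    <!-- ExamineSettings.config (sketch; "PostsIndexer" is a hypothetical name) -->
    <Examine>
      <ExamineIndexProviders>
        <providers>
          <!-- supportUnpublished="true" asks the indexer to include unpublished content.
               Assumed here to be the setting that clashes with IncludeNodeTypes. -->
          <add name="PostsIndexer"
               type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
               indexSet="PostsIndexSet"
               supportUnpublished="true"
               supportProtected="true" />
        </providers>
      </ExamineIndexProviders>
    </Examine>
    ```

    With a provider like this, the workaround is to leave IncludeNodeTypes out of the set and filter on the node type alias at query time. Whether the two settings are genuinely incompatible is something I have not confirmed.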
