  • Alastair Todd 44 posts 142 karma points
    Jul 09, 2016 @ 07:07

    Examine Index not reindexing well

    I am importing documents via the content API. It takes hours, but when it finishes the Examine index contains all the documents (I am checking with Luke).

    When I then try to reindex, only about 20% of the documents make it into the index.

    There are no errors in the logs.

    How do I debug that?

    I do have a GatheringNodeData event handler, which I've commented out for now.

    This is the IndexSet config:

    <IndexSet SetName="PostsIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/{machinename}/Posts/">
        <IndexAttributeFields>
          <add Name="id" />
          <add Name="nodeName" />
          <add Name="writerName" />
          <add Name="parentID" />
          <add Name="urlName" />
        </IndexAttributeFields>
        <IndexUserFields>
          <add Name="postImage" />
          <add Name="postImageUrl" />
          <add Name="lowResImageUrl" />
          <add Name="postDate" EnableSorting="true" Type="DateTime" />
          <add Name="postDateFormatted" />
          <add Name="title" />
          <add Name="categories" />
          <add Name="primaryCategoryName" />
          <add Name="primaryCategorySlug" />
          <add Name="displayCategories" />
          <add Name="searchableCategories" />
          <add Name="rawHtml" />
          <add Name="rawVideoHtml" />
          <add Name="html" />
          <add Name="videoHtml" />
          <add Name="videoUrl" />
          <add Name="postType" />
          <add Name="youtubeId" />
          <add Name="metaDescription" />
          <add Name="isPublished" Type="Boolean" />
          <add Name="creatorId" />
        </IndexUserFields>
        <IncludeNodeTypes>
          <add Name="post"/>
        </IncludeNodeTypes>
      </IndexSet>
    

    That generates 909 documents in the index. What is strange is that, through sheer trial and error, I found that if I remove the IncludeNodeTypes node I get 12534 documents, 6000+ of which have "_NodeTypeAlias" set to "post" (according to Luke).

    I can probably work with that, since I can filter for posts at query time, but still...

    Any ideas?

  • Alastair Todd 44 posts 142 karma points
    Jul 09, 2016 @ 07:34

    I had support for unpublished content enabled.

    The bad mix seems to be that combined with specifying IncludeNodeTypes.

    Are the two mutually exclusive?
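
    For reference, a sketch of the workaround described above, assuming support for unpublished content stays enabled: the same IndexSet with the IncludeNodeTypes element left out, so every node type is indexed and posts are filtered at query time instead. The field lists are shortened here for illustration. This is not a confirmed fix, only the shape of the configuration that produced the full document count in this thread.

    <IndexSet SetName="PostsIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/{machinename}/Posts/">
        <IndexAttributeFields>
          <add Name="id" />
          <add Name="nodeName" />
          <!-- remaining attribute fields unchanged from the original config -->
        </IndexAttributeFields>
        <IndexUserFields>
          <add Name="postDate" EnableSorting="true" Type="DateTime" />
          <add Name="title" />
          <!-- remaining user fields unchanged from the original config -->
        </IndexUserFields>
        <!-- IncludeNodeTypes omitted: all node types are indexed,
             so queries need to restrict results to the "post" type -->
    </IndexSet>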
