find related pages on matching tags - Extending Umbraco

Dan Christoffersen 64 posts 119 karma points

Aug 13, 2009 @ 16:08

Find related pages on matching tags.

Hello Fellow Umbracos & Umbracas

We are building a semi-large website where we need to automatically list pages related to the current page.

The solution we are currently working on is similar to what most blogs use: every page is tagged with keywords, and we then search for every page that has the same (or some of the same) tags. We might even sort the results by the number of matching tags.
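A minimal sketch of that ranking idea, written in plain Python rather than Umbraco code. The page names, the `pages` dictionary, and the `related_pages` helper are hypothetical; the sketch only illustrates counting shared tags and sorting on that count.

```python
from typing import Dict, List, Set, Tuple


def related_pages(current: str, pages: Dict[str, Set[str]],
                  limit: int = 5) -> List[Tuple[str, int]]:
    """Return (page, shared_tag_count) pairs, best match first."""
    current_tags = pages[current]
    scored = []
    for name, tags in pages.items():
        if name == current:
            continue  # a page is not related to itself
        shared = len(current_tags & tags)  # size of the tag intersection
        if shared > 0:
            scored.append((name, shared))
    # Most shared tags first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:limit]


# Hypothetical sample data: page name -> set of tags.
pages = {
    "intro-to-xslt": {"xslt", "umbraco", "tutorial"},
    "macro-caching": {"umbraco", "caching", "performance"},
    "xslt-paging": {"xslt", "umbraco", "paging"},
    "office-party": {"news"},
}

print(related_pages("intro-to-xslt", pages))
```

Each call compares the current page against every other page, which is exactly the per-view cost the question below is worried about.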

My question is how to do this without pushing the server to the limit on every page view. The 'Tags' datatype is already implemented in Umbraco by default, so I guess someone has already given it some thought ;-)

Any good suggestions? Maybe a better approach?

/Dan

Richard Soeteman 4046 posts 12899 karma points MVP 2x

Aug 13, 2009 @ 16:22

Hi Dan,

I would use the same approach. I would use a separate macro to list the related posts using XPath or a search, and cache it so it doesn't use server resources every time the page is visited.

Cheers,

Richard

Thomas Höhler 1237 posts 1709 karma points MVP

Aug 13, 2009 @ 16:26

Unfortunately, the current Tags implementation lacks a count of the tags. I made a blog cumulus package in which I extended Umbraco's ITag interface to get all the related sites. Perhaps this is a starting point for you. See here

I also added a tag-related search via XSLT:

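<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:umbraco.library="urn:umbraco.library"
    exclude-result-prefixes="umbraco.library">

<xsl:output method="xml" omit-xml-declaration="yes"/>

<!-- Umbraco passes the current page in as this parameter -->
<xsl:param name="currentPage"/>

<xsl:template match="/">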
    <xsl:variable name ="page" select="umbraco.library:RequestQueryString('page')" />
    <xsl:variable name ="tag" select="umbraco.library:RequestQueryString('tag')" />

    <xsl:choose>
      <xsl:when test ="$tag != ''">
        <xsl:choose>
          <xsl:when test="$page">
            <xsl:call-template name="ListTagItems">
              <xsl:with-param name="page" select="$page"></xsl:with-param>
              <xsl:with-param name="tag" select="$tag"></xsl:with-param>
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <xsl:call-template name="ListTagItems">
              <xsl:with-param name="page" select="0"></xsl:with-param>
              <xsl:with-param name="tag" select="$tag"></xsl:with-param>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
      ...
      </xsl:otherwise>
    </xsl:choose>
</xsl:template>

 <xsl:template name="ListTagItems">
    <xsl:param name="page" />
    <xsl:param name="tag" />

    <h2>Blogs tagged with '<xsl:value-of select ="$tag"/>'</h2>
    
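    <!-- Wrap both sides in commas so that 'art' does not match 'smart'; assumes umbracoTags holds a comma-separated list without spaces -->
    <xsl:variable name="blogs" select="$currentPage/descendant-or-self::node [@nodeTypeAlias = 'blog.com.BlogPost'][contains(concat(',', data [@alias = 'umbracoTags'], ','), concat(',', $tag, ','))]"/>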
    <xsl:for-each select="$blogs">
      <xsl:sort select="@createDate" order="descending"/>
      <xsl:if test ="((position() &gt; ($page * 5)) and (position() &lt; ((($page + 1) * 5) + 1)))">
        <xsl:call-template name="WritePost">
          <xsl:with-param name="post" select="."></xsl:with-param>          
        </xsl:call-template>
      </xsl:if>
    </xsl:for-each>
    <span class="CssBlogpostPager">
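      <!-- Keep the tag in the query string so paging stays on the tag listing; tags with special characters would still need URL-encoding -->
      <xsl:if test="$page &gt; 0">
        <a href="{umbraco.library:NiceUrl($currentPage/@id)}?tag={$tag}&amp;page={$page - 1}" class="backInk">previous posts</a> |
      </xsl:if>
      <xsl:if test="count($blogs) &gt; (($page + 1) * 5)">
        <a href="{umbraco.library:NiceUrl($currentPage/@id)}?tag={$tag}&amp;page={$page + 1}" class="nextInk">next posts</a>
      </xsl:if>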
    </span>
  </xsl:template>

  <xsl:template name="WritePost">
    <xsl:param name="post"></xsl:param>
    ... write the content
  </xsl:template>

</xsl:stylesheet>

hth, Thomas

Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib

Aug 13, 2009 @ 16:43

The key with any of these options is to set the cache time on the macro to something quite large. I'd go for at least 10 minutes and maybe an hour (or even once a day). It depends on how quickly you need to see any change to the related pages output.
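To illustrate why a long cache time helps, here is a small time-based cache sketched in plain Python. It is not how Umbraco's macro cache is implemented; `TimedCache`, `slow_related_lookup`, and the node id 1234 are hypothetical names used only to show that the expensive lookup runs once per cache period instead of once per page view.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TimedCache:
    """Cache computed values for ttl_seconds before recomputing them."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[Any, Tuple[float, Any]] = {}

    def get(self, key: Any, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # still fresh: skip the expensive work
        value = compute()
        self._entries[key] = (now, value)
        return value


calls = []


def slow_related_lookup() -> list:
    calls.append(1)  # stands in for the expensive tag search
    return ["page-a", "page-b"]


cache = TimedCache(ttl_seconds=3600)  # one hour
cache.get(1234, slow_related_lookup)
cache.get(1234, slow_related_lookup)
print(len(calls))  # the lookup ran once; the second call came from the cache
```

The trade-off is the one described above: a longer period means fewer recomputations but a longer wait before changes show up in the related pages list.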

If you need live or nearly live updating and can't use a large cache time, or the macro is simply too slow on the rare occasions it needs to run, look into using the Lucene search to pull back the results (and possibly cache its output). It will probably need a bit of customization, but performance should be very fast since Lucene is index-based.

cheers,
doug.

Thomas Höhler 1237 posts 1709 karma points MVP

Aug 13, 2009 @ 16:46

How many nodes will this semi-large site have?

Thomas

Dan Christoffersen 64 posts 119 karma points

Aug 13, 2009 @ 17:04

First of all, I had forgotten about Umbraco's cache feature. I'm still pretty new to both Umbraco and .NET, but slowly getting there. My guess is that it will do the trick perfectly, updating just twice a day.

Secondly, I also had the idea of an SQL table that lists the related pages for every page. This table would be updated every time a page is created or changed. That way the processing happens only once, instead of every time the page is viewed. Any comments on this solution? Anyway, I will first see whether the cache does the trick, which I think it may.

Thomas: Number of nodes: Not sure actually. Potentially 500 a year. Not sure if that is a "semi-large website" though. I guess it's relative.
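A rough sketch of that lookup-table idea, using Python's built-in sqlite3 module with an in-memory database. The `page_tags` and `related_pages` tables and the `save_page`, `rebuild_related`, and `related_for` helpers are all hypothetical and are not part of Umbraco; the sketch also assumes each page's tag list has no duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page_tags (page_id INTEGER, tag TEXT);
    CREATE TABLE related_pages (
        page_id INTEGER, related_id INTEGER, shared_tags INTEGER);
""")


def rebuild_related(conn: sqlite3.Connection) -> None:
    """Recompute the whole lookup table from the current tags."""
    conn.execute("DELETE FROM related_pages")
    conn.execute("""
        INSERT INTO related_pages (page_id, related_id, shared_tags)
        SELECT a.page_id, b.page_id, COUNT(*)
        FROM page_tags a
        JOIN page_tags b ON a.tag = b.tag AND a.page_id <> b.page_id
        GROUP BY a.page_id, b.page_id
    """)


def save_page(conn: sqlite3.Connection, page_id: int, tags: list) -> None:
    """Called when a page is created or changed: store tags, refresh table."""
    conn.execute("DELETE FROM page_tags WHERE page_id = ?", (page_id,))
    conn.executemany("INSERT INTO page_tags VALUES (?, ?)",
                     [(page_id, tag) for tag in tags])
    rebuild_related(conn)


def related_for(conn: sqlite3.Connection, page_id: int) -> list:
    """Cheap read at page view time: no tag matching needed."""
    rows = conn.execute(
        "SELECT related_id, shared_tags FROM related_pages "
        "WHERE page_id = ? ORDER BY shared_tags DESC, related_id",
        (page_id,))
    return rows.fetchall()


save_page(conn, 1, ["xslt", "umbraco"])
save_page(conn, 2, ["umbraco", "caching"])
save_page(conn, 3, ["xslt", "umbraco", "paging"])
print(related_for(conn, 1))
```

Rebuilding the whole table on every save keeps the sketch short; with a few hundred pages that may be acceptable, but it relies on `save_page` being called for every change.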

/Dan

Richard Soeteman 4046 posts 12899 karma points MVP 2x

Aug 13, 2009 @ 17:23

Hi Dan,

I wouldn't go straight for a custom solution that uses a DB. What happens if the event isn't triggered for some reason, or someone performs an action that isn't handled through events? Then you would have a corrupt index. Querying with XPath is pretty fast, and with a good caching strategy it should perform well.

Cheers,

Richard

Thomas Höhler 1237 posts 1709 karma points MVP

Aug 13, 2009 @ 17:31

You can also write a custom datatype that stores the corresponding IDs in the node at publish time, so all related nodes are directly available on each node (no need to search anymore).

How to implement it: use ActionHandlers to add the IDs of the nodes that have the same tags. Take a look at the cmsTags and cmsTagRelationship tables and at the umbraco.editorcontrols.tags.library class.

If you need help, just contact me.

Thomas

Hundebol 167 posts 314 karma points

Sep 08, 2009 @ 15:01

Hi Dan,

Did you ever get this to work? I am working on the same thing.

Best regards,
Brian
