  • Zoki 27 posts 103 karma points
    Feb 03, 2017 @ 09:19
    Zoki
    0

    Most efficient way to select latest of 15k+ articles

    We have a website with 15k+ nodes, split across 3 parent nodes with about 5k child nodes each.

    On the homepage we show the latest articles from each category, plus "sticky" articles from all categories (a bool property, sticky).

    It works, but it's not fast enough. I'm looking for the most efficient way to select those nodes and return the latest ones, sorted by a custom "date" field.

    I'm looking at this: https://our.umbraco.org/documentation/reference/Common-Pitfalls/

    But I'm having trouble writing the XPath. Can someone confirm whether that would be the most efficient way to return, let's say, the latest 10 articles from each category?

  • Peter Duncanson 430 posts 1360 karma points c-trib
    Feb 03, 2017 @ 11:18
    Peter Duncanson
    0

    Firstly, instead of using an "isSticky" bool you could put a content picker on the homepage that lets editors pick which nodes to make sticky. That would avoid a content-wide search for the sticky flag. Alternatively, if you drop down to Examine you could query the flag there, which should be pretty fast too.

    Regarding your "latest articles" query, that sort of depends. Are you using a separate "published date" field or just the last save date of the document? The advantage of a separate field is that the publish date never changes (or shouldn't), whereas the Umbraco save date gets updated whenever you go in and correct a typo, which might not be what you want. For speed of search, however, using the save date (aka "updateDate" in the Umbraco DB) is probably the easiest and quickest method.

    Regarding speed of search, the best way to get articles by date is probably to go to the database directly. The package called "The Dashboard" (https://our.umbraco.org/projects/backoffice-extensions/the-dashboard/) does something similar in the back office, and its source has lots of little tricks for getting information like this:

    https://github.com/enkelmedia/TheDashboard/blob/master/TheDashboard/Data/UmbracoRepository.cs

    Something along the lines of this (not tested) should work:

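    public IEnumerable<int> GetLatestDocumentIDs(int limit = 10)
    {
      var db = UmbracoContext.Current.Application.DatabaseContext.Database;
      // Select only nodeId so each row maps onto a single int.
      // limit is an int, so concatenating it into the SQL text is safe here.
      return db.Fetch<int>(
        @"SELECT TOP " + limit + @" nodeId
          FROM [cmsDocument]
          WHERE published = 1 AND newest = 1
          ORDER BY updateDate DESC"); // DESC so the most recently updated come first
    }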
    

    Then you can use those ids to look up the individual content nodes.

    Of course, you could also just cache the output of your slow query as is, so it's only slow for the first user after the cache expires. That's always an option.
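
    The caching idea could look roughly like the sketch below. It is only an illustration: TimedCache is a hypothetical helper, not an Umbraco API, and the clock parameter is there purely so the expiry can be exercised without waiting. Umbraco has its own cache facilities that would probably be a better fit in a real site.

    ```csharp
    using System;

    // Hypothetical helper (not part of Umbraco): keeps one computed value for a fixed time.
    public sealed class TimedCache<T>
    {
        private readonly object _gate = new object();
        private readonly TimeSpan _lifetime;
        private readonly Func<DateTime> _clock;
        private T _value;
        private DateTime _expiresUtc = DateTime.MinValue;

        public TimedCache(TimeSpan lifetime, Func<DateTime> clock = null)
        {
            _lifetime = lifetime;
            _clock = clock ?? (() => DateTime.UtcNow);
        }

        // Runs slowQuery only when nothing is cached yet or the cached value has expired.
        public T GetOrAdd(Func<T> slowQuery)
        {
            lock (_gate)
            {
                var now = _clock();
                if (now >= _expiresUtc)
                {
                    _value = slowQuery();
                    _expiresUtc = now + _lifetime;
                }
                return _value;
            }
        }
    }
    ```

    With something like this, the homepage would call GetOrAdd with the existing slow query as the delegate, so only the first request after each expiry pays the full cost.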

  • Zoki 27 posts 103 karma points
    Feb 03, 2017 @ 11:51
    Zoki
    0

    Peter, thanks. I like your suggestions and I'll try them out.

    Caching the output is surely a great option for the end result, but I would still like to speed up the initial query as much as possible.

    Putting a content picker or MNTP on the homepage is an idea I hadn't thought of, and not having to search 15k nodes for "sticky" sounds like the way to go. The only question is how fast it is to fetch that content with Umbraco.TypedContent, which would return an IPublishedContent instance for each sticky node (5-6 of them).

    The date I use to sort the latest articles is a custom field.
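
    For reference, the "latest N per category by custom date" selection itself can be sketched in plain LINQ as below. This is a minimal sketch under assumptions: Article is a hypothetical stand-in for the real content type, and Category and Date stand for the parent node and the custom "date" field. It shows the shape of the query only, and says nothing about how fast loading 15k nodes would be.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical stand-in for an article node.
    public sealed class Article
    {
        public int Id { get; set; }
        public string Category { get; set; }
        public DateTime Date { get; set; } // the custom "date" field
    }

    public static class LatestArticles
    {
        // Returns the newest perCategory articles for each category, newest first.
        public static Dictionary<string, List<Article>> Pick(
            IEnumerable<Article> articles, int perCategory = 10)
        {
            return articles
                .GroupBy(a => a.Category)
                .ToDictionary(
                    g => g.Key,
                    g => g.OrderByDescending(a => a.Date).Take(perCategory).ToList());
        }
    }
    ```

    Combined with caching, this grouping would only need to run when the cached result expires.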
