  • Matt 74 posts 278 karma points
    Feb 28, 2014 @ 23:40

    Updating Examine in Load Balanced Environment with no direct http access to servers (cant use distributed calls)

    Hi,

    I've read through a number of threads here about the distributed calls, how they work, and how Examine is updated on the distributed servers via the AfterUpdateDocumentCache event.

    My issue is that in our load balanced environment we have no direct http access to the individual load balanced boxes, so we cannot use the distributed calls approach outlined in the Umbraco documentation.  We only have access to the "main" authoring box that sits behind the firewall but is obviously hooked into the same database.  We are using DFS to copy files, etc across to the other boxes, so we do have that level of "access".

    Without the distributed call approach being available to us, my approach so far (PoC) is to call the following every 5 minutes:
    ApplicationContext.Current.ApplicationCache.RuntimeCache.ClearAllCache();
    (I was calling umbraco.library.RefreshContent() - both seem to work)

    I call the above on app startup and then every 5 minutes for each server (on 0 and 5 minute "times").  Before calling the above I grab the latest updateDate from cmsDocument where published=1 and compare it to a static DateTime I store so that I'm not unnecessarily updating the cache if nothing has actually changed.
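    Roughly, the check looks like this (a sketch only - the 5-minute timer wiring is omitted, and the connection string name is assumed to be the standard umbracoDbDSN):

```csharp
using System;
using System.Configuration;
using System.Data.SqlClient;

public static class CacheRefreshCheck
{
    private static DateTime _lastKnownUpdate = DateTime.MinValue;

    // Called on startup and then every 5 minutes by the scheduled task.
    public static void RefreshIfChanged()
    {
        using (var conn = new SqlConnection(
            ConfigurationManager.ConnectionStrings["umbracoDbDSN"].ConnectionString))
        using (var cmd = new SqlCommand(
            "SELECT MAX(updateDate) FROM cmsDocument WHERE published = 1", conn))
        {
            conn.Open();
            var result = cmd.ExecuteScalar();

            if (result is DateTime && (DateTime)result > _lastKnownUpdate)
            {
                // Something changed since the last check - rebuild the local cache.
                umbraco.library.RefreshContent();
                _lastKnownUpdate = (DateTime)result;
            }
        }
    }
}
```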

    The above works just fine and the Umbraco cache refreshes itself every 5 minutes on the "really dumb front end servers".  It's also super quick, so I am assuming that there are some "smarts" behind the above calls that only update the changed items in the cache.

    What doesn't work is updating the Examine indexes on each of the "really dumb front end servers".  I understand that the distributed publishing calls usually look after this, so I know why it doesn't work, but I need some help in figuring out an architecture to make it work.  I'm assuming there is no magic bullet where I can call a single Umbraco method to magically update Examine (without having to re-index the whole thing).

    Maybe I can capture in a custom db table the same thing that the distributed calls send out (I'm not entirely sure what they do send).  Then, on my "5 minute scheduled task", I look in this custom table for any new distributed calls and manually "run them" to ensure Examine also gets updated.  The issue is that I am not sure how to hook into the distributed call event to "log" that data.  Updating Examine by node id I can likely handle based on other examples in the forums.  I also need to work out whether the above will work when/if a new "dumb front end" comes online.

    Any other ideas or issues with the above architecture?

    As some other background, the site is a 30,000 node site with a (currently) 60MB umbraco.config file.

    Thanks,
    Matt

  • Shannon Deminick 1488 posts 5014 karma points hq
    Mar 03, 2014 @ 01:53

    Hi Matt,

    First thing is

    ApplicationContext.Current.ApplicationCache.RuntimeCache.ClearAllCache();

    is not the same as

    umbraco.library.RefreshContent()

    You should be careful about when you are calling this: ApplicationContext.Current.ApplicationCache.RuntimeCache.ClearAllCache

    It is basically synonymous with clearing the entire HttpRuntime cache, which means you are effectively removing all runtime cache and causing anything that uses it (which is quite a few things) to go re-fetch the raw data.

    This call: umbraco.library.RefreshContent() makes a distributed call to all servers; on each server it then rebuilds the entire umbraco xml cache file based on what is in the database. This is quite an intensive call and will be triggered on every server taking part in your LB scheme. If this server cannot make distributed calls to the other boxes, I can only assume that the xml cache file rebuild will only occur on the local machine - but you'd have to verify that. If you have many documents, this is an intensive call and should definitely not occur every 5 minutes.

    Perhaps a better idea rather than scheduling a rebuild of the xml cache every 5 minutes is to use DFS to your advantage and create a file queue of publishing events. On your main publishing/admin server whenever a doc is published or unpublished you could write a file, this file would get synced to all servers. Then on each server you can poll for new changes that haven't been processed locally. Each file could just contain an Id and if it is new/updated/deleted. Then handle it appropriately. Or you could just create another database table to handle this queue.
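    A rough sketch of that file-queue idea (the DFS path, the file naming, and the event wiring are purely illustrative):

```csharp
using System;
using System.IO;
using System.Linq;

public static class PublishQueue
{
    // Hypothetical DFS-replicated folder - every box sees the same files.
    private const string QueueDir = @"\\dfs-share\umbraco-queue";

    // Admin server: call this from your publish/unpublish event handlers.
    public static void Enqueue(int nodeId, string action) // "publish" | "unpublish" | "delete"
    {
        var name = string.Format("{0:yyyyMMddHHmmssfff}-{1}-{2}.txt",
            DateTime.UtcNow, nodeId, action);
        File.WriteAllText(Path.Combine(QueueDir, name), nodeId + "|" + action);
    }

    // Front-end servers: poll for files newer than the last one processed
    // (the timestamp prefix makes the file names sort chronologically).
    public static string[] PendingSince(string lastProcessedFileName)
    {
        return Directory.GetFiles(QueueDir, "*.txt")
            .Select(Path.GetFileName)
            .Where(f => string.Compare(f, lastProcessedFileName,
                StringComparison.OrdinalIgnoreCase) > 0)
            .OrderBy(f => f)
            .ToArray();
    }
}
```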

    If a new server comes online, you'll have to rebuild all indexes locally to that machine, same goes for the xml file. It will be tricky to ensure this server is precisely up-to-date though if you are bringing servers online dynamically with active editors in the back office. Perhaps based on when it comes online and when the rebuild process has started/completed (for both examine and the xml file), you could then check the queue and see if there's been any activity from when the rebuild process started, if there is you could just execute the last 'x' items in the queue which should in theory make sure your local environment is up-to-date.

  • Shannon Deminick 1488 posts 5014 karma points hq
    Mar 03, 2014 @ 01:57

    Just a thought: You might just be able to use the existing data in the umbracoLog table for this stuff, but would need to verify that all the info you'd need is put in there, otherwise you could probably just add your own information to this table for the queue to work.
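    For example, the queue poll might be a query along these lines (column names are from the legacy umbracoLog schema; verify the logHeader values you actually need before relying on this):

```csharp
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Data.SqlClient;

public static class LogPoller
{
    // Ids of nodes published since the given time, according to umbracoLog.
    public static List<int> PublishedNodeIdsSince(DateTime since)
    {
        var ids = new List<int>();
        using (var conn = new SqlConnection(
            ConfigurationManager.ConnectionStrings["umbracoDbDSN"].ConnectionString))
        using (var cmd = new SqlCommand(
            @"SELECT DISTINCT NodeId FROM umbracoLog
              WHERE logHeader = 'Publish' AND Datestamp > @since", conn))
        {
            cmd.Parameters.AddWithValue("@since", since);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    ids.Add(reader.GetInt32(0));
            }
        }
        return ids;
    }
}
```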

  • Matt 74 posts 278 karma points
    Mar 03, 2014 @ 02:17

    Thanks Shannon.  

    You are correct that we cannot do distributed calls - the db based approach seems like the best one, and good thought on using the log table as a potential "already there" solution.  I was hoping for a built-in Umbraco approach to all this, but it looks like I have to hook up all the plumbing myself to get this to work - at least the connections/events are available.  Is there anything to gain from hooking into the distributed calls somehow, even though they wouldn't go to a physical server (capturing them and putting them in a db table), over just hooking into the normal publish/unpublish events?

    Based on the above, it seems the pseudo-code is:

    1. assuming all servers start off up to date for now - will solve new server issue later
    2. hook into the publish/unpublish event and log those events somewhere (or possibly use umbracoLog table) so we know what nodes need to be republished and reindexed on the "dumb" boxes
    3. every 5 minutes on the dumb boxes we look at above log to see if there is anything to refresh in cache and re-index since we last checked - once we have a list of ids we call umbraco.library.UpdateDocumentCache(Id) for each of them - I believe this just updates the in-memory cache from cmsContentXml which is already up to date since it is a shared db?  We also need to update/remove the examine node id too.
    Only thing I am not sure of above is the v7 approach for umbraco.library.UpdateDocumentCache(d.Id) ?
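    So step 3 on each dumb box would boil down to something like this (sketch only; changedIds comes from whatever queue/log poll we settle on - and I believe calling UpdateDocumentCache locally also fires content.AfterUpdateDocumentCache, which the default Examine wiring listens to, though that's worth verifying):

```csharp
using System.Collections.Generic;

public static class LocalRefresher
{
    // Run on each front-end box every 5 minutes with the ids found
    // in the queue/log since the last check.
    public static void Refresh(IEnumerable<int> changedIds)
    {
        foreach (var id in changedIds)
        {
            // Refreshes the in-memory xml cache for this node from cmsContentXml.
            // Calling this locally should also raise content.AfterUpdateDocumentCache,
            // which the default ExamineEvents wiring uses to re-index the node.
            umbraco.library.UpdateDocumentCache(id);
        }
    }
}
```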
    Thanks for your input on the "expensive operations".  In my testing on a 30,000 node site, hitting a single page after changing and re-publishing a single node seemed super quick - the single node re-publish is not really a valid test, but in a 5 minute period there would still be fewer than 100 nodes that had potentially been re-published.
    Regards,

    Matt

  • Shannon Deminick 1488 posts 5014 karma points hq
    Mar 03, 2014 @ 02:56

    We have a couple of internal interfaces that haven't been made public yet: IServerMessenger and IServerRegistrar. If you wanted to live on the edge, you could use a custom build based on the stable version you are using and make these public. The IServerMessenger is the thing that performs the distributed calls to each server. You could create a custom version of it to do what you want, but it'll take a bit of work. If none of your front-end servers receive distributed calls via http, there may be other cache invalidation issues beyond content that affect them as well (depending on what you are doing on the admin server), so it may be worth going down this route.

    You could look into the code of DefaultServerMessenger and swap out the http calls with calls to add items to your 'queue'. In your app startup code you'd need to change the current IServerMessenger:

     ServerMessengerResolver.Current.SetServerMessenger(new YourServerMessenger());
    

    This needs to be done in your ApplicationEventHandler's ApplicationStarting override.
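    Putting that together, a skeleton might look like this (the messenger body is deliberately empty - it won't compile until you implement the IServerMessenger members, which you can only see once you've made the interface public in your custom build; namespaces are per the 6/7 source):

```csharp
using Umbraco.Core;
using Umbraco.Core.Sync;

// Skeleton only: replace the http calls DefaultServerMessenger makes
// with writes to your own queue (db table or DFS files).
public class QueueServerMessenger : IServerMessenger
{
    // Implement the IServerMessenger members here; each "perform refresh"
    // call becomes a row/file in the queue instead of an http request.
}

public class QueueMessengerEvents : ApplicationEventHandler
{
    protected override void ApplicationStarting(
        UmbracoApplicationBase umbracoApplication,
        ApplicationContext applicationContext)
    {
        // Swap in the queue-based messenger before the app finishes starting.
        ServerMessengerResolver.Current.SetServerMessenger(
            new QueueServerMessenger());
    }
}
```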

    In v6.2+ the library.UpdateDocumentCache(id) just calls this underneath:

     DistributedCache.Instance.RefreshPageCache(documentId);
    

    This makes the distributed calls which will then cause the PageCacheRefresher to execute on each server which calls this code:

     ApplicationContext.Current.ApplicationCache.ClearPartialViewCache();
     content.Instance.UpdateDocumentCache(id);
     DistributedCache.Instance.ClearAllMacroCacheOnCurrentServer();
     DistributedCache.Instance.ClearXsltCacheOnCurrentServer();
    
  • Matt 74 posts 278 karma points
    Mar 03, 2014 @ 03:08

    Thanks for the suggestions - I think I have enough to go on (after making some decisions).  I took a look at umbracoLog and it looks like the most promising thing so far (yet to look at media though).  The code examples above (and then doing a search of the core code) help a lot!

    Where is the "similar" code on the distributed servers that takes an id and refreshes that one examine doc?

    Regards,
    Matt 

  • Shannon Deminick 1488 posts 5014 karma points hq
    Mar 03, 2014 @ 03:13

    Media isn't properly done for examine (yet), there's an outstanding issue here: http://issues.umbraco.org/issue/U4-3937

    For content, this is done by subscribing to these events:

      content.AfterUpdateDocumentCache += ContentAfterUpdateDocumentCache;
      content.AfterClearDocumentCache += ContentAfterClearDocumentCache;
    

    These are older events that actually fire on each server after a distributed call, but this should be changed over to use the CacheRefresherBase events instead.

    There's a thread about this here: http://our.umbraco.org/forum/core/general/47068-Distributed-publishing,-Examine-and-Media

    If you have a look at Umbraco.Web.Search.ExamineEvents you can see how it is done.
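    The handlers then look roughly like this (based on the pattern in Umbraco.Web.Search.ExamineEvents - the event signatures are from memory and the index iteration is a simplification of what the core does, so check the core source before copying):

```csharp
using Examine;
using Examine.Providers;
using umbraco;
using umbraco.cms.businesslogic;

public static class ExamineSyncEvents
{
    public static void Wire()
    {
        content.AfterUpdateDocumentCache += ContentAfterUpdateDocumentCache;
        content.AfterClearDocumentCache += ContentAfterClearDocumentCache;
    }

    private static void ContentAfterUpdateDocumentCache(
        umbraco.cms.businesslogic.web.Document sender, DocumentCacheEventArgs e)
    {
        // Publish: re-index the node. See ExamineEvents in the core for
        // how the node xml is built and handed to the indexers.
    }

    private static void ContentAfterClearDocumentCache(
        umbraco.cms.businesslogic.web.Document sender, DocumentCacheEventArgs e)
    {
        // Unpublish/delete: remove the node from every index.
        foreach (BaseIndexProvider indexer in
                 ExamineManager.Instance.IndexProviderCollection)
        {
            indexer.DeleteFromIndex(sender.Id.ToString());
        }
    }
}
```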

  • Matt 74 posts 278 karma points
    Mar 03, 2014 @ 03:17

    Perfect - exactly what I was looking for (had seen that thread previously too).  Info above and the other thread should get us to a solution!

  • Shannon Deminick 1488 posts 5014 karma points hq
    Mar 03, 2014 @ 03:21

    Great! I would be interested to see the outcome; we've discussed having distributed calls and Examine indexes updated via a queue for LB circumstances such as this. It also allows for easier addition and removal of servers in real time (another reason why IServerRegistrar was created). Of course we haven't pursued this idea yet as there's been no time, but I think the foundations are there for it to work. Good luck!

  • Matt 74 posts 278 karma points
    Mar 03, 2014 @ 03:24

    I will be sure to report back on a solution (or failure :-)).  Will buy you a beer in June!

  • Matt 74 posts 278 karma points
    Mar 25, 2014 @ 04:24

    My solution is getting close with all the hard work done - I have an almost working PoC on Azure Websites that scales out and does the distributed calls to update the Umbraco Cache and Examine.  

    To finish off the PoC and release to the community I need a couple of things:

    • What event runs before IIS/Umbraco shutdown that would allow me to write to the database to tell it that the server is no longer active and not to push distributed calls to it? Due to the way that Azure Websites works I need to register/unregister on startup/shutdown - startup is easy enough, I'm just having issues with which shutdown event to use.
    • I'm using 6.1.x and pushing out on the CacheRefresherBase<PageCacheRefresher>.CacheUpdated event, however CacheRefresherBase<UnpublishedPageCacheRefresher>.CacheUpdated does not seem to exist in 6.1.x - fixed? I don't seem to need it though - I tested unpublish and it unpublished on the distributed servers just fine.
    Hope Shannon can shed some light on this? :-)
    Regards,
    Matt 
  • Shannon Deminick 1488 posts 5014 karma points hq
    Mar 25, 2014 @ 04:40

    You can create your own instance of Umbraco.Web.UmbracoApplication and change your global.asax to inherit from that. Then in your instance you can override OnApplicationEnd - though I'm not sure what services will actually be available during that event.
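    A sketch of that (the table write is a placeholder; as noted, it's unverified which services are still available during application end, so keep it minimal):

```csharp
using System;

// global.asax then becomes:
// <%@ Application Inherits="MySite.AzureAwareUmbracoApplication" Language="C#" %>
public class AzureAwareUmbracoApplication : Umbraco.Web.UmbracoApplication
{
    protected override void OnApplicationEnd(object sender, EventArgs e)
    {
        // Mark this instance inactive in your registration table so the
        // admin server stops pushing distributed calls to it. Keep this
        // write simple and fast - shutdown time is limited and Umbraco
        // services may already be unavailable here.
        base.OnApplicationEnd(sender, e);
    }
}
```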

    On some other notes. This issue is completed: http://issues.umbraco.org/issue/U4-3937

    You'll also be pleased to see this rev, in which we've made IServerMessenger and IServerRegistrar public:

    96fa8c7dc90dd419f1a4b127999b1511086b433e

    You'll also see this new branch: https://github.com/umbraco/Umbraco-CMS/tree/7.1.0-batchdistcalls

    I'll be writing up a blog post about this and what it does soon but here's the gist:

    • This uses a new implementation of IServerMessenger
    • Allows LB to work without servers knowing anything about other servers - all servers are anonymous and manage keeping themselves up-to-date based on a central db queue of instructions
    • It works, and so far works very well!
    • There's of course more tests to run and a few outstanding implementation details
  • Matt 74 posts 278 karma points
    Mar 25, 2014 @ 04:48

    Cool! - a few questions:

    • Have you worked out how to get all that to work with Azure Websites, where each scaled instance does not have its own dns end-point that the back-office can get to?
    • The batching stuff is cool - I assume this means that when servers go down/come back up, since state is in the db, it "should" all work just fine?
    • I browsed through the code but couldn't see exactly where the anonymous stuff happens - are you registering the servers automatically on startup/shutdown (IsActive=false) or are you taking some other approach to know which servers are up?
    Regards,
    Matt 
  • Shannon Deminick 1488 posts 5014 karma points hq
    Mar 25, 2014 @ 05:00

    With this new setup, there is no notion of server registrations, the servers just need access to the database and they poll when necessary to process the instructions stored there. You can add/remove servers all you want, none of them need to know anything about the other ones.

    The batching stuff is the fix for this issue: http://issues.umbraco.org/issue/U4-2633

    It is an enhancement to how LB currently works and severely limits the amount of chatter between servers, especially when dealing with permissions. Basically, instead of sending multiple instructions, it batches them into one request when multiple distributed calls are required during a single request.
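    The batching idea itself is simple - roughly this (illustrative only, not the actual branch code):

```csharp
using System.Collections.Generic;

// Illustrative only: accumulate refresh instructions during a request
// and flush them as one payload, instead of one call per instruction.
public class InstructionBatch
{
    private readonly List<string> _instructions = new List<string>();

    public void Add(string instruction)
    {
        _instructions.Add(instruction);
    }

    // Called once at the end of the request: one http call (or one db
    // row) carries every instruction gathered during the request.
    public string[] Flush()
    {
        var payload = _instructions.ToArray();
        _instructions.Clear();
        return payload;
    }
}
```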

  • Bart ten Velde 16 posts 233 karma points
    May 03, 2014 @ 16:19

    Hi Shannon, is your batchdistcalls solution on the roadmap to be included in the core? We're looking at hosting our website on Azure in a few months and are in need of load balancing.

    Thanks for the feedback !

    Bart

  • Shannon Deminick 1488 posts 5014 karma points hq
    May 05, 2014 @ 08:40

    yeah, there's code in a custom branch here:

    https://github.com/umbraco/Umbraco-CMS/tree/7.1.0-batchdistcalls

    The batching of distributed calls is working and is the first commit in that branch, the rest is a proof of concept which requires a lot more testing. Something I'll be getting around to on Fridays if I have time.

    By "Azure" what do you mean? Azure Websites (WAWS) or normal Azure (VMs) ? We do not currently support Azure Websites for load balancing. There's been plenty of discussion around this, some of which is on the google groups dev mail list. There's some barriers with WAWS that prevent Umbraco from running OOTB with load balancing. We have some POCs working for it but these require testing. One of these POCs is this git branch.

  • Bart ten Velde 16 posts 233 karma points
    May 05, 2014 @ 08:55

    Hi, I meant Azure Websites. I thought the batching of distributed calls was also targeting WAWS. I didn't look into the details yet; I assumed that running VMs would have the same setup as non-Azure hosting...

  • Shannon Deminick 1488 posts 5014 karma points hq
    May 05, 2014 @ 09:02

    Here are the barriers to WAWS:

    • WAWS servers use the same underlying SAN-based file system, which means that each server in your LB scenario shares the same files = you will get file locking issues. Lucene is the most difficult thing to solve because of this, but there are other file locking issues that need to be dealt with. The umbraco.config is one, though that can be handled by setting the appSetting umbracoContentXMLUseLocalTemp to true.
    • WAWS servers do not give you an internal DNS address therefore you cannot fill out the umbracoSettings.config distributedCall/servers list
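    For reference, the appSetting mentioned in the first bullet goes in web.config:

```xml
<!-- Keep umbraco.config in the machine-local temp folder instead of
     the shared SAN file system, avoiding cross-instance file locks. -->
<appSettings>
  <add key="umbracoContentXMLUseLocalTemp" value="true" />
</appSettings>
```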

    So OOTB umbraco will not work with load balancing on Azure websites but as mentioned there are a few POCs that I've created that seem to work but require more testing and implementation.

    Here's some more light reading around the subject:

    https://groups.google.com/forum/?fromgroups=#!searchin/umbraco-dev/Azure/umbraco-dev/igw53DhAzco/9AQzLJiefcwJ
    https://groups.google.com/forum/?fromgroups=#!searchin/umbraco-dev/Azure/umbraco-dev/Gk5dQnMRGIw/SBPy3zcG1A4J

    I plan on writing a detailed blog post about all of this soon.

  • Bart ten Velde 16 posts 233 karma points
    May 05, 2014 @ 09:15

    Great feedback, thanks!
