
  • Darren Ferguson 1022 posts 3259 karma points MVP c-trib
    Mar 25, 2014 @ 18:16
    Darren Ferguson
    1

    We've done the same.

    https://github.com/darrenferguson/cloud-umbraco-cache-refresh

    It would be good to collaborate somehow to avoid duplication etc.

    Some other issues to be aware of:

    1. Media should be placed in a CDN, which is pretty easy with this package: http://our.umbraco.org/projects/backoffice-extensions/ast-azure-media-sync

    However, FileSystemProviders weren't introduced to Umbraco until late v4 or early v6 (I'm not sure which), so if you are on an older version, it won't work.

    2. Examine isn't thread safe, and besides, the setup uses two Azure websites, so you'll need the Examine Azure providers to write to blob storage. It works with recent versions, though it isn't maintained and kept up to date with the rest of Examine (so you'll be using an older version of Examine).

    3. Macro caches - assuming that you call library.UpdateDocumentCache on each instance as we do, this doesn't clear the Macro caches and other internal caches, so be aware of that. We haven't gotten around to this yet and can live without Macro caching for now (see the sketch after this list).

    4. As you run a separate single instance for the back office (as Umbraco requires), you will need to run your back office in shared or free mode - or place it in a different region from the front end. This is because if you scale a Standard website, Azure will automatically scale all of your Standard mode websites within that region. This sucks: you want dedicated resources for your back office to make the experience palatable for content editors, yet running in shared mode, or placing your back office in a different region from your SQL Azure server, obviously degrades performance to some extent.
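
    On point 3, something like the sketch below might clear the macro cache alongside the document refresh. It's untested - the "macroHtml" key prefix and the CacheHelper call are assumptions to verify against the CacheKeys class in whatever Umbraco version you run.

        using Umbraco.Core;

        // Sketch only: refresh the local content cache for a node, then
        // clear cached macro output. The "macroHtml" key prefix is an
        // assumption - check CacheKeys in your Umbraco version.
        public static class CacheRefreshHelper
        {
            public static void RefreshDocument(int documentId)
            {
                // The usual per-instance content cache refresh.
                umbraco.library.UpdateDocumentCache(documentId);

                // Cached macro output lives in the runtime cache under keys
                // that (in 6.x/7.x, as far as I can tell) start with "macroHtml".
                ApplicationContext.Current.ApplicationCache
                    .ClearCacheByKeySearch("macroHtml");
            }
        }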

    Give me a shout if you want to collaborate on this.

  • Matt 76 posts 280 karma points
    Mar 26, 2014 @ 05:43
    Matt
    0

    Firstly - code is now out at:
    https://github.com/AussieInSeattle/rbcUmbracoOnAzure
    (I'm totally new to git, sharing open source, etc, so let me know if I did anything wrong) 

    Thanks for your input Darren.  Agreed on collaborating on coming up with some sort of solution - I actually have no need for this on any of my current projects, but it is still something I'm interested in solving/pursuing. It appears we are achieving the same thing a couple of different ways. Not sure that either way is better than the other - they are just different, and I think they get to the same spot/set of issues?

    I took a look at your code - here's how I understand it:

    • you store the machine name, similar to me, to track all current instances/hosts
    • on publish you put stuff (ids) into a db table so that the anonymous client machines can pick up those changes
    • anon client machines use an http module to check for any new content not more than every 10 seconds, though the gap could be greater in low traffic
    • your approach is what I call a "pull" approach, which initiates from the front end servers via a user's web request - with this approach there can also be 0 items to "process" for each server (see the sketch after this list)
    • from what I hear the core approach will be very, very similar to this in an upcoming version
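
    Here's how I read the pull pattern in code - a sketch of the pattern only, not Darren's actual code, and GetPendingNodeIds is a hypothetical stand-in for the db table read:

        using System;
        using System.Web;

        // Sketch of the "pull" approach: each front-end instance piggybacks
        // on incoming requests to poll for queued refreshes, at most once
        // every 10 seconds.
        public class CacheRefreshPullModule : IHttpModule
        {
            private static readonly object Sync = new object();
            private static DateTime _lastCheck = DateTime.MinValue;

            public void Init(HttpApplication app)
            {
                app.BeginRequest += (sender, e) =>
                {
                    lock (Sync)
                    {
                        // Throttle: under low traffic the gap is simply
                        // however long until the next request arrives.
                        if ((DateTime.UtcNow - _lastCheck).TotalSeconds < 10)
                            return;
                        _lastCheck = DateTime.UtcNow;
                    }

                    foreach (var id in GetPendingNodeIds())
                        umbraco.library.UpdateDocumentCache(id);
                };
            }

            // Hypothetical stand-in for reading the node ids the back
            // office queued in the database for this instance.
            private static int[] GetPendingNodeIds()
            {
                return new int[0];
            }

            public void Dispose() { }
        }
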
    What I've written does this:
    • I store the machine name (not really used) and instance id to track all current instances/hosts - the instance id is what Azure Websites uses to directly identify its scaled out instances, and it is this id that you see in your ARRAffinity cookie when you hit a site with ARR in front of it (which is what Azure Websites uses)
    • from the above, I essentially get to the same spot as "normal load balancing" with the server names in the config file, as it allows me to hit each Azure Websites instance with the exact same url but pass in a custom cookie (ARRAffinity) with the InstanceId set as the value - this lets me make the same url (but with a different cookie) request multiple times on each cache update on the back end server
    • I am using the standard built in umbraco web service that everyone else uses in "normal" load balanced environments - my servers all have the same host name though, with a different cookie to differentiate them
    • this approach is a "push" approach where the single back-office server only pushes out when changes are made (see the sketch after this list)
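
    In code the push boils down to something like this - a sketch only, with the SOAP body omitted (the real cacheRefresher.asmx RefreshById call takes more parameters than shown; check the service contract), and host/instanceIds supplied by whatever tracking you have:

        using System;
        using System.Net;

        // Sketch of the "push" approach: replay the same cache-refresh url
        // once per scaled-out instance, using the ARRAffinity cookie to pin
        // each call to a specific instance.
        public static class PushRefresher
        {
            public static void PushToAllInstances(string host, string[] instanceIds)
            {
                foreach (var instanceId in instanceIds)
                {
                    var request = (HttpWebRequest)WebRequest.Create(
                        "http://" + host + "/umbraco/webservices/cacheRefresher.asmx");

                    // ARR routes the request to the instance named in the
                    // cookie, so the same url lands on a different machine
                    // each time around the loop.
                    request.CookieContainer = new CookieContainer();
                    request.CookieContainer.Add(
                        new Cookie("ARRAffinity", instanceId, "/", host));

                    // The real call is a RefreshById SOAP request - body
                    // omitted here for brevity.
                    using (request.GetResponse())
                    {
                    }
                }
            }
        }
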
    In response to your points:

    1) Agreed - I mention this in the package description (which is huge :-)) but did not link to the package.

    2) Not sure why this is needed? Yes, I use two websites, but they both maintain their own Examine indexes. The back-office box that does the built in umbraco distributed call to the front-end boxes kicks off an umbraco cache refresh, and Examine then kicks in automatically after that on the front-end boxes without me having to do anything - that is all built into umbraco when you hit the web service endpoint (/umbraco/webservices/cacheRefresher.asmx) with RefreshById. Maybe our two solutions vary here, as I don't think this is an issue in my solution? My test page checks each server for pulling from Umbraco.Field and the Examine cache and all seems to be fine.

    3) As per above, I hit the built in umbraco endpoint that is meant to refresh the cache by id - I'd need to defer to someone like Shannon on whether this dumps the macro cache or not.

    4) Funny that you ran into this too - agreed that this sucks. I burnt an hour or so trying to get this to work by doing things in a different order and unchecking the websites from that damn dropdown multiple times - quite frustrating. Had not thought of using a different region - I think in that case you'd also pay more if one of the sites has the db in a different region to the one it is in, which means data costs would kick in since you are now pushing sql traffic between regions?

    I have not tested it, but I thought of maybe running multiple subscriptions (you can access all your subscriptions under one portal login) - that way they are still in the same region, but hopefully they can both use standard - have you tried this?

    Also not sure if we are seeing a bug, as scottgu's post here:
    http://weblogs.asp.net/scottgu/archive/2013/06/27/windows-azure-general-availability-release-of-web-sites-mobile-services-new-autoscale-alerts-support-no-credit-card-needed-for-msdn-subscribers.aspx
    says you can scale websites independently - or maybe it means you can have different websites on different "plans" (free, shared, standard)?


    Look forward to further discussion.

    Regards,
    Matt

  • Darren Ferguson 1022 posts 3259 karma points MVP c-trib
    Mar 26, 2014 @ 08:48
    Darren Ferguson
    0

    Hi Matt,

    We'd authored our package before Azure websites moved to using sticky sessions by default - but yes, we did plan to update it to use the affinity cookie and take your approach.

    With Examine, the challenge is that in a regular l/b setup you'd exclude Examine from DFS or your SAN storage. Examine hooks into the AfterUpdateDocumentCache event. It doesn't handle writes from multiple threads well, so there is a risk on Azure websites that you could get multiple instances writing at once, with latency and other considerations (I'm talking about when your front end scales, not issues between back office and front end, which as you say would have separate indexes). It may appear to work for now, but I'd be really careful about going into production.

    As you are using proper distributed calls I don't believe you need to worry about the Macro cache.

    On the last point, I've got some feedback from the Azure websites team that the scaling behaviour is by design. Also, it was introduced relatively recently - you used to be able to scale standard websites independently of one another - so maybe the blog post you reference is out of date? I had thought of trying two different subscriptions, but I'm not sure if one could access the database of the other. Let me know if you get around to it before me.

    Thanks.

  • Darren Ferguson 1022 posts 3259 karma points MVP c-trib
    Apr 04, 2014 @ 14:35
    Darren Ferguson
    0

    Hey Matt - are you on Skype?

    I think I figured out a way to get the back office working on a single site deployment. Would be good to bounce the idea off you.

    Thanks.

  • Kyle 24 posts 63 karma points
    Aug 19, 2014 @ 23:06
    Kyle
    0

    Hi Matt / Darren,

    Thanks for putting this package out, it has been a real help.

    I was wondering if you could help even further... when Azure auto-shuts-down my additional instances and later starts them up again, anything changed in the period in between isn't up-to-date on the new instance. Would it be wise to add something to the UmbracoStartup.cs in your package to trigger the new instance to refresh its cache?

    I'm also experiencing spikes of 100% CPU after an update has been made while the site is under load from around 400 sessions. The spikes last around a minute and lock the site up completely. Where should I start looking? Sorry, bit of a novice :-/

    Thanks, Kyle

  • Darren Ferguson 1022 posts 3259 karma points MVP c-trib
    Aug 21, 2014 @ 09:55
    Darren Ferguson
    0

    @Kyle - Umbraco should refresh the cache upon startup - but I understand that this doesn't always work 100%.

    There are options in UmbracoSettings.config to tune this behaviour - I'm not in front of a developer PC, but from memory, preventing the flushing of the cache to disk may help.

    You could also look into the logic that republish.aspx runs when xml=true is appended to the query string, as this does a more forcible refresh.
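
    If you wanted to wire that into the package's UmbracoStartup.cs, the sketch below is roughly the shape it could take. Untested - verify that content.Instance.RefreshContentFromDatabaseAsync() is still what republish.aspx?xml=true uses in your version before relying on it.

        using Umbraco.Core;

        // Sketch: force a full content cache rebuild when a fresh instance
        // spins up, mirroring what republish.aspx?xml=true does. Verify
        // RefreshContentFromDatabaseAsync exists in your Umbraco version.
        public class RefreshOnStartupHandler : ApplicationEventHandler
        {
            protected override void ApplicationStarted(
                UmbracoApplicationBase umbracoApplication,
                ApplicationContext applicationContext)
            {
                // Rebuild the in-memory XML cache from the database so a
                // newly scaled-up instance doesn't serve stale content.
                umbraco.content.Instance.RefreshContentFromDatabaseAsync();
            }
        }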

    Lastly, which version of Umbraco are you using? There were some niggles with the cache in certain 6.x and 7.x versions.

    On the last point - how do you mean "after an update has been made"?

    Thanks!

  • Kyle 24 posts 63 karma points
    Aug 21, 2014 @ 10:13
    Kyle
    0

    @Darren

    Thanks, I'll look into UmbracoSettings.config and see if I can make some adjustments. I think it may have been caused when I was changing the machine setting between Large & Medium and back again.

    I'm using 7.1.4

    How do you mean "after an update has been made?"

    When a content editor updates a node's properties and publishes their changes, the front end instances hit high CPU usage and the site takes ages to respond. I've got a memory dump which a Microsoft tech is looking at to see if they can identify what's causing it.

    Just to double check: I moved the Examine index to blob storage as mentioned above. Currently both the front & admin sites are sharing the same index. Is this correct?

    Thanks, Kyle

  • Darren Ferguson 1022 posts 3259 karma points MVP c-trib
    Aug 21, 2014 @ 10:30
    Darren Ferguson
    0

    The spiking of CPU is quite common if you have a big Macro cache, because a publish clears it and Umbraco then rebuilds it as requests come in.

    If you have a lot of traffic then this can cause what you see. A common issue is that lots of SQL queries start blocking because of a limited number of threads, which manifests itself as the CPU hitting 100% - which is slightly misleading.

    Honestly - I'd need to get my head around the code to help.

    Re Examine in blob: that is the recommended approach, because the blob storage Examine provider is apparently thread safe for reads/writes. You may want to ping Shannon on Twitter and check that this is still the latest advice.

  • Kyle 24 posts 63 karma points
    Aug 22, 2014 @ 11:05
    Kyle
    0

    Microsoft technical support got back to me, and it does look like a Macro is causing the slowdown. There are 3 pages which use a "large" Macro whose cache is set to refresh every hour. I'm assuming that having them all refreshing at once causes the high CPU.

    I'm thinking in the short term of disabling the macros until the "content" updates have been made, then adding the macros back in one at a time. Not ideal.
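
    One thing that might soften the spike (hedging here: macro cache periods are set per-macro in the backoffice, so this is just the idea expressed in code) is to stagger the expiry times so the heavy items never all rebuild in the same instant:

        using System;
        using System.Web;
        using System.Web.Caching;

        // Sketch: add a random offset to each item's expiry so several
        // heavy cached items don't all rebuild at the same moment (the
        // classic "cache stampede"). The build delegate is hypothetical.
        public static class StaggeredCache
        {
            private static readonly Random Jitter = new Random();

            public static string GetOrAdd(string key, Func<string> build)
            {
                var cached = HttpRuntime.Cache[key] as string;
                if (cached != null)
                    return cached;

                var value = build();
                // Base hour plus 0-10 minutes of jitter per item.
                var expiry = DateTime.UtcNow.AddHours(1)
                                            .AddMinutes(Jitter.Next(0, 10));
                HttpRuntime.Cache.Insert(key, value, null, expiry,
                                         Cache.NoSlidingExpiration);
                return value;
            }
        }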

    Would it be possible to make all the instances share the same cache files and have the admin instance do all the processing? (Maybe a stupid question.)

  • Darren Ferguson 1022 posts 3259 karma points MVP c-trib
    Aug 22, 2014 @ 11:58
    Darren Ferguson
    0

    Hi Kyle, the Macro cache is in memory. 

    I'd take the approach of profiling the code in the Macro and seeing why it performs badly.

  • Jamie Howarth 306 posts 773 karma points c-trib
    Oct 20, 2014 @ 14:18
    Jamie Howarth
    0

    Hi all,

    I've just sent in a pull request on GitHub that stops the backoffice breaking in v7 and fixes the startup handler that stops the site from booting in a local IIS instance (I use IIS locally to dev and then push to Azure).
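
    For anyone hitting the same local-IIS boot problem, the gist of the fix (a sketch, not necessarily Jamie's exact change - RegisterThisInstance is hypothetical) is to skip the Azure-specific wiring when the WEBSITE_INSTANCE_ID environment variable isn't present, which it only is on Azure Websites:

        using System;
        using Umbraco.Core;

        // Sketch: only run the Azure-specific startup wiring when actually
        // running on Azure Websites, where WEBSITE_INSTANCE_ID is set.
        public class AzureAwareStartup : ApplicationEventHandler
        {
            protected override void ApplicationStarted(
                UmbracoApplicationBase umbracoApplication,
                ApplicationContext applicationContext)
            {
                var instanceId =
                    Environment.GetEnvironmentVariable("WEBSITE_INSTANCE_ID");
                if (string.IsNullOrEmpty(instanceId))
                    return; // local IIS: nothing to register, don't block boot

                RegisterThisInstance(instanceId);
            }

            // Hypothetical: record this instance id so the back office can
            // push cache refreshes to it (see earlier in the thread).
            private static void RegisterThisInstance(string instanceId)
            {
            }
        }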

    HTH, B
