Hourly CPU spikes
We are getting a weird issue with a live site we inherited. It's running on Umbraco 7.2.6.
Every hour on the hour, the CPU spikes to 100% for about a minute, making the site unresponsive. The 100% CPU is on the w3wp.exe process on an AWS EC2 virtual machine.
There is nothing in the Windows, Umbraco, or ELMAH logs.
The VM running the site is brand new, with nothing else running on it except this website. The VM is running Windows Server 2016 Datacenter.
We are using SQL Server 2016 in AWS RDS. The instance size is db.m4.xlarge.
A bunch of stuff can cause something like this. This thread has some general guidance: https://our.umbraco.org/forum/using-umbraco-and-getting-started/85487-unexplained-spike-in-iis-server-cpu-activity-nearly-100-causing-outages
The two easiest things to try are installing Windows updates and setting fcnMode to "Disabled".
I've also seen some weird stuff if HTTPS is used but isn't configured exactly right in a number of config files.
Changing fcnMode to disabled seems like a very drastic thing to do. I've never needed to do that before.
We have done fcnMode="Single" for one site previously.
The site does serve a lot of images. I might try fcnMode="Single".
I recommend always using fcnMode="Single"! That's how Umbraco ships these days. See: http://issues.umbraco.org/issue/U4-7712
https://shazwazza.com/post/all-about-aspnet-file-change-notification-fcn/
Also, just wondering what could be causing the CPU spike on the hour?
It's like clockwork, e.g. 12pm, 1pm, 2pm.
Off the top of my head, it could be a scheduled task that happens to run hourly, or IIS might for some reason be configured to recycle the application pool every hour (the default is every 29 hours). It could also be that IIS is configured to restart after some number of errors in an hour (I think the default is to check for some number of errors in a 5-minute timespan). Another possibility is an external tool (e.g., a heartbeat/ping tool, such as New Relic) hitting a specific page that has an error that causes the site to restart.
Probably a bunch of other scenarios.
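One way to narrow down that list is to check whether incoming traffic itself bursts at the top of the hour, which would point at something external rather than a scheduled task or a recycle. Below is a minimal sketch that counts requests per minute, assuming the site writes IIS logs in the W3C extended format with a `#Fields:` header and the `date` and `time` fields enabled. The function names are made up for illustration, and note that IIS W3C logs record times in UTC.

```python
from collections import Counter


def requests_per_minute(lines):
    """Count requests per (date, HH:MM) bucket in W3C extended log lines."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The header names the columns used by the rows that follow.
            fields = line.split()[1:]
            continue
        if line.startswith("#"):
            continue  # other directives such as #Software or #Date
        row = dict(zip(fields, line.split(" ")))
        if "date" in row and "time" in row:
            # Keep only HH:MM so every request in the same minute shares a key.
            counts[(row["date"], row["time"][:5])] += 1
    return counts


def busiest_minutes(counts, top=5):
    """Return the busiest minutes as ((date, 'HH:MM'), request_count) pairs."""
    return counts.most_common(top)
```

To use it, open one of the site's log files and pass the file object to `requests_per_minute`, then print `busiest_minutes(...)`. If the busiest minutes all sit at :00, the spike is probably driven by requests and not by something running on the server itself.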
IIS is on its default settings, so yes, it's the 29-hour app pool recycle.
There are no external tools on the server. It's all fresh out of the box. Strange.
Turns out that we already had fcnMode="Single".
Update:
fcnMode="Disabled" didn't work.
Update:
So we found the problem. The site is experiencing an hourly DDoS attack.
The thing that threw us off was that it wasn't happening every hour.
Glad you figured it out! FYI, Cloudflare or another WAF can help mitigate DDoS attacks.
BTW, you might want to confirm that it's a DDoS attack and not an overly aggressive search engine. I've had search engines cause me headaches in the past.
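A rough way to tell the two apart is to look at who is sending the requests during a spike. The sketch below groups log rows by client IP and user agent, assuming IIS W3C extended logs with the `time`, `c-ip` and `cs(User-Agent)` fields enabled. The function names and the `time_prefix` parameter are hypothetical, and user agents can be spoofed, so treat the result as a hint, not proof.

```python
from collections import Counter


def parse_w3c(lines):
    """Yield one dict per request row, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split(" ")))


def top_clients(lines, time_prefix="", top=10):
    """Count requests per (client IP, user agent).

    time_prefix limits the count to rows whose time starts with that text,
    e.g. "12:00" for the first minute of the noon spike (log times are UTC).
    """
    counts = Counter()
    for row in parse_w3c(lines):
        if row.get("time", "").startswith(time_prefix):
            key = (row.get("c-ip", "-"), row.get("cs(User-Agent)", "-"))
            counts[key] += 1
    return counts.most_common(top)
```

A handful of IPs sharing a crawler user agent suggests a search engine, which may respond to a crawl-rate limit. A large number of unrelated IPs each sending a burst of requests looks more like a distributed attack.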