I have a customer with a v7 site, and we are continually fighting the growing amount of repetitive spam they are getting through their Umbraco forms. We initially used reCAPTCHA v2 and then replaced it with v3, but v3 produced too many false positives and, as they are a charity, it was costing them donations, so we had to remove it.
We have tried a honey pot and a timestamp field (to reject submissions that are too fast to be human), and both have had limited success. Most of the current spam includes links in the enquiry field, so we have added regex validation on that field, in addition to the timestamp, as another means of blocking it without affecting genuine enquiries.
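To illustrate the kind of link check described above, here is a minimal sketch. It is written in Python purely to show the logic (an Umbraco site would need the equivalent in C#), and the pattern, the TLD list and the name contains_link are assumptions for illustration, not the regex actually used on the site.

```python
import re

# Hypothetical pattern: matches http(s):// URLs, "www." prefixes, and bare
# domains ending in a few TLDs. The TLD list is an assumption; tune it
# against real spam samples.
LINK_PATTERN = re.compile(
    r"(https?://|www\.)\S+|\b[a-z0-9-]+\.(com|net|org|ru|xyz|info|biz)\b",
    re.IGNORECASE,
)


def contains_link(text: str) -> bool:
    """Return True if the enquiry text appears to contain a link."""
    return LINK_PATTERN.search(text) is not None
```

One caveat with this sketch: the bare-domain branch also matches email addresses typed into the enquiry text, so it may be too strict if genuine users tend to include their address there.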
I have tried many times to post the exact same submissions that have got through, but each time they are blocked, so I am at a loss as to how they are getting through. Even if I disable JavaScript to get past the client-side validation, the submission is blocked on the server side.
Whilst we are really keen to get this resolved, I am also, at a professional level, now very curious as to how they are successfully getting the posts through. Any ideas on how they are doing it, and are there any other options to combat it?
I have been seeing exactly the same issues for some time across multiple client sites, to the extent that I almost consider reCAPTCHA to be useless.
Like you, I use a honey pot and a timestamp to filter out submissions that are too fast. I allow the client to adjust the cut-off point for the number of seconds.
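A honey pot plus minimum-time check along these lines could be sketched as follows. This is Python for illustration only, not the actual workflow code; the function name, parameter names and the 5 second default are all assumptions, with min_seconds standing in for the client-adjustable cut-off.

```python
import time


def is_probably_bot(honeypot_value, rendered_at, min_seconds=5.0, now=None):
    """Flag a submission as likely automated.

    honeypot_value: content of the hidden field (humans leave it empty).
    rendered_at:    Unix timestamp recorded when the form was rendered.
    min_seconds:    hypothetical client-adjustable cut-off.
    now:            current Unix time; defaults to time.time().
    """
    # Any text in the hidden field means something filled in every input.
    if honeypot_value.strip():
        return True
    if now is None:
        now = time.time()
    elapsed = now - rendered_at
    # Too fast to be human. A timestamp in the future gives a negative
    # elapsed time, so it is flagged here as well.
    return elapsed < min_seconds
```

Note that if the timestamp travels in a plain hidden field, a bot can simply send an older value, which may be one reason this check only has limited success.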
I also added an email blacklist facility for clients in the backoffice so they can block repeat offenders. This doesn't get widely used since they don't see many repeat addresses but is occasionally useful.
Beyond that, I regularly analyse the spam to identify patterns and I have added tests for those conditions to my spam filtering workflow over time. Here are a few examples:
Compare the first name and last name fields (where clients previously used a single "name" field, I've split it into two). Previous submissions showed that the vast majority of spammers enter exactly the same thing in both fields, so I filter those out.
Check the company name field. Many spammers submit a value which is either the same as the first and last name fields, or simply "Google". So I compare it with the first and last name, and also allow clients to enter a comma-separated list of company names to filter out, so they can update the list if they see a recurring name.
First and last name start with the same characters but are slightly different. These started slipping through the net recently, for example first name: Dorotxmr and last name: DorotktwOX. I now check whether both fields start with the same characters AND the length of the repeated characters is greater than, say, 4 (otherwise you start filtering out names like Joe Johnson or Sarah Sampson).
The dot test. If the submitter's email address has more than 4 dots before the @, I assume it's a spam email address.
I haven't implemented this one yet, but you could also check the email address field and filter out any addresses with an ending from a predefined list that you would never expect to receive form submissions from, for example .xyz or .ru (maybe dubious to exclude a whole country??)
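The checks above can be sketched roughly like this. It is Python to show the logic only, not the actual Umbraco Forms workflow (which would be C#), and every function and parameter name here is made up for illustration. The email ending check is off by default, since it is only a suggestion above.

```python
import os


def shared_prefix_length(a, b):
    """Number of leading characters two strings share, ignoring case."""
    return len(os.path.commonprefix([a.lower(), b.lower()]))


def spam_reasons(first_name, last_name, company, email,
                 blocked_companies="Google", blocked_endings=(),
                 min_prefix=4, max_dots=4):
    """Return a list of reasons a submission looks like spam (empty = clean).

    blocked_companies is a comma-separated string, as a client might enter it.
    blocked_endings is an optional tuple such as (".xyz", ".ru").
    """
    reasons = []
    first, last, comp = (s.strip().lower()
                         for s in (first_name, last_name, company))
    blocked = {c.strip().lower()
               for c in blocked_companies.split(",") if c.strip()}

    if first and first == last:
        reasons.append("identical names")
    elif shared_prefix_length(first, last) > min_prefix:
        reasons.append("names share a long prefix")

    if comp and (comp in (first, last) or comp in blocked):
        reasons.append("suspicious company")

    # The dot test: count dots in the part before the @.
    local_part = email.partition("@")[0]
    if local_part.count(".") > max_dots:
        reasons.append("too many dots in email")

    # Optional ending check; an empty tuple never matches.
    if email.strip().lower().endswith(tuple(blocked_endings)):
        reasons.append("blocked email ending")

    return reasons
```

Returning the reasons rather than a plain yes/no makes it easier to see which test caught an entry when reviewing moderated submissions, and to spot a rule that is filtering out genuine enquiries.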
Implementing, reviewing and updating this is a pain but I keep it all in an Umbraco Forms Workflow so I can easily share it between projects. I switch on post moderation and leave any spam at Submitted status, while all genuine submissions get set to Approved. Clients can then review the entries and manually approve any genuine entries that have been accidentally identified as spam.
Umbraco Forms Spam Prevention
Hope some of that helps!
Thanks, Chris. The problem with workflows is that they run after the form has already been submitted. I want to block the spam before that even happens. I just can't work out how they are getting it submitted. I'm almost certain that the spam that remains, now that we have reduced it, is being created manually by a humanoid and not a bot!
I may, however, be forced into going down a similar route to you as a last resort and implementing workflows to treat the symptoms until a cure can be found :(
You can add an event handler. https://our.umbraco.com/Documentation/Add-ons/UmbracoForms/Developer/Extending/Adding-an-Event-Handler-v8
The docs include an example of a form validation event. You can run your extra checks there, before the form is submitted.
I am trying to solve this on a v7 site and I didn't think those event handlers existed back then, but I have just checked and they do, so I will take a look into it. Thanks, Liam.
I think I remember taking something I wrote for a version 8 site to help with spam on a version 7 site. It needed a few changes in the code, but using IntelliSense it was no biggie.