reCAPTCHA is broken

In the words of the Security Level setting for CloudFlare’s Firewall:

I’m Under Attack!

As it turns out, I’m not a stranger to having bots on my social network,
Ginpop.com.

I’ve even documented my previous experience with a DDoS attack and how
to mitigate a WordPress Pingback attack!

This particular attack was special though. It reminded me that the site is still
kind of relevant if somebody went out of their way to script something to try to
exploit it with their spam links.

I went to the link, it wasn’t even that good :-/

The attack itself wasn’t sophisticated in the least. A new account was
created, the email was verified (by clicking the link in the message I sent
them), and then they would upload a photo of a hot chick and update the bio to
have a short CTA and a link to their lame landing page.

If I were going to express the bio text in RegEx it would look something like
this:

/(Welcome to|Best dating|My (collection|new sexy)|My private webcam|PRIVATE VIDEO|New sex).*bit\.ly\/.*/

Or something like that. I eventually realized that every one of the fucking
profiles was using the same link.
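For anyone who wants to try that pattern against real bios, here is a rough Python sketch. The phrases come from the expression above; the function name and the sample bios are made up for illustration, and the pattern is only as approximate as the original.

```python
import re

# Approximation of the bio pattern described above. The phrases are from the
# post; everything else (names, sample bios) is a hypothetical illustration.
SPAM_BIO = re.compile(
    r"(Welcome to|Best dating|My (collection|new sexy)|My private webcam"
    r"|PRIVATE VIDEO|New sex).*bit\.ly/.*"
)


def looks_like_spam(bio: str) -> bool:
    """Return True when a known phrase is followed by a bit.ly link."""
    return SPAM_BIO.search(bio) is not None


if __name__ == "__main__":
    for bio in ("Welcome to my page! bit.ly/placeholder", "Just here for the photos"):
        print(bio, "->", looks_like_spam(bio))
```

Note that `.` does not match newlines by default, so a bio with the phrase and the link on separate lines would slip past this version.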

That’s on me and made cleanup pretty easy. Just update the database to flag all
the accounts with that URL in the bio for deletion.
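The cleanup amounts to a single UPDATE. Here is a minimal sketch using Python's built-in sqlite3 module; the `users` table, the `bio` and `flagged_for_deletion` columns, and the placeholder URL are all assumptions for illustration, not the site's actual schema.

```python
import sqlite3


def flag_spam_accounts(conn: sqlite3.Connection, spam_url: str) -> int:
    """Flag every account whose bio contains spam_url; return the number flagged."""
    # Escape LIKE wildcards so the URL is matched literally.
    escaped = spam_url.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    cur = conn.execute(
        "UPDATE users SET flagged_for_deletion = 1 WHERE bio LIKE ? ESCAPE '\\'",
        (f"%{escaped}%",),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    # Hypothetical schema and rows, purely for demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, bio TEXT, "
        "flagged_for_deletion INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO users (bio) VALUES (?)",
        [("My private webcam bit.ly/placeholder",), ("I like hiking",)],
    )
    print("flagged:", flag_spam_accounts(conn, "bit.ly/placeholder"))
```

Flagging instead of deleting outright leaves room to eyeball the matches before anything is actually removed.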

While all of this was happening, I was attempting to block the IP addresses as
new accounts were being created.

They weren’t being created too quickly, but it was definitely a losing battle on
my end to try to do things manually.

In the past I had piped IP addresses out to ufw to deny them.
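That approach looks roughly like the sketch below. It only builds the `ufw deny from <ip>` commands and does not run them; the function name and the sample addresses (from the documentation ranges) are hypothetical.

```python
import ipaddress


def ufw_deny_commands(addresses):
    """Turn raw address strings into `ufw deny from <ip>` commands, skipping junk."""
    commands = []
    seen = set()
    for raw in addresses:
        try:
            ip = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # not a valid IPv4/IPv6 address, ignore it
        if ip in seen:
            continue  # already emitted a rule for this address
        seen.add(ip)
        commands.append(f"ufw deny from {ip}")
    return commands


if __name__ == "__main__":
    for cmd in ufw_deny_commands(["203.0.113.5", "not-an-ip", "198.51.100.7"]):
        print(cmd)
```

Validating each address before it gets anywhere near a shell is the main point here, since the input ultimately comes from request logs.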

Since I don’t like to consider myself a one trick pony, I figured I may as well
investigate a bit further and see if I could come up with something different.

I disabled new user registration at this point and started to interrogate the
data that was being submitted to create the accounts.

My observations were:

- Most of the accounts were coming from distinct IP addresses. Said IP addresses
  were distributed globally but seemed to be predominantly European.
- They all had the exact same user agent string:
  Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
- The email addresses used to sign up didn’t span many different domains. Maybe
  10 or 12, tops.
- Nearly all of the domains I ran a whois on were using dnsowl.com, which
  belongs to NameSilo (which is coincidentally my preferred domain registrar;
  they have since been contacted regarding this observation).
- They all had a g-recaptcha-response, which means they successfully got
  through the CAPTCHA and in theory are “human”.
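Observations like these fall out of a few quick tallies. Here is a small sketch of that kind of tally, assuming each registration is a dict with hypothetical `ip`, `user_agent`, and `email` keys (the real data surely looks different).

```python
from collections import Counter


def summarize(registrations):
    """Tally user agents, email domains, and distinct IPs across sign-ups."""
    agents = Counter(r["user_agent"] for r in registrations)
    # Everything after the last "@" is treated as the email domain.
    domains = Counter(r["email"].rsplit("@", 1)[-1].lower() for r in registrations)
    distinct_ips = len({r["ip"] for r in registrations})
    return {"user_agents": agents, "email_domains": domains, "distinct_ips": distinct_ips}


if __name__ == "__main__":
    ua = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"
    sample = [  # made-up rows for demonstration
        {"ip": "203.0.113.5", "user_agent": ua, "email": "a@example.com"},
        {"ip": "198.51.100.7", "user_agent": ua, "email": "b@Example.com"},
    ]
    summary = summarize(sample)
    print(summary["user_agents"].most_common(3))
    print(summary["email_domains"].most_common(3))
    print(summary["distinct_ips"])
```

One user agent dominating while the IP count stays high is exactly the shape described in the list above.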

Am I dealing with one of those more sophisticated spam shops where they are
throwing bodies at the CAPTCHAs, or did I miss an update or something from Google
and my site has been lying vulnerable while I run around with egg on my face?

Upon further investigation on Google’s reCAPTCHA site, I was in fact using the
latest version of reCAPTCHA. The only gotcha I noticed was that I didn’t have
the security level maxed out under the advanced settings.

I’ve since updated the setting to be “most secure” but in typical Google fashion
there wasn’t much context as to what that would actually do and if it could
solve anything for me.

I had also flipped the site to “Under Attack” mode on CloudFlare just in case.
This particular change has already started to be a burden to some of my more
impatient users.

Digging even further around the reCAPTCHA site, I came across some request data
that was most troubling. It seems that as of the last week or so there have been
days where most users (90% last Saturday) were not being served a CAPTCHA at
all:

[Image: Google reCAPTCHA Requests]

Would be nice if Google had an option to disable this “feature”. At least for my
implementation, I want every single person to be served the CAPTCHA.

Since that’s not the case, I went ahead and implemented a couple of things to
remedy the situation.

First, as it was the most obvious, I blocked any registrations from that
particular user agent.

There may be some false positives being blocked from being able to register, but
my thought was, anybody running Firefox is probably running on something a bit
more recent than 45 which was released in 2016.

Also helps that most of my users are on Safari or Chrome.
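The user agent block is about as simple as it sounds. Here is a minimal sketch, assuming an exact string match and a hypothetical helper name; how it hooks into the actual registration handler is not shown.

```python
from typing import Optional

# The one user agent string every spam registration shared (from the post).
BLOCKED_USER_AGENTS = {
    "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0",
}


def is_blocked_user_agent(user_agent: Optional[str]) -> bool:
    """Return True when the request's User-Agent exactly matches a blocked string."""
    if user_agent is None:
        return False  # a missing header is a separate problem, not handled here
    return user_agent.strip() in BLOCKED_USER_AGENTS


if __name__ == "__main__":
    print(is_blocked_user_agent(
        "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"
    ))
    print(is_blocked_user_agent("SomeOtherBrowser/1.0"))
```

An exact match keeps the false positives limited to people genuinely running that one build of Firefox 45 on 64-bit Windows 10.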

The second piece of the puzzle was to implement my own janky CAPTCHA:

[Image: joshCAPTCHA FTW]

Yep, that’s it. Just an additional check box.

Straight out of the 90s, amirite?

If the value of the field is empty, the registration is rejected. Nothing too
fancy and only took a minute to implement.
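Server side, the check boils down to something like the sketch below. The field name `human_check` and the dict-shaped form data are assumptions for illustration; the real form field may be named anything.

```python
def passes_checkbox(form: dict) -> bool:
    """Reject the registration when the extra checkbox field is missing or empty."""
    # An unchecked checkbox is normally not submitted at all, so a missing key
    # is treated the same as an empty value.
    value = form.get("human_check") or ""
    return bool(str(value).strip())


if __name__ == "__main__":
    print(passes_checkbox({"email": "a@example.com", "human_check": "on"}))
    print(passes_checkbox({"email": "a@example.com"}))
```

A bot that replays a recorded form submission will not include the new field, which is the whole trick.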

Best as I can tell, whatever piece of shit software these spammers are running
isn’t going to be quickly adapted to this change.

Next steps are to continue monitoring which registrations are making it through
and tweaking the formula as necessary.

About Josh