Can CAPTCHA Be Hacked?

By Brennan Whitfield

November 12, 2021

If you spend time on the internet, you are likely familiar with requests to select all the squares containing stoplights or to retype a distorted word. These requests are examples of CAPTCHA, a system used to verify that a user is human before granting access to a digital account. Since CAPTCHA was introduced in the early 2000s, its tasks have become more complicated to keep up with advances in artificial intelligence (AI) and its ability to learn how to crack these tests. This raises the question: does making these tests more complex actually deter bots? Can CAPTCHA technology easily be hacked?

What is CAPTCHA?

CAPTCHA is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart,” and acts as a security method during account entry to prevent unauthorized access by machines or bots. Websites implement CAPTCHA to prevent bot spam and to ensure that a digital environment is inhabited solely by humans.

An influx of bot accounts on a digital platform can cause a multitude of problems. Without visibility into which accounts are created by bots, a platform is exposed to mass spam, hoarding of items in online stores, skewed results in online polls, and, in extreme cases, DDoS attacks. CAPTCHA helps prevent these incidents and maintain a level playing field by validating that every account is created by a real person.

CAPTCHA aims to validate that a user is human through various tests in which the user is asked to correctly reproduce or categorize a sample of text, imagery, or audio. The tests are deliberately cumbersome so that they are difficult for bots to understand and solve but still possible for humans. However, the reliability and necessity of common CAPTCHA tests are increasingly being questioned as AI technology grows more capable over time.

Yes, CAPTCHA Can Be Hacked

CAPTCHA in all of its forms can be hacked or bypassed, and easily so. There are even courses that teach how to create bots that bypass image-based and text-based CAPTCHA. The notion of CAPTCHA as a dependable stronghold in the digital world fades as one looks more closely at how the tests operate.

Nikolai Tschacher, a computer science researcher, stated in a 2021 blog post that a workaround exists for reCAPTCHA tests (version 2 and above), a type of CAPTCHA system provided by Google. In a reCAPTCHA test, a user verifies that they are human simply by clicking a checkbox labeled “I’m not a robot”. Though the task seems easy, reCAPTCHA tracks the movement of the user’s mouse behind the scenes and determines whether it more closely resembles cursor movement by a human or by a machine. However, the test offers an alternative that accommodates visually impaired users, who may instead request to play an audio clip and type what they hear. This audio option is where Tschacher and other researchers found the test’s weakness.

By selecting the audio option in a reCAPTCHA test, you can download the MP3 file of the audio challenge and feed it into Google’s own speech-to-text API. The API transcribes the audio into text, which can then be pasted into the reCAPTCHA solution box. Tschacher claimed that Google’s API yields the correct answer in 91% of cases.
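
To put the 91% figure in perspective, the short calculation below estimates how quickly a bot would get through if it were allowed to retry. It is only a back-of-the-envelope sketch: it assumes every attempt succeeds independently at the same rate, which the original claim does not state, and the function names are made up for this illustration.

```python
def pass_probability(per_attempt_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` tries succeeds.

    Assumes each try is independent and succeeds with the same
    probability (an assumption made for this illustration only).
    """
    if not 0.0 <= per_attempt_rate <= 1.0:
        raise ValueError("per_attempt_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # The only way to fail overall is to fail every single attempt.
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


def expected_attempts(per_attempt_rate: float) -> float:
    """Average number of tries until the first success (geometric distribution)."""
    if not 0.0 < per_attempt_rate <= 1.0:
        raise ValueError("per_attempt_rate must be in (0, 1]")
    return 1.0 / per_attempt_rate


if __name__ == "__main__":
    rate = 0.91  # success rate claimed in the blog post
    for n in (1, 2, 3):
        print(f"{n} attempt(s): {pass_probability(rate, n):.4f}")
    print(f"expected attempts: {expected_attempts(rate):.2f}")
```

Under these assumptions, a second attempt already pushes the chance of passing above 99%, so limiting retries alone would do little to stop a solver with this accuracy.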

Looking further, text-based CAPTCHA, the most widely used type, was reported in 2017 to have been cracked with 81% accuracy by a technology known as the Recursive Cortical Network (RCN). RCN is a graphical AI model developed by Vicarious that mimics the functions of the human visual cortex and the way we process visual stimuli. RCN is able to separate the shape of an object from its texture and appearance. Rather than treating a distorted letter or character as a random shape, the model identifies these objects in a CAPTCHA and then classifies them based on its knowledge of the alphabet. RCN was designed around concepts from human neuroscience, resulting in visual identification capabilities eerily similar to those of real people.

Thanks to the advancing abilities of AI, technologies such as RCN are able to get past text-based (and likely image-based) CAPTCHA the majority of the time. If AI has been shown to be capable enough to nullify CAPTCHA tests, what alternative can still keep websites free of bots?

humanID – No Common CAPTCHA Needed

humanID is login authentication software that does away with hackable CAPTCHA tests to verify that a user is human. In lieu of a common CAPTCHA, a website that has implemented humanID validates a user at their initial login by asking for their phone number. Unlike email addresses, phone numbers are time-consuming and likely expensive for bots to obtain in large numbers.

After a user’s phone number is provided and confirmed, the number is converted into a non-reversible hash, a unique identifier used to authenticate the user in future logins. The phone number is then promptly deleted. The hash is used only for identification, such as logins and platform use, and cannot be traced back to the number the user provided. humanID’s method of authentication promotes both a bot-free digital space and an anonymous, time-saving login experience.
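
The idea of turning a phone number into a one-way identifier can be sketched in a few lines of Python. This is not humanID’s actual implementation, which is not described here. It is a minimal illustration that assumes a keyed hash (HMAC-SHA256) with a server-side secret, and the names `SERVER_SECRET`, `normalize_phone`, `phone_to_identifier`, and `is_returning_user` are hypothetical. A keyed hash is used because phone numbers are short enough that a plain, unkeyed hash could be reversed by simply trying every possible number.

```python
import hashlib
import hmac
import re
import secrets

# Hypothetical server-side secret. A real service would load a fixed key
# from secure storage so identifiers stay stable across restarts.
SERVER_SECRET = secrets.token_bytes(32)


def normalize_phone(raw: str) -> str:
    """Keep digits only, so formatting differences map to the same number."""
    digits = re.sub(r"\D", "", raw)
    if not digits:
        raise ValueError("phone number contains no digits")
    return digits


def phone_to_identifier(raw_phone: str, secret: bytes = SERVER_SECRET) -> str:
    """Derive a one-way identifier; the phone number itself is never stored."""
    normalized = normalize_phone(raw_phone)
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def is_returning_user(raw_phone: str, stored_identifier: str) -> bool:
    """Recompute the identifier at login and compare it in constant time."""
    return hmac.compare_digest(phone_to_identifier(raw_phone), stored_identifier)


if __name__ == "__main__":
    stored = phone_to_identifier("+1 (555) 010-0199")  # kept after sign-up
    print(is_returning_user("15550100199", stored))    # same number, new format
    print(is_returning_user("15550100198", stored))    # different number
```

In this sketch only the identifier is kept after sign-up. The same number always produces the same identifier, which is what lets a returning user be recognized without the service holding on to the number itself.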

