Tuesday, January 24, 2012

CAPTCHA factories

It seems to be the week for me to write about social issues in computing, so I'll toss one more in. Then, I promise, I'm done with that slant for a while, because it's really not my sort of thing.

Anyway, I can't help but note the fascinating (in a can't-look-away-from-the-abject-horror sort of way) recent research done by a team at the University of California, San Diego: "Re: CAPTCHAs -- Understanding CAPTCHA-Solving Services in an Economic Context".

CAPTCHAs, of course, are those strange little pictures full of numbers and letters, generally semi-garbled or semi-distorted, that various web pages ask you to type in to prove you're a human being, not a robot. The idea is to deter people from writing programs that manipulate web pages meant only for human beings. The acronym stands for "Completely Automated Public Turing test to tell Computers and Humans Apart".

Do they work? Well, according to the UCSD study, they actually do work fairly well, at least under a strict interpretation of their stated goal. After testing several specialized programs designed to "break" CAPTCHAs, the researchers found that the automated solvers were generally not successful:

We observed an accuracy of 30% for the 2008-era test set and 18% for the 2009-era test set using the default setting of 613 iterations, far lower than the average human accuracy for the same challenges (75–90% in our experiments).

So CAPTCHAs are working, right?

Well, not so fast.

It turns out that there is an immense industry of CAPTCHA-solving, and the solvers are actual human beings, not computer programs:

there exists a pool of workers who are willing to interactively solve CAPTCHAs in exchange for less money than the solutions are worth to the client paying for their services.

These people apparently sit in front of computers for hours at a time, doing nothing but solving CAPTCHAs shown to them by Internet-based solving services, which then turn around and sell the solutions to paying clients:

Since solving is an unskilled activity, it can easily be sourced, via the Internet, from the most advantageous labor market—namely the one with the lowest labor cost. We see anecdotal evidence of precisely this pattern as advertisers switched from pursuing laborers in Eastern Europe to those in Bangladesh, China, India and Vietnam

How much do these people end up getting paid? Almost nothing, but still enough to attract workers:

on Jan. 1st, 2010, the average monthly payout to the top 100 earners decreased to $47.32. In general, these earnings are roughly consistent with wages paid to low-income textile workers in Asia [12], suggesting that CAPTCHA-solving is being outsourced to similar labor pools

What do the authors conclude from all of this? The answer is that you can view the whole arrangement in an economics framework:

Put simply, a CAPTCHA reduces an attacker’s expected profit by the cost of solving the CAPTCHA. If the attacker’s revenue cannot cover this cost, CAPTCHAs as a defense mechanism have succeeded. Indeed, for many sites (e.g., low PageRank blogs), CAPTCHAs alone may be sufficient to dissuade abuse. For higher-value sites, CAPTCHAs place a utilization constraint on otherwise “free” resources, below which it makes no sense to target them. Taking e-mail spam as an example, let us suppose that each newly registered Web mail account can send some number of spam messages before being shut down. The marginal revenue per message is given by the average revenue per sale divided by the expected number of messages needed to generate a single sale. For pharmaceutical spam, Kanich et al. [14] estimate the marginal revenue per message to be roughly $0.00001; at $1 per 1,000 CAPTCHAs, a new Web mail account starts to break even only after about 100 messages sent.
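The break-even arithmetic at the end of that passage is easy to check for yourself. Here is a minimal sketch in Python, assuming the simplest possible model: the only cost of a new account is one solved CAPTCHA, and every message earns the same marginal revenue. The function name and parameters are my own, not anything from the paper, and I use exact fractions so that floating-point rounding doesn't nudge the answer.

```python
import math
from fractions import Fraction


def break_even_messages(price_per_1000_captchas: str, revenue_per_message: str) -> int:
    """Messages an account must send before one solved CAPTCHA pays for itself.

    Hypothetical helper: assumes the CAPTCHA is the only per-account cost
    and that revenue per message is constant. Amounts are decimal strings
    in dollars so they convert to exact fractions.
    """
    cost_per_captcha = Fraction(price_per_1000_captchas) / 1000
    per_message = Fraction(revenue_per_message)
    # Round up: a fractional message cannot be sent.
    return math.ceil(cost_per_captcha / per_message)


# Figures quoted above: $1 per 1,000 CAPTCHAs, $0.00001 per message.
print(break_even_messages("1", "0.00001"))
```

With the quoted figures this prints 100, matching the paper's "about 100 messages". Doubling the price of solving to $2 per 1,000 doubles the break-even point to 200 messages, which is the whole point of the economic framing: the CAPTCHA doesn't stop the attacker, it just raises the bar on how much use each account has to yield.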

It's a cold, calculating, unfeeling analysis, but it's an absolutely fascinating paper, easy to read and full of examples and details about this corner of the Internet. I never knew this existed, and I'm wiser now that I do.

However, I still felt the irresistible urge to go and wash my hands after learning all this. :(
