CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are those little “type the word/letters/numbers you see above” tests on various web pages that ask you to prove you are actually a human, not just a computer. If you spend much time online, you are likely to run into them.
Two articles in the past couple of days, one on NPR and one at the BBC, talk about a new program, “reCAPTCHA,” that ties CAPTCHAs into library scanning efforts, putting the human brain to use deciphering faded text that computers can’t recognize.
In some documents, where ink has faded and paper has yellowed, the character reading software can flag up to 20% of words as indecipherable. The hard-to-read words are then farmed out to the many thousands of sites that have signed up to be Recaptcha partners. Words are supplied to sites along with a control word that aims to ensure the person answering is human.
The responses to the obscured text are added to a database and particularly mangled text will be put before several people to ensure it is read accurately. Reporting in the journal Science the Recaptcha team says the scheme is about 99.1% accurate – as good as professional transcribers and beyond the limit demanded by archivists.
Deucedly clever.
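For the curious, here is a rough sketch of how I imagine the control-word trick and the "several people" check fitting together. It is a toy model based only on the description quoted above, not reCAPTCHA's actual code: the `WordPool` class, its method names, and the vote and agreement thresholds are all my own assumptions.

```python
from collections import Counter, defaultdict


class WordPool:
    """Toy model of the control-word-plus-unknown-word scheme (hypothetical)."""

    def __init__(self, min_votes=3, min_agreement=0.6):
        # Both thresholds are assumed values chosen for illustration.
        self.min_votes = min_votes
        self.min_agreement = min_agreement
        # Maps an unknown word's id to a tally of the readings people typed.
        self.votes = defaultdict(Counter)

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Record a guess for the unknown word only if the control word was right."""
        if control_answer.strip().lower() != control_truth.strip().lower():
            return False  # failed the human check, so the guess is discarded
        self.votes[unknown_id][unknown_answer.strip().lower()] += 1
        return True

    def resolved(self, unknown_id):
        """Return the agreed reading, or None if agreement is not yet strong enough."""
        counts = self.votes.get(unknown_id, Counter())
        total = sum(counts.values())
        if total < self.min_votes:
            return None
        answer, top = counts.most_common(1)[0]
        return answer if top / total >= self.min_agreement else None
```

The idea is that the control word, whose answer is already known, does the human-versus-computer test, while the unknown word simply collects votes until enough people agree on one reading.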
In the last year it has helped resolve more than 440 million words and has just helped to complete the conversion of the entire archive of the New York Times from 1908 into digital form.
Excellent. If I didn’t already have a similar mechanism on this site (the little TinyTuring “type the letter” test in my comments) that works so well and so easily, I’d be sorely tempted to sign up.