I love it when a plan comes together …
Last August, I noted with glee the introduction of reCAPTCHA. Rather than a CAPTCHA scheme that just grabs random words, reCAPTCHA actually serves a purpose other than “just” security.
To archive human knowledge and to make information more accessible to the world, multiple projects are currently digitizing physical books that were written before the computer age. The book pages are being photographically scanned, and then transformed into text using “Optical Character Recognition” (OCR). The transformation into text is useful because scanning a book produces images, which are difficult to store on small devices, expensive to download, and cannot be searched. The problem is that OCR is not perfect.
reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.
But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
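The two-word trick described above can be sketched in a few lines of Python. This is only an illustration of the idea, not reCAPTCHA's actual code: the class name, the method names, and the `votes_needed` threshold are all made up for the example.

```python
from collections import Counter, defaultdict


class WordPairChecker:
    """Hypothetical sketch: one known control word plus one unknown word."""

    def __init__(self, votes_needed=3):
        # Assumed threshold; the description only says "a number of other people".
        self.votes_needed = votes_needed
        # Maps an unknown word's image id to a tally of the readings typed for it.
        self.votes = defaultdict(Counter)

    def submit(self, known_answer, typed_known, unknown_id, typed_unknown):
        """Return True if the user passes, judged only on the known word."""
        if typed_known.strip().lower() != known_answer.strip().lower():
            return False  # failed the control word, so discard the other guess
        # The user got the control word right, so count their reading as a vote.
        self.votes[unknown_id][typed_unknown.strip().lower()] += 1
        return True

    def resolved(self, unknown_id):
        """Return the agreed reading once enough users agree, else None."""
        counts = self.votes.get(unknown_id)
        if not counts:
            return None
        answer, count = counts.most_common(1)[0]
        return answer if count >= self.votes_needed else None
```

A user is judged only on the word the system already knows; the reading of the unknown word is simply tallied until enough people agree on it.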
That is so wildly, incredibly cool that I simply cannot stand it. Though the reCAPTCHA prompts are mildly annoying (not surprisingly, they are not always readable, though you can refresh them), that they are being used to “digitize books from the Internet Archive and old editions of the New York Times” is so spiffy, I’m willing to inconvenience myself and others, especially since it will cut down on spam at this site.
A couple of notes:
- I’ve set reCAPTCHA to show up in two places. First, if you register at this site, it will use reCAPTCHA to try to verify that you are real, not some evil SpamBot. Second, if you aren’t registered and try to comment, reCAPTCHA will pop up to validate that you are a Real Human. If you are registered, it won’t prompt you when you go to comment.
- I’ve deleted the already-registered users to this site (four of them) whose names I didn’t recognize. My apologies if you are a real person. Re-register (which will force you past a reCAPTCHA) and you’re set again.
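The rule in the first note boils down to a small check. Here is a minimal Python sketch of it; the function name and the action strings are hypothetical, and this is not the code of whatever plugin the site actually runs.

```python
def needs_captcha(action, is_registered):
    """Return True when a reCAPTCHA should be shown (hypothetical helper)."""
    if action == "register":
        return True  # every new registration has to pass one
    if action == "comment":
        return not is_registered  # registered users skip it when commenting
    return False  # nothing else on the site is gated in this sketch
```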
We’ll see how this works. If I get a lot of complaints, I’ll reconsider. But … so cool …
Inutterably cool!
Making another comment simply because it’s fun to recaptcha!
Just had to try it myself 🙂
Well, if you register, you won’t get to do any more. 🙂
All the more reason not to register! Ha!
Actually, Margie, I have you already registered. We can go over it later.
What role will registration play on your site?
All registration does at this point is let you bypass the comment CAPTCHA and skip filling in info at the comments (name, mail addy, etc.).
In theory I could use it to allow folks to post stuff here, and Margie got registered during the import because there were a couple of posts here under her name; it also lets me set her up to be a backup editor if, for some reason, it was required.
How, pray tell, does one register?
I’ve put the “Metalinkag” widget up toward the top of the sidebar so it’s more visible. It includes a “Register” (and “Login”) link.
I just wanted to point out that my reCaptcha for this entry is “tingling nowhere”.
I hope that’s not commentary . . .
Heh.
By the way, the Meta widget was bugging me (okay, the dynamic login/out/register/admin links were cool, but I didn’t need duplicate RSS links or a link to wordpress.org). So I used the instructions here (http://www.bigfootwebmarketing.com/2008/04/02/change-meta-widget-information-what-code-do-i-change/) to update it. Which I’m writing down here so that I remember …
Though this item ( http://wordpress.org/support/topic/197173?replies=3 ) has some better advice, which I may retrofit sometime in the future.
Now why did I register? I was going to comment on something, but now I’ve forgotten what… aging is not for the weak!
Welcome Deb. 🙂
Y’know, if you put the URL to your blog in the proper spot in the registration, it’ll generate a neat little link to it from here. 🙂
Here is an easier CAPTCHA that also addresses spam: http://demo.vidoop.com/captcha
We offer it as a free web service and WordPress plugin for easy install. We invite your feedback.
I don’t know that I’d call it easier. Probably less ambiguous, but it also took me several seconds to figure out the correct sequence of letters. An interesting idea, though, thanks.