An interesting fact about reCAPTCHA

Status
Not open for further replies.

Thrill

Banned
If you're anything like me, you always thought CAPTCHA & reCAPTCHA were basically the same thing, and their only purpose was to stop spam.

I was watching a documentary on reCAPTCHA a little while ago, and as it turns out, not only are CAPTCHA and reCAPTCHA totally different, but reCAPTCHA is also used for a completely different purpose.
Taken from Wikipedia:
The reCAPTCHA service is a user-dialogue system originally developed at Carnegie Mellon University's main Pittsburgh campus. It uses the CAPTCHA interface, of asking users to enter words seen in distorted text images onscreen, to help digitize the text of books, while protecting websites from bots attempting to access restricted areas.[1] On September 16, 2009, Google acquired reCAPTCHA.[2] reCAPTCHA is currently digitizing the archives of The New York Times and books from Google Books.[3] As of 2009, twenty years of The New York Times had been digitized and the project planned to have completed the remaining years by the end of 2010.[4]




When a book page is scanned, word recognition (OCR) software works out which words are on the page, but when you have billions of pages, not every word is going to be recognized and accounted for. This is where reCAPTCHA comes in.

Maybe a lot of you already knew this, but I didn't, so it came as a total surprise to me.




B-)
 
12 comments
The article could be updated: since Google launched Street View, it has been using reCAPTCHA to read street signs, shop signs, and the numbers on houses, shops, etc. Nothing sensitive or that would affect privacy, though.

This helps its Maps products by pinpointing where house number 84 or Bob's Butchers is on a long street.

EDIT:

Quote from Google:

We’re currently running an experiment in which characters from Street View images are appearing in CAPTCHAs. We often extract data such as street names and traffic signs from Street View imagery to improve Google Maps with useful information like business addresses and locations. Based on the data and results of these reCaptcha tests, we’ll determine if using imagery might also be an effective way to further refine our tools for fighting machine and bot-related abuse online.
Here are examples of numbers from Google Street View that appear in reCAPTCHA:

[Image: iuBjh.jpg]
 
^It's probably super fast to proofread everything that's been transcribed.
Microsoft Word is a good example (the tech they use to flag improper sentences).
 
Now whose job is it to actually check all the entries? :)
They don't... You have two words beside each other. Let's say Google knows the word on the left but not the one on the right, so they have 10 random people fill it in for them. If all 10 people enter the same thing, then they know it's correct and what it is. They can then have another 100 enter the text again just to confirm this, before using the word 100 times as the one they know is correct while a new word becomes the unknown. By repeating, randomizing, and analyzing the number of correct entries, you'll know what it is without anyone having to confirm or check it. Basically, it's we who check that others are correct. It's all just numbers and algorithms, which Google is probably one of the best at.

In theory the whole thing could be operated by a blind guy who can code.
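
For anyone who wants to see the voting idea from the post above in code, here is a rough Python sketch. It is only an illustration of the scheme as described in this thread, not Google's actual implementation: the function names `check_submission` and `consensus`, the threshold of 10 votes, and the requirement that everyone agrees are all assumptions taken from the "10 random people" example.

```python
from collections import Counter


def check_submission(known_word, control_answer, unknown_answer, votes):
    """Count a guess for the unknown word only if the known word was typed correctly.

    Hypothetical helper: `votes` is a Counter of guesses for one unknown word.
    Returns True if the user passed the check on the known (control) word.
    """
    if control_answer.strip().lower() != known_word.lower():
        return False  # failed the known word, so the guess is ignored
    votes[unknown_answer.strip().lower()] += 1
    return True


def consensus(votes, min_votes=10, min_agreement=1.0):
    """Return the agreed spelling, or None if there is no agreement yet.

    The defaults mirror the "all 10 people enter the same" example and are
    assumptions, not real reCAPTCHA parameters.
    """
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_agreement else None


# Ten users all pass the known word "street" and type the same unknown word.
votes = Counter()
for _ in range(10):
    check_submission("street", "street", "Butchers", votes)
agreed = consensus(votes)  # the unknown word can now be treated as known
```

Once `consensus` returns a word, that word could be promoted to a known word and paired with a new unknown one, which is the "repeating and randomizing" step. Lowering `min_agreement` would tolerate the occasional typo at the cost of a higher chance of accepting a wrong spelling.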
 