Ahead of Its Time podcast by Setapp: Facial recognition

Welcome to the eighth episode of Ahead of Its Time, an original podcast from Setapp about the tech underdogs no one realized would shape the future.

Facial recognition almost feels like a hot trend that emerged in the past few years. Little did we know that it was invented by one Woody Bledsoe back in the 1960s! In this episode, Shaun Raviv unveils the truth about Bledsoe’s work and whatever it had to do with the CIA.

Then Karthik Kannan shares his experience of building on top of facial recognition technology to help visually impaired people navigate the world. Using Google Glass, his company Envision created glasses that can recognize faces and provide visually impaired users with all kinds of insights about their visual environments.

Show notes:

Shaun Raviv’s story about Woody in WIRED

Video: Karthik and users demoing Envision Glasses on CNET

Transcript:

Julia Furlan (00:04):

There are countless individual quirks and characteristics that make you different from the other 7.9 billion people on Planet Earth, but the most visible difference, the thing most intimately connected to your individual identity is without a doubt, your face. We intuitively know that everyone's face is unique, but what's harder to understand and to explain is exactly how one face is different from the next.

Karthik Kannan (00:34):

Faces are different, but they're also the same, right? They're unique and the same at the same time.

Julia Furlan (00:40):

That's Karthik Kannan. He spent the last several years measuring and quantifying the human face to help make the world more accessible. In 2017, Karthik co-founded a facial recognition and computer vision startup called Envision.

Julia Furlan (00:52):

They make AI assisted software and glasses that help the blind and visually impaired navigate the world. During his journey to digitally replicate the invisible computations our eyes and brain perform countless times a day, Karthik was amazed to learn just how much information we subtly communicate with our facial expressions.

Karthik Kannan (01:12):

You look someone in the eye and you can really know what their true intentions are when they're talking to you. You know if they're lying to you or if they're happy or they're sad. There's a tell in the face. And that nuance is something, I think, till date AI struggles to really capture.

Julia Furlan (01:27):

But AI is already pretty good at identifying different images and patterns. And the glasses he developed can recognize objects, read text, and provide a voice description of what's happening around the person who's wearing them. Karthik won't soon forget the day they first came to life.

Karthik Kannan (01:43):

We were just sitting in this room and I'm putting on the glasses for the first time. And it just starts speaking out text that was in front of it. And then it was capturing all that information and then speaking it out to me. It was insane.

Julia Furlan (01:55):

And recent advances are helping these glasses do more than recognize objects and read text. Karthik has programmed them to recognize and describe human faces. This has huge implications for visually impaired people who, using this technology, can live a more independent and socially connected life.

Julia Furlan (02:14):

These remarkable glasses are part of a growing, steady torrent of innovation in facial recognition technology, which, incredibly enough, dates back more than half a century to a relatively unknown mathematician and computer scientist named Woody Bledsoe.

Karthik Kannan (02:29):

I was reading this article about Woody Bledsoe. He said, "I could see it or a part of it in a small camera that would fit on my glasses that would whisper into my ear the names of my friends and acquaintances as I met them on the street. For you see, my computer friend had the ability to recognize faces." And it's even more eerie that 60 years later that I'm just sitting and reading this paragraph and I see it exactly fit with what we're doing at Envision.

Julia Furlan (03:06):

I'm Julia Furlan and this is Ahead of Its Time, an original podcast from Setapp, a show about the tech underdogs no one realized would shape the future. Setapp's versatile app subscription service empowers you to step into a new era of productivity. Karthik first learned about Woody Bledsoe years after starting Envision, and in all likelihood, he would probably still have no idea who Woody Bledsoe was or what he did if it weren't for a man named Shaun Raviv.

Julia Furlan (03:43):

One morning, a few years ago, Shaun Raviv woke up and made a beeline for his computer. The Atlanta based journalist had an idea for a short story about facial recognition, which came to him in a dream. He got started by researching the history of facial recognition. And what he would soon discover was more interesting than any piece of fiction he could have dreamt up.

Shaun Raviv (04:04):

And I started Googling and I found some very, very weird reference on a really strange website to Woody Bledsoe as being the founder of facial recognition. But there was very little out there. There's almost nothing to read about him that was interesting on the web. Basically nothing at all.

Julia Furlan (04:22):

Soon after he started his research, Shaun was in a coffee shop when his phone rang. The voice said...

Speaker 1 (04:27):

Hi. I'm Woody Bledsoe's son. You sent me an email?

Julia Furlan (04:30):

As Shaun stepped out to take the call, Woody's son began to tell him an intriguing story. Early one summer morning in 1995, Woody's son stopped by his dad's home in Austin, Texas for a visit. When he arrived, he saw Woody was sitting in his garage, door open, waiting for him. At this point in his life, Woody was extremely sick, his body ravaged by a degenerative nerve disease called ALS. Woody's mind, however, was still sharp, but the disease had robbed him of his speech. So he needed to communicate using a small whiteboard.

Shaun Raviv (05:06):

He walked his son to a safe in his garage and he wrote down the combination for his son. He couldn't believe his dad still remembered it. He opened it and there was a bunch of old rotting papers, essentially. He never knew what was in there. And his dad said, pull them out a bunch at a time. And then he started looking through them. Then he would hand the papers back to his son.

Shaun Raviv (05:25):

He felt a lot like Indiana Jones, searching through some archive with lost papers. His son sort of looked at some of the papers. He saw the stamps that said classified on them, but he didn't exactly know what was in those papers because his dad didn't want him to read them. He told him to pull out a metal garbage can. He started putting papers in there and he lit it all on fire.

Julia Furlan (05:49):

After hearing his story, Shaun realized he needed to do some serious digging. He knew Woody spent most of his career teaching at the University of Texas. So he thought he'd start there. Soon he discovered a vague list of papers under Woody's name stored in the university's archives. So Shaun decided to send away for a few documents. What he saw when they arrived floored him.

Shaun Raviv (06:12):

It was amazing. It had all these amazing imagery of people's faces marked up with tons of mathematical equations that I couldn't even begin to understand. But there was also hundreds, if not thousands of photos in these papers of people with marked up faces, people with different lights and shadows turning their faces. And a lot of these photos were in black and white. Some of them were in color and they were just beautiful. They could be art.

Julia Furlan (06:39):

Shaun's next step was to track down former students and colleagues who knew Woody back in the early '60s. This is where a deeper story began to emerge. A story about Woody's secret career in the years before he joined the University of Texas.

Shaun Raviv (06:54):

He met a man named Iben Browning. A brilliant polymath who was just good at all sorts of sciences. Had all these crazy inventions. And they started working on character recognition together or automated pattern recognition. Trying to get computers to recognize patterns. They're working with letters specifically, but it could work for any written pattern or typed pattern. It was really, really advanced. And they were successfully able to get a computer to recognize letters.

Julia Furlan (07:22):

They understood the significance of the breakthrough right away. So together in 1960, they started a company called Panoramic Research in Palo Alto, California long before it was the tech hub that it is today. Panoramic set out to move the world and explore ideas other companies found too silly. They worked on inventions like a robotic lawnmower, a dog powered vehicle, and a pen that could translate light into sound.

Shaun Raviv (07:50):

So they were this group of crazy people. They were just trying to explore everything. And amongst those things were working on their pattern recognition stuff and they realized it doesn't have to be just letters. It doesn't have to be numbers or shapes. It could also be faces. He was just really big on artificial intelligence.

Shaun Raviv (08:07):

He gave a speech once and he talked about these visions that he had of computers that could do what we do. That could look at a person and tell you who they are, of people wearing Google Glass type glasses. And he just had these really advanced visions back then.

Julia Furlan (08:25):

The company struggled right out of the gate. Pitch after pitch to big name corporations was met with a steady stream of rejection. So without corporate contracts, Woody had to find another source of revenue to sustain his company. And this is where the plot thickens.

Shaun Raviv (08:43):

So I first had this vague notion that Woody did work for the CIA in one of these biographies that one of his friends wrote. But I did a lot of digging. I dug through 39 boxes at the University of Texas in Woody's archives. And so he apparently did not burn every reference to his CIA work. Digging through all of them, it was just really clear that some of these companies that he was getting hired to work for were CIA front companies paying Woody and Panoramic to do facial recognition research.

Julia Furlan (09:11):

Woody started by trying to get a computer to recognize 10 faces. So he input the photos of 10 people into the computer and then input another 10 pictures of the same people to see if the computer could match them. Woody quickly realized just how complex and elusive this technology would be.

Shaun Raviv (09:34):

There was sort of too much variability in a face and in a photo of a face. If you think about it, you can look at a picture of the same person, two different photos, and they look pretty similar to us because we're human. We can see that they have the same nose, the same shape mouth, the same eyes, same hair, but a computer just can't instinctually do that.

Shaun Raviv (09:52):

They have to get through things like the lighting in the photo, shadows in the photo, the way the face is turned, the emotions of the face. If you're angry and if you're happy, you just look like a different person to a computer. You look like a different thing, a different shape. But it was a huge failure. They weren't able to do it.

Julia Furlan (10:07):

The computers of the time just weren't powerful enough for the task. Still, Woody asked the CIA for money to continue his work. And the CIA said yes. That's when Woody changed tactics. For his first attempt, he tried to make the process completely automated. This time, he would take what he called a man machine approach, which would give the computer some human assistance. The team began mapping coordinates for different facial features in each photo, including the eyes, nose, and eyebrows.

Shaun Raviv (10:37):

They used this to try and recognize at first, I think about 50 faces and it worked pretty well. They tried to cross match a photo of Woody when he was younger from 1945 to a photo of Woody when he was older in 1965, but he looked totally different in those photos except to a human.

Shaun Raviv (10:53):

He had lost so much hair, his face and jaws looked different and the computer just couldn't recognize him. And so overall, the second attempt was much more successful than the first, but also still a failure. The computer couldn't do what Woody wanted it to do.

Julia Furlan (11:10):

By then, Woody had good reason to look older. His funding was drying up and the stress of his work while trying to support his family left him emotionally and financially drained. In 1966, Woody left Panoramic and took a job at the University of Texas. Shortly after, Panoramic went out of business.

Julia Furlan (11:30):

A year later, while living in Austin, Woody got one more chance to work on facial recognition technology. He was asked to develop a computer system that would allow law enforcement to match mug shots with photos of potential criminals. Woody went to work and this time he gave himself a difficult goal. He wanted his software to match faces faster than a person doing it manually could.

Shaun Raviv (11:53):

So the fastest human took six hours to finish the task. But the computer, which was called a CDC 3800, completed the same task in about three minutes. So it was a 100-fold reduction in time. So the humans were actually better at coping with head rotation and the bad photographic quality of some of the photos, but the computer was really, really good, much better at seeing the difference between people who had aged. But it was really the greatest success of Woody's career. And it was the last time he ever did an official project on facial recognition.

Julia Furlan (12:28):

By 1970, the secret nature of Woody's facial recognition work came back to haunt him when a paper about facial recognition technology was released by another researcher. This new research was celebrated by Scientific American Magazine as the most cutting edge facial recognition technology of the time.

Julia Furlan (12:44):

But as far as Woody could tell, it was years behind what he had accomplished at Panoramic. He said he was frustrated that a "second rate study" would be seen as the best facial recognition system available. Unfortunately for Woody, because of the top secret nature of his research, he couldn't tell people about his work and he never got the credit he deserved.

Shaun Raviv (13:04):

None of his facial recognition papers were published. And that alone is probably proof that he was working for someone that didn't want them published, but they were CIA funded papers. And so they just sort of got forgotten. They were disappeared.

Shaun Raviv (13:17):

I was able to find drafts of them in his boxes in the University of Texas. I guess, he was too proud of them to just destroy all of them. And then his work in facial recognition was completely forgotten by the time it sort of picked up and became a really important part of society like it is today.

Julia Furlan (13:36):

When Woody died in 1995, tributes poured in. Friends and colleagues praised his work in mathematics and his work in automated reasoning. Not one of them mentioned his groundbreaking work at Panoramic, which is the basis for much of the facial recognition tech we have today.

Julia Furlan (13:52):

It would be another 25 years before Shaun Raviv's story in Wired Magazine would spread awareness of Woody's work and what it means to today's computer vision and facial recognition innovators. For Karthik Kannan, Woody's story was an eye opener.

Karthik Kannan (14:07):

Yeah, I think it happens a lot with AI pioneers where they're working on technology like this, but it's not something that they end up getting credit for because they're just ahead of their time. When I was reading the article about Woody and the work that he's done is sort of the work that everyone else is building on top of right now. That's just how science evolves, I think.

Julia Furlan (14:29):

Until recently, Woody Bledsoe was an invisible pioneer in what became a massive industry. But thanks to Shaun Raviv, his legacy is now out in the open for all to see. And Karthik's Envision glasses are the latest chapter in that legacy. The software works in a new generation of Google Glass, which has a tiny camera lens embedded in the front of the frames.

Karthik Kannan (14:53):

Now, these are the glasses, right? So I'm wearing them. And then I hope that you can hear stuff, right? So now I'm just going to go ahead and swipe on the glasses and it just speaks out all the different categories.

Speaker 2 (15:06):

Read.

Karthik Kannan (15:06):

There is read, for example, where you can read text.

Speaker 2 (15:09):

Find.

Karthik Kannan (15:09):

So it's got find. So you've got find people, find objects within the find category.

Speaker 2 (15:14):

Identify.

Karthik Kannan (15:15):

There's also identify, which is basically used to identify different types of objects around you. It's just got a lot of general identification functions.

Speaker 2 (15:23):

Describe scene.

Karthik Kannan (15:24):

So I'm at describe scene. And then I just do a two finger double tap. So you can hear it processing.

Speaker 2 (15:34):

A laptop on a table.

Karthik Kannan (15:35):

It said a laptop on a table.

Julia Furlan (15:40):

Envision glasses are effectively an AI assistant for the blind and visually impaired. They tell the user what's nearby, give visual information about an object and the user's environment. And of course, they recognize faces. The venture was born in 2015 when Karthik was invited to speak about software development at a school for the blind in India.

Julia Furlan (16:00):

He brought along a friend who's also named Karthik. Karthik Mahadevan. To keep things clear, I'm going to call his friend Mahadevan. They were asked about career paths within their vocations and what problems they could solve with their work.

Karthik Kannan (16:12):

And just as a fun experiment, we decided to throw the question back at the kids and we asked them, what kind of problems would they like to solve? Some people said the textbooks that they have or generally the material that they come across in their life, it's becoming really difficult for them to read through that stuff. They want to be able to solve it in some way using technology.

Karthik Kannan (16:33):

Some people said that they want to be able to just navigate their environments more independently, know if people are coming towards them or know if there is something in front of them. It was a very eye opening conversation. For some reason, that talk really stayed with us. And we felt that there has to be something that we could do. And we just started to talk about random ideas.

Julia Furlan (16:59):

During the bus ride home, Karthik and Mahadevan began brainstorming. AI was hitting its stride. New applications were starting to be as good as humans at more and more tasks, including the ability to identify faces. Soon after, Mahadevan headed to the Netherlands to start his masters. When he arrived, he needed a topic for his thesis.

Julia Furlan (17:20):

He thought back to the talk he and Karthik gave at the school for the visually impaired. And after a bit of deliberation decided to do his thesis on how AI, computer vision, and facial recognition could be used to help the blind. Karthik was immediately intrigued by his friend's idea.

Julia Furlan (17:38):

So the two decided to team up with Mahadevan on research and Karthik on building the app, an app which would serve two distinct functions. It would read text and it would recognize objects and faces. Karthik's first day working on the project was spent at a Starbucks in Bangalore. From morning until late at night, he read studies and papers on facial recognition.

Karthik Kannan (18:01):

It was the first time I remember that I actually took my time to study a human face. I've seen a human face and I've lived with it all my life, but it's the first time I'm noticing, so the distance between the eyebrows, for example, or the length of the nose, or someone has a pursed lip. And I still remember really zooming into my face and I wrote this little code that put all these red dots over my face to really understand the relationships between different things.

Julia Furlan (18:33):

After a few weeks, Karthik had built a crude prototype. It was a white screen with a blue button. Press the button and it would take a picture with an automated voice describing what it saw. At the same time in the Netherlands, Mahadevan interviewed people with visual impairments, asking them to test the prototype and provide feedback.

Julia Furlan (18:52):

A member of his thesis group shared it with a few people and those people quickly shared it with others. Soon, the app had hundreds of beta testers. Suddenly, there was a torrent of feedback allowing Karthik to quickly refine and improve the prototype. Eventually, Mahadevan reached the end of his thesis. So the two reluctantly decided that it was time to end the project.

Karthik Kannan (19:14):

I still remember sitting down and just composing a very quick email saying, hey, thanks a lot for being part of this journey. It's been super amazing. And we hit send. And wake up the next morning and then I literally find my inbox which is 200 replies and it was just popping. It was like, pong, pong, pong. It was like I could hear the notification on my phone all through morning.

Karthik Kannan (19:35):

And every one of them actually took out the time to write some really thoughtful stuff. I remember this one particularly old user who was like, sometimes my relatives meet up and then my grandkids also come. And so what happens is the mother shares pictures of all the different grandkids.

Karthik Kannan (19:51):

And she would tell me that every time she shares a photo on WhatsApp, she just takes the image from WhatsApp, puts it into the Envision app and she knows who's in the picture. And the Envision app also gives her a caption, looks like so and so looking at the camera and smiling, right?

Karthik Kannan (20:05):

And so she's like, all of a sudden I'm able to actually be part of the conversation. And there were stories like this and it almost felt very difficult for us to shut it down, right? I think there was one or two people who actually put the idea of us doing this full time and starting this as a company.

Julia Furlan (20:27):

Karthik decided to leave India to continue working on Envision with Mahadevan in the Netherlands. There, they ramped up their work. After countless hours researching, talking to users, refining, and improving, they were finally ready. In late 2018, they released the Envision app. Two days later and with no promotion whatsoever, they had 4,000 users. Then came an email from Google.

Julia Furlan (20:53):

It said they'd been nominated for a Google Play award. And the nominees were invited to attend a big conference in San Francisco where the winner would be announced. Envision was up against some pretty steep competition, including another facial recognition platform that had over 200,000 users. So Karthik never really gave much thought to the award itself. He was more interested in meeting some tech entrepreneurs, seeing some sights, and enjoying California.

Karthik Kannan (21:18):

I still remember I was not that well dressed for the occasion. I was probably the worst dressed person in that room. I remember I was just standing there. I was live streaming the event to the other Karthik back here in the Netherlands. I was on a video call with him. I was putting it on and I was just showing him, see, Envision's logo is coming up there.

Karthik Kannan (21:36):

And the next thing they say is the Google Play award, the winner is Envision. And I start to freak out because then I'm like, wait did we really win it? And then I could hear all these expletives coming from the other side of the video call. And then I cut the call. I walked up the stage.

Julia Furlan (21:51):

The contacts Karthik made at the conference paid off. When Google announced they would release a new edition of Google Glass in 2019, Karthik persuaded them to ship him a pair before they hit the market. Incorporating their facial recognition software into a wearable was the next logical evolution for the company.

Julia Furlan (22:11):

And it turns out Google Glass was exactly the hardware Karthik and his team needed to take their next step. Today, between the app and the glasses, Envision is making the world more accessible for 40,000 people who use their technology. It's helping each of them live a more independent life and experience the world in a way Woody Bledsoe hoped might one day be possible.

Julia Furlan (22:42):

Since their ominous beginnings as a secret CIA funded project, computer vision and facial recognition have inspired controversy. There's really no shortage of problematic applications that bring up important questions around things like privacy and surveillance. And that won't go away anytime soon. But Shaun Raviv reminds us that no technology is inherently good or bad. It's what we do with it that really matters.

Shaun Raviv (23:05):

Facial recognition technology is, without a doubt, one of the most frightening technologies on earth. It can also be a really useful technology and has been. And my hope is that it's only used for good things in the future. And things like helping the blind see is one of those positive things.

Julia Furlan (23:21):

For Karthik Kannan, the ultimate goal is the same as it was when Woody Bledsoe first started working on facial recognition back in the '60s, to one day build something that can recognize the subtleties of the human face with the ease and speed of our own mind. And Karthik believes that when that day arrives, the people who will benefit the most are the people he's already helping today.

Karthik Kannan (23:45):

We're just slowly scratching the surface of it. We realize that 90% of the information is visual and there's so many visual relationships that the brain is able to form in a fraction of a second. And that visual fidelity is what we eventually want to capture with AI. So AI, in that sense, is the perfect fit for helping visually impaired people because AI is a tool that doesn't expect the world around it to change. It takes the world as it is. And it really tries to interpret it in a way that can help visually impaired people.

Julia Furlan (24:18):

I'm Julia Furlan and this is Ahead of Its Time, an original podcast from Setapp. Working on your next big thing? Setapp's productivity toolkit will help you stay focused and get stuff done. Head over to setapp.com to see if Setapp can help you bring your ideas to life.