Here in Google's Mountain View Labs, the company's building a life-size AI persona that can see you and talk to you. From what we were chatting about, you're the first journalist to test out this preview demo with me. Pretty groundbreaking stuff. We've seen the hardware before. This is Google Beam, the $25,000 teleconferencing system that makes it feel like the other person is in the same room. This was my first time trying it and the first time it's publicly been at Google I/O. And I was blown away. You are definitely three dimensions. Wow. You're totally in this space enclosed by this uh this loop of fabric in front of me. But what if it wasn't a real person on the other end? That's where Sophie comes in.
Google wanted to create an AI agent that isn't just text on your screen or a voice in your ear. It's something you can walk up and talk to, too. What am I holding up right now? Ah, I see. Now, in your left hand, you have what looks like a transparent phone case with a magnetic ring built in, probably for MagSafe. And in your right, a small circular accessory. Is that a magnetic ring stand or a grip that attaches to it? Yeah, that's right. It's an experiment, but the company imagines it could appear in workplaces, shops, and schools. I'll be honest, it's all over the place right now. It does Google things fairly well, pulling up maps, speaking in multiple
languages. Is its Japanese any good? You tell me. Can you tell me in Japanese? It's going to get as busy as places like Tokyo. Because it can see its surroundings, it can answer questions about a book, even recognize a sign language sign or two. Here's another hand sign. Tell me about this one. Oh, I know that one. You're making the sign for I love you. Combining the letters I, L, and Y in American Sign Language. But here's a genuine question. How natural does this seem to you? That's right. You're Sean Hollister from The Verge. You're also wearing a dark blue shirt today and sitting at that table with a paper in front of you.
Can you put me in the story? Absolutely. You're in. Just as Finley began to pull the lever, a friendly human appeared. A clever tinkerer named Sean, who had been maintaining the magical tower for years. Sean, who had short dark hair and a beard and was wearing a dark blue shirt, smiled playfully. Personally, I would say it feels a lot like talking into a person, especially because it's currently programmed into the same rote patterns as any other chatbot. It takes long pauses, runs over itself occasionally. It always sounds fake excited about whatever I ask, and it's so strange that Sophie will always change its accent on the fly.
I've just sent a picture of Finley entering that query. Oh, I can definitely keep going with the story. What do you think would be a useful feature? We could always refine the prompt to show it. And it always delivers the same length of answer before prompting me with the next question. What's your favorite part about reporting for them? I learn something new every day. That's an exciting way to be. I feel the same way. Google says the curt responses are because it's supposed to interact with I/O attendees in short bursts and the accents may be model drift over time.
Perhaps most importantly though, Google isn't showing Sophie in 3D quite yet. I could have a totally different opinion about how real it feels when it's not just a face on a screen, but it's projected into the room. That real-enough-to-touch 3D is the best part of Google Beam. And as part of our behind the scenes, we got a little more insight into how it actually works. The first Google Beam product, the HP Dimension, is an 8K, 65-inch screen that gives you 4K resolution per eye. But it's more interesting than that. I've always wondered how a flat screen can beam light to two of your different eyes. And product manager Andrew Nartker broke it down for us.
Very similar principles. If you've played Nintendo 3DS, like all of us back in the day, this is the best Nintendo 3DS you've ever seen. 3DS used a parallax barrier. Um, we use kind of an active light field display which does have a lenticular lens around. We're changing what's on the screen so that there's always two images on the screen. One for your left eye, one for your right eye. And then we're putting them on different parts of the screen, different pixels behind different lenses. So that from any of the positions in the room here, you're always seeing those two images as you should to see 3D.
And in this case, since we have a tracker on this camera, we have this uh this off-the-shelf tracker here, we can kind of show you. We can kind of show the viewer what the 3D looks like by moving back and forth and having the pixels show the right content through those lenses to this camera rather than this one. The wild thing is how you're not actually seeing video of the other person. This is all being recreated in 3D. Six real videos are turned into a digital reproduction at such high fidelity it arguably crosses the uncanny valley. It makes you feel like it's okay that this is a digital person, a digital recreation instead of pure video. Most of this is rendered on Google servers.
Only a portion of it happens on the device itself. You don't get that 3D with the AI agent, though, nor do you get it with a new experimental group chat feature that Google's showing off, which allows you to finally bring laptops and phones into Beam teleconferencing, too. They're not committing to bring that out yet, but they're working with both Google Meet and Zoom. I want to see what good AI can look like with this 3D presence. I want to see if it's anything like my kid dreams of the Star Trek holodeck. Can you give me uh directions? Sure thing, Brian. I can definitely pull that up for you. Where are you headed? And I'm Sean, by the way. I'm Sean.