A Robotics Professor Explains How Humanoid Robots Learn to Dance and Navigate the World

Professor Aaron Ames answers internet questions about robotics, covering humanoid dancing, autonomous vehicles, robot sensors, and the challenges of automating tasks like folding clothes.

Robotics Professor Answers Robot Questions | Tech Support WIRED. | Transcript:

I'm Aaron Ames, I'm a professor of mechanical engineering. I'm here today to answer your questions from the internet. This is Robotics Support. Fruit Bone says, "I don't understand the benefits of these food delivery robots." You know, this is a job that could be automated, so let's have robots automate it. They obviously work in some scenarios, but they're not super robust. I think the bigger thing they're trying to solve is a proof-of-concept demonstration to see if we can solve things like automated package delivery, last-mile delivery, things like that. A lot of these are first steps towards trying to solve this more general

delivery problem, which is actually a really hard problem. And by the way, putting these robots on the road has taught us a lot about both what works and also a lot of what doesn't. If you've seen these delivery robots, they get stuck a lot, they can hit things, they can fall off the road. It does tell you how hard robotics is, and I think that's sort of something to keep in mind when we're talking about humanoids and all these other things: look at the robots that are actually in the field today and how they sort of still kind of fail quite a bit, and it tells you we've got a lot of work to do. And by doing that, at least on the robotics side, we will learn a lot. From Paradise Knights, "WTFs with

dancing robots. I thought they were supposed to be our slaves before they enslave us, at least for a bit. They're just partying." It's pretty good. Yeah, WTF on multiple levels. First and foremost, why are they having robots dance? First, there's amazing progress that's happened in humanoids. Among other things, backflipping, running. I mean, really, the behaviors we've achieved in the last year or two have been absolutely remarkable. We figured out a very nice pipeline in which to get robots to do this. And the way you do it is you start with a human doing those actions. The reason why the dancing looks so human-like is it's a human dancing. It's really just puppetry.

Very beautiful and advanced engineering puppetry, but puppetry nonetheless. So, a human puts on a mocap suit or uses cameras, and the human dances, not me cuz I'm a very bad dancer, but you know, a good dancing human. And then you take that data, and you get the trajectories from that data, that is, the motion of the human over time. And then you train a reinforcement learning algorithm on the humanoid robot that basically mimics or copies that human data as much as possible with the morphology of the robot, and the end result is it dances like the person that was dancing. And as long as everything is just the way you expected it to be, meaning the environment's like it was when the human

did the thing, we can get robots to do that now. But you asked about this, aren't they supposed to be helping us sort of before they party? And the answer to that is that we still don't know how to solve that problem. That's sort of the not very secret secret is that all of the hard problems getting robots to be truly autonomous and in our homes are still hard problems that are unsolved. The next question is from project guy 111. What percentage chance do you think we'll end up in a Terminator future? So I think there's two answers to this. One is, what's the chance that AI and learning will do bad things? And I think that probability's actually fairly high if we're not careful. And now's the time to be careful. If you trust AI to be the

decision maker, if you're not very careful about having guardrails for that AI, it will make bad decisions. I mean, if you've ever used ChatGPT or LLMs, you see that it can produce really nice answers sometimes and it's impressive, but then sometimes it's just wrong. So we cannot trust AI. In my opinion, you can never trust AI, but you can use AI as a powerful tool, just like if you search something online on Google: you get a lot of results back, it gives you a lot of information, but you have to verify and double-check. So I think if we put like AI in charge of our, you know, weapons or something silly like that, then, you know, bad things will happen. At the same time, the second part of Terminator

was it became sentient, right? It actually learned to think on its own. And I don't think we're anywhere near that. So I have no concern of sentient AI. Right now, AI is not intelligent. They say AI, artificial intelligence, but there's no actual intelligence. It has no notion of what it's saying or doing. It is simply pattern matching at a scale we've never pattern matched before. From I got too silly, what benefits do legged robots have over wheeled or tracked vehicles? Legs are inherently beneficial if you want to operate in environments that are built for humans and, more importantly, where things are not flat. So wheels are massively efficient

in how you can move around environments as long as there's not uneven terrain. If you've ever been in a wheelchair or wheeled someone around in a wheelchair, you realize very quickly how flat the world is not. Even what you perceive as flat, even in a city environment where there's sidewalks, you realize there's curbs that don't dip enough. There's big breaks in them. And all those little things for wheels become big problems. They become sort of sticking points. And legs have the inherent ability to walk more robustly over multiple terrain types. Quadrupeds being one example, I mean where your dog or your cat can go on four legs is pretty incredible. Bipeds of course are sort of the ultimate expression of

mobility in human environments cuz the environments are built for us. So if you need to get into a small space and get up a small set of stairs or something like that or climb up a ladder, only a biped can do that. Give a ladder to a quadruped, it's not going to know what to do. Give a ladder to a human, as long as they're reasonably healthy, they can climb up that ladder no problem. From Sammy 514, what are robot dogs actually being used for? So we've actually come a long ways in robot dogs, which are actually technically called quadrupeds cuz they have four legs. It's really amazing what's happened in the last sort of decade where we've gone from these robot dogs being in really lab environments and

research environments to being things that you can buy at insanely low prices. The hardware has made immense progress. I think the practical use cases are still a little thin. They're thinking about using them for things like inspecting buildings, sending robots ahead in disaster scenarios, right? And for that, legs are definitely better than wheels. The next question is from Livon21, is there any attempt to put ChatGPT inside a robot? Yeah, there's lots of attempts. So right now there's many humanoid robot makers that have that as a layer in their humanoid robots. And so, the way to think about this more generally is that as we're making robots do things, there's no one

thing that's going to make them do all things, right? So, imagine your body, you have a brain, you also have a spinal cord, you have proprioception. So, you sort of intrinsically have multiple computers running in your body at any given time. Like, your spinal cord is a computer in its own right. And those computers run different algorithms. So, at the highest level, at our cognitive level, that's where LLMs will sort of play a role if you'd like. So, you want to ask the robot a question and have it perceive something about the environment and make a decision. So, ask the robot, you know, "Where is the red apple on the table?" It can use an LLM to parse that language into code, use

computer vision then to detect all the apples on the table and return based on that an estimate of where it is. When you actually want to reach for the apple, an LLM is not going to reach for the apple, that's where traditional robot control will come in or where reinforcement learning will come in, which is a whole 'nother type of learning that's different from LLMs. And those are what's actually running on the robot, right? Those are what are making the robot go, just like your spinal cord is really what moves your body most of the time without you having to think about it. From I want to be free 10, "Are autonomous vehicles mobile robots?"

Yeah, I mean, if you want to ask what the most advanced robot is today, you would take autonomous cars and mobile warehouse robots like Amazon has. Autonomous cars, I think, are one of the prime examples of the farthest that we've pushed autonomy. So, yeah, it's absolutely a robot. It's an advanced robot that's a beautiful work of engineering. From V night persona, "How does the Roomba know what's what?" I think what they're asking is how does the Roomba have a semantic understanding of the environment, which is the way we would technically phrase this question. That is, being able to identify things around the

room that it sees and correctly identify them. So, basically, we've been able to take a bunch of training data, a bunch of examples of pictures on the internet, videos on the internet, and teach robots how to correctly identify those. By teach, what we mean is we can train up basically this big neural network that takes images in and produces what's in the images out. And as a result, we can now correctly identify most things in an image. And so, what Roomba does is simply it has a camera that's perceiving those things in the environment, it's checking with the internet based on these large models, and then identifying the things in the environment and using that information to tell you what's going on. The next

question is from Fireplace Air. Why make robots humanoid shaped when they could have six arms? Humanoid robots are the most suited to do the most things, even if they're not the best suited to do any given thing. Again, the world is built for us. We've built the world for something of our shape and our function. Doors, stairs, narrow corridors, all these things. And if you want a robot that can slot into any scenario where a human can go, which is pretty much most scenarios we care about, not all, but most, you want somebody to start making you sandwiches in your kitchen, or you want a robot that can be dropped into an existing factory and automate some tasks, right? I want to say all

that as a preface to the fact that humanoid robots are not the best form for, again, a given application. For example, right now, the largest owner of robots is Amazon. It has over a million. And as a result, one of the largest sectors of robotics is warehouse robots. Robots with two wheels that go under these pallets and then move them around warehouses. In particular, what Amazon does is have the robots go pick up a pallet with the thing you ordered from Amazon, and it moves that over to a human who's filling the order, so the human can just stand there, and they never have to move, and basically all of this stuff comes to them. Now, you could theoretically have a humanoid robot do

this, like push these big pallets around the warehouse, but the question would be why? Like these robots are very good at doing what they're meant to do, and that environment is designed to work synergistically with the robot. So, you could have six arms on a humanoid if you had an application that determined that six arms would make a big difference. Maybe you have a firefighting robot and you find that two arms is not enough and you want four more arms cuz you want to hold the hose and you want to hold a camera and you want to hold the fire extinguisher. That's the great thing about robots, by the way, is we can make them any way we want. A Reddit user asks, "When it comes to automation, how

close is Amazon to actually automating most, if not all, warehouse work with robots?" The answer is very, very close for some parts, and then other ones are further away. So, in terms of automation of warehouses, Amazon is by far the leader in this domain. Amazon Robotics, in particular, has been working on warehouse robots, specifically robots that move around warehouses to move pallets around, to move your order to a person filling the order for, you know, 20-plus years now. And they've really refined that process. So, they maximally and efficiently can move packages to the sorter and then the sorter packages them up. So, the question becomes what remains? And what remains is then the

part that the humans are currently doing, which is grabbing these things off the shelf and putting them in boxes. And so, this is much more open-ended work where they're testing out solutions right now. So, for that, you can use things called local manipulators, meaning you'll have robot arms, maybe on mobile bases, and maybe use suction cups or soft graspers, and you would pick up objects and put them in boxes. And you can do this with varying success levels right now, but you're not at the success level that would truly automate that, where you don't need a person present, for all objects. Cuz you imagine all the objects that go in Amazon boxes, all the different geometries and how they feel

and look. I mean, it's a very complicated problem to automatically and autonomously load all these objects. So, that problem is sort of open-ended. Kaleidoscope inside asks, "If you could commission a robot to be built with no financial barriers, what would your robot do?" So, if I had no financial constraints, I would try to sort of solve the exoskeleton problem. Like a billion could get it done. A billion in my mind could develop something that would eradicate the need for wheelchairs, essentially. A Reddit user asks, "Why do most of the four-legged robots, see Boston Dynamics, have their

knees look backwards?" It's the inverted leg morphology. There's potentially a mathematical advantage to having your legs like this. It's actually a twofold thing. It's not just that they're inverted, but that they're very light. So, it turns out the way you control robots and the way you get really stable locomotion behavior on robots is by having very light legs and having most of your mass centered in one spot. There's lots of mathematical reasons for that, which leverage that as sort of an assumption. And what that means is as you're walking, if your legs are very light, you can place them very quickly to catch yourself as you fall. A lot of

the robots were designed based on that morphology. And in addition, the inverted leg is actually what birds have cuz birds actually have this type of morphology where they have very heavy big bodies and very light legs. And as a result, birds have some of the most robust locomotion out there. They can actually step down huge holes and then step up out of the hole without ever changing how their main body is moving. And so, if you can get that kind of behavior on robots, they'll be very robust and able to go all the places where you want to take robots with legs. A Reddit user asks, "What is so special about the Mars rover Curiosity?" I mean, what's not special about a Mars rover?

Pretty much everything. It was on Mars. I mean, the reality is that these robots are incredible engineering feats. First, just getting it to Mars is special. And then the rover itself, all the rovers that have gone, Curiosity and all of them, they have these amazing engineering sort of gems in them that they built up. You know, their suspension system is specifically built so that it can handle rough terrain. Their wheels are specifically built so they'll be robust to going over rocks and things like that and you won't blow out a tire, right? So, the entire structure is built to maximally be robust to this really harsh environment.

I mean, there's the electronics which have to deal with this very sandy and wind-swept environment where there's dust storms, you know, there's power limitations. Just the power alone, I mean, the sun doesn't come through at the same intensity, so you have to have good solar panels that actually charge the electronics. The electronics have to be battery efficient, right? And then after all of that, you have to do science with it. And then, oh, of course, you have to talk back with Earth at the same time and make sure you don't lose that signal. I mean, the list goes on and on, but to build a system like that, I mean, every one of the rovers we've sent to Mars is an immense and

remarkable engineering feat that sort of just makes me happy as an engineer. The next question is from F. Krzuski 34111. Wait, how do autonomous cars even work? Like, are we trusting robots with our lives now? Not sure about this. It turns out that autonomous cars have gone through some things. About a decade ago, when it seemed to be close, there were all these startup companies that formed. And they all tried to do autonomous cars many different ways. And then exactly what you're worried about happened. There were crashes, people got hurt. Despite the fact that they'd

already been working on making the system safe. But they really realized it, and I think the entire autonomous car industry pushed more and more to make sure there were really rigorous safety methods. So, both in the algorithms to keep them safe, and then also protocols around that, software architectures, etc. And then data to support it. And so, in that way, there's a lot of evidence to show that autonomous cars, if they're currently deployed, have very rigorous safety standards, and there's a lot of data to back that up with the limited number of crashes that they've had. I'm not saying they can't get better, but, you know, after a decade, autonomous cars have done the hard work. A decade ago, we could have autonomous cars drive around

on the street, but the difference between driving around on the street for, you know, a demo and actually driving around and all of a sudden other cars pull out, or, you know, a ball bounces into the street and a kid chases after it, right? Those corner cases kill, and safety kills. Meaning, if you're not safe, you die, both as a company, but also as a technology. The next question is from Time is Grand. Noob question, why does Elon Musk think lidar is not a good idea? I don't know Elon Musk's mindset, but in my mind lidar is awesome. So, you have really two main sensing modalities in robots, autonomous cars, etc. I mean, there's many more, but in terms of

perceiving the environment, the two most popular things are using cameras, of course, which is what Tesla does, and you have lidar. Now, lidar sends out a bunch of laser pulses, and those bounce back and tell you what's around you. So, it's sort of 3D radar, but now it's using lasers, lidar. Cameras, of course, have a lot of information present. They can tell you not just how far objects are if you can properly estimate distance, but what objects there are and where they're at in the environment. So, you can get semantic understanding of the environment and things like that. Lidar does an incredible job of precisely identifying everything in the environment with like a full 360 view,

so you know exactly where all the objects are, but you have no idea really what the objects are. Lidar is super effective on robots and on autonomous cars, because what you want to do is make sure you're not hitting anything. And that means anything. It doesn't matter if you're sort of hitting another car or hitting, you know, the guardrails on the side of the road. You don't want to hit any of that stuff. And for that, lidar can work really quickly and really robustly to do things like dynamic collision avoidance. Cameras, again, give you this semantic understanding. And so, I imagine that what they're thinking at Tesla, generally speaking, is if they can solve the camera problem

and make them as good as lidar, then this will extend to a lot of other application domains. But I think what I found on robots is you want to use every sensor you can get your hands on. And so, for that, for safety-critical applications, which Tesla cars could still improve on, to be honest, you want lidar in the loop, so you can quickly and rapidly respond to dynamic changes in the environment and not have to deal with the latency that's present in a lot of perception-based representations. We've got a Reddit user who asks, "Why is a surgical robot better than a surgeon?" The reality is that robots are really good at certain things. They're very, very good at precise, small

motions and doing those precise motions repeatedly again and again. There's also places where humans are infinitely better than robots. Basically, any place where you have to interact with the environment in a soft and tactile way. And so, here is where actually surgical robots get complicated because our skin is soft, our organs are soft, our bodies are soft. And so, as this robot is moving through your body to perform surgery, you need to be aware of those soft and sort of compliant interactions with the human body. And that's why a surgeon is in the mix. The surgical robot can do those precise

motions while the surgeon operates the robot and they get all this haptic and tactile feedback while they're doing the robotic surgery so they can take advantage of the precision of the robot while still being able to do what humans do really well, which is operating in these complex environments. This is from the Ballister Round. Why is it so difficult to make a clothes folding robot? The problem with clothes is that clothes are really difficult to model. They move around, there's fabric. I mean, it's very difficult to handle. So, a robot has to interact with something in the environment that we can't put in a computer easily. And so, that's why this is a big challenge and why there's

been a big push to use things like machine learning and AI to try to understand this clothes folding problem. That is, you train a robot with humans folding clothes, like you teleoperate the robot so that it folds the clothes a bunch of times and then you try to teach the robot that task without it actually having a model of the environment itself, but just these reference trajectories that humans generated through teleoperation. The next question is from Surprise News 763. Whatever happened to the Neo robot? It's interesting because there was this big announcement that you could buy like a humanoid robot as soon as this year, in fact, and have it in your home. Now, to give them credit, they completely were

transparent about the fact that this robot couldn't actually do a lot without teleoperation. And even in the app, they show you how you will book time on the robot with a teleoperator. So, this is interesting from a couple of perspectives. One is it's at least a tacit acknowledgement that humanoid robots are not actually ready to be deployed in your homes. I mean, that's the clear and obvious ramification. So, the question is why are they going to put a robot in the home that's not actually ready? I mean, it can't do a lot of tasks on its own, very few. The reason why is right now the idea in the robotics industry is that the only thing that's missing is data. So, we have internet scale data

and that's what made LLMs possible. So, what they're betting on is if we can get sort of internet scale humanoid robot data, we will be able to solve all these problems; just like ChatGPT solved the problems for language, we'll be able to solve them in this other way. So, the idea with the Neo robot is not so much that they think they're really going to sell all of these robots and make money off them by having them teleoperated, but rather this is an amazing way to do data collection. You put all these robots in homes and then people agree to let you teleoperate them in their home, they collect all this data and then they're going to use this data to train a large model to determine how to make the robot

do these tasks automatically. If you can get early adopters to sign up for that because it's cool, you'll be able to generate enough data to actually train large models to learn how to do these tasks. But one final thing I want to say is that I don't think it's going to work. I don't think that there's enough data you can collect in this way to solve this problem with just human data. People tend to confuse LLMs and text with robots, which are fundamentally different things. In particular, consider the amount of data needed to understand how a robot moves, these trajectories. Think about trajectories as the language of robots: they include position, velocity, force, all this rich dynamic information, but they're so variable cuz

there's so many degrees of freedom that it makes language seem simple. And in fact, the way to really understand this is look at evolution. We developed language in a very small fraction of our total evolution. Most of the time we spent evolving was to do these other basic things, right? Things that require motion of our body. That's why they say you only use 10% of your brain. What they mean by that is that a lot of your brain is being used for these very complicated motions that we do, right? And so, the point of all this is that I don't think that data's going to work. I think that what we're missing is not more data. Not that it can't be helpful or help to polish things, but we need to understand

physics as well. You need to merge physics with human data, and that's the only way you're going to solve the general intelligence problem on humanoid robots because robots are not language. That's it. That's all the questions. Hope you learned something along the way. Thanks.
