How Google Maps Finds the Shortest Route in Seconds: The Algorithm Explained

How Google Maps Finds the Shortest Route in Seconds: The Algorithm Explained

Google Maps can calculate the shortest route across millions of intersections in seconds using algorithms like Dijkstra's and A-star, combined with optimization techniques such as contraction hierarchies. This video explains how these algorithms work, their origins, and how they enable fast navigation in real-world applications.

How Does Google Maps Actually Work?. | Transcript:

- [Derek] Say you wanna drive from New York to San Francisco. How could you calculate the absolute shortest path? There are over 64 million intersections in the North American road system, so a simplified estimate gives around 10 to the 220 possible routes. Even if you could check a billion routes per second, testing all of them would still take over 10 to the 200 years. And this isn't just a problem for navigation software. Any independent system like a robot or a video game character that needs to navigate a maze, or obstacle course faces this same problem. And yet, Google or Apple Maps will find you the route in only four seconds. So how is that possible?

Well, we teamed up with the channel 2swap, who creates incredible simulations and animations to show you step by step, how these algorithms work. They trace their origins back to a shopping trip in Amsterdam in 1956. (jaunty music) Edsger Dijkstra needed a problem. He was working on software for ARMAC, an early computer developed at the mathematical center in the Netherlands. And it was almost ready to debut. But at the time, the general public considered computers to be practically useless. So useless, in fact, that Dijkstra ran into issues filing his marriage license.

- [Casper's Dijkstra Impression] Dutch marriage rights require you to state your profession, and I said that I was a programmer. But the municipal authorities of the town of Amsterdam didn't accept it on the ground that there was no such profession. And believe it or not, that under the heading "profession," my marriage act shows the ridiculous entry, "theoretical physicist." - [Derek] And so Dijkstra was looking for a specific demo to prove just how powerful ARMAC was, even to the layperson. - ['Dijkstra'] The problem of course, is that for a demonstration for non-computing people, you have to have a problem statement

that non-mathematicians can understand. And they have to understand the answer. - [Derek] And during that shopping trip, he thought of the perfect problem. - ['Dijkstra'] What's the shortest way to travel from Rotterdam to Groningen? - [Derek] In other words, a shortest path algorithm. Here's a simplified graph of the Netherlands. The nodes represent the cities, and the edges connecting them are the roads.

The simplest way to find the shortest path from Rotterdam to Groningen looks like this. Starting from Rotterdam, we're first going to explore all its neighbors looking for Groningen. Since Groningen isn't one of these neighboring nodes, we'll mark Rotterdam as explored, and keep searching. So now we'll branch out from these four nodes to explore all of their neighbors. And we keep branching out this way, exploring all neighboring nodes, until we stumble upon Groningen. This algorithm is known as Breadth-First Search. It will always find the shortest path because it checks all nodes one step away, and then two steps away, and so on. So if there were a path from Rotterdam to Groningen

in four steps, we would've found it on the fourth iteration, but we didn't. So the shortest path must be five steps. But there's a problem with this approach. I mean, the algorithm considers all these roads to be the same length, which is obviously not true. So let's add weights to represent distances between these nodes. Now the shortest path is much harder to figure out. So Dijkstra needed a better method. - ['Dijkstra'] One morning I was shopping in Amsterdam with my young fiance, and tired, we sat down on the cafe terrace to drink a cup of coffee.

And I was just thinking about whether I could do this. - [Henry] His idea was this. First he wanted to keep track of the shortest distance from the source to each node. This is also known as its cost. His goal, well, if he could find the shortest path from the source to one node, then he could build up other shortest paths starting from this new node, and then the next, and on. Now, the cost of the starting node was zero. But since he hadn't actually explored the rest of the graph yet, he set all the other nodes to a cost of infinity. Then just like breadth-first search, Dijkstra started from the source,

and explored each of its neighboring nodes. At every step he'd ask, "Is the current path to this node the shortest we've seen so far?" For example, this edge only has a weight of one, while the node's current cost is infinity. So this path is the shortest yet to that node. So Dijkstra updated its cost to one. Likewise, these nodes were updated to three, two, and four. After checking all of Rotterdam's neighboring nodes, Dijkstra marked it as explored. Then since this node had the lowest cost out of all the unexplored nodes, Dijkstra explored it next. Like before, he checked each of its edges to reduce the neighbor's costs.

Dijkstra could update this neighbor from infinity down to six, a total path length of five plus one, but he couldn't update this one. The current path has a cost of five, but the nodes' cost was three. That meant there was already a shorter path, this one directly from the source. So there was no need to update the cost. The next node to explore is this one with a cost of two, which would update its neighbor to five. Then this one with a cost of three would update its neighbors, and so on.

Sometimes Dijkstra had to update a node's cost a few times. For example, he first reached this node through this path with edge weights four, and then three, for a total cost of seven. But later he found this path was cost six. So he updated the nodes cost again. (gentle music) The first time Dijkstra reached Groningen, it had a cost of 19. But it didn't have the lowest cost out of all the unexplored nodes. There was still this node with a cost of 16, which could be hiding a shorter path to our goal. So he continued exploring nodes in cost order, and found this path with cost 18.

Now, out of all unexplored nodes, Groningen had the next lowest cost. So Dijkstra was confident that he'd found the shortest path. Here's why. If the lowest cost among the unexplored nodes is 18, then Dijkstra must have explored all the paths with cost 16, 14, 13, every path less than 18. And none of those paths reached Groningen. It's just like breadth-first search, searching every step away from the source. By exploring from lowest to highest cost, Dijkstra's algorithm guaranteed the shortest path to any target.

- ['Dijkstra'] It was a 20-minute invention. I designed it without pencil and paper. I learned later that one of the advantages of designing without pencil and paper is that you are almost forced to avoid all avoidable complexities. - [Derek] During ARMAC's official inauguration in 1956, Dijkstra asked the audience members for two towns on a simplified map of the Netherlands. ARMAC then printed the shortest route town by town. It ran perfectly. The computer was such a success that the center immediately started the next generation. And as for his algorithm, well, it just sat in his papers for years.

Algorithms weren't respectable enough for mathematics journals, and there weren't very many computing journals yet. But in 1959, he published his work in Numerische Mathematik, or Numerical Mathematics, a brand new German journal that needed some help getting established. And that was the beginning. - ['Dijkstra'] Eventually, that algorithm became, to my great amazement, one of the cornerstones of my fame. I found it in the early '60s in a German book on management science, Das Dijktra'sche Verfahren. Suddenly there was a method named after me.

Dijkstra's algorithm is still regarded as one of the most elegant, shortest path algorithms. But it isn't good enough for Google Maps. Let's say I want to get from Newark Airport in New Jersey over to the Central Park Zoo. First, Dijkstra's algorithm checks all the 10-kilometer journeys, and then all the 20 kilometer journeys, and so on until it reaches all the 28 kilometer journeys, which includes the zoo. The search frontier covers over 65,000 nodes, and includes places that are way off, like Staten Island, and large swaths of New Jersey. But even though it searched in a illogical directions, the runtime was about a 10th of a second,

which is incredibly fast. And for anyone wanting to set up their own pathfinding tests, we were able to get a feel for these algorithms using some Python scripts and actual maps from OpenStreetMaps. Solving problems using code like this is as much about the journey as it is about the destination. You get to understand how Dijkstra works under the hood. Along the way, you also brush up on your coding skills. If that's something you wanna try out, I strongly recommend today's sponsor, Boot.dev.

Instead of blinding you with code, their courses drill in the fundamentals by giving you actual problems to solve. Designing an RPG game, building a payment app, or yes, running Dijkstra's algorithm. Their courses start with overview videos and steadily feed in terminology to make sure that you actually understand it. They even cover pronunciation. You ever heard of SQL? Okay, it's SQL. Well, it's a programming language that's become a bottleneck for certain jobs. Take data science. 60% of US job listings last year required you to know SQL. That's almost 80% for Python.

Boot.dev is geared up to teach you the languages and tools that you'll need for backend and DevOps roles. From Go in Docker to Linux and AWS. You can start from scratch, or jump into more advanced topics. See, DevOps engineers in the US are still some of the highest paid tech roles. Because these skills are genuinely hard to learn. And that's the point. To get started, go to Boot.dev and use our code "Veritasium" to get 25% off your entire first year on the annual plan. I wanna thank Boot.dev for sponsoring this part of the video.

Now less than 100 milliseconds seems pretty fast. So what's the problem? Well, things slow down a lot with bigger and bigger graphs. Like the North America network. One way we can compare search algorithms is by their average query speed. First we pick lots of random points. Then we run Dijkstra's algorithm on each pair of points and average all the runtimes. On the North American network, a well-tuned Dijkstra takes around seven seconds per path. And for a lot of these searches, the algorithm has to explore most of those 64 million nodes.

One note, most of the queries in this average are for very long journeys, since it's unlikely that two randomly picked points will be close together. Now we've seen that Dijkstra runs much faster within one city, but this average is a good way to compare pathfinding algorithms. But still, I can wait seven seconds. I mean it took about four seconds on my phone. But the thing is, it's not just me. I'm waiting in a line behind Derek, Gregor, Casper, and all of you.

There can be millions of people asking for directions all at once. And while Google has a lot of servers, there's a limit. If we want Google Maps to run in a couple of seconds, we actually need the algorithm to run in less than a millisecond, something 7,000 times faster than Dijkstra's. (whooshing sound) Well, one problem with Dijkstra's is it searches in the opposite direction of the target. So if we know the rough direction, is there a way that we can punish a illogical route? Let's go back to our problem, getting from Newark to the Central Park Zoo. We'd like to prioritize nodes that are closer to the zoo. So using longitudes and latitudes, we can easily calculate the straight line distance between any node and our target.

We'll order the nodes by their cost, plus this straight line distance. A great way to visualize this comes from the channel Polylog. We can represent each node's straight line distance to the target as its height, so the graph stretches into 3D space. The further a node is from the zoo, the higher it is. So the algorithm penalizes moves that climb up the graph. That way nodes in the opposite direction aren't explored early on. So while Dijkstra's search frontier spreads out in all directions, this modified Dijkstra's, also called A-star, immediately heads toward the zoo.

It only checks around 7,000 nodes, which is almost a 10 times improvement over Dijkstra's. And by changing this penalty, also called a heuristic, we can tweak how aggressively A-star targets the zoo. The steeper the slope, the less A-star spreads out. It's a more human approach to finding the shortest path, since we're only exploring in the rough direction of the target. That's why A-star is one of the most popular algorithms behind video game NPCs. For example, many of the mobs in Minecraft use a version of A-star with extra penalties for dangerous zones. It's also a popular maze-solving algorithm. At worst, A-star checks as many nodes as Dijkstra. But on the right maze,

it can cut down the search space by more than half. But when I'm trying to get somewhere, I don't care about minimizing the distance. I just wanna get there the fastest. (Henry laughing) To optimize for time, we divide the straight line distance by the maximum speed allowed on the graph. This gives us the minimum travel time. But this heuristic is an extreme underestimate. - If you wanna go for the shortest path in the sense that shortest geographic distance, you will gain some advantages using A-star, compared to Dijkstra.

If you go for travel time, A-star, from my experience, loses against a well-tuned Dijkstra. You explore fewer nodes, but not that many. But you now also have to evaluate quite a few heuristics that you would not need to evaluate when doing pure Dijkstra. And those heuristic contain a nasty square root computation. - [Henry] On our New York City graph, Dijkstra's algorithm explored around 9.5 times more nodes than A-star, to find the shortest path by distance. But if we switch that to the shortest travel time, the number of nodes A-star explores triples, while Dijkstra only increases by 10%. And it only gets worse on larger graphs. So Google Maps need something better that works well for travel time.

What if we ran the search in both directions? From the source to the target, and from the target to the source? If two points are a distance R away, Dijkstra searches most of the nodes roughly within a circle of area pi R-squared. But with two searches, the frontiers will overlap around R over two, and cover an area of half pi R-squared, half that of the single search. And depending on the graph, it can sometimes be better than half. A basic Dijkstra search from Carnegie Hall to Wall Street explored over 7,200 nodes. But a bi-directional Dijkstra explored a little over 2,600 nodes, close to a three times improvement. But even a bi-directional search doesn't take advantage

of the logic built into road networks. - They still do things that you wouldn't necessarily conceive as a good idea, right? Like they still look at the drive-through at In-N-Out and see if there isn't perhaps a shorter path if you drive halfway into it, and then some clever way out of it, right? So maybe that is the intuition between why kind of Dijkstra and A-star itself don't suffice. Both of them still kind of don't have a good sense of the hierarchy of the road network.

- When I think about how to get from one place to another, I know that I'm probably gonna meander around some local roads, then hop on a highway, and then when I'm close to my destination, I'm gonna hop off the highway and meander around some more local roads. There's a hierarchy to road networks, but these algorithms can't take advantage of them. So they're as likely to search a twisty local road as an interstate highway, as long as they have the same weight. So how do we actually encode that human intuition into an algorithm? - If you look at kind of industry implementations in the '90s, for instance, like in these early in-car GPS systems, they had like a pre-computed road class that they annotated on the roads.

- So here was the basic idea. Surveyors would go out and take notes about a road. Like its speed limit, how wide it was, whether it was a highway. And then they would send that to the mapping companies who would manually categorize it into hierarchies. An example of a hierarchy might be express highway, local major road, or narrow road like this. Once it had the maps, GPS would run a bi-directional Dijkstra, but there was a catch. It first searched all the narrow roads within a candidate area. If it didn't find the target, it mapped the path to the next hierarchy, the major local roads. Then search all of the major local roads within the new candidate area.

If it still didn't find the target, it moved up again to the highway level. The two searches would overlap at some point, and return the final path. This way, a little pre-processing could reduce the search area and take advantage of the road hierarchy. - One obvious downside here is that how do you correctly compute those levels of hierarchy? And how can you prove to yourself that they're correct in the sense that you are not missing any shortest paths? - For example, if the candidate area is too small, there's a chance the algorithm returns a highway path, when a series of local roads really is better.

A much larger candidate area might always find the shortest path, but at some point, that's not much better than Dijkstra. If Google wanted to map the world, they need a better routing algorithm. (upbeat music) Using a pre-processed hierarchy gives us way faster query runtimes. But, there's a trade-off. Our brute force algorithm from earlier where we just calculated all the paths to get the shortest one is a perfect example. If we had a massive lookup table of all the shortest paths from any two points, then we would get insanely fast query runtimes. But building that list would take way too long.

Here I have a graph. On the vertical axis is query runtime. And on the horizontal axis is pre-processing time. In the case of our massive lookup table, it would be right about here. Even running Dijkstra from every one of those 64 million intersections, that's more than a decade of compute on a single core. And the entire table would be more than eight petabytes of raw data. That's like storing all of Wikipedia, 16 times over. And even if we had the table, we'd have to rebuild entire portions whenever a road closes, a new bridge opens, and for every small traffic adjustment.

Now in reality it would probably not even fit on this chart. But for now, we're just gonna put it here. On the other hand, Dijkstra's algorithm requires no pre-processing at all. But it takes a long time to run. So on our graph, it would fit about here. Google Maps need something in between. Very fast query run times, but not too much pre-processing. In that case, we want to be in this region. Now the idea is very similar to the manual hierarchies, but with two major changes. First, we'll automatically pre-process our graph to order nodes by how important they are.

An exit connecting two highways, quite important. The intersection to a tiny cul-de-sac, well, not as much. No more categorizing roads by hand. We'll still use a bi-directional Dijkstras that only searches up the hierarchy from both directions. But if we build it clever enough, we can get rid of candidate areas and still guarantee the shortest path. So phase one is building this hierarchy. How do we pick the most important nodes though? Let's take a look at these two small towns. We're gonna throw away everything in one town except for this house for right now. If someone in that house wants to get anywhere,

they have to pass through this intersection at the end of the bridge. All of the shortest paths to and from their house include that node, so it's pretty important to them. But that node isn't as important to the rest of the town. After all, the only reason to cross the bridge is to get to that one house. But with more and more houses on the other side, the bridge node gets more and more important. Any trip from one side to the other needs to use it. It's the most important when the two towns are roughly the same size, when it splits the graph in half. That's when the maximum number of shortest paths include that bridge.

There are other sets of nodes that split the graph in half, like this main street, but there's far more intersections along that street than just the one bridge node. So a small set of nodes, or a small cut of nodes is important when it roughly splits the graph in half. Let's bring back the North American network. These blue nodes roughly split the graph in half. If you wanted to get from anywhere on the east coast, to anywhere on the west coast, you'd have to go through one of these 102 nodes. It's just like the bridge connecting the two towns. - That's really small. But if you look at where does this thing go, that's essentially the Mississippi River.

And bridges over the Mississippi are expensive to build. - [Henry] These are the biggest bottlenecks. Out of all the 64 million plus nodes, these 102 get the highest rank, so the biggest numbers. Now let's find the next most important nodes. - Well, what can we do on both sides? Well, we can do the same thing. We again, search for small cuts. - The next cut on the east coast follows the Ohio River. And then it loosely traces up the Appalachian Mountains before heading back south along the Potomac River. On the west coast, the cut mostly follows the Rockies, and then the Colorado River.

These two sets get the next highest ranks for their sides, and we can split these in half again and rank those nodes. And again, and on, until we've ranked all 64 million nodes. This is called nested dissection. This pre-processing step has two goals. We want an algorithm that doesn't search a lot of nodes, so the runtime is really fast, but also will always return the shortest path. The hierarchy we build now needs to support both of these goals. Let's test this out on a smaller graph. We ignore the edge weights for the first step.

These two nodes split the graph roughly in half, so we give them the highest rank. Looking at one side, this node splits the subset in half, so it has the highest rank of the three. We can rank the remaining two nodes in any order. Now, on the other side, these two nodes split this subset in half. So they have the next highest rank. And then again, we rank the final two nodes with the remaining numbers. But what about this ranking? We split the graph the same, but the rankings within the same cut are all swapped around. Or if we split the graph differently, we could get this ranking.

It even splits the graph perfectly in half. Isn't that better? And these rankings are all valid. We'll see in a bit why we could use any one of them. But for now, we'll just take this one. The best way to visualize this ranking is to pull down, or contract the nodes in order. One is the furthest down, two is next up, and so on. Now we can bring in a search algorithm. To speed up the runtime, we limit the search space by only moving up the hierarchy. And that's why it's so important to use a bi-directional Dijkstra. If the algorithm couldn't search in both directions,

it would get stuck on a lot of pairs. So instead, the two searches meet in the middle at the top of the hierarchy. Now let's add the edge weights back. Okay, what's the shortest path from node four to node eight? Well, starting from node four, we search up, and reach node nine with a cost of 10, From node eight, we also search up, and get to node nine with a cost of five. So that must be the shortest path with a cost of 15. Done. Except, there's a problem. That's not the shortest path. Using this path only costs three.

This search doesn't check paths that go through lower rank nodes. But we can fix it with a little more pre-processing. Start from the lowest rank. Does this node connect to at least two higher nodes? On this graph, node one connects to three and eight. This is called a lower triangle. And while there's a chance this path is actually the shortest between node three and node eight, our search will never consider it. So we'll add a shortcut between nodes three and node eight. That's just the cost of this lower triangle.

Now, this edge doesn't correspond to anything real on the road network. It's just a way to keep track of shortest paths. We move on to the next lowest node and run the same test. But there could be multiple lower triangles, or there might already be an edge between the two higher nodes. In those cases, the shortcut represents whichever path has the minimum cost. A path that only uses lower ranked nodes is like a path that only uses smaller local roads. We don't want the algorithm to return a highway route if a local road is shorter.

Building shortcuts from the bottom up makes sure we don't miss these paths, while also ignoring them if they aren't needed. This way, we both limit the search space, and never miss the shortest path. But there's another benefit to shortcuts. We don't need to worry too much about getting a perfect ranking. We can roughly cut the graph in half, and rank nodes in the same cut in any order. If the ranking overlooks the shortest path, well, we'll add it back later with a shortcut. That's why we were able to get three valid rankings

using the same instructions earlier. But while any ranking will still find the shortest path, a bad one will go really slow, or it'll quickly blow the graph up. For example, with a series of nodes in a straight line, we don't need to add any shortcuts if we just contract the nodes from left to right. But when we search between the two ends, we have to explore every node and edge. And that's no improvement over Dijkstra's. Here's another possible ranking with the necessary shortcuts. End-to-end, we only need to search two edges, but we need a lot of shortcuts. And in some sections, we still have to check all the edges.

Manageable, but not very efficient. - And so you want to have this trade-off between the number of shortcuts that you introduce and how much faster the query gets. - [Henry] In practice, the nested dissection method gets a decent node ordering and doesn't take too long. This entire algorithm is called a customizable contraction hierarchy. And Ben tested it out on the North American network to give us a sense of the runtimes. In phase one, we order the nodes and add shortcuts. This doesn't change unless the graph itself changes, like a new bridge, or a road closure.

It's by far the most expensive part of the process. But once it's done, it hardly ever changes. - I think it would ended up at an hour 40 or something like that. You can spend more time, you get better results. So it's an argument to make that's a good idea. - [Derek] In phase two, we calculate the weights for all the shortcuts. Whenever there's traffic updates, we'll have to rerun this step. So it needs to be relatively quick, - I think seven, eight seconds without parallelization and take it down to about a second, and then that's where the memory starts setting.

- [Derek] Finally, phase three is the actual search. Remember, for a long path on the North American network, a well-tuned Dijkstra needs to explore most of the 64 million nodes and takes around seven seconds per path. For this contraction hierarchy, the query runtime was around 200 microseconds. - Depending on how you configure stuff, you can get down to like 100 microseconds or so. But for practical purposes, below a millisecond is all you need. - [Henry] That's 35,000 times faster than Dijkstra in just this experiment. And on average, it only needed to explore 1,450 nodes, cutting the search space by a factor of around 44,000.

A Dijkstra search from San Francisco to Montreal would need to check all the nodes roughly in this area. On a customizable contraction hierarchy, it only needs to search these 1,236 nodes. Most of the nodes cluster around the source and target. Once it gets far enough up the hierarchy, it only needs to check the most important cuts, like the Mississippi. While we don't know exactly what Google maps, Apple maps, or any of these big routing applications actually use under the hood, they all could be built with contraction hierarchies. And customizable contraction hierarchies is just one variation, and there are even more routing algorithms that we haven't covered.

But if you think about it, its core search is still Dijkstras, a 70-year-old algorithm. Four decades after Dijkstra publishes article, Danish computer scientist Mikkel Thorup said that all theoretical developments in single source shortest paths have been based on Dijkstra's algorithm. And since 2000, many of the newest shortest path algorithms still use bits and pieces of Dijkstras. From variations on contraction hierarchies, all the way to a recent theoretical breakthrough for an even faster shortest path algorithm. All of this, based on a 20-minute invention. Part of Dijkstra's enduring success comes down to just how simple it is. As he said, "Simplicity is prerequisite for reliability."

Dijkstra thought deeply about problems before ever touching a pen. Because at the end of the day, it would be a programmer who is accountable for their work, not the machine. (speaking foreign language) - So with those two ideas, he hoped to have one lasting legacy. He said, "If 10 years from now, when you're doing something quick and dirty, you suddenly visualize that I'm here, looking over your shoulders, and say to yourself,

'Dijkstra would not have liked this.' Well, that'd be enough immortality for me." So even if you can only control your little corner of the world, you can still make it as elegant and thoughtful as possible. For you, and all the people who come after. (bright music) Before we go, I just wanted to say thanks again to Boot.dev for sponsoring this video. If you guys wanna check 'em out, you can check the link in our description, or scan this QR code right here, and make sure to use that checkout code "Veritasium" for 25% off. So thanks to Boot.dev, and I also wanted to thank you for watching.

- [Speaker] You have arrived at your destination. (bright music)

More Entertainment Transcript