Okay, hoodies on. So, why are we wearing hoodies? Well, we're digging into an aspect of artificial intelligence that's been dominating the news for the past couple of weeks, and that's its impact on cybersecurity. As you've probably heard, Anthropic decided not to make its latest AI model, which is called Mythos, generally available to the public. And that's because Mythos is strikingly capable, according to the company, at finding vulnerabilities in software and figuring out how to exploit them to crash systems, gain unauthorized access, and so on. In short, Anthropic has created an AI model with insane hacking skills. And not a hoodie in sight. I think we can take these off now. Yeah. For years, people have argued
about AI's impact on cybersecurity. As AI becomes more advanced, who will it help more? Cyber attackers or cyber defenders? Will it make the world more dangerous or will it boost safety? And if Anthropic could make an AI hacker as powerful as Mythos, how long will it be before open-source models, which anyone can get hold of, gain similar skills? That's quite a scary prospect, isn't it, if something like this becomes widely available? Because, frankly, you know, cyber attacks have been on the rise lately. The thing about AI is it's been getting better and better as a software engineer for the last, well, 3 years, you know, since the ChatGPT moment. What's happened with Mythos is it's gone from being a coding
assistant, something that can work under the tutelage of an experienced coder and, yes, speed up their ability to type code out, to something that can autonomously find and exploit a vulnerability with very little oversight from a human at all. It is an automated hacker. Okay. Now, we should also say, for the pedants watching, and I would name myself as one of those, hacking is where you exploit a system in a way that follows the rules but subverts the intent of the rules. So, hacking isn't itself bad, and this is why people in the community prefer the term malicious hacker. But we are going to be using the term hacker in that more colloquial sense, which is the reason that, you know, I've got my
hoodie on, and if I had mirror shades on, I'd be like... Yeah, exactly. So, before we get to Mythos in particular, we should explain this distinction. You have the idea of a vulnerability: it's something wrong with a piece of code that gives you a kind of way in, to potentially exploit it to either crash a system or, you know, gain control over it. And I think what's particularly worrying, and this is a really striking chart, is that the amount of time between one of these flaws, these vulnerabilities, being discovered and disclosed and being turned into an actual exploit, so being used in an attack, has been falling very rapidly. So, the delay was 2.3 years in 2018. And now
this delay has fallen to 20 hours. So, literally, someone discovers a vulnerability and there are people using it within a day. And at this rate, the delay between disclosure and exploitation is going to fall to 1 minute by 2028. This is extremely worrying, isn't it? And just, you know, to do the definitions, you can start off thinking about bugs. A bug is any time software does something that it shouldn't. A vulnerability is a particular type of bug that is dangerous. And an exploit is a piece of software that uses a vulnerability, or multiple vulnerabilities, to do the dangerous thing. Right. A vulnerability might be, "Oh, this is supposed to treat numbers as 0 to 100, but oh dear, it treats them as -100 to +100, and it
crashes if it gets a negative number." That's bad. The exploit is if you then work out a way to use that crash to do something damaging down the line. I believe that you have indeed, using AI, cooked up a demonstration of exactly how this works, taking us through one of these flaws and how it could be turned into an exploit. Yes, this is my interactive demonstration made with the help of Opus 4.7. And by "with the help of", I sort of mean "by it". Yeah. Even the dumb models are pretty smart these days. And this is one of the vulnerabilities that was found by Mythos. It's in the operating system OpenBSD, and it had been sitting there for 27 years. So, this kind of shows how
good it is at finding weaknesses, right? All it took to fix it was one line of code. But working out what the problem is is, I think, at just about the right level for you and me. Before we dig into it, the important thing here is that this was a vulnerability that was there for 27 years. It's one of, I think, several thousand that they say Mythos has found. They haven't made them all public, for obvious reasons. But essentially, it was there all along, and it meant that if you sent a particular kind of message across the internet to a system running that operating system, you could make it crash. Sure. And we're going to be working here slightly with a metaphor of hotels. Opus 4.6 suggested
this, but to start, let's do TCP. The way the internet works, the way any network works, requires me to tell you some information and you to go, "Heard that." Yeah. And that's great. But sometimes you don't want to be saying "Heard that" every single time I tell you something. So, you can say "Heard that, messages 1 to 6," or "messages 1 to 7." You can say "Heard that, messages 10 onwards." And I should know that there's a gap: you haven't said you've heard 8 and 9. Right. We can explain this by filling a hotel, you know. Right. You send people up and you fill in rooms one to five, and I tell you that these people have checked in.
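The "heard that" bookkeeping can be sketched in a few lines of Python. This is only an illustration of the idea as described here, not how a real TCP stack is written: real TCP acknowledges byte positions rather than whole messages, and the function name `find_gaps` and the upper bound of 12 for "10 onwards" are made up for the example.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first, last) message numbers


def find_gaps(acked: List[Range], first_expected: int = 1) -> List[Range]:
    """Return the message ranges that nobody has said "heard that" for yet."""
    gaps: List[Range] = []
    next_expected = first_expected
    for start, end in sorted(acked):
        if start > next_expected:
            # Everything between what we expected next and this range is missing.
            gaps.append((next_expected, start - 1))
        next_expected = max(next_expected, end + 1)
    return gaps


# "Heard that, messages 1 to 7" and "heard that, messages 10 onwards"
# (12 is an arbitrary stand-in for "onwards").
print(find_gaps([(1, 7), (10, 12)]))  # [(8, 9)]
```

The sender's job is then simply to resend whatever `find_gaps` reports and wait for the list to come back empty.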
They've checked in. And then I shout down, "Okay, 10 to 14 have checked in." And you go, "Okay, wait, I don't know where the people I sent up to sit in rooms 6 to 9 are. They've gone missing." Then down the line you go, "Ah, 6 to 9 have turned up." And you at the front desk know everyone's arrived. That is what's happening under the hood on our computers when they're checking that all of the packets have arrived. Exactly. So, that's great. That's how the system works. The problem comes with the first bug that we've got, which is that, oops, if I tell you I filled room nine, you hear nine. If I tell you I filled room 2.2 billion, then what happens is, because this is a
lovely 32-bit signed integer, it wraps round. And you hear 2.2 billion and you go, "Ah, he's filled room -2 billion." Right. And you don't think that's a problem, because you've got this vulnerability. This is because of the way that binary is used to encode numbers on computers. The most significant bit is used for the sign, and so very big numbers become very big negative numbers. And that's a problem. It wouldn't be a problem but for the second bug in this system, which is that if you tell the hotel manager that you have put someone in a room that is simultaneously a higher number than the earliest unfilled room and a lower number than the earliest unfilled room, then, well, the
whole system goes crazy. It tries to write there and it crashes out. And so, what you do is very simple. You say, "Hey, we've checked someone in to room 2 billion." And the system goes, "Okay, well, that's higher than the highest number, lower than the lowest number." It erases the gap, and then you get a kernel panic. Right. Okay. And so, in short, you can send one single message to any system running this and crash it. This is the worst type of crash. In layman's terms, a kernel panic is when the whole thing just freezes. The thing just stops. And so, the reason why we know about this is because it's already been fixed. This is responsible disclosure. Anthropic didn't share the details of the bug until the bug had
been fixed. And if you run an OpenBSD system, patch it. So, let's move on to what Mythos is doing differently that makes it so good at this. Is the model trained on existing vulnerabilities that were known about, and is it looking for similar vulnerabilities? Or is it just sort of somehow looking for bugs? On the training, no. You know, this is not a special cybersecurity model. This is the scaling law in action. It's the next one after Opus, and it just turned out to be a good hacker. It's bigger and better. It has more data. It has more flops of compute. And it is just better at everything that Opus was already good at. Okay.
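Going back to the hotel bug for a moment, the two ingredients can be tried out in Python. This is a sketch under stated assumptions, not the OpenBSD code: it assumes the room number is held in 32 bits with the top bit read as the sign, and the helper names `to_int32` and `seq_lt` are hypothetical. Comparing by the sign of a wrapped difference is a common idiom for counters that wrap, used here only to show how one number can look both higher and lower than another.

```python
def to_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed (two's complement) number."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def seq_lt(a: int, b: int) -> bool:
    """Wraparound-style 'a comes before b': the wrapped difference is negative."""
    return to_int32(a - b) < 0


# First ingredient: room 2.2 billion is heard as roughly minus 2 billion.
print(to_int32(2_200_000_000))  # -2094967296

# Second ingredient: a room exactly 2**31 away from the earliest unfilled room
# (6 is an arbitrary choice) passes both the "lower" and the "higher" check.
earliest_unfilled = 6
claimed = earliest_unfilled + 2**31
print(seq_lt(claimed, earliest_unfilled))  # True: looks lower
print(seq_lt(earliest_unfilled, claimed))  # True: also looks higher
```

Code that assumes those two checks can never both succeed is the kind of thing that ends up writing somewhere it shouldn't.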
There are some ways in which it's specific to cybersecurity, which are that it doesn't have the safety features that Anthropic would want to build in. So, all of the models that you and I can use have been very carefully trained to make sure that if we say, "Write me an exploit for Chrome," it will go, "No, I don't think I'll do that. That would be bad." Mythos, the Mythos preview that these companies have access to, doesn't have any of that. It will merrily do what you ask it in the cybersecurity domain, and that's why it's behind closed doors only.