I’m in a bit of a rush with this one, but the TL;DR is that the US Senate was told:
“On June 11th Mark Warner, the vice-chair of the Senate Intelligence Committee, said that General Joshua Rudd, who leads the National Security Agency and the Pentagon’s Cyber Command, had told him that Mythos ‘broke into almost all of our classified systems, not in weeks, but in hours.’”
This isn’t really great… and you will hopefully see why!
As you can imagine, this caused a bit of a stir in the practitioner community. Why? Because sure, he said that... but the contents are almost certainly not entirely accurate (for oh so many reasons). Another problem is that this stuff gets spread around like ‘fact’ all over the internet. In a world where most people don’t know how TCP/IP ‘works’, let alone how networks at scale work, this creates a problem IMHO. See the example post on Twitter:

So I won’t try and write everything down here, maybe another day, but I did task Claude to do some myth busting for me, so I’ll share that:
analyst@xsv:~$ ./assess.sh --claim "mythos_owned_the_classified_estate"
No, MYTHOS Did Not “Break Into Almost All Classified Systems”
A viral quote, a third-hand chain of telephone, and a community that has clearly never had to write an RMADS. Let’s deflate this with architecture instead of vibes.
If you spent your weekend on the timeline you’ll have seen the card: “BREAKING: NSA confirms MYTHOS broke into almost all of its classified systems in hours.” It’s a great headline. It is also wrong in almost every load-bearing word: “BREAKING,” “NSA confirms,” and the implied scope all fail on contact with how classified networks are actually built and governed.
I have no inside knowledge of the exercise in question, and the real scope is classified, so nobody posting about it does either. But you don’t need the classified detail to assess the claim — you need to understand the architecture it allegedly defeated. So here’s the deflation, from someone who has had to live inside accreditation boundaries rather than screenshot them.
01 // Why the claim is unlikely to be based on reality
It’s third-hand, it was disowned by its own author, and it grew a word in transit.
Trace the provenance and the “BREAKING” energy drains out fast. A general reportedly said something to a senator in a private briefing. The senator (Mark Warner) repeated it to make a pro-regulation point — “thank God it was Anthropic… voluntary testing alone won’t solve this.” A journalist (Shashank Joshi, The Economist) quoted it. Then the internet bolted “NSA confirms” on the front and the word “autonomously” into the middle.
The decisive fact: the author of the line disowned the literal reading. Joshi posted that it would be a mistake to read it literally, that it “surely depends on using Mythos alongside other tools under very particular conditions,” and that not adding caveats was his error. When the person who wrote the sentence tells you not to read it the way the screenshot reads it, the screenshot is dead. Note too that “autonomously” — the scariest word in the secondary coverage — is not in Warner’s quoted remarks. It was added downstream.
02 // Low side, high side, and why “where it ran” decides everything
A model can only touch what it can reach. Classification boundaries are reachability boundaries.
Government networks aren’t one big estate with a naughty-step. They’re stratified by classification. Loosely: the low side is your lower-classification / internet-adjacent estate (think OFFICIAL, or NIPRNet in US terms); the high side is SECRET and above (SIPRNet at SECRET, and a separate TS/SCI fabric again). These are physically and logically separate networks — frequently separate cabling, separate hardware, separate rooms.
You don’t route between them. Where data must move, it crosses a Cross Domain Solution (a guard, or a one-way data diode, with review), and in plenty of places there is simply an air gap and a human carrying accredited media under a documented process. The implication for our viral claim is brutal and simple: a model can only touch what it can reach, so where it ran decides which systems it could have touched at all.
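To make the reachability point concrete, here is a minimal sketch in Python. The topology is entirely hypothetical (the node names and links are my own illustration, not a description of any real estate), and an edge simply means "there is a routable path an operator or tool could traverse". A Cross Domain Solution or air gap is modelled as the absence of such an edge.

```python
from collections import deque

# Hypothetical toy topology. Every name and link here is an illustrative
# assumption. An edge means "interactively routable from A to B".
LINKS = {
    "internet": ["low_web_gateway"],
    "low_web_gateway": ["low_workstation"],
    "low_workstation": ["low_file_server"],
    "low_file_server": [],
    # High side: no routable edge from the low side leads in here.
    "high_analyst_terminal": ["high_database"],
    "high_database": [],
    # Air-gapped enclave: no edges in or out at all.
    "airgapped_enclave": [],
}


def reachable(links, start):
    """Return every node reachable from `start` via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(reachable(LINKS, "internet")))
# ['internet', 'low_file_server', 'low_web_gateway', 'low_workstation']
print("high_database" in reachable(LINKS, "internet"))  # False
```

However clever the thing doing the traversal is, the search never returns a node it has no path to. A faster attacker shortens the time to cover the reachable set; it does not enlarge the set.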
03 // RMADS & ATO: you can’t just let an AI run amok everywhere
Accredited systems operate inside a documented boundary and configuration. An autonomous agent with broad reach is a material change, not a free action.
This is the bit the hype merchants have clearly never touched. An accredited system doesn’t just “exist.” In the UK it carries an RMADS — a Risk Management and Accreditation Document Set — capturing its scope, threat model, controls, and accepted residual risk, signed off by a risk owner / accreditor. In US RMF terms it holds an ATO (Authority to Operate) granted by an Authorising Official under NIST SP 800-37. The system is permitted to run only within the boundary and configuration described in that paperwork.
So the fantasy of someone shovelling an experimental, broadly-scoped offensive AI agent onto live accredited infrastructure and letting it roam runs straight into change control. Introducing that capability is a material change to the risk posture: it demands a security impact assessment, change authorisation, and very plausibly re-accreditation. You do not get to “let it loose across the estate” — not because anyone thinks it’d be boring, but because the governance physically gates it. Either this was a separately authorised, bounded exercise, or it breached the very accreditation regime that defines these networks. The former is mundane. The latter would be a different scandal entirely, and nobody is alleging it.
04 // Segmentation — network and physical
“Almost all classified systems” assumes one contiguous target. It isn’t one. It’s many fenced fields.
Even within a single classification level, the estate is carved up — by community (intelligence vs military vs coalition), by mission, by programme, into enclaves with boundary protection and mandatory access control between them. Releasability caveats fence things further. This is segmentation as a design principle, not an afterthought.
Then add the physical layer: separate equipment, SCIFs for the high-side material, TEMPEST considerations, no cross-connection between fabrics. To literally compromise “almost all classified systems” you would have to defeat every one of these boundaries in turn — network segmentation, accreditation boundaries, community separation, and physical separation — within hours. That isn’t a capability claim; it’s a claim that the entire defence-in-depth architecture is decorative. It isn’t.
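A rough way to see why "every boundary, in turn, within hours" is such a big ask is to multiply it out. The sketch below is illustrative arithmetic only: the 0.5 figures are made-up, deliberately generous placeholders, and treating the layers as independent is a simplifying assumption rather than a claim about any real network.

```python
import math

# Hypothetical odds of defeating each layer inside the time window.
# These numbers are invented for illustration and are NOT measurements.
boundaries = {
    "network segmentation": 0.5,
    "accreditation boundaries": 0.5,
    "community separation": 0.5,
    "physical separation": 0.5,
}


def chance_all_defeated(per_boundary):
    """Joint probability of defeating every layer, assuming independence."""
    return math.prod(per_boundary.values())


print(chance_all_defeated(boundaries))  # 0.0625
```

Even with coin-flip odds at each layer, getting through all four comes out at about 6%, and that is for a single enclave. Repeat it across many separately fenced enclaves and "almost all" gets very unlikely very quickly.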
05 // Tests have to be scoped
Nobody points an unproven offensive tool at production classified ops. That’s what ranges are for.
Authorised offensive testing is not a free-for-all. It runs under Rules of Engagement: a defined target list, an authorising owner, deconfliction, a start and a stop. Crucially, you do not validate an experimental offensive AI capability against the live operational classified estate — you run it against a range or representative environment built to mirror real configurations precisely so you can let something aggressive off the leash without taking down operations.
Which means the honest translation of the quote is almost certainly: “in an authorised, scoped exercise, MYTHOS got into nearly all of the in-scope targets, fast.” That is a genuinely notable result. It is not “the NSA’s classified estate fell over in an afternoon,” and the gap between those two sentences is the entire story.
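The Rules of Engagement idea can be sketched as data plus a check. This is a hypothetical illustration, not any real tooling: the class, field names, hostnames, and dates are all assumptions of mine. It shows a defined target list, an authorising owner, and a start and a stop, and it shows that "nearly all" is measured against the in-scope list rather than the whole estate.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RulesOfEngagement:
    """Hypothetical RoE record: who authorised it, what is in scope, and when."""
    authorising_owner: str
    targets: frozenset
    start: datetime
    stop: datetime

    def permits(self, target: str, at: datetime) -> bool:
        # An action is allowed only on a listed target inside the time window.
        return target in self.targets and self.start <= at <= self.stop


# Illustrative values only; none of these hosts or dates are real.
roe = RulesOfEngagement(
    authorising_owner="range-owner (hypothetical)",
    targets=frozenset({"range-host-01", "range-host-02", "range-host-03"}),
    start=datetime(2025, 1, 6, 9, 0),
    stop=datetime(2025, 1, 6, 17, 0),
)

during = datetime(2025, 1, 6, 10, 0)
print(roe.permits("range-host-01", during))  # True
print(roe.permits("live-ops-db", during))    # False

# "Nearly all" is a fraction of the in-scope list, not of everything that exists.
compromised = {"range-host-01", "range-host-02"}
print(f"{len(compromised & roe.targets)}/{len(roe.targets)} in-scope targets")
# 2/3 in-scope targets
```

The denominator is the whole point. A headline that swaps "in-scope targets" for "classified systems" has changed the bottom of the fraction without telling you.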
06 // MYTHOS is not magic
It’s a force multiplier, not a repeal of network architecture or physics.
Strip the mystique. The model’s relevant strength — the one everyone keeps citing — is rapid code and vulnerability analysis: read a codebase, surface the flaws, chain them. Used as an accelerant, it can compress weeks of human vuln-research into hours. That’s real, and it’s the actual point General Rudd was reportedly making.
But it still needs the same things any operator needs: a reachable target, a foothold, and an exploitable condition to act on. It cannot cross an air gap it has no path to. It doesn’t conjure access where the architecture grants none. It doesn’t dissolve a data diode or talk a SCIF into routing to the internet. Point it at a network it cannot reach and it produces exactly nothing. The novelty here is speed — not omnipotence — and conflating the two is how a sober capability observation becomes a “Skynet owns the Pentagon” card.
07 // What the claim does get right
Deflating the hype isn’t the same as dismissing the signal. Here’s the part you should take seriously.
Speed compression is real and it matters. “Hours not weeks” is the credible core. An offensive tool that collapses a human red-team timeline by an order of magnitude changes the economics of attack, and that’s a legitimate national-security concern — which is exactly why a Mythos-class capability sits behind a tighter release posture in the first place.
Accredited ≠ invulnerable. If your prior is “accredited networks don’t fall,” temper it. Accreditation is documented risk acceptance, not a force field. Red teams beat accredited estates routinely given scope and time. “A capable tool got into most of the range fast” doesn’t contradict that experience — it’s consistent with it.
The leadership reaction is the real datapoint. The signal isn’t “a breach was suffered.” It’s that the people who run US cyber now rate frontier-model offensive capability as a national-security issue, and moved accordingly. That part is sincere, not vendor theatre.
08 // Bottom line
One genuine unknown I can’t close from open sources: whether this was a bounded exercise rather than discovery of a live compromise. Nothing in the reporting suggests the latter — but it’s the load-bearing assumption under everything above. If it were the live estate rather than a range, every confidence level here moves. It almost certainly wasn’t.
You can also see some posts of mine about
I’ll try to show this another way in future, but hopefully my watermelon concept makes sense: simple enough for everyone to understand without missing out important realities (abstractions are hard).
Cyber security is complex and details matter. We live in a very challenging time when hype and oversimplification rule and science and detail are left gasping for air!
Oh, and if you are interested in LLMs for offensive cyber, and what Mythos has been tested with, check this out: https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities








