On August 12, 2014, network operators started their morning with an unwelcome surprise. The Internet’s IPv4 routing table suddenly grew past 512,000 routes, the most that many Cisco 6500 switches were configured to hold. For 6500s running default configurations, this meant hitting a hard limit that had been lurking in TCAM for years.
Welcome to 512K Day.
Every router needs to know where to send packets, and that information lives in something called the routing table. By 2014, that table had grown to over half a million entries, and it kept getting bigger. New ISPs pop up, existing networks split their address space for various reasons, and each change adds another entry to the global table.
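To make the splitting point concrete, here is a minimal sketch using Python’s standard ipaddress module. The 10.0.0.0/22 block and the deaggregate helper are placeholders chosen for illustration, not anything taken from the 2014 event. It shows one announcement turning into four once the same address space is advertised as /24s.

```python
import ipaddress


def deaggregate(prefix, new_prefixlen):
    """Split one announced prefix into its more-specific prefixes."""
    network = ipaddress.ip_network(prefix)
    return list(network.subnets(new_prefix=new_prefixlen))


# Hypothetical example: a single /22 announcement replaced by its /24s.
before = [ipaddress.ip_network("10.0.0.0/22")]
after = deaggregate("10.0.0.0/22", 24)

print(len(before), "route before")
print(len(after), "routes after:", ", ".join(str(n) for n in after))
```

The address space covered is identical in both cases; only the number of table entries changes.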
The Cisco 6500 was everywhere in 2014: data centers, ISPs, enterprise networks. The SUP720-3BXL, the supervisor card most 6500s were using at the time, used specialized memory called TCAM (ternary content-addressable memory) that could look up routes incredibly fast, but it had a default limit of 512,000 IPv4 routes. When the platform launched in 2004, the routing table was under 150,000 entries, so the limit seemed far away.
Except the Internet doesn’t stop growing.
On the morning of August 12th, a Verizon network (AS 701) started announcing thousands of new routes. Whether it was intentional or an accident, the effect was immediate. The routing table shot past 512K, and routers started slowing down or grinding to a halt. Sites like eBay and LastPass went down for users on affected networks.
We talked to Andree Toonk, who was running BGPMon at the time, a service that monitored BGP routing in real time. He watched it unfold as reports started flooding in. Verizon caught the issue within seconds and started pulling back the announcements, but the damage had spread. His data showed over 12,000 prefixes affected across thousands of networks.
The somewhat anticlimactic part? Cisco had published a fix three months earlier. The TCAM could actually handle a million routes, but you had to reconfigure it and reboot. Operators who had already made the change were not caught out, which helped avoid an even bigger event.
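One way to picture what "reconfigure" means here is to treat the lookup memory as a fixed pool that gets carved up between protocols. The sketch below is a toy model, and every figure in it is an assumption for illustration: a pool of 1,024K entries, IPv4 routes costing one entry, IPv6 routes costing two, and the three example splits. Check Cisco’s documentation for the real numbers on a given supervisor.

```python
# Toy model of carving a fixed TCAM pool between protocols.
# All figures are illustrative assumptions, not vendor specifications.
TOTAL_ENTRIES_K = 1024               # assumed pool size, in units of 1,024 entries
ENTRY_COST = {"ipv4": 1, "ipv6": 2}  # assumed entries consumed per route


def entries_needed_k(max_routes_k):
    """Entries (in K) needed to reserve the given per-protocol route limits."""
    return sum(ENTRY_COST[proto] * limit for proto, limit in max_routes_k.items())


def fits(max_routes_k):
    """True if the requested carve-up fits inside the assumed pool."""
    return entries_needed_k(max_routes_k) <= TOTAL_ENTRIES_K


default_split = {"ipv4": 512, "ipv6": 256}   # assumed default carve-up
ipv4_heavy = {"ipv4": 1000, "ipv6": 8}       # hypothetical post-fix carve-up
too_greedy = {"ipv4": 1000, "ipv6": 256}     # more IPv4 without giving up IPv6

for name, split in [("default", default_split),
                    ("ipv4-heavy", ipv4_heavy),
                    ("too-greedy", too_greedy)]:
    print(f"{name}: needs {entries_needed_k(split)}K entries, fits={fits(split)}")
```

In this model, raising the IPv4 limit only works by shrinking what is reserved for everything else.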
Our Lab Recreation
For the video, we wanted to see this failure for ourselves. Abacus Hardware hooked us up with a 6500 and a SUP720-3BXL, and we set out to break it properly.
First problem: these things are heavy. Second problem: the power cable uses a connector we didn’t immediately have on hand. After some rummaging, we remembered our PowerMac G5s use the same plug and borrowed one of those cables.
We set up BGP peering and started pumping in routes, about 600,000 of them filtered from today’s Internet table. Watching the 6500 chug through them was genuinely cool. The CPU maxed out, the TCAM counter climbed: 85%, 90%, 95%… and then boom: TCAM exceptions, errors everywhere, and heavily degraded performance.
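The climbing percentage is just installed routes divided by the configured limit. Here is a small sketch of that arithmetic, assuming the 512,000 figure quoted earlier as the limit (real hardware may count in units of 1,024) and a hypothetical load_routes helper. What the router actually does with routes that do not fit is not modeled.

```python
def load_routes(total_routes, hw_limit=512_000):
    """Return (routes in hardware, routes left over, utilization percent)."""
    in_hardware = min(total_routes, hw_limit)
    overflow = total_routes - in_hardware
    utilization = 100.0 * in_hardware / hw_limit
    return in_hardware, overflow, utilization


# Route counts at which the counter would read 85%, 90% and 95%.
for pct in (85, 90, 95):
    print(f"{pct}% at {512_000 * pct // 100:,} routes")

# Feeding roughly 600,000 routes into a table capped at 512,000.
in_hw, overflow, util = load_routes(600_000)
print(f"in hardware: {in_hw:,}  did not fit: {overflow:,}  utilization: {util:.0f}%")
```

Under these assumptions, a 600,000-route feed leaves 88,000 routes with nowhere to go in hardware.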
After applying the TCAM fix and waiting through a painfully slow reboot, it handled all 600,000+ routes without breaking a sweat.
512K Day ended up being more of a speed bump than a disaster, mostly because Cisco gave everyone advance warning. But it’s a reminder that the Internet is built on layers of technology, some of it decades old, all stretched way beyond what anyone originally imagined.
The routing table is over a million entries now, and hardware keeps adapting. Another 512K Day-style event will probably happen at some point, but hopefully the lessons from 2014 will keep it from becoming an outright disaster.
Thanks to Abacus Hardware for the 6500, and to Andree Toonk for sharing his experience.