IBM and Nextgen have been blaming each other for the failure of Census 2016. Based on today’s Senate Economics References Committee hearing into #CensusFail, it appears both companies were at fault to some extent. Nextgen may have incorrectly implemented geoblocking aimed at mitigating distributed denial of service (DDoS) attacks, while IBM acknowledged it should have run a real test of its routers’ resilience to failure. But Alastair MacGibbon, the Special Adviser to the Prime Minister on Cyber Security, has laid the blame predominantly on IBM for failing to handle relatively small DDoS attacks that shouldn’t have brought down the Census website.
The Australian Bureau of Statistics (ABS) Census online form website was taken down after being pounded by DDoS attacks. IBM was the IT provider for the website and it enlisted Nextgen and Telstra to be the uplink providers. Each ISP’s link was connected to its own router, which carried traffic to IBM’s datacentre.
The ISPs were instructed by IBM to implement geoblocking in anticipation of DDoS attacks to block incoming traffic from overseas.
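For readers unfamiliar with the technique, geoblocking comes down to checking each packet's source address against a list of address ranges considered domestic and dropping everything else. The following is a minimal sketch of that idea only, not how IBM, Nextgen or Telstra configured their routers. The prefixes are reserved documentation ranges used as placeholders, and the names `DOMESTIC_PREFIXES`, `is_domestic` and `filter_sources` are hypothetical.

```python
import ipaddress

# Hypothetical allowlist. These are reserved documentation prefixes used as
# stand-ins for address ranges registered to Australian networks.
DOMESTIC_PREFIXES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_domestic(source_ip: str) -> bool:
    """Return True if the source address falls inside an allowlisted prefix."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in DOMESTIC_PREFIXES)


def filter_sources(source_ips):
    """Keep only traffic from allowlisted sources; everything else is dropped."""
    return [ip for ip in source_ips if is_domestic(ip)]
```

In a two-link setup like the one described here, a rule of this kind only helps if it is applied on every upstream path, since traffic arriving over an unfiltered link bypasses it entirely.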
According to IBM engineer Michael Shallcross, who oversaw the project and spoke at the Senate hearing, when the Census website came online, it was hit by a large volume of traffic coming through the link from Nextgen, which became fully saturated. This was identified as a DDoS attack. The traffic primarily came from Singapore on a router managed by Nextgen where the geoblocking rule was not properly implemented, Shallcross said.
After some time and more DDoS attacks, IBM made the decision to restart the two routers to remediate the issue. Unfortunately, the router connected to the Telstra link didn’t restart properly due to a configuration error. The decision was made to take the website down after IBM misinterpreted data that was being sent out from its load monitoring system as a possible security breach.
The reason why IBM used two uplink providers was to provide redundancy if one connection was affected. IBM had to deal with the unfortunate scenario where both services were affected.
When questioned about whether IBM would have done anything differently if it could do it all over again, Shallcross admitted the company didn’t test the routers adequately. IBM had only completed simulations of failure scenarios. In hindsight, Shallcross said the company should have done a ‘hard’ test, which essentially means pulling the plug on the router and then powering it up again.
Speaking at today’s Senate Committee hearing, MacGibbon was called to provide an assessment of the #CensusFail incident. While there has been speculation as to whether the DDoS attacks actually happened, he assured the Committee members that those attacks did occur but they were rather small in scale.
According to information provided by Nextgen and IBM, the DDoS traffic was coming in at a rate of around 3Gbps. It’s not uncommon to see DDoS attacks that hit 100Gbps.
The system managing Census online was degraded by the DDoS attacks but they didn’t completely knock the website out; it was the ABS and IBM’s decision to pull the plug.
“It shouldn’t have caused the damage that it did,” MacGibbon said. It was clear the problem was exacerbated by insufficient communication between IBM and Nextgen, he said.
While IBM maintained that geoblocking was an effective solution to ward off DDoS attacks for Census Online, MacGibbon made it clear that better alternatives were available.
“Had it worked properly, it may have protected the site but there are other DDoS mitigation you can acquire from ISPs and it’s my understanding the services were not acquired,” he said.
Earlier in the day, IBM gave its reasons for refusing the DDoS mitigation offered by Nextgen. IBM is currently in discussion with the government over possible compensation that it will pay for #CensusFail.
Comments
7 responses to “#CensusFail: IBM Slammed For Failing To Block Puny DDoS Attacks”
There were some obvious mistakes made:
1) The Census was advertised as if it had to be done on a particular day, 09/Aug/2016, even though it was available from much earlier and could also be filled in at a later date. Making that clear would have reduced the load considerably.
2) Simple geoblocking, validation and a simple captcha (not the crazy captcha) would have cut down the malicious attacks.
3) I don’t think IBM used any queueing mechanism to handle the load. Not everything comes from the business requirements; companies should use some brains and think outside the box.
4) The Govt trusted IBM even after the failed payroll project. IBM still blames a lack of requirements but they don’t have people with the right QA skills to find the issues earlier.
5) Also, building such a simple website would have cost less than a million dollars if built by a solid contract developer with a small team of automation and manual QA and BA. A huge amount of money was wasted, and it resulted in failure.
Has any evidence of the DDoS attacks come from anywhere other than IBM or the ABS?
Singapore denies the attack came from them, they say that the ABS has equipment in Singapore and IBM just saw that data coming into the country.
IBM must be quaking in their boots… I mean this IS the Australian Government after all…
No doubt the high paid fatcats will add one more question to the RFQ’s/RFP’s so when they next give IBM millions of dollars they won’t get away with it…
IBM can’t get contracts with the Queensland Government anymore after the Health System Payroll debacle. With the CensusFail and the failure to reach major milestones in a current New Zealand project, their reputation is on the line and the Federals and States may just cut them off. Revoking the government approved provider statuses is on the table, blacklisting it from any new contracts.
Also notice that while IBM had the project, they outsourced a lot of it, and if IBM’s job nowadays is nothing but an overpaid middle man full of contractors, why not cut them out?
I mean seriously, they didn’t even do a basic power test of “turning it off and on again”, which is why it came crashing down so hard. Which is the funniest and saddest thing.
Senator Xenophon said today “The system fell over as a result of it being attacked by a pea shooter but the ABS has made it sound as though they were hit by a bazooka on this,” Ouch!
A Strategic Management lecturer once said that IBM traded off the saying “No one gets fired for buying IBM”, my Dad was a State Manager for WANG in 1987 and said he used to hear the same thing when IBM beat them despite quoting more on a tender. Somehow I don’t think it’ll work for much longer.