Chief scientist applies forensic skills to TSB meltdown

30 August 2018

In April 2018, the UK's TSB bank left up to 1.9m customers unable to access their accounts after a botched system migration that was rushed, inadequately tested and hampered by poor internal communication.

TSB, previously part of the Lloyds Banking Group, was bought in 2015 by Sabadell Bank and was due to migrate to Sabadell's in-house system, Proteo, in November 2017; that deadline was extended until April 2018. It was reported that at the time TSB was renting Lloyds' legacy systems for £100m a year.

TSB said in a statement to Reuters: “We are working round the clock to put things right and to keep our customers informed about the latest position. This is our priority at the moment.

“There was extensive testing that was completed before migration. There will of course be an investigation into why the migration did not go as expected.”

However, the fallout from the crisis suggests otherwise, according to software intelligence expert Bill Curtis, who put his IT forensic skills to the test.

This is the view of Bill Curtis, SVP and chief scientist at CAST, the software intelligence company:

We’re in the era of nine-digit defects. I’m not measuring that in bits and bytes but in dollars, euros and pounds. We’re talking £100m disasters, and this TSB one is pressing £200m. It’s quickly becoming a CEO disaster rather than a CIO-only disaster. Increasingly, when these things happen, CIOs are fired as well as the CEOs.

Here are three areas where I think TSB failed to give due consideration:

1. Data migration

The fact that customers were getting access to other customers’ accounts suggests to me they screwed up the data migration. They didn’t properly test it. Of course, that’s a huge challenge between two systems, because you have to ensure that both systems represent the same data in the same way. They didn’t have good data matching.
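To illustrate the kind of data matching described above, here is a minimal sketch of a post-migration reconciliation check. It is not TSB's or CAST's method; the record layout, the field names (`customer_id`, `account`, `balance`) and the `reconcile` helper are all hypothetical. The idea is simply to compare every record in the legacy system against its migrated counterpart and report anything missing, unexpected or altered.

```python
import hashlib


def record_fingerprint(record, fields):
    """Hash the chosen fields of one record so two copies can be compared cheaply."""
    joined = "|".join(str(record.get(f, "")) for f in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def reconcile(source, target, key, fields):
    """Compare source and target records by key (hypothetical helper).

    Assumes keys are unique within each system; duplicate keys would be
    silently collapsed and need a separate check.
    """
    src = {r[key]: record_fingerprint(r, fields) for r in source}
    tgt = {r[key]: record_fingerprint(r, fields) for r in target}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Illustrative data only: customer 2 ends up linked to customer 3's account,
# and customer 3 is not migrated at all.
legacy = [
    {"customer_id": 1, "account": "A-100", "balance": 250},
    {"customer_id": 2, "account": "A-200", "balance": 90},
    {"customer_id": 3, "account": "A-300", "balance": 0},
]
migrated = [
    {"customer_id": 1, "account": "A-100", "balance": 250},
    {"customer_id": 2, "account": "A-300", "balance": 90},
]

report = reconcile(legacy, migrated, "customer_id", ["account", "balance"])
print(report)
```

A check like this would normally run against a full rehearsal migration before go-live, with any non-empty list in the report treated as a blocker.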

2. Software structural integrity

We kept hearing about services not being available and capacity being inadequate. If this is the case, then there were structural issues as well, which would cause outages and performance degradation. We also saw evidence of fraud: unauthorised access exploited by hackers.

There are services out there that help you monitor and analyse software architecture, and that help identify the kind of coding flaws that allowed the system to crash. They might not have used one, although using one is a fairly standard part of any testing process. There’s also penetration testing, where they hack their own system to see if it holds up, as well as user acceptance testing, which they might not have done adequately; that’s when they test the system with traffic to see how it holds up.

If, as the IT team at TSB claims, it didn’t have enough time to test adequately, there’s certainly evidence to back that up, both in terms of the breadth and the duration of testing.

3. Process

This is where the actual migration did not go to plan and development overran. When you have non-IT executives promising delivery dates without a plan to deliver on those dates, that’s bad. If IT finds a problem during development, they have to go back, analyse the process and pinpoint the issue, and that requires revising the plan. That’s a classic form of disaster, because the promised date can’t legitimately be met while still testing properly.

These upgrades are very complex, as you have to migrate both data and system functionality, and it sounds like they might not have had time to test Proteo4UK, their completely new system at Sabadell.

Those are all the major problems. This isn’t confined to banking either; it’s a common problem.

The way to do this, which they didn’t follow, is to stagger the migration. Migrate 10% of your customers and see if it works.
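As a rough illustration of a staggered migration, the sketch below assigns each customer to a stable bucket and only widens the rollout when the previous wave looks healthy. Everything here is an assumption made for the example: the function names, the wave sizes (10%, 25%, 50%, 100%) and the 0.1% error threshold are hypothetical, not anything TSB or Sabadell is known to have used.

```python
import hashlib


def in_rollout(customer_id, percent):
    """Return True if this customer falls inside the first `percent` of buckets.

    Hashing the ID gives each customer a stable bucket from 0 to 99, so anyone
    in the 10% wave stays included when the rollout widens to 25% and beyond.
    """
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def next_wave(current_percent, errors, migrated,
              max_error_rate=0.001, waves=(10, 25, 50, 100)):
    """Decide the next rollout percentage (hypothetical gating rule).

    Hold at the current wave if nothing has been migrated yet or if the
    observed error rate exceeds the threshold; otherwise move up one wave.
    """
    if migrated == 0 or errors / migrated > max_error_rate:
        return current_percent
    for wave in waves:
        if wave > current_percent:
            return wave
    return current_percent


# Illustrative use: pick the first 10% cohort from 10,000 made-up customer IDs.
first_wave = [cid for cid in range(10_000) if in_rollout(cid, 10)]
print(len(first_wave), "customers in the first wave")
print("next wave after a clean run:", next_wave(10, errors=0, migrated=len(first_wave)))
```

The point of the gate is that a fault found in the first wave affects a fraction of customers and halts the rollout, rather than reaching the whole customer base at once.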

The senior executives, especially the CEO and operational executives, probably don’t have IT backgrounds. These folks need some level of intelligence into what the level of risk is and proper intelligence on the coding integrity and data migration. This gives them some level of confidence in the system and process.

I suspect they didn’t have that software intelligence at all and they jumped straight into the disaster. As a result, they don’t know what to do next. They can say whatever they want to the press, but they simply don’t have the data intelligence available in a summarised way to inform their business continuity decisions.

Bobsguide reached out to TSB, which declined to comment.

 
