IT resilience in financial services: a critical look at systemic vulnerabilities

Internal systems pose a far more serious danger to financial services organizations, and cause more IT outages, than cyberattacks, as Simon Salloway, Director, Solution Architects EMEA & APAC at Lakeside Software, explains:

  • Simon Salloway, Director, Solution Architects EMEA & APAC
  • May 30, 2025
  • 7 minutes

The U.S. Federal Reserve has reaffirmed its commitment to enhancing operational resilience within the financial sector. Meanwhile, the UK Treasury Committee revealed that major banks and member-owned building societies experienced over a month’s worth of IT outages in just two years, driven primarily by internal system failures rather than cyberattacks.

These disruptions reveal a troubling fragility in financial IT infrastructure, raising questions about the sector’s ability to withstand systemic shocks. With regulations like the European Union’s Digital Operational Resilience Act (DORA) increasing scrutiny, financial institutions must confront the hidden risks that threaten stability. This article examines the systemic vulnerabilities in financial IT systems, explores their root causes, and argues for a proactive, strategic approach to IT resilience that goes beyond compliance to address broader societal and market implications.

The “Dark Estate”: hidden risks lurking beneath the surface

At the heart of many IT failures lies the “dark estate” – the unseen layers of infrastructure, including outdated hardware, underperforming software, and unmonitored endpoints, that harbor significant risks. A striking example is a New York financial institution that uncovered $9.6 million in unnecessary costs due to outdated device replacement policies. Standard lifecycle management dictated replacing devices, but closer analysis of endpoint performance revealed that 91% of flagged devices were still fully functional. In an industry where precision and speed are paramount, such inefficiencies expose a valuable opportunity: advanced analytics and modernized endpoint monitoring can enhance operational stability and reduce costs.
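The contrast between age-based replacement and performance-based review can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by the institution described above: the `Device` fields, the threshold values, and the sample fleet are all hypothetical and would need to come from real endpoint telemetry.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    age_years: float
    avg_boot_seconds: float
    crashes_last_90d: int


# Hypothetical thresholds, chosen for illustration only.
MAX_AGE_YEARS = 4.0
MAX_BOOT_SECONDS = 90.0
MAX_CRASHES = 3


def flagged_by_age(device: Device) -> bool:
    """Standard lifecycle policy: flag any device past a fixed age."""
    return device.age_years >= MAX_AGE_YEARS


def needs_replacement(device: Device) -> bool:
    """Performance-based check: replace only if the device is actually degraded."""
    return (
        device.avg_boot_seconds > MAX_BOOT_SECONDS
        or device.crashes_last_90d > MAX_CRASHES
    )


def review_fleet(devices):
    """Split age-flagged devices into those to replace and those to keep."""
    flagged = [d for d in devices if flagged_by_age(d)]
    replace = [d for d in flagged if needs_replacement(d)]
    keep = [d for d in flagged if not needs_replacement(d)]
    return replace, keep


# Made-up sample fleet for demonstration.
fleet = [
    Device("laptop-a", 5.0, 40.0, 0),   # old but healthy
    Device("laptop-b", 4.5, 120.0, 1),  # old and slow to boot
    Device("laptop-c", 2.0, 35.0, 0),   # not flagged by age
    Device("laptop-d", 6.0, 50.0, 5),   # old and crashing
]
replace, keep = review_fleet(fleet)
print("replace:", [d.name for d in replace])
print("keep:", [d.name for d in keep])
```

In this made-up fleet, three devices are flagged by age, but only two of them show degraded performance; the third could stay in service. The same pattern, applied to real measurements, is what separates a blanket refresh cycle from an evidence-based one.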

The dark estate is not merely a technical issue; it reflects deeper systemic problems. Legacy systems, often decades old, struggle to integrate with modern applications, creating performance bottlenecks and outage risks. These systems rely on cumbersome middleware to bridge disparate components, introducing additional points of failure. The UK Treasury Committee’s findings underscore this, noting that internal system failures, not external threats, are the primary drivers of outages. Yet, many institutions remain unaware of these vulnerabilities until disruptions occur, highlighting a lack of comprehensive system oversight.

Regulatory pressures exacerbate these challenges. Compliance with DORA, which mandates robust IT risk management, or FFIEC guidelines, which emphasize operational resilience, strains already fragile systems. Institutions face a dilemma: invest heavily in modernization to meet regulatory demands or prioritize short-term compliance fixes that leave underlying vulnerabilities unaddressed. This tension raises a critical question: are financial institutions building genuine resilience or merely checking regulatory boxes?

Root causes of IT fragility

To understand the sector’s vulnerabilities, we must examine the root causes of IT fragility. First, underinvestment in modernization is a pervasive issue. Financial institutions often defer costly system upgrades, leaving legacy platforms in place. These systems, while functional, are ill-equipped for the demands of digital banking, real-time transactions, and evolving cyber threats. The result is a patchwork IT estate that is both inefficient and prone to failure.

Second, misaligned incentives between IT and business leadership contribute to the problem. Business leaders prioritize customer-facing innovations, such as mobile apps or AI-driven services, while IT teams struggle to maintain aging infrastructure on limited budgets. This disconnect creates a reactive culture where IT issues are addressed only after they disrupt operations. The UK Treasury Committee’s report highlights the consequences: over 30 days of outages in two years, each eroding customer trust and market confidence.

Finally, the complexity of modern financial systems amplifies vulnerabilities. As institutions adopt cloud services, fintech integrations, and IoT devices, their IT estates become more interconnected and harder to oversee. A single failure – whether in a legacy system or a third-party service – can cascade across the network, as seen in outages that disrupt payment systems or trading platforms. This interconnectedness raises a sobering question: in a hyper-connected financial ecosystem, can any institution achieve resilience in isolation?

The cost of reactive strategies

Many financial institutions remain trapped in reactive “break-fix” cycles, addressing IT issues only after they occur. This approach is costly and unsustainable. IDC research highlights the stark contrast: organizations adopting predictive maintenance and comprehensive monitoring reduce IT downtime by up to 30%, while reactive models see downtime costs rise by as much as 60% annually due to inefficiencies and disruptions. Beyond financial costs, outages erode customer trust, particularly for underserved populations reliant on digital banking, and risk regulatory penalties under frameworks such as DORA.

Reactive strategies also obscure the dark estate. Without systematic monitoring of all endpoints – devices, applications, and network components – institutions miss critical vulnerabilities. The New York financial institution’s opportunity to save $9.6 million is a case in point: standard policies failed to account for real-time device performance, leading to wasteful spending. Comprehensive visibility is essential not only to identify risks but also to avoid false positives that lead to flawed decisions. Yet, achieving this visibility requires a cultural shift from firefighting to prevention.

A proactive path to resilience

To address these challenges, financial institutions must adopt a proactive, strategic approach to IT resilience. This begins with comprehensive system monitoring that illuminates the dark estate. By tracking device performance, application health, and network stability in real time, institutions can identify vulnerabilities before they escalate. IDC’s findings support this, showing that predictive maintenance significantly reduces downtime. However, monitoring alone is insufficient; it must be paired with a cultural shift that prioritizes prevention over reaction.
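One simple way to picture "identifying vulnerabilities before they escalate" is a rolling baseline check on a single metric. The sketch below is an assumption-laden illustration, not a description of IDC's research or of any vendor's product: the `LatencyWatch` class, the window size, the z-score threshold, and the sample readings are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class LatencyWatch:
    """Flag readings that sit far above a rolling baseline (hypothetical helper)."""

    def __init__(self, window: int = 20, z_threshold: float = 3.0):
        self.samples = deque(maxlen=window)  # recent normal readings
        self.z_threshold = z_threshold

    def observe(self, value: float) -> bool:
        """Return True if the value is anomalous relative to recent samples."""
        anomalous = False
        # Wait for a few samples before judging, so the baseline is meaningful.
        if len(self.samples) >= 5:
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            if sigma > 0 and (value - mu) / sigma > self.z_threshold:
                anomalous = True
        # Keep anomalies out of the baseline so they do not mask later ones.
        if not anomalous:
            self.samples.append(value)
        return anomalous


# Made-up application latency readings in milliseconds.
readings = [100, 102, 98, 101, 99, 103, 100, 250]
watch = LatencyWatch()
alerts = [r for r in readings if watch.observe(r)]
print(alerts)
```

Here only the final reading stands out against the baseline built from the earlier ones. A production monitoring system would track many metrics per endpoint and use more robust statistics, but the principle is the same: compare each new observation with recent normal behavior and raise the issue before users report it.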

Another critical step is fostering accountability through dedicated roles, such as an End-User Experience Monitoring Manager. This position bridges technical performance and business impact, ensuring that issues like application latency or device slowdowns are addressed before they disrupt employees and customers. By combining real-time insights with cross-team collaboration (for example, between IT and HR), this role drives smarter resource allocation and pre-empts vulnerabilities, reflecting a broader shift toward data-driven IT management. AI-driven predictive analytics and automation can further strengthen the role by enabling real-time detection and self-healing capabilities, freeing IT colleagues to focus on strategic improvements rather than manual firefighting.

Broader implications and challenges

IT resilience in finance extends beyond technical fixes; it has profound societal and market implications. Outages disproportionately affect underserved communities, who often depend most heavily on always-on access to banking, and so exacerbate inequality. They also undermine trust in institutions, as customers question the reliability of digital services. In a globalized economy, a single institution’s failure can ripple across markets, disrupting payment networks or trading systems. The interconnected nature of finance demands collective resilience, yet competitive pressures often discourage collaboration.

Adopting proactive strategies also faces barriers. Modernization can be costly, and smaller institutions may lack the resources to overhaul legacy systems. Regulatory complexity, with overlapping mandates like DORA and FFIEC, can overwhelm IT teams, diverting focus from long-term resilience to short-term compliance. These challenges raise a provocative question: can the financial sector balance innovation, compliance, and resilience without creating new failure points?

Financial institutions can no longer rely on reactive strategies or patchwork solutions. To thrive in a digital, regulated, and interconnected world, they must embrace proactive resilience through comprehensive monitoring, strategic frameworks, and dedicated roles that prioritize performance and the experience of both employees and customers.

Resilience is not just about avoiding downtime; it’s about adapting to disruption, meeting market demands, and maintaining customer trust. Institutions that lead in resilience will do more than survive – they will set the standard for a more stable, equitable, and innovative financial sector. The question is whether the industry can move beyond compliance to embrace a transformative vision of IT resilience before the next outage strikes.