The International Consortium of Investigative Journalists’ Panama Papers investigation draws on what has been hailed as the world’s biggest-ever financial leak. I discuss the technology that helped make it possible and what the financial sector can learn from the episode.
Through the Panama Papers, released this spring, citizens around the world have learned about the offshore tax haven activity of many of their national elites, thanks to a continuing exposé of Mossack Fonseca, one of the world’s leading offshore money specialists.
The data that makes up the files was secretly obtained by German newspaper Süddeutsche Zeitung, then shared with the International Consortium of Investigative Journalists (ICIJ), a network of independent reporting teams around the world, and more than 100 media partners, including The Guardian and BBC TV’s Panorama programme in the UK. At 2.6 terabytes and 11.5m documents, the Panama Papers is a dataset that, observers agree, dwarfs anything WikiLeaks came up with in both size and impact, and it is the subject of ongoing attention at both the societal and political level in many countries.
What is the significance of all this for the banking and financial services community, however? There is an important lesson here: the investigation shows what a new way of working with data at scale can offer.
That’s because this massive example of ‘data-driven investigative journalism’ could only succeed with radically new ways of working with complex financial datasets. When the leaked Mossack Fonseca data reached the ICIJ, the team used advanced technology – technology that could process a large volume of highly connected data quickly, easily and efficiently – to manage the investigation.
That analysis had to be accessible to team members around the globe, the vast majority of whom were not technical.
It was through the use of graph databases that reporters were able to discern patterns and spot trends that weren’t visible before. Graph database technology is, in the words of the ICIJ, “A revolutionary discovery tool that’s transformed our investigative journalism process.”
Transformative in what respects, precisely? The answer is that graph technology outperforms other ways of working with data when it comes to mapping out relationships. That matters to investigators because “Relationships are all-important in telling you where the criminality lies, who works with whom, and so on.”
But plainly, it’s not just investigative journalists who can benefit from being able to work with complex data: so can any business trying to make sense of large-scale connected data.
Graphs better reflect the way we think about the world
That’s because instead of breaking up data artificially the way a relational database does, graphs use a notational structure that echoes the way we intuitively think about and work with information. And once that intuitive data model is coded in a scalable architecture, a graph database is second to none at analysing the connections in huge and complex datasets. That allows any business user to spot trends and uncover secrets in ways they have never been able to before.
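To make the idea concrete, here is a minimal sketch in plain Python of data held as things and the relationships between them, followed by a search for how two of those things are connected. The entities, relationship names and helper functions are all invented for illustration; this is not the ICIJ's data model, nor how any particular graph database is implemented.

```python
from collections import defaultdict, deque

# Hypothetical, made-up records: (source, relationship, target).
EDGES = [
    ("Person A", "OFFICER_OF", "Company X"),
    ("Company X", "REGISTERED_BY", "Law Firm L"),
    ("Person B", "SHAREHOLDER_OF", "Company X"),
    ("Person B", "OFFICER_OF", "Company Y"),
    ("Company Y", "REGISTERED_AT", "Address 1"),
]


def build_graph(edges):
    """Index every relationship from both ends so it can be walked either way."""
    graph = defaultdict(list)
    for source, relationship, target in edges:
        graph[source].append((relationship, target))
        graph[target].append((relationship, source))
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search: the shortest chain of nodes linking start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relationship, neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(EDGES)
print(shortest_path(graph, "Person A", "Address 1"))
```

The question "how is Person A connected to Address 1?" is answered by walking relationships directly, without joining several tables together. A real graph database adds persistence, indexing and a query language on top of this basic shape.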
For example, the social web giants Google, Facebook and LinkedIn have all been using graph technology to derive value from connected data: the famed PageRank algorithm at Google, which mines the links between web pages, is, at heart, a graph application, as are Facebook’s and LinkedIn’s tools for mapping real-time networks and connections to help us traverse “social graphs.”
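For readers who want to see why PageRank counts as a graph application, the sketch below shows the simplified textbook form of the algorithm in Python. It is an illustration only, under the assumption of a tiny invented web of four pages and a conventional damping factor of 0.85; it is not a description of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank along links.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked to by A, B and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page C comes out on top because three pages link to it, while D, which nothing links to, comes last. The score of each page depends entirely on its connections, which is the point.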
As graph database technology has matured, such highly scalable connected data analysis has become available to the masses. As a result, analysts like Gartner and Forrester are predicting high take-up in all sorts of enterprises.
Graphs and financial services
Why finance in particular, though? Well, our sector has long-standing issues with managing datasets: because of the way we computerised over the years, we notoriously suffer from disconnected systems that hold disjointed silos of information. Graph technology can deliver large-scale Master Data Management (MDM), a way to connect these repositories of disjointed information so as to create a more unified whole. To provide a complete ‘360-degree view,’ a graph database can help build a registry-style MDM system that stores the most useful metadata, including the location of the actual master data.
The other reason is that, if you are a financial institution, you want to be a couple of steps ahead of journalists uncovering any mismanagement or illegal transfer of funds. After all, just like data-driven journalists, the financial sector is interested in tracking down sharp financial dealings and straightforward fraud, which cost banks millions. And in the same way that graphs allowed the ICIJ investigators to ‘follow the money,’ so graphs are extraordinarily efficient at following the trail of the fraudster or scammer.
At the same time, today’s sophisticated online scams are notoriously hard to spot with traditional approaches, because those approaches work with discrete data points rather than looking at the relationships that underlie the fraud. As analyst group Gartner’s proposed approach to online fraud has it, we need to start leveraging connected data in order to detect organised fraud, which is another way of saying look at the relationships, the very thing graphs do best.
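As a rough illustration of what "looking at the relationships" can mean in practice, the Python sketch below groups accounts that share a phone number or an address, directly or through a chain of other accounts. The account records, field names and the `linked_groups` helper are all hypothetical, and sharing a detail is of course only a prompt for review, not proof of fraud.

```python
from collections import defaultdict

# Hypothetical account records; every value here is invented.
ACCOUNTS = {
    "acct-1": {"phone": "555-0101", "address": "1 High St"},
    "acct-2": {"phone": "555-0101", "address": "9 Low Rd"},
    "acct-3": {"phone": "555-0199", "address": "9 Low Rd"},
    "acct-4": {"phone": "555-0142", "address": "7 Mill Ln"},
}


def find(parent, item):
    """Follow parent pointers to the representative of item's group."""
    while parent[item] != item:
        parent[item] = parent[parent[item]]  # shorten the chain as we go
        item = parent[item]
    return item


def linked_groups(accounts, min_size=2):
    """Group accounts connected by any shared attribute value (union-find)."""
    parent = {account: account for account in accounts}
    first_seen = {}  # (field, value) -> first account seen with that value
    for account, attributes in accounts.items():
        for field, value in attributes.items():
            key = (field, value)
            if key in first_seen:
                # Same phone or address as an earlier account: merge the groups.
                parent[find(parent, account)] = find(parent, first_seen[key])
            else:
                first_seen[key] = account

    groups = defaultdict(set)
    for account in accounts:
        groups[find(parent, account)].add(account)
    return [sorted(group) for group in groups.values() if len(group) >= min_size]


print(linked_groups(ACCOUNTS))
```

Here acct-1 and acct-3 have nothing in common when compared record by record, yet both are tied to acct-2, one by phone and one by address. Only by following the relationships do the three show up as a single group, while acct-4 stands alone.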
Graph databases aren’t applicable or helpful for every problem in the financial sector. There are transactional and analytical processing needs in your bank or financial services firm for which relational technology will be the correct option (think systems of record such as your finance, HR or ERP systems). What’s more, there are other non-relational technologies that handle vast datasets well, especially in the Big Data context (e.g. NoSQL stores and Hadoop).
But a graph database makes sense for any organisation seeking to make the most of connected data. Complicated relationship datasets are what graph databases address – and that has to interest any financial services leader wanting to find new ways of working in our super-connected, data-driven market.
By Emil Eifrem, co-founder and CEO, Neo Technology.