Why financial services can no longer rely on relational databases for managing data

By Alex Hammond | 29 August 2017

bobsguide sat down with Adrian Carr, Senior Vice President, Worldwide Sales at MarkLogic Corporation to discuss the most pressing issues financial services are experiencing with their data storage solutions.


What is the major issue financial services are experiencing with their data storage solutions today?

Whatever industry they are in, our customers want a single 360-degree view of their corporate and customer data. The challenge is that all this information is often stored across numerous unconnected systems and data warehouses, many of which are invariably out of date. And companies are dealing with vast amounts of both structured and unstructured data.

Companies need to integrate all of this data, no matter what format it is in, to provide a single, real-time operational and transactional view. That view lets them make more strategic decisions and gain valuable, potentially revenue-generating insights into their business processes or customers’ preferences.

And relational databases just aren’t up to the job. They are being pushed and twisted in ways that they weren’t designed for, which costs organisations time and money. Coping with the velocity of data (the speed at which data moves and changes shape) is one challenge. Another is relational databases’ inability to manage unstructured data effectively.

Today we are used to the fact that we can perform long, complex searches and find every shape of data with platforms like Google. If you search on, say, “eclipse,” Google will return results including videos, PDFs, even Facebook profiles. Yet, when you look at relational databases, you’re fixed in your query language and can query only what you have indexed. So, in a way, that’s an anachronism.

Is agility the major characteristic banks are looking for from their databases?

Certainly, if your data is not agile you can’t answer any questions that look at different types of data, or add data to data to get different types of answers.

But for every customer I talk to, their main focus is how to integrate data from silos quickly and easily. Investment banks have been early movers, driven by regulatory pressure, but the need is rippling through the market to insurance firms and retail banks wanting to satisfy both regulatory and business agility requirements.

Where does MarkLogic fit into the picture?

MarkLogic is a different type of database. All the data that goes into the database is indexed in real time, in a similar manner to the Google search index. The net impact is you spend much less time dealing with the plumbing work of making the database work and a lot more time looking at and analysing data.

Many investment banks are struggling with a multitude of unconnected product lines and trading system fiefdoms. For example, one of our customers has talked publicly about having 30 different trading systems including FX, equities, futures and derivatives. Historically they had been required to keep these separate. Then Lehman Brothers failed and the regulator wanted to know the extent of the impact. The customer took over a week to respond because all the data was spread across so many silos.

Now, as all their trades are made (price quoted, made, confirmed, adjusted and so on), the data is placed into a MarkLogic trade hub. This then becomes the single source of truth. Today, just by taking one slice across the database, the Lehman Brothers question could be answered incredibly quickly. And by having one source of truth that is operational and transactional, it’s easier to work out what other data can be added to make the database more useful.

MarkLogic is helping to make banks and financial institutions agile and flexible enough to deal with any regulatory issue and ready to respond to whatever the regulators require.

At the moment the pressing focus is getting data in order for MiFID II but there is also FATCA to consider, plus Dodd-Frank, FRTB, not forgetting BCBS 239. The truth is that there will always be a new or changing regulation on the horizon so storing data correctly is critical.
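As a rough illustration of the "one slice across the database" idea in the trade hub described above, the sketch below uses plain Python, not the MarkLogic API. The records, field names and amounts are all invented for illustration; the only point is that once trades from different systems sit in one place, a counterparty exposure question becomes a single filter rather than a week of chasing silos.

```python
from collections import defaultdict

# Hypothetical trade records from different trading systems.
# Field names and values are made up; records need not share a schema.
trade_hub = [
    {"system": "fx", "counterparty": "Lehman Brothers",
     "notional": 5_000_000, "status": "confirmed"},
    {"system": "equities", "counterparty": "Lehman Brothers",
     "notional": 1_200_000, "status": "confirmed", "ticker": "XYZ"},
    {"system": "derivatives", "counterparty": "Other Bank",
     "notional": 9_000_000, "status": "quoted"},
    {"system": "futures", "counterparty": "Lehman Brothers",
     "notional": 800_000, "status": "adjusted"},
]


def exposure_by_system(hub, counterparty):
    """Sum notional amounts per source system for one counterparty."""
    totals = defaultdict(int)
    for trade in hub:
        if trade.get("counterparty") == counterparty:
            totals[trade["system"]] += trade["notional"]
    return dict(totals)


exposure = exposure_by_system(trade_hub, "Lehman Brothers")
total_exposure = sum(exposure.values())
print(exposure, total_exposure)
```

In a real deployment the filter would run against the database's indexes rather than an in-memory list; the sketch only shows the shape of the question.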

What would be the standard implementation time and cost for a MarkLogic system? Is it cheaper and quicker than internal development?

Yes, very much so. Our approach is agile development: in other words, launching an initial solution and then iterating on the build. It took under five months to get phase one live for the investment bank I mentioned earlier.

Just about every MarkLogic customer goes through a proof step. They need to see their data in our system; otherwise, they don’t believe we’ll be able to deliver. For another global investment bank we’re working with currently, we loaded all their data in a week, which is incredibly fast.

Traditionally, this would take much longer because you would need to examine all of the data, model it, map it, build a schema, test the schema, go through sizing, and debate about data values to index everything before the metaphorical hose pipe is plugged into the content pump.

And that was the shift: because the MarkLogic database is format-agnostic, the data doesn’t need to be transformed, so you don’t lose any of the data provenance you might lose with traditional ETL (Extract, Transform, Load) methods. The regulator wants to know the provenance of the data presented to them, and they want to see one source of truth, which banks can now have.
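One common way to load data as-is while keeping its provenance is to wrap each source record, untouched, in an envelope that records where it came from. The sketch below is a minimal plain-Python illustration of that idea under stated assumptions: the function name, field names and sample record are hypothetical, and it is not a description of how MarkLogic stores data internally.

```python
import json
from datetime import datetime, timezone


def wrap_with_provenance(raw_record, source_system, loaded_at=None):
    """Keep the source record unchanged and attach provenance metadata."""
    loaded_at = loaded_at or datetime.now(timezone.utc)
    return {
        "provenance": {
            "source_system": source_system,
            "loaded_at": loaded_at.isoformat(),
        },
        # The original payload is stored verbatim, not remapped to a schema.
        "original": raw_record,
    }


# Hypothetical raw record exactly as a source system might emit it.
raw = '{"TradeID": "T-1", "Cpty": "Example Bank", "Amt": "1000000"}'
envelope = wrap_with_provenance(
    raw, "fx-desk", datetime(2017, 8, 29, tzinfo=timezone.utc)
)
print(json.dumps(envelope, indent=2))
```

Because the original payload is never rewritten, anyone auditing the record can still see exactly what the source system produced and when it was loaded.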

This is the first significant evolution in databases in 30 years. Has the increased amount of regulation post-2008 been the driver of change?

I think some of the drivers came from the smart people innovating the technology, but we’re finding that one of the biggest pulls from the market is regulatory. Certainly, if the technology hadn’t been made available to start with, there wouldn’t be the pull, but we’re seeing that the imminent threat of large fines is a big driver.

Do you encounter any resistance or sceptics in the market, with legacy database systems having been in place and working for so long?

Ultimately, our biggest challenge is what we describe as relational thinking. We have customers who load the data and have to put it in a table, and that’s the only way they can understand the outputs. MarkLogic doesn’t need the table, but some customers think they need it.

The MarkLogic database is truly a disruptive technology, so when we engage with a company we have to undertake a certain level of cultural assessment.

Some financial institutions will tell us they have a database that has been running perfectly for 30 years. But that’s because they have got used to having big teams of people doing ETL and plumbing work, all day every day. There are many businesses that are just not ready for the level of real change that comes through implementing a technology that will have such a substantial impact on their status quo in a key area.

On the other hand, we sign customers who are early adopters and so don’t feel the need to ask where in their industry the technology has been successful before.

Investment banks are also used to operating with that mind-set. Technology is a competitive weapon for investment banks. For example, if you’ve got an internet switch that has a great deal of latency compared to the industry’s best, then you will struggle. If you’ve got a technology that provides an investment bank faster insight or gives asset managers an additional edge, you’ll make hay. Our biggest barrier is the status quo of relational data.

What do you think are the next trends around the corner for data management?

We talk to a lot of analysts such as Gartner and 451 Group, and we’re currently seeing a wide diversification within the database world. We know that one day this diversification will all converge. The prediction is that in future there will be only plain databases, no matter what type of data companies or banks want to load into them. Additionally, no time will be spent focusing on the format, because it will be changing by the week, and by the month. This means all useful databases will need to be format-agnostic. The time and resources organisations save in this process will be spent looking into how greater value can be extracted from the data.

The intermediate step is what we call “multi-model.” If you’ve got a database that is capable of only one function while the world is converging, in order to survive this stage, you’re going to have to build or partner with a number of providers to gain the full stack of database functionality needed in today’s dynamic business environment.

The second trend, which we’re seeing happen at different speeds throughout the world, is the separation from Hadoop. Eighteen months ago, there was real confusion between Hadoop and many of the new-wave and NoSQL databases; that is dissipating now. Certainly in the US, companies have determined that it makes sense to have a mind-set where there is a preferred SQL, a preferred Hadoop, and a preferred NoSQL database. And that becomes the stack. But ultimately those three types will converge too.