Financial institutions must look to educate existing staff on data practices before hiring more data scientists, agreed a panel at the Financial Information Management conference in central London this week.
Simon Gordon, head of risk information services at Barclays, said the bank has begun training its staff in computer programming languages.
“I use the data to ask a lot of questions around risk, around our portfolio, where we are exposed, and really just looking for very quick answers. I don’t mind what solutions or software they are using… it’s just 'can I trust the data?' and 'can I make some key decisions with that data?' and the faster I get that answer back with certainty, the happier I am,” said Gordon.
“What we have done though, is now I am retraining a lot of my risk managers to use Python and R, so that they can actually access the data and answer the questions themselves, and that has been great. Now we can trust the data, we can actually start deploying it.”
Banks are increasingly looking for analysts and junior traders to learn computer programming languages. In March, Goldman Sachs said it was looking to add more programming languages for analysts in its securities division through its Securities Knowledge Exchange. Citigroup also set up a collaboration lab in May for its traders to learn to code, and its coders to learn how to trade, Bloomberg reported.
There needs to be a balance between the number of data scientists and data engineers within a financial institution, said Roshan Awatar, data strategy director at Lloyds.
“We spend a lot of time talking about data scientists, but actually we need to complement them with data engineers. I think we need to balance out our hiring and reskilling… or we are going to go to an imbalance where we have too many data scientists and not enough data engineers.”
Barclays’ Gordon said artificial intelligence was starting to be used in pockets of the bank.
Mark Howden, deputy chief data officer at Santander CIB, said that the bank is looking to reduce errors by limiting the human element in the creation of data.
“What we are trying to do is limit human involvement in inputting data, because it is humans that cause most of our data quality problems. So, we are looking at pure automation, we are looking to automate through open APIs into global data aggregators and then push that data down the stream.”