Data Quality Management (DQM) remains in focus for many banking organizations because of several key business drivers: regulatory compliance, access to quality information, and predictive insights that support business decisions and shape future business strategy. DQM is one of the key functions within Data Governance; its focus is to manage and improve the quality of data within the organization so that the desired business objectives can be achieved. Data quality is generally measured on completeness, accuracy, integrity, consistency, uniqueness and timeliness. DQM encompasses the functions represented in the image below.
Figure 1 – Data Quality Lifecycle
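Two of the dimensions listed above, completeness and uniqueness, can be made concrete with a small sketch. The function names, the field names and the sample records are invented for illustration, and the definitions used here (share of non-empty values, share of distinct values) are one common reading of those dimensions rather than a standard that every bank applies.

```python
from typing import Any


def completeness(records: list[dict[str, Any]], field: str) -> float:
    """Share of records in which `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def uniqueness(records: list[dict[str, Any]], field: str) -> float:
    """Share of the non-empty values of `field` that are distinct."""
    values = [r[field] for r in records if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical customer records: one empty tax_id and one repeated tax_id.
customers = [
    {"customer_id": "C001", "tax_id": "AB123"},
    {"customer_id": "C002", "tax_id": ""},
    {"customer_id": "C003", "tax_id": "AB123"},
    {"customer_id": "C004", "tax_id": "ZX987"},
]

print(round(completeness(customers, "tax_id"), 2))  # 0.75
print(round(uniqueness(customers, "tax_id"), 2))    # 0.67
```

A data quality engine would typically compare scores like these against thresholds and raise an issue when a score falls below its threshold.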
Data, as they say, is the new oil: the cleaner the data, the more efficient a bank becomes in its business decisions, predictive insights and so on. Data quality remediation has traditionally been a pain area for many financial institutions. A structured data quality remediation process is essential, because it ensures that data quality issues are handled and resolved consistently and efficiently; however, it is observed that most banks do not have a sophisticated approach to it. Although a plethora of data quality tools is available, the data quality resolution process in many banks, across lines of business, is generally very manual, time-consuming and low on controls.
A typical manual data quality remediation follows the process below:
1. The data quality engine churns out data quality issues.
2. These are routed to the respective data quality (DQ) analysts for remediation, either through the data quality system workflow or manually through Excel sheets sent by email.
3. The DQ analyst teams then manually categorize these issues into data quality categories based on business rules.
4. The categories are distributed by email to DQ analysts, who conduct the necessary fact finding on the issues.
5. Data quality issues that follow a standard remediation path are addressed by the business rules; this is often a routine activity.
6. Data quality issues that are not addressed in the earlier step are studied further by the DQ analysts, who look at past occurrences and provide a suitable remediation; the issues are then closed in the data quality system.
As we observe, most of the process has many manual touch points, is low on auditability and is time-consuming. Data quality remediation cannot be fully automated, since new kinds of errors may be encountered that can only be resolved through manual intervention; however, a sizeable number of data quality issues can still follow an automated path in combination with a machine learning capability. In this case, a cognitive RPA (Robotic Process Automation) solution, which combines machine learning with traditional RPA capabilities, can be a potent option that enables banks to remediate data quality issues faster and greatly reduces the need for manual intervention.
The automation has some prerequisites: the standard business rules that DQ analysts refer to when analyzing and remediating DQ issues, and a considerable training data set of all data quality issues captured by the bank over a significant time period. The training data set acts as the input to the machine learning model, on the basis of which the model can recommend a suitable remediation for a DQ issue to the analyst. The machine learning models are dynamic in nature: they continuously learn from newer remediations as and when these are captured in the training data set, thereby covering newer instances and reducing the need for manual intervention in the future.
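The idea of recommending a remediation from past resolutions, and of learning from each new one, can be sketched with a toy nearest-neighbour lookup. This is a deliberately simple stand-in for whatever machine learning model a bank would actually train: it matches a new issue description to the most similar past description by word overlap (Jaccard similarity). The class name, the similarity threshold and the sample issues and remediations are all assumptions made for illustration.

```python
import re
from typing import Optional


def tokens(text: str) -> set[str]:
    """Lower-case word tokens of an issue description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Word overlap between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


class RemediationRecommender:
    """Toy stand-in for an ML model: suggests the fix of the closest past issue."""

    def __init__(self, min_similarity: float = 0.3) -> None:
        self.history: list[tuple[set[str], str]] = []
        self.min_similarity = min_similarity  # assumed cut-off, not a tuned value

    def learn(self, description: str, remediation: str) -> None:
        """Add one resolved issue to the training data."""
        self.history.append((tokens(description), remediation))

    def suggest(self, description: str) -> Optional[str]:
        """Return the remediation of the most similar past issue, or None."""
        query = tokens(description)
        best_score, best_fix = 0.0, None
        for past_tokens, fix in self.history:
            score = jaccard(query, past_tokens)
            if score > best_score:
                best_score, best_fix = score, fix
        return best_fix if best_score >= self.min_similarity else None


recommender = RemediationRecommender()
recommender.learn("missing country code on customer address",
                  "look up country from postal code")
recommender.learn("duplicate customer record with same tax id",
                  "merge duplicates into golden record")

# A close match to a past issue yields that issue's remediation.
print(recommender.suggest("customer address has missing country code"))
# An unfamiliar issue yields None and is left to the analyst.
print(recommender.suggest("invalid swift code on payment"))
```

Once the analyst resolves the unfamiliar issue, a call to `learn` with that resolution makes it available for the next similar issue, which is the continuous-learning loop described above.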
A high-level view of the cognitive RPA solution may encompass the following:
a) The RPA tool integrates with the existing DQ system; alternatively, the RPA tool can pick up automated mails or issue lists coming in from the DQ system.
b) The RPA bot (bot being a shortening of robot) then categorizes the DQ errors based on business rules defined by the bank.
c) The bot automates the remediations for issues that follow a standard remediation path as per the bank's policies.
d) The RPA bot runs the remaining issues through a cognitive solution which, based on the earlier training data set of DQ resolutions, suggests a remediation.
e) The RPA-based and ML-based remediations are compiled by the RPA bot and finally emailed, with a log file, to the DQ analyst.
f) The DQ analyst may accept or reject the suggestion; in either case, the existing training data set is updated with the learnings of the DQ analyst.
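Steps b) to f) can be sketched as a small pipeline. This is only an outline under stated assumptions: the rule definitions, category names, standard fixes and function names are hypothetical, the ML component is passed in as a plain `suggest` callable, and the integration with the DQ system and the emailing of the report are left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Issue:
    issue_id: str
    description: str


# Hypothetical business rules: (category, predicate); the first match wins.
CATEGORY_RULES = [
    ("missing_value", lambda issue: "missing" in issue.description.lower()),
    ("duplicate", lambda issue: "duplicate" in issue.description.lower()),
]

# Hypothetical standard remediation paths, keyed by category.
STANDARD_FIXES = {"missing_value": "request the value from the source system"}


def categorize(issue: Issue) -> str:
    """Step b): assign a category using the business rules."""
    for category, matches in CATEGORY_RULES:
        if matches(issue):
            return category
    return "uncategorized"


def run_bot(issues: list[Issue],
            suggest: Callable[[str], Optional[str]]) -> list[dict]:
    """Steps c) to e): apply standard fixes, ask the model for the rest."""
    report = []
    for issue in issues:
        category = categorize(issue)
        if category in STANDARD_FIXES:
            fix, source = STANDARD_FIXES[category], "rule"
        else:
            fix = suggest(issue.description)
            source = "ml" if fix is not None else "manual"
        report.append({"issue_id": issue.issue_id, "category": category,
                       "remediation": fix, "source": source})
    return report


def record_decision(training_set: list[tuple[str, str]], issue: Issue,
                    suggested: Optional[str], accepted: bool,
                    analyst_fix: Optional[str] = None) -> None:
    """Step f): store the final remediation, whether accepted or overridden."""
    final_fix = suggested if accepted else analyst_fix
    if final_fix is not None:
        training_set.append((issue.description, final_fix))
```

Entries with `source` set to `"manual"` are the ones for which neither the rules nor the model produced a remediation; they go to the analyst untouched, and the analyst's resolution is what `record_decision` feeds back into the training data.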
Figure 2 – Suggestive illustration of a Cognitive RPA Solution in action
As we see, this automated solution helps address the key challenges faced in a data quality remediation process:
a) Reduces manual intervention – the machine learning models keep getting updated with newer data quality remediations, which helps the model resolve similar data quality instances in the future.
b) Reduces turnaround time for resolutions – an automated solution can bring down the manual effort involved to a great extent. Even in cases where the automation cannot arrive at a remediation for a data quality issue, it can still reduce the time involved in analyzing the issue.
c) Places better controls on the process – the RPA solution puts a control wrapper around the entire setup and can also create logs and dashboards that give the user information on the status of the automation run, exceptions encountered during the run, etc.
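The control wrapper in point c) can be illustrated with a minimal sketch that logs the outcome of every issue and keeps a count per outcome, including exceptions. Commercial RPA tools ship their own logging and dashboard features; the function name, the logger name and the outcome labels below are assumptions used only to show the idea with the Python standard library.

```python
import logging
from collections import Counter
from typing import Callable, Iterable

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("dq_bot")  # hypothetical logger name


def process_run(issue_ids: Iterable[str],
                handler: Callable[[str], str]) -> Counter:
    """Run `handler` on every issue, log each outcome, and count the outcomes."""
    summary: Counter = Counter()
    for issue_id in issue_ids:
        try:
            outcome = handler(issue_id)
            summary[outcome] += 1
            log.info("issue=%s outcome=%s", issue_id, outcome)
        except Exception:
            # A failure on one issue is recorded and does not stop the run.
            summary["exception"] += 1
            log.exception("issue=%s failed", issue_id)
    return summary
```

The returned counts are the kind of figures a run dashboard would show, while the log lines provide the per-issue audit trail.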
Although RPA was initially perceived as a point-in-time or tactical solution, its expanding capability to integrate with other digital technologies such as machine learning and NLP is slowly but steadily changing that perception: it is increasingly seen as an option that can also support strategic and complex use cases.