Data Science Transformation
Growth in data science, implications for Model Risk and IT groups
Artificial Intelligence has occupied our collective mind-space over the last few years, and the advent of ChatGPT has hastened the already ongoing digital transformation of the financial services industry. Business users are increasingly able to integrate data science directly into their daily lives through citizen data science tools. With the explosion in the scale, variety, and complexity of models, the way data science is organized needs to be reimagined, with due consideration given to ethical bias and data privacy.
Below we share our thoughts on this explosive growth in data science, implications for Model Risk and IT groups, the right framing for a ‘data science operating model’ conversation, and how we can help address these challenges.
What’s ahead of us?
A massive growth
- We, as individuals and retail clients, are surrounded by AI, through customer services, social networks, or, more recently, ChatGPT. The Financial Services B2B world, especially in non-client-facing functions, is significantly behind other sectors when it comes to AI adoption[1].
- As a matter of fact, Valuates projected that the global artificial intelligence market would grow at a CAGR of 38.0% from 2021 to 2030.
- This growth will be carried in part by the fast-growing number of Data Scientists: the United States Bureau of Labor Statistics projects 36% employment growth for that profession between 2021 and 2031, versus about 5% on average for all occupations.
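To make the compounding concrete, the short sketch below converts a compound annual growth rate into a total growth multiple. The function names are illustrative, and the only input taken from the text is the 38.0% CAGR over the nine years from 2021 to 2030, which implies a market roughly 18 times its 2021 size.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth multiple implied by a compound annual growth rate."""
    return (1.0 + cagr) ** years


def annualized_rate(multiple: float, years: int) -> float:
    """Inverse operation: the CAGR implied by a total growth multiple."""
    return multiple ** (1.0 / years) - 1.0


# 38.0% CAGR compounded over the nine years from 2021 to 2030.
multiple = growth_multiple(0.38, 2030 - 2021)
print(f"Implied market size multiple: {multiple:.1f}x")
```

The same two helpers can be used to compare any figure quoted as a cumulative percentage with one quoted as an annual rate, which is a common source of confusion when reading growth projections.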
Citizen developers & the explosion of use cases
- However, the biggest driver is the emergence of techniques and tools that enable Citizen developers within business and operational functions.
- Citizen developers can use AI models as they would use MS Excel functions, knowing only the inputs and the outputs of the models, not necessarily the “inside” of the engine.
- One could compare the upcoming evolution to how today’s smartphones achieve better performance than the first computer rooms that existed 50 to 60 years ago. Only, this AI jump is coming much, much faster.
- The resulting innovations should be overwhelming, since the focus of “Citizen” developers will be the use cases and not the models. The exponential adoption of ChatGPT and other Large Language Models within a few months is a testament to that transformation.
Significant Model Risk transformation along that massive growth
Model Risk Management must pivot rapidly from a point-in-time activity to a continuous activity, with the need to understand and manage new risks, threats, and ethical and sustainability concerns.
- Continuous model validation
- The design of Model Validation processes was heavily influenced by US Capital Planning regulations[2], resulting in an annual review of all existing financial models, with new models reviewed throughout the cycle.
- By their very nature, AI Models have variables and “decisions” that change very rapidly. This calls for a profound transformation of Model Validation processes, with a switch to “continuous” model validation.
- The sharp increase in the number of AI Models, both observed to date and widely expected, driven in large part by Citizen developers, also requires much higher validation capacity.
- Explainability and interpretability
- Both regulations and users demand explainability and interpretability of results to avoid a “black box” reality: users and regulators need to be able to understand the models’ decisioning process and outcomes, and to make valid inferences based on those outcomes.
- Modeling side effects: ethics & data privacy
- Many well-known retail examples of ethical issues have been made public, notably around biases in social media algorithms. Similar shortcomings arise with B2B models when proper policies and limits are missing.
- Handling private data in the modelling value chain is both complex and challenging, given the traceability and utilization requirements of the now well-known GDPR, CCPA and equivalents.
- Sustainability
- Finally, corporate sustainability objectives need to be included very early in compute-intensive tasks, rather than after the fact.
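As an illustration of what one “continuous” validation check could look like, the sketch below computes a Population Stability Index (PSI) between the data a model was developed on and the data it currently scores. PSI is only one common drift measure and is our choice for the example, not a prescribed method; the function names and the 0.10 / 0.25 thresholds are rule-of-thumb assumptions that each institution would set in its own policy.

```python
import math
from bisect import bisect_right
from statistics import quantiles


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one variable."""
    # Interior bin edges are quantiles of the reference sample
    # (n_bins - 1 cut points, e.g. the deciles for n_bins=10).
    edges = quantiles(reference, n=n_bins)

    def bin_shares(sample):
        counts = [0] * n_bins
        for x in sample:
            counts[bisect_right(edges, x)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(sample), eps) for c in counts]

    ref = bin_shares(reference)
    cur = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def drift_status(psi, warn=0.10, alert=0.25):
    """Map a PSI value to a status; the thresholds are illustrative only."""
    if psi >= alert:
        return "alert"
    if psi >= warn:
        return "warn"
    return "stable"
```

Run on a schedule, or on every scoring batch, a check of this kind turns an annual review into an ongoing one: a "warn" or "alert" status can trigger a targeted revalidation instead of waiting for the next cycle.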
Legacy IT inefficiencies and scaling costs
- Legacy architecture costs
- A single, unified architecture is an IT fantasy that often conflicts with modeling needs and regulatory timelines. Most institutions have (or had) no other option than to let Quant teams work with distinct modeling tools and environments, creating significant redundancies in data extractions.
- This is a typical source of stress for both IT and compliance departments: End-User Computing (EUC) tools multiply, and data quality issues lead to costly controls and potentially even more costly regulatory remediation requests.
- Scaling costs
- While Cloud computing has enabled many analyses that were previously inconceivable, computing costs can become significant, and therefore cloud resources need to be used only when required, prioritized and allocated dynamically.
Fragmented Operating Model and sub-optimal governance
The management of the Data Science “function”, together with its governance and its own supporting sub-functions, is already complex due to its ubiquity, and will become even more so with the projected increase in AI Models.
- Organizational fragmentation
- To answer business needs and changes rapidly and specifically, Quantitative teams need to be aligned to their business areas rather than sitting in central teams, a set-up that is usually only relevant in a “start-up” phase.
- In addition, there are many guests at the party beyond Quantitative teams: IT, Data, business, and Risk & Compliance. This results in a lack of ownership, leaving efficient collaboration to the goodwill of these departments.
- This has often resulted in issues and eventual MRAs[3] within CCAR processes in the banking industry.
- Evolving nature and scale
- Integrating new Quant teams focusing on AI, often with different enterprise cultures, adds to the existing complexity.
- The sharp increase in models and use cases amplifies the existing problems.
So, what are the solutions? Which approach?
- To enable business scaling while addressing such a multi-faceted problem – ranging from Model Risk Management to Operating Model, and IT costs, across all business lines and most functions – it’s usually easier to anchor the transformation around a common denominator.
- We believe that the IT platform can usually play that role, as long as it is comprehensive enough and allows substantial progress in each transformation phase, preferably in an agile manner that equally enables the non-IT streams.
- The advent of MLOps platforms enables End-to-End Financial and/or Model Management
- Simply put, MLOps platforms are to Model Development what DevOps is to IT Application Development: they enable cooperation, fast development, IT controls and cost management.
- In the case of Model Management, MLOps platforms allow Quant teams, IT teams, and Model Risk Managers to cooperate on a single platform. Model developers can use a wide variety of modeling tools on a standardized infrastructure layer, including elastic compute and unified data access, while Management teams can easily prioritize computing requests.
- Data Quality
- A unified but federated Data Science Operating Model (DSOM) covering Financial/Risk models as well as AI models, whatever the end-user business group
- Very large financial institutions have often developed End-to-End Model Management capabilities for financial and risk management models, whether triggered by the regulators for the sell-side or by business need for the buy-side.
- Extending and standardizing these capabilities to AI models is less straightforward, since many AI initiatives have been championed by different business groups, e.g., Sales and Marketing.
- For the above reasons, as well as mounting regulatory demands (see below), Financial Institutions need to transition and/or progress, depending on their maturity levels, towards a holistic but federated DSOM.
- This transition is not only possible but necessary for small and medium-size institutions, given the ubiquity of AI models and the recent dazzling emergence of generative models in both the retail and institutional worlds.
Model Risk Management
Below are a few potential evolutions in Model Risk Management processes and techniques.
- Implement the integrated, federated Data Science Operating Model on a common MLOps platform that allows Model Validators and Risk Managers to be involved from the get-go, so that they understand the modelling assumptions and the data series much faster, with traceability into every model version.
- Build and leverage a “self-service” model, based on model tiering inclusive of AI Models, where model developers, both Quants and Citizens, contribute early and strongly to Model Validation.
- Leverage newer techniques and tests, e.g., Shapley values, that help with the explainability of AI/ML models.
- Ensure a proper implementation of knowledge management processes within the MLOps platform, from model components to data series, model performance metrics and IT processes
- Define a set of model risk management limits within an integrated Model Risk framework, easy to understand by Quants and Citizen developers, business users, IT partners, and auditors.
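To illustrate the Shapley values mentioned above, the sketch below computes them exactly for a small model by enumerating every coalition of features, with features outside a coalition replaced by a baseline value. The function name, the toy models and the all-zero baseline are assumptions made for the example; exact enumeration grows exponentially with the number of features, so real validation work relies on approximation methods such as those implemented in dedicated libraries.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline)."""
    n = len(instance)

    def value(coalition):
        # Features in the coalition take the instance value,
        # all others are held at the baseline value.
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(x)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        attributions.append(total)
    return attributions


# Hypothetical toy model: a linear score over three inputs.
def linear_score(x):
    return 2.0 * x[0] + 3.0 * x[1] - 1.0 * x[2]


phi = shapley_values(linear_score, instance=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print(phi)
```

For a linear model each attribution equals the coefficient times the distance from the baseline, and the attributions always sum to the difference between the prediction and the baseline prediction. That additivity is what makes the technique convenient for explaining an individual decision to a validator or an auditor.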
Small or large, every financial institution is now bound to embrace AI or accelerate its transition, pressed by competition and clients alike.
The transition to an overall Data Science Operating Model, covering both legacy financial models and AI models and enabled by an MLOps platform, is paramount to scale rapidly and efficiently while controlling that growth and properly managing Model Risks.
[1] Artificial Intelligence Index Report 2023 – Institute for Human-Centered AI, Stanford University, April 2023.
[2] SR 11-7, SR 15-18 and SR 15-19, further modified by the Tailoring rules in 2021. In addition, the OCC introduced new guidance on Model Risk Management, including AI models, through its Comptroller’s Handbook. In the UK, the Bank of England published Consultation Paper CP6/22 on Model Risk Management, which also covers AI models.
[3] MRA: Matter Requiring Attention, first level of regulatory compulsory action after a supervisory review