How To Build a Data Foundation for the AI Era

JUNE 17, 2025

More than half of respondents to a KPMG survey (54%) said their bank has “implemented” foundational data capabilities. But few claim data capabilities that are “fully developed and operational,” and in the age of AI, the last step is increasingly out of reach. As we’ve written, the more data banks have, and the more effectively they process and use data, the more competitive they will be. But some don’t even have basic capabilities designed to fix siloed, inconsistently formatted information.

To succeed in the long run, a bank’s leadership will need a clear vision of the future of data as it applies to their institution and the industry overall. And, to make that vision a reality, they will need a data strategy that’s linked to enabling technology and business practices. In our research, we’ve observed that successful data strategies follow three principles:

Data is a strategic asset that’s crucial to decision-making.
Effective governance is fundamental to data quality and compliance.
Systems that create and process data should be seamlessly interconnected.

Applying these principles, banks can begin to think about infrastructure. Based on our research and experience, forward-thinking data strategies typically start with established tools and look ahead to the mainstreaming of AI across the bank. There are three main tiers to be aware of when building a data stack:

Application layer: Contains customer- or employee-facing applications such as the general ledger, ERP, and CRM; loan origination and operations; digital banking; payment systems; and risk and compliance tools. All applications create data, which is frequently interrelated, but the data they produce may be in different formats, have different characteristics, and be difficult to reconcile.

Processing layer: Collects, stores, and transmits data from applications. This layer will include a data warehouse, data lake, or combination of the two, so that data created across the bank can be queried. It will also include data extraction software that pulls and cleans data from disparate systems. Event-driven architecture and modern APIs enable the latest generation of data management systems to stream data between applications.

Intelligence layer: This is where AI models are developed, trained, or validated. As we’ve also noted, AI models depend on high volumes of data, ideally made available to them in real time. Models then need to be deployed and monitored for compliance. The intelligence layer dovetails with the application layer to provide real-time access to the bank’s data.
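
To make the reconciliation problem concrete, here is a minimal sketch in Python of the kind of cleanup the processing layer performs. It is an illustration under stated assumptions, not a reference implementation: the two source systems, their field names (CustID, customer_no, borrower, and so on), their date formats, and the CustomerRecord schema are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical common schema that every source is mapped into."""
    customer_id: str
    full_name: str
    opened_on: date
    source: str


def from_crm(row: dict) -> CustomerRecord:
    # Assumed CRM export: padded IDs, separate name fields, US-style dates.
    return CustomerRecord(
        customer_id=str(row["CustID"]).strip(),
        full_name=f'{row["First"].strip()} {row["Last"].strip()}',
        opened_on=datetime.strptime(row["Opened"], "%m/%d/%Y").date(),
        source="crm",
    )


def from_loan_system(row: dict) -> CustomerRecord:
    # Assumed loan-origination export: zero-padded IDs, "LAST, FIRST" names, ISO dates.
    last, first = [part.strip().title() for part in row["borrower"].split(",", 1)]
    return CustomerRecord(
        customer_id=str(row["customer_no"]).lstrip("0"),
        full_name=f"{first} {last}",
        opened_on=date.fromisoformat(row["open_date"]),
        source="loan_origination",
    )


# Illustrative rows only; not real data.
crm_row = {"CustID": " 10482 ", "First": "Ana", "Last": "Silva", "Opened": "03/14/2021"}
loan_row = {"customer_no": "0010482", "borrower": "SILVA, ANA", "open_date": "2021-03-14"}

records = [from_crm(crm_row), from_loan_system(loan_row)]

# Once both rows share one schema, matching them is a simple comparison.
same_customer = records[0].customer_id == records[1].customer_id
```

Mapping each source into one shared schema at the point of extraction keeps system-specific quirks out of the warehouse or lake, so downstream queries and models can treat the bank’s data as a single, consistent set.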

With this stack in mind, banks may adopt a roadmap for building out their data capabilities:

Awareness: Understand the value of holistic data to a bank’s performance, what may limit data use, and the organization’s need for technology that allows data to be used efficiently and effectively.

Vision: Articulate the types of data needed to inform business planning and identify the technology that will support creating, collecting, and seamlessly accessing that data for all who need it.

Foundation: Roll out technology in line with the bank’s data strategy; create a data governance framework such that data from different systems can be used effectively in business processes across the organization.

Scale: Reach the point at which technology is integrated such that finding and using data is no longer a chore, because the data layer has been embedded into the bank’s workflows and access enables advanced analytics.

It’s important to remember that this is a long-term effort. The first step is to adopt the right mentality: to see data as a durable advantage and aim for decision-making driven by holistic, up-to-date insights. Tactical considerations regarding technology will follow and should be aligned strategically to the bank’s needs and resources.
