How data assurance standards and tools underpin AI

Steve Ellis

22 September 2023

The power of data in the modern world is undeniable. Whether it’s wielded to create advanced new products and services, or rigorously assessed to ensure bias does not infect critical decision-making algorithms, management of this precious resource matters in the data economy. For every use case, data quality and the processes behind collecting, accessing and sharing data are crucial to ensuring it fulfils its intended purpose. Ultimately, this comes down to how trustworthy operationalised data is, which is where data assurance comes into play.

Our client, the Open Data Institute (ODI), defines data assurance as ‘the process, or set of processes, that increase confidence that data will meet a specific need, and that organisations collecting, accessing, using and sharing data are doing so in trustworthy ways.’ As data regulations evolve, along with consumer education around data rights and restrictions, businesses must have the means of providing data assurance to customers, partners and internal and external stakeholders.

Commissioned by the ODI, the Metia Insight team conducted extensive research into the data assurance market, surveying 791 data assurance professionals across industries and regions. The findings reveal positive trends, from increased investment in data assurance products and services, to culture and leadership that advocate for an organisational understanding of data assurance. With nearly two-thirds of data assurance professionals reporting to CIOs, CEOs and COOs, it’s clear that the C-suite is taking an interest in the management and quality of data, particularly as mismanagement can lead to significant outages, fines and reputational damage. And with 92% of respondents stating that their organisation already has budget allocated for data assurance for the year ahead, further investment is clearly a priority.

Trust has always been crucial to the success of any business. With 94% of data holders and data users (organisations that use external data assurance services) believing that external data assurance boosts trust, yet 66% believing that existing services do not fully meet their needs, conditions are favourable for providers. So favourable, in fact, that the data assurance market is estimated to reach US$5.6 billion by 2027.

In the age of the algorithm, anxiety around how data is operationalised is increasing. We may not always fully understand how machine learning models arrive at their decisions, but if we can be assured that their training data is clean, trustworthy and permitted to be used, they become much less problematic. 

As regulations around data and AI evolve, businesses must ensure that they have a robust data posture that takes into account technical infrastructure, culture, and best practice. AI is the ultimate use case for data and underpins all long-term data strategies. This has been made abundantly clear by the advent of generative AI, and the interest in the business use cases for the likes of ChatGPT and other large language models (LLMs). 

The problem with LLMs is that they are black boxes. Their effectiveness relies, in practice, on the sheer scale of the datasets from which they learn, which makes them all but inscrutable. In a recent paper by ChatGPT’s creators, OpenAI, the authors acknowledge in the opening sentence, “Language models have become more capable and more widely deployed, but we do not understand how they work.” This makes them unpalatable in many instances, especially in heavily regulated industries such as financial services. Yet interest from business leaders in AI and its applications continues to grow.

Whether or not data assurance standards and tools currently exist to provide full confidence in LLMs in every context (side note: they almost certainly don’t), it’s undeniable that such technology and practices will be essential in shaping the future of data-led innovation.

Establishing the standards against which data assurance products, services and practices can be assessed is an essential component of unlocking the full potential of the market. Governments, industry bodies and organisations like the ODI will need to work closely together to deliver data assurance frameworks that enable innovation largely unburdened by today’s data-related anxieties.

The best thing about our work with the Open Data Institute is that they actively want to share the research with the industry to increase discussion and advance best practices.

To learn more about the key opportunities that exist for those offering data assurance products, download the Metia Market Demand for Data Assurance Services research report here.