Madalina Tanasie has a coolly assured demeanour, and amid the loud client-fishing and product-launch clamour at Data Citizen's London event, it is a welcome change. Tanasie is the CTO of Collibra, a data governance platform – a demanding role at a rapidly expanding SaaS company.
Valued at $5.25 billion in 2021, Collibra has built up an impressive roster of clients including L'Oréal and Lockheed Martin. The company's USP, for Tanasie, is its industry- and cloud-agnostic approach to data management – it recently released new discovery and governance tools for AWS CloudFormation, Azure Data Factory, Google Cloud Storage, Databricks and more.
Tanasie has first-hand experience of data governance struggles. She started out as a software engineer at TotalSoft two decades ago, but it was during eleven years at Medidata, initially as a senior software engineer and later as VP of engineering, that she really recognised the challenges of data management and governance in the highly regulated life-sciences sector. The experience left her keenly attuned to the needs of companies in sectors like energy, healthcare and defence.
"We come in recognising that these environments are super complex and they don't want to and shouldn't move their data in a central place," she tells The Stack. "And even if they did make that attempt, where would they put so much data?"
"The needs of data governance have become mainstream," she adds.
"It's a cross-industry need to catalogue data and understand what kind of patterns exist within it by profiling and curating it."
While large firms do want to make the most of the data they hold, it's made difficult because, even within a single company, the information is often closely guarded: "Customer information is held in so many different data sources across departments, so it's super important to look at it all together. If you have similar information in multiple places, you are constantly asking which source you want to trust, which one you can throw, which one you can leverage for what purpose -- and my work is to evolve the platform to respond to those needs."
Given that the volume of data available within each firm has exploded, data warehousing, or even using data lakes, isn't an optimal solution, according to the CTO. And this isn't a new problem. An earlier Gartner report put the failure rate of data lakes and big data projects at nearly 80%. Things have improved since, but single platforms are not the answer for all customers.
"That's where being platform-agnostic comes in. We have hundreds of integrations in play that allow clients to extract metadata and then empower them to pick data -- and that means more datasets are used than just being moved past."
Collibra, in short, plugs into these myriad data sources and, rather than demanding that they all come together, furnishes users with an organised, searchable inventory of these distributed data assets. Call it a database about your databases, if you will; albeit one with a shiny front-end (users can plug in tools like Tableau for visualisations) that pulls in metadata, such as data descriptions, lineage, and quality information, and lets you add governance capabilities.
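To make the "database about your databases" idea concrete, here is a minimal Python sketch of a metadata catalogue. It is purely illustrative and is not Collibra's API or data model: the `AssetMetadata` and `Catalog` classes, their fields and the sample dataset names are all hypothetical. The catalogue stores only descriptions, lineage and a quality score, while the data itself stays in its source system.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AssetMetadata:
    """Metadata about one dataset (hypothetical model, not Collibra's)."""
    name: str
    source: str                      # label for the system holding the data
    description: str = ""
    upstream: list[str] = field(default_factory=list)  # lineage: parent assets
    quality_score: Optional[float] = None


class Catalog:
    """A searchable inventory of distributed data assets."""

    def __init__(self) -> None:
        self._assets: dict[str, AssetMetadata] = {}

    def register(self, asset: AssetMetadata) -> None:
        self._assets[asset.name] = asset

    def search(self, term: str) -> list[AssetMetadata]:
        """Case-insensitive match on asset name or description."""
        term = term.lower()
        return [
            a for a in self._assets.values()
            if term in a.name.lower() or term in a.description.lower()
        ]

    def lineage(self, name: str) -> list[str]:
        """All transitive upstream asset names, nearest first."""
        seen: list[str] = []
        queue = list(self._assets[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._assets:
                queue.extend(self._assets[current].upstream)
        return seen


# Illustrative entries; the names and sources are made up for the example.
catalog = Catalog()
catalog.register(AssetMetadata(
    "crm_customers", "Google Cloud Storage", "Customer records from the CRM"))
catalog.register(AssetMetadata(
    "billing_customers", "Databricks", "Customer billing accounts"))
catalog.register(AssetMetadata(
    "customer_360", "Databricks", "Merged customer view",
    upstream=["crm_customers", "billing_customers"]))

matches = sorted(a.name for a in catalog.search("customer"))
parents = catalog.lineage("customer_360")
```

A search for "customer" surfaces all three assets regardless of which system holds them, which is the scattered-customer-data situation Tanasie describes, and `lineage` shows which sources feed the merged view.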
Asked what the big asks are across industry when it comes to data governance and intelligence solutions, Tanasie elaborates through three use cases: "The first bit is discovery -- clients come in saying 'I have a lot of data and I don't even know what I have.'"
"At that point it's about getting a full understanding of the landscape and many of the customers that we have started there, they just want to extract the metadata from thousands of data sources that they have in the ecosystem and be able to put it together."
"The second use case can be summed up as about really understanding the data. It's about the profiling and classification part of it; to be able to understand the data similarity and the overlap and redundancy."
"And the final piece of the puzzle is really using the data -- that's the data marketplace. That's where you've put all your data quality, and then you do policy enforcement and you eventually expose data as products to your companies."
Tanasie is cognisant that not all three are created equal when it comes to data governance and cloud needs.
"Many of our customers are using one or two of these tools, that is one or two of these three different use cases. I don't know if we have many that are using all three of them and that tells me that this is a market that is still evolving."
With a 19.2% CAGR in 2023, the data intelligence platform market is both evolving and growing. Acumen Research and Consulting forecasts that the sector will reach a market size of $56.7 billion by 2030.
And Tanasie says CDOs and other customers have ongoing challenges that they need help tackling: "Data governance is a moving target, especially with global regulation and market demands being dynamic."
"I've put a lot of effort towards our enterprise practices and that sets us up for scale in engineering and security," she concludes.