Cloud Data Platforms: Old Wine in New Bottles?

“So honestly – how are these ‘cloud data platforms’ really different from what data warehouse vendors were selling us 20 years ago?” It’s a question frequently asked in corridors and conference rooms after breathless demos and pitches from vendors and their implementation partners—a reaction clearly laced with a little skepticism hard-won from years of dealing with technology projects promising more than they delivered.

There’s certainly an exhausting amount of hype around the current cloud data technologies, but beneath it all there’s also an undeniable transformation going on in the way asset managers and owners are leveraging these technologies as part of their data strategy. So how do you really articulate what has changed, why it matters, and what should be done about it?

It’s not really about the technology

For many of us used to working in and around data, the impulse is to answer by explaining the details of the technology: grabbing a whiteboard and scribbling arcane diagrams of the principles behind columnar data storage, scalable cloud architectures, or zero-copy cloning.

But many of these technical ‘innovations’ have been around in some form or other for decades, too: AWS first started to commoditize cloud computing nearly 20 years ago, and the principles of analytics warehousing have been well understood since the late 1990s. While the components are not entirely new, platforms like those offered by Snowflake, Azure and AWS have lowered the barriers to entry by putting them together in a way that makes them far easier to deploy, administer and utilize than ever before.

Still, ‘ease of use’ is a relative thing. It really is as simple as clicking a few buttons to spin up a data warehouse, but setting up a data platform that is reliable and secure, with an operating model around it that actually delivers on business objectives, still requires significant thought, expertise, and planning.

Even then, simply provisioning infrastructure doesn’t deliver much value. The ‘build-it-and-they-will-come’ data strategy has a long and illustrious history of failure, as technology-driven projects to establish data infrastructure never quite seem to deliver on their promise. If the main promise of a new technology is just to make the ‘build’ part easier, surely that will just shorten the path to disappointment?

Focus on delivery 

Few business strategies get away with ignoring the customer as thoroughly as data strategies often have. Sometimes little more than a PowerPoint fever dream of architecture diagrams and governance structures with an arrow pointing towards ‘consumers,’ these strategies often overlook how to engage and benefit end users. Meanwhile, analysts delivering value from data are often dismissed by technology teams as ‘doing shadow IT,’ and by data governance teams because they are ‘using ungoverned data.’

Putting the data platform into the service of these stakeholders should be a primary objective of a modern data program. The first questions a data strategy should ask are not about technology. They are: “Who uses data, what are they doing, and how can we help them do it?” Beginning with this focus on delivery both engages the stakeholders who stand to benefit from the program and grounds the program in the objective of delivering clear business value.

Getting the data right

The best data infrastructure and analytics are of no value unless the data they deliver is reliable, timely and of good quality—and effective governance and data operations can’t simply be bolted onto a data platform as issues become apparent.  

An effective data program must begin with the design of the governance roles and responsibilities, and the data ops structures and processes that will assure the quality of the data. Implemented in an agile manner, these structures can support a roadmap of incremental deliverables that coherently build towards an enterprise vision.

Every organization will also have large, difficult data problems to solve, such as implementing new master data management practices or replumbing legacy systems. But small, tangible wins along the way help both to sustain the data program and to provide feedback and calibration as the program, platform, and operating models mature.

You need a data strategy

In answer to the opening question: No, a new cloud data platform won’t change anything if it’s used the same way a data warehouse was 20 years ago. But a modern architecture coupled with agile governance, data ops, and analytics practices can be transformational.

The potential of modern cloud data platforms is clear: infrastructure can be efficiently provisioned, and data can be readily acquired, transformed, and used with analytics tools to rapidly prototype and deliver data solutions. But realizing this potential requires a holistic data strategy driven by a focus on business outcomes, with governance and data operations that embrace agile data practices. Without these elements in place, no technology will deliver on its promise. When deployed as part of a cohesive data strategy, however, a modern data platform can quickly start to deliver value from data for your business in an efficient, agile and sustainable way.