Andreas Bartsch, Head of Service Delivery at PBT Group
Even though it has been talked about for the past several years, DataOps has only really gained momentum over the past 18 months or so. And while maturity in this area is still low, organisations are looking for opportunities to become more efficient and agile as they seek to improve their delivery cycles and reduce maintenance overheads.
But what is DataOps and why is it important? IBM defines it as the orchestration of people, process, and technology to deliver trusted, high-quality data to consumers, whether those consumers are employees or end users. A more formal definition comes from Gartner, which describes it as a collaborative data management practice focused on improving the communication, integration, and automation of data flows between data managers and data consumers across an organisation.
However, one of the most user-friendly definitions can be found here. It likens data to the flow of traffic through a city. The roads are the data pipelines, and DataOps is the transportation planning that covers everything from laying out new roads and intersections to studying and maintaining old ones, while ensuring construction does not create gridlock.
In other words, DataOps comes down to managing the flow of data from source systems through the data warehouse and all its layers (which can include data lakes and operational data stores) to the point of consumption (think reporting and dashboarding).
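To make that scope concrete, here is a minimal sketch in Python that traces the flow end to end. It uses in-memory SQLite databases as hypothetical stand-ins for an operational data store and a warehouse, and the table and view names are invented for illustration rather than prescribed by any particular DataOps tool.

```python
import sqlite3

# Hypothetical stand-ins: an operational data store and a warehouse,
# both modelled as in-memory SQLite databases for illustration.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

# Seed the operational store with some sample transactions.
source.execute("CREATE TABLE transactions (id INTEGER, amount REAL)")
source.executemany("INSERT INTO transactions VALUES (?, ?)",
                   [(1, 120.50), (2, 75.00), (3, 310.25)])

# Step 1: extract from the source system and land in a staging layer.
rows = source.execute("SELECT id, amount FROM transactions").fetchall()
warehouse.execute("CREATE TABLE stg_transactions (id INTEGER, amount REAL)")
warehouse.executemany("INSERT INTO stg_transactions VALUES (?, ?)", rows)

# Step 2: transform and publish to a reporting view, the consumption
# end that dashboards and reports would read from.
warehouse.execute("""
    CREATE VIEW rpt_daily_totals AS
    SELECT COUNT(*) AS txn_count, SUM(amount) AS total_amount
    FROM stg_transactions
""")
print(warehouse.execute("SELECT * FROM rpt_daily_totals").fetchone())
```

A real pipeline would, of course, span separate systems and run under an orchestrator, but the shape is the same: source, staging layers, consumption.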
Prior to the emergence of DataOps, data engineers, modellers, and architects typically designed the pipeline from the source systems to the data warehouse. The data was then opened up for exploration by end users through dashboards and other visualisation tools, and from there data scientists developed models. However, there was no real IT governance structure behind the process.
But with DataOps, the development process is just one aspect of the work. Another, perhaps more critical, part is maintenance: ensuring everything happens consistently and as quickly as possible. Think of DataOps as a collaborative data management practice focused on improving the communications and data flows between data managers and consumers in the organisation, removing silos along the way.
By adopting DataOps, the business enables a quicker development cycle and introduces a more efficient, automated approach to data maintenance. Fundamentally, it is not an IT process; it is about enabling the business with readily available data analytics that the entire organisation can access, not just the data scientists and engineers.
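As a flavour of what that automated maintenance can look like, the short Python sketch below shows a hypothetical quality gate that a DataOps process might run automatically before data is released to consumers. The checks, field names, and messages are invented for illustration; real teams would typically rely on dedicated data quality tooling.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical DataOps-style quality gate: each check is a named rule
# run against a batch of records before the data is released.
@dataclass
class Check:
    name: str
    rule: Callable[[dict], bool]

checks = [
    Check("amount is non-negative", lambda r: r["amount"] >= 0),
    Check("customer id present", lambda r: r.get("customer_id") is not None),
]

def run_quality_gate(batch: list[dict]) -> bool:
    """Return True if every record passes every check; report failures."""
    passed = True
    for check in checks:
        failures = [r for r in batch if not check.rule(r)]
        if failures:
            passed = False
            print(f"FAILED: {check.name} ({len(failures)} record(s))")
    return passed

batch = [
    {"customer_id": 42, "amount": 99.90},
    {"customer_id": None, "amount": -5.00},  # fails both checks
]
if not run_quality_gate(batch):
    print("Batch held back from consumers pending investigation.")
```

Running such gates on every load, rather than firefighting bad data after the fact, is precisely the consistency and automation the practice is after.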
Furthermore, DataOps supports data usage as a business enabler and drives collaboration between stakeholders. The organisation can therefore use data analytics to create products and services that are vital for competitive differentiation.
A DataOps engineer therefore becomes a key position in this value chain. This person is responsible for building the infrastructure used for data development and maintenance. While it is not necessarily pure data work, the DataOps engineer supports the other data roles in the business by putting in place a mechanism that is consistent and agile.
As far back as 2017, the DataOps engineer was already being described as the sexiest job in analytics. Little did the author know how prophetic those words would prove in 2021.