From Digital Maturity to Industrial Transformation

For all industrial organizations, the intelligent use of data produced by operational technology (OT) systems is central to efforts to improve operational excellence. Yet while everyone is talking about digital transformation, use of data, scaling, and time to value, very few in the industrial world are actually reaping the benefits.

And it isn’t data that is the challenge. OT data is the raw material that enables organizations to build more efficient, more resilient operations and improve employee productivity and customer satisfaction. This OT data is available in abundance, but industrial organizations struggle to generate value from their increasingly connected operations—with IDC showing that only one in four organizations analyzes and extracts value from data to a significant extent.

The lack of appropriate tools and processes is a significant obstacle, resulting in data workers spending almost 90% of their time searching for, preparing, and governing data. A fear of missing out on data value has often led organizations to prioritize data centralization over data organization. In turn, this has led to poorly thought-out "data swamps" that only perpetuate the problem of dark and uncontextualized data.

Companies that adopted machine learning (ML) to develop predictive algorithms quickly realized how critical it is to have trusted, quality data, and that historical data can't always be trusted. Many organizations are also unable to meet the data governance requirements needed to support data-driven innovation.

The reality is that as operational assets become more complex, connected, and intelligent—and provide more real-time information—the complexity of enabling data-driven decision-making to plan, operate, and maintain them increases. To put this in perspective, organizations across manufacturing, oil and gas, utilities, and mining expect their daily operational data throughput to grow by 16% in the next 12 months. Market intelligence provider IDC has been measuring the data generated daily by operations across these organizations’ silos and has modeled the future expansion of data and its use across industrial sectors. Even accounting for the growing digitalization of operations, IDC predicts that only about 30% of this data will be adequately utilized in 2025 (Fig. 5).


Only one in four organizations extracts value from data to a significant extent. Data dispersion and a lack of tools and processes to connect, contextualize, and govern data stand in the way of digital transformation.

Industrial DataOps promises to improve the time to value, quality, predictability, and scale of the operational data analytics life cycle. It’s also a stepping stone to a new way of managing data within the wider organization, enabling it to cope with growing data diversity and serve a growing population of data users.

Figure 5: Data Generation and Consumption in a $250 Million Industrial Operation, 2019–2025

DataOps as a Discipline

Before considering Industrial DataOps in detail, it's worth taking a look at how the early adopters of DataOps in industries such as banking, retail, and pharmaceuticals responded to the challenges of operationalizing data.

While asset-heavy industry faces many specific challenges, the broad challenge around operationalizing data for value is shared across many sectors. This universal need is what gave rise to DataOps as a discipline, and saw it gain traction in a range of domains.

The IoT boom brought data, and the promise of data, to the forefront of business strategies across the globe. As the world raced toward the potential of data to drive meaningful change, DataOps emerged as the leading approach to operationalizing data in the enterprise.

To refer back to the Forrester definition from Chapter 1, the power of DataOps lies in its ability to:

“Enable solutions, develop data products, and activate data for business value across all technology tiers, from infrastructure to experience.”

“By 2023, 60% of organizations will have begun implementing DataOps programs to reduce the number of data and analytics errors by 80%, increasing trust in analytic outcomes and efficiency of Gen-D workers.”

DataOps platforms help data workers deploy automated workflows to extract, ingest, and integrate data from industrial data sources, including legacy operations equipment and technology.
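As a minimal illustration of such an automated workflow (all source names, tag identifiers, and record shapes below are invented for the sketch, not taken from any specific platform), the extract–ingest–integrate pattern can look like this:

```python
# Sketch of an automated extract -> ingest -> integrate workflow for
# pulling readings out of a (mock) legacy historian into a shared store.
# Source names and tag formats are hypothetical.

def extract(source):
    """Pull raw readings from a mock legacy source system."""
    return source["readings"]

def ingest(raw, site):
    """Normalize raw readings into a common record shape."""
    return [{"site": site, "tag": tag, "value": val} for tag, val in raw]

def integrate(records, store):
    """Merge normalized records into a shared store, keyed by site/tag."""
    for rec in records:
        store[(rec["site"], rec["tag"])] = rec["value"]
    return store

legacy_historian = {"readings": [("PT-101", 42.0), ("TT-102", 88.5)]}
store = integrate(ingest(extract(legacy_historian), "plant-a"), {})
print(store[("plant-a", "PT-101")])  # 42.0
```

In a real platform each stage would be a scheduled, monitored pipeline rather than three function calls, but the separation of extraction, normalization, and integration is the same.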

DataOps offers a workbench for data quality, transformation, and enrichment, as well as intelligent tools to apply industry knowledge, hierarchies, and interdependencies to contextualize and model data. This is then made available through specific application services for humans, machines, and systems to leverage.

Figure 6: How Components of DataOps Interact to Continuously Deliver Business Value

Direct and Indirect Benefits of DataOps

Efficient Data Management

DataOps maximizes the productive time of data workers with automated data provisioning, management tools, and analytic workspaces to work with and use data safely and independently within specified governance boundaries. The approach can be augmented with AI-based automation for various aspects of data management—including metadata management, unstructured data management, and data integration—enabling data workers to spend more time on use case development.

Improved Data Accessibility

For many organizations, the current use of data is limited by dispersion across silos, spotty integration, and poor accessibility for centralized applications. Even where data sources are connected, data often lacks context due to limited documentation at the data's origin or information loss due to inconsistent structure or tagging.

DataOps technology uses AI to enable rapid ingestion and contextualization of large amounts of data. And by improving data accessibility, DataOps brings a paradigm shift in how the organization accesses business-critical information, improving decision-making quality, reducing risk, and lowering the barriers to (and skills for) data innovation.


Rapid Development of Use Cases and Application Enablement

DataOps aims to shorten the time to value of data by making proofs of concept (POCs) quicker and cheaper to design, offering tools to operationalize and scale them.

Enterprise Data Governance as a By-Product

DataOps enables companies to set and enforce the basic principles for managing data. If implemented successfully, the approach provides consistency and ROI in technology, processes, and organizational structures, with better operations data quality, integration and accessibility, and stewardship. A DataOps platform should also enhance data security, privacy, and compliance with tracking, auditing, masking, and sanitization tools.

Figure 7: Data Management Solutions

Industrial DataOps Can Deliver Untapped Value for Asset-Heavy Enterprises

DataOps is the clear frontrunner to become the driving force for transformation in industry.

The definitions emerging in DataOps as an overall discipline have meaning across the business landscape but have a particular importance in industry. Not only do the traditional data challenges felt across all organizations become much more weighty in industrial sectors (both in terms of operational consequence and inherent data complexity), but industry also has its own set of unique challenges.

Data must be made available, useful, and valuable in the industrial context.

Amid the rush for change in the industrial world, the promises of data quickly became disappointments. In a push to show digital execution, many have embraced the AI hype.14 This has led to quickly demonstrable digital POCs, but not to truly operationalized, let alone scaled, concrete business OPEX value.

To adequately extract the value of industrial data insights, it’s essential to make operationalizing data core to your business strategy. This translates into developing and scaling mission-critical use cases across safety, efficiency, and sustainability. Data must be made available, useful, and valuable in the industrial context. We’ll now consider the key steps, opportunities, and challenges associated with deploying DataOps in an industrial organization. This is the route to extracting full value from your data.

Making Industrial Data Available

The fastest path to tapping into the value potential of digitalization in industry starts with getting the right data to the right user, with the right context, for the right problem, at the right time. However, the reality is that industrial data is still not easily accessible. It remains trapped in different systems, requiring data scientists to spend hours searching for it, collecting it together meaningfully, and preparing it for analysis.

Trust in Data

Industrial enterprises need to be able to trust the data that they are putting into operations, simply because the cost of failure is too great.

“We can’t operationalize unless we trust the data. If something fails and we can’t provide auditability we are finished as an industry.”

Digital Operations Manager, Aker BP


Industry faces a particularly difficult challenge considering the nature of the data being analyzed. This data goes beyond standard tabular enterprise data. The bulk of data collected from operations revolves around process and instrument data and requires subject matter expertise to make sense of it.

In addition to being highly specific to industry, the time-series data collected is fraught with challenges of its own. Anyone dealing with streaming data understands that data quality and consistency are major thorns in the side of data scientists. Quality issues in streaming data are hard to check at a single point in time and must be checked continuously, which makes constant monitoring essential.
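To make the idea of continuous quality checks concrete, here is a small sketch that flags three common time-series defects in a sliding window: out-of-range values, flatlined signals, and timestamp gaps. The thresholds and window shape are illustrative assumptions, not an industry standard.

```python
# Sketch of continuous quality checks on a streaming time-series window.
# Each window entry is (timestamp_seconds, value); limits are illustrative.

def quality_flags(window, lo, hi, max_gap_s):
    """Flag out-of-range values, flatlined signals, and timestamp gaps."""
    flags = set()
    values = [v for _, v in window]
    if any(v < lo or v > hi for v in values):
        flags.add("out_of_range")
    # A sensor reporting the exact same value for a while is often stuck.
    if len(values) > 3 and len(set(values)) == 1:
        flags.add("flatline")
    times = [t for t, _ in window]
    if any(t2 - t1 > max_gap_s for t1, t2 in zip(times, times[1:])):
        flags.add("gap")
    return flags

window = [(0, 50.1), (10, 50.3), (35, 49.8), (45, 120.0)]  # (seconds, bar)
print(quality_flags(window, lo=0, hi=100, max_gap_s=15))
# {'out_of_range', 'gap'}: 120.0 exceeds the limit, 10->35 s is a gap
```

Run over every incoming window, checks like these turn data quality from a one-off audit into the constant monitoring the text describes.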

IT/OT/ET/X Data Convergence

Traditional data silos are an obstacle to extracting value from data. Companies can make the most of Industrial DataOps to begin the integration process that spans asset and data life cycles across IT, OT, ET (engineering technology) and X (other types) data. The resulting converged data will support resilient decision-making across the organization and unlock the potential of fully fledged digital twin applications.


Advances Across Technologies Make IT/OT/ET/X Data Convergence More Possible Than Ever

IT/OT/ET/X data convergence is not about turning IT pros into plant engineers or machine operators into data scientists—although the latter is indeed happening regardless. It is instead about executing on a strategy to align and bring together formerly isolated subject matter experts (SMEs), cultures, platforms, and data deployed by OT and IT teams to improve operational performance through unified goals and KPIs.

In practice, this is happening through the adoption of a new data and digital platform that contextually fuses IT, OT, ET and other data types such as audiovisual data, making this contextualized data conveniently available to a growing audience of data consumers, both inside the enterprise and across its partner ecosystem.

Figure 8: Relationships Between IoT, Conventional IT Data, OT Data, ET Data, and Other Data Types (X Data)

Making Data Useful

Industrial data becomes truly useful (read: fit for operational purpose) when it is integrated, contextualized, and made securely available, explorable, and actionable to all data consumers (human and machine) within and outside the industrial enterprise. This should encompass all the various sources and formats including sensor data, process diagrams, 3D models, event histories, asset models, and unstructured documents.


Data contextualization involves connecting all the data for a clearer understanding of an asset or facility, establishing meaningful relationships between data sources and types to help users find and utilize relevant data from assets across the operation. This should be at the core of an Industrial DataOps platform.

A petroleum engineer, for example, would understand the sensor data streaming from an electric submersible pump of an oil-well site, but a data scientist might not. Contextualization links the pump identity from the asset hierarchy to its sensor data and related work orders and connects it to the asset’s 3D model.


Similarly, in the steel industry, a data scientist might not be able to grasp the complexity of predictive quality and steel-grade monitoring without a solid knowledge of the underlying chemistry and physics. Given more context through a 3D model or knowledge graph, however, they can visualize the operational context and develop models and data applications, in this case for anomaly detection.
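The pump example above can be sketched as a small relationship graph, which is essentially what contextualization builds: the asset node is linked to its time series, work orders, and 3D model, and a parent asset links down to it. All identifiers here are invented for illustration.

```python
# Sketch of contextualization as a tiny relationship graph:
# (source_id, relationship) -> set of related identifiers.

graph = {}

def link(graph, source, relationship, target):
    """Record a typed relationship between two identifiers."""
    graph.setdefault((source, relationship), set()).add(target)

# Link the submersible pump to its sensor data, work orders, and 3D model.
link(graph, "pump-ESP-7", "has_timeseries", "ts:discharge-pressure")
link(graph, "pump-ESP-7", "has_workorder", "wo:2021-0042")
link(graph, "pump-ESP-7", "has_3d_node", "3d:platform/deck2/esp7")
# Place the pump in the asset hierarchy under its well.
link(graph, "well-A-12", "has_child", "pump-ESP-7")

# A data scientist can now start from the asset and discover related data:
print(sorted(graph[("pump-ESP-7", "has_timeseries")]))
```

Production platforms use far richer graph models, but the principle is the same: once relationships are explicit, a user who knows only the asset can find every dataset that describes it.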

Enabling Industrial Hybrid MLOps

Industrial DataOps platforms combine data-driven statistical modeling with physics-driven process modeling and simulation. Each approach has its pros and cons, and an ML model based on a hybrid of the two often provides the best results. These platforms give developers workflows, compatible with third-party AI tooling, to develop, train, and manage hybrid ML models. This enables them to operationalize use-case-specific data subsets efficiently and at the desired scale.
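One common hybrid pattern is to let a physics-style equation provide the baseline prediction and fit a data-driven correction to its historical residuals. The sketch below uses an invented quadratic pressure-drop baseline and the simplest possible correction (a learned constant offset); real hybrid models use full simulators and richer ML components.

```python
# Sketch of a hybrid model: physics baseline + data-driven residual term.
# The baseline constant and the data below are invented for illustration.

def physics_baseline(flow):
    """Idealized pressure drop, quadratic in flow (illustrative constant)."""
    return 0.8 * flow ** 2

def fit_residual_bias(flows, measured):
    """Learn the average gap between measurements and the physics baseline."""
    residuals = [m - physics_baseline(f) for f, m in zip(flows, measured)]
    return sum(residuals) / len(residuals)

def hybrid_predict(flow, bias):
    """Physics prediction corrected by the learned data-driven term."""
    return physics_baseline(flow) + bias

flows = [1.0, 2.0, 3.0]
measured = [1.3, 3.7, 7.7]          # baseline alone gives 0.8, 3.2, 7.2
bias = fit_residual_bias(flows, measured)
print(round(hybrid_predict(2.0, bias), 2))  # 3.7
```

The division of labor is the point: physics supplies behavior that generalizes beyond the training range, while the data-driven term absorbs what the idealized equation misses.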

Further Adding to the Complexity

There are challenges inherent to operating in large, siloed industrial organizations: among them the different reporting lines, varied analytics workflows, competing business interests, and varying incentives. If industry is to break free of this complexity, tools and technologies must be the drivers of change.

Adding insult to injury, the rise of AI/ML, combined with the difficulty of finding data scientists, imposes its own set of requirements on data modeling, data source availability, data integrity, and out-of-the-box contextual metadata. These requirements are often very different from those of the conventional BI user.

Data engineers working on industrial digitalization projects struggle with access to key source system data, in a way reminiscent of non-industrial verticals circa 2010. Industrial companies face not only the same challenges as, for example, their retail peers, but a superset of challenges resulting from IT/OT convergence and the associated data velocity, variety, and volume beyond that of conventional IT-only data.

Figure 9: A Typical Data Pipeline for Analytics With Associated Workflow Challenges

Making Data Valuable

Extracting maximum value from data relies on being able to apply advanced models to produce insights that inform optimal decision-making, empowering operators to take action with confidence. This, in a nutshell, is what is meant by operationalizing data into production for value.

Advanced models combine data science with physics to generate synthetic data and advanced insights. This is complemented with machine learning and deep learning for scale. It’s crucial to observe, analyze, and optimize to deliver reliable forecasts and actionable insights.

Democratizing DataOps

Industrial DataOps platforms enable data users with low-code or no-code application development and model life-cycle management tools. This democratizes DataOps and facilitates a more collaborative working model, where non-professional data users can perform data management tasks and develop advanced analytics independently within specified governance boundaries. This democratization of data helps store process knowledge and maintain technical continuity so that new engineers can quickly understand, manage, and enrich existing models.

Going Beyond Proof-of-Concept

Too often, digital operations initiatives get trapped in "POC purgatory," where scaling pilots takes too long or is too expensive. What holds them back are the IT/OT and OT/data science divides, and the inability to produce and access contextualized, quality data at scale.

By connecting data users with disparate operational data sources, an Industrial DataOps platform helps bridge those divides on the path to use-case operationalization. ML libraries of standard industrial use cases help developers save time when collecting data and developing and training their models. Data scientists can leverage this library and use it with component-level data. Once a use case is developed and the outcomes are satisfactory for one component of the plant, the contextualization of asset data allows it to be scaled to the plant or fleet level.
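The component-to-fleet step described above can be sketched as follows: a model validated on one pump is applied unchanged to every peer pump discovered through the contextualized asset hierarchy. The hierarchy, readings, and threshold here are invented for the sketch.

```python
# Sketch of scaling a validated component-level model across a fleet by
# walking a contextualized asset hierarchy (all data is illustrative).

hierarchy = {
    "plant-a": ["pump-1", "pump-2"],
    "plant-b": ["pump-3"],
}
latest_vibration = {"pump-1": 61, "pump-2": 48, "pump-3": 72}  # mm/s x100

def anomaly_model(reading, threshold=60):
    """Toy rule standing in for a model validated on a single pump."""
    return reading > threshold

def scale_to_fleet(hierarchy, readings):
    """Apply the component-level model to every pump in every plant."""
    return {
        pump: anomaly_model(readings[pump])
        for pumps in hierarchy.values()
        for pump in pumps
    }

print(scale_to_fleet(hierarchy, latest_vibration))
# pump-1 and pump-3 exceed the threshold; pump-2 does not
```

Because the hierarchy tells the code which assets are of the same type and where their data lives, scaling becomes a query over contextualized data rather than a new integration project per pump.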


Some examples of common use cases in asset-heavy industries are: maintenance workflow optimization, engineering scenario analysis, digitization of asset process and instrumentation diagrams (P&IDs) to make them interactive and shareable, and 3D digital twin models to support asset management.

What to Consider When Adopting Industrial DataOps

Asset-heavy organizations should look to Industrial DataOps to unleash the full potential of IT/OT/ET/X data and to transform their traditional operating model. When starting on this journey, companies should:

  • Think of AI as a critical tool for both fact-driven decision-making and efficient management of the data supporting it. Bypassing human “midstream” data handling is key.
  • Treat “data liberation” as critical to getting full value from DataOps. Maximizing your data extraction capabilities will make it easier to realize DataOps with your existing IT and OT architecture, limiting the need to invest in additional systems integration and OT data sources.
  • Develop a strong data governance model for IT/OT/ET/X data. This will dictate how new data is connected and integrated into the overall data architecture. It will also help serve a growing population of data and analytics business users.
  • Prioritize data organization over centralization. Start driving the connection and mapping of all relevant data sources with a clear list of target use cases in mind. As part of the governance model, all new data sources must have a connection, tagging, sharing, and integration plan.
  • Note that not all DataOps platforms have the same capabilities. Alignment with your goals, industry track record, and domain expertise should drive selection criteria. (See the Appendix for a complete guide to provider evaluation.)

Moving From Theory to Practice

As discussed in previous chapters, Industrial DataOps requires your organization to take some critical steps to start the journey on the right foot. For starters, it requires stronger cohesion among data stakeholders. Data science and IT must collaborate well beyond data access and resource allocation, while business should be involved in data projects well beyond the typical demand and validation stages.

Organizational divides compound a company’s inability to access data at scale, making asset analytics pilots too long or too expensive to operationalize. Bridging these organizational and operational gaps is a balancing act that requires focus and leadership.


Deploying the right tools—feature-rich, intuitive, and easily scalable—can be a catalyst for lasting, positive change.