Hybrid AI

Rules-based coding has been behind software creation for decades. If there is a problem or set of questions, we analyze it, establish its boundaries, variables, processes, and correlations, and turn this information into rules to define how software should work.

This software and data management method created ‘dumb’ software, i.e., software that only behaved in a specific way unless the programmer changed its rules. If you found scenarios where rules weren’t so clear-cut, the software malfunctioned.

While rules-based programming served well for its time, its limitations became apparent, necessitating building ‘smarter’ software. And so entered the world of downstream digitalization, i.e., artificial intelligence (AI) through machine learning (ML), and later, Deep Learning (DL). And even though ML/DL addressed many of the limitations of logic/rule-based programming, it will be decades more before ML/DL can stand on their own.

In the meantime, industrial data management and data science for industry must still rely on human intervention and physics-based simulations. On this page, learn everything about hybrid AI and how it can provide value for heavy asset industries like oil and gas (O&G).

In this guide, you will learn more about:

1. What is Hybrid AI?

2. When Should One Apply Hybrid AI Solutions?

3. Benefits of Hybrid AI

4. How Feature Engineering Fits into Hybrid AI Modelling

5. How Feature Engineering Works

6. How to Make Hybrid AI Work for Any Industry

7. Conclusion

What Is Hybrid AI?

Understanding hybrid AI must begin with appreciating the various data modelling techniques applied to big data. Machine learning is the ability of software to learn by example. Instead of using rules to code what a table looks like, you supply millions of pictures of tables, and the algorithm identifies patterns that define the table’s appearance. Therefore, ML gets more accurate as it receives more data for a given application.

In the age of Big Data, industries produce massive amounts of sensor data in various forms – far beyond what even the most qualified data scientists can interpret. For a long time, instead of refining algorithms to uncover advanced analytics/insights, data scientists spent much of their time preparing data for analytics – clean, contextualized data is key to any analytics attempt.

Enter AIOps and MLOps, two game-changers that identified patterns from massive data sets faster than any group of humans could. So was born the world of data science, business intelligence, and predictive analytics. These DataOps fields have been applied in many industries, although much less in the manufacturing and heavy asset industries.

Why Pure Analytics Did Not Work for Heavy Asset O&G Industry

Data-driven AI through ML was the newest technology for enabling data management in many industries when it came along. This led to widespread pilot programs and the implementation of digital twins in heavy asset industry businesses.

Before ML, these companies relied on the trusty but less-hyped physics-based graph data modeling. Physics-based simulators directed decision-making based on historical data and other relevant parameters.

Unlike other industries, it has been difficult for pure AI-based approaches to gain acceptance in engineering science-based systems. There are two reasons for this:

  • AI has an inherent ‘black box’ nature: the deeper you look into a model, the more complex it becomes. In simple terms, you may not always know why the system made the choices, recommendations, or predictions it made.
  • Heavy asset industry companies have slim to zero risk tolerance for critical failures or wrong predictions, which makes unexplained outputs dangerous.

Therefore, the use of AI for predictive maintenance and production optimization remained limited once O&G industry leaders discovered the failings of pure-AI approaches to O&G use cases. This necessitated an all-inclusive approach: taking the best of what physics-based modeling offered and combining it with the remarkable capabilities of machine learning.

This is how hybrid AI for the heavy-asset industry was born.

When Should One Apply Hybrid AI Solutions?

Both physics-based models and ML are useful for making future predictions. The question then becomes: when should each be used, and when is a hybrid solution best?

In the heavy-asset industry, most problems/use cases fall into two major categories:

- Systems that have plenty of experimental data on historical behavior but with no theoretical knowledge framework

- Systems with an established mathematical/theoretical framework and some robust empirical behavior

In the first category, machine learning works perfectly, provided there is enough clean and contextualized training data. ML learns the underlying patterns using data from disparate data sources – the system itself, descriptions of various variables, and outcomes. Ultimately, it extrapolates this data to make predictions.

There are two cautions to applying MLOps this way: first, there is questionable confidence in predictions with insufficient data. This makes pure-AI approaches unsuitable for critical manufacturing processes in the heavy asset industry. Second, there is often a lack of training data for true failures in critical systems – primarily since all systems work towards preventing such costly and potentially life-threatening events.

Physics simulators have a distinct advantage in the second category – with a theoretical framework, they can predict with considerable confidence even without historical data. Industry-standard data science has applied the principles of physics-based modeling for decades. In that time, it has been continuously tested and validated for even the most critical simulation.

Physics-based modeling has a limitation of its own: it is costly to run physics-based models continuously on live data in runtime environments. Doing so requires massive investment in computing resources, considering the data-heavy IIoT use cases expected in the O&G industry.

How to Use Hybrid AI to Solve Physics-Based Simulation and Pure-AI Challenges

You can use physics-based models to generate rich, fully interpretable, and physically accurate synthetic data, including equipment breakpoint data and virtual sensor data. Machine learning models are then trained on large volumes of this synthetic data.

Once trained, the model can process real-time OT data for predictive maintenance, production optimization, and operational excellence. It can make predictions efficiently on new data, even when that data arrives at high velocity.

Hybrid AI in O&G needs to be subject-matter-supervised to understand the physical boundaries of the system’s conditions. Doing so greatly enhances the ability of the algorithm to produce meaningful outcomes.

Generally, hybrid AI models should be used for complex or sophisticated industrial process use cases. A mathematical modeling framework should exist in these systems to help the physics simulator create virtual data that may then be applied to training the MLOps for real-time prediction.
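
As a rough illustration of the workflow described above, the following Python sketch uses a stand-in physics model to generate synthetic data, fits a simple regression to it, and then uses the fitted model for fast predictions. The Darcy-Weisbach formula with a constant friction factor, the pipe parameters, the noise level, and all function names are assumptions chosen for illustration. A real deployment would call an actual physics simulator and would likely use a richer ML model.

```python
import random


def physics_pressure_drop(velocity, friction=0.02, length=100.0,
                          diameter=0.1, density=850.0):
    """Darcy-Weisbach pressure drop in Pa (stand-in for a full physics simulator)."""
    return friction * (length / diameter) * density * velocity ** 2 / 2.0


def make_synthetic_data(n, seed=0):
    """Sample operating points and label them with the physics model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        velocity = rng.uniform(0.5, 3.0)
        # Assumed 1% multiplicative noise to mimic imperfect sensors.
        dp = physics_pressure_drop(velocity) * (1.0 + rng.gauss(0.0, 0.01))
        rows.append((velocity, dp))
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# 1. The physics model produces the training data.
data = make_synthetic_data(500)

# 2. A simple ML model is trained on it. Domain knowledge suggests the
#    feature: pressure drop scales with velocity squared.
slope, intercept = fit_line([v ** 2 for v, _ in data], [dp for _, dp in data])


def surrogate(velocity):
    """3. Cheap prediction for new (e.g. live) velocity readings."""
    return slope * velocity ** 2 + intercept


print(surrogate(2.0), physics_pressure_drop(2.0))
```

The design point is that the expensive model runs offline to create training data, while the lightweight fitted model handles the high-velocity data stream.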


Benefits of Hybrid AI

Cyber-physical systems – standard in the O&G industry – are complex systems capable of the four Cs – Control, Communication, Cognition, and Computation. These functions are powered by separate parts of the system/components, each with a unique role. Among them are myriads of sensors to curate raw data and inform any changes, whether through physics-based modeling or machine learning.

Apart from sensors, there are actuators that respond to stimuli changes and a computational intelligence platform to compute the response according to sensor data. A network interconnects all these.

The problem with data-driven pure AI systems is that they ignore all knowledge of the physical or abstract system except the data fed into the ML model itself, such as through data contextualization. Moreover, these systems require very large volumes of data, which are almost always limited in heavy asset industries for various reasons, such as:

  • Hidden physical interactions
  • Sophisticated interactions (beyond what can be modeled from first principles)
  • Incomplete technical specifications of components

By leveraging hybrid data integration and industrial AI, we combine what is known about real system behavior and the domain to inform machine learning techniques.

What about Deep Learning?

You may wonder: what about deep learning? Doesn’t it solve many of the limitations of machine learning? The simple answer is yes and no. Deep learning promises more complex problem solving, pattern recognition, and control by mimicking the human brain’s layered workings. However, it may be years, even decades, before deep neural networks (DNNs) reach the operational level needed for O&G use cases.

Deep learning in O&G brings two significant challenges – it is even more data-hungry than its counterpart, and its actions are even more opaque. Key O&G industry players may wonder then whether there is any fit-for-purpose DL methodology that they should invest in for the foreseeable future.

However, DL cannot work outside of data, and training deep neural networks requires millions of data points. Where O&G is concerned, simpler machine learning models, bolstered by physics-based approaches, suffice to reduce the complexity of the problem to be solved and hence predict behavior accurately. 

How Feature Engineering Fits into Hybrid AI Modelling

Succeeding with hybrid AI in O&G applications and use cases is demanding: you must work out how to build advanced hybrid analytics applications and how to deliver and operationalize data science at scale. All success stories begin with mastering these critical elements. The first step is understanding how O&G applications differ from classical AI/ML applications in other industries.

| Classical AI/ML applications | O&G applications |
| --- | --- |
| There are no alternative approaches to AI/ML | AI/ML competes against established physics-based models |
| Large errors in production or control have no severe consequences | Even marginal errors can have catastrophic consequences |
| Problems within data are noiseless | There may be noise or drift in the O&G sensors |

The O&G industry plays a different game where AI/ML is concerned. In O&G, the physics functions/mathematical frameworks of problems are often non-linear, involving several input variables. What’s more, there may be huge biases and noise in data distribution, according to the prevailing conditions. Any state changes can render all historical data irrelevant, while other areas have few data points – not enough to establish a functional form.

In the O&G industry, a long history of operation doesn’t always mean you have a lot of training data. Only the data collected in stable conditions (as opposed to transient conditions) is valid as training data. This is the data collected:

- Without any operational changes

- After initial transient conditions have faded (takes anything from a few minutes to days). Where the flow is unstable, it is necessary to integrate over a specific period to create accurate data.
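
One simple way to picture this filtering step is sketched below in Python. It splits a sensor series into fixed windows, keeps only the windows whose relative spread is small, and averages each kept window, which loosely corresponds to integrating over a period. The function name, the window length, the 2% threshold, and the sample readings are all assumptions for illustration, not a prescribed method.

```python
from statistics import mean, pstdev


def steady_state_samples(values, window=5, max_rel_std=0.02):
    """Return the window averages of non-overlapping windows judged steady.

    A window counts as steady when its standard deviation divided by its
    mean is at most max_rel_std (an assumed, tunable threshold).
    """
    samples = []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        avg = mean(chunk)
        if avg != 0 and pstdev(chunk) / abs(avg) <= max_rel_std:
            samples.append(avg)
    return samples


# Hypothetical flow readings: steady, then a setpoint change, then steady.
flow = [100, 101, 99, 100, 100,
        100, 120, 150, 170, 180,
        181, 180, 179, 180, 180]

print(steady_state_samples(flow))
```

In this example the middle window, recorded while the flow was still ramping, is discarded, so fifteen raw readings yield only two usable training samples.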

For this reason, even if the O&G industry has plenty of data, large portions of this historical data may be unusable for the desired use case. But with or without data, you’re looking for ways to improve operational excellence, which is what hybrid AI through feature engineering does.

What Is Feature Engineering?

Feature engineering, as it relates to ML, is the process of applying domain knowledge and data mining strategies to extract features from raw data. The features can then be used to improve ML algorithms. In practice, it is applied ML.

ML algorithms use input data (called training data) to create outputs. Input data is made of features, often stored as structured columns in structured data. The features should have specific characteristics to be useful for ML applications. This is what feature engineering does.

There are two goals of feature engineering:

- To prepare the right training dataset according to ML algorithm requirements

- To improve ML models’ performance

Choosing features is a critical step in applying hybrid AI for a use case. Your choice of features has a direct impact on the outputs. Correct feature engineering is the first step towards successful ML.

How Feature Engineering Works

As mentioned, both physics-based modeling and machine learning have pros and cons, and hybrid AI maximizes the upsides of each while making up for their downsides. ML works via regression modeling or mathematical modeling. In the former, you provide training/historical datasets and use them to train the ML model to make predictions on current and future conditions.

Mathematical modeling works through two methods: first principle modeling and empirical modeling. First principle modeling is complex and rigorous, but it gives more specific results than its counterpart. FPM is used to derive common conservation equations of momentum, mass, energy, and volume.

Where you have an O&G use case with a complex dataset featuring multiple parameters, mathematical modeling (dimensional analysis) can help you condense those parameters into a small number of dimensionless numbers (one or two). Dimensionless numbers transform complex behavior into a near-linear and smoother function, which is easy to track.

The resultant function means that you need much less data to describe the use case behavior. The process of condensing parameters is what feature engineering does. Features should be chosen carefully; doing this forms simpler and smoother problems.
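
A textbook case of this kind of condensing is the Reynolds number, which folds four pipe-flow parameters into one dimensionless feature. The Python sketch below is only an illustration: the operating points are hypothetical values, and the laminar friction relation f = 64/Re is used because it is a standard result that becomes a straight line in log space.

```python
import math


def reynolds_number(density, velocity, diameter, viscosity):
    """Dimensionless ratio of inertial to viscous forces in pipe flow."""
    return density * velocity * diameter / viscosity


def laminar_friction_factor(re):
    """Darcy friction factor for fully developed laminar pipe flow."""
    return 64.0 / re


# Two hypothetical operating points with very different raw parameters
# (SI units) that collapse onto the same value of the engineered feature.
water = reynolds_number(density=1000.0, velocity=0.01, diameter=0.1,
                        viscosity=1.0e-3)
oil = reynolds_number(density=850.0, velocity=0.5, diameter=0.2,
                      viscosity=0.085)
print(water, oil)

# In log space the relation is linear: log f = log 64 - log Re.
for re in (100.0, 500.0, 1000.0, 2000.0):
    f = laminar_friction_factor(re)
    print(f"log10(Re)={math.log10(re):.2f}  log10(f)={math.log10(f):.2f}")
```

Instead of learning a surface over four raw inputs, a model trained on this single feature only has to learn a line, which is why far fewer data points are needed.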

Feature Engineering: A Radical Approach?

Although relatively novel as an application of hybrid AI, feature engineering is not a new field in physics-based modeling. PBM already comes with sensor values that are not measured directly. The virtual sensor data needed for ML is often created by combining existing sensor data, physics-based models, and additional information (often information that provides context).

After creating virtual sensor data, sensor values can be transformed using physics-based data modeling to create more specific underlying functions. Therefore, feature engineering is a new way to apply the tried and tested physics-based systems. Applying machine learning extends the abilities of PBM to allow real-time industrial data processing and process optimization.
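
To make the idea of a virtual sensor concrete, the Python sketch below derives a flow rate that is not measured directly from a measured differential pressure, a known fluid density, and a simplified orifice equation. The function name, the default discharge coefficient, and the sample values are assumptions for illustration, and the formula deliberately omits corrections that a real flow calculation would include.

```python
import math


def virtual_flow_rate(dp_pa, density, orifice_diameter, discharge_coeff=0.61):
    """Estimate volumetric flow (m^3/s) from differential pressure (Pa).

    Simplified orifice relation Q = Cd * A * sqrt(2 * dp / rho).
    Negative pressure readings are clamped to zero flow.
    """
    area = math.pi * orifice_diameter ** 2 / 4.0
    return discharge_coeff * area * math.sqrt(2.0 * max(dp_pa, 0.0) / density)


# Hypothetical readings: measured pressure drop plus contextual information
# (fluid density, orifice size) yield a value no physical sensor reports.
measured_dp = 10000.0      # Pa, from an existing pressure sensor
fluid_density = 850.0      # kg/m^3, contextual information
print(virtual_flow_rate(measured_dp, fluid_density, orifice_diameter=0.05))
```

The resulting series can be stored alongside the physical sensor data and used as an additional input feature for an ML model.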

This is how O&G industries and related process manufacturers can apply hybrid AI to improve safety, performance, and productivity by leveraging data science. To see it in action, learn more about how Aker BP used hybrid AI/ML to optimize production.

How to Make Hybrid AI Work for Any Industry

Even though the principles outlined above have been applied to the O&G industry, hybrid AI can be used for any process-based manufacturing system. Process industries have the same principles as the O&G industry – principles of chemistry and physics govern them, and they have similar equipment and sensors.

Hybrid AI can help these industries to improve their outputs according to various use cases. For instance, in cement manufacture, product quality is one of the most critical parameters, but it takes weeks before the quality check results come in. By this time, the batch has shipped to market and, if substandard, would have to be recalled.

To solve this challenge, a soft sensor can be deployed just before distribution – while the sensor may not ascertain quality, it can check parameters that indicate whether the product quality is acceptable or should be investigated further. Where the latter result returns, the batch can be held back, awaiting proper quality checks.
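
The release-or-hold decision described above could look something like the following Python sketch. It is a deliberately minimal stand-in for a soft sensor: the proxy parameter names and their acceptable ranges are hypothetical placeholders, not real cement specifications, and a production soft sensor would more likely be a trained model than a set of fixed ranges.

```python
# Hypothetical acceptable ranges for proxy measurements. Real limits would
# come from process engineers and laboratory data.
ACCEPTABLE_RANGES = {
    "kiln_temperature_c": (1400.0, 1500.0),
    "fineness_m2_per_kg": (300.0, 400.0),
    "moisture_pct": (0.0, 1.0),
}


def soft_sensor_check(readings, ranges=ACCEPTABLE_RANGES):
    """Return ('release', []) or ('hold', [flagged proxy names]).

    A proxy is flagged when its reading is missing or outside its range.
    """
    flagged = []
    for name, (low, high) in ranges.items():
        value = readings.get(name)
        if value is None or not (low <= value <= high):
            flagged.append(name)
    return ("hold", flagged) if flagged else ("release", [])


batch = {"kiln_temperature_c": 1450.0,
         "fineness_m2_per_kg": 350.0,
         "moisture_pct": 1.8}
print(soft_sensor_check(batch))
```

Returning the flagged parameters along with the decision keeps the result interpretable, so an engineer can see why a batch was held back.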

Prerequisites for Deploying Hybrid AI in Process Manufacturing

It can be challenging to know how best to deploy hybrid AI solutions at scale, given that every business/industry has specific and discrete requirements and use cases. Regardless, the following are the essential prerequisites for successful industrial hybrid AI application:

- An understanding of the problem or use case – most engineers or line managers have specific details and in-depth knowledge of the use case

- Access to tools for feature engineering – some tools are simple and can be implemented by your internal IT department, while others require professional intervention

- Access to physics simulators – choose whether to implement an on-premise solution or enlist Simulator-as-a-Service from a reputable vendor. The advantages of as-a-Service arrangements include reducing initial expenditure and the ability to deploy immediately and at scale.

The accuracy of a hybrid AI model depends on the use case or problem it is solving, as well as on sensor accuracy and the amount of available data. Complex or sophisticated use cases make it harder to meet a tight accuracy threshold.


Conclusion

Hybrid AI has stepped in to solve many of the challenges arising from Industry 4.0 in the O&G industry, most notably the fact that you cannot apply pure-AI principles to critical systems. It provides the best of both worlds, physics-based modeling and machine learning, to create systems that are more robust and better optimized than either approach could deliver alone.

However, your hybrid AI deployment is only as good as your partners in the journey. At Cognite, we provide a host of services towards heavy asset industrial digitalization, including data contextualization, digital twinning, and other industry application suites.

Cognite is where data science and software meet industrial expertise. Contact us to get started with your Industry 4.0 journey.
