Forget hybrid cloud; hybrid analytics is the new black.
Until recently, purely data-driven artificial intelligence (AI), most notably machine learning, has been regarded as the most attractive technology for unlocking new value from data across industries, including in the digital twins deployed by heavy-asset industries such as oil and gas (O&G). Physics-based modeling, more established though much less hyped, has rarely enjoyed the spotlight in recent years.
Because of AI’s inherent ‘black box’ nature, however, pure AI-based approaches are failing to gain full acceptance in field operations, whose culture is rooted in the engineering sciences and has zero risk tolerance for critical systems. In addition, mounting empirical evidence from hundreds of proofs of concept run by O&G industry leaders with promising AI startups is undercutting the claim that AI alone can solve production optimization and predictive maintenance use cases as readily as advertised.
This more informed reality of AI in industry is driving the future of hybrid machine learning, a blend of physics and AI analytics that combines the ‘glass box’ interpretability and robust mathematical foundation of physics-based modeling with the scalability and pattern recognition capabilities of AI.
Both physics-based models and machine learning (the most common form of AI application) can be used to make predictions about the future. So which one should be used for what, and when is a hybrid the best solution? The answer depends on the problem you are trying to solve, and the problem classes fall mainly into two categories.
- Systems with lots of experimental data about historical behavior, but no theoretical knowledge framework
- Systems with a good mathematical theory framework in place (commonly matched with equally robust empirical behavior data). One advantage of a physics simulator is that it can predict with a certain confidence even when no historical data exist, which means it works from ‘first oil’ and even during the design phase. Historical data is then used to increase accuracy and estimate uncertainty.
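The second category can be illustrated with a minimal sketch: a first-principles model makes a prediction before any field data exists, and a handful of later measurements calibrate one of its parameters. The Darcy-Weisbach pipe model, the pipe dimensions, the design friction factor, and the three "measurements" are all illustrative assumptions chosen for this example, not values from any real asset.

```python
# Sketch: predict from physics alone, then calibrate with limited data.
# All numbers are hypothetical and chosen only for illustration.

def pressure_drop(velocity, friction_factor, length=1000.0, diameter=0.2, density=850.0):
    """Darcy-Weisbach pressure drop in Pa for a straight pipe."""
    return friction_factor * (length / diameter) * 0.5 * density * velocity ** 2


def calibrate_friction_factor(velocities, measured_drops):
    """Least-squares friction factor (closed form: the model is linear in it)."""
    # Model output per unit friction factor at each measured velocity.
    unit = [pressure_drop(v, 1.0) for v in velocities]
    return sum(u * y for u, y in zip(unit, measured_drops)) / sum(u * u for u in unit)


# Design phase: no historical data, so start from an assumed design value.
design_f = 0.020
predicted = pressure_drop(2.0, design_f)

# After start-up: three hypothetical measurements refine the parameter.
velocities = [1.0, 1.5, 2.0]              # m/s
measured = [46_000.0, 104_000.0, 185_000.0]  # Pa
calibrated_f = calibrate_friction_factor(velocities, measured)

print(f"design-phase prediction at 2 m/s: {predicted:.0f} Pa")
print(f"calibrated friction factor: {calibrated_f:.4f}")
```

The point of the sketch is the order of operations: the model is usable on day one, and the data only adjusts it. The spread of the residuals after calibration would be one simple way to start estimating uncertainty.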
For systems in the first category, a physics-based model is not possible because we cannot formulate a robust mathematical model to describe the system. Machine learning does not suffer from the same limitation. In fact, AI’s ‘black box’ nature becomes an advantage here: machine learning can be used even in such scenarios, provided enough contextualized training data is available. With this condition met, a machine learning model should be able to learn the underlying patterns between the system and its outcomes, and ultimately make predictions.
Two caveats remain, however. The first is the questionable confidence level of the resulting predictions (i.e., the precision and recall challenge), which can render an otherwise functioning AI approach unfit for many critical manufacturing processes. The second is the frequent absence of a training sample of true failures in critical systems, since traditional scheduled equipment maintenance is designed above all else to prevent such costly failures.
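To make the precision and recall challenge concrete, here is a small sketch with made-up numbers: 1,000 operating windows of which only 10 contain a true failure. The counts of caught failures and false alarms are assumptions for illustration only.

```python
# Sketch: why rare failures make headline accuracy misleading.
# The label counts below are hypothetical.

def precision_recall(actual, predicted):
    """Return (precision, recall) for boolean failure labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# 10 true failures followed by 990 normal windows.
actual = [True] * 10 + [False] * 990
# The model catches 6 of the 10 failures and raises 14 false alarms.
predicted = [True] * 6 + [False] * 4 + [True] * 14 + [False] * 976

precision, recall = precision_recall(actual, predicted)
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

print(f"accuracy={accuracy:.3f} precision={precision:.2f} recall={recall:.2f}")
```

In this constructed case the model is right 98.2% of the time, yet only 30% of its alarms are real and it misses 40% of the failures, which is the kind of gap that matters for critical equipment.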
For systems in the second category, a physics-based model can offer a good solution. Physics-based modeling is tried, tested, and validated for even the most critical of simulations, such as space flight orbits, but it too has limitations. The most notable is the computational cost of running physics-based models continuously in runtime environments with live data, especially across computationally heavy IoT use cases. It is here that hybrid machine learning offers an attractive solution.
First, describing the system in detail using a physics-based model produces physically accurate, rich, and fully interpretable synthetic data, such as virtual sensor data and equipment breakpoint data. This data is then used to train a machine learning model for subsequent analysis of live operational data in predictive maintenance and production optimization use cases. This leverages the fact that once a machine learning model is trained, using it to make predictions on new data, even large, high-velocity data, is very efficient.
Second, to give the production algorithm its cognitive edge, such hybrid machine learning models are supervised by subject matter experts so that they truly understand (hence the term ‘cognitive’) the physical boundary conditions of the systems. This greatly enhances the algorithm’s ability to produce meaningful outcomes.
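The two steps above can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production recipe: the "physics model" is the implicit Colebrook equation for pipe friction (solved iteratively, standing in for an expensive simulator), the "machine learning model" is the simplest possible stand-in (a log-log linear regression), and the expert-supplied boundary condition is reduced to a validated Reynolds number range. The names `colebrook_friction`, `fit_power_law`, and `Surrogate` are hypothetical.

```python
import math

# Sketch: physics model -> synthetic data -> cheap trained surrogate.

def colebrook_friction(reynolds, rel_roughness=0.0, iterations=50):
    """Physics model: solve the implicit Colebrook equation by fixed-point iteration."""
    x = 7.0  # initial guess for 1/sqrt(f)
    for _ in range(iterations):
        x = -2.0 * math.log10(rel_roughness / 3.7 + 2.51 * x / reynolds)
    return 1.0 / (x * x)


def fit_power_law(reynolds_values, friction_values):
    """Ordinary least squares on log(f) = intercept + slope * log(Re)."""
    xs = [math.log(r) for r in reynolds_values]
    ys = [math.log(f) for f in friction_values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope


class Surrogate:
    """Trained model that refuses inputs outside its validated range."""

    def __init__(self, intercept, slope, re_min, re_max):
        self.intercept, self.slope = intercept, slope
        self.re_min, self.re_max = re_min, re_max

    def predict(self, reynolds):
        # Stand-in for an expert-defined boundary condition: no extrapolation.
        if not self.re_min <= reynolds <= self.re_max:
            raise ValueError("Reynolds number outside the validated range")
        return math.exp(self.intercept + self.slope * math.log(reynolds))


# Step 1: generate synthetic training data from the physics model.
train_re = [10 ** (4 + i / 49) for i in range(50)]  # 1e4 .. 1e5, log-spaced
train_f = [colebrook_friction(re) for re in train_re]

# Step 2: train the surrogate, then use it on "live" inputs.
intercept, slope = fit_power_law(train_re, train_f)
surrogate = Surrogate(intercept, slope, min(train_re), max(train_re))
print(surrogate.predict(3.0e4), colebrook_friction(3.0e4))
```

The surrogate replaces an iterative solve with a single closed-form evaluation, which is where the runtime saving would come from at scale. The range check is a deliberately crude design choice: it trades coverage for the guarantee that the model is never asked about conditions the synthetic data did not cover.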
To conclude, hybrid models are best suited for complex industrial process problems where a mathematical theory framework exists and can be used to teach a machine learning model, which is then used on real-time data for predictions. The result is a high-confidence, tailored hybrid model that combines strong domain knowledge (physics) with machine learning for cost efficiency and scalability. Hybrid analytics is showing great potential, especially in the proliferating space of digital twins.
Physics-based modeling
- Tried, tested, and proven across industries, even in the most critical applications.
- The uncertainty in the models has been extensively studied and can be taken into account during design and operation.
- Requires a good mathematical theory framework describing the system.
- Can be calibrated and validated with limited experimental data sets.
- Can predict outside the range of the existing data.
- Requires a complete set of boundary conditions for the mathematical equations, in addition to information like geometry and other fluid/material properties. This is not always available.
- Commercial simulators are often very expensive.
- Computationally expensive to run at scale in live IoT environments.
- Requires extensive subject matter expertise.
- Difficult to scale across fleets of assets: boundary conditions, geometry, and fluid/material properties must be changed for each asset, although the mathematical models themselves carry over. The approach may not scale in terms of the time needed to set up a new asset, but it often scales well in terms of accuracy across different assets.
- Can predict future events.
Machine learning
- Requires a large contextualized teaching data set, including a critical mass of failure events.
- Does not require information like geometry or fluid/material properties, and can work with a much smaller set of sensors (at some cost to accuracy).
- After initial model training, cost effective to run with real-time streaming data.
- Requires data science expertise.
- Scales very well across fleets of assets, with caveats: a trained model can only be transferred to very similar assets. A new asset may have different sets of sensors, or sensors in different locations, requiring retraining. A new asset may also be dominated by different physical phenomena that require different sensors for reliable predictions, so the physics may not scale.
- Can predict future events (assuming the future event is inside the training set).
- Prediction logic is not interpretable (‘black box’).
Hybrid analytics
- Combines physics-based modeling and machine learning.
- Highly suitable for industrial systems analysts across many scenarios.
- Requires both subject matter and data science expertise.
- Offers semi-interpretable prediction logic.
- Cost efficient compared to pure physics-based modeling in production and at scale across fleets of assets.
- Can be applied before any historical data exist (first oil).