Machine learning for engineers from time-series data

When data scientists are hard to come by, companies can find themselves at a disadvantage despite all the operational data at their fingertips. Four steps for getting started with machine learning are highlighted.
By Sanket Amberkar May 21, 2018
Figure 1: Operational machine learning addresses shortcomings found in process control and analytics. Courtesy: Falkonry

Increased access to machine and operational data, the proliferation of two-way communication, and lower costs for computing, connectivity, and storage have set the stage for operational improvements, including in the oil & gas industry. McKinsey & Company predicts that the use of analytics will lead to a 20% rise in operational productivity.

While companies are rich in operational data, existing systems can’t analyze such massive amounts of it. Traditional approaches such as regression models, statistical process control (SPC), and optimization can’t leverage multivariate trends to uncover operational insights.
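A minimal sketch of the limitation: when each signal is checked only against its own control limits, a reading can pass even though the combination of values is abnormal. The signal names and numbers below are hypothetical, and the straight-line fit is only a simple stand-in for looking at two signals jointly, not the technique discussed in this article.

```python
import statistics

def control_limits(values, k=3.0):
    """Return (lower, upper) limits at mean +/- k standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return mean - k * sd, mean + k * sd

def fit_line(x, y):
    """Least-squares slope and intercept for y as a function of x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def within(limits, value):
    return limits[0] <= value <= limits[1]

# Hypothetical baseline: pump flow and motor current normally rise together.
flow = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0]
current = [20.1, 21.9, 24.2, 25.8, 28.1, 29.9, 32.0, 34.1, 35.9, 38.0]

# Per-signal limits, as a single-variable control chart would use.
flow_limits = control_limits(flow)
current_limits = control_limits(current)

# Joint view: limits on how far current strays from what flow implies.
slope, intercept = fit_line(flow, current)
residuals = [c - (slope * f + intercept) for f, c in zip(flow, current)]
residual_limits = control_limits(residuals)

# A new reading: low flow combined with high current.
new_flow, new_current = 11.0, 36.0
univariate_ok = within(flow_limits, new_flow) and within(current_limits, new_current)
joint_ok = within(residual_limits, new_current - (slope * new_flow + intercept))

print("passes per-signal limits:", univariate_ok)
print("passes joint check:", joint_ok)
```

Each value on its own looks ordinary; only the relationship between the two signals reveals the problem.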

Operational machine learning addresses these shortcomings. It focuses on Big Data analytics for business operations but applies specialized machine learning approaches to meet the needs of industrial operations.

Time-series data generated in industrial operations is rich in insights into the health of production equipment. Nearly all operational systems—including control systems, industrial equipment, data centers, and sensing devices—produce time-series streams or bursts, whether from sensor readings, log entries, or activity traces.

The problem with most industrial machine learning solutions today is that as platforms they require data scientists on staff to prep the data, model the system, interpret results, and make operational recommendations. This poses two challenges:

1. Platforms get in the way of existing workflows and digital initiatives, disrupting operations and needing resources, proprietary instrumentation, and time to deploy.

2. Data scientists are expensive and hard to come by, and even when available, often are not operational experts.

Industrial subject matter experts (SMEs), on the other hand, understand operational processes and best practices. Engineers have detailed knowledge of how to operate equipment, execute maintenance, and ensure safety. However, SMEs often lack the time needed to implement complex solutions.

Machine learning for engineers

Figure 2: Self-service machine learning for engineers makes it easier to implement and use. Courtesy: Falkonry

This challenge is addressed by operational machine learning systems ready for use by industrial SMEs. SMEs can feed existing multivariate time-series data generated by operations into the machine learning system and, based on its outputs, determine a course of action. The approach is effective because:

  • SMEs retain both the control and visibility needed to drive improvements.
  • SMEs are in the best position to determine the corrective action to take.
  • Identifying the use case and verifying the results within an operations team delivers quick time to value.

An example of such an approach is shown in Figure 2. Time-series data can come from a variety of sources and be provided as a real-time data stream to the machine learning system, which in turn analyzes the data to make predictions about the equipment, system, or process state based on conditions the SMEs have trained it to find.
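The flow described above can be sketched in a few lines, under loud assumptions: the two signals (temperature and vibration), the condition names, and the reference values are all hypothetical, and a nearest-match rule stands in for whatever model a real system would use. Readings arrive one at a time, the most recent few are summarized, and each summary is matched to the closest condition an SME has labeled.

```python
from collections import deque
import math

# Hypothetical conditions labeled by an SME, each described by the typical
# (mean temperature, mean vibration) seen over a window of readings.
LABELED_CONDITIONS = {
    "normal": (60.0, 0.2),
    "bearing wear": (75.0, 0.9),
}

def window_features(window):
    """Summarize a window of (temperature, vibration) readings by their means."""
    n = len(window)
    return (sum(r[0] for r in window) / n, sum(r[1] for r in window) / n)

def nearest_condition(features):
    """Pick the labeled condition closest to the window summary.
    Raw units are compared directly here; real signals would need scaling."""
    return min(LABELED_CONDITIONS,
               key=lambda name: math.dist(features, LABELED_CONDITIONS[name]))

def predict_stream(readings, window_size=3):
    """Yield one predicted condition per full sliding window of the stream."""
    window = deque(maxlen=window_size)
    for reading in readings:
        window.append(reading)
        if len(window) == window_size:
            yield nearest_condition(window_features(window))

# Hypothetical stream of (temperature, vibration) readings.
stream = [(60, 0.2), (61, 0.2), (59, 0.3), (70, 0.7), (76, 0.9), (75, 1.0)]
predictions = list(predict_stream(stream))
print(predictions)
```

Because the generator consumes readings as they arrive, the same loop works on a live feed or on historical data replayed from a historian.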

What makes this approach unique, and why it is referred to as a “data scientist in a box,” is the set of three key capabilities built into the system:

1. Unsupervised feature learning: Insights in time-series data come from multivariate trends or patterns, reviewed forensically to understand past system behavior. Patterns are often hard to describe and must be learned based on signals across data sources. Traditional analytics are unable to effectively learn or recognize time-series patterns. Feature learning autonomously discovers patterns hidden in time-series data, including patterns that go undetected by human observation. Once the patterns are discovered, machine learning can be applied.

2. Machine learning and predictions: This approach doesn’t just recognize defined patterns, but identifies new ones, correlating the patterns to operational events. From that point, an industrial SME can label those that correspond to known conditions such as a fault or downtime using the system’s log data. An SME can also select a consistent pattern that precedes the fault condition and label it as a precursor event (see Figure 3). The precursor event then serves as an early warning that the fault condition is likely to follow.

3. Explanation: Most artificial intelligence and deep learning techniques can’t explain how they derive a prediction. Data scientists must interpret the model and resulting analyses. In comparison, the “data scientist in a box” approach determines which data signals the prediction requires and how much each signal contributed. This is a valuable insight for the sake of credibility and for guidance in determining root cause.
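The three capabilities can be illustrated with a toy sketch. The article does not disclose the algorithms inside the system, so everything here is an assumption: plain k-means clustering stands in for unsupervised pattern discovery, the signal names and readings are hypothetical, and each signal's share of the squared distance from the normal pattern is a crude proxy for signal contribution.

```python
import math

def window_means(rows, size):
    """Summarize each non-overlapping window by the mean of every signal."""
    return [
        tuple(sum(col) / size for col in zip(*rows[i:i + size]))
        for i in range(0, len(rows) - size + 1, size)
    ]

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iterations=20):
    """Plain k-means with farthest-point seeding so results are repeatable."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iterations):
        assign = [nearest(p, centroids) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assign) if a == i]
            if members:
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return [nearest(p, centroids) for p in points], centroids

SIGNALS = ("pressure", "vibration")  # hypothetical signal names
rows = [  # hypothetical readings, two per window
    (5.0, 0.10), (5.1, 0.12), (5.0, 0.11), (4.9, 0.10),
    (5.1, 0.50), (5.0, 0.54), (5.0, 0.52), (5.1, 0.50),
    (2.0, 1.50), (1.9, 1.60), (5.0, 0.10), (5.0, 0.12),
]

# 1. Pattern discovery: group windows that behave alike, without labels.
features = window_means(rows, size=2)
assign, centroids = kmeans(features, k=3)

# 2. Labeling: the SME knows window 0 was normal and the log shows a fault
#    in window 4; the pattern seen just before the fault becomes a precursor.
labels = {assign[0]: "normal", assign[4]: "fault"}
labels.setdefault(assign[3], "fault precursor")
named = [labels.get(a, "unlabeled") for a in assign]

# 3. Explanation: how much each signal separates window 3 from normal.
baseline = centroids[assign[0]]
squared = [(x - b) ** 2 for x, b in zip(features[3], baseline)]
shares = {name: s / sum(squared) for name, s in zip(SIGNALS, squared)}

print(named)
print(shares)
```

In this toy data the precursor pattern appears two windows before the fault, and nearly all of its deviation from normal comes from vibration, which is the kind of pointer toward root cause that the explanation capability is meant to provide.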

Applications across industries

Figure 3: Machine learning enabled pattern discovery and early warning. Courtesy: Falkonry

Since machine learning discovers patterns in available time-series data, it doesn’t require developing mathematical models of physical systems, nor does it require suppliers to have deep industrial domain expertise. In process operations, consistent, on-spec quality with profitable yields requires insight into real-time process conditions, a step beyond merely maintaining equipment. The time-series data controlling, or produced by, these processes defines the process health, estimates quality output, and reflects process yield. Machine learning delivers the insight to optimize outcomes and avoid machine failures.

Dealing with multivariate time-series data seems complex, but machine learning addresses much of that complexity. With “ready to use” operational machine learning, historical data can be selected, patterns discovered, and models built and verified, eliminating the need for a data scientist or third-party consultants and thus significantly reducing costs. Putting machine learning in SMEs’ hands delivers quick time to value, from identifying use cases to verifying results, because the process resides within the operations team. This approach delivers substantial improvements in asset performance, throughput, operator safety, and product quality.

Four steps for getting started with machine learning

1. Identify a use case and ask:

  • What is the specific problem you are trying to solve?
  • What approaches have you tried to solve this problem?
  • What is the cost of doing nothing?

2. Assess your readiness by asking:

  • Do you collect and archive time-series data?
  • Do you have a data historian?
  • Have you prioritized your signals or formed a hypothesis on where to look?
  • Have you defined evaluation criteria for success?

3. Implement a pilot by:

  • Providing a dataset for evaluation (for training the model)
  • Taking the product training
  • Labeling the known events identified by the model.

4. Operationalize the pilot results by:

  • Validating the precursor events and alerts identified by the model
  • Assessing how reliably the models predict events and provide timely alerts
  • Incorporating predictive analytics into work processes and organizational plans.
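One way to assess reliability and timeliness during step 4 is to compare the alert history against the event log. The sketch below is an assumption, not a prescribed method: the timestamps are hypothetical, an alert counts as useful only if an event follows within a chosen horizon, and the function and metric names are illustrative.

```python
from datetime import datetime, timedelta

def evaluate_alerts(alerts, events, horizon):
    """Score alerts against events.
    An event is detected if at least one alert fired within `horizon` before it.
    Lead time is measured from the earliest such alert to the event."""
    lead_times = []
    useful_alerts = set()
    for event in events:
        candidates = [a for a in alerts if timedelta(0) <= event - a <= horizon]
        if candidates:
            lead_times.append(event - min(candidates))
            useful_alerts.update(candidates)
    detected = len(lead_times)
    return {
        # Share of events that were preceded by an alert.
        "detection_rate": detected / len(events) if events else None,
        # Share of alerts that were followed by an event.
        "useful_alert_rate": len(useful_alerts) / len(alerts) if alerts else None,
        "mean_lead_time": sum(lead_times, timedelta(0)) / detected if detected else None,
    }

# Hypothetical alert and event timestamps.
alerts = [
    datetime(2018, 5, 1, 8, 0),
    datetime(2018, 5, 3, 10, 0),   # no event followed this alert
    datetime(2018, 5, 7, 6, 0),
]
events = [
    datetime(2018, 5, 1, 14, 0),
    datetime(2018, 5, 7, 9, 0),
    datetime(2018, 5, 10, 12, 0),  # no alert preceded this event
]

report = evaluate_alerts(alerts, events, horizon=timedelta(hours=12))
print(report)
```

The horizon is a judgment call for the operations team: it should reflect how much warning is needed to act, which ties back to the evaluation criteria defined in step 2.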

Sanket Amberkar is senior vice president at Falkonry.
