Data Analytics Done Right in Oil and Gas

May 31, 2017 | Hector Klie

Data Analytics is a hot topic nowadays, but many companies struggle to implement a stable and productive platform to quickly respond to the needs of the Oil and Gas industry. In my opinion, we can address this problem by effectively executing on the following three components:

  1. Incubating a Data Analytics team
  2. Scaling the application of Data Analytics
  3. Building a data driven culture

Traditionally, Domain Experts such as Engineers and Geoscientists have received support from two specialized groups:

At the same time, there may be a few Domain Experts who are strong at mathematics or programming and may be able to work on R&D or Tech Support.

However, due to several technological trends and increasing competition, Oil and Gas companies are relying more on specialists who can help make sense of vast amounts of data in order to make decisions faster and more accurately. From this need, the Data Science role has been adopted by this industry. Data Science emerged from a combination of developments in Mathematics, Statistics, and Computer Science. In the Oil and Gas industry, Data Scientists are often expected to have a good grasp of the business; otherwise, they might fall into a third group of Machine Learning Experts. In the Venn diagram shown below (Fig. 1), I attempt to illustrate how these roles relate to each other.

Figure 1: An adaptation of Shelly Palmer’s Data Science Venn diagram.

This is not to say that Data Scientists are unicorns that can fill the shoes of Computer Scientists, Mathematicians or Statisticians, and Domain Experts. Instead, typically a Data Scientist in the Oil and Gas industry will master one or two of these domains, with a basic understanding of another domain. Nevertheless, talented Data Scientists are in high demand and low supply, and to some they may be perceived as unicorns.

Incubating a Data Analytics Team

Given how difficult it is to find talented Data Scientists, it is often more feasible to gather a group of specialists that can work together towards the goal of solving a wide range of data driven problems. Below I break down the steps to incubate an exemplary Data Analytics team:

Scaling the Application of Data Analytics

Data Scientists are heavily dependent on the availability of large volumes of quality data. However, to facilitate access to these data, a robust data platform is needed. This platform is generally referred to as Big Data, and building such a platform is not a trivial task. This is especially true in the Oil and Gas industry, where data tend to be far more fragmented, diverse, unstandardized, and difficult to access than in other industries.

Additionally, Data Scientists are primarily focused on developing fast prototypes to prove out a model. Once the model has been validated, a significant amount of work is needed to convert the prototype into a production ready solution. However, we often observe companies misusing the skills of a Data Scientist or members of the Tech Support Staff, which often results in the release of an unfinished product. From my experience, this is a critical gap in the Oil and Gas industry that may be addressed by understanding better how the technology industry is able to build sophisticated Data Analytics solutions.

Technology companies such as Google, Microsoft, Amazon, and others typically have large teams of Software Engineers with a few Data Scientists supporting the research and development of their products. Coincidentally, this is the industry where most Machine Learning experts are found. These individuals work closely together through an iterative process called Agile. Below I describe further how these roles collaborate (Fig. 2).

Figure 2: Data Analytics in Tech Companies.

The roles depicted in the figure above can be broken down as follows:

My recommendation for Oil and Gas companies would be to build Software Engineering teams inspired by successful tech companies. This may require a significant change in culture and skillsets from existing Tech Support or IT Specialist groups. However, if implemented successfully, it would significantly accelerate a Data Scientist’s ability to develop innovative data driven solutions.

Building a Data Driven Culture

In the tech industry, the Agile methodology is widely used to help increase collaboration and productivity between individuals of diverse backgrounds. In the context of Oil and Gas, my theory is that as cross-functional teams work together following Agile practices, we will naturally see:

As Data Analytics starts playing a more important role in the Oil and Gas industry, I predict a greater need for collaboration between Data Scientists, Software Engineers, and Domain Experts. As a result, we may see our earlier Venn diagram evolve into the figure below (Fig. 3).

Figure 3: The exemplary Data Analytics team in Oil and Gas.

I believe the combination of these three roles is essential to forming a successful Data Analytics team in this industry. Assuming the right talent is in place and these individuals collaborate closely, we may see a new breed of AI specialists emerge from this intersection.

The Data Analytics Funnel

Putting everything into perspective, we can think about how a business would be impacted when the three components introduced in the beginning are effectively executed. I will attempt to illustrate this in what I call the Data Funnel, which describes the degradation of information across a data workflow from the source to the point of action.

The Oil and Gas industry typically has a narrow but long funnel. This means that there is a significant delay in the response time (latency) and a limited amount of data (bandwidth) that can be processed at a given time. The illustration below attempts to summarize this idea (Fig. 4).

Figure 4: The manual data funnel.
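
To make the latency and bandwidth idea concrete, here is a minimal sketch of the funnel as a chain of stages. It assumes that stages run one after another, so their delays add up, and that the slowest stage caps how much data gets through. The `Stage` class, the stage names, and all numbers are hypothetical and chosen only for illustration; they are not measurements from any real workflow.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_days: float        # time one batch spends in this stage (illustrative)
    throughput_per_day: float  # items the stage can process per day (illustrative)


def funnel_latency(stages):
    """End-to-end latency: stages run in sequence, so delays add up."""
    return sum(s.latency_days for s in stages)


def funnel_bandwidth(stages):
    """End-to-end bandwidth: capped by the slowest stage in the chain."""
    return min(s.throughput_per_day for s in stages)


# Hypothetical manual funnel: long (high latency) and narrow (low bandwidth).
manual = [
    Stage("integration", latency_days=5.0, throughput_per_day=20.0),
    Stage("analysis", latency_days=10.0, throughput_per_day=4.0),
    Stage("decision", latency_days=3.0, throughput_per_day=2.0),
]

print(funnel_latency(manual))    # 18.0
print(funnel_bandwidth(manual))  # 2.0
```

Under these assumptions, speeding up a single stage shortens the funnel, but only widening the narrowest stage increases how much data reaches the point of action.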

An example of a typical manual data workflow might be broken down into three phases:

  1. Integration: A DBA (Database Administrator) will manually perform ETL (extract, transform, and load) to prepare data for consumption by Domain Experts. A lack of governance, poor integration with available data sources, and the inability to scale these data may severely hinder a DBA’s ability to prepare data for downstream consumption.
  2. Analysis: Domain Experts spend a significant amount of their valuable time analyzing and computing large amounts of inconsistent data using several disjointed commercial tools. Many factors such as human bias, noise, gaps in the data, and tool limitations can result in further degradation of information upon analysis.
  3. Decision: Strategists may be Domain Experts, Managers, or Executives that will carefully look through the analysis to evaluate a set of possible scenarios that may lead to the best business decision. However, a lack of data provided by upstream workflows may potentially lead to uninformed or inaccurate decisions.
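
The three phases above can be sketched as a small automated pipeline. This is only an illustration under stated assumptions: the record layout, the well names, the `integrate`, `analyze`, and `decide` functions, and the 50 bbl/day threshold are all hypothetical, and a real platform would read from actual data sources rather than an in-memory list.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from unstandardized sources.
RAW = [
    {"well": "A-1", "oil_bbl_per_day": "120"},
    {"well": "A-1", "oil_bbl_per_day": "100"},
    {"well": "B-7", "oil_bbl_per_day": None},   # gap in the data
    {"well": "B-7", "oil_bbl_per_day": "40"},
    {"well": "b-7 ", "oil_bbl_per_day": "20"},  # inconsistent naming
]


def integrate(raw):
    """Integration (ETL): normalize well names, convert types, drop gaps."""
    clean = []
    for rec in raw:
        if rec["oil_bbl_per_day"] is None:
            continue
        clean.append({
            "well": rec["well"].strip().upper(),
            "rate": float(rec["oil_bbl_per_day"]),
        })
    return clean


def analyze(clean):
    """Analysis: compute the average production rate per well."""
    by_well = {}
    for rec in clean:
        by_well.setdefault(rec["well"], []).append(rec["rate"])
    return {well: mean(rates) for well, rates in by_well.items()}


def decide(summary, threshold=50.0):
    """Decision: flag wells whose average rate falls below the threshold."""
    return sorted(well for well, rate in summary.items() if rate < threshold)


flagged = decide(analyze(integrate(RAW)))
print(flagged)  # ['B-7']
```

Each function maps to one phase, so the steps that are repeated by hand in the manual workflow become a single repeatable call chain.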

In summary, even though this process has been effective in the past, as competition increases and resources become scarcer, it is difficult to scale such a model to take advantage of all potential opportunities. In contrast, a well-established Data Analytics team would be able to build a far more scalable Data Analytics platform that enables the business to automatically process significantly larger volumes of data at a much faster rate (see Fig. 5). The platform also facilitates a more robust feedback loop where the outcome of actions can be relayed back into the data source.

Figure 5: The automated data analytics funnel.

In this ideal scenario, a Data Analytics Platform would act as a solid bridge between sources and actions, thereby shortening the latency and increasing the bandwidth of the Data Funnel. In the diagram above, you will also notice a new set of opportunities (highlighted in red) that are unlocked by such a platform.

In other words, the Data Analytics Platform would provide stronger integration with all available data sources and execute advanced analytics techniques to generate actionable business recommendations through dashboards or BI visualization tools. This would allow Strategists, Domain Experts, and Field Operators to respond swiftly and proactively to multiple business needs.

Ultimately, the purpose of this model is not to build solutions that will assist humans to perform monotonous tasks, but instead to train machines to do these tasks for humans so they can focus on more challenging and creative problems.