I started my career as a field-based completions engineer with a small independent operator that, like many of its peers, was in “lease grab” mode. Welcome to the frigid boomtown of Williston, ND circa 2011. The preparation process for frac’ing these early wells included running a CBL and pumping a Halliburton-style DFIT to get an estimate of the initial reservoir pressure. Having recently graduated with a Master’s focused on reservoir engineering, I could foresee this data being used as the starting point for a reservoir simulation. Unfortunately, my dreams of creating a Bakken/Three Forks reservoir simulation model that could improve our drilling unit economics didn’t come to pass due to the complexities of modeling a tight reservoir.
As an industry we have struggled to get accurate pre-drill production predictions using reservoir simulation in unconventional assets. This is especially true when we add more wells, which increases the complexity of the fracture network we’re trying to describe. Not to be deterred, we’ve looked to multiple methods and technologies to help us fill in the gaps. For example, Rate Transient Analysis and Pressure Transient Analysis have helped quantify the conductivity of the fracture network. We’ve also done some science projects using technologies like fiber and micro-seismic to beef up our geomechanical knowledge and pinpoint where our proppant is going. The insights that these technologies provide go beyond understanding the creation of the fracture network, and in the case of fiber, can even tell us how the well is producing. While these technologies are informative, they are also expensive and operationally challenging. Additionally, better understanding of proppant placement and geomechanics alone does not improve our ability to accurately model fracture behavior.
A sizeable portion of current unconventional reservoir analysis takes place in tools like TIBCO Spotfire®, which allow engineers to quickly slice and dice their data to measure correlations. The ability to quickly build a wide range of interrelated maps and charts has helped us develop a better analytical understanding of what’s driving well performance. These types of tools have been very beneficial for creating a mental model of how the reservoir has behaved and instrumental in conducting well-based or area-based assessments. While the outputs from these workflows are often the foundation of a modern reservoir engineer’s “gut instinct,” they aren’t a direct replacement for reservoir simulation because they don’t create a comprehensive equation and are not specifically predictive.
This gap in our ability to successfully use reservoir simulation as a predictive method in the unconventional plays has created a space where an empirical method like machine learning can be a successful alternative. Reservoir simulations start with physically derived equations. Machine learning starts with data and creates empirically derived equations. In machine learning we start with the data we’ve gathered and utilize an algorithm to learn the relationships between the variables we are subject to (such as geology), the variables we can “control” (such as our frac design and well spacing), and what we want to predict (such as oil, water, or gas production). An algorithm is capable of handling far more variables (also called “features”) than tools like Spotfire can visualize in a single graph, and the end result is a predictive model rather than a visualization. In the modeling world you’ll often hear people say “no model is perfect, but some models are useful.” By starting with the data and backing into an equation, we’ve found a useful workaround to the requirement of describing the complex fracture behavior to a physically derived model. As the popularity of this technology has grown, the industry is starting to create some very useful models which help with workflows spanning from pre-drill forecasts and completions designs to identifying undervalued acreage.
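To make the workflow concrete, here is a minimal sketch of the idea using scikit-learn. The feature names, coefficients, and data are entirely synthetic stand-ins (not from any real Bakken dataset): we generate a geologic variable we’re subject to (porosity) and design variables we control (proppant loading, well spacing), then let a gradient-boosted model learn the relationship to a production target.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_wells = 150  # roughly the well count cited above for a simple area

# Hypothetical features: geology we're subject to, design we control
porosity = rng.uniform(0.04, 0.10, n_wells)
proppant_per_ft = rng.uniform(500, 2500, n_wells)     # lb/ft
well_spacing_ft = rng.uniform(400, 1500, n_wells)

# Synthetic "truth": production responds to geology and design, plus noise
oil_12mo = (
    5e5 * porosity
    + 20 * proppant_per_ft
    + 15 * well_spacing_ft
    + rng.normal(0, 2000, n_wells)
)

X = np.column_stack([porosity, proppant_per_ft, well_spacing_ft])
X_train, X_test, y_train, y_test = train_test_split(
    X, oil_12mo, random_state=0
)

# The algorithm backs into an empirical equation from the data
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
print(f"Holdout R^2: {model.score(X_test, y_test):.2f}")
```

In practice the feature list would be far longer (geologic maps, frac design parameters, spacing and stacking geometry, DFIT-derived pressures), which is exactly where an algorithm outpaces what a single chart can visualize.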
Machine learning can be a very powerful tool, but it isn’t without its shortcomings. Most of these are workable, but it’s important to start with reasonable expectations. For one, extrapolating outside of the data is not advised, especially when using tree-based models that haven’t been ensembled with algorithms better suited for extrapolation. Machine learning is great for identifying an optimal design within the range of the data, but it’s generally a bad idea to push predictions beyond the bounds of that data (for a deeper perspective, see “Extrapolation Is Tough For Trees!”). Machine learning also requires a clean dataset, and, depending on the problem you’re trying to solve, it may require a large clean dataset. In areas that aren’t geologically complex we can typically build a useful model with around 150 wells that are close in proximity. As the system becomes more complex, the algorithms require more wells (or data points) to be successful. Ideally, we like to work with large proprietary datasets, but we have also had success building models with public data, or using public data as a supplement. Finally, an algorithm can converge on multiple results – I often liken it to an underdetermined system of equations. Some of the resulting models can have really nice error metrics, but the results don’t tie to physics. This is where subject matter expertise and that aforementioned “gut instinct” come into play and help with troubleshooting the model to create something physically meaningful.
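The extrapolation limitation of tree-based models is easy to demonstrate. A decision tree predicts the average of the training targets in a leaf, so outside the training range its prediction simply flattens at the edge of the data. This toy example (synthetic data, not from any well dataset) shows a tree trained on a linear trend over x in [0, 10]:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Train on a simple linear trend: y = 3x over x in [0, 10]
x_train = np.linspace(0, 10, 100).reshape(-1, 1)
y_train = 3.0 * x_train.ravel()

tree = DecisionTreeRegressor(random_state=0).fit(x_train, y_train)

# Inside the training range the tree tracks the trend...
print(tree.predict([[5.0]]))   # close to 15

# ...but outside it the prediction flattens at the data's edge:
# the tree returns a value near 30 (the largest training target),
# not the true trend value of 60
print(tree.predict([[20.0]]))
```

This is why, within the range of our data, a tree-based model can rank frac designs well, yet the same model should not be trusted to score a proppant loading or spacing no one has actually pumped or drilled.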
As our industry strives to push through the economic and operational challenges facing us, evolving our way of thinking is imperative; for me, at this time, that evolution is machine learning. The younger version of myself saw DFIT data as an important starting point for reservoir simulation, but these days I also see DFIT data (in large enough quantities) as a fantastic input to a machine learning model aimed at optimizing well spacing and frac design.