Machine Learning Guide for Oil and Gas Using Python

A Step-by-Step Breakdown with Data, Algorithms, Codes, and Applications

Hoss Belyadi
Obsertelligence, LLC
Alireza Haghighat
IHS Markit

Table of Contents

Cover image

Title page

Copyright

Biography

Acknowledgment

Chapter 1. Introduction to machine learning and Python

Introduction

Artificial intelligence

Data mining

Machine learning

Python crash course

Anaconda introduction

Anaconda installation

Jupyter Notebook interface options

Basic math operations

Assigning a variable name

Creating a string

Defining a list

Creating a nested list

Creating a dictionary

Creating a tuple

Creating a set

If statements

For loop

Nested loops

List comprehension

Defining a function

Introduction to pandas

Dropping rows or columns in a data frame

loc and iloc

Conditional selection

Pandas groupby

Pandas data frame concatenation

Pandas merging

Pandas joining

Pandas operation

Pandas lambda expressions

Dealing with missing values in pandas

Dropping NAs

Filling NAs

Numpy introduction

Random number generation using numpy

Numpy indexing and selection

Chapter 2. Data import and visualization

Data import and export using pandas

Data visualization

Chapter 3. Machine learning workflows and types

Introduction

Machine learning workflows

Machine learning types

Dimensionality reduction

Chapter 4. Unsupervised machine learning: clustering algorithms

Introduction to unsupervised machine learning

K-means clustering

Hierarchical clustering

Density-based spatial clustering of applications with noise (DBSCAN)

Important notes about clustering

Outlier detection

Local outlier factor using scikit-learn

Chapter 5. Supervised learning

Overview

Linear regression

Logistic regression

Metrics for classification model evaluation

Logistic regression using scikit-learn

K-nearest neighbor

Support vector machine

Decision tree

Random forest

Extra trees (extremely randomized trees)

Gradient boosting

Extreme gradient boosting

Adaptive gradient boosting

Frac intensity classification example

Handling missing data (imputation techniques)

Rate of penetration (ROP) optimization example

Chapter 6. Neural networks and Deep Learning

Introduction and basic architecture of neural network

Backpropagation technique

Data partitioning

Neural network applications in oil and gas industry

Example 1: estimated ultimate recovery prediction in shale reservoirs

Example 2: develop PVT correlation for crude oils

Deep learning

Convolutional neural network (CNN)

Convolution

Activation function

Pooling layer

Fully connected layers

Recurrent neural networks

Deep learning applications in oil and gas industry

Frac treating pressure prediction using LSTM

Chapter 7. Model evaluation

Evaluation metrics and scoring

Cross-validation

Grid search and model selection

Partial dependence plots

Size of training set

Save-load models

Chapter 8. Fuzzy logic

Classical set theory

Fuzzy set

Fuzzy inference system

Fuzzy C-means clustering

Chapter 9. Evolutionary optimization

Genetic algorithm

Particle swarm optimization

Copyright

Gulf Professional Publishing is an imprint of Elsevier

50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom

Copyright © 2021 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

ISBN: 978-0-12-821929-4

For information on all Gulf Professional Publishing publications visit our website at https://www.elsevier.com/books-and-journals

Editorial Project Manager: Hillary Carr

Production Project Manager: Poulouse Joseph

Cover Designer: Christian Bilbow

Typeset by TNQ Technologies

Biography

Hoss Belyadi is the founder and CEO of Obsertelligence, LLC, focused on providing artificial intelligence (AI) in-house training and solutions. As an adjunct faculty member at multiple universities, including West Virginia University, Marietta College, and Saint Francis University, Mr. Belyadi taught data analytics, natural gas engineering, enhanced oil recovery, and hydraulic fracture stimulation design. With over 10 years of experience working in various conventional and unconventional reservoirs across the world, he works on diverse machine learning projects and holds short courses across various universities, organizations, and the Department of Energy (DOE). Mr. Belyadi is the primary author of Hydraulic Fracturing in Unconventional Reservoirs (first and second editions) and is the author of Machine Learning Guide for Oil and Gas Using Python. Hoss earned his BS and MS, both in petroleum and natural gas engineering, from West Virginia University.

Dr. Alireza Haghighat is a senior technical advisor and instructor for Engineering Solutions at IHS Markit, focusing on reservoir/production engineering and data analytics. Prior to joining IHS, he was a senior reservoir engineer at Eclipse/Montage Resources for nearly 5 years. As a reservoir engineer, he was involved in well performance evaluation with data analytics, rate transient analysis of unconventional assets (Utica and Marcellus), asset development, hydraulic fracture/reservoir simulation, DFIT analysis, and reserve evaluation. He was an adjunct faculty member at Pennsylvania State University (PSU) for 5 years, teaching courses in the Petroleum Engineering/Energy, Business and Finance departments. Dr. Haghighat has published several technical papers and book chapters on machine learning applications in smart wells, CO2 sequestration modeling, and production analysis of unconventional reservoirs. He received his PhD in petroleum and natural gas engineering from West Virginia University and a master's degree in petroleum engineering from Delft University of Technology.

Acknowledgment

We would like to thank the whole Elsevier team including Katie Hammon, Hilary Carr, and Poulouse Joseph for their continued support in making the publication process a success. I, Hoss Belyadi, would like to thank two individuals who have truly helped with the grammar and technical review of this book. First, I would like to thank my beautiful wife, Samantha Walstra, for her continuous support and encouragement during the past 2 years of writing this book. I would also like to express my deepest appreciation to Dr. Neda Nasiriani for her technical review of the book.

I, Alireza Haghighat, want to acknowledge Dr. Shahab D. Mohaghegh, who was my PhD advisor. He, a pioneer of AI & ML applications in the oil and gas industry, has guided me in my journey to learn petroleum data analytics. I would like to thank my wife, Dr. Neda Nasiriani, who has been incredibly supportive throughout the process of writing this book. She encouraged me to write, made recommendations that resulted in improvements, and reviewed every chapter of the book from a computer science point of view. I also want to thank Samantha Walstra for reviewing the technical writing of this book.

Chapter 1: Introduction to machine learning and Python

Abstract

This chapter covers basic definitions of Artificial Intelligence, machine learning, and data mining. It then provides step-by-step instructions on how to set up Python Anaconda and Jupyter Notebook, along with useful shortcuts. Afterward, an introduction to the following Python concepts is given: data structures (e.g., lists, dictionaries, tuples, and sets) and control flow (e.g., if statements, for loops, nested loops, while loops, list comprehension, and functions). These concepts are explained using step-by-step examples. Next, the pandas and numpy libraries are discussed in depth with multiple oil and gas examples. Various pandas functions and concepts such as column selection, basic statistics, column renaming/manipulation, loc/iloc, column calculations, column dropping, conditional selection, grouping by, joining, merging, concatenating, pandas operations, and dealing with missing values are discussed with examples. Finally, various numpy library concepts such as creating a numpy array, an n by n matrix, the identity function, random numbers (both real and integer), etc., are discussed with examples. Numpy indexing and selection are also discussed at the end of this chapter.

Keywords

Anaconda installation; Artificial Intelligence; Data mining; Jupyter Notebook; Machine learning; Numpy library; Pandas library; Python

Introduction

Artificial Intelligence (AI) and machine learning (ML) have grown in popularity throughout various industries. Corporations, universities, governments, and research groups have noticed the true potential of various applications of AI and ML to automate various processes while increasing predictive capabilities. The potential of AI and ML is a remarkable game changer in various industries. The technological AI advancements of self-driving cars, fraud detection, speech recognition, spam filtering, Amazon and Facebook's product and content recommendations, etc., have generated massive amounts of net asset value for various corporations. The energy industry is at the beginning phase of applying AI to different applications. The rise in popularity in the energy industry is due to new technologies such as sensors and high-performance computing services (e.g., Apache Hadoop, NoSQL, etc.) that enable big data acquisition and storage in different fields of study. Big data refers to a quantity of data that is too large to be handled (i.e., gathered, stored, and analyzed) using common tools and techniques, e.g., terabytes of data. The number of publications in this domain has exponentially increased over the past few years. A quick search on the number of publications in the oil and gas industry with the Society of Petroleum Engineers' OnePetro or the American Association of Petroleum Geologists (AAPG) in the past few years attests to this fact. As more companies realize the value added through incorporating AI into daily operations, more creative ideas will emerge. The intent of this book is to provide a step-by-step, easy-to-follow workflow on various applications of AI within the energy industry using Python, a free, open-source programming language. As one continues through this book, one will notice the incredible work that the Python community has accomplished by providing various libraries to perform ML algorithms easily and efficiently.

Therefore, our main goal is to share our knowledge of various ML applications within the energy industry with this step-by-step guide. Whether you are new to data science and programming or at an advanced level, this book is written in a manner suitable for anyone. We will use many examples throughout the book that can be followed using Python. The primary user interface that we will use in this book is “Jupyter Notebook,” and the download process of the Anaconda package is explained in detail in the following sections.

Artificial intelligence

Terminologies such as AI, ML, big data, and data mining are used interchangeably across different organizations. Therefore, it is crucial to understand the true meaning of each terminology before diving deeper into various applications. AI is simply the use of machine or computer intelligence rather than human or animal intelligence. It is a branch of computer science that studies the simulation of human intelligence processes such as learning, reasoning, problem-solving, and self-correction by computers. Creating intelligent machines that work, react, and mimic cognitive functions of humans is the primary goal of AI. Examples of AI include email classification (categorization), smart personal assistants such as Siri, Alexa, and Google, automated respondents, process automation, security surveillance, fraud detection and prevention, pattern and image recognition, product recommendation and purchase prediction, smart searches, sales, volumes, and business forecasting, advertisement targeting, news feed personalization, terrorist activity detection, self-driving cars, health diagnostics, mortgage default prediction, house pricing prediction, robo-advisors (automated portfolio manager), and virtual travel assistant. As shown, the field of AI is only growing with extraordinary potential for decades to come. In addition, the demand for data science jobs has also exponentially grown in the past few years where companies search desperately for computer scientists, mathematicians, data scientists, and engineers that have postgraduate and preferably PhD degrees from accredited universities.

Data mining

Data mining is a terminology used in computer science and is defined as the process of extracting specific information from a database that was hidden and not explicitly available for the user, using a set of different techniques such as ML. It is also called knowledge discovery in databases (KDD). Teaching someone how to play basketball is ML; however, using someone to find the best basketball centers is data mining. Data mining is used by ML algorithms to find links between various linear and nonlinear relationships. Data mining is often used to help collect data on various aspects of the business such as nonproductive time, sales trend, production key performance indicators, drilling data, completions data, stock market key indicators and information, etc. Data mining can also be used to go through websites, online platforms, and social media to collect and compile information (Belyadi et al., 2019).
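As a toy illustration of extracting information that is not explicitly available in the raw records, the sketch below summarizes nonproductive time per rig. The rig names, activities, and hours are invented for illustration only, and the helper name `npt_fraction_by_rig` is hypothetical. Real data mining workflows rely on far richer techniques than this simple aggregation.

```python
from collections import defaultdict

# Hypothetical drilling records: (rig name, activity, hours). Values are invented.
records = [
    ("Rig A", "drilling", 20.0),
    ("Rig A", "nonproductive", 4.0),
    ("Rig B", "drilling", 18.0),
    ("Rig B", "nonproductive", 6.0),
    ("Rig A", "nonproductive", 2.0),
]

def npt_fraction_by_rig(rows):
    """Return the share of logged hours per rig that were nonproductive."""
    total = defaultdict(float)
    npt = defaultdict(float)
    for rig, activity, hours in rows:
        total[rig] += hours
        if activity == "nonproductive":
            npt[rig] += hours
    return {rig: npt[rig] / total[rig] for rig in total}

fractions = npt_fraction_by_rig(records)
# The rig with the largest nonproductive share is not visible in any single record.
worst_rig = max(fractions, key=fractions.get)
print(worst_rig, round(fractions[worst_rig], 3))
```

The individual rows say nothing about which rig performs worst; that information only appears once the records are grouped and compared.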

Machine learning

ML is a subset of AI. It is defined as the use of a collection of algorithms to teach computers to find patterns in data to be used for future prediction and forecasting or as a quality check for performance optimization. ML provides computers the ability to learn without being explicitly programmed. Some of the patterns may be hidden and therefore, finding those hidden patterns can add significant shareholder value to any organization. Please note that data mining deals with searching for specific information while ML focuses on performing a certain task. In Chapter 3 of this book, various types of ML algorithms will be discussed. Also note that deep learning is a subset of machine learning in which multi-layer neural networks are used for various purposes including but not limited to image and facial recognition, time series forecasting, autonomous cars, language translation, etc. Examples of deep learning algorithms are the convolutional neural network (CNN) and the recurrent neural network (RNN), which will be discussed with various O&G applications in Chapter 6.
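To make "learning without being explicitly programmed" concrete, here is a minimal sketch that fits a straight line to a handful of points and then predicts at an unseen input. The depth and pressure values are made-up numbers, not data from this book, and ordinary least squares is chosen here only as a simple stand-in for the algorithms covered in later chapters.

```python
# Illustrative only: the depth/pressure pairs below are invented.
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

depth_ft = [1000.0, 2000.0, 3000.0, 4000.0]
pressure_psi = [450.0, 900.0, 1350.0, 1800.0]  # hypothetical values

# The gradient is never hard-coded; it is estimated from the data.
slope, intercept = fit_line(depth_ft, pressure_psi)
predicted = slope * 5000.0 + intercept  # prediction at a depth not in the data
print(round(slope, 3), round(intercept, 3), round(predicted, 1))
```

The relationship between the two columns is recovered from the examples themselves, which is the basic idea behind the supervised algorithms discussed later.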

Python crash course

Before covering the essentials of the algorithms and their Python code, it is imperative to understand the fundamentals of Python. Therefore, this chapter will be used to illustrate the fundamentals before diving deeper into various workflows and examples.

Anaconda introduction

It is highly recommended to download Anaconda, the standard platform for Python data science which includes many of the necessary libraries with its installation. Most libraries used in this book are already preinstalled with Anaconda, so they don't need to be downloaded individually. The libraries that are not preinstalled in Anaconda will be mentioned throughout the chapters.
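One way to check whether a library is already available is sketched below, using the standard library's `importlib.util.find_spec`. The list of libraries is an assumption based on the topics of this chapter, and the helper name `missing_libraries` is hypothetical. Note that the import name and the pip package name can differ, as with `sklearn` and `scikit-learn`.

```python
import importlib.util

def missing_libraries(import_names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]

# Assumed list for illustration: import name -> package name used with pip.
wanted = {"numpy": "numpy", "pandas": "pandas", "sklearn": "scikit-learn"}

for name in missing_libraries(wanted):
    print(f"{name} was not found; it could be installed with: pip install {wanted[name]}")
```

If nothing is printed, every library in the list was found in the current environment.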

Anaconda installation

To install Anaconda, go to Anaconda's website (www.anaconda.com) and click on “Get Started.” Afterward, click on “Download Anaconda Installers” and download the latest version of Anaconda for either Windows or Mac. The Anaconda distribution includes over 250 packages, some of which will be used throughout this book. If you do not download Anaconda, most libraries must be installed separately using the command prompt window. Therefore, it is highly advisable to download Anaconda to avoid installing the majority of the libraries used in this book one by one. Please note that while the majority of the libraries come with Anaconda, some libraries will have to be installed separately using the command prompt or Anaconda prompt window. For those libraries that have not been preinstalled, simply open “Anaconda prompt” from the “start” menu, and type in “pip install (library name)” where “library name” is the name of the library to be installed. Once Anaconda has been successfully installed, search for “Jupyter Notebook” under the start menu. Jupyter Notebook is a web-based, interactive computing notebook environment. Jupyter Notebook loads quickly, is user-friendly, and will be used throughout this book. There are other user interfaces such as Spyder, JupyterLab, etc. Fig. 1.1 shows the Jupyter Notebook's window after opening. Simply go into “Desktop” and create a folder called “ML Using Python.” Afterward, go to the created folder (“ML Using Python”) and click on “New” on the top right-hand corner as illustrated in Fig. 1.2.

You now have officially launched a new Jupyter Notebook and are ready to start coding as shown in Fig. 1.3.

Displayed in Fig. 1.4, the top left-hand corner indicates the Notebook is “Untitled.” Simply click on “Untitled” and name the Jupyter Notebook “Python Fundamentals.”

FIGURE 1.1 Jupyter Notebook window.

FIGURE 1.2 Opening a new Jupyter Notebook.

FIGURE 1.3 A blank Jupyter Notebook.
