Chuzo Ninagawa

AI Time Series Control System Modelling

Smart Grid Power Control Engineering Joint Research Laboratory

Gifu University

Gifu, Japan

ISBN 978-981-19-4593-9

ISBN 978-981-19-4594-6 (eBook)

https://doi.org/10.1007/978-981-19-4594-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

The Internet of Things (IoT) and artificial intelligence (AI) are without a doubt the most important technology topics of the near future. As the world undergoes Digital Transformation (DX), data collection over the Internet through IoT is becoming the norm, and an era of massive time series data accumulation is about to begin. Manual modeling will no longer be able to keep up with extracting relevant information from this vast ocean of time series data. AI modeling will be an inevitable core technology of the DX era.

Most image recognition, a representative AI technology, can be regarded as static modeling that does not depend on past history, whereas the time series data accumulated through DX calls for dynamic modeling, in which the values depend on past history. Long short-term memory (LSTM), a neural network specialized for time series data, has been attracting attention because it is good at predictions that depend on the history of the data, and tools for it are now readily available.

Since time series AI modeling is a sophisticated predictor that deals with history, it is difficult for beginners to understand the learning theory and effective training data collection techniques. For example, even if you study the theory and collect time series data to test it in your own work, you will face problems such as obtaining only biased training data or not being able to determine realistic convergence conditions.

However, there seem to be two extremes in the world: theoretical books with only mathematical formulas, or how-to books on tools. For graduate students in university laboratories and front-line engineers in the industrial world, there is a need for specialized books that serve as a “bridge” by describing the development of practical models based on theory.

In each chapter of this book, a structure is adopted that has never been seen before: a section that presents the basic theory in mathematical form, followed by a section that presents practical applications of the theory. In other words, the emphasis is on showing concrete examples of the application of the basic algorithms in the field of system control immediately after understanding them mathematically. By doing so, the author aimed to take a different approach from mathematical books that develop theories in an abstract manner by deriving pure mathematical formulas and from how-to books that only describe how to input and output data to off-the-shelf tools without describing theories.

In general, examples of AI machine learning may appear to be handled well, but there may be cases where the reliability is not guaranteed. Therefore, all the examples in this book were selected from peer-reviewed papers by IEEJ, IEEE, and other experts to ensure reliability.

The structure of this book is as follows.

Chapter 1 begins with the definition of “time series” in system control and describes the position of control design modeling from time series data of physical quantities of interest.

Chapter 2 presents the basic theory of linear multiple regression modeling and autoregressive (AR) modeling as the most fundamental methods in time series data modeling and their practical applications.

Chapter 3 describes the basic theory of neural network modeling of dynamic characteristics of control targets as a representative of AI machine learning modeling with time series data and its application to modeling of step response characteristics.

Chapter 4 presents the theory of long short-term memory (LSTM) neural networks, which have attracted attention in recent years as a method for modeling control targets whose subsequent behavior differs depending on the time series history, and an example of a prediction model for sudden events in control as an application of the theory.

Chapter 5 describes the theory and examples of heuristic optimal search control as optimal control using the above time series AI model.

Chapter 6 describes a practical method for collecting time series data in the field of system control design, including a method for correcting collected data bias and a method for estimating training data from normal operation data.

Chapter 7 describes a practical method for implementing the methods described in the above chapters: a time series data collection platform, a method for extracting zones of interest from field-collected data, and self-developed machine learning software.

I would like to express my gratitude to many people for their help in compiling this book. Morio Takahama, former professor at Nagoya University, as an expert in control engineering, and Satoru Hayamizu, former professor at Gifu University, as an expert in AI, provided valuable comments on the manuscript. Former Assistant Professor Shun Matsukawa of Ninagawa Laboratory, Gifu University, and current Lecturer at Hokkaido University of Science, helped to confirm the mathematical expressions. Of course, I am grateful to Assistant Professor Yoshifumi Aoki, Ph.D. student Asif Iqbal, and other members of the Ninagawa Laboratory at Gifu University for their research. I would like to express my gratitude to all of them. Finally, I would like to thank my wife for allowing me to write this book at home for a long time.

Gifu, Japan Chuzo Ninagawa

About the Author

Prof. Chuzo Ninagawa is CEO of N Laboratory, Inc., and Professor of Smart Grid Power Control Engineering Joint Research Laboratory, Gifu University, Gifu, Japan. He has been Executive Chief Engineer of Mitsubishi Heavy Industries, Ltd., which is one of the largest hi-tech manufacturers in Japan. His research interests span various topics of smart grid, with special focus on virtual power plant (VPP) with a large-scale aggregation of fast automated demand responses. He has published over 110 academic papers and five advanced research books.

Chapter 1 Introduction

Abstract This chapter begins with the definition of “time series” in system control and describes the position of control design modeling from time series data of physical quantities of interest. The statistical modeling approach is introduced as black box modeling of the control target. Then, the concept of the machine learning modeling approach is introduced.

Keywords Time series data · White box model · Black box model · Autoregressive model · Machine learning

1.1 Time Series

1.1.1 What Is the “Time Series” Dealt with in This Book?

This book describes the practical application of artificial intelligence (AI) methods using time series data in system control. It consistently discusses the application of machine learning to the analysis and modeling of time series data of physical quantities to be controlled in the field of system control.

In the field of system control, a time series is a sequence of physical quantity data arranged in time order at fixed time intervals. Although it is not absolutely necessary to have a fixed time interval, virtually any data sequence with a fixed time interval is called a time series from the standpoint of mathematical handling and practical data organization. Depending on the time scale of the physical quantity of interest, the interval may be one microsecond or one hour. In any case, this book implicitly refers to them as fixed time interval time series.

In recent years, the term “time series” has often been used in information science, especially in AI, to refer to the order in which words and scenes appear in language processing and video processing, that is, to word order and image transitions. These are different from time series in system control and are outside the scope of this book.

The purpose of this book in discussing time series in system control is to predict and manipulate the target physical quantity based on its history. A dynamic system is a system in which the subsequent change in the physical quantity to be controlled depends on the time series of the state physical quantity that determines the state.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 C. Ninagawa, AI Time Series Control System Modelling, https://doi.org/10.1007/978-981-19-4594-6_1

Dynamic systems exhibit physical phenomena in which the transient characteristics are problematic as well as the steady-state characteristics. Transients are problematic because they are affected by the system state before and after the change occurs. The control target of this book is a system in which it is essentially important to handle the time history, or time series, of the physical quantities that determine the response behavior of the system state.

In addition, when dealing with time series for control, it is interesting to consider target systems in which each physical quantity that constitutes the time series data has a stochastic value.

In real-world control, even in systems where there is no uncertainty in the target itself, it is natural for the observed data of the controlled quantity to be noisy, and the actual manipulated physical quantity will naturally contain errors relative to the calculated value of the manipulated quantity. However, the subject of interest to control engineers is not merely such uncertainty in observations and operations, but dynamic systems dominated by intrinsic stochastic phenomena.

1.1.2 Time Series for Statistical Control

Naturally, if the time series of the past state history of a dynamic system is different, the transient response output, or the change in the controlled quantity, will be different even if the current input, or the amount of operation, is exactly the same. In addition, even if the time series of the state history and the input at the present time are exactly the same, there are dynamic systems with uncertain intrinsic behavior, i.e., statistical elements, depending on the target system. This book concentrates on statistical dynamic systems with inherent uncertainty.
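The dependence on past history described above can be illustrated with a minimal sketch. It assumes a hypothetical first-order discrete-time plant y[k+1] = a·y[k] + b·u[k]; the coefficients and input sequences are made up for illustration and do not come from this book. Two runs receive exactly the same final input but arrive there through different histories, and the resulting outputs differ.

```python
def simulate(a, b, y0, inputs):
    """Run y[k+1] = a*y[k] + b*u[k] and return the full output history."""
    ys = [y0]
    for u in inputs:
        ys.append(a * ys[-1] + b * u)
    return ys


A, B = 0.8, 0.2  # hypothetical plant coefficients (assumption)

# Two input histories that differ in the past but end with the same input (1.0).
history_low = [0.0, 0.0, 0.0, 1.0]
history_high = [1.0, 1.0, 1.0, 1.0]

y_low = simulate(A, B, 0.0, history_low)
y_high = simulate(A, B, 0.0, history_high)

# Output right after the identical final input is applied.
next_low = y_low[-1]
next_high = y_high[-1]
print(next_low, next_high)
```

Adding a random disturbance term to the update would turn this deterministic sketch into a simple instance of the statistical dynamic systems discussed in this section, where even identical histories and inputs can lead to different outputs.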

The uncertain behavior of such statistical dynamic systems cannot be easily taken into account by the deterministic control theory that has been widely used so far. In recent years, research results in control theory have led to the development of advanced mathematical methods that can deal with uncertain control targets. However, it must be said that these methods are too difficult for engineers in the field who are involved in the control of embedded systems.

Such inherently uncertain dynamic systems may include the following. An example of a statistical dynamic system is the prediction of the path of a typhoon. The purpose of these systems is not to manipulate the system itself, but to quantitatively estimate the transient changes in the future based on the past history of state. Load frequency control, which targets each power plant from the power system’s central power supply command center, is an example of statistical dynamic system control. In these systems, prediction alone is not enough to solve the problem; it is necessary to feed back the observed values and perform control operations to output the operational quantities. However, the system frequency, which is the controlled quantity, cannot be completely controlled due to the inherent uncertainty, or statistical factor, of the control target, and the rate of stay of the controlled quantity in the target region is to be kept within a practical range.
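As a rough illustration of the “rate of stay” mentioned above, the following sketch computes the fraction of samples in a time series that fall inside a target band. The function name `stay_rate`, the frequency samples, and the band limits are all hypothetical values chosen for this example, not figures from this book.

```python
def stay_rate(samples, low, high):
    """Return the fraction of samples lying within [low, high]."""
    if not samples:
        raise ValueError("samples must not be empty")
    inside = sum(1 for x in samples if low <= x <= high)
    return inside / len(samples)


# Hypothetical system-frequency samples (Hz) at a fixed time interval.
frequency_hz = [59.98, 60.01, 60.05, 59.93, 60.00, 60.12, 59.99, 60.02]

# Hypothetical target band around a 60 Hz nominal frequency.
rate = stay_rate(frequency_hz, 59.95, 60.05)
print(rate)
```

A control objective of this kind is stated as keeping the rate above some practical threshold rather than forcing every sample into the band.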

This section will review the target physical quantities, which are the controlled quantity, the operating quantity, and the state variable in the statistical dynamic system treated in this book. The physical quantities here are the quantities that determine the behavior of the target system, specifically, the properties governed by the physical laws of natural science, such as temperature, voltage, and velocity. These elements are called “physical control elements” in this book.

Next, we consider factors that are not “physically controlled” but are caused by social factors. In general society, the word “control” is used to describe managerial behavior toward highly uncertain phenomena. The word “infection control” for a novel coronavirus, for example, is not the same as the restrictive definition of control engineering. The term “control” seems to be closer to “management,” but the reason to use the term “control” is that it involves some kind of feedback and manipulation to approach a desired state. Naturally, these inherently uncertain elements are separate from the laws of physics, such as the psychology of human behavior. These elements are called “uncertain control elements.”

The control of statistical dynamic systems in this book deals with objects that have both of the above elements. In other words, it deals with a very interesting system control that combines two elements, one dealing with industrial physical quantities and the other dealing with social uncertainties.

Although there are many aspects of complex system control consisting of uncertain social and industrial systems that do not conform to the mathematical precision of automatic control engineering, the controller outputs an operational quantity based on the observed physical quantity of the dynamic system, and an uncertain response is returned. Uncertainty management control, which increases the probability that the controlled quantity will be in the desired range, is much needed in the real world.

In general, it must be said that automatic control theory as an academic discipline, although it makes full use of advanced mathematical theory, has found it extremely difficult to handle dynamic systems with uncertain responses. Practical methods for managing and controlling a target with such uncertain behavior do not seem to have been accessible.

As one of its answers, this book addresses the statistical extraction of “uncertainty control features” from time series data, a topic that has become increasingly popular in recent years.

1.1.3 Dissemination of Time Series Data for Control

Now, let us discuss the history of time series data. Before the spread of digital computers as we know them today, data to be controlled in system control was often treated as analog quantities. In measurement and recording devices called pen recorders, physical quantities measured by sensors were converted directly into pen-driven electrical signals and recorded as curves on paper fed from a roll. It can be said that it is a kind of continuous measurement and recording of the time history of physical quantities.

Microprocessors, or microcomputers, were introduced to the world around 1970. They developed quickly and, by around 1980, had spread to the stage of being called personal computers. Since then, microprocessors and personal computers have been actively used in the measurement and recording of physical quantities for system control. These digital computers became the norm, and pen recorders became extinct. In the field of system control, measurement and recording came to be handled as digital data in the form of discrete numbers. As a natural progression, it became common to record discrete numerical values at fixed time intervals, and time series data became the basic method of recording the time history, time changes, and transient phenomena of physical quantities to be controlled.

Since around 1980, with the spread of microcomputers, a large number of devices and systems in society have come to be controlled by embedded systems. However, at that time, the embedded control of various equipment systems was developed within the equipment itself, and there were few systems that continuously communicated operation data to the outside for a long period of time. From around 2000, remote centralized control of equipment systems began to be used, and cases where data on equipment behavior was accumulated as a time series gradually appeared.

In addition, Internet of Things (IoT) communication networks have become widespread since around 2010, making it possible to accumulate time series data on the behavior of embedded control systems on the “things” side of the IoT easily and at low cost. On the Internet side of the IoT, cloud-based centralized monitoring and control constantly collects time series data from a large number of things.

In this way, we are living in an era in which time series data on the behavior of a huge variety of devices is constantly being collected through the IoT and stored in data centers at a fine time granularity. Time series data now seems to be regarded as a natural resource that is collected and stored as a matter of course.

Because collecting and accumulating time series data on the behavior of control targets has become normal, it is becoming more and more important for engineers in the field to use this vast amount of accumulated data effectively for system control design, especially for modeling the control target.

1.2 Time Series and Control Models

1.2.1 Control Modeling

In the field of system control, a model is a representation of the dynamic characteristics of a system in a specific form, focusing on the essential parts from the perspective of control. This is too abstract, so let us consider it more concretely from the viewpoint of system control.

Since the actual object operates in the real world, various aspects of it operate in a complex and uncertain manner. However, among these behaviors, only a limited number of essential characteristics are necessary for a control system. The important point here is that only those features of the real thing that are necessary for control need to match. In other words, the model used in system control does not need to perfectly reproduce all the behavioral actions of the real thing to be controlled.

However, it is still not easy to capture and describe the dynamic behavior. Except for extremely simple dynamic models such as point-mass systems, it is usually very difficult to describe the dynamic characteristics, as opposed to the static characteristics, of even a slightly complex system when constructing a valuable control system in the real world.

Advanced control theories such as modern control theory have been studied, but even after several decades, they do not seem to be widely used in industry, especially in the field of embedded control.

The mathematical representation of a control target in state space is not so common among engineers in the field of embedded control systems. The state-space representation basically needs to take the form of differential equations or difference equations that determine the dynamic characteristics of the control target. Writing such equations is possible only when the physical laws governing the dynamic characteristics of interest are known. This is not an easy task in the real world.

On the other hand, as mentioned in the previous section, with the recent spread of IoT, it has become very easy to collect and accumulate time series data of physical quantities to be controlled. In some cases, the time series data may already exist somewhere before the control system is designed.

In the field of system control design in industry, it is much more convenient to express the dynamic characteristics of a control target using already existing time series data than to express them in state space.

1.2.2 Control Model Building Methods

There are two main approaches to obtaining a control target model: the white box approach and the black box approach. The white box approach models the dynamic behavior of the control target using differential equations based on physical laws. This modeling method is also called first principle modeling or physical modeling and has long occupied a central position in system control as an orthodox approach.

If the target system has simple behavior and shows definite actions, it can be modeled by control engineers in the field using the white box approach. One of the advantages of the white box approach is that it is suitable for understanding the cause-effect relationship and the mechanism of the behavioral analysis, since it is modeled based on physical laws.

In many cases, white box modeling is limited to linear mathematical methods such as transfer functions and state-space representations. Therefore, care is needed when the target system is strongly nonlinear. Specifically, nonlinear behavior means that the superposition principle does not hold and that scaling the input variable by a constant factor produces an output far from the same constant multiple. This is a major difficulty in designing controller algorithms.
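The two properties that fail in a nonlinear system, additivity and constant-factor scaling, can be checked numerically. The following sketch compares a purely linear gain with a saturating element; both functions and all numeric values are hypothetical examples, not models from this book. Passing the check at a single test point does not prove linearity, but failing it is enough to show nonlinearity.

```python
def linear_gain(u):
    """A linear static element: output is always twice the input."""
    return 2.0 * u


def saturating_actuator(u, limit=1.0):
    """The same gain, but the output is clipped to [-limit, limit]."""
    return max(-limit, min(limit, 2.0 * u))


def satisfies_superposition(f, u1, u2, k, tol=1e-9):
    """Check additivity and constant-factor scaling at one test point."""
    additive = abs(f(u1 + u2) - (f(u1) + f(u2))) <= tol
    homogeneous = abs(f(k * u1) - k * f(u1)) <= tol
    return additive and homogeneous


linear_ok = satisfies_superposition(linear_gain, 0.3, 0.4, 3.0)
saturated_ok = satisfies_superposition(saturating_actuator, 0.3, 0.4, 3.0)
print(linear_ok, saturated_ok)
```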

Of course, as widely known, a linear physical model based on a white box approach can be used in practical applications by limiting the behavior to small signals around a steady state or by dynamically adapting the parameters. The essential difficulty of the white box approach may rather be in modeling uncertainty.

On the other hand, the black box approach is a modeling method based on experimental data. In this method, the control target is regarded as a black box, and a dynamic model is obtained from its input/output data and state variable data using statistical methods.

In the field of control engineering, the term “system identification” means a black box approach to obtain a mathematical expression by statistically extracting the dynamic characteristics based on the time series data of the control target.

The advantage of this approach is that a statistical approximation model, although not an exact solution, can be obtained if a certain number of samples of experimental data are available, even when it is virtually impossible to represent the system with differential and difference equations based on physical laws in a complex system. In the past, it was often expensive to obtain time series data with a sufficiently fine temporal granularity and a sufficient number of samples. However, in recent years, with the easy and inexpensive acquisition of time series data by IoT via the Internet, the black box approach of system identification has been reconsidered.
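A minimal sketch of the black box idea follows, assuming a hypothetical first-order model structure y[k+1] = a·y[k] + b·u[k] whose two parameters are estimated by least squares from input/output time series alone. The model structure, the "true" coefficients used to generate the data, and the input sequence are assumptions for illustration; real data would also contain noise, so the estimates would only approximate the underlying values.

```python
def identify_first_order(u, y):
    """Least-squares estimate of (a, b) in y[k+1] = a*y[k] + b*u[k]."""
    # Accumulate the 2x2 normal equations:
    #   [syy syu] [a]   [syn]
    #   [syu suu] [b] = [sun]
    syy = syu = suu = syn = sun = 0.0
    for k in range(len(y) - 1):
        syy += y[k] * y[k]
        syu += y[k] * u[k]
        suu += u[k] * u[k]
        syn += y[k] * y[k + 1]
        sun += u[k] * y[k + 1]
    det = syy * suu - syu * syu
    if abs(det) < 1e-12:
        raise ValueError("data does not excite the system enough to identify a and b")
    # Solve the 2x2 system by Cramer's rule.
    a = (syn * suu - sun * syu) / det
    b = (syy * sun - syu * syn) / det
    return a, b


# Generate noise-free data from a known (hypothetical) plant, treated as a black box.
true_a, true_b = 0.9, 0.1
u = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
y = [0.0]
for uk in u:
    y.append(true_a * y[-1] + true_b * uk)

a_hat, b_hat = identify_first_order(u, y)
print(a_hat, b_hat)
```

The identification routine never looks at the physical laws behind the plant; it only needs enough samples in which the input varies, which is the practical point made about data availability above.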

This book uses both the physical model, which is a white box approach, and the system identification, which is a black box approach. For example, in the heat load model of a building multi-type air-conditioner, a physical model based on the differential equation of the heat balance between the building and the air-conditioner is used, and the change in the instantaneous power of the customer’s equipment from the viewpoint of the power system uses a statistical model, black box approach, identified from the time series of experimental data.

1.3 Control Time Series and AI Methods

1.3.1 Control Model by Time Series Analysis

The method of obtaining a control target model for system control design from time series data started in the 1970s.

In the past, time series analysis was often performed by decomposing the fluctuations of a time series into a discretized frequency domain. However, since the 1970s, an autoregressive modeling method has been established that directly converts the time series data into a dynamic model using the amount of information as a criterion [1, 2].

Autoregressive models are abbreviated as AR models. The multidimensional ones are called vector AR (VAR) models. Various derivatives of the AR model have also been proposed, such as the AR moving average (ARMA) model.
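For orientation, the standard textbook forms of these models can be written as follows. The symbols (order p, coefficients a_i, coefficient matrices A_i, moving-average order q, coefficients b_j, and white-noise term ε_t) are a common convention assumed here and may differ from the notation used in later chapters of this book.

```latex
% AR(p): the current value is a weighted sum of the past p values plus white noise
x_t = \sum_{i=1}^{p} a_i \, x_{t-i} + \varepsilon_t

% Vector AR(p): the same structure with vector-valued data and matrix coefficients
\mathbf{x}_t = \sum_{i=1}^{p} A_i \, \mathbf{x}_{t-i} + \boldsymbol{\varepsilon}_t

% ARMA(p, q): an AR part plus a moving average of past noise terms
x_t = \sum_{i=1}^{p} a_i \, x_{t-i} + \varepsilon_t + \sum_{j=1}^{q} b_j \, \varepsilon_{t-j}
```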

The AR model alone cannot be used for the design of control systems based on modern control theory; it must be transformed into a state-space representation. A method to derive a discrete state-space representation from time series data via AR model has been studied.
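One standard way to rewrite a scalar AR model as a discrete state-space model is the companion form, in which the state vector stacks the most recent p values. The sketch below is only an illustration of that textbook construction, not necessarily the method this book refers to; the function names and the AR(2) coefficients are hypothetical, and the noise term is omitted so that the two representations can be compared exactly.

```python
def ar_to_state_space(coeffs):
    """Return (A, C) of the companion form for x[t] = sum(coeffs[i] * x[t-1-i]).

    The state is z[t] = [x[t], x[t-1], ..., x[t-p+1]], so z[t+1] = A z[t]
    and the output is x[t] = C z[t].
    """
    p = len(coeffs)
    # First row holds the AR coefficients; the remaining rows shift the state down.
    A = [list(coeffs)] + [[1.0 if j == i else 0.0 for j in range(p)] for i in range(p - 1)]
    C = [1.0] + [0.0] * (p - 1)
    return A, C


def step(A, state):
    """Advance the state by one sample: z[t+1] = A z[t]."""
    return [sum(a_ij * s_j for a_ij, s_j in zip(row, state)) for row in A]


coeffs = [1.2, -0.5]          # hypothetical AR(2) coefficients
A, C = ar_to_state_space(coeffs)

state = [1.0, 0.0]            # [x[1], x[0]]
direct = [0.0, 1.0]           # x[0], x[1] for the direct AR recursion
from_state_space = []
for _ in range(5):
    direct.append(coeffs[0] * direct[-1] + coeffs[1] * direct[-2])
    state = step(A, state)
    from_state_space.append(sum(c * s for c, s in zip(C, state)))

print(direct[2:], from_state_space)
```

Both sequences agree, which shows that the companion matrix carries the same dynamics as the AR recursion while exposing the (A, C) pair that state-space design methods expect.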

The methods described so far for obtaining AR models and state-space representation models from time series data are all linear models. In many cases, the linear model is sufficient for control of a small signal range. However, depending on the control target, the model may not be sufficient in many cases where the dynamic characteristics are highly nonlinear or where the uncertainty of the statistical dynamic system is large.

From the latter half of the twentieth century until about 2010, nonlinearity and uncertainty were studied and devised in the field of control engineering. However, it has become too difficult for designers of embedded systems to use them in their products. An approach has emerged to obtain a dynamic model of a control target with nonlinearity and uncertainty in a black box manner as long as there is a sufficient amount of behavioral data.

There has been a trend to apply the results of AI to control from roughly 2010, with the rapid growth of artificial intelligence (AI) [3–5].

1.3.2 Control and AI Methods

In recent years, there has been a trend in control model building methods that has dramatically changed the situation. The combination of time series analysis and AI modeling methods is what the author wanted to write about in this book.

The time series data treated in this book are those that have at least one element of uncertainty, fluctuation, or periodicity in the data sequence. From the author’s point of view, machine learning, an AI method, is essentially statistical modeling engineering based on a large amount of training data. It can also be interpreted as engineering to extract average characteristics rather than exact relationships between the realized values of each sample of a large amount of training data.

The difference between conventional analysis and modeling methods and AI machine learning, however, is that the former gives the algorithm for extracting characteristics for each application problem as a program, while the latter only gives the generic strategy for extracting characteristics in general terms and does not explicitly show the specific procedure for acquiring characteristics in the programming code.

The field of AI is very broad, ranging from pure science that approaches the activities of the human brain to practical applications in system control, which is the subject of this book. The author feels that the terms “intelligence” and “learning” are overused in control applications. It would be better to call it statistical engineering or data engineering.

The application of AI methods to system control seems to have become quite common in the past few years. Although there are various AI methods that can be applied to system control, the range of AI methods applied to machine control systems in this book has so far been quite limited. In a nutshell, they are mainly machine learning methods.

Then, why is there an incentive to use AI machine learning to analyze time series data in system control? As mentioned above, the essence of system control is to predict and manipulate the dynamic behavior of the target system. Systems with dynamic behaviors are called dynamic systems. Dynamic systems require the description of transient phenomena, and the theory of system control is basically based on differential or difference equations that describe the changes in the system. Since dynamic systems are not stable steady states but changing transient states, the changing transient states depend on the state history before the change. In other words, it is essential to predict the change from the present to the future based on the time history of each variable in the target system and to manipulate the system to achieve the desired change.

Rather than scientifically approaching the essence of human brain activity, machine learning, which is an AI method used in system control, uses the operation history of the target system as training data or the history of operation attempts on the target system as non-training data. In other words, the time series data of the target system is linked to the understanding and prediction of the system dynamics, and eventually to its operation.

In short, time series data are the key to applying AI machine learning to system control. The philosophy of this book can be summed up as: “time series data” + “AI machine learning” = “new practical control methods.”

References

1. G. Box, G. Jenkins, G. Reinsel, Time Series Analysis: Forecasting and Control, 3rd edn. (Prentice-Hall, Inc., 1994)

2. P. Brockwell, R. Davis, Introduction to Time Series and Forecasting, 2nd edn. (Springer Science+Business Media, LLC, 2002)

3. K. Hunt, G. Irwin, K. Warwick, Neural Network Engineering in Dynamic Control Systems (Springer-Verlag, 1995)

4. M. Norgaard, O. Ravn, N. Poulsen, L. Hansen, Neural Networks for Modelling and Control of Dynamic Systems (Springer-Verlag, 2000)

5. A. Janczak, Identification of Nonlinear Systems Using Neural Networks and Polynomial Models (Springer-Verlag, 2005)

Chapter 2 Linear Time Series Modeling

Abstract This chapter deals with linear regression modeling and autoregressive modeling as basic approaches to time series modeling for control systems. The first half of the chapter provides the mathematical basis of these models, and the second half shows practical application examples of these linear time series modeling methods.

Keywords Linear regression model · MLR model · Autoregressive model · AR model

2.1 Linear Regression Models

2.1.1 One-Dimensional Linear Regression Model

As mentioned in the preface, a feature of this book is the use of AI methods to develop dynamic models of control targets from time series data collected in the field. The widespread use of IoT has made it possible to collect huge amounts of time series data, so applying AI modeling methods, which require sufficient training data, to control models has become easier. How, then, should AI methods be applied to control models? No matter how rich the dynamic expressive power of AI methods is, practitioners should start with linear modeling, which is a historically established basic method.

This chapter discusses multiple regression analysis models and multidimensional autoregressive (AR) models [1], which are traditional methods essential for practitioners, before AI methods are applied to time series modeling in the next chapter. Both are statistical regression models rooted in mathematical statistics, from which time series modeling is derived.

In this book, modeling of a system control target means finding an approximation function that represents and predicts the target characteristics based on time series data. One method for this is regression, which finds an approximate equation representing the target’s characteristics from the distribution of sample data obtained through observation and other means. A regression model is a function that shows the causal relationship between an output variable $Y$ and multiple input variables $X_1, X_2, \ldots, X_K$.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023

C. Ninagawa, AI Time Series Control System Modelling, https://doi.org/10.1007/978-981-19-4594-6_2

The input $X$ is called the explanatory variable, and the output $Y$ is called the objective variable. The most basic type of regression model is the linear regression model, which uses a linear equation as the function and is represented by Eq. (2.1).

Here, each of the coefficients $\alpha, \beta_1, \beta_2, \ldots, \beta_K$ is called a parameter.
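For reference, a linear regression model with the parameters named above is conventionally written as follows. This is the standard textbook form, assumed here to correspond to Eq. (2.1); the symbols are those already used in the text.

```latex
Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_K X_K
```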

The linear regression model with one explanatory variable is called the linear simple regression (LSR) model, while the model with multiple explanatory variables is called the multiple linear regression (MLR) model. In this section, we first discuss the linear simple regression model, which is the most basic model.

Given a set of sample data $x_i$ ($i = 1, 2, \ldots, N$) for one explanatory variable $X$, we assume that the regression model has a population mean $\mu_i$ for each sample. Taking the sample data $y_i$ of the objective variable $Y$ to be that mean plus an error $\varepsilon_i$, and assuming the linear relationship $\mu_i = \alpha + \beta_1 x_i$, we obtain Eq. (2.2). In other words, the input–output relationship is assumed to follow a straight line.

Finding the y-intercept $\alpha$ and the slope $\beta_1$ of this line $y = \alpha + \beta_1 x$ gives the linear simple regression model. To do this, we use the least-squares method, which chooses the line that makes the overall error between the linear model and the sample data as small as possible. First, let us define the straight line as a linear simple regression model.

Given $N$ sample data points $(x_i, y_i)$ ($i = 1, \ldots, N$), and assuming that the estimated parameters in Eq. (2.3) are $\hat{\alpha}, \hat{\beta}_1$, substituting the sample data $x_i$ gives Eq. (2.4).

The result $\hat{y}_i$ is called the predicted value, which is of course a point on the straight line. The difference between the model’s predicted value $\hat{y}_i$ and the actually observed output value $y_i$ is called the residual.

Since $\hat{\alpha}$ and $\hat{\beta}_1$ denote estimates, we return to the notation $\alpha$ and $\beta_1$ for the unknowns; the error function defined as the residual sum of squares is then given by Eq. (2.6).

The error function $E$ is a quadratic function that is convex downward with respect to each of $\alpha$ and $\beta_1$. Therefore, by taking the partial derivatives of $E$ with respect to $\alpha$ and $\beta_1$ and setting them to zero, the $\alpha$ and $\beta_1$ that minimize $E$ can be obtained by solving Eqs. (2.7)–(2.10).

From the first equation in Eq. (2.11), we get Eq. (2.12) and α is obtained.

Substituting this into the second equation of Eq. (2.11), we obtain Eq. (2.13).

Then, $\beta_1$ can be obtained. By calculating the denominator (the sum of squared deviations of $x$ from its mean) and the numerator (the sum of products of the deviations of $x$ and $y$ from their means) of Eq. (2.15), the parameters $\alpha$ and $\beta_1$ can be obtained as Eq. (2.17).

Thus, the estimated linear simple regression model is given as Eq. (2.19).
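The derivation above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it assumes the standard closed-form least-squares result (slope = sum of deviation products divided by sum of squared deviations, intercept = mean of y minus slope times mean of x), and the function names and sample values are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Least-squares fit of y = alpha + beta1 * x for paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired samples")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x (denominator of the slope).
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    # Sum of products of deviations of x and y (numerator of the slope).
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta1 = s_xy / s_xx
    alpha = y_mean - beta1 * x_mean
    return alpha, beta1


def residual_sum_of_squares(xs, ys, alpha, beta1):
    """Error function E: sum of squared residuals y_i - (alpha + beta1 * x_i)."""
    return sum((y - (alpha + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
alpha_hat, beta1_hat = fit_simple_regression(xs, ys)
error = residual_sum_of_squares(xs, ys, alpha_hat, beta1_hat)
```

Moving either estimate away from the fitted value increases `residual_sum_of_squares`, which is the minimization property the derivation relies on.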

2.1.2 Multi-Dimensional Linear Regression Model

In this section, we discuss the case where there are multiple explanatory variables (independent variables) $X_1, X_2, \ldots, X_K$ as inputs. Connecting the objective variable $Y$ with the $K$ explanatory variables $X_1, X_2, \ldots, X_K$, the relationship among the variables is given by Eq. (2.20).

Modeling that approximates this relationship by an algebraic expression in the form of a linear combination is called MLR. To find this linear combination, the unknown parameters $\beta_1, \beta_2, \ldots, \beta_K$, which are its coefficients, are estimated as in Eq. (2.21).

The estimation is based on sample data. For example, adding the difference $\varepsilon_i$ between the $i$th multi-dimensional sample $(y_i, x_{i1}, x_{i2}, \ldots, x_{iK})$ and the assumed linear equation, Eq. (2.20), gives Eq. (2.22).

Next, we formulate the functional relationship in this linear multiple regression model. The coefficient parameters $\alpha, \beta_1, \ldots, \beta_K$ in the above equation are called the regression coefficients. These parameters are determined by the least-squares method, as in the linear simple regression model described in the previous section. In the case of the linear multiple regression model, the error function to be minimized is given by Eq. (2.23).

Therefore, as in the case of linear simple regression, we only need to find the parameters $(\alpha, \beta_1, \ldots, \beta_K)$ for which the partial derivatives of the error function $E(\alpha, \beta_1, \ldots, \beta_K)$ with respect to each parameter are all equal to zero.

Organizing these conditions further, we obtain a set of simultaneous equations for $(\alpha, \beta_1, \ldots, \beta_K)$.

Each of the constants in the above simultaneous equations can be obtained by calculating the sums of squared deviations from the mean and the sums of products of deviations below, as in the case of linear simple regression.

This allows us to find the solutions to the simultaneous equations in Eq. (2.25). The solutions, denoted $\hat{\alpha}, \hat{\beta}_1, \ldots, \hat{\beta}_K$, are used as the regression coefficients of the linear multiple regression model, which is given as Eq. (2.30).
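As a rough sketch of solving these simultaneous equations, the following pure-Python example builds the least-squares normal equations and solves them by Gaussian elimination. It is an assumption-laden illustration: it uses a design matrix with a leading column of ones for the intercept rather than the deviation-sum form of the text (the two give the same solution), and all function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix copy so that the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; explanatory variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, ys):
    """Return [alpha, beta_1, ..., beta_K] minimizing the residual sum of squares."""
    design = [[1.0] + list(r) for r in rows]  # the leading 1.0 carries the intercept
    p = len(design[0])
    # Normal equations: (X^T X) coeffs = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from y = 1 + 2*x1 - 0.5*x2.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
ys = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in rows]
coeffs = fit_multiple_regression(rows, ys)
```

With noise-free data the recovered coefficients match the generating ones up to rounding; with real measurements a numerically more robust solver (for example a QR-based least-squares routine) would usually be preferred over explicit normal equations.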

2.2 Fundamentals of AR Models

2.2.1 Overview of the AR Model

The AR model is a basic statistical mathematical model that is used in a wide range of fields such as economic forecasting and machine control. This model multiplies the time series data of several variables by coefficients and predicts their future values from the resulting linear combination. In other words, it uses a linear equation to express the change one step ahead from the variables’ own past history.

In this book, a time series refers to data sampled at regular intervals during the time evolution of a control target system. As in the case of multiple linear regression, let the number of variables (dimensions) be $K$, the multi-dimensional variable vector at time $t$ be $x(t)$, the linear combination coefficient matrix applied to the data $l$ points in the past be $A(l)$, and the error from the measured value be $\varepsilon(t)$; the model is then expressed by Eq. (2.33).

A multivariate AR model that uses past time series data for forecasting ($L \times K$ values, since there are $K$ variables), where $L$ is the order of the model (how many past time steps are used), is given by Eqs. (2.31)–(2.33).

In the AR model, the error is modeled as having zero mean and variance $\sigma^2$. The maximum likelihood method, a statistical estimation method, can be employed if $\varepsilon(t)$, the error between the predicted and measured values, follows a normal distribution. For ease of computation, $x(t)$ and $\varepsilon(t)$ are treated below as zero-mean time series obtained by subtracting their mean values.

The problem here is the AR model order $L$, which is the integer number of steps to go back in time. If the order $L$ is too small, the model is too simple and the prediction error increases. On the other hand, if the order $L$ is too large, the model becomes too complex, fits the noise in the training data, and its predictions become inaccurate. Based on these considerations, the Akaike information criterion (AIC), introduced in 1972, is a general method for finding the optimal AR model order. AIC is generally defined as AIC = −2 × (maximum log-likelihood) + 2 × (number of parameters), so the maximum log-likelihood and the number of parameters must be determined to calculate it.
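To make the structure of the model concrete, here is a minimal one-step-ahead prediction sketch. It assumes the usual AR form, in which the prediction is the sum over lags of $A(l)$ times $x(t-l)$; the function name, the coefficient matrices, and the history values are hypothetical and chosen only for illustration.

```python
def ar_predict(history, coeffs):
    """One-step-ahead prediction x_hat(t) = sum over l of A(l) x(t - l).

    history: list of K-dimensional vectors, oldest first, newest last.
    coeffs:  list [A(1), ..., A(L)] of K x K matrices given as lists of rows.
    """
    order = len(coeffs)
    if len(history) < order:
        raise ValueError("history is shorter than the model order")
    k = len(history[-1])
    prediction = [0.0] * k
    for lag, a in enumerate(coeffs, start=1):
        past = history[-lag]  # x(t - lag)
        for i in range(k):
            prediction[i] += sum(a[i][j] * past[j] for j in range(k))
    return prediction


# Hypothetical two-variable (K = 2), second-order (L = 2) model.
A1 = [[0.5, 0.1], [0.0, 0.8]]
A2 = [[0.2, 0.0], [0.1, -0.3]]
history = [[1.0, 2.0], [3.0, 4.0]]  # x(t-2), x(t-1)
x_hat = ar_predict(history, [A1, A2])
```

The prediction uses $L \times K = 4$ past values here, matching the count given in the text.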

If the number of variables is $K$ and the order is $L$, the number of parameters is given by Eq. (2.36).

The maximum log-likelihood of the multivariate AR model is given by Eq. (2.38).

AIC can be calculated as a function of L by Eq. (2.39).

Therefore, the integer $L$ that minimizes this AIC is the optimal order of the multivariate AR model.
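Order selection by AIC can be sketched as follows for the simpler univariate Gaussian case. Several things here are assumptions rather than the book's formulas: the maximum log-likelihood is taken as $-\frac{T}{2}(\ln(2\pi\hat{\sigma}^2) + 1)$, the parameter count is taken as $L + 1$ ($L$ coefficients plus the variance), and the residual variances per order are hypothetical numbers, not results.

```python
import math


def gaussian_ar_aic(n_samples, residual_variance, n_params):
    """AIC = -2 * (maximum log-likelihood) + 2 * (number of parameters)."""
    # Maximum log-likelihood of a Gaussian model with sigma^2 set to its estimate.
    max_log_likelihood = -0.5 * n_samples * (
        math.log(2.0 * math.pi * residual_variance) + 1.0
    )
    return -2.0 * max_log_likelihood + 2.0 * n_params


def select_order(n_samples, variance_by_order):
    """Return the order with the smallest AIC, together with all AIC values."""
    aics = {
        order: gaussian_ar_aic(n_samples, var, order + 1)  # L coefficients + sigma^2
        for order, var in variance_by_order.items()
    }
    return min(aics, key=aics.get), aics


# Hypothetical residual variances obtained by fitting orders 1 to 4 to T = 200 samples.
variance_by_order = {1: 1.00, 2: 0.60, 3: 0.598, 4: 0.597}
best_order, aics = select_order(200, variance_by_order)
```

In this illustration the variance drops sharply from order 1 to order 2 and barely improves afterwards, so the penalty term makes order 2 the minimizer: the extra parameters of orders 3 and 4 are not justified by the small gain in fit.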

2.2.2 Yule-Walker Method (One Variable)

In this section, before dealing with multivariate AR models, we will discuss how to obtain AR model coefficients using a single-variable AR modeling method. In Sect. 2.2.3, we will expand the method to obtain the AR model coefficients for multiple variables.

Let $x(t)$ be the one-dimensional time series data at time $t$, $A(l)$ be the coefficient matrix (vector) of the AR model, $\varepsilon(t)$ be the error from the measured value, and $L$ be the order of the AR model.

The probability density function of $\varepsilon(t)$ is the normal probability density function $f(x(t); 0, \sigma^2)$, which is given by Eq. (2.42).

Next, find the likelihood function $\mathit{Li}$ as the joint probability density function of the $T$ time series data points for $t = 1, 2, \ldots, T$. The likelihood function is the product of the probability density functions of the individual time series data points; it is a number that represents how plausible it is that the precondition was “something” in view of the observed results.

By varying the value of $A(l)$ and finding the parameter value that maximizes the likelihood function $\mathit{Li}(A(l), \sigma^2)$, we obtain the coefficient matrix $A(l)$ for which the probability of occurrence of the observed time series data is highest. Finding $A(l)$ in this way yields the model with the smallest error. For ease of analysis, we take the logarithm of the above likelihood function and again denote the result by $\mathit{Li}(A(l), \sigma^2)$, as in Eq. (2.44).

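$$\mathit{Li}(A(l), \sigma^2) = -\frac{T}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\left(x(t) - \sum_{l=1}^{L} A(l)\,x(t-l)\right)^2 \qquad (2.44)$$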

Here, the fact that the zero-mean $\varepsilon(t)$ is normally distributed means that the probability density is greatest at $\varepsilon(t) = 0$. The likelihood function, expressed through this normal probability density function, takes its extreme value, that is, the highest probability of occurrence, where its partial derivatives with respect to the parameters vanish. We therefore set the partial derivative of the log-likelihood function with respect to each $A(m)$, $m = 1, 2, \ldots, L$, to zero.

Then, by rearranging Eq. (2.45), we obtain Eq. (2.46).

Equation (2.47) can be obtained by expanding Eq. (2.46).

We will now explain the autocovariance function, which is necessary for solving Eq. (2.47). The autocovariance function $C(|m - l|)$ can be expressed by Eq. (2.48).

The autocovariance function $C(|m - l|)$ measures the correlation between the time series data and the same data shifted by $|m - l|$ points. The larger its positive value, the stronger the positive correlation; the more negative its value, the stronger the negative correlation. Dividing both sides of Eq. (2.47) by $T$ and applying the autocovariance function, we obtain Eq. (2.49).

Plugging the autocovariance function of Eq. (2.48) into Eq. (2.49), we obtain Eq. (2.50).

In Eq. (2.50), $m$ takes the values $1, 2, \ldots, L$, and substituting each value gives $L$ simultaneous equations. Equation (2.51) expresses them in matrix form and is called the Yule-Walker equation [1].

The $A(l)$ obtained from these simultaneous equations is the maximum likelihood estimator of $A(l)$. To obtain the coefficients $A(1), A(2), \ldots, A(L)$ of the univariate AR model from the Yule-Walker equation, first calculate the autocovariances $C(0), C(1), \ldots, C(L)$ from the time series data. Next, substitute them into Eq. (2.51) and solve the simultaneous equations for $A(l)$ to obtain the coefficients.
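For orientation, the relations in this step are commonly written as below for a zero-mean series. These are the standard sample autocovariance and the resulting system of equations, given here as an assumption about the form of Eqs. (2.48) and (2.50) rather than a reproduction of them.

```latex
C(k) = \frac{1}{T} \sum_{t=k+1}^{T} x(t)\, x(t-k), \qquad
\sum_{l=1}^{L} A(l)\, C(|m-l|) = C(m), \quad m = 1, 2, \ldots, L
```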
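The procedure just described can be sketched as follows. Instead of a general-purpose linear solver, this sketch solves the Yule-Walker equations with the Levinson-Durbin recursion, a standard method that exploits their Toeplitz structure; that substitution, the $1/T$ autocovariance normalization, the function names, and the simulated data are all assumptions made for illustration.

```python
import random


def autocovariance(x, lag):
    """C(lag) = (1/T) * sum over t of x(t) * x(t - lag) for a zero-mean series."""
    t_len = len(x)
    return sum(x[t] * x[t - lag] for t in range(lag, t_len)) / t_len


def levinson_durbin(c, order):
    """Solve sum_l A(l) C(|m - l|) = C(m), m = 1..order, for the coefficients A."""
    a = []      # coefficients of the current order
    err = c[0]  # residual variance of the current order
    for m in range(1, order + 1):
        acc = c[m] - sum(a[j] * c[m - 1 - j] for j in range(m - 1))
        k = acc / err  # reflection coefficient
        a = [a[j] - k * a[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


def yule_walker(x, order):
    """Estimate AR coefficients and residual variance of a univariate series."""
    mean = sum(x) / len(x)
    z = [v - mean for v in x]  # remove the mean, as the text prescribes
    c = [autocovariance(z, k) for k in range(order + 1)]
    return levinson_durbin(c, order)


# Simulated AR(2) data with hypothetical coefficients and unit-variance noise.
random.seed(0)
true_a = [0.6, -0.3]
x = [0.0, 0.0]
for _ in range(5000):
    x.append(true_a[0] * x[-1] + true_a[1] * x[-2] + random.gauss(0.0, 1.0))
a_hat, sigma2_hat = yule_walker(x, order=2)
```

With a few thousand samples the estimates should land close to the generating coefficients, and the returned `sigma2_hat` approximates the noise variance. The recursion also yields the solutions for all lower orders along the way, which is convenient when comparing candidate orders by AIC.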

2.2.3 Yule-Walker Method (Multivariate)

In the previous section, we discussed the Yule-Walker method with one variable, and in this section, we will discuss the Yule-Walker method for multiple variables. Letting $x(t)$ and $\varepsilon(t)$ be vectors and $A(l)$ be a matrix, and transforming Eq. (2.34) in the previous section, the error at time $t$ of the multivariate AR model is given by Eq. (2.52).

In the AR model, $\varepsilon(t)$ is assumed to be normally distributed, so its probability density function is given by Eq. (2.53) using the normal distribution.

Now, if we calculate the likelihood function using the probability density function in Eq. (2.53), as in the one-variable case in the previous section, the likelihood function for multiple variables is given by Eq. (2.54), which has the same form as Eq. (2.43).

( A(l ),σ 2 ) =

For ease of analysis, we take the logarithm of Eq. (2.54) and define the log-likelihood function.

Li( A(l ),σ 2 ) =

The maximum likelihood estimator of $A(l)$, a parameter of the probability distribution, can be obtained from this log-likelihood function by setting its partial derivative with respect to $A(l)$ to zero and finding the extreme value.

Setting the partial derivative of Eq. (2.56) to zero and rearranging gives Eq. (2.57).
