Polish-British Workshops Computer Systems Engineering Theory & Applications


Proceedings of the International Students Workshops

COMPUTER SYSTEMS ENGINEERING THEORY & APPLICATIONS

Editors: Radoslaw RUDEK Wojciech KMIECIK

Organised by: Department of Systems and Computer Networks, Faculty of Electronics, Wroclaw University of Science and Technology, Wroclaw, Poland with support from: The IET Control and Automation Professional Network The Institute of Measurement and Control Systems


Reviewers: Keith J. BURNHAM Andrzej KASPRZAK Leszek KOSZALKA Iwona POZNIAK-KOSZALKA Mariusz NOWOSTAWSKI Piotr SKWORCOW Marina YASHINA Slawomir ZIEMIAN Andrzej ZOLNIEREK Dawid ZYDEK

Cover page designer Aleksandra de’Ville

Typesetting: Camera-ready by authors Printed by: Drukarnia Oficyny Wydawniczej Politechniki Wrocławskiej, Wrocław 2016 Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland

ISBN 978-83-933924-1-4


International Students Workshops: 14th PBW & 2nd ISW in Szklarska Poreba, Poland, June 2014, and 15th PBW & 3rd ISW in Szklarska Poreba, Poland, June 2015

International Steering Committee

2014: Keith J. BURNHAM, Leszek KOSZALKA, Iwona POZNIAK-KOSZALKA, Henry SELVARAJ, Grzegorz CHMAJ, Emilio CORCHADO, Rama Murthy GARIMELLA, Karol GEGA, Andrzej KASPRZAK, Tomasz LARKOWSKI, Lars LUNDBERG, Mariusz NOWOSTAWSKI, Pawel PODSIADLO, Radoslaw RUDEK, Przemyslaw RYBA, Dragan SIMIC, Piotr SKWORCOW, Vaclav SNASEL, Gwidon STACHOWIAK, Ventzeslav VALEV, Krzysztof WALKOWIAK, Michal WOZNIAK, Anna ZAKRZEWSKA, Jan ZARZYCKI, Dawid ZYDEK

2015: Keith J. BURNHAM, Leszek KOSZALKA, Iwona POZNIAK-KOSZALKA, Henry SELVARAJ, Wojciech KMIECIK, Grzegorz CHMAJ, Emilio CORCHADO, Petre DINI, Rama Murthy GARIMELLA, Karol GEGA, Milena KAROVA, Andrzej KASPRZAK, Tomasz LARKOWSKI, Lars LUNDBERG, Mariusz NOWOSTAWSKI, Pawel PODSIADLO, Radoslaw RUDEK, Piotr SKWORCOW, Gwidon STACHOWIAK, Malgorzata SUMISLAWSKA, Piotr SYDOR, Michal WOZNIAK, Marina YASHINA, Anna ZAKRZEWSKA, Ivan ZAJIC, Jan ZARZYCKI, Slawomir ZIEMIAN, Dawid ZYDEK, Andrzej ZOLNIEREK

Local Organizing Committee

2014: Wojciech KMIECIK, Roza GOSCIEN, Lukasz GADEK, Justyna KULINSKA, Pawel KSIENIEWICZ

2015: Monika KOSZALKA, Miroslawa GOZDOWSKA, Tomasz KUCOFAJ, Lukasz GADEK, Pawel KSIENIEWICZ

Conference Proceedings Editors: Radoslaw RUDEK and Wojciech KMIECIK


IET Control & Automation Professional Network The IET Control and Automation Professional Network is a global network run by, and on behalf of, professionals in the control and automation sector with the specialist and technical backup of the IET. Primarily concerned with the design, implementation, construction, development, analysis and understanding of control and automation systems, the network provides a resource for everyone involved in this area and facilitates the exchange of information on a global scale. This is achieved by undertaking a range of activities including a website with a range of services such as an events calendar and searchable online library, face-to-face networking at events and working relationships with other organisations. For more information on the network and how to join visit www.theiet.org/control-automation


Preface

It is with great pleasure that we as Editors write the preface of the Proceedings for the fourteenth and fifteenth Polish-British Workshops and the second and third International Students Workshops on Computer Systems Engineering: Theory and Applications. The Polish-British Workshops have been organized jointly by the Department of Systems and Computer Networks, Wroclaw University of Technology, Wroclaw, Poland and the Control Theory and Applications Centre, Coventry University, Coventry, UK since 2001, and have now become a traditional and integral part of the long-lasting collaboration between Wroclaw University of Technology and Coventry University, with the Workshops taking place every year. Over that time we have witnessed a steady growth of the Workshops, both in terms of participant numbers and the diversity of institutions represented from all over the world. This was reflected by extending the name of the Workshops, from the 2013 edition onwards, to International Students Workshop. The Workshops bring together young researchers from different backgrounds and at different stages of their careers, including undergraduate and MSc students, PhD students and post-doctoral researchers. It is a truly fantastic and quite unique opportunity for early-stage researchers to share their ideas and learn from the experience of others, to become inspired by the work carried out by their more senior colleagues, and to receive valuable feedback concerning their work from accomplished researchers, all in a pleasant and friendly environment surrounded by the picturesque mountains of Lower Silesia. Both Workshops covered by this book took place in the small town of Szklarska Poreba and, as usual, the theme was focused on solving complex scientific and engineering problems in a wide area encompassing computer science, control engineering, information and communication technologies and operational research. A number of papers were presented by young researchers and engineers, but only the best papers were chosen to be published in this book. The problems addressed and the solutions proposed in the papers presented at the Workshops and included in the Proceedings are closely linked to the issues currently faced by society, such as efficient utilisation of energy and resources, design and operation of communication networks, modelling


and control of complex dynamical systems and handling the complexity of information. We hope that these Proceedings will be of value to those researching in the relevant areas and that the material will inspire prospective researchers to become interested in seeking solutions to complex scientific and engineering problems. We realise that none of this would be possible without the continued efforts and commitment of the Polish-British Workshop founders: Dr. Iwona Pozniak-Koszalka, Dr. Leszek Koszalka and Prof. Keith J. Burnham. On behalf of all researchers who have attended the Polish-British Workshop series, including ourselves, we would like to express our sincere gratitude for making the Workshop series such a tremendous success, for sharing with others their extensive knowledge and experience, and for providing valuable guidance to young researchers at this crucial stage of their careers. We would also like to express our special thanks to Dr. Garimella Rama Murthy for his valuable contribution to the Proceedings.

Dr. Wojciech Kmiecik, Department of Systems and Computer Networks, Faculty of Electronics, Wroclaw University of Science and Technology, Poland and Dr. Radosław Rudek, Department of Information Technology, Wroclaw University of Economics, Poland Editors of the Proceedings and Members of the International Steering Committees.


Contents

Garimella RAMA MURTHY – RESOLUTION OF P=NP CONJECTURE: P=NP ... 10
Viacheslav BARKOV – METHOD OF PROVIDING FOR EQUITABLE DATA ACCESS ... 31
Lukasz GADEK, Keith BURNHAM, Leszek KOSZALKA – COMPUTATION OF POLE-PLACEMENT AND ROOT LOCUS METHOD FOR CLASS OF BILINEAR SYSTEMS ... 42
Róża GOŚCIEŃ – A SIMPLE COST ANALYSIS IN ELASTIC OPTICAL NETWORKS ... 54
Tomasz JANUS – INTEGRATED MATHEMATICAL MODEL OF A MBR REACTOR FOR WASTEWATER TREATMENT ... 65
Justyna KULIŃSKA – DETECTION OF FUZZY PATTERNS IN MULTIDIMENSIONAL FEATURE SPACE IN PROBLEM OF BODY GESTURE RECOGNITION ... 81
Michal LASZKIEWICZ, Tomasz LEWOC – NORMALIZATION OF GAS CONSUMPTION IN BUILDINGS ... 110
Anna STRZELECKA, Tomasz JANUS, Leticia OZAWA-MEIDA, Bogumil ULANICKI, Piotr SKWORCOW – MODELLING OF UTILITY–SERVICE PROVISION ... 120
Grzegorz ZATORSKI – OPTIMIZATION OF LOW-LEVEL COMPUTER VISION METHODS FOR EYE TRACKING ... 137
Sivo DASKALOV, Simona STOYANOVA – POINTED NAVIGATION OF A ROBOT WITH THE USAGE OF INFRARED CAMERAS AND MARKERS ... 149
Norbert KOZŁOWSKI – SENTIMENT CLASSIFICATION IN POLISH LANGUAGE ... 154
Piotr LECHOWICZ – PATH OPTIMIZATION OF 3D PRINTER ... 164
Evangelia MANOLA – MODELLING OF REFLECTIVE CRACKING IN FLEXIBLE COMPOSITE PAVEMENTS ... 176
Bartłomiej Filip SUPERSON, Michał WANCKE – A LOVE SONG RECOGNITION ... 186


Computer Systems Engineering 2014 Keywords: hopfield network, min-cut, max-cut, stable states

Garimella RAMA MURTHY*

RESOLUTION OF P=NP CONJECTURE: P=NP In this research paper, the problem of optimization of a quadratic form over the convex hull generated by the corners of the hypercube is attempted and solved. It is reasoned that, under some conditions, the local/global optima occur at the corners of the hypercube. An efficient algorithm for the computation of the global optimum stable state is discussed. It is proved that global optimum anti-stable state computation of a Hopfield network is equivalent to computation of a maximum cut in the associated graph, an NP-hard problem. An exact deterministic polynomial time algorithm for the computation of the maximum cut is proposed. Thus it is proved that P=NP.

1. INTRODUCTION

It is well known that a Hopfield neural network acts as a local optimization device with respect to the associated quadratic energy function. The detailed operation of the Hopfield Neural Network (HNN) can be found in [2]. The following convergence theorem summarizes the dynamics of a discrete-time Hopfield neural network in the serial and parallel modes of operation. It characterizes the operation of the neural network as an associative memory.

Theorem 1
Let the pair N = (M, T) specify a Hopfield neural network (M – synaptic weight matrix; T – threshold vector). Then the following hold true [2]:
(1) Hopfield: If N is operating in a serial mode and the elements of the diagonal of M are non-negative, the network will always converge to a stable state (i.e. there are no cycles in the state space).

* Signal Processing and Communication Research Center, International Institute of Information Technology, Hyderabad, India, e-mail: rammurthy@iiit.ac.in


(2) Goles: If N is operating in the fully parallel mode, the network will always converge to a stable state or to a cycle of length 2 (i.e. the cycles in the state space are of length ≤ 2).

The so-called energy function utilized to prove the above convergence theorem is

E(t) = V^T(t) M V(t) − 2 V^T(t) T.    (1)

Thus the HNN, when operating in the serial mode, will always reach a stable state / anti-stable state that corresponds to a local maximum / minimum of the energy function. Hence the theorem suggests that the Hopfield Associative Memory (HAM) can be utilized as a device for performing a local/global search to compute the maximum/minimum value of the energy function.

• Contribution of Bruck et al.

The above theorem implies that all optimization problems which involve optimization of a quadratic form over the unit hypercube (the constraint/feasible set) can be mapped to an HNN which performs a search for its optimum. One such problem is the computation of a minimum cut in an undirected graph. The following theorem, proved in [1], summarizes the equivalence between the minimum cut and the computation of the global optimum of the energy function of an HNN.

Theorem 2
Consider a Hopfield Neural Network (HNN) N = (M, T) with the thresholds at all nodes being zero, i.e. T ≡ 0. The problem of finding the global optimum stable state (for which the energy is maximum) is equivalent to finding a minimum cut in the graph corresponding to N.

Corollary
Consider a Hopfield neural network N = (M, T) with the thresholds at all neurons being zero, i.e. T ≡ 0. The problem of finding a state V for which the energy is the global minimum is equivalent to finding a maximum cut in the graph corresponding to N.

Proof. Follows from the argument in [1]. Q.E.D.
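To make the serial-mode dynamics of Theorem 1 concrete, here is a small illustrative sketch in Python (not part of the original paper; the network below is an arbitrary random example). It performs asynchronous sign updates and reports the energy E = V^T M V − 2 V^T T, which does not decrease under such updates when the diagonal of M is non-negative.

```python
import numpy as np

def energy(M, T, v):
    # E = V^T M V - 2 V^T T, the energy function of Theorem 1
    return v @ M @ v - 2 * v @ T

def serial_mode(M, T, v0, max_sweeps=100):
    """Asynchronous (serial) Hopfield updates: one neuron at a time."""
    v = v0.copy()
    for _ in range(max_sweeps):
        changed = False
        for k in range(len(v)):
            h = M[k] @ v - T[k]               # local field at neuron k
            new_state = 1.0 if h >= 0 else -1.0  # Sign(0) taken as +1
            if new_state != v[k]:
                v[k] = new_state
                changed = True
        if not changed:                        # stable state reached
            break
    return v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 6))
    M = (A + A.T) / 2                          # symmetric weights
    np.fill_diagonal(M, 0.0)                   # non-negative diagonal (here zero)
    T = np.zeros(6)
    v = rng.choice([-1.0, 1.0], size=6)
    print("initial energy:", energy(M, T, v))
    v_stable = serial_mode(M, T, v)
    print("stable state:", v_stable, "energy:", energy(M, T, v_stable))
```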


Thus, the operation of the Hopfield neural network in the serial mode is equivalent to conducting a local search for a minimum cut in the associated graph. In view of the above theorem, a natural question that arises is whether a Hopfield neural network can be designed to perform a local search for a minimum cut in a directed graph. The following theorem is in the same spirit as the above theorem, for directed graphs.

Theorem 3
Let M be the matrix of edge weights (M is not necessarily symmetric) in a weighted directed graph G = (V, E). The network Ñ = (M̃, T) performs a local search for a directed minimum cut of G, where

M̃ = (M + M^T) / 2   and   T_k = (1/2) Σ_{i=1}^{N} (M_ki − M_ik).

Proof. Refer to [10].

The author, after mastering the results in [1], [2], contemplated removing the conditions required in Theorems 1 and 2. The fruits of that effort are documented in the following sections. This research paper is organized as follows. The author formulated and solved the problem of optimizing a quadratic form over the convex hull generated by the corners of the unit hypercube; this result and the related ideas are documented in Section 2. In Section 3, some contributions are made towards solving the NP-hard problem of computing the global optimum anti-stable state of a Hopfield neural network. Finally, some conclusions are reported in Section 4.
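As a quick illustration of Theorem 3 (a Python sketch added here for clarity, not code from the paper), the symmetric weight matrix M̃ and threshold vector T can be computed directly from a directed edge-weight matrix:

```python
import numpy as np

def equivalent_network(M):
    """Form the network of Theorem 3 from a directed weight matrix M:
    M_tilde = (M + M^T)/2,  T_k = 0.5 * sum_i (M[k, i] - M[i, k])."""
    M = np.asarray(M, dtype=float)
    M_tilde = (M + M.T) / 2.0
    T = 0.5 * (M.sum(axis=1) - M.sum(axis=0))   # row sums minus column sums
    return M_tilde, T

if __name__ == "__main__":
    M = np.array([[0.0, 2.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [3.0, 0.0, 0.0]])
    M_tilde, T = equivalent_network(M)
    print(M_tilde)
    print(T)
```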

2. OPTIMIZATION OF QUADRATIC FORMS OVER THE HYPERCUBE

The energy function associated with an HNN, considered in (1), is not exactly a quadratic form. The author questioned whether the threshold vector associated with an HNN can always be assumed to be zero (for instance, by introducing a dummy node and suitably choosing the synaptic weights from it to the other nodes). The result of that effort is the following Lemma.



Lemma 1
There is no loss of generality in assuming that the threshold vector associated with a Hopfield Neural Network (HNN) is an 'all-zero' vector.

Proof. Refer to the argument provided in [6]. •
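The dummy-node idea mentioned above can be illustrated with a short sketch (an illustration only: the augmentation below is one standard way to absorb thresholds, not necessarily the exact construction of [6]). With the extra neuron clamped to +1, the zero-threshold energy of the augmented network equals the original energy V^T M V − 2 V^T T.

```python
import numpy as np

def absorb_thresholds(M, T):
    """Augment the network with one dummy neuron so the thresholds vanish."""
    n = len(T)
    M_aug = np.zeros((n + 1, n + 1))
    M_aug[:n, :n] = M
    M_aug[:n, n] = -T      # synaptic weights to/from the dummy node
    M_aug[n, :n] = -T
    return M_aug

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    M = (A + A.T) / 2
    np.fill_diagonal(M, 0.0)
    T = rng.standard_normal(4)
    v = rng.choice([-1.0, 1.0], size=4)
    v_aug = np.append(v, 1.0)              # dummy neuron clamped to +1
    M_aug = absorb_thresholds(M, T)
    e_original = v @ M @ v - 2 * v @ T
    e_augmented = v_aug @ M_aug @ v_aug
    print(np.isclose(e_original, e_augmented))   # True
```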

Thus, it is clear that a properly chosen HNN acts as a local/global optimization device for an arbitrary quadratic form as the objective function, with the constraint set being the unit hypercube. Thus, in the following discussion, we consider only a pure quadratic form as the energy function.
• Also, in part (1) of Theorem 1, we require the condition that the diagonal elements of the symmetric synaptic weight matrix are all non-negative. We show in the following theorem that such a condition can be removed.
In this section, we consider the problem of maximization of the quadratic form (associated with a symmetric matrix) over the corners of the binary, symmetric hypercube. Mathematically, this set is specified precisely as follows:

S = { X = (x_1, x_2, x_3, ..., x_N) : x_i = ±1 for 1 ≤ i ≤ N }.    (2)

From now onwards, we call the above set simply the hypercube. This optimization problem arises in a rich class of applications. It is the analogue, over the hypercube, of the maximization over the hypersphere of the quadratic form associated with a symmetric matrix; Rayleigh provided the solution to the optimization problem on the unit hypersphere. A necessary condition on the optimum vector lying on the unit hypercube is now provided. The following theorem and other associated results were first documented in [3].

Theorem 4
Let B be an arbitrary N x N real matrix. From the standpoint of maximization of the quadratic form u^T B u on the hypercube, there is no loss of generality in assuming that B is a symmetric matrix with zero diagonal elements. If u maximizes the quadratic form u^T B u, subject to the constraint that |u_i| ≤ 1 for 1 ≤ i ≤ N (i.e. u lies on the convex hull generated by the corners of the hypercube), then

u = Sign(C u),    (3)

where C = (1/2)(B + B^T) with all the diagonal elements set to zero. In equation (3), Sign(0) is interpreted as +1 or −1 based on the requirement.

Proof. Any arbitrary matrix can be split into symmetric and skew-symmetric components, i.e.

B = (1/2)(B + B^T) + (1/2)(B − B^T), with C = (1/2)(B + B^T).    (4)

Since the quadratic form associated with the skew-symmetric part (matrix) is zero, as far as the optimization of the quadratic form is concerned, there is no loss of generality in restricting consideration to symmetric matrices. •
It is now shown that, as far as the current optimization problem is concerned, we need only consider symmetric matrices with zero diagonal elements. Consider the quadratic form u^T C u, where the vector u lies on the boundary of the hypercube. Since u lies on the boundary, the quadratic form can be rewritten in the following form:

u^T C u = Trace(C) + Σ_{i=1}^{N} Σ_{j=1, j≠i}^{N} u_i C_ij u_j.    (5)

Since Trace(C) is a constant, as far as the optimization over the hypercube is concerned, there is no loss of generality in restricting consideration to a matrix C whose diagonal elements are all set to zero. •
In the above discussion, we assumed that the optimum of the quadratic form over the convex hull of the hypercube occurs on the boundary. This is reasoned in the following discussion. Now, we apply the discrete maximum principle [8] to solve the static optimization problem [9]. Consider a discrete-time system

Z(k+1) = u(k) for k = 0, 1, where u(0) = u.    (6)

The criterion function to be minimized is given by

J(0) = −(1/2) Z^T(1) C Z(1) = θ(Z(1), 1).    (7)

The Hamiltonian is given by

H[Z_k, u_k, ρ_{k+1}, k] = ρ_{k+1}^T u(k).    (8)

From the discrete maximum principle [8], since |u(0)| ≤ 1, the Hamiltonian is minimized when

u(0) = −Sign(ρ_1).    (9)

From the following canonical equation [8],

ρ_1 = ∂θ / ∂Z(1) = −C Z(1).    (10)

Thus, from (6), (9) and (10), we have that

u(0) = u = Sign(C Z(1)) = Sign(C u(0)) = Sign(C u).    (11)

Thus, the optimal vector u satisfies the necessary condition (3) and it lies on the boundary of the hypercube.

Corollary
Let E be an arbitrary N x N real matrix. If u minimizes the quadratic form u^T E u, subject to the constraint |u_i| ≤ 1 for 1 ≤ i ≤ N, then

u = −Sign(C u),    (12)

where C is the symmetric matrix with zero diagonal elements obtained from E.

Proof. It may be noted that the same proof as in the above theorem, with the objective changed from maximization to minimization of the quadratic form, may be used. Q.E.D.

Definition
The local minimum vectors of a quadratic form on the hypercube are called anti-stable states.

Note: It is immediate to see that if u is a stable state (or anti-stable state), then −u (minus u) is also a stable state (or anti-stable state). •

OPTIMIZATION OF HERMITIAN FORMS OVER THE COMPLEX HYPERCUBE

Quadratic forms associated with a complex-valued matrix have been investigated by many researchers. The generalization of Rayleigh's theorem to Hermitian matrices (discussed below) has been successful. Hermitian forms evaluated on the "complex hypercube" have been pursued by the author [6]. Before giving details associated with the optimization of such Hermitian forms, the underlying concepts are explained in the following.

A. Complex Hypercube
Consider a vector of dimension N whose components assume values in the set { 1 + j1, 1 − j1, −1 + j1, −1 − j1 }. There are thus 4^N such points, which form the corners of a set called the "complex hypercube".

B. Complex Signum Function (proposed by the author)
Consider a complex number a + jb. The "complex signum function" is defined as follows:

Csign(a + jb) = Sign(a) + j Sign(b).    (13)

For the purpose of completeness, we investigated quadratic forms associated with an arbitrary complex matrix. To the best of our knowledge, we are the first to deal with the evaluation of such quadratic forms over the complex hypercube [6]. Now we prove a theorem dealing with the optimization of a Hermitian form over the complex hypercube. This result is in the same spirit as the complex Rayleigh theorem, with the constraint set being the complex hypercube. Let u be a vector on the complex unit hypercube, and let the convex hull generated by the corners of the complex hypercube be denoted by J.

Theorem 5
Let F be an arbitrary N x N Hermitian matrix. From the standpoint of maximization of the Hermitian form u* F u on the complex hypercube, there is no loss of generality in assuming that F is a Hermitian matrix with zero diagonal elements, i.e. H. If u maximizes the Hermitian form u* H u, subject to the constraint that u ∈ J (i.e. u lies on the convex hull generated by the corners of the complex hypercube), then

u = Csign(H u),    (14)

where H is obtained from F by setting all the diagonal elements (of F) to zero and retaining the other elements. In equation (14), Csign(0) is interpreted as +1 + j or −1 − j based on the requirement.

Proof. The proof follows from the same argument as Theorem 4 (for a complex state vector) and is omitted for brevity. Q.E.D.

Remark 1


In view of equation (5), there is no loss of generality in assuming that the TRACE of the matrix is zero for determining the stable/anti-stable states (i.e. for optimizing the quadratic form). Since the TRACE of a matrix M is the sum of its eigenvalues, the sum of the positive eigenvalues then equals in magnitude the sum of the negative eigenvalues. Hence, it is easy to see that a symmetric matrix with zero diagonal elements cannot be purely positive definite or purely negative definite; it can be assumed to be indefinite, with the largest eigenvalue being a positive real number. Thus the location of the stable states (vectors) is invariant under variation of Trace(M).

3. GLOBAL OPTIMUM STABLE / ANTI-STABLE STATE COMPUTATION: COMPUTATION OF MINIMUM AND MAXIMUM CUT IN DIRECTED AND UNDIRECTED GRAPHS

As discussed above, Bruck et al. [1] showed that the problem of computing the maximum stable state is equivalent to that of computing a minimum cut in the associated undirected graph. As per Bruck et al., this is claimed to be an NP-hard problem [1]. However, the theoretical computer scientists Rao Kosaraju and Sartaj Sahni informed the author that the minimum cut in an undirected graph is known to be in P (i.e. polynomial time algorithms exist). They also informed the author that MAX CUT in an undirected graph is NP-complete/NP-hard. From the proof of Theorem 2 in [1], it follows that computing the MAXIMUM CUT is equivalent to the problem of determining the global minimum anti-stable state (i.e. determining the corner of the unit hypercube where the global minimum of the quadratic form is attained).

Goals: To see whether an efficient polynomial time algorithm can be discovered for the problem of computing the minimum cut in an undirected graph, and also to find a polynomial time algorithm for the NP-complete problem of computing the MAXIMUM CUT in an undirected graph. Thus we are interested in knowing whether P = NP.

Computation of the minimum cut as well as the maximum cut in an undirected/directed graph with a non-negative edge weight matrix is solved in [9], [7]. Polynomial time algorithms are designed (and analyzed) for these problems. Thus, one proof that P=NP is provided in [9], [7]. In the following discussion, we consider the quadratic form associated with the matrix M (which can also be treated as the synaptic weight matrix).
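The link between cuts and the quadratic form used throughout this section can be checked numerically with the standard identity for ±1 labellings (a sketch added for illustration; it is not code from the paper): cut(x) = (1/4) Σ_{i,j} w_ij (1 − x_i x_j) = (1/4)(Σ_{i,j} w_ij − x^T W x), so maximizing the cut amounts to minimizing x^T W x over the corners of the hypercube.

```python
import numpy as np

def cut_value(W, x):
    """Weight of the cut induced by the ±1 labelling x (W symmetric, zero diagonal)."""
    return 0.25 * np.sum(W * (1 - np.outer(x, x)))

def quadratic_form(W, x):
    return x @ W @ x

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n = 5
    W = np.triu(rng.uniform(0, 1, size=(n, n)), 1)
    W = W + W.T                            # symmetric edge weights, zero diagonal
    x = rng.choice([-1.0, 1.0], size=n)
    lhs = cut_value(W, x)
    rhs = 0.25 * (W.sum() - quadratic_form(W, x))
    print(lhs, rhs, np.isclose(lhs, rhs))  # the two expressions agree
```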


Lemma 2
If a corner of the unit hypercube is an eigenvector of M corresponding to a positive/negative eigenvalue, then it is also a stable/anti-stable state.

Proof. Let h be a right eigenvector of M corresponding to a positive eigenvalue ρ. Then we have M h = ρ h, with

Sign(M h) = Sign(ρ h) = Sign(h) = h.

Thus h is also a stable state of M. Similar reasoning holds true when the eigenvalue is negative. Q.E.D.

The following remark follows from this Lemma.

Remark 2
Suppose we consider a vector on the hypercube (one corner of the hypercube), say X̃, which is also the eigenvector of the matrix M corresponding to the largest positive eigenvalue, i.e. we have M X̃ = μ_max X̃. Then, since μ_max is positive, we have that

Sign(M X̃) = Sign(μ_max X̃) = X̃.

Thus, in view of Rayleigh's theorem, such a corner of the hypercube is also the global optimum stable state. This inference follows from the fact that the points on the hypercube can be projected onto the unit hypersphere. Such a case (where a corner of the hypercube is also the eigenvector corresponding to the maximum eigenvalue) is very special. It is well known that the computation of the maximum eigenvector of a symmetric matrix can be carried out using a polynomial time algorithm (Elsner's algorithm). Thus in such a case P = NP. Now, we consider the arbitrary case where the eigenvector corresponding to the largest positive eigenvalue is NOT a stable state.

Lemma 3
If y is an arbitrary vector on the hypercube that is projected onto the unit hypersphere and x_0 is the eigenvector of the symmetric matrix M corresponding to the maximum eigenvalue (on the unit hypersphere), then we have that


y^T M y = μ_max + 2 μ_max (y − x_0)^T x_0 + (y − x_0)^T M (y − x_0).

Proof. Follows from a simple argument and is omitted for brevity. Refer to [11], [12]. Q.E.D.

Remark 3
Since, by Rayleigh's theorem, it is well known that the global optimum value of a quadratic form on the unit hypersphere is the maximum eigenvalue μ_max, it is clear that for all corners of the hypercube projected onto the unit hypersphere, we must necessarily have that

2 μ_max (y − x_0)^T x_0 + (y − x_0)^T M (y − x_0) ≤ 0.

The goal is to choose a y such that the above quantity is as close to zero as possible (so that the value of the quadratic form is as close to μ_max as possible). Unlike in Remark 2, suppose that

L = Sign(x_0) ≠ Sign(M x_0).

Then a natural question is to see whether L can somehow be utilized for arriving at the global optimum stable state. Such a question was the starting point for the following algorithm to compute the global optimum stable state.

ALGORITHM FOR COMPUTATION OF THE GLOBAL OPTIMUM STABLE STATE (ALSO ANTI-STABLE STATE) OF A HOPFIELD NEURAL NETWORK:

Step 1: Suppose the right eigenvector corresponding to the largest eigenvalue of M is real (i.e. has real-valued components). Compute such an eigenvector, x_0.

Step 2: Compute the corner L of the hypercube from x_0 in the following manner:

L = Sign(x_0).

Step 3: Using L as the initial condition (vector), run the Hopfield neural network in the serial mode of operation.
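A compact sketch of Steps 1-3 (illustrative only; the eigensolver and the update loop below are ordinary library and coding choices, not prescribed by the paper):

```python
import numpy as np

def global_stable_state_candidate(M, max_sweeps=1000):
    """Step 1: leading eigenvector; Step 2: L = Sign(x0); Step 3: serial-mode updates."""
    M = np.asarray(M, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(M)       # real eigen-pairs of the symmetric M
    x0 = eigvecs[:, np.argmax(eigvals)]        # eigenvector of the largest eigenvalue
    v = np.where(x0 >= 0, 1.0, -1.0)           # L = Sign(x0), Sign(0) taken as +1
    for _ in range(max_sweeps):                # serial (asynchronous) mode
        changed = False
        for k in range(len(v)):
            s = 1.0 if M[k] @ v >= 0 else -1.0
            if s != v[k]:
                v[k] = s
                changed = True
        if not changed:
            break
    return v

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    A = rng.standard_normal((8, 8))
    M = (A + A.T) / 2
    np.fill_diagonal(M, 0.0)
    v = global_stable_state_candidate(M)
    print("candidate stable state:", v, "energy:", v @ M @ v)
```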


In view of the following Lemma, the eigenvector corresponding to the largest eigenvalue can always be assumed to have real-valued components.

Lemma 4
If A is a real symmetric matrix, then every eigenvector of A can be CHOSEN to have real-valued components.

Proof. Follows from standard arguments associated with symmetric matrices. Specifically, it is well known that any real symmetric matrix can be diagonalized by a real orthogonal matrix. Thus, one can find a system of real eigenvectors. Q.E.D.

Theorem 6
The above algorithm converges to the global optimum stable state when the vector L is used as the initial condition.

Proof. In view of the results in [1], the idea is to reason that the vector L lies in the domain of attraction of the global optimum stable state. Equivalently, using the results in [1], we want to show that the initial vector lies in the coding sphere of the codeword corresponding to the global optimum stable state. Let y_0 be the global optimum stable state (a vector on the unit hypercube) and let ỹ be one of the other stable states. The idea is to reason that the Hamming distance between L and y_0, i.e. d(L, y_0), is smaller than the Hamming distance between L and ỹ, i.e. d(L, ỹ); that is, to reason that d(L, y_0) < d(L, ỹ). The proof is by contradiction, i.e. suppose that

d(L, ỹ) < d(L, y_0).    (15)

We know that the sign structure of the vectors L and x_0 is exactly the same; more explicitly, all the components of L and x_0 have the same sign (positive or negative). Since the three vectors L, y_0, ỹ lie on the unit hypercube, we consider the various possibilities with respect to the sign structure of those vectors. Thus, we define the following sets of components (indices):

Ā ..... components where both y_0 and ỹ agree in sign with x_0 (and hence with L);
A ..... components where both y_0 and ỹ do NOT agree in sign with x_0 (and hence with L);
C̄ ..... components where only y_0 differs in sign from x_0;
C ..... components where only ỹ differs in sign from x_0.

By the hypothesis (15), the cardinality of the set C̄, i.e. |C̄|, is at least one larger than the cardinality of the set C, i.e. |C|. For concreteness, we first consider the case where |C̄| = |C| + 1. To illustrate the argument, we first consider the case where only the last component of y_0 differs from that of x_0 in sign (but not the last component of ỹ), and all other components (of both y_0 and ỹ) either agree or disagree in sign with those of x_0. To proceed with the proof argument, the vectors y_0 and ỹ (lying on the unit hypercube) are projected onto the unit hypersphere through the following transformation. Let the projected vectors be Q and R, i.e.

Q = y_0 / √N,   R = ỹ / √N,

where N is the dimension of the symmetric matrix M. Thus, we want to reason that, if the Hamming distance condition (15) (our hypothesis) is satisfied, the values of the quadratic form associated with the vectors Q and R satisfy the inequality

R^T M R > Q^T M Q,

and hence arrive at a contradiction to the fact that y_0 is the global optimum stable state. In view of Lemma 3, we have the following expressions for Q^T M Q and R^T M R:

Q^T M Q = μ_max + 2 μ_max (Q − x_0)^T x_0 + (Q − x_0)^T M (Q − x_0),
R^T M R = μ_max + 2 μ_max (R − x_0)^T x_0 + (R − x_0)^T M (R − x_0).

Equivalently, we effectively want to show that

2 μ_max (R − Q)^T x_0 + (R − x_0)^T M (R − x_0) − (Q − x_0)^T M (Q − x_0) > 0.

Let us label the terms as follows:

(I) = 2 μ_max (R − Q)^T x_0,
(II) = (R − x_0)^T M (R − x_0) − (Q − x_0)^T M (Q − x_0).

We want to show that (I) + (II) > 0. To prove such an inequality, we first partition the vectors Q, x_0, R (lying on the unit hypersphere) into two parts:

PART (A): the components of Q and R that simultaneously agree or disagree in sign with those of x_0;
PART (B): the last component of Q, which disagrees in sign with the last component of x_0, while the last component of R agrees in sign with that of x_0.

Thus, the vectors x_0, Q, R are written as

x_0 = [x_0^A ; x_0^B],   Q = [Q_A ; Q_B],   R = [R_A ; R_B],

where Q_B, R_B are scalars. The components of Q_A and R_A are simultaneously either +1/√N or −1/√N, i.e. Q_A = R_A. Thus, except for the last component, all components of the vector R − Q are zero. Further, suppose that the last component of x_0 is −θ, with θ > 0. Then it is easy to see that R_B − Q_B = −2/√N. Summarizing,

(R − Q) = [0  0 ... 0  −2/√N]^T,

and hence

(I) = 2 μ_max (R − Q)^T x_0 = 2 μ_max (−2/√N)(−θ) = 4 μ_max θ / √N,

which is strictly greater than zero. Similarly, even when the last component of x_0 is +θ with θ > 0, it is easy to reason that 2 μ_max (R − Q)^T x_0 is strictly greater than zero.

Now we consider the other term, i.e. term (II). We first partition M into a block-structured matrix,

M = [ M_11  M_12 ; M_21  β ],

where β is a scalar and M_21 = M_12^T. We also partition the vectors Q − x_0 and R − x_0 in the following manner:

(Q − x_0) = [ F^(1) ; G^(1) ],   (R − x_0) = [ F^(2) ; G^(2) ],

where G^(1), G^(2) are scalars. As per the partitioning procedure, it is clear that F^(1) = F^(2). Also, let us consider the case where the last component of x_0 is −θ, with θ > 0. In such a case

G^(1) = 1/√N + θ,   G^(2) = −1/√N + θ.

Note: for the case where the last component of x_0 is +θ with θ > 0, all the following equations are suitably modified; details are avoided for brevity.

In term (II), the following definitions are utilized: H = (R − x_0)^T M (R − x_0) and J = (Q − x_0)^T M (Q − x_0), so that (II) = H − J. In view of the partitioning of the matrix M and of the vectors (R − x_0), (Q − x_0), we have that

H = F^(2)T M_11 F^(2) + 2 F^(2)T M_12 G^(2) + β (G^(2))^2,
J = F^(1)T M_11 F^(1) + 2 F^(1)T M_12 G^(1) + β (G^(1))^2.

Using the fact that F^(1) = F^(2), and letting γ = F^(1)T M_12 = M_12^T F^(1), we have that

H − J = 2γ [ (θ − 1/√N) − (θ + 1/√N) ] + β [ (θ − 1/√N)^2 − (θ + 1/√N)^2 ] = −4γ/√N − 4βθ/√N.

Hence, we have the following expression for (I) + (II):

(I) + (II) = 4 μ_max θ/√N − 4γ/√N − 4βθ/√N = (4/√N) [ −γ − (βθ − μ_max θ) ].

But since x_0 is the eigenvector of M corresponding to the largest eigenvalue μ_max, the last row of M x_0 = μ_max x_0 gives

M_12^T x_0^A − βθ = −μ_max θ,   i.e.   M_12^T x_0^A = βθ − μ_max θ.

Hence we necessarily have that

(I) + (II) = (4/√N) [ −γ − M_12^T x_0^A ]
           = (4/√N) [ −M_12^T F^(1) − M_12^T x_0^A ]
           = (4/√N) [ −M_12^T (F^(1) + x_0^A) ]
           = (4/√N) [ −M_12^T Q_A ].

We first note that M_12 is constrained by the fact that y_0 is a stable state, i.e. Sign(M y_0) = y_0, or equivalently

Sign( [ M_11  M_12 ; M_12^T  β ] [ Q_A ; +1/√N ] ) = [ Sign(y_0^A) ; +1 ],

and in particular Sign(M_12^T Q_A + β/√N) = +1. In view of equation (5) and Remark 1, we have freedom in choosing the diagonal elements of M (since Trace(M) contributes only a constant value to the quadratic form on the hypercube). Thus, by a suitable choice of β, we can ensure that M_12^T x_0^A < 0. It can easily be reasoned that the freedom in choosing the diagonal elements of M can be capitalized upon so that the above stability condition is always satisfied (by a proper choice of the diagonal elements of M).

Thus, we arrive at the desired contradiction to the fact that y_0 is the global optimum stable state (and ỹ is not). Hence the vector L is in the domain of attraction of the global optimum stable state, and with this choice of initial condition, when the Hopfield neural network is run in the serial mode, the global optimum stable state is reached.

Now let us consider the case where |C̄| ≥ |C| + 2. We generalize the above proof to this arbitrary case (using block matrices). Even in this case, we want to show that (I) + (II) > 0, where (I) = 2 μ_max (R − Q)^T x_0 and (II) = (R − x_0)^T M (R − x_0) − (Q − x_0)^T M (Q − x_0).

Let us first consider the term (I). Partition the vectors R, Q, x_0 into FOUR parts (as per the sets Ā, A, C̄, C considered in the above discussion), i.e.

x_0 = [x_0^A ; x_0^B ; x_0^C ; x_0^D],   Q = [Q_A ; Q_B ; Q_C ; Q_D],   R = [R_A ; R_B ; R_C ; R_D].

From the description of the sets Ā, A, C̄, C it is clear that Q_A = R_A and Q_B = R_B, so that R − Q is non-zero only on the last two parts, where its components are +2/√N and −2/√N respectively:

(R − Q) = [ 0 ... 0 | 0 ... 0 | +2/√N ... +2/√N | −2/√N ... −2/√N ]^T,

while the corresponding parts of x_0 are of the form

x_0 = [ +g_1 ... | −f_1 ... | +δ_1 ... +δ_p | −θ_1 ... −θ_q ]^T,

where the g_i (components of the vector g), f_i (components of f), δ_i (components of δ) and θ_i (components of θ) are all non-negative real numbers. Let the vector of all ones be denoted by ē. Hence we have that

(R − Q)^T x_0 = (2/√N) (δ + θ)^T ē,

and thus the term (I) becomes

(I) = 2 μ_max (R − Q)^T x_0 = (4 μ_max / √N)(δ^T + θ^T) ē > 0.

Using reasoning similar to the above, it is shown that (I) + (II) > 0. Q.E.D.

Remark 4
A similar theorem is proved for computing the global optimum anti-stable state (the global minimum of the associated quadratic form) of the Hopfield neural network. The associated algorithm is utilized to provide a polynomial time algorithm for the NP-complete problem of MAXIMUM CUT computation in an undirected graph. Thus P=NP.



Claim
Suppose u is an anti-stable state of the matrix M; then u is a stable state of the matrix −M (minus M).

Proof. Follows from the definitions of stable and anti-stable states. Q.E.D.

NOTE: Thus, as mentioned in the above Remark, to determine the global optimum anti-stable state of the matrix M (i.e. the global minimum value of the associated quadratic form), we compute the global optimum stable state of the matrix −M (minus M). Thus the algorithm discussed above (for computing the global optimum stable state) can be utilized for computing the MAXIMUM CUT in an undirected graph (an NP-complete problem). •
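Combining the Claim with the cut identity noted earlier, the intended use of the algorithm can be sketched as follows (an illustration, not code from the paper): a stable state of −W is an anti-stable state of W, and the resulting ±1 vector induces a cut of the graph with weight matrix W.

```python
import numpy as np

def serial_stable_state(M, v):
    """Serial-mode sign updates until a stable state of M is reached."""
    v = v.copy()
    while True:
        changed = False
        for k in range(len(v)):
            s = 1.0 if M[k] @ v >= 0 else -1.0
            if s != v[k]:
                v[k], changed = s, True
        if not changed:
            return v

def max_cut_candidate(W):
    """Anti-stable state of W = stable state of -W; the ±1 vector defines the cut."""
    eigvals, eigvecs = np.linalg.eigh(-W)
    x0 = eigvecs[:, np.argmax(eigvals)]
    v = serial_stable_state(-W, np.where(x0 >= 0, 1.0, -1.0))
    cut = 0.25 * np.sum(W * (1 - np.outer(v, v)))   # weight of the induced cut
    return v, cut

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    W = np.triu(rng.uniform(0, 1, (6, 6)), 1)
    W = W + W.T
    v, cut = max_cut_candidate(W)
    print("partition:", v, "cut weight:", cut)
```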

Calculation of Computational Complexity associated with the Above Algorithm:

The above algorithm involves the following computations:
(A) Computation of the eigenvector corresponding to the largest eigenvalue of the symmetric matrix (the connection matrix of the Hopfield neural network). A polynomial time algorithm for such a task is already available.
(B) Using the associated vector as the initial condition, running the Hopfield neural network in serial mode until the global optimum stable state is reached. It is possible to bound the number of computations for this task. The following well-known result enables bounding the number of iterations in the serial mode.

Note: Graph-theoretic error correcting codes are limited in the sense that the following upper bound on the minimum distance holds true:

d* ≤ 2|E| / |V|,

where d* is the minimum distance and |E|, |V| are the cardinalities of the edge and vertex sets of the associated graph, respectively.



Interesting Generalizations: Complex-Valued Associative Memories
In [6], the authors proposed a generalization of the Hopfield neural network, called the Complex Amari-Hopfield neural network. In this case the "complex hypercube" (specified in Section 2) constitutes the state space of the neural network, and the synaptic weight matrix of the network is a Hermitian symmetric matrix. Thus, in the case of such a complex Amari-Hopfield network, the "complex signum" function of the largest eigenvector (i.e. the eigenvector corresponding to the largest eigenvalue) is utilized as the initial condition to run the network in "serial mode". It is reasoned that the global optimum stable state is reached through such a procedure.

Note: In the case of the real (valued state) Hopfield Neural Network, the state of each neuron assumes only two values, i.e. {+1, −1}. Thus the state value at every neuron constitutes a solution of the algebraic equation

x^2 − 1 = 0, i.e. x is +1 or −1.

Neural network researchers proposed a novel complex-valued Hopfield neural network in which the state of every neuron satisfies the equation

x^N − 1 = 0, i.e. x is an N-th root of unity.

In such an associative memory, using phase quantization, the state at a neuron at the next time unit is computed. The author proposed complex-valued neural networks (mainly associative memories) in which the state of a neuron at the next time instant is computed based on phase as well as magnitude quantization. •

Hopfield Neural Network: Associated One-Step Associative Memory: Efficient Algorithm for Minimum Cut Computation

We now propose a method which reduces the computational complexity of the method of computing the global optimum stable state [5]. The author is actively pursuing the problem of minimum/maximum cut computation in undirected/directed graphs [4]. Efficient algorithms are developed.

Lemma 5
Given a linear block code, a neural network can be constructed in such a way that every local maximum of the energy function corresponds to a codeword and every codeword corresponds to a local maximum.

Proof. Refer to [1].


It has been shown in [1] that a graph-theoretic code is naturally associated with a Hopfield network (with the associated quadratic energy function). The local and global optima of the energy function are the codewords.

Goal: To compute the global optimum stable state (i.e. the global optimum of the energy function) using the associated graph-theoretic encoder. To achieve the goal, once again the largest real eigenvector is utilized as the basis for determining the information word that will be mapped to a global optimum stable state / codeword (using the associated graph-theoretic encoder).

Note: Unlike the results in [7], in Section 3 of this research paper it is not assumed that the weight matrix of the graph is non-negative.

Note: Using the results in [5], the results in this paper are generalized to multi-dimensional Hopfield neural networks.

Note: The results of this research paper were first documented in [11], [12].

4. CONCLUSION

In this research paper, the author proved that the local/global optima (maxima/minima) of a quadratic form (associated with a symmetric matrix) over the convex hull generated by the corners of the hypercube occur at the corners (under some conditions). A polynomial time algorithm for the computation of the global optimum stable state and the global optimum anti-stable state (an NP-hard problem) is designed and analyzed. Thus, it is proved that P=NP.

REFERENCES
[1] BRUCK J. and BLAUM M., Neural Networks, Error Correcting Codes and Polynomials over the Binary Cube, IEEE Transactions on Information Theory, Vol. 35, No. 5, September 1989.
[2] HOPFIELD J.J., Neural Networks and Physical Systems with Emergent Collective Computational Abilities, Proceedings of the National Academy of Sciences, USA, Vol. 79, 1982, pp. 2554-2558.
[3] RAMA MURTHY G., Optimal Signal Design for Magnetic and Optical Recording Channels, Bellcore Technical Memorandum, TM-NWT-018026, April 1st, 1991.



[4] RAMA MURTHY G., Efficient Algorithms for Computation of Minimum Cut and Maximum Cut in an Undirected/Directed Graph, manuscript in preparation.
[5] RAMA MURTHY G., Multi-dimensional Neural Networks: Unified Theory, research monograph, New Age International Publishers, New Delhi, 2007.
[6] RAMA MURTHY G. and NISCHAL B., Hopfield-Amari Neural Network: Minimization of Quadratic Forms, The 6th International Conference on Soft Computing and Intelligent Systems, Kobe Convention Center (Kobe Portopia Hotel), November 20-24, 2012, Kobe, Japan.
[7] RAMA MURTHY G., Minimum and Maximum Cut Computation: Optimization of Quadratic Forms: Neural Networks, submitted to ACM Transactions on Algorithms.
[8] SAGE A.P. and WHITE C.C., Optimum Systems Control, Prentice Hall Inc., 1977.
[9] RAMA MURTHY G., Optimization of Quadratic Forms: NP Hard Problems: Neural Networks, 2013 International Symposium on Computational and Business Intelligence (ISCBI 2013), August 24-26, 2013, New Delhi, India.
[10] BRUCK J. and SANZ J., A Study on Neural Networks, International Journal of Intelligent Systems, Vol. 3, 1988, pp. 59-75.
[11] RAMA MURTHY G., Towards a Resolution of P=NP Conjecture, Cornell University Archive.
[12] RAMA MURTHY G., Towards a Resolution of P=NP Conjecture, IIIT-H Archive.



Computer Systems Engineering 2014 Keywords: distance learning, equitable access, logical programming, infocommunications network, optimization of structure, space-time algorithms

Viacheslav BARKOV∗

METHOD OF PROVIDING FOR EQUITABLE DATA ACCESS

Technology for network interaction and information transfer has reached a high level, which promotes the active development of distance education worldwide. It is known that the main difficulty in training is examination: on the one hand it is required to provide honesty and impartiality, on the other hand equitable opportunities for all participants. This work is devoted to choosing a system architecture for simultaneously testing a large number of students. This involves the analysis of several types of software, such as Apache HTTP Server, Internet Information Services and Apache Tomcat, of technologies (PHP, ASP.NET, Java Servlets), and of different types of communication (GPRS, EDGE, UMTS, HSPA, HSPA+, LTE, WiFi, Ethernet, ADSL, etc.). In addition, the user's location is taken into account. It is required to find the best routes of information transfer to ensure equitable access to data for all users. A method of solution based on graph theory and logic programming is presented.

1. INTRODUCTION

The high level of information technology development has led to its spread into all fields of modern society, including education [2, 3, 6]. Information technology has been actively used in education for effective interaction between teachers and students. This has influenced the development of both full-time and distance learning. Nowadays Web technology is widely used. There are many Web server implementations today, such as Apache HTTP Server, Microsoft Internet Information Services, Apache Tomcat and so on. Web client implementations are also varied: Microsoft Internet Explorer, Mozilla Firefox, Google Chrome. There are many technologies and programming languages for the Web, such as ASP.NET, Java Servlets, PHP, Python, Ruby, etc. The active use of smartphones and tablets has promoted the appearance of new types of Internet connections, such as GPRS, EDGE, UMTS, HSPA, HSPA+, LTE, together with the rising popularity of WiFi networks. Therefore, there is a problem of optimal choice of the hardware and software architecture.

∗ Department of Mathematical Cybernetics and Information Technologies, Moscow Technical University of Communications and Informatics, Russian Federation

Information technology provides students with opportunities for getting electronic versions of tutorials and manuals, but the organization of the examination procedure is still a problem. On the one hand it is necessary to ensure honesty, on the other hand equitable opportunities for all students. Examinees may have different devices (laptop, tablet, smartphone) and use different technologies for Internet connection, so the participants have different data rates and some are at a disadvantage, as the examinees with a higher data rate will have more time to complete assignments. The accounting of completed and defended works is the second problem, because one teacher has several groups and must interact with many students. The third problem is the distribution of task variants between students of one or more groups: it is necessary that each student receives an individual task and performs it by himself. Another problem is checking the works of students. Usually the teacher has to collect the written work of students, take it home, and decipher the handwriting of each student. There is a problem of collecting, commenting on and version-controlling the works, even if they are prepared on computers. All these problems can be solved through the development and implementation of an information system, which will include an assignments control module, a task variants distribution module, a task accounting module (including uploading of performed work to the server), a module for communication between participants in the educational process, and an exam module. The algorithm for finding the optimal path uses the methods of logic programming, which is widely used in the design of intelligent systems [1].

2. PROBLEM FORMULATION

This paper analyzes a problem of distance learning, namely the organizational process of simultaneously testing a large number of students, and proposes a method of equitable data access. It is necessary to design a unified system architecture that combines the hardware and logical levels, given the current technical capabilities, and to optimize the timing and session management of the exam. Distance learning usually uses a client-server architecture. However, it is necessary to resolve the following questions before development begins:
i) How to ensure equitable opportunities for all exam participants, given that they use a variety of computing devices and methods of communication?
ii) What is sufficient for the data transfer: the use of one of the existing technologies (e.g. the IIS Web server with ASP.NET, or the Apache HTTP Server with PHP), or the development of an own solution based on TCP/IP?
The resolution of these issues is the purpose of this paper. To solve them, we need to create a mathematical model of the system architecture as a directed weighted graph, taking into account the logical and physical levels, to determine the optimal time path for each client, and to develop an algorithm that provides equitable access to the data for all participants.

3. A METHOD FOR EQUITABLE DATA ACCESS

As already mentioned, the examination procedure organization system has a client-server architecture. Relative to the server, a client can be local or remote. Local clients are in the same network as the server, while remote clients connect via the Internet. To connect to the network (LAN or WAN), clients use different types of connections, such as Ethernet, ADSL, WiFi, or one of the mobile technologies. The hardware structure of the system is shown in Figure 1.

Fig. 1. The hardware architecture of the system

In this architecture, clients are not in an equal position. The time of getting information from the server depends on two parameters. The first important parameter is the speed of data transmission from the client to the Internet service provider for remote clients, and to the server for local clients. The second important parameter is the response time of the server. The maximum data rates for the most common technologies are presented in Table 1.


Table 1. Maximum data rate of different technologies

Ethernet: 10 Mbps – 1 Gbps
ADSL: 8 Mbps
ADSL2+: 20 Mbps
WiFi (IEEE 802.11g): 54 Mbps
WiFi (IEEE 802.11n): 300 Mbps
LTE Advanced (4G): 300 Mbps
LTE (3.9G): 100 Mbps
HSPA+ (3.75G): 40 Mbps
HSPA (3.5G): 14.4 Mbps
UMTS (3G): 2 Mbps
EDGE (2.75G): 384 kbps
GPRS (2.5G): 100 kbps
CSD (2G): 9.6 kbps

Server response time depends on the distance between the client and the server, and on the technology used to connect. In order to simplify this relationship, we will not consider the distance. Table 2 shows the values of the response time for some of the wireless connections.

Table 2. The values of the response time for wireless connections

WiFi: 151 ms
2G: 629 ms
3G: 212 ms
HSPA+: 172 ms
LTE: 98 ms

Clients in our system are examinees. Each of them should get a task variant. In this case, all examinees must familiarize themselves with the task at about the same time. As shown above, getting the task simultaneously is difficult, since the data rates and response times differ. To solve this problem, we propose an algorithm of interaction between the client and the server. The interaction can be divided into the following stages:
1. Connecting of clients to the server (t0-t1)
2. Receiving of encrypted tasks by clients (t1-t2)
3. Sending of acknowledgment by clients (t1-t2)
4. Receiving of encryption keys by clients (t2-t3)
5. Sending of completed tasks by clients (t3-t4)
6. Receiving of 'time out' signal by clients (t4)
7. Sending of the currently executing task by clients (t4-t5)
The timing diagram is shown in Figure 2.

Fig. 2. The timing diagram
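The essential idea of stages 1-4 is that every client already holds the encrypted task and the server releases the decryption key only after the slowest client has acknowledged receipt. A simplified Python sketch of this server-side synchronization is given below (an illustration only; the port, payloads, client count and message framing are assumptions, not part of the paper).

```python
import socket
import threading

HOST, PORT = "0.0.0.0", 9000
EXPECTED_CLIENTS = 3
ENCRYPTED_TASK = b"<encrypted task bytes>"   # hypothetical payload
DECRYPTION_KEY = b"<key bytes>"              # hypothetical key

all_acknowledged = threading.Barrier(EXPECTED_CLIENTS)

def handle_client(conn):
    with conn:
        conn.sendall(ENCRYPTED_TASK)         # stage 2: task is sent immediately
        conn.recv(16)                        # stage 3: wait for this client's acknowledgment
        all_acknowledged.wait()              # block until the slowest client has acknowledged
        conn.sendall(DECRYPTION_KEY)         # stage 4: everyone receives the key together

def main():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen()
        for _ in range(EXPECTED_CLIENTS):
            conn, _addr = srv.accept()       # stage 1: clients connect
            threading.Thread(target=handle_client, args=(conn,)).start()

if __name__ == "__main__":
    main()
```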

4. SOFTWARE ARCHITECTURE OPTIMIZATION

The client-server architecture can be implemented in different ways. It can use the HTTP protocol to transmit data, or a proprietary protocol based on TCP/IP. Figure 3 shows how the architecture of the system can look when the ASP.NET, Java Servlets and PHP technologies are used. On the basis of the logical and physical structure, we construct a graph that includes the hardware and software architecture. The graph is shown in Figure 4 and its notation is given in Table 3. We will optimize the server and client software. The subgraph that includes only the software is shown in Figure 5. Edge weights represent time in milliseconds.



Fig. 3. The software architecture

Fig. 4. A graph G representing the architecture of the system, including hardware and logic levels



Table 3. The notation of graph G, represented in Figure 4

Vertices:
1 – Server Operating System
2 – Web browser
3 – Own client application
4 – Microsoft Internet Information Services
5 – Apache Tomcat
6 – Apache HTTP Server
7 – Own server application
8 – ASP.NET application
9 – Java Servlet
10 – PHP Script
11 – Client with data
12 – WiFi connection
13 – Ethernet connection
14 – Other connection type
15 – Client 1
16 – Client 2
17 – Client 3
18 – Client 4
19 – Client N

Edges:
a – Response time for client 1
b – Response time for client 2
c – Response time for client 3
d – Response time for client 4
e – Response time for client N
f – Data transmit time for WiFi connection
g – Data transmit time for Ethernet connection
h – Data transmit time for other connection type
i – Browser start time
j – Own client app start time
k – Microsoft IIS response time
l – Apache Tomcat response time
m – Apache HTTP server response time
n – Own server app response time
o – ASP.NET app response time
p – Java Servlet response time
q – PHP response time
r – Response data transmit time
s – Response data transmit time
t – Response data transmit time
u – Response data transmit time


Fig. 5. The subgraph which includes server and client software

We need to calculate the minimum time from the receipt of data by the server to the sending of the response to the client. To do this, we need to find the shortest path from vertex 1 to vertex 11 [5]. In order to find the shortest path in the graph, we use the depth-first search algorithm [4]. This algorithm is used in logic programming to prove a goal. Therefore, to solve the problem, we can represent the graph as a set of predicates and write predicates to find the shortest path. The representation of the graph in Visual Prolog is shown in Listing 1, and the rules for finding the shortest path are shown in Listing 2. The shortest path from vertex 1 to vertex 11 is 1-3-7-11. This means that the use of an own solution is the best option. In order to make sure the choice is correct, an experiment was conducted. A client application was created that sends multiple requests to the server and calculates the average server response time. A small server application was also developed that receives data over the TCP/IP protocol without using HTTP. The experiment results, presented in Table 4, show that the development of an own solution is optimal.



Listing 1. The representation of the graph in Prolog

arc(1, 2, 50).
arc(1, 3, 50).
arc(2, 4, 1.5).
arc(2, 5, 15.5).
arc(2, 6, 90).
arc(3, 7, 0.3).
arc(4, 8, 2.2).
arc(5, 9, 3).
arc(6, 10, 5.1).
arc(7, 11, 50).
arc(8, 11, 50).
arc(9, 11, 50).
arc(10, 11, 50).

Listing 2. The rules for finding the shortest path on the graph

% The graph is given by the arc/3 facts of Listing 1;
% traversed edges are recorded as From-To pairs.
path(EndPoint, EndPoint, CurrVertexList, CurrVertexList, CurrEdgeList, CurrEdgeList, CurrDist, CurrDist) :- !.

path(V, EndPoint, CurrVertexList, VertexList, CurrEdgeList, EdgeList, CurrDist, Dist) :-
    arc(V, NewV, D),
    not(member(NewV, CurrVertexList)),
    NewDist = CurrDist + D,
    path(NewV, EndPoint, [NewV|CurrVertexList], VertexList, [V-NewV|CurrEdgeList], EdgeList, NewDist, Dist).

shorter(StartPoint, EndPoint, Dist) :-
    path(StartPoint, EndPoint, [StartPoint], _, [], _, 0.0, D),
    D < Dist,
    !.

shortestPath(StartPoint, EndPoint, Vertexes, Edges, D) :-
    path(StartPoint, EndPoint, [StartPoint], V, [], E, 0.0, D),
    not(shorter(StartPoint, EndPoint, D)),
    !,
    reverse(V, [], Vertexes),
    reverse(E, [], Edges).

Table 4. Results of the experiment (server: data transmission time)

Apache HTTP Server + PHP: 195.1 ms
Microsoft IIS: 103.7 ms
Apache Tomcat: 118.5 ms
Own Solution: 100.3 ms
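The measurement client described above can be sketched as follows (illustrative Python; the host, port, payload and request count are placeholders): it sends a number of requests over a raw TCP connection and averages the round-trip time in milliseconds.

```python
import socket
import time

def average_response_time(host, port, payload=b"ping", repeats=100):
    """Send `repeats` requests over raw TCP and return the mean round-trip time in ms."""
    total = 0.0
    for _ in range(repeats):
        with socket.create_connection((host, port)) as sock:
            start = time.perf_counter()
            sock.sendall(payload)
            sock.recv(4096)                  # wait for the server's reply
            total += time.perf_counter() - start
    return 1000.0 * total / repeats

if __name__ == "__main__":
    print(average_response_time("127.0.0.1", 9000))
```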


5. AUTOMATIC DEFINING OF TEXT VOLUME

Participating in the testing, clients send data which must be checked. Therefore, the system should have a tool for automatic analysis of the quality of the text files. Before we put them in a database, we must make sure that they have a sufficient volume and carry some meaning. The easiest way to check the volume is to count the number of characters (bytes) in the text. However, this method cannot be applied for defining the text volume, because it is very easy to get around: for example, a client can add extra characters (spaces, tabs, etc.). Therefore, we propose to count the words in the text to check its volume. The source text is split into words, and the total number of words and the number of occurrences of each word are counted. The total number of words is used to determine the volume of the text. Information about the frequency of occurrence of each word can be used to detect texts that carry no meaning.
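A minimal sketch of the proposed check (an illustration; the tokenisation rule is an assumption): the text is split into words, the total word count serves as the volume measure, and per-word frequencies are kept for a later meaningfulness check.

```python
import re
from collections import Counter

def text_volume(text):
    """Return the total number of words and the frequency of each word."""
    words = re.findall(r"\w+", text.lower())
    return len(words), Counter(words)

if __name__ == "__main__":
    total, freq = text_volume("The server waits for the slowest client, the slowest one.")
    print(total)                 # total number of words
    print(freq.most_common(3))   # most frequent words
```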

6. CONCLUSIONS

In this paper, the problem of the simultaneous testing of a large number of students was analyzed. The system architecture and a method for equitable data access were designed. The method is based on the server waiting for the slowest client. A mathematical model of the system architecture was also created as a directed weighted graph, taking into account the logical and physical levels. It helped to find the optimal logical structure; the solution is to design own client and server applications. The problem can be made more complex by assuming that students participating in the exam are able to move along some routes. It is then necessary to take into account the dynamic change of the speed of Internet traffic, and an algorithm is required to ensure equality for moving clients (exam participants).

7. ACKNOWLEDGMENTS

I am grateful to Professor M. V. Yashina for the problem statement and discussion of the results.



REFERENCES
[1] BARKOV V.V., The Technology of Development of Shared Libraries in the Visual Prolog Logical Programming Language and Their Use in Other Programming Languages. The V Student Science Forum "Telecommunications and Infocommunication Technology – Realities, Opportunities, Perspectives", Moscow, 2014, pp. 64-67.
[2] BUGAEV A.S., BUSLAEV A.P., KOZLOV V.V. and YASHINA M.V., Distributed Problems of Monitoring and Modern Approaches to Traffic Modeling. 14th International IEEE Conference on Intelligent Transportation Systems (ITSC 2011), Washington, USA, 5-7.10.2011.
[3] BUSLAEV A.P., PROVOROV A.V. and YASHINA M.V., Infocommunication Systems of Saturated Traffic Control in Megalopolises. Proceedings of the 2013 International Conference on Internet Computing and Big Data (WORLDCOMP13), 2013, Las Vegas, USA.
[4] HOPCROFT J.E., MOTWANI R. and ULLMAN J.D., Introduction to Automata Theory, Languages and Computation. Addison-Wesley Publishing Company, 2001.
[5] SINGH A., Elements of Computation Theory. Springer, 2009.
[6] YASHINA M.V. and PROVOROV A.V., Verification of Infocommunication System Components for Modeling and Control of Saturated Traffic in Megalopolis. Proc. of the 8th Int. Conf. on Dependability and Complex Systems DepCoS-RELCOMEX, New Results in Dependability and Computer Systems, Springer, 2013, pp. 531-542.



Computer Systems Engineering 2014 Keywords: bilinear system, root locus, pole placement, equivalent poles

Lukasz GADEK ∗ Keith BURNHAM † Leszek KOSZALKA ‡

COMPUTATION OF POLE-PLACEMENT AND ROOT LOCUS METHOD FOR CLASS OF BILINEAR SYSTEMS

This paper introduces an adaptation of the classical linear control theory representation of zeros, poles and gain to a bilinear approach. The discrete domain and diagonal bilinearity are considered. The placement of poles is a complete description of the plant dynamics; hence it is a convenient form for the calculation of various properties, e.g. rise time, settling time, etc. This can be adapted to the bilinear structure, which is characterized by parameters that vary with the point of operation (OP). The perturbation between different OPs can be assessed graphically and used to determine regions of stability and instability or the shape of the response. The transition between OPs is represented similarly to a Linear Matrix Inequality (LMI), i.e. as a region of uncertainty.

1.

INTRODUCTION

The bilinear structure allows approximation of a non-linear (NL) plant in a decomposable form of a linear model with an NL term [1]. Initial industrial applications mostly utilize the property of a bent steady-state gain, which improves modelling of water flow systems and industrial furnaces [2], where the response saturates gradually for high OP. By extending the model with the bilinear term, the properties of the response become time-variant with respect to the current state (exceeding the simplification to a gain slide), as described in Section 2; therefore a robust stability and behaviour prediction method is required, e.g. pole placement [3]. To achieve satisfactory performance in designing a bilinear plant controller ([4] and
∗ Department of Systems and Computer Networks, Wroclaw University of Technology, Poland, e-mail: lukasz.gadek@pwr.edu.pl
† Control Theory and Application Centre, Coventry University, United Kingdom
‡ Department of Systems and Computer Networks, Wroclaw University of Technology, Poland



[3]), an efficient identification of the plant must be performed. The proposed method for predicting the varying properties of a bilinear plant is an extension that allows a more comprehensive understanding of the bilinear design. A similarity between the movement of the equivalent poles with respect to the OP and the root locus of a gain-feedback system is observed in Section 3. Correlating the classical root locus [5], pole placement with output feedback [6] and the equivalent (bilinear) pole locus will be the following stage of the research, as highlighted in the last section.

2. BILINEAR STRUCTURE

This section introduces the bilinear structure in terms of its mathematical representations (Sections 2.1 and 2.2) and an overview of its capabilities (Section 2.3). By extending a linear model with a bilinear term, the bilinear model is obtained. It can be illustrated with an example of an auxiliary function f(x, u), where x and u are time-dependent entities:

y = f(x, u), \quad x \in \mathbb{R}^n, \quad u \in \mathbb{R}^m
y_{lin} = \sum_{i=1}^{n} x_i + \sum_{j=1}^{m} u_j          (1)
y_{bil} = y_{lin} + \sum_{i=1}^{n} \sum_{j=1}^{m} x_i u_j.

In (1), y_{lin} is the output of a linear model which fulfils the superposition rule. The following equation contains the bilinear term \sum_{i=1}^{n} \sum_{j=1}^{m} x_i u_j attached to y_{lin}. A similar approach can be applied to the State Space form and other representations.

2.1. STATE SPACE

State Space is the most popular representation in the context of bilinear modelling and can be found in recent publications, e.g. [7]. The formulation of a bilinear MIMO term from [8] is as presented in (2):

F u \otimes x, \qquad F = [F_1\; F_2\; F_3\; \ldots\; F_m], \quad F_i \in \mathbb{R}^{n \times n}, \quad u \in \mathbb{R}^m, \quad x \in \mathbb{R}^n.          (2)

In this paper only SISO systems are considered, hence (2) can be simplified into the form of (3):

N u x, \qquad N \in \mathbb{R}^{n \times n}, \quad u \in \mathbb{R}, \quad x \in \mathbb{R}^n.          (3)

With (3) the SISO State Space is established in (4):

x_{k+1} = A x_k + B u_k + N u_k x_k
y_k = C x_k          (4)
y_k, u_k \in \mathbb{R}, \quad x_k, B, C \in \mathbb{R}^n, \quad A, N \in \mathbb{R}^{n \times n}.

The root locus method introduced in this paper is based on the Transfer Function (TF) equivalent (presented in Section 2.2). The transmittance from state space is explicit if a canonical form [9] is utilized. Assuming observability and a diagonal bilinear matrix, the coefficient matrices from (5) are used:

A = \begin{bmatrix} -a_1 & 1 & 0 & \cdots & 0 \\ -a_2 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -a_n & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad
N = \begin{bmatrix} n_1 & 0 & 0 & \cdots & 0 \\ n_2 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ n_n & 0 & 0 & \cdots & 0 \end{bmatrix},          (5)

B = \begin{bmatrix} b_1 & b_2 & \cdots & b_m & 0 & \cdots \end{bmatrix}^{T}, \qquad
C = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}.

Canonical form coefficients (5) are used interchangeably in the difference equation (DE). Eq. (6) presents the corresponding bilinear DE:

y_k = -\sum_{i=1}^{n} a_i y_{k-i} + \sum_{i=1}^{m} b_i u_{k-i} + \sum_{i=1}^{n} n_i u_{k-i} y_{k-i}.          (6)

Under the strong assumption of a constant input u, the bilinear form may be approximated with a linear equivalent where \tilde{a}_i = a_i - n_i u_{k-i}. However, such an approximation can be used in open-loop control only and is therefore impractical in the majority of industrial applications.
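For illustration, the short sketch below simulates a second-order instance of the bilinear difference equation (6) for a step input (Python; the coefficient values are arbitrary illustrative numbers and m = n is assumed for brevity, none of this is taken from the paper).

import numpy as np

# illustrative 2nd-order coefficients (not from the paper)
a = [-1.2, 0.35]         # a_1, a_2
b = [1.0, -0.2]          # b_1, b_2
n_coef = [0.015, 0.002]  # n_1, n_2

def simulate_bilinear(u, a, b, n_coef, steps):
    """Simulate y_k = -sum a_i y_{k-i} + sum b_i u_{k-i} + sum n_i u_{k-i} y_{k-i} (m = n)."""
    order = len(a)
    y = np.zeros(steps)
    for k in range(steps):
        for i in range(1, order + 1):
            if k - i >= 0:
                y[k] += -a[i-1] * y[k-i] + b[i-1] * u[k-i] + n_coef[i-1] * u[k-i] * y[k-i]
    return y

u = np.ones(100) * 0.5      # step input of amplitude 0.5
y = simulate_bilinear(u, a, b, n_coef, 100)
print(y[-1])                # approximate steady-state output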



2.2. EQUIVALENT TRANSFER FUNCTION

A Transfer Function (TF) can be obtained from the DE by introducing the time-shift operator z^i, where i is the number of samples shifted forward. In the case of the bilinear model, two types of the equivalent TF are computable:
- assuming a sluggish change of input (e.g. slow PID or manual control), followed by u in the denominator as in (7),
- for fast controllers or systems with high inertia, where y is assumed to be sufficiently constant to be approximated as a coefficient in the numerator.
Both forms are equivalents of the TF, as either the denominator or the numerator contains a time-varying variable, while in the classical approach these are static:

\frac{Y}{U} = \frac{b_1 z^{-1} + \cdots + b_m z^{-m}}{1 + a_1 z^{-1} + \cdots + a_n z^{-n} - n_1 z^{-1} u_{k-1} - \cdots - n_n z^{-n} u_{k-n}}.          (7)

The natural step to recapture a polynomial of static coefficients is to assume u or y as approximately constant, e.g. if the input change is negligible within the last m samples then U \approx u_{k-1} \approx \cdots \approx u_{k-m} (8):

\frac{Y}{U} = \frac{b_1 z^{-1} + \cdots + b_m z^{-m}}{1 + (a_1 - n_1 u_{k-1}) z^{-1} + \cdots + (a_n - n_n u_{k-n}) z^{-n}}.          (8)

Such reasoning is the root of establishing the two types mentioned above. In this paper the u-in-the-denominator approach is used, although following the same procedure for both types leads to identical conclusions. However, the latter approach requires a more significant mathematical effort.

2.3. PROPERTIES

From the equivalent TF (8), two main properties of the bilinear system can be highlighted. The time-shift operator z^a is a discrete substitute of the Laplace-domain term e^{a s t_s}, where a is the number of shifted samples and t_s is the sampling time of the discrete system.



Fig. 1. Steady-state gain of a bilinear system according to the sum of the n coefficients

Therefore, knowing that for t \to \infty the Laplace operator converges to zero, z converges to 1. The input/output gain of a discrete system can then be calculated as \sum b / \sum a. In the case of a bilinear system, the denominator coefficients are impacted by the input. If the input does not change for a certain number of samples after an excitation (e.g. a step change), then the bilinear system gain can be represented as in Fig. 1. The eigenvalues of the bilinear system are also impacted by the bilinearity. The Characteristic Equation (CE) obtained from the equivalent TF (8) has the form:

z^n + (a_1 - n_1 u_{k-1}) z^{n-1} + \cdots + (a_n - n_n u_{k-n}) = 0,          (9)

where the z satisfying (9) are the poles of the system. The correlation between u and the dislocation of the poles is indisputable; its prediction and estimation are presented in Section 3.

3. EQUIVALENT POLES LOCI

The aim of this research is to establish a simple set of rules to predict the perturbation of the dynamical behaviour, i.e. the movement of the equivalent poles, in a bilinear system. The derivation of the LMI-style description of the eigenvalues is performed in Section 3.1, while exemplary pole regions are plotted and commented on in Section 3.2.



The reasoning is performed on a diagonal bilinear SISO system as described in Section 2.1. For simplicity, the initial derivation utilizes the approximately constant U = u_{k-1 \cdots n}, which is valid if u_{k-i} \approx u_{k-j} \; \forall i, j \in [0, m].

3.1. ANALYTICAL APPROACH

In this section, the solution of (9) with respect to U is derived. For simplicity, a 2nd order model with n = 2 is utilized. The pole computation (z) of a linear system is given in (10). The bilinear equivalent pole location \tilde{z} is extended with the n coefficients and U as presented in (11):

z = \frac{-a_1 \pm \sqrt{a_1^2 - 4 a_2}}{2}          (10)

\tilde{z} = \frac{-a_1 + n_1 U \pm \sqrt{(a_1 - n_1 U)^2 - 4(a_2 - n_2 U)}}{2}.          (11)

The initial step is the calculation of the eigenvalues with respect to the imaginary plane, i.e. under the condition \Im(\tilde{z}) = 0, which is expanded as follows:

\Im(\tilde{z}) = 0 \iff 0 \le (a_1 - n_1 U)^2 - 4(a_2 - n_2 U)
\Im(\tilde{z}) = 0 \iff 0 \le n_1^2 U^2 - (2 n_1 a_1 - 4 n_2) U - 4 a_2 + a_1^2          (12)
\Im(\tilde{z}) = 0 \iff U \in (-\infty, U^{out}] \lor U \in [U^{in}, \infty).

Based on (12), the break-out U^{out} and break-in U^{in} inputs can be calculated by equating (12) to zero. The result is presented as:

U^{out}, U^{in} = \frac{2 n_1 a_1 - 4 n_2 \pm \sqrt{(-2 n_1 a_1 + 4 n_2)^2 - 4(-4 a_2 + a_1^2) n_1^2}}{2 n_1^2}.          (13)

Replacing U in (11) with (13) results in the break-out and break-in points in the complex plane:

\tilde{z}^{out}, \tilde{z}^{in} = \frac{-a_1 + n_1 \frac{2 n_1 a_1 - 4 n_2 \pm \sqrt{(-2 n_1 a_1 + 4 n_2)^2 - 4(-4 a_2 + a_1^2) n_1^2}}{2 n_1^2}}{2} = -\frac{n_2 \pm \sqrt{n_2^2 - n_1 n_2 a_1 + n_1^2 a_2}}{n_1}.          (14)

From (12) it can be seen that the expression under the square root of (11) is a quadratic function. Therefore, \Re(\tilde{z}) = \frac{|\tilde{z}^{out} - \tilde{z}^{in}|}{2} denotes the point on the real axis at which the extremum of the pole imaginary part is achieved. Hence U = -\frac{n_2}{n_1} is inserted into the square root in (11), resulting in (15):

\max|\Im(\tilde{z})| = \frac{\pm \sqrt{n_2^2 + n_1 n_2 a_1 - n_1^2 a_2}}{n_1}\, i.          (15)

Based on the extrema points of the equivalent pole loci from (14) and (15), it can be deduced that the locus trajectory is a circle with centre (x = -\frac{n_2}{n_1}, y = 0) (where x is the real and y the imaginary axis) and radius r = \Im\!\left(\frac{\sqrt{n_2^2 - n_1 n_2 a_1 + n_1^2 a_2}}{n_1}\right) if U \in [U^{out}, U^{in}]. The circle equation (16) for the 2nd order bilinear system is formed:

(\Re(\tilde{z}) - x)^2 + (\Im(\tilde{z}) - y)^2 = r^2
\left(\frac{-a_1 + n_1 U}{2} + \frac{n_2}{n_1}\right)^2 + \frac{(a_1 - n_1 U)^2 - 4(a_2 - n_2 U)}{4} = \frac{n_2^2 - n_1 n_2 a_1 + n_1^2 a_2}{n_1^2}.          (16)

The equality is satisfied for all U \in [U^{out}, U^{in}]. Exemplary results obtained with numerical simulation are presented in Section 3.2. The second order locus is always based on a circle in the range of U \in [U^{out}, U^{in}] under the assumption of constant U. If U changes within a known range, then the equivalent pole position must be amended according to the rate of change of the input (17). However, as in the LMI approach [10], it does not include internal instabilities, which may result in discrepancies between the prediction and the actual state:

u_1 = U
u_2 = U + \Delta.          (17)

Replacing U in (11) with (17) leads to an amendment in the following calculations. The resulting locus formula is a circle based on (16) amended into (18):

(\Re_\Delta(\tilde{z}) - x_\Delta)^2 + (\Im_\Delta(\tilde{z}) - y_\Delta)^2 = r_\Delta^2
\left(\frac{-a_1 + n_1 U}{2} + \frac{n_2}{n_1}\right)^2 + \frac{(a_1 - n_1 U)^2 - 4 a_2 + 4 n_2 (U + \Delta)}{4} = \frac{n_2^2 - n_1 n_2 a_1 + n_1^2 a_2 - n_1^2 n_2 \Delta}{n_1^2}.          (18)

The radius of the circle is impacted by \Delta, while the centre position is not. Moreover, it can be observed that the occurrence of \Delta has an impact exclusively on the vertical position of the pole; regardless of the rate of change, the position with respect to the real axis does not change. This allows the equivalent poles to be represented as a region in which (19) holds if the maximum \Delta is definable:

r - |n_2 \Delta| \le \left(\Re(\tilde{z}) + \frac{n_2}{n_1}\right)^2 + \Im_\Delta(\tilde{z})^2 \le r + |n_2 \Delta|,          (19)

where in (19) the properties r, \Re(\tilde{z}) and the centre of gravity are identical to (16).

Fig. 2. Equivalent poles loci with respect to increasing step input (arrow direction shows location for incrementing u)

3.2. SIMULATION

The method described in Section 3.1 has been validated numerically in the MATLAB environment. An exemplary result of the locus for a second order system with state-space coefficients A = \begin{bmatrix} 1.2 & 1 \\ -0.35 & 0 \end{bmatrix}, B = \begin{bmatrix} 1 & -0.2 \end{bmatrix}^T and N = \begin{bmatrix} 0.015 & 0 \\ 0.002 & 0 \end{bmatrix} is presented in Fig. 2. According to (16), the trace of the poles lies on a circle with radius r = \sqrt{0.002^2 + 0.002 \cdot 0.015 \cdot 1.2 + 0.015^2 \cdot 0.35}/0.015 = 0.72 and centre x = -0.002/0.015 = -0.13, y = 0.
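The circle parameters above can be cross-checked numerically, for instance with the short Python sketch below (the helper name equivalent_poles is ours, not from the paper); it sweeps U, solves the characteristic equation (9) for the second order case, and compares the resulting pole cloud with the centre and radius predicted by (14)-(16).

import numpy as np

# coefficients matching the example in Fig. 2 (A stores -a1 = 1.2 and -a2 = -0.35)
a1, a2 = -1.2, 0.35
n1, n2 = 0.015, 0.002

def equivalent_poles(U):
    """Roots of z^2 + (a1 - n1*U) z + (a2 - n2*U) = 0, i.e. Eq. (9) for n = 2."""
    return np.roots([1.0, a1 - n1 * U, a2 - n2 * U])

poles = np.array([equivalent_poles(U) for U in np.linspace(-150, 150, 601)])

centre = -n2 / n1                                          # Eq. (14)
radius = np.sqrt(n2**2 - n1 * n2 * a1 + n1**2 * a2) / n1   # Eq. (16)
print(centre, radius)                                      # approx. -0.13 and 0.73

# distance of the complex poles from the predicted centre should stay near the radius
complex_poles = poles[np.abs(poles.imag) > 1e-9]
print(np.abs(complex_poles - centre).max())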



Fig. 3. Exemplary Loci of higher order systems

For systems of higher order, the circular movement applies on a corresponding basis. However, depending on the number of poles and the bilinear coefficients, both the centres and the trajectories might be marginally perturbed, as presented in Figure 3, where:
- Plant I has two locus trajectories with r1 = 0.7842, r2 = 0.08407 and respective centres in (-0.159, 0) and (0.159, 0),
- the poles of Plant II starting in the complex plane move perpendicular to the real axis for descending u (a locus defined as r = ∞?),
- in Plant III the trajectory is perturbed around the break-away point.


Further investigation on the higher order systems is proposed in the last section.

Fig. 4. Equivalent poles loci as inequality region where ∆|uk−1..m | < 100

Fig. 4 represents a region of the equivalent poles against the locus of U approximated as a constant. The system is described by the following: A = \begin{bmatrix} 1.2 & 1 \\ -0.35 & 0 \end{bmatrix}, B = \begin{bmatrix} 1 & -0.2 \end{bmatrix}^T, N = \begin{bmatrix} 0.010 & 0 \\ 0.002 & 0 \end{bmatrix} and |\Delta| < 100. Hence, based on (18), the locus radius perturbation is r \in [0.65, 0.91] around the gravity centre (x = -0.2, y = 0). The method derived in Section 3.1 has been validated for second order systems in the numerical tests presented in this section (Fig. 2 and 4). A resemblance between the equivalent poles and the gain-feedback root loci can be observed due to the similar rotational behaviour and the convergence of poles at infinity. It applies when the closed-loop gain is mapped as U. The question to be asked is whether this similarity is fully convertible.



4. CONCLUSION

The method derived for a second order bilinear system in Section 3.1 has been validated and presented graphically in Section 3.2. Moreover, a similar pattern can be observed for third and higher order systems. The method is visually similar to the root locus of closed-loop gain control [5]. Hence, parts of the method may be applied interchangeably between bilinear and linear structures. Due to the more accurate prediction of the dynamics, a calculation of the stable range of operation of a bilinear plant can be performed. It may be done utilising a corresponding method for the gain-feedback loop locus. The critical (boundary) operating region of the bilinear model will be defined as a curve with respect to U and \Delta. Improved predictability is followed by enhanced identification capabilities: when observing the trace of both the gain and the dynamical properties (e.g. rise time and overshoot), an engineer would have an additional resource for assessing the plant structure type and for parameter estimation.

The intuitive continuation of the research is to expand the analytical reasoning to higher order systems. Generalisation of the obtained algorithm and comparison with the classical root locus is essential due to the large amount of correspondence between the loci. The potential of integrating the output feedback stabilization method [6] with the bilinear model stability problem should be investigated. Prediction of the pole location or the region of displacement may be useful for developing model based controllers. In contrast to the existing methods:
- frequent update of the linearised model around the current OP,
- linearisation of the actual plant with a compensator [4],
this approach has the potential of detecting internal instabilities and therefore ensuring robustness.

REFERENCES
[1] TAN N. and ATHERTON D.P., Stability margin computation for nonlinear systems: A parametric approach. 15th IFAC Triennial World Congress, Spain, 2002.
[2] MARTINEAU S., BURNHAM K.J., HAAS O.C.L., ANDREWS G., HEELEY A., Four-term bilinear PID controller applied to an industrial furnace. Control Engineering Practice, Vol. 12, pp. 457-464, 2004.
[3] TAYLOR C.J., CHOTAI A., BURNHAM K.J., Controllable forms for stabilising pole assignment design of generalised bilinear systems. Electronics Letters, Vol. 47, Issue 7, pp. 437-439, 2011.



[4] GADEK L., KOSZALKA L., BURNHAM K.J., The Compensation of N-th Order Bilinearity Applied with Model Based Controller. Progress in Systems Engineering, Advances in Intelligent Systems and Computing, Vol. 330, pp. 49-54, 2015.
[5] EYDGAHI A.M., Complementary Root Locus Revisited. IEEE Transactions on Education, Vol. 44, Issue 2, 2001.
[6] KONIGORSKI U., Pole placement by parametric output feedback. Systems & Control Letters, Vol. 61, Issue 2, pp. 292-297, 2012.
[7] LEE C.H., JUANG J.N., Nonlinear system identification - a continuous-time bilinear state space approach. The Journal of the Astronautical Sciences, Vol. 59, Issue 1-2, pp. 398-420, 2012.
[8] VERDULT V., VERHAEGEN M., Bilinear state space systems for nonlinear dynamical modelling. Theory in Biosciences, Vol. 119, Issue 1, pp. 1-9, 2000.
[9] REUTER H., State space identification of bilinear canonical forms. Control '94, International Conference on Control, 1994.
[10] CHILALI M., GAHINET P., APKARIAN P., Robust Pole Placement in LMI Regions. IEEE Transactions on Automatic Control, Vol. 44, Issue 12, 1999.



Computer Systems Engineering 2014 Keywords: optimization, elastic optical networks, routing, modulation and spectrum allocation

Róża GOŚCIEŃ*

A SIMPLE COST ANALYSIS IN ELASTIC OPTICAL NETWORKS

Optical networks are an indispensable part of transport networks due to their ability to carry high-bandwidth aggregated data traffic efficiently over large distances. Nowadays, optical networks rely on the Wavelength Division Multiplexing (WDM) technology. However, this technology is not efficient enough to support the still increasing traffic requests in future networks. Concurrently, a very promising approach for future optical networks is the idea of Elastic Optical Networks (EONs). An EON, in contrast to a conventional WDM network, operates within a flexible frequency grid and makes use of advanced modulation and transmission techniques. In this paper, an attempt at a cost analysis of an EON is presented. The considered cost includes the cost of network devices and a 1-year fiber leasing cost.

1. INTRODUCTION

Every year, we observe that more and more people are interested in using network resources. Moreover, people use them with many different devices - private computers, business computers, smartphones, tablets, etc. [1]. Additionally, network users are interested in bandwidth-intensive services that very often should be provided in real time: video on demand, internet television, content delivery networks, cloud computing, etc. [1, 2]. As a repercussion of these trends, the total network traffic increases rapidly. According to [1], the total IP network traffic has increased more than fivefold during the last five years and will increase threefold during the next five years. The increasing network traffic is a big challenge for telecommunication operators, which have to support this traffic in their transport networks. The transport networks are generally based on optical technologies, since only fiber is capable of supporting the transmission of very high bit-rate data over large distances. Nowadays, the WDM (Wavelength Division Multiplexing) technology is broadly

* Department of Systems and Computer Networks, Wroclaw University of Technology, Poland, e-mail: roza.goscien@pwr.edu.pl



deployed. However, this technology suffers from low spectrum utilization and is suspected to be not efficient enough for future networks. Therefore, some innovations and improvements in the field of optical communication are required. A very promising approach is based on the idea of Elastic Optical Networks (EONs). The elastic optical network is a solution that was initially proposed in [3]. Here, the entire available frequency spectrum is divided into narrow frequency segments, called slices. According to the ITU-T recommendation, the width of a particular slice is 6.25 GHz [4]. By grouping an even number of frequency slices, channels of different widths can be created and used to transmit data. Therefore, the EON channel size can be suited to the given traffic demand, in contrast to the fixed 50 GHz WDM channels. Moreover, EONs apply distance-adaptive modulation formats, which allow for better spectrum usage according to the transmission path characteristics [5].

Along with the idea of EONs, a new optimization problem arises, called Routing, Modulation Level and Spectrum Allocation (RMLS). The aim of the RMLS problem is to assign to each traffic demand a lightpath (a routing path that connects the demand end nodes, a modulation format, and a channel of size determined by the selected modulation format, path length and demand volume) [5]. The RMLS problem was proved to be NP-hard [5]. The optimization problems for EONs are very widely discussed in the literature, but mostly in the case of the optimization of the spectrum usage. In this paper, an EON is analysed with respect to its cost. The cost of an operational EON is predicted for a period of six years, 2014-2019, under a realistic traffic model prepared based on the Cisco forecast. The EON cost is also compared with the cost of a WDM-based network. Moreover, the impact of protection requirements on the network cost is taken into consideration.

The rest of the paper is organized as follows. Section 2 contains the description of the optimization problem and the EON cost. Section 4 presents the methods used to solve the considered optimization problem. The investigation description and results are included in Section 5. The last section, Section 6, concludes the paper.

2. PROBLEM FORMULATION

In this section, a description of the optimization problem is presented. Because of the limited space of this paper, the ILP formulation of the problem is not included. For more details about ILP modeling, the reader is referred to [5].

2.1 ROUTING, MODULATION LEVEL AND SPECTRUM ALLOCATION

In this paper, the routing, modulation level and spectrum allocation (RMSA) problem in EONs is considered. It is assumed that a set of static unicast traffic demands is given in advance. Each traffic demand is characterized by a source node, a destination node and a volume expressed in Gbps. The link-path modelling approach is applied [6] and 10 different candidate paths are precalculated between each pair of


network nodes. Moreover, a set of available modulation formats is given: BPSK, QPSK, and x-QAM, where x belongs to {8, 16, 32, 64}. The aim of the optimization problem is to select for each traffic demand a lightpath, i.e., a modulation format, a routing path that connects the demand source and destination nodes, and a channel allocated on this path. A channel is a subset of adjacent frequency slices that can be used to transmit data. The size of a channel is a function of the demand volume, the path length and the applied modulation format, similarly as in [5]. There are three main constraints in the RMSA problem:
• Slice capacity - a particular slice on a particular link can be assigned to at most one lightpath, realizing one demand,
• Spectrum continuity - the channel assigned on the lightpath for a particular demand has to be the same on each network link that belongs to this lightpath,
• Spectrum contiguity - a channel has to include adjacent slices.
In the problem, an additional constraint is assumed, which limits the number of available frequency slices on each network link. There are many different feasible solutions for each solvable instance of the RMSA problem. However, the solutions differ in the values of some performance metrics. In this paper, the network cost is considered as the optimization criterion.

2.2 COST IN ELASTIC OPTICAL NETWORKS

In order to realize a traffic demand, three EON elements are necessary:
• Transponders - devices that are able to transmit and receive the optical signal,
• Regenerators - devices that can regenerate (reinforce) the optical signal if the transmission path length is longer than the transmission range of the applied modulation format,
• Spectrum resources - frequency slices that create a channel.
In this paper, the network cost is defined as the cost of all transponders and regenerators required to support the traffic demands plus a 1-year fiber leasing cost, which depends on the spectrum width necessary to fulfil the traffic requirements. All costs are presented in Euro in current prices. The cost values are used similarly as in [7] and [8]. Since network devices for EONs are not available on the market nowadays, the costs are expressed relative to the cost of a WDM 10 Gbps transponder, which is estimated to be equal to 2k EUR [7]. In Table 1, the relative cost of the devices is presented.


Table 1. Relative cost of transponders/regenerators.

Device                                          Relative cost
10 Gbps WDM transponder/regenerator             1
40 Gbps WDM/O-OFDM transponder/regenerator      2.5
100 Gbps WDM/O-OFDM transponder/regenerator     3.75
400 Gbps O-OFDM transponder/regenerator         5.5

The fiber leasing cost is assumed to be equal to 2k EUR/km per a 20-year period (as in the methodology proposed in [7]). Accordingly, the relative cost of a "dark" 6.25 GHz slice is approximately equal to 7.813×10^-5 per km per year.
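As a rough worked check of this figure (assuming, purely for illustration, that a leased fiber offers about 4 THz of usable spectrum, i.e. roughly 640 slices of 6.25 GHz; this assumption is ours and is not stated above): the relative fiber cost of 1 per km per 20 years corresponds to 1/20 = 0.05 per km per year for the whole fiber, and 0.05/640 ≈ 7.8×10^-5 per slice per km per year, which matches the quoted value.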

4. ALGORITHMS

In this section, two approaches to the considered optimization problem are described. The first one is based on the well-known shortest path method, the second one is a dedicated heuristic method.

4.1 SHORTEST PATH METHOD

The problem formulation is based on the link-path approach [6]. For each pair of network nodes, 10 different candidate routing paths are precalculated. The length of each predefined path can be easily calculated in the pre-processing stage. Thus, for a particular demand (characterized by its volume, source node and destination node), the cost of its realization on each candidate path under a particular modulation format can be easily obtained. Therefore, the problem can be solved in an optimal way by selecting for each traffic demand the combination of a candidate path and a modulation format that is characterized by the minimum cost. However, this method cannot be used to solve every problem instance, since it does not control the spectrum usage. When the spectrum resources on each network link are limited, many solutions obtained in this way can be infeasible.

4.2 DEDICATED HEURISTIC METHOD

In order to find a problem solution when the spectrum resources are limited, a dedicated heuristic method is applied. The method is called Enhanced Adaptive Frequency Assignment (E-AFA) and was initially proposed in [9] for the RMLS problem under the DPP scheme [12]. However, the method can be easily modified to the no-protection scenario. Below, the E-AFA method for the RMLS problem with no protection is described.


The E-AFA process starts with calculating a number of different metrics. Let d = 1,2,…,D denote a traffic demand. First, a special metric is calculated for each traffic demand d, according to equation (1):

m_d = \frac{n_d\, p_d}{2|E|}.          (1)

In formula (1), n_d is the minimum number of slices required to realize demand d on any of its candidate paths, using any of the available modulation formats. Similarly, p_d is the length (expressed as a number of links) of the shortest candidate path for this demand. Finally, |E| denotes the number of network links. Next, the traffic demands are divided into subsets B_m that include demands with the same value of the metric m_d equal to m. The E-AFA method analyses the sets B_m one by one in decreasing order of m. For a particular value of m, the method calls a function MinFS(d : m_d = m) in order to find a demand d whose allocation currently provides the minimum value of the cost function. When the demand to be allocated is selected, a special function MinFSPath(d) is used to choose for this demand a lightpath that currently provides the allocation with the minimum value of the objective. In order to find this lightpath, the function analyses all candidate paths and available modulation formats. The E-AFA process is presented in Algorithm 1. For more details of this method, the reader is referred to [9].

Algorithm 1 (E-AFA):
1:  B ← {d : d = 1,2,…,D}
2:  WHILE B ≠ ∅
3:    m ← max{m_d : d ∈ B}, B_m ← {d : d ∈ B, m_d = m}
4:    WHILE B_m ≠ ∅
5:      FOR EACH demand d ∈ B_m
6:        ɸ_d ← MinFS(d)
7:      END FOR
8:      d* ← arg min (ɸ_d) if more than one demand yields the minimum value of ɸ_d
9:      Allocate demand d* to a lightpath g ← MinFSPath(d*)
10:     B ← B \ {d*}, B_m ← B_m \ {d*}
11:   END WHILE
12: END WHILE
13: TERMINATE
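A compact Python sketch of this greedy skeleton is given below. It only illustrates the control flow of Algorithm 1 under our own simplifying assumptions: min_fs and min_fs_path stand in for the MinFS and MinFSPath functions described above, and ties in step 8 are broken by taking the first demand found.

from collections import defaultdict

def e_afa(demands, metric, min_fs, min_fs_path, allocate):
    """Greedy E-AFA-style loop.
    demands:          iterable of demand identifiers
    metric:           dict d -> m_d (Eq. (1))
    min_fs(d):        cost of the cheapest feasible allocation of d (phi_d)
    min_fs_path(d):   lightpath realizing that allocation
    allocate(d, lp):  commits the allocation
    """
    groups = defaultdict(set)
    for d in demands:
        groups[metric[d]].add(d)

    for m in sorted(groups, reverse=True):      # decreasing value of m
        B_m = groups[m]
        while B_m:
            # demand whose allocation currently yields the minimum objective
            phi = {d: min_fs(d) for d in B_m}
            d_star = min(phi, key=phi.get)      # ties broken arbitrarily
            allocate(d_star, min_fs_path(d_star))
            B_m.remove(d_star)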



5. INVESTIGATION

The goal of the investigation was threefold: firstly, to evaluate the potential cost of an EON in the operational state; secondly, to compare the EON cost with the cost of the currently deployed optical technology (WDM); thirdly, to find out how additional network requirements (in the case study, survivability requirements) affect the EON cost.

5.1 SIMULATION SETUP

In the investigation, an elastic optical network that can provide a variety of currently desirable services, including data-center based services (such as Content Delivery Networking, cloud computing, video streaming, etc. [5]), is considered. The numerical experiments are performed using the Euro28 network topology (28 nodes, 82 links), presented in Fig. 1. In the figure, some specific nodes are marked: the nodes that can host data centers (marked with a server icon) and the nodes that connect the European network with other continents (marked with arrows). The location of the special nodes was chosen based on the real data presented at http://www.datacentermap.com/.

In the investigation, an elastic optical network that can provide a variety of currently desirable services, including data-center based services (like Content Delivery Networking, cloud computing, video streaming etc. [5]), is considered. The numerical experiments are performed using Euro28 network topology (28 nodes, 82 links), presented in Fig. 1. In the figure, some specific nodes are assigned. The nodes that can host data centers (assigned with server icon) and nodes that connect European network with other continents (assigned with arrows). The location of special nodes was made based on the real data presented on the http://www.datacentermap.com/.

Fig. 1. Euro28 network topology



The traffic model was created for a period of six subsequent years, 2014-2019, and includes four types of traffic that create the overall network traffic:
• city-city (typical point to point transmission),
• city-data center (traffic related to data centers, which can support such popular services as Content Delivery Networks, IP TV, video streaming),
• data center-data center (point to point traffic between nodes that can host data centers),
• international traffic (traffic to/from nodes that connect the European network with other continents, calculated as a ratio of the total network traffic).
The traffic of each type was expressed as a set of unicast demands. It is assumed that the total traffic volume in 2014 is equal to 15 Tbps and grows in subsequent years according to the Compound Annual Growth Rate (CAGR) for each traffic type (a numerical illustration is given in the sketch below). An additional assumption says that in each analyzed year 20% of the overall network traffic is provided by the international traffic. The ratio of each traffic type in the overall traffic and the CAGR for each traffic type were obtained based on the Cisco forecasts presented in the [1] and [2] reports. In particular, the CAGR of city-city traffic is equal to 18%, city-data center 31%, and data center-data center 32%. The traffic demands were generated similarly as in [10], according to the multivariable gravity model with real data related to the population of the region that each city (node) covers, the geographical distance between network nodes, and the economy level expressed by GDP (Gross Domestic Product).
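A minimal sketch of how such a model scales the yearly volumes is given below (Python; the 2014 per-type split used here is an assumed illustrative breakdown - only the 15 Tbps total, the 20% international share and the CAGR values above are taken from the text).

# assumed illustrative 2014 volumes (Tbps); only their sum (15 Tbps) and the
# 20% international share follow the paper, the remaining split is invented
volume_2014 = {"city-city": 7.0, "city-dc": 3.0, "dc-dc": 2.0, "international": 3.0}
cagr = {"city-city": 0.18, "city-dc": 0.31, "dc-dc": 0.32}

def traffic_in_year(year):
    """Grow each non-international type by its CAGR, then top up the
    international traffic to 20% of the overall volume."""
    grown = {t: volume_2014[t] * (1 + cagr[t]) ** (year - 2014) for t in cagr}
    domestic = sum(grown.values())
    grown["international"] = 0.25 * domestic   # 20% of total <=> 25% of the rest
    return grown

for y in range(2014, 2020):
    print(y, round(sum(traffic_in_year(y).values()), 1), "Tbps")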

5.2 EON ASSUMPTIONS

The investigation assumptions are based on an EON with BV-Ts implementing the PDM-OFDM technology with multiple modulation formats selected adaptively among BPSK, QPSK, and x-QAM, where x belongs to {8, 16, 32, 64}. Here, the spectral efficiency is equal to 1, 2, ..., 6 [b/s/Hz], respectively, for these modulation formats, and PDM stands for Polarization Division Multiplexing, which allows the spectral efficiency to be doubled. The EON operates within a flexible ITU-T grid of 6.25 GHz granularity [4]. Three types of BV-Ts are used, each characterized by a different capacity limit of, respectively, 40 Gbps, 100 Gbps, and 400 Gbps. The BV-Ts allow for bit-rate adaptability with 10 Gbps granularity. The transmission model presented in [11] is applied, which estimates the transmission reach of an optical signal as a function of the selected modulation level and the transported bit-rate. A guard band with a width of 12.5 GHz is applied between neighbouring connections. In all scenarios, it is assumed that the transmission reach is extended by means of regenerators, which are applied whenever necessary.



5.3 OPTIMIZATION RESULTS

In order to obtain the results for each of the predefined traffic scenarios, the E-AFA method was applied in the investigation. First, in Fig. 2 the calculated cost of an elastic optical network is presented for the period 2014-2019. The predicted network cost in 2014 is about 4.5 million EUR and grows in subsequent years.

Fig. 2. Cost of an EON predicted for years 2014-2019.

In order to make the obtained values more practical, the cost was also calculated for a WDM-based network. The comparison between the network cost for the WDM and EON technologies is presented in Fig. 3. It is easily noticeable that the EON is much cheaper than the network operating with the WDM technology. The cost savings provided by the EON compared to WDM are presented in Table 2. The values express how much money is saved when using an EON instead of WDM. As we can see, the EON provides up to 50% of cost savings compared to the WDM technology.



Fig. 3. Network cost for EON and WDM predicted for years 2014-2019.

Table 2. Cost savings provided by an EON compared to WDM.

               2014      2015      2016      2017      2018      2019
Cost savings   50.25%    49.23%    47.02%    44.91%    43.01%    41.84%

Next, the network cost was calculated for a survivable EON protected by the dedicated path protection (DPP) scheme [12]. In a nutshell, the DPP approach assumes that for a particular demand not one lightpath is established, but a pair of two link-disjoint lightpaths. One of these lightpaths (called the primary) is used in the normal network state, whereas the second one (called the backup) is used only in case of a link failure on the primary lightpath. However, resources for both lightpaths have to be provided in advance. The same channel assignment policy is applied and thus both channels (related to the primary and backup lightpaths) have to have the same central frequency [12]. Since the primary and backup lightpaths can have different lengths and can use different modulation formats, the resource requirements can differ between these paths. In Fig. 4, the network cost is presented for an EON under the DPP scheme. The figure presents the total network cost, the cost of realizing the traffic demands on the primary lightpaths and on the backup lightpaths.



Fig. 4. Cost of an EON protected by the DPP scheme predicted for years 2014-2019.

As is clearly visible in Fig. 4, the realization of the backup flows is more expensive than the realization of the corresponding primary flows. In Table 3, the additional cost of the backup paths is presented. The values indicate how much more the flow realization on the backup paths costs compared to its realization on the primary paths. Notice that the backup paths cost up to 30% more. This issue was investigated in [12]. In a nutshell, it follows mostly from the fact that backup paths are generally longer than the corresponding primary paths (in terms of both the total length in kilometres and the number of included links). Therefore, more spectrum resources are necessary and they are allocated on more network links. Moreover, since the transmission distance is much longer, some additional regenerators may be required.

Table 3. Additional resources consumed by backup paths compared to primary paths.

                                         2014      2015      2016      2017      2018      2019
Additional resources for backup paths    25.43%    25.76%    25.74%    26.66%    27.84%    28.96%

6. FINAL REMARKS

In this paper, the network cost of an elastic optical network was considered. The cost was defined as the cost of the network devices (transponders and regenerators) and the 1-year fiber leasing cost required to realize a set of traffic demands. The numerical experiments were carried out with traffic models calculated for the years 2014-2019,


based on the Cisco traffic forecast. The predicted EON cost was compared with the cost of a WDM-based network. According to the results, the EON provides up to 50% cost savings compared to a WDM-based network supporting the same traffic. Moreover, the findings of the investigation showed that additional network requirements (like, e.g., survivability requirements provided by a DPP scheme) can significantly increase the network cost. The plans for future works include consideration of the RMLS problem with different optimization criteria (e.g., power consumption, spectrum usage, regenerator requirements) and different traffic demands (anycast, multicast).

REFERENCES
[1] Cisco Visual Network Index: Forecast and Methodology, 2013-2018, Cisco Visual Network Index, June 2014.
[2] Cisco Global Cloud Index: Forecast and Methodology, 2012-2017, Cisco Visual Network Index, May 2013.
[3] JINNO M., et al., Spectrum-efficient and scalable elastic optical path network: architecture, benefits, and enabling technologies, IEEE Communications Magazine, vol. 47, no. 11, 2009, pp. 66-73.
[4] ITU-T Recommendation G.694.1 (ed. 2.0), Spectral grids for WDM applications: DWDM frequency grid, Feb. 2012.
[5] GOŚCIEŃ R., et al., Distance-adaptive transmission in cloud-ready elastic optical networks, Journal of Optical Communications and Networking, in press.
[6] PIÓRO M. and MEDHI D., Routing, Flow, and Capacity Design in Communication and Computer Networks, Morgan Kaufmann Publishers, 2004.
[7] PALKOPOULOU E., et al., Quantifying spectrum, cost, and energy efficiency in fixed-grid and flex-grid networks, IEEE/OSA Journal of Optical Communications and Networking, vol. 4, no. 11, pp. B42-B51, 2012.
[8] KLEKAMP A., et al., Efficiency of adaptive and mixed-line-rate IP over DWDM networks regarding CAPEX and power consumption, IEEE/OSA Journal of Optical Communications and Networking, vol. 4, no. 11, pp. B11-B16, 2012.
[9] KMIECIK W., et al., Two-layer optimization of survivable overlay multicasting in elastic optical networks, Optical Switching and Networking, 2014, vol. 14, pp. 164-178.
[10] KLINKOWSKI M. and WALKOWIAK K., On the advantages of elastic optical networks for provisioning of cloud computing traffic, IEEE Network, vol. 27, no. 6, 2013, pp. 44-51.
[11] POLITI C.T., et al., Dynamic operation of flexi-grid OFDM-based networks, Optical Fiber Communication Conference and Exposition (OFC/NFOEC) and the National Fiber Optic Engineers Conference, pp. 1-3, 4-8 March 2012.
[12] GOŚCIEŃ R., et al., Joint anycast and unicast routing and spectrum allocation with dedicated path protection in elastic optical networks, IEEE Design of Reliable Communication Networks (DRCN) 2014, pp. 1-8.



Computer Systems Engineering 2014 Keywords: Activated Sludge, Membrane bioreactor, EPS, SMP

Tomasz JANUS∗

INTEGRATED MATHEMATICAL MODEL OF A MBR REACTOR FOR WASTEWATER TREATMENT

This paper briefly describes an integrated mathematical model of an immersed membrane bioreactor (MBR) with hollow fibre outside-in membranes. The integrated model is composed of three interconnected submodels: the activated sludge model (ASM) extended with soluble and bound biopolymer kinetics, the membrane fouling model, and the interface model relating cake back-transport rate to air-scour intensity and specific cake resistance to concentration of extracellular polymeric substances (EPS). The integrated model is simulated on the plant layout used in the BSM-MBR benchmark model of Maere et al. [17] and predicts similar effluent quality to BSM-MBR whilst additionally enabling predictions of the transmembrane pressure (TMP) and of the effects of various operating conditions on membrane fouling.

1. INTRODUCTION MBR-based solutions are becoming more wide-spread in municipal as well as industrial wastewater treatment. What the technology is missing at the moment however is general-purpose mathematical models. Such models would allow to carry out similar simulation-based studies on MBR systems to what is already possible for other wastewater treatment processes such as, e.g. conventional activated sludge plants (CASPs). Until now only a handful of such MBR models have been developed and described in scientific literature although recent years have seen some important developments in the area of MBR modelling. Despite of these developments most models provide rather simplistic description of, either activated sludge kinetics, membrane fouling, or both. They are also unable to represent the main synergistic interactions that occur between various parts of a MBR such as, e.g. links between EPS and SMP kinetics and membrane fouling. Some of such models are briefly described below. ∗

Water Software Systems, tjanus@dmu.ac.uk

De Montfort University,

65

Leicester,

United Kingdom,

e-mail:


Zarragoitia-González et al. [25] linked the activated sludge model of Lu et al. [16] with the comprehensive membrane fouling model of Li and Wang [14]. Di Bella et al. [3] linked an ASM1-based SMP kinetic model with membrane fouling equations based on the model of Lee et al. [13]. Unfortunately, in their paper, the links between SMP and irreversible fouling have not been modelled. Additionally, in both publications, the Petersen matrices of the biological models do not pass a mass-balance check. Mannina et al. [18] improved the model of Di Bella et al. [3] by swapping the non-mass- and charge-conserving model of Lu et al. [16] for a modified ASM1 model implementing the SMP kinetics proposed by Jiang et al. [8]. The filtration model was modified to include more fouling mechanisms whilst keeping the sectional model approach of Lee et al. [13] and the deep bed filtration equations introduced originally in Di Bella et al. [3]. Although their model was found to be in good agreement with the measurements collected on a MBR pilot plant, it suffers from the same weakness as the model of Di Bella et al. [3], i.e. irreversible fouling is not related to the SMP concentration in the bulk liquid. Most recently, Suh et al. [22] developed an integrated MBR model based on the benchmark simulation layout of Maere et al. [17], the combined EPS and SMP production ASM3-based model (CES-ASM3) of Janus and Ulanicki [10] and the membrane fouling model of Li and Wang [14]. Their model again suffers from the same limitation as the previously outlined integrated models due to the fact that irreversible fouling has not been linked to SMP.

The integrated MBR model developed in this study differs from the models mentioned in the previous paragraph in several aspects. First, the biological model predicts the concentrations of both soluble (SMP) as well as bound (EPS) polymers whilst maintaining the structure of the ASM1 model, hence allowing easy comparison of the results with the BSM1 and BSM-MBR benchmark models. Second, the fouling model has a simple structure and a small number of parameters which are easily identifiable with a 'pen and ruler' approach using flux and pressure data from flux stepping experiments. Third, both the reversible and the irreversible fouling are in a functional relationship with the biopolymer concentrations in the bulk liquid. Whilst irreversible fouling is assumed to be caused by SMP, reversible fouling is accelerated by the presence of EPS, which leads to an increase in the specific cake resistance α_c. Fourth, cake detachment depends on the air-scouring rate according to the shear stress vs. superficial gas velocity relationship obtained from the steady-state slug flow model of Zaisha and Dukler [24].



2. NOMENCLATURE

a - fraction of SMP contributing to irreversible fouling (–)
b - flux dependency coefficient in the irreversible fouling equation (L−1 m2 h)
d_{f,o} - outer fibre diameter (m)
d_{slug} - diameter of the Taylor bubble (m)
f_{BAP} - fraction of S_BAP produced during biomass decay (gCOD gCOD−1)
f_{EPS,da} - fraction of X_EPS produced during autotrophic biomass decay (gCOD gCOD−1)
f_{EPS,dh} - fraction of X_EPS produced during heterotrophic biomass decay (gCOD gCOD−1)
f_{EPS,a} - fraction of X_EPS produced during autotrophic biomass growth (gCOD gCOD−1)
f_{EPS,h} - fraction of X_EPS produced during heterotrophic biomass growth (gCOD gCOD−1)
f_M - S_UAP and S_BAP retention on the membrane (–)
f_S - fraction of S_S produced during X_EPS hydrolysis (gCOD gCOD−1)
i_{XBAP} - nitrogen (N) content of S_BAP (gN gCOD−1)
i_{XEPS} - nitrogen (N) content of X_EPS (gN gCOD−1)
k_i - irreversible fouling strength (m kg−1)
k_{h,EPS,20} - maximum X_EPS hydrolysis rate at 20°C (d−1)
k_r - irreversible fouling strength (kg m−2 s−1)
K_{BAP} - half-saturation constant for S_BAP (gCOD m−3)
K_{UAP} - half-saturation constant for S_UAP (gCOD m−3)
l_f - distance between two fibres (m)
ṁ_{r,back} - mass flux of solids detaching from the cake and the membrane (kg m−2 s−1)
n - cake compressibility factor (–)
J - permeate flux (L m−2 h−1)
R_i - resistance due to irreversible fouling (m−1)
R_m - clean membrane resistance (m−1)
R_r - resistance due to reversible fouling (m−1)
R_t - total membrane resistance (m−1)
S_BAP - concentration of biomass associated products (BAP) (gCOD m−3)
S_UAP - concentration of utilisation associated products (UAP) (gCOD m−3)
t_f - filtration cycle duration time (s)
T_l - liquid temperature (°C)
v_{sg} - superficial gas velocity (cm s−1)
v_{sl} - superficial liquid velocity (cm s−1)
X_EPS - concentration of extracellular polymeric substances (EPS) (gCOD m−3)
X_MLSS - concentration of mixed liquor suspended solids (MLSS) (g m−3)
Y_SMP - yield coefficient for heterotrophic growth on S_UAP and S_BAP (gCOD gCOD−1)
α - oxygen transfer coefficient (–)


3. AIMS AND OBJECTIVES

The main aim of this study is to create a mathematical model of an immersed MBR reactor which will allow simulation-based process designs, process and energy optimisation studies and model-based control strategy designs for MBR systems to be carried out in a similar way to what is currently possible for conventional treatment processes. Such a model will also enable integration of MBR process simulation within larger projects, such as simulation of whole wastewater treatment plants (WWTPs) or integrated catchment modelling (ICM) studies. The main objective of an integrated simulation of MBR reactors is to improve the designs of existing MBR systems in terms of energy efficiency, resilience and effluent quality.

4. ACTIVATED SLUDGE MODEL WITH SMP AND EPS KINETICS

The biological model used in this study, later referred to as CES-ASM1 (combined EPS and SMP ASM1-based model), incorporates the unified theory of production and degradation of SMP and EPS developed by Laspidou and Rittmann [12] within ASM1, although with one significant conceptual correction. Whilst Laspidou and Rittmann [12] assume that the biomass associated products (BAP) in the system originate from the hydrolysis of EPS, researchers such as Aquino and Stuckey [2] postulate that BAP is produced during EPS hydrolysis as well as during bacterial cell decay.

Fig. 1. EPS and SMP formation and utilisation pathways in the biological model

In fact, BAP had already been defined by Lu et al. [16] as the SMP fraction strictly originating from biomass decay. The lack of direct active cell decay-related SMP production in Laspidou and Rittmann [12] was found to be the main cause of discrepancies between the model predictions and the measurements with regards to SMP [19].


Table 1. Process rate expressions for the SMP and EPS kinetics in the biological model

Symbol    Process rate equation
p_{1,b}   \mu_{BAP,20}\, e^{-0.069(20 - T_l)}\, \frac{S_{BAP}}{K_{BAP}+S_{BAP}}\, \frac{S_O}{K_{OH}+S_O}\, \frac{S_{ALK}}{K_{ALKH}+S_{ALK}}\, X_H
p_{1,c}   \mu_{UAP,20}\, e^{-0.069(20 - T_l)}\, \frac{S_{UAP}}{K_{UAP}+S_{UAP}}\, \frac{S_O}{K_{OH}+S_O}\, \frac{S_{ALK}}{K_{ALKH}+S_{ALK}}\, X_H
p_{2,b}   \mu_{BAP,20}\, e^{-0.069(20 - T_l)}\, \eta_g\, \frac{S_{BAP}}{K_{BAP}+S_{BAP}}\, \frac{K_{OH}}{K_{OH}+S_O}\, \frac{S_{NO}}{K_{NO}+S_{NO}}\, \frac{S_{ALK}}{K_{ALKH}+S_{ALK}}\, X_H
p_{2,c}   \mu_{UAP,20}\, e^{-0.069(20 - T_l)}\, \eta_g\, \frac{S_{UAP}}{K_{UAP}+S_{UAP}}\, \frac{K_{OH}}{K_{OH}+S_O}\, \frac{S_{NO}}{K_{NO}+S_{NO}}\, \frac{S_{ALK}}{K_{ALKH}+S_{ALK}}\, X_H
p_7       k_{h,EPS,20}\, e^{-0.11(20 - T_l)}\, X_{EPS}

CES-ASM hence provides the mechanisms for BAP production due to both EPS hydrolysis and biomass decay. The metabolic pathways of SMP and EPS in the biological model are visualised in Fig. 1. CES-ASM1 was calibrated on experimental data obtained from a batch and a continuous-flow lab scale bioreactor and a full-scale continuous-flow bioreactor, as described in Janus and Ulanicki [10]. Results of the CES-ASM1 calibration study can be found alongside the calibration results of CES-ASM3 in Janus and Ulanicki [10]. Process rate expressions for the SMP and EPS kinetics are shown in Table 1, whilst Table 2 presents the Petersen matrix and the composition matrix. All other process rates in the model are the same as in ASM1 from the publication of Henze et al. [7].

Table 2. Stoichiometric (Petersen) and composition matrix for the biological model, j: process, i: component. The matrix covers the processes p1 (ammonification), p2a-p2c (aerobic growth of heterotrophs on S_S, S_BAP and S_UAP), p3a-p3c (anoxic growth of heterotrophs on S_S, S_BAP and S_UAP), p4 (decay of heterotrophs), p5 (hydrolysis of organic compounds), p6 (hydrolysis of organic N), p7 (hydrolysis of X_EPS), p8 (aerobic growth of autotrophs) and p9 (decay of autotrophs), and the components S_I, S_S, X_I, X_S, X_H, X_EPS, S_UAP, S_BAP, X_A, X_P, S_O, S_NO, S_N2, S_NH, S_ND, X_ND and S_ALK, together with the ThOD, nitrogen and ionic charge composition rows. The model assumes that ThOD is identical to the measured COD: 1 gS_O = -1 gThOD, 1 gS_NH = 0 gThOD, 1 gS_NO = -64/14 gThOD, 1 gS_N2 = -24/14 gThOD.

5. MEMBRANE FOULING MODEL

The membrane fouling model is based on the concept of Liang et al. [15], where fouling is divided into short-term reversible fouling and long-term irreversible fouling, graphically represented in Fig. 2a. Both processes are described with two first order ordinary differential equations (ODEs) which describe the increase in membrane resistance due to, respectively, irreversible fouling (Eq. 1) and reversible fouling (Eq. 2). The model additionally accounts for cake compressibility (Eq. 7), cake detachment due to the presence of airflow/crossflow (Eq. 5), backwashing (Eq. 6), and flux-dependent deposition of soluble microbial products (SMP). The equation relating the fraction of SMP leading to irreversible fouling to the permeate flux follows the model proposed by Ye et al. [23], who found, through experimental analysis, that the fraction of alginate proteins depositing inside the membrane pores is in an exponential relationship with the flux. The cake detachment model uses Equation 5, proposed by Nagaoka et al. [21], in which the cake detachment rate is proportional to the shear stress on the membrane wall \tau_w and is diminished by a pressure-dependent static friction term \lambda_m \Delta P, which determines the combined effects of cake consistency and attachment to the membrane surface.



Fig. 2. (a) Representation of reversible and irreversible fouling (b) results of model calibration on data from a short-term flux-stepping experiment

Backwashing is assumed to be an instantaneous process in which the reversible resistance at the beginning of the (j+1)-th filtration cycle is equal to a fraction of the reversible resistance at the end of the previous j-th filtration cycle - see Eq. 6. We assumed that \eta_b = 0, i.e. all reversible fouling is removed in a single backwash.

\dot{R}_i = a\, k_i\, e^{b J}\, J\, (S_{UAP} + S_{BAP})          (1)
\dot{R}_r = \alpha_c\, (J X_{MLSS} - \dot{m}_{r,back})          (2)
R_t = R_m + R_i + R_r          (3)
\Delta P = J\, \mu\, (R_m + R_i + R_r)          (4)
\dot{m}_{r,back} = k_r\, (\tau_w - \lambda_m \Delta P)          (5)
\forall j \in \mathbb{N}:\; R_r^{j+1}(\tau = 0) = \eta_b\, R_r^{j}(\tau = t_f)          (6)
\alpha_c = \alpha_{c,0}\, (\Delta P / \Delta P_{crit})^{n}          (7)

The model was calibrated on the data obtained in a short-term flux stepping experiment and during long-term operation of a pilot-scale membrane bioreactor (MBR), and exhibits good accuracy for its designated application and within the intended operating range. The calibration results are described in more detail in Janus et al. [9] although the model used in this study uses a different equation for SMP deposition vs. flux, as mentioned above.
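A minimal numerical sketch of Eqs. 1-6 is given below (Python, explicit Euler integration). All parameter values in it are arbitrary placeholders chosen only so that the script runs, and the assumption of constant flux, SMP concentration and wall shear over a cycle is ours, not the paper's.

import math

# Minimal sketch of the reversible/irreversible fouling dynamics, Eqs. (1)-(6).
# All numerical values are illustrative placeholders, not the calibrated ones.
a_frac, k_i, b = 0.01, 1e12, 0.068     # irreversible fouling parameters
k_r, lam_m, tau_w = 1e-3, 2e-6, 0.05   # cake detachment parameters, wall shear (Pa)
alpha_c, X_mlss = 1e13, 10.0           # specific cake resistance (m/kg), MLSS (kg/m3)
R_m, mu = 3.0e12, 1e-3                 # clean membrane resistance (1/m), viscosity (Pa s)
J = 20.0 / 3.6e6                       # permeate flux: 20 L/(m2 h) expressed in m/s
S_smp = 0.01                           # S_UAP + S_BAP concentration (kg/m3), held constant here
eta_b, t_f = 0.0, 600.0                # backwash efficiency and filtration cycle length (s)

R_i, R_r, dt = 0.0, 0.0, 1.0
for cycle in range(10):                                    # ten filtration/backwash cycles
    for _ in range(int(t_f / dt)):                         # filtration phase
        dP = mu * J * (R_m + R_i + R_r)                    # Eq. (4): TMP in Pa
        m_back = max(k_r * (tau_w - lam_m * dP), 0.0)      # Eq. (5), kept non-negative
        R_i += dt * a_frac * k_i * math.exp(b * J) * J * S_smp          # Eq. (1)
        R_r = max(R_r + dt * alpha_c * (J * X_mlss - m_back), 0.0)      # Eq. (2)
    R_r = eta_b * R_r                                      # instantaneous backwash, Eq. (6)

print(R_i, R_r, mu * J * (R_m + R_i + R_r))                # resistances and final TMP (Pa)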



6. AIR SCOURING

The relationship between the shear stresses on the membrane surface \tau_w and the superficial gas velocity v_{sg}, hence the airflow rate, has been obtained through simulation of a slug-flow problem with the steady-state model of Zaisha and Dukler [24] and a geometric model of a hollow-fibre module adopted from Busch et al. [5]. The geometric model assumes that all fibres are staggered, such that three neighbouring fibres form an equilateral triangle. The model assumes that the slug flow is fully developed, axially symmetric, isothermal, steady-state, and under low pressure conditions. Both phases are at equilibrium, i.e. no one-directional mass transfer occurs between the phases whilst coalescence and breakage happen at equal rates. It is also assumed that the flow geometry does not change in time, i.e. the hollow-fibre membrane bundles do not sway due to velocity and pressure gradients developing in the bulk liquid. v_{sl} is related to v_{sg} using a modified Chisti equation as proposed by Böhm et al. [4]. The slug-flow model was simulated with MATLAB's Optimization Toolbox function lsqnonlin for a range of superficial gas velocities between 1 and 5 m s−1 which satisfy the aeration demands per membrane area (SADm) of 0.20-1.0 m3 m−2 h−1. The average shear stresses \tau_w on the fibre surface were calculated for different values of v_{sg}, T_l and X_{TSS}. It was found that \tau_w can be approximated with a third-order polynomial with respect to v_{sg}, given in Eq. 8, where each coefficient p_i is in a functional relationship with X_{TSS} and T_l according to Eq. 9:

\tau_w(v_{sg}) = p_1 (v_{sg})^3 + p_2 (v_{sg})^2 + p_3 (v_{sg}) + p_4          (8)

p_i = a_1 + a_2 X_{TSS} + a_3 T_l + a_4 (X_{TSS})^2 + a_5 (X_{TSS} T_l)          (9)

Values of all p_i and a_i coefficients can be found in Janus [11, chap. 7].
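As an illustration of how Eqs. 8-9 are evaluated in practice, the sketch below builds \tau_w(v_sg) from a set of a-coefficients; the numerical values used here are made-up placeholders (the calibrated coefficients are given in Janus [11, chap. 7] and are not reproduced in this paper).

# Placeholder a-coefficients for each polynomial coefficient p1..p4 of Eq. (8);
# the real values are given in Janus [11, chap. 7].
A = {
    "p1": (0.001, 1e-5, -2e-5, 1e-7, 1e-7),
    "p2": (-0.010, 2e-5, 1e-4, -1e-7, 2e-7),
    "p3": (0.100, -1e-4, 5e-4, 1e-6, -1e-6),
    "p4": (0.010, 1e-5, 1e-5, 1e-8, 1e-8),
}

def p(coeffs, x_tss, t_l):
    """Eq. (9): p_i = a1 + a2*X_TSS + a3*T_l + a4*X_TSS^2 + a5*X_TSS*T_l."""
    a1, a2, a3, a4, a5 = coeffs
    return a1 + a2 * x_tss + a3 * t_l + a4 * x_tss**2 + a5 * x_tss * t_l

def tau_w(v_sg, x_tss, t_l):
    """Eq. (8): third-order polynomial in the superficial gas velocity."""
    p1, p2, p3, p4 = (p(A[k], x_tss, t_l) for k in ("p1", "p2", "p3", "p4"))
    return p1 * v_sg**3 + p2 * v_sg**2 + p3 * v_sg + p4

print(tau_w(v_sg=3.0, x_tss=8000.0, t_l=15.0))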

7. SPECIFIC CAKE RESISTANCE AS A FUNCTION OF EPS Specific cake resistance under atmospheric pressure αc,0 is related to the EPS fraction in mixed liquor volatile suspended solids (MLVSS) expressed in mgTOC/gVSS using a modified expression originally proposed by Ahmed et al. [1] and given in Eq. 10. The modification lies in introduction of a proportionality constant m = 10 which was added due to the fact that αc,0 values obtained from the original equation of Ahmed et al. [1] were so small that no pressure gradients due to reversible fouling were observed in the model. TOC is calculated from COD by multiplying the COD values by a factor of three.



MLVSS is calculated from MLSS using an MLVSS/MLSS ratio of 0.7.

\alpha_{c,0} = m \left( 1.376 \times 10^{11}\, \frac{EPS}{MLVSS} - 2.564 \times 10^{12} \right)          (10)

8. INTEGRATED MODEL FORMULATION

The structure of the integrated biological and membrane fouling MBR model (IBMF-MBR) is shown in Fig. 3; the u_i(t) signals represent inputs, the y_i(t) signals represent outputs and w_i(t) denote the disturbances. The model is subdivided into three subsystems:

Fig. 3. Integrated MBR model structure

Bioreactor, Membrane and Interface. The Bioreactor is modelled with CES-ASM1 and the Membrane is described with Eqs. 1-7. The Interface calculates the specific cake resistance as a function of EPS/MLVSS according to Eq. 10, the shear stresses \tau_w as a function of the airflow rate with the relationship obtained from the results of the slug-flow model, and the oxygen transfer coefficient α as a function of MLSS with the exponential equation used in Maere et al. [17]. The IBMF-MBR model is formulated on the plant layout defined in BSM-MBR, implemented in Simulink® and simulated with model inputs, operational parameters and simulation scenarios borrowed from the original COST Benchmark model [6] and BSM-MBR [17]. The plant is divided into five completely stirred tank reactors (CSTRs) - two anoxic tanks, two aerobic tanks with fine-bubble aeration and one membrane tank with coarse-bubble aeration. The tank volumes are, however, slightly different from the ones used in BSM-MBR. In IBMF-MBR each anoxic volume is increased from 1,500 m3 to 1,800 m3 at the cost of the aerobic tanks and the membrane tank, whose volumes are decreased from 1,500 m3 to 1,300 m3. As a result, the anoxic fraction is increased from 40% to 51.4%, bringing it closer to the value recommended by MUNLV [20] for pre-denitrification MBR plants. The anoxic fraction had to be increased because the denitrification


kinetics in CES-ASM1 are somewhat slower than those in ASM1 due to the alteration of the flow of organic substrates caused by the introduction of the SMP and EPS metabolic pathways. Stoichiometric and kinetic parameters governing the SMP and EPS kinetics in CES-ASM1 are as follows: YSMP = 0.45, γH = 0.0924, γA = 0, iXBAP = 0.07, iXEPS = 0.07, KUAP = 100, KBAP = 85, µUAP,20 = 0.45, µBAP,20 = 0.15, fS = 0.4, fEPS,h = 0.10, fEPS,dh = 0.025, fEPS,a = 0.0, fEPS,da = 0.0, fBAP = 0.0215, kh,EPS,20 = 0.17. Descriptions and units of all the above parameters are given in the Nomenclature. Out of the original ASM1 parameters only the heterotrophic biomass yield YH was changed from its default value of 0.67 to 0.67/(1 + 0.0924) gCOD gCOD−1. The rest of the biological parameters were given their default values as published in Henze et al. [7]. Filtration-related parameters are as follows: Rm = 3.0 × 1012, fM = 0.5, b = 6.8 × 10−2, ∆Pcrit = 30,000, n = 0.25, γm = 1,500, λm = 2 × 10−6. Again, descriptions and units of all filtration-related parameters are provided in the Nomenclature. The aeration model is borrowed from Maere et al. [17], as are all controller setpoints and operating parameters, except the open-loop airflow setpoints to aeration tanks 3 & 4, which are set to 3,440 Nm3 h−1 and 3,360 Nm3 h−1, respectively. Under closed-loop operation with DO control the airflow split ratio between tanks 3 & 4 is set to 1.3 : 1. The membrane operates in a sequence of 10 min filtration periods with 1 min backwash intervals. Energy demand for aeration and mixing is calculated with the same equations as used in Maere et al. [17]. Energy demand for pumping is calculated with Equation 11, where the geometric height hg, the sum of hydraulic losses hl and the pump efficiency ηi for each pump i are provided in Table 3. These parameters for the waste flow qw, internal recirculation qint and sludge recirculation qr were adjusted in order to match the energy costs published in Maere et al. [17], whilst η and hl for qe and qb have been assumed. The resulting geometric heights for these two flows are calculated by the membrane fouling model. The backwash flow is assumed to be twice the average permeate flow and corresponds to a backwash flux of ∼40 L m−2 h−1. Membrane resistance during backwash periods is assumed to be equal to Rm + Ri. ρw = 1,000 kg m−3; tsimu denotes the simulation time (7 days).

PE = \frac{60\,\rho_w g}{1000\, t_{simu}} \sum_{i=1}^{5} \frac{h_g^i + h_l^i}{\eta_i} \int_{t_0}^{t_0 + t_{simu}} q_i(t)\,dt \qquad (11)
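To make the bookkeeping of Equation 11 concrete, the sketch below evaluates it numerically for five flow time series using the Table 3 parameters. It is only an illustration: the constant flow values (apart from qperm,ave = 18286.3 m3 d−1, which appears later in the text), the 15-minute sampling, the placeholder geometric heights for qe and qb (which the model actually calculates) and the function name pumping_energy are all assumptions, and the units of the result follow whatever units are assumed for the flows and time.

```python
import numpy as np

RHO_W = 1000.0   # kg m-3, as given in the text
G = 9.81         # m s-2 (assumed standard gravity)

def pumping_energy(flows, h_g, h_l, eta, t, t_simu):
    """Evaluate Eq. 11 for a list of flow time series q_i(t)."""
    total = 0.0
    for q, hg_i, hl_i, eta_i in zip(flows, h_g, h_l, eta):
        integral = np.trapz(q, t)               # integral of q_i(t) over the simulation
        total += (hg_i + hl_i) / eta_i * integral
    return 60.0 * RHO_W * G / (1000.0 * t_simu) * total

# Illustrative constant flows for the five pumps of Table 3 (qw, qint, qr, qe, qb);
# only qe = 18286.3 m3/d is taken from the text, the rest are placeholders.
t = np.linspace(0.0, 7.0, 7 * 96 + 1)           # 7 days sampled every 15 min
flows = [np.full_like(t, q) for q in (400.0, 55000.0, 18000.0, 18286.3, 1500.0)]
h_g = [7.0, 0.50, 0.50, 3.0, 3.0]               # qe/qb heights are model-calculated; 3.0 is a placeholder
h_l = [2.17, 1.42, 1.42, 0.5, 0.5]
eta = [0.5, 0.7, 0.7, 0.7, 0.7]
print(pumping_energy(flows, h_g, h_l, eta, t, t_simu=7.0))
```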

IBMF-MBR is simulated in the same fashion as BSM-MBR but adopts one more control loop, i.e. nitrate control, which manipulates qint in order to maintain a setpoint of 1.0 mg N-NO3− in the second anoxic tank. The control loop design is borrowed from Copp [6]. The input files had to be modified to take into account three new variables



Table 3. Parameters used in the pumping energy demand equation (Equation 11)

Parameter          Symbol  Unit  qw    qint  qr    qe          qb
Geometric height   hg      m     7.0   0.50  0.50  calculated  calculated
Sum of losses      hl      m     2.17  1.42  1.42  0.5         0.5
Efficiency         η       –     0.5   0.7   0.7   0.7         0.7

introduced in CES-ASM1, i.e. XEPS, SUAP, and SBAP. It is assumed that SUAP = 0, while SBAP is assumed to be equal to 70% of the influent soluble inert substrate SI in BSM1 and BSM-MBR. XEPS is assumed to constitute 5% of the biomass. EPS and BAP are assumed to contain 6% of N, whilst UAP contains no nitrogen.
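For clarity, the influent assumptions above can be summarised in a few lines of Python; this is only a sketch, and the example influent concentrations passed to the hypothetical function influent_new_variables are placeholders, not values from the benchmark files.

```python
def influent_new_variables(s_i, x_biomass):
    """s_i: influent soluble inerts (gCOD m-3); x_biomass: influent biomass (gCOD m-3)."""
    s_uap = 0.0               # S_UAP assumed zero in the influent
    s_bap = 0.70 * s_i        # S_BAP taken as 70% of the soluble inert substrate
    x_eps = 0.05 * x_biomass  # X_EPS taken as 5% of the biomass
    return {"S_UAP": s_uap, "S_BAP": s_bap, "X_EPS": x_eps}

print(influent_new_variables(s_i=30.0, x_biomass=50.0))  # hypothetical influent values
```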

9. SIMULATION RESULTS

The simulation results show that CES-ASM1 predicts lower sludge yields and lower denitrification rates than ASM1. This behaviour is caused by the alteration of the organic substrate pathways resulting from the introduction of SMP and EPS kinetics. The results also indicate that changes in the SMP and EPS content in MLSS in response to diurnal variations in the influent flow and loading rates are too small to have a noticeable impact on membrane fouling (see Fig. 5 and Fig. 6), whilst fouling rates are highly sensitive to fluctuations of the solids concentration in the membrane tank (not shown) and to flux rates (Fig. 5). Such model behaviour is a direct result of the biopolymer kinetic model, which does not consider biopolymer production in response to environmental stress, only due to normal variations in substrate loading. In terms of ‘standard’ effluent quality parameters, the IBMF-MBR model predicts effluent TN concentrations similar to BSM-MBR and a similar number of TN consent violations, as indicated in Fig. 4, albeit with a larger anoxic volume. As shown in Table 4, IBMF-MBR in the open-loop configuration predicts energy demands for mixing, sludge pumping and aeration similar to BSM-MBR, whilst in the closed-loop configuration the unit cost for membrane aeration drops significantly, by 0.19 kWh m−3. The energy cost for permeate pumping in IBMF-MBR is ten times lower than in BSM-MBR, despite calculated permeabilities of about 80-100 Lmh bar−1, which are fairly typical for an ultrafiltration (UF) module. On the other hand, the TMP calculated in the model may not be representative of long-term operation of an MBR, because the simulations lasted only 28 days - too short a period for the slow irreversible fouling process to be properly taken into account. As a result the total energy cost per m3



Fig. 4. Effluent total nitrogen (TN) under (a) dry-, (b) rain- and (c) storm-weather and open-loop (OL) and closed-loop (CL) operation (axes: effluent TN (g/m3) vs. t (d); series: TN,max, TN OL, TN CL)

of treated wastewater in BSM-MBR is, on average, 0.2 kWh m−3 higher than in IBMF-MBR.

Fig. 5. (a) Irreversible fouling Ri and (b) SMP/MLSS ratio in the bioreactor during dry-, rain- and storm-events

10. CONCLUSIONS

This paper outlines the development of an integrated biological and membrane fouling MBR model (IBMF-MBR) and presents some selected simulation results. Due to space restrictions the author cannot present a full comparison of results against BSM-MBR, nor the full outputs of the membrane fouling model. IBMF-MBR predicts effluent quality similar to BSM-MBR in terms of effluent ammoniacal-N, COD and TN, although the similarity in effluent TN predictions between the two models had to be achieved by increasing the anoxic volume fraction in IBMF-MBR by about 11.5% compared to BSM-MBR. The model also shows that the variations in SMP and EPS




Fig. 6. (a) Specific cake resistance αc and (b) EPS/MLSS ratio in the bioreactor during dry-, rain- and storm-events

Table 4. Comparison of energy costs between IBMF-MBR, BSM-MBR and three municipal MBR WWTPs – modified from Maere et al. [17]

Energy cost (kWh m−3)    Schilde  Varsseveld  Nordkanal  BSM-MBR  IBMF-MBR       IBMF-MBR
                                                                   open-loop∗)    closed-loop∗)
Mixing                   0.05     0.04        0.11       0.03     0.039          0.039
Sludge pumping           0.10     0.11        0.01       0.05     0.046          0.049
Effluent pumping         0.07     0.12        0.02       0.07     0.008          0.008
Aeration (bioreactor)    0.07     0.24        0.11       0.21     0.22           0.22
Aeration (membrane)      0.23     0.34        0.45       0.53     0.49           0.30
Total                    0.52     0.85        0.71       0.90     0.81           0.62

∗) dry-weather conditions with average permeate flow rate qperm,ave = 18286.3 m3 d−1

concentration due to diurnal flow and loading patterns are not significant enough to have a noticeable effect on membrane fouling, which is predominantly affected by flow rate variations and fluctuations of suspended solids in the membrane tank (not shown). Whether the model is correct in predicting that the changes in biopolymer concentrations in a plant receiving diurnal flow and load patterns are this small, or whether the implemented mechanisms of biopolymer production are insufficient to give realistic outputs, can only be ascertained through extensive validation under dynamic conditions. At the moment it seems that process setpoints such as DO, MLSS and SRT have a significantly larger influence on SMP and EPS concentrations in the bioreactor than diurnal disturbances. The last conclusion is that whilst reversible fouling can be quantified using a standard 14-day simulation benchmark time-frame, quantification of the effects of irreversible fouling requires longer simulation horizons in the range of 12 months and, possibly, a chemical cleaning model to allow simulation of irreversible fouling recovery due to periodic chemical cleaning.



REFERENCES
[1] AHMED, Z., CHO, J., LIM, B-R., SONG, K-G., AHN, K-H., 2007. Effects of sludge retention time on membrane fouling and microbial community structure in a membrane bioreactor. Journal of Membrane Science 287, 211-218.
[2] AQUINO, S.F., STUCKEY, D.C., 2008. Integrated model of the production of soluble microbial products (SMP) and extracellular polymeric substances (EPS) in anaerobic chemostats during transient conditions. Biochemical Engineering Journal 38, 138-146.
[3] DI BELLA, G., MANNINA, G., GASPARE, V., 2008. An integrated model for physical-biological wastewater organic removal in a submerged membrane bioreactor: Model development and parameter estimation. Journal of Membrane Science 322, 1-12.
[4] BÖHM, L., DREWS, A., PRIESKE, H., BÉRUBÉ, P.R., KRAUME, M., 2012. The importance of fluid dynamics for MBR fouling mitigation. Bioresource Technology 122(0), 50-61.
[5] BUSCH, J., CRUSE, A., MARQUARDT, W., 2007. Modeling submerged hollow-fiber membrane filtration for wastewater treatment. Journal of Membrane Science 288(1-2), 94-111.
[6] COPP, J.B., 2002. The COST simulation benchmark - description and simulator manual. Luxembourg: Office for Official Publications of the European Communities.
[7] HENZE, M., GUJER, W., MINO, T., VAN LOOSDRECHT, M., 2000. Activated sludge models ASM1, ASM2, ASM2d and ASM3. IWA Publishing.
[8] JIANG, T., MYNGHEER, S., DE PAUW, D.J.W., SPANJERS, H., NOPENS, I., KENNEDY, M.D., AMY, G., VANROLLEGHEM, P.A., 2008. Modelling the production and degradation of soluble microbial products (SMP) in membrane bioreactors (MBR). Water Research 42(20), 4955-4964.
[9] JANUS, T., PAUL, P., ULANICKI, B., 2009. Modelling and simulation of short and long term membrane filtration experiments. Desalination & Water Treatment 8, 37-47.
[10] JANUS, T., ULANICKI, B., 2010. Modelling SMP and EPS formation and degradation kinetics with an extended ASM3 model. Desalination 261(1-2), 117-125.



[11] JANUS, T., 2013. Modelling and Simulation of Membrane Bioreactors for Wastewater Treatment. PhD Thesis, De Montfort University, Leicester.
[12] LASPIDOU, C.S., RITTMANN, B.E., 2002. A unified theory for extracellular polymeric substances, soluble microbial products, and active and inert biomass. Water Research 36(11), 2711-2720.
[13] LEE, Y., CHO, J., SEO, Y., LEE, J.W., AHN, K-H., 2002. Modeling of submerged membrane bioreactor process for wastewater treatment. Desalination 146(1-3), 451-457.
[14] LI, X-Y., WANG, X-M., 2006. Modelling of membrane fouling in a submerged membrane bioreactor. Journal of Membrane Science 278, 151-161.
[15] LIANG, S., SONG, L., TAO, G., KEKRE, K.A., SEAH, H., 2006. A modeling study of fouling development in membrane bioreactors for wastewater treatment. Water Environment Research 78(8), 857-863.
[16] LU, S.G., IMAI, T., UKITA, M., SEKINE, M., HIGUCHI, T., FUKAGAWA, M., 2001. A model for membrane bioreactor process based on the concept of formation and degradation of soluble microbial products. Water Research 35(8), 2038-2048.
[17] MAERE, T., VERRECHT, B., MOERENHOUT, S., JUDD, S., NOPENS, I., 2011. A benchmark simulation model to compare control and operational strategies for membrane bioreactors. Water Research 45(6), 2181-2190.
[18] MANNINA, G., DI BELLA, G., VIVIANI, G., 2011. An integrated model for biological and physical process simulation in membrane bioreactors (MBRs). Journal of Membrane Science 376, 56-69.
[19] MENNITI, A., MORGENROTH, E., 2010. Mechanisms of SMP production in membrane bioreactors: Choosing an appropriate mathematical model structure. Water Research 44, 5240-5251.
[20] Ministerium für Umwelt und Naturschutz, Landwirtschaft und Verbraucherschutz des Landes Nordrhein-Westfalen (Hrsg.), 2003. Waste Water Treatment with Membrane Technology (Abwasserreinigung mit Membrantechnik).
[21] NAGAOKA, H., YAMANISHI, S., MIYA, A., 1998. Modeling of biofouling by extracellular polymers in a membrane separation activated sludge system. Water Science & Technology 38(4-5), 497-504.



[22] SUH, C., LEE, S., CHO, J., 2013. Investigation of the effects of membrane fouling control strategies with the integrated membrane bioreactor model. Journal of Membrane Science 429, 268-281.
[23] YE, Y., CHEN, V., FANE, T., 2006. Modeling long-term subcritical filtration of model EPS solutions. Desalination 191(1-3), 318-327. International Congress on Membranes and Membrane Processes.
[24] ZAISHA, M., DUKLER, E., 1963. Improved hydrodynamic model of two-phase slug flow in vertical tubes. Chinese Journal of Chemical Engineering 1(1), 18-29.
[25] ZARRAGOITIA-GONZÁLEZ, A., SCHETRITE, S., ALLIET, M., JÁUREGUI-HAZA, U., ALBASI, C., 2008. Modelling of submerged membrane bioreactor: Conceptual study about link between activated sludge biokinetics, aeration and fouling process. Journal of Membrane Science 325, 612-624.



Computer Systems Engineering 2014
Keywords: machine learning, gesture recognition, depth sensor

Justyna KULIŃSKA*

DETECTION OF FUZZY PATTERNS IN MULTIDIMENSIONAL FEATURE SPACE IN PROBLEM OF BODY GESTURE RECOGNITION

The subject of body gesture recognition has recently been gaining popularity. Natural User Interfaces (NUI), Human Robot Interaction (HRI) and gaming are only a few fields of applicability of a body gesture recognition system. Existing systems allow one to obtain the raw positions of human body joints using 3D sensors. This kind of data can be further used as features in a recognition system. The majority of existing solutions use various classifiers to recognize gestures. However, a huge disadvantage of such systems is a low level of expandability, which means that a lot of effort is needed to expand the system's gesture set. In this paper another approach, based on Template-based Shell Clustering, is introduced and compared with classical methods.

1. INTRODUCTION

The problem of body gesture recognition has been known in the literature for a long time, but it has recently been gaining popularity. The main reason for this is that the two main fields of applicability of a gesture recognition system are also getting more popular. Those fields are Natural User Interfaces (NUI) and Human Robot Interaction (HRI). Of course there are more areas for such a system. An interesting example is described in [13], where a body gesture recognition system is used in urban monitoring for detecting potentially dangerous situations in two-person interactions. There are many ways of obtaining gesture data. Formerly, the most common approach to gesture recognition was based on an RGB camera (see the survey paper [12]). However, this approach suffers from skin-colour dependency and is a relatively hard task. Nowadays the most popular approach uses 3D sensors such as the Microsoft Kinect or Asus Xtion. Though this approach is relatively new, there are already many papers which use it for gesture recognition. An interesting paper [6] uses body joints as a

Department of Systems and Computer Networks, Wrocław University of Technology, Poland



way of recognizing dynamic gestures represented as sequences of static key poses. Further examples are [3] and [8], which also use 3D sensors for simple body gesture recognition. All of the mentioned solutions, and the majority of other existing ones, use classifiers for the recognition process. This kind of approach, although it gives relatively good results, suffers from one issue - a low level of expandability. This means that creating a gesture set for the algorithm requires a lot of effort, because the training set for the classifiers should be as big and as varied as possible. The main goal of this paper is to design and implement an algorithm for body gesture recognition which addresses the problem identified in existing solutions - the low level of expandability, understood as the ability of an algorithm to expand the gesture set that it uses. Nevertheless, this is not the only criterion that a good algorithm should fulfill. Another important criterion investigated in this paper is flexibility. This is a very wide criterion, as it covers many different matters. Firstly, there should not be any limitations on the choice of a gesture to be recognized. Another issue considered within this criterion is person-independence. This means that the system should properly recognize gestures of any person, independently of the one that created the training data. The next matter is the distance from the sensor, which also should not affect recognition. The last matter concerning this criterion is the fuzziness of input gestures. This means that input gestures do not have to be perfect repetitions of the training data. Further goals of this paper are: to design and conduct experiments which check whether the implemented algorithm fulfills the above criteria, and to compare the effectiveness of the implemented algorithm with existing solutions. The rest of the paper is organized as follows. In Section 2 the problem formulation is given. Next, a general clustering method is discussed and the details of the algorithm on which our approach is based are presented (Section 3), together with the details of the proposed solution (Section 4). Section 5 describes the experiments, whereas the results are presented in Section 6. The last section contains conclusions and considers further improvements.

2. PROBLEM FORMULATION

The problem of body gesture recognition is part of a bigger family that contains all recognition problems. In each of them the idea is similar - a few classes of objects are defined and, when a new object appears, it should be properly classified into one of the existing classes. In this paper the problem of body gesture recognition is considered - this means that an object can be understood as a single performance of a gesture. Moreover, a class is defined by one exemplary performance of a gesture.



In [2], it is stated that “Classification is the task of assigning labels to a set of instances in such a way that instances with the same label share some common properties (...)”. In other words, classification is a process of grouping objects into several groups in such a way that objects within a group are in some sense more similar than objects between groups. Clustering is a form of unsupervised classification in which no previous knowledge about the objects' labelling is available. In such a situation the only information that can be drawn upon is the similarity or dissimilarity of particular objects. As mentioned, clustering is one of the Machine Learning tasks belonging to unsupervised learning techniques. A common field of applicability is Data Mining, for finding patterns in data. It may seem that for this purpose only circular or ellipsoid-shaped clusters are sufficient; however, this is not true in every application. Various other shapes, for example crescents, also occur in data. Shell clustering is a category of clustering which makes it possible to search for clusters of shell shapes. An example of this kind of algorithm is the c-shell clustering algorithm. This is a prototype-based approach which usually involves an iterative procedure of minimizing an objective function [10]. The commonly used objective function for possibilistic c-shell clustering [4] is (1), with the membership function (2):

J = \sum_{j=1}^{C} \sum_{i=1}^{N} u_{ij}^{m} d_{ij}^{m} + \sum_{j=1}^{C} \eta_j \sum_{i=1}^{N} (1 - u_{ij})^{m}, \qquad (1)

u_{ij} = \left[ 1 + \left( \frac{d_{ij}^2}{\eta_j} \right)^{\frac{1}{m-1}} \right]^{-1}, \qquad (2)

where C is the number of clusters, N is the number of objects, uij is the membership value computed from the distance dij, m is the fuzzification factor, and ηj is a parameter called the bandwidth or zone of influence, which controls the dependence between uij and dij. The bandwidth parameter is strongly related to the convergence issue, which is explained further on.
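Read literally, expressions (1) and (2) can be evaluated as in the short sketch below. It is only an illustration: the distance matrix, the number of clusters and the fuzzifier value are made up, and only the formulas themselves follow the equations above.

```python
import numpy as np

def membership(d, eta, m):
    """Possibilistic membership u_ij (Eq. 2) for an N x C matrix of distances d."""
    return 1.0 / (1.0 + (d ** 2 / eta) ** (1.0 / (m - 1.0)))

def objective(d, eta, m):
    """Objective J (Eq. 1)."""
    u = membership(d, eta, m)
    return np.sum(u ** m * d ** m) + np.sum(eta * np.sum((1.0 - u) ** m, axis=0))

d = np.abs(np.random.randn(15, 3))   # 15 objects, 3 clusters - illustrative distances
eta = np.ones(3)                     # bandwidth set to 1, as later in the paper
print(objective(d, eta, m=2.0))
```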

3. TEMPLATE-BASED SHELL CLUSTERING

Template-based shell clustering is a subset of shell clustering which allows any cluster shape defined as a template to be recognized. The paper on which the implemented algorithm is based [10] used it not for a Data Mining task but for detecting objects in an image. An example of how the algorithm works is presented in Figure 1.



Fig. 1. An example of work of the algorithm described in [10]

In this section the most important details of the algorithm described in [10] are presented, to prepare the ground for the explanation of the designed solution. In the solution presented in [10], templates are graphs. The input set of points is obtained from an image after edge detection. As previously stated, the prototype-based approach relies on iterative changes in the prototype to minimize the objective function. In [10] the objective function is the standard function (1). Prototypes are obtained from templates after some transformation. Three types of transformations are presented in [10]:

Type I: Consists of translation, rotation and a single scaling factor. The function transforming a template point into a prototype point is (3), where p is a prototype point, p∗ is the corresponding point in the template, Rj is the rotation matrix, sj is the scaling factor and tj is the translation vector:

p = Rj sj p∗ + tj.

(3)

Type II: Consists of rotation, a separate scaling factor for each dimension, and translation. The transformation function is (4), where the scaling factor sj is replaced with a scaling matrix Sj:

p = Rj Sj p∗ + tj.    (4)

Type III: Consists of an affine transformation and translation. The transformation function is (5), where rotation and scaling are replaced with a single affine transformation matrix Aj:

p = Aj p∗ + tj.

(5)


The distance between a given point and a point in the prototype is calculated using the transformation equation. The distance between a given point xi and a point in the prototype for transformation Type I is presented in (6):

dij = |xi − pij| = |xi − (Rj sj p∗ij + tj)|.

(6)

This distance is further used in the membership function (2) and hence in the objective function (1). In each iteration of the algorithm, the nearest data points are found for all points in the prototype. Next, the change parameters (such as the rotation matrix or the translation vector) are updated to minimize the objective function. The change of each parameter is calculated separately, as a partial derivative of the objective function. All necessary equations are included in [10]. At the beginning, a few randomly initialized prototypes are created. Then, for each prototype, the change parameters are modified separately. At the end of the algorithm, the prototypes should converge at the location of a single object in the image. More details about the algorithm can be found in [10]; those given above are the only ones needed to present the Template-based body gesture recognition algorithm.
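For concreteness, a small sketch of the Type I transformation (3) and the resulting distance (6) is given below. The rotation angle, scale, translation and points are arbitrary illustrative values; the 2-D rotation reflects the image setting of [10], not the 3-D gesture setting used later.

```python
import numpy as np

def type1_distance(x_i, p_star_ij, R_j, s_j, t_j):
    """Euclidean distance between a data point and a transformed template point."""
    p_ij = R_j @ (s_j * p_star_ij) + t_j   # Eq. (3)
    return np.linalg.norm(x_i - p_ij)      # Eq. (6)

theta = np.deg2rad(10.0)                   # illustrative rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(type1_distance(np.array([1.0, 0.5]), np.array([0.9, 0.4]), R, 1.1, np.zeros(2)))
```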

4. PROPOSED APPROACH – TEMPLATE-BASED BODY GESTURE RECOGNITION

Template-based body gesture recognition is a simplification of the algorithm used in [10] for finding an object in an image. The details of the algorithm are presented further in this section.

4.1. FEATURES EXTRACTION

In the solution presented in [10], objects are two-dimensional points extracted from an input image. In our approach, objects are three-dimensional locations of body joints. Posture information is obtained with the use of the 3D sensor ASUS Xtion Pro and the open-source libraries OpenNI [7] and NiTE [9]. Those libraries allow a user to obtain the raw 3D positions of 15 joints of the human body: head, neck, left/right arm, left/right elbow, left/right hand, torso, left/right hip, left/right knee, left/right foot. An example of an obtained skeleton is presented in Figure 2a, and Figure 2b presents an example of the feature extraction method. Objects in the algorithm are not simply the 3D positions of joints. To achieve independence from different body proportions and different distances from the sensor, another way of extracting features is implemented. Objects in the algorithm are vectors from the torso point to each



Fig. 2. An example of a skeleton and a feature extraction method: (a) a sample recognition of a skeleton; (b) an example of the feature extraction method

joint (an example is presented as the black arrow in Figure 2b), normalized by the length of the torso (the blue line in Figure 2b). As the above mentioned libraries are able to detect 15 body joints for each person, the object set in the three-dimensional feature space contains exactly 15 objects. Such a previously saved set of points can be used as a single template.
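One possible reading of this feature extraction step is sketched below. The joint names follow the 15-joint list above, while the choice of the torso-to-neck distance as the normalizing "torso length" and the skeleton positions are assumptions made only for illustration.

```python
import numpy as np

def extract_features(joints):
    """joints: dict of joint name -> 3-D position from the depth sensor."""
    torso = np.asarray(joints["torso"], dtype=float)
    # Assumed definition of the torso length: distance from torso to neck.
    torso_len = np.linalg.norm(np.asarray(joints["neck"], dtype=float) - torso)
    return {name: (np.asarray(pos, dtype=float) - torso) / torso_len
            for name, pos in joints.items()}

skeleton = {"torso": [0, 0, 2.0], "neck": [0, 0.5, 2.0], "head": [0, 0.7, 2.0],
            "left_hand": [-0.6, 0.4, 1.9]}          # truncated, hypothetical positions
features = extract_features(skeleton)
print(features["left_hand"])
```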

4.2. FITTING STAGE

As in the algorithm described in [10], in our approach prototypes are created as modified templates. Three different fitting types are available:

No transformation: The no-transformation fitting type means that the prototype points are exactly the same as in the template. It is presented in (7), where pi is the i-th prototype point and p∗ij is the matching i-th point in the j-th template. Because the body joints are labeled, there is no need to find the nearest point, as we know exactly which point in an input gesture corresponds to which point in the template:

pi = p∗ij.

(7)

Scaling: The scaling fitting type adds a scaling matrix Sj for each prototype (8):

pi = Sj p∗ij.    (8)



Affine transformation: The affine transformation fitting type changes Sj to a more flexible Aj - an affine transformation matrix for each prototype (9):

pi = Aj p∗ij.

(9)

The difference between functions (9) and (5) is the lack of the translation vector. It is not needed because of the way features are extracted, explained in Section 4.1. In our approach both the templates and the input gesture have their centre - the torso point - at [0, 0, 0], so no translation is needed. There is also no rotation in function (8). This is done to compare the simpler scaling approach, which should eliminate differences in body proportions, with the more complex affine transformation, which includes scaling, rotation and more. As already explained in Section 3, in each iteration of the algorithm the prototypes are changed to minimize the objective function. In our approach the same objective function (1) is used as in [10]. For this reason the transformations from [10] can be used, after extension to three dimensions. The update equations for the scaling and affine matrices are (10) and (11):

s_{jk} = \frac{\sum_{i=1}^{N} u_{ij}^{m}\, x_i^{T} I\, p^{*}_{ij(k)}}{\sum_{i=1}^{N} u_{ij}^{m} \left(p^{*}_{ijk}\right)^{2}}, \qquad (10)

A_j = \left[ \sum_{i=1}^{N} u_{ij}^{m} (-x_i)\left(p^{*}_{ij}\right)^{T} \right] \left[ \sum_{i=1}^{N} u_{ij}^{m} \left(p^{*}_{ij}\right)\left(p^{*}_{ij}\right)^{T} \right]^{-1}. \qquad (11)

In (10), sjk is the k-th diagonal element of the scaling matrix Sj, I is a three-dimensional identity matrix, p∗ijk is the value of the i-th matching point in the j-th prototype in the k-th dimension, and the vector p∗ij(k) is the vector p∗ij with all dimensions but the k-th set to zero. In both functions xi is the i-th point of the input gesture and the membership function (2) is used. As the distance required in the membership function, the simple Euclidean distance between the points xi and pij is used. An exemplary result of the fitting stage of the algorithm is presented in Figure 3, where three sets of points are shown - template points, input gesture points and prototype points. At the beginning of the algorithm, the prototype points are exactly the same as the template points, and in each iteration of the algorithm they are changed to fit the input gesture. As can be observed in Figure 3, the prototype points are very well matched to the input gesture points. In the objective function (1) two parameters are used: m - the fuzzification factor, and ηj - the bandwidth. The fuzzification factor is also used as a parameter in the proposed algorithm. As the name suggests, it controls the level of fuzziness in the recognition process. The bandwidth parameter is omitted in this paper and is set to 1. This parameter is connected



Fig. 3. An example of a fitting stage with use of affine transformation

with the wrong-convergence issue, and in our approach wrong convergence cannot occur, as the input points are labelled.
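Before moving on to the full algorithm, a sanity check on the scaling update (10) may help: the sketch below recovers a known per-axis scaling between a synthetic template and an input gesture. The synthetic data, the uniform memberships and the fuzzifier value are illustrative only.

```python
import numpy as np

def update_scaling(x, p_star, u, m):
    """Diagonal of S_j from Eq. (10); x and p_star are N x 3, u is length N."""
    w = (u ** m)[:, None]                          # u_ij^m
    num = (w * x * p_star).sum(axis=0)             # sum_i u^m x_ik p*_ijk
    den = (w * p_star ** 2).sum(axis=0)            # sum_i u^m (p*_ijk)^2
    return num / den

p_star = np.random.rand(15, 3)                     # template joint vectors
x = p_star * np.array([1.2, 0.9, 1.0])             # input gesture = per-axis scaled template
u = np.ones(15)                                    # uniform memberships for the example
print(update_scaling(x, p_star, u, m=2.0))         # recovers approximately [1.2, 0.9, 1.0]
```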

4.3. ALGORITHM

The idea of the proposed algorithm is presented in Algorithm 1.



Algorithm 1 A pseudo-code for the proposed algorithm

function RECOGNIZE(Gesture G)
    Create prototype for each template                          ▷ PREPARING STAGE
    if FitType ≠ NoTransformation then                          ▷ FITTING STAGE
        repeat
            for all prototypes P do
                if P not converged then
                    PreviousMatrix ← matrix(P)
                    PreviousEvaluation ← MEANEVALUATION(P)
                    UPDATE(P, G)
                    CurrentEvaluation ← MEANEVALUATION(P)
                    if PreviousEvaluation ≥ CurrentEvaluation then
                        matrix(P) ← PreviousMatrix
                        P ⇒ converged
                    else if CurrentEvaluation − PreviousEvaluation ≤ ε then
                        P ⇒ converged
                    end if
                end if
            end for
        until all prototypes converged or max iterations reached
    end if
    for all prototypes P do                                     ▷ EVALUATION STAGE
        Evaluations(P) ← MINEVALUATION(P)
    end for
    return arg(max(Evaluations))
end function

The algorithm can be divided into three stages: preparing, fitting and evaluation.

Preparing stage
In the first stage of the algorithm, a prototype is created for each template. Initially the prototype is exactly the same as the template, so the scaling/affine matrix used later in the fitting stage is equal to the identity matrix. A single prototype is created for each template, and thus for each class.



Fitting stage
This is the most crucial and most complicated stage of the algorithm. In each iteration, the prototypes are changed to fit better to the input gesture. If the no-transformation fit type is used, this step is omitted. The fitting stage is an iterative procedure carried out separately for each prototype. The process repeated in each iteration for a single prototype is now explained. At the beginning of the process, the previous values of the scaling/affine matrix (called further the change matrix) are saved. The prototype is evaluated with the use of MeanEvaluation. The mean evaluation is the average membership value over all points of the prototype and the input gesture. It is calculated as (12), where j is the index of the current prototype, N is the number of points in the prototype, and uij is the membership value calculated with the use of (2):

meanEval_j = \frac{\sum_{i=1}^{N} u_{ij}}{N} \qquad (12)

The mean evaluation is used to obtain the average value of fit between the prototype and the input gesture. The next step of the process is to update the change matrix of the prototype. Depending on the chosen fit type, (10) or (11) is used. After the update step, the prototype is re-evaluated with the use of the same MeanEvaluation. Then:
• if the new evaluation is worse than the previous evaluation, the previously saved change matrix is restored and the prototype is considered convergent,
• if the difference between the new and previous evaluation is positive but less than a predefined value ε, the prototype is considered convergent.
If a prototype is considered convergent, the fitting process is not repeated for this prototype. Iterations of the fitting stage are repeated until all prototypes have converged or the maximum number of iterations is reached. Then the fitting stage is done.

Evaluation stage
The last stage of the algorithm is the evaluation stage. In this stage all prototypes are evaluated once again, but with a different evaluation method than in the fitting stage. In the fitting stage MeanEvaluation is used to check whether the prototypes are constantly being improved. In this stage MinEvaluation is used. It is calculated as (13), where j



is the index of the current prototype, N is the number of points in the prototype, and uij is the membership value calculated with the use of (2):

minEval_j = \min_{i \in \langle 1, N \rangle} (u_{ij}). \qquad (13)

This evaluation is used instead of the mean evaluation to eliminate the situation in which all but one of the joints are properly matched while the remaining one varies significantly. Our aim is to have all joints properly matched, and for that reason the minimum value seems to be a good choice. The result of the whole algorithm is the template connected with the prototype with the highest min evaluation. The presented algorithm is slightly simplified compared to the implemented one, as it does not take into consideration two additional matters. The first one is the parameter maximum change value, which bounds the maximum change in a prototype. It states that the absolute value of the difference in a single change matrix element (compared to its initial value) cannot be higher than a given parameter; if the difference is higher, it is limited to this value. The second matter is the parameter minimum objective value, which controls the detection of null-gestures. At the end of the whole procedure, if none of the evaluated prototypes reach a value higher than the given parameter, it is decided that a null-gesture occurred.
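To tie Eqs. (10), (12) and (13) together with Algorithm 1, a condensed, self-contained sketch of the recognition procedure (scaling fit type only) is given below. It is not the author's implementation: the function names, the ε and iteration limits, and the synthetic templates and gesture are all illustrative assumptions.

```python
import numpy as np

def update_scaling(x, p_star, u, m):
    """Diagonal of S_j from Eq. (10)."""
    w = (u ** m)[:, None]
    return (w * x * p_star).sum(axis=0) / (w * p_star ** 2).sum(axis=0)

def memberships(x, prototype, eta=1.0, m=2.0):
    """Eq. (2) with Euclidean point-to-point distances (joints are labelled)."""
    d = np.linalg.norm(x - prototype, axis=1)
    return 1.0 / (1.0 + (d ** 2 / eta) ** (1.0 / (m - 1.0)))

def recognize(gesture, templates, m=2.0, eps=1e-4, max_iter=50):
    """Return the index of the best matching template."""
    scores = []
    for p_star in templates:
        s = np.ones(3)                                          # identity scaling (preparing stage)
        prev = memberships(gesture, p_star * s, m=m).mean()     # mean evaluation, Eq. (12)
        for _ in range(max_iter):                               # fitting stage
            u = memberships(gesture, p_star * s, m=m)
            s_new = update_scaling(gesture, p_star, u, m)
            cur = memberships(gesture, p_star * s_new, m=m).mean()
            if cur <= prev:                                     # update made the fit worse: keep old matrix
                break
            s = s_new
            improvement, prev = cur - prev, cur
            if improvement <= eps:                              # negligible improvement: converged
                break
        scores.append(memberships(gesture, p_star * s, m=m).min())  # min evaluation, Eq. (13)
    return int(np.argmax(scores))

# Illustrative use: two random 15-joint templates; the gesture is a scaled copy of the first.
rng = np.random.default_rng(1)
templates = [rng.standard_normal((15, 3)), rng.standard_normal((15, 3))]
gesture = templates[0] * np.array([1.1, 0.95, 1.0])
print(recognize(gesture, templates))                            # expected: 0
```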

5. EXPERIMENT DESIGN

As previously stated, the main goal is to create an algorithm for body gesture recognition which fulfills a few criteria. Those criteria are flexibility and expandability. The goals of the experiments are:
• to check how well the implemented algorithm complies with the given criteria,
• to compare the accuracy of the algorithm with existing solutions in a real application.
To satisfy the above goals, various experiments need to be conducted. Details of all the investigated test cases are explained later in this section.

5.1. EVALUATION CRITERIA

Two different evaluation criteria are considered. They are effectiveness and accuracy.



Effectiveness
Effectiveness can also be understood as correctness. This criterion does not rely on the final result, which is the matched gesture. The measure considered in this criterion is the membership of a gesture to a template. This value is used because it carries more information about the decision-making process.

Accuracy
The accuracy criterion, as opposed to effectiveness, considers only the final results of the algorithm. For a single decision this value is binary - 1 if the algorithm makes the right decision and 0 otherwise. However, if a higher number of decisions is considered, accuracy can be calculated as (14), where ad is the accuracy of a single decision and N is the number of all decisions:

A = \frac{\sum_{d} a_d}{N} \qquad (14)

Accuracy can also be investigated separately for each gesture class. Not all evaluation criteria are considered in each experiment. The information about the criterion used in a single experiment is given later in this section, where the details of each experiment are provided.

5.2. DATA – GESTURES

There are two main gesture sets used in the experiments: the simple gestures set and the real gestures set. The first one is used in experiments examining the overall efficiency of the algorithm. The second gesture set is used in the comparison of the algorithm with existing solutions. This is because the goal is to compare the accuracy of the algorithms in a real application, which in this case means real gestures. The gestures, both those used as templates and those used as input gestures, were performed by a group of people. Data about the posture of all those people are presented in Table 1.



Table 1. Height and weight of people performing gestures in the experiments

Person       HUD  JUS  MAR  PRZ  SHO  TOM  ZIU
Height [cm]  186  164  163  196  180  183  160
Weight [kg]   74   58   49   80   67   72   52

Simple gestures
A kernel of the simple gestures set is presented in Figure 4. The kernel consists of four gestures: X, T, PSI and SLASH.

Fig. 4. Simple gestures: (a) X gesture, (b) T gesture, (c) PSI gesture, (d) SLASH gesture

Not all gestures from this set are used in each experiment. The information about which gestures are used in a single experiment is given where the details of each experiment are provided.



Real gestures
Real gestures, also called sport gestures, are an additional set of gestures introduced to examine the accuracy of the algorithms in a real application. The gestures from this set are presented in Figure 5. Each gesture represents one sport discipline. The idea is drawn from a popular sports game for the Microsoft XBOX. This kind of gesture has two advantages: firstly, they are not as unnatural as the previously used gestures and, secondly, they are more similar to each other.

Fig. 5. Sport gestures: (a) racing, (b) skiing, (c) boxing, (d) archery, (e) bowling

5.3. SCENARIOS

The experiments in this section are divided into two groups. The first group aims to accomplish the first goal presented in the previous part. In this paper, various experiments are conducted to check the flexibility of the algorithm. The remaining criterion - expandability - is not examined in the experiments; it follows from the nature of the algorithm and there is no room for testing it. The implemented algorithm requires only one performance of a gesture to be used as a template, which is the least effort that can be achieved. The second test group is devoted to comparing the accuracy of the algorithm and existing solutions. The experiments of both groups are described separately below.



5.3.1. Algorithm efficiency

This group of experiments is responsible for verifying whether the algorithm fulfills the flexibility criterion. The fuzziness of the input gestures is examined in all experiments in this group, as the gestures were not performed ideally. Additionally, the first experiment examines the issue of person independence. The parameters of the algorithm were chosen using a trial and error method. The only property of the algorithm considered in the following experiments is the fitting type. The evaluation criterion used in the experiments of this part is effectiveness.

Person independence
Person independence means that one person could create a template and another person should be properly recognized while performing the same gesture. Differences in body proportions should not affect the recognition process. To examine the influence of different fitting types on this issue, five people were asked to perform a gesture twice. The performance of each person is used as a template and is matched to the performances of that person and of the other people. The gestures of different people are used as a template in turn, so that each of them is used exactly once. This process is repeated for three different gestures to eliminate the risk of gesture-dependence. The gestures used for this experiment are the X, T and PSI gestures from the simple gestures set. The people that took part in this experiment are: HUD, JUS, MAR, TOM, ZIU.

Scaling fitting type
In this experiment the benefits of using the scaling fitting type are presented. For this purpose the same gesture is performed four times: once as a template and three times as an input gesture. Each gesture performance slightly differs from the others in a way that should be eliminated using the scaling fitting type. This process is repeated for two gestures to eliminate the risk of gesture-dependence. An example of different, fuzzy performances of the same gesture is presented in Figure 6. The gestures used for this experiment are the X and T gestures from the simple gestures set. For the X gesture the differences in performances are obtained as a different angle between arms and legs (in the X-Y plane). For the T gesture the differences in performances are obtained as a different angle between arms and torso (in the X-Z plane).

Affine transformation fitting type
This experiment is similar to the previous one; however, it is meant to show the benefits of using the affine transformation fitting type. As in the previous experiment, each gesture is performed four times: once as a template and three times as an input gesture. Each gesture performance slightly differs from the others in a way that should be eliminated using the affine transformation fitting



Fig. 6. An example of features for a template and three performances of an X gesture. Differences between the performances and the template should be negligibly small after the scaling fit stage

type. This process is repeated for two gestures to eliminate the risk of gesture-dependence. The gestures used for this experiment are the T and SLASH gestures from the simple gestures set. The differences in performances for the SLASH gesture are obtained as different shearing (in the X-Y plane), and for the T gesture as different rotation (about the Y axis).

5.3.2. Comparison with existing solutions

The second group of experiments is devoted to comparing the effectiveness of the implemented Template-based body gesture recognition algorithm (called further the template-based algorithm) and existing solutions using classifiers. The paper comparing the effectiveness of classifiers in the problem of body gesture recognition [5] states that for this purpose the best classifier is the Support Vector Machine (SVM) with a Gaussian kernel. Taking this information into account, the SVM with a Gaussian kernel is used as the representative of existing solutions in the experiment. The parameter values for the SVM are taken from [5], and those for the template-based algorithm were obtained using a trial-and-error method. The fitting type used for the template-based algorithm is the affine transformation, as it is the most flexible. The main goal of this group of experiments is to simulate a real application, which in this case means real-life gestures. For this reason the real gestures set, presented in Section


5.2, is used. The advantages of this kind of gestures are also described in that section. In this group of experiments only accuracy is used as the evaluation criterion. For the SVM classifier, accuracy is calculated as (15), where TP(c) is the number of true positives for each class - that is, the number of properly recognized performances - and N is the size of the dataset:

A = \frac{\sum_{c} TP(c)}{N}. \qquad (15)

To measure the effectiveness of the SVM classifier the Knime [1] application is used. To use the SVM classifier, two datasets are required: a training dataset used for training the classifier and a test dataset used for testing its effectiveness. The datasets used in the following experiments are:
• Training dataset consisting of 250 samples = 5 people * 5 gestures * 10 performances. The people that took part in the creation of this dataset are: JUS, MAR, PRZ, SHO, TOM.
• Test dataset consisting of 50 samples = 2 people * 5 gestures * 5 performances. The people that took part in the creation of this dataset are: HUD, ZIU.
One important remark is that the performances were done separately, to achieve the biggest natural differences. Examples of extracted joints for the same gesture performed by different people are presented in Figure 7. It can be seen that performances done by different people vary significantly. Comparison of these two algorithms is hard because they have very different specificity of input data. For this reason two separate experiments were conducted: one with data of the specificity designed for the SVM and a second with data of the specificity designed for the template-based algorithm.



Fig. 7. The same gesture - boxing - performed by different people: (a) JUS, (b) MAR, (c) PRZ

SVM specified data
To achieve the best performance, classifiers should use the biggest and most varied training dataset possible. For this reason, in the first experiment the whole training dataset is used to train the SVM classifier. In contrast, the template-based algorithm works with only one performance of a gesture used as a template. Because of that, there is no way to use the whole training dataset in the algorithm. For this reason one randomly chosen sample from the training dataset is used as the template for each gesture class. To eliminate the risk of performance-dependence, the process is repeated for 5 randomly chosen performances of each gesture. Accuracy is calculated separately for each repetition and the mean value is used as the output accuracy. The test dataset is used for the SVM to calculate accuracy. For the template-based solution the same test dataset is used as the set of input gestures.

Template-based specified data
As previously stated, the template-based algorithm uses only one performance as a template. This performance should be an ideal representative of the whole class. For this reason the randomly chosen, fuzzy performances used in



the previous experiments are not a good choice for this algorithm, so an additional set of ideal performances done by a single person (ZIU) is used. To minimize the impact of randomness, five different sets are used and the results are averaged. In this experiment the SVM classifier uses only one performance of a single gesture as a training set, just like the template-based algorithm. As input gestures for both approaches, fuzzy performances from the training dataset are used.

6. RESULTS

In this section, the results of the computational experiments are provided according to the defined objectives, data and scenarios.

6.1. ALGORITHM EFFICIENCY

Person and distance independence
As previously stated in Section 5.3.1, for this experiment each person performed three gestures twice. One performance is used as a template and the second one as an input gesture. Each person is matched to every other person (including themselves) for all three gestures, but the membership values for different gestures are averaged. The results of matching for the different fitting types are presented in Figure 8. The difference in matching a person to other people, compared with matching to themselves, is clearly visible. The highest difference occurs when no transformation fitting stage is used, while for the affine transformation the difference is small. This leads to the conclusion that the feature extraction method described in Section 4.1 is not sufficient on its own, even though it contains normalization introduced to counteract this problem. The difference in membership values for each fit type is better presented in Figure 9, where the result values are averaged. Another interesting observation is that matching is very person-dependent - some people are matched well to others and some are properly matched only to themselves. However, this phenomenon seems to be unrelated to the body proportions of a single person, which can be seen by looking at the values in Table 1.



Fig. 8. Membership values of input gestures for each person to other people with use of different fitting types: (a) no transformation, (b) scaling, (c) affine transformation



Fig. 9. Mean of membership values for matching person to itself and to the other people

Scaling fitting type
In this experiment the advantage of using the scaling fitting type is presented. Two gestures, X and T, are performed with modifications which should be compensated for using only scaling. The results are presented in Figure 10. In the figure the orange column corresponds to the performance of the input gesture that is the most similar to the template gesture. It is clearly visible that for no transformation the matching is worse for the modified gestures and good for the similar gesture. However, scaling is sufficient to eliminate the differences in membership values. Using the affine transformation fitting type has no significant impact compared to using the scaling fitting type.

Affine transformation fitting type
The next experiment is similar to the previous one; however, it is meant to show the advantages of using the affine transformation. The results are presented in Figure 11. Again it is clearly visible that for the no-transformation as well as for the scaling fitting type the differences in matching for different performances are huge. The advantage of using the affine transformation is also clearly visible, as those differences are not significant with the affine transformation fitting type.



Fig. 10. Membership values for input gestures with modifications for each fitting type: (a) X gesture, (b) T gesture



Fig. 11. Membership values for input gestures with modifications for each fitting type: (a) SLASH gesture, (b) T gesture

After analysing the results of the experiments, it can be stated that the fuzziness criterion is fulfilled best with the use of the affine transformation fitting type; however, all experiments used very simple gestures. The affine transformation fitting type could compensate for many fuzzy changes in gesture performance.

6.2. COMPARING WITH EXISTING SOLUTIONS

The second group of experiments concerns effectiveness in a real application. In this group the implemented algorithm is compared, in a realistic application, with the SVM classifier as the representative of existing solutions.



Table 2. Accuracy for SVM and Template-based algorithm

            SVM    Template-based
Accuracy    100%   81.6%

Table 2 presents the overall results of both approaches in the experiment with data specified for classifiers. It can be seen that the SVM classifier achieves 100% accuracy on the created verification set. This is a surprisingly high value given the similarity of the different classes and the dissimilarity of the performances. The implemented algorithm achieves lower performance; the details are presented in Figure 12. As can be seen, the accuracy varies between repetitions. In each repetition a different set of templates is used, so it can be concluded that the result is very template-dependent. It is also gesture-dependent, because each gesture has a different level of accuracy.



Fig. 12. Accuracy of the implemented algorithm for each iteration and single gesture: (a) accuracy per each repetition, (b) accuracy per single gesture

6.3. TEMPLATE-BASED SPECIFIED DATA

As previously stated, the specificity of the training data for the two algorithms is different. In the previous experiments the training set for the SVM classifier consisted of many repetitions of a gesture, thus it carried a greater amount of information. Moreover, the specificity of the template-based approach requires ideal representatives used as input templates, not fuzzy performances like in the previous experiment. For this reason, in this experiment both algorithms use the same training data. The results of five repetitions and the averaged result for both algorithms are presented in Figure 13.



Fig. 13. Accuracy of implemented algorithm and SVM classifier for each iteration

It can be seen that in this case the implemented algorithm has better accuracy. Only in one repetition did it give a worse result than the SVM. However, the differences are small - only for one repetition does the template-based algorithm achieve significantly higher accuracy. A further observation is that in this experiment the differences in the effectiveness of the template-based algorithm are not large, unlike in the previous experiment. It can be concluded that for ideal performances used as templates the algorithm always has similar accuracy; this is not true in the case of the SVM classifier. It can be stated that the results of the template-based algorithm are more reliable, as we can always expect similar accuracy no matter which gestures are used as templates. In the experiment with data specified for the classifiers, the SVM proved to be better, achieving 100% accuracy, which is a very good result. Nevertheless, it should be expected, as it uses a very big and varied training dataset. When the training data for both approaches were equalized, the template-based algorithm proved to be better; however, the differences in accuracy are not big. Nevertheless, an important point is that the results for the template-based approach do not vary significantly over different repetitions, thus it seems to be more reliable. Another factor which influences the recognition process is the fact that, when a gesture is more complicated, each person may understand it slightly differently. Sometimes it is hard even for a human to distinguish some performances of one gesture from other gestures. In this case the boxing and racing templates are perfect examples.



7. CONCLUSIONS

As previously stated, the subject of body gesture recognition is nowadays becoming more popular. This is because its two main fields of applicability, Natural User Interfaces and Human Robot Interaction, are also getting more popular. The main problem with the majority of existing solutions, which use classifiers, is a low level of expandability. This means that creating a new gesture set or expanding an existing one requires huge effort. The presented template-based approach is an answer to this problem, as it requires only a single performance of a gesture to be used as a template. Nevertheless, expandability is not the only criterion which the algorithm should fulfill. It is assumed that a good algorithm for gesture recognition should also fulfill another criterion, which is flexibility. The results of the experiments show that, especially with the use of the affine transformation, many recognition problems can be minimized. It can be concluded that the algorithm is insensitive to person differences in the sense of different body proportions. Of course, for more complicated gestures there always remains the problem of different interpretations of a gesture by different people. A high level of fuzziness is handled as well, as with the use of the scaling or affine transformation fitting type fuzzy gestures can easily be fitted to the original template. In other words, the flexibility criterion is very well satisfied. As one of the biggest motivations was to address the problems of existing solutions, comparative studies also needed to be done. However, in this case the comparison is hard. To compare two algorithms they should use the same data, to eliminate the issue of data-dependence. This is a problem because the specificity of the training data for the two approaches is very different, as previously explained. For this reason two different experiments were conducted - one with data specified for the classifiers and one with data specified for the template-based solution. In the experiments with data prepared for the classifiers, the SVM achieved 100% accuracy and the implemented algorithm about 80%. However, this is not surprising, because the SVM classifier works with a greater amount of information. More evidence is given by the second experiment, in which both approaches used the same training and test data. Here the Template-based body gesture recognition algorithm proved to be slightly better and more reliable. As in many other areas, the choice of solution depends on the application requirements. If a dynamic algorithm is required, for which expanding the gesture set is very easy, the implemented algorithm seems to be a good choice. However, if there is no need for dynamic properties and very high accuracy is needed, classifiers seem to be a better choice. An important remark is that the implemented algorithm was not compared



with classifiers with respect to other evaluation criteria, such as time performance, so the subject needs further examination. There is a lot of space for further investigation and improvement of the algorithm. Firstly, various distance measures can be used, as [11] claims that this choice is crucial. Obviously, different methods of feature extraction can also be tested. Nevertheless, the most crucial element seems to be the evaluation method - the value of the membership of a prototype to an input gesture. Different evaluations can be investigated, and different objective functions can be used instead. An example of such a function can be one that combines the similarity of a prototype and a gesture (the membership value) with the amount of change (the differences in the scaling/affine transformation matrices).

REFERENCES
[1] KNIME.com AG, KNIME: Konstanz Information Miner application, http://www.knime.org/, 2004-2014.
[2] BANDYOPADHYAY S., Unsupervised classification: similarity measures, classical and metaheuristic approaches, and applications. Springer, Berlin, New York, 2013.
[3] GU J., DING X., WANG S. AND WU Y., Action and gait recognition from recovered 3-d human joints. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 40(4):1021–1033, 2010.
[4] KRISHNAPURAM R. AND KELLER J.M., A possibilistic approach to clustering. IEEE Transactions on Fuzzy Systems 1(2):98–110, 1993.
[5] KULIŃSKA J., Comparing effectiveness of classifiers in problem of body gesture recognition. COMPUTER SYSTEMS ENGINEERING - THEORY & APPLICATIONS, pages 147-164, 2014.
[6] MIRANDA L., VIEIRA T., MARTINEZ D., LEWINER T., VIEIRA A.W. AND CAMPOS M.F.M., Real-time gesture recognition from depth data through key poses learning and decision forests. In Conference on Graphics, Patterns and Images (SIBGRAPI), 2012 25th SIBGRAPI, pages 268–275, Aug 2012.
[7] OpenNI, OpenNI: Open Natural Interaction library. https://github.com/OpenNI/OpenNI, 2010-2014.
[8] PATSADU O., NUKOOLKIT C. AND WATANAPA B., Human gesture recognition using kinect camera. In Computer Science and Software Engineering (JCSSE), 2012 International Joint Conference on, pages 28–32, May 2012.
[9] PrimeSense, NiTE: Natural Interaction Technology for End-user library. N.A. - project closed, 2010-2014.
[10] WANG T., Possibilistic shell clustering of template-based shapes. IEEE Transactions on Fuzzy Systems 17(4):777–793, 2009.



[11] WANG T., Template-based shell clustering using a line-segment representation of data. IEEE Transactions on Fuzzy Systems 19(3):575–580, 2011.
[12] WU Y. AND HUANG T.S., Vision-based gesture recognition: A review. In Gesture-based communication in human-computer interaction, pp. 103–115. Springer, 1999.
[13] YUN K., HONORIO J., CHATTOPADHYAY D., BERG T.L. AND SAMARAS D., Two-person interaction detection using body-pose features and multiple instance learning. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on, pp. 28–35, June 2012.



Computer Systems Engineering 2014
Keywords: normalization, gas usage, degree day, simulation models

Michal LASZKIEWICZ* Tomasz LEWOC

NORMALIZATION OF GAS CONSUMPTION IN BUILDINGS

To find the best possible estimation of future gas consumption in a building, normalization that accounts for weather-related parameters can be used. This paper describes a gas consumption–simulating algorithm, designed by the author, which uses general weather data, degree days and historical data about gas consumption collected by the company Porta Capena. The implementation of the algorithm itself, as well as that of a testing environment used to evaluate it, are covered. Experiments were carried out for four buildings located in Belgium and the Netherlands. The results were compared to those obtained using older, standard algorithms. The newly-created algorithm was found to perform better than the older ones.

1. INTRODUCTION

Renewable energy sources are becoming a more and more popular topic, discussed in the press, on television, at conferences and even at meetings at the government level. At present, however, it’s hard to imagine completely ceasing to rely on conventional, non-renewable sources. Indeed, a great many buildings are currently heated using gas, which costs their owners thousands of Euros per year. Realizing the sheer size of this spending is one of the first steps to reducing it. This paper covers the design and implementation of a new algorithm and testing environment used for prediction and normalization of gas consumption in residential and office buildings. The solutions presented enable prediction of future costs and detailed comparisons of different buildings, as well as comparisons of different periods of a given building’s use. Such knowledge can prove useful when planning future investments and managing existing infrastructure and installations [1]. This paper covers:
● the goals of the research performed;

Department of Systems and Computer Networks, Wroclaw University of Technology, Poland



● a brief theoretical introduction;
● the data acquisition process;
● the mathematical models designed and used;
● the plan and results of the experiments performed, which is the most important part of the work;
● conclusions;
in that order. Special thanks to dr Adriaan Brebels for consulting and providing anonymous data from the EcoSCADA service.

2. RESEARCH GOALS The goal was to design and implement an algorithm that can normalize gas consumption in any given building. The algorithm was to take historical data as input and generate results in a timely manner, to allow for possible implementation in a web app where long waiting times would be unacceptable. In addition, a separate "data scan" algorithm, used to scan historical data and automatically suggest optimal simulation parameters, needed to be designed and implemented.

3. THEORETICAL INTRODUCTION Most models used for normalization of gas consumption use the abstract unit of degree days, which is used to state "how much" a building should be heated or cooled. Two types of degree days are commonly used [2-3]:
● Heating degree days (HDD) – used when the outdoor temperature is lower than the indoor (base) temperature, in which case heating should be turned on and gas consumed;
● Cooling degree days (CDD) – used in the opposite scenario, in which case air conditioning is usually switched on and electrical energy consumed.
Both types of degree days are, as the name implies, calculated for an entire day. This property means that it is easy to add up degree days to obtain values for longer periods, such as weeks, months or years [2]. The outdoor temperature at which the heating system in a building should turn on to guarantee a comfortable indoor temperature is called the base temperature. It is the most important parameter when determining the number of degree days. It should be kept in mind that the base temperature often differs from the final desired temperature in a building (e.g. in a building where 20°C is desired, the base temperature might be around 17°C). This difference is caused by a heating system's inertia, i.e. after the heating is turned off, the building will continue to be heated for a time as the now-inactive


heating system's components cool off. Some heat is also emitted by people and non-heating devices, such as computers. Pre-defined lists of guideline base temperatures for various building types exist. For example, in Belgium a standard base temperature is 10°C for warehouses, 15°C for libraries, offices and supermarkets, and a much higher 28°C for swimming pools [2]. While such guideline values are useful, it is possible, and preferable, to compute a building-specific base temperature. This is an optimization problem. For a given candidate base temperature, a point chart is drawn on which the X axis is the number of degree days for that base temperature in a specified time period and the Y axis is the gas consumption. Using linear regression, a linear function that best fits the points is found. Different base temperatures will result in different values of R², a goodness-of-fit measure. The goal is to find the base temperature that maximizes R². See Fig. 1 for an example.

Fig. 1. Sample chart used to find best fit base temperature
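A minimal sketch of this base-temperature search is given below. It assumes daily gas readings and daily mean outdoor temperatures are available (using, for simplicity, the Mean Daily Temperature degree-day method described later); the candidate range, step and function names are illustrative and are not those used in the experiments.

```python
import numpy as np

def hdd(mean_temps, base):
    """Daily heating degree days via the Mean Daily Temperature method."""
    return np.maximum(base - np.asarray(mean_temps, dtype=float), 0.0)

def best_base_temperature(daily_mean_temp, daily_gas, candidates=np.arange(10.0, 22.5, 0.5)):
    """Return the candidate base temperature whose degree days best explain gas use (max R^2)."""
    y = np.asarray(daily_gas, dtype=float)
    best_base, best_r2 = None, -np.inf
    for base in candidates:
        x = hdd(daily_mean_temp, base)
        A = np.vstack([x, np.ones_like(x)]).T          # fit y ≈ a*x + b by least squares
        (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
        y_hat = a * x + b
        r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
        if r2 > best_r2:
            best_base, best_r2 = base, r2
    return best_base, best_r2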

The point of intersection of the approximated linear function with the Y axis is also of interest. It defines the so-called baseload, which is the amount of energy used for purposes independent of the weather, such as preparing food. It should be excluded from normalization involving degree days [3]. Degree days describe the total temperature deviation from the base temperature in a given time period. In a simplified, theoretical example, if for two days the outdoor temperature is 2°C lower than the base temperature, the total temperature deviation is 4 HDD (2°C * 2 days = 4 HDD). Of course, a real scenario is never so simple, since the temperature is always changing [8]. Several methods of calculating HDD are used in practice:


● Integration Method – uses data from a large number of measurements taken in a given time period and iteratively computes the consecutive components of the final sum;
● Meteorological Office Method – allows a fairly accurate approximation of a daily degree-day value using a computed or provided minimum and maximum outdoor temperature from a given time period;
● Mean Daily Temperature – a basic method based solely on the base temperature and an average outdoor temperature from a given time period;
● Hitchin's formula – uses statistical data and is characterized by not needing data about outdoor temperature changes to be gathered.
All of the above except Hitchin's formula were used, with standard formulas [3], during the experiment covered by this paper. However, since the goal was to create and test a new normalization model, these standard methods were supplemented with data acquired directly from weather stations, both absolute and adjusted for wind chill. This nonstandard approach enabled weather conditions to better influence the approximation of future gas consumption.
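The sketch below illustrates two of these methods under simplifying assumptions: an integration-style accumulation over frequent readings and the Mean Daily Temperature method. The exact formulas used in the experiments follow the standard definitions in [3], which are not reproduced here.

```python
import numpy as np

def hdd_integration(temps, base, interval_hours=1.0):
    """Integration-style HDD: accumulate positive deviations below the base
    temperature over frequent readings and express the result in degree days."""
    deviations = np.maximum(base - np.asarray(temps, dtype=float), 0.0)
    return float(np.sum(deviations) * interval_hours / 24.0)

def hdd_mean_daily(t_mean, base):
    """Mean Daily Temperature method: a single (daily mean) reading per day."""
    return max(base - t_mean, 0.0)
```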

4. MODELS TESTED Five algorithms were used in the experiment covered by this paper. Three of them are common existing algorithms [2-7], two were custom-made as part of the experiment (both are modifications of existing algorithms). All except the first one involve finding the coefficients of the linear function that gives the estimated gas consumption for a given amount of degree days (see Fig. 1). 4.1 NO NORMALIZATION ALGORITHM (NR)

This algorithm is not actually a method of gas consumption normalization. It assumes that a building will consume exactly as much gas in a given time period as it did in the previous one. Of course, this solution will not work well unless the weather at the building's location changes very little over time. 4.2 NO BASELOAD REGRESSION ALGORITHM (NBR)

This algorithm employs normalization against degree days for a given base temperature, with the assumption that gas is used solely to heat the building. The normalization parameters are computed by using linear regression on a point chart similar to that used to find the base temperature, except that the point of intersection with the Y axis is always at (0,0), i.e. the least significant coefficient of the linear function is always zero. The other coefficient becomes the normalizing parameter.
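A minimal sketch of this through-origin regression, assuming per-period degree-day and gas-consumption series are already available; the function names are illustrative and not taken from the implementation described in the paper.

```python
import numpy as np

def nbr_parameter(degree_days, gas_used):
    """Least-squares slope of gas consumption vs. degree days with the
    intercept forced to zero (regression line through the origin)."""
    x = np.asarray(degree_days, dtype=float)
    y = np.asarray(gas_used, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))

def nbr_predict(slope, future_degree_days):
    """Normalized (projected) consumption for a future period."""
    return slope * float(np.sum(future_degree_days))
```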



4.3 SIMPLIFIED NO BASELOAD REGRESSION ALGORITHM (SNBR)

A variant of the previous algorithm with only two points taken into account: (0,0) and the point whose coordinates are total degree days (x) and total gas consumption (y). This means that the normalization parameter can be found by simply dividing total gas consumption by total degree days. 4.4 FULL REGRESSION ALGORITHM (FR)

This differs from NBR in that it assumes that not all of the gas consumption is to be normalized against weather conditions. This means both coefficients of the linear function mentioned in chapter 4.2 are taken into account when calculating the projected gas consumption. When using this algorithm, it is important to compute an accurate base temperature for the building; otherwise the estimated gas consumption for warm days may end up being negative. 4.5 MODIFIED FULL REGRESSION ALGORITHM (MFR)

This is a custom variant of the previous algorithm. The most significant change is the introduction of three base temperatures for a building instead of one. For the buildings used in the experiment, which have set opening and closing times, the three base temperatures were:
● the base temperature for the building when it is open,
● the base temperature for the building on non-business days,
● the base temperature for the building on business days after closing time.
This change requires a shift from a daily degree-day sum to per-hour values called "degree hours". Gas consumption was split into three separate values for the building's "open" hours, "closed" hours and non-working days. For each of these values, an appropriate base temperature was found using the method described in chapter 3. In addition, a new method was used for estimating the baseload, i.e. the portion of daily gas consumption that should be excluded from normalization. Besides using the intersection point of the linear function with the Y axis, a simple additional "data scan" algorithm was implemented that computes a suggested daily baseload by analyzing historical data.
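A rough sketch of the degree-hour split used by this idea, assuming each hour is already labelled as "open", "closed" or "holiday"; the labels, base temperatures and function names are hypothetical.

```python
def degree_hours_by_period(hourly_temp, period_labels, base_temps):
    """Accumulate degree hours separately per period type, each period type
    having its own base temperature (three base temperatures instead of one)."""
    totals = {period: 0.0 for period in base_temps}
    for temp, period in zip(hourly_temp, period_labels):
        totals[period] += max(base_temps[period] - temp, 0.0)
    return totals

# Hypothetical usage with illustrative base temperatures:
# degree_hours_by_period(temps, labels, {"open": 18.0, "closed": 15.0, "holiday": 13.0})
```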

5. DATA ACQUISITION AND TESTING ENVIRONMENT A very important step in the experiment was acquiring input data. Two sources were used: the database of the EcoSCADA service operated by Porta Capena and


weather data from the Weather Underground service [9]. The data collection process was different for each of these services. From EcoSCADA, historical data on gas consumption in the relevant buildings (from the years 2012 and 2013) was acquired, in addition to information about the buildings' guideline base temperatures and opening times, as well as official degree-day values for nearby weather stations. The testing environment was implemented as a web app, equipped with modules that enable communication with the database, presentation of the acquired data and calculation of the actual gas consumption projections – the core of this experiment – using all five models described in chapter 4.

6. THE EXPERIMENT The experiment consisted of three parts with different scenarios. In the first scenario, the two "no baseload" normalization algorithms, described in chapters 4.2 and 4.3, were tested with two different base temperatures: a guideline value and a value suggested by the secondary data scan algorithm mentioned in chapter 4.5. The goal was to check whether it is possible to find a base temperature noticeably better than the guideline for any of the buildings. The second scenario tested the new algorithm described in chapter 4.5 with various base temperatures, chosen iteratively, and various baseload values, generated by scanning historical data using the "data scan" algorithm. The goal was to evaluate the performance of both the normalization algorithm and the "data scan" algorithm. In the third scenario, all five algorithms were tested with the best possible parameters, chosen as in the previous scenario. The goal was to compare the accuracy of their estimations.

7. INTERPRETATION OF THE RESULTS The results are presented below in tables. Only relative errors are given for most values, since they are the most important measure of an algorithm's performance. From the results of the first scenario (Table 1), it is apparent that simple analysis of historical data using the secondary algorithm enabled the selection of base temperatures significantly better than the guidelines, as evidenced by the increased accuracy of the estimations.



Table 1. Comparison of prediction accuracy achieved using guideline base temperatures and new base temperatures provided by data scanning from 2012

               Old base temperature                     New base temperature
Building       Temp. [°C]   Relative error [%]          Temperature [°C]      Relative error [%]
                            NBR       S.NBR             NBR       S.NBR       NBR       S.NBR
Amersfoort     16,5         3,7       5,0               15        16          0,7       5,5
Eindhooven     18           9,5       18,4              16,5      13          11,8      7,7
Vorselaar      18           5,9       1,4               15        16          1,0       0,3
Lieren         18           16,7      7,0               13        18          2,7       7,0
Average error:              8,95      7,95                                    4,05      5,13

The first step of the second scenario was to use the "data scan" algorithm to find, for each building, three optimal base temperatures (Table 2) for use with the new algorithm described in chapter 4.5.

Table 2. Base temperatures proposed by the "data scan" algorithm

               Base temperature [°C] in a period
Building       Working hours   After working hours   Holidays
Amersfoort     28              21                    21
Eindhooven     13              13                    13
Vorselaar      21              21                    21
Lieren         18              18                    21



The actual gas consumption estimations were then computed for each building. 512 simulations per building were carried out to cover all possible combinations of base temperatures. Below (Tables 3-6) is a comparison of prediction accuracy achieved for each building using the best possible combination of base temperatures (as found by the 512 simulations) and the base temperatures suggested by the "data scan" algorithm.

Table 3. Results for scenario 2 - Amersfoort

Experiment                          T1 [°C]   T2 [°C]   T3 [°C]   Relative error [%]
Temperature T1,T2,T3   Estimated    28        21        21        3,8
                       Best         28        28        28        3,7
Wind chill T1,T2,T3    Estimated    28        21        21        4,9
                       Best         28        28        28        4,8

The "data scan" algorithm performed fairly well for the Amersfoort building, with the base temperatures it suggested resulting in accuracy very similar to that achieved using the best base temperatures. Adjusting degree days for wind chill resulted in lowered accuracy.

Table 4. Results for scenario 2 - Eindhooven

Experiment                          T1 [°C]   T2 [°C]   T3 [°C]   Relative error [%]
Temperature T1,T2,T3   Estimated    13        13        13        13,4
                       Best         10        10        10        12,4
Wind chill T1,T2,T3    Estimated    13        13        13        12,2
                       Best         10        10        10        11,0

Eindhooven was the only building for which adjusting degree days for wind chill was beneficial.

Table 5. Results for scenario 2 - Vorselaar

Experiment                          T1 [°C]   T2 [°C]   T3 [°C]   Relative error [%]
Temperature T1,T2,T3   Estimated    21        21        21        1,9
                       Best         28        28        28        1,4
Wind chill T1,T2,T3    Estimated    21        21        21        3,1
                       Best         28        28        28        2,5


For the Vorselaar building, the new algorithm worked very well. The estimated gas consumption values had a very small relative error, which confirms that the data gathered about this building is of very good quality.

Table 6. Results for scenario 2 - Lieren

Experiment                          T1 [°C]   T2 [°C]   T3 [°C]   Relative error [%]
Temperature T1,T2,T3   Estimated    18        18        21        12,5
                       Best         28        28        28        11,6
Wind chill T1,T2,T3    Estimated    18        18        21        14,6
                       Best         28        28        28        12,9

The accuracy of estimations for this building was relatively poor, comparable to that for the Eindhooven building. Unlike Eindhooven, however, Lieren didn't benefit from adjusting degree days for wind chill. The last test performed was a comparison of the best results of each algorithm, with the aim of determining the best algorithm (Tables 7-8).

Table 7. Comparison of all algorithms – relative error

Building       NR [%]   NBR [%]   S.NBR [%]   FR [%]   MFR [%]
Amersfoort     2,3      0,7       0,4         3,6      3,7
Eindhooven     23,1     5,5       4,8         10,8     11,0
Vorselaar      1,0      0,3       0,2         0,0      1,4
Lieren         7,0      0,6       3,5         7,6      11,6

It is apparent that the accuracy of the algorithms varies greatly from building to building. As such, it is hard to identify a single best algorithm. However, algorithms that ignore baseload were generally more accurate than the others.

Table 8. Comparison of all algorithms – relative error without modulus applied

Building       NR [%]   NBR [%]   S.NBR [%]   REG [%]   NA [%]
Amersfoort     -2,7     -28,4     -26,3       -16,0     -17,4
Eindhooven     -12,7    -20,2     -7,2        30,6      -31,7
Vorselaar      -5,0     -20,4     -9,0        -8,8      -6,2
Lieren         -4,8     -0,6      -9,9        12,8      16,1


The above table shows that in most simulations the gas consumption estimations were too low. This suggests that it may be possible to improve the algorithms by adding another parameter or by modifying the method used to find the base temperature.

8. FINAL REMARKS The experiment can be considered a success, with all of its parts having been carried out and the newly created algorithm performing adequately. More specifically, it outperformed older, standard algorithms in cases where the available data about a building was of good quality. This means there is much potential in future improvement of data processing methods. The part of the work that can most readily be used in real commercial applications is the secondary "data scan" algorithm, which can automatically select appropriate parameters for the normalization algorithms. In some cases this allows improvement of prediction accuracy without changing the actual algorithm.

REFERENCES
[1] LASZKIEWICZ M., Normalization of gas consumption in a building. Algorithms and experimental studies, Wroclaw University of Technology, Wroclaw, Poland, 2014.
[2] BREBELS A., Normalization for outsider temperature, KU Leuven, Geel, 2012.
[3] CIBSE, Degree-days: theory and application, The Chartered Institution of Building Services Engineers, London, 2006, pp. 1-23.
[4] SPINIONI J., VOGT J. and BARBOSA P., European degree-day climatologies and trends for the period 1951-2011, International Journal of Climatology, 2014.
[5] Building Energy Research Group, Relationship between annual mean temperature and degree-days, Energy and Buildings, 2012.
[6] SOLDO B., Forecasting natural gas consumption, Part of Croatian Electrical Company Group, 2011.
[7] ETHO J., On using degree days to account for effects of weather on annual energy use in office buildings, Energy and Buildings, 12/1988, pp. 113-127.
[8] http://www.degreedays.net (accessed 2014).
[9] http://polish.wunderground.com (accessed 2014).



Computer Systems Engineering 2014 Keywords: utility–service provision, infrastructure, household, graph theory, modelling, simulation

Anna STRZELECKA∗ Tomasz JANUS∗ Leticia OZAWA-MEIDA† Bogumil ULANICKI∗ Piotr SKWORCOW∗

MODELLING OF UTILITY–SERVICE PROVISION

Utility–service provision is a process in which products are transformed by appropriate devices into services satisfying human needs and wants. Utility products required for these transformations are usually delivered to households via separate infrastructures, i.e. physical networks such as, e.g. electricity grids and water distribution systems. However, provision of utility products in appropriate quantities does not itself guarantee that the required services will be delivered because the needs satisfaction task requires not only utility products but also fully functional devices. In this paper utility–service provision within a household is modelled with a directed hypergraph in which products and services are represented with nodes whilst devices are hyperedges spanning between them. Since devices usually connect more than two nodes, a standard graph would not suffice to describe utility–service provision problem and therefore a hypergraph was chosen as a more appropriate representation of the system. This paper first aims to investigate the properties of hypergraphs, such as cardinality of nodes, betweenness, degree distribution, etc. Additionally, it shows how these properties can be used while solving and optimizing the utility–service provision problem, i.e. constructing a so-called transformation graph. The transformation graph is a standard graph in which nodes represent the devices, storages for products, and services, while edges represent the product or service carriers. Construction of different transformation graphs applied to a defined utility–service provision problem is presented in the paper to show how the methodology is applied to generate possible solutions to provision of services to households under given local conditions, requirements and constraints. Water Software Systems, De Montfort University, Leicester, United Kingdom, e-mail: anna.strzelecka@dmu.ac.uk Institute of Energy and Sustainable Development, De Montfort University, Leicester, United Kingdom



1. INTRODUCTION Utility–service provision is a process in which utility products such as water, electricity, gas, food, etc. are delivered to households to satisfy basic human needs, such as nutrition, an adequate quantity and quality of water and thermal comfort, and wants, such as leisure. These products can either be delivered via infrastructure or produced locally, e.g. electricity can be produced from sunlight using solar panels, or water can be harvested from rainwater and subsequently treated to drinking water standards [11]. Utility–service provision problems focus on the delivery of different utility products to households and then the conversion of these utility products, using different devices, into services in order to satisfy basic human needs and wants. This task is very complex as its success depends on multiple factors, such as the proper functioning of devices and of the infrastructures delivering the required utility products. Failure of one or more of these components might deprive the users of a product, and thus of a service, for an unknown amount of time. A typical utility infrastructure is hierarchical and can be divided into households, communities, districts and cities. A household is a component of a community, the community can be a part of a district in a city, whilst cities can be regarded as hubs in a country-wide network. Each level of this hierarchy has its own structure and distinct operational properties. The overall behaviour of the cities and districts emerges from the combined behaviour of single households [9]. Since it is crucial to understand the behaviour of the individual elements of such a system, as well as the interactions of these individual elements with one another when scaling up [1], this paper is focused on an analysis of single households. The mapping between products, services and devices in the utility–service provision task is best represented with a directed graph. A standard directed graph restricts the user in providing a complete description of the system under investigation [3], because standard graphs provide only a one-to-one mapping between nodes and edges, while in utility–service provision more than one product (represented as a node) can be transformed (with the transformation represented with an edge) into another product or products, or into services. Therefore, for the description of our system we instead implemented a directed hypergraph, in which products and services are nodes whilst devices are hyperedges spanning between them. As mentioned above, a device typically has many inputs and outputs, and it connects more than two nodes. The main objective of this paper is to represent utility–service provision as a directed hypergraph and analyse its statistical properties such as degree distribution, path lengths, cardinality of nodes, etc. Such a utility–service provision hypergraph contains all available, i.e. possible to use, devices. These statistical measures can later be used to help in



solving individual utility–service provision problems under given constraints such as the availability of products and devices and the required services to be provided, [13]. While the overall utility–service provision hypergraph is later referred to as mastergraph the individual case-study utility–service provision for a specified household will be called a transformation graph. Therefore, the transformation graph is a sub-graph of the mastergraph [14]. For the purpose of further quantitative calculations and optimisation the transformation graph is however later converted into a standard graph in which nodes represent devices, and storage for products and services, while edges are product or service carriers. As the number of devices in the mastergraph is rather high this paper considers a simplified utility–service provision case in which the number of devices, products and services have been reduced. This simplified mastergraph is represented as a hypergraph (as described above) and its statistical properties are subsequently analysed. Moreover, different transformation graphs are then constructed from this simplified mastergraph to highlight how this methodology can be applied to real-life utility–service provision problem. The paper is structured as follows: Section 2 provides a theoretical introduction to graph theory followed by Section 3 that describes an approach to model utility–service provision with a directed hypergraph. In particular Section 4 describes a simplified utility–service provision example and its hypergraph description while Section 5 analyses the properties of the master graph considering all utilities, devices and services stored in the database. The paper concludes in Section 6 with potential future research directions.

2. BASIC DEFINITIONS A topological structure of a network can be represented as an undirected or directed standard graph G(V, E) where V = {v1 , v2 , ..., vn } denotes a set of vertices/nodes, also called vertices or components, and E = {e1 , e2 , ..., em } denotes a set of edges, also called links or lines, where n, m ∈ N. An edge eij is defined as a pair of nodes (vi , vj ), where i, j = 1..n. In physical networks the nodes represent individual components of a system such as transformers, substations or a consumer physical unit in the case of power grids, or storage facilities, control valves, pumps or demand sinks in the case of water distribution networks, while electrical cables or water pipes are represented as the edges [7, 15]. However, representing some networks as a standard graph has its limits because in such graphs an edge can connect only two nodes while, in general case, this may not



be sufficient to represent real-world networks. Estrada and Rodriguez-Velazquez [3] presented a good example where using standard graphs to represent a network failed to provide a description of the investigated system. They analysed a collaboration network where nodes represented authors and edges showed collaboration between them. If such a network is presented as a standard graph it will provide information about whether researchers have collaborated or not. However, it does not inform us whether more than two authors connected in the network were co-authors of the same publication. Representing a collaboration network as a hypergraph instead of a standard graph allows us to include this type of information. Similarly, utility–service provision within a household or a community is modelled with a directed hypergraph in which products and services are represented with nodes whilst devices are hyperedges spanning between them. Since devices usually connect more than two nodes, a standard graph would not suffice to describe the utility–service provision problem and therefore a hypergraph was chosen as a more appropriate representation of the system. A hypergraph is a pair H = (V, E), where V = {v_1, v_2, ..., v_n} is the set of nodes and E = {e_1, e_2, ..., e_m} is the set of hyperedges. A directed hyperedge e_i ∈ E is a pair e_i = (T(e_i), h(e_i)) for i = 1, ..., m, where T(e_i) ⊂ V denotes the set of tail nodes and h(e_i) ∈ V \ T(e_i) denotes the head nodes. When |e_i| = 2, for i = 1, ..., m, the hypergraph is a standard graph [4]. A directed hypergraph is a hypergraph with directed hyperedges. A standard graph can be defined with an n × n adjacency matrix [a_{ij}], where a_{ij} = 1 if there is an edge connecting node i to node j and a_{ij} = 0 otherwise [8, 16]. A directed hypergraph is represented differently to a standard graph and requires an n × m incidence matrix [c_{ij}] defined as follows:

c_{ij} = \begin{cases} -1 & \text{if } v_i \in T(e_j), \\ \phantom{-}1 & \text{if } v_i \in h(e_j), \\ \phantom{-}0 & \text{otherwise.} \end{cases}        (1)

Each node v_i in a graph, both standard as well as hyper, has a number of incident edges k_i. The value of k_i defines the node's degree, also called its connectivity [3]. In physical networks the majority of nodes usually have a small connectivity, while only a few nodes are highly connected [10]. Hypergraphs additionally have another property called cardinality, which defines the number of nodes connected by a hyperedge. In the case of the mastergraph representing the utility–service provision problem, cardinality shows what products and services a device uses or produces, respectively. The size of a hypergraph H is defined as the sum of the cardinalities of all its hyperedges, i.e.



\mathrm{size}(H) = \sum_{e_i \in E} |e_i|        (2)
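As an illustration of Equations (1) and (2), the sketch below builds an incidence matrix for a toy directed hypergraph and derives node degrees, hyperedge cardinalities and the hypergraph size. The device records are simplified placeholders, not the full database, and the special value 2 used later for products appearing as both input and output is not modelled here.

```python
import numpy as np

# Illustrative device records: tails = input products, heads = output products/services,
# following the T(e)/h(e) notation above.
devices = {
    "e1": {"tails": ["n1"], "heads": ["n2"]},          # PV panel: solar irradiation -> electricity
    "e5": {"tails": ["n8"], "heads": ["n10"]},         # tap: drinking water -> drinking water service (simplified)
    "e8": {"tails": ["n6"], "heads": ["n5", "n7"]},    # greywater recycler (electricity input omitted)
}
nodes = sorted({v for spec in devices.values() for v in spec["tails"] + spec["heads"]})

def incidence_matrix(devices, nodes):
    """n x m incidence matrix of Eq. (1): -1 for tail (input) nodes, +1 for head (output) nodes."""
    C = np.zeros((len(nodes), len(devices)), dtype=int)
    row = {v: i for i, v in enumerate(nodes)}
    for j, spec in enumerate(devices.values()):
        for v in spec["tails"]:
            C[row[v], j] = -1
        for v in spec["heads"]:
            C[row[v], j] = 1
    return C

C = incidence_matrix(devices, nodes)
degrees = np.count_nonzero(C, axis=1)        # node connectivity k_i
cardinalities = np.count_nonzero(C, axis=0)  # |e_j| for every hyperedge
size_H = int(cardinalities.sum())            # size(H), Eq. (2)
```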

Connections between nodes in a graph are described with paths, each path having an associated length. A path in a graph G is a subgraph P in the form

V(P) = \{x_0, x_1, \dots, x_l\},        (3)

E(P) = \{(x_0, x_1), (x_1, x_2), \dots, (x_{l-1}, x_l)\}        (4)

such that V(P) ⊂ V and E(P) ⊂ E, and where the nodes x_0 and x_l denote the end nodes of P whilst l = |E(P)| is the length of P. A graph is connected if for any two distinct nodes v_i, v_j ∈ V there exists a finite path from v_i to v_j [7]. Otherwise the graph is not connected. Some technological networks, in particular transport networks, can be additionally quantified by calculation of the shortest paths between nodes, e.g. shortest paths between bus stops or train stations. The shortest path lengths of a graph can be represented as a matrix D in which d_{ij} is the length of the shortest path from node v_i to node v_j. A typical separation between any two nodes in the graph is given by an average shortest path length L, also known as the characteristic path length L, which is defined as the mean of the shortest path lengths over all pairs of nodes in a graph:

L = \frac{1}{N(N-1)} \sum_{i,j \in V,\, i \neq j} d_{ij}        (5)

where N denotes the number of nodes (or network size). The maximum value of d_{ij} is called the graph diameter. The clustering coefficient C_i of a node v_i is the ratio of the number of edges connecting the nodes with their immediate k_i neighbours to the number of edges in a fully connected network [1]:

C_i = \frac{2 E_i}{k_i (k_i - 1)},        (6)

where Ei is the number of edges leaving from node vi towards its ki neighbours. The clustering coefficient for a node quantifies a degree to which the node tends to cluster with the other nodes, i.e. the embeddedness of the node. The clustering coefficient of the entire network is calculated as an average of the clustering coefficients of all individual nodes and gives an overall indication of the clustering in the network.



Another crucial property of a node or edge is the betweenness centrality, also sometimes referred to as load. It is a measure of centrality and describes the importance of a given node or edge in a network by quantification of the number of shortest paths that traverse this node or edge [7]:

C_i^{(b)} = \sum_{j \neq i \neq k} \frac{\sigma_{jk}(v_i)}{\sigma_{jk}},        (7)

where σjk (vi ) is the number of shortest paths between node vj and vk that pass through node vi and σjk is the number of shortest paths between nodes vj and vk [5]. The measure of betweenness centrality is useful in identifying critical nodes and evaluation of the network’s resilience to removal of certain nodes from the network, i.e. failures.
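These measures can be computed for a standard (transformation-style) graph with an off-the-shelf library. The sketch below uses networkx, which is an assumption rather than part of the authors' toolchain, and an illustrative edge list.

```python
import networkx as nx

# Illustrative transformation-style graph (edges are placeholders, not the paper's data).
G = nx.Graph()
G.add_edges_from([
    ("electricity", "tap"), ("drinking water", "tap"), ("tap", "drinking water service"),
    ("electricity", "washing machine"), ("clean water", "washing machine"),
    ("washing machine", "clothes cleaning"), ("greywater recycler", "clean water"),
])

L = nx.average_shortest_path_length(G)   # characteristic path length, cf. Eq. (5)
D = nx.diameter(G)                       # maximum shortest-path length
C = nx.clustering(G)                     # per-node clustering coefficient, cf. Eq. (6)
B = nx.betweenness_centrality(G)         # cf. Eq. (7); note networkx normalises by default
print(L, D, C["electricity"], B["electricity"])
```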

3. UTILITY–SERVICE PROVISION AS A DIRECTED HYPERGRAPH The analysis of the topology of utility–service provision systems was carried out with two aims in mind: (a) exploration of the possibility of delivering all services to households or small local communities via a single utility product by developing visions of future infrastructure [2, 6] and (b) conceptualisation of the utility-service provision via a simulation model [11, 12]. Provision of different utility products in appropriate quantities does not itself guarantee that the required services will be delivered as the needs satisfaction problem requires not only utility products but also appropriate devices. An approach to solve this type of problems was earlier described in [11] and [13], where a household was considered as an input-output system in which utility products provided to it may include drinking water, gas, electricity, heat, food, etc. Utility products can also be supplemented by natural resources through e.g. recycling rain water, extracting water from air or ground, capturing and converting energy from the sun or wind. Where production of a utility product from natural resources exceeds demand, e.g. onsite electricity generation exceeds energy consumption, surplus of that utility product can be sold back to the utility provider. Provision of a service from one or more utility products or conversion from one utility product to another may create one or more by-products which may either be recycled within a household or a community, or removed from the system as waste via a different infrastructure. The utility–service provision was analysed using a simulation model composed of the following four elements: a database, a problem formulation, a candidate solution, and a computational engine. The database contains the necessary information about devices, technologies, products including utility products, by-products and naturally available re-



sources, and services. The problem formulation is composed of the requirements, i.e. the amount and type of services to be provided, and constraints, e.g. the maximum size of a particular device or the maximum amount of a product sourced on-site. A candidate solution, or a transformation graph, is a set of devices connected together that either are required to deliver services defined in the problem formulation or are necessary to recycle some of the products. A heuristic approach to build all possible transformation graphs was adopted. The methodology is as follows: (i) construction of an initial transformation graph consisting only of service demand nodes; (ii) iterative search of the master graph for suitable devices to deliver services and/or to process products, starting from the service nodes and going back to utility product nodes and natural resources nodes; (iii) creation of required storages for products. If during a particular iteration more than one device satisfies some criteria, the current transformation graph is copied and the algorithm proceeds with all copies independently. The final output of this algorithm is a collection of transformation graphs that satisfy the specific constraints and requirements defined in the problem formulation. The computational engine analyses the feasibility of solution(s) and calculates the mass balances of all products and services. More information about the simulation system can be found in [11], such as a full description of the modelling approach adopted to simulate utility–service provision, different methods to create transformation graphs, and solved utility–service provision examples. The content of the database can be described in the form of a directed hypergraph in which the products and services are represented by nodes whilst the devices are hyperedges spanning between them. Typically a device has many inputs and outputs, hence it will usually connect more than two nodes. The hypergraph which is used to describe our utility–service provision network with all its products, services, and devices will later be referred to as the master graph [14].

4. A SIMPLIFIED UTILITY–SERVICE PROVISION SCENARIO In order to allow the user to better understand how the utility–service provision problem may be described as a directed hypergraph, we will first consider a simplified scenario in which only a small number of utility products, services and devices are considered. The hypergraph created for this purpose is shown in Figure 1. The nodes, marked in Figure 1 with circles represent products: n1 – Solar irradiation, n2 – Electricity, n3 – Food, n4 – Rainwater, n5 – Organic waste, n6 – Greywater, n7 – Clean water, n8 – Drinking water, n9 – Seawater, while the rectangular nodes represent the services: n10 – Drinking water, n11 – Clothes cleaning, n12 – Full body cleaning, n13 – Nutrition. In the adapted approach the product nodes are perceived as storages. Each edge represents



a different device which transforms one or more products into another product or products, or into a service: e1 – Silicon photovoltaic system, e2 – Electric hob, e3 – Ultrasonic shower, e4 – Shower with electric water heater, e5 – Tap, e6 – Rainwater harvesting system, e7 – Washing machine, e8 – Greywater recycler, e9 – Ocean salinity power generation (reversed electrodialysis). Hence, our hypergraph representing a simplified utility–service provision problem consists of 9 edges and 13 nodes. This particular hypergraph is not acyclic, which means that some devices use the same product as both an input and an output. In this case product n9 is used as an input and as an output by device e9.

Fig. 1. A directed hypergraph representing a simplified utility–service provision problem

4.1. PROBLEM FORMULATION

Having such a utility–service provision solution space specified with the hypergraph in Figure 1 we can now search for possible candidate solutions given the required services to be delivered and constraints put on products. In this example the required services are n10 , n11 , n12 and n13 . The constraints are: product n2 cannot be delivered by the infrastructure and product n6 cannot be removed by the infrastructure. Natural products available locally are n1 , n4 and n9 .



4.2. PROPERTIES OF HYPERGRAPHS USED FOR DEFINING TRANSFORMATION GRAPHS

The next step after problem formulation is to find candidate solutions based on the topology of the utility–service provision hypergraph and with the requirements and constraints specified in problem formulation. Search for candidate solutions is directed towards determining whether it is possible to deliver required services with available devices and under the given constraints. At this stage demands for services and maximum throughputs of the devices are not considered, only the topological properties of the hypergraph and the given constraints on the amounts of products to be produced and consumed. Table 1. Incidence matrix for the hypergraph introduced in Figure 1

e1 e2 e3 e4 e5 n1 -1 n2 1 -1 -1 -1 -1 n3 n4 1 1 n5 n6 1 n7 n8 -1 -1 n9 1 n10 n11 n12 1 1 1 n13

e6 e7 e8 e9 -1 -1 -1 1 -1 1 1 -1 1 -1 1 -1 2 1

Definition of a transformation graph from the hypergraph is preceded by formulation of an incidence matrix which describes the topological structure of the hypergraph. For our simplified hypergraph shown in Figure 1 the colour-coded incidence matrix is given in Table 1. The storages' product outputs, i.e. tail nodes, are shown in green and have an associated value of −1, representing a sink and thus an input to the device. Storage product inputs, i.e. head nodes, are shown in red and have a value of 1, denoting the source, i.e. an output from the device. Grey-coded fields represent products that are used both as an input and as an output and have been assigned a value of 2. Finally, blue-coloured fields denote the produced services and have an assigned value of 1 as head nodes. The incidence matrix in Table 1 shows that product n2 can only be produced by devices e1 and e9 (see row 2) and that product n6 can be recycled by device e8, which takes n6 as an input. Therefore, these devices can be used to formulate the candidate solution, i.e. the transformation graph, to comply with the constraints listed in the problem formulation in Section 4.1. Additionally, Table 1 indicates that service n10 can be delivered only by device e5; service n11 can be delivered solely by device e7; service n12 can be delivered either by device e3 or e4, whilst service n13 can be provided by device e2.

Table 2. Inputs for devices used in the example presented in Figure 1

e1 e2 e3 e4 e5 e6 e7 e8 e9 n1 -0.91 n2 -3 -50 -2 -0.9 -1.2 -0.008 -4 n3 n4 -100 n5 -33 n6 n7 -69 -0.2 n8 -60 -2 -0.2 n9 Table 3. Outputs for devices used in the example presented in Figure 1

e1 e2 e3 e4 e5 e6 e7 e8 e9 n1 n2 0.1 1 n3 n4 1.5 0.5 6.5 n5 n6 60 69 n7 100 27 n8 0.4 n9 n10 1 1 n11 n12 1 1 n13 1

In addition to the incidence matrix, which provides information about the topology of the hypergraph, we introduce two more matrices which additionally provide quantitative information about the operating rules of the devices, i.e. how much of a product is used and produced by a given device. The benefit of such representations is that they enable a quantitative comparison of the amounts of products produced or used by corresponding devices and of the amounts of services generated by the devices. This information can later be used to calculate the mass balances of all services and products and, subsequently, assess the feasibility of a given solution. Whilst Table 2 presents the amounts of


products (inputs) used by all devices, Table 3 shows the amounts of products and services produced by these devices. Since services cannot be used by devices as inputs, they are not included in Table 2.

Table 4. Shortest path lengths in the hypergraph introduced in Figure 1

        n10   n11   n12   n13
n1      –     2     2     2
n2      –     1     1     1
n3      –     –     –     1
n4      –     2     2     2
n5      –     –     –     –
n6      –     2     3     3
n7      –     1     2     2
n8      1     3     1     4
n9      –     2     2     2

Additional information about a hypergraph, for construction of transformation graphs, is provided by a matrix of shortest paths from product nodes to service nodes, such as the one listed in Table 4. The shortest path values in Table 4 correspond to the path lengths d_{ij} used in Equation 5. The matrix of shortest paths is used for solution optimization, i.e. when it is required to minimize the number of used devices. If the sought solution is one in which the number of used devices is reduced to a minimum, then the devices chosen from the hypergraph to form the transformation graph should lie on the shortest paths. Another piece of information provided in Table 4 indicates whether there is a possibility to deliver a service starting from a given product node; e.g., when investigating the first column it is clear that there is only one path between the product node n8 and the service node n10. Also, by looking at the third row we can see that only the product n3 is required to satisfy the service n13. This could be an indication that, from the point of view of increasing resilience, it might be beneficial to add other devices that could deliver this service. Table 4 can also be used to highlight critical nodes or edges, i.e. the nodes or edges which, if removed, will prevent the required services from being delivered. Table 4 also shows that there are no paths between the product n5 and any of the services. Thus, the product n5 is not used by any of the devices in this example as an input. Apart from shortest paths between product nodes and service nodes, the matrix of shortest paths can also be used to investigate how one product can be converted into another. According to our problem formulation, only the product n6 needs to be converted into another product. As can be seen in Figure 1, this particular product can only be converted by device e8 into the products n5 and n7. Whilst information contained in either one of the representations is sufficient to uniquely define a graph, the incidence matrix and the matrix of shortest



paths offer different and complementary descriptions of a graph. Another parameter that is useful when analysing a hypergraph (and therefore potential candidate solutions) is the cardinality of its hyperedges, which in our hypergraph are, respectively: |e1| = 2, |e2| = 4, |e3| = 3, |e4| = 4, |e5| = 2, |e6| = 3, |e7| = 4, |e8| = 4, |e9| = 4. This parameter shows how many inputs and outputs are associated with each device. Information about the cardinality of hyperedges can be useful at the stage of choosing devices for the transformation graphs. The size of the hypergraph is calculated as the sum of all its hyperedge cardinalities. Since the size of this hypergraph is 30 and it contains 9 hyperedges, a device has on average 3.33 inputs and outputs. 4.3. TRANSFORMATION GRAPH DEFINITION

A candidate solution, i.e. a transformation graph, is formulated based on the requirements listed in the problem formulation defined in Section 4.1. In principle, this problem can have many solutions, and these solutions can then be evaluated and compared against each other from the point of view of their resilience, energy demand, robustness, etc. The first step in generating transformation graphs is to create an empty transformation graph containing all required service nodes, i.e. n10, n11, n12, and n13. In the next step the hypergraph presented in Figure 1 is traversed to find appropriate devices required to deliver the services specified in the problem formulation task. If, at any point, more than one device that fulfils these requirements exists, a number of copies of the transformation graph, equal to the number of devices fulfilling the same requirements, are created. In this example we start with two initial versions of the transformation graph, as two devices that can deliver service n12 exist in the utility–service provision hypergraph. Once all devices are connected, storage nodes for each product within a solution are added to facilitate subsequent dynamic simulation under changing demand and supply patterns with dynamic models generated from the transformation graphs. All storages are checked against the problem formulation to see whether each product can be delivered or removed by the infrastructure. Based on the problem formulation, product n2 cannot be delivered by the infrastructure and product n6 cannot be removed by the infrastructure. Therefore, a device that can produce product n2 is required. Again, there are two devices that can produce this product, so two additional copies are created. An example of a transformation graph is presented in Figure 2, where devices are given a rectangular shape, services have an octagonal shape and products are depicted with trapezoids.

131


Fig. 2. Example of a transformation graph
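A highly simplified sketch of the branching step of this search is shown below: for a needed node, every device in the master graph able to produce it spawns an independent copy of the partial transformation graph. The device records are hypothetical simplifications, and the full recursion back to utility products as well as the creation of storages is omitted.

```python
from copy import deepcopy

# Hypothetical, simplified device records; the real master graph is stored in a database.
master = {
    "e3": {"inputs": ["n2"], "outputs": ["n12"]},              # ultrasonic shower (simplified)
    "e4": {"inputs": ["n2", "n8"], "outputs": ["n12", "n6"]},  # shower with water heater (simplified)
    "e1": {"inputs": ["n1"], "outputs": ["n2"]},               # photovoltaic system
    "e9": {"inputs": ["n9"], "outputs": ["n2", "n9"]},         # ocean salinity power generation (simplified)
}

def branch_on(needed, partial_solutions):
    """For one needed node, branch every partial transformation graph over all
    devices in the master graph that can produce it; copies evolve independently."""
    branched = []
    for sol in partial_solutions:
        for name, spec in master.items():
            if needed in spec["outputs"]:
                new = deepcopy(sol)
                new.add(name)
                branched.append(new)
    return branched

solutions = [set()]                  # start from an empty transformation graph
for required in ["n12", "n2"]:       # a required service, then a product that must be produced on-site
    solutions = branch_on(required, solutions)
print(solutions)                     # four device combinations, mirroring the copies described above
```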

4.4. SENSITIVITY ANALYSIS

Sensitivity of a hypergraph to critical nodes can be described by the degree distribution, which provides information on how many nodes in a hypergraph are highly connected and how many are weakly connected. The more connections a node has, the more important it is in the graph, as more paths traverse through it. Therefore such a node may be critical to the proper functioning of a system described with such a graph. Figure 3 shows that the degree distribution for our hypergraph follows a power-law function P(k) ∼ k^{-0.9335}. As Figure 3 shows, there are six nodes that have only one connection, whilst the one node representing electricity has eight connections. This indicates that electricity is crucial for the proper functioning of many devices, six to be precise (since two of the connections are electricity inputs, not outputs). Information included in the degree distribution of a hypergraph is very important in the analysis of the robustness of the solution, as it helps in identifying the nodes that are critical for the operation of the solution system. Overall, the example presented in Figure 1 contains nine devices: two produce electricity and six require electricity to work, while one is independent, i.e. does not require electricity. Top betweenness nodes in our example are: electricity: C^{(b)}_{n1} = 0.216, clean water: C^{(b)}_{n7} = 0.182, greywater: C^{(b)}_{n6} = 0.083. This shows that the highest number of shortest paths traverses the electricity node, since most of the devices presented in this example need electrical power to operate. However, since there are two devices, e1 and e9, that can produce electricity, it is theoretically possible to deliver electricity to the system during a failure of the power grid. However, whether the amount of electricity provided from this second source is sufficient needs to be checked by calculating mass balances.



Fig. 3. Degree distribution in the example
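A quick way to estimate such an exponent from an empirical degree list is a log-log least-squares fit, sketched below; this is only a rough estimate and is not the fitting procedure used by the authors.

```python
import numpy as np

def powerlaw_exponent(degrees):
    """Rough log-log least-squares fit of P(k) ~ k^gamma to an empirical
    degree histogram (a quick estimate, not a rigorous power-law test)."""
    ks, counts = np.unique(np.asarray(degrees), return_counts=True)
    slope, _ = np.polyfit(np.log(ks), np.log(counts), 1)
    return slope

# e.g. powerlaw_exponent([1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 8])  # a negative exponent is expected
```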

5. A COMPLETE UTILITY–SERVICE PROVISION MODEL The 'mastergraph', as defined in the Introduction, has the same structure as the simplified hypergraph presented in Figure 1, but with a larger number of nodes and hyperedges. It is built from the database of devices obtained from literature and technical data sheets and is used to generate authentic transformation graphs (i.e. utility–service provision solutions) for real-life problems. At the current state of development the mastergraph contains |V| = 60 nodes (products and services) and |E| = 97 hyperedges (devices). This hypergraph can be described in the same way as the simplified example discussed in the previous section. However, due to the high number of edges and nodes, both the hypergraph and the incidence matrix would not be readable. Thus, we will omit the visual presentation of the hypergraph and its incidence matrix and instead discuss the remaining parameters. The size of the mastergraph is 299, with a device having, on average, 3.1 inputs and outputs. Degree distributions of the nodes in the master graph, considering individually inputs (in), outputs (out), and the total number of connected edges (all), are presented in Figure 4. The identified degree distribution for inputs as well as outputs follows a power-law function P(k) ∼ k^{-1.109} (for node inputs: P(k) ∼ k^{-0.9714} and for outputs: P(k) ∼ k^{-0.8215}).

Fig. 4. Degree distribution in the master graph

Analysis of the mastergraph led to the following observations. The most connected nodes are: electricity (total: 81, in: 28, out: 53), clean water (total: 19, in: 13, out: 6), drinking water (total: 16, in: 7, out: 9), greywater (total: 13, in: 11, out: 2) and waste water (total: 12, in: 3, out: 9). Top betweenness nodes are: electricity: C^{(b)}_{n1} = 0.367, clean water: C^{(b)}_{n7} = 0.128, greywater: C^{(b)}_{n6} = 0.061, drinking water: C^{(b)}_{n8} = 0.048.

6. CONCLUSION In this paper the utility–service provision problem was formulated and presented with a hypergraph instead of a simple graph, as a hypergraph allows more than two nodes to be mapped with one edge; this is necessary to describe utility–service provision, in which edges describe devices that can use more than one product as an input and an output. This hypergraph was then analysed with a number of statistical measures in order to extract additional information about the graph, which is then used to generate candidate solutions, i.e. transformation graphs; these are subsequently used to specify dynamic models of utility–service provision systems, which are then simulated and optimized [11, 12]. First, the topology of the hypergraph is visualized with an incidence matrix, which helps to identify the devices (edges) necessary to deliver the specified services (nodes) and the products (nodes) required for the proper functioning of the devices (edges). This incidence matrix is then complemented with two additional matrices which, in addition to graph topology, also provide quantitative information about the amount of products



consumed and produced and services produced in the system. Subsequently, yet another matrix, i.e. the matrix of shortest paths is produced to aid with identification of shortest paths between utility products and services, i.e. transformations requiring the least amount of devices. Additionally, the matrix of shortest paths is used to identify critical products required to deliver a specified service required for an assessment of the system’s resilience, i.e. its lack of vulnerability to situations where one or many products fails to be delivered. To assess the vulnerability of the system to failures of the devices we use both the matrix of shortest paths (which can be used to show the number of alternative paths to deliver the required service) as well as the cardinality of hyperedges which quantifies how many inputs and outputs, i.e. products and services are associated with this particular device. Finally, as the last measure of quantifying a hypergraph we calculate the node degree distribution which shows the critical nodes (with large number of connections) as well as its share in the total number of nodes in the system. All of these measures offer crucial information for further analysis of utility–service provision such as generation of possible solutions (utility–service provision configurations), subsequent analysis of their resilience, vulnerability, robustness, and redundancy, and generation of dynamic models for the purpose of simulation and optimisation.

ACKNOWLEDGEMENTS This research is a part of and is sponsored by the Engineering and Physical Sciences Research Council (EPSRC) project “All in One: Feasibility Analysis of Supplying All Services Through One Utility Product” (EP/J005592/1).

REFERENCES [1] ANTONIOU P. and PITSILLIDES A., Understanding complex systems: A Communication networks perspective. Technical Report, Department of Computer Science, University of Cyprus, 2007. [2] CAMCI F., ULANICKI B., BOXALL J., CHITCHYAN R., VARGA L., and KARACA F., Rethinking future of utilities: supplying all services through one sustainable utility infrastructure. Environmental science & technology, vol. 46, 2012, pp. 5271–2. [3] ESTRADA E. and RODRIGUEZ-VELAZQUEZ J. A., Complex networks as hypergraphs. arXiv preprint physics/0505137, 2005. [4] GALLO G., LONGO G., PALLOTTINO S. and NGUYEN S. Directed hypergraphs and applications. Discrete Applied Mathematics, vol. 42, no. (2-3), 1993, pp. 177–201.



[5] HU J., YU J., CAO J., NI M. and YU W., Topological interactive analysis of power system and its communication module: A complex network approach. Physica A: Statistical Mechanics and Its Applications, vol. 416, 2014, pp. 99–111. [6] KARACA F., RAVEN P. G., MACHELL J., VARGA L., CAMCI F., CHITCHYAN R., BOXALL J., ULANICKI B., SKWORCOW P., STRZELECKA A., OZAWA-MEIDA L. and JANUS T., Single infrastructure utility provision to households: Technological feasibility study. Futures vol. 49, 2013, pp. 35-48. [7] PAGANI G. and AIELLO M., The power grid as a complex network: a survey. Physica A: Statistical Mechanics and its Applications, vol. 392, 2013, pp. 2688–2700. [8] ROSATO V., BOLOGNA S. and TIRITICCO F., Topological properties of high-voltage electrical transmission networks. Electric Power Systems Research, vol. 77, no. 2, 2007, pp. 99-105. [9] SAMET R.H., Complexity: The science of cities and long-range futures, Futures, vol. 47, 2013, pp. 49–58. [10] SCHOLZ M., Network science. www.network-science.org, 2015. [11] STRZELECKA A. and SKWORCOW P., Modelling and simulation of utility service provision for sustainable communities. International Journal of Electronics and Telecommunications, vol. 58, no. 4, 2012, pp. 389–396. [12] STRZELECKA A., SKWORCOW P. and ULANICKI B, Modelling, simulation and optimisation of utility-service provision for households: Case studies. Procedia Engineering, vol. 70, no. 0, 2014, pp. 1602–1609. [13] STRZELECKA A., SKWORCOW P., ULANICKI B., and JANUS T., An Approach to Utility Service Provision Modelling and Optimisation. In International Conference on Systems Engineering, 2012, pp. 191-195. [14] ULANICKI B., STRZELECKA A., SKWORCOW P., and JANUS T., Developing scenarios for future utility provision. In 14th Water Distribution Systems Analysis Conference, 2012, pp. 1424–1430. [15] YAZDANI A. and JEFFREY P., Complex network analysis of water distribution systems. Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 21, no. 1, 2011, pp. 0161111-10. [16] ZIO E. and GOLEA L.R. Analyzing the topological, electrical and reliability characteristics of a power transmission system for identifying its critical elements. Reliability Engineering & System Safety, vol. 101, 2012, pp. 67–74.



Computer Systems Engineering 2014 Keywords: eye tracking, image processing, pupil detection, image filtering, hough transform

Grzegorz ZATORSKI*

OPTIMIZATION OF LOW-LEVEL COMPUTER VISION METHODS FOR EYE TRACKING This paper presents the results of an examination of the efficiency of segmentation and filtering algorithms for the eye pupil recognition problem. Digital image filters combined with image segmentation techniques, such as binary thresholding and adaptive thresholding, were investigated. The Hough transform was used as the pupil recognition method. Its efficiency, over a determined range of algorithm parameters, was measured using a brute-force approach. The research was conducted for RGB images, which were captured by a camera in the visible light range. The research showed a correlation between kernel size and the accuracy of pupil recognition for the filters used. An optimal threshold value for the binary threshold algorithm was found.

1. INTRODUCTION Tracking the human point of gaze is called eye tracking and delivers a lot of valuable information for many fields of science. Parameters such as eyeball movement or the point of gaze can be measured and transferred to an electronic device. These data can further be used to analyze human behavior or to develop a human-computer interaction interface. Analysis of data measured from eyeball movement is used in many fields of science, such as neurology, biomedical engineering, psychology and the advertising market [2]. Eyeball movement can be translated into the movement of another device. This is relevant for disabled people, in particular people with reduced mobility and people who are completely paralyzed. A communication interface developed from data obtained from eyeball movement could be the only opportunity for them to communicate with others (e.g. eye typing). In psychology, parameters such as the point of gaze, gaze maps and gaze duration can provide data for research on human behavioral determinants. Another important field of application for eye tracking is the advertising market. Analyzing eye tracking data, in particular the distribution of points of gaze (heatmaps), gives information which helps to optimize an advertising campaign or improve the graphical user interface of a computer application.

* Department of Systems and Computer Networks, Wroclaw University of Technology, Poland



Eye tracking methods divide into four groups: electro-oculography (EOG), scleral contact lens/search coil, photo-oculography (POG) or video-oculography (VOG), and video-based combined pupil and corneal reflection methods [5]. The video-based method (VOG) seems to be the easiest to implement because of its non-invasive character, and it provides sufficient efficiency.

2. RELATED WORKS

Several methods have been proposed for the pupil recognition task, as well as for each phase of the entire process. In this section some of them are reviewed. Milad Soltany, Saeid Toosi Zadeh and Hamid-Reza Pourreza [8] proposed an approach to pupil positioning in images using the Circular Hough Transform and Gray Projection algorithms. They used 640x480 RGB color images converted to the grayscale color space. A binary threshold was used as the segmentation method and the Circular Hough Transform as the pupil detection method. Their work proposed a fast, non-IR-based algorithm for images based on the RGB color model. S. Dey and D. Samanta [11] proposed an efficient and accurate pupil detection method for developing better biometric identification systems. The investigation was carried out on the CASIA iris database, which contains IR images. Downscaling of the image was used as a pre-processing task and a power transform was used to remove irrelevant edge information from the image. A binary threshold was applied as the segmentation method and a Gaussian smoothing operator as the filtering method. The pupil recognition method was based on the Canny edge detection technique. A. De Santis and D. Iacoviello [6] focused on pupillometric data, i.e., the pupil morphology and dynamics. A four-level segmentation is obtained by recursive binary segmentation. Image data were pre-processed by a Gaussian filter, and the pupil recognition method was based on the Canny and Sobel edge detectors. Additionally, the experiments were also aimed at testing the robustness of the algorithms with respect to additive image noise.

3. PROBLEM FORMULATION

The eye tracking process is related to the detection of a selected element of the eye and tracking the changes of its location in relation to the head. The pupil seems to be the best element to track, due to its regular (circular) shape and homogeneous color. For the purposes of pupil detection, digital image processing was composed of several phases (see Fig. 1). The first phase is image data acquisition; afterwards, the


distortions of the image (i.e., noise) are removed by filtering. The next phase is segmentation. Segmentation is a crucial process in an image recognition problem, because it simplifies the image for further processing. In the last two phases of pupil recognition, the area of the pupil is extracted from the segmented image and then the pupil shape parameters are estimated [6]. The results obtained from shape estimation serve for further use: the coordinates of the pupil center, which change with time, provide information about eyeball movement.

Fig. 1. Flow chart of basic operations of pupil eye detection [6]

3.1 IMAGE FILTERING

Image filtering belongs to the subset of preprocessing operations and is the first step in a computer vision task. The role of filtering is the reduction of image noise. Filtering methods divide into a couple of groups: statistical filters (resizable kernel size), static kernel size filters (high-pass, low-pass), and edge filters.

3.2 IMAGE SEGMENTATION

The image segmentation operation is the most important step leading to the analysis of processed image data: its goal is to simplify or change the representation of an image into something that is more meaningful and easier to analyze [6]. This operation is crucial in the computer vision field. Many methods are presented in the literature, but there is no single method which can be considered good for all images, nor are all methods equally good for a particular type of image [1]. Image segmentation techniques divide into three groups. The first group of methods is based on the image histogram (e.g., binary threshold, adaptive threshold, k-means clustering). The second group consists of the edge detection methods, and the third of the region growing methods [7].

3.3 PUPIL RECOGNITION

Object recognition is the final important step in machine vision. The preceding segmentation process delivers a simplified representation of the image. Almost always, when information about an object or region class is available, some pattern


recognition method is used [7]. One such method is the Hough Transform, which is used for the recognition of regular objects. A modified version of this transform (the Circular Hough Transform) can be used for the detection of circular shapes in an image.

4. METHODOLOGY

4.1 SCOPE AND ASSUMPTIONS

This research was performed on the UBIRIS v1 [4] database, which contains noisy eye images. Fifty images were randomly selected from the database and further investigation was conducted on this subset. Table 1 and Table 2 contain the parameters of the images used in the tests. We selected a group of statistical image filtering algorithms and image segmentation methods from the group of techniques based on the image histogram. Afterwards we used the Hough Transform method for circle detection. The simulation was conducted for images taken by a camera in the visible light range and converted to the RGB color space. The investigation was carried out by a brute force approach: the parameters of the algorithms were permuted in a defined range, and the efficiency factors were measured for these configurations of parameters.

Table 1. Parameters of image database [12]
Parameter              Value
Camera                 Nikon E5700
Software               E5700v1.0
Focal Length           71 mm
Exposure Time          1/30 sec
Color Representation   RGB
ISO Speed              ISO-200

Table 2. Manual image classification [12]
Parameter      Good      Average   Bad
Focus          73.83%    17.53%    8.63%
Reflections    58.87%    36.78%    4.34%
Visible Iris   36.73%    47.83%    15.44%

4.2 EFFICIENCY FACTORS

1. Recognition time. Time was measured for all pupil recognition stages. The starting point of the time measurement is the image load operation, and the ending point is the application of the Hough transform.

2. Detection factor. This is a binary indicator of pupil detection. We assumed that the pupil is detected if the distance between the center of the detected circle and the center of the pupil is smaller than the length of the pupil radius. First, we calculated the Euclidean distance between the center of the real pupil circle (x_p, y_p) and the center of the detected circle (x_d, y_d) (1), and then described detection as the function (2):

d = \sqrt{(x_d - x_p)^2 + (y_d - y_p)^2}     (1)

D = \begin{cases} 1, & d < r_p \\ 0, & \text{otherwise} \end{cases}     (2)

where r_p is the radius of the real pupil.

3. Recognition efficiency. We assumed that the quality indicator of detection is the ratio of properly detected pupils to the number of images in the subset. This factor is computed over all detections. Thus, we described the recognition efficiency factor as the function (3):

E = \frac{1}{N} \sum_{k=1}^{N} D_k     (3)

where N is the number of images in the subset.

4. Intersection area. We assumed that this efficiency factor is the share of the common intersection of the two circles: the detected circle and the real pupil area. Thus, we described this factor as a function (4), where the quantities involved are the number of pixels in the detected circle and the number of pixels in the real pupil circle.
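As a small illustration of factors (1)-(2), the following C++ sketch (ours, not part of the original test software; the names are illustrative) computes the detection decision for a single image:

#include <cmath>

// Circle described by its centre and radius, in pixels.
struct Circle { double x, y, r; };

// Euclidean distance between the detected centre and the reference pupil centre, eq. (1).
double centreDistance(const Circle& detected, const Circle& pupil) {
    return std::sqrt((detected.x - pupil.x) * (detected.x - pupil.x) +
                     (detected.y - pupil.y) * (detected.y - pupil.y));
}

// Binary detection factor, eq. (2): the pupil counts as detected when the centre
// of the detected circle lies closer to the pupil centre than the pupil radius.
bool isDetected(const Circle& detected, const Circle& pupil) {
    return centreDistance(detected, pupil) < pupil.r;
}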

5. NUMERICAL EXPERIMENTS

5.1 RESEARCH QUESTION

Which combination of segmentation techniques and filtering methods, with which set of parameters, is the most time-effective and the most efficient for eye pupil recognition? How do the parameters of the algorithms used affect the efficiency and the time of recognition?



5.2 SELECTION AND PREPARATION IMAGES

We randomly selected 50 images from the UBIRIS v1 database and calculated two parameters for every image: the coordinates of the center of the pupil and the pupil radius. Figure 2 illustrates the methodology of calculating the image parameters.

5.3 EXAMINATION PROCESS

The entire recognition process was executed for every picture from the selected subset of 50 images. The first step was loading the original image. Afterwards, the original image was filtered by a filter with defined parameters. The filtered image was then converted from the RGB color space to the grayscale color model. In the next step, the image was segmented and the Hough transform was applied to it. The Hough transform method provides the coordinates and radius of the detected circle. The detected parameters were compared with the parameters computed in the previous stage, which were saved in a *.csv file. Figure 3 illustrates the whole process. For the chosen dataset, during every iteration of the algorithm, the following values were computed: average detection time, average recognition efficiency and average intersection area.

Fig. 2. Process of preparation sample of images for investigation. (1) – Original image; (2) – Cropped pupil; (3) – Pupil parameters detected (coordinates of center, radius) by Hough Transform; (4) – Saved parameters to *.csv file

Fig. 3. Stages of image processing. (A) – Original image; (B) – Filtered image (Gaussian filter); (C) – Transformed to grayscale color model; (D) – Segmented image (binary threshold); (E) – Detected by Hough transform (red circle was applied to original image)
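The per-image stages (A)-(E) can be sketched with OpenCV in C++ roughly as below. This is an illustrative sketch, not the author's test program: the file name, the 5x5 kernel, the threshold value 50 and the Hough parameters are placeholder values within the ranges examined later, and a reasonably recent OpenCV is assumed for the constant names.

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // (A) load the original RGB image
    cv::Mat original = cv::imread("eye.jpg", cv::IMREAD_COLOR);

    // (B) filtering, here a Gaussian filter with an example 5x5 kernel
    cv::Mat filtered;
    cv::GaussianBlur(original, filtered, cv::Size(5, 5), 1.0, 1.0);

    // (C) conversion to the grayscale colour model
    cv::Mat gray;
    cv::cvtColor(filtered, gray, cv::COLOR_BGR2GRAY);

    // (D) segmentation, here a binary threshold with an example value T = 50
    cv::Mat segmented;
    cv::threshold(gray, segmented, 50, 255, cv::THRESH_BINARY);

    // (E) circle detection with the Hough transform
    std::vector<cv::Vec3f> circles;
    cv::HoughCircles(segmented, circles, cv::HOUGH_GRADIENT,
                     1 /*dp*/, segmented.rows / 4 /*min distance between circles*/,
                     100 /*Canny threshold*/, 20 /*accumulator threshold*/,
                     10 /*min radius*/, 80 /*max radius*/);

    // circles[i] holds (x, y, r) of each detected circle; the first one is taken as the pupil
    return 0;
}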



5.4 PARAMETERS FOR USED ALGORITHMS

Based on the time complexity of the algorithms, we determined the parameter ranges for the simulation. The dependence between time and parameters for the filtering and segmentation algorithms is shown in Fig. 4 and Fig. 5. According to Figure 4, the dependence between filter kernel size and filtration time for the bilateral filter is exponential, whereas the Gaussian and Median filters appear to have linear time complexity. According to Figure 5, both threshold calculating methods have linear time complexity; however, the method based on the Gaussian is more time-consuming than the other one. The arbitrarily adopted ranges of parameters for the filtration and segmentation algorithms are given in Tables 3, 4 and 5.

Fig. 4. Dependence between processing time and kernel filter for Bilateral, Gaussian and Median filters (800x600 Ubiris v.1 image)

Fig. 5. Dependence between processing time and block size for adaptive threshold for two methods of calculating threshold parameter(800x600 Ubiris v.1 image).

Table 3. Range of parameters for binary threshold
Segmentation method   Threshold value range   Threshold type
Threshold             0-255                   binary

Table 4. Range of parameters for adaptive threshold
Adaptive threshold type    Block size range   C constant range
ADAPTIVE_THRESH_MEAN_C     3-11               1-5
ADAPTIVE_GAUSSIAN_C        3-11               n/a

Table 5. Range of parameters for filtration algorithms
Filtering method   Kernel size range   Sigma x, sigma y
Gaussian           3-11                1-5
Median             3-11                n/a
Bilateral          3-11                n/a


5.5 IMPLEMENTATION DETAILS AND TEST ENVIRONMENT

The test environment was implemented in the C++ language using the Open Computer Vision (OpenCV) library. The program was compiled with the g++ compiler under Ubuntu Linux 10.04. Tests were performed on a Pentium(R) Dual-Core T4400 2.2 GHz machine with 2 GB of RAM. The filtering, segmentation and detection methods used were those included in the OpenCV library [10].

6. RESULTS

The following graphs show the results of the investigation. The results are divided into sections containing the analysis of data for the binary threshold, the adaptive binary threshold, the influence of the filtering process on recognition efficiency, and time efficiency.

6.1 DEPENDENCE BETWEEN FILTER KERNEL SIZE AND ACCURACY

According to Figures 6 and 7, kernel size has an effect on detection accuracy. If the kernel increases, the effectiveness increases too. This dependence is observed for Gaussian and Median filters combined with binary threshold and adaptive threshold.

Fig. 6. Dependence between kernel size and detection accuracy for Gaussian and Median filters. Binary threshold segmentation method was used

Fig. 7. Dependence between kernel size and detection accuracy for Gaussian and Median filters. Adaptive threshold segmentation method was used



6.2 BINARY THRESHOLD

The results of the evaluation of the binary threshold algorithm combined with the filters are shown in Figure 8. This graph shows the dependence between the threshold parameter and the average intersection area for the Gaussian, Median and Bilateral filters, and for segmentation without a filter. According to Fig. 8, the highest rate of detection is obtained at T ~ 50. This behaviour was observed for the Bilateral, Gaussian and Median filters over the whole range of the parameters used. The maximum of detection was expected at T ~ 0, considering the color of the pupil, which is almost black (value 0 in the grayscale color space). We can also see that the filtering operation has a great impact on the detected pupil area. The results of detection for the algorithm without a filter are equal to about 15% of the average intersection area, whereas for the algorithms using the Gaussian, Bilateral and Median filters, 55%, 22% and 55% of the average intersection area were observed, respectively.

Fig. 8. Dependence between the value of the threshold parameter (binary threshold) T and the average intersection of the detected area, for the Gaussian, Bilateral and Median filters and for binary segmentation without a filter



6.3 ADAPTIVE BINARY THRESHOLD

According to Figure 9, there is no correlation between the block size of the adaptive threshold and the average detected area. Also, as in the case of the binary threshold, filtering has a great impact on the detected area. The results of detection for the algorithm without a filter are equal to about 2.5% of the average intersection area, whereas for the algorithms using the Gaussian, Bilateral and Median filters, 28%, 5% and 40% of the average intersection area were observed, respectively.

Fig. 9. Dependence between the value of the block size (adaptive threshold) and the average intersection of the detected area, for the Gaussian, Bilateral and Median filters and for adaptive segmentation without a filter



6.4 TIME EFFICIENCY

The recognition time is related to the three main stages of our procedure: the filtration stage, image segmentation and the detection stage. The detection phase has a constant time value, so the overall time of the recognition process is determined by the filtration and segmentation methods. The execution times for the segmentation methods combined with the filtration methods are given in Table 6.

Table 6. Average processing time for the best-result segmentation methods combined with filters. Results include the runtime of the Hough Transform
Segmentation method   Filtering method   Average processing time [s]
Binary threshold      Gaussian           0.034
Binary threshold      Median             0.156
Binary threshold      Bilateral          0.384
Binary threshold      None               0.034
Adaptive threshold    Gaussian           0.053
Adaptive threshold    Median             0.206
Adaptive threshold    Bilateral          0.247
Adaptive threshold    None               0.053

7. CONCLUSION

During the research, we tested two segmentation algorithms (binary threshold, adaptive threshold) combined with three filtering techniques (Gaussian, Median, Bilateral). The best solution for pupil detection was found to be the binary threshold combined with the Gaussian filter. This method is fast, efficient and has good detection accuracy (a good ratio of the intersection area between the real and the detected pupil). In general, within the tested range of parameters, the best results were observed for binary segmentation. Among the filters, better results were obtained for the Median and Gaussian filters. According to the results, there is no clear combination of an image segmentation technique and a filtering method which would be universal for object recognition in images. Different combinations of algorithms with different sets of parameters can deliver mixed results, so the efficiency of every candidate algorithm should be tested separately. A positive correlation between filter kernel size and detection accuracy is observed; this dependence exists for the Median and Gaussian filters combined with both the adaptive and the binary threshold. Eye tracking is a dynamic problem, thus software for eye recognition should have practical utility. Since the VOG method uses a camera which captures approximately 30 frames per second, it is important to recognize the pupil parameters in


each frame. In real-time recognition, for every frame, the whole process should finish within 1/30 of a second. Low-level computer vision methods seem to be suitable for this task because of their processing time. Future research should focus on a wider range of parameters for the algorithms used; in this research the range was limited due to the high time complexity of our test software. The operations used can be parallelized via GPGPU, e.g., CUDA. Another step could be the implementation of additional filtering and segmentation algorithms or of high-level computer vision algorithms (e.g., classifiers).

REFERENCES

[1] NIKHIL R. PAL, SANKAR R. PAL, A review of image segmentation techniques. Pattern Recognition, No. 26, 1993, pp. 1277-1294.
[2] DUCHOWSKI A., A Breadth-First Survey of Eye Tracking Applications, Behavior Research Methods, Instruments, & Computers (BRMIC), No. 34(4), November 2002, pp. 455-470.
[3] JACOB R. J. K. and KARN K. S., Eye Tracking in Human-Computer Interaction and Usability Research: Ready to Deliver the Promises. North-Holland/Elsevier.
[4] PROENCA H. and ALEXANDRE L.A., UBIRIS: A noisy iris image database, 13th International Conference on Image Analysis and Processing ICIAP 2005, Cagliari, Italy, September 2005, Springer, LNCS 3617, pp. 970-977.
[5] DUCHOWSKI A., Eye Tracking Methodology, Springer, 2007.
[6] DE SANTIS A., IACOVIELLO D., Optimal segmentation of pupillometric images for estimating pupil shape parameters, Computer Methods and Programs in Biomedicine 84, 2006, pp. 174-187.
[7] SONKA M., HLAVAC V. and BOYLE R., Image Processing, Analysis and Machine Vision, Thomson Learning, Toronto, 2008.
[8] SOLTANY M., ZADEH S. T. and POURREZA H. R., Fast and Accurate Pupil Positioning Algorithm using Circular Hough Transform and Gray Projection, Proc. of CSIT vol. 5, IACSIT Press, Singapore.
[9] BRADSKI G., KAEHLER A., Learning OpenCV, O'Reilly, Sebastopol, 2008.
[10] OpenCV project website, http://docs.opencv.org (accessed 22 January 2014).
[11] DEY S., SAMANTA D., An Efficient Approach for Pupil Detection in Iris Images, Advanced Computing and Communications, ADCOM 2007, International Conference, 18-21 Dec. 2007, pp. 382-389.
[12] UBIRIS website, http://iris.di.ubi.pt/ubiris1.html (accessed 22 January 2014).



Computer Systems Engineering 2015

Keywords: infrared, navigation, robot

Sivo DASKALOV* Simona STOYANOVA†

POINTED NAVIGATION OF A ROBOT WITH THE USAGE OF INFRARED CAMERAS AND MARKERS

This paper features the usage of infrared cameras and markers for the location detection and navigation of a robot through a path. The algorithm is based on linear algebra and could be used to achieve pointed navigation of the robot through an obstacle course. Once implemented, the algorithm can be used as part of a more complex movement planning algorithm featuring artificial intelligence. A minimum of two markers are required for the location and orientation detection of the robot and two more for the control rod if pointed navigation is to be realized.

1. INTRODUCTION

Automation in all its forms is of great interest to both scientists and businesses. A robot is a mechanical or virtual artificial agent, usually an electro-mechanical machine that is guided by a computer program or electronic circuitry. Mobile robots have the capability to move around in their environment and are not fixed to one physical location. They are usually used in tightly controlled environments, such as assembly lines, because they have difficulty responding to unexpected interference. A thermographic camera is a device that forms an image using infrared radiation, similar to a common camera that forms an image using visible light. Instead of the 450-750 nanometre range of the visible light camera, infrared cameras operate at wavelengths as long as 14,000 nm (14 µm). Active markers are infrared light emitting

* Computer Sciences and Technologies Department, Technical University of Varna, Bulgaria, e-mail: sivodaskalov@gmail.com
† Computer Sciences and Technologies Department, Technical University of Varna, Bulgaria, e-mail: sstoyanova26@gmail.com



elements, mostly LEDs. Due to their flexibility, active markers are ideal for custom targets and customized solutions. Navigation algorithms are fundamental for mobile robots. While there are many path planning algorithms for robots, the focus in this particular paper is on robot navigation through the usage of a control rod for pointing. The algorithm can also be used for point-to-point navigation of a robot when there are no obstacles between the points. The execution of the algorithm on a queue of points representing all the turns in a labyrinth navigation algorithm could prove to be an efficient, alternative way of navigation.

2. PROBLEM FORMULATION

The object of this paper is to present an interface between various navigation algorithms and a mobile robot. The interface must receive the coordinates of the destination, as well as the current robot coordinates, and then perform the necessary commands such as rotation and movement. For testing purposes we have generated destination points through the usage of a pointing rod with two infrared markers at both ends.

3. MATHEMATICAL BASIS

Given two points A(x_A, y_A) and B(x_B, y_B) in the plane with normal vector (0,0,1), the equation of the line passing through both points is

\frac{y - y_A}{y_B - y_A} = \frac{x - x_A}{x_B - x_A}     (1)

The slope m of the line through the points A and B can be calculated as

m = \frac{y_B - y_A}{x_B - x_A}     (2)

If M\left(\frac{x_A + x_B}{2}, \frac{y_A + y_B}{2}\right) is the centre of AB, then the perpendicular bisector of AB goes through the point M and is perpendicular to the line AB. From the perpendicularity we can conclude that m_1 \cdot m_2 = -1, where m_1 and m_2 are the slopes of the two lines. Given two points C(x_C, y_C, z_C) and D(x_D, y_D, z_D), the equation of the line passing through the two points is

\frac{x - x_C}{x_D - x_C} = \frac{y - y_C}{y_D - y_C} = \frac{z - z_C}{z_D - z_C}     (3)


If \alpha: Ax + By + Cz + D = 0 is a plane and the line CD is neither parallel to nor contained in the plane, their intersection is a single point that lies on the line CD and is contained in the plane \alpha.

4. ALGORITHM

The algorithm uses a room with infrared cameras in the four corners to locate and define the position and orientation of both the robot and the control rod. Two markers must be placed on the two sides of the robot; the position of the robot is defined as the centre point between the two markers. The orientation of the robot is calculated by using the perpendicular bisector of the segment between the two markers. The pointing navigation of the robot is achieved through the usage of a control rod made by attaching two infrared markers to a straight rod with a handle. These two markers define a line which, depending on its orientation, intersects the plane of the floor. This intersection gives us a second position and is the current destination for the robot. The setup is shown in Fig. 1.

Fig. 1. Setup of the navigation system
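A minimal C++ sketch of the geometric part described above is given below. It is illustrative only, not the authors' implementation: it assumes the tracking system already provides 3D marker coordinates, that the floor is the plane z = 0, and the function names are ours.

#include <cmath>

struct Point3 { double x, y, z; };

// Intersection of the line through the two rod markers C and D with the floor plane z = 0.
// Valid only when the rod is not parallel to the floor (c.z != d.z).
Point3 floorIntersection(const Point3& c, const Point3& d) {
    double t = c.z / (c.z - d.z);          // parameter of the line for which z becomes 0
    return { c.x + t * (d.x - c.x), c.y + t * (d.y - c.y), 0.0 };
}

// Robot position: the centre of the segment between its two markers, projected on the floor.
Point3 robotCentre(const Point3& left, const Point3& right) {
    return { (left.x + right.x) / 2.0, (left.y + right.y) / 2.0, 0.0 };
}

// Robot heading: the direction of the perpendicular bisector of the marker segment,
// i.e. a unit vector perpendicular to (right - left) in the floor plane.
void robotHeading(const Point3& left, const Point3& right, double& hx, double& hy) {
    double dx = right.x - left.x, dy = right.y - left.y;
    hx = -dy;  hy = dx;                    // rotate the segment direction by 90 degrees
    double len = std::sqrt(hx * hx + hy * hy);
    hx /= len; hy /= len;
}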

The destinations calculated with the usage of the control rod can be placed in a queue on a timer or on the occurrence of a certain event. That way every time the



robot reaches its current destination, a new one is extracted from the queue, resulting in a more complex movement. The rotation adjustment of the robot can be done in real time, with the aim of reaching an orientation where the intersection between the line defined by the control rod and the floor is a point on the perpendicular bisector of the robot's markers.

5. EXPERIMENTATION SYSTEM

The experiments have been conducted in the Virtual Reality Laboratory of the Technical University of Varna (Fig. 2). The laboratory was opened on February 17, 2014 and it is the first of its kind in Bulgaria. It offers students and researchers access to scientific visualization, simulation, and artificial reality hardware.

Fig. 2. The virtual reality laboratory

The WorldViz PPT X (Precision Position Tracking) system has been used for the long-range virtual reality tracking of the markers and motions. The PPT system uses a total of six cameras that are capable of tracking spaces as large as 25 by 25 meters, although the laboratory's size is considerably smaller, about 7 by 7 meters. These virtual reality tracking peripherals are ideal for precise viewpoint control, with intuitive hand interaction in a CAVE display system.



The Wand is a rugged, universal interaction device for navigating and manipulating the virtual scene and virtual objects. The hardware we are using from this laboratory consists of the six infrared cameras, which can locate the position of multiple markers, as well as four markers to define the location and orientation of the mobile robot and the control rod.

6. CONCLUSIONS

This article can be considered as the first step of a bigger project that includes the real-time realization of navigation algorithms for a mobile robot. The algorithm could be expanded to a larger scale through the usage of GPS localization. Future work could include integration of the API with various navigation algorithms and extensive analysis of the resulting system.

REFERENCES

[1] FEHLMAN L. W. and HINDERS K. MARK, Mobile robot navigation with intelligent infrared image interpretation, Springer, New York, 2009.
[2] LEE S. and JAE-BOK S., Mobile robot localization using infrared light reflecting landmarks, Control, Automation and Systems, 2007.
[3] GUO Y., BAO J. and SONG A., Designed and implementation of a semi-autonomous search robot, Mechatronics and Automation, ICMA, 2009.
[4] LIU H., STOLL N., JUNGINGER S. and THUROW K., A common wireless remote control system for mobile robots in laboratory, Instrumentation and Measurement Technology Conference (I2MTC), IEEE International, 2012.
[5] NOBLE B., Applied Linear Algebra, Englewood Cliffs, NJ: Prentice-Hall, 1969.



Computer Systems Engineering 2015

Keywords: sentiment classification, predictive analysis, Naive Bayes, generalized boosted models, random forest, text mining

Norbert KOZŁOWSKI∗

SENTIMENT CLASSIFICATION IN POLISH LANGUAGE

Many techniques are known for processing and classifying text documents. They are mainly based on the assumption that the English language is used. Due to the different word inflections in the Polish language, some additional steps are necessary before prediction-making algorithms can be used. This paper presents the process of creating a system capable of discriminating positive and negative opinions using raw data publicly available on the Internet. Three different classification techniques are tested: Naive Bayes, Generalized Boosted Models and Random Forests. The results show that both Generalized Boosted Models and Random Forests yield nearly 93% accuracy, but the second one is much faster.

1.

INTRODUCTION

Nowadays we are surrounded by vast amounts of data that are hard to process. In 1998 Merrill Lynch stated that 80-90% of all potentially usable business data may originate in unstructured form, mainly as text documents [1]. Taking into consideration that the amount of information is doubling every two years [2], we need a solution that will help us to automatically extract information and make decisions without human intervention. This paper focuses on the problem of determining whether a given opinion (e.g., about a product) is positive or negative (sentiment classification). Implementation of such a system in a production environment allows a company to automatically take action when certain trends or individual incidents happen (e.g., a lot of negative comments after releasing a new product). Even though there are many useful resources about creating such a system for classifying English sentences, some extra steps need to be performed to apply it to the Polish language. In Section 2 the general problem is described. The algorithms used, Naive Bayes, Generalized Boosted Models and Random Forest, are briefly described in Section 3. Section

∗ Institute of Computer Engineering, Control and Robotics, Wrocław University of Technology, Poland, e-mail: norbert.kozlowski@hotmail.com



4 describes the process of designing the experimentation system: collecting and pre-processing the data. In Section 5 the algorithms are trained and compared both in terms of accuracy and the time needed to classify a single text. Finally, Section 6 presents the conclusions, with suggestions on what else could be done to improve performance.

2.

PROBLEM FORMULATION

The main problem is how to find the probability that a given sentence in the Polish language is positive, P(positive|sentence). To obtain the answer, some minor aspects have to be considered as well:
• acquiring high quality data,
• stemming sentences,
• the problem of representing the data (curse of dimensionality),
• choosing the best algorithm.

3.

ALGORITHMS

The three tested algorithms, Naive Bayes, Generalised Boosted Models and Random Forest, are the default implementations provided by the caret package for predictive analysis in the R framework [4].

3.1. NAIVE BAYES (NB)

The Naive Bayes method is a fairly simple probabilistic classifier based on applying Bayes' theorem with a strong (naive) assumption about independence between features. During the learning phase it utilises training data to calculate an observed probability of each class based on the feature values. When the classifier is later used on unlabelled data, it uses the observed probabilities to predict the most likely class for the new features. Such classifiers are best applied to problems in which the information from numerous attributes should be considered simultaneously in order to estimate the probability of an outcome. Many other algorithms ignore features that have weak effects, whereas this method utilises all available evidence to subtly change the prediction. If a large number of features have relatively minor effects, taken together their combined impact could be quite large. Also, the training time is linear, whereas other algorithms use an expensive iterative approach.
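In the bag-of-words setting used in this paper, this corresponds to the textbook decision rule below (a standard formulation recalled here for clarity, not a fragment of the original paper):

\hat{c} = \underset{c \in \{\text{positive},\, \text{negative}\}}{\arg\max} \; P(c) \prod_{i=1}^{n} P(w_i \mid c),

where w_1, ..., w_n are the (stemmed) words of the opinion and the probabilities are estimated from the training data.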


3.2.

GENERALIZED BOOSTED MODELS (GBM)

The idea of boosting is based on generating weak learners that iteratively learn a portion of the difficult-to-classify examples in the training data by paying more attention (assigning more weight) to often misclassified examples. Beginning from the unweighted data set, the first classifier attempts to model the outcome. Examples that the classifier predicted correctly will be less likely to appear in the training dataset, and conversely, the difficult-to-classify examples will appear more frequently. Generalized Boosted Models combine the Gradient Boosting Machine [5] (which is a combination of boosting and an optimization technique) with some tweaks from Stochastic Gradient Boosting [6]. The first model is improved by introducing extra parameters controlling the optimization speed, the learning rate, and variance reduction using subsampling. The current implementation of the GBM algorithm allows tuning the following variables according to the problem:
• a loss function (data distribution),
• the number of iterations T,
• the depth of each tree K,
• the shrinkage (or the learning rate) λ,
• the subsampling rate p.
Because of the binary classification problem, the Bernoulli distribution was found most useful. The number of iterations was checked in the range T ∈ (0, 500) and the depth was examined for K = {1, 2, 3}. The shrinkage parameter λ is responsible for controlling the rate at which the boosting algorithm descends the error surface. When λ = 1 we return to performing full gradient steps. Performance is best when λ is as small as possible for the selected T (which can be selected by performing cross-validation). In this experiment λ was set to 0.1. Finally, the subsampling rate p corresponds to the fraction of the training set observations randomly selected to propose the next tree in the expansion. Randomly subsampling the dataset with p < 1 introduces randomness and ensures that running the same model twice will result in similar but different fits. This value was left at the default p = 0.5.

3.3.

RANDOM FOREST (RF)

Random forests are a scheme for building a predictor ensemble with a set of decision trees that grow in randomly selected subspaces of data [7]. They have emerged as


serious competitors to other well known methods such as boosting and SVM. They are fast and easy to implement, produce highly accurate predictions and can handle a very large number of input variables without overfitting. That makes the RF one of the most accurate general-purpose learning techniques available [8]. The basic idea is that each tree in the forest is formed from randomly selected features. Each tree is then grown using the CART methodology to maximum size, without pruning. Then, using randomization and the bagging method, the algorithm selects which features to use for the next tree. Because the RF only uses a small, random portion of the full feature set, it can handle extremely large datasets where other algorithms may fail (due to the curse of dimensionality). The drawback is that, unlike a single decision tree, the model is not easily interpretable. In the algorithm used there are two parameters that can be tuned: the number of trees that will be generated, and how many features will be selected for each one. For the purpose of the experiment the number of trees was fixed at ntree = 500, and the number of features is within the set mt = {4, 7, 12, 20, 32}.

4.

EXPERIMENTATION SYSTEM

The first obligatory step of the process is obtaining high-quality data. Each opinion in the data set should contain as many diverse words as possible and be labeled either positive or negative. This can be done manually (by collecting different texts, reading them and assigning a grade) or automatically. The first way is very laborious and probably not very rewarding, but it may be sufficient for research work or for designing a proof-of-concept system. In the second, automatic approach, a computer script that automatically gathers information from Polish websites was created and used. Websites where every person can grade a shop or product by writing a short review were found extremely useful. Because this data is publicly available, it was possible to obtain it using web-scraping techniques (processing the source of the HTML page). For this purpose a dedicated R script was written. Its job was to scan a specific website, parse its structure and extract the necessary information. The final data set consists of nearly 320 000 unique, high-quality opinions with grades ranging from 0.5 to 5.0 (see Figure 1). All data was stored in a MySQL database, which was later used for both the pre-processing and the training of the predictive algorithms.



Fig. 1. Distribution of grades

In the pre-processing phase each opinion is converted to remove redundant elements. All punctuation signs, numbers and trailing white spaces were removed and the whole text was brought to lower case. After these operations the dataset was ready for stemming. In the English language there are some well tested algorithms available for this task (e.g., Snowball). For the Polish language the Morfologik system (version 1.9) was used [9]. It is a morphological analyser using a few well known Polish dictionaries. It provides only a Java API, so a dedicated application had to be created: it retrieves opinions from the database, processes them, and stores back the changed version without affecting the original sentence. By making use of parallel processing, the transformation of all opinions can be done very fast (within one minute). For the later analysis only the opinions with extreme grades (0.5 and 5.0) were kept, reducing the overall number of opinions. After this step there were still nearly 6 times more positive opinions than negative ones. This disproportion might be error-prone in later phases, so the positive opinions were randomly shuffled and reduced to the number of negative ones. As a result, the dataset consists of about 30 000 equally balanced opinions. The next step is to create a document-term matrix representation of the processed opinions. In the matrix representation each column represents a term and each row an opinion. Using the data previously collected, there are about 320 000 unique words in the data set, so the matrix dimension would be 30 000 × 320 000, where each cell represents how



many times a certain term occurred in an opinion. Due to the fact that the matrix is almost entirely filled with zeros, it is called a sparse matrix [10] and is very inefficient to process. To reduce its size, all very popular Polish terms (such as conjunctions or prepositions) were removed, because they do not provide any information about the mood of the sentence. After this the sparsity was reduced to 99.3%. The outcome of all these steps is that the number of terms was reduced to about 1 300 (which can also be understood as the most frequent and informative words).
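The document-term representation described above can be illustrated with the following sketch (ours, written in C++ rather than the R/Java tools used in the paper; the whitespace tokenisation and the stop-word list are placeholders):

#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Build sparse term counts for each opinion: only the non-zero cells of the
// document-term matrix are stored, and terms on a stop-word list are dropped.
std::vector<std::map<std::string, int>> buildDocumentTermCounts(
        const std::vector<std::string>& opinions,
        const std::set<std::string>& stopWords) {
    std::vector<std::map<std::string, int>> counts;
    for (const std::string& opinion : opinions) {
        std::map<std::string, int> row;
        std::istringstream tokens(opinion);      // opinions are assumed already stemmed and lower-cased
        std::string term;
        while (tokens >> term) {
            if (stopWords.count(term) == 0) {
                ++row[term];                     // term frequency within this opinion
            }
        }
        counts.push_back(row);
    }
    return counts;
}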

5.

RESULTS

To create an automatic prediction system, the data set created in Section 4 was divided into two parts, training and testing, with an 8:2 ratio. The sets have the same distribution of classes and a single opinion cannot occur in both of them. All the analysis is done according to [12] and [13]. The training data set is used for creating a generic model of the system. In this phase different algorithms and their parameters are evaluated. To select the best one, the AUC metric is used. It uses the area under the receiver operating characteristic (ROC) curve [11] to estimate the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one. This value varies from 0.5 to 1, where 0.5 means that the classifier is worthless and 1 means it is excellent. To assure that the results are deterministic, 10-fold cross-validation was performed (the random number generator was also set to a fixed value). The results of training the GBM and RF models are presented in Fig. 2 and Fig. 3. The three algorithms were tested using 10-fold cross-validation and the best combination of parameters was selected using the ROC metric. The results are presented in Table 1.

Table 1. Training results

Classifier   Parameters       ROC     ROC SD
NB           -                0.898   0.009
GBM          T = 499, K = 3   0.983   0.002
RF           mt = 12          0.980   0.001

Each algorithm is finally checked using various metrics to spot some tradeoff between them. When evaluating the testing dataset (which contains about 6 000 unseen opinions during training phase) metrics will be: • overall accuracy,



Fig. 2. Training GBM model

Fig. 3. Training Random Forest model

• Cohen’s Kappa coefficient, • positive opinions accuracy, • negative opinions accuracy, • average time needed for making a single prediction.



The classifiers were finally evaluated on previously unseen opinions. Table 2 presents the collected results. The average time was calculated by measuring how long it takes to predict the sentiment of each single opinion in the testing data set. Additionally, Figure 4 presents the top 10 most important features (most influential words) for the GBM classifier.

Table 2. Testing results

Class.   Acc     Kappa   POS Acc   NEG Acc   AVG time
NB       0.814   0.628   0.780     0.858     16.59 ms
GBM      0.935   0.870   0.935     0.935     2.54 ms
RF       0.930   0.860   0.952     0.910     9.51 ms

Fig. 4. 10 most important features for GBM

6.

CONCLUSIONS AND FUTURE WORK

The training results were very promising. The simplest method, Naive Bayes, has the lowest ROC score, but computationally it was the cheapest one. The other two are almost equal; they are more advanced and the time needed to train them was also drastically higher than for NB. Looking at the results presented in Table 1 and Table 2, it is clear that the GBM classifier performs best. Only RF is slightly better when


predicting positive opinions. The NB model seems to give the worst results and has the largest response time, but taking into consideration the fact that it is based on simple mathematics, its score is also impressive. There are some improvements that could be made:
• the RF average response time could be improved by using fewer than 500 trees for making predictions,
• opinions could be labelled using three states: positive, negative and neutral; this new class could perhaps handle cases that are hard to classify,
• a single feature (word) can also be a collocation of n words (this is called n-gram text preprocessing); knowing popular collocations (e.g., never more instead of just never and more), the classifier could be aware of the context of the sentence,
• instead of calculating the term frequency (TF), a more sophisticated approach taking into consideration how many documents contain a certain word (TF/IDF) could be applied (a standard form of this weighting is recalled below),
• extra validation using large blocks of text (e.g., articles) could be performed to check whether the overall sentiment is still properly detected.
To fully automate the process of building a classification system for an end-user, one last element is needed: a Java application that takes a block of text as input, applies the pre-processing and stemming operations, communicates with the R script to evaluate the model and returns the result.
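As a reference for the TF/IDF suggestion above, a commonly used form of the weighting (a textbook definition, not taken from the paper) is

\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log \frac{N}{|\{d' : t \in d'\}|},

where tf(t, d) is the number of occurrences of term t in document d and N is the total number of documents; rare but frequent-in-one-document terms then receive higher weights than ubiquitous ones.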

REFERENCES [1] SHILAKES C.C. AND TYLMAN J. (1998) Enterprise Information Portals. Merrill Lynch. [2] GANTZ J. AND REINSEL D. (2011) Extracting Value from Chaos. IDC IVIEW. [3] BELLMAN R.E. (1957) Dynamic programming. Princeton University Press. [4] KUHN M. (2008) Building Predictive Models in R Using the caret Package. Journal of Statistical Software 28(5). [5] FRIEDMAN J.H. (2001) Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics 29(5):1189-1232. [6] FRIEDMAN J.H. (2002) Stochastic Gradient Boosting. Computational Statistic and Data Analysis 38(4):367-378.



[7] BREIMAN L. (2001) Random forests. Journal Machine Learning 45 (2001) 5-32. [8] BIAU G. (2012) Analysis of a Random Forests Model. Journal of Machine Learning Research 13 (2012) 1063-1095. [9] MILKOWSKI M. AND WEISS D. (2014) Mofologik Distribution System v1.9.0. Morfologik. [10] GOLUB G.H. AND VAN LOAN, C. F. (1996) Matrix Computations (3rd ed.). Baltimore: Johns Hopkins. [11] FAWCETT T. (2006) An Introduction to ROC Analysis. Pattern Recognition Letters 27 (8): 861-874. [12] FEINERER I. (2014) Introduction to the tm package. Text Mining in R. [13] LANZ B. (2013) Machine Learning with R. Packt Publishing pp. 101-117.



Computer Systems Engineering 2015

Keywords: path optimization, hybrid algorithms, heuristic

Piotr LECHOWICZ ∗

PATH OPTIMIZATION OF 3D PRINTER

3D printers are more and more often used to create prototypes. The process of creating a 3D object consists of creating a set of thin layers. Each layer is a set of points which have to be printed. This problem is a case of the travelling salesman problem. Layers can differ in density and the number of points placed on them. Different types of algorithms were implemented. Some of them were used to create hybrid algorithms; each of these consists of the greedy algorithm combined with one of the other ones, which are 2-opt, harmony search and simulated annealing.

1.

INTRODUCTION

3D printing, also known as additive manufacturing, is the process of making three-dimensional solid objects from a digital model [4]. It is commonly used in prototyping because it allows making 3D objects without the need to use forms. The main advantages are the low cost of the printer, the low cost of printing an object and the variety of materials which can be used. The whole process starts with making a virtual design of the object, either in a 3D modelling program or with the use of a 3D scanner. Three-dimensional printing uses an additive process, i.e., the object is created by laying down successive layers. Due to this fact the model is 'sliced' into hundreds or thousands of very thin horizontal layers. For each layer a path for the printing tool is chosen. The cost of the energy and/or the time needed to prepare the whole layer depends on the effectiveness of finding the optimal path.

2.

PROBLEM FORMULATION

There is given a layer to print which is described as an array of binary points, which has a size of a × b. Value of each point can be either 1, which means that it is a point to ∗

Advanced Informatics and Control, Wrocław University of Technology, Poland, e-mail: piotr.tobiasz.lechowicz@gmail.com



print, or 0 otherwise:

P_{x,y} = \begin{cases} 1, & \text{if printing point} \\ 0, & \text{otherwise} \end{cases}, \quad \text{where } x \le a,\ y \le b.     (1)

On each layer the printing tool has to visit exactly once each point which should be printed. The printing tool can also make moves without printing. The path made by the printing tool can be described by the order of the points to print:

V = [P_1, P_2, \cdots, P_n],     (2)

where n is the number of printing points. Each point P_i, where 1 ≤ i ≤ n, has corresponding coordinates (x_i, y_i) describing its placement on the layer. Assuming that the printing tool uses two perpendicular axes to move, the cost of the path can be considered in three different approaches: minimum distance, minimum time of printing and minimum energy. In all calculations the distance between two adjacent points is equal to 1 unit; the energy consumed by one engine and the time needed to move the distance of one point are also equal to 1 unit. Results in unit values are sufficient to show the differences between the efficiency of the various algorithms used to find the optimal path (e.g., [2]). The different costs of the path are described as follows:

• distance: the sum of the distances between succeeding points,

C_1 = \sum_{i=2}^{n} \sqrt{(x_{i-1} - x_i)^2 + (y_{i-1} - y_i)^2},     (3)

• time: the time needed to print two succeeding points is determined by the printing tool's engine which has to move farther,

C_2 = \sum_{i=2}^{n} \max(|x_{i-1} - x_i|, |y_{i-1} - y_i|),     (4)

• energy: the total consumed energy is equal to the sum of the energy consumed by each of the two separate engines,

C_3 = \sum_{i=2}^{n} \left( |x_{i-1} - x_i| + |y_{i-1} - y_i| \right).     (5)

The problem to solve is to find such an order of visiting points V, i.e., a Hamiltonian path, that the cost C of printing and the time needed to calculate it are minimized.
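The three criteria can be computed directly from a path, as in the C++ sketch below; this is an illustration of formulas (3)-(5) under the stated unit assumptions, not the project's actual code:

#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <vector>

struct Point { int x, y; };

// C1, formula (3): total Euclidean distance travelled along the path.
double distanceCost(const std::vector<Point>& path) {
    double c = 0.0;
    for (size_t i = 1; i < path.size(); ++i)
        c += std::sqrt(std::pow(path[i].x - path[i - 1].x, 2) +
                       std::pow(path[i].y - path[i - 1].y, 2));
    return c;
}

// C2, formula (4): printing time; the two axes move simultaneously,
// so each step costs as much as the longer of the two axis moves.
long timeCost(const std::vector<Point>& path) {
    long c = 0;
    for (size_t i = 1; i < path.size(); ++i)
        c += std::max(std::abs(path[i].x - path[i - 1].x),
                      std::abs(path[i].y - path[i - 1].y));
    return c;
}

// C3, formula (5): energy; each engine pays for its own movement,
// so each step costs the sum of the moves along both axes.
long energyCost(const std::vector<Point>& path) {
    long c = 0;
    for (size_t i = 1; i < path.size(); ++i)
        c += std::abs(path[i].x - path[i - 1].x) +
             std::abs(path[i].y - path[i - 1].y);
    return c;
}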


3.

ALGORITHMS

A travelling salesman problem is the NP-hard problem. Due to its computational complexity, with the increasing amount of points, the attempt of calculating the best solution, becomes highly time-consuming. However, there is a wide variety of algorithms which give approximated solution in reasonable time. In this section there are presented implemented algorithms which were used to solve this problem [1]. 3.1.

GREEDY ALGORITHM

A greedy algorithm is an algorithm which always makes the choice that looks the best at the moment. It makes a locally optimal choice with the hope of finding a global optimum. Commonly greedy strategy for the travelling salesman problem is as follows: at each stage visits an unvisited point nearest to the current city. In case of this particular problem of path finding in 3D printer, algorithm was slightly modified. First added modification changes the definition of locally optimal solution. Normally it is based only on the distance between two points, the point which is nearer is the better one than the farther one. However in this algorithm, due to the fact that the points are located in the regular grid, next optimal movement is also considered by the number of the adjacent, not yet printed points to the target one. Figure 1 shows an example layer with few points (marked as the grey ones). If optimal choice is made only by considering the distance, the path would look as it is shown in Figure 2 with the length L1 = 5. On √ the other hand, with the modification included, the path would have length L2 = 3 + 2. It is illustrated in Figure 3.

Fig. 1: Example layer to print

Fig. 2: Path number 1 for example layer

Fig. 3: Path number 2 for example layer

Comparing distances from Figure 2 and 3 we can notice that the point which is a little further but has only one adjacent point, is a better choice than point which is closer, but



has two adjacent points. With the increasing number of points the difference between paths can become significant, as it is shown in Figures 4 and 5.

Fig. 4: Path computed without considering number of adjacent points (the red one is an inefficient part of path)

Fig. 5: Path computed with considering number of adjacent points

The total length in Figure 4 is L3 ≈ 47.87 and in Figure 5 it is L4 ≈ 35.96. This way of proceeding helps to eliminate isolated points, which may otherwise make the final result unsatisfactory. That is why some parameters were introduced into the algorithm, describing how much it prefers points which are farther away but have a lower number of adjacent points. Suppose there is a point P1 which we currently consider as the next move from the current point P0. There is also a point P2 and we want to examine whether it is a better choice for the next move than the point P1. The distances from the current point to those points are equal to d[P1] and d[P2], and those points have a given number of adjacent points described by n[P1] and n[P2]. There are two parameters (weights), w[dist] and w[neigh]. Algorithm 1 shows their impact on the decision about the next move.

Algorithm 1 Adjacent points modification
1: if d[P2] ≤ d[P1] - w[dist]: take P2 as the better choice
2: else if d[P2] ≤ d[P1] + w[neigh] and n[P2] ≤ n[P1]: take P2 as the better choice
3: else: take P1 as the better choice

The second improvement introduced in the algorithm is a modification of the process of searching for the nearest unvisited point from the currently visited one. Most of the printing


layers have rather densely distributed points, e. g. most of them are located in the centre of the layer. In order to reduce time needed for checking distances to all possible points to find the nearest one, algorithm firstly searches the neighbourhood of currently visited point, it is places for possible points, which are not further than one unit in each side. In case of negative result, algorithm extends the searches to two units in each side. If in that process any point was obtained, algorithm starts looking for the next move by searching a list of all possible points. Figures from 6 to 9 illustrate this part of the algorithm.

Fig. 6: Starting position for searching neighbourhood

Fig. 7: First iteration of searching

Fig. 8: Second iteration of searching

Fig. 9: Move of the printing tool

In Figure 6 we have the starting point, which was the most recently visited one. The algorithm starts to search the neighbourhood of this point. Figure 7 shows the first iteration. Because the algorithm did not find any point, it extends the search, as presented in Figure 8. Because there are three possible candidates for the next move, the algorithm has to choose the

Algorithm 2 Greedy algorithm
1: for i = 0, i < N: fill in the number of neighbours n[Pi]
2: choose a starting point P0
3: create a list of unvisited points L[] := P1 · · · PN−1
4: set the number of unvisited points unv := N - 1
5: while unv > 0:
6:   if unv > B: search for the best next point in the neighbourhood (not farther than 2 units)
7:   if unv ≤ B or no point was found in step 6: search for the point in the list of all unvisited points
8:   unv := unv - 1; remove the chosen point from L[] and add it to the path



most optimal one. It considers both criteria, distance and number of adjacent points; the result is the upper one (Figure 9). This improvement is used by the algorithm only when the number of unvisited points is greater than a predefined parameter B. If the number of not yet visited points is lower than B, the algorithm searches for the next move in the list of all possible points. Algorithm 2 describes the process of finding a path. The optimal point for the next move is chosen by the criteria presented in Algorithm 1.
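The next-move rule of Algorithm 1 can be sketched in C++ as follows (an illustrative sketch with our own names such as isBetter and chooseNext; dist and neighbours play the roles of d[·] and n[·], and the candidate set is whatever the neighbourhood search of Algorithm 2 returns):

#include <vector>

struct Candidate { int index; double dist; int neighbours; };

// Decide whether candidate b is a better next move than the current best a,
// following Algorithm 1: a slightly farther point with fewer unprinted
// neighbours may still win, controlled by the weights wDist and wNeigh.
bool isBetter(const Candidate& b, const Candidate& a, double wDist, double wNeigh) {
    if (b.dist <= a.dist - wDist) return true;                          // clearly closer
    if (b.dist <= a.dist + wNeigh && b.neighbours <= a.neighbours)      // comparable, but more isolated
        return true;
    return false;
}

// Pick the best next point among the given candidates (e.g. those found in the
// 1- or 2-unit neighbourhood, or in the full list of unvisited points).
int chooseNext(const std::vector<Candidate>& candidates, double wDist, double wNeigh) {
    int best = 0;
    for (size_t i = 1; i < candidates.size(); ++i)
        if (isBetter(candidates[i], candidates[best], wDist, wNeigh))
            best = static_cast<int>(i);
    return candidates[best].index;
}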

3.2. TWO OPT ALGORITHM

This algorithm is based on the exchange or swap of a pair of edges. To swap two edges (a, b) and (c, d), the nodes of the pair of edges are rearranged as (a, c) and (b, d). The algorithm tries to untangle crossing edges. Two varieties of this algorithm were implemented. The first one searches for the first swap which will result in a shorter total path length and executes it; the procedure is repeated until no improvements are found. The second one differs in that at each iteration it searches for the best possible swap and then executes it.

Algorithm 3 Two Opt algorithm
1: find randomly a first solution V := [· · · , Pi, Pi+1, · · · , Pj, Pj+1, · · · ]
2: if the algorithm is greedy: find the first beneficial pair of edges (Pi, Pi+1) and (Pj, Pj+1)
3: else: find the most profitable pair of edges (Pi, Pi+1) and (Pj, Pj+1)
4: if no pair was found: exit from the algorithm
5: else: swap the edges, V := [· · · , Pi, Pj, · · · , Pi+1, Pj+1, · · · ], and go to 2)
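The edge swap used above amounts to reversing the sub-path between the two edges; a minimal sketch (ours, with an illustrative function name, on a path stored as a vector of point indices) is:

#include <algorithm>
#include <vector>

// Swap edges (path[i], path[i+1]) and (path[j], path[j+1]) by reversing the
// segment path[i+1..j]; the tour then visits ..., P_i, P_j, ..., P_{i+1}, P_{j+1}, ...
void twoOptSwap(std::vector<int>& path, int i, int j) {
    std::reverse(path.begin() + i + 1, path.begin() + j + 1);
}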

3.3.

SIMULATED ANNEALING

Simulated Annealing algorithm [3] is inspired by annealing process in metallurgy. It emulates the physical process in which a solid is firstly heated and then slowly cooled. With higher temperature, metal is more vulnerable to deformation because of providing energy needed to break bonds. While the temperature becomes lower and lower, it is

169


Algorithm 4 Simulated Annealing algorithm
1: find randomly a first solution V := [· · · , Pi, Pi+1, · · · , Pj, Pj+1, · · · ]
2: set this solution as the best one, best := V, and as the current one, current := V
3: for i = 0, i < N:
4:   select randomly two edges and swap them, next := swap(current)
5:   if d[next] < d[current], where d[path] is the total distance of the path: current := next; if d[current] < d[best]: best := current
6:   else: current := next with the probability calculated for this temperature
7:   i := i + 1
8: calculate the new temperature, temperature := cooling rate * temperature
9: if the temperature reaches the minimum: return the best solution
10: else: i := 0 and go to 3)

harder to deform the metal. In the case of the Simulated Annealing algorithm, the temperature determines the probability of taking a solution worse than the currently obtained one into further consideration. With lower temperature, the algorithm is less likely to take a worse solution. During the whole process the best found solution is kept in memory and replaced if a better one is obtained. New solutions are found by swapping two randomly chosen edges. After a predefined number of iterations N, the temperature is decreased according to the cooling rate. When the temperature reaches the earlier defined minimum, the algorithm stops. The probability of accepting a worse solution is calculated as follows:

e^{\frac{J(j) - J(i)}{T(t)}},     (6)

where J(i) is the total distance of the currently found solution, J(j) is the total distance of the best found solution and T(t) is the temperature at the time when the new solution was obtained.
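In code, the acceptance of a worse neighbour reduces to comparing a uniform random number with the probability above; the sketch below uses the common Metropolis-style form with the costs of the current and the candidate solutions (an illustration under that standard formulation, not the project's implementation):

#include <cmath>
#include <random>

// Accept a worse solution with probability exp((currentCost - nextCost) / temperature):
// the smaller the deterioration and the higher the temperature, the more likely it is accepted.
bool acceptWorse(double currentCost, double nextCost, double temperature, std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    return u(rng) < std::exp((currentCost - nextCost) / temperature);
}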


This procedure is presented in the algorithm 4. 3.4.

HYBRID ALGORITHMS

There were implemented two hybrid algorithms – Greedy Two Opt and Greedy Annealing. In both of them the first stage is to find path through Greedy algorithm. This path is passed as an initial path to Two Opt or Simulated Annealing algorithm respectively. The idea behind it, is such that it is much more efficient to improve path close to the optimal solution, than totally randomly selected. 3.5.

NUMERICAL EXPERIMENTS

Computational experiments were made to verify the effectiveness of the proposed algorithms. During experiments there were taken following assumptions: if algorithms are tested on the same layer they should be running with the same parameters, e. g. if Greedy algorithm and Simulated Annealing algorithm were used to find in a solution in a specific layer, the Greedy Annealing algorithm should run with the parameters the same as in each one of the partial algorithms. It allows to objectively decide whether there are advantages coming from hybrid algorithms. In this paper there are presented results of experiments which were done for three layers. They were chosen such that they have either different number of points or density. Figures from 10 to 12 present chose layers.

Fig. 10: Layer 1

Fig. 11: Layer 2

Fig. 12: Layer 3

Layer 1 presented in Figure 10 has many points and each one has many adjacent points. Table 1 presents the results. In Figures 13, 14 and 15 are shown example paths. As a second example is chosen a layer with widely spread points concentrated in small groups. It is presented in Figure 11. Table 2 presents the results. In Figure 16, 17 and 18 are shown example paths.they should



Fig. 13: Greedy solution for layer 1

Fig. 14: Greedy Two Opt solution for layer 1

Fig. 15: Greedy Annealing solution for layer 1

Table 1: Results for layer 1

Algorithm

Layer Number of points Time

Greedy Two Opt Greedy Two Opt Simulated Annealing Greedy Annealing

Fig. 16: Greedy solution for layer 2

Distance

Energy

2809 2705 2703 3745 2802

3194 2984 3017 4417 3159

2547 2518 2485 3388 2563

Fig. 17: Two Opt solution for layer 2

1 2448 Calculation Time [ms] 663 873271 27494 3071234 534441

Fig. 18: Greedy Annealing solution for layer 2

The third example is a layer with widely spread points. It is presented in Figure 12. Table 3 presents the results. In Figure 19, 20 and 21 are shown example paths.



Table 2: Results for layer sp-9 (layer 2, 404 points)

Algorithm             Distance   Energy   Time   Calculation Time [ms]
Greedy                     651      718    621                      24
Two Opt                    671      778    619                    4071
Greedy Two Opt             614      686    582                     210
Simulated Annealing        682      770    645                   52702
Greedy Annealing           606      685    568                   34044

Fig. 19: Greedy Two Opt solution for layer 3

Fig. 20: Simulated Annealing solution for layer 3

Fig. 21: Greedy Annealing solution for layer 3

Table 3: Results for layer sp-10 (layer 3, 411 points)

Algorithm             Distance   Energy   Time   Calculation Time [ms]
Greedy                    1282     1605   1139                      54
Two Opt                   1202     1517   1066                    4027
Greedy Two Opt            1165     1469   1031                     494
Simulated Annealing       1259     1580   1125                   51364
Greedy Annealing          1173     1465   1049                   16749

In the tables, the lowest cost for each criterion was marked with a pale yellow colour.



Figure 22 shows the calculation times of each algorithm in each layer. The y-axis is in logarithmic scale.

Fig. 22: Path calculation time for each layer

4. CONCLUSIONS

The paths for a layer in a 3D printer can be found using different algorithms. As can be noticed in the presented tables, the greedy algorithm finds a slightly worse solution than most of the other implemented algorithms, but much faster than them. Simulated annealing did not benefit from its ability to occasionally accept a worse solution. The two implemented hybrid algorithms are interesting. Using the result of the Greedy algorithm as the initial path for the Two Opt and Simulated Annealing algorithms reduces the time needed to find a solution (Figure 22) compared with using these two algorithms as stand-alone ones. Moreover, at the same time the calculated paths have lower costs. Obviously, the decision which algorithm should be used for a set of layers depends on the number of repetitions, i.e. the number of identical layers or entire 3D objects which have to be printed. The implementation of these algorithms was my own contribution to a two-person project, which can be seen on GitHub at https://github.com/piotrlechowicz/RSMproject.



REFERENCES
[1] KIM B.-I., SHIM J.-I. and ZHANG M., Comparison of TSP Algorithms, December 1998, [online], [accessed 08-June-2015], https://louisville.edu/speed/faculty/sheragu/Research/Intelligent/tsp.PDF.
[2] CORMEN T.H., LEISERSON C.E., RIVEST R.L. and STEIN C., Introduction to Algorithms, 2009.
[3] BERTSIMAS D. and TSITSIKLIS J., Simulated Annealing, 1993, [online], [accessed 08-June-2015], http://www.mit.edu/~jnt/Papers/J045-93-ber-anneal.pdf.
[4] WÓJCIK M., Path Optimization in 3D Printing, June 2014.



Computer Systems Engineering 2015

Keywords: composite pavements, reflective cracking

Evangelia MANOLA*

MODELLING OF REFLECTIVE CRACKING IN FLEXIBLE COMPOSITE PAVEMENTS

The rehabilitation of rigid (concrete) pavements with the placement of an asphalt overlay is a common maintenance technique which results in a structure known as a flexible composite pavement. The most common form of distress in this type of pavement is reflection cracking which can be due to traffic and/or climatic loading. A simplified mechanics-based approach identified from the literature has been developed to model the progression of thermal cracking from the base and surface of the asphalt overlay due to traffic and climatic loading. Verification/validation will be undertaken using data from the US Long-Term Pavement Performance (LTPP) database where specific in-service pavement sections have been closely monitored for a number of years allowing a comparison with the predictions to be made.

1. INTRODUCTION

Pavements can be considered as multi-layered structures consisting of materials in horizontal layers with a durable surfacing. The primary function of the layers is to transmit the loads to the underlying soils. Pavements generally belong to two broad categories: flexible and rigid pavements [9]. However, combining characteristics of both types in the design of a pavement has resulted in a type of pavement known as composite. The flexible composite pavement comprises a surfacing of bituminous material over cement bound material, which forms the road base layer. Reflection cracking is a specific distress found in composite pavements. The cracks that appear in the asphalt overlays are caused by discontinuities at similar positions in the pavement layers underneath. These discontinuities can be joints or cracks
__________
* DIGITS Research Group, De Montfort University, United Kingdom, e-mail: evangelia.manola@email.dmu.ac.uk



in an old cracked concrete pavement underneath [2]. As causes of reflection cracking, traffic loading and temperature cycles need to be considered. Reflective cracking can propagate in two ways, either from the bottom of the overlay upwards or in the opposite direction, from the top to the bottom. Researchers such as Scarpas et al. [5], Elseifi et al. [1] and Molenaar and Pu [3] have argued that reflection cracking occurs only from the bottom to the top, whereas others such as Nesnas [4] state that it occurs from the top downwards only. Both of these types, however, have been observed and confirmed, and both should therefore be considered. This study aims to develop an approach for modelling the long-term performance of flexible composite pavements. In order to accomplish this, a mechanical description of the distresses that develop in these types of pavements is necessary. Distresses such as reflective cracking due to vehicle loading, thermal reflective cracking due to climatic loading, and rutting (permanent deformations) are studied.

2. THEORETICAL MODELLING

Up to this point, a model for reflective cracking which has been identified and has the potential to be incorporated in the flexible composite pavement performance model is OLCRACK [6]. OLCRACK is a simplified computer program which aims to predict the crack growth development of bottom-up and top-down cracks, due to vehicular loading, in cement bound pavements with an asphalt overlay and with the presence of grid reinforcement. A fatigue approach is used for the calculation of a crack propagation rate and contributes to the estimation of the overlay life. The materials are characterized by elastic layer moduli, and simplified equations of static mechanics are used to model the pavements. The pavement is considered as a beam on an elastic foundation. With moment and force balances, the crack propagation rate can be estimated, and therefore the depth of the crack through the thickness of the asphalt. The original program was developed in Excel; in order to enhance the calculations and to make modifying the program easier, it was implemented in MATLAB. Modifications were made, such as the exclusion of the grid reinforcement characteristics, so that only a single layer of asphalt on top of a cement bound base was taken into account. The grid forces were removed and the model was readjusted with only shear and bending forces in the asphalt.

2.1 BOTTOM-UP CRACKING MODEL

Regarding the initiation of bottom-up cracking, the critical position of the wheel load is considered to be on top of the crack between the two slabs [8]. In Fig. 1 it is clearly seen that when the wheel load passes over from one slab (length L) to the other, it causes the slabs to deflect at that point (y1) and the asphalt to bend. So the critical position of the wheel load (P) for the development of cracking from the bottom to the surface of the asphalt layer is when it is directly on top of the crack or joint in the rigid layer. The bending of the asphalt causes tensile stresses to occur at the bottom of the asphalt layer at that point. If the tensile stresses exceed the tensile strength of the asphalt material, a crack may appear.

Fig. 1. Critical position of wheel for bottom-up cracking [8].

A specific model is used for the cracks that initiate at the bottom of the asphalt layer (Fig. 2). The asphalt layer is modelled as a single linear elastic material. The base of the pavement is rigid and consists of two slabs, the non-loaded ends of which act as hinges. The foundation is considered to be characterized by a modulus of subgrade reaction (k), which reveals the support of the underlying layers [7].

Fig. 2. Bottom-up cracking mechanism [6].

The calculated tensile strain can be used to determine the crack propagation rate. The crack rate can originate from a fatigue test, which is usually used to determine a pavement's life, and it follows the form of a typical fatigue characteristic. The crack depth through the thickness of the asphalt is then calculated by multiplying the crack rate by the number of wheel loads that have traversed the particular pavement section.
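The fatigue characteristic itself is not reproduced here. Purely as an illustration of its generic form, with $C$ and $n$ as material constants and $\varepsilon_t$ as the calculated tensile strain (notation assumed, not taken from OLCRACK), such a relation can be written as

```latex
\frac{dc}{dN} = C\,\varepsilon_t^{\,n},
\qquad
c \approx c_0 + \frac{dc}{dN}\,N,
```

where $dc/dN$ is the crack growth per wheel load, $N$ is the number of wheel loads that have passed, and $c$ is the resulting crack depth through the asphalt thickness.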

2.2 TOP-DOWN CRACKING MODEL

The model that is used for top-down cracking consists of three rigid slabs A, B and C (see Fig. 3). When the wheel load passes over the first crack the asphalt flexes. As the wheel continues along the pavement and the distance from the first crack grows, a differential movement takes place between slabs A and B. This results in tension at the left end of slab B, and the asphalt tends to detach from the underlying pavement at that point [6]. Regarding the initiation of cracking at the surface of the asphalt towards the base, the critical position of the wheel load (P) is not directly on top of the joint/crack but at a specific offset distance d from it, as can be seen in Figure 4 [8]. At that distance the asphalt is forced to bend, with the result that the surface of the asphalt layer on top of the nearby joint is stretched over a greater area and tensile stresses therefore develop.

Fig. 3. Critical position of wheel for top-down cracking [8].

In Figure 4 it can be seen that the ends of slabs A and C are considered to be hinged and the foundation underneath is modelled with a modulus of subgrade reaction k which describes the state of the subgrade. When the wheel passes over each joint shear forces are present in the asphalt and at the crack joints (f1, f2 and F1, F2 respectively) and deflections occur at the ends of slab A and C (y1, y2) as well as differential deflection between each pair of slabs A-B and B-C (δ1, δ2). Due to the flexing of the asphalt when a wheel load is at some offset distance from the first crack a moment M1 develops in the asphalt above the first joint (M2 when wheel load is at offset distance from second crack).



Fig. 4. Top down mechanism [6].

The maximum total tensile strain at the crack is used to calculate the crack propagation rate. The crack rate originates from a fatigue test, which is usually used to determine a pavement's life, and it follows the form of a typical fatigue characteristic. The crack depth through the thickness of the asphalt is then calculated by multiplying the crack rate by the number of wheel loads that have traversed the particular pavement section.

3. REFLECTIVE CRACKING MODEL APPLICATION

3.1 LONG TERM PAVEMENT PERFORMANCE DATABASE

An important objective of this project is to validate this model using data from the Long-Term Pavement Performance InfoPave program by comparing predictions to actual field measurements. The Long-Term Pavement Performance (LTPP) database includes data on specific pavement sites: construction data of the pavement sections (layers, materials), monitoring data with information about distresses, deflections and the profile of the pavement sections, the maintenance and rehabilitation actions on these sections, and the climatic and traffic data. Data such as material properties and traffic can be used as input to the model, and the output concerning the distresses of the pavement can be compared to the actual measured distresses in LTPP. The pavement sections selected for use in the verification belong to the Specific Pavement Studies (SPS) 6 category, which represents the Rehabilitation of Jointed Portland Cement Concrete (JPCC) Pavements. The total reflective cracking lengths over the years for each section in the states of Illinois and Indiana are shown in Figures 5(a-b). The vertical axis shows the total cumulative length of transverse cracking in metres on a specific section, and on the horizontal axis each point relates



to a survey date, showing the number of years from the start of the LTPP section. For both Indiana and Illinois the line charts show upward trends of the total reflective transverse cracking due to the traffic-induced loading over the years. The diagrams therefore confirm the existence of reflective transverse cracking, and the different number of years after the LTPP section initiation at which it started to develop for each section.

Fig. 5 (a-b). Total Reflective transverse cracking (Illinois, Indiana).

3.2 EXAMPLE SIMULATION

A description of applying the model for a typical section in Illinois follows. Specifically, the section consists of four layers in total: an asphalt overlay with thickness 102 mm on top of a Jointed Plain Concrete Pavement (JPCP) of thickness 254 mm, a subbase of gravel with a thickness of 178 mm and a subgrade which consists of Silty Clay. A cross-section of the pavement can be seen in Fig. 6.

Fig. 6. Section of modelled pavement in Illinois.

The program can calculate the crack lengths that develop in the thickness of the asphalt from the bottom of the asphalt layer to the top and from the top to the bottom at the points where there is a joint in the underlying pavement. The crack propagation



rates are calculated and, with the use of actual traffic loading data, the crack lengths are estimated. Every wheel load contributes to the reduction of the effective elastic modulus of the asphalt and therefore to the propagation of the cracks. After each month of traffic loading a new crack length through the depth of the asphalt is calculated. Applying the model to a pavement section in Illinois produced the output shown in Fig. 8a. The vertical axis shows the depth of the cracks in the asphalt for bottom-up and top-down cracking in metres, for one slab of the pavement, and the horizontal axis shows the cumulative number of wheel loads that have been applied on that section and have caused the respective crack propagation. In order to understand the result better, a cross-section of the asphalt layer is given in Fig. 7, which shows approximately the depth into which the cracks have propagated after traffic loading from 2009-2012. The vehicle loading from this period was not enough for the cracks to propagate through the full depth of the asphalt. The program calculated a depth of 15 mm for the top-down crack and 10 mm for the bottom-up crack.
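As a reading aid only, the monthly update described above can be sketched as follows; the names and the placeholder crack-rate function are hypothetical and stand in for the actual OLCRACK strain and fatigue calculations.

```python
def propagate_crack(monthly_wheel_loads, overlay_thickness_m, crack_rate_per_load):
    """Accumulate crack depth month by month until it reaches the overlay thickness.

    monthly_wheel_loads -- wheel-load counts, one entry per month
    overlay_thickness_m -- asphalt overlay thickness in metres (e.g. 0.102)
    crack_rate_per_load -- callable giving crack growth per load at the current depth
                           (stands in for the fatigue-based crack propagation rate)
    """
    depth = 0.0
    history = []
    for loads in monthly_wheel_loads:
        depth += crack_rate_per_load(depth) * loads    # crack rate x number of loads
        depth = min(depth, overlay_thickness_m)        # cannot exceed the overlay
        history.append(depth)
    return history

# Hypothetical usage: a constant rate of 5e-7 m per load over three years of traffic.
depths = propagate_crack([2000] * 36, 0.102, lambda current_depth: 5e-7)
```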

Fig. 7. Cross-section of crack depths in asphalt.

4. RESULTS

The reflective cracking model was also applied to pavement sections in the state of Indiana, and Fig. 8 a-b show the output diagrams of the reflective cracking. Due to the early stage of the research, it is not yet possible to interpret the results in comparison with the actual field distress data. However, some conclusions can be drawn concerning the number of years a particular section with distinct material properties needs to develop reflective transverse cracking. What can be concluded is that for the state of Illinois (Fig. 8a) the 7000 wheel loads applied over the whole 2009-2012 period were not enough for the cracks to reflect. On the other hand, for the state of Indiana (Fig. 8b)



around 20000 wheel loads over two years (2009-2010) caused the cracks from the bottom to reflect and to meet the surface cracks. Conclusions can also be drawn regarding the different crack depths that develop in the same state for bottom-up and top-down cracks under the same vehicle wheel loads. The program seems to estimate a greater crack depth for top-down cracks than for bottom-up cracks. This is especially true for the state of Indiana, where the top-down crack depth is almost double the bottom-up crack depth.

Fig. 8 (a-b). Total Crack Depth in asphalt for sections in Illinois, Indiana.

5. CONCLUSIONS AND FUTURE WORK

Regarding the distress of transverse reflective cracking, it is identified that its appearance on the asphalt overlay is certain. The time of its appearance depends on a number of factors, such as the material properties of each section as well as other characteristics of its construction. For example, the layer thicknesses and the elastic moduli play a significant role, especially those of the asphalt layer. A really important factor is the vehicle loading applied to a pavement, for which reliable data is available in the LTPP database. It is also worth mentioning that the program does not ignore the process of crack propagation but takes it into account with the help of the crack propagation rate. This rate is then used to calculate the crack length developed for a specific amount of vehicle loading. At the moment, the results are calculated on a monthly basis. OLCRACK is based on simplified mechanics equations and hypotheses, and is



still being investigated for further adjustments to fit the needs of my research project. It is implemented in MATLAB, therefore it is relatively easy to make adjustments to it and to interpret the results. For now, the early-stage results show that OLCRACK is a program capable of taking into account the different mechanisms behind the development of bottom-up and top-down cracking, since it is accepted in this research project that both exist and develop. The program calculates the crack propagation rate, followed by the crack depth reached in the asphalt overlay. In terms of future work, it is intended first of all to make a more accurate estimate of the elastic moduli of the actual layers, as this is considered a very important factor on which the development of reflective cracking depends. In order to achieve that, methods of backcalculation of elastic moduli are intended to be used; these are based on the calculation of theoretical deflections to match the actual surface deflections measured in the field. An issue that needs to be investigated is whether the differences in the output plots for each state depend mainly on the traffic data. That might be the case, because the input values for all four states lie in similar ranges. Environmental factors could have a strong influence on the results but are not yet incorporated in the program. Daily or seasonal temperature fluctuations, or average precipitation, could be taken into account in future work. Given the different climate zones to which each state belongs, a first step towards including environmental factors would be to take into account factors that represent these climate zones.

REFERENCES
[1] ELSEIFI M.A. and AL-QADI I.L., A simplified overlay design model against reflective cracking utilizing service life prediction. Road Materials and Pavement Design, 5 (2), 2004, pp. 169-191.
[2] MALLICK R.B. and EL-KORCHI T., Pavement engineering: principles and practice. CRC Press, 2013.
[3] MOLENAAR A. and PU B., Prediction of fatigue cracking in cement treated base courses. In: Proceedings of the 6th RILEM International Conference on Cracking in Pavements, 2008, pp. 191-199.
[4] NESNAS K. et al., A model for top-down reflection cracking in composite pavements. In: Fifth International RILEM Conference on Reflective Cracking in Pavements. RILEM Publications SARL, 2004, pp. 409-416.
[5] SCARPAS A. and DE BONDT A., Finite elements simulation of reflective cracking in asphaltic overlays. Heron, 41 (1), 1996.



[6] THOM N., A simplified computer model for grid reinforced asphalt overlays. In: Fourth International RILEM Conference on Reflective Cracking in Pavements - Research in Practice. RILEM Publications SARL, 2000, pp. 37-46.
[7] THOM N., Principles of pavement engineering. London: Thomas Telford, 2014.
[8] THOM N.H., CHOI Y.K. and COLLOP A.C., Top-down cracking, damage and hardening in practical flexible pavement design. In: Ninth International Conference on Asphalt Pavements, 2002.
[9] YODER E.J. and WITCZAK M.W., Principles of pavement design. New York: Wiley, 1975.



Computer Systems Engineering 2015

Keywords: audio features extraction, music genres recognition, machine learning

Bartłomiej Filip SUPERSON∗ Michał WANCKE†

A LOVE SONG RECOGNITION

Recognition of music genres is nowadays gaining more and more popularity. There are hundreds of programs, ranging from common music players to complex home entertainment studios, which offer the functionality of classifying music by its type and choosing only what the user really wants to hear. Unfortunately, not all popular music genres can yet be categorized by this software. This paper checks whether it is possible to recognize if an examined song is a love song or another type of song. For this purpose, 5 various types of classifiers were trained, tested and compared using a previously prepared dataset consisting of 50 love songs and 150 other songs, described by the 5 features (selected from 331 unique features) which had the biggest impact on classifier performance. The study shows that, using the 1-Nearest Neighbors Classifier, love songs can be recognized with an error rate lower than 8%.

1. INTRODUCTION

Nowadays, the functionality of music genre recognition is implemented in almost all common audio players and home entertainment studios and has triggered a lot of academic research [1, 3, 4, 5, 6]. It has become one of the most important features of this kind of software, because it allows users to easily listen only to the music genres they really want. Illustrative examples of such programs are Spotify, Winamp and Foobar2000. The mentioned applications give the user the possibility of quickly categorizing his set of music files, but the lists of music styles provided by them cover only the most popular ones. One of the audio types missing from these lists is the love genre. ∗ Student of Computer Science - Advanced Informatics and Control at Wrocław University of Technology, Poland, e-mail: bart.superson@gmail.com † Student of Computer Science - Advanced Informatics and Control at Wrocław University of Technology, Poland, e-mail: mic11w@gmail.com



The main goal of this paper was to check whether it is possible to recognize love songs on the basis of features extracted from the audio files. This paper gives producers advice in the area of love song recognition and may accelerate the evolution of their applications, which will finally give us the pleasure of using their software. To achieve the purpose described above, we prepared the Love Song Recognition Dataset (LSRD) and on its basis performed a comparative analysis of chosen popular classifiers:
• k-Nearest Neighbors Classifier (k-NN),
• Neural Network Classifier (NN),
• Multiple Classifiers System (MCS) - a combiner of k-NN, NN and the Bayes Normal Classifier,
• Support Vector Machine Classifier (SVM),
• Fisher Classifier (FC).
Finally, we propose the best of them for further analysis. To prepare LSRD, MIRtoolbox was used, which offers extraction of 331 different types of features: tonal, tempo, timbre, spectral, rhythm, fluctuation, dynamics, etc. (the global average and dynamic evolution values) [1], together with PRTools, which includes all the mentioned classifiers [2]. Fig. 1 presents an overview of the features which can be extracted from music using the above-mentioned toolbox.

2. RELATED WORK

In this section, recent work on various aspects of music genre recognition is briefly presented. In [3], it was shown how to perform the classification of the genres of music files using local features extracted from their spectrograms, in connection with their low-level characteristics, with the use of a Support Vector Machine Classifier. On the other hand, [4] also focused on the classification of music genres using features extracted from spectrograms, but there a dynamic ensemble of classifiers was tested. Markov and Matsui [5] proposed using Gaussian Processes for music genre classification; they also analyzed the possibility of music emotion estimation. In [6], music genre classification was performed with 78% accuracy using an ensemble of classifiers, but with only statistical features of the audio files as input.



3. PROBLEM FORMULATION

The problem examined in this paper was to find, among the selected classifiers, the best one, i.e. the one which provides the smallest mean error from a 3-times-repeated 10-fold cross validation (Err) (1-2) [7], the smallest area under the ROC curve (AuROC) (3) [8], and the highest sensitivity (Sens) (4) and specificity (Spec) (5) [9]. The formulas of those rates are as follows:

$E_k = \dfrac{\text{No. of incorrectly recognized songs from the } k\text{-th fold}}{\text{No. of all songs from the } k\text{-th fold}}$   (1)

$Err = \dfrac{1}{10} \sum_{k=1}^{10} E_k \times 100\%$   (2)

$AuROC = \int_{0}^{1} ROC \times 100\%$   (3)

$Sens = \dfrac{\text{No. of properly recognized love songs}}{\text{No. of all love songs}} \times 100\%$   (4)

$Spec = \dfrac{\text{No. of properly recognized other songs}}{\text{No. of all other songs}} \times 100\%$   (5)

Consequently, the final objective (6) is formulated as follows:

$\text{minimize}(Err, AuROC) \;\&\; \text{maximize}(Sens, Spec)$   (6)
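For illustration, the error rate (1-2) can be estimated with a 3-times-repeated 10-fold cross validation as in the sketch below. Python/scikit-learn is used here only as a stand-in for the MATLAB/PRTools setup actually used in the paper; X and y denote the assumed feature matrix and love/other labels.

```python
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def mean_error_percent(classifier, X, y):
    """Mean error (%) over 3 repetitions of 10-fold cross validation, as in (1)-(2)."""
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)
    accuracy = cross_val_score(classifier, X, y, cv=cv, scoring="accuracy")
    return 100.0 * (1.0 - accuracy.mean())

# Hypothetical usage with the 1-NN classifier on the LSRD features X and labels y:
# err = mean_error_percent(KNeighborsClassifier(n_neighbors=1), X, y)
```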

4. EXPERIMENTATION SYSTEM

To perform the experiments, a set of 50 selected love songs and 150 other songs in the form of MP3 files was prepared. This set was then used as the input of an experimentation system created in Matlab. Inside it, the MP3 files were converted to mono WAV files, downsampled to 11.025 kHz and normalized to 97% to speed up further feature extraction (the Acoustica MP3 to WAV converter was used to avoid mistakes when normalizing and downsampling the data). Then feature extraction using [1] was performed. Afterwards, with the use of [2], the features were scaled based on their variance, the 5 best among them were selected using forward selection based on 1-Nearest Neighbor leave-one-out classification performance, and LSRD was created. LSRD was then divided randomly into learning (LD) and testing (TD) datasets, which are afterwards used respectively to train the selected untrained classifiers (with selected



parameter values) and to test their performance. An illustration of the whole experimentation system is presented in Fig. 2.
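The scaling and forward feature selection step can be sketched in Python/scikit-learn as an analogue of the PRTools procedure (not the original code): variance-based scaling is approximated by standardisation, and the selection criterion is 1-NN leave-one-out classification performance. The 50/50 split proportion is an assumption.

```python
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import LeaveOneOut, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

def build_lsrd(features, labels, n_keep=5):
    """Scale the 331 extracted features and keep the 5 most useful ones."""
    X = StandardScaler().fit_transform(features)           # scale based on variance
    selector = SequentialFeatureSelector(
        KNeighborsClassifier(n_neighbors=1),               # 1-NN criterion
        n_features_to_select=n_keep,
        direction="forward",                               # forward selection
        cv=LeaveOneOut(),                                  # leave-one-out performance
    )
    X_selected = selector.fit_transform(X, labels)
    # Random division into learning (LD) and testing (TD) datasets (proportion assumed).
    return train_test_split(X_selected, labels, test_size=0.5, random_state=0)
```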

[Block diagram: the set of MP3 files (mix of love and other songs) enters the experimentation system in MATLAB, where it is downsampled to 11.025 kHz and converted to mono WAV (Acoustica MP3 to WAV); features are extracted (MIRtoolbox) and then scaled, selected and divided into training and testing datasets (PRTools); the selected untrained classifier with its chosen parameters (PRTools) is trained on the training dataset and evaluated on the testing dataset, yielding the classifier performance rates Err, AuROC, Sens and Spec.]

Fig. 1. Love Song Recognition Experimentation System

At the output of the experimentation system, the classifier performance rates (1)-(5) were calculated, on the basis of which the comparative analysis of the classifiers was performed and the best classifier was chosen using (6).



5. ALGORITHMS

Five untrained classifiers were chosen for analysis from the 25 available in PRTools. For each of them, preliminary tests were performed using LSRD to find their best input parameters: for each selected parameter value (the selection of classifiers and parameters to test was based on the advice in [2]), the mean error from a 3-times-repeated 10-fold cross validation (1-2) was calculated. In all preliminary tests, the parameter value of a classifier giving the lower error rate was considered the better one.

5.1. k-NN

For k-NN, one preliminary test was performed to find the best number of nearest neighbors. The result of this test is presented in Fig. 3.

Fig. 2. k-NN errors (1-2) for selected numbers of nearest neighbors for LSRD (error [%] vs. number of nearest neighbours in k-NN: 1, 2, 5)

According to the results presented in Fig. 3, in terms of (1-2) the best found number of nearest neighbors was 1, with a 7.3% error.

5.2. NN

For NN, two preliminary tests were performed to find the best number of neurons in the hidden layer and the best number of epochs. The results of these tests are presented in Fig. 4 and Fig. 5, respectively. According to them, in terms of (1-2) the best found number of neurons in the hidden layer was 5 and the best number of epochs was 500, with an 11.6% error.



Fig. 3. NN errors (1-2) for selected numbers of neurons in hidden layer for LSRD (error [%] vs. number of neurons: 1, 5, 10)

Fig. 4. NN errors (1-2) for selected numbers of epochs for LSRD (error [%] vs. number of epochs: 100, 200, 500)

5.3. MCS

For MCS, one preliminary test was performed to find the best type of combiner. The result of this test is presented in Fig. 6. According to it, in terms of (1-2) the best found combiner was the mean combiner, with a 7.3% error.

5.4. SVM

For SVM, two preliminary tests were performed to find the best order of the polynomial kernel and the best regularization parameter. The results of these tests are presented in Fig. 7 and Fig. 8, respectively.



Fig. 5. MCS errors (1-2) for selected types of combiners for LSRD (error [%] vs. type of combiner: Vote, Mean, Max)

Fig. 6. SVM errors (1-2) for selected orders of polynomial kernel for LSRD (error [%] vs. kernel order: 2, 3, 4)

According to the results presented in Fig. 7 and Fig. 8, in terms of (1-2) the best found order of the polynomial kernel was 3 and the best regularization parameter was 1, with a 12.3% error.

5.5. FC

For FC no preliminary tests were performed, because it has no parameters.

6. EXPERIMENTS AND THEIR ANALYSIS

The final comparative analysis of the classifiers was performed using the experimentation system described in the previous sections and the algorithms with their best found parameters; the classifiers were trained on LD and tested on TD.



Fig. 7. SVM errors (1-2) for selected regularization parameters for LSRD (error [%] vs. regularization parameter: 1, 5, 20)

Additionally, the most important features among the 331 extracted from the audio files turned out to be:
• the frequency of the maximal periodicity detected in the frame-by-frame evolution of the values of the second derivative of the mel-frequency cepstrum coefficients (MFCC, a representation of the short-term power spectrum),
• the normalized amplitude of that periodicity,
• the mean and slope of the first derivative of the MFCC,
• the mean roughness of the spectrum,
and these were included in LSRD [1]. Fig. 9 presents the errors (1-2) of the selected classifiers with their best found parameters. According to these results, the best classifiers in terms of (1-2) were, equally, 1-NN and MCS with 7.3%. Fig. 10 and Fig. 11 present, respectively, the ROC curves and the areas under them (3) of the selected classifiers with their best found parameters. According to these results, the best classifier in terms of (3) was 1-NN with 1.9%. Fig. 12 presents the sensitivities (4) of the selected classifiers with their best found parameters. According to these results, the best classifiers in terms of (4) were, equally, 1-NN and MCS with 93.3%.
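The MFCC-based features listed above were extracted with MIRtoolbox in MATLAB. Purely as an illustration, an analogous computation in Python with librosa (an assumption; it was not used by the authors, and the exact MIRtoolbox feature definitions differ) could look like this:

```python
import librosa
import numpy as np

def mfcc_derivative_features(path):
    """Frame-by-frame MFCCs with their first and second derivatives for one audio file."""
    y, sr = librosa.load(path, sr=11025, mono=True)   # mono, 11.025 kHz as in the paper
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    d1 = librosa.feature.delta(mfcc, order=1)         # first derivative of the MFCC
    d2 = librosa.feature.delta(mfcc, order=2)         # second derivative of the MFCC
    frame_means = d1.mean(axis=0)                     # frame-by-frame evolution of d1
    slope = np.polyfit(np.arange(len(frame_means)), frame_means, 1)[0]
    return {
        "mfcc_delta_mean": float(d1.mean()),
        "mfcc_delta_slope": float(slope),
        "mfcc_delta2_frames": d2,                     # input for the periodicity analysis
    }
```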



Fig. 8. Errors (1-2) of selected classifiers with best found parameters (error [%] for 1-NN, NN, MCS, SVM, FC)

Fig. 9. ROC curves of selected classifiers with best found parameters (Error II [-] vs. Error I [-] for 1-NN, NN, MCS, SVM, FC)

Fig. 13 presents the specificities (5) of the selected classifiers with their best found parameters. According to these results, the best classifiers in terms of (5) were, equally, 1-NN and MCS with 95.5%. Finally, based on (6), the best of the analyzed classifiers was 1-NN, which achieved a 7.3% error (1-2), a 1.9% area under the ROC curve (3), 93.3% sensitivity (4) and 95.5% specificity (5).


Fig. 10. Areas under ROC curves of selected classifiers with best found parameters (area under ROC curve [%] for 1-NN, NN, MCS, SVM, FC)

Fig. 11. Sensitivities of selected classifiers with best found parameters (sensitivity [%] for 1-NN, NN, MCS, SVM, FC)

Additionally, a statistical analysis of the obtained results was performed using re-sampled paired t-tests (7). According to the results obtained for each pair of the used classifiers, with 95% confidence none of the pairs performed the same. This means that the results obtained during this research are trustworthy [10].
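The statistical comparison can be reproduced with a paired t-test on matched per-fold error rates of two classifiers, for example as sketched below; err_a and err_b are assumed to come from the same cross-validation folds.

```python
from scipy.stats import ttest_rel

def significantly_different(err_a, err_b, alpha=0.05):
    """Resampled paired t-test on matched per-fold errors of two classifiers."""
    t_statistic, p_value = ttest_rel(err_a, err_b)
    return p_value < alpha    # True: reject equal performance at 95% confidence
```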



Fig. 12. Specificities of selected classifiers with best found parameters (specificity [%] for 1-NN, NN, MCS, SVM, FC)

7. CONCLUSION AND FUTURE WORK

In this paper, we compared 5 machine learning algorithms for love song recognition. All of the examined classifiers presented error rates lower than 12%, so they performed very well. Finally, the classifier which we propose for this problem is k-NN with 1 nearest neighbor as its input parameter, which in our research achieved a 7.3% error (1-2), a 1.9% area under the ROC curve (3), 93.3% sensitivity (4) and 95.5% specificity (5). The love songs used for learning in this paper were chosen by us, so for many people the obtained results may not be trustworthy because of their own taste; in further work, clustering should be considered to separate love songs from other songs instead of selecting them based on our musical tastes. Creating larger learning and testing datasets and checking more types of classifiers should also be considered.

REFERENCES
[1] LARTILLOT O. and TOIVIAINEN P., A Matlab toolbox for musical feature extraction from audio. In Proc. of the 10th International Conference on Digital Audio Effects (DAFx-07), 2007.
[2] DUIN R.P.W., JUSZCZAK P., PACLIK P., PĘKALSKA E., DE RIDDER D., TAX D.M.J. and VERZAKOV S., PRTools4.1, A Matlab Toolbox for Pattern Recognition. Delft University of Technology, 2007.
[3] COSTA Y.M.G., OLIVEIRA L.S., KOERICH A.L. and GOUYON F., Music genre recognition using spectrograms. In Proc. of the 18th International Conference on Systems, Signals and Image Processing (IWSSIP), 2011, pp. 1-4.
[4] COSTA Y.M.G., OLIVEIRA L.S., KOERICH A.L. and GOUYON F., Music genre recognition based on visual features with dynamic ensemble of classifiers selection. In Proc. of the 20th International Conference on Systems, Signals and Image Processing (IWSSIP), 2013, pp. 55-58.
[5] MARKOV K. and MATSUI T., Music Genre and Emotion Recognition Using Gaussian Processes. IEEE Access, vol. 2, 2014, pp. 688-697.
[6] CHATHURANGA D. and JAYARATNE L., Musical Genre Classification Using Ensemble of Classifiers. In Proc. of the 4th International Conference on Computational Intelligence, Modelling and Simulation (CIMSiM), 2012, pp. 237-242.
[7] HASTIE T., TIBSHIRANI R. and FRIEDMAN J., The Elements of Statistical Learning (2nd edition). Springer, Stanford, CA, USA, 2008.
[8] SPITALNIC S., Test Properties 2: Likelihood ratios, Bayes formula and receiver operating characteristic curves. Hospital Physician, October 2004, pp. 53-58.
[9] SPITALNIC S., Test Properties 1: Sensitivity, Specificity and Predictive Values. Hospital Physician, September 2004, pp. 27-31.
[10] SINGHI S., Statistical Significance Testing. Materials from the Machine Learning Lab, Arizona State University, Phoenix, USA, April 2005.


