34th International Conference
Mathematical Methods in Economics MME 2016
Conference Proceedings
Liberec, Czech Republic September 6th – 9th, 2016
© Technical University of Liberec 2016 ISBN 978-80-7494-296-9
Faculty of Economics, Faculty of Mechanical Engineering, Technical University of Liberec

Programme Committee
doc. RNDr. Helena Brožová, CSc.
doc. Ing. Jan Fábry, Ph.D.
prof. RNDr. Ing. Petr Fiala, CSc., MBA
doc. Mgr. Ing. Martin Dlouhý, Dr. MSc.
doc. Ing. Jana Hančlová, CSc.
prof. Ing. Josef Jablonský, CSc.
Ing. František Koblasa, Ph.D.
doc. RNDr. Ing. Ladislav Lukáš, CSc.
doc. Dr. Ing. František Manlig
prof. RNDr. Jan Pelikán, CSc.
doc. Dr. Ing. Miroslav Plevný
prof. RNDr. Jaroslav Ramík, CSc.
Ing. Karel Sladký, CSc.
Ing. Eva Šlaichová, Ph.D.
doc. Ing. Tomáš Šubrt, Ph.D.
doc. RNDr. Jana Talašová, CSc.
prof. RNDr. Milan Vlach, DrSc.
prof. RNDr. Karel Zimmermann, DrSc.
prof. Ing. Mikuláš Luptáčik, Ph.D.
prof. Dr. Sigitas Vaitkevičius
prof. Ing. Miroslav Žižka, Ph.D.
Editors Aleš Kocourek Miroslav Vavroušek
Technical editor Miroslav Vavroušek
Cover layout Aleš Kocourek
Contents

E. Adámek: The Role of Productivity in the Czech Republic: Does Balassa-Samuelson Effect Still Exist? ... 1
S. Alfiero, F. Elba, A. Esposito, G. Resce: Italian Saving Banks Efficiency, Is Unity Strength? Bank Groups versus Stand-Alone ... 7
O. Badura, J. Závacká: Are Remittances Important for the Aggregate Consumption Growth? ... 13
J. Bartoška: Semantic Model of Organizational Project Structure ... 19
J. Bartoška, P. Kučera, T. Šubrt: Modification of the EVM by Work Effort and Student Syndrome phenomenon ... 25
J. Bartošová, V. Bína: Equivalent Scale and Its Influence on the Evaluation of Relative Welfare of the Household ... 31
O. Beneš, D. Hampel: Evaluation of the Destructive Test in Medical Devices Manufacturing ... 37
L. Beranek: Sentiment Prediction on E-Commerce Sites Based on Dempster-Shafer Theory ... 43
V. Beránková: OCA Index: Results for the EU's Member States ... 49
D. Bílková: Trimmed L-Moments in Modeling Income Distribution ... 55
A. Borovička: Selection of the Bank and Investment Products via Stochastic Multi-Criteria Evaluation Method ... 61
M. Branda: A Chance Constrained Investment Problem with Portfolio Variance and Skewness Criteria - Solution Technique Based on the Successive Iterative Regularization ... 67
H. Brožová, I. Boháčková: Post-Crisis Development in the Agricultural Sector in the CR - Evidence by Malmquist Index and Its Decomposition ... 73
J. Brůha, J. Tonner, O. Vašíček: Modelling Effects of Oil Price Fluctuations in a Monetary DSGE Model ... 79
R. Bucki, P. Suchánek: Information Support for Solving the Order Priority Problem in Business Logistic Systems ... 85
A. Buczek: Modelling Legal Merger Time to Completion with Burr Regression ... 91
J. Buček: Fiscal Policy for the 21st Century: Does Barack Obama Effect the Real Economic Policy? ... 97
P. Budaj, M. Hrnčiar: Specifying the Use of Taguchi's Loss Function in Manufacturing and Service Sectors ... 103
P. Čekmeová: Labor Market Institutions and Total Factor Productivity An Evidence in the European Union ... 108
A. Černá, V. Přibyl, J. Černý: A Note on Optimal Network of Tourist Paths Reflecting Customer Preferences ... 114
D. Černá, V. Finěk: Adaptive Wavelet Method for the Black-Scholes Equation of European Options ... 120
L. Černá, V. Zitrický, J. Ponický: Income and Price Elasticity of Demand for Transport Services in Rail Passenger Transport in the Slovak Republic ... 126
M. Černý: Finite-Sample Behavior of GLFP-Based Estimators for EIV Regression Models: A Case with Restricted Parameter Space ... 132
Z. Čičková, I. Brezina, J. Pekár: Capacited Vehicle Routing Problem with Time Restriction Using Minimal Number of Vehicles ... 138
O. Čížek: Long-Run Growth in the Czech Republic ... 143
M. Dlask: Fractional Brownian Bridge as a Tool for Short Time Series Analysis ... 149
Z. Dlouhá: Earnings Effects of Job and Educational Mismatch in the Czech Graduate Labour Market ... 155
M. Dlouhý, M. Tichá: Multicriteria Voting Game and Its Application ... 160
M. Dorda, D. Teichmann, V. Graf: On Queueing Model of Freight Train Classification Process ... 165
M. Dvořák, J. Kukal, P. Fiala: Data Extension for Stock Market Analysis ... 171
S. Dvořáková, P. Jiříček: Multi-Parametric Simulation of a Model of Nonconventional Public Projects Evaluated by Cost-Benefit Analysis ... 177
K. Echaust: On the Extremal Dependence between Stock and Commodity Markets ... 183
M. Fałdziński, A. P. Balcerzak, T. Meluzín, M. B. Pietrzak, M. Zinecker: Cointegration of Interdependencies Among Capital Markets of Chosen Visegrád Countries and Germany ... 189
M. Fendek, E. Fendeková: Microeconomic Optimization Equilibrium Models of Demand and Supply for Network Industries Markets ... 195
E. Fendeková, M. Fendek: Models of Quantitative Analysis of the Competitive Environment in the Slovak Banking Sector ... 201
P. Fiala, R. Majovska: Efficient Project Portfolio Designing ... 207
T. Formánek, R. Hušek: On the Stability of Spatial Econometric Models: Application to the Czech Republic and Its Neighbors ... 213
J. Franek: Application of Selecting Measures in Data Envelopment Analysis for Company Performance Rating ... 219
L. Frýd: Granger Causality in Macroeconomics Panel Data ... 225
M. Gad, J. Jablonský, K. Zimmermann: Incorrectly Posed Optimization Problems under Extremally Linear Equation Constraints ... 231
M. Gavalec, K. Zimmermann: One Type of Activity Synchronization Problems ... 237
B. Gavurova, S. Korony: Slovak Day Surgery Beds Reduction on the Basis of Cost Frontier Function ... 242
T. Haloun, J. Marek, R. Rajmon, M. Kršíková: Analysis in a Time Series of Milk-Yield Production ... 248
J. Hanclova: A Two-Stage Data Envelopment Analysis Model with Application to Banking Industry in the Visegrad Group ... 254
S. Hašková: Evaluation of Project Investments Based on Comprehensive Risk ... 260
R. Hendrych, T. Cipra: On Comparing Various EWMA Model Estimators: Value at Risk Perspective ... 265
J. Hofman, L. Lukáš: Detection of Feasible Domains of Interpolation Parameters within Generalized EOQ Type Inventory Models ... 271
M. Holčapek, T. Tichý: Nonparametric Regression Via Higher Degree F-Transform for Implied Volatility Surface Estimation ... 277
P. Holeček, J. Talašová: A Free Software Tool Implementing the Fuzzy AHP Method ... 283
V. Holý: Impact of Microstructure Noise on Integrated Variance Estimators: A Simulation Study ... 289
M. Horniaček, Ľ. Šimková: Recursive Core of an OLG Economy with Production and Comprehensive Agreements ... 295
R. Horský: The Invertibility of the General Linear Process and the Structures in its Background ... 301
J. Hozman, T. Tichý: DG Approach to Numerical Pricing of Local Volatility Basket Options ... 307
B. Chramcov, R. Bucki, M. Trzopek: Model Based Control of Production Flows in the Serial Logistic Process ... 313
L. Chytilová, F. Zapletal: The Comparison of DEA and PROMETHEE Method to Evaluate the Environmental Efficiency of National Steel Sectors in Europe ... 319
L. Chytilová: Banking Efficiency in the Visegrád Group According to Data Envelopment Analysis ... 325
J. Jablonský, V. Sklenák: Multi-Period Analysis and Resource Allocation among Czech Economic Faculties ... 331
H. Jähn: Selected Problems of a Performance Analysis Approach for Networked Production Structures and Their Quantitative Solutions ... 337
O. Jajkowicz: Determinants of the Shadow Economy in the Selected European Union Member Countries ... 343
J. Janáček, M. Kvet: Designing a Robust Emergency Service System by Lagrangean Relaxation ... 349
M. Janáčková, A. Szendreyová: An Importance of the Population Density for The Location of the Emergency Medical Service Stations ... 354
J. Janák: Fractional Brownian Motion and Parameter Estimation ... 359
J. Kalina, B. Peštová: Some Robust Distances for Multivariate Data ... 365
V. Kaňková: A Note on Optimal Value of Loans ... 371
S. Kapounek, Z. Kučerová: Google Trends and Exchange Rate Movements: Evidence from Wavelet Analysis ... 377
N. Kaspříková: Economic Efficiency of AOQL Variables Sampling Plans ... 383
E. Kassem, O. Trenz, J. Hřebíček, O. Faldik: Mathematical Model for Sustainable Value Added Calculation ... 389
E. Keshavarz, M. Toloo: A Single-Stage Approach for Selecting Inputs/Outputs in DEA ... 395
Z. Kiszová: Employee Selection - Case Study Applying Analytic Hierarchy Process ... 401
F. Koblasa, M. Vavroušek, F. Manlig: Three Dimensional Bin Packing Problem in Batch Scheduling ... 407
A. Kocourek, J. Šimanová, J. Šmída: Modelling of Regional Price Levels in the Districts of the Czech Republic ... 413
J. Kodera, Q. V. Tran, M. Vošvrda: Discrete Dynamic Endogenous Growth Model: Derivation, Calibration and Simulation ... 419
M. Koháni: Robust Design of Tariff Zones in Integrated Transport Systems ... 425
V. Končiková, J. Buček, R. Barca: China and EU as Trade Competitors: Different Methodological Approaches ... 431
M. Kopa: Portfolio Selection Problem with the Third-Order Stochastic Dominance Constraints ... 436
E. Kotlánová, K. Turečková: Income Inequality in V4+ Countries at NUTS 2 Level ... 442
N. Kouaissah: Optimal Portfolio Selection with Different Approximated Returns ... 447
I. Krejčí, G. Koláčková, I. Tichá: Simulation Model of Czech Small Farming Business ... 453
A. Kresta, T. Tichý, M. Toloo: Evaluation of Market Risk Models: Changing the Weights in the DEA Approach ... 459
L. Krištoufek, M. Vošvrda: Capital Market Efficiency in the Ising Model Environment: Local and Global Effects ... 465
Z. Kučerová, J. Poměnková: The Role of Banking and Shadow Banking Sector in the Euro Area: Relationship Assessment via Dynamic Correlation ... 471
J. Kukal, Q. V. Tran: Estimation of Alpha Stable Distribution without Numerical Difficulties ... 477
M. Kuncová: Cost Minimization and Electricity Consumption Growth in the Czech Households in Case of the New Conditions ... 483
M. Kvasnička: Simulation Framework for Testing Piketty's Predictions ... 489
M. Kvet, J. Janáček: Lexicographic Fair Design of Robust Emergency Service System ... 495
P. Lachout: Discussion on Implicit Econometric Models ... 501
E. Litavcová, S. Jenčová, P. Vašaničová: On Modelling of Macroeconomic Context of Current Economic Development ... 506
L. Lukáš: American Option Pricing Problem Formulated as Variational Inequality Problem ... 512
T. Majer, S. Palúch: Rescue System Resistance to Failures in a Transport Network ... 518
M. Małecka: Evaluation of Parametric ES Tests ... 523
J. Málek, Q. V. Tran: Evaluating the Ability to Model the Heavy Tails of Asset Returns of Two Alternative Distributions ... 528
L. Marek: Comparison of Wages in the Czech Regions ... 534
P. Marek, T. Ťoupal, F. Vávra: Efficient Distribution of Investment Capital ... 540
A. Mastalerz-Kodzis, E. Pośpiech: Dynamic and Spatial Analysis of Economic and Labour Market Characteristics in European Countries ... 546
M. Matějka, T. Karel, J. Fojtík, P. Zimmermann: Mixing Mortality Data Approach Using Czech Cohort Data ... 552
K. Mičudová, L. Lukáš: Various Interest Rates Models Used within Real Options Valuation Tools ... 558
E. Mielcová: Application of Coalitional Values on Real Voting Data ... 564
M. Miśkiewicz-Nawrocka, K. Zeug-Żebro: The Application of Chosen Measures of Deterministic Chaos to Building Optimal Portfolios ... 570
M. Mojzes, J. Kukal, J. Bostik, M. Klimt: Annealing Based Integer Optimization Heuristic with Lévy Flights ... 576
M. Molnárová: Possible and Universal Robustness of Monge Fuzzy Matrices ... 582
H. Myšková: Interval Fuzzy Matrix Equations ... 588
M. Navrátil, P. Koťátková Stránská, J. Heckenbergerová: Sensitivity Analysis of MACD Indicator on Selection of Input Periods Combination ... 594
T. L. Nawrocki: Comparison of an Enterprise's Financial Situation Assessment Methods. Fuzzy Approach versus Discriminatory Models ... 600
D. Nejedlová: Prediction of Daily Traffic Accident Counts and Related Economic Damage in the Czech Republic ... 606
D. Němec, M. Macíček: Wage Rigidities and Labour Market Performance in the Czech Republic ... 612
S. Ortobelli, T. Lando, C. Nardelli: Parametric Rules for Stochastic Comparisons ... 618
S. Palúch, T. Majer: Vehicle Scheduling with Roundabout Routes to Depot ... 623
A. Pápai: Monetary Policy in an Economy with Frictional Labor Market - Case of Poland ... 629
J. Pavelec, B. Šedivá: Adaptive Parameter Estimations of Markowitz Model for Portfolio Optimization ... 635
O. Pavlačka, P. Rotterová: Fuzzy Decision Matrices Viewed as Fuzzy Rule-Based Systems ... 641
M. Pechrová, V. Lohr: Valuation of the Externalities from Biogas Stations ... 647
J. Pekár, L. Mieresová, R. Marian: Multiple Criteria Travelling Salesman Problem ... 653
J. Pelikan: Heuristics for VRP with Private and Common Carriers ... 658
R. Perzina, J. Ramík: Fuzzy Decision Analysis Module for Excel ... 663
Š. Peško, R. Hajtmanek: The Treatment of Uncertainty in Uniform Workload Distribution Problems ... 669
B. Petrová: Multidimensional Stochastic Dominance for Discrete Uniform Distribution ... 675
K. Piasecki: The Intuitionistic Fuzzy Investment Recommendations ... 681
R. Pietrzyk, P. Rokita: A Decent Measure of Risk for a Household Life-Long Financial Plan - Postulates and Properties ... 687
A. Podaras: Optimum Classification Method of Critical IT Business Functions for Efficient Business Impact Analysis ... 693
A. Pozdílková, P. Koťátková Stránská, M. Navrátil: Regression Analysis of Social Networks and Platforms for Tablets in European Union ... 699
P. Pražák: Dynamics Model of Firms' Income Tax ... 705
V. Přibyl, A. Černá, J. Černý: A Note on Economy of Scale Problem in Transport with Non-linear Cost Function of Transported Quantity ... 711
M. Rada, M. Tichá: Finite Game with Vector Payoffs: How to Determine Weights and Payoff Bounds under Linear Scalarization ... 717
J. Ramík: Strong Consistent Pairwise Comparisons Matrix with Fuzzy Elements and Its Use in Multi-Criteria Decision Making ... 723
V. Reichel, M. Hloušek: The Housing Market and Credit Channel of Monetary Policy ... 729
P. Rotterová, O. Pavlačka: Computing Fuzzy Variances of Evaluations of Alternatives in Fuzzy Decision Matrices ... 735
L. Roubalová, D. Hampel: The Main Inflationary Factors in the Visegrád Four ... 741
J. Rozkovec, J. Öhm, K. Gurinová: Posteriori Setting of the Weights in Multi-Criteria Decision Making in Educational Practice ... 747
L. Schaynová: A Client's Health from the Point of View of the Nutrition Adviser Using Operational Research ... 751
J. Schürrer: Wavelets Comparison at Hurst Exponent Estimation ... 757
J. Sixta, J. Fischer: Possibilities of Regional Input-Output Analysis of Czech Economy ... 762
V. Skocdopolova: A Comparison of Integer Goal Programming Models for Timetabling ... 768
K. Sladký: Transient and Average Markov Reward Chains with Applications to Finance ... 773
O. Sokol, M. Rada: Interval Data and Sample Variance: How to Prove Polynomiality of Computation of Upper Bound Considering Random Intervals? ... 779
J. Stoklasa, T. Talášek: On the Use of Linguistic Labels in AHP: Calibration, Consistency and Related Issues ... 785
V. Sukač, J. Talašová, J. Stoklasa: 'Soft' Consensus in Decision-Making Model Using Partial Goals Method ... 791
K. Súkupová, O. Vašíček: Modelling of an Asymmetric Foreign Exchange Rate Commitment in the Czech Economy. DSGE Model with Constraints ... 797
K. Surmanová, M. Reiff: Long-Term Unemployment in Terms of Regions of Slovakia ... 803
K. Szomolányi, M. Lukáčik, A. Lukáčiková: The Effect of Terms-of-Trade on Czech Business Cycles: A Theoretical and SVAR Approach ... 809
B. Šedivá: The Dynamic Behaviour of Wonderland Population-Development-Environment Model ... 815
O. Šimpach, M. Pechrová: Searching for Suitable Method for Clustering the EU Regions according to Their Agricultural Characteristics ... 821
E. Šlaichová, M. Rodrigues: A Computer-Based VBA Model for Optimal Spare Parts Logistics Management in a Manufacturing Maintenance System ... 827
E. Šlaichová, E. Štichhauerová, M. Zbránková: Optimization of the Transportation Plan with a Multicriterial Programming Method ... 833
J. Šnor: Similarity of Stock Market Series Analyzed in Hilbert Space ... 839
T. Talášek, J. Stoklasa, J. Talašová: The Role of Distance and Similarity in Bonissone's Linguistic Approximation Method - a Numerical Study ... 845
J. Talašová, J. Stoklasa, P. Holeček, T. Talášek: Registry of Artistic Performance - The Final State of the Evaluation Model ... 851
M. Tichá: Multi-Criteria Coalitional Game with Choice from Payoffs ... 857
P. Tomanová: Measuring Comovements by Univariate and Multivariate Regression Quantiles: Evidence from Europe ... 863
S. Vaitkevicius, V. Hovorková Valentová, E. Zakareviciute: The Use of Cluster Analysis for Development of Categorical Factors in Exploratory Study: Facts and Findings from the Field Research ... 869
K. Vaňkátová, E. Fišerová: Analysis of Income of EU Residents Using Finite Mixtures of Regression Models ... 875
P. Vašaničová, E. Litavcová, Š. Lyócsa: Co-Movement of Tourism Activity of V4 countries ... 881
P. Volf: On Newsvendor Model with Optimal Selling Price ... 886
J. Voříšek: Approximate Transition Density Estimation of the Stochastic Cusp Model ... 892
I. Vybíralová: The Impact of Labor Taxation on the Economic Growth ... 898
K. Warzecha: The Similarity of the European Union Countries as Regards the Production of Energy from Renewable Sources ... 904
M. Zámková, M. Prokop: The Evaluation of Factors Influencing Flight Delay at Popular Destinations of Czech Tourists ... 910
F. Zapletal, M. Šmíd: Decision of a Steel Company Trading with Emissions ... 916
J. Zbranek, J. Fischer, J. Sixta: Using Semi-Dynamic Input-Output Model for the Analysis of Large Investment Projects ... 922
P. Zimčík: Tax Burden and Economic Growth in OECD Countries: Two Indicators Comparison ... 928
M. Žižka: Effectiveness and Efficiency of Fast Train Lines in the Czech Republic ... 934
The Role of Productivity in the Czech Republic: Does Balassa-Samuelson Effect Still Exist?

Emil Adámek1

Abstract. The Purchasing Power Parity (PPP) theory claims that the price level (or its changes in the case of the relative version of PPP) is the only determinant of the development of the exchange rate. Nevertheless, it does not provide good empirical results, especially in the case of transitive economies. One possible explanation is the existence of the Balassa-Samuelson (BS) effect. It is one of the theoretical approaches which try to explain the development of the exchange rate and the price level. It proposes that faster growth of productivity leads to appreciation of the domestic currency and/or to growth of the domestic price level. The aim of this paper is to assess the role of the BS effect in the Czech Republic. Nevertheless, this approach includes some strong assumptions, such as the validity of the PPP theory for tradeable goods or, in the case of the "domestic" BS effect, the assumption that wages tend to be equalized across the tradable and non-tradable sectors. In this paper, the significance of the BS effect is tested even if some of these conditions are not valid. We use quarterly data (2000 – 2015) in regression analysis. The Ordinary Least Squares (OLS) method is used to assess the influence of productivity on the inflation rate.

Keywords: Balassa-Samuelson effect, price level, inflation rate, exchange rate, productivity, regression analysis, Czech Republic

JEL Classification: C22, E52, E58
AMS Classification: 62-07
1 Introduction
The Balassa-Samuelson (BS) effect was described in [1] and [14]. According to this approach, the prices of tradable goods and services are determined by arbitrage, so the relative version of PPP is valid for them. On the other hand, PPP is not valid in the case of non-tradable goods and services (because arbitrage is not possible). The theory states that faster growth of productivity in the tradable sector leads to faster growth of prices in the non-tradable sector. Nevertheless, this effect seems to be almost insignificant in the case of developed countries, because the productivity growth difference between two developed countries is negligible. On the other hand, transition economies have experienced rapid productivity growth. Moreover, the growth in the tradable sector exceeded the growth in the non-tradable one. That is why the discussion about the BS effect in transition countries arose in the nineties. A lot of studies try to estimate its influence in the case of Central European (CE) or Central and East European (CEE) countries. The aim of this paper is to assess the impact of the BS effect in the Czech Republic.
2 Theoretical Backgrounds

2.1 Computing BS effect
While computing the BS effect, at first the price levels (p) are decomposed into traded (p^T) and non-traded (p^N) components in the domestic and foreign (*) country:

p_t = \alpha p_t^T + (1 - \alpha) p_t^N,   (1)

p_t^* = \alpha^* p_t^{T*} + (1 - \alpha^*) p_t^{N*},   (2)

where t denotes time and the parameter \alpha is the share of traded components in the consumption basket. The real exchange rate (q) can be expressed as the relative price of goods produced abroad to those produced in the domestic country:
1 VŠB-Technical University of Ostrava, Faculty of Economics, Department of National Economy, Sokolská tř. 33, 701 21 Ostrava 1, Czech Republic, e-mail: emil.adamek@vsb.cz
q_t = (e_t + p_t^*) - p_t,   (3)

where e stands for the nominal exchange rate (CZK/EUR in this paper). By substituting equations (1) and (2) into (3), the change of the real exchange rate can be written as:

\Delta q_t = (\Delta e_t + \Delta p_t^{T*} - \Delta p_t^T) + [(1 - \alpha^*)(\Delta p_t^{N*} - \Delta p_t^{T*}) - (1 - \alpha)(\Delta p_t^N - \Delta p_t^T)].   (4)

PPP claims that the prices of tradable goods can be expressed as:

\Delta p_t^T = \Delta e_t + \Delta p_t^{T*}.   (5)

If it were true, the first term on the right side of equation (4) would disappear:

\Delta q_t = (1 - \alpha^*)(\Delta p_t^{N*} - \Delta p_t^{T*}) - (1 - \alpha)(\Delta p_t^N - \Delta p_t^T).   (6)
Assuming that the output of the economy (Y) is determined by a Cobb-Douglas production function, it can be expressed as:

Y_t^T = A_t^T (L_t^T)^{\gamma} (K_t^T)^{1-\gamma},   (7)

Y_t^N = A_t^N (L_t^N)^{\delta} (K_t^N)^{1-\delta},   (8)

where A represents productivity, L stands for labor, K is capital and the parameters \gamma, \delta denote the labor intensities in the tradable and non-tradable sector respectively. Under perfect competition the profit of the firm (\Pi) can be written as:

\Pi_t = Y_t^T(L_t^T, K_t^T) + P^{N/T} Y_t^N(L_t^N, K_t^N) - (W_t^T L_t^T + W_t^N L_t^N) - R_t(K_t^T, K_t^N),   (9)

where P^{N/T} is the ratio of prices in the non-tradable to the tradable sector, W denotes wages (the cost of labor) and R represents the interest rate (the cost of capital). The function described by equation (9) is maximized when conditions (10), (11) and (12) hold:

\partial Y_t^T / \partial K_t^T = P^{N/T} \, \partial Y_t^N / \partial K_t^N = R_t,   (10)

\partial Y_t^T / \partial L_t^T = W_t^T,   (11)

P^{N/T} \, \partial Y_t^N / \partial L_t^N = W_t^N.   (12)

By substituting equations (7) and (8) into these conditions, they can be written (in logarithmic form) as:

r_t = \log(1 - \gamma) + a_t^T - \gamma(k_t^T - l_t^T) = p^{N/T} + \log(1 - \delta) + a_t^N - \delta(k_t^N - l_t^N),   (13)

w_t^T = \log(\gamma) + a_t^T + (1 - \gamma)(k_t^T - l_t^T),   (14)

w_t^N = p^{N/T} + \log(\delta) + a_t^N + (1 - \delta)(k_t^N - l_t^N).   (15)
By solving equation (13) and substituting the results into (14) and (15) (assuming that (14) = (15)), the relative prices in the non-traded sector can be expressed as:

p^{N/T} = p_t^N - p_t^T = \left\{ \delta \left[ \log\gamma + \frac{1-\gamma}{\gamma}\log(1-\gamma) - \log\delta - \frac{1-\delta}{\delta}\log(1-\delta) + r_t\left(\frac{1-\delta}{\delta} - \frac{1-\gamma}{\gamma}\right) \right] \right\} + \frac{\delta}{\gamma} a_t^T - a_t^N.   (16)

Considering that c, which is the term in { } and includes interest rates and factor productivities, is constant, equation (16) can be written as:

p^{N/T} = p_t^N - p_t^T = c + \frac{\delta}{\gamma} a_t^T - a_t^N.   (17)
Equation (17) is the core of all estimated models. The particular forms of the models are described in Section 3.1.
2.2 Review of Empirical Literature
The paper [6] analyses the impact of economic catching-up on annual inflation rates in the European Union with a special focus on the new member countries of CEE. Using an array of estimation methods, the author shows that the Balassa–Samuelson effect is not an important driver of inflation rates. The authors of [2] investigate the relative price and relative wage effects of higher productivity in the traded sector compared with the non-traded sector in a two-sector open economy model. They highlight the role of wages and the imperfect mobility of the labor force. The paper [7] studies the Balassa–Samuelson effect in nine CEE countries. Using panel cointegration techniques, the authors find that the productivity growth differential in the open sector leads to inflation in non-tradable goods. The paper [12] uses time series and panel regression analyses to estimate the role of the BS effect in four CE countries (the Czech Republic, Slovakia, Poland and Hungary) and finds empirical evidence of the BS effect in these countries. The author of [5] finds a long-term cointegration relationship between the BS effect and relative prices in the Czech Republic, Hungary, Poland, Slovakia and Slovenia during the transition process.
3 Methodology and Data

3.1 Estimated Models
Models used in this paper are based on equation (17). The assumption that interest rates are constant seems to be strong. That is why all models are also tested with a term representing the interest rate differential (models with the interest rate differential are labeled as X1, where X denotes the model specification and X = A, B, C). As mentioned above, equation (17) also presumes that the wages in the tradable and non-tradable sectors ((14) and (15)) are equal. This presumption corresponds to the so-called Baumol-Bowen effect or domestic Balassa-Samuelson effect. This specification corresponds to Model A used in this paper. By substituting (17) into (4), the inflation rate differential between the domestic and foreign country can be expressed as:

\Delta p_t - \Delta p_t^* = \Delta e_t + (1 - \alpha)\left(\frac{\delta}{\gamma}\Delta a_t^T - \Delta a_t^N\right) - (1 - \alpha^*)\left(\frac{\delta^*}{\gamma^*}\Delta a_t^{T*} - \Delta a_t^{N*}\right).   (18)
Assuming that the labor intensities in the tradable and non-tradable sector are the same, Model A can be defined as:

(\Delta p^{CZ} - \Delta p^{EA})_t = \beta_1 \Delta e_t + \beta_2 [(1 - \alpha^{CZ})(\Delta a_T^{CZ} - \Delta a_N^{CZ})_t - (1 - \alpha^{EA})(\Delta a_T^{EA} - \Delta a_N^{EA})_t] + \varepsilon_t,   (19)

where CZ and EA represent the Czech Republic and the euro area, \beta are estimated coefficients and \varepsilon is the error term. Throughout the paper, the parameter \beta_2 stands for the BS effect and the term in brackets is labeled as bs. The assumptions of Model A are:
- labor intensity in the tradable sector equals labor intensity in the non-tradable sector,
- capital is mobile,
- labor is homogenous (in the tradable and non-tradable sectors) and mobile within the country, but not internationally,
- PPP holds for tradable goods.
The second form of the model (Model B) relaxes the assumption that labor is homogenous and completely mobile within the country; therefore wages do not necessarily tend to be equal in the tradable and non-tradable sector. By computing the capital-labor ratio from equations (14) and (15) and substituting it into (13), the relative prices can be expressed as:

p^{N/T} = p_t^N - p_t^T = c + \frac{1-\delta}{1-\gamma} a_t^T - a_t^N - (1 - \delta)(w_t^T - w_t^N).   (20)
The last term on the right side of equation (20) represents the wage differential. By substituting (20) into (4), the relative price differential can be written as:

\Delta p_t - \Delta p_t^* = \Delta e_t + (1 - \alpha)\left(\frac{1-\delta}{1-\gamma}\Delta a_t^T - \Delta a_t^N\right) - (1 - \alpha^*)\left(\frac{1-\delta^*}{1-\gamma^*}\Delta a_t^{T*} - \Delta a_t^{N*}\right) + (1 - \delta)(1 - \alpha)(\Delta w_t^{T*} - \Delta w_t^{N*}) - (1 - \delta)(1 - \alpha)(\Delta w_t^T - \Delta w_t^N),   (21)
and Model B is given by the equation:

(\Delta p^{CZ} - \Delta p^{EA})_t = \beta_1 \Delta e_t + \beta_2 [(1 - \alpha^{CZ})(\Delta a_T^{CZ} - \Delta a_N^{CZ})_t - (1 - \alpha^{EA})(\Delta a_T^{EA} - \Delta a_N^{EA})_t] + \beta_3 [(1 - \alpha^{EA})(\Delta w_T^{EA} - \Delta w_N^{EA})_t - (1 - \alpha^{CZ})(\Delta w_T^{CZ} - \Delta w_N^{CZ})_t] + \varepsilon_t,   (22)
where \beta_3 measures the influence of the wage differential and the term connected with wages is labeled as wdif. The assumptions of Model B are:
- labor intensity in the tradable sector equals labor intensity in the non-tradable sector,
- capital is mobile,
- labor is heterogeneous (in the tradable and non-tradable sectors) and/or not completely mobile within the country,
- PPP holds for tradable goods.
The third specification deals with the problem of the validity of PPP. There are reasons why PPP does not have to be valid, such as the reverse causality between the exchange rate and the price level, or the fact that the theory neglects expectations (see [13]). Authors such as [11] or [15] doubted its validity in the case of the Czech Republic using econometric techniques. If equation (5) does not hold, equation (18) will be as follows:

\Delta p_t - \Delta p_t^* = \Delta p_t^T - \Delta p_t^{T*} + (1 - \alpha)\left(\frac{\delta}{\gamma}\Delta a_t^T - \Delta a_t^N\right) - (1 - \alpha^*)\left(\frac{\delta^*}{\gamma^*}\Delta a_t^{T*} - \Delta a_t^{N*}\right).   (23)
A possible problem with this specification is that the price differential is explained by the price differential in the tradable sector; hence the danger of possible endogeneity arises. Model C is therefore defined as:

(\Delta p^{CZ} - \Delta p^{EA})_t = \beta_1 (\Delta p_T^{CZ} - \Delta p_T^{EA})_t + \beta_2 [(1 - \alpha^{CZ})(\Delta a_T^{CZ} - \Delta a_N^{CZ})_t - (1 - \alpha^{EA})(\Delta a_T^{EA} - \Delta a_N^{EA})_t] + \varepsilon_t.   (24)

The assumptions of Model C are:
- labor intensity in the tradable sector equals labor intensity in the non-tradable sector,
- capital is mobile,
- labor is homogenous (in the tradable and non-tradable sectors) and mobile within the country, but not internationally,
- PPP does not hold for tradable goods.
As mentioned above, the interest rate differential is added to all three specifications of the model. It was computed as follows:

idif = \Delta i^{CZ} - \Delta i^{EA},   (25)

where i is the interest rate and idif represents the additional term in each model. All models were estimated using regression analysis.
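To make the estimation step concrete, the sketch below shows how Model B, equation (22), could be estimated by OLS with statsmodels. It is an illustrative reconstruction, not the authors' code: the file name and column names (de, bs, wdif) are assumptions standing for the quarterly 2000-2015 differentials described in Section 3.2, and the lags match those reported for Model B in Table 2.

```python
# Illustrative sketch (not the authors' code): OLS estimation of Model B, eq. (22).
# The CSV layout and column names are assumptions.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("bs_quarterly.csv")        # hypothetical file with the quarterly data
y = df["dp_cz_minus_dp_ea"]                 # inflation rate differential (GDP deflator based)
X = pd.DataFrame({
    "de": df["de"].shift(4),                # nominal exchange rate change, lag 4 (Table 2)
    "bs": df["bs"].shift(4),                # BS term of eq. (19)/(22), lag 4
    "wdif": df["wdif"].shift(1),            # wage differential term, lag 1
})
X = sm.add_constant(X)                      # each model includes an intercept

model_b = sm.OLS(y, X, missing="drop").fit()
print(model_b.summary())                    # the coefficient on "bs" is the estimated BS effect
```

The X1 variants would simply add the idif series of equation (25) as a further regressor.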
3.2 Data
Prices
There are two variables which measure prices in this paper. The first is the aggregate price level (P), which is the dependent variable in all forms of the model. It is measured as the GDP deflator. The other variable measures prices in the traded sector (PT) in Model C and is defined as the Producer Price Index (PPI) in manufactured products. The usage of two different price indexes should lower the danger of endogeneity of the model. A higher inflation rate in the tradable sector should cause a rise of the aggregate inflation rate. Data for both variables were gathered from the Eurostat Database [8].

Exchange rate
The exchange rate (E) is defined as the bilateral nominal exchange rate (CZK/EUR) at the end of the period. Higher values (depreciation) should cause a rise of the inflation rate. Data were obtained from the Eurostat Database [8].

Productivity and Wages
It is important to define the tradable and non-tradable sector for the variables dealing with productivity and wages. In the literature, there is a serious discussion about which sectors are tradable. The lack of available data implies that most authors consider either all industry ([5] or [9]) or just manufacturing [10] as the tradable sector in the case of transition countries. Manufacturing is considered as the tradable sector for the Czech Republic in this paper for both productivity and wages. The remaining sectors are considered as non-tradable.
Productivity (A) is defined as value added divided by the number of employees in the relevant sector. Wages (W) are computed as the ratio of wages and salaries to the number of employees in the relevant sector. An increase in both productivity and wages should lead to a rise of the inflation rate differential. Both series were seasonally adjusted using the Census X12 technique. Data were gathered from the Eurostat Database [8] based on the "Nomenclature générale des Activités économiques dans les Communautés Européennes" (NACE) classification.

Interest rates
One of the assumptions of the traditional approach to the BS effect is that the term c in equation (17) is constant. Some authors, such as [12], try to improve their models by including the interest rate differential in that equation. Because both the Czech National Bank (CNB) and the European Central Bank (ECB) target the inflation rate, the policy rates (I) are used instead of money market rates in this paper. This variable is the only one which should have a negative influence on the inflation rate differential. Data (at the end of the period) were obtained from ARAD – Data Series System [3] and from the ECB – Database [4].
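As an illustration of how these series can be put together, the following sketch builds sectoral productivity as value added per employee and the bs regressor of equation (19). It is a hedged example: the column names and the share parameters alpha_cz and alpha_ea are placeholders, not values taken from the paper.

```python
# Illustrative sketch: productivity A = value added / employees by sector, and the
# bs term of eq. (19). Column names and the alpha shares are assumptions.
import numpy as np
import pandas as pd

def log_growth(series: pd.Series) -> pd.Series:
    """Quarter-on-quarter log change, used as a proxy for the growth rate."""
    return np.log(series).diff()

def bs_term(df: pd.DataFrame, alpha_cz: float = 0.4, alpha_ea: float = 0.4) -> pd.Series:
    a_t_cz = log_growth(df["va_trad_cz"] / df["emp_trad_cz"])        # tradable (manufacturing), CZ
    a_n_cz = log_growth(df["va_nontrad_cz"] / df["emp_nontrad_cz"])  # non-tradable, CZ
    a_t_ea = log_growth(df["va_trad_ea"] / df["emp_trad_ea"])        # tradable, euro area
    a_n_ea = log_growth(df["va_nontrad_ea"] / df["emp_nontrad_ea"])  # non-tradable, euro area
    return (1 - alpha_cz) * (a_t_cz - a_n_cz) - (1 - alpha_ea) * (a_t_ea - a_n_ea)
```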
4 Results

4.1 Results of Unit Root test
Since a regression analysis is used in this paper, it is important that all variables are stationary. That is why the results of the unit root test are presented in this section. The Augmented Dickey-Fuller (ADF) test is used to assess the stationarity of all variables. The lag length is based on the Schwarz Information Criterion. The results are depicted in Table 1. It is obvious that none of these variables has a unit root, so they can be used in the regression analysis.

variable               t-statistic   probability
Δe                     -7,0098       0,0000
bs                     -10,6439      0,0000
wdif                   -5,1376       0,0001
idif                   -6,7683       0,0000
Δp_T^CZ - Δp_T^EA      -5,0596       0,0001
Δp^CZ - Δp^EA          -5,8363       0,0000

Table 1 Results of the ADF test
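For reference, a single ADF test of this kind can be reproduced with statsmodels as sketched below, re-using the hypothetical DataFrame from the earlier sketches; the lag length is selected by the Schwarz (BIC) criterion as in the paper.

```python
# Illustrative sketch: ADF unit-root test for one of the series in Table 1.
from statsmodels.tsa.stattools import adfuller

stat, pvalue, usedlag, nobs, crit, icbest = adfuller(df["bs"].dropna(), autolag="BIC")
print(f"ADF statistic: {stat:.4f}, p-value: {pvalue:.4f}")
```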
4.2 Estimated Models
In this section, the outputs of the models are presented. Since the variables can influence each other with a lag, all models were tested with delays as well. Since the horizon of monetary policy in the Czech Republic comes to six quarters, each variable was lagged up to 6 periods (t = 0, 1, ..., 6). Only the best results are depicted in Table 2. All variables except for the interest rate should have a positive impact on the dependent variable. Models labeled with a number include monetary policy rates as well. Nevertheless, this variable turned out to be insignificant in all specifications.

variable                 A          A1         B          B1         C          C1
Δe (4)                   0,2607***  0,2560***  0,1689*    0,1730*    x          x
bs (4)                   0,1928     0,1650     0,2369**   0,2067*    0,1508     0,1210
idif                     x          -0,0110    x          -0,0122    x          -0,0115
wdif (1)                 x          x          0,4023***  0,4074***  x          x
Δp_T^CZ - Δp_T^EA (2)    x          x          x          x          0,6125**   0,6324***
R2                       0,123      0,145      0,331      0,358      0,112      0,136
DW                       1,410      1,422      1,957      1,976      1,431      1,458
F-statistic              3,4943**   2,7717*    8,0815***  6,7050***  3,1493*    2,5766*

Note: The number of lags is in brackets; values of coefficients are depicted for each variable; the OLS method was used in all models; each model includes an intercept; *, **, *** denote the 10%, 5% and 1% significance level respectively.
Table 2 Results of estimated models

As concerns Model A, the impact of the BS effect was insignificant. Also, the model is significant only at the 5% (1% when interest rates are included) significance level. Moreover, the low value of the DW-statistic indicates positive autocorrelation.
Model B also includes wages. In this specification, the BS effect is significant. Moreover, all variables in this specification have the right sign and are significant. The variables explain 33.1 % of the variation of the inflation rate differential. The DW-statistic indicates no first-order autocorrelation in this model. Model C should be used in the case that PPP is not valid. Nevertheless, the results show that the only significant variable in this model is the price differential in the tradable sector. Based on these results, Model B seems to be the best. Based on the empirical results, we can say that a one percentage point increase of relative productivity growth is associated with an increase of about 0.24 percentage point of the relative inflation rate in the Czech Republic (compared to the euro area).
5 Discussion and Conclusions
The aim of this paper was to assess the influence of the BS effect in the Czech Republic. Six econometric models were estimated. We can divide them into three groups based on their assumptions. Model B seems to provide the best results. It implies that the growth of the inflation rate differential is influenced by nominal depreciation of the currency, faster growth in the domestic relative (tradable to non-tradable sector) productivity and faster growth in the domestic relative (tradable to non-tradable sector) wages. According to [2], the role of wages and imperfect internal mobility is important. Based on this model, we can assume that there is a positive relationship between the BS effect and relative prices. This finding is consistent with [5], [7] and [12]; nevertheless, the size of the BS effect estimated in this paper is lower than in these papers ([12] estimated coefficients of the BS effect of a size over 2).
Acknowledgements
This paper was financially supported within the VŠB - Technical University SGS grant project No. SP2016/101 "Monetary, Fiscal and Institutional Aspects of Economic Policy in Selected Countries".
References
[1] Balassa, B.: The Purchasing Power Parity Doctrine: A Reappraisal. Journal of Political Economy 72 (1964), 584–596.
[2] Cardi, O., and Restout, R.: Imperfect mobility of labor across sectors: a reappraisal of the Balassa-Samuelson effect. Pôle européen de gestion et d'économie (PEGE). Working paper 16. 2014.
[3] CNB: ARAD – Data Series System. [online database]. Praha: Česká národní banka. Available at: http://www.cnb.cz/docs/ARADY/HTML/index.htm. 2015.
[4] ECB: Key ECB interest rates. [online database]. Available at: http://www.ecb.europa.eu/stats/monetary/rates/html/index.en.html. 2015.
[5] Égert, B.: Estimating the impact of Balassa-Samuelson effect on inflation and real exchange rate during transition. Economic Systems 26 (2002), 1-16.
[6] Égert, B.: Catching-up and inflation in Europe: Balassa–Samuelson, Engel's Law and other culprits. Economic Systems 35 (2011), 208-229.
[7] Égert, B., Drine, I., and Lommatzsch, K.: The Balassa–Samuelson effect in Central and Eastern Europe: myth or reality? Journal of Comparative Economics 31 (2003), 552-572.
[8] EUROSTAT: Eurostat - Database. [online database]. Luxembourg: European Commission. Available at: http://ec.europa.eu/eurostat/data/database. 2015.
[9] Fisher, C.: Real currency appreciation in accession countries: Balassa-Samuelson and investment demand. Review of World Economics 140 (2004), 179-210.
[10] Jazbec, B.: Real Exchange Rates in Transition Economies. William Davidson Working Paper, 482. 2002.
[11] Kim, B., and Korhonen, I.: Equilibrium exchange rates in transition countries: Evidence from dynamic heterogeneous panel models. Economic Systems 29 (2005), 144-162.
[12] Lojschová, A.: Estimating the impact of the Balassa-Samuelson effect in transition economies. Reihe Ökonomie/Economic Series, 140. 2003.
[13] Mandel, M., and Tomšík, V.: Monetární ekonomie v malé otevřené ekonomice. 2nd ed. Praha: Management Press. 2008.
[14] Samuelson, P.: Theoretical Notes on Trade Problems. Review of Economics and Statistics 46 (1964), 145–154.
[15] Sadoveanu, D., and Ghiba, N.: Purchasing Power Parity: Evidence from Four CEE Countries. Journal of Academic Research in Economics 4 (2011), 80-89.
Italian Saving Banks Efficiency, Is Unity Strength? Bank Groups versus Stand-Alone

Simona Alfiero1, Filippo Elba2, Alfredo Esposito3, Giuliano Resce4

Abstract. This study investigates the efficiency of the Italian SBs sector over the 2012–2013 period. The measure of efficiency is estimated via SBM Data Envelopment Analysis. In the first stage, we evaluate the SBs' efficiency, while in the second we compare the performance of SBs belonging to bank groups with those stand-alone. To evaluate the impact of being part of a bank group we use Policy Evaluation tools, performing an impact evaluation with "controlled by a group" considered as the "treatment" variable and checking for relevant banking ratios. To deal with self-selection bias (heterogeneity in treatment propensity related to observable variables), we use PS Matching, estimating the average treatment effects with the Ichino-Becker propensity scores. The research novelty resides in the combined application of DEA and Policy Evaluation tools in this specific field. Results show that, when comparing SBs belonging to a bank group with stand-alone SBs, although the ATT is positive but not significant, we find no relevant differences between the SBs that are part of a group and the stand-alone ones. However, with reference to Technical Efficiency the stand-alone SBs experience the worst performance, while an insight into the inefficiency decomposition makes clear that their difficulties are due to managerial inefficiency. Finally, we present speculation, linked to real circumstances, with respect to the Italian SBs sector.

Keywords: Saving Banks, Efficiency, Data Envelopment Analysis, Policy Evaluation, Average Treatment Effect on the Treated

JEL Classification: G32 - G21 - C61 - C21 - O12
AMS Classification: 30C40 - 03B48 - 32S60
1 Introduction
The banking sector still plays a crucial role in countries' economies, and Saving Banks (SBs), as part of it, are fundamental for economic development, as in [24]. The global crisis and the weak economic recovery resemble what Hansen [26] described during the Great Depression as "secular" stagnation; in such periods the pursuit of efficiency is vital in supporting local economies mainly composed of SMEs and families. Over the years, SBs evolved into full-service banks and JSCs (joint stock companies), and currently they are full commercial competitors that maximize profits, being indistinguishable from other competitors, as per [24]. Focusing on Italy, local SBs had a great influence on small territories, being the driver of local economic development. Within this framework, the aim of our study is to investigate whether, and to what extent, belonging to a banking group improves performance in terms of efficiency.
2 Theoretical background and a brief note on the Italian SBs sector
Studies on the Italian banking sector efficiency are few, such as Favero-Papi [21] and Resti [33] on determinants such as scale inefficiencies and regional disparities, and one on Italian cooperative banks as per [6]. Casu and Girardone [14] compared results obtained in different countries, focusing on commercial banks. Since the work of Sherman and Gold [37], considered the first on the banking industry, scholars refer to the DEA technique as a useful tool to measure the relative efficiency of banks, as in [13] and [22]. Berger and DeYoung [12] and Williams [42] show that low levels of efficiency lead to a lack of credit monitoring and inefficient control of operating expenses. With respect to the role of management and efficiency, from a managerial perspective the "agency theory" as in [20] and [29] is relevant when referring to the separation of ownership and control, implying conflicts of interest between managers and shareholders. On the other hand, the "stewardship theory"

1 University of Torino, Department of Management, simona.alfiero@unito.it
2 University of Firenze, Department of Science for Economics and Business, filippo.elba@unifi.it
3 University of Torino, Department of Management, alfredo.esposito@unito.it
4 University of Roma 3, Department of Economics, giuliano.resce@uniroma3.it
argues that managers are reliable and there are no agency costs, as in [18] and [19]. This latter viewpoint is typical of the pre-financial-scandals period, a paradigm of some 10-15 years ago. Currently, banking efficiency is evaluated via financial ratios and parametric and non-parametric approaches, as in [8], with different results obtained by different methods, as shown by Cubbin and Tzanidakis [16], where the mean DEA efficiency is higher than in the OLS analysis. Thanassoulis [38] found that DEA outperforms regression analysis on the accuracy of estimates, but regression analysis offers a greater stability of accuracy, even if Resti [32] shows that both scores do not differ substantially. The Basel Committee [7] states that frontier efficiency measures are better at capturing the concepts of "economic efficiency", "overall economic performance" and "operating efficiency", providing a superior understanding of corporate governance topics. The main reasons for the use of the DEA technique are the small number of observations required and the fact that it takes into account multiple inputs and outputs simultaneously, compared to ratios, where one input relates to one output at a time, as per [38]. Italian SBs started operating in the nineteenth century as institutions in which the credit and social aspects lived together, rooted in social and economic goals. However, at the end of the twentieth century, due to sectoral regulatory developments, they turned into full joint stock companies operating under a complete commercial principle, taking into consideration both stakeholder values and shareholder value. The Bank of Italy Report for 2014 [3] states that in the 2008-2014 period bank employees and branches decreased by about 17,900 (-5.6%) and 3,400 (-9%) units because of distribution channels. As a contribution to the sector, at the end of 2014 Italian SBs number forty and hold 4,345 branches, more than 36 thousand employees, total assets of 206.2 billion and 144.4 billion in direct deposits. Currently, groups (which partly belong to the local territorial Foundations) own SBs to cover the link between the ancient SBs and local communities. The nature of SBs most of the time still reflects the variable influence of local political powers. Recently, after some financial scandals, the Italian Government [26] issued the Law Decree of 22 November 2015, n. 183 (a.k.a. "decreto salva banche") and chose to terminate four banks while setting up four new substitute entities. This was a de facto transfer of high socio-economic costs onto local territories, and families and countryside people lost all their money because they lacked the financial skills to understand the nature of the business and the risk level (a real fraud via the sale of a huge amount of subordinated debt). Interestingly, two recipients of the decree are SBs and part of our sample, as is another well-known troubled bank (CR Ferrara, CR Chieti and Carige). Currently, the SBs sector is mainly held by bank groups as a result of the mergers and takeovers of the last 10-12 years, which led SBs to the loss of independence and links to territories.
3 Data specifications and methodology
The analysis covers a sample of 36 Italian SBs over 2012-2013; the data source is Bankscope [4]. Four SBs were not included due to lack of data. We have 20 SBs that are part of a bank group (55% of the sample) and 16 stand-alone SBs (45%). In the first part of the analysis we evaluate the efficiency of SBs with the Tone [41] SBM model, while in the second part we compare the performances of the 20 SBs belonging to a bank group with the performances of the 16 stand-alone SBs via policy evaluation tools. Table 1 provides the descriptive statistics of the variables.

Variable | Type | 2012 Min | 2012 Max | 2012 Std. Dev. | 2013 Min | 2013 Max | 2013 Std. Dev.
Total Assets | Input | 799,70 | 49.322,00 | 10249,91 | 709,70 | 50.162,70 | 9740
Operating Expenses | Input | 17,20 | 1.187,40 | 237,193 | 19,80 | 1.009,90 | 195,1604
Impaired Loans | Input | 45,70 | 3.141,00 | 815,513 | 60,90 | 3.895,00 | 1062,465
Loans | Output | 703,40 | 35.128,10 | 7082,896 | 639,10 | 36.391,90 | 6844,524
Pre-tax Profit | Output | -205,10 | 163,20 | 50,3527 | -2.205,80 | 251,90 | 380,1767
Customer Deposits | Output | 410,50 | 22.018,80 | 4241,049 | 482,40 | 23.251,80 | 4396,648

Table 1 The 2012 and 2013 data in Mln. of € – Source: Bankscope - Bureau Van Dijk (2015)

The most widely used approach in banking efficiency measurement is the intermediation approach as per [36], which considers the bank as an intermediary transforming production factors into outputs. The alternative is the production approach, which uses traditional factors of production such as capital and labour to produce the number of accounts of loan and deposit services. In a mixed approach, deposits are taken into account as both inputs and outputs. Referring to the banking sector, Anouze [2] presented a table showing that the intermediation approach is the most frequently applied, while the mixed approach represents about one fifth of the studies. The variables chosen to measure the relative efficiency of the Italian SBs lie within the mixed approach and the intermediation one. Total assets, operating expenses and impaired loans are the input variables. Total assets is a proxy for bank size, and operating expenses (the sum of personnel and other operating expenses)
represents the labour input. Although many refer to NPLs, we chose Impaired Loans as a risk proxy, as in [33]: they need, in any case, to be reduced, and they envelop the potential risks that are likely to hit the banks and their stakeholders. On the output side we chose total loans, customer deposits (considered by many as an input and by others as an output) and pre-tax profits. The deposit variable is widely used as an input or an output, as per [40], with a prevalence for the output treatment, as in our case; the pre-tax profit output variable is able to encompass both traditional operational and extraordinary activities that may economically impact the financial statements. To measure efficiency we use DEA as in [5], a non-parametric technique, in the SBM version of Tone [41]; hence, the scalar t multiplies all the variables. The linear program is the following:

$$\min_{t,\lambda,S,z} \ \tau = t - P_z z$$
$$\text{s.t.}\quad t + P_y S = 1,\qquad x_0 = X\lambda + z,\qquad y_0 = Y\lambda - S,\qquad \lambda \ge 0,\ S \ge 0,\ z \ge 0 \tag{1}$$
where $x_0$ is the (3×1) vector of actual inputs of the bank under evaluation, $X$ is the (3×72) cost matrix of the bank sample (36 banks times 2 years), $z$ is the (3×1) vector of input excesses, $y_0$ is the (3×1) vector of outputs of the firm under evaluation, $Y$ is the (3×72) matrix with the outputs of all the firms in the sample, $S$ is the (3×1) vector of output slacks, $\lambda$ is the (72×1) intensity vector, and $P_z$ and $P_y$ are the (1×3) vectors that contain the weights.
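A minimal numerical sketch of program (1) is given below, assuming the data matrices and weight vectors are already available as NumPy arrays; the function name and interface are illustrative and not part of the original study. It solves the linear program exactly as stated above, with scipy.optimize.linprog, for a single evaluated bank.

```python
import numpy as np
from scipy.optimize import linprog

def sbm_efficiency(x0, y0, X, Y, P_z, P_y):
    """Solve program (1) for one bank; decision vector is [t, lambda, S, z]."""
    m, n = X.shape          # m inputs, n observations (here 3 and 72)
    s = Y.shape[0]          # s outputs (here 3)
    # objective: minimise  t - P_z . z
    c = np.concatenate(([1.0], np.zeros(n + s), -np.asarray(P_z, float)))
    A_eq = np.zeros((1 + m + s, 1 + n + s + m))
    b_eq = np.zeros(1 + m + s)
    A_eq[0, 0] = 1.0                          # t + P_y . S = 1
    A_eq[0, 1 + n:1 + n + s] = P_y
    b_eq[0] = 1.0
    A_eq[1:1 + m, 1:1 + n] = X                # X lambda + z = x0
    A_eq[1:1 + m, 1 + n + s:] = np.eye(m)
    b_eq[1:1 + m] = x0
    A_eq[1 + m:, 1:1 + n] = Y                 # Y lambda - S = y0
    A_eq[1 + m:, 1 + n:1 + n + s] = -np.eye(s)
    b_eq[1 + m:] = y0
    res = linprog(c, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (1 + n + s + m), method="highs")
    return res.fun                            # tau, the relative efficiency
```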
The result of the solution of linear program (1) is τ, the relative efficiency of the bank, where (1 − τ) is the Technical Inefficiency: the averaged distance from the Constant Returns to Scale frontier, which includes both Managerial and Scale Inefficiency. The first depends directly on the management, while the second is due to the dimension of the bank. The SB is efficient and has an optimal dimension if τ = 1. By changing the specification of problem (1) (adding eλ = t to the program), as suggested by [5], it is possible to measure the Managerial Efficiency τ_vrs and the Scale Inefficiency (τ_vrs − τ). The Managerial Inefficiency (1 − τ_vrs) is due to the management, being the averaged distance from the Variable Returns to Scale frontier. The Scale Inefficiency is due to the dimension, so generally in the short run the management does not have enough discretion to close this gap.

Uncovering the effects of being part of a group on efficiency is arduous, because the merger of a bank does not happen randomly. In order to evaluate this kind of influence, we perform an impact evaluation considering membership of a group as the "treatment" variable. Our check relies on a set of banking sector ratios such as Impaired Loans to Gross Loans [33], Loans to Customer Deposits [42] and [1], and Interest Income on Loans to Average Gross Loans [23]. Moreover, we add the Cost to Income Ratio ([9], [27] and [30]) and the Tier 1 Regulatory Capital ([31] and [11]). The last two ratios are significant because they are a proxy for cost efficiency and a benchmark for operational efficiency, while a high Tier 1 shown by banks during the subprime crisis led to higher returns and hence to substantial resilience. To address the self-selection bias (heterogeneity in treatment propensity that relates to the outcome variables), we refer to PS Matching as per Rosenbaum and Rubin [34]:

$$P(X) = \Pr(D = 1 \mid X) = E(D \mid X) \tag{2}$$

where $P(X)$ is the PS: the conditional probability of receiving the treatment, which in our case is being part of a group ($D = 1$), given certain characteristics ($X$). The variables taken into account for the PS estimates ($X$) are the above-mentioned ratios. Given the PS, the Average Effect of Treatment on the Treated (ATT) is estimated following Becker and Ichino [10] as:

$$ATT = E_{P(X)\mid D=1}\{E[Y(1)\mid D=1, P(X)] - E[Y(0)\mid D=0, P(X)]\} \tag{3}$$

where $Y(0)$ is the performance (efficiency) of the non-treated and $Y(1)$ is the performance of the treated with the same PS $P(X)$. To derive (3), given (2), we need two hypotheses: the Balancing Hypothesis and the Unconfoundedness Hypothesis. The Balancing Hypothesis states that observations with the same PS ($P(X)$) must have the same distribution of observable and unobservable characteristics ($X$), independently of the treatment status ($D$):

$$D \perp X \mid P(X) \tag{4}$$

The Unconfoundedness Hypothesis says that the assignment to treatment is unconfounded given the PS:

$$Y(0),\, Y(1) \perp D \mid P(X) \tag{5}$$

with $Y(0)$ and $Y(1)$ representing the potential outcomes in the two counterfactual situations of non-treatment and treatment. In our case, the outcomes are the Technical Efficiency and the Managerial Efficiency measures from
(1). As robustness checks, to estimate the ATT we use four matching methods: Stratification as per Rosenbaum and Rubin [34], Nearest Neighbour as per Rubin [35], Radius as per Dehejia and Wahba [17], and Kernel [25].
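As an illustration of the first step of this procedure, the following sketch estimates the propensity score with a logistic regression and computes a simple one-to-one nearest-neighbour ATT. It assumes a pandas DataFrame with the control ratios, a 0/1 group indicator and an efficiency score; all column and function names are illustrative, and the snippet covers only one of the four matching variants used here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# hypothetical column names for the control ratios described above
CONTROLS = ["impaired_to_gross_loans", "loans_to_deposits",
            "interest_income_to_loans", "cost_income", "tier1"]

def att_nearest_neighbour(banks: pd.DataFrame) -> float:
    """ATT of group membership on efficiency via nearest-neighbour PS matching."""
    X = banks[CONTROLS].to_numpy()
    D = banks["group"].to_numpy()        # 1 = part of a bank group, 0 = stand-alone
    Y = banks["efficiency"].to_numpy()   # e.g. the SBM score tau from program (1)
    # propensity score P(X) = Pr(D = 1 | X), equation (2)
    ps = LogisticRegression(max_iter=1000).fit(X, D).predict_proba(X)[:, 1]
    treated = np.where(D == 1)[0]
    control = np.where(D == 0)[0]
    # match every treated bank to the control bank with the closest PS
    diffs = [Y[i] - Y[control[np.argmin(np.abs(ps[control] - ps[i]))]]
             for i in treated]
    return float(np.mean(diffs))         # sample analogue of the ATT, equation (3)
```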
4 Italian SBs efficiency results and Group versus Stand-alone
Figure 1 highlights the level of SBM relative efficiency of the SBs that are part of a group and of the stand-alone ones.
Figure 1 Efficiency of Italian SBs over the 2012 – 2013 period

Figure 2 and Table 2 show the Technical Efficiency and the inefficiency decomposition. With reference to Technical Efficiency, the stand-alone SBs experience the worst performance, while an insight into the inefficiency decomposition clarifies that their difficulties are mainly due to managerial inefficiency. In terms of inefficiency, the dimension (Scale) of the stand-alone SBs allows for a better performance.

Figure 2 The Technical Efficiency and Inefficiencies of Italian SBs over the combined 2012 – 2013 period

SBs efficiency decomposition | Technical Efficiency | Managerial Inefficiency | Scale Inefficiency
Part of a group | 0,359470231 | 0,495401225 | 0,145128544
Stand-alone | 0,319707487 | 0,55351431 | 0,126778201

Table 2 The SBs efficiency decomposition over the 2012-2013 period

Figure 3 refers to the slacks. The SBs that are part of a group outperform the others when considering Loans and Customer Deposits, while the stand-alone SBs outperform with respect to Pre-tax Profit and Total Assets.

Figure 3 The Slacks (referring to the whole 2012 – 2013 period)

Yet we face the first self-selection problem: is the performance difference due to group policies or to the stand-alone performance itself? A merger or a takeover never happens by chance. Hence, we choose to compare these
performances by controlling for a set of variables using the PS Matching approach. Having estimated the PS and following the Becker-Ichino [10] algorithm, we detect the ATT, where the treatment is being part of a group. As Table 3 shows, the ATT is slightly positive but not particularly significant, revealing that once the economic controls are taken into account there are no significant differences between the 20 SBs that are part of a bank group and the 16 stand-alone ones.
Method | Technical Efficiency (ATT, Std. Err., 95% Conf. Interval) | Managerial Efficiency (ATT, Std. Err., 95% Conf. Interval)
Stratification | 0,167  0,132  0,196  0,377 | 0,271  0,144  0,068  0,616
Nearest neighbour | 0,097  -0,053  0,315 | 0,286  0,146  -0,147  0,473
Radius | 0,152  0,089  0,090  -0,076  0,271 | 0,114  0,089  -0,040  0,293
Kernel | 0,123  0,106  -0,087  0,271 | 0,176  0,136  -0,204  0,342

Table 3 The ATT results, Stand-alone (0) vs Group (1) - Bootstrap statistics, 100 replications
5 Conclusions
Our study shows the level of relative efficiency of Italian SBs and, when considering whether they belong to a group or not, we observe no significant differences. However, with reference to Technical Efficiency the stand-alone SBs experience the worst performance, while an insight into the inefficiency decomposition makes clear that their difficulties are due to managerial inefficiency. In addition, via policy evaluation tools, we find no relevant dichotomies between SBs belonging to groups and those that are not part of one. It is important to consider that, in the Italian framework, the driver that leads the influential local policy-makers and politicians, who steer the Foundations that formally own the SBs, to give up their power is usually a financial scandal. According to the results above, these statuses are in line with real circumstances such as the Italian Law Decree 183/2015 - the first application of the bail-in rules, even before the European Bank Recovery and Resolution Directive. Indeed, given that the ATT reveals no great differences, SBs currently seem to belong to a bank group or not on the basis of their previous financial conditions or because of financial crime scandals, which are always exceptional and not ordinary events. Mergers and takeovers are driven not only by common bad management behaviour but, actually, by unrevealed and covered-up financial crimes, concretely comparable to exogenous and unpredictable shocks. This is also an indirect confirmation that the mergers and takeovers of the last 10-12 years by large bank groups were mainly hidden rescue actions (widely known in Italy), since it would be illogical for the loss of independence by local political influencers and business groups to be due only to low management standards.
5.1 References
[1] Altunbaş, Y., Marqués, D.: Mergers and acquisitions and bank performance in Europe: The role of strategic similarities. Journal of Economics and Business, 60(3) (2008), 204–222. [2] Anouze, A.: Evaluating productive efficiency: comparative study of commercial banks in gulf countries, Doctoral thesis, University of Ashton, (2010). [3] Banca di Italia: Annual Report, https://www.bancaditalia.it/pubblicazioni/relazione-annuale/2014/, (2014). [4] Bankscope - Bureau Van Dijk, (2015). [5] Banker, R., Charnes, A., Cooper, W.: Some models for estimating technical and scale inefficiencies in Data Envelopment Analysis, Management science, 30, (1984), 1078–1092. [6] Barra, C., Destefanis, S., Lubrano Lavadera, G.: Regulation and the crisis: the efficiency of Italian cooperative banks, Centre for studies in economics and finance, working paper 38, (2013). [7] Basel Committee on Banking Supervision: Enhancing corporate governance for banking organisations. February, available at http://www.bis.org/publ/bcbs122.pdf, (2006). [8] Bauer, P., Berger, A., Ferrier, G., Humphrey, D.: Consistency conditions for regulatory analysis of financial institutions: a comparison of frontier efficiency methods. Journal of economics and business, 50(2) (1998), 85–114. [9] Beccalli, E., Casu, B., Girardone, C.: Efficiency and stock performance in European banking. Journal of Business Finance & Accounting, 33.1‐2 (2006), 245–262. [10] Becker, S.O., Ichino, A.: Estimation of average treatment effects based on propensity score. The Stata Journal vol., 2,4 (2002), 358–377. [11] Beltratti, A, Stulz, R.; Why did some banks perform better during the credit crisis? A cross-country study of the impact of governance and regulation. Charles A. Dice Cent. Work. Pap. 2009-12 (2009). [12] Berger, A.N., De Young, R.: Problem loans and cost efficiency in commercial banking. Journal of banking and finance 21, (1997), 849-870. [13] Berger A.N., Humphrey, D.B.: Efficiency of financial institutions international survey and directions for future research. European journal of operational research, 98 (1997), 175–212.
[14] Casu, B., Girardone, C.: A comparative study of the cost efficiency of Italian bank conglomerates. Managerial finance, 28(9) (2002), 3–23. [15] Charnes, A., Cooper, W., Rhodes, E.: Measuring the efficiency of decision-making units. European Journal of Operational Research, vol., 2 (1978), 429–444. [16] Cubbin, J., Tzanidakis, G.: Regression versus data envelopment analysis for efficiency measurement: an application to the England and wales regulated water industry. Utilities policy, 7(2) (1998), 75–85. [17] Dehejia, R.H., Wahba, S.: Propensity Score Matching Methods for Nonexperimental Causal Studies. The Review of Economics and Statistics, 84,1 (2002), pp.151-161. [18] Donaldson, L.: The ethereal hand: organizational economics and management theory. Academy of management review, 15 (1990), 369–381. [19] Donaldson, T., Preston, L.E.: The stakeholder theory of the corporation: concepts evidence, and implications. Academy of management journal, 20 (1995), 65–91. [20] Eisenhardt, K.: Agency theory: an assessment and review. Academy of management review, (1989) 57–74. [21] Favero, C.A., Papi, L.: Technical efficiency and scale efficiency in the Italian banking sector: a nonparametric approach. Applied economics, 27.4 (1995), 385–395. [22] Fethi, M.D., Pasiouras, F.: Assessing bank efficiency and performance with operational research and artificial intelligence techniques:a survey. European journal of operational research, n. 2 (2010), 189–98. [23] Foos, D., Norden, L., Weber, M.: Loan growth and riskiness of banks. Journal of Banking & Finance, 34(12) (2010), 2929–2940. [24] Gardener, E.P.M., Molyneux, P., Williams, J., Carbo, S.: European savings banks: facing up the new environment. International journal of banking marketing, vol. 15(7) (1997), 243–254. [25] Heckman, J.J., Smith, J., Clements, N.: Making the Most Out of Programme Evaluations and Social Experiments: Accounting for Heterogeneity in Programme Impacts. The Review of Economic Studies, 64 (1997), 487–535. [26] Hansen, A.H.: Economic progress and declining population growth. The American economic review, 29, 1, (1939), pp. 1-15. [27] Hess, K., Francis, G.: Cost income ratio benchmarking in banking: a case study. Benchmarking: An International Journal, 11(3), (2004), 303–319. [28] Italian Government, http://www.gazzettaufficiale.it/eli/id/2015/11/23/15g00200/sg, (2015). [29] Jensen, M.C., Meckling, W.: Theory of the firm: Managerial behavior, agency costs and capital structure. Journal of Financial Economics, 3, 4, (1976) pp. 305–360. [30] Mathuva, D.M.,: Capital adequacy, cost income ratio and the performance of commercial banks: the Kenyan scenario. The International Journal of Applied Economics and Finance, 3.2 (2009), 35–47. [31] Merle, J.: An Examination of the Relationship Between Board Characteristics and Capital Adequacy Risk Taking at Bank Holding Companies. Academy of Banking Studies Journal, 12.1/2 (2013):3A. [32] Resti, A.: Evaluating the cost efficiency of the Italian banking system: what can be learned from the joint application of parametric and nonparametric techniques. Journal of banking & finance, 21 (1997), 221–250. [33] Resti, A., Brunella, B., Nocera, G.: The credibility of European banks’ risk-weighted capital: structural differences or national segmentations? BAFFI CAREFIN Centre Research Paper, 2015, 2015-9. [34] Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for casual effects. Biometrika vol. 70, 1 (1983), 41–55. 
[35] Rubin, D.B.: Matching to remove bias in observational studies. Biometrics, Vol. 29, n.1 (1973), pp.159-184. [36] Sealey Jr., C.W., Lindley, J.T.: Inputs, outputs, and a theory of production and cost at depository financial institutions, The journal of finance, (1977), 1251–1266. [37] Sherman, H.D., Gold, F.: Branch operating efficiency: evaluation with data envelopment analysis. Journal of banking and finance, no. 2, 9 (1985), 297–315. [38] Thanassoulis, E.: A comparison of regression analysis and Data Envelopment Analysis as alternative methods for performance assessments. Journal of the operational research society, 44,11 (1993), 1129– 1144. [39] Thanassoulis, E., Boussofiane A., Dyson, R.G.: A comparison of Data Envelopment Analysis and ratio analysis as tools for performance measurement. Omega, international journal of management science, 24(3) (1996), 229–244. [40] Toloo, M., Barat, M., Masoumzadeh, A.: Selective measures in data envelopment analysis, Annals of operations research, 226.1 (2015), 623–642. [41] Tone, K.: A Slacks-based Measure of Efficiency in Data Envelopment Analysis. European Journal of Operational Research vol., 130 (2001), 498–509. [42] Williams, J., Gardener, E.: The efficiency of European regional banking. Regional Studies, 37(4) (2003), 321–330. [43] Williams, J.: Determining management behaviour in European banking. Journal of banking and finance, 28 (2004), 2427–2460.
Are remittances important for the aggregate consumption growth?
Ondřej Badura1, Jana Závacká2
Abstract. Remittances, though often overlooked, can be an important source of income and thus an important determinant of the consumption function even at the aggregate level. It is evident that the importance of remittances for the receiving economy is given not only by the relative number of emigrants, but also by its economic level. However, the most important factor is the share of remittances in the aggregate income of the receiving country. The aim of this study is to analyze the growth of aggregate consumption in terms of remittance flows. To achieve this objective, panel data regression is primarily used. To avoid great differences in economic conditions across the countries with a high share of remittances in their national income, the regression was run on selected Asian economies only. The results confirm the statistical significance of the growth of remittances for aggregate consumption growth in the panel regression; the same result was obtained from the simple regression with robust estimations. These findings concurrently support the idea that remittance flows deserve more consideration.
Keywords: remittances, consumption, Asian economies.
JEL Classification: E21, F24
AMS Classification: 91B42
1 Introduction
Remittances are usually defined as income received from family members abroad. Though they do not have to involve only family members, this is still the most common example. The really important thing here is the income nature of remittances. There is no doubt in professional economic circles that income is the key determinant of the consumption function of a single household as well as at the aggregate level. But income itself is not, or at least does not have to be, homogeneous, which also implies that different sources of the household budget may influence consumption spending differently. By dividing income into a few smaller and more uniform parts and by studying these pieces of the household budget separately, we can possibly find new relationships and patterns and thus significantly enrich our knowledge of consumer behavior. This is exactly what this paper is about. In particular, the aim of this study is to analyze the growth of aggregate consumption in terms of remittance flows. When we analyze the macroeconomic aspects of received remittances, there are some factors determining their importance, or rather their potential to significantly influence aggregate consumption spending. All of these factors, such as the relative number of emigrants (relative to the country's population) or the economic level of the considered country, lead to one thing - the share of remittances in the aggregate income of the receiving country. If this share is not high enough, any pattern found may be very interesting, but it still could not make any significant change in aggregate consumption behavior. That is why we choose only countries with this share of about 10% and more. In the further regression analysis we focus only on Asian economies, in order to incorporate their specific macroeconomic characteristics and to enable a better comparison of results from the individual regressions. When we mention economies with a high share of received remittances in their national income, it is not surprising that we talk mainly about developing countries. Given the huge remittance inflows relative to their gross domestic product (GDP), it can be expected that their effect on the characteristics of income or on the consumption behavior of local households is enormous, even at the aggregate level. This logical assumption is confirmed by many empirical studies. Of those whose aim was to analyze the impact of this type of international transfers on income, we can mention publications dealing with particular countries, such as Hobbs and Jameson [5], who deal with the influence of remittances on the distribution of income and poverty in Nicaragua,
1 VŠB - Technical University Ostrava, Department of Economics, Sokolská třída 33, 701 21 Ostrava, Czech Republic, e-mail: ondrej.badura@vsb.cz
2 VŠB - Technical University Ostrava, Department of Systems Engineering, Sokolská třída 33, 701 21 Ostrava, Czech Republic, e-mail: jana.zavacka1@vsb.cz
Lazarte et al. [8], who analyze the same issue using data from Bolivia, or the more general study of Koechlin and Gianmarco [7], where the authors prove a clear link between international remittances and income inequality using analyses of panel data from 78 countries worldwide. Further, Gibson, McKenzie and Stillman [4] can be mentioned, who provide another complex view of the impact of emigration and remittance flows on income and poverty in the developing world, discussing primarily the difficulties in estimating those effects. The influence of remittances directly on consumption or other characteristics of consumer behavior is then the major goal of papers like Incaltarau and Maha [6], analyzing it for Romania's macroeconomic environment, Yousafzai [9] or Ahmad, Khan and Atif [1], discussing the same problem on the example of Pakistan, or Zhu et al. [10] for data representing the rural areas of China; all of them finally prove the expected (weaker or stronger) influence of remittances on consumption characteristics. Gallegos and Mora [3] then come up with a long-term relationship between remittances, consumption and human development in Mexico. Again, we can mention studies with a far wider (international) range, for example the analysis of panel data by Combes and Ebeke [2], which, among other things, points to the fact that the inflow of remittances significantly reduces consumption instability. All of the studies mentioned above show and prove how remittances can influence aggregate consumption spending (and income distribution). The main goal of this paper is then to bring another small contribution to this field of knowledge, doing so for the environment of the Asian developing world. The paper is organized as follows. The data description and the model are introduced in Section 2. The estimation results are presented in Section 3. Section 4 concludes.
2 The model
2.1 The data
We have tested the significance of remittances on the consumption growth using yearly data for Armenia (1996-2014), Bangladesh (1996-2014), Georgia (1997-2014), Jordan (1996-2014), the Kyrgyz Republic (1996-2014), Nepal (1996-2014), Pakistan (1996-2014), the Philippines (1996-2014), and Sri Lanka (1996-2010). These countries were chosen according to their high share of remittance inflows on GDP and the availability of the data. The share of remittances on GDP expressed in real prices, together with its estimated mean value, can be seen in Figure 1.³

Figure 1 The share of remittances on GDP and its mean value (expressed in real prices) across countries

The data were downloaded from the World Bank database. Because of non-stationarity, the data were transformed into their growth rates. All time series used for the regression, together with the plot of the share of consumption on GDP, are presented in Figures 3 – 6 in the Appendix.
³ In some cases (for example the Kyrgyz Republic) the share of remittances on GDP was quite close to zero in the first years of observation, but its mean value is still relatively high, so even these countries are suitable for our analysis.
The regression is based on the following time series: growth of CPI (growth_cpi) – the series was computed as the annual growth rate of the Consumer Price Index (expressed with the baseline 2010 = 100); growth of consumption (growth_c) – the household final consumption expenditure (current US$) was first expressed in real prices of the year 2010 by dividing by the Consumer Price Index (2010 = 100), and then the annual growth was computed; growth of GDP (growth_gdp) – the annual growth of real GDP was computed from GDP (current US$) divided by the Consumer Price Index (2010 = 100); growth of remittances (growth_remittances) – the annual growth of received personal remittances (current US$) divided by the Consumer Price Index⁴ (2010 = 100).
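A minimal pandas sketch of this variable construction is shown below; it assumes a DataFrame indexed by country and year with the nominal World Bank series and the CPI rebased to 2010 = 100, and all column and function names are illustrative.

```python
import pandas as pd

def build_growth_series(df: pd.DataFrame) -> pd.DataFrame:
    """Deflate nominal series by the CPI (2010 = 100) and compute annual growth rates.

    df is expected to carry a (country, year) MultiIndex and the columns
    'consumption', 'gdp', 'remittances' (current US$) and 'cpi' (2010 = 100).
    """
    out = pd.DataFrame(index=df.index)
    out["growth_cpi"] = df.groupby(level="country")["cpi"].pct_change()
    names = {"consumption": "growth_c", "gdp": "growth_gdp",
             "remittances": "growth_remittances"}
    for col, name in names.items():
        real = df[col] / (df["cpi"] / 100.0)          # real values, 2010 prices
        out[name] = real.groupby(level="country").pct_change()
    return out.dropna()
```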
The regression was also controlled for the influence of changes in the short-term interest rate, the rate of unemployment and the lagged values of all explanatory variables; however, their presence in the model was rejected at the 10% level of significance. As for the unemployment rate, this result was quite expected. It is no secret that there is a strong interdependence between unemployment and the development of GDP. Thanks to this correlation it was highly probable that the share of explained variability in the model would not be significantly increased by adding the unemployment rate. Yet even this step was tested just to be sure, as well as the inclusion of lagged variables, whose influence was also not expected due to the annual nature of the data. In other words, even a delay of one year is probably too long for a consumption function. The short-term interest rate surely should have an impact on the development of household consumption, but this "interest rate channel" is significant primarily in advanced economies. That is why we attribute the statistical insignificance of the interest rate in our model to the developing nature of the selected countries and thus to not fully developed market mechanisms.
2.2 The model
The consumption growth expressed in real prices, together with its estimated mean value in the initial countries, can be seen in Figure 2. We can see that the mean value as well as the variance of the growth rate of consumption differ across the countries; thus, the estimation by panel regression was done.

Figure 2 Consumption growth and its mean value across countries
⁴ At first it is necessary to convert the indicators of consumption and remittances from nominal to real values by deflating by the CPI. First, the purpose of this work is to observe the impact of remittances on real household consumption, hence the explanatory variable must be in real terms; second, remittances in nominal terms already contain the development of the price level (through the CPI), therefore when we include the CPI itself among the regressors, it is necessary to deflate the remittances indicator, otherwise the development of the CPI would be included twice among the explanatory variables.
The Hausman test rejected the fixed effects in the panel regression at the 5% level of significance; hence, random effects were considered:

$$growth\_c_{it} = \beta_0 + \beta_1\, growth\_gdp_{it} + \beta_2\, growth\_remittances_{it} + \beta_3\, growth\_cpi_{it} + u_{it}^{RE} + \varepsilon_{it}^{RE} \tag{1}$$

where $\beta_0, \beta_1, \beta_2, \beta_3$ are model parameters, $u_{it}^{RE}$ is the between-entity error and $\varepsilon_{it}^{RE}$ the within-entity error. The Breusch-Pagan Lagrange multiplier test did not reject, at the 5% level of significance, the hypothesis that the variances across countries are zero; thus, a simple ordinary least squares (OLS) regression was done:

$$growth\_c_{it} = \beta_0 + \beta_1\, growth\_gdp_{it} + \beta_2\, growth\_remittances_{it} + \beta_3\, growth\_cpi_{it} + \varepsilon_{it}^{OLS} \tag{2}$$

where $\varepsilon_{it}^{OLS}$ is the error term of the OLS regression. The residuals from the OLS model were not autocorrelated, but homoscedasticity was rejected at the 5% level of significance; hence, robust estimations were used.
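A sketch of this estimation sequence is given below, assuming the growth variables from Section 2.1 are stored in a DataFrame with a (country, year) MultiIndex; the random-effects estimator comes from the third-party linearmodels package, and the object names are illustrative.

```python
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from linearmodels.panel import RandomEffects

# data: DataFrame with a (country, year) MultiIndex and the growth_* columns
y = data["growth_c"]
X = sm.add_constant(data[["growth_gdp", "growth_remittances", "growth_cpi"]])

re_res = RandomEffects(y, X).fit()          # random-effects GLS, equation (1)

ols = sm.OLS(y, X).fit()                    # pooled OLS, equation (2)
_, bp_pvalue, _, _ = het_breuschpagan(ols.resid, X)   # heteroscedasticity check
if bp_pvalue < 0.05:                        # rejected -> robust (HC1) standard errors
    ols = sm.OLS(y, X).fit(cov_type="HC1")

print(re_res.params)
print(ols.params, ols.bse, sep="\n")
```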
3 Estimation results
The estimation results are presented in Table 1 (standard errors are reported in parentheses and p-values in square brackets). Although the random effects in the panel regression were rejected, we also present the estimates from the panel regression, for comparison only. We can see that the results are quite stable and the estimates of the standard errors are similar.

Variable | Random-effects GLS regression | OLS regression
growth_gdp | 0.968 (0.041) [0.000] | 0.968 (0.054) [0.000]
growth_remittances | 0.028 (0.006) [0.000] | 0.028 (0.005) [0.000]
growth_cpi | 0.172 (0.097) [0.076] | 0.172 (0.093) [0.066]
const. | -0.016 (0.008) [0.045] | -0.016 (0.007) [0.027]
Number of observations | 157 | 157
R2 | | 0.796
Adjusted R2 | | 0.792
Table 1 Estimation results

The estimation results confirmed, at the 5% level of significance, the statistical significance of the growth rates of GDP and remittances in the model. The high value of the parameter on GDP growth signals the high sensitivity of consumption growth to the growth of GDP (an increase of GDP growth by one percentage point is connected with an increase of consumption growth of 0.968 percentage points, ceteris paribus). Considering that all countries in the panel are developing countries and that the short-term interest rate was not significant and hence dropped from the regression, the high value of this parameter only confirms that in these countries immediate consumption is mainly preferred to savings. This is perfectly in line with the significance of the growth of remittances in the model. Although the estimated parameter of the growth of remittances is only 0.028, its significance proved the importance of remittance inflows into these countries and their direct contribution to consumption growth. The statistical significance of inflation (growth of CPI) was confirmed at the 10% level of significance. The positive value of the parameter estimate (0.172) is in accordance with the expected impact of inflation, i.e. that with an increasing price level consumers rather prefer immediate consumption to savings.
4 Conclusion
The aim of this study was to verify the initial hypothesis that international remittances have an impact on the aggregate consumption of households, using a growth model. This assumption has actually been confirmed in the regression model for the Asian environment described above. As mentioned in the introduction, from the perspective of macroeconomic impacts it does not make much sense to examine the impact of the inflow of remittances in countries where these flows are not sufficiently large relative to the production output of the recipient country. That is why the model was estimated only for Asian countries with a high share of remittances on their GDP, which de facto represent developing economies. It is necessary to take into account that this fact substantially influences the explanatory power of the model itself. The results are not fully general, but valid only for the specific economic environment where international transfers like these play a significant role. First, in the developing world remittances are often an important part of household income and, through consumption spending, they can regularly increase the GDP of the economy and thus contribute to real economic convergence. In some cases, where remittances are also used for investment and savings (which leads to an increase in capital), this effect on long-term economic growth and development is even stronger. But this does not have to be the rule. If the receiving household gets used to these transfers too much, it can lower its natural need to procure the resources itself; in other words, it can lower its labor supply. Another problem lies in the potential real appreciation of the country's currency when the inflows of remittances are too high. For export-oriented economies, which developing countries usually are, this effect could be very dangerous. All of these aspects could be an important and interesting topic for further scientific research. In the end it is useful to emphasize that, despite the arguable effect of remittances on the economy in the long term, there is no doubt that in the short term they reduce poverty significantly. And because this happens mainly through consumption spending (most of the remittances received are spent on necessary goods), it is essential to explore this impact channel in the first place. The significance of remittances for consumption growth in this developing world is now confirmed, but the question of its precise nature and a deeper understanding of the patterns behind it still remain open, and thus a potential motive for an extension of this work.
Acknowledgements This paper has been funded by a grant from Students Grant Project EkF, VŠB-TU Ostrava within the project SP2016/112 "Remittances and their influence on macroeconomic and microeconomic indicators“ and the Operational Programme Education for Competitiveness – Project CZ.1.07/2.3.00/20.0296.
References [1] Ahmad, N., Khan, Z. U., and Atif, M.: Econometric Analysis of Income, Consumption and Remittances in Pakistan: Two Stage Least Square Method. Journal of Commerce 5 (2013), 1-10. [2] Combes, J. L., and Ebeke, Ch.: Remittances and Household Consumption Instability in Developing Countries. World Development 39 (2011), 1076-1089. [3] Gallegos, J. L. C., and Mora, J. A. N.: Remittances, Consumption and Human Development: Evidence of Mexico's Dependence. Análisis Económico 28 (2013). [4] Gibson, J., Mc Kenzie, D., and Stillman, S.: Accounting for Selectivity and Duration-Dependent Heterogeneity When Estimating the Impact of Emigration on Incomes and Poverty in Sending Areas. Economic Development and Cultural Change 61 (2013), 247-280. [5] Hobbs, A. W., and Jameson, K. P.: Measuring the Effect of Bi-directional Migration Remittances on Poverty and Inequality in Nicaragua. Applied Economics 44 (2012), 2451-2460. [6] Incaltarau, C., and Maha, L. G.: The Impact of Remittances on Consumption and Investment in Romania. Journal of European Studies 3 (2012), 61-86. [7] Koechlin, V., and Gianmarco, L.: International Remittances and Income Inequality: An Empirical Investigation. Journal of Economic Policy Reform 10 (2007), 123-141. [8] Lazarte, A., et al.: Remittances and Income Diversification in Bolivia's Rural Sector. Applied Economics 46 (2014), 848-858. [9] Yousafzai, T. K.: The Economic Impact of International Remittances on Household Consupmtion and Investment in Pakistan. Journal of Developing Areas 49 (2015), 157-172. [10] Zhu, Y., et al.: Where Did All the Remittances Go? Understanding the Impact of Remittances on Consumption Patterns in Rural China. Applied Economics 46 (2014), 1312-1322.
Appendix

Figure 3 Remittances growth and its mean value across countries
Figure 4 GDP growth and its mean value across countries
Figure 5 CPI growth and its mean value across countries
Figure 6 The ratio of consumption on GDP and its mean value across countries
Semantic Model of Organizational Project Structure
Jan Bartoška1
Abstract. The article discusses the use of semantic networks and the Analytic Network Process (ANP) method for quantifying a "soft" project structure. International standards of project management describe and set organizational, process and knowledge areas of project management for common practice, which can be designated as the so-called soft project structure. Although the prioritization of project roles in communication or of project documentation is key to the success of a project, a unique approach to quantifying the soft structure of a project has not yet been introduced. Semantic networks can be used in project management to illustrate and quantify links between internal and external objects of a project environment. Using the ANP method, it is possible to quantify elements even in a soft project structure, e.g. project roles, project documentation, and project constraints. The author of the paper loosely follows previous research on the application of semantic networks in the case of soft structures. This article aims to propose a procedure for displaying and quantifying the soft structure of a project.
Keywords: Project Management, Semantic Networks, Stakeholder Management, Analytic Network Process.
JEL Classification: C44
AMS Classification: 90C15
1 Introduction
Project management is a collection of many professional disciplines and skills in which, besides the traditional quantitative approach, the qualitative approach that forms the content of international standards is gaining in importance. International standards of project management, i.e. the PMBOK® Guide [9], PRINCE2 [6], ISO 21500:2012 [3] and IPMA ICB [1], describe a common practice for setting the organizational, instrumentation, process and knowledge management aspects of a project, giving rise to the so-called "soft" structure of the project, i.e. a partially ordered set of documents, roles, workflows and tools with a variable influence of the internal and external environment of the project. Although the prioritization of project roles in communication and of project documentation is key to the success of the project, a unique approach to quantifying the "soft" structure of the project has not yet been introduced. "Soft" structures of a project are usually just displayed in the form of knowledge maps or semantic networks. For potential quantification it is appropriate to focus particularly on the use of semantic networks. Semantic networks are conventionally utilized in ICT, especially in developing and maintaining software and web applications: e.g., Kim et al. [4] use a semantic model for viewing and extracting structure tags for creating and managing web sites, and Zhu and Li [18] focus on developing a semantic approach in IT for the resolution of the application and contextual levels. The use of semantic networks can lead to the improvement of the systems studied; such an example is a case study in the field of IT by Shing-Han, Shi-Ming, David and Jui-Chang [15]. Semantic networks are also used for expressing the relationships between users of information systems and the content of web servers, where the quantification of the significance of elements is subsequently conducted with the help of the PageRank method [7]. The use of semantic networks in project management is not yet prevalent. For the environment of IT project models, Schatten [14] proposes semantic social networks for the extraction and transfer of knowledge within the project team and the stakeholders of the project, whereas Williams [17] uses a semantic network for the display and analysis of success factors in the structure of mutual expectations and limitations of the project stakeholders. El-Gohary, Osman and El-Diraby [2], for example, indicate that insufficient or inadequate stakeholder involvement in the management of projects, programs and portfolios is the main reason for failure, and they further propose a semantic model as a tool to visualize and manage relationships and knowledge towards stakeholders in a multi-project environment.
1 Czech University of Life Sciences in Prague, Department of Systems Engineering, Kamýcká 129, Prague, Czech Republic, email: bartoska@pef.czu.cz
Following the example of their use in IT, it is therefore appropriate to continue developing semantic models to meet the needs of project management, i.e. the management structure of roles, documentation, and knowledge constraints in a project. This article aims to propose a procedure for displaying and quantifying the soft structure of a project. The specific outcome of the article is a draft of the general structure of project management according to international standards, using a semantic network, its quantification by means of the ANP method and a subsequent interpretation of the feedback for project management practice.
2 Materials and methods
2.1 Project Management Institute – PMBOK® Guide
The Project Management Institute (PMI) ranks among the largest non-profit associations in the world dealing with project management. The association was founded in the United States in 1969. Since the 1980s, in a broad discussion among experts and the professional public, it has been developing a standard entitled "A Guide to the Project Management Body of Knowledge" (PMBOK® Guide). Since its fourth version, the PMBOK® Guide has been recognized as a standard of the Institute of Electrical and Electronics Engineers (IEEE) in the USA. The PMBOK® Guide [9] is process-oriented and in its knowledge areas it presents the set of qualitative (e.g. team management, team motivation, etc.) and quantitative approaches (e.g. CPM, EVM, etc.) necessary for project management. The PMBOK® Guide is divided into four parts: introduction (18 pages), environment and project life cycle (28 pages), project management processes (16 pages), and knowledge areas (353 pages). The standard defines five process groups (47 processes in total) for project management and their direct link to the knowledge areas, 10 in total. The knowledge areas of project management according to the PMBOK® Guide are the following [9]: Project Integration Management; Project Scope Management; Project Time Management; Project Cost Management; Project Quality Management; Project Human Resource Management; Project Communications Management; Project Risk Management; Project Procurement Management; Project Stakeholder Management. The knowledge areas of the PMBOK® Guide standard [9] represent the substantive content of the project life cycle processes. The standard thus dissolves project management into atomic transformations where the input is secured, the necessary tasks (expert, professional, managerial, etc.) are performed and an output is created. Outputs from one process become inputs of the next.
2.2 PRINCE2 – Projects in Controlled Environments
The international project management methodology PRINCE2 originated at Simpact Systems Ltd. in 1975 as the PROMPT project management methodology. In 1979, the UK government accepted the PROMPT II methodology as the official methodology for managing IT projects in public administration. In 1989, the methodology was expanded and published under the new name PRINCE - "Projects in Controlled Environments". In 1996, based on the results of international studies in project management (with the participation of more than 120 organizations from around the world), the methodology was developed further and released in its second version as PRINCE2. Although the methodology has since undergone a series of amendments, the name has remained unchanged. For a long time the owner of the methodology was the UK government (Office of Government Commerce); at present, it is again in the hands of a private company (Axelos Ltd.). Using its themes [6], the methodology indirectly defines its own knowledge areas, i.e. the areas to be managed across processes, with the help of specific expertise and with a specific goal. The themes of the PRINCE2 methodology are as follows [6]: Business Case; Organization; Quality; Plans; Risk; Change; Progress. The seven processes which the PRINCE2 methodology provides describe the life cycle of a project from the position of roles and responsibilities. All seven PRINCE2 processes can be continuously connected; the linkage shows the full sequence of operations from the start to the finish of the entire project life cycle. According to PRINCE2, the life cycle processes of the project are as follows [6]: Starting up a Project; Directing a Project; Initiating a Project; Controlling a Stage; Managing Product Delivery; Managing a Stage Boundary; Closing a Project. The PRINCE2 methodology processes [6] are based on the interaction of project roles and stakeholders in the organizational structure of the project. The project roles and the roles of stakeholders in PRINCE2 are as follows [6]: Steering Committee; Sponsor; Senior User; Senior Supplier; Project Manager; Team Manager; Change Authority; Project Support; Project Assurance.
2.3 Semantic model
A semantic model consists of a semantic network, which Mařík et al. [5] define as a "natural graph representation", where each node of the graph corresponds to a specific object and each edge corresponds to a binary relation. Mařík et al. [5] further state that "semantic networks can conveniently express the relations of set inclusion and membership in a set, and unique as well as general terms can be represented there". Semantic networks emerged at the end of the 1960s. The term "semantic network" was first used by Quillian [10] in his dissertation on the representation of English words. According to Sowa [16], semantic networks are used for their ability to provide an easy-to-use system of information representation. Semantic networks are suitable for displaying and expressing vast information resources, management structures and processes.
2.4 Analytic Network Process (ANP)
The ANP (Analytic Network Process) method is a generalization of the AHP (Analytic Hierarchy Process) method. The ANP model reflects and explores the increasing complexity of a network structure, where the network is made up of different groups of elements. Each group (cluster) comprises a homogeneous set of elements. Linkages can exist among clusters as well as among the elements. In the ANP model the hierarchy is removed [12], [13].
Figure 1 The structure of elements and clusters in the ANP model

The benefit of the ANP method is the ability to express different preferences for the links between elements and clusters. To express the preferences, the method of pairwise comparison is used. Preferences always arise precisely when assessing the importance of two elements with respect to the element which refers to them - the question is "which of the elements is more important, and by how much". The resulting values for the sub-clusters are then combined into a super matrix in which the normalisation of columns is performed [12], [13]:

$$W = \begin{pmatrix} W_{11} & W_{12} & \cdots & W_{1N} \\ W_{21} & W_{22} & \cdots & W_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ W_{N1} & W_{N2} & \cdots & W_{NN} \end{pmatrix} \tag{1}$$

where the rows and columns of blocks correspond to the clusters $C_1, C_2, \ldots, C_N$ and each block of the super matrix consists of:

$$W_{IJ} = \begin{pmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & \ddots & \vdots \\ w_{m1} & \cdots & w_{mn} \end{pmatrix} \tag{2}$$

under the condition:

$$\sum_{i} w_{ij} = 1, \quad j \in \langle 1, n \rangle \tag{3}$$
For a weighted super matrix (1) for which relation (3) is valid, a calculation to obtain the limit weights of the elements can be performed. The calculation is performed by repeatedly squaring the weighted super matrix a sufficiently large number of times. Since the super matrix is of size N×N, the squaring is always feasible in a trivial manner (matrix multiplication). The result is the approximation of the weighted matrix by the limit matrix. The limit scales can be found in any column of the super matrix. The limit weight of each element expresses the strength of its effect on the overall structure of elements, i.e. it answers the question of how strongly an element affects the other elements [12], [13].
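A minimal NumPy sketch of this limit calculation is shown below; it assumes a column-stochastic weighted super matrix that is primitive, so that the repeated squaring converges (the function name and tolerances are illustrative).

```python
import numpy as np

def limit_weights(W: np.ndarray, tol: float = 1e-12, max_iter: int = 50) -> np.ndarray:
    """Approximate the limit matrix of a weighted super matrix W by repeated squaring.

    W must be column-stochastic, i.e. satisfy condition (3); the limit weights can
    then be read from any column of the (approximate) limit matrix.
    """
    assert np.allclose(W.sum(axis=0), 1.0), "columns of W must sum to one, cf. (3)"
    M = W.copy()
    for _ in range(max_iter):
        M_next = M @ M                       # squaring the super matrix
        if np.abs(M_next - M).max() < tol:   # stop once the powers stabilise
            return M_next[:, 0]
        M = M_next
    return M[:, 0]                           # limit weights (any column)
```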
3 Results and Discussion
3.1 Semantic Model of Organizational Project Structure
The Semantic Model of Organizational Project Structure [5], based on the definitions and procedures of international project management standards, can be viewed with the help of individual project roles, processes, management practices, parameters (constraints) and documentation. The Organizational Project Structure is composed of the clusters: Project Roles, Project Constraints, Process Groups, Knowledge Areas, and Project Documentation.
Figure 2 Semantic Model of Project Structure in the ANP model

The representation of the project management structure in the form of a semantic network consisting of Project Roles, Knowledge Areas, Constraints and Project Documentation is based on the processes of the project life cycle. In the processes of the project life cycle, the roles are the active part, the procedures the substantive part, and the documentation the input and output level. The processes are defined and presented in detail by both major international standards, the PMBOK® Guide [9] and PRINCE2 [6]. The semantic network in Figure 2 displays an aggregated view of the project structure in the form of clusters. The roles are interested in the parameters (their values); they use management practices (Knowledge Areas) and generate documentation. The parameters (project constraints) define the scope and manner of the procedures for managing the project and the content and complexity of the documentation. The documentation, which captures changes in a project, retroactively updates the parameters and is a tool in the individual management practices. The clusters and their relations can be further elaborated in detail in sub-semantic networks. For example, the cluster of project documentation will look as follows:
Figure 3 Semantic Model of Project Documentation in the ANP model
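As an illustration only, the cluster-and-link structure described above could be captured as plain data before any pairwise comparison is made; the element lists below are examples drawn from the text and the standards, not the full model used in the calculation.

```python
# illustrative clusters of the semantic model (element lists are examples only)
clusters = {
    "Project Roles": ["Project Manager", "Team Manager", "Sponsor", "Senior User"],
    "Project Constraints": ["Scope", "Time", "Cost", "Quality"],
    "Knowledge Areas": ["Project Risk Management", "Project Communications Management"],
    "Project Documentation": ["Project Plan", "Risk Register", "Project Charter"],
}

# directed links between clusters, read as "source influences / refers to target"
links = [
    ("Project Roles", "Project Constraints"),          # roles are interested in parameters
    ("Project Roles", "Knowledge Areas"),               # roles use management practices
    ("Project Roles", "Project Documentation"),         # roles generate documentation
    ("Project Constraints", "Knowledge Areas"),         # constraints shape the procedures
    ("Project Constraints", "Project Documentation"),   # constraints shape the documents
    ("Project Documentation", "Project Constraints"),   # documents update the parameters
]
```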
3.2 Semantic model quantification of the project structure
The quantification of the semantic model of the project structure can be performed by the ANP method. The advantage of this method is the possibility of expressing differentiated preferences among elements. For example, with the help of the Saaty
scale [13], the addressed project roles can differentiate their attitudes toward the documentation. The calculation of the ANP model (Figure 2) can be performed, e.g., in the software tool Super Decisions Software 2.2 (http://www.superdecisions.com/). By the Calculus Type calculation, a super matrix with limit weights can be obtained. The resulting weights of the individual elements reflect the influence of each element on the structure of the whole model, as well as within the individual clusters:
Project Documentation | Weight in Model | Weight in Cluster | Weight cumulatively | Population cumulatively
Project Plan | 0.09552 | 0.216 | 21.6% | 5.9%
Risk Register | 0.07752 | 0.176 | 39.2% | 11.8%
Stakeholder Register | 0.06071 | 0.138 | 53.0% | 17.6%
Project Charter | 0.04394 | 0.100 | 62.9% | 23.5%
Quality Register | 0.03036 | 0.069 | 69.8% | 29.4%
Work Package(s) | 0.02925 | 0.066 | 76.4% | 35.3%
Issues List | 0.02809 | 0.064 | 82.8% | 41.2%
Tasks List | 0.02424 | 0.055 | 88.3% | 47.1%
Communication Plan | 0.0172 | 0.039 | 92.2% | 52.9%
Business Case | 0.01626 | 0.037 | 95.8% | 58.8%
Budget | 0.01309 | 0.030 | 98.8% | 64.7%
Work Breakdown Structure | 0.00369 | 0.008 | 99.6% | 70.6%
Report TM | 0.00123 | 0.003 | 99.9% | 76.5%
Request PB | 0.00028 | 0.000634 | 99.9864% | 82.4%
Lessons Report | 0.00002 | 0.000045 | 99.9909% | 88.2%
Report PM | 0.00002 | 0.000045 | 99.9955% | 94.1%
Request PM | 0.00002 | 0.000045 | 100% | 100.0%
Table 1 Limit weights of elements in the ANP model The resulting weights from Table 1 can be used for example in the case of project documentation to prioritize its creation and sharing in the project. The most important element in the cluster is the Project Plan.
3.3 Interpretation of the results for the semantic model of a project structure
Elements within the clusters can be assessed in terms of their weights, i.e. their relevance to the project management structure. The impact and abundance of elements in a cluster can be further assessed, e.g., from the perspective of the existence of the Pareto principle [8], where 20% of the elements (e.g. project documentation) have an 80% impact on the management structure of the project within the same cluster:

Figure 4 Pareto chart - Interpretation of the results for project documentation

For the project documentation (Figure 4) the Pareto principle did not occur. However, it can be concluded that the first seven documents (41.2%) have an impact of 82.8% on the management structure of the project within their cluster. In the case of a real project environment of a company, this finding may lead to greater efficiency of project management - if, for example, this result is used in communication or in management responsibilities for prioritizing the creation and management of project documentation. By differentiating
between, e.g., significant and less significant project documentation, it is possible to achieve savings in the work effort and cost of the project - the bureaucratic burden in a project is thus reduced.
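The cumulative check behind Figure 4 can be reproduced directly from the "Weight in Cluster" column of Table 1; the following sketch simply sorts the weights and reports the smallest share of documents that covers 80% of the influence (the threshold is illustrative).

```python
import numpy as np

# "Weight in Cluster" values of the project documentation elements (Table 1)
weights = np.array([0.216, 0.176, 0.138, 0.100, 0.069, 0.066, 0.064, 0.055,
                    0.039, 0.037, 0.030, 0.008, 0.003, 0.000634,
                    0.000045, 0.000045, 0.000045])
cum_weight = np.cumsum(np.sort(weights)[::-1]) / weights.sum()
k = int(np.argmax(cum_weight >= 0.80)) + 1          # documents needed for 80 % influence
print(f"{k} documents ({k / len(weights):.1%} of the cluster) "
      f"cover {cum_weight[k - 1]:.1%} of the influence")
# expected output: 7 documents (41.2% of the cluster) cover 82.8% of the influence
```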
4 Conclusion
In this article a semantic model was used for expressing the general project structure under the international standards PMBOK® Guide and PRINCE2, where the elements of the semantic network represent specific project roles, documents, constraints or knowledge areas. Using the ANP method, the quantification of the model was achieved. The differentiation of preferences among elements allowed for an additional expression of the general model of reality from current practice (e.g. the relation of the roles to the documents). The outputs of the calculation were the limit weights of the elements, expressing the influence of each element on the structure of project management. The results elicited in the article were further interpreted against practice; in particular, the possible use of the Pareto chart was outlined (the search for the Pareto principle). The benefit of the described procedure is the ability to prioritize the elements that exist in the project environment and project culture of a company, based on their influence in project management.
References:
[1] Doležal, J. et al.: Projektový management podle IPMA (Project management based on IPMA). Grada Publishing a. s., Prague, 2012.
[2] El-Gohary, N. M., Osman, H. and El-Diraby, T. E.: Stakeholder management for public private partnerships. International Journal of Project Management 24 (2006), 595-604.
[3] ISO 21500:2012 Guidance on project management. International Organization for Standardization, Geneva, 2012.
[4] Kim, H. L., Decker, S. and Breslin, J. G.: Representing and sharing folksonomies with semantics. Journal of Information Science 36 (2010), 57-72.
[5] Mařík, V. et al.: Umělá inteligence I. (Artificial Intelligence I.). Academia, Prague, 1993.
[6] Murray, A. et al.: Managing Successful Projects with PRINCE2®. Fifth edition. AXELOS Ltd., 17 Rochester Row, London, 2013.
[7] Page, L., Brin, S., Motwani, R. and Winograd, T.: The PageRank Citation Ranking: Bringing Order to the Web. Technical Report, Stanford InfoLab, Stanford, 1999.
[8] Pareto, V.: Manuel d'économie politique. Fourth edition. Libraire Droz, Genève, 1966.
[9] Project Management Institute: A Guide to the Project Management Body of Knowledge (PMBOK® Guide). Fifth edition. Project Management Institute Inc., 14 Campus Boulevard, Newtown Square, PA, 2013.
[10] Quillian, R.: A recognition procedure for transformational grammars. Doctoral dissertation. Massachusetts Institute of Technology, Cambridge (MA), 1968.
[11] Rydval, J., Bartoška, J. and Brožová, H.: Semantic Network in Information Processing for the Pork Market. AGRIS on-line Papers in Economics and Informatics 6 (2014), 59-67.
[12] Saaty, T. L. and Vargas, L. G.: Decision Making with the Analytic Network Process: Economic, Political, Social and Technological Application with Benefits, Opportunities, Costs and Risks. Springer, NY, 2006.
[13] Saaty, T. L.: Decision Making with Dependence and Feedback: The Analytic Network Process. RWS Publisher, Pittsburgh, 1996.
[14] Schatten, M.: Knowledge management in semantic social networks. Computational and Mathematical Organization Theory 19 (2013), 538-568.
[15] Shing-han, L., Shi-ming, H., David, C. Y. and Jui-chang, S.: Semantic-based transaction model for web service. Information Systems Frontiers 15 (2013), 249-268.
[16] Sowa, J. F.: Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks/Cole Publishing Co., Pacific Grove (CA), 2000.
[17] Williams, T.: Identifying Success Factors in Construction Projects: A Case Study. Project Management Journal 47 (2015), 97-112.
[18] Zhu, Y. and Li, X.: Representations of semantic mappings: A step towards a dichotomy of application semantics and contextual semantics. International Journal of Project Management 25 (2007), 121-127.
Modification of the EVM by Work Effort and Student Syndrome phenomenon

Jan Bartoška1, Petr Kučera2, Tomáš Šubrt3

Abstract. Any project, as a unique sequence of tasks, is subject to the human agent impact. This article aims to analyze the design and use of a non-linear modification of the BCWS (Budgeted Cost for Work Scheduled) parameter within EVM (Earned Value Management) for monitoring planned project progress while respecting the human factor impact on the actual work effort. The duration of project activities is very often measured by EVM tools, where the BCWS is a key parameter. Because the BCWS is derived from the work effort in a project, work effort has a hidden influence on EVM. Consequently, in the real world of human resources, the BCWS is usually not linear. To express the BCWS parameter, non-linear mathematical models are used in this paper, describing the course of different work contours incorporating the Student Syndrome phenomenon. The human agent impact is apparent especially in the work contours of the resources allocated to the project. At the same time, the human agent impact manifests itself in the form of the Student Syndrome phenomenon. Hence, the planning and progress of the project is burdened by the variability of the resources' work effort, affected by both different work contours and the Student Syndrome phenomenon. The article builds on the previous work of the authors.

Key words: Project management, Earned Value Analysis, Non-Linear mathematical model, Work contour, Work effort.

JEL Classification: C44
AMS Classification: 90C15
1 Introduction
Duration of project activities is very often measured by the EVM tools, where the BCWS is a key parameter. Because BCWS is derived from the work effort in a project, work effort has a hidden influence on EVM. Consequently, in the real world of human resources, the BCWS is usually not linear. The human agent impact is apparent especially in the work contours of the resource work. At the same time, the human agent impact manifests itself in the form of the Student Syndrome phenomenon. Hence, the planning and progress of the project is burdened by the variability of the resources' work effort, affected by both different work contours and the Student Syndrome phenomenon.

Earned Value Management [11, 9 or 5] is based on comparing the Baseline and the Actual plan of the project realization. The Baseline and Actual plan are determined by the partial work of single resources in the particular project activities. The Baseline is always based on the expected resource work contours, and the impact of the human agent is usually not included. The impact of the human agent is usually expected only in the actual course of the project. The versatility of the human agent in projects can also be described by "Parkinson's first law" [8]. It is natural for people to distribute work effort irregularly over the whole time sequence determined by the deadline of the task completion. The questions of "Parkinson's first law" in project management are further dealt with e.g. in [4].

Work effort of an allocated resource has very often been researched in projects from the area of information technologies and software development, as these projects contain a high level of indefiniteness, and even common and routine tasks are unique. At the same time, it concerns an area where it is possible to find a great number of approaches to estimate how work-intensive the project will be or how long the tasks will take, and also case studies. A proposal for a mathematical apparatus for planning the course of tasks within a case study is dealt with for instance in [7], or by Barry et al. [1]. The authors Özdamar and Alanya [7] propose a particular pseudo-heuristic
1 Czech University of Life Sciences in Prague, Department of Systems Engineering, Kamýcká 129, Prague, Czech Republic, email: bartoska@pef.czu.cz
2 Czech University of Life Sciences in Prague, Department of Systems Engineering, Kamýcká 129, Prague, Czech Republic, email: kucera@pef.czu.cz
3 Czech University of Life Sciences in Prague, Department of Systems Engineering, Kamýcká 129, Prague, Czech Republic, email: subrt@pef.czu.cz
approach to estimate the tasks' course, where the indefiniteness in the project is expressed by fuzzy sets. Barry et al. [1] concentrate on the existence and expression of the relation between project duration and total effort; in their theoretical starting points they point out the dynamics of the relation between the effort and the project duration, where a self-strengthening loop can be expected. The work effort can also be described using system dynamic models, as presented e.g. in a study from project management teaching by Svirakova [10]. Others who research project complexity and work effort are for instance Clift and Vandenbosh [3], who point out a connection between the length of the life cycle and the project management structure, where the key factor is again the human agent.

The aim of this paper is to propose and analyze an application of a non-linear modification of the BCWS parameter within EVM for the observation of the planned project duration according to the actual work effort of the resource, considering at the same time the human resource impact. The paper builds on the previous works of the authors.
2 Materials and methods

2.1 Student Syndrome phenomenon
If there is a deadline determined for the completion of a task and the resource is a human agent, the resource spreads its effort during the activity realization unevenly and with a variable intensity. A delay during an activity realization with human resource participation leads to stress, or to tension aimed at the resource or felt by the resource him/herself. The development and growth of the tension evokes an increase in the work effort of the human agent allocated as a resource. For more details see the previous paper [2].
2.2 Mathematical model of the Student Syndrome
The authors proposed a mathematical expression of the Student Syndrome in a previous paper [2]. Its brief description follows. First, a function expressing the proper Student Syndrome, denoted by p1, is introduced. It has three minima p1(t) = 0, in t = 0, t = 0.5 and t = 1, and two maxima: the former close to the beginning and the latter close to the end of the task realization. Besides this, functions denoted by p2 are proposed, expressing the resource allocation according to the single standard work contours: flat, back loaded, front loaded, double peak, bell and turtle. All these functions are in the form of a 4th degree polynomial. To express the strength of the Student Syndrome manifestation during the realization of a task, the rate r of the Student Syndrome is introduced. It acquires values between 0 and 1 (r = 0 represents a situation when the Student Syndrome does not occur at all and the resource keeps the work contour exactly; r = 1 means that the Student Syndrome manifests itself in its full strength and the resource absolutely ignores the work contour). As a result, the resource work effort during a real task realization can be modeled using the function p = rp1 + (1 − r)p2. For more details see the previous paper [2].
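A minimal numerical sketch of the combined effort function p = r·p1 + (1 − r)·p2 follows. The particular polynomials below are only illustrative stand-ins: p1 satisfies the stated zeros at t = 0, 0.5 and 1 and integrates to 1 over the task duration (taken here as the unit interval), and p2 is a simple back loaded contour with a unit integral; the exact 4th degree polynomials proposed in [2] are not reproduced here.

import numpy as np

def p1(t):
    """Illustrative Student Syndrome profile: zero at t = 0, 0.5 and 1,
    two peaks, unit integral over [0, 1] (a stand-in for the polynomial in [2])."""
    return 120.0 * t * (1.0 - t) * (t - 0.5) ** 2

def p2_back_loaded(t):
    """Illustrative back loaded work contour with unit integral (a stand-in)."""
    return 5.0 * t ** 4

def effort(t, r):
    """Combined work effort p = r*p1 + (1 - r)*p2 for Student Syndrome rate r."""
    return r * p1(t) + (1.0 - r) * p2_back_loaded(t)

t = np.linspace(0.0, 1.0, 1001)
for r in (0.0, 0.5, 1.0):
    total = np.trapz(effort(t, r), t)   # stays ~1 for any r (profiles are normalised)
    print(f"r = {r}: total effort over the task = {total:.3f}")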
2.3 Modification of the BCWS by Work Effort
The EVM extension in the form of a BCWS parameter modification for different work contours (turtle, bell, double peak, back loaded, front loaded, etc.), which is described below and applied in a case study, is based on the previous work of the authors of this paper [2, 6]. This approach can be applied when computing the BCWS of an activity in the project. It is computed in the classical way using the formula

BCWS = %PC · BAC                                                        (1)

where %PC is the percentage of the work planned by the work calendar (planned completion percentage) and BAC is the Budget at Completion of the project. The share of the whole work effort which a part of the task duration requires can be calculated for a single resource as the corresponding integral of the work effort function p; over the whole task duration a it holds that

\int_0^a p(t) \, dt = 1                                                 (2)
Let there be n resources, indexed by 1, 2, …, n, allocated to the task. Let r_k, p_{1k}, p_{2k} denote r, p_1, p_2 for the k-th resource. Then the BCWS can be calculated as
\mathrm{BCWS} = \sum_{k=1}^{n} \int_0^a \left( r_k \, p_{1k}(t) + (1 - r_k) \, p_{2k}(t) \right) dt \cdot \mathrm{BAC}                 (3)
The resource work effort affects the growth of the BCWS. In general, and in the case of all project tasks, a uniform increase of the BCWS cannot be expected. With a changing BCWS, EVM may provide fundamentally different results for the assessment of the state and development of the project.
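Formula (3) can be evaluated numerically once the contour functions and Student Syndrome rates of the allocated resources are chosen. The sketch below reuses the illustrative stand-in profiles from the previous sketch and a hypothetical single resource with r = 0.5; it is not the exact calculation behind the case study in Section 3.

import numpy as np

# illustrative stand-in profiles with unit integral on [0, 1] (see the previous sketch)
p1 = lambda t: 120.0 * t * (1.0 - t) * (t - 0.5) ** 2   # Student Syndrome profile
p2 = lambda t: 5.0 * t ** 4                             # back loaded work contour

def bcws_at(a, resources, bac, n_steps=1001):
    """Non-linear BCWS at relative task time a (0 <= a <= 1) following (3): the sum over
    resources of the integral of r*p1 + (1 - r)*p2 from 0 to a, multiplied by BAC."""
    t = np.linspace(0.0, a, n_steps)
    share = sum(np.trapz(r * f1(t) + (1.0 - r) * f2(t), t) for r, f1, f2 in resources)
    return share * bac

resources = [(0.5, p1, p2)]   # one hypothetical resource, Student Syndrome rate r = 0.5
for a in (0.25, 0.5, 0.75, 1.0):
    print(f"a = {a:.2f}: BCWS = {bcws_at(a, resources, bac=100.0):.1f}")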
3 Results and Discussion
The Student Syndrome phenomenon influences the resource work effort, changes the work contour shape and transfers itself into the EVM parameters of the whole project. The work contours of the resources and the Student Syndrome impact may have a significant effect on both the actual and the planned course of the project. As far as calculations in EVM are based on unrealistic expectations of the BCWS, EVM may be unsuccessful.
3.1 Case Study
An illustrative fictitious case is described in Table 1. It is an IT project of a small extent. The planned tasks are differentiated by the type of work contour and by the human factor impact (Student Syndrome rate). IT projects typically show a higher degree of procrastination, in particular in the last part of the project duration. Analytical work at the beginning is usually realized with the active participation of the customer side of the project.
Task                         Student Syndrome Rate   Work Contour    Start   Man-days
Business Requirements        0.3                     Front Loaded    0       5
End-User Requirements        0.3                     Front Loaded    0       7
Conceptual Design            0.3                     Front Loaded    7       5
Architectural Design         0.3                     Front Loaded    7       8
Database Components          0.5                     Back Loaded     13      10
Code/Logic Components        0.5                     Back Loaded     13      17
GUI Interface Components     0.5                     Back Loaded     13      15
User Acceptance Test         0.5                     Back Loaded     30      5
Performance Test             0.5                     Back Loaded     32      6
Product Release              0.5                     Back Loaded     35      5
Total Scope                                                                  83
Table 1 Case Study – The Project Plan

The project is planned for 40 working days. The total scope of the planned work is 83 man-days. The project plan (see Figure 1) could also be subjected to time analysis by the Critical Path Method, etc.; however, this paper does not further follow up on these results.
Figure 1 Case Study – Gantt Chart
In the case of the first 4 tasks we expect the work accumulation at the beginning and a lower Student Syndrome phenomenon impact; therefore the work contour of these tasks is front loaded (see Figure 2) and the Student Syndrome rate value is r = 0.3 (see Table 1). The reason may be a higher level of cooperation with stakeholders in the project, e.g. the analysis and collection of information with the participation of the contracting authority and end users (the meetings are planned in advance and their terms are fulfilled, which has a positive impact on the tasks' duration). In the case of the remaining 6 tasks, it is possible to expect the work accumulation at the end and a higher impact of procrastination; therefore the work of these tasks is back loaded (see Figure 3) and the value is r = 0.5 (see Table 1). The reason may be the more technologically demanding nature of these tasks (programming and expert work).
Figure 2 Case Study – a task with front loaded work contour and r = 0.3 (Architectural Design)
Figure 3 Case Study – a task with back loaded work contour and r = 0.5 (GUI Interface Components)
The application of the front loaded work contour at the project beginning leads to a higher planned work effort rate of the resources. The application of the back loaded work contour and a higher human factor impact causes a significantly lower work effort rate than planned. The resulting difference may be a reason for EVM failures.
3.2 Linear and Non-Linear BCWS
The BCWS curves (formula 3 and Figure 4) bring two pieces of information: 1) the area below the curve expresses the work effort expended by the resource; 2) the function value determines the fulfilled project extent (e.g. on the 40th day it is 83 man-days). The whole planned work effort course is higher than the identified course comprising the task nature and the human resource impact.
Figure 4 Case Study – difference between Linear and Non-Linear BCWS

From Figure 4 it can be clearly seen that from the 13th day onwards the work is carried out with a lower work effort. The overall difference between the Linear BCWS and the Non-Linear BCWS is -39.283 man-days (the difference between the areas below the curves). The observed difference is not work absent from the project extent, but work spent inefficiently by the resources due to the human factor impact. The reason lies in the nature of the tasks (back loaded versus front loaded) and the expected human factor impact (Student Syndrome phenomenon). Although the extent of the project (83 man-days) is completed by the 40th day of its duration, the allocated resources do not contribute the expected work effort to the project during its realization. The original work plan for single tasks is primarily based on the assumption of a uniform work plan and work effort. When including the
Non-Linear course of the work plan and work effort, a decline of the work effort occurs. At the same time, this work effort decline affects the Earned Value and increases the Actual Costs. What is more, the project completion date may be threatened. These conclusions should be subjected to further research.
3.3 Disillusionment in EVM by Linear BCWS
Including the Actual Cost of Work Performed (ACWP) and the value of the actual work performed (% Work Performed per day) in the case study, the different progression of the values of the Linear BCWS and the Non-Linear BCWS significantly impacts also the other EVM parameters. The assessment of the project status and development becomes inconsistent. The value of actual work performed may be affected by the Student Syndrome phenomenon e.g. in the following way (see Figure 5; for each day, an estimate of the work performed in relation to the plan is calculated).
Figure 5 Case Study – % Work Performed per day

Further, to illustrate, let us assume that the project is currently on the 30th day of its duration and that the Actual Cost of Work Performed develops as follows (Figure 6, indicated in man-days):
Figure 6 Case Study – Actual Cost of Work Performed (ACWP)

Other derived EVM parameters, especially CV% (Cost Variance %) and SV% (Schedule Variance %), express the relative deviation of the project from its budget or plan (Figures 7 and 8):
Figure 7 Case Study – Within using Linear BCWS
Figure 8 Case Study – Within using Non-Linear BCWS
From Figures 7 and 8 it is apparent that the observed project progress in the case of the application of the Linear BCWS values differs completely from the identified progress in the case of the application of the Non-Linear BCWS values. When including the human factor impact and the various shapes of the work contour for single activities (Non-Linear BCWS, Figure 8), the project is assessed better at the beginning; however, during its course it often gets into a negative misalignment, i.e. it is more often in delay or over budget compared to observing the duration with the Linear BCWS values (Figure 7). In the case of the application of the Linear BCWS values the project would be assessed positively too: after the first five days a mistakenly significant improvement of the state occurs (furthermore, it seems that from the 6th to the 9th day the project is ahead of schedule, as the SV% values are greater than 0). The project is thus assessed completely differently. The inclusion of the human factor and the work contours leads to the discovery of a deeper crisis in the project.
4 Conclusion
The article discusses an application of a newly proposed BCWS calculation for monitoring the project duration within EVM. As the case study shows, as far as the task nature (work contour) and the human resource impact (Student Syndrome phenomenon) are comprised, the planned course of work effort (Non-Linear BCWS) may significantly differ from the commonly expected one (Linear BCWS). A possible decline or increase of the resources' work effort, which is not evident in the case of a uniform work plan with uniform work effort, may manifest itself in an irregular increase or decrease of the Earned Value. This may result in malfunctioning EVM and project failure. The difference in course between the Linear and the Non-Linear BCWS leads within EVM to different conclusions in the assessment of the project status and development. An overly positive or negative project assessment may result in uneconomic decisions with fatal consequences. With the inclusion of the human factor impact and various work contours and the derivation of the Non-Linear BCWS it is possible to obtain a much more accurate (though possibly less pleasant) picture of the project.
References
[1] Barry, E., Mukhopadhyay, T., and Slaughter, A. S.: Software project duration and effort: An empirical study. Information Technology and Management 3 (2002), 113-136.
[2] Bartoška, J., Kučera, P., and Šubrt, T.: Modification of the BCWS by Work Effort. In: Proceedings of the 33rd International Conference Mathematical Methods in Economics MME 2015 (Martinčík, D., Ircingová, J., and Janeček, P., eds.). University of West Bohemia, Plzen, 2015, 19-24.
[3] Clift, T., and Vandenbosh, M.: Project complexity and efforts to reduce product development cycle time. Journal of Business Research 45 (1999), 187-198.
[4] Gutierrez, G. J., and Kouvelis, P.: Parkinson's law and its implications for project management. Management Science 37 (8) (1991), 990-1001.
[5] Kim, B. C., and Kim, H. J.: Sensitivity of earned value schedule forecasting to S-curve patterns. Journal of Construction Engineering and Management 140 (7) (2014), 14-23.
[6] Kučera, P., Bartoška, J., and Šubrt, T.: Mathematical models of the work contour in project management. In: Proceedings of the 32nd International Conference on Mathematical Methods in Economics (Talašová, J., Stoklasa, J., and Talášek, T., eds.). Palacký University, Olomouc, 2014, 518-523.
[7] Özdamar, L., and Alanya, E.: Uncertainty modelling in software development projects (with case study). Annals of Operations Research 10 (2001), 157-178.
[8] Parkinson, C. N.: Parkinson's Law and Other Selected Writings on Management. 1st ed., Federal Publications (S) Pte Ltd, Singapore, 1991.
[9] Project Management Institute: A Guide to the Project Management Body of Knowledge (PMBOK® Guide). Fifth edition. Project Management Institute Inc., Newtown Square, 2013.
[10] Svirakova, E.: System dynamics methodology: application in project management education. In: Proceedings of the 11th International Conference on Efficiency and Responsibility in Education (Houška, M., Krejčí, I., and Flégl, M., eds.). Czech University of Life Sciences, Prague, 2014, 813-822.
[11] Vandevoorde, S., and Vanhoucke, M.: A comparison of different project duration forecasting methods using earned value metrics. International Journal of Project Management 24 (4) (2005), 289-302.
Equivalent scale and its influence on the evaluation of relative welfare of the household

Jitka Bartošová1, Vladislav Bína2

Abstract. The quantitative description and modeling of income usually employs equivalent income values based on the total income of households, adjusted in order to provide a reasonable comparison of the relative welfare of households differing in their size and age structure. The equalized income of a household is perceived as a general indicator of the economic sources available to each household member and is suitable for performing comparisons across countries and regions. Unfortunately, there does not exist any generally accepted methodology of equivalent income calculation. In the presented paper the authors discuss equivalent scales which in different manners incorporate adults and children in the households according to their age in order to evaluate the number of consumption units. The influence of the chosen method on basic summary characteristics of equalized incomes and on chosen indicators of poverty and inequality is demonstrated on the data file from the large European sample survey on incomes and living conditions of households, EU-SILC. The paper focuses also on the dependence of relative welfare (represented by the scaling using equalized income) on the type of household. In certain types of households we can expect a relatively strong dependence of the above mentioned characteristics on the type of household.

Keywords: consumption unit, equalized income, poverty and inequality indicators.

JEL Classification: C13, C15, C65
AMS Classification: 62H12
1 Equivalent scales
It is obvious that household expenditures grow with each additional household member, but they do not grow proportionally. Thanks to the possibility of sharing some of the expenditures, namely expenditures on housing, water and electricity, it is necessary to assign to each household a value (weight) in relation to its requirements. These values, so-called consumption units, make it possible to compute equalized values of incomes, which represent the income of a standardized household or, in other words, the income per one equalized household member. The choice of the equivalence scale depends on the expected economies of scale in the consumption of the household, but also on the priorities assigned to the demands of particular individuals according to their age (adults and children). Equalized household income is perceived as a general indicator of the economic resources available to each household member (OECD [8]). Thus, it can be used for the analysis of incomes and their comparison across regions and countries, but also for the analysis of the risk of monetary poverty of individuals (see, e.g., Bühmann et al. [2], Flachaire and Nunez [3]).

The aim of this paper is to compare two different types of models used for the construction of the consumption unit and to present the influence of the choice of model type on the equivalent scales, together with the influence of the setting of the model parameters on basic summary characteristics of incomes and on indicators of monetary poverty and inequality. For this purpose we use the linear equivalent scales constructed according to the methodology of the Organization for Economic Co-operation and Development (OECD scale) and its variant used for the computation of official Eurostat statistics (OECD modified scale), and we compare them with the non-linear equivalent scale constructed for the calculation of official statistics in New Zealand.

The impact of the choice of the equivalent scale on monetary incomes can be evaluated from different viewpoints. To demonstrate the sensitivity of selected characteristics of incomes and of indicators of monetary poverty and inequality to the selection of the equivalence scale model and its parameter settings, we use the resulting data from the sample survey EU-SILC 2012 for Czech households. It is a large representative survey on incomes and living conditions carried out annually according to the unified methodology of Eurostat in all member countries of the European Union, in Norway, Sweden and Iceland.
University of Economics in Prague, Faculty of Management, Department of Exact Methods, Jarošovská 1117/II, 37701 Jindřichův Hradec, {bartosov,bina}@fm.vse.cz.
2 Linear and non-linear equivalence scale
Evaluation of the level, differentiation, non-uniformity of distribution or insufficiency of financial potential is influenced by the choice of the scale (measure) in which the values of incomes are equalized. There are two extreme possibilities of the scaling. On the one hand, the total household income can be chosen, i.e. the sum of all monetary means collected by the household members; in this case we completely ignore the size and age structure of the household. On the other hand, the second extreme possibility is to relate the whole household income to each particular household member (Per Capita incomes). Larger households usually demand more resources but also have a higher capacity for achieving higher incomes, because they have more adult, economically active members. With the increase of the number of household members there also appears a tendency towards a relative decrease of the amount of resources necessary to guarantee the functioning of the household. It can be expected that different age categories may have different requirements, and among other relevant characteristics potentially influencing the financial needs are also gender, job position and marital status. Integration of such differences into the equivalence scale demands a more sophisticated solution. Subsequently, several types of equivalence scale definitions were created (e.g., OECD [8], Jorgensen and Slesnick [6], etc.) which aim to optimally estimate the proportion between common household expenditures and individual household members' incomes. It appears that a unified and general approach to this estimation does not exist, since each country has a different setting of price relations which, moreover, evolves in time.
2.1 Linear equivalence scale
The linear equivalence scale is constructed according to the methodology of the Organization for Economic Co-operation and Development (OECD scale). The European Union employs a modified version (OECD modified scale), which is based on a different setting of the model parameters. The general linear model of the equivalent consumption unit (CU) is defined as

CU(\alpha, \beta) = 1 + \alpha C + \beta (A - 1)                        (1)
where A is the number of adults in the household, C is the number of children in the household younger than 14 years, α is the child adult equivalence parameter and β is the other persons adult equivalence parameter. The OECD scale has the model parameters set to the values α = 0.5 and β = 0.7, whereas the OECD modified scale uses lower values of the parameters, namely α = 0.3 and β = 0.5. Both limit possibilities are covered: total income assigned to the household without considering its size and age structure corresponds to the choice α = β = 0, while Per Capita incomes correspond to α = β = 1. This means that in the first case we get CU(0, 0) = 1 and in the Per Capita case the consumption unit is defined as CU(1, 1) = A + C. The modified OECD scale has both parameters decreased by two tenths, which leads to an emphasis on the first component, namely the common expenditures on running the household, in contrast to the second component, i.e. the individual expenditures of particular household members. This leads (in comparison to the original OECD scale) to a relative increase of the incomes of households with a relatively higher number of members. The modification also changes the perspective of monetary poverty of households, since relatively smaller households can now more easily fall under the poverty threshold. The emphasis on common expenditures rather corresponds to the reality of the countries of Western Europe, where housing costs comprise a higher percentage of total household expenditures. In post-communist countries, in spite of the permanent growth of housing costs, the situation still corresponds rather to the original OECD scale.
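A small illustration of formula (1) with the two standard parameterizations follows; the household and its income are hypothetical.

def consumption_units_linear(adults, children, alpha, beta):
    """Linear consumption unit by formula (1): CU = 1 + alpha*C + beta*(A - 1)."""
    return 1.0 + alpha * children + beta * (adults - 1)

income = 540_000.0       # hypothetical annual household income
adults, children = 2, 2  # hypothetical household: 2 adults, 2 children under 14

for name, alpha, beta in [("OECD scale", 0.5, 0.7),
                          ("OECD modified scale", 0.3, 0.5),
                          ("total income", 0.0, 0.0),
                          ("Per Capita", 1.0, 1.0)]:
    cu = consumption_units_linear(adults, children, alpha, beta)
    print(f"{name:20s} CU = {cu:.1f}, equalized income = {income / cu:,.0f}")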
2.2 Non-linear equivalence scale
The non-linear equivalence scale was used for the calculation of mutually comparable values of household incomes in New Zealand. The change in the methodology of income distribution analyses in New Zealand was stimulated by the Royal Commission of Inquiry into Social Security in the years 1969 and 1972. The creation of models and the estimation of their parameters were at that time in the focus of many statisticians, e.g., Jensen [5], who suggested several different models, or suitable values of their parameters, for the equalization of incomes to mutually comparable values. The model used in this paper is based on a power of the weighted sum of adult persons and children in the household. The model was constructed on the basis of an econometric analysis of household expenditures (with a typical consumer basket). The coefficients react to the different consumption of adult household members and children (parameter α) and to the economies of scale which emerge thanks to the cohabitation of household members. This non-linear model is in general given by the formula
CU(\alpha, \gamma) = (A + \alpha C)^{\gamma}                            (2)
where A is the number of adults in the household, C is the number of children in the household younger than 14 years, α is the child adult equivalence parameter and γ is the household economies of scale parameter. Parameter α has the same role in both models (linear and power). For γ = 1 the power model reduces to the linear one, and for α = 1 and γ = 0.5 to the square root model, which was also used in New Zealand but was not successful, since it does not react sufficiently to the age structure of the household. Similarly as in the case of the linear model, there exists a choice of parameters corresponding to both limits. Total incomes of the household without consideration of its size and age structure correspond to γ = 0, and Per Capita incomes correspond to α = γ = 1.
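The non-linear scale (2) can be illustrated in the same way; again, the parameter values and the household are hypothetical.

def consumption_units_power(adults, children, alpha, gamma):
    """Non-linear consumption unit by formula (2): CU = (A + alpha*C) ** gamma."""
    return (adults + alpha * children) ** gamma

adults, children = 2, 2   # the same hypothetical household as above
for alpha, gamma in [(0.7, 0.7),   # an illustrative intermediate setting
                     (1.0, 0.5),   # square root model
                     (1.0, 1.0),   # Per Capita
                     (0.5, 0.0)]:  # total income (CU = 1)
    cu = consumption_units_power(adults, children, alpha, gamma)
    print(f"alpha = {alpha}, gamma = {gamma}: CU = {cu:.2f}")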
2.3 Characterization of income distribution, inequality and monetary poverty
For assessing the sensitivity of income distribution characteristics to the setting of parameters in the models of the equivalent consumption unit we employ only basic measures of location: the median and the lower and upper quartiles, which are robust estimates of the middle half of the income distribution. The sensitivity of income inequality to the setting of the model parameters can be observed using the well-known Gini index. The Gini index is derived from the Lorenz curve and measures the deviation of the income distribution of individuals or households from the perfectly uniform distribution. Its value is given by the ratio of the area between the line of absolute equality (the diagonal y = p) and the Lorenz curve L(p) to the entire area under the diagonal. Since the area under the diagonal is equal to half of the area of the unit square, the Gini coefficient can be obtained using numerical integration of the estimated Lorenz curve

G = 1 - 2 \int_0^1 L(p) \, dp                                           (3)

The Gini index takes values from the interval ⟨0; 1⟩; a value approaching 0 indicates a more egalitarian distribution of incomes in the considered society and vice versa. For the expression of monetary poverty we used the measures of poverty employed in the EU countries. These stem from the class of Foster-Greer-Thorbecke poverty measures (see [4]), in general given by the formula
P_{\alpha}(y, z) = \frac{1}{n} \sum_{i=1}^{q} \left( \frac{z - y_i}{z} \right)^{\alpha}                 (4)

where z > 0 is a beforehand given poverty threshold, y = (y_1, y_2, …, y_n) is the vector of household incomes sorted by size, y_1 ≤ y_2 ≤ … ≤ y_q ≤ z, q is the number of households belonging to the group under the poverty threshold and n is the total count of households. Parameter α conditions the measure of sensitivity to deprivation in the case of households belonging below the poverty threshold (see [8]). For α > 1 the value of P begins to be distributionally sensitive, and with a growing value of α the sensitivity to the poverty of the poorest grows. For α → ∞, P reflects the poverty of the poorest persons (see [7]). The most commonly used measure of monetary poverty, the so-called head count index or risk-of-poverty rate, can be obtained by choosing α = 0. By the choice of α = 1 we obtain another measure, describing the depth of poverty (or poverty gap), which is based on the summary evaluation of poverty according to the poverty threshold. The value of PG relates to the distance of the poor from the poverty threshold. Thus we obtain information about the extent of poverty. But even this measure is not sensitive enough when a "poor person" becomes "very poor". This lack of sensitivity is removed by the choice of α = 2.
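Both the Gini index (3) and the Foster-Greer-Thorbecke measures (4) can be computed directly from a vector of equalized incomes. The sketch below uses a small synthetic income sample and a 60 %-of-median threshold purely for illustration; it does not use the EU-SILC data, and the Gini value is only an approximation based on the empirical Lorenz curve.

import numpy as np

def gini(incomes):
    """Gini index via numerical integration of the empirical Lorenz curve, cf. (3)."""
    y = np.sort(np.asarray(incomes, dtype=float))
    p = np.arange(0, len(y) + 1) / len(y)                 # cumulative population share
    L = np.concatenate(([0.0], np.cumsum(y) / y.sum()))   # empirical Lorenz curve
    return 1.0 - 2.0 * np.trapz(L, p)

def fgt(incomes, z, alpha):
    """Foster-Greer-Thorbecke poverty measure P_alpha, cf. (4)."""
    y = np.asarray(incomes, dtype=float)
    poor = y < z
    return np.sum(((z - y[poor]) / z) ** alpha) / len(y)

y = np.array([90, 110, 130, 150, 180, 220, 260, 320, 400, 640], dtype=float)  # synthetic
z = 0.6 * np.median(y)    # illustrative threshold: 60 % of the median equalized income
print("Gini:", round(gini(y), 3))
for a in (0, 1, 2):       # head count index, poverty gap, severity of poverty
    print(f"P_{a} =", round(fgt(y, z, a), 3))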
3 Sensitivity of selected characteristics on the choice of equivalent scale
The choice of the equivalent scale model and the change of its parameters are projected into changes of the income distribution and its basic and advanced characteristics. It changes not only the moment and quantile measures of location, variability and concentration (skewness and kurtosis) of the distribution; this change of scale is automatically projected also into the advanced indicators of income inequality, monetary poverty, etc. The reaction of the selected indicators, i.e. the median, both quartiles, the Gini index and the Foster-Greer-Thorbecke measures of poverty, to the selection of the models and the change of their parameters is depicted in Figures 1 – 3. The influence of
the parameter setting is presented simultaneously for both models: in the left part we can see the linear model, in the right part the non-linear model is depicted. The horizontal axis shows the change of parameter α (0 ≤ α ≤ 1), whereas the vertical axis shows in the case of the linear model the change of parameter β (0 ≤ β ≤ 1) and in the case of the non-linear model the change of parameter γ (0 ≤ γ ≤ 1). The sensitivity is presented using isoquants, i.e. the curves with a constant value of the considered characteristic. The curves perpendicular to the isoquants are gradients of the considered characteristics. The density of the curves is proportional to the measure of change of the characteristics and thus it represents the sensitivity to the change of parameters in the given model.
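The contour maps in Figures 1 – 3 are obtained by recomputing the chosen statistic over a grid of scale parameters. A minimal sketch of this loop for the median of equalized incomes under the non-linear scale (2) follows; the households and incomes are synthetic stand-ins for the EU-SILC file, and the plotting step itself is omitted.

import numpy as np

rng = np.random.default_rng(0)
n = 500
income = rng.lognormal(mean=12.5, sigma=0.5, size=n)   # synthetic household incomes
adults = rng.integers(1, 4, size=n)                    # synthetic numbers of adults (1-3)
children = rng.integers(0, 3, size=n)                  # synthetic numbers of children (0-2)

alphas = np.linspace(0.0, 1.0, 21)
gammas = np.linspace(0.0, 1.0, 21)
median_grid = np.empty((len(gammas), len(alphas)))

for i, g in enumerate(gammas):
    for j, a in enumerate(alphas):
        cu = (adults + a * children) ** g              # formula (2)
        median_grid[i, j] = np.median(income / cu)

# median_grid can now be handed to a contour-plotting routine to draw isoquants
# analogous to the right-hand panels of Figures 1-3
print(median_grid[::10, ::10].round(0))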
Figure 1 Reaction of mean (upper) and median (lower): linear model (left), power model (right)
Figure 1 presents the contour maps for the mean and the median. This allows us to visually depict the difference in the reaction of robust and non-robust characteristics of the "center" of the income distribution to the change of parameters in both models. The different shapes and densities of the depicted isoquants show simultaneously the similarity of the reaction within the same type of model and the dissimilarity of the reaction between different model types. The choice of the linear or non-linear model is projected into the reaction to the change of the particular parameters (direction and curvature of the isoquants). Concerning the sensitivity of the reaction, it can be observed that the robust statistic, i.e. the median, reacts more sensitively; its isoquants have a higher density.

Figure 2 depicts the reaction of two basic measures of monetary poverty: the risk of poverty and the poverty gap. From the presented graphs we can infer that the choice of the model is a fundamental decision in assessing both the risk of poverty and the poverty gap. Even in this case we can see that the reaction is similar for both measures within one considered model. The sensitivity of the reaction shows that the risk of poverty is more sensitive to the change of parameters in both models than the poverty gap. (A sensitivity analysis of the risk of poverty, the poverty gap and the severity of poverty with respect to the choice of parameters in the linear model can be found also in Bartošová and Bína [1].)
Figure 2 Reaction of risk of poverty (upper) and poverty gap (lower): linear model (left), power model (right)
Figure 3 Reaction of Gini index: linear model (left), power model (right)
Figure 3 illustrates the response of the Gini index. The different shapes and densities of the isoquants show the dependence of the reaction on both the choice of the model and the parameter setting. The perception of income inequality measured by the Gini index is therefore affected by the choice of the model.
4 Conclusions
The empirical sensitivity analysis of the selected statistics showed that in all cases we can observe a significant influence of the model type on the behaviour of the considered statistics under the given setting of the model parameters. The selection of a linear or non-linear expression of the functional dependence has an important impact on the results. It implies a change in the way the reaction depends on the setting of the pairs of parameters in the models; we observed that this relation is typical for the given type of model. Also the sensitivity, i.e. the measure of the reaction to the change of the parameters, differs between the two types of models and is typical for the corresponding model. The choice of the model and its parameters radically influences the values of the income indicators serving as a basis for the objective evaluation of the financial situation of households (risk of monetary poverty and income inequality). The EU methodology, which defines the equalized incomes using the linear model with parameters α = 0.3 and β = 0.5, does not provide a possibility to describe properly the national differences in economies of scale, which are essential particularly in the case of post-communist countries. Thus, the international comparison of income inequality and relative poverty does not necessarily reflect the true conditions accurately and, hence, the foundations of the setting of the social policy of particular countries appear to be disputable. We propose to focus on the proper choice of the model and its parameters in dependence on the situation in a particular country or region.
Acknowledgements The research was supported by Internal Grant Agency of the University of Economics in Prague under project F6/34/2016.
References
[1] Bartošová, J. and Bína, V.: Sensitivity of monetary poverty measures on the setting of parameters concerning equalization of household size. In: 30th International Conference Mathematical Methods in Economics, MME 2012 (Ramík, J. and Stavárek, D., eds.). Silesian University in Opava, School of Business Administration, Karviná, 2012, 25-30.
[2] Bühmann, B., Rainwater, L., Schmauss, G. and Smeeding, T.: Equivalence Scales, Well-being, Inequality, and Poverty: Sensitivity Estimates Across Ten Countries Using the Luxembourg Income Study (LIS) Database. Review of Income and Wealth 34 (1988), 115-142.
[3] Flachaire, E. and Nunez, O.: Estimation of the income distribution and detection of subpopulations: An explanatory model. Computational Statistics and Data Analysis 5 (2007), 3368-3380.
[4] Foster, J., Greer, J. and Thorbecke, E.: A Class of Decomposable Poverty Measures. Econometrica 52 (1984), 761-766.
[5] Jensen, J.: Minimum Income Levels and Income Equivalence Scales. Department of Social Welfare, Wellington, 1978.
[6] Jorgenson, D. and Slesnick, D.: Aggregate Consumer Behavior and Household Equivalence Scales. Journal of Business and Economic Statistics 5 (1987), 219-232.
[7] Morduch, J.: Poverty Measures. In: Handbook on Poverty Statistics: Concepts, Methods and Policy Use. United Nations, Department of Economic and Social Affairs, New York, 2005.
[8] OECD: What Are Equivalence Scales? Retrieved from http://www.oecd.org/eco/growth/OECD-NoteEquivalenceScales.pdf, 2016.
[9] Ravallion, M.: Poverty Comparisons: A Guide to Concepts and Methods. The World Bank, Washington, DC, 1992.
Evaluation of the destructive test in medical devices manufacturing

Oldřich Beneš1, David Hampel2

Abstract. Testing the quality of produced medical devices is undoubtedly a crucial issue. If a product of compromised quality is sold, it would first of all have an impact on the health of patients and, consequently, on the economic performance of the manufacturer. Destructive testing of medical devices is expensive and there are attempts to replace it by non-destructive tests. During the substitution of destructive tests with non-destructive ones, the strength of the test should be rationalized at the stage of process control planning. To do so, logit or probit models are the first choice. This article deals with an evaluation of typical destructive test results from both product and process verification and validation of medical devices. The differences between the estimated logit and probit models are discussed and arguments when to prefer which one are collected. Finally, the replacement of destructive testing by non-destructive testing is evaluated numerically.

Keywords: destructive test, logit model, medical devices manufacturing, non-destructive test, probit model.

JEL Classification: L63
AMS Classification: 62P30
1 Introduction
Within highly regulated industries, like the food industry and pharmaceutical or medical device production, there are two main requirements, or perhaps better said duties, in terms of manufacturing process definition. During the development phase, process validation is performed, where the evidence of process capability is given. After the market launch, evidence that the processes are under the same conditions as during the process validation should be present during the product's life cycle. At the same time, the strong demand for reducing the time and costs needed for any new product development, or even for the implementation of a change, should be reflected. One way of meeting these requirements is to replace the expensive destructive testing performed during the validation phase with much cheaper non-destructive testing during the production phase. To do this without compromising the quality and reliability of the manufacturing processes, a strong correlation between the results of direct, typically destructive, testing and non-destructive testing of products during standard production could provide the solution. From a theoretical point of view, the way to provide a system with this type of evidence should be described and rationalized before the planning phase.

So what do we have? Typically, we have information from the destructive test. During this test we directly measure the value that is required. This is usually called the structural value. If we find a dependent value, which can be measured by a non-destructive test, then we have a so-called response variable. This sounds like a standard linear model (e.g. a simple regression model) and can be thought of as having two 'parts': the structural variable and the response variable. The structural component is typically more or less normally distributed, covered by a normally distributed error, and is often the random component. The response variable cannot be normally distributed.

Replacing destructive tests with non-destructive ones is currently an issue in many fields. The use of statistical models for evaluating the reliability of non-destructive tests is described in the paper [10]. Probability models for the uncertainty in the flaw size determination and flaw detection are constructed. Flaw size models are based on the assumption that the measured size and the true flaw size are related in a simple way: two models based on logarithmic and logit transformations of the flaw size are considered. The authors of the paper [3] describe the discovery and comparison of empirical models for predicting corrosion damage from non-destructive test data derived from eddy current scans of USAF KC-135 aircraft. The results
1 Mendel University in Brno, Department of Statistics and Operation Analysis, Zemědělská 1, 613 00 Brno, Czech Republic, e-mail: xbenes19@mendelu.cz
2 Mendel University in Brno, Department of Statistics and Operation Analysis, Zemědělská 1, 613 00 Brno, Czech Republic, e-mail: qqhampel@mendelu.cz
also show that while a variety of modelling techniques can predict corrosion with reasonable accuracy, regression trees are particularly effective in modelling the complex relationships. The aim of work [4] is to evaluate the reliability of non-destructive test techniques for the inspection of pipeline welds employed in the petroleum industry. The European methodology for qualification of non-destructive testing presented in [7] has been adopted as the basis of inspection qualifications for nuclear utilities in many European countries. Quantitative non-destructive evaluation discussed in [1] provides techniques to assess the deterioration of a material or a structure, and to detect and characterize discrete flaws. Therefore, it plays an important role in the prevention of failure. The described techniques are used in processing, manufacturing and for in-service inspection; they are particularly important for the in-service inspection of high-cost and critical load-bearing structures whose failure could have tragic consequences. The objective of this paper is to verify the possibility of replacing destructive testing based on tearing of the component with non-destructive testing involving only measurement of the component part.
2 Material and Methods
Our datasets come from destructive tests of a medical device. We focus on the part where two pipes of the same diameter are connected together. This procedure is illustrated in Fig. 1. Pipe A is expanded by tool C. Furthermore, pipes A and B are connected together under pressure. The overlap length of pipes A and B is measured (in mm). When the destructive test is performed, the strength at which the pipes are torn apart is measured (in N). Finally, we have a binomial variable that indicates whether the device passed (1) or failed (0) the test.
Figure 1 Illustration of pipe connections

We will work with the results of 6 destructive tests, in each of which 60 devices were tested. The first three sets we will use as training sets dedicated to estimation (we will use the notation O1, O2 and O3 for the particular datasets and O for them together). The next three sets (V1, V2, V3, and V together) will be used for the verification of the estimated relationships. The binary variable on passing the test is definitely not randomized, because the data are collected in a predefined time or sequence. Nevertheless, a binary outcome is still thought to depend on a hidden, unknown Gaussian variable. The generalized linear model (GLiM) was developed to address such cases, and logit and probit models, as special cases of GLiMs, are appropriate for binary variables or multi-category response variables with some adaptations to the process [2, 6]. One of these two models should be the first-choice solution for similar situations.

A GLiM has three parts: a structural component, a link function, and a response distribution. The way we think about the structural component here does not really differ from how we think about it in standard linear models; in fact, that is one of the great advantages of GLiMs. The link function is the key to GLiMs: since the distribution of the response variable is non-normal, it is what lets us connect the structural component to the response one, i.e. it 'links' them. Since the logit and probit are links, understanding link functions allows us to intelligently choose when to use which one. Although there can be many acceptable link functions, there is often one that is special: the canonical link function. The canonical link for binary response data (more specifically the binomial distribution) is the logit. Nevertheless, there are lots of functions that can map the structural component to the interval [0, 1], and thus be acceptable; the probit is also popular. The choice should be made based on a combination of knowledge of the response distribution, theoretical considerations and the empirical fit to the data. To start with, if our response variable is the outcome of a Bernoulli trial (i.e. 0 or 1), our response distribution will be binomial, and what we are actually modelling is the probability of an observation being a 1. As a result, any function that maps the real number line to the interval [0, 1] will work. From the point of view of substantive theory, if we are thinking of our covariates as being directly connected to the probability of success, then we typically choose logistic regression because it is the canonical link.
However, both the logit and the probit are symmetrical; if we believe that the probability of success rises slowly from zero but then tapers off more quickly as it approaches one, then a cloglog link is called for. The cloglog is slightly different from the others: it is based on a standard extreme value distribution and the curve approaches one more gradually than it approaches zero. The alternative of a 'slower' approach to zero can be achieved by redefining the response so as to apply the model to, let us say, unemployment rather than employment. The logit and probit functions yield very similar outputs when given the same inputs, except that the logit is slightly further from the bounds when they 'turn the corner'. In addition, we could add the cloglog function here and they would lie on top of each other. Notice that the cloglog is asymmetrical whereas the others are not; it starts pulling away from 0 earlier, but more slowly, and approaches close to 1 and then turns sharply.

Additional objections could be based on the statement that probit results are not easy to interpret. This is not true, although the interpretation of the betas is less intuitive. With logistic regression, a one unit change in X is associated with a β change in the log odds of 'success', all else being equal. With a probit, this would be a change of β z's; with two observations in a dataset with z-scores of 1 and 2, for example, these can be converted into predicted probabilities by passing them through the normal cumulative distribution function. It is also worth noting that the usage of probit versus logit models is heavily influenced by disciplinary tradition. For instance, economists seem far more used to probit analysis, while researchers in psychometrics rely mostly on logit models. So while both models are abstractions of real world situations, the logit is usually faster to use on larger problems, which means multiple alternatives or large datasets. Multinomial logit functions are classically used to estimate spatial discrete choice problems, even though the actual phenomenon is better modelled by a probit; but in the choice situation the probit is more flexible, so it is more used today.

The major difference between logit and probit models lies in the assumption on the distribution of the error terms in the model. For the logit model, the errors are assumed to follow the standard logistic distribution, while for the probit, the errors are assumed to follow the normal distribution. In principle, for general practice both model formalisms work fine and often lead to the same conclusions regardless of the problem complexity. While the distributions differ in their theoretical rationale, they can for all practical purposes in our view be treated the same, as it would take an impossibly large sample to distinguish them empirically.
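The behaviour of the three links described above can be seen by evaluating their inverse functions side by side; this is a generic numerical illustration, not tied to the data of this paper.

import numpy as np
from scipy.stats import norm

inv_logit = lambda eta: 1.0 / (1.0 + np.exp(-eta))      # inverse logit link
inv_probit = norm.cdf                                    # inverse probit link
inv_cloglog = lambda eta: 1.0 - np.exp(-np.exp(eta))     # inverse cloglog link

for eta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"eta = {eta:+.1f}: logit {inv_logit(eta):.3f}, "
          f"probit {inv_probit(eta):.3f}, cloglog {inv_cloglog(eta):.3f}")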
Using this estimate, the MSE of all datasets is calculated; the MSE from verification datasets V1, V2, V3 is compared to the MSE of O1, O2 and O3. When the estimated regression function is verified, we use it for modelling the binomial variable indicating passing the test by an overlap of the pipes. For this purpose, we use both the logit and probit link. All of the calculations were performed in the computational system Matlab R2015b.
3 Results
Tearing strength was modelled by the overlap length of the pipes for all of the training datasets. The relationship is not symmetrical and we think it is well described by a cubic polynomial (all of the parameters are statistically significant, the determination is higher than 0.95 for all of the datasets). There are no important differences between the general and the robust regression, see Fig. 2. The regressions for datasets O1, O2 and O3 can be considered to be identical, because the test statistic value of the general linear hypothesis, 1.004, is lower than the critical value 1.994. This means that the estimated relationship is stable at least over the training datasets. The regression cubic polynomial was then estimated based on all training datasets O. Using this estimate, the mean square error (MSE) was calculated for all of the particular datasets, see Tab. 1. It is clear that the regression estimated with dataset O is suitable also for the verification datasets V1, V2 and V3. The highest MSE was calculated for dataset O1; this is caused by the presence of outliers, evident from the visible difference between the OLS and the robust estimate of the regression function.
Sample   O1     O2     O3     O average   V1     V2     V3     V average
MSE      3.90   1.80   1.65   2.45        1.13   1.74   1.62   1.50

Table 1 MSE of all datasets when the regression estimated for all training sets O is used
Furthermore, we modelled the binomial results of the destructive test by the pipe overlap length. Note that when we model this binomial variable by the strength sufficient for tearing off the pipes, we obtain 100 % success. We used the functional form describing the relationship between the tearing strength and the pipe overlap length, i.e. a cubic polynomial. Logit and probit link functions are employed. The results are summarized in Tab. 2 and graphically illustrated in Fig. 3.
Figure 2 Estimated regression function – cubic polynomial (solid line OLS estimate, dashed line robust regression with bisquare weighting). X-axis pipe overlap length in mm, y-axis pipe tearing strength in N. Left top O1 dataset, left bottom O3, right top O2, right bottom all mentioned datasets together.

It is clear that there are no substantial differences between the results obtained with the logit and the probit links. It should be mentioned that for the probit link we obtain systematically lower p-values than for the logit link when testing the significance of the particular regression parameters. There are only slight differences between the MSE for the logit and the probit links. The percentage of correctly classified cases is always the same. This percentage is relatively high, above 93.3 % for the particular datasets. In Fig. 3, it is clear that the transition from 0 to 1 is not symmetric to the transition from 1 to 0, which reflects the non-symmetry of the previously estimated cubic polynomial.
O1 0.012 96.7 0.033 96.7
O2 0.000 100.0 0.000 100.0
O3
all O
0.036
0.065
93.3
88.3
0.036
0.065
93.3
88.3
V1 0.021 98.3 0.022 98.3
V2 0.016 96.7 0.016 96.7
V3 0.000 100.0 0.000 100.0
all V 0.059 93.3 0.059 93.3
Table 2 MSE and percentage of correctly classified cases of GLiM estimate of binomial variable indicating passing the test by pipe overlap length with logit and probit link
Figure 3 Estimated regression function – cubic polynomial (solid line GLiM estimate with probit link, dashed line GLiM estimate with logit link). X-axis pipe overlap length in mm, y-axis result of destructive test (1 passed, 0 failed). Left graph all training datasets together, right graph all verification datasets together.
4 Discussion and Conclusions
The relationship between the strength at which the component is destructed and the length of the pipe overlap was determined based on the datasets dedicated to the estimation. An adequate description is given by the cubic polynomial, whose coefficients were estimated by both the OLS method and robust regression with bisquare weights. The stability of this relationship was judged by the test of equality of three regression functions based on the general linear hypothesis. Furthermore, the determined functional form of the dependency was verified using the control datasets. The cubic polynomial enables a correct estimation of the asymmetrical relation. However, this representation has a limited range of applicability, because outside the range of the data we cannot assume proper behaviour. This shortcoming could be overcome by using a nonlinear Gauss function of order 2, but in that case we have 6 parameters instead of 4, and it is also not easy to use this function in a GLM. In addition, we attempted to fit a Bernoulli-distributed variable, which means that the component either passed (1) or failed (0). This variable is fully described by the strength at which the component is destructed. When we use the length of pipe overlap instead of this strength, we obtain a percentage of correctly estimated cases of at least 93.3 % for both the training and verification datasets. Finally, the performed analysis shows that it is possible to replace the destructive test by a non-destructive one. This fact has important economic consequences. The main reason for this alternative testing is to reduce the cost of process control during regular production without any compromise in the quality of the product. The situation from which the test results come is rather typical: a high volume of annual production with a regular unit price. The figures for the mentioned production are the annual demand for the first year coming from a ramp-up estimation, i.e. 411,538 pieces, with an annual increase of 1 %. The price of one piece at the time of testing was 88.70 EUR. To be on the safe side, the annual decrease in price is estimated as 10 %. For the same reason, the number of tests is estimated as 3 fails for one success. However, this test should be based on the strong results of an engineering study; the possibility of failure is still rather high, and rules for risk management of this safety factor should be used. To obtain a better and more detailed picture, the return on investment is calculated with a time scale set at 6 months over a 5-year life cycle. In this situation, which is ideal for using the alternative testing, the return on investment would take three months and the complete saving before the end of the life cycle is as high as 3,038,528 EUR. This is 2.03 % of the calculated turnover (see Fig. 4, Scenario 1). If we consider only one test attempt, the results will obviously be even more promising (see Fig. 4, Scenario 2). But let us simulate a different situation. A real business case from a different company shows figures where the annual demand is only 477 pieces, but with a high unit price of 324.38 EUR. The price is much more stable and the forecast is for an annual decrease of only 2 %. At the same time, this product is manufactured based on an individual order; therefore, standard process validation is not possible. In order to perform the verification, two identical pieces need to be produced, whereby one of them will undergo destructive testing and the second will be sent to the customer.
This situation still makes sense despite the small annual production. Although the total number of pieces produced during the estimated five-year life cycle will only be 2,434, the economic effect is still high due to the high price of a single product. Taking into account the safety factor
mentioned above, this situation fits Scenario 3 in Fig. 4. The return on investment will be reached after two years, and the total saving before the end of the life cycle will be 402,531 EUR (which is 53.60 % of the calculated turnover). Similarly, without the safety factor it would be even more promising. The impact is higher because of the higher price of one unit, which means a higher price for the alternative test validation at the beginning (see Fig. 4, Scenario 4).
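The cumulative discounted cash-flow logic behind these scenarios can be sketched as follows. Only the 6-month time step, the 5-year life cycle, the demand and price figures and their growth rates are taken from the text; the discount rate, the one-off validation cost and the share of production consumed by destructive tests are hypothetical placeholders, so the printed numbers are illustrative only.

```python
annual_demand = 411_538      # pieces in the first year (ramp-up estimate, Scenario 1)
demand_growth = 0.01         # +1 % per year
unit_price = 88.70           # EUR per piece at the time of testing
price_decline = 0.10         # -10 % per year (safe-side estimate)
tests_per_success = 4        # safety factor: 3 fails for one successful test
test_share = 0.001           # hypothetical share of produced pieces destroyed by testing
validation_cost = 250_000.0  # hypothetical one-off cost of validating the alternative test
discount_rate = 0.05         # hypothetical annual discount rate

cumulative = -validation_cost
for step in range(1, 21):                                        # 5-year life cycle in 6-month steps
    year = (step - 1) // 2
    demand = annual_demand * (1 + demand_growth) ** year / 2     # pieces per half-year
    price = unit_price * (1 - price_decline) ** year             # current unit price
    saving = demand * test_share * tests_per_success * price     # value of destroyed pieces avoided
    cumulative += saving / (1 + discount_rate) ** (step / 2)
    print(f"month {6 * step:3d}: cumulative discounted cash flow = {cumulative:>12,.0f} EUR")
```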
Figure 4 Effects of different production scenarios on company cumulative discounted cashflow

Finally, from the economic point of view, attempts to find an alternative test method make sense in the case of high volume production, or with a high price of one unit together with verification or high volume process control sampling. In these situations, the initial cost of alternative test method validation will be paid back in an acceptable time and increase competitiveness on the market.
References
[1] Achenbach, J. D.: Quantitative nondestructive evaluation. International Journal of Solids and Structures 37, 1–2 (2000), 13–27.
[2] Agresti, A.: An Introduction to Categorical Data Analysis. John Wiley & Sons Inc., New York, 2007.
[3] Brence, J. R. and Brown, D. E.: Data mining corrosion from eddy current non-destructive tests. Computers & Industrial Engineering 43, 4 (2002), 821–840.
[4] Carvalho, A. A., Rebello, J. M. A., Souza, M. P. V., Sagrilo, L. V. S. and Soares, S. D.: Reliability of nondestructive test techniques in the inspection of pipelines used in the oil industry. International Journal of Pressure Vessels and Piping 85, 11 (2008), 745–751.
[5] Doudová, L. and Hampel, D.: Testing equality of selected parameters in particular nonlinear models. In: AIP Conference Proceedings 1648, 2015.
[6] Fitzmaurice, G. M., Laird, N. M. and Ware, J. H.: Applied Longitudinal Analysis. John Wiley & Sons Inc., New York, 2011.
[7] Gandossi, L. and Simola, K.: Framework for the quantitative modelling of the European methodology for qualification of non-destructive testing. International Journal of Pressure Vessels and Piping 82, 11 (2005), 814–824.
[8] Huber, P. J. and Ronchetti, E. M.: Robust Statistics. John Wiley & Sons Inc., New York, 2009.
[9] Milliken, G. A. and Graybill, F. A.: Extensions of the general linear hypothesis model. Journal of the American Statistical Association 65, 330 (1970), 797–807.
[10] Simola, K. and Pulkkinen, U.: Models for non-destructive inspection data. Reliability Engineering and System Safety 60, 1 (1998), 1–12.
Sentiment Prediction on E-commerce Sites Based on Dempster-Shafer Theory
Ladislav Beranek1

Abstract. Sentiment prediction on e-commerce sites is usually based on a numerical score, eventually followed by a free-text review. These scores and text reviews are written by people who bought certain products or services (especially in the hotel industry) and want to evaluate their purchases. Most of the works that try to evaluate customers' sentiment from these evaluations deal with textual analysis of the reviews. In this work, we consider a review as a collection of sentences, each with its own orientation and sentiment score. A score aggregation method is then needed to combine the sentence-level scores into an overall review rating. On the basis of an analysis of existing methods, we propose a new method that aggregates scores on the basis of the Dempster-Shafer theory of evidence. In the proposed method, we first determine the polarity of reviews with the use of machine learning and then consider the sentence scores as evidence for an overall review assessment. Results from online data files show the usefulness of our method in comparison with existing methods.

Keywords: E-commerce, Dempster-Shafer theory, Sentiment analysis, Customers' reviews, Recommendation.
JEL Classification: C49
AMS Classification: 28E15
1 Introduction
Online reviews can be good or bad; they are seldom neutral. These reviews are often provided in a free-text format by various websites that sell products and services, and they can help people decide better in their purchasing decisions. Mostly, these text reviews are accompanied by data about the average review rating. The most common scheme of representing the average review rating is a five-star score. Sentiment prediction methods are also used to extract other useful information from review texts. For example, the text body of reviews may be used to extract reviewers' opinions toward different aspects of a product or service. For sellers or service providers (hotels, etc.), it is important to process these reviews, to find out the sentiment of their customers and to check feedback. However, full comprehension of natural language text contained in reviews remains beyond the power of algorithms used at the present time. On the other hand, the statistical analysis of relatively simple sentiment cues can provide a meaningful sense of how an online review impacts customers. There are essentially two ways to use the text body of reviews for predicting their overall scores: considering a review as a single unit of text, or treating it as a collection of sentences, each with its own sentiment orientation and score. The main drawback of the first viewpoint is that it ignores the fine-grained structural information of the textual content [6]. Therefore, the second point of view is preferable. In this approach, the sentiment score of each sentence within a review is first computed. Then, a score aggregation method is used to combine the sentence-level scores into an overall review score. In fact, score aggregation can be seen as a data fusion step in which sentence scores are multiple sources of information that should be combined to generate a single review score. Score aggregation is a widespread problem in sentiment analysis [5, 6, 10]. In this paper, we describe a method based on the Dempster-Shafer (DS) theory of evidence [18]. We determine the consumers' sentiment from reviews from a chosen e-shop, and also how this sentiment varies with time, and we give an example of our analysis. In this paper, we discuss several aspects of our model for sentiment analysis, which include:
• Our model relies on tracking the reference frequencies of adjectives with positive and negative connotations. We present a method for creating a sentiment lexicon.
• We construct statistical indexes which meaningfully reflect the significance of the sentiment of sentences, and we apply the Dempster rule to integrate them.
• We provide evidence of the validity of our sentiment evaluation by correlating our index with the evaluation of products on a chosen e-shop. Positive correlations prove that our model can measure the public sentiment of consumers.
• Finally, we discuss possible applications and implications of our work.
1 University of South Bohemia, Faculty of Economics, Department of Applied Mathematics and Informatics, Studentska 13, Ceske Budejovice, beranek@ef.jcu.cz.
The remainder of the paper is organized as follows. Sections 2 and 3 review the background and related work; Section 4 describes the materials and methods; Section 5 reports experimental results and presents a discussion of the examined methods; finally, Section 6 sets out conclusions and future work.
2 Dempster-Shafer Theory
Information related to decision making about a cyber situation is often uncertain and incomplete. Therefore, it is of vital importance to find a feasible way to make decisions about the appropriate response to the situation under this uncertainty. Our model is a particular application of the Dempster-Shafer theory. The Dempster-Shafer theory [18] is designed to deal with the uncertainty and incompleteness of available information. It is a powerful tool for combining evidence and changing prior knowledge in the presence of new evidence. The Dempster-Shafer theory can be considered as a generalization of the Bayesian theory of subjective probability [8]. In this paper, we propose a unique trust model based on the Dempster-Shafer theory which combines evidence concerning reputation with evidence concerning possible illegal behavior on an Internet auction. In the following paragraphs, we give a brief introduction to the basic notions of the Dempster-Shafer theory (frequently called the theory of belief functions or theory of evidence). Considering a finite set Θ referred to as the frame of discernment, a basic belief assignment (BBA) is a function m: 2^Θ → [0, 1] such that
\[ \sum_{A \subseteq \Theta} m(A) = 1, \qquad (1) \]
where m(∅) = 0, see [18]. The subsets of 2^Θ which are associated with non-zero values of m are known as focal elements, and the union of the focal elements is called the core. The value of m(A) expresses the proportion of all relevant and available evidence that supports the claim that a particular element of Θ belongs to the set A but not to a particular subset of A. This value pertains only to the set A and makes no additional claims about any subsets of A. We denote this value also as a degree of belief (or basic belief mass – BBM). Shafer further defined the concepts of belief and plausibility [18] as two measures over the subsets of Θ as follows:
\[ Bel(A) = \sum_{B \subseteq A} m(B), \qquad (2) \]
\[ Pl(A) = \sum_{B \cap A \neq \emptyset} m(B). \qquad (3) \]
A BBA can also be viewed as determining a set of probability distributions P over Θ such that Bel(A) ≤ P(A) ≤ Pl(A). It can easily be seen that these two measures are related to each other as Pl(A) = 1 − Bel(¬A). Moreover, both of them are equivalent to m. Thus one needs to know only one of the three functions m, Bel, or Pl to derive the other two; hence we can in fact speak about a belief function using the corresponding BBA. Dempster's rule of combination can be used for pooling evidence represented by two belief functions Bel1 and Bel2 over the same frame of discernment coming from independent sources of information. Dempster's rule of combination for two belief functions Bel1 and Bel2 defined by (equivalent to) BBAs m1 and m2 is defined as follows (the symbol ⊕ is used to denote this operation):
\[ (m_1 \oplus m_2)(A) = \frac{1}{1-k} \sum_{B \cap C = A} m_1(B)\, m_2(C), \qquad (4) \]
where
\[ k = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C). \qquad (5) \]
Here k is frequently considered to be a measure of conflict between the two belief functions m1 and m2 [18]. Unfortunately, this interpretation of k is not correct, as it also includes the internal conflict of the individual belief functions m1 and m2 [5]. Dempster's rule is not defined when k = 1, i.e. when the cores of m1 and m2 are disjoint. The rule is commutative and associative; as it serves for the cumulation of beliefs, it is not idempotent.
When calculating contextual discounting [2] we also use the un-normalized (conjunctive) combination rule in the form [18] (the symbol ∩ is used to denote this operation):
\[ (m_1 \cap m_2)(A) = \sum_{B \cap C = A} m_1(B)\, m_2(C). \qquad (6) \]
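A minimal sketch of equations (4)–(6), not taken from the paper: the function below combines two BBAs (represented as dictionaries over frozensets of the frame's elements) with either the normalized Dempster rule or the un-normalized conjunctive rule; the mass values are made up for illustration.

```python
from itertools import product

def combine(m1, m2, normalize=True):
    """Combine two BBAs given as dicts {frozenset: mass} over the same frame of discernment."""
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc                       # mass falling on the empty set, i.e. k in eq. (5)
    if normalize and conflict < 1.0:                  # Dempster's rule, eq. (4)
        combined = {a: v / (1.0 - conflict) for a, v in combined.items()}
    return combined, conflict                         # with normalize=False: conjunctive rule, eq. (6)

# Made-up masses over the frame {pos, neg}
POS, NEG = frozenset({"pos"}), frozenset({"neg"})
THETA = POS | NEG
m1 = {POS: 0.6, NEG: 0.1, THETA: 0.3}
m2 = {POS: 0.5, NEG: 0.3, THETA: 0.2}
print(combine(m1, m2))                    # normalized combination and the conflict k
print(combine(m1, m2, normalize=False))   # un-normalized (conjunctive) combination
```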
3 Related Work
Sentiment analysis has been receiving increasing attention at the present time. Some examples of sentiment analysis applications include predicting sales performance [11], ranking products and merchants, linking Twitter sentiment with public opinion polls [17], and identifying important product aspects [23] or services [3]. In addition to these traditional applications, sentiment analysis has presented new research opportunities for the social sciences in recent years, for instance, characterizing social relations [7] or predicting the stock market using Twitter moods [4]. Sentiment analysis tasks can also be studied in terms of their sentiment prediction component. In this sense, existing approaches can be grouped into three main categories: machine learning approaches [13, 15], linguistic-based strategies, and lexicon-based methods [20, 21]. In the first category, input texts are first converted to feature vectors and then, using a machine learning algorithm, a classifier is trained on a human-coded corpus; the trained classifier is then used for sentiment prediction. Linguistic approaches use simple rules based upon compositional semantics; they exploit the grammatical structure of text to predict its polarity. Lexicon-based approaches work primarily by identifying some predefined terms from a lexicon of known sentiment-bearing words; such techniques use a list of sentiment-bearing words and phrases that is called an opinion lexicon [10]. In the current study we use a lexicon-based method. Lexicon-based techniques deal with determining the semantic orientation of words. The author of [20] evaluates adjectives for polarity as well as gradation classification. A statistical model groups adjectives into clusters corresponding to their tone or orientation. The use of such gradable adjectives is an important factor in determining subjectivity, and statistical models are used to predict the enhancement of adjectives. The authors of [12] evaluate the sentiment of an opinion holder (entity) using WordNet to generate lists of positive and negative words. They assume that synonyms (antonyms) of a word have the same (opposite) polarity. The percentage of a word's synonyms belonging to lists of either polarity was used as a measure of its polarity strength, while those below a threshold were considered neutral or ambiguous. Several systems have been built which attempt to quantify opinion from product reviews. Pang, Lee and Vaithyanathan [16] perform sentiment analysis of movie reviews; their results show that machine learning techniques perform better than simple counting methods, achieving an accuracy of polarity classification of roughly 83 %. The author of [1] detects the polarity of reviews using a machine learning approach and then uses integration based on belief function theory for computing the overall review rating. Nasukawa and Yi [14] identify local sentiment as being more reliable than global document sentiment, since human evaluators often fail to agree on the global sentiment of a document. They focus on identifying the orientation of sentiment expressions and determining the target of these sentiments; shallow parsing identifies the target and the sentiment expression, and the latter is evaluated and associated with the target. In [22], they follow up by employing a feature-term extractor: for a given item, the feature extractor identifies parts or attributes of that item, e.g., battery and lens are features of a camera.
4 Materials and methods
4.1 Sentiment lexicon generation
Two recent and widely used lexicon-based tools for sentiment strength detection are SO-CAL [20] and SentiStrength [21]. SO-CAL (Semantic Orientation CALculator) uses lexicons of terms coded on a single negative to positive range of −5 to +5. Its opinion lexicon was built by human coders tagging lemmatized nouns and verbs, as well as adjectives and adverbs, for strength and polarity in 500 documents from several corpora (e.g., rating "hate" as −4 and "hilariously" as +4). In SO-CAL, intensifying words have a percentage associated with them. For example, "slightly" and "very" have −50 % and +25 % modification impacts, respectively. Therefore, if "good" has a sentiment value of 3, then "very good" would have a value of 3 × (100 % + 25 %) = 3.75. Moreover, SO-CAL amplifies the strength of negative expressions in texts and decreases the strength of frequent terms. Empirical tests showed that SO-CAL has robust performance for sentiment detection across different types of web texts [20].
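To make the intensifier arithmetic concrete, here is a minimal sketch of lexicon-based scoring in the SO-CAL spirit; the tiny lexicon and the modifier percentages below are illustrative placeholders, not the actual SO-CAL resources or the lexicon built in this paper.

```python
# Illustrative mini-lexicon and intensifier percentages (placeholders, not real SO-CAL data)
lexicon = {"good": 3, "bad": -3, "hate": -4, "hilariously": 4}
intensifiers = {"very": 0.25, "slightly": -0.50}

def score_phrase(words):
    """Score a phrase: each sentiment word is scaled by a preceding intensifier, if any."""
    total, modifier = 0.0, 0.0
    for w in words:
        if w in intensifiers:
            modifier = intensifiers[w]
        elif w in lexicon:
            total += lexicon[w] * (1.0 + modifier)
            modifier = 0.0
    return total

print(score_phrase(["very", "good"]))      # 3 * (1 + 0.25) = 3.75, as in the text
print(score_phrase(["slightly", "bad"]))   # -3 * (1 - 0.50) = -1.5
```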
4.2 The determination of belief functions for sentiment analysis
We denote Θ = {positive, negative} as the frame of discernment concerning the sentiment. The power set 2^Θ (the set of all subsets) has three elements (we do not consider the empty set): 2^Θ = {{positive}, {negative}, Θ}, where {positive} represents, for example, a positive evaluation of some product (positive sentiment), {negative} means that this evaluation is negative (negative sentiment), and Θ denotes ignorance. It means that we cannot judge whether the respective evaluation is positive or negative, for example if the evaluation is neutral (neither positive nor negative).

Polarity scores
We use the raw sentiment scores to track two trends over time:
• Polarity: Is the sentiment associated with the entity positive or negative?
• Subjectivity: How much sentiment (of any polarity) does the entity garner?

Subjectivity indicates the proportion of sentiment to frequency of occurrence, while polarity indicates the percentage of positive sentiment references among total sentiment references. We focus first on polarity. We evaluate world polarity using sentiment data for all entities for the entire time period:
\[ m_p(\{positive\}) = \frac{\text{positive sentiment references}}{\text{total sentiment references}}, \quad m_p(\{negative\}) = \frac{\text{negative sentiment references}}{\text{total sentiment references}}, \qquad (7) \]
\[ m_p(\Theta) = 1 - m_p(\{positive\}) - m_p(\{negative\}). \]
We can evaluate entity polarity_i using these equations for those days (day_i) on which we want to monitor changes in sentiment.

Subjectivity scores
The subjectivity time series reflects the amount of sentiment an entity is associated with, regardless of whether the sentiment is positive or negative. Reading all news text over a period of time and counting the sentiment in it gives a measure of the average subjectivity level of the world. We evaluate world subjectivity using sentiment data for all entities for the entire time period:
\[ m_s(\Theta) = \frac{\text{total sentiment references}}{\text{total references}}, \quad m_s(\{positive\}) = \frac{1 - m_s(\Theta)}{2}, \quad m_s(\{negative\}) = 1 - m_s(\{positive\}) - m_s(\Theta). \qquad (8) \]
We can also evaluate entity subjectivity_i using sentiment data for that day (day_i) only, similarly as in the previous case.

General index of sentiment
Once we obtain the belief functions, we combine them in a consistent manner to get a more complete assessment of what the whole group of signs indicates. The combination of the belief functions is done with the help of Dempster's combination rule, see equation (4):
\[ (m_p \oplus m_s)(\{positive_i\}) = \frac{1}{K}\big[ m_p(\{positive_i\})\, m_s(\{positive_i\}) + m_p(\{positive_i\})\, m_s(\Theta) + m_p(\Theta)\, m_s(\{positive_i\}) \big], \]
\[ (m_p \oplus m_s)(\{negative_i\}) = \frac{1}{K}\big[ m_p(\{negative_i\})\, m_s(\{negative_i\}) + m_p(\{negative_i\})\, m_s(\Theta) + m_p(\Theta)\, m_s(\{negative_i\}) \big], \qquad (9) \]
\[ (m_p \oplus m_s)(\Theta) = \frac{1}{K}\, m_p(\Theta)\, m_s(\Theta), \]
where
\[ K = 1 - \big( m_p(\{negative_i\})\, m_s(\{positive_i\}) + m_p(\{positive_i\})\, m_s(\{negative_i\}) \big). \]
It should be emphasized that these scores are either positive or negative, or we cannot decide; we cannot have a positive assessment and at the same time a negative one. In accordance with this assumption, we have also defined the appropriate frame of discernment.
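A minimal numerical sketch of equations (7)–(9), with made-up reference counts for one entity on one day (placeholders, not data from the paper):

```python
# Made-up daily counts for one entity
pos_refs, neg_refs, total_sent_refs, total_refs = 40, 10, 50, 400

# Polarity BBA, eq. (7)
m_p = {"pos": pos_refs / total_sent_refs,
       "neg": neg_refs / total_sent_refs}
m_p["theta"] = 1.0 - m_p["pos"] - m_p["neg"]

# Subjectivity BBA, eq. (8)
m_s = {"theta": total_sent_refs / total_refs}
m_s["pos"] = (1.0 - m_s["theta"]) / 2.0
m_s["neg"] = 1.0 - m_s["pos"] - m_s["theta"]

# Combination by Dempster's rule, eq. (9)
K = 1.0 - (m_p["neg"] * m_s["pos"] + m_p["pos"] * m_s["neg"])
index = {
    "pos": (m_p["pos"] * m_s["pos"] + m_p["pos"] * m_s["theta"] + m_p["theta"] * m_s["pos"]) / K,
    "neg": (m_p["neg"] * m_s["neg"] + m_p["neg"] * m_s["theta"] + m_p["theta"] * m_s["neg"]) / K,
}
index["theta"] = m_p["theta"] * m_s["theta"] / K
print(index)   # general sentiment index for the entity on the given day; the masses sum to 1
```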
5 Results and discussion
We evaluated the analysis system on a corpus of hotel reviews crawled from the web; the corpus covered 158 hotels and contained 642 reviews. For the evaluation, these reviews were manually classified with respect to their polarity, including the neutral polarity besides the positive and negative ones. We also annotated whether each review covers more than one topic. The distribution from this manual classification is shown in Table 1.
Reviews   Positive   Negative   Neutral   Multi-topic
642       305        142        123       72

Table 1. Manual corpus classification
The experimental data were formed by the following approach. First, the sentiment holder and target pair frequencies are calculated. Then the most popular sentiment holders and targets are selected by identifying a threshold where the frequency drops significantly. This approach ensures that the selected data have enough common sentiment targets among the different sentiment holders for the evaluation. For the evaluation, we used the accuracy measure. The accuracy of a prediction method may be calculated according to the following equation:
\[ Accuracy = \frac{TP + TN}{TP + FP + TN + FN}, \qquad (10) \]
where TP and TN are true positives and true negatives, while FP and FN are false positives and false negatives, respectively. In order to compare our approach with others, we predict the sentiments using different algorithms: decision tree (J48 classification), K-Star, Naïve Bayes, and Support Vector Machines.
Method        Correct   False   Accuracy
J48           281       143     0.66
K-star        251       173     0.59
Naïve Bayes   263       161     0.62
SVM           277       147     0.65
Our model     268       156     0.63

Table 2. Comparison of various approaches with our model based on belief functions and their integration
Evaluating the different algorithms, the results in Table 2 were achieved. Our model is relatively simple, and we expect an improvement in accuracy after rebuilding and adjusting it.
6 Conclusion
In this paper we have studied sentiment prediction applied to data from hotel review sites. An evaluation of the techniques presented in the paper showed sufficient accuracy in sentiment prediction. Our experimental study and results showed that the suggested approach can be a useful tool for generating profiles of user sentiments. There are many interesting directions that can be explored. In our future work, we want to explore how sentiment can vary by demographic group, news source or geographic location. By expanding our model with sentiment maps, we will be able to identify geographical regions of favorable or adverse opinions for given entities. We also want to analyze the degree to which our sentiment indices predict future changes in popularity or market behavior.
References
[1] Basiri, M. E., Naghsh-Nilchi, A. R., and Ghasem-Aghaee, N.: Sentiment Prediction Based on Dempster-Shafer Theory of Evidence. Mathematical Problems in Engineering 2014 (2014), Article ID 361201, 13 pages.
[2] Beranek, L., and Nydl, V.: The Use of Belief Functions for the Detection of Internet Auction Fraud. In: Mathematical Methods in Economics 2013 (MME 2013) (Vojackova, H., ed.). Coll Polytechnics Jihlava, Jihlava, 2013, 31–36.
[3] Bjørkelund, E., Burnett, T. H., and Nørvåg, K.: A study of opinion mining and visualization of hotel reviews. In: Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services (IIWAS '12). ACM, New York, NY, USA, 2012, 229–238.
[4] Bollen, J., Mao, H., and Zeng, X.: Twitter mood predicts the stock market. Journal of Computational Science 2 (2011), 1–8.
[5] Daniel, M.: Conflicts within and between belief functions. In: Proceedings of the Computational Intelligence for Knowledge-based Systems Design, and 13th International Conference on Information Processing and Management of Uncertainty. Dortmund, Germany, 2010, 696–705.
[6] Ganu, G., Kakodkar, Y., and Marian, A.: Improving the quality of predictions using textual information in online user reviews. Information Systems 38 (2013), 1–15.
[7] Groh, G., and Hauffa, J.: Characterizing social relations via NLP-based sentiment analysis. In: Proceedings of the International Conference on Weblogs and Social Media (ICWSM '11), 2011.
[8] Haenni, R.: Shedding new light on Zadeh's criticism of Dempster's rule of combination. In: Proceedings of the Eighth Int. Conf. on Information Fusion. Philadelphia, IEEE, 2005, 879–884.
[9] Lappas, T.: Fake reviews: the malicious perspective. In: Natural Language Processing and Information Systems. Springer, 2012, 23–34.
[10] Liu, B.: Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies vol. 5, Morgan & Claypool, 2012.
[11] Liu, B., Huang, X., An, A., and Yu, X.: ARSA: a sentiment-aware model for predicting sales performance using blogs. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '07). July 2007, 607–614.
[12] Liu, B.: Sentiment analysis: a multifaceted problem. IEEE Intelligent Systems 25, 3 (2010), 76–80.
[13] Minařík, M., and Šťastný, J.: Recognition of Randomly Deformed Objects. In: MENDEL 2008, 14th International Conference on Soft Computing. Brno, Czech Republic, 2008, 275–280.
[14] Nasukawa, T., and Yi, J.: Sentiment analysis: Capturing favorability using natural language processing. In: The Second International Conferences on Knowledge Capture, 2003, 70–77.
[15] Paltoglou, G., and Thelwall, M.: Twitter, MySpace, Digg: unsupervised sentiment analysis in social media. ACM Transactions on Intelligent Systems and Technology (TIST) 3 (2012), article 66.
[16] Pang, B., and Lee, L.: Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2 (2008), 1–135.
[17] O'Connor, B., Balasubramanyan, R., Routledge, B. R., and Smith, N. A.: From tweets to polls: linking text sentiment to public opinion time series. In: Proceedings of the International Conference on Weblogs and Social Media (ICWSM '10), vol. 11, 2010, 122–129.
[18] Shafer, G.: A mathematical theory of evidence. Princeton University Press, Princeton, NJ, 1975.
[19] Stuckenschmidt, H., and Zirn, C.: Multi-dimensional analysis of political documents. In: Natural Language Processing and Information Systems. Springer, 2012, 11–22.
[20] Taboada, M., Brooke, J., Tofiloski, M., Voll, K., and Stede, M.: Lexicon-based methods for sentiment analysis. Computational Linguistics 37 (2011), 267–307.
[21] Thelwall, M., Buckley, K., and Paltoglou, G.: Sentiment strength detection for the social web. Journal of the American Society for Information Science and Technology 63 (2012), 163–173.
[22] Yi, J., Nasukawa, T., and Niblack, W.: Sentiment analyser: Extracting sentiments about a given topic using natural language processing techniques. In: 3rd IEEE Conf. on Data Mining (ICDM '03), 2003, 423–434.
[23] Yu, J., Zha, Z. J., Wang, M., and Chua, T. S.: Aspect ranking: Identifying important product aspects from online consumer reviews. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT '11). June 2011, 1496–1505.
OCA Index: Results for the EU's member states
Vendula Beránková1

Abstract. Probably the most famous monetary union in the world is the Eurozone, which now consists of nineteen European countries. The other EU member states have the status of candidate countries. The aim of the article is to evaluate the appropriateness of the nine EU countries standing outside the Eurozone for membership according to the OCA index. This index, proposed by Bayoumi and Eichengreen [2], is based on the optimum currency area theory and its criteria. The paper utilises the traditional approaches of the optimum currency area theory. The OCA index includes variables that are important in deciding on a country's entry into the monetary union, and it is estimated using a panel regression method. First, the original equation of Bayoumi and Eichengreen [2], who estimated the OCA index first, is used. Then, the modified equation of Horváth and Komárek [7] is estimated. The sample covers the period from 2000 to 2014. According to the results, there are countries which are more appropriate for membership in the Eurozone because they reach satisfactory values of the OCA index. But there are also countries with worse values, and as such they are not appropriate candidates.

Keywords: Monetary Union, Optimum Currency Area, Eurozone, OCA Index.
JEL Classification: E42, E52
AMS Classification: 62M10
1 Introduction
The creation of a monetary union is nothing new in the world, nor in Europe. The reasons for the creation of such a union can be different – historical, economic or political. There are a few monetary unions in the world, such as the United States of America, the Eastern Caribbean Currency Area or the Central African Economic and Monetary Community. But these days, probably the most famous monetary union is the monetary integration within the European Union – the Eurozone, which now consists of nineteen European countries2. The other nine EU member states have the status of candidate countries for membership in the Eurozone. Assessing whether a country is a good candidate for membership in the monetary union is not as simple as it might seem at first glance. Undoubtedly, joining the monetary union brings the country certain benefits (removal of the transaction costs of exchange, removal of exchange rate risks, or intensification of competition and pressure on production quality, etc.). But on the other hand, membership in the monetary union also brings some disadvantages, and probably the biggest disadvantage is the loss of autonomous monetary policy – a member country loses the possibility to react to potential economic shocks. It is necessary that the benefits outweigh these disadvantages. The optimum currency area theory and its criteria are an appropriate instrument by which it is possible to assess the costs and benefits of membership in the monetary union. The aim of this article is to evaluate the appropriateness of nine selected EU countries for membership in the Eurozone according to the OCA index, which was created by Bayoumi and Eichengreen [2]. The OCA index is based on the optimum currency area theory and its criteria. The remainder of the paper is organized as follows. The next section deals with the theoretical basis of the paper; after that, methods and data are introduced. Then, the results are discussed. The conclusion is offered in the final section.
2 Optimum Currency Area
Optimum currency area is a relatively young economic theory dating back to the 1960s. In this period, the first landmark studies were written; these formed the basis of the optimum currency area theory and enabled its further development. Namely, it is the work of:
• Robert Mundell: A Theory of Optimum Currency Areas [12];
• Ronald McKinnon: Optimum Currency Areas [10];
• Peter Kenen: The Theory of Optimum Currency Areas: An Eclectic View [8]3.

1 VŠB-TU Ostrava, Faculty of Economics, Department of National Economy, Sokolská třída 33, Ostrava, Czech Republic, vendula.berankova@vsb.cz.
2 The Eurozone member states: Austria, Belgium, Cyprus, Estonia, Finland, France, Germany, Greece, Ireland, Italy, Latvia, Lithuania, Luxemburg, Malta, Netherlands, Portugal, Slovakia, Slovenia, Spain.
Based on these studies, three basic criteria for the optimum currency area were established. Concretely, these criteria are: the Mundell criterion of workforce mobility, the McKinnon criterion of openness of the economy, and the Kenen criterion of diversification of production. These three criteria are very often accompanied by the criterion of alignment of economic cycles. When examining the individual criteria, however, we run into a problem, because empirical testing of the fulfilment of the OCA criteria does not give a clear answer to the question whether the adoption of the single currency is advantageous or not. It is very difficult to assess what degree of fulfilment of the criteria, and what combination of their performance, is satisfactory and what is not. The optimum currency area index tries to overcome these problems (Hedija, [6]).
2.1 OCA Index
The OCA index assesses the costs and benefits of adopting the single currency. This index was first used by Tamim Bayoumi and Barry Eichengreen [2] in their article Ever Closer to Heaven? An Optimum Currency Area Index for European Countries. The authors wanted to develop a tool that would allow assessing, on the basis of the OCA theory, whether a country is a suitable candidate for adopting the single currency. Hedija [6] notes that the OCA index is constructed as a bilateral index which assesses the appropriateness of introducing a single currency in two countries. It is built on the optimum currency area theory and rests on the idea that a country's adoption of the single currency is the more suitable, the smaller the tendency of the two countries' nominal exchange rates to oscillate against each other. These mutual fluctuations in the nominal exchange rates of the currencies are examined in the index depending on the level of fulfilment of four criteria of the optimum currency area: 1. the alignment of business cycles; 2. the similarity of economic structure; 3. the interdependence of trade; 4. the size of the economy. It is clear that the OCA index is based on traditional approaches of the optimum currency area theory. It was originally compiled from a sample of 21 countries4 during the period 1983 – 1992 using a panel regression method. The equation of Bayoumi and Eichengreen [2], who estimated the OCA index, is defined as follows:
\[ SD(e_{ij}) = \beta_0 + \beta_1\, SD(\Delta y_i - \Delta y_j) + \beta_2\, DISSIM_{ij} + \beta_3\, TRADE_{ij} + \beta_4\, SIZE_{ij}, \qquad (1) \]
where SD(e_ij) is the standard deviation of the change in the logarithm of the bilateral exchange rate at the end of the year between countries i and j, SD(Δy_i − Δy_j) is the standard deviation of the difference in the logarithm of real GDP between countries i and j, DISSIM_ij is the sum of the absolute differences in the shares of agricultural, mineral, and manufacturing trade in total merchandise trade, TRADE_ij is the mean of the ratio of bilateral exports to domestic GDP for the two countries i and j, and SIZE_ij is the mean of the logarithm of the two GDPs measured in U.S. dollars. Horváth and Komárek [7] modify the equation of Bayoumi and Eichengreen when they calculate the OCA index – they use the variable OPEN_ij (openness of the economy) instead of the variable SIZE_ij (size of the economy). The reason was that Bayoumi and Eichengreen [2] found relatively little correlation between the size of the economy and trends in the transition to a fixed exchange rate regime. In contrast, the openness of the economy is one of the traditional optimum currency area criteria, and therefore Horváth and Komárek [7] include this variable in their regression model:
\[ SD(e_{ij}) = \beta_0 + \beta_1\, SD(\Delta y_i - \Delta y_j) + \beta_2\, DISSIM_{ij} + \beta_3\, TRADE_{ij} + \beta_4\, OPEN_{ij}, \qquad (2) \]
where SD(e_ij) is the standard deviation of the change in the logarithm of the bilateral exchange rate at the end of the year between countries i and j, SD(Δy_i − Δy_j) is the standard deviation of the difference in the logarithm of real GDP between countries i and j, DISSIM_ij is the sum of the absolute differences in the shares of agricultural, mineral, and manufacturing trade in total merchandise trade, TRADE_ij is the mean of the ratio of bilateral exports to domestic GDP for the two countries i and j, and OPEN_ij is the mean of the ratio of trade, i.e. exports and imports, to their GDP.

3 The issue of the optimum currency area is also addressed in a number of other, more recent studies. Their authors are e.g. Bayoumi and Eichengreen [2], Fidrmuc and Korhonen [5], Cincibuch and Vávra [3], Mongelli [13], Horváth and Komárek [7], Kučerová [9], Mink, Jacobs and Haan [11], Bachanová [1], Eichengreen [4], Skořepa [14] or Hedija [6].
4 European Union countries that were members at the time the article was published, without Luxembourg (i.e. Belgium, Denmark, Finland, France, Ireland, Italy, Netherlands, Portugal, Austria, Greece, the United Kingdom, Spain, Sweden), followed by Norway, Switzerland, USA, Canada, Australia, New Zealand and Japan.

Variable SD
The variable SD(Δy_i − Δy_j) is the standard deviation of the difference in the logarithm of real GDP of the two countries. The real GDP (in prices of the year 2005) is measured in U.S. dollars. It is computed from the differences of annual real GDP for each country:
\[ SD(\Delta y_i - \Delta y_j) = SD\!\left[ \ln\frac{RGDP_i(t)}{RGDP_i(t-1)} ;\, \ln\frac{RGDP_j(t)}{RGDP_j(t-1)} \right], \qquad (3) \]
where SD is the standard deviation, RGDPi(t) is the real GDP of country i in time t, RGDPi(t−1) is the real GDP of country i in time t−1, RGDPj(t) is the real GDP of country j in time t and RGDPj(t−1) is the real GDP of country j in time t−1.

Variable DISSIM
The variable DISSIM_ij is the sum of the absolute differences in the shares of individual components5 in total bilateral trade. It attains values from 0 to 2. The value 0 means the same structure of bilateral trade; the value 2 means that the commodity structure of the bilateral trade of the two countries is absolutely different. In this case, a lower value implies better conditions to adopt a common currency. The variable has the following specification:
\[ DISSIM_{ij} = \sum_{A=1}^{N} \left| \frac{XA_{ij}}{X_{ij}} - \frac{XA_{ji}}{X_{ji}} \right|, \qquad (4) \]
where XA is the export of each economic category, Xij is the total export from country i to country j and Xji is the export from country j to country i.

Variable TRADE
The variable TRADE_ij is the mean of the ratio of bilateral trade (import plus export) to the nominal GDP of countries i and j. This nominal GDP is measured in U.S. dollars. The variable TRADE_ij was computed as follows:
\[ TRADE_{ij} = MEAN\!\left[ \frac{X_{ij}}{GDP_i} ;\, \frac{X_{ji}}{GDP_j} \right], \qquad (5) \]
where Xij is the nominal export from country i to country j, Xji is the nominal export from country j to country i, GDPi is the nominal GDP of country i and GDPj is the nominal GDP of country j. A higher value of this variable means better conditions to adopt a common currency, because a common currency is more convenient for countries which have a higher level of bilateral trade.

Variable SIZE
The variable SIZE_ij represents the size of the economies. It is computed as the mean of the logarithm of the real GDP of countries i and j. Again, the real GDP (in prices of the year 2005) is measured in U.S. dollars.
\[ SIZE_{ij} = MEAN\!\left[ \ln RGDP_i ;\, \ln RGDP_j \right], \qquad (6) \]
where RGDPi is the real GDP of country i and RGDPj is the real GDP of country j.

Variable OPEN
The variable OPEN_ij measures the rate of openness of each economy. It is computed as the mean of the share of nominal trade (import plus export) in the nominal GDP (measured in U.S. dollars) of the two countries:
\[ OPEN_{ij} = MEAN\!\left[ \frac{X_i + M_i}{GDP_i} ;\, \frac{X_j + M_j}{GDP_j} \right], \qquad (7) \]

5 The variable DISSIM_ij is computed using the UN's Basic Economic Categories (BEC) classification, which distinguishes 7 economic categories: 1. Food and beverages; 2. Industrial supplies; 3. Fuels and lubricants; 4. Capital goods; 5. Transport equipment; 6. Consumption goods; 7. Others.
where Xi is total nominal export of country i, Mi is total nominal import of country i, GDPi is nominal GDP of country i, Xj is total nominal export of country j, Mj is total nominal import of country j, GDPj is nominal GDP of country j.
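As a hedged numerical sketch of how the variables in equations (3)–(7) can be assembled for a single country pair, the snippet below uses NumPy with entirely made-up placeholder figures; they are not the World Bank or UN Comtrade data used in the paper.

```python
import numpy as np

# Hypothetical annual data for one country pair (i, j); placeholders only
rgdp_i = np.array([100.0, 103.0, 105.0, 108.0, 110.0])   # real GDP, country i
rgdp_j = np.array([200.0, 203.0, 209.0, 214.0, 220.0])   # real GDP, country j
x_ij, x_ji = 8.0, 9.0            # bilateral nominal exports (latest year)
gdp_i, gdp_j = 110.0, 220.0      # nominal GDP (latest year)
x_i, m_i = 55.0, 50.0            # total nominal exports and imports, country i
x_j, m_j = 100.0, 95.0           # total nominal exports and imports, country j
share_i = np.array([0.10, 0.30, 0.05, 0.25, 0.10, 0.15, 0.05])   # BEC category shares, exports i -> j
share_j = np.array([0.20, 0.25, 0.05, 0.20, 0.10, 0.15, 0.05])   # BEC category shares, exports j -> i

growth_i = np.log(rgdp_i[1:] / rgdp_i[:-1])
growth_j = np.log(rgdp_j[1:] / rgdp_j[:-1])
sd_y = np.std(growth_i - growth_j, ddof=1)              # SD of the difference in log growth, eq. (3)
dissim = np.sum(np.abs(share_i - share_j))              # DISSIM_ij, eq. (4)
trade = np.mean([x_ij / gdp_i, x_ji / gdp_j])           # TRADE_ij, eq. (5)
size = np.mean([np.log(rgdp_i[-1]), np.log(rgdp_j[-1])])           # SIZE_ij, eq. (6)
open_ = np.mean([(x_i + m_i) / gdp_i, (x_j + m_j) / gdp_j])        # OPEN_ij, eq. (7)

print(sd_y, dissim, trade, size, open_)
```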
2.2 Data
Data for the variables SD(Δy_i − Δy_j), SIZE_ij and OPEN_ij are taken from the World Bank Database. The variables DISSIM_ij and TRADE_ij are calculated using data from the United Nations Commodity Trade Statistics Database and the World Bank Database. Data for the variable SD(e_ij) are taken from Eurostat. For all calculations, annual data from the period 2000 – 2014 are used.
3 Results
In this case, the OCA index is compiled from a sample of nine selected European countries during the period 2000 – 2014. These countries are: Bulgaria, Croatia, the Czech Republic, Denmark, Hungary, Poland, Romania, Sweden, and the United Kingdom. Again, the panel regression method is used. The estimated equations of the OCA index are as follows:
\[ OCA\ index = 0,016 + 0,474\, SD(\Delta y_i - \Delta y_j) + 0,031\, DISSIM_{ij} + 0,145\, TRADE_{ij} - 0,12\, SIZE_{ij}, \qquad (8) \]
with reported standard errors of the coefficient estimates 0,0518; 0,0045; 0,0731; 0,0113 and 0,2702; n = 126, R² = 0,515, SE = 0,022;
\[ OCA\ index = 0,014 + 0,397\, SD(\Delta y_i - \Delta y_j) + 0,031\, DISSIM_{ij} + 0,114\, TRADE_{ij} + 0,010\, OPEN_{ij}, \qquad (9) \]
with reported standard errors of the coefficient estimates 0,1659; 0,0087; 0,1555; 0,0736 and 0,6978; n = 126, R² = 0,507, SE = 0,022.
The equations for calculating the OCA index thus receive a specific form. The estimated values of the coefficients β express the sensitivity of the index to the explanatory variables. When calculating the OCA index, both equations are used. The index is computed from annual data from the period 2000 – 2014. The selected time series is divided into two periods: the pre-crisis period (2000 – 2007) and the post-crisis period (2008 – 2014). Then, the OCA index for the whole period 2000 – 2014 is computed. This allows an appreciation of the development of the OCA index over time. The lower the OCA index, the more suitable a candidate the country is for adopting the common currency6. The empirical results are presented in the following Table 1 and Table 2.

6 The variables are computed for all nine EU countries outside the Eurozone, but there is a problem, because some data are not available for the Eurozone. So, when calculating the variables DISSIM_ij and TRADE_ij, Germany is used instead of the Euro Area.

Country          2000 – 2007   2008 – 2014   2000 – 2014
Bulgaria         0,0476        0,0385        0,0433
Croatia          0,0401        0,0316        0,0361
Czech Republic   0,0412        0,0429        0,0417
Denmark          0,0346        0,0337        0,0342
Hungary          0,0440        0,0428        0,0435
Poland           0,0427        0,0443        0,0435
Romania          0,0496        0,0443        0,0471
Sweden           0,0376        0,0394        0,0385
United Kingdom   0,0327        0,0378        0,0351

Table 1 OCA Index based on equation (8), own calculations

Table 1 presents the results based on equation (8). However, the variable SIZE_ij is not included in the calculations, because this variable appears to be statistically insignificant. Therefore, the OCA index is in this case calculated only from the significant variables SD(Δy_i − Δy_j), DISSIM_ij and TRADE_ij. The best values are observed
in Croatia, Denmark, Sweden and the United Kingdom. Conversely, the worst value of the OCA index is found in Romania. It should also be noted that since 2008 the OCA index has significantly improved in Bulgaria, Croatia and also Romania. On the other hand, since 2008 the OCA index has got worse in the Czech Republic, Hungary, Poland, Sweden and the United Kingdom. Table 2 presents the results based on equation (9). The best values of the OCA index are found in the case of Croatia, Denmark and the United Kingdom. The worst values are observed in Bulgaria, Hungary and Romania. Since 2008, the OCA index has significantly improved in Bulgaria, Croatia and also Romania. On the other hand, the OCA index has got worse in the Czech Republic, Hungary, Poland, Sweden and the United Kingdom after the crisis.

Country          2000 – 2007   2008 – 2014   2000 – 2014
Bulgaria         0,0516        0,0451        0,0486
Croatia          0,0445        0,0365        0,0408
Czech Republic   0,0442        0,0477        0,0456
Denmark          0,0393        0,0395        0,0394
Hungary          0,0483        0,0492        0,0487
Poland           0,0455        0,0474        0,0464
Romania          0,0526        0,0478        0,0504
Sweden           0,0421        0,0444        0,0432
United Kingdom   0,0359        0,0411        0,0383

Table 2 OCA Index based on equation (9), own calculations

However, even the assessment of the appropriateness of adopting a common currency using the reconstructed index is not entirely clear-cut or straightforward. Moreover, Bayoumi and Eichengreen [2] find relatively little correlation between the size of the economy and trends in the transition to a fixed exchange rate regime. This problem was addressed by Horváth and Komárek [7], who use the variable OPEN_ij instead of the variable SIZE_ij, and their equation may therefore have better informative value.
4 Conclusion
Probably the most famous monetary union is the Eurozone, which now consists of nineteen European countries. Nine EU member states have the status of candidate countries. The aim of this article was to evaluate the appropriateness of these nine EU states for membership in the Eurozone according to the OCA index. The research was based on traditional approaches of the optimum currency area theory. The methodology of Bayoumi and Eichengreen [2], who first estimated the OCA index, was used; then the modified equation of Horváth and Komárek [7] was applied. The OCA index was computed for the period 2000 – 2014. Using the equation of Bayoumi and Eichengreen, the best values of the OCA index are observed in Croatia, Denmark, Sweden and the United Kingdom. Conversely, the worst value of the OCA index is found in Romania. Note that after the crisis, the OCA index has significantly improved in Bulgaria, Croatia and Romania. On the other hand, it has got worse in the Czech Republic, Hungary, Poland, Sweden and the United Kingdom since 2008. However, because Bayoumi and Eichengreen [2] find relatively little correlation between the size of the economy and trends in the transition to a fixed exchange rate regime, Horváth and Komárek [7] modified their equation. Using the equation of Horváth and Komárek, the best values of the OCA index are in Croatia, Denmark and the United Kingdom. The worst values are observed in Bulgaria, Hungary and Romania. Since 2008, the OCA index has significantly improved in Bulgaria, Croatia and Romania, while in the case of the Czech Republic, Hungary, Poland, Sweden and the United Kingdom it has got worse. According to these results it may be concluded that some countries are more appropriate for membership in the Eurozone, especially Croatia, Denmark and the United Kingdom. Conversely, Romania is not an appropriate candidate for membership in the Eurozone. These nine countries can be divided into three groups. The first group, the main candidates for membership in the Eurozone, consists of the United Kingdom, Denmark and Croatia; these countries reach a satisfactory degree of convergence. The second group consists of the Czech Republic, Poland and Sweden; these countries gradually converge to the Eurozone. And finally, the third group of countries, which has a small degree of convergence, includes Romania, Bulgaria and Hungary.
Of course, the candidate countries must fulfil all the convergence criteria if they want to join the Eurozone. But the OCA index can be used as a convenient additional instrument by which it is possible to determine the appropriateness of countries for the membership in the Eurozone.
Acknowledgements This work was supported by the VSB-Technical University of Ostrava, Faculty of Economics under Grant SGS SP2016/101.
References
[1] Bachanová, V.: Index optimální měnové oblasti pro Českou republiku. Ekonomická revue – Central European Review of Economic Issues 11 (2008), 42–57.
[2] Bayoumi, T., and Eichengreen, B.: Ever Closer to Heaven? An Optimum-Currency-Area Index for European Countries. European Economic Review 41 (1997), 761–770.
[3] Cincibuch, M., and Vávra, D.: Na cestě k EMU: Potřebujeme flexibilní měnový kurz? Finance a úvěr 50 (2000), 361–384.
[4] Eichengreen, B.: Sui Generis EMU. NBER Working Paper No. 13740. University of California, Berkeley, 2008.
[5] Fidrmuc, J., and Korhonen, I.: Similarity of Supply and Demand Shocks Between the Euro Area and the CEECs. Economic Systems 27 (2003), 313–334.
[6] Hedija, V.: Index OCA – aplikace na země EU10. Ekonomická revue – Central European Review of Economic Issues 14 (2011), 85–93.
[7] Horváth, R., and Komárek, L.: Teorie optimálních měnových zón: rámec k diskuzím o monetární integraci. Finance a úvěr 52 (2002), 386–407.
[8] Kenen, P. B.: The Theory of Optimum Currency Areas: An Eclectic View. In: Monetary Problems of the International Economy (Mundell, R. A., and Swoboda, A. K., eds.). University of Chicago Press, Chicago, 1969.
[9] Kučerová, Z.: Teorie optimální měnové oblasti a možnosti její aplikace na země střední a východní Evropy. Národohospodářský ústav Josefa Hlávky, Praha, 2005.
[10] McKinnon, R. I.: Optimum Currency Areas. American Economic Review 53 (1963), 717–725.
[11] Mink, M., Jacobs, J., and Haan, J.: Measuring Synchronicity and Co-movement of Business Cycles with an Application on the Euro Area. CESifo Working Paper No. 2112. University of Groningen, Groningen, 2007.
[12] Mundell, R. A.: A Theory of Optimum Currency Areas. American Economic Review 51 (1961), 657–665.
[13] Mongelli, F. P.: "New" Views on the Optimum Currency Area Theory: What is EMU Telling us? European Central Bank Working Paper No. 138. European Central Bank, Frankfurt, 2002.
[14] Skořepa, M.: A Convergence-Sensitive Optimum-Currency-Area Index. IES Working Paper 23/2011. Charles University, Prague, 2011.
Trimmed L-moments in Modeling Income Distribution
Diana Bílková1

Abstract. This paper deals with the application of a robust method of point parameter estimation, the method of TL-moments, to economic data. The advantages of this highly robust parametric estimation method become apparent when it is applied to small data sets, especially in the fields of hydrology, meteorology and climatology, in particular when considering extreme precipitation. The main aim of this contribution is to use this method on large datasets and to compare the accuracy of this method of parametric estimation with the accuracy of two other methods, the method of L-moments and the maximum likelihood method, especially in terms of the efficiency of the parametric estimation. The study is divided into a theoretical part, in which the mathematical and statistical aspects are described, and an analytical part, in which the results of the use of the three parametric estimation methods are presented. The basis for the research are slightly older data on household net annual income per capita (more recent data could not be obtained). A total of 168 income distributions from the years 1992 to 2007 in the Czech Republic (distributions of household net annual income per capita in CZK) were analyzed.

Keywords: L-moments and TL-moments of probability distribution, sample L-moments and TL-moments, probability density function, distribution function, quantile function, order statistics, models of income distribution.
JEL Classification: C13, C46
AMS Classification: 60E05
1 Introduction
An alternative, robust version of L-moments, see for example [1], [3]–[4] or [7]–[8], will now be presented. This robust modification of L-moments is called "trimmed L-moments" and labeled "TL-moments". Three-parametric lognormal curves, see [5]–[6], represent the basic theoretical distribution whose parameters were simultaneously estimated by three methods of point parameter estimation (the methods of TL-moments and L-moments and the maximum likelihood method in combination with Cohen's method), and the accuracy of these methods was then evaluated. TL-moments are a relatively new category of moment characteristics of probability distributions. They are characteristics of the level, variability, skewness and kurtosis of probability distributions constructed using TL-moments, which are a robust extension of L-moments. L-moments themselves were introduced as a robust alternative to the classical moments of probability distributions. However, L-moments and their estimates lack some of the robust properties that belong to TL-moments. The advantages of this highly robust parametric estimation method become apparent when it is applied to small data sets, especially in the fields of hydrology, meteorology and climatology. This paper shows the advantages of TL-moments when applied to large data sets. Sample TL-moments are linear combinations of sample order statistics which assign zero weight to a predetermined number of sample outliers. Sample TL-moments are unbiased estimates of the corresponding TL-moments of probability distributions. Some theoretical and practical aspects of TL-moments are still under research or remain for future research. The efficiency of TL-statistics depends on the choice of the trimming proportion; for example, the first sample TL-moments $l_1^{(0)}$, $l_1^{(1)}$, $l_1^{(2)}$ have the smallest variance (the highest efficiency) among other estimators from random samples from the normal, logistic and double exponential distribution. When constructing the TL-moments, the expected values of order statistics of a random sample in the definition of L-moments of probability distributions are replaced by the expected values of order statistics of a larger random sample, whose size grows so that it corresponds to the total extent of the trimming, as shown below.
1 University of Economics, Prague; Faculty of Informatics and Statistics; Department of Statistics and Probability; Sq. W. Churchill 1938/4; 130 67 Prague 3; Czech Republic; e-mail: diana.bilkova@vse.cz.
TL-moments have certain advantages over conventional L-moments and central moments. A TL-moment of a probability distribution may exist even if the corresponding L-moment or central moment does not exist, as is the case for the Cauchy distribution. Sample TL-moments are more resistant to outliers in the data. The method of TL-moments is not intended to replace the existing robust methods, but rather to supplement them, especially in situations where outliers are present in the data.
2 Theory and Methods
2.1 Trimmed L-moments of Probability Distribution
In this alternative robust modification of L-moments, the expected value E(X_{r-j:r}) is replaced by the expected value E(X_{r+t_1-j:r+t_1+t_2}). Thus, for each r we increase the size of the conceptual random sample from the original r to r + t_1 + t_2 and work only with the expected values of the r treated order statistics X_{t_1+1:r+t_1+t_2}, X_{t_1+2:r+t_1+t_2}, ..., X_{t_1+r:r+t_1+t_2}, obtained by trimming the t_1 smallest and the t_2 largest order statistics from the conceptual sample. This modification is called the r-th trimmed L-moment (TL-moment) and is denoted $\lambda_r^{(t_1,t_2)}$. Thus, the TL-moment of the r-th order of a random variable X is defined as
$$\lambda_r^{(t_1,t_2)} = \frac{1}{r} \sum_{j=0}^{r-1} (-1)^j \binom{r-1}{j}\, E(X_{r+t_1-j:\,r+t_1+t_2}), \quad r = 1, 2, \ldots . \tag{1}$$
It is apparent that the TL-moments reduce to L-moments when t_1 = t_2 = 0. Although we can also consider applications where the amounts of trimming are not equal, i.e. t_1 ≠ t_2, we focus here only on the symmetric case t_1 = t_2 = t. Then equation (1) can be rewritten as
$$\lambda_r^{(t)} = \frac{1}{r} \sum_{j=0}^{r-1} (-1)^j \binom{r-1}{j}\, E(X_{r+t-j:\,r+2t}), \quad r = 1, 2, \ldots . \tag{2}$$
Thus, for example, $\lambda_1^{(t)} = E(X_{1+t:\,1+2t})$ is the expected value of the median of a conceptual random sample of size 1 + 2t. It should be noted that $\lambda_1^{(t)}$ is equal to zero for distributions that are symmetric around zero. For t = 1, the first four TL-moments have the form
$$\lambda_1^{(1)} = E(X_{2:3}), \tag{3}$$
$$\lambda_2^{(1)} = \frac{1}{2}\, E(X_{3:4} - X_{2:4}), \tag{4}$$
$$\lambda_3^{(1)} = \frac{1}{3}\, E(X_{4:5} - 2X_{3:5} + X_{2:5}), \tag{5}$$
$$\lambda_4^{(1)} = \frac{1}{4}\, E(X_{5:6} - 3X_{4:6} + 3X_{3:6} - X_{2:6}). \tag{6}$$
Note that the measures of location (level), variability, skewness and kurtosis of the probability distribution are based on $\lambda_1^{(1)}, \lambda_2^{(1)}, \lambda_3^{(1)}$ and $\lambda_4^{(1)}$. The expected value E(X_{r:n}) can be written using the formula
$$E(X_{r:n}) = \frac{n!}{(r-1)!\,(n-r)!} \int_0^1 x(F)\, [F(x)]^{r-1}\, [1 - F(x)]^{n-r}\, \mathrm{d}F(x). \tag{7}$$
Using equation (7), we can re-express the right-hand side of equation (2) as
$$\lambda_r^{(t)} = \frac{1}{r} \sum_{j=0}^{r-1} (-1)^j \binom{r-1}{j}\, \frac{(r+2t)!}{(r+t-j-1)!\,(t+j)!} \int_0^1 x(F)\, [F(x)]^{r+t-j-1}\, [1 - F(x)]^{t+j}\, \mathrm{d}F(x), \quad r = 1, 2, \ldots . \tag{8}$$
It should be noted here that $\lambda_r^{(0)} = \lambda_r$ is the ordinary r-th L-moment without any trimming. Expressions (3)−(6) for the first four TL-moments with t = 1 can be written in an alternative manner as
$$\lambda_1^{(1)} = 6 \int_0^1 x(F)\, [F(x)]\, [1 - F(x)]\, \mathrm{d}F(x), \tag{9}$$
$$\lambda_2^{(1)} = 6 \int_0^1 x(F)\, [F(x)]\, [1 - F(x)]\, [2F(x) - 1]\, \mathrm{d}F(x), \tag{10}$$
$$\lambda_3^{(1)} = \frac{20}{3} \int_0^1 x(F)\, [F(x)]\, [1 - F(x)]\, \{5\,[F(x)]^2 - 5\,F(x) + 1\}\, \mathrm{d}F(x), \tag{11}$$
$$\lambda_4^{(1)} = \frac{15}{2} \int_0^1 x(F)\, [F(x)]\, [1 - F(x)]\, \{14\,[F(x)]^3 - 21\,[F(x)]^2 + 9\,F(x) - 1\}\, \mathrm{d}F(x). \tag{12}$$
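As a quick sanity check of expression (9), not part of the original text, take the standard uniform distribution on [0, 1], for which the quantile function is x(F) = F and the median of a conceptual sample of size three satisfies E(X_{2:3}) = 1/2:
$$\lambda_1^{(1)} = 6 \int_0^1 F\,[F]\,[1-F]\, \mathrm{d}F = 6 \int_0^1 (F^2 - F^3)\, \mathrm{d}F = 6\left(\frac{1}{3} - \frac{1}{4}\right) = \frac{1}{2} = E(X_{2:3}).$$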
A distribution may be identified by its TL-moments even though some of its L-moments or conventional central moments do not exist; for example, $\lambda_1^{(1)}$ (the expected value of the median of a conceptual random sample of size three) exists for the Cauchy distribution, although the first L-moment λ1 does not. TL-skewness $\tau_3^{(t)}$ and TL-kurtosis $\tau_4^{(t)}$ are defined analogously to L-skewness and L-kurtosis as
$$\tau_3^{(t)} = \frac{\lambda_3^{(t)}}{\lambda_2^{(t)}}, \tag{13}$$
$$\tau_4^{(t)} = \frac{\lambda_4^{(t)}}{\lambda_2^{(t)}}. \tag{14}$$
2.2 Sample Trimmed L-moments
Let x_1, x_2, …, x_n be a sample and $x_{1:n} \leq x_{2:n} \leq \ldots \leq x_{n:n}$ the ordered sample. The expression
$$\hat{E}(X_{j+1:\,j+l+1}) = \frac{1}{\binom{n}{j+l+1}} \sum_{i=1}^{n} \binom{i-1}{j} \binom{n-i}{l}\, x_{i:n} \tag{15}$$
is considered to be an unbiased estimate of the expected value of the (j + 1)-th order statistic X_{j+1:j+l+1} in a conceptual random sample of size (j + l + 1). Now assume that we replace the expression E(X_{r+t−j:r+2t}) in the definition (2) of the r-th TL-moment $\lambda_r^{(t)}$ by its unbiased estimate
$$\hat{E}(X_{r+t-j:\,r+2t}) = \frac{1}{\binom{n}{r+2t}} \sum_{i=1}^{n} \binom{i-1}{r+t-j-1} \binom{n-i}{t+j}\, x_{i:n}, \tag{16}$$
which is obtained by setting j → r + t − j − 1 and l → t + j in (15). We thus obtain the r-th sample TL-moment
$$l_r^{(t)} = \frac{1}{r} \sum_{j=0}^{r-1} (-1)^j \binom{r-1}{j}\, \hat{E}(X_{r+t-j:\,r+2t}), \quad r = 1, 2, \ldots, n - 2t, \tag{17}$$
i.e.
$$l_r^{(t)} = \frac{1}{r} \sum_{j=0}^{r-1} (-1)^j \binom{r-1}{j}\, \frac{1}{\binom{n}{r+2t}} \sum_{i=1}^{n} \binom{i-1}{r+t-j-1} \binom{n-i}{t+j}\, x_{i:n}, \quad r = 1, 2, \ldots, n - 2t, \tag{18}$$
which is an unbiased estimate of the r-th TL-moment $\lambda_r^{(t)}$. Note that for each j = 0, 1, …, r − 1, the values x_{i:n} in (18) receive nonzero weight only for r + t − j ≤ i ≤ n − t − j due to the combinatorial numbers. A simple adjustment of equation (18) provides an alternative linear form
$$l_r^{(t)} = \frac{1}{r} \sum_{i=t+1}^{n-t} \left[ \frac{\sum_{j=0}^{r-1} (-1)^j \binom{r-1}{j} \binom{i-1}{r+t-j-1} \binom{n-i}{t+j}}{\binom{n}{r+2t}} \right] x_{i:n}. \tag{19}$$
For example, for r = 1 we obtain the first sample TL-moment
$$l_1^{(t)} = \sum_{i=t+1}^{n-t} w_{i:n}^{(t)}\, x_{i:n}, \tag{20}$$
where the weights are given by
$$w_{i:n}^{(t)} = \frac{\binom{i-1}{t} \binom{n-i}{t}}{\binom{n}{2t+1}}. \tag{21}$$
The above results can be used to estimate TL-skewness and TL-kurtosis by simple ratios (t )
(t ) t3
l3 , (t ) l2
(t ) t4
l4 . (t ) l2
(22)
(t )
(23)
We can choose t = nα representing the amount of the adjustment from each end of the sample, where α is a certain proportion, where 0 ≤ α < 0,5. More on the TL-moments is for example in [2].
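The sample TL-moments in (19)−(23) are straightforward to compute. The following sketch (not part of the original paper; function names and the simulated data are illustrative only) evaluates the linear form (19) and the ratios (22)−(23) in Python:

```python
import numpy as np
from math import comb

def sample_tl_moment(x, r, t=1):
    """r-th sample TL-moment l_r^(t) via the linear form (19); the t smallest
    and t largest observations receive zero weight."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    denom = comb(n, r + 2 * t)
    total = 0.0
    for i in range(t + 1, n - t + 1):          # 1-based index of the order statistic
        w = sum((-1) ** j * comb(r - 1, j)
                * comb(i - 1, r + t - j - 1) * comb(n - i, t + j)
                for j in range(r))
        total += w / denom * x[i - 1]
    return total / r

def tl_skewness(x, t=1):                        # ratio (22)
    return sample_tl_moment(x, 3, t) / sample_tl_moment(x, 2, t)

def tl_kurtosis(x, t=1):                        # ratio (23)
    return sample_tl_moment(x, 4, t) / sample_tl_moment(x, 2, t)

# illustrative use on simulated lognormal "incomes"
rng = np.random.default_rng(0)
incomes = rng.lognormal(mean=12.0, sigma=0.5, size=500)
print(sample_tl_moment(incomes, 1), tl_skewness(incomes), tl_kurtosis(incomes))
```

These sample characteristics can then be matched against their theoretical counterparts of the three-parametric lognormal distribution to estimate its parameters.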
3 Results and Discussion
The researched variable is the net annual household income per capita (in CZK) within the Czech Republic (nominal income). The data come from the statistical survey Microcensus – years 1992, 1996, 2002 – and the statistical survey EU-SILC (The European Union Statistics on Income and Living Conditions) – period 2004-2007 – of the Czech Statistical Office. A total of 168 income distributions were analyzed in this way, both for all households of the Czech Republic together and broken down by gender, land (Bohemia and Moravia), social group, municipality size, age and the highest educational attainment, where households are classified into the subsets according to the head of the household. The method of TL-moments provided the most accurate results in almost all cases, with negligible exceptions. The method of L-moments came second in more than half of the cases, although its differences from the maximum likelihood method are not as distinct as the margin by which the method of TL-moments outperformed the method of L-moments. Table 1 is a typical representative of the results for all 168 income distributions. It provides the results for the total household sets in the Czech Republic and contains the values of the well-known χ2 test criterion. It is evident from the values of the criterion that the method of L-moments gave more accurate results than the maximum likelihood method in four of the seven cases, while the most accurate results were obtained using the method of TL-moments in all seven cases. Figures 1–3 allow a comparison of these methods in terms of the model probability density functions in selected years (1992, 2004 and 2007) for the total set of households of the whole Czech Republic. It should be noted that, for better legibility, the vertical axis in Figure 1 uses a different scale than in Figures 2 and 3. It is clear from these three figures that the methods of TL-moments and L-moments give very similar results, while the probability density function with parameters estimated by the maximum likelihood method differs considerably from the model probability density functions constructed using the methods of TL-moments and L-moments. Figures 4–6 only confirm this statement.
Figure 1 Model probability density functions of three-parametric lognormal curves in 1992 with parameters estimated using three various robust methods of point parameter estimation. Source: Own research
Figure 4 Development of probability density function of three-parameter lognormal curves with parameters estimated using the method of TL-moments. Source: Own research
Figure 2 Model probability density functions of three-parametric lognormal curves in 2004 with parameters estimated using three various robust methods of point parameter estimation. Source: Own research
Figure 5 Development of probability density function of three-parameter lognormal curves with parameters estimated using the method of L-moments. Source: Own research
Figure 3 Model probability density functions of three-parametric lognormal curves in 2007 with parameters estimated using three various robust methods of point parameter estimation. Source: Own research
Figure 6 Development of probability density function of three-parameter lognormal curves with parameters estimated using the maximum likelihood method. Source: Own research
Survey        Year    TL-moments    L-moments    Maximum likelihood
Microcensus   1992       739.512      811.007             1,227.325
Microcensus   1996     1,503.878    1,742.631             2,197.251
Microcensus   2002       998.325    1,535.557             1,060.891
EU-SILC       2004       494.441      866.279               524.478
EU-SILC       2005       731.225      899.245               995.855
EU-SILC       2006       831.667      959.902             1,067.789
EU-SILC       2007     1,050.105    1,220.478             1,199.035
Table 1 Value of χ2 criterion of parameter estimations of three-parametric lognormal curves obtained using three various robust methods of point parameter estimation. Source: Own research
Figures 7−9 then present the model relative frequencies (in %) of employees by the band of net annual household income per capita in 2007, obtained using three-parametric lognormal curves with parameters estimated by the method of TL-moments, the method of L-moments and the maximum likelihood method. These figures also allow some comparison of the accuracy of the researched methods of point parameter estimation against Figure 10, which shows the actually observed relative frequencies in the individual sample income intervals.
4 Conclusion
A relatively new class of moment characteristics of probability distributions was introduced here. These are characteristics of the location (level), variability, skewness and kurtosis of probability distributions constructed using TL-moments. The accuracy of the method of TL-moments was compared with that of the method of L-moments and the maximum likelihood method. The higher accuracy of the former approach in comparison with the latter two methods has been proved on 168 income distribution data sets. The advantages of L-moments over the maximum likelihood method have been demonstrated by the present study as well.
Figure 7 Method of TL-moments. Source: Own research
Figure 8 Method of L-moments. Source: Own research
Figure 9 Maximum likelihood method. Source: Own research
Figure 10 Sample ratios of employees. Source: Own research
Acknowledgements This paper was processed with the contribution of the long-term institutional support of research activities at the Faculty of Informatics and Statistics, University of Economics, Prague (IP 400040).
References
[1] Adamowski, K.: Regional Analysis of Annual Maximum and Partial Duration Flood Data by Nonparametric and L-moment Methods. Journal of Hydrology 229 (2000), 219−231.
[2] Elamir, E. A. H., and Seheult, A. H.: Trimmed L-moments. Computational Statistics & Data Analysis 43 (2003), 299–314.
[3] Hosking, J. R. M.: L-moments: Analysis and Estimation of Distributions Using Linear Combinations of Order Statistics. Journal of the Royal Statistical Society (Series B) 52 (1990), 105–124.
[4] Hosking, J. R. M., and Wallis, J. R.: Regional Frequency Analysis: An Approach Based on L-moments. Cambridge University Press, Cambridge, 1997.
[5] Johnson, N. L., Kotz, S., and Balakrishnan, N.: Continuous Univariate Distributions. Wiley-Interscience, New York, 1994.
[6] Kleiber, C., and Kotz, S.: Statistical Size Distributions in Economics and Actuarial Sciences. Wiley-Interscience, New York, 2003.
[7] Kyselý, J., and Picek, J.: Regional Growth Curves and Improved Design Value Estimates of Extreme Precipitation Events in the Czech Republic. Climate Research 33 (2007), 243−255.
[8] Ulrych, T. J., Velis, D. R., Woodbury, A. D., and Sacchi, M. D.: L-moments and C-moments. Stochastic Environmental Research and Risk Assessment 14 (2000), 50–68.
Selection of the bank and investment products via stochastic multicriteria evaluation method
Adam Borovička1
Abstract. The main aim of the article is to choose suitable bank products and investment instruments for a (partial) financial old-age security. For this purpose, a multicriteria evaluation of the selected alternatives is performed. Three main criteria are reflected – return, risk and cost. Return (or another criterion) can be described as a random variable with some probability distribution which is determined via a statistical test. For the evaluation of alternatives, a stochastic multicriteria evaluation method is proposed. It is based on a combination of the principles of the ELECTRE I and III methods. Further, the algorithms are modified and improved (not only) in order to accept input data of a stochastic character. In the practical part, bank products (bank deposit, building savings and supplementary pension insurance) and investment products (open unit trusts) of Česká spořitelna are chosen. These alternatives are evaluated by the proposed multicriteria approach for two types of strategies – risk-averse and risk-seeking. The results are analyzed and compared.
Keywords: financial old-age security, multicriteria evaluation, stochastic return.
JEL Classification: C44, G11
AMS Classification: 90B50, 62C86
1 Introduction
Every human being should think about financial old-age security. The pension is often insufficient to cover all costs in this period of life. To eliminate such a problem, people usually try to save some money during their working age. They have a few possibilities how to do it: they can choose from various bank and investment products. But the question is which products are the most suitable and, of course, how to choose them. Answering these questions is the main aim of this article. Imagine the following real situation. You are a long-term client of Česká spořitelna and you want to use selected products of this company in order to save free financial resources for the future pension age. Two types of products are available – bank products (bank deposit, building savings, supplementary pension insurance) and investment products (open unit trusts). To make a complex decision, I recommend evaluating the products by more than one criterion. Three main criteria are reflected – return, risk and cost. The value of return (or another criterion) can be very variable in time. Such a criterion can then be expressed as a random variable with some probability distribution. For the selection of suitable products, a stochastic multicriteria evaluation method is proposed. This method uses some fragments of the principles of the ELECTRE I and III methods. Some principles are adopted, some are modified and improved to solve a real financial situation precisely. The proposed method is complex. It can make a quantitative analysis from more than one angle. Moreover, the importance of the criteria can differ, so a decision can be made for various strategies. Further, the input information can have a stochastic character. These facts are the advantages of the proposed method compared to other concepts (e.g. fundamental or technical analysis). In the practical situation, two strategies are differentiated – risk-averse and risk-seeking. For both types of clients of Česká spořitelna, the proposed method is applied to choose the suitable products. The results are analyzed and compared. The article has the following structure. After the Introduction (Section 1), the proposed stochastic multicriteria evaluation method is described in Section 2. The original parts of the algorithm are described in more detail, and a brief overview of the current concepts is also provided. In Section 3, two real decision making situations about financial old-age security are introduced. Section 4 summarizes the most important aspects of this article and provides some possible ideas for future research.
2 Stochastic multicriteria evaluation method
We know quite a number of multicriteria evaluation methods. One group of methods is based on the evaluation principle of a utility function – WSA [5], AHP [11], ANP [12] and others. I think that the most important drawback of these methods can be just the utility function, whose construction can be difficult for a decision maker; it depends on the particular method how the utility function is constructed. Other methods use the concept of minimizing the distance from the ideal solution for the evaluation of alternatives. Maybe the best known method from this group is TOPSIS [6]. These methods differ in the metric measuring the distance from the ideal solution. The methods of both mentioned groups usually provide a ranking of alternatives. But in my real decision making situation the ranking is not primarily required. Further, we know methods working with aspiration levels, for example the conjunctive and disjunctive methods [6]. The aspiration levels must be specified by a decision maker, which could be difficult for him/her. The next big group contains the methods using the concept of a preference scoring relation. The best known are AGREPREF [8], ELECTRE I [9] or ELECTRE III [10]. Most methods from this group require information in the form of threshold values from a decision maker, and the determination of these values could be a problem for a decision maker. Obviously, other multicriteria evaluation methods exist as well. Of course, the basic methods mentioned above were variously modified and improved (normalization techniques, importance of criteria, fuzzification, stochastic input data etc.). As briefly indicated above, I see some drawbacks in the current approaches. Moreover, in order to solve the particular real situation the original method sometimes cannot be applied. Then the current methods must be modified, or a new concept must be proposed. This is also my case. Firstly, in my practical application some data are variable in time, some are invariable, so the multicriteria evaluation method must accept both forms of input data. Most methods do not enable this combination. Secondly, it is best if no additional information (threshold values, specification of a utility function) is required from a decision maker because it could be very problematic for him/her. Further, the algorithm should take into account the differences in criteria values to make a representative analysis of alternatives. Many approaches do not do that (e.g. ELECTRE III, the conjunctive or disjunctive method). Finally, for easy applicability the method should be user-friendly and the algorithm should be comprehensible for a decision maker. To fulfill all the mentioned basic demands, a new stochastic multicriteria evaluation method is proposed in this paper.
1 University of Economics, Department of Econometrics, W. Churchill Sq. 4, 130 67 Prague 3, adam.borovicka@vse.cz.
2.1 Algorithm of the proposed method
The algorithm of the proposed method can be briefly described in the following several steps:
Step 1: We have n alternatives and k criteria. The evaluation of the alternatives by each criterion is specified. This evaluation can be in a strict (deterministic) form or variable in time. In the latter case the evaluation is formulated as a random variable with some probability distribution. The probability distribution (with concrete parameters) is determined via a suitable statistical test (Chi-Square, Anderson-Darling or Kolmogorov-Smirnov test). Then m random numbers are generated from each probability distribution, so that m scenarios are specified. The matrix of all criteria values for the l-th scenario $Y^l = (y_{ij}^l)$ is obtained, where $y_{ij}^l$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, k;\ l = 1, 2, \ldots, m)$ is the evaluation of the i-th alternative by the j-th criterion. The importance of the j-th criterion is specified in a strict form as a weight $v_j$ $(j = 1, 2, \ldots, k)$, where the conditions $v_j \geq 0$ and $\sum_j v_j = 1$ hold.
Step 2: For each scenario the criteria values are compared. As in ELECTRE III, the following two sets of criteria indices are specified:
$$I_{iPj}^l = \left\{ r \cup s \mid y_{ir}^l \geq y_{jr}^l,\ y_{is}^l \leq y_{js}^l;\ r \in I^{\max},\ s \in I^{\min} \right\}, \quad i, j = 1, 2, \ldots, n,\ i \neq j;\ l = 1, 2, \ldots, m,$$
$$I_{jPi}^l = \left\{ r \cup s \mid y_{jr}^l \geq y_{ir}^l,\ y_{js}^l \leq y_{is}^l;\ r \in I^{\max},\ s \in I^{\min} \right\}, \quad i, j = 1, 2, \ldots, n,\ i \neq j;\ l = 1, 2, \ldots, m,$$
where the set $I^{\max}$, or $I^{\min}$, contains the indices of the maximizing, or minimizing, criteria.
Step 3: The matrices $S^l$ and $R^l$ are determined for the l-th scenario. Their elements express the grades of preference. The matrix $S^l$ is defined as in the ELECTRE III technique. The concept of the matrix $R^l$ is derived from the ELECTRE I method; however, it takes into account the criteria importance and works with standardized criteria values. The element of the matrix $R^l$ is formulated for each couple of alternatives i and j in the l-th scenario as
$$r_{ij}^l = \frac{\sum_{h \in I_{iPj}^l} v_h \left| {}^t y_{ih}^l - {}^t y_{jh}^l \right|}{\sum_{h=1}^{k} v_h \left| {}^t y_{ih}^l - {}^t y_{jh}^l \right|}, \quad i \neq j, \qquad r_{ij}^l = 0 \ \text{otherwise},$$
where ${}^t y_{ih}^l$, or ${}^t y_{jh}^l$ $(i, j = 1, 2, \ldots, n;\ h = 1, 2, \ldots, k;\ l = 1, 2, \ldots, m)$, is the standardized (normalized) criteria value. The standardized values are computed as
$${}^t y_{ij}^l = \frac{y_{ij}^l}{H_j^l}, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, k;\ l = 1, 2, \ldots, m,$$
where $H_j^l = \max_i (y_{ij}^l)$ $(j = 1, 2, \ldots, k;\ l = 1, 2, \ldots, m)$. For this formulation, all values must be positive; if they are not, an appropriate constant must be added.
Step 4: The aggregate preference of the i-th alternative over the j-th alternative in the l-th scenario is set by the proposed rule
$$s_{ij}^l \geq s_{ji}^l \ \wedge \ r_{ij}^l \geq r_{ji}^l.$$
The rule is a modification of the ELECTRE III technique; the thresholds are eliminated. Then for the i-th alternative in the l-th scenario the following indicator is specified:
$$p_i^l = d_i^{l+} - d_i^{l-}, \quad i = 1, 2, \ldots, n;\ l = 1, 2, \ldots, m,$$
where $d_i^{l+}$, or $d_i^{l-}$, is the number of alternatives over which the i-th alternative is preferred, or the number of alternatives that are preferred over the i-th alternative, in the l-th scenario. Now the set of subeffective alternatives is proposed as
$$E^l = \left\{ a_i;\ p_i^l = \max_i (p_i^l) \right\}, \quad l = 1, 2, \ldots, m,$$
where $a_i$ denotes the i-th alternative. It is possible to say that these alternatives are effective for the l-th scenario. The concept of an effective alternative for the l-th scenario is inspired by the ELECTRE I approach. However, it is not quite suitable (namely for a low number of scenarios) because an effective alternative does not have to exist in the set of alternatives. To eliminate this drawback, some ideas of the ELECTRE III method are used for the specification of the effective alternative. Finally, the effective alternatives are selected over all scenarios. The effective alternative is the alternative that is most often evaluated as subeffective. In addition, according to the decreasing value of this indicator (how many times an alternative is indicated as subeffective), the alternatives can be ranked.
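To make Steps 2−4 concrete, the following sketch (not from the original paper; the concordance index used as a stand-in for the matrix S is only a common ELECTRE-style approximation, and all names are illustrative) evaluates a single scenario and returns the subeffective alternatives:

```python
import numpy as np

def subeffective(Y, weights, maximize):
    """One-scenario core of Steps 2-4 (simplified sketch).
    Y: (n, k) criteria matrix, weights: (k,) weights summing to one,
    maximize: length-k boolean mask, True for maximizing criteria."""
    Y = np.asarray(Y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    maximize = np.asarray(maximize, dtype=bool)
    n = Y.shape[0]
    Z = np.where(maximize, Y, -Y)                  # larger is now always better
    T = Y / Y.max(axis=0)                          # standardized values (assumes Y > 0)
    S = np.zeros((n, n))
    R = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            better = Z[i] >= Z[j]                  # criteria where i is at least as good as j
            S[i, j] = weights[better].sum()        # ELECTRE-style concordance, stand-in for S^l
            diff = weights * np.abs(T[i] - T[j])
            R[i, j] = diff[better].sum() / diff.sum() if diff.sum() > 0 else 0.0
    pref = (S >= S.T) & (R >= R.T)                 # aggregate preference rule of Step 4
    np.fill_diagonal(pref, False)
    p = pref.sum(axis=1) - pref.sum(axis=0)        # d+ minus d-
    return np.flatnonzero(p == p.max())            # indices of subeffective alternatives
```

Repeating this evaluation over the m generated scenarios and counting how often each alternative appears in the returned set gives the frequency indicator used to select and rank the effective alternatives.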
3 Selection of suitable product(s) for financial old-age security
In this part of the article, a very common situation of real life is described. Most people of working age think about financial old-age security, and this concern is particularly strong in the Czech Republic. The pension system in the Czech Republic is not stable, the rules are still changing, and the retirement age keeps being extended. Many people therefore think about how to save some money for future use during their pension age. They have a few options. Many people are clients of Česká spořitelna. So let us focus on the case of long-term clients who have a current account in Česká spořitelna. It is natural that such a person asks the "home" bank for advice on what to do in the described situation. A worker of the bank offers two types of products – bank and investment products. The bank products are the bank deposit, building savings and supplementary pension insurance. The investment products are the open unit trusts. This investment instrument is also suitable for a smaller investor; clients can invest in stocks or bonds of big companies around the world via these funds.
3.1 Criteria and strategies
To make the decision as complex as possible, several characteristics of the bank and investment instruments are taken into account. Three main criteria are specified – return, risk and cost. All selected instruments are evaluated according to these criteria, and a worker of the bank then presents this information to the client. Other characteristics could also be reflected, namely in the case of the open unit trusts: for instance, the currency in which the allotment certificates are traded, the locality of the capital market in which the open unit trusts operate, the approach of the fund management, or the mood in the capital market. Another criterion could be the liquidity of the saved money. None of these criteria is explicitly included, but they can be taken into account when selecting the bank or investment instruments that are offered to the client. The selected instruments are then analyzed from the perspective of the three most important criteria within the multicriteria evaluation process.
In order to cover a wider range of real decision making cases, two client strategies are specified – risk-averse and risk-seeking. A risk-averse client strongly fears a loss of the investment. He/she is willing to undergo only a low level of risk that the saved money will lose part of its value. He/she wants to have a certain confidence that the money will be available in his/her pension age. Therefore he/she is willing to sacrifice some part of the return for a lower risk. The risk criterion is the most important characteristic, while the cost connected with the bank or investment products does not play a fundamental role in the decision making process. A risk-seeking client is more oriented to the return. He/she is willing to undergo a greater risk for a greater level of return. His/her goal is to valorize the spared money as much as possible even under a greater level of risk. In contrast to the risk-averse client, the return is a more important criterion than the risk, and the cost connected with the investment is therefore a little more important for a risk-seeking client than for a risk-averse client. The client should specify his/her preferences about the importance of these criteria. It is obvious that the preferences are based on the client's strategy. The preferences can be quantitatively expressed on some scale: the client assigns points from, e.g., the interval [0, 10] to each criterion, where 0 means the smallest (actually zero) importance and, conversely, 10 means the highest importance. To calculate the weights of the criteria required by the proposed multicriteria evaluation method, the scoring method is applied [5]. This method is very simple to apply; it just uses the quantitative information about the importance of the criteria. According to the two client strategies mentioned above, the points are assigned to the chosen three criteria. The following table shows the assigned points as well as the weights of the criteria for both types of clients (Table 1).
                Risk-averse client      Risk-seeking client
Criterion       Points     Weight       Points     Weight
Return               7      0.368           10      0.526
Risk                10      0.526            6      0.316
Cost                 2      0.105            3      0.158
Table 1 Points and weights of criteria for both types of clients
As expected, the criterion risk has the greatest weight for a risk-averse client. On the contrary, for the risk-seeking strategy the greatest weight is obtained by the criterion return. The weight of cost for a risk-seeking client is a little greater than for a risk-averse client, which was also expected.
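The scoring method behind Table 1 is just a normalization of the assigned points; a minimal illustrative sketch (not part of the original paper) follows:

```python
def scoring_weights(points):
    """Scoring method: normalize the assigned points to weights summing to one."""
    total = sum(points.values())
    return {criterion: p / total for criterion, p in points.items()}

print(scoring_weights({"return": 7, "risk": 10, "cost": 2}))   # risk-averse client
print(scoring_weights({"return": 10, "risk": 6, "cost": 3}))   # risk-seeking client
```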
3.2 Data
We have at our disposal 3 bank products (bank deposit, building savings, supplementary pension insurance) and 8 open unit trusts (5 bond and 3 stock funds). The bond open unit trusts are High Yield dluhopisový, Korporátní dluhopisový, Sporobond, Sporoinvest and Trendbond. The stock open unit trusts are Global Stocks, Sporotrend and Top Stocks. The bank products are offered by Česká spořitelna and the open unit trusts are managed and offered by the Česká spořitelna investment company. All instruments are evaluated by the selected criteria. The value of return is deterministic for the bank products; it is on an annual basis and expressed in percentages. The return of the bank deposit is its current value, which has historically been stable. The return of the building savings has two components: one part is a contribution from the state and the second part is the interest on the deposit. The return of the building savings reflects the actual conditions specified by Česká spořitelna. The return of the supplementary pension insurance is also composed of two parts: one component is the state contribution and the second component is the return of the Česká spořitelna pension fund. The return of the pension fund is the average annual return over the last 20 years; it is computed as a strict value because of the lower number of observations. The return of the open unit trusts is variable in time. The return of each fund is therefore described as a random variable with some continuous probability distribution. The distribution is determined on the basis of the non-parametric statistical Anderson-Darling test [1]. The monthly returns from the period 2010-2015 are used. This period is chosen because the development in the capital market was calmer, so it can better describe the long-term development, which is important for the practical application. The generated returns from the particular distributions are converted to the annual basis. The risk connected with the bank deposit is considered as zero because the interest rate is stable in time. The risk of the building savings is also zero; the conditions are given by the appropriate contract, so that a change of the return during at least 6 years is not possible. The risk of the supplementary pension insurance is connected with a possible deviation of the interest rate on the deposit. It is calculated as the average absolute negative deviation of returns. This concept is proposed in [2]. The risk of the open unit trusts is also calculated as the average absolute negative deviation of their returns. The cost connected with the bank deposit reflects the tax on interest; there are no other costs. The same holds for the supplementary pension insurance. The cost connected with the building savings includes the charge for account management, the conclusion of
a contract and the tax on interest. The cost connected with the open unit trusts represents various charges, namely the initial charge and the management fee. The costs are also calculated in percentage form, as are the other two criteria. For the building savings, the most profitable alternative is expected, which means that the monthly deposited amount is 1700 CZK. The criteria values for the supplementary pension insurance are calculated for the most profitable situation, when the monthly deposited amount is 1000 CZK. It is obvious that the return of the building savings and the supplementary pension insurance depends on the amount of invested money. In our analysis the best alternative is sought for a person who can periodically put aside a lower amount of money; in the case of the building savings and the supplementary pension insurance he/she accepts the mentioned amounts, and the sums inserted in the other products will be similar. If he/she wanted to save more money, another product could be chosen via the additional procedure mentioned below. Table 2 shows the evaluation of each product by each criterion. All data are expressed in percentage form. According to the statistical test, the return of the open unit trusts can be described as a random variable with the logistic probability distribution. Both parameters of the distribution for each fund are also given in the following table.

Product                            Return                     Risk    Cost
Bank deposit                       0.4                        0       0.06
Building savings                   6                          0       2.84
Supplementary pension insurance    26.62                      0.21    0.54
High Yield dluhopisový             Logistic (6.52, 15.35)     2.19    2.49
Korporátní dluhopisový             Logistic (4.42, 9.76)      1.23    2.75
Sporobond                          Logistic (5.15, 5.99)      0.69    2.20
Sporoinvest                        Logistic (0.82, 0.79)      0.18    1.05
Trendbond                          Logistic (3.77, 10.79)     1.37    2.77
Global Stocks                      Logistic (11.29, 21.66)    2.21    6.33
Sporotrend                         Logistic (-4.77, 44.05)    5.36    5.43
Top Stocks                         Logistic (18.7, 42.76)     3.93    5.82
Table 2 Return, risk and cost of bank and investment products (in percentages)
The market prices and costs of the open unit trusts are taken from [7]. The needed information on the bank products is found in [3], and the historical returns of the Česká spořitelna pension fund are taken from [4]. On the basis of this information, the criteria values are calculated.
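For illustration only (not from the original paper), scenario returns can be drawn from the fitted logistic distributions of Table 2, and the risk can be computed as an average absolute negative deviation. The exact definition of that deviation follows [2], so the version below (deviations below the mean return) is just one plausible reading:

```python
import numpy as np

rng = np.random.default_rng(2016)

# Logistic return distributions (location, scale) as reported in Table 2
funds = {
    "High Yield dluhopisovy": (6.52, 15.35),
    "Korporatni dluhopisovy": (4.42, 9.76),
    "Sporobond": (5.15, 5.99),
    "Sporoinvest": (0.82, 0.79),
    "Trendbond": (3.77, 10.79),
    "Global Stocks": (11.29, 21.66),
    "Sporotrend": (-4.77, 44.05),
    "Top Stocks": (18.7, 42.76),
}

m = 50  # number of scenarios used in the paper
scenarios = {name: rng.logistic(loc, scale, size=m) for name, (loc, scale) in funds.items()}

def avg_abs_negative_deviation(returns):
    """Average absolute deviation of the returns falling below their mean."""
    r = np.asarray(returns, dtype=float)
    dev = r - r.mean()
    neg = dev[dev < 0]
    return float(np.abs(neg).mean()) if neg.size else 0.0
```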
3.3 Results
Now the suitable products are selected for both types of strategies via the proposed multicriteria evaluation method. For representative results, 50 scenarios of the returns of the open unit trusts are made; thus, 50 values from the particular probability distribution of each open unit trust are generated. The following table shows how many times each product is classified as a subeffective alternative for both types of clients (Table 3).

Risk-averse client                              Risk-seeking client
Product                            Frequency    Product                            Frequency
Building savings                          36    Supplementary pension insurance           24
Bank deposit                              25    Top Stocks                                19
Supplementary pension insurance           14    Global Stocks                             10
High Yield dluhopisový                     0    Sporotrend                                 6
Korporátní dluhopisový                     0    Trendbond                                  3
Sporobond                                  0    High Yield dluhopisový                     2
Sporoinvest                                0    Sporobond                                  2
Trendbond                                  0    Korporátní dluhopisový                     0
Global Stocks                              0    Sporoinvest                                0
Sporotrend                                 0    Bank deposit                               0
Top Stocks                                 0    Building savings                           0
Table 3 Frequency of incidence of each product as a subeffective alternative
For a risk-averse investor, the effective alternative is the bank product building savings. None of the open unit trusts is specified as a subeffective alternative because their risk is too great. The building savings is the effective alternative thanks to zero risk and a greater level of return than the bank deposit. The worst position among the bank products belongs to the supplementary pension insurance thanks to its nonzero risk. A better position of this product cannot be achieved even by its return, by far the greatest among the bank products, because the return has a smaller weight than the risk. For a risk-seeking investor, the effective alternative is the supplementary pension insurance. But, as expected, the stock open unit trusts compete with it thanks to a possibly greater return. The decisive factor is the stable return of the supplementary pension insurance compared to the stock open unit trusts. Because the return is the most important criterion for a risk-seeking investor, the other open unit trusts with a lower level of return had no chance of being an effective alternative. The same holds for the bank deposit and the building savings in spite of their zero level of risk. According to the proposed multicriteria analysis, a decision maker can see which products are "good" and which are "bad". In case of need, a ranking of the alternatives can also be made by the frequency of incidence of the alternatives as a subeffective alternative. An appropriate portfolio of more products can be composed by other methods.
4 Conclusion
The article deals with the selection of suitable products for financial old-age security. Two strategies of saving free financial resources during the working age are specified. For both cases the suitable products of Česká spořitelna are chosen via the proposed stochastic multicriteria evaluation method, whose algorithm is described. The results of both strategies are analyzed and compared. The proposed method provides a complex quantitative multicriteria analysis. All stated criteria have a quantitative character. But some qualitative characteristics should also be explicitly taken into account in the analysis (e.g. the mood in the capital market of the open unit trusts, some feeling or intuition of a decision maker etc.). Then they must be transformed to a quantitative form. One possible way would be via fuzzy sets (numbers), so the proposed method would be fuzzified. It would then become even more comprehensive for real decision making situations.
Acknowledgements The research project was supported by Grant No. IGA F4/54/2015 of the Internal Grant Agency, Faculty of Informatics and Statistics, University of Economics, Prague.
References
[1] Anderson, T. W., Darling, D. A.: Asymptotic theory of certain "goodness-of-fit" criteria based on stochastic processes. Annals of Mathematical Statistics 23 (1952), 193–212.
[2] Borovička, A.: Vytváření investičního portfolia podílových fondů pomocí fuzzy metod vícekriteriálního rozhodování. Ph.D. Thesis. University of Economics, Prague, 2015.
[3] Česká spořitelna – products and services [online], available at: http://www.csas.cz/banka/nav/osobnifinance/produkty-a-sluzby-d00019523, [cit. 01-03-2016].
[4] Development of pension funds returns [online], available at: http://www.penzfondy.cz/, [cit. 01-03-2016].
[5] Fiala, P.: Modely a metody rozhodování. Oeconomica, Praha, 2013.
[6] Hwang, C. L., Yoon, K.: Multiple Attribute Decision Making – Methods and Applications, A State-of-the-Art Survey. Springer-Verlag, New York, 1981.
[7] Investment centrum of Česká spořitelna [online], available at: https://cz.products.erstegroup.com/Retail/, [cit. 01-03-2016].
[8] Lagrèze, E. J.: La modélisation des préférences: Préordres, quasiordres et relations floues. Ph.D. Thesis, Université de Paris V, 1975.
[9] Roy, B.: Classement et choix en présence de points de vue multiples (la méthode ELECTRE). La Revue d'Informatique et de Recherche Opérationelle (RIRO) 2 (1968), 57–75.
[10] Roy, B.: ELECTRE III: algorithme de classement basé sur une représentation floue des préférences en présence des critères multiples. Cahiers du CERO 20 (1978), 3–24.
[11] Saaty, T. L.: The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation. McGraw-Hill, New York, 1980.
[12] Saaty, T. L.: Decision Making with Dependence and Feedback: The Analytic Network Process. RWS Publication, New York, 1996.
A chance constrained investment problem with portfolio variance and skewness criteria - solution technique based on the Successive Iterative Regularization
Martin Branda1
Abstract. We deal with an investment problem where the variance of a portfolio is minimized and at the same time its skewness is maximized. Moreover, we impose a chance (probabilistic) constraint on the portfolio return, which must be fulfilled with a high probability. This leads to a difficult nonconvex multiobjective stochastic programming problem. Under discretely distributed returns, this problem can be solved using the CCP-SIR solver (Chance Constrained Problems: Successive Iterative Regularization), which has been recently introduced by Adam and Branda [1]. This algorithm relies on a relaxed nonlinear programming problem and its regularized version obtained by enlarging the set of feasible solutions using regularizing functions. Both formulations as well as the solution technique are discussed in detail. We report the results for a real-life portfolio problem of a small investor. We compare the CCP-SIR solver with BONMIN applied to the deterministic mixed-integer reformulation.
Keywords: Investment problem, chance constraints, skewness, successive iterative regularization
JEL classification: C44
AMS classification: 90C15
1 Introduction
Markowitz [14] was the first who considered moment conditions on a portfolio composed of various assets available on a stock market. In particular, he maximized the expected return of a portfolio and at the same time minimized the variance of the portfolio return. The variance served here to quantify the risk. The optimal portfolios then correspond to the efficient frontier in the mean-variance space. Skewness together with the mean and variance criteria was considered in models for assessing portfolio efficiency, see, e.g., Joro and Na [12], Briec et al. [9]. Kerstens et al. [13] derived a geometric representation of the mean–variance–skewness portfolio frontier, and Briec et al. [10] generalized the well-known one fund theorem. Alternatively to the higher order moment criteria, many researchers focused on axiomatic definitions of risk measure classes with desirable properties, e.g. convex risk measures [11], coherent risk measures [3], and general deviation measures [16]. The resulting models can be supported by the relations to the utility functions and stochastic dominance efficiency, see [6, 7, 8]. In this paper, we focus on a multiobjective portfolio problem with the variance and skewness criteria. The set of feasible portfolios is defined using a chance (probabilistic) constraint which imposes a high probability on the events when the portfolio return exceeds a given value. Due to the presence of the nonconvex cubic term in the objective function, we were not able to use algorithms which rely on convexity of the underlying problem. Therefore we select an alternative approach which has been recently introduced by Adam and Branda [1]. They proposed an algorithm based on a relaxed problem which is a nonlinear
1 Charles University in Prague, Faculty of Mathematics and Physics, Department of Probability and Mathematical Statistics, Sokolovská 83, Prague 186 75, Czech Republic, tel.: +420 221 913 404, fax: +420 222 323 316, branda@karlin.mff.cuni.cz
programming problem, and its regularized version that is obtained by enlarging the set of feasible solutions using regularizing functions. The resulting CCP-SIR solver1 (Chance Constrained Problems: Successive Iterative Regularization) is then able to obtain a stationary point of the original problem. We apply the CCP-SIR algorithm to the investment problem in the empirical part and compare its performance with the BONMIN [5] solver available in GAMS modelling system. The paper is organized as follows. In Section 2, we propose the notation and the variance-skewness model with a chance constraint on portfolio return. In Section 3, we formulate the relaxed problem and derive the optimality conditions. We also discuss the CCP-SIR algorithm which is then employed in the empirical study in Section 4. Section 5 concludes the paper.
2 Problem formulation
In this section, we formulate the multiobjective stochastic portfolio optimization problem with minimization of the portfolio variance and maximization of the skewness. The chance constraint provides a probabilistic lower bound for random portfolio rate of return. We consider n assets with random rates of return denoted by Rj , j = 1, . . . , n, with E|Rj |3 < ∞ and define the corresponding covariance matrix C and skewness tensor S elementwise as Cjk := E(Rj − ERj )(Rk − ERk ),
Sjkl := E(Rj − ERj)(Rk − ERk)(Rl − ERl).
If we employ the aggregate function approach of multiobjective optimization with aggregation parameter c > 0, we obtain
$$\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{n}\sum_{k=1}^{n} C_{jk}\, x_j x_k - c \sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} S_{jkl}\, x_j x_k x_l \\
\text{subject to} \quad & P\left( \sum_{j=1}^{n} R_j x_j \geq b \right) \geq 1 - \varepsilon, \\
& \sum_{j=1}^{n} x_j = 1, \\
& lb_j \leq x_j \leq ub_j,
\end{aligned} \tag{1}$$
where b is a minimal rate of return acceptable with probability 1 − ε. We allow various restrictions on the portfolio weights using lb_j, ub_j. In particular, if lb_j < 0, then short selling of the asset j is allowed.
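A short illustrative sketch (not from the paper; names are arbitrary) of how the moment tensors and the aggregate objective of (1) can be evaluated from equally weighted scenario returns:

```python
import numpy as np

def moment_tensors(R):
    """Covariance matrix C and skewness tensor S from scenario returns R of shape (S, n),
    treating all scenarios as equally probable."""
    D = R - R.mean(axis=0)                         # deviations from the mean returns
    C = D.T @ D / R.shape[0]
    S = np.einsum('sj,sk,sl->jkl', D, D, D) / R.shape[0]
    return C, S

def aggregate_objective(x, C, S, c=0.1):
    """Portfolio variance minus c times portfolio skewness, the objective of (1)."""
    return x @ C @ x - c * np.einsum('jkl,j,k,l->', S, x, x, x)
```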
3 Optimality conditions and algorithm
We consider discretely distributed returns with scenarios R_{ji} (index j identifies an asset, i a scenario) and general probabilities $p_i > 0$, $\sum_{i=1}^{S} p_i = 1$. Such a distribution can be obtained as the observations of historical returns or by simulation from a multivariate continuous distribution. Standard approaches use mixed-integer programming reformulations, cf. [2, 15]. However, the employed algorithm relies on the
1 Matlab source codes are freely available at http://staff.utia.cas.cz/adam/research.html
relaxed problem that can be formulated as a nonlinear programming problem:
$$\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{n}\sum_{k=1}^{n} C_{jk}\, x_j x_k - c \sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} S_{jkl}\, x_j x_k x_l \\
\text{subject to} \quad & \sum_{i=1}^{S} p_i y_i \geq 1 - \varepsilon, \\
& y_i \left( b - \sum_{j=1}^{n} R_{ji} x_j \right) \leq 0, \quad i = 1, \ldots, S, \\
& \sum_{j=1}^{n} x_j = 1, \quad lb_j \leq x_j \leq ub_j, \\
& 0 \leq y_i \leq 1.
\end{aligned} \tag{2}$$
According to [1, Theorem 2.2], the optimality condition for the problem at a point x can be stated as:
$$\begin{aligned}
& 2 \sum_{k=1}^{n} C_{jk}\, x_k - 3c \sum_{k=1}^{n}\sum_{l=1}^{n} S_{jkl}\, x_k x_l - \sum_{i \in I_0(x)} \lambda_i R_{ji} + \mu + \mu_j = 0, \quad j = 1, \ldots, n, \\
& \lambda_i = 0, \quad i \in I_{00}(x, y), \\
& \lambda_i \geq 0, \quad i \in I_{0+}(x, y) \cup I_{01}(x, y), \\
& \mu_j \leq 0, \quad j \in J^-(x), \\
& \mu_j \geq 0, \quad j \in J^+(x), \qquad \mu_j = 0 \ \text{otherwise}, \qquad \mu \in \mathbb{R},
\end{aligned} \tag{3}$$
where the index sets are defined as follows
$$\begin{aligned}
J^-(x) &:= \{ j : x_j = lb_j \}, \qquad J^+(x) := \{ j : x_j = ub_j \}, \\
I_0(x) &:= \Big\{ i : \sum_{j=1}^{n} R_{ji} x_j = b \Big\}, \\
I_{00}(x, y) &:= \{ i \in I_0(x) : y_i = 0 \}, \\
I_{0+}(x, y) &:= \{ i \in I_0(x) : y_i \in (0, 1) \}, \\
I_{01}(x, y) &:= \{ i \in I_0(x) : y_i = 1 \}.
\end{aligned} \tag{4}$$
The algorithm is based on a regularizing function imposed on the random constraints, which enlarges the set of feasible solutions. We assume that $\varphi_t : \mathbb{R} \to \mathbb{R}$ are continuously differentiable decreasing functions which depend on a parameter t > 0 and which satisfy the following properties:
$$\varphi_t(z) > 0 \ \text{for } z \in \mathbb{R}, \qquad \varphi_t(0) = 1, \tag{5}$$
$$\varphi_t(z^t) \xrightarrow[t \to \infty]{} 0 \quad \text{whenever } z^t \to \bar{z} > 0, \tag{6}$$
$$\frac{\varphi_t'(z^t)}{\varphi_t'(\tilde{z}^t)} \to 0 \quad \text{whenever } \varphi_t(z^t) \searrow 0 \ \text{and } \varphi_t(\tilde{z}^t) \to \bar{z} > 0. \tag{7}$$
We will consider $\varphi_t(z) = e^{-tz}$ as a regularizing function. Using these functions, we obtain the following regularized problem:
$$\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{n}\sum_{k=1}^{n} C_{jk}\, x_j x_k - c \sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} S_{jkl}\, x_j x_k x_l \\
\text{subject to} \quad & \sum_{i=1}^{S} p_i y_i \geq 1 - \varepsilon, \\
& \varphi_t\left( b - \sum_{j=1}^{n} R_{ji} x_j \right) \geq y_i, \quad i = 1, \ldots, S, \\
& \sum_{j=1}^{n} x_j = 1, \quad lb_j \leq x_j \leq ub_j, \\
& 0 \leq y_i \leq 1.
\end{aligned} \tag{8}$$
The Successive Iterative Regularization algorithm then iteratively increases the parameter t, leading to a tighter approximation of the feasible set. The problem is solved by a nonlinear programming solver and the obtained solution is then used as a starting point for the next iteration. It can be shown that the sequence of solutions converges to a stationary point of the relaxed problem (2). If the obtained point is not a stationary point of the original chance constrained problem, an additional local search procedure can be performed to find such a point, see [1] for details. Finally, note that problem (1) can also be reformulated using additional binary variables z_i and a large constant M:
$$\begin{aligned}
\text{minimize} \quad & \sum_{j=1}^{n}\sum_{k=1}^{n} C_{jk}\, x_j x_k - c \sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} S_{jkl}\, x_j x_k x_l \\
\text{subject to} \quad & \sum_{i=1}^{S} p_i z_i \geq 1 - \varepsilon, \\
& b - \sum_{j=1}^{n} R_{ji} x_j \leq M (1 - z_i), \quad i = 1, \ldots, S, \\
& \sum_{j=1}^{n} x_j = 1, \quad lb_j \leq x_j \leq ub_j, \quad z_i \in \{0, 1\}.
\end{aligned} \tag{9}$$
This leads to a nonlinear mixed-integer programming problem which will be employed in the following section.
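As an illustration only (function and variable names are not from the paper), the empirical chance constraint and the regularizing surrogate used in (8) can be evaluated for a candidate portfolio as follows:

```python
import numpy as np

def chance_constraint_slack(x, R, p, b, eps):
    """Slack of the chance constraint P(R x >= b) >= 1 - eps under discrete scenarios.
    R: (S, n) scenario returns, p: (S,) scenario probabilities, x: (n,) weights."""
    satisfied = R @ x >= b
    return p[satisfied].sum() - (1.0 - eps)        # nonnegative means feasible

def regularized_upper_bound(x, R, b, t):
    """phi_t(z) = exp(-t z) evaluated at z_i = b - R_i x; in problem (8) this
    bounds y_i from above and tightens towards the indicator as t grows."""
    return np.exp(-t * (b - R @ x))
```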
4 Numerical study
We consider the following n = 8 US assets observed monthly from January 2011 to December 2015: Boeing (BA), Coca-Cola (KO), JPMorgan Chase (JPM), Oracle (ORCL), Microsoft (MSFT), Nike (NKE), Intel (INTC), Apple (AAPL). Since the problem is highly demanding, we have chosen S = 100 scenarios. The other parameters of the problem were set as follows: minimal acceptable returns b ∈ {−4, −2, 0, 2}, probabilistic level ε = 0.1, and aggregation (risk aversion) parameter c = 0.1. We employed the multivariate skew normal distribution, cf. [4], for modelling the random rates of return. We emphasize that this distribution is able to fit the skewness of the real data. For each value of b, we have rerun the whole optimization 10 times. In [1], we have also discussed the influence of the choice of M in problem (9) on the algorithm performance. It is highly desirable to choose this parameter as tight as possible. Since the maximal random value of R_{ji} was approximately 35% and since |x_j| ≤ 10, we have decided to choose M = 500. As the maximal constraint violation over all generated solutions was 80, this M proved to be a very reasonable bound. We compare three possible methods for the problem solution. The first one is based on the regularization and employs the CCP-SIR algorithm. The second one uses the solver BONMIN [5] available in the modelling system GAMS for solving the deterministic equivalent formulation, which corresponds to
a mixed-integer nonlinear programming problem. The last method uses their combination and employs the result of BONMIN as a starting point for the CCP-SIR algorithm. For the regularized problem, the average solution time was approximately 4.5 seconds while we stopped BONMIN after 10 minutes. Thus, the solutions obtained by BONMIN were always suboptimal.
Figure 1 Comparison of different algorithms for problem (1): log of the objective value over different runs
The results of these three solvers can be seen in Figure 1. On the horizontal axis, there are different sets of simulated return realizations R_{ji} and different values of b. The first 10 values correspond to b = −4, the next 10 to b = −2, the next 10 to b = 0 and, finally, the last 5 (some infeasible problems were omitted) to b = 2. Since the mean value of R_{ji} was 5%, for small b the feasible set was huge and the relaxed problem contained many local minima. Because the regularized method uses only first order information (and no globalization in the form of cuts), BONMIN performed better than the regularized method in this case. However, with increasing b, the regularized method started to prevail. This can be seen especially for the last five values with b = 2. Moreover, in almost all cases, when we used the regularized method on the BONMIN solution, we were able to improve the solution significantly. We can conclude that the employed methods are able to solve the problems in a very good way. Moreover, if they for some reason fail, they are still able to quickly improve suboptimal solutions obtained by other solvers.
5 Conclusions
We have proposed a portfolio optimization problem with the variance–skewness objective and a chance constraint on the portfolio return. Under the discrete distribution of returns, we have relaxed the problem and provided the optimality conditions. Algorithmic issues have been discussed based on the regularized problem obtained by enlarging the set of feasible solutions using regularizing functions. The good performance of the algorithm has been demonstrated using the real data and compared with the solver BONMIN available in GAMS. The combination of CCP-SIR with BONMIN seems to be also promising.
Acknowledgements This work was supported by the Czech Science Foundation under the Grant P402/12/G097 (DYME – Dynamic Models in Economics).
References [1] Adam, L., Branda, M.: Nonlinear chance constrained problems: optimality conditions, regularization and solvers, Journal of Optimization Theory and Applications (2016), to appear. DOI: 10.1007/s10957-016-0943-9 [2] Adam, L., Branda, M.: Sparse optimization for inverse problems in atmospheric modelling. Environmental Modelling & Software 79 (2016), 256–266. [3] Artzner, P., Delbaen, F., Eber, J.-M., Heath, D.: Coherent measures of risk. Mathematical Finance 9 (1999), 203–228. [4] Azzalini, A., Dalla Valle, A.: The multivariate skew-normal distribution. Biometrika 83 (4) (1996), 715–726. [5] Bonami, P., Biegler, L.T., Conn, A.R., Cornuejols, G., Grossmann, I.E., Laird, C.D., Lee, J., Lodi, A., Margot, F., Sawaya, N., and Waechter, A.: An algorithmic framework for convex mixed integer nonlinear programs, Discrete Optimization 5 (2) (2008), 186–204. [6] Branda, M.: Diversification-consistent data envelopment analysis based on directional-distance measures. Omega 52 (2015), 65–76. [7] Branda, M., Kopa, M.: On relations between DEA-risk models and stochastic dominance efficiency tests. Central European Journal of Operations Research 22 (1) (2014), 13–35. [8] Branda, M., Kopa, M.: DEA models equivalent to general N-th order stochastic dominance efficiency tests. Operations Research Letters 44 (2) (2016), 285–289. [9] Briec, W., Kerstens, K., Jokung, O.: Mean-variance-skewness portfolio performance gauging: a general shortage function and dual approach. Management Science 53 (2007), 135–149. [10] Briec, W., Kerstens, K., Van de Woestyne, I.: Portfolio selection with skewness: a comparison of methods and a generalized one fund result. European Journal of Operational Research 230 (2) (2013), 412–421. [11] Follmer, H., Schied, A.: Stochastic Finance: An Introduction In Discrete Time. Walter de Gruyter, Berlin, 2002. [12] Joro, T., Na, P.: Portfolio performance evaluation in a mean–variance–skewness framework. European Journal of Operational Research 175 (2006), 446–461. [13] Kerstens, K., Mounir, A., Van de Woestyne, I.: Geometric representation of the mean–variance– skewness portfolio frontier based upon the shortage function. European Journal of Operational Research 210 (1) (2011), 81–94. [14] Markowitz, H.M.: Portfolio selection. Journal of Finance 7 (1) (1952), 77–91. [15] Raike, W.M.: Dissection methods for solutions in chance constrained programming problems under discrete distributions. Management Science 16 (11) (1970), 708–715. [16] Rockafellar, R.T., Uryasev, S., Zabarankin, M.: Generalized deviations in risk analysis. Finance and Stochastics 10 (2006), 51–74.
Post-crisis development in the agricultural sector in the CR – Evidence by Malmquist index and its decomposition
Helena Brožová1, Ivana Boháčková2
Abstract. This contribution uses Data Envelopment Analysis models, the Malmquist index and its decomposition to analyse the efficiency of the agricultural sector in the regions of the Czech Republic and its development from the year 2008 to 2013. The regions are characterized by three inputs (agricultural land, annual work units and gross fixed capital formation) and one output (gross value added). The Malmquist index shows a significant increase of efficiency in 2011, although in the other years a rather negative development or no change can be seen. The same pattern can be seen for the technological change. The pure and scale efficiency changes are similar to each other and both behave differently from the Malmquist index and the technological change: pure and scale efficiency increase significantly between the years 2008 and 2009 and afterwards decrease slowly or remain unchanged.
Keywords: Crisis, agricultural sector, efficiency, CCR model, BCC model, Malmquist index decomposition.
JEL Classification: C44, C61, D24
AMS Classification: 90B50, 90C05, 90C90
1 Introduction
The productivity and efficiency of the agricultural sector have been studied for a long time. The first studies on this topic appeared already in the 1940s and 1950s. Starting from the analyses of Clark [6] and Bhattacharjee [3], a number of studies tried to analyse the differences in the agricultural productivity of the analysed units and its development (for instance, Yu et al. [14], Trueblood and Coggins [13]). Data Envelopment Analysis (DEA) is the most widely used of the non-parametric methods. The contribution of inputs to outputs is evaluated using the DEA models first developed by Charnes, Cooper and Rhodes [5] and Banker, Charnes and Cooper [2], which applied Farrel's approach (Farrel [9]) to the measurement of productivity efficiency. The dynamic context of the relation of inputs and outputs, the growth of productivity efficiency, was studied by Malmquist [11]. Measuring productivity change and decomposing the Malmquist index (MI) into its sources is important, because enhancement of productivity growth requires knowledge of the relative importance of its sources. In this regard the Malmquist productivity index (MI) is most frequently used and decomposed into the technical efficiency change index and the technological change index; further, these indices can be decomposed again (Yu et al. [14], Lovell [10], Ray [12], Zbranek [15]). The objective of this paper is to show the relative technical efficiency of the agricultural sector in the Czech regions and its changes in the crisis period 2008-2013. The efficiency progress according to the Malmquist index will be compared with the changes of selected indices obtained by the Malmquist index decomposition. The paper is organized as follows. Section 2 briefly defines the basic DEA models, the Malmquist index and its decomposition. Section 3 contains an analysis of the efficiency of the agricultural sector of the Czech regions and the changes of the elements of the MI decomposition in the years 2008-2013. Section 4 summarizes the findings.
2
Malmquist index and its decomposition
The Malmquist index is usually applied to the measurement of productivity change over time, and can be multiplicatively decomposed into a technical efficiency change index and a technological change index. Its calculation can be based on decision-making units (DMU) efficiency according to the CCR and BCC models combining inputs and outputs from different time periods.
1
Czech University of Life Sciences, Faculty of Economics, Department of Systems Engineering, 165 00 Praha 6 – Suchdol, e-mail: brozova@pef.czu.cz. 2 Czech University of Life Sciences, Faculty of Economics, Department of Economics, 165 00 Praha 6 – Suchdol, e-mail: bohackiv@pef.czu.cz.
The CCR model (named after Charnes, Cooper and Rhodes [5]) evaluates the efficiency of a set of DMUs which convert multiple inputs into multiple outputs under the constant returns to scale (CRS) assumption. The outputs and inputs can be of various characteristics and take a variety of forms that may also be difficult to measure. If variable returns to scale (VRS) are assumed, the so-called BCC model (named after Banker, Charnes and Cooper [2]) should be used. DEA models with super-efficiency (Andersen, Petersen [1]) can be used to distinguish between the efficient DMUs; in this type of DEA model the DMU under evaluation is not included in the reference set of the original DEA model.

Model CCR

The input-oriented CCR DEA model (Charnes et al. [5]) measuring the efficiency of a DMU in time t is obtained as the maximum of the ratio of weighted outputs to weighted inputs under the constraints that this ratio is less than or equal to 1 for all DMUs, the efficiency of an efficient DMU being equal to 1. The efficiency \Phi_H^t explicitly shows the necessary decrease of inputs, with the same amount of outputs, of a non-efficient DMU H in time t. The linearization of this model is

\Phi_H^{t}(x^{t}, y^{t}) = \sum_{j=1}^{n} u_{jH}\, y_{jH}^{t} \rightarrow \max

subject to
\sum_{i=1}^{m} v_{iH}\, x_{iH}^{t} = 1,
-\sum_{i=1}^{m} v_{iH}\, x_{ik}^{t} + \sum_{j=1}^{n} u_{jH}\, y_{jk}^{t} \le 0, \quad k = 1, 2, \dots, p,
u_{jH} \ge 0,\ j = 1, 2, \dots, n, \qquad v_{iH} \ge 0,\ i = 1, 2, \dots, m.    (1)
where u_{jH} and v_{iH} are the weights of outputs and inputs, y_{jk}^{t} is the amount of the j-th output from unit k, and x_{ik}^{t} is the amount of the i-th input to the k-th unit in time t; H is the index of the evaluated DMU (this notation is used throughout the whole paper). In the output orientation of this model, a minimization model is used for all DMUs. The efficiency of all DMUs is then greater than or equal to 1, the efficiency of an efficient DMU being equal to 1; it shows the necessary increase of outputs with the same amount of inputs of a non-efficient DMU. For the decomposition of the MI, we have to compute the following models with different combinations of inputs and outputs from different time periods (Zbranek [15]). We need to compute the efficiency of the DMU inputs and outputs in time t+1 under the conditions of time t. For this we use the model

\Phi_H^{t}(x^{t+1}, y^{t+1}) = \sum_{j=1}^{n} u_{jH}\, y_{jH}^{t+1} \rightarrow \max

subject to
\sum_{i=1}^{m} v_{iH}\, x_{iH}^{t+1} = 1,
-\sum_{i=1}^{m} v_{iH}\, x_{ik}^{t} + \sum_{j=1}^{n} u_{jH}\, y_{jk}^{t} \le 0, \quad k = 1, 2, \dots, p,
u_{jH} \ge 0,\ j = 1, 2, \dots, n, \qquad v_{iH} \ge 0,\ i = 1, 2, \dots, m.    (2)
The efficiency of the DMU inputs and outputs in time t under the conditions of time t+1 is calculated using the model

\Phi_H^{t+1}(x^{t}, y^{t}) = \sum_{j=1}^{n} u_{jH}\, y_{jH}^{t} \rightarrow \max

subject to
\sum_{i=1}^{m} v_{iH}\, x_{iH}^{t} = 1,
-\sum_{i=1}^{m} v_{iH}\, x_{ik}^{t+1} + \sum_{j=1}^{n} u_{jH}\, y_{jk}^{t+1} \le 0, \quad k = 1, 2, \dots, p,
u_{jH} \ge 0,\ j = 1, 2, \dots, n, \qquad v_{iH} \ge 0,\ i = 1, 2, \dots, m.    (3)
The last two models are intended for the calculation of efficiency when inputs and outputs from different times are combined with production conditions from different times. We use the following models: inputs in time t+1, outputs in time t, and production conditions in time t,

\Phi_H^{t}(x^{t+1}, y^{t}) = \sum_{j=1}^{n} u_{jH}\, y_{jH}^{t} \rightarrow \max

subject to
\sum_{i=1}^{m} v_{iH}\, x_{iH}^{t+1} = 1,
-\sum_{i=1}^{m} v_{iH}\, x_{ik}^{t} + \sum_{j=1}^{n} u_{jH}\, y_{jk}^{t} \le 0, \quad k = 1, 2, \dots, p,
u_{jH} \ge 0,\ j = 1, 2, \dots, n, \qquad v_{iH} \ge 0,\ i = 1, 2, \dots, m,    (4)
and inputs in time t+1, outputs in time t, and production conditions in time t+1,

\Phi_H^{t+1}(x^{t+1}, y^{t}) = \sum_{j=1}^{n} u_{jH}\, y_{jH}^{t} \rightarrow \max

subject to
\sum_{i=1}^{m} v_{iH}\, x_{iH}^{t+1} = 1,
-\sum_{i=1}^{m} v_{iH}\, x_{ik}^{t+1} + \sum_{j=1}^{n} u_{jH}\, y_{jk}^{t+1} \le 0, \quad k = 1, 2, \dots, p,
u_{jH} \ge 0,\ j = 1, 2, \dots, n, \qquad v_{iH} \ge 0,\ i = 1, 2, \dots, m.    (5)
Model BCC

The BCC DEA model (Banker et al. [2]) assumes variable returns to scale. The input-oriented BCC model uses the following linearization of the original DEA model:

\Theta_H^{t}(x^{t}, y^{t}) = \sum_{j=1}^{n} u_{jH}\, y_{jH}^{t} + q_H \rightarrow \max

subject to
\sum_{i=1}^{m} v_{iH}\, x_{iH}^{t} = 1,
-\sum_{i=1}^{m} v_{iH}\, x_{ik}^{t} + \sum_{j=1}^{n} u_{jH}\, y_{jk}^{t} + q_H \le 0, \quad k = 1, 2, \dots, p,
u_{jH} \ge 0,\ j = 1, 2, \dots, n, \qquad v_{iH} \ge 0,\ i = 1, 2, \dots, m, \qquad q_H \ \text{without limitation},    (6)
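The linearized program (1), and analogously models (2)-(6), is an ordinary linear program in the weights u_{jH} and v_{iH}, so any LP solver can be used. The following Python sketch is only an illustration of model (1) on made-up data (three inputs, one output, four DMUs), not the computation used for the results below; it assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, h):
    """Input-oriented CCR efficiency of DMU h; X: (m, p) inputs, Y: (n, p) outputs."""
    m, p = X.shape
    n, _ = Y.shape
    # Variables: [v_1..v_m, u_1..u_n]; maximize sum_j u_j * y_{jh} (linprog minimizes)
    c = np.concatenate([np.zeros(m), -Y[:, h]])
    # Normalization: sum_i v_i * x_{ih} = 1
    A_eq = np.concatenate([X[:, h], np.zeros(n)]).reshape(1, -1)
    b_eq = np.array([1.0])
    # For every DMU k: -sum_i v_i x_{ik} + sum_j u_j y_{jk} <= 0
    A_ub = np.hstack([-X.T, Y.T])
    b_ub = np.zeros(p)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (m + n), method="highs")
    return -res.fun   # efficiency in (0, 1]

# Hypothetical example: land, AWU and capital as inputs, GVA as output, 4 DMUs
X = np.array([[400., 390., 25., 960.],
              [13000., 12900., 2300., 26000.],
              [2400., 1700., 300., 5000.]])
Y = np.array([[5800., 4900., 1600., 11800.]])
print([round(ccr_efficiency(X, Y, h), 3) for h in range(4)])
```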
where \Theta_H^{t} is the efficiency of DMU H, u_{jH} and v_{iH} are the weights of outputs and inputs, y_{jk}^{t} is the amount of the j-th output from unit k, x_{ik}^{t} is the amount of the i-th input to the k-th unit in time t, H is the index of the evaluated DMU, and q_H is a variable allowing variable returns to scale.

Malmquist index

The Malmquist productivity index was introduced into the literature by Caves, Christensen, and Diewert [4]. This index quantifies the change in the multi-factor productivity of a DMU under the CRS assumption at two different points in time. A decade later, Färe, Grosskopf, Lindgren, and Roos [7] identified technological change and technical efficiency change as two distinct components of productivity change. Subsequently, Färe, Grosskopf, Norris, and Zhang [8] relaxed the CRS assumption, allowing VRS at different points on the production frontier, and offered a new decomposition of the Malmquist productivity index incorporating a new component representing the contribution of changes in scale efficiency. The Malmquist index measures the efficiency change between two time periods by the geometric mean of the ratios of the efficiency of each data point relative to a common technological frontier in the first and second period:

M_H(x^{t+1}, y^{t+1}, x^{t}, y^{t}) = \left[ \frac{\Phi_H^{t}(x^{t+1}, y^{t+1})}{\Phi_H^{t}(x^{t}, y^{t})} \times \frac{\Phi_H^{t+1}(x^{t+1}, y^{t+1})}{\Phi_H^{t+1}(x^{t}, y^{t})} \right]^{1/2}    (7)
where x_{ik} is the amount of the i-th input to the k-th unit and H is the index of the evaluated DMU. The Malmquist index is often decomposed into two factors:
M_H(x^{t+1}, y^{t+1}, x^{t}, y^{t}) = E_H \, T_H = \frac{\Phi_H^{t+1}(x^{t+1}, y^{t+1})}{\Phi_H^{t}(x^{t}, y^{t})} \times \left[ \frac{\Phi_H^{t}(x^{t+1}, y^{t+1}) \, \Phi_H^{t}(x^{t}, y^{t})}{\Phi_H^{t+1}(x^{t+1}, y^{t+1}) \, \Phi_H^{t+1}(x^{t}, y^{t})} \right]^{1/2}    (8)
The first ratio - efficiency change EH - represents the DMU technical efficiency change between two periods, and the second one - technological change TH - represents the efficiency frontier shift between two periods (Färe et al. [7]). Considering VRS, the technical efficiency change EH can be calculated using enhanced formula
E_H = \frac{\Phi_H^{t+1}(x^{t+1}, y^{t+1})}{\Phi_H^{t}(x^{t}, y^{t})} = \frac{\Phi_H^{t+1}(x^{t+1}, y^{t+1}) \,/\, \Theta_H^{t+1}(x^{t+1}, y^{t+1})}{\Phi_H^{t}(x^{t}, y^{t}) \,/\, \Theta_H^{t}(x^{t}, y^{t})} \times \frac{\Theta_H^{t+1}(x^{t+1}, y^{t+1})}{\Theta_H^{t}(x^{t}, y^{t})} = SE_H \times PE_H    (9)
where the first ratio represents scale efficiency change SEH and the second one pure efficiency change PEH of evaluated DMU.
The technological change T_H can be further decomposed into a magnitude index and a bias index, the latter decomposed into an output bias index and an input bias index. Since the values of these indices are not analysed in this study, this decomposition is not explained here.

Malmquist index correction

The Malmquist index and the values of its elements are greater than 0. A value greater than 1 means an efficiency increase and, conversely, a value less than 1 means an efficiency decrease. For greater clarity, the Malmquist index can be corrected (transformed) by the following formulas (Zbranek [15]):

If M_H < 1 then M_H := 1 - 1/M_H,
If M_H = 1 then M_H := 0,    (10)
If M_H > 1 then M_H := M_H - 1.
If the corrected Malmquist index is equal to 0, there is no efficiency change; if it is greater than 0, efficiency increases; and if it is lower than 0, efficiency decreases. The other elements of the Malmquist index decomposition can be corrected in the same way, with the same meaning. For the clarity of their values, the corrected form of the indices is used in this analysis.
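Once the six efficiencies from models (1)-(6) are available for a DMU, the Malmquist index (7), its decomposition (8)-(9) and the correction (10) reduce to simple arithmetic. The sketch below is a minimal illustration with made-up efficiency values, not the paper's data.

```python
from math import sqrt

def malmquist(phi_t_tt, phi_t_t1, phi_t1_tt, phi_t1_t1, theta_t_tt, theta_t1_t1):
    """phi_a_b = CRS efficiency with the frontier of time a at data of time b
    (tt = (x^t, y^t), t1 = (x^{t+1}, y^{t+1})); theta_* are the VRS (BCC) efficiencies."""
    mi = sqrt((phi_t_t1 / phi_t_tt) * (phi_t1_t1 / phi_t1_tt))   # (7)
    eh = phi_t1_t1 / phi_t_tt                                    # efficiency change
    th = mi / eh                                                 # technological change, (8)
    pe = theta_t1_t1 / theta_t_tt                                # pure efficiency change
    se = eh / pe                                                 # scale efficiency change, (9)
    return mi, eh, th, se, pe

def corrected(m):
    """Correction (10): 0 means no change, >0 an increase, <0 a decrease."""
    if m < 1.0:
        return 1.0 - 1.0 / m
    return m - 1.0 if m > 1.0 else 0.0

mi, eh, th, se, pe = malmquist(0.80, 0.95, 0.78, 0.92, 0.88, 0.96)
print([round(corrected(v), 3) for v in (mi, eh, th, se, pe)])
```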
3
Efficiency of the agricultural sector in the regions of the Czech Republic
For the analysis of the efficiency of the agricultural sector in the Czech regions we used data from Eurostat databases and yearbooks of the Czech regions. Comparable basic factors of production, namely the area of agricultural land (ha), labour as annual work units and gross fixed capital (CZK mil) were chosen as inputs, while gross value added (CZK mil) created in the agricultural sector was chosen as output (Table 1). The gross fixed capital and gross value added were recalculated using interest rates and inflation index.
NUTS III                                      2008     2009     2010     2011     2012     2013
Area of agricultural land (ha)     Min        25,537   25,458   25,432   16,521   16,674   16,841
                                   Median     395,365  394,516  392,785  392,031  394,143  392,509
                                   Max        970,279  969,861  969,426  859,074  857,859  857,797
Overall work (AWU)                 Min        2,227    2,777    3,017    3,357    3,757    3,338
                                   Median     12,988   13,131   12,973   13,300   12,988   12,886
                                   Max        26,959   26,228   24,582   26,186   25,626   26,132
Gross fixed capital (CZK mil.)     Min        211      334      406      582      742      749
                                   Median     2,381    1,741    2,044    1,909    3,060    2,634
                                   Max        5,014    3,469    3,904    4,449    5,828    5,257
Gross value added (CZK mil.)       Min        1,584    1,577    1,731    2,265    2,676    2,528
                                   Median     5,839    4,900    4,154    6,034    6,279    6,364
                                   Max        11,876   8,683    9,076    12,810   14,155   13,725
Interest rates (MF, according to Eurostat)    4.6      4.8      4.2      3.7      3.9      3.9
Inflation (MF, according to Eurostat)         6.3      0.6      1.4      2.3      1.7      1.7

Table 1 Basic information about the data

The starting calculations of the CCR model (1) were performed for all 14 regions of the Czech Republic. Since the Prague region is rated by the DEA model significantly better than the other regions in all periods, so that the other regions generally do not achieve efficiency and therefore could not be sufficiently analysed, the Prague region was excluded from further evaluation. Then the CCR model (1) with the CRS assumption and the BCC model (6) assuming VRS were used for the global evaluation of the 13 Czech regions excluding Prague (Table 2). Firstly, we compare the overall technical efficiency of each region in each period under the CRS and VRS assumptions. Five, six or eight (more than one half of the) regions are shown as technically efficient according to the BCC model, while only two, three or four (less than one third of the) regions are efficient according to the CCR model.
Input orientation        CCR efficiency (%)                                  BCC efficiency (%)
                         2008    2009    2010    2011    2012    2013        2008    2009    2010    2011    2012    2013
Min                      54.15   79.22   81.98   72.29   72.59   73.07       57.13   80.54   84.79   82.03   74.98   83.88
Median                   77.42   90.53   87.49   91.67   90.87   90.97       95.46   100.00  98.44   97.82   98.21   94.23
Average                  79.53   91.05   90.13   87.77   90.84   89.01       87.71   95.58   95.38   94.36   93.45   94.90
Standard Deviation       15.86   7.51    6.02    8.59    7.41    8.40        14.70   6.30    5.33    6.73    7.76    4.94

Table 2 Efficiency of the regions according to the models CCR and BCC (without Prague)
3.1
Changes of the elements of MI
Subsequently, the DEA models (1) to (6) were computed for all 13 regions and all periods, and the MI and selected MI elements were calculated and corrected for a clearer interpretation. The development of the MI shows the expected post-crisis behaviour: the agricultural sector shows an efficiency decrease in the periods 2008/09 and 2009/10, an efficiency increase between 2010 and 2011 after the crisis, followed by little or no efficiency change during the next three years. The efficiency development in all regions is nearly equal (Table 3, Figure 1).
                      2008/09   2009/10   2010/11   2011/12   2012/13
Min                   -57%      -28%      15%       -15%      -15%
Median                -8%       -13%      38%       0%        4%
Average               -17%      -12%      37%       1%        2%
Max                   18%       13%       57%       28%       11%
Standard Deviation    22%       11%       11%       11%       8%

Table 3 Corrected Malmquist indices - the annual change of the regions' efficiency
Figure 1 Corrected Malmquist index - the annual change of the regions' efficiency

The analysis of the components of the MI indicates that this development is driven by technological change, i.e. by shifts of the production possibility frontier during the crisis and after it. Technological change has characteristics similar to the MI, in contrast to the technical efficiency change (Figure 1, Figure 2).

Figure 2 The annual technical efficiency change and technological change

The technical efficiency change index and its components, the pure efficiency change and scale efficiency change indices, have similar values in all analyzed years (Figure 2, Figure 3). These indices show a slight improvement from 2008 to 2009 and little or no change during the next five years. The main differences are in the minimal and maximal values of these year-to-year indices. This means that the agricultural sector did not change its production system after overcoming the crisis; the small differences are caused by changes in production volume.
Figure 3 Pure and scale annual efficiency change of the regions in five periods
4
Conclusion
In 2008 the world economy faced one of its most dangerous (financial) crises. The Malmquist index identifies the expected development of efficiency in 2008/2009 and the following years: an economic collapse in 2008, followed by an economic recovery until 2011 and development without fluctuations in the subsequent years. The technological change index shows a development similar to the global MI. On the other hand, the efficiency change (the technical, pure and scale efficiency change indices) shows only very small positive as well as negative changes in all years. This can be explained by the relative permanence of the agricultural sector, by the tendency of the efficient production frontier to return to its previous level, and also by the permanent level of technical efficiency.
References [1] Andersen, P. and Petersen N. C.: A Procedure for Ranking Efficient Units in Data Envelopment Analysis. Management Science 39 (1993), 1261-1264. [2] Banker, R. D., Charnes, R. F. and Cooper, W. W.: Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis. Management Science 30 (1984), 1078–1092. [3] Bhattacharjee, J.: Resource Use and Productivity in World Agriculture. Journal of Farm Economics 37 (1955), 57-71 [4] Caves, D. W., Christensen, L. R. and Diewert, W. E.: The Economic Theory of Index Numbers and the Measurement of Input, Output and Productivity. Econometrica 50 (1982), 1393-1414. [5] Charnes, A., Copper, W. and Rhodes, E.: Measuring the efficiency of decision-making units. European Journal of Operational Research 2 (1978), 429–444. [6] Clark, C.: The Conditions of Economic Progress. Macmillan & Co., London, 1st edition, 1940. [7] Färe, R., Grosskopf, S., Lindgren, B. and Ross P.: Productivity changes in Swedish pharmacies 1980-1989: a non-parametric Malmquist approach. Journal of Productivity Analysis 3 (1992), 85-101. [8] Färe, R., Grosskopf, S., Norris, M. and Zhang Z.: Productivity Growth, Technical Progress and Efficiency Changes in Industrialised Countries. American Economic Review 84 (1994), 66-83. [9] Farrel, M.: The Measurement of Productive Efficiency. Journal of the Royal Statistical Society, Series A 120 (1957), 253-281. [10] Lovell, C. A. K.: The Decomposition of Malmquist Productivity Indexes. Journal of Productivity Analysis 20 (2003), 437-458. [11] Malmquist, S.: Index numbers and indifference surfaces. Trabajos de Estadistica 4 (1953), 209−242. [12] Ray, S. C.: On An Extended Decomposition of the Malmquist Productivity Index. In: Seventh European Workshop on Efficiency & Productivity Analysis, Informs, 2001. [13] Trueblood, M.A. and Coggins, J.: Intercountry Agricultural Efficiency and Productivity: A Malmquist Index Approach. Washington DC, U.S. Dept. of Agriculture, Economic Research Service, 2001. [14] Yu, B., Liao, X. and Shen, H.: Parametric Decomposition of the Malmquist Index in Output-Oriented Distance Function: Productivity in Chinese Agriculture. Modern Economy 5 (2014), 70-85. [15] Zbranek, P.: Spoločná polnohospodárská politika a jej dopad na výkonnosť polnohospodárských podnikov: Aplikácia Malmquistových indewxov a Luenbergerových indikátorov, Doctoral thesis, Slovak University of Agriculture in Nitra, 300, 2015.
Modelling Effects of Oil Price Fluctuations in a Monetary DSGE Model Jan Brůha1, Jaromı́r Tonner2, Osvald Vašı́ček3 Abstract. The significant drop in oil prices after the end of 2014 has caused a positive supply shock for oil importing economies and at the same time it has further fuelled deflationary tendencies that already had been presented in many advanced countries. Because of the combination of these two effects in the environment of low inflation and low interest rates, the evaluation of the overall effect is not trivial. Therefore, it is not surprising that the drop in oil prices has become increasingly discussed also in the area of monetary policy. In this paper, we propose an extension of a standard DSGE model used for monetary policy analyses by the oil sector. Unlike existing extensions, we address issues not yet covered by Czech monetary policy models, such as the incidence of motor fuel excise taxes. Using the extended model, we empirically analyse the effects of the oil price drop on the Czech inflation and on the real economy. Keywords: oil prices, fuel taxes, DSGE. JEL classification: C44 AMS classification: 90C15
1
Motivation
The goal of the paper is to look at the impact of changes in oil prices on the Czech economy through the lens of a monetary dynamic stochastic general equilibrium (DSGE) model. A reader may ask why we want to look at the impact of oil prices through the lens of a monetary structural model. Oil and energy prices are more often investigated in the area of energy, environmental, or transport economics and policy than in the area of monetary economics and policy. Indeed, the most widely used quantitative tools in this area are static or quasi-dynamic computable general equilibrium (CGE) models rather than DSGE models with forward-looking agents and policy. This can seemingly be supported by the fact that a change in the oil price is, after all, a change in the relative price of one input, while monetary policy deals with the overall price level rather than with relative prices of various goods. This naive position can be easily challenged if we take into account the fact that, because of price stickiness, a change in a relative price is usually also accompanied by a change in the overall price level. This is even more true for commodities that have a significant share in the consumption basket, so that the corresponding price changes affect the aggregate price indexes in a non-negligible way. And this is precisely the case of motor fuels, which in advanced countries account for some 4% of the consumption basket and which are very oil-intensive. Therefore, even if it is true that in the very long run a change in the oil price is just a change in the relative price of one of the production inputs, to which the production and consumption structure of the economy would adapt in the respective way, from the medium-term perspective it is natural that stabilization policy in general, and therefore also monetary policy, should monitor the fluctuations in the oil price and their impacts on the economy. After the significant drop in oil prices at the end of 2014, this reasoning has gained even more prominence. On the one hand, the drop in the oil price is a positive supply shock for oil-importing countries; on the other hand, the resulting drop in motor fuel prices may have fuelled deflationary tendencies that had already been present in many advanced countries. The magnified deflationary tendencies may decrease the supply-shock benefits, since in the environment of low interest rates monetary policy is constrained in its reaction.

1 Czech National Bank, Economic Research Department, Na Příkopě 28, Prague, Czech Republic, Jan.Bruha@cnb.cz
2 Czech National Bank, Monetary Department, Na Příkopě 28, Prague, Czech Republic, Jaromir.Tonner@cnb.cz
3 Masaryk University, Department of Economics, Lipová 41a, Brno, Czech Republic, osvald@econ.muni.cz
Moreover, the resulting drop in the consumption price index might negatively affect overall inflation expectations, and only a credible central bank with a transparent policy can avoid this undesirable side effect of the shock. These arguments show that it is important to analyze the effects of fluctuations in oil prices not only for the purposes of energy or environmental policy, but also for the purposes of monetary policy. This is the motivation of this paper: to look at the aggregate implications of oil price changes for Czech economic growth and inflation through the lens of a monetary DSGE model. Another contribution of this paper is that, unlike existing extensions of DSGE models, we address issues not yet covered by Czech monetary policy models, such as the incidence of motor-fuel excise taxes. Therefore, the paper may be interesting not only for those readers who wish to learn about the effects of the recent drop in the oil price, but for DSGE modellers in general. The rest of this paper is organized as follows. The next section describes the extension and discusses its properties. Section 3 describes the results. The last section 4 concludes.
2
Data and Model
The model we are using for the analysis is a modification of the g3 model of the Czech National Bank. We refer to the publication [1] for the description of the baseline version of the g3 model.
2.1
Data
Figure 1 Price of oil
2.2
Model
There are four major modifications of the original model. First, we assume that the oil is one of the imported goods and that it is domestically transformed to motor fuels (we ignore other usage of oil, for example for the production of plastic materials, as this is minority role). Given the fact motor fuels are very tradable goods, it is immaterial whether they are produced domestically or not. The motor fuel price is then affected not only by the oil price, but also by excise tax and â&#x20AC;&#x201C;for householdsâ&#x20AC;&#x201C; by the value added tax. Following the decomposition analysis by Pisa ([4]), the elasticity of the motor fuel price with respect to the oil price is about 0.4. The inclusion of the
effect of excise taxes on the final price of motor fuel is often neglected in monetary models (unlike in their CGE counterparts) and presents one of the contributions of this paper for DSGE modellers. This motor-fuel production chain is illustrated in Figure 2.

Figure 2 Structure of the model

Second, we assume that the production function for the intermediate goods is extended as follows:

y_t = A_t \, K_t^{\alpha_K} \, L_t^{\alpha_L} \, F_t^{\alpha_F},    (1)
where y_t is the production of the intermediate good, A_t is the productivity, K_t is the capital used, L_t is the labour input, F_t are the motor fuels used for the production of the intermediate good, and the non-negative shares obey the restriction \alpha_K + \alpha_L + \alpha_F = 1. The production function (1), together with the usual cost-minimization assumption, implies the following reduced form for the production of the intermediate goods:

y_t = A_t^{1/(1-\alpha_F)} \, K_t^{\alpha_K/(1-\alpha_F)} \, L_t^{\alpha_L/(1-\alpha_F)} \, \left( P_t^{MF} / P_t^{y} \right)^{-\alpha_F/(1-\alpha_F)},    (2)
where P_t^{MF}/P_t^{y} is the relative price of motor fuel to the price of the intermediate good. This basically means that a decrease in the relative price of motor fuels is equivalent to a productivity increase with the elasticity \alpha_F/(1-\alpha_F). Third, we extend the utility function of the households as follows:

U = \sum_{t>0} \beta^t \left[ \log \frac{C_t - \chi \overline{C}_t}{1-\chi} + \kappa_{MF} \log \frac{C_t^{MF} - \chi \overline{C}_t^{MF}}{1-\chi} \right],    (3)

where C_t is private consumption, \overline{C}_t is the external habit, C_t^{MF} is motor fuel consumption, and \overline{C}_t^{MF} is the habit of the external motor fuel consumption. The intratemporal optimal consumption pattern is given by

P_t^C (C_t - \chi \overline{C}_t) = \kappa_{MF} \, P_t^{MF} (C_t^{MF} - \chi \overline{C}_t^{MF}),    (4)
where PtC is the ex-fuel CPI and PtM F is the end-user price of motor fuels. The inter-temporal substitution is given by the standard Euler equation ([1]). Fourth, as oil is imported, the net foreign asset equation should be modified accordingly. The rest of the model is unchanged.
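A small numerical sketch of the two relations above; the parameter values are illustrative assumptions, not the calibration of the extended g3 model. It evaluates the reduced-form production function (2) and the intratemporal condition (4) solved for motor-fuel consumption.

```python
def reduced_form_output(A, K, L, p_mf_rel, alpha_K, alpha_L, alpha_F):
    """Reduced-form intermediate-goods production (2); p_mf_rel = P^MF / P^y."""
    s = 1.0 - alpha_F
    return (A ** (1 / s)) * (K ** (alpha_K / s)) * (L ** (alpha_L / s)) \
        * (p_mf_rel ** (-alpha_F / s))

def motor_fuel_consumption(C, C_habit, CMF_habit, p_c, p_mf, chi, kappa_mf):
    """Motor-fuel consumption implied by the intratemporal condition (4)."""
    return chi * CMF_habit + p_c * (C - chi * C_habit) / (kappa_mf * p_mf)

# Illustrative numbers only (alpha_F kept small, consistent with a minor fuel share)
print(reduced_form_output(A=1.0, K=3.0, L=1.0, p_mf_rel=0.9,
                          alpha_K=0.30, alpha_L=0.66, alpha_F=0.04))
print(motor_fuel_consumption(C=1.0, C_habit=0.95, CMF_habit=0.04,
                             p_c=1.0, p_mf=1.2, chi=0.7, kappa_mf=0.05))
```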
3
Empirical analysis
To assess the plausibility of the model with oil prices, standard tests were performed on historical data, mainly impulse response analysis and decompositions into shocks.
3.1
Impulse responses
The oil price shock is representative of typical supply-side shocks: an increase in oil prices should lead to higher inflation and interest rates, a more depreciated exchange rate (via negative net foreign assets) and a decline of the real economy. The calibration assumes that a growth of oil prices by ten p.p. implies a decrease of domestic real GDP by 0.15 p.p. The negative effect on real variables is only partially compensated by export growth via the exchange rate depreciation. The excise tax shock has a similar effect on the main variables as the oil price shock, except for the exchange rate: because higher taxes represent a competitive disadvantage, they imply a more appreciated exchange rate.
Figure 3 Oil price, excise tax and motor fuel shocks (impulse responses of inflation, the interest rate, exchange rate growth, wage inflation, real GDP growth, real consumption growth, real investment growth, the change in the price of oil and the trade balance to the oil price and excise tax shocks)
3.2
Shock decomposition
Shock decomposition of the extended model has also revealed the contribution of oil price changes to GDP growth. The overall effect is approximately 0.7 p.p. on GDP growth and -0.5 p.p. on CPI inflation for the years 2014 and 2015.
Figure 4 Real GDP growth shock decomposition
Figure 5 CPI inflation shock decomposition

The tech label in the figures denotes technology shocks, i.e. labour-augmenting technology shocks and TFP shocks. The oil label denotes the oil price shocks (excise tax shock included). The cost-push label denotes cost-push shocks in the consumption, investment, government, export, import and intermediate
sectors. The Foreign label denotes shocks to foreign variables, i.e. foreign demand, foreign interest rates and foreign prices. The gov label denotes government shocks. The habit, euler, inv, UIP, MP, and regul labels denote habit in consumption, wedge in the Euler equation, investment specific, uncovered interest rate parity, monetary policy and regulated prices shocks respectively. The LM label denotes labour market shocks. The REST label comprises the effects of those shocks which do not appear in the legend (including initial conditions).
4
Summary
The significant drop in oil prices after the end of 2014 has caused a positive supply shock for oil importing economies and at the same time it has further fuelled deflationary tendencies that already had been presented in many advanced countries. The proposed extension of a standard DSGE model has revealed that the effect of the oil price drop on the real economy had been about 0.7 p.p. to GDP growth and about -0.5 p.p to the CPI inflation for the years 2014 and 2015.
Acknowledgements This work was supported by funding of specific research at Faculty of Economics and Administration, project MUNI/A/1040/2015. This support is gratefully acknowledged.
References [1] Andrle, M., Hlédik, T., Kamenı́k, O., and Vlček, J.: Implementing the New Structural Model of the Czech National Bank. Czech National Bank, Working Paper Series 2, 2009. [2] Baumeister, C., Kilian, L.: What Central Bankers Need To Know About Forecasting Oil Prices. International Economic Review 55 (2014), 869–889. [3] Blanchard, O., Galı́, J.: The Macroeconomic Effects of Oil Shocks: Why are the 2000s so Different from the 1970s?. C.E.P.R. Discussion Papers, https://ideas.repec.org/p/cpr/ceprdp/6631.html, 2008. [4] Pı́ša, V.: The Demand for Motor Fuels in the Central European Region and its Impacts on Indirect Tax Revenues. Proceedings of the Conference Technical Computing Bratislava, 2012. Available at https://dsp.vscht.cz/konference matlab/MATLAB12/full paper/064 Pisa.pdf
Information Support for Solving the Order Priority Problem in Business Logistic Systems Robert Bucki1, Petr Suchánek2 Abstract. The ordering system diagram plays an important role in many manufacturing plants. There are cases in which orders cannot be made at the moment they are received so it is necessary to change the sequence of making them to increase production efficiency. Setting the priority of orders plays an important role as it lets manufacturers organize the course of production. Analytical methods are often confronted with their limits and, therefore, the use of computer simulation increases progressively. The simulation method as a tool is more and more often used to define the appropriate procedure for the selection of orders and organizing a manufacturing process. They are based on mathematical models which are subsequently used as a source for simulation programs. These programs are especially useful for strategic production planning. The main aim of the article is to design an adequate mathematical model based on the heuristic approach. The model represents one of the alternatives for the definition of the proper scheme of the sequence of orders with a possibility of changing their priorities. Keywords: information, mathematical modeling, simulation, ordering system, production system, business logistics, organization of production, strategic planning. JEL Classification: C02, C63, L21 AMS Classification: 93A30, 00A72, 90B06
1
Introduction
Each product which is produced in a company and subsequently sold comes into being by means of a certain sequence of actions. Production can be fully automatized or organized using a combination of machines and human labour. Production is realized on the basis of orders. In case when a company produces only one product type which requires modifications to the manufacturing process, the number of output products is given by the manufacturing capacity of a manufacturing line. If a company is engaged in manufacturing of a variety of different products which require changes in the production process (e.g. adjustments of production lines, the use of other machines, correcting dislocation of machines, etc.), there is a need to tackle optimization problems, respectively the relationship between the management of orders and production itself. Orders come independently throughout the manufacturing process and to achieve the highest possible efficiency of the enterprise as a whole it is necessary to have in place an adequate system of order control which is part of the production process. To optimize the presented problem there is a need to base it on the theory of process management which is currently a prerequisite for the use of advanced control methods (including quality control) and cannot be done without computer support [1], [3], [8], [11], [15]. Advanced software for decision support and management is equipped with increasingly integrated tools to support the simulations required to define the outputs of the company under the given circumstances [2], [12]. Simulation models are based on a variety of analytical and mathematical methods which are the fundamental basis for creation of simulation models and algorithms. To create simulation models there is a need to implement, for example, multi-agents approach [10], Petri nets [9], [16], hybrid approaches [7], heuristic approaches [14], [4] and others obviously with the support of mathematical and statistical methods. In this article, the authors use a heuristic approach accordingly to all possible variants of solving the problem (see algorithms in Tables 1-3). Finding all possible solutions to the problem is just one of the areas where heuristic algorithms are effectively implemented. Specifically, the paper highlights the problem of finding the sequence of orders to be made in the hypothetical manufacturing system. Orders can be chosen on the basis of the decision made by manufacturing algorithms including orders with the maximal priority. The second solution to this kind of problem can be effective in case of implementing a big number of simulation courses.
Institute of Management and Information Technology, ul. Legionów 81, 43-300 Bielsko-Biala, Poland, rbucki@wsi.net.pl. 2 Silesian University in Opava, School of Business Administration in Karviná, Department of Informatics and Mathematics, Univerzitní náměstí 1934/3, 73340 Karviná, Czech Republic, suchanek@opf.slu.cz. 1
2
Mathematical model
Let us assume there is a matrix of orders with elements which have adjusted priorities (1):
Z^k = \left[ z_{m,n}^{k} \right], \quad m = 1, \dots, M, \ n = 1, \dots, N, \ k = 1, \dots, K,    (1)
where z_{m,n}^{k} is the n-th order for the m-th customer at the k-th stage with the ω-th priority. Orders are subject to change after each decision about manufacturing as follows:
Z^0 \rightarrow Z^1 \rightarrow \dots \rightarrow Z^k \rightarrow \dots \rightarrow Z^K    (2)
At the same time z km,n 0 (the n-th order can be made for the m-th customer); z km,n 1 (otherwise). The order matrix elements take the following values after each decision about manufacturing: z km, n z km, n1 (in case there is a manufacturing decision at the stage k 1 ; z km,n z km,n1 (otherwise). For simplicity reasons the following grades of the priority are introduced = 0 (if there is no priority); = min (in case of the minimal priority); = max (in case of the maximal priority). It is assumed that if orders are not made in accordance with the demand, there are fines imposed. Let us introduce the matrix of fines for not making the nth order for the m-th customer (3):
C^k = \left[ c_{m,n}^{k} \right], \quad m = 1, \dots, M, \ n = 1, \dots, N, \ k = 1, \dots, K,    (3)
where: cmk , n - the fine to be paid to the m-th customer for not making the agreed n-th order in time till the k-th stage of the manufacturing process (expressed in conventional monetary units). At the same time if cmk ,n 1 the m-th customer does not have any priority for the n-th product, otherwise cmk ,n 0 . To simplify the highly complex problem it is assumed that orders are directed to the manufacturing system which can consist of more subsystems. Such systems were analysed in detail in [5], [6] and [13]. To make orders there is a need to make use of the control methods - heuristic algorithms. They choose orders according to the requirements specified in them. The number of algorithms can be immense and it seems that it is worth testing all of them as they can deliver a solution which leads to minimizing order making time. There are algorithms taking into account only the amount of the order without considering its kind and a customer it is made for (Alg_1 – Alg_4) shown in Table 1, algorithms considering the customer - order correlation (Alg_5 – Alg_20) shown in Table 2 and algorithms based on the order customer correlation (Alg_21 – Alg_36) shown in Table 3. It is assumed that all considered algorithms have the same priority ω. Moreover, if the heuristic approach either fails to deliver a satisfactory solution or it is decided to implement methods “at random” focusing on either algorithms or orders or customers is unavoidable. There is also a possibility to combine all mentioned methods e.g. mixing the random choice of all the above.
2.1
Algorithms_α_max/min
It is assumed that orders for manufacturing are chosen in accordance with the amount of the order itself. Either the maximal or minimal orders are searched for. Moreover, there is an additional need to decide which order matrix element should be made in case there are more identical orders (see Table 1).
Table 1 Choosing orders characterized by the maximal priority
First, a maximal-priority order z^k_{\mu,\nu}, 1 \le \mu \le M, 1 \le \nu \le N, k = 0, 1, \dots, K, is chosen and then, if there are two or more such orders, the algorithm chooses the order z^k_{\mu',\nu'}, 1 \le \mu' \le M, 1 \le \nu' \le N, k = 0, 1, \dots, K.
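A minimal sketch of this selection rule (an illustration only; the priority encoding, the admissibility test and the tie-breaking are assumptions based on the verbal description above, since Table 1 itself is not reproduced here): among the orders with the highest priority it picks the maximal (or minimal) order amount and breaks the remaining ties by the lowest customer and order indices.

```python
import numpy as np

def choose_order(Z, priority, maximal=True):
    """Pick one order from Z (M x N order amounts) restricted to the cells
    with the highest priority; ties broken by the smallest (m, n) indices."""
    top = priority == priority.max()
    candidates = np.where(top & (Z > 0))
    if candidates[0].size == 0:
        return None
    amounts = Z[candidates]
    best = amounts.max() if maximal else amounts.min()
    hits = [(m, n) for m, n, a in zip(*candidates, amounts) if a == best]
    return min(hits)            # smallest customer index, then smallest order index

Z = np.array([[5, 0, 7],
              [7, 3, 2]])
priority = np.array([[2, 0, 2],
                     [2, 1, 0]])
print(choose_order(Z, priority, maximal=True))   # -> (0, 2)
```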
2.2
Algorithms_α_customer
It is assumed that orders for manufacturing are chosen in accordance with a detailed customer analysis. The μ-th customer, μ = 1, \dots, M, characterized by the specific total order is searched for (4):
\mu:\ \sum_{n=1}^{N} z^k_{\mu,n} = \max_{1 \le m \le M} \sum_{n=1}^{N} z^k_{m,n} \qquad \text{or} \qquad \mu:\ \sum_{n=1}^{N} z^k_{\mu,n} = \min_{1 \le m \le M} \sum_{n=1}^{N} z^k_{m,n}.    (4)
If there are more customers with identical total orders, the one with the number μ' is taken into account: μ' = max μ or μ' = min μ, 1 \le \mu \le M. Having found the μ'-th customer, there is a need to detect the ν-th order, ν = 1, \dots, N: ν: z^k_{\mu',\nu} = \max_{1 \le n \le N} z^k_{\mu',n} or ν: z^k_{\mu',\nu} = \min_{1 \le n \le N} z^k_{\mu',n}. If there are more identical orders for the chosen μ'-th customer, the one with the number ν' is taken into account, ν' = max ν or ν' = min ν, ν ∈ ⟨1; N⟩. Finally, the order z^k_{\mu',\nu'} becomes subject to manufacturing at the k'-th stage (see Table 2).
Table 2 Choosing an order based on the specific customer
2.3
Algorithm_αorder
It is assumed that orders for manufacturing are chosen in accordance with a detailed order matrix analysis. The ν-th order, ν = 1, \dots, N, associated with the specific customer is searched for (5):

\nu:\ \sum_{m=1}^{M} z^k_{m,\nu} = \max_{1 \le n \le N} \sum_{m=1}^{M} z^k_{m,n} \qquad \text{or} \qquad \nu:\ \sum_{m=1}^{M} z^k_{m,\nu} = \min_{1 \le n \le N} \sum_{m=1}^{M} z^k_{m,n}.    (5)
If there are more identical total orders, the one with the number ν' is taken into account: ν' = max ν or ν' = min ν, ν ∈ ⟨1; N⟩. Having found the ν'-th order, there is a need to detect the μ-th customer, μ = 1, \dots, M: μ: z^k_{\mu,\nu'} = \max_{1 \le m \le M} z^k_{m,\nu'} or μ: z^k_{\mu,\nu'} = \min_{1 \le m \le M} z^k_{m,\nu'}. If there are more such customers for the ν'-th order, the one with the number μ' is taken into account, μ' = max μ or μ' = min μ, μ ∈ ⟨1; M⟩. Finally, the order z^k_{\mu',\nu'} becomes subject to manufacturing at the k'-th stage (see Table 3).
3
Complex search for the satisfactory choice of products
To present the pseudo-code representing the model responsible for searching for the satisfactory solution (see Table 4) there is a need to assume the following (a rough simulation sketch using these symbols follows the list):
W - the number of simulations for the random choice of algorithms;
Y - the number of simulations for the random choice of orders;
A - the set of algorithms;
T_α - the total manufacturing time with the use of the α-th algorithm, α = 1, …, A;
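The following is only a rough sketch of the kind of simulation loop these symbols suggest (all function names, the time model and the data are hypothetical placeholders, not the authors' pseudo-code): each heuristic algorithm is run on a copy of the order matrix and its total manufacturing time is recorded, so that the algorithm with the smallest time can be selected.

```python
import random

def simulate(order_matrix, algorithm, rng):
    """Hypothetical stand-in for the manufacturing-system model: consume the
    orders with one selection heuristic; a changeover penalty is paid whenever
    the served customer changes, so the order sequence matters."""
    remaining = [row[:] for row in order_matrix]
    time_total, last_customer = 0.0, None
    while any(v > 0 for row in remaining for v in row):
        m, n = algorithm(remaining, rng)
        time_total += remaining[m][n] + (2.0 if m != last_customer else 0.0)
        remaining[m][n], last_customer = 0, m
    return time_total

def alg_max_order(Z, rng):        # "maximal order first" type of heuristic
    return max(((v, m, n) for m, row in enumerate(Z) for n, v in enumerate(row) if v > 0))[1:]

def alg_random(Z, rng):           # choice "at random"
    return rng.choice([(m, n) for m, row in enumerate(Z) for n, v in enumerate(row) if v > 0])

orders = [[5, 0, 7], [7, 3, 2]]
rng = random.Random(0)
for alg in (alg_max_order, alg_random):
    print(alg.__name__, simulate(orders, alg, rng))
```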
Figure 1 Transportation system
In the same way we can write the equations which must be satisfied to ensure the requirements of all consumers for computers of types T2 and T3. In general, suppose that there are m computers T1, T2, …, Tm, n companies C1, C2, …, Cn, s retail stores R1, R2, …, Rs and r final consumers F1, F2, …, Fr. If there is no connection from Ti to Cj (from Rl to Fk), we put a_{ij} = O (c_{lk} = O). Our task is to choose the appropriate capacities x_{jl}, j ∈ N = {1, 2, …, n}, l ∈ S = {1, 2, …, s}, such that we ensure the requirements of all consumers for all types of computers. We obtain

\max_{j \in N,\ l \in S} \min\{a_{ij}, x_{jl}, c_{lk}\} = b_{ik}    (1)
for each i â&#x2C6;&#x2C6; M = {1, 2, . . . , m} and for all k â&#x2C6;&#x2C6; R = {1, 2, . . . , r}. A certain disadvantage of any necessary and sufficient condition for the solvability of (1) stems from the fact that it only indicates the existence or nonexistence of the solution but does not indicate any action to be taken to increase the degree of solvability. However, it happens quite often in modeling real situations that the obtained system turns out to be unsolvable. One of possible methods of restoring the solvability is to replace the exact input values by intervals of possible values. The result of the substitution is so-called interval fuzzy matrix equation. In this paper, we shall deal with the solvability of interval fuzzy matrix equations. We define the tolerance, right-weak tolerance, left-weak tolerance and weak tolerance solvability and provide polynomial algorithms for checking them.
2
Preliminaries
Fuzzy algebra is the triple (I, â&#x160;&#x2022;, â&#x160;&#x2014;), where I = [O, I] is a linear ordered set with the least element O, the greatest element I, and two binary operations a â&#x160;&#x2022; b = max{a, b} and a â&#x160;&#x2014; b = min{a, b}. Denote by M, N, R, and S the index sets {1, 2, . . . , m}, {1, 2, . . . , n}, {1, 2, . . . , r}, and {1, 2, . . . , s}, respectively. The set of all m Ă&#x2014; n matrices over I is denoted by I(m, n) and the set of all column n vectors over I we denote by I(n). Operations â&#x160;&#x2022; and â&#x160;&#x2014; are extended to matrices and vectors in the same way as in the classical algebra. We will consider the ordering â&#x2030;¤ on the sets I(m, n) and I(n) defined as follows: â&#x20AC;˘ for A, C â&#x2C6;&#x2C6; I(m, n) : A â&#x2030;¤ C if aij â&#x2030;¤ cij for all i â&#x2C6;&#x2C6; M, j â&#x2C6;&#x2C6; N , â&#x20AC;˘ for x, y â&#x2C6;&#x2C6; I(n) : x â&#x2030;¤ y if xj â&#x2030;¤ yj for all j â&#x2C6;&#x2C6; N . The monotonicity of â&#x160;&#x2014; means that for each A, C â&#x2C6;&#x2C6; I(m, n), A â&#x2030;¤ C and for each B, D â&#x2C6;&#x2C6; I(n, s), B â&#x2030;¤ D the inequality A â&#x160;&#x2014; B â&#x2030;¤ C â&#x160;&#x2014; D holds true.
Let A ∈ I(m, n) and b ∈ I(m). In fuzzy algebra, we can write the system of equations in the matrix form
A ⊗ x = b.    (2)
The crucial role for the solvability of system (2) in fuzzy algebra is played by the principal solution of system (2), defined by
x^*_j(A, b) = \min_{i \in M} \{ b_i : a_{ij} > b_i \}    (3)
for each j ∈ N , with min ∅ = I. The following theorem describes the importance of the principal solution for the solvability of (2). Theorem 1. [2, 13] Let A ∈ I(m, n) and b ∈ I(m) be given. (i) If A ⊗ x = b for x ∈ I(n), then x ≤ x∗ (A, b). (ii) A ⊗ x∗ (A, b) ≤ b. (iii) The system A ⊗ x = b is solvable if and only if x∗ (A, b) is its solution. The properties of a principal solution are expressed in the following assertions. Lemma 2. [1] Let A ∈ I(m, n) and b, d ∈ I(m) be such that b ≤ d. Then x∗ (A, b) ≤ x∗ (A, d). Lemma 3. [3] Let b ∈ I(m) and C, D ∈ I(m, n) be such that D ≤ C. Then x∗ (C, b) ≤ x∗ (D, b). Lemma 4. Let A ∈ I(m, n), b ∈ I(m) and c ∈ I. Then for each j ∈ N the following equality holds: min{x∗j (A ⊗ c, b), c} = min{x∗j (A, b), c}.
(4)
Proof. In the case that x^*_j(A, b) ≥ c we have x^*_j(A ⊗ c, b) ≥ c according to Lemma 3, so both minima are equal to c. In the second case x^*_j(A, b) = b_i < c for some i ∈ M, which implies that a_{ij} > b_i. Then a_{ij} ⊗ c > b_i and consequently x^*_j(A ⊗ c, b) ≤ b_i = x^*_j(A, b). Together with x^*_j(A ⊗ c, b) ≥ x^*_j(A, b) we obtain x^*_j(A ⊗ c, b) = x^*_j(A, b) and consequently (4).
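A short sketch of the principal solution (3) and of the solvability test of Theorem 1(iii); here I = [0, 1] and the data are illustrative.

```python
import numpy as np

def principal_solution(A, b, top=1.0):
    """x*_j(A, b) = min{ b_i : a_ij > b_i }, with the minimum over the empty set = I."""
    m, n = A.shape
    x = np.full(n, top)
    for j in range(n):
        mask = A[:, j] > b
        if mask.any():
            x[j] = b[mask].min()
    return x

def maxmin(A, x):
    """(A (x) x)_i = max_j min(a_ij, x_j)."""
    return np.array([max(min(A[i, j], x[j]) for j in range(A.shape[1]))
                     for i in range(A.shape[0])])

A = np.array([[0.6, 0.9], [0.4, 0.3]])
b = np.array([0.5, 0.3])
x_star = principal_solution(A, b)
print(x_star, np.allclose(maxmin(A, x_star), b))   # solvable iff x* solves the system
```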
3
Matrix Equations
Let A ∈ I(m, n), B ∈ I(m, r), and C ∈ I(s, r) be matrices with elements a_{ij}, b_{ik}, and c_{lk}, respectively. We shall write the system of equalities (1) from Example 1 in the matrix form
A ⊗ X ⊗ C = B.    (5)
Denote by X^*(A, B, C) = (x^*_{jl}(A, B, C)) the matrix defined as follows:
x^*_{jl}(A, B, C) = \min_{k \in R} \{ x^*_j(A ⊗ c_{lk}, B_k) \}.    (6)
We shall call the matrix X^*(A, B, C) a principal matrix solution of (5). The following theorem expresses the properties of X^*(A, B, C) and gives the necessary and sufficient condition for the solvability of (5).

Theorem 5. [8] Let A ∈ I(m, n), B ∈ I(m, r) and C ∈ I(s, r).
(i) If A ⊗ X ⊗ C = B for X ∈ I(n, s), then X ≤ X^*(A, B, C).
(ii) A ⊗ X^*(A, B, C) ⊗ C ≤ B.
(iii) The matrix equation A ⊗ X ⊗ C = B is solvable if and only if X^*(A, B, C) is its solution.

Remark 1. Equality (6) can be written in the form X^*(A, B, C) = (X^*_1(A, B, C), X^*_2(A, B, C), …, X^*_s(A, B, C)), where
X^*_l(A, B, C) = \min_{k \in R} x^*(A ⊗ c_{lk}, B_k).    (7)
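A sketch of the principal matrix solution (6)-(7) together with the solvability test of Theorem 5(iii); the max-min products and the data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def x_star(A, b, top=1.0):
    """Principal solution of A (x) x = b in the max-min algebra, eq. (3)."""
    x = np.full(A.shape[1], top)
    for j in range(A.shape[1]):
        mask = A[:, j] > b
        if mask.any():
            x[j] = b[mask].min()
    return x

def maxmin_mult(P, Q):
    """(P (x) Q)[i, k] = max_j min(P[i, j], Q[j, k])."""
    return np.array([[max(np.minimum(P[i, :], Q[:, k])) for k in range(Q.shape[1])]
                     for i in range(P.shape[0])])

def principal_matrix_solution(A, B, C, top=1.0):
    """X*[j, l] = min over k of x*_j(A (x) c_lk, B[:, k]), eq. (6)."""
    n, s, r = A.shape[1], C.shape[0], C.shape[1]
    X = np.full((n, s), top)
    for l in range(s):
        for k in range(r):
            X[:, l] = np.minimum(X[:, l], x_star(np.minimum(A, C[l, k]), B[:, k], top))
    return X

A = np.array([[0.8, 0.3], [0.5, 0.9]])
C = np.array([[0.7, 0.4], [0.6, 0.9]])
B = np.array([[0.7, 0.4], [0.6, 0.9]])
X = principal_matrix_solution(A, B, C)
print(X, np.allclose(maxmin_mult(maxmin_mult(A, X), C), B))   # Theorem 5(iii)
```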
4
Interval matrix equations
Similarly to [5, 6, 7, 10], we define interval matrices A, B, C as follows:
\mathbf{A} = [\underline{A}, \overline{A}] = \{ A \in I(m, n);\ \underline{A} \le A \le \overline{A} \},
\mathbf{B} = [\underline{B}, \overline{B}] = \{ B \in I(m, r);\ \underline{B} \le B \le \overline{B} \},
\mathbf{C} = [\underline{C}, \overline{C}] = \{ C \in I(s, r);\ \underline{C} \le C \le \overline{C} \}.
Denote by
A⊗X ⊗C =B
(8)
the set of all matrix equations of the form (5) such that A ∈ A, B ∈ B, and C ∈ C. We call (8) an interval fuzzy matrix equation. We shall think over the solvability of interval fuzzy matrix equation on the ground of the solvability of matrix equations of the form (5) such that A ∈ A, B ∈ B, and C ∈ C. We can define several types of solvability of an interval fuzzy matrix equation. Similarly to the max-plus algebra (see [8]), we define four types of solvability: Definition 1. Interval fuzzy matrix equation of the form (8) is • tolerance solvable if there exist X ∈ I(n, s) such that for each A ∈ A and for each C ∈ C the product A ⊗ X ⊗ C lies in B, • right-weakly tolerance solvable if for each C ∈ C there exist X ∈ I(n, s) such that for each A ∈ A the product A ⊗ X ⊗ C lies in B, • left-weakly tolerance solvable if for each A ∈ A there exist X ∈ I(n, s) such that for each C ∈ C the product A ⊗ X ⊗ C lies in B, • weakly tolerance solvable if for each A ∈ A and for each C ∈ C there exist X ∈ I(n, s) such that A ⊗ X ⊗ C ∈ B. 4.1
Tolerance Solvability
Theorem 6. Interval fuzzy matrix equation of the form (8) is tolerance solvable if and only if A ⊗ X ∗ (A, B, C) ⊗ C ≥ B.
(9)
Proof. A proof is based on the same idea as those of paper [8]. Suppose that m = r = s = n, i.e., all matrices are square of order n. Checking the tolerance solvability is based on the verifying of inequality (9). Computing Xl∗ (A, B, C) by (7) for fixed l requires n · O(n2 ) = O(n3 ) arithmetic operation. Hence, computing the matrix X ∗ (A, B, C) requires n · O(n3 ) = O(n4 ) operations. 4.2
Right-weak Tolerance Solvability
Lemma 7. Interval fuzzy matrix equation of the form (8) is right-weakly tolerance solvable if and only if the inequality (10) A ⊗ X ∗ (A, B, C) ⊗ C ≥ B holds for each C ∈ C. Proof. Let C ∈ C be arbitrary but fixed. The existence of X ∈ I(n, s) such that A ⊗ X ⊗ C ∈ [B, B] for each A ∈ A is equivalent to the tolerance solvability of the fuzzy matrix equation with constant matrix C = C = C, which is, according to Theorem 6, equivalent to (10). Therefore, interval fuzzy matrix equation (8) is right-weakly tolerance solvable if and only if inequality (10) is fulfilled for each matrix C ∈ C. 591
Lemma 7 does not give an algorithm for checking the right-weak tolerance solvability. It is easy to see that the tolerance solvability implies the right-weak tolerance solvability. The converse implication may not be valid. Theorem 8. An interval fuzzy matrix equation of the form (8) is right-weakly tolerance solvable if and only if it is tolerance solvable. Proof. Suppose that an interval fuzzy matrix equation is not tolerance solvable, i. e., there exist i ∈ M, p ∈ R such that A ⊗ X ∗ (A, B, C) ⊗ C ip < bip . Denote by C (p) the matrix with the following entries ( clk for k = p, l ∈ S, (p) clk = clk for k ∈ R, k 6= p, l ∈ S.
We will prove that
A ⊗ X ∗ (A, B, C (p) ) ⊗ C (p) ip = A ⊗ X ∗ (A, B, C) ⊗ C ip .
(11)
(12)
The both sides of (12) we can write as (p) A ⊗ X ∗ (A, B, C (p) ) ⊗ C (p) ip = max min{aij , x∗jl (A, B, C (p) ), clp } j∈N, l∈S
and
A ⊗ X ∗ (A, B, C) ⊗ C ip = max min{aij , x∗jl (A, B, C), clp } j∈N, l∈S
It is sufficient to prove that
(p)
min{aij , x∗jl (A, B, C (p) ), clp } = min{aij , x∗jl (A, B, C), clp }
(13)
for each j ∈ N, l ∈ S. The left-hand side of (13) is equal to n o (p) min{aij , x∗jl (A, B, C (p) ), clp } = min aij , min x∗j (A ⊗ clk , B k ), x∗j (A ⊗ clp , B p ), clp k6=p
and the right-hand side is equal to
n o min{aij , x∗jl (A, B, C), clp } = min aij , min x∗j (A ⊗ clk , B k ), x∗j (A ⊗ clp , B p ), clp . k6=p
We shall prove that
min{x∗j (A ⊗ clp , B p ), clp } = min{x∗j (A ⊗ clp , B p ), clp }.
(14)
According to Lemma 4 we obtain
min{x∗j (A ⊗ clp , B p ), clp } = min{x∗j (A, B p ), clp },
min{x∗j (A ⊗ clp , B p ), clp } = min{x∗j (A ⊗ clp , B p ), clp , clp } = min{x∗j (A, B p ), clp }. Since (14) is satisfied, from (12) we obtain A ⊗ X ∗ (A, B, C (p) ) ⊗ C (p) ip < bip . According to Lemma 7 an interval fuzzy matrix equation is not right-weakly tolerance solvable. The converse implication is trivial. 4.3
Left-weak Tolerance Solvability
Lemma 9. An interval fuzzy matrix equation of the form (8) is left-weakly tolerance solvable if and only if an interval fuzzy matrix equation of the form C > ⊗ X > ⊗ A> = B >
(15)
is right-weakly tolerance solvable. Proof. A proof is similar as those of paper [8]. Theorem 10. An interval fuzzy matrix equation of the form (8) is left-weakly tolerance solvable if and only if it is tolerance solvable. Proof. A proof follows from Theorem 8 and Lemma 9.
4.4
Weak Tolerance Solvability
Theorem 11. Interval fuzzy matrix equation of the form (8) is weakly tolerance solvable if and only if it is tolerance solvable. Proof. Suppose that (8) is weakly tolerance solvable. We obtain the following sequence of implications Th 8
(∀A ∈ A)(∀C ∈ C)(∃X ∈ I(n, s))A ⊗ X ⊗ C ∈ B ⇒ (∀A ∈ A)(∃X ∈ I(n, s))(∀C ∈ C)A ⊗ X ⊗ C ∈ B Th 10
⇒ (∃X ∈ I(n, s))(∀A ∈ A)(∀C ∈ C) A ⊗ X ⊗ C ∈ B,
hence (8) is tolerance solvable. The converse implication is trivial. 4.5
Application
Let us return to Example 1. Suppose that the amounts of computers given by the elements of matrices A and C are not exact numbers, but the are given as intervals of possible values. Further, request in provision of computers given by the numbers bik are from given intervals, too. The tolerance solvability answers the question whether there exist the numbers of computers completed by Cj for Rl such that for each values aij and clk from given intervals of possible values the requirements of all consumers are fulfilled. Moreover, in the positive case we obtain the solution given by matrix X ∗ (A, B, C).
References [1] Cechlárová, K.: Solutions of interval systems in fuzzy algebra. In: Proceedings of SOR 2001, (V. Rupnik, L. Zadnik-Stirn, S. Drobne, eds.) Preddvor, Slovenia, 321–326. [2] Cuninghame-Green, R. A.: Minimax Algebra. Springer, Berlin, 1979. [3] Myšková, H.: Interval systems of max-separable linear equations, Lin. Algebra Appl. 403 (2005), 263–272. [4] Myšková, H.: Fuzzy matrix equations, In: Proceedings of 33-th International Conference on Mathematical Methods in Economics (D. Martinčı́k, J. Ircingová, P. Janeček, eds.), University of West Bohemia, Plzeň, 2015, 572–577. [5] Myšková, H.: Robustness of interval Toeplitz matrices in fuzzy algebra, Acta Electrotechnica et Informatica 12 (4) (2012), 56–60. [6] Myšková, H.: Interval eigenvectors of circulant matrices in fuzzy algebra, Acta Electrotechnica et Informatica 12 (3) (2012), 57–61. [7] Myšková, H.: Weak stability of interval orbits of circulant matrices in fuzzy algebra, Acta Electrotechnica et Informatica 12 (3) (2012), 51–56. [8] Myšková, H.: Interval max-plus matrix equations, Lin. Algebra Appl. 492 (2016),111–127. [9] Nola, A. Di, Salvatore, S., Pedrycz, W. and Sanchez, E.: Fuzzy Relation Equations and Their Applications to Knowledge Engineering, Kluwer Academic Publishers, Dordrecht, 1989. [10] Plavka, J: On the O(n3 ) algorithm for checking the strong robustness of interval fuzzy matrices, Discrete Appl. Math. 160 (2012), 640–647. [11] Sanchez, E: Medical diagnosis and composite relations, In: Advances in Fuzzy Set Theory and Applications (M. M. Gupta, R. K. Ragade, R. R. Yager, eds.) North-Holland, Amsterdam, The Netherlands, 1979, 437–444. [12] T. Terano, Y. Tsukamoto, Failure diagnosis by using fuzzy logic. In: Proc. IEEE Conference on Decision Control (New Orleans, LA, 1977), 1390–1395. [13] Zimmermann, K.: Extremálnı́ algebra, Ekonomicko-matematická laboratoř Ekonomického ústavu ČSAV, Praha, 1976.
Sensitivity analysis of MACD indicator on selection of input periods combination
Martin Navrátil1, Pavla Koťátková Stránská2, Jana Heckenbergerová3 Abstract. The trend indicator MACD (Moving Average Convergence/Divergence) is commonly used powerful tool of technical analysis. It allows to predict many profitable trades (buying and selling signal), especially when convergence and divergence periods are correctly recognized. The MACD value is dependent on three input parameters, length of short, long and signal period respectively. Generally applied combination of periods 12/26/9 was introduced by Gerald Appel in 1979 for trading with silver. However there is no standard or methodology for selection of optimal input periods and many other combinations are utilized. The aim of presented contribution is the sensitivity analysis of MACD values on selection of short, long and signal period lengths. In the first part MACD indicator is formally defined using weighted exponential averages, from this definition the dependence of input parameters and resulting MACD value is more understandable. Case study involves three types of shares with growing, falling and changing trend respectively. From results we can conclude that prolonging of signal frame increases sensitivity of MACD indicator. Moreover combination 12/26/9 is not optimal for selected shares and lengths of short and long period are in mutual relation. Keywords: MACD indicator, weighted exponential average, technical analysis, profitable trade, short period, long period, signal period. JEL Classification: G17, C61 AMS Classification: 97M30
1
Introduction to Technical Analysis and Indicators
Predicting the future value of securities, especially stock prices, is constantly developing area. Discipline that describes methods of analysis and prediction of stock prices is called technical analysis. Main algorithms are based on statistical, numerical and visual analysis of share prices and sales volumes respectively. There are two essential groups of methods in technical analysis - the technical indicators and graphical formations known as chart patterns [4, 6, 7]. Technical indicators work with recent or close history data with goal to predict future trend of price. They generate so called “buying and selling signals” based on the expected rise or fall of share price. These signals help speculators to decide whether certain trade will be profitable [8]. Most popular methods of technical analysis are based on moving averages (simple, exponential or weighted) with given period. Moreover, for example price oscillators, are constructed using combinations of moving averages with different periods. Famous trend indicator MACD (Moving Average Convergence/Divergence) uses combination of three exponential averages with short, long and signal period respectively. Commonly applied combination of periods 12/26/9 was introduced by Gerald Appel in 1979 for trading with silver. However there is no standard or methodology for selection of optimal input periods and many other combinations are utilized [2, 5, 9]. The aim of presented contribution is sensitivity analysis of MACD values on selection of short, long and signal period lengths. Following section contains definition of MACD indicator and its input parameters. In section Case Study three types of shares with different type of trend are analyzed for selected combinations of input periods using trade success rates. Major results and directions of future work are summarized in the last section.
1
University of Pardubice, Faculty of Electrical Engineering and Informatics, Department of Mathematics and Physics, Studentska 95, 532 10 Pardubice, Czech Republic, e-mail: martin.navratil1@student.upce.cz 2 Masaryk Institute of Advanced Studies (MIAS CTU), Department of management and quantitative methods, Kolejni 2637/2a, 160 00 Prague 6, Czech Republic, email: stranska.pavla@seznam.cz 3 University of Pardubice, Faculty of Electrical Engineering and Informatics, Department of Mathematics and Physics, Studentska 95, 532 10 Pardubice, Czech Republic, e-mail: jana.heckenbergerova@upce.cz
2
Trend Indicator MACD (Moving Average Convergence/Divergence)
Let us denote:
i - time index,
P_i - stock price in time i,
EMA - Exponential Moving Average dataset,
PO - Price oscillator dataset,
n - period of the EMA,
ep - exponential percentage.
Exponential moving average of given stock price dataset P is defined as: đ??¸đ?&#x2018;&#x20AC;đ??´(đ?&#x2018;&#x192;, đ?&#x2018;&#x203A;)đ?&#x2018;&#x2013; = đ?&#x2018;&#x192;đ?&#x2018;&#x2013; â&#x2C6;&#x2014; đ?&#x2018;&#x2019;đ?&#x2018;? + đ??¸đ?&#x2018;&#x20AC;đ??´(đ?&#x2018;&#x192;, đ?&#x2018;&#x203A;)đ?&#x2018;&#x2013;â&#x2C6;&#x2019;1 â&#x2C6;&#x2014; ( 1 â&#x2C6;&#x2019; đ?&#x2018;&#x2019;đ?&#x2018;?), where 2 đ?&#x2018;&#x2019;đ?&#x2018;? = . đ?&#x2018;&#x203A;+1 Now it is important to denote input parameters of MACD indicator: SP short period LP long period SgP signal period, and define MACD value in given time index as follows: đ?&#x2018;&#x20AC;đ??´đ??śđ??ˇđ?&#x2018;&#x2013; = đ??¸đ?&#x2018;&#x20AC;đ??´(đ?&#x2018;&#x192;đ?&#x2018;&#x201A;, đ?&#x2018;&#x2020;đ?&#x2018;&#x201D;đ?&#x2018;&#x192;)đ?&#x2018;&#x2013; = đ?&#x2018;&#x192;đ?&#x2018;&#x201A;đ?&#x2018;&#x2013; â&#x2C6;&#x2014; (
2 đ?&#x2018;&#x2020;đ?&#x2018;&#x201D;đ?&#x2018;&#x192;+1
) + đ?&#x2018;&#x20AC;đ??´đ??śđ??ˇđ?&#x2018;&#x2013;â&#x2C6;&#x2019;1 â&#x2C6;&#x2014; (1 â&#x2C6;&#x2019;
(1) (2)
2 đ?&#x2018;&#x2020;đ?&#x2018;&#x201D;đ?&#x2018;&#x192;+1
)
(3)
where đ?&#x2018;&#x192;đ?&#x2018;&#x201A; = đ??¸đ?&#x2018;&#x20AC;đ??´(đ?&#x2018;&#x192;, đ?&#x2018;&#x2020;đ?&#x2018;&#x192;) â&#x2C6;&#x2019; đ??¸đ?&#x2018;&#x20AC;đ??´(đ?&#x2018;&#x192;, đ??żđ?&#x2018;&#x192;).
(4)
In words, by subtracting the two exponential moving averages with the short and long period we obtain the price oscillator dataset, which moves around zero and indicates the general price trend. A third exponential moving average with the signal period is applied to the price oscillator and gives us the MACD values. Buying and selling signals for a selected share are detected as intersections of the price oscillator and MACD datasets [10]. The resulting value of the MACD indicator depends on three input parameters: the lengths of the short, long and signal periods. Their combination is commonly written as the triplet SP/LP/SgP. However, there is no standard or methodology for the selection of optimal input periods, and many other combinations are utilized, as summarized in Table 1.

Combinations SP/LP/SgP used in practice, in the FX market, and in Automatic Trading Systems (ATS):
12/26/9, 13/26/9, 7/28/7, 3/11/17, 5/35/9, 8/17/9, 15/35/5, 30/55/8, 30/55/20

Table 1 Selected combinations of MACD periods [3, 7, 8]

Figure 1 Shares of Walt Disney
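To make equations (1)–(4) concrete, the following short sketch computes the EMA, the price oscillator and the MACD line for a price series and detects the buying and selling signals as intersections of the two datasets. It is not the authors' software (which is not publicly accessible); the seeding of the recursions with the first observation and the toy price series are assumptions made for illustration.

```python
def ema(values, n):
    """Exponential moving average, eq. (1)-(2): ep = 2 / (n + 1)."""
    ep = 2.0 / (n + 1)
    out = [values[0]]                      # assumed seed: first observation
    for x in values[1:]:
        out.append(x * ep + out[-1] * (1 - ep))
    return out

def macd(prices, sp=12, lp=26, sgp=9):
    """Price oscillator (eq. 4) and MACD line (eq. 3) for the triplet SP/LP/SgP."""
    po = [s - l for s, l in zip(ema(prices, sp), ema(prices, lp))]
    return po, ema(po, sgp)

def signals(po, macd_line):
    """Buy where PO crosses above the MACD line, sell where it crosses below."""
    buys, sells = [], []
    for i in range(1, len(po)):
        above_prev = po[i - 1] > macd_line[i - 1]
        above_now = po[i] > macd_line[i]
        if above_now and not above_prev:
            buys.append(i)
        elif above_prev and not above_now:
            sells.append(i)
    return buys, sells

if __name__ == "__main__":
    prices = [100 + 0.1 * t + 3 * ((t % 17) - 8) / 8 for t in range(300)]  # toy series
    po, line = macd(prices)
    print(signals(po, line))
```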
3 Sensitivity Analysis – Case Study
For our sensitivity analysis we decided to choose three daily, ten-year-long datasets of shares with different types of trend. The growing trend is represented by the shares of Walt Disney illustrated in Fig. 1, the falling trend by the shares of Telecom Italia shown in Fig. 2, and a variable, frequently changing trend is analyzed for the shares of Big Lots inc. illustrated in Fig. 3.
Figure 2 Shares of Telecom Italia
Figure 3 Shares of Big Lots inc.

The boundaries of the input parameters were set as

SP ∈ ⟨1; 20⟩, LP ∈ ⟨SP + 1; 40⟩, SgP ∈ ⟨2; 9⟩.    (5)

All possible combinations of input periods with a step equal to one were analyzed in our software, which was developed by the main author and is not publicly accessible. Each result (combination) was augmented with a success rate based on the holding duration of the selected share. The simple success rate (SSR) is defined as the number of profitable trades divided by the number of all trades (6), to show how market speculations based on the MACD indicator perform:

SSR = w / n,    (6)

where w is the number of profitable trades and n is the number of all trades based on the MACD indicator buying signal. The SSR ranges from 0 (no profitable trade) to 1 (all trades were profitable).
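The exhaustive sweep over all period combinations described above can be sketched as follows, reusing the macd and signals helpers from the previous sketch. The ranges mirror the boundaries in (5); profitable is a hypothetical callback deciding whether a single trade (a buy signal plus a given holding duration) ends in profit.

```python
def ssr(trades):
    """Simple success rate, eq. (6): profitable trades / all trades."""
    return sum(trades) / len(trades) if trades else 0.0

def sweep(prices, holding, profitable):
    """Evaluate the SSR for every SP/LP/SgP combination allowed by (5)."""
    results = {}
    for sp in range(1, 21):
        for lp in range(sp + 1, 41):
            for sgp in range(2, 10):
                po, line = macd(prices, sp, lp, sgp)
                buys, _ = signals(po, line)
                trades = [profitable(prices, i, holding) for i in buys]
                results[(sp, lp, sgp)] = ssr(trades)
    return results
```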
The resulting 5D dependencies are hard to visualize; this is the reason why we decided to cumulate the information in a special way. In the following figures, Fig. 4, 5 and 6, we depict these cumulated dependencies as ranges of transaction successfulness based on holding time (y-axis) for combinations of parameters (x-axis) that are sorted first by the short period, then by the long period and finally by the signal period.
Figure 4 Ranges of MACD success rates for Walt Disney shares with increasing trend
Figure 5 Ranges of MACD success rates for Telecom Italia shares with decreasing trend
Figure 6 Ranges of MACD success rates for shares of Big Lots inc. with variable trend

Figure 4 illustrates MACD success rates for the shares of Walt Disney with a rising trend. It is obvious that most of the trades with this share are profitable (the success rate is higher than 0.5). Moreover, the 50-percent threshold is broken only for a few combinations and only by the bottom boundary of the profitability ranges. Opposite statements result from the analysis of the falling trend of Telecom Italia shares captured in Fig. 5. Only a few profitable trades could be found; generally, trades with this share are profitless for an arbitrary selection of the input parameters of the MACD indicator. The most interesting result is illustrated in Fig. 6 for the changing trend of Big Lots inc. shares. The selection of the period combination highly influences trade profitability. Trades are always profitable for low values of the short period (SP); however, with rising lengths of the short and long period the variation of success rates increases. A proper combination of MACD periods has to be utilized to guarantee a profitable trade.
3.1 Analysis of signal period length influence
The following analysis concentrates on the selection of the signal period. Figure 7 illustrates the dependency between the holding duration and the SgP value for shares of Big Lots inc. The signal period affects mainly the sensitivity of the MACD indicator. For low values of SgP, more buying and selling signals are generated and the MACD sensitivity increases. Moreover, Fig. 7 shows that the shape of the graph is an almost constant plane and therefore the dependency between the holding duration and the SgP value is weak.
Figure 7 Sensitivity of MACD success rate on selection of signal period and holding duration
Figure 8 Sensitivity of MACD success rate on selection of short and long period lengths
3.2 Sensitivity analysis of short and long period selection
Last but not least, the analysis of the short and long period selection was performed for Big Lots inc. shares. A 3D graph of success rates for given values of the short period and long period is illustrated in Figure 8. The ridge of maximal profitability (around 0.6) is blue, with two significant peaks for SP = 9, LP = 33 and SP = 13, LP = 14, respectively. The commonly applied combination of periods 12/26/9 gives a resulting success rate equal to 0.58. All remarkable peaks, augmented with their profitability, are summarized in Table 2.

Short period / Long period / Success rate:
9 / 33 / 0.605
13 / 14 / 0.603
10 / 32 / 0.601
12 / 15 / 0.6
12 / 26 / 0.58

Table 2 Selection of MACD parameter combinations resulting in high-profitability trades with shares of Big Lots inc., augmented with the result of the 12/26/9 combination
Another interesting result is illustrated in Figure 9, where maximal success rates for different holding periods are depicted. This confirms that there is almost no dependency between the duration of holding the selected share and the setting of the MACD parameters. The trend of the profitability curve is slightly increasing; however, the variation around it is very high and it captures a periodical trading effect with the length of one business week (5 days).
Figure 9 Maximal success rates for different holding durations of Big Lots inc. shares
4 Conclusion
The aim of the presented contribution is a sensitivity analysis of MACD values with respect to the selection of the short, long and signal period lengths. The case study involves three types of shares with a growing, falling and changing trend, respectively. To show how market speculations based on the MACD indicator perform, a simple success rate was calculated for each combination of input parameters and duration of holding of the selected share. From the results we can conclude that prolonging the signal period increases the sensitivity of the MACD indicator. The lengths of the short and long periods significantly influence trade profitability. Moreover, the combination 12/26/9 is not optimal for the selected shares, and the lengths of the short and long periods are mutually related. Surprisingly, there is almost no dependency between the duration of holding the selected share and the setting of the MACD parameters.
Acknowledgements This work has been mainly supported by the University of Pardubice via SGS Project (reg. no. SG660026) and by the institutional support.
References [1] Bender, J. C., Osler, C. L, and Simon, D.: Trading and Illusory Correlations in US Equity Markets. Review of Finance 2, 17 (2012). [2] Bodas-Sagi, D. J., Fernandez-Blanco, P., Hidalgo, J. I., and Soltero-Domingo, F. J.: A parallel evolutionary algorithm for technical market indicators optimization. Journal of Nature Computing 12, 2 (2013), 195– 207. [3] Devi, M. S., and Singh, K. R.: Study on Mutual Funds Trading Strategy Using TPSO and MACD. Journal of computer Science and Information Technologies 5, 1 (2014), 884–891. [4] Friesen, G. C., Weller, P. A., and Dunham, L. M.: Price trends and patterns in technical analysis: A theoretical and empirical examination. Journal of Banking and Finance 6, 33 (2009), 1089–1100. [5] Hanousek, J., and Kopřiva, F.: Do broker/analyst conflicts matter? Detecting evidence from internet trading platforms. Proceedings of 30th International Conference Mathematical Methods in Economics, 2012. [6] Lo, A. W., Mamaysky, H., and Wang, J.: Foundation of Technical Analysis: computational algorithms, statistical inference, and empirical implementation. National Bureau of Economic Research, 2000. [7] Neely, Ch., Weller, P., and Dittmar, R.: Is Technical Analysis in the Foreign Exchange Market Profitable?: A Genetic Programming Approach. Journal of Financial and Quantitative Analysis 4, 32 (1997), 405–426. [8] Pring, M. J.: Technical analysis explained: The successful investor's guide to spotting investment trends and turning points. McGraw-Hill Professional, 2002. [9] Rosillo, R., Fuenten, de al D., and Brugos, J. A. L.: Technical analysis and the Spanish stock exchange: testing the RSI, MACD, momentum and stochastic rules using Spanish market companies. Journal of Applied Economics 45, 12 (2013), 1541–1550. [10] Tučník, P.: Automatizované obchodní systémy – optimalizace interakce s prostředím trhu při obchodování s komoditami. In Kognice a umelý život X (Kelemen, J., and Kvasnicka, V., eds.). Slezská univerzita v Opavě, Opava, 2010, 381–384.
Comparison of an enterprise's financial situation assessment methods. Fuzzy approach versus discriminatory models.

Tomasz L. Nawrocki1

Abstract. The main aim of the paper is to compare the practical application of different methods of assessing an enterprise's financial situation and to identify their advantages and disadvantages. As subjects of this comparison, several popular discriminatory models were adopted on the one hand, and an original model based on the fuzzy approach on the other. Calculations were performed on the basis of financial data from annual reports for the past ten years (2006-2015), published by selected companies listed on the Warsaw Stock Exchange and classified into different sectors of the economy: services and trade, industry, and finance. The comparative analysis conducted in the paper shows some advantages and disadvantages of the considered approaches. For discriminatory models, a big advantage is the simplicity of use; on the other hand, the fuzzy approach offers far greater flexibility, which leads to more objective results over time.

Keywords: discriminatory models, financial situation assessment, fuzzy logic, solvency.

JEL Classification: C69, G33
AMS Classification: 94D05, 26E50
1 Introduction
A feature of the market economy is operating in conditions of uncertainty and risk, which implies the need to control the corporate financial situation. The evaluation of this situation, as a part of economic analysis, is an important tool for business management, since the information obtained from it is the foundation for investment and financing decisions. Because of the imperfections of traditional ratio analysis (increasing the number of indicators, rather than leading to a better orientation, results in information noise), a synthetic measure allowing the evaluation of an enterprise's condition with the greatest possible accuracy has been sought for many years. In the literature, the term "financial situation" in the context of enterprises usually has two meanings – a narrow and a wider one. In the first, it is associated with the creditworthiness of the entity and concentrates on the analysis of its financial liquidity and indebtedness. In the second, it is treated as the general financial condition of an enterprise, including also issues of profitability, widely understood efficiency, as well as competitive advantages [3]. For the purposes of this paper, the first – narrow – meaning of the term financial situation was used. Although the first quantitative assessment models of creditworthiness date back to the 1930s (P.J. FitzPatrick), a significant acceleration of development in this field occurred in the 1960s, along with the wider use of statistics and econometrics (W.H. Beaver, E.I. Altman), especially discriminant analysis [4]. At the same time, however, with the increasingly common use of computers and the perception of certain drawbacks and limitations of statistical and econometrical methods [4], [5], a further development of broadly defined methods of assessing enterprises' financial condition can be seen, focusing on the use of neural networks or fuzzy modelling [5]. Therefore, the main goal of this article is to present the results of a comparative analysis of corporate financial situation assessment with the use of several discriminatory models popular in the literature and the author's original fuzzy model. Calculations were performed on the basis of financial data from annual reports for the past ten years, published by selected companies listed on the Warsaw Stock Exchange and classified into different sectors of the economy: services, industry and finance.
2 Research methodology
Among the many discriminatory models, for the purpose of realizing the main goal of the article it was decided to use five of them that have no particular industry restrictions [4] (the years of model creation are given in brackets): Mączyńska (1994), Gajdka & Stos (2006), Hołda (2001), Prusak (2005) and Hamrol – „poznański model" (1999-2002). The reason why probably the world's most popular discriminatory model, Altman's Z-score [1], was ignored is that it has an industry limitation (manufacturers). The discriminatory functions with a characterization of the variables used are given below. In order to harmonize the limit values in the considered models, 0.44 was subtracted from the final outcome of the Gajdka & Stos model and 0.13 was added in the case of the Prusak model. As a result, for each considered model the limit value separating enterprises with a bad financial situation from the others was 0.

1 Silesian University of Technology, Faculty of Organization and Management, Institute of Economics and Informatics, Roosevelta 26, 41-800 Zabrze, Poland; e-mail: tomasz.nawrocki@polsl.pl.

Mączyńska:
WM = 1.5·X1 + 0.08·X2 + 10·X3 + 5·X4 + 0.3·X5 + 0.1·X6    (1)

X1 – (Profit before tax + Depreciation and amortization) / Reserves and liabilities
X2 – Total assets / Reserves and liabilities
X3 – Profit from operations (EBIT) / Total assets
X4 – Profit from operations (EBIT) / Revenues from sales
X5 – Inventories / Revenues from sales
X6 – Total assets / Revenues from sales

Gajdka & Stos:
WGS = 0.20099·X1 + 0.001303·X2 + 0.760975·X3 + 0.965963·X4 − 0.341096·X5    (2)

X1 – Revenues from sales / Total assets
X2 – Current liabilities × 365 / Cost of goods sold, general administration, selling and distribution
X3 – Net profit / Total assets
X4 – Profit before tax / Revenues from sales
X5 – Reserves and liabilities / Total assets

Hołda:
WH = 0.605 + 0.681·X1 − 0.0196·X2 + 0.00969·X3 + 0.000672·X4 + 0.157·X5    (3)

X1 – Current assets / Current liabilities
X2 – Reserves and liabilities / Total assets
X3 – Net profit / Total assets
X4 – Current liabilities × 360 / Cost of goods sold, general administration, selling and distribution
X5 – Revenues from sales / Total assets

Prusak:
WP = −1.5685 + 6.5245·X1 + 0.148·X2 + 0.4061·X3 + 2.1754·X4    (4)

X1 – Profit from operations (EBIT) / Total assets
X2 – Cost of goods sold, general administration, selling and distribution / Trade and other payables
X3 – Current assets / Current liabilities
X4 – Profit from operations (EBIT) / Revenues from sales

Hamrol – „poznański model":

WH-poz = −2.368 + 3.562·X1 + 1.588·X2 + 4.288·X3 + 6.719·X4    (5)

X1 – Net profit / Total assets
X2 – (Current assets – Inventories) / Current liabilities
X3 – (Equity capital + Non-current liabilities) / Total assets
X4 – Net profit on sales / Revenues from sales
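As an illustration of how such functions are applied, the sketch below evaluates the Mączyńska score (1) and the harmonized Gajdka & Stos score (2) from a dictionary of financial-statement items. The field names are illustrative only and do not come from any particular reporting standard.

```python
def maczynska(d):
    """W_M, eq. (1)."""
    x1 = (d["profit_before_tax"] + d["depreciation"]) / d["reserves_and_liabilities"]
    x2 = d["total_assets"] / d["reserves_and_liabilities"]
    x3 = d["ebit"] / d["total_assets"]
    x4 = d["ebit"] / d["sales"]
    x5 = d["inventories"] / d["sales"]
    x6 = d["total_assets"] / d["sales"]
    return 1.5 * x1 + 0.08 * x2 + 10 * x3 + 5 * x4 + 0.3 * x5 + 0.1 * x6

def gajdka_stos(d):
    """W_GS, eq. (2), harmonized by subtracting 0.44 so that 0 separates bad situations."""
    x1 = d["sales"] / d["total_assets"]
    x2 = d["current_liabilities"] * 365 / d["operating_costs"]
    x3 = d["net_profit"] / d["total_assets"]
    x4 = d["profit_before_tax"] / d["sales"]
    x5 = d["reserves_and_liabilities"] / d["total_assets"]
    raw = 0.20099 * x1 + 0.001303 * x2 + 0.760975 * x3 + 0.965963 * x4 - 0.341096 * x5
    return raw - 0.44
```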
In contrast to the approach used in the statistical and econometrical methods mentioned above [9], the selection of assessment criteria in the original solution proposed by the author, based on fuzzy logic, was carried out arbitrarily, on the basis of the author's knowledge and practical experience in the field of financial analysis. The proposed model generally builds on an earlier concept [6], but for the purposes of this research more effort was made to limit the number of variables and, at the same time, to take into account all the dimensions of an enterprise's financial condition assessment that are crucial from the viewpoint of bankruptcy risk or creditworthiness. Therefore, in the proposed model it was assumed that the financial situation of an enterprise can be analysed and assessed in the two basic dimensions considered in the literature on financial analysis [8]: its financial liquidity and indebtedness. In the assessment of financial liquidity, three basic dimensions were included: static, income (cash flows) and structural (net working capital), and in the case of indebtedness two: the debt level and the ability to service the debt. The structure of the suggested enterprise's financial situation assessment model, along with the most detailed assessment criteria within the particular modules, is presented in Figure 1. In the proposed model, partial assessments are first obtained within the distinguished basic assessment criteria of the financial situation. These assessments result from ratios calculated on the basis of data from financial statements. Next, aggregated assessment results may be obtained in the areas of financial liquidity in the static, income and structural dimensions, the debt level and the ability to service the debt. Furthermore, these results constitute the foundation for calculating general situation measures in the areas of financial liquidity and indebtedness, so that in the final stage, on their basis, it is possible to achieve an overall financial situation assessment for the analysed enterprise.
The calculation tool in the suggested solution is based on the fuzzy set theory, which is one of the approximate reasoning methods [7], [10].
Figure 1 General structure of an enterprise’s financial situation assessment model CA – current assets, STD – short-term debt, CFo – cash flows from operations, NWC – net working capital (current assets – short-term liabilities), RNWC – request for net working capital (non-financial current assets – nonfinancial short-term liabilities), TA – total assets, FD – financial debt, BV – book value (equity capital), TD – total debt (liabilities and reserves), LTD – long-term debt, ND – net debt (FD – cash), NPS – net profit from sales, ITS – interests. It was assumed, that the financial data for calculation purposes (regarding both discriminatory models and original fuzzy model) will be obtained from financial reports of analyzed companies. In all cases of variables based on balance sheet (financial situation statement) values it is also assumed to use their average values from two-year period.
Detailed assumptions of the fuzzy model construction

In relation to the construction of the proposed fuzzy model, based on the Mamdani approach [7], the following assumptions were made:
• for all input variables of the model, the same dictionary of linguistic values was used, and their value space was divided into three fuzzy sets named {low, medium, high};
• for output variables of the model, in order to obtain more accurate intermediate assessments, the space of linguistic values was divided into five fuzzy sets named {low, mid-low, medium, mid-high, high};
• for all membership functions of the particular fuzzy sets, a triangular shape was chosen (Figure 2 and Figure 3);
• the values of the fuzzy sets' characteristic points (x1, x2, x3) for the particular input variables of the model were determined partly on the basis of the literature on enterprises' financial analysis and partly arbitrarily, based on the distribution of the analysed variables' values and on the author's years of experience in the area of companies' financial situation analysis (Table 1);
• for the fuzzification of input variables, the method of simple linear interpolation was used [2];
• fuzzy reasoning in the particular knowledge bases of the model was conducted using the PROD operator (fuzzy implication) and the SUM operator (final accumulation of the conclusion functions received within the particular rule bases into one output set for each base) [7];
• for the defuzzification of fuzzy reasoning results within the particular rule bases, the simplified Center of Sums method was used [7].
Figure 2 The general form of input variables membership function to distinguished fuzzy sets.
Figure 3 The output variables membership function to distinguished fuzzy sets

Characteristic points x1 / x2 / x3 of the input variables, where µ(x1) = 1 for "low", µ(x2) = 1 for "medium" and µ(x3) = 1 for "high":
Current Ratio (-): 0 / 1 / 2
Current Assets Turnover (-): 0 / 2 / 4
Operating Cash Flow Ratio (-): 0 / 0.5 / 1
Liquidity Balance-Assets Ratio (-): -0.1 / 0.0 / 0.1
Financial Debt-Equity Ratio (-): 0 / 0.5 / 1
Debt Structure by Time (-): 0 / 0.5 / 1
Net Debt Repayment Period by EBIT (years): 0 / 3 / 6
Interests Coverage Ratio (-): 0 / 2 / 4
Total Assets ('000 PLN): 0 / 172,000 / 344,000
Revenues from Sales ('000 PLN): 0 / 200,000 / 400,000

Table 1 The values of the fuzzy sets' characteristic points for the particular input variables of the enterprise's financial situation assessment fuzzy model

Next, taking into consideration the general structure of the financial situation assessment model presented in Figure 1, the author, based on his knowledge and experience in the area of the analysed issue, designed 12 rule bases in the „IF – THEN" form (11 bases with 9 rules and 1 base with 27 rules), achieving in this way a „ready to use" form of the financial situation assessment fuzzy model. The intermediate and final assessments generated by the model take values in the range between 0 and 1, where, from the viewpoint of the analysed issue, values closer to 1 mean a very favourable result (better financial situation, lower risk of financial problems and bankruptcy), while values closer to 0 indicate a less favourable result (worse financial situation, higher risk of financial problems and bankruptcy). It also needs to be noted that, due to the fuzzy model assumptions, since the outputs of the fuzzy rule-based model are defuzzified, the information about the uncertainty of an assessment is to some extent lost. However, on the other hand, this procedure provides information about narrower areas of the financial situation assessment. All calculations related to the presented fuzzy model were performed in MS Excel.
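A minimal sketch of the fuzzification step follows, assuming the usual reading of Figure 2 (shoulder-shaped sets at both ends of the value space, a triangular set in the middle) and using the Current Ratio characteristic points (0, 1, 2) from Table 1. The full Mamdani machinery (PROD implication, SUM accumulation, Center of Sums defuzzification over the 12 rule bases) was implemented by the author in MS Excel and is only indicated here.

```python
def fuzzify(x, x1, x2, x3):
    """Degrees of membership in {low, medium, high} for characteristic points x1 < x2 < x3.
    Assumes shoulder sets at both ends (mu_low = 1 below x1, mu_high = 1 above x3)."""
    def down(v, a, b):   # 1 at a, linearly falling to 0 at b
        return max(0.0, min(1.0, (b - v) / (b - a)))
    def up(v, a, b):     # 0 at a, linearly rising to 1 at b
        return max(0.0, min(1.0, (v - a) / (b - a)))
    low = 1.0 if x <= x1 else down(x, x1, x2)
    high = 1.0 if x >= x3 else up(x, x2, x3)
    medium = up(x, x1, x2) if x <= x2 else down(x, x2, x3)
    return {"low": low, "medium": medium, "high": high}

# Example: a Current Ratio of 1.4 with characteristic points (0, 1, 2) from Table 1
print(fuzzify(1.4, 0, 1, 2))   # roughly {'low': 0.0, 'medium': 0.6, 'high': 0.4}
```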
3 Research results
In order to compare the results generated by the considered discriminatory models and the proposed original fuzzy model, the financial situation assessment was conducted for 12 companies from different sectors of the economy whose shares are listed on the WSE – Orange PL (telecommunications), MW Trade (finance), Neuca (wholesale trade), Sfinks (restaurants), Polnord (real estate developer), Marvipol (real estate developer/retail trade), PBG (construction), Budimex (construction), JSW (coal mine), KGHM (copper mine), Żywiec Group (brewery), PKM Duda (meat industry). According to the adopted methodology, the basis for the financial situation assessment of the above-mentioned entities were data acquired from the annual reports published by them in the years 2006-2016. Despite the methodological differences, both the discriminatory models and the proposed original fuzzy model inform about the same category – the financial situation of enterprises. This informational (and not strictly numerical) level was adopted as the basis for the comparative analysis of the investigated companies. The main issue was to compare the general perception of the enterprises' financial situation through the prism of the general results obtained with the use of the different methods. For this reason it was considered that a graphical form of presenting and comparing the results would be adequate and sufficient at the same time. The results obtained during the research are presented in Figures 4-9. The analyzed companies were grouped according to the similarity of their business activity – services and trade, real estate developers and construction, and manufacturers.
Figure 4 Financial situation of servicing and trading companies based on discriminatory models
Figure 5 Financial situation of servicing and trading companies based on original fuzzy model
Figure 6 Financial situation of re developers and construction companies based on discriminatory models
Figure 7 Financial situation of re developers and construction companies based on original fuzzy model
Figure 8 Financial situation of manufacturing companies based on discriminatory models
Figure 9 Financial situation of manufacturing companies based on original fuzzy model
Taking into account the carried out research, attention should be paid to two issues. The first is the significant differences in the financial situation assessments of the companies analyzed over the concerned period, which were obtained using the discriminatory models. Most often the lowest results, including those signalling the possibility of bankruptcy, were indicated by the Gajdka & Stos model, which in most cases was not, however, confirmed in the future. On the other hand, the results obtained on the basis of the Hołda model were less susceptible to changes in time and remained at relatively high levels despite rather turbulent changes in the financial situation of the investigated companies, including cases of interim bankruptcy (Sfinks, PBG, PKM Duda). The second issue concerns the main purpose of the article, which is a comparison of the investigated companies' financial situation assessment results obtained using the discriminatory models and the original fuzzy model. As can be seen in Figures 4-9, in several cases these results differ quite substantially (Orange PL, MW Trade, Neuca, Polnord, Budimex, JSW, KGHM, Żywiec Group). The reasons for this situation can be traced to the excessive reliance of the discriminatory models on values or indicators which in recent years have often been subject to distortion by one-off events (profit from operations, profit before tax, net profit), or to specific characteristics of the business activity (current assets, current liabilities), which also negatively affects the objectivity of the generated assessments. On the other hand, when it comes to the directions of changes in the financial situation of the analyzed companies, both in the case of the discriminatory models and the original fuzzy model they were generally similar.
4 Conclusions
The comparative analysis of corporate financial situation assessment that was conducted for the purpose of this paper, with the use of selected discriminatory models and the original fuzzy model, shows some advantages and disadvantages of each of these approaches. For the discriminatory models, a big advantage is the simplicity of use, which boils down to calculating several indicators based on published financial statements and a discriminant function of a specific formula. The issue of application is in turn a significant barrier to the wider use of the approach based on fuzzy logic, where there is no simple formula but a complex fuzzy model, in which the final result is usually obtained through the use of specific software. On the other hand, the approach based on fuzzy logic is characterized by far greater flexibility (e.g. the possibility of selecting variables that are universal from a sectoral viewpoint, or of replacing a variable that distorts the final results because it is burdened by one-off events), which makes the results generated over time more objective than in the case of the discriminatory models (here, the quantitative and sectoral scope of the research sample on which the discriminant function is developed has considerable importance for the quality of the results). Although discriminatory models are, and probably will continue to be, popular for quick testing in the assessment of an enterprise's financial situation, it is undoubtedly the use of fuzzy logic that provides ample opportunities for development in this research area.
Acknowledgements This research was financed from BK 13/010/BK_16/0018.
References [1] Altman, E.I.: An emerging market credit scoring system for corporate bonds, Emerging Markets Review, 6/2005, p. 311-323. [2] Bartkiewicz, W.: Zbiory rozmyte [in:] Inteligentne systemy w zarządzaniu, Zieliński, J.S., (Ed.), PWN, Warszawa, 2000, p. 72-140. [3] Bombiak, E.: Modele dyskryminacyjne jako metoda oceny sytuacji finansowej przedsiębiorstwa, Zeszyty Naukowe Akademii Podlaskiej w Siedlcach. Seria: Administracja i Zarządzanie, nr86/2010, s. 141-152. [4] Jagiełło, R.: Analiza dyskryminacyjna i regresja logistyczna w procesie oceny zdolności kredytowej przedsiębiorstw, Materiały i Studia NBP, nr 286/2013. Available from: http://www.nbp.pl/publikacje/ materialy_i_studia/ms286.pdf. [5] Korol, T.: Modele prognozowania upadłości przedsiębiorstw – analiza porównawcza wyników sztucznych sieci neuronowych z tradycyjną analizą dyskryminacyjną, Bank i Kredyt, 6/2005, p. 10-17. [6] Nawrocki, T.: The concept of an enterprise’s financial situation assessment fuzzy model, Proceedings of the International Conference on Mathematical Methods in Economics 2015, p. 584-589. [7] Piegat, A.: Modelowanie i sterowanie rozmyte. EXIT, Warszawa, 2003. [8] Wędzki, D.: Analiza wskaźnikowa sprawozdania finansowego t.2. Wolters Kluwer, Kraków, 2009. [9] Wiatr, M.S.: Zarządzanie indywidualnym ryzykiem kredytowym, SGH, Warszawa 2011. [10] Zadeh, L.A.: Fuzzy sets, Information and Control 8 (1965), p. 338-353.
Prediction of Daily Traffic Accident Counts and Related Economic Damage in the Czech Republic

Dana Nejedlová1

Abstract. Free datasets describing weather and traffic accidents in the Czech Republic have been used for the training of a neural network that would predict the number of traffic accidents and the level of economic damage for a given day. The aim of the research is to find out whether there are enough statistical dependencies in the available data so that a practically usable predictor could be trained from them. The Pearson's chi-squared test was used to select input attributes for the neural network. The selected attributes are month, day of week, temperature in two selected preceding days and in the current day, precipitation, and snow. The neural network has been trained on the daily data of the years 2009 till 2014 divided into training and development test sets. The accuracy of the network on more recent days after this training is higher than that of majority voting, which can motivate future research.

Keywords: Pearson's chi-squared test, feed-forward neural network, traffic accidents.

JEL Classification: C32, C45, R41
AMS Classification: 68T, 62H
1 Introduction
Predicting traffic accidents is a challenging task because they are not planned and apart from e.g. insurance frauds everyone wishes to avoid them. Scientific studies of traffic accidents focus on identifying factors that influence their incidence. Article [6] models by logistic regression the influence of rainfall on single-vehicle crashes using the information about crash severity, roadway geometries, driver demographics, collision types, vehicle types, pavement conditions, and temporal and weather conditions from databases available in Wisconsin, United States of America. Article [8] models by linear regression traffic volume in Melbourne, Australia. It identifies weather as a significant factor influencing the traffic volume as well as daytime and nighttime periods, day of the week, month, and holidays. The traffic volume is then used to predict the count of accidents by a nonlinear model that takes into account the observation that the number of accidents is directly proportional to the traffic volume and to the severity of weather conditions, but at the same time the inclement weather deters motorists from venturing onto the road and thus reduces the traffic volume, which justifies the used nonlinear regression model. Article [2] presents autoregressive time-series model of daily crash counts for three large cities in the Netherlands. This study uses data about daily crash counts, accurate daily weather conditions (wind, temperature, sunshine, precipitation, air pressure, and visibility), and daily vehicle counts for each studied area. This article also shows that if information on daily traffic exposure is not available, the information about day of the week is a good substitute. Article [15] uses autoregressive time-series model of annual road traffic fatalities in Great Britain and the monthly car casualties within the London congestion charging (CC) zone to investigate the impact of various road safety measures and legislations (seat-belt safety law, penalty points for careless driving, driving with insurance, and seat-belt wearing for child passengers). This study uses also the data about the traffic volume in the form of annual vehicle-kilometers travelled in Great Britain. Article [1] suggests a negative binomial model as the most appropriate for predicting traffic accidents using various details about studied location of traffic and about the drivers (age and gender). Available databases allowed assigning accidents to particular segments of State Road 50 in Central Florida, United States of America. Each segment was characterized by roadway geometries, traffic volumes, and speed limits. Weather was not included in the variables that contribute to accident occurrence. Results of this research identify combinations of qualities of drivers and roads which are most likely to result in an accident, which can be used as a source for arranging preventive interventions. This contribution identifies attributes significantly influencing traffic accident counts in the Czech Republic. It draws these attributes from free publicly available datasets about daily traffic accident counts, related economic damage, and weather. These datasets do not contain as much details as those that were available to the abovementioned studies. It is not surprising that the precision of prediction of traffic accidents using these datasets is low but when this precision is higher than the precision of a mere guessing then it indicates the existence of 1
Technical University of Liberec, Faculty of Economics, Department of Informatics, Studentská 1402/2, 461 17 Liberec 1, Czech Republic, e-mail: dana.nejedlova@tul.cz.
statistical dependencies that hold true in the whole studied time period. The existence of prevailing statistical dependencies can motivate future research using longer data series, additional data sources that could possibly become available, and various data models, the result of which could be a practically usable predictor of the number of accidents and the economic damage caused by these accidents for a given time period of the future. The data modeling technique used in this study is Pearson’s chi-squared test to identify the input data for the feed-forward neural network.
2 Data sources
Datasets used in this study are sources of information about traffic accidents, weather, and geomagnetic activity as some studies, e.g. [17, 18], indicate a possible effect of solar and geomagnetic activity on traffic accidents.
2.1 Traffic accidents
Dataset of traffic accidents is published by the Police of the Czech Republic on their web page [14]. The earliest record in this source is for February 10, 2009. The dataset is updated on a daily basis, and the data can be downloaded separately for each day. The data are structured by date and by 14 Czech regions, each pair of day and region having the following attributes: number of accidents, fatalities, severe injuries, minor injuries, and economic damage in thousands of Czech crowns (CZK). The accidents are classified also by their causes (speeding, not giving way, inappropriate overtaking, inappropriate driving, other cause, alcohol) in such a way that the sum of the number of accidents having a certain cause is equal or higher than the number of accidents, suggesting that some accidents have been assigned to more than one cause. There is an error in this database: on August 26, 2012 there are no data.
2.2 Weather
Weather dataset [11, 12] was acquired from the National Centers for Environmental Information. The earliest record in this source is for January 1, 1775. The dataset is updated on a daily basis with a several day delay. Data for the Czech Republic can be ordered from [13] by clicking the “Add to Cart” button, “View All Items” button, selecting the “Custom GHCN-Daily CSV” option, selecting the Date Range, clicking the “Continue” button, checking all Station Detail & Data Flag Options flags, selecting Metric Units, clicking the “Continue” button, entering email address, and clicking the “Submit Order” button. The data are structured by meteorological station and date of observation. For each of this pair the following attributes are available: precipitation in mm, snow depth in mm, and maximum and minimum temperature in degrees C. There are some errors in this dataset: records for March 18, 2007 and September 9, 2013 are missing. Some other days exhibited missing measurements in all meteorological stations. For these days the missing values have been added to the dataset from internet weather service [4] that presents measurements of a number of Czech meteorological stations for a selected day. Of all available meteorological stations the data from station Ruzyně were used.
2.3 Geomagnetic activity
Dataset of geomagnetic activity can be downloaded for a submitted time period from web page [5]. The earliest available date of observation is January 1, 2000. The data are structured by days.
3 Data preprocessing
Available time series of days have been divided into 3 parts defined by Table 1. The first two parts called Training Set and Development Test Set have been used for selecting useful attributes for the feed-forward neural network that would predict the number of accidents and related economic damage for a given day and for the estimation of parameters of this neural network. The third part called Test Set is used for the presentation of results in Section 5. This methodology of dividing the data into three sets is recommended e.g. in [7] and also in [3] where the Development Test Set is called Validation Set.
Set / From Date / To Date / Number of Days:
Training: 2009.02.10 / 2014.02.09 / 1826
Development Test: 2014.02.10 / 2015.02.09 / 365
Test: 2015.02.10 / 2016.04.19 / 435

Table 1 Division of available data
In order to obtain meaningful results of analysis the acquired data have been transformed into values as described below.
3.1 Traffic accidents
Of all the available information in [14], the total number of accidents and the total economic damage per day for all Czech regions have been selected as the values that will be predicted.

The number of accidents
There is a rising tendency in this time series that demanded transforming the original values into their differences from the quadratic trend of the period spanning the Training and Development Test sets. The values of these differences have been transformed into three approximately equifrequent categories defined by Table 2. Without applying the quadratic trend, the count of the "Low" category would be inadequately low in the Development Test and Test sets containing late days (see Table 1), which would make the prediction of the Test Set from the Training and Development Test sets impossible.

Economic damage
This time series has been processed the same way as the number of accidents (see Table 2), but before this step the original values have been deprived of outliers. The outliers have been defined as the values above 50,000 thousand CZK. The outliers have been substituted by the mean value of the joint Training and Development Test sets equal to 16,876 thousand CZK, which resulted in classifying them into the "High" category.
The Number of Accidents:
- Quadratic trend: y = x − 6.37E-06 ∙ t² − 0.00471 ∙ t − 196
- Border values: Low: y < −20; Medium: −20 ≤ y < 22; High: y ≥ 22
- Category counts (Low / Medium / High): Training 602 / 628 / 596; Development Test 132 / 100 / 133; Test 141 / 123 / 171

Economic Damage:
- Quadratic trend: y = x − 6.69E-04 ∙ t² + 1.5025 ∙ t − 13297
- Border values: Low: y < −2000; Medium: −2000 ≤ y < 1300; High: y ≥ 1300
- Category counts (Low / Medium / High): Training 611 / 611 / 604; Development Test 119 / 126 / 120; Test 131 / 139 / 165

Table 2 Division of daily values in the traffic accidents dataset into categories

In Table 2 the x symbol stands for the original number of accidents and the number of thousand CZK with outliers substituted by the mean value, t stands for the order of the day in the time series where the date 2009.02.10 has t equal to 1, and y is the resulting value which is assigned the category defined by the border values.
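A minimal sketch of the categorization from Table 2: the quadratic-trend coefficients and category borders are those reported in the table, while the function and argument names are illustrative.

```python
def accident_category(count, t):
    """Classify the accident count of day t (t = 1 for 2009.02.10) according to Table 2."""
    y = count - 6.37e-06 * t**2 - 0.00471 * t - 196   # difference from the quadratic trend
    if y < -20:
        return "Low"
    return "Medium" if y < 22 else "High"

def damage_category(damage_thousand_czk, t, mean_value=16876.0):
    """Classify economic damage; outliers above 50,000 thousand CZK are replaced by the mean."""
    x = mean_value if damage_thousand_czk > 50000 else damage_thousand_czk
    y = x - 6.69e-04 * t**2 + 1.5025 * t - 13297
    if y < -2000:
        return "Low"
    return "Medium" if y < 1300 else "High"
```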
3.2 Weather
From the weather dataset [11, 12], where each row is an observation of a single meteorological station for a single day, all available information has been utilized. To predict traffic accidents for a given day, the weather for this day has been transformed into single values of the attributes listed in Section 2.2.

Precipitation and snow depth
Two columns for a binary value of either precipitation or snow have been added to the weather dataset [11, 12] with the value equal to zero when the observed value of precipitation or snow in a meteorological station in this row was either zero or missing. When the observed value was above zero, its binary value was equal to one. A single value of either precipitation or snow depth for a single day has been computed as the arithmetic mean of these binary values for all meteorological stations, respectively for precipitation and snow depth.

Temperature
For each meteorological station, a single temperature for a single day has been computed as the arithmetic mean of its maximum and minimum temperature. Missing values have been ignored. Missing values of both maximum and minimum temperature resulted in a missing value of the arithmetic mean. A single value of temperature for a single day has been computed as the arithmetic mean of these mean values for all meteorological stations, ignoring missing values. The resulting temperature has been classified into the categories Low, Medium, and High using the border values 5 and 14 degrees C, in a similar way to the border values –20 and 22 or –2000 and 1300 in Table 2.
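A sketch of the daily aggregation described above, assuming station records arrive as dictionaries with possibly missing fields (the keys prcp, snwd, tmax and tmin are illustrative): precipitation and snow become the share of stations reporting a positive value, and the daily temperature is the mean of station means, categorized by the 5 and 14 degrees C borders.

```python
def daily_weather(stations):
    """Aggregate one day of station records into the attributes used by the predictor."""
    precip = sum(1 for s in stations if (s.get("prcp") or 0) > 0) / len(stations)
    snow = sum(1 for s in stations if (s.get("snwd") or 0) > 0) / len(stations)
    temps = []
    for s in stations:
        vals = [v for v in (s.get("tmax"), s.get("tmin")) if v is not None]
        if vals:                                  # skip stations missing both temperatures
            temps.append(sum(vals) / len(vals))
    temp = sum(temps) / len(temps) if temps else None
    if temp is None:
        category = None
    elif temp < 5:
        category = "Low"
    elif temp < 14:
        category = "Medium"
    else:
        category = "High"
    return {"precipitation": precip, "snow": snow,
            "temperature": temp, "temp_category": category}
```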
3.3 Geomagnetic activity
Each row of the dataset obtainable from web page [5] contains values of 8 K-indices measured in a three-hour interval in a given day characterizing geomagnetic activity of this day. Missing measurements denoted as “N” have been replaced by a preceding measurement. The underlying theory is explained in [16]. Each day has been classified into one of possible categories Quiet, Unsettled, Active, Minor storm, Major storm, and Severe storm according to rules stated in [9].
4 Pearson's chi-squared tests
Pearson’s chi-squared tests [19] employ categorical values obtained in data preprocessing described in Section 3. Data involved in their computation come from merged Training and Development Test sets. The results are summarized in Table 3.
Row / Test / Degrees of Freedom / 0.995 Quantile / Χ²:
1. Temperature and the Number of Accidents – 4; 14.8643; 48.567
2. Temperature and the Level of Economic Damage – 4; 14.8643; 40.686
3. Geomagnetic Activity and the Number of Accidents – 6; 18.5490; 6.0727
4. Precipitation and the Number of Accidents – 2; 10.5987; 0.2950
5. Precipitation and the Level of Economic Damage – 2; 10.5987; 6.9586
6. Snow and the Number of Accidents – 2; 10.5987; 41.0788
7. Snow and the Level of Economic Damage – 2; 10.5987; 44.3082
8. Month and the Number of Accidents – 22; 42.7958; 229.361
9. Month and the Level of Economic Damage – 22; 42.7958; 130.746
10. Day of the Week and the Number of Accidents – 12; 28.2995; 1094.896
11. Day of the Week and the Level of Economic Damage – 12; 28.2995; 745.122
12. Week after the Transition to or from Daylight Saving Time (DST) and the Number of Accidents – 4; 14.8643; 14.263

Table 3 Tested dependencies

Remarks to the rows of Table 3 are in Table 4.
Row(s) – Commentary to the Statistical Tests for a Given Row in Table 3:
1, 2 – There is a low number of accidents and a low level of economic damage on days with low temperature and vice versa.
3 – The influence of geomagnetic activity on the number of accidents is not significant. Only 4 categories of the lowest levels of geomagnetic activity listed in Section 3.3 have been observed, which is why there are 6 degrees of freedom: (4 – 1) ∙ (3 – 1), 3 standing for Low, Medium, and High number of accidents.
4, 5 – The influence of precipitation on the number of accidents and on the level of economic damage is not significant. Precipitation was treated as a binary value. The first category was for days with zero precipitation and the second category was for days with precipitation above zero. The computation of precipitation is described in Section 3.2.
6, 7 – There is a low number of accidents and a low level of economic damage on days with non-zero snow depth and vice versa. Snow depth was treated the same way as precipitation in a contingency table.
8-11 – Month and day of the week are strongly correlated with the number of accidents and the level of economic damage, suggesting that the primary cause of accidents and economic damage is the level of traffic.
12 – The week after the transition to or from daylight saving time and the number of accidents might be somewhat correlated. Days have been classified into three categories: days of the week after the transition to DST, days of the week after the transition from DST, and the other days. The week begins on Sunday.

Table 4 Details about tested dependencies in Table 3
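For reference, a minimal sketch of the statistic behind Table 3: a contingency table of observed counts is compared with the counts expected under independence, and the resulting Χ² is checked against the 0.995 quantile. The counts below are purely illustrative, not taken from the studied data.

```python
def chi_squared(table):
    """Pearson's chi-squared statistic for a contingency table (list of rows of counts)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Illustrative 2 x 3 table: precipitation (no/yes) vs. accident category (Low/Medium/High)
table = [[300, 310, 290], [430, 420, 440]]
print(chi_squared(table))   # compare with the 0.995 quantile for (2-1)*(3-1) = 2 d.f., 10.5987
```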
Tested dependencies with Χ2 very close to a 0.995 Quantile (i.e. 1 – 0.995 is the probability that the tested pairs of attributes are independent) have been examined more closely in a series of contingency tables differing in a shift of the attributes Temperature, Week after the Transition to or from DST, Precipitation, and Geomagnetic Activity relatively to the other attributes The Number of Accidents and The Level of Economic Damage. The
result can be seen in Figure 1. The horizontal axis shows the number of days by which the attributes have been shifted. Had there been a peak around the zero shift, a significant dependency could have been detected in this way. Peaks in the negative part of the horizontal axis mean that the shifted attributes have a delayed effect on accidents and related economic damage. Peaks in the positive part of the horizontal axis mean that the categories of accidents and related economic damage have counts different from random distribution on days preceding the change in the shifted attributes.

[Figure 1 plots the Χ² statistic (vertical axis, 0 to 120) against the time shift in days (horizontal axis, −300 to 100) for the pairs: Number of Accidents – Temperature, Economic Damage – Temperature, Number of Accidents – Week after the Transition to or from Daylight Saving Time, Number of Accidents – Geomagnetic Activity, Economic Damage – Precipitation.]

Figure 1 Pearson's chi-squared tests with a time shift in days

It can be seen from Figure 1 that Precipitation and Geomagnetic Activity do not have a significant influence, at least in the used categorical representation. The Transition to or from DST has a significant peak distant approximately a month from the zero shift in the positive part of the horizontal axis. Approximately 2 weeks after the time shift to or from DST the Number of Accidents becomes randomly distributed. Examining the related contingency table reveals that there is a relatively low number of accidents before the spring time shift to DST and a relatively high number of accidents before the autumn time shift from DST. The two biggest peaks at –60 and –276 are probably a manifestation of a yearly temperature and traffic cycle described in the next paragraph. It is probable that all significant peaks on the Transition to or from DST dataset in Figure 1 are caused by this cycle. The data series involving Temperature in Figure 1 suggest that accidents and related economic damage have a cyclic character and that they react to the change in temperature with approximately a month's delay. This delay suggests that the primary cause of accidents and economic damage is the level of traffic. There is a low number of accidents and a low level of economic damage in days with low temperature and vice versa in the contingency table for a minus-32-day shift. And there is a high number of accidents and a high level of economic damage in days with low temperature and vice versa in the contingency table for a minus-219-day shift. It means that these two peaks in Figure 1 are anticorrelated. Although having correlated (or anticorrelated) attributes in the input of a predictor is not recommended, see e.g. [3], the neural network described in Section 5 has usually been more accurate when it used both of them.
5 Prediction using the feed-forward neural network
A feed-forward neural network has been reported in paper [10], dealing with traffic crashes occurring at intersections, as a better solution than the Negative Binomial Regression employed also in articles [1, 2, 15]. A feed-forward neural network has been programmed in the C language according to the algorithm described in [20] and used to predict separately the category of the number of accidents and the level of economic damage from the month, day of week, temperature 219 days ago, temperature 32 days ago, temperature for the current day, precipitation and snow depth. All categorical attributes have been encoded in the form of the so-called dummy variables described in [3]. Precipitation and snow depth have each been represented as a single value as described in Section 3.2. The network has been trained according to the Training Set and the training has been stopped when its accuracy of prediction of the Development Test Set was maximal. The neural network with the final weights after this training has predicted correctly the category of the Number of Accidents in the range of 53% to
64% of days and the category of the Level of Economic Damage in the range of 43% to 50% of days in the Test Set, while guessing the most frequent category has accuracy of 39% and 38% respectively.
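A compact sketch of the predictor setup described above (not the original C implementation): categorical attributes are expanded into dummy variables and a single-hidden-layer network is trained by gradient descent, keeping the weights that score best on the Development Test Set. The layer size, learning rate and number of epochs are illustrative assumptions.

```python
import numpy as np

def dummies(values, categories):
    """One 0/1 dummy column per category, as in the encoding of [3]."""
    return np.array([[1.0 if v == c else 0.0 for c in categories] for v in values])

def accuracy(X, y, params):
    """Share of correctly predicted categories (y is one-hot encoded)."""
    W1, b1, W2, b2 = params
    pred = (np.tanh(X @ W1 + b1) @ W2 + b2).argmax(axis=1)
    return float((pred == y.argmax(axis=1)).mean())

def train(X, y, X_dev, y_dev, hidden=12, lr=0.05, epochs=2000, seed=0):
    """Single-hidden-layer network with softmax output; keeps the weights
    that achieve the highest accuracy on the development test set."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    best, best_acc = (W1.copy(), b1.copy(), W2.copy(), b2.copy()), 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                        # hidden layer
        z = h @ W2 + b2
        p = np.exp(z - z.max(axis=1, keepdims=True))    # softmax probabilities
        p /= p.sum(axis=1, keepdims=True)
        grad_z = (p - y) / len(X)                       # cross-entropy gradient
        grad_h = grad_z @ W2.T * (1.0 - h ** 2)
        W2 -= lr * h.T @ grad_z
        b2 -= lr * grad_z.sum(axis=0)
        W1 -= lr * X.T @ grad_h
        b1 -= lr * grad_h.sum(axis=0)
        acc = accuracy(X_dev, y_dev, (W1, b1, W2, b2))
        if acc > best_acc:                              # keep the best dev-set weights
            best, best_acc = (W1.copy(), b1.copy(), W2.copy(), b2.copy()), acc
    return best, best_acc

# Example input assembly (illustrative): X = np.hstack([dummies(months, range(1, 13)),
# dummies(weekdays, range(7)), temperature_dummies, precipitation_column, snow_column])
```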
6 Conclusion
Artificial neural network can predict the number of accidents and the level of economic damage with accuracy slightly above majority voting. Its low accuracy is most likely caused by imprecise data. It would be possible, however, to create training data for a specific region and use the neural network trained with the similar methodology as described in this contribution for e.g. assessing the impact of local preventive police interventions.
References [1] Abdel-Aty, M. A., and Radwan, A. S.: Modeling traffic accident occurrence and involvement. Accident Analysis and Prevention. 32 (2000), 633–642. [2] Brijs, T., Karlis, D., and Wets, G.: Studying the effect of weather conditions on daily crash counts using a discrete time-series model. Accident Analysis and Prevention. 40 (2008), 1180–1190. DOI: 10.1016/j.aap.2008.01.001. [3] Hastie, T., Tibshirani, R., and Friedman, J.: The Elements of Statistical Learning. 2nd edition. Springer, New York, 2009. [accessed 2016-03-01]. Available from: http://statweb.stanford.edu/~tibs/ElemStatLearn/. [4] In-počasí: Archiv. [online]. (2016). [accessed 2016-04-04]. Available from WWW: http://www.inpocasi.cz/archiv/. [5] Institute of Geophysics of the Czech Academy of Sciences, v. v. i.: Data archive - K indices. [online]. (2016). [accessed 2016-04-04]. Available from WWW: http://www.ig.cas.cz/en/structure/observatories/geomagnetic-observatory-budkov/archive. [6] Junga, S., Qinb, X., and Noyce, D. A.: Rainfall effect on single-vehicle crash severities using polychotomous response models. Accident Analysis and Prevention. 42 (2010), 213–224. DOI: 10.1016/j.aap.2009.07.020. [7] Jurafsky, D, and Martin, J. H.: Speech and Language Processing. Prentice Hall, Inc., Englewood Cliffs, New Jersey 07632, 2000. [8] Keay, K., and Simmonds, I.: The association of rainfall and other weather variables with road traffic volume in Melbourne, Australia. Accident Analysis and Prevention. 37 (2005), 109–124. DOI: 10.1016/j.aap.2004.07.005. [9] Kubašta, P.: Výpočet geomagnetické aktivity. [online]. (2011). [accessed 2016-01-28]. Available from WWW: https://www.ig.cas.cz/kubasta/PersonalPages/zpravy/noaaVypocet.pdf. [10] Liu, P., Chen, S.-H., and Yang, M.-D.: Study of Signalized Intersection Crashes Using Artificial Intelligence Methods. In: Proceedings of the 7th Mexican International Conference on Artificial Intelligence (Gelbukh, A., Morales, E. F., eds.). Springer-Verlag, Berlin Heidelberg, 2008, 987–997. [11] Menne, M. J., Durre, I., Korzeniewski, B., McNeal, S., Thomas, K., Yin, X., Anthony, S., Ray, R., Vose, R. S., Gleason, B. E., and Houston, T. G.: Global Historical Climatology Network - Daily (GHCN-Daily), Version 3. [FIPS:EZ - Czech Republic - 2000-01-01--2016-03-31]. NOAA National Climatic Data Center. (2012), DOI: 10.7289/V5D21VHZ. [accessed 2016-04-14]. [12] Menne, M. J., Durre, I., Vose, R. S., Gleason, B. E., and Houston, T. G.: An Overview of the Global Historical Climatology Network-Daily Database. J. Atmos. Oceanic Technol. 29 (2012), 897-910. DOI: 10.1175/JTECH-D-11-00103.1. [13] National Centers for Environmental Information: Daily Summaries Location Details. [online]. 2016. [accessed 2016-04-04]. Available from WWW: http://www.ncdc.noaa.gov/cdoweb/datasets/GHCND/locations/FIPS:EZ/detail. [14] Policie České republiky (Police of the Czech Republic): Statistiky dopravních nehod (Traffic Accidents Statistics). [online]. (2015). [accessed 2016-04-14]. Available from WWW: http://aplikace.policie.cz/statistiky-dopravnich-nehod/default.aspx. [15] Quddus, M. A.: Time series count data models: An empirical application to traffic accidents. Accident Analysis and Prevention. 40 (2008), 1732–1741. DOI: 10.1016/j.aap.2008.06.011. [16] Reeve, W. D.: Geomagnetism Tutorial. [online]. (2010). [accessed 2016-01-28]. Available from WWW: http://www.reeve.com/Documents/SAM/GeomagnetismTutorial.pdf. 
[17] Střeštík, J., and Prigancová, A.: On the possible effect of environmental factors on the occurrence of traffic accidents. Acta geodaet., geophys. et montanist. Acad. Sci. Hung. 21 (1986), 155-166. [18] Verma, P. L.: Traffic accident in India in relation with solar and geomagnetic activity parameters and cosmic ray intensity (1989 to 2010). International Journal of Physical Sciences. 8.10 (2013), 388-394. DOI: 10.5897/IJPS12.733. [19] Wilcox, R. R.: Basic Statistics. Oxford University Press, Inc., New York, 2009. [20] Winston, P. H.: Artificial intelligence. 3rd edition. Addison Wesley, Reading, MA, 1992.
Wage rigidities and labour market performance in the Czech Republic

Daniel Němec1, Martin Macíček2

Abstract. The main goal of our contribution is to evaluate the impact of structural characteristics of the Czech labour market on the business cycle and to quantify the role of wage rigidities in the labour market performance. The Czech economy is represented by a small dynamic stochastic general equilibrium model of an open economy. The model economy consists of three types of agents: firms, households and central bank. Model parameters are estimated using the quarterly data of the Czech Republic for the period from the first quarter of 2000 to the last quarter of 2014. The comparison of two alternative wage setting model schemes shows that the Nash bargaining wage setting with the real wage persistence fits the data better than the Calvo type of wage rigidities modelling approach. The existence of hiring costs in the Czech economy was proved as significant and the estimated hiring costs account for 0.77% of the gross domestic product.

Keywords: wage rigidities, labour market frictions, DSGE model, hiring costs, Czech labour market

JEL classification: E32, J60
AMS classification: 91B40
1 Introduction
The aim of this paper is to evaluate the impact of structural characteristics of the Czech labour market on the business cycle and to quantify the role of wage rigidities in labour market performance, both in the period of growth and prosperity and in the deep fall into the worldwide recession that started in 2008 in the USA with the severe problems of the mortgage sector and spread rapidly throughout the whole world. To achieve this goal we make use of a Dynamic Stochastic General Equilibrium (DSGE) model of a small open economy. Our paper extends the previous investigations of the efficiency and flexibility of the Czech labour market carried out by Němec [5] and Němec [6]. We incorporate two types of wage rigidities into the model in order to take into consideration the complicated process of wage setting and to permit involuntary unemployment in the model. Since firms are nowadays changing their hiring strategies and are far more prone to find and train their own talent rather than relying on short-term working relationships, the model considers hiring costs, which can be considerably high.
2 The Model
We use a medium-scale small open economy (SOE) DSGE model presented in Sheen and Wang [7]. The authors build the model by further developing the works of Adolfson et al. [1] and Blanchard and Galí [2]. The model economy consists of three types of agents: firms, households and the central bank. We assume that the government distributes all taxes to households as a lump-sum payment, so there is no need to introduce the government in the model explicitly. Firms are divided into four categories. Domestic intermediate goods-producing firms employ labour and capital services from households, depending on the relative wage and the rental price of capital, to produce intermediate goods. These goods are further sold to domestic final producers, who combine intermediate goods and transform them into homogeneous goods.
1 Masaryk University, Faculty of Economics and Administration, Department of Economics, Lipová 41a, 602 00 Brno, Czech Republic, nemecd@econ.muni.cz
2 Masaryk University, Faculty of Economics and Administration, Department of Economics, Lipová 41a, 602 00 Brno, Czech Republic, 369753@mail.muni.cz
Figure 1 Simplified diagram of the model flows
These goods, which can be used either for consumption or investment, are sold both to domestic households and abroad. Trading with foreign economies is ensured by the existence of importing and exporting firms. The New Keynesian framework assumes monopolistic competition, thus the firms themselves set their prices optimally subject to a Calvo [3] type pricing mechanism. Since inflation is assumed to be well maintained by the central bank, firms which do not optimize their prices in the current period simply index their prices to inflation. Firms have to finance all their wage bills by borrowing at the nominal cash rate. Infinitely lived optimizing households gain utility from leisure, real balances and consumption, which is subject to habit formation. They offer capital and labour to intermediate goods-producing firms. These employers face hiring costs when employing a new worker. These costs can take the form of initial training for a new employee, the time until she starts to do her job properly, the time of her colleagues spent giving her advice, the process of choosing the right candidates, the costs of assessment centres, etc. Firms internalize these costs when making their price decisions, making inflation dependent on the lagged, current and expected future unemployment rate. Furthermore, two alternative schemes of wage setting are considered. Nominal wage rigidities arise from the assumption that labour is differentiated, giving workers the power to set wages according to the Calvo [3] type of nominal rigidities. The real wage rigidities alternative is introduced by making real wages depend on the lagged real wage and the current Nash bargaining wage. The central bank conducts monetary policy by applying a Taylor-type interest rate rule. Further insight into the functioning of the model is provided by the simplified diagram displayed in Figure 1.
3 Data and methodology
To estimate the model we use Czech quarterly time series from 2000Q1 to 2014Q4. The data are taken from the databases of the Czech National Bank, the Czech Statistical Office, the Ministry of Labour and Social Affairs and Eurostat. The Economic and Monetary Union of the European Union is perceived as the foreign economy. A more detailed description of the original time series is offered in Table 1. The Kalman filter is employed to evaluate the likelihood function of the model and two Metropolis-Hastings chains are run to estimate the posterior density. One million draws are taken and 70% are discarded in order to minimize the impact of initial values. We also try to control the variance of the candidate distribution to achieve an acceptance ratio of draws around 0.3. The foreign economy is modeled independently as a VAR(1) process. Many of the model parameters are calibrated. The discount factor β and the steady-state level of unemployment U are set to match the observed interest rate and unemployment rate of the data sample. Parameter ϑ is set to 1, according to Blanchard and Galí [2], implying a unit elasticity of hiring cost to the current labour market conditions. Following Jääskelä and Nimark [4], the steady-state level of domestic goods inflation π^d is set to 1.005 and the shares of consumption and investment in imports are set to ωc = 0.2 and ωi = 0.5, respectively. The curvature of the money demand function σq = 10.62, the constant determining the level of utility the households gain from real balances Aq = 0.38, the elasticity of labour supply σL = 1 and
Time series | Description | Detail
CPI | Consumer price index | Seasonally adjusted, 2014 = 100
Wnom | Nominal wages |
C | Real consumption |
I | Real investment | Gross fixed capital formation
X | Real exports |
M | Real imports |
Y | Real GDP |
Ynom | Nominal GDP |
R | Nominal interest rate | 3M Interbank Rate - PRIBOR
U | Unemployment rate |
E | Nominal exchange rate | Exchange Rate CZK/EUR
Yst | Foreign real GDP | GDP EMU
CPIst | Foreign CPI | CPI EMU, 2014 = 100
Rst | Foreign nominal interest rate | 3M EURIBOR
RER | Real exchange rate | = E × (CPI/CPIst)
DI | GDP deflator | = 100 × (Ynom/Y)
W | Real wages | = 100 × (Wnom/DI)
L | Labour force | Seasonally adjusted
Lgr | Labour force growth | = log(Lt) − log(Lt−1)

Table 1 Original time series description
utilization cost parameter σa ≡ a″(1)/a′(1) = 0.049 are set according to Adolfson et al. (2007) [1]. The capital share α, the separation rate δ and the depreciation rate δk are set to match the Czech macroeconomic conditions.

Parameter | Description | Value
β | discount factor | 0.99
δ | separation rate | 0.1
ϑ | elasticity of hiring cost to labour market condition | 1
δk | depreciation rate | 0.02
α | capital share | 0.35
ωc | import share of consumption good | 0.2
ωi | import share of investment good | 0.5
σq | money demand curvature parameter | 10.62
σL | labour supply elasticity | 1
AL | labour dis-utility constant | 1
Aq | money utility constant | 0.38
σa | utilization cost parameter | 0.049
π^d | steady-state level of domestic goods inflation | 1.005
U | steady-state level of unemployment | 0.071

Table 2 Calibrated parameters
All calibrated model parameters are summarized in Table 2. For the prior definitions we use the settings of Adolfson et al. [1] and Jääskelä and Nimark [4]. We set these priors as symmetrical as possible to prevent a potential distortion of the posterior estimates.
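The estimation routine described in this section (random-walk Metropolis–Hastings, a 70% burn-in and a proposal variance tuned towards an acceptance ratio of about 0.3) can be illustrated by a generic sketch. The code below is not the authors' implementation; the function log_post, the starting point theta0 and the proposal scale are hypothetical placeholders, and in the actual application the log-posterior would be evaluated with the Kalman filter.

```python
import numpy as np

def rw_metropolis(log_post, theta0, n_draws=100_000, burn_in=0.7, scale=0.1, seed=0):
    """Generic random-walk Metropolis-Hastings sampler (the paper uses one million draws).

    `scale` is the proposal standard deviation one would tune by hand so that the
    acceptance ratio ends up near 0.3, as described in the text."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    draws = np.empty((n_draws, theta.size))
    accepted = 0
    for k in range(n_draws):
        proposal = theta + scale * rng.standard_normal(theta.size)
        lp_prop = log_post(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis acceptance step
            theta, lp = proposal, lp_prop
            accepted += 1
        draws[k] = theta
    kept = draws[int(burn_in * n_draws):]          # discard the initial 70 % of draws
    return kept, accepted / n_draws
```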
4 Estimation results and model assessment
In this paper two rival models are considered: the RWHC model containing real wage rigidities and hiring costs, and the alternative NWHC model that assumes nominal wage rigidities together with hiring costs. We compare the posterior log-likelihoods of the models using the Bayes factor. The RWHC model provides a better data fit compared to its rival. The logarithm of the Bayes factor under the null hypothesis of nominal wage rigidities reaches 15, meaning the model with real wage rigidities is strongly favored by the data. The results of the benchmark model estimation are depicted in detail in Table 3 and Table 4. The posterior estimate of the habit formation parameter b is 0.78. This indicates a substantially high level of household preference for smooth consumption paths, i.e., households are willing to save and borrow money to keep their level of consumption relatively stable over time. The key parameter in the labour market setup is the constant B, which determines the steady-state level of hiring costs. The point estimate of this parameter is 0.19 with a 90% confidence interval ranging from 0.12 to 0.26. This result indicates significant support for hiring costs in the data. Although the hiring costs are not included in the theoretical model's output, we are able to derive the hiring-costs-to-GDP ratio using some algebraic manipulations. The resulting share is 0.77% with the 90% confidence interval between 0.48 and
1.08 percent. The estimated posterior mean of parameter B is a bit smaller under nominal wage rigidity, with the value of B = 0.15 and the resulting hiring-costs-to-GDP ratio of 0.60%.

Parameter | Description | Prior: Distribution, Mean, Std | Posterior: Mean, Std, 5%, 95%
b | habit formation | Beta, 0.65, 0.10 | 0.78, 0.05, 0.71, 0.86
B | tightness coefficient | Normal, 0.12, 0.05 | 0.19, 0.04, 0.12, 0.26
S̃″ | curvature of investment adjustment cost | Normal, 7.694, 4.00 | 11.31, 2.68, 6.83, 15.57
φ̃a | risk premium | InvGamma, 0.01, 0.01 | 0.52, 0.36, 0.09, 1.00
f | real wage AR(1) persistence | Normal, 0.50, 0.20 | 0.42, 0.09, 0.27, 0.57
ηc | imported consumption elasticity | InvGamma, 1.20, 5.00 | 4.44, 1.22, 2.51, 6.27
ηi | imported investment elasticity | InvGamma, 1.20, 5.00 | 1.18, 0.14, 1.05, 1.36
ηf | export elasticity | InvGamma, 1.20, 5.00 | 6.01, 1.20, 4.03, 7.90
µz | steady-state growth rate | Normal, 1.0025, 0.001 | 1.004, 0.001, 1.002, 1.005
λd | s.-s. markup: domestic | InvGamma, 1.20, 3.00 | 1.91, 0.26, 1.47, 2.32
λmc | s.-s. markup: imported-consumption | InvGamma, 1.20, 3.00 | 1.09, 0.03, 1.04, 1.13
λmi | s.-s. markup: imported-investment | InvGamma, 1.20, 3.00 | 1.11, 0.04, 1.05, 1.17
Calvo lottery
ξd | domestic firm | Beta, 0.60, 0.15 | 0.35, 0.07, 0.23, 0.45
ξmc | consumption import firm | Beta, 0.60, 0.15 | 0.36, 0.06, 0.25, 0.46
ξmi | investment import firm | Beta, 0.60, 0.15 | 0.56, 0.15, 0.31, 0.80
ξx | exporter | Beta, 0.60, 0.15 | 0.57, 0.06, 0.47, 0.68
Monetary policy parameters
ρR | interest rate smoothing | Beta, 0.85, 0.05 | 0.78, 0.03, 0.73, 0.83
rπ | inflation response | Normal, 1.70, 0.30 | 2.46, 0.22, 2.10, 2.82
ry | output response | Normal, 0.125, 0.05 | 0.04, 0.03, -0.02, 0.10
rs | real exchange rate response | Normal, 0.00, 0.05 | 0.004, 0.04, -0.07, 0.08
r∆π | inflation change response | Normal, 0.00, 0.10 | 0.15, 0.08, 0.02, 0.27
r∆y | output change response | Normal, 0.00, 0.10 | 0.16, 0.05, 0.07, 0.24
Persistence parameters
ρζc | consumption preference | Beta, 0.50, 0.15 | 0.55, 0.11, 0.37, 0.73
ρζN | labour preference | Beta, 0.50, 0.15 | 0.57, 0.14, 0.34, 0.79
ρµz | permanent technology | Beta, 0.50, 0.15 | 0.66, 0.08, 0.55, 0.79
ρε | temporary technology | Beta, 0.50, 0.15 | 0.89, 0.05, 0.82, 0.97
ρφ | risk premium | Beta, 0.50, 0.15 | 0.59, 0.16, 0.33, 0.85
ρΓ | investment-specific technology | Beta, 0.50, 0.15 | 0.59, 0.08, 0.45, 0.72
ρz̃∗ | asymmetric foreign technology | Beta, 0.50, 0.15 | 0.52, 0.15, 0.27, 0.77
ρλd | markup: domestic | Beta, 0.50, 0.15 | 0.83, 0.09, 0.70, 0.95
ρλmc | markup: imported-consumption | Beta, 0.50, 0.15 | 0.88, 0.05, 0.81, 0.95
ρλmi | markup: imported-investment | Beta, 0.50, 0.15 | 0.50, 0.15, 0.26, 0.75
ρλx | markup: export | Beta, 0.50, 0.15 | 0.87, 0.07, 0.78, 0.96
Table 3 Prior and posterior
The elasticity of investment with respect to the price of existing capital can be derived from the first order condition of households choosing the level of investment that maximizes their utility function and linearizing this condition around the steady state. Solving the resulting equation forward and inserting back the original model variables, we obtain the expression from which the elasticity can easily be derived. We use the equation to determine the elasticity of investment to a temporary rise in the price of currently installed capital as 1/((µz)² S̃″) and the elasticity of investment to a permanent rise in the price of installed capital as 1/((µz)² S̃″ (1 − β)). The posterior estimate of the investment adjustment cost S̃″ is 11.31. Evaluating these expressions at our point estimates delivers elasticities of 0.088 for a temporary rise of the price and 8.78 for a permanent rise, respectively. The posterior estimate of the risk premium parameter φ̃a = 0.52 implies (using the uncovered interest rate parity condition) that a 1 percent increase in net foreign assets reduces the domestic interest rate by 0.52 percent. The point estimate of the real wage persistence f reached the value 0.42, implying a significant degree of real wage rigidity in the labour market. The comparison of the values of the imported consumption ηc and investment ηi elasticity parameters reveals that the demand for consumption of foreign goods is significantly more sensitive to changes in relative prices than the demand for investment goods. The export elasticity, with its value ηf = 6.01, is the most sensitive. The point estimate of µz at 1.004 suggests that the economy grows only slowly in the steady state; this could well be caused by the time period chosen, where the economic crisis covers a substantial part of the data range. From the estimates of the steady-state markups we can derive the information that domestic firms have the most market power compared to other types of firms, since they can afford to set the highest markup. The estimates of the Calvo lottery parameters vary greatly across the different firm types. The lowest estimate of the Calvo parameter belongs to the domestic firms. The value ξd = 0.35 implies an average price fix duration of slightly more than 1.5 quarters.
The second least rigid price setting mechanism belongs to the consumption importing firms, with an average price fix duration of 1.6 quarters. The results indicate that the investment importing firms together with the exporters adjust their prices the least frequently, with average price fix durations of 2.27 and 2.32 quarters, respectively.

Parameter | Description | Prior: Distribution, Mean, Std | Posterior: Mean, Std, 5%, 95%
σζc | preference shock: consumption | InvGamma, 0.50, 3.00 | 3.29, 0.83, 2.05, 4.46
σζN | preference shock: labour supply | InvGamma, 0.50, 3.00 | 2.70, 0.58, 1.77, 3.61
σµz | permanent technology shock | InvGamma, 0.50, 3.00 | 0.50, 0.10, 0.35, 0.66
σε | temporary technology shock | InvGamma, 0.50, 3.00 | 0.43, 0.06, 0.32, 0.53
σφ | risk premium shock | InvGamma, 0.50, 3.00 | 0.41, 0.18, 0.15, 0.66
σΓ | investment specific technology shock | InvGamma, 0.50, 3.00 | 11.21, 3.16, 6.28, 16.13
σz̃∗ | asymmetric foreign technology shock | InvGamma, 0.50, 3.00 | 0.22, 0.07, 0.12, 0.32
σR | monetary policy shock | InvGamma, 0.50, 3.00 | 0.28, 0.06, 0.18, 0.37
σλd | markup shock: domestic production | InvGamma, 0.50, 3.00 | 1.22, 0.38, 0.70, 1.71
σλmc | markup shock: imported-consumption | InvGamma, 0.50, 3.00 | 3.89, 0.93, 2.58, 5.21
σλmi | markup shock: imported-investment | InvGamma, 0.50, 3.00 | 0.45, 0.33, 0.12, 0.90
σλx | markup shock: export | InvGamma, 0.50, 3.00 | 4.96, 1.03, 3.43, 6.41
Table 4 Standard errors of exogenous shocks
The point estimates of the monetary policy parameters reveal that the central bank makes an effort to keep the interest rate path relatively smooth, ρR = 0.78, and responds more than proportionately to inflation when setting the nominal interest rate, rπ = 2.46. The parameter of the response to the real exchange rate is not significant, therefore it seems that the Czech National Bank is not directly influenced by its changes when targeting interest rates. The direct response to output turned out to be insignificant too. Although the values of the last two parameters are small, they are significant. Thus the central bank puts a little weight on direct responses to the inflation change and the output change.
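As a quick arithmetic check, the investment elasticities quoted above follow directly from the reported point estimates; the short snippet below (an illustration, not part of the original paper) merely re-evaluates the two expressions 1/((µz)² S̃″) and 1/((µz)² S̃″ (1 − β)).

```python
mu_z, S_pp, beta = 1.004, 11.31, 0.99                     # posterior means of µz and S~'', calibrated β
elasticity_temporary = 1 / (mu_z**2 * S_pp)               # ≈ 0.088
elasticity_permanent = 1 / (mu_z**2 * S_pp * (1 - beta))  # ≈ 8.8
print(round(elasticity_temporary, 3), round(elasticity_permanent, 2))
```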
Figure 2 Smoothed paths of labour market variables (panels: number of hiring Ht, unemployment after hiring Ut, labour market tightness xt, hiring cost gt, real wage wt, labour supply preference ζtN; each panel shows the smoothed variable, its steady-state level and the 90% HPD bounds over 2000–2015)
We proceed with the evaluation of the Czech labour market and its structural changes throughout the last 15 years, using our benchmark model. To analyze the development of the Czech labour market we focus on the following variables: Ht (number of hiring – the number of hires is the difference between current and lagged employment, after accounting for separation), Ut (unemployment after hiring – due to the assumption of a normalized labour force, we can write Ut−1 = 1 − Nt−1, where Nt is the employment after hiring ends in time t), xt (labour market tightness – can be interpreted as the job-finding rate from the perspective of the unemployed), gt (stationary hiring cost), wt (real wage – compiled as a weighted sum of the Nash bargaining wage and its own lagged value), ζtN (labour supply preference – the time-varying weight that households assign to the labour dis-utility relative to the utility gained from consumption and real balances). In Figure 2 the smoothed paths of these variables, together with 90% HPD intervals, are depicted. The shaded area denotes the period of the latest recession starting in 2008. This moment is a clear break point for the Czech labour market and the worldwide economy in general. The number
of hiring oscillated until 2006, when a short period of growth started. In 2008 it reached its peak and then collapsed during the most severe period of the financial crisis. From 2011 until now this variable has regained stability and returned to its original path. Unemployment was declining continuously until 2008, when the unemployment rate hit bottom. After a short period of gradual rise the variable returned to its steady-state level. In the last year unemployment seems to be at the beginning of another period of decline. Labour market tightness and hiring cost are tightly bound together. Both variables were growing steadily and hit their peak in 2008, corresponding to the lowest value of the unemployment rate. When the number of unemployed is low, the labour market is tight and therefore it is more expensive for firms to hire new workers. Following a sudden dip, both labour market tightness and hiring costs fluctuated slightly and in 2014 took a growth path. The real wage experienced a long period of remarkable growth. In 2006 the level of real wages stabilized and since then has fluctuated around the steady-state level. During the last periods it seems that real wages are slightly decreasing. The last variable depicted in Figure 2 is the labour supply preference. We can see that from 2002 to 2005 households were giving a relatively small weight to the dis-utility from working. In 2008 this variable shot up and returned to the steady-state level in 2012. During this period the labour supply preference was very high relative to the consumption and real balances preference, meaning that agents were hesitating a lot over whether to actively participate in the labour market. During the last periods the variable has been decreasing significantly.
5 Conclusion
The comparison of two alternative wage setting model schemes revealed that the Nash bargaining wage setting with real wage persistence, reflecting the cautious approach of agents, fits the data better than the Calvo type of wage rigidities modeling approach. During the boom preceding the crisis, the number of hirings was exceptionally high, corresponding to the period of the lowest unemployment throughout the last fifteen years. This boom was accompanied by increased labour market tightness, which further resulted in costly hiring for firms taking on new employees. The outbreak of the sharp economic slump that followed decreased new hires rapidly, unemployment shot up and households started to consider their labour supply preferences more seriously. The latest improvement in the unemployment rate, the tendencies in the labour-market-related variables and the optimistic economic atmosphere give hints of hope for the Czech labour market and the economy in general.
Acknowledgements This work was supported by funding of specific research at Faculty of Economics and Administration, project MUNI/A/1040/2015. This support is gratefully acknowledged.
References
[1] Adolfson, M., Laséen, S., Lindé, J., and Villani, M.: Bayesian estimation of an open economy DSGE model with incomplete pass-through. Journal of International Economics 72 (2007), 481–511.
[2] Blanchard, O., and Galí, J.: Labor Markets and Monetary Policy: A New Keynesian Model with Unemployment. American Economic Journal: Macroeconomics 2 (2010), 1–30.
[3] Calvo, G. A.: Staggered prices in a utility-maximizing framework. Journal of Monetary Economics 12 (1983), 383–398.
[4] Jääskelä, J. P., and Nimark, K.: A Medium-Scale New Keynesian Open Economy Model of Australia. Economic Record 87 (2011), 11–36.
[5] Němec, D.: Investigating Differences Between the Czech and Slovak Labour Market Using a Small DSGE Model with Search and Matching Frictions. The Czech Economic Review 7 (2013), 21–41.
[6] Němec, D.: Measuring Inefficiency of the Czech Labour Market. Review of Economic Perspectives – Národohospodářský obzor 15 (2015), 197–220.
[7] Sheen, J., and Wang, B. Z.: An Estimated Small Open Economy Model with Labour Market Frictions. SSRN Electronic Journal (2012).
Parametric rules for stochastic comparisons Sergio Ortobelli1, Tommaso Lando2, Carla Nardelli3 Abstract. In the financial literature, the second-degree stochastic dominance (2-SD) and the increasing and convex order (icx) are frequently used to rank distributions in terms of expectation and risk aversion. These dominance relations may be applied to the empirical distribution functions, obtained by the historical observations of the financial variables, or to parametric distributions, used to model such variables. In this regard, it is well documented that empirical distributions of financial returns are often skewed and exhibit fat tails. For this reason, we focus on the stable Paretian distribution, that has been shown to be especially suitable for modeling financial data. We analyze these two different approaches, namely parametric and non-parametric, and compare them by performing an empirical study. The analyzed dataset consists of the returns of a set of financial assets among the components of the S&P500. Keywords: stochastic dominance, heavy tails, risk aversion, stable distribution. JEL Classification: C44 AMS Classification: 90C15
1 Introduction
Stochastic dominance relations are aimed at ranking probability distributions according to an order of preference. Their usefulness in many fields of study, such as economics and finance, is well known. In this regard, in a financial framework, investors are generally interested in ranking distributions of financial random variables, such as asset returns, according to their gain/loss expectations and their attitude towards risk. Several dominance rules have been proposed in the financial literature that conform to different risk profiles. These rules may be applied to the empirical distribution functions or, sometimes, to parametric models used for fitting financial distributions. In the latter case, dominance rules can be re-formulated as a comparison between parametric values, which is clearly an advantage in terms of computation as well as ease of interpretation. With regard to parametric models, in the financial literature it is well known that asset returns are not normally distributed. Several studies by Mandelbrot (see e.g. [3,4]) and Fama (see e.g. [1]) recognized an excess of kurtosis and skewness in the empirical distributions of financial assets, which led to the rejection of the assumption of normality. It has been widely documented that the distribution of financial assets is generally heavy tailed, so that the main alternative models that have been proposed in the literature, among which we may cite the Student's t distribution, the generalized hyperbolic distribution [8] and the stable Paretian distribution [9], do not necessarily have finite variance. In particular, the stable Paretian distribution is mainly justified by a generalized version of the central limit theorem, ascribable to Gnedenko and Kolmogorov [2], which basically states that the sum of a number of i.i.d. random variables with heavy tailed distributions (the classical assumption of finite variance is relaxed) can be approximated by a stable Paretian model. This paper is concerned with the issue of analyzing the stochastic dominance rules that are the most suitable for dealing with heavy tailed distributions, such as the stable Paretian one. Generally, one of the concepts most frequently employed for the comparison of financial random variables is the second-degree stochastic dominance (2-SD), which conforms to the idea that distributions with higher expectation and an inferior degree of dispersion (to be understood as the risk) should be preferred. Complementarily to the 2-SD, we may also recall the increasing and convex order (icx), which conforms to the idea that distributions with higher expectation and higher degree of dispersion should be preferred. However, it may often happen that the 2-SD is not verified, so that some finer ranking criteria need to be introduced. We may cite the i-th-degree stochastic dominance (i-SD) with i > 2, such as the 3-SD, which obeys the principle of downside risk aversion, although i-SD requires a finite moment of order i−1 in order to be defined. Hence, it is sometimes more suitable to deal with stochastic dominance relations that are based on i-th degree integration of the quantile distribution, that is, the i-th degree inverse stochastic dominance
1 University of Bergamo, department SAEMQ, via dei Caniana 2 Bergamo (Italy); and VSB-TU Ostrava, department of Finance, Sokolská třída 33 Ostrava (Czech Republic), sergio.ortobelli@unibg.it.
2 University of Bergamo, department SAEMQ, via dei Caniana 2 Bergamo (Italy); and VSB-TU Ostrava, department of Finance, Sokolská třída 33 Ostrava (Czech Republic), tommaso.lando@unibg.it
3 University of Bergamo, department SAEMQ, via dei Caniana 2 Bergamo (Italy), carla.nardelli@unibg.it
(i-ISD, Muliere and Scarsini 1989). Both i-SD and i-ISD attach more weight to the lower tail of the distribution, emphasizing the degree of risk aversion towards the worst case scenario. Differently, some alternative approaches have recently been proposed in order to emphasize both tails of the distribution, such as the moment dispersion order [5], which is especially suitable for dealing with stable Paretian distributions. Recently, Ortobelli et al. [5] studied the parametric relations that allow one to verify the 2-SD and the icx order, under the assumption that the random variables are stable distributed. In an empirical context, these parametric rules can serve as an approximation to 2-SD or to icx. In this regard, we argue that it would be useful to understand if these approximations are close to what is observed, that is, 2-SD or icx between empirical distributions. In Section 2, we study the different approaches (non-parametric and parametric) from a theoretical point of view and propose some parametric ranking criteria, based on the results in [5], namely the asymptotic 2-SD and the asymptotic icx. These ordering rules serve as proxies for the 2-SD and the icx, respectively. Then, in Section 3 we perform an empirical analysis of a set of financial assets (among the components of the S&P500) with the aim of identifying the couples of distributions that cannot be ranked according to the different orderings. In particular, we compare the results obtained with the parametric and the non-parametric approaches.
2 Methods
Let FX be the distribution function of the random variable X. Stochastic dominance relations generally determine preorders in the set of random variables, or, equivalently, distribution functions. We recall that a preorder is a binary relation ≼ over a set S that is reflexive and transitive. In particular, observe that a preorder ≼ does not generally satisfy the antisymmetry property (that is, x ≼ y and y ≼ x do not necessarily imply x = y) and it is generally not total (that is, each pair x, y in S is not necessarily related by ≼). The main stochastic dominance rules can be defined as follows.

Definition 1. We say that X first-degree dominates (1-SD) Y, X ≥1 Y, iff
    FX(t) ≤ FY(t), ∀t ∈ ℝ,
or, equivalently, E[u(X)] ≥ E[u(Y)] for every increasing function u such that the expectations exist.

The 1-SD relation is quite a strong condition, often referred to as the concept of one random variable being “stochastically larger” than another. The 1-SD implies the following two weaker criteria, which conform to the idea that distributions with higher expectation and inferior (or greater) dispersion, or “risk”, should be preferable for any investor who is risk averse (or risk loving, respectively).

Definition 2. We say that X second-degree dominates (2-SD) Y, X ≥2 Y, iff
    ∫_{−∞}^{t} FX(s) ds ≤ ∫_{−∞}^{t} FY(s) ds, ∀t ∈ ℝ,
or, equivalently, E[u(X)] ≥ E[u(Y)] for every increasing and concave function u such that the expectations exist.

Definition 3. We say that X increasing-convex dominates (icx) Y, X ≥icx Y, iff
    ∫_{t}^{+∞} (1 − FX(s)) ds ≥ ∫_{t}^{+∞} (1 − FY(s)) ds, ∀t ∈ ℝ,
or, equivalently, E[u(X)] ≥ E[u(Y)] for every increasing and convex function u such that the expectations exist.
When 2-SD and/or icx cannot be verified, some weaker orders need to be introduced. In particular, we shall need the following order of dispersion with respect to a center, which can also be interpreted as an order of tail-weightness.

Definition 4. Let X and Y be random variables such that X, Y ∈ L^p, i.e. E|X|^p < ∞ for p ≥ 1. We say that X dominates Y with respect to the central moment dispersion (cmd) order and write X ≥cmd Y if and only if νX(p) ≤ νY(p), ∀p ≥ 1, where νZ(p) = sign(p) E(|Z − E(Z)|^p).

Stable distributions depend on location and scale parameters; moreover, they are identified by a parameter which specifies the shape of the distribution in terms of skewness (to be intended as the disproportion between the left and the right tails) and, more importantly, a parameter that describes the asymptotic behavior of the tails. In this paper we stress the critical role of the tail parameter (also known as the stability index).
Let X be a random variable in the domain of attraction of a stable law with finite mean E(X) = µ and infinite variance. Let {Xi}i∈ℕ be independent and identically distributed observations of X. Thus, we know there exist a sequence of positive real values {an}n∈ℕ and a sequence of real values {bn}n∈ℕ such that, as n → +∞,
    (1/an) ∑_{i=1}^{n} Xi + bn →d Sα,
where Sα ~ Sα(σ, β, µ) is an α-stable Paretian random variable, where 0 < α ≤ 2 is the so-called stability index, which specifies the asymptotic behavior of the tails, σ > 0 is the dispersion parameter, β ∈ [−1, 1] is the skewness parameter and µ ∈ ℝ is the location parameter. Observe that, in this paper, we consider the parameterization for stable distributions proposed by Samorodnitsky and Taqqu [10]. We recall that, if α < 2, then E|X|^p < ∞ for any p < α and E|X|^p = ∞ for any p ≥ α. Therefore, stable distributions do not generally have finite variance; this happens only when α = 2 (i.e. the Gaussian distribution, with E|X|^p < ∞ for any p). In this paper, we shall focus on the case α > 1.
Unfortunately, except in a few cases, we do not have a closed-form expression for the density (and thereby the distribution function) of the stable Paretian distribution, which is identified by its characteristic function. Therefore, it is not possible to verify 1-SD, 2-SD and icx by directly applying Definitions 1, 2 and 3. However, in a recent paper, Ortobelli et al. [5] have determined the parametric conditions under which it is possible to verify the 2-SD and the icx, under stable Paretian assumptions. We briefly summarize them in what follows.

Theorem 1. Let X1 ~ Sα1(σ1, β1, µ1) and X2 ~ Sα2(σ2, β2, µ2).
1. If α1 > α2 > 1, β1 = β2, σ1 ≤ σ2 and µ1 ≥ µ2, then X1 ≥2 X2. If µ2 ≥ µ1, then X2 ≥icx X1.
2. If α1 = α2 > 1, σ1 = σ2, µ1 = µ2 and |β1| < |β2|, then X1 ≥cmd X2.
Theorem 1 states that i) the stability index is crucial for establishing a dominance (2-SD or icx), whilst ii) the skewness parameter may yield the cmd order, that is, a weaker order of dispersion (or risk) around the mean value. Put otherwise, an inferior degree of skewness generally corresponds to a less spread out distribution. Therefore, we argue that a risk averse investor would generally prefer X1 to X2 if α1 ≥ α2 > 1, |β1| ≤ |β2|, σ1 ≤ σ2 and µ1 ≥ µ2, whilst a risk lover would generally prefer X2 to X1 if α1 ≥ α2 > 1, |β1| ≤ |β2|, σ1 ≤ σ2 and µ2 ≥ µ1. This can be formalized by the following two definitions.

Definition 5. Let X1 ~ Sα1(σ1, β1, µ1) and X2 ~ Sα2(σ2, β2, µ2) be stable distributed random variables. We say that X1 dominates X2 with respect to the second degree asymptotic dominance (2-ASD) and write X1 ≥A2 X2 iff α1 ≥ α2 > 1, |β1| ≤ |β2|, σ1 ≤ σ2 and µ1 ≥ µ2, with at least one strict inequality.

Definition 6. Let X1 ~ Sα1(σ1, β1, µ1) and X2 ~ Sα2(σ2, β2, µ2) be stable distributed random variables. We say that X2 dominates X1 with respect to the asymptotic increasing and convex order (A-icx) and write X2 ≥A-icx X1 iff α1 ≥ α2 > 1, |β1| ≤ |β2|, σ1 ≤ σ2 and µ2 ≥ µ1, with at least one strict inequality.

Note that the value of the location parameter plays a critical role in both Definitions 5 and 6. We expect that 2-SD and 2-ASD, as well as icx and A-icx, tend to be equivalent when the stable distribution fits the empirical distribution well (see the recent studies [5] and [6]). The following empirical analysis is aimed at the verification and the analysis of the dominance rules represented by Definitions 2, 3, 5 and 6.
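Since Definitions 5 and 6 reduce to inequalities between the four estimated parameters, they are straightforward to code. The sketch below is only illustrative: the tuple layout (α, σ, β, µ) and the function names are our own choices and are not taken from the original paper.

```python
def dominates_2asd(p1, p2):
    """X1 >=_{A2} X2 (Definition 5): alpha1 >= alpha2 > 1, |beta1| <= |beta2|,
    sigma1 <= sigma2 and mu1 >= mu2, with at least one strict inequality."""
    a1, s1, b1, m1 = p1
    a2, s2, b2, m2 = p2
    weak = (a1 >= a2 > 1) and (abs(b1) <= abs(b2)) and (s1 <= s2) and (m1 >= m2)
    strict = (a1 > a2) or (abs(b1) < abs(b2)) or (s1 < s2) or (m1 > m2)
    return weak and strict

def dominates_aicx(p1, p2):
    """X2 >=_{A-icx} X1 (Definition 6): same conditions except mu2 >= mu1."""
    a1, s1, b1, m1 = p1
    a2, s2, b2, m2 = p2
    weak = (a1 >= a2 > 1) and (abs(b1) <= abs(b2)) and (s1 <= s2) and (m2 >= m1)
    strict = (a1 > a2) or (abs(b1) < abs(b2)) or (s1 < s2) or (m2 > m1)
    return weak and strict
```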
3 Empirical analysis
The results of Section 2 have several applications in different areas of study, because of the fundamental role of the stable distribution, discussed in the introduction. It is well known that the stable Paretian model is especially suitable for approximating the empirical distribution of the returns of financial assets. In this section, we apply the stochastic dominance rules stated above to real financial data, in order to verify empirically the conformity between the parametric and the non-parametric approaches. The dataset consists of the components of the S&P500, from 7/1/1998 until 7/4/2016. We assume that the number of trading days per year is 250 and, accordingly, we consider two different timeframes for collecting data: 125 days (6 months) or 500 days (2 years). Moreover, we update (recalibrate) the datasets within a moving window of historical observations (i.e. 125 or 500 days) every 20 days (monthly) or 125 days (6 months); thus we obtain 2×2 different sequences of empirical distribution functions. We are mainly concerned with the differences between the parametric and non-parametric rules. The non-parametric approach consists in applying Definitions 2 and 3 (2-SD and icx) to the empirical distribution function.
This is done for each couple of distributions, that is, for each couple of assets, within a given timeframe. If we assume that the S&P500 at time t consists of n(t) assets (generally 500), we have approximately n(t)×(n(t)−1) couples of distributions (yielded by 125 or 500 observations, according to the moving window) at each recalibration time. The parametric approach, instead, is performed as follows. For each empirical distribution, we estimate the unknown parameters (α, σ, β, µ) of the stable Paretian distribution with the maximum likelihood (ML) method (see [7]); then we apply Definitions 5 and 6 in order to verify whether the conditions for 2-ASD or A-icx are satisfied, for each couple of distributions. In our analysis, we especially focus on the number (or percentage) of assets that are not dominated by any of the other assets. Non-dominated distributions may consist of one distribution that dominates all the others and, especially, of a set of distributions among which it is not possible to establish a ranking. The results are summarized in Table 1.

criteria | 6 months, rec 20 | 6 months, rec 125 | 2 years, rec 20 | 2 years, rec 125
2-SD | 0.058 | 0.059 | 0.074 | 0.078
2-ASD | 0.127 | 0.117 | 0.151 | 0.142
icx | 0.018 | 0.017 | 0.030 | 0.031
A-icx | 0.079 | 0.078 | 0.090 | 0.087
Table 1 Percentages of non-dominated distributions according to different timeframes and different recalibration times
Surprisingly, the percentage of couples that cannot be ranked according to any preorder is always very low, in that the ranking between the assets is “almost” complete in some cases. In particular, icx and A-icx obtain the lowest percentages of non-dominated couples, thus risk seekers could almost reach complete unambiguous rankings. We observe that, by increasing the timeframe of historical observations, the percentage of non-dominated couples generally increases (slightly). With regard to the 2-SD and the icx, this means that, by increasing the number of observations, the number of cases when the curves t ↦ ∫_{−∞}^{t} FX(s) ds or t ↦ ∫_{t}^{+∞}(1 − FX(s)) ds intersect generally increases, as the shape of the empirical distribution may change with more freedom, in that it exhibits more flexibility. Similarly, with regard to the parametric approach, we argue that, by increasing the number of observations, the probability of obtaining couples of parametric values that do not comply with the criteria of Definitions 5 and 6 generally increases. Moreover, it should be stressed that the parametric rules (Definitions 5 and 6) represent a good indicator of the 2-SD and the icx, although they generally overestimate the percentage of non-dominated couples. In fact, this conversely means that the set of ranked couples is generally larger, so that if we verify the dominance relation of Definition 5 (or 6) w.r.t. the estimated parameters, it is likely that the conditions of Definition 2 (or 3) are verified as well. Put otherwise, the set of ranked couples according to Definitions 5 and 6 is approximately contained in the set of ranked couples according to Definitions 2 and 3.
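The non-parametric checks of Definitions 2 and 3 can also be implemented directly, using the identities ∫_{−∞}^{t} FX(s) ds = E[(t − X)⁺] and ∫_{t}^{+∞}(1 − FX(s)) ds = E[(X − t)⁺]; for empirical distributions both integrated curves are piecewise linear, so it suffices to compare them at the pooled sample points. The sketch below is illustrative only and assumes the two return samples are given as NumPy arrays.

```python
import numpy as np

def dominates_2sd(x, y, tol=1e-12):
    """Empirical X >=_2 Y: E[(t - X)^+] <= E[(t - Y)^+] for every t (Definition 2)."""
    grid = np.union1d(x, y)                       # kink points of both integrated ECDFs
    lx = np.array([np.mean(np.maximum(t - x, 0.0)) for t in grid])
    ly = np.array([np.mean(np.maximum(t - y, 0.0)) for t in grid])
    return bool(np.all(lx <= ly + tol))

def dominates_icx(x, y, tol=1e-12):
    """Empirical X >=_icx Y: E[(X - t)^+] >= E[(Y - t)^+] for every t (Definition 3)."""
    grid = np.union1d(x, y)
    ux = np.array([np.mean(np.maximum(x - t, 0.0)) for t in grid])
    uy = np.array([np.mean(np.maximum(y - t, 0.0)) for t in grid])
    return bool(np.all(ux >= uy - tol))
```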
4 Conclusion
We have examined the impact of different types of preorders on a real dataset. The aim was to stress the differences and the links between the parametric and non-parametric approaches by computing and investigating the number (percentage) of couples of distributions (i.e. empirical or fitted distributions) that can (or cannot) be dominated by others. The parametric approach may be useful for approximating the number of non-dominated empirical distributions. Moreover, we stress that the main advantage of the parametric method is its computational speed compared to the non-parametric approach, which requires the verification of an integral condition on the whole support.
Acknowledgements
This paper has been supported by Italian funds ex MURST 60% 2013 and 2014 and the MIUR PRIN MISURA Project, 2013-2015. The research was also supported through the Czech Science Foundation (GACR) under projects 13-13142S and 15-23699S. Moreover, the research was done through SP2014/16, an SGS research project of VSB-TU Ostrava, and furthermore by the European Social Fund in the framework of CZ.1.07/2.3.00/20.0296.
References
[1] Fama, E.: Mandelbrot and the stable Paretian hypothesis. Journal of Business 36 (1963), 420-429.
[2] Gnedenko, B. V., Kolmogorov, A. N.: Limit Distributions for Sums of Independent Random Variables. Addison-Wesley Mathematics Series, Addison-Wesley, Cambridge, 1954.
[3] Mandelbrot, B.: New methods in statistical economics. Journal of Political Economy 71 (1963), 421-440.
[4] Mandelbrot, B.: The variation of some other speculative prices. Journal of Business 40 (1967), 393-413.
[5] Ortobelli, S., Lando, T., Petronio, F., Tichý, T.: Asymptotic stochastic dominance rules for sums of i.i.d. random variables. Journal of Computational and Applied Mathematics 300 (2016), 432-448.
[6] Ortobelli, S., Lando, T., Petronio, F., Tichý, T.: Asymptotic multivariate dominance: a financial application. Methodology and Computing in Applied Probability (2016), 1-19. DOI 10.1007/s11009-016-9502-y.
[7] Nolan, J. P.: Maximum likelihood estimation and diagnostics for stable distributions. In: Lévy Processes (2001), 379-400, Birkhäuser, Boston.
[8] Platen, E., Rendek, R.: Empirical evidence on Student-t log-returns of diversified world stock indices. Journal of Statistical Theory and Practice 2(2) (2008), 233-251.
[9] Rachev, S. T., Mittnik, S.: Stable Paretian Models in Finance. Wiley, New York, 2000.
[10] Samorodnitsky, G., Taqqu, M. S.: Stable Non-Gaussian Random Processes. Chapman & Hall, 1994.
Vehicle Scheduling with Roundabout Routes to Depot Stanislav Palúch1, Tomáš Majer2 Abstract. The minimum fleet size vehicle scheduling problem is to arrange a given set of trips into the minimum number of running boards. This problem can be modeled as an assignment problem or a maximum flow problem, both with polynomial complexity. Slovak and Czech bus providers often require that vehicles visit the depot during the working day in order to deliver money or to undergo maintenance or refueling. This procedure is called a roundabout route to the depot. This paper studies ways to ensure that every running board contains a feasible roundabout route to the depot. The proposed procedure is as follows: first, calculate a bus schedule with the minimum number of vehicles. Then create a large set of fictive dummy trips starting and finishing in the depot with duration equal to the required length of the stay in the depot. Afterwards calculate a bus schedule with the minimum number of vehicles that contains exactly as many fictive trips as there are vehicles. A maximum group matching formulation of this problem is proposed and some computer experiments with the Gurobi solver for the corresponding MILP model are discussed. Keywords: vehicle scheduling, mixed linear programming, group matching. JEL classification: C44 AMS classification: 90C15
1 Introduction
A bus trip is a quadruple (dk, ak, uk, vk) where dk is departure time, ak is arrival time, uk is departure bus stop and vk is arrival bus stop of the trip k. Let M = m(u, v) be a time distance matrix determining the travel time m(u, v) from bus stop u to bus stop v. Trip j can be carried immediately after trip i by the same bus if
    dj ≥ ai + m(vi, uj),    (1)
i.e. if a bus, after arriving at the arrival bus stop vi of trip i, can pull to the departure bus stop uj of the trip j sufficiently early. In this case we will say that the trip j can be linked after trip i and we will write i ≺ j. A running board of a bus is a sequence of trips i1, i2, . . . , ir such that for every k, 1 ≤ k < r, it holds that ik ≺ ik+1. In other words, a running board is a sequence of trips which can be carried by the same bus in one day. It represents a day schedule of work for one bus. The goal of bus scheduling with the minimum number of buses is the following: given the set S of trips, arrange all trips from S into the minimum number of running boards. The resulting set of running boards is called a bus schedule. The relation ≺ on the set S can be modeled by a digraph G = (S, E) where E = {(i, j) | i ∈ S, j ∈ S, i ≺ j}.
1 University of Žilina, Department of mathematical methods and operations analysis, Univerzitná 8215/1, 010 26 Žilina, e-mail: stanislav.paluch@fri.uniza.sk.
2 University of Žilina, Department of mathematical methods and operations analysis, Univerzitná 8215/1, 010 26 Žilina, e-mail: tomas.majer@fri.uniza.sk.
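Condition (1) and the relation ≺ translate directly into code. The sketch below is illustrative only; it assumes each trip is stored as a tuple (d, a, u, v) and the time-distance matrix M as a nested dictionary, which are our own representation choices.

```python
from itertools import permutations

def can_link(trip_i, trip_j, m):
    """True when trip j can be carried immediately after trip i by the same bus,
    i.e. d_j >= a_i + m(v_i, u_j) -- condition (1), written i ≺ j."""
    d_i, a_i, u_i, v_i = trip_i
    d_j, a_j, u_j, v_j = trip_j
    return d_j >= a_i + m[v_i][u_j]

def linkage_arcs(trips, m):
    """Arc set E of the digraph G = (S, E), with trips indexed 0..n-1."""
    return [(i, j) for i, j in permutations(range(len(trips)), 2)
            if can_link(trips[i], trips[j], m)]
```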
Suppose that S = {1, 2, . . . , n}. Denote by xij a binary decision variable with the following meaning:
    xij = 1 if the trip j is linked immediately after trip i in a running board, and xij = 0 otherwise.    (2)
Imagine that we have a bus schedule with n = |S| buses – every bus makes only one trip. This situation corresponds to xij = 0 for all (i, j) ∈ E. When we realize one linkage for (i, j) ∈ E (which is indicated by setting xij = 1), we have saved one bus. Therefore, the more variables are equal to 1, the fewer vehicles are used in the corresponding bus schedule. In order to obtain a feasible bus schedule, at most one trip j can be linked after an arbitrary trip i and at most one trip i can be linked before an arbitrary trip j. These facts lead to the following mathematical model.
Maximize
    ∑_{(i,j)∈E} xij    (3)
subject to:
    ∑_{i: (i,j)∈E} xij ≤ 1    for j = 1, 2, . . . , n    (4)
    ∑_{j: (i,j)∈E} xij ≤ 1    for i = 1, 2, . . . , n    (5)
    xij ∈ {0, 1}    (6)
Mathematical model (3)–(6) is an instance of the assignment problem, therefore condition (6) can be replaced by xij ≥ 0 (7) and the linear program (3)–(5), (7) still has an integer solution.
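Because every trip appears at most once as a predecessor and at most once as a successor, model (3)–(7) can equivalently be solved as a maximum-cardinality bipartite matching, which is one way to see the polynomial complexity mentioned in the abstract; every matched pair saves one bus, so the minimum fleet size is n minus the number of matched pairs. The sketch below uses networkx and is only an illustration of this equivalence, not the authors' implementation.

```python
import networkx as nx
from networkx.algorithms import bipartite

def minimum_fleet_size(n, arcs):
    """Minimum number of running boards, given n trips and the arc set E
    of the linkage digraph (pairs (i, j) with i ≺ j)."""
    g = nx.Graph()
    tails = [("out", i) for i in range(n)]    # trip i in the role of a predecessor
    heads = [("in", j) for j in range(n)]     # trip j in the role of a successor
    g.add_nodes_from(tails, bipartite=0)
    g.add_nodes_from(heads, bipartite=1)
    g.add_edges_from((("out", i), ("in", j)) for i, j in arcs)
    matching = bipartite.maximum_matching(g, top_nodes=tails)
    return n - len(matching) // 2             # every matched pair saves one bus
```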
2 Model with depot visiting
Slovak and Czech bus providers often require that vehicles visit the depot during the working day in order to deliver money or to undergo maintenance or refueling. A visit to the depot can be modeled as a special trip with both arrival place and departure place equal to the depot and with arrival and departure times determined by the time interval necessary for all required procedures. In most cases this time interval is at least 30 minutes long and can be considered a safety break for the driver. In practice, buses staying in the depot do not have fixed time positions assigned; they are scheduled in weak traffic hours. Suppose we have calculated a bus schedule with the minimum number of vehicles r, so far without roundabout routes to the depot. We propose the following procedure in order to arrange the desired depot visits of all r buses: Propose the set D of dummy trips representing bus stays in the depot. Define the digraph
    G = (S ∪ D, A)    (8)
where S ∪ D is the union of the original set of trips and the set of all dummy trips – visits to the depot. The arc set A of G is the set of all ordered pairs (i, j) of elements of S ∪ D such that i ≺ j. The goal is to arrange all trips from S and r trips from D into the minimum number of running boards. We introduce two decision variables xij and zi with the following meaning:
    xij = 1 if the trip j ∈ S ∪ D is linked immediately after trip i ∈ S ∪ D in the same running board, and xij = 0 otherwise,    (9)
    zi = 0 if the dummy trip i ∈ D is chosen into a running board, and zi = 1 if the dummy trip i ∈ D is not used in any running board.    (10)
The number of chosen depot stays should be exactly equal to r (the minimum number of running boards without roundabout routes), therefore |D| − ∑_{i∈D} zi = r. If the dummy trip i ∈ D was not used, then all variables xij have to be equal to zero for all j ∈ V. This is ensured by the constraint
    zi + ∑_{j: (i,j)∈A} xij ≤ 1,    (11)
since zi = 1 for any unused trip i ∈ D. Similarly, if dummy trip j ∈ D is not used in any running board, then zero values of all xij for all i ∈ V are guaranteed by
    zj + ∑_{i: (i,j)∈A} xij ≤ 1.    (12)
On the other hand, if dummy trip i ∈ D is used, which is indicated by zi = 0, then the constraint (11) guarantees that xij = 1 for at most one j ∈ S. Similarly, if trip j ∈ D is used, which is indicated by zj = 0, then the constraint (12) guarantees that xij = 1 for at most one i ∈ V. Hence, now we deal with the following optimization problem:

Maximize
    ∑_{(i,j)∈A} xij    (13)
subject to:
    ∑_{i: (i,j)∈A} xij ≤ 1         for j ∈ S    (14)
    zj + ∑_{i: (i,j)∈A} xij ≤ 1    for j ∈ D    (15)
    ∑_{j: (i,j)∈A} xij ≤ 1         for i ∈ S    (16)
    zi + ∑_{j: (i,j)∈A} xij ≤ 1    for i ∈ D    (17)
    |D| − ∑_{i∈D} zi = r    (18)
    xij ∈ {0, 1}    for all i, j such that (i, j) ∈ A    (19)
    zi ∈ {0, 1}     for all i ∈ D    (20)
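A direct transcription of model (13)–(20) into the Gurobi Python interface might look as follows; this is only a sketch with our own function and variable names, and it assumes S and D are lists of trip identifiers, A the list of arcs and r the required number of depot stays.

```python
import gurobipy as gp
from gurobipy import GRB

def solve_depot_model(S, D, A, r):
    """MILP (13)-(20): maximise linkages while forcing exactly r depot stays to be used."""
    m = gp.Model("roundabout")
    x = m.addVars(A, vtype=GRB.BINARY, name="x")                                    # (19)
    z = m.addVars(D, vtype=GRB.BINARY, name="z")                                    # (20)
    m.setObjective(gp.quicksum(x[i, j] for i, j in A), GRB.MAXIMIZE)                # (13)
    pred = {j: [i for i, jj in A if jj == j] for j in S + D}
    succ = {i: [j for ii, j in A if ii == i] for i in S + D}
    m.addConstrs(gp.quicksum(x[i, j] for i in pred[j]) <= 1 for j in S)             # (14)
    m.addConstrs(z[j] + gp.quicksum(x[i, j] for i in pred[j]) <= 1 for j in D)      # (15)
    m.addConstrs(gp.quicksum(x[i, j] for j in succ[i]) <= 1 for i in S)             # (16)
    m.addConstrs(z[i] + gp.quicksum(x[i, j] for j in succ[i]) <= 1 for i in D)      # (17)
    m.addConstr(len(D) - gp.quicksum(z[i] for i in D) == r)                         # (18)
    m.optimize()
    return m, x, z
```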
2.1 Design of the set D of dummy trips
A deficit function f(x) is a function with domain ⟨0, 1440) defined as follows:
    f(x) = |{ i | x ∈ ⟨di, ai⟩, i ∈ V }|    (21)
Every x ∈ ⟨0, 1440) represents a time of day in minutes, and the value f(x) is the number of trips running at time moment x. A deficit function can help us to determine rush and weak traffic time intervals. A typical deficit function has its rush hours in the morning and in the afternoon, and weak hours between 10:00 and 12:00 o'clock before noon. Roundabout routes should be placed into the weak hours in order not to raise the number of vehicles. Remember that a trip is a quadruple (dk, ak, uk, vk) where dk is the departure time, ak is the arrival time, uk is the departure bus stop and vk is the arrival bus stop of the trip k. Every dummy trip representing a stay in the depot should have its departure and arrival place equal to the depot. Let W ⊂ V be the set of trips running in weak hours. If (di, ai, depot, depot) is a dummy trip, then its latest time position is such that there exists a trip (dj, aj, uj, vj) such that
    ai = dj − m(depot, uj),    (22)
otherwise this dummy trip could be shifted later. Therefore the set D of possible dummy trips can be defined as follows:
    D = { (a − 30, a, depot, depot) | a = dj − m(depot, uj), j ∈ W }    (23)
When creating the digraph G = (V ∪ D, A), exclude all arcs (i, j) such that i ∈ D and j ∈ D from the arc set A in order to rule out the possibility that two roundabout routes are assigned to the same vehicle. Another way to reduce the possibility of assigning two dummy trips to the same running board is the following: if there are two dummy trips i ∈ D, j ∈ D such that there exists a trip k ∈ V with i ≺ k ≺ j, then remove one of the dummy trips from D.
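The deficit function (21) and the construction (23) of the candidate depot stays are easy to prototype. The sketch below is illustrative; times are minutes after midnight, trips are (d, a, u, v) tuples, the depot stop identifier and the 30-minute stay follow the text, and the function names are our own.

```python
def deficit(trips, x):
    """f(x): number of trips running at minute x of the day (definition (21))."""
    return sum(1 for d, a, u, v in trips if d <= x <= a)

def dummy_trips(trips, weak_hour_trips, m, depot, stay=30):
    """Candidate depot stays D according to (23): a stay ends exactly when some
    weak-hour trip j can still be reached from the depot."""
    ends = {trips[j][0] - m[depot][trips[j][2]] for j in weak_hour_trips}
    return [(a - stay, a, depot, depot) for a in sorted(ends)]
```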
3 Computational experiment
We used the Gurobi MILP solver on an Intel Xeon with 4 CPU cores to solve the linear model (13)–(20). We had real-world data available for municipal bus public transport in the Slovak town Martin–Vrútky. The public transport system of this town consisted of 725 trips organized in 18 lines. The corresponding deficit function is presented in Figure 1.
Figure 1 Deficit function of a typical working day (number of trips against time of day)
The depot is close to the bus stop 59. Using the deficit function we have planned the time positions of the roundabout dummy trips to the depot between 8:00 and 12:00 – in hours with weak traffic. The first calculation of the minimum number of running boards (without roundabout routes to the depot), solved by model (3)–(7), yielded 39 buses. Therefore we first tried to use r = 39 dummy trips – r = 39 in equation (18). The result was not satisfactory since sometimes two dummy trips were placed into one running board and so other running boards remained without a roundabout trip. So we incremented the number of dummy trips to the depot (the constant r in equation (18)) and re-solved the model.
We repeated this procedure until every running board contained at least one dummy trip. A feasible solution without the need for more buses appeared for r = 58, whereas the number of running boards remained equal to 39. The computational steps are shown in Table 1.

Number of dummy trips taken (r) | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48
Number of running boards | 39 | 39 | 39 | 39 | 39 | 39 | 39 | 39 | 39 | 39
Number of boards without depot | 11 | 10 | 9 | 9 | 8 | 7 | 6 | 6 | 5 | 6
Computational time [s] | 6.40 | 7.16 | 6.18 | 6.82 | 7.51 | 6.25 | 6.13 | 6.58 | 6.24 | 6.41
Number of dummy trips taken (r) | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58
Number of running boards | 39 | 39 | 39 | 39 | 39 | 39 | 39 | 39 | 39 | 39
Number of boards without depot | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 0
Computational time [s] | 7.09 | 6.22 | 6.75 | 6.79 | 7.01 | 6.93 | 7.34 | 6.52 | 6.84 | 7.44
Table 1 Dependence of the number of running boards without roundabout route on the number r of included dummy trips.
It can be easily seen that the solution with 39 running boards and 58 roundabout trips has to contain many running boards with more than one roundabout trip. An example of a running board with more than one dummy trip is shown in Table 2.

Line | Trip | Dep. stop | Dep. time | Arriv. stop | Arriv. time
10 | 1 | 31 | 04:35 | 91 | 04:55
10 | 2 | 91 | 05:00 | 31 | 05:20
13 | 3 | 59 | 05:29 | 43 | 06:06
30 | 12 | 10 | 06:21 | 59 | 06:42
27 | 13 | 59 | 06:47 | 70 | 07:20
27 | 16 | 70 | 07:25 | 59 | 07:58
70 | 27 | 81 | 08:20 | 91 | 08:37
11 | 16 | 91 | 08:53 | 31 | 09:20
*** | *** | 59 | 09:23 | 59 | 10:23
70 | 37 | 59 | 10:32 | 91 | 10:52
*** | *** | 59 | 11:12 | 59 | 12:12
1 | 10 | 10 | 13:25 | 102 | 13:45
2 | 7 | 59 | 13:50 | 26 | 14:07
52 | 18 | 44 | 14:10 | 56 | 14:25
12 | 15 | 18 | 14:43 | 24 | 15:05
30 | 74 | 10 | 15:26 | 59 | 15:47
30 | 81 | 59 | 15:48 | 10 | 16:11
30 | 84 | 10 | 16:26 | 59 | 16:47
30 | 90 | 10 | 17:11 | 59 | 17:32
30 | 101 | 59 | 18:13 | 10 | 18:36
30 | 106 | 10 | 19:21 | 56 | 19:37
10 | 129 | 31 | 20:40 | 91 | 21:00
30 | 112 | 10 | 22:36 | 59 | 22:57
Table 2 Example of a running board with more than one dummy trip
A solution containing exactly one dummy trip in every running board can be simply obtained by discarding the redundant dummy trips.
4 Conclusion
We have defined the problem of vehicle scheduling with roundabout routes to the depot. We have formulated a MILP mathematical model and computed running boards for the public transport system in the Slovak town Martin–Vrútky. The computational experiment showed that this model can be applied to real-world instances of the vehicle scheduling problem to find time positions of roundabout routes to the depot. However, the presented model does not guarantee that the chosen dummy trips are assigned to every running board – therefore some experimentation is needed. Further research will therefore be focused on ways to prohibit the assignment of more than one dummy trip to one running board. The authors of this paper have taken part in the bus public transport of more than twenty Czech and Slovak towns; the cooperation with each of them was only temporary.
Acknowledgements The research was supported by the Slovak Research and Development Agency grant APVV-14-0658 ”Optimization of urban and regional public personal transport”.
References [1] Ceder, A.: Public-transit vehicle schedules using a minimum crew-cost approach. Total Logistic Management 4 (2011), 21–42. [2] Gurobi Optimization Inc.: Gurobi Optimizer Reference Manual, http://www.gurobi.com, 2015. [3] Palúch, S.: Two approaches to vehicle and crew scheduling in urban and regional bus transport. In: Proceeding of the Quantitative Methods in Economics (Multiple Criteria Decision Making XIV), Iura Edition, Bratislava, 2008, 212–218. [4] Palúch, S.: Vehicle and crew scheduling problem in regular personal bus transport - the state of art in Slovakia. In: Proceeding of the International Conference on Transport Science, Slovene Association of Transport Sciences, Portorož, Slovenija, 2013, 297–304. [5] Peško, Š.: Flexible bus scheduling with overlapping trips. In: Proceeding of the Quantitative Methods in Economics (Multiple Criteria Decision Making XIV), Iura Edition, Bratislava, 2008, 225–230. [6] Peško, Š.: The minimum fleet size problem for flexible bus scheduling. Studies of the Faculty of Management Science and Informatics 9 (2001), 75–81. [7] Wren, A., Rousseau, J.M.: Bus driver scheduling. In: Proceeding of the Computer-Aided Transit Scheduling, Springer Verlag, 1995, 173–187.
Monetary Policy in an Economy with Frictional Labor Market - Case of Poland Adam Pápai1 Abstract. The aim of this paper is to investigate the impacts and effectiveness of the monetary policy in the Polish economy during the last 14 years. Furthermore, we are interested in the effects of the Great Recession, which was caused by the financial crisis in 2007, on the behavior of key macroeconomic variables. To achieve our goals, we choose and estimate a dynamic stochastic general equilibrium model specified for a small open economy. The model is characterized by an explicitly defined labor market. The model version of this sector of the economy contains several frictions in form of a search and matching function, wage adjustment costs, hiring costs and wage bargaining processes. We implement a few different Taylor-type equations, which represent the decision making rule of the monetary authority. This approach helps us identify the main relationships, which the monetary policy maker takes into consideration while setting the interest rates. We then examine the impacts of monetary policy shock on the output and labor market. Keywords: DSGE model, monetary policy, labor market frictions, Polish economy. JEL classification: E32, J60 AMS classification: 91B40
1
Introduction
In this paper, we focus on the influence of the monetary policy maker’s decisions on the labor market. It might not be the principal goal of the central banks to interfere in the movements on the labor market; however, the decisions of the monetary authority may affect the preferences of employers and workers nonetheless. It is therefore desirable to investigate the connections between the monetary authority and the variables describing the labor market. To be able to assess these relationships, we follow up on our previous research (Pápai and Němec [5] and Pápai and Němec [6]), and estimate a dynamic stochastic general equilibrium (DSGE) model for a small open economy. In this paper we choose to focus more on the monetary policy rather than the labor market. We selected the economy of Poland, although it hardly classifies as small. Therefore, our goal is also to test whether such a model can replicate the low dependence of Poland on the foreign sector. DSGE models are widely used tools for forecasting and decision making in the macroeconomic sector. The great degree of modifiability increased their popularity in the last two decades. The literature regarding DSGE models is vast. Some authors, like Christiano et al. [3], incorporated financial frictions and labor market rigidities into an otherwise standard small open economy model. Their estimation results on Swedish data suggest that introducing such limitations improves the accuracy of the model’s behavior. They conclude that ”labor market tightness (measured as vacancies divided by unemployment) is unimportant for the cost of expanding the workforce. In other words, there are costs of hiring, but no significant costs of vacancy postings.” Antosiewicz and Lewandowski [2] analyze the factors behind the financial crisis and their transmission into labor markets of selected European countries, including Poland and the Czech Republic. They find a significantly higher effect of the productivity shock on wages in the Czech Republic than in Poland. Also, in all analyzed countries except the Czech Republic, job destruction shocks influenced consumption significantly. For Poland, this suggests a decrease of consumption in times of high job loss probability.
1 Masaryk University, Faculty of Economics and Administration, Department of Economics, Lipová 41a, Brno, Czech Republic, papai@mail.muni.cz
629
In both countries, the unemployment rate is mostly influenced by foreign demand and job destruction shocks. However, the model of these authors does not contain monetary policy in any form. Tonner et al. [7] expand the core model of the Czech National Bank by various implementations of the labor market. They provide results of multiple predictions to assess the ability of the different model specifications to match the development of the selected macroeconomic variables. Their findings indicate that, when altering the baseline model within acceptable bounds, linking unemployment with the labor market markup gives more accurate forecasting results.
2
Model
We selected a DSGE model originally developed by Albertini et al. [1], who used New Zealand data for their estimation. This model focuses on the analysis of the driving forces on the labor market and hence uses a detailed description of relationships among labor market variables. The model contains several types of rigidities. First, a search and matching mechanism, created by Mortensen and Pissarides [4], is present to describe the flows on the labor market. This function matches unfilled and therefore unproductive job positions with unemployed workers. The search and matching introduces a rigidity to the model, because not all people who search for a job and not all firms who want to fill their vacancies can do so. Another friction in the model has the form of vacancy posting costs. Firms cannot create vacant job positions freely, but must decide whether the potential gains outweigh the induced costs. Finally, there are price and wage adjustment costs, which, similarly to the previous frictions, induce actual costs to the firms. The model does not contain capital and government and consists of households, firms and the monetary authority. The representative household decides between consumption and labor. Current consumption depends on habit to achieve a smoother level over time. Labor brings disutility to the households. However, if the people decide to work, they get a compensation in the form of hourly wages. There are three types of firms, each facing some kind of rigidity. Producers create intermediate goods and sell them on a perfectly competitive market. They hire employees and negotiate their wages and the hours using a Nash bargaining process. Retailers buy the unfinished products and combine them to sell final goods on a monopolistically competitive market to the households. This gives them the ability to adjust the prices. Finally, like the domestic retailers, importers sell foreign finished goods on the domestic market for consumption. They can also adjust their prices, while facing price adjustment costs. The last agent in the model is the monetary authority.
Monetary authority
The monetary policy maker has a key role in the macroeconomic realm. It is therefore desirable to incorporate the monetary authority into the DSGE model. The most common way is to define it using the Taylor rule.

it = ρr·it−1 + (1 − ρr)(ρπ·πt+1 + ρy·yt + ρ∆y·∆yt + ρ∆e·∆et) + m   (1)

it = ρr·it−1 + (1 − ρr)(ρπ·πt+1 + ρy·yt + ρ∆y·∆yt) + m   (2)

it = ρr·it−1 + (1 − ρr)(ρπ·πt+1 + ρy·yt) + m   (3)

it = ρr·it−1 + (1 − ρr)(ρπ·πt+1 + ρ∆y·∆yt) + m   (4)
Equations (1)–(4) show the slight modifications to the Taylor rule (presented in log-linearized forms), which were used for our estimations. Equation (1) represents the original variation taken from Albertini et al. [1], where the central bank sets the current nominal interest rate (it) based on its previous value, the expected inflation gap (πt+1), output gap (yt), difference between the current and previous output (∆yt) and nominal exchange rate (∆et) gaps. Parameter ρr is the interest rate smoothing parameter, while the other ρ-s represent the dependence of the interest rate on the development of the previously mentioned variables. The other three specifications of the Taylor rule leave out, among other things, the nominal exchange rate (et). We leave out this variable to investigate whether the monetary policy maker reacts to the impulses arising from the interactions between Poland and other economies. The omission of output shows us whether the central bank makes a decision based on the current output
630
gap or also takes into consideration the difference of the output gaps and vice versa. We evaluate the plausibility of different model specifications using impulse response functions.
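As an illustration of how the four specifications differ, the following sketch evaluates the log-linearized rules (1)–(4) for given coefficients. This is hypothetical Python code written only for this exposition (the actual estimation was done in Matlab/Dynare); the coefficient values are the posterior means later reported in Table 1, while the gap values are arbitrary illustrative inputs.

    def taylor_rate(i_lag, pi_exp, y_gap, dy, de,
                    rho_r, rho_pi, rho_y=0.0, rho_dy=0.0, rho_de=0.0, m=0.0):
        # Log-linearized Taylor rule; setting unused rho_* to zero yields specifications (2)-(4).
        return rho_r * i_lag + (1 - rho_r) * (rho_pi * pi_exp + rho_y * y_gap
                                              + rho_dy * dy + rho_de * de) + m

    # specification (1) with the posterior means of Model 1; gap inputs are example values only
    i_full = taylor_rate(0.01, 0.005, 0.002, 0.001, -0.003,
                         rho_r=0.6666, rho_pi=2.9685, rho_y=0.2693, rho_dy=0.1154, rho_de=-0.0702)
    # specification (3) with the posterior means of Model 3
    i_spec3 = taylor_rate(0.01, 0.005, 0.002, 0.0, 0.0,
                          rho_r=0.6081, rho_pi=2.8051, rho_y=0.2930)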
3
Data, methodology and calibration
Our sample covers the period between the first quarter of 2002 and the last quarter of 2015, which is 56 periods in total. We chose our observed variables as in Albertini et al. [1] with one exception, given the unavailability of quarterly time series for hours worked. Seven time series were selected to characterize the domestic economy. The harmonized unemployment rate (ut), number of vacancies (vt), hourly earnings (wt), CPI (for calculation of inflation (πt)), gross domestic product (yt) and the three-month interbank interest rate (it) were acquired from the OECD database. The exchange rate between the Polish zloty and the euro was downloaded from the sites of the European Central Bank. The foreign sector (yt∗, πt∗, it∗), represented by the Eurozone, is present in the model in the form of AR(1) processes. The variables are entered into the model in per capita terms. Each of the time series is seasonally adjusted. We use demeaning (for πt, it, πt∗ and it∗) and the Hodrick-Prescott filter (for the rest of the variables), with smoothing parameter λ = 1600, to detrend the time series and get the cyclical components of the data. We use Bayesian techniques to estimate models (1)–(4). This allows us to impose additional information for the model in the form of prior densities. Furthermore, the Kalman filter allows us to investigate the trajectories of unobserved variables. For each estimation, we generate two chains of the Metropolis-Hastings algorithm, while targeting an acceptance ratio of 30%. Matlab and its toolbox Dynare are used to acquire our estimation results. Standard parameter calibration methods are used to help pinpoint the steady state. Some parameter values are taken from the literature, while others are calculated from the data. The discount factor was set to a widely used value of 0.99. The share of labor in production was set to 2/3. The parameter of home bias in consumption was calculated from the data as the import share on GDP (0.2843). This relatively small number supports our presumption of Poland not being a small open economy. The steady state of unemployment (0.1227) was calculated as a sample mean of the unemployment rate. The prior densities of estimated parameters are presented in Table 1.
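The detrending step described above can be sketched as follows. This is an illustrative Python snippet, not the authors' Matlab/Dynare code; the column names of the hypothetical DataFrame are our own labels for the observed series.

    import pandas as pd
    from statsmodels.tsa.filters.hp_filter import hpfilter

    def detrend(df: pd.DataFrame) -> pd.DataFrame:
        # Demean the rate-type series, HP-filter the rest (lambda = 1600) and keep the cyclical part.
        rates = ["pi", "i", "pi_star", "i_star"]
        out = pd.DataFrame(index=df.index)
        for col in df.columns:
            if col in rates:
                out[col] = df[col] - df[col].mean()
            else:
                cycle, _trend = hpfilter(df[col], lamb=1600)
                out[col] = cycle
        return out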
4
Estimation results

                                         Prior density    Model 1    Model 2    Model 3    Model 4
Habit                              ϑ     β(0.5, 0.15)     0.6766     0.6833     0.6035     0.6750
El. of substitution (dom. & for.)  η     Γ(1, 0.2)        0.4454     0.4875     0.3926     0.5137
Bargaining power of firms          ξ     β(0.5, 0.2)      0.6790     0.5847     0.7292     0.5627
Elasticity of matching             ν     β(0.5, 0.2)      0.8591     0.8825     0.7307     0.8862
Price and wage setting
Back. looking price (dom. good)    γH    β(0.75, 0.1)     0.9108     0.9109     0.8875     0.9044
Back. looking price (for. good)    γF    β(0.75, 0.1)     0.6457     0.6922     0.6974     0.6880
Back. looking wage parameter       γW    β(0.75, 0.1)     0.3603     0.3626     0.4484     0.3865
Price adj. cost (dom. good)        ψH    Γ(50, 15)        83.1461    83.0849    80.6558    81.4520
Price adj. cost (for. good)        ψF    Γ(50, 15)        54.5994    53.3332    59.9069    52.5602
Wage adj. cost                     ψW    Γ(50, 15)        19.5984    21.2508    13.8994    20.1162
Monetary policy
Interest rate smooth.              ρr    β(0.5, 0.15)     0.6666     0.6768     0.6081     0.6570
Inflation                          ρπ    Γ(1.5, 0.25)     2.9685     3.0574     2.8051     3.0736
Output gap                         ρy    N(0.25, 0.1)     0.2693     0.2180     0.2930     –
Difference of output               ρ∆y   N(0.25, 0.1)     0.1154     0.1418     –          0.1310
Difference of exchange rate        ρ∆e   N(0.25, 0.1)     -0.0702    –          –          –

Table 1 Estimation results - parameter means (’–’ means not estimated)
631
Table 1 shows the results of parameter estimates for all four models. Our first finding is, that the differences in the results of parameter estimates for these models are small and mostly insignificant. This could be given by the fact, that the changes made are really small. Therefore, we first describe the results of parameter estimations and not the differences between the models. Second, most of the posterior values are different from our priors. The deep habit parameter (ϑ) is relatively high, meaning the households try to smooth their consumption. The firm’s bargaining power parameter (ξ) varies across the models, but is not extraordinarily high. This could suggest, that there are some relatively powerful trade unions in Poland (see [8]). On the other hand, the elasticity of matching function with respect to the number of job seekers (ν) is estimated to be around 0.88. This means, that the number of matches depends more on the unemployed than on the vacancies. The price and wage setting parameters show us the degree of rigidities in the Polish economy. The backward looking parameters (γ) indicate how much the current value depends on its previous level. Our estimation results suggest, that the prices of the domestically produced goods are the stickiest, followed by the prices of the imported products. Wages are the least rigid among these three frictions. The price and wage adjustment cost parameters (ψ) indicate the size of costs induced by the changing of the product’s price or worker’s wage. Again, the value for the prices of domestic goods is the highest and the wage setting parameter is the lowest. Meaning, it is cheaper for producers to change the wage than for retailers to adjust the price. These results show low degrees of rigidities on the labor market of Poland. Lastly, the parameters important for the monetary policy maker are also presented in table 1. The main reason for leaving out the dependency of interest rates in the Taylor rule on the nominal exchange rate was the estimation result of the first model. Here, the posterior mean resulted in a negative number, while the 90% posterior density interval was between −0.1567 and 0.0071, so we could not rule out statistical insignificancy. The interest rate smoothing parameter was estimated in all cases above 0.6 which suggests relatively stable values. The parameter describing the dependency on inflation is estimated to be around 3, which is in line with the current literature. The monetary policy maker takes into consideration the output gap more than the differences of this variable. If we look at the differences among the four models, we find that the parameter estimation results are similar in models 1, 2 and 4. Model 3 sticks out the most. This is even more visible in the two following figures. Figures 1 and 2 show the results of impulse responses to monetary and technology shocks respectively. Models 2 and 4 behave in both cases almost identically and are at the very least at the boundaries of the 90% highest posterior density interval (HPDI) of model 1. These three models behave in accordance with the literature and our expectations. Model 3 however exhibits different behavior, which in several cases contradicts with the actual behavior of the real economy. Therefore, we will focus our further interpretation only on the three similar models.
Figure 1 Impulse responses to a monetary shock
632
Figure 2 Impulse responses to a technology shock
At this point we would also like to note that the variables of vacancies and unemployment are divided by the economically active population and their volatility is smaller than the volatility of the rest of the variables. High deviations from the steady state therefore do not mean high changes of the actual number of vacancies and number of unemployed. Furthermore, the impulse responses of vacancies, unemployment and wages contain zero in the 90% HPDI, so we cannot rule out statistical insignificance. The transmission mechanism of monetary policy is presented as a monetary shock in Figure 1. It causes an initial increase of interest rates. This restrictive monetary policy causes a drop of output and a decrease of inflation. Because the unemployment does not change substantially, the firms lower the hours of workers to compensate for the lower production. Wages and vacancies also decrease, as is expected when the output drops. Finally, this monetary policy creates an appreciation of the exchange rate. Figure 2 shows the reactions of selected variables to a positive technology shock that increases the output. The reactions of labor market variables are limited. Only models 2 and 4 indicate a lagged increase of vacancies created by the firms during expansion periods. The initial decrease of hours worked, when the labor is improved by the better technology, fades away during 4–5 periods. These two models also show a slight increase of wages, while the unemployment remains without change and the exchange rate depreciates.
Figure 3 Historical shock decomposition of output
633
The models are not able to capture the countercyclical changes in the variable of unemployment. This could be caused by the unavailability of the time series of hours; the model therefore interprets the changes in the workforce as intensive (hours) and not extensive (unemployed). Finally, in Figure 3 we present the historical shock decomposition of output from the results of model 2. There were some substantial negative labor market shocks at the beginning of the observed period. The most interesting conclusion of this figure is that the foreign shocks influenced the Polish output mainly during the Great Recession. Their presence was very limited in the other periods. This could suggest that the Polish economy is, in fact, not so small and open.
5
Conclusion
In this paper we estimated four slightly different model specifications for a small open economy. We used quarterly data of Poland for the last 14 years. Our findings suggest that there are significant price rigidities in the Polish economy, while the frictions on the labor market are not so grave. The impulse response functions indicate, that the monetary policy is able to affect the real economy, though its influence on labor market is marginal. Also, the decision making by the monetary authority does not depend on the development of the exchange rate, however omission of output difference from the Taylor rule causes serious distortion in the estimated results. Finally, given that we indicated the size of the Polish economy, it would be more suitable to estimate models that capture its size better.
Acknowledgements This work was supported by funding of specific research at Faculty of Economics and Administration, project MUNI/A/1040/2015. This support is gratefully acknowledged.
References [1] Albertini, J., Kamber, G., and Kirker, M.: Estimated small open economy model with frictional unemployment. Pacific Economic Review, 17, 2 (2012), 326–353. [2] Antosiewicz, M., and Lewandowski, P.: What if you were German? - DSGE approach to the Great Recession on labour markets. IBS Working Papers. Instytut Badan Strukturalnych (2014). [3] Christiano, L. J., Trabandt, M., and Walentin, K.: Introducing Financial Frictions and Unemployment into a Small Open Economy Model. Journal of Economic Dynamics and Control, 35, 12 (2011), 1999–2041. [4] Mortensen, D. T., and Pissarides, C. A.: Job creation and job destruction in the theory of unemployment. Review of Economic Studies, 61, 3 (1994), 397–415. [5] Pápai, A., and Němec, D.: Labour market rigidities: A DSGE approach. Proceedings of 32nd International Conference Mathematical Methods in Economics (Talašová, J., Stoklasa, J., Talášek, T., eds.), Olomouc: Palacký University, 2014, 748–753. [6] Pápai, A., and Němec, D.: Labor Market Frictions in the Czech Republic and Hungary. Proceedings of 33nd International Conference Mathematical Methods in Economics (Martinčı́k, D., Ircingová, J., Janeček, P., eds.), Plzeň: University of West Bohemia, 2015, 606–611. [7] Tonner, J., Tvrz, S., and Vašı́ček, O.: Labour Market Modelling within a DSGE Approach. CNB Working paper series. 6/2015. [8] Trade unions. Link: http://www.worker-participation.eu/National-Industrial-Relations/Countries/ Poland/Trade-Unions
634
Adaptive parameter estimations of Markowitz model for portfolio optimization Josef Pavelec1, Blanka Šedivá
2
Abstract. This article is focused on a stock market portfolio optimization. The used method is a modification of the traditional Markowitz model which extends the original one by adaptive approaches to parameter estimation. One of the basic factors which significantly influence the optimal portfolio is the method of estimation of the return on assets, risk and covariance between them. Since the stock market processes tend to be non-stationary, we can expect that prioritization of recent information will lead to improvement of these parameter estimates and thus to better results of the entire model. For this purpose a modified algorithm was designed to estimate the expected return and correlation matrix in a more stable way. For implementation and verification of this algorithm we needed to build a program which was able to download historical stock market data from the internet and compute the optimal portfolio using either the traditional Markowitz model or its modified approach. The obtained results will be compared to those of the traditional Markowitz model on real data.
Keywords: Markowitz model, estimation of parameters, adaptive method.
JEL classification: G11
AMS classification: 91G10
1
Introduction
There are many articles about optimal portfolios in the mathematical literature. The reason is that investing on a stock market with a potential of profit is interesting for a large number of people in the world. Thanks to this fact, there are also many approaches how mathematicians try to model the stock market. Some of them tend to believe that there is no relationship between the history and the future. This group uses methods of random walk and tries to simulate a large number of scenarios to forecast the future. The other group of scientists believes that there is a strong relation between historical prices and the future ones. In this article we will focus on this approach and try to treat one of the biggest challenges of these methods - the stability of the model given the parameter estimation. We will introduce two different ways to improve the stability of optimization - matrix cleaning and data weighting.
2
Portfolio theory: basic results
Suppose we have a set of N financial assets characterized by their random return in a chosen time period, so the random vector is the vector X = (X1, X2, . . . , XN) of random returns on the individual assets. The distribution of vector X is characterized by the vector of expected values with elements EXi = ri and by the covariance matrix V whose i, j-th element is the covariance between the Xi-th and the Xj-th random variables. The elements on the diagonal σi² of matrix V represent the variances of asset i. The Markowitz theory of optimal portfolio is focused on the problem to find the optimal weight of each asset such that the overall portfolio provides the best return for a fixed level of risk, or conversely the smallest risk for a given overall return [5]. More precisely, the average return Rp of a portfolio P of N assets is defined as Rp = Σ_{i=1}^{N} wi ri, where wi is the amount of capital invested in the asset i and ri are the expected returns of the individual assets. Similarly, the risk of a portfolio P can be associated with the total variance σP² = Σ_{i,j=1}^{N} wi Vij wj, or in the alternative form σP² = Σ_{i,j=1}^{N} wi σi Cij σj wj, where σi² is the
1 University of West Bohemia, Plzeň, Czech Republic, Department of Mathematics, jpavelec@kma.zcu.cz
2 University of West Bohemia, Plzeň, Czech Republic, Department of Mathematics, sediva@kma.zcu.cz
635
variance of asset i and C is the correlation matrix. The optimal portfolio which minimizes σP² for a given value of Rp can be easily found introducing a Lagrange multiplier and leads to a linear problem where the matrix C has to be invertible [3], [2]. The weights resulting from a Markowitz optimization scheme, which gives the portfolio with the minimum risk for a given return RP = Σ_i wi ri, are

wi σi = Rp · ( Σ_j C⁻¹ij rj/σj ) / ( Σ_{i,j} ri/σi C⁻¹ij rj/σj ).   (1)

By redefining wi as wi σi, the σi is absorbed in ri and wi, and equation (1) can be written in matrix notation

wC = Rp · C⁻¹r / (rᵀ C⁻¹ r),   (2)

and the corresponding risk of the portfolio over the period using this construction is

σP² = Rp² / (rᵀ C⁻¹ r).   (3)
From the mathematical equation (2) it is obvious that the usability of the Markowitz model strongly depends on the input data which are used for the asset mean return estimation, and the dominant role for stability is given by the quality of the estimation of the covariance matrix.
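A minimal numerical sketch of equations (2) and (3), assuming the vector r of (rescaled) expected returns and an invertible correlation matrix C are already estimated, might look as follows in Python; this is only an illustration, not the StockMaTT implementation discussed later.

    import numpy as np

    def markowitz_portfolio(C, r, R_p):
        # Minimum-risk weights for a target return R_p, eq. (2), and the portfolio variance, eq. (3).
        C_inv_r = np.linalg.solve(C, r)      # C^{-1} r
        denom = r @ C_inv_r                  # r^T C^{-1} r
        w = R_p * C_inv_r / denom            # eq. (2)
        var_p = R_p**2 / denom               # eq. (3)
        return w, var_p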
3 3.1
Parameter estimations - stability of the model Empirical correlation matrix
Suppose we have N stock return series with T elements each. If we want to measure and optimize the risk of this portfolio, it is necessary to use a reliable estimate of the covariance matrix V or correlation matrix C. If ri^t is the daily return of stock i at time t, the empirical variance of each stock is given by

σi² = (1/T) Σ_t (ri^t − r̄i)²   (4)

and can be assumed for simplicity to be perfectly known. We also suppose, as usual, that the daily return of stock ri^t is demeaned (r̄i = 0). The empirical correlation matrix is obtained as

Eij = (1/T) Σ_t xi^t xj^t , where xi^t = ri^t/σi ,   (5)

or in matrix form E = (1/T) Xᵀ X, where X is the normalized T × N matrix of returns Xit = ri^t/σi.
3.2
Random matrix theory and matrix cleaning
For a set of N different assets, the correlation matrix contains N(N − 1)/2 entries, which must be determined from N time series of length T. If T is not very large compared to N, we can expect that the determination of covariances is noisy and therefore that the empirical correlation matrix is to a large extent random. Because a covariance matrix is positive semidefinite, its structure can be described by real eigenvalues and corresponding eigenvectors. Eigenvalues of the covariance matrix that are small (or even zero) correspond to portfolios of stocks that have non-zero returns, but extremely low or vanishing risk; such portfolios are invariably related to estimation errors resulting from insufficient data. One of the approaches used to eliminate the problem of small eigenvalues in the estimated covariance matrix is the so-called random matrix technique. Random matrix theory (RMT) was first developed by authors such as Dyson [4] and Mehta [9] for physical applications, but there are also many results of interest in a financial context [7], [1], [11].
636
The spectral properties of C may be compared to those of a random correlation matrix. As described by [7], [11] and others, if R is any matrix defined by R = (1/T) AᵀA, where A is an N × T matrix whose elements are i.i.d. random variables with mean zero and fixed variance σ², then in the limit T, N → ∞, keeping the ratio Q = T/N ≥ 1 constant, the density of eigenvalues of R is given by

P(λ) = ( Q / (2πσ²) ) · √((λmax − λ)(λ − λmin)) / λ ,   λmin ≤ λ ≤ λmax ,   (6)

where the maximum and minimum eigenvalues are given by

λmax/min = σ² (1 ± √(1/Q))² .   (7)
The distribution P(λ) is known as the Marčenko-Pastur density [8] and the theoretical maximum and minimum eigenvalues determine the bounds for a random matrix. If the eigenvalues of a matrix are beyond these bounds, it is said that they deviate from random. If we apply this theoretical background of RMT to the correlation matrix, we can separate the noise and non-noise parts of E. We cleaned the matrix by the following procedure:
1. construct the empirical correlation matrix as in (5),
2. separate the noisy eigenvalues from the non-noisy eigenvalues according to (6),
3. keep the non-noisy eigenvalues the same and take the average of the noisy eigenvalues,
4. replace each eigenvalue associated with the noisy part by the average of the noisy eigenvalues,
5. reconstruct the correlation matrix.
The simple repair mechanism, based on the spectral decomposition of the correlation matrix, is described for example in [6].
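The five-step cleaning procedure can be sketched as follows (an illustrative Python implementation assuming standardized returns, so that σ² = 1 in (7); the actual StockMaTT routine may differ in details).

    import numpy as np

    def clean_correlation_matrix(E, T):
        # RMT cleaning: keep eigenvalues above the Marcenko-Pastur bound, average the noisy ones.
        N = E.shape[0]
        Q = T / N
        lam_max = (1.0 + np.sqrt(1.0 / Q)) ** 2      # eq. (7) with sigma^2 = 1
        eigval, eigvec = np.linalg.eigh(E)           # step 1 assumes E was built as in (5)
        noisy = eigval < lam_max                     # step 2
        cleaned = eigval.copy()
        cleaned[noisy] = eigval[noisy].mean()        # steps 3-4
        return (eigvec * cleaned) @ eigvec.T         # step 5: reconstructed correlation matrix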
3.3
Exponential weights - parameter estimation
Another method to minimize the non-stability of the model is weighting. Since we suppose that the most recent data are the most relevant and the older ones influence the future less, we designed a model which weights the data exponentially with respect to the history. This idea was introduced in [10], where we can also find more details. The parameters we need to estimate are:

• Return - the estimate of the expected return on asset Xi is in this case calculated by the weighted mean:

r̂i = ( Σ_{t=1}^{T} ri^t · δ^t ) / ( Σ_{t=1}^{T} δ^t )   (8)

• Risk - the estimated risk of asset Xi is in this case calculated by the sample weighted variance:

σ̂i = √[ ( Σ_{t=1}^{T} δ^t · (ri^t − r̄i)² ) / ( Σ_{t=1}^{T} δ^t ) · T/(T − 1) ] ,   (9)

• Empirical correlation matrix is computed using exponential weighting by

Eij = ( Σ_{t=1}^{T} δ^t xi^t xj^t ) / ( Σ_{t=1}^{T} δ^t ) , where xi^t = ri^t/σ̂i ,   (10)

where δ ∈ (0, 1] is the weighting parameter. This parameter δ is sometimes called a ”forgetting coefficient” and should be close to 1. The smaller it is, the faster older data become insignificant. This approach is implemented and tested in the created program StockMaTT, where we can analyse our data either using the traditional Markowitz model (which uses linear data weights) or using this adaptive approach with exponential weighting. The results of these two methods vary and depend on the parameters - on the data history length for the linear model and on δ for the adaptive model.
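The estimators (8)–(10) can be written compactly as in the sketch below (illustrative Python, not the StockMaTT code). It assumes a T × N matrix of returns whose last row is the most recent observation, so that the forgetting coefficient δ is applied towards the past; this indexing convention is our assumption.

    import numpy as np

    def weighted_estimates(returns, delta=0.97):
        # Exponentially weighted mean return (8), risk (9) and correlation matrix (10).
        T, N = returns.shape
        w = delta ** np.arange(T - 1, -1, -1)     # newest observation gets weight delta^0 = 1
        ws = w.sum()
        r_hat = (w[:, None] * returns).sum(axis=0) / ws
        dev2 = (returns - r_hat) ** 2
        sigma_hat = np.sqrt((w[:, None] * dev2).sum(axis=0) / ws * T / (T - 1))
        x = returns / sigma_hat
        E = (x.T * w) @ x / ws
        return r_hat, sigma_hat, E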
637
4 4.1
Data analysed Program StockMaTT
So that we can verify the model and figure out its results, we needed to build a programming tool. A satisfying environment needs to have a sufficient mathematical and statistical background and also needs to be fast enough to implement quite a complicated algorithm on complex data. A tool that fitted our needs was the programming language MATLAB® in combination with its GUI environment that makes the user's controllability comfortable. The main screen where we update data and compute the optimal portfolio can be found in the Figure 1.
Figure 1 Window for data update and portfolio optimization
4.2
Real data
The model was also tested on real data. Since the focus of this article is on portfolio optimization, we chose to use the data from the stock market. As the preferred source, the server yahoo.finance.com was chosen. For our analysis we used the time series of daily closing prices of stocks available mainly on NASDAQ, which is the biggest stock market in the USA with over 3900 assets from about 39 countries. The financial instruments that can be traded here besides stocks are also options and futures.
Asset split and dividends
Downloaded data also needed to be ”smoothened” for unexpected jumps caused by splitting the stocks and also for dividends. The stock split is a phenomenon that happens usually for expensive assets when the stakeholders want to support the stock liquidity. Usually they decide to split the stock in the rate of X:1, which means that suddenly all stockholders have X times more assets with 1/X of the original value. This phenomenon causes the obvious jumps that can be seen for example on the Apple Inc. stock in Figure 2.
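The smoothing for splits can be illustrated by the following sketch (hypothetical Python/pandas code, not the actual StockMaTT routine): for an X:1 split on a known date, all prices before that date are divided by X so that the series becomes continuous; dividends could be handled analogously with an adjustment factor.

    import pandas as pd

    def adjust_for_split(prices: pd.Series, split_date: str, x: float) -> pd.Series:
        # Back-adjust closing prices for an X:1 stock split effective on split_date.
        adjusted = prices.copy()
        before = adjusted.index < pd.Timestamp(split_date)
        adjusted[before] = adjusted[before] / x
        return adjusted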
638
Figure 2 Sample asset split - Apple Inc. - before and after smoothening 4.3
Matrix cleaning based on random matrix theory
For better understanding of the data we applied the method described in subsection 3.2. Firstly, we constructed the empirically measured correlation matrix E by using several different numbers of observations (T) and we analysed the distribution of the eigenvalues. We compared the empirical distribution of the eigenvalues of the correlation matrix with the theoretical prediction given by (6), based on the assumption that the correlation matrix is purely random. The results are summarized in Table 1.

# observations    % of λ < λmin    % of λmin < λ < λmax    % of λmax < λ
15                7%               91%                     3%
30                20%              75%                     5%
50                33%              60%                     7%
100               54%              38%                     8%
200               74%              19%                     8%
Table 1 Comparing eigenvalues of the empirical correlation matrix and the Marčenko-Pastur bounds.
From these results we can see that the important information about asset mutual connections is carried by 3 to 7% of the eigenvalues of the correlation matrix. Increasing the number of observations on which the correlation matrix estimation is based slightly increases the number of non-random correlations, but also increases the instability of the correlation matrix since its eigenvalues are very small. The optimal number of observations in this case seems to be between 30 and 50. Figure 3 shows the results of our experiments on the data with 50 observations used for estimation of the correlation matrix. There is a histogram of eigenvalues of the empirical correlation matrix and for comparison we plotted the density of (6) for Q = 3.8462 and σ = 0.660. A better fit can be obtained with a smaller value of σ = 0.462 (dashed blue line). This result is in accordance with similar articles, for example [7] or [11], and shows problems related to correct market risk estimation.
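The shares reported in Table 1 can be reproduced in spirit with the short sketch below (illustrative Python code, again assuming σ² = 1; the concrete percentages above come from the authors' NASDAQ data set).

    import numpy as np

    def eigenvalue_shares(E, T):
        # Fractions of eigenvalues of E below, inside and above the Marcenko-Pastur bounds (7).
        N = E.shape[0]
        Q = T / N
        lam_min = (1.0 - np.sqrt(1.0 / Q)) ** 2
        lam_max = (1.0 + np.sqrt(1.0 / Q)) ** 2
        lam = np.linalg.eigvalsh(E)
        below = np.mean(lam < lam_min)
        above = np.mean(lam > lam_max)
        return below, 1.0 - below - above, above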
5
Conclusion
The goal of this article was to develop an advanced approach for the stability treatment of portfolio optimization. We have developed two methods to minimize the effects of instability and tested these methods on real data. For purposes of comparison of the traditional Markowitz model and the weighting modification we created a SW solution StockMaTT, where we could try to simulate investments with both methods and different parameters. As we expected, the traditional Markowitz model is very sensitive to input data, and using the weighting we obtained different results. To sum up, both methods described in this article can be recommended as a possible treatment of the instability. A potential topic for the following studies could be the correct estimation of the weighting parameter.
639
Figure 3 Empirical and prediction distribution of eigenvalues for T = 50.
Acknowledgements This article was supported by the European Regional Development Fund (ERDF), project ”NTIS - New Technologies for the Information Society”, European Centre of Excellence, CZ.1.05/1.1.00/02.0090.
References [1] Daly, J., Crane, M. and Ruskin, H. J.: Random matrix theory filters in portfolio optimisation: a stability and risk assessment. Physica A: Statistical Mechanics and its Applications, 387.16 (2008), pp. 4248–4260. [2] Danielsson, J.: Financial risk forecasting: the theory and practice of forecasting market risk with implementation in R and Matlab (Vol. 588). John Wiley & Sons, 2011. [3] Dupačová, J.: Markowitzův model optimálnı́ volby portfolia - předpoklady, data, alternativy (n.d.). Retrieved April 25, 2016, from http://msekce.karlin.mff.cuni.cz/ dupacova/downloads/Markowitz.pdf [4] Dyson, Freeman J.: Correlations between eigenvalues of a random matrix. Comm. Math. Phys. 19, no. 3 (1970), pp. 235–250. [5] Elton, E. J., Gruber, M. J., Brown, S. J. and Goetzmann, W. N.: Modern portfolio theory and investment analysis. John Wiley & Sons, 2009. [6] Gilli, M., Maringer, D. and Schumann, E.: Numerical methods and optimization in finance. Academic Press, 2011. [7] Laloux, L., Cizeau, P., Potters, M. and Bouchaud, J. P.: Random matrix theory and financial correlations. International Journal of Theoretical and Applied Finance, 3.03 (2000), pp. 391–397. [8] Marčenko, V. A. and Pastur, L. A.: Distribution of eigenvalues for some sets of random matrices. Mathematics of the USSR-Sbornik, 1.4 (1967), 457. [9] Mehta, M. L.: Random matrices (Vol. 142). Academic Press, 2004. [10] Pavelec, J.: Programový nástroj pro volbu optimálnı́ho portfolia. Diplomová práce. Západočeská univerzita v Plzni. Fakulta aplikovaných věd, 2013. [11] Potters, M., Bouchaud, J. P. and Laloux, L.: Financial applications of random matrix theory: Old laces and new pieces. arXiv preprint physics/0507111 (2005).
640
Fuzzy Decision Matrices Viewed as Fuzzy Rule-Based Systems Ondřej Pavlačka1, Pavla Rotterová2 Abstract. A decision matrix represents a particular problem of decision making under risk. Elements of the matrix express the consequences if a decision-maker chooses a particular alternative and a particular state of the world occurs. Assuming the probabilities of the states of the world are known, alternatives are compared on the basis of the expected values and variances of their consequences. In practice, the states of the world are often described only vaguely; they are expressed by fuzzy sets on the universal set on which the probability distribution is given. In literature, the common approach in such a case consists in computing crisp probabilities of the fuzzy states of the world by formula proposed by Zadeh. In the paper, we first discuss the problems connected with the common approach. Consequently, we introduce a new approach, in which a decision matrix with fuzzy states of the world does not describe discrete random variables but fuzzy rule-based systems. We illustrate the problem by examples. Keywords: decision matrices, decision making under risk, fuzzy states of the world, fuzzy rule-based systems. JEL classification: C44 AMS classification: 90B50
1
Introduction
In decision making under risk, decision matrices (see Tab. 1) are often used as a tool of risk analysis (see e.g. [1, 3, 8]). They describe how consequences of alternatives x1 , . . . , xn depend on the fact which of possible and mutually disjoint states of the world S1 , . . . , Sm will occur. The probabilities of occurrences of the states of the world are given by p1 , . . . , pm . Thus, the consequence of choosing an alternative xi , i ∈ {1, . . . , n}, is a discrete random variable Hi that takes on the values hi,1 , . . . , hi,m with probabilities p1 , . . . , pm . The alternatives are usually compared on the basis of the expected values EH1 , . . . , EHn and variances var H1 , . . . , var Hn of their consequences.
      S1     S2     ···    Sm
      p1     p2     ···    pm
x1    h1,1   h1,2   ···    h1,m    EH1    var H1
x2    h2,1   h2,2   ···    h2,m    EH2    var H2
···   ···    ···    ···    ···     ···    ···
xn    hn,1   hn,2   ···    hn,m    EHn    var Hn
Table 1 A decision matrix. In practice, the states of the world are often specified only vaguely, like e.g. ”inflation is low”, etc. Talašová and Pavlačka [7] showed that in such a case it is more appropriate to model the states of the world by fuzzy sets on the universal set on which the probability distribution is given. They proposed to 1 Palacký
University Olomouc, Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 771 46 Olomouc, Czech Republic, ondrej.pavlacka@upol.cz 2 Palacký University Olomouc, Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 771 46 Olomouc, Czech Republic, pavla.rotterova01@upol.cz
641
proceed in the same way as in the case of crisp (i.e. exactly described) states of the world; they set the probabilities of the fuzzy states of the world applying the formula proposed by Zadeh [9] and treated the consequences of alternatives as discrete random variables. However, Pavlačka and Rotterová [4] showed that Zadeh’s probabilities of fuzzy events lack the common interpretation of a probability measure. Another problem is what does it mean to say that the particular fuzzy state of the world will occur (see the discussion in Section 2). Therefore, in Section 3, a new approach to this problem will be introduced, in which a decision matrix with fuzzy states of the world does not describe discrete random variables but fuzzy rule-based systems.
2
A decision matrix with fuzzy states of the world according to [7]
First, let us describe and analyse the approach to the extension of a decision matrix to the case of fuzzy states of the world that was considered by Talašová and Pavlačka [7]. Let us assume that a probability space (Ω, A, P) is given, where Ω denotes a non-empty set of all elementary events, A represents the set of all considered random events (A forms a σ-algebra of subsets of Ω), and P : A → [0, 1] is a probability measure that assigns to each random event A ∈ A its probability P(A) ∈ [0, 1]. A vaguely defined state of the world can be appropriately expressed by a fuzzy set S on Ω. A fuzzy set S on Ω is determined by the membership function µS : Ω → [0, 1]. The interpretation of membership degrees µS(ω), ω ∈ Ω, is explained in the following example.
Example 1. Let us consider a vaguely defined state of the economy ”inflation is low”, denoted by S. Let Ω be a set of all inflation rates and let an inflation rate ω ∈ Ω occur. If µS(ω) = 1, then we would definitely say that inflation is low. If µS(ω) ∈ (0, 1), then we would say that inflation is low only partly. Finally, if µS(ω) = 0, then we would say that inflation is not low.
As the probability space (Ω, A, P) is given, Talašová and Pavlačka [7] assumed that fuzzy states of the world are expressed by fuzzy sets on Ω that are called fuzzy random events. A fuzzy random event S is a fuzzy set on Ω whose membership function µS is A-measurable (see Zadeh [9]). This assumption means that the α-cuts Sα := {ω ∈ Ω | µS(ω) ≥ α}, α ∈ (0, 1], are random events, i.e. Sα ∈ A for any α ∈ (0, 1]. For the case of fuzzy random events, Zadeh [9] extended the given probability measure P in the following way: A probability PZ(S) of a fuzzy random event S is defined as

PZ(S) := E(µS) = ∫_Ω µS(ω) dP.   (1)
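For a discrete probability space, formula (1) reduces to a weighted sum of membership degrees. The following lines are an illustrative Python sketch (not part of the paper); the membership degrees used in the call correspond to the fuzzy state S1 from Example 2 below.

    def zadeh_probability(membership, prob):
        # P_Z(S) = E(mu_S) = sum_k mu_S(omega_k) * P({omega_k}) for a discrete Omega.
        return sum(m * p for m, p in zip(membership, prob))

    p_S1 = zadeh_probability([1.0, 0.8, 0.2, 0.0, 0.0, 0.0], [1/6] * 6)   # = 1/3 (approx. 0.333)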
The existence of the above Lebesgue-Stieltjes integral follows directly from the assumption that µS is A-measurable. It can be easily shown (see e.g. [2, 4, 9]) that the mapping PZ possesses mathematical properties analogous to a probability measure P, and thus, it can be called a probability measure.
Remark 1. Let us note that any crisp set S ⊆ Ω can be viewed as a fuzzy set of a special kind; the membership function µS coincides in such a case with the characteristic function χS of S. In such a case, Sα = S for all α ∈ (0, 1]. This convention allows us to consider crisp states of the world as a special kind of fuzzy states of the world. It can be easily seen that for any crisp random event S ∈ A, PZ(S) = P(S).
Talašová and Pavlačka [7] considered the following extension of the decision matrix given by Tab. 2¹: The fuzzy states of the world are fuzzy random events S1, . . . , Sm that form a fuzzy partition of Ω, i.e. Σ_{j=1}^{m} µSj(ω) = 1 for any ω ∈ Ω. The probabilities pZ1, . . . , pZm of the fuzzy states are given by pZj := PZ(Sj), j = 1, . . . , m. Since S1, . . . , Sm form a fuzzy partition of Ω, Σ_{j=1}^{m} pZj = 1. Therefore, Talašová and Pavlačka [7] treated the consequence of choosing an alternative xi, i = 1, . . . , n, as a discrete random variable HiZ that takes on the values hi,1, . . . , hi,m with probabilities pZ1, . . . , pZm.
642
Mathematical Methods in Economics 2016
They compared the alternatives x1, . . . , xn on the basis of the expected values EH1Z, . . . , EHnZ and variances var H1Z, . . . , var HnZ of their consequences, computed for i = 1, . . . , n as follows:

EHiZ = Σ_{j=1}^{m} pZj · hi,j ,   (2)

and

var HiZ = Σ_{j=1}^{m} pZj · (hi,j − EHiZ)² .   (3)

      S1     S2     ···    Sm
      pZ1    pZ2    ···    pZm
x1    h1,1   h1,2   ···    h1,m    EH1Z    var H1Z
x2    h2,1   h2,2   ···    h2,m    EH2Z    var H2Z
···   ···    ···    ···    ···     ···     ···
xn    hn,1   hn,2   ···    hn,m    EHnZ    var HnZ
Table 2 A fuzzy decision matrix considered in [7]. Thus, we can see that by applying the Zadeh’s crisp probabilities of fuzzy states of the world, the extension of the decision matrix is straightforward. Now, let us discuss the problems that are connected with this way of extension of a decision matrix tool. As it is written in Introduction, the element hi,j of the matrix given by Tab. 1 describes the consequence of choosing the alternative xi if the state of the world Sj occurs. If we consider fuzzy states of the world instead of crisp ones, a natural question arises: What does it mean to say ”if the fuzzy state of the world Sj occurs”? Let us suppose that some ω ∈ Ω occurred. If µSj (ω) = 1, then it is clear that the consequence of choosing the alternative xi is exactly hi,j . However, what is the consequence of choosing xi if 0 < µSj (ω) < 1 (which also means that 0 < µSk (ω) < 1 for some k 6= j)? Thus, perhaps it is not appropriate in case of a decision matrix with fuzzy states of the world to treat the consequence of choosing xi as a discrete random variable HiZ that takes on the values hi,1 , . . . , hi,m . Moreover, it was pointed out by Rotterová and Pavlačka [5] that the Zadeh’s probabilities pZ1 , . . . , pZm of fuzzy states of the world express the expected membership degrees, in which the particular states of the world will occur. Thus, they do not have in general the common probabilistic interpretation - a measure of a chance that a given event will occur in the future, which is desirable in case of a decision matrix. Therefore, we cannot say that the values EH1Z , . . . , EHnZ , given by (2), and varH1Z , . . . , varHnZ , given by (3), express the expected values and variances of consequences of the alternatives. Hence, ordering of the alternatives based on these characteristics is questionable.
3
Fuzzy rule-based systems derived from a decision matrix with fuzzy states of the world
In this section, let us introduce a different approach to the model of decision making under risk described by the decision matrix with fuzzy states of the world presented in Tab. 2. Taking into account the problems discussed in the previous section, we suggest not to treat the consequence of choosing an alternative xi , i ∈ {1, . . . , n}, as a discrete random variable HiZ taking on the values hi,1 , . . . , hi,m with the probabilities pZ1 , . . . , pZm . Instead of this, we propose to treat the information about the consequence of choosing xi as the following basis of If-Then rules: If the state of the world is S1 , then the consequence of xi is hi,1 . If the state of the world is S2 , then the consequence of xi is hi,2 . ... If the state of the world is Sm , then the consequence of xi is hi,m .
643
(4)
Mathematical Methods in Economics 2016
For computing the output of the fuzzy rule-based system (4), it is appropriate to apply the so-called Sugeno method of fuzzy inference, introduced by Sugeno [6]. For any ω ∈ Ω, the consequence of choosing xi is given by

HiS(ω) = ( Σ_{j=1}^{m} µSj(ω) · hi,j ) / ( Σ_{j=1}^{m} µSj(ω) ) = Σ_{j=1}^{m} µSj(ω) · hi,j ,   (5)

where we used the assumption Σ_{j=1}^{m} µSj(ω) = 1 for any ω ∈ Ω. Let us note that the assumption that the fuzzy states of the world S1, . . . , Sm form a fuzzy partition of Ω can be omitted in this approach. Since we operate within the given probability space (Ω, A, P), HiS is a random variable. A distribution function of HiS is given for any h ∈ R as follows:

FHiS(h) = P(HiS ≤ h) = ∫_{{ω∈Ω | HiS(ω) ≤ h}} dP.
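A small sketch of the Sugeno-type inference (5) for one alternative is given below (illustrative Python, not from the paper). When the membership degrees sum to one, the weighted average reduces to the simple sum used in (5); the example call uses the memberships of ω2 and the consequences of x1 from Example 2 below and returns 12.

    def sugeno_consequence(mu_states, h_row):
        # H_i^S(omega): weighted average of the rule consequences h_{i,1},...,h_{i,m}, eq. (5).
        num = sum(mu * h for mu, h in zip(mu_states, h_row))
        den = sum(mu_states)
        return num / den

    h_x1_omega2 = sugeno_consequence([0.8, 0.2, 0.0], [15.0, 0.0, -4.0])   # = 12 %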
Remark 2. It can be easily seen from (5) that in the case of crisp states of the world S1, . . . , Sm, the random variables H1S, . . . , HnS coincide with discrete random variables H1, . . . , Hn taking on the values hi,1, . . . , hi,m, i = 1, . . . , n, with probabilities pj = P(Sj), j = 1, . . . , m. Hence, this new approach can be also considered as an extension of a decision matrix to the case of fuzzy states of the world.
Analogously as in the previous approach, the alternatives x1, . . . , xn can be compared on the basis of the expected values and variances of H1S, . . . , HnS. Let us derive now the formulas for their computation. In addition to that, we will compare these characteristics with the characteristics of H1Z, . . . , HnZ.
First, let us show now that the expected values EH1S, . . . , EHnS coincide with EH1Z, . . . , EHnZ. For i = 1, . . . , n, we get

EHiS = ∫_Ω HiS(ω) dP = ∫_Ω Σ_{j=1}^{m} µSj(ω) · hi,j dP = Σ_{j=1}^{m} ( ∫_Ω µSj(ω) dP ) · hi,j = Σ_{j=1}^{m} pZj · hi,j = EHiZ.   (6)

Hence, from the point of view of the expected values of consequences that are taken into account in comparison of alternatives, both the approaches are the same. Nevertheless, as it will be shown next, the variances of H1S, . . . , HnS are generally different from var H1Z, . . . , var HnZ. For i = 1, . . . , n, var HiS is given as follows:

var HiS = E((HiS)²) − (EHiS)² = ∫_Ω HiS(ω)² dP − (EHiS)² = ∫_Ω ( Σ_{j=1}^{m} µSj(ω) · hi,j )² dP − (EHiS)².

For var HiZ, i = 1, . . . , n, it holds that

var HiZ = E((HiZ)²) − (EHiZ)² = Σ_{j=1}^{m} pZj · hi,j² − (EHiZ)² = ∫_Ω Σ_{j=1}^{m} µSj(ω) · hi,j² dP − (EHiZ)².

Hence, employing (EHiS)² = (EHiZ)²,

var HiZ − var HiS = ∫_Ω Σ_{j=1}^{m} µSj(ω) · hi,j² dP − ∫_Ω ( Σ_{j=1}^{m} µSj(ω) · hi,j )² dP = ∫_Ω [ Σ_{j=1}^{m} µSj(ω) · hi,j² − ( Σ_{j=1}^{m} µSj(ω) · hi,j )² ] dP ≥ 0,
since the integrand is clearly non-negative (it represents the variance of a discrete random variable that takes the values hi,1 , . . . , hi,m with ”probabilities” µS1 (ω), . . . , µSm (ω)). Moreover, it is obvious that the difference of variances is equal to zero, if and only if hi,j = hi,k for any j 6= k such that both EµSj and EµSk are positive. Thus, whereas the expected values EH1S , . . . , EHnS and EH1Z , . . . , EHnZ are the same, the variances var H1S , . . . , var HnS and var H1Z , . . . , var HnZ generally differ. This means that ordering of the alternatives that is based on the expected values and variances of the consequences can be different for both the above described approaches. Let us illustrate this fact by the following numerical example.
644
Example 2. Let us consider the following situation: We can realize one of the two possible projects, denoted by x1 and x2. The result depends solely on the fact what kind of a government coalition will be established after the parliamentary election. Let us assume that only the six possible coalitions, denoted by ω1, . . . , ω6, can be established after the election. For the sake of simplicity, let the probabilities of establishing each coalition be the same. Thus, the probability space (Ω, A, P) is given, where Ω = {ω1, . . . , ω6}, A is the family of all subsets of Ω, and P is the probability measure such that P({ωk}) = 1/6, k = 1, . . . , 6, and P(A) = Σ_{k: ωk∈A} P({ωk}) for any A ⊆ Ω. We distinguish three possibilities (states of the world) - a right coalition (S1), a centre coalition (S2), and a left coalition (S3). Let the above mentioned three states of the world be expressed by the following fuzzy sets defined on Ω:

S1 = 1|ω1, 0.8|ω2, 0.2|ω3, 0|ω4, 0|ω5, 0|ω6,
S2 = 0|ω1, 0.2|ω2, 0.8|ω3, 1|ω4, 0.5|ω5, 0|ω6,
S3 = 0|ω1, 0|ω2, 0|ω3, 0|ω4, 0.5|ω5, 1|ω6,
where elements of the sets are in the form µSj (ωk ) |ωk , j = 1, 2, 3, and k = 1, . . . , 6. How the future yield (in %) depends on the fact which of these three types of a coalition will occur in the future is described in Tab. 3.
      S1    S2    S3
x1    15    0     -4
x2    15    -3    0
Table 3 The future yields (in %) of x1 and x2 based on the type of a coalition.
First, let us compute the expected values and variances of the random variables H1Z and H2Z. The probabilities of the fuzzy states of the world S1, S2, and S3, computed according to (1), are given as follows: PZ(S1) = 0.333, PZ(S2) = 0.417, and PZ(S3) = 0.25. Hence, we get

EH1Z = 0.333 · 15 + 0.417 · 0 + 0.25 · (−4) = 4 %,
EH2Z = 0.333 · 15 + 0.417 · (−3) + 0.25 · 0 = 3.75 %,

and

var H1Z = 0.333 · (15 − 4)² + 0.417 · (0 − 4)² + 0.25 · (−4 − 4)² = 63,
var H2Z = 0.333 · (15 − 3.75)² + 0.417 · (−3 − 3.75)² + 0.25 · (0 − 3.75)² = 64.69.

By applying the rule of maximization of the expected value and minimization of the variance, we would choose the alternative x1 over x2. However, as it was stated in Section 2, the interpretation of the characteristics EHiZ and var HiZ, i = 1, 2, is questionable. Thus, the selection of x1 is not trustworthy.
Now, let us construct the random variables H1S and H2S. Since the universal set Ω is a discrete set, H1S and H2S are discrete random variables. According to (5), we obtain the following:

H1S(ω1) = 15 %, H1S(ω2) = 12 %, H1S(ω3) = 3 %, H1S(ω4) = 0 %, H1S(ω5) = −2 %, and H1S(ω6) = −4 %,
H2S (ω1 ) = 15 %, H2S (ω2 ) = 11.4 %, H2S (ω3 ) = 0.6 %, H2S (ω4 ) = −3 %, H2S (ω5 ) = −1.5 %, and H2S (ω6 ) = 0 %.
Both the random variables H1S and H2S take on these values with probabilities equal to 1/6. According to (6), the expected values EH1S and EH2S coincide with EH1Z and EH2Z, i.e. EH1S = 4 % and EH2S = 3.75 %. The variances of H1S and H2S are given as follows:

var H1S = ( Σ_{k=1}^{6} (H1S(ωk) − 4)² ) / 6 = 50.33,
var H2S = ( Σ_{k=1}^{6} (H2S(ωk) − 3.75)² ) / 6 = 47.03.

645
Thus, we can see that under this approach, we would not be able to decide between x1 and x2 according to the rule of maximization of the expected value and minimization of the variance, since EH1S > EH2S and var H1S > var H2S .
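The figures of Example 2 can be verified with a few lines of code (an illustrative Python sketch, not part of the paper); it reproduces the Zadeh probabilities, the characteristics of HiZ from (2)–(3) and the characteristics of HiS based on (5).

    import numpy as np

    P = np.full(6, 1/6)                                   # P({omega_k})
    mu = np.array([[1.0, 0.8, 0.2, 0.0, 0.0, 0.0],        # memberships of S1
                   [0.0, 0.2, 0.8, 1.0, 0.5, 0.0],        # memberships of S2
                   [0.0, 0.0, 0.0, 0.0, 0.5, 1.0]])       # memberships of S3
    H = np.array([[15.0,  0.0, -4.0],                     # consequences of x1
                  [15.0, -3.0,  0.0]])                    # consequences of x2

    pZ = mu @ P                                           # Zadeh probabilities: 0.333, 0.417, 0.25
    EZ = H @ pZ                                           # EH_i^Z: 4.00, 3.75
    varZ = (H**2) @ pZ - EZ**2                            # var H_i^Z: 63.00, 64.69
    HS = H @ mu                                           # H_i^S(omega_k) from eq. (5)
    ES = HS @ P                                           # EH_i^S: 4.00, 3.75
    varS = (HS**2) @ P - ES**2                            # var H_i^S: 50.33, 47.03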
4
Conclusion
We have dealt with the problem of extending a decision matrix to the case of fuzzy states of the world. We have analysed the approach to this problem proposed by Talašová and Pavlačka [7] that is based on applying the Zadeh's probabilities of the fuzzy states of the world. We have found out that the meaning of the obtained characteristics of the consequences of alternatives, namely the expected values and variances, is questionable. Therefore, we have introduced a new approach that consists in deriving fuzzy rule-based systems from a decision matrix with fuzzy states of the world. In such a case, the obtained characteristics of consequences, based on which the alternatives are compared, are clearly interpretable. We have proved that the resulting expected values of consequences are the same for both approaches, whereas the variances generally differ. In a numerical example, we have shown that the final ordering of the alternatives according to the two approaches can be different. Next research in this field could be focused on the cases where the consequences of alternatives are given by fuzzy numbers, and/or where the underlying probability measure is fuzzy.
Acknowledgements Supported by the project No. GA 14-02424S of the Grant Agency of the Czech Republic and by the grant IGA PrF 2016 025 Mathematical Models of the Internal Grant Agency of Palacký University Olomouc. The support is greatly acknowledged.
References [1] Chankong, V., Haimes, Y.Y.: Multiobjective decision making: theory and methodology. Elsevier Science B.V., New York, 1983. [2] Negoita, C.V., Ralescu, D.: Applications of Fuzzy Sets to System Analysis. Birkhuser Verlag–Edituria Technica, Stuttgart, 1975. [3] Özkan, I., Türken, IB.: Uncertainty and fuzzy decisions. In: Chaos Theory in Politics, Springer Netherlands, 2014, 17–27. [4] Pavlačka, O., Rotterová, P.: Probability of fuzzy events. In: Proceedings of the 32nd International Conference Mathematical Methods in Economics (J. Talašová, J. Stoklasa, T. Talášek, eds.), Palacký University Olomouc, Olomouc, 2014, 760–765. [5] Rotterová, P., Pavlačka, O.: Probabilities of Fuzzy Events and Their Use in Decision Matrices. International Journal of Mathematics in Operational Research. To Appear. [6] Sugeno, M.: Industrial applications of fuzzy control. Elsevier Science Pub. Co., New York, 1985. [7] Talašová, J., Pavlačka, O.: Fuzzy Probability Spaces and Their Applications in Decision Making. Austrian Journal of Statistics 35, 2&3 (2006), 347–356. [8] Yoon, K.P., Hwang, Ch.: Multiple Attribute Decision Making: An Introduction. SAGE Publications, California, 1995. [9] Zadeh, L.A.: Probability Measures of Fuzzy Events. Journal of Mathematical Analysis and Applications 23, 2 (1968), 421–427.
646
Valuation of the externalities from biogas stations Marie Pechrová1, Václav Lohr2
Abstract. The aim of the paper is to valuate the externalities caused by projects realized with financial support of Rural Development Program for the Czech Republic for years 2014-2020, particularly the externalities from biogas stations. We applied hedonic model valuation as we suppose that it can give the most objective price of externalities related to the biogas stations as the price is derived from the value of real estates. A case study was done on eight biogas stations in Jihomoravsky region. The data were obtained from real estate server at the beginning of 2.Q of 2016. Price of flats for sale was explained in the regression model by number of rooms, acreage and distance to the biogas station. Linear and log-linear relation was taken into account. As expected, the effect of the distance from biogas station on the flat prices is negative as its presence lowers the price by 0.15% according to linear regression model, or by 0.40% based on log-linear form of the model. Keywords: biogas station, rural development program, externalities JEL Classification: D61, C21 AMS Classification: 62J05
1
Introduction
The aim of the paper is to evaluate the externalities caused by projects realized with financial support of Rural Development Program for the Czech Republic for years 2014-2020 (RDP), particularly the externalities from biogas stations (BGS). “Externalities are defined as benefits or costs, generated as by-products of an economic activity that do not accrue to the parties involved in the activity.” [1] Externalities are generated as side effects of an activity and affect (usually negatively) the production or consumption possibilities of other economic agents without permission or compensation. Besides, the market prices of the externalities often do not exist. Therefore, they have to be monetary valuated. “Monetary valuation is the practice of converting measures of social and biophysical impacts into monetary units and is used to determine the economic value of non-market goods, i.e. goods for which no market exists.“ [8] There is a complex of externalities related to the BGS: noise from operation of the BGS or from transport of the feed, odor, esthetic point of view etc.; are negatively perceived by local inhabitants. In this sense, the presence of the BGS lowers their welfare. Therefore, when selecting the projects to be financed from RDP, not only economic, but also social point of view should be applied. Cost benefit analysis (CBA) is recommended as the suitable method for project evaluation. CBA measures inputs and outputs in monetary units. “It understands the benefits as every increase of the utility and costs as every decrease of the utility, what is sometimes hard to be determined in monetary terms.” [7] There are various methods to monetary valuate externalities related to certain type of projects. Particularly, hedonic price method (HPM) was selected and applied in this article. First, there are presented the results or researches which used HPM. Next section describes the method in detail together with used data and models. Then the results of the analysis are presented and discussed. Last section concludes.
1.1 Results of previous research
To the best of the authors’ knowledge, there is no study which would assess the externalities related to BGS. Therefore, we present the results of research aimed at the evaluation of the odor from animal sites or landfills, as we suppose that odor nuisance is one of the main externalities from BGS, despite the fact that researchers and experts do not agree on whether or how much a biogas station smells [7]. Van Broeck, Bogaert and de Meyer [13] examined the externalities related to odor from an animal waste treatment facility based on 1200 real estate prices in Belgium. They compared houses situated within and outside odor-affected areas in the same statistical sector. A log-linear model showed that an extra sniffing unit corresponds to a decrease of the average net actual value by 650 euros, or by 0.4%. The linear regression model estimated a decrease of 1526 euros (0.8%).
1 Institute of Agricultural Economics and Information, Department of the impacts of agricultural policy, Mánesova 75, Prague, Czech Republic, e-mail: pechrova.marie@uzei.cz 2 Czech University of Life Sciences Prague, Faculty of Economics and Management, Department of Information Technologies, Kamýcká 129, Prague, Czech Republic, e-mail: lohr@pef.czu.cz
Also Milla, Thomas and Ansine [4] found a negative and significant influence of the proximity to swine facilities on the selling price of residential properties. Similar results were achieved by Palmquist, Roka and Vukina [6], who analyzed the effect of large-scale hog operations on residential property values. “An index of hog manure production at different distances from the houses was developed. It was found that proximity caused statistically significant reduction in house prices of up to 9 percent depending on the number of hogs and their distance from the house.” [6] Bouvier et al. [2] examined six landfills, which differed in size, operating status, and history of contamination, and estimated by a multiple regression model the effect of each landfill on rural residential property values. “In five of the landfills, no statistically significant evidence of an effect was found. In the remaining case, evidence of an effect was found, indicating that houses in close proximity to this landfill suffered an average loss of about six percent in value.” [2]
The proximity to the site indeed plays an important role, as the effect is visible only up to a certain distance. Nelson, Genereux and Genereux [5] examined the price effects of one landfill on the values of 708 houses in Minnesota, USA, in the 1980s. They came to the conclusion that the influence is visible up to 2-2.5 miles: at the landfill boundary the prices decrease by 12%, and about 1 mile away by 6%. Besides the distance, the character of the area is also an important determinant of the effect. Reichert, Small and Mohanty [10] analyzed the effect of five municipal landfills on residential property values in a major metropolitan area (Cleveland, Ohio, USA). They concluded that landfills would likely have an adverse impact upon housing values when the landfill was located within several blocks of an expensive housing area (the negative impact was between 5.5%-7.3% of market value depending upon the actual distance from the landfill), but in the case of less expensive and older areas the landfill effect was considerably less pronounced (3%-4% of market value), and for predominantly rural areas no effect was measured at all. The specifics of rural areas are analyzed, for example, by Šimpach and Pechrová [12]. The size of the landfill also plays a role. Ready [9] found out that landfills that accept high volumes of waste (≥ 500 tons/day) decrease neighboring residential property values on average by 12.9%. This impact diminishes with distance at a gradient of 5.9% per mile, while lower-volume landfills decrease property values on average by 2.5% with a gradient of 1.2% per mile. Besides, 20-28% of low-volume landfills have no impact at all on nearby property values [9]. A price decrease of a similar magnitude was also found in Californian (USA) cities where industrial odors were present: Saphores and Aguilar-Benitez [11] found a statistically significant reduction in house prices of up to 3.4% using a GLS model with controlled heteroskedasticity and GIS software.
2 Data and methods
2.1 Hedonic pricing method
The essence of the hedonic pricing method is the evaluation of the externality’s price based on the decrease of the value of properties located near the site which causes the loss of benefits. It belongs to the methods based on a substitute market (the real estate market) when a real market for the externality does not exist. This method requires detailed data about the sales transactions and the characteristics of the sold estates. When the frequency of property sales is not sufficiently high (for example in rural areas), the usage of the HPM might be problematic. The price of the real estate P (or the logarithm of the price, depending on the chosen functional form) is normally explained by three categories of variables (see equation (1)): C – physical characteristics that contribute to the price of a house (such as the number of rooms, lot size, date of construction, type of ownership, energy demand etc.), N – neighborhood characteristics (distance to the center, supermarket, school, park, etc.), E – characteristics of the surrounding environment (distance to the activity which bothers the local residents, that is the operation of the BGS in our case).
P = f(C, N, E),    (1)
Hedonic price models are commonly estimated by the method of ordinary least squares (OLS) in cases when cross-sectional data are available or by fixed effects models when the data are of panel nature.
2.2 Data
Data on real estates for sale in the Jihomoravsky region of the Czech Republic were gathered. There were originally 1480 real estates for sale at the beginning of the second quarter of 2016, in size categories from 1+kk up to 5+1. The size of a flat was measured by the number of rooms (e.g. 1+1 or 2+kk are 2 rooms) and the acreage in m2. The distance was chosen as a proxy for the externalities related to the BGS: the further away the house for sale is, the lower is the price of the externality. There were 8 biogas stations operating in the region which used organic waste from agricultural production as feed. The characteristics of the BGS are displayed in Table 1.
Location | Power [kW] | Type of feed | Operational since | No. of nearby houses | Avg. distance from BGS [km]
Čejč | 1072 | pig manure | 2007 | 8 | 11.08
Hodonín | 261 | pig manure, bedding, maize silage, corn silage | autumn 2009 | 97 | 4.41
Mikulčice | 500 | silage, pig manure, farmyard manure, corn leftovers, moldings of tomatoes and apples | N/A | 36 | 10.96
Mutěnice | 526 | pig manure, maize silage | 2009 | 22 | 4.92
Šebetov | 130 | pig manure | 1997 | 19 | 8.70
Suchohrdly u Miroslavi | 495 | pig manure, maize straw | 2008 | 38 | 12.17
Švábenice | 526 | pig manure, maize silage | 2008 | 47 | 9.47
Velký Karlov | 1480 | animal by-products, manure, waste | 2006 | 51 | 11.56
Table 1 Characteristics of the biogas stations in the sample; Source: [3], own elaboration

The distance to the BGS was measured based on GPS coordinates. The longitude and latitude of a particular real estate and of a particular BGS were recalculated to a Cartesian coordinate system, and the shortest (air) distance was calculated using the Pythagorean theorem. Consequently, only those estates whose distance to the nearest biogas station was lower than 15 km were selected and used for the subsequent analysis. The cases with incomplete data were also dropped. Finally, 318 real estates were included in the analysis. Descriptive characteristics of the sample are displayed in Table 2. The average price of a real estate was 1.5 mil. CZK. On average, a flat had 3 rooms (2+1 or 3+kk) and its size was 72 m2. The estate closest to a BGS was 240 meters away; the upper threshold was set at 15 km. On average, the estates were 8 km away from a BGS. The majority of flats were close to the BGS in Hodonín run by the company M PIG spol. s r.o.; 51 real estates were less than 15 km from the BGS in Velký Karlov (Jevišovice) operated by ZEVO spol. s r.o.; 47 were close to Švábenice, etc. The determinants stated above were included in the model. Originally, the distance to the nearest big city (center) was also calculated, but it turned out to be statistically insignificant. Hence, it was dropped from the analysis.

Variable | Mean | Std. dev. | Min | Max
y – price [CZK] | 1 468 006 | 665 768 | 325 000 | 3 749 000
x1 – no. of rooms | 3.13 | 0.97 | 1 | 6
x2 – size [m2] | 71.96 | 33.77 | 18 | 382
x3 – distance to nearest BGS [km] | 8.43 | 4.34 | 0.24 | 14.98

Table 2 Descriptive characteristics of the sample; Source: own calculation

The data were checked for the presence of multicollinearity. The pairwise correlation coefficients were not higher than 0.8; hence, no multicollinearity was detected in the data.
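The distance computation described above can be illustrated with a short sketch (a minimal example, not the authors' actual code; the coordinates and the Earth-radius constant are hypothetical): the GPS coordinates are projected to a planar Cartesian system and the air distance follows from the Pythagorean theorem.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius used for the planar projection

def air_distance_km(lat1, lon1, lat2, lon2):
    """Planar (Pythagorean) approximation of the air distance between two GPS points."""
    lat0 = math.radians((lat1 + lat2) / 2.0)             # reference latitude of the projection
    dx = math.radians(lon2 - lon1) * math.cos(lat0) * EARTH_RADIUS_KM
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_KM
    return math.hypot(dx, dy)                            # sqrt(dx**2 + dy**2)

# Hypothetical coordinates of flats and biogas stations (lat, lon)
flats = [(48.95, 17.12), (49.20, 16.61)]
stations = [(48.94, 17.01), (49.00, 16.90)]

# Keep only flats whose distance to the nearest station is below the 15 km threshold
nearest = [min(air_distance_km(f[0], f[1], s[0], s[1]) for s in stations) for f in flats]
selected = [f for f, dist in zip(flats, nearest) if dist < 15.0]
```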
2.3 Models
The hedonic price function has no known explicit form. Therefore, various forms are usually used, such as linear, log-linear, log–log, and linear with Box–Cox transformations of the price and continuous regressors. In our case, two forms of the model were selected as optimal based on the multiple coefficient of determination (R2). Firstly, the relation between the price and its determinants was modelled by a linear function. The linear regression model estimated by OLS is defined as (2):
y = β1x1 + β2x2 + β3x3 + ut ,    (2)
where y is the explained variable, x1, x2, x3 are the explanatory variables with parameters β, and ut represents the stochastic term. The second form of the model was the exponential function (3); hence, a growth model was constructed.
y = e^(β1x1 + β2x2 + β3x3 + ut)    (3)

In order to estimate (3) by OLS, the model was linearized by the natural logarithm to the log-linear form (4).

ln y = β1x1 + β2x2 + β3x3 + ut    (4)
Another possible form (not applied in this paper) is the log–log model (5), i.e. a power function linearized by logarithms.

ln y = β1 ln x1 + β2 ln x2 + β3 ln x3 + ut    (5)
All models provided better results when the constant was not included. The calculations were done in econometric software Stata 11.2.
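The estimation itself was done in Stata; purely as an illustration, the two specifications without a constant can be estimated by OLS along the following lines (a minimal sketch with made-up data, not the authors' script):

```python
import numpy as np

# Hypothetical observations: columns are no. of rooms, size in m2, distance to nearest BGS in km
X = np.array([[3, 72.0, 8.4],
              [2, 55.0, 4.9],
              [4, 90.0, 12.2],
              [3, 68.0, 6.1]], dtype=float)
y = np.array([1_450_000, 1_150_000, 1_900_000, 1_400_000], dtype=float)

# Linear model (2): y = b1*x1 + b2*x2 + b3*x3 + u  (no constant term)
beta_linear, *_ = np.linalg.lstsq(X, y, rcond=None)

# Log-linear model (4): ln y = b1*x1 + b2*x2 + b3*x3 + u  (no constant term)
beta_loglinear, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

print(beta_linear, beta_loglinear)
```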
3 Results and Discussion
Results of the linear form of the model with estimated parameters are presented in equation (6) and Table 3.
y = 148 187.10 x1 + 10 284.55 x2 + 25 934.39 x3 ,    (6)
The parameters have the expected signs: an increase of each explanatory variable by one unit causes an increase of the price by a certain amount of money. An increase by 1 room represents an average increase in price by 148 187 CZK. Taking into account that the average price of a 1-room flat was 1.01 mil. CZK, of a 2-room flat 1.26 mil. CZK, of a 3-room flat 1.51 mil. CZK and of a 4-room flat 1.56 mil. CZK, the intensity of the increase seems reasonable. When the flat is larger by 1 m2, the price increases by 10 285 CZK. When the real estate is 1 km further from the nearest BGS, its price is higher by 25 934 CZK. All coefficients are statistically significant at the 95% level. The average price of the house (when the means from Table 2 are put into equation (6)) is 1 421 953 CZK. Based on the elasticities, it can be seen that the acreage of the flat influences the price the most: an increase of the acreage by 1% means an increase in price by 0.52%. On the other hand, when the distance of the real estate to the BGS decreases by 1%, the price decreases by 0.15%. Therefore, we may conclude that the presence of the BGS could cause a decrease in the flat price by 0.15%. The model fits the data well – the adjusted coefficient of determination is 85.60%. The whole model is also statistically significant (the F-value allows rejecting H0 that all parameters of the explanatory variables are jointly equal to zero).
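As a check using only the figures reported in Tables 2 and 3, the elasticity of the price with respect to the distance, evaluated at the sample means, is

$$
\varepsilon_3 = \hat{\beta}_3 \cdot \frac{\bar{x}_3}{\hat{y}} = \frac{25\,934.39 \times 8.43}{1\,421\,953} \approx 0.15,
$$

and the analogous computations give approximately 0.33 for the number of rooms and 0.52 for the acreage, matching the average elasticities in Table 3.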
Source | Sum of squares | Degrees of freedom | Mean sum of squares
Model | 7.08e14 | 3 | 2.36e14
Residual | 1.18e14 | 315 | 3.74e11
Total | 8.26e14 | 318 | 2.60e12

F(3, 315) = 630.92;  Prob > F = 0.00;  R2 = 85.73%;  Adj. R2 = 85.60%;  Root MSE = 6.10e5

y | Coefficient | Std. error | t-value | p-value | 95% confidence interval | Average elasticity
x1 | 148187.10 | 31218.22 | 4.75 | 0.000 | 86764.53 – 209609.70 | 0.33%
x2 | 10284.55 | 1161.22 | 8.86 | 0.000 | 7999.82 – 12569.27 | 0.52%
x3 | 25934.39 | 6877.98 | 3.77 | 0.000 | 12401.79 – 39466.98 | 0.15%

Table 3 Results of the linear regression model; Source: own calculation

The results of the log-linear form of the model are displayed in equation (7) and Table 4.
ln y = 2.6721 x1 + 0.0211 x2 + 0.3987 x3    (7)
The parameters of the log-linear model can be interpreted in percentage terms. As expected, when the distance increases by 1 km, the price of the real estate increases, on average, by 0.3987%. An increase of the number of rooms by 1 causes an increase of the price by 2.67%, and an increase by 1 m2 an increase by 0.02%. The explanatory power of the model is better than in the previous case – the changes of the real estate’s price are explained by the changes of the explanatory variables from 93.89% (based on the adjusted coefficient of determination). The model as a whole is statistically significant, as are all its parameters.
Source | Sum of squares | Degrees of freedom | Mean sum of squares
Model | 5.95e4 | 3 | 19822.76
Residual | 3.83e3 | 315 | 12.17
Total | 63301.51 | 318 | 199.06

F(3, 315) = 1628.96;  Prob > F = 0.00;  R2 = 93.94%;  Adj. R2 = 93.89%;  Root MSE = 3.49

ln y | Coefficient | Std. error | t-value | p-value | 95% confidence interval
x1 | 2.67 | 0.18 | 1.16 | 0.000 | 2.32 – 3.02
x2 | 0.02 | 0.01 | 3.18 | 0.002 | 0.01 – 0.03
x3 | 0.40 | 0.04 | 10.16 | 0.000 | 0.32 – 0.48
Table 4 Results of the log-linear regression model; Source: own calculation

The results of our research show a lower intensity of price decrease than other studies. The closest results are those of van Broeck, Bogaert and de Meyer [13]. They estimated the same intensity of price decrease based on a log-linear model (0.4%), but their research had the advantage that dispersion studies of odor were available for the sites. Therefore, the decrease of price is related to the increase of the amount of sniffing units as the real estate gets closer to the animal waste treatment facility. In our case, the price decrease is measured by the distance to the BGS. The linear regression models provided different results – van Broeck, Bogaert and de Meyer [13] reported a decrease by 0.8%, but in our case the elasticity calculated at the average values of the other variables revealed a lower elasticity (0.52%). Palmquist, Roka and Vukina [6] found a statistically significant reduction in house prices of up to 9% in the case of hog farms. This is a relatively high intensity in comparison with our results, taking into account the fact that the BGS included in our research were mostly fed by pig manure and hence the nature of the odor is similar. A different type of odor (local industrial) can explain the different intensity in the case of the research by Saphores and Aguilar-Benitez [11], where the decrease was 3.4%. The distance also matters. As most of the examined houses in our study were close to the Hodonín BGS, the decrease of price might be steeper. However, running a separate model only for those houses near the Hodonín BGS did not bring statistically significant results, and hence this model is not presented in the article.
4 Conclusion
The aim of the paper was to evaluate the externalities related to biogas stations (BGS). From the complex set of externalities related to the operation of a BGS, such as odor, noise, aesthetic aspects etc., the odor was taken as the most important, and the results of our research are compared to studies evaluating the price of smell. The hedonic pricing method (as one of the valuation methods which utilize substitute markets) was used. The price of a flat was explained by its number of rooms, acreage and distance to the BGS. The data were obtained by special software from an internet server. They included 318 flats for sale in the Jihomoravsky region at the beginning of the second quarter of 2016. Linear and log-linear (exponential) types of models were estimated by the ordinary least squares method. The results of the linear regression show that when the house is closer to the BGS by 1 km, its price is lower by 25 934 CZK. Using the elasticity at average values, it was calculated that when the distance to the BGS decreases by 1%, the price of the house decreases by 0.15%. Based on the log-linear model, we can conclude that when the distance decreases by 1 km, the price of the real estate decreases by 0.3987%. However, despite the fact that the fit of both models is high, they are simplified to some extent, as they do not include other price determinants. It was not possible to download these characteristics with the software used; collecting information manually on other characteristics of the flats which determine their price (such as the type of ownership, or whether there is a cellar, balcony, garage etc.) and including them in the model remains a challenge for future research. We will further develop our model and try to set the price of the externalities related to BGS as objectively as possible, so that it can be used in the CBA of projects financed from the RDP. The use of the contingent valuation method is also planned.
Acknowledgements The paper was supported by the thematic task No. 4107/2016 (No. 18/2016) of Institute of Agricultural Economics and Information.
References [1] Birol, E.: Using Economic valuation techniques to inform water resources management: A survey and critical appraisal of available techniques and an application. Science of the Total Environment 365 (2006), 105 – 122. [2] Bouvier, R. A., Halstead, J. M., Conway, K. S., Manalo, A. B.: The Effect of Landfills on Rural Residential Property Values: Some Empirical Evidence. Journal of the Regional Analysis and Policy 30 (2000), 23–37. [3] CZBA (Czech Biogas Association). Mapa bioplynových stanic. [on-line]. Available from http://www.czba.cz/mapa-bioplynovych-stanic [cit. 15-03-2016] [4] Milla, K., Thomas, M.H., Ansine, W.: Evaluating the Effect of Proximity to Hog Farms on Residential Property Values: A GIS – Based Hedonic Price Model Approach. URISA Journal 17 (2005), 27–32. [5] Nelson, A., Genereux, J., Genereux, M.: Price Effects of Lands Fills on House Values. Land Economics 68 (1992), 259–365. [6] Palmquist, R. B., Roka, F. M., Vukina, T.: Hog Operations, Environmental Effects, and Residential Property Values. Land Economics 73 (1997), 114–124. [7] Pechrová, M.: Valuation of Externalities Related to Investment Projects in Agriculture. In: Proceedings of the 20th International Conference Current Trends in Public Sector Research. Masaryk’s university, Brno, 2016, 352–360. [8] Pizzol, M., Weidema, B., Brand, M., Osset, P.: Monetary valuation in Life Cycle Assessment: a review. Journal of Cleaner Production 86 (2015), 170–179. [9] Ready, R. C.: Do Landfills Always Depress Nearby Property Values? Rural Development Paper 27 (2005), 1–29. [10] Reichert, A., Small, M., Mohanty, S.: The Impact of Landfills on Residential Property Values. Journal of Real Estate Research 7 (1992), 297–314. [11] Saphores, J.-D., Aguilar-Benitez, I.: Smelly local polluters and residential property values: A hedonic analysis of four Orange County (California) cities. Estudios Económicos 20 (2005), 197–218. [12] Šimpach, O., Pechrová, M.: Rural Areas in the Czech Republic: How Do They Differ from Urban Areas? In: Proceedings of the 22nd International Scientific Conference on Agrarian Perspectives - Development Trends in Agribusiness. Czech University of Life Sciences Prague, Prague, 322–334. [13] van Broeck, G., Bogaert, S., de Meyer, L.: Odours and VOCs: Measurement, Regulation and Control Techniques. Kassel university press, Diagonale 10, Kassel, 2009. ISBN: 978-3-89958-608-4.
Multiple criteria travelling salesman problem
Pekár Juraj1, Lucia Mieresová2, Reiff Marian3

Abstract: The presented paper addresses the problem of linking the traveling salesman problem with multiple criteria iterative-interactive methods. The aim of the paper is thus to solve the multiple criteria traveling salesman problem, since research publications in this area are less frequent in comparison to the classical traveling salesman problem. In addressing the problem, an iterative method is used for finding a compromise solution of the multiple criteria decision making problem. In the application part of the paper, we demonstrate how to solve the multiple criteria traveling salesman problem on data from regions of Slovakia. The first part of the paper deals with the definition of the classical traveling salesman problem, the multiple criteria traveling salesman problem and their mathematical formulations. The second part contains a description of the multiple criteria iterative method. The specification and solution of a real problem are presented in the third part.
Key words: Multiple criteria problem, traveling salesman problem, iterative method.
JEL Classification: C02, C61
AMS Classification: 90C11, 90B06
1 Introduction
Vehicle routing problems ([2], [3], [4], [6]) are well known mainly for their utilization in many real-world applications. In this paper, we focus on the analysis of the traveling salesman problem (TSP) and ways of computing its compromise solution, with the analysis focused on the multiple criteria modification of the TSP and the identification of its compromise solution. Models of multiple criteria decision making offer solving tools for different types of problems, and in our paper we have chosen to apply the ISTM (Interactive Step Trade-off Method) to the TSP. The Interactive Step Trade-off Method [5], [7] represents an iterative procedure in which, based on the decision-making process carried out by two entities (analyst and decision maker), the decision maker selects alternatives from the set of efficient solutions. In the ISTM method, after obtaining the first solution, the decision maker has to decide whether a target needs to be improved, whether it should be maintained at least at its current level, or whether some part of it can be relaxed. Based on this compromise analysis, the ISTM procedure finds a new efficient solution that can satisfy the decision maker's preferences.
2 Modification of ISTM for solution of multiple criteria TSP
In the next section we present a modification of ISTM for the solution of the multiple criteria traveling salesman problem. The steps of the ISTM computation procedure for obtaining a solution of the multiple criteria TSP can be summarized as follows:

Step 1. Define the multiple criteria optimization problem for the solution of the routing problem (TSP):

max f_1(x) = −∑_{i=1}^{n} ∑_{j=1}^{n} d^{(1)}_{ij} x_{ij}
max f_2(x) = −∑_{i=1}^{n} ∑_{j=1}^{n} d^{(2)}_{ij} x_{ij}
…
max f_k(x) = −∑_{i=1}^{n} ∑_{j=1}^{n} d^{(k)}_{ij} x_{ij}    (1)

subject to
∑_{i=1, i≠j}^{n} x_{ij} = 1,  j = 1, 2, …, n
∑_{j=1, j≠i}^{n} x_{ij} = 1,  i = 1, 2, …, n
y_i − y_j + n x_{ij} ≤ n − 1,  i, j = 2, 3, …, n
x_{ij} ∈ {0, 1},  i, j = 1, 2, …, n
1 University of Economics in Bratislava, juraj.pekar@euba.sk
2 University of Economics in Bratislava, mieresova.euba@gmail.com
3 University of Economics in Bratislava, reiff@euba.sk
Step 2. Utilize the method for generating efficient alternatives based on an ideal alternative f* = (f_1^*, f_2^*, …, f_k^*) [5] for specified values of the weights λ_r (r = 1, …, k):

min f(X, y, α) = α

subject to
−∑_{i=1}^{n} ∑_{j=1}^{n} d^{(r)}_{ij} x_{ij} + (1/λ_r) α ≥ f_r^*,  r = 1, 2, …, k    (2)
∑_{i=1, i≠j}^{n} x_{ij} = 1,  j = 1, 2, …, n
∑_{j=1, j≠i}^{n} x_{ij} = 1,  i = 1, 2, …, n
y_i − y_j + n x_{ij} ≤ n − 1,  i, j = 2, 3, …, n
x_{ij} ∈ {0, 1},  i, j = 1, 2, …, n
α ≥ 0

Step 3. Categorize the objective functions into the following subsets:
W – subset of indices of goal objective functions whose value needs to be improved from the current level f_j(x^{t−1}),
R – subset of indices of goal objective functions whose value should be maintained at least at the current level f_j(x^{t−1}),
Z – subset of indices of goal objective functions whose value may be deteriorated from the current level f_j(x^{t−1}).

Let
W = {i | i = i_1, i_2, …, i_w},  R = {j | j = j_1, j_2, …, j_r},  Z = {l | l = l_1, l_2, …, l_z},    (3)
where W ∪ R ∪ Z = {1, 2, …, k} and W ∩ R ∩ Z = ∅.

Step 4. Assume that u_i (i ∈ W) denotes an auxiliary variable; then the problem solved at this stage can be defined as follows:

max u(u) = ∑_{i∈W} λ_i u_i
x ∈ Ω,  x ∈ Ω_u = { f_i(x) − h_i u_i ≥ f_i(x^{t−1}), u_i ≥ 0, i ∈ W;  f_j(x) ≥ f_j(x^{t−1}), j ∈ R;  f_l(x) ≥ f_l(x^{t−1}) − df_l(x^{t−1}), l ∈ Z },    (4)

where Ω still represents the former set of feasible solutions and Ω_u represents the set of additional structural constraints. The variable u_i represents the improvement of f_i(x), u(u) is a utility function representing the improvement of the objective functions in which improvement is requested, and λ_i is a positive weight factor, defined according to the relative importance of the goal objective functions in the subset W. We leave the value λ_i = 1 (i ∈ W).

Step 5. Solve the problem stated in Step 4. The optimal solution, denoted as x^t, represents a new efficient solution of the former problem.

Step 6. If the decision maker is not satisfied with the obtained solution x^t, set t = t + 1 and return to Step 3. The iterative procedure stops if: a) there is no requirement to improve any target value, or b) deterioration of the desired goals is not feasible.
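To make Step 2 concrete, the following sketch builds the ideal-point model (2) for a small instance with two criteria using the open-source PuLP modeller (the authors used GAMS; the distance matrices, ideal values f* and weights below are purely hypothetical):

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, value

# Hypothetical data: two criteria matrices for a 4-node instance; f_star holds the
# (assumed) ideal values obtained by optimizing each criterion separately via model (1);
# lam holds the criterion weights derived from the pay-off table.
D = [[[0, 5, 9, 4], [5, 0, 7, 6], [9, 7, 0, 3], [4, 6, 3, 0]],
     [[0, 6, 8, 5], [6, 0, 9, 7], [8, 9, 0, 4], [5, 7, 4, 0]]]
f_star = [-19.0, -22.0]
lam = [0.6, 0.4]
n, K = 4, 2
arcs = [(i, j) for i in range(n) for j in range(n) if i != j]

prob = LpProblem("ISTM_step2", LpMinimize)
x = LpVariable.dicts("x", arcs, cat=LpBinary)
u = LpVariable.dicts("u", range(n), lowBound=0)
alpha = LpVariable("alpha", lowBound=0)
prob += alpha                                           # objective: min alpha

for r in range(K):                                      # f_r(x) + alpha/lambda_r >= f_r*
    prob += (-lpSum(D[r][i][j] * x[(i, j)] for (i, j) in arcs)
             + (1.0 / lam[r]) * alpha >= f_star[r])
for j in range(n):                                      # assignment constraints
    prob += lpSum(x[(i, j)] for i in range(n) if i != j) == 1
    prob += lpSum(x[(j, i)] for i in range(n) if i != j) == 1
for i in range(1, n):                                   # MTZ subtour elimination
    for j in range(1, n):
        if i != j:
            prob += u[i] - u[j] + n * x[(i, j)] <= n - 1

prob.solve()
tour_arcs = [(i, j) for (i, j) in arcs if value(x[(i, j)]) > 0.5]
print(tour_arcs)
```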
3 Application of the multiple criteria TSP utilizing ISTM on data from Slovakia
In the analysis, the nodes represent district towns of the central Slovakia regions, the Žilina and Banská Bystrica Counties. The list of nodes from the Žilina County on which the analysis was performed is as follows: 1 – Bytča, 2 – Čadca, 3 – Dolný Kubín, 4 – Kysucké Nové Mesto, 5 – Liptovský Mikuláš, 6 – Martin, 7 – Námestovo, 8 – Ružomberok, 9 – Turčianske Teplice, 10 – Tvrdošín, 11 – Žilina; and from the Banská Bystrica County: 12 – Banská Bystrica, 13 – Banská Štiavnica, 14 – Brezno, 15 – Detva, 16 – Krupina, 17 – Lučenec, 18 – Poltár, 19 – Revúca, 20 – Rimavská Sobota, 21 – Veľký Krtíš, 22 – Zvolen, 23 – Žarnovica, 24 – Žiar nad Hronom.

When defining the task, a set of three objectives was defined. The first objective is to minimize the driving distance. The second objective is to minimize the distance traveled including detours of mountain passes and roads above a specific height above sea level; the detour of mountain passes reflects safety requirements, especially under winter conditions on the route. The third objective represents the preference for minimizing the time duration of the route.

Parameters: the matrix D(1) = {d(1)ij} represents the values of the minimum distances between the towns; the matrix D(2) = {d(2)ij} represents the values of the minimum distances between the towns omitting mountain passes; the matrix D(3) = {d(3)ij} represents the time values of the minimum distances between the towns.

In the first step, each of the three objectives is optimized, i.e. model (1) is solved for each individual objective function. Afterwards, the pay-off table [1] is created from the optimization results; the optimal values of each objective function lie on the diagonal of Table 1.

       f1(x)    f2(x)    f3(x)
x^1    –759     –952.4   –784
x^2    –835.3   –869.1   –831
x^3    –763.5   –950.4   –781
Table 1 Pay-off table

Based on the pay-off table, the relative weights of the three targets were calculated with the following values: [λ1, λ2, λ3] = [0.3943, 0.3245, 0.2817]. At this stage it is possible to use the method for generating efficient alternatives based on the ideal vector in order to find the initial efficient solution, i.e. to solve model (2). Based on the output of the optimization software GAMS, the following initial efficient solution was acquired. x(0) represents a binary matrix that depicts the order in which the nodes should be visited. In order to obtain the values of F(x(0)), it is necessary to make calculations based on the obtained binary matrix and the matrices of values of the individual criteria.
x(0) = (binary routing matrix determining the order in which the nodes are visited; shown as a matrix in the original paper)

F(x(0)) = [784.9; 875.4; 810]
The resulting values can be interpreted as follows. Based on the solution, the sequence of locations is determined, i.e. the schedule of truck loading is determined. On the day of delivery, according to the specific situation on the route, one of the three listed solutions is implemented. If the decision maker accepted the solution, then in the case of the shortest route 784.9 km are driven. If the safety requirements are preferred and mountain passes and roads need to be avoided, 875.4 km are driven. If the quickest route is preferred, it takes 810 minutes to drive the calculated route. In order to perform the first interaction, the first interactive table is constructed (Table 2).

f1(x(0)) = 784.9 | f2(x(0)) = 875.4 | f3(x(0)) = 810
W                | Z                | Z
                 | df2(x(0)) = 60   | df3(x(0)) = 30

Table 2 First interaction of the ISTM calculation procedure

In this step the decision maker and the analyst interact. It is necessary that the decision maker states his satisfaction with the provided solution. The decision maker can accept the proposed solution or identify which of the objectives needs to be improved. The compromises shown in Table 2 mean that the decision maker would like to improve f1(x) at the expense of f2(x) and f3(x), so that f2(x) and f3(x) may be relaxed as follows: the second target by 60 units, i.e. by 60 kilometers, and the third target by 30 units, i.e. by 30 minutes. On the basis of these values, the task which ensures the implementation of the conditions laid down by the decision maker is formulated as (4). The variable u1, which represents the improvement of the utility of the first target (the increment of the first criterion value), is introduced at this stage. Solving the problem leads to the following efficient solution.
x(1) = (binary routing matrix of the new solution; shown as a matrix in the original paper)

u1 = 0.1258,  F(x(1)) = [775.3; 934.9; 797.0]
Afterwards, the second interactive table, which presents the values of the individual objective functions, is constructed (Table 3).

f1(x(1)) = 775.3 | f2(x(1)) = 934.9 | f3(x(1)) = 797

Table 3 Second interaction of the ISTM calculation procedure

Assuming the decision maker is satisfied with the values f1(x), f2(x) and f3(x), the obtained solution is the best compromise solution. After the calculation in this step, the value of the first objective function was improved; therefore, let us assume that the decision maker is satisfied with the achieved results and does not require further improvement of the goals. The selected compromise solution in this case constitutes the following route: Bytča – Kysucké Nové Mesto – Čadca – Námestovo – Tvrdošín – Dolný Kubín – Ružomberok – Liptovský Mikuláš – Brezno – Revúca – Rimavská Sobota – Poltár – Lučenec – Veľký Krtíš – Detva – Banská Bystrica – Zvolen – Krupina – Banská Štiavnica – Žarnovica – Žiar nad Hronom – Turčianske Teplice – Martin – Žilina – Bytča.
The resulting values represent the following: if the decision maker accepted this solution, then in the case of the shortest route 775.3 km are driven; if the safety requirements are preferred and mountain passes and roads need to be avoided, 934.9 km are driven; if the quickest route is preferred, it takes 797 minutes to drive the obtained route.
Conclusion
In the article the authors outline a procedure to solve the multiple criteria traveling salesman problem. They focus on the interactive method ISTM, which finds a compromise solution based on the interaction between the decision maker and the analyst. A solution procedure for the multiple criteria TSP was proposed in the paper. Three objectives were defined: the minimum driving distance, the minimum distance traveled omitting the mountain passes, and the minimum travel time. The result is a sequence of towns of central Slovakia on the route. This procedure can be used to set the parameters of the distribution of goods in cases where the loading of goods is carried out at a time when the conditions are not known and the decision on the selection of the type of route (shortest route, fastest route, or a route bypassing mountain passes under bad weather conditions) is made before the actual delivery. The computational experiments were based on regional data of Slovakia. The software implementation was realized in GAMS (CPLEX solver 12.2.0.0).
Acknowledgements This paper is supported by the Grant Agency of Slovak Republic – VEGA, grant no. 1/0245/15 „Transportation planning focused on greenhouse gases emission reduction“.
References
[1] BENAYOUN, R., MONTGOLFIER, J., TERGNY, J., LARITCHEV, O.: Linear programming with multiple objective functions: Step Method (STEM). Mathematical Programming, North-Holland Publishing Company, 1971.
[2] ČIČKOVÁ, Z., BREZINA, I.: An evolutionary approach for solving vehicle routing problem. In: Quantitative methods in economics multiple criteria decision making XIV. IURA EDITION, Bratislava, 2008, 40–44.
[3] ČIČKOVÁ, Z., BREZINA, I., PEKÁR, J.: Solving the Real-life Vehicle Routing Problem with Time Windows Using Self Organizing Migrating Algorithm. Ekonomický časopis 61 (5/2013), 497–513.
[4] ČIČKOVÁ, Z., BREZINA, I., PEKÁR, J.: Open vehicle routing problem. In: Mathematical Methods in Economics 2014. Faculty of Science, Palacký University, Olomouc, 2014, 124–127.
[5] LIU, G. P., YANG, J. B., WHIDBORNE, J. F.: Multiobjective Optimisation and Control. Baldock, Hertfordshire, England, 2003.
[6] MILLER, C.E., TUCKER, A.W., ZEMLIN, R.A.: Integer Programming Formulation of Traveling Salesman Problems. Journal of the ACM 7 (4/1960), 326–329.
[7] YANG, J.B., CHEN, CH., ZHANG, Z.J.: The Interactive Step Trade-off Method (ISTM) for Multiobjective Optimization. IEEE Transactions on Systems, Man, and Cybernetics 20 (3/1990), 688–695.
Heuristics for VRP with Private and Common Carriers
Jan Pelikan1

Abstract. The topic of the paper is a heuristic for a modification of the vehicle routing problem. In the presented problem there are two kinds of carriers. The first carrier is a producer owning a fleet of vehicles with given capacities and costs per km. The second carrier is a common carrier whose services are used by the producer at a fixed price per transported unit which does not depend on the distance from the depot. A set of customers with given demands is known. The total costs to be minimized consist of the private carrier costs and the common carrier costs. The main question is what quantity of goods is transported by the private and what by the common carrier. The problem is studied with and without split demand and with a uniform and a heterogeneous fleet of vehicles. The mathematical model is shown and a modified insert heuristic is proposed. The model and the proposed heuristic are tested on instances from a testing dataset; the results of the model obtained in a limited computational time and the results of the insert heuristic with sequential and parallel route creation are compared.
Keywords: vehicle routing problem, insert heuristic, integer programming.
JEL classification: C44
AMS classification: 90C15
1 Introduction
Transportation of goods from producers to consumers represents a substantial part of the costs and is therefore the subject of optimization problems solved by the models and methods of operations research. Typical problems are the traveling salesman problem (TSP) and the vehicle routing problem (VRP), where goods are transported from a depot to customers through a communications network (roads, railways etc.). There are many modifications of these problems, for example the VRP with time windows, with a heterogeneous fleet of vehicles, with multiple depots, with split delivery of demands, the pickup and delivery problem, the VRP with stochastic demand etc.
The paper deals with a modification of the VRP which involves two kinds of carriers, a private and a common carrier. Goods can be transported from the depot to the customers either by the private or by the common carrier, or partially by both carriers. Similarly to the classic VRP, the customers' demand is given; it can be split (SDVRPPC) or non-split (VRPPC). The problem aims at transporting the goods from the depot to the customers with minimal costs, which are the sum of the private carrier costs and the common carrier costs. The costs of the private carrier depend on the length of the vehicle routes multiplied by the costs per km of the vehicles. Unlike the private carrier, the costs of the common carrier depend only on the quantity of transported goods multiplied by the price of transporting a unit quantity; they do not depend on the transported distance. There is a fleet of vehicles of the private carrier located in the depot; the capacity of the vehicles is limited, while there is no limit on the capacity of the common carrier. In the SDVRPPC the problem is to decide what part of a customer's demand is delivered by the private and what by the common carrier. In the case of the non-split demand problem VRPPC, the question is only whether the customer's request is delivered by the private or by the common carrier. A further problem is to create optimal routes for the private carrier's vehicles under the condition that each vehicle's route must carry a quantity of goods that does not exceed the capacity of the vehicle. The problem is NP-hard because the classical vehicle routing problem can be reduced to it.
1 University of Economics, W. Churchill sq. 4, Prague, pelikan@vse.cz
A tabu search heuristic is published in [2] for the problem with non-split demand. The heuristic proposed in [3] selects customers according to the distance from the depot: customers which are close to the depot are assigned to the private carrier, the others to the common carrier. A modification of the Clarke and Wright savings algorithm is used in [1] to solve the so-called truckload and less-than-truckload problem, where the private carrier uses less-than-truckload vehicles while an outside carrier uses trucks.
2 Mathematical model of the problem
The mathematical models of the vehicle routing problem with private and common carriers with split demand (SDVRPPC) and without split demand (VRPPC) are shown in this section. The models contain a huge number of binary variables, so the optimal solution cannot be obtained in a limited computation time for real problems.

2.1 Split delivery model SDVRPPC
Let G = {V, E} be an undirected complete graph, V = {1, 2, …, n}, where the depot is node 1 and nodes 2, 3, …, n are customers. The number of vehicles of the private carrier is m.

Parameters of the model are: d_{ij} the distance between node i and node j; q_i the demand of node i; w_s the capacity of the s-th vehicle, s = 1, 2, …, m; p_s the costs per km of the s-th vehicle; c_c the cost of transporting a unit of goods by the common carrier.

Variables of the model are: x^s_{ij} binary, equal to 1 if vehicle s travels from node i to node j; y^s_i the amount of goods (a part of the demand of node i) transported by the s-th vehicle of the private carrier; z_i the amount of goods (a part of the demand of node i) transported by the common (outside) carrier; u^s_i variables in the anti-cyclic constraints.

∑_{s,i,j} p_s d_{ij} x^s_{ij} + c_c ∑_i z_i → min    (1)
∑_i x^s_{ij} = ∑_i x^s_{ji},    ∀ s, j    (2)
z_i + ∑_s y^s_i = q_i,    ∀ i    (3)
∑_i y^s_i ≤ w_s,    ∀ s    (4)
q_i ∑_j x^s_{ij} ≥ y^s_i,    ∀ s, i    (5)
∑_j x^s_{1j} ≤ 1,    ∀ s    (6)
u^s_i + 1 ≤ u^s_j + n(1 − x^s_{ij}),    ∀ s, i; j > 1    (7)
x^s_{ij} binary,  z_i ≥ 0,  y^s_i ≥ 0,    ∀ s, i, j    (8)
In this formulation the objective function (1) minimizes the sum of the travel costs of the private carrier's vehicles and the common carrier costs. Constraint (2) states that the same vehicle must enter and leave a node. Equation (3) means that the demand of each node will be satisfied. Inequality (4) assures that the capacity of the s-th vehicle is not exceeded. Constraint (5) prevents supplying a node by vehicle s if this vehicle does not enter the node. Constraint (6) states that each vehicle leaves the depot at most once. The anti-cyclic conditions are in (7), and (8) defines the domains of the variables.
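For illustration, the SDVRPPC model (1)–(8) can be written down in a modelling language along the following lines; this is a minimal sketch using the open-source PuLP package with hypothetical data (the paper itself uses CPLEX), not the author's implementation, and the depot is indexed as node 0.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

# Hypothetical instance: node 0 is the depot, nodes 1..3 are customers
d = [[0, 4, 6, 5], [4, 0, 3, 7], [6, 3, 0, 4], [5, 7, 4, 0]]   # distances d_ij
q = [0, 10, 6, 8]                                              # demands q_i
w = [12, 9]                                                    # vehicle capacities w_s
p = [1.0, 1.2]                                                 # cost per km p_s
cc = 2.5                                                       # common-carrier cost per unit
n, S = len(q), range(len(w))
C = range(1, n)                                                # customer nodes

prob = LpProblem("SDVRPPC", LpMinimize)
x = LpVariable.dicts("x", [(s, i, j) for s in S for i in range(n) for j in range(n) if i != j], cat=LpBinary)
y = LpVariable.dicts("y", [(s, i) for s in S for i in C], lowBound=0)
z = LpVariable.dicts("z", list(C), lowBound=0)
u = LpVariable.dicts("u", [(s, i) for s in S for i in range(n)], lowBound=0)

# (1) private travel costs + common-carrier costs
prob += (lpSum(p[s] * d[i][j] * x[(s, i, j)] for s in S for i in range(n) for j in range(n) if i != j)
         + cc * lpSum(z[i] for i in C))
for s in S:
    for j in range(n):                                         # (2) flow conservation
        prob += (lpSum(x[(s, i, j)] for i in range(n) if i != j)
                 == lpSum(x[(s, j, i)] for i in range(n) if i != j))
    prob += lpSum(y[(s, i)] for i in C) <= w[s]                # (4) capacity
    prob += lpSum(x[(s, 0, j)] for j in C) <= 1                # (6) at most one departure
for i in C:
    prob += z[i] + lpSum(y[(s, i)] for s in S) == q[i]         # (3) demand satisfaction
    for s in S:
        prob += q[i] * lpSum(x[(s, j, i)] for j in range(n) if j != i) >= y[(s, i)]   # (5)
for s in S:                                                    # (7) anti-cyclic conditions
    for i in range(n):
        for j in C:
            if i != j:
                prob += u[(s, i)] + 1 <= u[(s, j)] + n * (1 - x[(s, i, j)])

prob.solve()
```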
2.2 Nonsplit delivery model VRPPC
The non-split delivery model contains the binary variable z_i, which is equal to 1 if the common carrier covers the whole demand q_i of node i.

∑_{s,i,j} p_s d_{ij} x^s_{ij} + c_c ∑_i z_i q_i → min    (9)
∑_i x^s_{ij} = ∑_i x^s_{ji},    ∀ s, j    (10)
∑_{j,s} x^s_{ij} = 1 − z_i,    ∀ i    (11)
u^s_i ≤ w_s,    ∀ s, i    (12)
∑_j x^s_{1j} ≤ 1,    ∀ s    (13)
u^s_i + q_j ≤ u^s_j + n(1 − x^s_{ij}),    ∀ s, i; j > 1    (14)
x^s_{ij}, z_i binary,    ∀ s, i, j    (15)
The objective function (9) minimizes the sum of the travel costs of the private carrier's vehicles and the common carrier costs. Constraint (10) states that the same vehicle must enter and leave a node. Equation (11) means that a node not served by the common carrier has to be served by some vehicle of the private carrier. Inequality (12) assures that the capacity of the s-th vehicle is not exceeded. Constraint (13) states that each vehicle leaves the depot at most once. The anti-cyclic conditions, together with the definition of the load u^s_i of vehicle s when entering node i, are in (14), and (15) defines the domains of the variables.
3 Insert heuristic
The following notation is used for the modified insert heuristic: the route of the s-th vehicle is R_s = {v^s_1, v^s_2, …, v^s_{h_s}}, where v^s_1 = v^s_{h_s} = 1, and the delivery to node i by vehicle s is denoted as y^s_i. The symbol ∆(i, j, k) means ∆(i, j, k) = p_s (d_{i,j} + d_{j,k} − d_{i,k}).

Step 1: {choice of vehicle s and its initial route} Vehicle s and node k are selected in order to maximize the value σ(s, k, y^s_k) = c_c y^s_k − p_s (d_{1,k} + d_{k,1}), where y^s_k is the maximum uncovered demand of node k which does not exceed the available capacity of vehicle s, i.e. 0 < y^s_k ≤ q_k and y^s_k ≤ w_s. The value σ(s, k, y^s_k) has to be positive. If no vehicle with an initial route is selected, then stop; otherwise put v^s_1 := 1; v^s_2 := k; v^s_3 := 1; q_k := q_k − y^s_k; w_s := w_s − y^s_k; h_s := 3 and go to Step 2.

Step 2: {insertion} Repeatedly select a node k, a quantity y^s_k and an edge (v^s_j, v^s_{j+1}) so as to maximize the value δ(s, k, j, y^s_k) = c_c y^s_k − p_s (d_{v^s_j,k} + d_{k,v^s_{j+1}} − d_{v^s_j,v^s_{j+1}}), where the value y^s_k cannot exceed the available capacity of vehicle s, i.e. 0 < y^s_k ≤ q_k and y^s_k ≤ w_s. The value δ(s, k, j, y^s_k) has to be positive. If no node k is selected, then go to Step 1; otherwise put q_k := q_k − y^s_k; w_s := w_s − y^s_k; h_s := h_s + 1, delete the edge (v^s_j, v^s_{j+1}) from the route, add the edges (v^s_j, k) and (k, v^s_{j+1}) to the route, and go to Step 2.

Step 3: {common carrier transport} The remaining demand q_k is assigned to transport by the common carrier.
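A compact sketch of Steps 1–3 in code is given below. It is an illustrative re-implementation, not the author's VBA program: the depot is indexed as node 0 and, for brevity, routes are opened vehicle by vehicle instead of searching over all vehicle–node pairs simultaneously.

```python
def insert_heuristic(d, q, w, p, cc):
    """Sketch of the split-delivery insert heuristic.
    d[i][j] = distance matrix (node 0 is the depot), q[i] = demands,
    w[s] = vehicle capacities, p[s] = cost per km, cc = common-carrier unit cost."""
    n = len(q)
    q, w = list(q), list(w)                  # remaining demand / remaining capacity
    routes, deliveries = {}, {}              # s -> route [0,...,0]; (s, k) -> delivered amount

    def delta(s, i, j, k):                   # insertion cost of j between i and k on route s
        return p[s] * (d[i][j] + d[j][k] - d[i][k])

    for s in range(len(w)):
        # Step 1: open the route of vehicle s with the node giving the largest positive saving
        best = max(((cc * min(q[k], w[s]) - p[s] * (d[0][k] + d[k][0]), k)
                    for k in range(1, n) if min(q[k], w[s]) > 0), default=None)
        if best is None or best[0] <= 0:
            continue
        _, k = best
        amount = min(q[k], w[s])
        routes[s] = [0, k, 0]
        deliveries[(s, k)] = amount
        q[k] -= amount
        w[s] -= amount

        # Step 2: repeatedly insert the most profitable node into the route of vehicle s
        while True:
            best_ins = None
            for k in range(1, n):
                amount = min(q[k], w[s])
                if amount <= 0:
                    continue
                for pos in range(len(routes[s]) - 1):
                    i, j = routes[s][pos], routes[s][pos + 1]
                    gain = cc * amount - delta(s, i, k, j)
                    if gain > 0 and (best_ins is None or gain > best_ins[0]):
                        best_ins = (gain, k, pos, amount)
            if best_ins is None:
                break
            _, k, pos, amount = best_ins
            routes[s].insert(pos + 1, k)
            deliveries[(s, k)] = deliveries.get((s, k), 0) + amount
            q[k] -= amount
            w[s] -= amount

    # Step 3: the remaining demand goes to the common carrier
    common = {k: q[k] for k in range(1, n) if q[k] > 0}
    return routes, deliveries, common
```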
4 Main features of the problem and heuristic
Proposition 1. If {R_s; s = 1, 2, …, m} is optimal, then the inequality

p_s (d_{v^s_j,v^s_{j+1}} + d_{v^s_{j+1},v^s_{j+2}} − d_{v^s_j,v^s_{j+2}}) ≤ c_c y^s_{v^s_{j+1}}    (16)

is valid for all s and j = 1, …, h_s − 2.
Proof. By contradiction.
Proposition 2. If inequality (16) holds for some s, then (17) is valid. ps
hX s −1 j=1
s ≤ cc dvjs ,vj+1
X
yis
(17)
i
Left hand side of the inequality (17) represends costs of private carrier to provide transport yjs by s-th vehicle to the customers j = 2, 3, . . . n, right hand side are costs of common carrier by the same amount of goods. Proof. Proof by induction over number of nodes of the route. Proposition 3. The solution obtained from insert heuristic satisfies the inequality (16) and (17). s s Proof. When the node vj+1 was inserted into the edge (vjs , vj+2 ), the inequality s s s ∆(vjs , vj+1 , vj+2 ) ≤ cc yj+1
(18)
s is valid for the quantity yj+1 . We have to prove that the inequality s s ∆(vjs , vj+1 , k) ≤ cc yj+1
(19)
s s is valid after subsequential insertion the node k into the edge (vj+1 , vj+2 ). s s The node vj+1 was inserted into the edge (vjs , vj+2 ) as it is true that s s s s cc yj+1 − ∆(vjs , vj+1 , vj+2 ) ≥ cc yks − ∆(vjs , k, vj+2 )
(20)
s s s for all nodes k and the value yks as well as cc yj+1 − ∆(vjs , vj+1 , vj+2 ) ≥ 0. 0
s s s s After insertion vj+1 into edge (vjs , vj+2 ) the node k is inserted into edge (vj+1 , vj+2 ) with value yks ≤ yks so we have 0 s s cc yks − ∆(vj+1 , k, vj+2 )≥0 (21)
From (20) and (21) we get s s s s s cc yj+1 ≥ cc yj+1 − ∆(vjs , k, vj+2 ) + ∆(vjs , vj+1 , vj+2 )≥ 0 s s s s s s s s s s s s ) + ∆(vjs , vj+1 , vj+2 )= ≥ cc yk − ∆(vj , k, vj+2 ) + ∆(vj , vj+1 , vj+2 ) ≥ ∆(vj+1 , k, vj+2 ) − ∆(vj , k, vj+2 s s s s s s s s s s s s s s = ps {(dvj+1 + d − d ) + (d + d − d ) − (d + d − d ,k k,vj+2 vj+1 ,vj+2 vj ,vj+1 vj+1 ,vj+2 vj ,vj+2 vj ,k k,vj+2 vj ,vj+2 )} = s s = ∆(vj , vj+1 , k)). Remark 1. Non-optimality of the result of insert heuristic exists in two causes: a) vehicles routes are not optimal, they can be improved by using some heuristic, b) it is not optimal amount of goods delivered to the nodes by private and common carrier. It is possible to exchange goods between private and common carrier.
5
Numerical example
The model and insert heuristic was tested on example S6A which is published in http://www.uv.es/ belengue/carp.html, results are shown in the Table 1. The parameters of the instance S6A are: n=31, m=5, cc =70. Insert heuristic is written in VBA language. The model is solved using CPLEX 12.0 on PC (IntelCore2Quad, 2,83GHz). The best value of object function is shown but the result is not optimal (computation was aborted), the gap of object function is placed under the value of total costs in parenthesis (in % of the lower bound). Value NP are costs of private carries, NC are costs of common carrier. Result of the model is better than the result of heuristic in case split delivery problem, for case non-split problem it is opposite.
661
            insert heuristic             model
            NP     NC    NP+NC           NP     NC      NP+NC
non-split   3745   700   4445            960    10780   11740 (90%)
split       3650   630   4280            2880   630     3510 (44%)

Table 1 Results of S6A
Conclusion
An interesting modification of the vehicle routing problem is studied, a mathematical model of the problem is presented and a new heuristic method is proposed. The main features of the heuristic are shown. As solving the mathematical model is very difficult for real problems, the proposed heuristic method gives a good solution in polynomial time. Additionally, the heuristic solution meets the necessary condition for the optimal solution.
Acknowledgements Supported by the grant No. 16-00408S of the Czech Grant Agency.
References [1] Chu, Ch-W.: A heuristic algorithm for the truckload and less-than-truckload problem., European Journal for Operations Research 165 (2005), 657-667. [2] Cote,JF. and Potvin, JY.: A tabu search heuristics for the vehicle routing problem with private fleet and common carrier, European Journal for Operations Research 198,2 (2009), 464-469. [3] Plevny, M.: Vehicle routing problem with a private fleet and common carriers - the variety with a possibility of sharing the satisfaction of demand. In: Mathematical Methods in Economics 2013, 31st international conference proceedings, College of Polytechnics Jihlava, (2013), s.730-736. ISBN 978-80-87035-76-4. [4] www.uv.es/ belengue/carp.html
Fuzzy Decision Analysis Module for Excel
Radomír Perzina1, Jaroslav Ramík2

Abstract. There exists a wide range of software products to support decision making. The main disadvantage of those software products is that they are commercial and relatively expensive, which prevents them from being used by small companies, students or researchers. This paper introduces a new Microsoft Excel add-in FuzzyDAME which is completely free and was developed to support users in multicriteria decision making situations with uncertain data. It can also be used by students to help them understand the basic principles of multicriteria decision making, because it does not behave as a black box but displays the results of all intermediate calculations. The proposed software package is demonstrated on an illustrative example of a real-life decision problem – selecting the best advertising media for a company.
Keywords: analytic hierarchy process, analytic network process, feedback, fuzzy, multi-criteria decision making, pair-wise comparisons.
JEL Classification: C44
AMS Classification: 90C15
1 Introduction
Decision making in situations with multiple variants is an important area of research in decision theory and has been widely studied e.g. in [4], [7], [8], [9], [10], [11], [12]. There exists a wide range of computer programs that are able to help decision makers make good decisions, e.g. Expert Choice (http://www.expertchoice.com), Decisions Lens (http://www.decisionlens.com), Mind Decider (http://www.minddecider.com), MakeItRational (http://makeitrational.com) or Super Decisions (http://www.superdecisions.com). The main disadvantage of those programs is that they are commercial and relatively expensive, which prevents them from being used by small companies or individual entrepreneurs. Here we introduce a new Microsoft Excel add-in named FuzzyDAME – Fuzzy Decision Analysis Module for Excel – which is a successor of another add-in developed by the authors called DAME that was used just for crisp evaluations [9]. The main advantage of the new add-in FuzzyDAME is the possibility to work with both crisp and fuzzy evaluations. Compared to other software products for solving multicriteria decision problems, FuzzyDAME is free, able to work with scenarios or multiple decision makers, allows for easy manipulation with data and utilizes the capabilities of the widespread spreadsheet Microsoft Excel. Users can structure their decision models into three levels – scenarios/users, criteria and variants. Standard pair-wise comparisons are used for evaluating both criteria and variants. For each pair-wise comparison matrix an inconsistency index is calculated. Three different methods are provided for the evaluation of the weights of the criteria, the variants as well as the scenarios/users – Saaty's Method [11], the Geometric Mean Method [1] and Fuller's Triangle Method [4]. Multiplicative and additive syntheses are supported. For ranking fuzzy weights three methods are supported – the center of gravity method, the optimistic alpha-cut and the pessimistic alpha-cut method [3]. Using FuzzyDAME, the pairwise comparisons method with fuzzy evaluations can also be applied to the multicriteria decision making problem of ranking given alternatives, see [10].
2 Software Description
FuzzyDAME works with all current versions of Microsoft Excel from version 97 up to version 2016. The add-in can be downloaded at http://www.opf.slu.cz/kmme/FuzzyDAME. It consists of three individual files: • FuzzyDAME.xla – main module with user interface, it is written in VBA (Visual Basic for Applications), • FuzzyDAME.xll – it contains special user defined functions used by the application, it is written in C# and linked with Excel-DNA library (http://exceldna.codeplex.com). 1
1 Silesian University in Opava, School of Business Administration in Karviná, Department of Informatics and Mathematics, University Square 1934/3, Karviná, Czech Republic, perzina@opf.slu.cz.
2 Silesian University in Opava, School of Business Administration in Karviná, Department of Informatics and Mathematics, University Square 1934/3, Karviná, Czech Republic, ramik@opf.slu.cz.
• FuzzyDAME64.xll – the same as FuzzyDAME.xll, just for the 64-bit version of Microsoft Excel. The correct .xll version is automatically loaded based on the installed Microsoft Excel version.
All files should be placed in the same folder and macros must be enabled before running the module (see the Microsoft Excel documentation for details). FuzzyDAME itself can be executed by double clicking on the file FuzzyDAME.xla. After executing the add-in, a new menu item "FuzzyDAME" appears in the Add-ins ribbon (in older Excel versions the menu item "FuzzyDAME" appears in the top level menu). A new decision problem can be generated by clicking on the "New problem" item in the main FuzzyDAME menu. A form with the main decision problem characteristics is then shown, see Fig. 2. In the top panel there are basic settings: the number of scenarios, criteria and variants. In case a user does not want to use scenarios, or there is just a single decision maker, the number of scenarios/users should be set to one. In the second panel "Other settings" we can choose the multiplicative or additive synthesis model. Here we also set how scenarios/users and criteria are compared – either by a pairwise comparison matrix or by setting the weights directly. In the last panel the user chooses how to evaluate variants according to the individual criteria. There are three options: Pairwise – each pair of variants is compared individually, Values max – indicates a maximization criterion where each variant is evaluated by a single value, e.g. revenues, and Values min – indicates a minimization criterion where each variant is evaluated by a single value, e.g. costs. Then, after confirmation, a new Excel sheet with forms is created, where the user can update the names of all elements and evaluate criteria and variants using pairwise comparison matrices, as shown in the example in Fig. 1.
Figure 1 Pairwise comparison matrix example
In the pairwise comparison matrix users enter values only in the upper triangle. The values in the lower triangle are reciprocal and recalculated automatically. If the criterion (variant) in the row is more important than the criterion (variant) in the column, the user enters values from 2 to 9 (the higher the value, the more important is the criterion in the row). If the criterion (variant) in the row is less important than the criterion (variant) in the column, the user enters values from 1/2 to 1/9 (the lower the value, the less important is the criterion in the row). If the criterion (variant) in the row is equally important to the criterion (variant) in the column, the user enters the value 1 or leaves it empty. In the top right corner the inconsistency index, which should be less than 0.1, is calculated. If this index is greater than 0.1, we should reconsider our pairwise comparisons so that they are more consistent. In the very right column the weights of the individual criteria (variants) based on the values in the pairwise comparison matrix are calculated and the evaluation method is selected. The weights are automatically calculated using eq. (1).
3 Mathematical Background
Here we briefly describe mathematical background that is used for calculations in FuzzyDAME.
3.1 Crisp Calculations
In the case of crisp evaluations, i.e. when we are able to compare all pairs of criteria/variants exactly, we use the geometric mean method for calculating the criteria/variants weights $w_k$ from the pairwise comparison matrices, see e.g. [10], as follows:
$$w_k = \frac{\left(\prod_{j=1}^{n} a_{kj}\right)^{1/n}}{\sum_{i=1}^{n}\left(\prod_{j=1}^{n} a_{ij}\right)^{1/n}}, \quad k = 1,2,\ldots,n, \qquad (1)$$
where $w_k$ is the weight of the $k$-th criterion (variant), $a_{ij}$ are the values in the pairwise comparison matrix, and $n$ is the number of criteria (variants). The inconsistency index is calculated using the formula [1]
$$GCI = \frac{2}{(n-1)(n-2)}\sum_{i<j}\log^2\!\left(a_{ij}\cdot\frac{w_j}{w_i}\right). \qquad (2)$$
When we enter some values into the individual pairwise comparison matrices, all weights are recalculated, so that we obtain the immediate impact of each individual entry. The matrix of normalized values as well as the graph with the total evaluation of variants is then shown at the bottom of the sheet. The resulting vector of weights of the variants $Z$ is given by the formula
$$Z = W_{32}W_{21}, \qquad (3)$$
where $W_{21}$ is the $n\times 1$ matrix (weighing vector of the criteria), i.e.
$$W_{21} = \begin{pmatrix} w(C_1) \\ \vdots \\ w(C_n) \end{pmatrix}, \qquad (4)$$
and $W_{32}$ is the $m\times n$ matrix
$$W_{32} = \begin{pmatrix} w(C_1,V_1) & \cdots & w(C_n,V_1) \\ \vdots & & \vdots \\ w(C_1,V_m) & \cdots & w(C_n,V_m) \end{pmatrix}, \qquad (5)$$
where $w(C_i)$ is the weight of criterion $C_i$ and $w(C_i,V_r)$ is the weight of variant $V_r$ subject to the criterion $C_i$.
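The crisp calculations (1)–(3) are easy to reproduce outside the add-in. The following Python sketch is ours and is not part of FuzzyDAME (which is implemented in VBA and C#); the comparison matrix and the variant weights are purely illustrative:

```python
import numpy as np

def geometric_mean_weights(A):
    """Weights from a pairwise comparison matrix, eq. (1)."""
    A = np.asarray(A, dtype=float)
    g = A.prod(axis=1) ** (1.0 / A.shape[0])   # row geometric means
    return g / g.sum()

def gci(A, w):
    """Geometric consistency index, eq. (2); values below 0.1 are acceptable."""
    n = len(w)
    s = sum(np.log(A[i][j] * w[j] / w[i]) ** 2
            for i in range(n) for j in range(i + 1, n))
    return 2.0 * s / ((n - 1) * (n - 2))

# Illustrative data: three criteria, upper triangle entered, lower triangle reciprocal
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w21 = geometric_mean_weights(A)            # weighing vector of the criteria, eq. (4)
print(w21, gci(A, w21))

# W32: weights of m = 3 variants under each of the n = 3 criteria (illustrative values)
W32 = np.array([[0.5, 0.3, 0.2],
                [0.3, 0.4, 0.5],
                [0.2, 0.3, 0.3]])
print(W32 @ w21)                           # total evaluation of the variants, eq. (3)
```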
3.2 Fuzzy Calculations
In practice it is sometimes more convenient for the decision maker to express his/her evaluation in words of natural language, saying e.g. "possibly 3", "approximately 4" or "about 5", see e.g. [11]. Similarly, he/she could use evaluations such as "A is possibly weakly preferable to B", etc. It is advantageous to express these evaluations by fuzzy sets of real numbers, e.g. triangular fuzzy numbers. A triangular fuzzy number $a$ is defined by a triple of real numbers, i.e. $a = (a^L; a^M; a^U)$, where $a^L$ is the Lower number, $a^M$ is the Middle number and $a^U$ is the Upper number, $a^L \le a^M \le a^U$. If $a^L = a^M = a^U$, then $a$ is a crisp (non-fuzzy) number. In order to distinguish fuzzy and non-fuzzy numbers we shall denote fuzzy numbers, vectors and matrices by the tilde, e.g. $\tilde{a} = (a^L; a^M; a^U)$. It is known that the arithmetic operations $+, -, *$ and $/$ can be extended to fuzzy numbers by the Extension principle, see e.g. [3]. If all elements of an $m\times n$ matrix $A$ are triangular fuzzy numbers, then we call $A$ a triangular fuzzy matrix; this matrix is composed of triples of real numbers. Particularly, if $A$ is a pair-wise comparison matrix, we assume that it is reciprocal and there are ones on the diagonal. In order to find the "optimal" variant we have to calculate the triangular fuzzy weights as evaluations of the relative importance of the criteria and evaluations of the variants according to the individual criteria. We assume that there exist vectors of triangular fuzzy weights $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_n$, $\tilde{w}_i = (w_i^L; w_i^M; w_i^U)$, $i = 1,2,\ldots,n$, which can be calculated by the following formula (see [7]):
$$\tilde{w}_k = (w_k^L; w_k^M; w_k^U), \quad k = 1,2,\ldots,n, \qquad (6)$$
where
$$w_k^S = \frac{\left(\prod_{j=1}^{n} a_{kj}^S\right)^{1/n}}{\sum_{i=1}^{n}\left(\prod_{j=1}^{n} a_{ij}^M\right)^{1/n}}, \quad S \in \{L, M, U\}. \qquad (7)$$
In [3], the method of calculating triangular fuzzy weights by (7) from the triangular fuzzy pair-wise comparison matrix is called the logarithmic least squares method. This method can be used both for calculating the triangular fuzzy weights as relative importances of the individual criteria and also for eliciting relative triangular fuzzy values of the criteria of the individual variants out of the pair-wise comparison matrices. Now we calculate the synthesis – the aggregated triangular fuzzy values of the individual variants – by formula (8), applied to triangular fuzzy matrices
$$\tilde{Z} = \tilde{W}_{32}\tilde{W}_{21}. \qquad (8)$$
Here, for addition, subtraction and multiplication of triangular fuzzy numbers we use the fuzzy arithmetic operations mentioned earlier, see e.g. [7]. The simplest method for ranking the set of triangular fuzzy numbers in (8) is the center of gravity method. This method is based on computing the $x$-coordinates of the centers of gravity of the triangles given by the corresponding membership functions of $\tilde{z}_i$, $i = 1,2,\ldots,n$. Evidently, it holds
$$z_i^g = \frac{z_i^L + z_i^M + z_i^U}{3}. \qquad (9)$$
By (9) the variants can be ordered from the best to the worst. There exist more sophisticated methods for ranking fuzzy numbers; for a comprehensive review of comparison methods see [3]. In FuzzyDAME we apply two more methods which are based on α-cuts of the fuzzy variants being ranked. Particularly, let $\tilde{z}$ be a fuzzy alternative, i.e. a fuzzy number, and $\alpha \in (0,1]$ be a preselected aspiration level. Then the α-cut of $\tilde{z}$, $[\tilde{z}]_\alpha$, is the set of all elements $x$ with membership value greater than or equal to $\alpha$, i.e. $[\tilde{z}]_\alpha = \{x \mid \mu_{\tilde{z}}(x) \ge \alpha\}$.
Let $\tilde{z}_1$ and $\tilde{z}_2$ be two fuzzy variants, $\alpha \in (0,1]$. We say that $\tilde{z}_1$ is R-dominated by $\tilde{z}_2$ at the level $\alpha$ if $\sup[\tilde{z}_1]_\alpha \le \sup[\tilde{z}_2]_\alpha$. Alternatively, we also say that $\tilde{z}_2$ R-dominates $\tilde{z}_1$ at the level $\alpha$.
If $\tilde{z} = (z^L, z^M, z^U)$ is a triangular fuzzy number, then
$$\sup[\tilde{z}_1]_\alpha = z_1^U - \alpha(z_1^U - z_1^M), \quad \sup[\tilde{z}_2]_\alpha = z_2^U - \alpha(z_2^U - z_2^M),$$
as can be easily verified.
We say that $\tilde{z}_1$ is L-dominated by $\tilde{z}_2$ at the level $\alpha$ if $\inf[\tilde{z}_1]_\alpha \le \inf[\tilde{z}_2]_\alpha$. Alternatively, we also say that $\tilde{z}_2$ L-dominates $\tilde{z}_1$ at the level $\alpha$.
Again, if $\tilde{z} = (z^L, z^M, z^U)$, then
$$\inf[\tilde{z}_1]_\alpha = z_1^L + \alpha(z_1^M - z_1^L), \quad \inf[\tilde{z}_2]_\alpha = z_2^L + \alpha(z_2^M - z_2^L).$$
Applying R- and/or L-domination we can easily rank all the variants (at the level $\alpha$).
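The fuzzy counterparts of these formulas are equally compact. The sketch below is ours, with illustrative numbers rather than FuzzyDAME output; it computes triangular fuzzy weights by (6)–(7) and compares two fuzzy evaluations by the center of gravity (9) and by R-/L-domination at a chosen level α:

```python
import numpy as np

def fuzzy_weights(AL, AM, AU):
    """Triangular fuzzy weights (w_L, w_M, w_U), eq. (6)-(7); denominator uses middle values."""
    denom = (np.prod(AM, axis=1) ** (1.0 / AM.shape[0])).sum()
    return [np.prod(S, axis=1) ** (1.0 / S.shape[0]) / denom for S in (AL, AM, AU)]

def cog(z):
    """Center of gravity of a triangular fuzzy number z = (zL, zM, zU), eq. (9)."""
    return sum(z) / 3.0

def r_dominates(z1, z2, alpha):
    """Optimistic comparison: sup of the alpha-cuts, larger is better."""
    sup1 = z1[2] - alpha * (z1[2] - z1[1])
    sup2 = z2[2] - alpha * (z2[2] - z2[1])
    return sup1 >= sup2

def l_dominates(z1, z2, alpha):
    """Pessimistic comparison: inf of the alpha-cuts, larger is better."""
    inf1 = z1[0] + alpha * (z1[1] - z1[0])
    inf2 = z2[0] + alpha * (z2[1] - z2[0])
    return inf1 >= inf2

# Two illustrative total evaluations of variants as triangular fuzzy numbers
tv, radio = (0.30, 0.34, 0.40), (0.18, 0.22, 0.27)
print(cog(tv) > cog(radio), r_dominates(tv, radio, 0.8), l_dominates(tv, radio, 0.8))
```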
4 Case Study
Here we demonstrate the proposed add-in FuzzyDAME on a decision-making situation: selecting the "best" advertising media for a company. Advertising is an important part of the company's marketing strategy. Advertising objectives may be to inform, remind, convince, create prestige, or to increase sales and profits. Different media have varying capacity to meet these objectives. Here we assume a company whose objective is to increase its profit. The goal of this realistic decision situation is to find the best variant from 4 pre-selected ones (Internet, TV, Radio and Press) according to 3 criteria: Overall Impression (pairwise), Look&Feel (pairwise) and Costs (pairwise). We assume that the decision maker is not able to compare each pair of criteria/variants precisely, so it becomes more natural to use fuzzy evaluations, e.g. by triangular fuzzy numbers, see [10]. The setting of the parameters can be seen in Fig. 2.
Figure 2 Case study – New problem form
After confirmation by OK a new Excel sheet with forms is created, where the user can fill in all required values. First we fill in the importance of the criteria, which is given by the fuzzy pair-wise comparisons, see Fig. 3. Note that we use triangular fuzzy numbers, i.e. each pair-wise comparison input is expressed by 3 numbers.
Figure 3 Case study – criteria fuzzy comparison
Then the corresponding triangular fuzzy weights are calculated, i.e. the relative fuzzy importances of the individual criteria are shown, see Fig. 4.
Figure 4 Case study – criteria fuzzy weights
The next step is to include fuzzy evaluations of the variants according to the individual criteria by three pair-wise comparison matrices, see Fig. 5.
Figure 5 Case study – evaluation of variants
Then the corresponding fuzzy matrix W32 of fuzzy weights is calculated, see Fig. 6.
Figure 6 Case study – fuzzy weights of variants
Now the final aggregation of results is automatically calculated and we can see the total evaluation of variants as triangular fuzzy numbers, the center of gravity (COG) of each variant and the corresponding Rank in Fig. 7, as well as the graphical representation in Fig. 8.
Figure 7 Case study – total evaluation of variants
Figure 8 Case study – total evaluation of variants – graph
As we can see, the best variant is TV with center of gravity 0.343, followed by Internet with center of gravity 0.315; the third one is Press with center of gravity 0.309 and the last one is Radio with center of gravity 0.222.
Apart from the center of gravity method, the add-in also supports two other variant ranking methods, R-domination (optimistic) and L-domination (pessimistic), as described in Section 3.2. These methods require an α level, which can be set in the Excel sheet. For the experiments we used the α level 0.8, see Fig. 9. The final rank for those two methods can be seen in Fig. 10 and Fig. 11.
Figure 9 Case study – ranking method selection
Figure 10 Case study – total evaluation of variants – optimistic method
Figure 11 Case study – total evaluation of variants – pessimistic method
5 Conclusions
In this paper we have proposed a new Microsoft Excel add-in FuzzyDAME for solving decision-making problems with certain and uncertain data. Compared to other decision support software products, FuzzyDAME is free, able to work with scenarios or multiple decision makers, can process crisp or fuzzy evaluations, allows for easy manipulation with data and utilizes the capabilities of the widespread spreadsheet Microsoft Excel. On a realistic case study we have demonstrated its functionality in the individual steps. We have also shown that introducing fuzzy comparisons can change the rank of the alternatives. This add-in is used by students in the course Decision Analysis for Managers at the School of Business Administration in Karvina, Silesian University in Opava. It can also be recommended to other students, researchers or small companies.
Acknowledgements This research was supported by the grant project of GACR No. 14-02424S.
References
[1] Aguaron, J. and Moreno-Jimenez, J.M.: The geometric consistency index: Approximated thresholds. European Journal of Operational Research 147 (2003), 137–145.
[2] Buckley, J.J.: Fuzzy hierarchical analysis. Fuzzy Sets and Systems 17 (1985), 233–247.
[3] Chen, S.J., Hwang, C.L. and Hwang, F.P.: Fuzzy multiple attribute decision making. Lecture Notes in Economics and Math. Syst., Vol. 375, Springer-Verlag, Berlin – Heidelberg, 1992.
[4] Fishburn, P. C.: A comparative analysis of group decision methods. Behavioral Science 16 (1971), 538–544.
[5] Horn, R. A. and Johnson, C. R.: Matrix Analysis. Cambridge University Press, 1990.
[6] Gass, S.I. and Rapcsák, T.: Singular value decomposition in AHP. European Journal of Operational Research 154 (2004), 573–584.
[7] Ramik, J. and Perzina, R.: Fuzzy ANP – a New Method and Case Study. In: Proceedings of the 24th International Conference Mathematical Methods in Economics 2006. University of Western Bohemia, 2006.
[8] Ramik, J. and Perzina, R.: Microsoft Excel Add-In for Solving Multicriteria Decision Problems in Fuzzy Environment. In: Proceedings of the 26th International Conference Mathematical Methods in Economics 2008. Technical University of Liberec, 2008.
[9] Ramik, J. and Perzina, R.: Solving Multicriteria Decision Making Problems using Microsoft Excel. In: Proceedings of the 32nd International Conference Mathematical Methods in Economics 2014. Palacky University Olomouc, 2014.
[10] Ramik, J.: Strong Consistency in Pairwise Comparisons Matrix with Fuzzy Elements on Alo-Group. In: Proc. WCCI 2016 Congress, FUZZ-IEEE. Vancouver, 2016, to appear.
[11] Saaty, T.L.: Multicriteria decision making – the Analytical Hierarchy Process. Vol. I., RWS Publications, Pittsburgh, 1991.
[12] Saaty, T.L.: Decision Making with Dependence and Feedback – The Analytic Network Process. RWS Publications, Pittsburgh, 2001.
[13] Van Laarhoven, P.J.M. and Pedrycz, W.: A fuzzy extension of Saaty's priority theory. Fuzzy Sets and Systems 11 (1983), 229–241.
The treatment of uncertainty in uniform workload distribution problems
Štefan Peško1, Roman Hajtmanek2
Abstract. We deal with solving an NP-hard problem which occurs in the scheduling of workload distribution with uncertainty. Let us have m vehicles working according to an n-day schedule. Every day a vehicle performs some work, so for each day we have m working jobs to be done. Let us have a matrix with uncertain elements giving the profit gained by the work of the i-th vehicle on the j-th day. The total profit produced by the i-th vehicle is the sum of the elements in row i of the matrix. If there is a big difference between the profits gained, the drivers of the vehicles will have very different wages and workloads. This can be improved by suitable permutations of elements in the individual columns of the matrix so that the row sums are well balanced. It means that we need to split the total workload into a given number of approximately equal parts. The level of regularity is expressed by an irregularity measure; the lower the measure, the higher the regularity. The goal is to minimize the total irregularity of the parts of the schedule with uncertainty. We study a mixed integer quadratic programming (MIQP) formulation of the modified problem. Computational experiments with a heuristic based on repeated solution of the MIQP problem are presented.
Keywords: schedule with uncertainty, uniform split, irregularity measure, matrix permutation problem
JEL classification: C02, C61, C65, C68, J33
AMS classification: 90C10, 90C15
1 Introduction
In this paper we investigate the solution of an NP-hard problem which occurs in the uniform scheduling of workload distribution with uncertainty. We need to split the total workload into a given number of approximately equal parts. The level of uniformity is expressed by an irregularity measure; the lower the measure, the higher the regularity. The goal is to minimize the irregularity measure of the parts of the concrete schedule. Our model is motivated by the following practical job scheduling problem: Let us have m vehicles working according to an n-day (periodic) schedule. Every day a vehicle performs some work, so for each day we have m working jobs to be done. Let us have the m × n matrix A with elements aij giving the profit gained by the work of the i-th vehicle on the j-th day. The total profit produced by the i-th vehicle is the sum of the elements in row i of the matrix A. If there is a big difference between the profits gained, the drivers of the vehicles will have very different wages and workloads. This can be improved by suitable permutations of elements in the individual columns of the matrix A so that the row sums are well balanced.
The first formulation of this problem was mentioned in the 1984 article [2] by Coffman and Yannakakis. Further investigation of the Matrix Permutation Problem (MPP) can be found in Černý [3] (in Slovak) and Tegze and Vlach [13] (in English), giving the optimality conditions for the case of a two-column matrix. Approaches based on graph theory for solving the graph version of the MPP are studied by Czimmermann [4, 5]. In the Dell'Olmo et al. [7] 2005 review article, 32 types of uniform k-partition problems are defined and examined from the viewpoint of computational complexity, classified by the particular measure of set uniformity to be optimized. Most of the studied problems can be solved by linear time algorithms, some
1 University of Žilina, Univerzitná 8215/1, 010 026 Žilina, Slovakia, stefan.pesko@fri.uniza.sk
2 University of Žilina, Univerzitná 8215/1, 010 026 Žilina, Slovakia, roman.hajtmanek@fri.uniza.sk
require more complex algorithms but can still be solved in polynomial time, and three types are proved to be NP-hard. The paper by Peško and Černý [9] presents different real situations where managerial decision making can be influenced by the methodology of fair assignment. The MPP with an interval matrix presented in Peško [10] is motivated by the practical need for regularity in job scheduling when inputs are given by real intervals. It is shown that this NP-hard problem is solvable in polynomial time for a two-column interval matrix, and a stochastic algorithm for the general case of the problem, based on aggregated two-column sub-problems, is presented. The possibility of using this algorithm as an alternative to solving the set partitioning model for a fair scheduling of workload distribution can be found in the paper by Peško and Hajtmanek [11]. Presently, the investigation of the MPP continues in the working paper [1], which states the column permutation algorithm in a special case and also in the general form used in our article. Computational experiments and comparisons of different approaches to the stochastic algorithm based on repeatedly solving aggregated two-column subproblems are given in the paper by Peško and Kaukič [12]. A review of the problems from the years 1984–2015 can be found in the paper by Czimmermann, Peško and Černý [6]. We want to complete this in the present article by cases of uncertain workloads which are represented by a vector of some characteristics (mean, dispersion, ranges, etc.) of the profits.
2 Mathematical formulation
Throughout this paper we will denote by A = (aijk ) a nonnegative real matrix with m rows and n columns and p depths. Also we will denote by I = {1, 2, . . . , m}, J = {1, 2, . . . , n} and K = {1, 2, . . . , p} the sets of row, column and depth indices. For each column j ∈ J of matrix A we will use the notation πj for the permutation of rows in that column. Now, let us denote by π = (π1 , π2 , . . . , πn ) the vector of permutations of all columns of A and by Aπ we will denote the permuted matrix itself. Thus the element in row i, column j and depth k of matrix Aπ is aπj (i)jk for i ∈ I, j ∈ J, k ∈ K. We will use the notation Sπ = (sπik ) for the matrix with m rows and p columns of row-sums of permuted matrix Aπ , so
$$S^\pi = \begin{pmatrix} s^\pi_{11} & s^\pi_{12} & \cdots & s^\pi_{1p} \\ s^\pi_{21} & s^\pi_{22} & \cdots & s^\pi_{2p} \\ \vdots & \vdots & & \vdots \\ s^\pi_{m1} & s^\pi_{m2} & \cdots & s^\pi_{mp} \end{pmatrix}, \quad \text{where } s^\pi_{ik} = \sum_{j\in J} a_{\pi_j(i)jk} \text{ for } i \in I,\ k \in K.$$
Let us have some irregularity measure $f$ (see [13] for $p = 1$), i.e. a nonnegative real function defined on the set of nonnegative matrices in $\mathbb{R}^{m\times p}$. The following optimization problem will be called the uniform workload distribution problem (UWDP):
$$f(S^{\pi^*}) = \min\left\{\sum_{k\in K} f(s^\pi_k) : \pi \in \Pi_{mn}\right\}, \qquad (1)$$
where $\Pi_{mn}$ is the set of all $n$-tuples of permutations of the row indices $I$ and $s^\pi_k$ is the $k$-th column of the matrix $S^\pi$. In this paper we will consider a basic irregularity measure defined on $m$-dimensional vectors $x = (x_1, x_2, \ldots, x_m)$:
$$f_{var}(x_1, x_2, \ldots, x_m) = \sum_{i\in I}(x_i - \bar{x})^2; \qquad \bar{x} = \frac{1}{m}\sum_{i\in I} x_i. \qquad (2)$$
Note that for constant vector x∗ with elements equal to mean value x of row sum we have fdif (x∗ ) = fvar (x∗ ) = 0.
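The measure (2) and the UWDP objective (1) are straightforward to evaluate for a fixed arrangement of the matrix. A small Python sketch of ours, with randomly generated illustrative data rather than the instances used later in the paper:

```python
import numpy as np

def f_var(s):
    """Irregularity measure (2): sum of squared deviations from the mean."""
    s = np.asarray(s, dtype=float)
    return ((s - s.mean()) ** 2).sum()

def total_irregularity(A):
    """Objective of the UWDP (1) for a given arrangement of the m x n x p matrix A."""
    S = A.sum(axis=1)                              # row sums per depth: shape (m, p)
    return sum(f_var(S[:, k]) for k in range(S.shape[1]))

A = np.random.randint(0, 20, size=(5, 7, 2))       # m=5 vehicles, n=7 days, p=2 depths
print(total_irregularity(A))
```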
3 Example
The main idea of our heuristic approach will be explained in the following example, where uncertain workloads are represented by an interval matrix $A = (a_{ij1}, a_{ij2})$, $a_{ij1} \le a_{ij2}$, $i \in I$, $j \in J$. Let us take the matrix $A = (A_1, A_2)$ with row-sums vector $S$ and $f_{var}(S) = 650$:
$$A_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 2 & 1 \\ 2 & 5 & 5 \\ 2 & 5 & 6 \\ 4 & 7 & 10 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 2 & 1 & 1 \\ 6 & 4 & 5 \\ 4 & 13 & 8 \\ 6 & 5 & 11 \\ 5 & 8 & 16 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 4 \\ 3 & 15 \\ 12 & 25 \\ 13 & 22 \\ 21 & 29 \end{pmatrix}.$$
After replacing some adjacent elements we get a solution $A'$ with row sums
$$S' = \begin{pmatrix} 12 & 18 \\ 8 & 18 \\ 8 & 21 \\ 12 & 19 \\ 10 & 19 \end{pmatrix} \quad \text{and} \quad f_{var}(S') = 22.$$
The next replacement gives us a better solution $A'' = (A''_1, A''_2)$ with $f_{var}(S'') = 8.0$:
$$A''_1 = \begin{pmatrix} 4 & 5 & 1 \\ 2 & 7 & 1 \\ 0 & 5 & 5 \\ 2 & 2 & 6 \\ 0 & 0 & 10 \end{pmatrix}, \quad A''_2 = \begin{pmatrix} 5 & 13 & 1 \\ 4 & 8 & 5 \\ 6 & 5 & 8 \\ 6 & 4 & 11 \\ 2 & 1 & 16 \end{pmatrix}, \quad S'' = \begin{pmatrix} 10 & 19 \\ 10 & 17 \\ 10 & 19 \\ 10 & 21 \\ 10 & 19 \end{pmatrix}.$$
The solution $A''$ cannot be improved by changing independent adjacent elements. But there is an optimum solution $A^* = (A^*_1, A^*_2)$ with $f_{var}(S^*) = 0$:
$$A^*_1 = \begin{pmatrix} 4 & 5 & 1 \\ 2 & 2 & 6 \\ 0 & 0 & 10 \\ 2 & 7 & 1 \\ 0 & 5 & 5 \end{pmatrix}, \quad A^*_2 = \begin{pmatrix} 5 & 13 & 1 \\ 4 & 4 & 11 \\ 2 & 1 & 16 \\ 6 & 8 & 5 \\ 6 & 5 & 8 \end{pmatrix}, \quad S^* = \begin{pmatrix} 10 & 19 \\ 10 & 19 \\ 10 & 19 \\ 10 & 19 \\ 10 & 19 \end{pmatrix},$$
and so the last solution $A''$ is only an approximation. For finding a matrix minimizing the irregularity objective function $f_{var}$ via independently changed elements we use the following mixed integer quadratic programming (MIQP) model.
4 MIQP model
Let us assume that a nonnegative real matrix $A = (a_{ijk})$, $i \in I$, $j \in J$, $k \in K$ is given. In this paragraph we will use the notation $i^+$ and $i^-$ for $i \in I$ in the following meaning:
$$i^+ = \begin{cases} i + 1, & \text{if } 1 \le i < m,\\ 1, & \text{if } i = m,\end{cases} \qquad i^- = \begin{cases} i - 1, & \text{if } 1 < i \le m,\\ m, & \text{if } i = 1.\end{cases} \qquad (3)$$
Let us define bivalent variables $x_{ij}$ for $i \in I$, $j \in J$, with $x_{ij} = 1$ if the elements $a_{ijk}$ and $a_{i^+jk}$ will be exchanged and $x_{ij} = 0$ otherwise. Nonnegative variables $z_{ik}$ for $i \in I$, $k \in K$ give the $i$-th
row-sum in the $k$-th depth of the matrix $A^\pi$. Note that the column permutations $\pi_j = (\pi_{1j}, \pi_{2j}, \ldots, \pi_{mj})$ are then defined by
$$\pi_{ij} = \begin{cases} i^+, & \text{if } x_{ij} = 1,\\ i^-, & \text{if } x_{i^-j} = 1,\\ i, & \text{otherwise.}\end{cases}$$
We denote the mean row-sum of the $k$-th depth of the matrix $A$ by
$$\bar{s}_k = \frac{1}{m}\sum_{i\in I}\sum_{j\in J} a_{ijk}.$$
Now we can formulate the modified mixed integer quadratic programming problem (MUWPD):
$$\sum_{i\in I}\sum_{k\in K}(z_{ik} - \bar{s}_k)^2 \to \min, \qquad (4)$$
$$z_{ik} = \sum_{j\in J}\left[a_{i^+jk}\cdot x_{ij} + a_{i^-jk}\cdot x_{i^-j} + a_{ijk}\cdot(1 - x_{i^-j} - x_{ij})\right], \quad \forall i \in I, \forall k \in K, \qquad (5)$$
$$x_{ij} + x_{i^+j} \le 1, \quad \forall i \in I, \forall j \in J, \qquad (6)$$
$$x_{ij} \in \{0, 1\}, \quad \forall i \in I, \forall j \in J, \qquad (7)$$
$$z_{ik} \ge 0, \quad \forall i \in I, \forall k \in K. \qquad (8)$$
The objective function (4) is the irregularity measure $f_{var}$ of the vector of all row-sums of the permuted matrix $A^\pi$. Constraints (5) define the new row-sums of this matrix. Constraints (6) define the independence of the changed elements. Constraints (7) and (8) are obligatory. Consequently we can formulate a heuristic which is based on iterative improvements gained by solving the MUWPD model.
Step 1. Let $A = (a_{ijk})$, $i \in I$, $j \in J$, $k \in K$ be an arbitrary nonnegative real matrix and let the number of changed elements be $n_o = n$ (a fictitious initial value).
Step 2. Solve the MUWPD model for the matrix $A$ with solution $X = (x_{ij})$ and set $n_o = \sum_{i\in I}\sum_{j\in J} x_{ij}$.
Step 3. If $n_o = 0$ then STOP, $A = (a_{ijk})$ is the approximate solution.
Step 4. Let the matrix $B = (b_{ijk})$, $i \in I$, $j \in J$, $k \in K$, where
$$b_{ijk} = \begin{cases} a_{i^+jk}, & \text{if } x_{ij} = 1,\\ a_{i^-jk}, & \text{if } x_{i^-j} = 1,\\ a_{ijk}, & \text{if } x_{ij} + x_{i^-j} = 0.\end{cases}$$
Set $A = B$ and GOTO Step 2.
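A compact way to prototype this heuristic is the Gurobi Python interface mentioned in the next section. The sketch below is our own illustration of the model (4)–(8) and of Steps 1–4, not the authors' code; the function and variable names are assumptions:

```python
import numpy as np
import gurobipy as gp
from gurobipy import GRB

def solve_muwpd_step(A):
    """One MUWPD step (4)-(8): pick disjoint swaps of cyclically adjacent rows per column."""
    m, n, p = A.shape
    ip = lambda i: (i + 1) % m                 # i+ with wrap-around, cf. (3)
    im = lambda i: (i - 1) % m                 # i-
    sbar = A.sum(axis=(0, 1)) / m              # mean row sum per depth

    mod = gp.Model("MUWPD")
    mod.Params.OutputFlag = 0
    x = mod.addVars(m, n, vtype=GRB.BINARY, name="x")
    z = mod.addVars(m, p, lb=0.0, name="z")
    for i in range(m):
        for k in range(p):
            mod.addConstr(z[i, k] == gp.quicksum(            # constraint (5)
                float(A[ip(i), j, k]) * x[i, j] + float(A[im(i), j, k]) * x[im(i), j]
                + float(A[i, j, k]) * (1 - x[im(i), j] - x[i, j]) for j in range(n)))
        for j in range(n):
            mod.addConstr(x[i, j] + x[ip(i), j] <= 1)        # constraint (6)
    mod.setObjective(gp.quicksum((z[i, k] - float(sbar[k])) * (z[i, k] - float(sbar[k]))
                                 for i in range(m) for k in range(p)), GRB.MINIMIZE)  # (4)
    mod.optimize()
    return np.array([[x[i, j].X > 0.5 for j in range(n)] for i in range(m)])

def apply_swaps(A, X):
    """Step 4: build B from A by exchanging rows i and i+ in every column j with x_ij = 1."""
    m, n, _ = A.shape
    B = A.copy()
    for j in range(n):
        for i in range(m):
            if X[i, j]:
                B[i, j, :] = A[(i + 1) % m, j, :]
                B[(i + 1) % m, j, :] = A[i, j, :]
    return B

def heuristic(A):
    """Steps 1-4: iterate until the MUWPD step proposes no further exchange."""
    A = A.copy()
    while True:
        X = solve_muwpd_step(A)
        if not X.any():                         # Step 3: no changed elements
            return A
        A = apply_swaps(A, X)
```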
5 Computation experiments
Our experiments were conducted on an HP XW6600 Workstation (8-core Xeon 3 GHz, RAM 16 GB) with OS Linux (Debian/jessie). We used Python-based tools and the Python interface to the commercial mathematical programming solver Gurobi [8]. It should be noted that the Gurobi solver is able to use all processor cores, so the solution time will be much smaller compared to computation on a one-core processor. We used an academic license of Gurobi, which is a full version of the solver without any constraints on the size of problem instances or the number of processors (on multicore computers). Let us say that all matrices we tested were nonnegative integer-valued. In real-world applications this can be achieved by quantifying the profits into several integer levels (seldom are there more than a few hundred levels). We experimented with randomly generated matrices with zero value of the fvar objective function for three types of uncertain elements given by
• intervals: AI = (aij1, aij2) where 0 ≤ aij1 ≤ aij2 ≤ 100,
• mean and variance of random variables from the normal distribution: AN = (µij, σij²) where 0 ≤ µij ≤ 10, 0 ≤ σij² ≤ 100,
• triangular fuzzy numbers: ATF = (aij1, aij2, aij3) where 0 ≤ aij1 ≤ aij2 ≤ aij3 ≤ 1000,
for the instances with number of rows × columns 10 × 7, 10 × 14 and 20 × 7. With runs of the heuristic we achieved the interesting results given below in Tables 1–3. Iterations in the tables are the numbers of solutions in Step 2 of the heuristic. The characteristic optimum gives the percentage of optimal solutions from 100 generated instances.
Instances        10 × 7   10 × 14   20 × 7
Time (sec.)      4.3      68.6      311.7
Iterations       7.7      5.8       14.6
Optimum (%)      0        79        0
Objective fvar   30.8     0.7       97.8
Table 1 Mean computational characteristics for matrices AI
Instances        10 × 7   10 × 14   20 × 7
Time (sec.)      8.4      104.2     253.5
Iterations       8.2      6.1       8.1
Optimum (%)      0        87        0
Objective fvar   30.8     0.3       91.7
Table 2 Mean computational characteristics for matrices AN
Instances        10 × 7   10 × 14   20 × 7
Time (sec.)      12.0     564.2     239.8
Iterations       9.3      8.1       15.9
Optimum (%)      0        1         0
Objective fvar   137.3    13.1      588.0
Table 3 Mean computational characteristics for matrices ATF
6 Conclusion and future work
The computational experiments have shown that the presented model for the uniform workload distribution problems yields a practically applicable heuristic. An open question remains whether it is an exact method for some special real instances. We believe that the presented heuristic can be improved via generalizations of the exchange of elements in the columns of the matrix, not only for rows 1 ↔ 2 ↔ 3 ↔ · · · ↔ m ↔ 1 with step 1 but also for steps 2, 3, . . . , m − 1. That means only redefining i+ and i− in (3) with i+p and i−p (p = 2, 3, . . . , m − 1) and using the MUWPD model. So we can construct a more robust heuristic which begins with possible exchanges of elements from step m − 1, continues with step m − 2, etc., and ends with step 1.
Acknowledgements The research was supported by the research grants APVV-14-0658 Optimization of urban and regional public personal transport.
References
[1] Boudt, K., Vandeuffel, S. and Verbeken, K.: Block Rearranging Elements within Matrix Columns to Minimize the Variability of Row Sums. Working paper (2015).
[2] Coffman, E. G. and Yannakakis, M.: Permuting Elements within Columns of a Matrix in order to Minimize Maximum Row Sum. Mathematics of Operations Research, 9(3), (1984), 384–390.
[3] Černý, J.: How to minimize irregularity? (Slovak), Pokroky matematiky, fyziky a astronomie, 30 (3), 1985, 145–150.
[4] Czimmermann, P. and Peško, Š.: The Regular Permutation Scheduling on Graphs. Journal of Information, Control and Management Systems, 1, (2003), 15–22.
[5] Czimmermann, P.: A Note on Using Graphs in Regular Scheduling Problems. Communications, 4, (2003), 47–48.
[6] Czimmermann, P., Peško, Š. and Černý, J.: Uniform Workload Distribution Problems. Communications, 1A, (2016), 56–59.
[7] Dell'Olmo, P., Hansen, P., Pallottino, S. and Storchi, G.: On Uniform k-partition Problems. Discrete Applied Mathematics, 150, Issues 1–3, (2005), 121–139.
[8] Gurobi Optimization Inc.: Gurobi optimizer reference manual, 2015, http://www.gurobi.com.
[9] Peško, Š. and Černý, J.: Uniform Splitting in Managerial Decision Making. E+M, Economics and Management IX, 4, Technical University of Liberec, (2006), 67–71.
[10] Peško, Š.: The Matrix Permutation Problem in Interval Arithmetic. Journal of Information, Control and Management Systems, 4, (2006), 35–40.
[11] Peško, Š. and Hajtmanek, R.: Matrix Permutation Problem for Fair Workload Distribution. Proceedings of Mathematical Methods in Economics, 32nd international conference, Olomouc, 2014, 783–788.
[12] Peško, Š. and Kaukič, M.: Stochastic Algorithm for Fair Workload Distribution Problem. Proceedings of Informatics, 13th international scientific conference of informatics, Poprad, IEEE, 2015, 206–210.
[13] Tegze, M. and Vlach, M.: On the matrix permutation problem. Zeitschrift für Operations Research, 30, Issue 3, (1986), A155–A159.
Multidimensional stochastic dominance for discrete uniform distribution
Barbora Petrová1
Abstract. Stochastic dominance is a form of stochastic ordering which stems from decision theory: one gamble can be ranked as superior to another one for a broad class of decision makers whose utility functions, representing preferences, have a very general form. There exists an extensive theory concerning one-dimensional stochastic dominance of different orders. However, it is not obvious how to extend the concept to multiple dimensions, which is especially crucial when utilizing multidimensional non-separable utility functions. One possible approach is to transform the multidimensional random vector into a one-dimensional random variable and to declare stochastic dominance in multiple dimensions equivalent to stochastic dominance of the transformed vectors in one dimension. We suggest a more general framework which does not require reducing the dimension of the random vectors. We introduce two types of stochastic dominance and seek their generators in terms of von Neumann–Morgenstern utility functions. Moreover, we develop necessary and sufficient conditions for stochastic dominance between two discrete random vectors with uniform distribution.
Keywords: Multidimensional stochastic dominance, stochastic ordering, multidimensional utility function, generator of stochastic dominance.
JEL classification: C44, D81
AMS classification: 90C15, 91B16, 91B06
1 Introduction
The concept of stochastic dominance is widely used in seeking the optimal investment strategy. The reference strategy is represented by a random variable which corresponds, for instance, to the revenue of a benchmark portfolio. The stochastic dominance constraints then ensure that our strategy is at least as good as the reference strategy. Note that the comparison of two strategies depends on the order of the considered stochastic dominance. There exists an extensive theory concerning stochastic dominance of different orders between random variables. For instance, one can find a detailed description of the first order stochastic dominance in Dupačová and Kopa [3], of the second order stochastic dominance in Kopa and Post [4], of the third order stochastic dominance in Kopa and Post [5] and of the decreasing absolute risk aversion stochastic dominance in Post et al. [6]. For higher order stochastic dominance we refer to Branda and Kopa [1]. When the investment strategy represented by a random variable is replaced by an investment strategy represented by a random vector, it is not clear how to extend the definition of stochastic dominance to multiple dimensions. One could declare the multidimensional dominance equivalent to the one-dimensional dominance between the corresponding coordinates of the random vectors. Dentcheva and Ruszczyński [2] define linear stochastic dominance of random vectors which uses the mapping of the random vector to the space of linear combinations of its coordinates. Both concepts are, however, inapplicable when considering a class of investors with multidimensional non-separable utility functions (for a more detailed discussion on multidimensional utility functions see, for instance, Richard [7]). In this case there is a need to define stochastic dominance of random vectors without any reduction of their dimension. In the further text we restrict our attention only to the first order stochastic dominance. The paper is organized as follows. Chapter 2 is devoted to the approximation of a multivariate utility function by a linear combination of very simple functions, which comes to use in further chapters when defining the stochastic dominance (for a more detailed procedure we refer to Sriboonchitta et al. [8]). Chapter 3 summarizes characteristics of the first order stochastic dominance in one dimension. Chapter 4
1 Charles University in Prague, Faculty of Mathematics and Physics, petrova@karlin.mff.cuni.cz
extends the one-dimensional first order stochastic dominance to multiple dimensions and describes the generator of stochastic dominance in terms of von Neumann–Morgenstern utility functions. Chapter 5 focuses on stochastic dominance between two discrete random vectors, where we formulate necessary and sufficient conditions for one vector to be stochastically dominated by another one. Finally, Chapter 6 provides a brief concluding statement and considerations about future work.
2 Approximation of multivariate utility functions
A function $u : \mathbb{R}^d \to \mathbb{R}$ is said to be nondecreasing if $x \le y$ implies $u(x) \le u(y)$. In the whole text we employ the usual partial order relation on $\mathbb{R}^d$, that is, for any $x = (x_1, \ldots, x_d)$ and $y = (y_1, \ldots, y_d)$ in $\mathbb{R}^d$, we say that $x \le y$ if and only if $x_i \le y_i$, $i = 1, \ldots, d$. The following definition specifies the space of all utility functions which represent non-satiated individuals.
Definition 1. We define by $U_1$ the space of all nondecreasing functions $u : \mathbb{R}^d \to \mathbb{R}$.
The following definition introduces upper sets, which are crucial for the definition of multidimensional stochastic dominance.
Definition 2. A subset $M \subset \mathbb{R}^d$ is called an upper set if for each $y \in \mathbb{R}^d$ such that $y \ge x$ one also has $y \in M$ whenever $x \in M$. We denote by $\mathcal{M}$ the set of all upper sets in $\mathbb{R}^d$.
One can similarly define lower sets and the set of all lower sets in $\mathbb{R}^d$; however, the former definition is sufficient for the further theory. We note in advance that all statements in the following chapters can be equivalently formulated using the concept of lower sets. The simplest upper sets in $\mathbb{R}^d$ are sets with a single generator, i.e. there exists a unique point $m^* \in \mathbb{R}^d$ such that the upper set can be given as $\{x \in \mathbb{R}^d : x = m^* + t,\ t \ge 0\}$. These sets are thus of the form $(m^*, \infty) = (m_1^*, \infty) \times \cdots \times (m_d^*, \infty)$, where $m^* = (m_1^*, \ldots, m_d^*)$.
Definition 3. We denote by $\mathcal{M}^* \subset \mathcal{M}$ the subset of all upper sets in $\mathbb{R}^d$ which have a single generator, i.e. there exists a unique point $m^* \in \mathbb{R}^d$ such that the upper set can be given as $\{x \in \mathbb{R}^d : x = m^* + t,\ t \ge 0\}$.
Remark 1. In $\mathbb{R}$ all upper sets are intervals of the form $(m, \infty)$, $m \in \mathbb{R}$, and thus the subset $\mathcal{M}^*$ coincides with the set $\mathcal{M}$.
Let $u : \mathbb{R}^d \to \mathbb{R}_+$ be a nonnegative nondecreasing function. As a measurable function, $u$ can be approximated by the standard procedure in measure theory as follows. Let
$$u_n(x) = \begin{cases} \dfrac{i}{2^n}, & \text{for } \dfrac{i}{2^n} \le u(x) < \dfrac{i+1}{2^n},\ i = 0, 1, \ldots, n2^n - 1,\\[2mm] n, & \text{for } u(x) \ge n.\end{cases}$$
Then $u_n(x)$ converges pointwise to $u(x)$ for each $x \in \mathbb{R}^d$, as $n \to \infty$. Let $A_{i,n} = \{x \in \mathbb{R}^d : u(x) \ge i/2^n\}$ for each $n \ge 1$. These sets are upper sets, which are moreover nested, i.e. $A_{1,n} \supseteq A_{2,n} \supseteq \ldots \supseteq A_{n2^n,n}$, with possible equality for some inclusions. Thus one can rewrite $u_n$ as a positive linear combination of indicator functions of these sets as follows:
$$u_n(x) = \frac{1}{2^n}\sum_{i=1}^{n2^n} 1_{A_{i,n}}(x). \qquad (1)$$
If $u$ is a nonnegative nonincreasing function, then the above approximation still holds true. In this case, $\{A_{i,n}\}_{i=1}^{n2^n}$ is a sequence of nested lower sets.
Remark 2. If $u$ is a one-dimensional nonnegative nondecreasing function, then $A_{i,n}$ is of the form $(a_{i,n}, \infty)$, where $a_{i,n} \in \mathbb{R}$. Thus we can write each $u_n$ as a positive linear combination of indicator functions of the form $1_{(a_{i,n},\infty)}$. Specifically, we have
$$u_n(x) = \frac{1}{2^n}\sum_{i=1}^{n2^n} 1_{(a_{i,n},\infty)}(x).$$
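The dyadic approximation above is easy to compute numerically. A small sketch of ours, with an illustrative nondecreasing utility function chosen for the example only:

```python
import numpy as np

def u_n(u_values, n):
    """Dyadic approximation: round u down to multiples of 2**-n and cap at n."""
    return np.minimum(np.floor(np.asarray(u_values) * 2**n) / 2**n, n)

u = lambda x: np.sqrt(np.maximum(x, 0.0))   # illustrative nonnegative nondecreasing utility
xs = np.linspace(0.0, 9.0, 7)
print(u(xs))
print(u_n(u(xs), 3))                         # values of u_3, within 2**-3 of u below the cap
```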
3 Univariate stochastic dominance
In this part we briefly summarize the characteristics of the first order stochastic dominance between random variables. A more detailed description, together with the proof of the theorem stated below, can be found in Sriboonchitta et al. [8].
Definition 4. Let $X$ and $Y$ be two random variables with distribution functions $F$ and $G$. Then $X$ stochastically dominates $Y$ in the first order, denoted as $X \succeq Y$, if $F(x) \le G(x)$ for every $x \in \mathbb{R}$.
The following theorem states several equivalent ways how to decide about stochastic dominance between two random variables.
Theorem 1. Let $X$ and $Y$ be two random variables with distribution functions $F$, $G$ and quantile functions $F^{-1}$, $G^{-1}$. Then the following statements are equivalent:
1. $X$ stochastically dominates $Y$ in the first order,
2. $Eu(X) \ge Eu(Y)$ for any $u \in U_1$,
3. $F^{-1}(\alpha) \ge G^{-1}(\alpha)$ for all $\alpha \in (0, 1)$,
4. $VaR_X(\alpha) \le VaR_Y(\alpha)$ for all $\alpha \in (0, 1)$,
where $U_1$ is the space of all nondecreasing functions $u : \mathbb{R} \to \mathbb{R}$.
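For two strategies given by finitely many equiprobable scenarios, condition 3 of Theorem 1 reduces to comparing sorted realizations (cf. Remark 3 below). A short sketch with illustrative return data of our own:

```python
import numpy as np

def fsd_dominates(x, y):
    """First order stochastic dominance for two equiprobable samples of equal size:
    X dominates Y iff the i-th smallest realization of X is >= that of Y for every i."""
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    return bool(np.all(x >= y))

x = [0.02, 0.05, 0.07, 0.10]   # returns of a candidate strategy (illustrative)
y = [0.01, 0.04, 0.07, 0.09]   # returns of a benchmark (illustrative)
print(fsd_dominates(x, y))      # True
```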
4 Multivariate stochastic dominance
We introduce this chapter with the definition of multivariate stochastic dominance formulated by Sriboonchitta et al. [8].
Definition 5. Let $X$ and $Y$ be two $d$-dimensional random vectors. Then $X$ stochastically dominates $Y$, denoted as $X \succeq Y$, if for every upper set $M \in \mathcal{M}$ one has $P(X \in M) \ge P(Y \in M)$.
In the next definition we establish a new type of stochastic dominance which requires the survival distribution functions of the considered random vectors. The survival distribution function $\bar{F}$ of a random vector $X$ is defined as $\bar{F}(m) = P(X \ge m) = P(X \in (m_1, \infty) \times \cdots \times (m_d, \infty))$ for each $m = (m_1, \ldots, m_d) \in \mathbb{R}^d$.
Definition 6. Let $X$ and $Y$ be two $d$-dimensional random vectors with survival distribution functions $\bar{F}$ and $\bar{G}$. Then $X$ weakly stochastically dominates $Y$, denoted as $X \succeq_w Y$, if for each $m \in \mathbb{R}^d$ one has $\bar{F}(m) \ge \bar{G}(m)$.
When we consider random variables instead of random vectors, both definitions are equivalent and we only talk about stochastic dominance. Indeed, in $\mathbb{R}$ all upper sets have the form of intervals $(m, \infty)$, $m \in \mathbb{R}$, and thus $P(X \in (m, \infty)) = P(X \ge m) = \bar{F}(m)$. In this case, we obtain the standard definition of one-dimensional stochastic dominance.
The next theorem describes the generator of stochastic dominance in terms of von Neumann–Morgenstern utility functions.
Theorem 2. Let $X$ and $Y$ be two $d$-dimensional random vectors. Then $X$ stochastically dominates $Y$ if and only if $Eu(X) \ge Eu(Y)$ for all $u \in U_1$.
Proof. We refer to Sriboonchitta et al. [8].
Our goal is to describe also the generator of the weak stochastic dominance in terms of von Neumann–Morgenstern utility functions. Consider a utility function $u : \mathbb{R}^d \to \mathbb{R}$ such that for each $c \in \mathbb{R}$ there exists some $x_c \in \mathbb{R}^d$ such that $u(x_c) = c$, and for the set of solutions to the equation $u(x) = c$ the following inclusion holds true: $S = \{x \in \mathbb{R}^d : u(x) = c\} \subseteq \{x_c\} + K$, where $K$ is the convex cone of the form $K = \{t : t \ge 0\}$. Moreover, the sets $\{x \in \mathbb{R}^d : u(x) \ge c\}$ are upper sets which belong to the set $\mathcal{M}^*$. Then obviously each such utility function is included in the space $U_1$, and we will show that the set of these utility functions induces the weak stochastic dominance.
Definition 7. We define by $U_1^*$ the subspace of all nondecreasing functions $u : \mathbb{R}^d \to \mathbb{R}$ for which the set of solutions to $u(x) = c$ is included in the convex set $\{x_c\} + K$, where $x_c$ satisfies $u(x_c) = c$ and $K = \{t : t \ge 0\}$.
Theorem 3. Let $X$ and $Y$ be two $d$-dimensional random vectors. Then $X$ weakly stochastically dominates $Y$ if and only if $Eu(X) \ge Eu(Y)$ for all $u \in U_1^*$.
Proof. Firstly, we assume that the condition $Eu(X) \ge Eu(Y)$ is satisfied for all $u \in U_1^*$. Let $C_1^*$ be the subspace of $U_1^*$ consisting of functions of the form $1_{M^*}$, where $M^*$ is an upper set in $\mathcal{M}^*$. Since $C_1^* \subseteq U_1^*$, according to the assumption $Eu(X) - Eu(Y) \ge 0$ for all $u \in C_1^*$. Now we take an arbitrary $m = (m_1, \ldots, m_d) \in \mathbb{R}^d$ and show that $\bar{F}(m) \ge \bar{G}(m)$. Let $M^* = (m_1, \infty) \times \cdots \times (m_d, \infty)$ and define $u = 1_{M^*}$. Then we get
$$E1_{M^*}(X) \ge E1_{M^*}(Y) \iff P(X \in M^*) \ge P(Y \in M^*) \iff P(X \in (m_1, \infty) \times \cdots \times (m_d, \infty)) \ge P(Y \in (m_1, \infty) \times \cdots \times (m_d, \infty)) \iff \bar{F}(m) \ge \bar{G}(m).$$
The last statement ensures the random vector $Y$ to be weakly stochastically dominated by the random vector $X$.
To show the opposite implication, we employ the approximation of utility functions described in the previous section (Equation (1)). Let $u \in U_1^*$ and let $u^+ = \max(u, 0)$ and $u^- = -\min(u, 0)$ be the positive and negative parts of $u$. Both these functions are nonnegative, $u^+$ is nondecreasing and $u^-$ is nonincreasing. The positive part $u^+$ can be approximated by $u_n^+$, which equals a positive linear combination of indicator functions of the upper sets $A_i = \{x \in \mathbb{R}^d : u^+(x) \ge i/2^n\}$. These upper sets, however, belong to the set $\mathcal{M}^*$ since $u \in U_1^*$. According to our assumption, $X$ weakly stochastically dominates $Y$, which means that $P(X \in M^*) \ge P(Y \in M^*)$ for all $M^* \in \mathcal{M}^*$, and thus we must have $Eu_n^+(X) \ge Eu_n^+(Y)$ for each $n$. We get, by the monotone convergence theorem,
$$Eu^+(X) = \lim_{n\to\infty} Eu_n^+(X) \ge \lim_{n\to\infty} Eu_n^+(Y) = Eu^+(Y).$$
For the negative part $u^-$ we use the approximation by $u_n^-$, which equals a positive linear combination of indicator functions of the lower sets $A_i = \{x \in \mathbb{R}^d : u^-(x) \ge i/2^n\}$. These lower sets are elements of the set $\mathcal{L}^*$ since $u \in U_1^*$. The alternative definition says that $X$ weakly stochastically dominates $Y$ if and only if $P(X \in L^*) \le P(Y \in L^*)$ for all $L^* \in \mathcal{L}^*$, and thus $Eu_n^-(X) \le Eu_n^-(Y)$ for each $n$. Again, by the monotone convergence theorem,
$$Eu^-(X) = \lim_{n\to\infty} Eu_n^-(X) \le \lim_{n\to\infty} Eu_n^-(Y) = Eu^-(Y).$$
Combining the last two inequalities gives us
$$Eu(X) = Eu^+(X) - Eu^-(X) \ge Eu^+(Y) - Eu^-(Y) = Eu(Y),$$
which finishes the proof.
Clearly, when $X$ stochastically dominates $Y$, then $X$ also weakly stochastically dominates $Y$. This stems from the fact that the simplest upper sets in $\mathbb{R}^d$ have the form $(m_1^*, \infty) \times \cdots \times (m_d^*, \infty)$, $m = (m_1, \ldots, m_d) \in \mathbb{R}^d$, and thus if $P(X \in M) \ge P(Y \in M)$ for all upper sets $M \in \mathcal{M}$, the same inequality is valid also for the special types of upper sets $M^* \in \mathcal{M}^*$. The opposite implication is, however, not true, as the following example shows.
Example 1. Consider two random vectors with discrete joint distributions:
$$X = \begin{cases}(0, 1) & \text{with probability } 1/2,\\ (1, 0) & \text{with probability } 1/2,\end{cases} \qquad Y = \begin{cases}(0, 0) & \text{with probability } 1/2,\\ (1, 1) & \text{with probability } 1/2.\end{cases}$$
Then $F(m) \le G(m)$ for all $m \in \mathbb{R}^2$. Indeed,
$$\begin{aligned} F(m_1, m_2) = G(m_1, m_2) = 0 \quad & \text{if } m_1 < 0 \text{ or } m_2 < 0,\\ 0 = F(m_1, m_2) < G(m_1, m_2) = 1/2 \quad & \text{if } 0 \le m_1 < 1 \text{ and } 0 \le m_2 < 1,\\ F(m_1, m_2) = G(m_1, m_2) = 1/2 \quad & \text{if } m_1 \ge 1 \text{ and } 0 \le m_2 < 1,\\ F(m_1, m_2) = G(m_1, m_2) = 1/2 \quad & \text{if } 0 \le m_1 < 1 \text{ and } m_2 \ge 1,\\ F(m_1, m_2) = G(m_1, m_2) = 1 \quad & \text{if } m_1 \ge 1 \text{ and } m_2 \ge 1,\end{aligned}$$
and $X$ weakly stochastically dominates $Y$. Now consider the upper set $M = \{x = (x_1, x_2) \in \mathbb{R}^2 : x_1 + x_2 \ge 3/2\}$; then $0 = P(X \in M) < P(Y \in M) = 1/2$ and $X$ does not stochastically dominate $Y$.
5 Stochastic dominance for discrete distribution
In this chapter we focus on the stochastic dominance of vectors that have a discrete distribution. The motivation is as follows. When one solves an optimization problem with stochastic dominance constraints in order to seek an optimal strategy, the distributions of the potential and reference strategies are often of discrete form. For instance, one simulates the possible outcomes by a scenario tree based on the observed history. We focus on the stochastic dominance of vectors that have a uniform discrete distribution. We consider a special case when both compared vectors take on values from sets with equal cardinality.
Theorem 4. Let $X$ be a $d$-dimensional random vector with the uniform distribution on the set $\{x_1, \ldots, x_m\}$ and let $Y$ be a $d$-dimensional random vector with the uniform distribution on the set $\{y_1, \ldots, y_m\}$, $x_i \in \mathbb{R}^d$ for all $i = 1, \ldots, m$ and $y_j \in \mathbb{R}^d$ for all $j = 1, \ldots, m$. Then $X$ stochastically dominates $Y$ if and only if there exists a permutation $\Pi : \{1, \ldots, m\} \to \{1, \ldots, m\}$ such that $x_i \ge y_{\Pi(i)}$ for all $i = 1, \ldots, m$.
Proof. Firstly, assume that $X$ stochastically dominates $Y$; thus for each upper set $M \in \mathcal{M}$ one has $P(X \in M) \ge P(Y \in M)$. We show by contradiction that there necessarily exists a permutation $\Pi$ such that $x_i \ge y_{\Pi(i)}$ for all $i = 1, \ldots, m$. Assume that such a permutation does not exist, i.e. for each permutation $\Pi : \{1, \ldots, m\} \to \{1, \ldots, m\}$ there exists at least one index in $\{1, \ldots, m\}$ – take an arbitrary one and denote it as $i$ – for which $x_i < y_{\Pi(i)}$. We construct an upper set $M$ as $M = \{x \in \mathbb{R}^d : x = y_{\Pi(i)} + t,\ t \ge 0\}$. Then clearly $x_i$ does not belong to $M$. Let us denote as $J \subset \{1, \ldots, m\} \setminus \{i\}$ the set of indices such that for each $j \in J$ we have $y_{\Pi(j)} \in M$. Then we have
$$P(Y \in M) = \frac{|J| + 1}{m} > \frac{|J|}{m} \ge P(X \in M),$$
where we use the notation $|J|$ for the cardinality of the set $J$. The above stated inequality, however, contradicts the assumption about the stochastic dominance.
Conversely, assume that there exists a permutation $\Pi : \{1, \ldots, m\} \to \{1, \ldots, m\}$ such that $x_i \ge y_{\Pi(i)}$ for all $i = 1, \ldots, m$ and show that then $P(X \in M) \ge P(Y \in M)$ for each upper set $M \in \mathcal{M}$. Take an arbitrary $M \in \mathcal{M}$. Let us denote as $J \subseteq \{1, \ldots, m\}$ the set of all indices such that for each $j \in J$, $y_{\Pi(j)} \in M$. Then necessarily also $x_j \in M$, since by assumption $x_i \ge y_{\Pi(i)}$ for all $i = 1, \ldots, m$ and $M$ is an upper set. Let us, moreover, denote as $I \subseteq \{1, \ldots, m\}$ the set of indices such that for each $i \in I$, $x_i \in M$. Then $J \subseteq I$, since $x_i \ge y_{\Pi(i)}$, and therefore we must have
$$P(X \in M) = \frac{|I|}{m} \ge \frac{|J|}{m} = P(Y \in M).$$
Since $M$ was selected arbitrarily, the random vector $X$ stochastically dominates the random vector $Y$.
Remark 3. When $d = 1$ the theorem gives us a well known result. Consider two random variables $X$ and $Y$ with $m$ equiprobable scenarios $x_1, \ldots, x_m$ and $y_1, \ldots, y_m$. Reorder the realizations of both variables so that $x_{(1)} < x_{(2)} < \ldots < x_{(m)}$ and $y_{(1)} < y_{(2)} < \ldots < y_{(m)}$. Then the random variable $X$ stochastically dominates the random variable $Y$ if and only if $x_{(i)} \ge y_{(i)}$ for all $i = 1, \ldots, m$.
In practice, when we wish to decide about stochastic dominance of two random vectors which have discrete uniform distributions with the same number of possible realizations, we can proceed as follows. Assume that we want to show that the random vector $X$, uniformly distributed on the set $\{x_1, \ldots, x_m\}$, stochastically dominates the random vector $Y$, uniformly distributed on the set $\{y_1, \ldots, y_m\}$. For each $i = 1, \ldots, m$ we construct an upper set $M_i$ as $M_i = \{x \in \mathbb{R}^d : x = y_i + t,\ t \ge 0\}$ and seek some realization of the random vector $X$ included in this upper set. As soon as some realization of $X$ is assigned to some $y_i$, the realization cannot be assigned to any other realization of $Y$. If it is possible to assign to each $y_i$ exactly one $x_i$ (not necessarily in a unique way), then $X$ stochastically dominates $Y$. If there is no such solution, then $X$ cannot stochastically dominate $Y$. In other words, we seek the permutation described in the previous theorem.
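One way to search for the permutation of Theorem 4 is to cast the assignment of realizations of $X$ to realizations of $Y$ as a bipartite matching problem. The following sketch is our illustration of this idea (with made-up realizations), using SciPy's linear assignment solver:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def msd_dominates(X, Y):
    """Check the condition of Theorem 4: X (m x d equiprobable realizations) stochastically
    dominates Y iff some permutation pairs every y_i with a componentwise-greater x_j."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    feasible = np.all(X[:, None, :] >= Y[None, :, :], axis=2)   # feasible[j, i]: x_j >= y_i
    cost = np.where(feasible, 0.0, 1.0)
    rows, cols = linear_sum_assignment(cost)                     # minimum-cost perfect matching
    return bool(cost[rows, cols].sum() == 0.0)                   # zero cost <=> feasible matching

X = [(2.0, 3.0), (1.0, 5.0), (4.0, 1.0)]   # illustrative realizations of X
Y = [(1.0, 4.0), (0.0, 3.0), (3.0, 1.0)]   # illustrative realizations of Y
print(msd_dominates(X, Y))                  # True: (1,5)>=(1,4), (2,3)>=(0,3), (4,1)>=(3,1)
```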
6 Conclusion
In this paper we presented multidimensional stochastic dominance as it was defined in Sriboonchitta et al. [8]. We established another type of stochastic dominance, the weak stochastic dominance, and described its generator in terms of von Neumann–Morgenstern utility functions. We clarified the relationship between the two types of stochastic dominance. We focused our attention on the exploration of stochastic dominance between discrete random vectors, since optimization problems seeking an optimal strategy often consider the potential and reference strategies to be realizations of discrete random vectors. We stated necessary and sufficient conditions for one discrete random vector with uniform distribution to be stochastically dominated by another one and proposed an algorithm verifying their relation. In future work, we plan to incorporate multidimensional stochastic dominance constraints into the problem of finding the optimal investment strategy. We also intend to continue with the examination of stochastic dominance between random vectors with specific distributions. Namely, we would like to focus on random vectors with uniform distribution on convex sets and with multidimensional normal distribution. Another field of interest is to extend the stochastic dominance to dominance of higher orders, as was done in the one-dimensional case.
Acknowledgements I would like to express my gratitude to my advisor doc. RNDr. Ing. Miloš Kopa, PhD., who offered invaluable assistance and guidance. The paper is supported by the grant no. P402/12/G097 of the Czech Science Foundation and SVV 2016 no. 260334 of Charles University in Prague.
References
[1] M. Branda and M. Kopa. DEA models equivalent to general n-th order stochastic dominance efficiency tests. Operations Research Letters, 44:285–289, 2016.
[2] D. Dentcheva and A. Ruszczyński. Optimization with multivariate stochastic dominance constraints. Mathematical Programming, 117(1):111–127, 2009.
[3] J. Dupačová and M. Kopa. Robustness of optimal portfolios under risk and stochastic dominance constraints. European Journal of Operational Research, 234(2):434–441, 2014.
[4] M. Kopa and T. Post. A general test for SSD portfolio efficiency. OR Spectrum, 37(3):703–734, 2015.
[5] M. Kopa and T. Post. Portfolio choice based on third-degree stochastic dominance. Management Science, 2016, accepted.
[6] T. Post, Y. Fang, and M. Kopa. Linear tests for decreasing absolute risk aversion stochastic dominance. Management Science, 61(7):1615–1629, 2015.
[7] S. F. Richard. Multivariate risk aversion, utility independence and separable utility functions. Management Science, 22(1):12–21, 1975.
[8] S. Sriboonchitta, W. K. Wong, S. Dhompongsa, and H. T. Nguyen. Stochastic Dominance and Applications to Finance, Risk and Economics. Chapman and Hall/CRC, 2009.
The Intuitionistic Fuzzy Investment Recommendations
Krzysztof Piasecki1
Abstract. The return rate is considered here as an intuitionistic fuzzy probabilistic set. Then the expected return rate is obtained as an intuitionistic fuzzy subset in the real line. This result is a theoretical foundation for new investment strategies. All considered strategies are the result of a comparison of an intuitionistic fuzzy profit index and a limit value. In this way we obtain an imprecise investment recommendation. Financial equilibrium criteria are a special case of the comparison of the profit index and the limit value. The following criteria are generalized: the Sharpe's Ratio, the Jensen's Alpha and the Treynor's Ratio. Moreover, the safety-first criteria are generalized here for the fuzzy case. The Roy Criterion, the Kataoka Criterion and the Telser Criterion are generalized. The results obtained here show that the proposed theory is useful for investment applications. Moreover, the results so obtained may be applied in behavioural finance theory as a normative model of investment decisions.
Keywords: imprecise return rate, investment recommendation, intuitionistic fuzzy subset.
JEL Classification: C02, G11
AMS Classification: 03E72, 90B50, 91G99
1 Research Problem
A fuzzy probabilistic subset in the real line is the most common model of an imprecisely estimated return rate under quantified uncertainty. Knightian uncertainty is excluded here. From a practical point of view, the exclusion of Knightian uncertainty is too strong an assumption. A commonly recognized model of Knightian uncertainty is the intuitionistic fuzzy set (IFS for short) [1]. Therefore, the imprecise return rate under any uncertainty may be described by a probabilistic IFS [5]. Then the expected return rate will be given as an IFS in the real line. In [2] and [7], we can find examples of such return rates described by probabilistic IFS which are well justified by economic reasons. Investment strategies determined by the fuzzy expected return rate are presented in [6]. The result of pursuing such a strategy was the assignment of a fuzzy financial investment recommendation to each security. The main goal of this article is the consideration of investment recommendations given for securities with fuzzy probabilistic return. The financial equilibrium criteria and the safety-first criteria will be considered. In this way, setting the investment goal will be able to take the imprecision risk into account.
2 Intuitionistic Fuzzy Expected Return Rate
Let us assume that the time horizon $t > 0$ of an investment is fixed. The basic characteristic of benefits from owning any security is its simple return rate $\tilde{r}: \Omega = \{\omega\} \longrightarrow \mathbb{R}$, where the set $\Omega$ is a set of elementary states of the financial market. The return rate is a random variable which is at uncertainty risk. In the practice of financial market analysis, the uncertainty risk is usually described by a probability distribution of return rates $\tilde{r}_t$. This probability distribution is given by a cumulative distribution function $F_r: \mathbb{R} \longrightarrow [0; 1]$. On the other side, the cumulative distribution function $F_r$ determines a probability distribution $P: 2^{\Omega} \supset \tilde{r}^{-1}(\mathcal{B}) \longrightarrow [0; 1]$, where the symbol $\mathcal{B}$ denotes the smallest Borel σ-field containing all intervals in the real line $\mathbb{R}$. Let us assume that the considered return rate is estimated by means of a probabilistic IFS $\mathcal{R}$ determined by its membership function $\rho_{\mathcal{R}} \in [0; 1]^{\mathbb{R}\times\Omega}$ and nonmembership function $\varphi_{\mathcal{R}} \in [0; 1]^{\mathbb{R}\times\Omega}$. For this case, the expected return rate is the IFS $R \in \mathcal{I}(\mathbb{R})$ given as follows
$$R = \{(x, \rho_R(x), \varphi_R(x)): x \in \mathbb{R}\}, \qquad (1)$$
where the membership function $\rho_R \in [0; 1]^{\mathbb{R}}$ and nonmembership function $\varphi_R \in [0; 1]^{\mathbb{R}}$ are determined by the identities
$$\rho_R(x) = \int_\Omega \rho_{\mathcal{R}}(x, \omega)\,dP, \qquad (2)$$
1 Poznań University of Economics and Business, Department of Operations Research, al. Niepodleglosci 10, 61875 Poznań, Poland, k.piasecki@ue.poznan.pl
$$\varphi_R(x) = \int_\Omega \varphi_{\mathcal{R}}(x, \omega)\,dP, \qquad (3)$$
and the symbol $\mathcal{I}(\mathbb{R})$ denotes the family of all IFS in the real line $\mathbb{R}$.
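When the market states in $\Omega$ are finitely many scenarios with known probabilities, the integrals (2)–(3) become weighted averages. A small numerical sketch of ours, with made-up membership and nonmembership values on a grid of return levels $x$:

```python
import numpy as np

def expected_ifs(rho_scen, phi_scen, probs):
    """Discrete counterparts of (2)-(3): average the scenario membership and
    nonmembership functions over the probability distribution P on Omega."""
    probs = np.asarray(probs, float)
    rho_R = probs @ np.asarray(rho_scen, float)   # rho_R(x) on the grid of x values
    phi_R = probs @ np.asarray(phi_scen, float)   # phi_R(x) on the same grid
    return rho_R, phi_R

# Illustrative data: 3 equally probable market states, functions sampled at 5 return levels
rho_scen = [[0.0, 0.5, 1.0, 0.5, 0.0],
            [0.0, 0.0, 0.5, 1.0, 0.5],
            [0.5, 1.0, 0.5, 0.0, 0.0]]
phi_scen = [[1.0, 0.4, 0.0, 0.4, 1.0],
            [0.9, 0.8, 0.4, 0.0, 0.4],
            [0.4, 0.0, 0.4, 0.9, 1.0]]
rho_R, phi_R = expected_ifs(rho_scen, phi_scen, [1/3, 1/3, 1/3])
print(rho_R, phi_R)
```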
3 Investment Recommendations Dependent on Expected Return
An investment recommendation is the counsel given by advisers to the investor. For convenience, these recommendations may be expressed by means of standardized advices. Advisors use different terminology and a different number of words forming the advice vocabulary [12]. In this article we concentrate on the five-element adviser's vocabulary given as the set

$\mathbb{A} = \{\mathcal{B}, \mathcal{A}, \mathcal{H}, \mathcal{R}, \mathcal{S}\},$   (4)

where:
• $\mathcal{B}$ denotes the advice Buy, suggesting that the evaluated security is significantly undervalued,
• $\mathcal{A}$ denotes the advice Accumulate, suggesting that the evaluated security is undervalued,
• $\mathcal{H}$ denotes the advice Hold, suggesting that the evaluated security is fairly valued,
• $\mathcal{R}$ denotes the advice Reduce, suggesting that the evaluated security is overvalued,
• $\mathcal{S}$ denotes the advice Sell, suggesting that the evaluated security is significantly overvalued.
Let us take into account a fixed security $\breve{S} \in \mathbb{Y}$ with expected return rate $r_s \in \mathbb{R}$, where the symbol $\mathbb{Y}$ denotes the set of all considered securities. In such a case the advisor's counsel depends on the expected return. Then the criterion for a competent choice of advice can be presented as a comparison of the values $g(r_s)$ and $\hat{G}$, defined as follows:
• $g: \mathbb{R} \to \mathbb{R}$ is an increasing function of substantially justified form,
• $\hat{G}$ denotes the substantially justified limit value.
The function $g: \mathbb{R} \to \mathbb{R}$ serves as a profit index. Using this criterion, we define the advice choice function $\Lambda: \mathbb{R} \to 2^{\mathbb{A}}$ as follows

$\mathcal{B} \in \Lambda(\breve{S}) \iff g(r_s) > \hat{G} \iff g(r_s) \geq \hat{G} \wedge \neg\,(g(r_s) \leq \hat{G}),$
$\mathcal{A} \in \Lambda(\breve{S}) \iff g(r_s) \geq \hat{G},$
$\mathcal{H} \in \Lambda(\breve{S}) \iff g(r_s) = \hat{G} \iff g(r_s) \geq \hat{G} \wedge g(r_s) \leq \hat{G},$
$\mathcal{R} \in \Lambda(\breve{S}) \iff g(r_s) \leq \hat{G},$
$\mathcal{S} \in \Lambda(\breve{S}) \iff g(r_s) < \hat{G} \iff \neg\,(g(r_s) \geq \hat{G}) \wedge g(r_s) \leq \hat{G}.$   (5)
In this way we assign an advice subset to the security $\breve{S}$. The value $\Lambda(\breve{S})$ is called the investment recommendation. This recommendation may be used as a valuable starting point for equity portfolio strategies. On the other hand, the weak point of the proposed choice function is that it omits the results of fundamental analysis and the impact of behavioural factors. Analyzing the above choice function, it is easy to see that there is no strong boundary between the advices Buy and Accumulate, or between the advices Reduce and Sell. A justification for distinguishing between these advices can be sought on the basis of fundamental analysis and among the behavioural aspects of the investment process.
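The crisp choice function (5) can be written down directly. The short Python sketch below is only an illustration of this definition; the function name and the numerical inputs are invented for the example.

```python
def advice_choice(g_rs, g_hat):
    """Crisp advice choice function (5): compare the profit index value g(r_s)
    with the limit value G_hat and return the set of recommended advices."""
    advices = set()
    if g_rs > g_hat:
        advices.add("Buy")
    if g_rs >= g_hat:
        advices.add("Accumulate")
    if g_rs == g_hat:
        advices.add("Hold")
    if g_rs <= g_hat:
        advices.add("Reduce")
    if g_rs < g_hat:
        advices.add("Sell")
    return advices

# Example: profit index of the security above the benchmark value
print(advice_choice(0.45, 0.40))   # {'Buy', 'Accumulate'}
```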
4 Recommendations Dependent on Intuitionistic Fuzzy Expected Return
Let us assume that the expected return rate from the security $\breve{S} \in \mathbb{Y}$ is represented by an IFS $R \in \mathcal{I}(\mathbb{R})$ described by (1). In the first step, we extend the domain $\mathbb{R}$ of the profit index to the set $\mathcal{I}(\mathbb{R})$. According to Zadeh's Extension Principle [13], for any $R \in \mathcal{I}(\mathbb{R})$ the profit index value $g(R)$ is determined by its membership function $\gamma \in [0;1]^{\mathbb{R}}$ and nonmembership function $\delta \in [0;1]^{\mathbb{R}}$ given as follows

$\gamma(x) = \sup\{\rho_R(r): x = g(r)\} = \rho_R(g^{-1}(x)),$   (6)
$\delta(x) = \inf\{\varphi_R(r): x = g(r)\} = \varphi_R(g^{-1}(x)).$   (7)

In this way we define the intuitionistic fuzzy profit index $g: \mathcal{I}(\mathbb{R}) \to \mathcal{I}(\mathbb{R})$. In the second step, we extend the domain $\mathbb{R}$ of the advice choice function to the set $\mathcal{I}(\mathbb{R})$. In agreement with Zadeh's Extension Principle [13], the investment recommendation $\Lambda(\breve{S})$ is an IFS described by its membership
function $\lambda(\cdot\,|\breve{S}) \in [0;1]^{\mathbb{A}}$ and nonmembership function $\kappa(\cdot\,|\breve{S}) \in [0;1]^{\mathbb{A}}$. Due to (5) and (6), the membership function $\lambda(\cdot\,|\breve{S})$ is determined by the identities

$\lambda(\mathcal{B}|\breve{S}) = \sup\{\gamma(x): x \geq \hat{G}\} \wedge \inf\{\delta(x): x \leq \hat{G}\},$   (8)
$\lambda(\mathcal{A}|\breve{S}) = \sup\{\gamma(x): x \geq \hat{G}\} = \sup\{\rho_R(g^{-1}(x)): x \geq \hat{G}\},$   (9)
$\lambda(\mathcal{H}|\breve{S}) = \sup\{\gamma(x): x \geq \hat{G}\} \wedge \sup\{\gamma(x): x \leq \hat{G}\},$   (10)
$\lambda(\mathcal{R}|\breve{S}) = \sup\{\gamma(x): x \leq \hat{G}\} = \sup\{\rho_R(g^{-1}(x)): x \leq \hat{G}\},$   (11)
$\lambda(\mathcal{S}|\breve{S}) = \sup\{\gamma(x): x \leq \hat{G}\} \wedge \inf\{\delta(x): x \geq \hat{G}\}.$   (12)

In a similar way, by (5) and (7), we determine the nonmembership function $\kappa(\cdot\,|\breve{S})$. We have here

$\kappa(\mathcal{B}|\breve{S}) = \inf\{\delta(x): x \geq \hat{G}\} \vee \sup\{\gamma(x): x \leq \hat{G}\},$   (13)
$\kappa(\mathcal{A}|\breve{S}) = \inf\{\delta(x): x \geq \hat{G}\} = \inf\{\varphi_R(g^{-1}(x)): x \geq \hat{G}\},$   (14)
$\kappa(\mathcal{H}|\breve{S}) = \inf\{\delta(x): x \geq \hat{G}\} \vee \inf\{\delta(x): x \leq \hat{G}\},$   (15)
$\kappa(\mathcal{R}|\breve{S}) = \inf\{\delta(x): x \leq \hat{G}\} = \inf\{\varphi_R(g^{-1}(x)): x \leq \hat{G}\},$   (16)
$\kappa(\mathcal{S}|\breve{S}) = \inf\{\delta(x): x \leq \hat{G}\} \vee \sup\{\gamma(x): x \geq \hat{G}\}.$   (17)
In this way we define the advice choice function $\tilde{\Lambda}: \mathbb{Y} \to \mathcal{I}(\mathbb{A})$. The value $\tilde{\Lambda}(\breve{S})$ is called an imprecise investment recommendation. The value $\lambda(\mathcal{X}|\breve{S})$ may be interpreted as the degree to which the advice $\mathcal{X} \in \mathbb{A}$ is recommended for the security $\breve{S}$. Moreover, the value $\kappa(\mathcal{X}|\breve{S})$ may be interpreted as the degree to which the advice $\mathcal{X} \in \mathbb{A}$ is rejected for the security $\breve{S}$. Each of these values describes the investment recommendation issued for the security $\breve{S} \in \mathbb{Y}$ by the advisor. Each advice is thus recommended to some extent. The investor shifts some of the responsibility to advisers. For this reason, investors restrict their choice of investment decisions to advices which are recommended to the greatest degree and rejected to the smallest degree. In this way, investors minimize their individual responsibility for financial decision making. This means that the final investment decision criteria may be criteria of maximizing the value $\lambda(\cdot\,|\breve{S})$ and minimizing the value $\kappa(\cdot\,|\breve{S})$. However, the final investment decision is made by the investor. Guided by their own knowledge and intuition, the investor may also choose an advice which is recommended to a lower degree. Let us note that we have here

$\lambda(\mathcal{B}|\breve{S}) = \lambda(\mathcal{A}|\breve{S}) \wedge \kappa(\mathcal{R}|\breve{S}),$   (18)
$\lambda(\mathcal{H}|\breve{S}) = \lambda(\mathcal{A}|\breve{S}) \wedge \lambda(\mathcal{R}|\breve{S}),$   (19)
$\lambda(\mathcal{S}|\breve{S}) = \lambda(\mathcal{R}|\breve{S}) \wedge \kappa(\mathcal{A}|\breve{S}),$   (20)
$\kappa(\mathcal{B}|\breve{S}) = \kappa(\mathcal{A}|\breve{S}) \vee \lambda(\mathcal{R}|\breve{S}),$   (21)
$\kappa(\mathcal{H}|\breve{S}) = \kappa(\mathcal{A}|\breve{S}) \vee \kappa(\mathcal{R}|\breve{S}),$   (22)
$\kappa(\mathcal{S}|\breve{S}) = \kappa(\mathcal{R}|\breve{S}) \vee \lambda(\mathcal{A}|\breve{S}).$   (23)
This shows that the values $\lambda(\mathcal{A}|\breve{S})$, $\lambda(\mathcal{R}|\breve{S})$, $\kappa(\mathcal{A}|\breve{S})$ and $\kappa(\mathcal{R}|\breve{S})$ are sufficient to determine the imprecise investment recommendation. For this reason, when considering a specific method of advice choice, we will determine only these values. Let us note that the use of an imprecisely estimated return allows us to determine the differences between the advices Buy and Accumulate and between the advices Reduce and Sell.
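These four values can be approximated numerically once the membership and nonmembership functions of the expected return IFS are sampled on a grid of return rates. The following sketch is an assumption-laden illustration of (9), (11), (14) and (16), with the convention that a supremum over an empty set is 0 and an infimum over an empty set is 1; the grid, the profit index and the limit value are all invented for the example.

```python
import numpy as np

def imprecise_recommendation(x, rho_R, phi_R, g, g_hat):
    """Grid approximation of (9), (11), (14), (16): degrees to which the
    advices Accumulate / Reduce are recommended (lambda) or rejected (kappa),
    given the expected-return IFS (rho_R, phi_R) sampled on the grid x."""
    above = g(x) >= g_hat          # returns whose profit index reaches G_hat
    below = g(x) <= g_hat
    # Conventions for empty sets: sup over empty set = 0, inf over empty set = 1
    lam_A = rho_R[above].max() if above.any() else 0.0
    lam_R = rho_R[below].max() if below.any() else 0.0
    kap_A = phi_R[above].min() if above.any() else 1.0
    kap_R = phi_R[below].min() if below.any() else 1.0
    return {"lambda_A": lam_A, "lambda_R": lam_R,
            "kappa_A": kap_A, "kappa_R": kap_R}

# Illustrative triangular IFS for the expected return rate
x = np.linspace(-0.10, 0.20, 301)
rho_R = np.clip(np.minimum((x - 0.00) / 0.04, (0.08 - x) / 0.04), 0.0, 1.0)
phi_R = np.clip(1.0 - rho_R - 0.1, 0.0, 1.0)

# Identity profit index with a limit value of 3% (illustrative numbers)
print(imprecise_recommendation(x, rho_R, phi_R, g=lambda r: r, g_hat=0.03))
```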
5 Financial Equilibrium Criteria
Each financial equilibrium model is given as a comparison of the expected return from the considered security with the expected return of the market portfolio. We will consider a fixed security $\breve{S}$.
5.1 The Sharpe Ratio
In this section any security $\breve{Y} \in \mathbb{Y}$ is represented by the pair $(r_y, \sigma_y^2) \in \mathbb{R}^2$, where $r_y$ is the expected rate of return on $\breve{Y}$ and $\sigma_y^2$ is the variance of the return rate on $\breve{Y}$. We assume that there exists a risk-free bond represented by
the pair $(r_0, 0)$ and the market portfolio represented by the pair $(r_M, \sigma_M^2)$. If the security $\breve{S}$ is represented by the pair $(r_s, \sigma_s^2)$, then Sharpe [9] defines the profit index $g: \mathbb{R} \to \mathbb{R}$ and the limit value $\hat{G}$ as follows

$g(r) = \dfrac{r - r_0}{\sigma_s},$   (24)
$\hat{G} = \dfrac{r_M - r_0}{\sigma_M}.$   (25)

Sharpe's profit index estimates the premium per unit of overall risk. Sharpe's limit value is equal to the unit risk premium of the market portfolio. Let us consider the case when the expected return from the security $\breve{S}$ is estimated as $R \in \mathcal{I}(\mathbb{R})$ described by (1). In agreement with (9), (11), (14) and (16) we have here

$\lambda(\mathcal{A}|\breve{S}) = \sup\left\{\rho_R(\sigma_s \cdot x + r_0): x \geq \dfrac{r_M - r_0}{\sigma_M}\right\},$   (26)
$\lambda(\mathcal{R}|\breve{S}) = \sup\left\{\rho_R(\sigma_s \cdot x + r_0): x \leq \dfrac{r_M - r_0}{\sigma_M}\right\},$   (27)
$\kappa(\mathcal{A}|\breve{S}) = \inf\left\{\varphi_R(\sigma_s \cdot x + r_0): x \geq \dfrac{r_M - r_0}{\sigma_M}\right\},$   (28)
$\kappa(\mathcal{R}|\breve{S}) = \inf\left\{\varphi_R(\sigma_s \cdot x + r_0): x \leq \dfrac{r_M - r_0}{\sigma_M}\right\}.$   (29)
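A short numerical illustration of (26)-(29) follows; it only shows how the substitution $g^{-1}(x) = \sigma_s \cdot x + r_0$ turns the condition $x \geq \hat{G}$ into a condition on the return grid. All market parameters and the triangular IFS are invented for the example.

```python
import numpy as np

# Illustrative sketch of (26)-(29): the Sharpe profit index g(r) = (r - r0)/sigma_s
# is compared with the market benchmark G_hat = (rM - r0)/sigma_M.  The numbers
# below (r0, rM, sigma_s, sigma_M and the triangular IFS) are assumptions.
r0, rM, sigma_s, sigma_M = 0.02, 0.06, 0.25, 0.20
G_hat = (rM - r0) / sigma_M

r = np.linspace(-0.10, 0.30, 401)                         # return-rate grid
rho_R = np.clip(np.minimum((r - 0.03) / 0.05, (0.13 - r) / 0.05), 0.0, 1.0)
phi_R = np.clip(1.0 - rho_R - 0.1, 0.0, 1.0)

above = (r - r0) / sigma_s >= G_hat    # condition x >= G_hat after substitution
below = (r - r0) / sigma_s <= G_hat

lam_A, lam_R = rho_R[above].max(), rho_R[below].max()
kap_A, kap_R = phi_R[above].min(), phi_R[below].min()
print(lam_A, lam_R, kap_A, kap_R)
```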
5.2 Jensen's Alpha
On the capital market we observe the risk-free return $r_0$ and the expected market return $r_M$. The security $\breve{S}$ is represented by the pair $(r_s, \beta_s)$, where $\beta_s$ is the directional factor of the CAPM model assigned to this security. Jensen [3] defines the profit index $g: \mathbb{R} \to \mathbb{R}$ and the limit value $\hat{G}$ as follows

$g(r) = r - \beta_s \cdot (r_M - r_0),$   (30)
$\hat{G} = r_0.$   (31)

Jensen's profit index estimates the expected return diminished by the premium for the market portfolio risk. Jensen's limit value is equal to the risk-free return rate. Let us consider the case when the expected return from the security $\breve{S}$ is estimated as $R \in \mathcal{I}(\mathbb{R})$ described by (1). In agreement with (9), (11), (14) and (16) we have here
$\lambda(\mathcal{A}|\breve{S}) = \sup\{\rho_R(r): r - \beta_s \cdot (r_M - r_0) \geq r_0\},$   (32)
$\lambda(\mathcal{R}|\breve{S}) = \sup\{\rho_R(r): r - \beta_s \cdot (r_M - r_0) \leq r_0\},$   (33)
$\kappa(\mathcal{A}|\breve{S}) = \inf\{\varphi_R(r): r - \beta_s \cdot (r_M - r_0) \geq r_0\},$   (34)
$\kappa(\mathcal{R}|\breve{S}) = \inf\{\varphi_R(r): r - \beta_s \cdot (r_M - r_0) \leq r_0\}.$   (35)

5.3 The Treynor Ratio
Let us keep the assumptions of Section 5.2. Additionally, we assume that the security return is positively correlated with the market portfolio return. Treynor [11] defines the profit index $g: \mathbb{R} \to \mathbb{R}$ and the limit value $\hat{G}$ as follows

$g(r) = \dfrac{r - r_0}{\beta_s},$   (36)
$\hat{G} = r_M - r_0.$   (37)

Treynor's profit index estimates the premium per unit of market risk. Treynor's limit value is equal to the market risk premium. Let us consider the case when the expected return from the security $\breve{S}$ is estimated as $R \in \mathcal{I}(\mathbb{R})$ described by (1). In agreement with (9), (11), (14) and (16) we have here

$\lambda(\mathcal{A}|\breve{S}) = \sup\{\rho_R(\beta_s \cdot x + r_0): x \geq r_M - r_0\},$   (38)
$\lambda(\mathcal{R}|\breve{S}) = \sup\{\rho_R(\beta_s \cdot x + r_0): x \leq r_M - r_0\},$   (39)
$\kappa(\mathcal{A}|\breve{S}) = \inf\{\varphi_R(\beta_s \cdot x + r_0): x \geq r_M - r_0\},$   (40)
$\kappa(\mathcal{R}|\breve{S}) = \inf\{\varphi_R(\beta_s \cdot x + r_0): x \leq r_M - r_0\}.$   (41)

An investment recommendation made by means of the Treynor Ratio is identical with the investment recommendation made by means of Jensen's Alpha.
6 The Safety-First Criteria
We consider the simple return rate $\tilde{r}$ on the security $\breve{S}$. For each assumed value $r \in \mathbb{R}$ of the expected simple return rate, the probability distribution of this return is given by the cumulative distribution function $F(\cdot\,|r): \mathbb{R} \to [0;1]$, which is strictly increasing and continuous. Then the Safety Condition [8] is given in the following way

$F(L|r) = \varepsilon,$   (42)

where:
• $L$ denotes the minimum acceptable return rate;
• $\varepsilon$ is equal to the probability of a realization of the return below the minimum acceptable rate.
The realization of a return below the minimum acceptable rate is identified with a loss. Therefore, the variable $\varepsilon$ denotes the loss probability. Let us consider the case when the expected return from the security $\breve{S}$ is estimated as $R \in \mathcal{I}(\mathbb{R})$ described by (1).
6.1 Roy's Criterion
Roy's Criterion [8] is that, for a fixed minimum acceptable return rate $L$, the investor minimizes the loss probability. Additionally, in order to ensure financial security, the investor assumes a maximum level $\varepsilon^{*}$ of the loss probability. Then the profit index $g: \mathbb{R} \to [-1;0]$ and the limit value $\hat{G}$ are defined as follows

$g(r) = -F(L|r),$   (43)
$\hat{G} = -\varepsilon^{*}.$   (44)

In agreement with (9), (11), (14) and (16) we have here
$\lambda(\mathcal{A}|\breve{S}) = \sup\{\rho_R(r): F(L|r) \leq \varepsilon^{*}\},$   (45)
$\lambda(\mathcal{R}|\breve{S}) = \sup\{\rho_R(r): F(L|r) \geq \varepsilon^{*}\},$   (46)
$\kappa(\mathcal{A}|\breve{S}) = \inf\{\varphi_R(r): F(L|r) \leq \varepsilon^{*}\},$   (47)
$\kappa(\mathcal{R}|\breve{S}) = \inf\{\varphi_R(r): F(L|r) \geq \varepsilon^{*}\}.$   (48)
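The conditions in (45)-(48) can be evaluated once a distributional model of $F(\cdot\,|r)$ is chosen. The sketch below assumes, purely for illustration, a normal return distribution with mean $r$ and a fixed volatility; the loss threshold, the loss probability level and the IFS are also assumptions.

```python
import numpy as np
from scipy.stats import norm

# Sketch of (45)-(48) for Roy's criterion, assuming (for illustration only)
# that the return distribution F(.|r) is normal with mean r and a fixed
# volatility sigma; L, eps_star and the triangular IFS are also assumptions.
L, eps_star, sigma = -0.02, 0.10, 0.05

r = np.linspace(-0.10, 0.25, 351)                        # expected-return grid
rho_R = np.clip(np.minimum((r - 0.02) / 0.04, (0.10 - r) / 0.04), 0.0, 1.0)
phi_R = np.clip(1.0 - rho_R - 0.1, 0.0, 1.0)

loss_prob = norm.cdf(L, loc=r, scale=sigma)   # F(L|r) for every r on the grid
safe = loss_prob <= eps_star                  # returns meeting the safety level
risky = loss_prob >= eps_star

lam_A = rho_R[safe].max() if safe.any() else 0.0
lam_R = rho_R[risky].max() if risky.any() else 0.0
kap_A = phi_R[safe].min() if safe.any() else 1.0
kap_R = phi_R[risky].min() if risky.any() else 1.0
print(lam_A, lam_R, kap_A, kap_R)
```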
6.2 Kataoka's Criterion
Kataoka's Criterion [4] is that, for a fixed loss probability $\varepsilon$, the investor maximizes the minimum acceptable return rate. In addition, in order to ensure an interest yield, the investor assumes a minimum level $L^{*}$ of return. Then the profit index $g: \mathbb{R} \to \mathbb{R}$ and the limit value $\hat{G}$ are defined in the following way

$g(r) = F^{-1}(\varepsilon|r),$   (49)
$\hat{G} = L^{*}.$   (50)

In agreement with (9), (11), (14) and (16) we have here

$\lambda(\mathcal{A}|\breve{S}) = \sup\{\rho_R(r): F^{-1}(\varepsilon|r) \geq L^{*}\},$   (51)
$\lambda(\mathcal{R}|\breve{S}) = \sup\{\rho_R(r): F^{-1}(\varepsilon|r) \leq L^{*}\},$   (52)
$\kappa(\mathcal{A}|\breve{S}) = \inf\{\varphi_R(r): F^{-1}(\varepsilon|r) \geq L^{*}\},$   (53)
$\kappa(\mathcal{R}|\breve{S}) = \inf\{\varphi_R(r): F^{-1}(\varepsilon|r) \leq L^{*}\}.$   (54)
6.3 Telser's Criterion
Based on the safety and profitability of the investment, the investor assumes a minimum level $L^{*}$ of acceptable return and a maximum level $\varepsilon^{*}$ of the loss probability. If the security $\breve{S}$ fulfils the condition

$F(L^{*}|r_s) \leq \varepsilon^{*},$   (55)

then it is called a safe-haven security. Telser's Criterion [10] is that the investor maximizes the return on safe-haven securities. In addition, to ensure the profitability of investments, the investor takes into account the rate $r^{*} > L^{*}$ of financial equilibrium. Then the profit index $g: \mathbb{R} \to \mathbb{R}$ and the limit value $\hat{G}$ are defined in the following way
$g(r) = r,$   (56)
$\hat{G} = r^{*}.$   (57)

In agreement with (9), (11), (14) and (16) we have here

$\lambda(\mathcal{A}|\breve{S}) = \sup\{\rho_R(r): r \geq r^{*}\},$   (58)
$\lambda(\mathcal{R}|\breve{S}) = \sup\{\rho_R(r): r \leq r^{*}\},$   (59)
$\kappa(\mathcal{A}|\breve{S}) = \inf\{\varphi_R(r): r \geq r^{*}\},$   (60)
$\kappa(\mathcal{R}|\breve{S}) = \inf\{\varphi_R(r): r \leq r^{*}\}.$   (61)
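A small sketch of the two-stage Telser logic follows: first the safe-haven condition (55) is checked (here under an illustrative normality assumption and a crisp expected return), and only then are (58)-(61) evaluated against the equilibrium rate $r^{*}$. All numbers are invented.

```python
import numpy as np
from scipy.stats import norm

# Sketch of Telser's criterion: check the safe-haven condition (55) under an
# illustrative normality assumption for F(.|r_s), then evaluate (58)-(61)
# against the equilibrium rate r_star.  All numbers are assumptions.
L_star, eps_star, r_star, sigma = -0.02, 0.10, 0.04, 0.05
r_s = 0.06                                    # crisp expected return estimate

is_safe_haven = norm.cdf(L_star, loc=r_s, scale=sigma) <= eps_star   # (55)

r = np.linspace(-0.10, 0.25, 351)
rho_R = np.clip(np.minimum((r - 0.02) / 0.04, (0.10 - r) / 0.04), 0.0, 1.0)
phi_R = np.clip(1.0 - rho_R - 0.1, 0.0, 1.0)

if is_safe_haven:
    lam_A = rho_R[r >= r_star].max()          # (58)
    lam_R = rho_R[r <= r_star].max()          # (59)
    kap_A = phi_R[r >= r_star].min()          # (60)
    kap_R = phi_R[r <= r_star].min()          # (61)
    print(lam_A, lam_R, kap_A, kap_R)
```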
Summary
Imprecision is relevant to the investment process. An imprecise estimate of the expected return may be a consequence of taking into account behavioural aspects of the investment process. The results obtained here may be applied in behavioural finance theory as a normative model of investment decisions. These results may also provide theoretical foundations for constructing an investment decision support system. Thus, we have shown that behavioural premises can influence investment recommendations in a controlled manner. Applications of the above normative models cause several difficulties. The main difficulty is the high formal and computational complexity of the tasks involved in determining the membership function of an imprecise recommendation. This computational complexity is the price we pay for the lack of detailed assumptions about the return rate. On the other hand, low logical complexity is an important attribute of each formal model. In this paper, the main cognitive result is a general methodology for imprecise investment recommendations. Moreover, the paper offers an original generalization of the financial equilibrium criteria and of the safety-first criteria to the intuitionistic fuzzy case.
Acknowledgements
The project was supported by the funds of the National Science Centre, Poland, granted on the basis of decision number DEC-2012/05/B/HS4/03543.
References
[1] Atanassov K., Stoeva S.: Intuitionistic fuzzy sets. In: Proceedings of Polish Symposium on Interval and Fuzzy Mathematics (Albrycht J., Wiśniewski H., eds.). Poznań, 1985, 23-26.
[2] Echaust K., Piasecki K.: Black-Litterman model with intuitionistic fuzzy posterior return. SSRN Electronic Journal (2016).
[3] Jensen M.C.: Risk, the pricing of capital assets, and the evaluation of investment portfolios. Journal of Business 42, 2 (1969), 167-247.
[4] Kataoka S.: A stochastic programming model. Econometrica 31, 1/2 (1963), 181-196.
[5] Zhang Q., Jia B., Jiang S.: Interval-valued intuitionistic fuzzy probabilistic set and some of its important properties. In: Proceedings of the 1st International Conference on Information Science and Engineering ICISE2009. Guangzhou, 2009, 4038-4041.
[6] Piasecki K.: On imprecise investment recommendations. Studies in Logic, Grammar and Rhetoric 37, 50 (2014), 179-194.
[7] Piasecki K.: On return rate estimated by intuitionistic fuzzy probabilistic set. In: Mathematical Methods in Economics MME 2015 (Martincik D., Ircingova J., Janecek P., eds.). Publishing House of the Faculty of Economics, University of West Bohemia, Plzen, 2015, 641-646.
[8] Roy A.D.: Safety first and the holding of assets. Econometrica 20 (1952), 431-449.
[9] Sharpe W.F.: Mutual fund performance. Journal of Business 39 (1966), 119-138.
[10] Telser L.G.: Safety first and hedging. The Review of Economic Studies 23, 1 (1955), 1-16.
[11] Treynor J.L.: How to rate management of investment funds. Harvard Business Review 43 (1965), 63-75.
[12] www.marketwatch.com/tools/guide.asp (access 01.09.2013).
[13] Zadeh L.: The concept of a linguistic variable and its application to approximate reasoning, Part I. Information Sciences 8 (1975), 199-249.
A decent measure of risk for a household life-long financial plan – postulates and properties
Radoslaw Pietrzyk1, Pawel Rokita2
Abstract. A measure of risk that is well suited to a life-long financial plan of a household shall differ from other popular risk measures, which have originally been constructed for financial institutions, investors or enterprises. It should address threats to accomplishment of the household's life objectives and must take into account its life cycle. The authors of this research have recently proposed several candidates for such measures. This article presents, in turn, a discussion about general properties that should be fulfilled by a risk measure in household financial planning. At the current stage of the discussion, the posited postulates are rather of a qualitative nature, but they may also serve in the future as a conceptual background of a set of more formalized ones. They may be a kind of analogue of the conditions set by Artzner, Delbaen, Eber and Heath (coherent measure of risk) or by Föllmer and Schied (convex measure of risk). They should, however, better comply with the life-long household financial planning perspective. This means, amongst others, including multiple financial goals and their timing. Also threats to the aim of life standard preservation should be reflected.
Keywords: household financial planning, coherent measures of risk, integrated risk measure, life cycle.
JEL Classification: D14
AMS Classification: 91B30
1 General types of risk in household financial planning
In [6] we presented a number of proposals of household financial plan risk systematizations, with regard to different sets of criteria. That part of the discussion was concluded with the following list of risk types:
1. Life-length risk;
2. Risk of investment and financing;
3. Income risk;
4. Risk of events (insurance-like events);
5. Risk of goal realization;
6. Operational risk of plan management (particularly risk of plan implementation);
7. Model risk.
The choice of this list is dictated, to some extent, by the construction of the household financial plan optimization model that is used in [6], but, at the same time, it reflects well the real-life risk situation of households. When formulating this systematization, the following elements were investigated for each type of risk: the most natural candidate for the risk variable, the way in which the given type of risk influences household finance, the way it is or may be taken into account in our household financial plan optimization model (described in [9], [8] and [6]), the way in which it is or may be measured, and the possibility of steering of this risk by the household. The types of risk that are present in household financial planning are very different in their nature (different risk variables, different risk factors influencing the risk variables, different statistical properties of the risk factors). This heterogeneity is, however, not the only problem of household risk analysis. As in some other situations when the success of a project is threatened by many risk factors of a different nature, it is very important to avoid falling into the trap of spurious local risk mitigation, that is, the fault of such local reduction of particular risk types which does not contribute to a successful overall risk reduction. This problem is the main motivation for using integrated risk measures. The existing risk measures that are used in integrated risk management of financial institutions and enterprises cannot be directly transferred to household financial planning. The main reason of this
1 Wrocław University of Economics, Department of Financial Investments and Risk Management, ul. Komandorska 118/120, 53-345 Wrocław, radoslaw.pietrzyk@ue.wroc.pl.
2 Wrocław University of Economics, Department of Financial Investments and Risk Management, ul. Komandorska 118/120, 53-345 Wrocław, pawel.rokita@ue.wroc.pl.
fact is that household financial plans need a different measure of success (or failure) than a financial institution or an enterprise. Important concepts underlying most risk models are the risk variable and the benchmark. A risk variable is a random variable whose value may be interpreted as a measure of success (or failure) in a decision problem under risk. In the classical decision-making theory it is just the payoff (or sometimes regret, i.e., opportunity loss). In risk analysis for financial institutions it is the value (e.g., of an investment portfolio or of the whole institution). In risk analysis for enterprises it is usually net cash flow or earnings. The benchmark, in turn, may be just the expected value of the risk variable, or some other distinguished value, like, for example, a desired or planned one. Generally speaking, the vast majority of popular risk measures describe the potential behaviour of the risk variable in relation to the benchmark. What is, then, the risk variable and the benchmark for measuring risk in household financial planning? We propose to use cumulated net cash flow as a risk variable. We do not impose any particular benchmark yet. Different measures of risk may use different benchmarks. Our concern is rather whether the risk measures fulfil our postulates on a decent integrated risk measure for household financial planning. Moreover, the nature of the problem is such that the behaviour of the cumulated net cash flow during the whole life of the household needs to be taken into consideration. A measure of risk should, thus, look at the whole term structure of the cumulated net cash flow, and, of course, it must do it for various scenarios. Certainly, from the point of view of risk analysis, the most important values of the net cash flow process are the negative ones, called shortfalls. In the process of the cumulated net cash flow, there may also occur cumulated shortfalls, which are more serious than just one-period shortfalls. The most serious ones are unrecoverable shortfalls. An unrecoverable shortfall may be defined as one after which, under a given scenario, the cumulated net cash flow is negative and remains negative until the end of the household (for this scenario). The ability of the household to recover from shortfalls depends not only on their sizes, but also on their timing. This is because some moments in the life cycle of a human, as well as in the life cycle of a household, are better and some others are worse for facing a financial shortfall (age, stage of the career, responsibility and obligations towards other persons, etc.).
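The notions of a shortfall and of an unrecoverable shortfall can be operationalized directly on a scenario trajectory. The following minimal sketch (illustrative data, not the authors' model) cumulates periodic net cash flows and flags whether the scenario ends in an unrecoverable shortfall.

```python
import numpy as np

def classify_shortfalls(net_cash_flows):
    """Cumulate a scenario's periodic net cash flows and classify shortfalls:
    returns the cumulated trajectory, whether any shortfall occurs, and
    whether the last shortfall is unrecoverable (stays negative to the end)."""
    cumulated = np.cumsum(net_cash_flows)
    shortfall = cumulated < 0
    if not shortfall.any():
        return cumulated, False, False
    last_negative = np.where(shortfall)[0].max()
    unrecoverable = last_negative == len(cumulated) - 1   # never recovers
    return cumulated, True, bool(unrecoverable)

# Illustrative scenario: income minus consumption and goal expenditures per year
scenario = np.array([5.0, 4.0, -6.0, -4.0, 3.0, 2.0, 2.0])
cumulated, has_shortfall, unrecoverable = classify_shortfalls(scenario)
print(cumulated, has_shortfall, unrecoverable)   # recoverable shortfall in year 4
```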
2 Specificity of household financial planning
Before presenting some ideas concerning the risk measures, let us discuss the general properties of household finance and household financial planning that make risk measurement for them different from that of financial institutions and enterprises. An important term in the discussion on risk in household financial planning is a financial goal. In [10] we proposed to distinguish a general, qualitative, category of life objectives and a quantitative, cash-flow-based, category of financial goals. Life objectives are just the ideas on how the household members want their life to look in the future. Financial goals are understood as the goals of being prepared for expenses that are necessary to accomplish the life objectives. They are defined much more precisely than the corresponding life objectives. Namely, they are given in terms of pre-planned magnitudes and pre-planned moments of negative cash flows. The actual making of the expenditures is, in turn, called realization of the financial goals. Financial planning deals with financial goals, which are financial representations of life objectives. The main features of household finance that distinguish households from other entities include the following:
1. Human life-cycle concerns, especially:
   a) long-term planning,
   b) human-capital-to-financial-capital transfer (see, e.g., [5]),
   c) the fact that labour-income-generating potential (i.e., human capital) is developed years before one may start to use it (childhood, youth, early career),
   d) parenthood considerations – in most cases, an important part of the human life cycle is the period of parenthood, during which significant changes both to income structure and consumption patterns are imposed,
   e) the fact that labour-income-generating potential (human capital) is usually depleted long before human life ends (a vast part of a personal financial plan must assume non-labour sources of income),
   f) the specificity of the retirement goal (relatively big magnitude and lack of post-financing possibility).
2. Multitude of financial goals and ways of their financing.
3. All-goal-oriented planning (the general aim of life-long financial planning is accomplishing all financial goals, if possible, with a special emphasis on the retirement goal, as pointed out in point 1.e, rather than maximization of value or profit).
4. Multi-person concerns:
   a) the role of internal risk sharing and capital transfer between household members (compare [7], [2]),
   b) the problems of multivariate survival process modelling (compare [4]).
5. Specificity of household preferences with regard to goals, especially:
   a) inapplicability of the assumption of separable preferences to household financial goals,
   b) lack of preference transitivity, and thus – impossibility of defining a hierarchy of goals (except some simple cases of few non-complementary goals).
6. Specificity of household preferences with regard to intertemporal choice:
   a) inapplicability of the simple rule that the nearer future is more important than the distant future in preference modelling for the cumulated net cash flow term structure (what makes sense with regard to consumption time preference would be an absurd assumption when considering cumulated surplus or cumulated shortfall timing) – see [6],
   b) the role of age-related costs of financing (see [10]) in the modelling of time preferences.
7. Specificity of household risk:
   a) many types of risk; risk factors of different nature (in the sense of their economic interpretation) and with different statistical properties (distributional models, dynamics, observability, etc.),
   b) no consensus on how to integrate the types of household risk in one model,
   c) no obvious success-failure measure(s) to serve as risk variable(s), nor any unambiguous candidate for a benchmark that could be interpreted as an indicator of a normal (typical) or desired state,
   d) special role of life-length risk (and of the survival models that are used in its modelling – compare the Yaari model [11], and also its numerous modifications and augmentations),
   e) very long risk management horizon.
Taking the above-listed features into consideration, and building on some more general requirements that are set for any risk measure, we made an attempt to formulate a list of postulates that a household financial plan risk measure should fulfil. Before that, let us briefly describe the input and output of the cumulated net cash flow model, which underlies the model of risk.
3 The model of household's cumulated net cash flow
The first step in defining the measure of risk is describing the risk variable. We propose to use cumulated net cash flow in this role. The measure of risk will look at its whole term structure throughout the lifespan of the household. Of course, the length of the analysed period may be different in each survival scenario. A general scheme of the cumulated net cash flow model is presented in Figure 1.
Figure 1 Household cumulated net cash flow model
Note (fig. 1): Examples of risk factors: D1 – the moment of death of the Person 1, D2 – the moment of death of the Person 2, RM – expected rate of return on equity portfolio, η – risk-free rate, STB – real estate price at the moment of household end (moment of bequeathing). Risk aversion parameters: γ, δ – the numbers of years before and after the expected date of death, respectively, that determine bounds of the subset of all possible survival scenarios which the household wants to include into the optimization procedure. Other preferences: α, β – parameters describing propensity to consume and bequest motive.
One may distinguish three blocks of input in the scheme of input and output of the cumulated net cash flow model. These are:
• risk factors, which are random variables (left input segment of Figure 1);
• constraints (central input segment in Figure 1), amongst which the financial goals of the household belong;
• parameters (right input segment in Figure 1).
Then, when the financial plan is optimized, some of the parameters from the right input segment are no longer fixed parameters, but they become decision variables of the optimization procedure. For instance, in the basic variant of the model, described in [9], the decision variables are:
- consumption rate on the first day of the plan,
- consumption-investment proportion,
- retirement investment contribution division (given as the proportion of the household's total retirement investment expenditure per period which is contributed to the private pension plan of the Person 1).
Each realization of the random vector from the left input segment (risk factors) is a scenario. As the scenarios contain realizations of several risk factors, corresponding to different types of risk, the resulting bunch of term structures (trajectories) of cumulated net cash flow also contains the information about these risk types. Moreover, as accomplishment of all financial goals is imposed as a constraint, any shortfalls occurring in the term structure of cumulated net cash flows indicate some problems with financial goal realization. But a negative cumulated net cash flow (cumulated shortfall) is not always a threat to the household. And even when it is, its severity is not always the same. The severity of the shortfalls depends not only on their sizes but also on their timing. Figures 2-5 show pairs of cumulated net cash flow term structures which show some similarities but are very different from the practical point of view.
Figure 2 Cumulated net cash flow term structures with positive and equal final wealth
Note (fig. 2): The left plot shows a cumulated net cash flow term structure with no shortfall and the right plot – with a recoverable shortfall somewhere along the line; in both cases the household leaves the same bequest.
Figure 3 Cumulated net cash flow term structures with equal shortfalls at the end
Note (fig. 3): Both plots show cumulated shortfalls at the end; the shortfalls are of the same size, but the periods during which the household remains under shortfall are of different lengths (the situation of the household whose cumulated net cash flow is presented on the right panel starts to deteriorate two years earlier, but its deterioration speed is lower).
Figure 4 Cumulated net cash flow term structures with equal sums of discounted shortfalls
Note (fig. 4): The sums of the shortfalls, taking the time value of money into account, are equal, but one big shortfall in old age is much more severe than permanent small shortfalls, which might be, for instance, the result of a rolled-over debit of a modest size (not recommended but also not very dangerous).
Figure 5 Cumulated net cash flow term structures with single instances of shortfalls of the same size (time value of money taken into account), but with different timing
Note (fig. 5): Here, the shortfall occurs only once and in both cases its value is the same (time value of money taken into account), but, in the case presented in the left plot, the household recovers from it (the plan is rather successful – at least under this particular scenario), whereas the shortfall from the right plot is unrecoverable (the plan has failed).
From Figures 2-5 one may grasp an idea of how important the shape of the cumulated net cash flow term structure is. This means that the risk measure should be defined for term structures, not just for point values of cumulated surplus or shortfall.
4 Postulates on a household financial plan risk measure
In 1999 Artzner et al. [1] introduced the notion of coherent measures of risk, extended then by Föllmer and Schied [3] to the concept of convex measures of risk. The logical consistency and straightforward financial interpretation of the convex risk measure concept make it attractive both for theoreticians and practitioners of finance. Unfortunately, it has no direct implementation in the case of a life-long financial plan for a household. This is mainly because risk realization in households is hardly ever understood as a loss of value of some assets. This may be a good interpretation for some subtypes of household risk, but it does not apply to the integrated risk of the whole financial plan. Here we make an attempt to identify the requirements that should be met by an integrated risk measure for a household whole-life financial plan. At the current stage of our research it is not a formalized definition of such a measure yet, but rather a list of features that it should possess. The features that a decent integrated risk measure for household financial planning should have are as stated in Postulate 1.
Postulate 1. Postulates on an integrated risk measure for household financial plans:
Postulate 1.1. (An analogue of subadditivity) The value of the risk measure of a common plan for a two-person household is lower than or equal to the sum of the values of this measure for two plans of the two persons (treated separately in the financial sense) – assuming that other conditions are unchanged.
Postulate 1.2. (Transitivity) If a plan A is not more risky than B and B is not more risky than C, then the plan A is also not more risky than C.
Postulate 1.3. The measure should reflect the size of a potential shortfall.
Postulate 1.4. The measure should be sensitive to the phase of the household life cycle in which the potential shortfall will be encountered.
Postulate 1.5. The measure should reflect the length of the period (periods) during which the potential shortfall will be faced by the household.
Postulate 1.6. The measure should incorporate the information about the probability of shortfalls (e.g., the number of scenarios with shortfalls and the probabilities of these scenarios).
Postulate 1.7. The measure should be in conformity with the interpretation that a scenario which is ending with a surplus (over a benchmark) is better than a scenario ending with a shortfall (below the benchmark).
In addition, a postulate referring to plan optimization is needed. The risk optimization procedure must be consistent with the household preference model and with the risk measure.
Postulate 2. The higher the risk aversion declared by the household members, the less risky the optimal plan. The inverse does not hold, because a lower risk aversion should not automatically mean a more risky optimization outcome (the resulting plan may indeed be more risky, but only if the expected value of a measure of success for this more risky outcome is higher than for a less risky one).
Postulate 1.1 says that risk sharing within a household (obtained through internal capital transfers) causes a reduction of risk in the financial plan, or at least does not increase it. Postulate 1.2 guarantees basic logical consistency. Postulates 1.3-1.7 all together impose the requirement that the measure looks at the whole process of cumulated net cash flows, takes into account the shapes of its trajectories under all considered scenarios, recognizes shortfalls as instances of risk realization, and depends on the timing of shortfalls and on how likely a shortfall-generating scenario is. The last postulate (1.7) is that the risk measure should also reflect the amount of residual wealth (that may be bequeathed). This, roughly speaking, means that those plans for which the residual wealth is more likely to be negative (or below some benchmark) are more risky than those that rather end with a surplus.
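To illustrate how a trajectory-based measure could respond to Postulates 1.3-1.6 at once, a deliberately simple sketch is given below. It is not one of the candidate measures proposed by the authors; the life-cycle weights, scenario probabilities and cash flows are all invented, and the weighting scheme is only one possible way of expressing sensitivity to shortfall size, timing and duration.

```python
import numpy as np

def plan_risk(scenario_cash_flows, scenario_probs, life_cycle_weights):
    """Illustrative trajectory-based risk score: for every scenario, cumulate
    the net cash flows, weight each period's shortfall by a life-cycle weight
    (Postulates 1.3-1.5), and average over scenarios with their probabilities
    (Postulate 1.6).  Lower values indicate a less risky plan."""
    score = 0.0
    for cash_flows, prob in zip(scenario_cash_flows, scenario_probs):
        cumulated = np.cumsum(cash_flows)
        shortfalls = np.minimum(cumulated, 0.0)            # negative part only
        weights = life_cycle_weights[: len(cumulated)]
        score += prob * float(-(weights * shortfalls).sum())
    return score

# Two illustrative survival scenarios with different lengths and probabilities;
# later periods get larger weights (a shortfall in old age is more severe).
scenarios = [np.array([4.0, -6.0, 1.0, 3.0, 2.0]),
             np.array([4.0, 2.0, 1.0, -3.0, -2.0, -1.0])]
probs = [0.6, 0.4]
weights = np.linspace(1.0, 3.0, 10)
print(plan_risk(scenarios, probs, weights))
```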
5 Conclusions
A success or failure of a life-cycle-spanning financial plan depends on the household's ability to accomplish multiple life objectives in the framework of the plan, under different scenarios. The life objectives may find their financial interpretation in the household's ability to cover the expenses that are necessary to realize them. The aim of being prepared to spend a defined sum of money at a pre-planned moment is called here a “financial goal”, and the actual expense itself is called “realization of the financial goal”. If, under a considered set of scenarios, realization of the financial goals does not harm household finance, the plan is not risky within this set of scenarios. Provided that the full realization of all financial goals is imposed, the plan is risky if there exists a subset of the considered set of scenarios in which the household would incur shortfalls. Severity of the shortfalls depends on how unlikely it is to recover from them, which, in turn, depends on their timing. The timing of shortfalls must be considered in relation to the phases of the household life cycle, taking into account the dates of some important events in the lives of its members. This refers, moreover, not only to the expected scenario, but to a whole bunch of considered scenarios. No classical measure of risk fulfils these conditions. We put forward some proposals of candidates for integrated risk measures of a household financial plan in [10] and [6], but no postulates on such measures have been explicitly formulated so far. Therefore, no criteria existed against which the measures could be evaluated. Now, with the postulates that are proposed here, the direction of further research may be to check whether the proposed measures fulfil the requirements. If none of them conforms to all the conditions, then the next research objective will be to develop one that meets these requirements.
References
[1] Artzner, P., Delbaen, F., Eber, J. M., and Heath, D.: Coherent Measures of Risk. Mathematical Finance 9(3), 1999, 203–228.
[2] Brown, J. R. and Poterba, J. M.: Joint Life Annuities and Annuity Demand by Married Couples. Journal of Risk and Insurance 67(4), 2000, 527–554.
[3] Föllmer, H. and Schied, A.: Convex measures of risk and trading constraints. Finance and Stochastics 6(4), 2002, 429–447.
[4] Hurd, M. D.: Mortality Risk and Consumption by Couples. NBER Working Paper Series, 7048, 1999.
[5] Ibbotson, R. G., Milevsky, M. A., Chen, P., and Zhu, K. X.: Lifetime financial advice: human capital, asset allocation, and insurance. Research Foundation of CFA Institute, Charlottesville, Va., 2007.
[6] Jajuga, K., Feldman, Ł., Pietrzyk, R., and Rokita, P.: Integrated Risk Model in Household Life Cycle. Publishing House of Wrocław University of Economics, Wrocław, 2015.
[7] Kotlikoff, L. J. and Summers, L. H.: The Role of Intergenerational Transfers in Aggregate Capital Accumulation. Journal of Political Economy 89(4), 1981, 706–732.
[8] Pietrzyk, R. and Rokita, P.: Facilitating Household Financial Plan Optimization by Adjusting Time Range of Analysis to Life-Length Risk Aversion. In: Analysis of Large and Complex Data (Wilhelm, A. F. X. and Kestler, H. A., eds.), Springer International Publishing, Cham, 2016 (in press).
[9] Pietrzyk, R. and Rokita, P.: On a concept of household financial plan optimization model. Research Papers of Wrocław University of Economics, 381, 2015, 299–319.
[10] Pietrzyk, R. and Rokita, P.: Stochastic Goals in Household Financial Planning. Statistics in Transition. New Series 16(1), 2015, 111–136.
[10] Pietrzyk, R. and Rokita, P.: Integrated measure of risk in a cumulated-surplus-based financial plan optimization model for a 2-person household. In: 33rd International Conference Mathematical Methods in Economics. Conference Proceedings (Martinčík, D., Ircingová, J. and Janeček, P., eds.), Cheb, 2015, 647–652.
[11] Yaari, M. E.: Uncertain Lifetime, Life Insurance, and the Theory of the Consumer. The Review of Economic Studies 32(2), 1965, 137–150.
Optimum classification method of critical IT business functions for efficient business impact analysis
Athanasios Podaras1
Abstract. Business Impact Analysis (BIA) helps develop business recovery objectives by determining how disruptions affect various organizational activities. The criticality ranking of a given IT business function is an informally conducted task during the BIA procedure. Some experts underline that various market intermediaries relate the criticality of a given function to its estimated recovery time or to the impact that an outage would have from a financial, customer or other aspect. Other experts state that a high criticality ranking is related to business functions that contain intricate and complex procedures. The current paper introduces a standard mathematical decision making approach for the classification of critical business functions based on their complexity. The method is based on the Use Case Points method for software complexity estimation. Its formal illustration is based on decision tree classification algorithms implemented with the R-Studio software. The approach aims at a simple and effective conduct of the BIA process for a given business function.
Keywords: business continuity management (BCM), business impact analysis (BIA), critical business functions, complexity, decision tree classification
JEL Classification: C88
AMS Classification: 68W01
1 Introduction
Business Continuity is the management of a sustainable process that identifies the critical functions of an organization and develops strategies to continue these functions without interruption or to minimize the effects of an outage or a loss of service provided by these functions [15]. Miller and Engemann [12] mention that the BCP process involves analyzing the possible threats, crisis events, the risks, and the resulting business impact, and, from this analysis, developing and implementing strategies necessary to ensure that the critical components of an organization function at acceptable levels. The importance of a fully organized Business Continuity Management policy in modern organizations is remarkably highlighted by multiple researchers and practitioners. Ashurst et al. [2] state that ensuring Business Continuity (BC) is a major strategic objective for many organizations. Antlová et al. [1] mention that BCM is considered to be one of the key elements of ICT competencies. An important part of the BCM process is the implementation of a thorough Business Impact Analysis (BIA). Business Impact Analysis (BIA) helps develop business recovery objectives by determining how disruptions affect various organizational activities [11]. Moreover, one of the most important parts of BIA is the efficient prioritization of the company's objectives. This task is performed so that the most critical business functions will be restored first and within an accepted timeframe. Despite the significance of this task, modern enterprises do not follow a standard classification method for their critical business functions. On the contrary, the criticality estimation of a given business function is a vaguely performed task during the Business Impact Analysis (BIA) process, since the BIA process itself requires interviews with senior managers who may have different perspectives [5]. The current paper introduces a modern approach for the standard and optimum classification of core enterprise IT business functions. The contribution involves mathematical equations as well as decision making algorithms for its standardization.
2 Background-Problem statement
One of the crucial steps of the business continuity management process is the determination of the most critical business functions of an organization, as well as of the resumption timeframes of these functions, in a proactive and predictive manner. Based on a definition of the European Banking Authority [4], a function is a structured set of activities, services or operations that are delivered by the institution or group to third parties (including its clients, counterparties, etc.). The concept of “critical function” is intrinsically linked to the concept of the underlying services that are essential to the setting up of the deliverable services, relationships or operations to third parties.
1 Technical University of Liberec, Economics Faculty, Department of Informatics, Voroněžská 13, Liberec 1, 460 01, athanasios.podaras@tul.cz
When mathematical models and software tools are proposed towards the implementation of a Business Continuity/Disaster Recovery policy, they do not include any accurate or standard procedure for the criticality ranking of the business functions. The International Organization of Securities Commissions (IOSCO) [9] underlines that various market intermediaries in different countries relate the criticality of a given function to the estimated recovery time, while others support that criticality is related to the impact that an outage would have from a financial, customer, reputational, legal or regulatory aspect. Another study performed by the Rowan University Business Continuity Management program (Rowan University BCM Policy, 2014) [14] underlines that a high criticality ranking is related to business functions that contain intricate and complex procedures. Still, the diversity of opinions towards the criticality ranking of organizational functions reveals the lack, as well as the necessity, of standard classification procedures based on strong mathematical methods and decision-making algorithms. Considering the above statements, the author of the current work introduces a modern method for the optimal classification of IT business functions in an enterprise. The core idea behind the contribution is that the criticality ranking of a given business function is based on its complexity. Organizational literature has considered complexity as an important factor influencing organizations [7]. The method follows the principles of the Use Case Points [10] method for estimating software complexity. The basis of the current contribution is to classify a business function according to the Unadjusted Points value as calculated by Karner. The algorithmic procedure for deriving a precise criticality ranking for any business function follows the rules of binary classification decision trees. The optimum criticality ranking of a given function will lead to its reasonable and objective business impact analysis, including the estimation of its recovery time as well as the recommendation of appropriate types of business continuity exercises based on various types of recovery scenarios.
3 Methods and tools
3.1 Calculation of the unadjusted points of a business function according to Use Case Points
According to Karner [10], the first part of the Use Case Points method for estimating software complexity is devoted to the calculation of the Unadjusted Points. This part of the approach classifies Actors and Use Cases and estimates the total value of unadjusted points. Of course, this calculation alone is not enough for estimating the required time for the development of a software application, and Karner introduced various factors (Technical and Environmental) which are also taken into consideration throughout the process. However, this value is considered as an initial software complexity index. The current contribution applies similar principles to estimate an initial index for the complexity of an IT business function. The differentiation from the Use Case Points is that the Actors are further classified as Human Level Actors and Application Level Actors. Moreover, the Use Cases are now replaced by Business Functions. The classification of the actors and the business functions is depicted in Table 1.

Human Level Actors                  Application Level Actors                                                   Actor's Weight   Business Function                           Weight of a Business Function
Simple (Employee)                   Simple (External System with a defined API (Application Interface))       0.5              Simple (<=3 business processes)             0.5
Average (Supervisor)                Average (External system interacting through a protocol such as TCP/IP)   1                Average (>=4 and <=7 business processes)    1
Complex (Business Manager/Expert)   Complex (Person interacting via GUI (Graphical User Interface))           1.5              Complex (>7 business processes)             1.5

Table 1 Classification of actors and business functions according to the contribution

The derived equations of the contribution are similar to the equations of the Use Case Points. Thus, the Unadjusted Points are formulated as follows:
UHW = Σ_{i=1}^{n} (A_{1i} × W_i)    (1)

where UHW is the Unadjusted Human Weight value, A_{1i} is Human Level Actor i, and W_i is the Actor's i-th Weight, in order to compute Human Level Actors,

UAPW = Σ_{i=1}^{n} (A_{2i} × W_i)    (2)

where UAPW is the Unadjusted Application Weight value, A_{2i} is Application Level Actor i, and W_i is the Actor's Weight, and

UBFW = Σ_{i=1}^{n} (BP_i × W_i)    (3)

where UBFW is the Unadjusted Business Function Weight, n is the number of included Business Processes, BP_i is the type of the given Business Process i and W_i is the Weight of the corresponding Business Process. Based on the above equations, it is concluded that the total number of unadjusted points is provided by the sum of the UHW, UAPW and UBFW values, i.e. UBFRP = UHW + UAPW + UBFW. The derived value is entitled Unadjusted Business Function Recovery Points (UBFRP). The word recovery indicates that this value will be considered for the further estimation of the demanded recovery time effort. The detailed estimation of the recovery time is described by the author in [13]. The derivation of the UBFRP value helps in an initial, rapid and easy but standard classification of a given business function according to its complexity. The precise classification of the business function is described in the results section of the current paper. For the estimation of the approximate recovery time in case of an unexpected failure, more factors have to be considered and their description is beyond the scope of this article.
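A minimal R sketch of equations (1)-(3) may look as follows; the weights come from Table 1, while the function name and the example counts are illustrative assumptions of this edit and are not part of the published method:
weights <- c(simple = 0.5, average = 1, complex = 1.5)   # weights of Table 1
# counts are (simple, average, complex) triples for actors and business processes
ubfrp <- function(human, application, processes) {
  uhw  <- sum(human * weights)         # equation (1)
  uapw <- sum(application * weights)   # equation (2)
  ubfw <- sum(processes * weights)     # equation (3)
  uhw + uapw + ubfw                    # UBFRP = UHW + UAPW + UBFW
}
ubfrp(c(1, 1, 1), c(1, 1, 1), c(1, 1, 1))   # Simple scenario of Table 3, gives 9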
3.2 General classification of IT business functions
Gibson [6] classifies critical IT business functions into four (4) different levels, namely Impact Value Levels, based on two important values, the Recovery Time Objective (RTO) and the Maximum Accepted Outage (MAO). These values indicate a logical and a maximum timeframe within which an IT business function should be brought back to its normal operation. The classification of Gibson is depicted in Table 2. The initial estimation of the complexity (UBFRP) value of a given function enables us to classify it rapidly according to these rules.

Levels   Importance/Criticality of BF                 RTO                   MAO
L1       BF should operate without interruption       <2 hours              =2 hours
L2       BF may be interrupted for a short period     <24 hours             =24 hours
L3       BF may be interrupted for 1 or more days     <72 hours (3 days)    =72 hours (3 days)
L4       BF may be interrupted for extended period    <168 hours (1 week)   =168 hours (1 week)

Table 2 Classification of IT business functions based on RTO and MAO timeframes
3.3 Algorithmic expression of the method via binary classification decision trees
The author utilizes a decision tree in order to depict the algorithmic decision-making process towards the classification of an IT business function based on its criticality. Decision tree learning is one of the most widely used techniques. Its accuracy is competitive with other learning methods and it is very efficient. The learned classification model utilized in our case is the Classification and Regression Tree (CART) approach, which is introduced and described in detail by Breiman et al. [3]. Grąbczewski [8], in his recent study, mentions that the method is designed to be applicable to both classification and regression problems. The algorithm is nonparametric and creates binary trees from data described by both continuous and discrete features. The same study [8] mentions that the exhaustive search for the best splits estimates split qualities with the impurity reduction criterion, with impurity defined as the so-called Gini (diversity) index. The decision tree built by the CART algorithm is always a binary decision tree (each node has only two child nodes). The specific index is provided by the following formula:

Gini(T) = 1 − Σ_{j=1}^{n} p(j)^2    (7)

where p(j) is the relative frequency of class j in T, and T is the dataset which contains examples of n classes. For the induction of the decision tree, the node split is automatically derived with the help of the R-Studio software.
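A minimal R sketch of the Gini index (7), with purely illustrative class labels, may look as follows:
gini <- function(labels) {
  p <- table(labels) / length(labels)   # relative frequencies p(j) of the classes in T
  1 - sum(p^2)                          # formula (7)
}
gini(c("L1", "L2", "L2", "L3", "L4"))   # impurity of a mixed node
gini(rep("L4", 5))                      # a pure node has zero impurity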
4 Results
4.1 Initial estimation of the UBFRP Values
The first part of the current work involves the estimation of specific values regarding the Unadjusted Business Function Recovery Points (UBFRP), as described in Section 3.1 of the paper. The calculations have been performed with the creation of formulas in Microsoft Excel 2013. The inferred results are depicted in Table 3. The table includes the most representative UBFRP values which indicate changes in the decision-making process, regarding not only the classification of the process as critical or non-critical, but also the assignment of the precise Level of importance and approximate RTO/MAO values according to Table 2. The calculations include a Simple, an Average and a Complex representative recovery scenario regarding IT business functions in case of their operational failure. The Simple scenario involves 1 Simple Human Level Actor, 1 Average Human Level Actor and 1 Complex Human Level Actor (1, 1, 1). In a similar way, the Average scenario is a (1, 2, 2) Human Level Actor combination and the Complex scenario involves a (1, 3, 3) Human Level Actor combination. The three scenarios have corresponding combinations for Application Level Actors and Business Function categories.

Recovery Scenario   UHW       UAPW      UBFW      UBFRP
Simple              (1,1,1)   (1,1,1)   (1,1,1)   9
Average             (1,2,2)   (1,2,2)   (2,2,2)   15
Complex             (1,3,3)   (1,3,3)   (3,3,3)   21

Table 3 UBFRP values of the representative recovery scenarios
4.2 Decision Tree induction in R-Studio
The specific software tool has been selected by the author for decision tree induction due to the fact that it supports tree induction with various algorithms, e.g. ID3, C4.5 and CART. For the implementation of the CART algorithm, two main packages in R-Studio had to be utilized, namely the tree and the rpart packages. The author of the current article has performed both approaches, and both inferred the same result. The purpose of the research was to use the specific software tool for implementing a binary final decision on whether a business function should be classified as critical for immediate recovery or not, according to the above inferred UBFRP values. The decision tree was formulated after importing a detailed dataset with all the UBFRP values in Ms Excel, from a corresponding .txt file. The code required for installing the tree package and generating the decision tree according to the CART algorithm is depicted below:
> install.packages("tree")
> library(tree)
> t3<-tree(IVL~UBFRPVAL, data=UBFRP_VALUES_2, method="recursive.partition")
> plot(t3)
> text(t3)
> print(t3)
node), split, n, deviance, yval, (yprob)
* denotes terminal node
1) root 25 67.70 L4 ( 0.2000 0.2400 0.2000 0.3600 )
  2) UBFRPVAL < 14.5 14 18.25 L4(NO) ( 0.0000 0.0000 0.3571 0.6429 )
    4) UBFRPVAL < 9.5 9 0.00 L4(NO) ( 0.0000 0.0000 0.0000 1.0000 ) *
    5) UBFRPVAL > 9.5 5 0.00 L3(NO) ( 0.0000 0.0000 1.0000 0.0000 ) *
  3) UBFRPVAL > 14.5 11 15.16 L2(YES) ( 0.4545 0.5455 0.0000 0.0000 )
    6) UBFRPVAL < 20.5 6 0.00 L2(YES) ( 0.0000 1.0000 0.0000 0.0000 ) *
    7) UBFRPVAL > 20.5 5 0.00 L1(YES) ( 1.0000 0.0000 0.0000 0.0000 ) *
The inducted tree as well as the corresponding data.frame (dataset) are depicted in Fig. 1 and Fig. 2, respectively. The final nodes of the decision tree provide information about the impact value level of the function, as proposed by Gibson, as well as the Boolean decision about whether the given business function is critical or not (YES/NO). The root node is the UBFRP value, according to which the criticality of the given function is determined (if UBFRPVAL < 14.5, then BF_Non-Critical = YES, with Impact Value Levels L3, L4).
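The rpart package mentioned above leads to the same classification; a minimal sketch of the corresponding call (the exact arguments are an illustrative assumption of this edit, since the rpart code is not shown in the original) could be:
> install.packages("rpart")
> library(rpart)
> t4 <- rpart(IVL~UBFRPVAL, data=UBFRP_VALUES_2, method="class")
> plot(t4)
> text(t4)
> print(t4)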
Figure 1 Decision tree inducted in R-Studio
Figure 2 Dataset in Excel file for generating decision tree in R-Studio
5 Discussion
The major challenge of the current paper is the determination of the recovery complexity of a business function based on software complexity estimation. Can the two complexities be compared, and could we derive reasonable complexity measurements for an IT business function this way? From the author's standpoint the answer to this question is positive, since IT business functions rely on information systems and applications whose complexity is estimated by the Use Case Points method. Another issue of major importance is the simplicity of an approach, so as to achieve an early and proactive classification of critical IT business functions in terms of the formulation of the BIA document. Miller and Engemann [11] state that, to be effective, methods to prioritize objectives should have some analytical rigor while, at the same time, not being so complex that decision makers cannot use them. The current method is standard, follows algorithmic principles and provides reasonable results of impact value levels, through which an efficient and timely binary classification of the enterprise business functions as critical/non-critical can be determined, at least at a primary level. These elements of the presented method are highly advantageous for IT managers in modern enterprises.
6 Conclusion – Future work
The current paper introduced part of an ongoing research effort in the field of IT Business Continuity Management, which focuses on the classification of IT business functions based on their criticality. The criticality is determined based on the complexity of the business function and its corresponding recovery timeframe. The primary principle is “the more complex the business function, the less time should be afforded for its recovery”. The method is formulated according to the Use Case Points principles for software complexity estimation. Moreover, the classification of a specific IT business function is implemented according to the CART algorithm for decision tree classification. The contribution, which is addressed to the business impact analysis (BIA) formulation, is promising for enterprise business continuity management due to its simplicity and preciseness, taking into consideration that rapid decisions for business impact analysis can be taken. Future work will include software development for the specific functionality, because automation for deriving the criticality classification results is highly demanded by enterprises. The software tool will include not only a primary classification of a given IT business function, but also the precise time required for its recovery. The software tool will include both a desktop application and a web platform.
Acknowledgements The current paper is supported by the SGS Project entitled “Advanced Information and Communication Services as a Key Tool during Emergency Situation Management”, No 21142, Technical University of Liberec.
References [1] Antlová, K., Popelinský, L., & Tandler, J.: Long term growth of SME from the view of ICT competencies and web presentations, E&M Ekonomie a Management, 4 (2011), 125-139. [2] Ashurst, C., Freer, A., Ekdahl, J., and Gibbons, C.: Exploring IT enabled innovation: a new paradigm? International Journal of Information Management, 32 (2012), 326–336. [3] Breiman, L., Friedman, J.H., Olshen, R. and Stone, C.L.: Classification and regression trees. Chapman and Hall, 1984. [4] European Banking Authority (EBA). Technical advice on the delegated acts on critical functions and core business lines. EBA/Op/2015/05, 2015 [5] Gallangher, M. Business continuity management - How to protect your company from danger. Prentice Hall, 2003. [6] Gibson, D.: Managing risks in information systems. Jones & Bartlett Learning, 2010 [7] Gorzeń – Mitka, I., and Okręglicka, M.: Review of complexity drivers in enterprise. In: Proceedings of the 12th International Conference Liberec Economic Forum’15 (Kocourek, A., ed.). Liberec, 2015, 253-260. [8] Grąbczewski, K.: Meta-Learning in Decision Tree Induction. Springer International Publishing, Switzerland, 2014 [9] International Organization of Securities Commissions (IOSCO). Market intermediary business continuity and recovery planning. Consultation Report - CR04/2015, 2015 [10] Karner, G. (1993). Resource estimation for objectory projects. In: Systems SF AB. [11] Miller, H.E., and Engemann, K.J.: Using analytical methods in business continuity planning. In: Proceedings of MS 2012 (Engemann, K.J., Gil Lafuente, A.M. and Merigó, J.M., eds.). LNBIP, Springer, Heidelberg, 2012, 2-12. [12] Miller, H.E., and Engemann, K.J.: Using reliability and simulation models in business continuity planning. International Journal of Business Continuity and Risk Management 5 (2014), 43-56 [13] Podaras, A., Antlová, K., and Motejlek, J.: Information management tools for implementing an effective enterprise business continuity strategy, E&M Ekonomie a Management, 1 (2016), 165-182. [14] Rowan University. Business Continuity Management Policy, 2014 [15] Tucker, E.: Business continuity from preparedness to recovery – a standards-based approach. Elsevier Inc., 2015.
Regression Analysis of Social Networks and Platforms for Tablets in European Union
Alena Pozdílková1, Pavla Koťátková Stránská2, Martin Navrátil3
Abstract. In the last few years, two operating systems for tablets, Android and iOS, have dominated in all states of the European Union. Android is leading almost throughout the whole of Europe; it has a dominant position in all EU countries except the UK, with relatively small variations. The four social networks used on tablets with the largest number of users in Europe are Facebook, Twitter, Stumble Upon and Pinterest. The number of portable electronic devices, a major portion of which consists of tablets, is now experiencing worldwide growth. The aim of the paper is to create regression models for the individual platforms and social networks for tablets in Europe, with a focus on the European Union, and to analyse them. The result of this paper will also be the determination of the degree of sensitivity of individual parameters in the regression models.
Keywords: Social network, Regression, Tablet, European Union.
JEL Classification: C58, G21, C610
AMS Classification: 90C15
1 Introduction
The number of mobile devices more than doubled over the last two years in the Czech population. This results from a comparative analysis of data from the research of the Media project [11]. The fastest expansion over the last two years is in the distribution of tablets, whose numbers have tripled. According to this research [11], 1.9 million people aged 12-79 years, representing 22 percent of the population, currently own a tablet. In the European Union, two operating systems for tablets have dominated in the last few years: Android and iOS. Other operating systems for tablets, namely Linux and Win8.1 RT, occupy only a small part of the market [1]. Android, which has a dominant position throughout Europe, also has a dominant position in all EU countries except the UK. Judging by the standard deviations, relatively small variations can be seen across the EU; unlike Android, iOS achieves larger deviations [1]. Facebook, Twitter, Stumble Upon and Pinterest are the four social networks with the largest number of users using tablets in the European Union [11]. Social networks have been studied by many authors [6], [7]. Facebook, the most important social network with many active users all around the world, was founded in 2004. Facebook's mission is to give people the power to share and make the world more open and connected. People use Facebook to stay connected with friends and family, to discover what's going on in the world, and to share and express what matters to them. Figure 1 shows Facebook popularity and the distribution of users by age. Statistics: 890 million daily active users on average for December 2015; 745 million mobile daily active users on average for December 2015; 1.39 billion monthly active users as of December 31, 2015; 1.19 billion mobile monthly active users as of December 31, 2015; approximately 82.4% of daily active users are outside the US and Canada [2].
1 University of Pardubice, Faculty of Electrotechnics and Informatics, Department of Mathematics and Physics, Studentská 95, 532 10 Pardubice 2, Czech Republic, e-mail: alena.pozdilkova@upce.cz.
2 University of Pardubice, Faculty of Electrotechnics and Informatics, Department of Mathematics and Physics, Studentská 95, 532 10 Pardubice 2, Czech Republic, e-mail: pavla.kotatkovastranska@upce.cz.
3 University of Pardubice, Faculty of Electrotechnics and Informatics, Department of Mathematics and Physics, Studentská 95, 532 10 Pardubice 2, Czech Republic, e-mail: martin.navratil@student.upce.cz.
Figure 1 Facebook popularity and distribution of users by age [2]
The analysis in the following chapters is focused on the European Union, the states in the Eurozone and the states in the European Union with their own currency. The 19 countries of the European Union in the Eurozone are: Belgium, Estonia, Finland, France, Germany, Ireland, Italy, Cyprus, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Portugal, Austria, Greece, Slovakia, Slovenia and Spain.
2 Data and methodology
Stats are based on aggregate data collected by StatCounter [4] on a sample exceeding 15 billion pageviews per month collected from across the StatCounter network of more than 3 million websites. Stats are updated and made available every 4 hours, however are subject to quality assurance testing and revision for 14 days from publication [4]. StatCounter is a web analytics service, tracking code is installed on more than 3 million sites globally. These sites cover various activities and geographic locations. It provides independent, unbiased stats on internet usage trends, which are not collated with any other information sources. No artificial weightings are used [4].
3 Analysis for Eurozone (19 states)
The correlation analysis will be performed first in this part. A correlation matrix will be composed between the different social networks and mobile platforms [3], [5]. Subsequently, regression models will be developed in which the dependent variable is the relevant social network or mobile platform, and the independent variable is time (years 2012-2015, for all countries of the Eurozone).
3.1 Correlation Analysis for Eurozone
The correlation analysis was performed through all the years and states; marked correlations at a significance level of 0.05 are statistically significant. See Table 1 and Table 2.

variable       Facebook    Twitter     StumbleUpon   Pinterest   Other I.
Facebook       1.000000    -0.813517   -0.756457     -0.858389   -0.668029
Twitter        -0.813517   1.000000    0.512300      0.505493    0.293331
Stumble Upon   -0.756457   0.512300    1.000000      0.662972    0.564610
Pinterest      -0.858389   0.505493    0.662972      1.000000    0.734851
Other I.       -0.668029   0.293331    0.564610      0.734851    1.000000

Table 1 Results of correlation analysis – correlation matrix for social networks

variable    iOs         Android     Other II.
iOs         1.000000    -0.994559   -0.029220
Android     -0.994559   1.000000    -0.045848
Other II.   -0.029220   -0.045848   1.000000

Table 2 Results of correlation analysis – correlation matrix for platforms
The significant correlation relations among variables can be seen from the correlation coefficients in Table 1 and Table 2. The significance was detected by Statistica software version 1.12. The insignificant relationship is between variables Other I. and Twitter or Other II and Android, iOS.
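For completeness, a minimal R sketch of such a correlation analysis is given below; the data frame eurozone and its column names are illustrative assumptions of this edit, not the authors' actual dataset (which was processed in Statistica):
# one row per country and year, shares of the individual networks in %
vars <- eurozone[, c("Facebook", "Twitter", "StumbleUpon", "Pinterest", "OtherI")]
round(cor(vars), 6)                     # correlation matrix analogous to Table 1
cor.test(vars$Facebook, vars$Twitter)   # significance of one coefficient at the 0.05 level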
3.2 Regression Analysis for Eurozone
The regression models for the variables Facebook, Twitter, StumbleUpon, Pinterest, Other I, iOS, Android and Other II will be analyzed in this part [8], [10], [12]. The regression models are properly assembled; the quality of each model is verified based on the p-value of the parameter b (the parameter is statistically significant at a level of 0.05 if the p-value is less than the significance level). The quality of the entire regression model is satisfied (p-value 0.0000), and the determination index is less than the Durbin-Watson statistic, so it is not an apparent regression. This applies to all models. Facebook and Android are expected to show a growing trend according to the regression models. StumbleUpon, Pinterest and iOS are, conversely, expected to show a decreasing trend [9]. The regression coefficient indicates how much the variable changes when the time unit is increased by one. A positive sign means that the value will increase with the growth of time units; a negative value means that the variable will decrease with an increase in the time unit. See Table 3 and Table 4.

variable       Regression parameter   p-value   Confidence interval -95%   Confidence interval +95%
Facebook       0.573                  0.00000   4.305                      8.578
Twitter        -0.460                 0.00000   -4.380                     -1.656
Stumble Upon   -0.520                 0.00000   -1.401                     -0.636
Pinterest      -0.470                 0.00000   -2.577                     -1.006
Other I.       -0.330                 0.00340   -1.02                      -0.209

Table 3 Summary results of regression analysis for social networks

variable    Regression parameter   p-value   Confidence interval -95%   Confidence interval +95%
iOs         -0.610                 0.00000   -7.917                     -4.287
Android     0.575                  0.00000   3.824                      7.591
Other II.   0.450                  0.00000   0.213                      0.575

Table 4 Summary results of regression analysis for platforms

The tables above present the regression parameters "b" as well as the confidence intervals, which complement the conclusions regarding the statistical significance of the regression parameters. The regression parameters for all variables were identified as statistically significant (rejecting the null hypothesis H0: the regression parameter is insignificant), but the conclusion should be complemented by the confidence interval, i.e. by determining the interval in which the relevant parameter ranges. Growth, or average growth, can be expected in the range of values 4.305 to 8.578 for the social network Facebook. The variable Twitter is expected to decline between the values -4.380 and -1.656 in the next period. The variables StumbleUpon, Pinterest and Other I. as well as the mobile iOS platform can also be expected to decline in the next period, while Android is expected to grow.
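A minimal R sketch of one such regression model is given below; again the data frame and column names are illustrative assumptions only:
fit <- lm(Facebook ~ Year, data = eurozone)   # time (year) as the independent variable
summary(fit)                  # regression parameter b and its p-value
confint(fit, level = 0.95)    # 95% confidence interval, as in Tables 3 and 4
# library(lmtest); dwtest(fit)   # Durbin-Watson statistic mentioned in the text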
4 Analysis for European Union without Eurozone (9 states)
The correlation analysis and regression models for the years 2012-2015 for all countries outside the Eurozone (9 states) will be presented in this part.
4.1 Correlation Analysis for states out of Eurozone
Table 5 and Table 6 show marked correlations at a significance level of 0.05, which are statistically significant.

variable       Facebook    Twitter     StumbleUpon   Pinterest   Other I.
Facebook       1.000000    -0.863876   -0.758516     -0.888166   -0.836347
Twitter        -0.863876   1.000000    0.626506      0.602293    0.605628
Stumble Upon   -0.758516   0.626506    1.000000      0.614636    0.459498
Pinterest      -0.888166   0.602293    0.614636      1.000000    0.839374
Other I.       -0.836347   0.605628    0.459498      0.839374    1.000000

Table 5 Results of correlation analysis – correlation matrix for social networks

variable    iOs         Android     Other II.
iOs         1.000000    -0.996267   -0.251372
Android     -0.996267   1.000000    0.193457
Other II.   -0.251372   0.193457    1.000000

Table 6 Results of correlation analysis – correlation matrix for platforms

The significant correlation relations among variables can be seen from the correlation coefficients in Table 5 and Table 6. The significance was detected by the Statistica software, version 1.12. The insignificant relationships are between the variables Other I. and StumbleUpon, and between Other II. and both Android and iOS.
4.2 Regression Analysis for states out of Eurozone
The regression models for the variables Facebook, Twitter, StumbleUpon, Pinterest, Other I, iOS, Android and Other II will be analyzed in this part [8], [10], [12]. The regression models are properly assembled; the quality of each model is verified based on the p-value of the parameter b (the parameter is statistically significant at a level of 0.05 if the p-value is less than the significance level). The quality of the entire regression model is satisfied (p-value 0.0000), and the determination index is less than the Durbin-Watson statistic, so it is not an apparent regression. This applies to the variables Facebook, StumbleUpon, Pinterest, iOS and Android. Twitter, Other I and Other II are not statistically significant in the regression models. Facebook and Android are expected to show a growing trend according to the regression models. StumbleUpon, Pinterest and iOS are, conversely, expected to show a decreasing trend [9]. The regression coefficient indicates how much the variable changes when the time unit is increased by one. A positive sign means that the value will increase with the growth of time units; a negative value means that the variable will decrease with an increase in the time unit. See Table 7 and Table 8.

variable       Regression parameter   p-value   Confidence interval -95%   Confidence interval +95%
Facebook       0.438                  0.00810   1.262                      7.615
Twitter        -0.310                 0.063     -3.152                     0.092
Stumble Upon   -0.600                 0.00010   -1.507                     -0.551
Pinterest      -0.390                 0.02010   -2.468                     -0.221
Other I.       -0.260                 0.118     0.120                      -1.217

Table 7 Summary results of regression analysis for social networks

variable    Regression parameter   p-value   Confidence interval -95%   Confidence interval +95%
iOs         -0.450                 0.00560   -13.110                    -2.410
Android     0.437                  0.00780   2.162                      13.104
Other II.   0.096                  0.58080   -0.333                     0.586

Table 8 Summary results of regression analysis for platforms

Similar conclusions are also found in the countries that are not part of the Eurozone, despite the higher p-values and thus smaller differences from the 0.05 level of significance than in the case of the Eurozone countries. The regression parameters were not identified as statistically significant for all the variables (rejecting the null hypothesis H0: the regression parameter is insignificant); the parameters for Twitter, Other I and Other II were labeled as statistically insignificant. Growth, or average growth, can be expected in the range of values 1.262 to 7.615 for the social network Facebook. The variable Twitter is expected to decline between the values -3.152 and 0.092 in the next period. The variables StumbleUpon, Pinterest and Other I. as well as the mobile iOS platform can also be expected to decline in the next period, while Android is expected to grow. Average increases and decreases, respectively, are shown in the confidence intervals.
5 Graph results
The results from the previous chapters are illustrated by the figures below. Figures 3 and 4 show the boxplot comparison for different parts of Europe in the years 2012 and 2015 for all variables. A growing trend for most of the compared countries can be seen in the tables in the previous chapters. The values in Figure 4 are, for many countries, much higher than the values in Figure 3.
Figure 3 Boxplot comparisons for different parts of Europe in 2012
Figure 4 Boxplot comparisons for different parts of Europe in 2015
6 Discussions
The trend of the social network Facebook is surprising in comparison with the other social networks. Facebook is expected to grow, while Twitter, StumbleUpon and Pinterest are expected to decline. A similar phenomenon occurs in the case of mobile platforms: iOS is expected to decline in the upcoming period, while growth of Android is anticipated. This summary is supported by the correlation matrix, i.e. by the values of the individual correlation coefficients. The compared confidence intervals have approximately the same values for both the Eurozone countries and the countries outside the Eurozone, except for the variable Facebook. The values for the social network Facebook will grow faster in the Eurozone countries than in the countries outside the Eurozone in the upcoming period.
7 Conclusions
The popularity of social networks is experiencing worldwide growth (including people 60+). The number of users is increasing thanks to connectivity with mobile devices. It can be seen that the Eurozone states and the states with their own currency have the same trend in the upcoming period for both social networks and mobile platforms. Price is the main reason why the cheap Android dominates the market of mobile devices against the expensive iOS. Another reason is the brand of tablets. Nowadays, products from Asian manufacturers, which in most cases use the Android operating system, are popular. Among other factors affecting the sale of various mobile devices are promotions, the purchasing power of the population, popularity, reliability, price, patent charges, easy-to-install software, etc. The analysis shows that the social network Facebook is expected to grow in both compared areas – the Eurozone and the states outside the Eurozone. The variable Twitter is expected to decline in the next period. StumbleUpon, Pinterest and Other I. as well as the mobile iOS platform are also expected to decline in the coming period, while Android is expected to grow. The results of the regression models can be used as a recommendation for further development in the area. The results can be an incentive for additional marketing promotion and development within social networks. They can also be used for more complex regression models and further research.
Acknowledgements This work has been mainly supported by the University of Pardubice via SGS Project (reg. no. SG660026) and institutional support.
References [1] C news. URL: http://www.cnews.cz/clanky/android-stale-utika-konkurenci-ma-takovy-podil-ze-jej-moznazacne-vysetrovat-eu, online, [cit. 30. 4. 2016]. [2] Facebook informations and statistics. URL: https://www.facebook.com/about, online, [cit. 30.4.2016]. [3] Field, A.: Discovering Statistics Using SPSS. London, SAGE Publications, 2005. [4] Global stats, Statcounter. URL: http://gs.statcounter.com/mobile-social-media-CZ-monthly-201303201303-bar, online, [cit. 30. 4. 2016]. [5] Hebák, P. et al.: Vícerozměrné statistické metody [3]. Praha, Informatorium, 2007. [6] Jackson, M., and Wolinsky, A.: A Strategic Model of Economic and Social Networks. Journal of Economic Theory 7 (1996), 44-74. [7] Kvasnička, M.: Markets, Social Networks and Endogenous Preferences. 30th International Conference Mathematical Methods in Economics, Palacký University Olomouc, 2014. [8] Lavine, K. B.: Clustering and Classification of Analytical Data. Encyclopedia of Analytical Chemistry, Chichester, John Wiley & Sons Ltd, 2000, 9689- 9710. [9] Meloun, M., Militký, J., and Hill, M.: Počítačová analýza vícerozměrných dat v příkladech. Praha, Academia, 2005. [10] Romesburg, Ch.: Cluster Analysis for Researchers. North Carolina, Lulu Press, 2014. [11] Unie vydavatelů, Median, Stem/Mark. URL: http://www.mediaguru.cz/2015/07/tablety-a-chytre-telefonyse-v-populaci-rychle-siri/#.Vk3jUr8_aOB, online, [cit. 30.4.2016]. [12] Zou, H., Hastie, T., and Tibshirani, R.: Sparse principal component analysis. Journal of computational and Graphical Statistics 15, 2 (2006), 265-280.
Dynamics Model of Firms’ Income Tax
Pavel Pražák1
Abstract. The aim of this paper is to introduce a simple dynamic model that describes the tax revenue of a given government. The paper deals with a modification of the Ramsey model that uses the neoclassical dynamical model and determines the optimal consumption as a result of optimal intertemporal choices of individual households. For simplicity, the model considers only a tax on the income of firms. This leads to a different selection of the optimal prices of labor and capital services. The Euler equation of the optimal consumption path of the household is then introduced, as well as its interpretation. We also study long-term effects of taxation in the given model. Hence the steady state of the dynamical model is introduced and a formula for the steady state Laffer curve is derived. Having this formula, we further study the properties of the optimization of tax revenue. First, it is shown that the optimal tax rate depends on the magnitude of the elasticity of capital, which is an exogenous parameter of the neoclassical production function. Then methods of comparative statics are used to find the sensitivity of the tax revenue at the steady state of the economic unit to some parameters.
Keywords: Laffer Curve, taxation, consumption, optimal control.
JEL classification: C61, H21, E62
AMS classification: 49J15, 39C15
1 Introduction
Governments use taxes to finance their public services. Although we understand this reality well, we rather prefer low tax rates. In past decades some supply-side economists suggested tax cuts and claimed that low tax rates can stimulate investment, spending and consumption, which can support economic growth, [4], [3], [13]. Their argumentation was partially supported by the popular Laffer curve, [8]. The fact that some taxes were reduced in the last two decades in European Union countries can be documented in Figure 1. Particularly, in the Czech Republic the corporate income tax rate was gradually reduced from 40% in 1995 to 19% in 2010, [7]. Because a lower corporate tax can attract more foreign investments, it is preferred by governments that plan to improve technology equipment or that want to improve employment in the given country. However, cutting taxes without a corresponding cut in government spending can result in a continuing increase in the budget deficit, [3], [9]. It seems that the recent debt crisis in some EU countries can stop tax cut trends and can even reverse some of them. The debate on the level of tax rates is therefore still unfinished and current. It is known that taxes can change equilibrium prices and quantities of taxed goods, [9], which can influence tax revenue. We think that it is useful to construct models that allow us to study the influence of different endogenous or exogenous parameters of the model on tax revenue, [13]. The aim of this paper is to answer the partial question of how the behaviour of households and firms in an economic unit can adjust if the fiscal policy of a government changes the tax rate on firms' income.
2 Methods
A dynamic macroeconomic model that consists of household and production sectors will be considered. The production sector will be represented by a production function. The government sector will be represented only by an exogenous parameter given by the tax rate on the output of the production sector. It is considered that households maximize their total consumption over the planning period and that they are constrained by their budgets at each period. It means that their purchases can be financed by their incomes after tax.
1 University of Hradec Králové, Faculty of Informatics and Management, Department of Informatics and Quantitative Methods, Rokitanského 62, Hradec Králové, Czech Republic, pavel.prazak@uhk.cz
Figure 1 Reduction of tax rate on corporate income of EU countries in 1995 and 2013, in %, [7].
The problem of maximization of the total utility of consumption will be given as a discrete time dynamic optimization problem, [12], [2]. Such a problem can be characterized by two principal features: a) the evolution of a given macroeconomic system is described by a discrete time dynamical system, and b) a cost function that represents the demand of a planner is given. The dynamical system that describes the evolution of a state variable can be controlled by a control variable. In a model with discrete time a difference equation can be used, [11]. The aim of the optimization problem is to find the maximum (or minimum) of a given cost function with respect to the constraint that is represented by a difference equation. Since a long planning horizon will be considered, a dynamic optimization problem over an infinite horizon will be used. The advantage of this formulation is that it is not necessary to consider what happens after a finite horizon is reached. Then the necessary optimality conditions can be formulated, [6], [2], [10]. These conditions allow finding candidates for the optimal solution.
3 Model
Two main sets of agents will be considered. Households own labor and assets and receive wages and rental payments. Firms produce homogeneous output and rent labor and capital. Instead of considering all particular households or all particular firms we introduce a representative household or a representative firm. The representative household is a model that is used for the decisions of many hidden small households. Similarly, the representative firm is a model that is used for the decisions of many hidden small firms. Government will be represented only indirectly by its tax rate on output of the representative firm. There are several markets. The rental market for labor L with price w, w > 0, and capital services K with price R, R > 0. Households can borrow and lend with the interest rate r, 0 < r < 1, at the assets market. Firms sell their products at the market for final output that is considered to be competitive. The commodity price in this market is normalized to unity. The objective of the representative firm is to maximize its after-tax profit at each time period. For further analysis we consider that the revenue is non-deductible and that the profit of the firm can be expressed in the following form Π(K, L) = (1 − τ )F (K, L) − RK − wL,
(1)
where F(K, L) is a production function and τ, 0 < τ < 1, is a constant tax rate, which is an exogenous parameter given by the government. This situation corresponds to a value-added tax. For an alternative profit function and further discussion see Section 5 of this paper. The first order conditions for the problem (1) are given by the following equations

(1 − τ) ∂F/∂K − R = 0,   (1 − τ) ∂F/∂L − w = 0.    (2)
Notice that in this model the optimal prices of capital services and labor depend on the tax rate τ. If it is assumed that the production function is neoclassical with constant returns to scale, the output can be written as

Y = F(K, L) = L F(K/L, 1) = L f(k),
(3)
where k = K/L is the capital-labor ratio, y = Y/L is per capita output and f(k) = F(k, 1) is the intensive neoclassical production function with corresponding properties, [1]. Using the given notation, equations (2) can be rewritten as

(1 − τ) f′(k) = R,   (1 − τ)(f(k) − k f′(k)) = w.    (4)

Assume further that T(τ) denotes the tax revenue of the government per one period when the tax rate is τ; then

T(τ) = τ · Y = τ · L f(k).    (5)

The household's preferences can be represented by an infinite series of an additive intertemporal utility function with a constant rate of time preference

Σ_{t=0}^{∞} β^t u(c_t),    (6)
where u(.) is an intertemporal utility function with standard properties, [1], c_t is the consumption of the household at period t, β = 1/(1 + ρ) is the rate of time preference and ρ ≥ 0 is a discount rate. A low value of β indicates that the household is impatient and prefers current consumption to future consumption. If the household cares equally about the present and the future consumption, then β = 1. The household chooses a consumption plan which maximizes its total lifetime utility subject to its budget constraint. This constraint can be expressed as a difference equation for the household's assets a_t and has the form

a_{t+1} = w_t + (1 + r_t) a_t − c_t,   t ≥ 0,
(7)
with a given initial value a_0. Now the objective of the representative household can be shortly written as

max_{c_t ≥ 0} Σ_{t=0}^{∞} β^t u(c_t),   subject to   a_{t+1} = w_t + (1 + r_t) a_t − c_t.    (8)
To solve this infinite time horizon problem (8), the discrete version of the maximum principle can be used, [10]. The household can freely choose the level of consumption c_t in each period t in response to a change of the tax rate τ. It means that consumption c_t is a control variable and the assets a_t are the state variable of the given problem. First we introduce the current value Hamiltonian

H(a_t, c_t, λ_{t+1}) = u(c_t) + β λ_{t+1} (w_t + (1 + r_t) a_t − c_t),
(9)
where λ_{t+1} is the adjoint variable. Then the first-order necessary conditions can be expressed as a system of two difference equations

∂H/∂c (a_t, c_t, λ_{t+1}) = u′(c_t) − λ_{t+1} = 0,   t ≥ 0,    (10)

∂H/∂a (a_t, c_t, λ_{t+1}) = β λ_{t+1} (1 + r_t) = λ_t,   t ≥ 1.    (11)
If (10) is substituted into (11) we gain the following Euler equation

u′(c_t) = β (1 + r_t) u′(c_{t+1}),
(12)
that represents the necessary condition of the given optimal control problem. Equation (12) describes the optimal local trade-off between consumption in two succeeding periods and its intuitive interpretation is as follows. The household cannot gain anything from redistributing consumption between two succeeding periods. A one unit reduction of consumption in the first period lowers the utility in this period by u′(c_t). The one consumption unit which is saved in the first period can be converted into (1 + r_t) units of consumption in the second period and it can raise the second period utility by the discounted value β(1 + r_t) u′(c_{t+1}). The Euler equation states that these two quantities are equal at the global optimum. We will return to this equation once again later.
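As a purely illustrative special case (the paper does not fix the utility function), take logarithmic utility u(c) = ln c, for which u′(c) = 1/c; the Euler equation (12) then reduces to c_{t+1} = β(1 + r_t) c_t, so consumption grows along the optimal path whenever β(1 + r_t) > 1. A short R check of the equality of the two sides of (12) under this assumption:
beta <- 1 / (1 + 0.05)           # rate of time preference with an illustrative rho = 0.05
r <- 0.08                        # illustrative interest rate
c_t <- 1
c_next <- beta * (1 + r) * c_t   # optimal consumption rule implied by (12) for u(c) = ln(c)
c(lhs = 1 / c_t, rhs = beta * (1 + r) / c_next)   # u'(c_t) and beta*(1+r)*u'(c_{t+1}) coincide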
4 Results and Interpretations
The equilibrium of the economy is a process that is described by the quartet (ct , at , kt , yt ) and corresponding prices (wt , Rt , rt ) such that households and firms behave optimally and all markets clear. Optimization of the representative firm is given by (4). Optimization of the representative household is given by (7), (12). Because no arbitrage in asset market is considered, the return from capital stocks that depreciate at the constant rate δ, 0 < δ < 1, is equal to return from interest rate. It can be written as Rt − δ = rt .
(13)
Consider a closed economy and clearing at the asset market. The only asset in positive net supply is capital, because all the borrowing and lending must cancel within the closed economy. It means that the equilibrium in this market is characterized by the following equality kt = at .
(14)
If we use this relation, relations (4), (13) and substitute them into (7), we get kt+1 = (1 − τ )f (kt ) + (1 − δ)kt − ct .
(15)
If we use again (14), (4) and substitute them into the Euler equation (12), we get

u′(c_t) / (β u′(c_{t+1})) = (1 − τ) f′(k_t) + (1 − δ).    (16)
The left side of this equation is the household's marginal rate of substitution (MRS) between consumption in the two succeeding periods. The right side of this equation is the household's marginal rate of transformation (MRT), which is how much extra output for next period consumption results from an additional unit of saving in this period. The Euler equation states that in the optimal consumption process MRS is equal to MRT. Assume now that a steady state of the economy has already been reached and let us study the long run effects of taxation. In the steady state both the capital stock and the consumption remain constant in time. It means that k_t = k° and c_t = c° for all t, respectively. If we apply the properties of the steady state in equation (16) and use the fact that β = 1/(1 + ρ), we gain the following equation for the stationary value of capital

f′(k°) = (ρ + δ) / (1 − τ).    (17)

Knowing the stationary value of capital, relation (15) can be used and the stationary value of consumption can be found:

c° = (1 − τ) f(k°) − δ k°.    (18)

To be able to solve equation (17) we introduce the intensive Cobb-Douglas production function

y = f(k) = A k^α,
(19)
where α, 0 < α < 1, is an output elasticity of capital and A, A > 0, is the level of technology, [1]. If the derivative of (19) is substituted into (17) we get

k° = (1 − τ)^{1/(1−α)} · (Aα / (ρ + δ))^{1/(1−α)}.    (20)
If we use (20), (5) and the assumption that labor L has a constant magnitude, the following closed form solution for the steady state tax revenue can be found:

T(τ) = τ L f(k°) = τ L A (k°)^α = τ (1 − τ)^{α/(1−α)} · L A^{1/(1−α)} (α / (ρ + δ))^{α/(1−α)}    (21)

The graph of function (21) has a reversed U shape and can be considered as a model of the Laffer curve for the steady state of the economic unit, see Figure 2. The settings for the left panel of this figure are δ = 0.3, ρ = 0.8, A = 1, L = 1 and α = 0.6. The settings for the right panel of this figure remain the same with one modification, α = 0.4. Using (21) it can be proved that the maximum tax revenue of the government is reached for the tax rate τ_opt = 1 − α. As a consequence of these observations we can summarize that the Laffer curve effect can be observed if τ > 1 − α.
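A minimal R sketch of the steady state Laffer curve (21), using the setting of the left panel of Figure 2 (only the function name is an assumption of this edit), confirms the location of the maximum numerically:
laffer <- function(tau, alpha = 0.6, A = 1, L = 1, rho = 0.8, delta = 0.3) {
  tau * (1 - tau)^(alpha / (1 - alpha)) * L * A^(1 / (1 - alpha)) *
    (alpha / (rho + delta))^(alpha / (1 - alpha))       # formula (21)
}
tau <- seq(0, 1, by = 0.001)
tau[which.max(laffer(tau))]            # numerical maximizer, equal to 1 - alpha = 0.4
# plot(tau, laffer(tau), type = "l")   # reproduces the reversed U shape of Figure 2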
Figure 2 Laffer curve for different parameter specification.
5 Discussion
For the further analysis, methods of comparative statics will be used. If the derivative of (21) with respect to A is considered and this result is used to find the total differential of T(τ), we successively get

(ΔT(τ) / T(τ)) / (ΔA / A) = 1 / (1 − α) = η_A,    (22)

which means that the point elasticity of the steady state tax revenue for the technology level A is (1 − α)^{−1}. Now we can estimate the effect of a change of the technology level: a 1% increase in the level of technology A causes a (1 − α)^{−1} % increase in the steady state tax revenue. Similarly, the point elasticity of the steady state tax revenue for the capital productivity α can be determined:

(ΔT(τ) / T(τ)) / (Δα / α) = α / (1 − α)^2 · (1 − α + ln((1 − τ) α A / (ρ + δ))) = η_α.    (23)
It means that a 1% increase in the capital productivity α causes an η_α % increase/decrease in the steady state tax revenue. If the same setting as in Figure 2 (the left panel) is used, then for τ = 0.2 we gain η_α = −1.6. Hence a 1% increase in the output elasticity of capital α causes a 1.6% decrease in the steady state tax revenue. Another important note that has to be mentioned is the problem of the profit function of the representative firm that is given by (1). Assuming that the revenue of the firm is deductible, the after-tax profit of the firm can be written in a more appropriate way as

Π(K, L) = (1 − τ)(F(K, L) − RK − wL).
(24)
The steady-state level of capital in the model with the profit function (24) will be analyzed in another study.
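Returning to the elasticity (23), a short R check under the setting of the left panel of Figure 2 reproduces the reported value (only the function name is an assumption of this edit):
eta_alpha <- function(tau, alpha = 0.6, A = 1, rho = 0.8, delta = 0.3) {
  alpha / (1 - alpha)^2 * (1 - alpha + log((1 - tau) * alpha * A / (rho + delta)))   # formula (23)
}
eta_alpha(0.2)   # approximately -1.6, as reported above for tau = 0.2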
6 Conclusion
A good tax system allows a government to withdraw enough money to finance its public services. In the past the question of maximization of tax revenue was the subject of a series of static economic models. This question is contemporary again because of the continuing increase of governments' budget deficits in some European Union countries. In this paper, we discussed the optimal behaviour of households and firms under a tax on firms' income. The results based on the analysis of the model can be summarized as follows: a) the optimal behaviour of the household's consumption is given by the Euler equation (12) or (16); b) the optimal prices of capital services and labor are given by equations (2); these prices are dependent on the tax on the income of firms; c) to study the long run effects of the given tax rate, a steady state of the economic unit was introduced; we find that the steady state tax revenue can exhibit the Laffer effect; d) the optimal tax rate on firms' income at the steady state, given by the Laffer curve (21), is inversely related to the elasticity of capital: the higher the magnitude of the elasticity of capital is, the lower the optimal tax rate on firms' income; e) we also find that the higher the level of technology of the firm is, the higher tax revenue can be gained. In the future we would like to take the following steps: a) to introduce a model with more complex fiscal tools focused on taxes; particularly, we would like to consider not only the tax on firms' earnings but also tax rates on wage income, private asset income and consumption; b) having introduced the more complex model, we would like to calibrate the parameters of the model for the case of the Czech Republic.
Acknowledgements Support of the Specific research project of the Faculty of Informatics and Management of University of Hradec Králové is kindly acknowledged.
References [1] Barro, R. and Sala-i-Martin, X.: Economic Growth. MIT Press, Cambridge MA, 2004. [2] Bertsekas, D. P.: Dynamic Programming and Optimal Control. Athena Scientific, Belmont MA, 2005. [3] Blanchard, O.: Macroeconomics. Prentice Hall, New Jersey, 2000. [4] Canto, V. A., Joines, D. H. and Laffer, A. B.: Foundations of Supply-Side Economics: Theory and Evidence. Academic Press, New York, 1983. [5] Case, K. E. and Fair, R. C.: Principles of Economics. Prentice Hall, Upper Saddle River, NJ, 2007. [6] Chow, G. C.: Dynamic Economics, Optimization by Lagrange Method. Oxford University Press, Oxford, 1997. [7] Hemmelgarn, T.: Taxation trends in the European Union. Eurostat statistical books, European Union, 2013. [8] Laffer, A. B.: Government exactions and Revenue deficiencies. Cato Journal 1, 1 (1981), 1-21. [9] Lowell, M. C.: Economics with Calculus. World Scientific, Singapore, 2004. [10] Miao, J.: Economic Dynamics in Discrete Time. MIT Press, Cambridge MA, 2014. [11] Prazak, P.: Discrete Dynamic Economic Models in MS Excel. In: Proceedings of the 11th International Conference Efficiency and Responsibility in Education 2014 (Houska, M., Krejci, I. and Flegl, M. eds.), Czech University of Life Sciences, Prague, 2014, 595-601. [12] Sydsaeter, K., Hammond, P., Seierstad, A. and Strøm, A.: Further Mathematics for Economic Analysis. Prentice Hall, Harlow, 2005. [13] Trabandt, M. and Uhling, H.: The Laffer curve revisited. Journal of Monetary Economics 58 (2011), 305-327.
A Note on Economy of Scale Problem in Transport with Nonlinear Cost Function of Transported Quantity
Vladimír Přibyl1, Anna Černá2, Jan Černý3
Abstract. The article deals with a problem which is a modification of the known assignment problem: Given a set Q of mutually interchangeable objects to be transported. Given a road network, represented by a digraph G, where each arc is characterized by a transit time and by a nonlinear cost of joint transit of any number of elements from the set Q (the joint cost is supposed to be less than the sum of the individual costs). Given the vertices which represent the initial and the final positions of the elements from the set Q. The problem consists of the two following questions: S1. How to assign the final positions to the initial ones? S2. How to design, for each pair resulting from S1, a route connecting that pair optimally? Remark: It is supposed that if several routes resulting from S1 pass through a given arc, then the corresponding objects are transported together, with the nonlinear costs depending on the number of the objects. Therefore, the set of routes is optimal if the sum of the costs of joint transits of elements through all "exploited" arcs is minimal. In the paper an optimization technique, which is a generalization of the one from [2], is presented and verified.
Keywords: optimization, assignment, routing, joint transport, nonlinear costs.
JEL Classification: C61, O18, R42
AMS Classification: 90B06, 90B10
1 Introduction
Mass production is more efficient than individual production. This rule is valid for transport as well. E.g., an aircraft transports hundreds of passengers together, a freight train carries many packages at once, etc. Nevertheless, one can see persons or consignments transported individually, although they could be transported more efficiently together, at least in parts of their trips. An extensive class of optimization problems focused on such situations is presented in the articles [2] and [10], the latter of which also introduces a classification system for their structure. Furthermore, these articles contain the mathematical models and methods for solving a number of problems belonging to this structure. However, the case of non-linear dependence of the cost on the number of jointly transported objects has not been resolved yet and remains open. The main purpose of the present paper is to fill this gap for the case of mutually interchangeable objects. The issue of the article is related to the issue of network flows with non-linear costs. A special case of convex nonlinearity is dealt with in the paper [1] and the concave one in [7]. The difference is that the present paper does not work with flows but with individual objects to be transported and, moreover, no assumption concerning the type of nonlinearity is introduced. From a systemic viewpoint, as well as because the nonlinearity may occur, it is also related to the issue of marshalling trains [4], [5]. However, that issue deals with at least thousands of objects to be transported and therefore the methodologies are different from the current case dealing with at most tens. Another related area, i.e. truck routing (dispatching) [3], [6], [9], cannot serve as a methodological source, since it does not deal with nonlinearity of costs. Hence an original approach is necessary.
1.1 Problem Formulation
Assume that a set of n mutually interchangeable (= equal-type) objects is located in the different origin vertices o1, …, on of an (undirected) graph G = (V, E, ck) representing a network, where the value ck(e) is the transit cost of
k objects through the edge e, where c0(e) = 0, c1(e) > 0 and ck+1(e) ≥ ck(e) for each e ∈ E and k = 1, 2, ... Assume further that the objects are to be translocated to the different destination vertices u1, …, un (and it does not matter which object goes to which vertex), determined by a one-to-one mapping d of the set O onto U, where d(oi) is called the destination of the object located in oi. The translocation from origins to destinations will take place along the routes determined by a mapping r of the set of all pairs (oi, d(oi)) into the set of all routes on G connecting the pairs (o, u) ∈ O × U, where O = {o1, …, on}, U = {u1, …, un}, and r(oi, d(oi)) is called the transport route of the object located in oi to its destination d(oi). Each such pair (d, r) is called a translocation plan for the given origins o1, …, on and destinations u1, …, un in the graph G (briefly, a plan). The problem is to determine an optimal plan (d, r), which meets the following condition:
c(d, r) = Σ_{e ∈ E} c_{k(e)}(e) → min,  where k(e) = card{qi ∈ Q : e ∈ r(oi, d(qi))}.   (1)
In (1), the value c(d, r) represents the cost of the plan (d, r).
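For illustration only, the evaluation of the objective (1) for a given plan can be sketched in a few lines of Python; the data structures (a list of vertex sequences, one per object, and a per-edge table of the costs c_k(e)) are assumptions of this sketch, not notation from the paper.

from collections import Counter

def plan_cost(routes, cost):
    """Objective (1): routes is a list of vertex sequences (one per transported object);
    cost maps each undirected edge, given as frozenset({v, w}), to a dict {k: c_k(e)}."""
    k = Counter()
    for route in routes:
        for v, w in zip(route, route[1:]):
            k[frozenset((v, w))] += 1   # count how many routes use the edge, regardless of direction
    return sum(cost[e][m] for e, m in k.items())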
1.2 Problem Discussion
Remark 1. The requirement that all n translocated objects in Problem 1.1 ought to have different origins and different destinations can be omitted in practice without any change of the resolving method or algorithm: Assume, e.g., that a vertex o is the origin of three objects. Then three different "dummy" vertices o1, o2, o3 and edges (o1, o), (o2, o), (o3, o) may be added with zero costs ck(e) = 0 for all such e and k = 0, 1, … A similar approach can be adopted for destinations. I.e., this special formulation of Problem 1.1 does not limit its practical application.

It is suitable to define the digraph D(G) = (V, A, c) derived from G in such a way that [v, w] ∈ A and [w, v] ∈ A if and only if (v, w) ∈ E and, of course, ck[v, w] = ck[w, v] = ck(v, w) for k = 0, 1, 2, … It is better for expressing the fact that, e.g., the vertex v immediately follows the vertex w in a route r by the sentence "r passes through [v, w]" or "[v, w] ∈ r", which would become ambiguous if [v, w] were replaced by (v, w).

The following proposition is obvious, since for k > 0 it holds that ck > 0 on E or A, respectively:

Proposition 1. Any solution (d, r) of Problem 1.1 contains only simple routes, i.e. paths in the graph-theoretical sense.

Consequently, all "candidate" routes for the solution of Problem 1.1 ought to be simple, i.e. are allowed to pass any vertex at most once.

It makes no sense that within a plan (d, r) there exists a duplex passing of an edge, i.e. two routes r1, r2 and an edge (v, w) ∈ E such that [v, w] ∈ r1 and [w, v] ∈ r2 concurrently. Assume r1 = r1h,v,w,r1t, r2 = r2h,w,v,r2t and replace them by r1' = r1h,v,r2t, r2' = r2h,w,r1t in the plan, denoted now (d', r'). All routes from (d, r) except two have remained in (d', r') as well, and the two remaining ones have been shortened and therefore are cheaper, which implies that c(d', r') < c(d, r). That implies the following proposition:

Proposition 2. Any solution (d, r) of Problem 1.1 is duplex edge passing free.

Consequently, each "candidate" plan (d, r) solving Problem 1.1 ought to be duplex edge passing free.

Similar problems of network theory have been solved by many different methods. The most frequent is, probably, MILP, i.e. mixed integer linear programming. However, the nonlinearity in the objective function complicates this approach. Use of a piecewise linearization method [8] would lead to extremely large problems. Similarly, use of the DFS (depth-first search) technique seems too complicated as well, since there are too many possible plans. The authors consider a modification of the "negative cost circuit" approach more hopeful.
2 Solution Method for Problem 1.1

The method is a modification of the one used for the "classic" minimum cost flow problem.
2.1 Theoretical Basis
The definitions and denotations from 1.1 continue to be used. The increment of cost is defined as gk(e) = ck(e) − ck−1(e) for each e ∈ E and k = 1, 2, … Obviously, ck(e) = g1(e) + … + gk(e) for each e ∈ E and k = 1, 2, … The first aim of this subchapter is to investigate mutual relations of two given plans (d, r) and (d', r') in the digraph D(G).

Lemma 1. Let (d, r) and (d', r') be two given plans of Problem 1.1 in the digraph D(G). Then the mapping d* = d⁻¹ ∘ d' : O = {o1, …, on} → O is one-to-one and it defines a basic decomposition O = O1 ∪ O2 ∪ … ∪ Ok such that d*(Oi) = Oi for all i = 1, …, k, which cannot be refined further.
Proof. Since the mapping d is one-to-one, the inverse mapping d⁻¹ is unambiguously defined and also one-to-one. Therefore, d* is one-to-one as well. Thus one can define a retagging of the vertices o1, …, on and u1, …, un as follows: o1 remains as before, u1 = d'(o1); if d*(o1) = d⁻¹(u1) = o1 then it is trivial and O1 = {o1}. Hence, for this moment, it is supposed that o2 = d⁻¹(u1) ≠ o1 and u2 = d'(o2), etc. Suppose that m is the lowest number for which d⁻¹(um) = o1. Of course, then for all j < m it holds that d⁻¹(uj) = oj+1 and, consequently, the set O1 = {o1, …, om} may be called indecomposable with respect to the mappings d' and d. If m = n, then O1 = O = {o1, …, on} and O is called indecomposable as well. Otherwise, when m < n, the retagging procedure can be repeated starting with om+1, etc. That concludes the proof.

Lemma 2. If (d, r) and (d', r') are two solutions of Problem 1.1, then either the set O = {o1, …, on} is indecomposable with respect to the mappings d' and d, or (d, r) and (d', r') together with Problem 1.1 itself can be considered a composition of k independent subplans (di, ri) and (di', ri'), i = 1, …, k, of k sub-problems defined by origin sets Oi and destination sets Ui, i = 1, …, k, in the same network G resp. D(G), where d(o) = di(o) and d'(o) = di'(o) for o ∈ Oi, i = 1, ..., k, and r(o, d(o)) = ri(o, di(o)) and r'(o, d'(o)) = ri'(o, di'(o)).

Proof. The assertion of Lemma 2 is an easy consequence of Lemma 1.

Corollary. If c(d', r') < c(d, r) when the conditions of Lemma 2 are met, then there exists i ∈ {1, …, k} such that c(di', ri') < c(di, ri).

The solution method of Problem 1.1 works with the following digraph D(r, r') = (V(r, r'), A(r, r'), γ), which is called derived from the digraph D(G) and from the plans (d, r), (d', r') such that the set O = {o1, …, on} is indecomposable with respect to the mappings d' and d, using the denotations k([v, w]) = card{o ∈ O: [v, w] ∈ r(o, d(o))} and k'([v, w]) = card{o ∈ O: [v, w] ∈ r'(o, d'(o))}:

V(r, r') = {v ∈ V: there exists o ∈ O such that v ∈ r(o, d(o)) or v ∈ r'(o, d'(o))};

[v, w] ∈ A(r, r') if and only if either [w, v] ∈ r(o, d(o)) for some o ∈ O, and then γj([v, w]) = c_{k([w,v])−j}([w, v]) − c_{k([w,v])}([w, v]); or [v, w] ∈ r'(o, d'(o)) for some o ∈ O, and then γj([v, w]) = cj([v, w]).

As one can see, the arcs defined in the first case have the opposite orientation to the routes r. Therefore, they are called reversed (oriented from U to O, i.e. the arrow head is closer to O than the other end). The others are called straight (oriented from O to U). A circuit s (not necessarily simple) in the digraph D(r, r') is called admissible if it transits through any reverse arc [v, w] not more than k([w, v]) times and through any straight arc [v, w] not more than k'([v, w]) times. The first question is whether the digraph D(r, r') is strongly connected.

Lemma 3. If the set of origins O is indecomposable, then D(r, r') is strongly connected.

Proof. Let vo ∈ V(r, r'), vd ∈ V(r, r') and vo ≠ vd. It is necessary to prove that there exists a route from vo to vd in D(r, r'). If vo ∈ r(oi, d(oi)) for some i, then there exists a route r(vo, oi) from vo to oi, since the directions of arcs from r(oi, d(oi)) in D(G) are reversed in D(r, r'). From the proof of Lemma 1 it follows that there exists a route from oi to ui, then to oi+1(mod m), to ui+1(mod m), etc., which implies that there exists a route from vo to each origin oi and each destination ui. The same remains obviously valid if vo ∈ r'(oi, d'(oi)), since d'(oi) = ui.
Similarly, if vd ∈ r(oi, d(oi)) or vd ∈ r'(oi, d'(oi)), then there exists a route from ui or from oi to vd, and the proof is complete.

Lemma 4. Let A⁺(v) = {[v, w] ∈ A(r, r')} and A⁻(v) = {[w, v] ∈ A(r, r')} for the given v ∈ V(r, r'). Let

deg⁺(v) = Σ_{[v,w] ∈ A(r,r')} ( k([v, w]) + k'([v, w]) )  for all v ∈ V(r, r'),   (2)

deg⁻(v) = Σ_{[w,v] ∈ A(r,r')} ( k([w, v]) + k'([w, v]) )  for all v ∈ V(r, r').   (3)

Then deg⁺(v) = deg⁻(v) for each v ∈ V(r, r').

Proof. The assertion is obviously valid for all vertices o1, …, on, u1, …, un, since both mappings d⁻¹ and d' are one-to-one. The validity for other vertices follows from the construction of D(r, r') and from the fact that any route of any object enters an intermediate vertex and then exits from it. The proof is complete.

The method of solution of Problem 1.1 is based on the following theorem.
Theorem 1. Let (d, r) and (d', r') be plans of Problem 1.1 with indecomposable set O, and let c(d', r') < c(d, r). Let MD(r, r') = (V(r, r'), A*(r, r')) be the multi-digraph derived from the digraph D(r, r') in such a way that each straight arc in D(r, r') is multiplied k'(a) times and each reverse one k(a) times. Then

A1. An Euler circuit p exists in the multi-digraph MD(r, r'), and the corresponding circuit p* passing the same vertices in the same order in D(r, r') has total cost γ(p*) = c(d', r') − c(d, r) < 0.

A2. Let the plan (d, r) be changed following the circuit p* in such a way that (i) if γ([w, v]) < 0 for some [w, v] which is contained j times in the circuit p*, then j transits of primal routes from the plan (d, r) through the arc [v, w] are omitted, and this is done for all such arcs [v, w] ∈ A; afterwards (ii) if γ([v, w]) > 0 for some [v, w] which is contained j times in the circuit p*, then j transits of routes from the plan (d', r') through the arc [v, w] are added, and this is done for all such arcs [v, w] ∈ A. Then this unambiguously determines the new plan, equal to (d', r').

Proof of Theorem 1. The assertion A1 holds as a consequence of the construction of MD(r, r') and Lemmas 3 and 4. A2 follows from the fact that the circuit p* was derived from the Euler circuit p and from the construction of the multigraph MD(r, r'), which assures that all routes from (d, r) are cancelled and all transits of all routes from (d', r') are introduced.

A circuit p* from the proof of Theorem 1 is said to convert the solution (d, r) into (d', r').

Corollary. If the plan (d, r) is not optimal, and thus there exists another plan (d', r') with c(d', r') < c(d, r), then
a) either the set O is indecomposable with respect to the mappings d' and d, and then, due to Theorem 1, there exists a circuit p* in D(r, r') transforming (d, r) into (d', r'), i.e. into a better plan;
b) or the set O is decomposable, and then, due to Lemma 2, its corollary and Theorem 1, there exists a circuit p* transforming (d, r) into a better plan (d'', r''), which, however, may be worse than (d', r').

Remark 2. Obviously, there always exists a transforming circuit p* which does not contain a pair of arcs [v, w] together with [w, v], since their elimination has no influence on the transforming operation.

Remark 3. A quite natural question is whether there exists a simple circuit (= passing any arc at most once) p* transforming (d, r) into a better plan (d'', r'') in the case when (d, r) is not optimal. Unfortunately, this is not true, as one can see in Figure 1, using the costs determined by Table 1. All origin (o1…3) and destination (u1…3) vertices are considered to be dummy ones (see Remark 1) and are drawn with a dotted line (of course, with their dummy edges).
Figure 1 Demonstration of Remark 3

Let o1 = o1' = 1, d(1) = d'(1) = 14; o2 = o2' = 2, d(2) = d'(2) = 15; o3 = o3' = 3, d(3) = d'(3) = 16;
r(1, 14) = 1,5,10,14; r(2, 15) = 2,5,10,15; r(3, 16) = 3,7,6,5,10,11,12,16; c(d, r) = 6*13 + 4*5 + 45 = 143;
r'(1, 14) = 1,5,6,11,10,14; r'(2, 15) = 2,6,11,15; r'(3, 16) = 3,7,6,11,12,16; c(d', r') = 2*12 + 4*18 + 45 = 141.

Therefore, (d, r) is not optimal since (d', r') is better. However, a simple circuit p* transforming (d, r) into a better plan (d'', r'') with c(d'', r'') < 143 does not exist. It can be proved indirectly: Let p* be such a simple circuit. It cannot contain the arc [10, 5] with the cost −3, since then it would have to contain one of the arcs [4, 9], [6, 11], [7, 12], or [8, 13] with the cost 36, and a negative cost of the circuit would not be reachable. However, a circuit with negative cost obviously exists neither among the vertices 1-8 nor 9-16.
2.2 Description of the Method for Problem 1.1
The main idea of the method is based on the Corollary of Theorem 1: First, find any initial plan (d, r) for the sets O and U in the digraph D(G) (e.g. by a greedy algorithm). Then reverse the arcs of all routes r(o, d(o)) and assign them costs in the same way as in D(r, r') (although no r' is known). The costs of all other arcs [v, w] remain unchanged. Afterwards, look for a simple circuit p with negative cost in D(G) using any tagging (labelling) method (e.g. a Ford-type method). After having found it, cancel one transit through any arc of p with a negative cost and add a new transit through the arcs with positive cost (the cost of the first transit). Obviously, a new plan with smaller cost is obtained. If no more simple circuits exist, then the construction of a non-simple circuit via the tagging procedure continues in a modified way: the multiplicity of passing of any arc in D(G) by p* is limited by n = |O|, and after each passing of an arc a by the subsequently constructed p*, the cost c(a) is changed similarly as defined in D(r, r'). The technical implementation of this idea achieved by the authors so far is not usable because of its unacceptable computational complexity, but there are hopes for its acceleration.
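The elementary subroutine of such a tagging step, detecting a circuit of negative total cost in a digraph with given arc costs, can be illustrated by a standard Bellman-Ford-style relaxation. The following sketch shows only this generic step under the assumption that the arc costs have already been prepared as described above; it is not the authors' implementation and the data structures are hypothetical.

def negative_circuit(vertices, arcs):
    """arcs: list of (tail, head, cost); returns a list of vertices forming a negative-cost
    circuit, or None if no such circuit exists."""
    dist = {v: 0.0 for v in vertices}        # start from all vertices at once
    pred = {v: None for v in vertices}
    start = None
    for _ in range(len(vertices)):           # |V| relaxation rounds
        start = None
        for u, v, c in arcs:
            if dist[u] + c < dist[v] - 1e-12:
                dist[v] = dist[u] + c
                pred[v] = u
                start = v
    if start is None:                        # no relaxation in the last round: no negative circuit
        return None
    for _ in range(len(vertices)):           # walk back until we are surely on the circuit
        start = pred[start]
    cycle, v = [start], pred[start]
    while v != start:                        # collect the circuit from the predecessor labels
        cycle.append(v)
        v = pred[v]
    cycle.append(start)
    return cycle[::-1]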
3 Computational experience
For testing purposes, the authors have designed a simple graph with edges evaluated using 4 types of nonlinear cost functions. The cost functions (CF1, CF2, CF3 and CF4) are shown in Table 1. Labels in Figure 1 show the relevant cost functions of the edges. Because the cardinality |O| = 4 in all test cases, the functions are tabulated for a maximum of 4 objects.

Type of cost function   k=1    k=2    k=3    k=4
CF1                      5.0    5.8    6.3    6.5
CF2                     12.0   14.0   15.0   15.5
CF3                     13.0   15.0   16.0   16.5
CF4                     36.0   42.0   45.0   46.5

Table 1 Transit cost of k objects

The tagging procedure was tested on the above mentioned graph (depicted in Figure 1) with the origin vertices o1 = 1; o2 = o3 = 2; o4 = 3 and destination vertices u1 = 14; u2 = u3 = 15; u4 = 16 and different initial plans. Table 2 describes the circuits and their resulting plans with their costs in the case when the optimal plan is reachable by successive application of 6 simple circuits. The structure of a plan description is "…|k [arc]|…", where k is the number of objects transported via an arc [arc].

Step 0 (costs 190)
Plan: 1 [1,4]|1 [4,9]|1 [9,14]|2 [2,6]|1 [3,8]|2 [6,11]|1 [8,13]|2 [11,15]|1 [13,16]

Step 1 (costs 188)
Plan: 1 [1,4]|1 [4,5]|1 [10,14]|2 [2,6]|1 [3,7]|1 [5,6]|3 [6,11]|1 [7,12]|1 [12,13]|3 [11,15]|1 [13,16]|1 [15,10]
Circuit: 16,13,8,3,7,12,15,10,14,9,4,5,6,11,15,12,13,16

Step 2 (costs 184)
Plan: 1 [1,5]|1 [10,14]|2 [2,6]|1 [3,7]|1 [5,6]|3 [6,11]|1 [7,12]|1 [12,13]|3 [11,15]|1 [13,16]|1 [15,10]
Circuit: 16,13,8,3,7,12,15,10,14,9,4,1,5,4,9,14,10,15,12,7,3,8,13,16

Step 3 (costs 175)
Plan: 1 [1,5]|1 [10,14]|2 [2,6]|1 [3,7]|1 [5,6]|3 [6,11]|1 [7,12]|1 [12,13]|2 [11,15]|1 [13,16]|1 [11,10]
Circuit: 16,13,8,3,7,12,15,11,10,15,12,7,3,8,13,16

Step 4 (costs 159.5)
Plan: 1 [1,5]|1 [10,14]|3 [2,6]|1 [3,7]|1 [5,6]|4 [6,11]|1 [11,12]|1 [12,13]|2 [11,15]|1 [13,16]|1 [7,2]|1 [11,10]
Circuit: 16,13,8,3,7,2,5,6,11,12,7,3,8,13,12,15,10,14,9,4,5,2,6,5,1,4,9,14,10,15,11,10,5,4,1,5,10,11,15,12,13,16

Step 5 (costs 150.5)
Plan: 1 [1,5]|1 [10,14]|2 [2,6]|1 [3,7]|1 [5,6]|4 [6,11]|1 [11,12]|1 [12,13]|2 [11,15]|1 [13,16]|1 [7,6]|1 [11,10]
Circuit: 16,13,8,3,7,6,2,7,3,8,13,16

Step 6 (costs 146.5)
Plan: 1 [1,5]|1 [10,14]|2 [2,6]|1 [3,7]|1 [5,6]|4 [6,11]|1 [11,12]|2 [11,15]|1 [12,16]|1 [7,6]|1 [11,10]
Circuit: 16,13,12,16

Table 2 Application of successive simple circuits leading to the optimal plan in 6 steps

Step 0 describes the initial plan obtained by the greedy algorithm, which for each destination vertex finds the cheapest route connecting it with an available origin vertex. The algorithm for searching simple circuits stops as soon as the first circuit with negative cost is reached, transforms the plan and starts searching again. Experiments have shown that starting with another initial plan can lead to non-optimal plans which cannot be improved by application of a simple circuit. This corresponds with Remark 3.

Table 3 describes a non-simple circuit which can transform the same initial plan as in the previous case into the optimal plan in only one step. Unfortunately, this circuit was found intuitively. Although the implemented simple
circuit tagging procedure is relatively fast (the computational time in the case described in Table 2 was about 1 second), extending it to non-simple circuits leads (until now) to unacceptable computational times.

Step 0 (costs 190)
Plan: 1 [1,4]|1 [4,9]|1 [9,14]|2 [2,6]|1 [3,8]|2 [6,11]|1 [8,13]|2 [11,15]|1 [13,16]

Step 1 (costs 146.5)
Plan: 1 [1,5]|1 [10,14]|2 [2,6]|1 [3,7]|1 [5,6]|4 [6,11]|1 [11,12]|2 [11,15]|1 [12,16]|1 [7,6]|1 [11,10]
Circuit: 16,13,8,3,7,6,11,10,14,9,4,1,5,6,11,12,16
Table 3 Application of non-simple circuit leading to the optimal plan in one step
4 Conclusion and Outline of Future Research
The paper presents and solves the problem of finding an optimal plan of partially joint transport of n mutually interchangeable objects through a given network. It is assumed that the objects are located in n origin vertices and ought to be transported to a given set of destination vertices, when it does not matter which object ends in which destination. The main feature of the problem is that the cost of joint transportation of k objects through an edge depends nonlinearly on k and is much cheaper than k times the cost of individual transportation. The solution method is based on repeated search for a circuit (simple if possible, but not necessarily) with negative cost in an auxiliary digraph derived from the network and an initial solution, which is easily found by the origin-destination matching resulting from the "classic" assignment problem. The main direction of future research can be expected in the acceleration of the solution method, either based on quicker finding of a non-simple circuit with negative cost or on looking for a completely different principle, e.g. mixed integer linear programming, but not based on piecewise linearization of the cost.
Acknowledgements The authors greatly appreciate the support of research, whose results are published in this paper, by the Faculty of Management, University of Economics in Prague within the resources for long-term institutional development, Project number IP600040.
References [1] Bertsekas, D.P., Polymenakos L.C. and Tseng, P.: An -Relaxation Method for Separable Convex Cost Network Flow Problems. SIAM Journal of Optimization 7 (1997), 853-870. [2] Černý, J. and Přibyl, V.: Partially Joint Transport on Networks. In Mathematical Methods in Economics 2013. Jihlava: College of Polytechnics Jihlava, 2013, 113-116. [3] Cordeau, J.F., Desaulniers, G., Desrosiers, J. and Solomon, M.M.: The VRP with Time Windows. In: The Vehicle Routing Problem, Paolo Toth and Daniele Vigo (eds), SIAM Monographs on Discrete Mathematics and Applications, 2001. [4] Flodr, F., Mojžíš, V. and Pokorný, J.: Dopravní provoz železnic – vlakotvorba (Railway Traffic - Marshalling Trains – in Czech), VŠDS Žilina, 1989. [5] Gašparík, J. a kol.: Vlakotvorba a miestne dopravné procesy (Marshalling Trains and Local Transport Processes – in Slovak). 1st ed.. Univerzita Pardubice 2011. [6] Golden, B. L., Raghavan, S. and Wasil, E. A. (Eds.). (2008). The vehicle routing problem: latest advances and new challenges. Vol. 43, Springer Science & Business Media. [7] Guisewite, G.M. and Pardalos, P.M.: Minimum concave-cost network flow problems: Applications, complexity, and algorithms. Annals of Operations Research 25 (1990), 75-99. [8] Moussourakis, J. and Haksever, C.: Project compression with nonlinear cost functions. Journal of Construction Engineering and Management 136 (2009), 251-259. [9] Peško, Š, 2002. Optimalizácia rozvozov s časovými oknam (Time Constrained Routing Optimization – in Slovak). Konferencia KYBERNETIKA - história, perspektívy, teória a prax, zborník referátov, 76–81, Žilina. [10] Přibyl, V., Černý, J. and Černá, A.: Economy of Scale in Transport Industry – New Decision Problems. Submitted to Prague Economic Papers, the preprint in electronic form can be requested from the authors.
Finite game with vector payoffs: how to determine weights and payoff bounds under linear scalarization

Miroslav Rada1, Michaela Tichá2

1 University of Economics, Prague, Faculty of Finance and Accounting, W. Churchill Sq. 4, 130 67 Prague 3, Czech Republic, miroslav.rada@vse.cz
2 University of Economics, Prague, Faculty of Informatics and Statistics, W. Churchill Sq. 4, 130 67 Prague 3, Czech Republic, michaela.ticha@vse.cz

Abstract. Assume a two-player finite noncooperative game with multiple known payoff functions (also called multi-criteria game) of each player. Furthermore, assume that utility of both the players can be obtained as a linear weighted scalarization of their payoffs. Given a strategy profile of the players, we are interested in the set of weights for which the strategy profile is the Nash equilibrium of the scalarized game. We use the standard quadratic program for finding Nash equilibria in bimatrix games to show that the set of the sought weights can be described as a system of linear inequalities. Furthermore, assuming that the given strategy profile is really a Nash equilibrium of the game, we show how to compute lower and upper bounds of players' utilities over all possible weights; this can be in fact realized using linear programming. We also discuss some further extensions and suggest some generalizations and robustifications of the approach.

Keywords: multi-criteria game, vector payoffs, weighted scalarization, payoff bounds.

JEL classification: C72
AMS classification: 91A10
1 Introduction
1.1 Multi-criteria games
Multi-criteria games are a natural extension of classic game-theoretic models; this holds both for games in strategic form and for coalitional games.

Related work. The basics of multi-criteria games were laid independently by Blackwell [2] and Shapley [11, 12]. The work of both authors addressed roughly the same problem, the question of existence of a Nash equilibrium in a multi-criteria zero-sum game; however, Shapley's terminology is closer to the one that we are nowadays used to. This work was followed by several other researchers, for example by Zeleny [16]. Zeleny's approach introduced a form of weighted linear scalarization, while still considering zero-sum games. This was generalized by several other authors, namely Borm et al. [3] or Kruś and Bronisz [7], to mention some. Recently, a survey on solution concepts of multi-criteria games appeared due to Anand et al. [1]. There are many more interesting works in the area of noncooperative multi-criteria games. We mention the work of Voorneveld et al. [14] and especially the work on stable equilibria by de Marco and Morgan [5], which is closely related to our paper. Recently, interesting topics in multi-criteria game theory were addressed also by Yu and Liu [15] (robustification), by Pusillo and Tijs [10] (E-equilibria), by de Marco [4] (generic approach to games) or by Fahem and Radjef [6] (proper efficiency of equilibria).

Our starting point. Most recently, several results on both noncooperative and coalitional games were achieved in the doctoral thesis by Tichá [13]. One of the results – determining weights under which a given strategy profile is a Nash equilibrium in a finite noncooperative game – serves as the starting point for this paper. We take the result, strengthen it slightly and formulate an approach for obtaining additional information from the knowledge of a strategy profile. The goal is formulated more precisely and formally in Section 2.
1.2 Notation and general results for noncooperative games
Convention. In the whole paper, superscripts serve to distinguish variables, not for exponentiation. For a natural n, the symbol [n] denotes the set {1, . . . , n}. The symbol 1 is the vector (1, . . . , 1)T of suitable dimension.
For the purposes of this paper, it is sufficient to consider games of two players. However, for the sake of generality,
we start with some known results from the area of noncooperative games, which set our problem into a broader perspective. First, we introduce the general definitions and propositions, then we switch to the two-player case. We start with an exhaustive definition that shall fix the notation:

Notation 1 (Basic symbols). We define and denote by
(a) n the number of players,
(b) N := [n] the set of players,
(c) m^i the number of pure strategies of i-th player, for all i ∈ N,
(d) S^i := [m^i] the set of pure strategies of i-th player, for all i ∈ N,
(e) S := ∏_{i∈N} S^i the set of pure strategy profiles,
(f) X^i := {x^i ∈ R^{m^i} : x^i ≥ 0, 1^T x^i = 1} the set of mixed strategies of i-th player, for all i ∈ N,
(g) X := ∏_{i∈N} X^i the set of all strategy profiles,
(h) x^i ∈ X^i the strategy of i-th player, for all i ∈ N,
(i) s^i_j ∈ X^i the j-th pure strategy in the mixed space of i-th player, for all i ∈ N,
(j) x := (x^i)_{i∈N} ∈ X the strategy profile of the players,
(k) x^{-i} := (x^1, …, x^{i-1}, x^{i+1}, …, x^n) the strategy profile x without the strategy of i-th player; this is to be used with some strategy ξ^i ∈ X^i of i-th player; in particular, (ξ^i, x^{-i}) is the strategy profile x with strategy ξ^i instead of the strategy x^i,
(l) r^i the number of payoffs of i-th player,
(m) P^i_j the mapping S → R, for each i ∈ N and for each j ∈ [r^i],
(n) v^i_j(x) := Σ_{s^1∈S^1} ··· Σ_{s^n∈S^n} P^i_j(s^1, …, s^n) ∏_{k∈N} x^k_{s^k} the mapping X → R called j-th payoff of i-th player, for j ∈ [r^i] and i ∈ N,
(o) v^i(x) := (v^i_j(x))_{j∈[r^i]} the payoff vector of i-th player, for all i ∈ N.

Note that the symbol P in item (m) of the above definition has a similar role as the payoff matrices in two-player games. Also, the letter P was chosen as a shortcut for "payoff". Notation 1 provides us with notation that allows us to define the multi-criteria game in a very simple manner:

Definition 2 (Multi-criteria game in strategic form). Given N, X and ((v^i_j)_{j∈[r^i]})_{i∈N}, the triplet

G := (N, X, ((v^i_j)_{j∈[r^i]})_{i∈N})   (1)
is called (mixed extension of finite) multi-criteria game in strategic form.

For the definition of equilibria of the multi-criteria game, it will be helpful to introduce several ways of comparing vectors.

Convention. Let a, b ∈ R^m be given vectors. We define the following comparison operations:
• a ≥ b if a_j ≥ b_j for every j ∈ [m],
• a ≩ b if a ≥ b and a ≠ b,
• a > b if a_j > b_j for every j ∈ [m].

Definition 3 (Weak equilibrium of multi-criteria game). A strategy profile x ∈ X is called a weak equilibrium of the multi-criteria game G if, for all i ∈ N, there is no ξ^i ∈ X^i such that

v^i(ξ^i, x^{-i}) > v^i(x).   (2)

Definition 4 (Strong equilibrium of multi-criteria game). A strategy profile x ∈ X is called a strong equilibrium of the multi-criteria game G if, for all i ∈ N, there is no ξ^i ∈ X^i such that

v^i(ξ^i, x^{-i}) ≩ v^i(x).   (3)
A straightforward approach to solving a multi-criteria game is to aggregate the multiple criteria somehow into one criterion. This principle is called scalarization and is pretty standard in other areas of multi-criteria decision-making. Generally, there are many different ways how to scalarize; here, we are interested in linear scalarization, i.e. in aggregating the criteria via linear combination (or, more precisely, convex combination) of the criteria with some coefficients of the combination. To formalize this, Notation 5 follows:

Notation 5 (Scalarization stuff). We define and denote by
(o) W^i := {w ∈ R^{r^i} : w ≥ 0, 1^T w = 1} the set of possible weights of the payoffs, for all i ∈ N,
(p) w^i ∈ W^i the weights of i-th player, for all i ∈ N,
(q) u^i(x, w^i) := Σ_{j∈[r^i]} w^i_j v^i_j(x) the mapping X × W^i → R called the utility function of i-th player, for all i ∈ N.
Definition 6 (Linear scalarization of multi-criteria game in strategic form). Given N, X and (u^i)_{i∈N} (assume that knowledge of u^i implies the knowledge of w^i), the triplet

G^w := (N, X, (u^i)_{i∈N})   (4)

is called linear scalarization of (mixed extension of finite) multi-criteria game in strategic form.

Multi-criteria games and their scalarizations (and their Nash equilibria) are related by the classic result of Kruś and Bronisz [7]:

Theorem 1. Assume x ∈ X is a weak Nash equilibrium of the scalarized multi-criteria game

G^w = (N, X, (u^i)_{i∈N}).   (5)

Then x is a weak Nash equilibrium of the multi-criteria game

G = (N, X, ((v^i_j)_{j∈[r^i]})_{i∈N}).   (6)
In other words, the theorem says that if we assign weights to each criterion and convert multi-criteria game to classical game, we will find a subset of the weak equilibria of the multi-criteria game corresponding to the preferences of players. In this paper, we assume mixed extensions of finite games. However, Theorem 1 is even stronger: it holds for arbitrary payoff functions and strategy spaces. For our specific case of payoffs that are linear in every strategy of each player, the converse of the implication in Theorem 1 also holds: for a Nash equilibrium x of the multi-criteria game G, there exist weights w such that x is Nash equilibrium of Gw . This justifies the importance of studying the scalarizations.
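As a small numerical illustration of the scalarized utility from Notation 5 (the payoff matrices and weights below are hypothetical and anticipate the bimatrix form introduced later in Notation 7), the scalarized payoff of a player is just the weighted sum of his criterion matrices:

import numpy as np

P1 = [np.array([[3.0, 0.0], [5.0, 1.0]]),   # hypothetical payoff matrix, criterion 1 of player 1
      np.array([[1.0, 4.0], [0.0, 2.0]])]   # hypothetical payoff matrix, criterion 2 of player 1
w1 = np.array([0.7, 0.3])                   # weights of player 1: w1 >= 0, sum = 1

P1_bar = sum(w * P for w, P in zip(w1, P1))  # scalarized payoff matrix of player 1
x1, x2 = np.array([0.5, 0.5]), np.array([1.0, 0.0])
u1 = x1 @ P1_bar @ x2                        # utility u^1(x, w^1) at the profile (x1, x2)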
2 Problem formulation
From now on, we assume n = 2, i.e. we are restricted to games of two players. A standard problem in two-player games, both one-criterial and multi-criterial, is to find equilibria. For one-criterial games, there are several ways how to find an equilibrium. Here, we start from the classic mathematical-programming formulation of Mangasarian and Stone [9]. However, we shall also mention the Lemke-Howson algorithm [8] for finding all equilibria of this game.

Convention. The shortcut NE stands for weak Nash equilibrium.

Since we now have only two players, it will be helpful to consider the mappings P^i_j (see Notation 1(m)) as matrices and to simplify v^i_j (see Notation 1(n)) to x^{1T} P^i_j x^2, as is usual for the case of bimatrix games (bimatrix multi-criteria games, in our case). This is formalized in Notation 7.

Notation 7 (Replacement for the case of two players). We replace the following symbols from Notations 1 and 5 as follows and also add one more symbol:
• Not. 1(m) – every symbol P^i_j now represents a matrix of dimensions m^1 × m^2 and denotes the payoffs of i-th player according to j-th criterion, for all j ∈ [r^i], i ∈ {1, 2},
• Not. 1(n) – the mapping v^i_j is now defined as v^i_j(x) = x^{1T} P^i_j x^2,
• Not. 5(q) – the mapping u^i is now defined as u^i(x, w^i) = Σ_{j∈[r^i]} x^{1T} w^i_j P^i_j x^2 = x^{1T} P^i x^2, where P^i = Σ_{j∈[r^i]} w^i_j P^i_j, for all i ∈ {1, 2},
• Not. 1(h) – the strategy x^1 of player 1 is from now on understood as a row vector; this will be useful in matrix multiplications.

The result of Mangasarian and Stone is (for our game G^w with fixed w) stated in Theorem 2.

Theorem 2 (Quadratic program for bimatrix games [9]). Given a one-criterial game G^w = ({1, 2}, X^1 × X^2, (u^1, u^2)) and fixed weights w^1 ∈ W^1, w^2 ∈ W^2, the strategy profile (x^1, x^2) ∈ X^1 × X^2 is a Nash equilibrium of G^w if and only if it is the global optimal solution of the quadratic program

max_{x^1∈X^1, x^2∈X^2, α,β∈R}  x^1 (P^1 + P^2) x^2 − α − β,
subject to  P^1 x^2 − α1 ≤ 0,
            x^1 P^2 − β1 ≤ 0,   (7)

with the optimal value of 0.
Corollary 3. Given weights w1 ∈ W 1 and w2 ∈ W 2 , one is able to find at least one Nash equilibrium of a multi-criteria game of two players using quadratic program (7) and Theorem 1. Our problem(s). The problem “we are given multi-criteria game and weights, compute a Nash equilibrium” is easily solvable due to Corollary 3. However, one can pose it in the opposite way: “given a strategy profile x, find weights w1 ∈ W 1 , w2 ∈ W 2 such that x is a Nash equilibrium”.
(P1)
Furthermore, if there are multiple such weights, one can ask, what is actually the payoff of each particular player. Since that cannot be answered without an additional information about weights, we will address the following weaker problem: “given a strategy profile x, find lower and upper bound of payoff of each player s.t. x is a NE”.
(P2)
Goal. The goal of this paper is simply to propose a way to solve problems (P1) and (P2) efficiently.

Structure. The rest of the paper is structured as follows. In Section 3, we formulate an optimization problem for solving problem (P1), based on the quadratic program (7); this is formalized in Lemma 5. The formulation (7) is further extended in Section 4 into a form suitable for solving (P2), see Lemma 6. Section 5 concludes the paper.
3 Finding the set of weight vectors
We start from the quadratic program (7). Theorem 2 can be directly used for characterizing the set of weights for which a given x ∈ X is an NE – it is sufficient to swap the roles of the variables: we optimize over w^1, w^2 instead of x^1, x^2. The formulation then reads

max_{w^1∈W^1, w^2∈W^2, α,β∈R}  x^1 ( Σ_{j∈[r^1]} w^1_j P^1_j + Σ_{j∈[r^2]} w^2_j P^2_j ) x^2 − α − β,   (8a)

Σ_{j∈[r^1]} w^1_j P^1_j x^2 − α1 ≤ 0,   (8b)

x^1 Σ_{j∈[r^2]} w^2_j P^2_j − β1 ≤ 0.   (8c)
Note that the formulation (8) is linear in w^1 and w^2. This significantly simplifies the situation. First, let us supplement the formulations (7) and (8) with several easy but useful propositions:

Proposition 4.
(a) Let (x^1, x^2, α, β) ∈ X × R^2 be a feasible solution of (7) for some fixed w^1 ∈ W^1 and w^2 ∈ W^2. Then x^1 P^1 x^2 − α ≤ 0 and x^1 P^2 x^2 − β ≤ 0 hold.
(b) Let (w^1, w^2, α, β) ∈ W^1 × W^2 × R^2 be a feasible solution of (8) for some fixed x ∈ X. Then x^1 Σ_{j∈[r^1]} w^1_j P^1_j x^2 − α ≤ 0 and x^1 Σ_{j∈[r^2]} w^2_j P^2_j x^2 − β ≤ 0.
(c) x ∈ X is an NE of G^w if and only if the objective value of (7) or (8) is 0.

Proof. 1) The items (a) and (b) follow from the way the programs (7) and (8) are derived. We focus on the program (8). In fact, the m^1 constraints (8b) tell that the payoff of player 1 for every pure strategy s^1_j is at most α; taking a convex combination of these inequalities (the standard trick) we can conclude that, for every x^2, the payoff of player 1 is at most α regardless of the choice of x^1; hence, it is at most α for the solution of (8), too. The same holds for player 2 and the constraints (8c). 2) The item (c) is obvious according to the items (a) and (b) and Theorem 2.

Remark. Note that α resp. β is the upper bound for the payoff of player 1 resp. 2. If x ∈ X is an NE, then α resp. β is the payoff of player 1 resp. 2.

Lemma 5 (Linear program for (P1)). Assume the game G^w and a strategy profile x ∈ X are given. The
sets W^1(x), W^2(x) of weights w^i ∈ W^i such that x is an NE of G^w are the convex polyhedra

W^1(x) := { w^1 ∈ W^1 : Σ_{j∈[r^1]} w^1_j P^1_j x^2 ≤ α1,  x^1 Σ_{j∈[r^1]} w^1_j P^1_j x^2 = α },

W^2(x) := { w^2 ∈ W^2 : x^1 Σ_{j∈[r^2]} w^2_j P^2_j ≤ β1,  x^1 Σ_{j∈[r^2]} w^2_j P^2_j x^2 = β }.   (9)
Proof. We take the formulation (8). Due to Proposition 4(c), the objective function can be treated as a constraint by setting the requested value to 0. Then, due to Proposition 4(b), this new constraint can be separated into two constraints. Clearly, all constraints in the definition of W^1(x) and W^2(x) are linear in w^1 resp. w^2. Since the sets are defined as intersections of linear constraints, they are given as convex H-polyhedra.
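For illustration (hypothetical data structures, not part of the paper), membership of a concrete weight vector in W^1(x) can be verified directly from the linear description in Lemma 5:

import numpy as np

def in_W1(w1, P1, x1, x2, tol=1e-9):
    """Test w1 ∈ W^1(x): all pure-strategy payoffs of player 1 are at most alpha and the payoff
    at x1 equals alpha, where alpha = x1 (sum_j w1_j P1_j) x2 is the scalarized utility."""
    P = sum(w * Pj for w, Pj in zip(w1, P1))    # scalarized payoff matrix of player 1
    alpha = x1 @ P @ x2                         # utility of player 1 at (x1, x2)
    simplex_ok = np.all(np.asarray(w1) >= -tol) and abs(sum(w1) - 1.0) <= tol
    return simplex_ok and np.all(P @ x2 <= alpha + tol)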
4 Finding payoff bounds
Note that with unknown weights, the payoffs are also unknown. Here, we are interested in the computation of a lower and an upper bound of the payoff of each player. Since the sets W^1(x) and W^2(x) have a linear description, this is, for a given x, possible simply with a linear program:

Lemma 6 (Linear program for (P2)). Assume the game G^w is given and a strategy profile x ∈ X that is known to be an NE. Then the lower bound of the payoff of i-th player can be determined using the linear program

min_{w^i}  x^1 Σ_{j∈[r^i]} w^i_j P^i_j x^2,   (10)

s.t. w^i ∈ W^i(x).   (11)
Proof. Obvious. The upper bound can be found by replacing “min” by “max” in (10).
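A minimal sketch of Lemma 6 for player 1, using scipy.optimize.linprog, might look as follows; the function name and the representation of the data (a list of payoff matrices P1_j and the strategies x1, x2 as NumPy arrays) are assumptions of the sketch, and both linear programs are assumed to be feasible, i.e. x is indeed an NE for some weights.

import numpy as np
from scipy.optimize import linprog

def payoff_bounds_player1(P1, x1, x2):
    """Return (lower, upper) bounds of player 1's utility over all weights in W^1(x).
    Variable vector: (w_1, ..., w_r, alpha), alpha being the scalarized payoff of player 1."""
    r = len(P1)                               # number of criteria of player 1
    m1 = P1[0].shape[0]                       # number of pure strategies of player 1
    q = np.array([P.dot(x2) for P in P1]).T   # q[k, j] = payoff of pure strategy k under criterion j
    v = np.array([x1 @ P @ x2 for P in P1])   # v[j] = x1^T P_j x2

    # sum_j w_j q[k, j] - alpha <= 0 for every pure strategy k (constraints from (9))
    A_ub = np.hstack([q, -np.ones((m1, 1))])
    b_ub = np.zeros(m1)
    # sum_j w_j = 1  and  sum_j w_j v[j] - alpha = 0
    A_eq = np.vstack([np.append(np.ones(r), 0.0), np.append(v, -1.0)])
    b_eq = np.array([1.0, 0.0])
    bounds = [(0, None)] * r + [(None, None)]

    # minimise / maximise alpha (equal to the utility by the equality constraint)
    c = np.zeros(r + 1); c[-1] = 1.0
    lo = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds=bounds)
    hi = linprog(-c, A_ub, b_ub, A_eq, b_eq, bounds=bounds)
    return lo.fun, -hi.fun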
5 Conclusion
Given a linear scalarization of a multi-criteria game with a known strategy profile and unknown weights, we presented an approach (see Lemma 5) for determining the weights of criteria under which the strategy profile is a Nash equilibrium. Also, we presented an approach (see Lemma 6) for computing upper and lower bounds of the payoffs of the players, which are themselves unknown when the weights are unknown. Surprisingly, the above problems can be solved simply using linear programming, and hence the approach is fully reasonable and ready to use in practical multi-criteria games; unfortunately, for space reasons, we cannot present an example of a multi-criteria game (for example the multi-criteria battle of sexes) to demonstrate the power of our results. On the other hand, the simplicity of the presented results calls for some generalizations. Namely, it is apparent that the approach shall be generalized for games of an arbitrary number of players. Also, the results could be somewhat robustified. Namely, consider the following problem: from m observations of realizations of a game, we estimate the strategy profile x. What is the α-percent region of confidence for the weights? What is the α-percent region of confidence for the payoffs? These questions should be the subject of our further research.
Acknowledgements The work of the first author was supported by Institutional Support of University of Economics, Prague, VŠE IP 100040. The work of the second author is supported by internal grant agency VŠE IGA F4/54/2015.
References [1] Anand, L., Shashishekhar, N., Ghose, D., and Prasad, U. R.: A survey of solution concepts in multicriteria games. Journal of the Indian Institute of Science 75 (2013), 141–174. [2] Blackwell, D.: An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics 6 (1956), 1–8.
[3] Borm, P., Vermeulen, D., and Voorneveld, M.: The structure of the set of equilibria for two person multicriteria games. European Journal of Operational Research 148 (2003), 480–493. [4] De Marco, G.: On the genericity of the finite number of equilibria in multicriteria games: a counterexample. International Journal of Mathematics in Operational Research 5 (2013), 764–777. [5] De Marco, G., and Morgan, J.: A Refinement Concept for Equilibria in Multicriteria Games Via Stable Scalarizations. International Game Theory Review 9 (2007), 169–181. [6] Fahem, K., and Radjef, M. S.: Properly efficient Nash equilibrium in multicriteria noncooperative games. Mathematical Methods of Operations Research 82 (2015), 175–193. [7] Kruś, L., and Bronisz, P.: On n-person noncooperative multicriteria games described in strategic form. Annals of Operations Research 51 (1994), 83–97. [8] Lemke, C., and Howson, J.: Equilibrium Points of Bimatrix Games. Journal of the Society for Industrial and Applied Mathematics 12 (1964), 413–423. [9] Mangasarian, O. L., and Stone, H.: Two-person nonzero-sum games and quadratic programming. Journal of Mathematical Analysis and Applications 9 (1964), 348–355. [10] Pusillo, L., and Tijs, S.: E-equilibria for Multicriteria Games. In: Advances in Dynamic Games. Springer, 2013, 217–228. [11] Shapley, L. S.: Equilibrium Points in Games with Vector Payoffs. Technical report, RAND, 1956. [12] Shapley, L. S., and Rigby, F. D.: Equilibrium Points in Games with Vector Payoffs. Naval Research Logistics Quarterly 6 (1959), 57–61. [13] Tichá, M.: Vícekriteriální hry. Ph.d. thesis, University of Economics, Prague, Prague, 2016. [14] Voorneveld, M., Grahn, S., and Dufwenberg, M.: Ideal equilibria in noncooperative multicriteria games. Mathematical methods of operations research 52 (2000), 65–77. [15] Yu, H., and Liu, H. M.: Robust multiple objective game theory. Journal of Optimization Theory and Applications 159 (2013), 272–280. [16] Zeleny, M.: Games with multiple payoffs. International Journal of Game Theory 4 (1975), 179–191.
Strong Consistent Pairwise Comparisons Matrix with Fuzzy Elements and Its Use in Multi-Criteria Decision Making

Jaroslav Ramík1

1 Silesian University in Opava/School of Business Administration in Karvina, University Sq. 1934/3, 73340 Karvina, Czech Republic, ramik@opf.slu.cz

Abstract. The decision making problem considered in this paper is to rank n distinct alternatives from the best to the worst, using the information given by the decision maker in the form of an n by n pairwise comparisons matrix with fuzzy elements from an abelian linearly ordered group (alo-group) over a real interval. The concepts of reciprocity and consistency of pairwise comparisons matrices with fuzzy elements have already been studied in the literature. We define stronger concepts, namely the strong reciprocity and strong consistency of pairwise comparisons matrices with fuzzy intervals, derive necessary and sufficient conditions for strong reciprocity and strong consistency and investigate some consequences for the problem of multi-criteria decision making.

Keywords: pairwise comparisons matrix, matrix with fuzzy elements, multi-criteria decision making.

JEL classification: C44
AMS classification: 90C15
1 Introduction
A decision making problem (DM problem) which forms an application background in this paper can be formulated as follows. Let X = {x1, x2, ..., xn} be a finite set of alternatives (n > 1). The decision maker's aim is to rank the alternatives from the best to the worst, using the information given by the decision maker in the form of an n × n pairwise comparisons matrix. Fuzzy sets as the elements of the pairwise comparisons matrix can be applied whenever the decision maker is not sure about the preference degree of his/her evaluation of the pairs in question. Fuzzy elements may also be taken as the aggregations of crisp pairwise comparisons of a group of decision makers in a group DM problem. Decision makers acknowledge fuzzy pairwise preference data as imprecise knowledge about regular preference information. Usually, an ordinal ranking of alternatives is required to obtain the "best" alternative(s); however, it often occurs that the decision maker is not satisfied with the ordinal ranking among alternatives, and a cardinal ranking, i.e. a rating, is then required. Earlier works that investigated the problem of finding a ranking of the given alternatives based on pairwise comparisons matrices are, e.g., [3], [4], [7], and [8]. The present paper is in some sense a continuation of [7].
2 Preliminaries
Here, it will be useful to understand fuzzy sets as special nested families of subsets of a set, see [6]. A fuzzy subset of a nonempty set X (or a fuzzy set on X) is a family {Aα }α∈[0,1] of subsets of X such that A0 = X, Aβ ⊂ Aα whenever 0 ≤ α ≤ β ≤ 1, and Aβ = ∩0≤α<β Aα whenever 0 < β ≤ 1. The membership function of A is the function µA from X into the unit interval [0, 1] defined by µA (x) = sup{α | x ∈ Aα }. Given α ∈]0, 1], the set [A]α = {x ∈ X | µA (x) ≥ α} is called the α-cut of fuzzy set A. If X is a nonempty subset of the n-dimensional Euclidean space Rn , then a fuzzy set A of X is called closed,
bounded, compact or convex if the α-cut [A]α is a closed, bounded, compact or convex subset of X for every α ∈ ]0, 1], respectively. We say that a fuzzy subset A of R* = R ∪ {−∞} ∪ {+∞} is a fuzzy interval whenever A is normal and its membership function µA satisfies the following condition: there exist a, b, c, d ∈ R*, −∞ ≤ a ≤ b ≤ c ≤ d ≤ +∞, such that

µA(t) = 0 if t < a or t > d,
µA is strictly increasing and continuous on [a, b],
µA(t) = 1 if b ≤ t ≤ c,
µA is strictly decreasing and continuous on [c, d].   (1)
A fuzzy interval A is bounded if [a, d] is a compact interval. Moreover, for α = 0 we define the zero-cut of A as [A]0 = [a, d]. A bounded fuzzy interval A is a fuzzy number if b = c. An abelian group is a set G together with an operation ⊙ (read: operation odot) that combines any two elements a, b ∈ G to form another element in G denoted by a ⊙ b, see [1]. The symbol ⊙ is a general placeholder for a concretely given operation. (G, ⊙) satisfies the following requirements known as the abelian group axioms, particularly: commutativity, associativity, there exists an identity element e ∈ G, and for each element a ∈ G there exists an element a^(−1) ∈ G called the inverse element of a. The inverse operation ÷ to ⊙ is defined for all a, b ∈ G as follows: a ÷ b = a ⊙ b^(−1). An ordered triple (G, ⊙, ≤) is said to be an abelian linearly ordered group, alo-group for short, if (G, ⊙) is a group, ≤ is a linear order on G, and for all a, b, c ∈ G: a ≤ b implies a ⊙ c ≤ b ⊙ c. If G = (G, ⊙, ≤) is an alo-group, then G is naturally equipped with the order topology induced by ≤ and G × G is equipped with the related product topology. We say that G is a continuous alo-group if
⊙ is continuous on G × G. By definition, an alo-group G is a lattice ordered group. Hence, there exists max{a, b} for each pair (a, b) ∈ G × G. Nevertheless, a nontrivial alo-group G = (G, ⊙, ≤) has neither the greatest element nor the least element. G = (G, ⊙, ≤) is divisible if for each positive integer n and each a ∈ G there exists the (n)-th root of a, denoted by a^(1/n), i.e. (a^(1/n))^(n) = a. The function ‖.‖ : G → G defined for each a ∈ G by ‖a‖ = max{a, a^(−1)} is called a G-norm. The operation d : G × G → G defined by d(a, b) = ‖a ÷ b‖ for all a, b ∈ G is called a G-distance. Next, we present well known examples of alo-groups, see also [2] or [7].

Example 1: Additive alo-group R = (]−∞, +∞[, +, ≤) is a continuous alo-group with: e = 0, a^(−1) = −a, a^(n) = a + a + ... + a = n.a.
Example 2: Multiplicative alo-group R+ = (]0, +∞[, •, ≤) is a continuous alo-group with: e = 1, a^(−1) = 1/a, a^(n) = a^n. Here, by • we denote the usual operation of multiplication.

Example 3: Fuzzy additive alo-group Ra = (]−∞, +∞[, +f, ≤), see [7], is a continuous alo-group with: a +f b = a + b − 0.5, e = 0.5, a^(−1) = 1 − a, a^(n) = n.a − (n − 1)/2.

Example 4: Fuzzy multiplicative alo-group ]0, 1[m = (]0, 1[, •f, ≤), see [2], is a continuous alo-group with:

a •f b = ab / (ab + (1 − a)(1 − b)),  e = 0.5,  a^(−1) = 1 − a.
ab , e = 0.5, a(−1) = 1 − a. ab + (1 − a)(1 − b)
PCF matrices
Notice that elements of PCF matrices may be crisp and/or fuzzy numbers, also crisp and/or fuzzy intervals, fuzzy intervals with bell-shaped membership functions, triangular fuzzy numbers, trapezoidal fuzzy numbers etc. Such fuzzy elements may be either evaluated by individual decision makers, or, they may be made up of crisp pairwise evaluations of decision makers in a group DM problem. Now, we shall define reciprocity properties for PCF matrices. Reciprocity of a PC matrix is a natural property defining the evaluation of couples of alternatives in the reverse order. First, we define reciprocity for PCF matrices, then we define the concept of the strong reciprocity and we shall investigate some relationships between the two properties. We derive necessary and sufficient conditions for a PCF matrix
to be strong reciprocal. Our approach covers the classical definitions of reciprocity presented e.g. in [5] and [7].

Let C = {c̃ij} be an n × n PCF matrix, α ∈ [0, 1]. C is said to be α-⊙-reciprocal if the following condition holds: For every i, j ∈ {1, 2, ..., n} there exist cij ∈ [c̃ij]α and cji ∈ [c̃ji]α such that cij ⊙ cji = e, or, equivalently, cji = cij^(−1). C = {c̃ij} is said to be ⊙-reciprocal if C is α-⊙-reciprocal for all α ∈ [0, 1].

Remark 1. If C = {c̃ij} is a PCF matrix with crisp elements, then c̃ij = cij with cij ∈ G for all i and j, and the above definition coincides with the classical definition of reciprocity for crisp PCF matrices: a crisp PCF matrix C = {cij} is ⊙-reciprocal if for all i and j: cji = cij^(−1). Particularly, C = {cij} is additive-reciprocal if cji = −cij for all i and j; C = {cij} is multiplicative-reciprocal if cji = 1/cij for all i and j.

Clearly, if C = {c̃ij} is an α-⊙-reciprocal PCF matrix and α, β ∈ [0, 1], β ≤ α, then C = {c̃ij} is β-⊙-reciprocal. We denote [c^L_ij(α), c^R_ij(α)] = [c̃ij]α. For a PCF matrix with non-crisp elements the reciprocity condition is a rather weak condition. That is why we require stronger conditions to define the concept of strong reciprocity.

Let C = {c̃ij} be an n × n PCF matrix, α ∈ [0, 1]. C is said to be strong α-⊙-reciprocal if the following condition holds: For every i, j ∈ {1, 2, ..., n} and for every cij ∈ [c̃ij]α there exists cji ∈ [c̃ji]α such that

cij ⊙ cji = e.   (2)
C = {c̃ij} is said to be strong ⊙-reciprocal if C is strong α-⊙-reciprocal for all α ∈ [0, 1].

Remark 2. Every strong α-⊙-reciprocal PCF matrix is α-⊙-reciprocal, but not vice versa.

Proposition 1. Let C = {c̃ij} be a PCF matrix, α ∈ [0, 1]. The following conditions are equivalent.
(i) C is strong α-⊙-reciprocal.
(ii) c^L_ij(α) ⊙ c^R_ji(α) = e and c^R_ij(α) ⊙ c^L_ji(α) = e, for all i, j ∈ {1, 2, ..., n}.
(iii) [c^L_ji(α), c^R_ji(α)] = [(c^R_ij(α))^(−1), (c^L_ij(α))^(−1)], for all i, j ∈ {1, 2, ..., n}.
Remark 3. When evaluating the fuzzy elements of a PCF matrix C = {c̃ij}, only one of the membership functions of the elements c̃ij and c̃ji, i ≠ j, should be evaluated; the other should satisfy condition (ii), or (iii). Then the PCF matrix becomes strong ⊙-reciprocal. If C = {c̃ij} is a strong α-⊙-reciprocal PCF matrix and α, β ∈ [0, 1], β < α, then C = {c̃ij} need not be strong β-⊙-reciprocal. It is, however, β-⊙-reciprocal.

Rationality and compatibility of a decision making process can be achieved by the consistency property of PCF matrices. Let G = (G, ⊙, ≤) be a divisible and continuous alo-group and let C = {cij} be a crisp PCF matrix, where cij ∈ G ⊂ R for all i, j ∈ {1, 2, ..., n}. The following definition is well known, see e.g. [2]. A crisp PCF matrix C = {cij} is ⊙-consistent if for all i, j, k ∈ {1, 2, ..., n}: cik = cij ⊙ cjk. Now, we extend the definition to non-crisp PCF matrices as follows, see also [7]. Let α ∈ [0, 1]. A PCF matrix C = {c̃ij} is said to be α-⊙-consistent if the following condition holds: For every i, j, k ∈ {1, 2, ..., n}, there exist cik ∈ [c̃ik]α, cij ∈ [c̃ij]α and cjk ∈ [c̃jk]α such that cik = cij ⊙ cjk. The matrix C is said to be ⊙-consistent if C is α-⊙-consistent for all α ∈ [0, 1]. The following properties are evident.
• If α, β ∈ [0, 1], β ≤ α, and C = {c̃ij} is α-⊙-consistent, then it is β-⊙-consistent.
• If C = {c̃ij} is α-⊙-consistent, then it is α-⊙-reciprocal.
• If C = {c̃ij} is ⊙-consistent, then it is ⊙-reciprocal.
Now, the consistency property of PCF matrices will be strengthened. Similarly to the strong α-⊙-reciprocity defined earlier, we define strong α-⊙-consistent PCF matrices and derive their properties. Let α ∈ [0, 1]. A PCF matrix C = {c̃ij} is said to be strong α-⊙-consistent if the following condition holds: For every i, j, k ∈ {1, 2, ..., n} and for every cij ∈ [c̃ij]α, there exist cik ∈ [c̃ik]α and cjk ∈ [c̃jk]α such that cik = cij ⊙ cjk. The matrix C is said to be strong ⊙-consistent if C is strong α-⊙-consistent for all α ∈ [0, 1]. The following properties are evident.
• Each strong α-⊙-consistent PCF matrix is α-⊙-consistent.
• If C = {c̃ij} is a strong α-⊙-consistent PCF matrix and α, β ∈ [0, 1], β < α, then C = {c̃ij} need not be strong β-⊙-consistent. However, it must be β-⊙-consistent.
• If C = {c̃ij} is strong α-⊙-consistent, then C = {c̃ij} is strong α-⊙-reciprocal.
• If C = {c̃ij} is strong ⊙-consistent, then C = {c̃ij} is strong ⊙-reciprocal.

Now, we formulate two necessary and sufficient conditions for strong α-⊙-consistency of a PCF matrix. This property may be useful for checking strong consistency of PCF matrices.

Proposition 2. Let α ∈ [0, 1] and let C = {c̃ij} be a PCF matrix. The following three conditions are equivalent.
(i) C = {c̃ij} is strong α-⊙-consistent.
(ii) [c^L_ik(α), c^R_ik(α)] ∩ [c^R_ij(α) ⊙ c^L_jk(α), c^L_ij(α) ⊙ c^R_jk(α)] ≠ ∅, for all i, j, k ∈ {1, 2, ..., n}.
(iii) c^L_ik(α) ≤ c^L_ij(α) ⊙ c^R_jk(α) and c^R_ik(α) ≥ c^R_ij(α) ⊙ c^L_jk(α), for all i, j, k ∈ {1, 2, ..., n}.
Remark 4. Property (iii) in Proposition 2 is useful for checking strong α-⊙-consistency. For a given PCF matrix C = {c̃ij} it can easily be checked whether the inequalities (iii) are satisfied or not.

Example 5: Consider the additive alo-group R = (R, +, ≤), i.e. ⊙ = +, see Example 1. Let the PCF matrix A = {ãij} be given as follows:

A = ( (0; 0; 0)        (1; 2; 3)        (3.5; 6; 8)
      (−3; −2; −1)     (0; 0; 0)        (2.5; 4; 5)
      (−8; −6; −3.5)   (−5; −4; −2.5)   (0; 0; 0) ).

Here, A is a 3 × 3 PCF matrix, particularly a PCF matrix with triangular fuzzy number elements and the usual "linear" membership functions. Checking the inequalities (iii), we obtain that A is a strong α-⊙-consistent PCF matrix for all α, 0 ≤ α ≤ 1, hence, A is strong ⊙-consistent.
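The check described in Example 5 can be sketched in a few hypothetical lines of Python (assuming triangular fuzzy elements (l; m; u) and the additive alo-group, i.e. ⊙ = +); a small tolerance is used because some of the inequalities (iii) hold with equality.

A = [[(0, 0, 0),      (1, 2, 3),      (3.5, 6, 8)],
     [(-3, -2, -1),   (0, 0, 0),      (2.5, 4, 5)],
     [(-8, -6, -3.5), (-5, -4, -2.5), (0, 0, 0)]]

def cut(t, alpha):
    """alpha-cut [L, R] of a triangular fuzzy number t = (l, m, u)."""
    l, m, u = t
    return l + alpha * (m - l), u - alpha * (u - m)

def strong_consistent(A, alpha, eps=1e-9):
    """Condition (iii) of Proposition 2 for the additive group (⊙ = +)."""
    n = len(A)
    L = [[cut(A[i][j], alpha)[0] for j in range(n)] for i in range(n)]
    R = [[cut(A[i][j], alpha)[1] for j in range(n)] for i in range(n)]
    return all(L[i][k] <= L[i][j] + R[j][k] + eps and R[i][k] >= R[i][j] + L[j][k] - eps
               for i in range(n) for j in range(n) for k in range(n))

print(all(strong_consistent(A, i / 10) for i in range(11)))   # True, in line with Example 5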
4 Priority vectors, inconsistency of PCF matrices
Definition of the priority vector for ranking the alternatives will be based on the optimal solution of the following optimization problem:

(P1)  α → max;   (3)

subject to

c^L_ij(α) ≤ wi ÷ wj ≤ c^R_ij(α)  for all i, j ∈ {1, 2, ..., n},   (4)

w1 ⊙ w2 ⊙ … ⊙ wn = e,   (5)

0 ≤ α ≤ 1,  wk ∈ G for all k ∈ {1, 2, ..., n}.   (6)
726
Mathematical Methods in Economics 2016
w∗ = (w1∗ , ..., wn∗ ) is called the -priority vector of C. This vector is associated with the ranking of alternatives as follows: If wi∗ > wj∗ then xi xj , where stands for ”is better then”. If optimization problem (P1) has no feasible solution, then we define g (C) = 0. The priority vector will be defined in this section. Generally, problem (P1) is a nonlinear optimization problem that can be efficiently solved e.g. by the dichotomy method, which is a sequence of optimization problems, see e.g. [5]. For instance, given α ∈ [0, 1], = +, problem (P1) can be solved as an LP problem (with variables w1 , ..., wn ). Let C = {c̃ij } be a PCF matrix, α ∈ [0, 1]. If there exists a triple of elements i, j, k ∈ {1, 2, ..., n} such that for any cij ∈ [c̃ij ]α , any cik ∈ [c̃ik ]α , and any ckj ∈ [c̃kj ]α : cik 6= cij cjk , then the PCF matrix C is α- -inconsistent. If for all α ∈ [0, 1] the PCF matrix C is α- -inconsistent, then we say that C is -inconsistent. By this definition, for a given PCF matrix C and given α ∈ [0, 1], C is either α- -consistent, or, C is α- -inconsistent. Notice, that for a PCF matrix C problem (P1) has no feasible solution, if and only if C is inconsistent, i.e. C is α- -inconsistent for all α ∈ [0, 1]. It is an important task to measure an intensity of -inconsistency of the given PCF matrix. In some cases, a PCF matrix can be “close”to some -consistent matrix, in the other cases -inconsistency can be strong, meaning that the PCF matrix can be “far”from any -consistent matrix. The -inconsistency of C will be measured by the minimum of the distance of the “ratio”matrix R R W = {wi ÷ wj } to the “left”matrix C L = {cL ij (0)} and “right”matrix C = {cij (0)}, as follows. Let w = (w1 , ..., wn ), wi ∈ G, i, j ∈ {1, ..., n}. Denote dij (C, w)
R = e if cL ij (0) ≤ wi ÷ wj ≤ cij (0), R = min{kcL ij (0) ÷ (wi ÷ wj )k, kcij (0) ÷ (wi ÷ wj )k}, otherwise.
(7)
Here, by k.k we denote the norm defined in Section 2. We define the maximum deviation to the matrix W = {wi ÷ wj } : I (C, w) = max{dij (C, w)|i, j ∈ {1, ..., n}}. (8) Now, consider the following optimization problem. (P2) I (C, w) −→ min; subject to
n K k=1
wk = e, wk ∈ G, for all k ∈ {1, 2, ..., n}.
(9) (10)
The ⊙-inconsistency index of a PCF matrix C, I⊙(C), is defined as

   I⊙(C) = inf{ I⊙(C, w) | w = (w1, ..., wn) satisfies (9) and (10) }.   (11)

If there exists a feasible solution of (P1), then the ⊙-inconsistency index of the PCF matrix C is equal to e, i.e. I⊙(C) = e. If w∗ = (w∗1, ..., w∗n) is an optimal solution of (P2), then I⊙(C) = I⊙(C, w∗). It is clear that an optimal solution of (P2) exists; the uniqueness of the optimal solution of (P2) is, however, not guaranteed. Depending on the particular operation ⊙, problem (P2) may have multiple optimal solutions, which is an unfavorable fact from the point of view of the decision maker. In this case, the decision maker should reconsider some (fuzzy) evaluations of the pairwise comparison matrix. Now, we define a priority vector also in the case of g⊙(C) = 0, i.e. if no feasible solution of (P1) exists. Let C be an ⊙-inconsistent PCF matrix. The optimal solution w∗ = (w∗1, ..., w∗n) of (P2) will be called the ⊙-priority vector of C. For a PCF matrix C = {c̃ij} exactly one of the following two cases occurs:
• Problem (P1) has a feasible solution. Then the consistency grade g⊙(C) = α for some α, 0 ≤ α ≤ 1, and I⊙(C) = e. The ⊙-priority vector of C is the optimal solution of problem (P1).
• Problem (P1) has no feasible solution. Then the consistency grade g⊙(C) = 0, C is ⊙-inconsistent, and I⊙(C) > e. The ⊙-priority vector of C is the optimal solution of problem (P2).
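To make the dichotomy approach concrete, the following sketch (an illustration added for clarity, not code from the paper) solves (P1) for the additive alo-group of Example 5, where ⊙ = +, e = 0 and, for a fixed α, the constraints (4)-(6) are linear; feasibility is checked by a linear program and α is maximised by bisection. SciPy is assumed to be available and the function and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# PCF matrix from Example 5: triangular fuzzy numbers (a, b, c), additive group.
A = [[(0, 0, 0),        (1, 2, 3),        (3.5, 6, 8)],
     [(-3, -2, -1),     (0, 0, 0),        (2.5, 4, 5)],
     [(-8, -6, -3.5),   (-5, -4, -2.5),   (0, 0, 0)]]

def alpha_cut(tri, alpha):
    a, b, c = tri
    return a + alpha * (b - a), c - alpha * (c - b)

def feasible(alpha, A):
    """LP feasibility check of constraints (4)-(6) for a fixed alpha (additive case)."""
    n = len(A)
    A_ub, b_ub = [], []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            lo, hi = alpha_cut(A[i][j], alpha)
            row = np.zeros(n); row[i], row[j] = 1.0, -1.0
            A_ub.append(row);  b_ub.append(hi)     # w_i - w_j <= c^R_ij(alpha)
            A_ub.append(-row); b_ub.append(-lo)    # w_i - w_j >= c^L_ij(alpha)
    res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.ones((1, n)), b_eq=[0.0],
                  bounds=[(None, None)] * n)
    return (res.status == 0), (res.x if res.status == 0 else None)

# Dichotomy (bisection) on alpha: the consistency grade is the largest feasible alpha.
lo, hi, w_best = 0.0, 1.0, None
ok, w0 = feasible(0.0, A)
if ok:
    w_best = w0
    for _ in range(30):
        mid = 0.5 * (lo + hi)
        ok, w = feasible(mid, A)
        if ok:
            lo, w_best = mid, w
        else:
            hi = mid
print("consistency grade ~", round(lo, 3), "priority vector", w_best)
```

For the strongly consistent matrix of Example 5 the bisection terminates at a grade close to 1, in line with the remark above.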
Example 6: Let X = {x1, x2, x3} be a set of alternatives and let F = {f̃ij} be a PCF matrix on the fuzzy multiplicative alo-group ]0, 1[m = (]0, 1[, •f, ≤), with

   a •f b = ab / (ab + (1 − a)(1 − b)),   e = 0.5,   a^(−1) = 1 − a,   ‖a‖ = max{a, 1 − a}.   (12)

The fuzzy multiplicative alo-group ]0, 1[m is divisible and continuous. For more details and properties, see Example 4 or [7]. Let

   F = [ (0.5; 0.5; 0.5)    (0.6; 0.7; 0.8)     (0.75; 0.8; 0.9)
         (0.2; 0.3; 0.4)    (0.5; 0.5; 0.5)     (0.7; 0.75; 0.8)
         (0.1; 0.2; 0.25)   (0.2; 0.25; 0.3)    (0.5; 0.5; 0.5)  ].

Here, F is a 3 × 3 PCF matrix, particularly a PCF matrix with elements on ]0, 1[. F is a •f-reciprocal PCF matrix (noncrisp), and the elements of F are triangular fuzzy numbers. There is an optimal solution of the corresponding problem (P1) with C = F; the consistency grade is g•f(F) = 0.6. The •f-priority vector w∗ of F is w∗ = (0.586, 0.302, 0.112), hence x1 ≻ x2 ≻ x3. The inconsistency index is I•f(F) = e = 0.5.
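The following small check (an illustration with hypothetical naming, not taken from [7]) evaluates the operation (12) numerically and verifies the •f-reciprocity of the matrix F, i.e. that f̃ji = (1 − c, 1 − b, 1 − a) whenever f̃ij = (a, b, c).

```python
def f_mult(a, b):
    """The operation a •f b from (12) on ]0, 1[."""
    return a * b / (a * b + (1 - a) * (1 - b))

assert abs(f_mult(0.7, 0.5) - 0.7) < 1e-12       # e = 0.5 is the identity element
assert abs(f_mult(0.7, 1 - 0.7) - 0.5) < 1e-12   # a^(-1) = 1 - a is the inverse

F = [[(0.5, 0.5, 0.5),  (0.6, 0.7, 0.8),   (0.75, 0.8, 0.9)],
     [(0.2, 0.3, 0.4),  (0.5, 0.5, 0.5),   (0.7, 0.75, 0.8)],
     [(0.1, 0.2, 0.25), (0.2, 0.25, 0.3),  (0.5, 0.5, 0.5)]]

# •f-reciprocity: if f_ij = (a, b, c), then f_ji = (1 - c, 1 - b, 1 - a)
for i in range(3):
    for j in range(3):
        a, b, c = F[i][j]
        assert all(abs(x - y) < 1e-12 for x, y in zip(F[j][i], (1 - c, 1 - b, 1 - a)))
```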
5
Conclusion
This paper deals with pairwise comparison matrices with fuzzy elements. Fuzzy elements of the pairwise comparison matrix are usually applied whenever the decision maker is not sure about the value of his/her evaluation of the relative importance of elements in question. In comparison with PC matrices investigated in the literature, here we investigate pairwise comparison matrices with elements from abelian linearly ordered group (alo-group) over a real interval (PCF matrices). We generalize the concept of reciprocity and consistency of pairwise comparison matrices with triangular fuzzy numbers to bounded fuzzy intervals of alo-groups (trapezoidal fuzzy numbers). We also define the concept of priority vector which is an extension of the well known concept in crisp case used here for ranking the alternatives. Such an approach allows for unifying the additive, multiplicative and also fuzzy approaches known from the literature. Then we solve the problem of measuring inconsistency of a PCF matrix C by defining corresponding indexes. Six numerical examples have been presented to illustrate the concepts and derived properties.
Acknowledgements Supported by the grant No. 14-02424S of the Czech Grant Agency.
References [1] Bourbaki, N.: Algebra II. Springer Verlag, Heidelberg-New York-Berlin, 1998. [2] Cavallo, B., and D’Apuzzo, L.: A general unified framework for pairwise comparison matrices in multicriteria methods. International Journal of Intelligent Systems 24 4 (2009), 377–398. [3] Leung, L.C., and Cao, D.: On consistency and ranking of alternatives in fuzzy AHP. European Journal of Operational Research 124 (2000), 102–113. [4] Mikhailov, L.: Deriving priorities from fuzzy pairwise comparison judgments. Fuzzy Sets and Systems 134 (2003), 365–385. [5] Ohnishi, S., Dubois, D. et al.: A fuzzy constraint based approach to the AHP. In: Uncertainty and Intelligent Inf. Syst., World Sci., Singapore, 2008, 217–228. [6] Ramik, J., and Vlach, M.: Generalized concavity in optimization and decision making. Kluwer Academic Publishers, Boston-Dordrecht-London 2001. [7] Ramik, J.: Isomorphisms between fuzzy pairwise comparison matrices. Fuzzy Optim. Decis. Making 14 (2014), 199–209. [8] Xu, Z.S., and Chen, J.: Some models for deriving the priority weights from interval fuzzy preference relations. European Journal of Operational Research 184 (2008), 266–280.
The housing market and credit channel of monetary policy Vlastimil Reichel1, Miroslav Hloušek2 Abstract. This paper is focused on a credit channel of monetary policy in the housing market. We test the relevance of the credit channel in the Czech housing sector in period 2006:Q1 - 2015:Q4. We use two types of structural VAR models to identify monetary policy shocks and to investigate if they have impacts on behaviour of the real variables. These VAR models are estimated using seven variables: real GDP, consumer price inflation, real house prices, volumes of mortgages, and total loans to households, mortgage interest rate and money market interest rate. Point estimates of impulse responses show that monetary policy shock has a (i) negative effect on the real GDP and mortgages of households; (ii) temporary positive effect on the real house price and (iii) positive effect on the spread between the mortgage rate and the money market interest rate. These results indicate that the monetary policy is able to significantly affect the housing market in the Czech economy. Keywords: Housing sector, VAR model, credit channel. JEL classification: E51, E52, C22 AMS classification: 91B84
1
Introduction
Housing sector plays an important role in the business cycle, because changes in house prices can have important wealth effects on consumption and residential investment, which is the most volatile component of aggregate demand (see Bernanke and Gertler [1]). The housing market in the Czech economy was relatively small, but it has been expanding rapidly. The size of mortgage market in 2002 was 4.18 % of GDP, but it has increased to 21.72 % of GDP in 2015 (See Czech National Bank [3]). One of the strong impulses for development of the housing market in the Czech Republic was a rent deregulation law implemented in 2006. A larger share of the housing sector allows to observe typical relationships among macroeconomic variables. These relationships are called credit channel of monetary policy. Iacoviello and Minetti [7] argue that the relevance and form of the credit channel depends on structural features of the housing finance system of each country. Typically, countries with more developed market have stronger connections between monetary policy and the housing sector. Concretely, monetary policy can affect the housing market through behaviour of households and mortgage banks. To identify the credit channel in the Czech housing market, we use two types of structural VAR models. These models are estimated and studied by impulse response functions. The results show that there is evidence of credit channel in the Czech economy and thus the monetary policy is able to significantly affect the housing market variables.
2
Methodology and Data
2.1
Data
We use seven macroeconomic variables for the identification of the VAR models. They are: real GDP, consumer price inflation, real house prices, volume of mortgages and total loans to households, mortgage interest rate and money market interest rate. The data are obtained from the Czech Statistical Office (CZSO) and the Czech National Bank (CNB) databases and cover the period 2006:Q1 – 2015:Q4. A more detailed description, data sources and transformation of the variables are given in Table 1.

1 Masaryk University, Lipová 41a, Brno, Czech Republic, reichel.v@mail.muni.cz
2 Masaryk University, Lipová 41a, Brno, Czech Republic, hlousek@econ.muni.cz

Variable   Description                                Data Source         Transformation
Y          real GDP                                   CZSO                log, ∆
CPI        Consumer Price Index                       CZSO                log, ∆
HP         Offering price of flats deflated by CPI    CZSO                log, ∆
HL         Housing loans (mortgages) to households    CNB                 log, ∆
TL         Total loans to households                  CNB                 log, ∆
R          3M PRIBOR                                  CNB                 −
M          Mortgage interest rate                     CNB                 −
SP         Spread between M and R                     own constr., CNB    −

Table 1 Data for VAR models

We transform the following variables: Y, CPI, HP, HL, TL into stationary form using first differences (∆) of logarithms. The other variables R, M and SP are assumed stationary and are used without transformation.
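As a brief illustration of the transformation column of Table 1, the following sketch (a hypothetical pandas workflow, not the authors' code) applies first differences of logarithms to Y, CPI, HP, HL and TL and keeps R, M and SP in levels; the file name and column labels are assumptions.

```python
import numpy as np
import pandas as pd

def log_diff(series: pd.Series) -> pd.Series:
    """First difference of the natural logarithm of a quarterly series."""
    return np.log(series).diff()

# hypothetical data frame with the quarterly levels described in Table 1
# raw = pd.read_csv("czech_housing_data.csv", index_col=0, parse_dates=True)
# model_data = pd.concat(
#     [raw[["Y", "CPI", "HP", "HL", "TL"]].apply(log_diff),
#      raw[["R", "M", "SP"]]], axis=1).dropna()
```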
2.2 VAR model
Two structural VAR models are used for the identification of monetary policy shocks. With the variables of interest collected in the k-dimensional vector Yt, the reduced-form VAR model can be written as:

   Yt = A1·Yt−1 + A2·Yt−2 + ... + Ap·Yt−p + ut,   (1)

where Yt−p is the vector of variables lagged by p periods, Ai is a time-invariant k × k matrix and ut is a k × 1 vector of error terms. As the error terms are allowed to be correlated, the reduced-form model is transformed into a structural model. Pre-multiplying both sides of equation (1) by the (k × k) matrix A0 yields the structural form:

   A0·Yt = A0·A1·Yt−1 + A0·A2·Yt−2 + ... + A0·Ap·Yt−p + B·εt.   (2)

The relationship between the structural disturbances εt and the reduced-form disturbances ut is described by the equation of the AB model (see Lutkepohl [8] or Lutkepohl and Kratzig [9]):

   A0·ut = B·εt,   (3)

where A0 also describes the contemporaneous relation among the endogenous variables and B is a (k × k) matrix. In the structural model, the disturbances are assumed to be uncorrelated with each other. In other words, the covariance matrix of the structural disturbances Σε is diagonal. The model described by equation (2) cannot be identified without restrictions. Therefore, first the matrix B is restricted to a (k × k) diagonal matrix. As a result, the diagonal elements of matrix B represent estimated standard deviations of the structural shocks. The next step is to impose restrictions on matrix A0. The matrix A0 is a lower triangular matrix with additional restrictions that are obtained from statistical properties of the data. The starting point of the process is the computation of partial correlations among the variables in the model and the subsequent elimination of statistically insignificant relations. This methodology was used by Fragetta and Melina [5], who analyzed the influence of fiscal policy shocks in the US economy. The methodology is based on well defined statistical rules and allows to derive many possible structural VAR models that are later evaluated by statistical information criteria.
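The first step of this procedure can be sketched as follows (an illustrative snippet, not code from the paper): the reduced-form VAR (1) is estimated by OLS and the partial correlations of the residuals ut are obtained here from the inverse of the residual covariance matrix, which is one standard way to compute them; the data frame and function names are assumptions.

```python
import numpy as np
from statsmodels.tsa.api import VAR

def residual_partial_correlations(data, lags=4):
    """data: T x k array/DataFrame of the VAR variables; returns a k x k matrix."""
    resid = VAR(data).fit(lags).resid            # reduced-form innovations u_t
    prec = np.linalg.inv(np.cov(np.asarray(resid), rowvar=False))
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)               # standard partial-correlation formula
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# e.g. pcorr1 = residual_partial_correlations(svar1_data)   # counterpart of Table 2
```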
2.3
Credit channel
The credit channel can take many forms. It theoretically describes how monetary policy can affect the amount of credit that banks provide to firms and consumers, which in turn affects the real part of the economy. Our idea (inspired by Lyziak et al. [10]) of the monetary transmission mechanism is captured in Figure 1.

Figure 1 Monetary transmission mechanism (tightening of monetary policy and expectations of future official interest rates lead to an increase of interest rates and a decrease of inflation expectations; this lowers demand for loans, housing and total loans of households, real estate prices, investment and consumption, and hence aggregate demand and inflation)

Monetary policy restriction, represented by a rise in short-term interest rates, is followed by an increase of lending and deposit rates in commercial and mortgage banks. If we assume price rigidities, it causes a rise in the real interest rates which affects consumption, savings and investment decisions. An increase in interest rates has an impact on the behaviour of households, whose willingness to borrow declines. This leads to a drop in the growth of total household debt. This is a type of credit channel that affects the economy by altering the amount of credit of households and firms. The change in households' behaviour may affect the real estate market, because lower demand for loans also means lower demand for mortgage loans. It negatively influences demand for real estate and causes a decline of the real estate price growth. To find evidence of the monetary transmission mechanism and the credit channel in the Czech housing finance system, we are inspired by the work of Iacoviello and Minetti [7] and use two structural VAR models. The first SVAR model includes six variables: GDP growth rate, CPI inflation, money market interest rate, growth rate of the real house price index, growth rate of households' housing loans and growth rate of households' total loans. This VAR model (with vector Yt = [∆Y, π, R, ∆HP, ∆HL, ∆TL]) is denoted by Iacoviello and Minetti [7] as an uninformative model for detecting a credit channel. The model is based on the idea that an increase in the money market interest rate leads to a fall in loan demand, implying a reduction in households' loans. The idea is consistent with the traditional monetary transmission mechanism. The second VAR includes: GDP growth rate, CPI inflation, money market interest rate, growth rate of real house prices and the spread between a mortgage interest rate and a money market interest rate. Iacoviello and Minetti [7] assume that this VAR model (with vector Yt = [∆Y, π, R, ∆HP, SP]) should capture the increase in the external finance premium associated with a credit channel.
3 Results
3.1 Identifications of SVAR models
We identify the monetary policy shocks in both types of VAR models following Iacoviello and Minetti [7] with algorithm proposed by Fragetta and Melina [5]. First, the VAR models described in equation 1 are estimated by OLS method. For both VAR models the lag length of four quarters was determined according to Akaike and Hannan-Quinn information criteria. Then vector of residual ut for each variable is
acquired and partial correlations among the series of innovations are computed. These partial correlations are quoted in Table 2.

VAR1     uY      uπ      uR      uHP     uHL     uTL
uY       1
uπ      -0.11    1
uR      -0.04    0.63    1
uHP      0.04   -0.32   -0.27    1
uHL     -0.20    0.39    0.45    0.11    1
uTL     -0.40    0.35    0.43    0.14    0.96    1

VAR2     uY      uπ      uR      uHP     uSP
uY       1
uπ      -0.13    1
uR       0.14    0.18    1
uHP      0.51    0.09    0.34    1
uSP     -0.07   -0.24   -0.95   -0.34    1

Table 2 Estimated partial correlations of the series innovations
A0 of SVAR1:
    1       0       0       0       0       0
   -a21     1       0       0       0       0
    0      -a32     1       0       0       0
    0      -a42    -a43     1       0       0
   -a51    -a52    -a53    -a54     1       0
   -a61    -a62    -a63    -a64    -a65     1

A0 of SVAR2:
    1       0       0       0       0
   -a21     1       0       0       0
   -a31    -a32     1       0       0
   -a41     0      -a43     1       0
    0      -a52    -a53    -a54     1

Table 3 Identification of matrix A0 in SVAR models
The next step is to determine the SVAR model with AB specifications. Many authors use a lower triangular structure of matrix A0 based on a Cholesky decomposition. In this paper we use a data oriented method (see Fragetta and Melina [5]) based on the analysis of partial correlations among the variables' residuals. In other words, if a partial correlation (in Table 2) is statistically insignificant, i.e. lower than 0.11 in absolute value, we assume that the matrix element corresponding to this correlation is equal to zero. The resulting matrices are shown in Table 3.
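The restriction step can be illustrated as follows (a sketch with hypothetical naming, not the authors' code): elements of the lower triangular A0 whose residual partial correlation is below 0.11 in absolute value are set to zero, the others remain free. Applied to the SVAR2 values of Table 2, this reproduces the zero pattern of Table 3.

```python
import numpy as np

def a0_pattern(pcorr, threshold=0.11):
    """Boolean mask of freely estimated below-diagonal elements of A0."""
    k = pcorr.shape[0]
    free = np.zeros((k, k), dtype=bool)
    for i in range(k):
        for j in range(i):
            free[i, j] = abs(pcorr[i, j]) >= threshold
    return free

# SVAR2 partial correlations from Table 2 (order: uY, uπ, uR, uHP, uSP):
pcorr2 = np.array([[ 1.00, -0.13,  0.14,  0.51, -0.07],
                   [-0.13,  1.00,  0.18,  0.09, -0.24],
                   [ 0.14,  0.18,  1.00,  0.34, -0.95],
                   [ 0.51,  0.09,  0.34,  1.00, -0.34],
                   [-0.07, -0.24, -0.95, -0.34,  1.00]])
print(a0_pattern(pcorr2))   # False entries correspond to the zeros in Table 3
```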
3.2 Impulse response functions: SVAR1
For evidence of the monetary policy transmission mechanism we simulate the reaction to the identified monetary policy shock (an increase in the nominal interest rate by one standard deviation). Figure 2 shows the responses of the variables to a monetary contraction, along with percentile confidence intervals calculated according to Hall [6]. The impulse responses show that after the shock the GDP growth at first increases, but in the second quarter it decreases and reaches its minimum in the fifth quarter. Inflation also increases on impact but then slows down and after four periods becomes negative. A slightly similar reaction was observed by Iacoviello and Minetti [7] for the economies of Germany, Norway and Finland, by Lyziak et al. [10] for Poland, and by Ciccarelli, Peydro and Maddaloni [2] for the euro area. The initial increase of inflation is usually interpreted as the price puzzle problem. A monetary contraction leads to a significant decline in housing and total bank loans. From the figure it is clear that tight monetary policy has similar effects on housing and total loans of households. The change of households' behaviour (through a change in demand for real estate) affects the real estate market. Real house prices react significantly with the expected negative sign, although with some lag. The trough of the decline is in the seventh quarter, almost two years after the monetary restriction. The relatively delayed response of this variable to the monetary policy shock may be caused by: (i) the still evolving real estate market and (ii) the boom and bust in the Czech economy during the estimated period.
3.3 Impulse response functions: SVAR2
Figure 3 shows the impulse responses of the GDP growth, inflation, growth rate of real house prices and the spread between the mortgage rate on new housing loans and the interbank interest rate to the monetary policy shock.
(Figure 2 panels: (a) responses of ∆GDP, (b) responses of π, (c) responses of R, (d) responses of ∆HP, (e) responses of ∆HL, (f) responses of ∆TL)
Figure 2 SVAR type 1 - Impulse responses functions to monetary restriction shock
(Figure 3 panels: (a) responses of ∆GDP, (b) responses of π, (c) responses of R, (d) responses of ∆HP, (e) responses of SP)
Figure 3 SVAR type 2 - Impulse responses functions to monetary restriction shock
As in the first model, we can see a first positive and, after four quarters, negative and statistically significant reaction of the GDP growth. Inflation and the real house price growth react along the same lines. The initial increase in both variables can be caused by the time the economy needs to adapt to the increase in the interest rate, or by some type of price puzzle effect. The spread SP decreases significantly and after five quarters it starts to increase. The initial decline may be caused by the long-term fixation of the mortgage rate (see Iacoviello and Minetti [7], p. 8). The increase after five quarters can be interpreted as an adaptation of the mortgage interest rate. In other words, the mortgage banks adjust the mortgage rate to the higher interbank interest rate. The significant increase of the spread after the monetary contraction points at the presence of a broad credit channel.
4
Conclusion
In this paper we tested the importance of the credit channel in the housing sector of the Czech economy in the period 2006:Q1 – 2015:Q4. We used two structural VAR models for the identification of monetary policy shocks and studied the reaction of macroeconomic and housing sector variables to an increase in the interest rate. The impulse responses showed that monetary policy can significantly affect the GDP growth, inflation, households' loans, real house prices and the spread between the mortgage interest rate and the benchmark interest rate. The presence of the credit channel in the Czech housing market thus demonstrates that monetary policy can substantially influence an important part of the economy. The existence of such a mechanism should be taken into consideration in the formation of monetary policy and in building economic models.
Acknowledgements This paper was supported by the specific research project no. MUNI/A/1049/2015 at Masaryk University.
References [1] Bernanke, B., and Gertler, M.: Inside the black box: The credit channel of monetary transmission. Journal of Economic Perspectives, 1995. [2] Ciccarelli, M., Peydro, J.L. and Maddaloni, A.: Trusting the bankers: a new look at the credit channel of monetary policy. Working Paper Series 1228, European Central Bank, 2010. [3] Czech National Bank.: CNB ARAD, 2016. http://www.cnb.cz/ [4] Czech Statistical Office.: CZSO, 2016. http://www.czso.cz/ [5] Fragetta, M. and Melina.G.: The Effects of Fiscal Shocks in SVAR Models: A Graphical Modelling Approach, Birkbeck Working Papers in Economics and Finance 1006. Department of Economics, Mathematics and Statistics, Birkbeck, 2010. [6] Hall, P.: The Bootstrap and Edgeworth Expansion. Springer, New York, 1992. [7] Iacoviello, M. and Minetti, R.: The Credit Channel of Monetary Policy: Evidence from the Housing Market, Journal of Macroeconomics 2008. [8] Lutkepohl, H.: New introduction to multiple time series analysis. Springer, Berlin, 2006. [9] Lutkepohl, H. and Kratzig M.: Applied time series econometrics. Cambridge University Press, Cambridge, 2004. [10] Lyziak T., Demchuk O., Przystupa J., Sznajderska A. and Wrobel E.: Monetary policy transmission mechanism in Poland. What do we know in 2011?. National Bank of Poland Working Papers 116, National Bank of Poland, Economic Institute, 2012.
Computing Fuzzy Variances of Evaluations of Alternatives in Fuzzy Decision Matrices Pavla Rotterová1, Ondřej Pavlačka2 Abstract. Decision matrices are a common tool for solving decision-making problems under risk. Elements of the matrix express evaluations if a decisionmaker chooses the particular alternative and the particular state of the world occurs. In practice, the states of the world as well as the decision-maker’s evaluations are often described only vaguely. In the literature, there are considered fuzzy decision matrices, where alternatives are compared on the basis of the fuzzy expected values and fuzzy variances of their evaluations. However, the fuzzy variances of alternatives are calculated without considering dependencies between fuzzy evaluations under particular fuzzy states of the world and the fuzzy expected evaluation. So the final ranking of alternatives can be wrong. In the paper, we propose a proper way of how should the fuzzy variances of evaluations of alternatives be obtained. The problem is illustrated by the example of stocks comparisons. Keywords: decision matrices, decision-making under risk, fuzzy states of the world, fuzzy expected values, fuzzy variances. JEL classification: C44 AMS classification: 90B50
1
Introduction
Decision matrices are one of the decision-making models for ordering alternatives. Elements of the matrices express evaluations of alternatives if the decision-maker chooses the particular alternative and the particular state of the world occurs. In decision-making models, there are usually considered crisp (precisely defined) states of the world, like e.g. "economy will grow more than 5 % next year", and crisp decision-maker's evaluations, like e.g. 0. Crisp decision matrices are considered e.g. in [1], [2], [6] and [8]. However, in practice, states of the world as well as the decision-maker's evaluations are sometimes described vaguely, like e.g. "economy will grow next year" or "approximately zero", respectively. Fuzzy decision matrices are considered e.g. in [7]. Talašová and Pavlačka [7] proposed the ranking of alternatives in a fuzzy decision matrix based on fuzzy expected evaluations of the alternatives and fuzzy variances of the alternatives' evaluations. However, they suggested an approach to the computation of fuzzy variances without considering the dependencies between the fuzzy evaluations under particular fuzzy states of the world and the fuzzy expected evaluation. Finally, Talašová and Pavlačka [7] selected the best alternative according to a decision-making rule of the expected value and the variance. Based on this rule, a decision-maker selects the alternative that maximises his/her expected evaluation of the alternative and minimises the variance of his/her evaluation of the alternative.

1 Palacký University Olomouc, Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 771 46 Olomouc, Czech Republic, pavla.rotterova01@upol.cz
2 Palacký University Olomouc, Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 771 46 Olomouc, Czech Republic, ondrej.pavlacka@upol.cz

The aim of this paper is to propose the proper way of the fuzzy variance computation. The paper is organised as follows. Section 2 is devoted to the extension of a crisp probability space to the case of fuzzy events. In section 3, a decision matrix with fuzzy states of the world and with fuzzy evaluations under particular states of the world is described. The approach of the fuzzy variance computation proposed
in [7] is also described here, and the proper way of the fuzzy variance computation is introduced. The differences between these two approaches are illustrated by the example in section 4. Some concluding remarks are given in the Conclusion.
2
Fuzzy events and their probabilities
Let a probability space (Ω, A, P) be given, where Ω is a non-empty universal set of all elementary events, A denotes a σ-algebra of subsets of Ω, i.e. A is a set of all considered random events, and P denotes a probability measure. In this section, the σ-algebra A is extended to the case of fuzzy events and crisp probabilities of fuzzy events are defined. A fuzzy set A on a non-empty set Ω is determined by its membership function µA : Ω → [0, 1]. F(Ω) denotes the family of all fuzzy sets on Ω. Let us note that any crisp set A ⊆ Ω can be viewed as a fuzzy set of a special kind; the membership function µA coincides in such a case with the characteristic function χA of A. In accordance with Zadeh [9], a fuzzy event A ∈ F(Ω) is a fuzzy set whose α-cuts Aα := {ω ∈ Ω | µA(ω) ≥ α} are random events, i.e. Aα ∈ A for all α ∈ [0, 1]. The family of all such fuzzy events, denoted by AF, forms a σ-algebra of fuzzy events, which was shown in [4]. Let us note that in the decision matrix described further, fuzzy events represent fuzzy states of the world. Zadeh [9] extended the probability measure P to the case of fuzzy events. This extended measure is denoted by PZ. He defined the probability PZ(A) of a fuzzy event A as follows:

   PZ(A) := E(µA) = ∫Ω µA(ω) dP.   (1)
The mapping PZ possesses the important properties of a probability measure, see e.g. [5] and [9]. However, there is a problem with interpretation of the probability measure PZ as was discussed in [5].
3
Fuzzy variance calculations in a fuzzy decision matrix
In this section, we introduce the fuzzification of the decision matrix. Then, we recall the formulas for calculations of the fuzzy expected value and the fuzzy variance of the decision-maker's evaluations proposed by Talašová and Pavlačka [7]. Finally, we show the proper way of the computation of the fuzzy variance. A vaguely defined quantity can be described by a fuzzy number. A fuzzy number A is a fuzzy set on the set of all real numbers R such that its core Core A := {ω ∈ Ω | µA(ω) = 1} is non-empty, whose α-cuts are closed intervals for any α ∈ (0, 1], and whose support Supp A := {ω ∈ Ω | µA(ω) > 0} is bounded. The family of all fuzzy numbers on R is denoted by FN(R). In the paper, we consider an α-cut of any fuzzy number A expressed by Aα = [A^L_α, A^U_α] for any α ∈ (0, 1].
         S1        S2        ...   Sm        Characteristics
         PZ(S1)    PZ(S2)    ...   PZ(Sm)
x1       H1,1      H1,2      ...   H1,m      (F) EH1    (F) var H1
x2       H2,1      H2,2      ...   H2,m      (F) EH2    (F) var H2
...      ...       ...       ...   ...       ...        ...
xn       Hn,1      Hn,2      ...   Hn,m      (F) EHn    (F) var Hn

Table 1 Fuzzy decision matrix

Now, let us introduce a fuzzy extension of the decision matrix, where the states of the world and also the evaluations under particular states of the world are modelled by fuzzy sets, given in Table 1. In the matrix, x1, x2, ..., xn represent alternatives of a decision-maker, S1, S2, ..., Sm fuzzy states of the world that form a fuzzy partition of a non-empty universal set Ω, i.e. Σ_{j=1}^m µSj(ω) = 1 for any ω ∈ Ω, PZ(S1), PZ(S2), ..., PZ(Sm) stand for the probabilities of the fuzzy states of the world S1, S2, ..., Sm defined by (1), and for any i ∈ {1, 2, ..., n} and j ∈ {1, 2, ..., m}, Hi,j means the fuzzy evaluation if
the decision-maker chooses the alternative xi and the fuzzy state of the world Sj occurs. Moreover, we assume that the probability distribution on Ω is known. Further, we consider that the fuzzy elements in the matrix are expressed by fuzzy numbers. Let (Hi,j)α = [(Hi,j)^L_α, (Hi,j)^U_α] for any α ∈ (0, 1]. The fuzzy expected evaluations, denoted by (F) EH1, (F) EH2, ..., (F) EHn, are calculated as

   (F) EHi = Σ_{j=1}^m PZ(Sj) · Hi,j,   (2)

i.e. for any i ∈ {1, 2, ..., n} and any α ∈ (0, 1], the α-cut of the fuzzy expected evaluation, (F) EHiα = [EH^L_iα, EH^U_iα], is calculated as follows:

   EH^L_iα = Σ_{j=1}^m PZ(Sj) · (Hi,j)^L_α,
   EH^U_iα = Σ_{j=1}^m PZ(Sj) · (Hi,j)^U_α.
Talašová and Pavlačka [7] considered fuzzy variances, denoted by (F) vart H1, (F) vart H2, ..., (F) vart Hn, of the alternatives' evaluations defined for any i ∈ {1, 2, ..., n} as

   (F) vart Hi = Σ_{j=1}^m PZ(Sj) · (Hi,j − (F) EHi)^2.   (3)

For any α ∈ (0, 1], the α-cut (F) vart Hiα = [vart H^L_iα, vart H^U_iα] is computed as follows:

   vart H^L_iα = Σ_{j=1}^m PZ(Sj) · ej,

where
   ej = 0, if 0 ∈ [(Hi,j)^L_α − EH^U_iα, (Hi,j)^U_α − EH^L_iα],
   ej = min{ ((Hi,j)^{y1}_α − EH^{y2}_iα)^2 | y1 = L, U, y2 = L, U }, otherwise,

and
   vart H^U_iα = Σ_{j=1}^m PZ(Sj) · max{ ((Hi,j)^{y1}_α − EH^{y2}_iα)^2 | y1 = L, U, y2 = L, U }.
However, this approach, based on the concept of constrained fuzzy arithmetic (see e.g. [3]), does not take into account the dependencies between the fuzzy evaluations under particular fuzzy states of the world Hi,j and the fuzzy expected value (F) EHi. Let us propose a proper way of the calculation of the α-cut of the fuzzy variance, denoted by (F) vars Hi, for any i ∈ {1, 2, ..., n}. Let us denote for any α ∈ (0, 1]

   (F) vars Hiα = [vars H^L_iα, vars H^U_iα].

The lower and upper bounds of (F) vars Hiα are defined as follows:

   vars H^L_iα = min{ Σ_{j=1}^m PZ(Sj) · (hi,j − Σ_{k=1}^m PZ(Sk) · hi,k)^2 | hi,j ∈ (Hi,j)α, j = 1, 2, ..., m },   (4)

   vars H^U_iα = max{ Σ_{j=1}^m PZ(Sj) · (hi,j − Σ_{k=1}^m PZ(Sk) · hi,k)^2 | hi,j ∈ (Hi,j)α, j = 1, 2, ..., m }.   (5)
In the formulas (4) and (5), the resultant fuzzy expected value is not employed in contrast to (3). Instead of this, the way how the fuzzy expected value is obtained, is taken into account. Thus, this approach takes into consideration the relationships among variables.
The resultant fuzzy variances obtained by the formulas (4) and (5), are subsets of fuzzy variances obtained by formula (3). So for each α ∈ (0, 1], (F) vars Hiα ⊆ (F) vart Hiα . Using the formulas (4) and (5), the uncertainty of the fuzzy variance is not falsely increased as in the case of the variance computation by the formula (3). Moreover, alternatives can be differently ordered if fuzzy variances are calculated according to the formulas (4) and (5) instead of (3). This is illustrated by the example in the following section.
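For illustration, the bounds (4) and (5) can be computed numerically as follows (a sketch with hypothetical function names, not the authors' implementation): the objective is a convex quadratic in the vector (hi,1, ..., hi,m), so its minimum over the box of α-cuts can be found by a bounded local solver, while its maximum is attained at a vertex of the box and can be obtained by enumerating the vertices. The example call uses the 1-cuts (cores) of the stock A yields and the probabilities from Table 2 in the next section.

```python
import itertools
import numpy as np
from scipy.optimize import minimize

def vars_alpha_cut(probs, lows, highs):
    """probs: P_Z(S_j); lows/highs: alpha-cut endpoints of H_{i,j}; returns (L, U)."""
    p = np.asarray(probs, float)
    lo, hi = np.asarray(lows, float), np.asarray(highs, float)

    def f(h):
        # objective of (4)-(5): sum_j P_j * (h_j - sum_k P_k h_k)^2
        return float(np.sum(p * (h - np.dot(p, h)) ** 2))

    res = minimize(f, 0.5 * (lo + hi), bounds=list(zip(lo, hi)), method="L-BFGS-B")
    lower = res.fun                                            # convex -> global minimum
    upper = max(f(np.array(v)) for v in itertools.product(*zip(lo, hi)))  # vertex enumeration
    return lower, upper

# Stock A at alpha = 1 (cores of the yields in Table 2), probabilities as in Table 2:
probs = [0.0067, 0.1146, 0.2579, 0.4596, 0.1612]
print(vars_alpha_cut(probs, [-37, -17, -3, 12, 27], [-31, -10, 3, 17, 34]))
```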
4
Example of the fuzzy decision matrix of stock comparisons
Let us consider a problem of comparing two stocks, A and B, with respect to their future yields. The considered states of the world are great economic drop, economic drop, economic stagnation, economic growth and great economic growth, denoted by GD, D, S, G, and GG, respectively. They are expressed by trapezoidal fuzzy numbers that form a fuzzy scale shown in Figure 1. A trapezoidal fuzzy number is determined by its significant values a1, a2, a3, and a4, where a1 ≤ a2 ≤ a3 ≤ a4. The membership function of any trapezoidal fuzzy number A ∈ FN(R) is defined for any x ∈ R as follows:

   µA(x) = (x − a1)/(a2 − a1), if x ∈ [a1, a2),
   µA(x) = 1, if x ∈ [a2, a3],
   µA(x) = (a4 − x)/(a4 − a3), if x ∈ (a3, a4],
   µA(x) = 0, otherwise.
A trapezoidal fuzzy number A determined by its significant values is denoted by (a1 , a2 , a3 , a4 ). Let us assume that the economy states are given by the development of the gross domestic product, abbreviated as GDP. We also assume that next year prediction of GDP development shows a normally distributed growth of GDP with parameters µ = 1.5 and σ = 2.
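The probabilities PZ(Sj) used in Table 2 below can be reproduced from definition (1) by integrating the trapezoidal membership functions against the assumed N(µ = 1.5, σ = 2) density of the GDP growth; the following sketch (an added illustration with hypothetical function names, SciPy assumed) shows one way to do it.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

GDP = norm(loc=1.5, scale=2.0)   # assumed next-year GDP growth distribution

def prob_of_fuzzy_state(a1, a2, a3, a4, dist=GDP):
    """P_Z of a trapezoidal fuzzy event (a1, a2, a3, a4) via definition (1)."""
    p = dist.cdf(a3) - dist.cdf(a2)                      # core, membership = 1
    if np.isfinite(a1) and a2 > a1:                      # left slope
        p += quad(lambda x: (x - a1) / (a2 - a1) * dist.pdf(x), a1, a2)[0]
    if np.isfinite(a4) and a4 > a3:                      # right slope
        p += quad(lambda x: (a4 - x) / (a4 - a3) * dist.pdf(x), a3, a4)[0]
    return p

states = {"GD": (-np.inf, -np.inf, -4, -3), "D": (-4, -3, -1.5, -0.25),
          "S": (-1.5, -0.25, 0.25, 1.5), "G": (0.25, 1.5, 3, 4),
          "GG": (3, 4, np.inf, np.inf)}
print({k: round(prob_of_fuzzy_state(*v), 4) for k, v in states.items()})
```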
Figure 1 Fuzzy scale of the economy states
The resultant fuzzy expected values and fuzzy variances can be compared based on their centres of gravity. The centre of gravity of a fuzzy number A ∈ FN(R) is a real number cogA given as follows:

   cogA = ( ∫ µA(x) · x dx ) / ( ∫ µA(x) dx ),

where both integrals are taken over (−∞, ∞).

Economy states                 Probabilities   A yield (%)             B yield (%)
GD = (−∞, −∞, −4, −3)          0.0067          (−48, −37, −31, −20)    (−46, −38, −32, −24)
D = (−4, −3, −1.5, −0.25)      0.1146          (−24, −17, −10, −3)     (−22, −17, −11, −6)
S = (−1.5, −0.25, 0.25, 1.5)   0.2579          (−5, −3, 3, 8)          (−4, −3, 3, 4)
G = (0.25, 1.5, 3, 4)          0.4596          (6, 12, 17, 24)         (6, 12, 16, 22)
GG = (3, 4, ∞, ∞)              0.1612          (20, 27, 34, 41)        (24, 29, 35, 40)

Table 2 Fuzzy decision matrix of stocks yields
Stock Characteristic   Fuzzy Stock Yield (%)             Centre of Gravity
(F) EA                 (1.62, 6.90, 12.71, 19.22)        10.17
(F) EB                 (2.72, 7.49, 12.45, 17.22)        9.75
(F) vart A             (5.66, 79.11, 346.67, 857.79)     287.76
(F) vart B             (22.07, 97.84, 335.49, 702.75)    267.34
(F) vars A             (41.87, 117.61, 257.13, 447.2)    210.57
(F) vars B             (74.55, 134.71, 259.79, 392.80)   213.26

Table 3 Fuzzy expected values and fuzzy variances
The fuzzy expected values (F) EA and (F) EB, computed by formula (2), are trapezoidal fuzzy numbers. Their significant values are given in Table 3. The fuzzy variances (F) vart A and (F) vart B, obtained by formula (3), and (F) vars A and (F) vars B, calculated by (4) and (5), are not trapezoidal fuzzy numbers. Their membership functions are shown in Figure 2. We can see that the properly computed fuzzy variances are significantly narrower than the variances calculated by the original formula. In this example, we can also see that the proper calculation of the fuzzy variance can cause a change in the decision-maker's preferences, i.e. based on the results calculated by the formula (3), the decision-maker is not able to make a decision on the basis of the rule of the expected value and the variance, while based on (F) vars A and (F) vars B, the decision-maker should select the stock A (the higher expected value and the lower variance than the stock B). The expected values and the variances are compared based on their centres of gravity.
Figure 2 Membership functions of fuzzy variances (F) vart A, (F) vart B, (F) vars A, (F) vars B
5
Conclusion
We have proposed the proper way of the fuzzy variance computation in the fuzzy decision matrix. The original way of the fuzzy variance computation does not consider the dependencies between the fuzzy evaluations under particular fuzzy states of the world and the fuzzy expected evaluation. On the contrary, the proper one takes the dependencies into account, which means that the obtained fuzzy variances do not falsely increase the uncertainty of the variance. In the numerical example, we have shown that considering the dependencies between the fuzzy evaluations under particular fuzzy states of the world and the fuzzy expected evaluation can affect the final ranking of the alternatives.
Acknowledgements Supported by the grant IGA PrF 2016 025 Mathematical Models of the Internal Grant Agency of Palacký University Olomouc and by the project No. GA 14-02424S of the Grant Agency of the Czech Republic.
References [1] Ganoulis, J.: Engineering Risk Analysis of Water Pollution: Probabilities and Fuzzy Sets. John Wiley & Sons, 2008. [2] Huynh, V. et. al.: A Fuzzy Target Based Model for Decision Making Under Uncertainty. Fuzzy Optimization and Decision Making 6, 3 (2007), 255–278. [3] Klir,G. J., Pan, Y.: Constrained fuzzy arithmetic: Basic questions and some answers. Soft Computing 2 (1998), 100–108. [4] Negoita, C.V., Ralescu, D.: Applications of Fuzzy Sets to System Analysis. Birkhuser Verlag–Edituria Technica, Stuttgart, 1975. [5] Pavlačka, O., Rotterová, P.: Probability of fuzzy events. In: Proceedings of the 32nd International Conference Mathematical Methods in Economics (J. Talašová, J. Stoklasa, T. Talášek, eds.), Palacký University Olomouc, Olomouc, 2014, 760–765. [6] Özkan,I., Türken,IB.: Uncertainty and fuzzy decisions. In: Chaos Theory in Politics, Springer Netherlands, 2014, 17–27. [7] Talašová, J., Pavlačka, O.: Fuzzy Probability Spaces and Their Applications in Decision Making. Austrian Journal of Statistics 35, 2&3 (2006), 347–356. [8] Yoon, K.P., Hwang, Ch.: Multiple Attribute Decision Making: An Introduction. SAGE Publications, California, 1995. [9] Zadeh, L.A.: Probability Measures of Fuzzy Events. Journal of Mathematical Analysis and Applications 23, 2 (1968), 421–427.
The main inflationary factors in the Visegrád Four Lenka Roubalová1, David Hampel2 Abstract. This paper deals with the analysis of the main inflationary factors and their changes caused by the onset of the economic crisis in 2008. We attempt to uncover the factors causing inflation in the Visegrád Four from both demand and supply sides in the pre-crisis period as well as in the following one. To determine main inflation drivers in each country, multiple regression models are estimated. To describe the common factors causing inflation among all member countries we use panel data models. We found out that the most significant inflationary factors have been changed after 2008 but these factors do not differ significantly in observed countries. The results prove the influence of the domestic factors in the pre-crisis period. The impact of the external factors, especially commodity prices, is significant in the period during crisis. In conclusion, supply side factors have major influence on inflation and the inflationary factors have been changed over time. Keywords: demand and supply side inflation factors, domestic and external inflation factors, inflation, multiple regression models, panel data models, Visegrád Four. JEL classification: E31 AMS classification: 91B84
1
Introduction
Inflation has been a common theme and subject of expert analysis for years. The economic structure has been changing at both the global and national level, and so it is interesting to examine factors influencing inflation in different countries and compare the results obtained among different periods. Knowledge about the main inflation drivers can be useful for individuals and businesses to anticipate further economic development. Especially, it is important for monetary authorities in countries where the inflation targeting regime is implemented. As Stock and Watson reported in [16], there have been significant changes in the dynamics of inflation causing changes in the economic structure over the past five decades. Nowadays, for instance, energy is not as important a part of total expenditure as in the period of the oil shocks. Labour unions have become smaller and the production of services increases. This development is associated with monetary policy changes as well. The rules are more flexible and take into account several factors like inflation expectations, for example. There are not many publications dealing with inflationary factors in the countries of Central Europe. We can mention the results reported by Golinelli and Orsi in [6], who proved the inflationary impact of the production gap and the exchange rate in the V4 countries during the transformation process. A further study deals with the development of inflation drivers from the transformation process to the financial crisis in 2008, see [1], and points out the importance of cost-push inflation. In [2], a Time-Varying Parameter VAR model with stochastic volatility and a VAR Neural Network model are used for predicting inflation based on selected inflationary factors.

1 Mendel University in Brno, Department of Statistics and Operation Analysis, Zemědělská 1, 613 00 Brno, Czech Republic, xkravack@mendelu.cz
2 Mendel University in Brno, Department of Statistics and Operation Analysis, Zemědělská 1, 613 00 Brno, Czech Republic, qqhampel@mendelu.cz

A financial crisis can break the economic structure significantly and affect the efficiency of monetary policy, see for example [4], [9] and [14], and monetary authorities can be forced to accommodate these changes. The crisis can affect the behavior of individuals and businesses, which react more sensitively to
some economic changes during the crisis. A BVAR model with a specifically defined prior is used for modelling the monetary transmission mechanism in the Czech Republic and Slovakia from January 2002 to February 2013, and the results point to the inadequacy of the common VAR model, see [17]. In [3], asymmetries in the effects of monetary policy and their link to the labour market are investigated. A Time-Varying Parameter VAR model with stochastic volatility is employed there to capture possible changes in the transmission mechanism and in the underlying structure of the economy over time. The main aim of this paper is to find out the main inflationary factors in the V4 countries in the period before the onset of the crisis in 2008 and in the period during the crisis. We attempt to verify the thesis that the inflationary factors have changed over time. This analysis should also provide some information about the impact of external and domestic factors or supply and demand side factors among the V4 countries.
2
Material
The selection of macroeconomic variables used in the econometric analysis is based on macroeconomic theory and on the information published in the inflation reports of the monetary authorities. The dependent variable is the inflation rate measured by CPI changes compared to the same period of the previous year. For the Slovak Republic we use HICP changes, which do not differ substantially but are suitable for comparison and discussion with respect to NBS publications. As the independent variables we use 23 macroeconomic indicators which represent two groups of variables, the external and the domestic factors. The group of domestic factors includes GDP of each country, which represents aggregate demand, and household consumption, which represents consumer demand. There are many notices about total aggregate demand and consumer demand in the inflation reports, so both variables are included. Then we use price factors like food prices, energy prices, prices of fuels, unprocessed food prices and industrial production prices. The labour market is represented by the level of unemployment, nominal wages, labour productivity per worker and other labour costs. Then we use the government debt to GDP rate, monetary aggregates, three-month interbank offered rates, i.e. 3M PRIBOR, 3M BUBOR, 3M WIBOR and 3M (BRIBOR) EURIBOR, and inflation expectations. The external factors represent a group of variables containing the exchange rate of the domestic currency to EUR, the nominal effective exchange rate, the current account balance and import prices. We also use variables representing global development like oil prices, European industrial production prices and average global agricultural production prices for the EU and USA. Data of the macroeconomic variables – quarterly time series – are expressed as an index, the change against the same quarter of the previous year. This form of expression is advantageous because it eliminates the effect of seasonality. The number of time series observations T = 51 corresponds to the period from the first quarter of 2003 to the second quarter of 2015, while Q1 2003 – Q3 2008 represents the period before the crisis and Q4 2008 – Q2 2015 represents the period during the crisis. Analyzed data were obtained from ARAD – the database managed by the Czech National Bank, the macroeconomic database of the National Bank of Slovakia, the database of the Polish Statistical Office, the database of the National Bank of Poland and from the Eurostat database. Time series of Brent oil prices and global agricultural production prices were obtained from the OECD database and from the database of the European Central Bank.
3
Methods
The first part of this paper deals with creating multivariate regression models for each country of the Visegrád Group separately. We start with testing the stationarity of the aforementioned time series: approximately using the graph of the time series and exactly by unit root tests; specifically, we use the Augmented Dickey-Fuller (ADF) test [5] and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test [12]. The consequence of modelling non-stationary time series can be spurious regression. Therefore, it is necessary to test whether the examined time series are cointegrated. This task we perform by verifying the stationarity of the residuals of the cointegration regression. Before carrying out the parameter estimates we verify the presence of multicollinearity using pairwise correlation coefficients and Variance Inflation Factors; variables causing multicollinearity are eliminated. Further we apply backward elimination of variables according to their significance given by the t-test. When
selecting the best model from several candidate models, we compare the adjusted coefficient of determination and information criteria (AIC, BIC, HQC). In particular, we check whether the parameter estimates are consistent with economic theory. We also provide specification tests (LM test and Ramsey RESET test), serial correlation tests (significance of residual autocorrelations, Durbin-Watson and Ljung-Box tests) and heteroskedasticity tests (Breusch-Pagan and White tests). Normality of the error term we verify graphically using a histogram and a Q-Q plot; for exact testing we use the Shapiro-Wilk test. Inflationary factors common to all V4 countries can be detected by the analysis of the panel data representation. We deal with non-stationary time series, therefore we test the cointegration of the panel representation of the data by the Kao test [11]. Generally, it is possible to choose a model with fixed or random effects of objects. This decision is usually based on the Hausman test [7]. In our case, when the number of objects is lower than the number of variables, we cannot use the model with random effects, so only models with fixed effects are estimated. Tests of different intercepts for particular objects are conducted. Statistical tests are provided at the significance level α = 5 %. Estimates of the multivariate regression models and panel data models are obtained using the software Gretl 1.9.91, EViews 8 and Matlab R2015b.
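As an illustration of the pre-tests described above (the estimates themselves were obtained in Gretl, EViews and Matlab; the following Python sketch is only indicative and its names are assumptions), the ADF and KPSS tests and the variance inflation factors can be computed e.g. with statsmodels.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.stats.outliers_influence import variance_inflation_factor

def stationarity_report(df: pd.DataFrame) -> pd.DataFrame:
    """ADF and KPSS p-values for every series in the data frame."""
    rows = {}
    for col in df.columns:
        x = df[col].dropna()
        rows[col] = {"ADF p-value": adfuller(x, autolag="AIC")[1],
                     "KPSS p-value": kpss(x, regression="c", nlags="auto")[1]}
    return pd.DataFrame(rows).T

def vif_table(regressors: pd.DataFrame) -> pd.Series:
    """Variance inflation factors of the candidate regressors."""
    X = sm.add_constant(regressors.dropna())
    return pd.Series({col: variance_inflation_factor(X.values, i)
                      for i, col in enumerate(X.columns) if col != "const"})

# usage with a hypothetical data frame of the 23 candidate indicators:
# print(stationarity_report(indicators)); print(vif_table(indicators))
```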
4
Results
To estimate the models we used quarterly time series. Although some of them are non-stationary, we found these series cointegrated. Almost all estimated parameters are statistically significant, see Tab. 1. The models are well specified and all classical assumptions were fulfilled. With respect to the fact that predicting inflation is very difficult, the coefficients of determination between 73 % and 96 % indicate that the following variables describe the variability of the inflation rate very well. Estimated inflation is illustrated graphically in Fig. 1, dashed line.
Before crisis period      CZ       HU       PL       SK       V4
3M interbank rate         ×        ×        ×        <0.001   ×
Agricultural prices       0.001    <0.001   ×        ×        ×
Consumption               ×        ×        0.004    ×        ×
Energy prices             ×        <0.001   ×        <0.001   <0.001
Food & energy prices      ×        ×        0.002    ×        ×
Food prices               ×        ×        ×        <0.001   <0.001
GDP                       0.004    ×        ×        0.008    0.003
GDP EA                    ×        0.099    ×        ×        ×
Inflation expectations    ×        <0.001   0.019    ×        ×
Monetary base             ×        ×        ×        ×        ×
Nominal wages             0.065    ×        ×        ×        ×
Oil prices                <0.001   ×        ×        ×        ×
Unproc. food prices       ×        ×        ×        ×        ×

During crisis period      CZ       HU       PL       SK       V4
3M interbank rate         ×        <0.001   ×        ×        ×
Agricultural prices       ×        <0.001   <0.001   <0.001   <0.001
Consumption               ×        ×        ×        0.050    ×
Energy prices             <0.001   ×        ×        ×        <0.001
Food & energy prices      ×        ×        ×        ×        ×
Food prices               0.003    ×        ×        ×        ×
GDP                       ×        ×        ×        ×        ×
GDP EA                    ×        ×        0.068    ×        ×
Inflation expectations    <0.001   ×        ×        ×        <0.001
Monetary base             <0.001   ×        <0.001   <0.001   ×
Nominal wages             ×        ×        ×        ×        ×
Oil prices                ×        <0.001   <0.001   <0.001   0.043
Unproc. food prices       ×        ×        0.028    <0.001   ×
Table 1 Variables included in models for particular countries (CZ, HU, PL, SK) and in panel data model (V4) in both pre-crisis and crisis periods. P-values of the t-test are given, the symbol ‘×’ means that insignificant variable is omitted in concrete model. For the Czech Republic, the positive signs of all numerical estimates determine positive influence of these variables on inflation. In the case of Hungary we found all variables affecting inflation positively. The negative impact of the interest rate is related to the loans denominated in euro during this period. For Poland, the negative impact of the Eurozone aggregate demand results from the declining aggregate demand in Eurozone that caused increasing demand for food production form Poland. It caused, that Polish export increased and the price level increased as well. All numerical estimates in model for Slovak Republic have positive signs which indicate that these variables affect inflation rate positively. The panel data analysis results indicate factors that are the most convenient to describe inflation development during both periods in the Visegrád Four countries collectively. The most appropriate
(Figure 1: four panels for the Czech Republic, Hungary, Poland and Slovakia)
2008
2010
Figure 1 Real inflation (solid line), multivariate regression estimate (dashed) and panel Fixed Effects model estimate (dotted) for V4 countries. Y-axis inflation rate, x-axis time. Vertical line separates periods before and during crisis. 2 resulting model with Radj = 0.832 in the pre-crisis period is the fixed effect model containing regressors like energy prices, food prices and domestic aggregate demand. With the use of Kao test of panel cointegration with p = 0.0029 we can reject the null hypothesis of non-cointegration. We can reject the existence of common intercepts using the test with p = 0.0021, and we cannot reject homoskedasticity and normality of random errors. Estimated inflation for both pre-crisis and during-crisis periods is illustrated graphically in Fig. 1, dotted line. 2 The period during crisis can be collectively described with the fixed effects model with Radj = 0.691 where the inflation rate is described with the use of regressors like global agricultural prices, crude oil prices, inflation expectations and energy prices. Kao test of panel cointegration with p < 0.001 proved we can reject the null hypothesis of non-cointegration. With the use of the test of common intercepts with p < 0.001 we can reject the null hypothesis about common intercept among groups of countries. We cannot reject the null hypothesis about normality of random errors, but we reject their homoskedasticity (we can see greater variance for Hungary and Slovakia than for Czech republic and Poland).
5
Discussion
Generally our results prove that inflation in V4 countries can be successfully described with food prices, energy prices and with the national aggregate demand development in the pre-crisis period. We have found the importance of domestic factors in this period. However, there are some differences among individual countries. It is worth pointing out the case of the Czech Republic where the impact of crude oil prices and global agricultural prices are detected. The changes in oil prices in 2002 affected the
744
Mathematical Methods in Economics 2016
industrial prices development significantly through the changes in the import prices in 2005. This changes affected primary oil processing prices thereby the energy production and distribution prices increased. The changes in crude oil prices and in global agricultural prices finally rise domestic food prices. We should also mention the impact of nominal wages that contributed to this development of price level in the Czech Republic. The impact of labour costs in the Czech Republic results form the research of M. Alexová. She found the significant impact of this factor on inflation in the Czech Republic and Hungary in 1996–2011, see [1]. Eurozone aggregate demand is one of the main inflation drivers in Hungary. It can be connected with the rising output of the main trading partners in this period. According to MNB evidence the net export was the most important component contributing to the domestic aggregate demand increase at the end of this period1 . The results indicate the importance of inflation expectations in this period individually in Poland and Hungary. It can be connected wit the accession to EU in this period. In the case of Hungary there is significant impact of the Eurozone demand. The economic growth in Eurozone probably affected inflation expectation in Hungary as well. The results prove that the external factors prevailed during crisis. Generally in all countries the oil prices, global agricultural prices, inflation expectations and energy prices are the main inflation drivers during this period. Neely and Rapach found that international factors affect inflation of more than 50 % and the impact of this group of factors still has been increasing, see [15]. Multiple regression models point some detail information about inflation factors not typical for all V4 members. In the case of Poland, the results show negative impact of the Eurozone demand on inflation in Poland. Due to the loss of output and weak aggregate demand in Eurozone these countries increased the amount of imported food production from Poland2 . In Hungary a total of more than two thirds of all household loans were denominated in Swiss Francs and Euros in 2010 [10]. However, the Forint3 -denominated loan repayment increased with the depreciation of Forint. Secondarily, the depreciation contributed to an increase in inflation. In all countries together we can see considerable impact of inflation expectations. These results are in accordance with Levy (1981), who found that the inflation expectations and the monetary base are significant inflation factors during crisis. Although there should not be significant effect of monetary base in the inflation targeting regime, we found the importance of money indicators in Poland and in the Czech Republic. Another research points that monetary base is not very important in predicting inflation in V4 countries, but in the case of individual countries, especially Poland, the results indicate that some money indicators improve the inflation forecast [8]. This conclusion confirms our results concerning the fact that monetary base can be important within individual countries but it does not play the role in general development of price level in the V4 countries together. In spite of the general summary which indicates that inflationary factors have been changed over time, it is necessary to mention energy prices. Panel data models indicate the impact of domestic energy prices on inflation in both periods. There are various sources affecting this variable. 
Except for oil prices development there are many domestic influences like VAT tax, regulations and weather, which affect the consumption of energy. According to Czech National Bank4 evidence we assume that in the pre-crisis period the oil prices development affected inflationary pressure of energy prices. Secondly, it could be strengthened by the extremely low temperature measured in Central Europe in winter 2005. During the crisis the impact of energy prices on inflation is lower and multiple regression models do not prove the importance of this variable in all V4 countries individually. Our results imply that the inflation development is mostly affected by factors on the supply side that cause cost-push inflation. The research of M. Alexová partly confirms our results but her results put more weight on labor costs, see [1].
6
Conclusions
We used multiple regression models and panel data models to identify the main inflationary factors in the Visegrád Four. Estimated models proved there are some specific characteristics in the inflation development in each country, but generally there is a group of factors common to all V4 countries in each period. Despite these individual differences we can conclude that the main inflationary factors do not 1 Magyar
Nemzeti Bank Quarterly Report on Inflation (November 2007). newsletter from Poland 2014/2 of the Czech-Polish Chamber of Commerce. 3 Magyar national currency. 4 Czech National Bank Inflation Report July 2005. 2 Economic
differ significantly in the observed countries. We found out that the impact of domestic factors prevails in the pre-crisis period. The impact of external factors and inflation expectations is significant during the crisis. From these estimates we derive that the impact of external factors increased and cost-push inflation has prevailed. The inflation factors have been changed by the onset of the economic crisis in 2008. The results imply that national banks are not able to drive all inflationary factors directly, because a significant part of the total inflation development is based on external factors.
1 Magyar Nemzeti Bank Quarterly Report on Inflation (November 2007). 2 Economic newsletter from Poland 2014/2 of the Czech-Polish Chamber of Commerce. 3 Magyar national currency. 4 Czech National Bank Inflation Report July 2005.
References [1] Alexová, M.: Inflation Drivers in New EU Members, NBS Working Paper 6, (2012), 1–42. [2] Dobešová, A. and Hampel, D.: Predicting Inflation by the Main Inflationary Factors: Performance of TVP-VAR and VAR-NN Models. In: Proceedings of the 33rd International Conference Mathematical Methods in Economics, (2015), 133–138. [3] Dobešová, A. and Hampel, D.: Asymmetries in Effects of Monetary Policy and Their Link to Labour Market – Application of TVP-VARs to Three European Countries. In: Proceedings of the 33rd International Conference Mathematical Methods in Economics, (2014), 156–161. [4] Dobešová, A. and Hampel, D.: Inflation Targeting During Financial Crisis in Visegrád Group Countries, Economics and Management 2, (2014), 12–19. [5] Elliott, G., Rothenberg, T. J. and Stock, J. H.: Efficient Tests for an Autoregressive Unit Root, Econometrica 64, 4 (1996), 813–836. [6] Golinelli, R. and Orsi, R.: Modelling Inflation in EU Accession Countries: The Case of the Czech Republic, Hungary and Poland, Eastward Enlargement of the Euro-zone Working Papers 9, (2002), 1–43. [7] Hausman, J. A.: Specification Tests in Econometrics, Econometrica 46, 6 (1978), 1251–1271. [8] Horvath, R. Komarek, L. and Rozsypal, F.: Does Money Help Predict Inflation? An Empirical Assessment for Central Europe, CNB Working paper series 2010 5, (2010), 1–27. [9] Jannsen, N., Potjagailo, G. and Wolters M. H.: The Monetary Policy During Financial Crises: Is the Transmission Mechanism Impaired?, Kiel Working Paper, (2015), 1–37. [10] Jı́lek, J.: Finance v globálnı́ ekonomice I. Penı́ze a platebnı́ styk. Grada, Praha, 2013. [11] Kao, C.: Spurious Regression and Residual Based Tests for Cointegration in Panel Data, Journal of Econometrics 90, 1 (1999), 1–44. [12] Kwiatkowski, D., Phillips, P. C. B., Schmidt, P. and Shin, Y.: Testing the Null Hypothesis of Stationarity Against the Alternative of an Unit Root, Journal of Econometrics 54, 1–3 (1992), 159–178. [13] Levy, M.: Factors Affecting Monetary Policy in an Era of Inflation, Journal of Monetary Economics 8, 3 (1981), 351–373. [14] Mishkin, F. S.: Monetary Policy Strategy: Lessons from the Crisis, NBER Working Paper 16755, (2011), 1–62. [15] Neely, C. and Rapach, D.: International Comovements in Inflation Rates and Country Characteristics, Journal of International Money and Finance 30, 7 (2011), 1471–1490. [16] Stock, J. H. and Watson, M. W.: Modeling Inflation After the Crisis, NBER Working Paper 16488, (2010), 1–59. [17] Vaněk, T., Dobešová, A. and Hampel, D.: Selected Aspects of Modeling Monetary Transmission Mechanism by BVAR Model. In AIP Conference Proceedings 1558, (2013), 1855–1858.
Posteriori setting of the weights in multi-criteria decision making in educational practice Jiří Rozkovec1, Jan Öhm2, Kateřina Gurinová3 Abstract. The aim of this paper is to propose an alternative way of classifying students writing tests, in subjects like mathematics, physics etc. The classification is a multi-criteria decision making problem. The tasks to solve are the criteria, the results of each student are the values of an individual variant, and the weights are the numbers of points assigned to the examples. Usually, the teacher sets these points before the test according to his opinion about the difficulty of a particular example and also according to a mutual comparison of all the examples. A disadvantage of this approach could be the individual experience of the teacher, who may fail to estimate the difficulties for a given group of students fittingly. Then easy examples are appraised by a high number of points and vice versa. The presented method calculates the weights (i.e. the points) according to the results achieved by the students and then the total score. The score of a particular example is calculated only in percent and then averaged; the reciprocal value of the average is the weight. Afterwards the total score of every test is evaluated and the difficulties of the assigned tasks are identified. Keywords: multi-criteria decision making, weights, goodness-of-fit test. JEL classification: C44 AMS classification: 90B50
1
Introduction
Pedagogy, in its normative character, systematically requires teachers to evaluate their students. Evaluation is a complex and difficult process in which some action (act, product) is evaluated using verbal or non-verbal methods, not only by a grade (i.e. grading is less complex and figures as a result of evaluation - see [3]). Evaluation of a student's performance in achievement tests is one of the key operations in the achievement test creation process. It has many functions and can arguably be divided into many typologies. The only indisputable division of evaluation is by two elementary aspects − absolute and relative. The absolute aspect is formulated by an observed student's result compared to a desirable result, potentially to the ideal result. The relative aspect expresses the student's level (e.g. knowledge) related to his colleagues. It tells which of them are better, equal or worse (see [6]). The evaluation process is the subject of many critical essays. Considered are the moral aspect of assessment [7], the economic criteria of testing or the cumulation of evaluation acts in some methods [6]. Teachers are often under pressure and face unfavourable judgements for their presumably stressful grading. Čapek in [2] warns against normative evaluation (in his classification similar to relative) and discusses an abrogation of evaluation by grades. Subjectivism and the cumulation of evaluation acts can be decreased by the usage of achievement tests. Generally, achievement tests stand in opposition to psychological tests (IQ tests, tests of personality, etc.) because they measure knowledge and skills [5]. Following the functional classification [5], achievement tests can be divided into these categories: 1 Technical University of Liberec/Faculty of Economics, department of Economic Statistics, Voroněžská 13, 46001 LIBEREC, Czech Republic, jiri.rozkovec@tul.cz 2 Technical University of Liberec/Faculty of Economics, department of Economic Statistics, Voroněžská 13, 46001 LIBEREC, Czech Republic, jan.ohm@tul.cz 3 Technical University of Liberec/Faculty of Economics, department of Economic Statistics, Voroněžská 13, 46001 LIBEREC, Czech Republic, katerina.gurinova@tul.cz
• INITIAL. Used at the beginning of a lesson in order to evaluate the initial level of students' knowledge.
• DIAGNOSTIC. Used for the analysis of knowledge and skills, detection of errors and their elimination.
• OUTPUT. Its purpose is the evaluation of lesson outputs.
• TEMPORARY. Has a time limit and contains equally difficult tasks. Success is determined by the achievement realised in a particular time interval.
• SCALAR. Opposite to temporary. Contains variously difficult tasks. Success is determined by the achieved level of difficulty.
These tests are considered objective, although this is a misleading statement. Achievement tests are focused on an objective examination of students' mastery of the content of the curriculum.
2
Grading as a multiple-criteria decision making
Grading could be considered as a multiple-criteria decision making problem, where the aim is to set a ranking of alternatives, not to find any optimal solution. The alternatives are in fact the results of particular tests. For the evaluation the weighted sum model (WSM) [4] will be used. Let us assume that there are n alternatives and k criteria, where the i-th alternative is a vector X_i = (x_{i1}, x_{i2}, …, x_{ik}). The vectors X_1, X_2, …, X_n then create the rows of the decision matrix X_{n×k}. The decision criteria create the columns of X. The corresponding weights are in the vector w = (w_1, …, w_k). However, this model slightly differs from practical life, where the teacher sets the points for all examples in advance - denote these values as a vector b = (b_1, …, b_k). Then he/she evaluates the proportion which has been solved for each task by an absolute number of points. The total score, in other words the importance of the particular alternative, is given by the sum of achieved points for all k examples. To use the WSM it is necessary to rearrange these data in the following way. In the decision matrix X there are again the proportions mentioned in the previous paragraph, but as relative ones, i.e. x_{ij} ∈ [0, 1] for i = 1, …, n, j = 1, …, k. The weights w_j, j = 1, …, k, then follow from:
w_j = b_j / Σ_{i=1}^{k} b_i   (1)
In this case the importance of an alternative is U_A(X_i) = w · X_i, where U_A(X_i) ∈ [0, k] (the index A means that a priori weights were used for the calculation of the importance). But here a small problem appears - and it is the setting of the weights.
2.1
A priori weights
It was mentioned above that the weights from (1) were set in advance - it means before the writing of the test. How were they set? Accidentally? Intentionally, based on the teacher's experience? Either way, they express an assumption about the difficulty of the examples which is at the same time a transformed expectation about the students' abilities and knowledge. And it could cause errors, because the teacher can overestimate or underestimate them. In the first case some tasks would have greater weights than they should have and vice versa. To sum up, these a priori weights do not fit the difficulty of the tasks. In the following text the proposal of the construction of posteriori weights will be presented.
2.2
Posteriori weights
The weights which are being constructed should ensure the ranking of the students, but not only this. They should also identify the tasks which are able to distinguish good and poor students from the knowledge
point of view. In other words, tasks solved either by everybody or by nobody do not have such an ability and from the teacher's point of view they do not bring any additional profit. The method is based just on the values of the matrix X_{n×k}. The measure of the difficulty d_j of the j-th task is expressed as a reciprocal value of the average x·j of the j-th column of X:
x·j = (1/n) Σ_{i=1}^{n} x_{ij}   (2)
d_j = 1 / x·j   (3)
for j = 1, …, k. Naturally, x·j ∈ [0, 1], so d_j ∈ [1, ∞). It is necessary to add that the criteria where x·j equals 0 or 1 will be excluded from the processing, because they do not have, besides other things, the ability of distinction, as was mentioned above. Further, d_j, j = 1, …, k, have to be standardized to the weights v_j:
v_j = d_j / Σ_{i=1}^{k} d_i   (4)
Now the importance U_P of the alternative X_i can be evaluated in a common way, i.e. U_P(X_i) = v · X_i (the index P means that posteriori weights were used for the calculation of the importance U). Now the a priori and posteriori weights w and v can be compared by the χ2 goodness-of-fit test [1]. Proportions in w are considered as the theoretical distribution, while v is the empirical one.
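The weight construction above can be scripted directly from the decision matrix. The following is a minimal NumPy sketch mirroring formulas (2)-(4); the matrix X and the uniform a priori weights below are invented toy data, not the test results of the examples in Section 3:

import numpy as np

# Rows = students (alternatives), columns = tasks (criteria); entries are
# solved proportions x_ij in [0, 1].  Illustrative data only.
X = np.array([
    [1.0, 0.5, 0.0, 0.8],
    [0.3, 1.0, 0.2, 0.6],
    [0.0, 0.7, 0.1, 1.0],
])
w = np.array([0.25, 0.25, 0.25, 0.25])      # a priori weights derived from the points b_j

col_mean = X.mean(axis=0)                    # x.j, average success of task j
usable = (col_mean > 0) & (col_mean < 1)     # drop tasks solved by nobody or by everybody
d = 1.0 / col_mean[usable]                   # difficulties, formula (3)
v = d / d.sum()                              # posteriori weights, formula (4)

U_A = X @ w                                  # importance with a priori weights
U_P = X[:, usable] @ v                       # importance with posteriori weights
print(v, U_A, U_P)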
3
Practical examples
The following two examples are a practical demonstration of the method described above. Both concern tests of mathematical analysis from our educational practice in the last year. Let us note that the significance level for all tests is α = 0.01. Example 1. This sample has the following parameters: n = 69, k = 6, i.e. there were six examples and 69 students. The success rates of the examples were: x· = (0.2638, 0.3237, 0.4928, 0.5246, 0.6928, 0.1246). The prior weights are: w = (0.125, 0.375, 0.125, 0.125, 0.125, 0.125). If they are compared with the success rates then, roughly said, the higher the weight, the lower the success. From x· the posteriori weights v were evaluated using formula (4): v = (0.1869, 0.1523, 0.1001, 0.094, 0.0712, 0.3955). The χ2 goodness-of-fit test has the test statistic χ2 = 54.13. The critical value is χ2_5(0.99) = 15.09. So the null hypothesis that the proportions are identical is rejected. Now the question is how the importances U_A and U_P differ. At first the test of their correlation will be performed. Sample correlation r(U_A, U_P) = 0.9549, P-value = 0.00. So there is a strong positive correlation between them. Let us define now the difference Z(X_i) as:
Z(X_i) = U_A(X_i) − U_P(X_i),   i = 1, …, n   (5)
The average and variance of the difference Z are Z̄ = 0.088, s²_Z = 0.0064. Z is normally distributed (P-value of the χ2 goodness-of-fit test is 0.049). The hypothesis H0: μ = 0 was rejected and H1: μ > 0 was accepted (P-value = 0). So there is a statistically significant positive difference U_A − U_P.
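The subsequent comparison of the importances can also be scripted. The sketch below is only indicative: it uses simulated stand-ins for U_A and U_P, and it substitutes a one-sample t-test for the test of H0: μ = 0 (the paper first checks normality with a χ2 goodness-of-fit test), so it is not the exact procedure used above:

import numpy as np
from scipy import stats

# U_A, U_P would come from the weight computation; random stand-ins here.
rng = np.random.default_rng(0)
U_A = rng.uniform(1, 5, size=69)
U_P = U_A - rng.normal(0.09, 0.08, size=69)

r, p_corr = stats.pearsonr(U_A, U_P)          # correlation of the importances
Z = U_A - U_P                                  # difference Z(X_i), formula (5)
t, p_mean = stats.ttest_1samp(Z, popmean=0.0)  # test of H0: mu = 0
print(r, p_corr, Z.mean(), Z.var(ddof=1), t, p_mean)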
Example 2. This sample has the following parameters: n = 72, k = 5. The success rates were: x· = (0.4792, 0.3281, 0.4219, 0.4427, 0.599). The weights w and v are: w = (0.2, 0.2, 0.2, 0.2, 0.2), v = (0.1825, 0.2666, 0.2073, 0.1976, 0.146). The χ2 goodness-of-fit test has the test statistic χ2 = 2.78. The critical value is χ2_4(0.99) = 13.28. So the null hypothesis that the proportions are identical is not rejected. Sample correlation r(U_A, U_P) = 0.9947, P-value = 0.00. So there is also a strong positive correlation between U_A and U_P. The average and variance of the difference Z for this example are Z̄ = 0.017, s²_Z = 0.0007. Z is normally distributed (P-value of the χ2 goodness-of-fit test is 0.042). The hypothesis H0: μ = 0 was again rejected and H1: μ > 0 was accepted (P-value = 10−7). So there is a statistically significant positive difference U_A − U_P.
4
Conclusion
The aim of this paper was to propose an alternative approach of posteriori setting of the weights for multiple-criteria decision making problems in educational practice. Concretely, it concerns the grading of written tests. Usually the weights are given in advance, but they do not have to correspond to reality. So the motivation is to assess the difficulty of the solved tasks more precisely in comparison with the a priori weights. The estimates are based on the achievements of the tested students and this is the main difference - the weights are, in fact, generated by the values of the decision matrix. The idea how to express the difficulty is to use a reciprocal value of the average achievement for the particular task. In the presented examples the weights w and v were significantly different (Example 1), but in Example 2 their fit was not rejected. It means that in the first case the teacher's prediction of the difficulty was not as accurate as in the second one. This was the main goal - the construction and comparison of the weights. Consequently, the relation between U_A(X_i) and U_P(X_i) was briefly analyzed. The importance U_A(X_i) calculated with the a priori weights w was greater than U_P(X_i) using the posteriori weights v. Their difference was normally distributed. The mean of the difference Z is positive in both cases and it is statistically significant. Further, the correlation of U_A and U_P was tested. It is nearly perfect and positive in both cases as well.
Reference [1] Anděl, J.: Statistické metody. Matfyzpress, Praha, 2007. [2] Čapek, R.: Odměny a tresty ve školnı́ praxi. Grada Publishing, Havlı́čkův Brod, 2008. [3] Dittrich, P.: Pedagogicko-psychologická diagnostika. H & H, Jinočany, 1992. [4] Jablonský, J.: Operačnı́ výzkum. Professional Publishing, Praha, 2002. [5] Kratochvı́l, M.: Kontrola, zkoušenı́ a hodnocenı́ ve školnı́ praxi. Technická univerzita v Liberci, Liberec, 1998. [6] Solfronk, J.: Zkoušenı́ a hodnocenı́ žáků. Pedagogická fakulta Univerzity Karlovy, Praha, 1996. [7] Wiggins, G. P.: Assessing Student Performance: Exploring the Limits and Purpose of Testing. JosseyBass Publishers, San Francisco, 1993.
A client’s health from the point of view of the nutrition adviser using operational research Lucie Schaynová
1
Abstract. In this paper, we analyse daily nutrient requirements of an individual person from the point of view of the nutrition adviser. The goal is to simplify the adviser’s menu planning for a client as much as possible. After that, we design an individual eating plan for a week using a new linear optimization model. The model respects eating habits and it follows the client’s or adviser’s recommended recipes taking the compatibility of foods into account. The model involves linear constraints to ensure that two incompatible foods are not used in the same course. The model comprises further constraints to guarantee the diversity of the courses. The purpose of other constraints is to use an exact amount of some food, e.g. one whole egg or 100 grams of cheese, during the week. The model is made up so that the final dietary plan for the client is as natural as possible. The model gives recommended amounts of foods for weekly recipe planning. Keywords: nutrient requirement, linear programming, dietary plan JEL classification: C44 AMS classification: 90C05
1
Introduction
Proper nutrition is important for the proper development and functioning of our organism and to maintain a good state of health. It also plays a very important role in the prevention of civilization diseases. Nutrition from the perspective of linear programming has always the same foundation: the fulfillment of nutritional requirements for the health of a larger group of people or the population. The objective functions of the linear programming models often vary, for example minimizing greenhouse gas emissions in the United Kingdom [8]. The cheapest eating is mainly connected with helping people in developing countries. The goal is to design a nutritionally adequate diet suitable for the complementary feeding period using locally available food with the minimal price [3], [2]. For countries without restriction of the availability of food, models that include the popularity of foods (the most frequently bought foods), again with minimal price, are used [11]. On the basis of these favourite foods, studies for inhabitants are designed in which current and new optimal diets are observed. Both kinds of diets are compared to make the people realize the difference between them and to give them recommendations to change their eating habits. That was done for the people in Italy [4] and in France [9]. More individual is the treatment of an eating model for a group of people designed for one week, which uses information about the popularity of recipes in Hawaii. There, either the price or the total time of preparation of meals is minimized [7]. In this article we present the diet problem for an individual person from the point of view of the nutrition adviser. We develop a diet plan tailored to a client's needs using linear programming. The model follows the adviser's steps when creating the diet plan. For the adviser the model is very helpful from the financial and the time point of view.
2
The problem
The nutrition adviser offers an individual consultation to a client who suffers from some nutritional malfunction or is in danger of some diseases that can be affected by proper nutrition, such as obesity,
diabetes, malnutrition, celiac disease, high blood pressure, etc. Nutrition consultancy is also used in prevention of diseases and can help healthy people too. A healthy client can have a new diet plan in the context of normal (rational) or sports nutrition. Consultancy also includes clients who adhere to an alternative way of eating and want to make sure in their eating habits. This is intended for vegetarians, vegans, etc. In this article we will focus on the rational way of eating. The adviser’s methodology includes the anamnestic and analytical part. Anamnestic part In this part the adviser is informed about the client’s collection of basic personal data, family and personal medical history, mental health status in the present and in the past, surgical procedures, diet or physical intolerance. It is also necessary to know whether the client is taking any medication. This information is necessary to be known if the adviser has to consult the new diet with the client’s doctor. Further information is about the lifestyle (smoking, alcohol), eating habits, time stress or physical activity. When some information is missing it can have a negative impact on the client’s health during the new diet. In this article we assume that the client is healthy and has no indispositions so that we can create a diet plan focused on rational eating. Analytical part This part includes anthropometric measurements. The adviser needs to know the client’s body height, weight, circumference of hip and waist, measurement of subcutaneous fat, visceral fat, BMI, etc. The adviser evaluates the client’s physical condition and sets up a goal that the client should achieve. The adviser determines the ideal body weight [6] as follows: wm = 0.655 hm − 44.1 , wf = 0.593 hf − 38.8 ,
where wm or wf is the ideal weight of a man or a woman, respectively, in kilograms and hm or hf is the height of the man or the woman, respectively, in centimeters. Then the basal energy is needed to be determined. This is energy expenditure to ensure basic living functions and temperature of the body. The basal energy can be calculated by using indirect calorimetry or by using the following Harris-Benedict equations [6]: vm = 4.184 (66.5 + 13.8 wm + 5.0 hm − 6.76 am ) ,
vf = 4.184 (655.1 + 9.56 wf + 1.85 hf − 4.68 af ) ,
where vm or vf is the basal energy of the man or the woman, respectively, in kilojoules (kJ) per day and am or af is the age of the man or the woman, respectively, in years. Then the factor activity p is needed to be determined. This is a coefficient of average physical activity. In the table which can be found in [12, p. 27], there are some values assigned to some particular activities. For example the coefficient of sleeping is 0.95, the coefficient of hard working is 2.5, etc. The client has to record all his or her activites during the week and their duration. Then the adviser calculates the value of p. Then the total daily energy requirement can also be calculated by using indirect calorimetry, monitoring heart rate or by using the equation t = vp + d , where t is the total daily energy requirement in kJ per day, v is the basal energy and d is the postprandial thermogenesis, see [1]. Then the adviser sets recommended amounts of some nutrients for the client’s ideal weight from the total energy intake. The nutrients should be composed so that the 15 %, 30 %, and 55 % of the total daily energy intake comes from proteins, fats, and saccharides, respectively.
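As an illustration, the adviser's calculation for a female client can be collected into a short Python function. The inputs below reproduce the client considered later in Section 4; the conversion factors (17 kJ per gram of proteins and saccharides, 37 kJ per gram of fats) are those quoted further below, and small rounding differences against the reported values are to be expected:

# Ideal weight, basal energy (Harris-Benedict) and total daily requirement
# for a woman, following the formulas above; the activity factor p and the
# postprandial thermogenesis d are taken as given inputs here.
def daily_requirements_female(height_cm, age_years, p, d):
    w_f = 0.593 * height_cm - 38.8                                             # ideal weight [kg]
    v_f = 4.184 * (655.1 + 9.56 * w_f + 1.85 * height_cm - 4.68 * age_years)   # basal energy [kJ/day]
    t = v_f * p + d                                                            # total daily energy [kJ/day]
    # 15/30/55 % of energy from proteins/fats/saccharides.
    macros_g = {"proteins": 0.15 * t / 17,
                "fats": 0.30 * t / 37,
                "saccharides": 0.55 * t / 17}
    return w_f, v_f, t, macros_g

print(daily_requirements_female(height_cm=173, age_years=26, p=1.5, d=919))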
Those proportions mentioned are recommended for a healthy adult with normal physical activity. In our calculations we use the fact that one gram of fats yields 37 kJ of energy, and one gram of proteins or saccharides yields 17 kJ of energy [13]. The client has to tell the adviser all the eating habits, such as which foods or dishes are the client's favourite or unfavourite, and the client should evaluate his or her normal eating habits. The client has to record all consumed food items during one week along with their amounts. The adviser can calculate from the data which amounts of nutrients are necessary to be increased or reduced. Then the adviser assesses the client's nutritional situation and gives recommendations for a positive change. Then the adviser can create a new diet with the help of some software. The principle of the design is always the same. The adviser selects foods from a database according to the adviser's recipes or the client's preferred food items, respectively. It is necessary to assign some quantity to each food. The adviser usually works with the most important nutrients only. In addition, the database usually does not contain all values of nutrients and many of them can be missing. When the adviser compiles a list of all the food items for one day, then some nutrients may exceed or not reach the required amounts. Therefore, the adviser has to go back in the list of the food items and to change the quantity of some items. Sometimes the adviser has to replace some food items by other ones and change the food quantities until the nutrients are at the optimal levels. This procedure can be time consuming. The adviser has to do the same procedure for at least the next 6 days. A diet plan is usually prepared for one week. Finally the adviser gives the client a final recommendation and principles which the client should follow.
3
Mathematical model
The basic components of the diet are nutrients. The optimal intake of nutrients is necessary for our body and life. Every nutrient is a carrier of some type of biochemical function in our organism and any lack or surplus of nutrients is dangerous. Let us consider food items i = 1, …, m and nutrients j = 1, …, n. We will work with 5 courses (breakfast, first snack, lunch, second snack and dinner) and we design a diet plan for 7 days, d = 1, …, 7, considering 5 courses, l = 1, …, 5, a day. So in total we have k = 1, …, 35 courses during a week. Denote the set of all 35 courses during the 7 days as G and let the non-empty disjoint sets G_d for all days be such that G = G_1 ∪ G_2 ∪ · · · ∪ G_7. In addition let the sets of all breakfasts E_1, first snacks E_2, lunches E_3, second snacks E_4 and dinners E_5 be such that G = E_1 ∪ E_2 ∪ · · · ∪ E_5. Consider a binary matrix A = (a_if) for all i, f = 1, …, m which expresses compatibility between food items (1 if two food items are compatible, 0 otherwise) and a binary matrix B = (b_ik) for all i = 1, …, m, k = 1, …, 35 which expresses compatibility between food items and courses (1 if a food item and a course are compatible, 0 otherwise). The columns of the matrix B for every course during the week can be the same. Let us have a real matrix of nutritive values C = (c_ij) for all i = 1, …, m, j = 1, …, n. Vectors of minimal and maximal daily recommended values of nutrients are b⁻ = (b⁻_j) and b⁺ = (b⁺_j) for all j = 1, …, n. Vectors of minimal and maximal quantity of food are v⁻ = (v⁻_i) and v⁺ = (v⁺_i) for all i = 1, …, m. Let y_ik be a binary variable indicating whether the food item i is used in the course k. We forbid two incompatible food items in the same course as follows:
y_ik + y_fk ≤ 1,   (1)
for all i, f = 1, …, m, k = 1, …, 35, where i ≠ f and a_if = 0. It is undesirable to have a food item in an incompatible course, so
y_ik = 0   (2)
for all i = 1, …, m and k = 1, …, 35 such that b_ik = 0. We want to have at least three food items in every course:
Σ_i y_ik ≥ 3   (3)
for all k = 1, …, 35.
Let x_ik ≥ 0 be a real variable that means the amount of food item i used in course k. We need to satisfy the daily nutrient recommendations as follows:
Σ_i Σ_{k∈G_d} c_ij x_ik ≥ b⁻_j   and   Σ_i Σ_{k∈G_d} c_ij x_ik ≤ b⁺_j   (4)
for all j = 1, …, n and d = 1, …, 7. Inspired by [10], we can add a distribution of energy intake within the daily nutritional amounts; it means 20 % of total energy is needed for breakfast, 30 % for lunch, 25 % for dinner and 12.5 % for the first and the second snack. The inequalities for breakfast look as follows:
Σ_i Σ_{k∈E_1} c_i1 x_ik ≥ 0.2 b⁻_1   and   Σ_i Σ_{k∈E_1} c_i1 x_ik ≤ 0.2 b⁺_1,   (5)
where b⁻_1 or b⁺_1 means the minimum or maximum of the daily recommended value of energy, respectively. The inequalities for the other daily courses are analogous.
We do not want to waste the food, so we can add a condition to ensure that we consume a whole package (100 grams, 150 grams, etc.) of a food item or a whole hen egg (the average weight of an egg is 50 grams). So we add equalities like
Σ_k x_ik = 100 z_i100 + 50 z_i150,   (6)
for some food items i that are supplied in packages of certain sizes, where variables like z_i100, z_i150 are integer variables such that 0 ≤ z_i100 ≤ z_i150. The coefficient 50 by z_i150 is the difference between the sizes of the packages of 100 and 150 grams. We would like to use some minimum or maximum value of a food item every day:
x_ik ≥ v⁻_i y_ik   and   x_ik ≤ v⁺_i y_ik,   (7)
for all i = 1, …, m and k = 1, …, 35. It can happen that some food item is repeated in some courses. Let M_{l,g} be a set of similar foods used in the same course l. For example the set of milk products for breakfast, such as milk, curd or yogurt. Another example is the set of bakery products for breakfast. We do not want to have the same milk product nor the same bakery product for breakfast on two consecutive days. We can achieve that by the inequalities
Σ_{i∈M_{l,g}} |y_iq − y_ir| ≥ 1   (8)
for all q, r ∈ E_l, |q − r| = 5, and l = 1, …, 5 and the respective groups g of foods (such as milk products, bakery products, etc.). The inequality (8) is equivalent to
−ε_{i,q,r} ≤ y_iq − y_ir ≤ ε_{i,q,r}   and   Σ_{i∈M_{l,g}} ε_{i,q,r} ≥ 1.   (9)
We need input data described in Section 2 for our model described in Section 3. Let us consider a woman. She is 26 years old and 173 cm tall. The ideal body weight is wf = 64 kg, the basal energy is vf = 6166 kJ, the factor activity is p = 1.5 and the postprandial thermogenesis is d = 919 kJ. The total daily energy requirement is t = 10109 kJ. The composition of required macroelements is as follows: 89 or 82 or 327 grams of proteins or fats or saccharides, respectively. It is not necessary that the diet plan has the exact values mentioned above, tolerance of 10 % can be used in the model so we can work with intervals.
754
Mathematical Methods in Economics 2016
Nutrient Energy [kJ] Proteins [g] Fats [g] .. .
Food items [100 g] Chicken Potatoes 694 20 10 .. .
··· ··· ··· ··· .. .
322 2 0 .. .
Recommended amounts Min Solution Max 9098 80 73 .. .
10364 94 90 .. .
11122 98 90 .. .
Table 1 Input data and calculated amounts of nutrients for one day In the model we work with 28 nutrients and 65 food items (see [5]). The amounts of microelements are taken from [5], [12], [13]. Our model works with the values from Table 1. There are food items and the respective amounts of nutrients in each of the items. The minimal and maximal amounts of nutrients for one day are also given. Then, our model consists of 4432 variables, out of which 10 are integer and 2205 are binary, and 53411 constraints. The calculated diet for the client for one day of the week is shown in Table 2. The amounts of the nutrients in the one-day diet plan are shown in Table 1 in the Solution column. This model was computed in the specialized optimization software FICO(R) Xpress Optimization Suite. On a Windows XP SP3 computer with 0.99 GB RAM and Intel Atom 1.60 GHz CPU, the computation took about 5 seconds. Course
Food items
Energy [kJ]
Breakfast
120 g yogurt, 30 g apple, 35 g muesli, 20 g orange marmelade, 34 g wholemeal biscuit, 6 g syrup, 130 g mandarin 60 kaiser rolls (1 piece), 9 g margarine, 70 g cheese Soup: 74 g whole wheat pasta, 33 g green beans, broth The main course: 150 g chicken, 130 g potatoes, 20 g cucumber, 33 g eggplant, 25 g lentils, 8 g sunflower oil 50 g yogurt, 61 g cornflakes, 18 g honey, 7 g syrup 120 g kaiser roll (2 pieces), 13 g margarine, chives Salad: 24 g tomatoe, 94 g bell pepper, 100 g cucumber, 32 g ricotta, 13 g olive oil, basil
2224.4
Snack Lunch
Snack Dinner
1137.5 3336.6
1390.3 2275.0
Table 2 Optimal diet plan for one day The client with this diet plan can be sure that all the nutritional values for her day are fulfilled.
5
Discussion
The composition of the food items in Table 2 corresponds to the client’s eating habits. It may be very time consuming for the adviser to create a one-day diet plan that respects the optimal amounts of all 28 nutrients; it may even be impossible for the adviser to do that for the whole week. Our model gave us the solution in a couple of seconds. Time consuming in our case can be creating the matrices of compatibility A and B. But the matrices can be the foundations of the adviser’s data. The adviser can do only some small changes for every client in the matrices. In addition the model can give us a diet plan for period longer than one week. The advisers usually do not care about wasting the food. If the clients are eating according to the adviser’s diet plan, the clients can have a lot of wasted food items in the end of the week. Inequalities (8) of our model prevent that. This article gives an idea how the working steps of the adviser and the care of the client’s health can be improved.
755
Mathematical Methods in Economics 2016
Acknowledgements The author thanks to doc. David Bartl for comments that helped to improve the article. The use of the FICO(R) Xpress Optimization Suite under the FICO Academic Partner Program is gratefully acknowledged. This research was supported by the internal grant no. SGS12/PřF/2016–2017, Geometric mechanics, optimization and number theory. The author is grateful to the two anonymous referees for their useful comments that helped to improve the manuscript.
References [1] Bender, D., A.: Introduction to nutrition and metabolism. CRC Press, London, 2002. [2] Briend, A., Ferguson, E., Darmon, N.: Local food price analysis by linear programming: a new approach to assess the economic value of fortified food supplements. Food and Nutrition Bulletin 22.2 (2001), 184–189. [3] Briend, A., et al.: Linear programming: a mathematical tool for analyzing and optimizing children’s diets during the complementary feeding period. Journal of Pediatric Gastroenterology and Nutrition 36.1 (2003), 12–22. [4] Conforti, P., D’amicis, A.: What is the cost of a healthy diet in terms of achieving RDAs? Public Health Nutrition 3.3 (2000), 367–373. [5] Heseker, H., Heseker, B.: Die Nährwerttabelle: [über 40.000 Nährstoffangaben; einfache Handhabung; Tabellen zu Laktose, Fruktose, Gluten, Purin, Jod und trans-Fettsäuren]. Umschau, Neustadt an der Weinstraße, 2016. [6] Kastnerová, M.: Poradce pro výživu. Nová Forma, České Budějovice, 2011. [7] Leung, P., Wanitprapha, K., Quinn, L., A.: A recipe-based, diet-planning modelling system. British Journal of Nutrition 74.2 (1995), 151–162. [8] Macdiarmid, J., et al.: Sustainable diets for the future: can we contribute to reducing greenhouse gas emissions by eating a healthy diet? The American Journal of Clinical Nutrition 96.3 (2012), 632–639. [9] Maillot, M., Vieux, F., Amiot, M., J., Darmon, N.: Individual diet modeling translates nutrient recommendations into realistic and individual-specific food choices. The American Journal of Clinical Nutrition 91.2 (2010), 421–430. [10] Reihserová, R.: Jak poskládat stravu v průběhu dne? Svět potravin. Available from: http://www.svet-potravin.cz/clanek.aspx?id=4177 (cited 25 April 2016). [11] Sklan, D., Dariel, I.: Diet planning for humans using mixed-integer linear programming. British Journal of Nutrition 70.1 (1993), 27–35. [12] Společnost pro výživu: Referenčnı́ hodnoty pro přı́jem živin. Výživaaservis s.r.o., Praha, 2011. [13] Svačina, Š.: Klinická dietologie. Grada Publishing, Praha, 2008. [14] Zavadilová, V.: Výživa a zdravı́. Ostravská univerzita v Ostravě, Ostrava, 2014.
Wavelets Comparison at Hurst Exponent Estimation Jaroslav Schürrer1 Abstract. In this paper we present Discrete Wavelet Transformation based on Hurst exponent estimation and compare different wavelets used in the process. Self-similar behavior mostly associated with fractals can be found in broad range of areas. For self-affine processes the local properties are reflected in the global ones and the Hurst exponent is related to fractal dimension, where fractal dimension is a measure of the roughness of a surface. For usually nonstationary time series the Hurst exponent is a measure of long term memory of time series. From former works mentioned in references we know that Discrete Wavelet Transformation provides better accuracy compared to Continuous Wavelet Transformation and that it outperforms methods based on the Fourier spectral analysis and R/S analysis. Keywords: Hurst exponent, Wavelet Transformation, signal power spectrum. JEL classification: C44 AMS classification: 90C15
1
Hurst exponent
Many processes of interest in finance modeling are considered as self-similar processes. For these processes we estimate a dimensionless estimator, usually called the Hurst exponent and denoted by H, for the self-similarity of a time series. Hurst exponent estimation for real world data plays an important role in the study of processes that exhibit properties of self-similarity. The Hurst exponent was first defined by Harold Edwin Hurst to develop a law for the regularities of the Nile water level. The Hurst exponent is one of the fractal measures and varies within 0 < H < 1. There are three important regions of Hurst exponent values within the meaningful range [0, 1]: H > 0.5 for persistent time series, H < 0.5 for anti-persistent time series and finally H = 0.5 for uncorrelated series. There are many methods for Hurst exponent estimation using time series. The most commonly used methods are the classical Rescaled Range analysis (R/S analysis), variance-time analysis, detrended fluctuation analysis (DFA) and the wavelet based estimator. In this article we focus on the wavelet based estimator method. It is widely accepted that many stochastic processes exhibit a long-range dependence and fractal structure. The most suitable mathematical method for research of the dynamics and structure of such processes is fractal analysis. Let us define a self-similar stochastic process X(t) as the process a^{−H} X(at), a > 0, which shows the same second-order statistical properties as X(t). The Hurst exponent is a measure of self-similarity or a measure of the duration of long-range dependence of a stochastic process. An example of fractal stochastic structures is the modern financial market, which uses information about past events to affect decisions in the present, and contains long-term correlations and trends. The market remains stable as long as it retains its fractal structure.
2
Discrete Wavelet Transformation and signal power spectrum
Wavelets represent a way of analyzing time series, in particular non-stationary time series, where they ensure time and frequency localization. It means that they provide wavelet coefficients that are local both in time and frequency. The main features of wavelets, multi-resolution and localization, are well suited for the extraction of fluctuations at various scales from local trends over appropriate window sizes. The extracted fluctuations are influenced by the choice of wavelet. 1 Czech Technical University in Prague/Masaryk Institute of Advanced Studies, Kolejni 2637/2a, Prague, Czech Republic, schurjar@cvut.cz
Multiresolution analysis decomposes a signal into subsignals with different resolution levels. From a practical point of view, it is not possible to calculate wavelet coefficients at every scale, so we choose only a subset of scales and positions for the calculations. Dyadic scales and positions (based on powers of two) are used for the Discrete Wavelet Transformation (DWT). DWT is a linear transformation where, in our case, the input data is represented by a vector of length equal to a power of two and the output is another vector of the same length which separates two different frequency components. An efficient way to implement this scheme using filters was developed in 1988 by Mallat. This algorithm is known as the pyramidal algorithm or two-channel sub-band coder. The input vector (signal) is passed through a series of high pass filters to analyze the high frequencies, and it is passed through a series of low pass filters to analyze the low frequencies. The resolution of the signal is changed by the filtering operations, and the scale is changed by upsampling and downsampling (subsampling) operations. The DWT analyzes the signal at different frequency bands with different resolutions by decomposing the signal into approximation and detail coefficients. DWT employs two sets of functions, called scaling functions and wavelet functions, which are associated with low pass and high pass filters. The original signal x[n] is first passed through a halfband high pass filter g[n] and a low pass filter h[n] with subsampling by 2, simply by discarding every other sample. This constitutes one level of decomposition and can be mathematically expressed as follows:
y_high[k] = Σ_n x[n] · g[2k − n]   (1)
y_low[k] = Σ_n x[n] · h[2k − n]   (2)
where y_high[k] and y_low[k] are the outputs of the high pass and low pass filters respectively, after subsampling by 2. This procedure can be repeated for further decomposition, where at every level the filtering and subsampling result in half the number of samples, thus half the time resolution and half the frequency resolution. Figure 1 schematically depicts this procedure. Because of the filtering, the length of the input signal plays an important role. Wavelet algorithms expect the input length to be a power of two. If the length is not suitable, we have to use some extrapolation of the input data in order to extend the signal before computing the Discrete Wavelet Transform using the cascading filter banks algorithm. There are several methods of signal extrapolation that are mainly used: zero-padding, constant-padding, symmetric-padding and periodic-padding. The length of the signal also influences the maximum number of decomposition levels, which is equal to floor(log2(data_len / (filter_len − 1)) / log2(2)).
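A single level of this filter bank is available directly in the pyWavelets package used later in the paper. A minimal sketch follows; the signal below is arbitrary test data and the db2 wavelet is an illustrative choice:

import numpy as np
import pywt

x = np.random.default_rng(1).standard_normal(1024)   # input signal, length a power of two

cA, cD = pywt.dwt(x, "db2")        # one level: approximation (low pass) and detail (high pass)
max_level = pywt.dwt_max_level(data_len=len(x), filter_len=pywt.Wavelet("db2").dec_len)
print(len(cA), len(cD), max_level)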
Figure 1 Wavelet decomposition scheme.
For a given mother wavelet ψ and scaling function ϕ with approximation coefficients a(j, k) and detail coefficients d(j, k), these coefficients are defined as:
a(j, k) = ∫_{−∞}^{∞} X(t) ϕ_{j,k}(t) dt,   d(j, k) = ∫_{−∞}^{∞} X(t) ψ_{j,k}(t) dt   (3)
where ϕ_{j,k} = 2^{−j/2} ϕ(2^{−j} t − k), ψ_{j,k} = 2^{−j/2} ψ(2^{−j} t − k). The original signal X(t) is then represented in terms of the DWT with the following equation:
X(t) = Σ_k a_{j,k} ϕ_{j,k}(t) + Σ_{j=1}^{J} Σ_k d_{j,k} ψ_{j,k}(t)   (4)
The wavelet power is calculated by summing the squares of the coefficient values (approximation or detail) for each level:
E(j) = Σ_{k=0}^{N/2^j − 1} W_{j,k}^2   (5)
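Computing E(j) from a full decomposition is straightforward with pyWavelets. The sketch below assumes an arbitrary test signal and the Coiflet 1 wavelet as an illustrative choice:

import numpy as np
import pywt

x = np.cumsum(np.random.default_rng(2).standard_normal(4096))   # toy random-walk signal

coeffs = pywt.wavedec(x, "coif1", level=6)             # [cA_6, cD_6, cD_5, ..., cD_1]
details = coeffs[1:]                                   # detail coefficients per level
energies = [float(np.sum(d ** 2)) for d in details]    # E(j) = sum_k W_{j,k}^2, formula (5)
print(energies)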
3
Hurst exponent estimation
During the last decade several methods were published which use the wavelet transformation to estimate the Hurst exponent. The averaged wavelet coefficient method described by Simonsen and Hansen in [5] is based on the statistical equality of the continuous wavelet transformation of a self-affine function h(x), expressed by the equation W[h(x)](a, b) ≃ W[λ^{−H} h(λx)](a, b), where the continuous wavelet transformation (CWT) is defined by the equation
W[h](a, b) = (1/√a) ∫_{−∞}^{∞} ψ*_{a,b}(x) h(x) dx   (6)
Here ψ*(x) denotes the complex conjugate. We also mention that DWT can be understood as a critically sampled CWT. Another method, published by Faggini and Parziale in [1], is the wavelet-based Hurst parameter estimator based on a spectral estimator obtained by performing a time average of the wavelet detail coefficients |d_{j,k}|^2 at a given scale:
S_r = (1/N_i) Σ_j |d_{j,k}|^2   (7)
where N_i is the number of wavelet coefficients at a scale i and N is the number of data points. First the DWT is computed and log2 E[d(j, k)]^2 is evaluated, followed by the variance of these estimates, and then a linear regression is performed to find the slope γ. H is then calculated as H = 0.5(1 + γ), 0 < γ < 1. Next, we mention the fluctuation functions based on discrete wavelet coefficients proposed by Manimaran, Paniggrahi and Parikh in [4] that use the wavelet power, also referred to as energy, to calculate the fluctuation function at a level s:
F(s) = (Σ_{j=1}^{s} E(j))^{1/2}   (8)
The scaling behavior is then obtained through F(s) ∼ s^H. Here H is the Hurst scaling exponent, which can be obtained from the slope of the log-log plot of F(s) versus the scales s. At the end we describe the Gloter and Hoffmann method published in [2], which is based on the energy level estimator. This estimator is defined by the equation:
E[Q_{j+1} / Q_j] = C · 2^{−2H}   (9)
where Q is the detail coefficient energy of a certain decomposition level and C is an arbitrary unknown constant. Computing this for several j and using a linear least squares fit, the Hurst exponent can be estimated as γ · (−0.5), where γ is the slope of E_j.
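A sketch of the energy-level idea is given below. It fits a straight line to the base-2 logarithms of the per-level detail energies and applies the γ · (−0.5) relation quoted above. The orientation of the level index (here taken coarse-to-fine, as returned by pywt.wavedec) and the use of ordinary Brownian motion as a rough sanity check are assumptions of the sketch, not prescriptions of the paper:

import numpy as np
import pywt

def hurst_energy_level(signal, wavelet="db4", level=8):
    # Hurst estimate from the slope of log2 detail energies, cf. relation (9).
    # Assumes the level index runs from coarse to fine, as in the list returned
    # by pywt.wavedec (cD_L, ..., cD_1); H is then -0.5 times the fitted slope.
    details = pywt.wavedec(signal, wavelet, level=level)[1:]
    log_energy = np.log2([np.sum(d ** 2) for d in details])
    slope = np.polyfit(np.arange(len(log_energy)), log_energy, 1)[0]
    return -0.5 * slope

# Ordinary Brownian motion (H = 0.5) as a quick sanity check.
bm = np.cumsum(np.random.default_rng(3).standard_normal(2 ** 14))
print(hurst_energy_level(bm))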
4
Comparison of Hurst exponent estimation
In this part we compare different wavelets in Hurst exponent estimation. The fractional Brownian motion (fBM) has been chosen as a model of a random process which exhibits fractal properties. We generate two realisations with known Hurst exponents equal to 0.5 and 0.8. The length of the data sample was chosen so that the border effect is eliminated. In Figure 2 we can see the generated realizations of fBM.
Figure 2 fBM with H1 = 0.5 (blue), H2 = 0.8 (green).
All numerical computations were done by our own code written in Python with the use of the pyWavelets and NumPy packages. The following algorithms have been compared:
• R/S analysis
• The wavelet energy level estimator Haar
• The wavelet energy level estimator DB2, DB4, DB5
• The wavelet energy level estimator Coiflet 1
• The wavelet energy level estimator Biorthogonal 1.3
First we analyze fBm with H = 0.5 and two different lengths of the input vector, N = 10000 and N = 5000. The following tables summarize the results from the R/S analysis and the WEL estimators using different wavelet families. We can observe that Coiflet wavelets perform quite well in Hurst estimation. Another important observation is that the compact support length of the wavelet influences the estimation accuracy (compare Haar, DB2 - DB5). The number of data points of the time series also plays an important role.
Hurst estimate method      H (N=10 000)   Difference   H (N=5000)   Difference
RS analysis                0.51           -0.01        0.482        0.018
WEL estimator Haar         0.47           0.03         0.445        0.055
WEL estimator DB2          0.46           0.04         0.344        0.156
WEL estimator DB4          0.46           0.04         0.420        0.08
WEL estimator DB5          0.43           0.07         0.335        0.165
WEL estimator Coif1        0.499          0.001        0.463        0.037
WEL estimator Bior1.3      0.479          0.021        0.476        0.024
Table 1 Hurst estimator methods comparison for fBM with H = 0.5
The following table summarizes the results of the same procedure for an input vector with Hurst exponent H = 0.8.
Hurst estimate method      H (N=10 000)   Difference   H (N=5000)   Difference
RS analysis                0.8            0.00         0.817        -0.017
WEL estimator Haar         0.78           0.02         0.756        0.044
WEL estimator DB2          0.78           0.02         0.671        0.029
WEL estimator DB4          0.77           0.03         0.763        0.037
WEL estimator DB5          0.69           0.11         0.699        0.101
WEL estimator Coif1        0.78           0.02         0.747        0.053
WEL estimator Bior1.3      0.79           0.01         0.75         0.05
Table 2 Hurst estimator methods comparison for fBM with H = 0.8
If we further decrease the input vector length, we notice that the efficiency of the Hurst estimator based on the DWT dramatically goes down. So we can conclude that the estimator is efficient above N = 4096.
5
Conclusion
The Discrete Wavelet Transformation approach yields correct values for the Hurst exponent, but we have to take several elements into account. First of all, the border effect due to filtering can lead to wrong values. This can be eliminated with the smallest possible filter selection and signal extension. The length of the input data also influences the maximum level of decomposition during DWT, and in our case the bigger the better. We can also observe some underestimation of H due to sampling, especially in early stages of decomposition. A big advantage of the DWT approach to Hurst exponent estimation is that the wavelet transformation can operate on non-stationary data (some methods can be applied only to stationary time series).
Acknowledgements Research described in the paper was supervised by Prof. M. Vosvrda, Institute of Information Theory and Automation of the ASCR.
References [1] Faggini, M., Parziale, A.: Complexity in Economics: Cutting Edge Research. Springer, 2014. [2] Gloter, A., Hoffmann, M.: Estimation of the Hurst parameter from discrete noisy data. Annals of Statistics 35 (2007), 1947-1974. [3] Gneiting, T., Schlather, M.: Stochastic models which separate fractal dimension and Hurst effect. SIAM Review 46 (2001), 269–282. [4] Manimaran, P., Paniggrahi, P. K., Parikh, J. C.: On Estimation of Hurst Scaling Exponent through Discrete Wavelets. ArXiv Physics e-prints physics/0604004 (2006), [5] Simonsen, I., Hansen, A., Nes, O. M.: Determination of the Hurst Exponent by use of Wavelet Transforms. Physical Review E 53 (1998), 2779–2787. [6] Walnut, D.: An Introduction to Wavelet Analysis. Birkhäuser, Boston, 2004.
Possibilities of Regional Input-Output Analysis of Czech Economy Jaroslav Sixta1, Jakub Fischer2 Abstract. Regional Input-Output Analysis belongs to the group of data demanding models. Such models can be used for detailed description of regional economy and for advanced regional studies. Theoretically, this issue is well founded and there is a lot of available literature on these issues. But in reality, pure regional Input-Output Tables are difficult to find. Only some official statistical authorities compile and publish such detailed regional data or even completed Regional Input-Output Tables. Construction of these tables is demanding in terms of data sources, i.e. statistical surveys, and capacities. Therefore, we used a different approach based on regionalization models and we constructed pure regional Input-Output Tables for the Czech Republic for 2011. Such work lasted three years (2013-2015) and we finally published product by product tables for the NUTS 3 level, the regions "Kraje" of the Czech Republic. The paper follows up on this work and illustrates regional specifics with simple tools of input-output analysis. Czech regions are compared on the level of input coefficients and their contribution to the national aggregates. Mutual dependency of regions is discussed as well. Keywords: Regional, Input-Output, Multipliers. JEL Classification: C67, R10, O11 AMS Classification: 65C20
1
Introduction
Regional Input-Output Analysis (RIOA) is a powerful tool for analyzing regional economy. The structure of regional output and intermediates allows identification of regional dependency on imports and regional sensitivity to final demand shocks. Advanced economic models used for the preparation of economic policy need detailed and reliable data. In specific cases, regularly published national Input-Output Tables (IOTs) based on country averages are not adequate for the analysis. The fear of deviation from reality grows with analytical description of the regional impact. Even though the demand for regional data is quite high and frequent, the availability of such tables is somewhat lacking or not very timely trail behind this demand. A discussion of the necessity of regional policies and their efficient application require both detailed data and skilled users. A regional economy consists of both regional businesses and regional branches of national or multinational companies. When discussing regional economic growth, employment and living conditions, effective tools for the measurement of policy impacts are usually missing. Fortunately, compilation of Input-Output Tables has a long tradition in the Czech Republic, see Sixta [13] and construction of regional tables is therefore possible. Regional input-output models face the principal difficulty with the lack of available data, which makes them more demanding for compilation. Since 1950s, input-output analysis was applied mostly in the form of singleregion models to solve regional challenges. Comparing inter-regional (IRIO) and multiregional (MRIO) models proves IRIO models much more data demanding. Papers aimed at the level of real regions (of a state) occur rather rarely. Some attempts can be found, e.g. in Italy (Benvenuti, Martellato and Raffaelli, [1]), Finland (see [6]) or the Netherlands [2]. Some other examples may be found in Wiedmann [16]. The case of derivation of regional inputoutput tables based on GRIT method is deeply discussed in Miller and Blair [7]. GRIT is easy to apply and therefore it is widely popular among economists. It was used for numbers of countries, for example also for the Czech Republic [10], China (Wen and Tisdel [15]), etc. The aim of this paper is to compare Input-Output Analysis based on national IOTs with the analysis based on regional IOTs. For this case we chose Moravian-Silesian Region, the region aimed at mining industry and facing serious economic problems. The issue is illustrated by simple statistic IOA. Since Regional Input-Output Tables (RIOTs) are not officially compiled in the Czech Republic, we used RIOTs constructed by the Department of Economic Statistics of the University of Economics in Prague for the year 2011. These tables are available for the
1
University of Economics in Prague, Department of Economic Statistics, Nam W. Churchilla 4, Prague, Czech Republic, sixta@vse.cz 2 University of Economics in Prague, Department of Economic Statistics, Nam W. Churchilla 4, Prague, Czech Republic, fischerj@vse.cz
use of domestic output (RIOTd) and imported products (RIOTi), product-by-product type (82) and valued at basic prices. There are lots of available information sources connected to RIOTs, e.g. Miller and Blair [7].
2
Methodology
Regional Input-Output Tables were compiled within the research project, see Sixta and Fischer [12] and detailed methodology can be found also in Sixta and Vltavská [13]. The main principle lies in decomposing of national symmetric IOTs into regions. The key role is played by estimates of output vectors, regional trade and published regional accounts. The crucial point significantly influencing the quality of regional input-output analysis is the quality of RIOTs. Especially, it is connected with the movement of goods and services; both by international and interregional trade. The most import part, regional output vectors, is not officially published. The Czech Statistical Office publishes only gross value added. The estimates of all 14 regional output vectors are based on the decomposition of national output vectors in specific institutional sector. National input coefficients were used for the estimates of regional intermediate consumption. It means that the national technology is applied on regional level. Expenditure approach is also not officially published and therefore we reconstructed the use side of IOTs in line with Kramulová and Musil [9]. Regional input-output tables are available for all 14 regions of the Czech Republic in ESA 1995 [4] methodology. Despite ESA 2010 [5] was implemented in September 2014, RIOTs based on ESA 1995 are fully usable for input-output analysis. Methodical changes between ESA 1995 and ESA 2010 do not affect results significantly. Such analysis is based on the technological relationships that were not changed. The key methodical differences between input-output tables compiled according to ESA 1995 and ESA 2010 lie in the recording of processing. With respect to improved approach to ownership concept, processing in international trade is excluded from imports, intermediate consumption, export and output. For the purposes of static input-output analysis, intermediates that originate only from domestic output are use and therefore the changes in import matrices do not significantly influence the results. The benefits of regional input-output tables are demonstrated by simple static input-output analysis. Even if this tool is rather simple, it is very convenient for illustration purposes. It covers mainly complete impact of the change of final use on output, gross value added and employment. Suppose that all necessary IOA models requirements are fulfilled (see EU [3]) and then the change of vector of output (x) is estimated on the basis of the change of final demand (y), see formula 1:
\[ \Delta x = (I - A_d)^{-1}\, \Delta y, \tag{1} \]
where \(\Delta x\) is the change of the output vector, \(\Delta y\) is the change of the final demand vector, and \(A_d\) is the matrix of input coefficients computed from domestically produced intermediates.
Intermediate consumption (\(z_j\)) is estimated from the output vector by the given input coefficients (\(a_{ij}\)):
\[ z_j = \sum_{i} a_{ij}\, x_j, \tag{2} \]
and gross value added (\(g_j\)) is derived as the difference
\[ g_j = x_j - (z_j + mi_j), \tag{3} \]
where \(g_j\) is the gross value added of product \(j\), \(x_j\) the output of product \(j\), \(z_j\) the domestically produced intermediates used in product \(j\), and \(mi_j\) the imported intermediates used in product \(j\).
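To illustrate how formulas (1)–(3) are applied, the following sketch runs the static model on a small, made-up three-industry example in Python; the coefficient matrices and the final demand shock are illustrative only and are not the RIOT data used in this paper.

```python
import numpy as np

# Illustrative 3-industry example (invented numbers, not the RIOTs used in the paper).
A_d = np.array([[0.10, 0.05, 0.02],   # input coefficients, domestic intermediates
                [0.20, 0.15, 0.10],
                [0.05, 0.10, 0.08]])
A_m = np.array([[0.03, 0.02, 0.01],   # input coefficients, imported intermediates
                [0.05, 0.04, 0.02],
                [0.02, 0.03, 0.05]])

dy = np.array([0.0, -20000.0, 0.0])   # final demand shock, cf. formula (4)

# Formula (1): change of output induced by the final demand shock
dx = np.linalg.solve(np.eye(3) - A_d, dy)

# Formula (2): change of domestically produced intermediates by product (column sums of A_d)
dz = A_d.sum(axis=0) * dx
# change of imported intermediates used per unit of output
dmi = A_m.sum(axis=0) * dx

# Formula (3): change of gross value added
dg = dx - (dz + dmi)

print("d output :", dx.round(0))
print("d GVA    :", dg.round(0))
print("d imports:", dmi.round(0))
```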
Distinguishing between domestically produced (z) and imported (mi) intermediates allows identification of the relationship between regional output and regional imports. The emphasis is put on the identification of the impact
on the particular region. This means that import matrices are available for all fourteen Czech regions and cover both international and interregional trade. The trade matrix connecting the regions has not been finished yet and we are still working on this issue. For every region, it is possible to separate international trade and interregional trade (both imports and exports). Imports from foreign countries were allocated according to the use of particular products and exports according to output. Interregional flows were obtained by balancing supplies and uses at the level of products. The sum of interregional exports must equal the sum of interregional imports; for more details see Sixta and Vltavská [13] or Kahoun and Sixta [7].
3
Regional Input-Output Model
The effect of using regional input-output tables is presented for the case of the Moravian-Silesian Region (MSR) and a decline of final demand for mining industry products. The Moravian-Silesian Region accounted for about 10% of Czech gross value added in 2011. This region is significantly dependent on the mining industry, which forms 7% of regional gross value added. About 26 thousand people work in the industries of mining and energy (B, D and E), representing 5% of regional employment. The mining industry is strongly linked with other industries, and possible negative effects of final demand on the output of the mining industry exceed the region itself. The dependency of the region is given by the structure of intermediates. For the illustration of a negative final demand shock, let us suppose a decrease in exports of the mining industry by 20 CZK bn. From the perspective of the region, it is not important whether the shock is caused by the other regions or by the situation abroad. It means that:
ď &#x201E;y = - 20 000.
(4)
When using the static input-output analysis described by formulas (1) to (3), the overall change of the output vector depends on the matrix of input coefficients. The model is computed for two cases: the first is based on national symmetric input-output tables and the second on regional input-output tables for the Moravian-Silesian Region. In the first case, matrix A_d is obtained from national IOTs, in the second case from regional input-output tables. The changes in the output vector (x) are presented in Figure 1.

Figure 1 Change of output by industry (A–S-U), CZK mil.; regional vs. national model

The overall difference between the changes in the structure of output given by the two mentioned approaches is not very significant. It is caused by the very specific input structure of the mining industry; in other words, the mining industry significantly influences the national input-output table. The most significant difference is obtained for the transport and storage industry (H), since these products are highly imported to the Moravian-Silesian Region. Similarly, the imports of manufacturing products can explain the differences in the manufacturing industry (C). The key benefit of using regional input-output tables for input-output analysis lies in the estimates of regional production and regional imports. When using national IOTs (NIOTs), it may not be clear which part of the
output is imported from other regions and which part comes from abroad. Figure 2 describes the total impact on products imported into the region.

Figure 2 Change of imports by industry (A–S-U), CZK mil.; regional vs. national model

Similarly to the case of output, the differences between both sets of coefficients are relatively moderate. The most important impact is obtained for products used in the mining industry, about 2.5 CZK bn., which represents 36% of the decrease of intermediate consumption in this industry, see Table 1.

Indicator                  Total industries          Mining industry
                           RIOT        NIOT          RIOT        NIOT
Output                     -27 053     -31 497       -20 062     -20 170
Domestic intermediates      -7 053     -11 497        -4 328      -6 383
Imported intermediates      -4 005      -4 538        -2 455      -2 829
Gross value added          -15 995     -15 462       -13 279     -10 957
Imports                     -6 710      -7 868          x           x

Table 1 Comparison of Input-Output Analysis based on RIOTs and NIOTs, CZK mil.

The impact of the decrease in external demand on gross value added is highly influenced by imported intermediates. If we use the RIOT for the Moravian-Silesian Region, the decrease in value added amounts to 16 CZK bn.; in the case of the national IOT, it is about 15.5 CZK bn. The most significant differences are observed in the changes of total output, intermediate consumption and imports. When using the regional input-output table, the distribution of imports between regional flows and flows from abroad is possible. The impact is described in Figure 3. The overall decrease of imports is about 4 CZK bn., of which 13% (500 CZK mil.) belongs to regional flows and 87% (3.5 CZK bn.) to flows from abroad. The affected products mostly cover mining, manufacturing and energy, trade and transport. Finally, it means that the decrease of imports in the Moravian-Silesian Region will affect the other Czech regions by 500 CZK mil. The impacts described above deal with static input-output analysis only and therefore these effects can be attributed to the immediate reaction of the economy. In reality, the impacts spread further through the economy and multiplication effects take place. In more advanced models (e.g. Zbranek and Sixta [16]), the initial shock is distributed via economic processes in many rounds. The decrease of exports causes a drop of output, value added, wages and operating profits. Decreased wages and profits in the region mean weaker incentives for investment (gross fixed capital formation) and household consumption.
Figure 3 Distribution of imports by industry (A–S-U), CZK mil.; imports from abroad vs. regional imports
4
Conclusion
The paper briefly describes the benefits of Regional Input-Output Tables for Input-Output Analysis. It was demonstrated that using national averages (national Symmetric Input-Output Tables) leads to different estimates of output and subsequently of imports and exports. The most important benefit of using Regional Input-Output Tables lies in the possibility of estimating the impacts on individual regions of the Czech Republic. For demonstration purposes, we selected the Moravian-Silesian Region and we modelled the impact of a decrease in final demand for mining products (output of the mining industry). The analysis was conducted in two variants: the first variant is based on Regional Input-Output Tables and the second variant on the National Input-Output Table. Even though the results are not very different in the case of value added, they are significantly different for output, intermediate consumption and imports. Regional Input-Output Tables are not usually published by official statistical authorities; such tables are available only for some countries and selected years, e.g. the U.S., Spain or Finland. International databases are usually aimed at groups of countries, and purely regional tables focusing on the regions of a particular country are very scarce. We prepared Regional Input-Output Tables for the 14 regions of the Czech Republic at basic prices for 2011. Even though we used the ESA 1995 methodology, the tables can still be easily used for various approaches to Input-Output Analysis. The aim of the demonstration described within this paper was to illustrate the possibilities of these tables and to promote their usage. The tables are available for free download from our webpage, see http://kest.vse.cz/veda-a-vyzkum/vysledky-vedecke-cinnosti/regionalizace-odhadu-hrubeho-domaciho-produktu-vydajovou-metodou/
Acknowledgements Supported by the grant No. 13-15771S “Regionalization of Estimated Gross Domestic Product by Expenditure Method” of the Czech Science Foundation and also with the long term institutional support of research activities by the Faculty of Informatics and Statistics of the University of Economics
References [1] Benvenuti S. C., Martellato D., Raffaelli C.: A Twenty-Region Input-Output Model for Italy. Economic Systems Research (1995), 101-116. [2] Eding G., Oosterhaven J., Vet De B., Nijmeijer H.: Constructing Regional Supply and Use Tables: Dutch Experiences. In: Hewings G. J. D., Sonis M., Madden M., Kimura. Y.: Understanding and Interpreting Economic Structure, Berlin, 1999, 237-263. [3] European Union: Input-Output Manual. Luxembourg, 2008. [4] Eurostat: European System of Accounts – ESA 1995. Luxembourg, 1996. [5] Eurostat: European System of Accounts – ESA 2010. Luxembourg, 2013.
[6] Flegg. A. T. and Tohmo T.: Regional Input-Output Tables and the FLQ Formula: A Case Study of Finland. Regional Studies 47 (2013), 703-721. [7] Kahoun, J. and Sixta, J.: Regional GDP Compilation: Production, Income and Expenditure Approach. Statistika (2013), 24–36. [8] Kramulova, J. and Musil, P.: Experimentální odhad složek výdajové metody regionálního HDP v ČR. Politická ekonomie (2013), 814–833. [9] Miller, R. E. and Blair, P. D.: Input-Output Analysis: Foundations and Extensions, 2009. [10] Semerák V., Zigic K., Loizou E., Golemanova-Kuharova A.: Regional Input-Output Analysis: Application on Rural Regions in Germany, the Czech Republic and Greece. In: European Association of Agricultural Economists, Rural development: governance, policy design and delivery. Ljubljana, 2010. [11] Sixta, J.: Development of Input-Output Tables in the Czech Republic. Statistika (2013), 4–14. [12] Sixta, J. and Fischer, J.: Regional Input-Output Models: Assessment of the Impact of Investment in Infrastructure on the Regional Economy. In: Proceedings of Mathematical Methods in Economics, Cheb, 2015. [13] Sixta, J. and Vltavská, K.: Regional Input-output Tables: Practical Aspects of its Compilation for the Regions of the Czech Republic. Ekonomický časopis 64 (2016), 56–69. [14] Wen. J. J. and Tisdell C. A.: Tourism and China's Development: Policies, Regional Economic Growth and Ecotourism, Singapore, 2001. [15] Wiedmann T. A: Review of Recent Multi-region Input-Output Models Used for Consumption-Based Emission and Resource Accounting. Ecological Economics (2009), 211-222. [16] Zbranek, J. and Sixta J.: Possibilities of Time Input-Output Tables. In: Proceedings of Mathematical Methods in Economics, Jihlava, 2013, 1046–1051.
A Comparison of Integer Goal Programming Models for Timetabling Veronika Skocdopolova1 Abstract. Timetabling is a wide-spread problem that every school or university deals with. It is an NP-hard problem. Optimisation of a timetable in general leads to reduced costs or less wasted time. Goal programming is a widely used technique for solving not only multi-criteria decision making problems; it has become very popular since it was formulated more than 60 years ago. Using goal programming for timetabling enables the inclusion of soft constraints in the mathematical model. There are two main approaches to solving the timetabling problem. The first one is to split the problem into several interrelated stages. The other one is to solve the problem with one complex model. In this paper, three integer goal programming models for timetabling that were presented at previous MME conferences are compared: a three-stage, a four-stage, and a complex integer goal programming model. All three models were formulated for timetable construction at the University of Economics, Prague. Keywords: Goal programming, integer programming, timetabling, soft constraints. JEL Classification: C61 AMS Classification: 90C29
1
Goal Programming and Timetabling
Goal programming is a widely used technique not only for multi-criteria decision making. Thanks to the relatively easy formulation of a goal programme and the good understanding of its methodology by decision makers, many papers and books dealing with goal programming applications have appeared since its first formulation in 1955. The applications to education are summarized in [15]. One of them is timetabling, a yearly problem of every school. The main aim of this paper is a comparison of three integer goal programming models prepared for the Department of Econometrics at the University of Economics, Prague. In this chapter, the basics of goal programming are summarized and the main approaches to the timetabling problem are introduced. In the next part, the three models are briefly described, and they are then compared in the last chapter.
1.1
Goal Programming
Goal programming is based on the assumption that the decision maker's main objective is to satisfy his or her goals. The aim of goal programming models is not to find the best solution, but to find a solution that meets the decision maker's goals. Therefore the chosen solution can be dominated. The generic goal programming model can be formulated as follows [11]:

Minimise
\[ z = f(\mathbf{d}^-, \mathbf{d}^+) \tag{1} \]
subject to
\[ \mathbf{x} \in X, \tag{2} \]
\[ f_k(\mathbf{x}) + d_k^- - d_k^+ = g_k, \quad k = 1, 2, \ldots, K, \tag{3} \]
\[ d_k^-,\, d_k^+ \ge 0, \quad k = 1, 2, \ldots, K, \]
1
University of Economics, Prague, Faculty of Informatics and Statistics, Department of Econometrics, W. Churchill sq. 4, Prague, Czech Republic, skov01@vse.cz.
where (1) is a general penalisation function of the vectors of negative and positive deviations d⁻ and d⁺, X is the set of feasible solutions satisfying all of the constraints, including the non-negativity constraints (2), f_k(x) is an objective function that represents the k-th criterion, g_k is the k-th goal the decision maker wants to meet, d_k⁻ is the negative deviation from the k-th goal, which represents the underachievement of the k-th goal, d_k⁺ is the positive deviation from the k-th goal, which shows the overachievement of the k-th goal; and (3) are the goal constraints. The goal constraints are often called soft constraints, while the constraints that form the set of feasible solutions can be called hard constraints. From time to time in practice, a certain small deviation from the goal is acceptable with just a little penalisation, while a greater deviation is allowed but with a higher penalisation. To preserve the linear form of the model we can use a multi-stage penalisation function. In this case we have to replace the goal constraints (3) with the following ones:
\[ f_k(\mathbf{x}) + d_k^- - d_k^{+1} - d_k^{+2} = g_k, \quad k = 1, 2, \ldots, K, \tag{4} \]
\[ 0 \le d_k^{+1} \le h_k^{1}, \quad 0 \le d_k^{+2} \le h_k^{2}, \quad d_k^- \ge 0, \quad k = 1, 2, \ldots, K, \tag{5} \]
where d_k^{+1} is the 1st-stage positive deviation, which is non-negative and has an upper bound h_k^1, and d_k^{+2} is the 2nd-stage positive deviation, which is also non-negative with an upper bound h_k^2. The two-stage penalisation function can be formulated as follows:
\[ z = \sum_{k=1}^{K}\big( u_k\, d_k^- + v_{k1}\, d_k^{+1} + v_{k2}\, d_k^{+2} \big), \tag{6} \]
where v_{k1} and v_{k2} are the penalty constants of the 1st- and 2nd-stage positive deviations and u_k is the penalty constant of the negative deviation.
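To make the two-stage penalisation concrete, here is a minimal numerical sketch in Python using the PuLP modelling library; the single goal f(x) = 3x, the bounds and the penalty constants are invented purely for illustration and are not taken from any of the models discussed below.

```python
from pulp import LpProblem, LpMinimize, LpVariable, value

# Illustrative constants: u penalises underachievement, v1 < v2 penalise the two stages
# of overachievement; g is the goal, h1 and h2 the stage bounds.
u, v1, v2, g, h1, h2 = 10.0, 1.0, 5.0, 30.0, 4.0, 100.0

prob = LpProblem("two_stage_goal_programme", LpMinimize)
x      = LpVariable("x", lowBound=0, upBound=20)
d_neg  = LpVariable("d_neg", lowBound=0)               # underachievement of the goal
d_pos1 = LpVariable("d_pos1", lowBound=0, upBound=h1)  # 1st-stage overachievement
d_pos2 = LpVariable("d_pos2", lowBound=0, upBound=h2)  # 2nd-stage overachievement

# goal (soft) constraint, cf. (4):  f(x) + d_neg - d_pos1 - d_pos2 = g
prob += 3 * x + d_neg - d_pos1 - d_pos2 == g
# hard constraint forming the feasible set X, cf. (2)
prob += x <= 8
# two-stage penalisation function, cf. (6)
prob += u * d_neg + v1 * d_pos1 + v2 * d_pos2

prob.solve()
print("x =", value(x), " d- =", value(d_neg),
      " d+1 =", value(d_pos1), " d+2 =", value(d_pos2))
```

With the hard bound x ≤ 8 the goal cannot be met, so the solver reports an underachievement d⁻ = 6; relaxing the bound lets all deviations vanish.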
1.2
Timetabling
Timetabling is a wide-spread problem that every school deals with. There are many ways to construct a university timetable, from the time-tested scheduling board, through various heuristic and metaheuristic methods, to sophisticated optimisation models. There are two main approaches to the construction of a timetable using mathematical modelling. The first of them is creating a complex model, usually using binary variables (e.g. [3] or [9]). However, solving integer programming models is, in general, an NP-hard problem [7], so it leads to utilizing various heuristic or metaheuristic methods. Heuristic and metaheuristic methods give us solutions that are relatively close to the optimal solution in reasonable time. The most common metaheuristic methods used for timetabling are evolutionary algorithms (e.g. genetic [10] or memetic [13]), algorithms based on graph colouring (e.g. [1] or [5]), local and tabu search (e.g. [6] or [14]) or simulated annealing (e.g. [8]). The other approach is based on a decomposition of the problem into several interrelated stages (e.g. [2], [4], or [12]). This means that outputs of one stage are inputs for the next stage.
2
Integer Goal Programming Models for Timetabling
During the past four years three integer goal programming models were presented at the conference Mathematical Methods in Economics. These models were formulated to help with timetable construction at the department of econometrics, University of Economics, Prague. In the next sections all three models are briefly described.
2.1
Three-stage model
This was the first model formulated for the department. It was inspired by [2]. The model consists of three interrelated stages. In the first stage, each course (lectures and seminars) is assigned to a teacher. There are three goals in this stage – maximisation of the total preference of courses, and achieving maximum seminar and lecture loads. The hard constraints of the first stage ensure that no teacher is assigned to a course he or she evaluated with zero preference, that all courses are assigned to a teacher, and that no teacher teaches more than two courses of selected subjects. The result of this stage is the assignment of all courses to teachers according to their preferences.
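The hard-constraint core of this first stage can be pictured as a small assignment programme; the following sketch uses invented teachers, courses, preferences and a load limit, and, unlike the actual goal-programming formulation, omits the deviation variables and the load goals.

```python
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

teachers = ["T1", "T2", "T3"]
courses  = ["C1", "C2", "C3", "C4"]
# invented preferences (0 = teacher refuses the course)
pref = {("T1", "C1"): 3, ("T1", "C2"): 1, ("T1", "C3"): 0, ("T1", "C4"): 2,
        ("T2", "C1"): 2, ("T2", "C2"): 3, ("T2", "C3"): 1, ("T2", "C4"): 0,
        ("T3", "C1"): 0, ("T3", "C2"): 2, ("T3", "C3"): 3, ("T3", "C4"): 1}
max_load = 2  # simple stand-in for the lecture/seminar load goals

prob = LpProblem("stage1_course_assignment", LpMaximize)
x = {(t, c): LpVariable(f"x_{t}_{c}", cat="Binary") for t in teachers for c in courses}

# maximise the total preference of the assignment
prob += lpSum(pref[t, c] * x[t, c] for t in teachers for c in courses)

for c in courses:                        # every course gets exactly one teacher
    prob += lpSum(x[t, c] for t in teachers) == 1
for (t, c), p in pref.items():           # zero-preference pairs are forbidden
    if p == 0:
        prob += x[t, c] == 0
for t in teachers:                       # simple load limit per teacher
    prob += lpSum(x[t, c] for c in courses) <= max_load

prob.solve()
print([(t, c) for (t, c) in x if value(x[t, c]) > 0.5])
```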
In the second stage, each course (assigned to a teacher) is assigned to a time window. The only goal of this stage is that the total number of courses assigned to a time window cannot exceed the number of classrooms available for that time slot. The hard constraints of the second stage ensure that the teacher assigned to the course is available in the given time window, that every teacher is assigned to at most one course in each time window, and that every course is assigned to exactly one time window. The other constraints deal with the cohesion of the input from the previous stage and the variables of this stage. The last stage of the model assigns courses (assigned to teachers and time windows) to classrooms. There is only one goal in this stage, which deals with the location of the course in a classroom with adequate capacity. The hard constraints of the third stage ensure that each course is assigned to exactly one classroom, that each classroom can be assigned at most one course, and that any course can be assigned to a classroom only in case the classroom is available in the time window assigned to the course. The result of the third stage is a complete timetable for the department. This model was presented in [16]. The main problem of this model was that small seminars were often assigned to classrooms with large capacity. Therefore the four-stage model was formulated. The three-stage model also does not deal with the requirement to assign certain seminars to computer classrooms. This requirement is included in the four-stage model.
2.2
Four-stage model
The four-stage model is an extension of the three-stage model. In the first stage, a complete timetable of lectures is prepared. The seminars are added to this timetable in the next three stages, which are the same as in the three-stage model. The only difference is that hard constraints assuring the assignment of computer classrooms were added to the third and fourth stages. This model was presented in [18]. With these multi-stage models, we have to assure the continuity of all the stages. It is necessary to secure in each stage that there will be a feasible solution in the next stage. This means, e.g., that the model has to respect the number of classrooms available in each time window while assigning courses to time windows. Due to the necessity of the continuity of all stages, it was complicated to include some special requirements in the model. Hence the complex model was formulated.
2.3
Complex model
The complex model presented in [17] assures all the necessary features of the timetable and includes all the special requirements of the Department of Econometrics. These requirements are: considering teachers' time preferences; assignment of certain courses to computer classrooms every week or every other week; teaching certain courses in a block; assignment of certain courses to particular classrooms; and teaching in two distant campuses.
The complex model has six goals: maximisation of total preference of courses; maximisation of total time preference; achieving adequate course loads of teachers; utilisation of classrooms’ capacities; achieving a given number of days per week each teacher has at least one course; and achieving a given number of courses per day each teacher has.
The principle of two-stage penalization function is utilized in the third and fourth goal.
In addition to the hard constraints of the multi-stage models, the constraints in the complex model assure that each teacher has courses only in one campus each day (no transfers during one day); that the courses that need computers are assigned to computer classrooms every week or every other week; that the courses that are taught in a block are assigned to immediately following time windows, to the same classroom, and to the same teacher; and that certain courses are assigned to particular classrooms.
This complex model creates the timetable in just one stage. Although it is an NP-hard problem, the optimisation via software does not take a long time. The problem with ca 7.5 million variables and 6 thousand constraints was solved using the Gurobi 6.0 solver (www.gurobi.com) in less than 10 minutes (on a notebook with a quad-core Intel® Core™ i7-4702MQ 2.2 GHz processor, 16 GB DDR3 operating memory and the 64-bit operating system Windows 8.1). The complex model has been implemented into an application written in Visual Basic for Applications (VBA). The application can be used by anybody, even without any knowledge of either mathematical modelling or optimisation systems. The user just fills in all the input data in an MS Excel worksheet and clicks a button. The results of the optimisation are then exported back to the MS Excel worksheet and the final timetable is created via VBA procedures.
3
Comparison of the Models
The input data for all three models are collected and adjusted in an MS Excel worksheet using VBA. The multi-stage models were solved via the optimisation system LINGO 14.0 (www.lindo.com); solving each stage took just a few seconds. The data were transformed between the stages using procedures written in VBA. For solving the complex model, the Gurobi solver was chosen, and the model was written in MPL (Mathematical Programming Language), which enables the use of many different solvers, such as CPLEX, XPRESS, CONOPT and many others. All of the models were verified on the same data set for better comparison – the data from the summer term 2011/2012. In that term, the models had to deal with 32 teachers, 80 courses, 35 time windows and 84 classrooms. Table 1 gives the numbers of variables and constraints for each model.
                                 3-stage model   4-stage model   Complex model
1st stage   all variables                2,690          23,092       7,437,642
            integer variables            2,560          22,860       7,436,956
            constraints                  2,962           4,395           5,893
2nd stage   all variables               95,100           2,624             x
            integer variables           92,400           1,984             x
            constraints                  7,751           2,144             x
3rd stage   all variables                6,969          73,664             x
            integer variables            6,640          71,610             x
            constraints                    330           6,527             x
4th stage   all variables                    x         185,422             x
            integer variables                x           5,146             x
            constraints                      x           7,711             x
Table 1 Number of variables and constraints

The difference in the number of variables between the multi-stage models and the complex model is obvious. What is remarkable is that the number of constraints is much higher for the multi-stage models than for the complex model, even though the multi-stage models do not include all the necessary requirements. The reason is that the multi-stage models need more constraints to ensure the integrity of the model; moreover, some of the constraints are used repeatedly in each stage. The multi-stage models create applicable timetables. This means that the timetable avoids time conflicts, each course is assigned to a teacher, a time window and a classroom, and there is at most one course in each classroom. The problem of the multi-stage models was the overly complicated formulation of the special requirements, such as assigning certain courses to computer classrooms every week or every other week, teachers' time preferences, teaching courses in a block, or teaching in two distant campuses. The main reason for choosing the multi-stage model at first was an assumption about the computational complexity of the complex model. However, solving the complex model with ca 7.5 million variables took less than 10 minutes. Therefore the complex model is more useful for creating the department timetable. The user does not have to transform the data between the stages: although the transformation is done automatically by VBA procedures, there is still a possibility of making a mistake. The question is what would happen if we wanted to use the model for creating a timetable for the whole university. A rough estimate of the number of integer variables in such a model is 18.5 billion. It can result in a lack
of operating memory while preparing the model for solving, or it can take several days to compute any result. In this case it is worth considering an adjustment of the multi-stage models. That is an issue for future research.
4
Conclusion
In this paper, three integer goal programming models for timetabling were briefly described and compared. The multi-stage models do not ensure all the specific requirements of the timetable at the department. The complex model assures all the necessary requirements. On the other hand, if we would like to use the complex model for the creation of a timetable for the whole university, it can lead to a practically unsolvable model. In that case it would be better to adjust one of the multi-stage models.
Acknowledgements The research is supported by the Internal Grant Agency of the University of Economics, Prague, grant No. F4/62/2015.
References [1] Abdul-Rahman, S., Burke, E. K., Bargiela, A., McCollum, B., and Özcan, E.: A constructive approach to examination timetabling based on adaptive decomposition and ordering. Annals of Operations Research 218 (2014), 3–21. [2] Al-Husain, R., Hasan, M. K., and Al-Qaheri, H.: A Sequential Three-Stage Integer Goal Programming (IGP) Model for Faculty-Course-Time-Classroom Assignments. Informatica 35 (2011), 157–164. [3] Asratian, A. S., and Werra, D.: A generalized class teacher model for some timetabling problems. European Journal of Operational Research 143 (2002), 531–542. [4] Badri, M. A.: A two-stage multiobjective scheduling model for [faculty-course-time] assignments. European Journal of Operational Research 94 (1996), 16–28. [5] Burke, E. K., Pham, N., Qu, R., and Yellen, J.: Linear combinations of heuristics for examination timetabling. Annals of Operations Research 194 (2012), 89–109. [6] Caramia, M., Dell'Olmo, P., and Italiano, G. F.: Novel Local-Search-Based Approaches to University Examination Timetabling. INFORMS Journal on Computing 20 (2008), 86–99. [7] Cerny, M.: Computations, Volume III (In Czech). Professional Publishing, Prague, 2012. [8] Gunawan, A., Ng, K. M., and Poh, K. L.: A hybridized Lagrangian relaxation and simulated annealing method for the course timetabling problem. Computers and Operations Research 39 (2012), 3074–3088. [9] Ismayilova, N. A., Sagir, M., and Rafail, N.: A multiobjective faculty–course–time slot assignment problem with preferences. Mathematical and Computer Modelling 46 (2005), 1017–1029. [10] Jat, S. N., and Yang, S.: A hybrid genetic algorithm and tabu search approach for post enrolment course timetabling. Journal of Scheduling 14 (2011), 617–637. [11] Jones, D., and Tamiz, M.: Practical Goal Programming. International Series in Operations Research & Management Science 141. Springer, New York, 2010. [12] Ozdemir, M. S., and Gasimov, R. N.: The analytic hierarchy process and multiobjective 0–1 faculty course assignment. European Journal of Operational Research 157 (2004), 398–408. [13] Qaurooni, D.: A memetic algorithm for course timetabling. In: GECCO '11: Proceedings of the 13th annual conference on Genetic and evolutionary computation. Association for Computing Machinery, Dublin, 2011, 435–442. [14] Santos, H. G., Ochi, L. S., and Souza, M. J. F.: A Tabu search heuristic with efficient diversification strategies for the class/teacher timetabling problem. Journal of Experimental Algorithmics 10 (2005), 1–16. [15] Skocdopolova, V.: Application of Goal Programming Models to Education. In: Efficiency and Responsibility in Education 2012. Czech University of Life Sciences Prague, Prague, 2012, 535–544 [16] Skocdopolova, V.: Construction of time schedules using integer goal programming. In: Mathematical Methods in Economics 2012, Silesian University in Opava, Karvina, 2012, 793–798. [17] Skocdopolova, V.: Optimisation of University Timetable via Complex Goal Programming Model. In: Mathematical Methods in Economics 2015, University of West Bohemia, Plzeň, Cheb, 2015, 725–730. [18] Skocdopolova, V.: University timetable construction – computational approach. In: Mathematical Methods in Economics 2013, College of Polytechnics Jihlava, Jihlava, 2013, 808–813.
Transient and Average Markov Reward Chains with Applications to Finance Karel Sladký
1
Abstract. The article is devoted to Markov reward chains with finite state space. The usual optimization criteria examined in the literature on Markov reward chains, such as total discounted reward, total reward up to reaching some specific state (so-called transient models) or mean (average) reward optimality, may be quite insufficient to characterize the problem from the point of view of a decision maker. It seems that it may be preferable, if not necessary, to select more sophisticated criteria that also reflect variability-risk features of the problem. Perhaps the best known approaches stem from the classical work of Markowitz on mean-variance selection rules, i.e. we optimize the weighted sum of the average or total reward and its variance. In the article, explicit formulae for calculating the variances for transient and average models are reported along with sketches of algorithmic procedures for finding policies guaranteeing minimal variance in the class of policies with a given transient or average reward. Application of the obtained results to financial models is indicated. Keywords: dynamic programming, transient and average Markov reward chains, reward-variance optimality, optimality in financial models. JEL classification: C44, C61, C63 AMS classification: 90C40, 60J10, 93E20
1
Introduction
The usual optimization criteria examined in the literature on stochastic dynamic programming, such as total discounted or mean (average) reward structures, may be quite insufficient to characterize the problem from the point of view of a decision maker. To this end, it may be preferable, if not necessary, to select more sophisticated criteria that also reflect variability-risk features of the problem. Perhaps the best known approaches stem from the classical work of Markowitz (cf. [2]) on mean-variance selection rules, i.e. we optimize the weighted sum of the average or total reward and its variance. In the present paper we restrict attention to transient and average models with finite state space, and in the class of optimal policies we find the policy with minimal variance.
2
Notation and Preliminaries
In this note, we consider, at discrete time points, a Markov decision process X = {X_n, n = 0, 1, …} with finite state space I = {1, 2, …, N} and compact set A_i = [0, K_i] of possible decisions (actions) in state i ∈ I. Supposing that in state i ∈ I action a ∈ A_i is chosen, then state j is reached in the next transition with a given probability p_ij(a) and one-stage transition reward r_ij will be accrued to such a transition. (We assume that p_ij(a) is a continuous function of a ∈ A_i.) A (Markovian) policy controlling the decision process, π = (f^0, f^1, …), is identified by a sequence of decision vectors {f^n, n = 0, 1, …}, where f^n ∈ F ≡ A_1 × … × A_N for every n = 0, 1, 2, …, and f_i^n ∈ A_i is the decision (or action) taken at the n-th transition if the chain X is in state i. Let π^m = (f^m, f^{m+1}, …); hence π = (f^0, f^1, …, f^{m−1}, π^m), in particular π = (f^0, π^1). The symbol E_i^π denotes the expectation if X_0 = i and policy π = (f^n) is followed; in particular, E_i^π(X_m = j) = Σ_{i_1,…,i_{m−1} ∈ I} p_{i,i_1}(f_i^0) ⋯ p_{i_{m−1},j}(f_{i_{m−1}}^{m−1}); P(X_m = j) is the probability that X is in state j at time m.
1 Institute of Information Theory and Automation of the Czech Academy of Sciences, Pod Vodárenskou věžı́ 4, 182 08 Praha 8, Czech Republic, sladky@utia.cas.cz
A policy π which selects at all times the same decision rule, i.e. π ∼ (f), is called stationary; hence, following policy π ∼ (f), X is a homogeneous Markov chain with transition probability matrix P(f) whose ij-th element is p_ij(f_i). Then r_i^{(1)}(f_i) := Σ_{j∈I} p_ij(f_i) r_ij is the expected one-stage reward obtained in state i. Similarly, r^{(1)}(f) is an N-column vector of one-stage rewards whose i-th element equals r_i^{(1)}(f_i). The symbol I denotes an identity matrix and e is reserved for a unit column vector. For a standard (stochastic) probability matrix P(f), the spectral radius of P(f) is equal to one. Recall that the Cesaro limit P*(f) := lim_{n→∞} (1/n) Σ_{k=0}^{n−1} P^k(f) (with elements p*_ij(f)) exists, and if P(f) is aperiodic then even P*(f) = lim_{k→∞} P^k(f) and the convergence is geometrical. Then g^{(1)}(f) = P*(f) r^{(1)}(f) is the (column) vector of average rewards; its i-th entry g_i^{(1)}(f) denotes the average reward if the process starts in state i. Moreover, if P(f) is unichain, i.e. P(f) contains a single class of recurrent states, then p*_ij(f) = p*_j(f), i.e. the limiting distribution is independent of the starting state and g^{(1)}(f) is a constant vector with elements ḡ^{(1)}(f). It is well known (cf. e.g. [3, 7]) that also Z(f) (the fundamental matrix of P(f)) and H(f) (the deviation matrix) exist, where Z(f) := [I − P(f) + P*(f)]^{−1} and H(f) := Z(f)(I − P*(f)). A transition probability matrix P̃(f) is called transient if the spectral radius of P̃(f) is less than unity, i.e. if at least some row sums of P̃(f) are less than one. Then lim_{n→∞} [P̃(f)]^n = 0, P̃*(f) = 0, g^{(1)}(f) = P̃*(f) r^{(1)}(f) = 0 and Z̃(f) = H̃(f) = [I − P̃(f)]^{−1}. Observe that if P(f) is stochastic and α ∈ (0, 1), then P̃(f) := αP(f) is transient; however, if P̃(f) is transient it may happen that some row sums are even greater than unity. Moreover, for the so-called first passage problem, i.e. if we consider the total reward up to the first reaching of a specific state (resp. a set of specific states), the resulting transition matrix is transient if the specific state (resp. the set of specific states) can be reached from any other state.
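As a quick numerical illustration of the objects just introduced, the following sketch computes P*(f), Z(f) and H(f) for a made-up unichain, aperiodic transition matrix; it is only a sanity check of the definitions, not part of the method itself.

```python
import numpy as np

# Made-up unichain, aperiodic transition matrix P(f) for a fixed stationary policy.
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.4, 0.5]])
n = P.shape[0]

# Stationary distribution: solve pi P = pi, sum(pi) = 1 (least squares on the stacked system).
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.concatenate([np.zeros(n), [1.0]])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

P_star = np.tile(pi, (n, 1))                 # each row equals pi: the limit matrix P*(f)
Z = np.linalg.inv(np.eye(n) - P + P_star)    # fundamental matrix Z(f)
H = Z @ (np.eye(n) - P_star)                 # deviation matrix H(f)

print(np.allclose(np.linalg.matrix_power(P, 200), P_star))  # True: P^k converges to P*
```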
3
Reward Variance for Finite and Infinite Time Horizon
Let ξ_n(π) = Σ_{k=0}^{n−1} r_{X_k,X_{k+1}} be the stream of rewards received in the n next transitions of the considered Markov chain X if policy π = (f^n) is followed. Supposing that X_0 = i, on taking expectations we get for the first and second moments of ξ_n(π)
\[ v_i^{(1)}(\pi, n) := \mathsf{E}_i^{\pi}\,\xi_n(\pi) = \mathsf{E}_i^{\pi} \sum_{k=0}^{n-1} r_{X_k, X_{k+1}}, \qquad v_i^{(2)}(\pi, n) := \mathsf{E}_i^{\pi}\,(\xi_n(\pi))^2 = \mathsf{E}_i^{\pi}\Big(\sum_{k=0}^{n-1} r_{X_k, X_{k+1}}\Big)^{2}. \tag{1} \]
It is well known from the literature (cf. e.g. [1],[3],[7],[8]) that, for the time horizon tending to infinity, policies maximizing or minimizing the values v_i^{(1)}(π, n) for transient models, as well as policies maximizing or minimizing the value g_i^{(1)}(π) = lim_{n→∞} n^{−1} v_i^{(1)}(π, n), can be found in the class of stationary policies, i.e. there exist f*, f̂, f̄*, f̄ ∈ F such that for all i ∈ I and any policy π = (f^n)
\[ v_i^{(1)}(f^*) := \lim_{n\to\infty} v_i^{(1)}(f^*, n) \ge \limsup_{n\to\infty} v_i^{(1)}(\pi, n), \qquad v_i^{(1)}(\hat f) := \lim_{n\to\infty} v_i^{(1)}(\hat f, n) \le \liminf_{n\to\infty} v_i^{(1)}(\pi, n), \tag{2} \]
\[ g(\bar f^{*}) := \lim_{n\to\infty} \tfrac{1}{n}\, v_i^{(1)}(\bar f^{*}, n) \ge \limsup_{n\to\infty} \tfrac{1}{n}\, v_i^{(1)}(\pi, n), \qquad g(\bar f) := \lim_{n\to\infty} \tfrac{1}{n}\, v_i^{(1)}(\bar f, n) \le \liminf_{n\to\infty} \tfrac{1}{n}\, v_i^{(1)}(\pi, n). \tag{3} \]

3.1
Finite Time Horizon
If policy π ∼ (f) is stationary, the process X is time homogeneous and for m < n we write for the generated random reward ξ_n = ξ_m + ξ_{n−m} (here we delete the symbol π and tacitly assume that P(X_m = j) and ξ_{n−m} starts in state j). Hence [ξ_n]² = [ξ_m]² + [ξ_{n−m}]² + 2·ξ_m·ξ_{n−m}. Then for n > m we can conclude that
\[ \mathsf{E}_i^{\pi}[\xi_n] = \mathsf{E}_i^{\pi}[\xi_m] + \mathsf{E}_i^{\pi}\Big\{\sum_{j\in I} \mathsf{P}(X_m = j)\cdot \mathsf{E}_j^{\pi}[\xi_{n-m}]\Big\}, \tag{4} \]
\[ \mathsf{E}_i^{\pi}[\xi_n]^2 = \mathsf{E}_i^{\pi}[\xi_m]^2 + \mathsf{E}_i^{\pi}\Big\{\sum_{j\in I} \mathsf{P}(X_m = j)\cdot \mathsf{E}_j^{\pi}[\xi_{n-m}]^2\Big\} + 2\,\mathsf{E}_i^{\pi}[\xi_m] \sum_{j\in I} \mathsf{P}(X_m = j)\cdot \mathsf{E}_j^{\pi}[\xi_{n-m}]. \tag{5} \]
In particular, from (2), (4) and (5) we conclude for m = 1
\[ v_i^{(1)}(f, n+1) = r_i^{(1)}(f_i) + \sum_{j\in I} p_{ij}(f_i)\, v_j^{(1)}(f, n), \tag{6} \]
\[ v_i^{(2)}(f, n+1) = r_i^{(2)}(f_i) + 2\sum_{j\in I} p_{ij}(f_i)\, r_{ij}\, v_j^{(1)}(f, n) + \sum_{j\in I} p_{ij}(f_i)\, v_j^{(2)}(f, n), \tag{7} \]
where r_i^{(1)}(f_i) := Σ_{j∈I} p_ij(f_i) r_ij and r_i^{(2)}(f_i) := Σ_{j∈I} p_ij(f_i) [r_ij]². Since the variance σ_i(f, n) = v_i^{(2)}(f, n) − [v_i^{(1)}(f, n)]², from (6),(7) we get
\[ \sigma_i(f, n+1) = r_i^{(2)}(f_i) + \sum_{j\in I} p_{ij}(f_i)\,\sigma_j(f, n) + 2\sum_{j\in I} p_{ij}(f_i)\, r_{ij}\, v_j^{(1)}(f, n) - [v_i^{(1)}(f, n+1)]^2 + \sum_{j\in I} p_{ij}(f_i)\,[v_j^{(1)}(f, n)]^2 \tag{8} \]
\[ \phantom{\sigma_i(f, n+1)} = \sum_{j\in I} p_{ij}(f_i)\,[r_{ij} + v_j^{(1)}(f, n)]^2 - [v_i^{(1)}(f, n+1)]^2 + \sum_{j\in I} p_{ij}(f_i)\,\sigma_j(f, n). \tag{9} \]
Using matrix notation (cf. [5]), equations (6),(7),(8) can be written as
\[ v^{(1)}(f, n+1) = r^{(1)}(f) + P(f)\, v^{(1)}(f, n), \tag{10} \]
\[ v^{(2)}(f, n+1) = r^{(2)}(f) + 2\, P(f)\circ R \cdot v^{(1)}(f, n) + P(f)\, v^{(2)}(f, n), \tag{11} \]
\[ \sigma(f, n+1) = r^{(2)}(f) + P(f)\,\sigma(f, n) + 2\, P(f)\circ R \cdot v^{(1)}(f, n) - [v^{(1)}(f, n+1)]^2 + P(f)\,[v^{(1)}(f, n)]^2, \tag{12} \]
where R = [rij ]i,j is an N × N -matrix, and r(2) (f ) = [ ri (fi )], v (2) (f, n) = [vi (f, n)], (1) v (1) (f, n) = [(vi (f, n)], σ(f, n) = [σi (f, n)] are column vectors. The symbol ◦ is used for Hadamard (entrywise) product of matrices. Observe that r(1) (f ) = (P (f ) ◦ R) · e, r(2) (f ) = [P (f ) ◦ (R ◦ R)] · e. 3.2
Infinite-Time Horizon: Transient Case
In this subsection we focus attention on transient models, i.e. we assume that the transition probability matrix P̃ (f ) with elements pij (fi ) is substochastic and ρ(f ), the spectral radius of P̃ (f ), is less than unity. Then on iterating (10) we easily conclude that there exists v (1) (f ) := lim v (1) (f, n) such that n→∞
v (1) (f ) = r(1) (f ) + P̃ (f ) · v (1) (f ) ⇐⇒ v (1) (f ) = [I − P̃ (f )]−1 r(1) (f ).
(13)
Similarly, from (11) (since the term 2 · P (f ) ◦ R · v (1) (f, n) must be bounded) on letting n → ∞ we can also verify existence v (2) (f ) = lim v (2) (f, n) such that n→∞
v (2) (f ) = r(2) (f ) + 2 · P̃ (f ) ◦ R · v (1) (f ) + P̃ (f ) v (2) (f ) hence
v (2) (f ) = [I − P̃ (f )]−1
{
(14)
} r(2) (f ) + 2 · P̃ (f ) ◦ R · v (1) (f ) .
(15)
On letting n → ∞ from (8), (9) we get for σi (f ) := lim σi (f, n) n→∞
σi (f ) =
(2)
ri (fi ) +
∑ j∈I
(1) −[vi (f )]2
=
∑ j∈I
+
pij (fi ) · σj (f ) + 2
∑
∑ j∈I
(1)
pij (fi ) · rij · vj (f )
(1) pij (fi )[vj (f )]2
j∈I
(1)
(1)
pij (fi )[rij + vj (f )]2 − [vi (f )]2 +
775
(16) ∑ j∈I
pij (fi ) · σj (f ).
(17)
Mathematical Methods in Economics 2016
Hence in matrix notation σ(f ) = r(2) (f ) + P̃ (f ) · σ(f ) + 2 · P̃ (f ) ◦ R · v (1) (f ) − [v (1) (f )]2 + P̃ (f ) · [v (1) (f )]2 .
(18)
After some algebra (18) can be also written as σ(f ) =
[I − P̃ (f )]−1 · { r(2) (f ) + 2 · P̃ (f ) ◦ R · v (1) (f ) − [v (1) (f )]2 }.
(19)
In particular, for the discounted case, i.e. if for some discount factor α ∈ (0, 1) the transient matrix P̃(f) := αP(f), then (19) reads
\[ \sigma(f) = [I - \alpha P(f)]^{-1}\big\{ r^{(2)}(f) + 2\,\alpha P(f)\circ R \cdot v^{(1)}(f) \big\} - [v^{(1)}(f)]^2. \tag{20} \]
Formula (20) is similar to the formula for the variance of discounted rewards obtained by Sobel [6] using different methods.

3.3
Infinite-Time Horizon: Average Case
We make the following
Assumption 1. There exists a state i_0 ∈ I that is accessible from any state i ∈ I for every f ∈ F.
Obviously, if Assumption 1 holds, then for every f ∈ F the transition probability matrix P(f) is unichain (i.e. P(f) has no two disjoint closed sets). As is well known from the literature (see e.g. [3]), if Assumption 1 holds, then the growth rate of v^{(1)}(f, n) is linear and independent of the starting state. In particular, there exists a constant vector g^{(1)}(f) = P*(f) r(f) (with elements ḡ^{(1)}(f)) along with a vector w^{(1)}(f) (unique up to an additive constant) such that
\[ w^{(1)}(f) + g^{(1)}(f) = r(f) + P(f)\, w^{(1)}(f). \tag{21} \]
In particular, it is possible to select w^{(1)}(f) such that P*(f) w^{(1)}(f) = 0; then w^{(1)}(f) = H(f) r(f) = Z(f) r(f) − P*(f) r(f). On iterating (21) we can conclude that
\[ v^{(1)}(f, n) = g^{(1)}(f)\cdot n + w^{(1)}(f) + [P(f)]^{n}\, w^{(1)}(f). \tag{22} \]
To simplify the limiting behavior we also make
Assumption 2. The matrix P(f) is aperiodic, i.e. lim_{n→∞} [P(f)]^n = P*(f) exists for any P(f).
Then, for n tending to infinity, v^{(1)}(f, n) − n g^{(1)}(f) − w^{(1)}(f) tends to the null vector and the convergence is geometrical. In particular, by (22) we can conclude that, for ε(n) = [P(f)]^n w^{(1)}(f),
\[ v^{(1)}(f, n) = g^{(1)}(f)\cdot n + w^{(1)}(f) + \varepsilon(n). \tag{23} \]
The symbol ε(n) is reserved for any column vector of appropriate dimension whose elements converge geometrically to the null vector. Employing the above facts, we can conclude by (6),(21),(22) that
\[ v_i^{(1)}(f, n+1) + v_j^{(1)}(f, n) = r_i(f) + \sum_{k\in I} p_{ik}(f)\, v_k^{(1)}(f, n) + v_j^{(1)}(f, n) = r_i(f) + 2n\,\bar g^{(1)}(f) + \sum_{k\in I} p_{ik}(f)\, w_k^{(1)}(f) + w_j^{(1)}(f) + \varepsilon(n) \]
\[ \phantom{v_i^{(1)}(f, n+1) + v_j^{(1)}(f, n)} = (2n+1)\,\bar g^{(1)}(f) + w_i^{(1)}(f) + w_j^{(1)}(f) + \varepsilon(n), \tag{24} \]
\[ v_i^{(1)}(f, n+1) - v_j^{(1)}(f, n) = r_i(f) + \sum_{k\in I} p_{ik}(f)\, v_k^{(1)}(f, n) - v_j^{(1)}(f, n) = r_i(f) + \sum_{k\in I} p_{ik}(f)\, w_k^{(1)}(f) - w_j^{(1)}(f) + \varepsilon(n) \]
\[ \phantom{v_i^{(1)}(f, n+1) - v_j^{(1)}(f, n)} = \bar g^{(1)}(f) + w_i^{(1)}(f) - w_j^{(1)}(f) + \varepsilon(n). \tag{25} \]
From (23),(24),(25) we get
\[ \sum_{j\in I} p_{ij}(f)\,[v_i^{(1)}(f, n+1) + v_j^{(1)}(f, n)]\,[v_i^{(1)}(f, n+1) - v_j^{(1)}(f, n)] \]
\[ = \sum_{j\in I} p_{ij}(f)\,[2n\,\bar g^{(1)}(f) + \bar g^{(1)}(f) + w_i^{(1)}(f) + w_j^{(1)}(f)]\,[\bar g^{(1)}(f) + w_i^{(1)}(f) - w_j^{(1)}(f)] + \varepsilon(n) \]
\[ = 2n\,\bar g^{(1)}(f) \sum_{j\in I} p_{ij}(f)\,[\bar g^{(1)}(f) + w_i^{(1)}(f) - w_j^{(1)}(f)] + \sum_{j\in I} p_{ij}(f)\,\big\{[\bar g^{(1)}(f) + w_i^{(1)}(f)]^2 - [w_j^{(1)}(f)]^2\big\} + \varepsilon(n) \]
\[ = 2n\,\bar g^{(1)}(f)\, r_i(f) + \sum_{j\in I} p_{ij}(f)\,\big\{[\bar g^{(1)}(f) + w_i^{(1)}(f)]^2 - [w_j^{(1)}(f)]^2\big\} + \varepsilon(n). \tag{26} \]
Similarly, by (23), for the third term on the RHS of (8) (and also for the third term on the RHS of (12)) we have
\[ \sum_{j\in I} p_{ij}(f)\, r_{ij}\, v_j^{(1)}(f, n) = \sum_{j\in I} p_{ij}(f)\, r_{ij}\,[n\,\bar g^{(1)}(f) + w_j^{(1)}(f) + \varepsilon(n)] = n\,\bar g^{(1)}(f)\, r_i(f) + \sum_{j\in I} p_{ij}(f)\, r_{ij}\, w_j^{(1)}(f) + \varepsilon(n). \tag{27} \]
Substitution from (26), (27) into (8) yields after some algebra
\[ \sigma_i(f, n+1) = \sum_{j\in I} p_{ij}(f)\,\sigma_j(f, n) + r_i^{(2)}(f) + 2\sum_{j\in I} p_{ij}(f)\, r_{ij}\, w_j^{(1)}(f) + \sum_{j\in I} p_{ij}(f)\,[w_j^{(1)}(f)]^2 - [\bar g^{(1)}(f) + w_i^{(1)}(f)]^2 + \varepsilon(n) \]
\[ \phantom{\sigma_i(f, n+1)} = \sum_{j\in I} p_{ij}(f)\,\big\{\sigma_j(f, n) + [r_{ij} + w_j^{(1)}(f)]^2\big\} - [\bar g^{(1)}(f) + w_i^{(1)}(f)]^2 + \varepsilon(n). \tag{28} \]
Hence, in matrix form we have
\[ \sigma(f, n+1) = P(f)\,\sigma(f, n) + s(f) + \varepsilon(n), \tag{29} \]
where the elements s_i(f) of the (column) vector s(f) are equal to
\[ s_i(f) = \sum_{j\in I} p_{ij}(f)\,[r_{ij} + w_j^{(1)}(f)]^2 - [\bar g^{(1)}(f) + w_i^{(1)}(f)]^2 \tag{30} \]
\[ \phantom{s_i(f)} = \sum_{j\in I} p_{ij}(f)\,[r_{ij} + w_j^{(1)}(f) - \bar g^{(1)}(f)]^2 - [w_i^{(1)}(f)]^2. \tag{31} \]
Observe that (31) follows immediately from (30), since by (21) −2 Σ_{j∈I} p_ij(f)(r_ij + w_j^{(1)}(f)) ḡ^{(1)}(f) + [ḡ^{(1)}(f)]² = −2 w_i^{(1)}(f) ḡ^{(1)}(f) − [ḡ^{(1)}(f)]².
Employing (22) and the analogy between (9) and (29), we can conclude that
\[ G(f) = \lim_{n\to\infty} \frac{1}{n}\,\sigma(f, n) = P^*(f)\, s(f) \tag{32} \]
is the average variance corresponding to policy π ∼ (f).

4
Finding Optimal Policies
For finding second-order optimal policies, it is first necessary to construct the set of optimal transient or optimal average policies (cf. e.g. [1, 3, 7]). Optimal policies can be found in the class of stationary policies, i.e. there exist f*, f̄* ∈ F such that
\[ v^{(1)}(f^*) \ge v^{(1)}(\pi), \quad\text{resp.}\quad g^{(1)}(\bar f^{*}) \ge g^{(1)}(\pi) \quad\text{for every policy } \pi = (f^n). \tag{33} \]
Let F* ⊂ F be the set of all transient optimal stationary policies and F̄* ⊂ F the set of all average optimal stationary policies. Stationary optimal policies minimizing the total or average variance can be constructed by applying standard policy or value iteration procedures in the class of policies from F* or F̄*.
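One possible way to carry out such a second-order comparison numerically is sketched below: for a single invented stationary policy it evaluates the average reward via (21), the vector s(f) via (31) and the average variance G(f) = P*(f)s(f) via (32). Scanning this evaluation over the candidate policies in F* or F̄* and keeping the one with the smallest G(f) would realize the selection described above; the transition and reward matrices are illustrative only.

```python
import numpy as np

# Invented stationary policy: unichain transition matrix P(f) and reward matrix R.
P = np.array([[0.7, 0.3], [0.4, 0.6]])
R = np.array([[1.0, 4.0], [2.0, 0.0]])
n = P.shape[0]

r = (P * R).sum(axis=1)                       # expected one-stage rewards r_i(f)

# limiting matrix P*, average reward g and bias w with P* w = 0, cf. (21)
A = np.vstack([P.T - np.eye(n), np.ones(n)])
pi, *_ = np.linalg.lstsq(A, np.concatenate([np.zeros(n), [1.0]]), rcond=None)
P_star = np.tile(pi, (n, 1))
g = float(pi @ r)
Z = np.linalg.inv(np.eye(n) - P + P_star)
w = (Z - P_star) @ r

# s_i(f) from (31) and the average variance G(f) = P* s(f) from (32)
s = np.array([sum(P[i, j] * (R[i, j] + w[j] - g) ** 2 for j in range(n)) - w[i] ** 2
              for i in range(n)])
G = P_star @ s

print("average reward g =", round(g, 4), " average variance G =", np.round(G, 4))
```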
5
Specific Example: Credit Management
The state of the bank is determined by the bank liabilities, i.e. deposits and the capital, and is also influenced by the current state of the economy (cf. [7, 9]). It is a task for an expert to evaluate each possible state of the bank by some value, say i ∈ I; we assume that the set I is finite. A subset of the state space I, say I*, is called optimal; the decision maker tries to reach this set. To this end, at each time point the decision maker receives some money amount depending on the current state of the bank, say c_i, to improve the state of the bank. The decision maker has the following options: 1. advertise the activity of the bank, 2. assign a small reward as a courtesy to the non-problematic credit holders, 3. warn and penalize the problematic credit holders. Based on the experience of the bank, suitable advertising improves the state of the bank by reaching from state i some more suitable state j ∈ I with probability p_ij(1). Similarly, a courtesy reward in the total amount c_i can help to reach a more suitable state j ∈ I with probability p_ij(2). Finally, warning and penalizing the problematic credit holders changes the state by reaching state j with probability p_ij(3). Using the above-mentioned approach, the problem of an optimal credit-granting policy is formulated as a problem of finding an optimal policy of a controlled Markov chain. Observe that the transient model can also grasp models with a discount factor depending on the current state. Moreover, if the discount factor is very close to unity, we try to optimize the long-run average reward.
6
Conclusions
The article is devoted to second order optimality in transient and average Markov reward chains. To this end, formulas for the variance of total, discounted and average rewards are derived. In the class of optimal policies, procedures for finding policies with minimal total or average variance are suggested. Application of the obtained results for finding an optimal credit-granting policy of a bank is discussed.
Acknowledgements This research was supported by the Czech Science Foundation under Grant 15-10331S and by CONACyT (Mexico) and AS CR (Czech Republic) under Project 171396.
References [1] Mandl, P.: On the variance in controlled Markov chains. Kybernetika 7 (1971), 1–12. [2] Markowitz, H.: Portfolio Selection - Efficient Diversification of Investments. Wiley, New York, 1959. [3] Puterman, M.L.: Markov Decision Processes – Discrete Stochastic Dynamic Programming. Wiley, New York, 1994. [4] Sladký, K.: On mean reward variance in semi-Markov processes. Mathematical Methods of Operations Research 62 (2005), 387–397. [5] Sladký, K.: Second order optimality in transient and discounted Markov Decision Chains. In: Proceedings of the 33th International Conference Mathematical Methods in Economics 2015 (D. Martinčák et al., eds.), University of West Bohemia, Plzeň, Cheb 2015, pp. 731–736. [6] Sobel, M.: The variance of discounted Markov decision processes. J. Applied Probability 19 (1982), 794–802. [7] Bäuerle, N. and Rieder, U.: Markov Decision Processes with Application to Finance. Springer, Berlin, 2011. [8] Veinott, A.F.Jr.: Discrete dynamic programming with sensitive discount optimality criteria. Annals Math. Statistics 40 (1969), 1635–1660. [9] Waldmann, K.-H.: On granting credit in a random environment. Mathematical Methods of Operations Research 47 (2005), 99–115.
Interval data and sample variance: How to prove polynomiality of computation of upper bound considering random intervals? Ondřej Sokol1 , Miroslav Rada2 Abstract. We deal with the computation of sample variance over interval data. Computing the upper bound of the sample variance is an NP-hard problem; however, there exist some efficient algorithms for various special cases. Using Ferson's algorithm, the computation of the maximal possible variance over an interval-valued dataset can be realized in polynomial time in the maximal number of narrowed intervals intersecting at one point; narrowed means that the intervals are shrunk proportionally to the size of the dataset. Considering random datasets, experiments conducted by Sokol allowed for conjecturing that the maximal number of narrowed intervals intersecting at one point is at most of logarithmic size for a reasonable choice of the data-generating process. Here, we assume uniform distribution of centers and constant radii. Under this setting, we show that the computation of the expected value of the maximal number of narrowed intervals intersecting at one point (which is a random variable here) is reducible to the evaluation of the volume of some special simplicial cuts. Keywords: Interval data, sample variance, simplicial cuts. JEL classification: C44 AMS classification: 90C15
1
Introduction
Interval data. Assume an interval-valued dataset. We do not know the exact values of the data. Instead of that, we know the intervals which these values belong to. This kind of uncertainty occurs naturally when we process rounded, censored or categorized data. Similar data-related issues arise also when we work with measurements which have known tolerances or when we work directly with intervals, e.g. interval predictions or intervals of minimal and maximal daily values in the stock market. In our paper, we deal with the computation of statistics over interval data. When only interval data are at our disposal, we are interested in determining lower and upper bounds of selected statistics. From these lower and upper bounds, we can draw conclusions over the entire dataset.

Related work. Under interval uncertainty, even some of the basic statistics are not easy to compute. This paper is a contribution to the case of sample variance when the true mean is unknown. This statistic will be investigated in the next sections. However, other statistics have been studied too. For example, the t-ratio was studied by Černý and Hladík [2], entropy by Kreinovich [9] and Xiang et al. [13], higher moments by Kreinovich et al. [10] and others. Summaries of approaches to computing various statistics under interval uncertainty can be found in [7] and [11].

Goal. In this paper, we deal with the expected computational complexity of sample variance over random data. In our previous paper [3], a conjecture based on a simulation experiment about computational complexity has been stated. According to this conjecture, the upper bound of computational complexity in average over reasonable random data is polynomial (for details, see Conjecture 3). The conjecture has not been proven yet. The goal of this paper is to outline an approach to do so by interconnecting nice computational geometry tools with the statistical background of the conjecture.
2
Problem statement
A general framework. Consider an unobservable one-dimensional dataset x_1, …, x_n (for example, it can be a random sample from a certain distribution). We are given an (observable) collection of intervals x_1 = [x̲_1, x̄_1], …, x_n = [x̲_n, x̄_n] such that it is guaranteed that
\[ x_i \in \mathbf{x}_i, \quad i = 1, \ldots, n. \tag{1} \]
University of Economics in Prague, Department of Econometrics, Winston Churchill Square 4, CZ13067 Prague, Czech Republic, ondrej.sokol@vse.cz 2 University of Economics in Prague, Department of Econometrics, Winston Churchill Square 4, CZ13067 Prague, Czech Republic, miroslav.rada@vse.cz
We emphasize that, given the observed values x̲ = (x̲_1, …, x̲_n)^T and x̄ = (x̄_1, …, x̄_n)^T, the only information about the distribution of x = (x_1, …, x_n)^T is that the axiom (1) holds a.s. Let a statistic S(x_1, …, x_n) be given. Formally, S : R^n → R is a continuous function. Due to our weak assumptions, the main information (and in some sense the only information) we can infer about S(x_1, …, x_n) from the observable dataset x̲, x̄ is the lower and upper bound, respectively, of the form
\[ \underline{S} = \min\{S(\xi) : \underline{x} \le \xi \le \overline{x}\}, \qquad \overline{S} = \max\{S(\xi) : \underline{x} \le \xi \le \overline{x}\}, \]
where the inequality "≤" between two vectors in R^n is understood componentwise. Bounds \underline{S}, \overline{S} for many important statistics S have been extensively studied in the literature. Sometimes the situation is easy: for the example of the sample mean (i.e. S ≡ μ̂ = (1/n) Σ_{i=1}^n x_i) we immediately get
\[ \underline{\hat\mu} = \frac{1}{n}\sum_{i=1}^{n} \underline{x}_i, \qquad \overline{\hat\mu} = \frac{1}{n}\sum_{i=1}^{n} \overline{x}_i. \tag{2} \]
Sample variance under interval data. On the other hand, some statistics have been shown to be very hard to compute. In this paper, we will work with one of them: the sample variance. The sample variance is computed as
\[ \hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \hat\mu)^2, \tag{3} \]
where the true mean μ is unknown and is replaced by the sample mean μ̂ = (1/n) Σ_{i=1}^n x_i. The computation of the lower and upper bounds, respectively, reduces to the optimization problems
\[ \underline{\hat\sigma}^2 = \min_{x\in\mathbb{R}^n}\Bigg\{\frac{1}{n-1}\sum_{i=1}^{n}\Big(x_i - \frac{1}{n}\sum_{j=1}^{n} x_j\Big)^{2} : \underline{x} \le x \le \overline{x}\Bigg\}, \quad\text{and} \tag{4} \]
\[ \overline{\hat\sigma}^2 = \max_{x\in\mathbb{R}^n}\Bigg\{\frac{1}{n-1}\sum_{i=1}^{n}\Big(x_i - \frac{1}{n}\sum_{j=1}^{n} x_j\Big)^{2} : \underline{x} \le x \le \overline{x}\Bigg\}. \tag{5} \]
It is obvious that the computation of \underline{\hat\sigma}^2 is a convex quadratic problem and thus it can be solved in weakly polynomial time. Furthermore, Ferson et al. [6] proposed another interesting method, yielding a strongly polynomial algorithm. Yet another strongly polynomial algorithm has been formulated in [13]. The computation of \overline{\hat\sigma}^2 is an NP-hard problem. A proof of the NP-hardness can be found in [5]. Moreover, it is known to be inapproximable with an arbitrary absolute error, see Černý and Hladík [2]. The latter paper also gives a useful positive statement: the computation of \overline{\hat\sigma}^2 can be done in pseudopolynomial time. The computational properties of \overline{\hat\sigma}^2 have been studied extensively, see e.g. Ferson [6], Xiang [13], Dantsin [4]. The most interesting fact is that for many special cases the problem can be solved efficiently.
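The easy direction, the lower bound (4), can indeed be obtained with any off-the-shelf convex solver; a minimal sketch with invented data (not from the paper) follows.

```python
import numpy as np
from scipy.optimize import minimize

lo = np.array([0.9, 1.8, 2.7, 4.1])   # illustrative interval lower bounds
hi = np.array([1.3, 2.4, 3.1, 4.5])   # illustrative interval upper bounds

def sample_var(x):
    return x.var(ddof=1)              # 1/(n-1) convention, as in (3)

x0 = (lo + hi) / 2                    # start from the interval centers
res = minimize(sample_var, x0, bounds=list(zip(lo, hi)))
print("lower bound of sample variance:", round(res.fun, 5))
```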
Our problem. In this paper, we address the question whether the efficiently solvable cases are “rare” or “frequent” in practice. In particular, the statistical approach usually assumes that the observed data x, x are generated by an underlying random process. Then, the question is whether the hard instances—such as the instances resulting from NP-hardness proofs—occur with a high or low probability.
The paper by Ferson et al. [6] presents an algorithm for the computation of \overline{\hat\sigma}^2 with the property stated by Theorem 1. To fix notation, for an interval x = [x̲, x̄] we denote by x^C = ½(x̲ + x̄) its center and by x^Δ = ½(x̄ − x̲) its radius. For an α ∈ (0, 1), let αx denote the α-narrowed interval [x^C − αx^Δ, x^C + αx^Δ].
Theorem 1. There exists a polynomial p(n) for which the following holds true: when x_1, …, x_n are given such that every k-tuple of distinct indices i_1, …, i_k ∈ {1, …, n} satisfies
\[ [\underline{\hat\mu}, \overline{\hat\mu}] \cap \tfrac{1}{n}\mathbf{x}_{i_1} \cap \tfrac{1}{n}\mathbf{x}_{i_2} \cap \cdots \cap \tfrac{1}{n}\mathbf{x}_{i_k} = \emptyset, \tag{6} \]
then the algorithm by Ferson et al. makes at most 2^k p(n) steps. [Here, \underline{\hat\mu}, \overline{\hat\mu} are given by (2).]
Remark 2. The original paper presents a weaker statement; in particular, the intersection property from (6) is written as (1/n)x_{i_1} ∩ (1/n)x_{i_2} ∩ ⋯ ∩ (1/n)x_{i_k} = ∅.
So, we are interested in the maximum k for which the condition (6) holds. Let K_n denote such k. We assume that the centers x_1^C, …, x_n^C are sampled from a distribution Φ, that the radii x_1^Δ, …, x_n^Δ are sampled from a nonnegative distribution Ψ, and that the samples are independent. Then we are looking for the expected value of K_n.
Based on simulation experiments, Conjecture 3 has been stated (see also [3]): Conjecture 3. If Φ is a continuous distribution with finite first and second moments and its density function is bounded from above, and Ψ has finite first and second moments, then E(K_n) = O(√(log n)) and Var(K_n) = O(1). Remark 4. The paper [3] presented a much weaker formulation of Conjecture 3, namely it conjectured that E(K_n) = O(log n); also, no conjecture on the variance of K_n was stated. Conjecture 3 has not been proven yet. To rephrase the goal, we outline an approach that could be helpful in proving the first part of the hypothesis, that E(K_n) = O(√(log n)), for a suitable choice of the data generating processes Φ and Ψ.
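The quantity K_n is easy to estimate by simulation under the setting introduced below (uniform centers, unit radii), ignoring the [μ̂]-term as in Remark 2: for sorted centers, K_n is the largest number of centers falling into a window of length 2/n. The following sketch only illustrates the conjectured slow growth; it is not part of the proof strategy.

```python
import numpy as np

def K_n(centers, n):
    """Max number of 1/n-narrowed unit-radius intervals sharing a common point."""
    c = np.sort(centers)
    best, j = 1, 0
    for i in range(len(c)):            # sliding window of width 2/n over sorted centers
        while c[i] - c[j] > 2.0 / n:
            j += 1
        best = max(best, i - j + 1)
    return best

rng = np.random.default_rng(1)
for n in (100, 1000, 10000, 100000):
    ks = [K_n(rng.uniform(0.0, 1.0, n), n) for _ in range(50)]
    print(n, np.mean(ks), np.var(ks))
```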
3 Simplex as the space of gaps between centers
Convention. From now on, when we speak about the $i$-th interval or about the interval with index $i$, we mean the narrowed one, i.e. $\frac{1}{n}\mathbf{x}_i$. We set $\Phi$ to be the uniform distribution on the interval $(0, 1)$ and $\Psi \equiv 1$. The computational complexity is therefore driven by the distances of the centers from each other. Recall that the radius of a narrowed interval decreases with $n$. Firstly, since the distributions of the centers $x_i^C$ for all $i \in [n]$ are independent, we can w.l.o.g. assume that (7) holds:
$$x_1^C \le x_2^C \le \cdots \le x_n^C. \qquad (7)$$
Furthermore, we set $x_0^C = 0$ and $x_{n+1}^C = 1$.
We denote by $d_i$ the distance of the adjacent points $x_i^C$ and $x_{i-1}^C$:
$$d_i = x_i^C - x_{i-1}^C, \qquad i = 1, 2, \ldots, n+1. \qquad (8)$$
From the definitions it follows that $\sum_{i=1}^{n+1} d_i = 1$ and $d_i \ge 0$ for every $i$. Theorem 5 holds:
Theorem 5. Assume $d = (d_1, \ldots, d_{n+1})$ is given by (8).
1. The support of the random variable $d$ is the set $\{d \in \mathbb{R}^{n+1} : d \ge 0,\ \sum_{i=1}^{n+1} d_i = 1\}$.
2. The random variable $d$ has the Dirichlet distribution with parameters $\alpha = (1, \ldots, 1)$.
Hence, the random variable $d = (d_1, d_2, \ldots, d_{n+1})$ is uniformly distributed over the simplex $S_{n+1}$ with the $n+1$ vertices $a_1 = (1, 0, \ldots, 0), \ldots, a_{n+1} = (0, \ldots, 0, 1)$. The simplex $S_{n+1}$ has affine dimension $n$, since it lives in the hyperplane $\sum_{i=1}^{n+1} d_i = 1$. It can be mapped to $n$-dimensional space using the projection $d_{n+1} = 1 - \sum_{i=1}^{n} d_i$. The resulting simplex will be denoted by $\Delta_n$. It reads
$$\Delta_n = \{d \in \mathbb{R}^n : d \ge 0,\ \sum_{i=1}^{n} d_i \le 1\}. \qquad (9)$$
This projection means no loss of information, and the resulting simplex is full-dimensional.
Remark 6. Note, however, that we are not interested in $d_1$ and $d_{n+1}$, as they do not affect the number of intersecting narrowed intervals.
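As a quick numerical sanity check of Theorem 5, the following sketch (in Python, assuming NumPy is available) compares the gaps of sorted uniform centers with direct Dirichlet(1, ..., 1) samples; both empirical means and variances should agree.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 5, 100_000
# gaps d_1, ..., d_{n+1} of n sorted uniform centers, with x_0^C = 0 and x_{n+1}^C = 1
centers = np.sort(rng.uniform(size=(m, n)), axis=1)
gaps = np.diff(np.hstack([np.zeros((m, 1)), centers, np.ones((m, 1))]), axis=1)
# direct samples from Dirichlet(1, ..., 1), i.e. the uniform distribution on S_{n+1}
dirichlet = rng.dirichlet(np.ones(n + 1), size=m)
print(gaps.mean(axis=0), gaps.var(axis=0))
print(dirichlet.mean(axis=0), dirichlet.var(axis=0))
```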
4 Probability
Denote by $P(K_n = k)$ the probability that, given a fixed $n$, the maximum number of intersecting intervals is $k$. Similarly we define $P(K_n \ge k)$. We are interested in the expected value of the maximal number of intersecting intervals
$$E(K_n) = \sum_{k=1}^{n} k\, P(K_n = k) \qquad (10)$$
for different $n$; in particular, we want to examine its behaviour when $n \to \infty$. Observation 7 forms a basis for the rest of the text.
Observation 7. Assume an index set $I \subseteq [n]$ with $|I| = k$ for some $k \in \mathbb{N}$ is given. The following holds:
$$\bigcap_{i \in I} \tfrac{1}{n}\mathbf{x}_i \ne \emptyset \quad\Longleftrightarrow\quad \max_{i \in I} x_i^C - \min_{i \in I} x_i^C \le \frac{2}{n} \quad\Longleftrightarrow\quad \sum_{i = 1 + \min\{j \in I\}}^{\max\{j \in I\}} d_i \le \frac{2}{n}.$$
Roughly speaking, k intervals have a nonempty intersection if and only if the farthest centers (of these intervals) are close enough. Nevertheless, for us, it will be sufficient to consider sets of adjacent indices, i.e. I ⊆ [n] such that |I| = k and max{i ∈ I} − min{i ∈ I} = k − 1. An index set with this property will be called connected.
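Observation 7, restricted to connected index sets, also gives a direct way to replicate the simulation experiments behind Conjecture 3. The sketch below (Python; the data-generating setup follows the Convention of Section 3) estimates E(K_n) and Var(K_n) by sampling the sorted centers and sliding a window of adjacent centers whose span is at most 2/n.

```python
import numpy as np

def sample_K_n(n, rng):
    """One realization of K_n for uniform centers and unit radii (Psi = 1)."""
    c = np.sort(rng.uniform(size=n))
    best, j = 1, 0
    for i in range(n):
        # shrink the window so that the adjacent centers c[j..i] span at most 2/n
        while c[i] - c[j] > 2.0 / n:
            j += 1
        best = max(best, i - j + 1)    # by Observation 7 these narrowed intervals intersect
    return best

rng = np.random.default_rng(42)
for n in (100, 1_000, 10_000):
    ks = np.array([sample_K_n(n, rng) for _ in range(200)])
    print(n, ks.mean(), ks.var())
```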
For the computation of the probability that all intervals in a connected index set $I$ have a nonempty intersection (have a common point), we can use the geometric view introduced in the last section. We use the fact that the differences between adjacent centers are uniformly distributed on $\Delta_n$. We define $H_I^n := \{d \in \mathbb{R}^n : \sum_{j = 1 + \min\{i \in I\}}^{\max\{i \in I\}} d_j \le \frac{2}{n}\}$. Since $I$ is connected, the probability of a nonempty intersection depends only on $n$ and $k = |I|$, and neither on the minimal nor the maximal index in $I$. Therefore, the definition of $H_I^n$ can be simplified to $H_k^n := \{d \in \mathbb{R}^n : \sum_{j=2}^{k} d_j \le \frac{2}{n}\}$. The halfspace $H_k^n$ cuts off a part of $\Delta_n$. The bigger the cut-off is, the more probable is the intersection of the intervals in $I$. We denote by $C(n, k)$ the set $\Delta_n \cap H_k^n$. Analogously, we define $C(n, I)$. The following lemma holds:
Lemma 8. Assume we have $n$ intervals in total. The probability that $k$ adjacent intervals share a common point is denoted by $P(n, k)$ and can be computed as
$$P(n, k) = \frac{Vol(C(n, k))}{Vol(\Delta_n)} = n!\, Vol(C(n, k)). \qquad (11)$$
The last equality follows from the fact that the volume of $\Delta_n$ is $\frac{1}{n!}$.
The volume of C(n, k). The volume of $C(n, k)$ can be computed or bounded in several ways. Gerber [8] and Varsi [12] studied the exact volume of a simplex cut off by a halfspace. Gerber's work is especially interesting, since he proposed not only formulas for the volume of the cut-off simplex, but also a recursive formula for computing the fraction of the volume of the simplex to be cut off, i.e. the probability $P(n, k)$ directly. Since the recursive formula is not directly usable for our purpose, we formulated the following upper bound for the volume of $C(n, k)$.
Lemma 9. The following holds:
$$Vol(C(n, k)) \le \left(\frac{2}{n}\right)^{k-1} \frac{1}{(n-k+1)!\,(k-1)!}. \qquad (12)$$
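The bound of Lemma 9 (as reconstructed in (12)) can be checked numerically against a Monte Carlo estimate of P(n, k): by Theorem 5 and Lemma 8 it suffices to sample d from Dirichlet(1, ..., 1) and count how often d_2 + ... + d_k <= 2/n. The following Python sketch is only a plausibility check, not part of the method.

```python
import numpy as np
from math import factorial

def p_nk_monte_carlo(n, k, samples=50_000, seed=3):
    """Monte Carlo estimate of P(n, k) = n! Vol(C(n, k))."""
    rng = np.random.default_rng(seed)
    d = rng.dirichlet(np.ones(n + 1), size=samples)   # uniform on S_{n+1} (Theorem 5)
    return float(np.mean(d[:, 1:k].sum(axis=1) <= 2.0 / n))   # d_2 + ... + d_k <= 2/n

def p_nk_bound(n, k):
    """n! times the upper bound on Vol(C(n, k)) from (12)."""
    return factorial(n) * (2.0 / n) ** (k - 1) / (factorial(n - k + 1) * factorial(k - 1))

for n, k in [(50, 5), (100, 6), (100, 8)]:
    print(n, k, p_nk_monte_carlo(n, k), p_nk_bound(n, k))
```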
Finally, we mention the excellent survey of algorithms for the computation of the volume of a polytope by Büeler et al. [1]. However, for our purposes, an analytic formula for the volume would be the most appropriate. From this point of view, the only suitable formula is the upper bound stated by Lemma 9.
Two approaches to expressing P(Kn = k). Utilizing the formula for $P(n, k)$, there are two approaches to expressing $P(K_n = k)$. We discuss them in Sections 4.1 and 4.2.
4.1 Inclusion-exclusion principle
We use the standard formula for computing probability of exact value of a variable: P(Kn = k) = P(Kn ≥ k) − P(Kn ≥ k + 1).
Then, note that $K_n \ge k$ occurs if and only if there are $k$ adjacent intervals that share a common point. Hence, we can consider all adjacent $k$-tuples of intervals; for each such $k$-tuple we cut off $\Delta_n$ and compute the volume of the cut area:
$$P(K_n \ge k) = n!\, Vol\Big(\bigcup_{I \subseteq [n],\, |I| = k} C(n, I)\Big). \qquad (13)$$
Clearly, one can’t simply sum up the volumes of all the C(n, I), since the cut off areas are overlapping. We shall utilize the inclusion-exclusion principle. However, we face the problem that we need to compute volume of
much more complex polytopes, since the simplex is cut by more halfspaces (the number of halfspaces varies from 1 to n − k + 1). Currently, we do not know how to even compute a tight upper bound for volume of the union of C(n, I) in (13).
4.2 Complementary probability
Here, we take another approach. Instead of computing the volume of the union of cuts (formed by halfspaces of the form $H_I^n$), we compute the volume of the intersection of the opposite cuts and $\Delta_n$. That allows us to get rid of the inclusion-exclusion framework; however, the main problem with the “complex” polytope remains unchanged. To be more formal, we define $H_I^{-n}$ as the set $\{d \in \mathbb{R}^n : \sum_{i \in I} d_i \ge 2/n\}$ for a connected index set $I$ with $|I| = k$. Then,
$$P(K_n < k) = n!\, Vol\Big(\Delta_n \cap \bigcap_{I \subseteq [n],\, |I| = k} H_I^{-n}\Big). \qquad (14)$$
Then, P(Kn = k) = P(Kn < k + 1) − P(Kn < k).
Note that in (14) we need to compute the volume of the simplex cut off by $n - k + 1$ hyperplanes, i.e. we need an analytic expression for the volume of a polytope given as an intersection of $2n - k + 2$ halfspaces. As we also need to send $n$ to $\infty$, this task is intractable in full generality. We can only hope that the structure of the polytope is “special” and will allow for an efficient computation.
5 Conclusion
In this paper we presented an approach to computing the expected value of the maximal number of narrowed intervals that have a common point under a specific stochastic setup. We connected computational geometry tools with the statistical background of the problem; our approach is based on a geometric interpretation of the problem, as we compute the probabilities using the volumes of simplex cut-offs. Although we sketched some ideas for the computation, the goal is still quite far away. While the volume of a simplex cut by one halfspace can be either computed directly or approximated (e.g. by Lemma 9), the evaluation of the volume of a simplex cut by multiple halfspaces is a computationally much more expensive task. Nevertheless, using the fact that the distances between the centers of the intervals are uniformly distributed over the $n$-dimensional simplex, the inclusion-exclusion principle seems to be a viable option. Further research should focus on an exact formulation of the probabilities using the inclusion-exclusion principle, as sketched in Section 4.1.
Acknowledgements The work of Ondřej Sokol was supported by the grant No. F4/63/2016 of the Internal Grant Agency of the University of Economics in Prague. The work of Miroslav Rada was supported by project GA ČR P403/16-00408S and by Institutional Support of University of Economics, Prague, IP 100040.
References [1] Büeler, B., Enge, A., and Fukuda, K.: Exact Volume Computation for Polytopes: A Practical Study. In: Polytopes — Combinatorics and Computation (Kalai, G., and Ziegler, G. M., eds.). Birkhäuser Basel, Basel, 2000, 131–154. [2] Černý, M., and Hladı́k, M.: The Complexity of Computation and Approximation of the t-ratio over Onedimensional Interval Data. Computational Statistics & Data Analysis 80 (2014), 26–43. [3] Černý, M., and Sokol, O.: Interval Data and Sample Variance: A Study of an Efficiently Computable Case. In: Mathematical Methods in Economics 2015 (Martinčı́k, D., Ircingová, J., and Janeček, P., eds.). University of West Bohemia, Plzeň, 2015, 99–105. [4] Dantsin, E., Kreinovich, V., Wolpert, A., and Xiang, G.: Population Variance under Interval Uncertainty: A New Algorithm. Reliable Computing 12 (2006), 273–280. [5] Ferson, S., Ginzburg, L., Kreinovich, V., Longpré, L., and Aviles, M.: Computing Variance for Interval Data is NP-hard. ACM SIGACT News 33 (2002), 108–118. [6] Ferson, S., Ginzburg, L., Kreinovich, V., Longpré, L., and Aviles, M.: Exact Bounds on Finite Populations of Interval Data. Reliable Computing 11 (2005), 207–233.
[7] Ferson, S., Kreinovich, V., Hajagos, J., Oberkampf, W., and Ginzburg, L.: Experimental Uncertainty Estimation and Statistics for Data Having Interval Uncertainty. Sandia National Laboratories, 2007. [8] Gerber, L.: The Volume Cut Off a Simplex by a Half-space. Pacific Journal of Mathematics 94 (1981), 311–314. [9] Kreinovich, V.: Maximum Entropy and Interval Computations. Reliable Computing 2 (1996), 63–79. [10] Kreinovich, V., Longpré, L., Ferson, S., and Ginzburg, L.: Computing Higher Central Moments for Interval Data. Technical report, University of Texas at El Paso, 2004. [11] Nguyen, H. T., Kreinovich, V., Wu, B., and Xiang, G.: Computing Statistics under Interval and Fuzzy Uncertainty, volume 393 of Studies in Computational Intelligence. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012. [12] Varsi, G.: The Multidimensional Content of the Frustum of the Simplex. Pacific Journal of Mathematics 46 (1973), 303–314. [13] Xiang, G., Ceberio, M., and Kreinovich, V.: Computing Population Variance and Entropy under Interval Uncertainty: Linear-time Algorithms. Reliable computing 13 (2007), 467–488.
On the use of linguistic labels in AHP: calibration, consistency and related issues Jan Stoklasa1, Tomáš Talášek2 Abstract. Saaty’s AHP method is a frequently used method in decision support. It provides its users with a simple and effective way to input necessary information concerning his/her preferences through pair-wise comparisons, as well as various consistency measures of the expressed preferences. The linguistic level attached to Saaty’s fundamental scale provides a seemingly easy-to-use interface between the model (pair-wise comparison matrices) and its users. In this paper we focus on the linguistic level of Saaty’s fundamental scale and analyze the consequences of its use as it was proposed by Saaty. We show that even in the context of the consistency condition as defined by Saaty, the linguistic level seems to be improperly defined and calibration of the linguistic labels might be in place. Also proper transformations into different languages rather than simple translations of the terms suggested in English are necessary. We argue, that in the context of the existence of a transformation between the multiplicative and additive form of Saaty’s matrices it is reasonable to investigate whether the additive or the multiplicative form should be used to confirm (or appropriately set) the meanings of the linguistic terms. We illustrate on a practical example how the calibration of the meanings of linguistic terms in the additive form can affect the performance of the linguistic level in terms of consistency in the multiplicative form. Keywords: linguistic label, linguistic scale, AHP, pairwise comparison, calibration, consistency. JEL classification: C44 AMS classification: 91B74
1 Introduction
AHP is a widely used multiple criteria decision making tool proposed by T. L. Saaty [7, 8] and it has received much attention of practitioners and researchers since its introduction. Let us consider a single decision maker who needs to evaluate each alternative from a given set of n alternatives {A1 , . . . , An } and let us assume that only one evaluation criterion is considered. This criterion can be either quantitative (measurable or numerical) or qualitative. It is convenient to express the preference structure on the set of alternatives by comparing pairs of alternatives and assessing which one is more preferred to the other and also assessing the strength of this preference. The aim of the AHP is to obtain evaluations h1 , . . . , hn of these alternatives. Based on these evaluations a reciprocal square matrix H of the dimension n × n can be constructed, H = {hij }ni,j=1 , such that hij = hi /hj and obviously hij = 1/hji . The value hij then represents the relative preference of alternative i over alternative j and can be linguistically interpreted as Ai being hij times more important than Aj . The usual multiple-criteria decision-making problem is one of finding such evaluations h1 , . . . , hn , as they are not known in advance. To do so an estimation of the matrix H, a reciprocal square matrix of preference intensities S = {sij }ni,j=1 , sij = 1/sji , can be constructed by a decision maker. In case of more decision makers the matrices of preference intensities can be aggregated into a single overall 1 Palacký University Olomouc, Department of Applied Economics, Křı́žkovského 12, 77180 Olomouc, jan.stoklasa@upol.cz. 2 Palacký University Olomouc, Deptartment of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 77146 Olomouc and Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta, Finland, tomas.talasek@upol.cz.
matrix of intensities of preferences. This can be done for each element of this overall matrix by computing the geometric mean of all the values provided by the various decision makers for the respective pairwise comparison. Based on the matrix $S$, the evaluations $h_1, \ldots, h_n$ can be computed as the arguments of the minimum of the expression $\sum_{i=1}^{n}\sum_{j=1}^{n}\big(s_{ij} - \frac{h_i}{h_j}\big)^2$. The solution to this problem (the evaluations $h_1, \ldots, h_n$) can be found as the components of the eigenvector corresponding to the largest eigenvalue of $S$. Alternatively, the logarithmic least squares method can be applied and the solutions found in the form $h_i = \sqrt[n]{\prod_{j=1}^{n} s_{ij}}$, $i = 1, \ldots, n$. The consistency condition for matrices of preference intensities suggested by Saaty is expressed as
$$s_{ik} = s_{ij} \cdot s_{jk}, \quad \text{for all } i, j, k = 1, 2, \ldots, n. \qquad (1)$$
It is well known that using the values $\{1, 2, 3, 4, 5, 6, 7, 8, 9\}$ and their reciprocals in the matrix of preference intensities (see Saaty's fundamental scale in Table 1), the consistency condition might not be fulfilled for expertly defined matrices of preference intensities, particularly if these are of larger order. In cases when “pure” consistency cannot be reached, Saaty defines the inconsistency index $CI$ based on the spectral radius ($\lambda_{max}$) of $S$ by $CI = \frac{\lambda_{max} - n}{n - 1}$. For a perfectly consistent matrix $\lambda_{max} = n$ and hence $CI = 0$. For other matrices Saaty defines the inconsistency ratio $CR = CI/RI_n$, where $RI_n$ is a random inconsistency index computed as an average of the inconsistency indices of randomly generated reciprocal matrices of preference intensities of order $n$. As long as $CR < 0.1$, the matrix $S$ is considered to be consistent enough.
In this paper we discuss one of the most important parts of the use of AHP (and of methods based on matrices of intensities of preferences in general): the input of preference intensities. In Section 2 we recall Saaty's fundamental scale and the linguistic interpretations of its numerical values and point out discrepancies between the numerical and the linguistic level in the context of consistency of Saaty's matrix. Section 3 first provides an overview of the currently available calibration methods for Saaty's scale, in Subsection 3.2 we propose a novel fuzzy preference relation based calibration, and in Subsection 3.3 we discuss the new and original calibration methods in the context of a weakened consistency condition. Section 4 provides a discussion and summary of the presented topic.
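For readers who want to experiment with these quantities, the following short Python sketch computes the eigenvector-based evaluations, CI and CR for a given reciprocal matrix. The RI_n values are the commonly quoted ones and should be treated as assumed constants; the example matrix is made up.

```python
import numpy as np

# commonly quoted random inconsistency indices RI_n, n = 1, ..., 9 (assumed values)
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_evaluations(S):
    """Eigenvector evaluations h_1, ..., h_n together with the CI and CR of S."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    eigvals, eigvecs = np.linalg.eig(S)
    k = int(np.argmax(eigvals.real))
    lam_max = float(eigvals[k].real)          # largest eigenvalue of S
    h = np.abs(eigvecs[:, k].real)
    h = h / h.sum()                           # normalized evaluations
    ci = (lam_max - n) / (n - 1)
    cr = ci / RI[n] if RI.get(n, 0.0) > 0 else 0.0
    return h, ci, cr

S = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])
h, ci, cr = ahp_evaluations(S)
print(h, ci, cr)    # CR < 0.1: the matrix is considered consistent enough
```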
2 The (fundamental) scale
The elements $s_{ij}$ are estimations of the actual values of $h_{ij}$ and are provided by the decision maker as answers to questions “What is the intensity of preference of $A_i$ over $A_j$ for you according to the given criterion?” or alternatively “How much more do you prefer $A_i$ over $A_j$?”.

s_ij        linguistic labels of the numerical intensities of preferences
1           alternative i is equally preferred as alternative j
3           alternative i is slightly/moderately more preferred than alternative j
5           alternative i is strongly more preferred than alternative j
7           alternative i is very strongly more preferred than alternative j
9           alternative i is extremely/absolutely more preferred than alternative j
2, 4, 6, 8  correspond with the respective intermediate linguistic meanings obtained by joining the respective two linguistic labels T_k and T_l by “between” into the label “between T_k and T_l”

Table 1: Saaty's scale - 9 numerical values of its elements for expressing preference (or indifference in the case of 1) of alternative i over alternative j and their suggested linguistic labels.

We can observe that the questions are not easy to answer - at least not in a precise manner. We cannot expect all decision makers to be able to provide answers such as “$A_3$ is 3.56 more preferred than $A_5$”. Even if the decision-maker was able to provide such a precise answer, can we be sure that such a value is reliable? Saaty suggests the scale presented in Table 1 to be used to express preference intensities. Linguistic labels are suggested to represent five elements of the scale. There are several possible modes in which the intensity of preference can be provided; out of all these, two were originally suggested by Saaty - the linguistic mode and the numerical mode (see Table 1, which also summarizes their mutual relation). Huizingh and Vrolijk [3, p. 243] found that under their experimental design “the numerical
and the verbal mode are equally able to predict the ranking within the set of alternatives” (note that they consider rankings, not the actual evaluations of alternatives). The verbal (linguistic) mode, however, results in higher inconsistency according to [3]. They also conclude [3, p. 245] that “using the verbal mode without knowing how people interpret the preference phrases leads to a small loss in decision quality” (again, just rankings were considered). But Dyer and Forman [1] clearly warn against the combination of these two modes in any set of judgements. This seems to suggest that although both modes can work separately, their relation might not be as easy as it is presented in Table 1. We aim to add more arguments to support the claim that the assignment of linguistic labels to the numerical values of the fundamental scale is arbitrary and even contradicts the consistency condition set by Saaty.
It is clear that the appropriateness of a linguistic label for a given preference intensity (or, vice versa, the numerical meaning of a given linguistic expression) varies among decision makers, is context dependent and can even be language specific. Since there is no one-to-one mapping between languages that would maintain meaning, we cannot expect the translations of Saaty's fundamental scale (its linguistic mode) into different languages to work with the same numerical values as meanings. It is questionable whether the meanings of the translations maintain the same position (and relative position) on the universe {1, 2, 3, 4, 5, 6, 7, 8, 9}. It may be that the differences in meaning among different language mutations of AHP are not too big, but accepting a translation of the linguistic scale without checking them does not follow good methodological practice. Whether OR researchers want to accept it or not, there is a need for empirical confirmation that we are using the methods correctly, mainly with methods that employ a linguistic mode of description.
If we interpret the numerical values of the fundamental scale {1, 2, 3, 4, 5, 6, 7, 8, 9} and their reciprocals in the suggested way, that is, as describing how many times one alternative is preferred to another one, we need to explain why there is no greater value than 9. There is in fact no natural maximum of the number of times something can be preferred to something else. This problem is even more pronounced when quantitative criteria are used and the ratio of their values is explicitly larger than 9 (say 10 EUR is ten times more than 1 EUR, but the restricted fundamental scale allows us to use 9 as a maximum value for any element in S). When we now assign the linguistic label “extremely (or absolutely) preferred to” to “9 times more preferred than”, we can confuse the decision makers. What is more, using Table 1 we define “3 times more preferred” to be just “slightly preferred”, which also does not seem to correspond with our intuition. It would seem that any calibration (setting appropriate labels for the numerical values of this scale) will be problematic, unless approached from a context where natural minima and maxima of preference-intensity ratios exist.
If we examine the linguistic mode of Saaty's fundamental scale in the context of his consistency condition (1), we run into further problems. To be compatible with the linguistic labels used in Saaty's scale, the condition should make sense also when we substitute the linguistic labels into it. Let us consider $s_{ik} = 3$ and $s_{kj} = 3$; then based on (1) we need $s_{ij} = 3 \cdot 3 = 9$.
If we transform this into the linguistic level, we get: if “$A_i$ is slightly more preferred than $A_k$” and “$A_k$ is slightly more preferred than $A_j$”, then “$A_i$ is extremely/absolutely more preferred than $A_j$”. This is rather counterintuitive; we would expect a much smaller preference between $A_i$ and $A_j$ induced by two slight preferences. The consistency condition (1) is not well defined for the linguistic labels (or the linguistic labels are not well defined). In any case, if the decision maker provides information in linguistic form only and (1) is required, we declare as consistent something that is counterintuitive.
3 Possible solutions
It seems to us that there is no strong correspondence between the linguistic labels and the numbers that are supposed to represent their meaning (other than the fact that the linguistic labels can be ordered in the same way as their numerical counterparts). As was already mentioned, this might not be a big problem, if just one of the levels is used to obtain S - that is if the decision maker provides either just numbers, or just linguistic values. If, however, these two levels are combined and the decision maker has to deal with the fact that e.g. “9 times more = absolutely more”, or “3 times more = slightly more” problems can occur due to the possible ambivalence of the assigned meaning. Also the resulting evaluations can be questionable, if the numerical values do not reflect the meaning of the linguistic labels well enough. The control of the decision maker over the process of obtaining evaluations fades in this case. There are at least two problems that need to be addressed - the discrepancy between the linguistic mode and the numerical mode, and the poor performance of linguistic labels in terms of the consistency
condition and its “linguistic performance”.
3.1 Calibration - achieving correspondence of the two discussed modes
Since the correct meaning of the linguistic labels of the intensities of preferences (e.g. “ strongly more preferred ”) can depend on the decision-maker, on the type of decision making situation (context) and on the whole set of linguistic labels chosen for the description of preference intensities (including its cardinality and the choice of linguistic terms), the idea of customizing the meaning of the linguistic terms (expressed as a value from the fundamental scale {1,2,3,4,5,6,7,8,9} and the respective reciprocal values, or simply as values from the interval [1/9, 9]) has been addressed by several authors with varying success. Finan and Hurley [2] propose a calibration method that results in a geometric numerical scale providing meanings for the linguistic terms. They use the process of transitive calibration, which first specifies the number of the linguistic terms to be assigned numerical values (say four linguistic values: {equally important, slightly more important, more important, much more important } = {T1 , T2 , T3 , T4 }; we are looking for their numerical equivalents {t1 , t2 , t3 , t4 }, t1 = 1 by definition), in the second step they ask the decision-maker to decide the following: if (x is T2 than y) and (y is T2 than z), what is the relation of x to z - is it T3 or T4 ? At the end they require the decision maker to specify the meaning of (x is T2 than y) in terms of “x being k% more important than y”. If the decision-maker specified, that (x is T3 than z), they get t2 = 1 + k, t3 = (1 + k)2 and t4 = (1 + k)4 . Note that they operate under the consistency condition expressed for the numerical values as ((sij > 1) ∧ (sjk > 1)) =⇒ (sik > max{sij , sjk }) . This, however, seems to be too restrictive, since the combination of T2 and T2 does not need to be perceived as Tl , where l > 2. In our opinion T2 should also be considered as a possible answer (see the concept of weak consistency further in the text). Ishizaka and Nguyen [4] suggest a calibration procedure for qualitative criteria, that computes the numerical values of linguistically expressed intensities of preferences on the bases of pair-wise comparisons of geometrical figures with different areas (ordered in increasing order with respect to the area). The decision maker is asked to describe linguistically how big the difference between the areas of the given pair of objects is. The judgements are then compared with the real ratios of the areas of the objects and the numerical meaning of the linguistic terms is computed. Although this might seem as a simple and useful idea, there are several methodological issues connected with this solution - 1) there was never a higher ratio of the areas than 1 : 9 used in the study, 2) calibrating a scale for a qualitative criterion on a quantitative universe with different interpretation is very risky (since we know that the meanings of the linguistic terms are context-specific). Paper [4] however presents more evidence for the necessity of addressing the issue of calibration properly. In both these approaches we can identify the problem of trying to calibrate the meanings of linguistic terms of a variable that might not have a natural maximum on a restricted universe (e.g. [1/9, 9]) . This can be bypassed by transforming this restricted universe into a different universe with natural minimum and maximum, with which the decision-maker can work reasonably well. 3.2
Calibration - a fuzzy preference relation approach
It is possible to transform a multiplicative (elements interpreted in terms of ratios/multiples) pairwise comparison matrix S = {sij }ni,j=1 into an additive (elements interpreted in terms of differences) pairwise comparison matrix Z = {zij }ni,j=1 using the following transformation (see e.g. [6]):
$$z_{ij} = \frac{1}{2}\left(1 + \log_9 s_{ij}\right). \qquad (2)$$
The resulting matrix $Z$ then carries the same information concerning the preferences of the decision maker. $Z$ is additively reciprocal, that is $z_{ij} = 1 - z_{ji}$, $z_{ij} \in [0, 1]$ and $z_{ii} = 0.5$ for all $i, j = 1, \ldots, n$. As such the matrix $Z$ can be interpreted as a fuzzy relation, its elements $z_{ij}$ representing the degree of preference of $A_i$ over $A_j$. Obviously, $z_{ij} = 0.5$ is interpreted as indifference between $A_i$ and $A_j$, $z_{ij} = 1$ is interpreted as $A_i$ is absolutely preferred to $A_j$, and $z_{ij} = 0$ is interpreted as $A_j$ is absolutely preferred to $A_i$. The transformation formula (2) can be used in the context of fuzzy pairwise comparison matrices as well (see [5, 9]).
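A small Python sketch of the transformation (2), applied to the values of the fundamental scale, already hints at the calibration issue discussed below: the value 3 (“slightly/moderately more preferred”) is mapped exactly halfway between indifference (0.5) and absolute preference (1).

```python
import numpy as np

def to_additive(s):
    """Transformation (2): multiplicative intensity s_ij -> additive degree z_ij."""
    return 0.5 * (1.0 + np.log(s) / np.log(9.0))     # logarithm with base 9

saaty = np.arange(1, 10, dtype=float)                 # 1, 2, ..., 9
for s, z in zip(saaty, to_additive(saaty)):
    print(f"s = {s:.0f}  ->  z = {z:.3f}")            # 1 -> 0.5, 3 -> 0.75, 9 -> 1.0
```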
As we already know the meanings of the linguistic labels in the multiplicative case (see Table 1), we can now transform them into the additive representation using (2) and see, whether they “make sense”
in the additive case, where a natural minimum and maximum of the degree of preference exists. Figure 1 summarizes the results of the transformation of the meanings of linguistic labels from the multiplicative case into the additive representation. We can see, that at least the meaning of the linguistic label “slightly preferred” seems to be misplaced - it is exactly half the way between indifference and extreme preference. We therefore suggest to construct the meanings of the available linguistic labels to be more intuitive (and thus making the linguistic labels more compatible with the multiplicative consistency condition) by constructing the meanings in the additive model instead. That is to define an appropriate meaning of each of the five linguistic labels used in Saaty’s scale as a number from [0.5, 1]. Then these values are transformed back to the multiplicative universe (model) and a closest integer value from {1, 2, 3, 4, 5, 6, 7, 8, 9} would be assigned to them. The result of such an approach, when the numerical meanings are chosen to be equidistant in the additive universe [0.5, 1] is summarized in Figure 2. We can see that after this modification even the multiplicative consistency condition seems to work better for the linguistic mode of the fundamental scale - if we again consider that “Ai is slightly more preferred than Ak ” and “Ak is slightly more preferred than Aj ” then the consistent result is “Ai is between strongly and very strongly more preferred than Aj ” (numerically sik = 2 and skj = 2 and therefore sij = 2∙2 = 4) - this seems to us slightly closer to the intuitive expectation of the aggregated preference than in the original case. The uniform distribution of meanings of the five linguistic labels on [0.5, 1] is just an example to illustrate the proposed approach. The meanings of these terms would have to be set in accordance with the understanding of these linguistic labels by the decision maker.
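The back-transformation used for Figure 2 can be sketched as follows; the uniform spacing of the five meanings on [0.5, 1] is only the illustrative choice made above, not a recommended calibration.

```python
import numpy as np

def to_multiplicative(z):
    """Inverse of (2): additive degree z in [0.5, 1] -> multiplicative value 9**(2z - 1)."""
    return 9.0 ** (2.0 * z - 1.0)

labels = ["equally", "slightly", "strongly", "very strongly", "extremely"]
z_values = np.linspace(0.5, 1.0, 5)          # equidistant meanings on [0.5, 1]
for label, z in zip(labels, z_values):
    s = to_multiplicative(z)
    print(f"{label:14s} z = {z:.3f}  s = {s:.3f}  nearest integer = {round(s)}")
```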
Figure 1: Transformation of the numerical values corresponding with the linguistic labels (in colour) and the intermediate numerical values and their reciprocals from Saaty’s multiplicative scale into the values of the additive scale.
Figure 2: Transformation of the meanings of the linguistic labels of Saaty's scale that are considered to be uniformly distributed on [0.5, 1] in the additive approach back to numerical values of the multiplicative scale. For each linguistic label an exact numerical value of its meaning after transformation is presented and the closest integer is assigned as its meaning in the multiplicative case.
3.3 Revision of consistency requirements
To use the linguistic level for inputs more safely, a weaker consistency condition (see Definition 1) that reflects the linguistic labels well has been proposed in [12] and further elaborated in [10].
Definition 1. (Weak consistency condition [10]) Let $S = \{s_{ij}\}_{i,j=1}^{n}$ be a matrix of preference intensities. We say that $S$ is weakly consistent, if for all $i, j, k \in \{1, 2, \ldots, n\}$ the following holds:
$$s_{ij} > 1 \wedge s_{jk} > 1 \implies s_{ik} \ge \max\{s_{ij}, s_{jk}\}; \qquad (3)$$
$$(s_{ij} = 1 \wedge s_{jk} \ge 1) \vee (s_{ij} \ge 1 \wedge s_{jk} = 1) \implies s_{ik} = \max\{s_{ij}, s_{jk}\}. \qquad (4)$$
The properties of this condition are discussed in more detail in [9, 10, 11, 12]. It is important to note here that this condition is reasonable on both the numerical and the linguistic level of description. In the situation when “$A_i$ is slightly more preferred than $A_k$” and “$A_k$ is slightly more preferred than $A_j$”, we just require $A_i$ to be “at least slightly more preferred than $A_j$”. Such a condition is obviously much weaker
than (1). It can therefore be seen as a minimum requirement on the consistency of expertly defined matrices of preference intensities.
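A brute-force check of Definition 1 is straightforward; the sketch below simply tests conditions (3) and (4) for all index triples of a given matrix (the example matrix is made up).

```python
import numpy as np

def is_weakly_consistent(S, tol=1e-9):
    """Check the weak consistency conditions (3) and (4) of Definition 1."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                m = max(S[i, j], S[j, k])
                # condition (3): two strict preferences must not weaken
                if S[i, j] > 1 and S[j, k] > 1 and S[i, k] < m - tol:
                    return False
                # condition (4): indifference combined with a preference keeps it
                if ((S[i, j] == 1 and S[j, k] >= 1) or (S[i, j] >= 1 and S[j, k] == 1)) \
                        and abs(S[i, k] - m) > tol:
                    return False
    return True

S = np.array([[1.0, 2.0, 3.0],
              [1/2, 1.0, 2.0],
              [1/3, 1/2, 1.0]])
print(is_weakly_consistent(S))   # True: 2 and 2 combine to at least 2 (here 3)
```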
4 Conclusion
Defining the meanings of the linguistic labels of Saaty’s scale in the additive model on a universe with natural minimum and maximum proposed in this paper seems reasonable - the decision maker is asked to simply find a fixed point on the interval [0.5, 1] (that is between indifference and absolute preference) for each linguistic label. This way we are able to assign linguistic description to 5 real numbers representing the strength of a preference (we can also consider the intermediate linguistic terms thus obtaining 9 linguistically interpretable values). We can also consider the real number assigned to a linguistic term as its meaning to be a typical (best) representative of the given linguistic term and use fuzzy sets with kernels containing these typical values and define appropriate fuzzy scales to be able to linguistically label all the possible numerical values from the given scale. The issue of calibration of the linguistic mode of Saaty’s scale remains open, but we hope that this paper identifies at least some of the possible paths to its solution.
Acknowledgements Supported by the grant GA 14-02424S Methods of operations research for decision support under uncertainty of the Grant Agency of the Czech Republic and partially also by the grant IGA PrF 2016 025 of the internal grant agency of Palacký University Olomouc.
References [1] Dyer, R. F., and Forman, E. H.: An Analytic Approach to Marketing Decisions. Prentice Hall International, New Jersey, 1991. [2] Finan, J. S., and Hurley, W. J.: Transitive calibration of the AHP verbal scale. European Journal of Operational Research, 112, 2, (1999), 367–372. [3] Huizingh, E. K. R. E., and Vrolijk, H. C. J.: A Comparison of Verbal and Numerical Judgments in the Analytic Hierarchy Process. Organizational Behavior & Human Decision Processes, 70, 3, (1997), 237–247. [4] Ishizaka, A., and Nguyen, N. H.: Calibrated fuzzy AHP for current bank account selection. Expert Systems with Applications, 40, 9, (2013), 3775–3783. [5] Krejčı́, J.: Additively reciprocal fuzzy pairwise comparison matrices and multiplicative fuzzy priorities. Soft Computing (2015), 1–16. DOI 10.1007/s00500-015-2000-2 [6] Ramı́k, J., and Vlach, M.: Measuring consistency and inconsistency of pair comparison systems. Kybernetika, 49 (3), (2013), 465–486. [7] Saaty, T. L.: A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology, 15, 3, (1977), 234–281. [8] Saaty, T. L.: Fundamentals of Decision Making and Priority Theory With the Analytic Hierarchy Process. RWS Publishers, Pittsburgh, 2000. [9] Stoklasa, J.: Linguistic models for decision support. Lappeenranta University of Technology, Lappeenranta, 2014. [10] Stoklasa, J., Jandová, V., and Talašová, J.: Weak consistency in Saaty’s AHP - evaluating creative work outcomes of Czech Art Colleges Classification of works of art. Neural Network World, 23, 1, (2013), 61–77. [11] Stoklasa, J., Talášek, T., and Talašová, J.: AHP and weak consistency in the evaluation of works of art - a case study of a large problem. International Journal of Business Innovation and Research, 11, 1, (2016), 60–75. [12] Talašová, J., and Stoklasa, J.: A model for evaluating creative work outcomes at Czech Art Colleges. In Proceedings of the 29th International Conference on Mathematical Methods in Economics - part II (2011), University of Economics, Prague, Faculty of Informatics and Statistics, 698–703.
‘Soft’ consensus in decision-making model using partial goals method
Vojtěch Sukač1, Jana Talašová2, Jan Stoklasa3
Abstract. The paper deals with the method of reaching consensus in group decision-making under fuzziness proposed by Sukač et al. in 2015. This multiexpert (ME) multiple-criteria decision-making (MCDM) model utilizes fuzzy evaluations of absolute type and stems from the partial-goals tree evaluation methodology introduced by Talašová in 2003. The fuzzy evaluations of alternatives are in the form of fuzzy numbers on a universe representing the degree of fulfillment of a given goal and as such they express the fuzzy degrees of fulfillment of the given goal by the respective alternatives for each expert. Each expert is assigned a decision competence expressed by a fuzzy number. The group aggregation of expert evaluations is based on the idea of the ‘soft’ consensus introduced by Kacprzyk and Fedrizzi in 1986. The alternatives which are evaluated “good enough” by a “sufficient amount” of “important” experts are identified and those alternatives form the set of possible solutions. In this paper we briefly recall the steps of the multi-expert ‘soft’ consensus reaching method by Sukač et al. and discuss its performance on practical examples in comparison with one of the most commonly used ME-MCDM approaches based on the weighted average of expert fuzzy evaluations. The center of gravity based ordering is applied to select the best evaluated alternative in this case. This way the paper analyzes the performance of the ‘soft’ consensus based MCDM method in the context of a well known benchmark and points out the added value of the ‘soft’ consensus based MCDM model. Keywords: Fuzzy, group decision-making, fuzzy weighted average, consensus reaching, soft consensus. JEL classification: C44 AMS classification: 91B74
1 Introduction
We make a lot of decisions every day, and in most cases we decide intuitively. Usually this does not require a lot of time to think about, nor does it require formal models and decision support. Sometimes, however, we have to make decisions whose consequences are serious (e.g. choosing the best candidate for a job, choosing the best area to live, etc.). In these situations, an appropriate decision-making model should be used. Important and strategic decisions also take into account the view of more parties. In companies, strategic decision-making frequently involves more people in order to reach higher objectivity (and also to avoid wrong decisions caused by overlooking some of the important aspects of the decision-making situation). In such a case it is desirable for each of the individuals to have (slightly) different preferences. Sometimes, however, the assessments of multiple experts (decision-makers — DMs) are too different, which makes finding an alternative acceptable to everyone difficult or even impossible. Due to this, negotiation and consensus reaching processes have been studied quite frequently in the literature. In most of the papers dealing with consensus reaching in group decision-making, the consensus reaching is based on the aggregation of fuzzy preference relations.
In this paper we recall and analyse the fuzzy group decision-making model proposed in [5, 6] which utilizes fuzzy evaluations of absolute type. These evaluations can be interpreted as the degrees of fulfillment of given goals [7]. Such evaluations are desirable, since they are not dependent on the set of
1 Palacký University Olomouc, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 77146 Olomouc, vojta.sukac@gmail.com
2 dtto, jana.talasova@upol.cz
3 Palacký University Olomouc, Department of Applied Economics, Křížkovského 12, 77180 Olomouc, jan.stoklasa@upol.cz
alternatives and describe the acceptability of the alternatives (this also removes the “one-eyed is king among the blind” effect from the evaluation). Group aggregation of evaluations of alternatives is performed by a fuzzy weighted average operation with fuzzy weights (see e.g. [4]). Consensus reaching in the model analyzed in this paper is based on the idea that the choice of the best alternative should be made only among the alternatives that are good enough according to most of the relevant experts. Such pre-selection of alternatives is based on the idea of ‘soft’ consensus introduced by Kacprzyk and Fedrizzi [1, 2]. The paper is organized as follows. The next section summarizes the definitions of the concepts necessary to introduce the model. In the third section the decision-making model is described in brief. The fourth section contains two examples on which the results and the performance of the ‘soft’ consensus based model are compared with the standard approach based on the weighted average of expert fuzzy evaluations and the defuzzification by the center of gravity method.
2 Preliminaries
Let $U$ be a nonempty set called the universe. A fuzzy set $A$ on $U$ is determined by its membership function $\mu_A(x) : U \to [0, 1]$, where $\mu_A(x)$ expresses the degree of membership of $x$ in the fuzzy set $A$ — from 0 for “$x$ definitely does not belong to $A$” to 1 for “$x$ definitely belongs to $A$”, through all the intermediate values. The family of all fuzzy sets on the universe $U$ is denoted by $\mathcal{F}(U)$. Let $A$ be a fuzzy set on $U$ and $\alpha \in [0, 1]$. The crisp set $A_\alpha = \{x \in U \mid \mu_A(x) \ge \alpha\}$ is called the $\alpha$-cut of the fuzzy set $A$. The support of the fuzzy set $A$ is a (crisp) set $Supp(A) = \{x \in U \mid \mu_A(x) > 0\}$. The kernel of the fuzzy set $A$ is a (crisp) set $Ker(A) = \{x \in U \mid \mu_A(x) = 1\}$.
In cases when the support of $A$ is a discrete set ($Supp(A) = \{x_1, \ldots, x_k\}$), the fuzzy set $A$ can be denoted as $A = \{\mu_A(x_1)/x_1, \ldots, \mu_A(x_k)/x_k\}$. A special type of fuzzy sets whose universe is a subset of $\mathbb{R}$, the so-called fuzzy numbers, can be defined in the following way. Let $U \subset \mathbb{R}$ be an interval. A fuzzy number $N$ is a fuzzy set on the universe $U$ which fulfills the following conditions: a) $Ker(N) \ne \emptyset$; b) for all $\alpha \in (0, 1]$, $N_\alpha$ are closed intervals; c) $Supp(N)$ is bounded. The family of all fuzzy numbers on $U$ is denoted by $\mathcal{F}_N(U)$. Each fuzzy number $N$ is determined by $N = \{[\underline{N}(\alpha), \overline{N}(\alpha)]\}_{\alpha \in [0,1]}$, where $\underline{N}(\alpha)$ and $\overline{N}(\alpha)$ are the lower and upper bounds of the $\alpha$-cut of the fuzzy number $N$, respectively, for $0 < \alpha \le 1$, and $[\underline{N}(0), \overline{N}(0)]$ is the closure of the support of $N$, i.e. $[\underline{N}(0), \overline{N}(0)] = Cl(Supp(N))$. A trapezoidal fuzzy number $N$ is determined by an ordered quadruple $(n_1, n_2, n_3, n_4) \subset U^4$ of significant values of $N$ satisfying $(n_1, n_4) = Supp(N)$ and $[n_2, n_3] = Ker(N)$. The membership function of a trapezoidal fuzzy number $N$ is
$$\mu_N(x) = \begin{cases} \dfrac{x - n_1}{n_2 - n_1} & \text{if } n_1 \le x < n_2, \\ 1 & \text{if } n_2 \le x \le n_3, \\ \dfrac{n_4 - x}{n_4 - n_3} & \text{if } n_3 < x \le n_4, \\ 0 & \text{otherwise.} \end{cases} \qquad (1)$$
$N$ is called a triangular fuzzy number if $n_2 = n_3$. Closed real intervals and real numbers can be represented by special cases of trapezoidal fuzzy numbers. See e.g. [3] for more details.
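The membership function (1) is easy to evaluate directly; the following small Python sketch assumes n1 < n2 and n3 < n4 (degenerate edges would need a special case).

```python
def trapezoidal_membership(x, n1, n2, n3, n4):
    """Membership function (1) of the trapezoidal fuzzy number (n1, n2, n3, n4)."""
    if n1 <= x < n2:
        return (x - n1) / (n2 - n1)      # rising edge
    if n2 <= x <= n3:
        return 1.0                       # kernel
    if n3 < x <= n4:
        return (n4 - x) / (n4 - n3)      # falling edge
    return 0.0                           # outside the support

# competence L1 = (0.54, 0.62, 0.69, 0.77) used later in Example 1
print(trapezoidal_membership(0.65, 0.54, 0.62, 0.69, 0.77))   # in the kernel -> 1.0
print(trapezoidal_membership(0.58, 0.54, 0.62, 0.69, 0.77))   # on the rising edge -> 0.5
```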
This paper utilizes the linguistic approach to group decision-making problems. A linguistic variable [8] is the 5-tuple $(\mathcal{V}, T(\mathcal{V}), U, G, M)$, where $\mathcal{V}$ is the name of the linguistic variable, $T(\mathcal{V})$ is the set of its linguistic terms (the values of $\mathcal{V}$), $U \subset \mathbb{R}$ is the universe on which the fuzzy numbers expressing the meanings of these linguistic terms are defined, $G$ is a grammar used for generating the linguistic terms of $T(\mathcal{V})$, and $M$ is a mapping that assigns to each term $\mathcal{C} \in T(\mathcal{V})$ its meaning $C = M(\mathcal{C})$ (a fuzzy number on $U$).
3 Description of the model
For the purpose of this paper we consider the group decision-making problem addressed in [6] with a group E = {E1 , . . . , Ep } of p ≥ 2 individuals (experts) with possibly different competences and a set X = {X1 , . . . , Xn } of n ≥ 2 alternatives. Experts’ competences are given by trapezoidal fuzzy numbers Lk , k = 1, . . . , p on the interval [0, 1], where the endpoints of this continuum have the following interpretations: 0 means an incompetent expert and 1 means a fully competent expert. Each expert
provides fuzzy evaluations of each alternative $X_i$. These fuzzy evaluations $H_{ik}$, $i = 1, \ldots, n$, $k = 1, \ldots, p$, are provided in the form of trapezoidal fuzzy numbers on the interval [0, 1] and express the degree of fulfillment of the given goal by an alternative $X_i$ according to expert $E_k$ (0 is interpreted as the goal not being fulfilled at all, 1 means complete fulfillment of the given goal). These evaluations can also be the result of a multiple-criteria assessment of the given alternative by the given expert. As such these evaluations are of absolute type, that is, they are not dependent on the set of alternatives and describe the acceptability of the alternatives for each expert.
3.1 The benchmark model — standard approach based on weighted averaging and center of gravity defuzzification
For each alternative $X_i$, its fuzzy evaluations $H_{ik}$ provided by all experts are aggregated using the fuzzy weighted average operation with the competences $L_k$ as fuzzy weights to obtain the group evaluation $H_i$, $i = 1, \ldots, n$. Then the group fuzzy evaluations of the alternatives are defuzzified using the centre of gravity method
$$t_{H_i} = \frac{\int_0^1 x\,\mu_{H_i}(x)\,dx}{\int_0^1 \mu_{H_i}(x)\,dx} \qquad (2)$$
and the alternatives are sorted in ascending order with respect to their centre of gravity. The alternative with the highest centre of gravity is chosen as the best one. In the case of more alternatives with the same or very similar centre of gravity sharing the highest place on the list, all of these are the result of the decision-making model and any of them can be chosen as the optimal one. This approach is considered the standard approach, and the results obtained from it will be compared with the results obtained from the ‘soft’ consensus-based model.
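Formula (2) can be evaluated numerically on a grid; a minimal Python sketch for a single trapezoidal evaluation is shown below (the group evaluation H_i would first have to be obtained by the fuzzy weighted average, which is not reproduced here).

```python
import numpy as np

def centre_of_gravity(membership, a=0.0, b=1.0, grid=10_001):
    """Discretized centre of gravity (2) of a fuzzy set given by its membership function."""
    x = np.linspace(a, b, grid)
    mu = np.array([membership(t) for t in x])
    return float((x * mu).sum() / mu.sum())

# trapezoidal fuzzy evaluation (0.62, 0.76, 0.92, 0.97) from Table 1 (camera7, Expert 1)
trap = lambda t: max(0.0, min((t - 0.62) / 0.14, 1.0, (0.97 - t) / 0.05))
print(centre_of_gravity(trap))
```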
3.2 The ‘soft’ consensus-based model
The model proposed in [5] and further developed in [6] also assumes that, for each alternative $X_i$, its fuzzy evaluations $H_{ik}$ are provided by all experts. The subsequent computations are, however, based on the idea that the optimal alternative should be chosen among such alternatives which are good enough in the opinion of a sufficient amount of important experts. To this end, a subset of the set of alternatives is defined which includes only such alternatives that are good enough according to the given quantity of relevant experts. This subset is called the candidate set. Note that it can happen that there is no element in the candidate set (i.e. the candidate set is empty). In this case, no alternative is chosen (none of the alternatives is considered good enough — in the consensus sense — by the experts). This constitutes a fundamental difference from the standard approach, where some alternative is always chosen, even if it is completely unsatisfactory. Not recommending any alternative in cases when none is good enough in the consensus sense is also well in line with the basic idea and purpose of evaluations of absolute type. The model can be summarized in the following steps (the exact procedure of defining the candidate set far exceeds the scope of this paper; its detailed description can be found in [5, 6]):
1. First a linguistic variable $\hat{A}$ with the linguistic term set $\{\hat{A}_1, \ldots, \hat{A}_5\}$ = {excellent, good, acceptable, borderline, unacceptable} is introduced to express the level of acceptance of alternatives by experts. Using the values of this linguistic variable and their fuzzy-number meanings, a modified set of linguistic terms $\{A_1, \ldots, A_5\}$ expressing “at least $\hat{A}_r$”, where $A_r, \hat{A}_r \in \mathcal{F}_N([0, 1])$, $r = 1, \ldots, 5$, is defined.
2. The fuzzy sets $F_{ri}$ of experts suggesting $A_r$ for $X_i$, $F_{ri} \in \mathcal{F}(\{E_1, \ldots, E_p\})$, $r = 1, \ldots, 5$, $i = 1, \ldots, n$, are subsequently defined as $F_{ri} = \{\theta_{ri}^1/E_1, \ldots, \theta_{ri}^p/E_p\}$, where $\theta_{ri}^k = \sup_{x \in [0,1]} \min\{\mu_{A_r}(x), \mu_{H_{ik}}(x)\}$, $k = 1, \ldots, p$ (a computational sketch of this step is given after this list).
3. The linguistic quantifier set $\{\hat{Q}_1, \ldots, \hat{Q}_4\}$ = {almost all, more than half, about half, minority} is introduced and the meanings of the derived linguistic quantifiers $\{Q_1, \ldots, Q_4\}$ expressing “at least $\hat{Q}_s$” are defined respectively by fuzzy numbers on [0, 1]. The linguistic term important expert is also introduced, with its meaning $B \in \mathcal{F}_N([0, 1])$. The fuzzy set of important experts $I \in \mathcal{F}(\{E_1, \ldots, E_p\})$ is defined as $I = \{\zeta_1/E_1, \ldots, \zeta_p/E_p\}$, where $\zeta_k = \sup_{x \in [0,1]} \min\{\mu_B(x), \mu_{L_k}(x)\}$, $k = 1, \ldots, p$.
4. The truth value of the statement “the alternative $X_i$ is $A_r$ (e.g. at least good) with respect to the opinion of the quantity ($Q_s$) of important experts ($I$)” (denoted $\xi_i^{r,s}$) is determined, $i = 1, \ldots, n$, $r = 1, \ldots, 5$, $s = 1, \ldots, 4$ (see [6] for details).
5. For all $r, s$ the set $\Upsilon_{r,s}$ is defined, which includes such alternatives which are $A_r$ according to $Q_s$ of important experts, $\Upsilon_{r,s} = \{X_i \in X \mid \xi_i^{r,s} = 1\}$.
6. The order $K$ in which to scan the sets $\Upsilon_{r,s}$ for non-emptiness is defined, e.g. $K = ((1,1), (1,2), \ldots, (2,3))$, where each pair of values represents a given combination of $r$ and $s$. In our case we suggest to check first the set of alternatives considered (excellent by almost all), then the set of alternatives considered (excellent by majority), ..., and the last set we check for non-emptiness is the set of alternatives that are (at least good by at least half). The candidate set, which is the first nonempty set $\Upsilon_{r,s}$ according to $K$ (denoted $\Upsilon^*$), includes those alternatives among which the most promising one is to be chosen. Note that $K$ can be defined to include only such sets that are still relevant for the given problem. E.g. looking for a set of alternatives that are at least borderline might not be appropriate if we are not willing to accept borderline alternatives. In this case the pairs $(3, s)$ and $(4, s)$ will not be present in $K$ for any value of $s$. Similarly we can reflect our requirements on the sufficient amount of experts that agree on the evaluation.
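Step 2 reduces to computing the possibility degree sup-min of two membership functions; the following Python sketch does this on a grid. The meaning of “at least good” used in the example is a hypothetical trapezoid chosen only for illustration, while the evaluation H_ik is taken from Table 1 (camera1, Expert 1).

```python
import numpy as np

def trapezoid(n1, n2, n3, n4):
    """Vectorized membership function of a trapezoidal fuzzy number."""
    def mu(x):
        x = np.asarray(x, dtype=float)
        rise = np.clip((x - n1) / (n2 - n1), 0.0, 1.0) if n2 > n1 else (x >= n1).astype(float)
        fall = np.clip((n4 - x) / (n4 - n3), 0.0, 1.0) if n4 > n3 else (x <= n4).astype(float)
        return np.minimum(rise, fall)
    return mu

def possibility_degree(mu_a, mu_h, grid=2_001):
    """theta_ri^k = sup_x min(mu_{A_r}(x), mu_{H_ik}(x)) on [0, 1], evaluated on a grid."""
    x = np.linspace(0.0, 1.0, grid)
    return float(np.max(np.minimum(mu_a(x), mu_h(x))))

at_least_good = trapezoid(0.60, 0.75, 1.0, 1.0)      # hypothetical meaning of "at least good"
h_ik = trapezoid(0.43, 0.59, 0.76, 0.84)             # camera1 as evaluated by Expert 1
print(possibility_degree(at_least_good, h_ik))
```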
4 Examples — comparison of the ‘soft’ consensus model with the benchmark
In this section two examples will be presented. Our aim is to show that in situations when there is no significant discrepancy among the evaluations provided by the experts, there is not much difference in the results provided by the two methods (see Example 1). However, in situations when there are large differences in the evaluations of alternatives among the experts, the two approaches diverge and provide different results: the standard approach aims for an “ideal compromise”, while the ‘soft’ consensus model aims for an alternative sufficiently acceptable to a sufficient amount of important experts (see Example 2).
4.1 Example 1 — no discrepancy in the evaluations among the experts
A company has decided to buy a new camera. It has a choice of 8 different cameras in the price category from 10 up to 20 thousand CZK. Their assessments have been provided by 6 experts and can be found in Table 1 (the four numbers in each entry represent the four significant values of the respective trapezoidal fuzzy number evaluation). The experts were assigned competences based on their position in the company and their experience with photography. The competences are expressed by fuzzy numbers on the interval [0, 1], where 0 is a definitely incompetent expert (his/her involvement is only an attempt at an unbiased view) and 1 is a fully competent expert who has either a high position in the company, or he/she is a very experienced photographer. The competences of the experts are as follows: L1 = (0.54, 0.62, 0.69, 0.77), L2 = (0.23, 0.31, 0.38, 0.46), L3 = (0.69, 0.77, 0.85, 0.92), L4 = (0.54, 0.62, 0.69, 0.77), L5 = (0.38, 0.46, 0.54, 0.62), L6 = (1, 1, 1, 1). The aim of the management of the company was to choose the camera that the greatest part of the important experts designated as the most suitable.
In the standard approach, group evaluations are computed using the fuzzy weighted average operation and the alternatives are compared using the centres of gravity of these overall group fuzzy evaluations. The centres of gravity of the group evaluations are listed in Table 2. It is clear that camera7 is evaluated the best and the company will buy this camera. In the ‘soft’ consensus-based model, we compute the acceptability of each alternative by a given quantity of experts and thus the candidate set is defined. The candidate set consists of the alternatives evaluated as good or better by a majority of the important experts, and it contains three alternatives, namely camera1, camera2 and camera7. The centres of gravity of the group evaluations of these 3 alternatives were calculated: $t_{H_1} = 0.6355$, $t_{H_2} = 0.6629$, $t_{H_7} = 0.6942$,
and the alternative camera7 was chosen as the optimal one. Note that the ordering of the alternatives and the centres of gravity are the same, the only difference being that only camera1, camera2 and camera7 are considered as “first choice” viable candidates for the best alternative in the ‘soft’ consensus-based approach.
alternative   Expert 1                    Expert 2                    Expert 3
camera1       (0.43, 0.59, 0.76, 0.84)    (0.23, 0.36, 0.50, 0.65)    (0.20, 0.33, 0.49, 0.64)
camera2       (0.41, 0.57, 0.74, 0.82)    (0.26, 0.39, 0.55, 0.70)    (0.19, 0.29, 0.47, 0.62)
camera3       (0.26, 0.39, 0.52, 0.64)    (0.22, 0.35, 0.48, 0.61)    (0.06, 0.14, 0.28, 0.40)
camera4       (0.27, 0.37, 0.55, 0.69)    (0.49, 0.64, 0.80, 0.86)    (0.08, 0.18, 0.36, 0.51)
camera5       (0.19, 0.32, 0.46, 0.58)    (0.19, 0.32, 0.46, 0.59)    (0.13, 0.26, 0.41, 0.54)
camera6       (0.52, 0.67, 0.81, 0.91)    (0.46, 0.59, 0.72, 0.84)    (0.25, 0.40, 0.58, 0.74)
camera7       (0.62, 0.76, 0.92, 0.97)    (0.46, 0.59, 0.75, 0.85)    (0.31, 0.45, 0.60, 0.73)
camera8       (0.49, 0.61, 0.74, 0.86)    (0.37, 0.50, 0.64, 0.77)    (0.15, 0.29, 0.45, 0.58)

alternative   Expert 4                    Expert 5                    Expert 6
camera1       (0.53, 0.67, 0.81, 0.90)    (0.53, 0.70, 0.89, 0.95)    (0.53, 0.70, 0.87, 0.92)
camera2       (0.65, 0.79, 0.93, 0.97)    (0.56, 0.74, 0.92, 0.97)    (0.58, 0.74, 0.91, 0.95)
camera3       (0.41, 0.54, 0.67, 0.79)    (0.27, 0.40, 0.58, 0.76)    (0.22, 0.36, 0.50, 0.64)
camera4       (0.22, 0.37, 0.53, 0.65)    (0.15, 0.30, 0.48, 0.64)    (0.16, 0.25, 0.43, 0.58)
camera5       (0.27, 0.39, 0.52, 0.64)    (0.35, 0.47, 0.63, 0.79)    (0.18, 0.30, 0.45, 0.60)
camera6       (0.40, 0.55, 0.71, 0.81)    (0.49, 0.67, 0.84, 0.91)    (0.30, 0.40, 0.59, 0.72)
camera7       (0.58, 0.71, 0.86, 0.91)    (0.69, 0.82, 0.95, 0.98)    (0.43, 0.57, 0.75, 0.83)
camera8       (0.37, 0.51, 0.66, 0.79)    (0.60, 0.74, 0.89, 0.96)    (0.26, 0.35, 0.54, 0.70)
Table 1: Evaluations of the eight cameras by each expert (trapezoidal fuzzy numbers).

t_H1     t_H2     t_H3     t_H4     t_H5     t_H6     t_H7     t_H8
0.6355   0.6629   0.4276   0.4080   0.4094   0.5967   0.6942   0.5480
Table 2: Centres of gravity of the group evaluations of the eight cameras in the benchmark approach.
In this example we have shown that the ‘soft’ consensus-based model provides the same result as the standard approach in cases when there is not a significant discrepancy between the evaluations of the experts. It can therefore be considered a substitute tool for the standard approach in these circumstances. In addition, it provides information about how satisfied the experts are with the result. We know that the majority of the important experts finds the given three alternatives good or better — a linguistic summary of the decision-making problem and the strength of the consensus is therefore automatically available as an additional piece of information that is easy to use and understand.
4.2 Example 2 — evaluations of alternatives differ among experts
The family of four looking for a holiday destination. Evaluations of alternatives are presented in Table 3. Comptences of experts (family members) are as follows: L1 = (1, 1, 1, 1), L2 = (0.85, 0.92, 1, 1), L3 = (0.69, 0.77, 0.85, 0.92), L4 = (0.69, 0.77, 0.85, 0.92). alternative destination1 destination2 destination3 destination4 destination5 destination6 destination7 destination8
Expert 1 0.27 0.16 0.25 0.61 0.20 0.44 0.18 0.17
0.41 0.27 0.37 0.74 0.35 0.60 0.31 0.28
0.57 0.45 0.56 0.88 0.50 0.77 0.47 0.44
Expert 2 0.73 0.62 0.73 0.95 0.63 0.86 0.63 0.58
0.22 0.39 0.23 0.15 0.34 0.14 0.66 0.21
0.33 0.52 0.33 0.25 0.47 0.21 0.79 0.29
0.49 0.67 0.48 0.39 0.61 0.36 0.92 0.45
Expert 3 0.62 0.78 0.61 0.53 0.74 0.51 0.96 0.60
0.08 0.15 0.15 0.10 0.19 0.14 0.07 0.70
0.18 0.24 0.24 0.17 0.29 0.28 0.15 0.84
0.35 0.43 0.43 0.35 0.47 0.43 0.33 0.97
Expert 4 0.51 0.60 0.60 0.50 0.62 0.58 0.51 0.99
0.64 0.64 0.57 0.24 0.29 0.70 0.22 0.22
0.77 0.77 0.71 0.36 0.41 0.83 0.33 0.31
0.90 0.90 0.85 0.49 0.55 0.95 0.48 0.47
0.96 0.96 0.91 0.61 0.69 0.98 0.62 0.63
Table 3 Evaluation of eight destinations by each family member. In the standard approach, group evaluations are calculated using fuzzy weighted average operation and alternatives are compared by the center of gravity method. The centres of gravity of group evaluation are presented in the top row of Table 4. It’s clear from the table that destination6 is chosen as the best, since it’s center of gravity of group evaluation is the highest. In our consensus-based model, for each destination is computed acceptability by a given quantity of experts and the candidate set is defined. The candidate set contains only one alternative, namely destination5, which has been evaluated as acceptable or better by majority of the important experts. Whereas destination6 is labelled as at least good by about half of important experts. Note that in this case the worst alternative according to the standard approach is selected as the optimal one. Let us consider all experts have assigned full competence (that is (1,1,1,1)). From the Table 4 it is clear that
according to the standard approach the order of the alternatives is slightly different, but the consensus-based model again recommends the alternative with the lowest (or one of the lowest) group evaluations.

competence   t_H1     t_H2     t_H3     t_H4     t_H5     t_H6     t_H7     t_H8     optimal
different    0.4975   0.5261   0.4971   0.4732   0.4606   0.5440   0.4864   0.4972   5
fully        0.5019   0.5342   0.5022   0.4596   0.4603   0.5477   0.4781   0.5104   5

Table 4 Centres of gravity of group evaluations of destinations.
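As an illustration of the benchmark computation behind Tables 2 and 4, the sketch below defuzzifies trapezoidal fuzzy evaluations by the centre-of-gravity method and aggregates them with a weighted average. It is only a minimal illustration under stated assumptions: the trapezoid coordinates, the crisp weights and the centroid formula for a trapezoidal fuzzy number (a, b, c, d) are illustrative choices, not the authors' implementation, and the full method in the paper works with fuzzy expert competences (fuzzy weighted averages, cf. [4]) rather than crisp weights.

```python
import numpy as np

def cog_trapezoid(a, b, c, d):
    """Centre of gravity (centroid x-coordinate) of a trapezoidal fuzzy
    number (a, b, c, d); falls back to the mid-point for a degenerate case."""
    denom = 3.0 * (d + c - a - b)
    if abs(denom) < 1e-12:           # degenerate trapezoid (a = b = c = d)
        return (a + d) / 2.0
    return (d**2 + c**2 + c*d - a**2 - b**2 - a*b) / denom

# Hypothetical trapezoidal evaluations of one alternative by three experts
evaluations = [(0.53, 0.70, 0.85, 0.95),
               (0.40, 0.58, 0.74, 0.92),
               (0.36, 0.50, 0.64, 0.87)]
weights = np.array([0.4, 0.35, 0.25])          # illustrative crisp expert weights

# Benchmark-style score: defuzzify each evaluation, then take a weighted average
cogs = np.array([cog_trapezoid(*ev) for ev in evaluations])
group_score = float(weights @ cogs / weights.sum())
print(f"centres of gravity: {np.round(cogs, 4)}, group score: {group_score:.4f}")
```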
5
Conclusion
In this paper we have analyzed the ‘soft’ consensus model proposed by Sukač et al. [6] and its performance in comparison with the standard ME-MCDM approach based on the (fuzzy) weighted average of partial evaluations and the centre of gravity defuzzification method applied to the overall evaluations to obtain the ordering of the alternatives. The first example shows that if the experts evaluate the alternatives broadly similarly, we obtain the same (or similar) results as in the standard ME-MCDM approach. This suggests that in “standard” situations, when there is no apparent conflict between the evaluations provided by the experts, both models agree on the conclusion. The ‘soft’ consensus-based approach, however, also provides an additional piece of information concerning the degree of consensus of the evaluators. The second example shows that in the case of differing opinions of the experts, the result of the ‘soft’ consensus-based model can be very different from the benchmark approach — even alternatives that are evaluated very badly according to the standard approach can be recommended as the optimal solution of the problem. The requirement of consensus is taken into account in this case, hence alternatives with ambivalent evaluations cannot be recommended. The ‘soft’ consensus-based model is not looking for an “ideal compromise” as the standard approach does, but rather for a solution that is sufficiently acceptable for a sufficient number of important experts. It therefore aims closer to a non-conflict solution than the standard approach. This clearly shows that the standard approach is suitable for different problems, where a “compromise” solution is acceptable and consensus is not required. In situations where consensus (sufficient agreement of a sufficient number of experts) is of importance, the ‘soft’ consensus model can provide more appropriate solutions than the standard approach.
References
[1] Kacprzyk, J., and Fedrizzi, M.: ‘Soft’ consensus measures for monitoring real consensus reaching processes under fuzzy preferences. Control and Cybernetics 15 (1986), 309–323.
[2] Kacprzyk, J., Fedrizzi, M., and Nurmi, H.: ‘Soft’ degrees of consensus under fuzzy preferences and fuzzy majorities. In: Consensus under Fuzziness (J. Kacprzyk et al., eds.), Springer Science+Business Media, New York, 1997, 55–83.
[3] Klir, G. J., and Yuan, B.: Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall PTR, 1995.
[4] Pavlačka, O.: Modeling uncertain variables of the weighted average operation by fuzzy vectors. Information Sciences 181 (2011), 4969–4992.
[5] Stoklasa, J. et al.: Soft consensus model under linguistically labelled evaluations. In: Proceedings of the 33rd International Conference on Mathematical Methods in Economics 2015, University of West Bohemia, Plzeň, 2015, 743–748.
[6] Sukač, V., Talašová, J., and Stoklasa, J.: A linguistic fuzzy approach to the consensus reaching in multiple criteria group decision-making problems. Acta Universitatis Palackianae Olomucensis. Facultas Rerum Naturalium Mathematica 55 (2016), (in press).
[7] Talašová, J.: Fuzzy metody vícekriteriálního hodnocení a rozhodování. Vydavatelství UP, Olomouc, 2003, (in Czech).
[8] Zadeh, L. A.: The concept of a linguistic variable and its application to approximate reasoning-I. Information Sciences 8 (1975), 199–249.
Modelling of an asymmetric foreign exchange rate commitment in the Czech economy. DSGE model with constraints.

Karolína Súkupová1, Osvald Vašíček2

Abstract. In this paper we analyse the use of the foreign exchange rate as an instrument of monetary easing. The asymmetric commitment used by the Czech National Bank is modelled as a constraint in a DSGE model. The model used for our analysis is based on the concept of Justiniano, Preston (2010), redesigned to use the nominal exchange rate in an uncovered interest parity (UIP) condition. We estimated the model using a set of Czech and Euro area data series and then performed a simulation using the method and tools proposed and developed in Holden (2016). Our aim was to show the impact of a long-term use of the asymmetric exchange rate commitment on the economy. To answer this question we analyse the simulated trajectories of the endogenous variables that represent the development of the economy. In the first part we describe the model structure and compare the different specifications of the UIP condition used for the analysis. Then we provide a brief characterisation of the method used to implement constraints into a DSGE model. The final section provides a discussion of the obtained results.

Keywords: Zero Lower Bound, unconventional monetary policy, constraints in DSGE models.

JEL classification: C32, E17
AMS classification: 91B51, 91B64
1
Introduction
In the aftermath of the 2008–2009 recession, monetary authorities lost their ability to adjust the interest rates to fulfil the monetary policy targets. The persistently low price levels and negative price shocks pushed the economies towards deflation, and the need for new instruments of monetary policy revived the discussion among both academics and central bankers. The possible ways to deal with the liquidity trap were discussed in the early 2000s, for example in Eggertson, Woodford [3]. These papers reacted to the Japanese stagnation and the exceptionally low policy rates in the US economy. One sort of recommendation for the optimal policy is based on operations that increase the monetary base (quantitative easing, well conducted fiscal stimulation) and it emphasizes the importance of the right management of public expectations. Some authors tried to find another monetary policy instrument that could directly substitute for the policy rates, and the nominal exchange rate was a natural candidate. Intervention in the foreign exchange rate and the whole process of escaping from the zero lower bound and the deflationary trap are described in Svensson [10]. The Czech National Bank decided to use the foreign exchange rate as an additional instrument of monetary policy in November 2013. The bank board reacted to the secondary recession and the consequently low inflation rate that was induced both by low domestic demand and a low foreign inflation rate. The bank board declared a so-called asymmetric commitment: the bank operates on the foreign exchange rate market to hold the FX rate above a declared value, in this case 27 CZK/EUR.
1 Masaryk University, Faculty of Economics and Administration, Department of Economics, Lipová 41a, 602 00 Brno, 324246@mail.muni.cz.
2 Masaryk University, Faculty of Economics and Administration, Department of Economics, Lipová 41a, 602 00 Brno, osvald@econ.muni.cz.
The mechanism of the depreciation pass-through is based on the import price channel. The prices of imported goods rise, as the change in the FX rate changes the relative prices of domestic and foreign goods. The relatively cheaper domestic goods cause higher domestic demand for domestic goods, which consequently increases the price of domestic production. The overall price level rises, which changes the general price level expectations. The secondary result of the depreciation is the stimulation of domestic production by both higher domestic and foreign demand. This scenario is linked to an unexpected change in the foreign exchange rate, and in the long run the relative prices should accommodate the depreciation. The aim of this paper is to analyse the impact of the asymmetric commitment, i.e. repeated intervention that holds the nominal exchange rate above a declared level.
2
Model
The base of the model used for our analysis was derived in [6] and [1]. The structure of the model consists of households, two stages of domestic producers, domestic importers and the monetary authority. Households offer their labour to domestic intermediate goods producers and set their price of labour in a Calvo manner. The intermediate goods producers use labour as the only input and set their prices in the same way as households. The final goods producers buy intermediate goods from the previous group of firms and process these inputs into the products that are offered to households. The prices of final goods are also staggered. The importers buy products from abroad and sell them to households at prices that are directly linked to the foreign price level, but also staggered in a Calvo manner. The monetary authority sets the nominal interest rate following a Taylor-type monetary rule that includes both inflation and the price level. The decision to include price level targeting was motivated by Holden [5], who incorporated the price level into the monetary rule to improve the stability of the algorithm. The foreign economy is modelled as a closed version of the domestic economy.
The original version of the model included a UIP condition with the real exchange rate and a risk premium that depends on the net foreign position of the country. For our purposes we changed the condition and related equations to use the nominal exchange rate, and the condition has the following form

r_t − r_t^* = E_t e_{t+1} − e_t − risk_t,

where r_t and r_t^* denote the domestic and foreign nominal interest rates, e_t denotes the nominal exchange rate, and risk_t is the risk premium, which is modelled as

risk_t = μ_{risk,t} + χ · nfa_t,

where μ_{risk,t} is a shock to the risk premium modelled as an AR(1) process, nfa_t denotes the net foreign asset position and the parameter χ is the risk premium elasticity. The second UIP condition specification follows Montoro, Ortiz [9] and takes the form

E_t e_{t+1} − e_t = r_t − r_t^* + γσ²($_t^* + $_t^{cb*}),

where $_t^* and $_t^{cb*} denote the amounts of foreign bonds sold and purchased on the FX market by foreign investors and the central bank, γ is the risk aversion parameter of FX market dealers and σ denotes the standard deviation of the nominal exchange rate. Capital inflows $_t^* are modelled as an AR(1) process. The amount of bonds purchased or sold by the central bank in this operation is determined by the equation

$_t^{cb*} = χ_T (e_t − e^T) + χ_e (e_t − e_{t−1}) + χ_q q_t + ε_t^{cb}.

The monetary authority may intervene under different regimes. The combination of the parameters (χ_T = 1, χ_e = 0, χ_q = 0) implies exchange rate targeting, the combination (χ_T = 0, χ_e = 1, χ_q = 0) means so-called “leaning against the wind”, i.e. smoothing of the exchange rate. The combination (χ_T = 0, χ_e = 0, χ_q = 1) describes the case of intervention on the real exchange rate. The calibration (χ_T = 0, χ_e = 0, χ_q = 0) implies that only unexpected intervention shocks influence the foreign exchange rate.
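As a small illustration of the intervention rule above, the sketch below evaluates the central bank's bond purchases $_t^{cb*} under the different parameter regimes. The numerical values of the exchange rate gap, its change, the real exchange rate and the intervention shock are purely hypothetical; the code only demonstrates how the (χ_T, χ_e, χ_q) switches select the targeting, "leaning against the wind" and real-exchange-rate regimes.

```python
def cb_fx_purchases(e_t, e_prev, e_target, q_t, eps_cb,
                    chi_T=0.0, chi_e=0.0, chi_q=0.0):
    """Central bank's foreign bond purchases implied by the intervention rule
    $_t^{cb*} = chi_T (e_t - e^T) + chi_e (e_t - e_{t-1}) + chi_q q_t + eps_cb."""
    return chi_T * (e_t - e_target) + chi_e * (e_t - e_prev) + chi_q * q_t + eps_cb

# Hypothetical state of the economy (log deviations from steady state)
state = dict(e_t=0.02, e_prev=0.015, e_target=0.0, q_t=-0.01, eps_cb=0.0)

regimes = {
    "exchange rate targeting":  dict(chi_T=1.0),
    "leaning against the wind": dict(chi_e=1.0),
    "real exchange rate rule":  dict(chi_q=1.0),
    "only intervention shocks": dict(),            # chi_T = chi_e = chi_q = 0
}
for name, pars in regimes.items():
    print(f"{name:>26}: {cb_fx_purchases(**state, **pars):+.4f}")
```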
To distinguish between the two specifications of the model in the text, we call the original version of the model RISK, and the model with the UIP condition following [9] is denoted as MODUIP. The parameters of both model specifications were estimated using seven seasonally adjusted time series of the Czech economy and the Eurozone from the period between 2000Q1 and 2015Q4. Both the domestic and foreign economies were represented by the time series of the percentage change in real output, the inflation rate measured by
consumer prices and the nominal interest rate (3M PRIBOR and 3M EURIBOR, respectively). The link between the domestic and foreign economy is represented by the nominal CZK/EUR exchange rate depreciation. The estimation of the model was performed with the use of the Dynare toolbox; all calculations were performed in Matlab.
3
Foreign exchange rate intervention
The foreign exchange rate intervention may be modelled in different manners. First attempts to include exchange rate targeting into the standard DSGE model framework considered FX rate targeting as an additional monetary policy objective. The FX rate was incorporated into the decision function and the monetary authority used interest rate movements as a tool to meet its goals. Inflation, output growth and the exchange rate are operational targets. This approach cannot be used in the case of the Czech intervention, because the exchange rate is identified as a tool, not a target. Beneš et al. in [2] suggested a model structure with two separate decision functions, one for the interest rate and one for the exchange rate. The exchange rate as well as the interest rate are considered monetary policy tools and interact with each other through the UIP condition. In this approach, both decision rules are equivalent. Nevertheless, this approach is not in line with the policy of the Czech National Bank. The exchange rate is not considered a substitute for the interest rate; the central bank does not adjust the FX rate following any decision rule to meet its operational targets. Montoro, Ortiz constructed a model of interventions in the FX rate based on the FX market approach included in the UIP condition described above. Their approach was not intended to model the use of the FX rate as a monetary policy tool; however, it is close to the policy of the CNB and managed floating. The approach described above was successfully used for modelling the intervention of the CNB by Malovaná in [8]; nevertheless, it has some disadvantages. First, the FX targeting rule is symmetrical, i.e. we are not able to study the asymmetric commitment that has recently been used by the CNB. Second, the changes in the purchase and sale orders of foreign bonds have only a limited influence on the FX rate. Third, we are not able to use this rule to set the target at an exact value. We decided to take advantage of the FX market based UIP condition, but we use it only to model unexpected changes in the FX rate. The asymmetric commitment is modelled as a constraint in the model and we use the algorithm from Holden [4] and [5]. The nature of the shadow shock corresponds to expected interventions in the FX rate performed by the CNB.
4
Nonlinear constraint
Holden and Paetz in [4] introduced and derived a method that is based on so-called shadow price shocks. The shadow price shock SP is incorporated into the equation of the constrained variable and it drives this variable from negative values back to the zero level. The constrained model equation takes the form

x_t = μ_x + φ_{−1} y_{t−1} + φ_0 y_t + φ_{+1} E_t y_{t+1} − μ_y (φ_{−1} + φ_0 + φ_{+1}) + SP.

The interesting feature of the shock is that it is expected. The participants in the economy know that the constrained variable cannot violate the bound. In the case of the ZLB, the shadow shock may be interpreted as a positive expected monetary innovation, i.e. the interest rate is higher than in the unbounded case and that has an influence on the economy. The constraint imposed on the exchange rate has a similar meaning: each time the bound on the nominal exchange rate binds, there is an expected depreciation shock that holds the FX rate above the declared value. The algorithm saves the values of SP and the impulse responses of the model variables to all values of this shock. The impulse response to the model's shock under the constraint may be written as irf_x = μ_x + v + αM, where v denotes the impulse response to this shock without the constraint, M is the matrix of the impulse response functions to the values of the shadow price shock and α is the magnitude of these shocks. To find the magnitude α, Holden solves the quadratic programming problem

α* = argmin_{α ≥ 0, μ* + v* + M*α ≥ 0} { α′(μ* + v*) + (1/2) α′(M* + M*′)α }.
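A minimal numerical sketch of this quadratic programming step is given below, assuming a generic solver (scipy's SLSQP) rather than the DynareOBC implementation; the vectors mu and v and the matrix M are placeholders that would come from the unconstrained impulse responses of the model.

```python
import numpy as np
from scipy.optimize import minimize

def shadow_shock_magnitudes(mu, v, M):
    """Holden-Paetz style problem: choose alpha >= 0 minimising
    alpha'(mu + v) + 0.5 alpha'(M + M')alpha, subject to the constrained
    path mu + v + M alpha staying non-negative."""
    q = mu + v
    H = M + M.T
    obj = lambda a: a @ q + 0.5 * a @ H @ a
    cons = ({"type": "ineq", "fun": lambda a: q + M @ a},)   # bounded path >= 0
    bounds = [(0.0, None)] * len(q)                           # alpha >= 0
    res = minimize(obj, np.zeros(len(q)), method="SLSQP",
                   bounds=bounds, constraints=cons)
    return res.x

# Illustrative 3-period example with an arbitrary path and response matrix
mu = np.zeros(3)
v = np.array([-0.4, -0.1, 0.2])                  # unconstrained path dips below zero
M = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.3],
              [0.1, 0.3, 1.0]])
alpha = shadow_shock_magnitudes(mu, v, M)
print("shadow shock magnitudes:", np.round(alpha, 4))
print("constrained path:", np.round(mu + v + M @ alpha, 4))
```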
Holden extended this algorithm into the toolbox DynareOBC described in [5]. This toolbox enables us to simulate and estimate models with constraints. The aim of the author was to develop an easily applicable
toolbox and to derive the conditions for existence and uniqueness of the solution of a model with non-linear constraints. An unfortunate feature of the DynareOBC tool is its high computational complexity. The algorithm requires setting an assumed horizon of binding constraints and solves the quadratic programming problem for each time period. For larger models this requires computational resources with sufficient capacity. The Czech academic community may use Metacentrum (https://metavo.metacentrum.cz/).
5
Results
The limited space allows us to show only a small fraction of the results; nevertheless, we try to choose examples that both illustrate the use of the tool and are relevant for the recent economic situation. In Figure 1, we compare the simulated trajectories of the models with bounded interest rates and different specifications of the UIP condition. The nature of the algorithm enables us to interpret the zero lower bound as a positive monetary shock that struck the economy independently of the decisions of the monetary authority.
Figure 1 Selected trajectories of a model with a constraint on the interest rate. Specification MODUIP in the left column, specification RISK in the right column.

In both versions of the model, the constraint on the interest rate pushes the nominal exchange rate towards greater appreciation. The combination of the FX rate appreciation and the positive shadow monetary shock mutes the output growth and deepens the recessions. There is a noticeable impact on the foreign goods (imports) inflation rate: the bound is linked to a lower imports inflation rate. These results provide an explanation of the decision of the central bank to depreciate the FX rate. Domestic and foreign interest rates reached the zero lower bound in the third quarter of 2012. The domestic economy should be stimulated both by the lower interest rate and the depreciation of the exchange rate, but the floor that constrains the domestic interest rate pushes the exchange rate to further appreciation. The central bank decided to intervene
and depreciated the exchange rate to compensate for the influence of the constraint. The main difference between the two specifications of the model is the divergence of the nominal exchange rate when the interest rate is bounded in the case of the RISK specification. That is caused by the net foreign position (not depicted), which is deviated by the shadow shock and returns to the steady state very slowly.

Figure 2 Selected trajectories of a model with a constraint on the nominal exchange rate. Specification MODUIP in the left column, specification RISK in the right column.

The simulation of the asymmetric commitment is depicted in Figure 2. The specification MODUIP shows some interesting results. The nominal interest rates are linked to the exchange rate through the UIP condition and that causes a rise in the interest rates in the case of a bounded FX rate. The rise in the interest rates may also be interpreted as a reaction of the central bank to the output rate, which is positively stimulated by the weaker exchange rate. That is described in [7] as a consequence of depreciation, but it is not in line with economic reality. The persistence of the negative effect of the interest rate can be seen after period 50, when the exchange rate depreciates and the unbounded case of output growth is higher than the case related to the commitment. Another interesting result is the trajectory of imports inflation, which is significantly muted by the commitment, especially in the negative range. That suggests that the commitment may be suitable to use when the risk of deflation is caused by foreign factors. The results for the specification RISK are similar, but less significant. In our further research we would like to continue to use the toolbox DynareOBC to simulate the synergic effect of the presence of the Zero Lower Bound on domestic interest rates in combination with other constraints. The primary goal is to combine the ZLB and the asymmetric commitment; the next could be the combination of the domestic and foreign ZLB, as the Eurozone struggles with zero rates.
6
Conclusion
In the paper we deal with the use of the nominal exchange rate as an unconventional instrument of monetary policy, when the interest rates are bounded by zero and the central bank needs to ease the monetary conditions. The use of a new modelling tool enables us to model the asymmetric commitment as a nonlinear constraint in a model. We compared two different specifications of the UIP condition in the model and the results suggested that the UIP condition based on the FX market solution provides more satisfactory results. The results showed the impact of the Zero Lower Bound on interest rates on the economy and showed that the bound is equivalent to a monetary restriction. In such a case, foreign exchange rate depreciation compensates for the impact of the bound. The analysis of the simulated trajectories in the case of the bounded foreign exchange rate suggested that the asymmetric commitment may have the desired implications for the development of the economy, especially when used to fight deflationary pressures from the foreign sector.
Acknowledgements This work is supported by funding of specific research at the Faculty of Economics and Administration, project MUNI/A/1040/2015. This support is gratefully acknowledged. Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042), is greatly appreciated.
References
[1] Alpanda, S., Kotz, K. and Woglom, G.: The role of the exchange rate in a New Keynesian DSGE model for the South African economy. South African Journal of Economics 78 (2010), 170–191.
[2] Beneš, J., Berg, A., Portillo, R. A. and Vávra, D.: Modeling Sterilized Interventions and Balance Sheet Effects of Monetary Policy in a New-Keynesian Framework. Working Paper WP/13/11, International Monetary Fund, 2013.
[3] Eggertson, G. B. and Woodford, M.: The Zero Bound on Interest Rates and Optimal Monetary Policy. Brookings Papers on Economic Activity 34, 1 (2003), 139–235.
[4] Holden, T. and Paetz, M.: Efficient simulation of DSGE models with inequality constraints. Discussion Paper 1612, School of Economics, University of Surrey, 2012.
[5] Holden, T.: Existence, uniqueness and computation of solutions to dynamic models with occasionally binding constraints. Component of the DynareOBC tool, https://github.com/tholden/dynareOBC, School of Economics, University of Surrey, 2016.
[6] Justiniano, A. and Preston, B.: Monetary policy and uncertainty in an empirical small open economy model. Working Paper Series WP-09-21, Federal Reserve Bank of Chicago, 2010.
[7] Lízal, L. and Schwarz, J.: Foreign Exchange Interventions as an (Un)Conventional Monetary Policy Tool (October 2013). BIS Paper No. 73i.
[8] Malovaná, S.: The Effectiveness of Unconventional Monetary Policy Tools at the Zero Lower Bound: A DSGE Approach. Master thesis, Karlova Univerzita, Fakulta sociálních studií, 2014.
[9] Montoro, C. and Ortiz, M.: Foreign Exchange Intervention and Monetary Policy Design: A Market Microstructure Analysis. Third Draft, 2014.
[10] Svensson, L. E. O.: The Zero Bound in an Open Economy: A Foolproof Way of Escaping from a Liquidity Trap. Monetary and Economic Studies 19, S1 (2001), 277–312.
Long-term unemployment in terms of regions of Slovakia

Kvetoslava Surmanová1, Marian Reiff2

Abstract. The paper focuses on the analysis of long-term unemployment in the Slovak regions, oriented on the regional disparities at the NUTS 3 level, using annual data from the regional database of the Slovak Statistical Office for the period 2001–2014. Unemployment, as one of the key indicators of the economy, contributes to creating the picture of the level of development and living standards in the country. Since regional disparities in Slovakia are large, this issue is highly topical. In recent years, it has been necessary to address not only the total number of unemployed in Slovakia, but to look particularly at the situation in the regions. The highest rates of long-term unemployment are in the districts of central and eastern Slovakia. Despite the fact that the reduction of the unemployment rate belongs to the basic economic policy framework, it is difficult to ensure the minimization of this negative phenomenon. For the analysis we use the methodology of the classical linear model and the methodology of seemingly unrelated regressions (SUR model).

Keywords: unemployment, Okun's law, SUR model, region.

JEL Classification: E24, C30
AMS Classification: 62P20
1
Introduction
Unemployment is one of the long-term objectives pursued in the field of economic policy. The necessity to deal with this issue arises from the fact that unemployment plays an important role in society and has not only an economic, but also a social dimension. The current effort of economists not only to understand the problem, but even to find a way of eliminating unemployment, or at least minimizing its consequences, is therefore justified. The paper is organized in chapters. After the introduction in the first chapter, the second chapter deals with the theoretical basis of Okun's law. Subsequently, the third chapter briefly points out the regional differences in the selected indicators. The fourth chapter describes the methodological apparatus that is used in the estimation of the Okun's relationship. Estimation results are presented and interpreted in chapter five. In this part of the article we compare the results achieved at the NUTS 3 level with the results obtained by the ordinary least squares method (OLS) on data for the Slovak Republic as a whole. The paper closes with concluding remarks.
2
Okun's Law – theoretical background
Economic theory knows several approaches focused on unemployment in the national economy. Examples include Pigou's theory of unemployment, the classical theory of unemployment, the Keynesian theory of unemployment, Okun's law, the theory of the Phillips curve, etc. (for more details see [5], [8], [9], [10]). All conceptions are well known and are supported by the academic literature (see e.g. [7], [11] and [13]). In [12] the authors attempted to find some simple original dynamic models of the labor market equilibrium in order to derive a Phillips curve. In [4] Okun's law is used and age- and country-specific Okun coefficients are estimated for five different age cohorts. In [2] Okun's law is examined in European countries by distinguishing between the permanent and the transitory effects of output growth upon unemployment and by focusing on the effect of labour market protection policies. In [1] the validity of Okun's law is examined for the V4 countries. In [6] the author focused on two questions. First, is Okun's law a reliable and stable relationship? And second, is this law useful for forecasting? In this paper, we will focus on one of the approaches mentioned above, namely Okun's law. The traditional interpretation of Okun's law in terms of macroeconomic theory captures the relationship between economic output and unemployment. It is not only a theoretical economic formulation, it is mainly supported by empirical research, which was presented in 1962 by Arthur M. Okun (Okun was an advisor to President Kennedy in the 1960s). He focused on the relationship between unemployment and real output. Okun's law captures the negative relationship between these two indicators on the background of a simple linear model. This theory was
1 University of Economics in Bratislava, Department of Operation Research and Econometrics, Dolnozemská cesta 1, 852 35 Bratislava, Slovakia, surmanova@yahoo.com.
2 University of Economics in Bratislava, Department of Operation Research and Econometrics, Dolnozemská cesta 1, 852 35 Bratislava, Slovakia, marian.reiff@euba.sk.
gradually modified by other economists, who have implemented into the original relationship different factors that Okun had not considered. Although several versions are known, the paper focuses on the incremental (differential) version of Okun's law. This captures the quarterly change in the unemployment rate in relation to the change in the growth of output, which can be written as follows:

Δu = α + β · gGDP + ε_t,     (1)

where Δu = u_t − u_{t−1} and gGDP = (GDP_t − GDP_{t−1}) / GDP_{t−1}. Δu is the change in the unemployment rate, gGDP is the growth of GDP and ε_t is a stochastic random error. The parameter β is known as the Okun's coefficient and it is assumed to have a negative sign (see for example [6] or [8]). The ratio (−α/β) represents the GDP growth rate consistent with a stable unemployment rate; it equally reflects how quickly the economy must grow in order to maintain a constant level of unemployment.
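As a small numerical illustration of relation (1), the sketch below estimates the Okun's coefficient by OLS on a toy data set and derives the growth rate −α/β that keeps unemployment stable. The data are synthetic and only illustrate the mechanics; the paper's own estimates are produced in EViews on Slovak regional data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic annual data: GDP growth (in %) and the implied change in unemployment
g_gdp = rng.uniform(-4, 8, size=30)
du = 1.7 - 0.30 * g_gdp + rng.normal(0, 0.4, size=30)   # "true" alpha = 1.7, beta = -0.30

# OLS estimate of du = alpha + beta * g_gdp + eps
X = np.column_stack([np.ones_like(g_gdp), g_gdp])
alpha_hat, beta_hat = np.linalg.lstsq(X, du, rcond=None)[0]

print(f"alpha = {alpha_hat:.3f}, beta (Okun's coefficient) = {beta_hat:.3f}")
print(f"growth keeping unemployment stable: {-alpha_hat / beta_hat:.2f} %")
```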
3
Regional disparities at NUTS 3 level
By joining the EU, Slovakia committed itself to the implementation of regional policy. The aim of EU regional policy is reducing the disparities between the levels of development of individual regions and mitigating the backwardness of the less developed regions of the country. In view of the above, the focus is on monitoring the disparities between regions in the amount of GDP and the unemployment rate. Slovakia is at the NUTS 3 level divided into eight regions (BA – Region of Bratislava, TT – Region of Trnava, TN – Region of Trenčín, NT – Region of Nitra, ZI – Region of Žilina, BB – Region of Banská Bystrica, PO – Region of Prešov and KE – Region of Košice), and these territorial units are very heterogeneous and hard to compare, not just in terms of the level of convergence, but also in terms of regional disparities. Therefore, by analyzing them, we can expect significant differences. For this reason, analysis at the regional level provides more detailed and richer results compared to analysis conducted at NUTS 1 – the national level. We used the data from the web page of the Statistical Office of the Slovak Republic [14]. We can divide the unemployment rate in the regions of Slovakia into two groups (see Figure 1). The first group, with a lower rate of unemployment, includes the regions BA, TT, TN, ZI and NT. The second group includes the eastern regions of the country, namely the PO, KE and BB regions. The high value of the sample variance (see Table 1), as well as the graphical representation of the unemployment rate over time, points to the highest variability in the NT region (22.64). Compared with the region with the lowest sample variance, BA (2.26), it is up to ten times higher.
Figure 1 Unemployment rate in the regions of the SR over time (in %)
2001–2015             U_BA    U_TT    U_TN    U_NT    U_ZI    U_BB    U_PO    U_KE
Median                4.63    8.37    9.56    11.76   10.91   18.86   17.75   17.30
Standard Deviation    1.50    3.03    2.47    4.76    3.04    3.13    3.43    3.69
Sample Variance       2.26    9.20    6.08    22.64   9.26    9.78    11.74   13.61
Range                 4.19    11.22   8.20    16.02   10.83   9.67    11.91   12.53
Minimum               1.98    4.29    4.50    7.10    5.55    14.10   12.05   13.02
Maximum               6.17    15.51   12.70   23.12   16.38   23.77   23.96   25.55
Table 1 Descriptive statistics of the unemployment rate in the period 2001–2015

With regard to regional differences it is also necessary to mention wages. The amount of wages to some extent determines the need to be employed. The monthly salary varies across the regions: wages are the highest in the BA region, while the region with the lowest wage is PO (see Figure 2).
Figure 2 Development of average nominal wages in the regions over time (in EUR)

For the demonstration of the maturity of the regions, a quintile display of labor productivity is used (see Figure 3). We have chosen the first and last observations available at the regional level, i.e. 2001 and 2013. From Figure 3 it is possible to observe the dynamics in the development of labor productivity in terms of the quintile breakdown of the regions into five groups within the study period. In two regions (TT and TN), the labor productivity per employed person has declined. In two regions (ZI and BB), however, it has increased. The PO region remains the weakest region even after the 13-year period, characterized by the lowest labor productivity.
Figure 3 The dynamics of labor productivity in the period 2001 - 2013
4
SUR model
The SUR model is created by joining multiple single-equation models into a single multi-equation model, while it may seem that this link is at first sight "seemingly unrelated". If there is a correlation between the random components of the individual equations for the same observation, cov(u_{t1}, u_{t2}) ≠ 0, then the model is estimated as seemingly
unrelated regressions, using the SUR method of estimating the model parameters (for more details see e.g. [3]). Analytically this type of multi-equation model can be written as follows:

\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_g \end{bmatrix} =
\begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_g \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_g \end{bmatrix} +
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_g \end{bmatrix}     (2)
where y is a block vector of dependent variables with dimension (g × n × 1), X is a block matrix of independent variables with dimension (g × n × k), β is a block vector of parameters with dimension (g × k), u is a block vector of stochastic errors with dimension (g × n × 1), g is the number of cross-section units, k is the number of determinants and n is the number of observations in time. Equation (2) displays a system of equations in which the explanatory variables contained in one equation do not act in a different equation of the system at the same time.
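A minimal sketch of how such a system can be estimated is given below, assuming the SUR estimator from the third-party linearmodels package (the paper itself uses EViews); the regional data file and column names are hypothetical.

```python
import pandas as pd
from linearmodels.system import SUR

# Hypothetical panel: one column of unemployment changes (du_<region>) and one
# column of GDP growth (g_<region>) per region, annual observations in rows.
data = pd.read_csv("okun_regions.csv")
regions = ["BA", "TT", "TN", "NT", "ZI", "BB", "PO", "KE"]

# One Okun equation per region: du_r = alpha_r + beta_r * g_r + u_r
equations = {
    r: {
        "dependent": data[f"du_{r}"],
        "exog": pd.DataFrame({"const": 1.0, f"g_{r}": data[f"g_{r}"]}),
    }
    for r in regions
}

results = SUR(equations).fit(cov_type="unadjusted")   # GLS across equations
print(results.summary)
```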
5
Empirical Results
The empirical analysis of the regions of Slovakia was carried out on data for the period 2001–2014. As 14 observations are available for each region, the multi-equation model of seemingly unrelated regressions (SUR model) was used as an appropriate method for the analysis. This methodology is used relatively rarely, since in the analysis of panel data, which combine spatial and time information, the panel data methodology is mostly used. The whole analysis was carried out in the software EViews. In the estimation of the differential version of Okun's law (1) we use data on the unemployment rate (in %), more specifically its first difference. The determinant is the relative indicator of real GDP growth, which we obtained from the conversion of nominal GDP at the regional level and GDP at constant prices for the whole of Slovakia. The above presented theory of Okun's law and the SUR model were used to detect the Okun's coefficients of the eight Slovak NUTS 3 regions. We estimate two versions of the SUR model: the first SUR model with all regions and the second with seven regions, without the BA region. The BA region was excluded because it shows extreme levels of wages. We monitored whether the exclusion of the BA region from the model causes significant changes. Table 2 contains the results of the regression analysis. Our expectations regarding the impact of excluding the BA region from the model, however, did not materialize. Neither the Okun's coefficient (β) nor the required rate of GDP growth (−α/β), which should ensure a stable unemployment rate, differ significantly between the two models.
                          SUR_8 regions                     SUR_7 regions (without BA)
Region                 β          −α/β      R²            β          −α/β      R²
BA                     −0.04***   9.23      0.32
TT                     −0.09***   2.00      0.48          −0.10***   2.62      0.51
TN                     −0.11***   5.08      0.44          −0.12***   5.17      0.46
NT                     −0.21***   3.04      0.47          −0.19***   2.72      0.45
ZI                     −0.13***   5.12      0.47          −0.14***   5.35      0.50
BB                     −0.10*     1.47      0.26          −0.09*     1.15      0.25
PO                     −0.11***   3.80      0.20          −0.12**    4.11      0.21
KE                     −0.22***   2.89      0.59          −0.19***   2.38      0.54
Average                −0.13      4.08      0.40          −0.12      3.36      0.42
Breusch-Pagan test                203.69                             158.36
χ² critical                       41.34                              32.67
Portmanteau test                  63.68                              43.67
χ² critical                       83.68                              66.34

*** parameter is significant at the 1 %, ** at the 5 %, * at the 10 % level of significance.

Table 2 The results of the estimation of the Okun's SUR model
For both estimated versions of the model, the Breusch-Pagan test validates the hypothesis of correlation of the random components of two different equations at the same time t. Based on the Portmanteau test, the hypothesis of correlation of the random components within each of the eight equations across two different periods of time was rejected. Values in the column (−α/β) denote what growth rate in each of the regions is needed to maintain a stable unemployment rate in the region. It means that the economy should grow by more than 5% in the BA (9.23%), TN (5.08%) and ZI (5.12%) regions (model SUR_8). For the second version of the estimate, SUR_7, economic growth must exceed 5% in the TN (5.17%) and ZI (5.35%) regions; the results are similar to SUR_8. The lowest growth rates maintaining stable unemployment are in the BB, TT and KE regions, with values of about 2%. The Okun's coefficient itself can be interpreted as follows. According to Okun's law, if the growth rate of GDP is 4.8% in the PO region and 6.08% in the TN region, unemployment in both regions will decrease equally by 0.11%. The low value of the Okun's coefficient in the Bratislava region indicates a low sensitivity of changes in unemployment to GDP growth. The highest sensitivity of the change in unemployment is shown by the results for the NT and KE regions. For further comparison, equation (1) is estimated by the OLS method on aggregate annual data for the Slovak Republic for the period 2001–2014. If the growth rate of GDP were zero, the unemployment rate would rise by 1.6795%. Maintaining a stable rate of unemployment requires GDP growth of 5.82%. According to Okun's law, if the GDP growth rate is 6.82%, the unemployment rate will drop by 0.303%. The linear regression result is captured in Figure 4. The estimated parameters and the model are significant at the 5 % level of significance. The model shows no autocorrelation of the random error and R² = 0.7252.
Figure 4 Estimate of Okun's law with the OLS method for the whole Slovak Republic (fitted regression line: Δu = −0.3038 · gGDP + 1.6795)
6
Conclusion
In the paper we test the hypothesis of Okun's law in the NUTS 3 regions of the Slovak Republic during the 2001–2014 period using the model of seemingly unrelated regressions. Since analysis at the regional level encounters problems with a sufficient and up-to-date database, quantitative analysis has limited capabilities. In analysing the Okun's law on regional data from Slovakia, we have come to the same conclusion in terms of the negative value of the Okun's coefficient as [6], [8] and others. The results in the individual regions reveal a significant disparity. There are regions in which the unemployment rate reacts more sensitively to economic growth and regions where the impact is negligible. The results of the analysis indicate that the coefficient has the lowest values in the BA and KE regions and the highest values in the NT region; this fact implies that the labour market is rigid. Higher values of the Okun's coefficient refer to a stronger link between fluctuations in unemployment and fluctuations in the product of the surveyed regions. By comparing the results for the whole country and the results on a regional basis, we can conclude that the regional dimension of the unemployment rate is eye-catching, because the analysis of differences in the regions can serve to a large extent in the implementation of proper and targeted regional social policy. Since the unemployment rate is also influenced by other determinants, it would be appropriate to extend the basic model with other information that would lead to an increase in the explained variability of the model. An equally interesting extension of the analysis could be a dummy variable that would correct the disturbances which occurred as a result of the crisis in 2009.
Acknowledgement This work was supported by the Grand Agency of Slovak Republic â&#x20AC;&#x201C; VEGA grand No. 1/0444/15 "Econometric Analysis of Production Possibilities of the Economy and the Labour Market in Slovakia".
References
[1] Boďa, M. et al.: (A)symetria v Okunovom zákone v štátoch Vyšehradskej skupiny. Politická ekonomie 63, 6 (2015), 741–758.
[2] Economou, A., Psarianos, I. N.: Revisiting Okun's Law in European Union Countries. Journal of Economic Studies 43, 2 (2016), 275–287.
[3] Greene, W. H.: Econometric Analysis (Seventh ed.). Pearson Prentice-Hall, Upper Saddle River, 2012.
[4] Hutengs, O., Stadtmann, G.: Don't trust anybody over 30: Youth unemployment and Okun's law in CEE countries. Bank i Kredyt 45, 1 (2014), 1–16.
[5] Keynes, J.: The General Theory of Employment, Interest and Money. Harcourt, London, 1936.
[6] Knotek, S. E.: How Useful is Okun's Law? Federal Reserve Bank of Kansas City, Kansas City, 2007.
[7] Mouhammed, A. H.: Important Theories of Unemployment and Public Policies. Journal of Applied Business and Economics 12, 5 (2011), 100–110.
[8] Okun, A. M.: Potential GNP: Its Measurement and Significance. Proceedings of the Business and Economics Statistics Section, American Statistical Association, Alexandria, Virginia, 1962.
[9] Pigou, A. C.: The Theory of Unemployment. Macmillan, London, 1933.
[10] Phillips, A. W.: The Relation between Unemployment and the Rate of Change of Money Wages in the United Kingdom. Economica 25, 100 (1958), 283–299.
[11] Sweezy, P.: Professor Pigou's Theory of Unemployment. The Journal of Political Economy 42, 6 (1934), 800–811.
[12] Szomolányi, K. et al.: The Relation between the rate of change of money wage and unemployment rates in the frame of the microeconomic dynamics. Ekonomické rozhľady 41, 1 (2012), 44–65.
[13] Trehan, B.: Unemployment and Productivity. Federal Reserve Bank of San Francisco Economic Letter 28 (2001), 1–3.
[14] www.statistics.sk.
The Effect of Terms-of-Trade on Czech Business Cycles: A Theoretical and SVAR Approach

Karol Szomolányi1, Martin Lukáčik2, Adriana Lukáčiková3

Abstract. Empirical and theoretical studies imply contradictory results about the terms-of-trade effect on business cycles. Empirically, the terms-of-trade have a negligible effect on short-term economic fluctuations. But theoretical models – both Keynesian and real business cycle models – predict a significant influence of the terms-of-trade on business cycles. To lower the significance of terms-of-trade shocks, theoretical models have been extended by introducing non-tradable goods into the economy. However, this modification does not help and the theoretical impact of the terms-of-trade remains too high. Using Czech data and the theoretical MXN model, we confirm that the model predictions of the terms-of-trade impact do not match the observations gathered from the SVAR model estimate of the Czech cyclical components of the terms-of-trade, trade balance-to-output ratio, output, consumption, gross investment and real exchange rate.

Keywords: Terms-of-trade, business cycle, SVAR model, MXN model.

JEL Classification: C44
AMS Classification: 90C15
1
Introduction
The terms-of-trade are theoretically a significant source of business cycles and they cause shifts in the trade balance. However, different theoretical and empirical studies lead to different results regarding the short-run terms-of-trade impact on output and on the trade balance. There are two theoretical effects of the terms-of-trade impact on the trade balance. Harberger [6] and Laursen and Metzler [8] used the traditional Keynesian model to show that the trade balance grows with the terms-of-trade. On the contrary, the dynamic optimizing models of Obstfeld [10] and Svensson and Razin [12] lead to the conclusion that the positive effect of the terms-of-trade on the trade balance is weaker the more persistent a terms-of-trade shock is. Uribe and Schmitt-Grohé [15] showed that in a small open economy real business cycle model (or dynamic stochastic general equilibrium model) with capital costs, sufficiently permanent terms-of-trade shocks have a negative impact on the trade balance. Empirical studies of Aguirre [1], Broda [4] and Uribe and Schmitt-Grohé [15] surprisingly do not support a statistically significant impact of the terms-of-trade on output in poor and emerging countries. In general, the authors confirm the intuition that the more open the economy is, the higher the effect of the terms-of-trade on the trade balance is. This result may not be achieved in theoretical general equilibrium models even if non-tradable goods are considered. Uribe and Schmitt-Grohé [15] developed the MXN model with importable (M), exportable (X) and non-tradable (N) goods to show that the existence of non-tradable goods should "reduce the importance of terms-of-trade shocks". However, the authors state that the theoretical model still overestimates the significance of the terms-of-trade impact on business cycles. In this paper we confirm this theoretical and empirical mismatch. In the first part of the paper we present and estimate the empirical SVAR model of the terms-of-trade impact on Czech business cycles to state that the influence of the terms-of-trade is very small. In the second part we present and calibrate the MXN model of Uribe and Schmitt-Grohé [15] using Czech observations to state that the theoretical model predicts a relatively high impact of the terms-of-trade on business cycles.
2
Empirical Model
First, we used vector autoregressive (VAR) models for our analysis. Every endogenous variable is a function of all lagged endogenous variables in the system in VAR models. See Lütkepohl [9] for more details about them. The mathematical representation of the VAR model of order p is:

y_t = A_1 y_{t−1} + A_2 y_{t−2} + ... + A_p y_{t−p} + e_t     (1)

1 University of Economics, Department KOVE, Dolnozemská cesta 1, Bratislava, Slovakia, szomolan@euba.sk.
2 University of Economics, Department KOVE, Dolnozemská cesta 1, Bratislava, Slovakia, lukacik@euba.sk.
3 University of Economics, Department KOVE, Dolnozemská cesta 1, Bratislava, Slovakia, istvanik@euba.sk.
where y_t is a k-vector of endogenous variables; A_1, A_2, …, A_p are matrices of coefficients to be estimated; and e_t is a vector of innovations that may be contemporaneously correlated but are uncorrelated with their own lagged values. The VAR model (1) can be interpreted as a reduced form model. A structural vector autoregressive (SVAR) model is the structural form of the VAR model and is defined as:

A y_t = B_1 y_{t−1} + B_2 y_{t−2} + ... + B_p y_{t−p} + B u_t     (2)

A SVAR model can be used to identify shocks and trace these out by employing impulse response analysis and forecast error variance decomposition through imposing restrictions on the used matrices. Uribe and Schmitt-Grohé [15] proposed a specification of the SVAR through which we can determine the responses to a terms-of-trade impulse:

A \begin{bmatrix} f_t \\ tb_t \\ y_t \\ c_t \\ i_t \\ rer_t \end{bmatrix} =
\sum_{i=1}^{p} B_i \begin{bmatrix} f_{t-i} \\ tb_{t-i} \\ y_{t-i} \\ c_{t-i} \\ i_{t-i} \\ rer_{t-i} \end{bmatrix} +
B \begin{bmatrix} u_t^f \\ u_t^{tb} \\ u_t^y \\ u_t^c \\ u_t^i \\ u_t^{rer} \end{bmatrix}     (3)
where f is the relative cyclical component of the terms of trade, tb is the relative cyclical component of the trade balance-to-output ratio, y is the relative cyclical component of output, c is the relative cyclical component of consumption, i is the relative cyclical component of investment and rer is the relative cyclical component of the real exchange rate. The u_t^f, u_t^{tb}, u_t^y, u_t^c, u_t^i and u_t^{rer} are the structural shocks of the given variables. We estimated the parameters of the SVAR specification (3) using the Amisano and Giannini [3] approach. The class of models may be written as:
(4)
The structural innovations ut are assumed to be orthonormal, i.e. its covariance matrix is an identity matrix. The assumption of orthonormal innovations imposes the following identifying restrictions on A and B: AΣe A = BB T
T
(5)
Noting that the expressions on both sides of (5) are symmetric, this imposes k(k+1)/2 = 21 restrictions on the 2k2 = 72 unknown elements in A and B. Therefore, in order to identify A and B, we need to impose (3k2-k)/2 = 51 additional restrictions. The matrix A of unrestricted specification is a lower triangular matrix with unit diagonal (15 zero and 6 unity restrictions) and matrix B is a diagonal matrix (30 zero restrictions) in this just-identified specification. Other tested restrictions are imposed on elements of matrix A (matrix of contemporary effects of endogenous variables), which means that specification becomes over-identified and testable. The selected lag of model (3) is validated by sequential modified likelihood ratio test statistic and information criteria and by the LM test for autocorrelations. Significant values of serial correlation for lower lags could be a reason to increase the lag order of an unrestricted VAR, but this is not our case. We verified the stability of a VAR model (i.e. whether all roots have modulus less than one and lie inside the unit circle). We estimated the parameters of restricted and unrestricted specifications. Using the logarithm of the maximum likelihood functions of both specifications we calculated the likelihood ratio statistics and verified the significance of restrictions. Tests confirmed no immediate impact of terms-of-trade on output, consumption, investment and exchange rate, trade balance on output, consumption and exchange rate and investment on exchange rate. All tests are explained in Lütkepohl [9] for example. Data for Czech economy are gathered from the Eurostat portal [5]. The responses to the terms-of-trade shock in the Czech Republic are in the Figure 1. As output shock elasticity coefficient is not statistically significant, the improvement in terms-of-trade has no impact on the aggregate activity and the one-quarter delayed output expansion is statistically insignificant. The same result applies to the consumption and real exchange rate. Investment displays a small expansion. On the other hand, the impact of the terms-of-trade shock on trade balance is statistically significant. The 10 % increase in the terms of trade causes a decrease about 2 % in trade balance. The result suggests confirmation of Obstfeld-Svensson-Razin effect rather than Harberger-Laursen-Metzler effect of the terms-of-trade in the Czech Republic.
810
Figure 1 Impulse Response Functions to Terms-of-trade (TOT) Shock in the Czech Republic (panels: responses of TOT, TB, Y, C, I and RER to the TOT shock)
3
Theoretical Model
Uribe and Schmitt-Grohé [15] presented the model with import-able (m), export-able (x) and non-tradable (n) sectors. The presence of non-tradable goods should reduce the importance of terms-of-trade shocks. We consider a large number of identical households with preferences described by the utility function:

U = E_0 Σ_{t=0}^{∞} β^t [ (c_t − (h_t^m)^{ω_m}/ω_m − (h_t^x)^{ω_x}/ω_x − (h_t^n)^{ω_n}/ω_n)^{1−σ} − 1 ] / (1 − σ)     (6)
where c_t denotes consumption and, for sector j ∈ {m, x, n}, h_t^j denotes hours worked in sector j. Sectoral labour supplies are wealth inelastic and the parameters ω_j denote the wage elasticity in sector j. The symbol E_0 denotes the expectations operator conditional on information available in the initial period 0. The parameter σ measures the degree of relative risk aversion. Households maximize the lifetime utility function (6) subject to the budget constraint:

c_t + i_t^m + i_t^x + i_t^n + φ_m (k_{t+1}^m − k_t^m)² + φ_x (k_{t+1}^x − k_t^x)² + φ_n (k_{t+1}^n − k_t^n)² + p_t^τ d_t = p_{t+1}^τ d_{t+1} / (1 + r_t) + w_t^m h_t^m + w_t^x h_t^x + w_t^n h_t^n + u_t^m k_t^m + u_t^x k_t^x + u_t^n k_t^n     (7)
where, for sector j ∈ {m, x, n}, i_t^j denotes gross investment, k_t^j denotes capital, w_t^j denotes the real wage rate and u_t^j is the rental rate of capital in sector j. The quadratic terms of the budget constraint (7) are capital adjustment costs, where φ_j denotes the capital adjustment cost parameter in sector j. The variable p_t^τ denotes the relative price of the tradable composite good in terms of final goods, d_t denotes the stock of debt in period t denominated in units of the tradable composite good and r_t denotes the interest rate on debt held from period t to t + 1. Consumption, investment, wages, rental rates, debt, and capital adjustment costs are all in units of final goods. The capital stock accumulation is given by:

k_{t+1}^j = (1 − δ) k_t^j + i_t^j ;   ∀j ∈ {x, m, n}     (8)

where δ denotes the constant depreciation rate.
There are five types of a large number of identical firms in the economy, which differ according to their output: firms producing final goods, tradable composite goods, import-able goods, export-able goods and non-tradable goods. Final goods are produced using non-tradable goods and a composite of tradable goods via the CES technology

B(a_t^τ, a_t^n) = [ χ_τ^{1/μ_τn} (a_t^τ)^{1 − 1/μ_τn} + (1 − χ_τ)^{1/μ_τn} (a_t^n)^{1 − 1/μ_τn} ]^{1/(1 − 1/μ_τn)}     (9)

where a_t^τ denotes the tradable composite good and a_t^n the non-tradable good, 0 < χ_τ < 1 denotes a distribution parameter and μ_τn > 0 is the elasticity of substitution between the tradable composite good and the non-tradable good. The tradable composite good is produced using importable and exportable goods as intermediate inputs via the CES technology

a_t^τ = A(a_t^m, a_t^x) = [ χ_m^{1/μ_mx} (a_t^m)^{1 − 1/μ_mx} + (1 − χ_m)^{1/μ_mx} (a_t^x)^{1 − 1/μ_mx} ]^{1/(1 − 1/μ_mx)}     (10)
where a_t^m denotes the import-able good and a_t^x the export-able good, 0 < χ_m < 1 denotes a distribution parameter and μ_mx > 0 denotes the elasticity of substitution between import-able and export-able goods. Import-able, export-able and non-tradable goods are produced with capital and labour via the Cobb-Douglas technologies:

y_t^j = A_t^j (k_t^j)^{α_j} (h_t^j)^{1 − α_j} ;   ∀j ∈ {x, m, n}     (11)
where sector j ∈ {m,x,n}, ytj denotes output and Atj denotes total factor productivity in the in sector j. To ensure a stationary equilibrium process for external debt, we assume that the country interest-rate premium is debt elastic
r_t = r* + ψ (e^{d_{t+1} − d̄} − 1)    (12)
where r* denotes the sum of the world interest rate and the constant component of the interest-rate premium, the last term of (12) is the debt-elastic component of the country interest-rate premium, and we assume that the debt-elasticity parameter ψ > 0. The model-implied terms of trade ft are assumed to follow the AR(1) process
log(f_t / f̄) = ρ log(f_{t−1} / f̄) + π ε_t    (13)
where εt is white noise with mean zero and unit variance, and f̄ > 0. The serial correlation parameter satisfies 0 < ρ < 1 and the terms-of-trade standard error is π > 0. For details of the households' and firms' first-order conditions, market clearing and the derivation and definition of the competitive equilibrium see Uribe and Schmitt-Grohé [15]. In calibrating the model we follow the process of Uribe and Schmitt-Grohé [15]. The calibrated values of the model parameters for the Czech economy are in Table 1. We assume that the values of the parameters σ, δ, r* in the Czech Republic fit the values commonly used for calibrating small open economies (Uribe and Schmitt-Grohé [15]). We assume that the wage elasticity is the same in all three sectors. Uribe and Schmitt-Grohé [15] provide a rich discussion with literature references on calibrating the elasticity of substitution between import-able and export-able goods. The values of ωm, ωx, ωn, μτn, αm, αx, αn are gathered from Ambriško [2]. Considering high-frequency (i.e. quarterly) data, it is assumed that μmx = 0.8 in the Czech economy. Calibrating f̄, Am and An we adopt the values of Uribe and Schmitt-Grohé [15]. The values of the terms-of-trade serial correlation, ρ, and standard error, π, correspond to the data characteristics used in the empirical models. To calibrate χm, χτ and Ax we follow the process of Uribe and Schmitt-Grohé [15] and the implied moment restrictions on the average share of value-added exports in GDP, sx, the average trade balance-to-GDP ratio, stb, and the average share of non-tradable goods in GDP, sn. Like Uribe and Schmitt-Grohé [15], we use the OECD Trade in Value-Added (TiVA) [13] and UNCTAD statistical databases [14] to find the values of these moment restrictions. The values of the remaining implied structural parameters, d̄ and β, come
from the values of the calibrated ones. We fail to reach a negative reaction of the trade balance to a terms-of-trade shock in the theoretical MXN model that would follow the empirical facts observed in Figure 1. After substituting large values for the parameter ψ, the response of the trade balance is negative but very small in absolute value. Therefore we calibrate ϕj, j ∈ {m,x,n}, and ψ to capture the moments observed in the empirical model. Firstly, using the SVAR model from Section 2, we estimate the investment-to-terms-of-trade volatility ratio conditional on terms-of-trade shocks to equal approximately 0.45. Secondly, as Uribe and Schmitt-Grohé [15] pointed out, the standard deviation of investment in the traded sector conditional on the terms-of-trade shock is 1.5 times as large as its counterpart in the non-traded sector.
calibrated structural parameters
σ    2.00    Uribe and Schmitt-Grohé [15]
δ    0.10    Uribe and Schmitt-Grohé [15]
r*   0.04    Uribe and Schmitt-Grohé [15]
ωm   2.70    Ambriško [2]
ωx   2.70    Ambriško [2]
ωn   2.70    Ambriško [2]
μτn  0.76    Ambriško [2]
αm   0.35    Ambriško [2]
αx   0.35    Ambriško [2]
αn   0.25    Ambriško [2]
μmx  0.80
f̄    1.00    Uribe and Schmitt-Grohé [15]
Am   1.00    Uribe and Schmitt-Grohé [15]
An   1.00    Uribe and Schmitt-Grohé [15]
π    0.013   Empirical model
ρ    0.464   Empirical model

moment restrictions
sn            0.27    UNCTAD [14]
sx            0.34    OECD [13]
stb           0.08
pmym/(pxyx)   1.00    Uribe and Schmitt-Grohé [15]
σim+ix/σin    1.50
σi/σtot       0.45    Empirical model

implied structural parameter values
χm   0.672
χτ   0.979
d̄    1.800
Ax   1.436
β    0.962
ϕm   0.0125
ϕx   0.021
ϕn   0
ψ    0.0176
Table 1 Calibration of the MXN Model in the Czech Republic
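As a quick illustration of the exogenous driving process, the following minimal Python sketch (an illustrative addition, not part of the original paper) simulates the calibrated AR(1) terms-of-trade process (13) with ρ = 0.464 and π = 0.013 from Table 1 and traces the impulse response of log(f_t/f̄) to a one-standard-deviation shock.

import numpy as np

rho, pi_sd, f_bar = 0.464, 0.013, 1.0   # serial correlation, shock std. error, steady-state terms of trade
T = 20

# Impulse response of log(f_t / f_bar) to a one-standard-deviation shock in period 0
irf = np.zeros(T)
irf[0] = pi_sd
for t in range(1, T):
    irf[t] = rho * irf[t - 1]

# A simulated path of the terms of trade driven by white-noise shocks, eq. (13)
rng = np.random.default_rng(0)
log_dev = np.zeros(T)
for t in range(1, T):
    log_dev[t] = rho * log_dev[t - 1] + pi_sd * rng.standard_normal()
f = f_bar * np.exp(log_dev)

print(irf[:5])   # geometric decay at rate rho
print(f[:5])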
Figure 2 Impulse Response Functions to Terms-of-trade (TOT) Shock in the Czech Republic
To find the model equilibrium, a first-order linear approximation to the nonlinear solution is applied, using algorithms created and modified by Klein [7] and Schmitt-Grohé and Uribe [11]. The responses to terms-of-trade impulses and the variance-covariance matrix conditional on the terms-of-trade shock are computed using the algorithm of Uribe and Schmitt-Grohé [15].
4 Conclusion
In the empirical models we observe a small impact of the terms of trade on business cycles in the Czech Republic. In the Czech Republic investment reacts immediately to a terms-of-trade shock, while other aggregates do not change (or they change later, mostly as a reaction to other variables). The terms of trade have a negative effect on the trade balance in the Czech Republic. However, the theoretical model calibrated to suit the empirical observations overestimates the influence of terms-of-trade shocks in the Czech Republic. Both output and investment rise after a terms-of-trade shock realization in the Czech Republic. The theoretical falls in real interest rates are overestimated as well. As we already pointed out, we cannot achieve a negative reaction of the trade balance in the theoretical model as observed in the empirical model.
Acknowledgements Supported by the grant No. 1/0444/15 of the Grant Agency of Slovak Republic - VEGA.
References [1] Aguirre, E.: Business Cycles in Emerging Markets and Implications for the Real Exchange Rate. Ph.D. Dissertation, Columbia University, New York, 2011. [2] Ambriško, R.: A Small Open Economy with the Balassa-Samuelson Effect. Working Paper Series (ISSN 1211-3298), Charles University CERGE EI, Prague, 2015. [3] Amisano, G., and Giannini, C.: Topics in Structural VAR Econometrics, 2nd ed. Springer-Verlag, Berlin, 1997. [4] Broda, C.: Terms of Trade and Exchange Rate Regimes in Developing Countries. Journal of International Economics 63 (2004), 31–58. [5] EUROSTAT portal. Retrieved from http://ec.europa.eu/eurostat/data/database [Accessed 4th April 2016] [6] Harberger, A. C.: Currency Depreciation, Income, and the Balance of Trade. Journal of Political Economy 58 (1950), 47–60. [7] Klein, P.: Using the Generalized Schur Form to Solve a Multivariate Linear Rational Expectations Model. Journal of Economic Dynamics and Control 24 (2000), 1405–23. [8] Laursen, S., and Metzler, L. A.: Flexible Exchange Rates and the Theory of Employment. Review of Economics and Statistics 32 (1950), 281–299. [9] Lütkepohl, H.: New Introduction to Multiple Time Series Analysis. Springer-Verlag, Berlin, 2005. [10] Obsfeld, M.: Aggregate Spending and the Terms of Trade: Is There a Laursen-Metzler Effect? Quarterly Journal of Economics 97 (1982), 251–270. [11] Schmitt-Grohé, S. and Uribe, M.: Closing Small Open Economy Models. Journal of International Economics 61 (2003), 163–185. [12] Svensson, L. E. O., and Razin, A.: The Terms of Trade and the Current Account: The Harberger-LaursenMetzler Effect. Journal of Political Economy 91(1983), 97–125. [13] Trade in Value Added database (OECD.STAT). Retrieved from: http://www.oecd.org/sti/ind/measuringtradeinvalue-addedanoecd-wtojointinitiative.htm [Accessed 4th March 2016] [14] UNCTAD database. Retrieved from http://unctadstat.unctad.org/wds/ReportFolders/reportFolders.aspx?IF_ActivePath=P,15912 [Accessed 4th March 2016] [15] Uribe, M., and Schmitt-Grohé, S.: Open Economy Macroeconomics. Columbia University, New York, 2016. Retrieved from http://www.columbia.edu/~mu2166/book/ [Accessed 4th March 2016]
The Dynamic Behaviour of Wonderland Population–Development–Environment Model Blanka Šedivá1
Abstract. The Wonderland Population-Development-Environment (PDE) Model allows one to study the interactions between the economic, demographic and environmental factors of an idealized world, thereby enabling insights transferable to the real world. This model was first introduced by Sanderson in 1994 and there are now several modifications of this model. From a mathematical perspective, the PDE model is a system of non-linear differential equations characterized by slow-fast dynamics. This means that some of the system variables vary much faster than others. The differing speeds of the dynamics of the model variables cause problems for the numerical solution of the model. This article concentrates on the numerical solution of the model and on the visualization of the dynamical behaviour of a four-dimensional continuous dynamical system, the Wonderland model. We analyse the behaviour of the model for a selected part of the parametric space and show that the Wonderland system of four differential equations can generate behaviour typical of chaotic dynamical systems. Keywords: Dynamical System, Population–Development–Environment Model, Slow–Fast Dynamics.
JEL classification: C63 AMS classification: 34C60, 91B55
1 Introduction
Many models focused on relations relationships between economic factors, demographic and environmental factors have been developed to address climate change policy and these models are used for analysing several scenarios of future development of population from a critical (nightmare) scenario to a dream scenario. Between these two extreme limitations is usually model of the sustainable development expected. Sustainable development, although a widely used phrase and idea, has many different meanings and therefore provokes many different responses. In broad terms, the concept of sustainable development is an attempt to combine growing concerns about a range of environmental issues with socio-economic issues. Today’s economic and civilization development does not imply, in the opinion of many experts, sustainable concept for future. The economic expansion leads to population growth. Population in the world is currently (2016) growing at a rate of around 1.13% per year. The current average population change is estimated at around 80 million per year. Annual growth rate reached its peak in the late 1960s, when it was at 2% and above. The rate of increase has therefore almost halved since its peak of 2.19 percent, which was reached in 1963. The annual growth rate is currently declining and is projected to continue to decline in the coming years. Currently, it is estimated that it will become less than 1% by 2020 and less than 0.5% by 2050. This means that world population will continue to grow in the 21st century, but at a slower rate compared to the recent past. World population has doubled (100% increase) in 40 years from 1959 (3 billion) to 1999 (6 billion). It is now estimated that it will take a further 39 years to increase by another 50%, to become 9 billion by 2038. The latest United Nations projections http://esa.un.org/unpd/wpp/ indicate that world population will reach 10 billion persons in the year 2056 (six years earlier than previously estimated). The endless population growth creates more pressure on economic development and the related increasing of economic production puts a strain on the ability of the natural environment to absorb the 1 University
of West Bohemia, Plzeň, Czech Republic, Department of Mathematics, sediva@kma.zcu.cz
high level of pollutants that are created as a part of this economic growth. Therefore, solutions need to be found so that the economies of the world can continue to grow, but not at the expense of the public good. In the world of economics the amount of environmental quality must be considered as limited in supply and therefore is treated as a scarce resource. This is a resource to be protected, and the only really efficient way to do it in a market economy is to look at the overall situation of pollution from a benefit-cost perspective. This article focuses on the numerical solution and the visualisation of the dynamical behaviour of the Wonderland model, which can be regarded as one of the simple dynamic models of geographic, economic and environmental interactions. The special feature of the Wonderland model is the fact that not all system variables evolve with the same velocity. The velocities differ by two orders of magnitude, and this slow-fast dynamics makes the numerical solution difficult. The resulting mixture of slow and fast dynamics can lead to unpredictable, catastrophic transitions even though all functions in the model are deterministic, i.e. no stochastic forces are introduced.
2 Sanderson Wonderland model
The first version of the Wonderland model was published by Warren Sanderson as a part of an IIASA study (the International Institute for Applied Systems Analysis) in 1994 [8]. The original model was written in discrete time. Later, the model was reformulated in continuous variables, and this system has been used to investigate the sustainability of economic growth [5], [2], [3] and as a benchmark problem for visualization techniques for higher-dimensional dynamical systems [9]. In this paper the continuous version of the Wonderland model of economic, demographic and environmental interaction is used. The dynamics in Wonderland is determined by four state variables:
• x(t) - population, demographic variable;
• y(t) - per capita output, economic performance;
• p(t) - pollution per unit of output;
• z(t) - quality of environment.
The state variables for population x and per capita output y can assume all non-negative real values x, y ∈ [0, ∞), while the stock of natural capital z and the pollution per unit of output p are confined to the unit interval z, p ∈ [0, 1]. The variable z can be interpreted as a level of natural capital. If the natural capital is not polluted at all, it takes on the value z ≈ 1. At the other extreme, when the environment is so polluted that it produces the maximum possible damage to human health and to the economy, z ≈ 0. The value p ≈ 1 represents the situation when there is maximal pollution per unit of output; on the other hand, p ≈ 0 implies no pollution per unit of output. These four state variables evolve according to the following set of non-linear differential equations:
dx/dt = x · n(y, z)    (1)
dy/dt = y · [γ − (γ + η)(1 − z)^λ]    (2)
dz/dt = ν z (1 − z) [e^{ω(g(z) − f(x, y, z, p))} − 1]    (3)
dp/dt = −χ p    (4)

where
c(y, z) = ϕ(1 − z)^µ y   is pollution control,
y(y, z) = y − c(y, z)   is net per capita output,
f(x, y, z, p) = p x y − κ (e^{σ c(y,z) x}/(1 + e^{σ c(y,z) x}) − 1/2)   is the pollution flow,
n(y, z) = b(y, z) − d(y, z)   is the population growth rate, calculated as the difference between the crude birth rate b(y, z) and the crude death rate d(y, z).
The complex dynamical model includes 20 parameters, which can be grouped as follows:
demographic: β1, β2, β, δ1, δ2, δ3, α, ϑ;
economy: γ, η, λ;
environmental: κ, σ, δ, ρ, ω, ν;
environmental policy: ϕ, µ, χ.

2.1 Population

dx/dt = x · n(y, z)
The first differential equation describes population growth, which depends endogenously only on per capita output y and the level of natural capital z. In eq. (1), the population growth rate, n(y, z), can be written as the difference between the crude birth rate, b(y, z) (number of births per 1000 population per time at a given time step), and the crude death rate, d(y, z) (number of deaths per 1000 population per time at a given time step):

b(y, z) = b(y) = β1 (β2 − e^{βy}/(1 + e^{βy}))    (5)

d(y, z) = δ1 (δ2 − e^{αy}/(1 + e^{αy})) · (1 + δ3 (1 − z)^ϑ)    (6)

The parameters β1, β2, β govern the birth rate, while the parameters δ1, δ2, δ3, α and ϑ govern the death rate. From eq. (5) and (6) it can be seen how both the birth rate and the death rate decrease with increases in per capita output, y. Furthermore, in eq. (6), the death rate is seen to increase when the environment deteriorates, i.e., when z decreases. These effects are in line with recent studies relating population growth with industrial output [1]. The maximum crude birth rate is realised for zero net per capita output, b(y = 0) = β1(β2 − 1/2), and the minimal crude birth rate is the limit for unlimited growth of output: lim_{y→∞} b(y) = β1(β2 − 1).
In the literature [9] there also exists another type of function that describes the population's growth, of the form b2(y, z) = β1[β2 − (1/2)(1 + βy/(1 + βy))]. The limit behaviour of this alternative function is the same as for the previously formulated function b(y).
2.2 Product

dy/dt = y · [γ − (γ + η)(1 − z)^λ]
Net per capita output y(y, z) is defined as per capita output y minus the expenditure on pollution control c(y, z). The pollution control expenditure c(y, z) = ϕ(1 − z)^µ y depends on how polluted the environment is, not on the current flow of emissions. The availability of natural capital also influences the growth rate of the economy, as indicated by equation (2). When the environment is totally polluted, i.e. z ≈ 0, per capita output shrinks at the rate −η, while if the environment is not polluted at all, i.e. z ≈ 1, per capita output increases at the rate γ.

2.3 Quality of the environment (natural capital)
dz/dt = ν z (1 − z) [e^{ω(g(z) − f(x, y, z, p))} − 1]
The growth of the quality of the environment depends on all state variables. The first part of equation (3) assumes a logistic model for the quality of the environment. The speed at which natural capital regenerates is indicated by the term ν [e^{ω(g(z) − f(x, y, z, p))} − 1].
This term depends positively on the self-cleaning ability of nature, which is described by the function g(z) = (δ/ω) z^ρ, and negatively on the amount of pollution, which is modelled by the function

f(x, y, z, p) = p x y − κ (e^{σ c(y,z) x}/(1 + e^{σ c(y,z) x}) − 1/2)    (7)
The difference between the two flows, g(z) − f(x, y, z, p), is the net effect of natural and human forces on the environment. As we can see, the model of the dynamic behaviour of environmental quality is based on the assumption that the net effect on the environment enters non-linearly.

2.4 The pollution flow
dp/dt = −χ p
Pollution per unit of output p falls exponentially towards zero, at the rate χ, as technology improves and environmental consciousness grows.
3 Numerical experiment
The numerical experiments were inspired by Sanderson's original work [8], and special attention was focused on analysing the behaviour of the solution of the system of differential equations for different values of the parameters χ, ϕ and κ. These parameters govern the expenditures for pollution abatement. The parameter ϕ from the function c(y, z) influences how the expenditures increase.

3.1 Slow–fast dynamics
The numerical calculations have been performed using Matlab R2013b. For the numerical solution a stiff solver was used, based on an implicit Runge–Kutta formula with a trapezoidal rule step as its first stage and a backward differentiation formula of order two as its second stage. The Wonderland model is an example of a set of differential equations exhibiting slow–fast behaviour. One system variable (in this model z) varies much faster than the others (x, y and p). To analyse this phenomenon, we can rewrite the system using a scaling factor ε with 0 < ε ≪ 1. The scaling factor is called the perturbation parameter in singular perturbation theory [4]. The application of methods of perturbation theory enables us to separate the slow and fast components of the system and to analyse both components separately. The reduced system captures the slow dynamics by reformulating equation (3) to the form 0 = ν z (1 − z) [e^{ω(g(z) − f(x, y, z, p))} − 1]. The layer system captures the fast dynamics: the right-hand sides of equations (1), (2) and (4) are set to zero and the dynamics of the system reduce to equation (3) for the quality of the environment.

3.2 Visualization results and dynamics
Visualisation of the behaviour of dynamical systems can provide us with a deeper understanding of the underlying dynamics. For analysing the model we used the following values of the parameters: demographic factors β1 = 0.04, β2 = 1.375, β = 0.16, δ1 = 0.01, δ2 = 2.5, δ3 = 4, α = 0.18, ϑ = 15; economy factors γ = 0.02, η = 0.1, λ = 2; environmental factors κ ∈ (1, 100), σ = 0.02, δ = 1, ρ = 2, ω = 0.1, ν = 1; and environmental policy ϕ ∈ (0, 1), µ = 2, χ ∈ (0.01, 0.02). Our analysis has concentrated on changing three parameters, χ, ϕ and κ, while holding all the other parameters fixed. The parameter χ is one of the parameters the system is most sensitive to; it governs the economic decoupling rate, i.e. the rate at which technological innovation reduces the pollution flow per unit of output. The parameters ϕ and κ describe policy planning: the parameter ϕ can be interpreted as the rate at which expenditures increase, so the net per capita output is calculated as y(y, z) = y − ϕ(1 − z)^µ y. Since we require c < y, this limits ϕ < 1. The parameter κ is included in the pollution function f(x, y, z, p) (7) and determines the effectiveness of the expenditures. The future of the Wonderland economy modelled by our system of differential equations is demonstrated for three sets of parameters that describe three basic scenarios:
(A) Economists' dream scenario: χ = 0.02, κ = 20 and ϕ = 0.5. In this case, per capita output increases over time, population converges toward a stationary level with a zero growth rate and the pollution flow steadily decreases.
(B) Environmentalists' nightmare scenario: χ = 0.01, κ = 20 and ϕ = 0.5. In this case, when the parameter χ is reduced from 0.02 to 0.01, the system does not obey the criteria of sustainability.
(C) Sustainable development scenario: χ = 0.01, κ = 70 and ϕ = 0.5 describes the situation when the progress of technology can help to sustain economic growth.
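To make the numerical experiment concrete, the following minimal Python sketch (an illustrative addition, not the authors' Matlab code) integrates the system (1)–(4) with a stiff solver for the dream scenario (A), using the parameter values listed above. The initial state is illustrative, and the forms of f and g follow the equations as reconstructed in Section 2, so they should be read as assumptions rather than the authors' exact implementation.

import numpy as np
from scipy.integrate import solve_ivp
from scipy.special import expit   # numerically stable e**a / (1 + e**a)

# Parameter values from Section 3.2; scenario (A): chi = 0.02, kappa = 20, phi = 0.5
par = dict(beta1=0.04, beta2=1.375, beta=0.16, delta1=0.01, delta2=2.5, delta3=4.0,
           alpha=0.18, theta=15.0, gamma=0.02, eta=0.1, lam=2.0,
           kappa=20.0, sigma=0.02, delta=1.0, rho=2.0, omega=0.1, nu=1.0,
           phi=0.5, mu=2.0, chi=0.02)

def wonderland(t, s, q):
    x, y, z, p = s                                   # population, per capita output, environment, pollution per unit of output
    c = q['phi'] * (1.0 - z) ** q['mu'] * y          # pollution-control expenditure c(y, z)
    b = q['beta1'] * (q['beta2'] - expit(q['beta'] * y))                        # crude birth rate, eq. (5)
    d = q['delta1'] * (q['delta2'] - expit(q['alpha'] * y)) * (1.0 + q['delta3'] * (1.0 - z) ** q['theta'])   # crude death rate, eq. (6)
    f = p * x * y - q['kappa'] * (expit(q['sigma'] * c * x) - 0.5)              # pollution flow, eq. (7)
    g = (q['delta'] / q['omega']) * z ** q['rho']                               # self-cleaning ability (assumed reconstruction)
    dx = x * (b - d)                                                            # eq. (1)
    dy = y * (q['gamma'] - (q['gamma'] + q['eta']) * (1.0 - z) ** q['lam'])     # eq. (2)
    dz = q['nu'] * z * (1.0 - z) * (np.exp(q['omega'] * (g - f)) - 1.0)         # eq. (3)
    dp = -q['chi'] * p                                                          # eq. (4)
    return [dx, dy, dz, dp]

s0 = [1.0, 1.0, 0.98, 1.0]                            # illustrative initial state (not given in the paper)
sol = solve_ivp(wonderland, (0.0, 300.0), s0, args=(par,), method='BDF')
print(sol.y[:, -1])                                   # state variables at t = 300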
Figure 1 The population x(t), per capita output y(t), quality of environment z(t) and pollution p(t) for scenarios (A), (B) and (C)
The other numerical experiments have been focused on the stability of the solution. We analyse the behaviour of the solution of the system for the parameter κ in the interval (40, 100) and the parameter ϕ in the interval (0.2, 0.5). The modelled value of per capita output at time t = 300 is shown in the pictures below as a scaled colour graph. There is a region of nightmare scenarios, where per capita output tends to zero (black colour), and a region of dream scenarios, where per capita output tends to ∞ (white colour). As we can see in the detail, the border between these two scenarios is not a clear line. This type of behaviour is typical of a dynamical system operating in a chaotic regime.
Figure 2 The scaled colour graph of per capita output at time t = 300 for the parametric region κ ∈ (40, 100) × ϕ ∈ (0.2, 0.5) (left picture) and detail for parameters κ ∈ (70, 80) × ϕ ∈ (0.4, 0.5) (right picture)
4 Conclusion
The behaviour of the four-dimensional dynamical system which describes the Wonderland Population-Development-Environment Model was presented. The future of the model for three basic scenarios – the dream scenario, the nightmare scenario and the sustainable development scenario – was demonstrated. The behaviour of the state variables for several parts of the parametric space (parameters χ, ϕ and κ) has been analysed in detail. A detailed comparison of the parameters in the Wonderland model with estimates of these parameters based on real data sets is a great challenge for our future work. Of course, Wonderland is a toy model which captures the global features of the complete human–environment interaction, but not its details. On the other hand, even though there are no stochastic, exogenous shocks, the future of the state variables seems quite unpredictable in some parts of the parametric space, which is typical for chaotic dynamical systems. Further study of the chaotic behaviour of the Wonderland model is the second motivation for our continuing interest in this model.
Acknowledgements This article was supported by the European Regional Development Fund (ERDF), project ”NTIS - New Technologies for the Information Society”, European Centre of Excellence, CZ.1.05/1.1.00/02.0090.
References [1] Bar, M. and Leukhina, O. M.: A model of historical evolution of output and population. Federal Reserve Bank of Minneapolis. Available at: http://www.stanford.edu/group/SITE/papers2005/Leukhina.05.pdf (2005).
[2] Guckenheimer, J., Krauskopf, B., Osinga, H. M. and Sandstede, B.: Invariant manifolds and global bifurcations. Chaos: An Interdisciplinary Journal of Nonlinear Science, 25.9 (2015): 097604. [3] Kohring, G. A.: Avoiding chaos in Wonderland. Physica A: Statistical Mechanics and its Applications, 368.1 (2006), pp. 214–224. [4] Kuehn, C.: Geometric Singular Perturbation Theory. Multiple Time Scale Dynamics. Springer International Publishing, 2015, pp. 53–70. [5] Milik, A., Prskawetz, A., Feichtinger, G. and Sanderson, W. C.: Slow-fast dynamics in Wonderland. Environmental Modeling & Assessment, 1.1-2 (1996), pp. 3–17. [6] Milik, A. and Prskawetz, A.: Slow-fast dynamics in a model of population and resource growth. Mathematical Population Studies, 6.2 (1996), pp. 155–169. [7] Pruyt, E and Kwakkel, J. H.: A bright future for system dynamics: From art to computational science and beyond. In Proceedings of the 30th International Conference of the System Dynamics Society, St. Gallen, Switzerland, 22-26 July 2012. System Dynamics Society. [8] Sanderson, W. C.: Simulation models of demographic, economic, and environmental interactions. PopulationDevelopmentEnvironment. Springer Berlin Heidelberg, 1994, pp. 33–71. [9] Wegenkittl, R., Grller, M. E., & Purgathofer, W.: A guided tour to wonderland: Visualizing the slow-fast dynamics of an analytical dynamical system. Technical Report TR-186-2-96-11, Institute of Computer Graphics and Algorithms, Vienna University of Technology, 1996.
Searching for suitable method for clustering the EU regions according to their agricultural characteristics Ondřej Šimpach1, Marie Pechrová2 Abstract. Not only Common Agricultural Policy, but also other policies at the EU level require unified approach. However, agriculture in various member states as same as in particular regions differs significantly. Therefore, the aim of the paper is to find the appropriate method which will group the most similar regions according to their agricultural characteristics and create appropriate number of groups which would be suitable for the application of the agricultural policy. These results from different methods of HCA are compared and discussed. Particularly single, average, complete, weighted average, median, centroid, and Ward’s methods are applied. Also various metrics (Euclidean, Square Euclidean, absolute and maximum-value, Minkowski and Canberra distances, and correlation coefficient and angular separation measures) are used. The data about NUTS II regions of the EU are obtained from Eurostat for the latest available year (mainly 2013). Main indicators for the description of agricultural in particular region were: agricultural income, utilized agricultural area, labour and others. The most suitable well-balanced groups were created by Ward’s method with Canberra distance. Keywords: Clustering, measures, NUTS 2 regions, agricultural policy JEL Classification: C38, Q18 AMS Classification: 62H30
1 Introduction
Spatial econometrics is an important tool for support of the policy-making decisions. The analyses enable to see the results of taken measures and suggest needed correction. For example research of Smith et al. [12] used spatial econometric techniques to evaluate RDPs in the European Union, at the NUTS 2 level. They focused specifically on labour productivity in the agricultural sector. At first side their results seemed to show that spending within the regional development programs on the competitiveness program (axis 1) had a statistically significant positive relationship with the increase of agricultural labour productivity in southern Europe. However, when their controlled properly for spatial effects (rural versus urban areas), the effect disappeared. “This shows how not taking spatial econometrics into account can lead to erroneous (policy) conclusions” [12]. Similarly Becker et al. [1] examined the effects of EU’s structural policy (particularly of the Objective 1 facilitating convergence and cohesion within the EU regions). They observed the development of average annual growth of GDP per capita at purchasing power parity (PPP) during a programming period and average annual employment growth at NUTS 2 and NUTS 3 levels. Becker et al. [1] concluded that “Objective 1 treatment status does not cause immediate effects but takes, in the average programming period and region, at least four years to display growth effects on GDP per capita”. Also Palevičienė and Dumčiuvienė [6] used multivariate statistical methods to analyse the EU’s NUTS 2 level socio-economic data and to identify the clusters of socio-economic similarity. “The results showed that despite long lasting purposeful structural funds allocations there are still big regional development gaps between European Union member states” [6]. Pechrová and Šimpach [7] searched for the development potential of the NUTS 2 regions using hierarchical cluster analysis (Ward’s method and Squared Euclidean distances), the regions were clustered into groups with same characteristics. “The cluster analysis objective is to find out which objects are similar or dissimilar to each other” [9]. There are various clustering algorithms. Basic division is on hierarchical and non-hierarchical methods. First mentioned group contains agglomerative polythetic approach, two-dimension agglomerative clustering, division monothetic and division polythetic approaches. Non-hierarchical methods can be based on k-means algorithm or use fuzzy approach to cluster analysis. “When compared to standard clustering, fuzzy clustering provides more flexible and powerful data representation” [10]. Schäffer et al. [11] proposed new Bayesian approach for quantifying spatial clustering that employs a mixture of gamma distributions to model the squared distance of points to their second 1
University of Economics Prague, Faculty of Informatics and Statistics, Department of Statistics and Probability, W. Churchill sq. 4, 130 67 Prague 3, Czech Republic, ondrej.simpach@vse.cz. 2 Institute of Agricultural Economics and Information, Modelling of the Impacts of Agricultural Policy, Mánesova 1453/75, 120 00 Prague 2, Czech Republic, pechrova.marie@uzei.cz.
nearest neighbours. Ritter et al. [8] proposed a new method for autonomously finding clusters in spatial data belonging to the nearest neighbour approaches. “It is a repetitive technique which produces changing averages and deviations of nearest neighbour distance parameters and results in a final set of clusters” [8].
2 Data and methods
The aim of the paper is to find the appropriate method which will group the most similar regions according to their agricultural characteristics and create balanced groups. Firstly, the data are introduced, then the hierarchical cluster analysis methods and metrics used in the article are described. The data matrix was obtained from Eurostat [5] for the latest available years (mainly 2013). The data were not available for all EU regions (only for 2014). All NUTS 2 regions from Germany, Belgium and Slovenia were missing. Unfortunately, very important agricultural indicators such as the herds' sizes and types, structure and acreage of crops were not available for the majority of regions. Hence, they were not included in the analysis. The data were checked for correlation. Based on pairwise Pearson correlation coefficients (values higher than 0.9), some indicators were excluded from the analysis. In Table 1 we present the descriptive characteristics only for those variables which were chosen to be used in the next step. The agricultural holdings created an output of 1 533 thousand CZK on average in 2013. They used around 706 thous. ha of agricultural area and 411 thous. ha of arable land on average. The average share of arable land was 56%, but it differed across the EU. Similarly, the share of agricultural land was higher (81%), but varied across the EU regions. Indicators related to the number of workers were highly correlated. Therefore, only the number of sole holders working on the farm was used. There were 45 898 of them on average. A high standard deviation points to higher variability in the data. Sometimes it is even twice as high (in the case of the number of sole holders working on the farm). The presence of variability in the data can negatively influence the results of clustering methods. Zero values stand for the region Inner London with no agricultural production and land.

Used variables                                Mean      Std. Dev.   Min   Max
Agricultural output (million EUR)             1 533     1 593       0     9 390
Utilized agricultural area (ha)               706 038   752 940     0     5 295 680
Arable land (ha)                              410 724   474 818     0     3 371 340
Share of arable land                          56%       27%         0%    99%
Share of agricultural land                    81%       19%         0%    100%
Sole holders working on the farm (pers.)      45 898    99 096      0     749 260
Table 1 Descriptive characteristics of data from Eurostat (2016); own elaboration
As the variables are in different units, they were standardized. Consequently, hierarchical cluster analysis was applied. At first, the resemblance matrix was calculated. The choice of a similarity or dissimilarity measure depends on the type of variables (nominal, ordinal, ratio, interval, and binary). We utilized Euclidean, squared Euclidean, absolute-value, and maximum-value distances, the Minkowski distance with p argument, the Minkowski distance with p argument raised to power (2), the Canberra distance, the correlation coefficient similarity measure, and the angular separation similarity measure. Euclidean and squared Euclidean distances are based on the Pythagorean theorem. The Euclidean distance between two data points (Xi and Yi) is calculated as the square root of the sum of the squares of the differences between corresponding values (1).

d = ( Σ_{i=1}^{n} (X_i − Y_i)² )^{1/2}    (1)
(It is similar to the Minkowski distance with argument p = 2.) The squared Euclidean distance metric uses the same equation (1) without the square root. As a result, clustering with the squared Euclidean distance metric is faster. The calculation of the absolute-value distance (i.e. Manhattan distance) is similar to the Minkowski distance with argument p = 1. The maximum-value distance (Chebyshev) is calculated as Minkowski with p → ∞. The Minkowski distance with other arguments is calculated as (2).

d = ( Σ_{i=1}^{n} |X_i − Y_i|^p )^{1/p}    (2)
It is often used when variables are measured on ratio scales with an absolute zero value. Its disadvantage is that even a few outliers with high values bias the result. Canberra metric is a dissimilarity coefficient defined in interval ajk = < 0;1 >, where ajk = 0.0 means maximum similarity when objects j and k are identical. “Each term in the sum is scaled between 0.0 and 1.0 equalizing the contribution of each attribute to overall similarity” [9]. Multiplier 1/n averages the n proportions (3):
a_jk = (1/n) Σ_{i=1}^{n} |X_ij − X_ik| / (X_ij + X_ik)    (3)
The correlation coefficient similarity measure is defined on the interval rjk = <-1;1>, where rjk = 1.0 represents maximum correlation between the variables. Its advantage is that it can be applied to non-normalized data and the accuracy of the score increases. The calculation is as follows (4):

r_jk = [ Σ_{i=1}^{n} X_ij X_ik − (1/n) Σ_{i=1}^{n} X_ij Σ_{i=1}^{n} X_ik ] / { [ Σ_{i=1}^{n} X_ij² − (1/n)(Σ_{i=1}^{n} X_ij)² ] [ Σ_{i=1}^{n} X_ik² − (1/n)(Σ_{i=1}^{n} X_ik)² ] }^{1/2}    (4)
The angular separation similarity measure represents the cosine of the angle between two vectors (5). It is defined on the interval sjk = <-1;1>, where a higher value indicates greater similarity. The cosine of 0° is 1; for any other angle it is less than 1.

s_jk = Σ_{i=1}^{n} X_ij X_ik / ( Σ_{i=1}^{n} X_ij² · Σ_{i=1}^{n} X_ik² )^{1/2}    (5)
Both of the above stated measures place anti-correlated objects maximally far apart (see e.g. [14]). An extended discussion on similarity measures can be found e.g. in [3]. When joining clusters by dissimilarity coefficients, the two clusters with the smallest dissimilarity coefficient value are merged. When using a similarity coefficient, the two clusters with the largest similarity coefficient value are joined. The article uses different clustering methods: single linkage (SLINK), average linkage method (i.e. unweighted pair-group method using arithmetic averages, UPGMA), complete linkage (CLINK), weighted average linkage, median linkage, centroid linkage, and Ward's linkage. The single linkage method finds the two most similar spanning objects in different clusters. SLINK tends to produce compacted trees. It is useful only when clusters are obviously separated. When objects are close to each other, SLINK tends to create long chain-like clusters that can have a relatively large distance separating observations at either end of the chain. The average linkage method defines the similarity between two clusters as the arithmetic average of the similarities between the objects in one cluster and the objects in the other cluster. It usually produces trees which are between the extremes of those created by single or complete linkage. It also tends to give higher values of the cophenetic correlation coefficient. "On average UPGMA produces less distortion in transforming the similarities between objects into a tree" [9]. It can also have a weighted form. In the complete linkage clustering method the similarity between any two clusters is determined by the similarity of their two most dissimilar spanning objects. CLINK tends to produce clusters with similar diameters and extended trees. It is useful when the objects form naturally separated clusters. However, the results can be sensitive to outliers. Median linkage belongs to the averaging techniques, but uses medians instead of arithmetic means. This helps to mitigate the effect of possible outliers in the data. With the median linkage method, the distance between two clusters is the median distance between an observation in one cluster and an observation in the other cluster. The centroid method determines the distance of clusters by the distance of their centres (calculated as averages of real values). Its weighted form is useful when clusters of different sizes are expected. It usually utilizes the squared Euclidean distance metric. Ward's method [15] merges the clusters with the minimal within-cluster sum of squared deviations from objects to centroids. The distances of objects are again usually measured by the squared Euclidean distance. It tends to create relatively small clusters because of the squared differences, but with similar numbers of observations. However, it is sensitive to outliers. Another disadvantage is that the distance between clusters calculated at one step of clustering is dependent on the distance calculated in the previous step. The clustering was cut when the number of clusters was 5, to create a reasonable number of groups for policy treatment. The calculations were done in Stata 11.2, where the above stated metrics and methods are available.
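For readers who want to reproduce the workflow outside Stata, the following minimal Python sketch (an illustrative addition; the input matrix X of standardized regional indicators is a random stand-in) combines the Canberra distance with Ward's linkage and cuts the tree into five clusters. Note that SciPy's Ward update formally assumes Euclidean distances, so pairing it with the Canberra metric should be read as the same heuristic applied in the paper, not as an exact variance-minimizing procedure.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# X: rows = NUTS 2 regions (the cluster counts in Table 2 imply 214 of them),
# columns = standardized agricultural indicators; random data used here as a stand-in.
rng = np.random.default_rng(42)
X = rng.normal(size=(214, 6))

d = pdist(X, metric='canberra')                    # condensed dissimilarity matrix, eq. (3)
Z = linkage(d, method='ward')                      # Ward's linkage on the precomputed distances
labels = fcluster(Z, t=5, criterion='maxclust')    # cut the dendrogram into 5 clusters

print(np.bincount(labels)[1:])                     # number of regions in each of the 5 clusters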
3 Results and discussion
For policy-making purposes, it is important that the clusters contain a similar number of regions. We created 5 clusters by each method. The usual disadvantage of the single linkage method – the tendency to chaining – appeared also in our application. Regardless of the distance metric, it created four clusters with 1 or 2 regions and 1 cluster (number 5) with the others. Therefore, it will not be further taken into account. The average linkage method also produced groups
where the majority of regions was included in one cluster (mostly 1 or 2) and the remaining clusters included only a few regions. The Canberra distance rearranged the regions differently (the majority of them was in the 5th cluster, then in the 1st). The correlation coefficient and angular separation similarity measures created more balanced clusters when using the average linkage clustering method. Complete linkage with the majority of the used metrics was also not optimal. Almost all regions were clustered into the 1st group. Using the Canberra distance, the majority of regions was included in the 3rd group. It may be due to the presence of outliers in the data. The correlation coefficient and angular separation similarity measures created the most balanced clusters for the complete linkage method. Weighted average linkage grouped more regions into the 3rd cluster than its unweighted variant. The Canberra distance in this case put more regions in the 3rd cluster, while in the unweighted case it was the 5th cluster. The number of regions in each cluster was balanced when using the correlation coefficient and angular separation similarity measures. Median linkage put almost all regions into the 1st cluster. Only using the angular separation similarity measure did the 3rd cluster emerge as the biggest. Despite the fact that it should mitigate the influence of the outliers to some extent, this was not the case. Probably the London region, with minimal values in almost all variables, represented a problem for this method. A similar problem arises with centroid linkage. All regions are in the 1st cluster; only with the correlation coefficient and angular separation similarity measures are the clusters more balanced. Despite the fact that Ward's linkage is normally used with the squared Euclidean distance, the absolute-value and maximum-value metrics seem more feasible (see Table 2). While the first mentioned creates one cluster with only three regions (cluster 5), the number of regions grouped by the other stated methods is more equal; but the maximum-value method put only 10 regions in the fourth cluster. Therefore, the Canberra distance seems to be optimal in terms of the number of regions in each cluster. The Minkowski distance with p argument provided similar results to the Euclidean distance, and the Minkowski distance with p argument raised to power to the squared Euclidean distance, in some cases, as the value of the parameter p was 2. The correlation coefficient and angular separation similarity measures created the most suitable groups. As the first one is only a special case of the latter (angular separation standardized by centring the coordinates on their mean value), we may suggest using the angular separation similarity measure. The Canberra distance enabled the creation of relatively well-balanced groups in the last category. Despite the fact that it has a certain disadvantage with a higher number of variables, this is not our case and we can recommend its usage.

Distance                                               Cluster's number: 1    2    3    4    5
Euclidean distance                                     62   3    58   73   18
squared Euclidean distance                             98   22   59   32   3
absolute-value distance                                24   22   63   68   37
maximum-value distance                                 79   57   29   10   39
Minkowski distance with p argument                     62   3    58   73   18
Minkowski distance with p argument raised to power     98   22   59   32   3
Canberra distance                                      71   42   23   45   33
correlation coefficient similarity measure             49   31   64   46   24
angular separation similarity measure                  46   27   57   61   23

Table 2 Number of regions in clusters using Ward's linkage method and various metrics; own elaboration
As Ward's (and average) method is, according to [4], the most suitable in the majority of cases, we suggest combining it with the Canberra distance method. Analogical results for the UPGMA and SLINK methods were achieved by Bouguettaya et al. [2]. They compared distance measure functions and found out that the Canberra distance seems better than the Euclidean counterpart. "With no noticeable difference in computational cost, correlation achieved by the Canberra method is consistently higher than the correlation obtained by the Euclidean method on a same data set with either UPGMA or SLINK" [2]. In our case, this approach grouped regions with minimal values in almost all variables into the fifth cluster and those with maximal values into the first cluster (see Table 3).
Cluster   Agric. output (million EUR)   Utilized agric. area (ha)   Arable land (ha)   Share of arable land   Share of agric. land   Sole holders working on the farm (pers.)
1         2 827                         1 474 755                   871 046            62%                    89%                    96 704
2         1 189                         510 578                     224 192            45%                    86%                    43 615
3         1 198                         560 681                     473 554            83%                    61%                    21 287
4         827                           160 985                     79 111             51%                    94%                    6 579
5         382                           145 464                     66 148             44%                    54%                    10 266
Table 3 Characteristics of clusters created by Ward’s method using Canberra distance; own elaboration
Regions grouped into the fifth cluster produced the lowest agricultural output with the lowest acreage of UAA and arable land. It contained mainly regions from France, Italy, and Poland. Regions in the first cluster produce the highest agricultural output, utilize the most agricultural area and arable land in absolute values and had the most sole holders working on the farm. It included e.g. regions from Austria. The results of Ward's clustering method using different metrics are presented in Figure 1 in the Appendix.
4 Conclusion
Agriculture in the European Union varies across particular regions. As the Common Agricultural Policy requires a unified approach, the aim of the paper was to find an appropriate method which will group the most similar regions according to their agricultural characteristics and create an appropriate number of groups which would be suitable for the application of the agricultural policy. The results from different methods of hierarchical cluster analysis showed that finding a suitable method for clustering regions is not an easy task. Each clustering method or distance metric provided different results (see also the comparison in [13]). Besides, the majority of them proved to be sensitive to outliers. Mostly, the best results were provided by the correlation coefficient and angular separation similarity measures. Surprisingly, the often used Ward's linkage method with squared Euclidean distances did not provide well-balanced groups. Normally this combination of methods tends to create relatively small clusters because of the squared differences, but with similar numbers of observations, which is important for the application in the area of agricultural policy-making. Therefore, when clustering EU regions for policy purposes we recommend using Ward's method rather with the Canberra distance. This measure provided well-balanced groups with clearly different characteristics which enable the formulation of appropriate policy measures.
Acknowledgements The research was supported by the Czech Science Foundation project no. P402/12/G097 DYME – “Dynamic Models in Economics” and also by Thematic task no. 4107/2016 of the Institute of Agricultural Economics and Information.
References [1] Becker, S. O., Egger, P. H., von Ehrlich, M.: Going NUTS: The effect of EU Structural Funds on regional performance. Journal of Public Economics 94 (2010), 578–590. [2] Bouguettaya, A., Yub, Q., Liub, X., Zhouc, X., Songa, A.: Efficient agglomerative hierarchical clustering. Expert Systems with Applications 42 (2015), 2785–2797. [3] Cha, S. H.: Comprehensive Survey on Distance / Similarity Measures between Probability Density Functions. International Journal of Mathematical Models and Methods in Applied Sciences 4 (2007), 300–307. [4] Dunn, G., Everitt, B. S.: An introduction to mathematical taxonomy. Cambridge University Press, 1982. [5] Eurostat: Agriculture, forestry and fisheries. [on-line] Available at: http://ec.europa.eu/eurostat/data/ database [cit. 2016-02-20]. [6] Palevičienė, A., Dumčiuvienė, D.: Socio-Economic Diversity of European Regions: Finding the Impact for Regional Performance. Procedia Economics and Finance 23 (2015), 1096–1101. [7] Pechrová, M., Šimpach, O.: The Development Potential of the Regions of the EU. In: Region in the Development of Society. Mendel University in Brno, Brno, (2013), 322–335. [8] Ritter, G. X., Nieves-Vázquez, J.-A., Urcid, G.: A simple statistics-based nearest neighbor cluster detection algorithm. Pattern Recognition 48 (2015), 918–932. [9] Romesburg, H. C.: Cluster Analysis For Researchers. Lulu Press, North Carolina, 2004. [10] Rovetta, S., Masulli, F.: Visual stability analysis for model selection in graded possibilistic clustering. Information Sciences 279 (2014), 37–51. [11] Schäffer et al.: A Bayesian mixture model to quantify parameters of spatial clustering. Computational Statistics and Data Analysis 92 (2015), 163–176. [12] Smith, M. J., van Leeuwen, E. S., Florax, R. J. G. M., de Groot H. L. F.: Rural development funding and agricultural labour productivity: A spatial analysis of the European Union at the NUTS2 level. Ecological Indicators 59 (2015), 6–18. [13] Šimpach, O.: Application of Cluster Analysis on the Demographic Development of Municipalities in the Districts of Liberecky Region. In: 7th International Days of Statistics and Economics. Slaný: Melandrium, (2013), 1390–1399. [14] van Dongen, S. Enright, A.J.: Metric distances derived from cosine similarity and Pearson and Spearman correlations. [on-line] Available at: http://arxiv.org/abs/1208.3145 [cit. 2016-04-14]. [15] Ward, J. H.: Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58 (1963), 236–244.
Appendix
Figure 1 Dendrograms for Ward's clustering method and various distance metrics (Euclidean, squared Euclidean, Minkowski with p argument, Minkowski with p argument raised to power, absolute-value, maximum-value, Canberra, correlation coefficient and angular separation); own elaboration
A computer-based VBA model for optimal spare parts logistics management in a manufacturing maintenance system Eva Šlaichová1, Márcio Rodrigues2 Abstract. Spare parts management in maintenance systems has become increasingly challenging in major global corporations, since it’s hard to predict exactly when a piece of equipment fails and needs repair or replacement of its parts. Service time length, lead times, spare parts costs, the cost of non-production and service efficiency modelled with operations research tools and mathematical methods guide this multi-criteria decision-making process. Moreover, not always predictive and preventive maintenances are able to provide the adequate manufacturing management information about which equipment would need proper assistance and historical data of corrective maintenance hadn’t revealed a periodic pattern of failure modes for accurate demand forecasting. This paper proposes a conceptual model using VBA language that links timelines of preventive and corrective maintenances of equipments with stock levels and quality indicators, for a non-periodic time series for a more accurate management of optimal inventory levels and demand forecasting for spare parts in manufacturing maintenance systems. All tests and results acquired with actual corporate data as inputs are also presented. Keywords: spare parts, maintenance, operations research, logistics. JEL classification: C44, C53 AMS classification: 90B05
1 Introduction
The proper management of spare parts in maintenance systems has become very important in major companies, due to a continuous search for increasing plant efficiency, reduce unproductive times and most relevant, costs. It depends on a combination of many factors and cooperation of many key departments, not only setting up correctly the values of initial stocks and periodic purchase orders of such parts for each equipment but also combined with proper cost control to let plant management wisely decide which types of spare parts are suitable to stock or to purchase on-demand. Key questions should be periodically asked and answered during this continuous process: Which are my key spare parts? How are their inventories? Keep or don’t keep safety stocks? Which are the lead times of their suppliers? Is it wise to use Third-Party-Logistics (3PL)? How long are the production downtimes due to spare parts replacement or machinery repairs? And the most important: what’s the impact on the final customer? Additionally, pattern recognition to predict failure modes in industrial machines is directly linked with predictive and preventive maintenance plans of such equipments, through detailed analysis of intrinsic parameters of the process and the machines themselves, such as pressure variations, temperature ranges , tension, state of the components, calibration, electrical, hydraulic drives, sensors, impellers, thermostats, actuators, thermocouples, among many others and also associated with databases of historical data of non-conformities. 1 Technical
University of Liberec/Faculty of Economics, Department of Business Administration and Management, Studentska 1402/2 Liberec 1, 461 17 Czech Republic, eva.slaichova@tul.cz 2 Universidade Federal da Bahia - UFBA/Escola Politécnica da UFBA, Departamento de Engenharia Mecânica, R. Prof. Aristides Novis, 2 - Federação, Salvador - BA, 40210-630, Brazil, marciovtr@gmail.com
The main goal of this paper is to propose a conceptual model for spare parts logistics in maintenance systems, which outputs key demand and cost forecast information to help companies make wise and fast decisions regarding accurate spare parts purchases, according to the corrective and preventive maintenance plans of the main and bottleneck machines, saving time, managing stocks more efficiently and reducing costs. The methodology used during the model development and this paper's contribution includes the following activities:
• a wide literature review and consolidation of main concepts: maintenance types, Total Productive Maintenance, spare parts classification, inventory and demand forecasting methods;
• mathematical and economic model definitions, with the help of operations research tools and economic concepts;
• survey creation and submission for corporate data acquisition;
• real corporate data treatment and analysis;
• VBA programming and development of the user interface;
• tests and empirical improvements.
This paper is divided into five sections. The first section is a brief introduction to spare parts logistics management and an overview of the main contribution, as described above. Section 2 presents the literature research regarding relevant inventory management methods applied to spare parts. Section 3 describes the main contribution of this paper, with the mathematical, economic and computational approaches of the proposed model. Section 4 discusses some of the results acquired using real corporate data and future improvement opportunities for the model and the topic. The last section summarises the main content discussed in the previous sections of this paper.
2 Literature Review
Over the past years many authors have developed and applied their own methods and models for proper spare parts management in maintenance systems. Aiming at an accurate management of spare parts, two main aspects should be considered: demand forecasting and inventory policies. The next two subsections detail relevant methods used in major companies over the past years.
2.1 Demand Forecasting methods
Relevant demand forecasting methods applied in major industries are shown as follows, labelled by authors:
• Krever (2005): mean and variance of demand during lead time. Single Demand Approach (SDA) (different from the Periodic Demand Approach, PDA). Variables of demand as a function of time, where D(t) = (Quantity Ordered During Lead Time, Time Intervals Between Demands, Individual Lead Time) (1)
• Croston (1972): an alternative method which separates the estimation of intervals between demands from the estimation of the amounts demanded in each occurrence; [3]
• Willemain, Smart and Schwarz (2004): bootstrapping techniques for intermittent demands - assess the demand distribution during lead-time; [3]
2.2 Inventory methods applied to spare parts
The inventory models for products with high and independent demand are a consolidated area in operations management. From the original work by Harris (1913) on the Economic Order Quantity (EOQ), different inventory models have been developed, including the following classic models: [2]
• Continuous Review (R, Q): in this model the inventory is continuously monitored and, when the level reaches the order point "R", a lot of size "Q" (economic order quantity) is placed;
• Periodic Review (T, S): in this case, orders are placed at a fixed interval of time "T", in an amount that replaces the inventory position to the maximum inventory level "S";
• Base Stock (B): here, at each withdrawal from inventory, an order of the same amount is made to replace the baseline, keeping the inventory position (inventory on hand plus open orders) constant and equal to "B";
• Periodic Review Model: maximum inventory demand distribution with accumulated historical information. Popovic (1987) developed a periodic review model in which the maximum inventories are determined based on a demand distribution estimated a priori and adjusted a posteriori using Bayesian inference with accumulated historical information. Initially, the demand rate (λ) is considered constant and, then, an alternative model is presented for a time-varying demand rate, given by Λ(t) = (k + 1)λt^k (2)
• Petrovic's (1990): needs to consider subjective aspects such as quality failure modes. Petrovic claims that the decisions in spare parts management need to consider subjective aspects in addition to the traditional data for costs, lead times and demand. They developed an expert system to manage inventory with the premise that the distribution of the time between failures of a component follows the exponential distribution (therefore it did not consider the cases of early mortality and aging/wearing). Users estimate the failure rates of components (or classify them on a subjective scale) and then answer questions about repairability, repair time, cost and criticality level of components. After inputting and processing data, the user gets the list of recommended lot sizes and the total expected inventory cost; [4]
• Aronis' (2004): used the Bayesian model to forecast the demands and determine the single parameter B of a base stock model for spare parts. Estimates and failure history of similar items are used a priori to fit the proper distribution of the demand and then the levels of required inventories (base level) are calculated; [5]
• Johnston and Boylan's (1996): compared forecasting made through exponential smoothing to the one made through Croston's method, and concluded that the latter is superior when the average interval between demands is greater than 1.25 time periods (time bucket). Syntetos and Boylan (2001) pointed out a bias in the original Croston's model and proposed a correction that gave rise to the SBA (Syntetos-Boylan Approximation) model; [6];[7]
• Krever's Intermittent demand (2005): computes the mean and variance of demand during the lead time. In their approach, known as Single Demand Approach (SDA), as opposed to the more traditional Periodic Demand Approach (PDA) with time buckets, three random variables are used: amounts demanded during the lead-time, time intervals between demand occurrences, and the lead-time itself. [8];[9]
The following section presents the proposed model and the concepts behind the main contribution of this paper.
3 Conception of the proposed model
The basic concept of the widely used Croston's method for demand forecasting, consisting of separately estimating the intervals between demands and the amounts demanded in each occurrence, was used as a starting point for the proposed model (a small sketch of the method follows below).
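Since Croston's separation of demand sizes and inter-demand intervals is the starting point here, a minimal sketch of the classic Croston update, with the Syntetos-Boylan (SBA) bias correction mentioned in Section 2.1, may help fix the idea; the smoothing constant and the demand series are illustrative assumptions.

```python
def croston(demand, alpha=0.1, sba=True):
    """Croston's intermittent-demand forecast; optional Syntetos-Boylan correction."""
    size, interval, periods_since = None, None, 0
    forecasts = []
    for d in demand:
        periods_since += 1
        if d > 0:
            if size is None:                      # initialise on first demand
                size, interval = float(d), float(periods_since)
            else:                                 # smooth demand size and interval separately
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 0
        if size is None:
            forecasts.append(0.0)
        else:
            f = size / interval
            forecasts.append(f * (1 - alpha / 2) if sba else f)
    return forecasts

print(croston([0, 0, 3, 0, 0, 0, 5, 0, 2, 0]))
```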
The main difference is that the timeline of demand events combines the series of preventive maintenance dates for a specific piece of equipment and spare part type with the cumulative time series of unpredicted corrective maintenance events, which adjusts the previously demanded amount before the next scheduled preventive maintenance event, as shown in Figure 1 below:
Figure 1: Preventive (A), Corrective (B) and Combined (C) maintenance. Conceptual demand forecast chart for Spare Part Type 1 (D)
As a direct implication of the unpredicted corrective maintenance events, each new event requires the demand forecast to be adapted with new quantities. The following subsection describes the mathematical development of this time series.
3.1 Mathematical development of the spare parts inventory model
As represented in Figure 1 (A) and (B), each timeline diagram is a set of pairs (t_i, q_i) and (t′_i, q′_i), where t_i and t′_i represent planned preventive maintenance dates and unpredicted corrective maintenance events, and q_i and q′_i the amounts of spare parts planned/needed, respectively. Figure 1 (C) represents a combined timeline in which the maintenance events of (A) and (B) are sorted in order of occurrence. An additional auxiliary variable Δ_i is used to represent the time interval between an event (t_i) and the subsequent event (t_{i+1} or t′_{i+1}), which can be either preventive or corrective. For timeline consolidation there must be a last time event, here addressed as T, which represents the last preventive maintenance event of the machine that will require this spare part (or, if the demand forecast is estimated for a shorter horizon, the end of that horizon). This restriction is expressed by the equation

Σ_{i=0}^{n} Δ_i = Σ_{i=0}^{n} |t_i − t′_{i−1}| ∴ Σ_{i=0}^{n} |t_i − t′_{i−1}| ≤ T   (3)
The amounts of spare parts demanded, q_i and q′_i, for preventive and corrective maintenance events, respectively, and the stock levels (addressed in the following equations by the variable sp) are used to formulate the conceptual demand forecast. The initial stock level for each type of spare part (sp_0) is assumed in this paper to be an input provided by the company, chosen according to its own stock policies and limitations (warehouse capacity or size, upper budget limit on inventory cost, supplier lead times, internal or external logistics):
sp(t_i) = sp(t_{i−1}) + sp_e(t_i − t_{i−1}) + r_q(i) + sp_0   (4)
Equation (4) represents the demand forecast as a function of the preventive maintenance time series. As a conceptual output of the model, Figure 1 (D) presents a chart of the proposed theoretical demand forecast function.
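A literal reading of recursion (4) can be prototyped as below; the replenishment function sp_e and the corrective requirement r_q are hypothetical placeholders, since the paper treats these quantities as company inputs, and the repeated addition of sp_0 follows (4) verbatim.

```python
def stock_levels(t, sp0, sp_e, r_q):
    """Iterate sp(t_i) = sp(t_{i-1}) + sp_e(t_i - t_{i-1}) + r_q(i) + sp0 -- eq. (4)."""
    sp = [sp0]
    for i in range(1, len(t)):
        sp.append(sp[i - 1] + sp_e(t[i] - t[i - 1]) + r_q(i) + sp0)
    return sp

# hypothetical inputs: maintenance dates (days), initial stock, linear replenishment,
# and one corrective-maintenance withdrawal at event 3
dates = [0, 30, 60, 75, 90]
levels = stock_levels(dates, sp0=2,
                      sp_e=lambda dt: dt // 30,           # one part replenished per 30 days
                      r_q=lambda i: -1 if i == 3 else 0)  # extra part consumed at event 3
print(levels)
```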
3.2 Economic development of the spare parts inventory model
As previously mentioned in Section 1, the main goal of the proposed model is to help companies make smarter and quicker decisions in spare parts management, which leads to the main economic goal: cost reduction. In the mathematical model presented in the previous subsection, the initial stock levels, treated as given information, were obtained using the ABC curve of spare parts unitary costs and the ABC curve of spare parts suppliers' costs with lower lead times for process-critical machines. The decision to make is which demand forecast scenario delivers the minimum total cost, defined by the objective function Z = f(q_i, q′_i), where

Z_min = f(q_i, q′_i, k_q) = min { Σ_{i=0}^{n} k_q (q_i + q′_i), when q_i ≠ q′_i ;  ∫_{i=0}^{n} k_q q_i di, when q_i = q′_i }   (5)

As a direct implication of (5), it is also possible to determine the total spare parts inventory cost forecast, since (5) is defined for each machine and each type of spare part. Considering, conceptually, a group of 10 machines (auxiliary variable a) and 15 different types of spare parts suitable for all 10 machines (auxiliary variable b), a new objective function Y = f(Z) is obtained, where

Y_min = f(q_i, q′_i, k_q^{ab}) = min { Σ_{a=0}^{10} Σ_{b=0}^{15} Σ_{i=0}^{n} k_q^{ab} (q_i + q′_i), when q_i ≠ q′_i ;  ∫_{a=0}^{10} ∫_{b=0}^{15} ∫_{i=0}^{n} k_q^{ab} q_i da db di, when q_i = q′_i }   (6)
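The per-scenario cost comparison in (5) can be prototyped for a discrete event list as follows; the unit cost k_q and the quantity series are illustrative, and the integral branch of (5) is approximated here by a plain sum, which is an assumption of this sketch.

```python
def scenario_cost(q, q_corr, k_q):
    """Total spare-part cost of one forecast scenario in the spirit of eq. (5):
    corrective quantities are charged only where they differ from the plan."""
    total = 0.0
    for qi, qci in zip(q, q_corr):
        total += k_q * (qi + qci) if qi != qci else k_q * qi
    return total

# two hypothetical event series for one machine / spare part type (unit cost 120 CZK)
planned    = [1, 0, 2, 1, 1]
corrective = [1, 1, 2, 0, 1]
print("scenario cost:", scenario_cost(planned, corrective, k_q=120.0))
```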
3.3 Computer-Based VBA model
The proposed spare parts logistics model user interface was implemented in the VBA programming language, using the Solver tool and the NeuroXL neural network add-in for Microsoft Excel for the multicriteria decision part. It contains traceability information on each type of spare part, historical quality non-conformity data, supplier lead times and all input functions, allowing the users to generate reports.
Figure 2: Model user interface
In the next section, results regarding real data processing and analysis are presented.
4 Results and discussion
After tests and debugging, the reports from the proposed model provide optimized demand and cost information for each type of spare part on the preventive maintenance timeline, as shown in Figures 3 (E) and (F)
respectively, together with the total inventory cost for the chosen plant and the amount of spare parts needed; many other criteria and filters can be applied to the reports. This proves to be a powerful tool for time and cost savings and can be widely used in complex spare parts logistics maintenance systems.
Figure 3: Demand (E) and cost (F) forecast charts using company data
5 Conclusion
This paper presented a conceptual model for spare parts management in maintenance systems, showing its mathematical, economic and computational interfaces and working principles. It confirms how demanding and urgent the search for optimization and cost reduction in industrial processes is nowadays. Future work is possible: the results can be compared with existing methods for spare parts management, and there are also opportunities for further improvement of the model.
References
[1] Rêgo, J., and Mesquita, M.: Spare parts inventory control: a literature review. In: Production Journal '11, vol. 21, issue 4, São Paulo, 2011, 656–666.
[2] Waller, D.: Operations Management: A Supply Chain Approach. Thomson Learning, London, 2003.
[3] Bacchetti, A. and Saccani, N.: Spare parts classification and demand forecasting for stock control: Investigating the gap between research and practice, Omega 40 (2012), 722–737.
[4] Rego, J. S. and de Mesquita, M. A.: Demand forecasting and inventory control: A simulation study on automotive spare parts, International Journal of Production Economics 161 (2015), 1–16.
[5] Martínez-Jurado, P. J. and Moyano-Fuentes, J.: Lean Management, Supply Chain Management and Sustainability: A Literature Review, Journal of Cleaner Production 85 (2014), 134–150.
[6] Bacchetti, A. and Saccani, N.: Spare parts classification and inventory management: a case study. In: Syntetos Salford Business School Working Paper (WP) Series (2010), 1–36.
[7] Gan, S., Zhisheng Zhang, Y. and Jinfei, S.: Joint optimization of maintenance, buffer, and spare parts for a production system, Applied Mathematical Modelling 39 (2015), 6032–6042.
[8] Jouni, P., Huiskonen, J. and Pirttilä, T.: Improving global spare parts distribution chain performance through part categorization: a case study, International Journal of Production Economics 133 (2011), 164–171.
[9] Behfard, S., van der Heijden, M. C. and Zijm, W. H. M.: Last time buy and repair decisions for spare parts distributions on the line, European Journal of Operational Research 244 (2015), 498–510.
Optimization of the Transportation Plan with a Multicriterial Programming Method
Eva Šlaichová1, Eva Štichhauerová2, Magdalena Zbránková3
Abstract. Environmental policy and corporate social responsibility have become very important criteria in assessing the performance of enterprises. Transportation is the second largest source of greenhouse gas emissions, in particular CO2 and N2O, in the EU, and road transport is by far the largest emission source within the transport sector. This article deals with the effect of route choice decisions on vehicle energy consumption and emission rates for different vehicle types, using a multicriterial linear programming method. The purpose was to find the optimal transportation plan so as to simultaneously minimize the fuel consumption and the total CO2 emissions. The ecologic-economical model is tested on real data of a selected logistics enterprise. The paper also presents the general applicability of the multicriterial model. From the model results it cannot be implied that the amount of emissions produced is directly proportional to fuel consumption; rather the contrary.
Keywords: emission reduction, mathematical modeling, linear programming, operation analysis, sustainable development
JEL Classification: C02, Q52
AMS Classification: 90C05
Introduction
Greenhouse gas emissions and their component CO2 have a significant negative effect on human health and also negatively affect the economy of the country. The Czech Republic deals with this urgent problem as do many other countries in Europe and all over the world. With an increasing worldwide concern for the environment, logistics providers and freight carriers have started paying more attention to the negative externalities of their operations [4]. The question is whether there is a direct proportion between the amount of produced emissions and fuel consumption. If so, it would be possible to prove that both the environmental and the economic aspects are important during the provision of transport services. In a number of businesses, mainly small or medium-sized ones, the environmental performance is a result of economic management choices, with little or no regard to their environmental impact. Furthermore, companies are unlikely to invest in environmental issues unless they come under economic or legal pressure [8]. There is also an agreement that strategic decisions should have a larger impact on emissions than operative decisions. There is, however, a disagreement on which specific decisions have the largest impact, and what those decisions will really lead to regarding environmental impact [3]. Within these issues it is also possible to use methods of operations research, e.g. an adaptive evolution algorithm [7]. However, linear programming seems to be one of the most efficient techniques because it can be used to solve problems dealing also with environmental and ecological topics [9]. The authors of the article decided to use two simple assignment models with different objective functions. The next step was to compare the results with multi-objective (multi-criteria) linear programming with a weighted sum of these objectives. This article presents the results of a case study from a real logistics company (with modified data), focused on the comparison of two models: a model optimizing the transportation plan while minimizing CO2 emissions, and the same model minimizing fuel costs.
1 Faculty of Economics, Technical University of Liberec, Department of Business Administration and Management, Studentská 1402/2, Liberec 1, Czech Republic, eva.slaichova@tul.cz.
2 Faculty of Economics, Technical University of Liberec, Department of Business Administration and Management, Studentská 1402/2, Liberec 1, Czech Republic, eva.stichhauerova@tul.cz.
3 Faculty of Economics, Technical University of Liberec, Department of Business Administration and Management, Studentská 1402/2, Liberec 1, Czech Republic, magdalena.zbrankova@tul.cz.
1 Literature Review
Over the past two decades, many organizations have taken steps to integrate the principles of sustainability into their long- and short-term decision-making [1]. At the local and regional levels, a significant portion of freight transportation is carried out by trucks, which emit a large amount of pollutants. In the transportation sector, emissions are dominated by CO2 emissions from burning fossil fuels [4]. These cause atmospheric changes and climate disruptions which are harmful to the natural and built environments and pose health risks. Environmental issues and the inclusion of environmental strategies in strategic thinking are interesting subjects of investigation [13]. Already in 1995, Wu and Dunn discussed logistical decisions in light of their environmental impact [10]. Regarding transportation, they highlight three areas with high environmental impacts: the construction of transport networks related to infrastructure and inventory location, the operation of vehicles (including fuel choice), and the disposal of vehicles. As Oberhofer claims, more efficient and reduced use of vehicles, fuel switching, renewable electricity and fuels, and the use of hydrogen and biomass as transportation fuels should be considered as means of stabilization [8]. On the other hand, Wu and Dunn point out that companies must re-evaluate where facilities are located, whom they cooperate with, what technology is used and the whole logistics structure [10]. According to the lean management principles, environmentally friendly logistics structures are characterised by fewer movements, less handling, shorter transportation distances, more direct shipping routes and better utilisation. However, the main business objective is not the protection of the environment, and therefore companies strive from within to balance ecology and economy [11]. Therefore, as Ghobadian (2006) declares, green logistics initiatives tend to be applied in the freight transport sector where cost savings can be generated from green practices [5]. As Demir points out, the minimization of the total travelled distance has been accepted as the most important objective in the field of vehicle routing and freight transportation [4]. The number, quality and availability of truck models with lower emissions have increased considerably recently. Most studies in the field of green transportation have focused on a limited number of factors, mainly vehicle load and speed. Vehicle speed in particular is shown to be quite important; of course, the crucial point is to make the drivers drive with minimum fuel consumption for a given routing plan. Light vehicles should be preferred over medium-duty and heavy vehicles, and medium-duty vehicles should be preferred to heavy vehicles if possible. Another point is that a positive road gradient leads to an increase in fuel consumption. This should be taken into account especially in connection with GIS software, which can nowadays provide information on the road gradient. Besides CO2 emissions, pollutants, noise, accidents, environmental damage and other traffic externalities could be examined at the local and regional levels. A number of studies focus mainly on the routing aspect of green logistics [4]. However, other problems linked to routing offer further reductions in emissions. For example, facility location is concerned with locating a set of depots so as to maximize the profit generated by providing service to a set of customers; the relocation of a depot may lead to reductions in CO2 emissions [6].
Driver working hours should be considered, not only because of labour law requirements, but also in acknowledgement of the health hazards arising from the demanding workload of some transportation plans. As Zhang claims, to enhance the interaction between companies and investors for higher environmental goals, economic incentives for business-led voluntary initiatives should be used [12].
2 Optimization of the transportation plan while minimizing emissions of CO2
This part of the contribution contains information about the applied methodology, the structure of the model, and the data used in the case study.
2.1 Data and methodology
This contribution presents the results of a case study on the example of a logistics company. The aim is to compare three models: a model optimizing the transportation plan while minimizing CO2 emissions, the same model minimizing fuel costs, and a third model based on multi-criteria evaluation of both factors. The data were modified at the request of the company. For research and data collection, the interview method was used. Linear programming was applied to formulate the models because all the constraints can be described by linear equations and inequalities. Calculations were processed with the Solver tool provided by Microsoft Excel 2007.
2.2 The multi-objective assignment problem – the structure of the model
The goal is to minimize the total volume of CO2 emissions produced by the logistics company per day and, at the same time, the cost of daily fuel consumption, when assigning trucks to routes from the main depot to the target depots. The prerequisite is that each route is served exactly once per day. The multi-objective assignment problem can be formulated as follows:
Minimize z_1 = Σ_{i=1}^{m} Σ_{j=1}^{n} e_{ij} · x_{ij}   (1)

Minimize z_2 = Σ_{i=1}^{m} Σ_{j=1}^{n} c_{ij} · x_{ij}   (2)

x_{ij} ∈ {0, 1}   (3)

Σ_{j=1}^{n} x_{ij} ≤ 1   (4)

Σ_{i=1}^{m} x_{ij} = 1   (5)
i = 1, …, m; j = 1, …, n; m ≥ n, where (1) is the objective function determining the total CO2 emissions produced per day and (2) is the objective function determining the total cost of fuel consumption per day; both criteria are of the minimization type. The structural coefficient e_{ij} is the amount of CO2 emissions produced by the i-th vehicle on the route from the main depot to the j-th depot and back (on the j-th route). The structural coefficient c_{ij} is the cost of fuel consumed by the i-th vehicle on the j-th route. The logistics company has m trucks with which it serves n routes; the number of vehicles exceeds the number of routes. Theoretically there are m · n possible combinations of vehicle and route. Vehicles are assigned to routes so that each route is secured by exactly one vehicle and each vehicle is assigned to at most one route. The obligatory condition (3) indicates that the variable x_{ij} can take only two values; it equals 1 if car i is assigned to depot j and 0 otherwise. It is assumed that the number of vehicles in the fleet exceeds the number of routes that need to be serviced. At most one route between the main depot and a target depot can be assigned to each car, and the capacity of the fleet cannot be exceeded; thus, for the i-th car, no more than one of the variables x_{ij} may equal 1, and this must hold for all vehicles. These row constraints are given in (4). Each route between the main depot and a target depot must be served by exactly one car, i.e. at least one vehicle must be assigned to the route; thus, for the j-th route, exactly one of the variables x_{ij} must equal 1, and this must hold for all routes. These column constraints are given by the equation array (5). After applying the principle of aggregation to the objective functions (1) and (2) and transforming the original values of the structural coefficients e_{ij} and c_{ij}, the problem can be solved as an ordinary linear programme (6) under the conditions (3), (4) and (5):

Maximize z = Σ_{i=1}^{m} Σ_{j=1}^{n} e^T_{ij} · x_{ij} · v_1 + Σ_{i=1}^{m} Σ_{j=1}^{n} c^T_{ij} · x_{ij} · v_2,   e^T_{ij} ∈ [0, 1],  c^T_{ij} ∈ [0, 1]   (6)

where the weight v_1 expresses the relative importance of the criterion of the total volume of CO2 emissions, the weight v_2 expresses the relative importance of the criterion of the cost of the overall fuel consumption, and e^T_{ij} and c^T_{ij} are the transformed values of the original structural coefficients e_{ij} and c_{ij}.
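A weighted-sum assignment of this kind can be solved with a standard Hungarian-algorithm routine, as in the sketch below; the coefficient matrices and weights here are illustrative assumptions, not the paper's data, and the matrix is padded with zero-cost dummy routes so that surplus vehicles stay unassigned. (The paper maximises the transformed utilities in (6); minimising the weighted original, minimization-type criteria is the equivalent formulation used here.)

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign(e, c, v1=0.5, v2=0.5):
    """Weighted-sum assignment: minimise v1*emissions + v2*fuel cost.
    e, c: (vehicles x routes) matrices of normalised coefficients in [0, 1]."""
    cost = v1 * np.asarray(e, float) + v2 * np.asarray(c, float)
    m, n = cost.shape
    if m > n:                                 # pad with zero-cost dummy routes
        cost = np.hstack([cost, np.zeros((m, m - n))])
    rows, cols = linear_sum_assignment(cost)
    return [(i, j) for i, j in zip(rows, cols) if j < n]   # drop dummy assignments

# three vehicles, two routes -- purely illustrative normalised coefficients
e = [[0.2, 0.9], [0.4, 0.5], [0.8, 0.1]]
c = [[0.3, 0.8], [0.5, 0.4], [0.7, 0.2]]
print(assign(e, c))        # e.g. [(0, 0), (2, 1)]
```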
2.3 The application of the model to real data
The necessary information about the composition of the fleet of the logistics company is contained in Table 1. These are trucks with a maximum load volume between 15–19 m3. A limitation in the case of vehicle B13 is the maximum length of the route without refueling (CNG), which implies that it can be assigned only to routes within a distance of 400 km.
Identification Bi | Type of the truck | Year of production | Emissions CO2 [g/km] | Consumption of fuel [l/km]*
B1 | Renault Master 2.8 DTI | 1999 | 287 | 0.108
B2 | Fiat Ducato Maxi 2.8 JTD | 2002 | 235 | 0.072
B3 | Mercedes-Benz Sprinter 308 CDI | 2004 | 282 | 0.107
B4 | Fiat Ducato 33 Multijet 120 L3H2 | 2007 | 218 | 0.082
B5 | Mercedes-Benz Sprinter 315 CDI | 2009 | 252 | 0.104
B6 | Fiat Ducato 2.3 JTD 17H L4H3 Maxi | 2012 | 225 | 0.085
B7 | Renault Master 2.8 DTi | 2012 | 215 | 0.081
B8 | Iveco Daily 35S15V Euro 5 | 2013 | 187 | 0.094
B9 | Mercedes-Benz Sprinter 319 CDI | 2014 | 205 | 0.078
B10 | Mercedes-Benz Sprinter 313 D Blue Efficiency + | 2014 | 195 | 0.075
B11 | Mercedes-Benz 319 CDI | 2014 | 207 | 0.080
B12 | VW LT35 - MAXI JUMBO | 2003 | 254 | 0.103
B13 | Fiat Ducato 3.0 Natural Power CNG L2H2 | 2011 | 239 | 0.088**
* estimated combined consumption in litres per 1 km
** consumption of CNG gas in kg per 1 km
Table 1 Volume of CO2 emissions and consumption of fuel by type of truck

Final Depot | Route Aj | Length of the route [km]
Ústí nad Labem | A1 | 214
Olomouc | A2 | 526
Hradec Králové | A3 | 250
Ostrava | A4 | 708
Brno | A5 | 376
Plzeň | A6 | 224
České Budějovice | A7 | 268
Liberec | A8 | 242
Chodov (KV) | A9 | 324
Jihlava | A10 | 226
Velké Přílepy + Praha 10* | A11 | 126 (i.e. 92+34)
* two routes are secured by one vehicle
Table 2 Length of the routes between the main depot in Modletice and the final depots, return journey included

The structural coefficients e_{ij} of the objective function z_1 (1) were obtained by multiplying the volume of CO2 emissions of the i-th vehicle in g/km (see Table 1) by the length of the route between the main depot and the j-th depot in km (see Table 2). Because the fleet is made up of diesel trucks as well as one CNG-consuming vehicle, consumption was converted into monetary terms. For this purpose, the price of 1 litre of diesel was set at CZK 32 and the price of 1 kg of CNG gas at CZK 26. The structural coefficients c_{ij} of the objective function z_2 (2) were obtained by multiplying the monetary value of the fuel consumption of the i-th vehicle in CZK/km (see Table 1) by the length of the j-th route in km (see Table 2). An exception in the matrix is formed by the coefficients e_{13,2} and e_{13,4}. To avoid assigning vehicle B13 to the routes to the depots Olomouc and Ostrava, which are more than 400 km away from the main depot in Modletice, the authors (Burkard et al., 2012) recommend replacing the original value with an extraordinarily high value – here 999,999 (g or CZK). In the case of multi-criteria evaluation, which works with transformed values, the structural coefficients are additionally rescaled to the range from 0 to 1. In this case study, the question of determining the exact weights of the criteria has not been addressed. Optimization was performed for the situation when both criteria are considered equally important (i.e. multi-criteria evaluation with equal weights, v_1 = v_2 = 0.5), and for the extreme situations where it
is important to consider only one criterion or only the other one (i.e. monocriterial optimization, where v_1 = 1 and v_2 = 0, and vice versa).
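The construction of the structural coefficients described in Section 2.3 can be sketched as follows for a small subset of the fleet; the emission and consumption figures are taken from Tables 1 and 2, the fuel prices are those stated in the text, and the big-M value 999,999 blocks the CNG vehicle B13 on routes longer than 400 km. (The subset of three vehicles and three routes is an assumption made to keep the sketch short.)

```python
# (vehicle id, CO2 g/km, consumption per km, fuel price CZK per litre or kg, CNG?)
vehicles = [("B2", 235, 0.072, 32, False),
            ("B10", 195, 0.075, 32, False),
            ("B13", 239, 0.088, 26, True)]       # CNG, max 400 km without refuelling
routes = [("A2 Olomouc", 526), ("A5 Brno", 376), ("A6 Plzen", 224)]

BIG_M = 999_999
e, c = [], []                                    # emissions [g], fuel cost [CZK]
for _, co2, cons, price, is_cng in vehicles:
    e_row, c_row = [], []
    for _, km in routes:
        blocked = is_cng and km > 400
        e_row.append(BIG_M if blocked else co2 * km)
        c_row.append(BIG_M if blocked else round(cons * price * km, 1))
    e.append(e_row)
    c.append(c_row)

for row_e, row_c in zip(e, c):
    print(row_e, row_c)
```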
3 Discussion of the results
The total cost of fuel consumption per day in the situation where the vehicles are assigned to routes with respect to the requirement of a minimum total volume of produced CO2 is CZK 9,329, which is 4.29 % (CZK 384) more than in the cost-optimal combination. The total daily volume of CO2 emissions in the situation where the company respects the requirement of minimum consumption costs is 766,062 g, which is about 3.77 % (27,800 g) more than in the combination of vehicles minimizing CO2 emissions. If the vehicles are assigned to routes based on concurrent optimization of the two criteria, the total daily volume of CO2 emissions is 749,190 g and the total cost of fuel consumption is CZK 9,031 per day.

Destination Depot | Route | Cars assigned to minimize: total CO2 emissions per day | total cost of the consumption per day | both objectives together
Ústí nad Labem | A1 | B5 | B12 | B12
Olomouc | A2 | B10 | B10 | B9
Hradec Králové | A3 | B4 | B7 | B7
Ostrava | A4 | B8 | B2 | B10
Brno | A5 | B9 | B13 | B11
Plzeň | A6 | B13 | B8 | B6
České Budějovice | A7 | B7 | B11 | B13
Liberec | A8 | B6 | B4 | B4
Chodov (KV) | A9 | B11 | B9 | B2
Jihlava | A10 | B2 | B6 | B8
Velké Přílepy + Praha 10 | A11 | B12 | B5 | B5
(unused car) | | B1 | B1 | B1
(unused car) | | B3 | B3 | B3
The total volume of CO2 emissions per day | | 738,262 g | 766,062 g | 749,190 g
The total cost for fuel consumption per day | | CZK 9,329 | CZK 8,945 | CZK 9,031
Table 3 Comparison of the models

The result of this article is an optimization of the transportation plan with regard to important side effects, namely CO2 emissions and fuel consumption. The contributions of this article are the description of a model incorporating the factors of fuel consumption and CO2 emissions into existing planning methods for vehicle routing, and the introduction of a multicriterial integer programming formulation that minimizes fuel consumption. A possible limitation of the research is that the model does not include labor costs and that emission costs are not expressed as a function of load, speed and other parameters. The analysis of the model based on real data leads to the following important conclusions. When minimizing emissions from the cost perspective, other factors, e.g. labor costs, should also be taken into account. Another point is that the use of fewer vehicles generally implies lower fuel consumption and a higher utilization rate of the vehicle capacity. Heavy vehicles typically generate more emissions at low speed than medium-duty vehicles; medium-duty vehicles produce significantly lower emissions under the same operational conditions, which indicates that more low-emitting vehicles should be assigned to lower-speed routes. It can also be concluded that significant improvements in air quality and energy consumption are achievable by educating drivers [2]. Economical driving carries even further than fuel economy: it is possible to calculate the savings on brakes, tires, clutch, etc. A safe and defensive driving style, which is linked to the driver's ability to anticipate and deal with crisis situations, or to avoid them altogether, can translate into lower accident rates and other related benefits, e.g. insurance and liability. It follows that drivers should learn the
rules of economical and defensive driving and should be encouraged to respect them. If fuel savings are achieved, a financial reward of a certain amount can be offered to them for certain periods. From the model results it cannot be implied that the amount of emissions produced is directly proportional to fuel consumption; rather the contrary. Nevertheless, according to the results, a positive relationship can be found between low emissions and costs. If the company prefers vehicles that have lower emissions for longer transport routes, it is possible to reduce overall emissions and thereby the carbon footprint. As indicated by the results, the vehicles that were not deployed for transportation did not meet the requirements of EURO II and III. Those vehicles were produced before 2004 and their emissions are among the highest (more than 280 g per kilometer). At the same time, these vehicles have the highest consumption (Table 1), so even when selecting the vehicles to minimize the total cost of fuel consumption, these vehicles would not be put into operation either. From this perspective, it is worth considering a further economic evaluation of whether it would be economically and environmentally efficient to replace the "outdated" vehicles with new ones that fulfil the stricter EURO VI emission standards for pollutants.
Conclusion
Road freight transportation is essential for economic development, but its CO2 emissions are harmful to the environment and to human health. For many logistics companies the planning of transportation activities has mainly focused on cost minimization. However, by minimizing high-emitting driving behavior, air quality can be improved significantly. The article has compared models that were developed in order to optimize the fuel consumption and greenhouse gas emissions associated with road freight transportation. Data from the selected logistics company were used when comparing the model results.
References
[1] Ahi, P.: An analysis of metrics used to measure performance in green and sustainable supply chains. Journal of Cleaner Production 86 (2015), 360–377.
[2] Ahn, K. and Rakha, H.: The effects of route choice decisions on vehicle energy consumption and emissions. Transportation Research, Part D 13, 3 (2008), 151–167.
[3] Aronsson, H. and Brodin, M. H.: The environmental impact of changing logistics structures. International Journal of Logistics Management 1, 3 (2006), 394–415.
[4] Demir, E., et al.: A review of recent research on green road freight transportation. European Journal of Operational Research 237, 3 (2014), 775–793.
[5] Ghobadian, A.: In search of the drivers of high growth in manufacturing SMEs. Technovation 26, 1 (2006), 30–41.
[6] Harris, I. and Naim, M.: Assessing the Impact of Cost Optimization Based on Infrastructure Modelling on CO2 Emissions. International Journal of Production Economics 131, 1 (2011), 313–321.
[7] Koblasa, F. and Manlig, F.: Application of Adaptive Evolution Algorithm on real-world Flexible Job Shop Scheduling Problems. In: Proceedings of the 32nd International Conference on Mathematical Methods in Economics (Talašová, J., Stoklasa, J., Talášek, T., eds.). Olomouc, 2014, 425–430.
[8] Oberhofer, P., et al.: Sustainable development in the transport sector: Influencing environmental behaviour and performance. Business Strategy and the Environment 22, 6 (2013), 374–383.
[9] Štichhauerová, E., et al.: Use of linear programming method to constructing a model for reduction of emission in a selected company. In: Proceedings of the 32nd International Conference on Mathematical Methods in Economics (Martinčík, D., Ircingová, J., Janeček, P., eds.). Cheb, 2015, 805–811.
[10] Wu, H. and Dunn, S.: Environmentally responsible logistics systems. International Journal of Physical Distribution & Logistics Management 25, 2 (1995), 20–38.
[11] Wygonik, E. and Goodchild, A.: Evaluating CO2 emissions, cost, and service quality trade-offs in an urban delivery system case study. IATSS Research 35, 1 (2011), 7–15.
[12] Zhang, Y. R.: Analyzing the Promoting Factors for Adopting Green Logistics Practices: A Case Study of Road Freight Industry in Nanjing, China. Procedia - Social and Behavioral Sciences 125 (2014), 432–444.
[13] Žabkar, V., et al.: Environmental Strategy: A Typology Of Companies Based On Managerial Perceptions Of Customers' Environmental Activeness And Deterrents. E+M Economics and Management 16, 3 (2013), 57–74.
Similarity of Stock Market Series Analyzed in Hilbert Space
Jakub Šnor1
Abstract. Time series from stock markets exhibit various types of fluctuations and can be studied as samples from an unknown n-dimensional distribution. The main question behind the paper is how to recognize similar stocks or similar time periods in their history. Whenever the current stock segment is similar to a former one of the same stock or of another stock, the investment strategy should also be similar. The similarity measure is built in a Hilbert space of infinite dimension as follows. The Parzen estimate of the probability density function is used to obtain scalar products of pairs of stock segments. This product and the corresponding metric are expressed analytically as a double sum over segment items, which enables the use of Principal Component Analysis, Cluster Analysis and Self-Organized Maps for the description of stock similarities and differences. The mathematical formulation of the similarity task and the analytic calculations are results of my original research work. The general principle is demonstrated on real stocks of leading IT companies in the period 2009–2015. Both PCA and SOM enable the visualization of the similarities in the infinite-dimensional Hilbert space. All the calculations were performed in the Matlab environment.
Keywords: Hilbert space, self-organization, Parzen estimate, stock market, time series, market similarities.
JEL classification: G11
AMS classification: 91B84
1 Introduction
Every decision process is based on knowledge of the actual state of a given object. Object similarities can help to improve the quality of decisions. In the particular case of stock market operations, the time series of price history are a good source of supporting information. Using sliding windows we can convert the problem to the investigation of similarities between statistical samples. The resulting task is complex, but it can be solved using the Parzen estimate [7] of the probability density function and data processing in Hilbert space [13]. My original research work consists of a novel method for calculating the dot product of probability density function estimates based on statistical sample analysis. It is useful not only for distance and similarity measurement but also for self-organization. This paper is a continuation of my previous work [10], which was focused on Kernel Principal Component Analysis (KPCA), Cluster Analysis (CA) and Self-Organized Mapping (SOM) in a Hilbert space of infinite dimension.
2 Sample Description
Let M ∈ N be the length of a sequence and a_n > 0 the value of the sequence for n = 1, …, M. Then the sequence is traditionally described as {a_n}_{n=1}^{M}. The sequence can also be described by the logarithmic differences of neighboring values, defined as

z_k = ln(a_{k+1} / a_k)   (1)

for k = 1, …, M − 1.
1 Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Břehová 7, Praha 1, 115 19, Czech Republic, snorjaku@fjfi.cvut.cz
The sequence can also be cut into smaller sub-sequences. We can then apply a sliding window of given length l to {z_k}_{k=1}^{M−1} and obtain vectors x_k = (z_k, …, z_{k+l−1}) for k = 1, …, M − l. A segment can be described as a statistical sample X = (x_1, …, x_N), where N ∈ N, N = M − l (a small preprocessing sketch is given below). In this paper, we analyze the stocks of different IT companies. The main goal is to compare the stock states of various titles. The subjects of comparison are two statistical samples x and y. We can compare:
• two segments of different length of the same stock;
• two segments of different length of different stocks;
• two segments of different periods of the same stock;
• two segments of different periods of different stocks.
The comparison of two segments of different stocks may be used to identify stocks whose values are developing at a similar rate. Analyzing two segments of different length may serve to discover two stocks that are behaving similarly but at different speeds (for example, the mobile industry is growing much faster than the computer industry was one decade ago). Studying a stock in different time periods will most probably discover stocks with some seasonal development, or it may help with predicting how one stock will develop based on the previous development of the same stock or a different one. As a special case we may analyze the full period of two different stocks as well.
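The preprocessing just described is a few lines of code; in the sketch below the price series and window length are illustrative.

```python
import numpy as np

def log_differences(prices):
    """z_k = ln(a_{k+1} / a_k) for a positive price sequence -- eq. (1)."""
    a = np.asarray(prices, dtype=float)
    return np.log(a[1:] / a[:-1])

def sliding_windows(z, length):
    """Vectors x_k = (z_k, ..., z_{k+l-1}) forming the statistical sample X."""
    return np.array([z[k:k + length] for k in range(len(z) - length + 1)])

prices = [100, 102, 101, 105, 107, 104, 108]     # illustrative stock prices
z = log_differences(prices)
X = sliding_windows(z, length=3)
print(X.shape)   # (4, 3): N = M - l samples of dimension l
```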
3 Parzen Estimate in Hilbert Space
Let N ∈ N be the number of observations and h > 0 the bandwidth parameter. Let ||x|| be the Euclidean norm of a sample x of dimension n ∈ N. Then the Parzen estimate [7] is defined as

f(x) = 1 / ((2π)^{n/2} h^n N) · Σ_{k=1}^{N} exp(−||x − x_k||² / (2h²))   (2)

This estimate is consistent for

lim_{N→∞} h = 0   (3)

lim_{N→∞} N h^n = +∞   (4)

Traditionally, the parameter h is chosen according to the number of observations N. We set

h = h_0 N^{−α}   (5)

for h_0 > 0 and α ∈ (0, n^{−1}). The investigation in Hilbert space is based on the existence of a dot product. In our case we calculate the dot product of probability density functions as

(f|g) = ∫_{R^n} f(x) g(x) dx.   (6)

Using the Parzen estimates of the densities f, g built from samples x, y of lengths N_f, N_g we obtain

(f|g) = 1 / ((2π)^n h_f^n h_g^n N_f N_g) · Σ_{k=1}^{N_f} Σ_{j=1}^{N_g} ∫_{R^n} exp(−||x − x_k||² / (2h_f²)) · exp(−||x − y_j||² / (2h_g²)) dx   (7)

where h_f = h_0 N_f^{−α}, h_g = h_0 N_g^{−α}. After the integration in R^n the resulting formula is obtained as

(f|g) = 1 / ((2π)^{n/2} σ^n N_f N_g) · Σ_{k=1}^{N_f} Σ_{j=1}^{N_g} exp(−||x_k − y_j||² / (2σ²))   (8)

where

σ = h_0 · sqrt(N_f^{−2α} + N_g^{−2α})   (9)

In this case α = 1/(2n) is recommended, in the middle of its acceptable range. The parameter h_0 will be a subject of investigation. The product (f|g) of densities f(x), g(x) in the infinite-dimensional Hilbert space will be used directly for the PCA calculations, and the adequate metric [10]

d(f, g) = sqrt((f|f) + (g|g) − 2(f|g))   (10)

is used as the distance in cluster analysis and SOM learning.
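Formulas (8)–(10) translate directly into code; the sketch below computes the scalar product of two samples and the induced distance, with α = 1/(2n) as recommended above, while h_0 and the sample data are purely illustrative assumptions.

```python
import numpy as np

def dot_product(x, y, h0=0.05):
    """Scalar product (f|g) of two Parzen estimates via the double sum (8)."""
    x, y = np.atleast_2d(x), np.atleast_2d(y)
    n = x.shape[1]
    alpha = 1.0 / (2 * n)
    nf, ng = len(x), len(y)
    sigma = h0 * np.sqrt(nf ** (-2 * alpha) + ng ** (-2 * alpha))      # eq. (9)
    sq_dist = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    coeff = 1.0 / ((2 * np.pi) ** (n / 2) * sigma ** n * nf * ng)
    return coeff * np.exp(-sq_dist / (2 * sigma ** 2)).sum()

def distance(x, y, h0=0.05):
    """Metric (10): d(f, g) = sqrt((f|f) + (g|g) - 2 (f|g))."""
    return np.sqrt(dot_product(x, x, h0) + dot_product(y, y, h0)
                   - 2 * dot_product(x, y, h0))

rng = np.random.default_rng(0)
x = rng.normal(0.000, 0.01, (50, 10))     # two illustrative samples of window vectors
y = rng.normal(0.002, 0.01, (60, 10))
print(distance(x, y))
```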
4 Cluster Analysis in Hilbert Space
Let M, H ∈ N be the number of patterns from the Hilbert space and the given number of clusters. The pattern set is S = {x_k ∈ H : k = 1, …, M}. Traditional batch clustering [3] is driven by a partition vector [4] p = (p_1, …, p_M) ∈ {1, …, H}^M. Cluster C_j ⊂ S is defined as C_j = {x_k ∈ S : p_k = j}. The clusters C_1, …, C_H represent a partition of S, as is trivial to verify. For a non-empty cluster C_j we define the weight of pattern k as w_k = 1/card(C_j). The partition quality Q(p) is defined [12] as

Q(p) = Σ_{j=1}^{H} Σ_{k∈C_j} ||x_k − t_j||²   (11)

where t_j is the centroid of C_j. After a small rearrangement we directly obtain

Q(p) = Σ_{j=1}^{H} card(C_j) Σ_{k∈C_j} w_{j,k} ||x_k − t_j||² = Σ_{j=1}^{H} q_j   (12)

where q_j is the quality of cluster C_j, calculated according to [10] as

q_j = card(C_j) Σ_{k=2}^{card(C_j)} Σ_{l=1}^{k−1} w_{j,k} w_{j,l} ||x_k − x_l||².   (13)
The general aim of batch cluster analysis is to obtain an optimum clustering (partition) driven by a partition vector p_opt ∈ argmin Q(p). According to [1], optimum clustering is an NP-hard task in Euclidean space, which is a special case of a finite Hilbert space. Therefore, optimum clustering in Hilbert space is also NP-hard. In this case, the trivial defense against resignation is the application of an optimization heuristic (K-means [12], Simulated Annealing (SA) [2], Fast Simulated Annealing (FSA) [9], Integer Cuckoo Search (ICS) [5] based on integer Lévy flights [11], Steepest Descent (SD) [8] or Random Descent (RD) [6]).
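The point of the rearranged form (12)–(13) is that only pairwise distances between patterns are needed, never explicit centroids; the sketch below evaluates Q(p) for an arbitrary distance function (any metric, e.g. the Hilbert-space metric (10)), with the toy one-dimensional data serving only as an illustration.

```python
def cluster_quality(patterns, labels, n_clusters, dist):
    """Q(p) = sum of cluster qualities q_j from pairwise distances, eqs. (12)-(13)."""
    total = 0.0
    for j in range(n_clusters):
        members = [p for p, lab in zip(patterns, labels) if lab == j]
        m = len(members)
        if m == 0:
            continue
        w = 1.0 / m                                   # uniform weights w_{j,k} = 1/card(C_j)
        q_j = m * sum(w * w * dist(members[k], members[l]) ** 2
                      for k in range(1, m) for l in range(k))
        total += q_j
    return total

# toy example with one-dimensional patterns and the absolute-value distance
patterns = [0.1, 0.2, 0.15, 0.9, 1.0]
labels   = [0, 0, 0, 1, 1]
print(cluster_quality(patterns, labels, 2, dist=lambda a, b: abs(a - b)))
```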
5 Self-Organized Mapping in Hilbert Space
The SOM structure and learning strategies were widely discussed in [3]. We focus on batch learning, and our main motivation is to adapt the main principles of SOM learning in metric space [4] to a general Hilbert space. Let N be a non-empty set of vertices (nodes). Let

R ⊂ \binom{N}{2}   (14)

be a set of edges (relations). Then the pair G = ⟨N, R⟩ is an undirected connected graph which typically represents the topology of the SOM [3]. The number of output nodes is H = card(N). Let d* : N × N → N_0 be the vertex distance in the graph of the SOM. Then M = ⟨N, d*⟩ is a metric space over the SOM. The maximum value of the vertex distance over N × N is called the diameter D of the SOM. The SOM in Hilbert space is a function SOM : S → N, while SOM learning is an algorithm for the design of the SOM function according to the set of patterns S [4].
Batch learning of a SOM in the Hilbert space H is also driven by a vector p = (p_1, …, p_M) ∈ {1, …, H}^M which places patterns from S ⊂ H into nodes from N of the given SOM graph. The novel penalization strategy is based on weighted centroids around individual SOM nodes and the corresponding centroid quality. When a pattern is placed into the investigated node, its weight is the maximum possible, but when it is placed in the node's neighborhood, the weight is a function of d* according to the formula

w_{i,k} = χ(d*(i, p_{x_k})) / Σ_{j=1}^{M} χ(d*(i, p_{x_j}))   (15)

where the characteristic χ : {0, …, D} → R^+ is a non-increasing function satisfying χ(0) = 1, χ(D) ∈ (0, 1). In the case of batch SOM learning [3], only direct neighbors are involved in the centroid calculations, with full weight; therefore, χ(d*) = 1 for d* ≤ 1 and χ(d*) = 0 otherwise. Other characteristics are also useful:
• rectangular, with characteristic χ(d*) = I(d* ≤ R),
• Gaussian, with characteristic χ(d*) = exp(−(1/2)(d*/R)²),
• exponential, with characteristic χ(d*) = exp(−d*/R),
• Cauchian, with characteristic χ(d*) = (1 + (d*/R)²)^{−1},
where R > 0 is the learning radius. The Gaussian characteristic is frequently used in traditional Kohonen SOM learning and should be preferred also in this method. As seen from (15), the weights w_{i,k} are well defined and positive, satisfying

Σ_{k=1}^{M} w_{i,k} = 1, for all i = 1, …, H   (16)

which represents the node views of the given data set S. The SOM quality is then defined as a sum of node-view penalizations according to the formula

S(p) = Σ_{i=1}^{H} s_i   (17)
where s_i is the "cluster" quality with weights w_{i,k} for the given node i according to (13). The main difference between cluster analysis and SOM in Hilbert space is the interaction among patterns from various nodes. A SOM partition is formally the same as a clustering partition, but the influence of any pattern overcomes its node limitations. Therefore, SOM learning is simply the integer minimization of S(p) for a given pattern set S.
This task is similar to optimum clustering. The general integer minimization heuristics (SA, FSA, ICS, SD, RD) can be employed again. An alternative way is to modify the K-means heuristic without the necessity of node centroid calculations. The novel heuristic of SOM learning in Hilbert space begins with a random partition p. The main loop performs SOM partition revisions as long as the penalization S(p) decreases, as follows: for every pattern x_k ∈ S and every node i we calculate d(x_k, t_i), where t_i is the hidden centroid of C_i around node i. The new pattern position in the SOM is then given by the formula p_k ∈ argmin_{i=1,…,H} d(x_k, t_i).
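The node-view weights (15) and the SOM quality (17) can be sketched as below; the graph distance d* and the Gaussian characteristic with radius R follow the text, the pattern metric is arbitrary, and in the node-view adaptation of (13) the cardinality factor is taken as the total number of patterns, which is an assumption of this sketch.

```python
import math

def node_weights(node, labels, graph_dist, chi):
    """w_{i,k} of eq. (15): normalised characteristic of the graph distance
    between node i and the node holding pattern k."""
    raw = [chi(graph_dist(node, lab)) for lab in labels]
    s = sum(raw)
    return [r / s for r in raw]

def som_quality(patterns, labels, n_nodes, graph_dist, chi, dist):
    """S(p) = sum of node-view penalisations s_i, in the spirit of (13) and (17)."""
    total = 0.0
    m = len(patterns)
    for i in range(n_nodes):
        w = node_weights(i, labels, graph_dist, chi)
        total += m * sum(w[k] * w[l] * dist(patterns[k], patterns[l]) ** 2
                         for k in range(1, m) for l in range(k))
    return total

# toy example: 3 nodes on a path graph, Gaussian characteristic with R = 1.5
gauss = lambda d: math.exp(-0.5 * (d / 1.5) ** 2)
path_dist = lambda i, j: abs(i - j)
patterns, labels = [0.1, 0.2, 0.9, 1.0], [0, 0, 2, 2]
print(som_quality(patterns, labels, 3, path_dist, gauss, dist=lambda a, b: abs(a - b)))
```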
6 Application to Stock Indexes
The proposed method was tested on stock market data1 from 2nd January 2009 to 22nd March 2016 for 10 popular IT companies: Adobe Systems (ADO), Apple Computer (APP), Amazon.com (AMA), Advanced Micro Devices Inc. (AMD), Autodesk Inc. (AUT), Cisco Systems Inc. (CIS), Computer Sciences Corporation (CSC), Intel (INT), Microsoft (MIC) and Yahoo! (YAH). Each time series contained 1824 values. For each pair of neighboring values the logarithmic difference was computed according to (1), reducing the length of the series by one, and then a sliding window of length 10 was applied. After this data preprocessing, the scalar product of each pair of time series was computed.
Figure 1 Kernel Principal Component Analysis and Cluster Analysis markers of stocks (scatter of the ten stock titles in the plane of the first two principal components, PCA1 and PCA2)
Kernel Principal Component Analysis with the kernel K(x, y) = (x|y) was used as a referential method and its results were mapped into two-dimensional space. Two dimensions were chosen for easy comparison with the planar SOM. The results of the 2D KPCA are depicted in Fig. 1. Cluster analysis into three clusters was used as the second referential method. The steepest descent heuristic with a random initial point was used for finding the minimum of (12). The stocks ADO, AMA, AUT, and YAH formed the first cluster, shown as black dots in Fig. 1. The second cluster consists of the stocks APP, CIS, CSC, INT and MIC, shown as dark grey dots. The third cluster contained only the stock AMD, shown as a light grey point in the same image. The stock market data were finally self-organized by the presented algorithm and compared with the referential methods. Due to the small number of stock titles, a hexagonal SOM topology with seven nodes and the traditional Gaussian repulsion function with the radii R = 1.5 and R = 7.5 were applied to the same data. The steepest descent heuristic was again used for the minimization of (17). The results are depicted in Fig. 2. As seen, the smaller repulsion brought results similar to KPCA, while the higher repulsion formed a structure similar to CA. Therefore, the novel method is able to make a compromise between the two traditional analytic approaches.
1 http://www.wessa.net/stocksdata.wasp
Figure 2 Self-Organized Mapping for R = 1.5 (left) and R = 7.5 (right)
7 Conclusion
A novel method of SOM in Hilbert space was used for the self-organization of statistical samples. This technique enables the analysis of stock market states, as demonstrated on IT company stocks. The novel method is a compromise between Principal Component Analysis and Cluster Analysis. The proposed method, together with the hexagonal topology of the SOM, is a general tool for stock market analysis, but the choice of the pre-processing parameters l, σ and of the SOM parameters R, H is essential and will be the subject of future research.
References
[1] Garey, M. R., and Johnson, D. S.: Computers and intractability: a guide to the theory of NP-completeness. W. H. Freeman and Company, New York, 1997.
[2] Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P.: Optimization by Simulated Annealing. Science 4598 (1983), 671–680.
[3] Kohonen, T.: Self-Organizing Maps. Springer Verlag, New York, 1995.
[4] Kukal, J.: SOM in Metric Space. Neural Network World. Prague, 2004, 469–488.
[5] Kukal, J., Mojzeš, M., Tran, Q. V., and Boštík, J.: Integer Cuckoo Search. In: Proceedings of the 18th International Conference on Soft Computing MENDEL 2012. VUT Press, Brno, 2012, 298–303.
[6] Nesterov, Y.: Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems. SIAM Journal on Optimization 2 (2012), 341–362.
[7] Parzen, E.: On Estimation of a Probability Density Function and Mode. The Annals of Mathematical Statistics 3 (1962), 1065–1076.
[8] Snyman, J. A.: Practical mathematical optimization: an introduction to basic optimization theory and classical and new gradient-based algorithms. Springer, New York, 2005.
[9] Szu, H., and Hartley, R.: Fast simulated annealing. Physics Letters A 3-4 (1987), 157–162.
[10] Šnor, J., and Kukal, J.: Self-organized mapping in Hilbert Space. Kybernetika (submitted). Prague, 2015.
[11] Viswanathan, G. M.: Lévy flights and Superdiffusion in the Context of Biological Encounters and Random Searchers. Physics of Life Reviews 3 (2008), 133–150.
[12] Xu, R., and Wunsch, D. C.: Clustering. IEEE Press, Piscataway, NJ, 2009.
[13] Young, N.: An introduction to Hilbert space. Cambridge University Press, New York, 1988.
The role of distance and similarity in Bonissone's linguistic approximation method – a numerical study
Tomáš Talášek1, Jan Stoklasa2, Jana Talašová3
Abstract. Linguistic approximation is a common way of translating the outputs of mathematical models into expressions of common language. These can then be presented, as an easy-to-understand alternative, to decision makers who have difficulties with the interpretation of numerical outputs of formal models. Linguistic approximation is a tool to stress, modify or effectively convey meaning. As such it is an important yet neglected area of research in management science and decision support. During the last forty years a large number of different methods for linguistic approximation were proposed. In this paper we investigate in detail the linguistic approximation method proposed by Bonissone (1979). We focus on its performance under different "fit" measures in its second step – we consider various distance and similarity measures of fuzzy sets to choose the most appropriate linguistic approximation. We conduct a numerical study of the performance of this linguistic approximation method, present its results and discuss the impact of a particular choice of a "fit" measure.
Keywords: Linguistic approximation, two-step method, fuzzy number, distance, similarity.
JEL classification: C44
AMS classification: 90B50, 91B06, 91B74
1 Introduction
In practical applications of decision support models that employ fuzzy sets it is often necessary to be able to assign a linguistic label (from a predefined linguistic scale) to a fuzzy set (usually obtained as an output of some mathematical model). This process is called linguistic approximation. The main reason for applying linguistic approximation is to "translate" (abstract/formal) mathematical objects into the common language (recent research also suggests that the ideas of linguistic approximation can be used e.g. for ordering purposes - see [7]). This way the outputs of mathematical models can become easier to understand and use for the decision-makers. The process of linguistic approximation involves the selection of the best fitting linguistic term from a predefined term set as a representative of the given mathematical object (fuzzy set). Obviously, since the set of linguistic terms is finite (and usually contains only a few linguistic terms), the process distorts the actual output of the mathematical model to some extent (it adds or decreases uncertainty, shifts the meaning in the given context etc. - hence approximation). The key to a successful linguistic approximation is to find an appropriate tradeoff between understandability and loss (distortion) of information (see e.g. [10, 6]). Linguistic approximation relies in many cases on the distance and similarity of fuzzy sets, on the subsethood and on the differences in relevant features of the output to be approximated and the meaning of its approximating linguistic term. In this paper we focus on Bonissone's two-step method for linguistic approximation [1], since it combines the idea of semantic similarity with the requirement of closeness of meaning. In the first step, the method preselects a given number of linguistic terms that embody the semantic best fit (based on a specified set of features). In the second step the linguistic term whose meaning is the closest based on some distance/similarity measure is selected. We investigate the role of the distance/similarity measure in the
1 Palacký University, Olomouc, Křížkovského 8, Olomouc, Czech Republic and Lappeenranta University of Technology, Skinnarilankatu 34, Lappeenranta, Finland, tomas.talasek@upol.cz
2 Palacký University, Olomouc, Křížkovského 8, Olomouc, Czech Republic, jan.stoklasa@upol.cz
3 Palacký University, Olomouc, Křížkovského 8, Olomouc, Czech Republic, jana.talasova@upol.cz
second step of this method, and in a numerical study we compare the performance of the Bhattacharyya distance suggested by Bonissone with the dissemblance index distance measure and two fuzzy similarity measures. Based on the results of the numerical study we analyze which of the features employed in the first step (namely position and uncertainty – the same features employed by Wenstøp [9] in his method) are emphasized and which are distorted by each of the distance/similarity measures. This way the paper strives to contribute to the scarce body of research on good practices in linguistic approximation.
2 Preliminaries
Let U be a nonempty set (the universe of discourse). A fuzzy set A on U is defined by the mapping A : U → [0, 1]. For each x ∈ U the value A(x) is called the membership degree of the element x in the fuzzy set A and A(.) is called the membership function of the fuzzy set A. Ker(A) = {x ∈ U | A(x) = 1} denotes the kernel of A, A_α = {x ∈ U | A(x) ≥ α} denotes the α-cut of A for any α ∈ [0, 1], and Supp(A) = {x ∈ U | A(x) > 0} denotes the support of A. A fuzzy number is a fuzzy set A on the set of real numbers which satisfies the following conditions: (1) Ker(A) ≠ ∅ (A is normal); (2) A_α are closed intervals for all α ∈ (0, 1] (this implies that A is unimodal); (3) Supp(A) is bounded. The family of all fuzzy numbers on U is denoted by F_N(U). A fuzzy number A is said to be defined on [a, b] if Supp(A) is a subset of the interval [a, b]. Real numbers a_1 ≤ a_2 ≤ a_3 ≤ a_4 are called the significant values of the fuzzy number A if [a_1, a_4] = Cl(Supp(A)) and [a_2, a_3] = Ker(A), where Cl(Supp(A)) denotes the closure of Supp(A). Each fuzzy number A is determined by A = {[a(α), ā(α)]}_{α∈[0,1]}, where a(α) and ā(α) are the lower and upper bounds of the α-cut of the fuzzy number A, respectively, ∀α ∈ (0, 1], and the closure of the support of A is Cl(Supp(A)) = [a(0), ā(0)]. The union of two fuzzy sets A and B on U is a fuzzy set (A ∪ B) on U defined as follows: (A ∪ B)(x) = min{1, A(x) + B(x)}, ∀x ∈ U. The fuzzy number A is called linear if its membership function is linear on [a_1, a_2] and [a_3, a_4]; for such fuzzy numbers we will use the simplified notation A = (a_1, a_2, a_3, a_4). A linear fuzzy number A is said to be trapezoidal if a_2 ≠ a_3 and triangular if a_2 = a_3. We will denote triangular fuzzy numbers by the ordered triplet A = (a_1, a_2, a_4). More details on fuzzy numbers and computations with them can be found, for example, in [2]. Let A be a fuzzy number on [a, b] for which a_1 ≠ a_4. Then A can be described by several real-number characteristics, such as the cardinality Card(A) = ∫_{[a,b]} A(x) dx; the center of gravity COG(A) = ∫_{[a,b]} x A(x) dx / Card(A); the fuzziness Fuzz(A) = ∫_{[a,b]} S(A(x)) dx, where S(y) = −y ln(y) − (1 − y) ln(1 − y); and the skewness Skew(A) = ∫_{[a,b]} (x − COG(A))³ A(x) dx.
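The four characteristics can be evaluated numerically for any fuzzy number given by its membership function; the sketch below does so on a dense grid for a trapezoidal membership function, with the grid size, the integration rule and the example number being illustrative assumptions.

```python
import numpy as np

def trapezoidal(a1, a2, a3, a4):
    """Membership function of a linear (trapezoidal) fuzzy number."""
    def mu(x):
        x = np.asarray(x, dtype=float)
        up = np.clip((x - a1) / (a2 - a1), 0.0, 1.0) if a2 > a1 else (x >= a1).astype(float)
        down = np.clip((a4 - x) / (a4 - a3), 0.0, 1.0) if a4 > a3 else (x <= a4).astype(float)
        return np.minimum(up, down)
    return mu

def characteristics(mu, a=0.0, b=1.0, grid=20001):
    """Card, COG, Fuzz and Skew evaluated by the trapezoidal rule on a dense grid."""
    x = np.linspace(a, b, grid)
    m = mu(x)
    integ = lambda y: float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)
    card = integ(m)
    cog = integ(x * m) / card
    eps = 1e-12                              # avoids log(0) in the entropy term S(y)
    fuzz = integ(-m * np.log(m + eps) - (1 - m) * np.log(1 - m + eps))
    skew = integ((x - cog) ** 3 * m)
    return card, cog, fuzz, skew

print(characteristics(trapezoidal(0.2, 0.4, 0.5, 0.8)))
```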
A fuzzy scale on [a, b] is defined as a set of fuzzy numbers T_1, T_2, …, T_s on [a, b] that form a Ruspini fuzzy partition (see [5]) of the interval [a, b], i.e. for all x ∈ [a, b] it holds that Σ_{i=1}^{s} T_i(x) = 1, and the T_i's are indexed according to their ordering. A linguistic variable ([11]) is defined as a quintuple (V, T(V), X, G, M), where V is the name of the variable, T(V) is the set of its linguistic values (terms), X is the universe on which the meanings of the linguistic values are defined, G is a syntactic rule for generating the values of V and M is a semantic rule which assigns to every linguistic value A ∈ T(V) its meaning A = M(A), which is usually a fuzzy number on X. A linguistic variable (V, T(V), X, G, M) is called a linguistic scale on [a, b] if X = [a, b], T(V) = {T_1, …, T_s} and M(T_i) = T_i, i = 1, …, s, form a fuzzy scale on [a, b]. The terms T_i, i = 1, …, s, are called elementary terms. A linguistic scale on [a, b] is called an extended linguistic scale if, besides the elementary terms, it also contains derived terms of the form T_i to T_j, where i < j, i, j ∈ {1, …, n}, and M(T_i to T_j) = T_i ∪ T_{i+1} ∪ · · · ∪ T_j.
3 Bonissone's two step method for linguistic approximation
Bonissone's two step approach for linguistic approximation [1] was proposed in 1979. In contrast to the majority of linguistic approximation approaches, Bonissone suggested splitting the process into two steps – in the first step the set of suitable linguistic terms for the approximation of a given fuzzy number is found (this "pre-selection step" is done based on the semantic similarity), and in the second step the most appropriate term for the linguistic approximation is found from this set of suitable linguistic terms. In the pre-selection step the set P = {T_{p_1}, …, T_{p_k}} of k (k ≤ s) suitable linguistic terms from T(V)
is formed in such a way that the meanings of these pre-selected terms are similar to the fuzzy set O (an output of a mathematical model to be approximated) with respect to four characteristics (cardinality, center of gravity, fuzziness and skewness). These characteristics are assumed to capture the semantic value of a fuzzy set used to model the meaning of a linguistic term. The semantic value of a fuzzy set on a given universe can thus be represented by a quadruple of real numbers (the values of the 4 features in a four-dimensional space). Let A be a fuzzy set on [a, b]. Then the respective characteristic quadruple is denoted (a^1, a^2, a^3, a^4), where a^1 = Card(A), a^2 = COG(A), a^3 = Fuzz(A) and a^4 = Skew(A). Let the fuzzy set O on [a, b] be an output of a mathematical model that needs to be linguistically approximated by one linguistic term from the set T(V) = {T_1, …, T_s}. T(V) is the linguistic term set of a linguistic variable (V, T(V), [a, b], G, M), such that T_i = M(T_i), i = 1, …, s, are fuzzy numbers on [a, b]. The linguistic terms {T_1, …, T_s} are ordered with respect to the distance of their characteristic quadruples from the characteristic quadruple of O. The ordered set N = (T_{p_1}, …, T_{p_s}) is thus obtained, such that d(T_{p_1}, O) ≤ d(T_{p_2}, O) ≤ · · · ≤ d(T_{p_s}, O), where

d(T_{p_i}, O) = Σ_{j=1}^{4} w_j |t^j_{p_i} − o^j|,  i = 1, …, s,   (1)
and w_j, j = 1, …, 4, are normalized real weights (i.e. Σ_{j=1}^{4} w_j = 1, w_j ≥ 0, j = 1, …, 4). The choice of weights is usually left to the user of the model and some features can even be optional. Wenstøp [9], for example, proposed (in his method for linguistic approximation) to use only two features – cardinality (uncertainty) and center of gravity (position). The first k linguistic terms (the parameter k is specified by the decision maker) from the ordered set N are stored in the set P and the pre-selection step is finished. In the second step, the linguistic approximation T_O ∈ P of the fuzzy set O is computed. The fuzzy set T_O = M(T_O) is computed as

T_O = arg min_{T_{p_i} ∈ P} d_1(T_{p_i}, O)   (2)

using the modified Bhattacharyya distance:

d_1(A, B) = [1 − ∫_U (A*(x) · B*(x))^{1/2} dx]^{1/2},   (3)
where A*(x) = A(x)/Card(A) and B*(x) = B(x)/Card(B). This way the linguistic term TO is found as the closest linguistic approximation among the pre-selected linguistic terms. The Bhattacharyya distance (3) can be substituted by different distance or similarity measures¹ of fuzzy numbers – this step will, however, modify the behaviour of the linguistic approximation method. In the next section the following distance and similarity measures of fuzzy numbers are considered:
• A dissemblance index (introduced by Kaufman and Gupta [4]) of fuzzy numbers A and B is defined by the formula

d2(A, B) = \int_0^1 (|\underline{a}(α) − \underline{b}(α)| + |\overline{a}(α) − \overline{b}(α)|) dα,    (4)

where [\underline{a}(α), \overline{a}(α)] and [\underline{b}(α), \overline{b}(α)] denote the α-cuts of A and B.
• A similarity measure (introduced by Wei and Chen in [8]) of fuzzy numbers A and B is defined by the formula

s1(A, B) = (1 − (\sum_{i=1}^{4} |a_i − b_i|)/4) · (min{Pe(A), Pe(B)} + min{hgt(A), hgt(B)}) / (max{Pe(A), Pe(B)} + max{hgt(A), hgt(B)}),    (5)

where Pe(A) = √((a_1 − a_2)² + (hgt(A))²) + √((a_3 − a_4)² + (hgt(A))²) + (a_3 − a_2) + (a_4 − a_1), and Pe(B) is defined analogously.
• A similarity measure (introduced by Hejazi and Doostparast in [3]) of fuzzy numbers A and B can be defined by the formula

s2(A, B) = (1 − (\sum_{i=1}^{4} |a_i − b_i|)/4) · (min{Pe(A), Pe(B)} / max{Pe(A), Pe(B)}) · (min{Ar(A), Ar(B)} + min{hgt(A), hgt(B)}) / (max{Ar(A), Ar(B)} + max{hgt(A), hgt(B)}),    (6)

where Ar(A) = (1/2) hgt(A)(a_3 − a_2 + a_4 − a_1), Ar(B) is defined analogously, and Pe(A) and Pe(B) are computed as in the previous measure.
¹ In the case of a similarity measure the arg min function in formula (2) must be changed to arg max.
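The following sketch (our own illustration, not the authors' code) implements the four measures above for trapezoidal fuzzy numbers written as (a1, a2, a3, a4) with height 1; a triangular number is the special case a2 = a3. The numerical evaluation of d1 and d2 on a grid is an implementation choice of this sketch.

```python
import numpy as np

def _integrate(ys, xs):
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))

def membership(x, A):
    a1, a2, a3, a4 = A
    x = np.asarray(x, dtype=float)
    left = np.ones_like(x) if a2 == a1 else (x - a1) / (a2 - a1)
    right = np.ones_like(x) if a4 == a3 else (a4 - x) / (a4 - a3)
    return np.clip(np.minimum(np.minimum(left, 1.0), right), 0.0, 1.0)

def alpha_cut(A, alpha):
    a1, a2, a3, a4 = A
    return a1 + alpha * (a2 - a1), a4 - alpha * (a4 - a3)

def d1(A, B, lo=0.0, hi=1.0, n=2001):          # modified Bhattacharyya distance (3)
    xs = np.linspace(lo, hi, n)
    fa, fb = membership(xs, A), membership(xs, B)
    fa, fb = fa / _integrate(fa, xs), fb / _integrate(fb, xs)   # normalize by cardinality
    return np.sqrt(max(0.0, 1.0 - _integrate(np.sqrt(fa * fb), xs)))

def d2(A, B, n=1001):                           # dissemblance index (4)
    alphas = np.linspace(0.0, 1.0, n)
    la, ua = alpha_cut(A, alphas)
    lb, ub = alpha_cut(B, alphas)
    return _integrate(np.abs(la - lb) + np.abs(ua - ub), alphas)

def perimeter(A, h=1.0):                        # Pe(.) used by s1 and s2
    a1, a2, a3, a4 = A
    return np.hypot(a1 - a2, h) + np.hypot(a3 - a4, h) + (a3 - a2) + (a4 - a1)

def s1(A, B, h=1.0):                            # Wei-Chen similarity (5); heights are 1 here
    diff = sum(abs(a - b) for a, b in zip(A, B)) / 4.0
    pa, pb = perimeter(A, h), perimeter(B, h)
    return (1.0 - diff) * (min(pa, pb) + h) / (max(pa, pb) + h)

def s2(A, B, h=1.0):                            # Hejazi-Doostparast similarity (6)
    diff = sum(abs(a - b) for a, b in zip(A, B)) / 4.0
    pa, pb = perimeter(A, h), perimeter(B, h)
    ara = 0.5 * h * (A[2] - A[1] + A[3] - A[0])
    arb = 0.5 * h * (B[2] - B[1] + B[3] - B[0])
    return (1.0 - diff) * min(pa, pb) / max(pa, pb) * (min(ara, arb) + h) / (max(ara, arb) + h)

O = (0.20, 0.35, 0.35, 0.55)                    # a triangular "model output" (illustrative values)
T = (0.25, 0.50, 0.50, 0.75)                    # a candidate term meaning (illustrative values)
print(d1(O, T), d2(O, T), s1(O, T), s2(O, T))
```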
4 Numerical experiment
We restrict ourselves, for the purpose of this paper, to the linguistic approximation of triangular fuzzy numbers. We assess the performance of the distance measures d1 and d2 and of the two similarity measures s1 and s2 in the context of Bonissone's linguistic approximation method by the following numerical experiment. We randomly generate 100 000 triangular fuzzy numbers on [0, 1] to be linguistically approximated (denoted {O1, . . . , O100000}) and compute their cardinalities {o^1_1, . . . , o^1_100000} and their centers of gravity {o^2_1, . . . , o^2_100000}. We assume that for all these generated fuzzy numbers the hypothetical result of the first phase of Bonissone's method is the set of linguistic terms P = {Tp1, . . . , Tpk} – in our numerical study this set is the linguistic term set of an extended linguistic scale constructed from a uniform Ruspini fuzzy partition of the universe [0, 1] with 5 triangular fuzzy numbers. This way we obtain the linguistic term set P = {Tp1, . . . , Tp15}; the meanings of these linguistic terms are {T1, . . . , T15}, with cardinalities {t^1_1, . . . , t^1_15} and centers of gravity {t^2_1, . . . , t^2_15}. Using each distance and similarity measure we find the linguistic approximation of each generated output by applying the second step of Bonissone's method – this way we obtain {T^{d1}_{O1}, . . . , T^{d1}_{O100000}}, {T^{d2}_{O1}, . . . , T^{d2}_{O100000}}, {T^{s1}_{O1}, . . . , T^{s1}_{O100000}} and {T^{s2}_{O1}, . . . , T^{s2}_{O100000}} as the linguistic approximations of the generated triangular fuzzy numbers using d1, d2, s1 and s2, respectively. Figure 1 plots the cardinalities of the approximated fuzzy numbers (horizontal axis) against the cardinality of the meaning of the respective linguistic approximation for all the distance/similarity measures. We can clearly see from the plots that all four measures provide linguistic approximations both with higher cardinality (points above the main diagonal) and with lower cardinality. It seems, however, that the higher-cardinality case is more frequent (points in the upper left corner of the plots). This can be reasonable, since even in common language we tend to use super-categories to generalize the meaning. In all the methods it is also possible to get a linguistic approximation with a lower cardinality (i.e. the meaning of the linguistic approximation is less uncertain than the original output of the model). Note that since we have generated triangular fuzzy numbers on [0, 1], the maximum possible cardinality of any generated fuzzy number was 0.5. The Bhattacharyya distance is the only one of the investigated measures that provides very highly uncertain approximations. This behavior can be tolerated only if the reason for the added uncertainty is the tendency of the measure to achieve a linguistic approximation that is more general than the approximated fuzzy set. Table 1 summarizes in how many cases the kernel of the resulting linguistic approximation is a superset of the kernel of the approximated result – in these cases the "typical representatives" of the output are also the "typical representatives" of the approximating linguistic term. We can see that the Bhattacharyya distance focuses on this aspect much more than the other investigated methods. The situation for the centers of gravity is summarized analogously in Figure 2. Here the desired behavior is the absence of a systematic bias of the approximation.
This corresponds with the points being close to the main diagonal in the respective plot, or evenly distributed to its left and right. We can see that with respect to this requirement the Bhattacharyya distance performs rather well. Both similarity measures behave in most cases in the following way: i) in the case of lower centers of gravity of the approximated result they shift the center of gravity of the meaning of the linguistic approximation below the original center of gravity of the approximated result, ii) in the case of higher centers of gravity of the approximated result they shift the center of gravity of the meaning of the linguistic approximation above the original center of gravity of the approximated result. The similarity measures thus seem to have an amplifying effect on the center of gravity – shifting it towards the endpoints of the universe. This can be a desirable property in cases when such an amplification of meaning is needed.

k                                                            d1       d2       s1       s2
Card{Oi | Ker(Oi) ⊆ Ker(T^k_{Oi}), i = 1, . . . , 100000} / 100000    0.2868   0.1559   0.1778   0.1632

Table 1: The relative count of cases when Ker(Oi) ⊆ Ker(T^k_{Oi}), k ∈ {d1, d2, s1, s2}, out of the given 100 000.
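As a side note, a heavily simplified and self-contained variant of the experiment can be sketched as follows (our own illustration; it uses only the five elementary terms and only the dissemblance index d2, so it will not reproduce the figures or Table 1, but it shows the mechanics of the comparison of cardinalities).

```python
import numpy as np

rng = np.random.default_rng(0)

def random_triangular(n):
    return np.sort(rng.uniform(0.0, 1.0, size=(n, 3)), axis=1)   # (a, b, c) with a <= b <= c

def d2_triangular(A, B, n_alpha=101):
    alphas = np.linspace(0.0, 1.0, n_alpha)
    la = A[0] + alphas * (A[1] - A[0]); ua = A[2] - alphas * (A[2] - A[1])
    lb = B[0] + alphas * (B[1] - B[0]); ub = B[2] - alphas * (B[2] - B[1])
    ys = np.abs(la - lb) + np.abs(ua - ub)
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(alphas)))

# Five elementary terms of the uniform Ruspini partition of [0, 1].
terms = [(0.0, 0.0, 0.25), (0.0, 0.25, 0.5), (0.25, 0.5, 0.75), (0.5, 0.75, 1.0), (0.75, 1.0, 1.0)]

def cardinality(t):          # area under a triangular membership function of height 1
    return 0.5 * (t[2] - t[0])

outputs = random_triangular(10_000)              # 100 000 in the paper; fewer here for speed
chosen = [min(terms, key=lambda T: d2_triangular(O, T)) for O in outputs]
wider = np.mean([cardinality(T) > cardinality(O) for O, T in zip(outputs, chosen)])
print(f"share of approximations with higher cardinality than the output: {wider:.3f}")
```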
Figure 1: Comparison of cardinalities of the approximated outputs (horizontal axis) and the meanings of their linguistic approximations (vertical axis) for d1, d2, s1 and s2.
Figure 2: Comparison of centers of gravity of the approximated outputs (horizontal axis) and the meanings of their linguistic approximations (vertical axis) for d1, d2, s1 and s2.
5 Conclusion
In the paper we have investigated the role of different distance and similarity measures of fuzzy numbers in the second step of Bonissone's linguistic approximation method. We focused on the cardinality and center of gravity characteristics of fuzzy numbers. We have performed a numerical experiment which investigated the differences between the chosen characteristics of randomly generated triangular fuzzy numbers on the interval [0, 1] and the characteristics of the meanings of their linguistic approximations computed by Bonissone's method. This served as a basis for the analysis of the performance of two distance measures and two similarity measures of fuzzy numbers in the linguistic approximation context. The results of the numerical experiment suggest that the Bhattacharyya distance tends to provide more uncertain approximations than the other methods and is more likely to provide approximating linguistic terms that "catch" the typical representatives (the kernel of the meaning of the approximating linguistic term is a superset of the kernel of the approximated fuzzy number). Both presented similarity measures have an amplifying effect on the center of gravity – they shift the center of gravity of the meaning of the approximating linguistic term towards the endpoints of the universe.
Acknowledgements
Supported by the grant GA 14-02424S Methods of operations research for decision support under uncertainty of the Grant Agency of the Czech Republic and partially also by the grant IGA PrF 2016 025 of the internal grant agency of the Palacký University, Olomouc.
References
[1] Bonissone, P. P.: A pattern recognition approach to the problem of linguistic approximation in system analysis. In: Proceedings of the IEEE International Conference on Cybernetics and Society, 1979, 793–798.
[2] Dubois, D., and Prade, H.: Fuzzy sets and systems: theory and applications, Academic Press, 1980.
[3] Hejazi, S. R., Doostparast, A., and Hosseini, S. M.: An improved fuzzy risk analysis based on a new similarity measures of generalized fuzzy numbers. Expert Systems with Applications, 38, 8 (2011), 9179–9185.
[4] Kaufman, A., and Gupta, M. M.: Introduction to Fuzzy Arithmetic, Van Nostrand Reinhold, New York, 1985.
[5] Ruspini, E.: A New Approach to Clustering. Information and Control, 15 (1969), 22–32.
[6] Stoklasa, J.: Linguistic models for decision support. Lappeenranta University of Technology, Lappeenranta, 2014.
[7] Talášek, T., Stoklasa, J., Collan, M., and Luukka, P.: Ordering of Fuzzy Numbers through Linguistic Approximation Based on Bonissone's Two Step Method. In: Proceedings of the 16th IEEE International Symposium on Computational Intelligence and Informatics. Budapest, 2015, 285–290.
[8] Wei, S. H., and Chen, S. M.: A new approach for fuzzy risk analysis based on similarity measures of generalized fuzzy numbers. Expert Systems with Applications, 36, 1 (2009), 589–598.
[9] Wenstøp, F.: Quantitative analysis with linguistic values. Fuzzy Sets and Systems, 4, 2 (1980), 99–115.
[10] Yager, R. R.: On the retranslation process in Zadeh's paradigm of computing with words. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 34, 2 (2004), 1184–1195.
[11] Zadeh, L. A.: The concept of a linguistic variable and its application to approximate reasoning I, II, III. Information Sciences, 8 (1975), 199–257, 301–357, 9 (1975), 43–80.
Registry of Artistic Performance - the final state of the evaluation model
Jana Talašová1, Jan Stoklasa2, Pavel Holeček3, Tomáš Talášek4
Abstract. The amendment to the Law on Higher Education Institutions adopted in 2016 means the enactment of the Registry of Artistic Performance (RAP) and its transfer under the administration of the Ministry of Education, Youth and Sports. In this context it was necessary to finalize the mathematical model for the evaluation of the outcomes of artistic creative work stored in RAP. The results of the model – scores assigned to the outcomes – are used for the distribution of a part of the funding from the state budget among universities in the Czech Republic. For the purpose of the evaluation, the creative work outcomes are classified into categories based on three criteria: significance of the outcome, its extent, and its institutional reception. These categories have been assigned scores by the AHP. Over time, modifications to the Saaty's matrix have proven to be necessary. The algorithm for multi-expert classification of the creative-work outcomes has been improved in the process. The description of the development and successive modifications of the evaluation model is the focus of this paper.
Keywords: Registry of Artistic Performance, creative-work outcomes, multiple criteria evaluation, AHP, group evaluation.
JEL Classification: C44
AMS Classification: 90B50
1 The Registry of Artistic Performance
The Registry of Artistic Performance (RAP) is a web application for storing the information on the outcomes of artistic creative work (denoted OACW or outcomes further in the text) produced by the universities in the Czech Republic. The application has been performing this task since 2013. Another purpose of the application is the evaluation of the registered outcomes: on the basis of expert evaluations, the outcomes are divided into categories with a pre-defined point gain. The sum of the points obtained by each individual university in the past 5 years then represents one of the indicators used for the funding of these universities from the state budget. In 2016, the amendment to the Law on Higher Education Institutions, in which the RAP is explicitly stated, was adopted. As soon as the Law on Higher Education Institutions becomes effective (September 1, 2016), RAP will enter a fully professional mode under the administration of the Ministry of Education, Youth and Sports. It was therefore necessary to finish the tuning of the evaluation model based on the results of the most recent analyses of the development and performance of RAP and the respective evaluation methodology. A quantitative evaluation of OACW (i.e. pieces of art and artistic performances) represents a task comparable in its complexity to the evaluation of research and development (R&D) results. While various scientometric tools [1] have been developed for the quantitative evaluation of R&D outcomes since the 1950s, there was only a very limited possibility to reuse previously developed procedures in the case of OACW evaluation. The evaluation model used in the RAP has undergone significant development from its introduction in 2010 [6] until its current state. The development was motivated primarily by the endeavor to ensure the maximum objectivity of the evaluation of OACW and the comparability of the evaluations among various fields of artistic production. This paper describes the final state of the evaluation model at the time of the transition of RAP into a fully professional mode and the development of the evaluation model. From the mathematical point of view, the methodology of the evaluation of OACW, which is used in the RAP, is comprised of two models – (1) a multiple-criteria evaluation model based on Saaty's method, which
1 Palacký University Olomouc/Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 771 46 Olomouc, jana.talasova@upol.cz.
2 Palacký University Olomouc/Faculty of Arts, Department of Applied Economics, Kateřínská 17, 772 00 Olomouc, jan.stoklasa@upol.cz.
3 Palacký University Olomouc/Faculty of Science, Centre of the Region Haná for Biotechnological and Agricultural Research, Šlechtitelů 241/27, 783 71, Olomouc – Holice, pavel.holecek@upol.cz.
4 Palacký University Olomouc/Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 1192/12, 771 46 Olomouc, tomas.talasek@upol.cz.
has been used to derive the scores of the individual categories of the artistic work, and (2) a group decision-making model, which is used to assign specific OACW into the individual categories. Another separate challenge was to design the categories of OACW so that comparisons across different fields of artistic production are possible.
2 Categorization of OACW and the scores of categories
For the purpose of the registration and evaluation of the creative-work outcomes in the RAP, the whole area of artistic production is divided into 7 fields: architecture, design, film, fine arts, literature, music, and theatre. OACW (i.e. pieces of art and artistic performances) stored in the RAP database are evaluated according to the following three criteria: 1. relevance or significance, 2. extent, 3. institutional reception. The first criterion is the most important one; it reflects the quality of the piece of art or artistic performance. The second criterion has been included in the evaluation because the financial subsidy for a given university should be similar when it produces a smaller number of extensive (large) pieces of art and when it produces a greater number of smaller pieces of art, provided that the evaluation of the quality of the outcomes is similar. The third criterion represents, to some extent, a confirmation of the evaluation according to the first criterion (very significant pieces of art are usually presented in respected institutions – although not exclusively). Statistically speaking, there is a dependency between the evaluation according to the first criterion and the third one; on the other hand, the expert evaluation according to the first criterion is done independently of the value of the third criterion – the significance is assessed independently of the institutional reception. For the criterion relevance or significance of the outcome, four levels of the evaluation are considered:
an outcome of crucial significance and originality (level “A”),
an outcome containing numerous important innovations (level “B”),
an outcome pushing forward modern trends (level “C”),
other artistic outcomes (level “D”).
For the criterion extent of the outcome, three evaluation levels have been defined:
large (level “K”),
medium (level “L”),
limited (level “M”).
And finally, according to the criterion institutional reception, creative-work outcomes created and reflected in the following context are distinguished:
international context (level “X”),
national context (level “Y”),
regional context (level “Z”).
Each outcome is assigned a triplet of letters – for example, AKX, BKY, or CLZ – representing the resulting category of the outcome. In total, there are 27 categories with nonzero scores. The categories beginning with the letter "D" contain outcomes without any RAP point gain, regardless of the values of the other two criteria (for the purposes of the evaluation, we denote all such categories simply as the "D" category). If the outcome does not belong to any of the defined categories according to one or more of the criteria, or if the outcome does not meet the formal requirements, it is assigned to the category rejected (denoted as "R"). The decision on the significance of the outcome (levels "A", "B", "C", or "D") is based exclusively on the evaluations provided by experts. Originally the extent of the outcome was also decided based on the opinion of experts from the given field of arts. The meanings of the terms large, medium and limited, however, depend on the context (on the particular field of artistic production). That is why, in the context of the extent criterion, general types of outcomes with a pre-assigned level of extent have been defined in each field of artistic production. The assignment of the outcome to one of the extent categories is thus done by selecting one of these general outcome types. Concerning the evaluation of the outcome's institutional reception, lists of institutions corresponding to the possible levels of this criterion (X, Y, Z) are defined and updated annually based on the opinions of experts.
For the determination of the scores of each of the categories (AKX, …, CMZ) in the RAP, Saaty's pair-wise comparison method described in [3, 4] has been used. The method uses an evaluation scale with verbal descriptors to express the intensities of preferences between pairs of categories. The verbal level of this method was utilized because it was necessary to obtain the information on preferences from experts from the field of art. The mathematicians in the project team paid great attention to the organization and facilitation of the whole process of gathering the expert data – to ensure that the process of data input was as natural for the experts as possible. The aim was to obtain an objective reflection of the experts' preferences among the categories. The representatives of each field of artistic production (the so-called guarantors of the fields) were first instructed to specify one example of an outcome for each of the categories with nonzero point gain in their field. The experts could use these examples for the comparisons in the process of determining the scores of categories and, therefore, they did not have to compare abstract categories. The whole process of determining the preferences was divided into two steps. First, the experts used the pair-wise comparison method to determine the ordering of the categories according to their significance. Subsequently, they determined the preference intensities (the elements of Saaty's matrix) for these ordered categories. Because of the size of the Saaty's matrix used (27×27), it was difficult to achieve its sufficient consistency. In this context, the notion of a weak consistency of a Saaty's matrix has been applied ([5], [6]). The weak consistency condition is very easy to check in cases when the categories are ordered in descending order according to their significance (the values in the rows of the Saaty's matrix are non-decreasing from left to right; the values in the columns are non-decreasing from bottom to top). The individual elements of the pair-wise comparison matrix and, subsequently, also of Saaty's matrix were the result of a consensus among the team of experts representing all individual fields of artistic production; the process was also influenced by the mathematicians, who pointed out minor inconsistencies in the preferences. The eigenvector method was used to derive the relative evaluations of the categories from the Saaty's matrix. The vector was then scaled so that the highest possible category (AKX) has the same score (305 points) as the research and development result with the highest evaluation according to the Methodology of research and development results evaluation valid in the Czech Republic at the time when the model was designed. The initial scores (used for the evaluation in RAP in 2010–2015) are listed in Table 1.
Category   Relevance or significance                       Extent    Institutional reception   Initial scores   Adjusted scores
AKX        crucial significance                            large     international             305              305
AKY        crucial significance                            large     national                  259              242
AKZ        crucial significance                            large     regional                  210              98
ALX        crucial significance                            medium    international             191              220
AMX        crucial significance                            limited   international             174              188
ALY        crucial significance                            medium    national                  138              162
ALZ        crucial significance                            medium    regional                  127              85
BKX        containing numerous important innovations       large     international             117              137
AMY        crucial significance                            limited   national                  97               114
AMZ        crucial significance                            limited   regional                  90               76
BKY        containing numerous important innovations       large     national                  79               68
BKZ        containing numerous important innovations       large     regional                  66               38
BLX        containing numerous important innovations       medium    international             62               62
BMX        containing numerous important innovations       limited   international             48               56
BLY        containing numerous important innovations       medium    national                  44               49
BLZ        containing numerous important innovations       medium    regional                  40               32
BMY        containing numerous important innovations       limited   national                  37               42
BMZ        containing numerous important innovations       limited   regional                  31               28
CKX        pushing forward modern trends                   large     international             26               24
CLX        pushing forward modern trends                   medium    international             24               22
CKY        pushing forward modern trends                   large     national                  19               20
CKZ        pushing forward modern trends                   large     regional                  17               12
CMX        pushing forward modern trends                   limited   international             16               17
CLY        pushing forward modern trends                   medium    national                  12               15
CLZ        pushing forward modern trends                   medium    regional                  10               10
CMY        pushing forward modern trends                   limited   national                  9                13
CMZ        pushing forward modern trends                   limited   regional                  8                8
Table 1 The comparison of the initial and the adjusted scores for the individual categories.
Thorough analyses of the data stored in RAP are carried out annually after the input phase is completed. The analyses aim to: (a) facilitate the comparability among the individual fields of artistic production, (b) increase the objectivity of the evaluation, (c) identify possible problems of various kinds that might be encountered
in the process of gathering and evaluating the outputs. The results of the analyses performed in 2013 initiated significant changes in the types of outputs registered in the RAP – whole pieces of art, whose extent can differ significantly across segments, were replaced in some fields of artistic production (such as theatre) by individual artistic performances. Primarily, possible general types of outcomes according to the extent criterion were predefined for each field, and these were assigned an appropriate K, L, or M level. The analyses also discovered that outputs whose significance differs considerably from their institutional reception (A.Z, B.Z) occur in the data more frequently than expected. Such cases were assumed to be theoretically possible – e.g. if the first presentation of an outcome took place at the end of the year, the experts could recognize its high quality, but the output had not yet managed to receive the corresponding institutional reception (the reception evaluation can be increased in the following years, which would automatically increase the output's score in the RAP). Such real-life examples of outputs from the individual fields were used by the experts for setting the preference intensities of these categories in the Saaty's matrix for the calculation of the scores of categories. Long-term observation showed, however, that the assignment of the A.Z category often represents an overestimation of the output; the high expert evaluation is not always supported by a higher institutional reception within the whole monitored 5-year period. That is why a new determination of the significance order of categories and a revision of the elements of Saaty's matrix were performed in 2015. The categories of outputs with only regional reception (the triplet of letters ending with Z) were moved towards the end of the group of categories with the given significance level (A, B, or C). Afterwards, the elements of Saaty's matrix were defined again; the originally expertly defined preference intensities of the other categories served as an important basis for this process. The resulting weakly consistent Saaty's matrix that was used for the calculations is depicted in Figure 1. The new adjusted scores of categories are listed in Table 1 together with the initial scores. The new scores were again standardized so that the score of the highest category (AKX) was 305 points. These modifications of Saaty's matrix also increased its consistency measured by the CR index (see [3] for the definition).
Figure 1 The new Saaty’s matrix of preference intensities for the individual categories.
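For illustration, the following sketch (our own, not the authors' implementation) shows the two computational ingredients described above: the weak-consistency check for a matrix whose categories are ordered by descending significance, and the derivation of scores from the principal eigenvector scaled so that the top category receives 305 points. The 3×3 matrix is hypothetical; the real model uses the 27×27 matrix shown in Figure 1, which is not reproduced here.

```python
import numpy as np

S = np.array([
    [1.0,  3.0, 7.0],
    [1/3., 1.0, 5.0],
    [1/7., 1/5., 1.0],
])

def weakly_consistent(S):
    """Rows non-decreasing from left to right and columns non-decreasing from bottom to top."""
    rows_ok = np.all(np.diff(S, axis=1) >= 0)
    cols_ok = np.all(np.diff(S[::-1, :], axis=0) >= 0)
    return bool(rows_ok and cols_ok)

def category_scores(S, top_score=305.0):
    eigvals, eigvecs = np.linalg.eig(S)
    k = np.argmax(eigvals.real)            # principal eigenvalue
    w = np.abs(eigvecs[:, k].real)         # principal eigenvector, made positive
    return top_score * w / w.max()         # rescale so the best category gets 305 points

print(weakly_consistent(S))
print(np.round(category_scores(S), 1))
```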
3 Assignment of the OACW into the individual categories
The model used for assigning OACW stored in the RAP into the individual categories of artistic production is, from the mathematical point of view, a group decision-making model. The original model, used until 2013, was based on the following principle: The initial assignment of the outcome into one of the categories was proposed by the university that registered the outcome in the RAP. Subsequently, two independent experts proposed their evaluations of the outcome. If a majority agreement was achieved according to the criteria significance and extent (the value of the criterion institutional reception
was given by the assignment of the institutions into one of the levels X, Y, Z, which was already known at the moment of the evaluation of the outcome by the experts), the overall evaluation was determined. If the majority agreement was not achieved, the outcome was discussed by the Arbitration committee, which consisted of representatives of the individual universities. The Arbitration committee also arbitrated the outcomes that were proposed for rejection by at least one of the experts, as well as the outcomes where the university appealed against the stated evaluation. In 2013, it was decided that the initial evaluation of the outcome proposed by the university should no longer enter the calculation of the resulting evaluation. Since 2013 this initial evaluation has been assessed by the guarantor of the respective field of art. This way, the comparability of the evaluation within each field of artistic production is ensured. The guarantor's evaluation and the evaluations of two other independent (randomly assigned) experts form the basis for the calculation of the resulting evaluation. The analyses of the stored data performed in 2015 resulted in further proposals for modifying the algorithm for assigning the OACW into the individual categories of artistic production. The goal was to create a deterministic algorithm that would make it possible to remove the Arbitration committee from the decision process. The basic rule is that if the majority agreement has not been achieved, the RAP application selects another independent expert, whose task is (for each of the criteria) to select one of the evaluations proposed by the previous experts (including the possibility to reject the outcome). Taking into account the number of levels defined for the individual criteria, the requirement that the new expert should "incline" to the opinion of one of the previous ones does not represent a problem for the expert. The practical realization of the proposed rule differs among the three criteria. The list of institutions assigned according to the institutional reception into the categories X, Y, or Z is updated every year before the beginning of the period for gathering the data; after the data are gathered (before the expert evaluation is performed), the new institutions registered in the RAP application are also assigned into one of the categories. The assignment is given by the opinion of the guarantor of the respective field and two independent experts – if the majority agreement is not reached, another assigned expert assesses the output by inclining to the opinion of one of the previous experts (this case is, however, quite improbable). All the undecided cases that could, in the original version of the methodology, end up before the Arbitration committee are thus eliminated. In the case of the extent criterion, the assignment of the corresponding level (K, L, or M) is given automatically by the type of the outcome, which is selected from a list when an output is being registered in the RAP. However, because of the continuous progress in the arts, new kinds of OACW emerge. That is why the possibility to have an outcome's extent evaluated by the experts as large (K), medium (L), or limited (M) has been preserved for exceptional cases. The same rule as in the case of the other criteria is followed – the consensus of at least two of the experts is required.
The main emphasis in the expert evaluation of the outcomes stored in the RAP is put on the significance criterion. Let us note that if the university assigns some of its outcomes into the D category (which does not result in any point gain), such an outcome does not proceed further into the evaluation process – this way, the need for expert evaluation of insignificant outcomes is reduced. The target state of the model is that even the D category outputs are assessed by the guarantor of the given field of artistic production with respect to the minimum requirements for an output to be stored in RAP. The guarantor and the independent experts can select any of the four levels of the significance criterion. If the guarantor and both of the independent experts (randomly assigned to the output by the RAP application) select different evaluations, another expert selected by the RAP application assesses the output by inclining to the opinion of one of the previous experts. This way the overall evaluation of the outcome according to this criterion is obtained. A special case occurs if any of the evaluators proposes the rejection of the outcome. In this case, the same general rules are applied as in the previous cases – the output is rejected if at least two of the evaluators propose its rejection. The main difference is that the reason for the rejection does not need to be connected with any of the three criteria; rejection can also be caused by formal flaws of the record entered into the RAP itself. There is a significant difference between the D category (with no point evaluation) and the situations when an outcome is rejected; the rejected results are expected to be subject to sanctions for incorrectly registered outcomes.
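The agreement rule described above can be sketched as follows (our own illustration, not the RAP implementation; the level names and the function interface are hypothetical): a proposed evaluation is accepted once at least two evaluators agree on it, and otherwise an additional expert must side with one of the earlier proposals.

```python
from collections import Counter
from typing import Optional, Sequence

def resolve(proposals: Sequence[str], extra_expert: Optional[str] = None) -> Optional[str]:
    """proposals: levels proposed by the guarantor and two experts, e.g. ['A', 'B', 'A'];
    'R' stands for a proposal to reject the outcome. Returns the agreed level,
    or None when a further expert is still needed."""
    counts = Counter(proposals)
    level, votes = counts.most_common(1)[0]
    if votes >= 2:                       # at least two evaluators agree
        return level
    if extra_expert is not None and extra_expert in counts:
        return extra_expert              # the extra expert sides with one earlier opinion
    return None                          # undecided: the RAP application assigns another expert

print(resolve(["A", "A", "B"]))          # -> 'A'
print(resolve(["A", "B", "C"]))          # -> None, an additional expert is needed
print(resolve(["A", "B", "C"], "B"))     # -> 'B'
```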
4 Conclusion
The mathematical model of evaluation used in the Registry of Artistic Performance was developed during the years 2010 to 2016, with changes and modifications inspired by the findings of the annual analyses of the data stored in the RAP. The model's authors were striving to keep the mathematical model as simple and comprehensible as possible. One goal of the performed modifications in the evaluation model was to improve the outcomes' comparability among various fields of artistic production (the solution comprised a modification of
the types of the registered outcomes and a clear assignment of the extent categories K, L, M to these types). Another goal was to increase the objectivity of the evaluation (implemented by decreasing the point scores of the questionable categories A.Z and B.Z, and by introducing the rule that the agreement of at least two external evaluators is required in any evaluation performed in the RAP application). Although various experiments with significantly innovated versions of the model (see e.g. [2]) were performed according to the requests of external authorities, it has always turned out that the originally designed conception of the solution is more suitable for the given purpose. The scores of the outputs from the past 5 years are used in the Czech Republic for the distribution of a part of the funding from the state budget for the pedagogical activities of the universities for the following year, similarly to the scores obtained by these universities for their activities in research and development. However, it would be desirable to take the scores for artistic creative work into account also in the division of the funding for the creative work of the Czech universities, where only the points for research and development have been used so far.
Acknowledgements
The research has been supported by the grant GA14-02424S Methods of operations research for decision support under uncertainty of the Grant Agency of the Czech Republic and partially also by the grant IGA PrF 2016 025 of the internal grant agency of the Palacký University, Olomouc.
References
[1] Garfield, E.: Citation indexes for science: A new dimension in documentation through association of ideas. Science, 122, 3159 (1955), 108–111.
[2] Jandová, V., Stoklasa, J., Talašová, J.: Modification of the AHP based model for evaluating artistic production of Czech colleges. Proceedings of the 32nd International Conference on Mathematical Methods in Economics 2014, Palacký University, Olomouc, (2014), 372–377.
[3] Saaty, T. L.: The fundamentals of decision making and priority theory with the Analytic Hierarchy Process. Vol. VI of the AHP Series, RWS Publ., 2000.
[4] Saaty, T. L.: Decision making with the analytic hierarchy process. International Journal of Services Sciences, 1, (2008), 83–98.
[5] Stoklasa, J., Jandová, V., Talašová, J.: Weak consistency in Saaty's AHP – evaluating creative work outcomes of Czech Art Colleges. Neural Network World, 23, 1 (2013), 61–77.
[6] Talašová, J., Stoklasa, J.: A model for evaluating creative work outcomes at Czech Art Colleges. Proceedings of the 29th International Conference on Mathematical Methods in Economics 2011 – part II, Praha, (2011), 698–703.
Multicriteria coalitional game with choice from payoffs
Michaela Tichá1
Abstract. Multicriteria games model situations in which at least one player has more than one criterion. If we assume that every player is able to assign weights to the criteria, then the situation is simple. In the case of noncooperative games, the game is reduced to a classic one-criterion game. However, in coalitional games one problem remains if the players have different weights: we cannot simplify the vector payoff function to a scalar payoff function. We can analyze the contribution of each player, but it is difficult to analyze the contribution of coalitions. We use the known formulation of a linear model that splits the gain of the coalition by means of bargaining theory – the egalitarian solution. Then we generalize the model to the case when players and coalitions can choose from several payoffs. We use binary variables and formulate a mixed integer programming problem. Finally, we determine the final coalitional structure.
Keywords: multicriteria games, game theory, cooperative games, multiple multicriteria games.
JEL classification: C72
AMS classification: 91A10
1 Introduction
The multicriteria coalitional game is a class of cooperative games with vector payoff functions. The players create coalitions depending on what kind of benefit they obtain. Knowing the preferences of the players, we can compare the utilities of the players. The utility of a player is the weighted sum of the individual criteria. We can easily determine the utility of a player when he is alone. However, in the case of an (at least) two-member coalition we cannot easily determine the benefit of the coalition, because the players usually have different preferences (weights). We use the egalitarian solution for the allocation of the gain among the players. In the egalitarian solution, the extra utility of the individual players is maximized, under the condition that no player attains a lower utility than in a coalition with a smaller number of members. This concept is taken from [1]. We use it to solve the problem where players can choose from several vector payoffs.
2 Problem formulation
Here we focus on the type of game where the gain depends on several options from which the players can choose. Players have several criteria and several payoffs to choose from; they may vote among them. To clarify this issue, we provide the following example. Consider that a coalition {α, β} can choose between the vectors (1, 3) and (5, 2). In a classic one-criterion game this selection poses no problem, because a player would always choose the higher gain. However, in the multicriteria situation the final choice is not obvious, because neither solution dominates the other. If we know the preferences (weights) of the players, we can determine which gain a player would choose if he were in a one-member coalition. But even then we cannot determine the selected gain of a coalition with more members, because the players may have different preferences.
1 University of Economics, Prague, Department of Econometrics, W. Churchill Sq. 1938/4, 130 67 Prague 3, xticm11@vse.cz
Definition 1. A multicriteria coalitional game with choice is a pair (N, v), where N = {1, 2, . . . , n} is the set of players and n ∈ ℕ is the number of players. A nonempty subset S ⊆ N of the set of players is called a coalition. The characteristic function of the game, v : S → R^{m×l}, assigns the gain v(S) ∈ R^{m×l} to any coalition S ⊆ N; m ∈ ℕ is the number of criteria (the total number of criteria of all players) and l ∈ ℕ is the number of payoff variants that the players can choose from. We denote by v^{τσ}(S) the gain in criterion τ when variant σ is chosen:

v(S) = ( v^{11}(S)  v^{12}(S)  . . .  v^{1l}(S)
         v^{21}(S)  v^{22}(S)  . . .  v^{2l}(S)
           ⋮          ⋮                  ⋮
         v^{m1}(S)  v^{m2}(S)  . . .  v^{ml}(S) ).
The characteristic function determines the total payoff of the coalition in each criterion. We assume that v^{τσ}(S) ≥ 0 for all τ ∈ {1, 2, . . . , m}, all σ ∈ {1, 2, . . . , l} and all coalitions S. Let us assume, for example, l = 3 and m = 2, which means that every coalition S can choose from three payoff vectors, for instance (2, 6), (4, 2) or (1, 9). It is up to the players in the coalition to decide which vector they prefer. Let us further consider known preferences of the players.
Definition 2. A multicriteria coalitional game with choice and known preferences is a triplet (N, v, W), where N = {1, 2, . . . , n} is the set of players and n ∈ ℕ is the number of players. A nonempty subset S ⊆ N of the set of players is called a coalition. The characteristic function of the game, v : S → R^{m×l},
assigns the gain v(S) ∈ R^{m×l} to any coalition S ⊆ N; m ∈ ℕ is the number of criteria (the total number of criteria of all players) and l ∈ ℕ is the number of payoff variants that the players can choose from. We denote by v^{τσ}(S) the gain in criterion τ when variant σ is chosen. Let further

W = ( w_{11}  w_{12}  . . .  w_{1m}
      w_{21}  w_{22}  . . .  w_{2m}
        ⋮       ⋮              ⋮
      w_{n1}  w_{n2}  . . .  w_{nm} )

be the matrix of preferences of the players, where w_{aτ} represents the weight of criterion τ of player a,

w_{aτ} ∈ [0, 1]   for a ∈ N, τ ∈ {1, 2, . . . , m},    \sum_{τ=1}^{m} w_{aτ} = 1   for all a ∈ N.
Definition 3. An allocation X(S) ∈ R^{k×m}, k = |S|, is a payoff matrix whose rows represent the payoffs of the individual players from the coalition S when variant σ is selected; x_{aτ}(S) represents the payoff of player a in criterion τ in the coalition S under the selected variant σ.
3 Solution
Assuming known preferences of the players, we can easily determine which payoff a player prefers if he is in a one-member coalition: he prefers the payoff that gives him the highest utility. Unfortunately, this cannot be easily determined for coalitions with more than one member and different preferences, since the individual members of the coalition may prefer different payoff vectors. We will show an approach that gives a complete coalition structure in a multicriteria coalitional game with choice and known preferences. We proceed from the smallest coalitions to the grand coalition. For each coalition we determine which variant would be chosen and how the gain is split among the players. We consider that the gain is split by the egalitarian approach.
1) |S| = 1. For a player a ∈ N who is alone, his gain is given, because it is not divided among several players. His utility is therefore given as well and is easy to compute:

u_a({a}) = max_{σ∈{1,2,...,l}} [w_{a1} v^{1σ}({a}) + w_{a2} v^{2σ}({a}) + . . . + w_{am} v^{mσ}({a})].
2) |S| = 2. In the case of a two-member coalition we solve the following program based on [1]. The program is extended to mixed integer programming; binary variables y_σ are added. When variant σ is chosen, then y_σ = 1, otherwise y_σ = 0.

max_{x_a, x_{a′}, D}  D
subject to
u_a({a, a′}) − u_a({a}) ≥ D
u_{a′}({a, a′}) − u_{a′}({a′}) ≥ D
x_{aτ} ≥ 0                                        ∀τ ∈ {1, 2, . . . , m}
x_{a′τ} ≥ 0                                       ∀τ ∈ {1, 2, . . . , m}
x_{aτ} + x_{a′τ} = \sum_{σ=1}^{l} y_σ v^{τσ}(S)      ∀τ ∈ {1, 2, . . . , m}
u_a({a, a′}) ≥ u_a({a}) + ε
u_{a′}({a, a′}) ≥ u_{a′}({a′}) + ε
\sum_{σ=1}^{l} y_σ = 1
y_σ binary                                        ∀σ ∈ {1, 2, . . . , l},

where
u_a({a, a′}) = w_{a1} x_{a1}({a, a′}) + w_{a2} x_{a2}({a, a′}) + . . . + w_{am} x_{am}({a, a′}),
u_{a′}({a, a′}) = w_{a′1} x_{a′1}({a, a′}) + w_{a′2} x_{a′2}({a, a′}) + . . . + w_{a′m} x_{a′m}({a, a′}).

To each coalition S that does not arise because it is not profitable for any of the players, we assign
x_{aτ}(S) := 0   ∀a ∈ S, ∀τ ∈ {1, 2, . . . , m}.
3) |S| = z, z = 3, . . . , n. We use the same procedure for |S| = 3, then for |S| = 4, etc., up to |S| = n. For every coalition S we solve the following program:
max_{x_a, a∈S; D}  D
subject to
u_a(S) − u_a({a}) ≥ D                              ∀a ∈ S
x_{aτ} ≥ 0                                         ∀a ∈ S, ∀τ ∈ {1, 2, . . . , m}
\sum_{a∈S} x_{aτ} = \sum_{σ=1}^{l} y_σ v^{τσ}(S)       ∀τ ∈ {1, 2, . . . , m}
u_a(S) ≥ u_a(S′) + ε                               ∀a ∈ S, ∀S′ ∈ {S′ : a ∈ S′ ∧ |S′| < |S|}
\sum_{σ=1}^{l} y_σ = 1
y_σ binary                                         ∀σ ∈ {1, 2, . . . , l},

where u_i(S) = \sum_{τ=1}^{m} w_{iτ} x_{iτ}(S).

For all coalitions S that shall not arise because they are not profitable for any of the players, we assign
x_{aτ}(S) := 0   ∀a ∈ S, ∀τ ∈ {1, 2, . . . , m}.
Subsequently, we solve \sum_{i=2}^{n} \binom{n}{i} programs; each of them is a mixed integer programming problem. When we know the allocation X(S) for all coalitions, we can determine the final coalitional structure. We will build on stable coalitions.
Definition 4. We say that the coalition S is stable if every player in the coalition has higher utility, by at least ε, than in the other coalitions in which he is a member, and if there is no coalition with the same utility and fewer members.
In the one-criterion case this reduces to the standard definition of coalition stability, and it is therefore its natural generalization. It should also be noted that if the grand coalition (the coalition of all players) is stable, then it is the solution. Such a case cannot occur in which there would exist a division of the grand coalition with all conditions satisfied while there would concurrently exist another coalition with the highest overall utility, because the condition for the division of the grand coalition is that every player must have a higher utility from it than from being in any other coalition. If the grand coalition is not stable, we choose a coalition with the highest value of the objective function, i.e., the one where the players obtain the highest extra utility. Furthermore, we select the coalition of the remaining players with the highest value of the objective function, and we proceed in this way until we obtain a coalition structure of the game. As in the classical one-criterion cooperative game, it may happen that there are several coalitions with the highest utility. In this case, there may be multiple solutions and the coalition structure is not uniquely determined.
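The program above can be solved in practice without a dedicated MIP solver: since exactly one binary variable y_σ equals 1, the problem decomposes into one linear program per variant. The following sketch (our own implementation, not the author's code) solves each of these LPs with scipy.optimize.linprog and keeps the best variant; the interface and variable names are our own.

```python
import numpy as np
from scipy.optimize import linprog

def solve_coalition(weights, payoffs, u_best_smaller, eps=0.1):
    """weights: (k, m) weight matrix of the coalition members,
    payoffs: (m, l) matrix v(S) with one column per variant,
    u_best_smaller: length-k vector of the best utility each member already has
    from smaller coalitions (u_a({a}) for two-member coalitions).
    Returns (D, allocation of shape (k, m), chosen variant) or None if the coalition is not profitable."""
    k, m = weights.shape
    l = payoffs.shape[1]
    best = None
    for sigma in range(l):
        n_var = k * m + 1                                  # variables: x[a, tau] flattened, then D
        c = np.zeros(n_var); c[-1] = -1.0                  # maximize D  <=>  minimize -D
        A_ub, b_ub = [], []
        for a in range(k):
            util_row = np.zeros(n_var)
            util_row[a * m:(a + 1) * m] = weights[a]
            row1 = -util_row.copy(); row1[-1] = 1.0        # D - u_a(S) <= -u_best_smaller[a]
            A_ub.append(row1); b_ub.append(-u_best_smaller[a])
            A_ub.append(-util_row)                         # u_a(S) >= u_best_smaller[a] + eps
            b_ub.append(-(u_best_smaller[a] + eps))
        A_eq = np.zeros((m, n_var)); b_eq = payoffs[:, sigma]
        for tau in range(m):
            for a in range(k):
                A_eq[tau, a * m + tau] = 1.0               # sum_a x[a, tau] = v^{tau, sigma}(S)
        bounds = [(0, None)] * (k * m) + [(None, None)]    # x >= 0, D free
        res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
        if res.success and (best is None or -res.fun > best[0]):
            best = (-res.fun, res.x[:-1].reshape(k, m), sigma)
    return best

# Coalition {alpha, beta} from Example 1 below: weights, v({alpha, beta}), singleton utilities.
W = np.array([[0.5, 0.5], [0.2, 0.8]])
V = np.array([[7.0, 6.0], [7.0, 8.0]])                     # rows: criteria, columns: variants
print(solve_coalition(W, V, np.array([1.5, 4.6])))         # D = 1.62, second variant chosen
```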
Example 1. Let us consider the game of three players α, β, γ with two criteria and two variants, with the characteristic function shown in Table 1 and the preferences of the players shown in Table 2.

              v({α})   v({β})   v({γ})   v({α, β})   v({α, γ})   v({β, γ})   v({α, β, γ})
1st variant   (1;2)    (3;5)    (7;2)    (7;7)       (7;8)       (12;11)     (17;15)
2nd variant   (2;1)    (4;4)    (5;6)    (6;8)       (8;7)       (11;12)     (16;16)

Table 1: Characteristic function of players α, β, γ with choice
                 player α   player β   player γ
1st criterion    0.5        0.2        0
2nd criterion    0.5        0.8        1

Table 2: Preferences of players α, β, γ with choice
We consider ε = 0.1. Here ε is the amount which a player must gain by entering a coalition in order to be willing to cooperate and not to prefer a coalition with a smaller number of members, in which he has more power. This parameter is important because otherwise a solution in which only one player gets all the extra gain would be possible. We proceed from the coalitions with the smallest number of members to the grand coalition.
1) |S| = 1. In the first step, we determine the utilities of the one-member coalitions, which are the maximum utilities
over all the variants:

u_α({α}) = max_{σ∈{1,2}} [w_{α1} v^{1σ}({α}) + w_{α2} v^{2σ}({α})] = max(0.5 · 1 + 0.5 · 2; 0.5 · 2 + 0.5 · 1) = 1.5,
u_β({β}) = max_{σ∈{1,2}} [w_{β1} v^{1σ}({β}) + w_{β2} v^{2σ}({β})] = max(0.2 · 3 + 0.8 · 5; 0.2 · 4 + 0.8 · 4) = 4.6,
u_γ({γ}) = max_{σ∈{1,2}} [w_{γ1} v^{1σ}({γ}) + w_{γ2} v^{2σ}({γ})] = max(0 · 7 + 1 · 2; 0 · 5 + 1 · 6) = 6.
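As a quick numerical check of these one-member utilities, the following short sketch (our own) recomputes them from Tables 1 and 2.

```python
import numpy as np

weights = {"alpha": np.array([0.5, 0.5]), "beta": np.array([0.2, 0.8]), "gamma": np.array([0.0, 1.0])}
v_single = {"alpha": np.array([[1, 2], [2, 1]]),     # rows are variants, columns are criteria
            "beta":  np.array([[3, 5], [4, 4]]),
            "gamma": np.array([[7, 2], [5, 6]])}

for player, w in weights.items():
    utilities = v_single[player] @ w                 # utility of each variant
    print(player, utilities.max())                   # 1.5, 4.6, 6.0
```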
2) |S| = 2. In the second step we find the distribution for the two-member coalitions. First we solve the program for S = {α, β}:

max_{x_α, x_β, D}  D
subject to
u_α({α, β}) − u_α({α}) ≥ D
u_β({α, β}) − u_β({β}) ≥ D
x_{α1} ≥ 0,  x_{α2} ≥ 0,  x_{β1} ≥ 0,  x_{β2} ≥ 0
x_{α1} + x_{β1} = y_1 v^{11}({α, β}) + y_2 v^{12}({α, β})
x_{α2} + x_{β2} = y_1 v^{21}({α, β}) + y_2 v^{22}({α, β})
u_α({α, β}) ≥ u_α({α}) + ε
u_β({α, β}) ≥ u_β({β}) + ε
y_1 + y_2 = 1
y_1, y_2 binary,

where u_α({α, β}) = w_{α1} x_{α1}({α, β}) + w_{α2} x_{α2}({α, β}) and u_β({α, β}) = w_{β1} x_{β1}({α, β}) + w_{β2} x_{β2}({α, β}). We used methods for mixed integer programming and found the optimal solution, see Table 3. Then we solve a similar program for the coalitions {α, γ} and {β, γ}. Both coalitions are profitable; the programs have feasible and optimal solutions shown in Table 3.

coalition                          {α, β}        {α, γ}        {β, γ}
player                             α      β      α      γ      β      γ
1st criterion                      6      0      7      0      11     0
2nd criterion                      0.23   7.77   0      8      4.67   7.33
utility                            3.12   6.22   3.5    8      5.93   7.33
total utility                      9.34          11.5          13.26
utility from cooperation (2 · D)   3.24          4             2.66
y1                                 0             1             0
y2                                 1             0             1

Table 3: Solution of the multicriteria game with choice for coalitions {α, β}, {α, γ} and {β, γ}
Coalition {α, β} prefers the second variant, as we see in Table 3 (y2 = 1). Coalition {α, γ} would choose the first variant (y1 = 1) and coalition {β, γ} prefers the second variant.
3) Finally, we determine the allocation of the grand coalition by the following program:
max_{x_α, x_β, x_γ, D}  D
subject to
u_a({α, β, γ}) − u_a({a}) ≥ D                                  ∀a ∈ {α, β, γ}
x_{aτ} ≥ 0                                                     ∀a ∈ {α, β, γ}, ∀τ ∈ {1, 2}
\sum_{a∈{α,β,γ}} x_{aτ} = \sum_{σ=1}^{2} y_σ v^{τσ}({α, β, γ})     ∀τ ∈ {1, 2}
u_a({α, β, γ}) ≥ u_a(S′) + ε                                   ∀a ∈ {α, β, γ}, ∀S′ ∈ Ω
y_1 + y_2 = 1
y_1, y_2 binary,
where Ω = {{α, β}; {β, γ}; {α, γ}; {α}; {β}; {γ}}. For the solution for the grand coalition see Table 4.
                                   player α   player β   player γ
1st criterion                      8.45       7.55       0
2nd criterion                      0          7.27       8.73
utility                            4.23       7.33       8.73
total utility                      20.29
utility from cooperation (3 · D)   8.18
y1                                 0
y2                                 1

Table 4: Solution of the multicriteria game with choice for coalition {α, β, γ}
There exists a feasible solution for the grand coalition, and it is also the solution with the greatest utility from cooperation. The grand coalition is stable; the players prefer the second variant (y2 = 1). Therefore the coalition structure of the game is ({α, β, γ}).
4 Conclusion
In the paper we formulated the multicriteria game with choice. This is a new situation in which players may have the possibility to choose their payoff vectors from two or more variants. In a one-criterion game, this situation would be meaningless: a player or a coalition would always choose the greater payoff. However, if the payoff function is a vector, it is not clear which payoff vector is greater. We considered known preferences, used utility theory, and solved the problem as a mixed integer programming problem. At the end we illustrated the solution on an example – a multicriteria game of three players.
Acknowledgements
The article is supported by internal grant agency VŠE IGA 54/2015.
References [1] Tichá, M., Dlouhý, M.: Multicriteria coalitional games with known individual weights, European Journal of Operational Research, submitted.
Measuring Comovements by Univariate and Multivariate Regression Quantiles: Evidence from Europe
Petra Tomanová1
Abstract. The aim of this paper is to measure changes in dependencies among returns on equity indices for European countries in tranquil periods against crisis periods and to investigate their asymmetries in the lower and upper tails of their distributions. The approach is based on a conditional probability that a random variable is lower than a given quantile while another random variable is also lower than its corresponding quantile. Time-varying conditional quantiles are modeled by regression quantiles. In addition to the univariate conditional autoregressive models, the vector autoregressive extension is considered. In the second step, the conditional probability is estimated through the OLS regression. The results document a significant increase in European equity return comovements in bear markets during the crises of the 1990s and 2000s. The explicit controlling for the high volatility days does not appear to have an impact on the main findings.
Keywords: CAViaR, codependence, contagion, regression quantiles.
JEL classification: C22, C32, G15
AMS classification: 91G70
1 Introduction
Measuring comovements among equity markets has become an important issue in particular for policy makers who are concerned with the stability of the financial system. Due to the dependencies between equity returns, a loss in one market can be accompanied by a loss in another market. It is essential to analyze the dependencies properly, since underestimating high correlations may cause an even more severe underestimation of potential losses, which may lead to a weakened stability of the financial system. However, estimating equity market correlations in a financial crisis is a difficult exercise and misleading results have often been reported in the past, since there is often a spurious relationship between correlation and volatility [7], [8]. An increase in financial market comovements due to the transmission of crises across countries is referred to as "contagion" in the financial literature. From extensive reviews dealing with contagion, such as [4], [5] and [9], it is evident that the problem can generally be split into three basic categories: modeling the first and second moments of returns, estimating the probability of coexceedance, and methods based on copula theory. The econometric framework in terms of time-varying regression quantiles, which is used in this paper, has several advantages with respect to those methodologies. The problem of spurious correlation measures is not an issue here, since the regression quantiles are robust to heteroskedasticity and outliers. Moreover, the framework allows us to identify and measure asymmetries in comovement in the upper and lower tails of the distributions. Finally, no assumptions on either the joint distribution or the marginal distributions of the variables are needed, since the regression quantile is a semi-parametric technique [2]. This approach to measuring comovements was introduced by Cappiello et al. [2]. Since then, several applications have been published. Zagaglia et al. [11] investigated the impact of the turmoil on the relation between gold and the U.S. dollar. Beine et al. [1] combined the approach with a panel data analysis to investigate global integration. Cappiello et al. [3] applied the approach to euro area equity returns and tested whether comovements between them changed after the introduction of the euro currency.
1 University of Economics, Prague, Faculty of Informatics and Statistics, Department of Econometrics, W. Churchill Sq. 4, 130 67 Prague 3, Czech Republic, petra.tomanova@vse.cz.
2 Estimating comovements
The methodology of Cappiello et al. [2] can be split into three separate stages: modeling the time-varying quantiles by regression quantiles, estimating average probabilities of comovements by OLS regression, and testing the changes in financial comovements between tranquil and crisis periods. In this paper a generalization to an n-tuple of random variables is employed.
2.1 Modeling time-varying quantiles
Consider a set-up with n random variables, {yit : i = 1, . . . , n}, at time t. The conditional time-varying quantile qi,j,t at time t for a given confidence level θj ∈ (0, 1) and for a random variable yit conditional on Ft−1 is defined by Pr[yit ≤ qi,j,t | Ft−1] = θj. Let qit(β_ij) be an empirical specification of qi,j,t, where β_ij is a p-vector of parameters. The optimization problem for an estimator β̂_ij,T of the unknown vector of parameters β_ij is defined as

min_{β_ij} (1/T) \sum_{t=1}^{T} ρ_{θj}(yit − qit(β_ij)) = (1/T) \sum_{t=1}^{T} (θj − 1[yit ≤ qit(β_ij)]) × (yit − qit(β_ij)),
where ρθ (e) = eψθ (e), ψθ (e) = θ − 1[e ≤ 0].
For the purpose of modeling the individual quantiles, the methodology of Engle et al. [6] is adopted. The authors proposed conditional autoregressive value at risk by regression quantiles (CAViaR) models, which were primarily constructed to estimate the value at risk (VaR). Since VaR is defined as a particular quantile of future portfolio values conditional on currently available information, Pr[yt < VaR_t | Ft−1] = θ, the problem of finding VaR_t is clearly equivalent to estimating the time-varying conditional quantiles. The generic CAViaR model of Engle et al. [6] with a dummy variable for the crisis can be specified as

qit(β_ij) = β_{0,ij} + \sum_{k=1}^{v} β_{k,ij} q_{i,t−k}(β_ij) + \sum_{l=1}^{w} β_{l,ij} f(x_{i,t−l}) + β_{p−1,ij} S_{1,t},
where f(x_{i,t−l}) is a function of a finite number of lagged values of observable variables. The vector of parameters, β_ij = [β_{0,ij} β_{1,ij} . . . β_{p−1,ij}]′, is p-dimensional, in particular p = 1 + v + w + 1. The autoregressive terms β_{k,ij} q_{i,t−k}(β_ij), k = 1, . . . , v, ensure that the quantile changes smoothly over time, and f(x_{i,t−l}) links qit(β_ij) to observable variables that belong to the information set. The dummy variable S_{1,t} identifies the crisis periods. Furthermore, a simple version of a vector autoregressive (VAR) extension to the quantile models for two variables, yit, i = 1, 2, [10] is adopted:

q_{1t}(β_j) = b_{11,j} S_{1,t} + x′_t β_{1j} + b_{12,j} q_{1,t−1}(β_j) + b_{13,j} q_{2,t−1}(β_j),
q_{2t}(β_j) = b_{21,j} S_{1,t} + x′_t β_{2j} + b_{22,j} q_{1,t−1}(β_j) + b_{23,j} q_{2,t−1}(β_j),        (1)
where β_{1j} = [b_{10,j} b_{14,j} b_{15,j} . . .]′, β_{2j} = [b_{20,j} b_{24,j} b_{25,j} . . .]′ and x_t = [1 x_{1t} x_{2t} . . .]′ denotes predictors belonging to Ft−1. The matrix of parameters β_j is estimated by the quasi-maximum likelihood (QML) method. Once the QML estimates β̂_j are obtained, the conditional quantile functions q̂_{i,j,t} = q_{it}(β̂_j) can be computed.
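A minimal sketch of fitting one univariate CAViaR-type quantile by minimizing the tick (quantile) loss from the optimization problem above is given below. It is our own illustration, not the paper's estimation code: the specification uses f(x_{t−1}) = |y_{t−1}| plus the crisis dummy, and the data, starting values and optimizer choice are placeholders.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
T = 1500
S1 = np.zeros(T); S1[1000:1250] = 1.0                       # hypothetical crisis window
y = rng.standard_normal(T) * (1.0 + S1)                     # returns, more volatile in the crisis

def caviar_path(beta, y, S1, theta):
    b0, b1, b2, b3 = beta
    q = np.empty_like(y)
    q[0] = np.quantile(y[:100], theta)                      # a common way to initialize the recursion
    for t in range(1, len(y)):
        q[t] = b0 + b1 * q[t - 1] + b2 * abs(y[t - 1]) + b3 * S1[t]
    return q

def tick_loss(beta, y, S1, theta):
    q = caviar_path(beta, y, S1, theta)
    e = y - q
    return np.mean((theta - (e <= 0)) * e)                  # rho_theta from the text

theta = 0.05
start = np.array([np.quantile(y, theta) * 0.1, 0.8, -0.2, -0.1])
res = minimize(tick_loss, start, args=(y, S1, theta), method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-8})
q_hat = caviar_path(res.x, y, S1, theta)
print(res.x, np.mean(y <= q_hat))                           # the hit rate should be close to theta
```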
2.2 Estimation of conditional probability of comovement
The approach to measuring comovements is based on the estimation of the probability that a random variable y_it at time t falls below its conditional quantile, given that the other random variables {y_kt : k = 1, . . . , n ∧ k ≠ i} are also below their corresponding quantiles. Let $\bar F_{j,T} \equiv T^{-1}\sum_{t=1}^{T} F_t(q_{1,j,t},\dots,q_{n,j,t})$ be the average probability that all random variables {y_it : i = 1, . . . , n} fall below their quantiles over a given time period, where F_t(q_{1j}, . . . , q_{nj}) is the cumulative distribution function for the set of quantiles (q_{1,j,t}, . . . , q_{n,j,t}). The conditional quantiles are estimated univariately and multivariately by regression quantiles. Then the
indicator variables are constructed in such a way that they are equal to one if the realized random variable is lower than the conditional quantile and zero otherwise. To obtain the average probability of comovement among y_1t, . . . , y_nt, the following linear regression model is constructed and estimated:
$$ I_{1t}(\hat\beta_{1j,T}) \times I_{2t}(\hat\beta_{2j,T}) \times \cdots \times I_{nt}(\hat\beta_{nj,T}) = W_t \alpha_{0j} + \epsilon_t, \qquad j = 1,\dots,m, \qquad (2) $$
where I_it(β_ij) ≡ 1[y_it ≤ q_it(β_ij)], i = 1, . . . , n, W_t ≡ [1 S_t] and α_{0j} is an s-vector of unknown parameters. S_t denotes an (s-1) row vector of time dummies. Let α̂_{j,T} be the OLS estimator of (2), α̂_{lj,T} be the (l+1)-th element of α̂_{j,T} for l = 0, 1, . . . , s-1, and S_{lt} represent the l-th element of S_t. The row vector S_t^{(-l)} denotes the vector S_t from which the l-th element is removed, and 0 is a zero vector of the required dimension. Then the dummies S_t are defined as {S_{lt} = 1, S_t^{(-l)} = 0}_{t=1}^{T} [2]. Cappiello et al. [2] provided a set of conditions under which the estimated intercept α̂_{0j,T} of (2) converges in probability to the average probability of comovement in the period t ∈ {t : S_t = 0}, and the sum of estimated parameters α̂_{0j,T} + α̂_{lj,T} converges to the average probability of comovement in the period corresponding to the dummy.
2.3 Testing changes in financial comovements
For the confidence level θ_j let
$$ F_t^-(\theta_j) \equiv \theta_j^{-1}\,\Pr\big(y_{1t} \le q_{1t}(\beta_{1j}),\dots,y_{nt} \le q_{nt}(\beta_{nj})\big), $$
$$ F_t^+(\theta_j) \equiv (1-\theta_j)^{-1}\,\Pr\big(y_{1t} \ge q_{1t}(\beta_{1j}),\dots,y_{nt} \ge q_{nt}(\beta_{nj})\big). $$
The likelihood p_t(θ_j) of a tail event at time t for any subset Γ_t of random variables {y_it : i = 1, . . . , n}, given that a tail event occurred for {y_it : i = 1, . . . , n ∧ y_it ∉ Γ_t}, is defined as
$$ p_t(\theta_j) \equiv \begin{cases} F_t^-(\theta_j) & \text{if } \theta_j \le 0.5,\\ F_t^+(\theta_j) & \text{if } \theta_j > 0.5. \end{cases} $$
For the purpose of this study the realization of the random variable y_it is the return of an equity index for the i-th country at time t, and the first dummy variable S_{1,t} from regression (2) identifies crisis times. Then the probability of comovement in tranquil times and the probability of comovement in crisis times are defined as
$$ p_0(\theta_j) \equiv C_0^{-1} \sum_{t \in \{t:\, S_t = 0\}} p_t(\theta_j), \qquad p_1(\theta_j) \equiv C_1^{-1} \sum_{t \in \{t:\, S_{1,t}=1,\, S_t^{(-1)}=0\}} p_t(\theta_j), $$
respectively, where C_0 and C_1 are the numbers of observations during the tranquil and the crisis period, respectively. Cappiello et al. [2] apply the following rigorous joint test for an increase in comovements:
$$ \hat\delta(\underline\theta, \overline\theta) = (\#\theta)^{-1} \sum_{\theta_j \in [\underline\theta, \overline\theta]} \big[p_1(\theta_j) - p_0(\theta_j)\big] = (\#\theta)^{-1} \sum_{\theta_j \in [\underline\theta, \overline\theta]} \hat\alpha_{1,j}, $$
where #θ is the number of addends in the sum and α̂_{1,j} is the OLS estimate of α_{1,j} in (2). When the null hypothesis H_0: δ̂(θ̲, θ̄) = 0 is rejected in favor of the alternative hypothesis H_1: δ̂(θ̲, θ̄) ≠ 0, the comovements change between tranquil and crisis periods.
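To show how regression (2) and the joint test fit together computationally, the following R sketch regresses the joint tail-event indicator on a crisis dummy for a grid of quantile levels and averages the resulting dummy coefficients into δ̂; the returns, the crisis window and the constant stand-in quantiles are purely illustrative and would in practice be replaced by the CAViaR-fitted conditional quantiles.

```r
# Illustrative sketch of regression (2) and the joint statistic delta-hat.
set.seed(2)
n_days <- 1500
y1 <- rnorm(n_days); y2 <- rnorm(n_days)
S1 <- as.numeric(seq_len(n_days) %in% 1001:1100)                 # hypothetical crisis window
theta <- seq(0.05, 0.50, by = 0.05)
q1 <- sapply(theta, function(th) rep(quantile(y1, th), n_days))  # stand-in for CAViaR quantiles
q2 <- sapply(theta, function(th) rep(quantile(y2, th), n_days))

alpha1 <- sapply(seq_along(theta), function(j) {
  I <- (y1 <= q1[, j]) * (y2 <= q2[, j])                         # joint lower-tail event indicator
  coef(lm(I ~ S1))["S1"]                                         # alpha_1j: change in comovement probability in crisis
})
delta_hat <- mean(alpha1)                                        # average excess comovement over theta in [0.05, 0.50]
delta_hat
```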
3 Data
Returns on Morgan Stanley Capital International (MSCI) world indices, which are used in this paper, are market-value-weighted, do not include dividends and are denominated in the local currency. The sample includes prices of 16 European countries from December 31, 1987 to April 30, 2015. The sample period spans 7 131 days, which results in 7 130 continuously compounded log returns for each of the European countries. However, the final sample includes fewer observations since the data have been adjusted
for non-simultaneous closures. The same data cleaning procedure as in [2] is employed. It takes into account the fact that national holidays and administrative closures do not fully coincide. For each pair of countries a new data set is created. It includes only equity returns for days on which both markets were open that day and had been open the day before. After the data cleaning the number of valid daily returns varies from 6 193 (the Greece – Portugal country pair) to 6 738 (the Germany – Netherlands country pair). Another important aspect of the data is the cross-country synchronicity of daily returns. Descriptive statistics and sample characteristics for each country were investigated. The presence of financial crises in the sample results in extreme returns and strong fluctuations. The common characteristics indicate a leptokurtic distribution of returns, and the null hypothesis of the Jarque-Bera test, i.e. of normally distributed returns, is rejected at the 1% significance level. The presence of autocorrelation is confirmed by the Ljung-Box test and the presence of a unit root in the return series is readily rejected by the Augmented Dickey-Fuller test at the 1% significance level. Hence the methodology is suitable for analyzing the data sample. Cappiello et al. [2] determined 7 crisis periods, 3 of which have been taken from the paper of Forbes et al. [7]: Tequila crisis – November 1, 1994 to March 31, 1995; Asian crisis – June 2, 1997 to December 31, 1997; Russian crisis – August 3, 1998 to December 31, 1998; Argentinean crisis – March 26, 2001 to May 15, 2001; the US sub-prime crisis – February 15, 2007 to March 30, 2007; Lehman bankruptcy – September 1, 2008 to October 31, 2008; turbulences in the euro area – August 1, 2011 to September 30, 2011. The subsequent period of September 4, 2012 – April 30, 2015 represents a rather tranquil period of time; thus further crisis periods are not defined. The crisis sample includes 530 equity returns. After the data cleaning procedure is applied, the crisis sample sizes vary from 466 observations (the Ireland – Portugal country pair) to 510 observations (the Netherlands – Norway country pair). In the data for European countries, clear evidence of higher correlations over turbulent times than over tranquil times is found. The highest increase in correlation is exhibited by the Denmark – Portugal country pair, for which the excess correlation between tranquil and crisis periods is equal to 0.29. The weakest effect is present in the returns of four important European economies – Belgium, Germany, France and the Netherlands. For all bivariate combinations of the 16 European countries, the average correlation is approximately 0.54 over tranquil days and 0.70 over days of turbulence. Forbes et al. [7] pointed out that such differences are caused by the heteroskedasticity issue rather than by an increased dependence among country-specific returns during the crises. Time-varying regression quantiles are generally robust to heteroskedasticity, and hence improper inference due to spurious correlation is avoided.
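The pairwise adjustment for non-simultaneous closures can be sketched as follows (a rough illustration rather than the exact procedure of [2]): a day of a country pair is kept only when both markets were open and the previous trading day is the same in both calendars, so that the two returns are computed over the same interval. The date vectors below are hypothetical.

```r
# Sketch of the cleaning for non-simultaneous closures (hypothetical trading calendars).
clean_pair_dates <- function(dates_a, dates_b) {
  common <- as.Date(intersect(dates_a, dates_b), origin = "1970-01-01")
  ia <- match(common, dates_a)
  ib <- match(common, dates_b)
  keep <- ia > 1 & ib > 1 & dates_a[pmax(ia - 1, 1)] == dates_b[pmax(ib - 1, 1)]
  common[keep]                                        # days with a valid return in both markets
}

a <- as.Date(c("2015-04-27", "2015-04-28", "2015-04-29", "2015-04-30"))
b <- as.Date(c("2015-04-27", "2015-04-29", "2015-04-30"))
clean_pair_dates(a, b)   # 2015-04-30 survives; 2015-04-29 does not (different previous trading days)
```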
4 Results and discussion
Ten different univariate CAViaR models are considered for the purpose of this study: the adaptive model with G = 10, the symmetric absolute value and asymmetric slope models of Engle et al. [6], the non-linear specification of Cappiello et al. [2], and their modifications. In addition to the univariate models, the multivariate model (1) with x_t = [1 |y_{1,t-1}| |y_{2,t-1}|]' is estimated. Each model is estimated for 19 quantile probabilities θ_j ranging from 5% to 95% using a quasi-Newton algorithm with a cubic line search procedure. The selection of the "best" univariate models is based on the in-sample Dynamic Quantile (DQ) test of Engle et al. [6]. In addition, the following criteria are taken into account: computational time, occurrence of the quantile-crossing problem, and the number of parameters and their significance. The quantile-crossing problem refers to the situation when the monotonicity assumption q_it(β_i1) < q_it(β_i2) < · · · < q_it(β_im-1) < q_it(β_im) implied by 0 < θ_1 < θ_2 < · · · < θ_m-1 < θ_m < 1 is violated. Further, the sample is split into two sub-periods, January 1, 1988 – September 3, 2012 and September 4, 2012 – April 30, 2015, to assess the univariate and the multivariate models by an out-of-sample DQ test. Similarly to the case of the univariate models, this comparison takes into account some additional criteria: the correct fraction of exceptions, computational time and the quantile-crossing problem. Based on the aforementioned criteria, the following CAViaR model is selected as the most suitable model for the European countries: q_it(β_ij) = β_{0,ij} + β_{1,ij} q_{i,t-1}(β_ij) + β_{2,ij} y_{i,t-1} + β_{3,ij} y_{i,t-2} + β_{4,ij} |y_{i,t-1}| + β_{5,ij} S_{1,t}. In the pivotal analysis, all 120 European country pairs are analyzed for the period of January 1, 1988 – April 30, 2015. Four null hypotheses δ̂(θ̲, θ̄) = 0 for different intervals are tested for each European
country pair. The results are reported in Figure 1, where the t statistics are visualized. They are computed from the average of α̂_{1,j} over θ_j, i.e. $\hat\delta(\underline\theta, \overline\theta) = (\#\theta)^{-1} \sum_{\theta_j \in [\underline\theta, \overline\theta]} \hat\alpha_{1,j}$, and the associated standard errors. The t statistic provides a joint test for changes in comovements between two countries. The blue color indicates significant changes in comovements between tranquil and crisis periods and the red color indicates insignificant ones. The matrix on the left represents results for the lower tail of the distributions: the upper triangle represents results for δ̂(0.05, 0.25) and the lower triangle for δ̂(0.25, 0.50). Similarly, the matrix on the right represents results for the upper tail of the distributions: the upper triangle represents results for δ̂(0.50, 0.75) and the lower triangle for δ̂(0.75, 0.95).
Figure 1 Results from the joint test for changes in comovements (t statistic)
The null hypothesis for δ̂(0.05, 0.25) is not rejected for only 5 out of 120 country pairs at the 5% significance level, namely Austria – Greece, Belgium – Denmark, Belgium – Greece, Greece – Spain and Greece – Switzerland: the comovements do not change significantly between tranquil and crisis periods for these country pairs. However, for the 96% majority of country pairs, the probabilities of comovements for θ_j ∈ [0.05, 0.25] are significantly higher during the crisis times. In the case of δ̂(0.25, 0.50), the differences in comovements are significant for 88% of all country pairs. The null hypothesis δ̂(0.25, 0.50) = 0 is not rejected mostly for country pairs containing France, Germany and Spain. However, for the upper tail of the distributions the findings are opposite, in the sense that for the majority of country pairs, 86.7% and 85.0%, the changes in comovements are not significant in the case of δ̂(0.50, 0.75) and δ̂(0.75, 0.95), respectively. This evidence clearly suggests that for most country pairs the comovements change significantly between tranquil and crisis periods in the lower half of the distribution but not in the upper half (using the 5% significance level). These findings are also verified by joint tests considering the two intervals δ̂(0.05, 0.50) and δ̂(0.50, 0.95). To check the robustness of the estimates, the analyses have been repeated for two different time spans and controlled for different levels of volatility. In addition, the analyses have been repeated for country triplets. The insignificance of changes in comovements in the upper tail of the distributions is even more pronounced when the shorter time span of January 2, 1995 – April 30, 2015 is used. For δ̂(0.50, 0.75) the changes are significant for only 1 out of 120 country pairs, Denmark – Switzerland, and for δ̂(0.75, 0.95) the changes are significant for only 6 country pairs. Results from the shortest period of June 3, 2002 – April 30, 2015 are quite clear: there is no significant change in the probabilities of comovement in the upper tail of the distributions for all combinations of the 16 European countries except for 3 country pairs, although for the lower tail the changes are significant for 96% of the country pairs. Moreover, when n is set to 3 and the longest sample period is considered, the results indicate significant changes in the lower tail of the distribution for all country triplets. Upper-tail changes are not significant for the vast majority of the country triplets, as in the previous case. Finally, dummy variables for different levels of volatility in financial markets are used as control variables. Let the control dummy be $S_{2,t} \equiv \mathbf{1}\big[\sigma^2_{EWMA,t} > q_{0.90}(\sigma^2_{EWMA,t})\big]$, where σ²_{EWMA,t} is based on the average return on both stock markets forming the country pair, computed as the exponentially weighted
moving average (EWMA) with decay coefficient equal to λ = 0.97. Then let q_{0.90}(σ²_{EWMA,t}) be its 90% unconditional quantile. The inclusion of the control dummy S_{2,t} in the models does not have an impact on the main findings.
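The construction of the volatility control dummy can be illustrated with a short R sketch; the returns are simulated placeholders, and the recursion below is one standard way to compute an EWMA variance with decay λ = 0.97 (the exact specification used in the paper may differ in details such as initialization).

```r
# Sketch of the volatility control dummy S2 for one country pair.
set.seed(4)
y1 <- rnorm(3000, sd = 0.01); y2 <- rnorm(3000, sd = 0.01)       # placeholder returns
r  <- (y1 + y2) / 2                                              # average return of the pair
lambda <- 0.97
sigma2 <- numeric(length(r))
sigma2[1] <- var(r)                                              # assumed initialization
for (t in 2:length(r))
  sigma2[t] <- lambda * sigma2[t - 1] + (1 - lambda) * r[t - 1]^2
S2 <- as.numeric(sigma2 > quantile(sigma2, 0.90))                # high-volatility dummy
```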
Overall, the results confirmed that the distributions of returns for European markets are characterized by strong asymmetries. These asymmetries cannot be detected by simple correlations or by the estimates of adjusted correlation coefficients of Forbes et al. [7].
5 Conclusion
The methodology of measuring changes in comovements was applied to 16 European equity markets. Results for the period of January 1, 1988 to April 30, 2015 show that comovements in equity returns across European markets tend to increase significantly in turbulent times relative to tranquil times in the lower tail of the distribution. However, for the vast majority of markets the differences are not significant in the upper tail of the distribution. These results for the upper tail contradict the findings of Cappiello et al. [2], who analyzed Latin American countries, since those authors documented significant changes of comovements in crisis times for the whole distribution of the returns. Moreover, the changes in comovements in the upper tail of the distributions tend to be even more insignificant for the shorter periods of January 2, 1995 – April 30, 2015 and June 3, 2002 – April 30, 2015. On the other hand, the changes in comovements remain significant in the lower tail of the distributions. This asymmetric dependence documents a significant increase in European equity return comovements in bear markets during the crises of the 1990s and 2000s and insignificant changes in bull markets.
Acknowledgements
This work was supported by the research grant VŠE IGA F4/73/2016, Faculty of Informatics and Statistics, University of Economics, Prague. This support is gratefully acknowledged.
References
[1] Beine, M., Cosma, A., and Vermeulen, R.: The dark side of global integration: Increasing tail dependence, Journal of Banking & Finance 34 (2010), 184–192.
[2] Cappiello, L., Gérard, B., Kadareja, A., and Manganelli, S.: The impact of the euro on equity markets, Journal of Financial Econometrics 12 (2014), 645–678.
[3] Cappiello, L., Kadareja, A., and Manganelli, S.: Measuring comovements by regression quantiles, Journal of Financial and Quantitative Analysis 45 (2010), 473–502.
[4] De Bandt, O., and Hartmann, P.: Systemic risk: a survey, ECB working paper (2000).
[5] Dungey, M., Fry, R., González-Hermosillo, B., and Martin, V. L.: Empirical modelling of contagion: a review of methodologies, Quantitative Finance 5 (2005), 9–24.
[6] Engle, R. F., and Manganelli, S.: CAViaR: Conditional autoregressive value at risk by regression quantiles, Journal of Business & Economic Statistics 22 (2004), 367–381.
[7] Forbes, K. J., and Rigobon, R.: No contagion, only interdependence: measuring stock market comovements, The Journal of Finance 57 (2002), 2223–2261.
[8] Longin, F., and Solnik, B.: Extreme correlation of international equity markets, The Journal of Finance 56 (2001), 649–676.
[9] Pericoli, M., and Sbracia, M.: A primer on financial contagion, Journal of Economic Surveys 17 (2003), 571–608.
[10] White, H., Kim, T., and Manganelli, S.: VAR for VaR: Measuring tail dependence using multivariate regression quantiles, Journal of Econometrics 187 (2015), 169–188.
[11] Zagaglia, P., and Marzo, M.: Gold and the US dollar: tales from the turmoil, Quantitative Finance 13 (2013), 571–582.
The Use of Cluster Analysis for Development of Categorical Factors in Exploratory Study: Facts and Findings from the Field Research
Sigitas Vaitkevicius 1, Vladimira Hovorkova Valentova 2, Evelina Zakareviciute 3
Abstract. This paper describes the development of categorical latent variables and their use for typology development. The main idea of this modeling was to investigate the possibility of using classification methods instead of factor analysis for the development of the final latent variable, which can cumulatively explain the set of primary indicators. The modeling is based on empirical findings from a study of online retail consumers' behavior. The selected data allowed confirming the statement that even small data sets analyzed with classification methods can display significant ecological validity. The modeling was performed in three steps. First, the number of primary indicators was reduced using factor analysis; based on it, several latent variables were created. Second, k-means cluster analysis was used instead of a secondary factor analysis for the development of three cluster variables that represent six clusters in total. Third, all three variables were used for the development of the final latent variable, which is categorical and represents the eight theoretically possible and five empirically confirmed categories.
Keywords: Categorical variables, latent classes, typology, online retail
JEL Classification: C18, C38, C93
AMS Classification: 62H25, 62H30
1 Introduction
Factor analysis and cluster analysis have already been discussed widely – e.g. in [3], [6], [10], [13], [14]. Factor analysis has until now mostly been used for the development of latent variables which describe variability among observed correlated variables in terms of a potentially lower number of unobserved variables called factors. An advantage of factor analysis is definitely its ability to reduce the number of observed variables by connecting them into latent variables that cumulatively explain the correlated content of the observed variables. From the point of view of interpretation logic, the correlation of observed variables has to be logically meaningful. If it is not, the development of the factor would be logically unexplainable. In such cases the methodological literature has already proposed alternative research methods. One of them is cluster analysis, because it explores the types of relations in the data instead of their correlations [6], [10]. Cluster analysis is focused on the partitioning of similar objects into meaningful classes when both the number of classes and the composition of the classes are to be determined [5], [10]. There are studies that answer the questions of when and what type of classification design should be used in order to get the correct model of the clusters [4]. For example, in model-based clustering it is assumed that the objects under study are generated by a mixture of probability distributions with one component corresponding to each class. When the attributes of objects are continuous, cluster analysis is sometimes called latent profile analysis [8], [11], [1], [17]. When the attributes are categorical, cluster analysis is sometimes called latent class analysis (LCA) [11], [9], [1], [15]. There is also cluster analysis of mixed-mode data [5], where some attributes are continuous while others are categorical. This paper describes the situation when the complete verification of the questionnaire is not possible using factor analysis due to uncorrelated latent factors. It is based on real findings from a study that is used as a case representing a possible research scenario. It combines three of the four types of the latent variable model.
1 Kaunas University of Technology, Department of Management, Donelaicio g. 20 – 405, Kaunas, Lithuania, sigitas.vaitkevicius@gmail.com
2 Technical University of Liberec, Department of Economic Statistics, Studentska 2, Liberec, vladimira.valentova@tul.cz
3 Kaunas University of Technology, Department of Management, Donelaicio g. 20 – 405, Kaunas, Lithuania, evelina.zakareviciute@gmail.com
They are, in order of application: factor analysis, latent profile analysis, and latent class analysis [7]. This example was selected because it explicates the situation when a small number of observations is meaningful and statistically significant. The possibility of using factor analysis, latent profile analysis and latent class analysis with a small number of observations is also discussed in the scientific literature [12]. In this paper all this knowledge is combined in order to answer the questions whether it is possible to verify a questionnaire using a classification design instead of a correlative one, and how rational and significant such a verification may be when it is used with a small number of observations.
2 Assumptions that Lead to the Use of Classification Methods instead of Correlation Methods for Verification of the Questionnaires
A latent variable model depends on two main components in its structure: the manifest variable, which describes the type of the initial variable, and the latent variable, which represents the factor variable. Depending on their nature, the manifest variable and the latent variable can each be continuous or categorical [2, p. 145]. The variable model presented in the literature shows that a factor can be created regardless of the type of the variable. There is just one question that remains unanswered: whether it is possible to develop a data analysis design which allows mixing continuous and categorical types of variables in order to verify the questionnaire. Next, this paper presents a discussion that grounds the conditions under which it is possible to use correlative methods and when to use classification methods for the verification. In factor analysis and latent trait analysis the latent variables are treated as continuous, normally distributed variables, whereas in latent profile analysis and latent class analysis they are treated as coming from a multinomial distribution [7]. The manifest variables in factor analysis and latent profile analysis are continuous and in most cases their conditional distribution given the latent variables is assumed to be normal. In latent trait analysis and latent class analysis the manifest variables are discrete. These variables can be dichotomous, ordinal or nominal [7]. Their conditional distributions are assumed to be binomial or multinomial. Indeed, it is quite a different situation, which also gives rise to the question about the possibility of combining the ways of latent variable development in order to extract one single variable which would verify the consistency of the questionnaire. Answering these questions becomes possible by comparing the behavior of the data. The variables of such analyses are not comparable, but the pattern of their behavior can be compared, especially if it is enriched by graphs and some statistics. To illustrate the relationship between the results obtained by the application of correlative and classification methods, line diagrams were used together with the main statistics from the factor, reliability and k-means cluster analyses. For illustration, five theoretically possible variables were created (see Figure 1 (d)). They are created in such a way that VAR1 is filled with random numbers that range from 1 to 7, and VAR2 is created by adding 3 to each VAR1 number except the last one, to which 2.9 is added. This is done because if a VAR2 variable were created which correlated with VAR1 exactly by 1, it would not be possible to estimate the factor coefficients, because the correlation matrix would not be positive definite. The correction does not change the essence; it just allows conducting the comparative analysis. VAR3 is the reversed VAR1. VAR4 is created by repeating VAR1 two times, which means that the values of VAR4 are 1, 5, 2, 7, 3, 4, 5, 7, 1, 2, 1, 5, 2, 7, 3, 4, 5, 7, 1, 2. VAR5 is constructed so that its first 10 items correspond to VAR2 and its second 10 items correspond to VAR3; it therefore assumes these values: 4, 8, 5, 10, 6, 7, 8, 10, 4, 4.9, 7, 3, 6, 1, 5, 4, 3, 1, 7, 6. In this way, we work with variables which correlate positively by 1 (VAR1 and VAR2), negatively by -1 (VAR1 and VAR3 absolutely, or VAR2 and VAR3 approximately), and have zero (0) correlation (VAR4 and VAR5) (see Table 1).
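The construction above can be reproduced directly. The following R sketch builds the illustrative variables, checks the three correlation situations, and runs the two-group k-means on z-scores underlying the comparative models in Figure 1; VAR3 is taken here as the reverse-scored VAR1 on the 1–7 scale (8 - VAR1), which reproduces the VAR5 values listed above.

```r
# Illustrative variables from Figure 1 (d) and the two-cluster k-means of panel (a).
VAR1 <- c(1, 5, 2, 7, 3, 4, 5, 7, 1, 2)
VAR2 <- VAR1 + 3; VAR2[10] <- VAR1[10] + 2.9   # avoid a singular correlation matrix
VAR3 <- 8 - VAR1                               # reverse-scored VAR1
VAR4 <- rep(VAR1, 2)
VAR5 <- c(VAR2, VAR3)

cor(VAR1, VAR2)                     # approximately  1
cor(VAR2, VAR3)                     # approximately -1
cor(VAR4, VAR5)                     # approximately  0

z <- scale(cbind(VAR1, VAR2))       # z-scores, as in the comparative models
set.seed(5)
kmeans(z, centers = 2)$centers      # symmetrical two-cluster model of panel (a)
```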
The results of the comparative analysis are presented in Figure 1 for the k-means cluster analysis and in Table 1 for the factor and reliability analysis. The comparative analysis has shown that the Symmetrical two cluster model (a) presented in Figure 1 completely corresponds with the strong positive correlation and can be used as a substitute for the latent variable developed on the basis of the factor analysis, which is characterized by factor scores equal to 1.000 and by Cronbach's Alpha also equal to 1.000.
(a) Symmetrical two cluster model with strong positive correlation (r = 1, p = 0.000), based on k-means clustering with n = 10
(b) Asymmetrical two cluster model with strong negative correlation (r = -1, p = 0.000), based on k-means clustering with n = 10
(c) Mixed four cluster model with zero correlation (r = 0, p = 1.000), based on k-means clustering with n = 20
(d) Illustrative data (VAR1–VAR5)
Figure 1 Illustrative Data (d) and Their Comparative Models
The Asymmetrical two cluster model (b) presented in Figure 1 completely corresponds to the strong negative correlation and can be used as a substitute for the latent variable developed on the basis of the factor analysis, which is characterized by factor scores equal to 1.000 and -1.000. In this case the Cronbach's Alpha shows a negative average covariance among items.
Variables   | Correlation | Sig. (1-tailed) | KMO   | Bartlett's Approx. Chi-square | df | Sig.  | % of Variance | Factor Score  | Cronbach's Alpha
VAR1, VAR2  | 1.000       | 0.000           | 0.500 | 64.658                        | 1  | 0.000 | 99.986        | 1.000; 1.000  | 1.000
VAR2, VAR3  | -1.000      | 0.000           | 0.500 | 64.658                        | 1  | 0.000 | 99.986        | 1.000; -1.000 | -20564.444(a)
VAR4, VAR5  | 0.02        | 0.497           | 0.500 | 0.000                         | 1  | 0.995 | 0.081         | 0.028; 0.028  | 0.003
Extraction Method: Alpha Factoring. (a) The value is negative due to a negative average covariance among items. 1 factor extracted, 1 iteration required.
Table 1 Correlation, Factor Analysis and Reliability Analysis Coefficients for the Theoretical Variables
The Mixed four cluster model (c) displayed in Figure 1 completely corresponds to the zero correlation and cannot be used as a substitute for any latent variable developed on the basis of the factor analysis. In this case the Cronbach's Alpha is equal to 0.003, which means that no continuous factor is possible. Typically, in this situation researchers who use factor analysis for questionnaire verification state that further development of latent variables is impossible. Moreover, the extraction of one single variable which should verify the consistency of the questionnaire also has to be terminated. The reason can be seen in the Mixed four cluster model (c). There is the alternative option of using latent profile analysis for the development of the latent variable. In this case the main difference is that the latent variable is categorical, and if it is the final variable this poses no problem. But if it lies in the middle of the verification process, a more advanced verification design based on multi-method analysis is needed. In the next chapter an empirical example of such a multi-method verification design is presented.
3 Methods and Design of the Research
As an empirical example, the Lithuanian on-line retail industry and the internet shop managers' and owners' perception of the demand formation motives in internet shopping have been studied. For this research the questionnaire technique was selected. The correspondent survey method in the presence of the researcher has been used. The discussion with the interviewee was used for an in-depth description of the content, but not strictly as a tool for attitude formation. For this research a questionnaire consisting of 112 questions was used; it was specially designed for this research. 39 questions were used as primary indicators for the extraction of latent variables. The 14 indicators together with 13 latent variables have been used for the development of the categorical variables. All other questions were designed as demographic and contextual variables that were used for the description of the interviewees' characteristics and for the description of the attractiveness of internet shopping. The research data have been processed using SPSS 20.0 and Microsoft Excel 2010. The license holders are Kaunas University of Technology in Lithuania and the Technical University of Liberec in the Czech Republic. Most of the Cronbach's Alpha values exceed 0.700 except only one – the "shopping quality" (0.352). It consists of two primary indicators whose factor score is 0.626. Inter-Item Correlations, Extraction Sums of Squared Loadings and Corrected Item-Total Correlations are also high, which shows, together with Cronbach's Alpha, that the latent variables are very homogeneous with respect to the extent of their common content. The relatively high homogeneity of the latent assertions is shown by the factor score, but in this case a higher abstraction rate has been observed for some variables (when the factor score was less than 0.700) that are included in the latent variables "Functionality" and "Increasing in accessibility"4. Because of this quality, their presence in the latent variables has been treated as a feature which is only partially related to the content of the latent variable. The primary variables whose factor score exceeded 0.700 are interpreted as essential features describing the latent variable. The data analysis has been made in this order: At first, the contextual analysis of each question has been performed. During it the answers to the questions have been compared with the comments from the discussion with the interviewee.
Mathematical Methods in Economics 2016
than 0.700) that are included into latent variables: „Functionality“ and „Increasing in accessibility“4. Because of this quality their presence in the latent variables has been treated as a feature which only partially related to the content of the latent variable. The primary variables which factor score exceeded 0.700 are interpreted as essential features describing the latent variable. The data analysis has been made in this order: At first, the contextual analysis of each question has been performed. During it the answers to the questions has been compared to the comments from the discussion with the interviewee.
Next, the "Encoding the real authorial intention using the Phenomenological Hermeneutic system" logical structure has been used for the interpretation of the findings [16]. It has been used for the theoretical modelling of the statement truth, which was needed for the development of the theoretical economic demand formation (EDF) motives model.
Factor analysis. For the reduction of the number of primary items, factor analysis using the Alpha Factoring method and Varimax axis rotation was selected. The obtained individual latent variables have additionally been tested using reliability analysis.
Latent profile analysis. Next, the typology of latent variables has been modelled using the k-means cluster method by selecting the smallest meaningful number of groups. It was needed to develop the typology of the economic demand formation (EDF) motives in the Lithuanian online retail industry. The typological modelling was performed in two stages. Firstly, it was used for an attribute structure consolidation on the categorical level. The problem of the significance of the results has been resolved by transforming the latent variable values to the z-scale. The conventional statistical practice is applied for the interpretation of the distance on the z-scale: if the distance between the points of two clusters was equal to or more than one standard deviation, the opinions were interpreted as differing significantly. Three cluster models were developed. The first one, the Demand inclination and growth interferences model (Q1), describes the factors that interfere with the demand formation. Two types of expert notion were extracted. The first one covers the notion of the three experts who think that the growth of online store trade is hindered by the complex effect of three latent variables: problems in the process of shopping, lack of information, and discrepancy between price and quality. The second type of expert notion, which covers the notion of the four experts, is based on the statement that the growth of online store trade is hindered by the lack of immediate communication, and that the variables lack of information and problems in the process of shopping have no impact on the demand inclination. The second model, the Demand emergence stimulation model (Q2), also divides the experts into two groups. The first one is based on four expert opinions that accessibility, time and information of the client about the product are critical to stimulate the demand formation. The other three experts identified the low price as a demand formation stimulation tool.
The third one, the Demand growth acceleration and growth incentives model (Q3), revealed that some e-stores (n=5) are focused on clients who look for a lower price (price-oriented shopping) and others (n=2) cater to ease-oriented shopping clients. Comparing the clusters depending on the Demand emergence stimulation model and those depending on the Demand growth acceleration and growth incentives model, it was noticed that some of the experts see the price as a factor which stimulates the emergence of the demand based on the attraction of new customers, while others see the price as a factor that accelerates the growth of the demand by exploiting the already existing customers. Latent class analysis. At the last stage, the typological modelling using the Q1, Q2 and Q3 categorical variables was performed. The two-step cluster method was used, applying the log-likelihood distance measure and selecting a fixed number of clusters (5). The biggest empirically possible number of groups was selected for this modelling. The main idea of this model is to build a realistic-empirical model based on the experts' evaluation which would be closest to the theoretical-mathematical model describing the idealistic typology of the economic demand formation (EDF) motives in the Lithuanian online retail industry.
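As a simplified illustration of the two-stage procedure (the study itself was carried out in SPSS with k-means and the two-step cluster method), the following R sketch z-scores a small set of hypothetical latent variables, derives one categorical cluster variable with k-means, and combines three such categorical variables into the final latent classes; all variable names and values are placeholders, not the study data.

```r
# Simplified illustration of building the categorical variables Q1-Q3 and the final typology.
set.seed(6)
latent <- data.frame(problems = rnorm(7), lack_info = rnorm(7), price_quality = rnorm(7))
z  <- scale(latent)                                   # latent variables on the z-scale
Q1 <- factor(kmeans(z, centers = 2)$cluster)          # two expert types for model Q1
# Q2 and Q3 would be built in the same way from their own latent variables;
# random placeholders are used here for illustration only.
Q2 <- factor(sample(1:2, 7, replace = TRUE))
Q3 <- factor(sample(1:2, 7, replace = TRUE))
type <- interaction(Q1, Q2, Q3, drop = TRUE)          # empirically realised categories
table(type)
```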
4 Working with different sizes of data sets, it should be noticed that indicators whose factor score is close to 0.500, compared to those that tend to be close to 0.999, are much broader in their content. For example, on the scale of 'Openness to changes', one of the nine primary indicators, 'risk acceptance', is too broad in its content for the latent variable, because risk can be an indicator of rational openness to forthcoming changes as well as an indicator of irrational activity, such as risky car driving and so forth. Because of such content, its factor score for the latent variable was close to 0.500 (Personnel management center (2007) Strategical individual competencies. KTU Faculty of Economics and Management, Personnel management center [CD catalogue], ISBN 978-9955-25-549-9).
4 Discussion on Results
As a result, five empirical types were determined. They were interpreted qualitatively and the final interpretation of the demand formation motives is presented in Table 2. The difference compared to the use of factor analysis for the extraction of one single variable which verifies the consistency of the questionnaire lies in the interpretation. Using factor analysis, it is enough to state that a single variable is extracted which shows the consistency of the questionnaire. In this case, even though one single variable was also extracted, the consistency is represented by the final set of categories, each of which is meaningful in itself and can be interpreted separately as well as together. It means that the informativeness of such a variable is greater compared to the factor variable. Moreover, the variable gives the possibility to compare the respondent groups by the behavior pattern that is presented in the inner structure of the variable. For example, from Table 2 it can be seen that respondents who belong to the first group are similar to the second group by Q1, similar to the third and fourth groups by Q2, and similar to the fourth and fifth groups by Q3. Dissimilarities are also seen. This gives not only the knowledge that consistency exists but also explains what consistency means in a particular case. Another advantage is the possibility to reduce the complete interpretation to a key concept, which is useful when developing the typology. Table 2 presents an example of such a reduction. In this case, the development of such a typology allows discovering the strategies used by the internet shop owners that have led them to success in their business. Therefore, the use of classification methods for verification was not limited only to the evaluation of consistency but also made possible the discovery of business qualities that are important for a better understanding of the nature of the business.
Typology of the economic demand formation (EDF) motives
Models of economic motives (columns):
1 – Clear and informed reasonable-price shopping model, n=2
2 – Deliberate easy lower-price shopping model, n=1
3 – Fast quality shopping model, n=1
4 – Fast reasonable-price shopping model, n=1
5 – Fast low-price shopping model, n=2

Result                          | 1                              | 2                              | 3                             | 4                             | 5
Demand growth deceleration      | Lack of clearness and fairness | Lack of clearness and fairness | Slowness                      | Slowness                      | Slowness
Stimulation of demand emergence | Accessibility and information  | Low price                      | Accessibility and information | Accessibility and information | Low price
Demand growth acceleration      | Price                          | Ease                           | Ease                          | Price                         | Price
Table 2 Classification Model of Empirically Explored Economic Demand Formation Motives
The methods selected for this study allowed exploring not only the existing patterns but also categories which have not yet been found in the researched population. The categories that identify these three patterns are presented in Table 3 and show the patterns of economic motives that have not been found to be used as a key success factor among the top sellers in the Lithuanian on-line retail industry. It can mean that these models have not been economically and ideologically attractive enough for sellers as a way to develop their business and strive for profit. This is also a discovery, even though it was not found in the empirical data.

Typology of the economic demand formation (EDF) motives
Models of economic motives (columns):
1 – Clear and easy shopping model, n=0
2 – Clear, lower-price shopping model, n=0
3 – Cheap and simple shopping model, n=0

Strategy                                                                                  | 1                                               | 2                                               | 3
The strategy for interferences removal (Demand inclination model)                        | Reduction of the lack of clearness and fairness | Reduction of the lack of clearness and fairness | Immediate communication
The strategy for attraction of the new customers (Model of demand emergence stimulation) | Accessibility and information                   | Low price                                       | Low price
The strategy for retention of existing customers (Demand growth acceleration model)      | Ease                                            | Price                                           | Ease
Table 3 Typology of the Economic Demand Formation (EDF) Motives, Empirically unexplored Categories
5 Conclusion
In summary, it can be said that, besides the simple verification of consistency, the use of a classification design additionally revealed the complete model of customers' economic motives perceived by managers and online-retail business developers, and allowed the identification of those motives that were exceptionally important for online-retail business development and success in the Lithuanian market. It demonstrates the richness of the classification methods when they are used as a continuation of a factor analysis, not only in situations when factor analysis is not possible but also when it can be replaced in order to get more information about the inner consistency of the final latent variable. Finally, it is possible to state that the proposed methodology was useful and allowed objectively identifying the differences in expert opinion needed for the empirical description of the economic demand formation motives of the researched phenomenon.
References
[1] Bartholomew, D. J., and Knott, M. Latent Variable Models and Factor Analysis. Kendall's Library of Statistics 7. London: Arnold, 1999.
[2] Bartholomew, D. J., Steel, F., Moustaki, I., and Galbraith, J. I. The Analysis and Interpretation of Multivariate Data for Social Scientists. Chapman & Hall/CRC, New York, 2002.
[3] Child, D. Essentials of Factor Analysis. Continuum International Publishing Group, London, 2006.
[4] Creswell, J. Qualitative Inquiry and Research Design: Choosing Among Five Approaches. Sage Publications, Thousand Oaks, 2007.
[5] Everitt, B. S. Cluster Analysis. Edward Arnold Publishing, New York, 1993.
[6] Everitt, B. S., Landau, S., Leese, M., and Stahl, D. Cluster Analysis. John Wiley & Sons, Ltd, London, 2011.
[7] Everitt, B. S. An Introduction to Latent Variable Models. Chapman & Hall, New York, 1984.
[8] Gibson, W. A. Three multivariate models: Factor analysis, latent structure analysis, and latent profile analysis. Psychometrika 24, 3 (1959), 229-252.
[9] Goodman, L. A. Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61, 2 (1974), 215-231. doi:10.1093/biomet/61.2.215.
[10] Kaufman, L., and Rousseeuw, P. J. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Inc., New Jersey, 2005.
[11] Lazarsfeld, P. F., and Henry, N. W. Latent Structure Analysis. Houghton Mifflin, Boston, 1968.
[12] Pearson, H. R. Recommended Sample Size for Conducting Exploratory Factor Analysis on Dichotomous Data. Journal of Modern Applied Statistical Methods 9, 2 (2010), 359-368.
[13] Řezanková, H., Húsek, D., and Snášel, V. Shluková analýza dat. Professional Publishing, Praha, 2007.
[14] Stankovičová, I., and Vojtková, M. Viacrozmerné štatistické metódy s aplikáciami. Iura Edition, spol. s r. o., Bratislava, 2007.
[15] Uebersax, J. S. Probit Latent Class Analysis with Dichotomous or Ordered Category Measures: Conditional Independence/Dependence Models. Applied Psychological Measurement 23, 4 (1999), 283-297.
[16] Vaitkevicius, S. Rethinking the Applicability of Hermeneutic Systems for Coding and Statistical Analysis of Authorial Intentions in Economics. Inzinerine Ekonomika-Engineering Economics 24, 5 (2013), 415-423.
[17] Vermunt, J. K., and Magidson, J. Latent class cluster analysis. In Hagenaars, J. A., and McCutcheon, A. L. (eds.), Advances in latent class analysis. Cambridge University Press, 2002.
Analysis of Income of EU Residents Using Finite Mixtures of Regression Models
Kristýna Vaňkátová 1, Eva Fišerová 2
Abstract. In situations where classical linear regression is inapplicable due to heterogeneity of the population, mixtures of regression models are a popular choice. The method acquires parameter estimates by modelling the mixture conditional distribution of the response given the explanatory variables. The mixture distribution is given by the weighted sum over all components and, in the end, the individual regression models, one for each component, can be estimated simultaneously. The estimation of parameters is done via the expectation-maximization (EM) algorithm, a widely applicable algorithm for computing maximum likelihood estimates from incomplete data. Recently, mixture models have been used more and more in various fields, including biology, medicine, genetics and economics. In this paper, the income of residents of EU countries is explored; in further detail, we analyse the relationship between the annual old age pension and the income of people over 65 years. Since the data show evident heterogeneity, the mixture regression approach is required.
Keywords: mixture regression models, linear regression, EM algorithm, income, old age pension
JEL classification: C11, C38, C51, C52, E01, E24
AMS classification: 62J05, 62H30, 62F10, 62P20
1 Introduction
Many macroeconomic factors characterizing the economic situation of a country are linked. In this paper, the influence of social protection on the income of EU citizens in retirement age is evaluated. Consequently, two factors representing each side of the relationship were chosen: the annual income of EU inhabitants, as a specification of the living standard, and the old age pension expenditures of a given country, as a measure of social protection. Focusing on this relation we may proceed to a solution within regression analysis, where the function describing the general dependency (over all countries) is of main interest. Potentially, when homogeneity is not achieved, clusters of data should be examined. We approach the problem for people in the age category of 65 years and over. Even a basic examination of the data indicates that in this particular case the data come from at least two latent classes. Hence, the finite mixtures of regression models (FMRM) approach is applied to estimate the parameters of each regression model (also called a component). While classical linear regression is based on an assumption of a homogeneous population, in some situations we suspect that there are several heterogeneous groups in the population that a sample represents. FMRM are designed to model this unobserved heterogeneity using a maximum likelihood methodology as presented for instance in [1], [2], [4] and [5]. An extensive review of finite mixture models is given in [10]. Finite mixture models with a fixed number of components are usually estimated with the expectation-maximization (EM) algorithm [3] within a maximum likelihood framework and with MCMC sampling within a Bayesian framework. In addition, concomitant variable models for the component weights provide the possibility to partition the data into the mixture components through other variables [6]. This extension can provide both more precise parameter estimates and better component identification. Applications of mixtures of regression models can be found in various fields of statistical applications such as climatology, biology, economics, medicine and genetics; see e.g. [7] and [8].
1 Palacký
University in Olomouc, Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 12, 771 46 Olomouc, Czech Republic, kristyna.vankatova@upol.cz
2 Palacký University in Olomouc, Faculty of Science, Department of Mathematical Analysis and Applications of Mathematics, 17. listopadu 12, 771 46 Olomouc, Czech Republic, eva.fiserova@upol.cz
2 Mixtures of regression models
Finite mixtures of regression models are frequently used in applications to model unobserved heterogeneity. The sample can be clustered into classes by modelling the conditional distribution Y_j | x_j, where Y_j denotes the response variable and x_j means the vector of explanatory variables (possibly including an intercept) for the j-th subject [1]. Then the individual regression models for the classes can be estimated simultaneously. Suppose that each observation (y_j, x_j) belongs to one of c classes. The mixture of linear regression models is given as follows:
$$ Y_j = \begin{cases} x_j^T\beta_1 + \varepsilon_{1j} & \text{with probability } \pi_1,\\ x_j^T\beta_2 + \varepsilon_{2j} & \text{with probability } \pi_2,\\ \;\;\vdots & \\ x_j^T\beta_c + \varepsilon_{cj} & \text{with probability } \pi_c, \end{cases} $$
where β_i denotes the p-dimensional vector of unknown regression parameters for the i-th component, the ε_ij are random errors with normal distribution N(0, σ_i²), and σ_i² is the unknown error variance for the i-th component. The probabilities (π_1, . . . , π_c) are mixing proportions, i.e., π_i > 0 and Σ_{i=1}^{c} π_i = 1, and c is the number of the mixture components. The response variable Y_j is assumed to be distributed as a finite sum or mixture of conditional univariate normal densities φ with expectation x_j^T β_i and dispersion σ_i², i = 1, . . . , c. Accounting for the mixture structure, the conditional density of Y_j | x_j is
$$ f(y_j | x_j, \Psi) = \sum_{i=1}^{c} \pi_i\, \phi(y_j | x_j^T\beta_i, \sigma_i^2) = \sum_{i=1}^{c} \pi_i\, (2\pi\sigma_i^2)^{-1/2} \exp\left\{ \frac{-(y_j - x_j^T\beta_i)^2}{2\sigma_i^2} \right\}, $$
where the symbol Ψ denotes the vector of all unknown parameters, Ψ = (π_1, . . . , π_c, (β_1^T, σ_1²), . . . , (β_c^T, σ_c²))^T.
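For intuition, the following R sketch simulates observations from such a two-component mixture of linear regressions; all parameter values are illustrative only and are not estimates from any data discussed later.

```r
# Simulation from a two-component mixture of linear regressions (illustrative values only).
set.seed(7)
n   <- 300
x   <- runif(n, 0, 10)
pi1 <- 0.4                                     # mixing proportion of component 1
z   <- rbinom(n, 1, pi1)                       # latent component membership
y   <- ifelse(z == 1,
              1 + 2 * x + rnorm(n, sd = 0.5),  # component 1
              5 - 1 * x + rnorm(n, sd = 1.0))  # component 2
plot(x, y, col = z + 1, pch = 16)              # heterogeneity visible as two bands
```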
2.1 Parameter estimation
Parameter estimation in mixtures of linear regression models has mainly been studied from a likelihood point of view. For a fixed number of classes c, the parameters Ψ are estimated by maximizing the log-likelihood function
$$ \log L(\Psi | x_1,\dots,x_n, y_1,\dots,y_n) = \sum_{j=1}^{n} \log\left[ \sum_{i=1}^{c} \pi_i\, \phi(y_j | x_j^T\beta_i, \sigma_i^2) \right]. $$
In this particular case, the observations can be viewed as incomplete data. The idea here is to think of the data as consisting of triples (x_j, y_j, z_j), where z_j is the unobserved indicator vector that specifies the mixture component from which the observation y_j is drawn [10]. The unobserved component memberships z_ij of the observations are treated as missing values and the data are augmented by estimates of the component memberships, i.e. the estimated a-posteriori probabilities τ_ij. Thus, any j-th observation can be assigned to the i-th cluster (using the Bayes rule) via the estimated probability
$$ \tau_{ij} = \frac{\pi_i\, \phi(y_j | x_j^T\beta_i, \sigma_i^2)}{\sum_{h=1}^{c} \pi_h\, \phi(y_j | x_j^T\beta_h, \sigma_h^2)}. $$
Since the mixing proportions sum to unity, Σ_{i=1}^{c} π_i = 1, the log-likelihood function can be optimized in the sense of the Lagrange multipliers method. The maximum likelihood estimates are found by initially forming an augmented log-likelihood function to reflect the constraint Σ_{i=1}^{c} π_i = 1. In order to obtain the stationary equations, we compute the first order partial derivatives of the augmented log-likelihood function and equate them to zero. Afterwards, it is a matter of a few simple modifications to acquire a new system
of equations corresponding to the stationary equations of another optimization problem, namely the maximization of
$$ \log L_c(\Psi) = \sum_{i=1}^{c}\sum_{j=1}^{n} \tau_{ij} \log \phi(y_j | x_j^T\beta_i, \sigma_i^2). \qquad (1) $$
Since in the likelihood (1) only the estimated a-posteriori probabilities τ_ij are considered instead of the unobservable values z_ij, (1) is called the expected complete log-likelihood. This particular structure gainfully lends itself to the development of a two-stage EM algorithm. The number of components is chosen by comparing the BIC or AIC of various models, each with a different number of components.
2.2 EM algorithm
The EM algorithm [3] is an iterative procedure which alternates between an expectation step and a maximization step. The EM algorithm works on the expected complete log-likelihood and exploits the fact that the expected likelihood is in general easier to maximize than the original likelihood. In the E-step, the a-posteriori probabilities τ_ij are estimated. Consequently, the expected complete log-likelihood is maximized in the M-step and the vector of unknown parameters Ψ is updated. The EM algorithm has been shown to increase the likelihood in each step and hence to converge for bounded likelihoods. The (k+1)-th iteration of the EM algorithm can be summarized as follows:
E-step: Given the observed data y and current parameter estimates Ψ̂^(k) in the k-th iteration, replace the missing data z_ij by the estimated a-posteriori probabilities
$$ \hat\tau_{ij}^{(k)} = \frac{\hat\pi_i^{(k)}\, \phi(y_j | x_j^T\hat\beta_i^{(k)}, \hat\sigma_i^{2(k)})}{\sum_{h=1}^{c} \hat\pi_h^{(k)}\, \phi(y_j | x_j^T\hat\beta_h^{(k)}, \hat\sigma_h^{2(k)})}. \qquad (2) $$
M-step: Given the estimates τ̂_ij^(k) for the a-posteriori probabilities τ_ij (which are functions of Ψ̂^(k)), obtain new estimates Ψ̂^(k+1) of the parameters by maximizing the expected complete log-likelihood
$$ Q(\Psi, \hat\Psi^{(k)}) = \sum_{j=1}^{n}\sum_{i=1}^{c} \hat\tau_{ij}^{(k)} \log \phi(y_j | x_j^T\beta_i, \sigma_i^2). \qquad (3) $$
This maximization is equivalent to solving a weighted least squares problem, where the vector y = (y_1, . . . , y_n)^T of observations and the design matrix X = (x_1, . . . , x_n)^T are each weighted by (τ̂_ij^(k))^{1/2}. Thus, the entire set of β̂_i^(k+1) is derived by performing c separate weighted least-squares analyses. In the same spirit, we estimate σ̂_i^{2(k+1)} and, lastly, we update the estimates of the probabilities π_i using
$$ \hat\pi_i^{(k+1)} = \frac{\sum_{j=1}^{n} \hat\tau_{ij}^{(k)}}{n} \qquad \text{for } i = 1,\dots,c. $$
Initial values of the regression parameters are based on a random division of observations into c components, i.e. on initial probabilities τ̂_ij^(0), where for each observation y_j only one of these c probabilities equals 1 and the others are set to zero. The EM algorithm is stopped when the (relative) change of the log-likelihood is smaller than a chosen tolerance.
3 Mixtures of regression models with concomitant variables
The mixture of regression models is assumed to consist of c components where each component follows a parametric distribution. Each component has an assigned weight which indicates the a-priori probability for an observation to come from this component, and the mixture distribution is given by the weighted sum over the c components. If the weights depend on further variables, these are referred to as concomitant variables [6]. We consider FMRM with concomitant variables of the form
$$ f(y_j | x_j, \omega_j) = \sum_{i=1}^{c} \pi_i(\omega_j, \alpha_i)\, \phi(y_j | x_j^T\beta_i, \sigma_i^2), $$
where ω_j denotes the vector of concomitant variables for the j-th observation. In this case, the vector of all unknown parameters is given by
$$ \Psi = \big((\alpha_1^T, \beta_1^T, \sigma_1^2), \dots, (\alpha_c^T, \beta_c^T, \sigma_c^2)\big)^T. $$
The component weights π_i need to satisfy
$$ \sum_{i=1}^{c} \pi_i(\omega_j, \alpha_i) = 1 \quad \text{and} \quad \pi_i(\omega_j, \alpha_i) > 0 \quad \text{for } i = 1,\dots,c, \qquad (4) $$
where α_i are the parameters of the concomitant variable model. Different concomitant variable models can be used to determine the component weights; the mapping function only has to fulfill condition (4). In the following, we assume a multinomial logit model for the π_i given by
$$ \pi_i(\omega_j, \alpha_i) = \frac{\exp(\omega_j^T \alpha_i)}{\sum_{h=1}^{c} \exp(\omega_j^T \alpha_h)} \quad \text{for } i = 1,\dots,c, $$
with α = (α_1, . . . , α_c) and α_1 ≡ 0.
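The multinomial logit weights can be written down directly; the following short R sketch computes π_i(ω_j, α_i) for illustrative parameter values (the numbers are placeholders, with α_1 fixed at zero for identifiability).

```r
# Component weights from the multinomial logit concomitant model (softmax form).
concomitant_weights <- function(omega, alpha) {
  # omega: vector of concomitant variables (incl. intercept);
  # alpha: matrix with one column per component, first column identically zero.
  eta <- drop(crossprod(alpha, omega))      # omega' alpha_i for i = 1, ..., c
  exp(eta) / sum(exp(eta))                  # weights sum to one by construction
}
alpha <- cbind(0, c(-1, 0.8))               # c = 2 components, alpha_1 = 0 (illustrative values)
concomitant_weights(c(1, 2), alpha)
```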
In this case the expected complete log-likelihood can be derived in a similar way as in the previous section and, as a result, the EM algorithm for FMRM with concomitant variables is the following:
E-step: Given the observed data y and current parameter estimates Ψ̂^(k) in the k-th iteration, replace the missing data z_ij by the estimated a-posteriori probabilities τ_ij:
$$ \hat\tau_{ij}^{(k)} = \frac{\pi_i(\omega_j, \hat\alpha_i^{(k)})\, \phi(y_j | x_j^T\hat\beta_i^{(k)}, \hat\sigma_i^{2(k)})}{\sum_{h=1}^{c} \pi_h(\omega_j, \hat\alpha_h^{(k)})\, \phi(y_j | x_j^T\hat\beta_h^{(k)}, \hat\sigma_h^{2(k)})}. $$
M-step: Given the estimates τ̂_ij^(k) for the a-posteriori probabilities τ_ij (which are functions of Ψ̂^(k)), obtain new estimates Ψ̂^(k+1) of the parameters Ψ by maximizing
$$ Q(\Psi, \hat\Psi^{(k)}) = Q_1(\beta_i, \sigma_i^2,\, i=1,\dots,c;\, \hat\Psi^{(k)}) + Q_2(\alpha, \hat\Psi^{(k)}), $$
where
$$ Q_1(\beta_i, \sigma_i^2,\, i=1,\dots,c;\, \hat\Psi^{(k)}) = \sum_{j=1}^{n}\sum_{i=1}^{c} \hat\tau_{ij}^{(k)} \log \phi(y_j | x_j^T\beta_i, \sigma_i^2) $$
and
$$ Q_2(\alpha, \hat\Psi^{(k)}) = \sum_{j=1}^{n}\sum_{i=1}^{c} \hat\tau_{ij}^{(k)} \log\big(\pi_i(\omega_j, \alpha_i)\big). $$
The formulas Q_1 and Q_2 can be maximized separately. The maximization of Q_1 gives new estimates β̂_i^(k+1), σ̂_i^{2(k+1)}, i = 1, . . . , c, and the maximization of Q_2 gives α̂^(k+1). Q_1 is maximized using the weighted ML estimation of linear models and Q_2 by means of the weighted ML estimation of multinomial logit models. The treatment regarding the choice of the number of components and the starting estimates of the regression parameters is analogous to FMRM.
4 Analysis of income of EU residents
Here we focus on modelling the relationship between the mean equivalised net income of residents and old age pension (OAP) government expenditures. The main task is to find a general dependency over all countries for people over 65 years. The data come from the Eurostat database and capture the last 15 years of development of the given indicators. Each observation in the dataset stands for one EU member state in one year, and both variables are measured in Euros. OAP is standardized by computing OAP per inhabitant.
As the first step in our analysis of income, the contribution of the time factor to the model was analysed. After careful consideration, time was included in the model neither as a predictor nor as a concomitant variable. Using time as a predictor resulted in a model where most of the time levels were insignificant; moreover, on closer inspection, both OAP and mean income have a very similar development in time. Hence adding time to the model is, in fact, redundant and does not carry any new information. Further, based on the assumption that the impact of OAP on income could be delayed by one or more years, several new models were created. However, none of them performed better than the original one. Therefore, only the simple relationship with income as the response variable and OAP as the explanatory variable is used in the final study. Still in terms of ordinary linear regression, we could also extend our model by 27 additional explanatory variables representing the levels of the factor variable GEO, describing the country of origin of an observation, together with interaction terms (GEO*OAP). These models provide a fit of reasonable quality, with some reservations (over-parametrization and/or incorrectly fitted trend in some countries), but they fail to capture the general relationship we are interested in. We wish to find clusters of countries that share the same development of the studied relationship; that is the idea behind applying FMRM. The statistical software R [11] contains several extension packages for the estimation of mixture regression models. The results of our study are built on the flexmix package, introduced in [9]. The package implements a framework for maximum likelihood estimation with the EM algorithm. Its main focus is on finite mixtures of regression models, and it allows for multiple independent responses and repeated measurements.
Firstly, a suitable number of components must be determined. In order to do so, the FMRM was estimated for mixtures consisting of 1 to 4 components and all fitted models were evaluated by BIC. This information criterion suggests that the two-component mixture is the desirable one.
Figure 1 Fitted FMRM without (left) and with concomitant variable (right).
Estimation via FMRM resulted in the following two regression lines (Figure 1, left):
Income-hat = 1295.7 + 6.4 · OAP,   Income-hat = −420.0 + 12.2 · OAP.   (5)
The structure of the countries included in the first or second component suggests that the GEO variable has a strong effect on the classification of each observation. By providing GEO as a concomitant variable, the model estimates (5) are updated to (Figure 1, right):
Income-hat = 815.6 + 6.6 · OAP,   Income-hat = −337.4 + 12.0 · OAP.   (6)
It is visible that not only are the regression lines slightly different, but the structure of the clusters varies as well. The upper (red) cluster mainly contains observations from Ireland, Spain, Croatia, Cyprus, Slovenia,
Slovakia, Malta and Luxembourg. The rest of the EU countries lie in the second, vertically lower, cluster (black). The only regression parameter estimated as insignificantly different from zero is the intercept of the upper component. The accuracy of the estimates increases slightly but visibly once the concomitant variable is added to the model. For the intercept estimate, the standard deviation declines from 517.57 (231.46 for the second component) to 440.57 (210.42); for the slope estimate, the deviation reduces from 0.30 (0.11) to 0.27 (0.10).
5 Discussion
Income, as an indicator of life quality, is frequently discussed and analysed in economic studies. For people older than 65 years, income appears to be highly dependent on the old age pension that the given state provides. In order to evaluate this relationship in the best possible manner, the FMRM was applied. The results indicate that the data follow two different simple regression models, the first including most of the EU countries, where the slope is considerably smaller. The second cluster contains mostly observations from 8 specific countries (Ireland, Spain, Croatia, Cyprus, Slovenia, Slovakia, Malta, Luxembourg), where income grows almost twice as fast as in the first cluster. There may be many reasons behind this behaviour, from a different retirement age or a significantly varying age structure of the population, to a general tendency to keep a job even at retirement age.
Acknowledgements The authors thankfully acknowledge the support by the grant IGA PrF 2016 025 of the Internal Grant Agency of Palacký University Olomouc.
References
[1] Benaglia, T., Chauveau, D., Hunter, D. R., Young, D. S.: mixtools: An R Package for Analyzing Finite Mixture Models, Journal of Statistical Software 32, 6 (2009), 1–29.
[2] De Veaux, R. D.: Mixtures of Linear Regressions, Computational Statistics & Data Analysis 8, 3 (1989), 227–245.
[3] Dempster, A. P., Laird, N. M., Rubin, D. B.: Maximum Likelihood from Incomplete Data Via the EM-Algorithm, Journal of the Royal Statistical Society, Series B 39 (1977), 1–38.
[4] Desarbo, W. S., Cron, W. L.: A Maximum Likelihood Methodology for Clusterwise Linear Regression, Journal of Classification 5, 2 (1988), 249–282.
[5] Faria, S., Soromenho, G.: Fitting Mixtures of Linear Regressions, Journal of Statistical Computation and Simulation 80, 2 (2010), 201–225.
[6] Grün, B., Leisch, F.: FlexMix Version 2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters, Journal of Statistical Software 28, 4 (2008), 1–35.
[7] Grün, B., Scharl, T., Leisch, F.: Modelling time course gene expression data with finite mixtures of linear additive models, Bioinformatics 28, 2 (2012), 222–228.
[8] Hamel, S., Yoccoz, N. G., Gaillard, J. M.: Assessing variation in life-history tactics within a population using mixture regression models: a practical guide for evolutionary ecologists, Biological Reviews (2016).
[9] Leisch, F.: FlexMix: A General Framework for Finite Mixture Models and Latent Class Regression in R, Journal of Statistical Software 11, 8 (2004), 1–18.
[10] McLachlan, G., Peel, D.: Finite Mixture Models. John Wiley & Sons, New York, 2000.
[11] R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, 2016.
Co-movement of Tourism Activity of V4 countries Petra Vašaničová1, Eva Litavcová2, Štefan Lyócsa3 Abstract. We study the dependence of tourism activity within four Central and Eastern European Countries (CEE): Poland, Czech Republic, Slovak Republic, and Hungary. Using monthly data of the nights spent at tourist accommodation establishments by residents and non-residents (total) in V4 countries from January 2003 to December 2015 we show, that there exists statistically significant short-term (up to 6 months), mid-term (up to 12 months) and long-term (up to 24 months) comovement between V4 countries. Maximum long-term co-movement (ρ = 0.88) of tourism activity was found between Poland and Slovak Republic, while minimum short-term co-movement was found between Hungary and Slovak Republic (ρ = 0.29). In general, we can say that lower dependencies between tourism activity were found for short-term co-movements, while larger for long-term co-movement. The results have implications for tourism policies, as they might shed light on whether co-operation between CEE countries is a viable strategy for supporting tourism activities. Keywords: tourism, nights spent, Central and Eastern Europe, co-movement, longrun correlations JEL Classification: C32, L83 AMS Classification: 62-07
1 Introduction
When attracting tourists, should neighboring countries cooperate or compete? Using monthly data on the nights spent at tourist accommodation establishments by residents and non-residents in the V4 countries from January 2003 to December 2015 we show that there exists statistically significant short-term (up to 6 months), mid-term (up to 12 months) and long-term (up to 24 months) co-movement between the V4 countries. The maximum long-term co-movement (ρ = 0.88) of tourism activity was found between Poland and the Slovak Republic, while the minimum short-term co-movement was found between Hungary and the Slovak Republic (ρ = 0.29). In general, we can say that lower dependencies between tourism activities were found for short-term co-movements and larger ones for long-term co-movement. Most of the existing literature is centered around the tourism-growth nexus (see [2], [3], [7], [9]). Bayramoğlu and Ari [2] provided strong unidirectional evidence for causality from the expenditures of foreign tourists who visited Greece to the growth of Greece. Further on, Pérez-Rodríguez et al. [9] found significant, asymmetric and positive dependence between tourism and GDP growth rates for the United Kingdom, Spain and Croatia. Mishra et al. [7] provide evidence of long-run unidirectional causality from tourism activity to the economic growth of India. Finally, Chou [3] applies the bootstrap panel Granger causality approach to test whether domestic tourism spending promotes economic growth; using a sample of 10 transition countries, the results were rather mixed. Another strand of the literature focuses on interactions of tourism with other economic sectors. For example, Zeren et al. [11] studied the dependence between finance, tourism and advertising in Turkey. Karadžova and Dičevska [4] looked at interactions between the development of the financial system and tourism development in the case of the Republic of Macedonia. Serquet and Rebetez [10] examined whether there exists an impact of high temperature values at lower elevations on the number of overnight stays made by domestic tourists in the Swiss Alps, and they demonstrated a correlation between domestic tourist demand and hot summer temperatures in the Swiss Alps. Interestingly, to the best of our knowledge, there are only two studies covering co-movement of tourism activity. Lorde and Moore [5] analyzed co-movement in tourist arrivals in 21 Caribbean destinations for the period from 1977 to 2002. They investigated whether tourism can encourage the goal of regional integration. The empirical
1 University of Presov, Faculty of Management, Konstantinova 16, Presov, Slovakia, petra.vasanicova@smail.unipo.sk.
2 University of Presov, Faculty of Management, Konstantinova 16, Presov, Slovakia, eva.litavcova@unipo.sk.
3 University of Presov, Faculty of Management, Konstantinova 16, Presov, Slovakia; University of Economics in Bratislava, Institute of Economics and Management, Dolnozemská cesta 1, Slovakia, stefan.lyocsa@gmail.com.
results presented in the paper provided no evidence for either absolute or conditional convergence in tourism penetration to the Caribbean. This finding is confirmed using both pairwise and panel unit root tests that allow for cross-sectional dependence. Although there are differences in the rate of tourism dependence in the Caribbean, the industry has strengthened regional growth convergence over the past 25 years. The similarities in tourism cycles across the region also suggest that a joint promotion project for the Caribbean region could yield significant gains. Mérida [6] contributed to the understanding of the relationship between tourist arrivals to Spain from nine developed countries (Belgium, France, Germany, Italy, Netherlands, Portugal, Sweden, United Kingdom, United States and Spain) and the business cycles of both the source countries and Spain. The results from the co-movement analysis revealed different patterns for each country. For example, arrivals from France, Germany, Portugal and Sweden seem to be more correlated with the movement of the Spanish business cycle than with their own economic cycle, while in other countries (such as Italy or the United Kingdom) both aspects seem to be equally correlated with the tourist outflows to Spain, although for both of them the evolution of their own inner economies seems to be more relevant. We contribute to the existing literature in two significant ways. First, we fill the gap in the literature on co-movement of tourism activity. Second, existing studies usually cover countries where tourism is a significant part of the total production of the economy. Our study covers new markets, the CEE countries, where tourism plays only a minor role in the total economy. Still, we have found strong evidence for co-movement between the tourism activities of these countries. The remainder of the paper is structured as follows. Section 2 presents the data. Section 3 describes the methodology. In Section 4 we discuss results and conclude.
2 Data
Our study examines the long-run correlation between the nights spent at tourist accommodation establishments (such as hotels, holiday and other short-stay accommodation, camping grounds, recreational vehicle parks and trailer parks) by residents and non-residents (total) in the V4 countries, namely the Slovak Republic (SK), Czech Republic (CZ), Poland (PL) and Hungary (HU). Our dataset consists of monthly data from January 2003 to December 2015. The data were obtained from the Eurostat database. Due to the significant seasonal pattern in the data, we first removed the seasonal component via annualization of the monthly data series. For example, our first observation corresponds to December 2003 and is the sum of total monthly nights spent from January 2003 to December 2003. The next observation corresponds to January 2004 and is the sum of total monthly nights spent from February 2003 to January 2004, etc. The resulting time series are deseasonalized and are displayed in Fig. 1.
Figure 1 Total nights spent at tourist accommodation establishment of V4 countries
Note: CZ, HU, PL, and SK denotes the Czech Republic, Hungary, Poland, and Slovak Republic.
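A minimal sketch of the annualization step described above (pandas assumed; the file name and column layout are hypothetical):

```python
import pandas as pd

# nights: monthly series indexed by month, one column per country (CZ, HU, PL, SK)
nights = pd.read_csv("nights_spent.csv", index_col=0, parse_dates=True)  # hypothetical file

# rolling 12-month sums: the value for December 2003 sums Jan-Dec 2003,
# the value for January 2004 sums Feb 2003-Jan 2004, and so on
annualized = nights.rolling(window=12).sum().dropna()
```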
3 Methodology
In our paper we decided to use long-run correlation coefficient ρxy , which was presented in [8]. It is a nonparametric approach, which has several advantages, among others it is a model free approach and it leads to a consistent estimate of the variance-covariance matrix even in the presence of autocorrelation and heteroskedasticity. The long-run covariance matrix Ω of the process Zt = [yt, xt]T is defined as:
Ω = [ ω_yy  ω_xy ; ω_xy  ω_xx ] = lim_{T→∞} (1/T) E[ Σ_{i=1}^{T} Σ_{j=1}^{T} Z_i Z_j^T ],   (1)
where, for a given sample size T:
Ω_T = Σ_{j=−(T−1)}^{T−1} Γ_T(j),   (2)
where
Γ_T(j) = (1/T) Σ_{t=j+1}^{T} Z_t Z_{t−j}^T   for j ≥ 0,
Γ_T(j) = (1/T) Σ_{t=−j+1}^{T} Z_{t+j} Z_t^T   for j < 0,   t = 1, 2, …, T.   (3)
In line with the study of Andrews [1] the following estimator is employed:
Ω̂_T = Σ_{j=−(T−1)}^{T−1} k(j/M) Γ_T(j),   (4)
where k(·) is a real-valued kernel and M the bandwidth parameter. The kernel function is described in [1] and, together with the bandwidth parameter, it determines the scheme (and the weight) with which past cross-covariances enter the estimation of the long-run covariance matrix. In our empirical work, we have used the Quadratic Spectral kernel weighting function:
k_QS(x) = (25/(12π²x²)) [ sin(6πx/5)/(6πx/5) − cos(6πx/5) ].   (5)
Given the variance-covariance matrix, the long-run correlation is estimated as:
ρ̂_xy = ω̂_xy / sqrt(ω̂_xx ω̂_yy).   (6)
Finally, Panopoulou et al. [8] have provided the necessary distributional assumption for the significance test of the long-run correlation coefficient. More specifically, the following test statistic follows the normal distribution under the null hypothesis:
sqrt(T/M) (ρ̂_xy − ρ_xy) ~ N(0, (1 − ρ_xy²)²).   (7)
The long-run correlation was calculated for different bandwidth parameters M. More specifically, if we set the bandwidth parameter to M = 1, 2, …, 6, the resulting correlation takes into account dependencies across the previous six months (from one series to another and vice versa); therefore, such correlations were considered to represent short-term dependence. Similarly, if the bandwidth parameter was set to M = 7, 8, …, 12, the resulting correlation represents the mid-term dependence, and finally, with M = 13, 14, …, 24, the resulting correlation represents the long-term dependence.
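The following Python sketch is our own illustration of the estimator (4)–(6) with the kernel (5); it demeans the series before computing the cross-covariances and returns the long-run correlation together with the test statistic implied by (7) for H0: ρ = 0.

```python
import numpy as np

def qs_kernel(x):
    # Quadratic Spectral kernel (5); k(0) = 1
    if x == 0:
        return 1.0
    a = 6.0 * np.pi * x / 5.0
    return 25.0 / (12.0 * np.pi**2 * x**2) * (np.sin(a) / a - np.cos(a))

def long_run_corr(y, x, M):
    """Long-run correlation (6) of two series for bandwidth M (series are demeaned)."""
    Z = np.column_stack([y, x]).astype(float)
    Z -= Z.mean(axis=0)
    T = Z.shape[0]
    omega = np.zeros((2, 2))
    for j in range(-(T - 1), T):
        if j >= 0:
            gamma = Z[j:].T @ Z[:T - j] / T       # Gamma_T(j), j >= 0
        else:
            gamma = Z[:T + j].T @ Z[-j:] / T      # Gamma_T(j), j < 0
        omega += qs_kernel(j / M) * gamma
    rho = omega[0, 1] / np.sqrt(omega[0, 0] * omega[1, 1])
    z_stat = np.sqrt(T / M) * rho                 # test of H0: rho = 0 based on (7)
    return rho, z_stat
```

Called on two annualized country series, e.g. long_run_corr(cz, pl, M=24), this would give the analogue of a single cell of Table 1, up to data and implementation details.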
4 Results and Conclusion
Our results are reported in Table 1. Looking at the significances, it appears clear that almost all relationships are statistically significant. Moreover, the relationships are clearly also economically meaningful, as the correlations range from ρ = 0.29 to ρ = 0.88.
Short-term (M = 1-6) and medium-term (M = 7-12) co-movement:

Country/M    1      2      3      4      5      6      7      8      9      10     11     12
CZ - HU     0.57d  0.58d  0.59d  0.60d  0.61d  0.63d  0.64d  0.65d  0.67d  0.68c  0.69c  0.71c
CZ - PL     0.54d  0.58d  0.59d  0.60d  0.62d  0.63d  0.64d  0.65d  0.66d  0.67c  0.68c  0.69c
CZ - SK     0.35d  0.39d  0.42d  0.43d  0.44c  0.45c  0.46c  0.47c  0.48b  0.49b  0.49b  0.50b
HU - PL     0.60d  0.66d  0.67d  0.67d  0.67d  0.66d  0.65d  0.65d  0.64c  0.64c  0.63c  0.62c
HU - SK     0.29d  0.33d  0.34c  0.35c  0.35b  0.36b  0.38b  0.39a  0.40a  0.42a  0.43a  0.45a
PL - SK     0.47d  0.51d  0.54d  0.56d  0.58d  0.59d  0.60d  0.61d  0.62c  0.64c  0.66c  0.68c

Long-term (M = 13-24) co-movement:

Country/M    13     14     15     16     17     18     19     20     21     22     23     24
CZ - HU     0.73c  0.74c  0.75c  0.75c  0.76c  0.77c  0.78c  0.79c  0.80c  0.80c  0.81c  0.82c
CZ - PL     0.70c  0.70c  0.70c  0.70c  0.70c  0.70c  0.69b  0.69b  0.69b  0.69b  0.69b  0.68b
CZ - SK     0.50b  0.51a  0.51a  0.52a  0.53a  0.55a  0.57a  0.58a  0.58a  0.57a  0.56a  0.55a
HU - PL     0.62c  0.61b  0.61b  0.60b  0.60b  0.60b  0.61b  0.61b  0.62a  0.62a  0.63a  0.64a
HU - SK     0.46a  0.47a  0.48a  0.49a  0.50a  0.51a  0.51a  0.52a  0.52a  0.52a  0.52a  0.52a
PL - SK     0.70c  0.73c  0.75c  0.78c  0.82c  0.85c  0.87c  0.88c  0.88c  0.88c  0.88c  0.88c
Table 1 Co-movement between tourism activities of V4 countries. Note: CZ, HU, PL, and SK denotes the Czech Republic, Hungary, Poland, and Slovak Republic; a, b, c, d denotes 10%, 5%, 1%, and 0.1% significance level; M denotes the bandwidth parameter. Bold values represent the highest correlations for a given pair of countries.
Summary statistics of the co-movements between tourism activities of the V4 countries are listed in Table 2. According to this table, the maximum long-term co-movement (ρ = 0.88) of tourism activity was found between Poland and the Slovak Republic at the bandwidth parameter M = 22, while the minimum short-term co-movement (ρ = 0.29) was found between Hungary and the Slovak Republic at the bandwidth parameter M = 1. The maximum mean value of co-movement of tourism activity (ρmean = 0.71) was found between Poland and the Slovak Republic. The minimum mean value of co-movement (ρmean = 0.44) of tourism activity was found between Hungary and the Slovak Republic.

Country    mean   max (M)     min (M)
CZ - HU    0.70   0.82 (24)   0.57 (1)
CZ - PL    0.66   0.70 (16)   0.54 (1)
CZ - SK    0.50   0.58 (21)   0.35 (1)
HU - PL    0.63   0.67 (3)    0.60 (1)
HU - SK    0.44   0.52 (24)   0.29 (1)
PL - SK    0.71   0.88 (22)   0.47 (1)
Table 2 Summary statistics of co-movement between tourism activities of V4 countries. Note: CZ, HU, PL, and SK denotes the Czech Republic, Hungary, Poland, and Slovak Republic; M denotes the bandwidth parameter. Bold values denote extreme values across all pairs of countries.
In general, we can observe that lower dependencies between tourism activities were found for short-term co-movements and larger ones for long-term co-movement. Although the short-term co-movements were observed to have lower values, they were still statistically significant. To conclude, these results may be helpful for a deeper understanding of the tourism industry in the V4 countries. Although tourism plays only a minor role in the total economy of these V4 countries, we found strong evidence of co-movement of tourism activity. It appears that tourist agencies in the V4 might find it useful to cooperate when promoting their countries abroad. One readily available explanation for the high dependence of tourism activity might be that the co-movement only manifests the co-movement of the real economies (the business cycle). This, however, remains to be tested in our subsequent work.
Acknowledgements Supported by the grant No. 1/0392/15 of the Slovak Grant Agency, by the grant KEGA 037PU-4/2014 of the Cultural and Educational Grant Agency of the Slovak Republic, and by the grant VEGA 1/0412/17 of the Scientific Grant Agency of Ministry of Education SR and Slovak Academy of Sciences.
References
[1] Andrews, D. W. K.: Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation. Econometrica 59 (1991), 817–858.
[2] Bayramoğlu, T., Ari, Y. O.: The relationship between tourism and economic growth in the Greece economy: a time series analysis. Computational Methods in Social Sciences 3 (2015), 89–93.
[3] Chou, M. C.: Does tourism development promote economic growth in transition countries? A panel data analysis. Economic Modelling 33 (2013), 226–232.
[4] Karadžova, V., Dičevska, S.: Interactions between financial system development and tourism development: conditions in Republic of Macedonia. Sustainable Tourism: Socio-Cultural, Environmental and Economics Impact (2011), 169–186.
[5] Lorde, T., Moore, W.: Co-movement in tourist arrivals in the Caribbean. Tourism Economics 14 (2008), 631–643.
[6] Mérida, A. L.: Tourism and economic growth: causality, persistence and asymmetry. Universidad de Huelva, Huelva, 2015.
[7] Mishra, P. K., Rout, H. B., Mohapatra, S. S.: Causality between Tourism and Economic Growth: Empirical Evidence from India. European Journal of Social Sciences 18 (2011), 518–527.
[8] Panopoulou, E., Pittis, N., Kalyvitis, S.: Looking far in the past: revisiting the growth-returns nexus with non-parametric test. Empirical Economics 38 (2010), 743–766.
[9] Pérez-Rodríguez, J. V., Ledesma-Rodríguez, F., Santana-Gallego, M.: Testing dependence between GDP and tourism's growth rates. Tourism Management 48 (2015), 268–282.
[10] Serquet, G., Rebetez, M.: Relationship between tourism demand in the Swiss Alps and hot summer air temperatures associated with climate change. Climatic Change 108 (2011), 291–300.
[11] Zeren, F., Koç, M., Konuk, F.: Interaction between finance, tourism and advertising: evidence from Turkey. Tourism and Hospitality Management 20 (2014), 185–193.
On newsvendor model with optimal selling price Petr Volf1 Abstract. The contribution studies the problem of optimizing the inventory amount to achieve maximal profit, under circumstances of uncertain, random demand. This matter, known also as the newsvendor problem, has various variants. After reviewing two basic formulations, we concentrate to the model when the vendor is able to change selling price. Naturally, increased price leads to smaller demand, and vice versa. To describe such an effect, we assume a nonlinear regression-like model of dependence of demand on the price, with an inflexion point at a ’base’, customary price. This is in fact the main innovation proposed in the present contribution, as standardly linear or exponential demand curves are considered. We then derive optimal solution concerning both the inventory level and selling price, maximizing either expected profit or chosen quantile of profit distribution. Solutions are illustrated with the aid of simple numerical examples. Keywords: stochastic optimization, newsvendor model, quantile optimization, demand function. JEL classification: C41, J64 AMS classification: 62N02, 62P25
1 Introduction
The newsvendor model is a mathematical formulation of the problem of how to determine the optimal inventory level to be sold in circumstances of uncertain demand. It is named by an analogy with the situation of a newspaper vendor who must decide how many copies of the paper to purchase, knowing that unsold copies will become worthless. The simplest case is characterized by fixed purchase and selling prices, without any additional costs; in fact it serves as a basic block for building more sophisticated and realistic model variants [3]. The present paper starts with a description of such a simple instance. Then some generalizations are provided. First, a model is recalled which considers additional costs for stocking unsold items or for not meeting the demand. Then the main part deals with the case allowing the vendor to change the selling price. We propose a nonlinear, cubic regression-like model for the dependence of demand on price, discuss such a choice, and derive rules for simultaneously assessing both the optimal stock amount and the optimal price in order to maximize chosen characteristics of the (random) profit. Namely, we take into account either the expected profit or a certain quantile of the profit distribution. We prefer to deal with quantiles, as they are more universal characteristics of a distribution (especially of a non-symmetric one), while the calculation of the optimal values in the considered models remains feasible. The presentation of models and optimal solutions is accompanied by simple illustrative examples.
1.1 Basic formulation of the newsvendor problem
Let D be a (nonnegative) random variable with known probability distribution representing the demand (of units of a certain commodity); further let each unit be sold for price q and purchased for price c (c < q), and let S be the number of units stocked, i.e. purchased to be sold. In the simplest case c and q are fixed, and we search for the optimal stocking quantity S which maximizes the profit
Z = q · min(S, D) − c · S.   (1)
As both D and therefore Z are random, we have a choice of which criterion to maximize. It is well known and can be shown easily that to obtain max_S EZ, the optimal solution is S = S_E = the ((q − c)/q)-quantile of the distribution of D, while for maximization of the α-quantile of Z the optimal stock equals S = S_α = the α-quantile of D (naturally, in the case q ≤ c the optimal decision is S = 0); see for instance [2], [3], [5]. Computation of the resulting distribution of the profit Z from (1), hence also of the achieved values of EZ and of the quantiles Q_Z(α) of Z, is then a rather easy task. Namely, denote f, F, Q the density, distribution function and quantile function of the random demand D, and let F̄ = 1 − F. Then from (1) it follows that for given S
EZ = ∫_0^S (qx − cS) f(x) dx + (q − c) S F̄(S) = q ∫_0^S F̄(x) dx − cS,
after integration by parts. As regards the quantiles of Z as functions of S, they equal Q_Z(α) = (q − c)S for S ≤ Q(α) and Q_Z(α) = qQ(α) − cS for S > Q(α). Their maximum, which is attained at S_α = Q(α), equals Q(α) · (q − c). This solution can also be interpreted as providing the stock quantity that guarantees that the demand will be satisfied with probability at least α.
It is possible and interesting to study the dependence of the distribution of Z and the values of its characteristics, including its variance, on the chosen stock amount S. This is actually the purpose of the following example. We have selected the normal distribution for D, D ∼ N(µ = 150, σ = 20) (which is practically positive, with P ∼ 1); the prices were set to c = 10, q = 15. Then the 0.1, 0.25, 0.5, 0.75, and 0.9 quantiles of this normal distribution equal 124.3690, 136.5102, 150.0000, 163.4898, and 175.6310, respectively. It means that these values are also the levels of stock S_α maximizing the corresponding α-quantiles of the profit Z. Further, the ratio (q − c)/q = 1/3, hence the optimal stock level maximizing EZ equals S_E = 141.3855, corresponding to the 1/3 quantile of the distribution of D. Figure 1 shows the dependence of the considered 5 quantiles of the profit Z on S. It also contains the graph of the mean values EZ and the curves EZ − 2σ_Z, EZ + 2σ_Z, where σ_Z denotes the standard deviation of Z. Though the value S_E = 141.3855 guarantees the maximal possible expected profit, it is simultaneously seen that the variance (and therefore the uncertainty) of the profit really achieved is rather large. However, let us consider a just slightly sub-optimal S, for instance S = 130. The expected profit is then just slightly smaller, decreasing approximately from 600 to 580, while the standard deviation of the distribution of Z is significantly smaller (approximately by half). It is also seen how, for even smaller levels of stock (less than 120, say), the distribution of the possible profit becomes rather narrow, but, naturally, all the considered distribution characteristics are already lower.
1 FP TU Liberec, Department of Applied Mathematics, Univerzitní nám. 1, Liberec, Czech Republic, petr.volf@tul.cz
Figure 1 Dependence of characteristics of profit Z on inventory amount S in model (1).
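A small Python sketch of the quantities used in this example (an illustration relying only on the formulas above, not on the author's code):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

c, q = 10.0, 15.0
D = norm(loc=150, scale=20)            # demand distribution of the example

S_E = D.ppf((q - c) / q)               # stock maximizing EZ: (q-c)/q quantile, ~141.39
S_med = D.ppf(0.5)                     # stock maximizing the median of Z: 150

def expected_profit(S):
    # EZ = q * int_0^S (1 - F(x)) dx - c*S, the integrated-by-parts form of (1)
    integral, _ = quad(lambda x: D.sf(x), 0, S)
    return q * integral - c * S

def profit_quantile(S, alpha):
    # Q_Z(alpha) = (q-c)*S if S <= Q(alpha), else q*Q(alpha) - c*S
    Qa = D.ppf(alpha)
    return (q - c) * S if S <= Qa else q * Qa - c * S

print(S_E, expected_profit(S_E))       # optimal stock for EZ and the corresponding EZ
print(profit_quantile(S_med, 0.5))     # maximal median profit: Q(0.5)*(q-c) = 750
```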
1.2 Generalization with additional costs
The model from above is naturally just a basis for more general cases; see the overviews in [1], [3]. One of the generalizations considers a set of additional costs, namely p = penalty (per unit) for non-satisfied demand, and h = the cost of stocking each non-sold unit. Moreover, let s be the rest of the inventory from the previous period, which enters the present period as a part of S. Then we search for the optimal amount S − s to be purchased. If all costs are fixed and the demand is random, then again there is no problem to derive (at least numerically) the distribution of the profit Z, where Z = q · min(S, D) − c · (S − s) − p · max(D − S, 0) − h · max(S − D, 0), hence
Z = D(q + h) − S(c + h) + sc   for S > D,
Z = S(q − c + p) − Dp + sc   for S ≤ D,   (2)
and to find S maximizing either the expected profit EZ or a certain quantile of Z. Namely, the maximum of EZ is attained at S_E = the ((q + p − c)/(q + p + h))-quantile of the distribution of demand D, while the optimal S maximizing a quantile of Z has to be computed numerically on the basis of the distribution (2). Figure 2 illustrates the model. The main characteristics are the same as in Figure 1; moreover we set h = p = 3, s = 0. Then S_E = Q(0.381) = 143.94; the optimal values of S maximizing the selected quantile curves can be assessed numerically and are visible from the graph. In the example, under the given values of parameters, they are higher than in model (1), while the corresponding distribution of the profit is shifted down, due to the additional costs.
Figure 2 Dependence of characteristics of profit Z on inventory amount S in model (2).
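For model (2) the quantile-maximizing stock has no closed form, so a simple Monte Carlo grid search can be used; the following sketch with the parameter values of this example is our own illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
c, q, p, h, s = 10.0, 15.0, 3.0, 3.0, 0.0
D_draws = rng.normal(150, 20, size=200_000)          # simulated demand

def profit(S, D):
    # profit Z of model (2)
    return (q * np.minimum(S, D) - c * (S - s)
            - p * np.maximum(D - S, 0.0) - h * np.maximum(S - D, 0.0))

S_E = norm(150, 20).ppf((q + p - c) / (q + p + h))   # = Q(0.381) ~ 143.9, maximizes EZ

# the quantile-maximizing stock is found numerically, e.g. for the median of Z:
grid = np.linspace(100, 200, 401)
medians = [np.median(profit(S, D_draws)) for S in grid]
S_median = grid[int(np.argmax(medians))]
print(S_E, S_median)
```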
2 Demand dependent on price
In the sequel we shall consider yet another generalization, allowing, in the framework of the basic model (1), for changes of the selling price q. That is why we shall need a relation between price and demand. Thus, the aim is to find both the optimal stock amount and the optimal selling price. Inspired for instance by [3], we propose several regression-like models, additive or multiplicative. We shall present basic results with linear demand functions; however, later we shall prefer nonlinear trend curves and discuss their use and meaning. Taking into account that, as we have seen, the quantiles of the profit distribution are easier to compute, and, mainly, that quantiles are in many cases more reliable characteristics of non-symmetric distributions (for a discussion on quantile optimization see [2] or [4]), we shall concentrate on the optimization of profit quantiles. Even the simple examples in the preceding section show how a few quantiles can nicely characterize the main part of the profit distribution.
2.1 Additive model
Let us assume that the demand depends on the selling price in the following manner:
D(q) = d(q) + ε,   (3)
where d(q) is a trend function, non-increasing and smooth (e.g. with a first derivative), and ε stands for a random variable, centered and independent of q. In fact, we should assume a nonnegative function d(q), to give it a practical sense. Let us denote F(x), Q(α) the distribution function and quantiles of ε, respectively, and F̄ = 1 − F. Then, for fixed q, the distribution function of the demand D(q) is F_q(x) = F(x − d(q)) and its quantiles are Q_q(α) = Q(α) + d(q). The most standard choice of ε is the normal distribution N(0, σ^2). In this case, however, we have to truncate it at large values of q in order to keep the demand nonnegative. Further, still assuming fixed q, we are able to find the optimal stocks maximizing EZ, S_E(q) = Q((q − c)/q) + d(q), or maximizing Q_Z(α), which equals S_α(q) = Q(α) + d(q). In the second case we are also able to express the achieved maximal value, namely
Q̂_Z(α, q) = S_α(q) · (q − c) = [Q(α) + d(q)] · (q − c).   (4)
Now, the task is to find q maximizing (4). It means to solve the equation
dQ̂_Z(α, q)/dq = Q(α) + d(q) + d′(q) · (q − c) = 0.   (5)
Example with linear trend. Let d(q) = a + bq, b < 0 (d(q) = 0 for q > −a/b). From (5) the equation Q(α) + a + 2bq − bc = 0 follows, which is solved at q = c/2 − (Q(α) + a)/(2b). If we assume that the demand at c is positive, i.e. −a/b > c, then the optimal q > c. As regards the maximization of the expected profit, for fixed q its maximum equals
EZ(q) = ∫_0^{S_E(q)} F̄(x − d(q)) dx − c · S_E(q).   (6)
It is seen that here the problem of assessing the optimal q can be solved only numerically.
2.2 Log-additive model
We can as well speak of a multiplicative model for the demand; its logarithm (we use natural logarithms) leads to
log D(q) = l(q) + ε.   (7)
This model seems to be more convenient than the additive one, guaranteeing positive demand for all q. Again, l(q) is assumed non-increasing. When, standardly, ε ∼ N(0, σ^2), D(q) has the log-normal distribution and its variance increases with decreasing price q. Naturally, we could consider an explicit model of the dependence of the variance on q; however, this is beyond the scope of the present study. Denote again by F, F̄, Q the corresponding characteristics of the distribution of ε; then the characteristics of the distribution of the demand (at fixed q) are F_q(x) = F(ln x − l(q)), Q_q(α) = exp(Q(α) + l(q)). Hence we can easily express the stock amount maximizing either EZ or a quantile of Z, under given q, namely: S_E(q) = exp[Q((q − c)/q) + l(q)], S_α(q) = exp[Q(α) + l(q)]. Taking into account the form of the optimal EZ (6), we see that the next step, maximization over the price q, is not easy and can be attempted by a convenient numerical procedure. On the other hand, the form of the optimal quantiles of Z is simpler and in certain instances can be solved by direct computation. At a fixed q, the maximal α-quantile of the profit Z equals Q̂_Z(α, q) = exp[Q(α) + l(q)] · (q − c). The equation for assessing the optimal q then reads
dQ̂_Z(α, q)/dq = exp[Q(α) + l(q)] · [l′(q) · (q − c) + 1] = 0.   (8)
Example with log-linear trend. Assume now, similarly as above, that l(q) = a + bq, b < 0. Then, searching for the optimal q, we obtain the equation
dQ̂_Z(α, q)/dq = exp[Q(α) + a + bq] · (b(q − c) + 1) = 0.
It is solved at q = c − 1/b, which is larger than c. Notice the interesting feature that the optimal q does not depend on α.
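The closed form can be checked numerically; a short sketch (assuming SciPy) with the parameter values used for this trend later in the paper's comparison (a = 1, b = −1, c = 0.7, α = 0.5):

```python
import numpy as np
from scipy.optimize import minimize_scalar

a, b, c, Qa = 1.0, -1.0, 0.7, 0.0           # Qa = Q(0.5) = 0 for the median

q_closed = c - 1.0 / b                      # closed-form optimum: 1.7
res = minimize_scalar(lambda q: -np.exp(Qa + a + b * q) * (q - c),
                      bounds=(c, 3.0), method="bounded")
S_opt = np.exp(Qa + a + b * res.x)          # optimal stock, ~0.4966
print(q_closed, res.x, S_opt)               # both optima ~1.7
```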
3 Nonlinear demand function
Linear (or exponential) functions of dependence of demand on price are the most commonly used models, however I feel that for our purposes other models could be more appropriate. The idea is the following: Let us imagine that there exists a base price, or customary price q0 for given commodity, and that its small changes actually do not change the demand. However, larger alteration can have already much stronger influence on demand. For instance, under too high price customers start to search for substitution, either from another source or by a similar product. Similarly, low price attracts new buyers. Thus, I guess, a suitable dependence function should be a curve with an inflexion point and with derivative close to zero at q0 , smoothly increasing to left and decreasing to right (for larger price). For instance like a cubic polynomial with conveniently chosen parameters. Figure 3 shows examples of such functions, they are specified below. The purpose of examples is just to illustrate the approach and method of solution. Naturally, there is a rather rich choice of similar functions, for instance an inversion to S-curve (of Gompertz) or functions composed from more parts.
3.1 Example of additive cubic trend
I propose to choose the function d(q) = a + b(q − q0)^3, with b < 0, possessing all the required properties. Namely, its derivative at q0 equals zero, a = d(q0), and q0 is also its inflexion point. The optimal values are then obtained via (4) and (5). For a numerical example, let us fix c = 0.7, b = −0.5, q0 = 1, hence also a = 1; further let the standard deviation of the normal random variable ε be σ = 0.1, and α = 0.5, which means that we wish to optimize the median of the profit Z. The corresponding cubic demand function d(q) and the optimal values (w.r.t. S) Q̂_Z(α, q), i.e. the medians of the profit maximized for fixed q, are displayed in the right subplots of Figure 3. The optimal values of the price q, stock amount S and achieved median of the profit Z then equal q_opt = 1.73, S_opt = 0.8055, Q_Z,opt(0.5) = 0.8297. Naturally, if q ≤ c, the 'optimal' S(q) = 0.
Figure 3 Exponential-cubic trend and cubic trend for demand (first row), optimal values of medians of profit Z as functions of q (second row).
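The optimal price in this example can also be checked by directly maximizing (4) numerically; a short sketch (assuming SciPy) with the parameter values above:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

c, b, q0, sigma, alpha = 0.7, -0.5, 1.0, 0.1, 0.5
a = 1.0                                  # a = d(q0)
d = lambda q: a + b * (q - q0) ** 3      # cubic demand trend
Qa = norm(0, sigma).ppf(alpha)           # = 0 for the median

def neg_median_profit(q):
    # hat-Q_Z(alpha, q) = (Q(alpha) + d(q)) * (q - c), see (4)
    return -(Qa + d(q)) * (q - c)

res = minimize_scalar(neg_median_profit, bounds=(c, 2.5), method="bounded")
q_opt = res.x
S_opt = Qa + d(q_opt)
print(q_opt, S_opt, -res.fun)            # approx. 1.73, 0.806, 0.830 as in the text
```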
3.2 Example of log-additive cubic trend
Now it is assumed that the log of demand has a cubic trend, l(q) = a + b(q − q0 )3 . It means that the trend function of demand equals d(q) = exp(l(q)). Again, 1-st and 2-nd derivatives of both functions are zero at q0 , conditions d(q0 ) = d0 implies that a = log(d0 ). Optimal solutions maximizing α-quantile of Z are obtained in accordance with rules derived above, including expression (8). In a numerical example, we fixed c = 0.7, b = −0.7, q0 = 1, also d(q0 ) = 1, hence here a = 0, further σ = 0.1, and α = 0.5 as above. Left subplots of Figure 3 display function d(q) = exp(l(q)) (above) and corresponding QˆZ (α, q) (below). Achieved optimal values of price q, stock amount S and median of profit Z are equal to qopt = 1.69, Sopt = 0.7946, QZ,opt (0.5) = 0.7866. We can say that both models offer comparable solution. Optimal recommended price is higher than the base price q0 , under qopt the profit is higher in spite of lower demand. In order to compare models and their performance, we present here also numerical results from linear and log-linear models. In the former, trend function d(q) = a+bq had parameters b = −1, a = 1−bq0 = 2, as we again fixed q0 = 1, and required d(q0 ) = 1. However, in these models q0 is just one of points on demand curve, without the meaning of a ”base point” (like above). Further, let again c = 0.7, σ = 0.1 and α = 0.5. Functions d(q) and QˆZ (α, q), i.e. medians of Z optimized w.r. to S for fixed q, are displayed in Figure 4, right subplots. Optimal values of price q, stock amount S and median of profit Z were equal to qopt = 1.35, Sopt = 0.65, QZ,opt (0.5) = 0.4225. For example of log-linear trend l(q) = a + bq we have selected b = −1, a = −b = 1, then again exp(l(1)) = 1. Functions exp(l(q)) and QˆZ (α, q) are displayed in left subplots of Figure 4. In this case the optimal
values of the price q, stock amount S and median of the profit Z equaled q_opt = 1.70, S_opt = 0.4966, Q_Z,opt(0.5) = 0.4966.
Figure 4 Exponential trend and linear trend for demand (first row), optimal values of medians of profit Z as functions of q (second row).
4 Conclusion
In the present paper several basic variants of the newsvendor model have been reviewed. The main contribution consists in proposing the solution in the case of simultaneous control of the inventory amount and the selling price, under a nonlinear (here cubic) demand curve. The selection of the functional form was justified by introducing the term "base price" as the inflexion point of the curve. Then the price optimal from the vendor's point of view is, as a rule, higher, which can be interpreted as the base price plus a premium [3]. A question arises how an appropriate demand function should be estimated. The main source of information should be, I believe, experience based on data collection and analysis and on the extrapolation of the extracted relations, as well as on expert knowledge and opinion. Another variant of the model can also consider an adaptation of the price during the selling period. As has already been said, the rather simple models studied here can then serve as building blocks for the construction of dynamical multi-period models.
References
[1] Gallego, G.: IEOR 4000 Production Management, Lecture 7. Columbia University, 1995. Retrieved 30.03.2016 from http://www.columbia.edu/ gmg2/4000/pdf/lect 07.pdf
[2] Kim, J. H., and Powell, W.: Quantile optimization for heavy-tailed distributions using asymmetric signum functions. Working Paper, Princeton University, 2011. Retrieved 12.01.2016 from http://castlelab.princeton.edu/Papers/
[3] Petruzzi, N. C., and Dada, M.: Pricing and the Newsvendor Problem: A Review with Extensions, Operations Research 47 (1999), 183–194.
[4] Volf, P.: On quantile optimization problem with censored data. In: Proceedings of the 31st International Conference on Mathematical Methods in Economics 2013, College of Polytechnics, Jihlava, 2013, 1004–1009.
[5] Wikipedia: Newsvendor model. Retrieved 30.03.2016 from https://en.wikipedia.org/wiki/Newsvendor model
Approximate Transition Density Estimation of the Stochastic Cusp Model Jan Vořı́šek1
Abstract. Stochastic cusp model is defined by stochastic differential equation with cubic drift. Its stationary density allows for skewness, different tail shapes and bimodality. There are two stable equilibria in bimodality case and movement from one equilibrium to another is interpreted as a crash. Qualitative properties of the cusp model were employed to model crashes on financial markets, however, practical applications of the model employed the stationary distribution, which does not take into account the serial dependence between observations. Because closed-form solution of the transition density is not known, one has to use approximate technique to estimate transition density. This paper extends approximate maximum likelihood method, which relies on the closed-form expansion of the transition density, to incorporate time-varying parameters of the drift function to be driven by market fundamentals. A measure to predict endogenous crashes of the model is proposed using transition density estimates. Empirical example estimates Iceland Krona depreciation with respect to the British Pound in the year 2001 using differential of interbank interest rates as a market fundamental. Keywords: multimodal distributions, stochastic cusp model, approximate transition density. JEL classification: C46 AMS classification: 62F10
1 Introduction
The stationary density of the stochastic cusp model belongs to the class of generalized normal distributions. Since it has four parameters, it is flexible enough to allow for skewness, kurtosis and bimodality (Cobb et al. [4]). The modes of the stationary density correspond to the stable equilibria of a differential equation with a cubic polynomial. In the bimodal case there are two stable equilibria attracting the process and one unstable equilibrium between them, which repulses the process and corresponds to the antimode of the stationary density. Movement from one stable equilibrium to another, which is defined by switching from one side of the unstable equilibrium to the other, can be viewed as an intrinsic crash in the system. This potentially large shift towards another stable equilibrium level is a considerable advantage of the cusp model in describing certain systems over traditionally used mean-reverting linear models, which have just one stable attracting equilibrium. Another advantage is a random walk behavior (under certain parametrization) in the middle of the domain, which was found appropriate for interest rate modeling by Aı̈t-Sahalia [1]. However, the complexity of the cusp model brings some disadvantages, along with the lower research interest, which results in a lack of recent theoretical results. The major disadvantage lies in the non-existence of a closed-form solution of the transition density. There are several different approaches to overcome this obstacle: Euler approximation, simulation-based methods, binomial approximations, numerical solution of Kolmogorov equations, and Hermite expansions. The last method, proposed by Aı̈t-Sahalia [2] and [3], gives, unlike the other methods, a closed-form approximation of the transition density that converges
1 Academy of Sciences of the Czech Republic, Institute of Information Theory and Automation, Pod Vodárenskou věžı́ 4, CZ-182 08 Prague 8, Czech Republic; Department of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, Charles University Prague, Sokolovská 83, CZ-186 75 Prague 8, Czech Republic, vorisek@karlin.mff.cuni.cz
to the true likelihood function. This method allows one to consistently estimate the parameters of diffusion processes. An approach which utilizes time-varying parameters, but only the stationary density of the stochastic cusp model, was theoretically derived by Creedy and Martin [5] and empirically tested by the same authors [6] for the USD/GBP exchange rate. There have been attempts to model currency crises by the stationary cusp density (e.g. [7], [10]), some of them neglecting serial dependence completely, some reflecting it in the variability of the estimates. The goal of this paper is to provide an estimation technique which reflects the time dependence, by extending the approximate transition density method to include exogenous variables which drive the parameters of the model. The depreciation of the Icelandic Krona with respect to the British Pound, with the interest rate differential given as an explanatory variable, serves as an illustrative example of the proposed methodology. The rest of this paper is organized as follows. Section 2 recalls a method which approximates the transition density function. The estimation of the stochastic cusp model with time-varying parameters and a measure to assess the potential of a crash are introduced in Section 3. Section 4 presents an illustrative example of the proposed methodology using the daily ISK/GBP exchange rate for the period from July 1, 1999 to December 31, 2004. Section 5 summarizes the results and concludes.
2 Approximate transition density function
The transition density of a general diffusion process
dX_t = µ(X_t; θ) dt + σ(X_t; θ) dW_t   (1)
does not have an analytical form. There are different techniques how it can be approximated, including the Euler approximation, simulation-based methods or a numerical solution of the forward Kolmogorov equation (see [8]). Aı̈t-Sahalia ([3]) proposed a method based on Hermite polynomials to derive a closed-form expansion for the transition density. Assume the following assumptions:
1. The functions µ(x; θ) and σ(x; θ) are infinitely differentiable in x, and three times continuously differentiable in θ, for all x ∈ D_X = (−∞, ∞) and θ ∈ Θ.
2. There exists a constant ζ such that σ(x; θ) > ζ > 0 for all x ∈ D_X and θ ∈ Θ.
3. For all θ ∈ Θ, µ(x; θ) and its derivatives with respect to x and θ have at most polynomial growth near the boundaries and
lim_{x→±∞} − (1/2) ( µ^2(x; θ) + ∂µ(x; θ)/∂x ) < ∞,
and consider a transformation of the original process X into a new process Y as
Y ≡ γ(X; θ) = ∫^X 1/σ(u; θ) du.
By applying Itô's Lemma, Y has unit diffusion
dY_t = µ_Y(Y_t; θ) dt + dW_t,
where the drift is given by
µ_Y(y; θ) = µ(γ^{−1}(y; θ); θ)/σ(γ^{−1}(y; θ); θ) − (1/2) ∂σ(x; θ)/∂x |_{x=γ^{−1}(y;θ)}.
Then the approximation of the log-transition density of y, given the initial value y_0 and a time step ∆, is given by
l_Y^(K)(y | y_0, ∆; θ) = −(1/2) log(2π∆) − C^(−1)(y | y_0; θ)/∆ + Σ_{k=0}^{K} C^(k)(y | y_0; θ) ∆^k/k!,   (2)
where K is the order of the expansion connected to the power of ∆ and the coefficients C^(k)(y | y_0; θ) can be calculated by substitution of the proposed solution (2) into the forward and backward Kolmogorov equations. Given l_Y^(K), the expression for l_X^(K) is given by the Jacobian formula
l_X^(K)(x | x_0, ∆) = −(1/2) log(2π∆σ^2(x)) − (γ(x) − γ(x_0))^2/(2∆) + Σ_{k=0}^{K} C^(k)(γ(x) | γ(x_0); θ) ∆^k/k!.   (3)
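For comparison, the crudest of the approximations mentioned above, the Euler scheme, replaces the transition density by a Gaussian with drift- and diffusion-matched moments; a short sketch (our own illustration, not the closed-form expansion (2)-(3)):

```python
import numpy as np

def euler_log_transition(x, x0, delta, mu, sigma):
    """Gaussian (Euler) approximation of the log transition density of
    dX = mu(X) dt + sigma(X) dW over a step of length delta."""
    m = x0 + mu(x0) * delta
    v = sigma(x0) ** 2 * delta
    return -0.5 * np.log(2 * np.pi * v) - (x - m) ** 2 / (2 * v)
```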
3 Stochastic cusp model
The univariate stochastic cusp model can be characterized by nonlinear diffusion process for the variable of interest
dX_t = ( α + β (X_t − λ)/σ − ((X_t − λ)/σ)^3 ) (1/(2σ)) dt + σ_w dW_t,   (4)
where W_t is a standard Brownian motion, σ > 0 and σ_w > 0. The stationary density can be expressed analytically as
f_s(x; θ) = η_s(θ) exp[ α (x − λ)/σ + (β/2) ((x − λ)/σ)^2 − (1/4) ((x − λ)/σ)^4 ],   (5)
where η_s(θ) is the normalizing constant. Bimodality of the stationary probability density function (5) is identified by the statistic called Cardan's discriminant
δ_C = (α/2)^2 − (β/3)^3.   (6)
The parameters α (asymmetry) and β (bifurcation) are invariant with respect to changes in λ (location) and σ (scale), as is δ_C, and they have the following approximate interpretations [4]. If δ_C ≥ 0, then α measures skewness and β kurtosis, while if δ_C < 0, α indicates the relative height of the two modes and β their relative separation.
3.1 Time-varying parameters
Following for example Creedy and Martin [6] or Fernandes [7], one can allow the parameters α and β from (4) to be time-varying:
dx_t = ( α(ξ_t) + β(ξ_t) (x_t − λ)/σ − ((x_t − λ)/σ)^3 ) (1/(2σ)) dt + σ_w dW_t,   (7)
where ξ_t is a vector of market fundamentals that are strictly exogenous with respect to x_t. Using the approximate transition density (3), the log-likelihood of the observed values x_i, ξ_i is given by:
ll
(2)
=
n−1 X i=1
+
−
∆2σ 2 σw
∆σ 2 σw +
"
−
2 2 − zi2 z 4 − zi4 (zi+1 − zi ) αi zi+1 − zi βi zi+1 + + − i+1 + 2∆σ 2 σw 4 σw 8σw
β2 σw 2 αi2 αi 3 βi 1 + κi − 2βκi − i κ2i + −5σw + κ4i + κi − κ6i + 8 16 24 20 4 56 σw β2 βi αi 1 4 1 2 − i + κ2i (3, 4) + κi − κi (5, 8, 9) − log(2π∆σw ) 8 48 40 16 112 2 894
(8)
Mathematical Methods in Economics 2016
where ∆_σ = ∆/σ^2, α_i = α_0 + α_1 ξ_i, β_i = β_0 + β_1 ξ_i, z_i = (x_i/σ_w − λ)/σ and κ_i^j(c) = Σ_{k=0}^{j} c_k z_{i+1}^k z_i^{j−k}.
By differentiating the log-likelihood (8) with respect to α_0 and β_0 and setting the derivatives to zero, one can express these parameters in terms of the other parameters:
α̂_0 = [ 2 c_13^0 c_21^0 − c_12^0 c_22^0 + 2β_1 (c_22^1 c_13^0 − c_22^0 c_13^1) + α_1 (4 c_13^0 c_31^1 − c_22^0 c_22^1) ] / [ (c_22^0)^2 − 4 c_13^0 c_31^0 ],   (9)
β̂_0 = [ 2 c_12^0 c_31^0 − c_21^0 c_22^0 + 2α_1 (c_22^1 c_31^0 − c_22^0 c_31^1) + β_1 (4 c_31^0 c_13^1 − c_22^0 c_22^1) ] / [ (c_22^0)^2 − 4 c_13^0 c_31^0 ],   (10)
where coefficients cij are as follows:
ck12 = ck13
=
Pn−1 i=1
ck21 = ck22 = ck31 =
κ2 (3,4) 1 4 − zi2 ) + ∆σ − 14 + 20 κi + ∆2σ i 40 , 2 Pn−1 k κi 2 1 i=1 ξi −∆σ 24 − ∆σ 48 , Pn−1 k 1 κ3i 2 κi ξ (z − z ) + ∆ + ∆ i+1 i σ σ i i=1 2 16 16 , Pn−1 k − i=1 ξi ∆σ κ8i , Pn−1 − i=1 ξik ∆σ 81 .
ξik
1 2 4 (zi+1
Besides, one can express the parameters α_1 and β_1 in a similar way to further reduce the parameter space, which means that only the parameters λ, σ and σ_w^2 need to be estimated numerically. Employing the transition density estimation allows one to estimate the variance of dW_t, in contrast with the stationary density estimation.
3.2 Probability of crash
Denote the probability of an extreme event by
π(x | x_0, ∆; θ, c) = ∫_{−∞}^{c} exp[ l(x | x_0, ∆; θ) ] dx,   (11)
which, for c corresponding to the antimode of the stationary density in the bimodal case, represents the probability of switching to the lower stable equilibrium from a starting point x_0 > c. If Cardan's discriminant (6) is lower than zero, then the antimode c_a can be calculated as:
c_a = −2 sign(α) sqrt(β/3) cos( (1/3) arctan( (2/|α|) sqrt(−α^2/4 + β^3/27) ) + arctan √3 ).   (12)
In general, π(x | x_0, ∆; θ, c) can be used with an arbitrary c, in which case it represents a standard risk-measure approach. It captures large changes in the unimodal case as well; however, there they occur exclusively due to the stochasticity of the process.
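As a numerical alternative to the closed form (12), the antimode (and the two modes) can be obtained as the real roots of α + βz − z^3 = 0, the critical points of the stationary density (5); a short sketch with illustrative parameter values (not the estimates of Section 4):

```python
import numpy as np

def cardan_discriminant(alpha, beta):
    # delta_C = (alpha/2)^2 - (beta/3)^3; the stationary density is bimodal iff delta_C < 0
    return (alpha / 2.0) ** 2 - (beta / 3.0) ** 3

def stationary_equilibria(alpha, beta, lam, sigma):
    """Modes and antimode of (5): real roots of alpha + beta*z - z^3 = 0, mapped to x = lam + sigma*z."""
    roots = np.roots([1.0, 0.0, -beta, -alpha])          # z^3 - beta*z - alpha = 0
    z = np.sort(roots[np.abs(roots.imag) < 1e-10].real)
    return lam + sigma * z

alpha, beta = 0.2, 1.5                                   # illustrative values only
print(cardan_discriminant(alpha, beta))                  # negative -> bimodal
print(stationary_equilibria(alpha, beta, 0.0, 1.0))      # lower mode, antimode c_a, upper mode
```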
4 Empirical example
Following [10] and [7], the example illustrating the proposed methodology involves the daily observed exchange rate between the Icelandic Krona and the British Pound in the period from July 1, 1999 to December 31, 2004, where the differential of the one-month interbank interest rates of the corresponding countries is used as the market fundamental.1 The basic statistical characteristics are presented in Table 1. The interest rate differential is a commonly used macroeconomic indicator of the soundness of the banking sector, as high interest rate differentials often precede increases in nonperforming loans (see [9]).
from the Central Bank of Iceland and Bank of England.
895
Mathematical Methods in Economics 2016
mean median std. dev. skewness kurtosis minimum maximum
ISK/GBP 128.97 127.55 9.94 0.66 2.71 112.75 156.3
rI /rB 1.0397 1.0421 0.0222 0.2694 1.8909 1.0076 1.0886
Table 1 Descriptive statistics of exchange rate and interest rate differential Keeping the modeled quantities as usual (e.g. [6], [10]), we will use the logarithm of the exchange rate xt and logarithm of the interest rate differential ξt , where variable parameters of drift (7) are given by αt = α0 + α1 ξt−1 and βt = β0 + β1 ξt−1 . Table 2 discloses the results of numerical maximum likelihood estimation of the diffusion process. It reveals that the stationary and transition estimates of parameters are similar, but the errors of estimates are remarkably higher for transition density estimation. There are two possible explanation for that: firstly the non-normality of estimates, but this holds for stationary density estimation as well, second explanation holds just for the transition model, where the last observation of exchange rate explains substantial part of distribution of next observation. However, Figure 1 shows, that the switching probability (11) indicated possibility of depreciation of the Krona in the first quarter of 2001, before the real depreciation came. Except of that Figure 1 displays evolution of Cardan’s discriminant calculated according to both stationary and transition estimates, together with scaled exchange rate, interest rate differential and probability of adverse equilibria in a one-step ahead forecast. parameter α0 α1 β0 β1 λ σ σw
stationary est. -0.529 0.615 0.021 2.312 0.046 0.906
stat. std. error 0.054 0.042 0.173 0.099 0.025 0.018
transition est. -0.525 1.047 -0.054 2.018 0.031 0.755 1.207
trans. std. error 1.273 0.933 1.562 1.328 0.268 0.105 0.023
Table 2 Estimation results for stationary and transition density
5
Conclusion
This paper extends approximate maximum likelihood approach to the transition density estimation by time-varying parameters of the stochastic cusp model. It shows how to simplify the estimation and the measure for indicating endogenous crash of the system is introduced. The example concerning Iceland Krona and British Pound exchange rate is used to illustrate the performance of proposed methodology, where interest rate differential serves as a market fundamental. Comparison with the stationary density approach reveals the behavior of the exchange rate close to the random walk, but the measure of crash was able to indicate the Krona depreciation in advance.
Acknowledgements Supported by the grant No. P402/12/G097 of the Czech Science Foundations.
896
Figure 1 Stationary and transition evolution of Cardan’s discriminant, exchange rate, interest rate differential and probability of adverse equilibria
References
[1] Aı̈t-Sahalia, Y.: Testing continuous-time models of the spot interest rate. Review of Financial Studies 9 (1996), 385–426.
[2] Aı̈t-Sahalia, Y.: Maximum Likelihood Estimation of Discretely Sampled Diffusion: A Closed-form Approximation Approach. Econometrica 70, 1 (2002), 223–262.
[3] Aı̈t-Sahalia, Y.: Closed-form Likelihood Expansions for Multivariate Diffusions. The Annals of Statistics 36, 2 (2008), 906–937.
[4] Cobb, L., Koppstein, P., and Chen, N. H.: Estimation and Moment Recursion Relations for Multimodal Distributions of the Exponential Family. Journal of the American Statistical Association 78, 381 (1983), 124–130.
[5] Creedy, J., and Martin, V.: Multiple Equilibria and Hysteresis in Simple Exchange Models. Economic Modelling 10, 4 (1993), 339–347.
[6] Creedy, J., and Martin, V.: A Non-Linear Model of the Real US/UK Exchange Rate. Journal of Applied Econometrics 11, 6 (1996), 669–686.
[7] Fernandes, M.: Financial crashes as endogenous jumps: estimation, testing and forecasting. Journal of Economic Dynamics and Control 30, 1 (2006), 111–141.
[8] Jensen, B., and Poulsen, R.: Transition densities of diffusion processes: Numerical comparison of approximation techniques. Journal of Derivatives 9 (2002), 18–32.
[9] Kaminsky, G. L., and Reinhart, C.: The twin crises: causes of banking and balance-of-payments crises. American Economic Review 89 (1996), 473–500.
[10] Koh, S. K., Fong, W. M., Chan, F.: A Cardan's discriminant approach to predicting currency crashes. Journal of International Money and Finance 26 (2007), 131–148.
The Impact of Labor Taxation on Economic Growth
Iva Vybíralová1
Abstract. Problems of taxation are of interest not only to economists but also to the general public, because taxation affects their lives in many ways. The aim of this article is to evaluate the impact of direct taxation on economic growth, in particular the impact of the tax burden on labour represented by personal income tax and social security contributions. Both the tax quota and the alternative World Tax Index (WTI) are used as indicators of the tax burden, and a dynamic panel data analysis is employed. The negative effect of labour taxation is confirmed: the negative influence of personal income tax and social security contributions on economic growth is significant. In the model with the tax quota, a positive effect of corporate taxation is found, which may be caused by using the tax quota as an indicator of the tax burden. In the model with the alternative WTI indicator, a negative effect of corporate taxation on economic growth is found. The EViews 7 software is used.
Keywords: taxation, tax quota, WTI, dynamic panel data analysis, economic growth.
JEL Classification: H20, O40, C50
AMS Classification: 91G70
1 Taxation and the Economic Growth
This article focuses on the problem of labor taxation. The aim of this paper is to evaluate the effects of taxation on economic growth and to provide recommendations for tax policy makers. The paper works with the hypothesis of a negative impact of labor taxation on economic growth. Two different measures of the tax burden are used, namely the tax quota and the WTI together with their subcomponents, which makes it possible to distinguish the burden of different types of taxes. The taxation of labor is a topical issue in every period; policy makers should be able to estimate a level of taxation which is neither harmful for the state budget nor harmful for the productivity of individuals due to excessive taxation. Kessler and Norton [12] based their laboratory experiment on the idea that economic agents perceive taxation very negatively. In the experiment, they reduced the wages of subjects in two different ways: in the first case, the wage was reduced by a certain amount; in the second case, it was reduced by a tax of the same size. The experiment confirmed the negative influence of taxation on the productivity of individuals. Taxation reduces the disposable income of individuals, and this influences their economic behavior. With a decline in disposable income, restrictions on investment in education can be expected, and this investment is considered the engine of economic growth in endogenous growth theory. This work is based on endogenous growth theory, whose main representatives are Romer [19] and Lucas [17]; they extended the original neoclassical models by human capital and knowledge. Endogenous growth models allow taxes to influence the growth of the economy through their effects on growth variables, even after a steady state is reached. As reported by Johansson, Heady, Arnold, Brys and Vartia [10], taxes influence household decisions about the amount of savings and their labor supply, and they also determine the amount of investment in human capital. Leibfritz, Thornton and Bibbee [16] stated that the negative impact of taxation on growth lies in reducing savings and investment because of high taxation.
1 Technical University of Ostrava, Faculty of Economics, Sokolská tř. 33, 721 00 Ostrava, Czech Republic, iva.vybiralova@vsb.cz.
1.1 Literature review
Many authors have devoted their work to the issue of taxation and economic growth. The effect of taxation on economic growth was examined comprehensively, for instance, by King and Rebelo [13]. According to their results, the effect of taxation on economic growth depends on the degree of openness of the economy; taxes have the greatest impact on growth in small open economies. The results of Easterly and Rebelo [7] did not confirm a significant relationship between taxes and economic growth; they found a negative effect of taxation on growth only in the case of personal taxes. Myles [18] also does not consider the relationship between taxes and economic growth to be so evident. Authors who confirmed the negative effect of taxation on economic growth include Karras and Furceri [11], Kotlán, Machová and Janíčková [15], Izák [9] and Romer and Romer [20]. The structure of taxation is also important, as indicated by the works of Easterly and Rebelo [7], Widmalm [21] or Myles [18]. According to the results of Gemmell, Kneller and Sanz [8], taxes influence gross domestic product (GDP) mainly through the productivity factor, not through the accumulation factor. Arnold [3], in his study of 21 OECD countries over the years 1971–2004, found a negative impact of personal income tax on the level of output, which also depends on the progressivity of this tax: lower output of the economy can be expected with higher progressivity. According to the results of Acosta-Ormaecha and Yoo [1], the taxes most damaging to economic growth are personal income tax and social security contributions. The negative impact of personal income tax was also confirmed by Johansson, Heady, Arnold, Brys and Vartia [10] and by Kotlán and Machová [14]. Dackehag and Hansson [5] admitted a positive impact of personal income taxes and corporate taxes, but only on the condition that both taxes are low; otherwise they confirm the negative impact of taxation on economic growth. Daveri and Tabellini [6] observed the relationship between unemployment and economic growth; according to their results, the labor costs of companies increased due to the high tax burden on labor, and the consequence was not only higher unemployment but also reduced capital investment. Alesina, Ardagna, Perotti and Schiantarelli [2] also investigated increases in income tax or social security contributions and examined the impact of such increases on labor costs and hence on the amount of future investment.
1.2 The data used and methodology
The models were estimated using the generalized method of moments (GMM) with the Arellano and Bond estimator, in the EViews 7 software. As Baltagi [4] indicates, the essence of a dynamic panel is the inclusion of the lagged dependent variable among the regressors; in the following models the dependent variable is lagged by one period. The robust "White period" estimator was used, so the reported standard errors are robust to autocorrelation and heteroscedasticity [22]. It is important to observe the homogeneity of the data, since the results of the panel regression are more accurate when the data are homogeneous. The group of OECD countries was chosen for the study; the OECD associates the most advanced countries of the world. Global databases were used, where it can be assumed that the data collection and subsequent processing support sample homogeneity. Data from the OECD and WB databases and from the pages of the alternative WTI indicator were used. The variable GDP from the WB Databank2 was taken as the variable of economic growth; it is measured as GDP per capita in constant 2005 US dollars. The OECD.Stat3 database was used to approximate physical capital, which is represented in the following models by gross fixed capital formation expressed as a percentage of GDP. The human capital indicator is taken from the OECD iLibrary4 database; it consists of several components, divided into three main groups related to investment in human capital, its quality and learning outcomes. The tax variables were taken from two sources: the OECD.Stat5 database in the case of the model with the tax quota, and the WTI6 database in the case of the model with the alternative indicator of the tax burden. Tax quota indicators were chosen as tax variables because of their simplicity and the easy accessibility of the data; the tax quota is the ratio of tax revenue to GDP. The tax quota was described as an inappropriate indicator of the tax burden by Kotlán and Machová [14] and by Kotlán, Machová and Janíčková [15]; therefore, the alternative WTI indicator was also used in this paper. The WTI is a multi-criteria indicator of the tax burden which combines hard and soft data. Tax revenue of taxes on income, profits and capital gains as % of GDP (TQ1200) was chosen as the indicator of corporate taxation, with the index of corporate income tax (CIT) as the alternative. Consumption taxes are represented in the model with the tax quota by the tax revenue of general taxes on goods and services as % of GDP (TQ5110).
2 http://data.worldbank.org/indicator/NY.GDP.PCAP.KD.
3 https://stats.oecd.org/Index.aspx?DataSetCode=NAAG#.
4 http://stats.oecd.org/BrandedView.aspx?oecd_bv_id=eo-data-en&doi=data-00717-en.
5 https://stats.oecd.org/Index.aspx?DataSetCode=REV#.
6 http://www.worldtaxindex.com/wti-dataset/.
The indicator of value added tax (VAT) was used in the case of the model with the WTI index. This group of taxes is further represented by the tax revenues of taxes on specific goods and services as % of GDP (TQ5120), with the index of other taxes on consumption (OTC) as the alternative in the model with the WTI index. The tax burden on labor is represented in the model with the tax quota by the sum of the tax revenue of taxes on income, profits and capital revenue and social security contributions as % of GDP (TQ1100_2000); in the model with the alternative indicator, the index of personal income tax (PIT) is used. Basic descriptive statistics were computed. The results show that the highest tax burden measured by the tax quota was found for the taxation of labor. In the period under review (1995–2013), the maximum tax revenue as % of GDP was reached in 1997 in Sweden, and the tax burden gradually decreased over time. The highest average taxation, exceeding 25% for tax revenues from taxes on income, profits and capital revenue and social security contributions (TQ1100_2000), was found in Belgium, Sweden, Denmark and Finland, while the lowest values (below 10%) were reached by Korea and Turkey. When the time series was shortened by five years (2000–2013), the results were very similar: even in this period, the countries with the lowest tax burden on labor were Korea and Turkey, although the country rankings changed at the top end. In the data sample of the WTI, the average tax burden was the highest for labor taxes (PIT). From the perspective of the alternative WTI indicator, the highest value was detected for the tax burden of the value added tax (VAT), which is represented in the models with the tax quota by the variable TQ5110. With regard to the availability of data, the models were constructed with data from 1995 to 2013; in the case of the model with the WTI index, data are available only since 2000. The model with the tax quota was therefore also constructed for the shorter period to allow comparison with the WTI model. The econometric program EViews 7 was used. The Levin, Lin & Chu test, the Im, Pesaran and Shin test and the ADF–Fisher chi-square test were used for testing the stationarity of the time series [4]. First differences (D1) were used in the case of non-stationary series; for the fiscal variable OTC, second differences (D2) had to be used, since the series was not stationary even after first differencing. In order to interpret the results, all time series were taken in logarithms (L). The empirical analysis was performed for 30 OECD countries.
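As an illustration of the stationarity step described above, the sketch below combines country-level ADF tests into a Fisher-type (Maddala–Wu) panel statistic and shows the D1/log transformation; the data frame layout and column names are hypothetical, and the paper itself relied on the panel unit root tests implemented in EViews 7.

```python
# Hedged sketch of an ADF-Fisher panel unit root test; column names and the
# long-format panel layout are assumptions, not the paper's actual data set.
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.tsa.stattools import adfuller

def adf_fisher(panel: pd.DataFrame, var: str) -> float:
    """Return the p-value of the Fisher statistic -2 * sum(log p_i),
    where p_i are country-level ADF p-values for log(var)."""
    pvals = []
    for _, g in panel.groupby("country"):
        series = np.log(g.sort_values("year")[var]).dropna()
        if len(series) > 10:                     # require a minimally long series
            pvals.append(adfuller(series, autolag="AIC")[1])
    fisher_stat = -2.0 * np.sum(np.log(pvals))
    return 1.0 - stats.chi2.cdf(fisher_stat, df=2 * len(pvals))

# If the level series is non-stationary, the models use logged first differences:
# panel["D1_HDP_L"] = panel.groupby("country")["HDP"].transform(lambda s: np.log(s).diff())
```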
2 Panel regression
The aim of this study is to evaluate the effect of the tax burden on economic growth. The estimated models are based on endogenous growth theory, in which the tax variables are expected to affect the individual growth variables. The empirical specification of the constructed model may be written in the form of equation (1):

HDP_it = γ HDP_i,t−1 + Σ_j β_j X_it + Σ_k δ_k TAX_it + ε_it                (1)

where i = 1, …, N is the number of countries, t = 1, …, T is the number of years, X_it is the set of control growth variables, TAX_it is the set of tax variables and ε_it is the residual component.
In equation (2), equation (1) is extended for the model using the tax quota. It can be written as follows:

HDP_it = γ HDP_i,t−1 + β_1 K_it + β_2 H_it + δ_1 TQ1200_it + δ_2 TQ5110_it + δ_3 TQ5120_it + δ_4 TQ1100_2000_it + ε_it                (2)
where K is the indicator of physical capital, H is the indicator of human capital, TQ1200 is the tax revenue of taxes on income, profits and capital gains as % of GDP, TQ5110 is the tax revenue of general taxes on goods and services as % of GDP, TQ5120 is the tax revenue of taxes on specific goods and services as % of GDP, and TQ1100_2000 is the sum of the tax revenue of taxes on income, profits and capital revenue and social security contributions as % of GDP. Using the partial WTI indicators, the tax variables can be written as in equation (3):

HDP_it = γ HDP_i,t−1 + β_1 K_it + β_2 H_it + δ_1 CIT_it + δ_2 VAT_it + δ_3 OTC_it + δ_4 PIT_it + ε_it                (3)
where CIT is the index of corporate income tax, VAT is the index of value added tax, OTC is the index of other taxes on consumption and PIT is the index of personal income tax. As is apparent from the literature review, positive effects of the growth variables (physical and human capital) on economic growth were assumed, whereas negative impacts of the fiscal variables on economic growth were expected. The empirical analysis was based on a dynamic panel model estimated by the generalized method of moments (GMM). The variables were used in first differences (D1), and all series were tested for stationarity by the three tests mentioned above. The estimated model was statistically significant and all explanatory variables were statistically significant; only the lagged explained variable (GDP) was statistically significant at the 10% significance level. Cross-correlation analysis was also used. When the variable representing the tax burden
on labor (TQ1100_2000) was lagged by one period, the statistical significance of the individual variables was retained and the lagged explained variable became statistically significant at the 1% level of significance. The results of this model are presented in Table 1. The negative impact of the tax burden on labor on economic growth was confirmed; personal income tax and social security contributions had, quantitatively, the highest negative impact on economic growth. Consumption taxes and corporate taxes have a smaller impact on economic growth than the taxation of labor. For taxes on income, profits and capital gains (TQ1200), representing corporate taxation, the negative impact on growth foreseen by theory was not confirmed.

Dependent Variable: D1_HDP_L         Statistical Verification        Economic Verification
Variable                  Coefficient            Theory    Empirical evidence
D1_HDP_L(-1)              -0,07(-3,74)***        +         -
D1_K_L                    0,28(19,18)***         +         +
D1_H_L                    5,60(9,63)***          +         +
D1_TQ1200_L               0,05(9,57)***          -         +
D1_TQ5110_L               -0,03(-3,01)***        -         -
D1_TQ5120_L               -0,06(-8,23)***        -         -
D1_TQ1100_2000_L(-1)      -0,11(-9,07)***        -         -
Number of observations    451
Number of instruments     30
J-statistic               27,21
Source: own calculations. Note: the level of significance is indicated by the number of stars: 1% (***), 5% (**) and 10% (*).
Table 1 Model with Tax Quota (1995-2013)

The results of the model with the tax quota estimated for the shortened period are shown in Table 2. The model was again estimated with the first differences of the individual variables (D1), for the shortened period 2000–2013, in order to allow a comparison with the WTI model. The results are very similar to those of the tax quota model for the longer period. The model as a whole was statistically significant when the variable representing the tax burden on labor (TQ1100_2000) was lagged by one period; all explanatory variables and the lagged explained variable were statistically significant at the 1% probability level. Quantitatively, the highest impact on economic growth again belonged to the variable representing the tax burden on labor, and a positive impact of corporate taxation on economic growth was again found. The same result was obtained by Kotlán, Machová and Janíčková [15], although unlike in their results, the impact of this taxation on economic growth was not as strong here. The time series in the model with the WTI index were not stationary, so first differences (D1) were used; the variable OTC was not stationary even then, so its second difference (D2) was used, while the remaining variables entered in first differences (D1). The variables CIT, OTC and PIT were statistically insignificant when the model was estimated without lags. When the variable PIT was lagged by one period (-1), it became statistically significant, but its impact on economic growth was positive and the variable OTC remained statistically insignificant; therefore, this model was rejected. According to the results of the cross-correlation analysis, the variable OTC was then lagged by one period (-1) and the variable CIT by two periods (-2); apart from the variable representing labor taxation (PIT), the remaining variables were statistically significant, and this model was also rejected. Several models with different lag lengths suggested by the cross-correlations were designed, and the model in which the tax variables were lagged by two periods (-2) was accepted. All tested variables were statistically significant at the 1% level of probability, and a negative effect on economic growth was found for all the tax variables. Quantitatively, the highest impact on economic growth belonged to corporate taxation, and the taxation of labor, represented by the variable PIT, had the second highest negative impact. The positive effect of physical and human capital on economic growth was confirmed in this model as well.
Model with tax quota                                   Model with WTI
Dependent Variable: D1_HDP_L                           Dependent Variable: D1_HDP_L
Variable                  Coefficient                  Variable                  Coefficient
D1_HDP_L(-1)              -0,19(-8,31)***              D1_HDP_L(-1)              -0,10(-3,45)***
D1_K_L                    0,32(45,88)***               D1_K_L                    0,39(25,49)***
D1_H_L                    10,09(6,95)***               D1_H_L                    17,22(8,19)***
D1_TQ1200_L               0,05(6,58)***                D1_CIT_L(-2)              -0,06(-11,13)***
D1_TQ5110_L               -0,10(-8,51)***              D1_VAT_L(-2)              -0,03(-3,01)***
D1_TQ5120_L               -0,16(-11,26)***             D2_OTC_L(-2)              -0,03(-4,44)***
D1_TQ1100_2000_L(-1)      -0,24(-25,78)***             D1_PIT_L(-2)              -0,05(-2,62)***
Number of observations    330                          Number of observations    270
Number of instruments     30                           Number of instruments     30
J-statistic               27,36                        J-statistic               27,21
Source: own calculations. Note: the level of significance is indicated by the number of stars: 1% (***), 5% (**) and 10% (*).
Table 2 Model with tax quota and model with WTI
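For orientation, the following is a minimal sketch of the dynamic panel logic behind these estimates. It implements only a simplified Anderson–Hsiao-type instrumental variable estimator of the first-differenced equation (Δy_{i,t−1} instrumented by y_{i,t−2}), not the full Arellano–Bond GMM with White-period robust errors used in EViews 7, and the data frame and column names are assumptions.

```python
# Hedged sketch: a simplified Anderson-Hsiao IV estimator for a dynamic panel
# like model (2). This is not the Arellano-Bond GMM estimator the paper uses;
# it only illustrates the first-difference / lagged-level-instrument idea.
import numpy as np
import pandas as pd

def anderson_hsiao(panel, y="HDP_L", x_cols=("K_L", "H_L", "TQ1200_L")):
    df = panel.sort_values(["country", "year"]).copy()
    df["dy"] = df.groupby("country")[y].diff()            # Delta y_it
    df["dy_lag"] = df.groupby("country")["dy"].shift(1)    # Delta y_{i,t-1}
    df["y_lag2"] = df.groupby("country")[y].shift(2)       # instrument y_{i,t-2}
    for c in x_cols:
        df["d_" + c] = df.groupby("country")[c].diff()     # Delta x_it
    df = df.dropna()
    names = ["dy_lag"] + ["d_" + c for c in x_cols]
    X = df[names].to_numpy()
    Z = df[["y_lag2"] + ["d_" + c for c in x_cols]].to_numpy()   # instruments
    # Just-identified IV: beta = (Z'X)^{-1} Z'y
    beta = np.linalg.solve(Z.T @ X, Z.T @ df["dy"].to_numpy())
    return pd.Series(beta, index=names)
```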
3 Conclusion
The effect of taxation on economic growth is in the spotlight not only for professionals but also for the general public, since every economic entity must pay taxes. There are two basic options: either the tax is deducted directly from the income of the economic entity (which is perceived very negatively), or it is hidden as part of the product price. The latter, hidden tax treatment should be less harmful for economic growth; this conclusion was supported, for example, in the work of Arnold [3], and Kessler and Norton [12] confirmed the negative view of economic entities on direct taxation on the basis of an experiment. The aim of this study was to evaluate the impact of labor taxation on economic growth. The results of the analysis confirm the negative impact of labor taxation on economic growth: in the constructed models a negative relationship between the variables representing this taxation and growth was confirmed, and the quantitatively strongest relationship between taxes and economic growth was detected for taxes on labor. The results of the model using the tax quota indicators confirmed the theory in the case of labor taxes and consumption taxes (represented in the models by TQ5110 and TQ5120). In the case of the corporate tax, the negative impact on economic growth was not confirmed; according to the results of the model, corporate taxation should support economic growth. According to the model results, the negative impact of the tax burden on labor on GDP is reflected with a one-year delay. In the case of the model with the shortened period (2000–2013), the results were very similar: again, the negative impact of the corporate tax on economic growth was not confirmed, and the quantitative impact of this taxation on economic growth was the lowest, while the negative impact of consumption taxes on economic growth was confirmed. Again, the strongest negative influence belonged to the indicator TQ1100_2000, which was again lagged by one period. Although the data base was changed, the results were the same, so the model can be considered robust. The last model was constructed using the alternative WTI indicator as the tax variable. Unlike the previous models, this model fully confirms the economic theory that taxation has a negative impact on economic growth. A negative influence of corporate taxation (CIT) on economic growth was found, and quantitatively this effect was the highest. The individual tax variables were lagged by two periods (-2); with this lag, all the explanatory variables were statistically significant. During the construction of this model, the variable OTC had to be applied in the second difference (D2). The basic recommendation for tax policy makers is not to raise tax revenues through the taxation of labor, since the results of the analysis confirmed the negative impact of this taxation; one possibility is to transfer tax revenues from direct taxes to indirect taxes, which do not have such a negative impact on growth. In the case of corporate taxation, the question remains whether the tax quota is an appropriate indicator of the tax burden: according to the results of those models, the recommendation for tax policy makers would be to increase this tax because of its positive effect on growth, whereas the literature reviewed above suggests that this taxation should have the highest negative effect. Therefore, the use of the alternative WTI indicator, which confirmed the theoretical results, appears suitable.
References
[1] Acosta-Ormaecha, S., and Yoo, J.: Tax Composition and Growth: A Broad Cross-Country Perspective. IMF Working Paper No. 257. International Monetary Fund, 2012.
[2] Alesina, A., Ardagna, S., Perotti, R., and Schiantarelli, F.: Fiscal Policy, Profits, and Investment. NBER Working Papers 7207. National Bureau of Economic Research, 1999.
[3] Arnold, J.: Do Tax Structures Affect Aggregate Economic Growth? Empirical Evidence from a Panel of OECD Countries. Working Papers No. 643. Organization for Economic Cooperation and Development, 2008.
[4] Baltagi, B. H.: Econometric Analysis of Panel Data. John Wiley & Sons Ltd., Chichester, 2005.
[5] Dackehag, M., and Hansson, Å.: Taxation of Income and Economic Growth: An Empirical Analysis of 25 Rich OECD Countries. Working Paper 2012:6. Department of Economics, 2012.
[6] Daveri, F., and Tabellini, G.: Unemployment, Growth and Taxation in Industrial Countries. Working Paper No. 122. Innocenzo Gasparini Institute for Economic Research, 1997.
[7] Easterly, W., and Rebelo, S.: Fiscal policy and economic growth: An empirical investigation. Journal of Monetary Economics 32 (1993), 417–458.
[8] Gemmell, N., Kneller, R., and Sanz, I.: The growth effects of tax rates in the OECD. Canadian Journal of Economics 47 (2013), 1217–1255.
[9] Izák, V.: Vliv vládních výdajů a daní na ekonomický růst (empirická analýza). Politická ekonomie 59 (2011), 147–163.
[10] Johansson, Å., Heady, Ch., Arnold, J., Brys, B., and Vartia, L.: Tax and Economic Growth. Working Paper No. 620. Organization for Economic Cooperation and Development, 2008.
[11] Karras, G., and Furceri, D.: Taxes and growth in Europe. South-Eastern Europe Journal of Economics 7 (2009), 181–204.
[12] Kessler, J. B., and Norton, M. I.: Tax Aversion in Labor Supply. Journal of Economic Behavior and Organization, 2015.
[13] King, R. G., and Rebelo, S.: Public policy and economic growth: developing neoclassical implications. NBER Working Papers 3338. National Bureau of Economic Research, 1990.
[14] Kotlán, I., and Machová, Z.: Horizont daňové politiky v zemích OECD. Politická ekonomie 62 (2012), 161–173.
[15] Kotlán, I., Machová, Z., and Janíčková, L.: Vliv zdanění na dlouhodobý ekonomický růst. Politická ekonomie 59 (2011), 638–658.
[16] Leibfritz, W., Thornton, J., and Bibbee, A.: Taxation and Economic Performance. OECD Economics Department Working Papers No. 176. Organization for Economic Cooperation and Development, 1997.
[17] Lucas, R. E.: On the Mechanics of Economic Development. Journal of Monetary Economics 22 (1988), 3–42.
[18] Myles, G. D.: Taxation and Economic Growth. Fiscal Studies 21 (2000), 141–168.
[19] Romer, P.: Increasing Returns and Long-Run Growth. The Journal of Political Economy 94 (1986), 1002–1037.
[20] Romer, Ch. D., and Romer, D. H.: The Macroeconomic Effects of Tax Changes: Estimates Based on a New Measure of Fiscal Shocks. American Economic Review 100 (2010), 763–780.
[21] Widmalm, F.: Tax structure and growth: Are some taxes better than others? Public Choice 107 (2001), 199–219.
[22] Wooldridge, J. M.: Introductory Econometrics: A Modern Approach. South-Western CENGAGE Learning, Mason, 2009.
The similarity of the European Union countries as regards the production of energy from renewable sources
Katarzyna Warzecha1
Abstract. The rapid development of civilization causes an increase in the demand for electricity. This applies in particular to the European Union countries, for which the European Parliament and the Council defined, in the relevant Directive, general objectives concerning the share of energy from renewable sources in gross final energy consumption, with individual national targets for every country by the year 2020. One of the aims of the present study is to demonstrate how the European Union countries comply with the Directive. The main objective of the research is the evaluation of changes occurring in the production of energy from renewable sources and the division of the countries into groups with a similar structure of production of energy from renewable sources (with the use of Czekanowski's diagram). Chernoff faces were used to visualize the multidimensional data. The sample period is the years 2005 and 2014. The statistical data come from the Eurostat website and the calculations were made with the Statistica and MaCzek computer programs.
Keywords: renewable energy, Czekanowski's diagram, Chernoff faces
JEL Classification: Q42
AMS Classification: 62H86
1 Introduction
The technical and technological progress driving the dynamic development of civilization is connected with a growing demand for energy. The use of traditional non-renewable energy sources, that is fossil fuels such as coal, oil and natural gas, increases environmental pollution, but first and foremost it leads to the exhaustion of natural resources. Therefore, the European Union strongly emphasizes the use of more beneficial, renewable sources of energy. This type of energy became a key element of the European Union energy policy, and in 2009 the Climate and Energy Package was adopted [4], containing the main goals of energy policy by 2020. One of the objectives is to achieve in the EU a share of at least 20% of gross final energy consumption coming from renewable energy sources, as well as a 20% reduction of primary energy consumption in comparison with projected levels. The main objective of the research is the evaluation of the changes in the production of energy from renewable sources and the division of the EU countries into groups with a similar structure of energy production from renewable sources. The aim of this study is also to present how the EU countries comply with the Directive of the European Parliament and of the Council of 2009 concerning the share of renewables in final energy consumption, which established individual national targets for every country for the year 2020. Czekanowski's diagram was used to establish the similarities between the EU countries with regard to the production of energy from renewable sources. The division of the examined EU countries into groups with a similar structure of energy production from renewable sources allows us to answer the question whether geographical position and climate have a significant influence on the production of energy from specific sources. Chernoff faces were adopted to visualize the multidimensional data. The sample period is the years 2005 and 2014; the statistical data come from the Eurostat webpage [10]. Statistica and MaCzek are the computer programs used in the calculations.
2 The production of energy from renewable sources in the final energy consumption in EU countries in 2005 and 2014
The Climate and Energy Package states that at least 20% of gross final energy consumption should be covered by energy from renewable sources, with each EU country's share established at an individual level. In 2014, the share of energy from renewable sources in gross final energy consumption for the European Union (28 countries) totalled 16%, an increase of 7 percentage points in comparison with 2005.
1 University of Economics in Katowice, Department of Econometrics, warzecha@ue.katowice.pl.
To achieve the objective set by the European Parliament, 4 percentage points more are required; this means that within the space of 4 years the share has to increase by 25% of its 2014 value. The target of Polish energy policy is to increase the share of energy from renewable sources in gross final energy consumption to 15% in 2020. In 2014, the share of energy from renewable sources in gross final energy consumption for Poland was 11.45%, an increase of 4.55 percentage points in comparison with 2005. To achieve the intended objective, 3.55 percentage points more are required, which means that within the space of 4 years the share has to increase by over 30% of its 2014 value.
(Figure 1, panels A and B: share of energy from renewable sources in gross final energy consumption in 2005 and 2014; panels C and D: degree of implementation of the 2020 share of energy from renewable sources in gross final energy consumption according to Directive 2009/28/WE, in 2005 and 2014.)
Figure 1. Share of energy from renewable sources in gross final energy consumption and the degree of implementation of this share in the EU countries in 2005 and 2014
The share of renewable energy in gross final energy consumption in 2005 and 2014 is presented in the maps in Figure 1 (A and B). The EU countries characterized by the lowest share of energy from renewable sources in gross final energy consumption (below the mean of the 28 EU countries, which was 9% in 2005 and 16% in 2014) are Ireland, the Netherlands, Luxembourg, Malta and Great Britain. From the data in maps A and B it is clearly visible that in most countries the share of renewable energy in gross final energy consumption increased over the span of 10 years, which indicates that the examined countries strive to meet the objectives stated by the European Parliament for 2020. The data presented in maps C and D (Figure 1) show that in 2005 Croatia was the only country which had already achieved the aim stated in the EU Directive for the year 2020, while in 2014 the aim stated in the EU Directive had been achieved by more countries, i.e. Italy, Finland, Sweden, Estonia, the Czech Republic, Romania, Bulgaria and Lithuania.
3 The use of Czekanowski's diagram to establish the similarities between EU countries in regard to the structure of production of energy from renewable sources
The structure of the production of energy from renewable sources in the examined countries is diversified, which is a result of the geographical and, consequently, climatic conditions of a particular country and its capability to use its resources. In the EU-28 as a whole, in 2014 in comparison with 2005 (Figure 2), there was a substantial decrease in the share of solid biofuels in the total production of energy from renewable sources, from 55.5% to 44.56% (a decrease of about 20%). There was also a decrease in the share of hydropower in the total production of energy from renewable sources, from 21.18% to 16.02% (a decrease of about 25%). Over the span of 10 years, there was a significant increase in the share of wind energy in the total production of energy from renewable sources, from 4.77% to 10.81% (an increase of about 130%), and a rise of solar energy from 0.55% to 2.02% (an increase of about 270%). A rise can be noticed as well in the other energy carriers, which include municipal waste, bio gasoline, biodiesel and tidal energy.
(Figure 2 data – 2005: solid biofuels 56.50%, hydro power 21.18%, wind power 4.77%, geothermal energy 4.93%, liquid biofuels 4.18%, biogas 2.97%, solar thermal 0.55%, other 4.93%; 2014: solid biofuels 44.56%, hydro power 16.02%, wind power 10.81%, geothermal energy 3.08%, liquid biofuels 7.57%, biogas 7.42%, solar thermal 2.02%, other 8.52%.)
Figure 2. Share of the various energy carriers in the production of energy from renewable sources in the EU-28 in 2005 and 2014

To discover the similarities and differences between the EU countries as regards the structure of production of energy from renewable sources, Czekanowski's method was applied; it was originally proposed by its author for the needs of anthropology [2, 3]. A description of Czekanowski's method can be found in [7; 6; 8; 9]. The calculations establishing the similarities of the EU countries were applied to the shares of electricity produced from renewable energy sources coming from various sources of supply (such as water, wind, solar energy, solid biofuels, biogas, geothermal energy and others) in the total production of electricity from renewable energy carriers. In accordance with the rules of Czekanowski's method, all of the examined variables were transformed; all of the variables were stimulants (i.e. a variable X is a stimulant if a higher value of this variable means a better situation of a particular object from the examined point of view). Using the zero unitarization method, the variables were normalized according to formula (1):

z_ij = (x_ij − min_i x_ij) / (max_i x_ij − min_i x_ij),   (i = 1, 2, …, n; j = 1, 2, …, m)    (1)

where min x_ij is the minimum of variable x_j, max x_ij is the maximum of variable x_j, and z_ij is the standardized value of variable x_j for the i-th object.
The countries were considered to be similar if they were situated close to each other in the multidimensional space of explanatory variables. The distance between the examined countries in the multidimensional space was defined with the use of the Euclidean distance formula (2):

d_il = sqrt( Σ_{j=1}^{m} (z_ij − z_lj)² ),   (i = 1, 2, …, n; j = 1, 2, …, m)    (2)
where d_il is the distance between the i-th object and the l-th object, z_ij is the standardized value of the variable for the i-th object, and z_lj is the standardized value of the variable for the l-th object. The graphic image of the discussed method is Czekanowski's diagram, which allows the division of the examined EU countries into groups characterized by a high level of similarity. The diagrams were constructed with the use of the computer program MaCzek [11]. In the case of countries whose assignment was ambiguous, the division was made according to the shortest mean Euclidean distance between the inconclusively assigned country and the other countries of its possible groups of affiliation. Each column and row of Czekanowski's diagram corresponds to a particular EU country; the bigger the symbol at the crossing of a column and a row, the more similar the two countries are as regards the shares of electricity production from the various renewable energy sources in the total production of energy from renewable energy carriers. The most similar objects are situated as close as possible to the main diagonal. Owing to the limitation on the number of pages of the article, the Czekanowski diagram presented below corresponds to the most current data, i.e. the data from 2014.
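The normalization (1) and the distance matrix (2) that Czekanowski's diagram is built from can be reproduced with a few lines of code; the sketch below uses a small made-up matrix of shares in place of the real Eurostat data.

```python
# Minimal numpy sketch of zero unitarization (1) and the Euclidean distance
# matrix (2); X is a hypothetical (countries x energy carriers) array of shares.
import numpy as np

X = np.array([[55.0, 20.0,  5.0],     # rows: countries (illustrative values)
              [30.0, 45.0, 10.0],     # columns: e.g. solid biofuels, hydro, wind
              [10.0, 15.0, 60.0]])

# (1) z_ij = (x_ij - min_i x_ij) / (max_i x_ij - min_i x_ij)
Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# (2) d_il = sqrt(sum_j (z_ij - z_lj)^2) for every pair of countries
D = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
print(np.round(D, 3))
# In the Czekanowski diagram, rows and columns are then reordered (e.g. by MaCzek)
# so that small distances, drawn as large symbols, cluster near the main diagonal.
```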
Figure 3. The similarity of the EU countries as regards the structure of production of energy from renewable sources

From the data in Figure 3 it is visible that:
- group I consists of countries (Belgium, Denmark, Greece) characterized by a large share of energy production from solid biofuels (on average about 48% of renewable energy production, which is below the EU mean) and a large share of energy production from wind power (significantly above the EU mean, on average about 17% of energy production from renewable energy sources);
- group II is the most numerous and contains countries (France, Bulgaria, Romania, Sweden, Austria, Croatia, Slovenia, Slovakia) characterized by a large share of energy production from water (significantly above the EU mean, on average about 31% of energy production from renewable energy sources) and a share of energy production from solid biofuels of on average about 51% of renewable energy production, slightly above the EU mean;
- group III contains countries (Portugal, Spain) characterized by a large use of solar energy (significantly above the EU mean, on average about 7% of energy production from renewable sources), wind and water energy (both kinds significantly above the EU mean, on average about 22% of renewable energy production) and geothermal energy (significantly above the EU mean, on average about 2% of energy production from renewable energy sources); the energy obtained from solid biofuels equals on average 36%, which is significantly below the EU mean;
- group IV consists of countries (Latvia, Finland, Lithuania, Poland, Estonia, Hungary) characterized by a large share of energy production from biogases (significantly above the EU mean, on average about 18% of energy production from renewable energy sources) and wind energy (above the EU mean, on average about 13% of renewable energy production); the energy from solid biofuels equals on average 44% of energy production from renewable energy sources, which is below the EU mean;
- group V consists of Ireland, Italy and the Netherlands (Malta, Luxembourg and Cyprus do not form any group), countries which use almost exclusively solid biofuels (significantly above the EU mean, on average about 82% of energy production from renewable energy sources).
4 Chernoff faces as a way of visualizing countries as regards the structure of energy production from renewable energy sources
Figure 4 presents a very interesting way of visualizing the multidimensional data describing the structure of energy production from renewable energy sources, namely Chernoff faces [1]. The similarity or dissimilarity of the multidimensional data is reflected here in the facial expressions of the pictures. The caption of Figure 4 indicates which face elements correspond to particular attributes of the data set.
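Chernoff-style faces can also be drawn by hand; the sketch below is an illustrative re-implementation of part of the Figure 4 legend (face width = solid biofuels, ear level = hydro power, face height = wind power, nose length = biogas, mouth curvature = liquid biofuels) rather than the Statistica icon plot actually used, and the two country profiles are made up.

```python
# Hedged sketch: hand-rolled Chernoff-style faces with matplotlib. Only a subset
# of the Figure 4 feature mapping is implemented and the input values are made up.
import matplotlib.pyplot as plt
from matplotlib.patches import Arc, Ellipse

def draw_face(ax, z, label):
    """z: dict of features, each already normalized to [0, 1]."""
    w = 0.4 + 0.5 * z["solid_biofuels"]               # face width
    h = 0.5 + 0.45 * z["wind"]                        # face height
    ax.add_patch(Ellipse((0.5, 0.5), w, h, fill=False))
    ear_y = 0.4 + 0.25 * z["hydro"]                   # ear level
    for x in (0.5 - w / 2, 0.5 + w / 2):
        ax.add_patch(Ellipse((x, ear_y), 0.07, 0.12, fill=False))
    nose = 0.05 + 0.15 * z["biogas"]                  # nose length
    ax.plot([0.5, 0.5], [0.5 - nose, 0.5], color="black")
    for x in (0.42, 0.58):                            # eyes (kept fixed here)
        ax.add_patch(Ellipse((x, 0.58), 0.07, 0.04, fill=False))
    mouth = 0.02 + 0.15 * z["liquid_biofuels"]        # mouth curvature
    ax.add_patch(Arc((0.5, 0.35), 0.25, mouth, theta1=180, theta2=360))
    ax.set_title(label, fontsize=8)
    ax.set_xlim(0, 1); ax.set_ylim(0, 1)
    ax.set_aspect("equal"); ax.axis("off")

demo = {"Country A": dict(solid_biofuels=0.9, hydro=0.2, wind=0.1, biogas=0.3, liquid_biofuels=0.2),
        "Country B": dict(solid_biofuels=0.3, hydro=0.8, wind=0.7, biogas=0.6, liquid_biofuels=0.5)}
fig, axes = plt.subplots(1, len(demo), figsize=(4, 2))
for ax, (name, z) in zip(axes, demo.items()):
    draw_face(ax, z, name)
plt.show()
```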
(Figure 4 contains one face for each of the 28 EU countries.)
Legend: face width = solid biofuels; ear level = hydro power; half-height of the face = wind power; eccentricity of the upper half of the face = solar thermal; eccentricity of the lower half of the face = geothermal energy; nose length = biogas; mouth curvature = liquid biofuels.
Figure 4. The picture chart in the form of so-called Chernoff faces enabling the analysis of the multidimensional data describing the structure of energy production from renewable energy sources in the EU countries in 2014

It follows from the picture that there are two groups of countries with a similar structure of energy production from renewable sources, as in Czekanowski's diagram. A definite similarity is observed for countries such as Belgium, Denmark, Greece, Portugal and Spain, and for Great Britain and Germany, whereas the Czech Republic is not very similar in its facial picture to the group of countries indicated for it in Czekanowski's diagram. The faces of the countries belonging to the other groups set apart in Figure 3 are similar to each other as well. On the basis of the presented faces it is possible to find effortlessly the countries in the chart that are dissimilar to the others, with a face of completely different size and different expression, i.e. Malta and Ireland in this case. It can therefore be stated that charts of this kind are an appropriate statistical tool which presents, in an easy, interesting and visual way, the diversity of the examined objects as regards a particular phenomenon.
5 Conclusions
The increase in the production of energy from renewable sources becomes a necessity, taking into consideration the significant decrease in natural resources and the increase in environmental pollution.
This applies in particular to the European Union countries, for which the European Parliament and the Council defined, in the relevant Directive, the general objectives for the share of energy from renewable sources in gross final energy consumption in 2020. It follows from the conducted analyses that some of the countries (Croatia, Italy, Finland, Sweden, Estonia, the Czech Republic, Romania, Bulgaria and Lithuania) have already achieved the aims stated in the EU Directive. However, countries such as Ireland (in 2014 the share of energy from renewable sources in gross final energy consumption equalled 13.8%, whereas the aim for 2020 is 18%), the Netherlands (5.5% in 2014 against an aim of 14% for 2020), Luxembourg (4.5% in 2014 against an aim of 11% for 2020) or Malta (4.7% in 2014 against an aim of 10% for 2020) may have problems with fulfilling the requirements, because their share of energy from renewable sources in gross final energy consumption is very low compared with the requirements for 2020. In 2014 the share of energy from renewable sources in gross final energy consumption in Poland equalled 11.45%, an increase of 4.55 percentage points in comparison with 2005; to achieve the required aim it is necessary to gain a further 3.55 percentage points. Denmark, Latvia and Austria are also about 3 percentage points below the European Parliament's aim for the share of energy from renewable sources in gross final energy consumption in 2020. The studies show that the shares of energy from the particular renewable energy carriers (solid biofuels, water, wind, biogas, solar and geothermal energy and others) differ from each other and change over time. The research demonstrated that the research tools used (Czekanowski's diagram and the picture chart in the form of so-called Chernoff faces) can be used successfully to define groups of countries similar to each other as regards the structure of energy production from renewable sources.
References
[1] Chernoff, H.: The use of faces to represent points in k-dimensional space graphically. Journal of the American Statistical Association 68, 324 (1973), 361–368.
[2] Czekanowski, J.: Zur Differentialdiagnose der Neandertalgruppe. Korrespondenz-Blatt der Deutschen Gesellschaft für Anthropologie und Urgeschichte, Braunschweig, (1910), 9–12.
[3] Czekanowski, J.: Zarys metod statystycznych w zastosowaniu do antropologii (Outline of statistical methods in application to anthropology). Prace Naukowego Towarzystwa Warszawskiego (The works of Warsaw Learned Society), no. 5 (1913).
[4] Dyrektywa Parlamentu Europejskiego i Rady nr 2009/28/WE w sprawie promowania stosowania energii ze źródeł odnawialnych zmieniająca i w następstwie uchylająca dyrektywy 2001/77/WE oraz 2003/30/WE (Directive 2009/28/WE of the European Parliament and of the Council on the promotion of the use of energy from renewable sources and amending and subsequently repealing Directives 2001/77/WE and 2003/30/WE), (Act of 5 June 2009), Official Journal L 140/16.
[5] Dziechciarz, J.: Ekonometria. Metody, przykłady, zadania (Econometrics. Methods, examples and tasks). Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu (University of Economics in Wroclaw), Wrocław, (2012).
[6] Heffner, K., Gibas, P.: Analiza ekonomiczno-przestrzenna (Economics and Spatial Analysis). Wydawnictwo Akademii Ekonomicznej w Katowicach (University of Economics in Katowice), Katowice, (2007).
[7] Pociecha, J., Podolec, B., Sokołowski, A., Zając, K.: Metody taksonomiczne w badaniach społeczno-ekonomicznych (Taxonomic methods in social and economic research). PWN, Warszawa, (1988).
[8] Mielecka-Kubień, Z., Warzecha, K.: Sytuacja demograficzno-społeczna wybranych miast województwa śląskiego w latach 2002 i 2009 (The socio-demographic situation of chosen cities in Śląskie Province in the years 2002 and 2009). Zeszyty Naukowe Uniwersytetu Ekonomicznego w Krakowie (Scientific Notebooks of University of Economics in Cracow), no. 923 (2013), 5–21.
[9] Sołtysiak, A., Jaskulski, P.: Czekanowski's diagram. A method of multidimensional clustering. In: New Techniques for Old Times. CAA 98. Computer Applications and Quantitative Methods in Archaeology. Proceedings of the 26th Conference, Barcelona, March 1998. BAR International Series 757, Editors: J. A. Barceló, I. Briz, A. Vila, (1998), 175–184.
[10] European Commission. Eurostat – Share of energy from renewable sources in gross final energy consumption [Online]. Available at: http://ec.europa.eu/eurostat/tgm/mapToolClosed.do?tab=map&init=1&plugin=1&language=en&pcode=t2020_31&toolbox=legend [accessed 28.04.2016].
[11] Description of the program MaCzek [Online]. Available at: http://softadvice.informer.com/ext/www.antropologia.uw.edu.pl/MaCzek%2Fmaczek.html [accessed 28.04.2016].
The evaluation of factors influencing flight delay at popular destinations of Czech tourists
Martina Zámková1, Martin Prokop2
Abstract. The main goal of this paper was the assessment of factors influencing flight delays at selected international airports, namely Antalya, Burgas, Charles de Gaulle, Heraklion, Hurghada, Palma de Mallorca and Rhodes. Data of an airline company whose flights are oriented mainly towards favourite destinations of Czech tourists were used for the purposes of this article. The results of the analysis showed that delays are most frequent at Palma de Mallorca and least frequent at Burgas airport. It was shown that delays caused by the delay of a preceding flight are the most common cause of delay overall. Only at Charles de Gaulle airport was it also found that delays are quite often caused by problems at the airport and in the flight traffic, by unusual events, and by airlines and suppliers. Delayed preceding flights cause mainly longer delays of connecting flights. Contingency tables present an easy way to display the relationship between categorical data. We used Pearson's chi-square test; the null hypothesis of the test assumes independence of the variables. Correspondence analysis aims to reduce the multidimensional space of row and column profiles while preserving the original data information.
Keywords: flight delays, correspondence analysis, contingency table, Pearson chi-squared test
JEL Classification: C30, R40
AMS Classification: 62H17, 62H25
1 Introduction
The main goal of this paper was the assessment of factors influencing flight delays at selected international airports, namely Antalya, Burgas, Charles de Gaulle, Heraklion, Hurghada, Palma de Mallorca and Rhodes. We selected these airports because the airline that we monitor concentrates mainly on these destinations. Based on the dependency tests carried out in contingency tables and the depicted correspondence maps, the dependence of delay on various factors was examined. Recommendations were subsequently designed on how to eliminate the causes of delay and how to mitigate, among other things, the financial implications. The article [2] deals with similar problems; in particular, it analyses the dependence of flight delays on the destination airport. More detail on strategies for the reduction of delays is offered in the article [4], for example setting scheduled times for the completion of a process, increasing the number of service counters, and priority service for urgent flights. The economics, including possible optimization, is analysed for example in the article [5]. The article [6] analyses a model describing the optimization of subsequent flights in order to prevent the spreading and chaining of delays. A similar analysis of factors influencing delays at Czech international airports was carried out in the article [7]. The article [8] focuses also on the financial implications for airlines.
2 Methods and Materials
Contingency tables present an easy way to display the relationship between categorical data. We used Pearson's chi-square test; the null hypothesis of the test assumes independence of the variables. The condition that at most 20% of the expected frequencies are less than five must be met; see [1]. Correspondence analysis, which was used for this study, is a multivariate statistical technique that allows the display and summary of a set of data in two-dimensional graphic form. This analysis aims to reduce the multidimensional space of row and column profiles while preserving the original data information as much as possible; see [3]. Unistat and Statistica software was used for primary data processing. The primary data were examined for the busiest season of the selected airline company (1. 6. 2013 – 30. 9. 2013). The data contained information about the length of delay and the cause of delay according to IATA3 codes, which were adjusted by the airline company according to its needs.
1 College of Polytechnics Jihlava, Department of Mathematics, Tolstého 16, 58601 Jihlava, martina.zamkova@vspj.cz.
2 College of Polytechnics Jihlava, Department of Mathematics, Tolstého 16, 58601 Jihlava, Martin.Prokop@vspj.cz.
3 IATA – International Air Transport Association
A substantial part of the data is categorical or suitable for categorization. The data were gathered from an internal database of the monitored airline.
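As an illustration of the testing procedure, the sketch below runs Pearson's chi-square test of independence on a small contingency table of counts; the counts are approximate values implied by the totals in Table 1 and the row percentages in Table 2 for three of the airports, not the exact figures from the airline's database.

```python
# Minimal sketch of the Pearson chi-square test of independence on a
# daytime x airport table; counts are approximate, reconstructed from the
# reported totals and row percentages (AYT, BOJ, CDG only).
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[43, 146,  94, 104],   # AYT: delayed flights per daytime band
                     [18,  83,  60,  16],   # BOJ
                     [62,  23, 104,  25]])  # CDG

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
# Validity check used in the paper: at most 20% of expected counts below five.
print((expected < 5).mean() <= 0.20)
```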
3 Research results
It is evident from the frequency table (see Tab. 1) that delays occur most often at Palma de Mallorca airport, where more than one half of departures are delayed. It is a tourist destination where the intensity of traffic is highest in the monitored period and the airport is very busy. Over 30% of departures are delayed at the Charles de Gaulle, Heraklion, Hurghada and Rhodes airports. Charles de Gaulle is a big international transit airport with intensive year-round traffic; Heraklion, Hurghada and Rhodes are smaller airports with only one runway and one terminal, so delays occur there as a consequence of insufficient capacity. Burgas airport achieved the best result (approximately 23% of departures delayed). This result is probably caused by the fact that, due to the geographical location, the high season in this destination lasts only from July to August; the BOJ airport is minimally utilized in June and September and delays almost do not occur there.
Destination          Country    CODE   Delayed flights   All flights   Percentage
Antalya              Turkey     AYT    387               1324          29.23%
Burgas               Bulgaria   BOJ    177               785           22.55%
Charles de Gaulle    France     CDG    214               516           41.47%
Heraklion            Greece     HER    223               666           33.48%
Hurghada             Egypt      HRG    155               358           43.30%
Palma de Mallorca    Spain      PMI    228               412           55.34%
Rhodes               Greece     RHO    192               520           36.92%
Table 1 Relative frequencies – Selected airports and delayed flights

It is clear from the row relative frequencies table (see Tab. 2) that the biggest problems with delays at night are in Hurghada, where more than half of the flights are delayed. The situation is also worse at Charles de Gaulle airport, where almost 30% of flights are delayed at night. On the contrary, at the Heraklion, Palma de Mallorca and Rhodes airports fewer than 5% of departures are delayed at night. During the morning the differences are less significant; the situation is worse at the Burgas, Heraklion and Antalya airports (around 40%), while at the Rhodes, Hurghada and Charles de Gaulle airports it is only slightly over 10%. In the afternoon the differences are even smaller: the worst situation is at the Burgas, Heraklion and Charles de Gaulle airports (always over 30%), and at the rest of the airports the ratio of delayed departures is 20–30%. In the evening the differences are significant again: at the Rhodes and Palma de Mallorca airports the rate of delayed flights is more than 50%, at the Heraklion and Antalya airports approximately one quarter of departures are delayed, and at the Burgas, Hurghada and Charles de Gaulle airports the ratio of delayed departures is only around 10%. The dependency is statistically significant; the p-value is less than 0.0001.
Row relative frequencies   00:01-06:00   06:01-12:00   12:01-18:00   18:01-24:00
AYT                        11.11%        37.73%        24.29%        26.87%
BOJ                        10.17%        46.89%        33.90%        9.04%
CDG                        28.97%        10.75%        48.60%        11.68%
HER                        2.69%         37.67%        32.74%        26.91%
HRG                        54.84%        12.26%        20.00%        12.90%
RHO                        0.52%         11.98%        26.04%        61.46%
PMI                        4.39%         20.18%        23.25%        52.19%
Table 2 Contingency table – Daytime and airport

From the correspondence map (see Fig. 1) it is evident that delayed departures at night are very frequent at Hurghada airport, the most frequent delays in the evening occur at the Rhodes and Palma de Mallorca airports, and delays at Burgas airport occur most often in the morning. It is furthermore evident from the row relative frequencies table (Tab. 3) that delays caused by problems at the airport and in the air traffic, as well as those caused by unusual events (transport of handicapped passengers and necessary healthcare for passengers with sudden health problems), are the most frequent at Charles de Gaulle airport (approximately 21% of delayed flights). It is the busiest airport, so more unusual events can occur there. On the contrary, these causes of delay occur least frequently at the Heraklion, Burgas, Hurghada, Antalya and Rhodes airports (up to 6% of delayed flights). The best situation of all is at Burgas airport (only 2% of delayed flights).
It is the least busy airport, with minimal problems in the air traffic and minimal unusual events. Delays on the airline's side (technical maintenance, defects, crew standards) and on the suppliers' side (handling, supply of fuel, catering) are again the most frequent at Charles de Gaulle airport (27% of delayed departures). These causes occur very rarely at the other airports (up to 6% of delayed departures), and at the Heraklion, Hurghada and Rhodes airports they account for even less than 1% of delayed departures. Aircraft have very short downtimes at these airports, technical problems are solved only in necessary cases and therefore mostly do not cause delays; similarly, catering is replenished mostly at the home base airports. Delay caused by the delay of a preceding flight is the most common cause of delays overall. Such delays occur most often at the Antalya, Heraklion, Palma de Mallorca, Burgas, Hurghada and Rhodes airports, where they represent approximately 90% of all delays. Charter flights, which have a higher probability of delays in total, fly mostly to these holiday destinations, and that is why chaining of delays originates there. This cause is less common only at Charles de Gaulle airport, where it accounts for 53% of all delayed flights; mostly regular flights, which are less often delayed, fly to this airport, and the chaining does not originate. The dependence is statistically significant; the p-value is less than 0.0001. A similar result is also visible in the correspondence map (see Fig. 2), where all airports are concentrated around delay caused by a preceding delay, except the Charles de Gaulle and Palma de Mallorca airports, which lie closer to the other causes of delay.
Figure 1 Correspondence map – Daytime and airport (2D plot of row and column coordinates; dimension 1: 63.11% of inertia, dimension 2: 28.25% of inertia)

Figure 2 Correspondence map – Causes of delay and airport (2D plot of row and column coordinates; dimension 1: 94.46% of inertia, dimension 2: 5.54% of inertia)
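The contingency-table computations above can be reproduced, for instance, with standard scientific Python tooling. The sketch below is only illustrative: the absolute counts are reconstructed so as to be consistent with the row percentages in Tab. 2 (the paper does not report the absolute frequencies), the chi-square test of independence comes from scipy, and the correspondence-map coordinates are obtained from an SVD of the standardized residuals, which is the textbook construction behind Figs. 1 and 2.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Counts of delayed departures by airport and daytime, reconstructed from the
# row percentages in Tab. 2; treat them as illustrative placeholders.
airports = ["AYT", "BOJ", "CDG", "HER", "HRG", "RHO", "PMI"]
counts = np.array([
    [43, 146,  94, 104],
    [18,  83,  60,  16],
    [62,  23, 104,  25],
    [ 6,  84,  73,  60],
    [85,  19,  31,  20],
    [ 1,  23,  50, 118],
    [10,  46,  53, 119],
])

# Row relative frequencies (as in Tab. 2) and the chi-square test of independence.
row_rel = counts / counts.sum(axis=1, keepdims=True)
for name, row in zip(airports, row_rel):
    print(name, np.round(100 * row, 2))
chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p-value = {p_value:.2g}")

# Simple correspondence analysis: SVD of the matrix of standardized residuals.
P = counts / counts.sum()                       # correspondence matrix
r, c = P.sum(axis=1), P.sum(axis=0)             # row and column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S, full_matrices=False)
row_coords = (U * sv) / np.sqrt(r)[:, None]     # principal row coordinates
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]  # principal column coordinates
inertia_share = sv**2 / (sv**2).sum()           # share of inertia per dimension
print("inertia shares:", np.round(inertia_share, 3))
```

The printed inertia shares of the two leading dimensions can be compared with the percentages quoted in the axis labels of the correspondence maps above.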
Row relative frequencies
        Airport, air traffic control and    Airline and related    Delayed previous
        extraordinary events                services               flight
AYT              5.17%                           3.88%                 90.96%
HER              5.83%                           0.90%                 93.27%
CDG             20.56%                          26.64%                 52.80%
PMI             12.28%                           3.07%                 84.65%
BOJ              2.26%                           6.21%                 91.53%
HRG              5.16%                           0.65%                 94.19%
RHO              3.65%                           0.52%                 95.83%
Table 3 Contingency table – Causes of delay and airport

Regarding the dependence of the length of delay on the particular airport, the differences are not as large as in the previous cases, see Tab. 4. With respect to the length of delay, the best situation is at Heraklion and Charles de Gaulle airports, where long delays over 1.5 hours occur rarely (up to 7% of all delayed flights); on the contrary, short delays up to 30 minutes prevail at these airports (around 50% of all delays). Given the capacity and the number of departure runways at CDG airport, mostly short delays originate there even though the intensity of traffic is high, because there is more room for manoeuvre. The dependence is statistically significant, with a p-value of 0.0043. It is evident from the table of row relative frequencies (see Tab. 5) that delays caused by problems at the airport and in air traffic, as well as by unusual events (transport of handicapped passengers and necessary healthcare for passengers with sudden health problems), are most frequent in the case of short delays up to 30 minutes (11% of delayed flights). With increasing length of delay these causes occur less often (with the exception of the longest delays over 2 hours). Delays on the airline's side (technical maintenance, defects, crew standards) and on the suppliers' side (handling, supply of fuel, catering) are again most frequent in the case of short delays, and again the share of these causes decreases with increasing length of delay, with the exception of the longest delays (over 2 hours). On the contrary, the share of delays caused by the delay of a preceding flight mostly increases slightly with increasing length of delay. Delayed preceding flights cause mostly longer delays of connecting flights, whereas operational and technical reasons cause rather shorter delays, probably due to possible sanctions for suppliers and the duty to pay out compensation to passengers. The dependence is statistically significant; the p-value is less than 0.0001.

Row relative frequencies
         00:15-0:30   00:31-1:00   01:01-1:30   01:31-2:00   02:01 and more
AYT        46.51%       24.29%       11.89%        6.72%        10.59%
BOJ        41.24%       24.29%       12.43%       11.86%        10.17%
CDG        52.80%       25.70%        9.35%        7.01%         5.14%
HER        49.33%       30.04%       11.66%        4.48%         4.48%
HRG        39.35%       26.45%       14.84%        9.03%        10.32%
PMI        34.21%       33.33%       14.91%        7.02%        10.53%
RHO        41.67%       31.77%        8.85%       10.42%         7.29%
Table 4 Contingency table – Length of delay and airport
Row relative frequencies
                 Airport, air traffic control and    Airline and related    Delayed
                 extraordinary events                services               previous flight
00:15-0:30                11.08%                          8.63%                 80.29%
00:31-1:00                 6.41%                          3.89%                 89.70%
01:01-1:30                 3.72%                          3.72%                 92.55%
01:31-2:00                 0.82%                          2.46%                 96.72%
02:01 and more             2.24%                          5.22%                 92.54%
Table 5 Contingency table – Cause of delay and length of delay

During the monitored time period, from June to September, there are only minimal differences concerning the causes of delay, as is evident from the table of row relative frequencies (see Tab. 6). Statistical dependence was also not proven (p-value of 0.96).

Row relative frequencies
              Airport, air traffic control and    Airline and related    Delayed
              extraordinary events                services               previous flight
June                    7.36%                          5.94%                 86.70%
July                    7.46%                          5.04%                 87.50%
August                  6.83%                          6.83%                 86.34%
September               7.61%                          6.23%                 86.16%
Table 6 Contingency table – Cause of delay and selected time period

It is then apparent from the table of row relative frequencies (see Tab. 7) that the differences between particular airports are not so significant during the monitored time period. The best situation in June is at Burgas airport (only 17% of all delayed flights); the highest share of delayed flights in June is at Antalya, Heraklion and Hurghada airports (over 30% of all delayed flights). In July the worst situation is at Burgas and Hurghada airports (around 40% of all delayed flights), and in August at Burgas and Charles de Gaulle airports (over 30% of all delayed flights). The situation is best in September overall; the worst situation is then at Antalya, Charles de Gaulle, Palma de Mallorca and Rhodes airports (over 20% of delayed flights), while the best is at Hurghada (only 7% of all delayed departures). Overall, it can be stated that at Burgas airport delays originate most often during the high season (July, August). Unlike at the other airports, the high season there lasts only from July to August, and delays hardly occur in the other months due to low traffic intensity. On the contrary, at Hurghada airport the bigger problems with delays occur in June and July. At Heraklion airport the situation improves significantly in September, when the intensity of tourism decreases. Such pronounced differences do not occur at the remaining airports. The dependence is statistically significant; the p-value is less than 0.0001. From the correspondence analysis (see Fig. 3) it is also evident that delays are frequent at Antalya and Heraklion airports in June, and at Charles de Gaulle airport in August. Burgas airport is closest to July and August, and in these months its delays are indeed most frequent according to the frequency table.
Row relative frequencies
         June      July      August    September
AYT      32.30%    23.51%    20.93%     23.26%
BOJ      16.95%    40.68%    31.64%     10.73%
CDG      23.36%    24.30%    32.71%     19.63%
HER      30.04%    27.35%    27.35%     15.25%
HRG      30.32%    40.00%    22.58%      7.10%
PMI      22.37%    29.39%    26.32%     21.93%
RHO      26.56%    26.56%    24.48%     22.40%
Table 7 Contingency table – Airports and monitored time period

Figure 3 Correspondence map – Monitored time period and airport (2D plot of row and column coordinates; dimension 1: 67.30% of inertia, dimension 2: 25.52% of inertia)

Figure 4 Correspondence map – Causes of delay and day time (2D plot of row and column coordinates; dimension 1: 96.94% of inertia, dimension 2: 3.06% of inertia)
It is evident from the table of relative frequencies (see Tab. 8) that delays caused by problems at the airport and in air traffic, as well as by unusual events (transport of handicapped passengers and necessary healthcare for passengers with sudden health problems), are most frequent at night and in the morning (around 10% of all delayed flights). The number of delays on the airline's side (technical maintenance, defects, crew standards) and on the suppliers' side (handling, supply of fuel, catering) is highest at night (20% of all delayed flights); otherwise these causes are very rare (up to 5% of all delays). Delays caused by the delay of a preceding flight are least frequent at night. There are only a few flights at night in general and the air space is unoccupied, which is why chaining of delays does not originate. In the morning the traffic is generally busy, therefore delays often occur because of problems at the airport and difficulties in air traffic management. Delays caused by suppliers and service companies occur often at night, probably due to a lack of night-shift employees. The dependence is statistically significant; the p-value is less than 0.0001. The correspondence analysis (see Fig. 4) confirms that delays on the airline's and suppliers' side are most frequent at night, while delays caused by the delay of preceding flights are closest to the afternoon and evening times.

Row relative frequencies
              Airport, air traffic control and    Airline and related    Delayed
              extraordinary events                services               previous flight
00:01-6:00              10.22%                         20.44%                 69.33%
06:01-12:00              8.96%                          4.25%                 86.79%
12:01-18:00              6.02%                          4.73%                 89.25%
18:01-24:00              5.63%                          1.73%                 92.64%
Table 8 Contingency table – Day time and causes of delay
4 Conclusions and discussion
Factors influencing the delay of flights at selected international airports were examined in this article. These factors were identified by contingency table analysis carried out on the primary data, together with the airline's database of the various causes of delay. The analysis showed that delays occur most frequently at Palma de Mallorca airport. It is a tourist destination where the intensity of traffic is highest in the monitored season and the airport is very busy. Delay caused by the delay of the preceding flight is the most frequent cause of delay overall. This cause occurs mostly at Antalya, Heraklion, Palma de Mallorca, Burgas, Hurghada and Rhodes airports, where it accounts for approximately 90% of all delays; delays caused by other reasons occur at all airports only minimally. Charter flights mostly fly to holiday destinations, whereas mostly regular flights fly to Charles de Gaulle airport, where delays are less common and chaining of delays does not originate; regular flights always take precedence over charter flights. Concerning the length of delay, the best situation is at Heraklion and Charles de Gaulle airports, where long delays over 1.5 hours occur rarely. Given the capacity and the number of departure runways at CDG airport, mostly short delays originate there despite the high traffic intensity, thanks to the greater room for manoeuvre. Delayed preceding flights usually cause longer delays of connecting flights, whereas operational and technical reasons cause rather short delays, probably because of penalties for suppliers and the necessity of paying out compensation to passengers. Delays caused by the delay of preceding flights are least frequent at night, when the number of departures is low and the air space is unoccupied, so chaining of delays does not originate. In the morning the traffic is generally busy, therefore delays often occur because of problems at the airport and difficulties in air traffic management. In [2], the dependence of delay on the destination aerodrome was proven, and in our analysis a significant dependence on the destination airport was also proven. The article [4] similarly offers different variants of strategies for the reduction of delays, but from another angle; the author recommends, for example, increasing the number of service counters during check-in. However, delays caused by passengers and their baggage occurred very rarely at the airports we monitored, except at the international transit airport CDG. Similarly to [5], the airline we monitor also tries to optimize its operating costs. In [6], the authors describe the optimization of connecting flights in order to prevent chaining of delays; our research showed that the most frequent cause of delay was precisely the delay of the preceding flight, approximately 90% of all delayed flights. In [8] it was also found that delays caused by the delay of a preceding flight are very frequent in general; they are much more frequent in Brno and Ostrava, around 80%, than in Prague, around 41%. Whereas in [7] the dependence of the length of delay on the particular airport in the Czech Republic was not proven, the dependence of the length of delay on the selected destination airports found here is statistically significant. It is necessary for the airline to eliminate longer delays, because paying out compensation to passengers is very expensive; there are nowadays specialised companies that search for passengers from delayed flights and recover financial compensation for them through legal channels. In order to eliminate delays, we would recommend hiring an alternate aircraft in the summer months to have the opportunity to replace an aircraft delayed on arrival at the home-base airport.
We would furthermore suggest that suppliers and service companies hire a certain number of seasonal workers for these months. This is beneficial for these companies as well, because they can also be penalised for delays caused by their employees.
References
[1] Agresti, A.: Categorical Data Analysis. John Wiley & Sons, New York, 1990.
[2] Fleurquin, P., Ramasco, J. J., and Eguiluz, V. M.: Characterization of Delay Propagation in the US Air Transportation Network. Transportation Journal 53 (2014), 330–344.
[3] Hebák, P. et al.: Vícerozměrné statistické metody 3. Informatorium, Praha, 2007.
[4] Hsu, C. I., Chao, C. C., and Hsu, N. W.: Control strategies for departure process delays at airport passenger terminals. Transportation Planning and Technology 38 (2015), 214–237.
[5] Pan, W., Wang, Z., and Zhou, G.: Flight Delay Cost Optimization based on Ground Power Supply. In: Proceedings of the 2015 International Informatics and Computer Engineering Conference. Atlantis Press, China, 2015, 336–339.
[6] Xing, Z., Yu, J., and Lu, H.: Optimal Control of Flight Delays Allocation at Airport. In: Proceedings of the International Conference on Advances in Materials Science and Information Technologies in Industry. Trans Tech Publications, China, 2014, 513–517.
[7] Zámková, M., and Prokop, M.: The Evaluation of Factors Influencing Flights Delay at Czech International Airports. Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis 63 (2015), 2187–2196.
[8] Zámková, M., and Prokop, M.: Příčiny a finanční důsledky zpoždění letů v České republice. Acta academica karviniensia 3 (2015), 110–120.
Decision of a Steel Company Trading with Emissions

František Zapletal1, Martin Šmíd2

Abstract. We formulate a Mean-CVaR decision problem of a production company obliged to cover its CO2 emissions by allowances. A certain amount of the allowances is given to the company for free; the missing/redundant ones have to be bought/sold on a market. To manage its risk, the company can use derivatives on emissions allowances (in particular futures and options), in addition to spot allowances. We solve the decision problem for the case of a real-life Czech steel company for different levels of risk aversion and different scenarios of the demand. We show that the necessity of emissions trading in general, and the risk caused by the trading in particular, can influence the production significantly even when the risk is decreased by means of derivatives. The results of the study show that even for low levels of risk aversion, it is optimal to use futures on allowances in order to reduce the risk caused by the emissions trading.

Keywords: CVaR, emissions trading, optimization, allowances, EU ETS

JEL classification: C61
AMS classification: 90C15
1 Introduction and the current state of the art
Since its launch, the European emissions trading system (EU ETS) has become an object of interest of both researchers and managers of the affected companies. The system began to work in 2005 with the purpose of decreasing the amount of CO2 emissions by forcing EU industrial companies to cover their emissions by emissions allowances (1 ton of CO2 must be covered by 1 emissions allowance). Our paper investigates the influence of emissions trading on the profit and production portfolio plan of a steel company participating in the EU ETS. The aim of the paper is to assess the impact of the EU ETS system on the company and, also, to determine the optimal emissions management for a single one-year-long period. Because two types of emissions allowances exist (European Union Allowances EUA, and Certified Emissions Reductions CER) and, moreover, derivatives on EUAs exist (options and futures), our goal is to determine which allowance type and which financial instruments are optimal to use. The model is applied using data of one Czech steel company. Several models optimizing the output of a company with respect to the duty of emissions trading have already been published. In [4], deterministic models have been presented. In [9], a stochastic model has been built, but without involving any risk measure. This paper follows [8], where a single-stage single-period mean-variance model has been presented. The present contribution uses a different risk measure - Conditional Value at Risk (CVaR). In particular, a mean-CVaR model is formulated. Unlike the variance, the CVaR measure belongs to the group of coherent risk measures; moreover, it takes into account possible fat tails and an asymmetry of a distribution, see e.g. [1]. CVaR has been used in both static ([6]) and dynamic (multi-stage) models, see e.g. [5]. The only application of CVaR in relation to the EU ETS system, however, has been presented by [3]; that paper is aimed at determining the dynamic relationship between electricity prices and prices of emissions allowances. The core of this paper consists in the formulation of the optimization model and its verification and sensitivity analysis at the end. In order to satisfy the requirements concerning the paper's length, some details about the conditions of the EU ETS have to be skipped; they can, however, be found e.g. in [7] or [2].

1 VSB - Technical University of Ostrava, Faculty of Economics, Sokolská 33, 702 00 Ostrava, Czech Republic, frantisek.zapletal@vsb.cz
2 The Institute of Information Theory and Automation of the CAS, Pod Vodárenskou věží 4, 182 08 Prague, Czech Republic, smid@utia.cas.cz
The paper is organized as follows. After the introduction, the model is designed in Section 2. Section 3 presents the results of the model verification and the sensitivity analysis. The last part of the paper is the conclusion, where the main findings are pointed out and some proposals for further research are given.
2 The optimization model
In this section, the model assumptions are given and the Mean-CVaR optimization model is designed.

Assumptions

Production
• There is a single (unit) decision period.
• There are n products produced.
• Demand for the products d ∈ R^n_+, selling prices p ∈ R^n_+ and the unit production costs (of final products) c ∈ R^n_+ are known at the beginning of the period.
• Raw production x which is necessary for final production y is given by x = T y, where T ∈ R^{n×n} is an inverted technological matrix.
• The final production is non-negative and the limits of the raw production are given by a vector w ≥ 0, i.e.,

  y ≥ 0,  T y ≤ w.   (P)

Emissions
• The vector of CO2 emissions resulting from raw production x is given by h^T x, where h is a vector.

Finance
• Selling prices are collected at t = 1.
• The production costs are funded by a credit with (low) interest rate ι.
• Other costs are funded by loans payable at t = 1 with (high) interest rate ρ > ι.
• If there is excess cash at t = 0, it may be deposited up to t = 1 with interest ι.
• Insufficiency of a unit of cash at t = 1 is penalized by constant σ (perhaps a prohibitive interest rate).

Emissions trading
• r EUA permits are obtained for free.
• At t = 0, the company may buy s_0^E EUA spots and s_0^C CER spots, f futures, φ_1, φ_2, ..., φ_k call EUA options with strike prices K_1 < K_2 < ... < K_k, respectively, and/or ψ_1, ψ_2, ..., ψ_l put EUA options with strike prices L_1 > L_2 > ... > L_l, respectively.
• Short sales are not allowed, i.e.

  s_0^E ≥ −r,  s_0^C ≥ 0,  f ≥ 0,  φ ≥ 0,  ψ ≥ 0.   (F)

• A relative margin ζ is required when holding a future; for simplicity we assume that margin calls can occur only once a month and that the margin cannot be decreased (for more on futures margins, see e.g. http://www.investopedia.com/university/futures/futures4.asp).
• At t = 1, the company may buy s_1^E EUA spots and s_1^C CER spots; short sales are not allowed:

  s_0^C + s_1^C ≥ 0,  r + s_0^E + s_1^E ≥ 0.   (G)

• Banking of permits is not allowed, i.e.

  r + s_0^E + s_0^C + s_1^E + s_1^C + f + Σ_{i=1}^{k} φ_i − Σ_{i=1}^{l} ψ_i = h^T y.   (E)

• Only a limited number of CERs may be applied, in particular,

  s_0^C + s_1^C ≤ η h^T y,  where 0 ≤ η < 1.   (C)

• The company does not speculate: in particular, it would neither buy more spots than needed nor buy more put options than the initial number of spots, i.e.,

  s_0^E ≤ h^T y,  s_0^C ≤ η h^T y,  Σ_i ψ_i ≤ r   (S)

(an alternative to (S) would be an assumption of no arbitrage).

Summarized, the vector of the decision variables is (y, ξ), where

  ξ = (s_0^E, s_0^C, s_1^E, s_1^C, f, φ_1, ..., φ_k, ψ_1, ..., ψ_l)

(note, however, that the components of ξ are dependent due to (E)). The gross balance from financial operations at t = 0, excluding margins of futures, thus equals

  E_0 = −s_0^E p_0^E − s_0^C p_0^C − Σ_{i=1}^{k} φ_i p_0^{c,K_i} − Σ_{i=1}^{l} ψ_i p_0^{p,L_i},

where p_0^E and p_0^C are the EUA and CER prices at t = 0, respectively, and p_0^{c,K_i}, p_0^{p,L_i} are the prices at t = 0 of the call option with strike price K_i and of the put option with strike price L_i, respectively.

The costs of futures maintenance amount to f M, where

  M = Σ_{j=0}^{11} ρ_{12−j} [M_j − M_{j−1}]_+ ,   M_j = max[ M_{j−1}, p_0^f − (1 − ζ) P_{j/12}^f ],  j = 0, 1, ..., 11.

Here, ρ_j = (1 + ρ)^{j/12} − 1 is the j/12 time units interest rate, p_0^f is the future price at t = 0 and M_{−1} = 0 by definition. The cash balance at t = 1 resulting from production is

  B = p^T min(d, y) − (1 + ι) c^T y,

while the balance resulting from financial operations at t = 1 is

  E_1 = −s_1^E P_1^E − s_1^C P_1^C − f p_0^f − Σ_{i=1}^{k} φ_i min(P_1^E, K_i) + Σ_{i=1}^{l} ψ_i max(P_1^E, L_i).

Summarized, the value function is given by

  V(y, ξ) = V(y, ξ, P_1^E, P_1^C, M_ρ) = g_{1+σ,1}( g_{1+ρ,1+ι}(E_0) − M f + E_1 + B ),   where  g_{α,β}(x) = βx if x ≥ 0, αx if x ≤ 0.

Consequently, the decision problem is formulated as

  max_{y,ξ}  (1 − λ) E V(y, ξ) − λ CVaR_α(−V(y, ξ))   s.t.  (P), (F), (G), (E), (C), (S),   (1)

where 0 ≤ λ ≤ 1 is a level of risk aversion.
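As a minimal sketch of how the scenario-based objective in (1) could be evaluated for one fixed decision (y, ξ), the following Python fragment computes CVaR at the 5% level from equally probable scenario values of V and combines it with the expected value; the scenario values themselves are made-up placeholders, and the actual model is solved as an optimization problem (the authors report using GAMS in Section 3.2).

```python
import numpy as np

def cvar(losses: np.ndarray, alpha: float = 0.05) -> float:
    """Conditional Value at Risk: expected loss in the worst alpha-tail
    of equally probable scenarios (Rockafellar-Uryasev definition)."""
    var = np.quantile(losses, 1.0 - alpha)      # Value at Risk at level alpha
    return losses[losses >= var].mean()

def mean_cvar_objective(values: np.ndarray, lam: float, alpha: float = 0.05) -> float:
    """Objective (1): (1 - lambda) * E[V] - lambda * CVaR_alpha(-V),
    where `values` are scenario realisations of the value function V(y, xi)."""
    return (1.0 - lam) * values.mean() - lam * cvar(-values, alpha)

# Hypothetical scenario values of V for one fixed decision (y, xi):
rng = np.random.default_rng(0)
V = rng.lognormal(mean=10.0, sigma=0.3, size=1000) - 20000.0
for lam in (0.0, 0.05, 0.5, 1.0):
    print(lam, round(mean_cvar_objective(V, lam), 1))
```

In the full problem, y and ξ are decision variables and the maximization subject to (P), (F), (G), (E), (C), (S) can be written as a linear program via the Rockafellar-Uryasev reformulation of CVaR.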
3 Application of the model

3.1 Data
The following data sources were used for the verification of the model:
• database of the ICE stock exchange for data on emissions prices and related financial indices (see theice.com);
• business data of a certain Czech steel company for values related to production (i.e. margins, costs, production coefficients); these data have been slightly modified;
• the Carbon Market Data database for the amount of emissions and allocated allowances (see carbonmarketdata.com);
• the database of the Czech Steel Federation for historical data on the steel market.

Due to the limited length of the contribution, it is not possible to present the complete set of input data. The same input values as in [8] have been used for the corresponding parameters (e.g. data on demand or production), with the difference that a longer period of permit prices was used. Prices of EUA and CER spots at the beginning of the period are p_0^e = 5 EUR and p_0^c = 0.38 EUR. For the sake of simplicity, only 5 call and put options with different strike prices (K = L = {3, 4, 5, 6, 7} EUR) have been involved. Prices of those options are shown in Tab. 1.

Type of option / strike price      3      4      5      6      7
Call option                       2.31   1.95   1.12   0.77   0.53
Put option                        0.15   0.56   1.06   1.71   2.48

Table 1 Prices of EUA options with different strike prices [EUR] (source: theice.com)

The distribution of the emissions prices at the end of the period was assumed to be log-normal; in the actual computation, more than 1000 scenarios were used with the following mean values: p_1^e ≈ 5.29 EUR, p_1^c ≈ 0.43 EUR and M ≈ 0.78 EUR. Unfortunately, no precise data on the interest rates ι, ρ, σ were available. Therefore, their values are just estimated to be meaningful and to satisfy the assumptions listed in Section 2. In particular, ι = 0.01, ρ = 0.1, σ = 1 were set (essentially, the σ rate should be prohibitive enough). For future research, those values will be surveyed and set more accurately. It is reasonable to expect that the portfolio of financial instruments on allowances will depend on whether the company needs to purchase some additional allowances or whether the amount of EUAs received for free is sufficient to cover all CO2 emissions. That is why three different scenarios of the demand for production are considered. The first one, S1, corresponds to the current level (i.e. the level of 2014), when the company has a "slight" shortage of allowances. The second scenario, S2, counts with the values of demand corresponding to the production capacities of the company (i.e. the state when the company needs the maximum number of additional allowances). And the last scenario, S3, supposes demand at the level of 50% of production capacities (in this case, the company has a surplus of permits). The last parameter whose value is left to be set is the risk aversion parameter λ. In order to avoid setting a particular range of values, the model will be run with all possible values, i.e. for λ ∈ [0, 1]. CVaR will be computed at the 5% level of α.

3.2 Results
The model is solved using the GAMS software. Similarly to other studies mentioned in the introduction (e.g. [8]), emissions trading does not influence the amount of production. Moreover, the amount of production of all products is equal to the considered levels of demand, regardless of the degree of risk aversion. Thus, the only matter to analyze is the optimal portfolio of financial instruments for emissions allowances. EUA options are not traded in any of the three considered scenarios under any value of λ. All the remaining variables of the allowances type differ with the scenarios and the value of the risk-aversion coefficient.
Figure 1 EUAs traded at the beginning and end of the period: (a) p_0^e, (b) p_1^e

Most of the allowances are traded in the form of EUA spots. The optimal numbers of EUA spots traded at the beginning and at the end of the period are shown in Fig. 1a and Fig. 1b, respectively. It can be seen that the company would sell all the allowances obtained for free (i.e. the value of r) in all three scenarios for very low values of risk aversion (for λ from 0 to around 0.02). The reason is that the mean value of the EUA price at the end of the period is slightly higher than the current (certain) value. With values of λ greater than 0.02, the company prefers the higher certainty given by the current price of EUA and it sells the EUA spots only when no additional EUAs have to be bought (i.e. S3). At the end of the period, for small values of λ, all EUAs needed for the whole period are bought; when the risk aversion increases, the optimal values drop to zero (recall the expected increase in the EUA price by the end of the period). Due to the very low prices of CERs in comparison with EUAs, the highest possible amount of CERs (given by η = 10%) is always used by the company. The higher the value of λ, the more CERs are bought at the beginning of the period and the less at its end, see Fig. 2a and Fig. 2b. Out of all the EUA derivatives, only the futures are possibly used by the company. The optimal values of f when changing the λ parameter are depicted in Fig. 2c. Under the scenario S3, no futures should be purchased at all; there is no reason for it because no additional EUAs are needed due to the low level of demand. The results differ when the number of allowances allocated to the company for free (r) is not sufficient to cover all CO2 emissions. When assuming a lack of allowances (S1 and S2), no futures should still be bought for λ < 0.046. But, for stronger risk aversion, futures are used to decrease the risk related to emissions trading. In scenario S1, around 15% of all allowances needed should be purchased in the form of futures; meanwhile, under scenario S2, the share of the futures goes even up to one third.
4 Conclusions

This paper was devoted to the effects of emissions trading within the EU ETS system on steel companies. A single-stage mean-CVaR optimization model has been formulated. The model has been applied using the data of a real steel company. For the sake of a better understanding of the modeled system, different scenarios of demand have been considered and all possible values of the decision-maker's risk aversion have been investigated. In line with past studies, emissions trading does not affect the amount of production. The emphasis has been put on the analysis of the optimal portfolio of financial instruments on emissions allowances. EUA options (both call and put) have proven inefficient to use. The optimal strategy regarding EUA spots differs with the value of the risk-aversion coefficient. The company would always use the 10% limit for CER allowances. When the number of allowances allocated to the company for free is not sufficient, EUA futures should be used to decrease the risk. Two ways of extending the model in the future are expected. The first lies in the improvement of the presented model (e.g. the data on interest rates could be further investigated and other sensitivity analyses could be performed) and the second one consists in the extension to a multi-stage (dynamic) version.
Figure 2 CERs traded at the beginning and end of the period and the amount of futures purchased: (a) p_0^c, (b) p_1^e, (c) f
Acknowledgements
This paper was supported by grant No. GA 16-01298S of the Czech Science Foundation, SGS project No. SP2016/116 and the Operational Programme Education for Competitiveness Project CZ.1.07/2.3.00/20.0296. The support is gratefully acknowledged.
References
[1] Artzner, P. et al.: Coherent measures of risk. Mathematical Finance 9 (1999), 203–228.
[2] European Commission: Directive 2003/87/EC of the European Parliament and of the Council of 13 October 2003. European Union 46 (2003), 32–46.
[3] Fell, H.: EU-ETS and Nordic Electricity: A CVAR Analysis. The Energy Journal 31(2) (2010), 219–239.
[4] Letmathe, P., Balakrishnan, N.: Environmental considerations on the optimal product mix. European Journal of Operational Research 167(2) (2005), 398–412.
[5] Kozmík, V., Morton, D. P.: Risk-Averse Stochastic Dual Dynamic Programming. Mathematical Programming 152(1) (2015), 275–300.
[6] Rockafellar, R. T., Uryasev, S.: Conditional value-at-risk for general loss distributions. Journal of Banking & Finance 26 (2002), 1443–1471.
[7] Zapletal, F., Němec, R.: The Usage of Linear Programming to Construction of Ecologic-economical Model for Industrial Company Profit Optimization. In: Proceedings of the 30th International Conference Mathematical Methods in Economics, Karviná, 2012, 1010–1015.
[8] Zapletal, F., Šmíd, M.: Mean-Risk Optimal Decision of a Steel Company under Emission Control. Central European Journal of Operations Research 24(2) (2016), 435–454.
[9] Zhang, B., Xu, L.: Multi-item production planning with carbon cap and trade mechanism. International Journal of Production Economics 144(1) (2013), 118–127.
Using Semi-dynamic Input-Output Model for the Analysis of Large Investment Projects

Jaroslav Zbranek1, Jakub Fischer2, Jaroslav Sixta3

Abstract. The use of Input-Output analysis for various macroeconomic studies has become more important in recent years. Thanks to more efficient computing technologies, these approaches provide extensive possibilities for various modifications of Leontief's IO model. Static IO analyses based on constant technical coefficients are not sufficient for studying the impacts of huge external impulses such as an investment shock. For these purposes, a deeper analysis is needed. More sophisticated models such as CGE and DSGE offer a solution, but it is relatively complicated and their results are very much aggregated. The impact is even more significant in cases when the duration of the investment shock is split into more years. The aim of the paper is to introduce a modified semi-dynamic IO model that we developed and successfully used for detailed analysis. Our model takes into account the development in the current and previous period, including changes caused by investments realized in a particular industry. The model permanently updates input coefficients without any imbalances occurring in the system. This allows a reaction to gradual structural changes in the economy induced by the implementation of the investment project.
Input-Output, technical coefficients, models, macroeconomic
JEL classification: C67, O11 AMS classification: 65C20
1 Introduction
Input-Output analysis (IOA) is a continually expanding tool used especially for macroeconomic analyses. Due to the continuous improvement of computer technology4, it is possible to carry out IOA based on quite large Input-Output models (IOM). Such IOMs can be based on more detailed structures of commodities or industries on the one hand, and they can also reflect various factors significantly influencing the results of the analyses on the other hand. The aim of this paper is to introduce a developed IOM which is adapted to the analysis of the impact of large multi-annual investment projects. This IOM, called the Semi-dynamic Input-Output model, takes into account the primary investment shock and also includes the calculation of the impact of induced effects in subsequent years. These effects come from increased wages and salaries and increased profits. The Semi-dynamic IOM uses automatically updated technical coefficients after the first and each of the following years. This approach allows adapting the following calculation to the changed situation in the economy due to the interim results after finished stages of the analyzed investment shock. Moreover, the Semi-dynamic IOM retains the detailed structure of commodities throughout the whole process of the analysis.
1 University of Economics in Prague, Dep. of Economic Statistics, Nam. W. Churchilla 4, Prague 3, Czech Republic, jaroslav.zbranek@vse.cz
2 University of Economics in Prague, Dep. of Economic Statistics, Nam. W. Churchilla 4, Prague 3, Czech Republic, fischerj@vse.cz
3 University of Economics in Prague, Dep. of Economic Statistics, Nam. W. Churchilla 4, Prague 3, Czech Republic, sixta@vse.cz
4 Improvements of ICT products also confirm the results of the study in [3].
2 Semi-dynamic Input-Output Model
The reason for the development of the Semi-dynamic IOM was the need to analyze the impacts of a huge investment shock that is spread over several years in the economy. As the Symmetric Input-Output tables used as input data are broken down by 82 commodities, a DSGE approach would require aggregating the input data in order to be able to reach results, [1]. Such a loss of information due to the aggregation of input data is not desirable, and therefore the Semi-dynamic IOM seems to be a good approach. For this purpose, the static Leontief IOM, as described by Hronová et al. [4] or Miller and Blair [5], is insufficient, because changes in the economy due to the investment shock made, as well as the induced effects, need to be taken into account. The base for the developed Semi-dynamic Input-Output model is a simple static IOM based on the Leontief inverse matrix, as in the following formula (1):
(1)
where x y (I − A)−1
output vector, vector of final use, Leontief inverse.
Beside primary investment shock, Semi-dynamic IOM takes into account two types of induced effects in each year of the investment project. The first induced effect is an increase of final household consumption expenditures (FHCE), defined in [2], due to increased wages and salaries (W), also defined in [2], of employees after primary investment shock. In case of induced effect based on W we assume that investment in particular year has impact only 50% to increase of FHCE in the same year. The remaining 50% has impact to the FHCE in the next year, according our assumptions. The estimate of induced increase of FHCE is based on the assumption of average propensity to consume (APC), explained in [7]. The coefficient of APC is estimated according to formula 2: AP C =
F HCE2010 , W2010
(2)
where F HCE2010 W2010
final household consumption expenditures in 2010 according to national accounts, wages and salaries in 2010 according to national accounts.
Induced increase of FHCE is calculated according to formula 3:
where ∆wj,t ∆∗ wj,t−1
∆F HCEt = 0.5 · AP C · Σ ∆wj,t + ∆∗ wj,t−1 ,
(3)
increase of wages and salaries due to primary investment shock in the year t, total increase of wages and salaries in the year t-1.
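A minimal numerical sketch of formulas (1) and (3) is given below. It uses a hypothetical 3-commodity economy (the real model works with 82 commodities), and the technical coefficients, wage shares and APC value are invented for illustration only.

```python
import numpy as np

# Toy example with 3 commodities; A and the shock are hypothetical numbers.
A = np.array([[0.10, 0.05, 0.02],
              [0.20, 0.15, 0.10],
              [0.05, 0.10, 0.20]])          # technical coefficients
leontief_inverse = np.linalg.inv(np.eye(3) - A)

delta_y = np.array([5000.0, 0.0, 1875.0])   # primary investment shock (final use)
delta_x = leontief_inverse @ delta_y        # output impact, formula (1)

# Induced FHCE, formula (3): APC times half of this year's and last year's wage increases.
apc = 0.52                                  # FHCE_2010 / W_2010, hypothetical value
wage_share = np.array([0.25, 0.30, 0.35])   # wages per unit of output, hypothetical
delta_w_t = wage_share * delta_x            # wage increase from the primary shock
delta_w_prev = np.zeros(3)                  # total wage increase in year t-1
delta_fhce = 0.5 * apc * (delta_w_t.sum() + delta_w_prev.sum())
print(delta_x.round(1), round(delta_fhce, 1))
```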
The commodity structure of the induced increase of FHCE was calculated using the FHCE matrix broken down by CPA codes and by COICOP codes, described in [2]. It is necessary to note that the induced increase of FHCE is assumed especially in connection with products which can be classed as luxury products, as described in [7]. Products and services according to the COICOP classification, described in [2], which can be considered luxury products were selected in the FHCE matrix for this purpose. The information and results of the study [6] were taken into account for the compilation of the structural matrix of FHCE. The second induced effect taken into account is a further increase of investment, at the macroeconomic level called gross fixed capital formation (GFCF), defined in [2], due to increased operating surplus and mixed income (OSMI), also defined in [2], which indicates the operating profit of enterprises at the macroeconomic level. The induced increase of GFCF also takes into account the increase of OSMI based on the first mentioned induced effect, i.e. the induced increase of FHCE. The estimate of the induced increase of GFCF is based on the coefficient of the investment rate (i), for this purpose defined according to formula (4):

  i_j = GFCF_{j,2010} / OSMI_{j,2010},   (4)

where GFCF_{j,2010} is the gross fixed capital formation of industry j in 2010 according to national accounts and OSMI_{j,2010} is the operating surplus and mixed income in industry j in 2010 according to national accounts.

The induced increase of GFCF is calculated according to formula (5):

  ∆GFCF_{j,t} = i_j · (∆OSMI_{I,j,t} + ∆OSMI_{II,j,t}),   (5)

where ∆OSMI_{I,j,t} is the increase of operating surplus and mixed income in industry j in the year t due to the primary investment shock and ∆OSMI_{II,j,t} is the increase of operating surplus and mixed income in industry j in the year t due to the induced FHCE.
The commodity structure of the induced increase of GFCF was derived using the structural matrix of GFCF for the year 2010, which is compiled and described as the AN approach in [8]. This GFCF matrix is broken down by commodities in rows and by industries in columns. The total impact is given by all effects together: the primary investment shock, the induced increase of FHCE and the induced increase of GFCF. The principle of operation of the Semi-dynamic IOM in each analyzed year is shown in Figure 1. A more detailed description of the Semi-dynamic IOM can be found in [9]. Before the analysis of impacts in the next year, the technical coefficients are always updated. The principle of their update is the same as their calculation for the simple static IOM; only the input data are changed, i.e. after including all impacts on the economy in the previous year. The input data for the update of technical coefficients are still in the form of Symmetric Input-Output Tables and thus this procedure does not cause any imbalances.
Figure 1 Overview of Semi-dynamic Input-Output Model
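The update of the technical coefficients between two analyzed years, described above, can be sketched as follows. The function and the tiny 2-commodity numbers are hypothetical, but the principle (recomputing a_ij = z_ij / x_j from the shocked Symmetric Input-Output Table) is the one the paper describes.

```python
import numpy as np

def update_technical_coefficients(Z: np.ndarray, x: np.ndarray,
                                  dZ: np.ndarray, dx: np.ndarray) -> np.ndarray:
    """Recompute A after adding the year's impacts to the intermediate flows and output.

    Z, x   ... intermediate flows and output from the symmetric input-output table
    dZ, dx ... changes induced by the investment shock and both induced effects
    (a sketch of the update step; the paper keeps the full 82x82 commodity detail).
    """
    Z_new = Z + dZ
    x_new = x + dx
    return Z_new / x_new[np.newaxis, :]      # a_ij = z_ij / x_j

# Tiny hypothetical illustration:
Z = np.array([[20.0, 10.0], [30.0, 25.0]])
x = np.array([100.0, 80.0])
A_next = update_technical_coefficients(Z, x,
                                        dZ=np.array([[2.0, 1.0], [3.0, 2.0]]),
                                        dx=np.array([10.0, 5.0]))
print(A_next)
```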
3 Input-Output Analysis Based on Semi-dynamic IOM
There are many possibilities of a primary shock that can be used for the application of the Semi-dynamic IOM in an Input-Output analysis. We decided to analyze a model situation with a large investment project, an investment into the infrastructure in the Czech Republic. The duration of this project is four years and a part of the large investment shock occurs in each year of the period. For illustration purposes, our calculations proceed for a period of five years, with the fifth year devoted to induced effects only. The distribution of the total investment shock of more than CZK 50 billion by product and year of investment is shown in Table 1.

CPA code   Product                                                               T1      T2      T3      T4
42         Constructions and construction works for civil engineering        5 000  13 000  16 000  16 000
62         Computer programming, consultancy and related services                0       0     500     500
71         Architectural and engineering services; technical testing and
           analysis services                                                  1 875     625       0       0

Table 1 Distribution of investment shock in time by commodity, CZK mil.

There are also planned capital expenditures connected with project documentation during the first two years (CPA 71), see Table 1, and a gradual increase in capital expenditures directly on the infrastructure. During the last two years, capital expenditures aimed at the creation of the necessary software connected with the subsequent use of the constructed infrastructure are planned.

3.1 Impact of the Investment Shock on the Gross Value Added
The impact of the investment shock distributed over the period of four years can be measured by the change of gross value added (GVA). As mentioned with regard to the induced effects, the results of the application of the Semi-dynamic IOM are presented for a period of five years, as shown in Figure 2. Figure 2 shows the development of the change in GVA based on the primary investment shock including both described induced effects. During the investment period, i.e. from T1 to T4, the increased GVA varies between 87% in T1 and 91% in T4 in relation to the investment shock in the particular year. Without the inclusion of any induced effect it would be only between 74% in T1 and 72% in T4. The resulting figure is below 100% since the increase in investment also stimulates imports.
Figure 2 Distribution of impact to Gross value added during the analyzed period, CZK mil.
3.2 Impact of the Investment Shock on the Output

Another approach to evaluating the impact of the investment shock distributed over the period of four years consists in using the change of output broken down by products (commodities). Similarly to the case of GVA, the results of the application of the Semi-dynamic IOM are presented for a period of five years, see Table 2. The table shows the commodity structure aggregated at the section level of this classification; all calculations were done at the level of 2-digit CPA codes. A significant increase of output connected with products related to the construction industries can be seen in Table 2. During the investment period from T1 to T4, the total increase of output including both induced effects is roughly three times higher than the primary investment shock in the particular year. Without including any induced effect, it would be only around 2.5 times. Table 2 also shows changes in the output structure in the fifth year, when the highest output is related to products of the manufacturing industry.
The reason for this phenomenon lies in the occurrence of the induced increase mainly of FHCE; the induced increase of GFCF is insignificant in this year.

CPA5 / Year      T1       T2       T3       T4      T5
Total         19 650   41 561   51 180   51 782   2 997
A                 96      254      352      379     135
B                178      440      541      543       9
C              2 352    5 374    6 891    7 150   1 288
D                190      398      496      507      59
E                 45      102      126      128      11
F             10 191   25 161   30 813   30 846     161
G                781    1 540    1 898    1 937     195
H                426      993    1 252    1 283     155
I                118      283      398      435     184
J                379      691    1 517    1 567     248
K                363      640      762      776      71
L                363      625      765      777      62
M              3 721    4 037    4 018    4 044     128
N                343      763      979    1 003     121
O                 12       22       27       27       2
P                 18       33       43       45       8
Q                  1        2        3        3       1
R                 30       93      141      159      86
S                 43      110      158      173      73
Table 2 Distribution of impact on output by products during analyzed period, CZK mil.
3.3 Impact of the Investment Shock on the Employment

An important measure for analyses of the impact of large investment projects is the change in employment. In connection with an investment project, a positive impact on employment is always expected. The impact on the change of employment in the assumed example is shown in Figure 3.
Figure 3 Distribution of impact to employment during the analyzed period

Figure 3 clearly shows that the highest impact on employment occurs in the year T4, with a total increase of 21.2 thousand employees. This difference is caused by the inclusion of the induced effects. From another point of view, one million of the primary investment shock including both induced effects requires hiring around 1.25 employees into the production process. Without the inclusion of any induced effect it would be at most 1.03 employees.

5 A Products of agriculture, forestry and fishing; B Mining and quarrying; C Manufactured products; D Electricity, gas, steam and air conditioning; E Water supply; sewerage, waste management and remediation services; F Constructions and construction works; G Wholesale and retail trade services; repair services of motor vehicles and motorcycles; H Transportation and storage services; I Accommodation and food services; J Information and communication services; K Financial and insurance services; L Real estate services; M Professional, scientific and technical services; N Administrative and support services; O Public administration and defence services; compulsory social security services; P Education services; Q Human health and social work services; R Arts, entertainment and recreation services; S Other services.
4 Conclusion

In comparison with a simple static IOM, the Semi-dynamic IOM provides more complex information in connection with an external shock. The inclusion of induced effects represents the most important benefit of the Semi-dynamic IOM. The first effect is connected with the induced change of final household consumption expenditures due to the change in wages and salaries after the investment shock; changes in wages and salaries imply changes in the disposable income of households, which is used for consumption expenditures or for savings. The second effect is connected with the induced change in gross fixed capital formation due to the change in the operating profit of enterprises, caused by both the initial investment shock and the induced effect of final household consumption expenditures; changes in the operating profit of enterprises imply changes in the investment behavior of subjects in the economy. Besides the simple static Input-Output analysis and the inclusion of induced effects, the Semi-dynamic IOM updates the technical coefficients when moving to the next period of the analysis. The updates of technical coefficients take into account the structural change in the economy based on the investment shock. That is why such a model is useful for deep economic analysis, especially for large multiannual investment projects. This paper brings an illustrative example of a possible way of applying the Semi-dynamic IOM; the model was used for the analysis of the impacts of a large investment shock in the Czech Republic. We have been developing the Semi-dynamic IOM since spring 2013. It is based on the published Symmetric Input-Output Tables, the latest for 2010. All other necessary input data used for the development of the Semi-dynamic IOM come from the national accounts of the Czech Republic for the year 2010. The currently used dimension of 82 × 82 products provides sufficient detail for a deep and relatively reliable analysis.
Acknowledgements Supported by the long term institutional support of research activities by Faculty of Informatics and Statistics, University of Economics, Prague.
References
[1] Bissová, S., Javorská, E., Vltavská, K., Zouhar, J.: Input-output interactions in a DSGE framework. In: Mathematical Methods in Economics 2013, Jihlava, 11.09.2013-13.09.2013. College of Polytechnics Jihlava, Jihlava, 2013, 55–59. ISBN 978-80-87035-76-4.
[2] European Union: European System of Accounts ESA 2010. Publications Office of the European Union, Luxembourg, [online], 2010. URL: http://ec.europa.eu/eurostat/web/products-manuals-and-guidelines/-/KS-02-13-269
[3] Fischer, J., Vltavská, K., Doucek, P., Hančlová, J.: Vliv informačních a komunikačních technologií na produktivitu práce a souhrnnou produktivitu faktorů v České republice. Politická ekonomie 5 (2013), 653–674. Text in Czech. ISSN 0032-3233.
[4] Hronová, S., Fischer, J., Hindls, R., Sixta, J.: Národní účetnictví. Nástroj popisu globální ekonomiky. C. H. Beck, Prague, 2009.
[5] Miller, R. E. and Blair, P. D.: Input-Output Analysis: Foundations and Extensions. Cambridge University Press, New York, 2009. ISBN-13 978-0-511-65103-8.
[6] Sixta, J., Vltavská, K., Hronová, S., Hindls, R.: Struktura spotřeby českých domácností 1970–2012. Politická ekonomie 6 (2014), 725–748. Text in Czech. ISSN 0032-3233.
[7] Soukup, J., Pošta, V., Neset, P., Pavelka, T., Dobrylovský, J.: Makroekonomie: Moderní přístup. Management Press, Prague, 2007. ISBN 978-80-7261-174-4.
[8] Safr, K.: Pilot Application of the Dynamic Input-Output Model. Case Study of the Czech Republic 2005-2013. Statistika, forthcoming.
[9] Zbranek, J.: Konstrukce a využití časových input-output tabulek v kontextu dynamizovaného input-output modelu. Dissertation thesis, University of Economics in Prague, Faculty of Informatics and Statistics, Department of Economic Statistics, Prague, 2014. Text in Czech.
Tax burden and economic growth in OECD countries: Two indicators comparison

Petr Zimčík1

Abstract. The aim of this paper is to examine possible effects of the tax burden on real economic growth in 32 OECD countries. The research is focused on the individual components of two tax burden indicators. A tax quota is used in most of the similar studies, and the World Tax Index is an alternative. Instead of concentrating on the aggregate effects of the tax burden, we rather focus on the effects of individual taxes in both indicators. These individual components are separately added into the augmented endogenous growth model with a physical capital approximation as well as other control variables. A dynamic panel regression is used as the method to analyze any tax burden effects in the time period 2000-2013. The main finding of this paper is that product taxes show the most significant adverse effects on economic growth. The results also indicate that property taxes can positively stimulate the economic growth rate; there is a certain contradiction here between our results and the majority of similar empirical studies. Any distortionary tendencies of corporate income tax quotas are identified as statistically insignificant in the analysis and have no effect on economic growth.

Keywords: Economic growth, tax burden, tax quota, World Tax Index.

JEL Classification: E62, H20
AMS Classification: 62P20
1 Introduction
Taxes represent a crucial instrument in every economy. Governments need financial funds to operate and to conduct their policies, so they use taxes to collect these funds. The more the governments collect, the more public goods and services can be provided to citizens. This is the basic idea behind tax policy. However, taxes have negative implications for the real economy because they drain money from the private sector. They are mandatory for citizens and companies and represent their contributions from wages, profits, properties and consumption. This could create distortions and slow down the economic growth rate. All this can be interpreted as a tax burden, which is perceived by all economic subjects. Governments stand before a difficult challenge: to find a balance between collecting the funds necessary to cover their expenditures and not exceeding the tax burden of domestic economic subjects. The tax burden is more apparent in developed (richer) economies because they usually have an extensive public sector and require a higher volume of revenues. There are many ways to measure the tax burden in a specific country. I use two indicators in this paper: the tax quota and an alternative indicator, the World Tax Index (WTI). More about the WTI can be found in Kotlán and Machová [11]. The tax quota represents the share of total tax revenues in relation to nominal GDP. The WTI is an indicator which consists of tax quota hard data but also uses soft data regarding expert opinions of economic institutions, researchers and others; it sets an index of the assumed tax burden in advanced countries. The aim of this paper is to examine possible effects of the tax burden on real economic growth in 32 OECD countries using the two presented indicators. The overall tax burden is analyzed together with the effects of individual taxes.
2 Literature review
The theoretical framework used to study any long-term impacts of taxation is the endogenous model of growth pioneered by Barro [4]. Fiscal policy in this model has an impact not only on economic output but can also influence the steady-state growth rate. Barro [4] predicted either negative or neutral effects of taxation based on the type of tax classification. Fölster and Henrekson [8] used econometric analysis to get estimates of the effects of taxation on the growth rate. They discovered a negative correlation between those two variables. Additional testing of their estimates, however, showed that they were not robust.

1 Department of Economics, Faculty of Economics and Administration, Masaryk University, Lipová 41a, Brno, Czech Republic, e-mail: petr.zimcik@mail.muni.cz.
More recent studies showed robust and significant outcomes. Bergh and Karlsson [5] estimated the effects of total taxation in OECD countries for the period 1970-1995. Their results showed that an increase in total taxation of 1 percentage point (pp) induced a decrease in the growth rate of 0.11 percentage point. Bergh and Karlsson [5] also extended the observed period to 1970-2005, which changed the results only slightly: taxation had a 0.10 pp adverse effect on the economic growth rate. Romero-Avila and Strauch [12] focused on 15 EU countries in the period 1960-2001. They found weaker negative effects than [5]: a one pp increase in taxation was associated with a downturn in economic growth of 0.06-0.07 pp. Afonso and Furceri [1] used panel regression on OECD and EU member states to analyze both groups of countries separately. They studied the composition of taxation and its overall effect. In all regressions the effects were robust and negative. The results for both EU and OECD countries were similar, estimated to be about -0.12 pp, which is higher than in the two previous studies. Not all empirical studies found the effects of rising taxation to be negative. Colombier [7] worked with a new regression estimator and fiscal variables. He found the effects of taxation on the per capita growth rate to be either insignificant or small but positive. Bergh and Öhrn [6] presented some criticism of Colombier's work in their paper. The criticism was not pointed at the new estimator but at the fact that certain control variables were omitted from the regression, which could lead to biased results. Bergh and Öhrn [6] re-estimated his results with proper control variables and obtained robust negative effects of a taxation increase. Some authors have been interested in constructing a scale which would rank the distortionary effects of individual types of taxes. Arnold [3] checked the effects of different tax categories on GDP per capita. He used a panel of 21 OECD countries in his analysis and found that income taxes are in general accompanied by lower economic growth than taxes on property and production taxes. His analysis enabled him to create a ranking of taxes with respect to economic growth: property taxes, especially recurrent taxes on immovable property, have the most growth-friendly effects, production taxes followed by personal income taxes indicate adverse effects, and corporate taxes have, according to [3], the strongest negative effects on GDP per capita. Xing [14] examined the results in Arnold [3] and presented some doubts about this ranking system. In Xing's analysis, shifts in total tax revenue towards property taxes are indeed accompanied by a higher income level per capita; however, increases of tax revenues from production taxes, corporate and personal income taxes are associated with lower per capita income in the long run. Xing [14] did not find any proof to prioritize personal income taxes over corporate ones or to favor consumption taxes over any income taxes as in [3].
2.1 Tax burden
Tax policy, as one form of the government's fiscal policy, focuses on ensuring macroeconomic stability, which is essential for achieving and maintaining economic growth. According to the IMF [9], a well-designed tax policy can boost employment, productivity and investment through several channels, while an excessive tax burden can have the opposite effect. The tax and benefit systems, which include social transfers and labor income taxation, influence the decision whether a person will be willing to work, and also how much (how long) they will work. Capital income taxes influence private savings and investment; a decrease in the return on savings at the individual level could also discourage small entrepreneurs from starting their own business. Governments can also use tax incentives to encourage private research spending, and such support schemes can provide high economic and social returns from research and development. I follow the model in Turnovsky [13] in my analysis of the tax burden. The individual agent in a market economy purchases consumption out of the after-tax income generated by his labor and his holdings of capital. His objective is to maximize

U = \int_0^{\infty} \frac{1}{\gamma} \left( c\, l^{\theta} G_c^{\mu} \right)^{\gamma} e^{-\beta t}\, dt ,   (1)
where γ is a parameter related to the inter-temporal elasticity of substitution, β is the rate of time preference, c is individual private consumption, l is leisure and G_c denotes the consumption services of the government-provided consumption good. The parameters θ and µ measure the impacts of leisure and public consumption on the welfare of the private agent. The economic agent is subject to the individual capital accumulation equation
\dot{k} = (1-\tau_w)\, w\, (1-l) + (1-\tau_k)\, r_k k - (1+\tau_c)\, c - T/N ,   (2)
where r_k is the return to capital, w the real wage rate, τ_w the tax on wage income, τ_k the tax on capital, τ_c the consumption tax, T/N the agent's share of lump-sum taxes and k refers to the agent's holdings of capital. According to this model, all types of taxes have negative repercussions on growth.
3 Methodology
We follow the methodology in Bergh and Öhrn [6] and similar works. Tax variables from the Turnovsky model (eq. 2) were added to the endogenous growth model; these fiscal variables represent an approximation of the tax burden in the economy. Control variables used in Bergh and Öhrn [6] were also taken into consideration. The final econometric equation has the form

g_{i,t} = \alpha_i + \sum_{k=1}^{K} \beta_k Y_{k,i,t} + \sum_{j=1}^{J} \gamma_j X_{j,i,t} + u_{i,t} ,   (3)
where g_{i,t} represents annual GDP growth, Y_{k,i,t} are non-fiscal control variables and X_{j,i,t} are the approximations of the tax burden; β_k and γ_j are the respective coefficients, α_i is the individual effect and u_{i,t} is an error term. I use both the overall tax quota and the WTI to better determine the possible effects of the tax burden on the economy and also to assess which indicator is better. A second set of regressions is used to examine the effects of individual categories of taxes.
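As a purely illustrative aid, equation (3) can be prototyped as a least-squares dummy-variable (fixed-effects) regression on a simulated toy panel; this is only a hedged sketch, not the estimation strategy of the paper (which uses two-step GMM, see Section 3.2), and all column names and data below are hypothetical.

```python
# Hedged illustration only: equation (3) estimated as a fixed-effects (LSDV)
# baseline on a simulated toy panel; the paper itself uses two-step GMM.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
countries = [f"c{i}" for i in range(32)]
years = range(2000, 2014)
df = pd.DataFrame([(c, y) for c in countries for y in years], columns=["country", "year"])
n = len(df)
df["capital"] = rng.normal(2, 1, n)        # placeholder for capital formation growth (%)
df["d_taxquota"] = rng.normal(0, 1, n)     # placeholder for the differenced tax quota
df["growth"] = 0.1 * df["capital"] - 0.3 * df["d_taxquota"] + rng.normal(0, 1, n)

# country dummies play the role of the individual effects alpha_i in eq. (3)
model = smf.ols("growth ~ C(country) + capital + d_taxquota", data=df).fit()
print(model.params[["capital", "d_taxquota"]])
```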
3.1 Data
I used annual data for the period 2000-2013; the analysis focuses on 32 OECD countries.2 Data were collected from OECD statistics3 and from the WTI, which can be found in Machová and Kotlán [11]. The dependent variable is real annual GDP growth in percent. To compare the WTI and the tax quota properly, a small adjustment had to be made: tax quotas are expressed as a percentage of GDP, whereas the WTI is constructed as an index whose nominal value is lower than one. The WTI was therefore multiplied by 100, which changes the interpretation of the results presented in the next section. Not only the aggregate tax quota is observed; individual tax categories were also examined for a more detailed analysis. Four main tax categories were used, as shown in Table 1. These tax categories have their counterparts in the WTI dataset, which were also multiplied by 100 to achieve comparable results. Control variables are similar to those in Bergh and Öhrn [6]. Individual types of taxes were divided as follows: the OECD distinguishes five main categories of taxes, which were transformed to correspond to the WTI division, so that eventually four groups of taxes were compared in the analysis.4 These groups, together with all explanatory variables and their descriptions, can be found in Table 1:

Control Variables    Description
Capital formation    Gross fixed capital formation growth (in %)
Employment           Annual change in employment (in %)
Inflation            Inflation rate measured by CPI
Unemployment         Unemployment rate (in %)
Trade                Annual growth rate of international trade (in %)

Fiscal Variables     Description
WTI                  World Tax Index (multiplied by 100)
Tax quota            Share of tax revenues in nominal GDP
PIT                  Personal income taxes
CIT                  Corporate income taxes
Property             Individual property taxes
Product              Taxes on consumption (VAT included)
Table 1 Description of independent variables

The stationarity of all variables was tested using the panel unit-root (ADF-type) test of Levin, Lin and Chu [10]. Neither the dependent variable nor the control variables indicated any stationarity problems. In all fiscal variables, which are in the form of indexes or quotas, a possible presence of a unit root was detected. The fiscal variables were therefore transformed into first differences, and additional testing confirmed that this procedure solved the non-stationarity problem. In Tables 2 and 3, these differenced variables carry the prefix "d_".

2 Chile and Mexico, as the only two OECD member states, are not included in the research due to data unavailability.
3 Available [online] at http://stats.oecd.org/
4 The WTI also has five tax categories; value added tax and other taxes on consumption were merged into a single category. All WTI categories were also multiplied by 100 to better correspond to their tax quota counterparts.
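Relating to the differencing step above, the following sketch illustrates one way such a screening could be coded, assuming a long-format pandas panel; note that the paper applies the Levin-Lin-Chu panel test, whereas this hypothetical helper only loops a country-by-country ADF test as a rough proxy.

```python
# Illustrative sketch of the unit-root screening and differencing step.
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def needs_differencing(panel: pd.DataFrame, column: str, alpha: float = 0.05) -> bool:
    """panel: long-format data with columns 'country', 'year' and `column`.
    Returns True if the ADF test fails to reject a unit root for most countries."""
    rejections = []
    for _, g in panel.groupby("country"):
        series = g.sort_values("year")[column].dropna()
        if len(series) > 6:                        # ADF needs a minimally long series
            pvalue = adfuller(series, autolag="AIC")[1]
            rejections.append(pvalue < alpha)
    return sum(rejections) < len(rejections) / 2   # unit root suspected in most countries

def to_first_differences(panel: pd.DataFrame, column: str) -> pd.Series:
    """Within-country first differences, matching the d_ variables in Tables 2 and 3."""
    return panel.sort_values(["country", "year"]).groupby("country")[column].diff()
```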
3.2 Method
The two-step generalized method of moments (GMM) was used as the estimation technique. I applied the Arellano-Bond estimator [2], which dynamically adds instruments as the number of available lags grows with t. Using this robust estimator when calculating the covariance matrices ensures that the standard errors of the estimates and the hypothesis tests remain valid under a possible presence of heteroscedasticity and autocorrelation. Using appropriate instruments reduces the risk of endogeneity arising from the lagged values of the dependent variable and from independent variables with a random component. Starting from t-2, lagged values of the dependent variable were used as instruments. The Sargan-Hansen test was conducted to confirm that appropriate instruments were used; the J-statistic is reported for both sets of regressions in the results section, and its value does not reject the null hypothesis of instrument validity.
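The growth of the instrument set with t can be illustrated schematically. The sketch below builds an Arellano-Bond style block of lagged-level instruments for a single panel unit; it is a simplified illustration under stated assumptions, not the full two-step GMM estimator with its weighting matrix, and all names are hypothetical.

```python
# Illustrative sketch (not the exact estimator used in the paper): lagged levels of
# the dependent variable dated t-2 and earlier instrument the first-differenced
# equation, so the number of instruments grows with t.
import numpy as np

def ab_instrument_block(y, exog_diff):
    """y: (T,) levels of the dependent variable for one country,
    exog_diff: (T-1, k) first differences of strictly exogenous regressors.
    Returns Z: instrument matrix for the differenced equations at t = 2..T-1."""
    T = len(y)
    n_periods = T - 2                          # differenced equation usable from t = 2
    max_lags = sum(range(1, n_periods + 1))    # total number of lagged-level instruments
    rows = []
    for i, t in enumerate(range(2, T)):
        lags = y[:t - 1]                       # y_0 ... y_{t-2} instrument the equation at t
        z = np.zeros(max_lags)
        start = sum(range(1, i + 1))
        z[start:start + len(lags)] = lags
        rows.append(np.concatenate([z, exog_diff[t - 1]]))  # exogenous regressors instrument themselves
    return np.vstack(rows)

# toy usage with simulated data
rng = np.random.default_rng(0)
y = rng.normal(size=8)
x = rng.normal(size=(7, 2))
Z = ab_instrument_block(y, x)
print(Z.shape)   # (6, 21 + 2): one row per usable period
```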
4 Empirical results
The results of the two-step Arellano-Bond estimator can be found in Table 2. The dependent variable is the real GDP growth rate; its lagged value is used as an explanatory variable, as is usual in dynamic panels. The coefficients of all control variables are consistent with the theory and with the results in Bergh and Öhrn [6]. Capital formation, employment and openness are directly related to economic activity and promote growth, while the opposite tendencies are expected from inflation and unemployment, which can also be seen in Table 2. Only the overall tax burden effects are estimated in this set of regressions. The coefficients of both indicators were negative, which corresponds to an inverse relationship between economic growth and the tax burden. Comparing the values of the two coefficients, measuring the burden with the tax quota yields stronger negative tendencies than the WTI. A ten pp increase of the tax quota should be accompanied by a 3.48 pp decrease in the GDP growth rate, ceteris paribus. The results for the WTI show milder effects: an increase of the WTI by 0.10 point is associated with a 0.78 pp drop in the growth rate.

Dependent variable: Real GDP growth rate
                          Tax quota                        WTI
Variable                  Coefficient    t-statistic       Coefficient    t-statistic
GDP(-1)                   0.036          1.126             0.019          0.612
const                     -0.097         -4.838***         -0.128         -6.235***
Capital formation         0.095          6.831***          0.095          6.754***
Employment                0.356          5.336***          0.337          4.962***
Unemployment              -0.045         -0.958            -0.062         -1.297
Inflation                 -0.077         -2.611***         -0.056         -1.867*
Trade                     0.215          14.680***         0.207          13.900***
d_Tax quota               -0.348         -4.571***
d_WTI                                                      -0.078         -2.191**
Observations              384                              384
Number of instruments     85                               85
J-statistic               26.72                            26.60

Table 2 Effects of overall tax burden5
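The marginal effects quoted above follow directly from the coefficients in Table 2; since the WTI was rescaled by 100, a 0.10-point change in the raw index corresponds to a 10-unit change in the regressor (a sketch of the arithmetic):

\Delta g \approx \hat{\gamma}_{TQ} \cdot \Delta TQ = (-0.348) \times 10 = -3.48 \text{ pp}, \qquad \Delta g \approx \hat{\gamma}_{WTI} \cdot \Delta(100\, WTI) = (-0.078) \times 10 = -0.78 \text{ pp}.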
The overall effect of the tax burden is decomposed into the individual effects of the personal income tax (PIT), corporate income tax (CIT), individual property tax (Property) and product taxes (Product) in Table 3. The control variables have coefficients similar to those in Table 2. Comparing the results for the individual tax categories, the main difference lies in statistical significance. In the WTI division, only the effects of product taxes are statistically significant at the 5% level. The tax quotas yield significant results for PIT, property and product taxes, with PIT and Product significant at the 1% level. Based on this criterion, tax quotas give more relevant information on the effects of individual taxes on economic growth.
5 t-statistics are adjusted for heteroscedasticity and autocorrelation; *, **, *** indicate significance levels of 10 %, 5 %, and 1 %, respectively. Source: Author's calculations. Note: the same applies to Table 3.
The interpretation of the coefficients is exactly the same as in the case of Table 2. Product taxes have the most distortionary tendencies: a 10 pp increase in the product tax quota induces a 6.7 pp drop in the GDP growth rate, ceteris paribus. PIT has a milder effect, estimated at 4.74 pp. The individual property tax shows a direct relationship with economic growth. These results are consistent with the previous study by Xing [14]. However, the corporate tax quota indicated no effect on growth, even though it was expected to have one of the strongest distortionary effects.

Dependent variable: Real GDP growth rate
                          Tax quota                        WTI
Variable                  Coefficient    t-statistic       Coefficient    t-statistic
GDP(-1)                   0.038          1.143             0.017          0.527
const                     -0.096         -4.643***         -0.125         -6.074***
Capital formation         0.099          7.146***          0.097          6.851***
Employment                0.329          4.948***          0.336          4.954***
Unemployment              -0.048         -1.024            -0.056         -1.179
Inflation                 -0.062         -2.135**          -0.056         -1.867*
Trade                     0.209          13.870***         0.207          13.910***
d_PIT                     -0.474         -2.868***         0.003          -0.053
d_CIT                     0.025          0.163             -0.010         -0.135
d_Property                0.841          1.987**           -0.231         -1.435
d_Product                 -0.670         -4.265***         -0.094         -2.206**
Observations              384                              384
Number of instruments     88                               88
J-statistic               24.09                            27.17

Table 3 Effects of individual tax burden
5 Conclusion
The aim of this paper was to examine the possible effects of the tax burden on real economic growth in 32 OECD countries in the period 2000-2013. Two tax burden indicators were used: the first is the overall tax quota, the second the World Tax Index. The latter is created not only from hard data but also incorporates the qualified opinions of institutions and domestic experts. A dynamic two-step GMM was used as the technique to estimate both the overall and the individual effects of the tax burden. According to our results, the tax quota has a significant negative correlation with the GDP growth rate, while the WTI shows this negative correlation as well, but weaker. In the second set of regressions, the overall tax burden was decomposed into four tax categories - personal income tax, corporate income tax, individual property tax and product taxes, which include value added tax. The last category has the most distortionary effects, followed by the personal income tax. The property tax indicated a positive relation with economic growth. These results are similar to other empirical studies, with one exception: in our case the corporate income tax did not show any significant effects. These results relate to the tax quota division only; in the WTI division, only the product tax had significant negative effects. Comparing the two indicators, the tax quota produced more significant and relevant results. This can be explained by the definition of the WTI, which is constructed for comparing the tax burden among OECD countries. The value of the index lies in the interval <0;1> and does not change much over time, and it is therefore less suitable for dynamic analysis. The tax quota is simpler but can be better used for studying the impacts of the tax burden on economic growth.
Acknowledgements This work was supported by funding of specific research at Faculty of Economics and Administration - project MUNI/A/1055/2015. This support is gratefully acknowledged.
References [1] Afonso, A., and Furceri, D.: Government Size, Composition, Volatility and Economic Growth. Working paper No. 849. ECB, 2008. [2] Arellano, M., and Bond, S.: Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations. The Review of Economic Studies 58 (1991), 277-297.
[3] Arnold, J.: Do Tax Structures Affect Aggregate Economic Growth?: Empirical Evidence from a Panel of OECD Countries, OECD Economics Department Working Papers, No. 643, OECD Publishing. [4] Barro, R.: Government Spending in a Simple Model of Endogenous Growth. Journal of Political Economy 98 (1990), 103–125. [5] Bergh, A., and Karlsson, M.: Government Size and Growth: Accounting for Economic Freedom and Globalization. Public Choice 142 (2010), 195-213. [6] Bergh, A., and Öhrn, N.: Growth Effects of Fiscal Policies: A critical Appraisal of Colombier´s (2009) Study. IFN Working Paper No. 865 (2011). [7] Colombier, C.: Growth Effects of Fiscal Policies: An Application of Robust Modified M-Estimator. Applied Economics 41 (2009), 899-912. [8] Fölster, S., and Henrekson, M.: Growth effects of government expenditure and taxation in rich countries. European Economic Review 45 (2001), 1501–1520. [9] IMF. Fiscal Policy and Long-Term Growth. (2015). IMF Policy Paper. Washington D.C. [10] Levin, A., Lin, C. F., and Chu, C.: Unit Root Tests in Panel Data: Asymptotic and Finite-Sample Properties. Journal of Econometrics 108 (2002), 1–24. [11] Machová, Z., and Kotlán, I.: World Tax Index: New Methodology for OECD Countries, 2000-2010. DANUBE: Law and Economics Review 4, 165-179. [12] Romero-Ávila, S., and Strauch, R.: Public finances and long-term growth in Europe: Evidence from a panel data analysis. European Journal of Political Economy 24 (2008), 172-191. [13] Turnovsky, S.: Fiscal policy, elastic labor supply, and endogenous growth. Journal of Monetary Economics. 45 (2000), 185-210. [14] Xing, J.: Does tax structure affect economic growth? Empirical evidence from OECD countries. Centre for Business Taxation WP 11/20.
Effectiveness and Efficiency of Fast Train Lines in the Czech Republic
Miroslav Žižka1
Abstract. The article deals with the evaluation of long-distance railway lines in the Czech Republic using Data Envelopment Analysis. The object of the research was 28 lines which are operated under a contract of public service obligation between the Ministry of Transport and Czech Railways. The article is divided into three parts. The first part focuses on a literature review of the use of DEA in evaluating transport services. The next part of the article describes the research methodology. A DEA model with uncontrollable inputs (fees for the use of infrastructure), variable returns to scale and input orientation (NC-IV model) was used. The output of the first model was traffic performance, and the result of the DEA was cost efficiency. The outputs of the first model then enter the second model, in which service effectiveness was analyzed using the BCC-I model. Based on a comparison of revenues with the initial inputs, overall efficiency was determined. In the third part of the article, the results of the analysis of each of the fast train lines are presented. For each line, the scores of cost efficiency, service effectiveness and overall efficiency were determined.
Keywords: fast train lines; DEA analysis; BCC model; non-controllable input; cost efficiency; service effectiveness; overall efficiency.
JEL Classification: R41, C61 AMS Classification: 90C90
1 Introduction
The article deals with evaluating the efficiency of long-distance rail transport in Czechia and the factors affecting it. Rail transport is operated either as a public service or on a commercial basis. On a commercial basis, there are EC, IC and SC trains currently operated by Czech Railways (CD) and open-access connections operated by the RegioJet and LEO Express operators. The article focuses on long-distance railway lines which are operated under a contract of public service [13]. The contract between the Ministry of Transport (MoT) and CD was signed for the period 2010-19 and concerns 26 lines of long-distance transport in the express train and fast train categories. The range of transport performance and the amount of compensation are specified annually by amendments to the contract. For the year 2011, which is the subject of the analysis, a volume of 35.2 million train-kilometers was agreed. To cover demonstrable losses, the operator was paid CZK 4,005 million [12]. Two fast train lines, R14 and R16, have separate contracts as they were the subject of a public competition in 2005. The transport performance of these two lines accounted for 1.5 million train-kilometers with a compensation of CZK 66.2 million. Thus, in total, 28 fast train lines are operated in the public interest (see Table 1). In connection with the liberalization of the European passenger rail transport market, it is expected that instead of directly awarding the contract to one company, the contracts to operate services after 2019 will be tendered. Currently, the tender schedule of 6 lines is already known [11]. Since a relatively high amount of public funds, attracting the interest of private operators, is allocated to the reimbursement of provable losses, a debate has arisen whether these funds are spent transparently and efficiently. One of the tools that can assess the efficiency of public services is Data Envelopment Analysis (DEA). This article aims to evaluate the efficiency of fast train lines from the viewpoint of both the operator and the passengers. From the viewpoint of the operator, efficiency is connected with the transformation of inputs into transport performance. From the viewpoint of passengers, it is their willingness to pay the fare for the offered transport performance. The overall efficiency then compares the revenues for the sold transport performance with the costs paid at the input.
2 Literature overview
To measure the efficiency of rail transport, parametric and non-parametric approaches can be used. The parametric approach seeks to estimate the parameters of a production function; the main non-parametric method is DEA. A comparison of the parametric and non-parametric approaches was carried out on a sample of 89 commuter rail systems and revealed a relatively large consensus between the two approaches [6]. Given that this
1 Technical University of Liberec, Faculty of Economics, Studentská 2, Liberec, miroslav.zizka@tul.cz.
article is based on the DEA approach, a review of the use of this method in evaluating the efficiency of rail transport was carried out. The issue of evaluating performance and efficiency in the transport sector can be divided into two areas - the evaluation of the performance of operators and the evaluation of the efficiency of lines. The performance of railway companies can be broken down into three components: technical efficiency, service effectiveness and technical effectiveness [17]. This is related to the fact that the outputs can be divided into two categories: oriented inside the company (the process of producing services) and customer-oriented (the business process). Technical efficiency measures the relation between inputs and produced outputs. These outputs are transport performance (e.g. seat-kilometers, train-kilometers). Using these outputs then generates revenues, which can be understood as the consumption of the service and also as the output of the business process. If no financial indicators are available, passenger-kilometers can be used to measure the consumption of the service. Service effectiveness evaluates the consumption of a service relative to the offered transport performance, while technical effectiveness relates service consumption directly to the physical inputs. Relations between inputs and outputs often form a network. For measuring the performance of railway companies, it is therefore appropriate to use network and multi-activity DEA approaches. The results were empirically tested on a sample of 20 railway companies in Europe [17], or in other research on 40 companies worldwide [16]. A study of the efficiency of 8 high-speed rail systems was based on a similar principle. The system efficiency was divided into production efficiency and service effectiveness. The inputs of the production process were the capacity of the fleet and the length of the network, and the output was the performance of the fleet. The output of the first process also served as an input to the second process. Carriage performance (in passenger-kilometers) and the number of passengers were the outputs of the consumer process [4]. In a survey of 24 US operators of suburban transportation, performance was divided into 4 sequential stages: management, operations, routing and scheduling, and marketing and sales. The basic input at the beginning of the first process were operating costs, the final output were revenues. The solution provided the efficiency scores of individual operators in the time series 2001-10 and an assessment of the factors that influence this efficiency. It was found that systems with a high degree of subsidies are less organizationally efficient [9]. Extensive research focused on the indicators used to evaluate the effectiveness and efficiency of public transport was carried out in [3]. The research showed that the inputs can be divided into three categories: physical, capital and operational. Physical inputs most often include the fleet, employees and fuel consumption. Capital inputs mean the price of capital. Operating costs mostly include the cost of labor and fuel. Outputs can also be divided into 3 categories: provision of service, service consumption and revenues. In addition, other external variables often occur in the studies.
These variables can be divided into 5 categories: quality (network length, average speed, average fleet age), socio-demographic and geographic (population density, locality), management (type of contract), subsidies and externalities (accidents, emissions). Inspiration for evaluating individual lines rather than the performance of operators can be found in other modes of transport. Roháčová evaluated the efficiency of 28 lines of urban transport in Slovakia using a hybrid DEA model. Based on a correlation analysis, 4 inputs were identified (line length, number of connections, capacity, operating costs) and 2 outputs (total number of passengers and sales). The input line length was considered uncontrollable as no modifications to existing lines are possible [15]. Another example is the evaluation of 15 airlines in Taiwan. The performance was evaluated from 3 perspectives: cost efficiency, service effectiveness and cost effectiveness. The inputs of the first production process were fuel cost, personnel cost and aircraft cost; the outputs were the number of flights and seat-miles. These outputs were also inputs to the follow-up process of service consumption. The final outcomes were passenger-miles and the number of passengers [1].
3 Data and methodology
The basic source of data were the statements [14] published by the MoT after a dispute with the RegioJet company in 2013. The statements include the expense and revenue items and data on transport performance in train-kilometers for each fast train line. However, the MoT refused to disclose the figures for subsequent years, citing trade secrecy. The data from the MoT were further complemented by the length of each fast train line in kilometers, the number of connections per week (taking into account the fact that some lines are not operated every day) and the average travel speed (km/h). These data were taken from the CD timetable for 2011. In addition, the numbers of inhabitants who may potentially be served by the fast train lines were determined; these are inhabitants of municipalities where the corresponding service stops. The data source was the municipal statistics [2]. The research methodology was divided into the following steps: 1. Definition of inputs and outputs – the key inputs in providing railway services are the fleet, personnel, energy and rail infrastructure. Fleet costs can be expressed by means of depreciation, which captures the amortization of vehicles as a result of their use to provide services to passengers. In Czechia, the ownership and maintenance of infrastructure are separated from railway transport operation; for the use of the transport infrastructure, rail operators pay fees to the Railway Infrastructure Administration. In terms of DEA models, this fee has the nature of an uncontrollable input because the operator can influence its amount only to a very limited extent.
The fee consists of two parts: the first part depends on the distance traveled and the second part depends on the weight of the train (in gross ton-kilometers). If a given line has a fixed length, the operator may affect only the component dependent on the weight of the train, by using lighter train units. From the statements of the MoT, detailed data on the cost structure were available (traction power and fuel, power consumption, personnel costs, depreciation, fees for the use of railway infrastructure, etc.). From the viewpoint of the operator, revenues also include the compensations expressing the difference between the economically justified costs and revenues. Operators have two types of customers: passengers, who pay the fare for the service provided, and the MoT, which orders a certain extent of transport performance to ensure transport services in the territory. Sometimes these compensations are referred to as subsidies. The subsidies were not considered to be an output in the DEA because they automatically compensate for the difference between costs and revenues; it was verified empirically that in such a case the coefficient of overall efficiency of all lines would equal one or be very close to this value. Inputs and outputs for DEA should be independent and reflect reality [5]. Relations between the input and output variables were assessed using Pearson product moment correlation coefficients. Strongly correlated variables were excluded from the analysis (r > 0.95, p-value < 0.0001): total cost, production consumption, operating and administrative overheads. On the output side, there is a quite strong and direct relationship between revenues and transport performance in train-kilometers (r = 0.8777, p-value < 0.0001). Further, the direct and strong correlation between the number of connections and realized transport performance (r = 0.8155, p-value < 0.0001) is logical. Only moderate correlations were found between revenues and the number of connections (r = 0.5785, p-value = 0.0013) and the average travel speed (r = 0.5119, p-value = 0.0054). A strong correlation with all inputs was found for the transport performance indicator in train-kilometers (r > 0.76, p-value < 0.0001). A similar result also applies to sales; the strongest correlation is between revenues and rail infrastructure fees (r = 0.9449, p-value < 0.0001). This can be interpreted as meaning that the lines with high revenues are operated on tracks where higher fees are paid for their use. Such a result is expected because the rail infrastructure fees depend on the category of the railway line (corridor, national, regional line); the most profitable lines are operated on corridor lines where the highest fee is paid for their use. Only a moderate correlation of travel speed with the inputs is quite surprising. From this perspective, a stronger correlation (r = 0.5066, p-value = 0.0059) was found only between travel speed and the rail infrastructure charges. In fact, the fastest connections are operated along the corridor lines where the fees are higher. For the correlation between speed and depreciation, only a weak and insignificant relationship was found (r = 0.3249, p-value = 0.0916). This can be explained by the fact that the travel speed depends mainly on the number of stops, not only on the type of power vehicle. The number of connections shows the strongest correlation with personnel costs (r = 0.8010, p-value < 0.0001) – a high frequency of connections requires a larger number of staff.
Similarly, the relationship of the number of connections with the rail infrastructure fees (r = 0.7408, p-value < 0.0001) and energy consumption (r = 0.7248, p-value < 0.0001) is strong and logical. 2. Construction of mathematical models – for the first stage of cost efficiency, a model was developed with the assumption of variable returns to scale, with a focus on inputs and one uncontrollable variable (NC-IV model). In the mathematical model, see equations (1) and (2), it is necessary to divide the inputs into controllable (I_K) and uncontrollable (I_N). The controllable inputs for calculating the CE (model 1) were the consumption of fuel and energy, personnel expenses and depreciation. The uncontrollable input (model 1) was the rail infrastructure fees. The slack variables belonging to the uncontrollable inputs are set equal to zero [7]. The outputs were the transport performance in train-kilometers and the number of connections. In the service effectiveness stage (model 2), the classic BCC model with a focus on inputs was used, see relations (3) and (4). The two outputs from the first model were used as inputs (transport performance in train-kilometers; number of connections), and revenues were the output. For the last stage (model 3) of overall efficiency, the NC-IV model was also used, with one output (revenues) and the inputs from the first model (controllable inputs: fuel and energy, personnel expenses, depreciation; uncontrollable input: infrastructure fees). Models with a focus on inputs were used because rail transport is one of the sectors with derived demand [8]; this means that the demand can be influenced by changing the range of the transport service. The open source OSDE-GUI SW was used to solve all models (an illustrative linear-programming sketch of the NC-IV formulation is given after this methodology list).
z_q = \theta_q - \varepsilon \left( \sum_{i \in I_K} s_i^- + \sum_{i \in O} s_i^+ \right) \rightarrow \min   (1)

subject to

\sum_{j=1}^{n} x_{ij} \lambda_j + s_i^- = \theta_q x_{iq} ,  i \in I_K ,
\sum_{j=1}^{n} x_{ij} \lambda_j + s_i^- = x_{iq} ,  i \in I_N ,
\sum_{j=1}^{n} y_{ij} \lambda_j - s_i^+ = y_{iq} ,  i \in O ,   (2)
\sum_{j=1}^{n} \lambda_j = 1 ,  \lambda_j \ge 0 ,  s_i^- \ge 0 ,  s_i^+ \ge 0 ,  s_i^- = 0 \text{ for } i \in I_N .

z_q = \theta_q - \varepsilon \left( \sum_{i \in I} s_i^- + \sum_{i \in O} s_i^+ \right) \rightarrow \min   (3)

subject to

\sum_{j=1}^{n} x_{ij} \lambda_j + s_i^- = \theta_q x_{iq} ,  i \in I ,
\sum_{j=1}^{n} y_{ij} \lambda_j - s_i^+ = y_{iq} ,  i \in O ,   (4)
\sum_{j=1}^{n} \lambda_j = 1 ,  \lambda_j \ge 0 ,  s_i^- \ge 0 ,  s_i^+ \ge 0 .
3. Analysis of the effects of exogenous factors – a regression analysis examined other factors that could affect the scores of efficiency and effectiveness. The line length, the average travel speed, the population living in the municipalities served by individual lines and the amount of compensation to cover losses were considered as exogenous factors. Given that the efficiency/effectiveness scores of the DEA method take values in the interval (0; 1], it is often recommended to use the Tobit model instead of "classical" regression based on the method of least squares. McDonald, however, shows that there are no practical reasons for that: neither the efficiency nor the effectiveness scores are generated by censored processes [10].
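As promised above, the following minimal sketch shows how the radial first stage of the NC-IV model (slacks omitted for brevity) can be solved as a linear programme with SciPy; the data, function name and variable names are illustrative assumptions and do not reproduce the paper's dataset or the OSDE-GUI software actually used.

```python
# Minimal sketch of an input-oriented VRS DEA model with a non-controllable input
# (radial first stage only, slacks omitted), solved as a linear programme.
import numpy as np
from scipy.optimize import linprog

def dea_ncn_input_vrs(XK, XN, Y, q):
    """XK: (mK, n) controllable inputs, XN: (mN, n) non-controllable inputs,
    Y: (s, n) outputs, q: index of the evaluated unit. Returns the radial score theta."""
    n = XK.shape[1]
    c = np.r_[1.0, np.zeros(n)]                     # minimise theta; variables = [theta, lambdas]
    A_ub, b_ub = [], []
    for i in range(XK.shape[0]):                    # controllable inputs scaled by theta
        A_ub.append(np.r_[-XK[i, q], XK[i, :]]); b_ub.append(0.0)
    for i in range(XN.shape[0]):                    # non-controllable inputs kept at the unit's level
        A_ub.append(np.r_[0.0, XN[i, :]]); b_ub.append(XN[i, q])
    for r in range(Y.shape[0]):                     # outputs at least at the unit's level
        A_ub.append(np.r_[0.0, -Y[r, :]]); b_ub.append(-Y[r, q])
    A_eq = [np.r_[0.0, np.ones(n)]]                 # VRS convexity constraint on lambdas
    b_eq = [1.0]
    bounds = [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=bounds, method="highs")
    return res.x[0]

# toy example: 4 units, 1 controllable input, 1 non-controllable input, 1 output
XK = np.array([[2.0, 4.0, 3.0, 5.0]])
XN = np.array([[1.0, 1.0, 2.0, 2.0]])
Y  = np.array([[1.0, 2.0, 1.5, 2.0]])
print([round(dea_ncn_input_vrs(XK, XN, Y, q), 3) for q in range(4)])
```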
4 Research results
Table 1 provides an overview of the evaluated express and fast train lines, including their initial and terminal destinations. For each line, the coefficients of cost efficiency CE (the result of the NC-IV model), service effectiveness SE (BCC-I model) and overall efficiency OE (NC-IV model) are given. Table 1 shows that the cost efficiency is quite high; to achieve cost efficiency, inputs would need to decrease on average by only 4%. The lines R21, R22 and R15, where controllable inputs would need to be reduced by up to 47% while maintaining the existing volume of outputs, were the least cost efficient. For each inefficient line, the peer lines are given. Due to the lack of space, the weights of the peer units are not included. For example, for the line R21, the peer units are the lines R14 (0.064), R16 (0.686) and R23 (0.25), the figures in parentheses being the weights. To achieve cost efficiency, it would be necessary to reduce fuel costs by 53%, labor costs by 49% and depreciation by 47% on the line R21; the fees for using the infrastructure remain the same as an uncontrollable input. Service effectiveness in the second stage is relatively low; inputs in the second phase would have to be reduced by an average of 49%. Only four lines lie on the frontier - Ex1, Ex3, Ex4 and N28. The lowest level of service effectiveness was reached by the R26, R14 and R11 lines, where inputs would have to be reduced by up to 77%. The overall efficiency indicates the need to reduce inputs by 10% on average. Nineteen lines lie on the efficient frontier; the least efficient are the lines R15, R21 and R27, where it would be necessary to reduce inputs by up to 56%. Table 2 shows the necessary reduction of individual cost items to reach the efficient frontier. The results indicate that, in terms of all levels of efficiency and effectiveness, the best rated are the Ex3, Ex4 and N28 lines, which are operated on corridor tracks. In the last step, further exogenous factors which could affect the individual scores of efficiency and effectiveness were examined (Table 3). It was found that in the first model none of the considered factors influences the cost efficiency score. For the service effectiveness score, a moderately strong dependence on the length of the line (r = 0.4550) and a weaker, but significant, dependence on the average travel speed (r = 0.3759) were found. It can be concluded that the longer the route of the train, the more customers it serves, which should have a positive impact on sales; similarly, a higher speed should attract more passengers and increase the revenues of the line. In the last model, a moderately strong dependence of the overall efficiency score on the travel speed (r = 0.5544) was found. Conversely, there was no evidence that subsidies are a factor that affects the efficiency and effectiveness scores. Apparently this is due to the fact that they are provided automatically to cover the difference between the economically justified costs and the recorded revenues.
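For readers who want to see how the quoted percentage reductions arise, the radial score and the slacks of the NC-IV model translate into input targets in the usual DEA way (a sketch using the notation of equations (1)-(2)):

\hat{x}_{iq} = \sum_{j=1}^{n} \lambda_j x_{ij} = \theta_q x_{iq} - s_i^- , \qquad \text{required reduction} = 1 - \frac{\hat{x}_{iq}}{x_{iq}} = (1 - \theta_q) + \frac{s_i^-}{x_{iq}} .

For R21, with θ_q = 0.5291, the radial part alone implies a cut of about 47 %, and the input-specific slacks raise the cuts for fuel and labor costs to the 53 % and 49 % quoted above.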
5 Conclusion
Each model reflects efficiency and effectiveness from a different perspective. The first model shows that, from the perspective of the operator, inputs were used quite efficiently to create the service offer. The second model shows how passengers used this offer, or rather how successfully this offer was sold to passengers; this effectiveness was relatively low. It appears that on some lines there is either a potential to increase the output (e.g. by improving the provided services) or to reduce the scale of operations. The third model assesses the efficiency on an aggregated basis, from the point of view of the initial inputs and the final output. It can also be assessed from the perspective of society as a whole, because the loss from operating the lines is compensated to the operator by the ordering party. The overall efficiency was relatively high; however, there are fast train lines with a very low overall efficiency score. In terms of practical implications, it is interesting that when the MoT was looking for savings in 2012, it abolished the ordered connections on the lines R17 and R25, which showed a relatively high score of efficiency and effectiveness in this analysis, while on the least efficient lines, R27 and R15, the scale of operation remained unchanged. The discussion in professional journals, suggesting little willingness of the MoT to publish data on loss rates for individual lines, shows that little attention is paid to the issue of the efficiency of individual lines. DEA is a fairly powerful tool that could help to objectively compare the effectiveness and efficiency of fast train lines operated in the public interest.
References [1] Chiou, Y. Ch., and Chen, Y. H.: Route-based performance evaluation of Taiwanese domestic airlines using data envelopment analysis. Transportation Research 42 (2006), 116–127. [2] CZSO: Městská a obecní statistika [Municipal statistics][online]. Czech Statistical Office, Prague, 2016. Available at https://vdb.czso.cz/mos. [3] Daraio, C., Diana, M., Di Costa, F., Leporelli, C., and Matteucci, G.: Efficiency and effectiveness in the urban public transport sector: A critical review with directions for future research. European Journal of Operational Research 248 (2016), 1–20. [4] Doomernik, J. E.: Performance and Efficiency of High-Speed Rail Systems. Transportation Research Procedia 8 (2015), 136–144. [5] Düzakin, E., and Düzakin, H.: Measuring the Performance of Manufacturing Firms with Super Slacks based Model of Data Envelopment Analysis: An Application of 500 Major Industrial Enterprises in Turkey. European Journal of Operational Research 182 (2007), 1412–1432. [6] Graham, D. J.: Productivity and efficiency in urban railways: Parametric and non-parametric estimates. Transportation Research 44 (2008), 84–99. [7] Jablonský, J., and Dlouhý, M.: Modely hodnocení efektivnosti produkčních jednotek [Models of efficiency assessment of decision-making units]. Professional Publishing, Prague, 2004. [8] Jitsuzumi, T., and Nakamura, A.: Causes of inefficiency in Japanese railways: Application of DEA for managers and policymakers. Socio-Economic Planning Sciences 44 (2010) 161–173. [9] Mallikarjun, S., Lewis, H. F., and Sexton, T. R.: Operational performance of U.S. public rail transit and implications for public policy. Socio-Economic Planning Sciences 48 (2014), 74–88. [10] McDonald, J.: Using least squares and tobit in second stage DEA efficiency analyses. European Journal of Operational Research 197 (2009), 792–798. [11] MoT: Aktualizovaný harmonogram soutěží provozovatelů dálkové osobní železniční dopravy [Updated tender schedule for operators of long-distance passenger rail transport][online]. Ministry of Transport, Prague, 2014. Available at http://www.mdcr.cz/cs/verejna-doprava/nabidkova-rizeni. [12] MoT: Dodatek č. 1 ke Smlouvě o závazku veřejné služby v drážní osobní dopravě ve veřejném zájmu [Amendment Nr. 1 to the contract of public service obligation in rail passenger transport in the public interest] [online]. Ministry of Transport, Prague, 2009. Available at http://www.mdcr.cz/cs/verejna-doprava. [13] MoT: Smlouva o závazku veřejné služby v drážní osobní dopravě ve veřejném zájmu [Contract of public service obligation in rail passenger transport in the public interest][online]. Ministry of Transport, Prague, 2009. Available at http://www.mdcr.cz/cs/verejna-doprava/Smlouvy+o+ZVS/ smlouva-o-zavazku-verejnesluzby.htm. [14] MoT: Výkaz nákladů a výnosů z přepravní činnosti ve veřejné drážní osobní dopravě po jednotlivých linkách, období 1. 1. – 31. 12. 2011 [Statement of expenses and revenues from transport activities in the public passenger rail transport by individual lines, period 2011-01-01 – 2011-12-31][online]. Ministry of Transport, Prague, 2013. Available at http://www.mdcr.cz/NR/rdonlyres/EAA58F20-01D1-4951-BF16C4D2D546B91D/0/245_odpovII.pdf. [15] Roháčová, V.: A DEA based approach for optimization of urban public transport system. Central European Journal of Operations Research 23 (2015), 215–233. [16] Yu, M. M.: Assessing the technical efficiency, service effectiveness, and technical effectiveness of the world’s railways through NDEA analysis. 
Transportation Research 42 (2008), 1283–1294
[17] Yu, M. M., and Lin, E. T. J.: Efficiency and effectiveness in railway performance using a multi-activity network DEA model. Omega 36 (2008), 1005–1017.
[18] Zhao, Y., Triantis, K., Murray-Tuite, P., and Edara, P.: Performance measurement of a transportation network with a downtown space reservation system: A network-DEA approach. Transportation Research 47 (2011), 1140–1159.

Nr    FROM – TO                  CE       Peer             SE       Peer        OE       Peer
Ex1   PHA–OSTRAVA–PL/SK          0.9739   3, 5, 19, 28     1.0000               1.0000
Ex2   PHA–LUHACOVICE/ZLIN        1.0000                    0.7557   1, 4, 28    1.0000
Ex3   DE–USTI–BRECLAV–AT         1.0000                    1.0000               1.0000
Ex4   BRECLAV–OSTR–BOHUMIN       1.0000                    1.0000               1.0000
R5    PHA–USTI–CHEB              1.0000                    0.4200   4, 28       1.0000
R6    PHA–PLZEN-CHEB/KLATOVY     1.0000                    0.3726   4, 28       1.0000
R7    PRAHA–CB–AT                1.0000                    0.3958   4, 28       1.0000
R8    BRNO–BOHUMIN               0.9591   5, 7, 9, 19      0.4067   4, 28       0.9562   5, 9
R9    PHA–HB–BRNO                1.0000                    0.4029   4, 28       1.0000
R10   PHA–TRUTNOV                1.0000                    0.3712   4, 28       0.7874   12, 18, 19, 28
R11   BRNO–PLZEN                 1.0000                    0.2819   1, 4, 28    1.0000
R12   BRNO–SUMPERK               1.0000                    0.3774   4, 28       1.0000
R13   BRNO–BRECLAV–OLOMOUC       0.9868   7, 12, 19, 23    0.3640   4, 28       0.9850   3, 5, 19, 23
R14   PARDUBICE–LBC              1.0000                    0.2774   4, 28       1.0000
R15   USTI–LBC                   0.8383   12, 14, 16, 24   0.3483   4, 28       0.4370   4, 14, 16, 28
R16   PLZEN–MOST                 1.0000                    0.4561   4, 28       1.0000
R17   PARDUBICE–JIHLAVA          1.0000                    0.9329   4           1.0000
R18   PHA–VSETIN/LUHACOVICE      1.0000                    0.3663   1, 4, 28    1.0000
R19   PHA–CT–BRNO                1.0000                    0.3441   4, 28       1.0000
R20   PHA–USTI–DECIN             0.9748   7, 12, 23        0.4642   4, 28       0.9944   1, 5, 12, 23
R21   PHA–TANVALD                0.5291   14, 16, 23       0.3669   4, 28       0.4656   4, 14, 16, 28
R22   KOLIN–RUMBURK              0.6899   14, 16, 23       0.3214   4, 28       0.5492   4, 14, 16, 28
R23   KOLIN–USTI                 1.0000                    0.3278   4, 28       1.0000
R24   PHA–RAKOVNIK               1.0000                    0.4506   4, 28       0.5579   3, 4, 14
R25   RAKOVNIK–CHOMUTOV          1.0000                    0.7766   4           1.0000
R26   PHA–PRIBRAM–CB             1.0000                    0.2343   4, 28       1.0000
R27   OSTRAVA–KRNOV–JESENIK      1.0000                    0.3453   4, 28       0.4916   3, 4, 23
N28   PHA–OLOMOUC–PL/SK          1.0000                    1.0000               1.0000
AVERAGE                          0.9626                    0.5057               0.9009
SD                               0.1031                    0.2472               0.1886

Table 1 Coefficients of cost efficiency, service effectiveness and overall efficiency

DMU Nr   Consumption of energy   Personnel expenses   Depreciations
R8       4.38                    5.96                 50.81
R10      21.26                   21.26                21.26
R13      1.50                    1.50                 10.10
R15      56.30                   56.30                56.30
R20      0.56                    11.21                0.56
R21      53.44                   53.44                53.44
R22      45.08                   45.08                45.08
R24      44.21                   44.21                44.74
R27      50.84                   50.84                71.50
Table 2 Cost reduction (%) necessary for achieving overall efficiency

Dependent variable   Independent variable   Slope        t-statistic   p-value
CE                   Length                 1.3559E-4    1.1625        0.2556
CE                   Speed                  0.0025       1.7367        0.0943
CE                   Population             2.0440E-8    0.7307        0.4715
CE                   Subsidy                1.9258E-7    1.0679        0.2954
SE                   Length                 6.65678E-4   2.6052        0.0150
SE                   Speed                  0.0069       2.0687        0.0487
SE                   Population             8.3751E-8    1.2736        0.2141
SE                   Subsidy                1.0389E-7    0.2353        0.8158
OE                   Length                 3.65408E-4   1.7660        0.0891
OE                   Speed                  0.0078       3.3964        0.0022
OE                   Population             6.1199E-8    1.2163        0.2348
OE                   Subsidy                3.7413E-7    1.1365        0.2661

Table 3 Regression analysis of exogenous factors
Title: 34th International Conference Mathematical Methods in Economics 2016, Conference Proceedings
Author: composite authors – participants of the international conference
Intended for: participants of the international conference, researchers and the professional public
Publisher: Technical University of Liberec, Studentská 1402/2, Liberec
Authorized by: rectorate of the Technical University of Liberec, reference number RE 31/16, as on the 25th July 2016
Published: in September 2016
Pages: 951
Edition: 1.
Press: Vysokoškolský podnik Liberec, spol. s r.o., Studentská 1402/2, Liberec
Reference number of publication: 55-031-16
This publication has not been subject to a language check. All papers passed a double-blind review process.
ISBN 978-80-7494-296-9