e-ISSN: 2582-5208 International Research Journal of Modernization in Engineering Technology and Science Volume:02/Issue:09/September-2020
Impact Factor- 5.354
www.irjmets.com
PERFORMANCE ANALYSIS OF REGRESSION MODELS USING MYANMAR SALES DATA
Kyawt Kyawt San*1
*1Faculty of Information Science, University of Information Technology, Yangon, Myanmar.
ABSTRACT
Regression analysis is a widely used technique that plays a crucial role in sales estimation and prediction. It is applied to estimate future sales values, or the values of any variable, from information in other features. Regression methods are supervised machine learning algorithms, widely used in business to understand how variation in a set of independent variables influences a dependent one. In this paper, linear regression, random forest regression and K-Nearest Neighbors (KNN) regression are evaluated on a Myanmar supermarket sales dataset. The main purpose of the experiment is to compare the performance of these three regressors and to identify the most suitable one for supermarket sales data analysis. According to the experiment, the linear regression model performs best among them.
KEYWORDS: Analysis, regression analysis, supervised learning, linear regression, sales.
I. INTRODUCTION
Supervised machine learning techniques such as regression and classification come with evaluation measures, such as prediction error, that support model selection and assessment, so that the optimal model can be chosen for a given data set. Most supermarkets would make a better profit if they had a good estimate of their yearly sales. Currently, most of them use ad hoc tools or traditional statistical methods to estimate yearly sales. However, these approaches can run into many challenges and problems and may yield prediction models that perform poorly. A supermarket makes a profit only when more goods are sold and turnover is high. Therefore, estimating sales in order to increase yearly sales has become a pressing issue for every supermarket.
Sales data from high-performing supermarkets, produced by customers as they interact with the supermarket, has become valuable. Meaningful patterns and features extracted from these data are used to build machine learning models that can improve sales forecasting performance. There are many techniques for this kind of problem. Among them, machine learning has become a critical field because of its strong predictive performance. To predict an event, a machine learning model is built on training data, from which it learns patterns that it then uses to forecast unseen events. The main purpose of this paper is to present a comparative performance analysis of regression techniques for supermarket sales data. Machine learning models, namely multiple linear regression, KNN regression and random forest regression, are used to predict unseen events. Supermarket data from Myanmar is used for the experiments.
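The comparison workflow described above (fit each regressor on training data, then score it on held-out data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the Myanmar supermarket dataset, a single hypothetical feature (`quantity`) is used, mean absolute error is assumed as the evaluation metric, and random forest regression is left out to keep the example short and dependency-free. The helper names `fit_linear`, `fit_knn` and `mean_absolute_error` are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def fit_linear(xs, ys):
    """Ordinary least squares for a single feature; returns a predict function."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """KNN regression: average the targets of the k closest training points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean([y for _, y in nearest])

    return predict


def mean_absolute_error(model, xs, ys):
    """Average absolute difference between predictions and true targets."""
    return mean([abs(model(x) - y) for x, y in zip(xs, ys)])


# Synthetic stand-in data (assumption): sales = 50 + 3 * quantity + noise.
rng = random.Random(0)
quantity = [rng.uniform(1, 10) for _ in range(200)]
sales = [50 + 3 * q + rng.gauss(0, 1) for q in quantity]

# Hold out the last 25% of the samples as unseen test data.
split = 150
x_train, x_test = quantity[:split], quantity[split:]
y_train, y_test = sales[:split], sales[split:]

models = {
    "linear": fit_linear(x_train, y_train),
    "knn": fit_knn(x_train, y_train, k=5),
}
scores = {name: mean_absolute_error(m, x_test, y_test) for name, m in models.items()}

# Report the models from lowest to highest test error.
for name, mae in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: MAE = {mae:.3f}")
```

Scoring on held-out samples rather than on the training data matters here because KNN with a small `k` can fit the training set almost perfectly while generalizing worse on unseen data.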
II. RELATED WORKS
This section presents prior work on prediction using different machine learning techniques. Odegua [8] applied K-Nearest Neighbor, Random Forest and Gradient Boosting regression algorithms to sales estimation for a supermarket store. According to [8], random forest had the lowest mean absolute error among the three algorithms, and the regressor performs better when more data is observed. Shiwani et al. [10] predicted customers' spending amounts using different machine learning algorithms and compared the performance of these algorithms. In their experiment, two features, time and location, are considered the most important for the prediction. Their aim is to enhance the profit of the store.