
VOLUME 9

N° 4

2015

www.jamris.org

pISSN 1897-8649 (PRINT) / eISSN 2080-2145 (ONLINE)


JOURNAL OF AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS

Editor-in-Chief:
Janusz Kacprzyk (Polish Academy of Sciences, PIAP, Poland)

Associate Editors:
Jacek Salach (Warsaw University of Technology, Poland)
Maciej Trojnacki (PIAP, Poland)

Statistical Editor:
Małgorzata Kaliczynska (PIAP, Poland)

Advisory Board:
Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA)
Kaoru Hirota (Japan Society for the Promotion of Science, Beijing Office)
Jan Jabłkowski (PIAP, Poland)
Witold Pedrycz (ECERF, University of Alberta, Canada)

Co-Editors:
Roman Szewczyk (PIAP, Warsaw University of Technology)
Oscar Castillo (Tijuana Institute of Technology, Mexico)
Marek Zaremba (University of Quebec, Canada)

Language Editors: Grace Palmer (USA), Urszula Wiaczek

Typesetting: Ewa Markowska, PIAP

Executive Editor: Anna Ładan, aladan@piap.pl

Webmaster: Piotr Ryszawa, PIAP

Editorial Office:
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, POLAND
Tel. +48-22-8740109, office@jamris.org

Copyright and reprint permissions: Executive Editor.
The reference version of the journal is the e-version. Printed in 300 copies.

Editorial Board: Chairman - Janusz Kacprzyk (Polish Academy of Sciences, PIAP, Poland) Plamen Angelov (Lancaster University, UK) Adam Borkowski (Polish Academy of Sciences, Poland) Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany) Chin Chen Chang (Feng Chia University, Taiwan) Jorge Manuel Miranda Dias (University of Coimbra, Portugal) Andries Engelbrecht (University of Pretoria, Republic of South Africa) Pablo Estévez (University of Chile) Bogdan Gabrys (Bournemouth University, UK) Fernando Gomide (University of Campinas, São Paulo, Brazil) Aboul Ella Hassanien (Cairo University, Egypt) Joachim Hertzberg (Osnabrück University, Germany) Evangelos V. Hristoforou (National Technical University of Athens, Greece) Ryszard Jachowicz (Warsaw University of Technology, Poland) Tadeusz Kaczorek (Bialystok University of Technology, Poland) Nikola Kasabov (Auckland University of Technology, New Zealand) Marian P. Kazmierkowski (Warsaw University of Technology, Poland) Laszlo T. Kóczy (Szechenyi Istvan University, Gyor and Budapest University of Technology and Economics, Hungary) Józef Korbicz (University of Zielona Góra, Poland) Krzysztof Kozłowski (Poznan University of Technology, Poland) Eckart Kramer (Fachhochschule Eberswalde, Germany) Rudolf Kruse (Otto-von-Guericke-Universität, Magdeburg, Germany) Ching-Teng Lin (National Chiao-Tung University, Taiwan) Piotr Kulczycki (AGH University of Science and Technology, Cracow, Poland) Andrew Kusiak (University of Iowa, USA)

Mark Last (Ben-Gurion University, Israel) Anthony Maciejewski (Colorado State University, USA) Krzysztof Malinowski (Warsaw University of Technology, Poland) Andrzej Masłowski (Warsaw University of Technology, Poland) Patricia Melin (Tijuana Institute of Technology, Mexico) Fazel Naghdy (University of Wollongong, Australia) Zbigniew Nahorski (Polish Academy of Sciences, Poland) Nadia Nedjah (State University of Rio de Janeiro, Brazil) Duc Truong Pham (Cardiff University, UK) Lech Polkowski (Polish-Japanese Institute of Information Technology, Poland) Alain Pruski (University of Metz, France) Rita Ribeiro (UNINOVA, Instituto de Desenvolvimento de Novas Tecnologias, Caparica, Portugal) Imre Rudas (Óbuda University, Hungary) Leszek Rutkowski (Czestochowa University of Technology, Poland) Alessandro Saffiotti (Örebro University, Sweden) Klaus Schilling (Julius-Maximilians-University Wuerzburg, Germany) Vassil Sgurev (Bulgarian Academy of Sciences, Department of Intelligent Systems, Bulgaria) Helena Szczerbicka (Leibniz Universität, Hannover, Germany) Ryszard Tadeusiewicz (AGH University of Science and Technology in Cracow, Poland) Stanisław Tarasiewicz (University of Laval, Canada) Piotr Tatjewski (Warsaw University of Technology, Poland) Rene Wamkeue (University of Quebec, Canada) Janusz Zalewski (Florida Gulf Coast University, USA) Teresa Zielinska (Warsaw University of Technology, Poland)

Publisher: Industrial Research Institute for Automation and Measurements PIAP

If in doubt about the proper edition of contributions, please contact the Executive Editor. Articles are reviewed, excluding advertisements and descriptions of products. All rights reserved ©



JOURNAL OF AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS VOLUME 9, N° 4, 2015 DOI: 10.14313/JAMRIS_4-2015

CONTENTS

3 Development of Vibratory Part Feeder for Material Handling in Manufacturing Automation: A Survey
Udhayakumar Sadasivam
DOI: 10.14313/JAMRIS_4-2015/27

11 Preliminary Study of Hydrodynamic Load on an Underwater Robotic Manipulator
Waldemar Kolodziejczyk
DOI: 10.14313/JAMRIS_4-2015/28

18 Face Recognition Using Canonical Correlation, Discrimination Power, and Fractional Multiple Exemplar Discriminant Analyses
Mohammadreza Hajiarbabi, Arvin Agah
DOI: 10.14313/JAMRIS_4-2015/29

28 Improving Self-Localization Efficiency in a Small Mobile Robot by Using a Hybrid Field of View Vision System
Marta Rostkowska, Piotr Skrzypczynski
DOI: 10.14313/JAMRIS_4-2015/30

39 Design and Movement Control of a 12-Legged Mobile Robot
Jacek Rysinski, Bartlomiej Gola, Jerzy Kopec
DOI: 10.14313/JAMRIS_4-2015/31

48 ICS System Supporting the Water Networks Management by Means of Mathematical Modelling and Optimization Algorithms
Jan Studzinski
DOI: 10.14313/JAMRIS_4-2015/32

55 Development of Graphene Based Flow Sensor
Adam Kowalski, Marcin Safinowski, Roman Szewczyk, Wojciech Winiarski
DOI: 10.14313/JAMRIS_4-2015/33

58 Multiaspect Text Categorization Problem Solving: A Nearest Neighbours Classifier Based Approaches and Beyond
Slawomir Zadrozny, Janusz Kacprzyk, Marek Gajewski
DOI: 10.14313/JAMRIS_4-2015/34



Development of Vibratory Part Feeder for Material Handling in Manufacturing Automation: A Survey

Submitted: 13th June 2015; accepted: 28th August 2015

Udhayakumar Sadasivam
DOI: 10.14313/JAMRIS_4-2015/27

Abstract: In manufacturing automation, material handling plays a significant role. Material handling is the process of loading, placing or manipulating material. It must be performed efficiently, safely, accurately and in a timely manner, so that the right parts arrive in the right quantities at the right locations, at low cost and without damage. Material handling devices are generally designed around standard production machinery and integrated with specially made feeders. In assembly and processing operations, parts need to be presented in a preferred orientation, which is achieved using part feeders. Vibratory feeders are a typical example; they are commonly used in industry for sorting and orienting parts before assembly. This paper surveys the literature on the design and development of part feeders, from sensorless vibratory feeders to vision-based flexible part feeders.

Keywords: vibratory part feeder, flexible part feeder, conveying velocity

1. Introduction

Part feeders play a vital role in manufacturing industries. A part feeder has three major functions: storing, aligning and feeding. Feeders are used to make production faster, more convenient and less expensive. They are designed to supply a specific type of material as part of the production process, and they help maintain the flow of product needed for the next stage of the process. A part feeder takes in parts of arbitrary orientation and outputs them in a uniform orientation. Presenting parts in a preferred orientation is very useful in assembly and processing operations, and this is easily achieved with part feeders. Vibratory feeders are typical part feeders, commonly used in industry for sorting and orienting parts before assembly [1]. The ease of controlling the flow of bulk materials and their adaptability to processing requirements make vibratory feeders popular in manufacturing industries. Vibratory feeders provide a suitable alternative to manual labour, saving the manufacturer time and cost; labour can then be utilised for value-adding activities rather than non-value-adding activities such as segregating and stacking. Designing an industrial part feeder is time consuming and is a trial-and-error process. The designer has to take into account critical aspects such as the part to be fed, the number of parts, the material of the feeder, etc. [2]. This paper surveys published work in the area of the design and development of part feeding systems.

The natural resting orientation of a part is the way in which the part can rest naturally on a horizontal surface [3]. Fore-knowledge of the probabilities of the feasible natural resting orientations of a part is critical in developing an efficient part feeder [1], [4]. Hence, the literature on methods for determining the probability of natural resting orientations of parts was surveyed first, followed by the literature on the design and development of part feeding devices and flexible part feeding systems.

2. Determining the Probability of Natural Resting Orientations

Parts have to be oriented in a desired manner for automated assembly operations [4]. If the most probable natural resting orientation of the part is chosen as the preferred orientation, the need to re-orient parts is minimized; the most probable natural resting orientation is the one with the highest probability of occurrence. The greater the number of parts in the preferred orientation, the higher the efficiency of the part feeder [1]. Ngoi et al. [5] stated that components have to be fed and aligned in a proper orientation at high speed in automated assembly, and emphasized that for continuous feeding through a vibratory feeder the parts should be fed in the most probable natural resting orientation. They determined the probability of the natural resting orientations of parts using drop tests. Moll and Erdmann [6] focused on orienting parts with minimal sensing and manipulation, elaborating a new approach that orients parts through the manipulation of pose distributions. The pose distribution of a part dropped from an arbitrary height onto an arbitrary surface was determined through dynamic simulation, and the effects of the drop height and the shape of the support surface on the pose distribution were analyzed. They also derived a condition on the pose and velocity of a planar object in contact with a sloped surface that makes it possible to determine the final resting orientation of the part, and they validated the dynamic simulation results against experiments.

The experimental method of finding the most probable natural resting orientation is time consuming; industries therefore need mathematical models that predict it from part geometry [2]. The theoretical methods commonly discussed in the literature for determining the probability of natural resting orientations are the energy barrier method, the centroid solid angle method, the stability method and the critical solid angle method. These methods are discussed below.

2.1. Energy Barrier Method

This method was proposed by Boothroyd et al. [3]. The probability of a part coming to rest in a particular orientation is a function of the energy tending to prevent a change of orientation and the amount of energy possessed by the part when it falls into that resting orientation. For complex parts with more than two natural resting orientations the energy barrier is difficult to compute, so the method is preferred only for simple parts with a constant cross section and two natural resting orientations [2].

2.2. Centroid Solid Angle Method

This method was proposed by Lee et al. [7]. A solid angle of one steradian is subtended by the portion of a spherical surface whose area equals the square of the sphere's radius; the centroid solid angle is the solid angle subtended from the centroid of a part. The method is based on the assumption that the probability of a part resting in a particular orientation is directly proportional to the magnitude of the centroid solid angle and inversely proportional to the height of the centroid above that orientation. The centroid solid angle of an orientation is determined as follows:
1. Assume the part is resting on a flat surface in the orientation of interest.
2. Locate the centroid of the part (Figure 1).
3. Construct a pyramid with the centroid as the apex and the resting face of the part as the base (Figure 1).
4. Construct a sphere of arbitrary radius R, not exceeding the part height, with the centroid as its centre (Figure 1).
5. The volume common to the pyramid and the sphere is called the enveloped volume Ve, from which the centroid solid angle can be found (Figure 2).

Figure 1. Creation of pyramid with centroid as apex

Figure 2. Solid angle generation

Since a solid angle Wi encloses the volume Wi R³/3 within a sphere of radius R, the centroid solid angle of orientation 'i' is computed from the enveloped volume by

Wi = 3Ve / R³ (1)

If a part has 'n' natural resting orientations, the probability of the part coming to rest in orientation 'i' is obtained from Equation (2):

pi = (Wi / hi) / Σj=1..n (Wj / hj) (2)

where:
− pi is the probability of the part resting in orientation 'i',
− n is the number of natural resting orientations,
− Wi is the centroid solid angle subtended by orientation 'i' from the centroid, sr,
− hi is the height of the centroid above orientation 'i', mm,
− Wj is the centroid solid angle subtended from the centroid by orientation 'j', sr,
− hj is the height of the centroid above orientation 'j', mm.

The set of these probabilities is called the static probability profile.
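To make the computation concrete, the static probability profile of Equation (2) can be sketched in a few lines of Python. The solid angles and heights below are invented values for a hypothetical three-orientation part, not data from the surveyed papers.

```python
# Static probability profile from Equation (2): p_i is proportional to W_i / h_i.
# Illustrative sketch only; the W (sr) and h (mm) values are invented.

def static_probability_profile(W, h):
    """Normalize the per-orientation scores W_i / h_i into probabilities."""
    scores = [wi / hi for wi, hi in zip(W, h)]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical part with three natural resting orientations:
W = [2.0, 1.0, 0.5]     # centroid solid angles, sr
h = [10.0, 20.0, 25.0]  # centroid heights, mm

p = static_probability_profile(W, h)
# p ≈ [0.741, 0.185, 0.074]; orientation 0 is the most probable one
```

The same normalization pattern also serves Equations (3) and (4), with a different score per orientation.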

2.3. Stability Method

The stability method is based on logical analysis and was elaborated by Chua and Tay [8]. The larger the contact area with the base, the greater the stability; similarly, the lower and the nearer the centre of gravity is to the base, the higher the stability. The method rests on these two aspects: stability S is a function of the magnitude of the contact area Ar and the distance y of the centre of gravity from the base, with S proportional to Ar and inversely proportional to y. The generalized equation for the probability of orientation 'i' is given by Equation (3):

pi = Ni (Ari / yi) / Σj=1..n Nj (Arj / yj) (3)

where:
− pi is the probability of orientation 'i',
− Ni is the number of surfaces identical to, and inclusive of, the contacting surface,
− Ari is the contact area, mm²,
− yi is the distance from the base to the centre of gravity, mm.
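A corresponding sketch for Equation (3) differs from the previous one only in the score assigned to each orientation; the surface counts, areas and heights below are invented for a hypothetical rectangular block.

```python
# Stability method, Equation (3): p_i is proportional to N_i * A_r,i / y_i.
# All values are invented for illustration; N_i counts surfaces identical to
# (and including) the contacting surface.

def stability_profile(N, Ar, y):
    scores = [n * a / yi for n, a, yi in zip(N, Ar, y)]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical rectangular block: large faces, side faces, end faces.
N  = [2, 2, 2]              # identical-surface counts
Ar = [600.0, 300.0, 200.0]  # contact areas, mm^2
y  = [5.0, 10.0, 15.0]      # centre-of-gravity heights, mm

p = stability_profile(N, Ar, y)
# The large, low face dominates: p ≈ [0.73, 0.18, 0.08]
```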

2.4. Critical Solid Angle Method

This method is based on the hypothesis that the probability of a part coming to rest in a particular orientation is proportional to the difference between the centroid solid angle subtended by that orientation and the critical solid angle of that orientation for changing to a neighbouring orientation, and is inversely proportional to the height of the centre of gravity above that orientation [9]. The critical solid angle is the solid angle subtended by the resting orientation of the part with respect to the point lying on the line normal to that orientation and passing through the centre of gravity, at a height equal to the distance between the centre of gravity and the edge shared by that orientation and its neighbouring orientation. In other words, it is the least solid angle required for a change in the orientation of the part when the part rests on its edge. Whenever the part tries to shift to one of its neighbouring orientations, a different critical solid angle applies. Hence, the probability that a part comes to rest in orientation 'i' is proportional to the difference between the centroid solid angle subtended by that orientation and the average of the critical solid angles of that orientation towards its neighbours, and inversely proportional to the height of the centre of gravity above that orientation. The probability of occurrence of each orientation is given by Equation (4):

pi = ((Wi − Wci) / hi) / Σj=1..n ((Wj − Wcj) / hj) (4)

where Wci is the average of the critical solid angles of orientation 'i', sr.

Udhayakumar et al. [10] determined the most probable natural resting orientation for a family of sector-shaped parts using drop tests and the theoretical methods (the centroid solid angle, stability and critical solid angle methods). The effects of the initial orientation of the part during the drop and of the drop height on the natural resting orientation were also studied [11]. A Pearson χ² test for goodness of fit between the drop-test and theoretical results revealed that the null hypothesis could not be accepted at the 95% confidence level. The next section surveys the literature in the area of part feeders.
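The goodness-of-fit comparison described above can be sketched with `scipy.stats.chisquare`. The drop counts and theoretical profile below are invented for illustration and are not the data of [10] or [11].

```python
# Pearson chi-square goodness-of-fit between drop-test counts and a theoretical
# static probability profile. All numbers here are invented for illustration.
from scipy.stats import chisquare

observed = [140, 42, 18]                 # outcomes of 200 hypothetical drops
p_theory = [0.74, 0.185, 0.075]          # theoretical probability profile
expected = [200 * p for p in p_theory]   # expected counts under the theory

stat, p_value = chisquare(observed, f_exp=expected)
reject = p_value < 0.05   # True would mean theory and experiment disagree
```

Note that `chisquare` requires the observed and expected counts to have (nearly) equal sums, which is why the expected counts are scaled by the same total of 200 drops.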

3. Vibratory Part Feeders

This section deals with the literature related to the design and development of part feeders. About 50% of manufacturing cost and 40% of the workforce are dedicated to production assembly [12]. A part feeder takes in identical parts of arbitrary orientation and outputs them in a uniform orientation. In assembly processes, parts have to be shifted from one orientation to another, and the most humanoid way of doing this is gripping the part and then shifting it to the other orientation [13]. Vibratory feeders are commonly used for orienting parts in industries such as food processing, plastic component manufacturing and automobile manufacturing. The most important factor to consider when selecting a part feeder is the type of parts to be fed. Feeder sizes and types are determined by a variety of factors, such as part size and configuration, part abrasiveness, the condition of the part when handled, and the required feed rate. The design of industrial part feeders is a trial-and-error process that can take several months [2].

Berkowitz and Canny [14] developed a tool for testing feeder designs. The behavior of the system was evaluated using a Markov model: the probability that a part in any random orientation ends up in the desired/preferred orientation was computed through Markov analysis. For each gate, the probability of converting each pre-orientation into each post-orientation was computed, and the efficiency was calculated from these probabilities. They used the tool to simulate a feeder with an edge riser and concluded that future work was required to determine the accuracy of the simulation results on an actual feeder.

Lim [15] performed a dynamic analysis of the vibratory feeder, presenting a theoretical analysis of feeding on a track vibrating with simple harmonic motion. According to his analysis, the factors affecting the conveying velocity of a part on a vibratory feeder are the excitation frequency, the amplitude of vibration, the coefficient of friction and the track angle. A model was developed to predict the conveying velocity from these factors; its results followed the same pattern as the experimental results.

The motion of parts on a planar feeder consisting of a longitudinally vibrating flat plate was discussed by Reznik et al. [16]. They stated that the feed rate arises because the plate moves forward for a longer time than backward, combined with the non-linear nature of Coulomb friction, and they developed analytical expressions for the feed rate. Though the analytical results deviated from rigid-body dynamic simulation results, they followed the same pattern.

Many methods of sensorless part feeding have been discussed in the literature, including orienting and positioning using push forces, fences, etc. Akella and Mason [17] discussed the use of pushing actions for orienting and translating objects. Their paper covered:
1. the sequence of linear normal pushes for orienting and positioning polygonal objects;
2. the existence of a sequence of linear normal pushes to move any polygon from a start pose to a goal pose;
3. a polynomial-time pose planner that generates a push sequence to orient any polygon to the goal pose.

Berretty et al. [18] demonstrated that a polygonal part can be oriented using fences placed along a conveyor belt: by the end of the conveyor, a part in any pose has been converted to a unique final pose. They developed an algorithm for producing a fence design of minimal length (i.e., with the fewest fences). The results proved good for parts with acyclic left and right environments, but could not be generalized to arbitrary parts. Lynch [19] augmented a 1JOC (Joint Over Conveyor) with a prismatic joint that allows the fence to move vertically, naming the result the 2JOC, and attempted the feeding of 3D parts on the conveyor by combining toppling with the ability of the 1JOC to perform conveyor-plane feeding. He also proposed the idea of developing inexpensive part feeders using


toppling and pushing actions. He also derived the mechanical conditions for toppling.

Bohringer et al. [20] developed programmable equipment that uses a vibrating surface to position and orient parts without sensor feedback or force closure, based on the dynamic modes of the vibrating surface. They explained the apparatus using planar objects and developed polynomial-time algorithms for generating sequences of force fields.

Manipulating a planar rigid part on a conveyor belt using a robot with just one joint was explored by Akella et al. [21]. Their approach provides a simple and flexible method for feeding parts in industry. The 1JOC approach uses a fixed-velocity conveyor along with a single servoed joint to obtain the diversity of motions required for planar manipulation. They proved that the 1JOC is capable of useful planar manipulation (any polygon is controllable from a broad range of initial configurations to any goal chosen from a broad range of goal configurations) and demonstrated that the sensorless 1JOC can position and orient polygons without sensing.

Berretty et al. [22] discussed sequences of mechanical devices, such as wiper blades and grooves, that filter polygonal parts on a track; they termed these 'traps'. Several trap gates were discussed, such as the balcony, gap, canyon and slot. A trap is a series of mechanical barriers (also known as gates) that either reject disoriented parts or reorient them to the desired orientation; the former type is known as a passive trap and the latter as an active trap. Active traps are preferred over passive traps, since the efficiency of an active trap is 100%. These traps are mounted at the exit of the vibratory feeder. Vibratory bowl feeders are suitable for smaller parts, whereas linear vibratory feeders can be used for handling larger parts. Active devices convert any orientation of the part to the desired orientation, whereas passive devices reject disoriented parts; the sequence of placement of passive and active devices depends on the orientation to be obtained at the output.

Wiendahl and Rybarczyk [23] presented the possibilities and potentials of aerodynamic part feeding, i.e., a permanent air field that forces a part into the desired orientation without the need for any sensor control. They elaborated on three different aerodynamic feeding methods. The orientation method is based on the behavior of workpieces in a field of air flow; the usable part characteristics are the general air resistance and the centre of gravity. The tipping method is applied for orienting a workpiece whose axis of rotation is parallel to the direction of transport; the relevant part characteristics are the local air resistance, the projected shape and the centre of gravity. The rotating method was developed for orienting workpieces whose axis of rotation is perpendicular to the direction of transport; the possible part characteristics are the air resistance, the centre of gravity and the projected shape.

Jiang et al. [24] developed 3D simulation software for part feeding and orienting in a vibratory bowl feeder. A mathematical model of part motion


and its behavior in the orienting mechanism was determined. Based on the model, the 3D simulation software was developed using Java; the computer simulation results agreed well with the experimental results.

Force analysis and dynamic modeling of a vibratory feeder were presented by Richard et al. [25]. The vibratory feeder was treated as a three-legged parallel mechanism, and the geometric properties of the feeder were determined. The effects of the leaf-spring legs were converted to forces and moments acting on the base of the bowl, and a dynamic model integrating the angular displacement of the bowl with the displacement of the leaf-spring legs was developed. Newtonian and Lagrangian approaches were used to verify the model.

Goemans et al. [26], [27] introduced a new class of geometric primitives, called blades, to feed a class of 3D parts sensorlessly by reorienting and rejecting all but a desired orientation. A blade receives identical polyhedral parts in arbitrary orientations as input and outputs parts in one single orientation. The blade is a horizontally mounted convex polygonal metal plate attached to the feeder wall, parallel to the track, with a triangular segment and a rectangular segment. The three parameters characterizing a blade are the blade angle, the blade height and the blade width.

Vose et al. [28] used force fields for sensorless part orientation. They developed a large family of programmable frictional force fields by vibrating a rigid plate, and stated that the field strength and the squeeze line are easily controllable in a six-degree-of-freedom implementation.

Ramalingam and Samuel [29] investigated the behavior of a linear vibratory feeder used for conveying small parts. A rotating drum with radial fins was designed and developed for carrying out the experiments, and a tumbling-barrel hopper was developed for feeding the components onto the track. They considered the parameters affecting the feed rate and conveying velocity of the part, such as the barrel dimensions, the amplitude and angle of vibration, the coefficient of friction and the operating frequency, and determined the influence of these parameters experimentally.

Three different types of sensorless part feeding devices for handling asymmetric parts were discussed by Udhayakumar et al. [30]. They inferred that the efficiency of the feeder increases with the number of passes, determined the effect of the excitation frequency and vibration amplitude on the velocity of the part on a vibratory feeder, and presented a model for determining the part velocity. A trap-based vibratory part feeder for conveying brakeliners was developed by Udhayakumar et al. [31]; the trap had an efficiency of 100%. An expression relating the conveying velocity of the part to the excitation frequency, vibration amplitude and trap inclination angle was obtained through regression analysis, and the developed set-up reduced the time taken for stacking 80 parts by 13.5%.

To investigate the dynamic behavior of the fed part, a 2D numerical model based on the discrete element method was developed by Ashrafizadeh and Ziaei-Rad [32]. The fed part was assumed to be rectangular, with three degrees of freedom. Simulation showed good agreement between the calculated and experimental data. They concluded that the coefficient of friction plays a critical role in the sliding regime but not in the hopping regime, and the proposed model was capable of demonstrating the periodic and chaotic behavior of the part.

A novel vibratory feeder, called the decoupled vibratory feeder (DVF), was discussed by Liang Han et al. [33]. In the DVF, excitation is provided in two mutually perpendicular directions, and the governing parameters, such as the vibration angle, the excitation frequency, the waveform of the driving signals and the phase angle between the vertical and horizontal excitations, are adjusted through software. They also developed a test system to evaluate the performance of the electromagnets. The prediction of appropriate parameters for conveying brake pads on a vibratory feeder was discussed by Suresh et al. [34], who determined the optimal frequency, trap angle and track angle using a linear regression model.
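As an illustration of the regression modelling mentioned in [31] and [34], a least-squares fit of conveying velocity against excitation frequency and vibration amplitude can be sketched as follows. The data points are synthetic and the linear model form is an assumption for illustration, not the published models.

```python
# Least-squares fit of conveying velocity v against excitation frequency f (Hz)
# and vibration amplitude a (mm), in the spirit of the regression models of
# [31], [34]. All data points are synthetic, for illustration only.
import numpy as np

f = np.array([30.0, 35.0, 40.0, 45.0, 50.0])   # excitation frequency, Hz
a = np.array([0.5, 0.6, 0.4, 0.7, 0.5])        # vibration amplitude, mm
v = np.array([12.0, 16.0, 15.0, 24.0, 21.0])   # measured velocity, mm/s

# Assumed model: v ≈ c0 + c1*f + c2*a
X = np.column_stack([np.ones_like(f), f, a])
coef, *_ = np.linalg.lstsq(X, v, rcond=None)

# The fitted model can then predict velocity for untested settings:
v_new = float(coef @ [1.0, 42.0, 0.6])
```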

4. Flexible Part Feeders


Tay et al. [37] developed a flexible and programmable vibratory bowl feeding system for use in a flexible manufacturing system. The feeding system was capable of identifying the orientation of non-rotational parts and re-oriented them into desired orientation. It was equipped with programmable passive and active orienting devices which allowed them to handle variety of parts. Nine specially designed stations were present along the track of the feeder for feeding of non-rotational parts. These stations were controlled by both the computer sub-system and PLC (Programmable Logic Controller) sub-system. The orientation of the part was identified using neural networks. Optical sensors were used to identify the internal features such as holes and pockets. Three types of neural network architectures were tried for pattern recognition and classification of feed orientation of parts in the feeder. Chua [40] stated that the flexibility of assembly system is critical for survival in competitive manufacturing market. He also discussed the need for feeding system to handle asymmetric parts with high efficiency. He developed a part feeding system to handle cylindrical parts of different aspect ratios. His system included a singularity unit, V-belt orientator, transfer mechanism with aluminium plate and an unloading module with delivery chute and re-orientation. Udhayakumar et al. [41] developed an adaptive part feeder for handling sector-shaped parts. This feeder was able to accommodate a family of sector shaped parts. Capacitive sensors were employed to determine the size of part. Based on the size of the part, the feeding system was modified accordingly, to convey the part. A regression model was developed to determine the conveying velocity based on excitation frequency and amplitude of vibration. A part feeding system based on piezoelectric vibratory conveyors was developed by Urs Leberle and JĂźrgen Fleischer [42]. 
Different variety of parts, including very delicate parts, could be fed in this conveyor. The design and commissioning of the conveyor set-up was discussed.

This section includes literature on non-vision based and vision based flexible part feeding systems. Boehlke et al. [35] stated that 50% of failure in automation systems attribute to custom built vibratory feeder. Janeja and Lee [36] stated that if the existing orienting elements were able to be adjusted rapidly, then this would convert a rigid design to a flexible one without any sacrifice in its efficiency. This would be able to handle a family of similar parts. Flexible feeders have the capability to accommodate most of the parts of one or more family, with minimum changeover time [37]. The flexible part feeders can be generally classified in to the following two classes: 1. Feeders that are non-vision based and rely on simple sensors or reconfigurable gates to handle the parts of one or more family. 2. Vision based feeders that depend on vision cameras to handle the parts of one or more family.

4.1. Non-vision Based Flexible Part Feeders

4.2. Vision Based Flexible Part Feeders

The concept of using a LED (Light Emitting Diode) sensor to determine the part orientation was presented by Akella and Mason [38]. A LED sensor was used to measure the resting diameter or width of polygonal parts. An array of LEDs was arranged on the side of conveyor and a set of photo resistors on the opposite parallel side. By the LED-photo resistors blocked by the part, the resting diameter or width of polygonal parts was identified. Based on this partial information from the sensor, a robot was programmed to execute a sequence of push-align operations to orient the part. Sim et al. [39] stated that programmable part feeders that can handle parts of one or more part families with short changeover times are highly in need. The capability of neural network based pattern recognition algorithm for recognition of parts was developed. Three fiber optic sensors mounted on vibratory bowl feeder were used to scan the surface of each feeding part. The scanned signature was used as input to neural network models to identify the part.

Causey et al. [43] presented the design and development of a flexible parts feeding system built around three conveyors working together. The first, inclined conveyor lifted parts from a bulk hopper. From the first conveyor, the parts were transferred to a horizontally mounted second conveyor. An under-lit window presented a silhouette image of the parts to a vision system; based on this, the pose of each part was determined and a robotic arm acquired it. Parts whose orientation could not be inferred, and overlapping parts, were returned to the bulk hopper by a third conveyor. Guidelines for improving the performance of a part feeding system were also discussed. Gudmundsson and Goldberg [44] analyzed the use of vision cameras in robotic part feeding. They found that the throughput of a part feeding system could be limited by starvation, where no part is visible to the camera, and by saturation, where so many parts are visible that the robot cannot identify part orientations and grasp a part.



Journal of Automation, Mobile Robotics & Intelligent Systems

Chen et al. [45] developed a smart machine vision system for the inspection of solder paste on printed circuit boards (PCBs). Machine vision was considered because of its advantages over traditional manual inspection in efficiency and accuracy. The proposed system included two modules, LIF (Learning Inspection Features) and OLI (On-Line Inspection). The LIF module learnt the inspection features from the CAD files of a PCB, while the OLI module inspected the boards on-line. The detection accuracy exceeded 97% when the system was deployed in the manufacturing line. Sumi and Kawai [46] proposed a new method for 3D object recognition in cluttered environments using segment-based stereo vision. Based on the position and orientation of the object, a robot was signaled to pick and manipulate it. Objects of different shapes (planar figures, polyhedra, free-form objects) were used to demonstrate the concept. Khan et al. [47] used a vision set-up to inspect parts arriving on a conveyor for defects based on their size, shape, color and dimensions, with the camera mounted over the conveyor belt. Based on the output of the vision system, a lever attached to a stepper motor directed each part to the accepted or rejected tray. The accuracy of the system was found to be about 95%. An overview of vision-based systems was given by Han et al. [48]. They stated that conventional part feeders were effective for specific types of parts, but had limitations where families of parts (similar in shape but varying in size) were to be handled, and described the current design and retooling of feeders as a black art. They developed a vision-based vibratory feeder in which the major feeding parameters, such as vibration angle, frequency, amplitude and phase difference, could be adjusted on-line in software. The system was capable of handling a wide range of parts without retooling and of eliminating part jamming, and the best operating frequency was determined automatically through frequency response analysis. Mahalakshmi et al. [49] stated that template matching has created a revolution in the field of computer vision and has provided a new dimension to image processing; they discussed the significance of the various template matching algorithms. A flexible vibratory feeding system based on a vision camera was proposed by Han and Li [50]. The developed system identified parts in the preferred orientation; parts in other orientations were sent back to the bowl. Auto vision software was used to identify the parts.
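The core of the template matching technique surveyed in [49] can be sketched as a sliding-window normalized cross-correlation; the image and pattern below are synthetic, and the brute-force loop stands in for the optimized implementations that real vision packages use:

```python
import numpy as np

def match_template(image, template):
    """Slide `template` over `image` and return the (row, col) of the
    best normalized cross-correlation score."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            win = image[r:r+th, c:c+tw]
            w = win - win.mean()
            denom = np.sqrt((w**2).sum() * (t**2).sum())
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

# Embed a small pattern at a known offset and recover its location.
img = np.zeros((20, 20))
pat = np.array([[1., 2.], [3., 4.]])
img[7:9, 11:13] = pat
print(match_template(img, pat))  # (7, 11)
```

For production use, library routines (e.g. OpenCV's matchTemplate) implement the same idea far more efficiently.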

5. Conclusions


Automation is growing at a rapid pace in today's world. Having understood the significance of automation for the success and growth of industrial operations, many companies are investing in bringing the latest technologies to their processes. Factory automation aims at minimizing manual, personnel-intensive work across industrial, production and manufacturing processes. Vibratory feeders are suitable for feeding parts to subsequent processes on special-purpose machines in the mechanical, electrical, pharmaceutical, bearing, optical, fastener and many other industries. This paper surveyed the literature on the design and development of vibratory part feeders. The scope of the survey ranged from identifying the most probable natural resting orientation to the development of flexible part feeders. From the literature survey it can be seen that many further advances could be made in vibratory part feeding technology, so that feeders become both highly flexible and inexpensive. Further research could focus on flexible part feeders that can handle a variety of parts without retooling, at an optimum feeding rate. The conveying velocity of parts on the feeder should be predictable in order to maintain continuous flow; more work is therefore required on predictive models of conveying velocity, and part behavior on feeders should be studied extensively.
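As a minimal illustration of such a predictive model, a least-squares line can be fitted to measured conveying velocities (the amplitude–velocity data below are synthetic, chosen only to show the fitting step; cf. the linear-regression approach of [34]):

```python
import numpy as np

# Synthetic (illustrative) measurements: conveying velocity [mm/s] vs
# vibration amplitude [mm] at fixed frequency; fit v = a*A + b.
A = np.array([0.02, 0.04, 0.06, 0.08, 0.10])   # amplitude [mm]
v = np.array([ 4.1,  8.0, 12.2, 15.9, 20.1])   # conveying velocity [mm/s]

a, b = np.polyfit(A, v, 1)       # least-squares linear fit
v_pred = a * 0.05 + b            # predicted velocity at A = 0.05 mm
print(v_pred)
```

A real model would, as the survey notes, also need to capture frequency, vibration angle, friction and part geometry, where a simple linear fit no longer suffices.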

AUTHOR

Udhayakumar Sadasivam – Department of Mechanical Engineering, PSG College of Technology, Coimbatore – 641004. Tamilnadu, INDIA. Phone: +91-422-4344271 E-mail: udhaya_mech@yahoo.com

REFERENCES

[1] Lee S.G., Ngoi B.K.A., Lye S.W., Lim L.E.N., "An analysis of the resting probabilities of an object with curved surfaces", Int. J. Adv. Manuf. Technol., vol. 12, no. 5, 1996, 366–369. DOI: 10.1007/BF01179812.
[2] Cordero A.S., "Analyzing the parts behavior in a vibratory bowl feeder to predict the Dynamic Probability Profile", Master of Science Thesis, Mayaguez campus, University of Puerto Rico, 2004.
[3] Boothroyd G., Poli C.R., Murch L.E., Automatic Assembly, Marcel Dekker, 1982.
[4] Ngoi K.A., Lye S.W., Chen J., "Analysing the natural resting aspect of a prism on a hard surface for automated assembly", Int. J. Adv. Manuf. Tech., vol. 11, no. 6, 1996, 406–412. DOI: 10.1007/BF01178966.
[5] Ngoi B.K.A., Lim L.E.N., Ee J.T., "Analysis of natural resting aspects of parts in a vibratory bowl feeder – validation of drop test", Int. J. Adv. Manuf. Technol., vol. 13, 1997, 300–310. DOI: 10.1007/BF01179612.
[6] Moll M., Erdmann M., "Manipulation of pose distributions", International Journal of Robotics Research, vol. 21, no. 3, 2002, 277–292. DOI: 10.1177/027836402320556449.
[7] Lee S.S.G., Ngoi B.K.A., Lim L.E.N., Lye S.W., "Determining the probabilities of the natural resting aspects of parts from their geometries", Assembly Automation, vol. 17, no. 2, 1997, 137–142. DOI: 10.1108/01445159710171356.



[8] Chua P.S.K., Tay M.L., "Modeling the natural resting aspect of small regular shaped parts", Trans. ASME, vol. 120, 1998, 540–546.
[9] Ngoi K.A., Lye S.W., Chen J., "Analyzing the natural resting aspect of a complex shaped part on a hard surface for automated parts feeding", Proc. Instn. Mech. Engrs., vol. 211, part B, 1997, 435–442.
[10] Udhayakumar S., Mohanram P.V., Keerthi Anand P., Srinivasan R., "Determining the most probable natural resting orientation of sector shaped parts", Assembly Automation, vol. 33, no. 1, 2013, 29–37. DOI: 10.1108/01445151311294649.
[11] Udhayakumar S., Mohanram P.V., Krishnakumar M., Yeswanth S., "Effect of initial orientation and height of drop on natural resting orientation of sector shaped components", Journal of Manufacturing Engineering, vol. X, no. 2, 2011, 05–07.
[12] Boothroyd G., Assembly Automation and Product Design, CRC Press Taylor and Francis, 2005.
[13] Rao A., Kriegman D., Goldberg K., "Complete algorithm for feeding polyhedral parts using pivot grasps", IEEE Transactions on Robotics and Automation, vol. 12, no. 2, 1996, 331–342. DOI: 10.1109/70.488952.
[14] Berkowitz D.R., Canny J., "Designing parts feeders using dynamic simulation". In: Proceedings of IEEE International Conference on Robotics and Automation, 1996, 1127–1132. DOI: 10.1109/ROBOT.1996.506859.
[15] Lim G.H., "On the conveying velocity of a vibratory feeder", Computers and Structures, vol. 62, no. 1, 1997, 197–203.
[16] Reznik D., Canny J., Goldberg K., "Analysis of part motion on a longitudinally vibrating plate". In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 1, 1997, 421–427.
[17] Akella S., Mason M.T., "Posing polygonal objects in the plane by pushing", International Journal of Robotics Research, vol. 17, no. 3, 1998, 70–88.
[18] Berretty R.P., Goldberg K., Overmars M.H., van der Stappen A.F., "Computing fence designs for orienting parts", Computational Geometry, vol. 10, no. 4, 1998, 249–262. DOI: 10.1016/S0925-7721(98)00010-8.
[19] Lynch K.M., "Inexpensive conveyor based parts feeding", Assembly Automation, vol. 19, no. 3, 1999, 209–215. DOI: 10.1108/01445159910280074.
[20] Bohringer K.F., Bhatt V., Donald B.R., Goldberg K., "Algorithms for sensorless manipulation using a vibrating surface", Algorithmica, vol. 26, 2000, 389–429.
[21] Akella S., Huang W.H., Lynch K.M., Mason M.T., "Parts feeding on a conveyor with a one joint robot", Algorithmica, vol. 26, 2000, 313–344.
[22] Berretty R.P., Goldberg K., Overmars M.H., van der Stappen A.F., "Trap design for vibratory bowl feeders", The International Journal of Robotics Research, vol. 20, no. 11, 2001, 891–908. DOI: 10.1177/02783640122068173.
[23] Wiendahl H.P., Rybarczyk A., "Using air streams for part feeding systems – innovative and reliable solutions for orientation and transport", Journal of Materials Processing Technology, vol. 138, 2003, 189–195.
[24] Jiang M.H., Chua P.S.K., Tan F.L., "Simulation software for parts feeding in a vibratory bowl feeder", International Journal of Production Research, vol. 41, no. 9, 2003, 2037–2055. DOI: 10.1080/0020754031000123895.
[25] Silversides R., Dai J.S., Seneviratne L., "Force analysis of a vibratory bowl feeder for automatic assembly", J. Mech. Des., vol. 127, no. 4, 2004, 637–645. DOI: 10.1115/1.1897407.
[26] Goemans O.C., Goldberg K., van der Stappen A.F., "Blades for feeding 3D parts on vibratory tracks", Assembly Automation, vol. 26, no. 3, 2006, 221–226.
[27] Goemans O.C., Goldberg K., van der Stappen A.F., "Blades: a new class of geometric primitives for feeding 3D parts on vibratory tracks". In: Proceedings of IEEE International Conference on Robotics and Automation, 2005, 1730–1736. DOI: 10.1109/ROBOT.2006.1641956.
[28] Vose T.H., Umbanhowar P., Lynch K.M., "Vibration-induced frictional force fields on a rigid plate". In: IEEE International Conference on Robotics and Automation, 2007, 660–667. DOI: 10.1109/ROBOT.2007.363062.
[29] Ramalingam M., Samuel G.L., "Investigation on the conveying velocity of a linear vibratory feeder while handling bulk-sized small parts", Int. J. Adv. Manuf. Technol., vol. 44, no. 3–4, 2009, 372–382. DOI: 10.1007/s00170-008-1838-1.
[30] Udhayakumar S., Mohanram P.V., Deepak S., Gobalakrishnan P., "Development of sensorless part feeding system for handling asymmetric parts", The International Journal for Manufacturing Science and Production, vol. 10, no. 3–4, 2009, 267–277. DOI: 10.1515/IJMSP.2009.10.3-4.265.
[31] Udhayakumar S., Mohanram P.V., Keerthi Anand P., Srinivasan R., "Trap based part feeding system for stacking sector shaped parts", Journal of the Brazilian Society of Mechanical Sciences and Engineering, vol. 36, no. 2, 2014, 421–431. DOI: 10.1007/s40430-013-0086-y.
[32] Ashrafizadeh H., Ziaei-Rad S., "A numerical 2D simulation of part motion in vibratory bowl feeders by discrete element method", Journal of Sound and Vibration, vol. 332, no. 13, 2013, 3303–3314. DOI: 10.1016/j.jsv.2013.01.020.
[33] Han L., Wu W.-Z., Bian Y.-H., "An Experimental Study on the Driving System of Vibratory Feeding", TELKOMNIKA Indonesian Journal of Electrical Engineering, vol. 11, no. 10, 2013, 5851–5859. DOI: 10.11591/telkomnika.v11i10.3415.
[34] Suresh M., Jagadeesh K.A., Sakthivel J., "Prediction of parameters using linear regression for trap in a vibratory part feeder", International Journal of Research in Mechanical Engineering, vol. 2, no. 1, 2014, 43–47.
[35] Boehlke D., Teschler L., "Smart design for flexible feeding", Machine Design, vol. 66, no. 23, 1994, 132–134.


[36] Joneja A., Lee N., "A modular, parametric vibratory feeder: A case study for flexible assembly tools for mass customization", IIE Transactions, vol. 30, no. 10, 1998, 923–931. DOI: 10.1080/07408179808966546.
[37] Tay M.L., Chua P.S.K., Sim S.K., Gao Y., "Development of a flexible and programmable parts feeding system", Int. J. Prod. Econ., vol. 98, no. 2, 2005, 227–237. DOI: 10.1016/j.ijpe.2004.05.019.
[38] Akella S., Mason M.T., "Using partial sensor information to orient parts", International Journal of Robotics Research, vol. 18, no. 10, 1999, 963–997. DOI: 10.1177/02783649922067663.
[39] Sim S.K., Chua P.S.K., Tay M.L., Yun G., "Incorporating pattern recognition capability in a flexible vibratory bowl feeder using a neural network", International Journal of Production Research, vol. 41, no. 6, 2003, 1217–1237.
[40] Chua P.S.K., "Novel design and development of an active feeder", Assembly Automation, vol. 27, no. 1, 2007, 31–37.
[41] Udhayakumar S., Mohanram P.V., Yeshwanth S., Ranjan B.W., Sabareeswaran A., "Development of an Adaptive Part Feeder for Handling Sector Shaped Parts", Assembly Automation, vol. 34, no. 3, 2014, 227–236.
[42] Leberle U., Fleischer J., "Automated Modular and Part-Flexible Feeding System for Micro Parts", Int. J. of Automation Technology, vol. 8, no. 2, 2014, 282–290.
[43] Causey G., Quinn R.D., Barendt N.A., Sargent D.M., Newman W.S., "Design of a flexible parts feeding system". In: Proceedings of IEEE International Conference on Robotics and Automation, vol. 2, 1997, 1235–1240. DOI: 10.1109/ROBOT.1997.614306.
[44] Gudmundsson D., Goldberg K., "Tuning robotic part feeder parameters to maximize throughput", Assembly Automation, vol. 19, no. 3, 1999, 216–221.
[45] Chen J.X., Zhang T.Q., Zhou Y.N., Murphey Y.L., "A smart machine vision system for PCB inspection", Engineering of Intelligent Systems, Lecture Notes in Computer Science, vol. 2070, 2001, 513–518. DOI: 10.1007/3-540-45517-5_57.
[46] Sumi Y., Kawai Y., "3D object recognition in cluttered environments by segment-based stereo vision", International Journal of Computer Vision, vol. 46, no. 1, 2002, 5–23.
[47] Khan U.S., Iqbal J., Khan M.A., "Automatic inspection system using machine vision". In: Proceedings of 34th Applied Imagery and Pattern Recognition Workshop, 2005, 211–217.
[48] Han L., Wang L.Y., Hu G.P., "A study on the vision-based flexible vibratory feeding system", Advanced Materials Research, vol. 279, 2011, 434–439. DOI: 10.4028/www.scientific.net/AMR.279.434.
[49] Mahalakshmi T., Muthaiah R., Swaminathan P., "Overview of template matching technique in image processing", Research Journal of Applied Sciences, Engineering and Technology, vol. 4, no. 29, 2012, 5469–5473.
[50] Liang H., Huimin L., "A study on flexible vibratory feeding system based on smart camera", International Symposium on Computers and Informatics, 2015, 1316–1321.



Preliminary Study of Hydrodynamic Load on an Underwater Robotic Manipulator

Submitted: 21st June 2015; accepted: 12th August 2015

Waldemar Kolodziejczyk

DOI: 10.14313/JAMRIS_4-2015/28

Abstract: The objective of this study was to obtain the hydrodynamic load on an underwater three-link robotic arm subjected to different current speeds at several arm configurations under steady-state conditions. CFD simulations were performed in order to assess the torque requirements when hydrodynamic effects have to be compensated by motors in order to maintain the position of the arm.

Keywords: underwater manipulator, CFD, hydrodynamic load

1. Introduction

Remotely operated manipulators are nowadays standard equipment on underwater ROVs (Remotely Operated Vehicles), as they give underwater robots more flexibility and a wider range of applications, e.g. picking up objects from the seabed, joining parts, or drilling. Industrial robots and manipulators operate in air, which is far less dense than a rigid body. In underwater applications the density of water is comparable with that of the manipulator, and the additional hydrodynamic forces appearing in the system have to be taken into consideration, especially for fast, high-performance manipulators, in which large hydrodynamic forces and torques may develop and induce unwanted motions [1]. The hydrodynamic effects on the manipulator are significant and affect the ability to achieve precise control [2]. The control of underwater robots and manipulators is, moreover, extremely difficult due to additional complex hydrodynamic loads, including currents and wakes caused by nearby structures.

In the context of automatic control, the hydrodynamic contribution to the forces acting on the system cannot be obtained from the continuity equation and the Navier–Stokes equations of motion, because these are ill-suited to on-line calculation. Hydrodynamic forces are instead taken into account through the so-called "added mass" contribution, computed from strip theory as the quotient of the hydrodynamic force and the acceleration of the body [3]. The added mass approach implies that there are also added Coriolis and added centripetal contributions. Strip theory originates from a potential-flow background for 2D inviscid flows and was extended semi-empirically to three dimensions [4]. Under the strip theory approach, the solid body is divided into multiple narrow slices, which can be considered as airfoils. The viscous effect of the fluid causes drag and an additional (beyond inviscid) lift force on the body, taken into consideration through simplified models with coefficients dependent on the Reynolds number, without taking into account, for example, the configuration of the arm. However, there are results showing that drag and lift coefficients are not configuration-independent [5].
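The strip-theory bookkeeping described above can be sketched for a single cylindrical link; the drag coefficient Cd, added mass coefficient Ca, dimensions and speed below are illustrative assumptions, not values taken from the cited works:

```python
import math

# Strip-theory style estimate for one cylindrical link in a uniform current:
# each slice of length dl contributes drag dF = 0.5*rho*Cd*D*dl*V^2, and the
# transverse added mass is m_a = Ca*rho*pi*(D/2)^2*L (potential-flow result
# for a circular cylinder, Ca = 1).
rho = 998.2          # water density [kg/m^3]
Cd, Ca = 1.0, 1.0    # drag and added-mass coefficients (assumed)
D, L = 0.084, 0.45   # link diameter and length [m] (illustrative)
V = 1.0              # current speed normal to the link axis [m/s]

n = 100              # number of strips
dl = L / n
F_drag = sum(0.5 * rho * Cd * D * dl * V**2 for _ in range(n))
m_added = Ca * rho * math.pi * (D / 2) ** 2 * L
print(F_drag, m_added)   # drag equals 0.5*rho*Cd*D*L*V^2 for uniform V
```

The point of the slice-by-slice sum is that a non-uniform incident velocity, or Reynolds-number-dependent Cd, could be applied per slice; with uniform V it simply reduces to the closed-form drag formula.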

The modeling of underwater manipulators has been studied in many works [4, 6, 7, 8, 9]. Underwater arms were mostly modeled as consisting of cylindrical links in order to simplify the calculation of added mass, drag and lift forces. An underwater manipulator changes its geometry during work, and consequently it is important to include the hydrodynamic effects of all links of the kinematic chain on the dynamics of the whole manipulator and the ROV. The lumped approach to the hydrodynamic load on underwater manipulators, mentioned above, is of limited accuracy, and there are controversies as to how the added mass effect should be included, for example, for the wakes [10]. Fluid–structure interaction (FSI) and computational fluid dynamics (CFD) methods enable more accurate results to be achieved. The fast development of computers, CFD methods and software makes it possible to obtain results far more quickly than even a short time ago, though naturally not in the real time needed for control applications, for which, however, the obtained CFD results can still be harnessed as useful data.

The objective of this paper was to examine the 3D steady-state hydrodynamics of the flow around a three-link manipulator placed in a current of incompressible water using CFD methods. The present study concerned a stationary three-link manipulator at different angles of the last link to the current. Seven robotic arm configurations were considered, subjected to four different current speeds. This enables the torques exerted on each joint of the manipulator to be computed at any configuration and any velocity within the examined range, as an interpolation function between the values obtained, and consequently makes it possible to utilize the results in control applications for slow motion of the upper link or a slow water current.

2. Modeling of the Flow Around the Robotic Arm. Case Study

The manipulator under consideration, shown in Fig. 1, consists of three links, each with a diameter of 8.4 cm.

Fig. 1. Coordinate frame arrangement of the robotic arm (external and local reference frames)

The lowest link is 0.43 m long, the middle one 0.45 m, and the upper link has a cylindrical part 0.4 m long. The two lower links of the manipulator were kept unchanged. The configuration modes of the manipulator are characterized by different arrangements of the third (upper) link, inclined at seven angles q3 to the second (vertical) link: –135°, –90°, –45°, 0°, 45°, 90°, and 135°. A positive angle q3 is measured in the counter-clockwise direction with respect to the z3 axis. The location of the arm with reference to the free stream of water is presented in Figs. 1 and 2. The angle q3 thus becomes an indicator of the arm position relative to the velocity of the current, which is directed opposite to the x axis of the external coordinate system (Fig. 1).

The box-shaped computational domain was bounded by a solid wall only at its flat base, 8 m long and 3 m wide. The arm is attached to the base in the middle of its width, at a distance of 2.5 m from the free-current inlet, as shown in Fig. 2. The 1/7th power law was used to specify the turbulent velocity profile at the inlet to the domain. The other sides of the computational domain, 2.5 m high, were in contact with the surrounding flowing water, i.e. backflow into the domain may occur, with its direction determined from the flow in the cell layer adjacent to the boundary.

The Gulf Stream, Kuroshio, Agulhas, Brazil, and East Australian Currents flow at speeds up to 2.5 m/s. The strongest tidal current in the world, the Saltstraumen, flows at speeds reaching 41 km/h (11.4 m/s). It was decided to limit the range of velocities in the present considerations to 1.5 m/s. Calculations were performed for four free-current speeds: 0.1 m/s, 0.5 m/s, 1.0 m/s and 1.5 m/s. The corresponding Reynolds numbers, computed with respect to the link diameter and the current speeds, were 8 400, 42 000, 84 000 and 126 000, respectively.

Fig. 2. The location of the manipulator in the computational domain for the intermediate configuration mode q3 = –22.5° and vortex structures shedding from the arm at V = 0.75 m/s: a) for the computational domain of size 8 m × 3 m × 2.5 m; b) for the reference domain of size 11 m × 5 m × 3 m

The steady-state, incompressible viscous flow around the manipulator is described by the continuity equation and the Navier–Stokes equations of motion. Direct numerical simulation of the N–S equations, in which all scales of the turbulent motion are resolved, exceeds the capacity of currently existing computers, so the governing equations are transformed into the Reynolds-Averaged Navier–Stokes (RANS) equations:

\frac{\partial u_i}{\partial x_i} = 0,  (1)

\rho \frac{\partial (u_i u_j)}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \frac{\partial}{\partial x_j} \left[ \mu \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right) - \rho \overline{u'_i u'_j} \right],  (2)

where xi, xj are the Cartesian coordinates, ui, uj are the mean velocity components in the X, Y and Z directions, u'i, u'j are the fluctuating velocity components, ρ is the density of the fluid, p is the pressure, and μ is the viscosity. The terms (−ρ \overline{u'_i u'_j}), called the Reynolds stresses, must be modeled in order to close the problem. Usually they are modeled using the Boussinesq hypothesis:

-\rho \overline{u'_i u'_j} = \mu_t \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right) - \frac{2}{3} \rho k \delta_{ij},  (3)

where μt is the turbulent viscosity, k the turbulence kinetic energy, and δij the Kronecker delta. The ways in which the turbulent viscosity μt and the turbulence kinetic energy k are computed are called models of turbulence. In the present study the standard k–ε model of turbulence was applied, for its robustness, economy, and reasonable accuracy for fully turbulent flows. The standard k–ε model combines two transport equations, for the turbulence kinetic energy (k) and its dissipation rate (ε):

\frac{\partial}{\partial x_i} (\rho k u_i) = \frac{\partial}{\partial x_j} \left[ \left( \mu + \frac{\mu_t}{\sigma_k} \right) \frac{\partial k}{\partial x_j} \right] + G_k - \rho \varepsilon,  (4)

and

\frac{\partial}{\partial x_i} (\rho \varepsilon u_i) = \frac{\partial}{\partial x_j} \left[ \left( \mu + \frac{\mu_t}{\sigma_\varepsilon} \right) \frac{\partial \varepsilon}{\partial x_j} \right] + C_{1\varepsilon} \frac{\varepsilon}{k} G_k - C_{2\varepsilon} \rho \frac{\varepsilon^2}{k},  (5)

where C1ε = 1.44, C2ε = 1.92, σk = 1.0 and σε = 1.3 are the model constants. The term Gk represents the generation of turbulence kinetic energy due to the mean velocity gradients, evaluated as:

G_k = \mu_t S^2, \quad S = \sqrt{2 S_{ij} S_{ij}},  (6)

where S is the modulus of the mean rate-of-strain tensor.
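The quoted Reynolds numbers follow from Re = V·D/ν with the link diameter D = 0.084 m; the paper does not state the kinematic viscosity ν, so the textbook value for water used below is only inferred from the quoted figures:

```python
# Re = V * D / nu for each free-current speed; D is the link diameter.
D = 0.084        # link diameter [m]
nu = 1.0e-6      # kinematic viscosity of water [m^2/s] (assumed)
for V in (0.1, 0.5, 1.0, 1.5):
    print(V, round(V * D / nu))   # 8400, 42000, 84000, 126000
```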

Fig. 3. Examples of the computational grid close to the manipulator for the configuration mode q3 = 135°

In this model the turbulent viscosity is computed as follows:

\mu_t = \rho C_\mu \frac{k^2}{\varepsilon},  (7)

where Cμ = 0.09 is a constant.

The ANSYS CFD (ANSYS Inc., Canonsburg, Pennsylvania, USA) software was used to perform the simulations. For the computational domain with the different manipulator configurations, a set of eight meshes of approx. 9 500 000 ÷ 11 500 000 elements was generated using the cut-cell method. Figure 3 shows an example of the computational grid near the manipulator for the configuration mode q3 = 135°. Simulations were carried out in Parallel Fluent 16.0 (which implements the control-volume method) with twelve parallel processes, using the SIMPLE algorithm (Semi-Implicit Method for Pressure-Linked Equations), second-order spatial pressure discretization, and second-order upwind discretization schemes for the momentum equations and for the turbulence model.

This research focused on the calculation of the torques exerted by the current of water about the three z axes of the local reference frames assigned to the arm links, as shown in Fig. 1. Going from top to bottom, the torque τ3 was calculated taking into account the pressure and shear stress distributions along the surface of the upper link about the z3 axis. The torque τ2 includes the hydrodynamic effects (due to pressure and shear stresses) on the two upper links with respect to the z2 axis, and the torque τ1 describes the action of water on the whole manipulator about the z1 axis. These can be considered as joint torques experienced by the manipulator placed in the current of water, which have to be compensated by the motors in order to maintain the positions of the links.

Moments (torques) of pressure and viscous forces about a specified axis are determined as the dot products of a unit vector in the direction of the axis with the net moments, computed by summing the cross products of the position vectors of the pressure and viscous force origins (taken with respect to the moment center) with the pressure and viscous force vectors, for each boundary cell face belonging to the discretized surface of the arm.
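Eq. (7) in numbers: with the standard constant Cμ = 0.09, an illustrative (assumed, not simulated) pair of k and ε values gives a turbulent viscosity far exceeding the molecular viscosity of water:

```python
# Eq. (7): mu_t = rho * C_mu * k^2 / eps.
rho = 998.2        # density of water [kg/m^3]
C_mu = 0.09        # standard k-eps model constant
k = 1.0e-3         # turbulence kinetic energy [m^2/s^2] (assumed)
eps = 1.0e-3       # dissipation rate [m^2/s^3] (assumed)
mu_t = rho * C_mu * k**2 / eps
print(mu_t)        # ~0.09 kg/(m s), vs molecular mu ~1e-3 kg/(m s)
```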



Journal of Automation, Mobile Robotics & Intelligent Systems

VOLUME 9,

Table 1. Domain dependence study

2015

one are very similar in shape and the length of the wake is Domain size Torques Domain dependence factor almost the same (vorticity conl×w×h [N m] [%] tours were drawn at the same [m×m×m] locations in both cases), so it Number of cells N t1 t2 t3 δ1 δ2 δ3 can be stated that the actual 8×3×2.5 computational domain was 1 –1.048 10.259 4.375 5.75 2.50 2.30 11 450 290 made long enough to capture all the features of the flow. 6×3×2.5 2 -0.931 10.881 4.666 6.05 3.41 4.19 10 898 000 Grid independence study was performed for the posi11×3×2.5 3 -0.945 10.429 4.525 4.64 0.88 1.05 tion of the arm indicated by 12 227 519 q3 = 45° and for current speed 8×4×3 V = 1m/s, comparing result4 -0.941 10.490 4.562 5.05 0.30 1.88 12632671 ing torques t1, t2, t3, obtained for meshes of different resolu11×5×3.5 5 -0.991 10.522 4.478 tions, as it is shown in Tab. 2. 15 297 509 Grid independence factor was defined in the same way as doTable 2. Grid independence study main dependence factor (8), except that i – stands for a seriSl. Number Torques [N m] Grid independence factor [%] al number of the mesh (Tab. 2), No. of cells t2 t3 δ1 δ2 δ3 t1 tj(r) is the “j” torque computed i N for the reference “r = 4” grid 1 2992942 -1.455 11.949 3.836 3.00 1.79 2.70 of maximum number of cells. 2 5065890 -1.494 11.556 3.696 0.40 1.55 1.04 As can be seen in Tab. 2, grid independence factor constant3 6142455 -1.506 11.561 3.714 0.40 1.51 0.56 ly decreases with increasing 4 9751800 -1.500 11.738 3.735 number of cells and for two finest meshes of cell numbers 6142455 and 9751800, In order to investigate the effect of the size of the the relative differences of the torques were less than domain and the computational mesh resolution on 1.6%. In order to better capture the flow structures, the results of simulations the domain and grid indethe finest mesh (No. 4) was selected and, consequentpendence study were conducted. 
ly, the number of cells for all computational cases was Domain dependence was checked quantitatively in kept in the range of 950000 ÷ 11500000 cells. a series of simulations carried out for particular arm 3. Results and Discussion configuration described by q3 = –22.5°, for current speed V = 0.75  m/s and for different sizes of the doThe results of calculations are summarized in main: the length l, the width w and the height h shown Tab. 3 for four velocities of the current and for seven in Tab. 1. Domain dependence factor was defined as: configuration modes of the robotic arm. The obvious conclusion is that the largest torques appear for the greatest current speed (1.5 m/s), but the effect of con(8) figuration mode of the manipulator is not so evident. All the configurations of the manipulator induce negative moments about the lower link (z1 axis). The highwhere t1, t2, t3 are torques obtained for different sizest negative t1 is observed for q3 = –45°and –135°, that es of the domain, j is an indicator of the torque (1, 2 is when the upper arm is inclined upstream at an angle or 3), i – stands for a serial number of the domain (see of 45° to the top or to the bottom of the free stream. Tab. 1), tj(r) is the “j” torque computed for the reference All the torques t2 computed from pressure and “r = 5” domain of maximum size 11 m×5 m×3.5 m. shear stress distributions along two upper links are The domain selected for computation is indicated by positive. The highest t2 are located in the range of q3 No. 1 in Tab. 1 (size: 8 m×3 m ×2.5 m). The relative between –45° and +45°. The lowest torque t2 appears difference of torques for actual domain computed for q3 = –135°, that is when the upper link is inclined with reference to those obtained for the domain of upstream to the bottom. The torque t3 changes its dimaximum size was equal to 5.75% for t1, and was rection determined by the position of the upper link equal or less than 2.5% for t2 and t3. 
The most imporand the current speed. It remaines positive in almost tant geometrical feature of the domain was its length. all cases for q3 between –45° and +90°, and negative in It was selected as a compromise between the need to almost all cases for q3 = –90°, –135° and 135°. capture all the structures of the flow and the capacThe obtained magnitudes of joint torques can be ity of available computers. Vortex structures (Fig. 2) used as interpolation points in procedures generating forming the wakes shedding from the manipulator for the interpolation functions for computing t1, t2 and t3 actual computational domain and for the reference at intermediate values of current speeds and in inter(

14

N° 4

Articles


Journal of Automation, Mobile Robotics & Intelligent Systems

Mode:

a) q3 = 135°

b) q3 = 90°

c) q3 = 45°

VOLUME 9,

d) q3 = 0°

e) q3 = -45°

f) q3 = –90°

Fig. 4. Gauge pressure distribution on the surface of the arm at speed of the current V = 1 m/s

Mode:

a) q3 = 135°

b) q3 = 90°

c) q3 = 45°

d) q3 = 0°

e) q3 = –45°

Fig. 5. Shear stress distribution on the surface of the arm at speed of the current V = 1 m/s mediate positions of the upper link. The distributions of joint torques in the space created by an angle q3 and velocity V of the current are shown in Figs. 6÷8. In figure 8 the areas of positive and negative moments were separated by thicker zero-torque isolines in order to show the relationship between them better and to indicate, when the motor has to change the direction of rotation. The results of the present calculations allow assessing how much the hydrodynamic forces impact the torques required to be supplied by motors. In control applications the joint moments to be compensated due to hydrodynamic loads can be obtained by using simple interpolation procedures utilizing, for example, bicubic 2D splines [11]. The hydrodynamic torques is caused by pressure and shear stress distributions along the surface of the manipulator links. The effect of pressure is much higher than that of shear stresses. Figures 4 and 5 present the pressure and wall shear contours on the robotic arm, obtained for speed current V = 1  m/s and for all considered configuration modes. The contours are seen from the direction different in each figure and most convenient in each case. The external system of coordinates placed near the arm indicates the position of the manipulator in relation to the current. Generally, the pressure is at its highest on the surfaces that are facing the current, and at its lowest on sides’ transversal to the current and on the sharp edges of the arm, that is in regions of the maximum velocity gradients and separation. They are also the areas of the maximum shear stresses as it is clearly seen in

f) q3 = –90°

N° 4

2015

e) q3 = -135°

e) q3 = –135°

Figs. 4 and 5. The biggest pressure difference for current speed V = 1m/s was found to be approx. equal to 1200 Pa. Maximum values of positive gauge pressure were found to be about 350÷450Pa (depending on the configuration mode) on the upstream sides of the arm, and the greatest absolute value of the negative gauge pressure rose to about 1000Pa on the sharp edges of the third link, where the maximum velocities and shear stresses appeared (approx. 15 N/m2). Wake formation in the flow around the manipulator strongly affects the hydrodynamic forces and torques. The strip theory used to compute the added mass, drag and lift forces oversimplifies the flow patterns and interaction effects caused by changing geometry of the arm during its work. In the present simulations different wake patterns were observed depending on different configurations. One of them, for the intermediate configuration q3  = –22.5° and for current speed V = 0.75m/s, is presented in Fig. 2 as contours of vorticity shedding from the links.

4. Conclusions

CFD analysis has been performed to investigate the flow around a three-link manipulator placed in a current of water. ANSYS Fluent software was used to predict the flow structure near the manipulator arm and to compute the hydrodynamic torques for several configurations of the underwater manipulator and for several velocities of the current flowing around it. The hydrodynamic torques computed in this study may be applied as external loads to the dynamic model of the manipulator in order to obtain a more accurate and



Table 3. Joint torques due to hydrodynamic effects

[Table values not recoverable from the extracted text: the table lists the joint torques τ1, τ2 and τ3 [Nm] for the configuration modes θ3 = 135°, 90°, 45°, –45°, –90° and –135°, each at current speeds V = 0.1, 0.5, 1.0 and 1.5 m/s.]
Table 4. Joint torques in intermediate position of the arm and at intermediate current speed V = 0.75 m/s

                                      t1 [Nm]    t2 [Nm]    t3 [Nm]
CFD calculations                      –1.029      9.667      4.012
Interpolation (bicubic 2D splines)    –1.048     10.258      4.375
Relative difference                    0.018      0.058      0.083

Fig. 6. Interpolation surface for joint torque t1

Fig. 7. Interpolation surface for joint torque t2

Fig. 8. Interpolation surface for joint torque t3

more realistic simulation of the manipulator motion. The results can be applied in robotic models to define control strategies that take into account the hydrodynamic forces computed for different modes of arm configuration and velocities of the current, with the application of interpolation functions. Table 4 presents the joint torques computed with the CFD approach and with bicubic 2D splines [11] for an intermediate position of the last link (q3 = –22.5°), when the upper arm is slightly inclined upstream as seen in Fig. 2, and for the intermediate current speed V = 0.75 m/s. The relative differences between the CFD calculations and the values interpolated from the data presented in Tab. 3 are less than 10%. It is also possible to obtain the lift and drag forces, and consequently the added mass, for each link of the manipulator more accurately than from the strip theory, and to utilize them in modeling the dynamics of the manipulator. Underwater manipulators are usually sturdier than the one presented here, symmetrical in shape in most cases, and usually with non-cylindrical links. The manipulator under investigation is based on the UR5, with cylindrical links, and it is non-symmetrical in shape. These features offer several benefits for our investigations. Firstly, the non-symmetrical shape of the arm allows us to investigate the effect of hydrodynamic loads in a more general way. Secondly, the cylindrical links will enable us, in future work, to compare the hydrodynamic loads computed through the numerical approach with results obtained via standard added-mass calculations, which are best suited to cylindrically shaped links. This paper presents just the first step in understanding the hydrodynamic loads on the
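The interpolation step described above, recovering a joint torque at an intermediate configuration and current speed from a tabulated grid with bicubic 2D splines [11], can be sketched with SciPy. The grid values below are hypothetical placeholders, not the paper's Table 3 data.

```python
# Sketch: bicubic 2D spline interpolation of a joint-torque table tau(q3, V).
# The torque samples here are hypothetical, for illustration only.
import numpy as np
from scipy.interpolate import RectBivariateSpline

q3 = np.array([-135.0, -90.0, -45.0, 0.0, 45.0, 90.0, 135.0])  # link angle, deg
V = np.array([0.1, 0.5, 1.0, 1.5])                             # current speed, m/s

# Hypothetical torque samples tau1[q3_index, V_index] in Nm
tau1 = np.outer(np.sin(np.radians(q3)), V ** 2)

# kx=ky=3 gives a bicubic interpolating spline (s=0 passes through the data)
spline = RectBivariateSpline(q3, V, tau1, kx=3, ky=3)

# Torque at an intermediate state, e.g. q3 = -22.5 deg, V = 0.75 m/s
tau_interp = spline(-22.5, 0.75)[0, 0]
```

At the grid nodes the spline reproduces the tabulated values exactly; between nodes it provides the smooth compensation torques a controller would query.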



underwater robotic arm via numerical simulations, because it concerns only the steady-state flow around different configurations of the last link of the arm. In the future, we will focus on determining how the motion of the arm affects the magnitude and direction of the joint torques, which in turn may give us information about the range of current speeds and last-link velocities for which the flow can be considered steady-state.

ACKNOWLEDGEMENTS

This work was supported by the Bialystok University of Technology under the grant No. S/WM/1/2012.


REFERENCES

[1] Antonelli G., Underwater Robots, Springer Tracts in Advanced Robotics, Second edition, Springer, 2006.
[2] Farivarnejad H., Moosavian S.A., "Multiple Impedance Control for object manipulation by a dual arm underwater vehicle-manipulator system", Ocean Engineering, vol. 89, 2014, 82–98. DOI: 10.1016/j.oceaneng.2014.06.032.
[3] Fossen T.I., Guidance and Control of Ocean Vehicles, John Wiley & Sons, Chichester, United Kingdom, 1994.
[4] McLain T.W., Rock S.M., "Development and Experimental Validation of an Underwater Manipulator Hydrodynamic Model", The International Journal of Robotics Research, vol. 17, 1998, 748–759.
[5] Leabourne K.N., Rock S.M., "Model Development of an Underwater Manipulator for Coordinated Arm-Vehicle Control". In: Proceedings of the OCEANS '98 Conference, Nice, France, no. 2, 1998, 941–946.
[6] Richard M.J., Levesque B., "Stochastic dynamic modelling of an open-chain manipulator in a fluid environment", Mech. Mach. Theory, vol. 31, no. 5, 1996, 561–572.
[7] Rivera C., Hinchey M., "Hydrodynamics loads on subsea robots", Ocean Engineering, vol. 26, no. 8, 1999, 805–812. DOI: 10.1016/S0029-8018(98)00031-6.
[8] Vossoughi G.R., Meghdari A., Borhan H., "Dynamic modeling and robust control of an underwater ROV equipped with a robotic manipulator arm". In: Proceedings of 2004 JUSFA, 2004 Japan-USA Symposium on Flexible Automation, Denver, Colorado, July 19–21, 2004.
[9] Pazmino R.S., Garcia C.E., Alvarez Arocha C., Santoja R.A., "Experiences and results from designing and developing a 6DOF underwater parallel robot", Robotics and Autonomous Systems, vol. 59, 2011, 101–112.
[10] Williamson C.H.K., Govardhan R., "A brief review of recent results in vortex-induced vibrations", Journal of Wind Engineering and Industrial Aerodynamics, vol. 96, no. 6–7, 2008, 713–735. DOI: 10.1016/j.jweia.2007.06.019.
[11] Press W.H., Teukolsky S.A., Vetterling W.T., Flannery B.P., Numerical Recipes in C: The Art of Scientific Computing, Second edition, Cambridge University Press, 1992.

AUTHOR

Waldemar Kołodziejczyk – Bialystok University of Technology, Faculty of Mechanical Engineering, Department of Automatic Control and Robotics, ul. Wiejska 45 c, 15-351 Bialystok, Poland. E-mail: w.kolodziejczyk@pb.edu.pl.




Face Recognition Using Canonical Correlation, Discrimination Power, and Fractional Multiple Exemplar Discriminant Analyses

Submitted: 14th August 2015; accepted: 17th September 2015

Mohammadreza Hajiarbabi, Arvin Agah

DOI: 10.14313/JAMRIS_4-2015/29

Abstract: Face recognition is a biometric identification method which, compared to other methods such as fingerprint, speech, signature, handwriting and iris recognition, has been shown to be more noteworthy both theoretically and practically. Biometric identification methods have various applications, for instance in film processing and access control networks, among many others. The automatic recognition of a human face has become an important problem in pattern recognition, due to (1) the structural similarity of human faces, and (2) the great impact of factors such as illumination conditions, facial expression and face orientation. These have made face recognition one of the most challenging problems in pattern recognition. Appearance-based methods are among the most common methods in face recognition and can be categorized into linear and nonlinear methods. In this paper face recognition using Canonical Correlation Analysis is introduced, along with a review of the linear and nonlinear appearance-based methods. Canonical Correlation Analysis finds the linear combinations of two sets of variables that have maximum correlation with one another. Discrimination Power Analysis and Fractional Multiple Exemplar Discriminant Analysis have been used to extract features from the image. The results provided in this paper show the advantage of this method compared to other methods in this field.

Keywords: face recognition, Canonical Correlation Analysis, Discrimination Power Analysis, Multiple Exemplar Discriminant Analysis, Radial Basis Function neural networks

1. Introduction


Recognizing the identity of humans is of great importance. Humans recognize each other based on physical characteristics such as the face, voice and gait. The first systematic recognition methods were invented in past centuries and used in police stations for identifying criminals; these methods measured different parts of the body. After the discovery that fingerprints are unique to each person, fingerprinting became the preferred method for recognizing humans. In recent decades, the advent of high-speed computers has provided a good opportunity for researchers to work on different approaches and to find reliable methods for recognizing humans based on unique patterns.

A biometric system is a system with an automated measuring component that is robust and can distinguish physical characteristics that identify a person. By robust it is meant that the features should not change significantly with the passing of years; for example, iris recognition is more robust than other biometric systems because the iris does not change much over time. Due to matters of security, the budget for implementing biometric systems has increased [25]. A face biometric system can use both visual images and infrared images, which have their own properties [19]. Face biometric systems can be divided into three categories based on the utilized implementation:
1. Appearance-based methods: these methods use statistical approaches to extract the most important information from the image.
2. Model-based methods: these use a model that is fitted to the test images, and the person is recognized by computing some parameters. The elastic bunch graph [34], the Active Appearance Model (AAM) [6] and 3D morphable models are examples of model-based methods [1, 18].
3. Template-based methods: these methods first find the location of each part of the face, for example the eyes and nose, and then recognize the face by computing the correlation between parts of the training images and the test images [4].
All face biometric systems should also include a face detection stage to locate the face in the image. Viola used the AdaBoost algorithm to find faces in an image [33], and Rowley used neural networks [24]; in both methods a window is moved over the image in order to find a face. Newer methods use color images: Hsu [17] first used color images and skin detection to find faces in the image, and in [14] faces were detected using correlation and skin segmentation [15].

2. Appearance-Based Methods

Appearance-based methods start with the concept of image space. A two-dimensional image can be represented as a point, or vector, in a high-dimensional space called the image space. In this image space, each dimension corresponds to one pixel of the image. In general, an image with m rows and n columns is a point in an N-dimensional space, where N = m × n; for example, an image with 20 rows and 20 columns describes a point in a 400-dimensional space. One important characteristic of image space is that exchanging the pixels of one image with each other does not


Journal of Automation, Mobile Robotics & Intelligent Systems

change the image space. The image space can also show the connection between a set of images [31]. The image space has high dimension; appearance-based methods extract the most important information from the image and lower the dimension of the image space. The subspace produced in this way is called the feature space or face space [31]. The origin of appearance-based methods dates back to 1991, when Turk and Pentland introduced the Eigenface algorithm, which is based on the well-known mathematical method of Principal Component Analysis [32]. This was the start of appearance-based methods. In 2000, Schölkopf, by introducing kernel principal component analysis (kernel Eigenface), expanded the concept of appearance-based methods into nonlinear fields. Appearance-based methods are robust to noise, defocusing, and similar issues [10]. Appearance-based methods are classified into the two categories of linear and nonlinear methods, described in the following sections.
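The image-space idea and the Eigenface use of PCA mentioned above can be sketched numerically: images are flattened into vectors and PCA (via SVD) finds the face-space basis. The 20×20 random "images" below are a hypothetical stand-in for a real training set.

```python
# Minimal Eigenfaces sketch: images -> vectors in image space -> PCA basis.
# Synthetic 20x20 "images" stand in for a real training set (hypothetical).
import numpy as np

rng = np.random.default_rng(0)
n_images, h, w = 12, 20, 20
images = rng.random((n_images, h, w))

X = images.reshape(n_images, h * w)   # each image becomes a point in R^400
mean_face = X.mean(axis=0)
A = X - mean_face                     # centered training data

# PCA via SVD: rows of Vt are the principal directions ("eigenfaces")
U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 5
eigenfaces = Vt[:k]                   # k-dimensional face-space basis
features = A @ eigenfaces.T           # low-dimensional features per image
```

Each 400-dimensional image is thereby described by only k coefficients, which is the dimension reduction the section refers to.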

2.1. Linear Discriminant Analysis

The face space has dimension m × n, with m, n the image dimensions, and X = (X_1, X_2, ..., X_n) ⊂ ℜ^{m×n} is a matrix containing the images in the training set, where each X_i is an image converted to a column vector. LDA maximizes the ratio of the between-class scatter matrix to the within-class scatter matrix [8]. The between-class scatter matrix is calculated as:

$$S_B = \sum_{i=1}^{c} n_i \left(\bar{X}_i - \bar{X}\right)\left(\bar{X}_i - \bar{X}\right)^T$$

where $\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j$ is the mean of the images in the training set, $\bar{X}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_j^i$ is the mean of class i, and c is the number of classes (all images belonging to one person form a class). The within-class scatter matrix is calculated as:

$$S_W = \sum_{i=1}^{c} \sum_{X_j^i \in \text{class } i} \left(X_j^i - \bar{X}_i\right)\left(X_j^i - \bar{X}_i\right)^T$$

The optimal subspace is calculated by:

$$E_{optimal} = \arg\max_{E} \frac{\left|E^T S_B E\right|}{\left|E^T S_W E\right|} = [c_1, c_2, \ldots, c_{c-1}]$$

where $[c_1, c_2, \ldots, c_{c-1}]$ is the set of generalized eigenvectors of $S_B$ and $S_W$ corresponding to the $c-1$ greatest generalized eigenvalues $\lambda_i$:

$$S_B E_i = \lambda_i S_W E_i, \quad i = 1, 2, \ldots, c-1$$

Thus, the most discriminant response for face images X is [8]:

$$P = E_{optimal}^T X$$

In order to avoid the singularity problem, one first has to reduce the dimension of the problem and then


apply LDA. Principal component analysis (PCA) is the most common method used for dimension reduction, and in this paper principal component analysis was applied to the images prior to the other methods discussed. In addition to PCA, other effective methods can be used for dimension reduction prior to LDA, such as the Discrete Cosine Transform (DCT) [12]. Some researchers have observed that applying PCA to reduce the dimension of the space can cause another problem: the elimination of useful information from the null space. The 2FLD algorithm was introduced to address this problem and also the computational cost that applying PCA introduces. But the 2FLD algorithm brings other problems: its output is a matrix, whose dimension for an m × n image can be n × n. This high dimensionality causes issues when a neural network is used for classification, since a two-dimensional matrix cannot be applied directly to a neural network, and if the matrix is converted into a vector of size n², the network cannot be trained well because of the low number of samples per face. A direct LDA method that does not require applying PCA before LDA has been proposed, but it is time-inefficient [36]. A fuzzy version of LDA has also been proposed [20]. Shu et al. designed a linear discriminant analysis method that also preserves local geometric structures [29], and in [9] discriminant information was added into a sparse neighborhood.

2.2. Fractional Multiple Exemplar Discriminant Analysis

The problem of face recognition differs from other pattern recognition problems and therefore requires discriminant methods other than LDA. In LDA the representation of each class is based on just one sample, namely the class mean. Because of the shortage of samples in face recognition applications, it is better to use all the samples of each class for classification instead of only the mean. Rather than minimizing the within-class distance while maximizing the between-class distance, multiple exemplar discriminant analysis (MEDA) finds the projection directions along which the within-class exemplar distance (i.e., the distances between exemplars belonging to the same class) is minimized while the between-class exemplar distance (i.e., the distances between exemplars belonging to different classes) is maximized [37]. In MEDA the within-class scatter matrix is calculated by:

$$S_W = \sum_{i=1}^{C} \frac{1}{n_i n_i} \sum_{k=1}^{n_i} \sum_{l=1}^{n_i} \left(X_k^i - X_l^i\right)\left(X_k^i - X_l^i\right)^T$$

where $X_j^i$ is the jth image of the ith class. Compared with the within-class scatter matrix of LDA, in this method all the images in a class participate in building the within-class scatter matrix, instead of only the class mean. The between-class scatter matrix is computed by:

$$S_B = \sum_{i=1}^{C} \sum_{j=1,\, j \neq i}^{C} \frac{1}{n_i n_j} \sum_{k=1}^{n_i} \sum_{l=1}^{n_j} \left(X_k^i - X_l^j\right)\left(X_k^i - X_l^j\right)^T$$
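The exemplar-pair scatter matrices above can be sketched directly, looping over every pair of samples rather than class means. The data below are hypothetical; a real implementation would vectorize the pair loops.

```python
# Direct sketch of the MEDA scatter matrices: every exemplar pair contributes.
# Hypothetical data: 3 classes of 5 samples in R^8.
import numpy as np

rng = np.random.default_rng(2)
classes = [rng.normal(i, 1.0, (5, 8)) for i in range(3)]
d = 8

Sw = np.zeros((d, d))
for Xi in classes:                       # within-class exemplar pairs
    ni = len(Xi)
    for k in range(ni):
        for l in range(ni):
            diff = Xi[k] - Xi[l]
            Sw += np.outer(diff, diff) / (ni * ni)

Sb = np.zeros((d, d))
for i, Xi in enumerate(classes):         # between-class exemplar pairs
    for j, Xj in enumerate(classes):
        if i == j:
            continue
        for xk in Xi:
            for xl in Xj:
                diff = xk - xl
                Sb += np.outer(diff, diff) / (len(Xi) * len(Xj))
```

The projection directions are then found from S_B and S_W exactly as in LDA.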

Unlike LDA, in which the class means and the overall mean build the between-class scatter matrix, in MEDA all the samples in one class are compared to all samples of the other classes. The computation of E_optimal is the same as in LDA. There is a drawback common to both LDA and MEDA: in the between-class scatter matrix S_B it makes no difference whether the samples are close to or far from each other, although for classes that are closer together the probability of collision is higher than for the other classes. When the idea was first proposed [21], it was used for LDA and was not applied to face recognition databases; later [13] the algorithm was combined with MEDA and applied to face recognition. This algorithm reduces the dimension of the problem step by step, and in each iteration the samples that are closer are pushed farther from each other. For this purpose a weight function is introduced:

$$w\left(d_{X_1 X_2}\right) = \left(d_{X_1 X_2}\right)^{-p}, \quad p = 3, 4, \ldots$$

where $d_{X_1 X_2}$ denotes the distance between the centers of two classes [21]; for MEDA it is taken as the distance between each pair of samples [13]. The between-class scatter matrix in fractional MEDA is defined as:

$$S_B = \sum_{i=1}^{C} \sum_{j=1,\, j \neq i}^{C} \frac{1}{n_i n_j} \sum_{k=1}^{n_i} \sum_{l=1}^{n_j} w\!\left(d_{X_k^i X_l^j}\right) \left(X_k^i - X_l^j\right)\left(X_k^i - X_l^j\right)^T$$
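The weighted between-class scatter above can be sketched by inserting the $w(d) = d^{-p}$ factor into the pair loop, so that close between-class pairs dominate. Data and the choice p = 3 are hypothetical.

```python
# Sketch of the fractional-MEDA weighted between-class scatter, w(d) = d^(-p).
# Hypothetical data: 3 well-separated classes of 4 samples in R^6.
import numpy as np

rng = np.random.default_rng(3)
classes = [rng.normal(3 * i, 1.0, (4, 6)) for i in range(3)]
p, d = 3, 6

Sb = np.zeros((d, d))
for i, Xi in enumerate(classes):
    for j, Xj in enumerate(classes):
        if i == j:
            continue
        for xk in Xi:
            for xl in Xj:
                diff = xk - xl
                w = np.linalg.norm(diff) ** (-p)   # nearby pairs weigh more
                Sb += w * np.outer(diff, diff) / (len(Xi) * len(Xj))
```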

The within-class scatter matrix is the same as in MEDA. The fractional algorithm is shown in Table 1 [21]; in the pseudocode, r is the number of fractional steps used to reduce the dimensionality by 1 [21].

Table 1. Fractional algorithm [21]

Set W = I_{n×n} (the identity matrix)
for k = n down to (m + 1):
    for l = 0 to (r − 1):
        project the data using W as y = W^T x
        apply the scaling transformation to obtain z = Ψ(y, α^l)
        for the z patterns, compute the k × k between-class scatter matrix S_b
        compute the ordered eigenvalues λ1 ≥ λ2 ≥ ... ≥ λk and the corresponding eigenvectors φ1, φ2, ..., φk of S_b
        set W = W F, where F = [φ1, φ2, ..., φk]
    end for
    discard the last (kth) column of W
end for

The scaling transformation compresses the last component of y by a factor α with α < 1, i.e., Ψ(y; α): y ∈ ℜ^k → z ∈ ℜ^k such that:

$$z_i = \begin{cases} \alpha\, y_i, & i = k \\ y_i, & i = 1, 2, \ldots, (k-1) \end{cases}$$

Some explanations about this algorithm are [21]:
• In the rth step, the reduction factor is α^(r−1); a dimension is thus removed through the scales 1, α, α², ..., α^(r−1).
• When the number of steps is smaller, α should be chosen larger, and vice versa.
• The weighting function should be chosen as d^−3, d^−4 and so on.

The FMEDA algorithm is shown in Table 2 [13].

Table 2. FMEDA algorithm [13]

1. Apply PCA on the training set.
2. Compute the within-class scatter matrix S_W.
3. Compute the between-class scatter matrix S_B.
4. Apply the fractional-step dimensionality reduction algorithm.
5. Compute the optimal subspace using
$$E_{optimal} = \arg\max_{E} \frac{\left|E^T S_B E\right|}{\left|E^T S_W E\right|} = [c_1, c_2, \ldots, c_{c-1}]$$
6. Compute the most discriminant vectors using
$$P = E_{optimal}^T \cdot X$$

2.3. Kernel Methods

Kernel methods are more recent than the linear algorithms [3]. A kernel method finds the higher-order correlations between instances, as described in this section. It is assumed that patterns x ∈ ℜ^N are available, and that most of the information lies in the dth-order relations between the entries of x. One way to extract all the features from the data is to extract the relations between all the elements of a vector. In computer vision applications, where images are converted to vectors, this feature extraction captures the relations between all pixels of the image. For example, in ℜ² (an image) all the second-order relations can be mapped into a nonlinear space:

$$F: \Re^2 \to F = \Re^3, \quad \left([x]_1, [x]_2\right) \mapsto \left([x]_1^2, [x]_2^2, [x]_1 [x]_2\right)$$
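The kernel trick behind the second-order map above can be checked numerically: with a suitable scaling of the cross term (the √2 factor below is my addition so the identity is exact), the kernel $k(x, y) = (x \cdot y)^2$ equals a dot product in the mapped space, with no explicit mapping needed.

```python
# Numeric check: (x . y)^2 equals the dot product of second-order feature maps.
# The sqrt(2) on the cross term is an added scaling that makes the identity exact.
import numpy as np

def phi(v):
    # R^2 -> R^3 second-order monomial feature map
    return np.array([v[0] ** 2, v[1] ** 2, np.sqrt(2) * v[0] * v[1]])

x = np.array([0.5, -1.5])
y = np.array([2.0, 0.25])

lhs = phi(x) @ phi(y)    # explicit mapping, then dot product in feature space
rhs = (x @ y) ** 2       # kernel evaluation in the input space
```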


This method is useful for low-dimensional data but causes problems for high-dimensional data. For N-dimensional data there are

$$N_F = \frac{(N + d - 1)!}{d!\,(N - 1)!}$$

different combinations, which form a feature space of dimension N_F; for example, a 16×16 image with d = 5 has a feature space of dimension on the order of 10^10. By using kernel methods there is no need to compute these relations explicitly. For computing the dot products F(x) · F(x'), the kernel function is defined as:

$$k(x, x') = F(x) \cdot F(x')$$

which allows the dot product in F to be computed without explicitly performing the map F. As used in [3], if x is an image then the kernel (x · x')^d (or any other kernel) can be used to map onto a new feature space, called the Hilbert space, in which all relations between vectors can be expressed through dot products. The input space is denoted by χ, the feature space by F, and the map by φ : χ → F. Any function that returns the inner product of two points x_i ∈ χ and x_j ∈ χ in the F space is called a kernel function. Some of the popular kernels include [21]:

Polynomial kernel: $k(x, y) = (x \cdot y)^d$, $d \in \mathbb{N}$

RBF kernel: $k(x, y) = \exp\!\left(\dfrac{-\|x - y\|^2}{2\sigma^2}\right)$

Sigmoid kernel: $k(x, y) = \tanh\!\left(\kappa (x \cdot y) + \theta\right)$, $\kappa > 0$, $\theta < 0$

Kernels can also be combined to produce new kernels:

$$\alpha k_1(x, y) + \beta k_2(x, y) = k(x, y), \qquad k_1(x, y)\, k_2(x, y) = k(x, y)$$

2.3.1. Kernel Principal Component Analysis

Given m instances $x_k = [x_{k1}, x_{k2}, \ldots, x_{kn}]^T \in \Re^n$ with zero mean, principal component analysis finds new axes in the directions of maximum variance of the data, which is equivalent to finding the eigenvalues of the covariance matrix C:

$$\lambda w = C w$$

for eigenvalues λ ≥ 0 and eigenvectors w ∈ ℜ^n. In kernel principal component analysis, each vector x is mapped from the input space ℜ^n to a high-dimensional feature space ℜ^f using a nonlinear mapping function $F: \Re^n \to \Re^f$, f > n. In ℜ^f the eigenvalue problem is:

$$\lambda w^F = C^F w^F$$

where C^F is the covariance matrix in the feature space. The eigenvalues λ ≥ 0 and eigenvectors $w^F \in F \setminus \{0\}$ (the eigenvectors with non-zero eigenvalues) must be determined so that this equation holds. Coefficients $\alpha_i$ then exist such that

$$w^F = \sum_{i=1}^{m} \alpha_i F(x_i)$$

By combining the last equations and introducing the m × m matrix K with $K_{ij} = k(x_i, x_j)$, the equation

$$m \lambda K \alpha = K^2 \alpha \;\equiv\; m \lambda \alpha = K \alpha$$

is reached, where α is a column vector with entries α_1, ..., α_m [27]. For normalizing the eigenvectors in F so that $w^k \cdot w^k = 1$, the condition

$$\lambda_k \left(\alpha^k \cdot \alpha^k\right) = 1$$

is used. For extracting the principal components of a test instance x, whose image in the ℜ^f space is F(x), only the projection of F(x) onto the eigenvectors w^k in the feature subspace must be computed [27]:

$$\left(w^k \cdot F(x)\right) = \sum_{i=1}^{m} \alpha_i^k\, k(x, x_i)$$

It should be noted that none of these equations needs F(x_i) explicitly; the dot products are calculated through the kernel function without applying the map F. In face recognition each vector x represents a face image, which is why the nonlinear principal component method is called kernel Eigenface in the face recognition domain. The kernel principal component analysis algorithm is shown in Table 3 [27].




Table 3. KPCA algorithm [27]

1. Calculate the Gram matrix:

$$K_{training} = \begin{pmatrix} k(x_1, x_1) & k(x_1, x_2) & \ldots & k(x_1, x_m) \\ k(x_2, x_1) & k(x_2, x_2) & \ldots & k(x_2, x_m) \\ \ldots & \ldots & \ldots & \ldots \\ k(x_m, x_1) & k(x_m, x_2) & \ldots & k(x_m, x_m) \end{pmatrix}$$

2. Solve $m\lambda\alpha = K\alpha$ and compute the eigenvectors $\alpha$.
3. Normalize $\alpha^k$ using $\lambda_k\left(\alpha^k \cdot \alpha^k\right) = 1$.
4. Calculate the principal component coefficients for test data x using:

$$\left(w^k \cdot \phi(x)\right) = \sum_{i=1}^{m} \alpha_i^k\, k(x, x_i)$$
The classical principal component analysis is also a special case of kernel principal component analysis in which the kernel function is a first-order polynomial. Therefore, kernel principal component analysis is a generalized form of principal component analysis that uses different kernels for nonlinear mapping. Another important matter is using data with zero mean in the new subspace, which can be accomplished by centering the Gram matrix [26]:

$$\tilde{K} = K - 1_m K - K 1_m + 1_m K 1_m$$

where $(1_m)_{ij} = 1/m$ for each i and j. As the data are not available in explicit form in the new space, the centering is carried out through this formula [26]. For the kernel Fisherface method, kernel principal component analysis is first applied to the image, and then LDA is applied to the resulting vector [35].

2.4. Canonical Correlation Analysis

Canonical Correlation Analysis (CCA) is a mechanism for measuring the linear relationship between two multidimensional variables. The method was first introduced in [16], and although it is a standard tool in pattern recognition, it has rarely been used in signal processing and biometric identification systems. CCA has had various applications in economics, medical studies and metrology.


It is assumed that X is an m × n matrix consisting of m samples of an n-dimensional random variable x. The correlation coefficient ρ_ij, which shows the correlation between x_i and x_j, is defined by:

$$\rho_{ij} = \frac{C_{ij}}{\sqrt{C_{ii}\, C_{jj}}}$$

where $C_{ij}$ is the covariance between $x_i$ and $x_j$, computed by:

$$C_{ij} = \frac{1}{m-1} \sum_{k=1}^{m} \left(X_{ki} - \mu_i\right)\left(X_{kj} - \mu_j\right)$$

and $\mu_i$ is the mean of the samples of $x_i$. With $A_x$ the centered matrix of X, whose elements are

$$a_{ij} = X_{ij} - \mu_j,$$

the covariance matrix is:

$$C = \frac{1}{m-1} A_x^T A_x$$
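The definitions above can be verified numerically: the correlation coefficient computed from the centered data matrix matches NumPy's built-in `corrcoef`. The data below are hypothetical.

```python
# Numeric check of rho_ij = C_ij / sqrt(C_ii C_jj) against numpy.corrcoef.
# Hypothetical data: m = 100 samples of a 3-dimensional variable.
import numpy as np

rng = np.random.default_rng(5)
X = rng.random((100, 3))

mu = X.mean(axis=0)
Ax = X - mu                        # centered matrix, a_ij = X_ij - mu_j
C = Ax.T @ Ax / (len(X) - 1)       # covariance matrix, C = Ax^T Ax / (m - 1)

rho_01 = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])
```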

It has to be considered that correlation coefficients measure the linear relation between two variables. When two variables are uncorrelated (i.e., their correlation coefficient is zero), there is no linear function that describes the connection between them. The aim of CCA is to determine the correlation between two sets of variables: CCA finds basis vectors for two sets of multidimensional variables such that the linear correlations between the projections onto these basis vectors are mutually maximized. Assuming zero-mean vectors X and Y, the CCA method finds vectors α and β such that the correlation between the projections $a_1 = \alpha^T X$ and $b_1 = \beta^T Y$ is maximized. The projections $a_1$ and $b_1$ are called the first canonical variables. Then the second pair of canonical variables $a_2$ and $b_2$, uncorrelated with $a_1$ and $b_1$, is computed, and the process is continued. Considering $\omega_1, \omega_2, \ldots, \omega_c$ as the classes and the training data space $\Omega = \{\xi \mid \xi \in \Re^N\}$, and defining $A = \{x \mid x \in \Re^p\}$ and $B = \{y \mid y \in \Re^q\}$, x and y are feature vectors of one instance ξ extracted using two different feature extraction methods. The goal is to calculate the canonical correlations between x and y. With $\alpha_1^T x$ and $\beta_1^T y$ the first pair of vectors and $\alpha_2^T x$ and $\beta_2^T y$ the second pair, one can write:

$$X^* = \left(\alpha_1^T x, \alpha_2^T x, \ldots, \alpha_d^T x\right)^T = \left(\alpha_1, \alpha_2, \ldots, \alpha_d\right)^T x = W_x^T x$$

$$Y^* = \left(\beta_1^T y, \beta_2^T y, \ldots, \beta_d^T y\right)^T = \left(\beta_1, \beta_2, \ldots, \beta_d\right)^T y = W_y^T y$$



$$Z_1 = \begin{pmatrix} X^* \\ Y^* \end{pmatrix} = \begin{pmatrix} W_x^T x \\ W_y^T y \end{pmatrix} = \begin{pmatrix} W_x & 0 \\ 0 & W_y \end{pmatrix}^T \begin{pmatrix} x \\ y \end{pmatrix}$$

and the transform matrix is:

$$W_1 = \begin{pmatrix} W_x & 0 \\ 0 & W_y \end{pmatrix}, \quad W_x = \left(\alpha_1, \alpha_2, \ldots, \alpha_d\right), \; W_y = \left(\beta_1, \beta_2, \ldots, \beta_d\right)$$

The directions $\alpha_i$ and $\beta_i$ are called the ith Canonical Projective Vectors (CPV), and $\alpha_i^T x$ and $\beta_i^T y$ are the ith features of the canonical correlations. $W_1$ is called the Canonical Projective Matrix (CPM), $Z_1$ is the Canonical Correlation Discriminant Feature (CCDF), and the method is called the Feature Fusion Strategy (FFS) [2, 30]. For determining the CCA coefficients, it is assumed that x and y are two random variables with zero means. The total covariance matrix is defined by:

$$C = \begin{pmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{pmatrix} = E\left[ \begin{pmatrix} x \\ y \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}^T \right]$$

where $C_{xx}$ and $C_{yy}$ are the within-set covariance matrices of x and y, and $C_{xy} = C_{yx}^T$ is the between-set covariance matrix. The canonical correlations between x and y are obtained from [30]:

$$C_{xx}^{-1} C_{xy} C_{yy}^{-1} C_{yx}\, \alpha = \rho^2 \alpha$$

$$C_{yy}^{-1} C_{yx} C_{xx}^{-1} C_{xy}\, \beta = \rho^2 \beta$$

where ρ² is the squared correlation and the eigenvectors α and β are the normalized basis correlation vectors.
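The eigenvalue formulation above can be sketched directly for the first canonical pair. The data are hypothetical (y is built to correlate with x), and the small ridge terms added to the covariance matrices are my precaution against singularity, not part of the formulation.

```python
# Sketch of CCA via Cxx^-1 Cxy Cyy^-1 Cyx alpha = rho^2 alpha.
# Hypothetical correlated data; tiny ridge terms guard against singularity.
import numpy as np

rng = np.random.default_rng(6)
n = 200
x = rng.random((n, 4))
y = 0.5 * x[:, :3] + 0.1 * rng.random((n, 3))   # correlated by construction

x = x - x.mean(axis=0)
y = y - y.mean(axis=0)
Cxx = x.T @ x / (n - 1) + 1e-8 * np.eye(4)
Cyy = y.T @ y / (n - 1) + 1e-8 * np.eye(3)
Cxy = x.T @ y / (n - 1)
Cyx = Cxy.T

M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cyx)
rho2, A = np.linalg.eig(M)
order = np.argsort(-rho2.real)
alpha = A[:, order[0]].real            # first canonical direction for x
rho = np.sqrt(rho2.real[order[0]])     # first canonical correlation
```

The partner direction follows as β ∝ C_yy⁻¹ C_yx α, and the correlation of the two projections equals ρ.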

3. Experimental Results

In order to test the described algorithms, the Sheffield (UMIST) [26] and ORL [23] databases were utilized in the experiments. The Sheffield database contains 575 images belonging to 20 people, with a variety of head poses from frontal view to profile. For training, 10 images from each person were used and the rest were used as the test set. Figure 1 shows a sample of this database.

Fig. 1. Sheffield database [28]

The ORL database contains 400 images of 40 people, with variety in the scale and pose of the head. From each person, five images were used as the training set and the rest as the test set. Figure 2 shows a sample of this database.

Fig. 2. ORL database [23]

3.1. Linear Methods

In order to establish a baseline, the linear algorithms were utilized; Matlab was used for the simulations [22]. For the neural network, the number of network inputs equals the dimension of the feature vector. For the output, two approaches can be used. The first is the bit method, in which the class number is encoded in bits and each output neuron corresponds to one bit; for instance, 000110 encodes class 6 and 001001 encodes class 9. The output of an RBF network is a real number between 0 and 1. The other method assigns one neuron to each class: if there are 40 classes, there are also 40 nodes in the output layer. The second method produced better results and has been used in all simulations; however, for a large number of classes the first method may be preferred. A neuron can also be reserved for images that do not belong to any class. It should be noted that the two other important neural network classifiers, the back-propagation neural network and the probabilistic neural network, performed worse than RBF neural networks in these experiments. The back-propagation neural network needs significantly more training time than the RBF neural network, requires much more memory, and its recognition results were of lower quality. The probabilistic neural network performed on par with distance-based classifiers. For the linear methods, principal component analysis, linear discriminant analysis, fuzzy linear discriminant analysis [20] and multiple exemplar discriminant analysis have been used. Results are shown as a function of the number of extracted features. Figure 3 illustrates the results for the linear methods using the RBF neural network [5, 10].
For all algorithms in this paper a distance-based classifier was also used, and in most cases the RBF neural network outperformed the distance-based classifiers. When the number of extracted features was low, the distance-based classifiers had better results.



Journal of Automation, Mobile Robotics & Intelligent Systems

VOLUME 9,

N° 4

2015

Fig. 3. Linear based algorithms using RBF classifier on ORL database

Fig. 6. FMEDA algorithm using RBF classifier on Sheffield database

As the figures show, multiple exemplar discriminant analysis has stronger discriminant capabilities than the other methods. Figure 4 shows the results for the Sheffield database.

3.2. Non-Linear Methods

For the non-linear methods, kernel principal component analysis and kernel linear discriminant analysis have been used. For kernel linear discriminant analysis, kernel principal component analysis is first applied to the images and then linear discriminant analysis is applied to the resulting vectors. A second-order polynomial is used as the kernel function. Figures 7 and 8 display the results.
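As a concrete illustration of this non-linear feature extraction step, here is a minimal NumPy sketch of kernel PCA with a second-order polynomial kernel (our own simplified implementation, not the authors' Matlab code; the kernel form k(x, y) = (x·y + 1)² is one common choice of second-order polynomial kernel):

```python
import numpy as np

def kernel_pca(X, n_components, degree=2):
    """Kernel PCA with a polynomial kernel k(x, y) = (x . y + 1)^degree.
    X: (n_samples, n_features) matrix of vectorized face images.
    Returns the projections of the training samples onto the leading
    principal components in the kernel-induced feature space."""
    n = X.shape[0]
    K = (X @ X.T + 1.0) ** degree            # polynomial Gram matrix
    one_n = np.ones((n, n)) / n              # centering in feature space
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    vals, vecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Projection of sample i onto component k is vecs[i, k] * sqrt(vals[k])
    return vecs * np.sqrt(np.maximum(vals, 0.0))
```

For the KLDA variant described above, ordinary LDA would then be applied to the rows returned by this function.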

Fig. 4. Linear based algorithms using RBF classifier on Sheffield database

Figures 5 and 6 show the results of the FMEDA algorithm compared with the LDA and MEDA algorithms. The results indicate that FMEDA has a better recognition rate than LDA, MEDA, and the other linear methods.

Fig. 5. FMEDA algorithm using RBF classifier on ORL database


Fig. 7. Non-linear based algorithms using RBF classifier on ORL database

Fig. 8. Non-linear based algorithms using RBF classifier on Sheffield database



As the figures show, kernel linear discriminant analysis gives better results than kernel principal component analysis, which in turn outperforms the Eigenface method. Comparing these results with those of the linear algorithms confirms that the non-linear methods are not substantially better than the linear ones. A possible reason is that the between-class distances do not increase when the data are mapped into the higher-dimensional space.

3.3. Evaluating CCA

Combining information is a powerful technique in data processing. The combination can be performed at three levels: the pixel level, the feature level, and the decision level, similar to combining classifiers. CCA combines information at the feature level. One advantage of combining features is that feature vectors calculated using different methods capture different characteristics of the pattern. By combining them, the useful discriminant information in the vectors is kept while redundant information is discarded. For this experiment, CCA was applied to two different feature vectors, so two methods are needed, each extracting features from the image using a different technique. The first method is FMEDA, which had better results than the other linear and non-linear appearance-based methods. The other method that we used is Discrimination Power Analysis (DPA). CCA is then applied to the features extracted by these two methods.

A method has been introduced based on the DCT that extracts features with a better capability to discriminate faces [7]. As mentioned before, in the conventional DCT approach the coefficients are chosen in a zigzag manner, and some of the low-frequency coefficients are discarded because they contain the illumination information. The low-frequency coefficients are in the upper-left part of the image. Some coefficients have more discrimination power than others, and therefore by extracting these features a higher recognition rate can be achieved. So, instead of choosing the coefficients in a zigzag manner, [7] searched for the coefficients which have more power to discriminate between images.
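A compact sketch of the CCA fusion step (a standard textbook formulation in NumPy; the small regularization term and the concatenation-style fusion are our assumptions, not details from the paper):

```python
import numpy as np

def cca(X, Y, n_pairs, reg=1e-6):
    """Canonical Correlation Analysis between paired feature sets
    X (n, p) and Y (n, q), e.g. FMEDA and DPA features of the same
    images. Returns fused features and the canonical correlations."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    # Whiten both sets and take the SVD of the cross-covariance
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M)
    Wx = np.linalg.solve(Lx.T, U[:, :n_pairs])
    Wy = np.linalg.solve(Ly.T, Vt[:n_pairs].T)
    # Feature-level fusion: concatenate the canonical variates
    return np.hstack([Xc @ Wx, Yc @ Wy]), s[:n_pairs]
```

The singular values s are the canonical correlations; strongly correlated directions of the two feature sets carry the shared discriminant information, while redundant directions are dropped.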
Unlike methods such as PCA and LDA, which use between-class and within-class scatter matrices and try to maximize the discrimination in the transformed domain, DPA searches for the best discriminating features in the original domain. The DPA algorithm is as follows [7]. Assume the DCT has been applied to an image, giving the coefficient matrix X:

 x11 x X =  21  ...   xM1

x12 x22 ... xM2

... x1N  ... x2N  ... ...   ... x MN  M ×N


where C is the number of people in the database (the number of classes), and for each person there are S training images, so there are C·S training images in total. Table 4 shows how to calculate the DPA of each coefficient x_ij:

Table 4. DPA algorithm [7]

1. Construct a large matrix containing the DCT coefficients of all the training images.

2. Calculate the mean and variance of each class:

3. Calculate the variance of all classes.

4. Calculate the mean and variance of all training samples:

5. For location (i, j) calculate the DP:

Higher values in D indicate coefficients with higher discrimination ability. Table 5 shows the procedure for recognizing faces:

Table 5. Procedure for recognizing faces [7]

1. Compute the DCT of the training images, and normalize the results.
2. Use a mask to discard some of the low and high frequencies.
3. Calculate the DPA for the coefficients inside the mask.
4. Find and mark the n largest coefficients; set the remaining coefficients to zero. The resulting matrix is an M×N matrix with n non-zero elements.
5. Multiply the DCT coefficients by the matrix calculated in the previous step, and convert the resulting matrix into a vector.
6. Train a classifier using the training vectors. Apply the same process to the test images.
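Reading the DP at location (i, j) as the ratio of between-class variance to within-class variance of that coefficient (one common interpretation of the steps in Table 4; the exact normalization in [7] may differ), Tables 4 and 5 can be sketched as:

```python
import numpy as np

def discrimination_power(coeffs):
    """DP map over coefficient locations. coeffs: (C, S, M, N) array of
    DCT coefficients for C classes with S training images each."""
    class_means = coeffs.mean(axis=1)          # (C, M, N)
    within = coeffs.var(axis=1).mean(axis=0)   # mean of per-class variances
    between = class_means.var(axis=0)          # variance of class means
    return between / (within + 1e-12)          # high DP = discriminative

def select_coefficients(coeffs, n):
    """Steps 4-5 of Table 5: keep the n coefficients with the largest
    DP, zero the rest, and vectorize each image."""
    D = discrimination_power(coeffs)
    mask = np.zeros_like(D)
    idx = np.unravel_index(np.argsort(D, axis=None)[::-1][:n], D.shape)
    mask[idx] = 1.0
    return (coeffs * mask).reshape(coeffs.shape[0], coeffs.shape[1], -1)
```

The resulting vectors (one per training image) are then fed to the classifier, exactly as in step 6.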




Figure 9 shows the comparison between FMEDA, DPA and CCA. The results illustrate that applying CCA to the features can increase the recognition rate for human faces.


Fig. 9. Comparing CCA with FMEDA and DPA using RBF classifier on ORL database

Fig. 10. Comparing CCA with FMEDA and DPA using RBF classifier on Sheffield database

4. Conclusion

In this paper several linear and non-linear appearance-based methods were discussed and applied to two popular face recognition databases. Among the linear methods FMEDA had better results than the other linear methods, and among the non-linear methods KLDA outperformed KPCA. The experiments also show that the linear methods have recognition rates similar to the non-linear methods. Furthermore, a new method for face recognition was introduced that outperforms the existing linear and non-linear methods. Canonical Correlation Analysis (CCA) is a strong tool for combining information at the feature level. Fractional Multiple Exemplar Discriminant Analysis (FMEDA) and Discrimination Power Analysis (DPA) were used as the feature extraction techniques. This paper's experimental results show that CCA using DPA and FMEDA exhibits improved results compared to other related methods.

AUTHORS
Mohammadreza Hajiarbabi* – Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, USA. E-mail: mehrdad.hajiarbabi@ku.edu
Arvin Agah – Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, Kansas, USA. E-mail: agah@ku.edu
*Corresponding author

REFERENCES
[1] Blanz V. S., Vetter T., "Face identification across different poses and illuminations with a 3D morphable model". In: IEEE International Conference on Automatic Face and Gesture Recognition, 2002, 202–207. DOI: 10.1109/AFGR.2002.1004155.
[2] Borga M., Learning Multidimensional Signal Processing, Department of Electrical Engineering, Linköping University, Linköping Studies in Science and Technology Dissertations, no. 531, 1998.
[3] Boser B. E., Guyon I. M., Vapnik V. N., "A training algorithm for optimal margin classifiers". In: D. Haussler (ed.), Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, 1992, 144–152. DOI: 10.1145/130385.130401.
[4] Brunelli R., Poggio T., "Face recognition: Features versus templates", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, 1993, 1042–1053. DOI: 10.1109/34.254061.
[5] Chen S., Cowan C. F. N., Grant P. M., "Orthogonal least squares learning algorithm for radial basis function networks", IEEE Transactions on Neural Networks, vol. 2, no. 2, 1991, 302–309. DOI: 10.1109/72.80341.
[6] Cootes T. F., Edwards G. J., Taylor C. J., "Active appearance models", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, 2001, 681–685. DOI: 10.1109/34.927467.
[7] Dabbaghchian S., Ghaemmaghami M., Aghagolzadeh A., "Feature extraction using discrete cosine transform and discrimination power analysis with a face recognition technology", Pattern Recognition, vol. 43, no. 4, 2010, 1431–1440. DOI: 10.1016/j.patcog.2009.11.001.
[8] Fukunaga K., Introduction to Statistical Pattern Recognition, 2nd ed., San Diego, CA: Academic Press, 1990, 445–450.
[9] Gui J., Sun Z., Jia W., Hu R., Lei Y., Ji S., "Discriminant sparse neighborhood preserving embedding for face recognition", Pattern Recognition, vol. 45, no. 8, 2012, 2884–2893. DOI: 10.1016/j.patcog.2012.02.005.
[10] Gupta M. M., Jin L., Homma N., Static and Dynamic Neural Networks: From Fundamentals to Advanced Theory, John Wiley & Sons, 2003.
[11] Hajiarbabi M., Askari J., Sadri S., Saraee M., "The Evaluation of Camera Motion, Defocusing and Noise Immunity for Linear Appearance Based Methods in Face Recognition". In: IEEE Conference WCE 2007/ICSIE 2007, vol. 1, 2007, 656–661.
[12] Hajiarbabi M., Askari J., Sadri S., Saraee M., "Face Recognition Using Discrete Cosine Transform plus Linear Discriminant Analysis". In: IEEE Conference WCE 2007/ICSIE 2007, vol. 1, 2007, 652–655.
[13] Hajiarbabi M., Askari J., Sadri S., "A New Linear Appearance-based Method in Face Recognition", Advances in Communication Systems and Electrical Engineering, Lecture Notes in Electrical Engineering, vol. 4, Springer, 2008, 579–587. DOI: 10.1007/978-0-387-74938-9_39.
[14] Hajiarbabi M., Agah A., "Face Detection in color images using skin segmentation", Journal of Automation, Mobile Robotics and Intelligent Systems, vol. 8, no. 3, 2014, 41–51.
[15] Hajiarbabi M., Agah A., "Human Skin Color Detection using Neural Networks", Journal of Intelligent Systems, under review, 2014.
[16] Hotelling H., "Relations between two sets of variates", Biometrika, vol. 28, no. 3–4, 1936, 321–377. DOI: 10.2307/2333955.
[17] Hsu R., Abdel-Mottaleb M., Jain A., "Face Detection in Color Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, 2002, 696–706.
[18] Huang J., Heisele B., Blanz V., "Component-based Face Recognition with 3D Morphable Models". In: Proceedings of the 4th International Conference on Audio- and Video-based Biometric Person Authentication, Surrey, UK, 2003. DOI: 10.1007/3-540-44887-X_4.
[19] Kong S., Heo J., Abidi B., Paik J., Abidi M., "Recent Advances in Visual and Infrared Face Recognition – A Review", Computer Vision and Image Understanding, vol. 97, no. 1, 2005, 103–135. DOI: 10.1016/j.cviu.2004.04.001.
[20] Kwak K. C., Pedrycz W., "Face recognition using a fuzzy Fisherface classifier", Pattern Recognition, vol. 38, 2005, 1717–1732.
[21] Lotlikar R., Kothari R., "Fractional-step dimensionality reduction", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 6, 2000, 623–627. DOI: 10.1109/34.862200.
[22] MathWorks, 2015: www.mathworks.com.
[23] ORL Database, 2015: http://www.camorl.co.uk.
[24] Rowley H., Baluja S., Kanade T., "Neural network-based face detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, 1998, 22–38.
[25] Sarfraz M., Computer Aided Intelligent Recognition Techniques and Applications, John Wiley & Sons, 2005, 1–10.
[26] Scholkopf B., Statistical Learning and Kernel Methods, Microsoft Research Limited, February 29, 2000.
[27] Scholkopf B., Smola A., Muller K. R., "Nonlinear component analysis as a kernel eigenvalue problem", Neural Computation, vol. 10, no. 5, 1998, 1299–1319.
[28] Sheffield (UMIST) Database, 2015: http://www.sheffield.ac.uk/eee/research/iel/research/face.
[29] Shu X., Gao Y., Lu H., "Efficient linear discriminant analysis with locality preserving for face recognition", Pattern Recognition, vol. 45, no. 5, 2012, 1892–1898.
[30] Sun Q. S., Zeng S. G., Liu Y., Heng P. A., Xia D. S., "A new method of feature fusion and its application in image recognition", Pattern Recognition, vol. 38, no. 12, 2005. DOI: 10.1016/j.patcog.2004.12.013.
[31] Turk M., "A Random Walk through Eigenspace", IEICE Transactions on Information and Systems, vol. 84, no. 12, 2001.
[32] Turk M., Pentland A., "Eigenfaces for recognition", Journal of Cognitive Neuroscience, vol. 3, 1991, 71–86.
[33] Viola P., Jones M. J., "Robust real-time object detection". In: Proceedings of the IEEE Workshop on Statistical and Computational Theories of Vision, 2001.
[34] Wiskott L., Fellous J. M., Kruger N., von der Malsburg C., "Face recognition by elastic bunch graph matching", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, 1997, 775–779.
[35] Yang J., Jin Z., Yang J., Zhang D., Frangi A. F., "Essence of kernel Fisher discriminant: KPCA plus LDA", Pattern Recognition, vol. 37, no. 10, 2004, 2097–2100. DOI: 10.1016/j.patcog.2003.10.015.
[36] Yu H., Yang J., "A Direct LDA algorithm for high-dimensional data with application to face recognition", Pattern Recognition, vol. 34, no. 10, 2001, 2067–2070.
[37] Zhou S. K., Chellappa R., "Multiple-Exemplar Discriminant Analysis for Face Recognition", Center for Automation Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 2003.



Improving Self-localization Efficiency in a Small Mobile Robot by Using a Hybrid Field of View Vision System

Submitted: 27th August 2015; accepted: 18th September 2015

Marta Rostkowska, Piotr Skrzypczyński

DOI: 10.14313/JAMRIS_4-2015/30

Abstract: In this article a self-localization system for small mobile robots, based on inexpensive cameras and unobtrusive, passive landmarks, is presented and evaluated. The main contribution is the experimental evaluation of the hybrid field of view vision system for self-localization with artificial landmarks. The hybrid vision system consists of an omnidirectional, upward-looking camera with a mirror, and a typical front-view camera. This configuration is inspired by the co-operation of peripheral and foveal vision in animals. We demonstrate that the omnidirectional camera enables the robot to quickly detect landmark candidates and to track already known landmarks in the environment. The front-view camera, guided by the omnidirectional information, enables precise measurements of the landmark position over extended distances. The passive landmarks are based on QR codes, which makes it possible to easily include in the landmark pattern additional information relevant for navigation. We present an evaluation of the positioning accuracy of the system mounted on a SanBot Mk II mobile robot. The experimental results demonstrate that the hybrid field of view vision system and the QR code landmarks enable the small mobile robot to navigate safely along extended paths in a typical home environment.

Keywords: self-localization, artificial landmark, omnidirectional camera

1. Introduction


An important requirement for any mobile robot is to figure out where it is within its environment. The pose of a wheeled robot (position and orientation xR = [xR yR θR]T) can be estimated by means of odometry, but this method alone is insufficient [27], and the pose has to be corrected using measurements from external sensors. Although there are many approaches to self-localization known from the literature, nowadays Simultaneous Localization and Mapping (SLAM) is considered the state-of-the-art approach to obtaining information about the robot pose [7]. The SLAM algorithms estimate both the robot pose and the environment map from the sensory measurements, and thus do not need a predefined map of the workspace. This is an important advantage, because obtaining a map of the environment that is suitable for self-localization is often a tedious and time-consuming task. However, the known SLAM algorithms require data from highly precise sensors, such as laser scanners [28], or have high computing power demands if less precise data (e.g. from passive cameras) are used [8]. Thus, the SLAM approach is rather unsuitable for small mobile robots, such as our SanBot [19], which have quite limited resources with respect to on-board sensing, computing power, and communication bandwidth. Therefore, such a robot requires an approach to self-localization that does not need to construct a map of the environment, or that uses a simple and easy-to-survey representation of the known area. Moreover, the self-localization system should use data from compact and low-cost sensors.

In the context of navigation, CCD/CMOS cameras are the most compact and low-cost sensors for mobile robots [6]. However, most passive vision-based localization methods fail under natural environmental conditions, due to occlusions, shadows, changing illumination, etc. Therefore, in practical applications of mobile robots artificial landmarks are commonly employed. They are objects purposefully placed in the environment, such as visual patterns or reflecting tapes. Landmarks enhance the efficiency and robustness of vision-based self-localization [29]. It was also demonstrated that simple artificial landmarks are a valuable extension to visual SLAM [3]. An obvious disadvantage is that the environment has to be engineered. This problem can be alleviated by using simple, cheap, expendable and unobtrusive markers, which can be easily attached to walls and various objects. In this research we employ simple landmarks printed in black and white that are based on the matrix QR (Quick Response) codes commonly used to recognize packages and other goods.
In our recent work [21] we evaluated the QR code landmarks as self-localization aids in two very different configurations of the camera-based perception system: an overhead camera that observed a landmark attached on top of a mobile robot, and a front-view camera attached to a robot, which observed landmarks freely placed in the environment. Both solutions make it possible to localize the robot in real-time with sufficient accuracy, but both have important practical drawbacks. The overhead camera provides an inexpensive means to localize a group of a few small mobile robots in a desktop application, but cannot be easily scaled up for larger mobile robots operating in a real environment. The front-view camera with on-board image processing is a self-contained solution for self-localization, which enables the robot to work autonomously, making it independent of possible communication problems. However, the landmarks are detectable and decodable only over a limited range of viewing configurations. Thus, the robot has to turn the front-mounted camera towards the area of the landmark location before it starts to acquire an image. In a complicated environment with possible occlusions, this approach may lead to a lot of unnecessary motion. Eventually, the robot can get lost if it cannot find a landmark before the odometry drifts too much.

In this paper we propose an approach that combines to some extent the advantages of the overhead camera and the front-view camera for self-localization with passive landmarks, while avoiding the aforementioned problems. We designed an affordable hybrid field of view vision system, which takes inspiration from nature and resembles the peripheral and foveal vision in animals. The system consists of a low-cost omnidirectional camera and a typical front-view camera. The omnidirectional component, employing an upward-looking camera and a profiled mirror, provides the robot with an analogy of peripheral vision in animals: it gives the robot the ability to quickly detect interesting objects over a large field of view. In contrast, the front-view camera provides an analog of foveal vision: the robot can focus on details of already detected objects in a much narrower field of view. The co-operation of these two subsystems makes it possible to track in real-time many landmarks located in the environment, without the need to move the robot platform, while it is still possible to precisely measure the distances and viewing angles to the already found landmarks.

The remainder of this paper is organized as follows: in the next Section we analyze the most relevant related work. Section 3 introduces the concept and design of the hybrid vision system, whereas the landmarks based on QR codes and the image processing algorithms used in self-localization are described in Section 4. The experimental results are presented in Section 5.
Section 6 concludes the paper and presents an outlook of further research.

2. Related Work

The advantages of biologically-inspired vision for robot self-localization have been demonstrated in a few papers. For instance, Siagian and Itti [25] have shown that extracting the "gist" of a scene to produce a coarse localization hypothesis, and then refining this hypothesis by locating salient landmark points, enables the Monte-Carlo localization algorithm to work robustly in various indoor/outdoor scenarios. However, in this work both the global and the local characteristics of the scene were extracted from typical perspective-view images. One example of a system that is more similar to our approach, and that mimics the cooperation between peripheral and foveal vision in humans, is given by Menegatti and Pagello [16]. They investigate cooperation between an omnidirectional camera and a perspective-view camera in the framework of a distributed vision system, with RoboCup Soccer as the target application. Only simple geometric and color features of the scene are considered in this system. An integrated, self-contained hybrid field of view vision system called HOPS (Hybrid Omnidirectional Pin-hole Sensor), which is quite similar in concept to our design, is presented in [5], where a calibration procedure is described that makes it possible to use this sensor for 3D measurements of the scene. Unfortunately, [5] gives no real application examples. Also Adorni et al. [1] describe the use of a combined peripheral/foveal vision system including an omnidirectional camera in the context of mobile robot navigation. Their system uses both cameras in a stereo vision setup and implements obstacle detection and avoidance, but not self-localization.

Although the bio-inspired vision solutions in mobile robot navigation mostly extract natural salient features, in many practical applications artificial landmarks are employed in order to simplify and speed up the image processing and to make the detection and recognition of features more reliable [15]. Visual self-localization algorithms are susceptible to errors due to unpredictable changes in the environment [11], and require much computing power to process natural features, e.g. by employing local visual descriptors [24]. The need to circumvent these problems in a small mobile robot that is used for education, requires reliable self-localization, and offers only limited computing resources motivated us to enhance the scene with artificial landmarks. Although active beacons can be employed, such as infra-red LEDs [27], most of the artificial visual landmarks are passive. This greatly simplifies deployment of the markers and makes them independent of any power source. Depending on the robot application and the characteristics of the operational environment, very different designs of passive landmarks have been proposed [9, 22]. In general, simple geometric shapes can be quickly extracted from the images, particularly if they are enhanced by color [3].
A disadvantage of such simple landmarks is that only very limited information (usually only the landmark ID) can be embedded in the pattern. In contrast, employing in the landmark design the idea of a barcode, either one-dimensional [4] or two-dimensional [12], makes it possible to easily encode additional information. In particular, matrix codes, which proliferated recently due to their use in smartphone-based applications, make it possible to fabricate much more information-rich landmarks. Moreover, landmarks based on matrix codes are robust to partial occlusion or damage of the content. They are also unobtrusive: their size can be adapted to the requirements of the particular application and environment, and as they are monochromatic, they can be produced in a color matching the surroundings, partially blending into the environment. The robotics and computer vision literature provides examples of successful applications of QR codes for mobile robot self-localization. Introducing QR codes into the environment has improved the robustness and accuracy of the 3D-vision-based Monte Carlo self-localization algorithm in a dynamic environment, as demonstrated in [14]. The information-carrying capability of matrix codes can be efficiently used for self-localization and communication in a system of many mobile robots [18] and in an intelligent home space for service robots [13]. An applicability of QR codes for navigation and object labelling has also been demonstrated in [10] on the NAO humanoid robot.

3. Hybrid Field of View Vision System

3.1. Concept and Components

Most of the mobile robots that employ vision for navigation use typical perspective cameras. A perspective camera can observe landmarks located at relatively large distances and positioned arbitrarily in the environment within the camera's horizontal field of view. The distance to the robot and the orientation of the landmark can be calculated from a single image taken by the perspective camera. Due to practical considerations, working indoors we assume that the landmarks are attached to vertical surfaces, such as the walls that dominate man-made environments. Thus, we consider only the angle α between the camera's optical axis and the normal to the landmark's plane in 2D (Fig. 1). In the camera coordinates the position of the landmark is defined by the distance zy measured along the camera's optical axis, which is assumed to be coincident with the robot's yR axis, and the distance d along the robot's xR axis, computed as the offset between the center of the image (i.e. the optical axis) and the center of the landmark. The distance at which the landmark can be detected and recognized depends on the camera resolution and the physical size of the landmark [21]. The information about the actual landmark size, as well as its position and orientation in the global reference frame xL = [xL yL θL]T, is encoded in the QR code of the landmark itself, so the robot does not need to keep a map of known landmarks in memory. Therefore, if at least one landmark can be recognized and decoded, the position of the robot and its orientation can be computed. However, in order to find landmarks in the surroundings, the robot has to constantly change its heading, which is inconvenient.

Fig. 1. Geometry of landmark measurements by using the perspective camera
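To make the geometry concrete, the sketch below inverts one landmark observation (zy, d, α) into a robot pose, given the landmark's global pose decoded from its QR code. The sign conventions (α = 0 when facing the landmark head-on, d positive to the camera's left) are our own illustrative assumptions; the paper does not spell them out:

```python
import numpy as np

def robot_pose_from_landmark(landmark_pose, zy, d, alpha):
    """Recover the robot pose (x, y, heading) from one observation.
    landmark_pose = (xL, yL, thetaL): global landmark position and the
    direction of its outward normal (radians). zy: distance along the
    optical axis, d: lateral offset, alpha: angle between the optical
    axis and the landmark normal. Conventions are illustrative only."""
    xL, yL, thL = landmark_pose
    psi = thL + np.pi + alpha                # heading of the optical axis
    fwd = np.array([np.cos(psi), np.sin(psi)])
    left = np.array([-np.sin(psi), np.cos(psi)])
    # The landmark lies zy ahead and d to the left of the camera
    pos = np.array([xL, yL]) - zy * fwd - d * left
    return pos[0], pos[1], psi
```

A round trip (place the robot, generate the measurements, invert them) recovers the original pose, which is the property the text relies on: one decoded landmark fixes both position and orientation.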


The omnidirectional subsystem combines a standard upward-looking camera with an axially symmetric mirror located above this camera, and provides a 360° field of view in the horizontal plane. This type of omnidirectional sensor is called catadioptric [23] and can be implemented using mirrors of different vertical profiles: parabolic, hyperbolic, or elliptical. The omnidirectional sensor used in this research has been designed and built within a project run by students, which imposed limitations as to the costs and the technology used. The mirror has been fabricated in a workshop from a single piece of aluminium using a simple milling machine, which limited the achievable curvature of the profile. Thus, a mirror of conical shape with a rounded, parabolic tip was designed (Fig. 2). This profile could be fabricated at acceptable cost using typical workshop equipment. The mirror is held by a highly transparent acrylic tube over the lens of an upward-looking webcam.

Fig. 2. Conical mirror: a – design of the conical mirror with rounded tip for the omnidirectional vision sensor, b – the fabricated mirror

Omnidirectional camera images represent the environment in a geometrically distorted way: straight lines become arcs, squares become rectangles. For this reason it is difficult to find the characteristic elements which are needed in the localization process. It is therefore necessary to transform the images using the single effective viewpoint [26]. Unfortunately, the chosen shape of the mirror makes it hard to achieve the single effective viewpoint property in the sensor. While for hyperbolic or elliptical mirrors this is simply achieved by placing the camera lens at a proper distance from the mirror (at one of the foci of the hyperbola/ellipse), for a parabolic mirror an orthographic lens must be interposed between the mirror and the camera [2]. This was impossible in our simple sensor design, which uses a fixed-lens webcam as the camera. Therefore, it is impossible to rectify the images captured by our omnidirectional camera into geometrically correct planar perspective images [26]. While the captured pictures may be mapped to flat panoramic images covering the 360° field of view, these images are still distorted along their vertical axis, i.e. they do not map correctly all the distances between the objects and the sensor into the vertical pixel locations. However, there are no distortions along the horizontal axis, which allows the angular location of the observed objects with respect to the sensor to be recovered. In the context of landmark-based positioning this means that while the landmarks can be detected in the omnidirectional images, only their angular locations, but not their distances with respect to the robot, can be determined precisely, particularly for more distant landmarks. Moreover, the internal content of the landmark (QR code) cannot be decoded reliably from the distorted images. Eventually, while the omnidirectional camera is capable of observing the whole proximity of the robot without unnecessary motion, it requires high computing power to rectify whole images, still giving no guarantee that the geometric measurements of landmark positions are precise enough for self-localization.
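Because azimuth is preserved by the axially symmetric mirror, the angular location of a landmark can be read directly from the raw omnidirectional image without full rectification. A minimal sketch (the image-center coordinates cx, cy and the panorama width are illustrative parameters, not values from the paper; a real system would calibrate the mirror center):

```python
import math

def landmark_bearing(u, v, cx, cy):
    """Bearing of an object seen in the raw omnidirectional image:
    with an axially symmetric mirror centered at pixel (cx, cy), the
    azimuth of a scene point equals the polar angle of its pixel
    about the image center."""
    return math.atan2(v - cy, u - cx)

def unwrap_to_panorama_column(u, v, cx, cy, pano_width):
    """Column of a flat panoramic image that the pixel maps to:
    azimuth in [0, 2*pi) scaled to the panorama width. Vertical
    distances remain distorted, as noted in the text."""
    theta = landmark_bearing(u, v, cx, cy) % (2 * math.pi)
    return int(theta / (2 * math.pi) * pano_width)
```

This is exactly the quantity the hybrid system needs: the bearing alone is enough to point the front-view camera at the landmark.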

Fig. 3. Exemplary view from the omnidirectional vision component with artificial landmarks in the field of view The aforementioned properties and limitations of the two camera subsystems resemble the characteristics of the foveal and peripheral vision in animals. This provides a strong argumentation to combine both systems. If the perspective view camera and the omnidirectional camera subsystems are coupled for landmark perception their drawbacks can be mutually compensated to a great extent. The omnidirectional camera can provide 360ᵒ view with detection of landmarks, and then guide the perspective camera to the angular coordinates of the found landmarks. The perspective camera can be pointed directly to the landmark at the known angular coordinates, and then can

Fig. 4. SanBot Mk II with the hybrid field of view vision system: CAD drawing (a), and a photo of assembled robot (b)+

VOLUME 9,

N° 4

2015

precisely measure its location and read the QR code. It should be noted that in this cooperation scheme neither full rectification of the omnidirectional images nor perspective correction of the front-view camera images is needed, which significantly decreases the required computing power.

3.2. Experimental System on the Mobile Robot

The experimental mobile robot with the hybrid field of view vision system is shown in Fig. 4. It is based on the small, differential-drive mobile platform SanBot Mk II [19]. The robot is equipped with the front-view camera and the omnidirectional camera. The front-view camera is mounted directly to the upper plate of the robot’s chassis. It is a Logitech 500 webcam, providing images at the resolution of 1280x1024. The Microsoft LifeCam webcam is used in the omnidirectional sensor. This particular camera has been chosen due to its compact size, high resolution (1280x720), and an easy to use API. The mirror has the diameter of 6 cm, and is located 11 cm above the camera lens. The omnidirectional camera is positioned precisely above the front-view camera. Both cameras stream images at 15 FPS through the USB interface. In the current experimental setup image processing takes place on a notebook PC. The simple controller board of the SanBot robot receives only the calculated positions of the landmarks that are necessary to compute the motion commands. These data are transferred via a serial (COM) port. The robot schedules the sequence of motions to execute in order to follow the planned path. The robot stops for a moment when taking images, and then obtains the outcome of calculations related to landmark-based self-localization.

4. Landmarks and Self-localization

4.1. Passive Landmarks with Matrix Codes

There are many possibilities to design a passive landmark, but if a CCD/CMOS camera is to be used as the sensor, the landmark should have the following basic properties:
• it should be recognizable and decodable over a wide range of viewing distances and angles;
• it should be easily recognizable under changing environment conditions (e.g. variable lighting, partial occlusions);
• its geometry should allow easy and quick extraction from an image;
• it should be easy to prepare, preferably printable in one color;
• it should be unique within the robot's working area, e.g. by containing an encoded ID.
For small mobile robot positioning we formulate a further requirement related to the limited computing power of the system: the landmarks should be able to carry additional information related to self-localization and navigation, such as the position of the landmark in the global frame, object labels or guidance hints. Such information, easily and robustly decodable from the landmark's image, helps the robot to navigate without building a map of the environment in memory.




All of the above-listed requirements are met by matrix codes. In our previous work [20] we experimentally evaluated four types of commercially used matrix codes as candidates for landmarks. The results revealed that, among these code types, QR codes are the most suitable for navigation. A QR code contains three position markers (upper left, upper right and lower left), which are additionally separated from the data by a white frame. This pattern allows the code orientation to be recovered easily. Compared to the other considered variants, the QR code is also characterized by the large size of a single module (i.e. a white/black cell). This is an important advantage, which ensures proper measurements even at long distances. In addition, QR codes are capable of partial error correction if they are damaged or occluded.
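The three position markers make orientation recovery straightforward: the two markers that are farthest apart lie on the code's diagonal, so the remaining one must be the corner (upper-left) pattern, and the code's rotation follows from its position relative to the diagonal. The sketch below (Python, purely illustrative; it is not the MessagingToolkit implementation, and the function names are our own) shows the idea:

```python
import math
from itertools import combinations

def split_markers(pts):
    # pts: centres of the three finder patterns, in any order;
    # the two farthest-apart markers span the diagonal, the third is the corner
    a, b = max(combinations(pts, 2),
               key=lambda p: (p[0][0] - p[1][0]) ** 2 + (p[0][1] - p[1][1]) ** 2)
    corner = next(p for p in pts if p is not a and p is not b)
    return corner, (a, b)

def code_rotation(pts):
    # rotation of the axis from the corner marker to the diagonal midpoint
    corner, (a, b) = split_markers(pts)
    mx, my = (a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0
    return math.degrees(math.atan2(my - corner[1], mx - corner[0]))
```

For an upright code with markers at (0, 0), (10, 0) and (0, 10), the corner is identified at (0, 0) and the diagonal axis points at 45°.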

Fig. 5. Exemplary QR code based landmark: QR code encoding value ‘1’ (a), 16.5x16.5 cm landmark with frame (b), recognized landmark in the environment (c)

As a result, our landmark is designed around a standard QR code, which is placed in the center. The code is surrounded by a black frame, which is used for initial sorting of landmark candidates on images and for reducing the number of potential objects subject to decoding and further processing. Landmarks are monochromatic (usually black-and-white), because they should be extremely low-cost and printable on any material, not only paper. An example of a QR code in which the ID ’1’ has been encoded is shown in Fig. 5a. A complete landmark with the added black frame is depicted in Fig. 5b. The same landmark, recognized and decoded in the environment, is shown in Fig. 5c. The program processing the data from both cameras has been created using the C# programming language and Microsoft Visual Studio 2010. The basic image processing part, leading to conversion of the omnidirectional images, extraction of landmark candidates from these images, and computation of the geometric measurements, is implemented with the EmguCV library [30], which is a C# port of the well-known OpenCV. The decoding of extracted QR codes is accomplished using the specialized MessagingToolkit library [32].


4.2. Localization of Landmarks on Perspective Camera Images

We assume that landmarks are mounted on rigid surfaces, so that they do not bend and deform the square frames. Thus, the images from the perspective view camera are assumed to be undistorted. The algorithm for recognition and localization of landmarks observed by the perspective view camera is shown in Fig. 6. The image processing begins with acquiring images from the front-view camera. The images are filtered, and then the thick frames around the QR codes are searched for by extracting candidate rectangles. If the surface of a landmark is roughly parallel to the camera sensor’s surface, there is no perspective deformation of the QR code, and it can be directly processed by the appropriate MessagingToolkit routine. If such a landmark is found, its distance to the robot’s camera is calculated. Whenever the camera’s optical axis intersects the center of the landmark (i.e. it is located horizontally in the center of the image) the distance is calculated in a simple way:

z = f·hL/hI,    (1)

where z is the distance between the landmark and the camera, f is the camera’s focal length, hL is the known vertical dimension (height) of the landmark, and hI is the observed object’s vertical dimension on the image. The viewing angle can be computed from the formula:

α = arccos((wI·hL)/(wL·hI)),    (2)

where wL is the known horizontal dimension (width) of the landmark, and wI is the observed object’s horizontal dimension on the image. However, if the landmark is not located in the center of the image (cf. Fig. 1), the distance between the camera and the landmark is calculated from the right-angle triangle made by the distance zy measured along the camera’s optical axis (which is assumed to be coincident with the robot’s yR axis), and the distance d in the robot’s xR axis, computed as the offset between the center of the image and the center of the landmark:

d = dp·hL/hI,    (3)

Fig. 6. Landmarks detection and decoding algorithm for the perspective camera



where dp is the distance in pixels between the center of the image and the center of the landmark’s bounding frame. The viewing angle between the camera’s optical axis and the vector normal to the landmark surface is calculated as:

α = arctan(d/zy),    (4)

where zy is the perpendicular distance from the camera to the landmark, and d is the distance calculated from (3).
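Equations (1), (3) and (4) amount to a few lines of arithmetic. The sketch below (Python, with illustrative names; the paper's implementation is in C#) assumes the reconstructed forms of these formulas, with the focal length expressed in pixels:

```python
import math

def landmark_distance(f, h_l, h_i):
    # (1): z = f * h_L / h_I, for a landmark centred on the optical axis
    return f * h_l / h_i

def lateral_offset(d_p, h_l, h_i):
    # (3): metric offset d recovered from the pixel offset d_p, using the
    # landmark's known height as the scale reference
    return d_p * h_l / h_i

def viewing_angle(z_y, d):
    # (4): angle between the optical axis and the direction to the landmark
    return math.degrees(math.atan2(d, z_y))

# a 16.5 cm landmark imaged 330 px tall by a camera with f = 1000 px:
z = landmark_distance(1000.0, 16.5, 330.0)  # 50 cm
```

A landmark centred on the axis gives a viewing angle of 0°; a metric offset equal to zy gives 45°.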


Fig. 7. Spatial distribution of errors in QR code-based landmark localization by the perspective view camera: distance to landmark z errors (a), and viewing angle α errors (b)

However, if in the given viewing configuration of the perspective camera the surface of a landmark is not parallel to the camera sensor’s surface, the perspective deformation of the landmark’s image has to be corrected before decoding the QR code and calculating the distance and angle from (1)–(4). In such a case the relation between the locations of the characteristic points (corners) in 3D and on the image plane has to be found in order to properly calculate the landmark’s position and rotation. Computation of this relation is described in more detail in [21]. We omit these calculations here, because such situations should not occur when self-localization uses both the perspective and omnidirectional cameras, as the perspective camera is set to a proper angular position before taking an image of the landmark. Thus, the viewing angle of the landmark never exceeds 15°. Results of our earlier experiments [21] provide evidence that for such small viewing angles the correction of perspective brings no improvement in the landmark localization, while this procedure is computation intensive.

Quantitative results for the measurements of an exemplary passive landmark (size 20x20 cm) are shown in Fig. 7. The landmark was observed by the perspective camera from distances up to 2 meters and for viewing angles up to 60°. In this experiment the camera was positioned in such a way that the optical axis always intersected the center of the landmark, thus the d offset was zero. As could be expected, the distance measurement error grows for larger distances, but it also grows slightly for large viewing angles (Fig. 7a), which can be attributed to the uncorrected perspective deformation. As can be seen from the plot in Fig. 7b, the measured viewing angle is less precise for large and for very small distances. This is probably caused by the procedure searching for the thick black frame, which for very large images of a landmark (small distances) occasionally finds the inner border of the frame instead of the outer one. The average precision of the measurements (over all distances and viewing angles) turned out to be 1.3 cm for the distance and 2° for the viewing angle.

4.3. Recognition of Landmarks on the Omnidirectional Images

As described in Section 3, the geometry and optics of the low-cost omnidirectional camera do not permit full rectification of the distorted images. Therefore, we use images from the omnidirectional camera only to find potential landmark candidates in the robot’s vicinity, and to track the known landmarks. First, in order to reduce the amount of processed information, the color image from the camera is converted into a black-and-white image. The data processing starts by cropping and unwinding the omnidirectional image. Cropping the image consists in selecting the part of the picture which is necessary for recognition of the landmarks. The unwinding procedure is a simple cylinder unrolling. At the beginning the algorithm sets the height and width of the unrolled picture:

H = R2 − R1,  W = 2π·R2,    (5)

where R2 is the radius of the outer circle, and R1 is the radius of the inner circle marked in Fig. 3. Next, the algorithm computes a new position for each pixel of the unrolled image. This procedure is shown in pseudo-code:

Listing 1. Pixel position calculation procedure

    for (x = 0; x < W; x++) {
        yd = H - 1;                                // fill rows bottom-up
        for (y = 0; y < H; y++) {
            r = ((double)y / H) * (R2 - R1) + R1;  // row index -> ring radius
            theta = ((double)x / W) * 2 * Math.PI; // column index -> polar angle
            xs = Cx + r * Math.Sin(theta);
            ys = Cy + r * Math.Cos(theta);
            mapx.Data[yd, x] = xs;
            mapy.Data[yd, x] = ys;
            yd--;
        }
    }

The two operations described above produce the same result for each processed image, so they are executed only once, at the beginning of the program. Afterwards, the program unwinds each picture using the EmguCV cvRemap function, which transforms the source image using the specified maps (in our case mapx and mapy from the algorithm in Listing 1). After unwrapping the image, its extreme parts are duplicated at the opposite ends in order to obtain a continuous image in the areas where landmarks can appear. The unrolled image is shown in Fig. 8. This image undergoes morphological erosion in order to remove noise. Next, the Canny operator is used to find edges in the image. Among the found edges, those that are connected into rectangular shapes are selected. Then, the algorithm eliminates all nested rectangles – those located inside other rectangles. The found landmark candidates are marked on the image.
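The map construction of Listing 1 together with (5) can be prototyped in pure Python (an illustrative sketch; the paper builds the maps once in C# and passes them to EmguCV's cvRemap, and the vertical flip below mirrors Listing 1):

```python
import math

def build_unwrap_maps(cx, cy, r1, r2):
    # (5): the unrolled strip is H = R2 - R1 pixels high and W = 2*pi*R2 wide
    h = r2 - r1
    w = int(round(2 * math.pi * r2))
    mapx = [[0.0] * w for _ in range(h)]
    mapy = [[0.0] * w for _ in range(h)]
    for x in range(w):
        for y in range(h):
            r = (y / h) * (r2 - r1) + r1   # row index -> ring radius
            theta = (x / w) * 2 * math.pi  # column index -> polar angle
            yd = h - 1 - y                 # flip: outer ring at the bottom row
            mapx[yd][x] = cx + r * math.sin(theta)
            mapy[yd][x] = cy + r * math.cos(theta)
    return mapx, mapy
```

With OpenCV available, `cv2.remap(img, np.float32(mapx), np.float32(mapy), cv2.INTER_LINEAR)` then produces the unrolled strip in one call.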

The viewing angle of the landmark with respect to the robot’s heading is calculated as:

α = ((xs − Wd)/W)·360° − 180°,    (6)

where xs and ys define the center of the landmark in the unwrapped image, W is the width of the unwrapped image, and Wd is the width of the duplicated part of the picture. Afterwards, the program makes a list of potential landmarks and their relative angles. The algorithm for finding and localizing landmarks in the omnidirectional images is shown in Fig. 9.

Fig. 9. Landmarks detection for omnidirectional camera
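Assuming this form of (6) — a reconstruction, since the centre column of the strip is taken here as the zero bearing — the computation is a one-liner (illustrative Python):

```python
def landmark_bearing(xs, w, wd):
    # Reconstructed Eq. (6): the w columns of the unwrapped strip span 360
    # degrees; wd columns of duplicated margin are skipped, and the strip
    # centre maps to a bearing of 0 degrees relative to the robot's heading.
    return ((xs - wd) / w) * 360.0 - 180.0
```

A landmark found at the centre of the (non-duplicated part of the) strip thus yields 0°, and one at its left edge yields −180°.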

Fig. 8. Results from the omnidirectional camera: a – cropped and unwrapped image, b – extended image, c – unwrapped and extended image with marked candidates

4.4. Self-localization with the Hybrid System

The self-localization algorithm based on data from both the omnidirectional and the perspective camera is shown in Fig. 10. At the beginning, the program processes only the image from the omnidirectional camera. If the algorithm described in Subsection 4.3 finds a landmark candidate at a viewing angle smaller than ±15°, the program starts processing the image from the front-view camera. This way the robot does not need to aim the perspective camera directly at the landmark, which speeds up the self-localization process. When the landmark is seen in the angular sector of ±15°, the image from the perspective camera is processed. The program searches for the landmark, decodes it, and calculates the robot’s position and orientation in the external reference frame (cf. Fig. 1). The orientation of the robot θR is a composition of the landmark’s orientation in the global coordinates θL and the robot’s orientation with regard to the landmark α. The orientation is calculated as:

θR = θL′ + α,    (7)

where θL′ is θL − 180° and α is the angle calculated from (4). The robot’s position in the global reference frame is calculated as:

xR = xL ± d,  yR = yL ± zy,    (8)

where xL and yL define the landmark’s position, zy is the perpendicular distance between the camera and the landmark, and d is the distance calculated from (3). In (8) the plus sign is used to compute the



Fig. 10. Landmarks’ detection and decoding algorithm for the hybrid system

position in the x-axis when the landmark is located to the right side of the robot, and the minus sign when it is on the left. At the beginning of the calculations the algorithm assumes that the landmark is located in front of the robot, and uses the plus sign in (8) to compute the position in the y-axis; but if the robot has to rotate 180° to decode the landmark, the algorithm uses the minus sign. If the candidate landmark is found at a viewing angle larger than ±15°, the robot turns towards it until the angle becomes smaller than ±15°. The most common situation is that the algorithm finds more than one potential landmark; in such a case the robot turns towards the nearest one. Although the front-view camera is capable of recognizing landmarks visible at angles up to ±60°, to ensure robustness the QR codes are decoded only when they are visible at an angle of at most ±15°. Images from the front-view camera are processed only if the omnidirectional camera finds a potential landmark. If no landmark candidates can be found in the unwrapped image, the robot does not localize itself and tries to continue using odometry.
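The pose update of (7)–(8), together with the sign rules just described, can be sketched as follows (an illustrative sketch: the reconstructed forms of (7)–(8) are assumptions, and the sign handling simply mirrors the prose):

```python
def robot_pose(x_l, y_l, theta_l, z_y, d, alpha,
               landmark_right=True, landmark_front=True):
    # (7): theta_R = theta_L' + alpha, with theta_L' = theta_L - 180 deg
    theta_r = (theta_l - 180.0) + alpha
    # (8): plus sign in x when the landmark is to the right of the robot,
    # plus sign in y when the landmark is in front of it
    x_r = x_l + d if landmark_right else x_l - d
    y_r = y_l + z_y if landmark_front else y_l - z_y
    return x_r, y_r, theta_r
```

For a landmark at (100, 200) with θL = 90°, observed at zy = 50 cm, d = 10 cm and α = 5°, the sketch yields the pose (110, 250, −85°).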

5. Experiments and Results

In order to verify the accuracy of the landmark-based self-localization and the usability of the hybrid vision system in practical scenarios, we performed several experiments in a typical home environment. In these experiments we used one SanBot Mk II equipped with the hybrid field of view vision system. The ground truth data about the covered path was collected by manually measuring the robot’s

Fig. 11. Robot’s path during the experiment. Small squares represent points at which the robot stops and takes images

Tab. 1. Viewing angle determination results for the omnidirectional camera

Robot stop no. | α [°]  | αg [°]  | Δα [°]
1              | 71.98  | 72.00   | 0.02
2              | -9.19  | -10.00  | 0.81
3              | 10.52  | 9.00    | 1.52
4              | 21.33  | 20.00   | 1.33
5              | 4.39   | 5.00    | 0.61
6              | 0.35   | 0.00    | 0.35
7              | -37.69 | -38.00  | 0.31
8              | -3.56  | -3.00   | 0.56
9              | -49.88 | -50.00  | 0.12
10             | -3.40  | -4.00   | 0.60
11             | 47.84  | 50.00   | 2.16
12             | -2.91  | -3.00   | 0.09
2D position with respect to the planned path, which was marked on the floor with scotch tape. Here we present the quantitative results for the longest path, spanning three rooms (Fig. 11). During this experiment the robot covered the planned path ten times, which enabled us to assess the repeatability of the measurements carried out by our vision system. The test environment contains seven landmarks, which encode their positions and orientations with regard to the external system of coordinates. Using only one landmark, its data and trigonometric relations, the robot can calculate its position. For this reason, the landmarks in the environment are arranged so that the robot can always see at least one of them. At the beginning the program searches for potential landmarks in images from the omnidirectional camera. If the algorithm finds a potential landmark and its absolute viewing angle is less than ±15°, the program starts processing data from the front-view camera. The robot stops near a detected landmark. If the algorithm finds a landmark candidate, but the angle is bigger than ±15°, the robot turns towards the landmark until the angle is





Fig. 12. Exemplary images of a measurement taken at a single robot stop: a – cropped and unwrapped image where the algorithm finds a potential landmark but the angle is larger than 15°, b – cropped and unwrapped image where the algorithm finds a potential landmark and the angle is less than 15°, c – images from the perspective camera with the marked landmark

smaller than ±15°. Then, the robot updates its pose from the computed localization data and continues to the next via-point on the planned path. In this experiment the average error of determining the position of the robot was 3 cm in the x-axis and 5 cm in the y-axis, and the orientation error was 4°, when using the front-view camera. For the omnidirectional camera the orientation error was only 1°. This makes it possible to compensate for the degraded orientation accuracy in the robot pose by using data from


the omnidirectional system. Sample images from the measurements are presented in Fig. 12. Results for the omnidirectional camera are shown in Tab. 1, where α denotes the measured angle and αg the known ground-truth angle. Final results for the perspective camera-based self-localization guided by the omnidirectional camera data are shown in Tab. 2, where xL, yL, αL describe the landmark position in the global frame, xR, yR, αR denote the computed robot’s pose in the same global frame, xgR, ygR, αgR are the ground truth coordinates of the robot, ΔxR, ΔyR, ΔαR define the absolute localization errors, and σxR, σyR, σαR the standard deviations of the localization measurements. Both tables contain average results from 10 runs along the same path. These results demonstrate that the system based on a combination of the omnidirectional camera and the perspective camera provides localization accuracy that is satisfactory for home environment navigation, and improves the results in comparison to a system using only the front-view camera.
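The roughly 1° orientation accuracy quoted for the omnidirectional camera can be cross-checked against the per-stop errors Δα collected in Tab. 1 (a quick Python check, values in degrees):

```python
# per-stop viewing-angle errors from Tab. 1 (degrees)
errors = [0.02, 0.81, 1.52, 1.33, 0.61, 0.35,
          0.31, 0.56, 0.12, 0.60, 2.16, 0.09]

mean_error = sum(errors) / len(errors)
worst_error = max(errors)
```

The mean comes out near 0.7°, consistent with the claimed accuracy of about 1°; the worst single stop is 2.16°.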

6. Conclusions

This paper presents a new approach to mobile robot self-localization with passive visual landmarks. Owing to the hybrid field of view vision system, even a small and simple robot can use passive vision for global self-localization, achieving both high accuracy and robustness against problems that are common in vision-based navigation: occlusions, the limited field of view of the camera, and the limited range of landmark recognition. The proposed approach enables the use of low-cost hardware components and simplifies the image processing by avoiding full rectification and geometric correction of the images. The experiments conducted with a mobile robot demonstrated that the omnidirectional component can in most cases determine the viewing angle of a landmark with an accuracy better than 1°, using a partially rectified image. The positional accuracy of robot localization using the hybrid field of view system was in most cases better than 5 cm, which is satisfactory for home or

Tab. 2. Robot self-localization results for the hybrid system

L. no. | xL [cm] | yL [cm] | αL [°] | xR [cm] | yR [cm]  | αR [°] | xgR [cm] | ygR [cm] | αgR [°] | ΔxR [cm] | ΔyR [cm] | ΔαR [°] | σxR [cm] | σyR [cm] | σαR [°]
1      | 0.00    | 144.00  | 90.00  | 93.90   | 105.86   | -75.62 | 97.00    | 108.00   | -65.00  | 3.10     | 2.14     | 10.62   | 2.55     | 2.00     | 3.63
2      | 235.00  | 250.00  | 210.00 | 123.86  | 167.89   | 34.28  | 120.00   | 160.00   | 30.00   | 3.86     | 7.89     | 4.28    | 9.34     | 1.74     | 4.42
3      | 120.00  | 442.00  | 180.00 | 112.45  | 282.54   | 4.77   | 110.00   | 288.00   | 3.00    | 2.45     | 5.46     | 1.77    | 3.09     | 1.35     | 1.20
4      | 45.00   | 605.00  | 180.00 | 56.75   | 438.32   | 0.94   | 54.00    | 442.00   | 5.00    | 2.75     | 3.68     | 4.06    | 3.10     | 1.40     | 6.97
5      | -135.00 | 544.00  | 90.00  | 15.85   | 529.85   | 98.51  | 18.00    | 522.00   | 98.00   | 2.15     | 7.85     | 0.51    | 4.08     | 6.90     | 5.70
6      | -40.00  | 390.00  | 270.00 | -82.49  | 429.76   | -81.30 | -78.00   | 434.00   | -85.00  | 4.49     | 4.24     | 3.70    | 4.92     | 5.06     | 2.98
7      | -220.00 | 222.00  | 90.00  | -125.83 | 245.82   | -81.79 | -128.00  | 240.00   | -85.00  | 2.17     | 5.82     | 3.21    | 1.05     | 1.51     | 2.10



office navigation. However, an omnidirectional camera that provides the single effective viewpoint geometry should allow us to extend the applications of the hybrid system beyond artificial landmarks. This is a matter of ongoing development. Another direction of further research is a model of measurement uncertainty for the omnidirectional camera. Such a model should enable optimal fusion of the localization data from both cameras (e.g. by means of Kalman filtering) and more efficient planning of the positioning actions [27].

ACKNOWLEDGEMENTS

This work was supported by the Poznań University of Technology Faculty of Electrical Engineering grant DS-MK-141 in the year 2015.

AUTHORS

Marta Rostkowska* – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland. E-mail: marta.a.rostkowska@doctorate.put.poznan.pl

Piotr Skrzypczyński – Poznań University of Technology, Institute of Control and Information Engineering, ul. Piotrowo 3A, 60-965 Poznań, Poland. E-mail: piotr.skrzypczynski@put.poznan.pl

*Corresponding author

REFERENCES

[1] Adorni G., Bolognini L., Cagnoni S., Mordonini M., “A Non-traditional Omnidirectional Vision System with Stereo Capabilities for Autonomous Robots”, LNCS 2175, Springer, Berlin, 2001, 344–355. DOI: 10.1007/3-540-45411-X_36.
[2] Bazin J., Catadioptric Vision for Robotic Applications, PhD Dissertation, Korea Advanced Institute of Science and Technology, Daejeon, 2010.
[3] Baczyk R., Kasinski A., “Visual simultaneous localisation and map-building supported by structured landmarks”, Int. Journal of Applied Mathematics and Computer Science, vol. 20, no. 2, 2010, 281–293. DOI: 10.2478/amcs-2014-0043.
[4] Briggs A., Scharstein D., Braziunas D., Dima C., Wall P., “Mobile Robot Navigation Using Self-Similar Landmarks”. In: Proc. IEEE Int. Conf. on Robotics and Automation, San Francisco, 2000, 1428–1434. DOI: 10.1109/ROBOT.2000.844798.
[5] Cagnoni S., Mordonini M., Mussi L., “Hybrid Stereo Sensor with Omnidirectional Vision Capabilities: Overview and Calibration Procedures”. In: Proc. Int. Conf. on Image Analysis and Processing, Modena, 2007, 99–104. DOI: 10.1109/ICIAP.2007.4362764.
[6] DeSouza G., Kak A. C., “Vision for Mobile Robot Navigation: A Survey”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 24, no. 2, 2002, 237–267. DOI: 10.1109/34.982903.
[7] Durrant-Whyte H. F., Bailey T., “Simultaneous localization and mapping (Part I)”, IEEE Robotics & Automation Magazine, vol. 13, no. 2, 2006, 99–108. DOI: 10.1109/MRA.2006.1638022.
[8] Davison A., Reid I., Molton N., Stasse O., “MonoSLAM: Real-time single camera SLAM”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, 2007, 1052–1067. DOI: 10.1109/TPAMI.2007.1049.
[9] Fiala M., “Designing highly reliable fiducial markers”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 32, no. 7, 2010, 1317–1324. DOI: 10.1109/TPAMI.2009.146.
[10] Figat J., Kasprzak W., “NAO-mark vs. QR-code Recognition by NAO Robot Vision”. In: Progress in Automation, Robotics and Measuring Techniques, vol. 2 Robotics (R. Szewczyk et al., eds.), AISC 351, Springer, Heidelberg, 2015, 55–64. DOI: 10.1007/978-3-319-15847-1_6.
[11] Lemaire T., Berger C., Jung I.-K., Lacroix S., “Vision-based SLAM: Stereo and monocular approaches”, Int. Journal of Computer Vision, vol. 74, no. 3, 2007, 343–364. DOI: 10.1007/s11263-007-0042-3.
[12] Lin G., Chen X., “A robot indoor position and orientation method based on 2D barcode landmark”, Journal of Computers, vol. 6, no. 6, 2011, 1191–1197. DOI: 10.4304/jcp.6.6.1191-1197.
[13] Lu F., Tian G., Zhou F., Xue Y., Song B., “Building an Intelligent Home Space for Service Robot Based on Multi-Pattern Information Model and Wireless Sensor Networks”, Intelligent Control and Automation, vol. 3, no. 1, 2012, 90–97. DOI: 10.4236/ica.2012.31011.
[14] McCann E., Medvedev M., Brooks D., Saenko K., “Off the Grid: Self-Contained Landmarks for Improved Indoor Probabilistic Localization”. In: Proc. IEEE Int. Conf. on Technologies for Practical Robot Applications, Woburn, 2013, 1–6. DOI: 10.1109/TePRA.2013.6556349.
[15] Martínez-Gomez J., Fernández-Caballero A., García-Varea I., Rodríguez L., Romero-Gonzalez C., “A Taxonomy of Vision Systems for Ground Mobile Robots”, Int. Journal of Advanced Robotic Systems, vol. 11, 2014. DOI: 10.5772/58900.
[16] Menegatti E., Pagello E., “Cooperation between Omnidirectional Vision Agents and Perspective Vision Agents for Mobile Robots”. In: Intelligent Autonomous Systems 7 (M. Gini et al., eds.), IOS Press, Amsterdam, 2002, 231–135.
[17] Potúcek I., Omni-directional image processing for human detection and tracking, PhD Dissertation, Brno University of Technology, Brno, 2006.
[18] Rahim N., Ayob M., Ismail A., Jamil S., “A comprehensive study of using 2D barcode for multi robot labelling and communication”, Int. Journal on Advanced Science Engineering Information Technology, vol. 2, no. 1, 80–84, 1998.
[19] Rostkowska M., Topolski M., Skrzypczynski P., “A Modular Mobile Robot for Multi-Robot Applications”, Pomiary Automatyka Robotyka, vol. 17, no. 2, 2013, 288–293.
[20] Rostkowska M., Topolski M., “Usability of matrix barcodes for mobile robots positioning”, Postępy Robotyki, Prace Naukowe Politechniki Warszawskiej, Elektronika (K. Tchon, C. Zielinski, eds.), vol. 194, no. 2, 2014, 711–720. (in Polish)
[21] Rostkowska M., Topolski M., “On the Application of QR Codes for Robust Self-Localization of Mobile Robots in Various Application Scenarios”. In: Progress in Automation, Robotics and Measuring Techniques (R. Szewczyk et al., eds.), AISC, Springer, Zürich, 2013, 243–252. DOI: 10.1007/978-3-319-15847-1_24.
[22] Rusdinar A., Kim J., Lee J., Kim S., “Implementation of real-time positioning system using extended Kalman filter and artificial landmarks on ceiling”, Journal of Mechanical Science and Technology, vol. 26, no. 3, 2012, 949–958. DOI: 10.1007/s12206-011-1251-9.
[23] Scaramuzza D., Omnidirectional vision: from calibration to robot motion estimation, PhD Dissertation, ETH Zürich, 2008.
[24] Schmidt A., Kraft M., Fularz M., Domagala Z., “The comparison of point feature detectors and descriptors in the context of robot navigation”, Journal of Automation, Mobile Robotics & Intelligent Systems, vol. 7, no. 1, 2013, 11–20.
[25] Siagian C., Itti L., “Biologically Inspired Mobile Robot Vision Localization”, IEEE Trans. on Robotics, vol. 25, no. 4, 2009. DOI: 10.1109/TRO.2009.2022424.
[26] Scharfenberger Ch. N., Panoramic Vision for Automotive Applications: From Image Rectification to Ambiance Monitoring and Driver Body Height Estimation, PhD Dissertation, Institute for Real-Time Computer Systems, Munich University of Technology, Munich, 2010.
[27] Skrzypczynski P., “Uncertainty Models of the Vision Sensors in Mobile Robot Positioning”, Int. Journal of Applied Mathematics and Computer Science, vol. 15, no. 1, 2005, 73–88.
[28] Skrzypczynski P., “Simultaneous Localization and Mapping: A Feature-Based Probabilistic Approach”, Int. Journal of Applied Mathematics and Computer Science, vol. 19, no. 4, 2009, 575–588. DOI: 10.2478/v10006-009-0045-z.
[29] Yoon K.-J., Kweon I.-S., “Artificial Landmark Tracking Based on the Color Histogram”. In: Proc. IEEE/RSJ Conf. on Intelligent Robots and Systems, Maui, 2001, 1918–1923. DOI: 10.1109/IROS.2001.976354.
[30] EmguCV, http://www.emgu.com/wiki/index.php/Main
[31] OpenCV Documentation, http://docs.opencv.org
[32] MessagingToolkit, http://platform.twit88.com


Design and Movement Control of a 12-legged Mobile Robot Submitted: 9th September 2015; accepted: 21st September 2015

Jacek Rysiński, Bartłomiej Gola, Jerzy Kopeć DOI: 10.14313/JAMRIS_4-2015/31

Abstract: In the present paper, the design and performance of a 12-legged walking robot is described. A complete technical specification was developed for the proposed solution, and the stability of the robot's movements was analyzed. Communication between the robot and the operator is based on remote control procedures performed by means of our own software, written in versions for a smartphone or a desktop computer. The software version for desktop computers has an additional useful feature, i.e. monitoring of the robot's working area via a wireless camera mounted on the front side of the robot.

Keywords: mobile robot, design, control, kinematic analysis

1. Introduction

Specialized mobile robots are produced and utilized all over the world. Their typical range of applications is as follows: monitoring, repair routines, inspection of chemically contaminated (or under threat of contamination) areas, extinguishing fires, detecting and removing bombs, as well as various actions against terrorism. A separate but intensively developed area of utilization of such robots is the detection and removal of landmines and associated tasks [2], [3], [4]. Additional applications of mobile inspection robots are tasks performed in pits and coal mines, as well as similar locations where human life (e.g. that of a machine operator) is in danger. In Poland, there are no robots designed and/or manufactured for such applications. In particular, robots for anti-terrorist actions are needed in Poland, where the demand for these devices is on par with other countries. All these devices have one common feature, i.e. a movement/drive system which enables them to move. The goal of the present work was to design and manufacture a walking robot together with some cooperating elements, e.g. a control system in which control routines are performed via a modern phone (iPhone), tablet, notebook or desktop computer.

2. Mechanical Subsystem of the Robot

Within the design phase of the DUODEPED robot, the ideas of motion based upon wheels or caterpillar tracks were excluded. Instead, the mechanism of special legs presented in 2005 in Austria by the Dutch physicist Theo Jansen was applied. The concept of the design solution is based upon the utilization of simple geometric figures which are mutually connected by means of nodes (Fig. 1). The point marked in blue (1') is fixed on the rotation axis; the element which transmits power is marked in green (2'). The device is driven into a motion which enables the performance of consecutive steps via the node of a triangle which has contact with the ground. The time of contact with the ground for the orange node (3') is equal to the time of a 120° rotation of the green node around the driving shaft. Taking into account that one full rotation is equivalent to 360°, and aiming for sufficient stability of the device, the number of pairs of legs (Fig. 1c) assigned to one rotation cycle was assumed to be equal to 3. Therefore their driving bands are mounted on the driving shaft, offset from each other by 120°. This design solution allows for permanent contact of the legs with the ground; moreover, it creates the basis for stable walking.

Fig. 1. Concept of robot legs (a–c)

Fig. 2. Kinematical scheme of single robot leg and trajectory of wheel axis

2.1. Legs, Their Number and Motion Trajectory

The robot legs consist of triangles connected by bands. Their special shape (geometrical properties) produces a leg movement trajectory such that during a step point P6 moves parallel to the ground, giving an impression of smooth motion as well as assuring the stability of the performed robot steps. Each robot leg consists of 40 movable elements. Aiming for reduction of friction, the legs are mutually separated by 0.5 mm thick Teflon separators (washers). Moreover, aiming for reduction of size, sliding bearings in the form of brass rings mounted on shafts were used instead of ball bearings.

2.2. Design Solution of the Robot Leg


Using the above described ideas, a design solution has been proposed which assures a minimal number of support points. Therefore, 12 legs were designed, grouped in four sets of three legs. The design solution has the property that in a special case of motion (straight ahead) and a special arrangement of the legs, the number of supports increases to eight, whereas the remaining four legs are in the middle of the cycle of displacement of the support point over the ground. The design solution according to Theo Jansen's idea allows for permanent contact of a minimal number of legs with the ground. However, due to the fact that the centre of gravity is placed relatively high and, simultaneously, the support points are distributed closely to each other, the construction is not fully stable. Aiming for improvement of stability, the leg mounting points were displaced by a distance of 194 mm (Fig. 3).

Fig. 3. Design solution for fixing robot legs

Due to the change of positions for fixing the robot legs, the drive module was designed as a system of two co-operating crankshafts placed symmetrically (180°). Their consecutive cranks are mutually shifted by an angle of 120° around the main rotation axis. Their drive system was coupled with the gear wheels of a gear having a ratio of 2.29. The advantages of the proposed 12-legged walking system are as follows: high load capacity, relatively high velocity of motion and the ability to overcome obstacles, while preserving the stability of the whole structure.

2.3. Kinematical Analysis of Robot Leg

The analysis of robot movement is performed under a few assumptions: every linkage is rigid (non-compliant), all kinematic pairs are free of backlash, and all movements are projected onto one plane. All activities were performed using such a kinematical scheme of the mechanism.

Fig. 4. Main dimensions of the leg and its drive mechanism

The first stage of the analysis was finding the configurations of the linkages for selected positions of the driving crank. The results are depicted in Fig. 5. The path of movement of point P6 (which corresponds to the axis of the ground wheel) was obtained by connecting subsequent positions of point P6. Notice that this trajectory was drawn with respect to point Pc.

Fig. 5. Single leg movement, for selected positions of the crank

An additional drawing (Fig. 6) shows positions of the 3D CAD model of the mechanism. Using such a methodology, the design of the linkages was optimal, and the risk of part collisions and other design troubles was prevented.

Fig. 6. Free positions of the robot leg driving mechanism

The second stage of the analysis was revealing the velocities in all joints of the robot leg. The graphical method was used and the results are shown in Fig. 7 and Fig. 8.

The results of the analysis (i.e. the shape of the trajectory, knowledge of the accurate positions of the linkages and ground wheel, and the velocity plans) validated the selected design and facilitated it. The velocities collected in Table 1 were determined for a crank angular velocity ω0 = 5.09 rad/s. Figure 9 shows a selected position of the leg with the marked centre of gravity (CG). The full path of the mechanism motion was traced.
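As a quick plausibility check of the tabulated values, the crank-pin speed follows directly from v0 = ω0·r0; with ω0 = 5.09 rad/s and v0 = 86.5 mm/s the crank radius comes out at about 17 mm. A minimal sketch (the radius is inferred here, not given in the paper) that also verifies the speed numerically by finite differences:

```python
import math

OMEGA0 = 5.09          # crank angular velocity [rad/s], from the paper
R0 = 86.5 / OMEGA0     # inferred crank radius [mm] so that v0 = 86.5 mm/s

def crank_pin(t):
    """Position of the crank pin at time t [s], in mm."""
    a = OMEGA0 * t
    return (R0 * math.cos(a), R0 * math.sin(a))

def speed(t, dt=1e-6):
    """Finite-difference estimate of the crank-pin speed [mm/s]."""
    (x1, y1), (x2, y2) = crank_pin(t - dt), crank_pin(t + dt)
    return math.hypot(x2 - x1, y2 - y1) / (2 * dt)

# The crank-pin speed is constant over the cycle, consistent with
# v0 = 86.5 mm/s at both the 0° and 180° positions in Table 1.
print(round(speed(0.0), 1), round(speed(math.pi / OMEGA0), 1))  # -> 86.5 86.5
```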

Fig. 7. Velocities of joints for position 0°

Fig. 8. Velocities of joints for position 180°




Figure 10 shows the change of position of the centre of gravity of the single-leg mechanism during motion. The simulation was performed for angles from 0 to 360 degrees. The results were obtained numerically from the CAD model of the mechanism, under the assumption that the mechanism is planar.

Fig. 9. Exemplary position of gravity centre for leg during the motion (angle coordinate 90°)

Table 1. Sample numerical values of joint velocities [mm/s]

Velocity    Position 0°    Position 180°
v0          86.5           86.5
v1          46.3           46.3
v2          102.1          0.0
v3          0.0            102.1
v4          130.2          154.7
v5          52.1           67.1
v6          121.6          98.8

Fig. 10. Changes of the centre of gravity of the single-leg mechanism

According to the computer simulation, Figure 11 shows the variability of the reduced mass moment of inertia with respect to the crank axes; the plot is for a single leg.

Fig. 11. Reduced mass moment of inertia with respect to crank axes

2.4. Stability of Motion

Multi-legged constructions usually move in a statically stable manner of walking. During such walking, the projection of the robot's centre of gravity is always placed inside the polygon of supports (Fig. 12a). The stability spare range is defined as the distance between the projection of the centre of gravity and an edge of the support polygon, measured along the current motion vector of the centre of gravity (Fig. 12b). For statically stable walking, the stability spare range should not be lower than a value called the minimal margin. The margin should be set in such a way that all neglected dynamical effects and actions of external forces do not cause any loss of robot stability. It is best when this stability spare range is determined experimentally. The velocities of motion of contemporary walking robots or devices are relatively low, usually lower than several kilometres per hour (sometimes even less than 1 kilometre per hour), therefore the assumed simplifications are acceptable.
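The stability spare range defined above can be computed geometrically: project the centre of gravity onto the ground plane and intersect a ray along the motion vector with the boundary of the support polygon. A minimal sketch, assuming a convex support polygon given as a vertex list (names and the ray-casting approach are illustrative, not the paper's software):

```python
def stability_spare_range(cg, direction, support):
    """Distance from the projected centre of gravity `cg` to the edge of the
    convex support polygon `support`, measured along the (unit) motion
    vector `direction` -- the 'stability spare range'."""
    px, py = cg
    dx, dy = direction
    best = None
    n = len(support)
    for i in range(n):
        (x1, y1), (x2, y2) = support[i], support[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        denom = dx * ey - dy * ex            # cross product; 0 => ray parallel to edge
        if abs(denom) < 1e-12:
            continue
        t = ((x1 - px) * ey - (y1 - py) * ex) / denom   # distance along the ray
        s = ((x1 - px) * dy - (y1 - py) * dx) / denom   # position along the edge
        if t >= 0 and 0 <= s <= 1 and (best is None or t < best):
            best = t
    return best
```

For a CG projected at the centre of a 1 m square support polygon, the spare range in any axis-aligned direction is 0.5 m, as expected.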

Fig. 12. a) Projection of the centre of gravity during motion on an inclined slope; b) static stability spare range for 3-support walking




Fig. 13. a) L1 … L6, P1 … P6 – Designation of legs; b) diagram of a robot foot contact with the ground

Fig. 14. Layout / functional scheme



As can be seen, the criterion of the stability spare range takes into account the configuration (shape) of a machine as well as the properties of the ground. The sufficient margin of stability is defined taking into account a possible reduction of the expected support polygon due to the lack of one point of support (lack of contact for one arbitrary leg). If the machine (robot) is supported by n legs, then, besides the proper polygon, other polygons of support are created and considered in the prepared software. These polygons are adequate for all possible phases (versions) of support by means of (n – 1) legs. The sufficient polygon of support is built as the common area of all n – 1 polygons. The sufficient spare range of stability is measured in the way described above; therefore it can be considered for machines or robots having more than four legs. The statically stable walking of the designed 12-legged robot consists of three 4-support phases (Fig. 13). This solution assures permanent contact of 4 legs with the ground, therefore there is no threat of loss of stability. The crankshafts which drive the robot legs are connected via the gear. In consequence, only one type of walking is possible, where the only variable is its velocity. Due to the motion of the robot legs, the centre of gravity slightly changes position. The changes are low because, among other reasons, the legs' mass is relatively low in relation to the whole mass of the construction (20% of the whole mass) and the crankshafts are mutually rotated by 180°. The last mentioned property assures that the legs of opposite sides balance each other. The situation is slightly different for the stability spare range, which varies essentially during the motion of the robot legs. This is caused by the changes of the distance between the legs' contact points with the ground and the robot's centre of gravity [8], [9], [10].
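The construction of the sufficient support polygon as the common area of the (n – 1)-leg support polygons can be sketched with a standard convex polygon clipping routine (Sutherland–Hodgman). This is an illustrative reimplementation, not the software described in the paper; polygons are assumed convex with counter-clockwise vertex order:

```python
def clip(poly, a, b):
    """Clip convex polygon `poly` by the half-plane left of directed edge
    a->b (one step of Sutherland-Hodgman clipping)."""
    def inside(p):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0
    def intersect(p, q):
        # intersection of line a-b with line p-q (standard two-line formula)
        x1, y1, x2, y2 = a[0], a[1], b[0], b[1]
        x3, y3, x4, y4 = p[0], p[1], q[0], q[1]
        den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    out = []
    for i, p in enumerate(poly):
        q = poly[(i + 1) % len(poly)]
        if inside(p):
            out.append(p)
            if not inside(q):
                out.append(intersect(p, q))
        elif inside(q):
            out.append(intersect(p, q))
    return out

def intersection(p1, p2):
    """Common area (as a polygon) of two convex CCW polygons."""
    result = p1
    for i in range(len(p2)):
        result = clip(result, p2[i], p2[(i + 1) % len(p2)])
        if not result:
            break
    return result

def sufficient_polygon(support_polygons):
    """Intersect the support polygons of all (n-1)-leg support phases."""
    result = support_polygons[0]
    for p in support_polygons[1:]:
        result = intersection(result, p)
    return result

def area(poly):
    """Shoelace area of a simple polygon."""
    s = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0
```

The sufficient spare range is then measured from the CG projection to the boundary of `sufficient_polygon(...)` exactly as for the ordinary support polygon.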
The measure of the energetic stability of a particular robot position is the minimal work which has to be done by any disturbing factor to destabilize its current position. It is the work needed to displace the centre of gravity of the robot to a new position in which the centre of gravity is placed in the vertical plane above the edge of the support polygon. That is the ultimate stable position (having ultimate stability). In consequence, any



smallest disturbance will cause overturning of the robot. This work, in a physical sense, is equivalent to the difference of the potential energy of the centre of gravity between the start and end positions of the robot.

Fig. 15. Screens dedicated to the control system: a) manual control; b) control using the accelerometer; c) control using the touch screen
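The energetic stability measure described above reduces to lifting the centre of gravity over the tipping edge. A minimal sketch (the mass, height and distance values used below are illustrative, not the robot's actual parameters):

```python
import math

def energetic_stability_margin(mass, cg_height, cg_edge_distance, g=9.81):
    """Minimal work [J] needed to tip the robot over a support-polygon edge:
    the CG (at height `cg_height`, horizontal distance `cg_edge_distance`
    from the edge, both in metres) must be raised until it lies vertically
    above the edge, i.e. by sqrt(h^2 + d^2) - h."""
    h, d = cg_height, cg_edge_distance
    return mass * g * (math.hypot(h, d) - h)
```

For example, a 10 kg machine with its CG 0.20 m above the ground and 0.05 m from the nearest support edge needs about 0.6 J of disturbing work to be overturned; when the CG projection reaches the edge (d = 0), the margin vanishes.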

3. Electronic Subsystem of the Walking Robot

Robot control routines should be intuitive and easy to perform for persons of different age groups. In general, control should be simple and, additionally, its visualization should be possible. Nowadays, for some people, it is even unimaginable to design such a device without control visualization [7]. Aiming for increased comfort and ease of robot control, the control system was based upon wireless transmission of data via Bluetooth. Due to this solution, cables are not needed, and remote control is possible from a relatively large distance, up to about 100 m in an open area, depending on the propagation of the radio waves. A device used for robot control can be based on the Android 2.3.6 system or newer, mounted e.g. in a smartphone or tablet, or on a device running the Windows operating system, e.g. a notebook or desktop computer [6]. The only requirement is that the device is equipped with a Bluetooth module. The central subsystem of DUODEPED is the PIC24HJ256GP610 microcontroller made by Microchip [1]. A dedicated program was written in the C language and compiled with the C30 compiler. The microcontroller communicates over a UART serial link with a Bluetooth BTM-222 module which provides the radio connection with the control device [4]. The functional scheme enclosing all elements, i.e. from the control device up to the motors, is presented in Fig. 14. The robot control application was written using the Basic4Android software. It allows for control of the robot via three working modes.

The first available control mode is the simplest one (Fig. 15a). It is equipped with direction arrows as well as hidden arrows responsible for motion along curves. Depending on the chosen direction, the robot moves ahead, turns on the spot or turns along an arc. The second control mode (Fig. 15b) utilizes the accelerometer available in the smartphone. By tilting the device, the adequate information about the position of the phone is transmitted. If the robot motors are triggered via the START button, the robot performs motions in accordance with the tilting of the control device. The third control mode (Fig. 15c), also performed by means of Android, utilizes the touch screen. Touches on this screen make the robot start its motion, provided that the START option has been clicked earlier, powering the robot motors. Robot control software for a desktop computer (PC) or notebook is described below.
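Returning to the accelerometer mode (Fig. 15b): a minimal sketch of how device tilt could be mapped to differential-drive motor commands. The 30° full-deflection constant and the mixing scheme are assumptions for illustration; the paper does not specify the actual mapping:

```python
def tilt_to_motor_speeds(pitch, roll, max_speed=100):
    """Map device tilt to differential-drive commands: pitch (forward/back
    tilt, degrees) sets the common speed, roll (left/right tilt) the turn.
    Output is a (left, right) pair in [-max_speed, max_speed]; a 30-degree
    tilt is treated as full deflection (illustrative choice)."""
    forward = max(-1.0, min(1.0, pitch / 30.0))
    turn = max(-1.0, min(1.0, roll / 30.0))
    left = max(-1.0, min(1.0, forward + turn)) * max_speed
    right = max(-1.0, min(1.0, forward - turn)) * max_speed
    return left, right
```

Tilting fully forward drives both sides at full speed; tilting sideways with no pitch spins the robot on the spot, matching the behaviour described for the manual mode.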

Fig. 16. Screen „Preview & Control”

Control of the DUODEPED robot can also be performed by means of a desktop computer or a notebook on which the inTOUCH software, version 10.1, is installed (Fig. 16).




Fig. 17. DUODEPED – 12-legged walking robot

Moreover, the DASMBSerial.2 component of the ArchestrA package creates the communication system used for control purposes. After starting the application, a user must log in using one of the available accounts. The login options and passwords are gathered in the field „PIERWSZE KROKI” (introductory steps). For service activities, the password is hidden and is the same as the login. The control of the robot is performed in an intuitive way. It is enough to log in as a user and the button „Preview & Control” is activated. Via this button, the control window is opened. In this window, one can choose the measurements of the voltage of the battery mounted in the robot and the measurement of the electric current which flows through the DC motor controller and powers the motors. In the upper part of this window there is a monitoring field, shown in the right part of the discussed window. Moreover, monitoring is available in a visual way as a moving construction. The window is zoomed from 0 up to 50 cm; above this distance, the pictogram of the construction on the visualization sub-window is placed on the right side of the control window. The obstacle detection option is active by default. Therefore, if during its motion the robot approaches an obstacle to a distance lower than threshold 2, its velocity in this direction is diminished. If the distance to the obstacle diminishes further and crosses threshold 1, then the drive mechanism: (a) stops the robot, (b) changes the direction of rotation and (c) moves the robot back with a low velocity (reduced to 20%). Crossing threshold 2 again, i.e. when the robot reaches a safe distance, causes stopping of the robot and waiting for further orders. To start movement, the motors should first be activated, i.e. made ready to switch the power on.
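The two-threshold obstacle behaviour described above amounts to a small state machine. A sketch under assumed threshold values (0.20 m and 0.50 m are illustrative; the paper gives the thresholds only symbolically):

```python
SLOW, NORMAL, REVERSE, WAIT = "slow", "normal", "reverse", "wait"

class ObstacleGuard:
    """Two-threshold obstacle logic: below threshold 2 the forward speed is
    reduced, below threshold 1 the robot backs off at 20% speed, and once it
    is back beyond threshold 2 it stops and waits for further orders."""

    def __init__(self, t1=0.20, t2=0.50):
        self.t1, self.t2 = t1, t2
        self.backing = False            # True while the robot is retreating

    def update(self, distance):
        if self.backing:
            if distance > self.t2:      # safe distance regained: stop, wait
                self.backing = False
                return WAIT
            return REVERSE              # keep backing off at reduced speed
        if distance < self.t1:          # too close: stop, reverse direction
            self.backing = True
            return REVERSE
        if distance < self.t2:          # approaching: reduce speed
            return SLOW
        return NORMAL
```

Feeding the guard a decreasing-then-increasing distance sequence reproduces the described cycle: normal, slow, reverse, and finally wait once the robot is safe again.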
When the robot is not used for a long time, the motors should be deactivated for safety reasons, preventing accidental use of the robot at an unexpected moment.

4. Final Remarks


Mobile robots are more and more frequently utilized for various purposes. Within recent


years, mechanisms applied in robotized constructions as well as autonomous robots were successfully used in the army, medicine and industrial plants. A dynamic development of robotics in modelling activities has been observed. People are more and more interested in the design and programming of so-called personal assistant robots. All over the world, robots are used for versatile tasks, e.g. monitoring, production, safety and protection. It can be stated that nowadays robots are everywhere. Several times a year, presentation events and robot contests are organized in Poland and abroad. In these events, robots designed by private persons as well as by firms take part. It is worth noting that every year the number of participants of such tournaments increases very fast. The manufactured 12-legged walking robot, DUODEPED, has won several prizes for the design itself and for innovative design solutions. Its name is derived from the Latin words duodecim (twelve) and pedes (legs). It is a new name which has never been used before, which can be confirmed via the search results of popular web browsers. The construction is very solid, which is confirmed by its ability to carry relatively high loads (approx. 100 kg). The robot has many admirers and a circle of fans, i.e. students as well as kids who can play with it endlessly.

AUTHORS Jacek Rysiński* – Faculty of Mechanical Engineering and Computer Science, University of Bielsko-Biala, Bielsko-Biała, Poland. E-mail: jrysinski@ath.bielsko.pl. Bartłomiej Gola – Faculty of Mechanical Engineering and Computer Science, University of Bielsko-Biala, Bielsko-Biała, Poland. E-mail: bartlomiejgola@wp.pl.

Jerzy Kopeć – Faculty of Mechanical Engineering and Computer Science, University of Bielsko-Biala, Bielsko-Biała, Poland. E-mail: jkopec@ath.bielsko.pl. *Corresponding author

REFERENCES

[1] Di Jasio L., Programming 16-bit PIC Microcontrollers in C: Learning to Fly the PIC24, Newnes, Burlington 2007.
[2] Zielińska T., Walking Machines: Basics, Mechanical Design, Control and Biological Aspects, Polish Scientific Publishers PWN, Warszawa 2003 (in Polish).
[3] Tchoń K., Mazur A., Dulęba I., Hossa R., Muszyński R., Manipulators and Mobile Robots, Akademicka Oficyna Wydawnicza, Warszawa 2000 (in Polish).
[4] Giergiel M., Małka P., "Wireless communication systems in the control of robots", Modelowanie Inżynierskie, Gliwice, vol. 36, 2008, 95–102 (in Polish).
[5] Maslowski A., "Intervention-inspection mobile robots", IPPT PAN, Warszawa 1999 (in Polish).
[6] Frank A. W., Sen R., King Ch., Android in Action, Helion S.A., Gliwice 2011 (in Polish).
[7] Pa P.S., Wu C.M., "Design hexapod robot with a servo control and man-machine interface", Robotics and Computer-Integrated Manufacturing, vol. 28, 2012, 351–358. DOI: 10.1016/j.rcim.2011.10.005.
[8] Hasan A., Soyguder S., "Kinetic and dynamic analysis of hexapod walking-running-bounding gaits robot and control actions", Computers and Electrical Engineering, vol. 38, 2012, 444–458. DOI: 10.1016/j.compeleceng.2011.10.008.
[9] Parhi D.R., Pradhan S.K., Panda A.K., Behera R.K., "The stable and precise motion control for multiple mobile robots", Applied Soft Computing, vol. 9, 2009, 477–487. DOI: 10.1016/j.asoc.2008.04.017.
[10] Ferrell C., "A comparison of three insect-inspired locomotion controllers", Robotics and Autonomous Systems, vol. 16, 1995, 132–159. DOI: 10.1016/0921-8890(95)00147-6.
[11] Pa P.S., "Design of a modular assembly of four-footed robots with multiple functions", Robotics and Computer-Integrated Manufacturing, vol. 25, 2009, 804–809. DOI: 10.1016/j.rcim.2008.12.001.





ICT System Supporting the Water Networks Management by Means of Mathematical Modelling and Optimization Algorithms

Submitted: 2nd September 2015; accepted: 24th September 2015

Jan Studzinski

DOI: 10.14313/JAMRIS_4-2015/32

Abstract: In this paper, a concept of an integrated information system for complex management of water networks is presented. The ICT system has been under development at the Systems Research Institute (IBS PAN) in Warsaw for several years and is gradually being tested in some Polish communal waterworks of varying size. Several waterworks management tasks requiring mathematical modelling, optimization and approximation algorithms can be solved using this system. Static optimization and multi-criteria algorithms are used for solving the more complicated tasks, like calibration of the water net hydraulic model, water net optimization and planning, control of pumps in the water net pump stations, etc. [4]. Some of the management tasks are simpler and can be performed by means of repetitive simulation runs of the water net hydraulic model. Water net simulation, planning of the SCADA system, calculation of water age and chlorine concentration in the water net, localization of hidden water leaks occurring in the network and planning of water net revitalization works are examples of such tasks executed by the ICT system. They are described in this paper.

Keywords: drinking water distribution system, water net hydraulic model, hydraulic optimization, water net management

1. Introduction


The current world trend in the computerization of waterworks is the implementation of integrated information systems for complex management of whole enterprises or of their key objects, among them water networks, which is the simplest venture from the technical, organizational and financial point of view. An integrated management system for a communal water network usually consists of GIS (Geographical Information System), SCADA (Supervisory Control and Data Acquisition) and CIS (Customer Information System) systems which are tightly integrated with modelling, optimization and approximation algorithms [7]. Due to this strict cooperation between several programs, all tasks of water net management concerning technical, organizational, administrative and economic problems can be automatically executed or computer supported [8]. Three essential goals that can be reached by computer-aided management of municipal water networks are the reduction of costs and simplification of waterworks operation, as well as improving the quality of the drinking water supplied to the city. The main problems connected with water network management are water losses caused by network damages, unsuitable water pressures at the end user nodes caused by inappropriate work of the pump stations installed on the network or by wrong planning of the water net, and bad quality of the produced water caused by incorrect control of the network or by inaccurate planning of water net revitalization. All these problems can be solved in a relatively simple way by using new information technologies, and this idea led to the concept of an integrated ICT system for complex management of communal water networks. The system developed at IBS PAN is now being tested in some Polish waterworks.

2. ICT System Description

According to the mentioned trend in waterworks computerization, an integrated ICT system for complex water network management has been developed at the Systems Research Institute; its structure is shown in Fig. 1. The system is built in modular form and consists of the following components:
• GIS – for generating the numerical maps of the investigated water net;
• SCADA – for monitoring the water net parameters, i.e. pressures and flows of the water;
• CIS – for recording the water consumption of the end users of the water net;
• 20 computing programs with algorithms of mathematical modelling, optimization and approximation for solving the water net management tasks.

Fig. 1. Block diagram of the ICT system for water networks management

The components GIS, SCADA and CIS are adopted from other firms and integrated with the computing



programs via data files or data tables. The computing programs are responsible for the realization of all management tasks by means of the water net hydraulic model and optimization algorithms. Some functions realized by the programs are as follows:
1. Hydraulic modelling of water nets;
2. Optimal planning of SCADA systems for water nets;
3. Automatic calibration of hydraulic models;
4. Optimization and planning of water nets;
5. Control of pump stations in water nets;
6. Control of pumps installed in pump stations;
7. Detection and localization of leakage points in water nets;
8. Calculation of water age in water nets;
9. Calculation of chlorine concentration in water nets;
10. Planning of water net revitalization;
11. Control of network valves changing the water flow distribution in water nets.
The programs realizing the functions specified above work with the water net hydraulic model and, while realizing the tasks concerning model calibration, water net optimization and planning, pump control and planning of SCADA, they use a heuristic algorithm of multi-criteria optimization [6]. For the solution of the other water net management tasks, only multiple simulations of the hydraulic model under different work conditions of the water net are executed [8]. The functions realized in this way by the ICT system are:
12. Calculation of height coordinates for the water net nodes;
13. Drawing the maps of water flow and pressure distributions in water nets;
14. Drawing the maps of water net sensibility to leakage events occurring in water nets;
15. Drawing the maps of water age distribution in water nets;
16. Drawing the maps of the distribution of chlorine concentration in water nets;
17. Drawing the maps of value distributions for some environmental parameters, like temperature, in the area of the water network.
The programs that realize these functions use kriging approximation algorithms that make it possible to picture in graphical form the value distributions of parameters connected with water nets and their operation [1]. The last part of the management functions realized by the ICT system concerns the calculation of mathematical models for forecasting the hydraulic load of water nets and of their end user nodes. This is done by means of the following time series methods [2]:
18. Least squares method of Kalman;
19. Generalized least squares method of Clarke;
20. Maximum likelihood method.
Due to the cooperation of several programs while solving different management tasks, a synergy effect arises which essentially boosts the efficiency of the running programs. In the following, some algorithms supporting water net management and implemented in the ICT system are described.
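The idea behind the load-forecasting models can be illustrated with a deliberately simplified stand-in: an ordinary least squares fit of a first-order autoregressive model, iterated forward for the forecast. This is not the Kalman/Clarke least squares or maximum likelihood method used by the actual system, only a minimal sketch of the time-series approach:

```python
def fit_ar1(series):
    """Ordinary least squares fit of y[t] = a + b*y[t-1] on an hourly
    hydraulic-load series -- a minimal stand-in for the forecasting
    models listed above."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def forecast(series, steps, model):
    """Iterate the fitted model `steps` time steps ahead."""
    a, b = model
    out, last = [], series[-1]
    for _ in range(steps):
        last = a + b * last
        out.append(last)
    return out
```

On noiseless synthetic data the fit recovers the generating coefficients exactly, which is a useful sanity check before applying it to real SCADA load records.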


3. Algorithms of Modelling and Optimization

3.1. Hydraulic Model Calibration

The calibration procedure in the case of water nets usually consists in changing the roughness values of the network pipes in such a way that the measured and calculated flows and pressures are as close as possible in the net points where the sensors of the SCADA system have been installed. This changing is normally done by hand, for in the waterworks there are no appropriate programs that could support this action by automatic computing [9]. The presented algorithm executes the calibration procedure in the following three steps:
1. Preparation of the initial data, consisting in the division of all network pipes into groups depending on pipe diameters, age and material;
2. Changing the roughness of pipes with regard to the pipe groups and not individual pipes;
3. If the roughness change in a group exceeds the given value range, changing the nominal pipe diameters there; this change also occurs within a given value range.
In this way the algorithm has two phases of calculation, regarding the roughness and diameter changes, that follow one after another.
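The group-wise roughness search (step 2) can be sketched on a toy model. Here the hydraulic simulation is replaced by a simple linear stand-in, and the genetic algorithm by a crude random search; the sensor coefficients and bounds are illustrative, not taken from the paper:

```python
import random

# Toy "hydraulic model": the pressure reading at each of two sensors is a
# known function of the roughness assigned to two pipe groups. This stands
# in for a full hydraulic simulation run; coefficients are illustrative.
SENSOR_COEFF = [(4.0, 1.0), (1.5, 3.5)]

def simulate(roughness):
    r1, r2 = roughness
    return [c1 * r1 + c2 * r2 for c1, c2 in SENSOR_COEFF]

MEASURED = simulate((1.5, 3.0))   # pretend these came from SCADA sensors

def sse(roughness):
    """Sum of squared differences between simulated and measured values."""
    return sum((s - m) ** 2 for s, m in zip(simulate(roughness), MEASURED))

def calibrate(trials=20000, lo=0.5, hi=5.0, seed=1):
    """Random search over group roughness values -- a crude stand-in for
    the genetic algorithm used in the ICT system."""
    rng = random.Random(seed)
    best = (lo, lo)
    best_err = sse(best)
    for _ in range(trials):
        cand = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        err = sse(cand)
        if err < best_err:
            best, best_err = cand, err
    return best
```

The search drives the simulated sensor readings toward the measured ones by adjusting only the two group values, never individual pipes, mirroring step 2 of the procedure.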

Fig. 2. Exemplary water net model calibrated

Fig. 3. Preparation of data for the exemplary water net model calibration

In Figures 2 and 3 the exemplary water net model and the data preparation for its calibration are shown. The net consists of 25 pipes of the same age and made of the same material. On two pipes and on two nodes of the net, measuring devices for flow and pressure are installed. In Fig. 3 one can see the diagrams of calculated and measured flow and pressure values




computed for 24 hours and shown for one pipe and one node before the calibration run. The pipes are divided into 2 groups regarding their diameters. During the calibration, only the roughness values in the two pipe groups will be changed.


Fig. 4. Differences between measurement data (in grey) and calculated values (in green) in node 2 (pressure values – left) and in pipe 5 (flow values) before calibration

The results of the calibration, done by means of a genetic algorithm, are shown in Fig. 5. One can see there that the pressure values in the node and the flow values in the pipe (the same node and pipe as before) are practically identical for the calculation results and the measurement data, which allows the calibration algorithm to be considered very effective.

Fig. 5. Results of calibration shown for node 2 (pressure values – left) and pipe 5 (flow values)

3.2. Water Net Hydraulic Optimization

Another algorithm supporting water net management concerns hydraulic optimization of the water network by means of exchanging particular network pipes and/or controlling the pumps in the water take-out stations or in the works raising the water pressure within the water net. In small and medium waterworks, the calculation can be done for all pump stations in the same run, for there are no more than several pumps in such enterprises. In the case of big waterworks the situation is more complicated: there are many pump stations with many pumps in them, and finding the control schemes for all devices simultaneously is practically not possible. Because of that, the proposed algorithm consists of two stages: in the 1st stage the controls are calculated for the pump stations seen as single generalized pumps, and in the 2nd stage the calculation is done for each pump station and its pumps individually. Such division of the hydraulic optimization task into two separate stages makes the problem solvable from the computational point of view.

In the following, the realization of the 2nd stage of the algorithm, done for a real pump station of a Polish waterworks, is described. In the object only one pump works, and from it two pipes ending with two nodes go out. The pressure values in these nodes are too small against the values calculated in the 1st stage of the algorithm. The problem is to find pipes with new diameters and to calculate the pump velocity in such a way that the obtained node pressures fit the value ranges indicated earlier.

Fig. 6. View of the calculated pump station and the parameters of the pump to be controlled

Fig. 7. Parameters of the two nodes being the outputs of the calculated pump station

In Figures 6 and 7 the scheme of the pipe connections in the investigated pump station and the characteristics of the pump and of the nodes concerned are shown. In Figures 8 and 9 the screens of the program developed at IBS PAN, prepared for introducing the input data for changing the pump velocity and the pipe diameters, can be seen [5]. In this example the pump velocity can be changed between 60% and 100% of its nominal speed, the new pipes can have diameters between 100 mm and 1,200 mm, and the acceptable or preferred node pressures lie between 20 m and 70 m or 28 m and 36 m for one node, and between 10 m and 70 m or 37 m and 45 m for the other, respectively. To solve the problem, a genetic optimization algorithm with the use of fuzzy sets while calculating the node pressures is applied; the fuzzy sets are used to make a dis-
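The fuzzy evaluation of node pressures can be pictured with a trapezoidal membership function: full membership inside the preferred range, falling linearly to zero at the limits of the acceptable range. The exact membership shapes used in the program are not given, so this is one plausible reading, using the node-1 ranges quoted above:

```python
def pressure_membership(p, acc_lo, pref_lo, pref_hi, acc_hi):
    """Trapezoidal fuzzy membership of a node pressure: 1 inside the
    preferred range, falling linearly to 0 at the limits of the
    acceptable range -- an assumed shape, for illustration only."""
    if p <= acc_lo or p >= acc_hi:
        return 0.0
    if pref_lo <= p <= pref_hi:
        return 1.0
    if p < pref_lo:
        return (p - acc_lo) / (pref_lo - acc_lo)
    return (acc_hi - p) / (acc_hi - pref_hi)

# Node 1 from the example: acceptable 20-70 m, preferred 28-36 m
def node1(p):
    return pressure_membership(p, 20, 28, 36, 70)
```

Such a membership value can serve directly as (part of) the fitness of a candidate solution in the genetic algorithm, grading near-miss pressures instead of rejecting them outright.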


Journal of Automation, Mobile Robotics & Intelligent Systems

VOLUME 9,

N° 4

2015

3.3. Water Net Revitalization

Water net revitalization belongs to the planning tasks, which can be divided into three kinds: hydraulic optimization, design of new networks or extension of existing ones, and revitalization or renovation. In the first two kinds of tasks, computer simulation of the water net hydraulic model as well as optimization algorithms must be used to secure the right hydraulic conditions of water net operation, i.e. adequate water pressures in the end-user nodes of the network and possibly fast water velocities in the network pipes. In the case of revitalization the network works correctly from the hydraulic point of view, and the reason to undertake the action is the old age of the water net objects, mostly the pipes, or their poor technical state, which causes a risk of failures. The susceptibility of water nets to accidents can cause, in older municipal waterworks, water losses reaching up to 30% of the water production, which means essential financial losses for the enterprise [3]. In the presented algorithm the revitalization task means the exchange of several pipes of the water net, selected because of their poor technical state, for pipes with the same diameters. With such an approach only multiple simulation runs of the network hydraulic model are needed while planning the revitalization, for the exchange of old pipes for new ones with reduced roughness values does not worsen but improves the hydraulic conditions of the water net. The goal of the algorithm is to reduce the liability of the network to break down and, as a result, to reduce the potential water losses. While planning the revitalization one must decide which pipes are to be exchanged so as to minimize the susceptibility of the water net to accidents and at the same time to secure the proper functioning of the whole network. The following factors are taken into consideration when choosing the set of pipes to be replaced:
• Technical state of the pipes, characterized by their roughness.
• Current durability of the pipes, calculated as the difference between the year of pipe construction and the normative pipe durability.
• Pipe liability to break down, in percent, defined on the basis of historical data concerning pipe damages.
• Risk of water losses, calculated as the pressure in the pipe modified by the pipe diameter: p * (1 + d/500).
• Costs of the pipe revitalization, which consist of two components: the cost of pipe installation and the cost of buying the new pipes.
In order to select the pipes for revitalization from the whole set of water net pipes, the revitalization indicator IR is calculated for each pipe from formula (1), where wc, wt, wa and ws are weight coefficients, Cn means pipe roughness, Tn means current pipe durability, An is the pipe liability to break down and Sn is the risk of water losses defined for the pipe concerned. The weight coefficients can be chosen arbitrarily by the program user, and all factors in the formula are normalized into the standardized range of values from 0.0 to 1.0. After the indicators are calculated for all pipes, a ranking list is prepared according to the decreasing indicator values. Depending on the financial funds at the management's disposal, one can then choose the set of pipes for the exchange by taking the pipes from the top of the ranking list and summing the costs of their revitalization up to the funds limit. When the pipes to be exchanged are selected, the effects of the planned revitalization can be verified by performing the hydraulic calculation for the whole water net with roughness values

The fuzzy sets serve to make a distinction between the acceptable and the preferable value areas while the node pressures are calculated.
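The way fuzzy sets can soften the pressure constraints can be sketched as below: a trapezoidal membership that is 0 outside the acceptable range, 1 inside the preferable range, and linear in between. The bounds follow the example quoted above (20–70 m acceptable, 28–36 m preferable for one node); the function name is illustrative, not taken from the program described.

```python
def pressure_degree(p, acc_lo, pref_lo, pref_hi, acc_hi):
    """Trapezoidal membership degree of a node pressure p [m]:
    0 outside the acceptable range, 1 inside the preferable range,
    linear on the two slopes between them."""
    if p <= acc_lo or p >= acc_hi:
        return 0.0
    if pref_lo <= p <= pref_hi:
        return 1.0
    if p < pref_lo:                               # rising edge
        return (p - acc_lo) / (pref_lo - acc_lo)
    return (acc_hi - p) / (acc_hi - pref_hi)      # falling edge

# First node of the example: acceptable 20-70 m, preferable 28-36 m
assert pressure_degree(32.0, 20, 28, 36, 70) == 1.0
assert pressure_degree(24.0, 20, 28, 36, 70) == 0.5
assert pressure_degree(15.0, 20, 28, 36, 70) == 0.0
```

A genetic algorithm's fitness function could then aggregate such degrees over all nodes, e.g. by taking their minimum, so that solutions keeping every pressure near its preferable range score best.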

Fig. 8. Preparation of input data for the pump control

Fig. 9. Preparation of input data for the exchange of pipes

In Fig. 10 the results of the hydraulic optimization made for the single pump station are given. The pressure values at the end nodes of the pump station rose from 11.82 m to 35.86 m and to 37.24 m, respectively, which is the consequence of the pipe diameter changes from 600 mm to 488 mm for one pipe and from 800 mm to 1048 mm for the other, and of the pump velocity change from 76% to 89% of its nominal speed.

Fig. 10. Results of the hydraulic optimization of the pump station calculated

IR = wc * Cn + wt * (1.0 - Tn) + wa * An + ws * Sn          (1)
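The selection procedure built around formula (1) — score each pipe, rank by decreasing indicator, cut the list at the available funds — can be sketched as follows. The dictionary keys and example weights are illustrative assumptions, not names from the described program.

```python
def revitalization_ranking(pipes, wc, wt, wa, ws):
    """pipes: list of dicts with the normalized factors Cn, Tn, An, Sn
    (all in [0, 1]) and a revitalization 'cost'. Computes the indicator
    IR of formula (1) for each pipe and returns the pipes sorted by
    decreasing IR."""
    for p in pipes:
        p["IR"] = wc * p["Cn"] + wt * (1.0 - p["Tn"]) + wa * p["An"] + ws * p["Sn"]
    return sorted(pipes, key=lambda p: p["IR"], reverse=True)

def select_for_budget(ranked, funds):
    """Take pipes from the top of the ranking list, summing their
    revitalization costs, until the funds limit would be exceeded."""
    chosen, total = [], 0.0
    for p in ranked:
        if total + p["cost"] > funds:
            break
        chosen.append(p)
        total += p["cost"]
    return chosen

pipes = [
    {"id": "p1", "Cn": 0.9, "Tn": 0.1, "An": 0.8, "Sn": 0.7, "cost": 100.0},
    {"id": "p2", "Cn": 0.2, "Tn": 0.9, "An": 0.1, "Sn": 0.2, "cost": 80.0},
]
ranked = revitalization_ranking(pipes, 0.25, 0.25, 0.25, 0.25)
assert ranked[0]["id"] == "p1"              # worst pipe ranks first
assert select_for_budget(ranked, 150.0) == [ranked[0]]
```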


equal to null for the selected pipes. When the revitalization action is done, the vulnerability of the water net to accidents is reduced, and the water pressures in some end-user nodes, as well as the flow velocities in some pipes, are increased.


side of the network two overhead retention tanks are installed. The graph of the water net consists in total of 280 nodes and 398 pipes. The distributions of flows and pressures in the water net before the revitalization action are shown in Fig. 12. The flow and pressure values are highest in the areas where the pump stations and the tanks are situated.

Fig. 13. The graph of the water net with the pipes selected for revitalization

Fig. 11. Water net investigated before (up) and after (down) hydraulic calculation

Fig. 12. Pressure (up) and flow (down) distributions in the water net after its hydraulic calculation


The hydraulic graph of the water net investigated, before and after the hydraulic calculation, is shown in Fig. 11. The network is supplied with water by 2 pump stations located on its left-bottom side. On the right

In Fig. 13 the pipes selected for the exchange are marked with the green colour. According to formula (1) and the data assumed concerning all relevant factors, 31 pipes out of 398, i.e. 8% of the whole, have been taken for replacement. The effects of the revitalization, after performing the hydraulic calculation for the whole water net with roughness values equal to null for the selected pipes, are shown in Figures 14 and 15. In Fig. 14 the curves received before the revitalization are marked with the blue colour.

Fig. 14. Comparison of water flows (up) and pressures (down) before and after the water net revitalization performed for 31 pipes

One can see from Figures 14 and 15 that, in accordance with expectations, the values of pressures and



flows in the water net have increased after the revitalization. Nevertheless, the changes of the pressure values are very small and insignificant compared with the changes of the flows. In that case not only the values but also the flow directions changed as a result of the revitalization.

Fig. 15. Pressure (up) and flow (down) distributions in the water net after its revitalization performed for 31 pipes

Fig. 16. Comparison of water flows (up) and pressures (down) before and after the water net revitalization performed for all pipes

Table 1. Comparison of flow and pressure values before and after the water net revitalization performed for all pipes



To see this better, in another step of the revitalization all pipes of the water net have been replaced. The hydraulic results received are shown in Fig. 16 and in Table 1 for exemplary pipes and nodes. Once again one can see that the pressure values increased, but to a very small and practically marginal degree. In contrast, the flows increased essentially and the flow directions changed in many pipes.

4. Conclusions

In the paper some algorithms supporting the management of municipal water networks have been presented. Among the many algorithms developed for waterworks there are several that use in their calculations only the hydraulic model of the water net; with the simulation runs of this model several useful management tasks can be realized. These tasks are connected only with planning the water net, like the SCADA planning and revitalization algorithms, and with informing about the water net functioning, like the calculations of network hydraulics, water age and chlorine concentration, but nevertheless they are important for correct water net operation. More complicated tasks, like the calibration of the water net hydraulic model, water net optimization, or pump and tank control, need more sophisticated methods for their solution, like multi-criteria optimization algorithms. An important condition of effective operation of the algorithms described is, however, their use in strict cooperation with GIS and SCADA systems within a unified ICT system. Such a solution is more expensive than the individual use of water net hydraulic models alone, but it ensures that the management tasks will be done quickly, easily, suitably and faultlessly. Such a system for waterworks has been under development for a long time at the Systems Research Institute of the Polish Academy of Sciences, and some versions of it have already been implemented and tested in communal waterworks in Poland.

AUTHOR

Jan Studziński – Systems Research Institute Polish Academy of Sciences, Newelska 6, 01–447 Warszawa, Poland. E-mail: Jan.Studzinski@ibspan.waw.pl.

REFERENCES


[1] Bogdan L., Studzinski J., "Modeling of water pressure distribution in water nets using the kriging algorithms". In: Industrial Simulation Conference ISC'2007 (J. Ottjes and H. Vecke, eds.), TU Delft, Delft, Netherlands, 2007, 52–56.
[2] Hryniewicz O., Studzinski J., "Development of computer science tools for solving the environmental engineering problems". In: Enviroinfo'2006 Conference, Graz, 2006.
[3] Saegrov S., Care-W – Computer Aided Rehabilitation for Water Networks, IWA Publishing, Alliance House, London, 2005.


[4] Stachura M., Fajdek B., Studzinski J., "Model based decision support system for communal water networks". In: ISC'2012 Conference, Brno, 2012.
[5] Sluzalec A., Studzinski J., Ziolkowski A., "MOSKAN-W – the web application for modelling and designing of water supply system". In: Simulation in Umwelt- und Geowissenschaften, Reihe: Umweltinformatik, ASIM-Mitteilung AM 150, Workshop Osnabrück 2014 (J. Wittmann, Hrsg.), Shaker Verlag, Aachen 2014, 143–153.
[6] Straubel R., Holznagel B., "Mehrkriteriale Optimierung für Planung und Steuerung von Trink- und Abwasser-Verbundsystemen". In: Wasser•Abwasser, 140, no. 3, 1999, 191–196.
[7] Studzinski J., "Computer aided management of waterworks". In: Proceedings of QRM'2007 (R.A. Thomas, ed.), Oxford 2007, 254–258.
[8] Studzinski J., "Rechnerunterstützte Entscheidungshilfe für kommunale Wasserwerke mittels mathematischer Modelle, Krigingsapproximation und Optimierung". In: Modellierung und Simulation von Ökosystemen, Workshop Kölpinsee (A. Gnauck, Hrsg.), Shaker Verlag, Aachen 2012.
[9] Wojtowicz P., Pawlak A., Studzinski J., "Preliminary results of hydraulic modelling and calibration of the Upper Silesian Waterworks in Poland". In: 11th International Conference on Hydroinformatics HIC 2014, New York City, USA, 2014.



Development of Graphene Based Flow Sensor

Submitted: 31st August 2015; accepted: 22nd September 2015

Adam Kowalski, Marcin Safinowski, Roman Szewczyk, Wojciech Winiarski

DOI: 10.14313/JAMRIS_4-2015/33

Abstract: This paper presents research on a flow sensor based on graphene. The presented results show a linear relation between the voltage induced on the graphene layer and the flow velocity. The measurements show that the signal level is relatively low and highly correlated with the time the sample has been submerged in water. A significant temperature dependency has been observed, which indicates the necessity of developing a compensation system for the sensor. Moreover, the induced voltage is related to the ion concentration of the liquid, so the sensor must be recalibrated for every working environment. The most important finding of the research is that, although the voltage signal itself is highly inconsistent, the difference between its value in the steady state and for flowing liquid is always visible and correlated with the flow value; this property can be used in further development. A substantial advantage of the sensor is also its scalability, which opens up so far unexplored application possibilities.

Keywords: graphene, flow, sensor, voltage

Currently known methods of flow measurement (e.g. ultrasonic, electromagnetic, Coriolis, vortex, etc.) do not provide proper measurement of liquid flow at low speeds. The research showed that the graphene sensor can be used for the measurement of low flow rates.

2. Possibility of Using Graphene as a Part of Flow Sensor

A flow sensor based on graphene has to meet several requirements, i.a.: a) the induced voltage is related to the flow velocity, b) the changes of the voltage level are consistent and mathematically describable in a relatively simple manner, c) the signal dynamics is high enough – the sensor is reasonably sensitive and can work in a wide range of flow velocities, d) the signal-to-noise ratio is high enough. The measurements needed to check how the sensor meets the requirements mentioned above have

1. Introduction

Graphene is characterized by a wide range of remarkable properties, both electrical and mechanical, which makes it a very promising material in many fields. It is an excellent electrical [1] and heat [2] conductor, and owing to its atypical dispersion relation [3] it provides electron flow at 1/300 of the speed of light. Despite its very small thickness, its tensile strength is more than a hundred times higher than that of construction steel or Kevlar [4]. Probably the biggest and most remarkable Polish achievement in this field is the elaboration of an innovative manufacturing technology. In 2011 a team led by professor Włodzimierz Strupiński from Instytut Technologii Materiałów Elektronicznych (Institute of Electronic Materials Technology) invented a method of producing thin graphene layers on SiC [5], which was granted a patent the very same year. The presented results are the effect of the research carried out in the FlowGraf project. Its final purpose is to design, build and deploy a flow sensor based on graphene. The underlying research focused on examining the influence of various factors on the voltage induced in the graphene sample, the main one being the velocity of the flowing liquid and the others being the quantities which can disturb the consistency of the voltage level: in this case temperature and concentration of sodium chloride.

Fig. 1. Laboratory stand for graphene sensor measurements



been performed at Przemysłowy Instytut Automatyki i Pomiarów (Industrial Research Institute for Automation and Measurements) in Warsaw, on a laboratory stand set up specifically for that purpose (Fig. 1). The stand enabled controlling the flow velocity using a proportional valve, stopping the flow at either the inlet or the outlet of the tube using on-off valves, and examining the behavior of the sensor (mounted as shown in Fig. 2).

Fig. 2. Mounting of graphene sensor in the tube


a signal for non-zero flow becomes stable after a certain amount of time, which is probably associated with inertial effects – charging and discharging of the sample capacitance or fluctuations of the flow velocity. The liquid velocity was estimated based on the indications of a flow sensor; on this basis we obtained the relation between the induced voltage and the velocity for two series of measurements (Fig. 4). The voltage level is higher for the 1st series (black) than for the 2nd one (blue), which is related to a constant drop of the voltage in time. It shows again the phenomenon of discharging of the graphene layer: every relation determined one after another, with the sample constantly submerged in the liquid, will be lower than the previous one. The results presented above, as well as others obtained during our work, were inconsistent as far as the voltage level is concerned, but the sensitivity is always of the same order of magnitude – about 10 nV/(mm/s) – and there is always a visible and measurable difference between the signal level for flowing liquid and the steady state, which can be useful in further research (see: Conclusion).
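Since only the flowing-minus-steady-state difference is repeatable, a velocity estimate based on it can be sketched as below. The sensitivity constant is the order-of-magnitude figure quoted above, and the function name is illustrative, not part of the described setup.

```python
SENSITIVITY_NV_PER_MM_S = 10.0  # ~10 nV/(mm/s), order of magnitude from the measurements

def estimate_velocity(v_flow_nv, v_steady_nv, sensitivity=SENSITIVITY_NV_PER_MM_S):
    """Estimate the flow velocity [mm/s] from the differential graphene
    voltage [nV]. Subtracting the steady-state level cancels the slow
    drift caused by discharging of the graphene layer."""
    return (v_flow_nv - v_steady_nv) / sensitivity

# A 500 nV difference corresponds to roughly 50 mm/s at this sensitivity
assert estimate_velocity(1200.0, 700.0) == 50.0
```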

3. Influence of Liquid Characteristics on Electrical Signal Generation

3.1. Influence of Temperature

Fig. 3. Transient of voltage for different flow velocities

In order to determine how temperature influences the voltage of the graphene sample, the liquid was heated up to a certain temperature and flowed through the sample at a constant flow rate. The research was done for temperatures within the 20–47 °C range, every 3–4 °C, and the voltage was measured after reaching the desired temperature, which resulted in a voltage-temperature relation (Fig. 5). The voltage difference increases with temperature, which can be explained by the growth of charge mobility, resulting in higher potential differences in the graphene layer. The order of magnitude of the temperature sensitivity can be estimated as δT = 100 nV/°C. The changes of voltage caused by temperature are not at all negligible – a change of temperature by 1 °C causes a similar change of voltage as a change of flow velocity by 10 mm/s. This shows that a final flow sensor construction requires temperature compensating systems (e.g. thermistors), or taking temperature into account in the software algorithm assigning flow velocities to voltage values.
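The software compensation suggested above could look roughly as follows. The two sensitivities are the order-of-magnitude estimates from the text (100 nV/°C and 10 nV/(mm/s)); the reference temperature and all names are assumptions for illustration only.

```python
S_TEMP_NV_PER_C = 100.0    # temperature sensitivity, ~100 nV per deg C
S_FLOW_NV_PER_MM_S = 10.0  # flow sensitivity, ~10 nV/(mm/s)
T_REF_C = 20.0             # assumed calibration temperature

def compensated_velocity(v_diff_nv, temp_c):
    """Remove the (e.g. thermistor-measured) temperature contribution
    from the differential voltage, then convert the remainder to a
    flow velocity in mm/s."""
    v_corrected = v_diff_nv - S_TEMP_NV_PER_C * (temp_c - T_REF_C)
    return v_corrected / S_FLOW_NV_PER_MM_S

# A 1 deg C rise adds ~100 nV, i.e. it mimics ~10 mm/s of flow;
# the correction removes exactly that contribution.
assert compensated_velocity(600.0, 21.0) == 50.0
```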

Fig. 4. Relation between voltage and liquid velocity


The first thing that needed to be determined was the relation between the flow velocity and the voltage signal. It was examined using (once) deionized water. During the research an attempt was made to simulate a differential system: using the proportional valve a certain flow value was set and the measurement was made, and then the same approach was repeated for the steady state, when the water flow was stopped. An exemplary voltage transient is shown in Fig. 3. As can be seen, the signal level is higher for flowing liquid than in the steady state. Another thing to note is that

Fig. 5. Voltage-temperature relation for differential voltage



3.2. Influence of Sodium Chloride Concentration

The flow sensor is expected to work also with water solutions of various compounds. Thus, we examined how the concentration influences the voltage level. In this research a different sample was used, so the results are not directly comparable to the ones presented before, but they can give some idea of the scale of the phenomenon. We used sodium chloride solutions with concentrations between 0 and 3%, in steps of 0.3%. The results are shown in Fig. 6.


a flow of liquid or not, and whether the flow increased or decreased, because the direction of the change always stays the same. This feature has already been used in the Industrial Research Institute for Automation and Measurements – the graphene sensor works in one of the laboratory stands as a leak detector. Further work is planned on the commercial application of the leak sensor. The recipients of a leak detector can be manufacturers of valves, such as APATOR Powogaz S.A., Broen S.A., Gazomet Sp. z o.o., Norson, MPWiK Wrocław, etc. A substantial advantage of the presented sensor is its size – flowmeters already present on the market are very big, which automatically excludes many applications. The sensor is easily scalable and, supported by relevant research, it could be used in micro scale – for instance in the human circulatory system, to detect and prevent blood congestions.

ACKNOWLEDGEMENT

Fig. 6. Relation between voltage and sodium chloride concentration


This work has been supported by the National Centre for Research and Development (NCBiR) within the GRAF-TECH programme (no. GRAF-TECH/NCBR/02/19/2012), project "Graphene based, active flow sensors" (acronym FlowGraf).

AUTHORS

Adam Kowalski, Marcin Safinowski*, Roman Szewczyk, Wojciech Winiarski – Industrial Research Institute for Automation and Measurements PIAP, Warsaw, Poland. E-mails: adam.kowalski999@gmail.com, {msafinowski, wwiniarski, rszewczyk}@piap.pl. *Corresponding author

REFERENCES


Fig. 7. Measuring station for the graphene leak sensors testing

Again we obtained a constant sensitivity, of the order of 100 µV/%NaCl. We can try to compare it to the other results by scaling. For this sample the voltage is of the order of 0.1–1 mV, whereas before it was 0.01 mV, i.e. 1–2 orders of magnitude lower. We can therefore estimate that for the sample examined before the sensitivity would be of the order of 1–10 µV/%NaCl. It is a very significant value compared with the previously estimated sensitivities to velocity and temperature, which shows that the sensor behaves differently for every liquid depending on its concentration.

4. Conclusion and Further Research Directions

It turned out that the voltage changes in the graphene sensor caused by liquid flow are inconsistent. On the other hand, there is always a significant change of the voltage level, which can indicate whether there is

[1] Wallace P. R., "The band theory of graphite", Physical Review, vol. 71, no. 9, 1947, 622. DOI: dx.doi.org/10.1103/PhysRev.71.622.
[2] Murali R., Yang Y., Brenner K., Beck T., Meindl J. D., "Breakdown Current Density of Graphene Nano Ribbons", Applied Physics Letters, 94, 2009, 243114.
[3] Ghosh S., Calizo I., Teweldebrhan D., Pokatilov E. P., Nika D. L., Balandin A. A.,
[4] Bao W., Miao F., Lau C. N., "Extremely high thermal conductivity of graphene: Prospects for thermal management applications in nanoelectronic circuits", Applied Physics Letters, vol. 92, no. 15, 2008, 151911. DOI: dx.doi.org/10.1063/1.2907977.
[5] Ohta T., Bostwick A., Seyller T., Horn K., Rotenberg E., "Controlling the electronic structure of bilayer graphene", Science, 313, 2006, 951. DOI: dx.doi.org/10.1126/science.1130681.
[6] Strupinski W., "Graphene epitaxy by chemical vapor deposition on SiC", Nano Letters, vol. 11, no. 4, 2011, 1786. DOI: dx.doi.org/10.1021/nl200390e.




T M AN N C C B P A S : B

Submitted: 20th September 2015; accepted: 26th October 2015

Sławomir Zadrożny, Janusz Kacprzyk, Marek Gajewski

DOI: 10.14313/JAMRIS_4-2015/34

Abstract: We deal with the problem of the multiaspect text categorization, which calls for the classification of documents with respect to two, in a sense, orthogonal sets of categories. We briefly define the problem, mainly referring to our previous work, and study the application of the k-nearest neighbours algorithm. We propose a new technique meant to enhance the effectiveness of this algorithm when applied to the problem in question. We show some experimental results confirming the usefulness of the proposed approach.

Keywords: text categorization, intelligent system, nearest neighbour classifiers, topic tracking and detection, fuzzy majority

1. Introduction

An important feature desired for intelligent systems is the capability to deal with textual information. Despite many efforts and success stories, this area still poses many challenges to the research community. Natural language processing is an example of a domain where much has been achieved, but machines are still behind a human being and his capability to understand a text in its full meaning. Even the domain of information retrieval, setting for itself more modest goals with respect to textual information processing, calls for further research to address the tremendous growth of information to be processed, as well as the ambition to assist a human user in tackling more and more complex problems, so far reserved for a human being. In this paper we study one such problem, motivated by some real-life applications, and try to propose and extend some well-known techniques to deal with it. Our starting point is the concept of the multiaspect text categorization (MTC), which we introduced earlier in a series of papers [9, 22, 23, 25]. The motivation is a real, practical problem of managing collections of documents for the purposes of an organization, notably a public institution, which has to be carried out following formal regulations imposed by the state. A part of this problem is the well-known concept of text categorization (TC) [16], and thus relevant techniques and tools are readily applicable. Another part is, however, more challenging. Although it also can be interpreted as a TC problem, its characteristics make it a more difficult task – first of all due to the limited number of training documents available, but also due to the different motives underlying the grouping of documents.

We have studied the MTC problem in a number of papers, cited above, and proposed some solutions to it. Here we study the use of the k-nearest neighbours classifier (k-nn) and propose a new algorithm inspired by this study. The starting point is the study of Yang et al. [20], which concerns the similar problem of topic detection and tracking (TDT) [1] and proposes some extensions to the basic k-nn algorithm in order to deal properly with the specificity of the problem at hand. The structure of this paper is the following. The next section briefly introduces the MTC problem. In Section 3 we recall the work of Yang et al. on the use of the k-nn classifier for the purposes of the TDT problem and their extensions to the basic algorithm. In Subsection 3.2 we present our algorithm, inspired by the work of Yang et al. and combining the paradigms of the nearest neighbour classifier and of the profile-based classifiers [16]. Section 4 shows the results of our computational experiments meant to compare the discussed methods, and Section 5 concludes and discusses some ideas for further research.

2. MTC Problem Description

2.1. The Problem

The multiaspect text categorization problem (MTC) may be considered as a twofold standard multiclass, single-label classification. Thus, a collection of documents is assumed:

D = {d1, . . . , dn}

(1)

These documents are, on the one hand, assigned to a set of predefined categories

C = {c1, . . . , ct}

(2)

On the other hand, they are also assigned to sequences of documents, referred to as cases, within their own categories. The cases, generally, are not predefined and are established based on the documents arriving at the classification system. We will assume that at the beginning some cases are already formed. Some of them may be treated as closed, i.e., no new document should be assigned to them, and some of them are on-going, i.e., they are candidates for new documents to be assigned (classified) to. Each document d belongs to exactly one category and one case within this category. The cases will be denoted as σ and their set as Σ:

σk = ⟨dk1, . . . , dkl⟩
Σ = {σ1, . . . , σp}

(3) (4)
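The structure assumed in (1)–(4) — documents assigned to exactly one category and to one ordered, possibly closed case within it — can be captured directly in code. The class and attribute names below are illustrative, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Case:
    """A case sigma: an ordered sequence of documents within one category.
    The order of the list reflects the arrival chronology of the documents."""
    category: str
    documents: list = field(default_factory=list)
    closed: bool = False  # closed cases accept no further documents

    def append(self, doc_id: str) -> None:
        if self.closed:
            raise ValueError("cannot assign a document to a closed case")
        self.documents.append(doc_id)

# Each document belongs to exactly one category and one case within it.
sigma_1 = Case(category="human-resources")
sigma_1.append("d1")
sigma_1.append("d2")
assert sigma_1.documents == ["d1", "d2"]
```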



When a new document d∗ arrives, it has to be properly added to the collection D, i.e., d∗ has to be classified to a proper category and assigned to a proper case within this category. We consider the task of the system to be of the decision support type, i.e., a human user should be assisted by the system in choosing a proper category c ∈ C and a proper case σ ∈ Σ for the document d∗, but he or she is responsible for performing these actions. Several ways of assigning documents to categories/cases may be conceived; cf., e.g., [23, 25]. We follow here the line of conduct presented in the latter paper, i.e., a two-stage assignment: first to a category and then to a case. The set of categories is prespecified, and each of them may be assumed to be represented by a sufficient number of documents in the collection D. Thus, standard text categorization techniques may be employed [16]. On the other hand, the cases may be quite short and, moreover, emerge dynamically during the lifetime of the document management system. Moreover, it should be assumed that the organization of the documents into categories is based on some top-level thematic grouping. For example, in the document collection of a company one category may comprise documents concerning relations of the company with public administration institutions, another category may gather documents related to the activity of the company's supervisory board, while still another category may concern all matters related to human resources. On the other hand, documents are grouped into cases based on some business process they are related to, e.g., hiring a new employee. Thus, documents belonging to the cases within the same category are in general thematically similar, and what should decide on assigning a new document to one of them is somehow different from the clue on document assignment to a category.
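The two-stage assignment described above can be sketched as a k-nn vote for the category followed by a case-matching step restricted to the on-going cases of that category. The similarity function, dictionary keys, and the value of k are placeholders; the sketch assumes at least one on-going case exists in the chosen category.

```python
from collections import Counter

def assign(new_vec, labelled_docs, cases, sim, k=5):
    """Stage 1: a k-nn majority vote over labelled documents picks the
    category. Stage 2: the best-matching on-going case of that category
    is chosen. `sim` is any vector similarity, e.g. cosine."""
    # Stage 1: majority category among the k most similar documents
    neighbours = sorted(labelled_docs,
                        key=lambda d: sim(new_vec, d["vec"]),
                        reverse=True)[:k]
    category = Counter(d["cat"] for d in neighbours).most_common(1)[0][0]
    # Stage 2: compare only against on-going cases within that category
    candidates = [c for c in cases if c["cat"] == category and not c["closed"]]
    case = max(candidates, key=lambda c: sim(new_vec, c["centroid"]))
    return category, case
```

A usage sketch with a dot-product similarity over toy 2-d vectors: `assign((1.0, 0.0), docs, cases, lambda a, b: a[0]*b[0] + a[1]*b[1], k=3)` returns the majority category of the three nearest documents and the closest open case in it.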
One of the aspects which may be helpful in making the decision on assigning a document to a case is the fact that the documents are arranged within the case in a specific order. This order is based on the logic of the business process related to a given case and reflects the chronology of the running of this process. We assume that the documents arrive for classification exactly in this order, and thus this order may be exploited during the classification.

2.2. Related Work

The MTC problem has been formulated in our previous papers; cf., e.g., [22]. It belongs to the broad class of text categorization problems. The most similar problem well known in the literature is Topic Detection and Tracking (TDT) [1], which may be very briefly described as follows. Topic detection and tracking concerns a stream of news on a set of topics. The basic task is to group together news stories on the same topic. A story in TDT corresponds to a document in our MTC problem definition, while a topic is a counterpart of a case. Categories as such are not considered in the original formulation of the TDT problem, although later on the concept of hierarchical TDT has been introduced [8], which brings the TDT and the MTC even closer. Topics, similarly to cases, are not predefined and


new topics have to be detected in the stream of stories and then tracked, i.e., all subsequent stories concerning a topic should be recognized and properly classified. A subtask of first story detection is distinguished, which consists in deciding whether a newly arrived story belongs to one of the earlier recognized topics or starts a new topic. Although the MTC and TDT problems share many points, they are still different. In the former, categories and cases are considered, while only topics are presumed in the latter (even in the hierarchical TDT, mentioned earlier, the relation between MTC's categories and cases is not reflected, as the hierarchy of topics is there meant in the standard text categorization sense, i.e., the categories at different levels of the hierarchy are just themes considered at different levels of abstraction and do not follow a different principle of grouping, such as theme versus business process, as is assumed for the MTC). Moreover, cases in MTC are sequences of documents, while topics in TDT are just sets of stories. Again, even if stories in TDT are timestamped, their succession within a topic is not assumed to carry any semantic information, and the use of this temporal information to solve the tasks of the TDT, if any, is limited to reducing the influence of the older stories on the classification decision. Finally, the practical context is different: for TDT it is the news stories stream analysis, while for MTC it is the business documents management. The reader is referred to our forthcoming paper [9] for a more in-depth analysis of the relations between the TDT and MTC problems. The solution approaches to the TDT problem belong to the mainstream of information retrieval. A standard representation, most often in the framework of the vector space model, is assumed for the stories. The notion of similarity/dissimilarity of stories represented as vectors in a multidimensional space is employed to detect and track topics.
Often, various cluster analysis techniques are used to group stories, interpret the clusters as topics and represent them by the centroids of these clusters. A new story is compared against the centroids of particular topics to decide where it belongs. If there is no centroid similar enough, then a new topic is established and the newly arrived document is assigned to it.

In our previous papers we have proposed a number of solutions to the MTC problem. We also most often adopt the vector space model as the starting point. The matching of a document and a case was computed as the weighted average of the fuzzy subsethood degrees of the fuzzy set representing the document in a fuzzy set representing a given case and in a fuzzy set representing the category of this case, respectively. This way, the assignment of a document to the case for which the highest matching was obtained, which implied also the assignment to its category, was based on the combination of the matching of this document with respect to the case as well as to the whole category. Then we proposed to model the cases and proceed with the classification of the documents in the framework of the hidden Markov models and sequence mining [22], using the concepts of computational intelligence [23], or employing support vector machines [24]. We also pursued other paths, including the semantic representation of documents, finding a parallel of the MTC with text segmentation, studying the asymmetry of similarity [13, 14], devising new cluster analysis techniques [11] or investigating the applicability of the concepts related to coreference detection in data schemas [17]. In this paper we follow the line of research on combining some approaches related to the classification task and computational intelligence tools to propose a new approach, and also study the applicability of well-known techniques to the problem of multiaspect text categorization.
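As a rough illustration, the centroid-based topic detection scheme recalled above can be sketched as follows (a minimal sketch over already-vectorized stories; the cosine similarity and the `threshold` parameter are assumed illustrative choices, not taken from any specific TDT system):

```python
import numpy as np

def assign_or_new_topic(story, centroids, threshold=0.5):
    """Assign a story vector to the most similar topic centroid, or start
    a new topic when no centroid is similar enough. Returns the topic
    index and the (possibly extended) list of centroids."""
    story = np.asarray(story, dtype=float)
    if centroids:
        sims = [np.dot(story, c) / (np.linalg.norm(story) * np.linalg.norm(c))
                for c in centroids]
        best = int(np.argmax(sims))
        if sims[best] >= threshold:
            return best, centroids
    centroids.append(story)  # no centroid similar enough: a new topic is established
    return len(centroids) - 1, centroids
```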

3. The Techniques Employed

3.1. Basic k-nn Technique and Its Extensions to Topic Tracking

In this paper we study the use of the k-nearest neighbours technique (k-nn) to solve the multiaspect text categorization problem. From the point of view of the statistical pattern classification theory, this method belongs to the group of nonparametric techniques. This type of approach seems to be the most promising for the task at hand due to the limited set of assumptions which have to be adopted to apply it. One of the characteristic MTC features is the sparsity of the training data and thus, e.g., assuming a specific family of (conditional) probability distributions of the data and estimating its parameters may be difficult, if possible at all. The k-nn technique proved to be effective for many different classification tasks, including the topic tracking and detection problem [20], which is closely related to our MTC problem, as discussed in section 2.2.

Basically, the k-nn technique may be described in the context considered here as follows. A set of categories C and a set of training documents D (cf. section 2.1), for which the category assignment is known, are assumed. For a new document d* to be classified, the k documents in D most similar to it are found. The similarity measure is usually defined as the inverse of some distance measure; usually the Euclidean distance is adopted. The category to which the majority of the k closest documents belong is assigned to d*. Formally, using the notation introduced in (1)-(4), the category c* assigned to the document d* is defined as follows:

c^* = \arg\max_{c_i} |\{d \in D : (Category(d) = c_i) \wedge (d \in NN_k(d^*))\}|   (5)

where Category(d) denotes the category c ∈ C assigned to a training document d and NN_k(d^*) denotes the set of k documents d ∈ D which are the closest to d^*, i.e.,

NN_k(d^*) = \{d_{\delta(1)}, d_{\delta(2)}, \ldots, d_{\delta(k)}\}


where δ is such a permutation of the set {1, . . . , n} that d_{δ(j)} is the j-th most similar to d^* document in the set D.

The k-nn is a very popular classifier, often used also in the context of text categorization, cf., e.g., [10, 19]. An inspiration for our work is in particular the paper by Yang et al. [20] on the application of the k-nn technique for the purposes of topic tracking and detection. The authors adopt a standard document representation of the stories within the framework of the vector space model, using a variant of the tf × IDF keyword weighting scheme. Yang et al. study in particular the use of the k-nn for the solution of the topic tracking problem of the TDT (cf. section 2.2). They proposed some modifications to the basic algorithm which proved to yield better results on some benchmark datasets. Namely, in [20] the following improvements to the basic k-nn algorithm have been proposed. First of all, instead of the multiclass problem they consider t binary classification problems, one for each category c ∈ C (cf. (2)). Moreover, instead of simply counting the number of documents belonging to that category among the k most similar documents NN_k(d^*), they compute the following index, called kNN.sum (we slightly modify the original notation used in [20] to adjust it to the context of our MTC problem):

r(d^*, c, k, D) = \sum_{d \in P_k^c} sim(d^*, d) - \sum_{d \in Q_k^c} sim(d^*, d)   (6)

where P_k^c = \{d \in NN_k(d^*) : Category(d) = c\}, i.e., it is the subset of the documents most similar to d^* which belong to the category c (are positive), Q_k^c = NN_k(d^*) \setminus P_k^c, i.e., it is the subset of documents being negative examples with respect to the category c, and sim(d^*, d) denotes the similarity measure between the documents, which is assumed to be the cosine of the angle between the vectors representing the documents in question. Then, the document is assigned to the category for which this index is the highest, provided it exceeds some threshold value (otherwise the document is treated as starting a new topic). From this point of view this approach is a kind of weighted k-nn technique [6].

Yang et al. notice the difficulty in setting an appropriate value of the parameter k. In the case of the TC problem, i.e., when there is usually a large enough number of positive examples for each category, an experimentally verified recommended value for k is rather large, higher than 30 and less than 200. When the number of positive examples for a given category is small, as is the case for topic tracking in TDT or assigning a document to a case in our MTC, this recommendation is not valid. If k is high, the set NN_k(d^*) will be dominated by negative examples, as their number in D is much greater than the number of positive examples. However, choosing k low may also lead to the set NN_k(d^*) comprising only negative examples, unless the document to be classified d^* is very similar to some positive examples. Yang et al. [20] proposed to overcome this difficulty by introducing modified versions of the index (6).
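For illustration, the basic k-nn rule (5) and the kNN.sum index (6) can be sketched in Python as follows (a minimal sketch over already-vectorized documents; the cosine similarity follows [20], while the function and variable names are ours):

```python
import numpy as np

def _cosines(d_star, docs):
    """Cosine similarity of d_star to every training document."""
    d_star = np.asarray(d_star, dtype=float)
    return np.array([np.dot(d_star, d) /
                     (np.linalg.norm(d_star) * np.linalg.norm(d))
                     for d in docs])

def knn_classify(d_star, docs, labels, k):
    """Basic k-nn rule (5): majority category among the k nearest."""
    nn = np.argsort(-_cosines(d_star, docs))[:k]
    votes = {}
    for i in nn:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)

def knn_sum(d_star, docs, labels, category, k):
    """kNN.sum index (6): summed similarity to the positive neighbours
    minus summed similarity to the negative ones among NN_k(d*)."""
    sims = _cosines(d_star, docs)
    nn = np.argsort(-sims)[:k]
    pos = [i for i in nn if labels[i] == category]   # P_k^c
    neg = [i for i in nn if labels[i] != category]   # Q_k^c
    return float(sims[pos].sum() - sims[neg].sum())
```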



The first version is called kNN.avg1 and is defined as follows:

r'(d^*, c, k, D) = \frac{1}{|P_k^c|} \sum_{d \in P_k^c} sim(d^*, d) - \frac{1}{|Q_k^c|} \sum_{d \in Q_k^c} sim(d^*, d)   (7)

where | · | denotes the cardinality of a set. In this case the similarity to the positive and to the negative examples is averaged and thus, even for a large k, the dominance of the negative examples in the neighbourhood of the classified document d^* does not pose a problem.

The second modified version of the kNN.sum technique (6) is called kNN.avg2 and is defined as follows:

r''(d^*, c, k, D) = \frac{1}{|U_{k_p}^c|} \sum_{d \in U_{k_p}^c} sim(d^*, d) - \frac{1}{|V_{k_n}^c|} \sum_{d \in V_{k_n}^c} sim(d^*, d)   (8)

In this case the k_p positive examples (i.e., belonging to the category c) most similar to d^* and the k_n negative examples most similar to d^* are considered, and they form the sets U_{k_p}^c and V_{k_n}^c, respectively. The similarity between d^* and the documents from these two sets is averaged, as in the case of kNN.avg1. Thanks to that, a small number of the nearest training examples may be taken into account and there is no risk that the negative examples will dominate. The kNN.avg2 technique gives a higher flexibility, making it possible to independently choose the parameters k_p and k_n, but these two parameters have to be tuned instead of just one k, as in kNN.avg1. When kNN.avg2 is going to be applied to the MTC problem, there is a risk that there are not enough positive examples, i.e., that their number is lower than the value of k_p. In particular, if c corresponds to a very short case and is considered as a candidate for the assignment of d^*, then for a reasonable value of k_p this may easily happen. In such a situation, our implementation of kNN.avg2 reduces the value of k_p to the number of existing positive examples.

Yang et al. [20] offer some recommendations as to the tuning of the parameters k or k_p and k_n. Their main concern is the limited number of training documents, which makes the usual splitting of the training data set into a genuine training set and a validation set challenging. Thus, they devise the tuning in the framework of an ensemble of tested classifiers comprising kNN.avg1 and kNN.avg2 as well as a Rocchio-type classifier belonging to the class of profile-based classifiers [16]. The details can be found in [20]. In the current paper we test a more standard way of tuning the parameters and we check how effective it is; cf. section 3.2.

3.2. Our Approach to Improving the k-nn Classifier for the Purposes of the MTC Problem Solution

In our previous work [25] on the solution of the MTC problem we already proposed to use the k-nn


technique for the first stage of the solution, i.e., deciding on the category to which the newly classified document d^* belongs, as well as for the second stage, i.e., assigning the document d^* to a case. Here we study some extensions to the basic k-nn procedure and compare them experimentally with the approaches proposed by Yang et al. [20] which are presented in section 3.1.

In the TDT problem, more specifically the topic tracking problem, considered by Yang et al. [20], the documents related to a topic are assumed to arrive over some time, but the order in which they come is not essential. Namely, some documents may describe exactly the same aspect of a topic/event but come from different sources, and thus it should not be expected that the order in which they appear on the input carries some information useful for their classification. In the MTC problem the situation is different, and the order of the documents within a case may be assumed to convey some extra information which may be exploited for their proper classification. In [25] we have shown that documents can be quite successfully classified to cases by their comparison just to the last document of the candidate cases. Thus, it seems to confirm that the similarity to the most recent documents in a case should influence the classification decision most. This observation recalls what has been confirmed in some computational experiments on the comparison of paths in the tree of an XML document [17]: that the similarity of the last segments of these paths is decisive for establishing the coreference of the respective XML elements. Here, we develop this idea and propose to take into account during the comparison all documents belonging to a candidate case, but with different weights: the closer to the end of the case a given document is located, the higher its weight is. Formally, we propose to use the following index to evaluate the matching of the document d^* against a candidate case σ ∈ Σ.
Let us first cast it in a strict but linguistically expressed form as the truth value of the following linguistically quantified proposition [12, 21]:

The document d^* is similar to most of the important documents of the case σ   (9)

According to Zadeh's calculus of linguistically quantified propositions, the truth value of the proposition (9) for a document to be classified d^* and a candidate case σ is computed as follows:

m(d^*, \sigma) = \mu_Q\left(\frac{\sum_{d \in \sigma} \min(sim(d^*, d), imp(d))}{\sum_{d \in \sigma} imp(d)}\right)   (10)

where sim(·, ·) denotes a similarity measure between documents and imp(·) denotes the importance of the document d belonging to the case σ. The linguistic quantifier Q in (10) represents the concept of a linguistically expressed majority, exemplified in (9) with the word "most". Such a quantifier may be formally represented in many different ways (cf., e.g., [5]) and in what follows we adopt the original Zadeh's approach [21]. Thus, a linguistically quantified proposition fits one of



the generic templates:

QX's are A   (11)

or

QBX's are A   (12)

and expresses that, e.g., for Q = most, "most of the elements of a universe X possess a property A", in the case of (11), or that "most of the elements of a universe X possessing a property B possess also a property A", in the case of (12). The properties A and B are in general fuzzy and are represented by their membership functions defined on the universe of discourse X. A linguistic quantifier Q is formally represented as a fuzzy set in the interval [0, 1]. For example, the membership function of Q = most may be expressed as follows:

\mu_Q(x) = \begin{cases} 1 & \text{for } x \geq 0.8 \\ 2x - 0.6 & \text{for } 0.3 < x < 0.8 \\ 0 & \text{for } x \leq 0.3 \end{cases}   (13)

The value of the membership function μ_Q(x) = y is interpreted as meaning that if x · 100% of the elements of X possess the property A, then the truth value of (11) equals y, or that if x · 100% of the elements of X possessing the property B possess also the property A, then the truth value of (12) equals y. The general formulae for the truth values of (11) and (12) are thus the following, respectively:

truth(QX's are A) = \mu_Q\left(\frac{\sum_{x \in X} \mu_A(x)}{n}\right)   (14)

truth(QBX's are A) = \mu_Q\left(\frac{\sum_{x \in X} \min(\mu_A(x), \mu_B(x))}{\sum_{x \in X} \mu_B(x)}\right)   (15)

Notice that the definition of our indicator of matching between the document d^* and a case σ, as expressed with (9), may be rephrased as:

Most of the important documents of σ are similar to d^*   (16)

and thus fits the general template (a protoform) (12) [12].
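For illustration, the truth value (15) together with the quantifier "most" defined by (13) can be sketched in Python as follows (a minimal sketch; the function names are ours):

```python
def mu_most(x):
    """Membership function (13) of the linguistic quantifier 'most'."""
    if x >= 0.8:
        return 1.0
    if x > 0.3:
        return 2 * x - 0.6
    return 0.0

def truth_QBA(mu_A, mu_B, X, mu_Q=mu_most):
    """Truth value (15) of 'QBX's are A' in Zadeh's calculus: the share of
    B-elements that are also A, passed through the quantifier mu_Q."""
    num = sum(min(mu_A(x), mu_B(x)) for x in X)
    den = sum(mu_B(x) for x in X)
    return mu_Q(num / den) if den > 0 else 0.0
```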
It may also be easily seen that the formula (10) is an instantiation of the formula (15), where X is the set of documents belonging to the case σ (treated in the following formulae as a set of documents), A is a fuzzy property of a document d ∈ σ with the membership function:

\mu_A : \sigma \rightarrow [0, 1], \quad \mu_A(d) = sim(d, d^*)

and the fuzzy property B corresponds to the importance of the document d with respect to the case σ, i.e.:

\mu_B : \sigma \rightarrow [0, 1], \quad \mu_B(d) = imp(d)

The index (10) introduced here is used to assign a new document d^* to a case in a straightforward way (we assume that before that d^* is assigned to a category c using, e.g., the basic k-nn algorithm, as in our previous paper [25]):


1) the matching index m defined by (10) is computed for all candidate (on-going) cases belonging to the category c; the set of such cases is denoted as Σ_c,

2) the document d^* is assigned to the case σ^* such that:

\sigma^* = \arg\max_{\sigma_i \in \Sigma_c} m(d^*, \sigma_i)
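For illustration, the matching index (10) and the above assignment step can be sketched in Python as follows (a minimal sketch: the Euclidean-complement similarity, the linear importance and the identity quantifier used here are illustrative choices, and the helper names are ours; any RIM quantifier could be plugged in for `mu_Q`):

```python
import numpy as np

def sim_euclid(d1, d2):
    """Complement of the Euclidean distance of unit-norm vectors,
    scaled to [0, 1] (one possible choice of sim)."""
    d1, d2 = np.asarray(d1, float), np.asarray(d2, float)
    return (np.sqrt(2) - np.linalg.norm(d1 - d2)) / np.sqrt(2)

def linear_imp(pos, length):
    """Linear importance: grows with the position of a document in a case."""
    return pos / length

def matching(d_star, case, mu_Q=lambda x: x, sim=sim_euclid, imp=linear_imp):
    """Matching index (10): degree to which d* is similar to most of the
    important documents of the case."""
    n = len(case)
    num = sum(min(sim(d_star, d), imp(i + 1, n)) for i, d in enumerate(case))
    den = sum(imp(i + 1, n) for i in range(n))
    return mu_Q(num / den)

def assign_to_case(d_star, cases):
    """Pick the candidate case with the highest matching index."""
    return max(cases, key=lambda name: matching(d_star, cases[name]))
```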

Thus, our approach may be treated as another way of using the k-nn technique for the classification of documents, although to some extent it may also be interpreted as a kind of profile-based classification [16]. We employ the weighted similarity of the document d^* with respect to the documents of a case, similarly as kNN.avg1 and kNN.avg2 do. However, we compute the weighted average with respect to all the training documents comprising a particular candidate case. Moreover, in our approach two different types of weights are involved: one related to the similarity sim(d, d^*) and another related to the importance imp(d) of a document within a case.

In order to use the introduced index (10) effectively, we need to devise a way to set its parameters, i.e.:

1) the form of the quantifier Q,

2) the form of the similarity measure sim used therein,

3) the importance weights assigned to the particular documents of a case.

Concerning the linguistic quantifier employed, the very nature of the proposed index m (10) suggests the use of a quantifier expressing the concept of the (fuzzy) majority, such as "most". More generally, a so-called Regular Increasing Monotonic (RIM) quantifier should be used [18], i.e., one with a monotone increasing membership function μ_Q, such as, e.g., (13), i.e.:

\forall x, y \quad x < y \Rightarrow Q(x) \leq Q(y)   (17)

Thus, the choice of a specific quantifier seems to be of limited importance. The replacement of a linguistic quantifier Q1 with Q2 in (10) may change the assignment of a document to a case only due to the assumed weak monotonicity of the membership function μ_Q (cf. (17)), i.e., if μ_{Q_i}(x) = μ_{Q_i}(y) and μ_{Q_j}(x) < μ_{Q_j}(y), where i, j ∈ {1, 2} and x < y.
Thus, we assume the unitary linguistic quantifier in (10), i.e., the one defined by the membership function:

\mu_Q(x) = 1, \quad \forall x \in X

It has to be noted that the choice of a linguistic quantifier may play a more important role if a threshold value of the index (10) is set and meant to decide if the document d^* may be assigned to a given case or should start a new case, i.e., when the first story detection problem is considered. However, this goes beyond the scope of this paper.

The similarity measure sim in (10) may be defined in many ways. In our previous work we have most often used the Euclidean distance between the vectors representing the documents under comparison. Here



we adopt it also and take its complement as the measure of similarity. We assume the vectors representing documents to be normalized in such a way that their Euclidean norms equal 1. Thus, the highest possible Euclidean distance between two vectors representing documents equals √2, and the similarity between two documents d = [d_1, . . . , d_l] and d^* = [d_1^*, . . . , d_l^*], assuming the number of keywords used to represent the documents to equal l, may be expressed as follows:

sim(d, d^*) = \frac{\sqrt{2} - \sqrt{\sum_{i=1}^{l} (d_i - d_i^*)^2}}{\sqrt{2}}   (18)

with sim(d, d^*) ∈ [0, 1].

Finally, the importance of each document in a case is assumed to be an increasing function of the position of the document in the case. The extreme cases are the following:

- only the last document in the case is important; cf. our paper [25] studying this case;

- all documents are equally important, i.e., effectively the importance is not taken into account.

In the current paper we consider and test experimentally the following options (x denotes the position of a document in the case, len denotes the length of this case, and a and b are the parameters):

1) linear importance:

imp(x) = \frac{x}{len}   (19)

2) quadratic importance:

imp(x) = a \left(\frac{x}{len}\right)^2 + 1 - a   (20)

3) radical importance:

imp(x) = a \sqrt{\frac{x}{len}} + 1 - a   (21)

4) piecewise linear importance:

imp(x) = \begin{cases} 0 & \text{if } x \leq a \\ (x - a)/(b - a) & \text{if } a < x \leq b \\ 1 & \text{if } x > b \end{cases}   (22)

All options assume that the importance degree of the last document of the case, i.e., the one most recently assigned to this case, equals 1, i.e., is the highest. All are also monotone: documents located later in a case get an importance not lower than those located earlier. The first option is the simplest one, making the last document most important and gradually reducing the importance of earlier documents at a constant rate. No parameters have to be set. The second option, the quadratic importance, for high values of the parameter a ∈ [0, 1], reduces, relative to the linear importance, the importance of the documents preceding the last document of the case. This reduction is highest for the documents located in the middle of


the case. For small values of a the importance of the documents in the case is increased relative to the linear importance. This increase is highest for the documents at the beginning of the case. The radical importance, for any value of the parameter a ∈ [0, 1], increases the importance of all documents in the case. The smaller the value of a, the higher this increase is. The increase is relatively highest for the documents located at the beginning of the case. Finally, the piecewise linear importance makes it possible to set the importance of the documents located at the beginning of a case to 0, which results in ignoring them during the computation of the index (10). At the same time, the documents located closer to the end of the case can get an importance degree equal to 1; the highest importance degree is thus no longer reserved for the last document of the case. This option requires setting two parameters a and b, which decide what proportion of the documents will get the importance degrees equal to 0 and 1, respectively. An option for the importance degree may be chosen based on the experience of the user or may be tuned using the training dataset. In our experiments reported in section 4 we follow the latter way.

3.3. Oversampling of Short Cases to Circumvent the Imbalance of Classes

In the previous section we have introduced a variant of the k-nn method which is meant to better account for the relation of the subsequent documents in a case. This is a rather far-going modification of the original algorithm which, in a sense, replaces the comparison of the document to be classified d^* against training documents with the comparison of d^* against whole cases, with a proper account for the sequential character of the documents forming a case. In this section we propose another, more modest, modification of the original k-nn algorithm which is expected to improve its performance on the MTC problem.
Namely, when a document d^* is going to be assigned to a case, the particular candidate (on-going) cases are usually of different lengths. Some have just started and comprise a small number of documents, while others are already well developed and may comprise tens of documents. Thus, when each case is treated as a separate class, we usually have to deal with classes of imbalanced sizes in the training dataset. We propose to duplicate the documents of short candidate cases, thus increasing their visibility during the execution of the regular k-nn method and its variants described earlier, including our approach presented in section 3.2. Formally, a threshold caselen is set and all cases in a given category whose length is shorter than this threshold are replicated. Several strategies may be applied: all documents of such a short case may be replicated the same number of times, or the number of replicated copies may depend on the location of the document within the case. Following a reasoning similar to that of section 3.2, we may put more emphasis on the most recent documents in the case and replicate them more times. In our experiments in section 4 we try a few variants.
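The uniform-replication strategy described above can be sketched in Python as follows (a minimal sketch; the replication `factor` and the function name are assumed illustrative choices, and the position-dependent strategy is omitted):

```python
def oversample_short_cases(cases, caselen, factor=2):
    """Replicate the documents of cases shorter than `caselen` so that
    short (young) cases are better represented among the nearest
    neighbours of a document to be classified."""
    training = []  # list of (document, case_id) pairs
    for case_id, docs in cases.items():
        copies = factor if len(docs) < caselen else 1
        for doc in docs:
            training.extend([(doc, case_id)] * copies)
    return training
```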



This technique is basically meant for the original k-nn algorithm. Its variants proposed by Yang et al. [20] and described in section 3.1 are somewhat resistant to this problem, thanks to the averaging of the similarity (and dissimilarity) over the documents neighbouring the document d^*. It is also easy to notice that oversampling corresponds to considering the importance in our approach presented in section 3.2 and eventually boils down to the weighted averaging of the similarities of the documents neighbouring d^*. In the experiments discussed in the next section we employ the oversampling in the case of the regular k-nn technique to compare its effectiveness with the effectiveness of the apparently more sophisticated methods discussed in sections 3.1 and 3.2.

4. Computational Experiments

4.1. Data and Software Used

There are no benchmark datasets yet for the multiaspect text categorization problem dealt with here. In our work we are using a collection of papers about computational linguistics which have been structured using XML and made available on the Internet as the ACL Anthology Reference Corpus (ACL ARC) [4]. In our experiments we use 113 papers forming a subset of the ACL ARC. We group the papers to obtain categories (cf. section 2). After some trials we decided to look for 7 categories (clusters) using the standard k-means clustering algorithm. The clustering is applied to the papers represented according to the vector space model (cf., e.g., [2]), ignoring the XML markup. In particular, the following operations are executed to obtain the final representation (cf. also our earlier papers, e.g., [25]). The text of the papers is normalized, i.e., the punctuation, numbers and multiple white spaces are removed, stemming is applied, all characters are changed to lower case, and stopwords and words shorter than 3 characters are dropped. The document-term matrix is created for the whole set of papers using the tf × IDF term weighting scheme. Next, the keywords present in less than 10% of the papers are removed from the document-term matrix. The vectors representing particular papers are normalized by dividing each coordinate by the Euclidean norm of the whole vector, and thus the Euclidean norm of each vector equals 1.

Next, we produce a set of cases based on the papers in the following way. The papers are originally partitioned into sections (segments) and each section forms the content of an XML element Section. We treat each paper as a case, while its sections are considered to be the documents of this case, preserving their original order within the paper. This way we obtain a collection of 113 cases comprising 1453 documents, cf. also [22–25].
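The representation pipeline described above can be sketched in pure Python as follows (a minimal sketch; the actual experiments used the R tm package, and stemming and stop-word removal are omitted here for brevity):

```python
import math
import re
from collections import Counter

def normalize_text(text):
    """Lower-case, strip punctuation and numbers, and drop words
    shorter than 3 characters."""
    words = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return [w for w in words if len(w) >= 3]

def tfidf_matrix(raw_docs, min_doc_share=0.10):
    """tf x IDF document-term matrix: keywords present in fewer than
    `min_doc_share` of the documents are dropped, and each row is
    scaled to unit Euclidean norm."""
    docs = [normalize_text(d) for d in raw_docs]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))          # document frequency
    terms = sorted(w for w, c in df.items() if c / n >= min_doc_share)
    rows = []
    for d in docs:
        tf = Counter(d)
        row = [tf[t] * math.log(n / df[t]) for t in terms]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0   # guard all-zero rows
        rows.append([x / norm for x in row])
    return rows, terms
```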
The documents, i.e., the sections of the original papers, are represented using the vector space model. Thus, again the operations such as the removal of punctuation, numbers and multiple white spaces, stemming, changing all characters to lower case, and the elimination of stopwords and words shorter than 3 characters are applied to the documents. A document-term matrix is constructed for the


above set of documents using the tf × IDF term weighting scheme. Again, sparse keywords appearing in less than 10% of the documents are removed from this matrix, and as a result 125 keywords are used to represent the documents. The vectors representing the documents are normalized in the same way as in the case of the papers, i.e., their Euclidean norm equals 1.

The dataset obtained this way is then split into the training and testing datasets. To this aim, a number of cases are randomly chosen as the on-going cases, which are thus the candidate cases for the document d^* to be assigned to. In each on-going case a cut point is selected randomly: the document located at the cut point and all subsequent documents are removed from the case and serve as the testing dataset. All remaining documents from the collection serve as the training dataset.

All computations are carried out using the R platform [15] with the help of several packages. In particular, the text processing operations are implemented using the tm package [7]. The FNN package [3] is employed to classify documents to cases with the use of the original k-nn algorithm. The algorithms mentioned in sections 3.1 and 3.2 have been implemented by ourselves in the form of an R script.

4.2. The Goals of the Experiments and How the Parameters Are Chosen

Our goal is to compare the effectiveness of the basic k-nn algorithm and its variants discussed in sections 3.1 and 3.2 for the solution of the MTC problem presented in section 2, and in particular of its second stage, consisting in assigning the document d^* to the proper case. Of special interest is, of course, our approach presented in section 3.2. Thus, we assume here the representation of the collection of documents described in section 4.1 and we assume the two-stage approach, with the two stages consisting in assigning the document d^* to a category and to a case within this category, respectively.
We run a number of experiments and, based on the results of each run, we evaluate the effectiveness of the assignment of documents to cases. As the evaluation of the effectiveness of the assignment we use the microaveraged recall, i.e.,

accuracy = (number of documents properly assigned to their cases) / (number of all documents being classified)   (23)

The cardinalities of the sets of test documents belonging to particular cases do not differ extremely, and thus microaveraging seems to be the measure best illustrating the quality of the particular classification algorithms under consideration. Each experiment consists in choosing the on-going cases and classifying all the documents located behind the cut points in these cases.

An important aspect of the successful application of a classifier is the question of tuning its parameters. Thus, in the first series of our experiments we



tune the parameters of the five classifiers under comparison (cf. section 3):

1) the regular k-nn algorithm with the parameter tuned being k;

2) the kNN.avg1 technique proposed by Yang et al. [20] with the parameter tuned being k;

3) a simplified variant of kNN.avg1, cf. r' given by (7), which may be expressed as follows:

r_0'(d^*, c, k, D) = \frac{1}{|P_k^c|} \sum_{d \in P_k^c} sim(d^*, d)   (24)

i.e., instead of taking into account the similarity of the document d^* to the closest both positive and negative documents of the training data set, as r' does, the index r_0' takes into account only the closest positive neighbours. The simplified variant is more in the spirit of the basic k-nn technique and we would like to check if taking into account also the similarity of d^* with respect to the negative examples really increases the effectiveness of the classification; here again the tuning concerns choosing the value of k;

4) the kNN.avg2 technique proposed by Yang et al. [20] with the parameters tuned being k_p and k_n; for this technique a variant simplified along the lines proposed for kNN.avg1 may also be of interest, but for large k's it becomes identical with r_0' introduced above, and in our preliminary tests it turned out to be inferior to the original kNN.avg2 technique;

5) our approach given by (10) with the parameter tuned being the importance function imp.

Tuning the Parameter k

One may choose and fix the parameter values based on his or her experience, some extra knowledge concerning the characteristics of the problem at hand, or on historical data. It is also possible to dynamically and automatically choose the parameter values each time a classification decision is to be made, again based on the available data. In the first series of experiments we check if such dynamic tuning really helps in comparison to using a fixed value of the parameter k. We compare the results obtained for all five algorithms under consideration: the basic k-nn, kNN.avg1 and its modified variant defined by (24), and kNN.avg2, with the tuning of the parameter k and without tuning it, using the fixed values k = 1, 5, and 10 for k-nn, and k = 5 and 10 for kNN.avg1 and its modified variant (for k = 1 these two latter methods coincide with the 1-nn). For kNN.avg2 all 9 combinations of k_p, k_n ∈ {1, 3, 5} are investigated.
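The simplified variant (24) can be sketched in Python as follows (a minimal sketch over already-vectorized documents; the cosine similarity and the function names are assumed illustrative choices):

```python
import numpy as np

def cosine(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def r0(d_star, docs, labels, category, k):
    """Simplified kNN.avg1 index (24): average similarity of d* to the
    positive examples only, among its k nearest neighbours."""
    sims = np.array([cosine(d_star, d) for d in docs])
    nn = np.argsort(-sims)[:k]
    pos = [i for i in nn if labels[i] == category]   # P_k^c
    return float(sims[pos].mean()) if pos else 0.0
```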
An issue to consider for the dynamic tuning is which part of the data set to use. Basically, it should be a separate validation data set. However, it is difficult to obtain one, due to the limited size of the classes (cases) and the requirement that the training data set has to be formed of the prefixes of the cases (i.e., original cases up to a cut-off point). Thus, for the purposes of the tuning we have employed training and testing datasets


formed as for the original dataset, but assuming that all the cut-off points in the on-going cases are one position earlier than in the original dataset (if such a new cut-off point happens to be the first position of a case, then such a case is not used during the tuning). Then, we compared two tuning procedures, both based on adopting subsequent values of k from the interval [1, 10] and checking if the testing documents are assigned to the proper cases, but differing in the number of testing documents taken into account. In the simpler procedure the testing set comprises only the documents located at the new cut-off points, while in the second procedure also the preceding documents are used, down to the document located at the second position in a given case. The former procedure, called simple in what follows, may benefit from employing for tuning a dataset most similar, in terms of the length of the cases and their content, to the actual test dataset. This procedure is also computationally cheaper, as fewer documents are classified. The latter procedure, called complex in what follows, may better reflect the properties of the cases belonging to a given category, but is more expensive.

In Tables 1-4 we show the results of the tuning of the parameter k for all the methods under comparison (or k_p and k_n in the case of the kNN.avg2 technique). Table 1 shows that for the basic k-nn technique the best results are obtained for the fixed value of k equal to 1 and for the simple tuning procedure. The former is however much cheaper computationally, and thus we will use k fixed to 1 for the basic k-nn in our further comparisons with the other techniques discussed in this paper. In the case of the kNN.avg1 technique the results obtained for various values of k are more uniform, as Table 2 shows (k = 1 is omitted, as the method then coincides with the 1-nn). This seems to be the effect of the averaging employed by the kNN.avg1 technique. For further comparisons we choose the complex tuning procedure.
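The dynamic tuning loop described above can be sketched in Python as follows (a minimal sketch: `classify(doc, k)` returning a case id and `validation` as (document, true case) pairs built from the shifted cut-off points are assumed interfaces, not the actual R implementation):

```python
def tune_k(classify, validation, ks=range(1, 11)):
    """Try each candidate k on the validation pairs and keep the value
    with the highest accuracy, as in the simple/complex procedures."""
    best_k, best_acc = None, -1.0
    for k in ks:
        hits = sum(1 for doc, case in validation if classify(doc, k) == case)
        acc = hits / len(validation)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc
```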
The same happens for our simplified version of the kNN.avg1 technique, and we again choose the complex tuning. In the case of the kNN.avg2 technique the best results are obtained for the largest tested value of kp, i.e., kp = 10 (in some extra tests with even higher values of kp we have not obtained better results). The value of kn does not make much difference, so we choose the setting (kp, kn) = (10, 1) for further comparisons.

Tab. 1. The averaged results of 100 runs of the basic k-nn algorithm for the following fixed values of k: 1, 5, 10, and for the values tuned using the simple and complex procedures. The first row shows the mean value of the accuracy over all the runs, while the second row shows the standard deviation.

                      k
   1       5       10      simple  complex
   0.6338  0.5186  0.4566  0.6077  0.5741
   0.0656  0.0607  0.0524  0.0641  0.0656




Tab. 2. The averaged results of 100 runs of the kNN.avg1 algorithm for the following fixed values of k: 5, 10, and for the values tuned using the simple and complex procedures. The first row shows the mean value of the accuracy over all the runs, while the second row shows the standard deviation.

                 k
   5       10      simple  complex
   0.6079  0.5961  0.6164  0.6196
   0.0564  0.0582  0.0599  0.0572

Tab. 3. The averaged results of 100 runs of the modified (simplified) kNN.avg1 algorithm for the following fixed values of k: 5, 10, and for the values tuned using the simple and complex procedures. The first row shows the mean value of the accuracy over all the runs, while the second row shows the standard deviation.

                 k
   5       10      simple  complex
   0.6150  0.6020  0.6307  0.6329
   0.0596  0.0634  0.0615  0.0553

Choosing the Importance Function. This parameter applies only to our technique proposed in section 3.2. We have to choose one of the importance functions (19)-(22). Besides choosing the function itself, with the exception of the linear importance, we also have the freedom to choose its parameters. In the case of the quadratic and radical functions (20)-(21) there is only one parameter a ∈ [0, 1], which we sample every 0.1. In the case of the piecewise importance two parameters a, b ∈ [0, 1] are to be selected. In our experiments we tune this parameter by comparing some fixed settings and the dynamic tuning procedure (using the simple procedure, as described earlier for the tuning of the parameter k for the other techniques under comparison). The fixed settings are the following:
1) the linear importance (19),
2) the quadratic (20) and radical (21) importances with the parameters (a, b) set to (0.1, 0.9), (0.5, 0.5), (0.9, 0.1) each,
3) the piecewise linear importance with the parameters (a, b) set to (0.0, 0.5), (0.3, 0.8), (0.5, 1.0).
The dynamic tuning checks the whole space of the possible settings for the importance function. In particular, all four importance functions are taken into account and tested on the test dataset formed following the simple procedure, i.e., making the cut-off points one position earlier than in the original test dataset and using only the documents located at these new cut-off points for tests. During this testing the linear importance does not need any parameters, the quadratic and radical importance functions are tested with all the pairs from U = {(a, b) : a = 0.1, 0.2, ..., 1.0 and b = 1 − a}, while in the case of the piecewise importance all the pairs from U = {(a, b) : a = b − 1.0, b − 0.9, ..., b − 0.1 and b = 0.1, 0.2, ..., 1.0} are tested. The combination of the parameters, i.e., the importance function together with the setting of the parameters a and b, where applicable, is chosen for the actual classification of a newly arrived document.
Table 5 shows the results of the tuning of the importance function parameter. Several combinations give equally good results. Also using our approach with no importance, which is equivalent to assigning the highest importance of 1.0 to all documents of a case in question, yields good results. In the latter case, the index (10) underlying our approach boils down to averaging the similarity of a document to classify over all documents of a candidate case, which makes it close to the kNN.avg1 and kNN.avg2 techniques of Yang et al. [20]. For the further comparisons of our approach with the other techniques we choose the radical importance function with the parameters (a, b) = (0.5, 0.5), which corresponds to the importance function imp(x) = 0.5·√(x/len) + 0.5.
All five techniques under consideration employ a similarity measure. In our experiments we adopted the Euclidean distance, also for kNN.avg1 and kNN.avg2, which are originally defined in [20] with the use of the cosine measure. We leave experiments with other similarity measures for future research.
Oversampling Variants. In section 3.3 we propose to use the oversampling of documents from short cases in the training dataset to remedy the imbalance of case sizes. In our experiments we have applied the oversampling to the cases of length lower than 3, i.e., effectively only the cases in the training dataset comprising one or two documents are affected.
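For concreteness, the importance functions can be sketched as below. Only the radical function with (a, b) = (0.5, 0.5), i.e., imp(x) = 0.5·√(x/len) + 0.5, is stated explicitly above; the remaining formulas, the weighted-average form of index (10), and all function names are assumptions for illustration, not the definitions (19)-(22) themselves:

```python
import math

# Plausible forms of the importance functions (assumed, see the lead-in).
# `x` is the position of a document within its case, `length` the case length,
# so x/length grows toward 1 for the newest documents.

def imp_linear(x, length):
    return x / length

def imp_quadratic(x, length, a):        # assumed form, a in [0, 1]
    return a * (x / length) ** 2 + (1 - a)

def imp_radical(x, length, a):          # matches imp(x) = 0.5*sqrt(x/len) + 0.5 for a = 0.5
    return a * math.sqrt(x / length) + (1 - a)

def imp_piecewise(x, length, a, b):     # assumed: 0 below a, 1 above b, linear between
    r = x / length
    if r <= a:
        return 0.0
    if r >= b:
        return 1.0
    return (r - a) / (b - a)

def case_score(similarities, imp_values):
    # Assumed reading of index (10): an importance-weighted average of the
    # similarities of the new document to the documents of a candidate case.
    # With all importances equal to 1.0 this reduces to the plain average,
    # as noted in the text.
    return sum(s * w for s, w in zip(similarities, imp_values)) / sum(imp_values)
```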
We have tested the following three variants:
over1, in which the oldest document in the case is oversampled more, i.e., effectively the first document in a short case is tripled while the second (if it exists) is doubled,
over2, in which the newest document in the case is oversampled more, i.e., effectively the first document in a short case is doubled while the second (if it exists) is tripled; this strategy is in line with our general assumption that the newest documents, located closer to the end of the case, matter the most for successful classification,
over3, in which all documents in short cases are equally oversampled; effectively the first and the second document in a short case are doubled.
4.3. The Results
After choosing the fixed parameters or their dynamic tuning, as described earlier, we finally compare the effectiveness of the techniques discussed in sections 3.1 and 3.2. In Table 6 we show the accuracy of the following 20 variants of the earlier discussed algorithms, averaged over 200 runs:
1) the basic 1-nn technique,
2) the basic 5-nn technique,



Tab. 4. The averaged results of 100 runs of the kNN.avg2 algorithm for the following fixed values of (kp, kn): (1,1), (1,5), (1,10), (5,1), (5,5), (5,10), (10,1), (10,5), (10,10), and for the values tuned using the simple and complex procedures. The first rows show the mean value of the accuracy over all the runs, while the second rows show the standard deviation.

                     (kp, kn)
   (1,1)   (1,5)   (1,10)  (5,1)   (5,5)   (5,10)
   0.6264  0.6339  0.6307  0.6552  0.6534  0.6495
   0.0562  0.0565  0.0570  0.0564  0.0569  0.0564

                     (kp, kn)
   (10,1)  (10,5)  (10,10) simple  complex
   0.6682  0.6639  0.6614  0.6429  0.6411
   0.0546  0.0533  0.0551  0.0548  0.0571

Tab. 5. The averaged results of 100 runs of our algorithm for the following fixed choices of the importance function: linear, quadratic with (a, b) = (0.1, 0.9), (0.5, 0.5), (0.9, 0.1), radical with (a, b) = (0.1, 0.9), (0.5, 0.5), (0.9, 0.1), piecewise with (a, b) = (0.0, 0.5), (0.3, 0.8), (0.5, 1.0), dynamically tuned, and with importance identically equal to 1.0 (no importance). The first rows show the mean value of the accuracy over all the runs, while the second rows show the standard deviation.

                        Importance functions
   linear  quadratic                     radical
           (0.1,0.9) (0.5,0.5) (0.9,0.1) (0.1,0.9) (0.5,0.5) (0.9,0.1)
   0.6332  0.6545    0.6577    0.6164    0.6532    0.6595    0.6557
   0.0599  0.0651    0.0593    0.0640    0.0642    0.0623    0.0609

                        Importance functions
   piecewise                     tuned   no importance
   (0.0,0.5) (0.3,0.8) (0.5,1.0)
   0.6525    0.5888    0.5368    0.6279  0.6518
   0.0610    0.0624    0.0663    0.0653  0.0636

3) kNN.avg1 with dynamically tuned value of k using the complex tuning,
4) kNN.avg1 with k fixed and equal to 5,
5) the simplified version of the kNN.avg1 technique with dynamically tuned value of k using the complex tuning,
6) the simplified version of the kNN.avg1 technique with k fixed and equal to 5,
7) kNN.avg2 with (kp, kn) fixed and set to (10,1),
8) our algorithm presented in section 3.2,
9) 5-nn with oversampling in variant 1 (cf. section 3.3),
10) kNN.avg1 with k fixed and equal to 5 and with oversampling in variant 1,
11) the simplified version of the kNN.avg1 technique with k fixed and equal to 5 and with oversampling in variant 1,
12) kNN.avg2 with (kp, kn) fixed and set to (10,1) and with oversampling in variant 1,
13) 5-nn with oversampling in variant 2,
14) kNN.avg1 with k fixed and equal to 5 and with oversampling in variant 2,
15) the simplified version of the kNN.avg1 technique with k fixed and equal to 5 and with oversampling in variant 2,
16) kNN.avg2 with (kp, kn) fixed and set to (10,1) and with oversampling in variant 2,
17) 5-nn with oversampling in variant 3,
18) kNN.avg1 with k fixed and equal to 5 and with oversampling in variant 3,
19) the simplified version of the kNN.avg1 technique with k fixed and equal to 5 and with oversampling in variant 3,
20) kNN.avg2 with (kp, kn) fixed and set to (10,1) and with oversampling in variant 3.
Our goal was to compare our method (10) with the other techniques, to check how the simplified version of kNN.avg1 compares with its original form, and to check whether the oversampling discussed in section 3.3 increases the effectiveness of the techniques to which it is applicable and whether there is a difference between its variants. Of course, as the test has been executed on one dataset, any far-going conclusions are not fully justified. The best results are obtained for kNN.avg2 (for all parameters tested). Our approach produces slightly, but statistically significantly, worse results according to the paired Wilcoxon signed-rank test at the 0.05 significance level (we use this statistical test in what follows, too). In particular, in 108 runs out of the 200 reported in Table 6 the kNN.avg2 without oversampling (algorithm no. 7 in Tab. 6) was better than ours, while ours was better in 63 runs. The third best is the simple k-nn algorithm with k = 1, i.e., 1-nn, which is, however,



Tab. 6. The averaged results of 200 runs of the compared algorithms. For each algorithm the mean value of the accuracy over all the runs and the standard deviation are given.

   no.  algorithm                        mean    std. dev.
   1    1-nn                             0.6309  0.0566
   2    5-nn                             0.5230  0.0611
   3    kNN.avg1, tuned                  0.6264  0.0567
   4    kNN.avg1, k=5                    0.6100  0.0582
   5    simp. kNN.avg1, tuned            0.6253  0.0576
   6    simp. kNN.avg1, k=5              0.6053  0.0592
   7    kNN.avg2, kp=10, kn=1            0.6667  0.0558
   8    our approach                     0.6569  0.0585
   9    5-nn, over1                      0.6084  0.0620
   10   kNN.avg1, k=5, over1             0.6125  0.0576
   11   simp. kNN.avg1, k=5, over1       0.6081  0.0579
   12   kNN.avg2, kp=10, kn=1, over1     0.6671  0.0560
   13   5-nn, over2                      0.6084  0.0620
   14   kNN.avg1, k=5, over2             0.6124  0.0578
   15   simp. kNN.avg1, k=5, over2       0.6079  0.0580
   16   kNN.avg2, kp=10, kn=1, over2     0.6665  0.0558
   17   5-nn, over3                      0.6084  0.0620
   18   kNN.avg1, k=5, over3             0.6130  0.0580
   19   simp. kNN.avg1, k=5, over3       0.6085  0.0583
   20   kNN.avg2, kp=10, kn=1, over3     0.6667  0.0558

significantly worse than the two previously mentioned algorithms, while it is better than the kNN.avg1 algorithm and its simplified version. Concerning the simplification of the kNN.avg1 algorithm which we have considered, our experiments seem to show a statistically significant reduction of the quality of the classification due to its use for most of the parameter settings, i.e., when the pairs of algorithms (4,6), (10,11), (14,15) and (18,19) in Table 6 are compared. However, for the setting where both techniques perform best, i.e., when the parameter k is dynamically tuned (pair (3,5)), there is no statistically significant difference between the original kNN.avg1 technique and its simplified version. Concerning the oversampling, the most striking effect is visible in the case of the basic 5-nn algorithm. Its performance without oversampling is poor, while coupled with oversampling, in any of the variants over1, over2 or over3, it produces results not significantly worse than, e.g., the kNN.avg1 technique. For kNN.avg1 itself and its simplified version adding oversampling also produces significantly better results, again for any variant. For kNN.avg2 no significant impact of oversampling is visible.
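As a minimal sketch, the three oversampling variants over1-over3 compared above can be implemented as follows. The function name and data layout are assumptions for illustration; short cases are those with fewer than 3 documents, as in our experiments:

```python
# Sketch of the three oversampling variants for short training cases:
# over1 triples the oldest document and doubles the newest, over2 does the
# reverse, over3 doubles every document. `case` is the chronologically
# ordered list of documents of one training case.

def oversample(case, variant):
    if len(case) >= 3:                  # only short cases are affected
        return list(case)
    if variant == "over1":              # oldest document oversampled more
        reps = [3, 2]
    elif variant == "over2":            # newest document oversampled more
        reps = [2, 3]
    elif variant == "over3":            # uniform oversampling
        reps = [2, 2]
    else:
        raise ValueError(variant)
    return [doc for doc, r in zip(case, reps) for _ in range(r)]
```

For a one-document case, zip truncates the repetition list, so over1 triples the single document while over2 and over3 double it, matching the variant descriptions above.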

5. Conclusions
We have studied the application of the k-nn technique and its variants to the problem of multiaspect text categorization (MTC), in particular with respect to the classification of a document to a case. One of the variants known from the literature [20] proved to be the best when applied to a dataset we prepared for our experiments with the solutions to MTC. We also proposed our own technique, which makes it possible to take into account the importance of the documents within a case in an intuitively appealing way. This approach also yields good results in our experiments. We have also studied various ways of tuning the parameters of the classifiers employed, and we have checked whether the oversampling of data may help to increase the accuracy of the classification. The results are mixed in this respect: for some classifiers the dynamic tuning of the parameters works, while for others there is no improvement. The oversampling supports better classification, but the results are convincing mainly for the basic 5-nn classifier. Further research is surely needed concerning the tuning of the considered techniques. Our experiments with the ACL ARC dataset have confirmed some limited usefulness of parameter tuning. However, in another setting adjusting the parameters to a given collection may turn out to be worth consideration. Thus, it may be important to devise the tuning algorithms in a computationally optimal way. In our experiments the dynamic tuning has been performed by a direct repetition of the functions implementing particular techniques. This can surely be improved. Ways to sample the parameter space more efficiently should be sought, and combining the sampling with the implementation of a given technique may be advantageous. We have also discussed the question of the form of the test/validation dataset, and here too there is some room for further investigation.

AUTHORS

Sławomir Zadrożny∗ – Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland, e-mail: Slawomir.Zadrozny@ibspan.waw.pl.
Janusz Kacprzyk – Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland, e-mail: Janusz.Kacprzyk@ibspan.waw.pl.
Marek Gajewski – Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland, e-mail: gajewskm@ibspan.waw.pl.
∗ Corresponding author

ACKNOWLEDGEMENTS This work is supported by the National Science Centre (contract no. UMO-2011/01/B/ST6/06908).

REFERENCES
[1] J. Allan, ed., Topic Detection and Tracking: Event-based Information, Kluwer Academic Publishers, 2002.
[2] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, ACM Press and Addison Wesley, 1999.
[3] A. Beygelzimer, S. Kakadet, J. Langford, S. Arya, D. Mount, and S. Li, FNN: Fast Nearest Neighbor Search Algorithms and Applications, 2013. R package version 1.1.
[4] S. Bird, R. Dale, B. Dorr, B. Gibson, M. Joseph, M. Y. Kan, D. Lee, B. Powley, D. Radev, and Y. Tan, "The ACL Anthology Reference Corpus: A reference dataset for bibliographic research in computational linguistics". In: Proc. of the Language Resources and Evaluation Conference (LREC 08), Marrakesh, Morocco, 1755–1759.
[5] M. Delgado, M. D. Ruiz, D. Sánchez, and M. A. Vila, "Fuzzy quantification: a state of the art", Fuzzy Sets and Systems, vol. 242, 2014, 1–30, http://dx.doi.org/10.1016/j.fss.2013.10.012.
[6] S. A. Dudani, "The distance-weighted k-nearest-neighbor rule", IEEE Transactions on Systems, Man, and Cybernetics, vol. 6, no. 4, 1976, 325–327, http://dx.doi.org/10.1109/TSMC.1976.5408784.
[7] I. Feinerer, K. Hornik, and D. Meyer, "Text mining infrastructure in R", Journal of Statistical Software, vol. 25, no. 5, 2008, 1–54, http://dx.doi.org/10.18637/jss.v025.i05.
[8] A. Feng and J. Allan, "Hierarchical topic detection in tdt-2004".
[9] M. Gajewski, J. Kacprzyk, and S. Zadrożny, "Topic detection and tracking: a focused survey and a new variant", Informatyka Stosowana, to appear.


[10] E. Han, G. Karypis, and V. Kumar, "Text categorization using weight adjusted k-nearest neighbor classification". In: D. W. Cheung, G. J. Williams, and Q. Li, eds., Knowledge Discovery and Data Mining - PAKDD 2001, 5th Pacific-Asia Conference, Hong Kong, China, April 16-18, 2001, Proceedings, vol. 2035, 2001, 53–65.
[11] J. Kacprzyk, J. W. Owsiński, and D. A. Viattchenin, "A new heuristic possibilistic clustering algorithm for feature selection", Journal of Automation, Mobile Robotics & Intelligent Systems, vol. 8, no. 2, 2014, http://dx.doi.org/10.14313/JAMRIS_2-2014/18.
[12] J. Kacprzyk and S. Zadrożny, "Power of linguistic data summaries and their protoforms". In: C. Kahraman, ed., Computational Intelligence Systems in Industrial Engineering, volume 6 of Atlantis Computational Intelligence Systems, 71–90, Atlantis Press, 2012, http://dx.doi.org/10.2991/978-94-91216-77-0_4.
[13] D. Olszewski, J. Kacprzyk, and S. Zadrożny, "Time series visualization using asymmetric self-organizing map". In: M. Tomassini, A. Antonioni, F. Daolio, and P. Buesser, eds., Adaptive and Natural Computing Algorithms, volume 7824 of Lecture Notes in Computer Science, 40–49, Springer Berlin Heidelberg, 2013, http://dx.doi.org/10.1007/978-3-642-37213-1_5.
[14] D. Olszewski, J. Kacprzyk, and S. Zadrożny, "Asymmetric k-means clustering of the asymmetric self-organizing map". In: L. Rutkowski, M. Korytkowski, R. Scherer, R. Tadeusiewicz, L. Zadeh, and J. Zurada, eds., Artificial Intelligence and Soft Computing, volume 8468 of Lecture Notes in Computer Science, 772–783, Springer International Publishing, 2014, http://dx.doi.org/10.1007/978-3-319-07176-3_67.
[15] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2014.
[16] F. Sebastiani, "Machine learning in automated text categorization", ACM Computing Surveys, vol. 34, no. 1, 2002, 1–47, http://dx.doi.org/10.1145/505282.505283.
[17] M. Szymczak, S. Zadrożny, A. Bronselaer, and G. De Tré, "Coreference detection in an XML schema", Information Sciences, vol. 296, 2015, 237–262, http://dx.doi.org/10.1016/j.ins.2014.11.002.
[18] R. Yager, "Quantifier guided aggregation using OWA operators", International Journal of Intelligent Systems, vol. 11, 1996, 49–73, http://dx.doi.org/10.1002/(SICI)1098-111X(199601)11:1%3C49::AID-INT3%3E3.0.CO;2-Z.
[19] Y. Yang, "An evaluation of statistical approaches to text categorization", Information Retrieval, vol. 1, no. 1-2, 1999, 69–90, http://dx.doi.org/10.1023/A:1009982220290.



[20] Y. Yang, T. Ault, T. Pierce, and C. W. Lattimer, "Improving text categorization methods for event tracking". In: SIGIR, 2000, 65–72, http://dx.doi.org/10.1145/345508.345550.
[21] L. Zadeh, "A computational approach to fuzzy quantifiers in natural languages", Computers and Mathematics with Applications, vol. 9, 1983, 149–184, http://dx.doi.org/10.1016/0898-1221(83)90013-5.
[22] S. Zadrożny, J. Kacprzyk, M. Gajewski, and M. Wysocki, "A novel text classification problem and its solution", Technical Transactions. Automatic Control, vol. 4-AC, 2013, 7–16.
[23] S. Zadrożny, J. Kacprzyk, and M. Gajewski, "A novel approach to sequence-of-documents focused text categorization using the concept of a degree of fuzzy set subsethood". In: Proceedings of the Annual Conference of the North American Fuzzy Information Processing Society NAFIPS'2015 and 5th World Conference on Soft Computing 2015, Redmond, WA, USA, August 17-19, 2015.
[24] S. Zadrożny, J. Kacprzyk, and M. Gajewski, "A new approach to the multiaspect text categorization by using the support vector machines". In: G. De Tré, P. Grzegorzewski, J. Kacprzyk, J. W. Owsiński, W. Penczek, and S. Zadrożny, eds., Challenging Problems and Solutions in Intelligent Systems, to appear, Springer, Heidelberg New York, 2016.
[25] S. Zadrożny, J. Kacprzyk, and M. Gajewski, "A new two-stage approach to the multiaspect text categorization". In: 2015 IEEE Symposium on Computational Intelligence for Human-like Intelligence, CIHLI 2015, Cape Town, South Africa, December 8-10, 2015, to appear.


