Two Stage Traffic Sign Detection and Recognition based on K-Means and Mask RCNN Algorithm


International Research Journal of Engineering and Technology (IRJET) | Volume: 08 Issue: 11 | Nov 2021 | www.irjet.net
e-ISSN: 2395-0056 | p-ISSN: 2395-0072

Parv Dave1, Meet Diwan2, Meet Raval3
1,2 G H Patel College of Engineering & Technology
3 Dhirubhai Ambani Institute of Information and Communication Technology

Abstract— Sign board identification is critical in machine learning applications such as video surveillance and content-based visual information retrieval. Previous studies on this problem have mostly focused on application-specific sign boards. To detect these sign boards, several distinctive qualities such as hue, size, shape, and symmetry must be employed. In this research, we propose a technique for detecting sign boards in images or videos using Mask RCNN. Experiments on over a hundred photos clearly illustrate the new algorithm's usefulness. Image segmentation is a critical problem in image processing and computer vision, with several applications including scene interpretation, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression. Several image segmentation techniques have been developed in the literature. Due to the success of deep learning models in a variety of vision applications, a significant amount of recent effort has been directed towards creating image segmentation algorithms utilising deep learning models.
Keywords—image classification; data mining; decision trees; object recognition; satellite images; convolutional neural networks (CNNs); end-to-end detection; transfer learning; remote sensing

I. INTRODUCTION

Image segmentation is a critical component of many visual comprehension systems. It entails dividing a picture into several segments or objects. Segmentation is essential in a wide range of applications, including medical image analysis, driverless cars, video surveillance, and augmented reality. Numerous image segmentation algorithms have been developed in the literature, ranging from the earliest methods, such as thresholding, histogram-based bundling, region growing, k-means clustering, and watersheds, to more advanced algorithms, such as active contours, graph cuts, conditional and Markov random fields, and sparsity-based methods.

Signs give a series of cautions regarding the route. They keep traffic moving by assisting travellers in reaching their destinations and informing them of arrival, departure, and turn locations ahead of time. Signs are strategically positioned to guarantee safety, and they also indicate when and where drivers may or may not turn. In this research we designed a method for sign detection and recognition that extracts a road sign from a natural, complex image and processes it so that it can assist drivers in making quick judgments. Factors such as changeable weather, shifting light directions, and varied light intensity make sign recognition difficult in real-time settings. Noise, partial or complete underexposure, partial or complete overexposure, considerable fluctuations in colour saturation, a broad range of viewing angles and depths, and shape/colour deformations of traffic signs all affect the system's dependability. The proposed architecture is divided into three sections.
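Before turning to those three sections, the k-means clustering listed among the classical segmentation methods above (and the first stage named in the title) can be sketched. The following is a minimal, illustrative NumPy implementation of Lloyd's algorithm clustering pixel colour vectors; the function name and toy data are hypothetical, not the authors' implementation:

```python
import numpy as np

def kmeans_segment(pixels, k=2, iters=20, seed=0):
    """Cluster pixel colour vectors into k groups (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    # initialise centres from k randomly chosen pixels
    centers = pixels[rng.choice(len(pixels), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each pixel to its nearest centre (Euclidean distance in colour space)
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute each centre as the mean of its assigned pixels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers

# toy "image": two dark background pixels vs two bright sign-coloured pixels
img = np.array([[10, 10, 10], [12, 9, 11], [250, 20, 20], [245, 25, 18]])
labels, centers = kmeans_segment(img, k=2)
```

For sign detection, such clustering can coarsely separate sign-coloured regions from background before the heavier Mask R-CNN stage is applied.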
The first is image pre-processing, which involves quantifying the dataset's input files, determining the input size for learning, and resizing the data for the learning phase. In the second phase, the recognition phase, a Convolutional Neural Network categorizes the symbol on the sign board. The third phase handles text-to-speech translation, delivering the sign recognized in the second phase in audio format.

Mask R-CNN extends Faster R-CNN by adding a branch that predicts a segmentation mask for each Region of Interest (RoI), alongside the existing branches for classification and bounding-box regression. The mask branch is a small FCN applied to each RoI that predicts a segmentation mask pixel by pixel. Mask R-CNN is easy to develop and train thanks to the Faster R-CNN framework, which accommodates a variety of flexible architectural designs. Furthermore, the mask branch adds only a minor computational burden, allowing a fast system and rapid experimentation.

Two findings stand out. First, despite appearing to be a modest improvement, RoIAlign has a significant impact: it increases mask accuracy by 10% to 50%, with larger gains under stricter localization metrics. Second, decoupling mask and class prediction is critical: we predict a binary mask for each class independently, without competition between classes, and rely on the network's RoI classification branch to predict the category. In contrast, FCNs typically perform per-pixel multi-class classification, which couples segmentation and classification and, according to our findings, performs poorly for segmentation.

The goal of object extraction is to extract the Region of Interest of a given image based on the identified position of an object of interest.
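The decoupled mask prediction described above can be illustrated with a small NumPy sketch. Each class channel of the mask-branch output gets an independent sigmoid (no softmax competition between classes), and the class chosen by the RoI classification branch selects which channel becomes the final binary mask. The function name, shapes, and toy logits are illustrative assumptions, not the authors' code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decoupled_mask(mask_logits, class_id, threshold=0.5):
    """mask_logits: (num_classes, H, W) logits from the mask branch for one RoI.

    Each class is thresholded through its own sigmoid, independently of the
    other classes; class_id (from the RoI classification branch) selects
    which channel is emitted as the final binary mask."""
    probs = sigmoid(mask_logits[class_id])
    return probs > threshold

# toy RoI with 2 classes and a 2x2 mask: class 0 "owns" the left column,
# class 1 the right column
logits = np.array([[[3.0, -3.0], [3.0, -3.0]],
                   [[-3.0, 3.0], [-3.0, 3.0]]])
mask = decoupled_mask(logits, class_id=0)
```

A per-pixel softmax across the class axis would instead force the channels to compete, coupling classification to segmentation, which is exactly what the decoupled design avoids.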
Object extraction, a common problem in image processing and computer vision, has been widely employed in applications such as real-time monitoring, robot navigation, and target search. Despite significant recent advances in object extraction, many issues remain unresolved. For example, owing to interference from the diverse viewpoints and settings under which a picture may be acquired, it is currently difficult to guarantee the reliability and robustness of an object model obtained from an image.
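In its simplest form, extracting the Region of Interest from an identified position is a crop by the detected bounding box. The following minimal NumPy sketch (function name and box convention are assumptions for illustration) shows the operation applied to a single-channel image:

```python
import numpy as np

def extract_roi(image, box):
    """Crop the Region of Interest given a (x1, y1, x2, y2) bounding box.

    Coordinates follow the common convention: x is the column index,
    y is the row index, and (x2, y2) is exclusive."""
    x1, y1, x2, y2 = box
    return image[y1:y2, x1:x2]

# toy 6x6 single-channel image with values 0..35
img = np.arange(36).reshape(6, 6)
roi = extract_roi(img, (1, 2, 4, 5))  # 3x3 patch from rows 2-4, cols 1-3
```

In the proposed pipeline, the crop produced this way from the detection stage is what the CNN recognition stage classifies.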

© 2021, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal

