Blind Assistance Using AI with Machine Learning


1 Assistant Professor, CIIRC, Jyothy Institute of Technology, Bangalore, India

2 Dept. of C.S.E., Jyothy Institute of Technology, Bangalore, India

3 Dept. of C.S.E., Jyothy Institute of Technology, Bangalore, India

4 Dept. of C.S.E., Jyothy Institute of Technology, Bangalore, India

5 Dept. of C.S.E., Jyothy Institute of Technology, Bangalore, India

Abstract - Visually impaired individuals face numerous challenges in navigating their environment independently. This paper proposes an AI-based blind assistance system that combines computer vision techniques for object detection, text reading using Optical Character Recognition (OCR), and gender detection using Convolutional Neural Networks (CNNs) on a Raspberry Pi platform. The system is designed to provide real-time audio feedback for object recognition, text reading, and gender identification, enabling enhanced situational awareness. The proposed system aims to be a cost-effective and portable solution that improves the quality of life of visually impaired users.

Key Words - Blind assistance, machine learning, artificial intelligence, object detection, speech synthesis.

1. INTRODUCTION

Artificial intelligence (AI), particularly machine learning (ML) and deep learning, has become pivotal in image processing and computer vision. Object recognition, a task humans perform effortlessly, remains a significant challenge for automated systems. This capability is crucial for developing applications that enhance human life, such as assistive technologies for the visually impaired and systems for improving road safety.

Visually impaired individuals face numerous difficulties in daily life, from navigation to interpreting social cues like facial expressions. Technology can bridge this gap by providing tools that interpret visual information, such as detecting objects, recognizing faces and their emotions, identifying gender, and reading text aloud. Similarly, road safety is a major concern globally, with accidents often caused by undetected obstacles, poor visibility, or driver inattention. Systems capable of detecting obstacles like other vehicles, pedestrians, animals, or road hazards in real time can significantly reduce accidents and improve driver awareness.

This paper reviews and synthesizes approaches from five studies focusing on leveraging ML and deep learning for gender classification and object detection. These studies explore various algorithms, including SVM and CNNs implemented on platforms like the Raspberry Pi, to create practical solutions for visually impaired assistance and accident prevention. The goal is to present a consolidated view of the methodologies, results, and challenges in these domains.

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056, p-ISSN: 2395-0072, Impact Factor value: 8.315

2. LITERATURE REVIEW / RELATED WORK

This section reviews significant studies on gender classification, object detection, and customer journey analytics using AI techniques, providing the foundational background for the proposed system.

2.1 Human Gender Classification using Machine Learning

Mali and Patil [1] compared SVM, CNN, and LDA models for gender prediction using facial images, with SVM achieving the highest accuracy of 98%. Despite its effectiveness, the system struggled with occlusions and limited itself to binary classification.

2.2 AI-Based Assistance for the Visually Impaired

Safnaz et al. [2] proposed a CNN-based system integrated with OCR to aid visually impaired individuals through auditory feedback. The model performed well in real-time applications, although its emotion recognition accuracy was limited and hardware reliance affected scalability.

2.3 Deep Learning for Object Recognition

Erastus et al. [3] developed a CNN model using the CIFAR-10 dataset, achieving 98.5% accuracy with the Adam optimizer. While deep learning proved superior to traditional techniques, it faced challenges with overlapping objects and high computational costs.

2.4 Object Detection for Assistive Technology

Muhsin et al. [4] implemented an object detection system on a Raspberry Pi platform. The model provided real-time visual assistance but was constrained by object overlap issues and a fixed category set.

2.5 IoT-Driven Object Detection for Road Safety

Oza and Rathod [5] designed an LSTM-based object detection system combining IoT and ML to avoid road accidents. Their approach outperformed CNN in terms of training time and accuracy, though it required high-quality images and introduced model complexity.

2.6 Customer Segmentation using K-Means Clustering

Vijayalakshmi and Sridevi [6] utilized K-Means to segment customers based on transactional data. Their method offered actionable insights for targeted marketing, although its sensitivity to outliers and its cluster-shape assumptions were notable limitations.

2.7 Hybrid Clustering and Strategy Development

Gupta and Gupta [7] presented a multi-technique approach using K-Means, hierarchical clustering, and decision trees. This method enhanced customer segmentation but increased computational complexity and interpretability challenges.

2.8 Customer Journey Mapping with Data Science

Mishra and Sharma [8] proposed a framework using clustering and visualization tools like Sankey diagrams for mapping customer flows. While the approach helped identify behavior trends, it required clean data and could become visually cluttered at scale.

2.9 PCA and K-Means for Marketing Segmentation

Kaur and Kaur [9] applied PCA to reduce data dimensions before clustering, which improved performance and clarity. However, some relevant but low-variance features might be discarded during dimensionality reduction.

2.10 Journey Analytics for Customer Retention

Sujatha and Bhuvaneswari [10] used sequence mining and behavioral segmentation for churn prediction. Their research highlighted the value of journey analytics, though it demanded large datasets and high computational resources for real-time analysis.

3. Methodology / Proposed System

A. Working

The proposed system utilizes TensorFlow Lite models for performing real-time object detection and gender classification on edge devices such as the Raspberry Pi Zero 2 W. The methodology involves two core modules: one for detecting objects in the camera feed and another for identifying the gender of detected persons. The models are optimized for lightweight inference and minimal latency, making them suitable for embedded systems.

The major steps in the methodology include:

1. Video Stream Handling:

A live video feed is captured using the OpenCV VideoCapture interface. Frames are continuously read using a threaded approach to ensure real-time performance.
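The threaded capture described above can be sketched as a small wrapper class. This is an illustrative sketch, not the paper's code: the `VideoStream` name is an assumption, and the capture object is injected so that on the Pi it would wrap `cv2.VideoCapture(0)`.

```python
import threading

class VideoStream:
    """Reads frames on a background thread so the main loop always
    sees the most recent frame instead of a stale buffered one.
    (Illustrative sketch; class and method names are assumptions.)"""

    def __init__(self, capture):
        # `capture` is any object with read() -> (ok, frame) and release(),
        # e.g. cv2.VideoCapture(0) on the Raspberry Pi.
        self.capture = capture
        self.ok, self.frame = capture.read()   # prime with one frame
        self.stopped = False
        self.thread = threading.Thread(target=self._update, daemon=True)

    def start(self):
        self.thread.start()
        return self

    def _update(self):
        # Continuously overwrite self.frame with the latest capture.
        while not self.stopped:
            self.ok, self.frame = self.capture.read()

    def read(self):
        # Non-blocking: return whatever frame is newest right now.
        return self.frame

    def stop(self):
        self.stopped = True
        self.thread.join()
        self.capture.release()
```

On the device this would typically be used as `stream = VideoStream(cv2.VideoCapture(0)).start()`, with `stream.read()` called once per inference pass.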

2. Object Detection using MobileNetV2:

MobileNetV2 is an improved version of MobileNet, a convolutional neural network (CNN) optimized for speed and efficiency on mobile devices. It is:

- Lightweight and fast
- Built on depth-wise separable convolutions to reduce computation
- Able to maintain decent accuracy even on low-resource devices
- Ideal for real-time computer vision tasks on mobile/embedded systems
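A minimal sketch of driving such a detector through the TensorFlow Lite interpreter is shown below. It assumes an SSD-MobileNetV2 model exported with the conventional four output tensors (boxes, classes, scores, count); the output ordering can differ between exports, and the label list and model path are placeholders, not taken from the paper.

```python
import numpy as np

def filter_detections(classes, scores, labels, threshold=0.5):
    """Map raw class indices to label names, keeping only detections
    whose confidence meets the threshold."""
    return [(labels[int(c)], float(s))
            for c, s in zip(classes, scores) if s >= threshold]

def detect_objects(interpreter, frame_rgb, labels, threshold=0.5):
    """Run one inference pass on an RGB frame already resized to the
    model's input shape. (Sketch; assumes a quantized uint8 SSD model
    whose outputs are ordered boxes/classes/scores/count.)"""
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()
    tensor = np.expand_dims(frame_rgb, axis=0).astype(np.uint8)
    interpreter.set_tensor(inp['index'], tensor)
    interpreter.invoke()
    classes = interpreter.get_tensor(out[1]['index'])[0]
    scores = interpreter.get_tensor(out[2]['index'])[0]
    return filter_detections(classes, scores, labels, threshold)

# On-device setup would look roughly like:
#   from tflite_runtime.interpreter import Interpreter
#   interpreter = Interpreter(model_path="detect.tflite")  # placeholder path
#   interpreter.allocate_tensors()
```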

3. Gender Classification using CNN model:

When a person is detected, the bounding box coordinates are used to crop the face or upper-body region from the original frame. This cropped image is resized and normalized before being passed to a CNN-based gender classification model.

The CNN processes the image through multiple convolutional and pooling layers, followed by dense layers, to extract relevant features and predict gender. The model outputs probabilities for two classes: Male and Female.
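The crop/resize/normalize step and the final label decision can be sketched as below. This is illustrative: the 64x64 input size is an assumption, and the nearest-neighbor resize stands in for what real code would do with `cv2.resize`.

```python
import numpy as np

def preprocess_face(frame, box, size=64):
    """Crop the detected-person region, nearest-neighbor resize it to
    (size, size), and normalize pixel values to [0, 1] for the CNN.
    (Sketch; a real pipeline would use cv2.resize for interpolation.)"""
    x1, y1, x2, y2 = box
    crop = frame[y1:y2, x1:x2]
    h, w = crop.shape[:2]
    rows = np.arange(size) * h // size      # source row for each output row
    cols = np.arange(size) * w // size      # source column for each output column
    resized = crop[rows][:, cols]
    return resized.astype(np.float32) / 255.0

def gender_label(probs, labels=("Male", "Female")):
    """Pick the class with the highest predicted probability."""
    return labels[int(np.argmax(probs))]
```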

4. Audio Feedback:

The system uses the eSpeak tool to vocalize the label of the detected object or gender. A delay mechanism prevents repetitive announcements of the same object within a short time window.
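The delay mechanism is essentially a per-label cooldown. The sketch below assumes a 3-second window (the paper does not specify one) and invokes the `espeak` command line; the clock and speech function are injectable so the logic stays testable off-device.

```python
import subprocess
import time

class Announcer:
    """Speak a label aloud, suppressing repeats of the same label within
    a cooldown window. (Sketch; the 3-second default and the espeak
    command-line invocation are assumptions.)"""

    def __init__(self, cooldown=3.0, clock=time.monotonic, speak=None):
        self.cooldown = cooldown
        self.clock = clock
        self.speak = speak or self._espeak
        self.last = {}                      # label -> time last announced

    def _espeak(self, text):
        # eSpeak reads its argument aloud through the default audio output.
        subprocess.run(["espeak", text])

    def announce(self, label):
        now = self.clock()
        if now - self.last.get(label, float("-inf")) < self.cooldown:
            return False                    # still in cooldown: stay silent
        self.last[label] = now
        self.speak(label)
        return True
```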

Figure 1: Block Diagram

Output Delivery: The results are communicated to the user. In assistive systems, the output is converted to voice using TTS engines (e.g., eSpeak) and delivered through earphones or speakers. For road safety systems, visual or auditory alerts are used to warn of obstacles or hazards.


User interfaces in these systems are designed for simplicity. Examples include dual-button toggles to switch between AI-based object detection and OCR mode, as well as support for voice commands. This design ensures the technology remains accessible to users with varying technical abilities.
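The dual-button toggle can be modeled as a tiny mode state machine. The sketch below is an assumption about how such an interface might be structured; the GPIO pin number in the comment is hypothetical, not from the paper.

```python
class ModeSelector:
    """Tracks which assistance mode is active for a dual-button
    interface (object detection vs. OCR). Illustrative sketch."""

    MODES = ("object_detection", "ocr")

    def __init__(self):
        self.index = 0                      # start in object-detection mode

    @property
    def mode(self):
        return self.MODES[self.index]

    def toggle(self):
        # Advance to the next mode, wrapping around.
        self.index = (self.index + 1) % len(self.MODES)
        return self.mode

# On a Raspberry Pi, one might wire a physical button to toggle():
#   from gpiozero import Button
#   selector = ModeSelector()
#   Button(17).when_pressed = lambda: selector.toggle()   # pin 17 is hypothetical
```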

The hardware platform is the Raspberry Pi Zero 2 W, a compact and affordable single-board computer designed for lightweight and embedded computing tasks. It is powered by the RP3A0 system-in-package, which integrates a quad-core ARM Cortex-A53 processor running at 1 GHz, along with 512 MB of LPDDR2 RAM. Despite its small size, the board offers a range of connectivity and expansion options, including a mini-HDMI port, micro-USB ports for power and data, a camera serial interface (CSI) connector, and a 40-pin GPIO header. It also supports wireless communication through built-in 802.11 b/g/n Wi-Fi and Bluetooth 4.2. Storage is handled via a microSD card slot. Thanks to its compact design and efficient performance, the Raspberry Pi Zero 2 W is well suited for Internet of Things (IoT) applications, portable electronics, home automation systems, and real-time computer vision tasks using lightweight AI models like MobileNet.

Figure 2: Raspberry Pi Zero 2 W

The system also uses a Raspberry Pi Camera Module, an essential accessory designed to interface with the Raspberry Pi board. This camera, typically used for capturing high-quality images and video, connects to the Pi through the dedicated CSI (Camera Serial Interface) port using a flexible ribbon cable. It is compact and lightweight, making it ideal for embedded systems and real-time computer vision projects. The module is commonly used in applications like surveillance, object detection, facial recognition, and robotics. With support for HD video capture and integration with OpenCV, it becomes a vital tool for AI- and IoT-based implementations.

B. Algorithm

Input: Real-time video stream from a camera module

Output: Audio feedback describing detected objects

i. Start

ii. Initialize camera module (e.g., Raspberry Pi Camera)

iii. Capture image frame using OpenCV

iv. Preprocess image:

a. Resize image

b. Normalize pixel values

c. Convert to suitable format (e.g., RGB)

v. Detect objects using the trained CNN model

vi. For each detected object:

a. Classify object type

b. (Optional) Perform gender or category classification using an ML model (e.g., CNN, SVM)

vii. Generate text description of objects in the scene

viii. Convert text to speech using eSpeak

ix. Output audio to user via speaker or earphone

x. Repeat steps iii-ix for continuous real-time assistance

xi. End
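The steps above map naturally onto a per-frame processing loop. The sketch below is illustrative rather than the paper's implementation: the detector, gender classifier, and announcer are injected as plain callables/objects so the control flow of steps iii-ix can be shown without tying the loop to specific hardware.

```python
def assistance_loop(stream, detect, classify_gender, announcer, steps=None):
    """Run the capture -> detect -> classify -> announce cycle.
    `steps` bounds the number of frames processed (None = run forever).
    Illustrative sketch; all component names are assumptions."""
    frames = 0
    while steps is None or frames < steps:
        frame = stream.read()                       # iii. capture a frame
        if frame is None:
            break                                   # stream ended
        detections = detect(frame)                  # v. CNN object detection
        for label, score in detections:             # vi. per-object handling
            if label == "person":
                # vi-b. refine "person" into a gender label
                label = classify_gender(frame)
            announcer.announce(label)               # vii-ix. describe via TTS
        frames += 1
    return frames
```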

4. Experiments and Results

The systems evaluated in the reviewed papers utilized both public and custom datasets. Common datasets include CIFAR-10 for object recognition, FER2013 and Adience for facial attributes, and locally captured images for real-world testing.

Accuracy: SVM achieved 91-98% accuracy on specific tasks like gender detection. CNNs surpassed 90% accuracy for object and emotion classification, while YOLO achieved near real-time detection with high precision.

Figure 3: Raspberry Pi camera module


Chart 1: Comparison chart

Performance: Systems using OpenCV's DNN module demonstrated low-latency processing, suitable for Raspberry Pi deployments.

Advanced Classifiers: Hybrid architectures combining CNN with recurrent layers achieved higher accuracy and reduced training times compared to conventional CNNs or YOLO.

Chart 2: Comparison chart

Limitations included decreased performance in the presence of overlapping objects, variable lighting, and low-resolution input. Preprocessing strategies like MSDE helped alleviate some of these issues. The use of optimizers like Adam also accelerated training convergence compared to alternatives like SGD.


Figure 4: Object detection

Figure 5: Gender detection

5. Conclusion

The combined insights from five key studies highlight how Artificial Intelligence (AI) and Machine Learning (ML) are revolutionizing assistive technologies and safety systems in the real world. These technologies employ advanced deep learning models like Convolutional Neural Networks (CNNs) and real-time detection frameworks such as YOLO to address tasks including object recognition and gender and emotion classification. By integrating intelligent software with affordable hardware like the Raspberry Pi, researchers have shown that these innovations can be practically deployed both in assistive tools for the visually impaired and in smart safety systems for road environments.


For individuals with visual impairments, systems incorporating CNN-based facial recognition, Optical Character Recognition (OCR), and Text-to-Speech (TTS) capabilities serve as valuable tools to enhance environmental awareness. Several implementations demonstrate that compact and cost-effective devices using Raspberry Pi and camera modules can successfully recognize facial features and text, converting this information into speech to assist users in navigating their surroundings. One particularly user-friendly design features two buttons that allow easy toggling between AI-driven facial recognition and OCR reading, making the device practical for everyday use.

In road safety applications, real-time object detection through camera systems integrated with rapid processing tools has shown potential for reducing accidents. Techniques such as Mean Subtracted Difference Enhancement (MSDE) have improved detection accuracy, especially in difficult visual conditions like low light or object overlap. By leveraging tools like OpenCV and MobileNetV2 on embedded hardware, these systems deliver timely alerts without requiring powerful infrastructure, making them suitable for use in areas with limited resources.

Performance metrics from these projects underline the effectiveness of the technologies. CNN models trained on widely used datasets such as CIFAR-10, Adience, Multi-PIE, and FER2013 achieved accuracy rates above 90% in recognizing objects and facial features. MobileNetV2-based frameworks, when optimized with OpenCV's DNN module, delivered high precision in real-time object detection with minimal processing delays. Hybrid models, which combine CNNs with recurrent neural networks, demonstrated improvements in both training efficiency and prediction performance.

However, challenges still exist. System performance tends to drop with low-resolution images or complex scenes involving overlapping or small objects. Power consumption is another concern, particularly for mobile or portable devices. Real-time performance may also be affected by dynamic factors such as moving backgrounds or inconsistent lighting. Moreover, ethical issues, especially those surrounding user privacy in facial recognition, must be carefully addressed. Future research directions appear promising. The use of sensor fusion, combining inputs from cameras, ultrasonic sensors, or LiDAR, can enhance environmental understanding. Advances in edge computing may reduce reliance on cloud services, offering faster response times while safeguarding user privacy.


Systems that adapt to user behavior over time and provide personalized feedback may further improve acceptance and effectiveness.

Overall, the fusion of AI, ML, and embedded hardware is creating scalable and impactful solutions in assistive and safety technologies. The studies reviewed here exemplify how innovation in algorithms and real-world engineering can come together to produce tools that improve both accessibility and public safety, laying the groundwork for future developments in intelligent, user-centered computing systems.

REFERENCES

[1] Vaishnavi Y. Mali and Babasaheb G. Patil, "Human Gender Classification using Machine Learning," International Journal of Engineering Research & Technology (IJERT), Vol. 8, Issue 12, December 2019.

[2] Safnaz, Nayana Manju Jogi, Shreya Shetty, Prasad Rathode, and Deeksha K R, "Blind Assistance Device using AI," International Journal of Engineering Research & Technology (IJERT), NCCDS 2021 Conference Proceedings, Vol. 9, Issue 12.

[3] Ogunti Erastus, Ale Daniel, and Nwabueze Ifeoma Blessing, "Object Recognition Using Deep Learning," International Journal of Engineering Research & Technology (IJERT), Vol. 7, Issue 06, June 2018.

[4] Abdul Muhsin M, Farah F. Alkhalid, and Bashra Kadhim Oleiwi, "Online Blind Assistive System using Object Recognition," International Research Journal of Innovations in Engineering and Technology (IRJIET), Vol. 3, Issue 12, December 2019.

[5] Shrinath Oza and Sunil Rathod, "Object Detection using IoT and Machine Learning to Avoid Accident and Improve Road Safety."

[6] G. Vijayalakshmi and B. R. Sridevi, "A Study on Customer Segmentation using K-Means Clustering Algorithm," International Journal of Computer Sciences and Engineering, 2021.

[7] M. Gupta and G. Gupta, "Customer Segmentation and Strategy Development using Machine Learning and Data Mining Techniques," International Journal of Advanced Research in Computer Science, 2020.

[8] P. Mishra and A. Sharma, "Customer Journey Mapping with Data Science for Business Improvement," Journal of Business Analytics and Strategy, 2021.

[9] M. Kaur and J. Kaur, "Customer Segmentation Using PCA and K-Means Clustering for Enhanced Marketing," International Journal of Data Science and Analytics, 2022.

[10] S. L. Sujatha and G. Bhuvaneswari, "Enhanced Customer Retention through Journey Analytics," Journal of Data Analytics and Customer Management, 2023.
