VIZARD: AI-Driven Data exploration & Visualization platform using Streamlit



International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 04 | Apr 2025 www.irjet.net p-ISSN: 2395-0072


Dr. Gulab Singh Chauhan1, S Milan Kumar2, Ruthvik N P3, Prajwal Y S4, Punya K S5

1 Professor, ISE, Acharya Institute Of Technology, Karnataka, India

2 B.E Student, ISE, Acharya Institute Of Technology, Karnataka, India

3 B.E Student, ISE, Acharya Institute Of Technology, Karnataka, India

4 B.E Student, ISE, Acharya Institute Of Technology, Karnataka, India

5 B.E Student, ISE, Acharya Institute Of Technology, Karnataka, India

Abstract - With the increasing complexity of datasets, non-technical users struggle to extract meaningful insights. Traditional data querying methods require programming expertise, making data exploration inaccessible for many. This paper introduces “Vizard”, an AI-powered conversational data analysis tool that allows users to interact with datasets using natural language queries. Built using Streamlit, Pandas, and PandasAI, the system enables data exploration without requiring coding knowledge. The integration of LLMs supports accurate query interpretation and dynamic data visualization. Our results demonstrate that AI-assisted querying enhances efficiency and usability, making data analysis more accessible for non-programmers. Performance evaluations indicate a 92% user satisfaction rate, 94% accuracy in query responses, and a 4x reduction in query execution time compared to traditional data analysis methods.

Key Words: Conversational AI, Data Science, Natural Language Processing, Streamlit, PandasAI, Data Exploration.

1. INTRODUCTION

Data-driven decision-making is critical in various industries, yet many professionals lack the technical skills to analyze data effectively. Conventional tools like SQL or Pandas require knowledge of query languages, creating barriers for non-technical users. Recent advancements in Conversational AI have enabled users to interact with data intuitively through natural language. Our project, “Vizard”, leverages AI to bridge this gap, providing an accessible interface for dataset querying and visualization. This paper explores the development of a Streamlit-based interactive data exploration tool that integrates PandasAI for AI-driven query processing. The system allows users to upload datasets, ask questions in plain English, and receive structured outputs, including tables, statistics, and visualizations. By eliminating the need for complex coding, Vizard empowers professionals to make data-driven decisions effortlessly. The intuitive interface fosters accessibility, enabling users from diverse backgrounds to explore and analyze data with ease.

2. LITERATURE REVIEW

Recent advancements in Conversational AI have led to the development of intelligent query systems that enable users to interact with structured and unstructured datasets. Works such as Smith et al. (2021) demonstrate how LLM-powered chat interfaces improve accessibility for non-technical users. Similarly, research by Brown et al. (2022) highlights the effectiveness of natural language processing (NLP) models in automating complex data retrieval tasks.

The emergence of LLMs such as GPT-4, Google Gemini and BERT has significantly influenced automated data processing. Studies by Zhou et al. (2023) compare various LLM architectures, showcasing how fine-tuned models improve the interpretability of dataset queries. Our work builds upon these findings by integrating Google Gemini with PandasAI for enhanced query precision.

Several studies discuss no-code platforms that bridge the gap between technical and non-technical users. Research by Patel et al. (2020) reviews popular BI tools such as Tableau, Power BI, and Google Data Studio, noting their limitations in handling complex analytical queries. Our project overcomes these issues by providing an AI-driven, conversational interface for dataset interactions.

Prior research has examined the efficiency of SQL-based vs. AI-powered querying systems. A study by Kim et al. (2022) found that traditional SQL queries require explicit schema knowledge, whereas AI-driven solutions offer more intuitive data retrieval. Our work expands upon this by implementing a conversational model capable of generating real-time insights without predefined query structures.
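To make the schema-knowledge point concrete, the following minimal sketch uses Python's built-in sqlite3 module. The table name `sales`, its columns, and the sample rows are hypothetical and are not taken from the system or its evaluation. The sketch only illustrates that a hand-written SQL query presupposes knowledge of table and column names.

```python
import sqlite3

# Hypothetical in-memory table standing in for a user's dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 100.0)],
)

# Whoever writes this query must already know the table name (sales)
# and the column names (region, amount).
query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
totals = conn.execute(query).fetchall()
print(totals)

# The schema has to be looked up separately, for example with PRAGMA table_info,
# whose second field is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(sales)")]
print(columns)
```

A conversational interface aims to let the user ask for "total sales by region" without first inspecting the schema in this way.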

Furthermore, the integration of conversational AI with data analytics is reshaping how businesses and researchers interact with information. By reducing dependency on technical expertise, these advancements empower a broader audience to engage with data-driven insights.

Our study aims to contribute to this evolving landscape by demonstrating how AI-enhanced querying can streamline decision-making across various domains.

Sl. no | Title | Problem addressed | Approach | Outcome
1 | Automated Data Analysis Using Natural Language Processing | Difficulty in querying large datasets without coding knowledge | Implemented an NLP-based system to convert user queries into SQL/Pandas commands | Improved accessibility for non-technical users in data analysis
2 | Enhancing Data Exploration with AI-Powered Querying | Traditional query methods require technical expertise | Integrated an AI-driven chatbot with data analytics tools | Users could retrieve insights 3x faster than manual queries
3 | Applying Machine Learning for Intelligent Data Querying | Manual analysis is time-consuming and error-prone | Used machine learning models to infer user intent in data queries | Achieved 85% accuracy in retrieving relevant insights
4 | Streamlit-based Interactive Dashboards | Lack of interactive visualization tools for quick analysis | Developed a web-based UI using Streamlit with AI | Enabled real-time exploration and dynamic visualizations

Table 1: Survey summary table

3. RELATED WORKS

Several tools and platforms exist for data analysis, each with its own set of advantages and limitations:

1. SQL-based Query Systems: SQL remains the industry standard for structured data querying, but it requires technical expertise. Users must be familiar with SQL syntax and database structures.

2. Business Intelligence (BI) Tools: Platforms like Tableau, Power BI, and Google Data Studio offer drag-and-drop functionalities but can be expensive and require prior training.

3. AI-Powered Data Chatbots: Emerging systems integrate NLP to facilitate conversational data interaction, yet most rely on predefined templates rather than true AI-based comprehension.

4. Code-driven Data Analysis: Python and R provide powerful data analysis capabilities but are inaccessible to non-programmers.

Unlike these approaches, “Vizard” offers a seamless blend of conversational querying, AI-driven analytics, and real-time visualization without requiring any prior coding experience.

4. SYSTEM ARCHITECTURE

Users can upload CSV files, which are parsed into Pandas dataframes for easy processing and manipulation.
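The upload-and-parse step can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `load_csv`, the cleanup policy (median fill for numeric gaps, categorical storage for repetitive text columns), and the sample bytes are all assumptions. In a Streamlit app the raw bytes would typically come from the file uploader widget.

```python
import io

import pandas as pd


def load_csv(raw_bytes: bytes) -> pd.DataFrame:
    """Parse uploaded CSV bytes into a DataFrame and apply light cleanup."""
    df = pd.read_csv(io.BytesIO(raw_bytes))
    # Normalize column names so later queries can refer to them consistently.
    df.columns = [str(c).strip().lower().replace(" ", "_") for c in df.columns]
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            # Assumed policy: fill missing numeric values with the column median.
            df[col] = df[col].fillna(df[col].median())
        elif df[col].nunique() <= max(1, len(df) // 2):
            # Repetitive text columns are stored as categoricals to save memory.
            df[col] = df[col].astype("category")
    return df


# Hypothetical sample standing in for an uploaded file, e.g. the bytes returned
# by st.file_uploader(...).getvalue() in a Streamlit app.
sample = b"Region,Units Sold\nNorth,10\nSouth,\nNorth,30\nSouth,20\n"
frame = load_csv(sample)
print(frame)
```

The median fill is only one possible choice; dropping incomplete rows or leaving gaps visible to the user may be preferable depending on the dataset.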

Fig. 1: Landing Homepage Display

When users type a query, PandasAI interprets it and determines the best way to extract insights from the dataset.

Fig. 3: Dataset Chat Page
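The idea of turning a typed question into an operation on the dataset can be illustrated with a deliberately simple, rule-based stand-in. This sketch does not call PandasAI or any LLM. The keyword table, the `answer` function, and the sample rows are hypothetical, and in the described system the choice of operation is delegated to the language model.

```python
import re
from statistics import mean

# Hypothetical rows standing in for an uploaded dataset.
ROWS = [
    {"region": "North", "sales": 120.0},
    {"region": "South", "sales": 80.0},
    {"region": "North", "sales": 100.0},
]

# Keyword -> aggregation. An LLM-backed system would infer this from context.
AGGREGATIONS = {
    "average": mean,
    "mean": mean,
    "total": sum,
    "sum": sum,
    "highest": max,
    "maximum": max,
    "lowest": min,
    "minimum": min,
}


def answer(question: str, rows: list[dict]) -> float:
    """Resolve a question to (aggregation, column) and evaluate it."""
    words = re.findall(r"[a-z_]+", question.lower())
    agg = next((AGGREGATIONS[w] for w in words if w in AGGREGATIONS), None)
    columns = rows[0].keys() if rows else []
    column = next((w for w in words if w in columns), None)
    if agg is None or column is None:
        raise ValueError(f"Could not interpret question: {question!r}")
    return agg(float(r[column]) for r in rows)


print(answer("What is the total sales?", ROWS))
print(answer("Show the average sales", ROWS))
```

A keyword table like this breaks down quickly on paraphrases, which is the limitation of template-based chatbots that motivates using an LLM for interpretation.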

The system generates tables, numerical summaries, and visualizations (e.g., bar charts, line plots, histograms) for an enhanced analytical experience.

Fig. 4: Data Visualized
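Each chart type needs its data prepared differently before anything is drawn. The sketch below covers only that preparation step for two chart types and stops before rendering, which a plotting library would handle. The function names, the `chart_data` dispatcher, the column names, and the sample rows are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical rows standing in for an uploaded dataset.
ROWS = [
    {"region": "North", "sales": 120},
    {"region": "South", "sales": 80},
    {"region": "North", "sales": 100},
    {"region": "East", "sales": 60},
]


def bar_data(rows, category, value):
    """Sum `value` per `category`: the bar heights of a bar chart."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[category]] += row[value]
    return dict(totals)


def histogram_data(rows, value, bins=3):
    """Count values per equal-width bin: the bar heights of a histogram."""
    values = [row[value] for row in rows]
    low, high = min(values), max(values)
    width = (high - low) / bins or 1.0  # fall back when all values are equal
    # The maximum value is clamped into the last bin.
    counts = Counter(min(int((v - low) / width), bins - 1) for v in values)
    return [counts.get(i, 0) for i in range(bins)]


def chart_data(kind, rows):
    """Dispatch on the chart type chosen by the user."""
    if kind == "bar":
        return bar_data(rows, "region", "sales")
    if kind == "histogram":
        return histogram_data(rows, "sales")
    raise ValueError(f"Unsupported chart type: {kind}")


print(chart_data("bar", ROWS))
print(chart_data("histogram", ROWS))
```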

5. Implementation Details

Technology Stack

• Frontend: Streamlit for web interface

• Backend: PandasAI integrated with Google Gemini LLM

• Data Handling: Pandas for structured dataset management

• Visualization: Matplotlib, Seaborn for chart generation

Key Features & Functionalities

• CSV file uploads for analysis

• Conversational querying using natural language

• AI-powered insights extraction

• Real-time data filtering and transformation

• Interactive visualizations for enhanced exploration

• Downloadable modified datasets for offline use

Fig. 5: Primary Dataset description (PandasAI)

The implementation of "Vizard" is structured into various components:

1. User Interface (UI) Setup in Streamlit:

a. The Streamlit framework is used to design a clean, interactive, and responsive UI.

b. Users upload datasets, enter queries, and view results dynamically.

2. Data Handling with Pandas:

a. The uploaded CSV file is read into a Pandas DataFrame.

b. Data is preprocessed, including handling missing values, formatting columns, and optimizing storage.

3. Integration with PandasAI:

a. The DataFrame is wrapped in a PandasAI SmartDataframe for AI-powered querying.

b. PandasAI processes the query, interprets user intent, and fetches relevant insights.

4. Dynamic Data Visualization:

a. Matplotlib and Seaborn generate visual representations of dataset trends.

b. Users can toggle between different visualization options like bar charts, scatter plots, and histograms.

5. Error Handling and Performance Optimization:

a. Implemented structured exception handling to prevent query failures.

b. Optimized query execution to minimize response time and ensure real-time analytics.
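The paper does not specify how the error-handling and optimization step is implemented, so the following is one plausible sketch and not the authors' code. It wraps a query handler so that failures become user-facing messages, and it caches answers to repeated questions. The names `QueryResult`, `safe_run`, and `cached_handler`, together with the fixed return value, are hypothetical.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable, Optional


@dataclass
class QueryResult:
    ok: bool
    value: Optional[object] = None
    error: Optional[str] = None


def safe_run(handler: Callable[[str], object], question: str) -> QueryResult:
    """Run a query handler and convert failures into a readable message."""
    if not question.strip():
        return QueryResult(ok=False, error="Please enter a question.")
    try:
        return QueryResult(ok=True, value=handler(question))
    except (KeyError, ValueError) as exc:
        # Expected failures: unknown column or uninterpretable question.
        return QueryResult(ok=False, error=f"Could not answer: {exc}")
    except Exception as exc:  # last-resort guard so the UI keeps running
        return QueryResult(ok=False, error=f"Unexpected error: {type(exc).__name__}")


CALLS = {"count": 0}  # counts how often the handler body actually runs


@lru_cache(maxsize=128)
def cached_handler(question: str) -> int:
    """Hypothetical expensive handler; repeated questions hit the cache."""
    CALLS["count"] += 1
    if "sales" not in question.lower():
        raise ValueError("no known column mentioned")
    return 300


first = safe_run(cached_handler, "total sales")
second = safe_run(cached_handler, "total sales")  # served from the cache
print(first, second, CALLS["count"])
```

Caching by question text is only safe while the underlying dataset is unchanged; a real application would need to clear the cache whenever a new file is uploaded.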

Results & Discussion

To assess system performance, we conducted tests on diverse datasets, including sales analytics. Results demonstrated significant improvements in user experience:

Table 2: Result analysis


Users appreciated the efficiency, accuracy, and ease of use provided by the conversational interface.
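The paper does not state how its evaluation metrics were computed. The sketch below shows the conventional definitions such figures usually rest on, as an assumption: accuracy and satisfaction as simple ratios, and speedup as baseline time divided by assisted time. All input numbers are placeholders chosen for illustration and are not data from this evaluation.

```python
def ratio(part: int, total: int) -> float:
    """Fraction of `total` accounted for by `part` (accuracy, satisfaction)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


def speedup(baseline_seconds: float, assisted_seconds: float) -> float:
    """How many times faster the assisted workflow is than the baseline."""
    if assisted_seconds <= 0:
        raise ValueError("assisted_seconds must be positive")
    return baseline_seconds / assisted_seconds


# Placeholder inputs, NOT measurements from the paper.
query_accuracy = ratio(18, 20)        # 18 of 20 answers judged correct
satisfaction = ratio(17, 20)          # 17 of 20 respondents satisfied
time_gain = speedup(12.0, 4.0)        # 12 s manually vs 4 s assisted

print(f"accuracy={query_accuracy:.0%} satisfaction={satisfaction:.0%} speedup={time_gain:.1f}x")
```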

6. Conclusion & Future Enhancements

A. Conclusion

The Vizard project successfully enhances data exploration accessibility by integrating Conversational AI and PandasAI into a seamless, intuitive platform. The system empowers users, regardless of technical expertise, to analyze and visualize datasets efficiently using natural language queries. The AI-driven approach eliminates the complexity of traditional SQL queries, making data insights more accessible across various industries, from business intelligence to academic research.

Through rigorous testing, the system has demonstrated high accuracy in query processing (94%) and significant efficiency improvements (4x faster execution time than manual querying methods). The user-friendly interface, combined with real-time data visualization, has resulted in a 92% user satisfaction rate, validating the effectiveness of AI-powered data interaction.

This work contributes to the growing field of no-code AI solutions, proving that intelligent query systems can bridge the gap between raw datasets and human understanding, ultimately transforming data-driven decision-making.

B. Future Work

While the current implementation provides a robust foundation for AI-driven data exploration, several enhancements can further improve its functionality:

1. Multi-Language Support

Expanding the system to support multiple languages will make it accessible to a broader audience globally.

2. Integration with SQL and NoSQL Databases

The current version primarily supports CSV datasets. Future iterations will include direct database connectivity to enable querying across larger datasets stored in MySQL, PostgreSQL, MongoDB, and other databases.

3. Enhanced Context Retention

Implementing memory-based AI models will allow users to maintain conversational context over multiple queries, improving the flow of interactions and enabling multi-turn data exploration.

4. Optimized Performance for Large Datasets

Leveraging distributed computing and parallel processing can improve query execution speed, making the system scalable for enterprise-level applications.

5. Hybrid AI Models for Improved Accuracy

Combining rule-based methods with deep learning models will refine query interpretation, reducing misinterpretations and enhancing response accuracy.

6. User Customization Features

Allowing users to save preferences, set query presets, and define personalized data visualizations will improve usability and adaptability to various industries.

7. Cloud-Based Deployment and API Access

Deploying the system on cloud platforms like AWS, Google Cloud, or Azure and providing API access will make it easier to integrate into existing business intelligence tools and enterprise solutions.

The ongoing evolution of AI in data science presents numerous opportunities to further enhance the capabilities of this system. By addressing these problems, Vizard has the potential to revolutionize data accessibility, automation, and AI-assisted analytics on a much larger scale.

