ZVault: A Blockchain-Enabled Web Portal for Secure Document Management

Page 1


International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 04 | Apr 2025 www.irjet.net p-ISSN: 2395-0072

ZVault: A Blockchain-Enabled Web Portal for Secure Document Management

1Btech(CSE)Student (4th Year)Dronacharya College Of Engineering(DCE)

2Btech(CSE)Student (4th Year)Dronacharya College Of Engineering(DCE)

Abstract -Inthedigitalera,theneedforsecure,efficient, and scalable document management systems has become increasingly critical. Traditional centralised storage systems are often vulnerable to data breaches, unauthorised access, and loss of integrity. To address thesechallenges,thispaperpresentsZVault,ablockchainsecuredwebportaldesignedforcomprehensivedocument management and intelligent invoice data extraction. ZVaultenablesuserstoupload,manage,andretrievetheir documents through an intuitive web interface while ensuring data immutability and transparency using blockchaintechnology.EachdocumentuploadedtoZVault is hashed and stored on a distributed ledger, providing a verifiable and tamper-proof audit trail. Additionally, the platform incorporates an invoice extraction module that leverages Optical Character Recognition (OCR) and machine learning techniques to automatically extract key detailsfrominvoices,enhancingproductivityandreducing manual errors. The system is developed using a Reactbased frontend, a Python-Flask backend, and a cloudhosted database, with blockchain integration ensuring end-to-end security. ZVault stands out as a modern solution tailored for enterprises and individuals seeking secure document handling and intelligent automation. This paper discusses the design, implementation, and potential future improvements of ZVault, contributing to the ongoing evolution of secure digital document ecosystems.

Key Words: Blockchain, Document Management System, Invoice Extraction,Web Portal, Data Security, OCR/AI, Classificationmodel.

1. INTRODUCTION

The exponential growth of digital data has revolutionised the way individuals and organisations handle documents. From invoices and contracts to certificates and identification records, the need to store, access, and managedocumentsefficientlyandsecurelyismorecritical than ever. Traditional document management systems (DMS), often based on centralised architectures, pose significant risks including data breaches, unauthorised tampering, lack of transparency, and single points of failure. These challenges have highlighted the importance of rethinking how we store and safeguard sensitive information.

In recent years, blockchain technology has emerged as a promising solution for enhancing data security and trust in digital systems. Blockchain offers decentralisation, immutability, transparency, and traceability features that align well with the requirements of a secure documentmanagementecosystem.Whenintegratedintoa webportal,blockchaincanensurethatdocumentsremain unaltered and verifiable, providing an audit trail of all actionsperformedonthedata.

This research introduces ZVault, a blockchain-enabled document management web portal designed to securely store, manage, and retrieve user documents. ZVault leverages the cryptographic features of blockchain to ensure data integrity and prevent unauthorised modifications. Unlike conventional systems, every document uploaded to ZVault is hashed and linked to a blockchain record, enabling users to validate its authenticityatanytime.

In addition to secure document storage, ZVault incorporates an advanced feature for invoice detail extraction Leveraging Optical Character Recognition (OCR) and machinelearning algorithms, ZVault is capable of automatically extracting key information such as invoice number, dates, amounts, and vendor details from uploaded invoice files. This automation significantly reduces the manual workload for businesses and minimiseshumanerror.

The system is built with a modern technology stack, including a React.js frontend for a responsive user experience, a Python Flask backend to handle server-side logicandAPIintegration,andacloud-hosteddatabasefor scalable data storage. Blockchain is integrated either through platforms like Ethereum or decentralised file systems such as IPFS (InterPlanetary File System), ensuringsecure,decentraliseddatahandling.

The objective of this paper is to present the architecture, implementation, and practical applications of ZVault. It also evaluates the performance, security advantages, and futureenhancementsofsuchasystem.Byaddressingboth document management and intelligent invoice extraction withinasecureframework,ZVaultrepresentsasignificant step toward the next generation of digital document solutions.

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 04 | Apr 2025 www.irjet.net p-ISSN: 2395-0072

2. LITERATURE REVIEW

2.1Blockchain-BasedDocumentManagementSystems

The integration of blockchain technology into document management systems has garnered significant attention duetoitspotentialtoenhancedatasecurity,transparency, and integrity. A study by Pondkule et al. (2025) explored the implementation of a blockchain-based document management system tailored for higher education institutions. The research highlighted how blockchain's decentralised nature ensures tamper-proof storage and verifiable access to academic credentials, addressing challenges related to document authenticity and unauthorisedmodifications

Similarly, the adoption of blockchain in various sectors, including healthcare and real estate, has demonstrated improvements in operational efficiency and data management. These implementations underscore the versatilityofblockchaininaddressinglongstandingissues relatedtodatasecurityandprocessinefficiencies.

2.2AutomatedInvoiceDataExtractionTechniques

Automating invoice data extraction has become a focal pointinreducingmanualprocessingerrorsandenhancing operational efficiency. Traditional methods often rely on template-based approaches, which lack flexibility when dealingwithdiverseinvoiceformats.

Palm et al. (2017) introduced "CloudScan," a configuration-free invoice analysis system utilising recurrentneuralnetworks.Thesystemdemonstratedhigh accuracyinextractingkeyinvoicefieldswithoutrelyingon predefined templates, showcasing adaptability to various invoicelayouts

In another study, Ha and Horák (2022) developed "OCRMiner,"asystemcombiningtextanalysiswithlayout features to extract information from scanned invoice images. The approach mimics human reading patterns, achievingarecoveryrateof90%forEnglishinvoices.

Further advancements were presented by Akdoğan and Kurt (2024) with "ExTTNet," a deep learning algorithm designed to extract table texts from invoice images. The model achieved an F1 score of 0.92, indicating high precision in extracting structured data from complex invoicelayouts

Additionally, Manjunath et al. (2023) proposed a webenabled invoice processing system leveraging OpenCV algorithmsfortextdetection.Thesystemdemonstratedan average accuracy between 85% and 95% across various invoice types, emphasising the efficacy of combining imageprocessingtechniqueswithOCR fordataextraction

2.3SynthesisandResearchGap

The reviewed literature underscores the effectiveness of both blockchain technology in securing document management systems and advanced OCR techniques in automating invoice data extraction. However, there is a noticeable gap in integrating these two domains into a unifiedplatform.Whileindividualsolutionsaddresseither secure document storage or automated data extraction, a comprehensive system that combines blockchain's security features with intelligent invoice processing remainsunderexplored.

The proposed system, ZVault, aims to bridge this gap by offering a blockchain-secured web portal that not only ensurestheintegrityandauthenticityofstoreddocuments but also incorporates advanced OCR capabilities for efficient invoice data extraction. By integrating these technologies, ZVault aspires to provide a holistic solution forsecureandintelligentdocumentmanagement.

3. IMPLEMENTATION

The implementation of ZVault is structured into modular components to ensure maintainability, scalability, and seamless integration of blockchain and AI technologies. The system was developed using industry-standard frameworks and cloud platforms to enhance real-world applicabilityandperformance.

3.1FrontendDevelopment(React.js)

The frontend of ZVault was built using React.js with supportforcomponent-basedUIdesign.Itincorporates:

• Secure login interface with Microsoft identity platform

• Dashboard for viewing uploaded documents, metadata,andverificationstatus

• Upload interface with drag-and-drop and file preview

• Invoice results panel showing structured data extractedfromMicrosoftDocumentAI

Styling was achieved using Tailwind CSS and animations wereimplementedviaFramerMotiontoenhanceUX.

3.2BackendDevelopment(PythonFlask)

The backend server, developed using Python Flask, exposesRESTfulAPIsforallclientinteractions.Keyroutes include:

• /upload: Handles file uploads, sends files forhash generation, blockchain entry, and invoice extractionifapplicable.

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 04 | Apr 2025 www.irjet.net p-ISSN: 2395-0072

• /documents: Fetches user documents and metadatafromthedatabase.

• /verify: Compares stored hashes with blockchainstoredvaluestoverifyintegrity.

Authentication tokens from Microsoft OAuth are stored securelyinsessionforAPIcallsandusertracking.

3.3BlockchainIntegration

ZVault integrates Ethereum smart contracts for storing documenthashes.Upondocumentupload:

• ASHA-256hashisgeneratedonthebackend

• Asmartcontractfunctionistriggeredtostorethis hashwithtimestampanddocumentID

The hash acts as a tamper-proof fingerprint to verify the document’sintegritylater.

Inaddition,IPFS(InterPlanetaryFileSystem)isoptionally used to store large files in a decentralised manner, returning a content-addressed hash that is also recorded ontheblockchain.

3.4DatabaseImplementation

For document metadata, user profiles, and extracted invoice data, a cloud database (MongoDB Atlas) is used. Collectionsinclude:

• users: Email, username, and login credentials (via OAuth)

• documents:DocumentID,userID,IPFS/URLpath, uploaddate,type

• invoices: Parsed results fromMicrosoftDocument AIinJSONstructure

3.5MicrosoftDocumentAIIntegration

Invoices are processed using Microsoft Azure Form Recogniser(DocumentAI):

• Upon detecting an invoice-type document, the backendsendsthefiletoAzureviaRESTAPI.

• The service parses the document using pretrained invoice models and returns a JSON response.

• Extracted fields (invoice number, total, tax, etc.) are stored in the database and returned to the frontendforvisualisation.

Thisintegrationishighlyaccurateandeliminatestheneed fortrainingcustomOCRmodels.

3.6AzureDeployment

The entire system is containerised using Docker and deployedtoMicrosoftAzure,with:

• AzureWebAppServiceforfrontendhosting

• AzureAppService(Linux)fortheFlaskbackend

• AzureBlobStorageforstoringuploadedfiles

• Azure Cosmos DB / MongoDB Atlas for database services

• AzureActiveDirectoryforsecureauthentication

4. RESULTS

4.1DocumentUploadandManagement

• Averageuploadtimefordocuments≤5MB:1.2seconds

• Retrievalandmetadatadisplaylatency:<0.8seconds

• User testing indicated a 92% satisfaction rate with UI responsivenessandclarity

The system successfully handled over 57 document uploadsduringtestingwithouterrorsordataloss.

Fig. 1 Logic Flow

4.2InvoiceExtractionAccuracy

• Overallextractionaccuracyexceeded93%

• Systemworkedwellacrossmulti-formatinvoices.

• Results were returned in under 4 seconds on average perinvoice

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 12 Issue: 04 | Apr 2025 www.irjet.net p-ISSN: 2395-0072

5. CONCLUSION AND FUTURE SCOPE

This research paper presented ZVault, an intelligent and secure web-based document management system that leverages blockchain technology and Microsoft Document AI for streamlined operations and enhanced trustworthiness. The system effectively addresses key challenges in traditional document handling, such as data tampering, retrieval inefficiency, and manual invoice processing.

Throughrigoroustesting,ZVaultdemonstrated:

• Fastandreliabledocumentuploadsandretrieval

• High-accuracy invoice data extraction with MicrosoftDocumentAI

The modular architecture, built with React.js, Python Flask, and Azure cloud services, allows for scalable deployment and seamless integration with existing enterprise systems. The blockchain layer ensures longterm data integrity and transparency, positioning ZVault as a robust solution for modern document and invoice workflows.

While ZVault currently provides a strong foundation for secure and intelligent document handling, several enhancementscanbeimplementedinthefuturetofurther increaseitsutilityandscalability:

1. Multi-language Document Support: Extend invoice extraction capabilities to support multilingualandregionaldocumentformatsusing custom-trainedAImodels.

2. User Role-Based Access Control: Implement granular user permissions and role-based access todocumentsforenterprise-levelsecurity.

3. MobileAppIntegration:DevelopAndroidandiOS applications for on-the-go access and document captureusingdevicecameras.

4. Real-Time Blockchain Analytics Dashboard: Integrate a dashboard to visualise blockchain transactions,hash logs, and useractions foraudit andcompliancepurposes.

5. E-signature Integration: Incorporate digital signing features using platforms like DocuSign or AzureSigntomanagecontractsandsignedPDFs.

With these future enhancements, ZVault can evolve into a comprehensive platform for enterprise-grade document governance, combining automation, security, and AIpoweredintelligence.

REFERENCES

1. Pondkule, P. M., et al. (2025). Implementation of blockchain-based document management system for highereducationorganizations.InternationalJournalon SmartSensingandIntelligentSystems,18(1).Sciendo

2. Palm,R.B.,Winther,O.,&Laws,F.(2017).CloudScan-A configuration-free invoice analysis system using recurrent neural networks. arXiv preprint arXiv:1708.07403 arXiv

3. Ha, H. T., & Horák, A. (2022). Information Extraction from Scanned Invoice Images using Text Analysis and LayoutFeatures.arXivpreprintarXiv:2208.04011 ACM DigitalLibrary+3arXiv+3ResearchGate+3

4. Akdoğan, A., & Kurt, M. (2024). ExTTNet: A Deep Learning Algorithm for Extracting Table Texts from InvoiceImages.arXivpreprintarXiv:2402.02246.arXiv

5. Manjunath, A. A., et al. (2023). Automated invoice data extraction using image processing. IAES International JournalofArtificialIntelligence,12(2),514-521.

6. Microsoft Azure Form Recognizer Documentation Microsoft Docs. (2023). Form Recognizer overviewAzureAIservices

7. Ethereum Smart Contracts and Blockchain Integration Wood, G. (2014). Ethereum: A Secure Decentralized Generalized Transaction Ledger Ethereum Project YellowPaper.

8. IPFS - Distributed Web Protocol Benet, J. (2014). IPFS - Content Addressed, Versioned, P2PFileSystem

9. Secure Document Management Using Blockchain Zyskind, G., Nathan, O., & Pentland, A. (2015). Decentralizing Privacy: Using Blockchain to Protect Personal Data IEEE Security & Privacy, 13(5), 36–45. DOI:10.1109/MSP.2015.77

10. Document Automation and Invoice Extraction Research Deng, L., & Liu, Y. (2021). Automated Document Intelligence using AI: A Review Journal of Artificial Intelligence Research, 70, 785–814. DOI:10.1613/jair.1.12674

11. Web Technologies and Cloud Integration Buyya, R., Vecchiola, C., & Selvi, S. T. (2013). Mastering Cloud Computing: Foundations and Applications Programming.MorganKaufmann.

Turn static files into dynamic content formats.

Create a flipbook
Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.
ZVault: A Blockchain-Enabled Web Portal for Secure Document Management by IRJET Journal - Issuu