
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
Thisaranie Kaluarachchi
University of Colombo School of Computing, Colombo 0700, Sri Lanka
Abstract - Internet usage has increased rapidly over the past decades. The World Wide Web (WWW) has emerged as the most important public communication portal for individuals, businesses and organizations. In recent years, web development has relied on website templates and Content Management Systems, which anyone can use to create a website by plugging in their own text and images. Alternatively, web designers create a design mock-up for a web page and give it to a developer to implement in code. This process is challenging, time-consuming and costly, since the design and implementation are carried out by separate teams. The overall outcome of such a design depends entirely on the web designer's design skills, and there are differences between the source code produced by two different developers implementing the same UI. What if there was a mechanism or system that could detect HTML elements in GUI images and generate source code automatically? If such a system or tool can generate source code automatically, web developers can focus on functionalities rather than wasting time on front-end development. Consequently, this paper presents an approach that automatically converts the GUI design of a website into HTML code using image processing and deep learning. This study employed an experimental approach to realize two scenarios, implemented as image processing and deep learning modules. The tag tree produced by the approach was compared to the original HTML tag tree, and the HTML code produced for a specific website was compared to the website's original HTML source code.
Key Words: deep learning, image processing, HTML code generation, web designing, automatic website generation
Front-end UI development based on GUI design is the primary responsibility of developers in website development. Generally, this process takes longer than achieving system functionality and logic. Therefore, it prevents developers from focusing on implementing key functionality. Also, the methods of implementing such a UI are determined by the developer's experience and skills. As a result, even when two different developers implement the same UI, there are differences in the source code. What if there was a mechanism or system that could detect HTML elements in GUI images and generate source code automatically? If such a system or tool can generate source code automatically, web developers can focus on functionalities rather than wasting time on front-end development. Furthermore, the system can follow the standards and rules. Therefore, the output of the system will be standards-compliant source code, which can provide additional benefits to the website that implements it.
User interface (UI) design takes into account the needs of end users, ensuring that the system is packed with elements that make it easy for users to access and understand its features. In addition, today's primary focus is on GUI design using various images, effects and animations. GUI design and implementation, however, are challenging and time-consuming [1]. Front-end development for websites or web apps is more complex than GUI implementation. It entails working with a variety of technologies and languages, including HTML, JavaScript, PHP, ASP.NET, MySQL, and AJAX. There may be instances where developers become stuck for a few hours or days [2]. When it comes to front-end layout implementations, web elements are classified into two categories: fixed GUI elements (buttons, text inputs, paragraphs, etc.) and dynamic elements (dropdowns, drawer menus, etc.). In this study, we concentrated solely on converting fixed GUI elements into source code, as the purpose of this research is to develop a solution for converting a GUI into source code using image processing and machine learning.
End users of a system interact with it through the user interface. The user interface is the primary gateway through which users communicate with the system. It is vital, and front-end development therefore receives more attention within the Software Development Life Cycle (SDLC). Front-end developers must work hard to build GUI elements that accomplish the unique design provided by graphic designers. Developers spend a lot of time choosing the appropriate HTML tags, planning the proper sequence and structure, and then coding. Furthermore, developers may have to repeat the same sort of code lines numerous times to construct a single GUI page, which is a common issue in markup languages. Therefore, the process is time-consuming and depends on the developer's experience and skills.
The primary aim of the research is to design, develop, and evaluate an approach for converting GUI design graphics into HTML code. The research focused on techniques from image processing and deep learning to design and implement the approach. The study makes the following contributions:
Constructing a mechanism to extract HTML elements from a GUI design.
Constructing a mechanism to classify the HTML tags.
Constructing a CNN to generate source code from the HTML tag hierarchy tree.
Figure 1 illustrates the overview of the proposed approach for HTML Code Generation from GUIs, which is referred to throughout the description. A detailed description of the approach is given in Section 3. The remainder of this paper is organized as follows: Section 2 conducts a literature review to comprehend the concepts related to the problem domain and scope. Furthermore, similar studies involving the conversion of a GUI into source code are discussed. Section 3 describes the proposed research methodology and the theoretical basis of the design and concept evaluation to arrive at a proper answer. Section 4 thoroughly covers the design concept and the image processing and deep learning techniques used in the solution. Section 5 expands on the discussion of results, and finally, Section 6 concludes the research by looking at the limitations and future directions of the proposed HTML Code Generation approach.
The scope of the study is divided into four sub-scopes, namely graphical user interfaces, code generation, computer vision and deep learning. Accordingly, the literature review section analyses the effectiveness of converting a GUI to source code and critically evaluates comparative work in the problem domain. Furthermore, it critically examines the various strategies, technologies, algorithms, methodologies, and tools available that can be used to effectively and efficiently produce source code from GUIs.
The user interface connects end users to the system or program and allows them to interact with the system or software. Myers [1] shows that even small details of the system need to be demonstrated, as they can have a significant impact on the overall system for the end users. Furthermore, he has outlined many approaches to determine how well UI design systems impress end users. UI design plays an important role in representing the software and all its purposes. If the system UI is not designed with a sufficient amount of information, the final software system will not be able to interact effectively with the users.
A GUI is an expanded version of the UI described in the previous paragraph. End users can interact with a system or software using visual prompts instead of a text-only UI [3]. The GUI introduced a technique for graphically interacting with software systems. It replaced the majority of Command Line Interfaces (CLIs) [4]. GUIs combine software and hardware to create an interactive platform for end users to collect, produce, and display data and information. As a result, GUI development became a key part of the software development process.
When a system is developed for use in a web browser, it is usually referred to as a website. The GUI of the web interface allows users to interact with the respective website. The website's GUI is mainly used to represent data as well as to accept input to the web system.
The parts of a GUI are user interface elements. These are components that represent data in the system in a structured way. These components help users interact with the software system without in-depth experience or knowledge of technology manipulation and computer skills. Most of the visual elements that GUIs use to capture data, information, and user input are static components, but some of them are dynamic, meaning they change periodically depending on conditions and interact directly with end users to give them the feeling of real-time communication with computer systems [4]. Generally, GUI elements include the following:
Informational Components: tooltips, icons, progress bar, message boxes, modal windows, notifications
Input Controls: text fields, checkboxes, radio buttons, list boxes, buttons, toggles, date field
Navigational Components: breadcrumb, slider, search field, pagination, tags, icons
Containers: accordion
Most of these web GUI elements can be classified under two main categories: static UI elements and dynamic UI elements. Faure et al. [5] have briefly described the difference between static element arrangement and dynamic element arrangement in a system UI. Static element arrangement refers to elements that are fixed components of a UI; these elements are mainly used to represent system data to users and receive user actions as inputs. Note that elements that are dynamically generated with animation, timing, or user interaction can be included in the static element category. Dynamic elements are those with real-time information that changes from time to time in the system due to periodic changes or user interaction.
A structured method of text manipulation in a computer is called a markup language. These methods make text easily recognizable and human readable. A markup language is not a programming language, but a method of creating a structure for text or a document for presentation on an electronic device. It can be used by an operating system, application, or program to present data as text, images, and other visual components. Markup languages [6] are classified into three categories as follows.
Presentational Markup: Used by traditional word processing systems with WYSIWYG editing, where it is hidden from human users. It includes horizontal and vertical spacing such as page breaks, double line spacing, and indentation, as would be produced on a mechanical or electronic typewriter.
Procedural Markup: It is integrated with the text and provides text processing instructions to programs. Such text is handled visually by the author. Procedural markup systems include programming constructs where macros or subroutines are defined and called by name.
Descriptive Markup: This category is concerned with the logical structure of a document rather than its appearance. Descriptive markup is designed to be easy for people and computers to read and understand. User-defined styles in word processors can be used for descriptive markup.
The syntax associated with HTML today belongs to the WWW, a network of information resources known as the World Wide Web. Dave Raggett et al. [7] describe three main mechanisms involved in this worldwide network, as follows; HTML comes under the hypertext mechanism.
A consistent naming convention for accessing web resources directly (URIs).
Protocols (HTTP, HTTPS) to access resources over the Internet.
A hypertext method to simply navigate through resources and represent data (HTML).
HTML is the markup language most used by developers to implement web GUIs. HTML defines how text, images and multimedia are displayed in web browsers [8]. There are two types of tags in HTML:
Paired Tags: A paired tag consists of two tags: the first is called the opening tag and the second is called the closing tag. The text enclosed between them receives the effect of that tag.
Unpaired Tags: An unpaired tag is a single tag that does not require a companion tag. These tags are written either as <> or </>; which style to choose is the developer's choice.
Most page structuring tags and formatting tags are paired tags. Therefore, each of them has both opening and closing tags. When a GUI is implemented with HTML tags, it can be represented as a hierarchical tree, as shown in Figure 2.
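To make the tree view of a page concrete, the following is a minimal sketch that parses a small HTML string into a nested tag tree using Python's standard `html.parser` module. The names `TagTreeBuilder`, `build_tag_tree`, `render`, and the short `VOID_TAGS` list are illustrative assumptions, not part of the paper's implementation.

```python
from html.parser import HTMLParser

# Assumption: a short, non-exhaustive list of unpaired (void) tags.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagTreeBuilder(HTMLParser):
    """Collects the tags of an HTML document into a nested tree."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "document", "children": []}
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self._stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            # Paired tag: stay inside it until its closing tag arrives.
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching opening tag, if one is still open.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i]["tag"] == tag:
                del self._stack[i:]
                break


def build_tag_tree(html):
    builder = TagTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def render(node, depth=0):
    """Return the tree as indented lines, one tag per line."""
    lines = ["  " * depth + node["tag"]]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


sample = "<html><body><h1>Title</h1><p>Text<br>more</p><img src='a.png'></body></html>"
tree = build_tag_tree(sample)
print("\n".join(render(tree)))
```

Here the paired tags (`html`, `body`, `h1`, `p`) become inner nodes, while the unpaired tags (`br`, `img`) become leaves of the tree.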
2.2.2 Scripts
Developers generally use client-side scripting, embedded in HTML, in web application development to reduce the data processing and rendering time of a website or web page. Such a script runs on the client-side computer. Therefore, the user is not blocked while waiting for server-side processing. Many animations, simple logic, and even local storage values are handled with these client-side scripting methods. Similarly, scripts are used to create GUI elements in web design. Furthermore, Dave Raggett et al. [7] have explained how scripts can be attached to forms to process inputs while handling user input.
2.2.3
Dave Raggett et al. [7] have shown how style sheets simplify HTML and relieve HTML code of the responsibility of rendering web views in web development. Style sheets allow users to control alignment, font, color, and layout settings. Developers can use them inside the HTML code or as a separate file that can be referenced externally. The method of associating the style sheet with the web page is independent of the styling language used.
2.2
Generally, in computer science, the SDLC is the structure imposed on software development by a development method [9]. Synonyms include software development and software process. Similarly, in the UI field, the UI Development Life Cycle (UIDLC) consists of the development path defined by a UI development method for developing a UI [10]. The implementation phase creates an actual website from the site design. As a first step, the elements and relationships highlighted in the design are mapped onto the constructs provided by the chosen implementation technique [11]. Turning the GUI design created by a graphic designer into source code falls under front-end UI development. GUI implementation is one of the main tasks of developers in the website development process.
Traditionally, GUI development and implementation are human tasks, and it takes time to fully convert a GUI design into source code. Often, this process takes more time than realizing system functionality and system logic. Moreover, it prevents developers from focusing on implementing key functionality [12]. The time spent on the same implementation may vary depending on the skill level of the developer. In the absence of a disciplined approach to Web-based system development, Web-based applications fail to deliver the desired performance and quality, and the development process becomes increasingly complex, challenging to manage and refine, expensive, and grossly behind schedule [13].
Generally, developers use user interface implementation tools for GUI implementations and gain several benefits from doing so. These benefits fall into two main categories, as described below [14].
First, the quality of the resulting user interfaces can increase because designs can be rapidly prototyped and implemented even before application code is written. This rapid, iterative design is an important part of achieving user interface quality. The code of the UI is automatically created from high-level specifications, so the reliability of the UI will increase. When UIs are created using UI generator tools, most of the end products are consistent, even if the applications are different. These tools are easy to use, so non-programmers can work on UI implementation. People such as graphic artists, usability specialists, and cognitive psychologists can also be involved in this process.
Second, because interface specifications can be represented, validated, and evaluated more efficiently, UI code can be easier and more cost-effective to create and maintain. Since UI tools play a key role in implementing the UI, less code has to be written by hand. Also, there can be better modularization due to the separation between the UI component and the application implementation. Generating web UIs uses different tools than building typical desktop or mobile software GUIs. Simple web pages consist of static text, graphics and links. These simple components can be created by writing HTML code directly. However, when UI generation tools are used to design web systems, they can create pages that contain text, buttons, input fields, etc. In addition, scripting languages should be used in HTML to implement dynamic pages.
Many of the techniques used in converting graphical images into code fall under machine learning. More recently, there have been several attempts to use machine learning techniques such as deep neural networks. Among these efforts, some focused on structured markup languages [16].
Ali Davody et al. [17] have introduced a Reinforcement Learning (RL) method that generates HTML source code when an image of the web page is provided. An RL agent is typically trained to maximize an expected evaluation metric instead of maximizing the conditional probability of tokens. The agent generates code whose rendered web view best matches the target. The model they proposed is an agent that takes a website image as input, generates HTML source code, and then renders it back in a browser. In their RL approach, the agent model emits a token at each time step and is trained to generate DSL source token by token. They also explained a method to compare the generated website screen with the target image. Their method uses only the web browser to calculate the reward, making the whole process independent of supervisory data. They also conducted some experiments on a synthetic dataset built with DSL support. It generated simple HTML web pages from a few tokens [17].
Balog et al. [18] have developed another technique based on deep learning to address program generation. They used predictions from neural networks to augment search methods, including an SMT-based solver and enumerative search. From further observations, they have shown that their method yields an order-of-magnitude speedup over non-augmented baselines and an RNN approach. They proposed two main ideas for generating computer programs using learned predictions to augment common search methods. The first idea is that the system learns to induce programs by using a corpus of program induction problems to learn strategies that generalize across problems. The second explains how neural network architectures can be integrated with search-based methods rather than replacing them. They have shown that machine learning can provide significant value for solving Inductive Program Synthesis (IPS). Learning from the association of input and output samples across different interpreters can enable source code generation. They also used a DSL to reduce the complexity of the programming language, which reduced the search space [19].
Moreover, Ling et al. [20] have recently studied inputs given as a mixture of natural language and structured program specifications and then explained the organization of the program. The most common factor among these works was the use of DSLs designed to target a specific domain. DSLs differ from full-fledged computer languages and are more restrictive, which reduces the complexity of the language. Generating source code from visual inputs is still a largely unexplored area of research.
Another closely related work using reverse engineering techniques was implemented by Nguyen et al. [21], focusing on Android mobile application development. It was able to regenerate the source code of the UI by taking an Android app screenshot as input. Their method is completely dependent on engineering background and requires good domain knowledge. pix2code [12] is the first attempt to solve the problem of creating web page UI code from visual input by using machine learning to learn latent variables rather than implementing complex hand-crafted methods. It follows a CNN-based approach, which generates tokens for the final stage from a single GUI screenshot input. Furthermore, it compares the process of generating source code from an image or screenshot to describing an image in written English. Both of these processes require producing variable-length strings of tokens by analyzing the image pixel by pixel. It only scratched the surface, as it was trained on a small dataset with relatively few parameters [22, 23].
All of the technologies and methodologies mentioned above rely on computer vision-related recognition and visual classification. This subsection discusses the image processing techniques that will be employed in the following steps. Other researchers have conducted several trials and attempts in image processing and edge detection, with the LoG approach providing superior results [24]. Image edge detection with the Laplacian of Gaussian operator can mimic the visual properties of human eyes. In practice, noise has a significant impact on an image since the scale factor is incapable of self-adaptive modification [25].
Deep learning principles must be considered along with image processing techniques. In terms of image identification and classification, the CNN ranks highly among neural networks. Currently, there are various machine learning approaches to computer vision problems, but the CNN is the method of choice and is used in a wide range of applications. This architecture helps the network learn sharp and rich latent representations from the training images [26][27].
To attain the end goals, the study used an experimental approach. The process of converting a GUI into source code is divided into two modules. The first module is the image processing module, which extracts HTML elements from the GUI image and determines the hierarchy of elements. The second module recognizes the HTML tags derived from the GUI image.
The MATLAB image processing toolkit was used to construct the image processing module, which was designed to extract HTML elements from a GUI image. The extractor module was designed using image processing techniques including LoG, dilate, and erode. Experiments were conducted by adjusting the variable settings for each stage and comparing the processed output images to the input GUI images. Furthermore, experiments were conducted to increase the number of HTML elements extracted from the GUI image, comparing that number to the number of HTML element tags in the original source code that produced the input GUI for the experiment. Figure 3 summarizes the concept and experiment of the image processing module. The Notepad++ utility was used to compare the counts of the two code segments.
TensorFlow and Python technologies were used to create the deep learning module. PyCharm IDE was used to implement the Python code for CNN development on a local Mac PC running Linux. Recently, the module was hosted on an AWS cloud instance to improve prediction accuracy. Figure 4 describes the high-level design of the deep learning module.
4. EXPERIMENTAL APPROACH
The proposed solution is organized into three main components. Figure 5 depicts the flow of image processing, markup tag structure, and HTML source code production at a high level. The final product of the proposed system would also comprise an HTML styling unit, which may include CSS or Sass. However, the scope of this study only included HTML tag production.
Figure 5: High-Level Architecture of the Approach
The first module of the proposed solution is the image processing unit. When a GUI image is given as input, this unit processes the image in several steps and returns an image with the detected HTML elements. Figure 6 illustrates the unit's input and expected processing output. To extract HTML elements from the input GUI image, the edge detection algorithm of MATLAB [28] was used. Furthermore, this unit of the proposed solution can be described as four image processing steps leading to the output of the image processing subunit. Each step is described below.
4.1.1 Laplacian of Gaussian filter (LoG)
The input GUI image was subjected to a LoG filter in the initial stage of the image processing subunit. It uses MATLAB's fspecial function with the 'log' argument to create a 2-D filter, which is then applied to the input image. This derivative filter is used to locate edges in images by identifying areas of rapid change [29].
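As an illustration of this stage, the sketch below builds a Laplacian of Gaussian kernel in plain Python and applies it to a tiny synthetic image. It follows the commonly documented LoG kernel formula; the kernel size of 5, the sigma of 0.5, the border handling, and the function names `log_kernel` and `filter2d` are assumptions made for this example, since the paper does not report its filter parameters.

```python
import math


def log_kernel(size=5, sigma=0.5):
    """Build a size x size Laplacian of Gaussian kernel (size assumed odd)."""
    half = (size - 1) / 2.0
    coords = [i - half for i in range(size)]
    gauss = [[math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) for x in coords]
             for y in coords]
    total = sum(sum(row) for row in gauss)
    kernel = [
        [(x * x + y * y - 2.0 * sigma ** 2) * gauss[r][c] / (sigma ** 4 * total)
         for c, x in enumerate(coords)]
        for r, y in enumerate(coords)
    ]
    # Shift the coefficients so they sum to zero: flat regions give no response.
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def filter2d(image, kernel):
    """Correlate a grayscale image (list of rows) with the kernel.

    Border pixels are replicated, which is an assumption of this sketch.
    """
    h, w = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += kernel[dr + k][dc + k] * image[rr][cc]
            out[r][c] = acc
    return out


# A light 9x9 "page" with a dark 3x3 "element" in the middle.
image = [[0.0 if 3 <= r <= 5 and 3 <= c <= 5 else 1.0 for c in range(9)]
         for r in range(9)]
response = filter2d(image, log_kernel())
```

The response is close to zero in the flat background and becomes clearly non-zero around the dark square, which is the behavior the edge-locating step relies on.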
4.1.2 Convert Image into Binary Image
The second stage of the image processing unit generates a binary image. The previous step's output is used as input for this operation. MATLAB's im2bw function [30] was used to convert the indexed data into a binary image. It first converts the input image to grayscale format and then converts it to a binary image. The image generated by this step contains only black and white pixels. The function takes a 'level' input, which can be specified in the range [0, 1]; the value used is about 80.0/256.0.
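The thresholding idea can be sketched in a few lines of Python. This is only an approximation of the step described above: it assumes the image is already grayscale with values normalized to [0, 1], and the function name `to_binary` is hypothetical. The default level mirrors the value of about 80.0/256.0 reported in the text.

```python
def to_binary(gray, level=80.0 / 256.0):
    """Map grayscale values in [0, 1] to 1 (above the level) or 0 (otherwise)."""
    return [[1 if value > level else 0 for value in row] for row in gray]


gray = [
    [0.10, 0.30, 0.90],
    [0.32, 0.31, 0.00],
]
binary = to_binary(gray)
```

With a level of 0.3125, only the pixels brighter than the threshold (0.90 and 0.32 here) are set to 1.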
4.1.3
The next phase of the image processing unit dilates the binary image produced by the previous step using MATLAB's imdilate function [31]. This function dilates grayscale, binary, or packed binary images, taking as its second input a structuring element object, or an array of structuring element objects, as returned by MATLAB's strel function. The image processing unit uses square as the structuring element object. Additionally, it makes use of the decomposition of the structuring element object. The resulting dilated image is further processed using another operation called erosion.
This phase erodes the image dilated in the previous phase. It uses the imerode function from the image processing toolbox of MATLAB [32]. This function takes the same arguments as the imdilate function, and it likewise uses square as the structuring element object. As the next stage, the image processing unit dilates the eroded image from the previous step again, using the previously mentioned imdilate method. This stage results in the HTML components being recognized in a smoothly expanded image.
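The effect of dilation followed by erosion with a square structuring element can be illustrated with a small pure-Python sketch. It is not the MATLAB implementation: the 3x3 square, the treatment of pixels outside the image, and the helper names `_morph`, `dilate`, and `erode` are assumptions chosen for the example.

```python
def _morph(binary, size, combine, outside):
    """Apply `combine` (max or min) over a size x size square neighborhood."""
    h, w = len(binary), len(binary[0])
    k = size // 2  # size is assumed to be odd
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = []
            for rr in range(r - k, r + k + 1):
                for cc in range(c - k, c + k + 1):
                    inside = 0 <= rr < h and 0 <= cc < w
                    window.append(binary[rr][cc] if inside else outside)
            out[r][c] = combine(window)
    return out


def dilate(binary, size=3):
    # A pixel becomes 1 if any pixel under the square is 1.
    return _morph(binary, size, max, 0)


def erode(binary, size=3):
    # A pixel stays 1 only if every pixel under the square is 1.
    return _morph(binary, size, min, 1)


# Two isolated pixels separated by a one-pixel gap.
mask = [[0] * 7 for _ in range(5)]
mask[2][2] = 1
mask[2][4] = 1
dilated = dilate(mask)
closed = erode(dilated)
```

Dilation grows both pixels until they merge into one block, and the following erosion shrinks the block back while keeping the gap filled, so fragments of the same element end up as a single region.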
Finally, the connected components in the binary image produced by the preceding steps are identified. They are then transferred to a new image as squares of random colors. The bwconncomp function from MATLAB's image processing toolbox [33] was utilized for this final stage. The final output of this image processing, made by drawing boxes of random colors, was used in the following stages of the proposed approach to convert the images into source code.
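A minimal sketch of connected component detection is given below. It labels 8-connected regions with a breadth-first flood fill and returns one bounding box per region; the box format `(left, top, right, bottom)` and the function name `connected_boxes` are assumptions made for illustration rather than the output format of the MATLAB function.

```python
from collections import deque


def connected_boxes(binary):
    """Return one bounding box (left, top, right, bottom) per 8-connected region."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            left = right = c
            top = bottom = r
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                # Visit all eight neighbors of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
boxes = connected_boxes(mask)
```

Each returned box corresponds to one candidate HTML element, which is the information the later stages need in order to draw the colored squares and record coordinates.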
The section numbered 2 in Figure 7 depicts the unit that obtains the position of each element as the coordinate values of the detected squares. It generates a JSON file with the coordinate values of each element. Furthermore, the same JSON file is generated based on the hierarchy of the discovered element squares. A Python script was used to generate the structured JSON file, and MATLAB read the processed image with randomly colored squares from the image processing unit. Figure 7 depicts an example of a simple hierarchy tree created using a simple UI.
The tag hierarchy is created in parallel with the placement of squares. The same JSON file is generated using this structure of elements. This procedure of generating the hierarchical tree of items made use of Breadth First Search (BFS) traversal. Figure 7 depicts how BFS uses the hierarchy tree to determine the order of tag items.
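One plausible way to derive such a hierarchy from the detected squares is by box containment, followed by a BFS traversal and JSON serialization. The sketch below makes several assumptions that the paper does not state: boxes are `(left, top, right, bottom)` tuples, the parent of an element is the smallest box that encloses it, siblings are ordered top to bottom and then left to right, and all function and key names are hypothetical.

```python
import json
from collections import deque


def contains(outer, inner):
    """True when box `inner` lies inside a different box `outer`."""
    return (outer != inner and outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def area(box):
    return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)


def build_hierarchy(boxes, page_box):
    root = {"tag": "unknown", "box": list(page_box), "children": []}
    nodes = [{"tag": "unknown", "box": list(b), "children": []} for b in boxes]
    for node in nodes:
        # The parent is the smallest enclosing box, or the page itself.
        enclosing = [n for n in nodes
                     if contains(tuple(n["box"]), tuple(node["box"]))]
        parent = min(enclosing, key=lambda n: area(n["box"])) if enclosing else root
        parent["children"].append(node)
    # Keep siblings in reading order: top to bottom, then left to right.
    for node in [root] + nodes:
        node["children"].sort(key=lambda n: (n["box"][1], n["box"][0]))
    return root


def bfs_order(root):
    """Visit the tree level by level and return the boxes in that order."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node["box"])
        queue.extend(node["children"])
    return order


boxes = [(5, 5, 94, 20), (10, 8, 50, 17), (5, 30, 94, 90), (10, 40, 30, 50)]
root = build_hierarchy(boxes, page_box=(0, 0, 99, 99))
order = bfs_order(root)
structured_json = json.dumps(root, indent=2)
```

In this example the page contains a header box and a content box, each with one nested element, and the BFS order visits the page first, then both top-level boxes, then the nested ones.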
Furthermore, this section takes the overall GUI image as input and assigns coordinate values to each highlighted element. Figure 8 illustrates the HTML element detection flow in detail. In the first stage, it selects a specific element from the entire image based on the coordinate values and determines the name of the HTML tag corresponding to that element. The output JSON file is then updated with the actual HTML tag value in place of the unknown tag. Finally, the hierarchy tree of unknown tags is updated to an HTML tag hierarchy tree. Figure 9 summarizes the overall picture.
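The update from unknown tags to HTML tags can be sketched as follows. The classifier here is a deliberately trivial, hypothetical stand-in for the trained CNN (it decides from the patch size only), and the helper names `crop`, `label_tree`, and `to_html` as well as the node layout are assumptions for illustration.

```python
# Assumption: a short list of unpaired tags that take no closing tag.
VOID_TAGS = {"img", "input"}


def crop(image, box):
    """Cut the element patch out of the full image using its coordinates."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def label_tree(node, image, classify):
    """Replace every 'unknown' tag with the label predicted for its patch."""
    node["tag"] = classify(crop(image, node["box"]))
    for child in node["children"]:
        label_tree(child, image, classify)
    return node


def to_html(node, depth=0):
    """Emit nested HTML lines for the labeled hierarchy tree."""
    pad = "  " * depth
    if node["tag"] in VOID_TAGS:
        return [f"{pad}<{node['tag']}>"]
    lines = [f"{pad}<{node['tag']}>"]
    for child in node["children"]:
        lines.extend(to_html(child, depth + 1))
    lines.append(f"{pad}</{node['tag']}>")
    return lines


def stub_classifier(patch):
    # Hypothetical stand-in for the CNN: uses the patch size only.
    height, width = len(patch), len(patch[0])
    if height >= 8:
        return "div"
    return "button" if width <= 4 else "p"


image = [[0] * 10 for _ in range(10)]
tree = {"tag": "unknown", "box": [0, 0, 9, 9], "children": [
    {"tag": "unknown", "box": [1, 1, 8, 2], "children": []},
    {"tag": "unknown", "box": [1, 4, 3, 5], "children": []},
]}
html_lines = to_html(label_tree(tree, image, stub_classifier))
```

Swapping `stub_classifier` for a function that wraps a real model would leave the tree update and the HTML emission unchanged, which is the separation between modules that the approach describes.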
We implemented the HTML tag recognition module using TensorFlow [34] and Python technology. The module was trained in a TensorFlow environment hosted locally on a Mac (Linux-based) system, using Python scripts implemented with the PyCharm tool [35]. The training dataset for HTML tags was created using internet resources and manual collection. Recently, the module was hosted in a cloud environment to train the CNN further and improve the prediction accuracy.
In this study, two different modules are evaluated: an image processing module and a deep learning module. They are tested independently. The following scenarios are considered in evaluating the modules.
The image processing module was developed using the MATLAB Image Processing Toolkit. Overall, this module can output a source code file containing only unknown tags from the intermediate hierarchy tree. The module was evaluated for each input GUI image by comparing the number of unknown tags found by the image processing module to the number of actual HTML tags in the source code. By comparing the input GUI image with the output image, which consists of randomly colored squares, the human eye can detect significant HTML element recognition errors. Figure 11 illustrates the results for the input GUI image in Figure 10.
To evaluate the image processing module, we used multiple GUI images for which the real HTML source code was available. The unknown tag counts were obtained from the output image of the image processing module, and the missing tag counts were then determined by comparison with the available source code created by a developer.
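The count comparison can also be scripted. The sketch below counts the tags inside the body of the original HTML and compares that number with the number of detected squares. Restricting the count to the body, the `detection_ratio` measure, and all names are assumptions of this example, not the procedure used in the study.

```python
from html.parser import HTMLParser


class BodyTagCounter(HTMLParser):
    """Counts the tags that appear inside <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            self.count += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False


def count_visible_tags(html):
    counter = BodyTagCounter()
    counter.feed(html)
    counter.close()
    return counter.count


def detection_summary(html, detected_count):
    expected = count_visible_tags(html)
    return {
        "expected": expected,
        "detected": detected_count,
        "missing": max(expected - detected_count, 0),
        "detection_ratio": detected_count / expected if expected else 0.0,
    }


original = ("<html><head><title>t</title></head><body><h1>A</h1><p>B</p>"
            "<img src='x.png'><button>Go</button></body></html>")
summary = detection_summary(original, detected_count=3)
```

For this sample page, four body tags are expected, so three detected squares leave one missing tag.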
To evaluate the deep learning module, we used a comparative method based on the same HTML source code. We compared the tag names in the original HTML source code with those in the version generated by the CNN module. Since the CNN was only trained to recognize a few types of components, we had to extract the recognized elements from the CNN module and match them to the corresponding elements provided by the developer for the input GUI image. We concentrated solely on the trained elements to evaluate the deep learning module in a logically consistent way.
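A simple way to express this comparison is an accuracy score restricted to the element types the CNN was trained on. In the sketch below, the class names follow the ten element types listed in the discussion of results; the pairing of expected and predicted labels and the function name `trained_accuracy` are assumptions for illustration, and the sample pairs are made up rather than taken from the experiments.

```python
TRAINED_CLASSES = {
    "heading", "paragraph", "image", "hyperlink", "button",
    "text field", "text area", "icon", "search bar", "label",
}


def trained_accuracy(pairs, trained=TRAINED_CLASSES):
    """Score (expected, predicted) label pairs, ignoring untrained classes.

    Returns the fraction of correct predictions and the number of scored pairs.
    """
    scored = [(e, p) for e, p in pairs if e in trained]
    if not scored:
        return 0.0, 0
    correct = sum(1 for e, p in scored if e == p)
    return correct / len(scored), len(scored)


# Illustrative pairs only: "table" is not a trained class and is skipped.
pairs = [
    ("heading", "heading"),
    ("paragraph", "paragraph"),
    ("button", "hyperlink"),
    ("table", "paragraph"),
    ("image", "image"),
]
accuracy, scored_count = trained_accuracy(pairs)
```

Elements whose expected type lies outside the trained set are excluded from the score, which mirrors the decision to concentrate solely on the trained elements.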
To evaluate the results, a demonstration was conducted as described here. A simple one-page web page was created by hand using HTML, and a screenshot of it was captured, as shown in Figure 12.
The screen capture was then passed through the image processing module to extract the corresponding HTML elements. Figures 13 and 14 illustrate the results of the first and last stages of the image processing module.
The aim of the study is to implement a method to automatically generate source code from GUI images. Based on the qualitative analysis of the finally detected HTML tags, it can be concluded that the accuracy of the output of the image processing module affects the final result of the deep learning module. The comparison of the number of unknown tags with the number of expected real HTML tags shows that the image processing step should be further improved to extract HTML elements from the GUI image.
The CNN was trained for only ten main HTML elements (heading, paragraph, image, hyperlink, button, text field, text area, icon, search bar, and label). According to the comparison between the results generated by the CNN module and the HTML source code that produced the experimental GUI images, the CNN is more accurate for elements covered by the training dataset. To improve the overall design of the proposed solution, attention should be paid to improving the image processing module rather than the deep learning module. To detect the highest number of HTML elements from the input GUI image, it was concluded that the image processing module needs to be improved, as the CNN only receives elements extracted from the GUI image.
The design and evaluation approach of the proposed solution has been able to initiate an open discussion. The approach of first extracting HTML elements from the GUI, then recognizing them through deep learning, and finally generating source code by running BFS on a hierarchical tree should receive more attention from the research community. As future work to improve this design, researching and extracting website taxonomy could provide source code to generate UIs, improve the prediction of HTML tags, and even give suggestions to evolve GUI implementations. Since the research only focused on HTML tags rather than styling, improvements are needed to integrate styles into the identified elements.
ACKNOWLEDGEMENT
This research received no specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
[1] Myers, B. A. (1993). Why are human-computer interfaces difficult to design and implement? Carnegie-Mellon University, Department of Computer Science.
[2] Kumari, P., & Nandal, R. (2017). A Research Paper on Website Development Optimization Using Xampp/PHP. International Journal of Advanced Research in Computer Science, 8(5).
[3] Stephenson, N. (1999). In the beginning... was the command line (pp. 1-60). New York: Avon Books.
[4] Graphical user interface - Wikipedia: https://en.wikipedia.org/wiki/Graphical_user_interface
[5] Faure, D., & Vanderdonckt, J. (2010, June). User interface extensible markup language. In Proceedings of the 2nd ACM SIGCHI symposium on Engineering interactive computing systems (pp. 361-362).
[6] Coombs, J. H., Renear, A. H., & DeRose, S. J. (1987). Markup systems and the future of scholarly text processing. Communications of the ACM, 30(11), 933-947.
[7] Raggett, D., Le Hors, A., & Jacobs, I. (1997). HTML 4.01 Specification. IETF HTML WG.
[8] Berners-Lee, T., & Connolly, D. (1995). Hypertext markup language-2.0 (No. rfc1866).
[9] Pressman, R. S. (2005). Software Engineering: a practitioner's approach. Pressman and Associates.
[10] Khaddam, I., Barakat, H., & Vanderdonckt, J. (2016). Enactment of User Interface Development Methods in Software Life Cycles. In RoCHI (pp. 26-35).
[11] Coda, F., Ghezzi, C., Vigna, G., & Garzotto, F. (1998, April). Towards a software engineering approach to web site development. In Proceedings Ninth International Workshop on Software Specification and Design (pp. 8-17). IEEE.
[12] Beltramelli, T. (2018, June). pix2code: Generating code from a graphical user interface screenshot. In Proceedings of the ACM SIGCHI symposium on engineering interactive computing systems (pp. 1-6).
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
[13]Murugesan,S.,Deshpande,Y.,Hansen,S.,&Ginige,A.(2001). Web engineering: A new discipline for development of webbased systems.Webengineering:ManagingdiversityandcomplexityofWebapplicationdevelopment,3-13.
[14]Myers,B.A.(2004).51. Graphical User Interface Programming.GraphicalUserInterfaceProgramming.
[15]McFarland,D.S.(2007). Dreamweaver CS3: The Missing Manual."O'ReillyMedia,Inc.".
[16]Deng,Y.,Kanervisto,A.,Ling,J.,&Rush,A.M.(2017,July). Image-to-markup generation with coarse-to-fine attention.In InternationalConferenceonMachineLearning(pp.980-989).PMLR.
[17]Davody, A., Davoudi, H., Baba, M. S., & Florian, R. V. (2018). Learning to generate HTML code from images with no supervisory data.
[18]Balog,M.,Gaunt,A.L.,Brockschmidt,M.,Nowozin,S.,&Tarlow,D.(2016). Deepcoder: Learning to write programs.arXiv preprintarXiv:1611.01989.
[19]Gaunt,A.L.,Brockschmidt,M.,Singh,R.,Kushman,N.,Kohli,P.,Taylor,J.,&Tarlow,D.(2016). Terpret: A probabilistic programming language for program induction.arXivpreprintarXiv:1608.04428.
[20]Ling,W.,Grefenstette,E.,Hermann,K.M.,Kočiský,T.,Senior,A.,Wang,F.,&Blunsom,P.(2016). Latent predictor networks for code generation.arXivpreprintarXiv:1603.06744.
[21]Nguyen,T.A.,&Csallner,C.(2015,November). Reverse engineering mobile application user interfaces with REMAUI (t).In 201530thIEEE/ACMInternationalConferenceonAutomatedSoftwareEngineering(ASE)(pp.248-259).IEEE.
[22]Bahdanau,D.(2014). Neural machine translation by jointly learning to align and translate.arXivpreprintarXiv:1409.0473.
[23]Xu, K. (2015). Show, attend and tell: Neural image caption generation with visual attention. arXiv preprint arXiv:1502.03044.
[24]SotakJr,G.E.,&Boyer,K.L.(1989). The Laplacian-of-Gaussian kernel: a formal analysis and design procedure for fast, accurate convolution and full-frame output.Computervision,graphics,andimageprocessing,48(2),147-189.
[25]Mallick,A.,Roy,S.,Chaudhuri,S.S.,&Roy,S.(2014,January). Optimization of Laplace of Gaussian (LoG) filter for enhanced edge detection: a new approach.InProceedingsofthe2014InternationalConferenceonControl,Instrumentation,Energy andCommunication(CIEC)(pp.658-661).IEEE.
[26]Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks Advancesinneuralinformationprocessingsystems,25.
[27]Sermanet, P. (2013). Overfeat: Integrated Recognition, Localization and Detection Using Convolutional networks. arXiv preprintarXiv:1312.6229.
[28]JohannaPingel, Edge Detection with MATLAB MathWorks:https://in.mathworks.com/help/images/edge-detection.html
[29]Huang,M.,Mu,Z.,Zeng,H.,&Huang,H.(2015). A Novel Approach for Interest Point Detection via Laplacian‐of‐Bilateral Filter.JournalofSensors,2015(1),685154.
[30]TheMathWorks,Inc Image Processing Toolbox - im2bw: https://in.mathworks.com/help/images/ref/im2bw.html
[31]TheMathWorks,Inc ImageProcessingToolbox–imdilate:https://in.mathworks.com/help/images/ref/imdilate.html
[32]TheMathWorks,Inc ImageProcessingToolbox–imerode:https://www.mathworks.com/help/images/ref/imerode.html
[33]The MathWorks, Inc, Image Processing Toolbox – bwconncomp: https://www.mathworks.com/help/images/ref/bwconncomp.html
[34]Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., ... & Zheng, X. (2016). TensorFlow: a system for Large-Scale machine learning.In12thUSENIXsymposiumonoperatingsystemsdesignandimplementation(OSDI16)(pp.265-283).
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
[35]Shunan Zhang. Installing tensorflow on Pycharm (Mac). Stackoverflow: https://stackoverflow.com/questions/36998018/installing-tensorflow-on-pycharm-mac BIOGRAPHIES
Thisaranie Kaluarachchi received a B.Sc. degree in Computer Science from the Faculty of Science at the University of Peradeniya in Kandy, Sri Lanka. She is pursuing her PhD in Computing at the University of Colombo School of Computing in Colombo, Sri Lanka. Her research interests include Artificial Intelligence, Machine Learning, Deep Learning, and Computer Vision.