Concepts and Semantics of Programming Languages 1: A Semantical Approach with OCaml and Python Therese Hardin



First published 2021 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com

© ISTE Ltd 2021

The rights of Thérèse Hardin, Mathieu Jaume, François Pessaux and Véronique Viguié Donzeau-Gouge to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2021930488

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-78630-530-5

Foreword
Preface

Chapter 1. From Hardware to Software
1.1. Computers: a low-level view
1.1.1. Information processing
1.1.2. Memories
1.1.3. CPUs
1.1.4. Peripheral devices
1.2. Computers: a high-level view
1.2.1. Modeling computations
1.2.2. High-level languages
1.2.3. From source code to executable programs

Chapter 2. Introduction to Semantics of Programming Languages
2.1. Environment, memory and state
2.1.1. Evaluation environment
2.1.2. Memory
2.1.3. State
2.2. Evaluation of expressions
2.2.1. Syntax
2.2.2. Values
2.2.3. Evaluation semantics
2.3. Definition and assignment
2.3.1. Defining an identifier
2.3.2. Assignment
2.4. Exercises

Chapter 3. Semantics of Functional Features
3.1. Syntactic aspects
3.1.1. Syntax of a functional kernel
3.1.2. Abstract syntax tree
3.1.3. Reasoning by induction over expressions
3.1.4. Declaration of variables, bound and free variables
3.2. Execution semantics: evaluation functions
3.2.1. Evaluation errors
3.2.2. Values
3.2.3. Interpretation of operators
3.2.4. Closures
3.2.5. Evaluation of expressions
3.3. Execution semantics: operational semantics
3.3.1. Simple expressions
3.3.2. Call-by-value
3.3.3. Recursive and mutually recursive functions
3.3.4. Call-by-name
3.3.5. Call-by-value versus call-by-name
3.4. Evaluation functions versus evaluation relations
3.4.1. Status of the evaluation function
3.4.2. Induction over evaluation trees
3.5. Semantic properties
3.5.1. Equivalent expressions
3.5.2. Equivalent environments

Chapter 4. Semantics of Imperative Features
4.1. Syntax of a kernel of an imperative language
4.2. Evaluation of expressions
4.3. Evaluation of definitions
4.4. Operational semantics
4.4.1. Big-step semantics
4.4.2. Small-step semantics
4.4.3. Expressiveness of operational semantics
4.5. Semantic properties
4.5.1. Equivalent programs
4.5.2. Program termination
4.5.3. Determinism of program execution
4.5.4. Big steps versus small steps
4.6. Procedures
4.6.1. Blocks
4.6.2. Procedures
4.7. Other approaches
4.7.1. Denotational semantics
4.7.2. Axiomatic semantics, Hoare logic
4.8. Exercises

Chapter 5. Types
5.1. Type checking: when and how?
5.1.1. When to verify types?
5.1.2. How to verify types?
5.2. Informal typing of a program Exp2
5.2.1. A first example
5.2.2. Typing a conditional expression
5.2.3. Typing without type constraints
5.2.4. Polymorphism
5.3. Typing rules in Exp2
5.3.1. Types, type schemes and typing environments
5.3.2. Generalization, substitution and instantiation
5.3.3. Typing rules and typing trees
5.4. Type inference algorithm in Exp2
5.4.1. Principal type
5.4.2. Sets of constraints and unification
5.4.3. Type inference algorithm
5.5. Properties
5.5.1. Properties of type checking
5.5.2. Properties of the inference algorithm
5.6. Type checking of imperative constructs
5.6.1. Type algebra
5.6.2. Typing rules
5.6.3. Typing polymorphic definitions
5.7. Subtyping and overloading
5.7.1. Subtyping
5.7.2. Overloading

Chapter 6. Data Types
6.1. Basic types
6.1.1. Booleans
6.1.2. Integers
6.1.3. Characters
6.1.4. Floating point numbers
6.2. Arrays
6.3. Strings
6.4. Type definitions
6.4.1. Type abbreviations
6.4.2. Records
6.4.3. Enumerated types
6.4.4. Sum types
6.5. Generalized conditional
6.5.1. C style switch/case
6.5.2. Pattern matching
6.6. Equality
6.6.1. Physical equality
6.6.2. Structural equality
6.6.3. Equality between functions

Chapter 7. Pointers and Memory Management
7.1. Addresses and pointers
7.2. Endianness
7.3. Pointers and arrays
7.4. Passing parameters by address
7.5. References
7.5.1. References in C++
7.5.2. References in Java
7.6. Memory management
7.6.1. Memory allocation
7.6.2. Freeing memory
7.6.3. Automatic memory management

Chapter 8. Exceptions
8.1. Errors: notification and propagation
8.1.1. Global variable
8.1.2. Record definition
8.1.3. Passing by address
8.1.4. Introducing exceptions
8.2. A simple formalization: ML-style exceptions
8.2.1. Abstract syntax
8.2.2. Values
8.2.3. Type algebra
8.2.4. Operational semantics
8.2.5. Typing
8.3. Exceptions in other languages
8.3.1. Exceptions in OCaml
8.3.2. Exceptions in Python
8.3.3. Exceptions in Java
8.3.4. Exceptions in C++

Foreword

Computer programs have played an increasingly central role in our lives since the 1940s, and the quality of these programs has thus become a crucial question. Writing a high-quality program – a program that performs the required task and is efficient, robust, easy to modify, easy to extend, etc. – is an intellectually challenging task, requiring the use of rigorous development methods. First and foremost, however, the creation of such a program depends on an in-depth knowledge of the programming language used, its syntax and, crucially, its semantics, i.e. what happens when a program is executed.

The description of this semantics brings the most fundamental concepts to light, including those of value, reference, exception or object. These concepts are the foundations of programming language theory. Mastering these concepts is what sets experienced programmers apart from beginners. Certain concepts – like that of value – are common to all programming languages; others – such as the notion of functions – operate differently in different languages; finally, other concepts – such as that of objects – only exist in certain languages. Computer scientists often refer to “programming paradigms” to consider sets of concepts shared by a family of languages, which imply a certain programming style: imperative, functional, object-oriented, logical, concurrent, etc. Nevertheless, an understanding of the concepts themselves is essential, as several paradigms may be interwoven within the same language.

Introductory texts on programming in any given language are not difficult to find, and a number of published books address the fundamental concepts of language semantics. Much rarer are those, like the present volume, which establish and examine the links between concepts and their implementation in languages used by programmers on a daily basis, such as C, C++, Ada, Java, OCaml and Python. The authors provide a wealth of examples in these languages, illustrating and giving life to the notions that they present. They propose general models, such as the kit presented in Volume 2, permitting a unified view of different notions; this makes it easier for readers to understand the constructs used in popular programming languages and facilitates comparison. This thorough and detailed work provides readers with an understanding of these notions and, above all, an understanding of the ways of using the latter to create high-quality programs, building a safer and more reliable future in computing.

Catherine DUBOIS
Professor at the École nationale supérieure d’informatique pour l’industrie et l’entreprise

January 2021

Preface

This two-volume work relates to the field of programming. First and foremost, it is intended to give readers a solid grounding in the bases of functional or imperative programming, along with a thorough knowledge of the module and class mechanisms involved. In our view, the semantics approach is most appropriate when studying programming, as the impact of interlanguage syntax differences is limited. Practical considerations, determined by the material characteristics of computers and/or “smart” devices, will also be addressed. The same approach will be taken in both volumes, using both mathematical formulas and memory state diagrams. With this book, we hope to help readers understand the meaning of the constructs described in the reference manuals of programming languages and to establish solid foundations for reasoning and assessing the correctness of their own programs through critical review. In short, our aim is to facilitate the development of safe and reliable programs.

Volume 1 begins with a presentation of the computer, in Chapter 1, first at the material level – as an assemblage of components – then as a tool for executing programs. Chapter 2 is an intuitive, step-by-step introduction to language semantics, intended to familiarize readers with this approach to programming. In Chapter 3, we provide a detailed discussion on the subject, with a formal presentation of the execution semantics of functional features. Chapter 4 continues with the same topic, looking at the execution semantics of imperative features. In these two chapters, a clear mathematical framework is used to support our presentation. Also, all of the notions which we introduce in these chapters are implemented in both Python and OCaml to assist readers learning about the semantic concepts in question for the first time. Multiple exercises, with detailed solutions, are provided in both cases. Chapter 5, on the subject of typing, begins by addressing typing rules, which are used to check programs; we then present the algorithm used to infer polymorphic types, along with the associated mathematical notions, all implemented in both languages. Finally, the extension of typing to imperative features is addressed. In Chapter 6, we present the main data types and methods of pattern matching, using a range of examples expressed in different programming languages. Chapter 7 focuses on low-level programming features: endianness, pointers and memory management; these notions are mostly presented using C and C++. Volume 1 ends with a discussion of error processing using exceptions: their semantics is presented in OCaml, and the exception management mechanisms used in Python, Java and C++ are also described (see Chapter 8).

Thus, Volume 1 is intended to give a broad overview of the functional and imperative features of programming, from notions that can be modeled mathematically to notions that are linked to the hardware configuration of computers themselves. Volume 2 focuses on modular and object programming, building on the foundations laid down in Volume 1 since modules, classes and objects are, in essence, the means of organizing functional or imperative constructs. Volume 2 first analyzes the needs of developers in terms of tools for software architecture. Based on this study, an original semantic model, called a kit, is drawn up, jointly presenting all the features of the modules and objects that can meet these needs. The semantics of these kits are defined in a rather informal way, as research in this field has not yet produced a mathematical model of this set of features that remains relatively simple. From this model, we consider a set of emerging questions, the objective of which is to guide the acquisition of a language. This approach is then exemplified by the study of the module systems of Ada, OCaml and C. Finally, the same approach will be used to deduce a semantic model of class and object features, which will serve to present classes in Java, C++, OCaml and Python from a unified perspective.

This work is aimed at a relatively wide audience, from experienced developers – who will find valuable additional information on language semantics – to beginners who have only written short programs. For beginners, we recommend working on the semantic concepts described in Volume 1 using the implementations in OCaml or Python to ease assimilation. All readers may benefit from studying the reference manual of a programming language, while comparing the presentations of constructs given in the manual with those given here, guided by the questions mentioned in Volume 2.

Note that we do not discuss the algorithmic aspect of data processing here. However, choosing the algorithm and the data representation that fit the requirements of the specification is an essential step in program development. Many excellent works have been published on this subject, and we encourage readers to explore the subject further. We also recommend using the standard libraries provided by the chosen programming language. These libraries include tried and tested implementations for many different algorithms, which may generally be assumed to be correct.

Chapter 1. From Hardware to Software

This first chapter provides a brief overview of the components found in all computers, from mainframes to the processing chips in tablets, smartphones and smart objects via desktop or laptop computers. Building on this hardware-centric presentation, we shall then give a more abstract description of the actions carried out by computers, leading to a uniform definition of the terms “program” and “execution”, above and beyond the various characteristics of so-called electronic devices.

1.1. Computers: a low-level view

Computer science is the science of rational processing of information by computers. Computers have the capacity to carry out a variety of processes, depending on the instructions given to them. Each item of information is an element of knowledge that may be transmitted using a signal and encoded using a sequence of symbols in conjunction with a set of rules used to decode them, i.e. to reconstruct the signal from the sequence of symbols. Computers use binary encoding, involving two symbols; these may be referred to as “true”/“false”, “0”/“1” or “high”/“low”; these terms are interchangeable, and all represent the two stable states of the electrical potential of digital electronic circuits.
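The idea of a sequence of binary symbols paired with a decoding rule can be sketched in a few lines of Python. This is only an illustration: the choice of UTF-8 with 8 bits per byte, and the names encode and decode, are assumptions made for the example, not part of the text.

```python
def encode(text: str) -> str:
    """Encode text as a sequence of '0'/'1' symbols, 8 bits per UTF-8 byte."""
    return "".join(format(byte, "08b") for byte in text.encode("utf-8"))


def decode(bits: str) -> str:
    """Decoding rule: regroup the symbols 8 at a time and rebuild the text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


message = "Hi"
bits = encode(message)
print(bits)          # 0100100001101001
print(decode(bits))  # Hi
```

Without the decoding rule, the sequence of symbols carries no recoverable information: the same 16 bits could just as well be read as one integer or as two small ones.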

1.1.1. Information processing

Schematically, a computer is made up of three families of components as follows:

– memories: store data (information) and executable code (the so-called von Neumann architecture);

– one or more microprocessors, known as CPUs (central processing units), which process information by applying elementary operations;

– peripherals: these enable information to be exchanged between the CPU/memory couple and the outside.

Information processing by a computer – in other terms, the execution of a program – can be summarized as a sequence of three steps: fetching data, computing the results and returning them. Each elementary processing operation corresponds to a configuration of the logical circuits of the CPU, known as a logic function. If the result of this function is solely dependent on input, and if no notion of “time” is involved in the computations, then the function is said to be combinatorial; otherwise, it is said to be sequential.

For example, a binary half-adder, as shown in Figure 1.1, is a circuit that computes the sum of two binary digits (input), along with the possible carry value. It thus implements a combinatorial logic function.

The essential character of a combinatorial function is that, for the same input, the function always produces the same output, no matter what the circumstances. This is not true of sequential logic functions.
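As a rough illustration, the half-adder can be modeled as a pure Python function. The sketch assumes the usual gate-level construction (sum bit by exclusive or, carry bit by logical and); the function name half_adder is chosen for this example only.

```python
from typing import Tuple


def half_adder(a: int, b: int) -> Tuple[int, int]:
    """Return (sum_bit, carry_bit) for two input bits."""
    sum_bit = a ^ b    # XOR: 1 when exactly one input is 1
    carry_bit = a & b  # AND: 1 only when both inputs are 1
    return sum_bit, carry_bit


# The full truth table: the output depends on the inputs and on nothing else.
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(f"{a} + {b} -> sum={s}, carry={c}")
```

Since the function reads no state other than its arguments, calling it twice with the same bits always gives the same pair, which is exactly the combinatorial property.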

For example, a logic function that counts the number of times its input changes relies on a notion of “time” (changes take place in time), and a persistent state between two inputs is required in order to record the previous value of the counter. This state is saved in a memory. For sequential functions, the same input value can result in different output values, as every output depends not only on the input, but also on the state of the memory at the moment of reading the new input.
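The change counter can be sketched as a small Python class whose attributes play the role of the memory. The class name, the initial input value of 0 and the sample input sequence are assumptions made for the illustration.

```python
class ChangeCounter:
    """Sequential logic sketch: counts how many times the input changes."""

    def __init__(self, initial_input: int = 0) -> None:
        self.previous = initial_input  # memory: last input seen
        self.count = 0                 # memory: number of changes so far

    def step(self, value: int) -> int:
        """Read one input and return the updated counter."""
        if value != self.previous:
            self.count += 1
        self.previous = value
        return self.count


counter = ChangeCounter()
outputs = [counter.step(bit) for bit in [0, 1, 1, 0, 1]]
print(outputs)  # [0, 1, 1, 2, 3]
```

The input 1 appears three times and produces the outputs 1, 1 and 3: the result depends on the stored state as well as on the input.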

Figure 1.1. Binary half-adder

1.1.2. Memories

Computers use memory to save programs and data. There are several different technologies used in memory components, and a simplified presentation is as follows:

– RAM (Random Access Memory): RAM memory is both readable and writeable. RAM components are generally fast, but also volatile: if the power supply is cut, their content is lost;

– ROM (Read Only Memory): information stored in a ROM is written at the time of manufacturing, and it is read-only. ROM is slower than RAM, but is non-volatile, like, for example, a burned DVD;

– EPROM (Erasable Programmable Read Only Memory): this memory is non-volatile, but can be written using a specific device, through exposure to ultraviolet light, or by modifying the power voltage, etc. It is slower than RAM, for both reading and writing. EPROM may be considered equivalent to a rewritable DVD.

Computers combine memory components based on several technologies. Storage size diminishes as access speed increases, as fast-access memory is more costly. A distinction is generally made between four different types of memory:

– mass storage is measured in terabytes and is made either of mechanical disks (with an access time of ∼10 ms) or – increasingly – of solid-state drive (SSD) blocks. These blocks use an EEPROM variant (electrically erasable) with an access time of ∼0.1–0.3 ms, known as flash memory. Mass storage is non-volatile and is principally used for the file system;

– RAM, which is external to the microprocessor. Recent home computers and smartphones generally possess large RAM capacities (measured in gigabytes). Embedded systems or consumer development electronic boards may have a much lower RAM capacity. The access time is around 40–50 ns;

– the cache is generally included in the CPU of modern machines. This is a small RAM memory of a few kilobytes (or megabytes), with an access time of around 5–10 ns. There are often multiple levels of cache, and access time decreases with size. The cache is used to save frequently used and/or consecutive data and/or instructions, reducing the need to access slower RAM by retaining information locally. Cache management is complex: it is important to ensure consistency between the data in the main memory and the cache, between different CPUs or different cores (full, independent processing units within the same CPU) and to decide which data to discard to free up space, etc.;

– registers are the fastest memory units and are located in the center of the microprocessor itself. The microprocessor contains a limited number (a few dozen) of these storage zones, used directly by CPU instructions. Access time is around one processor cycle, i.e. around 1 ns.

1.1.3. CPUs

The CPU, as its name suggests, is the unit responsible for processing information, via the execution of elementary instructions, which can be roughly grouped into five categories:

– data transfer instructions (copy between registers or between memory and registers);

– arithmetic instructions (addition of two integer values contained in two registers, multiplication by a constant, etc.);

– logical instructions (bit-wise and/or/not, shift, rotate, etc.);

– branching operations (conditional, non-conditional, to subroutines, etc.);

– other instructions (halt the processor, reset, interrupt requests, test-and-set, compare-and-swap, etc.).

Instructions are coded by binary words in a format specific to each microprocessor. A program of a few lines in a high-level programming language is translated into tens or even hundreds of elementary instructions, which would be difficult, error prone and time consuming to write out manually. This is illustrated in Figure 1.2, where a “Hello World!” program written in C is shown alongside its counterpart in x86-64 instructions, generated by the gcc compiler.

    #include <stdio.h>
    int main() {
      printf("Hello world!\n");
      return (0);
    }

        .section __TEXT
        .globl _main
        .align 4, 0x90
    _main:
        .cfi_startproc
    ## BB#0:
        pushq %rbp
    Ltmp0:
        .cfi_def_cfa_offset 16
    Ltmp1:
        .cfi_offset %rbp, -16
        movq %rsp, %rbp
    Ltmp2:
        .cfi_def_cfa_register %rbp
        subq $16, %rsp
        leaq L_.str(%rip), %rdi
        movl $0, -4(%rbp)
        movb $0, %al
        callq _printf
        xorl %ecx, %ecx
        movl %eax, -8(%rbp)
        movl %ecx, %eax
        addq $16, %rsp
        popq %rbp
        retq
        .cfi_endproc
        .section __TEXT
    L_.str:
        .asciz "Hello world!\n"

Figure 1.2. “Hello world!” in C and in x86-64 instructions

Put simply, a microprocessor is split into two parts: a control unit, which decodes and sequences the instructions to execute, and one or more arithmetic and logic units (ALUs), which carry out the operations stipulated by the instructions. The CPU runs permanently through a three-stage cycle:

1) fetching the next instruction to be executed from the memory: every microprocessor contains a special register, the Program Counter (PC), which records the location (address) of this instruction. The PC is then incremented, i.e. the size of the fetched instruction is added to it;

2) decoding of the fetched instruction;

3) execution of this instruction.

However, the next instruction is not always the one located next to the current instruction. Consider the function min in example 1.1, written in C, which returns the smallest of its two arguments.

EXAMPLE 1.1.–

C

    int min(int a, int b) {
      if (a < b) return (a);
      else return (b);
    }

This function may be translated, intuitively and naively, into elementary instructions, by first placing a and b into registers, then comparing them:

    min:
        load a, reg0
        load b, reg1
        compare reg0, reg1

Depending on the result of the test – true or false – different continuations are considered. Execution continues using instructions for one or the other of these continuations: we therefore have two possible control paths. In this case, a conditional jump instruction must be used to modify the PC value, when required, to select the first instruction of one of the two possible paths.

        branchgt a_gt_b
        load reg0, reg2
        jump end
    a_gt_b:
        load reg1, reg2
    end:
        return reg2

The branchgt instruction loads the location of the instruction at label a_gt_b into the PC when the result of the compare instruction is that reg0 > reg1; the next instruction is then the one found at this address: load reg1, reg2. Otherwise, the PC is left unchanged and the next instruction is the one following branchgt: load reg0, reg2. This is followed by the unconditional jump instruction, jump, which modifies the PC unconditionally, loading it with the address of the end label. Thus, whatever the result of the comparison, execution finishes with the instruction return reg2.

Conditional branching requires the use of a specific memory to determine whether certain conditions have been satisfied by the execution of the previous instruction (overflow, positive result, null result, superiority, etc.). Every CPU contains a dedicated register, the State Register (SR), in which every bit is assigned to signaling one of these conditions. Executing most instructions may modify all or some of the bits in the register. Conditional instructions (both jumps and more “exotic” variants) use the appropriate bit values for execution. Certain ARM® architectures [ARM10] even permit all instructions to be intrinsically conditional.
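The fetch, decode and execute cycle, together with the role of the PC and of a condition bit, can be tried out on the naive translation of min with a toy interpreter. This is a minimal sketch, not a model of any real processor: instructions are Python tuples, addresses are list indices, the state register is reduced to a single Boolean named greater, and the names PROGRAM, LABELS, run and machine_min are all invented for the example.

```python
from typing import Dict, List, Tuple

Instruction = Tuple[str, ...]

# The naive translation of min, one tuple per elementary instruction.
PROGRAM: List[Instruction] = [
    ("load", "a", "reg0"),
    ("load", "b", "reg1"),
    ("compare", "reg0", "reg1"),
    ("branchgt", "a_gt_b"),
    ("load", "reg0", "reg2"),
    ("jump", "end"),
    ("load", "reg1", "reg2"),   # label a_gt_b
    ("return", "reg2"),         # label end
]
# Labels resolved to "addresses" (here, indices in PROGRAM).
LABELS: Dict[str, int] = {"a_gt_b": 6, "end": 7}


def run(program: List[Instruction], labels: Dict[str, int],
        memory: Dict[str, int]) -> int:
    registers: Dict[str, int] = {}
    greater = False  # a single bit of a much simplified state register
    pc = 0           # program counter: index of the next instruction
    while True:
        instruction = program[pc]        # 1) fetch
        pc += 1                          #    the PC now designates the following instruction
        opcode, *operands = instruction  # 2) decode
        if opcode == "load":             # 3) execute
            source, target = operands
            registers[target] = memory[source] if source in memory else registers[source]
        elif opcode == "compare":
            left, right = operands
            greater = registers[left] > registers[right]
        elif opcode == "branchgt":
            if greater:                  # conditional modification of the PC
                pc = labels[operands[0]]
        elif opcode == "jump":
            pc = labels[operands[0]]     # unconditional modification of the PC
        elif opcode == "return":
            return registers[operands[0]]
        else:
            raise ValueError(f"unknown opcode: {opcode}")


def machine_min(a: int, b: int) -> int:
    return run(PROGRAM, LABELS, {"a": a, "b": b})


print(machine_min(3, 7))  # 3
print(machine_min(7, 3))  # 3
```

Both calls end on the return instruction, but they reach it through different control paths: the first falls through branchgt and takes the jump, the second takes the branch.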

Every program is made up of functions that can be called at different points in the program and these calls can be nested. When a function is called, the point where execution should resume once the execution of the function is completed – the return address – must be recorded. Consider a program made up of the functions g() = k() + h() and f() = g() + h(), featuring several function calls, some of which are nested.

    g() =
      t11 = k()
      t12 = h()
      return t11 + t12

    f() =
      v11 = g()
      v12 = h()
      return v11 + v12

A single register is not sufficient to record the return addresses of the different calls. Calling k from g must be followed by calling h to evaluate t12. But this call of g was made by f, so its return address in f must also be recorded in order to go on to evaluate v12. The number of return addresses to record increases with the number of nested calls, and decreases as we leave these calls, which naturally suggests saving these addresses in a stack. Figure 1.3 shows the evolution of a stack structure during successive function calls, demonstrating the need to record multiple return addresses. The state of the stack is shown at every step of the execution, at the moment where the line in the program is being executed.

A dedicated register, the Stack Pointer (SP), always contains the address of the next free slot in the stack (or, alternatively, the address of the last slot used). Thus, in the case of nested calls, the return address is saved at the address indicated by the SP, and the SP is incremented by the size of this address. When the function returns, the PC is loaded with the saved address from the stack, and the SP is decremented accordingly.
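The growth and shrinkage of the stack of return addresses for f and g can be observed with a short simulation. In this sketch a Python list stands for the stack (its length plays the role of the SP), return addresses are represented by descriptive strings, and the values returned by k and h are arbitrary; all of these are assumptions made for the illustration.

```python
from typing import Callable, List

stack: List[str] = []        # saved return addresses
trace: List[List[str]] = []  # snapshot of the stack at each call


def call(function: Callable[[], int], return_address: str) -> int:
    stack.append(return_address)  # save the return address; the SP moves up
    trace.append(list(stack))
    result = function()
    stack.pop()                   # reload the PC from the stack; the SP moves back
    return result


def k() -> int:
    return 1  # arbitrary value


def h() -> int:
    return 2  # arbitrary value


def g() -> int:
    t11 = call(k, "g: after k()")
    t12 = call(h, "g: after h()")
    return t11 + t12


def f() -> int:
    v11 = call(g, "f: after g()")
    v12 = call(h, "f: after h()")
    return v11 + v12


result = f()
for snapshot in trace:
    print(snapshot)
print(result)  # 5
```

While k runs, two return addresses are live at the same time (one in f, one in g), which is why a single register cannot hold them.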

In summary, the internal state of a microprocessor is made up of its general registers, the program counter, the state register and the stack pointer. Note, however, that this is a highly simplified vision. There are many different varieties of microprocessors with different internal architectures and/or instruction sets (for example, some do not possess an integer division instruction). Thus, a program written directly using the instruction set of a microprocessor will not be executable using another model of microprocessor, and it will need to be rewritten. The portability of programs written in the assembly language of a given microprocessor is practically null. High-level languages respond to this problem by providing syntactic constructs, which are independent of the target microprocessors. The compiler or the interpreter has to translate these constructs into the language used by the microprocessor.

1.1.4. Peripheral devices

As we saw in section 1.1.3, processors execute a constant cycle of fetching, decoding and executing instructions. Computations are carried out using data stored in the memory, either by the program itself or by an input/output mechanism. The results of computations are also stored in the memory, and may be returned to users using this input/output mechanism.

The interest of any programmable system is inherently dependent on input/output capacities through which the system reacts to the external environment and may act on this environment. Even an assembly robot in a car factory, which repeats the same actions again and again, must react to data input from the environment. For example, the pressure of the grip mechanism must stop increasing once it has caught a bolt, and the time it takes to do this will differ depending on the exact position of the bolt.

Input/output systems operate using peripherals, ancillary devices that may be electronic, mechanical or a combination of the two. These allow the microprocessor to acquire external information, and to transmit information to the exterior. Computer mice, screens and keyboards are peripherals used with desktop computers, but other elements such as motors, analog/digital acquisition cards, etc. are also peripherals.

If peripherals are present, the microprocessor needs to devote part of its processing time to data acquisition and to the transmission of computed results. This interaction with peripherals may be directly integrated into programs. But in this case, the programs have to integrate regular checking of input peripherals to see if new information is available. It is technically difficult (if not impossible) to include such a monitoring in every program. Furthermore, regular peripheral checks are a waste of time and energy if no new data is available. Finally, there is no guarantee that information would arrive exactly at the moment of checking, as data may be asynchronously emitted.

This problem can be avoided by relying on the hardware to indicate the occurrence of new external events, instead of using software to check for these events. The interrupt mechanism is used to interrupt the execution of the current code and to launch the interrupt handler associated with the external event. This handler is a section of code, which is not explicitly called by the program being executed; it is located at an address known by the microprocessor. As any program may be interrupted at any point, the processor state, and notably the registers, must be saved before processing the interrupt. The code that is executed to process the interrupt will indeed use the registers and modify the SR, SP and PC. Therefore, previous values of registers must be restored in order to resume execution of the interrupted code. This context saving is carried out partially by the hardware and partially by the software.
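The need to save and restore the processor state around an interrupt can be illustrated with a very small model. Here the processor state is a Python dictionary, the handler is an ordinary function, and the register names, the address 0x1000 and the keyboard event are all hypothetical; a real machine splits this work between hardware and software, which the sketch does not attempt to reproduce.

```python
from typing import Callable, Dict, List

CpuState = Dict[str, int]


def service_interrupt(cpu: CpuState, handler: Callable[[CpuState], None]) -> None:
    saved = dict(cpu)   # save the context (PC, SP, SR and general registers)
    handler(cpu)        # the handler is free to overwrite any register
    cpu.clear()
    cpu.update(saved)   # restore the context so the interrupted code can resume


received: List[str] = []


def keyboard_handler(cpu: CpuState) -> None:
    cpu["pc"] = 0x1000      # hypothetical address of the handler code
    cpu["reg0"] = ord("A")  # the handler uses reg0 as scratch space
    received.append(chr(cpu["reg0"]))


cpu = {"pc": 42, "sp": 7, "sr": 0b0010, "reg0": 5}
service_interrupt(cpu, keyboard_handler)
print(cpu)       # {'pc': 42, 'sp': 7, 'sr': 2, 'reg0': 5}
print(received)  # ['A']
```

The interrupted program finds its registers exactly as it left them, while the effect of the handler (the received character) persists outside the saved context.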

1.2. Computers: a high-level view

The low-level vision of a von Neumann machine presented in section 1.1 provides a good overview of the components of a computer and of program execution, without going into detail concerning the operations of electronic components. However, this view is not particularly helpful in the context of everyday programming activity. Programs in binary code, or even assembly code, are difficult to write as they need to take account of every detail of execution; they are, by nature, long and hard to review, understand and debug. The first “high-level” programming languages emerged very shortly after the first computers. These languages assign names to certain values and addresses in the memory, providing a set of instructions that can be split into low-level machine instructions. In other terms, programming languages offer an abstract vision of the computer, enabling users to ignore low-level details while writing a program. The “hello world” program in Figure 1.2 clearly demonstrates the power of abstraction of C compared to the x86 assembly language.

1.2.1. Modeling computations

Any program is simply a description, in its own programming language, of a series of computations (including reading and writing), which are the only operations that a computer can carry out. An abstract view of a computer requires an abstract view (we call it a model) of the notion of computation. This subject was first addressed well before the emergence of computers, in the late 19th century, by logicians, mathematicians and philosophers, who introduced a range of different approaches to the theory of calculability.

The Turing machine [TUR95] is a mathematical model of computation introduced in 1936. This machine operates on an infinite memory tape divided into cells and has three instructions: move one cell of the tape right or left, write or read a symbol in the cell or compare the contents of two cells. It has been formally proven that any “imperative” programming language, featuring assignment, a conditional instruction and a while loop, has the same power of expression as this Turing machine.
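To make the model more concrete, here is a minimal simulator sketch. It assumes the usual textbook formulation with a transition table mapping (state, symbol read) to (symbol written, move, next state), which is not exactly the three-instruction presentation given above. The names `run_turing_machine` and `INVERT`, the state names and the blank symbol `_` are our own choices.

```python
from collections import defaultdict

def run_turing_machine(transitions, tape, state="start", blank="_", max_steps=10_000):
    """Run a one-tape machine until it reaches the state "halt"."""
    cells = defaultdict(lambda: blank, enumerate(tape))  # unbounded tape
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        symbol = cells[head]                               # read the current cell
        written, move, state = transitions[(state, symbol)]
        cells[head] = written                              # write a symbol
        head += 1 if move == "R" else -1                   # move right or left
        steps += 1
    if not cells:
        return ""
    used = range(min(cells), max(cells) + 1)
    return "".join(cells[i] for i in used).strip(blank)

# A machine that inverts every bit and halts on the first blank cell.
INVERT = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}
result = run_turing_machine(INVERT, "1011")
```

The step limit is a practical safeguard only: a Turing machine is not guaranteed to halt.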

Several other models of the notion of algorithmic computation were introduced over the course of the 20th century, and have been formally proven to be equivalent to the Turing machine. One notable example is Kleene’s recursion theory [KLE52], the basis for the “pure functional” languages, based on the notion of (potentially) recursive functions; hence, these languages also have the same power of expression as the Turing machine. Pure functional and imperative languages have developed in parallel throughout the history of high-level programming, leading to different programming styles.

1.2.2. High-level languages

Broadly speaking, the execution of a functional program carries out a series of function calls that lead to the result, with intermediate values stored exclusively in the registers. The execution of an imperative program carries out a sequence of modifications of memory cells named by identifiers, the values in the cells being computed during execution. The most widespread high-level languages include both functional and imperative features, along with various possibilities (modules, object features, etc.) to divide source code into pieces that can be reused.

Whatever the style of programming used, any program written in a high-level language needs to be translated into binary language to be executed. This translation is carried out either every time the program is executed, in which case the translation program is known as an interpreter, or just once, with the produced binary code being stored, in which case the translator is known as a compiler.

As we have seen, high-level languages facilitate the coding of algorithms. They ease reviewing of the source code of a program, as the text is more concise than it would be for the same algorithm in assembly code. This does not, however, imply that users gain a better understanding of the way the program works. To write a program, a precise knowledge of the constructs used (in other terms, their semantics: what they do and what they mean) is crucial to understand the source code. Bugs are not always the result of algorithm coding errors, and are often caused by an erroneous interpretation of elements of the language. For example, the incrementation operator ++ in C exists in two forms (i++ or ++i), and its understanding is not as simple as it may seem. Consider the program:

    #include <stdio.h>

    int main() {
        int i = 0;
        printf("%d\n", i++);
        return (0);
    }

This program will print 0, but if i++ is replaced with ++i, the same program will print 1.
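The difference between the two forms can be spelled out by modeling each of them as an explicit function. The following is an illustrative sketch, not C semantics in full: the variable is modeled as a one-element list, and the names `post_increment` and `pre_increment` are ours.

```python
def post_increment(cell):
    """Model of i++: the value of the expression is the OLD content of the cell."""
    old = cell[0]
    cell[0] = old + 1
    return old

def pre_increment(cell):
    """Model of ++i: the value of the expression is the NEW content of the cell."""
    cell[0] = cell[0] + 1
    return cell[0]

i = [0]
printed_post = post_increment(i)   # what printf("%d\n", i++) would print

j = [0]
printed_pre = pre_increment(j)     # what printf("%d\n", ++i) would print
```

In both cases the variable ends up holding 1; only the value passed to the printing function differs.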

There are a number of concepts that are common to all high-level languages: value naming, organization of namespaces, explicit memory management, etc. However, these concepts may be expressed using different syntactic constructs. The field of language semantics covers a set of logico-mathematical theories, which describe these concepts and their properties. Constructing the semantics of a program allows the formal verification of whether the program possesses all of the required properties.

1.2.3. From source code to executable programs

The transition from the program source to its execution is a multistep process. Some of these steps may differ in different languages. In this section, we shall give an overview of the main steps involved in analyzing and transforming source code, applicable to most programming languages.

The source code of a program is made up of one or more text files. Indeed, to ease software architecture, most languages allow source code to be split across several files, known as compilation units. Each file is processed separately prior to the final phase, in which the results of processing are combined into one single executable file.

1.2.3.1. Lexical analysis

Lexical analysis is the first phase of translation: it converts the sequence of characters that makes up the source file into a sequence of words, assigning each to a category. Comments are generally deleted at this stage. Thus, in the following text presumed to be written in C

    /* This is a comment. */ if [x == 3 int +) cos ($v)

lexical analysis will recognize the keyword if, the opening bracket, the identifier x, the operator ==, the integer constant 3, the type identifier int, etc. No word in C can contain the character $, so a lexical error will be highlighted when $v is encountered.

Lexical analysis may be seen as a form of “spell check”, in which each recognized word is assigned to a category (keyword, constant, identifier). These words are referred to as tokens.
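A lexical analyzer for a small fragment of C can be sketched with regular expressions. The token categories, the regular expressions and the function name `tokenize` below are our own simplifications, chosen to cover the example text only; a real C lexer handles many more cases.

```python
import re

# Each category is tried in order; COMMENT and SPACE are recognized, then dropped.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),
    ("SPACE",   r"\s+"),
    ("KEYWORD", r"\b(?:if|else|while|return)\b"),
    ("TYPE",    r"\b(?:int|char|float|double|void)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("INT",     r"\d+"),
    ("OP",      r"==|[-+*/=<>]"),
    ("PUNCT",   r"[()\[\]{};,]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC), re.DOTALL)

def tokenize(text):
    """Return the list of (category, word) pairs, or raise on an unknown character."""
    tokens, pos = [], 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"lexical error at position {pos}: {text[pos]!r}")
        if match.lastgroup not in ("COMMENT", "SPACE"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

source = "/* This is a comment. */ if [x == 3 int +) cos ($v)"
try:
    tokenize(source)
except SyntaxError as err:
    print(err)   # the character $ cannot start any token
```

Note that the ill-formed sequence `if [x == 3 int +)` is accepted at this stage: every word in it is a valid token, and only the `$` triggers a lexical error.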

1.2.3.2. Syntactic analysis

Every language follows a grammar. For example, in English, a sentence is generally considered to be correctly formed if it contains a subject, verb and complement in an understandable order. Programming languages are no exception: syntactic analysis verifies that the phrases of a source file conform with the grammar of their language. For example, in C, the keyword if must be followed by a bracketed expression, an instruction must end with a semicolon, etc. Clearly, the source text given in the example above in the context of lexical analysis does not respect the syntax of C.

Technically, the syntactic analyzer is in charge of the complete grammatical analysis of the source file. It calls the lexical analyzer every time it requires a token to progress through the analyzed source. Syntactic analysis is thus a form of grammar verification, and it also builds a representation of the source file by a data structure, which is often a tree, called the abstract syntax tree (AST). This data structure will be used by all the following phases of compilation, up to the point of execution by an interpreter or the creation of an executable file.
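As an illustration, the sketch below parses arithmetic expressions by recursive descent and builds an AST made of nested tuples. The grammar, the tuple representation and the names `Parser` and `tokenize` are assumptions made for this example; the lexical analysis is deliberately crude (unknown characters are silently skipped).

```python
import re

def tokenize(text):
    # Crude lexer: integers, identifiers, operators and parentheses only.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)

class Parser:
    """Grammar: expr := term (('+'|'-') term)* ; term := factor (('*'|'/') factor)* ;
    factor := INT | IDENT | '(' expr ')'"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next_token(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        tree = self.term()
        while self.peek() in ("+", "-"):
            op = self.next_token()
            tree = (op, tree, self.term())
        return tree

    def term(self):
        tree = self.factor()
        while self.peek() in ("*", "/"):
            op = self.next_token()
            tree = (op, tree, self.factor())
        return tree

    def factor(self):
        token = self.next_token()
        if token == "(":
            tree = self.expr()
            if self.next_token() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if token.isdigit():
            return ("int", int(token))
        if token[0].isalpha() or token[0] == "_":
            return ("var", token)
        raise SyntaxError(f"unexpected token {token!r}")

ast = Parser(tokenize("1 + 2 * x")).parse()
```

The tree obtained for `1 + 2 * x` places the multiplication below the addition, which reflects the usual operator priorities encoded in the grammar.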

1.2.3.3. Semantic analyses

The first two analysis phases of compilation only concern the textual structure of the source. They do not concern the meaning of the program, i.e. its semantics. Source texts that pass the syntactic analysis phase do not always have meaning. The phrase “the sea eats a derivable rabbit” is grammatically correct, but is evidently nonsense.

The best-known semantic analysis is the typing analysis, which prohibits the combination of elements that are incompatible in nature. Thus, in the previous phrase, “derivable” could be applicable to a function, but certainly not to a “rabbit”.

Semantic analyses do not reduce to a form of typing analysis but they all interpret the constructs of a program according to the semantics of the chosen language. Semantic analyses may be used to eliminate programs which lead to execution errors. They may also apply some transformations to program code in order to get an executable file (dependency analysis, closure elimination, etc.). These semantic analyses may be carried out during subsequent passes of source code processing, even after the code generation phase described in the following section.
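A typing analysis can be sketched on a tiny expression language with two types. The AST shape (nested tuples), the type names and the function `type_of` are hypothetical choices made for this example; they do not correspond to the type system of any particular language.

```python
def type_of(expr):
    """Return "int" or "bool" for a well-typed expression, raise TypeError otherwise."""
    kind = expr[0]
    if kind == "int":
        return "int"
    if kind == "bool":
        return "bool"
    if kind in ("+", "*"):
        left, right = type_of(expr[1]), type_of(expr[2])
        if left != "int" or right != "int":
            raise TypeError(f"operator {kind} expects two integers, got {left} and {right}")
        return "int"
    if kind == "==":
        left, right = type_of(expr[1]), type_of(expr[2])
        if left != right:
            raise TypeError(f"cannot compare {left} with {right}")
        return "bool"
    if kind == "if":
        if type_of(expr[1]) != "bool":
            raise TypeError("the condition of an if must be a boolean")
        then_type, else_type = type_of(expr[2]), type_of(expr[3])
        if then_type != else_type:
            raise TypeError("both branches of an if must have the same type")
        return then_type
    raise ValueError(f"unknown construct {kind!r}")

well_typed = ("if", ("==", ("int", 1), ("int", 2)), ("int", 10), ("+", ("int", 1), ("int", 1)))
ill_typed = ("+", ("int", 1), ("bool", True))   # syntactically fine, but meaningless
```

The second expression plays the role of the “derivable rabbit”: it is grammatically correct, yet it is rejected before any execution takes place.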

1.2.3.4. Code interpretation/generation

Once the abstract syntax tree (or a derived tree) has been created, there are two options. Either the tree may be executed directly via an interpreter, which is a program supplied by the programming language, or the AST is used to generate object code files, with the aim of creating an executable file that can be run independently. Let us first focus on the second approach. The interpretation mechanism will be discussed later.

Compilation uses the AST generated from the source file to produce a sequence of instructions to be executed either by the CPU or by a virtual machine (VM). The compilation is correct if the execution of this sequence of instructions gives a result that conforms to the program’s semantics.
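The correctness criterion stated above can be illustrated on arithmetic expressions. In the sketch below, `evaluate` plays the role of the reference semantics, `compile_expr` produces instructions for a made-up stack machine, and `run` executes them. The instruction names `PUSH` and `APPLY` and the tuple-based AST are assumptions of this example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree):
    """Reference semantics: compute the value directly on the AST."""
    if tree[0] == "int":
        return tree[1]
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))

def compile_expr(tree):
    """Translate the AST into a flat sequence of stack-machine instructions."""
    if tree[0] == "int":
        return [("PUSH", tree[1])]
    op, left, right = tree
    return compile_expr(left) + compile_expr(right) + [("APPLY", op)]

def run(code):
    """Execute the instruction sequence on a stack."""
    stack = []
    for instruction, argument in code:
        if instruction == "PUSH":
            stack.append(argument)
        else:  # APPLY: pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[argument](left, right))
    return stack.pop()

tree = ("-", ("int", 10), ("*", ("int", 2), ("int", 3)))   # 10 - 2 * 3
code = compile_expr(tree)
```

For this tree, running the generated code and evaluating the tree directly give the same value, which is what a correct compilation requires for every program.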

Optimization phases may take place during or after object code generation, with the aim of improving its compactness or its execution speed. Modern compilers implement a range of optimizations, whose study lies outside the scope of this book. Certain optimizations are “universal”, while others may be specific to the CPU for which the code is generated.

The object code produced by the compiler may be either binary code encoding instructions directly or source text in assembly code. In the latter case, a program known as the assembler must be called to transform this low-level source code into binary code. Generally speaking, assemblers simply produce a mechanical translation of instructions written mnemonically (mov, add, jmp, etc.) into binary representations. However, certain more sophisticated assemblers may also carry out optimization operations at this level.

Assembling mnemonic code into binary code is a very simple operation, which does not alter the structure of the program. The reference manual of the target CPU provides, for each instruction, the meaning of the bits of the corresponding binary word. For example, the reference manual for the MIPS32® architecture [MIP13] describes the 32-bit binary format of the instruction ADD rd, rs, rt (with the effect rd ← rs + rt on the registers) as:

Figure 1.4. Coding the ADD instruction in MIPS32®

Three packets of 5 bits are reserved for encoding the register numbers; the other bits in this word are fixed and encode the instruction. The task of the assembler is to generate such bit patterns according to the instructions encountered in the source code.
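The work of the assembler for this single instruction can be sketched as follows. The field layout used here is the standard MIPS32 R-type format (from the most significant bit: a 6-bit opcode, then rs, rt and rd on 5 bits each, a 5-bit shift amount and a 6-bit function code), with opcode 000000 and function code 100000 for ADD; readers should check it against the reference manual. The function name `encode_add` is ours.

```python
def encode_add(rd, rs, rt):
    """Return the 32-bit word encoding ADD rd, rs, rt (register numbers 0..31)."""
    for register in (rd, rs, rt):
        if not 0 <= register < 32:
            raise ValueError("register numbers must fit in 5 bits")
    opcode, shamt, funct = 0b000000, 0b00000, 0b100000   # fixed bits for ADD
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# ADD $t0, $t1, $t2 uses registers number 8, 9 and 10.
word = encode_add(rd=8, rs=9, rt=10)
print(f"{word:032b}")
```

Note that the order of the register fields in the binary word (rs, rt, rd) differs from their order in the mnemonic form (rd, rs, rt).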

1.2.3.5. Linking

A single program may be made up of several source files, compiled separately. Once the object code from each source file has been produced, all these codes must be collected into a single executable file. Each object file includes “holes”, indicating unknown information at the moment of production of this object code. It is important to know where to find this missing code, when calling functions defined in a different compilation unit, or where to find variables defined in a location outside of the current unit.

The linker has to gather all the object files and fill all the holes. Evidently, for a set of object files to lead to an executable file, all holes must be filled; so the code of every function called in the source must be available. The linking process also has to integrate the needed code, if it comes from some libraries, whether from the standard language library or a third-party library. There is one final question to answer, concerning the point at which execution should begin. In certain languages (such as C, C++ and Java), the source code must contain one, and only one, special function, often named main, which is called to start the execution. In other languages (such as Python and OCaml), definitions are executed in the order in which they appear, defined by the file ordering during the linking process. Thus, “executing” the definition of a function does not call the function: instead, the “value” of this function is created and stored to be used later when the function is called. This means that programmers have to insert into the source file a call to the function which they consider to be the “starting point” of the execution. This call is usually the final instruction of the last source file processed by the linker.
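The behavior described for Python can be observed directly. In the sketch below, the list `trace` and the function names are introduced only to record the order of events.

```python
trace = []

def helper():
    trace.append("helper called")

def main():
    # "Executing" this definition only creates the function value; nothing runs yet.
    trace.append("main called")
    helper()

trace.append("definitions executed")

# The starting point of the execution is an explicit call, placed last in the file.
main()
```

The name `main` has no special status in Python: it is the final call, not the name, that starts the computation.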

A simplified illustration of the different transformation passes involved in source code compilation is shown in Figure 1.5.

Figure 1.5. Compilation process

1.2.3.6. Interpretation and virtual machines

As we have seen, informally speaking, an interpreter “executes” a program directly from the AST. Furthermore, it was said that the code generation process may generate code for a virtual machine. In reality, interpreters rarely work directly on the tree; compilation to a virtual machine is often carried out as an intermediate stage. A virtual machine (VM) may be seen as a pseudo-microprocessor, with one or more stacks, registers and fairly high-level instructions. The code for a VM is often referred to as bytecode. In this case, compilation does not generate a file directly executable by the CPU. Execution is carried out by the virtual machine interpreter, a program supplied by the programming language environment. So, the difference between interpretation and compilation is not clear-cut.
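The standard implementation of Python (CPython) follows this scheme, and the intermediate stage can be observed from within the language. The sketch below uses the built-in functions `compile` and `eval` and the standard module `dis`; the exact instructions listed depend on the version of the interpreter, so no listing is reproduced here.

```python
import dis

# Compile an expression to a code object holding bytecode for the Python VM.
code = compile("(1 + x) * 2", "<example>", "eval")

dis.dis(code)                    # print a readable listing of the bytecode
raw_bytes = code.co_code         # the bytecode itself, as a sequence of bytes

# The virtual machine interpreter executes the bytecode, here with x bound to 4.
result = eval(code, {"x": 4})
```

The bytecode is not executable by the CPU: it only makes sense to the virtual machine interpreter that ships with the language.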

There are several advantages of using a VM: the compiler no longer needs to take the specificities of the CPU into account, the code is often more compact and portability is higher. As long as the executable file for the virtual machine interpreter is available on a computer, it will be possible to generate a binary file for the computer in question. The drawback to this approach is that the programs obtained in this way are often slower than programs compiled as “native” machine code.
