Database Systems. The Complete Book 2nd ed. Hector Garcia-Molina
Database Systems
The Complete Book
Second Edition
Garcia-Molina Ullman Widom
Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England and Associated Companies throughout the world
Visit us on the World Wide Web at: www.pearsoned.co.uk
© Pearson Education Limited 2014
ISBN 10: 1-292-02447-X
ISBN 13: 978-1-292-02447-9
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
Printed in the United States of America
1. The Worlds of Database Systems
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
2. The Relational Model of Data
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
3. Design Theory for Relational Databases
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
4. High-Level Database Models
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
5. Algebraic and Logical Query Languages
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
6. The Database Language SQL
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
7. Constraints and Triggers
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
8. Views and Indexes
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
9. SQL in a Server Environment
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
10. Advanced Topics in Relational Databases
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
11
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
12. Programming Languages for XML
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
13. Secondary Storage Management
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
15. Query Execution
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
16. The Query Compiler
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
17. Coping With System Failures
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
18. Concurrency Control
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
19. More About Transaction Management
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
20. Parallel and Distributed Databases
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
21. Information Integration
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
22. Database Systems and the Internet
Hector Garcia-Molina/Jeffrey Ullman/Jennifer Widom
The Worlds of Database Systems
Databases today are essential to every business. Whenever you visit a major Web site (Google, Yahoo!, Amazon.com, or thousands of smaller sites that provide information), there is a database behind the scenes serving up the information you request. Corporations maintain all their important records in databases. Databases are likewise found at the core of many scientific investigations. They represent the data gathered by astronomers, by investigators of the human genome, and by biochemists exploring properties of proteins, among many other scientific activities.
The power of databases comes from a body of knowledge and technology that has developed over several decades and is embodied in specialized software called a database management system, or DBMS, or more colloquially a “database system.” A DBMS is a powerful tool for creating and managing large amounts of data efficiently and allowing it to persist over long periods of time, safely. These systems are among the most complex types of software available.
1 The Evolution of Database Systems
What is a database? In essence a database is nothing more than a collection of information that exists over a long period of time, often many years. In common parlance, the term database refers to a collection of data that is managed by a DBMS. The DBMS is expected to:
1. Allow users to create new databases and specify their schemas (logical structure of the data), using a specialized data-definition language.
2. Give users the ability to query the data (a “query” is database lingo for a question about the data) and modify the data, using an appropriate language, often called a query language or data-manipulation language.
3. Support the storage of very large amounts of data (many terabytes or more) over a long period of time, allowing efficient access to the data for queries and database modifications.
4. Enable durability, the recovery of the database in the face of failures, errors of many kinds, or intentional misuse.
5. Control access to data from many users at once, without allowing unexpected interactions among users (called isolation) and without actions on the data to be performed partially but not completely (called atomicity).
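To make items (1) and (2) concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a DBMS. The table name Accounts, its columns, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained; a real system
# would store its data persistently.
conn = sqlite3.connect(":memory:")

# Item (1): a data-definition statement declares the schema.
conn.execute("CREATE TABLE Accounts (acct_no INTEGER PRIMARY KEY, balance INTEGER)")

# Item (2): data-manipulation statements modify and then query the data.
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100), (2, 250), (3, 40)])
conn.commit()

rich = conn.execute(
    "SELECT acct_no FROM Accounts WHERE balance >= 100 ORDER BY acct_no"
).fetchall()
print(rich)  # [(1,), (2,)]
```

The query says what is wanted (accounts holding at least 100) and leaves it to the system to decide how to find the matching rows.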
1.1 Early Database Management Systems
The first commercial database management systems appeared in the late 1960’s. These systems evolved from file systems, which provide some of item (3) above; file systems store data over a long period of time, and they allow the storage of large amounts of data. However, file systems do not generally guarantee that data cannot be lost if it is not backed up, and they don’t support efficient access to data items whose location in a particular file is not known.
Further, file systems do not directly support item (2), a query language for the data in files. Their support for (1), a schema for the data, is limited to the creation of directory structures for files. Item (4) is not always supported by file systems; you can lose data that has not been backed up. Finally, file systems do not satisfy (5). While they allow concurrent access to files by several users or processes, a file system generally will not prevent situations such as two users modifying the same file at about the same time, so the changes made by one user fail to appear in the file.
The first important applications of DBMS’s were ones where data was composed of many small items, and many queries or modifications were made. Examples of these applications are:
1. Banking systems: maintaining accounts and making sure that system failures do not cause money to disappear.
2. Airline reservation systems: these, like banking systems, require assurance that data will not be lost, and they must accept very large volumes of small actions by customers.
3. Corporate record keeping: employment and tax records, inventories, sales records, and a great variety of other types of information, much of it critical.
The early DBMS’s required the programmer to visualize data much as it was stored. These database systems used several different data models for describing the structure of the information in a database, chief among them the “hierarchical” or tree-based model and the graph-based “network” model. The latter was standardized in the late 1960’s through a report of CODASYL (Committee on Data Systems and Languages).¹
A problem with these early models and systems was that they did not support high-level query languages. For example, the CODASYL query language had statements that allowed the user to jump from data element to data element, through a graph of pointers among these elements. There was considerable effort needed to write such programs, even for very simple queries.
1.2 Relational Database Systems
Following a famous paper written by Ted Codd in 1970,² database systems changed significantly. Codd proposed that database systems should present the user with a view of data organized as tables called relations. Behind the scenes, there might be a complex data structure that allowed rapid response to a variety of queries. But, unlike the programmers for earlier database systems, the programmer of a relational system would not be concerned with the storage structure. Queries could be expressed in a very high-level language, which greatly increased the efficiency of database programmers. SQL (“Structured Query Language”) is the most important query language based on the relational model.
By 1990, relational database systems were the norm. Yet the database field continues to evolve, and new issues and approaches to the management of data surface regularly. Object-oriented features have infiltrated the relational model. Some of the largest databases are organized rather differently from those using relational methodology. In the balance of this section, we shall consider some of the modern trends in database systems.
1.3 Smaller and Smaller Systems
Originally, DBMS’s were large, expensive software systems running on large computers. The size was necessary, because to store a gigabyte of data required a large computer system. Today, hundreds of gigabytes fit on a single disk, and it is quite feasible to run a DBMS on a personal computer. Thus, database systems based on the relational model have become available for even very small machines, and they are beginning to appear as a common tool for computer applications, much as spreadsheets and word processors did before them.
Another important trend is the use of documents, often tagged using XML (eXtensible Markup Language). Large collections of small documents can serve as a database, and the methods of querying and manipulating them are different from those used in relational systems.
1 CODASYL Data Base Task Group April 1971 Report, ACM, New York.
2 Codd, E. F., “A relational model of data for large shared data banks,” Comm. ACM, 13:6, pp. 377–387, 1970.
1.4 Bigger and Bigger Systems
On the other hand, a gigabyte is not that much data any more. Corporate databases routinely store terabytes (10^12 bytes). Yet there are many databases that store petabytes (10^15 bytes) of data and serve it all to users. Some important examples:
1. Google holds petabytes of data gleaned from its crawl of the Web. This data is not held in a traditional DBMS, but in specialized structures optimized for search-engine queries.
2. Satellites send down petabytes of information for storage in specialized systems.
3. A picture is actually worth way more than a thousand words. You can store 1000 words in five or six thousand bytes. Storing a picture typically takes much more space. Repositories such as Flickr store millions of pictures and support search of those pictures. Even a database like Amazon’s has millions of pictures of products to serve.
4. And if still pictures consume space, movies consume much more. An hour of video requires at least a gigabyte. Sites such as YouTube hold hundreds of thousands, or millions, of movies and make them available easily.
5. Peer-to-peer file-sharing systems use large networks of conventional computers to store and distribute data of various kinds. Although each node in the network may only store a few hundred gigabytes, together the database they embody is enormous.
1.5 Information Integration
To a great extent, the old problem of building and maintaining databases has become one of information integration: joining the information contained in many related databases into a whole. For example, a large company has many divisions. Each division may have built its own database of products or employee records independently of other divisions. Perhaps some of these divisions used to be independent companies, which naturally had their own way of doing things. These divisions may use different DBMS’s and different structures for information. They may use different terms to mean the same thing or the same term to mean different things. To make matters worse, the existence of legacy applications using each of these databases makes it almost impossible to scrap them, ever.
As a result, it has become necessary with increasing frequency to build structures on top of existing databases, with the goal of integrating the information distributed among them. One popular approach is the creation of data warehouses, where information from many legacy databases is copied periodically, with the appropriate translation, to a central database. Another approach is the implementation of a mediator, or “middleware,” whose function is to support an integrated model of the data of the various databases, while translating between this model and the actual models used by each database.
2 Overview of a Database Management System
In Fig. 1 we see an outline of a complete DBMS. Single boxes represent system components, while double boxes represent in-memory data structures. The solid lines indicate control and data flow, while dashed lines indicate data flow only. Since the diagram is complicated, we shall consider the details in several stages. First, at the top, we suggest that there are two distinct sources of commands to the DBMS:
1. Conventional users and application programs that ask for data or modify data.
2. A database administrator: a person or persons responsible for the structure or schema of the database.
2.1 Data-Definition Language Commands
The second kind of command is the simpler to process, and we show its trail beginning at the upper right side of Fig. 1. For example, the database administrator, or DBA, for a university registrar’s database might decide that there should be a table or relation with columns for a student, a course the student has taken, and a grade for that student in that course. The DBA might also decide that the only allowable grades are A, B, C, D, and F. This structure and constraint information is all part of the schema of the database. It is shown in Fig. 1 as entered by the DBA, who needs special authority to execute schema-altering commands, since these can have profound effects on the database. These schema-altering data-definition language (DDL) commands are parsed by a DDL processor and passed to the execution engine, which then goes through the index/file/record manager to alter the metadata, that is, the schema information for the database.
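The registrar example can be sketched with Python's built-in sqlite3 module. The table name Enrollment, the column names, and the sample students are assumptions for illustration; only the three columns and the allowed grades A, B, C, D, and F come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A DDL command: the structure and the grade constraint both become
# part of the schema.
conn.execute("""
    CREATE TABLE Enrollment (
        student TEXT,
        course  TEXT,
        grade   TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
    )
""")

conn.execute("INSERT INTO Enrollment VALUES ('Sally', 'CS101', 'A')")

# A grade outside the allowed list violates the constraint and is refused.
try:
    conn.execute("INSERT INTO Enrollment VALUES ('Jim', 'CS101', 'E')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# SQLite keeps the schema as metadata that can itself be queried.
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'Enrollment'"
).fetchone()[0]
print(rejected)  # True
```

SQLite has no separate DBA authority, so the sketch does not model the special privileges mentioned above.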
2.2 Overview of Query Processing
The great majority of interactions with the DBMS follow the path on the left side of Fig. 1. A user or an application program initiates some action, using the data-manipulation language (DML). This command does not affect the schema of the database, but may affect the content of the database (if the action is a modification command) or will extract data from the database (if the action is a query). DML statements are handled by two separate subsystems, as follows.
[Figure 1: Database management system components. Labeled boxes: User/application, Database administrator, Query compiler, DDL compiler, Transaction manager, Execution engine, Index/file/record manager, Logging and recovery, Concurrency control, Lock table, Buffer manager, Buffers, Storage manager, Storage.]
Answering the Query
The query is parsed and optimized by a query compiler. The resulting query plan, or sequence of actions the DBMS will perform to answer the query, is passed to the execution engine. The execution engine issues a sequence of requests for small pieces of data, typically records or tuples of a relation, to a resource manager that knows about data files (holding relations), the format and size of records in those files, and index files, which help find elements of data files quickly.
The requests for data are passed to the buffer manager. The buffer manager’s task is to bring appropriate portions of the data from secondary storage (disk), where it is kept permanently, to the main-memory buffers. Normally, the page or “disk block” is the unit of transfer between buffers and disk.
The buffer manager communicates with a storage manager to get data from disk. The storage manager might involve operating-system commands, but more typically, the DBMS issues commands directly to the disk controller.
Transaction Processing
Queries and other DML actions are grouped into transactions, which are units that must be executed atomically and in isolation from one another. Any query or modification action can be a transaction by itself. In addition, the execution of transactions must be durable, meaning that the effect of any completed transaction must be preserved even if the system fails in some way right after completion of the transaction. We divide the transaction processor into two major parts:
1. A concurrency-control manager, or scheduler, responsible for assuring atomicity and isolation of transactions, and
2. A logging and recovery manager, responsible for the durability of transactions.
2.3 Storage and Buffer Management
The data of a database normally resides in secondary storage; in today’s computer systems “secondary storage” generally means magnetic disk. However, to perform any useful operation on data, that data must be in main memory. It is the job of the storage manager to control the placement of data on disk and its movement between disk and main memory.
In a simple database system, the storage manager might be nothing more than the file system of the underlying operating system. However, for efficiency purposes, DBMS’s normally control storage on the disk directly, at least under some circumstances. The storage manager keeps track of the location of files on the disk and obtains the block or blocks containing a file on request from the buffer manager.
The buffer manager is responsible for partitioning the available main memory into buffers, which are page-sized regions into which disk blocks can be transferred. Thus, all DBMS components that need information from the disk will interact with the buffers and the buffer manager, either directly or through the execution engine. The kinds of information that various components may need include:
1. Data: the contents of the database itself.
2. Metadata: the database schema that describes the structure of, and constraints on, the database.
3. Log Records: information about recent changes to the database; these support durability of the database.
4. Statistics: information gathered and stored by the DBMS about data properties such as the sizes of, and values in, various relations or other components of the database.
5. Indexes: data structures that support efficient access to the data.
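The division of labor between buffers and disk can be illustrated with a toy buffer manager. This is only a sketch under stated assumptions: the class name BufferManager, the use of a dictionary to stand in for the disk, and the least-recently-used replacement policy are all choices made for this example and are not prescribed by the text. Writing modified pages back to disk is omitted.

```python
from collections import OrderedDict

class BufferManager:
    """Toy buffer pool: a fixed number of page-sized buffers in front of a 'disk'."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict: page id -> page contents (stand-in for disk)
        self.capacity = capacity      # number of buffers available in main memory
        self.buffers = OrderedDict()  # page id -> contents, least recently used first
        self.disk_reads = 0           # number of block transfers from disk

    def get_page(self, page_id):
        if page_id in self.buffers:
            # Page is already buffered: no disk access is needed.
            self.buffers.move_to_end(page_id)
            return self.buffers[page_id]
        if len(self.buffers) >= self.capacity:
            # All buffers are in use: evict the least recently used page.
            self.buffers.popitem(last=False)
        self.disk_reads += 1
        self.buffers[page_id] = self.disk[page_id]
        return self.buffers[page_id]

disk = {0: "data page", 1: "index page", 2: "log page"}
bm = BufferManager(disk, capacity=2)
for pid in [0, 1, 0, 2, 0, 1]:
    bm.get_page(pid)
print(bm.disk_reads)  # 4
```

Of the six requests, two are served from the buffers; the other four each cost a transfer from the simulated disk.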
2.4 Transaction Processing
It is normal to group one or more database operations into a transaction, which is a unit of work that must be executed atomically and in apparent isolation from other transactions. In addition, a DBMS offers the guarantee of durability: that the work of a completed transaction will never be lost. The transaction manager therefore accepts transaction commands from an application, which tell the transaction manager when transactions begin and end, as well as information about the expectations of the application (some may not wish to require atomicity, for example). The transaction processor performs the following tasks:
1. Logging: In order to assure durability, every change in the database is logged separately on disk. The log manager follows one of several policies designed to assure that no matter when a system failure or “crash” occurs, a recovery manager will be able to examine the log of changes and restore the database to some consistent state. The log manager initially writes the log in buffers and negotiates with the buffer manager to make sure that buffers are written to disk (where data can survive a crash) at appropriate times.
2. Concurrency control: Transactions must appear to execute in isolation. But in most systems, there will in truth be many transactions executing at once. Thus, the scheduler (concurrency-control manager) must assure that the individual actions of multiple transactions are executed in such an order that the net effect is the same as if the transactions had in fact executed in their entirety, one at a time. A typical scheduler does its work by maintaining locks on certain pieces of the database. These locks prevent two transactions from accessing the same piece of data in ways that interact badly. Locks are generally stored in a main-memory lock table, as suggested by Fig. 1. The scheduler affects the execution of queries and other database operations by forbidding the execution engine from accessing locked parts of the database.
3. Deadlock resolution: As transactions compete for resources through the locks that the scheduler grants, they can get into a situation where none can proceed because each needs something another transaction has. The transaction manager has the responsibility to intervene and cancel (“rollback” or “abort”) one or more transactions to let the others proceed.
The ACID Properties of Transactions
Properly implemented transactions are commonly said to meet the “ACID test,” where:
• “A” stands for “atomicity,” the all-or-nothing execution of transactions.
• “I” stands for “isolation,” the fact that each transaction must appear to be executed as if no other transaction is executing at the same time.
• “D” stands for “durability,” the condition that the effect on the database of a transaction must never be lost, once the transaction has completed.
The remaining letter, “C,” stands for “consistency.” That is, all databases have consistency constraints, or expectations about relationships among data elements (e.g., account balances may not be negative after a transaction finishes). Transactions are expected to preserve the consistency of the database.
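Atomicity and consistency can be observed in a small experiment with Python's built-in sqlite3 module. The Accounts table, the transfer function, and the amounts are assumptions for illustration; the rule that balances may not be negative is the consistency constraint mentioned above. The sketch runs a single connection, so it does not exercise isolation or deadlock resolution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (acct_no INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE acct_no = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE acct_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative; the credit is undone too.
        return False

ok = transfer(conn, 1, 2, 30)       # succeeds: balances become 70 and 80
failed = transfer(conn, 1, 2, 500)  # aborted: balances stay 70 and 80
balances = conn.execute("SELECT balance FROM Accounts ORDER BY acct_no").fetchall()
print(ok, failed, balances)  # True False [(70,), (80,)]
```

In the second call the credit to account 2 has already been applied when the debit fails, yet after the rollback neither change is visible: the transaction executed all-or-nothing.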
2.5 The Query Processor
The portion of the DBMS that most affects the performance that the user sees is the query processor. In Fig. 1 the query processor is represented by two components:
The Relational Model of Data
This chapter introduces the most important model of data: the two-dimensional table, or “relation.” We begin with an overview of data models in general. We give the basic terminology for relations and show how the model can be used to represent typical forms of data. We then introduce a portion of the language SQL: that part used to declare relations and their structure. The chapter closes with an introduction to relational algebra. We see how this notation serves as both a query language (the aspect of a data model that enables us to ask questions about the data) and as a constraint language (the aspect of a data model that lets us restrict the data in the database in various ways).
1 An Overview of Data Models
The notion of a “data model” is one of the most fundamental in the study of database systems. In this brief summary of the concept, we define some basic terminology and mention the most important data models.
1.1 What is a Data Model?
A data model is a notation for describing data or information. The description generally consists of three parts:
1. Structure of the data. You may be familiar with tools in programming languages such as C or Java for describing the structure of the data used by a program: arrays and structures (“structs”) or objects, for example. The data structures used to implement data in the computer are sometimes referred to, in discussions of database systems, as a physical data model, although in fact they are far removed from the gates and electrons that truly serve as the physical implementation of the data. In the database world, data models are at a somewhat higher level than data structures, and are sometimes referred to as a conceptual model to emphasize the difference in level. We shall see examples shortly.
2. Operations on the data. In programming languages, operations on the data are generally anything that can be programmed. In database data models, there is usually a limited set of operations that can be performed. We are generally allowed to perform a limited set of queries (operations that retrieve information) and modifications (operations that change the database). This limitation is not a weakness, but a strength. By limiting operations, it is possible for programmers to describe database operations at a very high level, yet have the database management system implement the operations efficiently. In comparison, it is generally impossible to optimize programs in conventional languages like C to the extent that an inefficient algorithm (e.g., bubblesort) is replaced by a more efficient one (e.g., quicksort).
3. Constraints on the data. Database data models usually have a way to describe limitations on what the data can be. These constraints can range from the simple (e.g., “a day of the week is an integer between 1 and 7” or “a movie has at most one title”) to some very complex limitations.
1.2 Important Data Models
Today, the two data models of preeminent importance for database systems are:
1. The relational model, including object-relational extensions.
2. The semistructured-data model, including XML and related standards.
The first, which is present in all commercial database management systems, is the subject of this chapter. The semistructured model, of which XML is the primary manifestation, is an added feature of most relational DBMS’s, and appears in a number of other contexts as well.
1.3 The Relational Model in Brief
The relational model is based on tables, of which Fig. 1 is an example. We shall discuss this model beginning in Section 2. This relation, or table, describes movies: their title, the year in which they were made, their length in minutes, and the genre of the movie. We show three particular movies, but you should imagine that there are many more rows to this table, perhaps one row for each movie ever made.
The structure portion of the relational model might appear to resemble an array of structs in C, where the column headers are the field names, and each of the rows represents the values of one struct in the array. However, it must be emphasized that this physical implementation is only one possible way the table could be implemented in physical data structures. In fact, it is not the normal way to represent relations, and a large portion of the study of database systems addresses the right ways to implement such tables. Much of the distinction comes from the scale of relations: they are not normally implemented as main-memory structures, and their proper physical implementation must take into account the need to access relations of very large size that are resident on disk.

title                 year   length   genre
Gone With the Wind    1939   231      drama
Star Wars             1977   124      sciFi
Wayne’s World         1992   95       comedy

Figure 1: An example relation

The operations normally associated with the relational model form the “relational algebra,” which we discuss beginning in Section 4. These operations are table-oriented. As an example, we can ask for all those rows of a relation that have a certain value in a certain column. For example, we can ask of the table in Fig. 1 for all the rows where the genre is “comedy.”
The constraint portion of the relational data model will be touched upon briefly in Section 5. However, as a brief sample of what kinds of constraints are generally used, we could decide that there is a fixed list of genres for movies, and that the last column of every row must have a value that is on this list. Or we might decide (incorrectly, it turns out) that there could never be two movies with the same title, and constrain the table so that no two rows could have the same string in the first component.
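The sample operation and the two sample constraints can be sketched in a few lines of Python over the three rows of Fig. 1. The function names select and satisfies_constraints, and the particular list of allowed genres, are hypothetical and chosen only for this illustration; storing the relation as a Python list is likewise a convenience, not a claim about how a DBMS stores tables.

```python
ATTRS = ("title", "year", "length", "genre")

movies = [
    ("Gone With the Wind", 1939, 231, "drama"),
    ("Star Wars", 1977, 124, "sciFi"),
    ("Wayne's World", 1992, 95, "comedy"),
]

def select(rows, attr, value):
    """Return the rows that have the given value in the given column."""
    i = ATTRS.index(attr)
    return [row for row in rows if row[i] == value]

# Hypothetical fixed list of genres for the first sample constraint.
ALLOWED_GENRES = {"drama", "sciFi", "comedy"}

def satisfies_constraints(rows):
    """Check both sample constraints: genres from a fixed list, titles unique."""
    genres_ok = all(row[3] in ALLOWED_GENRES for row in rows)
    titles = [row[0] for row in rows]
    titles_unique = len(titles) == len(set(titles))
    return genres_ok and titles_unique

comedies = select(movies, "genre", "comedy")
print(comedies)  # [("Wayne's World", 1992, 95, 'comedy')]
```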
1.4 The Semistructured Model in Brief
Semistructured data resembles trees or graphs, rather than tables or arrays. The principal manifestation of this viewpoint today is XML, a way to represent data by hierarchically nested tagged elements. The tags, similar to those used in HTML, define the role played by different pieces of data, much as the column headers do in the relational model. For example, the same data as in Fig. 1 might appear in an XML “document” as in Fig. 2.
The operations on semistructured data usually involve following paths in the implied tree from an element to one or more of its nested subelements, then to subelements nested within those, and so on. For example, starting at the outer <Movies> element (the entire document in Fig. 2), we might move to each of its nested <Movie> elements, each delimited by the tag <Movie> and matching </Movie> tag, and from each <Movie> element to its nested <Genre> element, to see which movies belong to the “comedy” genre.
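The path-following just described can be tried with Python's standard xml.etree.ElementTree module. The document string below reproduces the movie data of Fig. 2; the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

doc = """
<Movies>
  <Movie title="Gone With the Wind">
    <Year>1939</Year><Length>231</Length><Genre>drama</Genre>
  </Movie>
  <Movie title="Star Wars">
    <Year>1977</Year><Length>124</Length><Genre>sciFi</Genre>
  </Movie>
  <Movie title="Wayne's World">
    <Year>1992</Year><Length>95</Length><Genre>comedy</Genre>
  </Movie>
</Movies>
"""

root = ET.fromstring(doc.strip())  # the outer <Movies> element

# From <Movies>, move to each nested <Movie>, and from there to its <Genre>.
comedies = [
    movie.get("title")
    for movie in root.findall("Movie")
    if movie.findtext("Genre") == "comedy"
]
print(comedies)  # ["Wayne's World"]
```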
<Movies>
    <Movie title="Gone With the Wind">
        <Year>1939</Year>
        <Length>231</Length>
        <Genre>drama</Genre>
    </Movie>
    <Movie title="Star Wars">
        <Year>1977</Year>
        <Length>124</Length>
        <Genre>sciFi</Genre>
    </Movie>
    <Movie title="Wayne’s World">
        <Year>1992</Year>
        <Length>95</Length>
        <Genre>comedy</Genre>
    </Movie>
</Movies>

Figure 2: Movie data as XML

Constraints on the structure of data in this model often involve the data type of values associated with a tag. For instance, are the values associated with the <Length> tag integers or can they be arbitrary character strings? Other constraints determine which tags can appear nested within which other tags. For example, must each <Movie> element have a <Length> element nested within it? What other tags, besides those shown in Fig. 2, might be used within a <Movie> element? Can there be more than one genre for a movie?
1.5 Other Data Models
There are many other models that are, or have been, associated with DBMS’s. A modern trend is to add object-oriented features to the relational model. There are two effects of object-orientation on relations:
1. Values can have structure, rather than being elementary types such as integers or strings, as they were in Fig. 1.
2. Relations can have associated methods.
In a sense, these extensions, called the object-relational model, are analogous to the way structs in C were extended to objects in C++.
There are even database models of the purely object-oriented kind. In these, the relation is no longer the principal data-structuring concept, but becomes only one option among many structures.
There are several other models that were used in some of the earlier DBMS’s, but that have now fallen out of use. The hierarchical model was, like semistructured data, a tree-oriented model. Its drawback was that, unlike more modern models, it really operated at the physical level, which made it impossible for programmers to write code at a conveniently high level. Another such model was the network model, which was a graph-oriented, physical-level model. In truth, both the hierarchical model and today’s semistructured models allow full graph structures, and do not limit us strictly to trees. However, the generality of graphs was built directly into the network model, rather than favoring trees as these other models do.
1.6 Comparison of Modeling Approaches
Even from our brief example, it appears that semistructured models have more flexibility than relations. This difference becomes even more apparent when we discuss, as we shall, how full graph structures are embedded into tree-like, semistructured models. Nevertheless, the relational model is still preferred in DBMS’s, and we should understand why. A brief argument follows.
Because databases are large, efficiency of access to data and efficiency of modifications to that data are of great importance. Also very important is ease of use, that is, the productivity of programmers who use the data. Surprisingly, both goals can be achieved with a model, particularly the relational model, that:
1. Provides a simple, limited approach to structuring data, yet is reasonably versatile, so anything can be modeled.
2. Provides a limited, yet useful, collection of operations on data.
Together, these limitations turn into features. They allow us to implement languages, such as SQL, that enable the programmer to express their wishes at a very high level. A few lines of SQL can do the work of thousands of lines of C, or hundreds of lines of the code that had to be written to access data under earlier models such as network or hierarchical. Yet the short SQL programs, because they use a strongly limited set of operations, can be optimized to run as fast as, or faster than, the code written in alternative languages.
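The contrast between a declarative query and hand-written access code can be seen on a very small scale with Python's built-in sqlite3 module. The query chosen here (titles of movies longer than 100 minutes, ordered by year) is an assumption for illustration, and on three rows it says nothing about relative speed; the point is only that the SQL version states the result wanted while the loop spells out how to compute it.

```python
import sqlite3

rows = [
    ("Gone With the Wind", 1939, 231, "drama"),
    ("Star Wars", 1977, 124, "sciFi"),
    ("Wayne's World", 1992, 95, "comedy"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (title TEXT, year INTEGER, length INTEGER, genre TEXT)")
conn.executemany("INSERT INTO Movies VALUES (?, ?, ?, ?)", rows)

# Declarative: state the result; the system chooses how to compute it.
sql_result = conn.execute(
    "SELECT title FROM Movies WHERE length > 100 ORDER BY year"
).fetchall()

# Procedural: the programmer fixes the sorting and filtering steps by hand.
loop_result = []
for title, year, length, genre in sorted(rows, key=lambda r: r[1]):
    if length > 100:
        loop_result.append((title,))

print(sql_result == loop_result)  # True
```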
2 Basics of the Relational Model
The relational model gives us a single way to represent data: as a two-dimensional table called a relation. Figure 1, which we copy here as Fig. 3, is an example of a relation, which we shall call Movies. The rows each represent a movie, and the columns each represent a property of movies. In this section, we shall introduce the most important terminology regarding relations, and illustrate it with the Movies relation.

title                 year   length   genre
Gone With the Wind    1939   231      drama
Star Wars             1977   124      sciFi
Wayne’s World         1992   95       comedy

Figure 3: The relation Movies
2.1 Attributes
The columns of a relation are named by attributes; in Fig. 3 the attributes are title, year, length, and genre. Attributes appear at the tops of the columns. Usually, an attribute describes the meaning of entries in the column below. For instance, the column with attribute length holds the length, in minutes, of each movie.
2.2 Schemas
The name of a relation and the set of attributes for a relation are together called the schema for that relation. We show the schema for the relation with the relation name followed by a parenthesized list of its attributes. Thus, the schema for relation Movies of Fig. 3 is
Movies(title, year, length, genre)
The attributes in a relation schema are a set, not a list. However, in order to talk about relations we often must specify a “standard” order for the attributes. Thus, whenever we introduce a relation schema with a list of attributes, as above, we shall take this ordering to be the standard order whenever we display the relation or any of its rows.
In the relational model, a database consists of one or more relations. The set of schemas for the relations of a database is called a relational database schema, or just a database schema.
2.3 Tuples
The rows of a relation, other than the header row containing the attribute names, are called tuples. A tuple has one component for each attribute of the relation. For instance, the first of the three tuples in Fig. 3 has the four components Gone With the Wind, 1939, 231, and drama for attributes title, year, length, and genre, respectively. When we wish to write a tuple in isolation, not as part of a relation, we normally use commas to separate components, and we use parentheses to surround the tuple. For example,
(Gone With the Wind, 1939, 231, drama)
is the first tuple of Fig. 3. Notice that when a tuple appears in isolation, the attributes do not appear, so some indication of the relation to which the tuple belongs must be given. We shall always use the order in which the attributes were listed in the relation schema.
Conventions for Relations and Attributes
We shall generally follow the convention that relation names begin with a capital letter, and attribute names begin with a lower-case letter. However, later we shall talk of relations in the abstract, where the names of attributes do not matter. In that case, we shall use single capital letters for both relations and attributes, e.g., R(A, B, C) for a generic relation with three attributes.
2.4 Domains
The relational model requires that each component of each tuple be atomic; that is, it must be of some elementary type such as integer or string. It is not permitted for a value to be a record structure, set, list, array, or any other type that reasonably can have its values broken into smaller components.
It is further assumed that associated with each attribute of a relation is a domain, that is, a particular elementary type. Each component of any tuple of the relation must hold a value that belongs to the domain of the corresponding column. For example, tuples of the Movies relation of Fig. 3 must have a first component that is a string, second and third components that are integers, and a fourth component whose value is a string.
It is possible to include the domain, or data type, for each attribute in a relation schema. We shall do so by appending a colon and a type after attributes. For example, we could represent the schema for the Movies relation as:
Movies(title:string, year:integer, length:integer, genre:string)
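As a rough illustration of domains, the following Python sketch checks that a tuple has one atomic component per attribute and that each component belongs to the declared domain. The names MOVIES_SCHEMA and conforms are our own, and mapping the domains string and integer to Python's str and int is an assumption made only for this example.

```python
# Hypothetical sketch: a schema as a list of (attribute, domain) pairs,
# kept in the "standard" attribute order of the schema.
MOVIES_SCHEMA = [("title", str), ("year", int), ("length", int), ("genre", str)]


def conforms(tup, schema):
    """Return True if tup has one component per attribute, each in its domain."""
    if len(tup) != len(schema):
        return False
    # Exact type comparison rejects lists, sets, and other non-atomic values.
    return all(type(value) is domain for value, (_, domain) in zip(tup, schema))


good = ("Gone With the Wind", 1939, 231, "drama")
bad = ("Star Wars", "1977", 124, "sciFi")          # year is a string, not an integer
nested = ("Wayne's World", 1992, [95], "comedy")   # a list is not an atomic value

print(conforms(good, MOVIES_SCHEMA))
print(conforms(bad, MOVIES_SCHEMA))
print(conforms(nested, MOVIES_SCHEMA))
```

Only the first tuple passes: the second has a component outside its domain, and the third has a component that is not atomic.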
2.5 Equivalent Representations of a Relation
Relations are sets of tuples, not lists of tuples. Thus the order in which the tuples of a relation are presented is immaterial. For example, we can list the three tuples of Fig. 3 in any of their six possible orders, and the relation is “the same” as Fig. 3.
Moreover, we can reorder the attributes of the relation as we choose, without changing the relation. However, when we reorder the relation schema, we must be careful to remember that the attributes are column headers. Thus, when we change the order of the attributes, we also change the order of their columns. When the columns move, the components of tuples change their order as well. The result is that each tuple has its components permuted in the same way as the attributes are permuted.
For example, Fig. 4 shows one of the many relations that could be obtained from Fig. 3 by permuting rows and columns. These two relations are considered “the same.” More precisely, these two tables are different presentations of the same relation.
year    genre     title                 length
1977    sciFi     Star Wars             124
1992    comedy    Wayne’s World         95
1939    drama     Gone With the Wind    231
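The equivalence of presentations can be sketched in Python, assuming we model a tuple as a set of (attribute, value) pairs and a relation as a set of such tuples. The helper name relation is hypothetical, and the row order used for the Fig. 3 presentation is an assumption; the values are those of Fig. 4.

```python
def relation(attributes, rows):
    """Order-independent form: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(zip(attributes, row)) for row in rows)


# One presentation, in the standard attribute order of the schema.
fig3 = relation(
    ("title", "year", "length", "genre"),
    [
        ("Gone With the Wind", 1939, 231, "drama"),
        ("Star Wars", 1977, 124, "sciFi"),
        ("Wayne's World", 1992, 95, "comedy"),
    ],
)

# Another presentation: rows and columns permuted.
fig4 = relation(
    ("year", "genre", "title", "length"),
    [
        (1977, "sciFi", "Star Wars", 124),
        (1992, "comedy", "Wayne's World", 95),
        (1939, "drama", "Gone With the Wind", 231),
    ],
)

same = fig3 == fig4
print(same)
```

Because each value stays attached to its attribute, permuting the columns together with their headers leaves the relation unchanged, while the use of sets makes the row order irrelevant.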
2.6 Relation Instances
A relation about movies is not static; rather, relations change over time. We expect to insert tuples for new movies, as these appear. We also expect changes to existing tuples if we get revised or corrected information about a movie, and perhaps deletion of tuples for movies that are expelled from the database for some reason.
It is less common for the schema of a relation to change. However, there are situations where we might want to add or delete attributes. Schema changes, while possible in commercial database systems, can be very expensive, because each of perhaps millions of tuples needs to be rewritten to add or delete components. Also, if we add an attribute, it may be difficult or even impossible to generate appropriate values for the new component in the existing tuples.
We shall call a set of tuples for a given relation an instance of that relation. For example, the three tuples shown in Fig. 3 form an instance of relation Movies. Presumably, the relation Movies has changed over time and will continue to change over time. For instance, in 1990, Movies did not contain the tuple for Wayne’s World. However, a conventional database system maintains only one version of any relation: the set of tuples that are in the relation “now.” This instance of the relation is called the current instance.¹
¹ Databases that maintain historical versions of data as it existed in past times are called temporal databases.
Figure 4: Another presentation of the relation Movies
2.7 Keys of Relations
There are many constraints on relations that the relational model allows us to place on database schemas. One kind of constraint is so fundamental that we shall introduce it here: key constraints. A set of attributes forms a key for a relation if we do not allow two tuples in a relation instance to have the same values in all the attributes of the key.
Example 1: We can declare that the relation Movies has a key consisting of the two attributes title and year. That is, we don’t believe there could ever be two movies that had both the same title and the same year. Notice that title by itself does not form a key, since sometimes “remakes” of a movie appear. For example, there are three movies named King Kong, each made in a different year. It should also be obvious that year by itself is not a key, since there are usually many movies made in the same year. ✷
We indicate the attribute or attributes that form a key for a relation by underlining the key attribute(s). For instance, the Movies relation could have its schema written as:
Movies(title, year, length, genre)
       -----  ----
Remember that the statement that a set of attributes forms a key for a relation is a statement about all possible instances of the relation, not a statement about a single instance. For example, looking only at the tiny relation of Fig. 3, we might imagine that genre by itself forms a key, since we do not see two tuples that agree on the value of their genre components. However, we can easily imagine that if the relation instance contained more movies, there would be many dramas, many comedies, and so on. Thus, there would be distinct tuples that agreed on the genre component. As a consequence, it would be incorrect to assert that genre is a key for the relation Movies.
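The difference between a key and a set of attributes that merely happens to be unique in one instance can be sketched as follows. The function is_key_for_instance is a hypothetical helper, and the fourth tuple is invented purely for illustration; the test it performs says something only about the instance it is given, never about all possible instances.

```python
ATTRIBUTES = ("title", "year", "length", "genre")


def is_key_for_instance(attributes, key, rows):
    """True if no two tuples in rows agree on every attribute in key."""
    positions = [attributes.index(a) for a in key]
    seen = set()
    for row in rows:
        projection = tuple(row[i] for i in positions)
        if projection in seen:
            return False
        seen.add(projection)
    return True


small = [
    ("Gone With the Wind", 1939, 231, "drama"),
    ("Star Wars", 1977, 124, "sciFi"),
    ("Wayne's World", 1992, 95, "comedy"),
]
# Hypothetical extra movie (made-up values) that shares a genre with an existing tuple.
larger = small + [("Another Drama", 2000, 100, "drama")]

print(is_key_for_instance(ATTRIBUTES, ["genre"], small))
print(is_key_for_instance(ATTRIBUTES, ["genre"], larger))
print(is_key_for_instance(ATTRIBUTES, ["title", "year"], larger))
```

On the three-tuple instance genre passes the test, but a single additional drama refutes it, whereas title and year together still distinguish every tuple. Passing such a test can never establish a key; it can only fail to refute one.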
While we might be sure that title and year can serve as a key for Movies, many real-world databases use artificial keys, doubting that it is safe to make any assumption about the values of attributes outside their control. For example, companies generally assign employee ID’s to all employees, and these ID’s are carefully chosen to be unique numbers. One purpose of these ID’s is to make sure that in the company database each employee can be distinguished from all others, even if there are several employees with the same name. Thus, the employee-ID attribute can serve as a key for a relation about employees.
In US corporations, it is normal for every employee to have a Social-Security number. If the database has an attribute that is the Social-Security number, then this attribute can also serve as a key for employees. Note that there is nothing wrong with there being several choices of key, as there would be for employees having both employee ID’s and Social-Security numbers.
The idea of creating an attribute whose purpose is to serve as a key is quite widespread. In addition to employee ID’s, we find student ID’s to distinguish students in a university. We find drivers’ license numbers and automobile registration numbers to distinguish drivers and automobiles, respectively. You undoubtedly can find more examples of attributes created for the primary purpose of serving as keys.
Movies(
    title:string,
    year:integer,
    length:integer,
    genre:string,
    studioName:string,
    producerC#:integer
)
MovieStar(
    name:string,
    address:string,
    gender:char,
    birthdate:date
)
StarsIn(
    movieTitle:string,
    movieYear:integer,
    starName:string
)
MovieExec(
    name:string,
    address:string,
    cert#:integer,
    netWorth:integer
)
Studio(
    name:string,
    address:string,
    presC#:integer
)
Figure 5: Example database schema about movies
2.8 An Example Database Schema
We shall close this section with an example of a complete database schema. The topic is movies, and it builds on the relation Movies that has appeared so far in examples. The database schema is shown in Fig. 5. Here are the things we need to know to understand the intention of this schema.
Movies
This relation is an extension of the example relation we have been discussing so far. Remember that its key is title and year together. We have added two new attributes; studioName tells us the studio that owns the movie, and producerC# is an integer that represents the producer of the movie in a way that we shall discuss when we talk about the relation MovieExec below.
MovieStar
This relation tells us something about stars. The key is name, the name of the movie star. It is not usual to assume names of persons are unique and therefore suitable as a key. However, movie stars are different; one would never take a name that some other movie star had used. Thus, we shall use the convenient fiction that movie-star names are unique. A more conventional approach would be to invent a serial number of some sort, like social-security numbers, so that we could assign each individual a unique number and use that attribute as the key. We take that approach for movie executives, as we shall see. Another interesting point about the MovieStar relation is that we see two new data types. The gender can be a single character, M or F. Also, birthdate is of type “date,” which might be a character string of a special form.
StarsIn
This relation connects movies to the stars of that movie, and likewise connects a star to the movies in which they appeared. Notice that movies are represented by the key for Movies (the title and year), although we have chosen different attribute names to emphasize that attributes movieTitle and movieYear represent the movie. Likewise, stars are represented by the key for MovieStar, with the attribute called starName. Finally, notice that all three attributes are necessary to form a key. It is perfectly reasonable to suppose that relation StarsIn could have two distinct tuples that agree in any two of the three attributes. For instance, a star might appear in two movies in one year, giving rise to two tuples that agreed in movieYear and starName, but disagreed in movieTitle.
MovieExec
This relation tells us about movie executives. It contains their name, address, and net worth as data about the executive. However, for a key we have invented “certificate numbers” for all movie executives, including producers (as appear in the relation Movies) and studio presidents (as appear in the relation Studio, below). These are integers; a different one is assigned to each executive.
Studio
This relation tells about movie studios. We rely on no two studios having the same name, and therefore use name as the key. The other attributes are the address of the studio and the certificate number for the president of the studio. We assume that the studio president is surely a movie executive and therefore appears in MovieExec.
2.9 Exercises for Section 2
Exercise 2.1: In Fig. 6 are instances of two relations that might constitute part of a banking database. Indicate the following:
a) The attributes of each relation.
b) The tuples of each relation.
c) The components of one tuple from each relation.
d) The relation schema for each relation.
e) The database schema.
f) A suitable domain for each attribute.
g) Another equivalent way to present each relation.
acctNo    type        balance
12345     savings     12000
23456     checking    1000
34567     savings     25
The relation Accounts
firstName    lastName    idNo       account
Robbie       Banks       901-222    12345
Lena         Hand        805-333    12345
Lena         Hand        805-333    23456
The relation Customers
Figure 6: Two relations of a banking database
Exercise 2.2: In Section 2.7 we suggested that there are many examples of attributes that are created for the purpose of serving as keys of relations. Give some additional examples.
!! Exercise 2.3: How many different ways (considering orders of tuples and attributes) are there to represent a relation instance if that instance has:
a) Three attributes and three tuples, like the relation Accounts of Fig. 6?
b) Four attributes and five tuples?
c) n attributes and m tuples?
3 Defining a Relation Schema in SQL
SQL (pronounced “sequel”) is the principal language used to describe and manipulate relational databases. There is a current standard for SQL, called SQL-99. Most commercial database management systems implement something similar, but not identical to, the standard. There are two aspects to SQL:
1. The Data-Definition sublanguage for declaring database schemas and
2. The Data-Manipulation sublanguage for querying (asking questions about) databases and for modifying the database.
The distinction between these two sublanguages is found in most languages; e.g., C or Java have portions that declare data and other portions that are executable code. These correspond to data-definition and data-manipulation, respectively.
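To make the two sublanguages concrete, here is a minimal sketch that runs one data-definition statement and two data-manipulation statements through Python's standard sqlite3 module. SQLite is used only because it needs no setup; it is one of the many systems that implement something similar to, but not identical to, the standard, and the column types chosen here are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition: declare the schema of a relation.
conn.execute(
    "CREATE TABLE Movies (title TEXT, year INTEGER, length INTEGER, genre TEXT)"
)

# Data manipulation: modify the database ...
conn.execute("INSERT INTO Movies VALUES ('Star Wars', 1977, 124, 'sciFi')")

# ... and query it.
rows = conn.execute(
    "SELECT title, length FROM Movies WHERE year = 1977"
).fetchall()
print(rows)

conn.close()
```

The CREATE TABLE statement plays the role of a declaration in C or Java, while the INSERT and SELECT statements correspond to executable code.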
In this section we shall begin a discussion of the data-definition portion of SQL.
3.1 Relations in SQL
SQL makes a distinction between three kinds of relations:
1. Stored relations, which are called tables. These are the kind of relation we deal with ordinarily: a relation that exists in the database and that can be modified by changing its tuples, as well as queried.
2. Views, which are relations defined by a computation. These relations are not stored, but are constructed, in whole or in part, when needed.
3. Temporary tables, which are constructed by the SQL language processor when it performs its job of executing queries and data modifications. These relations are then thrown away and not stored.
In this section, we shall learn how to declare tables. We do not treat the declaration and definition of views here, and temporary tables are never declared. The SQL CREATE TABLE statement declares the schema for a stored relation. It gives a name for the table, its attributes, and their data types. It also allows us to declare a key, or even several keys, for a relation. There are many other features to the CREATE TABLE statement, including many forms of constraints that can be declared, and the declaration of indexes (data structures that speed up many operations on the table) but we shall leave those for the appropriate time.
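As a preview, the following sketch declares Movies with a key and shows the system refusing a second tuple that agrees on both key attributes. It uses Python's sqlite3 module; the string lengths 100 and 10 are arbitrary assumptions, and SQLite treats declared types more loosely than the standard requires, so this should be read as an illustration and not as a definitive declaration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Declare the table name, its attributes, their data types, and a key.
conn.execute(
    """
    CREATE TABLE Movies (
        title  VARCHAR(100),
        year   INT,
        length INT,
        genre  VARCHAR(10),
        PRIMARY KEY (title, year)
    )
    """
)

conn.execute("INSERT INTO Movies VALUES ('Star Wars', 1977, 124, 'sciFi')")

try:
    # Same title and year as the tuple above, so the key constraint is violated.
    conn.execute("INSERT INTO Movies VALUES ('Star Wars', 1977, 121, 'sciFi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM Movies").fetchone()[0]
print(duplicate_rejected, count)

conn.close()
```

The second insertion fails because the declared key forbids two tuples with the same title and year, so the table still holds a single tuple.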
3.2 Data Types
To begin, let us introduce the primitive data types that are supported by SQL systems. All attributes must have a data type.
1. Character strings of fixed or varying length. The type CHAR(n) denotes a fixed-length string of up to n characters. VARCHAR(n) also denotes a string of up to n characters. The difference is implementation-dependent; typically CHAR implies that short strings are padded to make n characters, while VARCHAR implies that an endmarker or string-length is used. SQL permits reasonable coercions between values of character-string types. Normally, a string is padded by trailing blanks if it becomes the value of a component that is a fixed-length string of greater length. For example, the string 'foo',² if it became the value of a component for an attribute of type CHAR(5), would assume the value 'foo  ' (with two blanks following the second o).
2. Bit strings of fixed or varying length. These strings are analogous to fixed and varying-length character strings, but their values are strings of bits rather than characters. The type BIT(n) denotes bit strings of length n, while BIT VARYING(n) denotes bit strings of length up to n.
3. The type BOOLEAN denotes an attribute whose value is logical. The possible values of such an attribute are TRUE, FALSE, and (although it would surprise George Boole) UNKNOWN.
4. The type INT or INTEGER (these names are synonyms) denotes typical integer values. The type SHORTINT (spelled SMALLINT in the SQL standard) also denotes integers, but the number of bits permitted may be less, depending on the implementation (as with the types int and short int in C).
² Notice that in SQL, strings are surrounded by single-quotes, not double-quotes as in many other programming languages.