
This book has been compiled by Manthan Dave from Wikipedia Encyclopedia for Dixplore.


Programming Language Mentor of your Computer

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Thu, 29 Apr 2010 01:21:04 UTC


Contents

Programming language
Computer software

History
    History of programming languages

Types of programming languages
    Low-level programming language
    High-level programming language

Software Programming Languages
    Machine code
    Assembly language
    BASIC
    C (programming language)
    C++
    Perl
    Fortran
    C Sharp (programming language)
    Java (programming language)
    .NET Framework

Some Software Programming Concepts
    Computer programming
    Algorithm
    Computer data processing
    Thread (computer science)
    Parallel computing

Web Programming Languages
    HTML
    Web 2.0
    PHP
    Active Server Pages

References
    Article Sources and Contributors
    Image Sources, Licenses and Contributors

Article Licenses
    License


Programming language

A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine, to express algorithms precisely, or as a mode of human communication.

Many programming languages have some form of written specification of their syntax (form) and semantics (meaning). Some languages are defined by a specification document; for example, the C programming language is specified by an ISO standard. Other languages, such as Perl, have a dominant implementation that is used as a reference.

The earliest programming languages predate the invention of the computer, and were used to direct the behavior of machines such as Jacquard looms and player pianos. Thousands of different programming languages have been created, mainly in the computer field, with many more being created every year. Most programming languages describe computation in an imperative style, i.e., as a sequence of commands, although some languages, such as those that support functional programming or logic programming, use alternative forms of description.

Definitions

A programming language is a notation for writing programs, which are specifications of a computation or algorithm.[1] Some, but not all, authors restrict the term "programming language" to those languages that can express all possible algorithms.[1][2] Traits often considered important for what constitutes a programming language include:

• Function and target: A computer programming language is a language[3] used to write computer programs, which involve a computer performing some kind of computation[4] or algorithm and possibly controlling external devices such as printers, disk drives, robots,[5] and so on. For example, PostScript programs are frequently created by another program to control a computer printer or display. More generally, a programming language may describe computation on some, possibly abstract, machine. It is generally accepted that a complete specification for a programming language includes a description, possibly idealized, of a machine or processor for that language.[6] In most practical contexts, a programming language involves a computer; consequently, programming languages are usually defined and studied this way.[7] Programming languages differ from natural languages in that natural languages are used only for interaction between people, while programming languages also allow humans to communicate instructions to machines.

• Abstractions: Programming languages usually contain abstractions for defining and manipulating data structures or controlling the flow of execution. The practical necessity that a programming language support adequate abstractions is expressed by the abstraction principle;[8] this principle is sometimes formulated as a recommendation to the programmer to make proper use of such abstractions.[9]

• Expressive power: The theory of computation classifies languages by the computations they are capable of expressing. All Turing complete languages can implement the same set of algorithms.
ANSI/ISO SQL and Charity are examples of languages that are not Turing complete, yet are often called programming languages.[10][11] Markup languages like XML, HTML, or troff, which define structured data, are not generally considered programming languages.[12][13][14] Programming languages may, however, share syntax with markup languages if a computational semantics is defined. XSLT, for example, is a Turing complete XML dialect.[15][16][17] Moreover, LaTeX, which is mostly used for structuring documents, also contains a Turing complete subset.[18][19]

The term computer language is sometimes used interchangeably with programming language.[20] However, the usage of both terms varies among authors, including the exact scope of each. One usage describes programming languages as a subset of computer languages.[21] In this vein, languages used in computing that have a different goal than expressing computer programs are generically designated computer languages. For instance, markup languages are sometimes referred to as computer languages to emphasize that they are not meant to be used for programming.[22] Another usage regards programming languages as theoretical constructs for programming abstract machines, and computer languages as the subset thereof that runs on physical computers, which have finite hardware resources.[23]

John C. Reynolds emphasizes that formal specification languages are just as much programming languages as are the languages intended for execution. He also argues that textual and even graphical input formats that affect the behavior of a computer are programming languages, despite the fact that they are commonly not Turing complete, and remarks that ignorance of programming language concepts is the reason for many flaws in input formats.[24]
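The definition above, of a program as a notation that precisely specifies a computation or algorithm, can be illustrated with a short sketch in Python (any Turing-complete language would serve; the function below is an invented example, not drawn from the text):

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm, written as an unambiguous, machine-executable
    specification of a computation."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(48, 18))  # -> 6
```

Unlike a natural-language description ("repeatedly divide and keep the remainder"), every step here is defined precisely enough for a machine to carry out without interpretation.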

Design and implementation

Programming languages share properties with natural languages related to their purpose as vehicles for communication, having a syntactic form separate from their semantics, and showing language families of related languages branching one from another.[3] But as artificial constructs, they also differ in fundamental ways from languages that have evolved through usage. A significant difference is that a programming language can be fully described and studied in its entirety, since it has a precise and finite definition.[25] By contrast, natural languages have changing meanings given by their users in different communities. While constructed languages are also artificial languages designed from the ground up with a specific purpose, they lack the precise and complete semantic definition that a programming language has.

Many languages have been designed from scratch, altered to meet new needs, combined with other languages, and eventually fallen into disuse. Although there have been attempts to design one "universal" programming language that serves all purposes, all of them have failed to be generally accepted as filling this role.[26] The need for diverse programming languages arises from the diversity of contexts in which languages are used:

• Programs range from tiny scripts written by individual hobbyists to huge systems written by hundreds of programmers.
• Programmers range in expertise from novices, who need simplicity above all else, to experts, who may be comfortable with considerable complexity.
• Programs must balance speed, size, and simplicity on systems ranging from microcontrollers to supercomputers.
• Programs may be written once and not change for generations, or they may undergo continual modification.
• Finally, programmers may simply differ in their tastes: they may be accustomed to discussing problems and expressing them in a particular language.
One common trend in the development of programming languages has been to add the ability to solve problems at ever-higher levels of abstraction. The earliest programming languages were tied very closely to the underlying hardware of the computer. As new programming languages have developed, features have been added that let programmers express ideas that are more remote from simple translation into underlying hardware instructions. Because programmers are less tied to the complexity of the computer, their programs can do more computing with less effort from the programmer. This lets them write more functionality per unit of time.[27]

Natural-language processors have been proposed as a way to eliminate the need for a specialized language for programming. However, this goal remains distant and its benefits are open to debate. Edsger W. Dijkstra took the position that the use of a formal language is essential to prevent the introduction of meaningless constructs, and dismissed natural-language programming as "foolish".[28] Alan Perlis was similarly dismissive of the idea.[29]

A language's designers and users must construct a number of artifacts that govern and enable the practice of programming. The most important of these artifacts are the language specification and implementation.


Specification

The specification of a programming language is intended to provide a definition that the language users and the implementors can use to determine whether the behavior of a program is correct, given its source code. A programming language specification can take several forms, including the following:

• An explicit definition of the syntax, static semantics, and execution semantics of the language. While syntax is commonly specified using a formal grammar, semantic definitions may be written in natural language (e.g., as in the C language) or in a formal semantics (e.g., as in the Standard ML[30] and Scheme[31] specifications).
• A description of the behavior of a translator for the language (e.g., the C++ and Fortran specifications). The syntax and semantics of the language have to be inferred from this description, which may be written in a natural or a formal language.
• A reference or model implementation, sometimes written in the language being specified (e.g., Prolog or ANSI REXX[32]). The syntax and semantics of the language are explicit in the behavior of the reference implementation.
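As a toy illustration of the first form of specification, consider an invented miniature language of arithmetic expressions: its syntax is given by a small formal grammar, and its execution semantics is made explicit by an interpreter that follows the grammar rule for rule. This sketch (in Python) is illustrative only, not any real language's specification:

```python
# Toy grammar (illustrative):
#   expr   ::= term (('+' | '-') term)*
#   term   ::= factor (('*' | '/') factor)*
#   factor ::= NUMBER | '(' expr ')'
import re

def tokenize(src):
    return re.findall(r'\d+|[()+\-*/]', src)

def evaluate(src):
    tokens = tokenize(src)
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def next_tok():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok
    def expr():                       # expr ::= term (('+' | '-') term)*
        value = term()
        while peek() in ('+', '-'):
            value = value + term() if next_tok() == '+' else value - term()
        return value
    def term():                       # term ::= factor (('*' | '/') factor)*
        value = factor()
        while peek() in ('*', '/'):
            value = value * factor() if next_tok() == '*' else value / factor()
        return value
    def factor():                     # factor ::= NUMBER | '(' expr ')'
        if peek() == '(':
            next_tok()
            value = expr()
            next_tok()                # consume ')'
            return value
        return int(next_tok())
    return expr()

print(evaluate("2*(3+4)"))  # -> 14
```

Each function mirrors one production of the grammar, so the program's behavior doubles as an executable statement of the language's semantics, the idea behind reference implementations.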

Implementation

An implementation of a programming language provides a way to execute programs in that language on one or more configurations of hardware and software. There are, broadly, two approaches to programming language implementation: compilation and interpretation. It is generally possible to implement a language using either technique.

The output of a compiler may be executed by hardware or by a program called an interpreter. In some implementations that make use of the interpreter approach, there is no distinct boundary between compiling and interpreting. For instance, some implementations of BASIC compile and then execute the source a line at a time.

Programs that are executed directly on the hardware usually run several orders of magnitude faster than those that are interpreted in software. One technique for improving the performance of interpreted programs is just-in-time compilation: just before execution, the virtual machine translates the blocks of bytecode that are going to be used to machine code, for direct execution on the hardware.
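CPython itself is a convenient illustration of how blurred the compile/interpret boundary can be: source text is first compiled to bytecode, which a virtual machine then interprets (a just-in-time compiler, as in PyPy, would additionally translate frequently executed bytecode to machine code). A minimal sketch using Python's built-in `compile` and `exec`:

```python
source = "result = sum(i * i for i in range(10))"

code = compile(source, "<example>", "exec")  # translate source text to a bytecode object
namespace = {}
exec(code, namespace)                        # the bytecode is interpreted by the VM

print(namespace["result"])  # -> 285
```

The same two-stage shape, translate then execute, underlies both "compiled" and "interpreted" implementations; what differs is how far the translation goes and when it happens.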

Usage

Thousands of different programming languages have been created, mainly in the computing field.[33] Programming languages differ from most other forms of human expression in that they require a greater degree of precision and completeness. When using a natural language to communicate with other people, human authors and speakers can be ambiguous and make small errors, and still expect their intent to be understood. However, figuratively speaking, computers "do exactly what they are told to do", and cannot "understand" what code the programmer intended to write. The combination of the language definition, a program, and the program's inputs must fully specify the external behavior that occurs when the program is executed, within the domain of control of that program.

A programming language provides a structured mechanism for defining pieces of data, and the operations or transformations that may be carried out automatically on that data. A programmer uses the abstractions present in the language to represent the concepts involved in a computation. These concepts are represented as a collection of the simplest elements available, called primitives.[34] Programming is the process by which programmers combine these primitives to compose new programs, or adapt existing ones to new uses or a changing environment.

Programs for a computer might be executed in a batch process without human interaction, or a user might type commands in an interactive session of an interpreter. In this case the "commands" are simply programs whose execution is chained together. When a language is used to give commands to a software application (such as a shell), it is called a scripting language.
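The composition of primitives into larger programs described above can be sketched in miniature; the operations and names below are invented for illustration:

```python
# Two primitive operations over data.
def double(x):
    return 2 * x

def increment(x):
    return x + 1

# A combining form: build a new operation out of existing ones.
def compose(f, g):
    return lambda x: f(g(x))

double_then_increment = compose(increment, double)
print(double_then_increment(5))  # (5 * 2) + 1 -> 11
```

Programming, in this view, is the repeated application of such combinations until the primitives the language supplies have been assembled into the behavior the programmer needs.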


Measuring language usage It is difficult to determine which programming languages are most widely used, and what usage means varies by context. One language may occupy the greater number of programmer hours, a different one have more lines of code, and a third utilize the most CPU time. Some languages are very popular for particular kinds of applications. For example, COBOL is still strong in the corporate data center, often on large mainframes; FORTRAN in engineering applications; C in embedded applications and operating systems; and other languages are regularly used to write many different kinds of applications. Various methods of measuring language popularity, each subject to a different bias over what is measured, have been proposed: • counting the number of job advertisements that mention the language[35] • the number of books sold that teach or describe the language[36] • estimates of the number of existing lines of code written in the language—which may underestimate languages not often found in public searches[37] • counts of language references (i.e., to the name of the language) found using a web search engine. Combining and averaging information from various internet sites, langpop.com claims that [38] in 2008 the 10 most cited programming languages are (in alphabetical order): C, C++, C#, Java, JavaScript, Perl, PHP, Python, Ruby, and SQL.
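The counting methods listed above all amount to frequency counts over some corpus. A naive sketch in Python (the corpus and language list are invented; real measurements must cope with ambiguous names such as "C", which this substring match would find inside "C++"):

```python
from collections import Counter

# Hypothetical stand-in for a corpus of job advertisements.
ads = [
    "Wanted: Java developer with SQL experience",
    "C++ engineer; knowledge of Python a plus",
    "Python and SQL data analyst",
]

languages = ["C++", "Java", "Python", "SQL"]

counts = Counter()
for ad in ads:
    for lang in languages:
        if lang in ad:  # naive substring match; biased, like every metric above
            counts[lang] += 1

print(dict(counts))  # e.g. {'Java': 1, 'SQL': 2, 'C++': 1, 'Python': 2}
```

Each real-world method inherits a bias from its corpus (job ads, book sales, public code, web pages), which is why the article stresses that no single measure settles which language is "most used".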

Taxonomies

There is no overarching classification scheme for programming languages. A given programming language does not usually have a single ancestor language. Languages commonly arise by combining the elements of several predecessor languages with new ideas in circulation at the time. Ideas that originate in one language will diffuse throughout a family of related languages, and then leap suddenly across familial gaps to appear in an entirely different family.

The task is further complicated by the fact that languages can be classified along multiple axes. For example, Java is both an object-oriented language (because it encourages object-oriented organization) and a concurrent language (because it contains built-in constructs for running multiple threads in parallel). Python is an object-oriented scripting language.

In broad strokes, programming languages divide into programming paradigms and a classification by intended domain of use. Traditionally, programming languages have been regarded as describing computation in terms of imperative sentences, i.e. issuing commands. These are generally called imperative programming languages. A great deal of research in programming languages has been aimed at blurring the distinction between a program as a set of instructions and a program as an assertion about the desired answer, which is the main feature of declarative programming.[39] More refined paradigms include procedural programming, object-oriented programming, functional programming, and logic programming; some languages are hybrids of paradigms or multi-paradigmatic. An assembly language is not so much a paradigm as a direct model of an underlying machine architecture.
By purpose, programming languages might be considered general purpose, system programming languages, scripting languages, domain-specific languages, or concurrent/distributed languages (or a combination of these).[40] Some general purpose languages were designed largely with educational goals.[41]

A programming language may also be classified by factors unrelated to programming paradigm. For instance, most programming languages use English-language keywords, while a minority do not. Other languages may be classified as being esoteric or not.
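The imperative/declarative contrast drawn above can be made concrete in a multi-paradigm language such as Python. Both fragments below compute the same value, but the first issues commands that mutate state step by step, while the second is an expression describing the desired answer:

```python
numbers = [1, 2, 3, 4, 5]

# Imperative style: a sequence of commands updating mutable state.
total = 0
for n in numbers:
    if n % 2 == 0:
        total += n * n

# Functional (more declarative) style: an expression describing the result.
total_declarative = sum(n * n for n in numbers if n % 2 == 0)

print(total, total_declarative)  # -> 20 20
```

That a single language accommodates both styles illustrates why classification by paradigm, like the other axes discussed, rarely assigns a language to exactly one category.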



History

Early developments

The first programming languages predate the modern computer. The 19th century had "programmable" looms and player piano scrolls which implemented what are today recognized as examples of domain-specific languages. By the beginning of the twentieth century, punch cards encoded data and directed mechanical processing. In the 1930s and 1940s, the formalisms of Alonzo Church's lambda calculus and Alan Turing's Turing machines provided mathematical abstractions for expressing algorithms; the lambda calculus remains influential in language design.[42]

[Image: A selection of textbooks that teach programming, in languages both popular and obscure. These are only a few of the thousands of programming languages and dialects that have been designed in history.]

In the 1940s, the first electrically powered digital computers were created. The first high-level programming language to be designed for a computer was Plankalkül, developed for the German Z3 by Konrad Zuse between 1943 and 1945; however, it was not implemented until 1998 and 2000.[43]

Programmers of early 1950s computers, notably UNIVAC I and IBM 701, used machine language programs, that is, the first generation of programming languages (1GL). 1GL programming was quickly superseded by similarly machine-specific but mnemonic second generation languages (2GL), known as assembly languages or "assembler". Later in the 1950s, assembly language programming, which had evolved to include the use of macro instructions, was followed by the development of "third generation" programming languages (3GL), such as FORTRAN, LISP, and COBOL.[44] 3GLs are more abstract and are "portable", or at least implemented similarly on computers that do not support the same native machine code. Updated versions of all of these 3GLs are still in general use, and each has strongly influenced the development of later languages.[45] At the end of the 1950s, the language formalized as ALGOL 60 was introduced, and most later programming languages are, in many respects, descendants of ALGOL.[45] The format and use of the early programming languages were heavily influenced by the constraints of the interface.[46]

Refinement

The period from the 1960s to the late 1970s brought the development of the major language paradigms now in use, though many aspects were refinements of ideas in the very first third-generation programming languages:

• APL introduced array programming and influenced functional programming.[47]
• PL/I (NPL) was designed in the early 1960s to incorporate the best ideas from FORTRAN and COBOL.
• In the 1960s, Simula was the first language designed to support object-oriented programming; in the mid-1970s, Smalltalk followed with the first "purely" object-oriented language.
• C was developed between 1969 and 1973 as a system programming language, and remains popular.[48]
• Prolog, designed in 1972, was the first logic programming language.
• In 1978, ML built a polymorphic type system on top of Lisp, pioneering statically typed functional programming languages.

Each of these languages spawned an entire family of descendants, and most modern languages count at least one of them in their ancestry. The 1960s and 1970s also saw considerable debate over the merits of structured programming, and whether programming languages should be designed to support it.[49] Edsger Dijkstra, in a famous 1968 letter published in the Communications of the ACM, argued that GOTO statements should be eliminated from all "higher level" programming languages.[50]


The 1960s and 1970s also saw the expansion of techniques that reduced the footprint of a program and improved the productivity of the programmer and user. The card deck for an early 4GL was much smaller than a deck with the same functionality expressed in a 3GL.

Consolidation and growth

The 1980s were years of relative consolidation. C++ combined object-oriented and systems programming. The United States government standardized Ada, a systems programming language derived from Pascal and intended for use by defense contractors. In Japan and elsewhere, vast sums were spent investigating so-called "fifth generation" languages that incorporated logic programming constructs.[51] The functional languages community moved to standardize ML and Lisp. Rather than inventing new paradigms, all of these movements elaborated upon the ideas invented in the previous decade.

One important trend in language design during the 1980s was an increased focus on programming for large-scale systems through the use of modules, or large-scale organizational units of code. Modula-2, Ada, and ML all developed notable module systems in the 1980s, although other languages, such as PL/I, already had extensive support for modular programming. Module systems were often wedded to generic programming constructs.[52]

The rapid growth of the Internet in the mid-1990s created opportunities for new languages. Perl, originally a Unix scripting tool first released in 1987, became common in dynamic websites. Java came to be used for server-side programming. These developments were not fundamentally novel; rather, they were refinements to existing languages and paradigms, and largely based on the C family of programming languages.

Programming language evolution continues, in both industry and research. Current directions include security and reliability verification, new kinds of modularity (mixins, delegates, aspects), and database integration such as Microsoft's LINQ.

The 4GLs are examples of languages which are domain-specific, such as SQL, which manipulates and returns sets of data rather than the scalar values which are canonical to most programming languages.
Perl, for example, can hold multiple 4GL programs, as well as multiple JavaScript programs, within its own Perl code by means of 'here documents', and can use variable interpolation in the here document to support multi-language programming.[53]
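Perl's here documents have rough analogs in other languages. As a hedged sketch, a Python triple-quoted string can likewise embed a 4GL program (here SQL) inside host code and interpolate host-language variables into it (the table name and data are invented, and interpolating into SQL this way invites injection in real programs):

```python
import sqlite3

table = "users"  # host-language variable interpolated into the embedded 4GL program

query = f"""
SELECT name, COUNT(*) AS n
FROM {table}
GROUP BY name
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("ada",), ("ada",), ("grace",)])

rows = sorted(conn.execute(query).fetchall())
print(rows)  # -> [('ada', 2), ('grace', 1)]
```

The embedded SQL manipulates and returns a set of rows, while the surrounding host code deals in scalar values and objects, the division of labor the paragraph above describes.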

See also

• Comparison of programming languages (basic instructions)
• Comparison of programming languages
• Computer programming
• Computer science and Outline of computer science
• Educational programming language
• Invariant based programming
• Lists of programming languages
• List of programming language researchers
• Literate programming
• Dialect (computing)
• Programming language theory
• Pseudocode
• Scientific language
• Software engineering and List of software engineering topics


Further reading

• Daniel P. Friedman, Mitchell Wand, Christopher Thomas Haynes: Essentials of Programming Languages, The MIT Press 2001.
• David Gelernter, Suresh Jagannathan: Programming Linguistics, The MIT Press 1990.
• Shriram Krishnamurthi: Programming Languages: Application and Interpretation, online publication.[54]
• Bruce J. MacLennan: Principles of Programming Languages: Design, Evaluation, and Implementation, Oxford University Press 1999.
• John C. Mitchell: Concepts in Programming Languages, Cambridge University Press 2002.
• Benjamin C. Pierce: Types and Programming Languages, The MIT Press 2002.
• Ravi Sethi: Programming Languages: Concepts and Constructs, 2nd ed., Addison-Wesley 1996.
• Michael L. Scott: Programming Language Pragmatics, Morgan Kaufmann Publishers 2005.
• Richard L. Wexelblat (ed.): History of Programming Languages, Academic Press 1981.

External links

• 99 Bottles of Beer,[55] a collection of implementations in many languages
• Computer Programming Languages[56] at the Open Directory Project
• Syntax Patterns for Various Languages[57]

References

[1] Aaby, Anthony (2004). Introduction to Programming Languages (http://burks.brighton.ac.uk/burks/pcinfo/progdocs/plbook/index.htm).
[2] In mathematical terms, this means the programming language is Turing complete. MacLennan, Bruce J. (1987). Principles of Programming Languages. Oxford University Press. p. 1. ISBN 0-19-511306-3.
[3] Steven R. Fischer, A History of Language, Reaktion Books, 2003, ISBN 186189080X, p. 205.
[4] ACM SIGPLAN (2003). "Bylaws of the Special Interest Group on Programming Languages of the Association for Computing Machinery" (http://www.acm.org/sigs/sigplan/sigplan_bylaws.htm). Retrieved 2006-06-19. "The scope of SIGPLAN is the theory, design, implementation, description, and application of computer programming languages - languages that permit the specification of a variety of different computations, thereby providing the user with significant control (immediate or delayed) over the computer's operation."
[5] Dean, Tom (2002). "Programming Robots" (http://www.cs.brown.edu/people/tld/courses/cs148/02/programming.html). Building Intelligent Robots. Brown University Department of Computer Science. Retrieved 2006-09-23.
[6] R. Narasimhan, "Programming Languages and Computers: A Unified Metatheory", pp. 189-247 in Franz Alt, Morris Rubinoff (eds.), Advances in Computers, Volume 8, Academic Press, 1994, ISBN 012012108, p. 193: "a complete specification of a programming language must, by definition, include a specification of a processor--idealized, if you will--for that language." [The source cites many references to support this statement.]
[7] Ben-Ari, Mordechai (1996). Understanding Programming Languages. John Wiley and Sons. "Programs and languages can be defined as purely formal mathematical objects. However, more people are interested in programs than in other mathematical objects such as groups, precisely because it is possible to use the program--the sequence of symbols--to control the execution of a computer. While we highly recommend the study of the theory of programming, this text will generally limit itself to the study of programs as they are executed on a computer."
[8] David A. Schmidt, The Structure of Typed Programming Languages, MIT Press, 1994, ISBN 0262193493, p. 32.
[9] Pierce, Benjamin (2002). Types and Programming Languages. MIT Press. p. 339. ISBN 0-262-16209-1.
[10] Digital Equipment Corporation. "Information Technology - Database Language SQL (Proposed revised text of DIS 9075)" (http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt). ISO/IEC 9075:1992, Database Language SQL. Retrieved June 29, 2006.
[11] The Charity Development Group (December 1996). "The CHARITY Home Page" (http://pll.cpsc.ucalgary.ca/charity1/www/home.html). Retrieved 2006-06-29. "Charity is a categorical programming language...", "All Charity computations terminate."
[12] "XML in 10 Points" (http://www.w3.org/XML/1999/XML-in-10-points.html), W3C, 1999: "XML is not a programming language."
[13] Powell, Thomas (2003). HTML & XHTML: The Complete Reference. McGraw-Hill. p. 25. ISBN 0-07-222942-X. "HTML is not a programming language."
[14] Dykes, Lucinda (2005). XML For Dummies, 4th Edition. Wiley. p. 20. ISBN 0-7645-8845-1. "...it's a markup language, not a programming language."
[15] http://www.ibm.com/developerworks/library/x-xslt/
[16] http://msdn.microsoft.com/en-us/library/ms767587(VS.85).aspx

[17] Scott, Michael (2006). Programming Language Pragmatics. Morgan Kaufmann. p. 802. ISBN 0-12-633951-1. "XSLT, though highly specialized to the transformation of XML, is a Turing-complete programming language."
[18] http://tobi.oetiker.ch/lshort/lshort.pdf
[19] Syropoulos, Apostolos; Antonis Tsolomitis; Nick Sofroniou (2003). Digital Typography Using LaTeX. Springer-Verlag. p. 213. ISBN 0-387-95217-9. "TeX is not only an excellent typesetting engine but also a real programming language."
[20] Robert A. Edmunds, The Prentice-Hall Standard Glossary of Computer Terminology, Prentice-Hall, 1985, p. 91.
[21] Pascal Lando, Anne Lapujade, Gilles Kassel, and Frédéric Fürst, "Towards a General Ontology of Computer Programs" (http://www.loa-cnr.it/ICSOFT2007_final.pdf), ICSOFT 2007 (http://dblp.uni-trier.de/db/conf/icsoft/icsoft2007-1.html), pp. 163-170.
[22] S.K. Bajpai, Introduction To Computers And C Programming, New Age International, 2007, ISBN 812241379X, p. 346.
[23] R. Narasimhan, "Programming Languages and Computers: A Unified Metatheory", pp. 189-247 in Franz Alt, Morris Rubinoff (eds.), Advances in Computers, Volume 8, Academic Press, 1994, ISBN 012012108, p. 215: "[...] the model [...] for computer languages differs from that [...] for programming languages in only two respects. In a computer language, there are only finitely many names--or registers--which can assume only finitely many values--or states--and these states are not further distinguished in terms of any other attributes. [author's footnote:] This may sound like a truism but its implications are far-reaching. For example, it would imply that any model for programming languages, by fixing certain of its parameters or features, should be reducible in a natural way to a model for computer languages."
[24] John C. Reynolds, "Some Thoughts on Teaching Programming and Programming Languages", SIGPLAN Notices, Volume 43, Issue 11, November 2008, p. 109.
[25] Jing Huang. "Artificial Language vs. Natural Language" (http://www.cs.cornell.edu/info/Projects/Nuprl/cs611/fall94notes/cn2/subsection3_1_3.html).
[26] IBM, in first publishing PL/I, for example, rather ambitiously titled its manual The Universal Programming Language PL/I (IBM Library; 1966). The title reflected IBM's goals for unlimited subsetting capability: "PL/I is designed in such a way that one can isolate subsets from it satisfying the requirements of particular applications." ("Encyclopaedia of Mathematics » P » PL/I" (http://eom.springer.de/P/p072885.htm). SpringerLink. Retrieved June 29, 2006.) Ada and UNCOL had similar early goals.
[27] Frederick P. Brooks, Jr.: The Mythical Man-Month, Addison-Wesley, 1982, pp. 93-94.
[28] Dijkstra, Edsger W. "On the foolishness of 'natural language programming'" (http://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/EWD667.html), EWD667.
[29] Perlis, Alan, "Epigrams on Programming" (http://www-pu.informatik.uni-tuebingen.de/users/klaeren/epigrams.html). SIGPLAN Notices Vol. 17, No. 9, September 1982, pp. 7-13.
[30] Milner, R.; M. Tofte; R. Harper; D. MacQueen (1997). The Definition of Standard ML (Revised). MIT Press. ISBN 0-262-63181-4.
[31] Kelsey, Richard; William Clinger; Jonathan Rees (February 1998). "Section 7.2 Formal semantics" (http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-10.html#%_sec_7.2). Revised5 Report on the Algorithmic Language Scheme. Retrieved 2006-06-09.
[32] ANSI - Programming Language Rexx, X3-274.1996.
[33] "HOPL: An Interactive Roster of Programming Languages" (http://hopl.murdoch.edu.au/). Australia: Murdoch University. Retrieved 2009-06-01. "This site lists 8512 languages."
[34] Abelson, Sussman, and Sussman. Structure and Interpretation of Computer Programs (http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-10.html). Retrieved 2009-03-03.
[35] Survey of job advertisements mentioning a given language (http://www.computerweekly.com/Articles/2007/09/11/226631/sslcomputer-weekly-it-salary-survey-finance-boom-drives-it-job.htm).
[36] Counting programming languages by book sales (http://radar.oreilly.com/archives/2006/08/programming_language_trends_1.html).
[37] Bieman, J.M.; Murdock, V., "Finding Code on the World Wide Web: A Preliminary Investigation", Proceedings First IEEE International Workshop on Source Code Analysis and Manipulation, 2001.
[38] Programming Language Popularity (http://www.langpop.com/).
[39] Carl A. Gunter, Semantics of Programming Languages: Structures and Techniques, MIT Press, 1992, ISBN 0262570955, p. 1.
[40] "TUNES: Programming Languages" (http://tunes.org/wiki/programming_20languages.html).
[41] Wirth, Niklaus (1993). "Recollections about the development of Pascal" (http://portal.acm.org/citation.cfm?id=155378). Proc. 2nd ACM SIGPLAN Conference on History of Programming Languages: 333-342. doi:10.1145/154766.155378. Retrieved 2006-06-30.
[42] Benjamin C. Pierce writes: "... the lambda calculus has seen widespread use in the specification of programming language features, in language design and implementation, and in the study of type systems." Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. p. 52. ISBN 0-262-16209-1.
[43] Rojas, Raúl, et al. (2000). "Plankalkül: The First High-Level Programming Language and its Implementation". Institut für Informatik, Freie Universität Berlin, Technical Report B-3/2000. Full text: http://www.zib.de/zuse/Inhalt/Programme/Plankalkuel/Plankalkuel-Report/Plankalkuel-Report.htm
[44] Linda Null, Julia Lobur, The Essentials of Computer Organization and Architecture, 2nd edition, Jones & Bartlett Publishers, 2006, ISBN 0763737690, p. 435.
[45] O'Reilly Media. "History of programming languages" (http://www.oreilly.com/news/graphics/prog_lang_poster.pdf) (PDF). Retrieved October 5, 2006.

8


Programming language [46] Frank da Cruz. IBM Punch Cards (http:/ / www. columbia. edu/ acis/ history/ cards. html) Columbia University Computing History (http:/ / www. columbia. edu/ acis/ history/ index. html). [47] Richard L. Wexelblat: History of Programming Languages, Academic Press, 1981, chapter XIV. [48] François Labelle. "Programming Language Usage Graph" (http:/ / www. cs. berkeley. edu/ ~flab/ languages. html). SourceForge. . Retrieved June 21, 2006.. This comparison analyzes trends in number of projects hosted by a popular community programming repository. During most years of the comparison, C leads by a considerable margin; in 2006, Java overtakes C, but the combination of C/C++ still leads considerably. [49] Hayes, Brian (2006), "The Semicolon Wars", American Scientist 94 (4): 299–303 [50] Dijkstra, Edsger W. (March 1968). "Go To Statement Considered Harmful" (http:/ / www. acm. org/ classics/ oct95/ ). Communications of the ACM 11 (3): 147–148. doi:10.1145/362929.362947. . Retrieved 2006-06-29. [51] Tetsuro Fujise, Takashi Chikayama Kazuaki Rokusawa, Akihiko Nakase (December 1994). "KLIC: A Portable Implementation of KL1" Proc. of FGCS '94, ICOT Tokyo, December 1994. KLIC is a portable implementation of a concurrent logic programming language [[KL1 (http:/ / www. icot. or. jp/ ARCHIVE/ HomePage-E. html)].] [52] Jim Bender (March 15, 2004). "Mini-Bibliography on Modules for Functional Programming Languages" (http:/ / readscheme. org/ modules/ ). ReadScheme.org. . Retrieved 2006-09-27. [53] Wall, Programming Perl ISBN 0-596-00027-8 p.66 [54] http:/ / www. cs. brown. edu/ ~sk/ Publications/ Books/ ProgLangs/ [55] http:/ / www. 99-bottles-of-beer. net/ [56] http:/ / www. dmoz. org/ Computers/ Programming/ Languages/ [57] http:/ / merd. sourceforge. net/ pixel/ language-study/ syntax-across-languages/

Computer software

Computer software, or just software, is a general term primarily used for digitally stored data such as computer programs and other kinds of information read and written by computers. Today, this includes data that has not traditionally been associated with computers, such as film, tapes and records.[1] The term was coined to contrast with the older term hardware (meaning physical devices); in contrast to hardware, software is intangible, meaning it "cannot be touched".[2] Software is also sometimes used in a narrower sense, meaning application software only. Examples:
• Application software, such as word processors, which perform productive tasks for users.
• Firmware, which is software programmed resident to electrically programmable memory devices on board mainboards or other types of integrated hardware carriers.
• Middleware, which controls and co-ordinates distributed systems.
• System software, such as operating systems, which govern computing resources and provide convenience for users.
• Software testing, a domain independent of development and programming, consisting of various methods to test and declare a software product fit before it can be launched for use by an individual or a group.
• Testware, an umbrella term for all utilities and application software that combine to test a software package but do not necessarily contribute to its operational purposes. As such, testware is not a standing configuration but merely a working environment for application software or subsets thereof.
• Video games (except the hardware part)
• Websites




Overview

Software includes all the various forms and roles that digitally stored data may have and play in a computer (or similar system), regardless of whether the data is used as code for a CPU or other interpreter, or whether it represents other kinds of information. Software thus encompasses a wide array of products that may be developed using different techniques such as ordinary programming languages, scripting languages, microcode, or an FPGA configuration. The types of software include web pages developed in languages and frameworks like HTML, PHP, Perl, JSP, ASP.NET and XML, and desktop applications like OpenOffice and Microsoft Word, developed in languages like C, C++, Java, C#, or Smalltalk. Application software usually runs on an underlying operating system such as Linux or Microsoft Windows. Software (or firmware) is also used in video games and for the configurable parts of the logic systems of automobiles, televisions, and other consumer electronics.

Computer software is so called to distinguish it from computer hardware, which encompasses the physical interconnections and devices required to store and execute (or run) the software. [Figure: A layer structure showing where the operating system is located among the software generally used on desktop systems.] At the lowest level, executable code consists of machine language instructions specific to an individual processor. A machine language consists of groups of binary values signifying processor instructions that change the state of the computer from its preceding state. Programs are an ordered sequence of instructions for changing the state of the computer in a particular sequence. They are usually written in high-level programming languages that are easier and more efficient for humans to use (closer to natural language) than machine language. High-level languages are compiled or interpreted into machine language object code.
Software may also be written in an assembly language, essentially a mnemonic representation of a machine language using a natural-language alphabet. Assembly language must be assembled into object code via an assembler. The term "software" was first used in this sense by John W. Tukey in 1958.[3] In computer science and software engineering, computer software is all computer programs. The theory that is the basis for most modern software was first proposed by Alan Turing in his 1936 paper On Computable Numbers, with an Application to the Entscheidungsproblem (decision problem).[4]
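The lowering from high-level source to simpler instructions described above can be observed directly in Python, whose compiler emits bytecode for a virtual machine rather than native machine code; the analogy to machine language is close but not exact. A minimal sketch (the statement and variable names are invented for illustration):

```python
import dis

# Compile one high-level statement into a code object.
code = compile("total = price + tax", "<demo>", "exec")

# Each bytecode instruction is a small step: load a value, add, store.
for ins in dis.get_instructions(code):
    print(ins.opname, ins.argval)
```

On CPython this prints a short sequence including LOAD_NAME for the operands, an add instruction, and STORE_NAME for the result; a native compiler performs the same kind of lowering, but to processor-specific opcodes.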

Types of software

Practical computer systems divide software systems into three major classes: system software, programming software and application software, although the distinction is arbitrary and often blurred.

System software

System software helps run the computer hardware and computer system. It includes a combination of the following:
• device drivers
• operating systems
• servers
• utilities
• windowing systems



The purpose of system software is to unburden the applications programmer from the often complex details of the particular computer being used, including such accessories as communications devices, printers, device readers, displays and keyboards, and also to partition the computer's resources, such as memory and processor time, in a safe and stable manner. Examples include Windows, Linux, and Mac OS X.

Programming software

Programming software provides tools to assist a programmer in writing computer programs and software in different programming languages in a more convenient way. The tools include:
• compilers
• debuggers
• interpreters
• linkers
• text editors

An integrated development environment (IDE) is a single application that attempts to manage all these functions.

Application software

Application software allows end users to accomplish one or more specific (not directly computer-development-related) tasks. Typical applications include:
• industrial automation
• business software
• video games
• quantum chemistry and solid state physics software
• telecommunications (i.e., the Internet and everything that flows on it)
• databases
• educational software
• medical software
• military software
• molecular modeling software
• image editing
• spreadsheets
• simulation software
• word processing
• decision-making software

Application software exists for, and has had an impact on, a wide variety of fields.

Software topics

Architecture

Users often see things differently from programmers. People who use modern general purpose computers (as opposed to embedded systems, analog computers and supercomputers) usually see three layers of software performing a variety of tasks: platform, application, and user software.
• Platform software: Platform includes the firmware, device drivers, an operating system, and typically a graphical user interface which, in total, allow a user to interact with the computer and its peripherals (associated equipment). Platform software often comes bundled with the computer. On a PC one usually has the ability to change the platform software.



• Application software: Application software, or applications, are what most people think of when they think of software. Typical examples include office suites and video games. Application software is often purchased separately from computer hardware. Sometimes applications are bundled with the computer, but that does not change the fact that they run as independent applications. Applications are usually programs independent of the operating system, though they are often tailored for specific platforms. Most users think of compilers, databases, and other "system software" as applications.
• User-written software: End-user development tailors systems to meet users' specific needs. User software includes spreadsheet templates and word processor templates. Even email filters are a kind of user software. Users create this software themselves and often overlook how important it is. Depending on how competently the user-written software has been integrated into default application packages, many users may not be aware of the distinction between the original packages and what has been added by co-workers.

Documentation

Most software has software documentation so that the end user can understand the program: what it does and how to use it. Without clear documentation, software can be hard to use, especially if it is very specialized and relatively complex, like Photoshop or AutoCAD. Developer documentation may also exist, either with the code as comments and/or as separate files, detailing how the program works and how it can be modified.

Library

An executable is almost never sufficiently complete for direct execution on its own. Software libraries include collections of functions and functionality that may be embedded in other applications. Operating systems include many standard software libraries, and applications are often distributed with their own libraries.

Standard

Since software can be designed using many different programming languages and on many different operating systems and operating environments, software standards are needed so that different software can understand and exchange information with each other. For instance, an email sent from Microsoft Outlook should be readable in Yahoo! Mail and vice versa.
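The role of a standard interchange format can be sketched in Python using XML, one of the standards mentioned elsewhere in this book: one program serializes a message according to the XML standard, and any conforming parser, in principle in entirely different software, can read it back. The element names here are invented for illustration:

```python
import xml.etree.ElementTree as ET

# "Sender": build a message and serialize it per the XML standard.
msg = ET.Element("message")
ET.SubElement(msg, "subject").text = "Meeting at noon"
wire = ET.tostring(msg)  # bytes any XML-conforming software can read

# "Receiver": a different program, in principle, parses the same bytes.
received = ET.fromstring(wire)
print(received.find("subject").text)  # prints "Meeting at noon"
```

Because both sides follow the same published standard, neither needs to know anything about the other's implementation.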

Execution

Computer software has to be "loaded" into the computer's storage (such as a hard drive, or into memory/RAM). Once the software has loaded, the computer is able to execute it. This involves passing instructions from the application software, through the system software, to the hardware, which ultimately receives the instruction as machine code. Each instruction causes the computer to carry out an operation: moving data, carrying out a computation, or altering the control flow of instructions. Data movement is typically from one place in memory to another. Sometimes it involves moving data between memory and registers, which enable high-speed data access in the CPU. Moving data, especially large amounts of it, can be costly, so this is sometimes avoided by using "pointers" to data instead. Computations include simple operations such as incrementing the value of a variable data element. More complex computations may involve many operations and data elements together.
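The cost of data movement, and the pointer trick for avoiding it, can be illustrated in Python, where binding a second name to a list is analogous to copying only a pointer, while slicing copies the data itself. This is an analogy, not machine-level behaviour:

```python
big = list(range(100000))   # a large block of data in memory

alias = big      # "pointer": only a reference is stored; no data moves
clone = big[:]   # real copy: every element is moved to a new location

alias.append(-1)             # visible through big, since no copy was made
print(len(big), len(clone))  # prints 100001 100000
```

The copy touches every element, while the alias is a constant-time operation regardless of the data's size, which is exactly why machine code often passes addresses instead of values.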




Quality and reliability

Software quality is very important, especially for commercial and system software like Microsoft Office, Microsoft Windows and Linux. If software is faulty (buggy), it can delete a person's work, crash the computer and do other unexpected things. Faults and errors are called "bugs". Many bugs are discovered and eliminated (debugged) through software testing. However, software testing rarely, if ever, eliminates every bug; some programmers say that "every program has at least one more bug" (Lubarsky's Law). All major software companies, such as Microsoft, Novell and Sun Microsystems, have their own software testing departments with the specific goal of just testing. Software can be tested through unit testing, regression testing and other methods, which are done manually or, most commonly, automatically, since the amount of code to be tested can be quite large. For instance, NASA has extremely rigorous software testing procedures for many operating systems and communication functions. Many NASA operations interact with and identify each other through command software, which enables the many people who work at NASA to check and evaluate functional systems overall, and helps hardware engineering and systems operations work together.
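Automated testing of the kind described above is typically scripted so that it can run unattended after every change. A minimal unit-test sketch in Python, with a deliberately simple function standing in for real application code (both the function and its expected values are invented for illustration):

```python
def discount(price, percent):
    """Apply a percentage discount to a price."""
    return price * (100 - percent) / 100

def test_discount():
    # Each assertion is one automated check; a failure signals a bug.
    assert discount(200, 25) == 150.0
    assert discount(99, 0) == 99.0
    assert discount(0, 50) == 0.0

test_discount()
print("all tests passed")
```

Rerunning such a suite after each modification is the essence of regression testing: it catches old bugs that a change has silently reintroduced.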

License The software's license gives the user the right to use the software in the licensed environment. Some software comes with the license when purchased off the shelf, or an OEM license when bundled with hardware. Other software comes with a free software license, granting the recipient the rights to modify and redistribute the software. Software can also be in the form of freeware or shareware.

Patents

Software can be patented; however, software patents are controversial in the software industry, with many people holding different views about them. The controversy centers on whether a specific algorithm or technique used by a piece of software may be duplicated by others, or whether doing so constitutes infringement of intellectual property rights.

Design and implementation

Design and implementation of software varies depending on the complexity of the software. For instance, the design and creation of Microsoft Word took much longer than designing and developing Microsoft Notepad because of the difference in functionality between the two. Software is usually designed and created (coded/written/programmed) in integrated development environments (IDEs) like Eclipse, Emacs and Microsoft Visual Studio that can simplify the process and compile the program. As noted in a different section, software is usually created on top of existing software and the application programming interface (API) that the underlying software provides, like GTK+, JavaBeans or Swing. Libraries (APIs) are categorized for different purposes. For instance, the JavaBeans library is used for designing enterprise applications, the Windows Forms library is used for designing graphical user interface (GUI) applications like Microsoft Word, and Windows Communication Foundation is used for designing web services. Underlying computer programming concepts like quicksort, hashtable, array, and binary tree can be useful in creating software. When a program is designed, it relies on the API. For instance, if a user is designing a Microsoft Windows desktop application, he/she might use the .NET Windows Forms library to design the desktop application and call its APIs like Form1.Close() and Form1.Show()[5] to close or open the application, writing the additional operations it needs him/herself. Without these APIs, the programmer would need to write that functionality him/herself. Companies like Sun Microsystems, Novell, and Microsoft provide their own APIs so that many applications are written using their software libraries, which usually have numerous APIs in them.
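Of the underlying concepts listed above, quicksort is easy to show in miniature. The sketch below is a simple, not in-place variant in Python; production code would normally rely on the sorting API the platform already provides (for example, Python's built-in sorted) rather than re-implementing it:

```python
def quicksort(items):
    """Sort a list by partitioning around a pivot and recursing."""
    if len(items) <= 1:
        return items                 # a 0- or 1-element list is sorted
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([33, 4, 15, 8, 4]))  # prints [4, 4, 8, 15, 33]
```

This is the same trade-off the paragraph describes: the library API saves the programmer from writing, debugging and optimizing such routines by hand.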



Software has special economic characteristics that make its design, creation, and distribution different from most other economic goods.[6] [7] A person who creates software is called a programmer, software engineer, software developer, or code monkey, terms that all have essentially the same meaning.

Industry and organizations

The software industry is made up of the many entities and people that produce software, and as a result there are many software companies and programmers in the world. Because software is increasingly used in many different areas, such as finance, web searching, data mining, mathematics, space exploration, gaming and mining, software companies and programmers usually specialize in certain areas. For instance, Electronic Arts primarily creates video games. Selling software can also be quite a profitable industry. For instance, Bill Gates, the founder of Microsoft, was the richest person in the world in 2009, largely from selling the Microsoft Windows and Microsoft Office software programs. The same goes for Larry Ellison, largely through his Oracle database software. There are also many non-profit software organizations, like the Free Software Foundation, GNU Project and Mozilla Foundation, as well as many software standards organizations, like the W3C, IETF and others, that try to establish standards so that different software can work and interoperate with each other, as through XML, HTML, HTTP or FTP. Some of the well-known software companies include Microsoft, Oracle, Novell, SAP, Symantec, Adobe Systems, and Corel, while many small companies provide innovation. This is particularly important in the Internet information age, where individuals can set up small websites that compete with big companies.

References
[1] software. (n.d.). Dictionary.com Unabridged (v 1.1). Retrieved 2007-04-13, from Dictionary.com website: http://dictionary.reference.com/browse/software
[2] "Wordreference.com: WordNet 2.0" (http://www.wordreference.com/definition/software). Princeton University, Princeton, NJ. Retrieved 2007-08-19.
[3] "John Tukey, 85, Statistician; Coined the Word 'Software'" (http://query.nytimes.com/gst/fullpage.html?res=9500E4DA173DF93BA15754C0A9669C8B63). New York Times. 2000-07-28.
[4] Hally, Mike (2005). Electronic Brains: Stories from the Dawn of the Computer Age, p. 79. British Broadcasting Corporation and Granta Books, London. ISBN 1-86207-663-4.
[5] MSDN Library (http://msdn.microsoft.com/en-us/library/default.aspx)
[6] v. Engelhardt, Sebastian (2008): "The Economic Properties of Software", Jena Economic Research Papers, Volume 2 (2008), Number 2008-045. (http://ideas.repec.org/p/jrp/jrpwrp/2008-045.html) (in Adobe PDF format)
[7] "Why Open Source Is The Optimum Economic Paradigm for Software" (http://www.doxpara.com/read.php/core.html) by Dan Kaminsky, 1999


History

History of programming languages

This article discusses the major developments in the history of programming languages. For a detailed timeline of events, see the timeline of programming languages.

Before 1940

The first programming languages predate the modern computer. At first, the languages were codes. The Jacquard loom, invented in 1801, used holes in punched cards to represent loom arm movements in order to generate decorative patterns automatically. During a nine-month period in 1842-1843, Ada Lovelace translated the memoir of Italian mathematician Luigi Menabrea about Charles Babbage's newest proposed machine, the Analytical Engine. To the article she appended a set of notes which specified in complete detail a method for calculating Bernoulli numbers with the Engine, recognized by some historians as the world's first computer program.[1] Herman Hollerith realized that he could encode information on punch cards when he observed that train conductors encoded the appearance of ticket holders on train tickets using the position of punched holes. Hollerith then encoded the 1890 census data on punch cards.

The first computer codes were specialized for their applications. In the first decades of the 20th century, numerical calculations were based on decimal numbers. Eventually it was realized that logic could be represented with numbers, not only with words. For example, Alonzo Church was able to express the lambda calculus in a formulaic way. The Turing machine was an abstraction of the operation of a tape-marking machine of the kind in use at telephone companies. Turing machines set the basis for storage of programs as data in the von Neumann architecture of computers by representing a machine through a finite number. However, unlike the lambda calculus, Turing's code does not serve well as a basis for higher-level languages; its principal use is in rigorous analyses of algorithmic complexity.

Like many "firsts" in history, the first modern programming language is hard to identify. From the start, the restrictions of the hardware defined the language.
Punch cards allowed 80 columns, but some of the columns had to be used for a sorting number on each card. FORTRAN included some keywords which were the same as English words, such as "IF", "GOTO" (go to) and "CONTINUE". The use of a magnetic drum for memory meant that computer programs also had to be interleaved with the rotations of the drum. Thus the programs were more hardware-dependent than today. To some people, what was the first modern programming language depends on how much power and human-readability is required before the status of "programming language" is granted. Jacquard looms and Charles Babbage's Difference Engine both had simple, extremely limited languages for describing the actions that these machines should perform. One can even regard the punch holes on a player piano scroll as a limited domain-specific language, albeit not designed for human consumption.



The 1940s

In the 1940s the first recognizably modern, electrically powered computers were created. The limited speed and memory capacity forced programmers to write hand-tuned assembly language programs. It was soon discovered that programming in assembly language required a great deal of intellectual effort and was error-prone. In 1948, Konrad Zuse published a paper about his programming language Plankalkül.[2] However, it was not implemented in his lifetime and his original contributions were isolated from other developments. Some important languages that were developed in this period include:
• 1943 - Plankalkül (Konrad Zuse)
• 1943 - ENIAC coding system
• 1949 - C-10

The 1950s and 1960s

In the 1950s the first three modern programming languages whose descendants are still in widespread use today were designed:
• FORTRAN (1955), the "FORmula TRANslator", invented by John Backus et al.;
• LISP, the "LISt Processor", invented by John McCarthy et al.;
• COBOL, the COmmon Business Oriented Language, created by the Short Range Committee, heavily influenced by Grace Hopper.

Another milestone in the late 1950s was the publication, by a committee of American and European computer scientists, of "a new language for algorithms": the ALGOL 60 Report (the "ALGOrithmic Language"). This report consolidated many ideas circulating at the time and featured two key language innovations:
• nested block structure: code sequences and associated declarations could be grouped into blocks without having to be turned into separate, explicitly named procedures;
• lexical scoping: a block could have its own private variables, procedures and functions, invisible to code outside that block, i.e. information hiding.
Another innovation, related to this, was in how the language was described:
• a mathematically exact notation, Backus-Naur Form (BNF), was used to describe the language's syntax. Nearly all subsequent programming languages have used a variant of BNF to describe the context-free portion of their syntax.

Algol 60 was particularly influential in the design of later languages, some of which soon became more popular. The Burroughs large systems were designed to be programmed in an extended subset of Algol. Algol's key ideas were continued, producing ALGOL 68:
• syntax and semantics became even more orthogonal, with anonymous routines, a recursive typing system with higher-order functions, etc.;
• not only the context-free part, but the full language syntax and semantics were defined formally, in terms of Van Wijngaarden grammar, a formalism designed specifically for this purpose.
Algol 68's many little-used language features (e.g.
concurrent and parallel blocks) and its complex system of syntactic shortcuts and automatic type coercions made it unpopular with implementers and gained it a reputation of being difficult. Niklaus Wirth actually walked out of the design committee to create the simpler Pascal language. Some important languages that were developed in this period include:
• 1951 - Regional Assembly Language
• 1952 - Autocode
• 1954 - FORTRAN



• 1954 - IPL (forerunner to LISP)
• 1955 - FLOW-MATIC (forerunner to COBOL)
• 1957 - COMTRAN (forerunner to COBOL)
• 1958 - LISP
• 1958 - ALGOL 58
• 1959 - FACT (forerunner to COBOL)
• 1959 - COBOL
• 1962 - APL
• 1962 - Simula
• 1962 - SNOBOL
• 1963 - CPL (forerunner to C)
• 1964 - BASIC
• 1964 - PL/I
• 1967 - BCPL (forerunner to C)
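The block structure and lexical scoping that the ALGOL 60 Report introduced survive in nearly all of the languages above. A small sketch in Python (used here only because it is compact; the names are invented) shows a block's private variable being visible to nested code but hidden from the outside:

```python
def outer():
    secret = 41            # private to this block (information hiding)

    def inner():
        return secret + 1  # lexical scoping: nested code sees enclosing names

    return inner()

print(outer())  # prints 42; "secret" itself is invisible out here
```

Attempting to read secret outside outer raises a NameError, which is precisely the information hiding the ALGOL designers were after.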

1967-1978: establishing fundamental paradigms

The period from the late 1960s to the late 1970s brought a major flowering of programming languages. Most of the major language paradigms now in use were invented in this period:
• Simula, invented in the late 1960s by Nygaard and Dahl as a superset of Algol 60, was the first language designed to support object-oriented programming.
• C, an early systems programming language, was developed by Dennis Ritchie and Ken Thompson at Bell Labs between 1969 and 1973.
• Smalltalk (mid 1970s) provided a complete ground-up design of an object-oriented language.
• Prolog, designed in 1972 by Colmerauer, Roussel, and Kowalski, was the first logic programming language.
• ML built a polymorphic type system (invented by Robin Milner in 1973) on top of Lisp, pioneering statically typed functional programming languages.
Each of these languages spawned an entire family of descendants, and most modern languages count at least one of them in their ancestry. The 1960s and 1970s also saw considerable debate over the merits of "structured programming", which essentially meant programming without the use of GOTO. This debate was closely related to language design: some languages did not include GOTO, which forced structured programming on the programmer. Although the debate raged hotly at the time, nearly all programmers now agree that, even in languages that provide GOTO, it is bad programming style to use it except in rare circumstances. As a result, later generations of language designers have found the structured programming debate tedious and even bewildering. Some important languages that were developed in this period include:
• 1968 - Logo
• 1970 - Pascal
• 1970 - Forth
• 1972 - C
• 1972 - Smalltalk
• 1972 - Prolog
• 1973 - ML
• 1975 - Scheme

• 1978 - SQL (initially only a query language, later extended with programming constructs)
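The structured-programming debate described above turned on replacing jump-based control flow with constructs like while and if. A sketch in Python, which, like most post-debate languages, omits goto entirely (the jump-style version is shown only as pseudocode in the comments):

```python
# Jump-style pseudocode:
#   i = 0
#   label: if i >= 5 goto done
#   total += i; i += 1; goto label
#   done:
#
# Structured equivalent: the loop construct replaces explicit jumps.
total = 0
i = 0
while i < 5:
    total += i
    i += 1
print(total)  # prints 10
```

The structured version has a single entry and a single exit, which is what makes such code easier to reason about than arbitrary jumps.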




The 1980s: consolidation, modules, performance

The 1980s were years of relative consolidation. C++ combined object-oriented and systems programming. The United States government standardized Ada, a systems programming language intended for use by defense contractors. In Japan and elsewhere, vast sums were spent investigating so-called fifth-generation programming languages that incorporated logic programming constructs. The functional languages community moved to standardize ML and Lisp. Rather than inventing new paradigms, all of these movements elaborated upon the ideas invented in the previous decade. However, one important new trend in language design was an increased focus on programming for large-scale systems through the use of modules, or large-scale organizational units of code. Modula, Ada, and ML all developed notable module systems in the 1980s. Module systems were often wedded to generic programming constructs, generics being, in essence, parameterized modules (see also polymorphism in object-oriented programming). Although major new paradigms for programming languages did not appear, many researchers expanded on the ideas of prior languages and adapted them to new contexts. For example, the languages of the Argus and Emerald systems adapted object-oriented programming to distributed systems. The 1980s also brought advances in programming language implementation. The RISC movement in computer architecture postulated that hardware should be designed for compilers rather than for human assembly programmers. Aided by processor speed improvements that enabled increasingly aggressive compilation techniques, the RISC movement sparked greater interest in compilation technology for high-level languages. Language technology continued along these lines well into the 1990s. Some important languages that were developed in this period include:
• 1980 - C++ (as C with classes)
• 1983 - Ada
• 1984 - Common Lisp
• 1985 - Eiffel
• 1986 - Erlang
• 1987 - Perl
• 1988 - Tcl
• 1989 - FL (Backus)

The 1990s: the Internet age

The 1990s saw no fundamental novelty, but much recombination as well as maturation of old ideas. A big driving philosophy was programmer productivity. Many "rapid application development" (RAD) languages emerged, which usually came with an IDE and garbage collection, and were descendants of older languages. All such languages were object-oriented. These included Object Pascal, Visual Basic, and C#. Java was a more conservative language that also featured garbage collection and received much attention. More radical and innovative than the RAD languages were the new scripting languages. These did not directly descend from other languages and featured new syntaxes and more liberal incorporation of features. Many consider these scripting languages to be more productive than even the RAD languages, but often because of choices that make small programs simpler but large programs more difficult to write and maintain. Nevertheless, scripting languages came to be the most prominent ones used in connection with the Web. Some important languages that were developed in this period include:
• 1990 - Haskell
• 1991 - Python
• 1991 - Java



• 1993 - Ruby
• 1993 - Lua
• 1994 - CLOS (part of ANSI Common Lisp)
• 1995 - Delphi (Object Pascal)
• 1995 - JavaScript
• 1995 - PHP
• 1997 - Rebol

Current trends

Programming language evolution continues, in both industry and research. Some of the current trends include:
• Mechanisms for adding security and reliability verification to the language: extended static checking, information flow control, static thread safety.
• Alternative mechanisms for modularity: mixins, delegates, aspects.
• Component-oriented software development.
• Metaprogramming, reflection or access to the abstract syntax tree.
• Increased emphasis on distribution and mobility.
• Integration with databases, including XML and relational databases.
• Support for Unicode so that source code (program text) is not restricted to those characters contained in the ASCII character set, allowing, for example, use of non-Latin-based scripts or extended punctuation.
• XML for graphical interfaces (XUL, XAML).
Some important languages developed during this period include:
• 2001 - C#
• 2001 - Visual Basic .NET
• 2002 - F#
• 2003 - Scala
• 2003 - Factor
• 2006 - Windows PowerShell
• 2007 - Clojure
• 2009 - Go

Prominent people in the history of programming languages
• John Backus, inventor of Fortran.
• Alan Cooper, developer of Visual Basic.
• Edsger W. Dijkstra, developed the framework for structured programming.
• James Gosling, developer of Oak, the precursor of Java.
• Anders Hejlsberg, developer of Turbo Pascal, Delphi and C#.
• Grace Hopper, developer of Flow-Matic, influencing COBOL.
• Kenneth E. Iverson, developer of APL, and co-developer of J along with Roger Hui.
• Bill Joy, inventor of vi, early author of BSD Unix, and originator of SunOS, which became Solaris.
• Alan Kay, pioneer of object-oriented programming, and originator of Smalltalk.
• Brian Kernighan, co-author of the first book on the C programming language with Dennis Ritchie, co-author of the AWK and AMPL programming languages.
• John McCarthy, inventor of LISP.
• John von Neumann, originator of the operating system concept.
• Dennis Ritchie, inventor of C.
• Bjarne Stroustrup, developer of C++.
• Ken Thompson, inventor of Unix.
• Niklaus Wirth, inventor of Pascal, Modula and Oberon.
• Larry Wall, creator of Perl and Perl 6.
• Guido van Rossum, creator of Python.

See also
• ACM History of Programming Languages Conference
• History of compiler writing
• History of computing hardware
• Programming language
• Timeline of computing
• Timeline of programming languages
• List of programming languages

Further reading
• Sammet, Jean E., Programming Languages: History and Fundamentals, Prentice-Hall, 1969

External links
• The History of Programming Languages [3] by Diarmuid Pigott
• History and evolution of programming languages [4]
• The Evolution of Programming Languages [5] by Peter Grogono (PDF)
• Graph of programming language history [6]
• Topics in history and comparing programming languages [7] by Dennie Van Tassel

References
[1] J. Fuegi and J. Francis (October-December 2003), "Lovelace & Babbage and the creation of the 1843 'notes'", Annals of the History of Computing 25 (4), doi:10.1109/MAHC.2003.1253887
[2] http://www-history.mcs.st-andrews.ac.uk/history/Mathematicians/Zuse.html
[3] http://hopl.murdoch.edu.au
[4] http://www.scriptol.com/programming/history.php
[5] http://www.ulb.ac.be/di/rwuyts/INFO020_2003/grogono-evolution.pdf
[6] http://www.levenez.com/lang/history.html
[7] http://hhh.gavilan.edu/dvantassel/history/history.html


Types of programming languages

Low-level programming language
In computer science, a low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture. The word "low" refers to the small or nonexistent amount of abstraction between the language and machine language; because of this, low-level languages are sometimes described as being "close to the hardware."

A low-level language does not need a compiler or interpreter to run; the processor for which the language was written is able to run the code without using either of these. By comparison, a high-level programming language isolates the execution semantics of a computer architecture from the specification of the program, making the process of developing a program simpler and more understandable.

Low-level programming languages are sometimes divided into two categories: first generation and second generation.

First generation
The first-generation programming language, or 1GL, is machine code. It is the only language a microprocessor can process directly without a previous transformation. Currently, programmers almost never write programs directly in machine code, because it requires attention to numerous details which a high-level language would handle automatically, and also requires memorizing or looking up numerical codes for every instruction that is used. For this reason, second-generation programming languages provide one abstraction level on top of the machine code.

Example: A function in 32-bit x86 machine code to calculate the nth Fibonacci number:

8B542408 83FA0077 06B80000 0000C383
FA027706 B8010000 00C353BB 01000000
B9010000 008D0419 83FA0376 078BD98B
C84AEBF1 5BC3

Second generation
The second-generation programming language, or 2GL, is assembly language. It is considered a second-generation language because while it is not a microprocessor's native language, an assembly language programmer must still understand the microprocessor's unique architecture (such as its registers and instructions). These simple instructions are then assembled directly into machine code. The assembly code can also be abstracted to another layer in a similar manner as machine code is abstracted into assembly code.

Example: The same Fibonacci number calculator as above, but in x86 assembly language using MASM syntax:

fib:
    mov edx, [esp+8]
    cmp edx, 0
    ja @f
    mov eax, 0
    ret
@@:
    cmp edx, 2
    ja @f
    mov eax, 1
    ret
@@:
    push ebx
    mov ebx, 1
    mov ecx, 1
@@:
    lea eax, [ebx+ecx]
    cmp edx, 3
    jbe @f
    mov ebx, ecx
    mov ecx, eax
    dec edx
    jmp @b
@@:
    pop ebx
    ret

See also
• High-level programming languages
• Very high-level programming languages
• Categorical list of programming languages



High-level programming language

A high-level programming language is a programming language with strong abstraction from the details of the computer. In comparison to low-level programming languages, it may use natural language elements, be easier to use, or be more portable across platforms. Such languages hide the details of CPU operations such as memory access models and management of scope. This greater abstraction and hiding of details is generally intended to make the language user-friendly, as it includes concepts from the problem domain instead of those of the machine used. A high-level language isolates the execution semantics of a computer architecture from the specification of the program, making the process of developing a program simpler and more understandable with respect to a low-level language. The amount of abstraction provided defines how "high-level" a programming language is.[1]

The first high-level programming language to be designed for a computer was Plankalkül, created by Konrad Zuse. However, it was not implemented in his time, and his original contributions were isolated from other developments.

Features
The term "high-level language" does not imply that the language is superior to low-level programming languages; in fact, in terms of the depth of knowledge of how computers work required to productively program in a given language, the inverse may be true. Rather, "high-level language" refers to the higher level of abstraction from machine language. Rather than dealing with registers, memory addresses and call stacks, high-level languages deal with usability, threads, locks, objects, variables, arrays and complex arithmetic or boolean expressions. In addition, they have no opcodes that can directly compile the language into machine code, unlike low-level assembly language. Other features such as string handling routines, object-oriented language features and file input/output may also be present.

Abstraction penalty
Stereotypically, high-level languages make complex programming simpler, while low-level languages tend to produce more efficient code. Abstraction penalty is the barrier that prevents high-level programming techniques from being applied in situations where computational resources are limited. High-level programming features like more generic data structures, run-time interpretation and intermediate code files often result in slower execution speed, higher memory consumption and larger binary size.[2] [3] [4] For this reason, code which needs to run particularly quickly and efficiently may be written in a lower-level language, even if a higher-level language would make the coding easier. However, with the growing complexity of modern microprocessor architectures, well-designed compilers for high-level languages frequently produce code comparable in efficiency to what most low-level programmers can produce by hand, and the higher abstraction may allow for more powerful techniques providing better overall results than their low-level counterparts in particular settings.[5]

Relative meaning
The terms high-level and low-level are inherently relative. Some decades ago, the C language and similar languages were most often considered "high-level", as they supported concepts such as expression evaluation, parameterised recursive functions, and data types and structures, while assembly language was considered "low-level". Many programmers today might refer to C as low-level, as it lacks a large runtime system (no garbage collection, etc.), basically supports only scalar operations, and provides direct memory addressing. It therefore readily blends with assembly language and the machine level of CPUs and microcontrollers.



Also note that assembly language may itself be regarded as a higher-level (but often still one-to-one, if used without macros) representation of machine code, as it supports concepts such as constants and (limited) expressions, sometimes even variables, procedures, and data structures. Machine code, in its turn, is inherently at a slightly higher level than the microcode or micro-operations used internally in many processors. See C2's page about high-level languages [6].

Execution models
There are three models of execution for modern high-level languages:

Interpreted
Interpreted languages are read and then executed directly, with no compilation stage.

Compiled
Compiled languages are transformed into an executable form before running. There are two types of compilation:
Intermediate representations
When a language is compiled to an intermediate representation, that representation can be optimized or saved for later execution without the need to re-read the source file. When the intermediate representation is saved, it is often represented as byte code.
Machine code generation
Some compilers compile source code directly into machine code. Virtual machines that execute byte code directly or transform it further into machine code have blurred the once clear distinction between intermediate representations and truly compiled languages.

Translated
A language may be translated into a low-level programming language for which native code compilers are already widely available. The C programming language is a common target for such translators.

See also
• Abstraction (computer science)
• Generational list of programming languages
• Low-level programming languages
• Very high-level programming languages
• Categorical list of programming languages

External links
• http://c2.com/cgi/wiki?HighLevelLanguage - The WikiWikiWeb's article on high-level programming languages

References
[1] HThreads - RD Glossary (http://www.ittc.ku.edu/hybridthreads/glossary/index.php)
[2] Surana P (2006) (PDF). Meta-Compilation of Language Abstractions. (ftp://lispnyc.org/meeting-assets/2007-02-13_pinku/SuranaThesis.pdf). Retrieved 2008-03-17.
[3] Kuketayev. "The Data Abstraction Penalty (DAP) Benchmark for Small Objects in Java." (http://www.adtmag.com/joop/article.aspx?id=4597). Retrieved 2008-03-17.
[4] Chatzigeorgiou; Stephanides (2002), "Evaluating Performance and Power Of Object-Oriented Vs. Procedural Programming Languages" (http://books.google.com/books?id=QMalP1P2kAMC&dq="abstraction+penalty"&lr=&source=gbs_summary_s&cad=0), in Blieberger; Strohmeier, Proceedings - 7th International Conference on Reliable Software Technologies - Ada-Europe'2002, Springer, pp. 367,



[5] Manuel Carro, José F. Morales, Henk L. Muller, G. Puebla, M. Hermenegildo (2006), "High-level languages for small devices: a case study" (http://www.clip.dia.fi.upm.es/papers/carro06:stream_interpreter_cases.pdf) (PDF), Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, ACM,
[6] http://c2.com/cgi/wiki?HighLevelLanguage




Software Programming Languages

Machine code
Machine code or machine language is a system of instructions and data executed directly by a computer's central processing unit. Machine code may be regarded as a primitive (and cumbersome) programming language or as the lowest-level representation of a compiled and/or assembled computer program. Programs in interpreted languages[1] are not represented by machine code, however, although their interpreter (which may be seen as a processor executing the higher-level program) often is. Machine code is sometimes called native code when referring to platform-dependent parts of language features or libraries.[2] Machine code should not be confused with so-called "bytecode", which is executed by an interpreter.

Machine code instructions
Every processor or processor family has its own machine code instruction set. Instructions are patterns of bits that by physical design correspond to different commands to the machine. The instruction set is thus specific to a class of processors using (much) the same architecture. Successor or derivative processor designs often include all the instructions of a predecessor and may add additional instructions. Occasionally a successor design will discontinue or alter the meaning of some instruction code (typically because it is needed for new purposes), affecting code compatibility to some extent; even nearly completely compatible processors may show slightly different behavior for some instructions, but this is seldom a problem. Systems may also differ in other details, such as memory arrangement, operating systems, or peripheral devices; because a program normally relies on such factors, different systems will typically not run the same machine code, even when the same type of processor is used.

A machine code instruction set may have all instructions of the same length, or it may have variable-length instructions. How the patterns are organized varies strongly with the particular architecture and often also with the type of instruction. Most instructions have one or more opcode fields which specify the basic instruction type (such as arithmetic, logical, jump, etc.) and the actual operation (such as add or compare), and other fields that may give the type of the operand(s), the addressing mode(s), the addressing offset(s) or index, or the actual value itself (such constant operands contained in an instruction are called immediates).

Programs
A computer program is a sequence of instructions that are executed by a CPU. While simple processors execute instructions one after the other, superscalar processors are capable of executing several instructions at once. Program flow may be influenced by special 'jump' instructions that transfer execution to an instruction other than the following one. Conditional jumps are taken (execution continues at another address) or not (execution continues at the next instruction) depending on some condition.



Assembly languages
A much more readable rendition of machine language, called assembly language, uses mnemonic codes to refer to machine code instructions, rather than simply using the instructions' numeric values. For example, on the Zilog Z80 processor, the machine code 00000101, which causes the CPU to decrement the B processor register, would be represented in assembly language as DEC B.

Example
The MIPS architecture provides a specific example for a machine code whose instructions are always 32 bits long. The general type of instruction is given by the op (operation) field, the highest 6 bits. J-type (jump) and I-type (immediate) instructions are fully specified by op. R-type (register) instructions include an additional field funct to determine the exact operation. The fields used in these types are:

   6      5     5     5     5     6 bits
[  op  |  rs |  rt |  rd |shamt| funct]  R-type
[  op  |  rs |  rt | address/immediate]  I-type
[  op  |        target address        ]  J-type

rs, rt, and rd indicate register operands; shamt gives a shift amount; and the address or immediate fields contain an operand directly. For example, adding the registers 1 and 2 and placing the result in register 6 is encoded:

[  op  |  rs |  rt |  rd |shamt| funct]
    0      1     2     6     0    32      decimal
 000000 00001 00010 00110 00000 100000    binary

Load a value into register 8, taken from the memory cell 68 cells after the location listed in register 3:

[  op  |  rs |  rt | address/immediate]
   35      3     8          68            decimal
 100011 00011 01000  0000000001000100     binary

Jumping to the address 1024:

[  op  |        target address        ]
    2               1024                  decimal
 000010  00000000000000010000000000       binary

Relationship to microcode
In some computer architectures, the machine code is implemented by a more fundamental underlying layer of programs called microprograms, providing a common machine language interface across a line or family of different models of computer with widely different underlying dataflows. This is done to facilitate porting of machine language programs between different models. An example of this use is the IBM System/360 family of computers and their successors. With dataflow path widths of 8 bits to 64 bits and beyond, they nevertheless present a common architecture at the machine language level across the entire line.

Using a microcode layer to implement an emulator enables the computer to present the architecture of an entirely different computer. The System/360 line used this to allow porting programs from earlier IBM machines to the new family of computers, e.g. an IBM 1401/1440/1460 emulator on the IBM S/360 model 40.



Storing in memory
The Harvard architecture is a computer architecture with physically separate storage and signal pathways for the code (instructions) and data. Today, most processors implement such separate signal pathways for performance reasons but actually implement a Modified Harvard architecture, so they can support tasks like loading a program from disk storage as data and then executing it. The Harvard architecture is contrasted with the von Neumann architecture, where data and code are stored in the same memory.

From the point of view of a process, the code space is the part of its address space where the code in execution is stored. In a multi-threading environment, threads share code space along with data space, which reduces the overhead of context switching considerably as compared to process switching.

See also
• Reduced Instruction Set Computer (RISC)
• VLIW
• P-code machine
• Endianness
• Teaching Machine Code: Microprofessor I

Further reading
• Hennessy, John L.; Patterson, David A. Computer Organization and Design: The Hardware/Software Interface. Morgan Kaufmann Publishers. ISBN 1-55860-281-X.
• Tanenbaum, Andrew S. Structured Computer Organization. Prentice Hall. ISBN 0-13-020435-8.
• Brookshear, J. Glenn. Computer Science: An Overview. Addison Wesley. ISBN 0321387015.

References
[1] Often BASIC, MATLAB, Smalltalk, Python, Ruby, etc.
[2] "Managed, Unmanaged, Native: What Kind of Code Is This?" (http://www.developer.com/net/cplus/print.php/2197621). developer.com. Retrieved 2008-09-02.



Assembly language

See the terminology section below for information regarding inconsistent use of the terms assembly and assembler.

Assembly languages are a type of low-level language for programming computers, microprocessors, microcontrollers, and other (usually) integrated circuits. They implement a symbolic representation of the numeric machine codes and other constants needed to program a particular CPU architecture. This representation is usually defined by the hardware manufacturer, and is based on abbreviations (called mnemonics) that help the programmer remember individual instructions, registers, etc. An assembly language family is thus specific to a certain physical (or virtual) computer architecture. This is in contrast to most high-level languages, which are (ideally) portable.

A utility program called an assembler is used to translate assembly language statements into the target computer's machine code. The assembler performs a more or less isomorphic translation (a one-to-one mapping) from mnemonic statements into machine instructions and data. This is in contrast with high-level languages, in which a single statement generally results in many machine instructions. Many sophisticated assemblers offer additional mechanisms to facilitate program development, control the assembly process, and aid debugging. In particular, most modern assemblers include a macro facility (described below), and are called macro assemblers.

Key concepts

Assembler
Compare with: Microassembler.
Typically a modern assembler creates object code by translating assembly instruction mnemonics into opcodes, and by resolving symbolic names for memory locations and other entities.[1] The use of symbolic references is a key feature of assemblers, saving tedious calculations and manual address updates after program modifications. Most assemblers also include macro facilities for performing textual substitution—e.g., to generate common short sequences of instructions as inline code, instead of called subroutines, or even to generate entire programs or program suites.

Assemblers are generally simpler to write than compilers for high-level languages, and have been available since the 1950s. Modern assemblers, especially for RISC-based architectures such as MIPS, Sun SPARC, and HP PA-RISC, as well as x86(-64), optimize instruction scheduling to exploit the CPU pipeline efficiently.

There are two types of assemblers, based on how many passes through the source are needed to produce the executable program:
• One-pass assemblers go through the source code once and assume that all symbols will be defined before any instruction that references them.
• Two-pass assemblers (and multi-pass assemblers) create a table with all unresolved symbols in the first pass, then use the second pass to resolve these addresses.
The advantage of a one-pass assembler is speed, which is not as important as it once was with advances in computer speed and capabilities. The advantage of the two-pass assembler is that symbols can be defined anywhere in the program source. As a result, the program can be defined in a more logical and meaningful way. This makes two-pass assembler programs easier to read and maintain.[2]

More sophisticated high-level assemblers provide language abstractions such as:
• Advanced control structures
• High-level procedure/function declarations and invocations
• High-level abstract data types, including structures/records, unions, classes, and sets




• Sophisticated macro processing (although available on ordinary assemblers since the late 1960s for the IBM/360, amongst other machines)
• Object-oriented features such as encapsulation, polymorphism, inheritance, interfaces
See Language design below for more details.

Note that, in normal professional usage, the term assembler is often used ambiguously: it is frequently used to refer to an assembly language itself, rather than to the assembler utility. Thus: "CP/CMS was written in S/360 assembler" as opposed to "ASM-H was a widely-used S/370 assembler."

Assembly language
A program written in assembly language consists of a series of instructions (mnemonics) that correspond to a stream of executable instructions, when translated by an assembler, that can be loaded into memory and executed. For example, an x86/IA-32 processor can execute the following binary instruction ('MOV') as expressed in machine language (see x86 assembly language):

Hexadecimal: B0 61 (Binary: 10110000 01100001)

The equivalent assembly language representation is easier to remember (example in Intel syntax, more mnemonic):

MOV AL, 61h

This instruction means:
• Move (really, a copy of) the hexadecimal value '61' into the processor register known as "AL". (The h-suffix means hexadecimal; 61h = 97 in decimal.)
The mnemonic "mov" represents the opcode 1011, which actually copies the value in the second operand into the register indicated by the first operand. The mnemonic was chosen by the designer of the instruction set to abbreviate "move", making it easier for the programmer to remember. Typical of an assembly language statement, a comma-separated list of arguments or parameters follows the opcode.

In practice many programmers drop the word mnemonic and, technically incorrectly, call "mov" an opcode. When they do this they are referring to the underlying binary code which it represents. To put it another way, a mnemonic such as "mov" is not an opcode, but as it symbolizes an opcode, one might refer to "the opcode mov", for example when one intends to refer to the binary opcode it symbolizes rather than to the symbol -- the mnemonic -- itself. As few modern programmers have need to be mindful of what the binary patterns actually are (the opcodes for specific instructions), the distinction has in practice become a bit blurred among programmers, but not among processor designers.

Transforming assembly into machine language is accomplished by an assembler, and the (partial) reverse by a disassembler. Unlike high-level languages, there is usually a one-to-one correspondence between simple assembly statements and machine language instructions. However, in some cases, an assembler may provide pseudoinstructions (essentially macros) which expand into several machine language instructions to provide commonly needed functionality.
For example, for a machine that lacks a "branch if greater or equal" instruction, an assembler may provide a pseudoinstruction that expands to the machine's "set if less than" and "branch if zero (on the result of the set instruction)". Most full-featured assemblers also provide a rich macro language (discussed below) which is used by vendors and programmers to generate more complex code and data sequences. Each computer architecture and processor architecture usually has its own machine language. On this level, each instruction is simple enough to be executed using a relatively small number of electronic circuits. Computers differ by the number and type of operations they support. For example, a new 64-bit machine would have different circuitry from a 32-bit machine. They may also have different sizes and numbers of registers, and different representations of data types in storage. While most general-purpose computers are able to carry out essentially the


same functionality, the ways they do so differ; the corresponding assembly languages reflect these differences. Multiple sets of mnemonics or assembly-language syntax may exist for a single instruction set, typically instantiated in different assembler programs. In these cases, the most popular one is usually that supplied by the manufacturer and used in its documentation.

Language design

Basic elements
Any assembly language consists of three types of instruction statements which are used to define the program operations:
• Opcode mnemonics
• Data sections
• Assembly directives

Opcode mnemonics
Instructions (statements) in assembly language are generally very simple, unlike those in high-level languages. Generally, an opcode mnemonic is a symbolic name for a single executable machine language instruction, and there is at least one opcode mnemonic defined for each machine language instruction. Each instruction typically consists of an operation or opcode plus zero or more operands. Most instructions refer to a single value, or a pair of values. Operands can be either immediate (typically one-byte values, coded in the instruction itself) or the addresses of data located elsewhere in storage. This is determined by the underlying processor architecture: the assembler merely reflects how this architecture works.

Data sections
There are instructions used to define data elements to hold data and variables. They define the type of data, the length and the alignment of data. These instructions can also define whether the data is available to outside programs (programs assembled separately) or only to the program in which the data section is defined.

Assembly directives and pseudo-ops
Assembly directives are instructions that are executed by the assembler at assembly time, not by the CPU at run time. They can make the assembly of the program dependent on parameters input by the programmer, so that one program can be assembled in different ways, perhaps for different applications. They also can be used to manipulate the presentation of the program to make it easier for the programmer to read and maintain. (For example, pseudo-ops would be used to reserve storage areas and optionally set their initial contents.) The names of pseudo-ops often start with a dot to distinguish them from machine instructions. Some assemblers also support pseudo-instructions, which generate two or more machine instructions.

Symbolic assemblers allow programmers to associate arbitrary names (labels or symbols) with memory locations. Usually, every constant and variable is given a name so instructions can reference those locations by name, thus promoting self-documenting code. In executable code, the name of each subroutine is associated with its entry point, so any calls to a subroutine can use its name. Inside subroutines, GOTO destinations are given labels. Some assemblers support local symbols which are lexically distinct from normal symbols (e.g., the use of "10$" as a GOTO destination). Most assemblers provide flexible symbol management, allowing programmers to manage different namespaces, automatically calculate offsets within data structures, and assign labels that refer to literal values or the result of simple computations performed by the assembler. Labels can also be used to initialize constants and variables with relocatable addresses.



Assembly languages, like most other computer languages, allow comments to be added to assembly source code that are ignored by the assembler. Good use of comments is even more important with assembly code than with higher-level languages, as the meaning and purpose of a sequence of instructions is harder to decipher from the code itself. Wise use of these facilities can greatly simplify the problems of coding and maintaining low-level code. Raw assembly source code as generated by compilers or disassemblers—code without any comments, meaningful symbols, or data definitions—is quite difficult to read when changes must be made.

Macros Many assemblers support predefined macros, and others support programmer-defined (and repeatedly redefinable) macros involving sequences of text lines that variables and constants are embedded in. This sequence of text lines may include a sequence of instructions, or a sequence of data storage pseudo-ops. Once a macro has been defined using the appropriate pseudo-op, its name may be used in place of a mnemonic. When the assembler processes such a statement, it replaces the statement with the text lines associated with that macro, then processes them just as though they had appeared in the source code file all along (including, in better assemblers, expansion of any macros appearing in the replacement text). Since macros can have 'short' names but expand to several or indeed many lines of code, they can be used to make assembly language programs appear to be much shorter (require less lines of source code from the application programmer, as with a higher level language). They can also be used to add higher levels of structure to assembly programs, optionally introduce embedded de-bugging code via parameters and other similar features. Many assemblers have built-in (or predefined) macros for system calls and other special code sequences, such as the generation and storage of data realized through advanced bitwise and boolean operations used in gaming, software security, data management, and cryptography. Macro assemblers often allow macros to take parameters. Some assemblers include quite sophisticated macro languages, incorporating such high-level language elements as optional parameters, symbolic variables, conditionals, string manipulation, and arithmetic operations, all usable during the execution of a given macro, and allowing macros to save context or exchange information. Thus a macro might generate a large number of assembly language instructions or data definitions, based on the macro arguments. 
This could be used to generate record-style data structures or "unrolled" loops, for example, or could generate entire algorithms based on complex parameters. An organization using assembly language that has been heavily extended using such a macro suite can be considered to be working in a higher-level language, since such programmers are not working with a computer's lowest-level conceptual elements. Macros were used to customize large-scale software systems for specific customers in the mainframe era, and were also used by customer personnel to satisfy their employers' needs by making specific versions of manufacturer operating systems; this was done, for example, by systems programmers working with IBM's Conversational Monitor System/Virtual Machine (CMS/VM) and with IBM's "real time transaction processing" add-ons, CICS (Customer Information Control System) and ACP/TPF, the airline/financial system that began in the 1970s and still runs many large Global Distribution Systems (GDS) and credit card systems today. It was also possible to use solely the macro processing capabilities of an assembler to generate code written in completely different languages; for example, to generate a version of a program in COBOL using a pure macro assembler program containing lines of COBOL code inside assembly-time operators that instruct the assembler to generate arbitrary code. This was because, as was realized in the 1970s, the concept of "macro processing" is independent of the concept of "assembly": the former is, in modern terms, closer to text processing than to generating object code. The concept of macro processing appeared, and still appears, in the C programming language, which supports "preprocessor instructions" to set variables and make conditional tests on their values. Note that unlike certain



previous macro processors inside assemblers, the C preprocessor is not Turing-complete because it lacks the ability to loop or "go to" (which would allow the programmer to loop). Despite the power of macro processing, it fell into disuse in high-level languages while remaining a perennial feature of assemblers. This was because many programmers were confused by macro parameter substitution and did not distinguish macro processing from assembly and execution. Macro parameter substitution is strictly by name: at macro processing time, the value of a parameter is textually substituted for its name. The most famous class of resulting bugs was the use of a parameter that was itself an expression rather than a simple name when the macro writer expected a name. In the macro:

    foo: macro a
         load a*b

the intention was that the caller would provide the name of a variable, and the "global" variable or constant b would be used to multiply "a". If foo is called with the parameter a-c, the expansion load a-c*b occurs rather than the intended (a-c)*b. To avoid this, users of macro processors learned to religiously parenthesize formal parameters inside macro definitions, and callers had to do the same to their "actual" parameters. PL/I and C feature macros, but this facility is underused or dangerous when used because it can only manipulate text. On the other hand, homoiconic languages, such as Lisp, Prolog, and Forth, retain the power of assembly language macros because they are able to manipulate their own code as data.

Support for structured programming Some assemblers have incorporated structured programming elements to encode execution flow. The earliest example of this approach was in the Concept-14 macro set, originally proposed by Dr. H.D. Mills (March, 1970), and implemented by Marvin Kessler at IBM's Federal Systems Division, which extended the S/360 macro assembler with IF/ELSE/ENDIF and similar control flow blocks.[3] This was a way to reduce or eliminate the use of GOTO operations in assembly code, one of the main factors causing spaghetti code in assembly language. This approach was widely accepted in the early 80s (the latter days of large-scale assembly language use). A curious design was A-natural, a "stream-oriented" assembler for 8080/Z80 processors from Whitesmiths Ltd. (developers of the Unix-like Idris operating system, and what was reported to be the first commercial C compiler). The language was classified as an assembler, because it worked with raw machine elements such as opcodes, registers, and memory references; but it incorporated an expression syntax to indicate execution order. Parentheses and other special symbols, along with block-oriented structured programming constructs, controlled the sequence of the generated instructions. A-natural was built as the object language of a C compiler, rather than for hand-coding, but its logical syntax won some fans. There has been little apparent demand for more sophisticated assemblers since the decline of large-scale assembly language development.[4] In spite of that, they are still being developed and applied in cases where resource constraints or peculiarities in the target system's architecture prevent the effective use of higher-level languages.[5]

Use of assembly language Historical perspective Assembly languages were first developed in the 1950s, when they were referred to as second generation programming languages. They eliminated much of the error-prone and time-consuming first-generation programming needed with the earliest computers, freeing the programmer from tedium such as remembering numeric codes and calculating addresses. They were once widely used for all sorts of programming. However, by the 1980s (1990s on small computers), their use had largely been supplanted by high-level languages, in the search for improved programming productivity. Today, although assembly language is almost always handled and generated by compilers, it is still used for direct hardware manipulation, access to specialized processor instructions, or to address



critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems. Historically, a large number of programs have been written entirely in assembly language. Operating systems were almost exclusively written in assembly language until the widespread acceptance of C in the 1970s and early 1980s. Many commercial applications were written in assembly language as well, including a large amount of the IBM mainframe software written by large corporations. COBOL and FORTRAN eventually displaced much of this work, although a number of large organizations retained assembly-language application infrastructures well into the 1990s. Most early microcomputers relied on hand-coded assembly language, including most operating systems and large applications. This was because these systems had severe resource constraints, imposed idiosyncratic memory and display architectures, and provided limited, buggy system services. Perhaps more important was the lack of first-class high-level language compilers suitable for microcomputer use. A psychological factor may have also played a role: the first generation of microcomputer programmers retained a hobbyist, "wires and pliers" attitude. In a more commercial context, the biggest reasons for using assembly language were minimal bloat (size), minimal overhead, greater speed, and reliability. Typical examples of large assembly language programs from this time are the MS-DOS operating system, the early IBM PC spreadsheet program Lotus 1-2-3, and almost all popular games for the Atari 800 family of home computers. Even into the 1990s, most console video games were written in assembly, including most games for the Mega Drive/Genesis and the Super Nintendo Entertainment System.
According to some industry insiders, assembly language was the best computer language to use to get the best performance out of the Sega Saturn, a console that was notoriously challenging to develop and program games for.[6] The popular arcade game NBA Jam (1993) is another example. On the Commodore 64, Amiga, Atari ST, and ZX Spectrum home computers, assembler was long the primary development language. This was in large part because the BASIC dialects on these systems offered insufficient execution speed and insufficient facilities to take full advantage of the available hardware. Some systems, most notably the Amiga, even had IDEs with highly advanced debugging and macro facilities, such as the freeware ASM-One assembler,[7] comparable to those of Microsoft Visual Studio (ASM-One predates Microsoft Visual Studio). The assembler for the VIC-20 was written by Don French and published by French Silk. At 1639 bytes in length, its author believes it is the smallest symbolic assembler ever written. The assembler supported the usual symbolic addressing and the definition of character strings or hex strings. It also allowed address expressions, which could be combined with addition, subtraction, multiplication, division, logical AND, logical OR, and exponentiation operators.[8]

Current usage There have always been debates over the usefulness and performance of assembly language relative to high-level languages. Assembly language has specific niche uses where it is important; see below. But in general, modern optimizing compilers are claimed to render high-level languages into code that can run as fast as hand-written assembly, despite the counter-examples that can be found.[9] [10] [11] The complexity of modern processors and memory subsystems makes effective optimization increasingly difficult for compilers as well as for assembly programmers.[12] [13] Moreover, and to the dismay of efficiency lovers, increasing processor performance has meant that most CPUs sit idle most of the time, with delays caused by predictable bottlenecks such as I/O operations and paging. This has made raw code execution speed a non-issue for many programmers. There are some situations in which practitioners might choose to use assembly language, such as when: • a stand-alone binary executable is required, i.e. one that must execute without recourse to the run-time components or libraries associated with a high-level language; this is perhaps the most common situation. These are typically embedded programs, stored in a small amount of memory on a device intended for single-purpose tasks. Examples include telephones, automobile fuel and ignition systems, air-conditioning control systems, security systems, and sensors.



• interacting directly with the hardware, for example in device drivers and interrupt handlers.
• using processor-specific instructions not exploited by or available to the compiler. A common example is the bitwise rotation instruction at the core of many encryption algorithms.
• creating vectorized functions for programs in higher-level languages such as C. In the higher-level language this is sometimes aided by compiler intrinsic functions which map directly to SIMD mnemonics, but nevertheless result in a one-to-one assembly conversion specific to the given vector processor.
• extreme optimization is required, e.g., in an inner loop in a processor-intensive algorithm. Game programmers take advantage of hardware features in this way, enabling their games to run faster.
• a system with severe resource constraints (e.g., an embedded system) must be hand-coded to maximize the use of limited resources; but this is becoming less common as processor prices decrease and performance improves.
• no high-level language exists, for example on a new or specialized processor.
• writing real-time programs that need precise timing and responses, such as simulations, flight navigation systems, and medical equipment. For example, in a fly-by-wire system, telemetry must be interpreted and acted upon within strict time constraints. Such systems must eliminate sources of unpredictable delay, which may be created by (some) interpreted languages, automatic garbage collection, paging operations, or preemptive multitasking. Some higher-level languages incorporate run-time components and operating system interfaces that can introduce such delays. Choosing assembly or lower-level languages for such systems gives the programmer greater visibility and control over processing details.
• complete control over the environment is required, in extremely high security situations where nothing can be taken for granted.
• writing computer viruses, bootloaders, certain device drivers, or other items very close to the hardware or low-level operating system.
• writing instruction set simulators for monitoring, tracing and debugging where additional overhead is kept to a minimum.
• reverse-engineering existing binaries that may or may not have originally been written in a high-level language, for example when cracking copy protection of proprietary software.
• reverse engineering and modifying video games (also known as ROM hacking), which is possible with a range of techniques. The most widely employed is altering the program code at the assembly language level.
• writing self-modifying code, to which assembly language lends itself well.
• writing games and other software for graphing calculators.[14]
• writing compiler software that generates assembly code; the writers should therefore be expert assembly language programmers themselves.
• writing cryptographic algorithms that must always take strictly the same time to execute, preventing timing attacks.
Nevertheless, assembly language is still taught in most Computer Science and Electronic Engineering programs. Although few programmers today regularly work with assembly language as a tool, the underlying concepts remain very important. Such fundamental topics as binary arithmetic, memory allocation, stack processing, character set encoding, interrupt processing, and compiler design would be hard to study in detail without a grasp of how a computer operates at the hardware level. Since a computer's behavior is fundamentally defined by its instruction set, the logical way to learn such concepts is to study an assembly language. Most modern computers have similar instruction sets. Therefore, studying a single assembly language is sufficient to learn: i) the basic concepts; ii) to recognize situations where the use of assembly language might be appropriate; and iii) to see how efficient executable code can be created from high-level languages.[15]




Typical applications Hard-coded assembly language is typically used in a system's boot ROM (the BIOS on IBM-compatible PC systems). This low-level code is used, among other things, to initialize and test the system hardware prior to booting the OS, and is stored in ROM. Once a certain level of hardware initialization has taken place, execution transfers to other code, typically written in higher-level languages; but the code running immediately after power is applied is usually written in assembly language. The same is true of most boot loaders. Many compilers render high-level languages into assembly first before fully compiling, allowing the assembly code to be viewed for debugging and optimization purposes. Relatively low-level languages, such as C, often provide special syntax to embed assembly language directly in the source code. Programs using such facilities, such as the Linux kernel, can then construct abstractions utilizing different assembly language on each hardware platform. The system's portable code can then utilize these processor-specific components through a uniform interface. Assembly language is also valuable in reverse engineering, since many programs are distributed only in machine code form, and machine code is usually easy to translate into assembly language and carefully examine in this form, but very difficult to translate into a higher-level language. Tools such as the Interactive Disassembler make extensive use of disassembly for such a purpose. A particular niche that makes use of assembly language is the demoscene. Certain competitions require the contestants to restrict their creations to a very small size (e.g. 256 B, 1 KB, 4 KB or 64 KB), and assembly language is the language of choice to achieve this goal.[16] When resources are a concern, particularly on CPU-constrained systems like the earlier Amiga models and the Commodore 64, assembler coding is a must: optimized assembler code is written "by hand" and instructions are sequenced manually by the coders in an attempt to minimize the number of CPU cycles used; the CPU constraints are so great that every CPU cycle counts. Such techniques have enabled systems like the Commodore 64 to produce real-time 3D graphics with advanced effects, a feat which might be considered unlikely or even impossible for a system with a 0.99 MHz processor.

Related terminology • Assembly language or assembler language is commonly called assembly, assembler, ASM, or symbolic machine code. A generation of IBM mainframe programmers called it BAL, for Basic Assembly Language. Note: calling the language assembler is of course potentially confusing and ambiguous, since this is also the name of the utility program that translates assembly language statements into machine code. Some may regard this as imprecision or error; however, this usage has been common among professionals and in the literature for decades.[17] Similarly, some early computers called their assembler an assembly program.[18]
• The computational step where an assembler is run, including all macro processing, is known as assembly time.
• The use of the word assembly dates from the early years of computers (cf. short code, speedcode).
• A cross assembler (see cross compiler) is functionally just an assembler. The term is used to stress that the assembler is run on a different computer than the target system, the system on which the resulting code is run. Because assemblers are nowadays written portably in a high-level language like C, this distinction is largely irrelevant. Cross assembling may be necessary if the target system lacks the capacity to run an assembler itself, as is typically the case for small embedded systems. The most important distinguishing feature of a cross assembler is that it provides for, or interfaces to, facilities to transport the code to the target processor, e.g. to reside in flash or EPROM. It generates a binary image or Intel Hex file rather than an object file.
• An assembler directive is a command given to an assembler. These directives may do anything from telling the assembler to include other source files to telling it to allocate memory for constant data.




List of assemblers for different computer architectures The following page lists assemblers for various computer architectures, along with associated information for each assembler: • List of assemblers

Further details For any given personal computer, mainframe, embedded system, or game console, past or present, at least one (and possibly dozens) of assemblers have been written. For some examples, see the list of assemblers. On Unix systems, the assembler is traditionally called as, although it is not a single body of code, being typically written anew for each port. A number of Unix variants use GAS. Within processor groups, each assembler has its own dialect. Sometimes one assembler can read another assembler's dialect; for example, TASM can read old MASM code, but not the reverse. FASM and NASM have similar syntax, but each supports different macros that can make them difficult to translate to each other. The basics are all the same, but the advanced features differ.[19] Also, assembly can sometimes be portable across different operating systems on the same type of CPU. Calling conventions between operating systems often differ slightly or not at all, and with care it is possible to gain some portability in assembly language, usually by linking with a C library that does not change between operating systems. An instruction set simulator (which would ideally be written in an assembler language) can, in theory, process the object code/binary of any assembler to achieve portability even across platforms (with an overhead no greater than a typical bytecode interpreter). This is essentially what microcode achieves when a hardware platform changes internally. For example, many things in libc depend on the preprocessor to do OS-specific, C-specific things to the program before compiling. In fact, some functions and symbols are not even guaranteed to exist outside of the preprocessor. Worse, the size and field order of structs, as well as the size of certain typedefs such as off_t, are entirely unavailable in assembly language without help from a configure script, and differ even between versions of Linux, making it impossible to portably call functions in libc other than ones that take only simple integers and pointers as parameters. To address this issue, the FASMLIB project provides a portable assembly library for Win32 and Linux platforms, but it is as yet very incomplete.[20] Some higher-level computer languages, such as C and Borland Pascal, support inline assembly, where relatively brief sections of assembly code can be embedded into the high-level language code. The Forth programming language commonly contains an assembler used in CODE words. Many people use an emulator to debug assembly-language programs.

Example listing of assembly language source code

Address  Label     Instruction (AT&T syntax)[21]   Object code

                   .begin
                   .org 2048
         a_start   .equ 3000
2048               ld length,%
2064               be done                         00000010 10000000 00000000 00000110
2068               addcc %r1,-4,%r1                10000010 10000000 01111111 11111100
2072               addcc %r1,%r2,%r4               10001000 10000000 01000000 00000010
2076               ld %r4,%r5                      11001010 00000001 00000000 00000000
2080               ba loop                         00010000 10111111 11111111 11111011
2084               addcc %r3,%r5,%r3               10000110 10000000 11000000 00000101
2088     done:     jmpl %r15+4,%r0                 10000001 11000011 11100000 00000100
2092     length:   20                              00000000 00000000 00000000 00010100
2096     address:  a_start                         00000000 00000000 00001011 10111000
                   .org a_start
3000     a:

Example of a selection of instructions (for a virtual computer[22]) with the corresponding address in memory where each instruction will be placed. These addresses are not static; see memory management. Accompanying each instruction is the object code generated by the assembler, which coincides with the virtual computer's architecture (or ISA).

See also
• Compiler
• Disassembler
• Instruction set
• Little man computer – an educational computer model with a base-10 assembly language
• Microassembler
• Typed assembly language

Further reading
• ASM Community Book [23] "An online book full of helpful ASM info, tutorials and code examples" by the ASM Community
• Jonathan Bartlett: Programming from the Ground Up [24]. Bartlett Publishing, 2004. ISBN 0-9752838-4-7. Also available online as PDF [25]
• Robert Britton: MIPS Assembly Language Programming. Prentice Hall, 2003. ISBN 0-13-142044-5
• Paul Carter: PC Assembly Language. Free ebook, 2001. Website [26]
• Jeff Duntemann: Assembly Language Step-by-Step. Wiley, 2000. ISBN 0-471-37523-3
• Randall Hyde: The Art of Assembly Language. No Starch Press, 2003. ISBN 1-886411-97-2. Draft versions available online [27] as PDF and HTML
• Peter Norton, John Socha: Peter Norton's Assembly Language Book for the IBM PC. Brady Books, NY: 1986.


• Michael Singer: PDP-11. Assembler Language Programming and Machine Organization. John Wiley & Sons, NY: 1980.
• Dominic Sweetman: See MIPS Run. Morgan Kaufmann Publishers, 1999. ISBN 1-55860-410-3
• John Waldron: Introduction to RISC Assembly Language Programming. Addison Wesley, 1998. ISBN 0-201-39828-1

External links
• Randall Hyde's The Art of Assembly Language as HTML and PDF version [27]
• Machine language for beginners [28]
• Introduction to assembly language [29]
• The ASM Community [30], a programming resource about assembly including an ASM Book [23]
• Intel Assembly 80x86 CodeTable [31] (a cheat sheet reference)
• Unix Assembly Language Programming [32]
• IBM z/Architecture Principles of Operation [33] IBM manuals on mainframe machine language and internals.
• IBM High Level Assembler [34] IBM manuals on mainframe assembler language.
• PPR: Learning Assembly Language [35]
• An Introduction to Writing 32-bit Applications Using the x86 Assembly Language [36]
• Assembly Language Programming Examples [37]
• Authoring Windows Applications In Assembly Language [38]
• Information on Linux assembly programming [39]
• x86 Instruction Set Reference [40]
• Iczelion's Win32 Assembly Tutorial [41]
• Assembly Optimization Tips [42] by Mark Larson
• NASM Manual [43]
• 8086 assembly coding [44] by F.A. Smit
• Microchip PIC assembly coding basics [45]

References
[1] David Salomon (1993). Assemblers and Loaders (http://www.davidsalomon.name/assem.advertis/asl.pdf)
[2] Beck, Leland L. (1996). "2". System Software: An Introduction to Systems Programming. Addison Wesley.
[3] "Concept 14 Macros" (http://skycoast.us/pscott/software/mvs/concept14.html). MVS Software. Retrieved May 25, 2009.
[4] Answers.com. "assembly language: Definition and Much More from Answers.com" (http://www.answers.com/topic/assembly-language?cat=technology). Retrieved 2008-06-19.
[5] NESHLA: The High Level, Open Source, 6502 Assembler for the Nintendo Entertainment System (http://neshla.sourceforge.net/)
[6] Eidolon's Inn: SegaBase Saturn (http://www.eidolons-inn.net/tiki-index.php?page=SegaBase+Saturn)
[7] http://www.theflamearrows.info/homepage.html
[8] Jim Lawless (2004-05-21). "Speaking with Don French: The Man Behind the French Silk Assembler Tools" (http://www.radiks.net/~jimbo/art/int7.htm). Retrieved 2008-07-25.
[9] "Writing the Fastest Code, by Hand, for Fun: A Human Computer Keeps Speeding Up Chips" (http://www.nytimes.com/2005/11/28/technology/28super.html?_r=1). New York Times, John Markoff. 2005-11-28. Retrieved 2010-03-04.
[10] "Bit-field-badness" (http://hardwarebug.org/2010/01/30/bit-field-badness/). hardwarebug.org. 2010-01-30. Retrieved 2010-03-04.
[11] "GCC makes a mess" (http://hardwarebug.org/2009/05/13/gcc-makes-a-mess/). hardwarebug.org. 2009-05-13. Retrieved 2010-03-04.
[12] Randall Hyde. "The Great Debate" (http://webster.cs.ucr.edu/Page_TechDocs/GreatDebate/debate1.html). Retrieved 2008-07-03.
[13] "Code sourcery fails again" (http://hardwarebug.org/2008/11/28/codesourcery-fails-again/). hardwarebug.org. 2010-01-30. Retrieved 2010-03-04.
[14] "68K Programming in Fargo II" (http://tifreakware.net/tutorials/89/a/calc/fargoii.htm). Retrieved 2008-07-03.
[15] Hyde, Randall (1996-09-30). "Foreword ("Why would anyone learn this stuff?"), op. cit." (http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/fwd.html). Retrieved 2010-03-05.
[16] "256bytes demos archives" (http://web.archive.org/web/20080211025322rn_1/www.256b.com/home.php). Retrieved 2008-07-03.
[17] Stroustrup, Bjarne, The C++ Programming Language, Addison-Wesley, 1986, ISBN 0-201-12078-X: "C++ was primarily designed so that the author and his friends would not have to program in assembler, C, or various modern high-level languages." [use of the term assembler to mean assembly language]
[18] Saxon, James, and Plette, William, Programming the IBM 1401, Prentice-Hall, 1962, LoC 62-20615. [use of the term assembly program]
[19] Randall Hyde. "Which Assembler is the Best?" (http://webster.cs.ucr.edu/AsmTools/WhichAsm.html). Retrieved 2007-10-19.
[20] "vid". "FASMLIB: Features" (http://fasmlib.x86asm.net/features.html). Retrieved 2007-10-19.
[21] Murdocca, Miles J.; Vincent P. Heuring (2000). Principles of Computer Architecture. Prentice-Hall. ISBN 0-201-43664-7.
[22] Principles of Computer Architecture (http://iiusatech.com/~murdocca/POCA) (POCA) – ARCTools virtual computer available for download to execute referenced code, accessed August 24, 2005
[23] http://www.asmcommunity.net/book/
[24] http://programminggroundup.blogspot.com/
[25] http://download.savannah.gnu.org/releases-noredirect/pgubook/ProgrammingGroundUp-1-0-booksize.pdf
[26] http://drpaulcarter.com/pcasm/
[27] http://webster.cs.ucr.edu/AoA/index.html
[28] http://www.atariarchives.org/mlb/introduction.php
[29] http://www.swansontec.com/sprogram.htm
[30] http://www.asmcommunity.net/
[31] http://www.jegerlehner.ch/intel/IntelCodeTable.pdf
[32] http://www.int80h.org/
[33] http://www-03.ibm.com/systems/z/os/zos/bkserv/r8pdf/index.html#zarchpops
[34] http://www-03.ibm.com/systems/z/os/zos/bkserv/r8pdf/index.html#hlasm
[35] http://c2.com/cgi/wiki?LearningAssemblyLanguage
[36] http://siyobik.info/index.php?document=x86_32bit_asm
[37] http://www.azillionmonkeys.com/qed/asmexample.html
[38] http://www.grc.com/smgassembly.htm
[39] http://linuxassembly.org/
[40] http://siyobik.info/index.php?module=x86
[41] http://win32assembly.online.fr/tutorials.html
[42] http://mark.masmcode.com/
[43] http://nasm.sourceforge.net/doc/nasmdoc0.html
[44] http://www.xs4all.nl/~smit/asm01001.htm#index1
[45] http://www.microautomate.com/PIC/lets-begin-assembly.html



BASIC



Screenshot of Atari BASIC, an early BASIC language for small computers

Paradigm: unstructured, later procedural, later object-oriented
Appeared in:
Designed by: John George Kemeny and Thomas Eugene Kurtz
Typing discipline: strong
Major implementations: Apple BASIC, Commodore BASIC, Microsoft BASIC, BBC BASIC, TI-BASIC
Influenced by: ALGOL 60, FORTRAN II, JOSS
Influenced: COMAL, Visual BASIC, Visual Basic .NET, Realbasic, REXX, Perl, GRASS

In computer programming, BASIC (an acronym for Beginner's All-purpose Symbolic Instruction Code[1] ) is a family of high-level programming languages. The original BASIC was designed in 1964 by John George Kemeny and Thomas Eugene Kurtz at Dartmouth College in New Hampshire, USA to provide computer access to non-science students. At the time, nearly all use of computers required writing custom software, which was something only scientists and mathematicians tended to be able to do. The language and its variants became widespread on microcomputers in the late 1970s and 1980s. BASIC remains popular to this day in a handful of highly modified dialects and new languages influenced by BASIC such as Microsoft Visual Basic. As of 2006, 59% of developers for the .NET platform used Visual Basic .NET as their only language.[2]

History Before the mid-1960s, computers were extremely expensive and used only for special-purpose tasks. A simple batch processing arrangement ran only a single "job" at a time, one after another. But during the 1960s faster and more affordable computers became available. With this extra processing power, computers would sometimes sit idle, without jobs to run. Programming languages in the batch programming era tended to be designed, like the machines on which they ran, for specific purposes (such as scientific formula calculations or business data processing or eventually for text editing). Since even the newer, less expensive machines were still major investments, there was a strong tendency to consider efficiency to be the most important feature of a language. In general, these specialized languages were difficult to use and had widely disparate syntax. As prices decreased, the possibility of sharing computer access began to move from research labs to commercial use. Newer computer systems supported time-sharing, a scheme that allows multiple users or processes to share a machine's processor and memory. In such a system the operating system alternates between running processes, giving each one a slice of processor time before switching to another. The machines had become fast enough that most users could feel they had the machine all to themselves. In theory, timesharing reduced the cost of computing tremendously, as a single


BASIC

42

machine could be shared among hundreds of users.

Early years: the mainframe and mini-computer era

The original BASIC language was designed in 1963 by John Kemeny and Thomas Kurtz[3] and implemented by a team of Dartmouth students under their direction. BASIC was designed to allow students to write programs for the Dartmouth Time-Sharing System. It was intended to address the complexity issues of older languages with a new language designed specifically for the new class of users that time-sharing systems allowed—that is, a less technical user who did not have the mathematical background of the more traditional users and was not interested in acquiring it. Being able to use a computer to support teaching and research was quite novel at the time. In the following years, as other dialects of BASIC appeared, Kemeny and Kurtz's original dialect became known as Dartmouth BASIC.

The eight design principles of BASIC were:

1. Be easy for beginners to use.
2. Be a general-purpose programming language.
3. Allow advanced features to be added for experts (while keeping the language simple for beginners).
4. Be interactive.
5. Provide clear and friendly error messages.
6. Respond quickly for small programs.
7. Not require an understanding of computer hardware.
8. Shield the user from the operating system.

The language was based partly on FORTRAN II and partly on ALGOL 60, with additions to make it suitable for timesharing. (The features of other time-sharing systems such as JOSS and CORC, and to a lesser extent LISP, were also considered.) It had been preceded by other teaching-language experiments at Dartmouth such as the DARSIMCO (1956) and DOPE (1962) implementations of SAP, and DART (1963), which was a simplified FORTRAN II. Initially, BASIC concentrated on supporting straightforward mathematical work, with matrix arithmetic support from its initial implementation as a batch language; full string functionality was added by 1965.

BASIC was first implemented on the GE-265 mainframe, which supported multiple terminals. At the time of its introduction it was a compiled language, and quite an efficient one, beating FORTRAN II and ALGOL 60 implementations on the 265 at several fairly computationally intensive (at the time) programming problems such as numerical integration by Simpson's Rule.

The designers of the language decided to make the compiler available free of charge so that the language would become widespread. They also made it available to high schools in the Hanover, NH area and put a considerable amount of effort into promoting the language. As a result, knowledge of BASIC became relatively widespread (for a computer language), and BASIC was implemented by a number of manufacturers, becoming fairly popular on newer minicomputers like the DEC PDP series and the Data General Nova. The BASIC language was also central to the HP Time-Shared BASIC system in the late 1960s and early 1970s, and to the Pick operating system. In these instances the language tended to be implemented as an interpreter, instead of (or in addition to) a compiler.
Several years after its release, highly respected computer professionals, notably Edsger W. Dijkstra, expressed their opinions that the use of GOTO statements, which existed in many languages including BASIC, promoted poor programming practices.[4] Some have also derided BASIC as too slow (most interpreted versions are slower than equivalent compiled versions) or too simple (many versions, especially for small computers, left out important features and capabilities).



Explosive growth: the home computer era

Notwithstanding the language's use on several minicomputers, it was the introduction of the MITS Altair 8800 "kit" microcomputer in 1975 that provided BASIC a path to universality. Most programming languages required suitable text editors, large amounts of memory and available disk space, whereas the early microcomputers had no resident editors and limited memory, and often substituted recordable audio tapes for disk space. These constraints suited a language like BASIC, which in its interpreted form, with a built-in code editor, could operate within them.

[Screenshot: MSX BASIC version 3.0]

BASIC also had the advantage that it was fairly well known to the young designers and computer hobbyists who took an interest in microcomputers, and who generally worked in the electronics industries of the day. Kemeny and Kurtz's earlier proselytizing paid off in this respect, and the few hobbyists' journals of the era were filled with columns that mentioned the language or focused entirely on one version compared to others.

One of the first BASICs to appear for 8080 machines like the Altair was Tiny BASIC, a simple implementation originally written by Dr. Li-Chen Wang and then ported onto the Altair by Dennis Allison at the request of Bob Albrecht (who later founded Dr. Dobb's Journal). The Tiny BASIC design and the full source code were published in 1976 in DDJ.

In 1975, MITS released Altair BASIC, developed by college drop-outs Bill Gates and Paul Allen as the company Micro-Soft, which grew into today's corporate giant, Microsoft. The first Altair version was co-written by Gates, Allen, and Monte Davidoff in a burst of enthusiasm and neglect of studies. Versions of Microsoft BASIC (then also, and most widely, known as M BASIC or MBASIC) were soon bundled with the original floppy-disk-based CP/M computers, which became widespread in small business environments.
As the popularity of BASIC on CP/M spread, newer computer designs introduced their own versions of the language, or had Micro-Soft port its version to their platforms. When three major new computers were introduced in what Byte Magazine would later call the "1977 Trinity",[5] all three had BASIC as their primary programming language and operating environment. The Commodore PET licensed a version of Micro-Soft BASIC ported to the MOS 6502, while the Apple II and TRS-80 both introduced new, largely similar versions of the language. As new companies entered the field, additional versions were added that subtly changed the BASIC family. The Atari 8-bit family had its own Atari BASIC, modified to fit on an 8 kB ROM cartridge. The BBC published BBC BASIC, developed for it by Acorn Computers Ltd and incorporating many extra structuring keywords. Most of the home computers of the 1980s had a ROM-resident BASIC interpreter, allowing the machines to boot directly into BASIC. Because of this legacy, there are more dialects of BASIC than of any other programming language.

As the popularity of BASIC grew in this period, magazines (such as Creative Computing Magazine in the US) published complete source code in BASIC for games, utilities, and other programs. Given BASIC's straightforward nature, it was a simple matter to type in the code from the magazine and execute the program. Different magazines were published featuring programs for specific computers, though some BASIC programs were considered universal and could be used on machines running any variant of BASIC (sometimes with minor adaptations). Correcting the publishing errors that frequently occurred in magazine listings was an educational exercise in itself. BASIC source code was also published in fully fledged books, the seminal examples being David Ahl's BASIC Computer Games series.[6] [7] [8] Later packages, such as Learn to Program BASIC, would also have gaming as an introductory focus.

Maturity: the personal computer era

As early as 1979 Microsoft was in negotiations with IBM to supply IBM PCs with an IBM Cassette BASIC (BASIC C) in BIOS. Microsoft sold several versions of BASIC for MS-DOS/PC-DOS, including IBM Disk BASIC (BASIC D), IBM BASICA (BASIC A), GW-BASIC (a BASICA-compatible version that did not need IBM's ROM) and QuickBASIC. Turbo Pascal publisher Borland published Turbo BASIC 1.0 in 1985 (successor versions are still being marketed by the original author under the name PowerBASIC). Microsoft wrote the window-based AmigaBASIC that was supplied with version 1.1 of the pre-emptive multitasking GUI Amiga computers (late 1985/early 1986), although the product unusually did not bear any Microsoft marks.

These languages introduced many extensions to the original home-computer BASIC, such as improved string manipulation and graphics support, access to the file system and additional data types. More important were the facilities for structured programming, including additional control structures and proper subroutines supporting local variables. The new graphical features of these languages also helped lay the groundwork for PC video gaming, with BASIC programs like DONKEY.BAS showing what the PC could do.

[Screenshot: IBM Cassette BASIC 1.10]

[Screenshot: IBM Disk BASIC 1.10]

However, by the latter half of the 1980s newer computers were far more capable, with more resources. At the same time, computers had progressed from a hobbyist interest to tools used primarily for applications written by others, and programming became less important for most users. BASIC started to recede in importance, though numerous versions remained available. Compiled BASIC, or CBASIC, is still used in many IBM 4690 OS point-of-sale systems.

[Screenshot: IBM BASICA 1.10]

BASIC's fortunes reversed once again with the introduction of Visual Basic by Microsoft. It is somewhat difficult to consider this language to be BASIC, because of the major shift in its orientation towards an object-oriented and event-driven perspective. The only significant similarity to older BASIC dialects was familiar syntax. Syntax itself no longer "fully defined" the language, since much development was done using "drag and drop" methods without exposing all the code for commonly used objects such as buttons and scrollbars to the developer. While this could be considered an evolution of the language, few of the distinctive features of early Dartmouth BASIC, such as line numbers and the INPUT keyword, remain (although Visual Basic still uses INPUT to read data from files, and INPUTBOX is available for direct user input; line numbers can also optionally be used in all VB versions, even VB.NET, albeit they cannot be used in certain places, for instance before SUB).

[Screenshot: GW-BASIC 3.23]

Ironically, given the origin of BASIC as a "beginner's" language, and apparently even to the surprise of many at Microsoft, who initially still marketed Visual Basic or "VB" as a language for hobbyists, the language came into widespread use for small custom business applications shortly after the release of VB version 3.0, which is widely considered the first relatively stable version. While many advanced programmers still scoffed at its use, VB met the needs of small businesses efficiently wherever processing speed was less of a concern than easy development. By that time, computers running Windows 3.1 had become fast enough that many business-related processes could be completed "in the blink of an eye" even using a "slow" language, as long as massive amounts of data were not involved.
Many small business owners found they could create their own small yet useful applications in a few evenings to meet their own specialized needs. Eventually, during the lengthy lifetime of VB3, knowledge of Visual Basic became a marketable job skill. The language, like QBasic before it,[9] also became a favourite for amateur game development,[10] one example being the Arthur Yahtzee trilogy of amateur adventure games.

Many BASIC dialects have also sprung up in the last few years, including Bywater BASIC and True BASIC (the direct successor to Dartmouth BASIC, from a company controlled by Kurtz). One notable variant is REALbasic, which, although first released in 1998 for Macintosh computers, has since 2005 fully compiled programs for Microsoft Windows, Mac OS X and 32-bit x86 Linux from the same object-oriented source code. REALbasic-compiled programs may execute natively on these platforms as services, consoles or windowed applications. However, in keeping with BASIC tradition, single-platform hobbyist versions are also still maintained. Many other BASIC variants and adaptations have been written by hobbyists, equipment developers, and others, as it is a relatively simple language to develop translators for. An example of an open-source interpreter, written in C, is MiniBasic.[11] More complex examples of free-software BASIC implementations (development tools and compilers) include Gambas and FreeBASIC.

[Screenshot: Three modern Basic variants: Mono Basic, OpenOffice.org Basic and Gambas]

The ubiquity of BASIC interpreters on personal computers was such that textbooks once included simple "Try It In BASIC" exercises that encouraged students to experiment with mathematical and computational concepts on classroom or home computers. Futurist and sci-fi writer David Brin mourned the loss of ubiquitous BASIC in a 2006 Salon article.[12]

Examples

Unstructured BASIC

New BASIC programmers on a home computer might start with a simple program similar to the Hello world program made famous by Kernighan and Ritchie. This generally involves simple use of the language's PRINT statement to display the message (such as the programmer's name) on the screen. Often an infinite loop was used to fill the display with the message. Most first-generation BASIC languages such as MSX BASIC and GW-BASIC supported simple data types, loops and arrays. The following example is written for GW-BASIC, but will work in most versions of BASIC with minimal changes:

10 INPUT "What is your name: ", U$
20 PRINT "Hello "; U$
30 INPUT "How many stars do you want: ", N
40 S$ = ""
50 FOR I = 1 TO N
60 S$ = S$ + "*"
70 NEXT I
80 PRINT S$
90 INPUT "Do you want more stars? ", A$
100 IF LEN(A$) = 0 THEN GOTO 90
110 A$ = LEFT$(A$, 1)
120 IF A$ = "Y" OR A$ = "y" THEN GOTO 30
130 PRINT "Goodbye "; U$
140 END

Structured BASIC

Second-generation BASICs (for example QuickBASIC and PowerBASIC) introduced a number of features into the language, primarily related to structured and procedure-oriented programming. Usually, line numbering is omitted from the language and replaced with labels (for GOTO) and procedures to encourage easier and more flexible design.[13]

INPUT "What is your name: ", UserName$
PRINT "Hello "; UserName$
DO
  INPUT "How many stars do you want: ", NumStars
  Stars$ = STRING$(NumStars, "*")
  PRINT Stars$
  DO
    INPUT "Do you want more stars? ", Answer$
  LOOP UNTIL Answer$ <> ""
  Answer$ = LEFT$(Answer$, 1)
LOOP WHILE UCASE$(Answer$) = "Y"
PRINT "Goodbye "; UserName$

BASIC with object-oriented features

Third-generation BASIC dialects such as Visual Basic, REALbasic, StarOffice Basic and BlitzMax introduced features to support object-oriented and event-driven programming paradigms. Most built-in procedures and functions are now represented as methods of standard objects rather than as operators. The following example is in Visual Basic .NET:

Public Class StarsProgram
    Public Shared Sub Main()
        Dim UserName, Answer, stars As String, NumStars As Integer
        Console.Write("What is your name: ")
        UserName = Console.ReadLine()
        Console.WriteLine("Hello {0}", UserName)
        Do
            Console.Write("How many stars do you want: ")
            NumStars = CInt(Console.ReadLine())
            stars = New String("*", NumStars)
            Console.WriteLine(stars)
            Do
                Console.Write("Do you want more stars? ")
                Answer = Console.ReadLine()
            Loop Until Answer <> ""
            Answer = Answer.Substring(0, 1)
        Loop While Answer.ToUpper() = "Y"
        Console.WriteLine("Goodbye {0}", UserName)
    End Sub
End Class

List of BASIC programming commands/statements

1. LET command - used to assign a value to a variable.
2. INPUT statement - a conversational statement wherein the computer asks the user for the value of a variable. Types of input statement:
   a. ordinary input
   b. prompted input (a string variable name is used)
3. IF ... THEN statement - used in comparison or decision making.
4. TAB function - gives the user complete control over the position where the next character will be shown on the screen or printed on paper; its argument represents the position where printing will be made.
5. GOSUB command - used to avoid repetitive typing of the same set of instructions in the program.



6. REM (remark) - used to give a title to the program and to help identify the purpose of a given section of code.
7. ON ... GOTO command - allows the program to choose from a list of line numbers where to go, depending on certain conditions; a variation of the IF ... THEN statement and the GOTO command.

See also

• List of BASIC dialects
• List of BASIC dialects by platform

References

• (PDF) A Manual for BASIC, the elementary algebraic language designed for use with the Dartmouth Time Sharing System [14]. Dartmouth College Computation Center. 1964. The original Dartmouth BASIC manual.
• Lien, David A. (1986). The Basic Handbook: Encyclopedia of the BASIC Computer Language (3rd ed.). Compusoft Publishing. ISBN 0-932760-33-3. Documents dialect variations for over 250 versions of BASIC.
• Kemeny, John G.; Kurtz, Thomas E. (1985). Back To BASIC: The History, Corruption, and Future of the Language. Addison-Wesley. 141 pp. ISBN 0-201-13433-0.
• Sammet, Jean E. (1969). Programming Languages: History and Fundamentals. Englewood Cliffs, N.J.: Prentice-Hall.
• The Encyclopedia of Computer Languages. BASIC - Beginners All-purpose Symbolic Instruction Code [15]. Murdoch University.

Standards

• ANSI/ISO/IEC Standard for Minimal BASIC:
  • ANSI X3.60-1978 "FOR MINIMAL BASIC"
  • ISO/IEC 6373:1984 "DATA PROCESSING - PROGRAMMING LANGUAGES - MINIMAL BASIC"
• ANSI/ISO/IEC Standard for Full BASIC:
  • ANSI X3.113-1987 "PROGRAMMING LANGUAGES FULL BASIC"
  • INCITS/ISO/IEC 10279-1991 (R2005) "Information Technology - Programming Languages - Full BASIC" [16]

• ANSI/ISO/IEC Addendum Defining Modules:
  • ANSI X3.113 INTERPRETATIONS-1992 "BASIC TECHNICAL INFORMATION BULLETIN # 1 INTERPRETATIONS OF ANSI 03.113-1987"
  • ISO/IEC 10279:1991/Amd 1:1994 "MODULES AND SINGLE CHARACTER INPUT ENHANCEMENT"
• ECMA-116 BASIC (withdrawn, similar to ANSI X3.113-1987)



External links

• BASIC [17] at the Open Directory Project
• More Basic Computer Games [18] by David Ahl
• Big Computer Games [19] by David Ahl

References

[1] The acronym is tied to the name of an unpublished paper by Thomas Kurtz and is not a backronym, as is sometimes suggested in older versions of The Jargon File (http://catb.org/~esr/jargon/html/B/BASIC.html)
[2] Mono brings Visual Basic programs to Linux (http://www.linux-watch.com/news/NS5656359853.html), by Steven J. Vaughan-Nichols, Feb. 19, 2007, Linux-Watch
[3] Thomas E. Kurtz - History of Programming Languages (http://cis-alumni.org/TKurtz.html)
[4] In a 1968 letter, Dutch computer scientist Edsger Dijkstra considered programming languages using GOTO statements for program structuring purposes harmful for the productivity of the programmer as well as the quality of the resulting code ("Go To Statement Considered Harmful" (http://www.acm.org/classics/oct95/), Communications of the ACM Volume 11, 147-148, 1968). The letter, which contributed the phrase considered harmful to programming jargon, did not mention any particular programming language; instead it states that the overuse of GOTO is damaging and gives technical reasons why this should be so. In a 1975 tongue-in-cheek article, "How do We Tell Truths that Might Hurt" (http://www.cs.virginia.edu/~evans/cs655/readings/ewd498.html), Sigplan Notices Volume 17 No. 5, Dijkstra gives a list of uncomfortable "truths", including his opinion of several programming languages of the time. Although BASIC is one of his targets ("It is practically impossible to teach good programming to students that have had a prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration"), it receives no worse treatment in the piece than PL/I, COBOL or APL.
[5] "Most Important Companies" (http://www.byte.com/art/9509/sec7/art15.htm). Byte Magazine. September 1995. Retrieved 2008-06-10.
[6] Table of Contents: BASIC Computer Games (http://www.atariarchives.org/basicgames)
[7] Table of Contents: More BASIC Computer Games (http://www.atariarchives.org/morebasicgames)
[8] Table of Contents: Big Computer Games (http://www.atariarchives.org/bigcomputergames)
[9] QBasic Games (http://members.chello.at/theodor.lauppert/games/qbasic.htm)
[10] Some examples include Maxit (http://members.chello.at/theodor.lauppert/games/maxitwin.htm), Breme Counter (http://members.chello.at/theodor.lauppert/games/counter.htm), and a remake of Chuckie Egg (http://members.chello.at/theodor.lauppert/games/chuckie.htm)
[11] http://www.personal.leeds.ac.uk/~bgy1mm/MiniBasic/MiniBasicHome.html
[12] Why Johnny Can't Code (http://www.salon.com/tech/feature/2006/09/14/basic/index_np.html), by David Brin, Sept. 14, 2006, Salon Technology
[13] "Differences Between GW-BASIC and QBasic" (http://support.microsoft.com/kb/73084). 2003-05-12. Retrieved 2008-06-28.
[14] http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf
[15] http://hopl.murdoch.edu.au/showlanguage.prx?exp=176
[16] http://www.iso.org/iso/catalogue_detail.htm?csnumber=18321
[17] http://www.dmoz.org/Computers/Programming/Languages/BASIC/
[18] http://www.atariarchives.org/morebasicgames
[19] http://www.atariarchives.org/bigcomputergames


C (programming language)

[Cover image: The C Programming Language (also known as "K&R"), the seminal book on C]

Usual file extensions: .h, .c
Paradigm: Imperative (procedural), structured
Appeared in: 1972
Designed by: Dennis Ritchie
Developer: Dennis Ritchie & Bell Labs
Stable release: C99 (March 2000)
Typing discipline: Static, weak, manifest
Major implementations: Clang, GCC, MSVC, Turbo C, Watcom C
Dialects: Cyclone, Unified Parallel C, Split-C, Cilk, C*
Influenced by: B (BCPL, CPL), ALGOL 68,[1] Assembly, PL/I, FORTRAN
Influenced: AWK, csh, C++, C--, C#, Objective-C, BitC, D, Go, Java, JavaScript, Limbo, Perl, PHP, Pike, Processing, Python
OS: Cross-platform (multi-platform)

C (pronounced /siː/, like "see") is a general-purpose computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system.[2] Although C was designed for implementing system software,[3] it is also widely used for developing portable application software. C is one of the most popular programming languages[4] [5] and there are very few computer architectures for which a C compiler does not exist. C has greatly influenced many other popular programming languages, most notably C++, which originally began as an extension to C.



Design

C is an imperative (procedural) systems implementation language. It was designed to be compiled using a relatively straightforward compiler, to provide low-level access to memory, to provide language constructs that map efficiently to machine instructions, and to require minimal run-time support. C was therefore useful for many applications that had formerly been coded in assembly language.

Despite its low-level capabilities, the language was designed to encourage machine-independent programming. A standards-compliant and portably written C program can be compiled for a very wide variety of computer platforms and operating systems with little or no change to its source code. The language has become available on a very wide range of platforms, from embedded microcontrollers to supercomputers.

Minimalism

C's design is tied to its intended use as a portable systems implementation language. It provides simple, direct access to any addressable object (for example, memory-mapped device control registers), and its source-code expressions can be translated in a straightforward manner to primitive machine operations in the executable code. Some early C compilers were comfortably implemented (as a few distinct passes communicating via intermediate files) on PDP-11 processors having only 16 address bits. C compilers for several common 8-bit platforms have been implemented as well.

Characteristics

Like most imperative languages in the ALGOL tradition, C has facilities for structured programming and allows lexical variable scope and recursion, while a static type system prevents many unintended operations. In C, all executable code is contained within functions. Function parameters are always passed by value. Pass-by-reference is simulated in C by explicitly passing pointer values. Heterogeneous aggregate data types (struct) allow related data elements to be combined and manipulated as a unit. C program source text is free-format, using the semicolon as a statement terminator (not a delimiter).

C also exhibits the following more specific characteristics:

• variables may be hidden in nested blocks
• partially weak typing; for instance, characters can be used as integers
• low-level access to computer memory by converting machine addresses to typed pointers
• function and data pointers supporting ad hoc run-time polymorphism
• array indexing as a secondary notion, defined in terms of pointer arithmetic
• a preprocessor for macro definition, source code file inclusion, and conditional compilation
• complex functionality such as I/O, string manipulation, and mathematical functions consistently delegated to library routines
• a relatively small set of reserved keywords
• a lexical structure that resembles B more than ALGOL, for example:
  • { ... } rather than either of ALGOL 60's begin ... end or ALGOL 68's ( ... )
  • = is used for assignment (copying), like Fortran, rather than ALGOL's :=
  • == is used to test for equality (rather than .EQ. in Fortran, or = in BASIC and ALGOL)
  • logical "and" and "or" are represented with && and || in place of ALGOL's ∧ and ∨ operators; note that the doubled-up operators will never evaluate the right operand if the result can be determined from the left alone (this is called short-circuit evaluation), and are semantically distinct from the bit-wise operators & and |. (However, the Unix Version 6 and 7 dialects of C used ALGOL's /\ and \/ operators, in ASCII, but for determining the infimum and supremum respectively.[6])



• a large number of compound operators, such as +=, -=, *= and ++ etc. (equivalent to the ALGOL 68 operators +:=, -:=, *:= and +:=1)

Absent features

The relatively low-level nature of the language affords the programmer close control over what the computer does, while allowing special tailoring and aggressive optimization for a particular platform. This allows the code to run efficiently on very limited hardware, such as embedded systems. C does not have some features that are available in some other programming languages:

• No nested function definitions
• No direct assignment of arrays or strings (copying can be done via standard functions; assignment of objects having struct or union type is supported)
• No automatic garbage collection
• No requirement for bounds checking of arrays
• No operations on whole arrays
• No syntax for ranges, such as the A..B notation used in several languages
• Prior to C99, no separate Boolean type (zero/nonzero is used instead)[7]
• No formal closures or functions as parameters (only function and variable pointers)
• No generators or coroutines; intra-thread control flow consists of nested function calls, except for the use of the longjmp or setcontext library functions
• No exception handling; standard library functions signify error conditions with the global errno variable and/or special return values, and library functions provide non-local gotos
• Only rudimentary support for modular programming
• No compile-time polymorphism in the form of function or operator overloading
• Very limited support for object-oriented programming with regard to polymorphism and inheritance
• Limited support for encapsulation
• No native support for multithreading and networking
• No standard libraries for computer graphics and several other application programming needs

A number of these features are available as extensions in some compilers, or are provided in some operating environments (e.g., POSIX), or are supplied by third-party libraries, or can be simulated by adopting certain coding disciplines.

Undefined behavior

Many operations in C that have undefined behavior are not required to be diagnosed at compile time. In the case of C, "undefined behavior" means that the exact behavior which arises is not specified by the standard, and exactly what will happen does not have to be documented by the C implementation. A famous, although misleading, expression in the newsgroups comp.std.c and comp.lang.c is that the program could cause "demons to fly out of your nose".[8]

Sometimes in practice what happens for an instance of undefined behavior is a bug that is hard to track down and which may corrupt the contents of memory. Sometimes a particular compiler generates reasonable and well-behaved actions that are completely different from those that would be obtained using a different C compiler. The reason some behavior has been left undefined is to allow compilers for a wide variety of instruction set architectures to generate more efficient executable code for well-defined behavior, which was deemed important for C's primary role as a systems implementation language; thus C makes it the programmer's responsibility to avoid undefined behavior, possibly using tools to find parts of a program whose behavior is undefined.

Examples of undefined behavior are:

• accessing outside the bounds of an array
• overflowing a signed integer



• reaching the end of a non-void function without finding a return statement, when the return value is used
• reading the value of a variable before initializing it

These operations are all programming errors that could occur using many programming languages; C draws criticism because its standard explicitly identifies numerous cases of undefined behavior, including some where the behavior could have been made well defined, and does not specify any run-time error handling mechanism.

Invoking fflush() on a stream opened for input is an example of a different kind of undefined behavior, not necessarily a programming error but a case for which some conforming implementations may provide well-defined, useful semantics (in this example, presumably discarding input through the next new-line) as an allowed extension. Use of such nonstandard extensions generally limits software portability.

History

Early developments

The initial development of C occurred at AT&T Bell Labs between 1969 and 1973; according to Ritchie, the most creative period occurred in 1972. The language was named "C" because its features were derived from an earlier language called "B", which according to Ken Thompson was a stripped-down version of the BCPL programming language.

The origin of C is closely tied to the development of the Unix operating system, originally implemented in assembly language on a PDP-7 by Ritchie and Thompson, incorporating several ideas from colleagues. Eventually they decided to port the operating system to a PDP-11. B's inability to take advantage of some of the PDP-11's features, notably byte addressability, led to the development of an early version of C. The original PDP-11 version of the Unix system was developed in assembly language. By 1973, with the addition of struct types, the C language had become powerful enough that most of the Unix kernel was rewritten in C. This was one of the first operating system kernels implemented in a language other than assembly. (Earlier instances include the Multics system, written in PL/I, and MCP (Master Control Program) for the Burroughs B5000, written in ALGOL in 1961.)

K&R C

In 1978, Brian Kernighan and Dennis Ritchie published the first edition of The C Programming Language. This book, known to C programmers as "K&R", served for many years as an informal specification of the language. The version of C that it describes is commonly referred to as K&R C. The second edition of the book covers the later ANSI C standard. K&R introduced several language features:

• standard I/O library
• long int data type
• unsigned int data type
• compound assignment operators of the form =op (such as =-), which were changed to the form op= to remove the semantic ambiguity created by constructs such as i=-10, which had been interpreted as i =- 10 instead of the possibly intended i = -10

Even after the publication of the 1989 C standard, for many years K&R C was still considered the "lowest common denominator" to which C programmers restricted themselves when maximum portability was desired, since many older compilers were still in use, and because carefully written K&R C code can be legal Standard C as well.

In early versions of C, only functions that returned a non-integer value needed to be declared if used before the function definition; a function used without any previous declaration was assumed to return an integer, if its value was used. For example:



long int SomeFunction();
/* int OtherFunction(); */

/* int */ CallingFunction()
{
    long int test1;
    register /* int */ test2;

    test1 = SomeFunction();
    if (test1 > 0)
        test2 = 0;
    else
        test2 = OtherFunction();
    return test2;
}

All the above commented-out int declarations could be omitted in K&R C. Since K&R function declarations did not include any information about function arguments, function parameter type checks were not performed, although some compilers would issue a warning message if a local function was called with the wrong number of arguments, or if multiple calls to an external function used different numbers or types of arguments. Separate tools such as Unix's lint utility were developed that (among other things) could check for consistency of function use across multiple source files.

In the years following the publication of K&R C, several unofficial features were added to the language, supported by compilers from AT&T and some other vendors. These included:

• void functions
• functions returning struct or union types (rather than pointers)
• assignment for struct data types
• enumerated types

The large number of extensions and the lack of agreement on a standard library, together with the language's popularity and the fact that not even the Unix compilers precisely implemented the K&R specification, led to the necessity of standardization.

ANSI C and ISO C

During the late 1970s and 1980s, versions of C were implemented for a wide variety of mainframe computers, minicomputers, and microcomputers, including the IBM PC, as its popularity began to increase significantly. In 1983, the American National Standards Institute (ANSI) formed a committee, X3J11, to establish a standard specification of C. In 1989, the standard was ratified as ANSI X3.159-1989 "Programming Language C". This version of the language is often referred to as ANSI C, Standard C, or sometimes C89. In 1990, the ANSI C standard (with formatting changes) was adopted by the International Organization for Standardization (ISO) as ISO/IEC 9899:1990, which is sometimes called C90. Therefore, the terms "C89" and "C90" refer to the same programming language. ANSI, like other national standards bodies, no longer develops the C standard independently, but defers to the ISO C standard. National adoption of updates to the international standard typically occurs within a year of ISO publication. One of the aims of the C standardization process was to produce a superset of K&R C, incorporating many of the unofficial features subsequently introduced. The standards committee also included several additional features such



as function prototypes (borrowed from C++), void pointers, support for international character sets and locales, and preprocessor enhancements. The syntax for parameter declarations was also augmented to include the style used in C++, although the K&R interface continued to be permitted, for compatibility with existing source code.

C89 is supported by current C compilers, and most C code being written nowadays is based on it. Any program written only in Standard C and without any hardware-dependent assumptions will run correctly on any platform with a conforming C implementation, within its resource limits. Without such precautions, programs may compile only on a certain platform or with a particular compiler, due, for example, to the use of non-standard libraries, such as GUI libraries, or to a reliance on compiler- or platform-specific attributes such as the exact size of data types and byte endianness. In cases where code must be compilable by either standard-conforming or K&R C-based compilers, the __STDC__ macro can be used to split the code into Standard and K&R sections to take advantage of features available only in Standard C.

C99

After the ANSI/ISO standardization process, the C language specification remained relatively static for some time, whereas C++ continued to evolve, largely during its own standardization effort. In 1995 Normative Amendment 1 to the 1990 C standard was published, to correct some details and to add more extensive support for international character sets. The C standard was further revised in the late 1990s, leading to the publication of ISO/IEC 9899:1999 in 1999, which is commonly referred to as "C99". It has since been amended three times by Technical Corrigenda. The international C standard is maintained by the working group ISO/IEC JTC1/SC22/WG14. C99 introduced several new features, including inline functions, several new data types (including long long int and a complex type to represent complex numbers), variable-length arrays, support for variadic macros (macros of variable arity) and support for one-line comments beginning with //, as in BCPL or C++. Many of these had already been implemented as extensions in several C compilers. C99 is for the most part backward compatible with C90, but is stricter in some ways; in particular, a declaration that lacks a type specifier no longer has int implicitly assumed. A standard macro __STDC_VERSION__ is defined with value 199901L to indicate that C99 support is available. GCC, Sun Studio and other C compilers now support many or all of the new features of C99.

C1X

In 2007, work began in anticipation of another revision of the C standard, informally called "C1X". The C standards committee has adopted guidelines to limit the adoption of new features that have not been tested by existing implementations.

Uses

C's primary use is for "system programming", including implementing operating systems and embedded system applications, due to a combination of desirable characteristics such as code portability and efficiency, ability to access specific hardware addresses, ability to "pun" types to match externally imposed data access requirements, and low run-time demand on system resources. C can also be used for website programming using CGI as a "gateway" for information between the Web application, the server, and the browser.[9] Factors favoring C over interpreted languages include its speed, stability, and lower susceptibility to changes in operating environments, owing to its compiled nature.[10] One consequence of C's wide acceptance and efficiency is that compilers, libraries, and interpreters of other programming languages are often implemented in C.



C is sometimes used as an intermediate language by implementations of other languages. This approach may be used for portability or convenience; by using C as an intermediate language, it is not necessary to develop machine-specific code generators. Some compilers which use C this way are BitC, Gambit, the Glasgow Haskell Compiler, Squeak, and Vala. However, C was designed as a programming language, not as a compiler target language, and is thus less than ideal for use as an intermediate language. This has led to the development of C-based intermediate languages such as C--. C has also been widely used to implement end-user applications, but as applications became larger, much of that development shifted to other languages.

Syntax

Unlike languages such as FORTRAN 77, C source code is free-form, which allows arbitrary use of whitespace to format code, rather than column-based or text-line-based restrictions. Comments may appear either between the delimiters /* and */, or (in C99) following // until the end of the line.

Each source file contains declarations and function definitions. Function definitions, in turn, contain declarations and statements. Declarations either define new types using keywords such as struct, union, and enum, or assign types to and perhaps reserve storage for new variables, usually by writing the type followed by the variable name. Keywords such as char and int specify built-in types. Sections of code are enclosed in braces ({ and }, sometimes called "curly brackets") to limit the scope of declarations and to act as a single statement for control structures.

As an imperative language, C uses statements to specify actions. The most common statement is an expression statement, consisting of an expression to be evaluated, followed by a semicolon; as a side effect of the evaluation, functions may be called and variables may be assigned new values. To modify the normal sequential execution of statements, C provides several control-flow statements identified by reserved keywords. Structured programming is supported by if (and optionally else) conditional execution and by do-while, while, and for iterative execution (looping). The for statement has separate initialization, testing, and reinitialization expressions, any or all of which can be omitted. break and continue can be used to leave the innermost enclosing loop statement or skip to its reinitialization. There is also a non-structured goto statement which branches directly to the designated label within the function. switch selects a case to be executed based on the value of an integer expression. Expressions can use a variety of built-in operators (see below) and may contain function calls.
The order in which arguments to functions and operands to most operators are evaluated is unspecified. The evaluations may even be interleaved. However, all side effects (including storage to variables) will occur before the next "sequence point"; sequence points include the end of each expression statement, and the entry to and return from each function call. Sequence points also occur during evaluation of expressions containing certain operators (&&, ||, ?: and the comma operator). This permits a high degree of object code optimization by the compiler, but requires C programmers to take more care to obtain reliable results than is needed for other programming languages.

Although mimicked by many languages because of its widespread familiarity, C's syntax has often been criticized. For example, Kernighan and Ritchie say in the Introduction of The C Programming Language, "C, like any other language, has its blemishes. Some of the operators have the wrong precedence; some parts of the syntax could be better." Some specific problems worth noting are:

• Not checking number and types of arguments when the function declaration has an empty parameter list. (This provides backward compatibility with K&R C, which lacked prototypes.)
• Some questionable choices of operator precedence, as mentioned by Kernighan and Ritchie above, such as == binding more tightly than & and | in expressions like x & 1 == 0.
• The use of the = operator, used in mathematics for equality, to indicate assignment, following the precedent of Fortran, PL/I, and BASIC, but unlike ALGOL and its derivatives. Ritchie made this syntax design decision

56


consciously, based primarily on the argument that assignment occurs more often than comparison.
• Similarity of the assignment and equality operators (= and ==), making it easy to accidentally substitute one for the other. C's weak type system permits each to be used in the context of the other without a compilation error (although some compilers produce warnings). For example, the conditional expression in if (a=b) is only true if a is not zero after the assignment.[11]
• A lack of infix operators for complex objects, particularly for string operations, making programs which rely heavily on these operations (implemented as functions instead) somewhat difficult to read.
• A declaration syntax that some find unintuitive, particularly for function pointers. (Ritchie's idea was to declare identifiers in contexts resembling their use: "declaration reflects use".)

Operators

C supports a rich set of operators, which are symbols used within an expression to specify the manipulations to be performed while evaluating that expression. C has operators for:

• arithmetic (+, -, *, /, %)
• equality testing (==, !=)
• order relations (<, <=, >, >=)
• boolean logic (!, &&, ||)
• bitwise logic (~, &, |, ^)
• bitwise shifts (<<, >>)
• assignment (=, +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=)
• increment and decrement (++, --)
• reference and dereference (&, *, [ ])
• conditional evaluation (? :)
• member selection (., ->)
• type conversion (( ))
• object size (sizeof)
• function argument collection (( ))
• sequencing (,)
• subexpression grouping (( ))

C has a formal grammar, specified by the C standard.

Integer-float conversion and rounding

The type casting syntax can be used to convert values between an integer type and a floating-point type, or between two integer types or two floating-point types with different sizes; e.g. (long int)sqrt(1000.0), (double)(256*256), or (float)sqrt(1000.0). Conversions are implicit in several contexts, e.g. when assigning a value to a variable or to a function parameter, when using a floating-point value as an index to a vector, or in arithmetic operations on operands with different types. Unlike some other cases of type casting (where the bit encoding of the operands is simply re-interpreted according to the target type), conversions between integers and floating-point values generally change the bit encoding so as to preserve the numerical value of the operand, to the extent possible. In particular, conversion from an integer to a floating-point type will preserve its numeric value exactly—unless the number of fraction bits in the target type is insufficient, in which case the least-significant bits are lost. Conversion from a floating-point value to an integer type entails truncation of any fractional part (i.e. the value is rounded "towards zero"). For other kinds of rounding, the C99 standard specifies (in <math.h>) the following functions:

• round(): round to nearest integer, halfway cases away from zero

57


• rint(), nearbyint(): round according to the current floating-point rounding direction
• ceil(): smallest integral value not less than the argument (round up)
• floor(): largest integral value (in double representation) not greater than the argument (round down)
• trunc(): round towards zero (same as typecasting to an int)

All these functions take a double argument and return a double result, which may then be cast to an integer if necessary. The conversion of a float value to the double type preserves the numerical value exactly, while the opposite conversion rounds the value to fit in the smaller number of fraction bits, usually towards zero. (Since float also has a smaller exponent range, the conversion may yield an infinite value.) Some compilers will silently convert float values to double in some contexts, e.g. function parameters declared as float may actually be passed as double. In machines that comply with the IEEE floating-point standard, some rounding events are affected by the current rounding mode (which includes round-to-even, round-down, round-up, and round-to-zero), which may be retrieved and set using the fegetround()/fesetround() functions defined in <fenv.h>.

"Hello, world" example

The "hello, world" example which appeared in the first edition of K&R has become the model for an introductory program in most programming textbooks, regardless of programming language. The program prints "hello, world" to the standard output, which is usually a terminal or screen display. The original version was:

main()
{
    printf("hello, world\n");
}

A standard-conforming "hello, world" program is:[12]

#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}

The first line of the program contains a preprocessing directive, indicated by #include. This causes the preprocessor—the first tool to examine source code as it is compiled—to substitute the line with the entire text of the stdio.h standard header, which contains declarations for standard input and output functions such as printf. The angle brackets surrounding stdio.h indicate that stdio.h is located using a search strategy that prefers standard headers to other headers having the same name. Double quotes may also be used to include local or project-specific header files.

The next line indicates that a function named main is being defined. The main function serves a special purpose in C programs: The run-time environment calls the main function to begin program execution. The type specifier int indicates that the return value, the value that is returned to the invoker (in this case the run-time environment) as a result of evaluating the main function, is an integer. The keyword void as a parameter list indicates that the main function takes no arguments.[13] The opening curly brace indicates the beginning of the definition of the main function.

58


The next line calls (diverts execution to) a function named printf, which was declared in stdio.h and is supplied from a system library. In this call, the printf function is passed (provided with) a single argument, the address of the first character in the string literal "hello, world\n". The string literal is an unnamed array with elements of type char, set up automatically by the compiler with a final 0-valued character to mark the end of the array (printf needs to know this). The \n is an escape sequence that C translates to a newline character, which on output signifies the end of the current line. The return value of the printf function is of type int, but it is silently discarded since it is not used. (A more careful program might test the return value to determine whether or not the printf function succeeded.) The semicolon ; terminates the statement.

The return statement terminates the execution of the main function and causes it to return the integer value 0, which is interpreted by the run-time system as an exit code indicating successful execution. The closing curly brace indicates the end of the code for the main function.

Data structures

C has a static, weak type system that shares some similarities with that of other ALGOL descendants such as Pascal. There are built-in types for integers of various sizes, both signed and unsigned, floating-point numbers, characters, and enumerated types (enum). C99 added a boolean data type. There are also derived types including arrays, pointers, records (struct), and untagged unions (union). C is often used in low-level systems programming where escapes from the type system may be necessary. The compiler attempts to ensure type correctness of most expressions, but the programmer can override the checks in various ways, either by using a type cast to explicitly convert a value from one type to another, or by using pointers or unions to reinterpret the underlying bits of a value in some other way.

Pointers

C supports the use of pointers, a very simple type of reference that records, in effect, the address or location of an object or function in memory. Pointers can be dereferenced to access data stored at the address pointed to, or to invoke a pointed-to function. Pointers can be manipulated using assignment and also pointer arithmetic. The run-time representation of a pointer value is typically a raw memory address (perhaps augmented by an offset-within-word field), but since a pointer's type includes the type of the thing pointed to, expressions including pointers can be type-checked at compile time. Pointer arithmetic is automatically scaled by the size of the pointed-to data type. (See Array-pointer interchangeability below.) Pointers are used for many different purposes in C. Text strings are commonly manipulated using pointers into arrays of characters. Dynamic memory allocation, which is described below, is performed using pointers. Many data types, such as trees, are commonly implemented as dynamically allocated struct objects linked together using pointers. Pointers to functions are useful for callbacks from event handlers.
A null pointer is a pointer value that points to no valid location (it is often represented by address zero). Dereferencing a null pointer is therefore meaningless, typically resulting in a run-time error. Null pointers are useful for indicating special cases such as no next pointer in the final node of a linked list, or as an error indication from functions returning pointers. Void pointers (void *) point to objects of unknown type, and can therefore be used as "generic" data pointers. Since the size and type of the pointed-to object is not known, void pointers cannot be dereferenced, nor is pointer arithmetic on them allowed, although they can easily be (and in many contexts implicitly are) converted to and from any other object pointer type. Careless use of pointers is potentially dangerous. Because they are typically unchecked, a pointer variable can be made to point to any arbitrary location, which can cause undesirable effects. Although properly-used pointers point to safe places, they can be made to point to unsafe places by using invalid pointer arithmetic; the objects they point to may be deallocated and reused (dangling pointers); they may be used without having been initialized (wild

59


pointers); or they may be directly assigned an unsafe value using a cast, union, or through another corrupt pointer. In general, C is permissive in allowing manipulation of and conversion between pointer types, although compilers typically provide options for various levels of checking. Some other programming languages address these problems by using more restrictive reference types.

Arrays

Array types in C are traditionally of a fixed, static size specified at compile time. (The more recent C99 standard also allows a form of variable-length arrays.) However, it is also possible to allocate a block of memory (of arbitrary size) at run-time, using the standard library's malloc function, and treat it as an array. C's unification of arrays and pointers (see below) means that true arrays and these dynamically allocated, simulated arrays are virtually interchangeable. Since arrays are always accessed (in effect) via pointers, array accesses are typically not checked against the underlying array size, although the compiler may provide bounds checking as an option. Array bounds violations are therefore possible and rather common in carelessly written code, and can lead to various repercussions, including illegal memory accesses, corruption of data, buffer overruns, and run-time exceptions. C does not have a special provision for declaring multidimensional arrays, but rather relies on recursion within the type system to declare arrays of arrays, which effectively accomplishes the same thing. The index values of the resulting "multidimensional array" can be thought of as increasing in row-major order.

Although C supports static arrays, it is not required that array indices be validated (bounds checking). For example, one can try to write to the sixth element of an array with five elements, generally yielding undesirable results. This type of bug, called a buffer overflow or buffer overrun, is notorious for causing a number of security problems.
Since bounds-checking elimination technology was largely nonexistent when C was defined, bounds checking came with a severe performance penalty, particularly in numerical computation. A few years earlier, some Fortran compilers had a switch to toggle bounds checking on or off; however, this would have been much less useful for C, where array arguments are passed as simple pointers.

Multidimensional arrays are commonly used in numerical algorithms (mainly from applied linear algebra) to store matrices. The structure of the C array is well suited to this particular task. However, since arrays are passed merely as pointers, the bounds of the array must be known fixed values or else explicitly passed to any subroutine that requires them, and dynamically sized arrays of arrays cannot be accessed using double indexing. (A workaround for this is to allocate the array with an additional "row vector" of pointers to the columns.) C99 introduced "variable-length arrays" which address some, but not all, of the issues with ordinary C arrays.

Array-pointer interchangeability

A distinctive (but potentially confusing) feature of C is its treatment of arrays and pointers. The array-subscript notation x[i] can also be used when x is a pointer; the interpretation (using pointer arithmetic) is to access the (i + 1)th object of several adjacent data objects pointed to by x, counting the object that x points to (which is x[0]) as the first element of the array. Formally, x[i] is equivalent to *(x + i). Since the type of the pointer involved is known to the compiler at compile time, the address that x + i points to is not the address pointed to by x incremented by i bytes, but rather incremented by i multiplied by the size of an element that x points to. The size of these elements can be determined with the operator sizeof by applying it to any dereferenced element of x, as in n = sizeof *x or n = sizeof x[0].
Furthermore, in most expression contexts (a notable exception is sizeof x), the name of an array is automatically converted to a pointer to the array's first element; this implies that an array is never copied as a whole when named as an argument to a function, but rather only the address of its first element is passed. Therefore, although function calls in C use pass-by-value semantics, arrays are in effect passed by reference. The number of elements in a declared array x can be determined as sizeof x / sizeof x[0].

60


An interesting demonstration of the interchangeability of pointers and arrays is shown below. The four assignments are equivalent and each is valid C code.

/* x is an array and i is an integer */
x[i] = 1;        /* equivalent to *(x + i) */
*(x + i) = 1;
*(i + x) = 1;
i[x] = 1;        /* uncommon usage, but correct: i[x] is equivalent to *(i + x) */

Note that the last line contains the uncommon, but semantically correct, expression i[x], which appears to interchange the index variable i with the array variable x. This last line might be found in obfuscated C code, but its use is rare among C programmers. Despite this apparent equivalence between array and pointer variables, there is still a distinction to be made between them. Even though the name of an array is, in most expression contexts, converted into a pointer (to its first element), this pointer does not itself occupy any storage, unlike a pointer variable. Consequently, what an array "points to" cannot be changed, and it is impossible to assign a value to an array variable. (Array values may be copied, however, e.g., by using the memcpy function.)

Memory management

One of the most important functions of a programming language is to provide facilities for managing memory and the objects that are stored in memory. C provides three distinct ways to allocate memory for objects:

• Static memory allocation: space for the object is provided in the binary at compile-time; these objects have an extent (or lifetime) as long as the binary which contains them is loaded into memory
• Automatic memory allocation: temporary objects can be stored on the stack, and this space is automatically freed and reusable after the block in which they are declared is exited
• Dynamic memory allocation: blocks of memory of arbitrary size can be requested at run-time using library functions such as malloc from a region of memory called the heap; these blocks persist until subsequently freed for reuse by calling the library function free

These three approaches are appropriate in different situations and have various tradeoffs. For example, static memory allocation has no allocation overhead, automatic allocation may involve a small amount of overhead, and dynamic memory allocation can potentially have a great deal of overhead for both allocation and deallocation. On the other hand, stack space is typically much more limited and transient than either static memory or heap space, and dynamic memory allocation allows allocation of objects whose size is known only at run-time. Most C programs make extensive use of all three.

Where possible, automatic or static allocation is usually preferred because the storage is managed by the compiler, freeing the programmer of the potentially error-prone chore of manually allocating and releasing storage. However, many data structures can grow in size at runtime, and since static allocations (and automatic allocations in C89 and C90) must have a fixed size at compile-time, there are many situations in which dynamic allocation must be used.
Prior to the C99 standard, variable-sized arrays were a common example of this (see malloc for an example of dynamically allocated arrays). Automatically and dynamically allocated objects are only initialized if an initial value is explicitly specified; otherwise they initially have indeterminate values (typically, whatever bit pattern happens to be present in the storage, which might not even represent a valid value for that type). If the program attempts to access an uninitialized value, the results are undefined. Many modern compilers try to detect and warn about this problem, but both false positives and false negatives occur.

61


Another issue is that heap memory allocation has to be manually synchronized with its actual usage in any program in order for it to be reused as much as possible. For example, if the only pointer to a heap memory allocation goes out of scope or has its value overwritten before free() has been called, then that memory cannot be recovered for later reuse and is essentially lost to the program, a phenomenon known as a memory leak. Conversely, it is possible to release memory too soon and continue to access it; however, since the allocation system can re-allocate or itself use the freed memory, unpredictable behavior is likely to occur when the multiple users of that memory corrupt each other's data. Typically, the symptoms will appear in a portion of the program far removed from the actual error. Such issues are ameliorated in languages with automatic garbage collection or RAII.

Libraries

The C programming language uses libraries as its primary method of extension. In C, a library is a set of functions contained within a single "archive" file. Each library typically has a header file, which contains the prototypes of the functions contained within the library that may be used by a program, and declarations of special data types and macro symbols used with these functions. In order for a program to use a library, it must include the library's header file, and the library must be linked with the program, which in many cases requires compiler flags (e.g., -lm, shorthand for "math library").

The most common C library is the C standard library, which is specified by the ISO and ANSI C standards and comes with every C implementation ("freestanding" [embedded] C implementations may provide only a subset of the standard library). This library supports stream input and output, memory allocation, mathematics, character strings, and time values. Another common set of C library functions are those used by applications specifically targeted for Unix and Unix-like systems, especially functions which provide an interface to the kernel. These functions are detailed in various standards such as POSIX and the Single UNIX Specification. Since many programs have been written in C, there are a wide variety of other libraries available. Libraries are often written in C because C compilers generate efficient object code; programmers then create interfaces to the library so that the routines can be used from higher-level languages like Java, Perl, and Python.

Language tools Tools have been created to help C programmers avoid some of the problems inherent in the language, such as statements with undefined behavior or statements that are not a good practice because they are more likely to result in unintended behavior or run-time errors. Automated source code checking and auditing are beneficial in any language, and for C many such tools exist, such as Lint. A common practice is to use Lint to detect questionable code when a program is first written. Once a program passes Lint, it is then compiled using the C compiler. Also, many compilers can optionally warn about syntactically valid constructs that are likely to actually be errors. MISRA C is a proprietary set of guidelines to avoid such questionable code, developed for embedded systems. There are also compilers, libraries and operating system level mechanisms for performing array bounds checking, buffer overflow detection, serialization and automatic garbage collection, that are not a standard part of C. Tools such as Purify, Valgrind, and linking with libraries containing special versions of the memory allocation functions can help uncover runtime memory errors.

Related languages C has directly or indirectly influenced many later languages such as Java, Perl, PHP, JavaScript, LPC, C# and Unix's C Shell. The most pervasive influence has been syntactical: all of the languages mentioned combine the statement and (more or less recognizably) expression syntax of C with type systems, data models and/or large-scale program structures that differ from those of C, sometimes radically. When object-oriented languages became popular, C++ and Objective-C were two different extensions of C that provided object-oriented capabilities. Both languages were originally implemented as source-to-source compilers: source code was translated into C, and then compiled with a C compiler. Bjarne Stroustrup devised the C++ programming language as one approach to providing object-oriented functionality with C-like syntax. C++ adds greater typing strength, scoping and other tools useful in object-oriented programming and permits generic programming via templates. Nearly a superset of C, C++ now supports most of C, with a few exceptions (see Compatibility of C and C++ for an exhaustive list of differences). Unlike C++, which maintains nearly complete backwards compatibility with C, the D language makes a clean break with C while maintaining the same general syntax. It abandons a number of features of C which Walter Bright (the designer of D) considered undesirable, including the C preprocessor and trigraphs. Some, but not all, of D's extensions to C overlap with those of C++. Objective-C was originally a very "thin" layer on top of, and remains a strict superset of, C that permits object-oriented programming using a hybrid dynamic/static typing paradigm. Objective-C derives its syntax from both C and Smalltalk: syntax that involves preprocessing, expressions, function declarations and function calls is inherited from C, while the syntax for object-oriented features was originally taken from Smalltalk.
Limbo is a language developed by the same team at Bell Labs that was responsible for C and Unix, and while retaining some of the syntax and the general style, introduced garbage collection, CSP based concurrency and other major innovations. Python has a different sort of C heritage. While the syntax and semantics of Python are radically different from C, the most widely used Python implementation, CPython, is an open source C program. This allows C users to extend Python with C, or embed Python into C programs. This close relationship is one of the key factors leading to Python's success as a general-use dynamic language. Perl is another example of a popular programming language rooted in C. However, unlike Python, Perl's syntax does closely follow C syntax. The standard Perl implementation is written in C and supports extensions written in C.

See also
• Augmented assignment
• Comparison of Pascal and C
• Comparison of programming languages
• International Obfuscated C Code Contest
• List of articles with C programs
• List of compilers

References • Brian Kernighan, Dennis Ritchie: The C Programming Language. Also known as K&R – The original book on C. • 1st, Prentice Hall 1978; ISBN 0-13-110163-3. Pre-ANSI C. • 2nd, Prentice Hall 1988; ISBN 0-13-110362-8. ANSI C. • ISO/IEC 9899 [14]. Official C99 documents, including technical corrigenda and a rationale. As of 2007 the latest version of the standard is ISO/IEC 9899:TC3 [15]PDF (3.61 MB). • Samuel P. Harbison, Guy L. Steele: C: A Reference Manual. This book is excellent as a definitive reference manual, and for those working on C compilers. The book contains a BNF grammar for C. • 5th, Prentice Hall 2002; ISBN 0-13-089592-X. • Derek M. Jones: The New C Standard: A Cultural and Economic Commentary, Addison-Wesley, ISBN 0-201-70917-1, online material [16] • Robert Sedgewick: Algorithms in C, Addison-Wesley, ISBN 0-201-31452-5 (Part 1–4) and ISBN 0-201-31663-3 (Part 5) • William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery: Numerical Recipes in C (The Art of Scientific Computing), ISBN 0-521-43108-5

External links
• The current draft Standard (C99 with Technical corrigenda TC1, TC2, and TC3 included) [15] (PDF, 3.61 MB)
• Draft ANSI C Standard (ANSI X3J11/88-090) [17] (May 13, 1988), Third Public Review [18]
• Draft ANSI C Rationale (ANSI X3J11/88-151) [19] (Nov 18, 1988)
• ISO C Working Group [20] (official Web site)
• The Development of the C Language [21] by Dennis M. Ritchie
• comp.lang.c Frequently Asked Questions [22]
• The C Book [23] by M. Banahan, D. Brady and M. Doran (Addison-Wesley, 2nd ed.) – book for beginning and intermediate students, now out of print and free to download.
• The New C Standard: An economic and cultural commentary [24] (PDF, 10.0 MB) – an unpublished book offering a "detailed analysis of the International Standard for the C language."
• A New C Compiler [25], a paper by Ken Thompson.

References
[1] Dennis M. Ritchie (January 1993). "The Development of the C Language" (http://cm.bell-labs.com/cm/cs/who/dmr/chist.html). Retrieved Jan 1, 2008. "The scheme of type composition adopted by C owes considerable debt to Algol 68, although it did not, perhaps, emerge in a form that Algol's adherents would approve of."
[2] Stewart, Bill (January 7, 2000). "History of the C Programming Language" (http://www.livinginternet.com/i/iw_unix_c.htm). Living Internet. Retrieved 2006-10-31.
[3] Patricia K. Lawlis, c.j. kemp systems, inc. (1997). "Guidelines for Choosing a Computer Language: Support for the Visionary Organization" (http://archive.adaic.com/docs/reports/lawlis/k.htm). Ada Information Clearinghouse. Retrieved 2006-07-18.
[4] "Programming Language Popularity" (http://www.langpop.com/). 2009. Retrieved 2009-01-16.
[5] "TIOBE Programming Community Index" (http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html). 2009. Retrieved 2009-05-06.
[6] http://stackoverflow.com/questions/1540886/what-did-the-c-operators-and-do
[7] C99 added a _Bool type, but it was not retrofitted into the language's existing Boolean contexts. One can simulate a Boolean datatype, e.g. with enum { false, true } bool;, but this does not provide all of the features of a separate Boolean datatype.
[8] "Jargon File entry for nasal demons" (http://www.catb.org/jargon/html/N/nasal-demons.html).
[9] Dr. Dobb's Sourcebook. U.S.A.: Miller Freeman, Inc. Nov/Dec 1995 issue.
[10] "Using C for CGI Programming" (http://www.linuxjournal.com/article/6863). linuxjournal.com. 2005-03-01. Retrieved 2010-01-04.
[11] "10 Common Programming Mistakes in C" (http://www.cs.ucr.edu/~nxiao/cs10/errors.htm). Cs.ucr.edu. Retrieved 2009-06-26.
[12] The original example code will compile on most modern compilers that are not in strict standard compliance mode, but it does not fully conform to the requirements of either C89 or C99. In fact, C99 requires that a diagnostic message be produced.

[13] The main function actually has two arguments, int argc and char *argv[], respectively, which can be used to handle command line arguments. The C standard requires that both forms of main be supported, which is special treatment not afforded any other function.
[14] http://www.open-std.org/JTC1/SC22/WG14/www/standards
[15] http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
[16] http://www.knosof.co.uk/cbook/cbook.html
[17] http://flash-gordon.me.uk/ansi.c.txt
[18] http://groups.google.com/group/comp.lang.c/msg/20b174b18cdd919d?hl=en
[19] http://www.scribd.com/doc/16306895/Draft-ANSI-C-Rationale
[20] http://www.open-std.org/jtc1/sc22/wg14/
[21] http://cm.bell-labs.com/cm/cs/who/dmr/chist.html
[22] http://www.c-faq.com/
[23] http://publications.gbdirect.co.uk/c_book/
[24] http://www.coding-guidelines.com/cbook/cbook1_2.pdf
[25] http://doc.cat-v.org/bell_labs/new_c_compilers/

C++


The C++ Programming Language, written by its architect, is the seminal book on the language.

Usual file extensions: .hh, .hpp, .hxx, .h++, .cc, .cpp, .cxx, .c++
Paradigm: multi-paradigm: procedural, object-oriented, generic
Appeared in: 1983
Designed by: Bjarne Stroustrup
Developer: Bjarne Stroustrup, Bell Labs, ISO/IEC JTC1/SC22/WG21
Preview release: C++0x
Typing discipline: static, unsafe, nominative
Major implementations: Borland C++ Builder, GCC, Intel C++ Compiler, Microsoft Visual C++, Sun Studio, Turbo C++, Comeau C/C++
Dialects: ISO/IEC C++ 1998, ISO/IEC C++ 2003
Influenced by: C, Simula, Ada 83, ALGOL 68, CLU, ML
Influenced: Perl, Lua, Ada 95, Java, PHP, D, C99, C#, Aikido, Falcon, Dao
OS: cross-platform (multi-platform)[1]

C++ (pronounced "see plus plus") is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as a middle-level language, as it comprises a combination of both high-level and low-level language features.[2] It was developed by Bjarne Stroustrup starting in 1979 at Bell Labs as an enhancement to the C programming language and originally named "C with Classes". It was renamed C++ in 1983.[3] As one of the most popular programming languages ever created,[4] [5] C++ is widely used in the software industry. Some of its application domains include systems software, application software, device drivers, embedded software, high-performance server and client applications, and entertainment software such as video games. Several groups provide both free and proprietary C++ compiler software, including the GNU Project, Microsoft, Intel and Borland. C++ has greatly influenced many other popular programming languages, most notably Java. C++ is also used for hardware design, where design is initially described in C++, then analyzed, architecturally constrained, and scheduled to create a register transfer level hardware description language via high-level synthesis. The language began as enhancements to C, first adding classes, then virtual functions, operator overloading, multiple inheritance, templates, and exception handling among other features. After years of development, the C++ programming language standard was ratified in 1998 as ISO/IEC 14882:1998. That standard is still current, but is amended by the 2003 technical corrigendum, ISO/IEC 14882:2003. The next standard version (known informally as C++0x) is in development.



History Stroustrup began work on "C with Classes" in 1979. The idea of creating a new language originated from Stroustrup's experience in programming for his Ph.D. thesis. Stroustrup found that Simula had features that were very helpful for large software development, but the language was too slow for practical use, while BCPL was fast but too low-level to be suitable for large software development. When Stroustrup started working in AT&T Bell Labs, he had the problem of analyzing the UNIX kernel with respect to distributed computing. Remembering his Ph.D. experience, Stroustrup set out to enhance the C language with Simula-like features. C was chosen because it was general-purpose, fast, portable and widely used. Besides C and Simula, some other languages that inspired him were ALGOL 68, Ada, CLU and ML. At first, the class, derived class, strong type checking, inlining, and default argument features were added to C via Stroustrup's C++-to-C compiler, Cfront. The first commercial implementation of C++ was released in October 1985.[6] In 1983, the name of the language was changed from C with Classes to C++ (++ being the increment operator in C and C++). New features were added including virtual functions, function name and operator overloading, references, constants, user-controlled free-store memory control, improved type checking, and BCPL-style single-line comments with two forward slashes (//). In 1985, the first edition of The C++ Programming Language was released, providing an important reference to the language, since there was not yet an official standard. Release 2.0 of C++ came in 1989. New features included multiple inheritance, abstract classes, static member functions, const member functions, and protected members. In 1990, The Annotated C++ Reference Manual was published. This work became the basis for the future standard.
Late addition of features included templates, exceptions, namespaces, new casts, and a Boolean type. As the C++ language evolved, a standard library also evolved with it. The first addition to the C++ standard library was the stream I/O library which provided facilities to replace the traditional C functions such as printf and scanf. Later, among the most significant additions to the standard library, was the Standard Template Library. C++ continues to be used and is one of the preferred programming languages to develop professional applications. The popularity of the language continues to grow.[7]

Language standard In 1998, the C++ standards committee (the ISO/IEC JTC1/SC22/WG21 working group) standardized C++ and published the international standard ISO/IEC 14882:1998 (informally known as C++98[8] ). For some years after the official release of the standard, the committee processed defect reports, and published a corrected version of the C++ standard, ISO/IEC 14882:2003, in 2003. In 2005, a technical report, called the "Library Technical Report 1" (often known as TR1 for short), was released. While not an official part of the standard, it specified a number of extensions to the standard library, which were expected to be included in the next version of C++. Support for TR1 is growing in almost all currently maintained C++ compilers. The standard for the next version of the language (known informally as C++0x) is in development.



Etymology According to Stroustrup: "the name signifies the evolutionary nature of the changes from C".[9] During C++'s development period, the language had been referred to as "new C", then "C with Classes". The final name is credited to Rick Mascitti (mid-1983) and was first used in December 1983. When Mascitti was questioned informally in 1992 about the naming, he indicated that it was given in a tongue-in-cheek spirit. It stems from C's "++" operator (which increments the value of a variable) and a common naming convention of using "+" to indicate an enhanced computer program. There is no language called "C plus". ABCL/c+ was the name of an earlier, unrelated programming language.

Philosophy In The Design and Evolution of C++ (1994), Bjarne Stroustrup describes some rules that he used for the design of C++: • C++ is designed to be a statically typed, general-purpose language that is as efficient and portable as C • C++ is designed to directly and comprehensively support multiple programming styles (procedural programming, data abstraction, object-oriented programming, and generic programming) • C++ is designed to give the programmer choice, even if this makes it possible for the programmer to choose incorrectly • C++ is designed to be as compatible with C as possible, therefore providing a smooth transition from C • C++ avoids features that are platform specific or not general purpose • C++ does not incur overhead for features that are not used (the "zero-overhead principle") • C++ is designed to function without a sophisticated programming environment Stroustrup also mentions that C++ was always intended to make programming more fun and that many of the double meanings in the language are intentional. Inside the C++ Object Model (Lippman, 1996) describes how compilers may convert C++ program statements into an in-memory layout. Compiler authors are, however, free to implement the standard in their own manner.

Standard library The 1998 ANSI/ISO C++ standard consists of two parts: the core language and the C++ Standard Library; the latter includes most of the Standard Template Library (STL) and a slightly modified version of the C standard library. Many C++ libraries exist which are not part of the standard, and, using linkage specification, libraries can even be written in languages such as C, Fortran, Pascal, or BASIC. Which of these are supported is compiler dependent. The C++ standard library incorporates the C standard library with some small modifications to make it optimized with the C++ language. Another large part of the C++ library is based on the STL. This provides such useful tools as containers (for example vectors and lists), iterators to provide these containers with array-like access and algorithms to perform operations such as searching and sorting. Furthermore (multi)maps (associative arrays) and (multi)sets are provided, all of which export compatible interfaces. Therefore it is possible, using templates, to write generic algorithms that work with any container or on any sequence defined by iterators. As in C, the features of the library are accessed by using the #include directive to include a standard header. C++ provides 69 standard headers, of which 19 are deprecated. The STL was originally a third-party library from HP and later SGI, before its incorporation into the C++ standard. The standard does not refer to it as "STL", as it is merely a part of the standard library, but many people still use that term to distinguish it from the rest of the library (input/output streams, internationalization, diagnostics, the C library subset, etc.). Most C++ compilers provide an implementation of the C++ standard library, including the STL. Compiler-independent implementations of the STL, such as STLPort,[10] also exist. Other projects also produce
various custom implementations of the C++ standard library and the STL with various design goals.

Language features C++ inherits most of C's syntax and the C preprocessor. The following is Bjarne Stroustrup's version of the Hello world program, which uses the C++ standard library stream facility to write a message to standard output:[11] [12]

#include <iostream>

int main()
{
    std::cout << "Hello, world!\n";
}

The C++ standard requires the main function to be defined with int as its return type, but it need not return a value with an explicit return statement, as an implicit return 0 is executed when the end of main is reached.[13] Such an implicit return rule does not apply to any other value-returning function: if control reaches the closing } of such a function, undefined behavior results.[14]

Operators and operator overloading C++ provides more than 30 operators, covering basic arithmetic, bit manipulation, indirection, comparisons, logical operations and others. Almost all operators can be overloaded for user-defined types, with a few notable exceptions such as member access (. and .*). The rich set of overloadable operators is central to using C++ as a domain specific language. The overloadable operators are also an essential part of many advanced C++ programming techniques, such as smart pointers. Overloading an operator does not change the precedence of calculations involving the operator, nor does it change the number of operands that the operator uses (any operand may however be ignored by the operator, though it will be evaluated prior to execution).

Templates C++ templates enable generic programming. C++ supports both function and class templates. Templates may be parameterized by types, compile-time constants, and other templates. C++ templates are implemented by instantiation at compile-time. To instantiate a template, compilers substitute specific arguments for a template's parameters to generate a concrete function or class instance. Some substitutions are not possible; these are eliminated by an overload resolution policy described by the phrase "Substitution failure is not an error" (SFINAE). Templates are a powerful tool that can be used for generic programming, template metaprogramming, and code optimization, but this power implies a cost. Template use may increase code size, since each template instantiation produces a copy of the template code: one for each set of template arguments. This is in contrast to run-time generics seen in other languages (e.g. Java) where at compile-time the type is erased and a single template body is preserved. Templates are different from macros: while both of these compile-time language features enable conditional compilation, templates are not restricted to lexical substitution. Templates are aware of the semantics and type system of their companion language, as well as all compile-time type definitions, and can perform high-level operations including programmatic flow control based on evaluation of strictly type-checked parameters. Macros are capable of conditional control over compilation based on predetermined criteria, but cannot instantiate new types, recurse, or perform type evaluation and in effect are limited to pre-compilation text-substitution and text-inclusion/exclusion. In other words, macros can control compilation flow based on pre-defined symbols but cannot, unlike templates, independently instantiate new symbols. Templates are a tool for static polymorphism (see below) and generic programming.


In addition, templates are a compile-time mechanism in C++ which is Turing-complete, meaning that any computation expressible by a computer program can be computed, in some form, by a template metaprogram prior to runtime. In summary, a template is a compile-time parameterized function or class written without knowledge of the specific arguments used to instantiate it. After instantiation the resulting code is equivalent to code written specifically for the passed arguments. In this manner, templates provide a way to decouple generic, broadly-applicable aspects of functions and classes (encoded in templates) from specific aspects (encoded in template parameters) without sacrificing performance due to abstraction.

Objects C++ introduces object-oriented (OO) features to C. It offers classes, which provide the four features commonly present in OO (and some non-OO) languages: abstraction, encapsulation, inheritance, and polymorphism. Objects are instances of classes created at runtime. The class can be thought of as a template from which many different individual objects may be generated as a program runs. Encapsulation Encapsulation is the hiding of information in order to ensure that data structures and operators are used as intended and to make the usage model more obvious to the developer. C++ provides the ability to define classes and functions as its primary encapsulation mechanisms. Within a class, members can be declared as either public, protected, or private in order to explicitly enforce encapsulation. A public member of the class is accessible to any function. A private member is accessible only to functions that are members of that class and to functions and classes explicitly granted access permission by the class ("friends"). A protected member is accessible to members of classes that inherit from the class in addition to the class itself and any friends. The OO principle is that all of the functions (and only the functions) that access the internal representation of a type should be encapsulated within the type definition. C++ supports this (via member functions and friend functions), but does not enforce it: the programmer can declare parts or all of the representation of a type to be public, and is allowed to make public entities that are not part of the representation of the type. Because of this, C++ supports not just OO programming, but other weaker decomposition paradigms, like modular programming. It is generally considered good practice to make all data private or protected, and to make public only those functions that are part of a minimal interface for users of the class. 
This hides all the details of data implementation, allowing the designer to later fundamentally change the implementation without changing the interface in any way.[15] [16] Inheritance Inheritance allows one data type to acquire properties of other data types. Inheritance from a base class may be declared as public, protected, or private. This access specifier determines whether unrelated and derived classes can access the inherited public and protected members of the base class. Only public inheritance corresponds to what is usually meant by "inheritance". The other two forms are much less frequently used. If the access specifier is omitted, a "class" inherits privately, while a "struct" inherits publicly. Base classes may be declared as virtual; this is called virtual inheritance. Virtual inheritance ensures that only one instance of a base class exists in the inheritance graph, avoiding some of the ambiguity problems of multiple inheritance. Multiple inheritance is a C++ feature sometimes considered controversial. Multiple inheritance allows a class to be derived from more than one base class; this can result in a complicated graph of inheritance relationships. For example, a "Flying Cat" class can inherit from both "Cat" and "Flying Mammal". Some other languages, such as Java or C#, accomplish something similar (although more limited) by allowing inheritance of multiple interfaces while restricting the number of base classes to one (interfaces, unlike classes, provide only declarations of member functions, no implementation or member data). Interfaces and abstract classes in Java and C# can be defined in C++
as a class containing only pure virtual functions, often known as an abstract base class or "ABC". Programmers preferring the Java/C# model of inheritance can choose to inherit only one non-abstract class, although in this case the declared member functions of the abstract base classes must be explicitly defined and cannot be inherited.

Polymorphism Polymorphism enables one common interface for many implementations, and for objects to act differently under different circumstances. C++ supports several kinds of static (compile-time) and dynamic (run-time) polymorphisms. Compile-time polymorphism does not allow for certain run-time decisions, while run-time polymorphism typically incurs a performance penalty. Static polymorphism Function overloading allows programs to declare multiple functions having the same name (but with different arguments). The functions are distinguished by the number and/or types of their formal parameters. Thus, the same function name can refer to different functions depending on the context in which it is used. The type returned by the function is not used to distinguish overloaded functions. When declaring a function, a programmer can specify default arguments for one or more parameters. Doing so allows the parameters with defaults to optionally be omitted when the function is called, in which case the default arguments will be used. When a function is called with fewer arguments than there are declared parameters, explicit arguments are matched to parameters in left-to-right order, with any unmatched parameters at the end of the parameter list being assigned their default arguments. In many cases, specifying default arguments in a single function declaration is preferable to providing overloaded function definitions with different numbers of parameters. Templates in C++ provide a sophisticated mechanism for writing generic, polymorphic code. In particular, through the Curiously Recurring Template Pattern it's possible to implement a form of static polymorphism that closely mimics the syntax for overriding virtual functions. Since C++ templates are type-aware and Turing-complete they can also be used to let the compiler resolve recursive conditionals and generate substantial programs through template metaprogramming. 
Dynamic polymorphism Inheritance Variable pointers (and references) to a base class type in C++ can refer to objects of any derived classes of that type in addition to objects exactly matching the variable type. This allows arrays and other kinds of containers to hold pointers to objects of differing types. Because assignment of values to variables usually occurs at run-time, this is necessarily a run-time phenomenon. C++ also provides a dynamic_cast operator, which allows the program to safely attempt conversion of an object into an object of a more specific object type (as opposed to conversion to a more general type, which is always allowed). This feature relies on run-time type information (RTTI). Objects known to be of a certain specific type can also be cast to that type with static_cast, a purely compile-time construct which is faster and does not require RTTI. Virtual member functions Ordinarily when a function in a derived class overrides a function in a base class, the function to call is determined by the type of the object. A given function is overridden when there exists no difference, in the number or type of parameters, between two or more definitions of that function. Hence, at compile time it may not be possible to determine the type of the object and therefore the correct function to call, given only a base class pointer; the decision is therefore put off until runtime. This is called dynamic dispatch. Virtual member functions or methods[17] allow the most specific implementation of the function to be called, according to the actual run-time type of the
object. In C++, this is commonly done using virtual function tables. If the object type is known, this may be bypassed by prepending a fully qualified class name before the function call, but in general calls to virtual functions are resolved at run time. In addition to standard member functions, operator overloads and destructors can be virtual. A general rule of thumb is that if any functions in the class are virtual, the destructor should be as well. As the type of an object at its creation is known at compile time, constructors, and by extension copy constructors, cannot be virtual. Nonetheless a situation may arise where a copy of an object needs to be created when a pointer to a derived object is passed as a pointer to a base object. In such a case a common solution is to create a clone() (or similar) function and declare it as virtual. The clone() method creates and returns a copy of the derived class when called. A member function can also be made "pure virtual" by appending = 0 after the closing parenthesis and before the semicolon. A class with a pure virtual function is called an abstract data type; objects of such a class cannot be created, and the class can only be derived from. Any derived class inherits the virtual function as pure and must provide a non-pure definition of it (and all other pure virtual functions) before objects of the derived class can be created. A program that attempts to create an object of a class with a pure virtual member function or inherited pure virtual member function is ill-formed.

Parsing and processing C++ source code
It is relatively difficult to write a good C++ parser with classic parsing algorithms such as LALR(1).[18] This is partly because the C++ grammar is not LALR. Because of this, there are very few tools for analyzing or performing non-trivial transformations (e.g., refactoring) of existing code. One way to handle this difficulty is to choose a different syntax, such as Significantly Prettier and Easier C++ Syntax, which is LALR(1) parsable. More powerful parsers, such as GLR parsers, can be substantially simpler (though slower). Parsing (in the literal sense of producing a syntax tree) is not the most difficult problem in building a C++ processing tool. Such tools must also have the same understanding of the meaning of the identifiers in the program as a compiler might have. Practical systems for processing C++ must then not only parse the source text, but also be able to resolve, for each identifier, precisely which definition applies (e.g., they must correctly handle C++'s complex scoping rules) and what its type is, as well as the types of larger expressions. Finally, a practical C++ processing tool must be able to handle the variety of C++ dialects used in practice (such as that supported by the GNU Compiler Collection and that of Microsoft's Visual C++), implement appropriate analyzers and source code transformers, and regenerate source text. Combining advanced parsing algorithms such as GLR with symbol table construction and program transformation machinery can enable the construction of arbitrary C++ tools.

Compatibility
Producing a reasonably standards-compliant C++ compiler has proven to be a difficult task for compiler vendors in general. For many years, different C++ compilers implemented the C++ language to different levels of compliance with the standard, and their implementations varied widely in some areas such as partial template specialization. Recent releases of most popular C++ compilers support almost all of the C++ 1998 standard.[19] In order to give compiler vendors greater freedom, the C++ standards committee decided not to dictate the implementation of name mangling, exception handling, and other implementation-specific features. The downside of this decision is that object code produced by different compilers is expected to be incompatible. There are, however, third-party standards for particular machines or operating systems which attempt to standardize compilers on those platforms (for example, C++ ABI[20]); some compilers adopt a secondary standard for these items.



Exported templates
One particular point of contention is the export keyword, intended to allow template definitions to be separated from their declarations. The first compiler to implement export was Comeau C/C++, in early 2003 (five years after the release of the standard); in 2004, the beta compiler of Borland C++ Builder X was also released with export. Both of these compilers are based on the EDG C++ front end. Other compilers, such as GCC, do not support it at all. Many C++ books (such as Beginning ANSI C++ by Ivor Horton) provide example code using the keyword that will not compile in most compilers, without reference to this problem. Herb Sutter, former convener of the C++ standards committee, recommended that export be removed from future versions of the C++ standard.[21] During the March 2010 ISO C++ standards meeting, the C++ standards committee voted to remove exported templates entirely from C++0x, but to reserve the keyword for future use.[22]

With C
C++ is often considered to be a superset of C, but this is not strictly true.[23] Most C code can easily be made to compile correctly in C++, but there are a few differences that cause some valid C code to be invalid in C++, or to behave differently in C++. One commonly encountered difference is that C allows implicit conversion from void* to other pointer types, but C++ does not. Another common portability issue is that C++ defines many new keywords, such as new and class, that may be used as identifiers (e.g., variable names) in a C program. Some incompatibilities have been removed by the latest (C99) C standard, which now supports C++ features such as // comments and mixed declarations and code. On the other hand, C99 introduced a number of new features that C++ does not support, such as variable-length arrays, native complex-number types, designated initializers, and compound literals.[24] However, at least some of the new C99 features will likely be included in the next version of the C++ standard, C++0x. In order to intermix C and C++ code, any function declaration or definition that is to be called from or used in both C and C++ must be declared with C linkage by placing it within an extern "C" {/*...*/} block. Such a function may not rely on features depending on name mangling (i.e., function overloading).

Criticism
Critics of the language raise several points. First, since C++ includes C as a subset, it inherits many of the criticisms leveled at C. Because of its large feature set, it is criticized as being "bloated", over-complicated, and difficult to fully master.[25] Bjarne Stroustrup points out that resultant executables do not support these claims of bloat: "I have even seen the C++ version of the 'hello world' program smaller than the C version."[26] An Embedded C++ standard was proposed to deal with part of this, but it was criticized for leaving out useful parts of the language that incur no runtime penalty.[27] C++ is more complex than some other programming languages. The ISO standard of the C++ language is about 310 pages (excluding the definitions of what is in the library). For comparison, the C programming language standard, written eight years earlier, is about 160 pages, and C#'s ECMA language definition document is about 440 pages. Bjarne Stroustrup points out that "The programming world is far more complex today than it was 30 years ago, and modern programming languages reflect that."[28] Other criticism stems from what is missing from C++. For example, the current version of Standard C++ provides no language features to create multi-threaded software. These facilities are present in some other languages including Java, Ada, and C# (see also Lock). It is possible to use operating system calls or third-party libraries to do multi-threaded programming, but both approaches may create portability concerns. The new C++0x standard addresses this matter by extending the language with threading facilities.



C++ is also sometimes compared unfavorably with languages such as Smalltalk, Java, or Eiffel on the basis that it enables programmers to "mix and match" object-oriented programming, procedural programming, generic programming, functional programming, declarative programming, and others, rather than strictly enforcing a single style, although this feature may also be considered an advantage. A fraudulent article was written wherein Bjarne Stroustrup is supposedly interviewed for a 1998 issue of IEEE's 'Computer' magazine.[29] In this article, the interviewer expects to discuss the successes of C++ now that several years had passed after its introduction. Instead, Stroustrup proceeds to confess that his invention of C++ was intended to create the most complex and difficult language possible to weed out amateur programmers and raise the salaries of the few programmers who could master the language. The article contains various criticisms of C++'s complexity and poor usability, most false or exaggerated. In reality, Stroustrup wrote no such article, and due to the pervasiveness of the hoax, was compelled to publish an official denial on his website.[30] Some have also criticized C++ for not having garbage collection; Stroustrup discusses this on his website, stating that the memory management capabilities of C++ essentially perform the same function as garbage collection without an implicit garbage collector.[31]

See also
• The C++ Programming Language
• C++0x, the planned new standard for C++
• Comparison of integrated development environments for C/C++
• Comparison of programming languages
• List of C++ compilers
• List of C++ template libraries

Further reading
• Abrahams, David; Gurtovoy, Aleksey. C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond. Addison-Wesley. ISBN 0-321-22725-5.
• Alexandrescu, Andrei (2001). Modern C++ Design: Generic Programming and Design Patterns Applied. Addison-Wesley. ISBN 0-201-70431-5.
• Alexandrescu, Andrei; Sutter, Herb (2004). C++ Coding Standards: 101 Rules, Guidelines, and Best Practices. Addison-Wesley. ISBN 0-321-11358-6.
• Becker, Pete (2006). The C++ Standard Library Extensions: A Tutorial and Reference. Addison-Wesley. ISBN 0-321-41299-0.
• Brokken, Frank (2010). C++ Annotations [32]. University of Groningen. ISBN 90-367-0470-7.
• Coplien, James O. (1992, reprinted with corrections 1994). Advanced C++: Programming Styles and Idioms. ISBN 0-201-54855-0.
• Dewhurst, Stephen C. (2005). C++ Common Knowledge: Essential Intermediate Programming. Addison-Wesley. ISBN 0-321-32192-8.
• Information Technology Industry Council (15 October 2003). Programming languages — C++ (2nd ed.). Geneva: ISO/IEC. 14882:2003(E).
• Josuttis, Nicolai M. The C++ Standard Library. Addison-Wesley. ISBN 0-201-37926-0.
• Koenig, Andrew; Moo, Barbara E. (2000). Accelerated C++: Practical Programming by Example. Addison-Wesley. ISBN 0-201-70353-X.
• Lippman, Stanley B.; Lajoie, Josée; Moo, Barbara E. (2005). C++ Primer. Addison-Wesley. ISBN 0-201-72148-1.
• Lippman, Stanley B. (1996). Inside the C++ Object Model. Addison-Wesley. ISBN 0-201-83454-5.
• Stroustrup, Bjarne (2000). The C++ Programming Language (Special ed.). Addison-Wesley. ISBN 0-201-70073-5.



• Stroustrup, Bjarne (1994). The Design and Evolution of C++. Addison-Wesley. ISBN 0-201-54330-3.
• Stroustrup, Bjarne. Programming: Principles and Practice Using C++. Addison-Wesley. ISBN 0-321-54372-6.
• Sutter, Herb (2001). More Exceptional C++: 40 New Engineering Puzzles, Programming Problems, and Solutions. Addison-Wesley. ISBN 0-201-70434-X.
• Sutter, Herb (2004). Exceptional C++ Style. Addison-Wesley. ISBN 0-201-76042-8.
• Vandevoorde, David; Josuttis, Nicolai M. (2003). C++ Templates: The Complete Guide. Addison-Wesley. ISBN 0-201-73484-2.
• Meyers, Scott (2005). Effective C++ (3rd ed.). Addison-Wesley. ISBN 0-321-33487-6.

External links
• JTC1/SC22/WG21 [33] - The ISO/IEC C++ Standard Working Group
• n3092.pdf [34] - Final Committee Draft of "ISO/IEC IS 14882 - Programming Languages - C++" (26 March 2010)
• A paper by Stroustrup showing the timeline of C++ evolution (1979-1991) [35]
• Bjarne Stroustrup's C++ Style and Technique FAQ [36]
• Apache C++ Standard Library Documentation [37]
• C++ FAQ Lite by Marshall Cline [38]
• Computer World interview with Bjarne Stroustrup [39]
• CrazyEngineers.com interview with Bjarne Stroustrup [40]
• The State of the Language: An Interview with Bjarne Stroustrup (August 15, 2008) [41]
• Code practices for not breaking binary compatibility between releases of C++ libraries [42] (from KDE Techbase)

References
[1] Stroustrup, Bjarne (1997). "1". The C++ Programming Language (Third ed.). ISBN 0201889544. OCLC 59193992.
[2] Schildt, Herbert. C++: The Complete Reference (Third ed.). Osborne McGraw-Hill.
[3] ATT.com (http://www2.research.att.com/~bs/bs_faq.html#invention)
[4] "Programming Language Popularity" (http://www.langpop.com/). 2009. Retrieved 2009-01-16.
[5] "TIOBE Programming Community Index" (http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html). 2009. Retrieved 2009-05-06.
[6] "Bjarne Stroustrup's FAQ — When was C++ invented?" (http://public.research.att.com/~bs/bs_faq.html#invention). Retrieved 30 May 2006.
[7] "Trends on C++ Programmers, Developers & Engineers" (http://www.odesk.com/trends/c++). Retrieved 1 December 2008.
[8] Stroustrup, Bjarne. "C++ Glossary" (http://www.research.att.com/~bs/glossary.html). Retrieved 8 June 2007.
[9] "Bjarne Stroustrup's FAQ — Where did the name "C++" come from?" (http://public.research.att.com/~bs/bs_faq.html#name). Retrieved 16 January 2008.
[10] STLPort home page (http://www.stlport.org/), quote from "The C++ Standard Library" by Nicolai M. Josuttis, p. 138, ISBN 0-201-37926-0, Addison-Wesley, 1999: "An exemplary version of STL is the STLport, which is available for free for any platform"
[11] Stroustrup, Bjarne (2000). The C++ Programming Language (Special ed.). Addison-Wesley. p. 46. ISBN 0-201-70073-5.
[12] Open issues for The C++ Programming Language (3rd Edition) (http://www.research.att.com/~bs/3rd_issues.html) - This code is copied directly from Bjarne Stroustrup's errata page (p. 633). He addresses the use of '\n' rather than std::endl. Also see www.research.att.com (http://www.research.att.com/~bs/bs_faq2.html#void-main) for an explanation of the implicit return 0; in the main function. This implicit return is not available in other functions.
[13] ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §3.6.1 Main function [basic.start.main] para. 5
[14] ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §6.6.3 The return statement [stmt.return] para. 2
[15] Sutter, Herb; Alexandrescu, Andrei (2004). C++ Coding Standards: 101 Rules, Guidelines, and Best Practices. Addison-Wesley.
[16] Henricson, Mats; Nyquist, Erik (1997). Industrial Strength C++. Prentice Hall. ISBN 0-13-120965-5.
[17] Stroustrup, Bjarne (2000). The C++ Programming Language (Special ed.). Addison-Wesley. p. 310. ISBN 0-201-70073-5. "A virtual member function is sometimes called a method."
[18] Andrew Birkett. "Parsing C++ at nobugs.org" (http://www.nobugs.org/developer/parsingcpp/). Nobugs.org. Retrieved 3 July 2009.
[19] Herb Sutter (15 April 2003). "C++ Conformance Roundup" (http://www.ddj.com/dept/cpp/184401381). Dr. Dobb's Journal. Retrieved 30 May 2006.
[20] "C++ ABI" (http://www.codesourcery.com/cxx-abi/). Retrieved 30 May 2006.
[21] Why We Can’t Afford Export (http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1426.pdf) PDF (266 KB)
[22] Herb Sutter (13 March 2010). "Trip Report: March 2010 ISO C++ Standards Meeting" (http://herbsutter.com/2010/03/13/trip-report-march-2010-iso-c-standards-meeting/). Retrieved 8 April 2010.
[23] "Bjarne Stroustrup's FAQ - Is C a subset of C++?" (http://public.research.att.com/~bs/bs_faq.html#C-is-subset). Retrieved 18 January 2008.
[24] "C9X -- The New C Standard" (http://home.datacomm.ch/t_wolf/tw/c/c9x_changes.html). Retrieved 27 December 2008.
[25] Morris, Richard (July 2, 2009). "Niklaus Wirth: Geek of the Week" (http://www.simple-talk.com/opinion/geek-of-the-week/niklaus-wirth-geek-of-the-week/). Retrieved 8 August 2009. "C++ is a language that was designed to cater to everybody’s perceived needs. As a result, the language and even more so its implementations have become complex and bulky, difficult to understand, and likely to contain errors for ever."
[26] Why is the code generated for the "Hello world" program ten times larger for C++ than for C? (http://www.research.att.com/~bs/bs_faq.html#Hello-world)
[27] What do you think of EC++? (http://www.research.att.com/~bs/bs_faq.html#EC++)
[28] Why is C++ so BIG? (http://www.research.att.com/~bs/bs_faq.html#big)
[29] Unattributed. Previously unpublished interview with Bjarne Stroustroup, designer of C++ (http://flinflon.brandonu.ca/dueck/1997/62285/stroustroup.html).
[30] Stroustrup, Bjarne. Stroustrup FAQ: Did you really give an interview to IEEE? (http://www2.research.att.com/~bs/bs_faq.html#IEEE)
[31] http://www2.research.att.com/~bs/bs_faq.html
[32] http://www.icce.rug.nl/documents/cplusplus/
[33] http://www.open-std.org/jtc1/sc22/wg21/
[34] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3092.pdf
[35] http://www.research.att.com/~bs/hopl2.pdf
[36] http://www.research.att.com/~bs/bs_faq2.html
[37] http://incubator.apache.org/stdcxx/doc
[38] http://www.parashift.com/c%2B%2B-faq-lite/
[39] http://www.computerworld.com.au/index.php/id;408408016;pp;1;fp;16;fpid;1
[40] http://www.crazyengineers.com/small-talk/1-cover-story/24-small-talk-with-dr-bjarne-stroustrup
[41] http://www.devx.com/SpecialReports/Article/38813/0/page/1
[42] http://techbase.kde.org/Policies/Binary_Compatibility_Issues_With_C++



Perl

Usual file extensions: .pl
Paradigm: multi-paradigm: functional, imperative, object-oriented (class-based)
Appeared in: 1987
Designed by: Larry Wall
Developer: Larry Wall
Stable release: 5.12.0 (April 12, 2010)
Preview release: 5.13.0 (April 20, 2010)
Typing discipline: Dynamic
Influenced by: AWK, Smalltalk 80, Lisp, C, C++, sed, Unix shell, Pascal
Influenced: Python, PHP, Ruby, ECMAScript, Dao, Windows PowerShell, JavaScript, Falcon
Implemented in: C
OS: Cross-platform
License: GNU General Public License, Artistic License
Website: www.perl.org [1]

Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier.[2] [3] Since then, it has undergone many changes and revisions and become widely popular amongst programmers. Larry Wall continues to oversee development of the core language, and its upcoming version, Perl 6. Perl borrows features from other programming languages including C, shell scripting (sh), AWK, and sed.[4] The language provides powerful text processing facilities without the arbitrary data length limits of many contemporary Unix tools,[5] facilitating easy manipulation of text files. It is also used for graphics programming, system administration, network programming, applications that require database access and CGI programming on the Web. Perl is nicknamed "the Swiss Army chainsaw of programming languages" due to its flexibility and adaptability.[6]

History

Early Perl Versions
Larry Wall began work on Perl in 1987, while working as a programmer at Unisys,[7] and released version 1.0 to the comp.sources.misc newsgroup on December 18, 1987.[8] The language expanded rapidly over the next few years. Perl 2, released in 1988, featured a better regular expression engine. Perl 3, released in 1989, added support for binary data streams. Originally the only documentation for Perl was a single (increasingly lengthy) man page. In 1991, Programming Perl (known to many Perl programmers as the "Camel Book" because of its cover) was published and became the de facto reference for the language. At the same time, the Perl version number was bumped to 4, not to mark a major change in the language but to identify the version that was documented by the book.


Perl

78

Early Perl 5
Perl 4 went through a series of maintenance releases, culminating in Perl 4.036 in 1993. At that point, Wall abandoned Perl 4 to begin work on Perl 5. Initial design of Perl 5 continued into 1994. The perl5-porters mailing list was established in May 1994 to coordinate work on porting Perl 5 to different platforms. It remains the primary forum for development, maintenance, and porting of Perl 5.[9] Perl 5.000 was released on October 17, 1994.[10] It was a nearly complete rewrite of the interpreter, and it added many new features to the language, including objects, references, lexical (my) variables, and modules. Importantly, modules provided a mechanism for extending the language without modifying the interpreter. This allowed the core interpreter to stabilize, even as it enabled ordinary Perl programmers to add new language features. Perl 5 has been in active development since then. Perl 5.001 was released on March 13, 1995. Perl 5.002 was released on February 29, 1996 with the new prototypes feature. This allowed module authors to make subroutines that behaved like Perl builtins. Perl 5.003 was released June 25, 1996, as a security release. One of the most important events in Perl 5 history took place outside of the language proper and was a consequence of its module support. On October 26, 1995, the Comprehensive Perl Archive Network (CPAN) was established as a repository for Perl modules and Perl itself. At the time of writing, it carries almost 17,000 modules by more than 7,000 authors. CPAN is widely regarded as one of the greatest strengths of Perl in practice. Perl 5.004 was released on May 15, 1997, and included among other things the UNIVERSAL package, giving Perl a base object from which all classes were automatically derived, and the ability to require versions of modules. In addition, Perl now supported running under Microsoft Windows and several other operating systems.[11] Perl 5.005 was released on July 22, 1998.
This release included several enhancements to the Regex engine, new hooks into the backend through the B::* modules, the qr// regex quote operator, a large selection of other new core modules, and added support for several more operating systems, including BeOS.[12]

2000–Present
Perl 5.6 was released on March 22, 2000. Major changes included 64-bit support, Unicode string representation, large file support (e.g., files > 2 GiB) and the 'our' keyword.[13] [14] When developing Perl 5.6, the decision was made to switch the versioning scheme to one more similar to other open source projects; after 5.005_63, the next version became 5.5.640, with plans for development versions to have odd numbers and stable versions to have even numbers. In 2000, Larry Wall put forth a call for suggestions for a new version of Perl from the community. The process resulted in 361 RFC (Request for Change) documents which were to be used in guiding development of Perl 6. In 2001,[15] work began on the apocalypses for Perl 6, a series of documents meant to summarize the change requests and present the design of the next generation of Perl. They were presented as a digest of the RFCs, rather than a formal document. At this point, Perl 6 existed only as a description of a language. Perl 5.8 was first released on July 18, 2002, and has had nearly yearly updates since then. The latest version of Perl 5.8 is 5.8.9, released December 14, 2008. Perl 5.8 improved Unicode support, added a new I/O implementation, added a new thread implementation, improved numeric accuracy, and added several new modules.[16] In 2004, work began on the Synopses – originally documents that summarized the Apocalypses, but which became the specification for the Perl 6 language. In February 2005, Audrey Tang began work on Pugs, a Perl 6 interpreter written in Haskell.[17] This was the first real concerted effort towards making Perl 6 a reality. This effort stalled in 2006. On December 18, 2007, the 20th anniversary of Perl 1.0, Perl 5.10.0 was released. Perl 5.10.0 included notable new features, which brought it closer to Perl 6.
Some of these new features were a new switch statement (called "given"/"when"), regular expression updates, and the smart match operator, "~~".[18] [19]



Around this same time, development began in earnest on another implementation of Perl 6 known as Rakudo Perl, developed in tandem with the Parrot virtual machine. As of November 2009, Rakudo Perl has had regular monthly releases and is now the most complete implementation of Perl 6. A major change in the development process of Perl 5 occurred with Perl 5.11; the development community has switched to a monthly release cycle, with planned release dates three months ahead. On April 12, 2010, Perl 5.12.0 was released. Notable core enhancements include new package NAME VERSION syntax, the Yada Yada operator (intended to mark placeholder code that is not yet implemented), implicit strictures, full Y2038 compliance, regexp conversion overloading, DTrace support, and Unicode 5.2.[20] The latest development release of Perl 5 is 5.13.0, released by Léon Brocard on April 20, 2010.[21]

Name
Perl was originally named "Pearl," after the Parable of the Pearl from the Gospel of Matthew. Larry Wall wanted to give the language a short name with positive connotations; he claims that he considered (and rejected) every three- and four-letter word in the dictionary. He also considered naming it after his wife Gloria. Wall discovered the existing PEARL programming language before Perl's official release and changed the spelling of the name. When referring to the language, the name is normally capitalized (Perl) as a proper noun, as you would a spoken language (e.g. English or French). When referring to the interpreter program itself, the name is often uncapitalised (perl) because most Unix-like file systems are case-sensitive. Before the release of the first edition of Programming Perl, it was common to refer to the language as perl; Randal L. Schwartz, however, capitalised the language's name in the book to make it stand out better when typeset. This case distinction was subsequently documented as canonical.[22] There is some contention about the all-caps spelling "PERL," which the documentation declares incorrect[22] and which some core community members consider a sign of outsiders.[23] Although the name is occasionally taken as an acronym for Practical Extraction and Report Language (which appears at the top of the documentation[24] and in some printed literature[25]), this expansion actually came after the name; several others have been suggested as equally canonical, including Wall's own humorous Pathologically Eclectic Rubbish Lister.[26] Indeed, Wall claims that the name was intended to inspire many different expansions.[27]

The camel symbol
Programming Perl, published by O'Reilly Media, features a picture of a camel on the cover and is commonly referred to as The Camel Book.[7] This image of a camel has become a general symbol of Perl. It is also a hacker emblem, appearing on some T-shirts and other clothing items. O'Reilly owns the image as a trademark but claims to use its legal rights only to protect the "integrity and impact of that symbol".[28] O'Reilly allows non-commercial use of the symbol and provides Programming Republic of Perl logos and Powered by Perl buttons.[29] However, the camel was never intended to be an official Perl symbol; if any image is, it is the onion.[30]



Overview
Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, games, bioinformatics, and GUI development. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).[31] Its major features include support for multiple programming paradigms (procedural, object-oriented, and functional styles), reference counting memory management (without a cycle-detecting garbage collector), built-in support for text processing, and a large collection of third-party modules. According to Larry Wall, Perl has two slogans. The first is "There's more than one way to do it," commonly known as TMTOWTDI. The second slogan is "Easy things should be easy and hard things should be possible."

Features
The overall structure of Perl derives broadly from C. Perl is procedural in nature, with variables, expressions, assignment statements, brace-delimited blocks, control structures, and subroutines. Perl also takes features from shell programming. All variables are marked with leading sigils, which unambiguously identify the data type (for example, scalar, array, hash) of the variable in context. Importantly, sigils allow variables to be interpolated directly into strings. Perl has many built-in functions that provide tools often used in shell programming (although many of these tools are implemented by programs external to the shell) such as sorting, and calling on system facilities. Perl takes lists from Lisp, associative arrays (hashes) from AWK, and regular expressions from sed. These simplify and facilitate many parsing, text-handling, and data-management tasks. In Perl 5, features were added that support complex data structures, first-class functions (that is, closures as values), and an object-oriented programming model. These include references, packages, class-based method dispatch, and lexically scoped variables, along with compiler directives (for example, the strict pragma). A major additional feature introduced with Perl 5 was the ability to package code as reusable modules. Larry Wall later stated that "The whole intent of Perl 5's module system was to encourage the growth of Perl culture rather than the Perl core."[32] All versions of Perl do automatic data typing and memory management. The interpreter knows the type and storage requirements of every data object in the program; it allocates and frees storage for them as necessary using reference counting (so it cannot deallocate circular data structures without manual intervention). Legal type conversions (for example, conversions from number to string) are done automatically at run time; illegal type conversions are fatal errors.

Design
The design of Perl can be understood as a response to three broad trends in the computer industry: falling hardware costs, rising labor costs, and improvements in compiler technology. Many earlier computer languages, such as Fortran and C, were designed to make efficient use of expensive computer hardware. In contrast, Perl is designed to make efficient use of expensive computer programmers. Perl has many features that ease the programmer's task at the expense of greater CPU and memory requirements. These include automatic memory management; dynamic typing; strings, lists, and hashes; regular expressions; introspection; and an eval() function. Wall was trained as a linguist, and the design of Perl is very much informed by linguistic principles. Examples include Huffman coding (common constructions should be short), good end-weighting (the important information should come first), and a large collection of language primitives. Perl favors language constructs that are concise and natural for humans to read and write, even where they complicate the Perl interpreter.



Perl syntax reflects the idea that "things that are different should look different." For example, scalars, arrays, and hashes have different leading sigils. Array indices and hash keys use different kinds of braces. Strings and regular expressions have different standard delimiters. This approach can be contrasted with languages such as Lisp, where the same S-expression construct and basic syntax are used for many different purposes. Perl does not enforce any particular programming paradigm (procedural, object-oriented, functional, and others) or even require the programmer to choose among them. There is a broad practical bent to both the Perl language and the community and culture that surround it. The preface to Programming Perl begins, "Perl is a language for getting your job done." One consequence of this is that Perl is not a tidy language. It includes many features, tolerates exceptions to its rules, and employs heuristics to resolve syntactical ambiguities. Because of the forgiving nature of the compiler, bugs can sometimes be hard to find. Discussing the variant behaviour of built-in functions in list and scalar contexts, the perlfunc(1) manual page says, "In general, they do what you want, unless you want consistency." In addition to Larry Wall's two slogans mentioned above, Perl has several mottos that convey aspects of its design and use, including "Perl: the Swiss Army Chainsaw of Programming Languages" and "No unnecessary limits". Perl has also been called "The Duct Tape of the Internet".[33] No written specification or standard for the Perl language exists for Perl versions through Perl 5, and there are no plans to create one for the current version of Perl. There has been only one implementation of the interpreter, and the language has evolved along with it. That interpreter, together with its functional tests, stands as a de facto specification of the language.
Perl 6, however, started with a specification,[34] and several projects[35] aim to implement some or all of the specification.

Applications

Perl has many and varied applications, compounded by the availability of many standard and third-party modules.

Perl has been used since the early days of the Web to write CGI scripts. It is known as one of "the three Ps" (along with Python and PHP), the most popular dynamic languages for writing Web applications. It is also an integral component of the popular LAMP solution stack for web development. Large projects written in Perl include cPanel, Slash, Bugzilla, RT, TWiki, and Movable Type. Many high-traffic websites use Perl extensively. Examples include Amazon.com, bbc.co.uk, Priceline.com, Craigslist, IMDb,[36] LiveJournal, Slashdot, and Ticketmaster.

Perl is often used as a glue language, tying together systems and interfaces that were not specifically designed to interoperate, and for "data munging,"[37] that is, converting or processing large amounts of data for tasks such as creating reports. In fact, these strengths are intimately linked. The combination makes Perl a popular all-purpose language for system administrators, particularly because short programs can be entered and run on a single command line.

With a degree of care, Perl code can be made portable across Windows and Unix. Portable Perl code is often used by suppliers of software (both COTS and bespoke) to simplify packaging and maintenance of software build and deployment scripts.

Graphical user interfaces (GUIs) may be developed using Perl. For example, Perl/Tk is commonly used to enable user interaction with Perl scripts. Such interaction may be synchronous or asynchronous, using callbacks to update the GUI. For more information about the technologies involved, see Tk, Tcl, WxPerl, and Prima Perl.

Perl is also widely used in finance and bioinformatics, where it is valued for rapid application development and deployment and for its capability to handle large data sets.
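To illustrate the "data munging" role described above, here is a minimal sketch of a report-generating script; the hostnames and log format are invented for the example and are not taken from the original text.

```perl
#!/usr/bin/perl
# Hypothetical "data munging" sketch: count requests per host in a
# CSV-like access log and print a small report. The data is made up.
use strict;
use warnings;

my @log = (
    'alpha.example.com,/index.html',
    'beta.example.com,/about.html',
    'alpha.example.com,/contact.html',
);

my %hits;
for my $line (@log) {
    my ($host, $path) = split /,/, $line;   # split each record on commas
    $hits{$host}++;                         # tally requests per host
}

for my $host (sort keys %hits) {
    print "$host: $hits{$host}\n";
}
```

In practice the data would come from a file handle or standard input rather than an in-memory list, which is what makes this style convenient in one-liners and administrative scripts.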



Implementation

Perl is implemented as a core interpreter, written in C, together with a large collection of modules, written in Perl and C. The source distribution is, as of 2009, 13.5 MB when packaged in a tar file and compressed.[38] The interpreter is 150,000 lines of C code and compiles to a 1 MB executable on typical machine architectures. Alternatively, the interpreter can be compiled to a link library and embedded in other programs. There are nearly 500 modules in the distribution, comprising 200,000 lines of Perl and an additional 350,000 lines of C code. (Much of the C code in the modules consists of character-encoding tables.)

The interpreter has an object-oriented architecture. All of the elements of the Perl language—scalars, arrays, hashes, coderefs, file handles—are represented in the interpreter by C structs. Operations on these structs are defined by a large collection of macros, typedefs, and functions; these constitute the Perl C API. The Perl API can be bewildering to the uninitiated, but its entry points follow a consistent naming scheme, which provides guidance to those who use it.

The life of a Perl interpreter divides broadly into a compile phase and a run phase.[39] In Perl, the phases are the major stages in the interpreter's life cycle. Each interpreter goes through each phase only once, and the phases follow in a fixed sequence. Most of what happens in Perl's compile phase is compilation, and most of what happens in Perl's run phase is execution, but there are significant exceptions. Perl makes important use of its capability to execute Perl code during the compile phase. Perl will also delay compilation into the run phase. The terms that indicate the kind of processing that is actually occurring at any moment are compile time and run time. Perl is in compile time at most points during the compile phase, but compile time may also be entered during the run phase.
The compile time for code in a string argument passed to the eval built-in occurs during the run phase. Perl is often in run time during the compile phase and spends most of the run phase in run time. Code in BEGIN blocks executes at run time but in the compile phase. At compile time, the interpreter parses Perl code into a syntax tree. At run time, it executes the program by walking the tree. Text is parsed only once, and the syntax tree is subject to optimization before it is executed, so that execution is relatively efficient. Compile-time optimizations on the syntax tree include constant folding and context propagation, but peephole optimization is also performed. Perl has a Turing-complete grammar because parsing can be affected by run-time code executed during the compile phase.[40] Therefore, Perl cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, the interpreter implements its own lexer, which coordinates with a modified GNU bison parser to resolve ambiguities in the language. It is often said that "Only perl can parse Perl," meaning that only the Perl interpreter (perl) can parse the Perl language (Perl), but even this is not, in general, true. Because the Perl interpreter can simulate a Turing machine during its compile phase, it would need to decide the Halting Problem in order to complete parsing in every case. It's a long-standing result that the Halting Problem is undecidable, and therefore not even perl can always parse Perl. Perl makes the unusual choice of giving the user access to its full programming power in its own compile phase. The cost in terms of theoretical purity is high, but practical inconvenience seems to be rare. Other programs that undertake to parse Perl, such as source-code analyzers and auto-indenters, have to contend not only with ambiguous syntactic constructs but also with the undecidability of Perl parsing in the general case. 
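The interplay of phases described above can be observed directly with a BEGIN block: its code runs while the surrounding program is still being compiled, so it executes before statements that precede it in the source text.

```perl
# Demonstrates that a BEGIN block runs during the compile phase:
# the 'compile phase' entry is recorded before the push statement
# that appears earlier in the source, because that statement does
# not execute until the run phase.
use strict;
use warnings;

our @order;
push @order, 'run phase';                 # executes during the run phase
BEGIN { push @order, 'compile phase' }    # executes while compiling

print "@order\n";   # prints "compile phase run phase"
```

This is the same mechanism that `use` relies on: a `use` statement is largely equivalent to a `require` and `import` wrapped in a BEGIN block.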
Adam Kennedy's PPI project focused on parsing Perl code as a document (retaining its integrity as a document), instead of parsing Perl as executable code (which not even Perl itself can always do). It was Kennedy who first conjectured that "parsing Perl suffers from the 'Halting Problem'",[41] and this was later proved.[42]

Perl is distributed with some 120,000 functional tests. These run as part of the normal build process and extensively exercise the interpreter and its core modules. Perl developers rely on the functional tests to ensure that changes to the interpreter do not introduce bugs; conversely, Perl users who see that the interpreter passes its functional tests on


their system can have a high degree of confidence that it is working properly.

Maintenance of the Perl interpreter has become increasingly difficult over the years. The code base has been in continuous development since 1994. The code has been optimized for performance at the expense of simplicity, clarity, and strong internal interfaces. New features have been added, yet virtually complete backward compatibility with earlier versions is maintained. Major releases of Perl were formerly coordinated by Perl pumpkings, who handled the integration of patch submissions and bug fixes; development has since changed to a rotating, monthly release cycle. Development discussion takes place via the perl5-porters mailing list. As of Perl 5.11, development efforts have included refactoring certain core modules, known as "dual-lifed" modules, out of the Perl core[43] to help alleviate some of these problems.

Availability

Perl is free software and is licensed under both the Artistic License and the GNU General Public License. Distributions are available for most operating systems. It is particularly prevalent on Unix and Unix-like systems, but it has been ported to most modern (and many obsolete) platforms. With only six reported exceptions, Perl can be compiled from source code on all Unix-like, POSIX-compliant, or otherwise-Unix-compatible platforms.[44] However, this is rarely necessary, because Perl is included in the default installation of many popular operating systems. Because of unusual changes required for the Mac OS Classic environment, a special port called MacPerl was shipped independently.[45]

The Comprehensive Perl Archive Network (CPAN)[46] carries a complete list of supported platforms with links to the distributions available on each.[47] CPAN is also the source for publicly available Perl modules that are not part of the core Perl distribution.

Windows

Users of Microsoft Windows typically install one of the native binary distributions of Perl for Win32,[48] most commonly Strawberry Perl or ActivePerl. Compiling Perl from source code under Windows is possible, but most installations lack the requisite C compiler and build tools. This also makes it difficult to install modules from CPAN, particularly those that are partially written in C. Users of the ActivePerl binary distribution are, therefore, dependent on the repackaged modules provided in ActiveState's module repository, which are precompiled and can be installed with PPM. Limited resources to maintain this repository have been the cause of various long-standing problems.[49][50]

Strawberry Perl[51] is an open-source distribution for Windows. It has had regular, quarterly releases since January 2008, including new modules as feedback and requests come in.
Strawberry Perl aims to be able to install modules like standard Perl distributions on other platforms, including compiling XS modules. Strawberry Perl started in part as a way to address the flaws in ActiveState's distribution and to resolve other problems of Perl on the Windows platform.

A community project[52] was launched by Adam Kennedy on behalf of The Perl Foundation in June 2006 as a community website for "all things Windows and Perl." A major aim of this project is to provide production-quality alternative Perl distributions that include an embedded C compiler and build tools, so as to enable Windows users to install modules directly from CPAN. Related research and experimental work was done in the Vanilla Perl distribution.[53]

The Cygwin emulation layer is another popular way of running Perl under Windows. Cygwin provides a Unix-like environment on Windows, and both perl and cpan are conveniently available as standard pre-compiled packages in the Cygwin setup program. Because Cygwin also includes gcc, compiling Perl from source is also possible.



Language structure

In Perl, the minimal Hello world program may be written as follows:

print "Hello, world!\n";

This prints the string Hello, world! and a newline, symbolically expressed by an n character whose interpretation is altered by the preceding escape character (a backslash). The canonical form of the program is slightly more verbose:

#!/usr/bin/perl
print "Hello, world!\n";

The hash mark character introduces a comment in Perl, which runs up to the end of the line of code and is ignored by the compiler. The comment used here is of a special kind: it is called the shebang line. This tells Unix-like operating systems where to find the Perl interpreter, making it possible to invoke the program without explicitly mentioning perl. (Note that, on Microsoft Windows systems, Perl programs are typically invoked by associating the .pl extension with the Perl interpreter. In order to deal with such circumstances, perl detects the shebang line and parses it for switches;[54] therefore, it is not strictly true that the shebang line is ignored by the compiler.)

The second line in the canonical form includes a semicolon, which is used to separate statements in Perl. With only a single statement in a block or file, a separator is unnecessary, so it can be omitted from the minimal form of the program—or more generally from the final statement in any block or file. The canonical form includes it because it is common to terminate every statement even when it is unnecessary to do so, as this makes editing easier: code can be added to, or moved away from, the end of a block or file without having to adjust semicolons.

Version 5.10 of Perl introduces a say function that implicitly appends a newline character to its output, making the minimal "Hello world" program even shorter:

use 5.010; # must be present to import the new 5.10 functions; note that it is 5.010, not 5.10
say 'Hello, world!'

Data types

Perl has a number of fundamental data types. The most commonly used and discussed are scalars, arrays, hashes, filehandles, and subroutines:
• A scalar is a single value; it may be a number, a string, or a reference.
• An array is an ordered collection of scalars.
• A hash, or associative array, is a map from strings to scalars; the strings are called keys, and the scalars are called values.
• A file handle is a map to a file, device, or pipe that is open for reading, writing, or both.
• A subroutine is a piece of code that may be passed arguments, be executed, and return data.

Most variables are marked by a leading sigil, which identifies the data type being accessed (not the type of the variable itself), except filehandles, which don't have a sigil. The same name may be used for variables of different data types, without conflict.



Sigil    Example   Description
$        $foo      a scalar
@        @foo      an array
%        %foo      a hash
(none)   FOO       a file handle
&        &foo      a subroutine (the & is optional in some contexts)
*        *foo      a typeglob
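The sigils in the table above name the kind of thing being accessed, not the variable itself, so the same name can simultaneously denote a scalar, an array, and a hash. A short sketch (the variable contents are invented for illustration):

```perl
# The same name, three independent variables; the sigil in each
# access expression selects which one is meant.
use strict;
use warnings;

our $foo = 'a scalar';
our @foo = ('an', 'array');
our %foo = (kind => 'a hash');

print "$foo\n";        # the scalar $foo
print "$foo[1]\n";     # element 1 of @foo, accessed with the $ sigil
print "$foo{kind}\n";  # the value for key 'kind' in %foo
```

Note that element access uses the scalar sigil ($foo[1], $foo{kind}), because a single element of an array or hash is itself a scalar.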

File handles and constants need not be uppercase, but it is a common convention because there is no sigil to denote them. Both are global in scope, but file handles are interchangeable with references to file handles, which can be stored in scalars, which in turn permit lexical scoping. Doing so is encouraged in Damian Conway's Perl Best Practices. As a convenience, the open function in Perl 5.6 and newer will accept a scalar variable, which will be set (autovivified) to a reference to an anonymous file handle, in place of a named file handle.

Scalar values

String values (literals) must be enclosed by quotes. Enclosing a string in double quotes allows the values of variables whose names appear in the string to automatically replace the variable name (or be interpolated) in the string. Enclosing a string in single quotes prevents variable interpolation. If $name is "Jim", print("My name is $name") will print "My name is Jim", but print('My name is $name') will print "My name is $name".

To include a double quotation mark in a string, precede it with a backslash or enclose the string in single quotes. To include a single quotation mark, precede it with a backslash or enclose the string in double quotes. Strings can also be quoted with the q and qq quote-like operators: 'this' is identical to q(this), and "$this" is identical to qq($this). Finally, multiline strings can be defined using here documents:

$multilined_string = <<EOF;
This is my multilined string
note that I am terminating it with the word "EOF".
EOF

Numbers (numeric constants) do not require quotation. Perl will convert numbers into strings and vice versa depending on the context in which they are used. When strings are converted into numbers, trailing non-numeric parts of the strings are discarded. If no leading part of a string is numeric, the string will be converted to the number 0. In the following example, the strings $n and $m are treated as numbers. This code prints the number '5'.
$n = '3 apples';
$m = '2 oranges';
print $n + $m;

The values of the variables remain the same. Note that in Perl, + is always the numeric addition operator; the string concatenation operator is the period.

Functions are provided for the rounding of fractional values to integer values: int chops off the fractional part, rounding towards zero; POSIX::ceil and POSIX::floor always round up and always round down, respectively. The number-to-string conversion performed by printf "%f" and sprintf "%f" rounds to even, i.e., uses bankers' rounding.

Perl also has a boolean context that it uses in evaluating conditional statements. The following values all evaluate as false in Perl:

$false = 0;   # the number zero
$false = 0.0; # the number zero as a float


Perl

$false = 0b0;   # the number zero in binary
$false = 0x0;   # the number zero in hexadecimal
$false = '0';   # the string zero
$false = "";    # the empty string
$false = undef; # the return value from undef
$false = 2-3+1; # computes to 0, which is converted to "0", so it is false

All other (non-zero evaluating) values evaluate to true. This includes the odd self-describing literal string "0 but true", which in fact is 0 as a number, but true when used as a boolean. All non-numeric strings also have this property, but this particular string is converted to a number by Perl without a numeric warning. A less explicit but more conceptually portable version of this string is '0E0' or '0e0', which does not rely on characters being evaluated as 0, because '0E0' is literally zero times ten to the power zero.

Evaluated boolean expressions are also scalar values. The documentation does not promise which particular value of true or false is returned. Many boolean operators return 1 for true and the empty string for false. The defined() function determines whether a variable has any value set. In the above examples, defined($false) is true for every value except undef. If either 1 or 0 is specifically needed, an explicit conversion can be done:

my $real_result = $boolean_result ? 1 : 0;

Array values

An array value (or list) is specified by listing its elements, separated by commas, enclosed by parentheses (at least where required by operator precedence).

@scores = (32, 45, 16, 5);

The qw() quote-like operator allows the definition of a list of strings without typing of quotes and commas. Almost any delimiter can be used instead of parentheses. The following lines are equivalent:

@names = ('Billy', 'Joe', 'Jim-Bob');
@names = qw(Billy Joe Jim-Bob);

The split function returns a list of strings, which are split from a string expression using a delimiter string or regular expression.

@scores = split(',', '32,45,16,5');

Individual elements of a list are accessed by providing a numerical index in square brackets. The scalar sigil must be used. Sublists (array slices) can also be specified, using a range or list of numeric indices in brackets. The array sigil is used in this case. For example, $month[3] is "March", and @month[4..6] is ("April", "May", "June").
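The array operations above can be combined in a short runnable sketch; the names and values are illustrative.

```perl
# qw() lists, split, single-element access, and array slices.
use strict;
use warnings;

my @names  = qw(Billy Joe Jim-Bob);      # same as ('Billy', 'Joe', 'Jim-Bob')
my @scores = split(',', '32,45,16,5');   # ('32', '45', '16', '5')

my $first = $names[0];         # scalar sigil for a single element
my @pair  = @scores[1 .. 2];   # array sigil for a slice: ('45', '16')

print "$first @pair\n";        # prints "Billy 45 16"
```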
Hash values

A hash (or associative array) may be initialized from a list of key/value pairs. If the keys are separated from the values with the => operator (sometimes called a fat comma), rather than a comma, they may be unquoted (barewords). The following lines are equivalent:

%favorite = ('joe', "red", 'sam', "blue");
%favorite = (joe => 'red', sam => 'blue');

Individual values in a hash are accessed by providing the corresponding key, in curly braces. The $ sigil identifies the accessed element as a scalar. For example, $favorite{joe} equals 'red'. A hash can also be initialized by setting its values individually:


$favorite{joe} = 'red';
$favorite{sam} = 'blue';
$favorite{oscar} = 'green';

Multiple elements may be accessed using the @ sigil instead (identifying the result as a list). For example, @favorite{'joe', 'sam'} equals ('red', 'blue').

Typeglob values

A typeglob value is a symbol table entry. The main use of typeglobs is creating symbol table aliases. For example:

*PI = \3.141592653;  # creating constant scalar $PI
*this = *that;       # creating aliases for all data types 'this' to all data types 'that'

Array functions

The number of elements in an array can be determined either by evaluating the array in scalar context or with the help of the $# sigil. The latter gives the index of the last element in the array, not the number of elements. The expressions scalar(@array) and ($#array + 1) are equivalent.

Hash functions

There are a few functions that operate on entire hashes. The keys function takes a hash and returns the list of its keys. Similarly, the values function returns a hash's values. Note that the keys and values are returned in a consistent but arbitrary order.

# Every call to each returns the next key/value pair.
# All values will be eventually returned, but their order
# cannot be predicted.
while (($name, $address) = each %addressbook) {
    print "$name lives at $address\n";
}

# Similar to the above, but sorted alphabetically
foreach my $next_name (sort keys %addressbook) {
    print "$next_name lives at $addressbook{$next_name}\n";
}
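A small self-contained sketch of these array and hash functions (the sample data is invented):

```perl
# scalar(@array) equals $#array + 1, and keys/values walk a whole hash.
use strict;
use warnings;

my @scores = (32, 45, 16, 5);
my $count1 = scalar(@scores);  # evaluate the array in scalar context: 4
my $count2 = $#scores + 1;     # index of the last element, plus one: 4

my %addressbook = (Alice => 'Oak St', Bob => 'Elm St');
my @names   = sort keys %addressbook;        # ('Alice', 'Bob')
my $n_pairs = scalar(values %addressbook);   # 2 values, one per key

print "$count1 $count2 $n_pairs @names\n";   # prints "4 4 2 Alice Bob"
```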

Control structures Perl has several kinds of control structures. It has block-oriented control structures, similar to those in the C, Javascript, and Java programming languages. Conditions are surrounded by parentheses, and controlled blocks are surrounded by braces: label while ( cond ) { ... } label while ( cond ) { ... } continue { ... } label for ( init-expr ; cond-expr ; incr-expr ) { ... } label foreach var ( list ) { ... } label foreach var ( list ) { ... } continue { ... } if ( cond ) { ... } if ( cond ) { ... } else { ... } if ( cond ) { ... } elsif ( cond ) { ... } else { ... }


Where only a single statement is being controlled, statement modifiers provide a more-concise syntax:

statement if cond ;
statement unless cond ;
statement while cond ;
statement until cond ;
statement foreach list ;

Short-circuit logical operators are commonly used to affect control flow at the expression level:

expr and expr
expr && expr
expr or expr
expr || expr

(The "and" and "or" operators are similar to && and || but have lower precedence, which makes it easier to use them to control entire statements.) The flow control keywords next (corresponding to C's continue), last (corresponding to C's break), return, and redo are expressions, so they can be used with short-circuit operators. Perl also has two implicit looping constructs, each of which has two forms:

results = grep { ... } list
results = grep expr, list
results = map { ... } list
results = map expr, list

grep returns all elements of list for which the controlled block or expression evaluates to true. map evaluates the controlled block or expression for each element of list and returns a list of the resulting values. These constructs enable a simple functional programming style.

Up until the 5.10.0 release, there was no switch statement in Perl 5. From 5.10.0 onward, a multi-way branch statement called given/when is available, which takes the following form:

use v5.10; # must be present to import the new 5.10 functions

given ( expr ) {
    when ( cond ) { ... }
    default { ... }
}

Syntactically, this structure behaves similarly to switch statements found in other languages, but with a few important differences. The largest is that, unlike switch/case structures, given/when statements break execution after the first successful branch, rather than waiting for explicitly defined break commands; conversely, explicit continues are necessary to emulate switch behavior. For those not using Perl 5.10, the Perl documentation describes a half-dozen ways to achieve the same effect by using other control structures. There is also a Switch module, which provides functionality modeled on the forthcoming Perl 6 re-design. It is implemented using a source filter, so its use is unofficially discouraged.[55]

Perl includes a goto label statement, but it is rarely used. Situations where a goto is called for in other languages don't occur as often in Perl because of its breadth of flow-control options. There is also a goto &sub statement that performs a tail call. It terminates the current subroutine and immediately calls the specified sub. This is used in situations where a caller can perform more-efficient stack management than Perl itself (typically because no change to the current stack is required), and in deep recursion, tail calling can have a substantial positive impact on performance because it avoids the overhead of scope/stack management on return.
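The grep and map constructs described above can be shown in a few lines (the data is illustrative):

```perl
# grep filters a list; map transforms it.
use strict;
use warnings;

my @numbers = (1, 2, 3, 4, 5, 6);
my @even    = grep { $_ % 2 == 0 } @numbers;  # block form: (2, 4, 6)
my @squares = map  { $_ * $_ }     @numbers;  # (1, 4, 9, 16, 25, 36)

# The expression forms are equivalent to the block forms:
my @odd = grep $_ % 2, @numbers;              # (1, 3, 5)

print "@even | @squares | @odd\n";
```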



Subroutines

Subroutines are defined with the sub keyword and are invoked simply by naming them. If the subroutine in question has not yet been declared, invocation requires either parentheses after the function name or an ampersand (&) before it. But using & without parentheses will also implicitly pass the arguments of the current subroutine to the one called, and using & with parentheses will bypass prototypes.

# Calling a subroutine
# Parentheses are required here if the subroutine is defined later in the code
foo();
&foo; # (this also works, but has other consequences regarding arguments passed to the subroutine)

# Defining a subroutine
sub foo { ... }
foo; # Here parentheses are not required

A list of arguments may be provided after the subroutine name. Arguments may be scalars, lists, or hashes.

foo $x, @y, %z;

The parameters to a subroutine do not need to be declared as to either number or type; in fact, they may vary from call to call. Any validation of parameters must be performed explicitly inside the subroutine. Arrays are expanded to their elements; hashes are expanded to a list of key/value pairs; and the whole lot is passed into the subroutine as one flat list of scalars.

Whatever arguments are passed are available to the subroutine in the special array @_. The elements of @_ are aliases to the actual arguments; changing an element of @_ changes the corresponding argument. Elements of @_ may be accessed by subscripting it in the usual way:

$_[0], $_[1]

However, the resulting code can be difficult to read, and the parameters have pass-by-reference semantics, which may be undesirable. One common idiom is to assign @_ to a list of named variables.

my ($x, $y, $z) = @_;

This provides mnemonic parameter names and implements pass-by-value semantics. The my keyword indicates that the following variables are lexically scoped to the containing block. Another idiom is to shift parameters off of @_.
This is especially common when the subroutine takes only one argument or for handling the $self argument in object-oriented modules.

my $x = shift;

Subroutines may assign @_ to a hash to simulate named arguments; this is recommended in Perl Best Practices for subroutines that are likely to ever have more than three parameters.[56]

sub function1 {
    my %args = @_;
    print "'x' argument was '$args{x}'\n";


}

function1( x => 23 );

Subroutines may return values.

return 42, $x, @y, %z;

If the subroutine does not exit via a return statement, then it returns the last expression evaluated within the subroutine body. Arrays and hashes in the return value are expanded to lists of scalars, just as they are for arguments. The returned expression is evaluated in the calling context of the subroutine; this can surprise the unwary.

sub list  { (4, 5, 6) }
sub array { @x = (4, 5, 6); @x }

$x = list;  # returns 6 - last element of list
$x = array; # returns 3 - number of elements in list
@x = list;  # returns (4, 5, 6)
@x = array; # returns (4, 5, 6)

A subroutine can discover its calling context with the wantarray function.

sub either {
    return wantarray ? (1, 2) : 'Oranges';
}

$x = either; # returns "Oranges"
@x = either; # returns (1, 2)
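The pass-by-reference behavior of @_ noted above can be demonstrated directly; the subroutine names here are invented for the example.

```perl
# The elements of @_ are aliases to the caller's arguments, so
# assigning to $_[0] modifies the variable that was passed in.
use strict;
use warnings;

sub increment { $_[0] += 1 }    # changes the caller's variable in place

my $counter = 5;
increment($counter);
print "$counter\n";             # prints 6

# Copying @_ into named variables gives pass-by-value semantics:
sub increment_copy { my ($n) = @_; $n += 1; return $n }

my $result = increment_copy($counter);   # $counter is left untouched
print "$counter $result\n";              # prints "6 7"
```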

Regular expressions

The Perl language includes a specialized syntax for writing regular expressions (RE, or regexes), and the interpreter contains an engine for matching strings to regular expressions. The regular-expression engine uses a backtracking algorithm, extending its capabilities from simple pattern matching to string capture and substitution. The regular-expression engine is derived from regex, written by Henry Spencer.

The Perl regular-expression syntax was originally taken from Unix Version 8 regular expressions. However, it diverged before the first release of Perl and has since grown to include far more features. Many other languages and applications, such as PHP, Ruby, Java, Microsoft's .NET Framework,[57] and the Apache HTTP Server, are now adopting Perl-compatible regular expressions in preference to POSIX regular expressions.

Regular-expression syntax is extremely compact, owing to history. The first regular-expression dialects were only slightly more expressive than globs, and the syntax was designed so that an expression would resemble the text that it matches. This meant using no more than a single punctuation character or a pair of delimiting characters to express the few supported assertions. Over time, the expressiveness of regular expressions grew tremendously, but the syntax design was never revised and continues to rely on punctuation. As a result, regular expressions can be cryptic and extremely dense.


Uses

The m// (match) operator introduces a regular-expression match. (If it is delimited by slashes, as in all of the examples here, then the leading m may be omitted for brevity. If the m is present, as in all of the following examples, other delimiters can be used in place of slashes.) In the simplest case, an expression such as

$x =~ /abc/;

evaluates to true if and only if the string $x matches the regular expression abc. The s/// (substitute) operator, on the other hand, specifies a search-and-replace operation:

$x =~ s/abc/aBc/; # upcase the b

Another use of regular expressions is to specify delimiters for the split function:

@words = split /,/, $line;

The split function creates a list of the parts of the string that are separated by matches of the regular expression. In this example, a line is divided into a list of its comma-separated parts, and this list is then assigned to the @words array.

Syntax

Modifiers

Perl regular expressions can take modifiers. These are single-letter suffixes that modify the meaning of the expression:

$x =~ /abc/i;      # case-insensitive pattern match
$x =~ s/abc/aBc/g; # global search and replace

Because the compact syntax of regular expressions can make them dense and cryptic, the /x modifier was added in Perl to help programmers write more-legible regular expressions. It allows programmers to place whitespace and comments inside regular expressions:

$x =~ /
    a # match 'a'
    . # followed by any character
    c # then followed by the 'c' character
/x;

Capturing

Portions of a regular expression may be enclosed in parentheses; corresponding portions of a matching string are captured. Captured strings are assigned to the sequential built-in variables $1, $2, $3, ..., and a list of captured strings is returned as the value of the match.

$x =~ /a(.)c/; # capture the character between 'a' and 'c'

Captured strings $1, $2, $3, ... can be used later in the code.
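The match, substitute, split, and capture operations described above, combined in one runnable sketch (the strings are illustrative):

```perl
# m//, s///, split, and a capture group in one small example.
use strict;
use warnings;

my $x = 'abcabc';
my $matched = ($x =~ /abc/) ? 1 : 0;   # 1: the pattern matches

(my $y = $x) =~ s/abc/aBc/;            # copy $x, then substitute once
# $y is now "aBcabc"; $x is unchanged

my ($middle) = ($x =~ /a(.)c/);        # capture the character between a and c
# $middle is "b"

my @words = split /,/, 'one,two,three';
print "$matched $y $middle @words\n";
```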
Perl regular expressions also allow built-in or user-defined functions to be applied to the captured match, by using the /e modifier:

$x = "Oranges";
$x =~ s/(ge)/uc($1)/e; # OranGEs
$x .= $1;              # append $x with the contents of the match in the previous


statement: OranGEsge

Objects

There are many ways to write object-oriented code in Perl. The most basic is using "blessed" references. Many modern Perl applications use the Moose object system.

Examples

An example of a class written using the MooseX::Declare[58] extension to Moose:

use MooseX::Declare;

class Point3D extends Point {
    has 'z' => (isa => 'Num', is => 'rw');

    after clear {
        $self->z(0);
    }

    method set_to (Num $x, Num $y, Num $z) {
        $self->x($x);
        $self->y($y);
        $self->z($z);
    }
}

This is a class named Point3D that extends another class named Point, explained in the Moose examples. It adds to its base class a new attribute z, redefines the method set_to, and extends the method clear.
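For contrast with the Moose style, here is a minimal sketch of the "blessed references" approach mentioned above, using only core Perl. The class name and accessor names are invented for the example.

```perl
# A minimal class built on a blessed hash reference (core Perl only).
package Point;
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    my $self = {
        x => defined $args{x} ? $args{x} : 0,
        y => defined $args{y} ? $args{y} : 0,
    };
    return bless $self, $class;   # associate the hashref with the class
}

sub get_x { $_[0]->{x} }
sub get_y { $_[0]->{y} }

sub set_to {
    my ($self, $x, $y) = @_;
    @{$self}{'x', 'y'} = ($x, $y);   # hash slice assignment
    return $self;
}

package main;

my $p = Point->new(x => 1, y => 2);
$p->set_to(3, 4);
print $p->get_x, ',', $p->get_y, "\n";   # prints 3,4
```

Systems such as Moose generate this kind of boilerplate (constructors, accessors, method modifiers) automatically, which is why they are preferred in larger applications.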

Database interfaces

Perl is widely favored for database applications. Its text-handling facilities are useful for generating SQL queries; arrays, hashes, and automatic memory management make it easy to collect and process the returned data.

In early versions of Perl, database interfaces were created by relinking the interpreter with a client-side database library. This was sufficiently difficult that it was done for only a few of the most-important and most widely used databases, and it restricted the resulting perl executable to using just one database interface at a time.

In Perl 5, database interfaces are implemented by Perl DBI modules. The DBI (Database Interface) module presents a single, database-independent interface to Perl applications, while the DBD (Database Driver) modules handle the details of accessing some 50 different databases; there are DBD drivers for most ANSI SQL databases. DBI provides caching for database handles and queries, which can greatly improve performance in long-lived execution environments such as mod_perl,[59] helping high-volume systems avert load spikes as in the Slashdot effect.

In modern Perl applications, especially those written using Web application frameworks such as Catalyst, the DBI module is often used indirectly via object-relational mappers such as DBIx::Class, Class::DBI, or Rose::DB::Object, which generate SQL queries and handle data transparently to the application author.
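A sketch of typical DBI usage follows. It is not runnable as-is: it assumes an installed DBD::SQLite driver, and the database file, table, and column names are hypothetical.

```perl
# Sketch of the DBI pattern: connect, prepare, execute, fetch.
# The data source, table, and columns are illustrative assumptions.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=example.db', '', '',
                       { RaiseError => 1 });

# Placeholders (?) let the driver quote values safely.
my $sth = $dbh->prepare('SELECT name, address FROM people WHERE city = ?');
$sth->execute('Springfield');

while (my ($name, $address) = $sth->fetchrow_array) {
    print "$name lives at $address\n";
}

$dbh->disconnect;
```

Swapping databases largely means changing the `dbi:...` data-source string, since the DBD driver hides the database-specific details behind the same DBI calls.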



Comparative performance

The Computer Language Benchmarks Game[60] compares the performance of implementations of typical programming problems in several programming languages. The submitted Perl implementations were typically toward the high end of the memory-usage spectrum and had varied speed results. Perl's performance in the benchmarks game is typical for interpreted languages.

Large Perl programs start slower than similar programs in compiled languages because perl has to compile the source every time it runs. In a talk at the YAPC::Europe 2005 conference and subsequent article "A Timely Start," Jean-Louis Leroy found that his Perl programs took much longer to run than he expected because the perl interpreter spent much of the time finding modules because of his over-large include path.[61] Unlike Java, Python, and Ruby, Perl has only experimental support for pre-compiling.[62] Therefore, Perl programs pay this overhead penalty on every execution. The run phase of typical programs is long enough that amortized startup time is not substantial, but results in benchmarks that measure very short execution times are likely to be skewed.

A number of tools have been introduced to improve this situation. The first such tool was Apache's mod_perl, which sought to address one of the most common reasons that small Perl programs were invoked rapidly: CGI Web development. ActivePerl, via Microsoft ISAPI, provides similar performance improvements.

Once Perl code is compiled, there is additional overhead during the execution phase that typically isn't present for programs written in compiled languages such as C or C++. Examples of such overhead include bytecode interpretation, reference-counting memory management, and dynamic type checking.

Optimizing

Like any code, Perl programs can be tuned for performance using benchmarks and profiles after a readable and correct implementation is finished. In part because of Perl's interpreted nature, writing more efficient Perl will not always be enough to meet one's performance goals for a program. In such situations, the most critical routines of a Perl program can be written in other languages such as C or assembler, which can be connected to Perl via simple Inline modules or the more complex but flexible XS mechanism.[63] Nicholas Clark, a Perl core developer, discusses some Perl design trade-offs and some solutions in When perl is not quite fast enough.[64] In extreme cases, optimizing Perl can require intimate knowledge of the interpreter's workings rather than skill with algorithms, the Perl language, or general principles of optimization.
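Before dropping down to C or XS, the benchmarking step described above is usually done with the core Benchmark module. This sketch compares two illustrative ways of building one string from a list (the routine names are invented for the example):

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

my @words = ('perl') x 1000;

# Time each candidate for about one CPU second each, then print a
# table comparing their rates.
my $results = timethese(-1, {
    join_once   => sub { my $s = join '', @words },
    concat_loop => sub { my $s = ''; $s .= $_ for @words },
});
cmpthese($results);
```

Only once such measurements show that no pure-Perl rewrite is fast enough does it make sense to reach for Inline or XS.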

Future

At the 2000 Perl Conference, Jon Orwant made a case for a major new language initiative.[65] This led to a decision to begin work on a redesign of the language, to be called Perl 6. Proposals for new language features were solicited from the Perl community at large, and more than 300 RFCs were submitted. To be clear, Perl 6 and Perl 5 are different languages that share a common ancestry.

Larry Wall spent the next few years digesting the RFCs and synthesizing them into a coherent framework for Perl 6. He has presented his design for Perl 6 in a series of documents called "apocalypses," which are numbered to correspond to chapters in Programming Perl ("The Camel Book"). The current, not-yet-finalized specification of Perl 6 is encapsulated in design documents called Synopses, which are numbered to correspond to the Apocalypses. Perl 6 is not intended to be backward compatible, although there will be a compatibility mode.

Thesis work by Bradley M. Kuhn, overseen by Larry Wall, considered the possible use of the Java virtual machine as a runtime for Perl.[66] Kuhn's thesis showed this approach to be problematic, and in 2001 it was decided that Perl 6 would run on a cross-language virtual machine called Parrot. This will mean that other languages targeting Parrot will gain native access to CPAN, allowing some level of cross-language development.



In 2005, Audrey Tang created the pugs project, an implementation of Perl 6 in Haskell. This was, and continues to act as, a test platform for the Perl 6 language (separate from the development of the actual implementation), allowing the language designers to explore. The pugs project spawned an active Perl/Haskell cross-language community centered around the freenode #perl6 IRC channel. A number of features in the Perl 6 language now show similarities to Haskell.

As of early 2009, Perl 6 development is primarily centered around Rakudo Perl 6, an implementation running on top of the Parrot virtual machine. Another implementation, Mildew, is also under active development and does not use Parrot.

Development of Perl 5 is also continuing. Perl 5.12.0 was released in April 2010,[67] with some new features influenced by the design of Perl 6.[68] [69]
The Perl community

Perl's culture and community have developed alongside the language itself. Usenet was the first public venue in which Perl was introduced, but over the course of its evolution, Perl's community was shaped by the growth of broadening Internet-based services, including the introduction of the World Wide Web. The community that surrounds Perl was, in fact, the topic of Larry Wall's first "State of the Onion" talk.[70]

State of the Onion

State of the Onion is the name for Larry Wall's yearly keynote-style summaries on the progress of Perl and its community. They are characterized by his hallmark humor, employing references to Perl's culture, the wider hacker culture, Wall's linguistic background, sometimes his family life, and occasionally even his Christian background. Each talk is first given at various Perl conferences and is eventually also published online.

Pastimes

Perl's pastimes have become a defining element of the Perl community. They include both trivial and complex uses of the language.

JAPHs

In email, Usenet, and message-board postings, "Just another Perl hacker" (JAPH) programs have become a common trend, originated by Randal L. Schwartz, one of the earliest professional Perl trainers.[71] In the parlance of Perl culture, Perl programmers are known as Perl hackers, and from this derives the practice of writing short programs to print out the phrase "Just another Perl hacker,". In the spirit of the original concept, these programs are moderately obfuscated and short enough to fit into the signature of an email or Usenet message. The "canonical" JAPH includes the comma at the end, although this is often omitted.

Perl golf

Perl "golf" is the pastime of reducing the number of characters (key "strokes") used in a Perl program to the bare minimum, much as golf players seek to take as few shots as possible in a round. This use of the word "golf" originally focused on the JAPHs used in signatures in Usenet postings and elsewhere, although the same stunts had been an unnamed pastime in the language APL in previous decades. The use of Perl to write a program that performed RSA encryption prompted a widespread and practical interest in this pastime.[72] In subsequent years, the term "code golf" has been applied to the pastime in other languages.[73] A Perl Golf Apocalypse was held at Perl Conference 4.0 in Monterey, California in July 2000.
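In its plainest form the "canonical" JAPH described above is just a print statement; the lightly obfuscated second variant below is illustrative, not one of Schwartz's originals:

```perl
# The canonical phrase, trailing comma included.
print "Just another Perl hacker,\n";

# A lightly obfuscated variant that rebuilds the same phrase.
print join(' ', qw(Just another Perl)), " hacker,\n";
```

Real JAPHs go much further, hiding the phrase in regexes, pack/unpack calls, or self-modifying code while staying short enough for a signature.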


Obfuscation

As with C, obfuscated code competitions are a well-known pastime. The annual Obfuscated Perl Contest made an arch virtue of Perl's syntactic flexibility.

Poetry

Similar to obfuscated code and golf, but with a different purpose, Perl poetry is the practice of writing poems that can actually be compiled as legal (although generally nonsensical) Perl code, for example the piece known as Black Perl. This hobby is more or less unique to Perl because of the large number of regular English words that are used in the language. New poems are regularly published in the Perl Monks site's Perl Poetry section.[74]

Perl on IRC

There are a couple of channels on IRC that offer free Perl support for the language and some modules.

IRC Network        Channels
irc.freenode.net   #perl #cbstream #perlcafe #poe
irc.perl.org       #moose #poe #catalyst #dbix-class #perl-help #distzilla #epo #corehackers #sdl #win32 #toolchain #padre
irc.slashnet.org   #perlmonks
irc.oftc.net       #perl
irc.efnet.net      #perlhelp
irc.rizon.net      #perl
irc.debian.org     #debian-perl

CPAN Acme

There are also many examples of code written purely for entertainment on the CPAN. Lingua::Romana::Perligata, for example, allows writing programs in Latin.[75] Upon execution of such a program, the module translates its source code into regular Perl and runs it. The Perl community has set aside the "Acme" namespace for modules that are fun in nature (but its scope has widened to include exploratory or experimental code, or any other module that is not meant to ever be used in production). Some of the Acme modules are deliberately implemented in amusing ways. This includes Acme::Bleach, one of the first modules in the Acme:: namespace,[76] which allows the program's source code to be "whitened" (i.e., all characters replaced with whitespace) and yet still work.

See also

• Comparison of programming languages
• Just another Perl hacker
• Perl Data Language
• Perl Object Environment
• PerlScript
• Plain Old Documentation



Further reading

• Learning Perl, Fifth Edition (the Llama book), ISBN 0-596-52010-6
• Perl Cookbook, ISBN 0-596-00313-7
• Programming Perl (the Camel book), ISBN 0-596-00027-8
• The Perl Journal [77], published 1996–2006, was the leading publication for and about Perl programming during this time.
• Higher Order Perl [78], ISBN 1-558-60701-3

External links

• Perl.org [1] — Official Perl website
• Perl documentation [79]
• The Perl Foundation [80]
• Official Perl 5 Wiki [81]
• Perl [82] at the Open Directory Project
• The Iron Man Contest [83] — Many Perl RSS feeds aggregated together to form a source of information about the Perl community as a whole.

References

[1] http://www.perl.org/
[2] What is Perl? (http://perl.about.com/od/gettingstartedwithperl/p/whatisperl.htm)
[3] Beginner's Introduction to Perl (http://www.perl.com/pub/a/2000/10/begperl1.html)
[4] Ashton, Elaine (1999). "The Timeline of Perl and its Culture (v3.0_0505)" (http://history.perl.org/PerlTimeline.html).
[5] Wall, Larry, Tom Christiansen and Jon Orwant (July 2000). Programming Perl, Third Edition. O'Reilly. ISBN 0-596-00027-8.
[6] Sheppard, Doug (2000-10-16). "Beginner's Introduction to Perl" (http://www.perl.com/pub/a/2000/10/begperl1.html). O'Reilly Media. Retrieved 2008-07-27.
[7] "Larry Wall" (http://www.perl.com/pub/au/Wall_Larry). Retrieved 2006-08-20.
[8] "Perl, a "replacement" for awk and sed" (http://groups.google.com/groups?selm=350@fig.bbn.com). Retrieved 2007-12-18.
[9] perl5-porters archive (http://www.nntp.perl.org/group/perl.perl5.porters/)
[10] http://perldoc.perl.org/perlhist.html
[11] http://perldoc.perl.org/perl5004delta.html
[12] http://perldoc.perl.org/perl5005delta.html
[13] http://perldoc.perl.org/perl56delta.html
[14] http://perldoc.perl.org/perl561delta.html
[15] http://dev.perl.org/perl6/doc/design/apo/A01.html
[16] http://perldoc.perl.org/perl58delta.html
[17] HaskellWiki (http://www.haskell.org/)
[18] perldelta: what is new for perl 5.10.0 (http://search.cpan.org/~rgarcia/perl-5.10.0-RC2/pod/perl5100delta.pod)
[19] Smart matching in detail (http://search.cpan.org/~rgarcia/perl-5.10.0-RC2/pod/perlsyn.pod#Smart_matching_in_detail)
[20] http://search.cpan.org/~jesse/perl-5.12.0/pod/perl5120delta.pod
[21] http://search.cpan.org/~lbrocard/perl-5.13.0/
[22] "perlfaq1: What's the difference between "perl" and "Perl"?" (http://perldoc.perl.org/perlfaq1.html#What's-the-difference-between-"perl"-and-"Perl"?).
[23] Schwartz, Randal. "PERL as shibboleth and the Perl community" (http://www.perlmonks.org/index.pl?node_id=510594). Retrieved 2007-06-01.
[24] Wall, Larry. "Larry Wall" (http://www.linuxjournal.com/article/3394). Retrieved 2008-10-02.
[25] Steve McConnell (2004) Code Complete, 2nd ed., Microsoft Press, p. 65.
[26] Wall, Larry. "BUGS" (http://perldoc.perl.org/perl.html#BUGS). perl(1) man page. Retrieved 2006-10-13.
[27] Wall, Larry. "Re^7: PERL as shibboleth and the Perl community" (http://www.perlmonks.org/index.pl?node_id=511722). Retrieved 2007-01-03.
[28] O'Reilly—The Perl Camel Usage and Trademark Information (http://perl.oreilly.com/usage/)
[29] Index of /images/perl (http://www.oreillynet.com/images/perl/)
[30] Perl Trademark, User Logos, Perl Marks and more (http://www.perlfoundation.org/perl_trademark)
[31] perlintro(1) man page
[32] Usenet post, May 10, 1997, with ID 199705101952.MAA00756@wall.org.
[33] "The Importance of Perl" (http://www.oreillynet.com/pub/a/oreilly/perl/news/importance_0498.html). O'Reilly & Associates, Inc. April 1998. "As Hassan Schroeder, Sun's first webmaster, remarked: 'Perl is the duct tape of the Internet.'"
[34] See Perl 6 Specification (http://www.perl6.org/specification/)
[35] Perl 6 Implementations (http://www.perl6.org/compilers/)
[36] "IMDb Helpdesk: What software/hardware are you using to run the site?" (http://www.imdb.com/help/search?domain=helpdesk_faq&index=1&file=techinfo). Retrieved 2007-09-01.
[37] Data Munging with Perl (http://books.perl.org/book/95)
[38] http://www.cpan.org/src
[39] A description of the Perl 5 interpreter can be found in Programming Perl, 3rd Ed., chapter 18. See particularly page 467, which carefully distinguishes run phase and compile phase from run time and compile time. Perl "time" and "phase" are often confused.
[40] Schwartz, Randal. "On Parsing Perl" (http://www.perlmonks.org/index.pl?node_id=44722). Retrieved 2007-01-03.
[41] The quote is from Kennedy, Adam (2006). "PPI—Parse, Analyze and Manipulate Perl (without perl)" (http://search.cpan.org/~adamk/PPI-1.201/lib/PPI.pm). CPAN.
[42] "Rice's Theorem". The Perl Review 4 (3): 23–29. Summer 2008; and "Perl is Undecidable". The Perl Review 5 (0): 7–11. Fall 2008. Available online at Kegler, Jeffrey, "Perl and Undecidability" (http://www.jeffreykegler.com/Home/perl-and-undecidability).
[43] Perl 5.11.0 delta (http://search.cpan.org/~jesse/perl-5.11.1/pod/perl5110delta.pod)
[44] Hietaniemi, Jarkko (1998). "Perl Ports (Binary Distributions)" (http://www.cpan.org/ports/). CPAN.org.
[45] "The MacPerl Pages" (http://www.macperl.com/). Prime Time Freeware. 1997.
[46] http://www.cpan.org/
[47] CPAN/ports (http://www.cpan.org/ports/)
[48] "Win32 Distributions" (http://win32.perl.org/wiki/index.php?title=Win32_Distributions#Perl_Distributions). Win32 Perl Wiki.
[49] Golden, David (2006). "Activestate and Scalar-List-Utils" (http://www.mail-archive.com/perl-qa@perl.org/msg05407.html).
[50] Kennedy, Adam (2007). "ActivePerl PPM repository design flaw goes critical" (http://use.perl.org/~Alias/journal/35219).
[51] Strawberry Perl website (http://strawberryperl.com/)
[52] win32.perl.org (http://win32.perl.org/)
[53] Vanilla Perl website (http://vanillaperl.com/)
[54] "perlrun manpage" (http://perldoc.perl.org/perlrun.html#DESCRIPTION).
[55] using switch (http://www.perlmonks.org/?node_id=496084)
[56] Damian Conway, Perl Best Practices (http://www.oreilly.com/catalog/perlbp/chapter/ch09.pdf), p. 182
[57] Microsoft Corp., ".NET Framework Regular Expressions", .NET Framework Developer's Guide (http://msdn2.microsoft.com/en-us/library/hs600312(VS.71).aspx)
[58] MooseX::Declare documentation (http://search.cpan.org/perldoc?MooseX::Declare)
[59] Bekman, Stas. "Efficient Work with Databases under mod_perl" (http://perl.apache.org/docs/1.0/guide/performance.html#Efficient_Work_with_Databases_under_mod_perl). Retrieved 2007-09-01.
[60] Boxplot Summary | The Computer Language Benchmarks Game (http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=all&d=data&v8=on&lua=on&python=on&php=on&perl=on&ruby=on&calc=calculate)
[61] Leroy, Jean-Louis (2005-12-01). "A Timely Start" (http://www.perl.com/pub/a/2005/12/21/a_timely_start.html). Perl.com.
[62] Beattie, Malcolm and Enache Adrian (2003). "B::Bytecode Perl compiler's bytecode backend" (http://search.cpan.org/~nwclark/perl-5.8.8/ext/B/B/Bytecode.pm#KNOWN_BUGS). search.cpan.org.
[63] http://search.cpan.org/perldoc/Inline/
[64] When perl is not quite fast enough (http://www.ccl4.org/~nick/P/Fast_Enough/)
[65] Transcription of Larry's talk (http://www.nntp.perl.org/group/perl.perl6.meta/424). Retrieved on 2006 September 28.
[66] Kuhn, Bradley (January 2001). Considerations on Porting Perl to the Java Virtual Machine (http://www.ebb.org/bkuhn/writings/technical/thesis/). University of Cincinnati. Retrieved 2008-06-28.
[67] Perl.org, download Perl distributions page (http://www.perl.org/get.html)
[68] Article about the 5.12.0 release on The H Open website (http://www.h-online.com/open/news/item/Perl-5-12-0-released-Update-976919.html)
[69] Document describing the differences between the 5.10.0 release and the 5.12.0 release (http://search.cpan.org/dist/perl-5.12.0/pod/perl5120delta.pod)
[70] Wall, Larry (1997-08-20). "Perl Culture (AKA the first State of the Onion)" (http://www.wall.org/~larry/keynote/keynote.html).
[71] Randal L. Schwartz (1999-05-02). "Who is Just another Perl hacker?". comp.lang.perl.misc. (Web link) (http://groups.google.com/groups?selm=m1hfpvh2jq.fsf@halfdome.holdit.com). Retrieved on 2007-11-12.
[72] The quest for the most diminutive munitions program (http://www.cypherspace.org/adam/rsa/story.html)
[73] "Code Golf: What is Code Golf?" (http://codegolf.com/). 29degrees. 2007.
[74] Perl Poetry section (http://www.perlmonks.org/?node_id=1590) on Perl Monks
[75] Conway, Damian. "Lingua::Romana::Perligata -- Perl for the XXI-imum Century" (http://www.csse.monash.edu.au/~damian/papers/HTML/Perligata.html).
org/ ~nick/ P/ Fast_Enough/ ) [65] Transcription of Larry's talk (http:/ / www. nntp. perl. org/ group/ perl. perl6. meta/ 424). Retrieved on 2006 September 28. [66] Kuhn, Bradley (January 2001). Considerations on Porting Perl to the Java Virtual Machine (http:/ / www. ebb. org/ bkuhn/ writings/ technical/ thesis/ ). University of Cincinnati. . Retrieved 2008-06-28. [67] (http:/ / www. perl. org/ get. html) Perl.org, download Perl distributions page [68] (http:/ / www. h-online. com/ open/ news/ item/ Perl-5-12-0-released-Update-976919. html) article about 5.12.0 release on the H open website [69] (http:/ / search. cpan. org/ dist/ perl-5. 12. 0/ pod/ perl5120delta. pod) Document describing the differences between the 5.10.0 release and the 5.12.0 release [70] Wall, Larry (1997-08-20). "Perl Culture (AKA the first State of the Onion)" (http:/ / www. wall. org/ ~larry/ keynote/ keynote. html). . [71] Randal L. Schwartz (1999-05-02). "[news:m1hfpvh2jq.fsf@halfdome.holdit.com Who is Just another Perl hacker?]". [news:comp.lang.perl.misc comp.lang.perl.misc]. (Web link) (http:/ / groups. google. com/ groups?selm=m1hfpvh2jq. fsf@halfdome. holdit. com). Retrieved on 2007-11-12. [72] The quest for the most diminutive munitions program (http:/ / www. cypherspace. org/ adam/ rsa/ story. html) [73] "Code Golf: What is Code Golf?" (http:/ / codegolf. com/ ). 29degrees. 2007. . [74] Perl Poetry section (http:/ / www. perlmonks. org/ ?node_id=1590) on Perl Monks [75] Conway, Damian. "Lingua::Romana::Perligata -- Perl for the XXI-imum Century" (http:/ / www. csse. monash. edu. au/ ~damian/ papers/ HTML/ Perligata. html). .


[76] Brocard, Leon (2001-05-23). "use Perl; Journal of acme" (http://use.perl.org/~acme/journal/200).
[77] http://www.ddj.com/web-development/tpj.jhtml
[78] http://hop.perl.plover.com/book/
[79] http://perldoc.perl.org/
[80] http://www.perlfoundation.org/
[81] http://www.perlfoundation.org/perl5/
[82] http://www.dmoz.org/Computers/Programming/Languages/Perl/
[83] http://ironman.enlightenedperl.org/


Fortran

The Fortran Automatic Coding System for the IBM 704 (October 15, 1956), the first Programmer's Reference Manual for Fortran

Usual file extensions: .f, .for, .f90, .f95
Paradigm: multi-paradigm: procedural, imperative, structured, object-oriented
Appeared in: 1957
Designed by: John Backus
Developer: John Backus & IBM
Stable release: Fortran 2003 (2003)
Typing discipline: strong, static
Major implementations: Absoft, Cray, GFortran, G95, IBM, Intel, Lahey/Fujitsu, Open Watcom, Pathscale, PGI, Silverfrost, Sun, XL Fortran, Visual Fortran, others
Influenced: ALGOL 58, BASIC, PL/I, C

Fortran (previously FORTRAN)[1] is a general-purpose,[2] procedural,[3] imperative programming language that is especially suited to numeric computation and scientific computing. Originally developed by IBM at their campus in south San Jose, California[4] in the 1950s for scientific and engineering applications, Fortran came to dominate this area of programming early on and has been in continual use for over half a century in computationally intensive areas such as numerical weather prediction, finite element analysis, computational fluid dynamics (CFD), computational physics, and computational chemistry. It is one of the most popular languages in the area of high-performance computing and is the language used for programs that benchmark and rank the world's fastest supercomputers.[5] Fortran (a blend derived from The IBM Mathematical Formula Translating System) encompasses a lineage of versions, each of which evolved to add extensions to the language while usually retaining compatibility with previous versions. Successive versions have added support for processing of character-based data (FORTRAN 77), array programming, modular programming and object-based programming (Fortran 90 / 95), and object-oriented and generic programming (Fortran 2003).



History

In late 1953, John W. Backus submitted a proposal to his superiors at IBM to develop a more practical alternative to assembly language for programming their IBM 704 mainframe computer. Backus' historic FORTRAN team consisted of programmers Richard Goldberg, Sheldon F. Best, Harlan Herrick, Peter Sheridan, Roy Nutt, Robert Nelson, Irving Ziller, Lois Haibt and David Sayre.[6]

A draft specification for The IBM Mathematical Formula Translating System was completed by mid-1954. The first manual for FORTRAN appeared in October 1956, with the first FORTRAN compiler delivered in April 1957. This was an optimizing compiler, because customers were reluctant to use a high-level programming language unless its compiler could generate code whose performance was comparable to that of hand-coded assembly language.

An IBM 704 mainframe

FORTRAN code on a punched card, showing the specialized uses of columns 1-5, 6 and 73-80.

While the community was skeptical that this new method could possibly out-perform hand-coding, it reduced the number of programming statements necessary to operate a machine by a factor of 20, and quickly gained acceptance. Said creator John Backus during a 1979 interview with Think, the IBM employee magazine, "Much of my work has come from being lazy. I didn't like writing programs, and so, when I was working on the IBM 701, writing programs for computing missile trajectories, I started work on a programming system to make it easier to write programs."

The language was widely adopted by scientists for writing numerically intensive programs, which encouraged compiler writers to produce compilers that could generate faster and more efficient code. The inclusion of a complex number data type in the language made Fortran especially suited to technical applications such as electrical engineering.

By 1960, versions of FORTRAN were available for the IBM 709, 650, 1620, and 7090 computers. Significantly, the increasing popularity of FORTRAN spurred competing computer manufacturers to provide FORTRAN compilers for their machines, so that by 1963 over 40 FORTRAN compilers existed. For these reasons, FORTRAN is considered to be the first widely used programming language supported across a variety of computer architectures.

The development of FORTRAN paralleled the early evolution of compiler technology; indeed many advances in the theory and design of compilers were specifically motivated by the need to generate efficient code for FORTRAN programs.



FORTRAN

The initial release of FORTRAN for the IBM 704 contained 32 statements, including:

• DIMENSION and EQUIVALENCE statements
• Assignment statements
• Three-way arithmetic IF statement[8]
• IF statements for checking exceptions (ACCUMULATOR OVERFLOW, QUOTIENT OVERFLOW, and DIVIDE CHECK); and IF statements for manipulating sense switches and sense lights
• GOTO, computed GOTO, ASSIGN, and assigned GOTO
• DO loops
• Formatted I/O: FORMAT, READ, READ INPUT TAPE, WRITE, WRITE OUTPUT TAPE, PRINT, and PUNCH
• Unformatted I/O: READ TAPE, READ DRUM, WRITE TAPE, and WRITE DRUM
• Other I/O: END FILE, REWIND, and BACKSPACE
• PAUSE, STOP, and CONTINUE
• FREQUENCY statement (for providing optimization hints to the compiler)[9]

IBM 1401 FORTRAN

FORTRAN was provided for the IBM 1401 by an innovative 63-pass compiler that ran in only 8k of core. It kept the program in memory and loaded overlays that gradually transformed it, in place, into executable form, as described by Haines et al.[10] The executable form was not machine language; rather, it was interpreted, anticipating UCSD Pascal P-code by two decades.

FORTRAN II

IBM's FORTRAN II appeared in 1958. The main enhancement was to support procedural programming by allowing user-written subroutines and functions. Six new statements were introduced:

• SUBROUTINE, FUNCTION, and END
• CALL and RETURN
• COMMON

Over the next few years, FORTRAN II would also add support for the DOUBLE PRECISION and COMPLEX data types.

Simple Fortran II program

This program, for Heron's formula, reads one data card containing three 5-digit integers A, B, and C as input. If A, B, and C cannot represent the sides of a triangle in plane geometry, then the program's execution will end with an error code of "STOP 1". Otherwise, an output line will be printed showing the input values for A, B, and C, followed by the computed AREA of the triangle as a floating-point number with 2 digits after the decimal point.

C     AREA OF A TRIANGLE WITH A STANDARD SQUARE ROOT FUNCTION
C     INPUT - CARD READER UNIT 5, INTEGER INPUT
C     OUTPUT - LINE PRINTER UNIT 6, REAL OUTPUT
C     INPUT ERROR DISPLAY ERROR OUTPUT CODE 1 IN JOB CONTROL LISTING
      READ INPUT TAPE 5, 501, IA, IB, IC
  501 FORMAT (3I5)
C     IA, IB, AND IC MAY NOT BE NEGATIVE
C     FURTHERMORE, THE SUM OF TWO SIDES OF A TRIANGLE
C     IS GREATER THAN THE THIRD SIDE, SO WE CHECK FOR THAT, TOO



      IF (IA) 777, 777, 701
  701 IF (IB) 777, 777, 702
  702 IF (IC) 777, 777, 703
  703 IF (IA+IB-IC) 777, 777, 704
  704 IF (IA+IC-IB) 777, 777, 705
  705 IF (IB+IC-IA) 777, 777, 799
  777 STOP 1
C     USING HERON'S FORMULA WE CALCULATE THE
C     AREA OF THE TRIANGLE
  799 S = FLOATF (IA + IB + IC) / 2.0
      AREA = SQRT( S * (S - FLOATF(IA)) * (S - FLOATF(IB)) *
     +     (S - FLOATF(IC)))
      WRITE OUTPUT TAPE 6, 601, IA, IB, IC, AREA
  601 FORMAT (4H A= ,I5,5H  B= ,I5,5H  C= ,I5,8H  AREA= ,F10.2,
     +        13H SQUARE UNITS)
      STOP
      END

FORTRAN III

IBM also developed a FORTRAN III in 1958 that allowed for inline assembler code among other features; however, this version was never released as a product. Like the 704 FORTRAN and FORTRAN II, FORTRAN III included machine-dependent features that made code written in it unportable from machine to machine. Early versions of FORTRAN provided by other vendors suffered from the same disadvantage.

FORTRAN IV

A FORTRAN coding form, formerly printed on paper and intended to be used by programmers to prepare programs for punching onto cards by keypunch operators. Now obsolete.

Starting in 1961, as a result of customer demands, IBM began development of a FORTRAN IV that removed the machine-dependent features of FORTRAN II (such as READ INPUT TAPE), while adding new features such as a LOGICAL data type, logical Boolean expressions and the logical IF statement as an alternative to the arithmetic IF statement. FORTRAN IV was eventually released in 1962, first for the IBM 7030 ("Stretch") computer, followed by versions for the IBM 7090 and IBM 7094. By 1965, FORTRAN IV was supposed to be the "standard", in compliance with the American Standards Association X3.4.3 FORTRAN Working Group.[11]



FORTRAN 66

Perhaps the most significant development in the early history of FORTRAN was the decision by the American Standards Association (now ANSI) to form a committee to develop an "American Standard Fortran." The resulting two standards, approved in March 1966, defined two languages, FORTRAN (based on FORTRAN IV, which had served as a de facto standard), and Basic FORTRAN (based on FORTRAN II, but stripped of its machine-dependent features). The FORTRAN defined by the first standard became known as FORTRAN 66 (although many continued to refer to it as FORTRAN IV, the language upon which the standard was largely based). FORTRAN 66 effectively became the first "industry-standard" version of FORTRAN. FORTRAN 66 included:

• Main program, SUBROUTINE, FUNCTION, and BLOCK DATA program units
• INTEGER, REAL, DOUBLE PRECISION, COMPLEX, and LOGICAL data types
• COMMON, DIMENSION, and EQUIVALENCE statements
• DATA statement for specifying initial values
• Intrinsic and EXTERNAL (e.g., library) functions
• Assignment statement
• GOTO, assigned GOTO, and computed GOTO statements
• Logical IF and arithmetic (three-way) IF statements
• DO loops
• READ, WRITE, BACKSPACE, REWIND, and ENDFILE statements for sequential I/O
• FORMAT statement
• CALL, RETURN, PAUSE, and STOP statements
• Hollerith constants in DATA and FORMAT statements, and as actual arguments to procedures
• Identifiers of up to six characters in length
• Comment lines

FORTRAN 77

After the release of the FORTRAN 66 standard, compiler vendors introduced a number of extensions to "Standard Fortran", prompting ANSI in 1969 to begin work on revising the 1966 standard. Final drafts of this revised standard circulated in 1977, leading to formal approval of the new FORTRAN standard in April 1978. The new standard, known as FORTRAN 77, added a number of significant features to address many of the shortcomings of FORTRAN 66:

• Block IF and END IF statements, with optional ELSE and ELSE IF clauses, to provide improved language support for structured programming
• DO loop extensions, including parameter expressions, negative increments, and zero trip counts
• OPEN, CLOSE, and INQUIRE statements for improved I/O capability
• Direct-access file I/O
• IMPLICIT statement
• CHARACTER data type, with vastly expanded facilities for character input and output and processing of character-based data
• PARAMETER statement for specifying constants
• SAVE statement for persistent local variables
• Generic names for intrinsic functions
• A set of intrinsics (LGE, LGT, LLE, LLT) for lexical comparison of strings, based upon the ASCII collating sequence. (ASCII functions were demanded by the U. S. Department of Defense, in their conditional approval vote.)

In this revision of the standard, a number of features were removed or altered in a manner that might invalidate previously standard-conforming programs. (Removal was the only allowable alternative to X3J3 at that time, since



the concept of "deprecation" was not yet available for ANSI standards.) While most of the 24 items in the conflict list (see Appendix A2 of X3.9-1978) addressed loopholes or pathological cases permitted by the previous standard but rarely used, a small number of specific capabilities were deliberately removed, such as: • Hollerith constants and Hollerith data, such as: GREET = 12HHELLO THERE! • Reading into a H edit (Hollerith field) descriptor in a FORMAT specification. • Overindexing of array bounds by subscripts. DIMENSION A(10,5) Y= A(11,1) • Transfer of control into the range of a DO loop (also known as "Extended Range"). An important practical extension to FORTRAN 77 was the release of MIL-STD-1753 in 1978. This specification, developed by the U. S. Department of Defense, standardized a number of features implemented by most FORTRAN 77 compilers but not included in the ANSI FORTRAN 77 standard. These features would eventually be incorporated into the Fortran 90 standard. • DO WHILE and END DO statements • INCLUDE statement • IMPLICIT NONE variant of the IMPLICIT statement • Bit manipulation intrinsic functions, based on similar functions included in Industrial Real-Time Fortran (ANSI/ISA S61.1 (1976)) The IEEE 1003.9 POSIX Standard, released in 1991, provided a simple means for Fortran-77 programmers to issue POSIX system calls. Over 100 calls were defined in the document — allowing access to POSIX-compatible process control, signal handling, file system control, device control, procedure pointing, and stream I/O in a portable manner. The development of a revised standard to succeed FORTRAN 77 would be repeatedly delayed as the standardization process struggled to keep up with rapid changes in computing and programming practice. In the meantime, as the "Standard FORTRAN" for nearly fifteen years, FORTRAN 77 would become the historically most important dialect. 
Control Data Corporation computers had another version of FORTRAN 77, called Minnesota FORTRAN, with variations in output constructs, special uses of COMMON and DATA statements, optimization levels for compilation, detailed error listings, extensive warning messages, and debugging aids.[12]

Fortran 90

The much delayed successor to FORTRAN 77, informally known as Fortran 90 (and prior to that, Fortran 8X), was finally released as an ISO standard in 1991 and an ANSI standard in 1992. This major revision added many new features to reflect the significant changes in programming practice that had evolved since the 1978 standard:
• Free-form source input, also with lowercase Fortran keywords
• Identifiers up to 31 characters in length
• Inline comments
• Ability to operate on arrays (or array sections) as a whole, thus greatly simplifying math and engineering computations:
  • whole, partial and masked array assignment statements and array expressions, such as X(1:N) = R(1:N)*COS(A(1:N))
  • WHERE statement for selective array assignment
  • array-valued constants and expressions
  • user-defined array-valued functions and array constructors
• RECURSIVE procedures
• Modules, to group related procedures and data together, and make them available to other program units, including the capability to limit the accessibility to only specific parts of the module
• A vastly improved argument-passing mechanism, allowing interfaces to be checked at compile time
• User-written interfaces for generic procedures
• Operator overloading
• Derived/abstract data types
• New data type declaration syntax, to specify the data type and other attributes of variables
• Dynamic memory allocation by means of the ALLOCATABLE attribute and the ALLOCATE and DEALLOCATE statements
• POINTER attribute, pointer assignment, and NULLIFY statement to facilitate the creation and manipulation of dynamic data structures
• Structured looping constructs, with an END DO statement for loop termination, and EXIT and CYCLE statements for "breaking out" of normal DO loop iterations in an orderly way
• SELECT . . . CASE construct for multi-way selection
• Portable specification of numerical precision under the user's control
• New and enhanced intrinsic procedures

Obsolescence and deletions

Unlike the previous revision, Fortran 90 did not delete any features. (Appendix B.1 says, "The list of deleted features in this standard is empty.") Any standard-conforming FORTRAN 77 program is also standard-conforming under Fortran 90, and either standard should be usable to define its behavior. A small set of features were identified as "obsolescent" and expected to be removed in a future standard. The obsolescent features, with their status in Fortran 95, were:
• Arithmetic IF statement (obsolescent), e.g. IF (X) 10, 20, 30
• Non-integer DO parameters or control variables (deleted in Fortran 95), e.g. DO 9 X = 1.7, 1.6, -0.1
• Shared DO-loop termination, or termination with a statement other than END DO or CONTINUE (obsolescent), e.g.:
      DO 9 J = 1, 10
      DO 9 K = 1, 10
    9 L = J + K
• Branching to END IF from outside a block (deleted in Fortran 95), e.g.:
   66 GO TO 77
      . . .
      IF (E) THEN
      . . .
   77 END IF
• Alternate return (obsolescent), e.g. CALL SUBR( X, Y, *100, *200 )
• PAUSE statement (deleted in Fortran 95), e.g. PAUSE 600
• ASSIGN statement and assigned GO TO statement (deleted in Fortran 95), e.g.:
  100 . . .
      ASSIGN 100 TO H
      . . .
      GO TO H
• Assigned FORMAT specifiers (deleted in Fortran 95), e.g. ASSIGN 606 TO F
• H edit descriptors (deleted in Fortran 95), e.g. 606 FORMAT ( 9H1GOODBYE. )
• Computed GO TO statement (obsolescent), e.g. GO TO (10, 20, 30, 40), index
• Statement functions (obsolescent), e.g. FOIL( X, Y ) = X**2 + 2*X*Y + Y**2
• DATA statements among executable statements (obsolescent), e.g.:
      X = 27.3
      DATA A, B, C / 5.0, 12.0, 13.0 /
• CHARACTER* form of CHARACTER declaration (obsolescent), e.g. CHARACTER*8 STRING  ! Use CHARACTER(8)
• Assumed character length functions (obsolescent)
• Fixed form source code (obsolescent): column 1 contains C, *, or ! for comments; column 6 marks a continuation line
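The free-form sketch below (hypothetical code, not taken from the standard document) illustrates two of the Fortran 90 additions listed above: a module packaging a derived type together with a procedure, whose interface is then checked at compile time by any program unit that USEs the module.

```fortran
! Fortran 90 sketch: a module, a derived type, and an explicit
! interface checked at compile time via USE association.
module points_mod
  implicit none
  type :: point
     real :: x, y
  end type point
contains
  function dist(a, b) result(d)
    type(point), intent(in) :: a, b
    real :: d
    d = sqrt((a%x - b%x)**2 + (a%y - b%y)**2)
  end function dist
end module points_mod

program use_points
  use points_mod
  implicit none
  type(point) :: p, q
  p = point(0.0, 0.0)   ! structure constructor
  q = point(3.0, 4.0)
  print *, dist(p, q)   ! the 3-4-5 triangle: distance is 5.0
end program use_points
```

Under FORTRAN 77, the same code would have required COMMON blocks or long argument lists, with no compile-time checking of the call to dist.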

Fortran 95

Fortran 95 was a minor revision, mostly to resolve some outstanding issues from the Fortran 90 standard. Nevertheless, Fortran 95 also added a number of extensions, notably from the High Performance Fortran specification:
• FORALL and nested WHERE constructs to aid vectorization
• User-defined PURE and ELEMENTAL procedures
• Default initialization of derived type components, including pointer initialization
• Expanded ability to use initialization expressions for data objects
• A clear definition that ALLOCATABLE arrays are automatically deallocated when they go out of scope
A number of intrinsic functions were extended (for example, a dim argument was added to the maxloc intrinsic).
Several features noted as obsolescent in Fortran 90 were removed from Fortran 95:
• DO statements using REAL and DOUBLE PRECISION variables
• Branching to an END IF statement from outside its block
• PAUSE statement
• ASSIGN and assigned GO TO statements, and assigned format specifiers
• H edit descriptor
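A short sketch (illustrative only, not drawn from the standard) of two of the Fortran 95 additions listed above: an ELEMENTAL procedure applied element-wise to a whole array, and a FORALL construct for assignments that a compiler can vectorize.

```fortran
! Fortran 95 sketch: FORALL assignment and an ELEMENTAL function.
program f95_demo
  implicit none
  integer :: i
  real :: a(5), b(5)
  forall (i = 1:5) a(i) = real(i)   ! a = 1, 2, 3, 4, 5
  b = square(a)                     ! elemental: applied element-wise
  print *, b                        ! squares: 1, 4, 9, 16, 25
contains
  elemental real function square(x)
    real, intent(in) :: x
    square = x * x
  end function square
end program f95_demo
```

Because an ELEMENTAL procedure is implicitly PURE (free of side effects), the compiler is free to evaluate the element-wise applications in any order, or in parallel.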

An important supplement to Fortran 95 was the ISO technical report TR-15581: Enhanced Data Type Facilities, informally known as the Allocatable TR. This specification defined enhanced use of ALLOCATABLE arrays, prior to the availability of fully Fortran 2003-compliant Fortran compilers. Such uses include ALLOCATABLE arrays as derived type components, in procedure dummy argument lists, and as function return values. (ALLOCATABLE arrays are preferable to POINTER-based arrays because ALLOCATABLE arrays are guaranteed by Fortran 95 to be deallocated automatically when they go out of scope, eliminating the possibility of memory leakage. In addition, aliasing is not an issue for optimization of array references, allowing compilers to generate faster code than in the case of pointers.)

Another important supplement to Fortran 95 was the ISO technical report TR-15580: Floating-point exception handling, informally known as the IEEE TR. This specification defined support for IEEE floating-point arithmetic and floating point exception handling.

Conditional compilation and varying length strings

In addition to the mandatory "Base language" (defined in ISO/IEC 1539-1:1997), the Fortran 95 language also includes two optional modules:
• Varying character strings (ISO/IEC 1539-2:2000)
• Conditional compilation (ISO/IEC 1539-3:1998)
which, together, comprise the multi-part International Standard (ISO/IEC 1539). According to the standards developers, "the optional parts describe self-contained features which have been requested by a substantial body of users and/or implementors, but which are not deemed to be of sufficient generality for them to be required in all standard-conforming Fortran compilers." Nevertheless, if a standard-conforming Fortran does provide such options, then they "must be provided in accordance with the description of those facilities in the appropriate Part of the Standard."



Fortran 2003

The most recent standard, Fortran 2003, is a major revision introducing many new features. A comprehensive summary of the new features of Fortran 2003 is available at the Fortran Working Group (WG5) official Web site.[13] From that article, the major enhancements for this revision include:
• Derived type enhancements: parameterized derived types, improved control of accessibility, improved structure constructors, and finalizers.
• Object-oriented programming support: type extension and inheritance, polymorphism, dynamic type allocation, and type-bound procedures.
• Data manipulation enhancements: allocatable components (incorporating TR 15581), deferred type parameters, VOLATILE attribute, explicit type specification in array constructors and allocate statements, pointer enhancements, extended initialization expressions, and enhanced intrinsic procedures.
• Input/output enhancements: asynchronous transfer, stream access, user specified transfer operations for derived types, user specified control of rounding during format conversions, named constants for preconnected units, the FLUSH statement, regularization of keywords, and access to error messages.
• Procedure pointers.
• Support for IEEE floating-point arithmetic and floating point exception handling (incorporating TR 15580).
• Interoperability with the C programming language.
• Support for international usage: access to ISO 10646 4-byte characters and choice of decimal or comma in numeric formatted input/output.
• Enhanced integration with the host operating system: access to command line arguments, environment variables, and processor error messages.
An important supplement to Fortran 2003 was the ISO technical report TR-19767: Enhanced module facilities in Fortran. This report provided submodules, which make Fortran modules more similar to Modula-2 modules. They are similar to Ada private child subunits.
This allows the specification and implementation of a module to be expressed in separate program units, which improves packaging of large libraries, allows preservation of trade secrets while publishing definitive interfaces, and prevents compilation cascades.
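A brief sketch (illustrative, not taken from the standard document) of the object-oriented support listed above: type extension, a type-bound procedure, and a polymorphic CLASS dummy argument dispatched with SELECT TYPE.

```fortran
! Fortran 2003 sketch: type extension and type-bound procedures.
module shapes
  implicit none
  type :: shape
     real :: area = 0.0
  contains
     procedure :: describe    ! type-bound procedure
  end type shape
  type, extends(shape) :: circle   ! inheritance via type extension
     real :: radius = 1.0
  end type circle
contains
  subroutine describe(self)
    class(shape), intent(in) :: self   ! polymorphic dummy argument
    select type (self)
    type is (circle)
       print *, 'circle, radius', self%radius
    class default
       print *, 'generic shape'
    end select
  end subroutine describe
end module shapes

program shapes_demo
  use shapes
  implicit none
  type(circle) :: c
  c%radius = 2.0
  call c%describe()   ! dispatches through the type-bound procedure
end program shapes_demo
```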

Fortran 2008

Efforts are underway to develop a revision to Fortran 2003, tentatively called Fortran 2008. As with Fortran 95, this is intended to be a minor upgrade, incorporating clarifications and corrections to Fortran 2003 and incorporating submodules from TR-19767 into the base language, as well as introducing a select few new capabilities. As of February 2007, the proposed new capabilities included:[14]
• Co-array Fortran, a parallel processing model
• BIT data type
In August 2007, the BIT data type was removed. In February 2008, coarrays were scaled back: parallel I/O and teams were removed. A report as of June 2008 is available from the ISO/IEC JTC1/SC22/WG5 committee archive [15]. The complete original work plan is available at http://j3-fortran.org/doc/year/07/07-010.html. Information on Fortran standardization in general and progress on Fortran 2008 is at http://j3-fortran.org.
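A rough sketch of the proposed coarray model (syntax as in the Co-array Fortran proposal; the feature was still in flux at the time of writing, so this is illustrative only). Each "image" is a replicated copy of the program, and square brackets index data held by other images.

```fortran
! Coarray sketch (proposed Fortran 2008 syntax; illustrative).
program coarray_demo
  implicit none
  integer :: x[*]          ! one copy of x on every image
  x = this_image()         ! each image stores its own image index
  sync all                 ! barrier before any remote reads
  if (this_image() == 1) then
     print *, 'image 1 sees x on the last image =', x[num_images()]
  end if
end program coarray_demo
```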


Legacy Since Fortran has been in use for more than fifty years, there is a vast body of Fortran in daily use throughout the scientific and engineering communities. It is the primary language for some of the most intensive supercomputing tasks, such as weather and climate modeling, computational fluid dynamics, computational chemistry, computational economics, plant breeding and computational physics. Even today, half a century later, many of the floating-point benchmarks to gauge the performance of new computer processors are still written in Fortran (e.g., CFP2006 [16], the floating-point component of the SPEC CPU2006 [17] benchmarks).

Portability

Portability was a problem in the early days because there was no agreed standard (not even IBM's reference manual), and computer companies vied to differentiate their offerings from others by providing incompatible features. Standards have improved portability. The 1966 standard provided a reference syntax and semantics, but vendors continued to provide incompatible extensions. Although careful programmers were coming to realize that use of incompatible extensions caused expensive portability problems, and were therefore using programs such as The PFORT Verifier, it was not until after the 1977 standard, when the National Bureau of Standards (now NIST) published FIPS PUB 69, that processors purchased by the U.S. Government were required to diagnose extensions of the standard. Rather than offer two processors, essentially every compiler eventually had at least an option to diagnose extensions.

Incompatible extensions were not the only portability problem. For numerical calculations, it is important to take account of the characteristics of the arithmetic. This was addressed by Fox et al. in the context of the 1966 standard by the PORT library. The ideas therein became widely used, and were eventually incorporated into the 1990 standard by way of intrinsic inquiry functions. The widespread (now almost universal) adoption of the IEEE 754 standard for binary floating-point arithmetic has essentially removed this problem.

Access to the computing environment (e.g. the program's command line, environment variables, textual explanation of error conditions) remained a problem until it was addressed by the 2003 standard.

Large collections of "library" software that could be described as being loosely related to engineering and scientific calculations, such as graphics libraries, have been written in C, and therefore access to them presented a portability problem.
This has been addressed by incorporation of C interoperability into the 2003 standard. It is now possible (and relatively easy) to write an entirely portable program in Fortran, even without recourse to a preprocessor.

Variants

Fortran 5

Fortran 5 was a programming language marketed by Data General Corp in the late 1970s and early 1980s, for the Nova, Eclipse, and MV lines of computers. It had an optimizing compiler that was quite good for minicomputers of its time. The language most closely resembles Fortran 66. The name is a pun on the earlier Fortran IV. Univac also offered a compiler for the 1100 series known as Fortran V. A spinoff of Univac Fortran V was Athena Fortran.


Fortran VI Fortran VI was a programming language distributed by Control Data Corporation in 1968 for the CDC 6600 series. The language was based upon Fortran IV.[18]

Specific variants

Vendors of high-performance scientific computers (e.g., Burroughs, CDC, Cray, Honeywell, IBM, Texas Instruments, and UNIVAC) added extensions to Fortran to take advantage of special hardware features such as instruction cache, CPU pipelines, and vector arrays. For example, one of IBM's FORTRAN compilers (H Extended IUP) had a level of optimization which reordered the machine code instructions to keep multiple internal arithmetic units busy simultaneously. Another example is CFD, a special variant of Fortran designed specifically for the ILLIAC IV supercomputer, running at NASA's Ames Research Center. IBM Research Labs also developed an extended FORTRAN-based language called "VECTRAN" for processing of vectors and matrices.

Object-Oriented Fortran was an object-oriented extension of Fortran, in which data items can be grouped into objects, which can be instantiated and executed in parallel. It was available for Sun, Iris, iPSC, and nCUBE, but is no longer supported.

Such machine-specific extensions have either disappeared over time or have had elements incorporated into the main standards; the major remaining extension is OpenMP, which is a cross-platform extension for shared memory programming. One new extension, Co-array Fortran, is intended to support parallel programming.

FOR TRANSIT for the IBM 650

"FOR TRANSIT" was the name of a reduced version of the IBM 704 FORTRAN language, which was implemented for the IBM 650, using a translator program developed at Carnegie [19] in the late 1950s. The following comment appears in the IBM Reference Manual ("FOR TRANSIT Automatic Coding System" C28-4038, Copyright 1957, 1959 by IBM):
The FORTRAN system was designed for a more complex machine than the 650, and consequently some of the 32 statements found in the FORTRAN Programmer's Reference Manual are not acceptable to the FOR TRANSIT system. In addition, certain restrictions to the FORTRAN language have been added.
However, none of these restrictions make a source program written for FOR TRANSIT incompatible with the FORTRAN system for the 704.
The permissible statements were:
• Arithmetic assignment statements, e.g. a = b
• GO TO n
• GO TO (n1, n2, ..., nm), i
• IF (a) n1, n2, n3
• PAUSE
• STOP
• DO n i = m1, m2
• CONTINUE
• END
• READ n, list
• PUNCH n, list
• DIMENSION V, V, V, ...
• EQUIVALENCE (a,b,c), (d,c), ...



Up to ten subroutines could be used in one program. FOR TRANSIT statements were limited to columns 7 through 56 only. Punched cards were used for input and output on the IBM 650. Three passes were required: to translate source code to the "IT" language, then to compile the IT statements into SOAP assembly language, and finally to produce the object program, which could then be loaded into the machine to run the program (using punched cards for data input, and outputting results onto punched cards). Two versions existed for the 650s with a 2000 word memory drum: FOR TRANSIT I (S) and FOR TRANSIT II, the latter for machines equipped with indexing registers and automatic floating point decimal (bi-quinary) arithmetic. Appendix A of the manual included wiring diagrams for the IBM 533 control panel.

Fortran-based languages

Prior to FORTRAN 77, a number of preprocessors were commonly used to provide a friendlier language, with the advantage that the preprocessed code could be compiled on any machine with a standard FORTRAN compiler. Popular preprocessors included FLECS, MORTRAN, SFtran, S-Fortran, Ratfor, and Ratfiv. (Ratfor and Ratfiv, for example, implemented a remarkably C-like language, outputting preprocessed code in standard FORTRAN 66.[20])

LRLTRAN was developed at the Lawrence Radiation Laboratory to provide support for vector arithmetic and dynamic storage, among other extensions to support systems programming. The distribution included the LTSS operating system.

The Fortran 95 standard includes an optional Part 3 which defines an optional conditional compilation capability. This capability is often referred to as "CoCo". Many Fortran compilers have integrated subsets of the C preprocessor into their systems.

SIMSCRIPT is an application-specific Fortran preprocessor for modeling and simulating large discrete systems.

F (programming language) was designed to be a clean subset of Fortran 95 that attempted to remove the redundant, unstructured, and deprecated features of Fortran, such as the EQUIVALENCE statement. F retains the array features added in Fortran 90, and removes control statements that were obsoleted by structured programming constructs added to both Fortran 77 and Fortran 90. F is described by its creators as "a compiled, structured, array programming language especially well suited to education and scientific computing." "F Programming Language Homepage" [21].

Code examples

The following program illustrates dynamic memory allocation and array-based operations, two features introduced with Fortran 90. Particularly noteworthy is the absence of DO loops and IF/THEN statements in manipulating the array; mathematical operations are applied to the array as a whole. Also apparent is the use of descriptive variable names and general code formatting that conform with contemporary programming style. This example computes an average over data entered interactively.

program average

  ! Read in some numbers and take the average
  ! As written, if there are no data points, an average of zero is returned
  ! While this may not be desired behavior, it keeps this example simple

  implicit none

  real, dimension(:), allocatable :: points
  integer :: number_of_points
  real :: average_points=0., positive_average=0., negative_average=0.

  write (*,*) "Input number of points to average:"
  read  (*,*) number_of_points

  allocate (points(number_of_points))

  write (*,*) "Enter the points to average:"
  read  (*,*) points

  ! Take the average by summing points and dividing by number_of_points
  if (number_of_points > 0) average_points = sum(points) / number_of_points

  ! Now form average over positive and negative points only
  if (count(points > 0.) > 0) then
     positive_average = sum(points, points > 0.) / count(points > 0.)
  end if
  if (count(points < 0.) > 0) then
     negative_average = sum(points, points < 0.) / count(points < 0.)
  end if

  deallocate (points)

  ! Print result to terminal
  write (*,'(a,g12.4)') 'Average = ', average_points
  write (*,'(a,g12.4)') 'Average of positive points = ', positive_average
  write (*,'(a,g12.4)') 'Average of negative points = ', negative_average

end program average

The sample program can be compiled and run with any standard Fortran compiler.

FORTRAN quotations For a programming language with a half-century legacy, FORTRAN has accumulated its share of jokes and folklore.

Letter O considered harmful During the same Fortran Standards Committee meeting at which the name "FORTRAN 77" was chosen, a technical proposal was incorporated into the official distribution, bearing the title, "Letter O considered harmful". This proposal purported to address the confusion that sometimes arises between the letter "O" and the numeral zero, by eliminating the letter from allowable variable names. However, the method proposed was to eliminate the letter from the character set entirely (thereby retaining 48 as the number of lexical characters, which the colon had increased to 49).


This was considered beneficial in that it would promote structured programming, by making it impossible to use the notorious GO TO statement as before. (Troublesome FORMAT statements would also be eliminated.) However, it was also considered that this "might invalidate some existing programs" but that most of these "probably were non-conforming, anyway".[22] [23]

See also
• History of compiler writing
• List of programming languages

References

Textbooks
• Akin, Ed (2003). Object Oriented Programming via Fortran 90/95 (1st ed.). Cambridge University Press. ISBN 0-521-52408-3.
• Etter, D. M. (1990). Structured FORTRAN 77 for Engineers and Scientists (3rd ed.). The Benjamin/Cummings Publishing Company, Inc. ISBN 0-8053-0051-1.
• Chapman, Stephen J. (2007). Fortran 95/2003 for Scientists and Engineers (3rd ed.). McGraw-Hill. ISBN 978-0-07-319157-7.
• Chapman, Stephen J. (2003). Fortran 90/95 for Scientists and Engineers (2nd ed.). McGraw-Hill. ISBN 0-07-282575-8.
• Chivers, Ian; Jane Sleightholme (2006). Introduction to Programming with Fortran (1st ed.). Springer. ISBN 1-84628-053-2.
• Ellis, T. M. R.; Ivor R. Phillips, Thomas M. Lahey (1994). Fortran 90 Programming (1st ed.). Addison Wesley. ISBN 0-201-54446-6.
• Kupferschmid, Michael (2002). Classical Fortran: Programming for Engineering and Scientific Applications. Marcel Dekker (CRC Press). ISBN 0-8247-0802-4.
• McCracken, Daniel D. (1961). A Guide to Fortran Programming. Wiley.
• McCracken, Daniel D. (1965). A Guide to Fortran IV Programming. Wiley.
• Metcalf, Michael; John Reid, Malcolm Cohen (2004). Fortran 95/2003 Explained. Oxford University Press. ISBN 0-19-852693-8.
• Nyhoff, Larry; Sanford Leestma (1995). FORTRAN 77 for Engineers and Scientists with an Introduction to Fortran 90 (4th ed.). Prentice Hall. ISBN 0-13-363003-X.
• da Cunha, Rudnei Dias (2005). Introdução à Linguagem de Programação Fortran 90. Editora da Universidade Federal do Rio Grande do Sul. ISBN 85-7025-829-1.
• Martínez Baena, Javier; Ignario Requena Ramos, Nicolás Marín Ruiz (2006). Programación estructurada con Fortran 90/95. Universidad de Granada. ISBN 84-338-3923-3.


"Core" language standards
• ANSI X3.9-1966. USA Standard FORTRAN. American National Standards Institute. Informally known as FORTRAN 66.
• ANSI X3.9-1978. American National Standard – Programming Language FORTRAN. American National Standards Institute. Also known as ISO 1539-1980, informally known as FORTRAN 77.
• ANSI X3.198-1992 (R1997). American National Standard – Programming Language Fortran Extended. American National Standards Institute. Informally known as Fortran 90.
• ISO/IEC 1539-1:1997. Information technology – Programming languages – Fortran – Part 1: Base language. Informally known as Fortran 95. There are a further two parts to this standard. Part 1 has been formally adopted by ANSI.
• ISO/IEC 1539-1:2004. Information technology – Programming languages – Fortran – Part 1: Base language. Informally known as Fortran 2003.

Related standards
• Wilfried Kneis (October 1981). "Draft standard Industrial Real-Time FORTRAN". ACM SIGPLAN Notices (ACM Press) 16 (7): 45–60. doi:10.1145/947864.947868.
• MIL-STD-1753. DoD Supplement to X3.9-1978. U. S. Government Printing Office.
• POSIX 1003.9-1992. POSIX FORTRAN 77 Language Interface – Part 1: Binding for System Application Program Interface API [24]. The Institute of Electrical and Electronics Engineers, Inc.
• ISO 8651-1:1988. Information processing systems – Computer graphics – Graphical Kernel System (GKS) language bindings – Part 1: FORTRAN.

Further reading
• Roberts, Mark L.; Griffiths, Peter D., "Design Considerations for IBM Personal Computer Professional FORTRAN, an Optimizing Compiler" [25], IBM Systems Journal 24(1): 49-60 (1985)

External links
• "The FORTRAN Automatic Coding System" [26] (1.39 MB), a 1957 copy describing the design and implementation of the first FORTRAN compiler by the IBM team
• Early Fortran manuals [27] and The very first Fortran manual, by John Backus [28] (6.11 MB), dated 1956-10-15
• History of FORTRAN [29] and Systems Manual for 704/709 FORTRAN [30] (13.5 MB)
• FORTRAN at HOPL site [31]
• "The IBM CE Manual for FORTRAN I, II, and 709" [32] from 1959 (3.82 MB)
• "A History of Language Processor Technology in IBM" [33] (1.45 MB), by F.E. Allen, IBM Journal of Research and Development, v.25, no.5, September 1981
• Bemer, Bob, "Who Was Who in IBM's Programming Research? -- Early FORTRAN Days" [34], January 1957, Computer History Vignettes
• Comprehensive Fortran Standards Documents [35] by GFortran
• JTC1/SC22/WG5 [36], the ISO/IEC Fortran Working Group
• ANSI(R) X3.9-1978 [37] Fortran 77 Standard
• MIL-STD 1753 [38] DoD Extensions to Fortran 77
• ISO/IEC 1539:1991 [39] Fortran 90 Standard
• final draft [40] Fortran 95 Standard
• WG5 (2003) ISO/IEC JTC1/SC22/WG5 N1578 [41] Final Committee Draft of Fortran 2003 standard
• The Professional Programmer's Guide to FORTRAN 77 [42]

• Fortran 77, 90, 95, 2003 Information & Resources [43]
• Fortran 77 [44], FORTRAN 77 documentation
• Fortran 77 4.0 Reference Manual [45] (851 KB)
• Fortran 90 Reference Card [46]
• Ian D. Chivers, Jane Sleightholme, Compiler support for the Fortran 2003 standard, ACM SIGPLAN Fortran Forum 28, 26 (2009). [47] (PDF [48])

References [1] The names of earlier versions of the language through FORTRAN 77 were conventionally spelled in all-caps (FORTRAN 77 was the version in which the use of lowercase letters in keywords was strictly nonstandard). The capitalization has been dropped in referring to newer versions beginning with Fortran 90. The official language standards now refer to the language as "Fortran." Because the capitalisation (or lack thereof) of the word FORTRAN was never 100% consistent in actual usage, and because many hold impassioned beliefs on the issue, this article, rather than attempt to be normative, adopts the convention of using the all-caps FORTRAN in referring to versions of FORTRAN through FORTRAN 77 and the title-caps Fortran in referring to versions of Fortran from Fortran 90 onward. This convention is reflected in the capitalization of FORTRAN in the ANSI X3.9-1966 (FORTRAN 66) and ANSI X3.9-1978 (FORTRAN 77) standards and the title caps Fortran in the ANSI X3.198-1992 (Fortran 90), ISO/IEC 1539-1:1997 (Fortran 95) and ISO/IEC 1539-1:2004 (Fortran 2003) standards. [2] Since FORTRAN 77, which introduced the CHARACTER data type. [3] Since FORTRAN II (1958). [4] "Math 169 Notes - Santa Clara University" (http:/ / math. scu. edu/ ~dsmolars/ ma169/ notesfortran. html). . [5] Top500.org (http:/ / www. top500. org/ project/ linpack) [6] Softwarepreservation.org (http:/ / www. softwarepreservation. org/ projects/ FORTRAN/ index. html#By_FORTRAN_project_members) [7] Fortran creator John Backus dies - Gadgets - MSNBC.com (http:/ / www. msnbc. msn. com/ id/ 17704662/ ), MSN.com [8] Note: It is commonly believed that this statement corresponded to a three-way branch instruction on the IBM 704. This is not true, the 704 branch instructions all contained only one destination address (e.g., TZE — Transfer AC Zero, TNZ — Transfer AC Not Zero, TPL — Transfer AC Plus, TMI — Transfer AC Minus). 
The machine (and its successors in the 700/7000 series) did have a three-way skip instruction (CAS — Compare AC with Storage), which was probably the origin of this belief, but using this instruction to implement the IF would consume 4 instruction words, require the constant Zero in a word of storage, and take 3 machine cycles to execute; using the Transfer instructions to implement the IF could be done in 1 to 3 instruction words, required no constants in storage, and take 1 to 3 machine cycles to execute. An optimizing compiler like FORTRAN would most likely select the more compact and usually faster Transfers instead of the Compare (use of Transfers also allowed the FREQUENCY statement to optimize IFs, which could not be done using the Compare). Also the Compare considered −0 and +0 to be different values while the Transfer Zero and Transfer Not Zero considered them to be the same. [9] The FREQUENCY statement in FORTRAN was used originally and optionally to give branch probabilities for the three branch cases of the Arithmetic IF statement to bias the way code was generated and order of the basic blocks of code generated, in the global optimisation sense, were arranged in memory for optimality. The first FORTRAN compiler used this weighting to do a Monte Carlo simulation of the run-time generated code at compile time. It was very sophisticated for its time. This technique is documented in the original article in 1957 on the first FORTRAN compiler implementation by J. Backus, et al. Many years later, the FREQUENCY statement had no effect on the code, and was treated as a comment statement, since the compilers no longer did this kind of compile-time simulation.

Below is a part of the 1957 paper, "The FORTRAN Automatic Coding System" by Backus, et al., with this snippet on the FREQUENCY statement and its use in a compile-time Monte Carlo simulation of the run-time to optimise the code generated. Quoting … The fundamental unit of program is the basic block; a basic block is a stretch of program which has a single entry point and a single exit point. The purpose of section 4 is to prepare for section 5 a table of predecessors (PRED table) which enumerates the basic blocks and lists for every basic block each of the basic blocks which can be its immediate predecessor in flow, together with the absolute frequency of each such basic block link. This table is obtained by an actual "execution" of the program in Monte-Carlo fashion, in which the outcome of conditional transfers arising out of IF-type statements and computed GO TO'S is determined by a random number generator suitably weighted according to whatever FREQUENCY statements have been provided. [10] Haines, L. H. (1965). "Serial compilation and the 1401 FORTRAN compiler" (http:/ / domino. research. IBM. com/ tchjr/ journalindex. nsf/ 495f80c9d0f539778525681e00724804/ cde711e5ad6786e485256bfa00685a03?OpenDocument). IBM Systems Journal 4 (1): 73–80. . This article was reprinted, edited, in both editions of Lee, John A. N. (1967(1st), 1974(2nd)). Anatomy of a Compiler. Van Nostrand Reinhold. [11] McCracken, Daniel D. (1965(3rd printing 1968)). A Guide to FORTRAN IV Programming. John Wiley & Sons, Inc., New York. LCCCN 65-26848. Preface p. v



[12] Chilton Computing with FORTRAN (http://www.chilton-computing.org.uk/acd/literature/reports/p008.htm), Chilton-computing.org.uk
[13] Fortran Working Group (WG5) (http://www.nag.co.uk/sc22wg5/). It may also be downloaded as a PDF file (ftp://ftp.nag.co.uk/sc22wg5/N1551-N1600/N1579.pdf) or gzipped PostScript file (ftp://ftp.nag.co.uk/sc22wg5/N1551-N1600/N1579.ps.gz), FTP.nag.co.uk
[14] A full list is in the report available at http://www.fortran.bcs.org/2006/ukfortran06.pdf (PDF, 24.2 KB)
[15] The New Features of Fortran 2008, ftp://ftp.nag.co.uk/sc22wg5/N1701-N1750/N1735.pdf (PDF, 24.2 KB)
[16] http://www.spec.org/cpu2006/CFP2006/
[17] http://www.spec.org/cpu2006/
[18] Healy, MJR (1968). "Towards FORTRAN VI" (http://hopl.murdoch.edu.au/showlanguage.prx?exp=1092&language=CDC Fortran). Advanced scientific Fortran by CDC. CDC. pp. 169–172. Retrieved 2009-04-10.
[19] "Internal Translator (IT) A Compiler for the IBM 650", by A. J. Perlis, J. W. Smith, and H. R. Van Zoeren, Computation Center, Carnegie Institute of Technology
[20] This is not altogether surprising, as Brian Kernighan, one of the co-creators of Ratfor, is also co-author of The C Programming Language.
[21] http://www.fortran.com/F/index.html
[22] X3J3 post-meeting distribution for meeting held at Brookhaven National Laboratory in November 1976.
[23] "The obliteration of O", Computer Weekly, March 3, 1977
[24] http://standards.ieee.org/reading/ieee/std_public/description/posix/1003.9-1992_desc.html
[25] http://www.research.ibm.com/journal/sj/241/ibmsj2401G.pdf
[26] http://web.mit.edu/6.035/www/papers/BackusEtAl-FortranAutomaticCodingSystem-1957.pdf
[27] http://www.fh-jena.de/~kleine/history/
[28] http://www.fh-jena.de/~kleine/history/languages/FortranAutomaticCodingSystemForTheIBM704.pdf
[29] http://www.softwarepreservation.org/projects/FORTRAN/
[30] http://www.softwarepreservation.org/projects/FORTRAN/FORTRAN_Systems_Manual-1960.pdf
[31] http://hopl.murdoch.edu.au/showlanguage.prx?exp=8&language=FORTRAN
[32] http://bitsavers.org/pdf/ibm/704/R23-9518-0-1_FORTRAN_CE_Manual_1959.pdf
[33] http://www.research.ibm.com/journal/rd/255/ibmrd2505Q.pdf
[34] http://www.trailing-edge.com/~bobbemer/PRORES.HTM
[35] http://gcc.gnu.org/wiki/GFortranStandards
[36] http://www.nag.co.uk/sc22wg5/
[37] http://www.fortran.com/fortran/F77_std/rjcnf.html
[38] http://www.fortran.com/fortran/mil_std_1753.html
[39] http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=17366
[40] http://j3-fortran.org/doc/standing/archive/007/97-007r2/pdf/97-007r2.pdf
[41] http://www.dkuug.dk/jtc1/sc22/open/n3661.pdf
[42] http://rsusu1.rnd.runnet.ru/develop/fortran/prof77/prof77.html
[43] http://www.fortranplus.co.uk/fortran_info.html
[44] http://home.casema.nl/fam.engelberts/
[45] http://www.physics.ucdavis.edu/~vem/F77_Ref.pdf
[46] http://users.physik.fu-berlin.de/~goerz/blog/2008/12/fortran-90-reference-card/
[47] http://doi.acm.org/10.1145/1520752.1520755
[48] http://www.fortranplus.co.uk/resources/fortran_2003_compiler_support.pdf


C Sharp (programming language)

Usual file extensions: .cs
Paradigm: multi-paradigm: structured, imperative, object-oriented, event-driven, functional
Appeared in: 2001
Designed by: Microsoft
Developer: Microsoft
Stable release: 4.0 (April 12, 2010)
Preview release: N/A
Typing discipline: static, dynamic, strong, safe, nominative [1]
Major implementations: .NET Framework, Mono, DotGNU
Dialects: Cω, Spec#, Polyphonic C#
Influenced by: C++, Eiffel, Java, Modula-3, Object Pascal [3] [4] [5]
Influenced: D, F#, Java 5, Nemerle, Vala [6]
License: CLR Proprietary
Website: C# Language (MSDN) [2]
C# (pronounced "see sharp") is a multi-paradigm programming language encompassing imperative, functional, generic, object-oriented (class-based), and component-oriented programming disciplines. It was developed by Microsoft within the .NET initiative and later approved as a standard by Ecma (ECMA-334) and ISO (ISO/IEC 23270). C# is one of the programming languages designed for the Common Language Infrastructure. C# is intended to be a simple, modern, general-purpose, object-oriented programming language.[7] Its development team is led by Anders Hejlsberg. The most recent version is C# 4.0, which was released on April 12, 2010.

Design goals

The ECMA standard lists these design goals for C#:[7]
• C# language is intended to be a simple, modern, general-purpose, object-oriented programming language.
• The language, and implementations thereof, should provide support for software engineering principles such as strong type checking, array bounds checking, detection of attempts to use uninitialized variables, and automatic garbage collection. Software robustness, durability, and programmer productivity are important.
• The language is intended for use in developing software components suitable for deployment in distributed environments.
• Source code portability is very important, as is programmer portability, especially for those programmers already familiar with C and C++.
• Support for internationalization is very important.
• C# is intended to be suitable for writing applications for both hosted and embedded systems, ranging from the very large that use sophisticated operating systems, down to the very small having dedicated functions.



• Although C# applications are intended to be economical with regard to memory and processing power requirements, the language was not intended to compete directly on performance and size with C or assembly language.

Language name

The name "C sharp" was inspired by musical notation, where a sharp indicates that the written note should be made a half-step higher in pitch.[8] This is similar to the language name of C++, where "++" indicates that a variable should be incremented by 1. By coincidence, the sharp symbol resembles four conjoined plus signs. This reiterates Rick Mascitti's tongue-in-cheek use of '++' when naming 'C++': where C was enhanced to create C++, C++ was enhanced to create C++++ (that is, C#).

C-sharp musical note

Due to technical limitations of display (standard fonts, browsers, etc.) and the fact that the sharp symbol (♯, U+266F, MUSIC SHARP SIGN) is not present on the standard keyboard, the number sign (#, U+0023, NUMBER SIGN) was chosen to represent the sharp symbol in the written name of the programming language.[9] This convention is reflected in the ECMA-334 C# Language Specification.[7] However, when it is practical to do so (for example, in advertising or in box art[10]), Microsoft uses the intended musical symbol.

The "sharp" suffix has been used by a number of other .NET languages that are variants of existing languages, including J# (a .NET language also designed by Microsoft which is derived from Java 1.1), A# (from Ada), and the functional F#.[11] The original implementation of Eiffel for .NET was called Eiffel#,[12] a name since retired since the full Eiffel language is now supported. The suffix has also been used for libraries, such as Gtk# (a .NET wrapper for GTK+ and other GNOME libraries), Cocoa# (a wrapper for Cocoa) and Qt# (a .NET language binding for the Qt toolkit).

History

During the development of the .NET Framework, the class libraries were originally written using a managed code compiler system called Simple Managed C (SMC).[13] [14] [15] In January 1999, Anders Hejlsberg formed a team to build a new language at the time called Cool, which stood for "C-like Object Oriented Language".[16] Microsoft had considered keeping the name "Cool" as the final name of the language, but chose not to do so for trademark reasons. By the time the .NET project was publicly announced at the July 2000 Professional Developers Conference, the language had been renamed C#, and the class libraries and ASP.NET runtime had been ported to C#.

C#'s principal designer and lead architect at Microsoft is Anders Hejlsberg, who was previously involved with the design of Turbo Pascal, Embarcadero Delphi (formerly CodeGear Delphi and Borland Delphi), and Visual J++. In interviews and technical papers he has stated that flaws in most major programming languages (e.g. C++, Java, Delphi, and Smalltalk) drove the fundamentals of the Common Language Runtime (CLR), which, in turn, drove the design of the C# programming language itself.

James Gosling, who created the Java programming language in 1994, and Bill Joy, a co-founder of Sun Microsystems, the proprietor of Java, called C# an "imitation" of Java; Gosling further claimed that "[C# is] sort of Java with reliability, productivity and security deleted."[2] [17] Klaus Kreft and Angelika Langer (authors of a C++ streams book) stated in a blog post that "Java and C# are almost identical programming languages. Boring repetition that lacks innovation," "Hardly anybody will claim that Java or C# are revolutionary programming languages that changed the way we write programs," and "C# borrowed a lot from Java - and vice versa. Now that C# supports boxing and unboxing, we'll have a very similar feature in Java."[18] Anders Hejlsberg has argued that C# is "not a Java clone" and is "much closer to C++" in its design.[19]



Versions

In the course of its development, the C# language has gone through several versions:

Version   Language specification   Microsoft compiler
C# 1.0    December 2001 [20]       January 2002
C# 2.0    December 2002 [21]       November 2005
C# 3.0    June 2005 [22]           November 2006
C# 4.0    June 2006 [23]           April 2010

Features

Note: The following description is based on the language standard and other documents listed in the external links section.

By design, C# is the programming language that most directly reflects the underlying Common Language Infrastructure (CLI). Most of its intrinsic types correspond to value-types implemented by the CLI framework. However, the language specification does not state the code generation requirements of the compiler: that is, it does not state that a C# compiler must target a Common Language Runtime, or generate Common Intermediate Language (CIL), or generate any other specific format. Theoretically, a C# compiler could generate machine code like traditional compilers of C++ or FORTRAN.

Some notable distinguishing features of C# are:
• There are no global variables or functions. All methods and members must be declared within classes. Static members of public classes can substitute for global variables and functions.
• Local variables cannot shadow variables of the enclosing block, unlike C and C++. Variable shadowing is often considered confusing by C++ texts.
• C# supports a strict Boolean datatype, bool. Statements that take conditions, such as while and if, require an expression of a type that implements the true operator, such as the boolean type. While C++ also has a boolean type, it can be freely converted to and from integers, and expressions such as if(a) require only that a is convertible to bool, allowing a to be an int, or a pointer. C# disallows this "integer meaning true or false" approach on the grounds that forcing programmers to use expressions that return exactly bool can prevent certain types of common programming mistakes in C or C++ such as if (a = b) (use of assignment = instead of equality ==).
• In C#, memory address pointers can only be used within blocks specifically marked as unsafe, and programs with unsafe code need appropriate permissions to run. Most object access is done through safe object references, which always either point to a "live" object or have the well-defined null value; it is impossible to obtain a reference to a "dead" object (one which has been garbage collected), or to a random block of memory. An unsafe pointer can point to an instance of a value-type, array, string, or a block of memory allocated on a stack. Code that is not marked as unsafe can still store and manipulate pointers through the System.IntPtr type, but it cannot dereference them.
• Managed memory cannot be explicitly freed; instead, it is automatically garbage collected. Garbage collection addresses the problem of memory leaks by freeing the programmer of responsibility for releasing memory which is no longer needed.
• In addition to the try...catch construct to handle exceptions, C# has a try...finally construct to guarantee execution of the code in the finally block.


• Multiple inheritance is not supported, although a class can implement any number of interfaces. This was a design decision by the language's lead architect to avoid complication and simplify architectural requirements throughout CLI.
• C# is more type safe than C++. The only implicit conversions by default are those which are considered safe, such as widening of integers. This is enforced at compile-time, during JIT, and, in some cases, at runtime. There are no implicit conversions between booleans and integers, nor between enumeration members and integers (except for literal 0, which can be implicitly converted to any enumerated type). Any user-defined conversion must be explicitly marked as explicit or implicit, unlike C++ copy constructors and conversion operators, which are both implicit by default.
• Enumeration members are placed in their own scope.
• C# provides properties as syntactic sugar for a common pattern in which a pair of methods, accessor (getter) and mutator (setter), encapsulate operations on a single attribute of a class.
• Full type reflection and discovery is available.
• C# currently (as of 3 June 2008) has 77 reserved words.
• Checked exceptions are not present in C# (in contrast to Java). This has been a conscious decision based on the issues of scalability and versionability.[24]
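Several of the features listed above (the strict bool rule, try...finally, and properties) can be illustrated in one short sketch. All class and member names here are invented for the example:

```csharp
using System;

class Thermostat
{
    private double target;

    // A property: syntactic sugar for an accessor/mutator method pair.
    public double Target
    {
        get { return target; }
        set { target = value; }
    }
}

class FeatureSketch
{
    static void Main()
    {
        int a = 1, b = 2;
        // if (a = b) { }   // compile-time error: no implicit int -> bool
        if (a == b)
        {
            Console.WriteLine("equal");
        }

        var t = new Thermostat();
        t.Target = 21.5;              // invokes the setter
        Console.WriteLine(t.Target);  // invokes the getter

        try
        {
            throw new InvalidOperationException("demo");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("caught: " + ex.Message);
        }
        finally
        {
            // Guaranteed to run, whether or not an exception occurred.
            Console.WriteLine("cleanup");
        }
    }
}
```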

Common Type System (CTS)

C# has a unified type system. This unified type system is called the Common Type System (CTS).[25] A unified type system implies that all types, including primitives such as integers, are subclasses of the System.Object class. For example, every type inherits a ToString() method. For performance reasons, primitive types (and value types in general) are internally allocated on the stack.
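The unified type system is visible directly in code: even a primitive int exposes members inherited from System.Object. A minimal sketch (class name invented for the example):

```csharp
using System;

class UnifiedTypeSystem
{
    static void Main()
    {
        int n = 42;

        // int is a value type, yet it inherits System.Object members:
        Console.WriteLine(n.ToString());   // "42"
        Console.WriteLine(n.GetType());    // "System.Int32"

        // Every type is a subclass of System.Object, so an int can be
        // treated as an object (boxing happens implicitly here).
        object boxed = n;
        Console.WriteLine(boxed is int);   // True
    }
}
```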

Categories of datatypes

CTS separates datatypes into two categories:[25]
1. Value types
2. Reference types

Value types are plain aggregations of data. Instances of value types have neither referential identity nor referential comparison semantics: equality and inequality comparisons for value types compare the actual data values within the instances, unless the corresponding operators are overloaded. Value types are derived from System.ValueType, always have a default value, and can always be created and copied. Some other limitations on value types are that they cannot derive from each other (but can implement interfaces) and cannot have an explicit default (parameterless) constructor. Examples of value types are some primitive types, such as int (a signed 32-bit integer), float (a 32-bit IEEE floating-point number), char (a 16-bit Unicode code unit), and System.DateTime (identifies a specific point in time with nanosecond precision). Other examples are enum (enumerations) and struct (user-defined structures).

In contrast, reference types have the notion of referential identity: each instance of a reference type is inherently distinct from every other instance, even if the data within both instances is the same. This is reflected in default equality and inequality comparisons for reference types, which test for referential rather than structural equality, unless the corresponding operators are overloaded (as is the case for System.String). In general, it is not always possible to create an instance of a reference type, nor to copy an existing instance, nor to perform a value comparison on two existing instances, though specific reference types can provide such services by exposing a public constructor or implementing a corresponding interface (such as ICloneable or IComparable). Examples of reference types are object (the ultimate base class for all other C# classes), System.String (a string of Unicode characters), and System.Array (a base class for all C# arrays).


Both type categories are extensible with user-defined types.
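The behavioral difference between the two categories can be demonstrated with a user-defined struct and class. The PointValue and PointRef types below are invented for this illustration:

```csharp
using System;

struct PointValue { public int X; }  // value type: assignment copies the data
class PointRef { public int X; }     // reference type: assignment copies the reference

class Semantics
{
    static void Main()
    {
        PointValue a = new PointValue { X = 1 };
        PointValue b = a;            // full copy of the data
        b.X = 99;
        Console.WriteLine(a.X);      // 1: 'a' is unaffected by changes to 'b'

        PointRef c = new PointRef { X = 1 };
        PointRef d = c;              // copies the reference only
        d.X = 99;
        Console.WriteLine(c.X);      // 99: both names refer to the same instance
    }
}
```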

Boxing and unboxing

Boxing is the operation of converting a value of a value type into a value of a corresponding reference type.[25] Boxing in C# is implicit. Unboxing is the operation of converting a value of a reference type (previously boxed) into a value of a value type.[25] Unboxing in C# requires an explicit type cast. Example:

int foo = 42;         // Value type.
object bar = foo;     // foo is boxed to bar.
int foo2 = (int)bar;  // Unboxed back to value type.

Preprocessor

C# features "preprocessor directives"[26] (though it does not have an actual preprocessor) based on the C preprocessor that allow programmers to define symbols, but not macros. Conditionals such as #if, #endif, and #else are also provided. Directives such as #region give hints to editors for code folding.

public class Foo
{
    #region Procedures
    public void IntBar(int firstParam) {}
    public void StrBar(string firstParam) {}
    public void BoolBar(bool firstParam) {}
    #endregion

    #region Constructors
    public Foo() {}
    public Foo(int firstParam) {}
    #endregion
}
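The conditional directives mentioned above are easier to see in a small sketch. The TRACE_DEMO symbol and class name are invented for this example; note that #define in C# only declares a symbol for #if tests, not a textual macro:

```csharp
#define TRACE_DEMO   // must appear before any other code in the file

using System;

class ConditionalDemo
{
    static void Main()
    {
#if TRACE_DEMO
        Console.WriteLine("tracing enabled");
#else
        Console.WriteLine("tracing disabled");
#endif
    }
}
```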

Code comments

C# utilizes a double forward slash (//) to indicate the rest of the line is a comment. This is inherited from C++.

public class Foo
{
    // a comment
    public static void Bar(int firstParam) {} // Also a comment
}

Multi-line comments can be indicated by a starting forward slash/asterisk (/*) and ending asterisk/forward slash (*/). This is inherited from standard C.

public class Foo
{
    /* A Multi-Line comment */
    public static void Bar(int firstParam) {}
}

XML documentation system

C#'s documentation system is similar to Java's Javadoc, but based on XML. Two methods of documentation are currently supported by the C# compiler.

Single-line documentation comments, such as those commonly found in Visual Studio generated code, are indicated on a line beginning with ///.

public class Foo
{
    /// <summary>A summary of the method.</summary>
    /// <param name="firstParam">A description of the parameter.</param>
    /// <remarks>Remarks about the method.</remarks>
    public static void Bar(int firstParam) {}
}

Multi-line documentation comments, while defined in the version 1.0 language specification, were not supported until the .NET 1.1 release.[27] These comments are designated by a starting forward slash/asterisk/asterisk (/**) and ending asterisk/forward slash (*/).[28]

public class Foo
{
    /** <summary>A summary of the method.</summary>
     *  <param name="firstParam">A description of the parameter.</param>
     *  <remarks>Remarks about the method.</remarks>
     */
    public static void Bar(int firstParam) {}
}

Note there are some stringent criteria regarding white space and XML documentation when using the forward slash/asterisk/asterisk (/**) technique. This code block:

/**
 * <summary>
 * A summary of the method.</summary>*/

produces a different XML comment from this code block:[28]

/**
 * <summary>
 A summary of the method.</summary>*/

Syntax for documentation comments and their XML markup is defined in a non-normative annex of the ECMA C# standard. The same standard also defines rules for processing of such comments, and their transformation to a plain XML document with precise rules for mapping of CLI identifiers to their related documentation elements. This allows any C# IDE or other development tool to find documentation for any symbol in the code in a certain well-defined way.

Libraries

The C# specification details a minimum set of types and class libraries that the compiler expects to have available. In practice, C# is most often used with some implementation of the Common Language Infrastructure (CLI), which is standardized as ECMA-335 Common Language Infrastructure (CLI).

"Hello, world" example The following is a very simple C# program, a version of the classic "Hello, world" example: using System; class ExampleClass { static void Main() { Console.WriteLine("Hello, world!"); } } The effect is to write the following text to the output console: Hello, world! Each line has a purpose: using System; The above line of code tells the compiler to use 'System' as a candidate prefix for types used in the source code. In this case, when the compiler sees use of the 'Console' type later in the source code, it tries to find a type named 'Console', first in the current assembly, followed by all referenced assemblies. In this case the compiler fails to find such a type, since the name of the type is actually 'System.Console'. The compiler then attempts to find a type named 'System.Console' by using the 'System' prefix from the using statement, and this time it succeeds. The using statement allows the programmer to state all candidate prefixes to use during compilation instead of always using full type names. class ExampleClass Above is a class definition. Everything between the following pair of braces describes ExampleClass. static void Main() This declares the class member method where the program begins execution. The .NET runtime calls the Main method. (Note: Main may also be called from elsewhere, like any other method, e.g. from another method of ExampleClass.) The static keyword makes the method accessible without an instance of ExampleClass. Each console application's Main entry point must be declared static. Otherwise, the program would require an instance, but any instance would require a program. To avoid that irresolvable circular dependency, C# compilers processing console applications (like that above) report an error if there is no static Main method. The void keyword declares that Main has no return value. Console.WriteLine("Hello, world!"); This line writes the output. Console is a static class in the System namespace. 
It provides an interface to the standard input, output, and error streams for console applications. The program calls the Console method WriteLine, which displays on the console a line with the argument, the string "Hello, world!".


Standardization

In August 2000, Microsoft Corporation, Hewlett-Packard and Intel Corporation co-sponsored the submission of specifications for C# as well as the Common Language Infrastructure (CLI) to the standards organization ECMA International. In December 2001, ECMA released ECMA-334 C# Language Specification. C# became an ISO standard in 2003 (ISO/IEC 23270:2006 - Information technology—Programming languages—C#). ECMA had previously adopted equivalent specifications as the 2nd edition of C#, in December 2002.

In June 2005, ECMA approved edition 3 of the C# specification and updated ECMA-334. Additions included partial classes, anonymous methods, nullable types, and generics (similar to C++ templates). In July 2005, ECMA submitted the standards and related TRs to ISO/IEC JTC 1 via the latter's Fast-Track process. This process usually takes 6–9 months.
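Of the edition-3 additions listed above, generics and nullable types are the most visible in everyday code. A minimal sketch (the Pair type is invented for this illustration):

```csharp
using System;

// A user-defined generic type. Unlike C++ templates, C# generics are
// checked at compile time against declared constraints.
class Pair<T> where T : IComparable<T>
{
    public T First, Second;

    public T Max()
    {
        return First.CompareTo(Second) >= 0 ? First : Second;
    }
}

class GenericsDemo
{
    static void Main()
    {
        var p = new Pair<int> { First = 3, Second = 7 };
        Console.WriteLine(p.Max());        // 7

        // A nullable value type, also added in the same revision:
        int? maybe = null;
        Console.WriteLine(maybe.HasValue); // False
    }
}
```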

Criticism

Although the C# language definition and the CLI are standardized under ISO and Ecma standards which provide reasonable and non-discriminatory licensing protection from patent claims, Microsoft uses C# and the CLI in its Base Class Library (BCL), which is the foundation of its proprietary .NET framework and which provides a variety of non-standardized classes (extended I/O, GUI, web services, etc.). Some cases where Microsoft patents apply to standards used in the .NET framework are documented by Microsoft, and the applicable patents are available on either RAND terms or through Microsoft's Open Specification Promise, which releases patent rights to the public,[29] but there is some concern and debate as to whether there are additional aspects patented by Microsoft which are not covered, which may deter independent implementations of the full framework.

Microsoft has also agreed not to sue open source developers for violating patents in non-profit projects for the part of the framework which is covered by the OSP.[30] Microsoft has agreed not to enforce patents relating to Novell products against Novell's paying customers,[31] with the exception of a list of products that do not explicitly mention C#, .NET or Novell's implementation of .NET (the Mono Project).[32] However, Novell maintains that Mono does not infringe any Microsoft patents.[33] Microsoft has also made a specific agreement not to enforce patent rights related to the Moonlight browser plugin, which depends on Mono, provided it is obtained through Novell.[34]

In a note posted on the Free Software Foundation's news website in June 2009, Richard Stallman warned that he believes "Microsoft is probably planning to force all free C# implementations underground some day using software patents" and recommended that developers avoid taking what he described as the "gratuitous risk" associated with "depend[ing] on the free C# implementations".[35] The Free Software Foundation later reiterated its warnings,[36] claiming that the extension of the Microsoft Community Promise to the C# and CLI ECMA specifications[37] would not prevent Microsoft from harming open-source implementations of C#, because many specific Windows libraries included with .NET or Mono were not covered by this promise.

Implementations

The reference C# compiler is Microsoft Visual C#. Other C# compilers exist, often including an implementation of the Common Language Infrastructure and the .NET class libraries up to .NET 2.0:
• Microsoft's Rotor project (currently called Shared Source Common Language Infrastructure) (licensed for educational and research use only) provides a shared source implementation of the CLR runtime and a C# compiler, and a subset of the required Common Language Infrastructure framework libraries in the ECMA specification (up to C# 2.0, and supported on Windows XP only).
• The Mono project provides an open source C# compiler, a complete open source implementation of the Common Language Infrastructure including the required framework libraries as they appear in the ECMA specification, and a nearly complete implementation of the Microsoft proprietary .NET class libraries up to .NET 2.0, but not specific .NET 3.0 and .NET 3.5 libraries, as of Mono 2.0.
• The DotGNU project also provides an open source C# compiler, a nearly complete implementation of the Common Language Infrastructure including the required framework libraries as they appear in the ECMA specification, and a subset of some of the remaining Microsoft proprietary .NET class libraries up to .NET 2.0 (those not documented or included in the ECMA specification but included in Microsoft's standard .NET Framework distribution).
• The DotNetAnywhere [38] Micro Framework-like Common Language Runtime is targeted at embedded systems, and supports almost all C# 2.0 specifications.

See also

• C# syntax
• Comparison of Java and C#
• Sing#
• Mono and Microsoft's patents
• .NET Framework Standardization and licensing
• Oxygene
• Microsoft Visual Studio, IDE for C#
• SharpDevelop, an open-source C# IDE for Windows
• MonoDevelop, an open-source C# IDE for Linux, Windows and Mac OS X
• QuickSharp [39], an open-source, minimalist C# IDE for Windows
• Morfik C#, a C# to JavaScript compiler complete with IDE and framework for Web application development
• Baltie, an educational IDE for children and students with little or no programming experience
• Borland Turbo C Sharp

References

• Archer, Tom (2001). Inside C#. Microsoft Press. ISBN 0-7356-1288-9.
• C# Language Pocket Reference. O'Reilly. 2002. ISBN 0-596-00429-X.
• Petzold, Charles (2002). Programming Microsoft Windows with C#. Microsoft Press. ISBN 0-7356-1370-2.

External links

• Download C# Express [40]
• C# Language (MSDN) [6]
• C# Programming Guide (MSDN) [41]
• C# Specification (MSDN) [6]
• ISO C# Language Specification [42]
• C# Language Specification (hyperlinked) [43]
• Microsoft Visual C# .NET [44]


References

[1] Torgersen, Mads (October 27, 2008). "New features in C# 4.0" (http://code.msdn.microsoft.com/csharpfuture/Release/ProjectReleases.aspx?ReleaseId=1686). Microsoft. Retrieved 2008-10-28.
[2] Wylie Wong (2002). "Why Microsoft's C# isn't" (http://news.cnet.com/2008-1082-817522.html). CNET: CBS Interactive. Retrieved 2009-11-14.
[3] "The A-Z of Programming Languages: C#" (http://www.computerworld.com.au/article/261958/a-z_programming_languages_c_/?pp=7). Computerworld. 1 October 2008. Retrieved 12 February 2010. "We all stand on the shoulders of giants here and every language builds on what went before it so we owe a lot to C, C++, Java, Delphi, all of these other things that came before us. (Anders Hejlsberg)"
[4] Naugler, David (May 2007). "C# 2.0 for C++ and Java programmer: conference workshop". Journal of Computing Sciences in Colleges 22 (5). "Although C# has been strongly influenced by Java it has also been strongly influenced by C++ and is best viewed as a descendent of both C++ and Java."
[5] Cornelius, Barry (December 1, 2005). "Java 5 catches up with C#" (http://www.barrycornelius.com/papers/java5/onefile/). University of Oxford Computing Services. Retrieved June 18, 2009.
[6] http://msdn2.microsoft.com/en-us/vcsharp/aa336809.aspx
[7] C# Language Specification (http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf) (4th ed.). ECMA International. June 2006. Retrieved June 18, 2009.
[8] Kovacs, James (September 7, 2007). "C#/.NET History Lesson" (http://www.jameskovacs.com/blog/CNETHistoryLesson.aspx). Retrieved June 18, 2009.
[9] "Microsoft C# FAQ" (http://msdn.microsoft.com/vcsharp/previous/2002/FAQ/default.aspx). Microsoft. Retrieved 2008-03-25.
[10] "Visual C#.net Standard" (http://www.microsoft.com/presspass/images/gallery/boxshots/web/visual-c-sharp03.jpg) (JPEG). Microsoft. September 4, 2003. Retrieved June 18, 2009.
Java (programming language)

Usual file extensions: .java, .class, .jar
Paradigm: Object-oriented, structured, imperative
Appeared in: 1995
Designed by: Sun Microsystems (now owned by Oracle Corporation)
Developer: James Gosling & Sun Microsystems
Stable release: Java Standard Edition 6 (1.6.0_20) (April 15, 2010)
Typing discipline: Static, strong, safe, nominative, manifest
Major implementations: Numerous
Dialects: Generic Java, Pizza
Influenced by: Ada 83, C++,[1] C#,[2] Eiffel,[3] Generic Java, Mesa,[4] Modula-3,[5] Objective-C,[6] Delphi Object Pascal,[7] [8] UCSD Pascal, Smalltalk[9]
Influenced: Ada 2005, C#, Clojure, D, ECMAScript, Groovy, J#, PHP, Scala, JavaScript, Python, BeanShell
OS: Cross-platform (multi-platform)
License: GNU General Public License / Java Community Process
Website: java.sun.com

Java is a programming language originally developed by James Gosling at Sun Microsystems (which is now a subsidiary of Oracle Corporation) and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities. Java applications are typically compiled to bytecode (class files) that can run on any Java Virtual Machine (JVM) regardless of computer architecture. Java is general-purpose, concurrent, class-based, and object-oriented, and is specifically designed to have as few implementation dependencies as possible. It is intended to let application developers "write once, run anywhere". Java is considered by many to be one of the most influential programming languages of the 20th century, and it is widely used in software ranging from desktop applications to web applications.[10] [11]

Duke, the Java mascot

The original and reference implementation Java compilers, virtual machines, and class libraries were developed by Sun from 1995. As of May 2007, in compliance with the specifications of the Java Community Process, Sun relicensed most of their Java technologies under the GNU General Public License. Others have also developed alternative implementations of these Sun technologies, such as the GNU Compiler for Java and GNU Classpath.



History

James Gosling initiated the Java language project in June 1991 for use in one of his many set-top box projects.[12] The language, initially called Oak after an oak tree that stood outside Gosling's office, also went by the name Green and was later renamed Java, from a list of random words.[13] Gosling aimed to implement a virtual machine and a language that had a familiar C/C++ style of notation.[14]

Sun Microsystems released the first public implementation as Java 1.0 in 1995. It promised "Write Once, Run Anywhere" (WORA), providing no-cost run-times on popular platforms. Fairly secure and featuring configurable security, it allowed network- and file-access restrictions. Major web browsers soon incorporated the ability to run Java applets within web pages, and Java quickly became popular.

With the advent of Java 2 (released initially as J2SE 1.2 in December 1998), new versions had multiple configurations built for different types of platforms. For example, J2EE targeted enterprise applications and the greatly stripped-down version J2ME targeted mobile applications (Mobile Java); J2SE designated the Standard Edition. In 2006, for marketing purposes, Sun renamed new J2 versions as Java EE, Java ME, and Java SE, respectively.

In 1997, Sun Microsystems approached the ISO/IEC JTC1 standards body and later Ecma International to formalize Java, but it soon withdrew from the process.[15] Java remains a de facto standard, controlled through the Java Community Process.[16] At one time, Sun made most of its Java implementations available without charge, despite their proprietary software status. Sun generated revenue from Java through the selling of licenses for specialized products such as the Java Enterprise System. Sun distinguishes between its Software Development Kit (SDK) and Runtime Environment (JRE) (a subset of the SDK); the primary distinction involves the JRE's lack of the compiler, utility programs, and header files.
On November 13, 2006, Sun released much of Java as open source software under the terms of the GNU General Public License (GPL). On May 8, 2007, Sun finished the process, making all of Java's core code available under free software/open-source distribution terms, aside from a small portion of code to which Sun did not hold the copyright.[17] Sun's vice-president Rich Green has said that Sun's ideal role with regards to Java is as an "evangelist."[18]

Principles

There were five primary goals in the creation of the Java language:[19]

1. It should be "simple, object oriented, and familiar".
2. It should be "robust and secure".
3. It should be "architecture neutral and portable".
4. It should execute with "high performance".
5. It should be "interpreted, threaded, and dynamic".

Practices

Java Platform

One characteristic of Java is portability, which means that computer programs written in the Java language must run similarly on any supported hardware/operating-system platform. This is achieved by compiling the Java language code to an intermediate representation called Java bytecode, instead of directly to platform-specific machine code. Java bytecode instructions are analogous to machine code, but they are intended to be interpreted by a virtual machine (VM) written specifically for the host hardware. End-users commonly use a Java Runtime Environment (JRE) installed on their own machine for standalone Java applications, or in a Web browser for Java applets. Standardized libraries provide a generic way to access host-specific features such as graphics, threading and networking.



A major benefit of using bytecode is portability. However, the overhead of interpretation means that interpreted programs almost always run more slowly than programs compiled to native executables would. Just-in-time compilers that compile bytecode to machine code at runtime were introduced from an early stage. Over the years, this built-in JVM feature has been optimized to the point where JVM performance competes with natively compiled C code.

Implementations

Sun Microsystems officially licenses the Java Standard Edition platform for Linux,[20] Mac OS X,[21] and Solaris. Although in the past Sun licensed Java to Microsoft, the license has expired and has not been renewed.[22] Through a network of third-party vendors and licensees,[23] alternative Java environments are available for these and other platforms. Sun's trademark license for use of the Java brand insists that all implementations be "compatible". This resulted in a legal dispute with Microsoft after Sun claimed that the Microsoft implementation did not support RMI or JNI and had added platform-specific features of its own. Sun sued in 1997, and in 2001 won a settlement of US$20 million as well as a court order enforcing the terms of the license from Sun.[24] As a result, Microsoft no longer ships Java with Windows, and in recent versions of Windows, Internet Explorer cannot support Java applets without a third-party plugin. Sun, and others, have made free Java run-time systems available for those and other versions of Windows.

Platform-independent Java is essential to the Java EE strategy, and an even more rigorous validation is required to certify an implementation. This environment enables portable server-side applications, such as Web services, Java Servlets, and Enterprise JavaBeans, as well as embedded systems based on OSGi, using Embedded Java environments.
Through the new GlassFish project, Sun is working to create a fully functional, unified open source implementation of the Java EE technologies. Sun also distributes a superset of the JRE called the Java Development Kit (commonly known as the JDK), which includes development tools such as the Java compiler, Javadoc, Jar and a debugger.

Performance

Programs written in Java have a reputation for being slower and requiring more memory than those written in some other languages.[25] However, the execution speed of Java programs improved significantly with the introduction of just-in-time compilation in 1997/1998 for Java 1.1,[26] [27] [28] the addition of language features supporting better code analysis (such as inner classes, the StringBuffer class, optional assertions, etc.), and optimizations in the Java Virtual Machine itself, such as HotSpot becoming the default for Sun's JVM in 2000. To further boost the speed achievable with the Java language, Systronix made JStik,[29] a microcontroller based on the aJile Systems[30] line of embedded Java processors. In addition, the widely used ARM family of CPUs has hardware support for executing Java bytecode through its Jazelle option.
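The StringBuffer class mentioned above can be illustrated with a minimal sketch (the class and method names here are illustrative, not from the article):

```java
// Illustrative sketch: StringBuffer builds a String incrementally,
// avoiding the creation of an intermediate String per concatenation.
public class AppendSketch {
    public static String numbers(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(','); // appends in place, no new String objects
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(numbers(3));
    }
}
```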

Automatic memory management

Java uses an automatic garbage collector to manage memory in the object lifecycle. The programmer determines when objects are created, and the Java runtime is responsible for recovering the memory once objects are no longer in use. Once no references to an object remain, the unreachable memory becomes eligible to be freed automatically by the garbage collector. Something similar to a memory leak may still occur if a programmer's code holds a reference to an object that is no longer needed, typically when such objects are stored in containers that are still in use. If a method is called on a null reference, a "null pointer exception" is thrown.[31] [32]
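The difference between an unreachable object and a merely unused one, and the null pointer exception mentioned above, can be sketched as follows (all names are illustrative, not from the article):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: a long-lived collection keeps its contents reachable,
// so they are never collected, even if the program no longer needs them.
public class GcSketch {
    // Objects added here stay reachable for the life of the program.
    private static final List<byte[]> cache = new ArrayList<byte[]>();

    public static int cacheAndForget() {
        byte[] kept = new byte[1024];
        cache.add(kept);        // still referenced: never garbage collected
        byte[] dropped = new byte[1024];
        dropped = null;         // now unreachable: eligible for collection
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("reachable blocks: " + cacheAndForget());
        String s = null;
        try {
            s.length();         // calling a method on a null reference...
        } catch (NullPointerException e) {
            System.out.println("caught NullPointerException");
        }
    }
}
```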

One of the ideas behind Java's automatic memory management model is that programmers be spared the burden of having to perform manual memory management. In some languages memory for the creation of objects is implicitly allocated on the stack, or explicitly allocated and deallocated from the heap. Either way, the responsibility of



managing memory resides with the programmer. If the program does not deallocate an object, a memory leak occurs. If the program attempts to access or deallocate memory that has already been deallocated, the result is undefined and difficult to predict, and the program is likely to become unstable or crash. This can be partially remedied by the use of smart pointers, but these add overhead and complexity. Note that garbage collection does not prevent "logical" memory leaks, i.e. those where the memory is still referenced but never used.

Garbage collection may happen at any time. Ideally, it will occur when a program is idle. It is guaranteed to be triggered if there is insufficient free memory on the heap to allocate a new object; this can cause a program to stall momentarily. Explicit memory management is not possible in Java.

Java does not support C/C++-style pointer arithmetic, where object addresses and unsigned integers (usually long integers) can be used interchangeably. This allows the garbage collector to relocate referenced objects and ensures type safety and security.

As in C++ and some other object-oriented languages, variables of Java's primitive data types are not objects. Values of primitive types are either stored directly in fields (for objects) or on the stack (for methods) rather than on the heap, as is commonly true for objects (but see escape analysis). This was a conscious decision by Java's designers for performance reasons. Because of this, Java was not considered a pure object-oriented programming language. However, as of Java 5.0, autoboxing enables programmers to proceed as if primitive types were instances of their wrapper classes.
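The autoboxing behavior described above can be sketched as follows (the class and method names are illustrative):

```java
// Illustrative sketch of autoboxing (Java 5.0+): primitive values are
// converted to and from their wrapper classes automatically.
public class BoxingSketch {
    public static Integer box(int n) {
        Integer boxed = n;   // autoboxing: int -> Integer
        return boxed;
    }

    public static int unbox(Integer n) {
        int plain = n;       // unboxing: Integer -> int
        return plain;
    }

    public static void main(String[] args) {
        System.out.println(unbox(box(42)));
    }
}
```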

Syntax

The syntax of Java is largely derived from C++. Unlike C++, which combines the syntax for structured, generic, and object-oriented programming, Java was built almost exclusively as an object-oriented language. All code is written inside a class and everything is an object, with the exception of the intrinsic data types (ordinal and real numbers, boolean values, and characters), which are not classes for performance reasons. Java omits several features (such as operator overloading and multiple inheritance) for classes in order to simplify the language and to prevent possible errors and anti-pattern design.

Java uses commenting methods similar to C++. There are three different styles of comment: a single-line style marked with two forward slashes (//), a multiple-line style opened with a forward slash and asterisk (/*) and closed with an asterisk and forward slash (*/), and the Javadoc commenting style opened with a forward slash and two asterisks (/**) and closed with an asterisk and forward slash (*/). The Javadoc style of commenting allows the user to run the Javadoc executable to generate documentation for the program. Example:

// This is an example of a single line comment using two forward slashes

/* This is an example of a multiple line comment using the forward slash
   and asterisk. This type of comment can be used to hold a lot of
   information or deactivate code, but it is very important to remember
   to close the comment. */

/** This is an example of a multiple line Javadoc comment which allows
    compilation from Javadoc of this comment. */




Examples

Hello world

The traditional Hello world program can be written in Java as:

// Outputs "Hello, world!" and then exits
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}

Source files must be named after the public class they contain, appending the suffix .java, for example, HelloWorld.java. The file must first be compiled into bytecode, using a Java compiler, producing a file named HelloWorld.class. Only then can it be executed, or 'launched'. A Java source file may contain only one public class, but it can contain multiple classes with non-public access and any number of public inner classes.

A class that is not declared public may be stored in any .java file. The compiler generates a class file for each class defined in the source file. The name of the class file is the name of the class, with .class appended. For class-file generation, anonymous classes are treated as if their name were the concatenation of the name of their enclosing class, a $, and an integer.

The keyword public denotes that a method can be called from code in other classes, or that a class may be used by classes outside the class hierarchy. The class hierarchy is related to the name of the directory in which the .java file is located.

The keyword static in front of a method indicates a static method, which is associated only with the class and not with any specific instance of that class. Only static methods can be invoked without a reference to an object. Static methods cannot access any class members that are not also static.

The keyword void indicates that the main method does not return any value to the caller. If a Java program is to exit with an error code, it must call System.exit() explicitly.

The method name "main" is not a keyword in the Java language. It is simply the name of the method the Java launcher calls to pass control to the program.
Java classes that run in managed environments such as applets and Enterprise JavaBeans do not use or need a main() method. A Java program may contain multiple classes that have main methods, which means that the VM needs to be told explicitly which class to launch from.

The main method must accept an array of String objects. By convention, it is referenced as args, although any other legal identifier name can be used. Since Java 5, the main method can also use variable arguments, in the form of public static void main(String... args), allowing the main method to be invoked with an arbitrary number of String arguments. The effect of this alternate declaration is semantically identical (the args parameter is still an array of String objects), but it allows an alternative syntax for creating and passing the array.

The Java launcher launches Java by loading a given class (specified on the command line or as an attribute in a JAR) and starting its public static void main(String[]) method. Stand-alone programs must declare this method explicitly. The String[] args parameter is an array of String objects containing any arguments passed to the class. The parameters to main are often passed by means of a command line.

Printing is part of the Java standard library: the System class defines a public static field called out. The out object is an instance of the PrintStream class and provides many methods for printing data to standard output, including println(String), which also appends a new line to the passed string. The string "Hello, world!" is automatically converted to a String object by the compiler.
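The variable-arguments form of main described above can be sketched as follows (the class and helper-method names are illustrative):

```java
// Illustrative sketch: since Java 5, main may be declared with variable
// arguments; inside the method, args is still a String array.
public class ArgsSketch {
    public static String join(String... parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) {       // varargs iterate like an array
            sb.append(p).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String... args) {  // alternative to String[] args
        System.out.println(join(args));
    }
}
```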




A more comprehensive example

// OddEven.java
import javax.swing.JOptionPane;

public class OddEven {
    // "input" is the number that the user gives to the computer
    private int input; // a whole number ("int" means integer)

    /*
     * This is the constructor method. It gets called when an object of the
     * OddEven type is being created.
     */
    public OddEven() {
        /*
         * Code not shown for simplicity. In most Java programs constructors
         * can initialize objects with default values, or create other objects
         * that this object might use to perform its functions. In some Java
         * programs, the constructor may simply be an empty function if
         * nothing needs to be initialized prior to the functioning of the
         * object. In this program's case, an empty constructor would suffice.
         * A constructor must exist; however, if the user doesn't provide one,
         * the compiler will create an empty one.
         */
    }

    // This is the main method. It gets called when this class is run through
    // a Java interpreter.
    public static void main(String[] args) {
        /*
         * This line of code creates a new instance of this class called
         * "number" (also known as an Object) and initializes it by calling
         * the constructor. The next line of code calls the "showDialog()"
         * method, which brings up a prompt to ask you for a number.
         */
        OddEven number = new OddEven();
        number.showDialog();
    }

    public void showDialog() {
        /*



         * "try" makes sure nothing goes wrong. If something does,
         * the interpreter skips to "catch" to see what it should do.
         */
        try {
            /*
             * The code below brings up a JOptionPane, which is a dialog box.
             * The String returned by the "showInputDialog()" method is
             * converted into an integer, making the program treat it as a
             * number instead of a word. After that, this method calls a
             * second method, calculate(), that will display either "Even"
             * or "Odd."
             */
            input = Integer.parseInt(JOptionPane.showInputDialog("Please Enter A Number"));
            calculate();
        } catch (NumberFormatException e) {
            /*
             * Getting in the catch block means that there was a problem with
             * the format of the number. Probably some letters were typed in
             * instead of a number.
             */
            System.err.println("ERROR: Invalid input. Please type in a numerical value.");
        }
    }

    /*
     * When this gets called, it sends a message to the interpreter.
     * The interpreter usually shows it on the command prompt (for Windows
     * users) or the terminal (for Linux users), assuming it's open.
     */
    private void calculate() {
        if (input % 2 == 0) {
            System.out.println("Even");
        } else {
            System.out.println("Odd");
        }
    }
}

• The import statement imports the JOptionPane class from the javax.swing package.
• The OddEven class declares a single private field of type int named input. Every instance of the OddEven class has its own copy of the input field. The private declaration means that no other class can access (read or write) the




input field.
• OddEven() is a public constructor. Constructors have the same name as the enclosing class they are declared in and, unlike a method, have no return type. A constructor is used to initialize an object that is a newly created instance of the class.
• The calculate() method is declared without the static keyword. This means that the method is invoked using a specific instance of the OddEven class. (The reference used to invoke the method is passed as an undeclared parameter of type OddEven named this.) The method tests the expression input % 2 == 0 using the if keyword to see if the remainder of dividing the input field belonging to the instance of the class by two is zero. If this expression is true, then it prints Even; if this expression is false, it prints Odd. (The input field can be equivalently accessed as this.input, which explicitly uses the undeclared this parameter.)
• OddEven number = new OddEven(); declares a local object reference variable in the main method named number. This variable can hold a reference to an object of type OddEven. The declaration initializes number by first creating an instance of the OddEven class, using the new keyword and the OddEven() constructor, and then assigning this instance to the variable.
• The statement number.showDialog(); calls the showDialog method. The instance of the OddEven object referenced by the number local variable is used to invoke the method and is passed as the undeclared this parameter to the showDialog method.

• input = Integer.parseInt(JOptionPane.showInputDialog("Please Enter A Number")); is a statement that converts the type of String to the primitive data type int by using a utility function in the primitive wrapper class Integer.
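The parseInt conversion and the NumberFormatException handling used by OddEven can be condensed into a minimal sketch (the class and method names are illustrative):

```java
// Illustrative sketch: Integer.parseInt turns a String into an int and
// throws NumberFormatException for non-numeric input.
public class ParseSketch {
    public static String classify(String text) {
        try {
            int value = Integer.parseInt(text);
            return (value % 2 == 0) ? "Even" : "Odd";
        } catch (NumberFormatException e) {
            return "Not a number";   // e.g. letters were typed in
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("42"));
        System.out.println(classify("forty-two"));
    }
}
```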

Special classes

Applet

Java applets are programs that are embedded in other applications, typically in a Web page displayed in a Web browser.

// Hello.java
import javax.swing.JApplet;
import java.awt.Graphics;

public class Hello extends JApplet {
    public void paintComponent(Graphics g) {
        g.drawString("Hello, world!", 65, 95);
    }
}

The import statements direct the Java compiler to include the javax.swing.JApplet and java.awt.Graphics classes in the compilation. The import statement allows these classes to be referenced in the source code using the simple class name (i.e. JApplet) instead of the fully qualified class name (i.e. javax.swing.JApplet).

The Hello class extends (subclasses) the JApplet (Java Applet) class; the JApplet class provides the framework for the host application to display and control the lifecycle of the applet. The JApplet class is a JComponent (Java graphical component), which provides the applet with the capability to display a graphical user interface (GUI) and respond to user events.

The Hello class overrides the paintComponent(Graphics) method inherited from the Container superclass to provide the code to display the applet. The paintComponent() method is passed a Graphics object that contains the graphic context used to display the applet. The paintComponent() method calls the graphic context



drawString(String, int, int) method to display the "Hello, world!" string at a pixel offset of (65, 95) from the upper-left corner of the applet's display.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- Hello.html -->
<html>
  <head>
    <title>Hello World Applet</title>
  </head>
  <body>
    <applet code="Hello" width="200" height="200">
    </applet>
  </body>
</html>

An applet is placed in an HTML document using the <applet> HTML element. The applet tag has three attributes set: code="Hello" specifies the name of the JApplet class, and width="200" height="200" sets the pixel width and height of the applet. Applets may also be embedded in HTML using either the object or embed element,[33] although support for these elements by Web browsers is inconsistent.[34] However, the applet tag is deprecated, so the object tag is preferred where supported.

The host application, typically a Web browser, instantiates the Hello applet and creates an AppletContext for the applet. Once the applet has initialized itself, it is added to the AWT display hierarchy. The paintComponent() method is called by the AWT event dispatching thread whenever the display needs the applet to draw itself.

Servlet

Java Servlet technology provides Web developers with a simple, consistent mechanism for extending the functionality of a Web server and for accessing existing business systems. Servlets are server-side Java EE components that generate responses (typically HTML pages) to requests (typically HTTP requests) from clients. A servlet can almost be thought of as an applet that runs on the server side, without a face.

// Hello.java
import java.io.*;
import javax.servlet.*;

public class Hello extends GenericServlet {
    public void service(ServletRequest request, ServletResponse response)
            throws ServletException, IOException {
        response.setContentType("text/html");
        final PrintWriter pw = response.getWriter();
        pw.println("Hello, world!");
        pw.close();
    }
}

The import statements direct the Java compiler to include all of the public classes and interfaces from the java.io and javax.servlet [35] packages in the compilation.



The Hello class extends the GenericServlet [36] class; the GenericServlet class provides the interface for the server to forward requests to the servlet and to control the servlet's lifecycle.

The Hello class overrides the service(ServletRequest, ServletResponse) [37] method defined by the Servlet [38] interface to provide the code for the service request handler. The service() method is passed a ServletRequest [39] object that contains the request from the client and a ServletResponse [40] object used to create the response returned to the client. The service() method declares that it throws the exceptions ServletException [41] and IOException if a problem prevents it from responding to the request.

The setContentType(String) [42] method in the response object is called to set the MIME content type of the returned data to "text/html". The getWriter() [43] method in the response returns a PrintWriter object that is used to write the data that is sent to the client. The println(String) method is called to write the "Hello, world!" string to the response, and then the close() method is called to close the print writer, which causes the data that has been written to the stream to be returned to the client.

JavaServer Page

JavaServer Pages (JSPs) are server-side Java EE components that generate responses, typically HTML pages, to HTTP requests from clients. JSPs embed Java code in an HTML page by using the special delimiters <% and %>. A JSP is compiled to a Java servlet, a Java application in its own right, the first time it is accessed. After that, the generated servlet creates the response.
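As an illustrative sketch (not from the article), a minimal JSP page using the <% and %> delimiters might look like the following; the file name and parameter name are hypothetical:

```jsp
<%-- hello.jsp: an illustrative sketch --%>
<html>
  <body>
    <% String visitor = request.getParameter("name"); %>
    <p>Hello, <%= (visitor == null) ? "world" : visitor %>!</p>
  </body>
</html>
```

The scriptlet between <% and %> runs Java code for each request, and the <%= ... %> expression writes its value into the generated HTML.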

Swing application

Swing is a graphical user interface library for the Java SE platform. It is possible to specify a different look and feel through the pluggable look and feel system of Swing. Clones of Windows, GTK+ and Motif are supplied by Sun. Apple also provides an Aqua look and feel for Mac OS X. Where prior implementations of these looks and feels may have been considered lacking, Swing in Java SE 6 addresses this problem by using more native GUI widget drawing routines of the underlying platforms.

This example Swing application creates a single window with "Hello, world!" inside:

// Hello.java (Java SE 5)
import javax.swing.*;

public class Hello extends JFrame {
    public Hello() {
        setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
        add(new JLabel("Hello, world!"));
        pack();
    }

    public static void main(String[] args) {
        new Hello().setVisible(true);
    }
}

The first import includes all of the public classes and interfaces from the javax.swing package.

The Hello class extends the JFrame class; the JFrame class implements a window with a title bar and a close control.

The Hello() constructor initializes the frame by first implicitly calling the superclass constructor. It then calls the setDefaultCloseOperation(int) method inherited from JFrame





to set the default operation when the close control on the title bar is selected to WindowConstants.EXIT_ON_CLOSE; this causes the JFrame to be disposed of when the frame is closed (as opposed to merely hidden), which allows the JVM to exit and the program to terminate. The frame's content pane uses a BorderLayout by default; this tells Swing how to arrange the components that are added to the frame. A JLabel is created for the string "Hello, world!" and the add(Component) method inherited from the Container superclass is called to add the label to the frame. The pack() method inherited from the Window superclass is called to size the window and lay out its contents.

The main() method is called by the JVM when the program starts. It instantiates a new Hello frame and causes it to be displayed by calling the setVisible(boolean) method inherited from the Component superclass with the boolean parameter true. Once the frame is displayed, exiting the main method does not cause the program to terminate, because the AWT event dispatching thread remains active until all of the Swing top-level windows have been disposed.

Generics

In 2004 generics were added to the Java language, as part of J2SE 5.0. Prior to the introduction of generics, each variable declaration had to be of a specific type. For container classes, for example, this is a problem because there is no easy way to create a container that accepts only specific types of objects. Either the container operates on all subtypes of a class or interface, usually Object, or a different container class has to be created for each contained class. Generics allow compile-time type checking without having to create a large number of container classes, each containing almost identical code.
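The container problem described above can be sketched in a short example; the class and variable names here are illustrative, not taken from the Java platform itself:

```java
import java.util.ArrayList;
import java.util.List;

public class GenericsDemo {
    public static void main(String[] args) {
        // Before J2SE 5.0: a "raw" container accepts any Object, so a
        // mixed-type mistake compiles and only fails at run time.
        List raw = new ArrayList();
        raw.add("hello");
        raw.add(Integer.valueOf(42));
        // String broken = (String) raw.get(1); // would throw ClassCastException

        // With generics: the element type is checked at compile time,
        // and no cast is needed when reading elements back.
        List<String> typed = new ArrayList<String>();
        typed.add("hello");
        // typed.add(Integer.valueOf(42));      // rejected by the compiler
        String s = typed.get(0);                // no cast required
        System.out.println(s);                  // prints "hello"
    }
}
```

The raw-type list compiles (with an "unchecked" warning) and fails only when the bad element is cast; the generic list turns the same mistake into a compile-time error, which is the trade-off the paragraph above describes.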

Class libraries

Java Platform and Class libraries diagram

Java libraries are the compiled bytecodes of source code developed by the JRE implementor to support application development in Java. Examples of these libraries are:

• The core libraries, which include:
  • Collection libraries that implement data structures such as lists, dictionaries, trees and sets
  • XML Processing (Parsing, Transforming, Validating) libraries
  • Security
  • Internationalization and localization libraries
• The integration libraries, which allow the application writer to communicate with external systems. These libraries include:
  • The Java Database Connectivity (JDBC) API for database access
  • Java Naming and Directory Interface (JNDI) for lookup and discovery
  • RMI and CORBA for distributed application development
  • JMX for managing and monitoring applications
• User interface libraries, which include:
  • The (heavyweight, or native) Abstract Window Toolkit (AWT), which provides GUI components, the means for laying out those components and the means for handling events from those components
  • The (lightweight) Swing libraries, which are built on AWT but provide (non-native) implementations of the AWT widgetry
  • APIs for audio capture, processing, and playback
• A platform dependent implementation of the Java Virtual Machine (JVM) that is the means by which the byte codes of the Java libraries and third party applications are executed
• Plugins, which enable applets to be run in Web browsers
• Java Web Start, which allows Java applications to be efficiently distributed to end-users across the Internet
• Licensing and documentation.

Documentation

Javadoc is a comprehensive documentation system, created by Sun Microsystems, used by many Java developers. It provides developers with an organized system for documenting their code. Whereas normal comments in Java and C are set off with /* and */, the multi-line comment tags, Javadoc comments have an extra asterisk at the beginning, so that the tags are /** and */.

Examples

The following is an example of Java code commented with simple Javadoc-style comments:

/**
 * A program that does useful things.
 */
public class Program {
    /**
     * A main method.
     * @param args The arguments
     */
    public static void main(String[] args) {
        // do stuff
    }
}

Editions

Sun has defined and supports four editions of Java targeting different application environments and segmented many of its APIs so that they belong to one of the platforms. The platforms are:

• Java Card for smartcards.
• Java Platform, Micro Edition (Java ME) — targeting environments with limited resources.
• Java Platform, Standard Edition (Java SE) — targeting workstation environments.
• Java Platform, Enterprise Edition (Java EE) — targeting large distributed enterprise or Internet environments.

The classes in the Java APIs are organized into separate groups called packages. Each package contains a set of related interfaces, classes and exceptions. Refer to the separate platforms for a description of the packages available. The set of APIs is controlled by Sun Microsystems in cooperation with others through the Java Community Process program. Companies or individuals participating in this process can influence the design and development of the APIs. This process has been a subject of controversy. Sun also provided an edition called PersonalJava that has been superseded by later, standards-based Java ME configuration-profile pairings.
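The package grouping described above is visible directly in source code; a minimal sketch (the class name PackagesDemo is illustrative):

```java
// Classes from the java.util package can be named in two ways.
public class PackagesDemo {
    public static void main(String[] args) {
        // ...with their fully qualified name (package plus class name):
        java.util.List<String> list = new java.util.ArrayList<String>();
        list.add("packaged");
        // ...or brought into scope with an import statement, as the
        // earlier Swing example does for javax.swing.
        System.out.println(list.get(0));        // prints "packaged"
    }
}
```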

See also

• Comparison of programming languages
• Comparison of Java and C++
• Comparison of Java and C#
• JavaOne
• Javapedia
• List of Java virtual machines
• List of Java APIs
• List of JVM languages
• C#
• Java version history
• Oak



References

• Jon Byous, Java technology: The early years [44]. Sun Developer Network, no date [ca. 1998]. Retrieved April 22, 2005.
• James Gosling, A brief history of the Green project [45]. Java.net, no date [ca. Q1/1998]. Retrieved April 29, 2007.
• James Gosling, Bill Joy, Guy Steele, and Gilad Bracha, The Java language specification, third edition. Addison-Wesley, 2005. ISBN 0-321-24678-0 (see also online edition of the specification [46]).
• Tim Lindholm and Frank Yellin, The Java Virtual Machine specification, second edition. Addison-Wesley, 1999. ISBN 0-201-43294-3 (see also online edition of the specification [47]).

External links

• Sun Microsystems: Java home page [48]
• Sun Microsystems: Developer Resources for Java Technology [49]
• Chamber of Chartered Java Professionals International: Professionalism for Java Technology [50]
• Sun Microsystems: Java Language Specification 3rd Edition [51]
• Java SE 6 API Javadocs
• A Brief History of the Green Project [45]
• Michael O'Connell: Java: The Inside Story [52], SunWorld, July 1995.
• Patrick Naughton: Java Was Strongly Influenced by Objective-C [53] (no date).
• David Bank: The Java Saga [54], Wired Issue 3.12 (December 1995).
• Shahrooz Feizabadi: A history of Java [55] in: Marc Abrams, ed., World Wide Web - Beyond the Basics, Prentice Hall, 1998.
• Patrick Naughton: The Long Strange Trip to Java [56], March 18, 1996.
• Open University (UK): M254 Java Everywhere [57] (free open content documents).
• is-research GmbH: List of programming languages for a Java Virtual Machine [58].
• How Java's Floating-Point Hurts Everyone Everywhere [59], by W. Kahan and Joseph D. Darcy, University of California, Berkeley.

References

[1] Java 5.0 added several new language features (the enhanced for loop, autoboxing, varargs and annotations), after they were introduced in the similar (and competing) C# language (http://www.barrycornelius.com/papers/java5/) (http://www.levenez.com/lang/)
[2] "About Microsoft's "Delegates"" (http://java.sun.com/docs/white/delegates.html). Retrieved 2010-01-11. "We looked very carefully at Delphi Object Pascal and built a working prototype of bound method references in order to understand their interaction with the Java programming language and its APIs. [...] Our conclusion was that bound method references are unnecessary and detrimental to the language. This decision was made in consultation with Borland International, who had previous experience with bound method references in Delphi Object Pascal."
[3] "The Java Language Environment" (http://java.sun.com/docs/white/langenv/Intro.doc1.html#943). May 1996.
[4] "The Java Language Specification, 2nd Edition" (http://java.sun.com/docs/books/jls/second_edition/html/intro.doc.html#237601).
[5] http://www.computerworld.com.au/index.php/id;1422447371;pp;3;fp;4194304;fpid;1
[6] Patrick Naughton cites Objective-C as a strong influence on the design of the Java programming language, stating that notable direct derivatives include Java interfaces (derived from Objective-C's protocol) and primitive wrapper classes. (http://cs.gmu.edu/~sean/stuff/java-objc.html)
[7] TechMetrix Research (1999). "History of Java" (http://www.fscript.org/prof/javapassport.pdf). Java Application Servers Report. "The project went ahead under the name "green" and the language was based on an old model of UCSD Pascal, which makes it possible to generate interpretive code"
[8] http://queue.acm.org/detail.cfm?id=1017013
[9] http://java.sun.com
[10] "Programming Language Popularity" (http://www.langpop.com/). 2009. Retrieved 2009-01-16.
[11] "TIOBE Programming Community Index" (http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html). 2009. Retrieved 2009-05-06.



[12] Jon Byous, Java technology: The early years (http://java.sun.com/features/1998/05/birthday.html). Sun Developer Network, no date [ca. 1998]. Retrieved April 22, 2005.
[13] http://blogs.sun.com/jonathan/entry/better_is_always_different
[14] Heinz Kabutz, Once Upon an Oak (http://www.artima.com/weblogs/viewpost.jsp?thread=7555). Artima. Retrieved April 29, 2007.
[15] Java Study Group (http://www.open-std.org/JTC1/SC22/JSG/); Why Java Was - Not - Standardized Twice (http://csdl2.computer.org/comp/proceedings/hicss/2001/0981/05/09815015.pdf); What is ECMA—and why Microsoft cares (http://techupdate.zdnet.com/techupdate/stories/main/0,14179,2832719,00.html)
[16] Java Community Process website (http://www.jcp.org/en/home/index)
[17] open.itworld.com - JAVAONE: Sun - The bulk of Java is open sourced (http://open.itworld.com/4915/070508opsjava/page_1.html)
[18] "Sun's Evolving Role as Java Evangelist" (http://onjava.com/pub/a/onjava/2002/04/17/evangelism.html). O'Reilly.
[19] 1.2 Design Goals of the Java Programming Language (http://java.sun.com/docs/white/langenv/Intro.doc2.html)
[20] Andy Patrizio (2006). "Sun Embraces Linux With New Java License" (http://www.internetnews.com/dev-news/article.php/3606656). Internet News. Web Media Brands. Retrieved 2009-09-29.
[21] "Java for Mac OS X" (http://developer.apple.com/java/). Apple Developer Connection. Apple. Retrieved 2009-09-29.
[22] http://www.microsoft.com/mscorp/java/default.mspx
[23] Java SE - Licensees (http://java.sun.com/javase/licensees.jsp)
[24] James Niccolai (January 23, 2001). "Sun, Microsoft settle Java lawsuit" (http://www.javaworld.com/javaworld/jw-01-2001/jw-0124-iw-mssuncourt.html). JavaWorld (IDG). Retrieved 2008-07-09.
[25] Jelovic, Dejan. "Why Java Will Always Be Slower than C++" (http://www.jelovic.com/articles/why_java_is_slow.htm). Retrieved 2008-02-15.
[26] "Symantec's Just-In-Time Java Compiler To Be Integrated Into Sun JDK 1.1" (http://www.symantec.com/about/news/release/article.jsp?prid=19970407_03).
[27] "Apple Licenses Symantec's Just In Time (JIT) Compiler To Accelerate Mac OS Runtime For Java" (http://findarticles.com/p/articles/mi_hb6676/is_/ai_n26150624).
[28] "Java gets four times faster with new Symantec just-in-time compiler" (http://www.infoworld.com/cgi-bin/displayStory.pl?980416.ehjdk.htm).
[29] Official JStik Website (http://www.jstik.com/)
[30] aJile Systems Inc. (http://www.ajile.com/index.php?option=com_content&task=view&id=21&Itemid=28)
[31] NullPointerException (http://java.sun.com/j2se/1.4.2/docs/api/java/lang/NullPointerException.html)
[32] Exceptions in Java (http://www.artima.com/designtechniques/exceptions.html)
[33] Using the applet Tag (The Java Tutorials > Deployment > Applets) (http://java.sun.com/docs/books/tutorial/deployment/applet/applettag.html)
[34] Deploying Applets in a Mixed-Browser Environment (The Java Tutorials > Deployment > Applets) (http://java.sun.com/docs/books/tutorial/deployment/applet/mixedbrowser.html)
[35] http://java.sun.com/javaee/6/docs/api/javax/servlet/package-summary.html
[36] http://java.sun.com/javaee/6/docs/api/javax/servlet/GenericServlet.html
[37] http://java.sun.com/javaee/6/docs/api/javax/servlet/Servlet.html#service(javax.servlet.ServletRequest,javax.servlet.ServletResponse)
[38] http://java.sun.com/javaee/6/docs/api/javax/servlet/Servlet.html
[39] http://java.sun.com/javaee/6/docs/api/javax/servlet/ServletRequest.html
[40] http://java.sun.com/javaee/6/docs/api/javax/servlet/ServletResponse.html
[41] http://java.sun.com/javaee/6/docs/api/javax/servlet/ServletException.html
[42] http://java.sun.com/javaee/6/docs/api/javax/servlet/ServletResponse.html#setContentType(java.lang.String)
[43] http://java.sun.com/javaee/6/docs/api/javax/servlet/ServletResponse.html#getWriter()
[44] http://java.sun.com/features/1998/05/birthday.html
[45] https://duke.dev.java.net/green/
[46] http://java.sun.com/docs/books/jls/index.html
[47] http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html
[48] http://www.java.com/
[49] http://java.sun.com/
[50] http://www.ccjpint.org/
[51] http://java.sun.com/docs/books/jls/third_edition/html/j3TOC.html
[52] http://sunsite.uakom.sk/sunworldonline/swol-07-1995/swol-07-java.html
[53] http://cs.gmu.edu/~sean/stuff/java-objc.html
[54] http://www.wired.com/wired/archive/3.12/java.saga.html
[55] http://ei.cs.vt.edu/~wwwbtb/book/chap1/java_hist.html
[56] http://www.blinkenlights.com/classiccmp/javaorigin.html
[57] http://computing.open.ac.uk/m254/
[58] http://www.is-research.de/info/vmlanguages/


[59] http://www.eecs.berkeley.edu/~wkahan/JAVAhurt.pdf

.NET Framework

Developer(s): Microsoft
Initial release: February 13, 2002
Stable release: 4.0.30319.1 (4.0) / 12 April 2010
Operating system: Windows XP SP3, Windows Vista SP1, Windows 7, Windows Server 2008
Type: Software framework
License: MS-EULA, BCL under Microsoft Reference Source License [1]
Website: msdn.microsoft.com/netframework [2]

The Microsoft .NET Framework is a software framework that can be installed on computers running Microsoft Windows operating systems. It includes a large library of coded solutions to common programming problems and a virtual machine that manages the execution of programs written specifically for the framework. The .NET Framework is a Microsoft offering and is intended to be used by most new applications created for the Windows platform. The framework's Base Class Library provides a large range of features including user interface, data access, database connectivity, cryptography, web application development, numeric algorithms, and network communications. The class library is used by programmers, who combine it with their own code to produce applications. Programs written for the .NET Framework execute in a software environment that manages the program's runtime requirements. Also part of the .NET Framework, this runtime environment is known as the Common Language Runtime (CLR). The CLR provides the appearance of an application virtual machine so that programmers need not consider the capabilities of the specific CPU that will execute the program. The CLR also provides other important services such as security, memory management, and exception handling. The class library and the CLR together constitute the .NET Framework. Version 3.0 of the .NET Framework is included with Windows Server 2008 and Windows Vista. The previous stable version of the framework, 3.5, is included with Windows 7, and can also be installed on Windows XP and the Windows Server 2003 family of operating systems.[3] Version 4 of the framework was released as a public beta on 20 May 2009.[4] In February 2010, Microsoft released a .NET Framework 4 release candidate.[5] On April 12 2010, the final version of the .NET Framework 4 was released. The .NET Framework family also includes two versions for mobile or embedded device use. 
A reduced version of the framework, the .NET Compact Framework, is available on Windows CE platforms, including Windows Mobile devices such as smartphones. Additionally, the .NET Micro Framework is targeted at severely resource constrained devices.



Principal design features

Interoperability
Because interaction between new and older applications is commonly required, the .NET Framework provides means to access functionality that is implemented in programs that execute outside the .NET environment. Access to COM components is provided in the System.Runtime.InteropServices and System.EnterpriseServices namespaces of the framework; access to other functionality is provided using the P/Invoke feature.

Common Runtime Engine
The Common Language Runtime (CLR) is the virtual machine component of the .NET framework. All .NET programs execute under the supervision of the CLR, guaranteeing certain properties and behaviors in the areas of memory management, security, and exception handling.

Language Independence
The .NET Framework introduces a Common Type System, or CTS. The CTS specification defines all possible datatypes and programming constructs supported by the CLR and how they may or may not interact with each other conforming to the Common Language Infrastructure (CLI) specification. Because of this feature, the .NET Framework supports the exchange of types and object instances between libraries and applications written using any conforming .NET language.

Base Class Library
The Base Class Library (BCL), part of the Framework Class Library (FCL), is a library of functionality available to all languages using the .NET Framework. The BCL provides classes which encapsulate a number of common functions, including file reading and writing, graphic rendering, database interaction, XML document manipulation and so on.

Simplified Deployment
The .NET framework includes design features and tools that help manage the installation of computer software to ensure that it does not interfere with previously installed software, and that it conforms to security requirements.

Security
The design is meant to address some of the vulnerabilities, such as buffer overflows, that have been exploited by malicious software. Additionally, .NET provides a common security model for all applications.

Portability
The design of the .NET Framework allows it to theoretically be platform agnostic, and thus cross-platform compatible. That is, a program written to use the framework should run without change on any type of system for which the framework is implemented. While Microsoft has never implemented the full framework on any system except Microsoft Windows, the framework is engineered to be platform agnostic,[6] and cross-platform implementations are available for other operating systems (see Silverlight and the Alternative implementations section below). Microsoft submitted the specifications for the Common Language Infrastructure (which includes the core class libraries, Common Type System, and the Common Intermediate Language),[7] [8] [9] the C# language,[10] and the C++/CLI language[11] to both ECMA and the ISO, making them available as open standards. This makes it possible for third parties to create compatible implementations of the framework and its languages on other platforms.


Architecture

Common Language Infrastructure (CLI)

The purpose of the Common Language Infrastructure, or CLI, is to provide a language-neutral platform for application development and execution, including functions for exception handling, garbage collection, security, and interoperability. By implementing the core aspects of the .NET Framework within the scope of the CLR, this functionality will not be tied to a single language but will be available across the many languages supported by the framework. Microsoft's implementation of the CLI is called the Common Language Runtime, or CLR.

Assemblies

Visual overview of the Common Language Infrastructure (CLI)

The CIL code is housed in .NET assemblies. As mandated by specification, assemblies are stored in the Portable Executable (PE) format, common on the Windows platform for all DLL and EXE files. The assembly consists of one or more files, one of which must contain the manifest, which has the metadata for the assembly. The complete name of an assembly (not to be confused with the filename on disk) contains its simple text name, version number, culture, and public key token. The public key token is a unique hash generated when the assembly is compiled, thus two assemblies with the same public key token are guaranteed to be identical from the point of view of the framework. A private key can also be specified known only to the creator of the assembly and can be used for strong naming and to guarantee that the assembly is from the same author when a new version of the assembly is compiled (required to add an assembly to the Global Assembly Cache).

Metadata

All CIL is self-describing through .NET metadata. The CLR checks the metadata to ensure that the correct method is called. Metadata is usually generated by language compilers but developers can create their own metadata through custom attributes. Metadata contains information about the assembly, and is also used to implement the reflective programming capabilities of .NET Framework.

Security

.NET has its own security mechanism with two general features: Code Access Security (CAS), and validation and verification. Code Access Security is based on evidence that is associated with a specific assembly. Typically the evidence is the source of the assembly (whether it is installed on the local machine or has been downloaded from the intranet or Internet). Code Access Security uses evidence to determine the permissions granted to the code. Other code can demand that calling code is granted a specified permission. The demand causes the CLR to perform a call


stack walk: every assembly of each method in the call stack is checked for the required permission; if any assembly is not granted the permission a security exception is thrown.

When an assembly is loaded the CLR performs various tests. Two such tests are validation and verification. During validation the CLR checks that the assembly contains valid metadata and CIL, and whether the internal tables are correct. Verification is not so exact. The verification mechanism checks to see if the code does anything that is 'unsafe'. The algorithm used is quite conservative; hence occasionally code that is 'safe' does not pass. Unsafe code will only be executed if the assembly has the 'skip verification' permission, which generally means code that is installed on the local machine.

.NET Framework uses AppDomains as a mechanism for isolating code running in a process. AppDomains can be created, and code can be loaded into or unloaded from them, independently of other AppDomains. This helps increase the fault tolerance of the application, as faults or crashes in one AppDomain do not affect the rest of the application. AppDomains can also be configured independently with different security privileges. This can help increase the security of the application by isolating potentially unsafe code. The developer, however, has to split the application into subdomains; it is not done by the CLR.

Class library

Namespaces in the BCL[12]
System
System.CodeDom
System.Collections
System.Diagnostics
System.Globalization
System.IO
System.Resources
System.Text
System.Text.RegularExpressions

The .NET Framework includes a set of standard class libraries. The class library is organized in a hierarchy of namespaces. Most of the built-in APIs are part of either the System.* or Microsoft.* namespaces. These class libraries implement a large number of common functions, such as file reading and writing, graphic rendering, database interaction, and XML document manipulation, among others. The .NET class libraries are available to all CLI compliant languages. The .NET Framework class library is divided into two parts: the Base Class Library and the Framework Class Library.

The Base Class Library (BCL) includes a small subset of the entire class library and is the core set of classes that serve as the basic API of the Common Language Runtime.[12] The classes in mscorlib.dll and some of the classes in System.dll and System.Core.dll are considered to be a part of the BCL. The BCL classes are available in both the .NET Framework and its alternative implementations, including the .NET Compact Framework, Microsoft Silverlight and Mono.

The Framework Class Library (FCL) is a superset of the BCL classes and refers to the entire class library that ships with .NET Framework. It includes an expanded set of libraries, including Windows Forms, ADO.NET, ASP.NET, Language Integrated Query, Windows Presentation Foundation, and Windows Communication Foundation, among others. The FCL is much larger in scope than standard libraries for languages like C++, and comparable in scope to the standard libraries of Java.


.NET Framework

Memory management

The .NET Framework CLR frees the developer from the burden of managing memory (allocating and freeing up when done); instead it does the memory management itself. To this end, the memory allocated to instantiations of .NET types (objects) is done contiguously[13] from the managed heap, a pool of memory managed by the CLR. As long as there exists a reference to an object, which might be either a direct reference to an object or via a graph of objects, the object is considered to be in use by the CLR. When there is no reference to an object, and it cannot be reached or used, it becomes garbage. However, it still holds on to the memory allocated to it. .NET Framework includes a garbage collector which runs periodically, on a separate thread from the application's thread, that enumerates all the unusable objects and reclaims the memory allocated to them.

The .NET Garbage Collector (GC) is a non-deterministic, compacting, mark-and-sweep garbage collector. The GC runs only when a certain amount of memory has been used or there is enough pressure for memory on the system. Since it is not guaranteed when the conditions to reclaim memory are reached, the GC runs are non-deterministic. Each .NET application has a set of roots, which are pointers to objects on the managed heap (managed objects). These include references to static objects and objects defined as local variables or method parameters currently in scope, as well as objects referred to by CPU registers.[13] When the GC runs, it pauses the application, and for each object referred to in the root, it recursively enumerates all the objects reachable from the root objects and marks them as reachable. It uses .NET metadata and reflection to discover the objects encapsulated by an object, and then recursively walks them. It then enumerates all the objects on the heap (which were initially allocated contiguously) using reflection. All objects not marked as reachable are garbage.[13] This is the mark phase.[14]

Since the memory held by garbage is not of any consequence, it is considered free space. However, this leaves chunks of free space between objects which were initially contiguous. The objects are then compacted together, by using memcpy[14] to copy them over to the free space to make them contiguous again.[13] Any reference to an object invalidated by moving the object is updated to reflect the new location by the GC.[14] The application is resumed after the garbage collection is over.

The GC used by .NET Framework is actually generational.[15] Objects are assigned a generation; newly created objects belong to Generation 0. The objects that survive a garbage collection are tagged as Generation 1, and the Generation 1 objects that survive another collection are Generation 2 objects. The .NET Framework uses up to Generation 2 objects.[15] Higher generation objects are garbage collected less frequently than lower generation objects. This helps increase the efficiency of garbage collection, as older objects tend to have a larger lifetime than newer objects.[15] Thus, by removing older (and thus more likely to survive a collection) objects from the scope of a collection run, fewer objects need to be checked and compacted.[15]
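The mark phase described above can be illustrated with a toy model (written in Java, the language used for code examples elsewhere in this book; the ToyObject class and mark method are illustrative sketches, not the CLR's actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

// A toy heap object: it holds references to other objects and a mark bit.
class ToyObject {
    final List<ToyObject> references = new ArrayList<ToyObject>();
    boolean marked = false;
}

public class MarkPhaseDemo {
    // Starting from a root, recursively mark every reachable object.
    static void mark(ToyObject obj) {
        if (obj == null || obj.marked) return;  // already visited (or null)
        obj.marked = true;                      // mark as reachable
        for (ToyObject child : obj.references) {
            mark(child);                        // walk the object graph
        }
    }

    public static void main(String[] args) {
        ToyObject root = new ToyObject();       // stands in for a GC root
        ToyObject held = new ToyObject();
        ToyObject garbage = new ToyObject();    // nothing references this one
        root.references.add(held);

        mark(root);                             // run the mark phase from the root

        System.out.println(held.marked);        // prints true: reachable from a root
        System.out.println(garbage.marked);     // prints false: would be reclaimed
    }
}
```

Everything left unmarked after this traversal corresponds to the "garbage" the sweep/compact steps then reclaim; a real collector also tracks roots in registers and stack frames, which this sketch omits.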

Standardization and licensing

In August 2000, Microsoft, Hewlett-Packard, and Intel worked to standardize CLI and the C# programming language. By December 2001, both were ratified ECMA standards (ECMA 335 [16] and ECMA 334 [23]). ISO followed in April 2003 - the current versions of the ISO standards are ISO/IEC 23271:2006 and ISO/IEC 23270:2006.[17] [18]

While Microsoft and their partners hold patents for the CLI and C#, ECMA and ISO require that all patents essential to implementation be made available under "reasonable and non-discriminatory terms". In addition to meeting these terms, the companies have agreed to make the patents available royalty-free. However, this does not apply for the part of the .NET Framework which is not covered by the ECMA/ISO standard, which includes Windows Forms, ADO.NET, and ASP.NET. Patents that Microsoft holds in these areas may deter non-Microsoft implementations of the full framework.[19]

On 3 October 2007, Microsoft announced that much of the source code for the .NET Framework Base Class Library (including ASP.NET, ADO.NET, and Windows Presentation Foundation) was to have been made available with the


final release of Visual Studio 2008 towards the end of 2007, under the shared-source Microsoft Reference License.[1] The source code for other libraries, including Windows Communication Foundation (WCF), Windows Workflow Foundation (WF), and Language Integrated Query (LINQ), was to be added in future releases. Being released under the non-open-source Microsoft Reference License means this source code is made available for debugging purposes only, primarily to support integrated debugging of the BCL in Visual Studio.

Versions

Microsoft started development on the .NET Framework in the late 1990s originally under the name of Next Generation Windows Services (NGWS). By late 2000 the first beta versions of .NET 1.0 were released.[20]

The .NET Framework stack.

Version  Version Number  Release Date  Visual Studio            Default in Windows
1.0      1.0.3705.0      2002-02-13    Visual Studio .NET       -
1.1      1.1.4322.573    2003-04-24    Visual Studio .NET 2003  Windows Server 2003
2.0      2.0.50727.42    2005-11-07    Visual Studio 2005       -
3.0      3.0.4506.30     2006-11-06    -                        Windows Vista, Windows Server 2008
3.5      3.5.21022.8     2007-11-19    Visual Studio 2008       Windows 7, Windows Server 2008 R2
4.0      4.0.30319.1     2010-04-12    Visual Studio 2010       -

A more complete listing of the releases of the .NET Framework may be found on the .NET Framework version list.



.NET Framework 1.0

This is the first release of the .NET Framework, released on 13 February 2002 and available for Windows 98, Me, NT 4.0, 2000, and XP. Mainstream support by Microsoft for this version ended 10 July 2007, and extended support ended 14 July 2009.[21]

.NET Framework 1.1

The old .NET Framework logo

This is the first major .NET Framework upgrade. It is available on its own as a redistributable package or in a software development kit, and was published on 3 April 2003. It is also part of the second release of Microsoft Visual Studio .NET (released as Visual Studio .NET 2003). This is the first version of the .NET Framework to be included as part of the Windows operating system, shipping with Windows Server 2003. Mainstream support for .NET Framework 1.1 ended on 14 October 2008, and extended support ends on 8 October 2013. Since .NET 1.1 is a component of Windows Server 2003, extended support for .NET 1.1 on Server 2003 will run out with that of the OS - currently 14 July 2015.

Changes in 1.1 in comparison with 1.0

• Built-in support for mobile ASP.NET controls. Previously available as an add-on for .NET Framework, now part of the framework.
• Security changes - enable Windows Forms assemblies to execute in a semi-trusted manner from the Internet, and enable Code Access Security in ASP.NET applications.
• Built-in support for ODBC and Oracle databases. Previously available as an add-on for .NET Framework 1.0, now part of the framework.
• .NET Compact Framework - a version of the .NET Framework for small devices.
• Internet Protocol version 6 (IPv6) support.
• Numerous API changes.

.NET Framework 2.0

Released with Visual Studio 2005, Microsoft SQL Server 2005, and BizTalk 2006.
• The 2.0 Redistributable Package can be downloaded for free from Microsoft [22], and was published on 22 January 2006.
• The 2.0 Software Development Kit (SDK) can be downloaded for free from Microsoft [23].
• It is included as part of Visual Studio 2005 and Microsoft SQL Server 2005.
• Version 2.0 without any Service Pack is the last version with support for Windows 98 and Windows Me. Version 2.0 with Service Pack 2 is the last version with official support for Windows 2000, although some unofficial workarounds have been published online to use a subset of the functionality from version 3.5 on Windows 2000.[24] Version 2.0 with Service Pack 2 requires Windows 2000 with SP4 plus the KB835732 or KB891861 update, Windows XP with SP2 or later, and Windows Installer 3.1 (KB893803-v2).
• It shipped with Windows Server 2003 R2 (not installed by default).


Changes in 2.0 compared with 1.1
• Numerous API changes.
• A new hosting API for native applications wishing to host an instance of the .NET runtime. The new API gives fine-grained control over the behavior of the runtime with regard to multithreading, memory allocation, assembly loading and more (detailed reference [25]). It was initially developed to efficiently host the runtime in Microsoft SQL Server, which implements its own scheduler and memory manager.
• Full 64-bit support for both the x64 and the IA-64 hardware platforms.
• Language support for generics built directly into the .NET CLR.
• Many additional and improved ASP.NET web controls.
• New data controls with declarative data binding.
• New personalization features for ASP.NET, such as support for themes, skins and webparts.
• .NET Micro Framework - a version of the .NET Framework related to the Smart Personal Objects Technology initiative.
• Partial classes
• Anonymous methods
• Data tables
• Generics
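Three of the C# 2.0-era language additions listed above (generics, anonymous methods, and partial classes) can be sketched in a few lines; the Widget type here is purely illustrative:

```csharp
using System;
using System.Collections.Generic;

// Partial classes: one type split across what would normally be two files;
// the compiler merges both halves into a single class.
partial class Widget { public int Id; }
partial class Widget { public override string ToString() { return "Widget " + Id; } }

class Program
{
    static void Main()
    {
        // Generics: a strongly typed collection, no casts from object required.
        List<int> numbers = new List<int>();
        numbers.Add(3); numbers.Add(1); numbers.Add(2);

        // Anonymous method: an inline delegate body (the C# 2.0 syntax
        // that lambda expressions later shortened).
        numbers.Sort(delegate(int a, int b) { return a.CompareTo(b); });

        foreach (int n in numbers) Console.Write(n + " "); // prints: 1 2 3
        Console.WriteLine();

        Widget w = new Widget();
        w.Id = 7;
        Console.WriteLine(w); // prints: Widget 7
    }
}
```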

.NET Framework 3.0

.NET Framework 3.0, formerly called WinFX,[26] was released on 21 November 2006. It includes a new set of managed code APIs that are an integral part of the Windows Vista and Windows Server 2008 operating systems. It is also available for Windows XP SP2 and Windows Server 2003 as a download. There are no major architectural changes included with this release; .NET Framework 3.0 uses the Common Language Runtime of .NET Framework 2.0.[27] Unlike the previous major .NET releases, there was no .NET Compact Framework release made as a counterpart of this version. .NET Framework 3.0 consists of four major new components:
• Windows Presentation Foundation (WPF), formerly code-named Avalon; a new user interface subsystem and API based on XML and vector graphics, which uses 3D computer graphics hardware and Direct3D technologies. See the WPF SDK [28] for developer articles and documentation on WPF.
• Windows Communication Foundation (WCF), formerly code-named Indigo; a service-oriented messaging system which allows programs to interoperate locally or remotely, similar to web services.
• Windows Workflow Foundation (WF), which allows the building of task automation and integrated transactions using workflows.
• Windows CardSpace, formerly code-named InfoCard; a software component which securely stores a person's digital identities and provides a unified interface for choosing the identity for a particular transaction, such as logging in to a website.

.NET Framework 3.5

Version 3.5 of the .NET Framework was released on 19 November 2007, but it is not included with Windows Server 2008. As with .NET Framework 3.0, version 3.5 uses the CLR of version 2.0. In addition, it installs .NET Framework 2.0 SP1 (.NET Framework 2.0 SP2 with 3.5 SP1) and .NET Framework 3.0 SP1 (.NET Framework 3.0 SP2 with 3.5 SP1), which add to the BCL classes in version 2.0 some methods and properties required for version 3.5 features such as Language Integrated Query (LINQ). These changes do not affect applications written for version 2.0, however.[29] As with previous versions, a new .NET Compact Framework 3.5 was released in tandem with this update in order to provide support for additional features on Windows Mobile and Windows Embedded CE devices.



The source code of the Base Class Library in this version has been partially released (for debugging reference only) under the Microsoft Reference Source License.[1]

Changes since version 3.0
• New language features in the C# 3.0 and VB.NET 9.0 compilers
• Adds support for expression trees and lambda methods
• Extension methods
• Expression trees to represent high-level source code at runtime[30]
• Anonymous types with static type inference
• Language Integrated Query (LINQ) along with its various providers
  • LINQ to Objects
  • LINQ to XML
  • LINQ to SQL
• Paging support for ADO.NET
• ADO.NET synchronization API to synchronize local caches and server-side datastores
• Asynchronous network I/O API[30]
• Peer-to-peer networking stack, including a managed PNRP resolver[31]
• Managed wrappers for Windows Management Instrumentation and Active Directory APIs[32]
• Enhanced WCF and WF runtimes, which let WCF work with POX and JSON data, and also expose WF workflows as WCF services.[33] WCF services can be made stateful using the WF persistence model.[30]
• Support for HTTP pipelining and syndication feeds[33]
• ASP.NET AJAX is included

Service Pack 1

The .NET Framework 3.5 Service Pack 1 was released on 11 August 2008. This release adds new functionality and provides performance improvements under certain conditions,[34] especially with WPF, where 20-45% improvements are expected. Two new data service components have been added, the ADO.NET Entity Framework and ADO.NET Data Services. Two new assemblies for web development, System.Web.Abstractions and System.Web.Routing, have been added; these are used in the ASP.NET MVC Framework and, reportedly, will be utilized in a future release of ASP.NET Forms applications. Service Pack 1 is included with SQL Server 2008 and Visual Studio 2008 Service Pack 1. It also featured a new set of controls called "Visual Basic Power Packs" which brought back Visual Basic controls such as "Line" and "Shape".

.NET Framework 3.5 SP1 Client Profile

For the .NET Framework 3.5 SP1 there is also a new variant of the .NET Framework, called the ".NET Framework Client Profile", which at 28 MB is significantly smaller than the full framework and only installs the components that are most relevant to desktop applications.[35] However, the Client Profile amounts to this size only when using the online installer on Windows XP SP2 with no other .NET Framework versions installed. When using the offline installer or any other OS, the download size is still 250 MB.[36]
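The C# 3.0 features that ship with .NET 3.5 compose naturally. The following minimal LINQ-to-Objects sketch combines a lambda expression, an extension method, and an anonymous type; the Shout method and the sample data are illustrative only:

```csharp
using System;
using System.Linq;

static class StringExtensions
{
    // Extension method: callable as if it were an instance method on string.
    public static string Shout(this string s) { return s.ToUpper() + "!"; }
}

class Program
{
    static void Main()
    {
        int[] numbers = { 5, 10, 8, 3, 6, 12 };

        // LINQ to Objects: lambdas filter and order the sequence, and
        // Select projects each element into an anonymous type whose
        // shape is inferred statically (hence "var").
        var evens = numbers
            .Where(n => n % 2 == 0)
            .OrderBy(n => n)
            .Select(n => new { Value = n, Square = n * n });

        foreach (var item in evens)
            Console.WriteLine(item.Value + " -> " + item.Square);
        // prints:
        // 6 -> 36
        // 8 -> 64
        // 10 -> 100
        // 12 -> 144

        Console.WriteLine("linq".Shout()); // prints: LINQ!
    }
}
```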





.NET Framework 4

Microsoft announced the .NET Framework 4 on 29 September 2008.[37] The Public Beta was released on 20 May 2009.[4] Some focuses of this release are:
• Parallel Extensions to improve support for parallel computing, which target multi-core or distributed systems.[38] To this end, they plan to include technologies like PLINQ (Parallel LINQ),[39] a parallel implementation of the LINQ engine, and the Task Parallel Library, which exposes parallel constructs via method calls.[40]
• New Visual Basic .NET and C# language features, such as statement lambdas, implicit line continuations, dynamic dispatch, named parameters, and optional parameters.
• Full support for IronPython, IronRuby, and F#.[41]
• Support for a subset of the .NET Framework and ASP.NET with the "Server Core" variant of Windows Server 2008 R2.[42]
• Support for Code Contracts.[43]
• Inclusion of the Oslo modelling platform, along with the M programming language.[44]
• Inclusion of new types to work with arbitrary-precision arithmetic (System.Numerics.BigInteger [45]) and complex numbers (System.Numerics.Complex [46]).

On 28 July 2009, a second release [47] of the .NET Framework 4 beta was made available with experimental software transactional memory support.[48] Whether this functionality will be available in the final version of the framework has not been confirmed. On 19 October 2009, Microsoft released Beta 2 of the .NET Framework 4.[49] At the same time, Microsoft announced the expected launch date for .NET Framework 4 as 22 March 2010.[49] This launch date was subsequently delayed to 12 April 2010.[50] On 10 February 2010 a release candidate was published.[51] On 12 April 2010 the final version [52] of .NET Framework 4.0 was launched alongside the final release of Visual Studio 2010.

In conjunction with .NET Framework 4, Microsoft will offer a set of enhancements, codenamed Dublin, for Windows Server 2008 application server capabilities.[53] [54] Dublin will extend IIS to be a "standard host" for applications that use either WCF or WF.[54]
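A few of the .NET 4 additions listed above can be sketched together: PLINQ via AsParallel(), a C# 4.0 optional parameter passed by name, and System.Numerics.BigInteger. The Greet helper is illustrative, and a real project must reference the System.Numerics assembly:

```csharp
using System;
using System.Linq;
using System.Numerics;

class Program
{
    // C# 4.0: "greeting" is optional, and callers may pass arguments by name.
    static string Greet(string name, string greeting = "Hello")
    {
        return greeting + ", " + name;
    }

    static void Main()
    {
        // PLINQ: AsParallel() partitions the query across available cores;
        // the result matches the sequential query.
        long sumOfSquares = Enumerable.Range(1, 1000)
            .AsParallel()
            .Select(n => (long)n * n)
            .Sum();
        Console.WriteLine(sumOfSquares); // prints: 333833500

        // Arbitrary-precision arithmetic: 2^100 overflows long but not BigInteger.
        BigInteger big = BigInteger.Pow(2, 100);
        Console.WriteLine(big); // prints: 1267650600228229401496703205376

        // Named argument; the omitted "greeting" takes its default.
        Console.WriteLine(Greet(name: "world")); // prints: Hello, world
    }
}
```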

.NET vs. Java and Java EE

The CLI and .NET languages such as C# and VB have many similarities to Sun's JVM and Java. Both are based on a virtual machine model that hides the details of the computer hardware on which their programs run. Both use their own intermediate byte-code, Microsoft calling theirs Common Intermediate Language (CIL; formerly MSIL) and Sun calling theirs Java bytecode. On .NET the byte-code is always compiled before execution, either just in time (JIT) or in advance of execution using the Native Image Generator utility (NGEN). With Java the byte-code is either interpreted, compiled in advance, or compiled JIT. Both provide extensive class libraries that address many common programming requirements and address many security issues that are present in other approaches. The namespaces provided in the .NET Framework closely resemble the platform packages in the Java EE API Specification in style and invocation.

.NET in its complete form (i.e., Microsoft's implementation, described in the Standardization and licensing section of this article) can only be installed on computers running a Microsoft Windows operating system,[55] [56] [57] whereas Java in its entirety can be installed on computers running any one of a variety of operating systems such as Linux, Solaris, Mac OS or Windows.[58] From its beginning .NET has supported multiple programming languages and at its core remains platform agnostic and standardized so that other vendors can implement it on other platforms (although Microsoft's implementation only targets Windows, Windows CE, and Xbox platforms). The Java Virtual Machine was also designed to be both language and operating system agnostic[59] and was launched with the slogan "Write once, run anywhere." While Java has long remained the most used language on the JVM by a wide margin, recent support for dynamic languages has increased the popularity of alternatives, in particular JRuby, Scala, and Groovy[60] (see JVM languages).

Sun's reference implementation of Java (including the class library, the compiler, the virtual machine, and the various tools associated with the Java Platform) is open source under the GNU GPL license with the Classpath exception.[61] The source code for the .NET Framework base class library is available for reference purposes only under the Microsoft Reference License.[62] [63]

The third-party Mono Project, sponsored by Novell, has been developing an open source implementation of the ECMA standards that are part of the .NET Framework, as well as most of the other non-ECMA standardized libraries in Microsoft's .NET. The Mono implementation is meant to run on Linux, Solaris, Mac OS X, BSD, HP-UX, and Windows platforms. Mono includes the CLR, the class libraries, and compilers for C# and VB.NET. The current version supports all the APIs in version 2.0 of Microsoft's .NET. Full support exists for C# 3.0 LINQ to Objects and LINQ to XML.[64]

Criticism

Some concerns and criticism relating to .NET include:
• Applications running in a managed environment tend to require more system resources than similar applications that access machine resources more directly.
• Unobfuscated managed CIL bytecode can often be easier to reverse-engineer than native code.[65] [66] One concern is over possible loss of trade secrets and the bypassing of license control mechanisms. Since Visual Studio .NET (2002), Microsoft has included a tool to obfuscate code (Dotfuscator Community Edition).[67] Many other techniques can also help to prevent reverse-engineering.
• Newer versions of the framework (3.5 and up) are not pre-installed in versions of Windows below Windows 7. For this reason, applications must lead users without the framework through a procedure to install it. Some developers have expressed concerns about the large size of .NET Framework runtime installers for end-users. The size is around 54 MB for .NET 3.0, 197 MB for .NET 3.5, and 250 MB for .NET 3.5 SP1 (when using the web installer, the typical download is around 50 MB for Windows XP and around 20 MB for Windows Vista). The size issue is partially solved with the .NET 4 installer (x86 + x64) being 54 MB and not embedding full runtime installation packages for previous versions. The .NET 3.5 SP1 full installation package includes the full runtime installation packages for .NET 2.0 SP2 as well as .NET 3.0 SP2 for multiple operating systems (Windows XP/Server 2003 and Windows Vista/Server 2008) and for multiple CPU architectures (x86, x86-64, and IA-64).
• The first service pack for version 3.5 mitigates this concern by offering a lighter-weight client-only subset of the full .NET Framework. Two significant limitations should be noted, though.[68] Firstly, the client-only subset is only an option on an existing Windows XP SP2 system that currently has no other version of the .NET Framework installed. In all other scenarios, the client-only installer will install the full version of the .NET Framework 3.5 SP1. Secondly, the client-only framework does not have a 64-bit option. However, the .NET Framework 4 release of the Client Profile will be available on all operating systems and all architectures (excluding IA-64) supported by the full .NET Framework.[69]
• The .NET Framework currently does not provide support for calling Streaming SIMD Extensions (SSE) via managed code. However, Mono has provided support for SIMD extensions as of version 2.2 within the Mono.Simd namespace; Mono's lead developer Miguel de Icaza has expressed hope that this SIMD support will be adopted by the CLR ECMA standard.[70] Streaming SIMD Extensions have been available in x86 CPUs since the introduction of the Pentium III. Some other architectures, such as ARM and MIPS, also have SIMD extensions. In case the CPU lacks support for those extensions, the instructions are simulated in software.
• While the standards that make up .NET are inherently cross-platform, Microsoft's full implementation of .NET is only supported on Windows. Microsoft does provide limited .NET subsets for other platforms, such as XNA for Windows, Xbox 360 and Windows Phone 7, and Silverlight for Windows, Mac OS X, and Windows Phone 7. Alternative implementations of the CLR, base class libraries, and compilers also exist (sometimes from other vendors). While all of these implementations are based on the same standards, they are still different implementations with varying levels of completeness in comparison to the full .NET version Microsoft ships for Windows, and are on occasion incompatible.

Alternative implementations

The Microsoft .NET Framework is the predominant implementation of .NET technologies. Other implementations for parts of the framework exist. Although the runtime engine is described by an ECMA/ISO specification, other implementations of it may be encumbered by patent issues; ISO standards may include the disclaimer, "Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights."[71] It is more difficult to develop alternatives to the base class library (BCL), which is not described by an open standard and may be subject to copyright restrictions. Additionally, parts of the BCL have Windows-specific functionality and behavior, so implementation on non-Windows platforms can be problematic. Some alternative implementations of parts of the framework are listed here.
• Microsoft's Shared Source Common Language Infrastructure is a shared source implementation of the CLR component of the .NET Framework. However, the last version runs only on Microsoft Windows XP SP2, and does not contain all features of version 2.0 of the .NET Framework.
• Microsoft's .NET Micro Framework is a .NET platform for extremely resource-constrained devices. It includes a small version of the .NET CLR and supports development in C# and debugging (in an emulator or on hardware), both using Microsoft Visual Studio. It also features a subset of the .NET base class libraries (about 70 classes with about 420 methods), a GUI framework loosely based on Windows Presentation Foundation, and additional libraries specific to embedded applications.
• Mono is an implementation of the CLI and portions of the .NET Base Class Library (BCL), and provides additional functionality. It is dual-licensed under free software and proprietary software licenses. Mono is being developed by Novell, Inc. It includes support for ASP.NET, ADO.NET, and evolving support for Windows Forms libraries. It also includes a C# compiler, and a VB.NET compiler is in pre-beta form.
• CrossNet [72] is an implementation of the CLI and portions of the .NET Base Class Library (BCL). It is free software. It parses .NET assemblies and generates standard C++ code, compilable with any ANSI C++ compiler on any platform.
• Portable.NET (part of DotGNU) provides an implementation of the Common Language Infrastructure (CLI), portions of the .NET Base Class Library (BCL), and a C# compiler. It supports a variety of CPUs and operating systems.

See also
• Comparison of the Java and .NET platforms
• COM Interop
• Windows API
• .NET Compact Framework
• .NET Micro Framework




Components and libraries
• Microsoft Enterprise Library - a collection of supplemental libraries for .NET.
• Web Services Enhancements

External links
• .NET Framework Developer Center [2]

References

[1] Scott Guthrie. "Releasing the Source Code for the NET Framework" (http:/ / weblogs. asp. net/ scottgu/ archive/ 2007/ 10/ 03/ releasing-the-source-code-for-the-net-framework-libraries. aspx). . Retrieved 1 June 2008. [2] http:/ / msdn. microsoft. com/ netframework/ [3] Microsoft. "Microsoft .NET Framework 3.5 Administrator Deployment Guide" (http:/ / msdn. microsoft. com/ library/ cc160717. aspx). . Retrieved 26 June 2008. [4] S. Somasegar. "Visual Studio 2010 and .NET FX 4 Beta 1 ships!" (http:/ / www. webcitation. org/ 5h5lV7362). Archived from the original (http:/ / blogs. msdn. com/ somasegar/ archive/ 2009/ 05/ 18/ visual-studio-2010-and-net-fx-4-beta-1-ships. aspx) on 27 May 2009. . Retrieved 25 May 2009. [5] http:/ / www. infoworld. com/ d/ developer-world/ microsoft-offers-visual-studio-2010-release-candidate-643 [6] "Scott Guthrie: Silverlight and the Cross-Platform CLR" (http:/ / channel9. msdn. com/ shows/ Going+ Deep/ Scott-Guthrie-Silverlight-and-the-Cross-Platform-CLR). Channel 9. 30 April 2007. . [7] "ECMA 335 - Standard ECMA-335 Common Language Infrastructure (CLI)" (http:/ / www. ecma-international. org/ publications/ standards/ Ecma-335. htm). ECMA. 1 June 2006. . Retrieved 1 June 2008. [8] ISO/IEC 23271:2006 (http:/ / standards. iso. org/ ittf/ PubliclyAvailableStandards/ c042927_ISO_IEC_23271_2006(E)_Software. zip) [9] "Technical Report TR/84 Common Language Infrastructure (CLI) - Information Derived from Partition IV XML File" (http:/ / www. ecma-international. org/ publications/ techreports/ E-TR-084. htm). ECMA. 1 June 2006. . [10] "ECMA-334 C# Language Specification" (http:/ / www. ecma-international. org/ publications/ standards/ Ecma-334. htm). ECMA. 1 June 2006. . [11] "Standard ECMA-372 C++/CLI Language Specification" (http:/ / www. ecma-international. org/ publications/ standards/ Ecma-372. htm). ECMA. 1 December 2005. . [12] "Base Class Library" (http:/ / msdn. microsoft. com/ netframework/ aa569603. aspx). . 
Retrieved 1 June 2008. [13] "Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework" (http:/ / web. archive. org/ web/ 20070703083608/ http:/ / msdn. microsoft. com/ msdnmag/ issues/ 1100/ GCI/ ). Archived from the original (http:/ / msdn. microsoft. com/ msdnmag/ issues/ 1100/ GCI/ ) on 3 July 2007. . Retrieved 1 June 2008. [14] "Garbage collection in .NET" (http:/ / www. csharphelp. com/ archives2/ archive297. html). . Retrieved 1 June 2008. [15] "Garbage Collection—Part 2: Automatic Memory Management in the Microsoft .NET Framework" (http:/ / web. archive. org/ web/ 20070626080134/ http:/ / msdn. microsoft. com/ msdnmag/ issues/ 1200/ GCI2/ default. aspx). Archived from the original (http:/ / msdn. microsoft. com/ msdnmag/ issues/ 1200/ GCI2/ default. aspx) on 26 June 2007. . Retrieved 1 June 2008. [16] http:/ / www. ecma-international. org/ publications/ standards/ Ecma-335. htm [17] ISO/IEC 23271:2006 - Information technology - Common Language Infrastructure (CLI) Partitions I to VI (http:/ / www. iso. org/ iso/ iso_catalogue/ catalogue_ics/ catalogue_detail_ics. htm?csnumber=42927) [18] ISO/IEC 23270:2006 - Information technology - Programming languages - C# (http:/ / www. iso. org/ iso/ iso_catalogue/ catalogue_ics/ catalogue_detail_ics. htm?csnumber=42926) [19] "Microsoft's Empty Promise" (http:/ / www. webcitation. org/ 5io4tT8mO). Free Software Foundation. 16 July 2009. Archived from the original (http:/ / www. fsf. org/ news/ 2009-07-mscp-mono) on 5 August 2009. . Retrieved 3 August 2009. "However, there are several libraries that are included with Mono, and commonly used by applications like Tomboy, that are not required by the standard. And just to be clear, we're not talking about Windows-specific libraries like ASP.NET and Windows Forms. 
Instead, we're talking about libraries under the System namespace that provide common functionality programmers expect in modern programming languages" [20] "Framework Versions" (http:/ / ben. skyiv. com/ clrversion. html). . [21] "Microsoft Product Lifecycle Search" (http:/ / www. webcitation. org/ 5jZfjU0F1). Microsoft. Archived from the original (http:/ / support. microsoft. com/ lifecycle/ search/ ?sort=PN& alpha=. NET+ Framework) on 5 September 2009. . Retrieved 25 January 2008. [22] http:/ / www. microsoft. com/ downloads/ details. aspx?FamilyID=0856eacb-4362-4b0d-8edd-aab15c5e04f5& DisplayLang=en [23] http:/ / www. microsoft. com/ downloads/ details. aspx?FamilyID=fe6f2099-b7b4-4f47-a244-c96d69c35dec& DisplayLang=en [24] "Microsoft don" (http:/ / www. webcitation. org/ 5g9t5nE7k). Archived from the original (http:/ / rainstorms. me. uk/ blog/ 2008/ 03/ 12/ microsoft-net-framework-35-in-windows-2000/ ) on 19 April 2009. . Retrieved 5 March 2009. [25] http:/ / winfx. msdn. microsoft. com/ library/ en-us/ dv_fxunmanref/ html/ 703b8381-43db-4a4d-9faa-cca39302d922. asp [26] WinFX name change announcement (http:/ / blogs. msdn. com/ somasegar/ archive/ 2006/ 06/ 09/ 624300. aspx)



[27] ".NET Framework 3.0 Versioning and Deployment Q&A" (http:/ / msdn. microsoft. com/ netframework/ aa663314. aspx). . Retrieved 1 June 2008. [28] http:/ / msdn. microsoft. com/ en-us/ library/ ms754130. aspx [29] "Catching RedBits differences in .NET 2.0 and .NET 2.0SP1" (http:/ / www. hanselman. com/ blog/ CommentView. aspx?guid=7cd75505-192f-4fef-b617-e47e1e2cb94b). . Retrieved 1 June 2008. [30] "What's New in the .NET Framework 3.5" (http:/ / msdn. microsoft. com/ library/ bb332048. aspx). . Retrieved 31 March 2008. [31] Kevin Hoffman. "Orcas' Hidden Gem - The managed PNRP stack" (http:/ / dotnetaddict. dotnetdevelopersjournal. com/ orcas_hidden_gem__the_managed_pnrp_stack. htm). . Retrieved 1 June 2008. [32] ".NET Framework 3.5" (http:/ / www. danielmoth. com/ Blog/ 2007/ 06/ net-framework-35. html). . Retrieved 1 June 2008. [33] Matt Winkle. "WCF and WF in "Orcas"" (http:/ / blogs. msdn. com/ mwinkle/ archive/ 2007/ 02/ 28/ wcf-and-wf-in-quot-orcas-quot. aspx). . Retrieved 1 June 2008. [34] "Visual Studio 2008 Service Pack 1 and .NET Framework 3.5 Service Pack 1" (http:/ / msdn. microsoft. com/ vstudio/ products/ cc533447. aspx). . Retrieved 7 September 2008. [35] Justin Van Patten (21 May 2008). ".NET Framework Client Profile" (http:/ / blogs. msdn. com/ bclteam/ archive/ 2008/ 05/ 21/ net-framework-client-profile-justin-van-patten. aspx). BCL Team Blog. MSDN Blogs. . Retrieved 30 September 2008. [36] Jaime Rodriguez (20 August 2008). "Client profile explained.." (http:/ / blogs. msdn. com/ jaimer/ archive/ 2008/ 08/ 20/ client-profile-explained. aspx). . Retrieved 15 February 2009. [37] http:/ / www. microsoft. com/ downloads/ details. aspx?FamilyID=ee2118cc-51cd-46ad-ab17-af6fff7538c9& displaylang=en [38] S. Somasegar. "The world of multi and many cores" (http:/ / blogs. msdn. com/ somasegar/ archive/ 2007/ 05/ 09/ the-world-of-multi-and-many-cores. aspx). . Retrieved 1 June 2008. 
[39] "Parallel LINQ: Running Queries On Multi-Core Processors" (http:/ / msdn. microsoft. com/ magazine/ cc163329. aspx). . Retrieved 2 June 2008. [40] "Parallel Performance: Optimize Managed Code For Multi-Core Machines" (http:/ / msdn. microsoft. com/ magazine/ cc163340. aspx). . Retrieved 2 June 2008. [41] "Visual Studio 2010, .NET 4 Ppt Presentation" (http:/ / www. authorstream. com/ presentation/ amrik-155782-new-features-vs-2010-net-4-0-visual-studio-dot-vb-10-education-ppt-powerpoint/ ). authorSTREAM. 25 February 2009. . Retrieved 2 March 2009. [42] "PDC2008 Sessions Overview" (http:/ / www. microsoftpdc. com/ Agenda/ Sessions. aspx). Microsoft. 28 May 2008. . Retrieved 28 May 2008. [43] http:/ / msdn. microsoft. com/ en-us/ devlabs/ dd491992. aspx [44] SDTimes - Microsoft details Oslo's modeling language, tools (http:/ / www. sdtimes. com/ link/ 32957) [45] http:/ / msdn. microsoft. com/ library/ system. numerics. biginteger_members(VS. 100). aspx [46] http:/ / msdn. microsoft. com/ library/ system. numerics. complex(VS. 100). aspx [47] http:/ / msdn. microsoft. com/ en-us/ devlabs/ ee334183. aspx [48] "STM.NET on DevLabs" (http:/ / www. webcitation. org/ 5iwEjbaov). 27 July 2008. Archived from the original (http:/ / blogs. msdn. com/ somasegar/ archive/ 2009/ 07/ 27/ stm-net-in-devlabs. aspx) on 10 August 2009. . Retrieved 6 August 2008. [49] S. Somasegar. "Announcing Visual Studio 2010 and .NET FX 4 Beta 2" (http:/ / blogs. msdn. com/ somasegar/ archive/ 2009/ 10/ 19/ announcing-visual-studio-2010-and-net-fx-4-beta-2. aspx). MSDN Blogs. . Retrieved 20 October 2009. [50] Rob Caron. "Visual Studio 2010 and .NET Framework 4 Launch Date" (http:/ / blogs. msdn. com/ robcaron/ archive/ 2010/ 01/ 13/ 9948172. aspx). MSDN Blogs. . Retrieved 13 January 2010. [51] http:/ / www. microsoft. com/ downloads/ details. aspx?FamilyID=a9ef9a95-58d2-4e51-a4b7-bea3cc6962cb& displaylang=en [52] http:/ / www. microsoft. com/ downloads/ details. 
aspx?displaylang=en& FamilyID=0a391abd-25c1-4fc0-919f-b21f31ab88b7 [53] "'Dublin' App Server coming to .NET 4" (http:/ / www. devsource. com/ c/ a/ Architecture/ Dublin-App-Server-coming-toNET-40/ ). DevSource. . Retrieved 27 April 2009. [54] ".NET Framework 4 and Dublin Application Server" (http:/ / blogs. msdn. com/ architectsrule/ archive/ 2008/ 10/ 01/ net-framework-4-0-and-dublin-application-server. aspx). MSDN Blogs. . Retrieved 27 April 2009. [55] ".Net Framework 1.0 Redistributable Requirements" (http:/ / msdn. microsoft. com/ library/ ms994373. aspx). MSDN. . Retrieved 22 April 2007. [56] ".Net Framework 1.1 Redistributable Requirements" (http:/ / msdn. microsoft. com/ library/ ms994377. aspx). MSDN. . Retrieved 22 April 2007. [57] ".Net Framework 2.0 Redistributable Requirements" (http:/ / msdn. microsoft. com/ library/ aa480241. aspx). MSDN. . Retrieved 22 April 2007. [58] "JRE 5.0 Installation Notes" (http:/ / java. sun. com/ j2se/ 1. 5. 0/ jre/ install. html). Sun Developer Network. . Retrieved 22 April 2007. [59] "The Java VM Spec" (http:/ / java. sun. com/ docs/ books/ jvms/ first_edition/ html/ Introduction. doc. html). . Retrieved 17 October 2009. [60] "groovy, jruby, jython, rhino, scala, clojure Job Trends" (http:/ / www. indeed. com/ jobtrends?q=groovy,+ jruby,+ jython,+ rhino,+ scala,+ clojure). . Retrieved 17 October 2009. [61] "Free and Open Source Java" (http:/ / www. sun. com/ software/ opensource/ java/ ). Sun.com. . Retrieved 1 June 2008. [62] "Releasing the source code for the .Net framework libraries" (http:/ / weblogs. asp. net/ scottgu/ archive/ 2007/ 10/ 03/ releasing-the-source-code-for-the-net-framework-libraries. aspx). Scott Guthrie - Microsoft. .



[63] "Microsoft Reference License" (http:/ / www. microsoft. com/ resources/ sharedsource/ licensingbasics/ referencelicense. mspx). Microsoft. . [64] "FAQ: General" (http:/ / www. mono-project. com/ FAQ:_General). Mono. 20 December 2006. . Retrieved 1 June 2008. [65] "Reverse Engineering Risk Assessment" (http:/ / www. preemptive. com/ images/ documentation/ Reverse_Engineering_Risk_Assessment. pdf). . [66] Gartner, Inc. as reported in "Hype Cycle for Cyberthreats, 2006", September 2006, Neil MacDonald; Amrit Williams, et al. [67] Dotfuscator Community Edition 4.0 [68] .NET Framework Client Profile Deployment Scenarios (http:/ / download. microsoft. com/ download/ 5/ a/ a/ 5aa86d6c-969b-42d8-bc6b-30e02bfeccf0/ NETFXClientProfile_DeploymentGuide. htm#_Toc205028507) [69] "'.NET Framework 4 Client Profile - Introduction'" (http:/ / www. webcitation. org/ 5kHCQjtFS). Archived from the original (http:/ / blogs. msdn. com/ jgoldb/ archive/ 2009/ 05/ 27/ net-framework-4-client-profile-introduction. aspx) on 2009-10-04. . Retrieved 2009-10-02. [70] Mono's SIMD Support: Making Mono safe for Gaming (http:/ / tirania. org/ blog/ archive/ 2008/ Nov-03. html) [71] ISO 9001:2008, Foreword [72] http:/ / www. codeplex. com/ crossnet




Some Software Programming Concepts

Computer programming

Computer programming (often shortened to programming or coding) is the process of writing, testing, debugging/troubleshooting, and maintaining the source code of computer programs. This source code is written in a programming language. The code may be a modification of an existing source or something completely new. The purpose of programming is to create a program that exhibits a certain desired behaviour (customization). The process of writing source code often requires expertise in many different subjects, including knowledge of the application domain, specialized algorithms and formal logic.

Overview

Within software engineering, programming (the implementation) is regarded as one phase in a software development process. There is an ongoing debate on the extent to which the writing of programs is an art, a craft or an engineering discipline.[1] In general, good programming is considered to be the measured application of all three, with the goal of producing an efficient and evolvable software solution (the criteria for "efficient" and "evolvable" vary considerably). The discipline differs from many other technical professions in that programmers, in general, do not need to be licensed or pass any standardized (or governmentally regulated) certification tests in order to call themselves "programmers" or even "software engineers." However, representing oneself as a "Professional Software Engineer" without a license from an accredited institution is illegal in many parts of the world. Because the discipline covers many areas, which may or may not include critical applications, it is nonetheless debatable whether licensing is required for the profession as a whole. In most cases, the discipline is self-governed by the entities which require the programming, and sometimes very strict environments are defined (e.g. the United States Air Force's use of AdaCore and security clearance).

Another ongoing debate is the extent to which the programming language used in writing computer programs affects the form that the final program takes. This debate is analogous to that surrounding the Sapir-Whorf hypothesis [2] in linguistics, which postulates that a particular language's nature influences the habitual thought of its speakers. Different language patterns yield different patterns of thought. This idea challenges the possibility of representing the world perfectly with language, because it acknowledges that the mechanisms of any language condition the thoughts of its speaker community. 
Said another way, programming is the craft of transforming requirements into something that a computer can execute.



History of programming

The concept of devices that operate following a pre-defined set of instructions traces back to Greek Mythology, notably Hephaestus and his mechanical slaves.[3] The Antikythera mechanism was a calculator utilizing gears of various sizes and configuration to determine its operation. Al-Jazari built programmable Automata in 1206. One system employed in these devices was the use of pegs and cams placed into a wooden drum at specific locations, which would sequentially trigger levers that in turn operated percussion instruments. The output of this device was a small drummer playing various rhythms and drum patterns.[4] [5] The Jacquard Loom, which Joseph Marie Jacquard developed in 1801, uses a series of pasteboard cards with holes punched in them. The hole pattern represented the pattern that the loom had to follow in weaving cloth. The loom could produce entirely different weaves using different sets of cards. [Image: Wired plug board for an IBM 402 Accounting Machine.] Charles Babbage adopted the use of punched cards around 1830 to control his Analytical Engine. The synthesis of numerical calculation, predetermined operation and output, along with a way to organize and input instructions in a manner relatively easy for humans to conceive and produce, led to the modern development of computer programming. Development of computer programming accelerated through the Industrial Revolution. In the late 1880s, Herman Hollerith invented the recording of data on a medium that could then be read by a machine. Prior uses of machine readable media had been for control, not data. "After some initial trials with paper tape, he settled on punched cards..."[6] To process these punched cards, first known as "Hollerith cards", he invented the tabulator and the keypunch machines. These three inventions were the foundation of the modern information processing industry. In 1896 he founded the Tabulating Machine Company (which later became the core of IBM).
The addition of a control panel to his 1906 Type I Tabulator allowed it to do different jobs without having to be physically rebuilt. By the late 1940s, there were a variety of plug-board programmable machines, called unit record equipment, to perform data-processing tasks (card reading). Early computer programmers used plug-boards for the variety of complex calculations requested of the newly invented machines. The invention of the Von Neumann architecture allowed computer programs to be stored in computer memory. Early programs had to be painstakingly crafted using the instructions of the particular machine, often in binary notation. Every model of computer would be likely to need different instructions to do the same task. Later, assembly languages were developed that let the programmer specify each instruction in a text format, entering abbreviations for each operation code instead of a number and specifying addresses in symbolic form (e.g., ADD X, TOTAL). In 1954, Fortran was invented, the first high-level programming language to have a functional implementation.[7] [8] It allowed programmers to specify calculations by entering a formula directly (e.g. Y = X*2 + 5*X + 9). The program text, or source, is converted into machine instructions using a special program called a compiler. Many other languages were developed, including some for commercial programming, such as COBOL. Programs were mostly still entered using punched cards or paper tape. (See computer programming in the punch card era.) [Image: Data and instructions could be stored on external punched cards, which were kept in order and arranged in program decks.] By the late 1960s, data storage devices and computer terminals



became inexpensive enough that programs could be created by typing directly into the computers. Text editors were developed that allowed changes and corrections to be made much more easily than with punched cards. As time has progressed, computers have made giant leaps in the area of processing power. This has brought about newer programming languages that are more abstracted from the underlying hardware. Although these high-level languages usually incur greater overhead, the increase in speed of modern computers has made the use of these languages much more practical than in the past. These increasingly abstracted languages typically are easier to learn and allow the programmer to develop applications much more efficiently and with less code. However, high-level languages are still impractical for a few programs, such as those where low-level hardware control is necessary or where processing speed is at a premium. Throughout the second half of the twentieth century, programming was an attractive career in most developed countries. Some forms of programming have been increasingly subject to offshore outsourcing (importing software and services from other countries, usually at a lower wage), making programming career decisions in developed countries more complicated, while increasing economic opportunities in less developed areas. It is unclear how far this trend will continue and how deeply it will impact programmer wages and opportunities.
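For illustration, the contrast drawn above between formula-level languages and per-instruction coding can be sketched in Python (the language choice is incidental): the polynomial from the Fortran example, Y = X*2 + 5*X + 9, written once as a single high-level expression and once as the kind of one-operation-per-step sequence an assembly programmer would have spelled out.

```python
def y_formula(x):
    # High-level style: the whole formula in one expression,
    # as Fortran first allowed (Y = X*2 + 5*X + 9).
    return x * 2 + 5 * x + 9

def y_stepwise(x):
    # Assembly-like style: one primitive operation per step, each
    # intermediate result parked in a named temporary ("register").
    t1 = x * 2        # MUL  X, 2   -> T1
    t2 = x * 5        # MUL  X, 5   -> T2
    t3 = t1 + t2      # ADD  T1, T2 -> T3
    return t3 + 9     # ADD  T3, 9  -> Y

assert y_formula(3) == y_stepwise(3) == 30
```

A compiler performs essentially this translation: the programmer writes the first form, and the machine executes something like the second.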

Modern programming

Quality requirements

Whatever the approach to software development may be, the final program must satisfy some fundamental properties. The following properties are among the most relevant:

• Efficiency/performance: the amount of system resources a program consumes (processor time, memory space, slow devices such as disks, network bandwidth and to some extent even user interaction): the less, the better. This also includes correct disposal of some resources, such as cleaning up temporary files, and the absence of memory leaks.
• Reliability: how often the results of a program are correct. This depends on conceptual correctness of algorithms, and minimization of programming mistakes, such as mistakes in resource management (e.g., buffer overflows and race conditions) and logic errors (such as division by zero).
• Robustness: how well a program anticipates problems not due to programmer error. This includes situations such as incorrect, inappropriate or corrupt data, unavailability of needed resources such as memory, operating system services and network connections, and user error.
• Usability: the ergonomics of a program: the ease with which a person can use the program for its intended purpose, or in some cases even unanticipated purposes. Such issues can make or break its success even regardless of other issues. This involves a wide range of textual, graphical and sometimes hardware elements that improve the clarity, intuitiveness, cohesiveness and completeness of a program's user interface.
• Portability: the range of computer hardware and operating system platforms on which the source code of a program can be compiled/interpreted and run. This depends on differences in the programming facilities provided by the different platforms, including hardware and operating system resources, expected behaviour of the hardware and operating system, and availability of platform-specific compilers (and sometimes libraries) for the language of the source code.
• Maintainability: the ease with which a program can be modified by its present or future developers in order to make improvements or customizations, fix bugs and security holes, or adapt it to new environments. Good practices during initial development make the difference in this regard. This quality may not be directly apparent to the end user but it can significantly affect the fate of a program over the long term.




Algorithmic complexity

The academic field and the engineering practice of computer programming are both largely concerned with discovering and implementing the most efficient algorithms for a given class of problem. For this purpose, algorithms are classified into orders using so-called Big O notation, O(n), which expresses resource use, such as execution time or memory consumption, in terms of the size of an input. Expert programmers are familiar with a variety of well-established algorithms and their respective complexities and use this knowledge to choose algorithms that are best suited to the circumstances.
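The order-of-growth classification can be made concrete with a small sketch (Python, illustrative only): summing the integers 1..n with an O(n) loop versus Gauss's O(1) closed form. Both give the same answer, but the loop's step count grows linearly with n while the formula's does not.

```python
def sum_loop(n):
    # O(n): the amount of work grows linearly with the input size.
    total = 0
    steps = 0
    for i in range(1, n + 1):
        total += i
        steps += 1
    return total, steps

def sum_formula(n):
    # O(1): constant work regardless of n (Gauss's closed form).
    return n * (n + 1) // 2, 1

assert sum_loop(1000)[0] == sum_formula(1000)[0] == 500500
# sum_loop performed 1000 steps; sum_formula performed 1.
```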

Methodologies

The first step in most formal software development projects is requirements analysis, followed by modeling, implementation, testing, and failure elimination (debugging). There are many differing approaches for each of these tasks. One approach popular for requirements analysis is Use Case analysis. Popular modeling techniques include Object-Oriented Analysis and Design (OOAD) and Model-Driven Architecture (MDA). The Unified Modeling Language (UML) is a notation used for both OOAD and MDA. A similar technique used for database design is Entity-Relationship Modeling (ER Modeling). Implementation techniques include imperative languages (object-oriented or procedural), functional languages, and logic languages.

Measuring language usage

It is very difficult to determine which are the most popular modern programming languages. Some languages are very popular for particular kinds of applications (e.g., COBOL is still strong in the corporate data center, often on large mainframes; FORTRAN in engineering applications; scripting languages in web development; and C in embedded applications), while some languages are regularly used to write many different kinds of applications. Methods of measuring programming language popularity include: counting the number of job advertisements that mention the language,[9] the number of books teaching the language that are sold (this overestimates the importance of newer languages), and estimates of the number of existing lines of code written in the language (this underestimates the number of users of business languages such as COBOL).

Debugging

Debugging is a very important task in the software development process, because an incorrect program can have significant consequences for its users. Some languages are more prone to some kinds of faults because their specification does not require compilers to perform as much checking as other languages. Use of a static analysis tool can help detect some possible problems. Debugging is often done with IDEs like Visual Studio, NetBeans, and Eclipse. Standalone debuggers like gdb are also used, and these often provide less of a visual environment, usually using a command line. [Image: A bug, which was debugged in 1947.]



Programming languages

Different programming languages support different styles of programming (called programming paradigms). The choice of language used is subject to many considerations, such as company policy, suitability to task, availability of third-party packages, or individual preference. Ideally, the programming language best suited for the task at hand will be selected. Trade-offs from this ideal involve finding enough programmers who know the language to build a team, the availability of compilers for that language, and the efficiency with which programs written in a given language execute. Allen Downey, in his book How To Think Like A Computer Scientist, writes:

The details look different in different languages, but a few basic instructions appear in just about every language:
• input: Get data from the keyboard, a file, or some other device.
• output: Display data on the screen or send data to a file or other device.
• arithmetic: Perform basic arithmetical operations like addition and multiplication.
• conditional execution: Check for certain conditions and execute the appropriate sequence of statements.
• repetition: Perform some action repeatedly, usually with some variation.
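All five basic instructions can be seen together in a few lines. In the sketch below (Python, illustrative; keyboard input is replaced by a parameter so the example is self-contained), a list of numbers is summed, a branch is taken on the result, and the outcome is printed.

```python
def report(numbers):
    # input: here the data arrives as a parameter rather than a device
    total = 0
    for n in numbers:            # repetition: visit each value in turn
        total = total + n        # arithmetic: accumulate a running sum
    if total >= 100:             # conditional execution: pick a branch
        label = "large"
    else:
        label = "small"
    line = f"sum={total} ({label})"
    print(line)                  # output: display the result
    return line

report([3, 14, 15, 92])   # prints "sum=124 (large)"
```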

Many computer languages provide a mechanism to call functions provided by libraries. Provided the functions in a library follow the appropriate run-time conventions (e.g., method of passing arguments), then these functions may be written in any other language.
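As a concrete, platform-dependent sketch of this cross-language mechanism: Python's ctypes module can call the C math library's pow function, compiled from C, provided the caller declares the argument and return types so the run-time calling convention is respected. The fallback library name below assumes glibc on Linux; on other systems the lookup may differ.

```python
import ctypes
import ctypes.util

# Locate the C math library; the fallback name assumes glibc on Linux.
name = ctypes.util.find_library("m") or "libm.so.6"
libm = ctypes.CDLL(name)

# Declaring argument/return types is how the C calling convention is
# honoured: without this, ctypes would pass ints where doubles belong.
libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]
libm.pow.restype = ctypes.c_double

assert libm.pow(2.0, 10.0) == 1024.0   # a C function, called from Python
```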

Programmers

Computer programmers are those who write computer software. Their jobs usually involve:

• Coding
• Compilation
• Documentation
• Integration
• Maintenance
• Requirements analysis
• Software architecture
• Software testing
• Specification
• Debugging

See also

• ACCU (organisation)
• Association for Computing Machinery
• Computer programming in the punch card era
• Hello world program
• List of basic computer programming topics
• List of computer programming topics
• Programming paradigms
• Software engineering
• The Art of Computer Programming


Further reading

• Weinberg, Gerald M., The Psychology of Computer Programming, New York: Van Nostrand Reinhold, 1971

External links

• Programming Wikia [10]
• How to Think Like a Computer Scientist [11] - by Jeffrey Elkner, Allen B. Downey and Chris Meyers

References

[1] Paul Graham (2003). Hackers and Painters (http://www.paulgraham.com/hp.html). Retrieved 2006-08-22.
[2] Kenneth E. Iverson, the originator of the APL programming language, believed that the Sapir–Whorf hypothesis applied to computer languages (without actually mentioning the hypothesis by name). His Turing award lecture, "Notation as a tool of thought", was devoted to this theme, arguing that more powerful notations aided thinking about computer algorithms. Iverson K.E., "Notation as a tool of thought (http://elliscave.com/APL_J/tool.pdf)", Communications of the ACM, 23: 444-465 (August 1980).
[3] New World Encyclopedia Online Edition (http://www.newworldencyclopedia.org/entry/Hephaestus), New World Encyclopedia
[4] A 13th Century Programmable Robot (http://www.shef.ac.uk/marcoms/eview/articles58/robot.html), University of Sheffield
[5] Fowler, Charles B. (October 1967), "The Museum of Music: A History of Mechanical Instruments", Music Educators Journal 54 (2): 45–49, doi:10.2307/3391092
[6] "Columbia University Computing History - Herman Hollerith" (http://www.columbia.edu/acis/history/hollerith.html). Columbia.edu. Retrieved 2010-04-25.
[7] "Fortran creator John Backus dies" (http://www.msnbc.msn.com/id/17704662/). MSNBC, 2007-03-20. Retrieved 2010-04-25.
[8] "CSC-302 99S : Class 02: A Brief History of Programming Languages" (http://www.math.grin.edu/~rebelsky/Courses/CS302/99S/Outlines/outline.02.html). Math.grin.edu. Retrieved 2010-04-25.
[9] Survey of Job advertisements mentioning a given language (http://www.computerweekly.com/Articles/2007/09/11/226631/sslcomputer-weekly-it-salary-survey-finance-boom-drives-it-job.htm)
[10] http://programming.wikia.com/wiki/Main_Page
[11] http://openbookproject.net/thinkCSpy



Algorithm

In mathematics, computer science, and related subjects, an algorithm is an effective method for solving a problem expressed as a finite sequence of instructions. Algorithms are used for calculation, data processing, and many other fields. (In more advanced or abstract settings, the instructions do not necessarily constitute a finite sequence, and even not necessarily a sequence; see, e.g., "nondeterministic algorithm".) Each algorithm is a list of well-defined instructions for completing a task. Starting from an initial state, the instructions describe a computation that proceeds through a well-defined series of successive states, eventually terminating in a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate randomness. A partial formalization of the concept began with attempts to solve the Entscheidungsproblem (the "decision problem") posed by David Hilbert in 1928. Subsequent formalizations were framed as attempts to define "effective calculability"[1] or "effective method";[2] those formalizations included the Gödel–Herbrand–Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's "Formulation 1" of 1936, and Alan Turing's Turing machines of 1936–7 and 1939. [Image: An algorithm that tries to figure out why the lamp doesn't turn on and tries to fix it. Flowcharts are often used to graphically represent algorithms.] The adjective "continuous" when applied to the word "algorithm" can mean: 1) An algorithm operating on data that represents continuous quantities, even though this data is represented by discrete approximations – such algorithms are studied in numerical analysis; or 2) An algorithm in the form of a differential equation that operates continuously on the data, running on an analog computer.[3]

Etymology

Al-Khwārizmī, a Muslim Persian astronomer and mathematician, wrote a treatise in the Arabic language in 825 AD, On Calculation with the Hindu–Arabic numeral system (see algorism). It was translated from Arabic into Latin in the 12th century as Algoritmi de numero Indorum (al-Daffa 1977), whose title was likely intended to mean "Algoritmi on the numbers of the Indians", where "Algoritmi" was the translator's rendition of the author's name; but people misunderstanding the title treated Algoritmi as a Latin plural, and this led to the word "algorithm" (Latin algorismus) coming to mean "calculation method". The intrusive "th" is most likely due to a false cognate with the Greek ἀριθμός (arithmos), meaning "number".



Why algorithms are necessary: an informal definition

For a detailed presentation of the various points of view around the definition of "algorithm" see Algorithm characterizations. For examples of simple addition algorithms specified in the detailed manner described in Algorithm characterizations, see Algorithm examples. While there is no generally accepted formal definition of "algorithm," an informal definition could be "a process that performs some sequence of operations." For some people, a program is only an algorithm if it stops eventually. For others, a program is only an algorithm if it stops before a given number of calculation steps. A prototypical example of an algorithm is Euclid's algorithm to determine the greatest common divisor of two integers. We can derive clues to the issues involved and an informal meaning of the word from the following quotation from Boolos & Jeffrey (1974, 1999) (boldface added):

No human being can write fast enough, or long enough, or small enough† (†"smaller and smaller without limit ... you'd be trying to write on molecules, on atoms, on electrons") to list all members of an enumerably infinite set by writing out their names, one after another, in some notation. But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give explicit instructions for determining the nth member of the set, for arbitrary finite n. Such instructions are to be given quite explicitly, in a form in which they could be followed by a computing machine, or by a human who is capable of carrying out only very elementary operations on symbols.[4]

The term "enumerably infinite" means "countable using integers perhaps extending to infinity." Thus Boolos and Jeffrey are saying that an algorithm implies instructions for a process that "creates" output integers from an arbitrary "input" integer or integers that, in theory, can be chosen from 0 to infinity.
Thus we might expect an algorithm to be an algebraic equation such as y = m + n — two arbitrary "input variables" m and n that produce an output y. As we see in Algorithm characterizations — the word algorithm implies much more than this, something on the order of (for our addition example): Precise instructions (in language understood by "the computer") for a "fast, efficient, good" process that specifies the "moves" of "the computer" (machine or human, equipped with the necessary internally-contained information and capabilities) to find, decode, and then munch arbitrary input integers/symbols m and n, symbols + and = ... and (reliably, correctly, "effectively") produce, in a "reasonable" time, output-integer y at a specified place and in a specified format. The concept of algorithm is also used to define the notion of decidability. That notion is central for explaining how formal systems come into being starting from a small set of axioms and rules. In logic, the time that an algorithm requires to complete cannot be measured, as it is not apparently related with our customary physical dimension. From such uncertainties, that characterize ongoing work, stems the unavailability of a definition of algorithm that suits both concrete (in some sense) and abstract usage of the term.
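Boolos and Jeffrey's point, explicit instructions for determining the nth member of an enumerably infinite set, can be made concrete. A sketch (Python, purely illustrative) for the set of prime numbers:

```python
def nth_prime(n):
    """Return the nth prime (n = 1 gives 2): explicit instructions
    that a machine could follow for any arbitrary finite n."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        # Trial division: candidate is prime if no divisor d exists
        # with 2 <= d <= sqrt(candidate).
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate

assert [nth_prime(i) for i in range(1, 6)] == [2, 3, 5, 7, 11]
```

No finite list could enumerate all the primes, but this short procedure effectively specifies every one of them.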

Formalization

Algorithms are essential to the way computers process information. Many computer programs contain algorithms that specify the specific instructions a computer should perform (in a specific order) to carry out a specified task, such as calculating employees' paychecks or printing students' report cards. Thus, an algorithm can be considered to be any sequence of operations that can be simulated by a Turing-complete system. Authors who assert this thesis include Minsky (1967), Savage (1987) and Gurevich (2000):

Minsky: "But we will also maintain, with Turing . . . that any procedure which could "naturally" be called effective, can in fact be realized by a (simple) machine. Although this may seem extreme, the arguments . . . in its favor are hard to refute".[5]



Gurevich: "...Turing's informal argument in favor of his thesis justifies a stronger thesis: every algorithm can be simulated by a Turing machine ... according to Savage [1987], an algorithm is a computational process defined by a Turing machine".[6]

Typically, when an algorithm is associated with processing information, data is read from an input source, written to an output device, and/or stored for further processing. Stored data is regarded as part of the internal state of the entity performing the algorithm. In practice, the state is stored in one or more data structures. For any such computational process, the algorithm must be rigorously defined: specified in the way it applies in all possible circumstances that could arise. That is, any conditional steps must be systematically dealt with, case-by-case; the criteria for each case must be clear (and computable). Because an algorithm is a precise list of precise steps, the order of computation will always be critical to the functioning of the algorithm. Instructions are usually assumed to be listed explicitly, and are described as starting "from the top" and going "down to the bottom", an idea that is described more formally by flow of control. So far, this discussion of the formalization of an algorithm has assumed the premises of imperative programming. This is the most common conception, and it attempts to describe a task in discrete, "mechanical" means. Unique to this conception of formalized algorithms is the assignment operation, setting the value of a variable. It derives from the intuition of "memory" as a scratchpad. There is an example below of such an assignment. For some alternate conceptions of what constitutes an algorithm see functional programming and logic programming .

Termination

Some writers restrict the definition of algorithm to procedures that eventually finish. In such a category Kleene places the "decision procedure or decision method or algorithm for the question".[7] Others, including Kleene, include procedures that could run forever without stopping; such a procedure has been called a "computational method"[8] or a "calculation procedure or algorithm (and hence a calculation problem) in relation to a general question which requires for an answer, not yes or no, but the exhibiting of some object".[9] Minsky makes the pertinent observation, in regard to determining whether an algorithm will eventually terminate (from a particular starting state):

But if the length of the process isn't known in advance, then "trying" it may not be decisive, because if the process does go on forever — then at no time will we ever be sure of the answer.[5]

As it happens, no other method can do any better, as was shown by Alan Turing with his celebrated result on the undecidability of the so-called halting problem. There is no algorithmic procedure for determining of arbitrary algorithms whether or not they terminate from given starting states. The analysis of algorithms for their likelihood of termination is called termination analysis. See the examples of (im-)"proper" subtraction at partial function for more about what can happen when an algorithm fails for certain of its input numbers — e.g., (i) non-termination, (ii) production of "junk" (output in the wrong format to be considered a number) or no number(s) at all (halt ends the computation with no output), (iii) wrong number(s), or (iv) a combination of these.
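Minsky's observation is easy to experience with the Collatz iteration (halve an even number; triple and add one to an odd number): it is conjectured, but not proved, to reach 1 from every starting value, so a tester can only run it under a step bound and report "undecided" when the bound is hit. A sketch (Python, illustrative):

```python
def collatz_halts_within(n, max_steps):
    """Run the Collatz iteration from n; return the step count if it
    reaches 1 within max_steps, or None otherwise - which proves
    nothing, exactly as Minsky's remark warns."""
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None          # undecided: maybe longer, maybe forever
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

assert collatz_halts_within(6, 100) == 8      # 6,3,10,5,16,8,4,2,1
assert collatz_halts_within(27, 10) is None   # bound too small to decide
```

The None result illustrates the dilemma: raising the bound might settle this one case, but no bound settles all cases in advance.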
Kleene proposed that the production of "junk" or failure to produce a number is solved by having the algorithm detect these instances and produce e.g., an error message (he suggested "0"), or preferably, force the algorithm into an endless loop.[10] Davis (1958) does this to his subtraction algorithm — he fixes his algorithm in a second example so that it is proper subtraction and it terminates.[11] Along with the logical outcomes "true" and "false" Kleene (1952) also proposes the use of a third logical symbol "u" — undecided[12] — thus an algorithm will always produce something when confronted with a "proposition". The problem of wrong answers must be solved with an independent "proof" of the algorithm e.g., using induction: We normally require auxiliary evidence for this [that the algorithm correctly defines a mu recursive function], e.g., in the form of an inductive proof that, for each argument value, the computation



terminates with a unique value.[13]

Expressing algorithms

Algorithms can be expressed in many kinds of notation, including natural languages, pseudocode, flowcharts, programming languages or control tables (processed by interpreters). Natural language expressions of algorithms tend to be verbose and ambiguous, and are rarely used for complex or technical algorithms. Pseudocode, flowcharts and control tables are structured ways to express algorithms that avoid many of the ambiguities common in natural language statements, while remaining independent of a particular implementation language. Programming languages are primarily intended for expressing algorithms in a form that can be executed by a computer, but are often used as a way to define or document algorithms. There is a wide variety of representations possible and one can express a given Turing machine program as a sequence of machine tables (see more at finite state machine and state transition table), as flowcharts (see more at state diagram), or as a form of rudimentary machine code or assembly code called "sets of quadruples" (see more at Turing machine). Sometimes it is helpful in the description of an algorithm to supplement small "flow charts" (state diagrams) with natural-language and/or arithmetic expressions written inside "block diagrams" to summarize what the "flow charts" are accomplishing. Representations of algorithms are generally classed into three accepted levels of Turing machine description:[14]

• 1 High-level description: "...prose to describe an algorithm, ignoring the implementation details. At this level we do not need to mention how the machine manages its tape or head."
• 2 Implementation description: "...prose used to define the way the Turing machine uses its head and the way that it stores data on its tape. At this level we do not give details of states or transition function."
• 3 Formal description: Most detailed, "lowest level", gives the Turing machine's "state table".
For an example of the simple algorithm "Add m+n" described in all three levels see Algorithm examples.

Computer algorithms

In computer systems, an algorithm is basically an instance of logic written in software by software developers to be effective for the intended "target" computer(s), in order for the software on the target machines to do something. For instance, if a person is writing software that is supposed to print out a PDF document located at the operating system folder "/My Documents" at computer drive "D:" every Friday at 10PM, they will write an algorithm that specifies the following actions: "If today's date (computer time) is 'Friday,' open the document at 'D:/My Documents' and call the 'print' function". While this simple algorithm does not look into whether the printer has enough paper or whether the document has been moved into a different location, one can make this algorithm more robust and anticipate these problems by rewriting it as a formal CASE statement[15] or as a (carefully crafted) sequence of IF-THEN-ELSE statements.[16] For example the CASE statement might appear as follows (there are other possibilities):

CASE 1: IF today's date is NOT Friday
        THEN exit this CASE instruction ELSE
CASE 2: IF today's date is Friday AND the document is located at 'D:/My Documents' AND there is paper in the printer
        THEN print the document (and exit this CASE instruction) ELSE
CASE 3: IF today's date is Friday AND the document is NOT located at 'D:/My Documents'
        THEN display 'document not found' error message (and exit this CASE instruction) ELSE



CASE 4: IF today's date is Friday AND the document is located at 'D:/My Documents' AND there is NO paper in the printer
        THEN (i) display 'out of paper' error message and (ii) exit.

Note that CASE 3 includes two possibilities: (i) the document is NOT located at 'D:/My Documents' AND there's paper in the printer, OR (ii) the document is NOT located at 'D:/My Documents' AND there's NO paper in the printer. The sequence of IF-THEN-ELSE tests might look like this:

TEST 1: IF today's date is NOT Friday THEN done ELSE
TEST 2: IF the document is NOT located at 'D:/My Documents' THEN display 'document not found' error message ELSE
TEST 3: IF there is NO paper in the printer THEN display 'out of paper' error message ELSE print the document.

These examples' logic grants precedence to the instance of "NO document at 'D:/My Documents'". Also observe that in a well-crafted CASE statement or sequence of IF-THEN-ELSE statements the number of distinct actions (4 in these examples: do nothing, print the document, display 'document not found', display 'out of paper') equals the number of cases. Given unlimited memory, a computational machine with the ability to execute either a set of CASE statements or a sequence of IF-THEN-ELSE statements is Turing complete. Therefore, anything that is computable can be computed by this machine. This form of algorithm is fundamental to computer programming in all its forms (see more at McCarthy formalism).
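The IF-THEN-ELSE chain above translates almost mechanically into guard-clause style in a programming language. In the sketch below (Python, illustrative), the three boolean parameters stand in for hypothetical checks of the date, the file system, and the printer.

```python
def friday_print_job(is_friday, document_present, printer_has_paper):
    # TEST 1: not Friday -> nothing to do.
    if not is_friday:
        return "done"
    # TEST 2: missing document takes precedence, as in the text.
    if not document_present:
        return "document not found"
    # TEST 3: document present, but no paper.
    if not printer_has_paper:
        return "out of paper"
    # All guards passed: print.
    return "printed"

assert friday_print_job(True, False, False) == "document not found"
assert friday_print_job(True, True, True) == "printed"
```

Note there are exactly four distinct outcomes, matching the four cases of the CASE version.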

Implementation

Most algorithms are intended to be implemented as computer programs. However, algorithms are also implemented by other means, such as in a biological neural network (for example, the human brain implementing arithmetic or an insect looking for food), in an electrical circuit, or in a mechanical device.

Example

One of the simplest algorithms is to find the largest number in an (unsorted) list of numbers. The solution necessarily requires looking at every number in the list, but only once at each. From this follows a simple algorithm, which can be stated in a high-level description, in English prose, as:

High-level description:
1. Assume the first item is largest.
2. Look at each of the remaining items in the list and if it is larger than the largest item so far, make a note of it.
3. The last noted item is the largest in the list when the process is complete.

(Quasi-)formal description: Written in prose but much closer to the high-level language of a computer program, the following is the more formal coding of the algorithm in pseudocode or pidgin code:

Algorithm LargestNumber
  Input: A non-empty list of numbers L.
  Output: The largest number in the list L.

An animation of the quicksort algorithm sorting an array of randomized values. The red bars mark the pivot element; at the start of the animation, the element farthest to the right-hand side is chosen as the pivot.


  largest ← L0
  for each item in the list (Length(L) ≥ 1), do
    if the item > largest, then
      largest ← the item
  return largest

• "←" is a loose shorthand for "changes to". For instance, "largest ← item" means that the value of largest changes to the value of item.
• "return" terminates the algorithm and outputs the value that follows.
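As a rough sketch, the pseudocode above translates almost line for line into Python (the function name echoes the pseudocode; it is our illustration, and Python's built-in max performs the same one-pass scan):

```python
def largest_number(L):
    """Return the largest number in a non-empty list L (one pass, O(n) time)."""
    largest = L[0]          # assume the first item is largest
    for item in L[1:]:      # look at each of the remaining items
        if item > largest:
            largest = item  # make a note of the new largest item so far
    return largest

print(largest_number([3, 9, 4, 1]))  # 9
```

The two values the text says the algorithm must remember (the largest number found so far, and the current position) appear here as the variable largest and the implicit loop cursor.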

For a more complex example of an algorithm, see Euclid's algorithm for the greatest common divisor, one of the earliest algorithms known.

Algorithmic analysis
It is frequently important to know how much of a particular resource (such as time or storage) is theoretically required for a given algorithm. Methods have been developed for the analysis of algorithms to obtain such quantitative answers (estimates); for example, the algorithm above has a time requirement of O(n), using the big O notation with n as the length of the list. At all times the algorithm only needs to remember two values: the largest number found so far, and its current position in the input list. Therefore, it is said to have a space requirement of O(1), if the space required to store the input numbers is not counted, or O(n) if it is counted. Different algorithms may complete the same task with a different set of instructions in less or more time, space, or 'effort' than others. For example, a binary search algorithm will usually outperform a brute-force sequential search when used for table lookups on sorted lists.
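The claim that binary search outperforms sequential search on sorted lists can be made concrete by counting comparisons. The counting wrappers below are our own instrumentation, not part of any standard library:

```python
def sequential_search(lst, target):
    """O(n): check items one by one; returns (index, comparisons)."""
    for i, x in enumerate(lst):
        if x == target:
            return i, i + 1
    return -1, len(lst)

def binary_search(lst, target):
    """O(log n) on a sorted list; returns (index, comparisons)."""
    lo, hi, comparisons = 0, len(lst) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if lst[mid] == target:
            return mid, comparisons
        if lst[mid] < target:
            lo = mid + 1          # discard the lower half
        else:
            hi = mid - 1          # discard the upper half
    return -1, comparisons

data = list(range(1000))             # a sorted table of 1000 numbers
print(sequential_search(data, 999))  # (999, 1000): looked at every item
print(binary_search(data, 999))      # (999, 10): about log2(1000) comparisons
```

Doubling the table size adds one comparison to the binary search but a thousand to the sequential scan, which is the practical content of O(log n) versus O(n).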

Formal versus empirical
The analysis and study of algorithms is a discipline of computer science, and is often practiced abstractly, without the use of a specific programming language or implementation. In this sense, algorithm analysis resembles other mathematical disciplines in that it focuses on the underlying properties of the algorithm and not on the specifics of any particular implementation. Pseudocode is usually used for analysis, as it is the simplest and most general representation. Ultimately, however, most algorithms are implemented on particular hardware/software platforms, and their algorithmic efficiency is eventually put to the test using real code. Empirical testing is useful because it may uncover unexpected interactions that affect performance. For instance, an algorithm that has no locality of reference may perform much worse than predicted because it 'thrashes the cache'. Benchmarks may be used to compare an algorithm's performance before and after program optimization.

Classification
There are various ways to classify algorithms, each with its own merits.

By implementation
One way to classify algorithms is by implementation means.
• Recursion or iteration: A recursive algorithm is one that invokes (makes reference to) itself repeatedly until a certain condition matches, which is a method common to functional programming. Iterative algorithms use repetitive constructs like loops and sometimes additional data structures like stacks to solve the given problems. Some problems are naturally suited for one implementation or the other. For example, the Towers of Hanoi is well understood in its recursive implementation. Every recursive version has an equivalent (but possibly more or less complex) iterative version, and vice versa.
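The same computation expressed both ways can be seen in a small sketch. Factorial is used here as a compact stand-in of our own choosing; the Towers of Hanoi admits the same treatment, but its iterative version is considerably longer:

```python
def factorial_recursive(n):
    """Recursive: the function invokes itself until the base case n <= 1."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    """Iterative: a loop and an accumulator replace the recursive calls."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial_recursive(5), factorial_iterative(5))  # 120 120
```

Both versions compute the same function; the iterative one trades the implicit call stack of the recursion for an explicit loop variable.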



• Logical: An algorithm may be viewed as controlled logical deduction. This notion may be expressed as: Algorithm = logic + control.[17] The logic component expresses the axioms that may be used in the computation and the control component determines the way in which deduction is applied to the axioms. This is the basis for the logic programming paradigm. In pure logic programming languages the control component is fixed and algorithms are specified by supplying only the logic component. The appeal of this approach is the elegant semantics: a change in the axioms produces a well-defined change in the algorithm.
• Serial or parallel or distributed: Algorithms are usually discussed with the assumption that computers execute one instruction of an algorithm at a time. Those computers are sometimes called serial computers. An algorithm designed for such an environment is called a serial algorithm, as opposed to parallel algorithms or distributed algorithms. Parallel algorithms take advantage of computer architectures where several processors can work on a problem at the same time, whereas distributed algorithms utilize multiple machines connected by a network. Parallel or distributed algorithms divide the problem into more symmetrical or asymmetrical subproblems and collect the results back together. The resource consumption in such algorithms is not only processor cycles on each processor but also the communication overhead between the processors. Sorting algorithms can be parallelized efficiently, but their communication overhead is expensive. Iterative algorithms are generally parallelizable. Some problems have no parallel algorithms, and are called inherently serial problems.
• Deterministic or non-deterministic: Deterministic algorithms solve the problem with an exact decision at every step of the algorithm, whereas non-deterministic algorithms solve problems via guessing, although typical guesses are made more accurate through the use of heuristics.
• Exact or approximate: While many algorithms reach an exact solution, approximation algorithms seek an approximation that is close to the true solution. Approximation may use either a deterministic or a random strategy. Such algorithms have practical value for many hard problems.
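The exact-versus-approximate trade-off can be illustrated with a standard textbook technique (our example, not drawn from this article): the greedy matching heuristic for vertex cover runs in linear time and is guaranteed to return a cover at most twice the optimal size, trading exactness for speed on an otherwise hard problem.

```python
def approx_vertex_cover(edges):
    """2-approximation: repeatedly take both endpoints of an uncovered edge."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))  # covers this edge and all edges touching u or v
    return cover

triangle = [(0, 1), (1, 2), (0, 2)]
cover = approx_vertex_cover(triangle)
print(cover)  # every edge has at least one endpoint in the cover
```

Because each accepted edge contributes two vertices and no exact cover can miss both endpoints of any of those edges, the result is never more than twice the minimum cover.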

By design paradigm
Another way of classifying algorithms is by their design methodology or paradigm. There is a certain number of paradigms, each different from the others, and each of these categories includes many different types of algorithms. Some commonly found paradigms include:
• Brute-force or exhaustive search. This is the naïve method of trying every possible solution to see which is best.[18]
• Divide and conquer. A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem (usually recursively) until the instances are small enough to solve easily. One such example is merge sort: the data is divided into segments, each segment is sorted, and the sorted result for the entire data is obtained in the conquer phase by merging the segments. A simpler variant of divide and conquer is called a decrease and conquer algorithm, which solves an identical subproblem and uses the solution of this subproblem to solve the bigger problem. Divide and conquer divides the problem into multiple subproblems, so the conquer stage is more complex than in decrease and conquer algorithms. An example of a decrease and conquer algorithm is the binary search algorithm.
• Dynamic programming. When a problem shows optimal substructure, meaning the optimal solution to a problem can be constructed from optimal solutions to subproblems, and overlapping subproblems, meaning the same subproblems are used to solve many different problem instances, a quicker approach called dynamic programming avoids recomputing solutions that have already been computed. For example, the shortest path to a goal from a vertex in a weighted graph can be found by using the shortest path to the goal from all adjacent vertices. Dynamic programming and memoization go together.
The main difference between dynamic programming and divide and conquer is that subproblems are more or less independent in divide and conquer, whereas subproblems overlap in dynamic programming. The difference between dynamic programming and straightforward recursion is in caching or memoization of recursive calls. When subproblems are independent and there is no repetition, memoization does not help; hence dynamic programming is not a solution for all complex



problems. By using memoization or maintaining a table of subproblems already solved, dynamic programming reduces the exponential nature of many problems to polynomial complexity.
• The greedy method. A greedy algorithm is similar to a dynamic programming algorithm, but the difference is that solutions to the subproblems do not have to be known at each stage; instead a "greedy" choice can be made of what looks best for the moment. The greedy method extends the solution with the best possible decision (not all feasible decisions) at an algorithmic stage, based on the current local optimum and the best decision (not all possible decisions) made in a previous stage. It is not exhaustive, and does not give an accurate answer to many problems. But when it works, it will be the fastest method. The most popular greedy algorithm is Kruskal's algorithm for finding the minimal spanning tree.
• Linear programming. When solving a problem using linear programming, specific inequalities involving the inputs are found and then an attempt is made to maximize (or minimize) some linear function of the inputs. Many problems (such as the maximum flow for directed graphs) can be stated in a linear programming way, and then be solved by a 'generic' algorithm such as the simplex algorithm. A more complex variant of linear programming is called integer programming, where the solution space is restricted to the integers.
• Reduction. This technique involves solving a difficult problem by transforming it into a better-known problem for which we have (hopefully) asymptotically optimal algorithms. The goal is to find a reducing algorithm whose complexity is not dominated by the resulting reduced algorithm's. For example, one selection algorithm for finding the median in an unsorted list involves first sorting the list (the expensive portion) and then pulling out the middle element in the sorted list (the cheap portion). This technique is also known as transform and conquer.
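The median-via-sorting reduction just described fits in a few lines of Python. This is a toy sketch of our own; note that for even-length lists it picks the upper of the two middle elements:

```python
def median_by_reduction(lst):
    """Transform and conquer: reduce selection to sorting, then index."""
    s = sorted(lst)        # the expensive, well-studied subproblem: sorting
    return s[len(s) // 2]  # the cheap part: pull out the middle element

print(median_by_reduction([7, 1, 5, 3, 9]))  # 5
```

The overall cost is dominated by the O(n log n) sort, illustrating the goal stated above: the reducing step (indexing) must not dominate the reduced problem's cost.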
• Search and enumeration. Many problems (such as playing chess) can be modeled as problems on graphs. A graph exploration algorithm specifies rules for moving around a graph and is useful for such problems. This category also includes search algorithms, branch and bound enumeration, and backtracking.
• Randomized algorithms make some choices randomly (or pseudo-randomly); for some problems, it can in fact be proven that the fastest solutions must involve some randomness. There are two large classes of such algorithms:
1. Monte Carlo algorithms return a correct answer with high probability (e.g., RP is the subclass of these that run in polynomial time).
2. Las Vegas algorithms always return the correct answer, but their running time is only probabilistically bounded, e.g., ZPP.
• In optimization problems, heuristic algorithms do not try to find an optimal solution, but an approximate solution, in cases where the time or resources needed to find a perfect solution are not practical. Examples include local search, tabu search, and simulated annealing, a class of heuristic probabilistic algorithms that vary the solution of a problem by a random amount. The name "simulated annealing" alludes to the metallurgic term meaning the heating and cooling of metal to achieve freedom from defects. The purpose of the random variance is to find close to globally optimal solutions rather than simply locally optimal ones, the idea being that the random element will be decreased as the algorithm settles down to a solution. Approximation algorithms are those heuristic algorithms that additionally provide some bounds on the error. Genetic algorithms attempt to find solutions to problems by mimicking biological evolutionary processes, with a cycle of random mutations yielding successive generations of "solutions". Thus, they emulate reproduction and "survival of the fittest".
In genetic programming, this approach is extended to algorithms, by regarding the algorithm itself as a "solution" to a problem.
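The role of memoization in dynamic programming, described earlier in this section, can be made concrete with a stock example of our own (Fibonacci numbers): the plain recursion recomputes the same overlapping subproblems exponentially often, while a table of solved subproblems makes the computation linear.

```python
calls = 0

def fib_naive(n):
    """Plain recursion: overlapping subproblems are recomputed again and again."""
    global calls
    calls += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

def fib_memo(n, table=None):
    """Dynamic programming: each subproblem is solved once and cached."""
    if table is None:
        table = {}
    if n < 2:
        return n
    if n not in table:
        table[n] = fib_memo(n - 1, table) + fib_memo(n - 2, table)
    return table[n]

print(fib_naive(20), calls)  # 6765 after 21891 calls
print(fib_memo(20))          # 6765, with each subproblem solved exactly once
```

The memoized version answers fib(20) with about 20 distinct subproblem evaluations; without the table, the call count grows with the Fibonacci numbers themselves, which is the "exponential nature" the text refers to.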


By field of study
Every field of science has its own problems and needs efficient algorithms. Related problems in one field are often studied together. Some example classes are search algorithms, sorting algorithms, merge algorithms, numerical algorithms, graph algorithms, string algorithms, computational geometric algorithms, combinatorial algorithms, machine learning, cryptography, data compression algorithms and parsing techniques. Fields tend to overlap with each other, and algorithm advances in one field may improve those of other, sometimes completely unrelated, fields. For example, dynamic programming was invented for optimization of resource consumption in industry, but is now used in solving a broad range of problems in many fields.

By complexity
Algorithms can be classified by the amount of time they need to complete compared to their input size. There is a wide variety: some algorithms complete in linear time relative to input size, some do so in an exponential amount of time or even worse, and some never halt. Additionally, some problems may have multiple algorithms of differing complexity, while other problems might have no algorithms or no known efficient algorithms. There are also mappings from some problems to other problems. Owing to this, it was found to be more suitable to classify the problems themselves instead of the algorithms into equivalence classes based on the complexity of the best possible algorithms for them.

By computing power
Another way to classify algorithms is by computing power. This is typically done by considering some collection (class) of algorithms. A recursive class of algorithms is one that includes algorithms for all Turing computable functions. Looking at classes of algorithms allows for the possibility of restricting the available computational resources (time and memory) used in a computation. A subrecursive class of algorithms is one in which not all Turing computable functions can be obtained. For example, the algorithms that run in polynomial time suffice for many important types of computation but do not exhaust all Turing computable functions. The class of algorithms implemented by primitive recursive functions is another subrecursive class. Burgin (2005, p. 24) uses a generalized definition of algorithms that relaxes the common requirement that the output of the algorithm that computes a function must be determined after a finite number of steps. He defines a super-recursive class of algorithms as "a class of algorithms in which it is possible to compute functions not computable by any Turing machine" (Burgin 2005, p. 107). This is closely related to the study of methods of hypercomputation.

Legal issues
See also: Software patents for a general overview of the patentability of software, including computer-implemented algorithms.
Algorithms, by themselves, are not usually patentable. In the United States, a claim consisting solely of simple manipulations of abstract concepts, numbers, or signals does not constitute "processes" (USPTO 2006), and hence algorithms are not patentable (as in Gottschalk v. Benson). However, practical applications of algorithms are sometimes patentable. For example, in Diamond v. Diehr, the application of a simple feedback algorithm to aid in the curing of synthetic rubber was deemed patentable. The patenting of software is highly controversial, and there are highly criticized patents involving algorithms, especially data compression algorithms, such as Unisys' LZW patent. Additionally, some cryptographic algorithms have export restrictions (see export of cryptography).



History: Development of the notion of "algorithm"
Discrete and distinguishable symbols
Tally-marks: To keep track of their flocks, their sacks of grain and their money the ancients used tallying: accumulating stones or marks scratched on sticks, or making discrete symbols in clay. Through the Babylonian and Egyptian use of marks and symbols, eventually Roman numerals and the abacus evolved (Dilson, p. 16–41). Tally marks appear prominently in unary numeral system arithmetic used in Turing machine and Post–Turing machine computations.

Manipulation of symbols as "place holders" for numbers: algebra
The work of the ancient Greek geometers, Persian mathematician Al-Khwarizmi (often considered the "father of algebra" and from whose name the terms "algorism" and "algorithm" are derived), and Western European mathematicians culminated in Leibniz's notion of the calculus ratiocinator (ca 1680):
A good century and a half ahead of his time, Leibniz proposed an algebra of logic, an algebra that would specify the rules for manipulating logical concepts in the manner that ordinary algebra specifies the rules for manipulating numbers.[19]

Mechanical contrivances with discrete states
The clock: Bolter credits the invention of the weight-driven clock as "The key invention [of Europe in the Middle Ages]", in particular the verge escapement[20] that provides us with the tick and tock of a mechanical clock. "The accurate automatic machine"[21] led immediately to "mechanical automata" beginning in the thirteenth century and finally to "computational machines": the difference engine and analytical engines of Charles Babbage and Countess Ada Lovelace.[22]
Logical machines 1870 – Stanley Jevons' "logical abacus" and "logical machine": The technical problem was to reduce Boolean equations when presented in a form similar to what are now known as Karnaugh maps. Jevons (1880) describes first a simple "abacus" of "slips of wood furnished with pins, contrived so that any part or class of the [logical] combinations can be picked out mechanically . . . More recently however I have reduced the system to a completely mechanical form, and have thus embodied the whole of the indirect process of inference in what may be called a Logical Machine." His machine came equipped with "certain moveable wooden rods" and "at the foot are 21 keys like those of a piano [etc] . . .". With this machine he could analyze a "syllogism or any other simple logical argument".[23] This machine he displayed in 1870 before the Fellows of the Royal Society.[24]
Another logician, John Venn, however, in his 1881 Symbolic Logic, turned a jaundiced eye to this effort: "I have no high estimate myself of the interest or importance of what are sometimes called logical machines ... it does not seem to me that any contrivances at present known or likely to be discovered really deserve the name of logical machines"; see more at Algorithm characterizations. But not to be outdone he too presented "a plan somewhat analogous, I apprehend, to Prof. Jevon's abacus ... [And] [a]gain, corresponding to Prof.
Jevons's logical machine, the following contrivance may be described. I prefer to call it merely a logical-diagram machine ... but I suppose that it could do very completely all that can be rationally expected of any logical machine".[25]
Jacquard loom, Hollerith punch cards, telegraphy and telephony (the electromechanical relay): Bell and Newell (1971) indicate that the Jacquard loom (1801), precursor to Hollerith cards (punch cards, 1887), and "telephone switching technologies" were the roots of a tree leading to the development of the first computers.[26] By the mid-1800s the telegraph, the precursor of the telephone, was in use throughout the world, its discrete and distinguishable encoding of letters as "dots and dashes" a common sound. By the late 1800s the ticker tape (ca 1870s) was in use, as was the use of Hollerith cards in the 1890 U.S. census. Then came the Teletype (ca. 1910) with its punched-paper use of Baudot code on tape.



Telephone-switching networks of electromechanical relays (invented 1835) were behind the work of George Stibitz (1937), the inventor of the digital adding device. As he worked in Bell Laboratories, he observed the "burdensome" use of mechanical calculators with gears. "He went home one evening in 1937 intending to test his idea... When the tinkering was over, Stibitz had constructed a binary adding device".[27] Davis (2000) observes the particular importance of the electromechanical relay (with its two "binary states" open and closed): "It was only with the development, beginning in the 1930s, of electromechanical calculators using electrical relays, that machines were built having the scope Babbage had envisioned."[28]

Mathematics during the 1800s up to the mid-1900s
Symbols and rules: In rapid succession the mathematics of George Boole (1847, 1854), Gottlob Frege (1879), and Giuseppe Peano (1888–1889) reduced arithmetic to a sequence of symbols manipulated by rules. Peano's The principles of arithmetic, presented by a new method (1888) was "the first attempt at an axiomatization of mathematics in a symbolic language".[29] But van Heijenoort gives Frege (1879) this kudos: Frege's is "perhaps the most important single work ever written in logic. ... in which we see a " 'formula language', that is a lingua characterica, a language written with special symbols, "for pure thought", that is, free from rhetorical embellishments ... constructed from specific symbols that are manipulated according to definite rules".[30] The work of Frege was further simplified and amplified by Alfred North Whitehead and Bertrand Russell in their Principia Mathematica (1910–1913).
The paradoxes: At the same time a number of disturbing paradoxes appeared in the literature, in particular the Burali-Forti paradox (1897), the Russell paradox (1902–03), and the Richard Paradox.[31] The resultant considerations led to Kurt Gödel's paper (1931), in which he specifically cites the paradox of the liar, that completely reduces rules of recursion to numbers.
Effective calculability: In an effort to solve the Entscheidungsproblem defined precisely by Hilbert in 1928, mathematicians first set about to define what was meant by an "effective method" or "effective calculation" or "effective calculability" (i.e., a calculation that would succeed). In rapid succession the following appeared: Alonzo Church, Stephen Kleene and J.B. Rosser's λ-calculus,[32] a finely-honed definition of "general recursion" from the work of Gödel acting on suggestions of Jacques Herbrand (cf.
Gödel's Princeton lectures of 1934) and subsequent simplifications by Kleene,[33] Church's proof[34] that the Entscheidungsproblem was unsolvable, Emil Post's definition of effective calculability as a worker mindlessly following a list of instructions to move left or right through a sequence of rooms and while there either mark or erase a paper or observe the paper and make a yes-no decision about the next instruction,[35] Alan Turing's proof that the Entscheidungsproblem was unsolvable by use of his "a- [automatic-] machine"[36] (in effect almost identical to Post's "formulation"), J. Barkley Rosser's definition of "effective method" in terms of "a machine",[37] S. C. Kleene's proposal of a precursor to the "Church thesis" that he called "Thesis I",[38] and a few years later Kleene's renaming his Thesis "Church's Thesis"[39] and proposing "Turing's Thesis".[40]

Emil Post (1936) and Alan Turing (1936–7, 1939)
Here is a remarkable coincidence of two men not knowing each other but describing a process of men-as-computers working on computations, and they yield virtually identical definitions. Emil Post (1936) described the actions of a "computer" (human being) as follows:
"...two concepts are involved: that of a symbol space in which the work leading from problem to answer is to be carried out, and a fixed unalterable set of directions". His symbol space would be



"a two way infinite sequence of spaces or boxes... The problem solver or worker is to move and work in this symbol space, being capable of being in, and operating in but one box at a time.... a box is to admit of but two possible conditions, i.e., being empty or unmarked, and having a single mark in it, say a vertical stroke.
"One box is to be singled out and called the starting point. ...a specific problem is to be given in symbolic form by a finite number of boxes [i.e., INPUT] being marked with a stroke. Likewise the answer [i.e., OUTPUT] is to be given in symbolic form by such a configuration of marked boxes....
"A set of directions applicable to a general problem sets up a deterministic process when applied to each specific problem. This process will terminate only when it comes to the direction of type (C) [i.e., STOP]".[41] See more at Post–Turing machine.

Alan Turing's work[42] preceded that of Stibitz (1937); it is unknown whether Stibitz knew of the work of Turing. Turing's biographer believed that Turing's use of a typewriter-like model derived from a youthful interest: "Alan had dreamt of inventing typewriters as a boy; Mrs. Turing had a typewriter; and he could well have begun by asking himself what was meant by calling a typewriter 'mechanical'".[43] Given the prevalence of Morse code and telegraphy, ticker tape machines, and Teletypes we might conjecture that all were influences. Turing — his model of computation is now called a Turing machine — begins, as did Post, with an analysis of a human computer that he whittles down to a simple set of basic motions and "states of mind". But he continues a step further and creates a machine as a model of computation of numbers.[44] "Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child's arithmetic book....I assume then that the computation is carried out on one-dimensional paper, i.e., on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite.... "The behavior of the computer at any moment is determined by the symbols which he is observing, and his "state of mind" at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite... 
"Let us imagine that the operations performed by the computer to be split up into 'simple operations' which are so elementary that it is not easy to imagine them further divided".[45] Turing's reduction yields the following:
"The simple operations must therefore include:
"(a) Changes of the symbol on one of the observed squares
"(b) Changes of one of the squares observed to another square within L squares of one of the previously observed squares.
"It may be that some of these changes necessarily involve a change of state of mind. The most general single operation must therefore be taken to be one of the following:
"(A) A possible change (a) of symbol together with a possible change of state of mind.
"(B) A possible change (b) of observed squares, together with a possible change of state of mind"
"We may now construct a machine to do the work of this computer".[45]
A few years later, Turing expanded his analysis (thesis, definition) with this forceful expression of it:
"A function is said to be "effectively calculable" if its values can be found by some purely mechanical process. Although it is fairly easy to get an intuitive grasp of this idea, it is nevertheless desirable to have some more definite, mathematically expressible definition . . . [he discusses the history of the definition pretty much as presented above with respect to Gödel, Herbrand, Kleene, Church, Turing and Post] . . . We may take this statement literally, understanding by a purely mechanical process one which could be carried out by a



machine. It is possible to give a mathematical description, in a certain normal form, of the structures of these machines. The development of these ideas leads to the author's definition of a computable function, and to an identification of computability † with effective calculability . . . .
"† We shall use the expression "computable function" to mean a function calculable by a machine, and we let "effectively calculable" refer to the intuitive idea without particular identification with any one of these definitions".[46]

J. B. Rosser (1939) and S. C. Kleene (1943)
J. Barkley Rosser boldly defined an 'effective [mathematical] method' in the following manner (boldface added):
"'Effective method' is used here in the rather special sense of a method each step of which is precisely determined and which is certain to produce the answer in a finite number of steps. With this special meaning, three different precise definitions have been given to date. [his footnote #5; see discussion immediately below]. The simplest of these to state (due to Post and Turing) says essentially that an effective method of solving certain sets of problems exists if one can build a machine which will then solve any problem of the set with no human intervention beyond inserting the question and (later) reading the answer. All three definitions are equivalent, so it doesn't matter which one is used. Moreover, the fact that all three are equivalent is a very strong argument for the correctness of any one." (Rosser 1939:225–6)
Rosser's footnote #5 references the work of (1) Church and Kleene and their definition of λ-definability, in particular Church's use of it in his An Unsolvable Problem of Elementary Number Theory (1936); (2) Herbrand and Gödel and their use of recursion, in particular Gödel's use in his famous paper On Formally Undecidable Propositions of Principia Mathematica and Related Systems I (1931); and (3) Post (1936) and Turing (1936–7) in their mechanism-models of computation.
Stephen C. Kleene defined his now-famous "Thesis I", known as the Church–Turing thesis. But he did this in the following context (boldface in original):
"12. Algorithmic theories... In setting up a complete algorithmic theory, what we do is to describe a procedure, performable for each set of values of the independent variables, which procedure necessarily terminates and in such manner that from the outcome we can read a definite answer, "yes" or "no," to the question, "is the predicate value true?"" (Kleene 1943:273)

History after 1950
A number of efforts have been directed toward further refinement of the definition of "algorithm", and activity is on-going because of issues surrounding, in particular, foundations of mathematics (especially the Church–Turing Thesis) and philosophy of mind (especially arguments around artificial intelligence). For more, see Algorithm characterizations.

See also
• Abstract machine
• Algorithm characterizations
• Algorithm design
• Algorithmic efficiency
• Algorithm engineering
• Algorithm examples
• Algorithmic music
• Algorithmic synthesis
• Garbage In, Garbage Out


• Algorithmic trading
• Data structure
• Heuristics
• Introduction to Algorithms
• Important algorithm-related publications
• List of algorithm general topics
• List of algorithms
• List of terms relating to algorithms and data structures
• Partial function
• Profiling (computer programming)
• Program optimization
• Theory of computation
• Computability (part of computability theory)
• Computational complexity theory
• Randomized algorithm and quantum algorithm

References
• Axt, P. (1959) On a Subrecursive Hierarchy and Primitive Recursive Degrees, Transactions of the American Mathematical Society 92, pp. 85–105
• Bell, C. Gordon and Newell, Allen (1971), Computer Structures: Readings and Examples, McGraw-Hill Book Company, New York. ISBN 0070043574.
• Blass, Andreas; Gurevich, Yuri (2003). "Algorithms: A Quest for Absolute Definitions" [47]. Bulletin of European Association for Theoretical Computer Science 81. Includes an excellent bibliography of 56 references.
• Boolos, George; Jeffrey, Richard (1974, 1980, 1989, 1999). Computability and Logic (4th ed.). Cambridge University Press, London. ISBN 0-521-20402-X. Cf. Chapter 3 Turing machines where they discuss "certain enumerable sets not effectively (mechanically) enumerable".
• Burgin, M. Super-recursive algorithms, Monographs in computer science, Springer, 2005. ISBN 0387955690
• Campagnolo, M.L., Moore, C., and Costa, J.F. (2000) An analog characterization of the subrecursive functions. In Proc. of the 4th Conference on Real Numbers and Computers, Odense University, pp. 91–109
• Church, Alonzo (1936a). "An Unsolvable Problem of Elementary Number Theory". The American Journal of Mathematics 58 (2): 345–363. doi:10.2307/2371045. Reprinted in The Undecidable, p. 89ff. The first expression of "Church's Thesis". See in particular page 100 (The Undecidable) where he defines the notion of "effective calculability" in terms of "an algorithm", and he uses the word "terminates", etc.
• Church, Alonzo (1936b). "A Note on the Entscheidungsproblem". The Journal of Symbolic Logic 1 (1): 40–41. doi:10.2307/2269326. JSTOR 2269326.
• Church, Alonzo (1936). "Correction to a Note on the Entscheidungsproblem". The Journal of Symbolic Logic 1 (3): 101–102. doi:10.2307/2269030. JSTOR 2269030. Reprinted in The Undecidable, p. 110ff. Church shows that the Entscheidungsproblem is unsolvable in about 3 pages of text and 3 pages of footnotes.
• Daffa', Ali Abdullah al- (1977). The Muslim contribution to mathematics. London: Croom Helm. ISBN 0-85664-464-1.
• Davis, Martin (1965). The Undecidable: Basic Papers On Undecidable Propositions, Unsolvable Problems and Computable Functions. New York: Raven Press. ISBN 0486432289. Davis gives commentary before each article. Papers of Gödel, Alonzo Church, Turing, Rosser, Kleene, and Emil Post are included; those cited in the article are listed here by author's name.
• Davis, Martin (2000). Engines of Logic: Mathematicians and the Origin of the Computer. New York: W. W. Norton. ISBN 0393322297. Davis offers concise biographies of Leibniz, Boole, Frege, Cantor, Hilbert, Gödel and Turing with von Neumann as the show-stealing villain. Very brief bios of Joseph-Marie Jacquard, Babbage,

Ada Lovelace, Claude Shannon, Howard Aiken, etc.
This article incorporates public domain material from the NIST document "algorithm" [48] by Paul E. Black (Dictionary of Algorithms and Data Structures).
• Dennett, Daniel (1995). Darwin's Dangerous Idea. New York: Touchstone/Simon & Schuster. ISBN 0684802902.
• Yuri Gurevich, Sequential Abstract State Machines Capture Sequential Algorithms [49], ACM Transactions on Computational Logic, Vol. 1, no. 1 (July 2000), pages 77–111. Includes bibliography of 33 sources.
• Kleene, Stephen C. (1936). "General Recursive Functions of Natural Numbers". Mathematische Annalen 112 (5): 727–742. doi:10.1007/BF01565439. Presented to the American Mathematical Society, September 1935. Reprinted in The Undecidable, p. 237ff. Kleene's definition of "general recursion" (known now as mu-recursion) was used by Church in his 1935 paper An Unsolvable Problem of Elementary Number Theory, which proved the "decision problem" to be "undecidable" (i.e., a negative result).
• Kleene, Stephen C. (1943). "Recursive Predicates and Quantifiers". American Mathematical Society Transactions 54 (1): 41–73. doi:10.2307/1990131. Reprinted in The Undecidable, p. 255ff. Kleene refined his definition of "general recursion" and proceeded in his chapter "12. Algorithmic theories" to posit "Thesis I" (p. 274); he would later repeat this thesis (in Kleene 1952:300) and name it "Church's Thesis" (Kleene 1952:317) (i.e., the Church thesis).

• Kleene, Stephen C. (1952). Introduction to Metamathematics (Tenth Edition 1991 ed.). North-Holland Publishing Company. ISBN 0720421039. Excellent — accessible, readable — reference source for mathematical "foundations".
• Knuth, Donald (1997). Fundamental Algorithms, Third Edition. Reading, Massachusetts: Addison–Wesley. ISBN 0201896834.
• Kosovsky, N. K. Elements of Mathematical Logic and its Application to the Theory of Subrecursive Algorithms, LSU Publ., Leningrad, 1981
• Kowalski, Robert (1979). "Algorithm = Logic + Control". Communications of the ACM 22 (7): 424–436. doi:10.1145/359131.359136. ISSN 0001-0782.
• Markov, A. A. (1954). Theory of Algorithms. [Translated by Jacques J. Schorr-Kon and PST staff.] Imprint Moscow, Academy of Sciences of the USSR, 1954 [i.e., Jerusalem, Israel Program for Scientific Translations, 1961; available from the Office of Technical Services, U.S. Dept. of Commerce, Washington]. Description: 444 p., 28 cm. Added t.p. in Russian. Translation of Works of the Mathematical Institute, Academy of Sciences of the USSR, v. 42. Original title: Teoriya algorifmov. [QA248.M2943 Dartmouth College library. U.S. Dept. of Commerce, Office of Technical Services, number OTS 60-51085.]
• Minsky, Marvin (1967). Computation: Finite and Infinite Machines (First ed.). Prentice-Hall, Englewood Cliffs, NJ. ISBN 0131654497. Minsky expands his "...idea of an algorithm — an effective procedure..." in chapter 5.1, Computability, Effective Procedures and Algorithms. Infinite machines.
• Post, Emil (1936). "Finite Combinatory Processes, Formulation I". The Journal of Symbolic Logic 1 (3): 103–105. doi:10.2307/2269031. Reprinted in The Undecidable, p. 289ff. Post defines a simple algorithmic-like process of a man writing marks or erasing marks and going from box to box and eventually halting, as he follows a list of simple instructions. This is cited by Kleene as one source of his "Thesis I", the so-called Church–Turing thesis.
• Rosser, J.B. (1939). "An Informal Exposition of Proofs of Godel's Theorem and Church's Theorem". Journal of Symbolic Logic 4. Reprinted in The Undecidable, p. 223ff. Herein is Rosser's famous definition of "effective method": "...a method each step of which is precisely predetermined and which is certain to produce the answer in a finite number of steps... a machine which will then solve any problem of the set with no human intervention beyond inserting the question and (later) reading the answer" (p. 225–226, The Undecidable).
• Sipser, Michael (2006). Introduction to the Theory of Computation. PWS Publishing Company. ISBN 053494728X.



• Stone, Harold S. (1972). Introduction to Computer Organization and Data Structures (1972 ed.). McGraw-Hill, New York. ISBN 0070617260. Cf. in particular the first chapter, titled: Algorithms, Turing Machines, and Programs. His succinct informal definition: "...any sequence of instructions that can be obeyed by a robot, is called an algorithm" (p. 4).
• Turing, Alan M. (1936–7). "On Computable Numbers, With An Application to the Entscheidungsproblem". Proceedings of the London Mathematical Society, Series 2 42: 230–265. doi:10.1112/plms/s2-42.1.230. Corrections, ibid, vol. 43 (1937), pp. 544–546. Reprinted in The Undecidable, p. 116ff. Turing's famous paper, completed as a Master's dissertation while at King's College, Cambridge, UK.
• Turing, Alan M. (1939). "Systems of Logic Based on Ordinals". Proceedings of the London Mathematical Society, Series 2 45: 161–228. doi:10.1112/plms/s2-45.1.161. Reprinted in The Undecidable, p. 155ff. Turing's paper that defined "the oracle" was his PhD thesis while at Princeton, USA.
• United States Patent and Trademark Office (2006), 2106.02 Mathematical Algorithms - 2100 Patentability [50], Manual of Patent Examining Procedure (MPEP). Latest revision August 2006.

Secondary references
• Bolter, David J. (1984). Turing's Man: Western Culture in the Computer Age (1984 ed.). The University of North Carolina Press, Chapel Hill, NC. ISBN 0807815640; ISBN 0-8078-4108-0 (pbk.)
• Dilson, Jesse (2007). The Abacus ((1968, 1994) ed.). St. Martin's Press, NY. ISBN 031210409X; ISBN 0-312-10409-X (pbk.)
• van Heijenoort, Jean (2001). From Frege to Gödel, A Source Book in Mathematical Logic, 1879–1931 ((1967) ed.). Harvard University Press, Cambridge, MA. ISBN 0674324498; 3rd edition 1976[?], ISBN 0-674-32449-8 (pbk.)
• Hodges, Andrew (1983). Alan Turing: The Enigma ((1983) ed.). Simon and Schuster, New York. ISBN 0671492071; ISBN 0-671-49207-1. Cf. the chapter "The Spirit of Truth" for a history leading to, and a discussion of, his proof.

Further reading
• David Harel, Yishai A. Feldman, Algorithmics: The Spirit of Computing, 3rd edition, Pearson Education, 2004, ISBN 0321117840
• Jean-Luc Chabert, Évelyne Barbin, A History of Algorithms: From the Pebble to the Microchip, Springer, 1999, ISBN 3540633693

External links
• The Stony Brook Algorithm Repository [51]
• Weisstein, Eric W., "Algorithm" [52] from MathWorld.
• Algorithms in Everyday Mathematics [53]
• Algorithms [54] at the Open Directory Project
• Sortier- und Suchalgorithmen (sorting and searching algorithms; in German) [55]
• Jeff Erickson's Algorithms course material [56]



Algorithm

References
[1] Kleene 1943 in Davis 1965:274
[2] Rosser 1939 in Davis 1965:225
[3] Adaptation and learning in automatic systems (http://books.google.com/books?id=sgDHJlafMskC), page 54, Ya. Z. Tsypkin, Z. J. Nikolic, Academic Press, 1971, ISBN 9780127020501
[4] Boolos and Jeffrey 1974, 1999:19
[5] Minsky 1967:105
[6] Gurevich 2000:1, 3
[7] Kleene 1952:136
[8] Knuth 1997:5
[9] Boldface added, Kleene 1952:137
[10] Kleene 1952:325
[11] Davis 1958:12–15
[12] Kleene 1952:332
[13] Minsky 1967:186
[14] Sipser 2006:157
[15] Kleene 1952:229 shows that "definition by cases" is primitive recursive. CASES requires that the list of testable instances within the CASE definition be (i) mutually exclusive and (ii) collectively exhaustive, i.e. it must include or "cover" all possibilities. The CASE statement proceeds in numerical order and exits at the first successful test; see more at Boolos–Burgess–Jeffrey, fourth edition 2002:74
[16] An IF-THEN-ELSE or "logical test with branching" is just a CASE instruction reduced to two outcomes: (i) test is successful, (ii) test is unsuccessful. The IF-THEN-ELSE is closely related to the AND-OR-INVERT logic function from which all 16 logical "operators" of one or two variables can be derived; see more at Propositional formula. Like definition by cases, a sequence of IF-THEN-ELSE logical tests must be mutually exclusive and collectively exhaustive over the variables tested.
[17] Kowalski 1979
[18] Sue Carroll, Taz Daughtrey. Fundamental Concepts for the Software Quality Engineer (http://books.google.com/books?id=bz_cl3B05IcC&pg=PA282). pp. 282 et seq.
[19] Davis 2000:18
[20] Bolter 1984:24
[21] Bolter 1984:26
[22] Bolter 1984:33–34, 204–206
[23] All quotes from W. Stanley Jevons 1880 Elementary Lessons in Logic: Deductive and Inductive, Macmillan and Co., London and New York. Republished as a googlebook; cf. Jevons 1880:199–201. Louis Couturat 1914 The Algebra of Logic, The Open Court Publishing Company, Chicago and London. Republished as a googlebook; cf. Couturat 1914:75–76 gives a few more details; interestingly, he compares this to a typewriter as well as a piano. Jevons states that the account is to be found at Jan. 20, 1870, The Proceedings of the Royal Society.
[24] Jevons 1880:199–200
[25] All quotes from John Venn 1881 Symbolic Logic, Macmillan and Co., London. Republished as a googlebook. Cf. Venn 1881:120–125. The interested reader can find a deeper explanation in those pages.
[26] Bell and Newell diagram 1971:39, cf. Davis 2000
[27] Melina Hill, Valley News Correspondent, A Tinkerer Gets a Place in History, Valley News, West Lebanon NH, Thursday, March 31, 1983, page 13
[28] Davis 2000:14
[29] van Heijenoort 1967:81ff
[30] van Heijenoort's commentary on Frege's Begriffsschrift, a formula language, modeled upon that of arithmetic, for pure thought, in van Heijenoort 1967:1
[31] Dixon 1906, cf. Kleene 1952:36–40
[32] cf. footnote in Alonzo Church 1936a in Davis 1965:90 and 1936b in Davis 1965:110
[33] Kleene 1935–6 in Davis 1965:237ff, Kleene 1943 in Davis 1965:255ff
[34] Church 1936 in Davis 1965:88ff
[35] cf. "Formulation I", Post 1936 in Davis 1965:289–290
[36] Turing 1936–7 in Davis 1965:116ff
[37] Rosser 1939 in Davis 1965:226
[38] Kleene 1943 in Davis 1965:273–274
[39] Kleene 1952:300, 317
[40] Kleene 1952:376
[41] Turing 1936–7 in Davis 1965:289–290
[42] Turing 1936 in Davis 1965, Turing 1939 in Davis 1965:160
[43] Hodges, p. 96
[44] Turing 1936–7:116



[45] Turing 1936–7 in Davis 1965:136
[46] Turing 1939 in Davis 1965:160
[47] http://research.microsoft.com/~gurevich/Opera/164.pdf
[48] http://www.nist.gov/dads/HTML/algorithm.html
[49] http://research.microsoft.com/~gurevich/Opera/141.pdf
[50] http://www.uspto.gov/web/offices/pac/mpep/documents/2100_2106_02.htm
[51] http://www.cs.sunysb.edu/~algorith/
[52] http://mathworld.wolfram.com/Algorithm.html
[53] http://everydaymath.uchicago.edu/educators/Algorithms_final.pdf
[54] http://www.dmoz.org/Computers/Algorithms/
[55] http://sortieralgorithmen.de/
[56] http://compgeom.cs.uiuc.edu/~jeffe/teaching/algorithms/

Computer data processing

Computer data processing is any process that uses a computer program to enter data and summarise, analyse or otherwise convert data into usable information. The process may be automated and run on a computer. It involves recording, analysing, sorting, summarising, calculating, disseminating and storing data. Because data is most useful when well-presented and actually informative, data-processing systems are often referred to as information systems. Nevertheless, the terms are roughly synonymous, performing similar conversions: data-processing systems typically manipulate raw data into information, and likewise information systems typically take raw data as input to produce information as output. Data processing is sometimes distinguished from data conversion, in which data is merely converted to another format without any data manipulation.

Data analysis
When the domain from which the data are harvested is a science or an engineering field, data processing and information systems are considered too broad, and the more specialized term data analysis is typically used. Data analysis focuses on the highly specialized and highly accurate algorithmic derivations and statistical calculations that are less often observed in the typical general business environment. In these contexts, data analysis packages like DAP, gretl or PSPP are often used. This divergence of culture is exhibited in the typical numerical representations used in data processing versus data analysis: data processing's measurements are typically represented by integers or by fixed-point or binary-coded decimal representations of numbers, whereas the majority of data analysis's measurements are often represented by floating-point representations of rational numbers.
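The representational divergence described above can be sketched in a few lines of Python. This is an illustrative toy, not taken from any particular data-processing system; the prices are made up:

```python
from decimal import Decimal

# Business-style data processing often keeps money as integer cents
# (a fixed-point convention), so sums stay exact.
prices_cents = [1999, 2499, 999]
total_cents = sum(prices_cents)        # exact integer arithmetic: 5497

# Analysis-style code tends to use binary floating point, which cannot
# represent many decimal fractions exactly.
float_total = 0.1 + 0.2                # close to, but not exactly, 0.3

# Decimal gives decimal fixed-point-style semantics when exactness matters.
decimal_total = Decimal("0.1") + Decimal("0.2")
```

The floating-point result is adequate for statistical work, where tiny representation errors are dwarfed by measurement error, but unacceptable for ledgers, which is why the two cultures pick different representations.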

Processing
Practically all naturally occurring processes can be viewed as examples of data processing systems, where "observable" information in the form of pressure, light, etc. is converted by human observers into electrical signals in the nervous system as the senses we recognize as touch, sound, and vision. Even the interaction of non-living systems may be viewed in this way as rudimentary information processing systems. Conventional usage of the terms data processing and information systems restricts their use to refer to the algorithmic derivations, logical deductions, and statistical calculations that recur perennially in general business environments, rather than in the more expansive sense of all conversions of real-world measurements into real-world information in, say, an organic biological system or even a scientific or engineering system.


Elements of data processing
In order to be processed by a computer, data must first be converted into a machine-readable format. Once data is in digital format, various procedures can be applied to it to produce useful information. Data processing may involve various processes, including:
• Data acquisition
• Data entry
• Data cleaning
• Data coding
• Data transformation
• Data translation
• Data summarization
• Data aggregation
• Data validation
• Data tabulation
• Statistical analysis
• Computer graphics
• Data warehousing

• Data mining
• Data fusion
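Several of the stages listed above can be chained into a small pipeline. The sketch below is a toy example with made-up records, not a real data-processing system:

```python
# Hypothetical raw input, as it might arrive from data entry.
raw = [" 42 ", "17", "", "abc", "8"]

# Data cleaning: strip whitespace and drop empty fields.
cleaned = [r.strip() for r in raw if r.strip()]

# Data validation: keep only well-formed integer fields.
valid = [r for r in cleaned if r.isdigit()]

# Data transformation: convert text into machine-usable numbers.
numbers = [int(r) for r in valid]

# Data summarization: reduce the records to a compact report.
summary = {"count": len(numbers), "total": sum(numbers)}
print(summary)  # {'count': 3, 'total': 67}
```

Each stage corresponds to one bullet in the list; real systems add the remaining stages (coding, tabulation, warehousing, and so on) around the same basic flow.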

See also
• Data processor
• Two pass verification

Further reading
• Bourque, Linda B.; Clark, Virginia A., Processing Data: The Survey Example (Quantitative Applications in the Social Sciences), Sage Publications, Inc. (December 14, 2008), ISBN 08056781901



Thread (computer science)


In computer science, a thread of execution results from a fork of a computer program into two or more concurrently running tasks. The implementation of threads and processes differs from one operating system to another, but in most cases a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources. On a single processor, multithreading generally occurs by time-division multiplexing (as in multitasking): the processor switches between different threads. This context switching generally happens frequently enough that the user perceives the threads or tasks as running at the same time. On a multiprocessor or multi-core system, the threads or tasks will generally run at the same time, with each processor or core running a particular thread or task.

[Figure: A process with two threads of execution.]

Many modern operating systems directly support both time-sliced and multiprocessor threading with a process scheduler. The kernel of an operating system allows programmers to manipulate threads via the system call interface. Such a kernel-managed implementation is called a kernel thread; a lightweight process (LWP) is a specific type of kernel thread that shares the same state and information. Programs can also have user-space threads, built with timers, signals, or other methods to interrupt their own execution, performing a sort of ad hoc time-slicing.
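The defining property of threads, that they live in one address space and see the same memory, can be sketched with Python's threading module. The worker function and names here are hypothetical:

```python
import threading

results = []  # shared: both threads live in the same address space

def worker(name, n):
    # Each thread appends to the very same list object.
    for i in range(n):
        results.append((name, i))

t1 = threading.Thread(target=worker, args=("a", 3))
t2 = threading.Thread(target=worker, args=("b", 3))
t1.start()
t2.start()
t1.join()   # wait for both threads to finish before reading results
t2.join()

print(len(results))  # 6: all writes from both threads landed in shared memory
```

Two separate processes running the same code would each have their own private copy of `results`; that contrast is exactly the thread/process distinction drawn below.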

Threads compared with processes
Threads differ from traditional multitasking operating system processes in that:
• processes are typically independent, while threads exist as subsets of a process
• processes carry considerable state information, whereas multiple threads within a process share state as well as memory and other resources
• processes have separate address spaces, whereas threads share their address space
• processes interact only through system-provided inter-process communication mechanisms
• context switching between threads in the same process is typically faster than context switching between processes
Systems like Windows NT and OS/2 are said to have "cheap" threads and "expensive" processes; in other operating systems there is not so great a difference, except for the cost of an address-space switch, which implies a TLB flush.

Multithreading: Advantages/Uses
Multithreading as a widespread programming and execution model allows multiple threads to exist within the context of a single process. These threads share the process's resources but are able to execute independently. The threaded programming model provides developers with a useful abstraction of concurrent execution. However, perhaps the most interesting application of the technology is when it is applied to a single process to enable parallel execution on a multiprocessor system. This advantage of a multithreaded program allows it to operate faster on computer systems that have multiple CPUs, CPUs with multiple cores, or across a cluster of machines, because the threads of the program naturally lend themselves to truly concurrent execution. In such a case, the programmer needs to be careful to avoid race


conditions, and other non-intuitive behaviors. In order for data to be correctly manipulated, threads will often need to rendezvous in time in order to process the data in the correct order. Threads may also require mutually exclusive operations (often implemented using semaphores) in order to prevent common data from being simultaneously modified, or read while in the process of being modified. Careless use of such primitives can lead to deadlocks.
Another advantage of multithreading, even for single-CPU systems, is the ability for an application to remain responsive to input. In a single-threaded program, if the main execution thread blocks on a long-running task, the entire application can appear to freeze. By moving such long-running tasks to a worker thread that runs concurrently with the main execution thread, it is possible for the application to remain responsive to user input while executing tasks in the background.
Operating systems schedule threads in one of two ways:
1. Preemptive multithreading is generally considered the superior approach, as it allows the operating system to determine when a context switch should occur. The disadvantage of preemptive multithreading is that the system may make a context switch at an inappropriate time, causing priority inversion or other negative effects, which may be avoided by cooperative multithreading.
2. Cooperative multithreading, on the other hand, relies on the threads themselves to relinquish control once they are at a stopping point. This can create problems if a thread is waiting for a resource to become available.
Traditional mainstream computing hardware did not have much support for multithreading, because switching between threads was generally already quicker than full process context switches.
Processors in embedded systems, which have higher requirements for real-time behavior, might support multithreading by decreasing the thread-switch time, perhaps by allocating a dedicated register file for each thread instead of saving/restoring a common register file. In the late 1990s, the idea of executing instructions from multiple threads simultaneously became known as simultaneous multithreading. This feature was introduced in Intel's Pentium 4 processor under the name Hyper-Threading.
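The mutual-exclusion concern described above can be sketched with Python's threading module. A `Lock` plays the role of the mutex; without it, the read-modify-write on the shared counter could interleave and lose updates (the function and names are illustrative, not from any particular codebase):

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        # "counter += 1" is a read-modify-write, not a single atomic step;
        # the lock makes the critical section mutually exclusive.
        with lock:
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000, deterministically, because every update was serialized
```

Acquiring locks in an inconsistent order across threads is what produces the deadlocks mentioned above, which is why real code usually fixes a global lock ordering or uses higher-level primitives.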

Processes, kernel threads, user threads, and fibers
A process is the "heaviest" unit of kernel scheduling. Processes own resources allocated by the operating system. Resources include memory, file handles, sockets, device handles, and windows. Processes do not share address spaces or file resources except through explicit methods such as inheriting file handles or shared memory segments, or mapping the same file in a shared way. Processes are typically preemptively multitasked.
A kernel thread is the "lightest" unit of kernel scheduling. At least one kernel thread exists within each process. If multiple kernel threads can exist within a process, then they share the same memory and file resources. Kernel threads are preemptively multitasked if the operating system's process scheduler is preemptive. Kernel threads do not own resources except for a stack, a copy of the registers including the program counter, and thread-local storage (if any). The kernel can assign one thread to each logical core in a system (a core splits itself into multiple logical cores if it supports hardware multithreading, and presents one logical core per physical core if it does not), and can swap out threads that get blocked. However, kernel threads take much longer than user threads to be swapped.
Threads are sometimes implemented in userspace libraries; these are called user threads. The kernel is not aware of them, so they are managed and scheduled in userspace. Some implementations base their user threads on top of several kernel threads to benefit from multi-processor machines (the N:M model). In this article the term "thread" (without a kernel or user qualifier) defaults to kernel threads. User threads as implemented by virtual machines are also called green threads. User threads are generally fast to create and manage, but cannot take advantage of multithreaded or multiprocessor hardware, and they all get blocked if all of their associated kernel threads get blocked, even if there are some user threads that are ready to run.



Fibers are an even lighter unit of scheduling which are cooperatively scheduled: a running fiber must explicitly "yield" to allow another fiber to run, which makes their implementation much easier than kernel or user threads. A fiber can be scheduled to run in any thread in the same process. This permits applications to gain performance improvements by managing scheduling themselves, instead of relying on the kernel scheduler (which may not be tuned for the application). Parallel programming environments such as OpenMP typically implement their tasks through fibers.
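The cooperative "run until an explicit yield" behaviour of fibers can be sketched in user space with Python generators. This is a toy round-robin scheduler, not a real fiber library; all names are made up:

```python
from collections import deque

def fiber(name, steps, trace):
    # Each generator plays the role of a fiber: it runs until it
    # explicitly yields control back to the scheduler.
    for i in range(steps):
        trace.append(f"{name}{i}")
        yield  # explicit yield point

def run(fibers):
    ready = deque(fibers)  # the scheduler's ready queue
    while ready:
        f = ready.popleft()
        try:
            next(f)          # resume the fiber until its next yield
            ready.append(f)  # still runnable: put it back in the queue
        except StopIteration:
            pass             # fiber finished; drop it

trace = []
run([fiber("a", 2, trace), fiber("b", 2, trace)])
print(trace)  # ['a0', 'b0', 'a1', 'b1'] — strict round-robin interleaving
```

Because a fiber that never yields would monopolize the scheduler, cooperative designs like this depend on every fiber yielding promptly, which is exactly the weakness of cooperative multithreading noted earlier.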

Thread and fiber issues

Concurrency and data structures
Threads in the same process share the same address space. This allows concurrently running code to couple tightly and conveniently exchange data without the overhead or complexity of IPC. When shared between threads, however, even simple data structures become prone to race hazards if they require more than one CPU instruction to update: two threads may end up attempting to update the data structure at the same time and find it unexpectedly changing underfoot. Bugs caused by race hazards can be very difficult to reproduce and isolate. To prevent this, threading APIs offer synchronization primitives such as mutexes to lock data structures against concurrent access. On uniprocessor systems, a thread running into a locked mutex must sleep and hence trigger a context switch. On multi-processor systems, the thread may instead poll the mutex in a spinlock. Both of these may sap performance and force processors in SMP systems to contend for the memory bus, especially if the granularity of the locking is fine.

I/O and scheduling
User thread or fiber implementations are typically entirely in userspace. As a result, context switching between user threads or fibers within the same process is extremely efficient because it does not require any interaction with the kernel at all: a context switch can be performed by locally saving the CPU registers used by the currently executing user thread or fiber and then loading the registers required by the user thread or fiber to be executed. Since scheduling occurs in userspace, the scheduling policy can be more easily tailored to the requirements of the program's workload. However, the use of blocking system calls in user threads (as opposed to kernel threads) or fibers can be problematic. If a user thread or a fiber performs a system call that blocks, the other user threads and fibers in the process are unable to run until the system call returns.
A typical example of this problem concerns I/O: most programs are written to perform I/O synchronously. When an I/O operation is initiated, a system call is made and does not return until the operation has completed. In the intervening period, the entire process is "blocked" by the kernel and cannot run, which starves other user threads and fibers in the same process from executing. A common solution to this problem is providing an I/O API that implements a synchronous interface by using non-blocking I/O internally, and scheduling another user thread or fiber while the I/O operation is in progress. Similar solutions can be provided for other blocking system calls. Alternatively, the program can be written to avoid the use of synchronous I/O or other blocking system calls.
SunOS 4.x implemented "light-weight processes" or LWPs. NetBSD 2.x+ and DragonFly BSD implement LWPs as kernel threads (1:1 model). SunOS 5.2 through SunOS 5.8, as well as NetBSD 2 to NetBSD 4, implemented a two-level model, multiplexing one or more user-level threads on each kernel thread (M:N model). SunOS 5.9 and later, as well as NetBSD 5, eliminated user-threads support, returning to a 1:1 model. [1] FreeBSD 5 implemented the M:N model. FreeBSD 6 supported both 1:1 and M:N; users could choose which model a given program used via /etc/libmap.conf. Starting with FreeBSD 7, 1:1 became the default. FreeBSD 8 no longer supports the M:N model.
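The "schedule another task while the I/O operation is in progress" pattern is what Python's asyncio event loop does in user space: awaits are non-blocking under the hood, so the loop can resume another task while one waits. A hedged sketch, with `asyncio.sleep` standing in for a non-blocking I/O wait and all task names invented:

```python
import asyncio

order = []

async def task(name, delay):
    order.append(f"{name}:start")
    await asyncio.sleep(delay)   # yields to the event loop, like non-blocking I/O
    order.append(f"{name}:done")

async def main():
    # Both tasks "wait" concurrently; neither blocks the whole process,
    # so the shorter wait finishes first even though it was started second.
    await asyncio.gather(task("a", 0.02), task("b", 0.01))

asyncio.run(main())
print(order)  # ['a:start', 'b:start', 'b:done', 'a:done']
```

A blocking call (say, `time.sleep`) in either task would stall the entire loop, which is precisely the user-thread starvation problem described above.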



The use of kernel threads simplifies user code by moving some of the most complex aspects of threading into the kernel. The program doesn't need to schedule threads or explicitly yield the processor. User code can be written in a familiar procedural style, including calls to blocking APIs, without starving other threads. However, kernel threading on uniprocessor systems may force a context switch between threads at any time, and thus expose race hazards and concurrency bugs that would otherwise lie latent. On SMP systems, this is further exacerbated because kernel threads may literally execute concurrently on separate processors.

Models

1:1 (Kernel-level threading)
Threads created by the user are in 1-1 correspondence with schedulable entities in the kernel. This is the simplest possible threading implementation. On Linux, the usual C library implements this approach (via the NPTL or older LinuxThreads). The same approach is used by Solaris, NetBSD and FreeBSD.

N:1 (User-level threading)
An N:1 model implies that all application-level threads map to a single kernel-level scheduled entity; the kernel has no knowledge of the application threads. With this approach, context switching can be done very quickly and, in addition, it can be implemented even on simple kernels which do not support threading. One of the major drawbacks, however, is that it cannot benefit from the hardware acceleration on multithreaded processors or multi-processor computers: there is never more than one thread being scheduled at the same time. It is used by GNU Portable Threads.

N:M (Hybrid threading)
N:M maps some N number of application threads onto some M number of kernel entities, or "virtual processors". This is a compromise between kernel-level ("1:1") and user-level ("N:1") threading. In general, "N:M" threading systems are more complex to implement than either kernel or user threads, because changes to both kernel and user-space code are required. In the N:M implementation, the threading library is responsible for scheduling user threads on the available schedulable entities; this makes context switching of threads very fast, as it avoids system calls. However, it increases complexity and the likelihood of priority inversion, as well as suboptimal scheduling without extensive (and expensive) coordination between the userland scheduler and the kernel scheduler.
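A rough user-space sketch of the N:M idea: N lightweight tasks multiplexed onto M worker threads by a queue-based userland scheduler. Real N:M systems multiplex threads with full stacks, not simple callables, and coordinate with the kernel; every name below is hypothetical:

```python
import queue
import threading

N_TASKS, M_WORKERS = 8, 2
tasks = queue.Queue()        # the userland scheduler's run queue
done = []
done_lock = threading.Lock()

for i in range(N_TASKS):
    tasks.put(i)

def worker():
    # Each worker thread (one of the M kernel entities) repeatedly picks
    # the next runnable task (one of the N application-level units).
    while True:
        try:
            i = tasks.get_nowait()
        except queue.Empty:
            return
        with done_lock:
            done.append(i * i)  # "run" the task

workers = [threading.Thread(target=worker) for _ in range(M_WORKERS)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Because the dispatch happens in user space, no system call is needed per task switch, which is the speed advantage the text attributes to N:M; the scheduling subtleties it warns about arise once tasks can block mid-run.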

Implementations
There are many different and incompatible implementations of threading, including both kernel-level and user-level implementations. However, they often follow the POSIX Threads interface more or less closely.

Kernel-level implementation examples
• Light Weight Kernel Threads (LWKT) in various BSDs
• M:N threading
• Native POSIX Thread Library (NPTL) for Linux, an implementation of the POSIX Threads (pthreads) standard
• Apple Multiprocessing Services version 2.0 and later; uses the built-in nanokernel in Mac OS 8.6 and later, which was modified to support it


User-level implementation examples
• GNU Portable Threads
• FSU Pthreads
• Apple Inc.'s Thread Manager
• REALbasic (includes an API for cooperative threading)
• Netscape Portable Runtime (includes a user-space fibers implementation)
• State threads

Hybrid implementation examples
• Scheduler activations used by the NetBSD native POSIX threads library implementation (an N:M model, as opposed to a 1:1 kernel or userspace implementation model)
• Marcel from the PM2 project
• The OS for the Tera/Cray MTA
• Microsoft Windows 7

Fiber implementation examples
Fibers can be implemented without operating system support, although some operating systems or libraries provide explicit support for them.
• Win32 supplies a fiber API[2] (Windows NT 3.51 SP3 and later)
• Ruby as Green threads

Programming Language Support
Many programming languages support threading in some capacity. Many implementations of C and C++ do not provide direct support for threading themselves, but provide access to the native threading APIs provided by the operating system. Some higher-level (and usually cross-platform) programming languages, such as Java, Python, and .NET, expose threading to the developer while abstracting the platform-specific differences in threading implementations in the runtime. A number of other programming languages try to abstract the concept of concurrency and threading from the developer altogether. A few interpreted programming languages, such as CPython and Ruby, support threading but have a limitation known as a Global Interpreter Lock (GIL). The GIL is a mutual exclusion lock held by the interpreter that prevents it from interpreting application code on two or more threads at the same time, which effectively limits the concurrency on multiple-core systems.

See also
• Win32 Thread Information Block
• Hardware: Multithreading (computer hardware), Multi-core (computing), Simultaneous multithreading
• Theory: Communicating sequential processes, Computer multitasking, Message passing
• Problems: Thread safety, Priority inversion
• Techniques: Protothreads, Thread pool pattern, Lock-free and wait-free algorithms
• System Calls: clone (Linux system call)



Thread (computer science)

References
[1] http://www.sun.com/software/whitepapers/solaris9/multithread.pdf
[2] CreateFiber, MSDN (http://msdn.microsoft.com/en-us/library/ms682402(VS.85).aspx)

• David R. Butenhof: Programming with POSIX Threads, Addison-Wesley, ISBN 0-201-63392-2
• Bradford Nichols, Dick Buttlar, Jacqueline Proulx Farell: Pthreads Programming, O'Reilly & Associates, ISBN 1-56592-115-1
• Charles J. Northrup: Programming with UNIX Threads, John Wiley & Sons, ISBN 0-471-13751-0
• Mark Walmsley: Multi-Threaded Programming in C++, Springer, ISBN 1-85233-146-1
• Paul Hyde: Java Thread Programming, Sams, ISBN 0-672-31585-8
• Bill Lewis: Threads Primer: A Guide to Multithreaded Programming, Prentice Hall, ISBN 0-13-443698-9
• Steve Kleiman, Devang Shah, Bart Smaalders: Programming With Threads, SunSoft Press, ISBN 0-13-172389-8
• Pat Villani: Advanced WIN32 Programming: Files, Threads, and Process Synchronization, Harpercollins Publishers, ISBN 0-87930-563-0
• Jim Beveridge, Robert Wiener: Multithreading Applications in Win32, Addison-Wesley, ISBN 0-201-44234-5
• Thuan Q. Pham, Pankaj K. Garg: Multithreaded Programming with Windows NT, Prentice Hall, ISBN 0-13-120643-5
• Len Dorfman, Marc J. Neuberger: Effective Multithreading in OS/2, McGraw-Hill Osborne Media, ISBN 0-07-017841-0
• Alan Burns, Andy Wellings: Concurrency in ADA, Cambridge University Press, ISBN 0-521-62911-X
• Uresh Vahalia: Unix Internals: the New Frontiers, Prentice Hall, ISBN 0-13-101908-2
• Alan L. Dennis: .Net Multithreading, Manning Publications Company, ISBN 1-930110-54-5
• Tobin Titus, Fabio Claudio Ferracchiati, Srinivasa Sivakumar, Tejaswi Redkar, Sandra Gopikrishna: C# Threading Handbook, Peer Information Inc, ISBN 1-86100-829-5
• Tobin Titus, Fabio Claudio Ferracchiati, Srinivasa Sivakumar, Tejaswi Redkar, Sandra Gopikrishna: Visual Basic .Net Threading Handbook, Wrox Press Inc, ISBN 1-86100-713-2

External links
• Article "Query by Slice, Parallel Execute, and Join: A Thread Pool Pattern in Java" (http://today.java.net/pub/a/today/2008/01/31/query-by-slice-parallel-execute-join-thread-pool-pattern.html) by Binildas C. A.
• Answers to frequently asked questions for comp.programming.threads (http://www.serpentine.com/~bos/threads-faq/)
• The C10K problem (http://www.kegel.com/c10k.html)
• Article "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software" (http://gotw.ca/publications/concurrency-ddj.htm) by Herb Sutter
• Article "The Problem with Threads" (http://www.computer.org/portal/site/computer/index.jsp?pageID=computer_level1_article&TheCat=1005&path=computer/homepage/0506&file=cover.xml&xsl=article.xsl) by Edward Lee
• Parallel computing community (http://multicore.ning.com/)
• POSIX threads explained (http://www.ibm.com/developerworks/library/l-posix1.html) by Daniel Robbins
• Concepts of Multithreading (http://thekiransblog.blogspot.com/2010/02/multithreading.html)
• Multithreading in the Solaris Operating Environment (http://www.sun.com/software/whitepapers/solaris9/multithread.pdf)
• Debugging and Optimizing Multithreaded OpenMP Programs (http://www.ddj.com/215600207)



Parallel computing

Parallel computing is a form of computation in which many calculations are carried out simultaneously,[1] operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently ("in parallel"). There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has been employed for many years, mainly in high-performance computing, but interest in it has grown lately due to the physical constraints preventing frequency scaling.[2] As power consumption (and consequently heat generation) by computers has become a concern in recent years,[3] parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multicore processors.[4]
Parallel computers can be roughly classified according to the level at which the hardware supports parallelism, with multi-core and multi-processor computers having multiple processing elements within a single machine, while clusters, MPPs, and grids use multiple computers to work on the same task. Specialized parallel computer architectures are sometimes used alongside traditional processors, for accelerating specific tasks.
Parallel computer programs are more difficult to write than sequential ones,[5] because concurrency introduces several new classes of potential software bugs, of which race conditions are the most common. Communication and synchronization between the different subtasks are typically among the greatest obstacles to getting good parallel program performance. The maximum possible speed-up of a program as a result of parallelization is given by Amdahl's law.

Background Traditionally, computer software has been written for serial computation. To solve a problem, an algorithm is constructed and implemented as a serial stream of instructions. These instructions are executed on a central processing unit on one computer. Only one instruction may execute at a time—after that instruction is finished, the next is executed.[6] Parallel computing, on the other hand, uses multiple processing elements simultaneously to solve a problem. This is accomplished by breaking the problem into independent parts so that each processing element can execute its part of the algorithm simultaneously with the others. The processing elements can be diverse and include resources such as a single computer with multiple processors, several networked computers, specialized hardware, or any combination of the above.[6] Frequency scaling was the dominant reason for improvements in computer performance from the mid-1980s until 2004. The runtime of a program is equal to the number of instructions multiplied by the average time per instruction. Maintaining everything else constant, increasing the clock frequency decreases the average time it takes to execute an instruction. An increase in frequency thus decreases runtime for all computation-bounded programs.[7] However, power consumption by a chip is given by the equation P = C × V2 × F, where P is power, C is the capacitance being switched per clock cycle (proportional to the number of transistors whose inputs change), V is voltage, and F is the processor frequency (cycles per second).[8] Increases in frequency increase the amount of power used in a processor. 
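A quick numeric illustration of the dynamic-power equation above; the capacitance, voltage, and frequency values below are invented for illustration only:

```python
# P = C * V^2 * F: dynamic power grows linearly with frequency but
# quadratically with voltage.
def power(c, v, f):
    return c * v ** 2 * f

base = power(1e-9, 1.2, 3.0e9)    # higher voltage and frequency
scaled = power(1e-9, 1.0, 2.0e9)  # lower voltage and frequency
# Lowering V and F together cuts power superlinearly, which is why
# raising frequency (and the voltage needed to sustain it) hit a
# power wall.
print(scaled / base)
```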
Increasing processor power consumption led ultimately to Intel's May 2004 cancellation of its Tejas and Jayhawk processors, which is generally cited as the end of frequency scaling as the dominant computer architecture paradigm.[9] Moore's Law is the empirical observation that transistor density in a microprocessor doubles every 18 to 24 months.[10] Despite power consumption issues, and repeated predictions of its end, Moore's law is still in effect. With the end of frequency scaling, these additional transistors (which are no longer used for frequency scaling) can be used to add extra hardware for parallel computing.


Amdahl's law and Gustafson's law
Optimally, the speed-up from parallelization would be linear—doubling the number of processing elements should halve the runtime, and doubling it a second time should again halve the runtime. However, very few parallel algorithms achieve optimal speed-up. Most of them have a near-linear speed-up for small numbers of processing elements, which flattens out into a constant value for large numbers of processing elements.
The potential speed-up of an algorithm on a parallel computing platform is given by Amdahl's law, originally formulated by Gene Amdahl in the 1960s.[11] It states that a small portion of the program which cannot be parallelized will limit the overall speed-up available from parallelization.
A graphical representation of Amdahl's law: the speed-up of a program from parallelization is limited by how much of the program can be parallelized. For example, if 90% of the program can be parallelized, the theoretical maximum speed-up using parallel computing would be 10× no matter how many processors are used.
Any large mathematical or engineering problem will typically consist of several parallelizable parts and several non-parallelizable (sequential) parts. This relationship is given by the equation:

S = 1 / (1 − P)

where S is the speed-up of the program (as a factor of its original sequential runtime), and P is the fraction that is parallelizable. If the sequential portion of a program is 10% of the runtime, we can get no more than a 10× speed-up, regardless of how many processors are added. This puts an upper limit on the usefulness of adding more parallel execution units. "When a task cannot be partitioned because of sequential constraints, the application of more effort has no effect on the schedule. The bearing of a child takes nine months, no matter how many women are assigned."[12]
Gustafson's law is another law in computer engineering, closely related to Amdahl's law. It can be formulated as:

S(P) = P − α(P − 1)

Assume that a task has two independent parts, A and B. B takes roughly 25% of the time of the whole computation. With effort, a programmer may be able to make this part five times faster, but this only reduces the time for the whole computation by a little. In contrast, one may need to perform less work to make part A twice as fast. This will make the computation much faster than by optimizing part B, even though B got a greater speed-up (5× versus 2×).



where P is the number of processors, S is the speed-up, and α the non-parallelizable part of the process.[13] Amdahl's law assumes a fixed problem size and that the size of the sequential section is independent of the number of processors, whereas Gustafson's law does not make these assumptions.
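The two laws can be compared numerically. A small sketch, using the general n-processor form of Amdahl's law, S = 1/((1 − P) + P/n), whose speed-up approaches 1/(1 − P) as processors are added; the function names are illustrative:

```python
def amdahl_speedup(p, n):
    """Speed-up on n processors when a fraction p of the work is parallelizable."""
    return 1.0 / ((1.0 - p) + p / n)

def gustafson_speedup(alpha, n):
    """Gustafson's law: S(P) = P - alpha*(P - 1), for serial fraction alpha."""
    return n - alpha * (n - 1)

# A 10% sequential portion caps Amdahl's speed-up near 10x, however
# many processors are used; Gustafson's scaled-workload view keeps growing.
print(round(amdahl_speedup(0.9, 1_000_000), 2))  # 10.0
print(gustafson_speedup(0.1, 100))               # 90.1
```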

Dependencies
Understanding data dependencies is fundamental in implementing parallel algorithms. No program can run more quickly than the longest chain of dependent calculations (known as the critical path), since calculations that depend upon prior calculations in the chain must be executed in order. However, most algorithms do not consist of just a long chain of dependent calculations; there are usually opportunities to execute independent calculations in parallel.
Let Pi and Pj be two program fragments. Bernstein's conditions[14] describe when the two are independent and can be executed in parallel. For Pi, let Ii be all of the input variables and Oi the output variables, and likewise for Pj. Pi and Pj are independent if they satisfy
• Ij ∩ Oi = ∅
• Ii ∩ Oj = ∅
• Oi ∩ Oj = ∅
Violation of the first condition introduces a flow dependency, corresponding to the first statement producing a result used by the second statement. The second condition represents an anti-dependency, when the second statement overwrites a variable needed by the first statement. The third and final condition represents an output dependency: when two statements write to the same location, the final result must come from the logically last executed statement.[15]
Consider the following functions, which demonstrate several kinds of dependencies:

1: function Dep(a, b)
2:    c := a·b
3:    d := 2·c
4: end function

Operation 3 in Dep(a, b) cannot be executed before (or even in parallel with) operation 2, because operation 3 uses a result from operation 2. It violates condition 1, and thus introduces a flow dependency.

1: function NoDep(a, b)
2:    c := a·b
3:    d := 2·b
4:    e := a+b
5: end function

In this example, there are no dependencies between the instructions, so they can all be run in parallel.
Bernstein's conditions do not allow memory to be shared between different processes. For that, some means of enforcing an ordering between accesses is necessary, such as semaphores, barriers or some other synchronization method.
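The independent assignments of NoDep can in fact be dispatched concurrently. A sketch using Python's concurrent.futures (the function name is a hypothetical illustration, not part of the original example):

```python
from concurrent.futures import ThreadPoolExecutor

def no_dep(a, b):
    # The three assignments read only a and b and each write a distinct
    # variable, so they satisfy Bernstein's conditions and can be
    # evaluated in parallel.
    with ThreadPoolExecutor(max_workers=3) as pool:
        c = pool.submit(lambda: a * b)
        d = pool.submit(lambda: 2 * b)
        e = pool.submit(lambda: a + b)
        return c.result(), d.result(), e.result()

print(no_dep(3, 4))  # (12, 8, 7)
```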

Race conditions, mutual exclusion, synchronization, and parallel slowdown
Subtasks in a parallel program are often called threads. Some parallel computer architectures use smaller, lightweight versions of threads known as fibers, while others use bigger versions known as processes. However, "threads" is generally accepted as a generic term for subtasks. Threads will often need to update some variable that is shared between them. The instructions of the two threads may be interleaved in any order. For example, consider the following program:


Thread A                        Thread B
1A: Read variable V             1B: Read variable V
2A: Add 1 to variable V         2B: Add 1 to variable V
3A: Write back to variable V    3B: Write back to variable V

If instruction 1B is executed between 1A and 3A, or if instruction 1A is executed between 1B and 3B, the program will produce incorrect data. This is known as a race condition. The programmer must use a lock to provide mutual exclusion. A lock is a programming language construct that allows one thread to take control of a variable and prevent other threads from reading or writing it, until that variable is unlocked. The thread holding the lock is free to execute its critical section (the section of a program that requires exclusive access to some variable), and to unlock the data when it is finished. Therefore, to guarantee correct program execution, the above program can be rewritten to use locks:

Thread A                        Thread B
1A: Lock variable V             1B: Lock variable V
2A: Read variable V             2B: Read variable V
3A: Add 1 to variable V         3B: Add 1 to variable V
4A: Write back to variable V    4B: Write back to variable V
5A: Unlock variable V           5B: Unlock variable V

One thread will successfully lock variable V, while the other thread will be locked out, unable to proceed until V is unlocked again. This guarantees correct execution of the program. Locks, while necessary to ensure correct program execution, can greatly slow a program.
Locking multiple variables using non-atomic locks introduces the possibility of program deadlock. An atomic lock locks multiple variables all at once. If it cannot lock all of them, it does not lock any of them. If two threads each need to lock the same two variables using non-atomic locks, it is possible that one thread will lock one of them and the second thread will lock the second variable. In such a case, neither thread can complete, and deadlock results.
Many parallel programs require that their subtasks act in synchrony. This requires the use of a barrier. Barriers are typically implemented using a software lock. One class of algorithms, known as lock-free and wait-free algorithms, altogether avoids the use of locks and barriers. However, this approach is generally difficult to implement and requires correctly designed data structures.
Not all parallelization results in speed-up. Generally, as a task is split up into more and more threads, those threads spend an ever-increasing portion of their time communicating with each other. Eventually, the overhead from communication dominates the time spent solving the problem, and further parallelization (that is, splitting the workload over even more threads) increases rather than decreases the amount of time required to finish. This is known as parallel slowdown.
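The locked counter program above can be written, for example, with Python's threading.Lock; the counts and names here are illustrative:

```python
import threading

counter = 0          # the shared variable V
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # Lock, read, add, write back, unlock: the critical section from
        # the table above, so no interleaving can lose an update.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000
```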

Fine-grained, coarse-grained, and embarrassing parallelism Applications are often classified according to how often their subtasks need to synchronize or communicate with each other. An application exhibits fine-grained parallelism if its subtasks must communicate many times per second; it exhibits coarse-grained parallelism if they do not communicate many times per second, and it is embarrassingly parallel if they rarely or never have to communicate. Embarrassingly parallel applications are considered the easiest to parallelize.



Consistency models Parallel programming languages and parallel computers must have a consistency model (also known as a memory model). The consistency model defines rules for how operations on computer memory occur and how results are produced. One of the first consistency models was Leslie Lamport's sequential consistency model. Sequential consistency is the property of a parallel program that its parallel execution produces the same results as a sequential program. Specifically, a program is sequentially consistent if "... the results of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program".[16] Software transactional memory is a common type of consistency model. Software transactional memory borrows from database theory the concept of atomic transactions and applies them to memory accesses. Mathematically, these models can be represented in several ways. Petri nets, which were introduced in Carl Adam Petri's 1962 doctoral thesis, were an early attempt to codify the rules of consistency models. Dataflow theory later built upon these, and Dataflow architectures were created to physically implement the ideas of dataflow theory. Beginning in the late 1970s, process calculi such as Calculus of Communicating Systems and Communicating Sequential Processes were developed to permit algebraic reasoning about systems composed of interacting components. More recent additions to the process calculus family, such as the π-calculus, have added the capability for reasoning about dynamic topologies. Logics such as Lamport's TLA+, and mathematical models such as traces and Actor event diagrams, have also been developed to describe the behavior of concurrent systems.

Flynn's taxonomy
Michael J. Flynn created one of the earliest classification systems for parallel (and sequential) computers and programs, now known as Flynn's taxonomy. Flynn classified programs and computers by whether they were operating using a single set or multiple sets of instructions, and whether those instructions were using a single set or multiple sets of data.

Flynn's taxonomy

                 Single instruction    Multiple instruction
Single data      SISD                  MISD
Multiple data    SIMD                  MIMD

The single-instruction-single-data (SISD) classification is equivalent to an entirely sequential program. The single-instruction-multiple-data (SIMD) classification is analogous to doing the same operation repeatedly over a large data set. This is commonly done in signal processing applications. Multiple-instruction-single-data (MISD) is a rarely used classification. While computer architectures to deal with this were devised (such as systolic arrays), few applications that fit this class materialized. Multiple-instruction-multiple-data (MIMD) programs are by far the most common type of parallel programs. According to David A. Patterson and John L. Hennessy, "Some machines are hybrids of these categories, of course, but this classic model has survived because it is simple, easy to understand, and gives a good first approximation. It is also—perhaps because of its understandability—the most widely used scheme."[17]



Types of parallelism
Bit-level parallelism
From the advent of very-large-scale integration (VLSI) computer-chip fabrication technology in the 1970s until about 1986, speed-up in computer architecture was driven by doubling computer word size—the amount of information the processor can manipulate per cycle.[18] Increasing the word size reduces the number of instructions the processor must execute to perform an operation on variables whose sizes are greater than the length of the word. For example, where an 8-bit processor must add two 16-bit integers, the processor must first add the 8 lower-order bits from each integer using the standard addition instruction, then add the 8 higher-order bits using an add-with-carry instruction and the carry bit from the lower-order addition; thus, an 8-bit processor requires two instructions to complete a single operation, where a 16-bit processor would be able to complete the operation with a single instruction.
Historically, 4-bit microprocessors were replaced with 8-bit, then 16-bit, then 32-bit microprocessors. This trend generally came to an end with the introduction of 32-bit processors, which were the standard in general-purpose computing for two decades. Not until recently (c. 2003–2004), with the advent of x86-64 architectures, have 64-bit processors become commonplace.
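The two-step addition can be sketched as follows (a Python simulation of the 8-bit sequence, for illustration only, not real processor code):

```python
# Adding two 16-bit integers on a hypothetical 8-bit processor:
# low-order bytes first, then high-order bytes with the carry.
def add16_on_8bit(a, b):
    lo = (a & 0xFF) + (b & 0xFF)                # standard 8-bit addition
    carry = lo >> 8
    hi = ((a >> 8) + (b >> 8) + carry) & 0xFF   # add-with-carry
    return (hi << 8) | (lo & 0xFF)

print(hex(add16_on_8bit(0x12F0, 0x0020)))  # 0x1310: the carry propagates
```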

Instruction-level parallelism A computer program is, in essence, a stream of instructions executed by a processor. These instructions can be re-ordered and combined into groups which are then executed in parallel without changing the result of the program. This is known as instruction-level parallelism. Advances in instruction-level parallelism dominated computer architecture from the mid-1980s until the mid-1990s.[19]

A canonical five-stage pipeline in a RISC machine (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory access, WB = Register write back)

Modern processors have multi-stage instruction pipelines. Each stage in the pipeline corresponds to a different action the processor performs on that instruction in that stage; a processor with an N-stage pipeline can have up to N different instructions at different stages of completion. The canonical example of a pipelined processor is a RISC processor, with five stages: instruction fetch, decode, execute, memory access, and write back. The Pentium 4 processor had a 35-stage pipeline.[20]



In addition to instruction-level parallelism from pipelining, some processors can issue more than one instruction at a time. These are known as superscalar processors. Instructions can be grouped together only if there is no data dependency between them. Scoreboarding and the Tomasulo algorithm (which is similar to scoreboarding but makes use of register renaming) are two of the most common techniques for implementing out-of-order execution and instruction-level parallelism.

Data parallelism


A five-stage pipelined superscalar processor, capable of issuing two instructions per cycle. It can have two instructions in each stage of the pipeline, for a total of up to 10 instructions (shown in green) being simultaneously executed.

Data parallelism is parallelism inherent in program loops, which focuses on distributing the data across different computing nodes to be processed in parallel. "Parallelizing loops often leads to similar (not necessarily identical) operation sequences or functions being performed on elements of a large data structure."[21] Many scientific and engineering applications exhibit data parallelism.
A loop-carried dependency is the dependence of a loop iteration on the output of one or more previous iterations. Loop-carried dependencies prevent the parallelization of loops. For example, consider the following pseudocode that computes the first few Fibonacci numbers:

1: PREV1 := 0
2: PREV2 := 1
4: do:
5:    CUR := PREV1 + PREV2
6:    PREV1 := PREV2
7:    PREV2 := CUR
8: while (CUR < 10)

This loop cannot be parallelized because CUR depends on PREV1 and PREV2, which are computed in each loop iteration. Since each iteration depends on the result of the previous one, they cannot be performed in parallel. As the size of a problem gets bigger, the amount of data parallelism available usually does as well.[22]
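By contrast, a loop with no loop-carried dependency can be distributed across workers. A sketch using Python's concurrent.futures thread pool (the function and data here are illustrative; for CPU-bound work a process pool would be used to sidestep the GIL):

```python
from concurrent.futures import ThreadPoolExecutor

def scale(x):
    return 2 * x

data = list(range(8))
# No iteration reads a result produced by another iteration, so, unlike
# the Fibonacci loop, the elements can be processed concurrently.
with ThreadPoolExecutor() as pool:
    result = list(pool.map(scale, data))
print(result)  # [0, 2, 4, 6, 8, 10, 12, 14]
```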

Task parallelism Task parallelism is the characteristic of a parallel program that "entirely different calculations can be performed on either the same or different sets of data".[21] This contrasts with data parallelism, where the same calculation is performed on the same or different sets of data. Task parallelism does not usually scale with the size of a problem.[22]

Hardware
Memory and communication
Main memory in a parallel computer is either shared memory (shared between all processing elements in a single address space), or distributed memory (in which each processing element has its own local address space).[23] Distributed memory refers to the fact that the memory is logically distributed, but often implies that it is physically



distributed as well. Distributed shared memory and memory virtualization combine the two approaches, where the processing element has its own local memory and access to the memory on non-local processors. Accesses to local memory are typically faster than accesses to non-local memory. Computer architectures in which each element of main memory can be accessed with equal latency and bandwidth are known as Uniform Memory Access (UMA) systems. Typically, that can be achieved only by a shared memory system, in which the memory is not physically distributed. A system that does not have this property is known as a Non-Uniform Memory Access (NUMA) architecture. Distributed memory systems have non-uniform memory access.

A logical view of a Non-Uniform Memory Access (NUMA) architecture. Processors in one directory can access that directory's memory with less latency than they can access memory in the other directory's memory.

Computer systems make use of caches—small, fast memories located close to the processor which store temporary copies of memory values (nearby in both the physical and logical sense). Parallel computer systems have difficulties with caches that may store the same value in more than one location, with the possibility of incorrect program execution. These computers require a cache coherency system, which keeps track of cached values and strategically purges them, thus ensuring correct program execution. Bus snooping is one of the most common methods for keeping track of which values are being accessed (and thus should be purged). Designing large, high-performance cache coherence systems is a very difficult problem in computer architecture. As a result, shared-memory computer architectures do not scale as well as distributed memory systems do.[23] Processor–processor and processor–memory communication can be implemented in hardware in several ways, including via shared (either multiported or multiplexed) memory, a crossbar switch, a shared bus or an interconnect network of a myriad of topologies including star, ring, tree, hypercube, fat hypercube (a hypercube with more than one processor at a node), or n-dimensional mesh. Parallel computers based on interconnect networks need to have some kind of routing to enable the passing of messages between nodes that are not directly connected. The medium used for communication between the processors is likely to be hierarchical in large multiprocessor machines.

Classes of parallel computers
Parallel computers can be roughly classified according to the level at which the hardware supports parallelism. This classification is broadly analogous to the distance between basic computing nodes. These are not mutually exclusive; for example, clusters of symmetric multiprocessors are relatively common.
Multicore computing
A multicore processor is a processor that includes multiple execution units ("cores"). These processors differ from superscalar processors, which can issue multiple instructions per cycle from one instruction stream (thread); by contrast, a multicore processor can issue multiple instructions per cycle from multiple instruction streams. Each core in a multicore processor can potentially be superscalar as well—that is, on every cycle, each core can issue multiple instructions from one instruction stream.
Simultaneous multithreading (of which Intel's HyperThreading is the best known) was an early form of pseudo-multicoreism. A processor capable of simultaneous multithreading has only one execution unit ("core"), but


when that execution unit is idling (such as during a cache miss), it uses that execution unit to process a second thread. IBM's Cell microprocessor, designed for use in the Sony PlayStation 3, is another prominent multicore processor.
Symmetric multiprocessing
A symmetric multiprocessor (SMP) is a computer system with multiple identical processors that share memory and connect via a bus.[24] Bus contention prevents bus architectures from scaling. As a result, SMPs generally do not comprise more than 32 processors.[25] "Because of the small size of the processors and the significant reduction in the requirements for bus bandwidth achieved by large caches, such symmetric multiprocessors are extremely cost-effective, provided that a sufficient amount of memory bandwidth exists."[24]
Distributed computing
A distributed computer (also known as a distributed memory multiprocessor) is a distributed memory computer system in which the processing elements are connected by a network. Distributed computers are highly scalable.
Cluster computing
A cluster is a group of loosely coupled computers that work together closely, so that in some respects they can be regarded as a single computer.[26] Clusters are composed of multiple standalone machines connected by a network. While machines in a cluster do not have to be symmetric, load balancing is more difficult if they are not. The most common type of cluster is the Beowulf cluster, which is a cluster implemented on multiple identical commercial off-the-shelf computers connected with a TCP/IP Ethernet local area network.[27] Beowulf technology was originally developed by Thomas Sterling and Donald Becker. The vast majority of the TOP500 supercomputers are clusters.[28]
A Beowulf cluster
Massive parallel processing
A massively parallel processor (MPP) is a single computer with many networked processors. MPPs have many of the same characteristics as clusters, but MPPs have specialized interconnect networks (whereas clusters use commodity hardware for networking). MPPs also tend to be larger than clusters, typically having "far more" than 100 processors.[29] In an MPP, "each CPU contains its own memory and copy of the operating system and application. Each subsystem communicates with the others via a high-speed interconnect."[30]


Blue Gene/L, the fifth fastest supercomputer in the world according to the June 2009 TOP500 ranking, is an MPP.
Grid computing
Grid computing is the most distributed form of parallel computing. It makes use of computers communicating over the Internet to work on a given problem. Because of the low bandwidth and extremely high latency available on the Internet, grid computing typically deals only with embarrassingly parallel problems. Many grid computing applications have been created, of which SETI@home and Folding@Home are the best-known examples.[31]
Most grid computing applications use middleware, software that sits between the operating system and the application to manage network resources and standardize the software interface. The most common grid computing middleware is the Berkeley Open Infrastructure for Network Computing (BOINC). Often, grid computing software makes use of "spare cycles", performing computations at times when a computer is idling.

A cabinet from Blue Gene/L, ranked as the fourth fastest supercomputer in the world according to the 11/2008 TOP500 rankings. Blue Gene/L is a massively parallel processor.

Specialized parallel computers
Within parallel computing, there are specialized parallel devices that remain niche areas of interest. While not domain-specific, they tend to be applicable to only a few classes of parallel problems.
Reconfigurable computing with field-programmable gate arrays
Reconfigurable computing is the use of a field-programmable gate array (FPGA) as a co-processor to a general-purpose computer. An FPGA is, in essence, a computer chip that can rewire itself for a given task. FPGAs can be programmed with hardware description languages such as VHDL or Verilog. However, programming in these languages can be tedious. Several vendors have created C to HDL languages that attempt to emulate the syntax and/or semantics of the C programming language, with which most programmers are familiar. The best known C to HDL languages are Mitrion-C, Impulse C, DIME-C, and Handel-C. Specific subsets of SystemC based on C++ can also be used for this purpose.
AMD's decision to open its HyperTransport technology to third-party vendors has become the enabling technology for high-performance reconfigurable computing.[32] According to Michael R. D'Amour, Chief Operating Officer of DRC Computer Corporation, "when we first walked into AMD, they called us 'the socket stealers.' Now they call us their partners."[32]
General-purpose computing on graphics processing units (GPGPU)
General-purpose computing on graphics processing units (GPGPU) is a fairly recent trend in computer engineering research. GPUs are co-processors that have been heavily optimized for computer graphics processing.[33] Computer graphics processing is a field dominated by data parallel operations, particularly linear algebra matrix operations.

Nvidia's Tesla GPGPU card

In the early days, GPGPU programs used the normal graphics APIs for executing programs. However, several new programming languages and platforms have since been built to do general-purpose computation on GPUs, with Nvidia and AMD releasing the programming environments CUDA and CTM respectively. Other GPU programming languages are BrookGPU, PeakStream, and RapidMind. Nvidia has also released specific products for computation in their Tesla series.

Application-specific integrated circuits


Parallel computing


Several application-specific integrated circuit (ASIC) approaches have been devised for dealing with parallel applications.[34] [35] [36] Because an ASIC is (by definition) specific to a given application, it can be fully optimized for that application. As a result, for a given application, an ASIC tends to outperform a general-purpose computer. However, ASICs are created by X-ray lithography. This process requires a mask, which can be extremely expensive. A single mask can cost over a million US dollars.[37] (The smaller the transistors required for the chip, the more expensive the mask will be.) Meanwhile, performance increases in general-purpose computing over time (as described by Moore's Law) tend to wipe out these gains in only one or two chip generations.[32] The high initial cost, and the tendency to be overtaken by Moore's-law-driven general-purpose computing, have rendered ASICs unfeasible for most parallel computing applications. However, some have been built. One example is the petaflop RIKEN MDGRAPE-3 machine, which uses custom ASICs for molecular dynamics simulation.

Vector processors
A vector processor is a CPU or computer system that can execute the same instruction on large sets of data. "Vector processors have high-level operations that work on linear arrays of numbers or vectors. An example vector operation is A = B × C, where A, B, and C are each 64-element vectors of 64-bit floating-point numbers."[38] They are closely related to Flynn's SIMD classification.[38] Cray computers became famous for their vector-processing computers in the 1970s and 1980s. However, vector processors—both as CPUs and as full computer systems—have generally disappeared. Modern processor instruction sets do include some vector processing instructions, such as AltiVec and Streaming SIMD Extensions (SSE).

The Cray-1 is the most famous vector processor.
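The quoted A = B × C operation can be sketched in plain Python as a contrast between scalar and whole-array styles (an analogy only, with variable names of our choosing; a real vector processor executes the whole-array form as single hardware instructions rather than an interpreted loop):

```python
# Illustrative sketch: the vector operation A = B * C from the quotation
# above, first in scalar style (one multiply per instruction) and then as a
# single logical whole-array operation -- the style that vector hardware
# (or SIMD extensions such as SSE and AltiVec) executes directly.
B = [float(i) for i in range(64)]  # 64-element input vectors
C = [2.0] * 64

# Scalar style: 64 separate multiply operations.
A_scalar = []
for i in range(64):
    A_scalar.append(B[i] * C[i])

# Vector style: one operation expressed over the whole array at once.
A_vector = [b * c for b, c in zip(B, C)]

assert A_scalar == A_vector
```

The point of the contrast is that the elementwise multiplies are independent of one another, which is exactly what lets a vector unit apply one instruction to many data elements.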

Software

Parallel programming languages
Concurrent programming languages, libraries, APIs, and parallel programming models have been created for programming parallel computers. These can generally be divided into classes based on the assumptions they make about the underlying memory architecture—shared memory, distributed memory, or shared distributed memory. Shared memory programming languages communicate by manipulating shared memory variables. Distributed memory uses message passing. POSIX Threads and OpenMP are two of the most widely used shared memory APIs, whereas Message Passing Interface (MPI) is the most widely used message-passing system API.[39] One concept used in parallel programming is the future, where one part of a program promises to deliver a required datum to another part of the program at some future time.
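The future concept can be illustrated with a short sketch using Python's standard concurrent.futures module (Python is used here purely for illustration; the concept itself is language-independent):

```python
# A minimal sketch of the "future" concept: one part of the program promises
# to deliver a datum (the sum below) to another part at some future time,
# while the caller is free to do other work before collecting the result.
from concurrent.futures import ThreadPoolExecutor

def expensive_sum(n):
    return sum(range(n))

with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(expensive_sum, 1_000_000)  # a promise, not yet a value
    # ... the caller could do unrelated work here ...
    result = future.result()  # block only when the datum is actually needed

print(result)  # 499999500000
```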

Automatic parallelization
Automatic parallelization of a sequential program by a compiler is the holy grail of parallel computing. Despite decades of work by compiler researchers, automatic parallelization has had only limited success.[40] Mainstream parallel programming languages remain either explicitly parallel or (at best) partially implicit, in which a programmer gives the compiler directives for parallelization. A few fully implicit parallel programming languages exist—SISAL, Parallel Haskell, and (for FPGAs) Mitrion-C.
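Why automatic parallelization is hard can be illustrated with two small loops (a Python sketch with variable names of our choosing): the first has independent iterations that a parallelizing compiler could distribute across processors, while the second carries a dependency from one iteration to the next.

```python
# Sketch of the compiler's problem: independent vs. dependent loop iterations.
n = 8
a = list(range(n))

b = [0] * n
for i in range(n):        # iterations are independent: parallelizable
    b[i] = a[i] * 2

c = [1] * n
for i in range(1, n):     # c[i] depends on c[i-1]: a loop-carried dependency,
    c[i] = c[i - 1] + a[i]  # so iterations cannot simply run in parallel

print(b)  # [0, 2, 4, 6, 8, 10, 12, 14]
print(c)  # [1, 2, 4, 7, 11, 16, 22, 29]
```

Detecting which loops fall into which class, for arbitrary programs, is the analysis burden that has limited automatic parallelization in practice.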



Application checkpointing
The larger and more complex a computer is, the more that can go wrong and the shorter the mean time between failures. Application checkpointing is a technique whereby the computer system takes a "snapshot" of the application—a record of all current resource allocations and variable states, akin to a core dump; this information can be used to restore the program if the computer should fail. Application checkpointing means that the program has to restart from only its last checkpoint rather than from the beginning. For an application that may run for months, that is critical. Application checkpointing may be used to facilitate process migration.
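The snapshot-and-restart idea can be sketched in a few lines of Python (an illustrative toy; the checkpoint file name and state layout are assumptions of this sketch, not part of any real checkpointing system):

```python
# Toy application checkpointing: periodically serialize the program's state so
# a restarted run resumes from the last snapshot instead of from the beginning.
import os
import pickle

CHECKPOINT = "checkpoint.pkl"  # hypothetical path, assumed for this sketch

def load_state():
    """Resume from the last snapshot if one exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}

def save_state(state):
    with open(CHECKPOINT, "wb") as f:
        pickle.dump(state, f)

state = load_state()
while state["step"] < 10:          # stands in for a long-running computation
    state["total"] += state["step"]
    state["step"] += 1
    if state["step"] % 3 == 0:     # checkpoint every few steps
        save_state(state)

print(state["total"])  # 45 (the sum 0..9), even if the run was interrupted
```

If the process dies mid-run, the next invocation calls load_state() and repeats only the work done since the last save_state(), which is the whole point of checkpointing.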

Applications
As parallel computers become larger and faster, it becomes feasible to solve problems that previously took too long to run. Parallel computing is used in a wide range of fields, from bioinformatics (protein folding) to economics (simulation in mathematical finance). Common types of problems found in parallel computing applications are:[41]
• Dense linear algebra
• Sparse linear algebra
• Spectral methods (such as Cooley–Tukey fast Fourier transform)
• n-body problems (such as Barnes–Hut simulation)
• Structured grid problems (such as Lattice Boltzmann methods)
• Unstructured grid problems (such as found in finite element analysis)
• Monte Carlo simulation
• Combinational logic (such as brute-force cryptographic techniques)
• Graph traversal (such as sorting algorithms)
• Dynamic programming
• Branch and bound methods
• Graphical models (such as detecting hidden Markov models and constructing Bayesian networks)
• Finite-state machine simulation
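Monte Carlo simulation from the list above is also a textbook example of an embarrassingly parallel workload. The Python sketch below (illustrative only; function and variable names are ours) estimates π by splitting the samples into independent chunks whose partial counts are simply added at the end:

```python
# Why Monte Carlo parallelizes trivially: each chunk of samples is computed
# independently with its own random seed, and the partial hit counts are
# combined only at the end. The chunks run sequentially here, but each
# count_hits() call could be shipped to a separate processor with no
# communication until the final sum.
import random

def count_hits(seed, n):
    """Count random points in the unit square that land inside the circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

chunks = [(seed, 25_000) for seed in range(4)]   # 4 independent work units
total_hits = sum(count_hits(seed, n) for seed, n in chunks)
pi_estimate = 4.0 * total_hits / 100_000
print(pi_estimate)  # close to 3.14
```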

History
ILLIAC IV, "perhaps the most infamous of Supercomputers"
The origins of true (MIMD) parallelism go back to Federico Luigi, Conte Menabrea and his "Sketch of the Analytic Engine Invented by Charles Babbage".[42] [43] IBM introduced the 704 in 1954, through a project in which Gene Amdahl was one of the principal architects. It became the first commercially available computer to use fully automatic floating point arithmetic commands.[44] In April 1958, S. Gill (Ferranti) discussed parallel programming and the need for branching and waiting.[45] Also in 1958, IBM researchers John Cocke and Daniel Slotnick discussed the use of parallelism in numerical calculations for the first time.[46] Burroughs Corporation introduced the D825 in 1962, a four-processor computer that accessed up to 16 memory modules through a crossbar switch.[47] In 1967, Amdahl and Slotnick published a debate about the feasibility of parallel processing at the American Federation of Information Processing Societies Conference.[46] It was during this debate that Amdahl's Law was coined to define the limit of speed-up due to parallelism. In 1969, US company Honeywell introduced its first Multics system, a symmetric multiprocessor system capable of running up to eight processors in parallel.[46] C.mmp, a 1970s multi-processor project at Carnegie Mellon University,



was "among the first multiprocessors with more than a few processors".[43] "The first bus-connected multi-processor with snooping caches was the Synapse N+1 in 1984."[43]
SIMD parallel computers can be traced back to the 1970s. The motivation behind early SIMD computers was to amortize the gate delay of the processor's control unit over multiple instructions.[48] In 1964, Slotnick had proposed building a massively parallel computer for the Lawrence Livermore National Laboratory.[46] His design was funded by the US Air Force, which was the earliest SIMD parallel-computing effort, ILLIAC IV.[46] The key to its design was a fairly high parallelism, with up to 256 processors, which allowed the machine to work on large datasets in what would later be known as vector processing. However, ILLIAC IV was called "the most infamous of Supercomputers", because the project was only one fourth completed, but took 11 years and cost almost four times the original estimate.[49] When it was finally ready to run its first real application in 1976, it was outperformed by existing commercial supercomputers such as the Cray-1.
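Amdahl's Law, mentioned in the history above as the limit of speed-up due to parallelism, can be made concrete with a small worked example (a sketch; the function name is ours, not from the source). If a fraction p of a program can be parallelized across s processors, the overall speed-up is 1 / ((1 − p) + p / s):

```python
# Worked example of Amdahl's Law: the serial fraction (1 - p) bounds the
# achievable speed-up no matter how many processors are added.
def amdahl_speedup(p, s):
    """Speed-up for parallel fraction p on s processors."""
    return 1.0 / ((1.0 - p) + p / s)

# With 90% of the work parallelizable, 16 processors give only about 6.4x,
# and even a billion processors cannot beat the 1 / (1 - p) = 10x ceiling.
print(round(amdahl_speedup(0.9, 16), 2))     # 6.4
print(round(amdahl_speedup(0.9, 10**9), 2))  # 10.0
```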

See also
• List of important publications in concurrent, parallel, and distributed computing
• List of distributed computing conferences

External links
• Parallel computing [50] at the Open Directory Project
• Lawrence Livermore National Laboratory: Introduction to Parallel Computing [51]
• Designing and Building Parallel Programs, by Ian Foster [52]
• Internet Parallel Computing Archive [53]
• Parallel processing topic area at IEEE Distributed Computing Online [54]
• Parallel Computing Works, a free online book [55]
• Frontiers of Supercomputing, a free online book covering topics like algorithms and industrial applications [56]
• Universal Parallel Computing Research Center [57]
• Course in Parallel Programming at Columbia University (in collaboration with the IBM T.J. Watson X10 project) [58]

References
[1] Almasi, G.S. and A. Gottlieb (1989). Highly Parallel Computing (http:/ / portal. acm. org/ citation. cfm?id=1011116. 1011127). Benjamin-Cummings publishers, Redwood City, CA.
[2] S.V. Adve et al. (November 2008). "Parallel Computing Research at Illinois: The UPCRC Agenda" (http:/ / www. upcrc. illinois. edu/ documents/ UPCRC_Whitepaper. pdf) (PDF). Parallel@Illinois, University of Illinois at Urbana-Champaign. "The main techniques for these performance benefits – increased clock frequency and smarter but increasingly complex architectures – are now hitting the so-called power wall. The computer industry has accepted that future performance increases must largely come from increasing the number of processors (or cores) on a die, rather than making a single core go faster."
[3] Asanovic et al. Old [conventional wisdom]: Power is free, but transistors are expensive. New [conventional wisdom] is [that] power is expensive, but transistors are "free".
[4] Asanovic, Krste et al. (December 18, 2006). "The Landscape of Parallel Computing Research: A View from Berkeley" (http:/ / www. eecs. berkeley. edu/ Pubs/ TechRpts/ 2006/ EECS-2006-183. pdf) (PDF). University of California, Berkeley. Technical Report No. UCB/EECS-2006-183. "Old [conventional wisdom]: Increasing clock frequency is the primary method of improving processor performance. New [conventional wisdom]: Increasing parallelism is the primary method of improving processor performance ... Even representatives from Intel, a company generally associated with the 'higher clock-speed is better' position, warned that traditional approaches to maximizing performance through maximizing clock speed have been pushed to their limit."
[5] Patterson, David A. and John L. Hennessy (1998). Computer Organization and Design, Second Edition, Morgan Kaufmann Publishers, p. 715. ISBN 1558604286.
[6] Barney, Blaise. "Introduction to Parallel Computing" (http:/ / www. llnl. gov/ computing/ tutorials/ parallel_comp/ ). Lawrence Livermore National Laboratory. Retrieved 2007-11-09.
[7] Hennessy, John L. and David A. Patterson (2002). Computer Architecture: A Quantitative Approach. 3rd edition, Morgan Kaufmann, p. 43. ISBN 1558607242.
[8] Rabaey, J. M. (1996). Digital Integrated Circuits. Prentice Hall, p. 235. ISBN 0131786091.



[9] Flynn, Laurie J. "Intel Halts Development of 2 New Microprocessors" (http:/ / www. nytimes. com/ 2004/ 05/ 08/ business/ 08chip. html?ex=1399348800& en=98cc44ca97b1a562& ei=5007). The New York Times, May 8, 2004. Retrieved on April 22, 2008.
[10] Moore, Gordon E. (1965). "Cramming more components onto integrated circuits" (ftp:/ / download. intel. com/ museum/ Moores_Law/ Articles-Press_Releases/ Gordon_Moore_1965_Article. pdf) (PDF). Electronics Magazine. pp. 4. Retrieved 2006-11-11.
[11] Amdahl, G. (April 1967). "The validity of the single processor approach to achieving large-scale computing capabilities". In Proceedings of AFIPS Spring Joint Computer Conference, Atlantic City, N.J., AFIPS Press, pp. 483–85.
[12] Brooks, Frederick P. Jr. The Mythical Man-Month: Essays on Software Engineering. Chapter 2 – The Mythical Man Month. ISBN 0201835959.
[13] Reevaluating Amdahl's Law (http:/ / www. scl. ameslab. gov/ Publications/ Gus/ AmdahlsLaw/ Amdahls. html) (1988). Communications of the ACM 31(5), pp. 532–33.
[14] Bernstein, A. J. (October 1966). "Program Analysis for Parallel Processing", IEEE Trans. on Electronic Computers, EC-15, pp. 757–62.
[15] Roosta, Seyed H. (2000). Parallel Processing and Parallel Algorithms: Theory and Computation. Springer, p. 114. ISBN 0387987169.
[16] Lamport, Leslie (September 1979). "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", IEEE Transactions on Computers, C-28,9, pp. 690–91.
[17] Patterson and Hennessy, p. 748.
[18] Culler, David E.; Jaswinder Pal Singh and Anoop Gupta (1999). Parallel Computer Architecture – A Hardware/Software Approach. Morgan Kaufmann Publishers, p. 15. ISBN 1558603433.
[19] Culler et al., p. 15.
[20] Patt, Yale (April 2004). "The Microprocessor Ten Years From Now: What Are The Challenges, How Do We Meet Them?" (http:/ / users. ece. utexas. edu/ ~patt/ Videos/ talk_videos/ cmu_04-29-04. wmv) (wmv). Distinguished Lecturer talk at Carnegie Mellon University. Retrieved on November 7, 2007.
[21] Culler et al., p. 124.
[22] Culler et al., p. 125.
[23] Patterson and Hennessy, p. 713.
[24] Hennessy and Patterson, p. 549.
[25] Patterson and Hennessy, p. 714.
[26] What is clustering? (http:/ / www. webopedia. com/ TERM/ c/ clustering. html) Webopedia computer dictionary. Retrieved on November 7, 2007.
[27] Beowulf definition. (http:/ / www. pcmag. com/ encyclopedia_term/ 0,2542,t=Beowulf& i=38548,00. asp) PC Magazine. Retrieved on November 7, 2007.
[28] Architecture share for 06/2007 (http:/ / www. top500. org/ stats/ list/ 29/ archtype). TOP500 Supercomputing Sites. Clusters make up 74.60% of the machines on the list. Retrieved on November 7, 2007.
[29] Hennessy and Patterson, p. 537.
[30] MPP Definition. (http:/ / www. pcmag. com/ encyclopedia_term/ 0,,t=mpp& i=47310,00. asp) PC Magazine. Retrieved on November 7, 2007.
[31] Kirkpatrick, Scott (January 31, 2003). "Computer Science: Rough Times Ahead". Science, Vol. 299, No. 5607, pp. 668–669. DOI: 10.1126/science.1081623.
[32] D'Amour, Michael R., Chief Operating Officer, DRC Computer Corporation. "Standard Reconfigurable Computing". Invited speaker at the University of Delaware, February 28, 2007.
[33] Boggan, Sha'Kia and Daniel M. Pressel (August 2007). GPUs: An Emerging Platform for General-Purpose Computation (http:/ / www. arl. army. mil/ arlreports/ 2007/ ARL-SR-154. pdf) (PDF). ARL-SR-154, U.S. Army Research Lab. Retrieved on November 7, 2007.
[34] Maslennikov, Oleg (2002). "Systematic Generation of Executing Programs for Processor Elements in Parallel ASIC or FPGA-Based Systems and Their Transformation into VHDL-Descriptions of Processor Element Control Units" (http:/ / www. springerlink. com/ content/ jjrdrb0lelyeu3e9/ ). Lecture Notes in Computer Science, 2328/2002: p. 272.
[35] Shimokawa, Y.; Y. Fuwa and N. Aramaki (18–21 November 1991). A parallel ASIC VLSI neurocomputer for a large number of neurons and billion connections per second speed (http:/ / ieeexplore. ieee. org/ xpl/ freeabs_all. jsp?arnumber=170708). IEEE International Joint Conference on Neural Networks. 3: pp. 2162–67.
[36] Acken, K.P.; M.J. Irwin, R.M. Owens (July 1998). "A Parallel ASIC Architecture for Efficient Fractal Image Coding" (http:/ / www. ingentaconnect. com/ content/ klu/ vlsi/ 1998/ 00000019/ 00000002/ 00167697?crawler=true). The Journal of VLSI Signal Processing, 19(2): 97–113.
[37] Kahng, Andrew B. (June 21, 2004). "Scoping the Problem of DFM in the Semiconductor Industry" (http:/ / www. future-fab. com/ documents. asp?grID=353& d_ID=2596). University of California, San Diego. "Future design for manufacturing (DFM) technology must reduce design [non-recoverable expenditure] cost and directly address manufacturing [non-recoverable expenditures] – the cost of a mask set and probe card – which is well over $1 million at the 90 nm technology node and creates a significant damper on semiconductor-based innovation."
[38] Patterson and Hennessy, p. 751.
[39] The Sidney Fernbach Award given to MPI inventor Bill Gropp (http:/ / awards. computer. org/ ana/ award/ viewPastRecipients. action?id=16) refers to MPI as "the dominant HPC communications interface".



[40] Shen, John Paul and Mikko H. Lipasti (2005). Modern Processor Design: Fundamentals of Superscalar Processors. McGraw-Hill Professional, p. 561. ISBN 0070570647. "However, the holy grail of such research - automated parallelization of serial programs - has yet to materialize. While automated parallelization of certain classes of algorithms has been demonstrated, such success has largely been limited to scientific and numeric applications with predictable flow control (e.g., nested loop structures with statically determined iteration counts) and statically analyzable memory access patterns (e.g., walks over large multidimensional arrays of float-point data)."
[41] Asanovic, Krste, et al. (December 18, 2006). The Landscape of Parallel Computing Research: A View from Berkeley (http:/ / www. eecs. berkeley. edu/ Pubs/ TechRpts/ 2006/ EECS-2006-183. pdf) (PDF). University of California, Berkeley. Technical Report No. UCB/EECS-2006-183. See table on pages 17–19.
[42] Menabrea, L. F. (1842). Sketch of the Analytic Engine Invented by Charles Babbage (http:/ / www. fourmilab. ch/ babbage/ sketch. html). Bibliothèque Universelle de Genève. Retrieved on November 7, 2007.
[43] Patterson and Hennessy, p. 753.
[44] da Cruz, Frank (2003). "Columbia University Computing History: The IBM 704" (http:/ / www. columbia. edu/ acis/ history/ 704. html). Columbia University. Retrieved 2008-01-08.
[45] Gill, S. (April 1958). "Parallel Programming". The Computer Journal, Vol. 1, No. 1, pp. 2–10, British Computer Society.
[46] Wilson, Gregory V (1994). "The History of the Development of Parallel Computing" (http:/ / ei. cs. vt. edu/ ~history/ Parallel. html). Virginia Tech/Norfolk State University, Interactive Learning with a Digital Library in Computer Science. Retrieved 2008-01-08.
[47] Anthes, Gry (November 19, 2001). "The Power of Parallelism" (http:/ / www. computerworld. com/ action/ article. do?command=viewArticleBasic& articleId=65878). Computerworld. Retrieved 2008-01-08.
[48] Patterson and Hennessy, p. 749.
[49] Patterson and Hennessy, pp. 749–50: "Although successful in pushing several technologies useful in later projects, the ILLIAC IV failed as a computer. Costs escalated from the $8 million estimated in 1966 to $31 million by 1972, despite the construction of only a quarter of the planned machine ... It was perhaps the most infamous of supercomputers. The project started in 1965 and ran its first real application in 1976."
[50] http:/ / www. dmoz. org/ Computers/ Parallel_Computing/ /
[51] http:/ / www. llnl. gov/ computing/ tutorials/ parallel_comp/
[52] http:/ / www-unix. mcs. anl. gov/ dbpp/
[53] http:/ / wotug. ukc. ac. uk/ parallel/
[54] http:/ / dsonline. computer. org/ portal/ site/ dsonline/ index. jsp?pageID=dso_level1_home& path=dsonline/ topics/ parallel& file=index. xml& xsl=generic. xsl
[55] http:/ / www. new-npac. org/ projects/ cdroms/ cewes-1998-05/ copywrite/ pcw/ book. html
[56] http:/ / ark. cdlib. org/ ark:/ 13030/ ft0f59n73z/
[57] http:/ / www. upcrc. illinois. edu/
[58] http:/ / ppppcourse. ning. com/




Web Programming Languages

HTML

Filename extension: .html, .htm
Internet media type: text/html
Type code: TEXT
Uniform Type Identifier: public.html
Developed by: World Wide Web Consortium & WHATWG
Type of format: Markup language
Extended from: SGML
Extended to: XHTML
Standard(s): ISO/IEC 15445 [1], W3C HTML 4.01 [2], W3C HTML 5 (draft)

HTML, which stands for HyperText Markup Language, is the predominant markup language for web pages. It provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, and lists, as well as for links, quotes, and other items. It allows images and objects to be embedded and can be used to create interactive forms. It is written in the form of HTML elements consisting of "tags" surrounded by angle brackets within the web page content. It can include or can load scripts in languages such as JavaScript, which affect the behavior of HTML processors like Web browsers, and Cascading Style Sheets (CSS) to define the appearance and layout of text and other material. The W3C, maintainer of both the HTML and CSS standards, encourages the use of CSS over explicit presentational markup.[3]

History

Origins
In 1980, physicist Tim Berners-Lee, who was a contractor at CERN, proposed and prototyped ENQUIRE, a system for CERN researchers to use and share documents. In 1989, Berners-Lee wrote a memo proposing an Internet-based hypertext system.[4] Berners-Lee specified HTML and wrote the browser and server software in the last part of 1990. In that year, Berners-Lee and CERN data systems engineer Robert Cailliau collaborated on a joint request for funding, but the project was not formally adopted by CERN. In his personal notes[5] from 1990, he lists[6] "some of the many areas in which hypertext is used", and puts an encyclopedia first.

Tim Berners-Lee



First specifications
The first publicly available description of HTML was a document called HTML Tags, first mentioned on the Internet by Berners-Lee in late 1991.[7] [8] It describes 20 elements comprising the initial, relatively simple design of HTML. Except for the hyperlink tag, these were strongly influenced by SGMLguid, an in-house SGML-based documentation format at CERN. Thirteen of these elements still exist in HTML 4.[9]
HTML is a text and image formatting language used by web browsers to dynamically format web pages. Many of the text elements are found in the 1988 ISO technical report TR 9537 Techniques for using SGML, which in turn covers the features of early text formatting languages such as that used by the RUNOFF command developed in the early 1960s for the CTSS (Compatible Time-Sharing System) operating system: these formatting commands were derived from the commands used by typesetters to manually format documents. However, the SGML concept of generalized markup is based on elements (nested annotated ranges with attributes) rather than merely point effects, and also on the separation of structure and processing; HTML has been progressively moved in this direction with CSS.
Berners-Lee considered HTML to be an application of SGML, and it was formally defined as such by the Internet Engineering Task Force (IETF) with the mid-1993 publication of the first proposal for an HTML specification: "Hypertext Markup Language (HTML)" Internet-Draft [10] by Berners-Lee and Dan Connolly, which included an SGML Document Type Definition to define the grammar.[11] The draft expired after six months, but was notable for its acknowledgment of the NCSA Mosaic browser's custom tag for embedding in-line images, reflecting the IETF's philosophy of basing standards on successful prototypes.[12] Similarly, Dave Raggett's competing Internet-Draft, "HTML+ (Hypertext Markup Format)", from late 1993, suggested standardizing already-implemented features like tables and fill-out forms.[13] After the HTML and HTML+ drafts expired in early 1994, the IETF created an HTML Working Group, which in 1995 completed "HTML 2.0", the first HTML specification intended to be treated as a standard against which future implementations should be based.[12] Published as Request for Comments 1866, HTML 2.0 included ideas from the HTML and HTML+ drafts.[14] The 2.0 designation was intended to distinguish the new edition from previous drafts.[15] Further development under the auspices of the IETF was stalled by competing interests. Since 1996, the HTML specifications have been maintained, with input from commercial software vendors, by the World Wide Web Consortium (W3C).[16] However, in 2000, HTML also became an international standard (ISO/IEC 15445:2000). The last HTML specification published by the W3C is the HTML 4.01 Recommendation, published in late 1999. Its issues and errors were last acknowledged by errata published in 2001.

Version history of the standard





HTML version timeline

November 24, 1995
HTML 2.0 was published as IETF RFC 1866. Supplemental RFCs added capabilities:
• November 25, 1995: RFC 1867 (form-based file upload)
• May 1996: RFC 1942 (tables)
• August 1996: RFC 1980 (client-side image maps)
• January 1997: RFC 2070 (internationalization)
In June 2000, all of these were declared obsolete/historic by RFC 2854.

January 1997
HTML 3.2[17] was published as a W3C Recommendation. It was the first version developed and standardized exclusively by the W3C, as the IETF had closed its HTML Working Group in September 1996.[18] HTML 3.2 dropped math formulas entirely, reconciled overlap among various proprietary extensions, and adopted most of Netscape's visual markup tags. Netscape's blink element and Microsoft's marquee element were omitted due to a mutual agreement between the two companies.[16] A markup for mathematical formulas



similar to that in HTML wasn't standardized until 14 months later in MathML.

December 1997
HTML 4.0[19] was published as a W3C Recommendation. It offers three variations:
• Strict, in which deprecated elements are forbidden,
• Transitional, in which deprecated elements are allowed,
• Frameset, in which mostly only frame-related elements are allowed.
Initially code-named "Cougar",[20] HTML 4.0 adopted many browser-specific element types and attributes, but at the same time sought to phase out Netscape's visual markup features by marking them as deprecated in favor of style sheets. HTML 4 is an SGML application conforming to ISO 8879 - SGML.[21]

April 1998
HTML 4.0[22] was reissued with minor edits without incrementing the version number.

December 1999
HTML 4.01[23] was published as a W3C Recommendation. It offers the same three variations as HTML 4.0, and its last errata [24] were published May 12, 2001.

May 2000
ISO/IEC 15445:2000[25] [26] ("ISO HTML", based on HTML 4.01 Strict) was published as an ISO/IEC international standard. In the ISO this standard falls in the domain of ISO/IEC JTC1/SC34 (ISO/IEC Joint Technical Committee 1, Subcommittee 34 - Document description and processing languages).[25]

As of mid-2008, HTML 4.01 and ISO/IEC 15445:2000 are the most recent versions of HTML. Development of the parallel, XML-based language XHTML occupied the W3C's HTML Working Group through the early and mid-2000s.

HTML draft version timeline

October 1991
HTML Tags,[7] an informal CERN document listing twelve HTML tags, was first mentioned in public.

July 1992
First informal draft of the HTML DTD,[27] with six subsequent revisions.

November 1992
HTML DTD 1.1 (the first with a version number, based on RCS revisions, which start with 1.1 rather than 1.0), an informal draft.

June 1993
Hypertext Markup Language[28] was published by the IETF IIIR Working Group as an Internet-Draft (a rough proposal for a standard).
It was replaced by a second version [10] one month later, followed by six further drafts published by the IETF itself [29] that finally led to HTML 2.0 in RFC 1866.

November 1993
HTML+ was published by the IETF as an Internet-Draft and was a competing proposal to the Hypertext Markup Language draft. It expired in May 1994.

April 1995 (authored March 1995)
HTML 3.0[30] was proposed as a standard to the IETF, but the proposal expired five months later without further action. It included many of the capabilities that were in Raggett's HTML+ proposal, such as support for tables, text flow around figures, and the display of complex mathematical formulas.[31] W3C began development of its own Arena browser for testing support for HTML 3 and Cascading Style Sheets, but HTML 3.0 did not succeed for several reasons. The draft was considered very large at 150 pages



and the pace of browser development, as well as the number of interested parties, had outstripped the resources of the IETF.[16] Browser vendors, including Microsoft and Netscape at the time, chose to implement different subsets of HTML 3's draft features as well as to introduce their own extensions to it.[16] (See Browser wars.) These included extensions to control stylistic aspects of documents, contrary to the "belief [of the academic engineering community] that such things as text color, background texture, font size and font face were definitely outside the scope of a language when their only intent was to specify how a document would be organized."[16] Dave Raggett, who has been a W3C Fellow for many years, has commented for example, "To a certain extent, Microsoft built its business on the Web by extending HTML features."[16]

January 2008
HTML 5[32] was published as a Working Draft by the W3C. Although its syntax closely resembles that of SGML, HTML 5 has abandoned any attempt to be an SGML application, and has explicitly defined its own "html" serialization, in addition to an alternative XML-based XHTML 5 serialization.[33]

XHTML versions
XHTML is a separate language that began as a reformulation of HTML 4.01 using XML 1.0. It continues to be developed:
• XHTML 1.0,[34] published January 26, 2000 as a W3C Recommendation, later revised and republished August 1, 2002. It offers the same three variations as HTML 4.0 and 4.01, reformulated in XML, with minor restrictions.
• XHTML 1.1,[35] published May 31, 2001 as a W3C Recommendation. It is based on XHTML 1.0 Strict, but includes minor changes, can be customized, and is reformulated using modules from Modularization of XHTML [36], which was published April 10, 2001 as a W3C Recommendation.
• XHTML 2.0.[37] There is no XHTML 2.0 standard. XHTML 2.0 is incompatible with XHTML 1.x and, therefore, would be more accurately characterized as an XHTML-inspired new language than an update to XHTML 1.x.
• XHTML 5, which is an update to XHTML 1.x, is being defined alongside HTML 5 in the HTML 5 draft.[38]

Markup
HTML markup consists of several key components, including elements (and their attributes), character-based data types, character references, and entity references. Another important component is the document type declaration, which specifies the Document Type Definition. As of HTML 5, no Document Type Definition needs to be specified; the declaration determines only the layout mode [39]. The Hello world program, a common computer program employed for comparing programming languages, scripting languages, and markup languages, takes nine lines of code in HTML, although newlines are optional:

<!doctype html>
<html>
 <head>
  <title>Hello HTML</title>
 </head>
 <body>
  <p>Hello World!</p>
 </body>
</html>

This Document Type Declaration is for HTML 5. If the <!doctype html> declaration is not included, most browsers will render using "quirks mode."[40]



Elements
HTML documents are composed entirely of HTML elements that, in their most general form, have three components: a pair of element tags, a "start tag" and an "end tag"; some element attributes given to the element within the tags; and finally, the actual textual and graphical information content that will be rendered on the display. An HTML element is everything between and including the tags. A tag is a keyword enclosed in angle brackets. A common form of an HTML element is:

<tag>content to be rendered</tag>

The name of the HTML element is also the name of the tag. Note that the end tag's name starts with a slash character, "/". The most general form of an HTML element is:

<tag attribute1="value1" attribute2="value2">content to be rendered</tag>

If attributes are not assigned, most start tags fall back to their default attribute values. There are some basic types of tags:

Head of the HTML document: <head>...</head>. Usually the title should be included in the head, for example:

<head>
 <title>The title</title>
</head>

Headings:

<h1>Heading1</h1>
<h2>Heading2</h2>
<h3>Heading3</h3>
<h4>Heading4</h4>
<h5>Heading5</h5>
<h6>Heading6</h6>

Paragraph partition:

<p>Paragraph 1</p>

<p>Paragraph 2</p>

Newline: <br>. The difference between <br> and <p> is that br breaks a line without altering the semantic structure of the page, whereas p sections the page into paragraphs. Here is an example:

<p>This <br> is a paragraph <br> with <br> line breaks</p>

Annotation (comment): <!--..Explain!..-->. Annotations can help in understanding the markup and are not displayed in the webpage.

There are several types of markup elements used in HTML.

• Structural markup describes the purpose of text. For example, <h2>Golf</h2> establishes "Golf" as a second-level heading, which would be rendered in a browser in a manner similar to the "HTML markup" title at the start of this section. Structural markup does not denote any specific rendering, but most Web browsers have


standardized default styles for element formatting. Text may be further styled with Cascading Style Sheets (CSS).

• Presentational markup describes the appearance of the text, regardless of its function. For example, <b>boldface</b> indicates that visual output devices should render "boldface" in bold text, but gives no indication what devices unable to do this (such as aural devices that read the text aloud) should do. In the case of both <b>bold</b> and <i>italic</i>, there are elements that usually have an equivalent visual rendering but are more semantic in nature, namely <strong>strong emphasis</strong> and <em>emphasis</em> respectively. It is easier to see how an aural user agent should interpret the latter two elements. However, they are not equivalent to their presentational counterparts: it would be undesirable for a screen reader to emphasize the name of a book, for instance, but on a screen such a name would be italicized. Most presentational markup elements have become deprecated under the HTML 4.0 specification, in favor of CSS-based style design.

• Hypertext markup makes parts of a document into links to other documents. HTML up through version XHTML 1.1 requires the use of an anchor element to create a hyperlink in the flow of text: <a>Wikipedia</a>. In addition, the href attribute must be set to a valid URL. For example, the HTML markup <a href="http://en.wikipedia.org/">Wikipedia</a> will render the word "Wikipedia[41]" as a hyperlink. An example of rendering an image as a hyperlink is: <a href="http://example.org"><img src="image.gif" alt="alternative text" width="50" height="50"></a>.

Attributes

Most of the attributes of an element are name-value pairs, separated by "=" and written within the start tag of an element, after the element's name.
The value may be enclosed in single or double quotes, although values consisting of certain characters can be left unquoted in HTML (but not XHTML).[42][43] Leaving attribute values unquoted is considered unsafe.[44] In contrast with name-value pair attributes, some attributes affect the element simply by their presence in the start tag of the element[7] (like the ismap attribute for the img element[45]).

Most elements can take any of several common attributes:

• The id attribute provides a document-wide unique identifier for an element. This can be used by stylesheets to provide presentational properties, by browsers to focus attention on the specific element, or by scripts to alter the contents or presentation of an element. Appended to the URL of the page, it provides a globally unique identifier for an element, typically a sub-section of the page. For example, the ID "Attributes" in http://en.wikipedia.org/wiki/HTML#Attributes.

• The class attribute provides a way of classifying similar elements. This can be used for semantic or presentational purposes. Semantically, for example, classes are used in microformats. Presentationally, for example, an HTML document might use the designation class="notation" to indicate that all elements with this class value are subordinate to the main text of the document. Such elements might be gathered together and presented as footnotes on a page instead of appearing in the place where they occur in the HTML source.

• An author may use the style attribute to assign presentational properties to a particular element. It is considered better practice to use an element's id or class attributes to select the element with a stylesheet, though sometimes this can be too cumbersome for a simple, specific, or ad hoc application of styled properties.

• The title attribute is used to attach a subtextual explanation to an element. In most browsers this attribute is displayed as what is often referred to as a tooltip.
The abbreviation element, abbr, can be used to demonstrate these various attributes: <abbr id="anId" class="aClass" style="color:blue;" title="Hypertext Markup Language">HTML</abbr> This example displays as HTML; in most browsers, pointing the cursor at the abbreviation should display the title text "Hypertext Markup Language." Most elements also take the language-related attributes lang and dir.
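As a small illustration of those language-related attributes, a French-language paragraph might be marked up as follows (the sentence itself is an invented example):

```html
<!-- lang names the content language; dir gives its base text direction -->
<p lang="fr" dir="ltr">Bonjour tout le monde !</p>
```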


Character and entity references

As of version 4.0, HTML defines a set of 252 character entity references and a set of 1,114,050 numeric character references, both of which allow individual characters to be written via simple markup rather than literally. A literal character and its markup counterpart are considered equivalent and are rendered identically.

The ability to "escape" characters in this way allows for the characters < and & (when written as &lt; and &amp;, respectively) to be interpreted as character data rather than markup. For example, a literal < normally indicates the start of a tag, and & normally indicates the start of a character entity reference or numeric character reference; writing it as &amp;, &#x26;, or &#38; allows & to be included in the content of elements or the values of attributes. The double-quote character ("), when used to quote an attribute value, must also be escaped as &quot;, &#x22;, or &#34; when it appears within the attribute value itself. The single-quote character ('), when used to quote an attribute value, must also be escaped as &#x27; or &#39; (it should NOT be escaped as &apos; except in XHTML documents) when it appears within the attribute value itself. However, since document authors often overlook the need to escape these characters, browsers tend to be very forgiving, treating them as markup only when subsequent text appears to confirm that intent.

Escaping also allows for characters that are not easily typed, or that are not even available in the document's character encoding, to be represented within element and attribute content. For example, the acute-accented e (é), a character typically found only on Western European keyboards, can be written in any HTML document as the entity reference &eacute; or as the numeric references &#233; or &#xE9;.
The characters comprising those references (that is, the &, the ;, the letters in eacute, and so on) are available on all keyboards and are supported in all character encodings, whereas the literal é is not.
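The escaping rules described above can be combined in a single fragment (an invented example):

```html
<!-- & and < are escaped in element content; " is escaped inside a
     double-quoted attribute value; é is written as an entity reference -->
<p title="Say &quot;hello&quot;">Ben &amp; Jerry&#39;s caf&eacute; &lt; 3 stars</p>
```

A browser renders the paragraph text as: Ben & Jerry's café < 3 stars.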

Data types

HTML defines several data types for element content, such as script data and stylesheet data, and a plethora of types for attribute values, including IDs, names, URIs, numbers, units of length, languages, media descriptors, colors, character encodings, dates and times, and so on. All of these data types are specializations of character data.

Document type declaration

HTML documents are required to start with a Document Type Declaration (informally, a "doctype"). In browsers, the function of the doctype is to indicate the rendering mode, particularly to avoid quirks mode.

The original purpose of the doctype was to enable parsing and validation of HTML documents by SGML tools based on the Document Type Definition (DTD). The DTD to which the DOCTYPE refers contains a machine-readable grammar specifying the permitted and prohibited content for a document conforming to that DTD. Browsers, on the other hand, do not implement HTML as an application of SGML and consequently do not read the DTD. HTML 5 does not define a DTD, because of the technology's inherent limitations, so in HTML 5 the doctype declaration, <!doctype html>, does not refer to a DTD.

An example of an HTML 4 doctype is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

This declaration references the DTD for the Strict version of HTML 4.01, which does not include presentational elements like font, leaving formatting to Cascading Style Sheets and the span and div elements. SGML-based validators read the DTD in order to properly parse the document and to perform validation. In modern browsers, this doctype activates standards mode as opposed to quirks mode. In addition, HTML 4.01 provides Transitional and Frameset DTDs, as explained below.
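The Transitional and Frameset doctypes mentioned above take the same form as the Strict doctype, differing only in the public identifier and the DTD URL:

```html
<!-- HTML 4.01 Transitional -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">

<!-- HTML 4.01 Frameset -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
    "http://www.w3.org/TR/html4/frameset.dtd">
```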


Semantic HTML

Semantic HTML is a way of writing HTML that emphasizes the meaning of the encoded information over its presentation (look). HTML has included semantic markup from its inception,[46] but has also included presentational markup such as the <font>, <i> and <center> tags. There are also the semantically neutral span and div tags. Since the late 1990s, when Cascading Style Sheets began to work in most browsers, web authors have been encouraged to avoid the use of presentational HTML markup with a view to the separation of presentation and content.[47]

In a 2001 discussion of the Semantic Web, Tim Berners-Lee and others gave examples of ways in which intelligent software 'agents' may one day automatically trawl the Web and find, filter and correlate previously unrelated, published facts for the benefit of human users.[48] Such agents are not commonplace even now, but some of the ideas of Web 2.0, mashups and price comparison websites may be coming close. The main difference between these web application hybrids and Berners-Lee's semantic agents lies in the fact that the current aggregation and hybridisation of information is usually designed in by web developers, who already know the web locations and the API semantics of the specific data they wish to mash, compare and combine.

An important type of web agent that does trawl and read web pages automatically, without prior knowledge of what it might find, is the Web crawler or search-engine spider. These software agents are dependent on the semantic clarity of the web pages they find, as they use various techniques and algorithms to read and index millions of web pages a day and provide web users with search facilities without which the World Wide Web would be only a fraction of its current usefulness.
In order for search-engine spiders to be able to rate the significance of pieces of text they find in HTML documents, and also for those creating mashups and other hybrids, as well as for more automated agents as they are developed, the semantic structures that exist in HTML need to be widely and uniformly applied to bring out the meaning of published text.[49] Presentational markup tags are deprecated in current HTML and XHTML recommendations and are illegal in HTML 5.

Good semantic HTML also improves the accessibility of web documents (see also Web Content Accessibility Guidelines). For example, when a screen reader or audio browser can correctly ascertain the structure of a correctly marked-up document, it will not waste the visually impaired user's time by reading out repeated or irrelevant information.
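The contrast between presentational and semantic markup can be sketched with a contrived fragment; the first element says only how the text should look, while the second says what the text is:

```html
<!-- presentational (deprecated): appearance only -->
<font size="5"><b>Chapter One</b></font>

<!-- semantic: a second-level heading, stylable with CSS -->
<h2>Chapter One</h2>
```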

Delivery

HTML documents can be delivered by the same means as any other computer file; however, they are most often delivered either by HTTP from a Web server or by e-mail.

HTTP

The World Wide Web is composed primarily of HTML documents transmitted from Web servers to Web browsers using the Hypertext Transfer Protocol (HTTP). However, HTTP is used to serve images, sound, and other content in addition to HTML. To allow the Web browser to know how to handle each document it receives, other information is transmitted along with the document. This metadata usually includes the MIME type (e.g. text/html or application/xhtml+xml) and the character encoding (see Character encoding in HTML).

In modern browsers, the MIME type that is sent with the HTML document may affect how the document is initially interpreted. A document sent with the XHTML MIME type is expected to be well-formed XML, and syntax errors may cause the browser to fail to render it. The same document sent with the HTML MIME type might be displayed successfully, since some browsers are more lenient with HTML.
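When HTTP headers are unavailable (for example, a page opened from the local file system), the character encoding can also be declared within the document itself; a common HTML 4 idiom is:

```html
<head>
  <!-- restates the MIME type and character encoding inside the document -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
```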

The W3C recommendations state that XHTML 1.0 documents that follow the guidelines set forth in the recommendation's Appendix C may be labeled with either MIME type.[50] The current XHTML 1.1 Working Draft also states that XHTML 1.1 documents should[51] be labeled with either MIME type.[52]

HTML e-mail

Most graphical e-mail clients allow the use of a subset of HTML (often ill-defined) to provide formatting and semantic markup not available with plain text. This may include typographic information like coloured headings, emphasized and quoted text, inline images and diagrams. Many such clients include both a GUI editor for composing HTML e-mail messages and a rendering engine for displaying them. Use of HTML in e-mail is controversial because of compatibility issues, because it can help disguise phishing attacks, because it can confuse spam filters, and because the message size is larger than that of plain text.

Naming conventions

The most common filename extension for files containing HTML is .html. A common abbreviation of this is .htm, which originated because some early operating systems and file systems, such as DOS and FAT, limited file extensions to three letters.

HTML Application

An HTML Application (HTA; file extension ".hta") is a Microsoft Windows application that uses HTML and Dynamic HTML in a browser to provide the application's graphical interface. A regular HTML file is confined to the security model of the web browser, communicating only with web servers and manipulating only webpage objects and site cookies. An HTA runs as a fully trusted application and therefore has more privileges, like creation/editing/removal of files and Windows Registry entries. Because they operate outside the browser's security model, HTAs cannot be executed via HTTP, but must be downloaded (just like an EXE file) and executed from the local file system.
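A minimal .hta file is ordinary HTML, optionally carrying an hta:application element that configures the host window; the id and window settings below are invented for illustration:

```html
<html>
<head>
  <title>Minimal HTA</title>
  <!-- optional: configures the HTA host window -->
  <hta:application id="demo" border="thin" maximizebutton="no">
</head>
<body>
  <p>This page runs as a fully trusted application, outside the browser sandbox.</p>
</body>
</html>
```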

Current variations

Since its inception, HTML and its associated protocols gained acceptance relatively quickly. However, no clear standards existed in the early years of the language. Though its creators originally conceived of HTML as a semantic language devoid of presentation details,[53] practical uses pushed many presentational elements and attributes into the language, driven largely by the various browser vendors. The latest standards surrounding HTML reflect efforts to overcome the sometimes chaotic development of the language[54] and to create a rational foundation for building both meaningful and well-presented documents. To return HTML to its role as a semantic language, the W3C has developed style languages such as CSS and XSL to shoulder the burden of presentation. In conjunction, the HTML specification has slowly reined in the presentational elements.

There are two axes differentiating the variations of HTML as currently specified: SGML-based HTML versus XML-based HTML (referred to as XHTML) on one axis, and strict versus transitional (loose) versus frameset on the other axis.

SGML-based versus XML-based HTML

One difference in the latest HTML specifications lies in the distinction between the SGML-based specification and the XML-based specification. The XML-based specification is usually called XHTML to distinguish it clearly from the more traditional definition; however, the root element name continues to be 'html' even in XHTML-specified HTML. The W3C intended XHTML 1.0 to be identical to HTML 4.01 except where the limitations of XML over the more complex SGML require workarounds. Because XHTML and HTML are closely related, they are sometimes

documented in parallel. In such circumstances, some authors conflate the two names as (X)HTML or X(HTML). Like HTML 4.01, XHTML 1.0 has three sub-specifications: strict, loose, and frameset.

Aside from the different opening declarations for a document, the differences between an HTML 4.01 and an XHTML 1.0 document (in each of the corresponding DTDs) are largely syntactic. The underlying syntax of HTML allows many shortcuts that XHTML does not, such as elements with optional opening or closing tags, and even EMPTY elements, which must not have an end tag. By contrast, XHTML requires all elements to have both an opening tag and a closing tag. XHTML, however, also introduces a new shortcut: an XHTML tag may be opened and closed within the same tag, by including a slash before the end of the tag, like this: <br/>. The introduction of this shorthand, which is not used in the SGML declaration for HTML 4.01, may confuse earlier software unfamiliar with the new convention. A fix for this is to include a space before closing the tag, as in <br />.[55]

To understand the subtle differences between HTML and XHTML, consider the transformation of a valid and well-formed XHTML 1.0 document that adheres to Appendix C (see below) into a valid HTML 4.01 document. Making this translation requires the following steps:
1. Specify the language for an element with a lang attribute rather than the XHTML xml:lang attribute. XHTML uses XML's built-in language-defining attribute.
2. Remove the XML namespace (xmlns=URI). HTML has no facilities for namespaces.
3. Change the document type declaration from XHTML 1.0 to HTML 4.01 (see the DTD section for further explanation).
4. If present, remove the XML declaration. (Typically this is: <?xml version="1.0" encoding="utf-8"?>.)
5. Ensure that the document's MIME type is set to text/html. For both HTML and XHTML, this comes from the HTTP Content-Type header sent by the server.
6. Change the XML empty-element syntax to an HTML-style empty element (<br/> to <br>).

Those are the main changes necessary to translate a document from XHTML 1.0 to HTML 4.01. To translate from HTML to XHTML would also require the addition of any omitted opening or closing tags. Whether coding in HTML or XHTML, it may be best simply to always include the optional tags within an HTML document rather than remembering which tags can be omitted.

A well-formed XHTML document adheres to all the syntax requirements of XML. A valid document adheres to the content specification for XHTML, which describes the document structure.

The W3C recommends several conventions to ensure an easy migration between HTML and XHTML (see HTML Compatibility Guidelines [56]). The following steps can be applied to XHTML 1.0 documents only:
• Include both xml:lang and lang attributes on any elements assigning language.
• Use the empty-element syntax only for elements specified as empty in HTML.
• Include an extra space in empty-element tags: for example <br /> instead of <br/>.
• Include explicit close tags for elements that permit content but are left empty (for example, <div></div>, not <div />).
• Omit the XML declaration.

By carefully following the W3C's compatibility guidelines, a user agent should be able to interpret the document equally as HTML or XHTML. For documents that are XHTML 1.0 and have been made compatible in this way, the W3C permits them to be served either as HTML (with a text/html MIME type) or as XHTML (with an application/xhtml+xml or application/xml MIME type). When delivered as XHTML, browsers should use an XML parser, which adheres strictly to the XML specifications for parsing the document's contents.
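The translation steps above can be illustrated with a minimal invented document, shown first as Appendix-C-compatible XHTML 1.0 and then as the equivalent HTML 4.01:

```html
<!-- XHTML 1.0 Strict, Appendix C style -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head><title>Example</title></head>
  <body><p>First line.<br />Second line.</p></body>
</html>

<!-- The same document as HTML 4.01 Strict: namespace and xml:lang removed,
     doctype changed, <br /> rewritten as <br> -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
  <head><title>Example</title></head>
  <body><p>First line.<br>Second line.</p></body>
</html>
```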

Transitional versus strict

HTML 4 defined three different versions of the language: Strict, Transitional (once called Loose), and Frameset. The Strict version is intended for new documents and is considered best practice, while the Transitional and Frameset versions were developed to make it easier to transition documents that conformed to older HTML specifications, or didn't conform to any specification, to a version of HTML 4. The Transitional and Frameset versions allow for presentational markup, which is omitted in the Strict version. Instead, cascading style sheets are encouraged to improve the presentation of HTML documents. Because XHTML 1 only defines an XML syntax for the language defined by HTML 4, the same differences apply to XHTML 1 as well.

The Transitional version allows the following parts of the vocabulary, which are not included in the Strict version:
• A looser content model
  • Inline elements and plain text are allowed directly in: body, blockquote, form, noscript and noframes
• Presentation related elements
  • underline (u)
  • strike-through (s)
  • center
  • font
  • basefont
• Presentation related attributes
  • background and bgcolor attributes for body element
  • align attribute on div, form, paragraph (p), and heading (h1...h6) elements
  • align, noshade, size, and width attributes on hr element
  • align, border, vspace, and hspace attributes on img and object elements
  • align attribute on legend and caption elements
  • align and bgcolor on table element
  • nowrap, bgcolor, width, height on td and th elements
  • bgcolor attribute on tr element
  • clear attribute on br element
  • compact attribute on dl, dir and menu elements
  • type, compact, and start attributes on ol and ul elements
  • type and value attributes on li element
  • width attribute on pre element
• Additional elements in the Transitional specification
  • menu list (no substitute, though unordered list is recommended)
  • dir list (no substitute, though unordered list is recommended)
  • isindex (the element requires server-side support and is typically added to documents server-side; form and input elements can be used as a substitute)
  • applet (deprecated in favor of the object element)
  • The language attribute on script element (redundant with the type attribute)
• Frame related entities
  • iframe
  • noframes
  • target attribute on anchor, client-side image-map (imagemap), link, form, and base elements

The Frameset version includes everything in the Transitional version, as well as the frameset element (used instead of body) and the frame element.
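In practice, the difference looks like this: the first fragment validates only against the Transitional DTD, while the second expresses the same appearance in Strict HTML with CSS (the inline style is an invented example):

```html
<!-- Transitional only: presentational element and attribute -->
<p align="center"><font color="red">Warning</font></p>

<!-- Strict: the same effect via CSS -->
<p style="text-align: center; color: red;">Warning</p>
```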


HTML

Frameset versus transitional

In addition to the above transitional differences, the frameset specifications (whether XHTML 1.0 or HTML 4.01) specify a different content model, with frameset replacing body, containing frame elements and, optionally, noframes with a body.
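A skeletal frameset document (the frame sources are invented) looks like this:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
    "http://www.w3.org/TR/html4/frameset.dtd">
<html>
  <head><title>Frameset example</title></head>
  <!-- frameset replaces body -->
  <frameset cols="20%, 80%">
    <frame src="menu.html">
    <frame src="content.html">
    <!-- fallback for browsers without frame support -->
    <noframes><body><p>Your browser does not support frames.</p></body></noframes>
  </frameset>
</html>
```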

Summary of specification versions

As this list demonstrates, the loose versions of the specification are maintained for legacy support. However, contrary to popular misconceptions, the move to XHTML does not imply a removal of this legacy support. Rather, the X in XML stands for extensible, and the W3C is modularizing the entire specification and opening it up to independent extensions. The primary achievement in the move from XHTML 1.0 to XHTML 1.1 is the modularization of the entire specification. The strict version of HTML is deployed in XHTML 1.1 through a set of modular extensions to the base XHTML 1.1 specification. Likewise, someone looking for the loose (transitional) or frameset specifications will find similar extended XHTML 1.1 support (much of it is contained in the legacy or frame modules). The modularization also allows separate features to develop on their own timetables. So, for example, XHTML 1.1 will allow quicker migration to emerging XML standards such as MathML (a presentational and semantic math language based on XML) and XForms, a new highly advanced web-form technology to replace the existing HTML forms.

In summary, the HTML 4.01 specification primarily reined in all the various HTML implementations into a single, clearly written specification based on SGML. XHTML 1.0 ported this specification, as is, to the new XML-defined specification. Next, XHTML 1.1 takes advantage of the extensible nature of XML and modularizes the whole specification. XHTML 2.0 will be the first step in adding new features to the specification in a standards-body-based approach.

Hypertext features not in HTML

HTML lacks some of the features found in earlier hypertext systems, such as typed links, source tracking, fat links, and more.[57] Even some hypertext features that were in early versions of HTML have been ignored by most popular web browsers until recently, such as the link element and in-browser Web page editing. Sometimes Web services or browser manufacturers remedy these shortcomings. For instance, wikis and content management systems allow surfers to edit the Web pages they visit.

WYSIWYG Editors

There are some WYSIWYG editors, in which the user lays out everything as it is to appear in the HTML document using a graphical user interface, and the editor renders this as an HTML document, no longer requiring the author to have extensive knowledge of HTML. Web page editing is clearly dominated by the WYSIWYG editing model, but this model has been criticized,[58][59] primarily because of the low quality of the generated code, and there are voices advocating a change to the WYSIWYM model.

WYSIWYG editors remain a controversial topic because of their perceived flaws, such as:
• Relying mainly on layout as opposed to meaning, often using markup that does not convey the intended meaning but simply copies the layout.[60]
• Often producing extremely verbose and redundant code that fails to make use of the cascading nature of HTML and CSS.
• Often producing ungrammatical markup, often called tag soup.
• As a great deal of the information in HTML documents is not in the layout, the model has been criticized for its 'what you see is all you get' nature.[61]

Nevertheless, since WYSIWYG editors offer convenience over hand-coded pages as well as not requiring the author to know the finer details of HTML, they still dominate web authoring.

See also
• Breadcrumb (navigation)
• HTML decimal character rendering
• HTML element
• JHTML
• List of computer standards
• List of document markup languages
• Microformat
• The HTML Sourcebook: The Complete Guide to HTML (historical reference from 1995)

External links
• HTML 4.01, the most recent finished specification (1999) [62]
• HTML 5, the upcoming version of HTML [63]
• Dave Raggett's Introduction to HTML [64]
• Empty elements in SGML, HTML, XML, and XHTML [65]

HTML tutorials
• HTML Dog [66]
• HTML.net [67]
• Your HTML Source [68]
• HTML tutorial [69]

References
[1] http://www.w3.org/TR/1999/REC-html401-19991224/
[2] http://dev.w3.org/html5/spec/
[3] HTML 4 — Conformance: requirements and recommendations (http://www.w3.org/TR/html401/conform.html#deprecated)
[4] Tim Berners-Lee, "Information Management: A Proposal." CERN (March 1989, May 1990). W3.org (http://www.w3.org/History/1989/proposal.html)
[5] Tim Berners-Lee, "Design Issues" (http://www.w3.org/DesignIssues/)
[6] Tim Berners-Lee, "Design Issues" (http://www.w3.org/DesignIssues/Uses.html)
[7] "Tags used in HTML" (http://www.w3.org/History/19921103-hypertext/hypertext/WWW/MarkUp/Tags.html). World Wide Web Consortium. November 3, 1992. Retrieved November 16, 2008.
[8] "First mention of HTML Tags on the www-talk mailing list" (http://lists.w3.org/Archives/Public/www-talk/1991SepOct/0003.html). World Wide Web Consortium. October 29, 1991. Retrieved April 8, 2007.
[9] "Index of elements in HTML 4" (http://www.w3.org/TR/1999/REC-html401-19991224/index/elements). World Wide Web Consortium. December 24, 1999. Retrieved April 8, 2007.
[10] http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt
[11] Tim Berners-Lee (December 9, 1991). "Re: SGML/HTML docs, X Browser (archived www-talk mailing list post)" (http://lists.w3.org/Archives/Public/www-talk/1991NovDec/0020.html). Retrieved June 16, 2007. "SGML is very general. HTML is a specific application of the SGML basic syntax applied to hypertext documents with simple structure."
[12] Raymond, Eric. "IETF and the RFC Standards Process" (http://www.faqs.org/docs/artu/ietf_process.html). The Art of Unix Programming (http://www.faqs.org/docs/artu/). "In IETF tradition, standards have to arise from experience with a working prototype implementation — but once they become standards, code that does not conform to them is considered broken and mercilessly scrapped. …Internet-Drafts are not specifications, and software implementers and vendors are specifically barred from claiming compliance with them as if they were specifications. Internet-Drafts are focal points for discussion, usually in a working group… Once an Internet-Draft has been published with an RFC number, it is a specification to which implementers may claim conformance. It is expected that the authors of the RFC and the community at large will begin correcting the specification with field experience."


[13] "HTML+ Internet-Draft - Abstract" (https://datatracker.ietf.org/public/idindex.cgi?command=id_detail&id=789). "Browser writers are experimenting with extensions to HTML and it is now appropriate to draw these ideas together into a revised document format. The new format is designed to allow a gradual roll over from HTML, adding features like tables, captioned figures and fill-out forms for querying remote databases or mailing questionnaires."
[14] "RFC 1866: Hypertext Markup Language - 2.0 - Acknowledgments" (http://www.ietf.org/rfc/rfc1866.txt). Internet Engineering Task Force. September 22, 2005. Retrieved June 16, 2007. "Since 1993, a wide variety of Internet participants have contributed to the evolution of HTML, which has included the addition of in-line images introduced by the NCSA Mosaic software for WWW. Dave Raggett played an important role in deriving the forms material from the HTML+ specification. Dan Connolly and Karen Olson Muldrow rewrote the HTML Specification in 1994. The document was then edited by the HTML working group as a whole, with updates being made by Eric Schieler, Mike Knezovich, and Eric W. Sink at Spyglass, Inc. Finally, Roy Fielding restructured the entire draft into its current form."
[15] "RFC 1866: Hypertext Markup Language - 2.0 - Introduction" (http://www.ietf.org/rfc/rfc1866.txt). Internet Engineering Task Force. September 22, 2005. Retrieved June 16, 2007. "This document thus defines an HTML 2.0 (to distinguish it from the previous informal specifications). Future (generally upwardly compatible) versions of HTML with new features will be released with higher version numbers."
[16] Raggett, Dave (1998). Raggett on HTML 4 (http://www.w3.org/People/Raggett/book4/ch02.html). Retrieved July 9, 2007.
[17] "HTML 3.2 Reference Specification" (http://www.w3.org/TR/REC-html32). World Wide Web Consortium. January 14, 1997. Retrieved November 16, 2008.
[18] "IETF HTML WG" (http://www.w3.org/MarkUp/HTML-WG/). Retrieved June 16, 2007. "Note: This working group is closed"
[19] "HTML 4.0 Specification" (http://www.w3.org/TR/REC-html40-971218/). World Wide Web Consortium. December 18, 1997. Retrieved November 16, 2008.
[20] Arnoud Engelfriet. "Introduction to Wilbur" (http://htmlhelp.com/reference/wilbur/intro.html). Web Design Group. Retrieved June 16, 2007.
[21] "HTML 4 - 4 Conformance: requirements and recommendations" (http://www.w3.org/TR/html4/conform.html#h-4.2). Retrieved December 30, 2009.
[22] "HTML 4.0 Specification" (http://www.w3.org/TR/1998/REC-html40-19980424/). World Wide Web Consortium. April 24, 1998. Retrieved November 16, 2008.
[23] "HTML 4.01 Specification" (http://www.w3.org/TR/html401/). World Wide Web Consortium. December 24, 1999. Retrieved November 16, 2008.
[24] http://www.w3.org/MarkUp/html4-updates/errata
[25] ISO (2000). "ISO/IEC 15445:2000 - Information technology -- Document description and processing languages -- HyperText Markup Language (HTML)" (http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=27688). Retrieved December 26, 2009.
[26] CS.TCD.ie (https://www.cs.tcd.ie/15445/15445.HTML)
[27] http://lists.w3.org/Archives/Public/www-talk/1992JulAug/0020.html
[28] Hypertext Markup Language: A Representation of Textual Information and MetaInformation for Retrieval and Interchange (http://tools.ietf.org/html/draft-ietf-iiir-html-00)
[29] http://tools.ietf.org/html/draft-ietf-html-spec-00
[30] "HTML 3.0 Draft (Expired!) Materials" (http://www.w3.org/MarkUp/html3/). World Wide Web Consortium. December 21, 1995. Retrieved November 16, 2008.
[31] "HyperText Markup Language Specification Version 3.0" (http://www.w3.org/MarkUp/html3/CoverPage). Retrieved June 16, 2007.
[32] "HTML 5" (http://www.w3.org/TR/html5/). World Wide Web Consortium. June 10, 2008. Retrieved November 16, 2008.
[33] "HTML 5, one vocabulary, two serializations" (http://www.w3.org/QA/2008/01/html5-is-html-and-xml.html). Retrieved February 25, 2009.
[34] "XHTML 1.0: The Extensible HyperText Markup Language (Second Edition)" (http://www.w3.org/TR/xhtml1/). World Wide Web Consortium. January 26, 2000. Retrieved November 16, 2008.
[35] "XHTML 1.1 - Module-based XHTML - Second Edition" (http://www.w3.org/TR/xhtml11/). World Wide Web Consortium. February 16, 2007. Retrieved November 16, 2008.
[36] http://www.w3.org/TR/xhtml-modularization/
[37] "XHTML 2.0" (http://www.w3.org/TR/xhtml2/). World Wide Web Consortium. July 26, 2006. Retrieved November 16, 2008. "XHTML 2 Working Group Expected to Stop Work End of 2009, W3C to Increase Resources on HTML 5" (http://www.w3.org/News/2009#item119). World Wide Web Consortium. July 17, 2009. Retrieved November 16, 2008.
[38] "HTML 5" (http://www.w3.org/html/wg/html5/). World Wide Web Consortium. October 24, 2008. Retrieved November 16, 2008.
[39] http://www.w3.org/2008/Talks/04-24-smith/index.html
[40] Activating Browser Modes with Doctype (http://hsivonen.iki.fi/doctype/)
[41] http://en.wikipedia.org/
[42] "On SGML and HTML" (http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.2). World Wide Web Consortium. Retrieved November 16, 2008.
[43] "XHTML 1.0 - Differences with HTML 4" (http://www.w3.org/TR/xhtml1/diffs.html#h-4.4). World Wide Web Consortium. Retrieved November 16, 2008.

217


HTML [44] Korpela, Jukka (July 6, 1998). "Why attribute values should always be quoted in HTML" (http:/ / www. cs. tut. fi/ ~jkorpela/ qattr. html). Cs.tut.fi. . Retrieved November 16, 2008. [45] "Objects, Images, and Applets in HTML documents" (http:/ / www. w3. org/ TR/ 1999/ REC-html401-19991224/ struct/ objects. html#adef-ismap). World Wide Web Consortium. December 24, 1999. . Retrieved November 16, 2008. [46] Berners-Lee, Tim; Fischetti, Mark (2000). Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor. San Francisco: Harper. ISBN 978-0-06-251587-X. [47] Raggett, Dave (2002). "Adding a touch of style" (http:/ / www. w3. org/ MarkUp/ Guide/ Style. html). W3C. . Retrieved October 2, 2009. This article notes that presentational HTML markup may be useful when targeting browsers "before Netscape 4.0 and Internet Explorer 4.0". See the list of web browsers to confirm that these were both released in 1997. [48] Tim Berners-Lee, James Hendler and Ora Lassila (2001). "The Semantic Web" (http:/ / www. scientificamerican. com/ article. cfm?id=the-semantic-web). Scientific American. . Retrieved October 2, 2009. [49] Nigel Shadbolt, Wendy Hall and Tim Berners-Lee (2006). "The Semantic Web Revisited" (http:/ / eprints. ecs. soton. ac. uk/ 12614/ 1/ Semantic_Web_Revisted. pdf). IEEE Intelligent Systems. . Retrieved October 2, 2009. [50] "XHTML 1.0 The Extensible HyperText Markup Language (Second Edition)" (http:/ / www. w3. org/ TR/ xhtml1/ #media). World Wide Web Consortium. 2000, revised 2002. . Retrieved December 7, 2008. "XHTML Documents which follow the guidelines set forth in Appendix C, "HTML Compatibility Guidelines" may be labeled with the Internet Media Type "text/html" [RFC2854], as they are compatible with most HTML browsers. Those documents, and any other document conforming to this specification, may also be labeled with the Internet Media Type "application/xhtml+xml" as defined in [RFC3236]." 
[51] "RFC 2119: Key words for use in RFCs to Indicate Requirement Levels" (http:/ / www. ietf. org/ rfc/ rfc2119. txt). Harvard University. 1997. . Retrieved December 7, 2008. "3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course." [52] "XHTML 1.1 - Module-based XHTML - Second Edition" (http:/ / www. w3. org/ TR/ xhtml11/ conformance. html#strict). World Wide Web Consortium. 2007. . Retrieved December 7, 2008. "XHTML 1.1 documents SHOULD be labeled with the Internet Media Type text/html as defined in [RFC2854] or application/xhtml+xml as defined in [RFC3236]." [53] http:/ / www. w3. org/ History/ 19921103-hypertext/ hypertext/ WWW/ MarkUp/ HTMLConstraints. html [54] http:/ / ei. cs. vt. edu/ ~wwwbtb/ book/ chap13/ who. html [55] Freeman, E (2005). Head First HTML. O'Reilly. [56] http:/ / www. w3. org/ TR/ xhtml1/ #guidelines [57] Jakob Nielsen (January 3, 2005). "Reviving Advanced Hypertext" (http:/ / www. useit. com/ alertbox/ 20050103. html). . Retrieved June 16, 2007. [58] Sauer, C.: WYSIWIKI - Questioning WYSIWYG in the Internet Age. In: Wikimania (2006) [59] Spiesser, J., Kitchen, L.: Optimization of html automatically generated by WYSIWYG programs. In: 13th International Conference on World Wide Web, pp. 355--364. WWW '04. ACM, New York, NY (New York, NY, USA, May 17-20, 2004) [60] http:/ / xhtml. com/ en/ xhtml/ reference/ blockquote/ [61] http:/ / www. invisiblerevolution. net/ [62] http:/ / www. w3. org/ TR/ html401/ [63] http:/ / dev. w3. org/ html5/ spec/ spec. html [64] http:/ / www. w3. org/ MarkUp/ Guide/ [65] http:/ / www. cs. tut. fi/ ~jkorpela/ html/ empty. html [66] http:/ / htmldog. com/ guides/ [67] http:/ / www. html. net/ tutorials/ html/ introduction. asp [68] http:/ / www. yourhtmlsource. com/ [69] http:/ / programming-guides. com/ html

218


Web 2.0


Web 2.0 The term "Web 2.0" (2004–present) is commonly associated with web applications that facilitate interactive information sharing, interoperability, user-centered design,[1] and collaboration on the World Wide Web. Examples of Web 2.0 include web-based communities, hosted services, web applications, social-networking sites, video-sharing sites, wikis, blogs, mashups, and folksonomies. A Web 2.0 site allows its users to interact with each other as contributors to the website's content, in contrast to non-interactive websites where users are limited to the passive viewing of information that is provided to them.

A tag cloud (a typical Web 2.0 phenomenon in itself) presenting Web 2.0 themes

The term is closely associated with Tim O'Reilly because of the O'Reilly Media Web 2.0 conference in 2004.[2] [3] Although the term suggests a new version of the World Wide Web, it does not refer to an update to any technical specifications, but rather to cumulative changes in the ways software developers and end-users use the Web. Whether Web 2.0 is qualitatively different from prior web technologies has been challenged by World Wide Web inventor Tim Berners-Lee, who called the term a "piece of jargon"[4] — precisely because he specifically intended the Web to embody these values in the first place.

History: From Web 1.0 to 2.0

The term "Web 2.0" was coined in 1999 by Darcy DiNucci. In her article, "Fragmented Future," DiNucci writes:[5]

The Web we know now, which loads into a browser window in essentially static screenfulls, is only an embryo of the Web to come. The first glimmerings of Web 2.0 are beginning to appear, and we are just starting to see how that embryo might develop. The Web will be understood not as screenfulls of text and graphics but as a transport mechanism, the ether through which interactivity happens. It will [...] appear on your computer screen, [...] on your TV set [...] your car dashboard [...] your cell phone [...] hand-held game machines [...] maybe even your microwave oven.

Her use of the term deals mainly with Web design and aesthetics; she argues that the Web is "fragmenting" due to the widespread use of portable Web-ready devices. Her article is aimed at designers, reminding them to code for an ever-increasing variety of hardware. As such, her use of the term hints at – but does not directly relate to – the current uses of the term.

The term did not resurface until 2003.[6] [7] [8] These authors focus on the concepts currently associated with the term where, as Scott Dietzen puts it, "the Web becomes a universal, standards-based integration platform".[9] In 2004, the term began its rise in popularity when O'Reilly Media and MediaLive hosted the first Web 2.0 conference. In their opening remarks, John Battelle and Tim O'Reilly outlined their definition of the "Web as Platform", where software applications are built upon the Web as opposed to upon the desktop. The unique aspect of this migration, they argued, is that "customers are building your business for you".[10] They argued that the activities of users generating content (in the form of ideas, text, videos, or pictures) could be "harnessed" to create value.



O'Reilly et al. contrasted Web 2.0 with what they called "Web 1.0". They associated Web 1.0 with the business models of Netscape and the Encyclopedia Britannica Online. For example, Netscape framed "the web as platform" in terms of the old software paradigm: their flagship product was the web browser, a desktop application, and their strategy was to use their dominance in the browser market to establish a market for high-priced server products. Control over standards for displaying content and applications in the browser would, in theory, give Netscape the kind of market power enjoyed by Microsoft in the PC market. Much like the "horseless carriage" framed the automobile as an extension of the familiar, Netscape promoted a "webtop" to replace the desktop, and planned to populate that webtop with information updates and applets pushed to the webtop by information providers who would purchase Netscape servers.[11] In short, Netscape focused on creating software, updating it on occasion, and distributing it to the end users.

O'Reilly contrasted this with Google, a company which did not at the time focus on producing software, such as a browser, but instead focused on providing a service based on data: the links Web page authors make between sites. Google exploits this user-generated content to offer Web search based on reputation through its PageRank algorithm. Unlike software, which undergoes scheduled releases, such services are constantly updated, a process called "the perpetual beta". A similar difference can be seen between the Encyclopedia Britannica Online and Wikipedia: while the Britannica relies upon experts to create articles and releases them periodically in publications, Wikipedia relies on trust in anonymous users to constantly and quickly build content.
Wikipedia is not based on expertise but rather an adaptation of the open source software adage "given enough eyeballs, all bugs are shallow", and it produces and updates articles constantly. O'Reilly's Web 2.0 conferences have been held every year since 2004, attracting entrepreneurs, large companies, and technology reporters. In terms of the lay public, the term Web 2.0 was largely championed by bloggers and by technology journalists, culminating in the 2006 TIME magazine Person of the Year, "You".[12] That is, TIME selected the masses of users who were participating in content creation on social networks, blogs, wikis, and media sharing sites. The cover story author Lev Grossman explains:

It's a story about community and collaboration on a scale never seen before. It's about the cosmic compendium of knowledge Wikipedia and the million-channel people's network YouTube and the online metropolis MySpace. It's about the many wresting power from the few and helping one another for nothing and how that will not only change the world, but also change the way the world changes.

Since that time, Web 2.0 has found a place in the lexicon; the Global Language Monitor recently declared it to be the one-millionth English word.[13]



Characteristics

Web 2.0 websites allow users to do more than just retrieve information. They can build on the interactive facilities of "Web 1.0" to provide "Network as platform" computing, allowing users to run software applications entirely through a browser.[3] Users can own the data on a Web 2.0 site and exercise control over that data.[3] [14] These sites may have an "Architecture of participation" that encourages users to add value to the application as they use it.[2] [3] The concept of Web-as-participation-platform captures many of these characteristics. Bart Decrem, a founder and former CEO of Flock, calls Web 2.0 the "participatory Web"[15] and regards the Web-as-information-source as Web 1.0.

Flickr, a Web 2.0 web site that allows its users to upload and share photos

The impossibility of excluding group-members who don't contribute to the provision of goods from sharing profits gives rise to the possibility that rational members will prefer to withhold their contribution of effort and free-ride on the contribution of others.[16] This requires what is sometimes called Radical Trust by the management of the website. According to Best,[17] the characteristics of Web 2.0 are: rich user experience, user participation, dynamic content, metadata, web standards and scalability. Further characteristics, such as openness, freedom[18] and collective intelligence[19] by way of user participation, can also be viewed as essential attributes of Web 2.0.

Technology overview

Web 2.0 draws together the capabilities of client- and server-side software, content syndication and the use of network protocols. Standards-oriented web browsers may use plug-ins and software extensions to handle the content and the user interactions. Web 2.0 sites provide users with information storage, creation, and dissemination capabilities that were not possible in the environment now known as "Web 1.0". Web 2.0 websites typically include some of the following features and techniques. Andrew McAfee used the acronym SLATES to refer to them:[20]

Search
Finding information through keyword search.

Links
Connects information together into a meaningful information ecosystem using the model of the Web, and provides low-barrier social tools.

Authoring
The ability to create and update content leads to the collaborative work of many rather than just a few web authors. In wikis, users may extend, undo and redo each other's work. In blogs, posts and the comments of individuals build up over time.

Tags
Categorization of content by users adding "tags" (short, usually one-word descriptions) to facilitate searching, without dependence on pre-made categories. Collections of tags created by many users within a single system may be referred to as "folksonomies" (i.e., folk taxonomies).

Extensions
Software that makes the Web an application platform as well as a document server.

Signals


The use of syndication technology such as RSS to notify users of content changes.
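The "Tags" element of SLATES can be sketched in a few lines: free-form tags contributed by many users are merged into a shared, weighted vocabulary (a folksonomy) that supports search without pre-made categories. All item names, tags, and function names below are hypothetical illustrations, not part of any real system.

```javascript
// Sketch of a folksonomy: merge free-form tags from many users into
// per-item tag counts, with no predefined category scheme.
function buildFolksonomy(taggings) {
  // taggings: [{ user, item, tags: ["tag", ...] }, ...]
  const folksonomy = {};
  for (const { item, tags } of taggings) {
    folksonomy[item] = folksonomy[item] || {};
    for (const tag of tags.map(t => t.toLowerCase())) {
      folksonomy[item][tag] = (folksonomy[item][tag] || 0) + 1;
    }
  }
  return folksonomy;
}

// Tag-based search: items whose aggregated tags include the query term,
// most frequently tagged first.
function searchByTag(folksonomy, tag) {
  const t = tag.toLowerCase();
  return Object.keys(folksonomy)
    .filter(item => folksonomy[item][t])
    .sort((a, b) => folksonomy[b][t] - folksonomy[a][t]);
}
```

Because the vocabulary emerges from user behavior rather than an editor's taxonomy, the same item can surface under every term its audience actually uses for it.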

While SLATES forms the basic framework of Enterprise 2.0, it does not contradict all of the higher level Web 2.0 design patterns and business models. And in this way, the new Web 2.0 report from O'Reilly is quite effective and diligent in interweaving the story of Web 2.0 with the specific aspects of Enterprise 2.0. It includes discussions of self-service IT, the long tail of enterprise IT demand, and many other consequences of the Web 2.0 era in the enterprise. The report also makes many sensible recommendations around starting small with pilot projects and measuring results, among a fairly long list. [21]

How it works

The client-side/web browser technologies typically used in Web 2.0 development are Asynchronous JavaScript and XML (Ajax), Adobe Flash and the Adobe Flex framework, and JavaScript/Ajax frameworks such as Yahoo! UI Library, Dojo Toolkit, MooTools, and jQuery. Ajax programming uses JavaScript to upload and download new data from the web server without undergoing a full page reload. To permit the user to continue to interact with the page, communications such as data requests going to the server are separated from data coming back to the page (asynchronously). Otherwise, the user would have to routinely wait for the data to come back before being able to do anything else on that page, just as a user has to wait for a page to complete a reload. This also increases the overall performance of the site, as the sending of requests can complete more quickly, independent of the blocking and queueing required to send data back to the client.

The data fetched by an Ajax request is typically formatted in XML or JSON (JavaScript Object Notation), two widely used structured data formats. Since both of these formats are natively understood by JavaScript, a programmer can easily use them to transmit structured data in a web application. When this data is received via Ajax, the JavaScript program then uses the Document Object Model (DOM) to dynamically update the web page based on the new data, allowing for a rapid and interactive user experience. In short, using these techniques, Web designers can make their pages function like desktop applications. For example, Google Docs uses this technique to create a Web-based word processor.

Adobe Flex is another technology often used in Web 2.0 applications. Compared to JavaScript libraries like jQuery, Flex makes it easier for programmers to populate large data grids, charts, and other heavy user interactions.[22] Applications programmed in Flex are compiled and displayed as Flash within the browser.
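The Ajax round trip described above can be sketched as follows. The payload shape and element names are invented for illustration; in a real page the request would be made with XMLHttpRequest (or a framework wrapper) and the update applied through the DOM rather than by returning a string.

```javascript
// Sketch of the Ajax pattern: a JSON payload arrives asynchronously from
// the server, and only the affected fragment of the page is rebuilt --
// no full page reload. The "comments" payload shape is hypothetical.
function renderComments(jsonPayload) {
  const data = JSON.parse(jsonPayload); // JSON is natively understood by JavaScript
  const items = data.comments
    .map(c => `<li><b>${c.author}</b>: ${c.text}</li>`)
    .join("");
  return `<ul id="comments">${items}</ul>`;
}

// In a browser, this would run when the asynchronous request completes,
// e.g. (hypothetical endpoint):
//   xhr.open("GET", "/comments.json");
//   xhr.onload = () => {
//     document.getElementById("comments").outerHTML =
//       renderComments(xhr.responseText);
//   };
```

The user keeps interacting with the rest of the page while the request is in flight; only the comments list is replaced when the response arrives.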
As a widely available plugin independent of W3C (World Wide Web Consortium, the governing body of web standards and protocols) standards, Flash is capable of doing many things which are not currently possible in HTML, the language used to construct web pages. Of Flash's many capabilities, the most commonly used in Web 2.0 is its ability to play audio and video files. This has allowed for the creation of Web 2.0 sites where video media is seamlessly integrated with standard HTML.

In addition to Flash and Ajax, JavaScript/Ajax frameworks have recently become a very popular means of creating Web 2.0 sites. At their core, these frameworks do not use technology any different from JavaScript, Ajax, and the DOM. What frameworks do is smooth over inconsistencies between web browsers and extend the functionality available to developers. Many of them also come with customizable, prefabricated 'widgets' that accomplish such common tasks as picking a date from a calendar, displaying a data chart, or making a tabbed panel.

On the server side, Web 2.0 uses many of the same technologies as Web 1.0. Languages such as PHP, Ruby, ColdFusion, Perl, Python, JSP and ASP are used by developers to dynamically output data using information from files and databases. What has begun to change in Web 2.0 is the way this data is formatted. In the early days of the Internet, there was little need for different websites to communicate with each other and share data. In the new "participatory web", however, sharing data between sites has become an essential capability. To share its data with other sites, a web site must be able to generate output in machine-readable formats such as XML, RSS, and JSON. When a site's data is available in one of these formats, another website can use it to integrate a portion of that site's functionality into itself, linking the two together. When this design pattern is implemented, it ultimately leads to data



that is both easier to find and more thoroughly categorized, a hallmark of the philosophy behind the Web 2.0 movement.
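The site-to-site sharing pattern above can be sketched in a few lines: one site exports its records in a machine-readable format (JSON here), and another site parses that feed and embeds just the slice it needs. The record fields and function names are hypothetical.

```javascript
// Sketch of machine-readable data sharing between two sites.
// Site A serializes its records as a JSON feed. Field names are
// invented for illustration.
function exportAsJson(records) {
  return JSON.stringify({ version: 1, items: records });
}

// Site B consumes the feed and integrates a portion of it, e.g. the
// n newest titles for a sidebar widget, linking the two sites together.
function latestTitles(jsonFeed, n) {
  const feed = JSON.parse(jsonFeed);
  return feed.items
    .slice() // avoid mutating the parsed feed
    .sort((a, b) => b.published - a.published)
    .slice(0, n)
    .map(item => item.title);
}
```

The same records could equally be exposed as XML or RSS; the essential point is that the format is structured enough for another program, not just a human reader, to consume.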

Usage

The popularity of the term Web 2.0, along with the increasing use of blogs, wikis, and social networking technologies, has led many in academia and business to coin a flurry of 2.0s,[23] including Library 2.0,[24] Social Work 2.0,[25] Enterprise 2.0, PR 2.0,[26] Classroom 2.0, Publishing 2.0, Medicine 2.0, Telco 2.0, Travel 2.0, Government 2.0,[27] and even Porn 2.0.[28] Many of these 2.0s refer to Web 2.0 technologies as the source of the new version in their respective disciplines and areas. For example, in the Talis white paper "Library 2.0: The Challenge of Disruptive Innovation", Paul Miller argues:

Blogs, wikis and RSS are often held up as exemplary manifestations of Web 2.0. A reader of a blog or a wiki is provided with tools to add a comment or even, in the case of the wiki, to edit the content. This is what we call the Read/Write web. Talis believes that Library 2.0 means harnessing this type of participation so that libraries can benefit from increasingly rich collaborative cataloguing efforts, such as including contributions from partner libraries as well as adding rich enhancements, such as book jackets or movie files, to records from publishers and others.[29]

Here, Miller links Web 2.0 technologies and the culture of participation that they engender to the field of library science, supporting his claim that there is now a "Library 2.0". Many of the other proponents of new 2.0s mentioned here use similar methods.

Web 3.0

Not much time passed before "Web 3.0" was coined. Definitions of Web 3.0 vary greatly. Amit Agarwal states that Web 3.0 is, among other things, about the Semantic Web and personalization.[30] Andrew Keen, author of The Cult of the Amateur,[31] considers the Semantic Web an "unrealisable abstraction" and sees Web 3.0 as the return of experts and authorities to the Web. For example, he points to Bertelsmann's deal with the German Wikipedia to produce an edited print version of that encyclopedia. Others still, such as the organization strategist Manoj Sharma in his keynote "A Brave New World Of Web 3.0", propose that Web 3.0 will be a "Totally Integrated World", a cradle-to-grave experience of being always plugged onto the net.[32] CNN Money's Jessi Hempel expects Web 3.0 to emerge from new and innovative Web 2.0 services with a profitable business model.[33] Conrad Wolfram has argued that Web 3.0 is where "the computer is generating new information", rather than humans.[34]

Web-based applications and desktops

Ajax has prompted the development of websites that mimic desktop applications, such as word processing, the spreadsheet, and slide-show presentation. WYSIWYG wiki sites replicate many features of PC authoring applications. In 2006 Google, Inc. acquired one of the best-known sites of this broad class, Writely.[35] Several browser-based "operating systems" have emerged, including EyeOS[36] and YouOS.[37] Although coined as such, many of these services function less like a traditional operating system and more as an application platform. They mimic the user experience of desktop operating systems, offering features and applications similar to a PC environment, as well as the added ability of being able to run within any modern browser. However, these operating systems do not control the hardware on the client's computer. Numerous web-based application services appeared during the dot-com bubble of 1997–2001 and then vanished, having failed to gain a critical mass of customers. In 2005, WebEx acquired one of the better-known of these, Intranets.com, for $45 million.[38]



XML and RSS

Advocates of "Web 2.0" may regard syndication of site content as a Web 2.0 feature, involving as it does standardized protocols, which permit end-users to make use of a site's data in another context (such as another website, a browser plugin, or a separate desktop application). Protocols which permit syndication include RSS (Really Simple Syndication, also known as "web syndication"), RDF (as in RSS 1.1), and Atom, all of them XML-based formats. Observers have started to refer to these technologies as "web feeds" as the usability of Web 2.0 evolves and the more user-friendly Feeds icon supplants the RSS icon.

Specialized protocols

Specialized protocols such as FOAF and XFN (both for social networking) extend the functionality of sites or permit end-users to interact without centralized websites. Other protocols, such as XMPP, enable services to reach users through instant-messaging clients.
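Syndication boils down to serializing a site's items into one of these XML formats so that feed readers and other sites can poll it. A minimal sketch of producing an RSS 2.0 document follows; the channel title, link, and item values are hypothetical, and a production feed would carry more elements (description, pubDate, and so on).

```javascript
// Sketch of web syndication: serialize items into a minimal RSS 2.0
// document. All channel/item values here are hypothetical examples.
function toRss(channelTitle, link, items) {
  // XML special characters must be escaped inside element text.
  const escapeXml = s =>
    s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  const body = items
    .map(i => `  <item><title>${escapeXml(i.title)}</title><link>${escapeXml(i.link)}</link></item>`)
    .join("\n");
  return `<?xml version="1.0"?>
<rss version="2.0">
 <channel>
  <title>${escapeXml(channelTitle)}</title>
  <link>${escapeXml(link)}</link>
${body}
 </channel>
</rss>`;
}
```

A subscriber periodically re-fetches this document and notifies the user of new items, which is exactly the "Signals" role in the SLATES scheme.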

Web APIs

Web 2.0 often uses machine-based interactions such as REST and SOAP. Often servers use proprietary APIs, but standard APIs (for example, for posting to a blog or notifying a blog update) have also come into wide use. Most communications through APIs involve XML or JSON payloads. Web Services Description Language (WSDL) is the standard way of publishing a SOAP API, and there are a range of Web Service specifications. See also EMML by the Open Mashup Alliance for enterprise mashups.
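The two interaction styles can be contrasted with the same hypothetical operation, "fetch blog post 42". The endpoint URLs and operation names below are invented for illustration; a real SOAP service would define its operations and namespaces in its WSDL.

```javascript
// Sketch contrasting REST and SOAP for one hypothetical operation.

// REST: the resource is addressed by URL; the HTTP method is the verb.
function restRequest(postId) {
  return { method: "GET", url: `http://example.org/api/posts/${postId}` };
}

// SOAP: a single endpoint; the operation travels inside an XML envelope
// posted to it (the envelope namespace shown is SOAP 1.1's).
function soapRequest(postId) {
  return {
    method: "POST",
    url: "http://example.org/soap",
    body:
      `<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">` +
      `<soap:Body><GetPost><id>${postId}</id></GetPost></soap:Body>` +
      `</soap:Envelope>`,
  };
}
```

In both cases the response payload is typically XML or JSON, which the caller parses and folds into its own pages, just as with the syndication formats above.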

Criticism

Critics of the term claim that "Web 2.0" does not represent a new version of the World Wide Web at all, but merely continues to use so-called "Web 1.0" technologies and concepts. First, techniques such as AJAX do not replace underlying protocols like HTTP, but add an additional layer of abstraction on top of them. Second, many of the ideas of Web 2.0 had already been featured in implementations on networked systems well before the term "Web 2.0" emerged. Amazon.com, for instance, has allowed users to write reviews and consumer guides since its launch in 1995, in a form of self-publishing. Amazon also opened its API to outside developers in 2002.[39] Previous developments also came from research in computer-supported collaborative learning and computer-supported cooperative work and from established products like Lotus Notes and Lotus Domino, all phenomena which precede Web 2.0. But perhaps the most common criticism is that the term is unclear or simply a buzzword. For example, in a podcast interview,[4] Tim Berners-Lee described the term "Web 2.0" as a "piece of jargon": "Nobody really knows what it means...If Web 2.0 for you is blogs and wikis, then that is people to people. But that was what the Web was supposed to be all along."[4] Other critics labeled Web 2.0 "a second bubble" (referring to the Dot-com bubble of circa 1995–2001), suggesting that too many Web 2.0 companies attempt to develop the same product with a lack of business models.
For example, The Economist has dubbed the mid- to late-2000s focus on Web companies "Bubble 2.0".[40] Venture capitalist Josh Kopelman noted that Web 2.0 had excited only 53,651 people (the number of subscribers at that time to TechCrunch, a Weblog covering Web 2.0 startups and technology news), too few users to make them an economically viable target for consumer applications.[41] Although Bruce Sterling reports he's a fan of Web 2.0, he thinks it is now dead as a rallying concept.[42] Critics have cited the language used to describe the hype cycle of Web 2.0[43] as an example of Techno-utopianist rhetoric.[44]


In terms of Web 2.0's social impact, critics such as Andrew Keen argue that Web 2.0 has created a cult of digital narcissism and amateurism, which undermines the notion of expertise by allowing anybody, anywhere to share – and place undue value upon – their own opinions about any subject and post any kind of content, regardless of their particular talents, knowledgeability, credentials, biases or possible hidden agendas. He states that the core assumption of Web 2.0, that all opinions and user-generated content are equally valuable and relevant, is misguided and is instead "creating an endless digital forest of mediocrity: uninformed political commentary, unseemly home videos, embarrassingly amateurish music, unreadable poems, essays and novels", also stating that Wikipedia is full of "mistakes, half truths and misunderstandings".[45]

Trademark

In November 2004, CMP Media applied to the USPTO for a service mark on the use of the term "WEB 2.0" for live events.[46] On the basis of this application, CMP Media sent a cease-and-desist demand to the Irish non-profit organization IT@Cork on May 24, 2006,[47] but retracted it two days later.[48] The "WEB 2.0" service mark registration passed final PTO Examining Attorney review on May 10, 2006, and was registered on June 27, 2006.[46] The European Union application (application number 004972212, which would confer unambiguous status in Ireland) was refused[49] on May 23, 2007.

See also

• Cloud computing
• Collective intelligence
• Consumer-generated media
• Enterprise social software
• Mashups
• New Media
• Office suite
• Open Mashup Alliance
• Open source governance
• Radical Trust
• Social commerce
• Social media
• Social shopping
• User-generated content
• Web 1.0
• Web 2.0 for development (web2fordev)
• You (Time Person of the Year)

Application Domains

• Business 2.0
• E-learning 2.0
• Government 2.0
• Health 2.0
• Library 2.0
• Porn 2.0


Resources

Books
Handbook of Research on Web 2.0, 3.0, and X.0: Technologies, Business, and Social Applications, San Murugesan (editor), Information Science Research, Hershey – New York, October 2009, ISBN 978-1-60566-384-5 (http://www.igi-global.com/reference/details.asp?id=34850)

Articles
Understanding Web 2.0, San Murugesan, IEEE IT Professional, 2007, http://www.computer.org/portal/web/buildyourcareer/fa009

PHP

Usual file extensions:  .php, .phtml, .php5, .phps
Paradigm:               imperative, object-oriented
Appeared in:            1995
Designed by:            Rasmus Lerdorf
Developer:              The PHP Group
Stable release:         5.2.13 (February 25, 2010) / 5.3.2 (March 4, 2010)
Typing discipline:      dynamic, weak[1]
Major implementations:  Zend Engine, Roadsend PHP, Phalanger, Quercus,[2] Project Zero, HipHop[1]
Influenced by:          C, Perl, Java, C++, Tcl
Influenced:             PHP4Delphi
Programming language:   C
OS:                     Cross-platform
License:                PHP License
Website:                http://www.php.net/
PHP: Hypertext Preprocessor is a widely used, general-purpose scripting language that was originally designed for web development to produce dynamic web pages. For this purpose, PHP code is embedded into the HTML source document and interpreted by a web server with a PHP processor module, which generates the web page document. As a general-purpose programming language, PHP can also be run from the command line, where the interpreter performs operating-system operations and writes program output to its standard output channel; it may even function as a graphical application. PHP is available as a processor for most modern web servers and as a standalone interpreter on most operating systems and computing platforms. PHP was originally created by Rasmus Lerdorf in 1995[1] and has been in continuous development ever since. The main implementation of PHP is now produced by The PHP Group and serves as the de facto standard for PHP, as there is no formal specification.[3] PHP is free software released under the PHP License.

History

[Pictured: Rasmus Lerdorf, who wrote the original Common Gateway Interface component, and Andi Gutmans and Zeev Suraski, who rewrote the parser that formed PHP 3.]


PHP originally stood for Personal Home Page.[3] It began in 1994 as a set of Common Gateway Interface (CGI) binaries written in the C programming language by the Danish/Greenlandic programmer Rasmus Lerdorf.[4] [5] Lerdorf initially created these Personal Home Page Tools to replace a small set of Perl scripts he had been using to maintain his personal homepage. The tools were used to perform tasks such as displaying his résumé and recording how much traffic his page was receiving.[3] He combined these binaries with his Form Interpreter to create PHP/FI, which had more functionality. PHP/FI included a larger C implementation and could communicate with databases, enabling the building of simple, dynamic web applications.

Lerdorf released PHP publicly on June 8, 1995, to accelerate bug location and improve the code.[6] This release was named PHP version 2 and already had the basic functionality that PHP has today, including Perl-like variables, form handling, and the ability to embed HTML. The syntax was similar to Perl's but was more limited, simpler, and less consistent.[3]

Zeev Suraski and Andi Gutmans, two Israeli developers at the Technion IIT, rewrote the parser in 1997, forming the base of PHP 3 and changing the language's name to the recursive initialism PHP: Hypertext Preprocessor.[3] The development team officially released PHP/FI 2 in November 1997 after months of beta testing. Public testing of PHP 3 then began, and the official launch came in June 1998. Suraski and Gutmans subsequently started a new rewrite of PHP's core, producing the Zend Engine in 1999.[7] They also founded Zend Technologies in Ramat Gan, Israel.[3]

On May 22, 2000, PHP 4, powered by the Zend Engine 1.0, was released.[3] As of August 2008 this branch is up to version 4.4.9.
PHP 4 is no longer under development, nor will any further security updates be released.[8] [9]

On July 13, 2004, PHP 5 was released, powered by the new Zend Engine II.[3] PHP 5 included new features such as improved support for object-oriented programming, the PHP Data Objects extension (which defines a lightweight and consistent interface for accessing databases), and numerous performance enhancements.[10] In 2008 PHP 5 became the only stable version under development. Late static binding, long missing from PHP, was added in version 5.3.[11] [12]

A new major version has been under development alongside PHP 5 for several years. Because of its significant changes, which included plans for full Unicode support, it was originally planned for release as PHP 6. However, Unicode support took the developers much longer to implement than originally thought, and in March 2010[13] the decision was made to move the project to a branch, with features still under development moved to a trunk. Changes in the new code include the removal of register_globals,[14] magic quotes, and safe mode.[8] [15] register_globals was removed because it had opened security holes, and magic quotes because of its unpredictable behavior, which made it best avoided. Instead, to escape characters, magic quotes may be substituted with the addslashes() function or, more appropriately, an escape mechanism specific to the database vendor itself, such as mysql_real_escape_string() for MySQL.
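As a rough illustration of the generic escaping route (addslashes() is a core function; a vendor-specific escaper such as mysql_real_escape_string() additionally requires an open database connection, so it is not exercised here):

```php
<?php
// addslashes() backslash-escapes single quotes, double quotes,
// backslashes and NUL bytes -- the same transformation that magic
// quotes used to apply implicitly to all incoming data.
$input = "O'Reilly";
echo addslashes($input), "\n"; // prints "O\'Reilly"
```

A database-specific escaper is preferred in practice because it knows the server's actual quoting rules.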
Functions that have been deprecated in PHP 5.3 and will be removed in future versions produce a warning if used.[16] Many high-profile open-source projects ceased to support PHP 4 in new code as of February 5, 2008, because of the GoPHP5 initiative,[17] provided by a consortium of PHP developers promoting the transition from PHP 4 to PHP 5.[18] [19]

PHP currently has no native support for Unicode or multibyte strings; Unicode support is under development for a future version of PHP and will allow strings, as well as class, method, and function names, to contain non-ASCII characters.[20] [21]

PHP interpreters are available on both 32-bit and 64-bit operating systems, but on Microsoft Windows the only official distribution is a 32-bit implementation, requiring Windows 32-bit compatibility mode when used with Internet Information Services (IIS) on a 64-bit Windows platform. As of PHP 5.3.0, experimental 64-bit versions are available for Windows.[22]



Licensing

PHP is free software released under the PHP License, which stipulates that:

• The name "PHP" must not be used to endorse or promote products derived from this software without prior written permission.[23]

This restriction on the use of the term PHP makes the license incompatible with the GNU General Public License (GPL).[24]

Release history

Legend: Red = release no longer supported; Green = release still supported; Blue = future release

Major | Version       | Release date | Notes
1     | 1.0.0         | 1995-06-08   | Officially called "Personal Home Page Tools (PHP Tools)". This is the first use of the name "PHP".[3]
2     | 2.0.0         | 1997-11-01   | Considered by its creator the "fastest and simplest tool" for creating dynamic web pages.[3]
3     | 3.0.0         | 1998-06-06   | Development moves from one person to multiple developers. Zeev Suraski and Andi Gutmans rewrite the base for this version.[3]
4     | 4.0.0         | 2000-05-22   | Added the more advanced two-stage parse/execute tag-parsing system called the Zend engine.[25]
      | 4.1.0         | 2001-12-10   | Introduced 'superglobals' ($_GET, $_POST, $_SESSION, etc.)[25]
      | 4.2.0         | 2002-04-22   | Disabled register_globals by default. Data received over the network is no longer inserted directly into the global namespace, closing possible security holes in applications.[25]
      | 4.3.0         | 2002-12-27   | Introduced the CLI, in addition to the CGI.[25] [26]
      | 4.4.0         | 2005-07-11   | Added man pages for phpize and php-config scripts.[25]
      | 4.4.8         | 2008-01-03   | Several security enhancements and bug fixes. Was to be the end-of-life release for PHP 4. Security updates only until 2008-08-08, if necessary.[27]
      | 4.4.9         | 2008-08-07   | More security enhancements and bug fixes. The last release of the PHP 4.4 series.[28] [29]
5     | 5.0.0         | 2004-07-13   | Zend Engine II with a new object model.[30]
      | 5.1.0         | 2005-11-24   | Performance improvements with introduction of compiler variables in the re-engineered PHP engine.[30]
      | 5.2.0         | 2006-11-02   | Enabled the filter extension by default. Native JSON support.[30]
      | 5.2.11        | 2009-09-16   | Bug and security fixes.
      | 5.2.12        | 2009-12-17   | Over 60 bug fixes, including 5 security fixes.
      | 5.2.13        | 2010-02-25   | Bug and security fixes.
      | 5.3.0         | 2009-06-30   | Namespace support; late static bindings; jump label (limited goto); native closures; native PHP archives (phar); garbage collection for circular references; improved Windows support; sqlite3; mysqlnd as a replacement for libmysql as the underlying library for the extensions that work with MySQL; fileinfo as a replacement for mime_magic for better MIME support; the internationalization extension; and deprecation of the ereg extension.
      | 5.3.1         | 2009-11-19   | Over 100 bug fixes,[31] some of which were security fixes as well.
      | 5.3.2         | 2010-03-04   | Includes a large number of bug fixes.
      | php-trunk-dev | No date set  | Unicode support; removal of 'register_globals', 'magic_quotes' and 'safe_mode'; Alternative PHP Cache.

Usage

PHP is a general-purpose scripting language that is especially suited to server-side web development, where PHP generally runs on a web server. Any PHP code in a requested file is executed by the PHP runtime, usually to create dynamic web page content. It can also be used for command-line scripting and client-side GUI applications. PHP can be deployed on most web servers, many operating systems and platforms, and can be used with many relational database management systems. It is available free of charge, and the PHP Group provides the complete source code for users to build, customize and extend for their own use.[32]

PHP primarily acts as a filter,[33] taking input from a file or stream containing text and/or PHP instructions and outputting another stream of data; most commonly the output is HTML. Since PHP 4, the PHP parser compiles input to bytecode for processing by the Zend Engine, giving improved performance over its interpreter predecessor.[34]

Originally designed to create dynamic web pages, PHP now focuses mainly on server-side scripting,[35] and it is similar to other server-side scripting languages that provide dynamic content from a web server to a client, such as Microsoft's Active Server Pages, Sun Microsystems' JavaServer Pages,[36] and mod_perl. PHP has also attracted the development of many frameworks that provide building blocks and a design structure to promote rapid application development (RAD). Some of these include CakePHP, Symfony, CodeIgniter, and Zend Framework, offering features similar to other web application frameworks.

The LAMP architecture has become popular in the web industry as a way of deploying web applications. PHP is commonly used as the P in this bundle alongside Linux, Apache and MySQL, although the P may also refer to Python or Perl or some combination of the three. As of April 2007, over 20 million Internet domains had web services hosted on servers with PHP installed, and mod_php was recorded as the most popular Apache HTTP Server module.[37] Significant websites written in PHP include the user-facing portion of Facebook,[38] Wikipedia (MediaWiki),[39] Yahoo!, MyYearbook, Digg, Joomla, eZ Publish, WordPress,[40] YouTube in its early stages, Drupal, Tagged and Moodle.[41]



Security

The National Vulnerability Database records vulnerabilities found in computer software. The proportion of PHP-related vulnerabilities in the database was 20% in 2004, 28% in 2005, 43% in 2006, 36% in 2007, 35% in 2008, and 30% in 2009.[42] Most of these PHP-related vulnerabilities can be exploited remotely: they allow crackers to steal or destroy data from data sources linked to the web server (such as an SQL database), send spam, or contribute to DoS attacks using malware, which itself can be installed on the vulnerable servers.

These vulnerabilities are caused mostly by failure to follow best-practice programming rules: technical security flaws in the language itself or in its core libraries are not frequent (23 in 2008, about 1% of the total).[43] [44] Recognizing that programmers make mistakes, some languages include taint checking to automatically detect the lack of input validation that induces many of these issues. Such a feature is being developed for PHP,[45] but its inclusion in a release has been rejected several times in the past.[46] [47]

Hosting PHP applications on a server requires careful and constant attention to deal with these security risks.[48] There are advanced protection patches, such as Suhosin and Hardening-Patch, designed especially for web hosting environments.[49]
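Many of these holes come down to unvalidated input. A minimal sketch of explicit validation using the core filter and string functions ($raw stands in for request data such as a $_GET value):

```php
<?php
// Validate untrusted input instead of passing it straight to SQL.
$raw = "42; DROP TABLE users";

// filter_var() returns false when validation fails.
$id = filter_var($raw, FILTER_VALIDATE_INT);
if ($id === false) {
    $id = 0; // reject or fall back rather than interpolate into a query
}
echo $id, "\n"; // prints "0"

// Escape on output so markup in data is not interpreted (XSS).
echo htmlspecialchars("<script>alert(1)</script>"), "\n";
```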

Syntax

PHP code embedded within HTML:

<html>
 <head>
  <title>PHP Test</title>
 </head>
 <body>
 <?php
 echo "Hello World";
 /* echo("Hello World"); works as well, although echo isn't a
    function (it's a language construct). In some cases, such as
    when multiple parameters are passed to echo, parameters cannot
    be enclosed in parentheses */
 ?>
 </body>
</html>

PHP only parses code within its delimiters. Anything outside its delimiters is sent directly to the output and is not processed by PHP (although non-PHP text is still subject to control structures described within PHP code). The most common delimiters are <?php to open and ?> to close PHP sections. <script language="php"> and </script> delimiters are also available, as are the shortened forms <? or <?= (which is used to echo back a string or variable) and ?>, as well as the ASP-style short forms <% or <%= and %>. Short delimiters make script files less portable, since support for them can be disabled in the PHP configuration,[50] and so they are discouraged.[51] The purpose of all these delimiters is to separate PHP code from non-PHP code, including HTML.[52]

The first form of delimiters, <?php and ?>, in XHTML and other XML documents, creates correctly formed XML 'processing instructions'.[53] This means that the resulting mixture of PHP code and other markup in the server-side file is itself well-formed XML.


Variables are prefixed with a dollar symbol, and a type does not need to be specified in advance. Unlike function and class names, variable names are case sensitive. Both double-quoted ("") and heredoc strings allow a variable's value to be embedded into the string.[54] PHP treats newlines as whitespace in the manner of a free-form language (except when inside string quotes), and statements are terminated by a semicolon.[55]

PHP has three types of comment syntax: /* */ marks block and inline comments, while // and # are used for one-line comments.[56] The echo statement is one of several facilities PHP provides to output text (e.g. to a web browser).

In terms of keywords and language syntax, PHP is similar to most high-level languages that follow the C-style syntax. If conditions, for and while loops, and function returns are similar in syntax to languages such as C, C++, Java and Perl.
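A short fragment illustrating these rules (a minimal sketch; the variable names are invented for illustration):

```php
<?php
// Variable names are case sensitive and carry no declared type.
$name = "World";

// Double-quoted strings interpolate variables; single quotes do not.
echo "Hello, $name\n";     // prints "Hello, World"
echo 'Hello, $name', "\n"; // prints the literal text "Hello, $name"

/* Heredoc strings interpolate as well. */
$greeting = <<<EOT
Heredoc also embeds $name.
EOT;
echo $greeting, "\n";

# '#' and '//' both begin one-line comments; each statement ends with ';'.
```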

Data types

PHP stores whole numbers in a platform-dependent range, typically that of 32-bit signed integers. Unsigned integers are converted to signed values in certain situations; this behavior differs from that of other programming languages.[57] Integer variables can be assigned using decimal (positive and negative), octal, and hexadecimal notations. Floating-point numbers are also stored in a platform-specific range. They can be specified using floating-point notation or two forms of scientific notation.[58] PHP has a native Boolean type, similar to the native Boolean types in Java and C++. Using the Boolean type conversion rules, non-zero values are interpreted as true and zero as false, as in Perl and C++.[58]

The null data type represents a variable that has no value; NULL is the only value in this type.[58] Variables of the "resource" type represent references to resources from external sources. These are typically created by functions from a particular extension and can only be processed by functions from the same extension; examples include file, image, and database resources.[58] Arrays can contain elements of any type that PHP can handle, including resources, objects, and even other arrays. Order is preserved in lists of values and in hashes with both keys and values, and the two can be intermingled.[58] PHP also supports strings, which can be used with single quotes, double quotes, or heredoc syntax.[59]

The Standard PHP Library (SPL) attempts to solve standard problems and implements efficient data access interfaces and classes.[60]
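A sketch of these types (the values are illustrative):

```php
<?php
// Integers may be written in decimal, octal, or hexadecimal.
$dec = -26;
$hex = 0x1A;     // 26
$oct = 032;      // 26

// Floats accept floating-point or scientific notation.
$f = 1.5e3;      // 1500.0

// Non-zero values convert to the Boolean true, zero to false.
$t = (bool) 3;   // true
$n = NULL;       // the null type's only value

// A single array type acts as both list and hash, and keys can mix.
$a = array("zero", "name" => "PHP", 10 => 2010);
echo $a["name"]; // prints "PHP"
```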

Functions

PHP has hundreds of base functions and thousands more available via extensions. These functions are well documented on the PHP site; however, the built-in library has a wide variety of naming conventions and inconsistencies. PHP currently has no functions for thread programming, although it does support multiprocess programming on POSIX systems.[61]

5.2 and earlier

Functions are not first-class and can only be referenced by name, either directly or dynamically through a variable containing the name of the function.[62] User-defined functions can be created at any time without being prototyped.[62] Functions can be defined inside code blocks, permitting a run-time decision as to whether or not a function should be defined. Function calls must use parentheses, with the exception of zero-argument class constructor functions called with the PHP new operator, where parentheses are optional. PHP supports quasi-anonymous functions through the create_function() function; these are not true anonymous functions, because anonymous functions are nameless, whereas in PHP functions can only be referenced by name or indirectly through a variable, as in $function_name();.[62]
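The indirect call through a variable can be sketched as follows (the function square() is hypothetical):

```php
<?php
// A user-defined function needs no prototype and may even be
// defined conditionally at run time.
if (!function_exists('square')) {
    function square($n) { return $n * $n; }
}

// Indirect call through a variable containing the function's name.
$function_name = 'square';
echo $function_name(4); // prints "16"
```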


5.3 and newer

PHP gained support for closures. True anonymous functions are supported, using the following syntax:

function getAdder($x)
{
    return function ($y) use ($x) {
        return $x + $y;
    };
}

$adder = getAdder(8);
echo $adder(2); // prints "10"

Here, the getAdder() function creates a closure over the parameter $x (the keyword "use" imports the variable from the enclosing context), which takes an additional argument $y and returns the sum to the caller. Such a function can be stored, passed as a parameter to other functions, and so on. For more details see the Lambda functions and closures RFC [63].

The goto flow control device was made available in PHP 5.3 and is used as follows:

function lock()
{
    $file = fopen("file.txt", "r+");
retry:
    if (flock($file, LOCK_EX)) {
        fwrite($file, "Success!");
        fclose($file);
        return 0;
    } else {
        goto retry;
    }
}

When lock() is called, PHP opens a file and tries to lock it. The target label retry: defines the point to which execution returns if flock() is unsuccessful and goto retry; is executed. goto is not unrestricted: the target label must be in the same file and context.

Objects

Basic object-oriented programming functionality was added in PHP 3 and improved in PHP 4.[3] Object handling was completely rewritten for PHP 5, expanding the feature set and enhancing performance.[64] In previous versions of PHP, objects were handled like value types;[64] the drawback of this method was that the whole object was copied when a variable was assigned or passed as a parameter to a method. In the new approach, objects are referenced by handle and not by value.

PHP 5 introduced private and protected member variables and methods, along with abstract classes, final classes, abstract methods, and final methods. It also introduced a standard way of declaring constructors and destructors, similar to that of other object-oriented languages such as C++, and a standard exception-handling model. Furthermore, PHP 5 added interfaces and allowed multiple interfaces to be implemented. There are special interfaces that allow objects to interact with the runtime system: objects implementing ArrayAccess can be used with array syntax, and objects implementing Iterator or


IteratorAggregate can be used with the foreach language construct. There is no virtual-table feature in the engine, so static variables are bound by name instead of by reference at compile time.[65]

If the developer creates a copy of an object using the reserved word clone, the Zend engine checks whether a __clone() method has been defined. If not, it calls a default __clone() which copies the object's properties. If a __clone() method is defined, it is responsible for setting the necessary properties in the created object. For convenience, the engine supplies a function that imports the properties of the source object, so that the programmer can start with a by-value replica of the source object and only override properties that need to be changed.[66]

A basic example of object-oriented programming as described above:

class Person
{
    public $first;
    public $last;

    public function __construct($f, $l)
    {
        $this->first = $f;
        $this->last  = $l;
    }

    public function greeting()
    {
        return 'Hello, my name is ' . $this->first . ' ' . $this->last . '.';
    }
}

$him = new Person('John', 'Smith');
$her = new Person('Sally', 'Davis');

echo $him->greeting(); // prints "Hello, my name is John Smith."
echo '<br>';
echo $her->greeting(); // prints "Hello, my name is Sally Davis."
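The clone behavior described above can be sketched as follows (the Counter class is invented for illustration):

```php
<?php
class Counter
{
    public $label = "counter";
    public $ticks = 5;

    // Invoked on the replica after the engine's by-value copy;
    // it overrides one property and leaves the rest as copied.
    public function __clone()
    {
        $this->ticks = 0;
    }
}

$a = new Counter();
$a->label = "original";
$a->ticks = 9;

$b = clone $a;  // properties copied by value, then __clone() runs
echo $b->label; // prints "original" (kept from the copy)
echo $b->ticks; // prints "0" (reset by __clone)
```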



Speed optimization

PHP source code is compiled on the fly to an internal format that can be executed by the PHP engine.[67] [68] To speed up execution and avoid compiling the PHP source code every time the web page is accessed, PHP scripts can also be deployed in executable format using a PHP compiler. Code optimizers aim to reduce the runtime of the compiled code by reducing its size and making other changes that shorten execution time. The nature of the PHP compiler is such that there are often opportunities for code optimization,[69] and an example of a code optimizer is the eAccelerator PHP extension.[70]

Another approach to reducing overhead on high-load PHP servers is an opcode cache. Opcode caches work by caching the compiled form of a PHP script (opcodes) in shared memory to avoid the overhead of parsing and compiling the code every time the script runs. An opcode cache, APC, will be built into an upcoming release of PHP.[71] Opcode caching is also available in Zend Server Community Edition.

Compilers

The PHP language was originally implemented using a PHP interpreter. Several compilers now exist, which decouple the PHP language from the interpreter:

• phc [72] - a C++ based compiler for PHP, using the Zend run-time for maximum compatibility
• Roadsend - achieves native compilation by compiling to Bigloo Scheme, which in turn is compiled to C, then to machine code
• Raven [73] - a rewrite of Roadsend PHP (rphp), based on LLVM and a new C++ runtime
• Phalanger - compiles source code written in the PHP scripting language into CIL byte-code
• Caucho Resin/Quercus [74] - compiles PHP to Java bytecode
• HipHop - developed at Facebook and now available as open source; transforms PHP scripts into C++, then compiles them

Advantages of compilation include not only better execution speed, but also obfuscation, static analysis, and improved interoperability with code written in other languages.[75]

Resources

PHP includes free and open-source libraries with the core build. PHP is a fundamentally Internet-aware system with built-in modules for accessing FTP servers, many database servers, embedded SQL libraries such as embedded PostgreSQL, MySQL and SQLite, LDAP servers, and others. Many functions familiar to C programmers, such as those in the stdio family, are available in the standard PHP build.[76]

PHP allows developers to write extensions in C to add functionality to the PHP language. These can be compiled into PHP or loaded dynamically at runtime. Extensions have been written to add support for the Windows API, process management on Unix-like operating systems, multibyte strings (Unicode), cURL, and several popular compression formats. Some more unusual features include integration with Internet Relay Chat, dynamic generation of images and Adobe Flash content, and even speech synthesis. The PHP Extension Community Library (PECL) project is a repository for extensions to the PHP language.[77]

Zend provides a certification exam for programmers to become certified PHP developers.



See also

• Active Server Pages
• Comparison of programming languages
• Comparison of web application frameworks
• EasyPHP (the first web server for PHP)
• LAMP (software bundle)
• List of PHP editors
• PHP accelerator
• Template processor
• XAMPP (web server for PHP)
• Zend Certified Engineer
• Zend Framework
• Zend Server Community Edition

External links

• The PHP Group [78]
• PHP [79] at the Open Directory Project
• PHP Reference Manual [80]
• PHP CLI (Command Line Interface) web site [81]

References [1] Rasmus Lerdorf began assembling C code originally written for CGI scripts into a library and accessing the library's functions, including SQL queries, through HTML-embedded commands in 1994; by 1995 the commands had taken the shape of PHP code that would be familiar of users of the language today. Lerdorf, Rasmus (2007-04-26). "PHP on Hormones - history of PHP presentation by Rasmus Lerdorf given at the MySQL Conference in Santa Clara, California" (http:/ / itc. conversationsnetwork. org/ shows/ detail3298. html#) (mp3). The Conversations Network. . Retrieved 2009-12-11. "Every day I would change the language drastically, and it didn't take very long, so by 1995, mid-1995 or so, PHP looked like this (http:/ / talks. php. net/ show/ mysql07key/ 4). This isn't that far from what PHP looks like today, actually." [2] http:/ / quercus. caucho. com/ [3] "History of PHP and related projects" (http:/ / www. php. net/ history). The PHP Group. . Retrieved 2008-02-25. [4] Lerdorf, Rasmus (2007-04-26). "PHP on Hormones" (http:/ / itc. conversationsnetwork. org/ shows/ detail3298. html) (mp3). The Conversations Network. . Retrieved 2009-06-22. [5] Lerdorf, Rasmus (2007). "Slide 3" (http:/ / talks. php. net/ show/ mysql07key/ 3). slides for 'PHP on Hormones' talk. The PHP Group. . Retrieved 2009-06-22. [6] Lerdorf, Rasmus (1995-06-08). "Announce: Personal Home Page Tools (PHP Tools)". [news:comp.infosystems.www.authoring.cgi comp.infosystems.www.authoring.cgi]. (Web link) (http:/ / groups. google. com/ group/ comp. infosystems. www. authoring. cgi/ msg/ cc7d43454d64d133). Retrieved on 2006-09-17. [7] "[[Zend Engine (http:/ / www. zend. com/ zend/ zend-engine-summary. php)] version 2.0: Feature Overview and Design"]. Zend Technologies Ltd.. . Retrieved 2006-09-17. [8] "php.net 2007 news archive" (http:/ / www. php. net/ archive/ 2007. php). The PHP Group. 2007-07-13. . Retrieved 2008-02-22. [9] Kerner, Sean Michael (2008-02-01). 
"PHP 4 is Dead—Long Live PHP 5" (http:/ / www. internetnews. com/ dev-news/ article. php/ 3725291). InternetNews. . Retrieved 2008-03-16. [10] Trachtenberg, Adam (2004-07-15). "Why PHP 5 Rocks!" (http:/ / www. onlamp. com/ pub/ a/ php/ 2004/ 07/ 15/ UpgradePHP5. html). O'Reilly. . Retrieved 2008-02-22. [11] "Late Static Binding in PHP" (http:/ / www. digitalsandwich. com/ archives/ 53-Late-Static-Binding-in-PHP. html). Digital Sandwich. 2006-02-23. . Retrieved 2008-03-25. [12] "Static Keyword" (http:/ / www. php. net/ language. oop5. static). The PHP Group. . Retrieved 2008-03-25. [13] "PHP 6" (http:/ / news. php. net/ php. internals/ 47120). The PHP project. . Retrieved 2010-03-27. [14] "Using Register Globals" (http:/ / www. php. net/ register_globals). PHP. . Retrieved 2008-04-04. [15] "Prepare for PHP 6" (http:/ / www. corephp. co. uk/ archives/ 19-Prepare-for-PHP-6. html). CorePHP. 2005-11-23. . Retrieved 2008-03-24. [16] "PHP 5.3 migration guide" (http:/ / www. php. net/ migration53). The PHP project. . Retrieved 2009-07-03. [17] "GoPHP5" (http:/ / www. gophp5. org/ projects). .



238 [18] GoPHP5. "PHP projects join forces to Go PHP 5" (http:/ / gophp5. org/ sites/ gophp5. org/ files/ press_release. pdf) (PDF). GoPHP5 Press Release. . Retrieved 2008-02-23. [19] "GoPHP5" (http:/ / gophp5. org/ ). GoPHP5. . Retrieved 2008-02-22. [20] "Unicode" (http:/ / www. php. net/ ~derick/ meeting-notes. html#unicode). The PHP Group. . Retrieved 2008-03-25. [21] Byfield, Bruce (February 28, 2007). "Upcoming PHP release will offer Unicode support" (http:/ / www. linux. com/ archive/ feature/ 60386). linux.com. . Retrieved 2009-06-23. [22] The PHP Group. "PHP For Windows snapshots" (http:/ / windows. php. net/ snapshots/ ). PHP Windows Development Team. . Retrieved 2009-05-25. [23] The PHP License, version 3.01 (http:/ / www. php. net/ license/ 3_01. txt) [24] "GPL-Incompatible, Free Software Licenses" (http:/ / www. fsf. org/ licensing/ education/ licenses/ index_html/ #GPLIncompatibleLicenses). Various Licenses and Comments about Them. Free Software Foundation. . Retrieved 2008-02-22. [25] "PHP: PHP 4 ChangeLog" (http:/ / www. php. net/ ChangeLog-4. php). The PHP Group. 2008-01-03. . Retrieved 2008-02-22. [26] "PHP: Using PHP from the command line - Manual:" (http:/ / us3. php. net/ manual/ en/ features. commandline. php). The PHP Group. . Retrieved 2009-09-11. [27] "4.4.8 Release Announcement" (http:/ / www. php. net/ releases/ 4_4_8. php). PHP. 2008-08-08. . Retrieved 2009-07-29. [28] "Downloads" (http:/ / www. php. net/ downloads. php#v4). PHP. . Retrieved 2009-07-29. [29] "4.4.9 Release Announcement" (http:/ / www. php. net/ releases/ 4_4_9. php). PHP. . Retrieved 2009-07-29. [30] "PHP: PHP 5 ChangeLog" (http:/ / www. php. net/ ChangeLog-5. php). The PHP Group. 2007-11-08. . Retrieved 2008-02-22. [31] http:/ / www. php. net/ ChangeLog-5. php#5. 3. 1 [32] "Embedding PHP in HTML" (http:/ / www. onlamp. com/ pub/ a/ php/ 2001/ 05/ 03/ php_foundations. html). O'Reilly. 2001-05-03. . Retrieved 2008-02-25. [33] (http:/ / web. archive. 
org/ web/ 20080611231433/ http:/ / web. archive. org/ web/ 20080611231433/ http:/ / gtk. php. net/ manual1/ it/ html/ intro. whatis. php. whatdoes. html) at the Wayback Machine [34] "PHP and MySQL" (http:/ / cs. ua. edu/ 457/ Notes/ PHP and MySQL. ppt). University of Alabama. . Retrieved 2008-02-25. [35] "PHP Server-Side Scripting Language" (http:/ / webmaster. iu. edu/ PHPlanguage/ index. shtml). Indiana University. 2007-04-04. . Retrieved 2008-02-25. [36] "JavaServer Pages Technology — JavaServer Pages Comparing Methods for Server-Side Dynamic Content White Paper" (http:/ / java. sun. com/ products/ jsp/ jspservlet. html). Sun Microsystems. . Retrieved 2008-02-25. [37] "PHP: PHP Usage Stats" (http:/ / www. php. net/ usage. php). SecuritySpace. 2007-04-01. . Retrieved 2008-02-24. [38] "PHP and Facebook | Facebook" (http:/ / blog. facebook. com/ blog. php?post=2356432130). Blog.facebook.com. . Retrieved 2009-07-29. [39] "Manual:Installation requirements#PHP" (http:/ / www. mediawiki. org/ w/ index. php?title=Manual:Installation_requirements& oldid=299556#PHP). MediaWiki. 2010-01-25. . Retrieved 2010-02-26. "PHP is the programming language in which MediaWiki is written [...]" [40] "About WordPress" (http:/ / wordpress. org/ about/ ). . Retrieved 2010-02-26. "WordPress was [...] built on PHP" [41] "Moodle - About" (http:/ / docs. moodle. org/ en/ About_Moodle). Moodle.org. . Retrieved 2009-12-20. [42] "PHP-related vulnerabilities on the National Vulnerability Database" (http:/ / www. coelho. net/ php_cve. html). 2008-03-01. . [43] "Security and... Driving? (and Hiring) - Sean Coates: PHP, Web (+Beer)" (http:/ / seancoates. com/ security-and-driving-and-hiring). Sean Coates. . Retrieved 2009-07-29. [44] Computerworlduk.com (http:/ / www. computerworlduk. com/ toolbox/ open-source/ blogs/ index. cfm?entryid=533& blogid=14), Interview: Ivo Jansch, February 26, 2008 [45] "PHP Taint Mode RFC" (http:/ / wiki. php. net/ rfc/ taint). . [46] "Developer Meeting Notes, Nov. 
2005" (http:/ / www. php. net/ ~derick/ meeting-notes. html#sand-boxing-or-taint-mode). . [47] "Taint mode decision, Nov 2007" (http:/ / devzone. zend. com/ article/ 2798-Zend-Weekly-Summaries-Issue-368#Heading1). . [48] "The Power of PHP, both Good and Evil" (http:/ / www. cwihosting. com/ php_security. php). 2009-02-28. . [49] "Hardened-PHP Project" (http:/ / www. hardened-php. net). 2008-08-15. . [50] http:/ / wiki. php. net/ rfc/ shortags [51] "PHP: Basic syntax" (http:/ / www. php. net/ manual/ en/ language. basic-syntax. php). The PHP Group. . Retrieved 2008-02-22. [52] "Your first PHP-enabled page" (http:/ / www. php. net/ manual/ en/ tutorial. firstpage. php). The PHP Group. . Retrieved 2008-02-25. [53] Bray, Tim; et al (26 November 2008). "Processing Instructions" (http:/ / www. w3. org/ TR/ 2008/ REC-xml-20081126/ #sec-pi). Extensible Markup Language (XML) 1.0 (Fifth Edition). W3C. . Retrieved 2009-06-18. [54] "Variables" (http:/ / www. php. net/ manual/ en/ language. variables. php). The PHP Group. . Retrieved 2008-03-16. [55] "Instruction separation" (http:/ / www. php. net/ basic-syntax. instruction-separation). The PHP Group. . Retrieved 2008-03-16. [56] "Comments" (http:/ / www. php. net/ manual/ en/ language. basic-syntax. comments. php). The PHP Group. . Retrieved 2008-03-16. [57] "Integers in PHP, running with scissors, and portability" (http:/ / www. mysqlperformanceblog. com/ 2007/ 03/ 27/ integers-in-php-running-with-scissors-and-portability/ ). MySQL Performance Blog. March 27, 2007. . Retrieved 2007-03-28. [58] "Types" (http:/ / www. php. net/ manual/ en/ language. types. php). The PHP Group. . Retrieved 2008-03-16. [59] "Strings" (http:/ / www. php. net/ manual/ en/ language. types. string. php). The PHP Group. . Retrieved 2008-03-21. [60] "SPL — StandardPHPLibrary" (http:/ / www. php. net/ spl). PHP.net. March 16, 2009. . Retrieved 2009-03-16. [61] "PHP.NET: Process Control" (http:/ / nz. php. net/ manual/ en/ book. pcntl. php). . 
Retrieved 2009-08-06.


PHP

239 [62] "Functions" (http:/ / www. php. net/ manual/ en/ language. functions. php). The PHP Group. . Retrieved 2008-03-16. [63] http:/ / wiki. php. net/ rfc/ closures [64] "PHP 5 Object References" (http:/ / mjtsai. com/ blog/ 2004/ 07/ 15/ php-5-object-references/ ). mjtsai. . Retrieved 2008-03-16. [65] "Classes and Objects (PHP 5)" (http:/ / www. php. net/ zend-engine-2. php). The PHP Group. . Retrieved 2008-03-16. [66] "Object cloning" (http:/ / www. php. net/ language. oop5. cloning). The PHP Group. . Retrieved 2008-03-16. [67] "How do computer languages work?" (http:/ / www. linux-tutorial. info/ modules. php?name=Howto& pagename=Unix-and-Internet-Fundamentals-HOWTO/ languages. html). . Retrieved 2009-11-04. [68] (Gilmore 2006, p. 43) [69] "PHP Accelerator 1.2 (page 3, Code Optimisation)" (http:/ / www. php-accelerator. co. uk/ PHPA_Article. pdf) (PDF). Nick Lindridge. . Retrieved 2008-03-28. [70] "eAccelerator" (http:/ / eaccelerator. net/ ). . Retrieved 2009-09-18. [71] "Upcoming PHP6 Additions & Changes" (http:/ / davidwalsh. name/ php6). . Retrieved 2009-09-18. [72] http:/ / www. phpcompiler. org/ [73] http:/ / code. roadsend. com/ rphp/ [74] http:/ / www. theserverside. com/ news/ thread. tss?thread_id=38144 [75] Owlient.eu (http:/ / technow. owlient. eu/ index. php?post/ 2010/ 02/ 20/ php-compilers) [76] "PHP Function List" (http:/ / www. php. net/ quickref. php). The PHP Group. . Retrieved 2008-02-25. [77] "Developing Custom PHP Extensions" (http:/ / www. devnewz. com/ 090902b. html). devnewz. 2002-09-09. . Retrieved 2008-02-25. [78] http:/ / www. php. net/ [79] http:/ / www. dmoz. org/ Computers/ Programming/ Languages/ PHP/ / [80] http:/ / www. php. net/ manual [81] http:/ / www. php-cli. com/


Active Server Pages


Developer(s): Microsoft
Stable release: 3.0 (no further versions planned)
Type: Web application framework
License: Proprietary

Active Server Pages (ASP), also known as Classic ASP or ASP Classic, was Microsoft's first server-side script engine for dynamically generated web pages. Initially released as an add-on to Internet Information Services (IIS) via the Windows NT 4.0 Option Pack, it was subsequently included as a free component of Windows Server (since the initial release of Windows 2000 Server). It has now been superseded by ASP.NET.

Developing functionality in ASP websites is enabled by the Active Scripting engine's support of the Component Object Model (COM), with each object providing a related group of frequently used functions and data attributes. ASP 2.0 provided six built-in objects: Application, ASPError, Request, Response, Server, and Session. Session, for example, is a cookie-based session object that maintains the state of variables from page to page. Functionality is further extended by objects which, when instantiated, provide access to the environment of the web server; for example, FileSystemObject (FSO) is used to create, read, update and delete files.

Web pages with the .asp file extension use ASP, although some web sites disguise their choice of scripting language for security purposes (e.g. by still using the more common .htm or .html extension). Pages with the .aspx extension use ASP.NET (based on Microsoft's .NET Framework) and are compiled, which makes them faster and more robust than server-side scripting in ASP, which is interpreted at run time; however, many ASP.NET pages still include some ASP scripting. Such marked differences between ASP and ASP.NET have led to the terms Classic ASP or ASP Classic being used, which also implies some nostalgia for the simpler platform.

Most ASP pages are written in VBScript, but any other Active Scripting engine can be selected instead by using the @Language directive or the <script language="language" runat="server"> syntax. JScript (Microsoft's implementation of ECMAScript) is the other language that is usually available. PerlScript (a derivative of Perl) and others are available as third-party installable Active Scripting engines.
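As a minimal sketch of how the built-in Session object and an instantiated FileSystemObject might combine in a single page (the visits.log file name and the "visits" session variable are hypothetical, chosen only for illustration):

```vbscript
<%
' Track per-user state with the built-in, cookie-based Session object.
If Session("visits") = "" Then
    Session("visits") = 1
Else
    Session("visits") = Session("visits") + 1
End If

' Instantiate FileSystemObject to touch the server's file system:
' append one line to a log file alongside the page.
Dim fso, logFile
Set fso = Server.CreateObject("Scripting.FileSystemObject")
' 8 = ForAppending; True = create the file if it does not exist
Set logFile = fso.OpenTextFile(Server.MapPath("visits.log"), 8, True)
logFile.WriteLine "Visit #" & Session("visits")
logFile.Close

Response.Write "You have viewed this page " & Session("visits") & " time(s)."
%>
```

Session state persists across requests from the same browser, while FSO, like other COM components, exists only for the lifetime of the request unless explicitly stored.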

History

Based on the dbWeb and iBasic tools created by Aspect Software Engineering, ASP was one of the first web application development environments to integrate web application execution directly into the web server, released nine months after NeXT's (now Apple's) WebObjects. This was done to achieve high performance compared with calling external executable programs or CGI scripts, which were the most popular methods for writing web applications at the time. Prior to Microsoft's release of ASP for IIS 3, web programmers working in IIS relied on IDC and HTX files combined with ODBC drivers to display and manipulate dynamic data and pages. The basics of these file formats and structures were used, at least in part, in the implementation of the early versions of ASP.

Halcyon InstantASP (iASP) and Chili!Soft ASP are third-party products that run ASP on platforms other than the Microsoft Windows operating systems. Neither alternative fully emulates every feature of real ASP, and each may require additional components for tasks with which traditional ASP has no issues, such as database connectivity; MS Access database support is a particular issue on non-Windows systems. iASP is able to use both the VBScript and JScript languages, unlike Chili!Soft ASP, which uses JScript; Microsoft's ASP can use both, and other languages can potentially make use of the scripting engine. iASP was written in Java, and as such will run on almost any operating system; it appears to be no longer available, or at least hard to find. Other languages available include Perl and Tcl, although they are not as widely known or used for ASP scripting. There is an Apache web server module that runs an ASP-like Perl scripting language.[1]

Chili!Soft, initially released in 1997, was acquired by Cobalt Networks on May 24, 2000; Cobalt Networks was in turn purchased by Sun Microsystems on December 7, 2000. Chili!Soft was renamed "Sun ONE Active Server Pages", then later "Sun Java System Active Server Pages". Chili!Soft ASP was written in C/C++ and is tied rather tightly to specific web server versions. According to Sun, "Sun Java System Active Server Pages has entered its End Of Life".[2]

Versions

ASP has gone through three major releases:
• ASP version 1.0 (distributed with IIS 3.0) in December 1996
• ASP version 2.0 (distributed with IIS 4.0) in September 1997
• ASP version 3.0 (distributed with IIS 5.0) in November 2000

ASP 3.0 is currently available in IIS 6.0 on Windows Server 2003 and IIS 7.0 on Windows Server 2008. ASP.NET is often mistaken for the newest release of ASP, but the technologies are very different: ASP.NET relies on the .NET Framework and is compiled, whereas ASP is strictly an interpreted scripting language.

The move from ASP 2.0 to ASP 3.0 was a relatively modest one. Among the most important additions were the Server.Execute method [3] and the ASPError object [4].[5] Microsoft's What's New in IIS 5.0 [6] lists some additional changes. There are solutions for running "Classic ASP" sites as standalone applications, such as ASPexplore, a software package that runs Microsoft Active Server Pages offline.
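To illustrate the ASP 3.0 addition mentioned above, Server.Execute runs another ASP page and then returns control to the caller (the header.asp file name here is hypothetical):

```vbscript
<%
' New in ASP 3.0: execute another page in-place, then continue.
' Output from header.asp is inserted at this point in the response.
Server.Execute "header.asp"
Response.Write "Back in the calling page after header.asp completed."
%>
```

This differs from the older Response.Redirect, which ends the current response and makes the browser issue a new request.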

Sample usage

Any scripting language compatible with Microsoft's Active Scripting standard may be used in ASP. The default scripting language (in classic ASP) is VBScript:

<html>
<body>
<% Response.Write "Hello World!" %>
</body>
</html>

Or, in a simpler format:

<html>
<body>
<%= "Hello World!" %>
</body>
</html>

The examples above print "Hello World!" into the body of an HTML document. Here is an example of how to connect to an Access database:



<%
Set oConn = Server.CreateObject("ADODB.Connection")
oConn.Open "DRIVER={Microsoft Access Driver (*.mdb)}; DBQ=" & Server.MapPath("DB.mdb")
Set rsUsers = Server.CreateObject("ADODB.Recordset")
' 1 = adOpenKeyset cursor, 3 = adLockOptimistic locking
rsUsers.Open "SELECT UserID FROM Users", oConn, 1, 3
%>

See also

• Template processor
• ASP.NET

External links

• ASP on MSDN [7]

References

[1] "Apache::ASP" (http://www.apache-asp.org/). Chamas Enterprises Inc. Retrieved 2009-01-08.
[2] "Sun Java System Active Server Pages" (http://www.sun.com/software/chilisoft/). Sun Microsystems. Retrieved 2008-12-31.
[3] http://msdn.microsoft.com/library/default.asp?url=/library/en-us/iissdk/html/db562da1-d49d-4fe5-9747-64ef530de23f.asp
[4] http://msdn.microsoft.com/library/default.asp?url=/library/en-us/iissdk/html/541697df-5fb9-40a4-8fa1-380b4717cbf1.asp
[5] 4 Guys From Rolla's "A Look at ASP 3.0" (http://www.4guysfromrolla.com/webtech/010700-1.shtml)
[6] http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/iisbook/c01_programmability.mspx?mfr=true
[7] http://msdn.microsoft.com/en-us/library/aa286483.aspx



Article Sources and Contributors

Article Sources and Contributors Programming language  Source: http://en.wikipedia.org/w/index.php?oldid=358748749  Contributors: -Barry-, 10.7, 151.203.224.xxx, 16@r, 198.97.55.xxx, 199.196.144.xxx, 203.37.81.xxx, 212.188.19.xxx, 2988, 96.186, AJim, Abednigo, Abeliavsky, Abram.carolan, Acacix, Acaciz, AccurateOne, Addicted2Sanity, Ahoerstemeier, Ahy1, Akadruid, Alansohn, Alex, AlexPlank, Alhoori, Alksub, Allan McInnes, Alliswellthen, Altenmann, Amire80, Ancheta Wis, AndonicO, Andre Engels, Andres, Andylmurphy, Angela, Angusmclellan, Antonielly, Ap, Apwestern, ArmadilloFromHell, AstroNomer, Autocratique, Avono, B4hand, Behnam, Beland, Ben Standeven, Benjaminct, Bevo, Bh3u4m, BigSmoke, Bill122, BirgitteSB, Blaisorblade, Blanchardb, Bobblewik, Bobo192, Bonaovox, Booyabazooka, Borislav, Brandon, Brentdax, Brianjd, Brick Thrower, Brion VIBBER, Bubba73, Burkedavis, CSProfBill, Calltech, Can't sleep, clown will eat me, CanisRufus, Capricorn42, CarlHewitt, Catgut, Centrx, Charlesriver, Charlie Huggard, Chillum, Chinabuffalo, Chun-hian, Cireshoe, Ckatz, Closedmouth, Cmichael, Cobaltbluetony, ColdFeet, Conor123777, Conversion script, Cp15, Cybercobra, DMacks, DVD R W, Damieng, Dan128, Danakil, Dave Bell, David.Monniaux, DavidHOzAu, Davidfstr, Davidpdx, Dcoetzee, DeadEyeArrow, DenisMoskowitz, DennisDaniels, DerHexer, Derek Ross, Derek farn, Diego Moya, Dolfrog, Dominator09, Don't Copy That Floppy, Donhalcon, Doradus, DouglasGreen, Dtaylor1984, Duke Ganote, Dysprosia, EJF, ESkog, EagleOne, Edward301, Eivind F Øyangen, ElAmericano, Elembis, EncMstr, EngineerScotty, Epbr123, Esap, Evercat, Everyking, Ewlyahoocom, Ezrakilty, Fantom, Faradayplank, Fayt82, Fieldday-sunday, Finlay McWalter, Fl, Foobah, Four Dog Night, Fplay, Fredrik, Friedo, Fubar Obfusco, Funandtrvl, FvdP, Gaius Cornelius, Galoubet, Gazpacho, Giftlite, Giorgios, Gioto, Goodgerster, Gploc, Green caterpillar, GregAsche, Grin, Gurch, Gutza, Hadal, Hairy Dude, Hammer1980, Hans Adler, HarisM, Harmil, Hayabusa future, 
Headbomb, HeikoEvermann, HenryLi, Hfastedge, HopeChrist, Hoziron, Hut 8.5, Hyad, INkubusse, IanOsgood, Icey, Ideogram, Ilario, Imran, Indon, Infinoid, Iwantitalllllllll, Ixfd64, J.delanoy, JMSwtlk, JPINFV, JaK81600, Jason5ayers, Jaxad0127, Jaxl, Jeffrey Mall, Jeltz, Jeronimo, Jerryobject, Jitse Niesen, Jj137, Johann Wolfgang, John254, JohnLai, JohnWittle, Jonik, Jorend, Jossi, Joyous!, Jpbowen, Jpk, JulesH, Juliancolton, Jusjih, Jwissick, K.lee, K12308025, KHaskell, KSmrq, KTC, Karingo, Kbdank71, Kbh3rd, Kedearian, Ketiltrout, Kickstart70, Kiminatheguardian, Kinema, Klasbricks, KnowledgeOfSelf, Knyf, Koyaanis Qatsi, Krauss, Krawi, Kris Schnee, Krischik, Kuciwalker, Kungfuadam, Kwertii, KymFarnik, L Gottschalk, L33tminion, LC, Lagalag, Leibniz, Liao, Lightmouse, Ligulem, LindsayH, LinguistAtLarge, LordCo Centre, Lradrama, Lulu of the Lotus-Eaters, Luna Santin, Lupo, MER-C, MK8, Mac c, Macaldo, Macrakis, Magnus Manske, Mahanga, Malcolm Farmer, Malleus Fatuorum, Mangojuice, Manpreett, Marcoscramer, Mark Renier, MartinHarper, MartyMcGowan, Marudubshinki, Matthew Woodcraft, Mattisse, Mav, Maxis ftw, McSly, Mccready, MearsMan, MegaHasher, Mellum, Mendaliv, Merbabu, Merphant, Michael Hardy, Michael Zimmermann, Mike Rosoft, Minesweeper, MisterCharlie, Miym, Mkdw, Monz, Mpils, Mpradeep, Mrjeff, Ms2ger, Mschel, Murray Langton, Mwaisberg, Mxn, N5iln, Naderi 8189, Nameneko, Nanshu, Napi, Natalie Erin, Natkeeran, NawlinWiki, Necklace, NewbieDoo, Nick125, Nikai, Ningauble, Nixdorf, Noisy, Noldoaran, Noosentaal, NotQuiteEXPComplete, Nottsadol, Novasource, Ntalamai, Nuggetboy, Oblivious, Ohms law, Ohnoitsjamie, Oldadamml, Oleg Alexandrov, Omphaloscope, Orderud, OrgasGirl, Ossmann, Papercutbiology, Paul August, PaulFord, Peter, Phil Sandifer, PhilKnight, Phyzome, Pieguy48, Piet Delport, PlayStation 69, Pohta ce-am pohtit, Poor Yorick, Pooryorick, Positron, Prolog, Ptk, Pumpie, Pwv1, Quagmire, Quiddity, Quota, Quuxplusone, RainerBlome, Raise exception, Ranafon, RedWolf, Reddi, 
Reelrt, Reinis, RenamedUser2, Revived, RexNL, Rich Farmbrough, Rjstott, Rjwilmsi, Rlee0001, Robbe, Robert A West, Robert Skyhawk, Roland2, Romanm, Ronhjones, Roux, Royboycrashfan, Rrburke, Rursus, Rushyo, Russell Joseph McCann, Ruud Koot, S.Örvarr.S, Saccade, Sam Korn, Science History, SeeAnd, Sekelsenmat, Sgbirch, Shadowjams, Shane A. Bender, Shanes, ShelfSkewed, SimonP, Simplyanil, Sjakkalle, Skytreader, Slaad, Slakr, Slashem, SmartBee, Specs112, Speed Air Man, SpeedyGonsales, Speuler, SpuriousQ, Stephen B Streater, Stephenb, Stevertigo, SubSeven, Suffusion of Yellow, Suruena, Swirsky, Switchercat, Systemetsys, TakuyaMurata, Tarret, Taxman, Techman224, Tedickey, Template namespace initialisation script, Teval, Tewy, Tgr, The Thing That Should Not Be, Thniels, Thv, Tiddly Tom, Tim Starling, Timhowardriley, Tizio, Tobias Bergemann, Tony1, TonyClarke, Torc2, Toussaint, TuukkaH, Tysto, Ubiq, Ulric1313, Ultra two, Undeference, Useight, Vadmium, Vahid83, Vaibhavkanwal, Vald, VampWillow, VictorAnyakin, Vivin, Vkhaitan, Vriullop, Vsion, WAS 4.250, Waterfles, Wavelength, Wiki alf, WikiTome, Wikibob, Wikibofh, Wikisedia, Wimt, Windharp, Wlievens, Wmahan, Woohookitty, Ww, Xaosflux, Xavier Combelle, Yana209, Yath, Yk Yk Yk, Yoric, Zaheen, Zarniwoot, Zawersh, ZeWrestler, Zero1328, Zoicon5, Zondor, ²¹², ‫یرون سارائ‬, 775 anonymous edits Computer software  Source: http://en.wikipedia.org/w/index.php?oldid=357698170  Contributors: 12dstring, 16@r, 194.237.150.xxx, A Stop at Willoughby, A.K.R., ABF, AFriedman, ALargeElk, Aatayyab, Abby, Abufaisal65, Acroterion, Addshore, Ademsaykin, AdjustShift, Ah2190, Ahadrt, Ahoerstemeier, AlMac, Alan012, Alansohn, Alasdair, Aleemqureshi, Ali K, AlisonW, AlistairMcMillan, Allen4names, Alphachimp, Amalhotra124, Anabus, Andre Engels, Andrea105, Andrejj, Andrewspencer, Angela, Angela 2502, Antandrus, Aoratos, Ap, Aqeelbilal, ArchonMagnus, Arkrishna, Ashawley, Asiananimal, AuburnPilot, AussieLegend, Autocratique, Avono, Aweiredguy, Backslash 
Forwardslash, Bact, BalderV, Barek, Bean2thousand, Bearly541, Ben-Zin, Betterusername, Bevo, Big Bird, Big milsy, BigDunc, Binary TSO, Blarrrgy, Bobo192, Bonadea, Bongwarrior, Booyabazooka, Boriszex, Boxplot, Brat32, Brent Gulanowski, Brinnington, Brucevdk, Brusegadi, Bryan Derksen, Bubba73, Burnisk, Butros, CIreland, Can't sleep, clown will eat me, CanadianLinuxUser, Canaima, Canterbury Tail, Cargils02, Carl007, Casperdog2227, Cassandra 73, Catgut, Causa sui, Cburress, Centrx, Christian List, Chun-hian, Chuunen Baka, Chzz, Citypanther, CommonsDelinker, Conan, Conversion script, CoolD, Coolcaesar, Cpl Syx, Crossmr, Cst17, Cunya, Cyberstrike3000X, Cyrius, DBHunter, DJ Clayworth, Damian Yerrick, Dan100, Danakil, Dandaman32, Daniel5127, DanielEng, Darkevilfairy, Darrendeng, Darth Panda, Dave6, Davennmarr, Daveydweeb, DavidWBrooks, Davie4125, Dayewalker, Dbstommy, DeadEyeArrow, Deggalega, Delirium, Denny, DerHexer, Derek farn, Dethme0w, Dibcom, Dina, Discospinster, Dispenser, DivineAlpha, Dj stone, DmitTrix, Dominator09, Dougofborg, Drawnman247, Dreamatalana, Drewster1829, Drmies, Dugmn, Dycedarg, E. 
Sn0 =31337=, EdBever, Edcolins, Edivorce, Edlin, Eeekster, Egbsystem, Ehheh, ElBenevolente, Ellenaz, Emana, Emvn, Epbr123, Equendil, Erdal Ronahi, Esoteric Rogue, Everyking, Excirial, Ezubaric, FJPB, Fang Aili, Favonian, Fazilati, Fctk, Feedmecereal, Felyza, Fenice, Fern80, Flarn2006, Flash man999, Flipjargendy, Flubeca, FlyingToaster, Frap, Fratrep, Fred Bradstadt, Fredrik, Fredtheflyingfrog, Freeeeeesoft, Friday, Frip1000, Frosted14, Fubar Obfusco, Fyyer, GDallimore, GSCC, GTBacchus, Gabecuevas, Gadfium, Gaius Cornelius, Galoubet, Gardar Rurak, Gary King, Gazpacho, Geneb1955, Georgia guy, Giftlite, GilbertoSilvaFan, Gilliam, Ginsengbomb, Gogo Dodo, Gordmoo, Greenrd, GregorB, Gronky, Gtg204y, Guanaco, Gueldenberg, Guppy, Gurch, Gwernol, Hadal, Harvester, Hauberk, Haunting The Better, Hayabusa future, Hectorthebat, Hello32020, Hellosandimas, HenryLi, Hmains, Hmcnally, Hno3, Howdoesthiswo, Hu12, Hydrogen Iodide, Iknowyourider, Imnotminkus, Imroy, IngerAlHaosului, Instinct, Intelligentsium, Into The Fray, Iridescent, Ishara665g, Itsmine, Ixfd64, J Di, J. 
Spencer, J.delanoy, JLaTondre, JMetzler, Jaan513, Jackaranga, Jacob.jose, Jacqueline7894y, Jam2k, Jamesiemiller, Japonca, Jarda-wien, Jay, Jcuadros, Jcw69, Jeff G., Jehochman, Jeremy Visser, JeremyA, Jester7777, Jh51681, Jiang, Jiddisch, Jiy, Jmabel, JoanneB, Johan1298, JohnCD, JonHarder, Jonnabuz, Jop, Joy, Jpbowen, Jpgordon, Jreferee, Jsheadixon, Jtalledo, Jtbatalla, Jusdafax, Jusjih, Justin Eiler, Jóna Þórunn, K.Nevelsteen, K.lee, KFP, Kajasudhakarababu, Karol Langner, Kathmandu2007, Kc03, Keilana, Kelldall, Kelly Martin, Kenny sh, KevinCuddeback, Kingpin13, Kinneyboy90, Knownot, Knucmo2, Kornxi, Kostisl, Kozuch, KrakatoaKatie, Krawi, Ktanzer, Kuldeep06march, Kumioko, Kurowoofwoof111, Kuru, La Pianista, Lagalag, LaggedOnUser, Lajpatdhingra, Landroo, Lareina3656y, LauriO, Leandrod, LeoNomis, Liao, Liberty Miller, Liftarn, Ligulem, LinDrug, Lindberg G Williams Jr, Listlist, LittleOldMe, Livajo, Lookingchris, Lucianob, Luna Santin, Lunaumbrax, Luntrasul, Lupo, MC10, MD66, MER-C, Macy, Majorbrainy, Malcolm Farmer, Malcolma, Mandarax, Manik762007, Manionc, Maralia, Marek69, Markaci, Martarius, Master&Expert, Matt Britt, Matusz, Mav, Maximaximax, Maximus Rex, Mcdonald.ross5, Mdd, Mdebets, MeekSaffron, Meelar, Mendalus, Mentifisto, Mermaid from the Baltic Sea, Message From Xenu, Mgrand, Michael Drüing, Michael Hardy, Michal Jurosz, Miguel.mateo, Mild Bill Hiccup, Mindspillage, Minesweeper, Minghong, Mini-Geek, Miquonranger03, Mjchonoles, Mkdw, Monobi, Mpete510, Mr Barndoor, MrDolomite, MrOllie, Muffhen, Murray Langton, Mwanner, Mwilso24, Mxn, Myanw, N n abc123, Nabeth, Naik5abhi, Nanshu, Narayana vvss, Nathanielrichards, Navstar, NawlinWiki, Neil916, Neurophyre, Nharipra, Nickels360, Nikai, Nike8, NinjaCross, Nixdorf, Nixeagle, Nk, Nmrd, Normxxx, Nosferatus2007, Nourybouraqadi, Nsaa, NuclearWarfare, NurAzije, Nurg, Oda Mari, Ohnoitsjamie, Oicumayberight, Old Moonraker, Oldwes, Oleg Alexandrov, Oliver Lineham, Omicronpersei8, One more night, OrgasGirl, Oxymoron83, 
P99am, PEH2, PIrish, Page Up, Paranoid, Party, Passargea, Paul August, Paul Niquette, PeaceAnywhere, Perkinsleslie, Peruvianllama, Peter Winnberg, Pgk, Phatom87, PhilHibbs, PhilKnight, Philip Howard, Philip Trueman, Phillip Ca, Piano non troppo, Pilotguy, Pip2andahalf, Poccil, Pohatu771, Pohta ce-am pohtit, Pointillist, Poo1000, Pooryorick, Priyadarshi.pratyush, Programmer13, Prohlep, Psantora, Puchiko, Purgatory Fubar, Pyfan, Quinobi, Quinsareth, QuintusQuill, R. S. Shaw, RHaworth, Ragha joshi, Ragib, Rainier3, RainierHa, Rajeshmagic, Random89, RandorXeus, Raul654, Ray Van De Walker, Raylu, Rbreen, Recnilgiarc, RedWolf, Reedy, Rehnn83, Remember the dot, Rettetast, RevRagnarok, Revised, Rgill, Rholton, Rhye123, Riana, Rich Farmbrough, Richard cocks, RickK, Ripogenus77, Rl, Rogger.Ferguson, Roland2, Ronhjones, RossPatterson, Roybb95, Rurik, Rwwww, Ryulong, S.K., SC979, SF007, ST47, Sadeq, Sadi Carnot, Safalra, Safarj, Saleem110, Salsa Shark, Sango123, SchfiftyThree, Scientus, Scrool, Seb az86556, Seidenstud, Senator Palpatine, Setveen, Shakya ind, Siggy28, Sigondronggondrong, Siliconov, SimonP, Sintonak.X, Sitarherophil, Sivaguru411, Sjö, Skarebo, SkyBon, Slingerjansen, Slowking Man, Slowmo1993, Smack, Smalljim, Snowmanradio, Sokrato, Soler97, South Bay, Speaksleft, Spectrogram, SpuriousQ, Starkiller88, Stephenb, Steven Zhang, Stewartadcock, StuffOfInterest, Supadawg, Super-Magician, Supersteve04038, Suruena, SusanLesch, SwirlBoy39, Syrthiss, T4tarzan, THEN WHO WAS PHONE?, TPK, TaintedMustard, Tangent747, Tasja, TastyPoutine, TechOutsider, Tejas81, Teryan2006, Th1rt3en, Tharcore, The Anome, The Earwig, The Rambling Man, The Thing That Should Not Be, The Transhumanist, The Transhumanist (AWB), TheBiaatch, Thelb4, Thingg, This acccount is 4 vandalism, Thumperward, Thurak13, Tide rolls, Tigershrike, Tim Q. 
Wells, Timhowardriley, Timothy Neilen, Topbanana, Torc2, Torinor, Traroth, Trevor MacInnis, Triona, Triwbe, True Genius, Trueshow111, Trusilver, Turlo Lomon, Tuxide, Twsx, Uncle Milty, Unyoyega, Uriyan, Vbigdeli, Versageek, Versus22, Virtualsfera, Viskonsas, WLU, Wavelength, Wayward, Werdna, Weregerbil, Wernher, Wexhammer, Weyes, Whitew123, Wiki alf, Wikicrazier2011, Wikidudeman, Wikieditor06, Wimt, Wireless friend, Wm, WojPob, Wshun, Wysprgr2005, Xdenizen, Xevious, Yamamoto Ichiro, Yidisheryid, Yoooder, Yworo, Zarkos, Zedlik, ZeroOne, Zerokewl, Zhang He, ZimZalaBim, Zybez, Александър, Евгени Симеонов, Петър Петров, 1463 anonymous edits History of programming languages  Source: http://en.wikipedia.org/w/index.php?oldid=358397472  Contributors: Abcarter, Acaciz, Aljullu, Altenmann, Andrzejsliwa, Antic-Hay, Arch dude, Ashawley, B3tamike, Banus, Beland, Ben Standeven, Capricorn42, Cassowary, CharlesGillingham, Chris the speller, Cleme263, Danakil, DiarmuidPigott, Diego Moya, Djsasso, Duncandewar, Dylan Lake, Firsfron, Funandtrvl, Fuzheado, Gerweck, Ghettoblaster, GraemeL, Greenrd, Hairy Dude, Hotcrocodile, Huffers, IanOsgood, Icey, Indeterminate, JLaTondre, JMK, John Vandenberg, Jost Riedel, Jpbowen, KoshVorlon, Lakinekaki, Lambiam, Lulu of the Lotus-Eaters, Mahanga, Malcolmxl5, Marc Mongenet, Mdd, Mr700, Oxymoron83, Pgan002, Pgr94, PhilKnight, R. S. 
Shaw, RedWolf, Reeve, Rheostatik, Rp, Rursus, Rwwww, Sabre23t, Sceptre, Sietse Snel, Sinbadbuddha, Skäpperöd, Sligocki, Szyslak, T-bonham, Tablizer, The Thing That Should Not Be, Tony Sidaway, Torc2, Valodzka, Yuriz, Zawersh, Zoicon5, 101 anonymous edits Low- level programming language  Source: http://en.wikipedia.org/w/index.php?oldid=348809741  Contributors: 16@r, AbsolutDan, AdmN, Alksentrs, Amcbride, Andreas Kaufmann, Antonielly, Awg1010, Brianga, Caknuck, CanisRufus, DJ LoPaTa, DanielCristofani, Diego Moya, Doktorspin, FatalError, Fox2030, Fram, Func, Greenrd, Haakon, Haiviet, J M Rice, Jackbaird, Jesin, Just Another Dan, Kate, Krawi, Lerdsuwa, Macrakis, Mecanismo, Medmrt2008, Michael Hardy, Mindmatrix, Minotaur2, Mr.Z-man, Nacefe, NewEnglandYankee, Peng, Pinkadelica, R. S. Shaw, RedWolf, RockMFR, Ruud Koot, Sarg, SoWhy, TakuyaMurata, Tgok, Thehotelambush, Theresa knott, Toxee, Traal, Trylks, Valodzka, Vkhaitan, Zx-man, 今古庸龍, 69 anonymous edits



Article Sources and Contributors High- level programming language  Source: http://en.wikipedia.org/w/index.php?oldid=351787140  Contributors: ARUNKUMAR P.R, Alansohn, Andreas Kaufmann, Antonielly, Anwar saadat, AuburnPilot, Borgx, CambridgeBayWeather, Can't sleep, clown will eat me, CanisRufus, Chad103, Closedmouth, DJ LoPaTa, Danilo.Piazzalunga, Diego Moya, Discospinster, Dougofborg, Dureo, Dysprosia, ENeville, El C, English Lock, Epbr123, FatalError, Fiend666, Firetrap9254, Fox2030, Fubar Obfusco, Func, Furrykef, Gail, Guy Harris, Harmil, HenkeB, Ixfd64, Jclemens, JesseGarrett, Jon Harald Søby, Jordandanford, Kbdank71, LOL, Lod, MER-C, Maelor, Maestrosync, Majorly, Mandarax, Melody, Michael Hardy, Mindmatrix, Miohtama, Munibert, Mushroom, Ningnangnong, Pedant, Pmurph, R. S. Shaw, RandomStringOfCharacters, RedWolf, RockMFR, Rsm99833, Ryankitlitz, SF007, Sarg, Shadow1, Shoeofdeath, SkonesMickLoud, Snaxorb, Solitude, Spikey, Stevage, Stewski, Stormscape, Superextremegamer1, Syndicate, TakuyaMurata, TangentCube, The Thing That Should Not Be, Tobias Bergemann, Traal, Tratten, Travelbird, Turkishbob, Vahid83, VampWillow, Vkhaitan, Vranak, Wootery, Yk Yk Yk, Yugsdrawkcabeht, 今古庸龍, 183 anonymous edits Machine code  Source: http://en.wikipedia.org/w/index.php?oldid=354949012  Contributors: .:Ajvol:., 16@r, 192.35.241.xxx, AKismet, AdmN, Aguinaldo, Alfio, Algebra, Altenmann, Andre Engels, BiT, Bnugia, CanisRufus, CharlesC, Cjewell, Cmdrjameson, Conversion script, Cst17, DJ Clayworth, Dav4is, Dawnseeker2000, Derek Andrews, Discospinster, Dori, El C, Eric-Wester, Everyking, Fabrictramp, Feezo, Furrykef, GeorgeLouis, Helix84, HenkeB, Husond, Imjustmatthew, Incnis Mrsi, Instinct, IvanLanin, J.delanoy, Jacob.jose, Jagdeepyadav, Javert, Jeff02, Jesdisciple, Jeshan, Johnny 0, JonHarder, Jpk, Jusjih, Kappa, Karol Langner, Katieh5584, Kbdank71, Kim Bruning, Klaser, LazyEditor, Lee.crabtree, LiDaobing, Lowellian, Lyricmac, Manop, Mark, Matthewsim, Meadowbert, Mega Chrome, 
Megatronium, Melchoir, Microprofessor, Midzata, Mike Van Emmerik, Mindmatrix, Mirror Vax, MmisNarifAlhoceimi, Modest Genius, MrPrada, Muad, Murray Langton, Mxn, Nanshu, NawlinWiki, Nikai, OrgasGirl, Paul Stansifer, Philip Baird Shearer, Prolog, Punctilius, QofASpiewak, Quuxplusone, Randomguy121, Romanm, RussBlau, Rwpostiii, Salsa Shark, Sam Vimes, Sannse, Shmuel, Slady, Slipstream, Soumyasch, Spiritia, Stmrlbs, Sychen, Tedp, The Magician, Tobias Bergemann, Toussaint, Tximist, UnfriendlyFire, VKokielov, Voyagerfan5761, Waggers, WikiBully, Wikinerd, Wtshymanski, Xihix, Xod, Yanco, రవిచంద్ర, 182 anonymous edits Assembly language  Source: http://en.wikipedia.org/w/index.php?oldid=358806331  Contributors: 0, 16@r, 3Nigma, A.R., AMRAN AL-AGHBARI, Abdull, Accatagon, Ahoerstemeier, Ahy1, Aitias, Akamad, Akyprian, Alan_d, Ale jrb, Aleph Infinity, Alex.g, Alfio, An-chan, AndonicO, Andre Engels, Andres, Angela, Anger22, Angusus, Anna Lincoln, AnnaFrance, Another-anomaly, AnthonyQBachler, Anwar saadat, Apokrif, Arbabarehman, Archanamiya, ArchiSchmedes, Ashmoo, Athaenara, Atlant, Audriusa, Autodmc, Backslash Forwardslash, Beefball, Beno1000, Bilbo1507, Binrapt, Bison bloke, Blainster, Blakegripling ph, Blashyrk, Brage, BroodKiller, Bryan Derksen, Bumm13, CRGreathouse, CUTKD, Can't sleep, clown will eat me, CanOfWorms, CanisRufus, Capi crimm, Chasingsol, ChazZeromus, Cheeselet, Chieffancypants, Chopin1810, Chris Chittleborough, Christian List, Conversion script, Coolv, Cquarksnow, Crotalus horridus, CultureDrone, Damian Yerrick, Darktemplar, Darrien, Dasnov, DavidCary, Dcoetzee, DeanHarding, DekuDekuplex, Derek Ross, Derek farn, Dexter Nextnumber, Dmbrunton, Doug, DouglasGreen, Dragon DASH, Dsavi.x4, Dtgm, Eagle246, Eagleal, Easyas12c, Edward, Edward.in.Edmonton, Elving, Emperorbma, Emuguru, Epbr123, EricR, Everyking, FDD, Feezo, Femto, Ferkelparade, Filemon, Flewis, Frap, Fred Bradstadt, Furrykef, Fuzzbox, Gaius Cornelius, Galoubet, Gannimo, Garas, Giddie, Giftlite, Gioto, 
GoldenMeadows, Gondooley, Goodone121, Greenrd, Gutsul, Gwern, Haiviet, HenkeB, Herzleid, Hirzel, Hotdogger, HumphreyW, IanOsgood, Inzy, ItsProgrammable, Ixfd64, Jack O exiled, Jacob.jose, James086, JavierMC, Jeffrey Mall, Jeh, Jiveshwar, JoeBruno, JohnCJWood, Jorgon, Jpsowin, Jth299, Jukrat, Just Another Dan, KD5TVI, KP-Adhikari, Karada, Kaster, Kbdank71, Kdakin, Keilana, Kglavin, Khunglongcon, Kindall, Kjp993, Koektrommel, Konstable, Kubanczyk, LOL, LeaveSleaves, LizardJr8, Lkopeter, Locke Cole, Lowellian, MER-C, MP64, Manatee0, Marasmusine, Mario Blunk, Martynas Patasius, Masgatotkaca, Matt B., Mdanh2002, Mdwh, Meera123, Mellamokb, Mellum, Mendalus, Michal Jurosz, Michele.alessandrini, Miguelito Vieira, Mike Field, Mikellewellyn, Mindmatrix, Mmernex, MonstaPro, Monz, Moonlit Knight, Mozillaman, Mr.Do!, MrOllie, Msikma, Murray Langton, Mustafazamany, Mykk, NERIUM, Nandesuka, Nanshu, NightFalcon90909, Nikai, Nmcclana, Ohnoitsjamie, Oldhamlet, Ospix, Owl3638, PJonDevelopment, Paresthesias, Patrick, Paul August, Pgk, Piano non troppo, Pinethicket, Pingveno, Pneuhaus, Pohta ce-am pohtit, Poolisfun, Popsracer, Praefectorian, Quadell, Quarryman, R. S. 
Shaw, RCX, RHaworth, Ramu50, Rasmus Faber, Rbakels, Rdnk, Red Prince, Regancy42, Reinderien, RexNL, Rfc1394, Rich Farmbrough, Rivanov, Robbe, Robert Merkel, Ronz, Rotundo, Ruud Koot, Sanbec, Sanoj1234, Sanpnr, Schultkl, Scientus, Scipius, Scott Gall, Shadanan, Shadow demon, Shadowjams, SilentC, Simon80, SimonP, SkyWalker, Slaryn, Slashme, Sleigh, Slightsmile, Soumyasch, Spalding, SpareHeadOne, SpeedyGonsales, Spinality, Sploonie, SpooK, SpuriousQ, Starionwolf, Starnestommy, StealthFox, SteinbDJ, Stewartadcock, Stmrlbs, Stormy Ordos, Struway, Stuart Morrow, Surturz, Suruena, System86, TParis00ap, Tarikes, Tcsetattr, TeaDrinker, Tedickey, Th1rt3en, The Thing That Should Not Be, TheStarman, ThomasHarte, Tide rolls, Tim32, Tobias Bergemann, Toksyuryel, Tomasz Tybulewicz, Toussaint, True Pagan Warrior, Tzarius, Uli, Ultimus, Utvik, VampWillow, Velle, Versus22, Vid512, Vobis132, Vwollan, Wengier, Wereon, Wernher, Wesley, Whitehatnetizen, Wiki alf, Wikiklrsc, Wilky DiFendare, WindowsNT, Wizardman, Wj32, Wjl2, Wknight8111, Wolfmankurd, Wre2wre, Wrp103, Wwmbes, XJamRastafire, Ysangkok, Yworo, Zarel, Zonination, Zundark, Zx-man, ZyMOS, 574 anonymous edits BASIC  Source: http://en.wikipedia.org/w/index.php?oldid=358868167  Contributors: 0836whimper, 130.233.251.xxx, 194.109.232.xxx, 195.186.148.xxx, 1984, 209.75.42.xxx, 23skidoo, 24.4.254.xxx, 6.31, 62.202.114.xxx, 62.202.86.xxx, 96.186, A D Monroe III, ABF, AEMoreira042281, Adw2000, Ae-a, Afairch, After Midnight, Ahoerstemeier, Ahy1, Alansohn, Aldie, Alfio, AlistairMcMillan, Alksub, Ameliorate!, AndersRoyce, Andiye, AndonicO, Andre Engels, Andres, Anonymous Dissident, Aou, Artcliffe, Arthur Rubin, Ashawley, Athaenara, Atlant, Aude, AugPi, Averagejoedude, AxelBoldt, B3virq3b, BRG, Baronnet, Baryonic Being, Bdesham, Benapgar, Bevo, Bkell, Blacken, Blueapples, Bobblewik, Bobo192, Bongwarrior, BorgQueen, Branddobbe, Brighterorange, Brion VIBBER, Bstarynk, Btx40, Burntsauce, CBDunkerson, CPMcE, CRGreathouse, CSMR, Calwatch, Can't 
sleep, clown will eat me, CanOfWorms, CanisRufus, Cap'n Refsmmat, Capricorn42, CatherineMunro, Catspawsd, Certh, Cgs, Charles Gaudette, Chinese3126, Chocolateboy, Choster, Chun-hian, Cjthellama, Clasqm, Closeapple, Coldacid, Comrade Graham, Conti, Conversion script, Coso, CountingPine, Cprompt, Creidieki, CryptoDerk, Cwolfsheep, Cybercobra, Cyktsui, DNewhall, Dabest311, Dachshund, Damian Yerrick, Danakil, DarkFalls, Darrenvc, Dayewalker, Dbiel, Dcljr, Dcoetzee, Dead3y3, Deflagro, Deh, Deking15, Dekisugi, Demicx, Denimadept, Derek Ross, DiSSo, Diberri, Dina, Dinoceras, Dinomite, Dland, Dlfkja;lskj, Dm01, Dmerrill, Dmsar, Dojarca, DoubleBlue, DragonHawk, Dwiakigle, Dylnuge, Dysprosia, Dzubint, EALacey, Eagleal, Eagleamn, Earle Martin, Ecwdev, Editbringer, Edivorce, Edward, Ellmist, Empiric, Enchanter, Engelec, Enochlau, Eric119, Ericd, Evercat, Exhartland, Fabartus, Firedraikke, Fireworks, FrancisRogers, Francs2000, Frecklefoot, Fredrik, Fsw, Fubar Obfusco, Func, Furrykef, Futurebird, Gadfium, Gaius Cornelius, Ganymede 901, Gazpacho, GentlemanGhost, Georgesawyer, Ghettoblaster, Ghosts&empties, Giftlite, Gimboid13, Gioto, Gogo Dodo, Golbez, Gpvos, GraemeL, Graham87, Greenrd, GregCutler, GreyCat, Grstain, Grunt, Gschizas, Gzornenplatz, Hadal, Hairy Dude, Hannes Hirzel, Hapless Hero, Hayabusa future, Helpsloose, Hirzel, HisSpaceResearch, Homo sapiens, Hydrargyrum, IMSoP, IainP, Inarius, Int19h, Interface, J JMesserly, J nachlin, J.delanoy, JaGa, Jackol, Jaxad0127, Jean-Christophe BENOIST, Jeff G., Jimregan, John Spikowski, JohnOwens, Johnuniq, Jonahb, Jpgordon, Kablammo, KholkhozNarra56, Kiam, Kilo-Lima, Kleinheero, KnowledgeOfSelf, Knulclunk, Koyaanis Qatsi, Krich, Kundor, Lee Daniel Crocker, Lee1026, Leuko, Liftarn, Lightmouse, Lockeownzj00, LukeyBoy, Lynch2000s, M7, MER-C, MK8, MacGyverMagic, Macintoshrocks9, Madlobster, Mahjongg, Maksud, Mark Richards, MartinHarper, Mastercoder, Mauls, Maury Markowitz, Mbecker, Mboverload, Meeples, Mentifisto, Metacarpus, Mezzaluna, 
Michael Devore, MichaelBillington, Mietchen, Mild Bill Hiccup, Mintguy, Mirror Vax, Modster, Monedula, Monz, Morwen, Mover, Mxn, Mysid, Myztry, Neilrieck, Nicdafis, Nk, Noodle snacks, Ntsimp, Nyttend, Ohnoitsjamie, OlEnglish, Olathe, Oliver Crow, Onebravemonkey, Oxymoron83, PatriceNeff, Patrick, Pavium, Paxsimius, Pd THOR, PerlDpUa, Peterlin, Peyre, Pgan002, PhilKnight, Philip Trueman, Picapica, Pichu826, Pigsonthewing, Pinoygabs, Pne, Poor Yorick, Prujohn, Psyced, Pucesurvitaminee, Puckly, Quadell, R-Joe, RCX, RTC, Rcingham, RedWolf, Renatolopes, RexNL, Rfc1394, Rhsatrhs, Rich Farmbrough, RichardRussell, Richmd, Rjd0060, Rjnt, Rjwilmsi, Rlee0001, Rmallins, Robin Johnson, Rodrigo Strauss, Rogowiak, Roseglendelyn, RossPatterson, RucasHost, Ruedetocqueville, Ruud Koot, ST47, Salasks, Samiam95124, SeanDuggan, Sfitzge308, Shizhao, Shlomif, Sidewinder1, Sidfadnis, Sietse Snel, Silver hr, SimonP, Sjock, Some guy, Soulpatch, Squirepants101, Ssd, Stan Shebs, Stephenb, Stephenchou0722, Stepshep, SteveSims, StoatBringer, Stormie, Stratocracy, Stu1011, Super3boy, Surturz, Synchrite, Tablizer, TakuyaMurata, Tannin, Taras, Tarinth, Tarquin, Taw, Television rules the nation, Template namespace initialisation script, Tempshill, Terpdx, Thavron, The Epopt, The Raven's Apprentice, TheAdventMaster, TheBilly, Thunderbird2, Tide rolls, Tifego, Timhood, Titoxd, Tnchris, Tom Harris, Tombomp, Tomcole, Tompagenet, Toon05, Toytoy, TraceyR, Trusilver, Tschild, Twelvethirteen, Twinchester, Urhixidur, Utcursch, Versus22, Victarus, Vivio Testarossa, Voidxor, Vranak, Vslashg, Wangi, Wavelength, Weirdy, Wellington, Wernher, Wgungfu, Who, Wikipe-tan, Williamv1138, Wimt, Wjhonson, Wjl2, Wtshymanski, Ww, X570, Xcrivener, Yamamoto Ichiro, Yasuoyasu, Yosri, Zaplin, Zinzinday, ^zer0dyer$, ㍐, 643 anonymous edits C (programming language)  Source: http://en.wikipedia.org/w/index.php?oldid=358895194  Contributors: (aeropagitica), -Barry-, 11.58, 144.132.75.xxx, 16@r, 1exec1, 213.253.39.xxx, A D Monroe 
III, ACM2, AJim, Aarchiba, Abaddon, Abbyjoe45, Abdull, Abeliavsky, Abigail-II, Abilina, Ablonus, AdShea, Adam majewski, Adrian Sampson, Adrianwn, Aeon1006, Aeons, Afog, Ahy1, Ais523, Aj00200, Ajk, Ajrisi, Akamad, Akella, Akihabara, Alansohn, Albertgenii12, Alexthe5th, Alfakim, Alhoori, AlistairMcMillan, Alksentrs, Allan McInnes, Allen3, Alphachimp, Altenmann, Amelio Vázquez, Anabus, Anaraug, Ancheta Wis, Andewz111, Andre Engels, Andrejj, Andries, Andrwsc, Anetode, AngelOfSadness, Angus Lepper, Anon E Mouse, AnthonyQBachler, AntoineL, Applet, Appraiser, Apwestern, Aragorn2, Arch dude, Arcnova, Arekku, Arvindn, Ataru, Atlant, Auroreon, Avsharath, Awickert, AxelBoldt, B k, Bact, Bahonesi, Bakilas, Bart133, Bassbonerocks, Baszoetekouw, Baudway, BazookaJoe, Ben Karel, BenFrantzDale, BenM, Benhocking, Beno1000, Bentler, Bernfarr, Bevo, Bfrbfr, Bhadani, BigChicken, Bigk105, Billposer, Bkell, Blueyoshi321, Bmicomp, Bobblewik, Bobo192, Boredzo, Borislav, Born2cycle, Brion VIBBER, Byrial, C. A. Russell, CPMcE, CRGreathouse, CYD, Cadae, Cadr, Caltas, Can't sleep, clown will eat me, CanadianLinuxUser, CanisRufus, CapitalSasha, Casey Abell, Cctoide, Cdc, Cedars, Cfailde, Cgranade, Chandrasekar78, Charlesjia, CharlotteWebb, Chbarts, Cheesy123456789, Chocolateboy, Chris Burrows, ChristTrekker, Chun-hian, Cjosefy, Cjworden, Codeman, ColtM4, Comatose51, Como, Conan, Corti, Csmaster2005, Cubathy, Curps, Cybercobra, Cyoung, Cyp, D, D6, DAGwyn, DE, DJ Clayworth, DMG413, Daf, DailyBlip, Daivox, Damian Yerrick, Dan Granahan, Dan100, Dan198792, Danakil, Danallen46, Daniel Quinlan, DanielCristofani, Darius Bacon, David Gerard, Davtom, Dcljr, Dcoetzee, Dead3y3, Dekisugi, Delibebek, Derek Ross, Derek farn, Desivenkatesh, DevastatorIIC, Dieter Simon, Disavian, Discospinster, Dispenser, Dkasak, Docu, Dodo bird, Doradus, DougsTech, Download, Dowsiewuwu, Dpv, DragonHawk, Drbrain, Drj, Druiloor, Dusik, Dwheeler, Dysprosia, E0steven, ESkog, Eagleal, EatMyShortz, EdC, Edward, Eequor, Electron9, 
Eloquence, Elz dad, Emperorbma, Emre furtana, EncMstr, Epl18, Eric119, Erik9, Erpingham, EugeneZelenko, Evercat, Evice, Evil Monkey, Ewok18, Excirial, Exert, Faisal.akeel, FastLizard4, Fawcett5, Fbergo, Feb30th1712, Feedmecereal, Felixdakat, Fibonacci, Fireice, Firetrap9254, Flash200, Fleminra, Flockmeal, Flubeca, Fluggo, Foobaz, Fractal3, Frank4ever, Frappucino, Freakofnurture, Frecklefoot, Frederico1234, Fredrik, Free Software Knight, FreedomByDesign, FrenchIsAwesome, Fresheneesz, Fritzpoll, Fsiler, Ftdftd, Ftiercel, FullMetal Falcon, Furrykef, Fuzzy, Fırat KÜÇÜK, GLaDOS, GMcGath, Gagewyn, Gail, Gamma, Gareth Owen, Garyzx, Gaspercat, Gavenko a, Gazimoff, Gene s, Gesslein, Getyoursuresh, Ggurbet, Ghettoblaster, Ghoseb, Giftlite, Gilgamesh, Gimmetrow, Gjd001, Gludwiczak, GnuDoyng, Gogo Dodo, Gpietsch, Graham87, Graingert, Graue, Greg L, GregorB, Griba2010, Gtrmp, Gustavb, Gustavh, Gwinkless, H2g2bob, Hadal, Haeleth, Haikupoet, HairyWombat, Hakkinen, Halo, Hard Backrest, Harryboyles, Hashar, Hayabusa future, Hdante, HellDragon, HenkeB, Henning Makholm, HenryLi, Hervegirod, Hgabor, Hirzel, Hmains, Hmd, Hokanomono, Homerjay, Hqb, Hu12, Huffers, Husky, Hyad, I already forgot, INkubusse, IanOsgood, Ideogram, Idknow, Ilgiz, Iluvcapra, In2dair, InShaneee, InfinityAndBeyond, Information Center, Infrogmation, Intangir, Iridescence, Irish Souffle, Isilanes, Itai, Itmozart, J.delanoy, JJIG, JPINFV, Jamesooders, Jay, Jc4p, Jcarroll, Jeltz, Jengod, JensenDied, Jerryobject, Jfdsmit, Jhevodisek, Jiang, Jiddisch, JimWae, Jla, Jleedev, Jm34harvey, Jmath666, Jmnbatista, Jni, Joeblakesley, John Fader, John Vandenberg, JohnJSal, Johndci, Johnuniq, Jojo-schmitz, Jok2000, Joke137, JonMcLoone, Jordandanford, Jorge Stolfi, JorgePeixoto, Josemanimala, Joseph Myers, João Jerónimo, Jrthorpe, Jscipione, Jubair.pk, Jumbuck, Jusdafax, Jutta, Juuitchan, Jwzxgo, Kapil87852007, Karen Johnson, Katanzag, Kate, Kbolino, KenshinWithNoise, Kenyon, Kerotan, Kevin B12, Kgasso, Kimiko, Kinema, King of 
Hearts (old account 2), Kjak, Klhuillier, KnowledgeOfSelf, Kooldeep, Koyaanis Qatsi, Kreca, Kri, Kushal kumaran, Kusunose, Kwamikagami,

Article Sources and Contributors LAAFan, LOL, Lainproliant, Larry Hastings, Laundrypowder, Lbraglia, Lecanard, Lee Daniel Crocker, Lee1026, Leibniz, LeoNerd, Leontios, Lfwlfw, Liftarn, Lightdarkness, LilHelpa, Linus M., Lir, Lloydd, Loadmaster, Lockley, Lost.goblin, Lotje, Luk, Lupin, Lvl, MDoko, MTizz1, Mac c, Mackstann, Magic Window, Malfuf, Marc Mongenet, Mark Renier, MarkS, Marskell, Martijn Hoekstra, Martinjakubik, MattOates, Matusz, Maustrauser, Mav, MaxSem, Mboverload, Mcaruso, Mdsawyer58, Mellum, Merphant, MertyWiki, Mgmei, Michael.Paoli, MichaelBillington, Michaeln, MichalJurosz, Mickraus, Mikademus, Mike Jones, Mike92591, Mikeloco14, MilesMi, Minesweeper, Minghong, Mipadi, Mirror Vax, Miterdale, Miym, Mk*, Modulatum, Monobi, Morwen, Moskvax, Mrjeff, Msikma, Msreeharsha, MuZemike, Muhandis, Mumuwenwu, Museak, Musicomputer, MustafaeneS, Mux, Mxn, Mycplus, N5iln, Naidim, Nanshu, Napi, NapoliRoma, NauarchLysander, Neckelmann, Neilc, NeoAdonis, Nephtes, NevilleDNZ, NewEnglandYankee, Nick8325, Nickj, Nikai, NikonMike, Ninly, Nixeagle, NixonB, Njyoder, Nk.sheridan, Nn123645, Nnp, Norm, Notedgrant, Notyourhoom, NubKnacker, Numbo3, Nv8200p, Oblivious, Obradovic Goran, Oddity-, Odinjobs, Ohnoitsjamie, Olivier, Ollydbg, Omicronpersei8, Opabinia regalis, Opelio, OrangUtanUK, Orderud, Orthoepy, Orthogonal, OwenBlacker, Oxymoron83, Oysterguitarist, Ozzmosis, P Carn, Pamri, Papa November, Paul-L, Paulsheer, Pcovello, Penubag, Peter Fleet, Pgk, Pharos, Phgao, Phuzion, Piet Delport, Pizza Puzzle, Pjvpjv, Plugwash, Poor Yorick, Postdlf, Potatoj316, Potatoswatter, Praveenp, Pripat2001, Prosfilaes, Proub, PseudoSudo, Pseudomonas, Psychotic Spoon, Psyco Fish, Qbg, Qed, Quantumobserver, Quimn, Quuxplusone, Qwertyus, RCX, RN, RTC, Radagast83, Rahinreza, Rainier002, Ralmin, Rama, RandomStringOfCharacters, Ranjithkh, RedWolf, Redf0x, Reisio, Rekh, Reverendgraham, ReyBrujo, Rhsimard, Rich257, RichW, Richard001, Richfife, Richw33, Rjwilmsi, Rl, Rlbyrne, Rlee0001, Robert Merkel, 
RobertG, RogueMomen, Rokfaith, Rosa67, RoyBoy, Rrelf, Rrjanbiah, Ruakh, Rudderpost, Rumping, Runtime, Rursus, Rwwww, Ryulong, SAE1962, SOMNIVM, SUCKISSTAPLES, Sac.education, Samiam95124, Sampalmer4, Samuel, Sandahl, Sanjay742, Schmid, Scientizzle, SeanProctor, Sebastiangarth, Sen Mon, Sephiroth storm, Serge Stinckwich, Shaan myself, Shadowjams, Shanes, SheepNotGoats, Shekure, Shijualex, Shirik, Shmget, Sigil2, Sigma 7, Simetrical, Simoncpu, Sin Harvest, Sir Lewk, Skier Dude, Sladen, Smyth, Snaxe920, Somercet, Soptep, Spacepotato, Sphivo, Spl, Spolstra, Spoon!, SpuriousQ, Stan Shebs, Stephan Schulz, Stephenb, Stephenbez, Steppres, Stevenj, Steveozone, Stormie, Strait, Stratocracy, Straussian, Sundaryourfriend, Sunny sin2005, SuperTails92, Susvolans, SvGeloven, THEN WHO WAS PHONE?, Tados, TakuyaMurata, Tariqabjotu, Taw, Tcascardo, Tdudkowski, Technion, Teddks, Tedickey, Template namespace initialisation script, Teryx, The Divine Fluffalizer, The Real Walrus, The Thing That Should Not Be, The wub, The751, TheBoardWalkInAC, TheMandarin, TheParanoidOne, Thebrid, Thegroove, Thingg, Thumperward, Thunderbrand, Tiggerjay, Tim Starling, Timneu22, Tobias Bergemann, Tom 99, Tom harrison, Tomgreeny, Tommy, Tommy2010, Tompsci, Tony1, Torc2, Toussaint, Toxygen, Trachtemacht, Traroth, Trevor Andersen, Trevyn, Tripodics, Troysteinbauer, Tsavage, Tsunaminoai, Tualha, Tushar.kute, TuukkaH, Twobitsprite, UU, Ultimus, Ummit, Uncle G, Unixplumber, Urhixidur, Uriyan, Urod, Useight, Usien6, Utcursch, UtherSRG, Utility Monster, Utkarsh kapadia, Uukgoblin, Val42, Valermos, VasilievVV, VbAlex, Versus22, Vicu9Mx, Vijay2421, Vijoeyz, Vinodxx1, Voice of All, Warlordwolf, Wdfarmer, Web-Crawling Stickler, Weevil, Wellithy, Wernher, Who, WikHead, Wiki alf, Wikiwonky, Wimt, Wipe, Woohookitty, World eagle, Worthawholebean, Wrp103, XJamRastafire, Xcentaur, Xemnas 5 7, Xiahou, Xiong Chiamiov, Xp54321, Yamaguchi先生, Yamla, Ybenharim, Yelyos, Yogeendra, Yosri, Ysangkok, Yuzhong, Zanimum, Zeno Gantner, 
Zenohockey, Zhenqinli, ZimZalaBim, Zorothez, Zundark, Zvn, Ævar Arnfjörð Bjarmason, ‫ינמיתה למגה‬, 1459 anonymous edits C+ +  Source: http://en.wikipedia.org/w/index.php?oldid=358955817  Contributors: -Barry-, 12.21.224.xxx, 223ankher, 4jobs, 4th-otaku, 7DaysForgotten, @modi, A D Monroe III, AIOS, ALOTOFTOMATOES, AMackenzie, AThing, ATren, Aandu, Abdull, Abi79, Adam12901, Addihockey10, Adi211095, Adorno rocks, Ae-a, Aesopos, Agasta, AgentFriday, Ahmadmashhour, Ahoerstemeier, Ahy1, Akeegazooka, Akersmc, Akhristov, Akihabara, Akuyume, Alan D, AlbertCahalan, AlecSchueler, Aleenf1, AlexKarpman, Alexf, Alexius08, Alexkon, Alfio, Alhoori, Aliekens, AlistairMcMillan, Allstarecho, AltiusBimm, Alxeedo, AnAccount2, AnOddName, Andante1980, Andre Engels, Andreaskem, Andrew Delong, Andrew1, AndrewHowse, AndrewKepert, Andyluciano, AngelOfSadness, Angela, Anoko moonlight, Anonymous Dissident, Antandrus, Aparna.amar.patil, Apexofservice, Arabic Pilot, Aragorn2, Arcadie, Arctic.gnome, Ardonik, Asimzb, Atjesse, Atlant, Auntof6, Austin Hair, Autopilot, Avoran, Axecution, AxelBoldt, BMW Z3, Baa, Babija, Babjisit, Bahram.zahir, Barek, Baronnet, Bart133, Bartosz, Bdragon, Belem tower, BenFrantzDale, Benhocking, Beowulf king, Bevo, Beyondthislife, Bfzhao, Biblioth3que, Bigk105, Bill gates69, Bineet, Bkil, Blaisorblade, Bluemoose, Bluezy, Bobazoid, Bobblewik, Bobo192, Bobthebill, Bodkinator, Boffob, Boing! 
said Zebedee, Bomarrow1, Bongwarrior, Booklegger, Boseko, Bovineone, Brion VIBBER, Btx40, C Labombard, C++ Template, C.Fred, CALR, CIreland, CPMcE, CRGreathouse, CWY2190, Caesura, Caiaffa, Callek, Caltas, Can't sleep, clown will eat me, CanisRufus, Cap'n Refsmmat, Capi crimm, CapitalR, Capricorn42, Captainhampton, Carabinieri, Carlson-steve, Catgut, Cathack, CecilWard, Cedars, CesarB, Cetinsert, Cfeet77, Cflm001, Cgranade, CharlotteWebb, Chealer, Chocolateboy, Chrisandtaund, Christian List, Chuq, Ckburke, Cleared as filed, Closedmouth, Clsdennis2007, Cometstyles, Comperr, Conversion script, Coolwanglu, Coosbane, Coq Rouge, CordeliaNaismith, Corrector7007, Corti, Cowsnatcher27, Craig Stuntz, Crotmate, Csmaster2005, Ctu2485, Cubathy, Cubbi, Curps, Cwitty, Cybercobra, Cyclonenim, Cyde, CyrilleDunant, Cyrius, DAGwyn, DJ Clayworth, DVD R W, Dallison999, Damian Yerrick, Damien.c.sadler, Dan Brown123, Dan100, Danakil, Daniel.Cardenas, DanielNuyu, Dario D., DarkHorizon, Darkmonkeyz321, Darolew, Dave Runger, Daverose 33, David A Bozzini, David H Braun (1964), Dawn Bard, Dch888, Dcoetzee, Decltype, Deibid, Delirium, Delldot, Denelson83, DerHexer, Derek Ross, Deryck Chan, DevastatorIIC, Dibash, Diego pmc, Discospinster, Dlae, Dlfkja;lskj, Dmharvey, Dogcow, DominicConnor, DonelleDer, Donhalcon, Doofenschmirtzevilinc, DoubleRing, Doug Bell, Dougjih, Doulos Christos, Drangon, Drewcifer3000, Drrngrvy, Dylnuge, Dysprosia, E Wing, ESkog, Eagleal, Earle Martin, EatMyShortz, Ebeisher, Eco84, Ecstatickid, Ed Brey, Edward Z. 
Yang, Eelis, Eelis.net, Ehn, Elliskev, Elysdir, Enarsee, EncMstr, Enerjazzer, EngineerScotty, Eric119, ErikHaugen, Esanchez7587, Esben, Esrogs, Eternaldescent08, Ethan, EvanED, Evice, Evil Monkey, Ewok18, Excirial, FW4NK, Fa2sA, Facorread, Faithlessthewonderboy, Faizni, Falcon300000, Fanf, Fashionslide, FatalError, Favonian, Fistboy, Fizzackerly, Flamingantichimp, Flash200, Flewis, Flex, Flyingprogrammer, FrancoGG, Freakofnurture, Frecklefoot, Free Software Knight, Fresheneesz, Fritzpoll, Ftbhrygvn, Furby100, Furrykef, Fuzzybyte, GLari, Gaul, Gauss, Geeoharee, Gene.thomas, Gengiskanhg, Giftlite, Gil mo, Gildos, Gilgamesh, Gilliam, Gimili2, Gimme danger, Gmcfoley, God Of All, Gogo Dodo, GoodSirJava, Graue, Greatdebtor, GregorB, Gremagor, Grenavitar, Grey GosHawk, Grigor The Ox, Gsonnenf, Gusmoe, Gwern, Gwjames, Hairy Dude, Hakkinen, HappyCamper, Harald Hansen, Harinisanthosh, Harryboyles, HebrewHammerTime, HeikoEvermann, Hemanshu, HenryLi, Herorev, Hervegirod, Hetar, Hgfernan, HideandLeek, Hihahiha474, Hiraku.n, Hmains, Hobartimus, Hogman500, Horselover Frost, Hoss7994, Hu, Hu12, Husond, Hyad, Hyperfusion, I already forgot, ISoLoveHer, Iamninja91, Ibroadfo, Imc, Immunize, InShaneee, Innoncent, Insanity Incarnate, Intangir, Iphoneorange, Iridescence, Iridescent, Irish Souffle, Ironholds, Isaacl, Ixfd64, J Casanova, J Di, J-A-V-A, J.delanoy, JForget, JNighthawk, Jackelfive, Jafet, Jaredwf, Jatos, Javiercastillo73, Javierito92, Jawed, Jayaram ganapathy, Jdent29, Jdowland, Jeff G., JeffTL, Jeltz, Jerry teps, Jerryobject, Jeshan, Jesse Viviano, JesseW, Jgamer509, Jgrahn, Jgroenen, Jh51681, Jimsve, Jizzbug, Jlin, Jnestorius, Johndci, Johnsolo, Jok2000, Jonathan Grynspan, Jonathanischoice, Jonel, Jonmon6691, Jorend, Josh Cherry, Juliano, Julienlecomte, Junkyboy55, Jyotirmay dewangan, K3rb, KJK::Hyperion, KTC, Kaimason1, Kajasudhakarababu, Kalanaki, Kapil87852007, Kashami, Kate, Keilana, Kentij, Khym Chanur, Kifcaliph, King of Hearts, Kinu, Klassobanieras, KnowledgeOfSelf, 
Kogz, Kooky, Korath, Koyaanis Qatsi, Krelborne, Krich, Krischik, Kristjan.Jonasson, Ksam, Kuru, Kusunose, Kwamikagami, Kwertii, Kxx, Kyleahampton, Landon1980, Larry V, Lars Washington, Lastplacer, Le Funtime Frankie, Lee Daniel Crocker, LeinadSpoon, Liao, Liftarn, Lightmouse, Ligulem, Lilac Soul, Lilpony6225, Lir, Liujiang, Lkdude, Lloyd Wood, Loadmaster, Logixoul, Lotje, Lowellian, Luks, Lvella, Lysander89, MER-C, Mabdul, Machekku, MadCow257, Mahanga, Maheshchowdary, Male1979, Malfuf, Malhonen, Malleus Fatuorum, Mani1, Manjo mandruva, Manofabluedog, MarSch, Marc Mongenet, Marc-André Aßbrock, Marcelo Pinto, Mark Foskey, Marktillinghast, Marqmike2, Martarius, Masterkilla, Mathrick, Mav, Mavarok, Max Schwarz, Maxim, Mayank15 5, Mbecker, Mccoyst, Mcorazao, Mcstrother, Mellum, MeltBanana, Mentifisto, Mephistophelian, Metamatic, MetsFan76, Mhnin0, MichaelJHuman, Micphi, Mifter, MihaS, Mikademus, Mike Van Emmerik, Mike92591, Mikrosam Akademija 7, MilesMi, Mindmatrix, Minesweeper, Minghong, Mipadi, Miqademus, Miranda, Mirror Vax, Mistersooreams, Mjquinn id, Mkarlesky, Mkcmkc, MoA)gnome, Moanzhu, Modify, Mohamed Magdy, Mole2386, Morwen, Moxfyre, Mptb3, Mr MaRo, Mr.GATES987, MrJeff, MrSomeone, Mrjeff, Mrwes95, Ms2ger, Muchness, Mukis, Muralive, MustafaeneS, Mxn, Myasuda, Mycplus, Mystìc, N111111KKKKKKKooooo, Naddy, Nanshu, Napi, Nasa-verve, Natdaniels, NawlinWiki, Neilc, Neurolysis, NevilleDNZ, Newsmen, Nick, Nicsterr, Ninly, Nintendude, Nirdh, Nisheet88, Nixeagle, Njaard, Nma wiki, Nohat, Noldoaran, Non-dropframe, Noobs2007, Noosentaal, Northernhenge, ORBIT, Oddity-, Odinjobs, Ohnoitsjamie, Ojuice, OldakQuill, Oleg Alexandrov, Oliver202, Oneiros, Orderud, Ouraqt, OutRIAAge, OverlordQ, OwenBlacker, Ozzmosis, Paddu, Pak21, Pankajwillis, ParallelWolverine, Paul Stansifer, Paul evans, Paulius2003, Pavel Vozenilek, Pawanindia2009, Pbroks13, Pcb21, Pde, PeaceNT, Pedant17, Peruvianllama, Peterl, Peteturtle, Pgk, Pharaoh of the Wizards, Pharos, Phil Boswell, Philip Trueman, 
PhilippWeissenbacher, Pi is 3.14159, Pit, Pizza Puzzle, Plasticup, Pogipogi, Poldi, Polluxian, Polonium, Poor Yorick, Prashan08, Prohlep, ProvingReliabilty, Punctilius, Quadell, Quinsareth, Quuxplusone, Qwertyus, R3m0t, R4rtutorials, REggert, RN, Raghavkvp, RainbowOfLight, Ravisankarvn, Rbonvall, Rdsmith4, RedWolf, Rehabe, Reinderien, Remember the dot, Requestion, Rethnor, RexNL, Rgb1110, Rich Farmbrough, Richard Simons, Ritualizer, Rjbrock, Rjwilmsi, Roadrunner, Robdumas, Robertd, RodneyMyers, RogueMomen, Ronark, Ronhjones, Ronnyim12345, Ronyclau, Root@localhost, Rosive, Rossami, Rprpriya, Rror, Rtfb, Rursus, Ruud Koot, RyanCross, Ryty01, SJP, STL, Sachin Joseph, Sadday, Saimhe, Samuel, Sandahl, Sasha Slutsker, Sbisolo, Sbvb, SchfiftyThree, Schiralli, SchnitzelMannGreek, Schumi555, ScoPi, Scorp.pankaj, Scottlemke, Scythe33, SebastianHelm, Sebastiangarth, Sebor, Sentense12, Seraphim, Sfxdude, Sg227, Shadowblade0, Shadowjams, Shawnc, SheffieldSteel, ShellCoder, Shinjiman, Sidhantx, Sigma 7, Silsor, Simetrical, Simon G Best, SimonP, Sinternational, Sirex98, Sishgupta, Sitethief, Skew-t, Skizzik, SkyWalker, Sl, Sleep pilot, Sligocki, Slothy13, Smyth, Snaxe920, Sneftel, Snigbrook, Snowolf, Sohmc, SomeRandomPerson23, Sommers, Spaz man, Spiel, Spitfire, SplinterOfChaos, SpuriousQ, Stanthejeep, SteinbDJ, Stephan Schulz, Stephenb, Steve carlson, Steven Zhang, Stevenj, StewartMH, Stheller, Stoni, StoptheDatabaseState, Strangnet, Stringle, Stuartclift, Style, Suffusion of Yellow, Supertouch, Suppa chuppa, Surv1v4l1st, SvGeloven, Svick, Swalot, Sydius, T0pem0, T4bits, TCorp, THEN WHO WAS PHONE?, Takis, TakuyaMurata, Tbleher, TeaDrinker, Technion, Tedickey, Template namespace initialisation script, Tero, Tetra HUN, TexMurphy, The 888th Avatar, The Anome, The Inedible Bulk, The Minister of War, The Nameless, The Thing That Should Not Be, TheDeathCard, TheIncredibleEdibleOompaLoompa, TheMandarin, TheNightFly, TheSuave, TheTim, Theatrus, Thebrid, Thematrixv, Thiagomael, 
Thumperward, Tietew, Tifego, Tim Starling, Tim32, TingChong Ma, Tinus, Tobias Bergemann, Toffile, TomBrown16, TomCat2800, Tombrown16, Tompsci, Tony Sidaway, Torc2, Tordek ar, Toussaint, Traroth, Trevor MacInnis, TreyHarris, Troels Arvin, Ts4z, Tslocum, Turdboy3900, Turian, TuukkaH, Ubardak, Umapathy, Unendra, Ungahbunga, Urod, UrsaFoot, Useight, Userabc, UtherSRG, Utnapistim, Val42, Vchimpanzee, Vincenzo.romano, Vinci0008, Viperez15, VladV, Vladimir Bosnjak, Wangi, Wavelength, Wazzup80, Werdna, Westway50, Whalelover Frost, Who, WikHead, Wikidemon, Wikidrone, Wikipendant, Wikiwonky, Willbennett2007, Wilson44691, Winchelsea, Wj32, Wknight94, Wlievens, Woohookitty, Wsikard, XJamRastafire, Xerxesnine, Xoaxdotnet, Yamla, Yankees26, Yboord028, Ybungalobill, Yoshirules367, Ysangkok, Yt95, Yurik, Zck, Zed toocool, Zenohockey, Zigmar, Zlog3, Zoe, Zr2d2, Zrs 12, Zundark, ZungBang, Zvn, Ævar Arnfjörð Bjarmason, Александър, ПешСай, Ἀγάπη, 无名氏, 1753 anonymous edits Perl  Source: http://en.wikipedia.org/w/index.php?oldid=358935701  Contributors: -Barry-, 130.94.122.xxx, 192.146.101.xxx, 199.196.144.xxx, 21655, A!eX, A-giau, Aardvark92, Aaronw, Acmeacme, Agentq314, Agiorgio, Ahy1, Aitter, Alerante, Alethiophile, Alex Zivoder, AlexPlank, Alfio, AliaGemma, Alksentrs, Altenmann, Alvin-cs, Ameliorate!, Amire80, Amreshk, Amux, Anakin101, Ancheta Wis, Andrejj, Andres, Andrew.langmead, AndrewHarvey4, Angela, Angryxpeh, Animum, Anup.debnath, Apeiron, Apokrif, Araes, Aranel, Arbec, Aristotle Pagaltzis, Arknascar44, Arkuat, Arnvidr, Ashley Y, Asmeurer, Asqueella, Atlant, Atou, Autrijus, AzaToth, Baka toroi, BananaFiend, Banzaimonkey, Bazj, Bazza 7, Bdesham, Beland, Ben@liddicott.com, BernardSumption, Bevo, Bhadani, BiT, Bigpresh, BillFlis, Bkd, Bkell, Bkuhn, Blazar, Bluebusy, Bonrajesh, Bookofjude, Booyabazooka, Bovineone, Brentdax, Brianski, Brighterorange, Brion VIBBER, Briséis, Buckwad, Buffyg, CFeyecare, CQ, CRGreathouse, CSWarren, CalPaterson, CambridgeBayWeather, Camembert, Cap'n 
Refsmmat, Catbar, Ccreitz, Charles Gaudette, Charliesome, Chillum, Chipp, Chocolateboy, ChongDae, Chorny, Chris Roy, Chrisdolan, Chrislee 2007, Closedmouth, CobaltBlue, Collectonian, Conversion script, CorbinSimpson, Corpx, Corti, Csmaster2005, Cvbusiness, Cybercobra, Cyberlemming, DMacks, DStoykov, Dacium, Dagolden, Dan100, Danakil, Danallen46, Daniel Quinlan, Daniel.Cardenas, Danimo, Daoswald,

Darguz Parsilvan, Darklama, Darkmonkeh, Daverose 33, David Andel, Davorg, Dbroadwell, Dcoetzee, Demerphq, DerHexer, Deranged bulbasaur, Derek, Diana.dragulin, Diberri, Dinosaur puppy, Diwas, Dmismir, Dmitri Bichko, Dogcow, Dominus, Doradus, Dori, Downtown dan seattle, Droob, Durin, Dushman, Dustimagic, ENSSB, Eagleal, Earle Martin, Earlypsychosis, EatMyShortz, Ed Poor, EdC, Edward, Edward Z. Yang, Ejanev, Ekalin, El Cubano, ElMorador, Eli the Bearded, Ellmist, Eloquence, Elwikipedista, Emesee, Eric TF Bat, Erik Garrison, Erik Sandberg, Ermeyers, Esjs, Eurleif, EvanCarroll, Extremecircuitz, Fagstein, Fastfission, Feezo, Finlay McWalter, Firstrock, Flash200, Flex, Fordescort79, Frap, Frecklefoot, Fredrik, FrenchIsAwesome, Fresheneesz, Friedo, Func, Furrykef, FutureDomain, FvdP, Fvw, Fxhomie, Galoubet, Gamma, GangofOne, Gaurav, Generic Player, Geniac, Gennaro Prota, GeorgeMoney, Getly, Ghettoblaster, Giftlite, Gilgamesh, Glenn, GodOfPain, Gogo Dodo, Goldom, GoodSirJava, GooseCreek, Gosgood, Grafman, Greeneto, Greenleaf, GregU, Gronky, Gsmgm, Guanaco, Guy Peters, H, H3xx, HMJust, HaeB, Hannes Hirzel, Harmil, HarmonicSphere, Havanajoe, Hazir, Heirpixel, HenkeB, Herbee, HermenLesscher, Hervegirod, Hfastedge, Hirzel, Histrion, Hitherebrian, Hqb, Hu12, Hydrargyrum, ICECommander, Ianozsvald, Igouy, Ikluft, Iluvcapra, Imroy, Indefatigable, Int19h, Interiot, Iridescent, Irishguy, Isaac Dupree, Isilanes, Ivanberti, JLaTondre, JakeVortex, JakobVoss, JamesBrownJr, Jamesday, Jandalhandler, Jarkeld, Javaman, JaxGuy, Jay, Jbolden1517, Jdavidb, Jdfle, Jefflayman, Jeffreykegler, Jeremyharmon, Jerome Charles Potts, Jerrinroyc, Jim whitson, Jimbelton, Jkchan, Jkominek, JoaquinFerrero, Joaquinferrero, Johnfn, Jolomo, Jonemerson, Jonpro, Jorge Stolfi, Joy, Jpo, Jpvinall, Jsan, Kaare, Kesla, Kikos, Kimikimicat, Kinema, King Lopez, Kl4m-AWB, Klk206, Kojot350, Kozuch, Krellis, Ksn, Kstarsinic, Kusunose, Kvonk, Kwertii, Lacker, Lasix, Lax4mike, Leif, LeoNerd,
Lerdsuwa, Lgrinberg, Liftarn, LittleDan, Lordspace, Lulu of the Lotus-Eaters, Luna Santin, Luqui, M.Joseph.Curran, Mac, Mahanga, Malleus Fatuorum, Markussep, Martinsanders, Marudubshinki, Massysett, Mbarone911, Mcorazao, Mdchachi, Mendel, Mgnbar, Michael Hardy, Michael Slone, MichaelBillington, MichaelRWolf, Michal Jurosz, MichalJurosz, MikaelSorlin, Mike Dillon, Mike Payne, Mikrosam Akademija 8, MilesMi, Miltonhowe, Minatomochizuki, Mindmatrix, Minesweeper, Miranda, Mocean, Mordemur, Mstuomel, Mugwumpjism, Mutant, Naive cynic, Najoj, Nanshu, Naph, Nat1192, NathanHartman, NawlinWiki, Nef, Neilc, NeoThe1, NickBush24, Nicolas1981, Nikai, Nilsonsfj, Njyoder, Nk, Nknight, Nnkx00, Nobodys Fool, Notheruser, Nymphii, Obuli, Odin, OutZider, P0lyglut, PSzalapski, Paddu, Padraig.coogan, Peng, Pepkaro, Perl, Perlwizard, Petdance, Peter Delmonte, Peterl, Peternewman, Pgn674, Pharos, Phil Boswell, PhilHibbs, PhilipO, PhotoBox, Phydeaux, Pichu826, Piet Delport, Pillefj, Pinethicket, Pingveno, Pippin Bear, Pippoinzo, Pjf, Plutor, Pne, Polly, Poor Yorick, Poweroid, Practical Idealist, PrimeHunter, Prodego, Prolog, PseudoSudo,