7. CONTROL STRUCTURE
This chapter contains the following sections:
7.0 Introduction - Control Structures
7.1 Environment Structures
7.2 Flow Control
7.0 Introduction - Control Structures
Control structures provide the framework for programs in Common LISP, just as in other programming languages. Mastering control structures at an early point in the study of Common LISP will enable the student to progress at a rapid pace. Thus this subject deserves some detailed study.
Common LISP provides a wide range of structures which can be used to organize programs. These fall into two broad categories:
· Access to variables (7.1 Environment Structures)
· Sequence of execution (7.2 Flow Control)
Some of these features are special forms, of which the compiler must have particular knowledge. Some are macros, which expand at compile time into more complicated program fragments made up of special forms, other macros, and function calls.
7.1 Environment Structures
Environment structures deal with data just as flow control structures deal with code (instructions).
LISP has a unitary representation for both 'code' and 'data'. In other programming languages there is no such identity; data objects are stored and notated in a form quite different from code containing instructions. This economy of representation allows LISP programs to write other LISP programs, and allows LISP data to incorporate code to be run at some future time. However, it makes the LISP beginner's task more difficult, particularly for those who already know some other programming language.
It is often important to distinguish, for the purposes of the evaluator, between LISP forms which are intended to be executed and those which are not. The quote special form resolves this ambiguity.
LISP forms which the programmer specifies in this fashion as not to be executed are called constant objects. This is not to say that such objects cannot be altered after they are defined; simply that at that place in the code the object has some particular contents. (But note the defconstant facility, which allows specification of a variable whose value is explicitly not to be changed.)
A variable is a data object, notated by writing a symbol, which may take on a value.
There are two kinds of variables in Common LISP:
· Ordinary variables
· Functional variables
Ordinary variables name data objects.
Functional variables define named functions, macros, and special forms.
There are some similarities between these two. There are parallel functions for dealing with some important characteristics of each, for example boundp and fboundp, quote and function.
However, the two kinds of variables are used for the most part to different ends.
The value of an ordinary variable can be extracted by writing the name of the variable as a form to be evaluated.
Variables can be:
· Local: Visible only within a given lexical scope
· Global: Visible (after they are defined) everywhere
Roughly speaking, local variables are usually called 'lexical' variables because their scope is fixed by the text of the program; global variables are called 'special' or 'dynamic' because their bindings are resolved as the program runs. For a more technical (and correct) description, see scope.
Note that a value is completely optional if the variable is global. In other languages a global variable which has not been assigned a value will have some predefined value or perhaps garbage; in LISP it is possible using the boundp predicate to determine whether a given global variable has been given a value.
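The distinction between local and global variables can be sketched as follows. This is a minimal illustration assuming standard Common LISP behavior; the names *depth*, *unset*, report-depth and demo-scoping are hypothetical and do not come from this manual.

```lisp
;; Hypothetical names throughout: *depth*, *unset*, report-depth, demo-scoping.
(defvar *depth* 0)          ; a global ("special") variable with a value
(defvar *unset*)            ; declared global, but given no value

(defun report-depth ()
  ;; Sees whatever binding of *depth* is current at the time of the call.
  *depth*)

(defun demo-scoping ()
  (let ((x 10)              ; x is local (lexical): invisible to report-depth
        (*depth* 1))        ; rebinding a special variable is seen by callees
    (list x (report-depth))))
```

Under these assumptions (demo-scoping) returns (10 1), a later (report-depth) returns 0 again, and (boundp '*unset*) is false because *unset* was never given a value.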
7.1.1 Variable Reference
There are several forms which deal with referring to a value.
The quote special form returns an object without evaluation. The function special form obtains the functional interpretation of an object. Both of these are used so often that they have their own reader macros to make writing them less burdensome: The single forward quote ' for quote and the #' construct for function.
The symbol-value and symbol-function functions return the value and functional definition of a global symbol. The boundp and fboundp predicates tell whether such a definition is available for a given symbol. The special-form-p predicate tells whether a given symbol names a special form.
Here is a list of the special forms, macros, and functions which allow reference to variables:
boundp
fboundp
function
quote
special-form-p
symbol-function
symbol-value
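The reference forms listed above might be exercised as in the following sketch. It assumes standard Common LISP semantics; *greeting*, twice and reference-demo are hypothetical names. The special-form-p predicate is left out because its name differs between dialects of the language.

```lisp
(defvar *greeting* "hello")            ; hypothetical global variable
(defun twice (n) (* 2 n))              ; hypothetical function

(defun reference-demo ()
  (list (quote (+ 1 2))                ; the list (+ 1 2) itself, not 3
        '(+ 1 2)                       ; the same, using the ' reader macro
        (symbol-value '*greeting*)     ; value of the global symbol
        (funcall (function twice) 4)   ; functional interpretation of twice
        (funcall #'twice 4)            ; the same, using the #' reader macro
        (funcall (symbol-function 'twice) 4)
        (boundp '*greeting*)           ; true: *greeting* has a value
        (fboundp 'twice)))             ; true: twice has a function definition
```

The first two elements of the result are the unevaluated list (+ 1 2), the third is the string "hello", and each of the three funcall forms yields 8.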
7.1.2 Variable Assignment
Several forms are used to assign values to variables. Assignment is different from binding, which is accomplished using the let and let* constructs.
The setq special form is the most common way to assign a value. The set function assigns a value to a special variable. To make an ordinary symbol or functional symbol unbound, there are the makunbound and fmakunbound functions respectively.
The following lists the facilities which allow the value of a variable (more precisely, the value associated with the current binding of the variable) to be altered. Such alteration is different from establishing a new binding.
setq
set
makunbound
fmakunbound
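A brief sketch of these four facilities follows, assuming standard Common LISP semantics. The names *counter*, *scratch*, scratch-fn, assignment-demo and unbind-demo are hypothetical.

```lisp
(defvar *counter* 0)                   ; hypothetical special variable
(defvar *scratch* 42)                  ; hypothetical special variable
(defun scratch-fn () 'defined)         ; hypothetical function

(defun assignment-demo ()
  (let ((x 1))
    (setq x (+ x 1))                   ; alters the current binding of x
    (set '*counter* 5)                 ; set evaluates its first argument to get the symbol
    (list x *counter*)))

(defun unbind-demo ()
  (makunbound '*scratch*)              ; *scratch* no longer has a value
  (fmakunbound 'scratch-fn)            ; scratch-fn no longer has a definition
  (list (boundp '*scratch*) (fboundp 'scratch-fn)))
```

Here (assignment-demo) returns (2 5) and (unbind-demo) returns (nil nil). Note that setq does not evaluate the variable name, while set, being a function, does evaluate its first argument.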
7.1.3 Variable Binding Constructs
The following constructs may also be used to establish bindings of variables, both ordinary and functional.
let
let*
Also, during the invocation of a function represented by a lambda-expression (or a closure of a lambda-expression, as produced by function), new bindings are established for the variables that are the parameters of the lambda-expression. These bindings initially have values determined by the parameter-binding protocol discussed in the entry for lambda.
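The difference between let and let*, and the binding performed by a lambda-expression, can be illustrated with a small sketch. It assumes standard Common LISP semantics; binding-demo is a hypothetical name.

```lisp
(defun binding-demo ()
  (let ((x 1))
    (list
     ;; let: every init form is evaluated before any binding is made,
     ;; so y is initialized from the OUTER x.
     (let ((x 2) (y x)) (list x y))
     ;; let*: bindings are made one after another,
     ;; so y is initialized from the NEW x.
     (let* ((x 2) (y x)) (list x y))
     ;; Calling a lambda-expression binds its parameters to the arguments.
     ((lambda (a b) (+ a b)) x 10))))
```

Calling (binding-demo) returns ((2 1) (2 2) 11).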
7.1.4 Generalized Variables
Finally, LISP extends the concept of a variable using setf. Many functions specify locations. For instance, the aref function can be used to specify a particular location in an array, the third function specifies the third element in a list, and so on.
The setf macro looks like setq, with a difference: the entity being assigned to can either be an ordinary variable, or a place, specified by a form such as aref or third. In the latter case, the value will be stored into the location which the form specifies.
Many of the list and sequence functions can be used in this context.
There are several support functions and macros which are used to implement the setf facility; experts can use these to define their own setf methods if desired. These include defsetf, define-modify-macro, define-setf-method and others.
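As a rough illustration of how these support macros are used, the sketch below defines a setf method with the short form of defsetf and a modify macro with define-modify-macro. It assumes standard Common LISP behavior; point-x, set-point-x, scalef and setf-extension-demo are hypothetical names invented for this example.

```lisp
;; Hypothetical accessor pair: a "point" stored as a cons (x . y).
(defun point-x (p) (car p))
(defun set-point-x (p new-x) (rplaca p new-x) new-x)

;; Short form of defsetf: (setf (point-x p) v) becomes (set-point-x p v).
;; The update function must return the new value.
(defsetf point-x set-point-x)

;; A modify macro built from the ordinary function *.
(define-modify-macro scalef (factor) *)

(defun setf-extension-demo ()
  (let ((p (cons 1 2))
        (n 5))
    (setf (point-x p) 10)              ; stores through the new setf method
    (scalef n 3)                       ; behaves like (setf n (* n 3))
    (list p n)))
```

Calling (setf-extension-demo) returns ((10 . 2) 15).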
In LISP, a variable can remember one piece of data, that is, one LISP object. The main operations on a variable are to recover that object, and to alter the variable to remember a new object; these operations are often called access and update operations. The concept of variables named by symbols can be generalized to any storage location that can remember one piece of data, no matter how that location is named. Examples of such storage locations are the car and cdr of a cons, elements of an array, and components of a structure.
For each kind of generalized variable, typically there are two functions that implement the conceptual access and update operations. For a variable, merely mentioning the name of the variable accesses it, while the setq special form can be used to update it. The function symbol-value accesses the dynamic value of a variable named by a given symbol, and the function set updates it.
Rather than thinking about two distinct functions that respectively access and update a storage location somehow deduced from their arguments, we can instead simply think of a call to the access function with given arguments as a name for the storage location. Thus, just as x may be considered a name for a storage location (a variable), so (car x) is a name for the car of some cons (which is in turn named by x). Now, rather than having to remember two functions for each kind of generalized variable (having to remember, for example, that rplaca corresponds to car), we adopt a uniform syntax for updating storage locations named in this way, using the setf macro. This is analogous to the way we use the setq special form to convert the name of a variable (which is also a form that accesses it) into a form that updates it. The uniformity of this approach is illustrated in the following table.
Access Function  | Update Function  | Update Using setf
x                | (setq x datum)   | (setf x datum)
(car x)          | (rplaca x datum) | (setf (car x) datum)
(symbol-value x) | (set x datum)    | (setf (symbol-value x) datum)
setf is actually a macro that examines an access form and produces a call to the corresponding update function. Given the existence of setf in Common LISP, it is not necessary to have setq, rplaca, and set; they are redundant. They are retained in Common LISP because of their historical importance in LISP. However, most other update functions (such as putprop, the update function for get) have been eliminated from Common LISP in the expectation that setf will be uniformly used in their place.
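The uniformity of setf across different kinds of places can be sketched as follows, assuming standard Common LISP semantics. The function name setf-demo and its local variables are hypothetical.

```lisp
(defun setf-demo ()
  (let ((x 0)
        (pair (cons 'a 'b))
        (items (list 1 2 3))
        (vec (make-array 3 :initial-element 0)))
    (setf x 42)                  ; same effect as (setq x 42)
    (setf (car pair) 'z)         ; same effect as (rplaca pair 'z)
    (setf (third items) 30)      ; stores into the third element of the list
    (setf (aref vec 1) 7)        ; stores into element 1 of the array
    (list x pair items (aref vec 1))))
```

Calling (setf-demo) returns (42 (z . b) (1 2 30) 7). In each case the first argument to setf is exactly the form that would be written to read the location.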
7.2 Flow Control
The primary method for construction of LISP programs is functional application. Operations are implemented through a collection of smaller functions, each of which implements a simple operation. Thus larger operations are defined in terms of smaller ones. These functions operate by calling one another, or by calling themselves recursively, either directly or indirectly.
This is called an applicative approach to programming. In this style, the return value takes on more importance than the side effect. In a purely applicative programming language, there would be no need to worry about the side effects of calling a function; the called function's internal state would be irrelevant to the caller. Although this would allow the construction of large, provably correct systems (an interest of LISP programmers from the start), it is hardly the case in real-world software development.
Developers must worry not only about side effects but also about the interaction between side effects. Indeed, most 'mysterious' bugs are the result of this kind of situation. By breaking up the program into a collection of small functions, it is more likely that such bugs can be localized and effectively eliminated.
Nonetheless, LISP provides many operations that produce side effects. Therefore LISP has various constructs for controlling the sequencing of side effects. This is the technical definition of a control structure.
The following summarizes the flow control constructs of Common LISP. This includes the following topics:
Functional Invocation
Simple Sequencing
Conditionals
Blocks and Exits
Iteration
Mapping
Procedures
Multiple Return Values
Dynamic Non-Local Exits
7.2.1 Functional Invocation
The most primitive form for function invocation in LISP of course has no name; any list that has no other interpretation as a macro call or special form is taken to be a function call. The following constructs are provided for less common but nevertheless frequently useful situations:
apply
funcall
The limit on the number of arguments which may be passed to a function is the value of the constant call-arguments-limit. In Star Sapphire this is 32768.
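The three ways of invoking a function might be compared as in this sketch, which assumes standard Common LISP semantics; invocation-demo is a hypothetical name.

```lisp
(defun invocation-demo ()
  (list (+ 1 2 3)                  ; ordinary function call
        (funcall #'+ 1 2 3)        ; function object, arguments given individually
        (apply #'+ '(1 2 3))       ; the last argument is a list of arguments
        (apply #'+ 1 2 '(3))))     ; leading arguments may precede that list
```

Calling (invocation-demo) returns (6 6 6 6). funcall is convenient when the function is computed at run time but the number of arguments is known; apply is needed when the arguments themselves arrive as a list.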
7.2.2 Simple Sequencing
There are several forms which permit simple sequencing: These are progn, prog1, and prog2. The progn special form is similar to a PASCAL begin-end block with all its semicolons. A progn runs the forms contained in its body from left to right, discarding all the values except the last; its value is the value of the last form. Many LISP control constructs are said to establish an implicit progn.
Each of the following constructs simply evaluates all the argument forms in order. They differ only in what results are returned:
progn
prog1
prog2
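The difference in returned values can be seen in the following sketch, which assumes standard Common LISP semantics. The function sequencing-demo and its local helper note are hypothetical; note records the order of evaluation and returns its argument.

```lisp
(defun sequencing-demo ()
  (let ((trace '()))
    (flet ((note (x) (push x trace) x))          ; hypothetical helper
      (list (progn (note 1) (note 2) (note 3))   ; value of the last form
            (prog1 (note 4) (note 5) (note 6))   ; value of the first form
            (prog2 (note 7) (note 8) (note 9))   ; value of the second form
            (reverse trace)))))                  ; the order in which forms ran
```

Calling (sequencing-demo) returns (3 4 8 (1 2 3 4 5 6 7 8 9)): all nine forms run in order, and only the choice of returned value differs.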
7.2.3 Conditionals
The functions, macros and special forms which make up LISP's inventory of conditional constructs are as follows:
case
cond
if
typecase
when
unless
It is traditional to use and and or as one-way conditionals as well. Programmers experienced with other languages may find this a bit strange at first, although it is perfectly legitimate. The choice of which conditional form to use in a specific situation is a matter of style and taste. Some guidelines are provided herein in this matter.
Common to all programming languages, conditional constructs allow branching to a particular location in the program based on specific conditions at run time.
Conditionals are the chief form used to build inference rules for expert systems. Perhaps because of LISP's role in building expert systems, Common LISP has a slightly richer inventory of these constructs than other languages.
This section describes the various conditional constructs available in Star Sapphire Common LISP.
The evaluation of conditional constructs usually hinges on the result of evaluating a conditional clause.
A conditional clause is typically a predicate; however, in Common LISP the condition can be any expression. As discussed elsewhere, a form which evaluates to nil is considered to be 'false'; all other forms are considered to be 'true'. This is the key to understanding and writing LISP conditionals.
Most programming languages support some kind of if-then-else construct. If is considered to be a primitive of Common LISP, hence its status as a special form.
There are also two constructs which can be used interchangeably with if: when and unless. These are used as 'syntactic sugar' for expressions. When and unless are used so that conditional code reads more naturally, in order to make the programmers' intentions clear to casual inspection.
The cond macro is the traditional conditional construct in LISP: It actually predates if. cond does a linear search in a list of conditions; the first matching condition is executed.
Common LISP also provides the case and typecase conditional constructs, which perform a linear search for a clause matching a particular key, by (eql) equality or type respectively. These forms are less general than cond but are easier to read where applicable.
Note that the and and or macros are often used as conditionals; they are used in specific circumstances which are discussed in the entry for if.
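The conditional constructs described in this section might be used as in the sketch below. It assumes standard Common LISP semantics; every function name here (sign-word, when-unless-demo, classify, describe-key, describe-type, safe-first) is hypothetical.

```lisp
(defun sign-word (n)
  (if (minusp n) 'negative 'non-negative))     ; if: test, then form, else form

(defun when-unless-demo (n)
  (list (when (evenp n) 'even)                 ; nil when the test fails
        (unless (evenp n) 'odd)))              ; nil when the test succeeds

(defun classify (x)
  (cond ((not (numberp x)) 'not-a-number)      ; clauses are tried in order
        ((minusp x) 'negative)
        ((zerop x) 'zero)
        (t 'positive)))

(defun describe-key (key)
  (case key                                    ; keys are compared using eql
    ((a e i o u) 'vowel)
    (y 'sometimes)
    (otherwise 'other)))

(defun describe-type (x)
  (typecase x                                  ; clauses are selected by type
    (integer 'integer)
    (string 'string)
    (t 'something-else)))

(defun safe-first (x)
  ;; and as a one-way conditional: nil unless x is a non-empty list.
  (and (consp x) (first x)))
```

For example, (classify -3) returns negative, (describe-key 'e) returns vowel, (describe-type "s") returns string, (when-unless-demo 4) returns (even nil), and (safe-first 5) returns nil rather than signalling an error.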
7.2.4 Blocks and Exits
The following constructs provide a structured lexical non-local exit facility.
block
return
return-from
A block is simply a sequence of forms. A block normally returns all values returned by the last form in the block.
Return and return-from unconditionally exit a block they are contained in. They allow the specification of which block to return from, as well as what to return from it. In the most common cases this mechanism is more efficient than the dynamic non-local exit facility provided by catch and throw.
The return construct returns from a block named nil; in Star Sapphire and possibly other implementations return simply expands into (return-from nil).
A block is a named body of forms which can be exited at any time through a return or return-from construct. Its value is either the value of the last form or the value returned by a return or return-from (if one was executed). Although blocks act and look a lot like the progn construct, progn has no capability for interrupting its control flow prematurely. A block named nil is automatically constructed around the body of every lambda so that return can be used to return values from it. Similarly, in defun and defmacro forms, a block with the same name as the function or macro is wrapped around the body, so that within defun foo, (return-from foo...) will return from the function foo.
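The following sketch shows a named block, a block named nil, and the block that defun wraps around a function body. It assumes standard Common LISP semantics; find-first-negative and block-demo are hypothetical names.

```lisp
(defun find-first-negative (numbers)
  ;; defun wraps this body in a block named find-first-negative.
  (dolist (n numbers 'none)
    (when (minusp n)
      (return-from find-first-negative n))))

(defun block-demo ()
  (list (block outer
          (block inner
            (return-from outer 'left-outer)   ; skips the rest of both blocks
            'never-reached)
          'also-never-reached)
        (block nil
          (return 'left-nil)                  ; return exits the block named nil
          'never-reached)))
```

Here (find-first-negative '(3 -1 4 -5)) returns -1, (find-first-negative '(1 2)) returns none, and (block-demo) returns (left-outer left-nil).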
7.2.5 Iteration
Common LISP provides the following iteration constructs:
loop
do
do*
tagbody
dolist
dotimes
The loop construct is the simplest iteration facility; it is just a progn with a branch from the bottom back to the top. It controls no variables, and simply executes its body repeatedly. A loop can be exited using return, since it places an implicit block named nil around its body's forms.
In contrast to loop, do and do* provide a powerful and general iteration facility for repetitively recalculating several variables on each cycle.
The constructs dolist and dotimes execute a body of code once for each value taken by a single variable. They are expressible in terms of do, but capture very common patterns of use. Dolist and dotimes are considerably easier to master than do.
The tagbody construct is the most general iteration construct, permitting arbitrary go statements within it. (The traditional prog construct is a synthesis of tagbody, block, and let.)
Most of the iteration constructs permit statically defined non-local exits in the form of the return-from and return statements.
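To compare the constructs, the sketch below computes the same sum (0 + 1 + ... + n) with each of them. It assumes standard Common LISP semantics; all of the function names are hypothetical.

```lisp
(defun sum-with-loop (n)
  ;; loop controls no variables; return exits its implicit nil block.
  (let ((i 0) (total 0))
    (loop
      (when (> i n) (return total))
      (setq total (+ total i))
      (setq i (+ i 1)))))

(defun sum-with-do (n)
  ;; do steps i and total in parallel, so total is updated with the old i.
  (do ((i 0 (+ i 1))
       (total 0 (+ total i)))
      ((> i n) total)))

(defun sum-with-dotimes (n)
  (let ((total 0))
    (dotimes (i (+ n 1) total)         ; i takes the values 0 through n
      (setq total (+ total i)))))

(defun sum-list (numbers)
  (let ((total 0))
    (dolist (x numbers total)          ; x takes each element in turn
      (setq total (+ total x)))))

(defun sum-with-tagbody (n)
  (let ((i 0) (total 0))
    (tagbody
     top
       (when (> i n) (go done))
       (setq total (+ total i))
       (setq i (+ i 1))
       (go top)
     done)
    total))
```

With n equal to 4, each of the first, second, third and fifth functions returns 10, as does (sum-list '(1 2 3 4)).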
7.2.6 Mapping
A different kind of iteration is provided by the map functions. These traverse a sequence once and apply a function to each element in the sequence. The mapcar, maplist, mapc, mapl, mapcan, and mapcon functions do this on lists. The map function does completely generalized mapping on any sequence. The various flavors allow different ways for the return value to be constructed.
Mapping is basically a type of iteration in which a function is successively applied to pieces of one or more sequences. The result of the iteration is a sequence containing the respective results of the function applications. There are several options for the way in which the pieces of the list are chosen and for what is done with the results returned by the applications of the function.
The function map may be used to map over any kind of sequence. The following functions operate only on lists.
mapcar
maplist
mapc
mapl
mapcan
mapcon
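Several of the mapping functions are compared in the sketch below, which assumes standard Common LISP semantics; mapping-demo is a hypothetical name.

```lisp
(defun mapping-demo ()
  (list (mapcar #'1+ '(1 2 3))                 ; applied to successive elements
        (maplist #'length '(a b c))            ; applied to successive tails
        (mapcan #'(lambda (x)                  ; results are spliced together
                    (if (evenp x) (list x) nil))
                '(1 2 3 4))
        (map 'vector #'* '(1 2 3) #(10 20 30)) ; works on any kind of sequence
        (let ((total 0))
          (mapc #'(lambda (x) (setq total (+ total x)))  ; used for effect only
                '(1 2 3))
          total)))
```

Calling (mapping-demo) returns ((2 3 4) (3 2 1) (2 4) #(10 40 90) 6). The mapcan idiom shown acts as a filter, since clauses returning nil contribute nothing to the result.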
7.2.7 Procedures
Several facilities are available which allow writing code in the 'procedural' or 'statement-oriented' style which is used in most programming languages. This includes tagbody and go, which, when used in combination, set up a lexical environment for executing a sequence of forms from left to right, possibly with arbitrary jumps forward or backward. This provides the goto facility which structured programmers are supposed to avoid. Finally there is a prog form, which incorporates a block, a let and a tagbody, for the ultimate in Pascal-style programming. The prog feature is currently not in favor among LISP programmers and is disparaged by Steele.
LISP implementations since LISP 1.5 have had what was originally called "the program feature". Steele stage-whispers "as if it were impossible to write programs without it!" The following Common LISP macros implement this facility:
prog
prog*
The prog construct allows one to write in an ALGOL-like or FORTRAN-like statement-oriented style, using go statements that can refer to tags in the body of the prog. Modern LISP programming style tends to use prog rather infrequently. The various iteration constructs, such as do, have bodies with the characteristics of a prog. (However, the ability to use go statements within iteration constructs is very seldom used in practice.)
Three distinct operations are performed by prog: It binds local variables, it permits use of the return statement, and it permits use of the go statement. In Common LISP, these three operations have been separated into three distinct constructs: let, block, and tagbody. These three constructs may be used independently as building blocks for other types of constructs.
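The decomposition of prog can be sketched by writing the same function twice, once with prog and once with the three separate constructs. This assumes standard Common LISP semantics; count-down-prog and count-down-parts are hypothetical names.

```lisp
(defun count-down-prog (n)
  (prog ((i n) (acc '()))              ; binds variables, allows return and go
   again
     (when (zerop i) (return (reverse acc)))
     (push i acc)
     (setq i (- i 1))
     (go again)))

(defun count-down-parts (n)
  (block nil                           ; supplies the target for return
    (let ((i n) (acc '()))             ; supplies the local variables
      (tagbody                         ; supplies the tags for go
       again
         (when (zerop i) (return (reverse acc)))
         (push i acc)
         (setq i (- i 1))
         (go again)))))
```

Both (count-down-prog 3) and (count-down-parts 3) return (3 2 1).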
7.2.8 Multiple Return Values
LISP is unusual in its capability to return more than one value (or no value at all) from a form. This neatly avoids the problems which other languages are prone to when more than one object is produced or affected by a function. In Pascal, such parameters must be VARs; in C, pointer parameters are used; and in many languages global variables are often resorted to in this case (with all their attendant problems).
In LISP a function can simply return as many values as it wants (including none at all!) using the values function. Many built in functions do this. However, this introduces a new level of complexity for both the user and the implementor.
To help manage multiple return values the following facilities are provided:
values
multiple-values-limit
values-list
multiple-value-list
multiple-value-call
multiple-value-prog1
multiple-value-bind
multiple-value-setq
Ordinarily the result of calling a LISP function is a single LISP object. Sometimes, however, it is convenient for a function to compute several objects and return them. Common LISP provides a mechanism for handling multiple values directly. This mechanism is cleaner and more efficient than the usual tricks such as returning a list of results or stashing results in global variables.
7.2.8.1 Constructs for Handling Multiple Values
Normally multiple values are not used. Special forms are required both to produce multiple values and to receive them. If the caller of a function does not request multiple values, but the called function produces multiple values, then the first value is given to the caller and all others are discarded; if the called function produces zero values, then the caller gets nil as a value.
The primitive for producing multiple values is values, which takes any number of arguments and returns that many values. If the last form in the body of a function is a values with three arguments, then a call to that function will return three values. Other special forms also produce multiple values, but they can be described in terms of values. Some built-in Common LISP functions, such as floor, return multiple values; those that do are so documented.
The special forms and macros for receiving multiple values are:
multiple-value-list
multiple-value-call
multiple-value-prog1
multiple-value-bind
multiple-value-setq
The following functions explicitly return an arbitrary number of values:
values
values-list
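The producing and receiving constructs listed above might be combined as in the following sketch. It assumes standard Common LISP semantics; min-and-max and multiple-values-demo are hypothetical names.

```lisp
(defun min-and-max (numbers)
  ;; Produces two values: the smallest and the largest element.
  (values (apply #'min numbers) (apply #'max numbers)))

(defun multiple-values-demo ()
  (list
   (multiple-value-list (min-and-max '(3 1 2)))          ; collect values in a list
   (multiple-value-bind (lo hi) (min-and-max '(3 1 2))   ; bind values to variables
     (- hi lo))
   (let (q r)
     (multiple-value-setq (q r) (floor 7 2))             ; assign values to variables
     (list q r))
   (multiple-value-call #'list (floor 7 2) (floor 9 4))  ; pass all values as arguments
   (multiple-value-list (values-list '(a b c)))          ; list elements become values
   (multiple-value-list
    (multiple-value-prog1 (floor 7 2) 'ignored))))       ; keep values of the first form
```

Calling (multiple-values-demo) returns ((1 3) 2 (3 1) (3 1 2 1) (a b c) (3 1)).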
7.2.8.2 Rules Governing the Passing of Multiple Values
It is often the case that the value of a special form or macro call is defined to be the value of one of its subforms. For example, the value of a cond is the value of the last form in the selected clause. In most such cases, if the subform produces multiple values, then the original form will also produce all of those values. This passing back of multiple values of course has no effect unless eventually one of the special forms for receiving multiple values is reached.
To be explicit, multiple values can result from a special form under precisely these circumstances:
7.2.8.2.1 Evaluation and Application
eval returns multiple values if the form given it to evaluate produces multiple values.
apply, funcall, and multiple-value-call pass back multiple values from the function applied or called.
7.2.8.2.2 Implicit progn contexts
The special form progn passes back multiple values resulting from evaluation of the last subform. Other situations referred to as implicit progn, where several forms are evaluated and the results of all but the last form are discarded, also pass back multiple values from the last form. These situations include the body of a lambda expression, in particular those constructed by defun, defmacro, and deftype. Also included are bodies of the constructs eval-when, progv, let, let*, when, unless, block, and catch, as well as clauses in such conditional constructs as case and typecase.
7.2.8.2.3 Conditional constructs
if passes back multiple values from whichever subform is selected (the then form or the else form).
and and or pass back multiple values from the last subform but not from subforms other than the last.
cond passes back multiple values from the last subform of the implicit progn of the selected clause. If, however, the clause selected is a singleton clause, then only a single value (the non-nil predicate value) is returned. This is true even if the singleton clause is the last clause of the cond. It is not permitted to treat a final clause (x) as being the same as (t x) for this reason; the latter passes back multiple values from the form x.
7.2.8.2.4 Returning from a block
The block construct passes back multiple values from its last subform when it exits normally. If return-from (or return) is used to terminate the block prematurely, then return-from passes back multiple values from its subform as the values of the terminated block. Other constructs that create implicit blocks, such as do, dolist, dotimes, prog, and prog*, also pass back multiple values specified by return-from (or return).
do passes back multiple values from the last form of the exit clause, exactly as if the exit clause were a cond clause. Similarly, dolist and dotimes pass back multiple values from the result form if that is executed. These situations are all examples of implicit uses of return-from.
7.2.8.2.5 Throwing out of a catch
The catch construct returns multiple values if the result form in a throw exiting from such a catch produces multiple values.
7.2.8.2.6 Miscellaneous situations
unwind-protect returns multiple values if the form it protects returns multiple values.
Among special forms that never pass back multiple values are setq, multiple-value-setq, prog1, and prog2. The conventional way to force only one value to be returned from a form x is to write (values x).
The most important rule about multiple values is: no matter how many values a form produces, if the form is an argument form in a function call, then exactly one value (the first one) is used. For example, if you write (cons (floor x)), then cons will always receive exactly one argument (which is of course an error), even though floor returns two values. To pass both values from floor to cons, one must write something like (multiple-value-call #'cons (floor x)). In an ordinary function call, each argument form produces exactly one argument; if such a form returns zero values, nil is used for the argument, and if more than one value, all but the first are discarded. Similarly, conditional constructs such as if that test the value of a form will use exactly one value, the first one, from that form and discard the rest; such constructs will use nil as the test value if zero values are returned.
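Several of the rules in this section can be checked with a sketch such as the following, which assumes standard Common LISP semantics. The name value-passing-demo is hypothetical; (floor 7 2) is used throughout because it produces the two values 3 and 1.

```lisp
(defun value-passing-demo ()
  (list
   (multiple-value-list (floor 7 2))              ; both values are received
   (multiple-value-list (values (floor 7 2)))     ; (values x) forces one value
   (list (floor 7 2))                             ; argument position uses one value
   (multiple-value-call #'cons (floor 7 2))       ; both values become arguments
   (multiple-value-list (if t (floor 7 2) nil))   ; if passes back all values
   (multiple-value-list (cond ((floor 7 2))))     ; singleton clause: one value
   (multiple-value-list (cond (t (floor 7 2))))   ; (t x) clause: all values
   (multiple-value-list (prog1 (floor 7 2)))      ; prog1 passes back one value
   (list (values))))                              ; zero values yield nil as argument
```

Calling (value-passing-demo) returns ((3 1) (3) (3) (3 . 1) (3 1) (3) (3 1) (3) (nil)).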
7.2.9 Dynamic Non-Local Exits
Common LISP specifies a very controlled way of exiting complex processes in a dynamically scoped, non-local, manner. This can be done using the following facilities:
catch
throw
unwind-protect
The catch, throw and unwind-protect forms provide a clean mechanism for transferring control from one location to another in a program, even if the destination is not within the current function. The only requirement is that the destination has been established at some time before it is sought out.
The unwind-protect facility can hook into any transfer of control at the moment it occurs, including go, return, catch and throw, and run code to clean up side effects or perform any desired action.
There is no real equivalent to these facilities in traditional high-level languages, although a C programmer can achieve approximately the same result using the longjmp and setjmp library routines. Recently, though, similar constructs have been introduced in C++.
Technically speaking, this affords a non-local, dynamically scoped means of exiting the current evaluation context. Implementationally, a stack context is stashed in a linked list whose nodes get stored on the physical LISP stack. However, don't be intimidated by all this; you don't have to be a guru to use this facility. The point is that this complexity is hidden from you, the programmer.
A catch form evaluates the forms in its body. If a throw form is executed during this evaluation, the evaluation stops at that point. The catch form will then return a value specified by the throw.
catch / throw is more general than block and return. A return can only exit a block which it is contained in lexically. The catch / throw mechanism works even if the throw form is not within the body of the catch form.
In simple terms, a return can only apply to a block whose parentheses surround the return form. The throw can communicate with a catch even if that catch is defined in some other file, as long as when it throws, the catch has been called and not yet returned.
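A minimal sketch of these facilities follows, assuming standard Common LISP semantics. The names *cleanup-log*, search-tree, find-in-tree and guarded-find, and the catch tag found-it, are hypothetical. Note that the throw appears in a different function from the catch it exits.

```lisp
(defvar *cleanup-log* '())               ; hypothetical record of cleanup actions

(defun search-tree (tree target)
  ;; Throws to the tag found-it from arbitrarily deep in the recursion.
  (cond ((eql tree target) (throw 'found-it target))
        ((consp tree)
         (search-tree (car tree) target)
         (search-tree (cdr tree) target))))

(defun find-in-tree (tree target)
  (catch 'found-it
    (search-tree tree target)
    nil))                                ; reached only if nothing was thrown

(defun guarded-find (tree target)
  (catch 'found-it
    (unwind-protect
         (search-tree tree target)
      ;; The cleanup form runs whether or not a throw passes through.
      (push 'cleaned-up *cleanup-log*))))
```

Here (find-in-tree '((1 2) (3 (4))) 4) returns 4 and (find-in-tree '(1 2) 9) returns nil. After (guarded-find '(1 (2 3)) 3), which returns 3, the symbol cleaned-up has been pushed onto *cleanup-log* even though control left search-tree by way of a throw.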