
2. DATA TYPES

This chapter covers the following topics:

2.1 Introduction to Common LISP types

2.1.1 Implementation Note: Typebit Values

2.2 Type Overview

2.3 Number types

2.4 Character types

2.5 String Character types

2.6 Symbol Type

2.7 Lists and Conses

2.8 Array Type

2.9 Hash table Type

2.10 Package Type

2.11 Stream Type

2.12 Random-State Type

2.13 Structure Type

2.14 Functional Object Type

2.15 Unreadable Data Objects

 

2.1 Introduction to Common LISP types

This section of the documentation discusses the range of data types available in Star Sapphire Common LISP.

A data type is a set of LISP objects. Some kinds of LISP objects can be a member of more than one type.

In LISP, as opposed to most other programming languages, data objects are typed, not variables. A symbol or lexical variable can take any LISP object as a value, which can be of any type.

As many LISP objects can belong to more than one type, it will not always make sense to ask what the type of an object is. Instead, one usually asks only whether an object is of a given type; this can be done using the predicate typep.

In Star Sapphire LISP, types are implemented by embedding a bit pattern in the virtual address of the object. These are called 'type bits'. The type returned by type-of (with one minor exception) is the value of the type bits of an object.

Certain types, such as sequence or number, include other types. The universe of Common LISP types is thereby organized into a hierarchy (actually a partial ordering) of types defined by the subset relationship.
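
For example, the following sketch (standard Common LISP; the helper name and the list of candidate types are chosen only for illustration) collects every type, from a small list of candidates, to which an object belongs.

```lisp
(defparameter *candidate-types* '(integer number symbol list atom)
  "A few standard type specifiers to test against (an arbitrary selection).")

(defun matching-types (object)
  "Return those entries of *CANDIDATE-TYPES* that OBJECT belongs to."
  (remove-if-not (lambda (type) (typep object type)) *candidate-types*))
```

In a conforming Common LISP, (matching-types 42) returns (integer number atom), and (matching-types nil) returns (symbol list atom), since nil is both a symbol and the empty list. The subset relationship itself can be queried with subtypep: (subtypep 'integer 'number) is true because every integer is a number.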

Types are named by symbols, known as type specifiers.

The Common LISP Object System (CLOS) generalizes the concept of the type hierarchy into a class hierarchy. The class hierarchy does not replace the type hierarchy, but rather provides ways of extending it. A class determines the structure and behavior of objects which belong to it (its 'instances'), not just what typep and type-of will say about it.

Every Common LISP data object belongs to some class. The class hierarchy defines how a particular class inherits methods (chunks of code which are mapped into generic operations) from other classes. This inheritance is multiple and transitive: a class may have several direct superclasses, and it inherits methods from each of them and, in turn, from their superclasses. The classes therefore form a directed acyclic graph.

The set of all objects is specified by the symbol t. The empty data type, which contains no objects, is denoted by nil.

In the class hierarchy, the class t is the superclass of all classes; the nil class is the subclass of all classes.

This portion of the documentation discusses all standard Common LISP types implemented in Star Sapphire Common LISP, how those objects are implemented, the acceptable values of such objects, and how objects of each type are notated.

Descriptions of LISP functions that operate on data objects of each type appear in other chapters. See also the discussion of Type Specifiers.

 

2.1.1 Implementation Note: Typebit Values

The following section is a discussion of how types are implemented internally; it is an advanced topic which is not required knowledge for most users of this product. To skip ahead, select Type Overview.

This information is primarily of use to Star Sapphire source licensees and link kit users. This section refers to files shipped only with those products. This will also be of use if you are working with debugger or inspector output or just curious about internals.

A Star Sapphire LISP cons is a 64 bit structure containing two 32 bit virtual addresses. A virtual address (an addr) is a structure with two 16 bit quantities, known as the page address and page index. The former is a 64 bit aligned offset on a page, and the latter is the offset of the page from the base of virtual memory. Pages are 1Kb in size. Since a 1Kb page holds at most 128 conses of 64 bits each, only 7 bits are actually used in the page address.

This is defined in C (in vm.h) as follows:

typedef unsigned int vpi;   /* virtual page index */
typedef unsigned int vpa;   /* virtual page address */

/* virtual address type */
typedef struct addr {
    vpi a_pi;               /* page index */
    vpa a_pa;               /* page address */
} addr, *paddr;

typedef struct cons {
    addr c_car;             /* traditional lisp list head */
    addr c_cdr;             /* traditional lisp list tail */
} cons, *pcons;

An addr is printed as two C hex numbers in angle brackets, as follows: <0x23:0xa1>. In LISP output this is preceded by a hash sign to indicate the unreadability of the raw addr. An addr will only be printed in this fashion by the LISP print routine if it cannot be associated with a valid typebits value.

Every Star Sapphire LISP cons has the types of its car and cdr encoded in the respective page address fields. This is possible because the page address has 9 bits free, allowing 512 potential typebit values. Fewer than 60 typebit values are currently used.

Star Sapphire LISP memory management is extremely optimized to make the best usage of the virtual memory space.

Conses have their own space since they are uniformly 64 bits in size and are the most common object allocated. This space extends from 0 up to the base of the virtual heap.

Data objects such as strings, symbols, arrays, and so on, which do not normally have a representation that can be contained within 16 bits, have blocks allocated on the virtual heap to hold their contents. The address of the object refers to this block.

Some objects are 'immediate'; they do not have an actual representation on the virtual heap. This includes fixnums, characters, and lexical variables. A 16 bit representation of these objects is efficient, and so this representation is embedded in the page index field.

These typebit values are listed in the table at the end of this section.

Here are some examples.

A cons containing a reference to the list

(a 2)

has the typebits:

[ LT_CONS LT_NULL ]

and the list itself contains two conses with types

[ LT_SYMBOL LT_CONS ] and [ LT_FIX LT_NULL ]

Note that LT_NULL is guaranteed to be 0. These are put together like this:

cons space
    [ LT_CONS LT_NULL ]
        |
        v
    [ LT_SYMBOL LT_CONS ] --> [ LT_FIX LT_NULL ]
        |
        v
virtual heap
        A

Of course, this would be printed ((a 2)).

LT_SYMBOL cells contain the virtual address of a symbol record. The symbol record is stored on the virtual heap and is referenced through the user package hash table.

The dotted list (a . b) has one cons with the types

cons space ....
    [ LT_SYMBOL LT_SYMBOL ]
        |          |
        v          v
virtual heap ....
        A          B

The layout of an addr+typebits (sometimes called a quantum in Star Sapphire documentation) is as follows:

page index

    fedcba9876543210
    XXNIIIIIIIIIIIII

    X - reserved              2 bits
    N - required for vnil     1 bit
    I - page index           13 bits

page address

    fedcba9876543210
    TTTTTTTTTAAAAAAA

    T - type bits             9 bits
    A - page address bits     7 bits
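
As an illustration of this layout, the fields of the two 16 bit words can be picked apart with ldb. This is only a sketch in portable Common LISP, not code from vm.h; the function names are hypothetical, and the field positions are taken directly from the two bit diagrams.

```lisp
;; Hypothetical helpers; field widths and positions follow the bit diagrams.
(defun quantum-typebits (page-address-word)
  "Return the 9 type bits (bits 7-15) of a 16 bit page address word."
  (ldb (byte 9 7) page-address-word))

(defun quantum-cons-offset (page-address-word)
  "Return the 7 page address bits (bits 0-6) of a page address word."
  (ldb (byte 7 0) page-address-word))

(defun quantum-page-index (page-index-word)
  "Return the 13 page index bits (bits 0-12) of a page index word."
  (ldb (byte 13 0) page-index-word))

(defun quantum-vnil-bit (page-index-word)
  "Return bit 13 of a page index word, the bit required for vnil."
  (ldb (byte 1 13) page-index-word))

;; Build a page address word with typebits 10 and page address 5,
;; so that it can be taken apart again.
(defparameter *example-word* (dpb 10 (byte 9 7) 5))
```

Here (quantum-typebits *example-word*) gives 10 and (quantum-cons-offset *example-word*) gives 5.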

Virtual nil is a distinct illegal virtual address, which is one beyond the end of the workspace. A vnil addr has no typebits, nor should it be interpreted as having any type. It should not be confused with LISP nil, which has a valid virtual address.

Note that vnil has page index 2^13+1 and page address 0. vnil is used, e.g., in the hash tables to indicate an empty slot (since, in theory, 0 is a valid virtual address), or as an initializer for do variables with no step code, etc.

Following is the list of Star Sapphire typebit values which will be seen in objects actually produced by the reader. This therefore is the complete list of types available for use. This information has been extracted from ltypes.h and typetab.h. Some of the other values not given here represent types which are specified in Common LISP but not yet implemented.

Those with asterisks are not defined by Common LISP but are used internally by Star Sapphire to implement Common LISP.

Some of these values will only be seen in compiled code, notably lexvar and specsym.

C Define        Value   LISP Type

LT_NULL         0       null (will be printed as #<hexnum:hexnum>)
LT_ARRAY        1       generalized array
LT_BIGNUM       3       bignum
LT_BIT          4       bit (used internally in bit array representation)
LT_BITVECTOR    5       bit-vector
LT_CHARACTER    6       character
LT_CFUN         8       compiled-function
LT_CONS         10      cons
LT_FIX          12      fixnum
LT_FLOAT        13      float
LT_FUNCTION     14      function
LT_HASH         15      hash-table
LT_INTEGER      16      integer
LT_KEYWORD      17      keyword
LT_LEXVAR       18      * lexvar (lexical variable)
LT_NIL          21      nil (the type of the symbol nil)
LT_PACKAGE      23      package
LT_RSTATE       25      random-state
LT_RATIO        26      ratio
LT_STREAM       37      stream
LT_STRING       38      string
LT_SYMBOL       40      symbol
LT_VECTOR       42      vector
LT_USER         43      user defined structure (by defstruct)
LT_OPT_LLK      44      &optional lambda list keyword
LT_RES_LLK      45      &rest lambda list keyword
LT_KEY_LLK      46      &key lambda list keyword
LT_AOK_LLK      47      &allow-other-keys lambda list keyword
LT_AUX_LLK      48      &aux lambda list keyword
LT_BOD_LLK      49      &body lambda list keyword
LT_WHO_LLK      50      &whole lambda list keyword
LT_ENV_LLK      51      &environment lambda list keyword
LT_SPECSYM      55      * special symbol
LT_VJB          56      * virtual jump buffer (used by throw)
LT_MACRO        57      * compiled macro
LT_ZPROTO       58      * struct proto
LT_CLASS        59      user defined class (by defclass)
LT_CSLOTPROTO   60      * class slot prototype
LT_INSTANCE     61      product of make-instance
LT_METHOD       62      product of defmethod
LT_GENERIC      63      generic function
LT_METHCOMB     64      method combination

 

2.2 Type Overview

This section is a quick overview of the range of types available in Star Sapphire LISP.

Common LISP numeric types implemented in Star Sapphire LISP include three subcategories of integers, ratios, and one type of floating point.

Characters in general represent letters or text formatting operations. Star Sapphire characters normally represent characters in the ASCII set. An extended naming system in conformance with Common LISP can be used to name IBM PC specific key chords and the 8 bit IBM PC character set.

Strings are one-dimensional arrays of characters.

Symbols are named data objects. The name is stored as a string. LISP can locate a symbol object, given its name, via the package system.

Symbols have property lists, which allow symbols to be treated as record structures with an extensible set of named components, each of which may be any LISP object. Symbols are primarily used to name functions and variables within programs.

Lists are sequences of arbitrary length represented in the form of linked cells called conses. There is a distinct object (the symbol nil) that represents the empty list.

All other lists are built recursively by adding a new element to an existing list. This is accomplished by allocating a new cons, which is an object having two components called the car and the cdr.

The car may hold anything, and the cdr can be made to point to the previously existing list. Conses can be used completely generally as efficient two-element records, but their most important use is to construct lists. In the latter usage, they can be read in and printed in the 'dotted list' format.
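
A short sketch of this recursive construction in portable Common LISP (the function and variable names are made up for illustration): each call to cons allocates one new cell whose car holds the new element and whose cdr points at the list built so far.

```lisp
(defun build-list-from (elements)
  "Build a fresh list holding ELEMENTS in order, one cons at a time."
  (let ((result nil))                          ; start from the empty list
    (dolist (element (reverse elements) result)
      ;; New cons: car = element, cdr = the previously existing list.
      (setf result (cons element result)))))

;; A cons used as a plain two-element record rather than as a list cell.
(defparameter *pair* (cons 1 2))
```

(build-list-from '(a b c)) returns a list equal to (a b c), and (prin1-to-string *pair*) returns "(1 . 2)", the dotted notation for a cons whose cdr is not a list.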

Arrays are sequential collections of elements of identical size. One possible element type (the default) is a virtual address. Arrays of this kind are termed 'generalized' and can have any LISP objects at any location in the array.

Hash tables are used to map a unique key to an associated object very efficiently timewise. Typically a symbol is used as the key, although this implementation supports arbitrary hash table key types.

Packages are name spaces which map symbol names to their associated symbol records. The parser recognizes symbols by looking up character sequences using the package system. A package is essentially a named hash table.

Streams are names for data sources or sinks. A stream is basically a queue of bytes. Streams are used to perform input and output to files, most particularly to the console.

Random-states contain seeds for the random-number generator.

Structures are user-defined records. They are objects that have named components. The defstruct facility is used to define new structure types.

Functions are code objects that can be invoked as procedures. Functions can take arguments and return values.

Classes are user-defined structures which can be associated with code (methods). The elements which make up a given class can be inherited by subclasses.

The following sections give more information on the categories sketched out above.

2.3 Number types

Various kinds of numbers are defined in Common LISP. They are divided into integers, ratios, and floating-point numbers.

Refer to the following sections:

2.3.1 Integer types

2.3.2 Ratio type

2.3.3 Floating-Point types

 

 

2.3.1 Integer types

Common LISP specifies a true integer data type: Any integer, positive or negative, has (in theory) a representation as a Common LISP object.

If the integer can be represented in 2's complement, signed 16 bit, format (-32768 to 32767), it is called a fixnum and stored directly in the virtual address cell.

If the integer can be represented in 2's complement, signed 32 bit, format (-2,147,483,648 to 2,147,483,647), it is stored on the heap.

If the integer exceeds this size, it is stored as a bignum, a 2's complement signed arbitrary precision number. Bignums are limited internally to 256 bytes (or 2048 bits) of storage.

The distinction between the three subcategories of integer is completely transparent to the user.

When an integer is read, it is stored internally in the most efficient format. Conversions are performed internally between the three types when an operation would overflow or less storage would be required.
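
The three ranges can be summarized in a small sketch. The function and the keyword names are hypothetical and only restate the limits given in this section; they are not part of Star Sapphire, where the choice of representation is made internally.

```lisp
(defun integer-storage-class (n)
  "Classify integer N by the storage ranges described in this section."
  (check-type n integer)
  (cond ((<= -32768 n 32767) :fixnum)                  ; signed 16 bit, immediate
        ((<= -2147483648 n 2147483647) :heap-integer)  ; signed 32 bit, on the heap
        (t :bignum)))                                  ; arbitrary precision
```

With these limits, (integer-storage-class 256) returns :fixnum, (integer-storage-class 220000000) returns :heap-integer, and (integer-storage-class 15511210043330985984000000) returns :bignum.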

Integers are ordinarily written in decimal notation, as a sequence of decimal digits, optionally preceded by a sign and optionally followed by a decimal point. For example:

0                               ; zero
-0                              ; the same as 0
+2                              ; the first prime number
13                              ; the sixth prime number
256                             ; two to the eighth power, still a fixnum
220000000                       ; approx. population of US, a 'true' integer
15511210043330985984000000      ; 25 factorial (25!), definitely a bignum

 

Integers may be written and printed in bases other than ten. The notation:

#nnrddddd or #nnRddddd

means the integer in base-nn notation denoted by the digits ddddd. The base must be between 2 and 36, inclusive. The letter r stands for radix, another word for base. This macrocharacter is called the radix indicator.

In other words, the character #, followed by a non-empty sequence of decimal digits representing an unsigned decimal integer n, followed by the letter r or R, an optional sign, and a sequence of base-n digits, indicates an integer written in base n.

Legal digits for the specified base use the usual digits for digits up to and including 9 and use letters of the alphabet of either case for successive digits, as required to represent numbers in the base. For example, base 20 uses the digits 0-9 and the letters A-J or a-j for additional digits. Base 3 uses digits 0-2, etc.

Binary (2), octal (8), and hexadecimal (16) bases are commonly used. These bases have the abbreviations #B or #b for Binary, #O or #o for Octal, and #X or #x for heXadecimal.

Examples of non-decimal radices:

#32rstar        ; decimal 947547
#3r102          ; decimal 11
#b101111111     ; decimal 383
#xdd            ; decimal 221
#o-200          ; decimal -128
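
Conversions like these can be checked from LISP itself. The sketch below uses only standard Common LISP operators (parse-integer with :radix for reading digits, write-to-string with :base for printing them); the wrapper names are made up for illustration.

```lisp
(defun digits->integer (digits radix)
  "Interpret the string DIGITS as an integer written in base RADIX (2-36)."
  (parse-integer digits :radix radix))

(defun integer->digits (n radix)
  "Return the digits of integer N in base RADIX as a string, without a prefix."
  (write-to-string n :base radix :radix nil))
```

For instance, (digits->integer "star" 32) returns 947547 and (digits->integer "-200" 8) returns -128, while (integer->digits 383 2) returns "101111111".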

 

2.3.2 Ratio type

A ratio is a numeric type which is the mathematical ratio of two integers, a fraction. Integers and ratios make up the type rational.

Rationals have a 'canonical' (best) representation. This will be an integer if the value is integral; otherwise it will be the ratio of two integers with no common factor greater than one. If the fraction is negative, the numerator will be negative; the denominator will always be positive. If any computation produces a result that is a ratio of two integers such that the numerator is evenly divisible by the denominator, the result is immediately converted to the integer. This is the rule of rational canonicalization.

Ratios are optionally preceded by a radix indicator and a sign. A ratio with a canonical representation as a fraction is written with a division sign, /, as a separator. There must not be any spaces between any of these items. The fraction's numerator and denominator can be integers written in any base. If a non-default base is used to write a fractional ratio, the base is specified only once, before the numerator; both halves must be written in the same base. The denominator may not be zero.

For instance:

5/4
10/8                ; non-canonical version of the preceding example
-1/2
98797997979/13      ; bignums work, too
#xDE/AD             ; decimal 222/173

The reader will accept non-canonical fractions, but Star Sapphire LISP performs the rational canonicalization before turning the ratio into an internal representation. The printed representation of a rational will always be in canonical form.
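
The canonicalization rule can be sketched with gcd. This is an illustration in portable Common LISP, not Star Sapphire's internal routine, and the function name is hypothetical: it divides out the greatest common divisor and moves any sign to the numerator.

```lisp
(defun canonical-parts (num den)
  "Return the canonical numerator and denominator of NUM/DEN as two values.
DEN must not be zero."
  (assert (not (zerop den)))
  (let ((divisor (gcd num den))             ; gcd is never negative
        (sign (if (minusp den) -1 1)))      ; keeps the denominator positive
    (values (* sign (/ num divisor))
            (* sign (/ den divisor)))))
```

(canonical-parts 10 8) returns 5 and 4, (canonical-parts 1 -2) returns -1 and 2, and (canonical-parts 8 4) returns 2 and 1; a denominator of 1 means the value is an integer.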

The way that Star Sapphire LISP rational canonicalization is performed is slightly different from the Common LISP specification, which states that ratios can have a non-canonical internal representation, and are only converted when they are results of computation or printed using prin1. Other than this, ratios should work as specified, particularly in terms of results.

 

2.3.3 Floating-Point types

Floating-point numbers are notated either in fractional decimal format or in computerized scientific notation. Here is the syntax:

floating-point-number ::= [sign] {digit}* decimal-point {digit}+ [exponent]
                        | [sign] {digit}+ [decimal-point {digit}*] exponent
sign ::= + | -
decimal-point ::= .
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
exponent ::= exponent-marker [sign] {digit}+
exponent-marker ::= e | s | f | d | l | E | S | F | D | L

If there is no exponent specifier, then the decimal point is required, and there must be digits after it.

The exponent specifier consists of an exponent marker, an optional sign, and a non-empty sequence of digits.

One kind of floating point number is provided, which is stored internally in double precision format. Other Common LISPs allow up to four levels of precision for floating point numbers, indicated by the letter used as the exponent marker.

The letters e, s, f, d, and l (and their respective uppercase equivalents) are all accepted by the reader as exponent markers for compatibility with other Common LISP implementations, but have no effect on the internal representation. The printed representation will always use E as the exponent marker.

Examples of floating-point numbers:

0.0             ; floating-point zero in default format
0e0             ; also floating-point zero in default format
-.0             ; also a zero
0.              ; the integer zero, not a floating-point zero!
0.0f0           ; yet another floating-point zero
3.14159         ; pi
2.71828e0       ; e
2.997924e10     ; speed of light in cm/sec.
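
One way to check how a token is classified is to hand it to the reader and test the result. The sketch below is portable Common LISP with a hypothetical helper name. Note that in other implementations the exponent marker may select a different float format, whereas Star Sapphire stores all floats in one format.

```lisp
(defun token-kind (token)
  "Read the string TOKEN and report whether it denotes an integer or a float."
  (let ((object (let ((*read-eval* nil))    ; never evaluate #. forms
                  (read-from-string token))))
    (cond ((integerp object) :integer)
          ((floatp object) :float)
          (t :other))))
```

(token-kind "0.") returns :integer, because a trailing decimal point with no digits after it still denotes an integer, while (token-kind "0.0"), (token-kind "0e0") and (token-kind "-.0") all return :float.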

 

2.4 Character types

Character objects store a 16 bit quantity (of which only 8 bits are used in the US version of Star Sapphire Common LISP).

The printed representation of characters is as follows.

Character objects are written with the sequence #\ followed by the name of the character.

For 'normal' characters (in the ASCII range ! to ~) the name is simply the character itself. For example, #\Z means the character object for an uppercase Z.

Other characters have been assigned symbolic names, such as space, escape, newline, meta-control-z etc. The syntax for character names after #\ is the same as that for symbols (i.e., normally case-independent).

For example, #\Newline or #\NeWLINe means the newline character. Character symbolic names are conventionally written with their first letter capitalized in Common LISP. This is not a requirement, however.

For more information, refer to:

2.4.1 Standard Character Set

2.4.2 Semi-Standard Character Set

2.4.3 Extended Character Set

 

2.4.1 Standard Character Set

Common LISP defines a standard character set to allow portability. Programs which only use this set of characters in code and string contents are guaranteed portable to other Common LISP implementations.

The Common LISP character set consists of a space character #\Space, a newline character #\Newline, and the following ninety-four non-blank printing characters:

! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

@ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _

` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~

The Common LISP standard character set is thus equivalent to the ninety-five standard ASCII printing characters plus a newline character.

The single character #\Newline serves as a line delimiter within the language. It is mapped to the native environment's line separator, whatever that may be. For instance, in DOS this becomes carriage return followed by line feed. If you use #\Newline when printing instead of hard-coding #\Return and #\Linefeed, your software will still move the cursor to the start of the next line at that point when ported to other Common LISPs. This is much like the use of \n in the C language.
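
A minimal sketch of the portable style (the function name is hypothetical): the line break is written as a single newline character, never as an explicit carriage return and line feed pair.

```lisp
(defun two-line-message (first second)
  "Return FIRST and SECOND as one string separated by a single newline character."
  (with-output-to-string (out)
    (write-string first out)
    (write-char #\Newline out)   ; portable line break
    (write-string second out)))
```

The ~% format directive emits the same character, so (two-line-message "a" "b") is equal to (format nil "a~%b").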

The following characters are used in only limited ways in the syntax of Common LISP programs, although they are used in string data:

[ ] { } ? ! ^ _ ~ $ %

For the most part, the use of the above characters is reserved for use by the implementor or user:

 

Reserved for Implementor

Reserved for User

[

X

]

X

{

X

}

X

?

X

!

X

^

X

_

X

~

X

The characters $ and % are treated as alphabetic characters but are not usually used in Common LISP symbol names.

Star Sapphire LISP has appropriated ? and ! only for the #? (describe) and #! (shell escape) macrocharacter syntax. Otherwise, you can use these characters in symbol names freely.

2.4.2 Semi-Standard Character Set

The following characters are called semi-standard:

Name            Octal Value

#\Backspace     010
#\Tab           011
#\Linefeed      012
#\Page          014
#\Return        015
#\Rubout        177

Star Sapphire LISP also supports the character name #\Escape which corresponds to the octal value 033.

 

2.4.3 Extended Character Set

The following notation is used to represent a full 8 bit character set. The exact mapping here is not portable to other Common LISPs, but similar to other implementations' extensions to the character set.

This is done by prefixing one or two hyphenated prefixes to a base alphabetic character in the range @ to Rubout.

The Common LISP specification specifies the use of character name prefixes control-, meta- and hyper-. Star Sapphire uses the first two of these. The use of the prefix control- or c- indicates a 'control' character; the use of the prefix meta- or m- indicates a 'meta' (or alt) character. Once again, these prefixes can be read in using either case.

By prefixing control- or c-, 64 is subtracted from the alphabetic character. If the meta- prefix is not used, and the base character is a letter of the alphabet, the character is first converted to the uppercase equivalent by subtracting 32. Thus control-c can be written #\Control-C, #\Control-c, #\c-C or #\c-c.

By prefixing meta- or m-, 128 is added to the character. In this case, if the control- or c- prefix is also present, 64 is subtracted regardless of the case of the base character.

These characters will primarily be of use to print the IBM PC character set, which has a useful range of box characters, accented letters, greek letters and miscellaneous other glyphs such as happy faces, arrows and card suits in the ranges 0-31 and 127-255. These can be printed, for example, using the ~c format directive.

For instance, try these:

(format t "~c=3.14159~%" #\Meta-c) ; hint: #\Meta-c prints a greek letter

(format t "from here to ~c" #\Meta-l)

(format t "~cfeliz navidad y un prospero a~co nuevo!~%" #\m-c-m #\m-c-d)

Code like this which uses the IBM PC character set is inherently non-portable, of course.
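
The prefix arithmetic described in this section can be sketched as a small function. This is an illustration only: the function name and keyword arguments are hypothetical, it assumes char-code returns ASCII codes for the base character, and it simply returns the resulting 8 bit code rather than a character object.

```lisp
(defun extended-char-code (base &key control meta)
  "Return the 8 bit code named by BASE with optional control- and meta- prefixes."
  (let ((code (char-code base)))
    (when control
      ;; Without meta-, a letter is first folded to upper case (subtract 32).
      (when (and (not meta) (alpha-char-p base))
        (setf code (char-code (char-upcase base))))
      (decf code 64))
    (when meta
      (incf code 128))
    code))
```

Under these assumptions, (extended-char-code #\c :control t) returns 3 (control-c), (extended-char-code #\c :meta t) returns 227, and (extended-char-code #\m :control t :meta t) returns 173, the code named by #\m-c-m in the format example.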

Here is a complete table of the 256 extended ASCII characters, each with a 'canonical' Star Sapphire LISP name and its decimal, hexadecimal, octal and binary values.

Dec  Hex  Oct  Binary     Name

0    0    0    0000 0000  #\Control-@
1    1    1    0000 0001  #\Control-a
2    2    2    0000 0010  #\Control-b
3    3    3    0000 0011  #\Control-c
4    4    4    0000 0100  #\Control-d
5    5    5    0000 0101  #\Control-e
6    6    6    0000 0110  #\Control-f
7    7    7    0000 0111  #\Control-g
8    8    10   0000 1000  #\Backspace
9    9    11   0000 1001  #\Tab
10   a    12   0000 1010  #\Linefeed
11   b    13   0000 1011  #\Control-k
12   c    14   0000 1100  #\Page
13   d    15   0000 1101  #\Return
14   e    16   0000 1110  #\Control-n
15   f    17   0000 1111  #\Control-o
16   10   20   0001 0000  #\Control-p
17   11   21   0001 0001  #\Control-q
18   12   22   0001 0010  #\Control-r
19   13   23   0001 0011  #\Control-s
20   14   24   0001 0100  #\Control-t
21   15   25   0001 0101  #\Control-u
22   16   26   0001 0110  #\Control-v
23   17   27   0001 0111  #\Control-w
24   18   30   0001 1000  #\Control-x
25   19   31   0001 1001  #\Control-y
26   1a   32   0001 1010  #\Control-z
27   1b   33   0001 1011  #\Escape
28   1c   34   0001 1100  #\Control-\
29   1d   35   0001 1101  #\Control-]
30   1e   36   0001 1110  #\Control-^
31   1f   37   0001 1111  #\Control-_
32   20   40   0010 0000  #\Space
33   21   41   0010 0001  #\!
34   22   42   0010 0010  #\"
35   23   43   0010 0011  #\#
36   24   44   0010 0100  #\$
37   25   45   0010 0101  #\%
38   26   46   0010 0110  #\&
39   27   47   0010 0111  #\'
40   28   50   0010 1000  #\(
41   29   51   0010 1001  #\)
42   2a   52   0010 1010  #\*
43   2b   53   0010 1011  #\+
44   2c   54   0010 1100  #\,
45   2d   55   0010 1101  #\-
46   2e   56   0010 1110  #\.
47   2f   57   0010 1111  #\/
48   30   60   0011 0000  #\0
49   31   61   0011 0001  #\1
50   32   62   0011 0010  #\2
51   33   63   0011 0011  #\3
52   34   64   0011 0100  #\4
53   35   65   0011 0101  #\5
54   36   66   0011 0110  #\6
55   37   67   0011 0111  #\7
56   38   70   0011 1000  #\8
57   39   71   0011 1001  #\9
58   3a   72   0011 1010  #\:
59   3b   73   0011 1011  #\;
60   3c   74   0011 1100  #\<
61   3d   75   0011 1101  #\=
62   3e   76   0011 1110  #\>
63   3f   77   0011 1111  #\?
64   40   100  0100 0000  #\@
65   41   101  0100 0001  #\A
66   42   102  0100 0010  #\B
67   43   103  0100 0011  #\C
68   44   104  0100 0100  #\D
69   45   105  0100 0101  #\E
70   46   106  0100 0110  #\F
71   47   107  0100 0111  #\G
72   48   110  0100 1000  #\H
73   49   111  0100 1001  #\I
74   4a   112  0100 1010  #\J
75   4b   113  0100 1011  #\K
76   4c   114  0100 1100  #\L
77   4d   115  0100 1101  #\M
78   4e   116  0100 1110  #\N
79   4f   117  0100 1111  #\O
80   50   120  0101 0000  #\P
81   51   121  0101 0001  #\Q
82   52   122  0101 0010  #\R
83   53   123  0101 0011  #\S
84   54   124  0101 0100  #\T
85   55   125  0101 0101  #\U
86   56   126  0101 0110  #\V
87   57   127  0101 0111  #\W
88   58   130  0101 1000  #\X
89   59   131  0101 1001  #\Y
90   5a   132  0101 1010  #\Z
91   5b   133  0101 1011  #\[
92   5c   134  0101 1100  #\\
93   5d   135  0101 1101  #\]
94   5e   136  0101 1110  #\^
95   5f   137  0101 1111  #\_
96   60   140  0110 0000  #\`
97   61   141  0110 0001  #\a
98   62   142  0110 0010  #\b
99   63   143  0110 0011  #\c
100  64   144  0110 0100  #\d
101  65   145  0110 0101  #\e
102  66   146  0110 0110  #\f
103  67   147  0110 0111  #\g
104  68   150  0110 1000  #\h
105  69   151  0110 1001  #\i
106  6a   152  0110 1010  #\j
107  6b   153  0110 1011  #\k
108  6c   154  0110 1100  #\l
109  6d   155  0110 1101  #\m
110  6e   156  0110 1110  #\n
111  6f   157  0110 1111  #\o
112  70   160  0111 0000  #\p
113  71   161  0111 0001  #\q
114  72   162  0111 0010  #\r
115  73   163  0111 0011  #\s
116  74   164  0111 0100  #\t
117  75   165  0111 0101  #\u
118  76   166  0111 0110  #\v
119  77   167  0111 0111  #\w
120  78   170  0111 1000  #\x
121  79   171  0111 1001  #\y
122  7a   172  0111 1010  #\z
123  7b   173  0111 1011  #\{
124  7c   174  0111 1100  #\|
125  7d   175  0111 1101  #\}
126  7e   176  0111 1110  #\~
127  7f   177  0111 1111  #\Rubout
128  80   200  1000 0000  #\Meta-Control-@
129  81   201  1000 0001  #\Meta-Control-A
130  82   202  1000 0010  #\Meta-Control-B
131  83   203  1000 0011  #\Meta-Control-C
132  84   204  1000 0100  #\Meta-Control-D
133  85   205  1000 0101  #\Meta-Control-E
134  86   206  1000 0110  #\Meta-Control-F
135  87   207  1000 0111  #\Meta-Control-G
136  88   210  1000 1000  #\Meta-Control-H
137  89   211  1000 1001  #\Meta-Control-I
138  8a   212  1000 1010  #\Meta-Control-J
139  8b   213  1000 1011  #\Meta-Control-K
140  8c   214  1000 1100  #\Meta-Control-L
141  8d   215  1000 1101  #\Meta-Control-M
142  8e   216  1000 1110  #\Meta-Control-N
143  8f   217  1000 1111  #\Meta-Control-O
144  90   220  1001 0000  #\Meta-Control-P
145  91   221  1001 0001  #\Meta-Control-Q
146  92   222  1001 0010  #\Meta-Control-R
147  93   223  1001 0011  #\Meta-Control-S
148  94   224  1001 0100  #\Meta-Control-T
149  95   225  1001 0101  #\Meta-Control-U
150  96   226  1001 0110  #\Meta-Control-V
151  97   227  1001 0111  #\Meta-Control-W
152  98   230  1001 1000  #\Meta-Control-X
153  99   231  1001 1001  #\Meta-Control-Y
154  9a   232  1001 1010  #\Meta-Control-Z
155  9b   233  1001 1011  #\Meta-Control-[
156  9c   234  1001 1100  #\Meta-Control-\
157  9d   235  1001 1101  #\Meta-Control-]
158  9e   236  1001 1110  #\Meta-Control-^
159  9f   237  1001 1111  #\Meta-Control-_
160  a0   240  1010 0000  #\Meta-Control-`
161  a1   241  1010 0001  #\Meta-Control-a
162  a2   242  1010 0010  #\Meta-Control-b
163  a3   243  1010 0011  #\Meta-Control-c
164  a4   244  1010 0100  #\Meta-Control-d
165  a5   245  1010 0101  #\Meta-Control-e
166  a6   246  1010 0110  #\Meta-Control-f
167  a7   247  1010 0111  #\Meta-Control-g
168  a8   250  1010 1000  #\Meta-Control-h
169  a9   251  1010 1001  #\Meta-Control-i
170  aa   252  1010 1010  #\Meta-Control-j
171  ab   253  1010 1011  #\Meta-Control-k
172  ac   254  1010 1100  #\Meta-Control-l
173  ad   255  1010 1101  #\Meta-Control-m
174  ae   256  1010 1110  #\Meta-Control-n
175  af   257  1010 1111  #\Meta-Control-o
176  b0   260  1011 0000  #\Meta-Control-p
177  b1   261  1011 0001  #\Meta-Control-q
178  b2   262  1011 0010  #\Meta-Control-r
179  b3   263  1011 0011  #\Meta-Control-s
180  b4   264  1011 0100  #\Meta-Control-t
181  b5   265  1011 0101  #\Meta-Control-u
182  b6   266  1011 0110  #\Meta-Control-v
183  b7   267  1011 0111  #\Meta-Control-w
184  b8   270  1011 1000  #\Meta-Control-x
185  b9   271  1011 1001  #\Meta-Control-y
186  ba   272  1011 1010  #\Meta-Control-z
187  bb   273  1011 1011  #\Meta-Control-{
188  bc   274  1011 1100  #\Meta-Control-|
189  bd   275  1011 1101  #\Meta-Control-}
190  be   276  1011 1110  #\Meta-Control-~
191  bf   277  1011 1111  #\Meta-Control-Rubout
192  c0   300  1100 0000  #\Meta-@
193  c1   301  1100 0001  #\Meta-A
194  c2   302  1100 0010  #\Meta-B
195  c3   303  1100 0011  #\Meta-C
196  c4   304  1100 0100  #\Meta-D
197  c5   305  1100 0101  #\Meta-E
198  c6   306  1100 0110  #\Meta-F
199  c7   307  1100 0111  #\Meta-G
200  c8   310  1100 1000  #\Meta-H
201  c9   311  1100 1001  #\Meta-I
202  ca   312  1100 1010  #\Meta-J
203  cb   313  1100 1011  #\Meta-K
204  cc   314  1100 1100  #\Meta-L
205  cd   315  1100 1101  #\Meta-M
206  ce   316  1100 1110  #\Meta-N
207  cf   317  1100 1111  #\Meta-O
208  d0   320  1101 0000  #\Meta-P
209  d1   321  1101 0001  #\Meta-Q
210  d2   322  1101 0010  #\Meta-R
211  d3   323  1101 0011  #\Meta-S
212  d4   324  1101 0100  #\Meta-T
213  d5   325  1101 0101  #\Meta-U
214  d6   326  1101 0110  #\Meta-V
215  d7   327  1101 0111  #\Meta-W
216  d8   330  1101 1000  #\Meta-X
217  d9   331  1101 1001  #\Meta-Y
218  da   332  1101 1010  #\Meta-Z
219  db   333  1101 1011  #\Meta-[
220  dc   334  1101 1100  #\Meta-\
221  dd   335  1101 1101  #\Meta-]
222  de   336  1101 1110  #\Meta-^
223  df   337  1101 1111  #\Meta-_
224  e0   340  1110 0000  #\Meta-`
225  e1   341  1110 0001  #\Meta-a
226  e2   342  1110 0010  #\Meta-b
227  e3   343  1110 0011  #\Meta-c
228  e4   344  1110 0100  #\Meta-d
229  e5   345  1110 0101  #\Meta-e
230  e6   346  1110 0110  #\Meta-f
231  e7   347  1110 0111  #\Meta-g
232  e8   350  1110 1000  #\Meta-h
233  e9   351  1110 1001  #\Meta-i
234  ea   352  1110 1010  #\Meta-j
235  eb   353  1110 1011  #\Meta-k
236  ec   354  1110 1100  #\Meta-l
237  ed   355  1110 1101  #\Meta-m
238  ee   356  1110 1110  #\Meta-n
239  ef   357  1110 1111  #\Meta-o
240  f0   360  1111 0000  #\Meta-p
241  f1   361  1111 0001  #\Meta-q
242  f2   362  1111 0010  #\Meta-r
243  f3   363  1111 0011  #\Meta-s
244  f4   364  1111 0100  #\Meta-t
245  f5   365  1111 0101  #\Meta-u
246  f6   366  1111 0110  #\Meta-v
247  f7   367  1111 0111  #\Meta-w
248  f8   370  1111 1000  #\Meta-x
249  f9   371  1111 1001  #\Meta-y
250  fa   372  1111 1010  #\Meta-z
251  fb   373  1111 1011  #\Meta-{
252  fc   374  1111 1100  #\Meta-|
253  fd   375  1111 1101  #\Meta-}
254  fe   376  1111 1110  #\Meta-~
255  ff   377  1111 1111  #\Meta-Rubout

 

2.5 String Character types

The first edition of Common LISP: The Language specified a distinct type, string-char, for characters which can be stored in strings; it was defined as any character with zero font and bits attributes. ANSI Common LISP eliminates this type and defines a new type, base-char (called base-character in the second edition of Steele), which an implementation can simply define to be equivalent to character.

The Star Sapphire character type (in the non-Japanese version) has always implied an 8-bit character, with no distinct internal character types. It has always been possible to store any character in a string.

Support for the type name string-char has been removed from this implementation.

Thus, Star Sapphire code which uses the type character will be portable both back to versions of Common LISP based on the first edition of Steele and forward to ANSI Common LISP. We do not advise using string-char in new code.

2.6 Symbol Type

LISP is closely identified with symbolic processing. LISP symbols are roughly equivalent to global variables or identifiers in other languages. However, symbols in LISP go far beyond these constructs.

Other articles in this section about symbols are:

2.6.1 Symbol Names

2.6.1.1 Symbol Names Style Note

2.6.2 Lexical variables (lexvars)

 

Besides a name and a value, symbols can have a package, an arbitrary functional value and a property list.

Every symbol has a name, called its print name. This is stored internally in the same format as strings; the print name can be retrieved using the symbol-name function. Valid symbol names are discussed below in Symbol Names.

Symbols have a global value; they can be used to store any LISP object. If a global symbol's name is typed into the interpreter, it evaluates to the value of the symbol. The setq special form or the set function can be used to assign a global value to a symbol. A symbol is 'unbound' until it is assigned to; that is, it has no value (internally in Star Sapphire, its value in this case is vnil). The makunbound function can be used to strip a symbol of its current global value.

Note that special binding can override in a local context the global value of a symbol. In addition, lexical variables can be defined which have the same name as a global symbol within a local context. In this case, the lexical variable substitutes for the global symbol and its value takes precedence.

Although lexical variables are written exactly like symbols, they have a very different implementation, which must be understood in order to program competently in Common LISP. See below, as well as Scope and Extent.

Symbols are organized into packages. The print name is used to uniquely identify a given symbol in a package, using an internal hash table which each package has. The package of a given symbol is stored in the symbol as well. Symbols can be created (using the gensym function) which do not have a package. The intern function can be used to put a symbol into a given package.

Symbols can be associated with a functional value using defun. This is also stored in a slot in the symbol record.

Lastly, symbols have a property list, or plist. This is a list which should always have an even number of components. It consists of alternating symbols, which serve as the names of properties, and arbitrary objects, which are the property values. The property list is used by both the user and the system to store additional information about the symbol.
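The slots described above can be inspected with standard Common LISP functions. The following is a minimal sketch, assuming a standard Common LISP environment; the names *demo-counter*, demo-square, symbol-tour and the author property are hypothetical and exist only for this illustration.

```lisp
;; Sketch only: *demo-counter*, demo-square and symbol-tour are
;; hypothetical names invented for this example.
(defvar *demo-counter*)              ; declared special, but left unbound

(defun demo-square (n) (* n n))      ; fills the function slot of DEMO-SQUARE

(defun symbol-tour ()
  "Collect observations about the name, value, function and plist slots."
  (let ((results '()))
    (push (symbol-name 'demo-square) results)        ; the print name
    (push (boundp '*demo-counter*) results)          ; false: no value yet
    (setq *demo-counter* 42)                         ; assign a global value
    (push (symbol-value '*demo-counter*) results)    ; 42
    (makunbound '*demo-counter*)                     ; strip the value again
    (push (boundp '*demo-counter*) results)          ; false once more
    (push (funcall (symbol-function 'demo-square) 7) results)  ; 49
    (setf (get 'demo-square 'author) "me")           ; add a property
    (push (get 'demo-square 'author) results)        ; "me"
    (nreverse results)))
```

Calling (symbol-tour) returns the observations in the order they were made. The print name comes back in uppercase because the reader upcases unescaped symbol names.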

 

2.6.1 Symbol Names

A symbol name consists of any non-empty string of alphanumeric characters and other standard printing characters, normally excluding space and parentheses. However, these can be included in a symbol's print name using the escape syntax if desired.

The name must not be one that can be confused with a number. Unlike most other programming languages (except COBOL!), a symbol name can start with a digit, but it must diverge from legal number syntax at some later point. From an implementation point of view, this means that Star Sapphire symbols are the tokens rejected by the number scanner.

In Star Sapphire Common LISP, symbol names are essentially unlimited in length; however, other implementations may impose some limit on the number of significant characters.

Symbols are conventionally written in lowercase letters. These are translated to uppercase letters internally when a valid symbol is detected. When printed, the symbol name will show up in uppercase.

The following letters are normally used in symbol names in addition to the alphanumerics:

+ - * / @ $ % ^ & _ = < > ~ .

The period has certain restrictions on its use. A period by itself is considered the dot separator in a cons. Common LISP does not allow unescaped symbol names consisting entirely of dots (although this is not currently enforced in Star Sapphire LISP).

The following characters should not routinely be used in symbol names:

? ! [ ] { }

Although parentheses and spaces can be inserted into symbol names using the escape mechanism, this is not advised either.

If lowercase letters, parentheses or spaces must be part of a symbol's name, there are a couple of ways to do this.

The first is to insert a backslash in front of the character; this will escape one character. For instance:

\now\is\the\time

will be stored internally as:

nOWiStHEtIME

The second method is to surround the entire symbol or sections of the symbol with vertical bars:

|(Now) Is The Time.|

which is the equivalent of writing a backslash in front of every character. This symbol will be named:

(Now) Is The Time.

This method is useful if more than one character is to be escaped.

Or consider:

|xxx|bbb|yyy|

which has the print name:

xxxBBByyy

If it is necessary to write a backslash or vertical bar in a portion escaped by vertical bars, you must put a backslash in front of it:

|\|etaoin\\shurdlu\||

is stored with a print name of

|etaoin\shurdlu|
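The escape rules above can be checked by asking for the print name of each example with symbol-name. This is a small sketch, assuming the standard reader settings (unescaped letters are upcased); escaped-symbol-names is a hypothetical helper name.

```lisp
;; Sketch only: escaped-symbol-names is a hypothetical helper.
;; It assumes the default reader behavior of upcasing unescaped letters.
(defun escaped-symbol-names ()
  "Return the print names of the escaped symbols used as examples."
  (list (symbol-name 'foo)                      ; no escapes: upcased
        (symbol-name '\now\is\the\time)         ; backslash escapes one character
        (symbol-name '|(Now) Is The Time.|)     ; bars escape everything inside
        (symbol-name '|xxx|bbb|yyy|)            ; only the barred sections are escaped
        (symbol-name '|\|etaoin\\shurdlu\||)))  ; backslash inside bars
```

The print names are returned as strings, so a backslash in a name is itself shown escaped when the string is printed.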

See also: 2.6.1.1 Symbol Names Style Note

 

2.6.1.1 Symbol Names Style Note

Although the ability to write arbitrary symbol names is a fascinating feature of Common LISP, it is hard to see where this feature is very useful in real world circumstances (other than as a cocktail party topic).

As a matter of style, if you are writing code you or someone else will have to look at in the future, using descriptive symbol names without unusual characters will make your code much easier to read and maintain.

Here are some more normal LISP symbol names:

x

y

pi

multiple-value-setq

foo-bar

&optional ; a lambda list keyword

:start ; a keyword (note that the colon is not really part of the name, see Packages).

character-p ; a predicate which tests for character-ness

my-variable ; a global variable

*x15* ; a global variable

*the-speed-of-light* ; a global variable, a constant (I hope)

As you can see, most normal LISP symbol names consist of one or more hyphenated segments, each of which consists of a simple descriptive alphanumeric string.

Global variables by convention are surrounded by asterisks to make code containing them more maintainable.

Predicates are by convention given the suffix -p; LISP programmers use this convention in daily life. For instance, two LISP programmers discussing what to have for dinner: "Chinese food-p?" "t". (Translation: "Do you want to eat Chinese food?" "Sure").

 

2.6.2 Lexical variables (lexvars)

Star Sapphire has an internal type called a lexvar (lexical variable) which is used to implement lexical scoping due to the incrementally compiled nature of the system. These are very common in debugger output so it will be useful to know about them if you are using the debugger.

Lexical variables are written by the user in exactly the same way as symbols. They are translated into lexical variables by the incremental compiler.

The incremental compiler converts symbols which represent lexically scoped variables into an internal data type which refers to an offset in the stack, indirect from a given stack frame. This is stored directly in the virtual address, much like a fixnum or character. Once a lexical variable has been compiled, it loses all association with its symbol's print name.

Lexvars are printed in the format #<LEXVAR integer:integer>, where the first integer is the relative offset of the stack frame and the second integer is the offset in that stack frame. The #<> syntax is used so that they cannot be read back in accidentally.

For instance, in the incrementally compiled code for

(defun foo (x)
  (* x x))

The symbol x in the body of the form beginning with the symbol * (multiply) will be translated into a lexical variable which will be printed as #<LEXVAR 1:0>. Why is this 1:0 instead of 0:0? This is because the defun gets translated internally into

(defun foo (x)
  (block foo
    (* x x)))

The block creates a stack frame, as does the function invocation. Hence, x refers to the argument in the function's stack frame, which lies one frame beyond the block's frame (relative offset 1) as seen from the use in the multiply form.

As stated above, lexvars cannot normally be read back in per se. Star Sapphire Common LISP supports a #&number:number macro syntax which allows lexvars to be read in from files. Use of this is not advised and it is subject to change. It is included to support the output from certain translation programs which are only available to Star Sapphire Common LISP source code licensees.

2.7 Lists and Conses

The LISP language is based on list manipulation. All other objects are normally read into, stored in, and printed out as part of a list.

The data object used to implement LISP lists is the cons. This is short for list CONStructor.

A cons is a structure composed of the addresses of two other objects. The two components are called the car and the cdr.

In formal terms, a list is recursively defined to be either a cons whose cdr component is a list or the empty list.

A list is therefore a sequence of conses linked by their cdr components. The last cdr in the list contains nil, the empty list.

The car components of the conses are called the elements of the list. There is a cons for each element of the list. The empty list is considered to have no elements, and hence no conses.

The difference between a cons and a list therefore hinges on nil. A list can be empty; there is no such thing as an empty cons.

A list is notated by writing the elements of the list in sequence, separated by spaces (or tabs or newlines, etc.), surrounded by parentheses.

For example:

(1 2 3) ; list of three integers

((foo bar) 3.14159 "hello") ; list of three elements: a list, a floating point number and a string

The empty list nil is written as (). This is because nil is a list with no elements.
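The relationship between conses, lists and nil can be explored with a few standard functions. The following is a minimal sketch; list-basics is a hypothetical name used only for this illustration.

```lisp
;; Sketch only: list-basics is a hypothetical helper name.
(defun list-basics ()
  "Build (1 2 3) one cons at a time and examine it."
  (let ((c (cons 1 (cons 2 (cons 3 nil)))))
    (list (equal c '(1 2 3))   ; true: same structure as the written list
          (car c)              ; 1, the first element
          (cdr c)              ; (2 3), the rest of the list
          (consp c)            ; true: a non-empty list is a cons
          (consp '())          ; false: the empty list is not a cons
          (listp '())          ; true: but it is a list
          (eq '() nil))))      ; true: () and nil are the same object
```

The fifth and sixth results show the distinction drawn above: the empty list is a list but not a cons.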

2.7.1 List Diagramming

The traditional ASCII graphical representation of a cons looks like this:

  car cdr
 +---+---+
 | o | o-+---->
 +-|-+---+
   |
   V

In this document, the following abbreviated character graphic will be used to illustrate a cons:

[+|+]--
 |

The hash sign will be used to indicate nil:

[+|#]
 |
 A

For instance, the above diagrams the list (A).

Hence the list (a b c) will be represented as follows:

[+|+]--[+|+]--[+|#]
 |      |      |
 A      B      C

We recommend that you get in the habit of diagramming lists in this fashion.

2.7.2 Dotted Lists

Not all lists are nil terminated. A "dotted" list has a non-list object (i.e. neither a cons nor nil) in the cdr of its last cons. The list is called 'dotted' because a period (surrounded by whitespace) is written before the last object when the list is written out.

 

(2.3 . x) ; A cons whose car is a floating point number and whose cdr is a symbol

(d o g . 3) ; A dotted list with three elements whose last cons has the integer 3 in its cdr

 

The first list above looks like this:

 [+|+]
 /   \
2.3   X

And the second looks like this:

 [+|+]--[+|+]--[+|+]
 /       |      |  \
d        o      g   3

The reader will accept a list typed in with the dot and a list as the last element:

(a . (b c))

however, this will always get printed as:

(a b c)

(draw the diagram to see why).
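The same point can be checked at the keyboard. This is a minimal sketch, assuming ANSI behavior for last (which accepts dotted lists) and the default uppercase printing of symbols; dotted-list-demo is a hypothetical name.

```lisp
;; Sketch only: dotted-list-demo is a hypothetical helper name.
(defun dotted-list-demo ()
  "Examine the dotted lists used as examples in this section."
  (let ((pair '(2.3 . x))
        (dog  '(d o g . 3)))
    (list (car pair)                          ; 2.3
          (cdr pair)                          ; X
          (cdr (last dog))                    ; 3, the non-nil final cdr
          (equal '(a . (b c)) '(a b c))       ; true: identical cons structure
          (prin1-to-string '(a . (b c))))))   ; printed without the dot
```

The last two results show that (a . (b c)) and (a b c) are two notations for the same three conses, which is why the printer chooses the shorter form.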

2.7.3 True Lists, Trees, Etc.

This section discusses some of the other terminology used to categorize list structure. None of these terms are actual types or classes but simply terms used to describe list structure or common arrangements of list elements.

The term list is often loosely used to refer either to true lists or to dotted lists. When the distinction is important, true list refers to a nil terminated list.

Most functions which are described as taking a list argument assume a true list as their argument. It is an error to pass a dotted list to a function that is specified to require a list as an argument.

The term tree means a cons and all conses which it refers to directly or indirectly (in formal terms, transitively) either through their car or cdr. Any non-cons objects which these conses refer to are called by analogy the leaves.

An association list is a list of conses (dotted pairs). The car of each pair is called the key and the cdr is called the datum. These are typically used to store small databases where it is useful to look up data based on the key or vice versa.

A property list is similar in intent but slightly different in implementation. A property list is maintained as an even length list. The elements at even offsets in the list are used as keys and the elements at odd offsets are used as data.
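The two layouts can be compared side by side. The following is a minimal sketch using the standard functions assoc, rassoc and getf; the variables *demo-alist* and *demo-plist* and the function lookup-demo are hypothetical names holding made-up data.

```lisp
;; Sketch only: the data and all names here are hypothetical.
(defparameter *demo-alist*
  '((apple . red) (banana . yellow) (plum . purple)))

(defparameter *demo-plist*
  '(apple red banana yellow plum purple))

(defun lookup-demo ()
  "Look up the same information in an association list and a property list."
  (list (cdr (assoc 'banana *demo-alist*))    ; key to datum: YELLOW
        (car (rassoc 'purple *demo-alist*))   ; datum to key: PLUM
        (getf *demo-plist* 'banana)           ; plist lookup by key: YELLOW
        (evenp (length *demo-plist*))))       ; true: plists have even length
```

Note that rassoc gives the reverse lookup on an association list; a property list has no equally direct counterpart.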

2.8 Array Type

An array is a contiguous block of memory composed of components which are all of the same size. One legal array element is a virtual address, which allows 'generalized' arrays containing any LISP object whatsoever.

The number of dimensions of an array is called its rank; the rank is a non-negative integer. Likewise, each dimension is itself a non-negative integer. The total number of elements in the array is the product of all the dimensions.

An array has a non-negative number of dimensions and is indexed by a sequence of non-negative integers. Star Sapphire arrays can have up to 2^16 - 1 (65,535) dimensions; each dimension can be up to 2^32 - 1 (4,294,967,295). Arrays are therefore limited in size essentially only by available virtual memory (normally about 7 megabytes of heap).

As stated, a general array can have any LISP object as a component; in this case the array actually holds the address of the object represented and each cell is 32 bits in size.

Other types of arrays are defined to hold only one type of LISP object. This is normally done for efficiency. One-dimensional arrays are termed vectors. Strings are simply one-dimensional arrays of 8 bit characters. One-dimensional arrays of bits are called bit-vectors; the components of these vectors can take on the value 1 or 0 and occupy exactly one bit of virtual memory.

Note that any array, even a multi-dimensional array, can be defined as containing elements of only one type using the :element-type keyword with make-array. The most efficient way to store that type will be calculated by the array management code.

A dimension can be zero. If this is so, the array has no elements, and any attempt to access an element is an error. However, other properties of the array, such as the dimensions themselves, can be used. If the rank is zero, then there are no dimensions, and the product of the dimensions is then by definition 1. A zero-rank array therefore has a single element.

An array element is identified by a sequence of indices. The length of the sequence must equal the array rank. Each index must be a non-negative integer which is less than the corresponding array dimension. Array indexing is therefore zero-origin.

As an example, suppose that the variable foo names a 2-by-3 array. Then the first index can be 0, or 1, and the second index can be 0, 1, or 2.

Array elements can be accessed using the function aref.

For example, (aref foo 1 2) refers to element (1, 2) of the 2-by-3 array foo. The aref function takes a variable number of arguments.

The first argument is an array; this is followed by index arguments, corresponding in number to the rank of the array.

A zero-rank array has no dimensions. In this case aref takes the array as its only argument. The return value is the sole element of the array.

A one-dimensional array (vector) can also have a fill pointer. The fill pointer specifies an offset in the array no greater than the size of the array; the fill pointer is used to push or pop elements from the vector. An array that has no fill pointer (and is neither adjustable nor displaced to another array) is called a simple array.

Multidimensional arrays use row-major order to calculate the internal arrangement of their elements.

This means that a multidimensional array is implemented internally as a one-dimensional array. The last index varies the fastest.
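The terms rank, dimension, fill pointer and row-major order can be illustrated with a short example. This is a minimal sketch, assuming an ANSI Common LISP (row-major-aref in particular is not in the first edition of Steele and may not be available everywhere); array-demo is a hypothetical name.

```lisp
;; Sketch only: array-demo is a hypothetical helper name, and
;; row-major-aref is assumed to be available (it is ANSI, not first edition).
(defun array-demo ()
  "Exercise a 2-by-3 array, a zero-rank array and a vector with a fill pointer."
  (let ((foo    (make-array '(2 3) :initial-contents '((a b c) (d e f))))
        (scalar (make-array '() :initial-element 7))
        (stack  (make-array 4 :fill-pointer 0)))
    (vector-push 'x stack)             ; fill pointer moves from 0 to 1
    (vector-push 'y stack)             ; and from 1 to 2
    (list (array-rank foo)             ; 2
          (array-dimensions foo)       ; (2 3)
          (array-total-size foo)       ; 6, the product of the dimensions
          (aref foo 1 2)               ; F, the last element
          (row-major-aref foo 5)       ; F again: index 1*3 + 2 = 5
          (aref scalar)                ; 7, the sole element of a zero-rank array
          (array-total-size scalar)    ; 1
          (length stack)               ; 2: length respects the fill pointer
          (vector-pop stack))))        ; Y, the most recently pushed element
```

The fourth and fifth results pick out the same element, once by its two indices and once by its position in the underlying one-dimensional storage.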

 

2.8.1 Vector Type

A vector is a subtype of arrays which has a rank of one, i.e. is one-dimensional.

Lists and vectors make up the sequence type. The difference, of course, is that vector elements are all contiguous whereas list elements must be accessed from front to back. Thus any component of a vector can be accessed in constant time, whereas a list access is linear with respect to the length of the list. However, lists offer more efficiency in some respects, such as when an element needs to be added at the front!

A vector can be written by surrounding its components, in order, with #( and ). The components must be separated by whitespace, like list elements. When the read function encounters this construct, it constructs a simple general vector (i.e. without an :element-type specification or fill pointer).

#() ; An empty vector

#(1 4 9 16 25) ; A vector of length 5 containing the squares of the first 5 positive integers

 

2.8.2 String Type

A string is a vector of characters. A string is written as the sequence of characters contained in the string, enclosed in double quote (") characters. Any instance of the characters " or \ in the string must have a backslash before it. For example:

"" ; An empty string

"Hello, world!" ; A fairly classic string.

"\"\\" ; A string consisting of a double quote followed by a backslash.

Vertical bar is not an escape character in strings, as it is in symbols, and does not need to be escaped.

Just like any other vector, each character in the string can be accessed using a zero-based index which increases from left to right. The aref function can be used just as with any other subtype of array.

Thus using the second example above,

(aref "Hello, world!" 0) => #\H

(aref "Hello, world!" 3) => #\l

(aref "Hello, world!" 12) => #\!
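The indexing rules can be confirmed with length, aref and char. The following is a minimal sketch; string-demo is a hypothetical name. Since the string has 13 characters, its last valid index is 12.

```lisp
;; Sketch only: string-demo is a hypothetical helper name.
(defun string-demo ()
  "Index into the classic example string."
  (let ((s "Hello, world!"))
    (list (length s)                   ; 13 characters, so indices run 0 to 12
          (aref s 0)                   ; #\H
          (char s 3)                   ; #\l (char is aref specialized to strings)
          (aref s (- (length s) 1))    ; #\! at index 12
          (length "\"\\"))))           ; 2: each backslash escapes one character
```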

 

2.8.3 Bit-Vector Type

Bit vectors are one-dimensional arrays whose elements are one bit wide.

A bit-vector is notated by the sequence #* followed by the values of the bits it contains from left to right. That is, the leftmost bit is element number 0, the next one is element number 1, and so on. For example:

#* ;An empty bit-vector

#*1011000 ;A seven-bit bit-vector; bit 0 is a 1
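Bit-vectors can be indexed with bit and combined with the standard bitwise array functions. This is a minimal sketch, assuming the standard functions bit, bit-and and count are available; bit-vector-demo is a hypothetical name.

```lisp
;; Sketch only: bit-vector-demo is a hypothetical helper name.
(defun bit-vector-demo ()
  "Examine the seven-bit example vector."
  (let ((bv #*1011000))
    (list (length bv)               ; 7
          (bit bv 0)                ; 1, the leftmost bit is element 0
          (bit bv 1)                ; 0
          (bit-and bv #*1110000)    ; #*1010000, a fresh bit-vector
          (count 1 bv))))           ; 3 bits are set
```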

 

2.9 Hash table Type

Hash tables provide an efficient mechanism for mapping a unique key to an associated datum. Typically a hash table is used to map symbols to associated values. There is only one, unique entry for any given key in a hash table.

The advantage of using hash tables is that the time to access any given key is approximately constant. Therefore the hash table data structure presents the best of both worlds: flexible and efficient mapping between a key and a datum.

Common LISP, and this implementation of it, allow a hash table to grow when it is within some threshold of being full. The hash table will retain its mapping and efficiency of access regardless of how large it gets.

Hash tables utilize a hash function which maps a possibly infinite range of values into a finite range of integers. Each entry with the same hash number gets stored as part of a chain. Searching and entering in a hash table therefore both involve computing the hash number for an item and then searching the chain for an exact match.

The hash table implementation in Star Sapphire is based on two arrays: A main hash table and an auxiliary hash table. The main hash table contains an index into the auxiliary hash table; the auxiliary hash table contains chains which store all items which have been entered with the same hash value.
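From the user's point of view the chains are invisible; only the key-to-datum mapping is seen. The following is a minimal sketch of the standard hash table interface (it does not model the main and auxiliary arrays described above); hash-table-demo is a hypothetical name.

```lisp
;; Sketch only: hash-table-demo is a hypothetical helper name. It uses the
;; standard interface and says nothing about the internal layout.
(defun hash-table-demo ()
  "Enter, replace, remove and look up entries keyed by symbols."
  (let ((table (make-hash-table :test #'eq)))
    (setf (gethash 'apple table) 'red)
    (setf (gethash 'plum table) 'purple)
    (setf (gethash 'apple table) 'green)     ; replaces the entry for APPLE
    (let ((count-before (hash-table-count table)))
      (remhash 'plum table)
      (list (gethash 'apple table)                       ; GREEN
            count-before                                 ; 2: one entry per key
            (multiple-value-list (gethash 'plum table))  ; (NIL NIL): not found
            (hash-table-count table)))))                 ; 1
```

The second value returned by gethash distinguishes a missing key from a key whose datum happens to be nil.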

 

2.10 Package Type

Packages in Star Sapphire Common LISP use internal hash tables to reference symbols. By default, symbols are entered, or interned, into the user package if they are not already in the package lisp. There is always a current package in Common LISP, and symbols not in the current package must be specified using a special syntax.

To specify a symbol in another package than the current package, you must specify the package, two colons, and then the symbol name:

foo::bar ; specifies a symbol 'bar' in the 'foo' package.

This allows you to use the symbol as if it were in the current package. A symbol preceded by exactly one colon is interned in the keyword package. Symbols in the keyword package always evaluate to themselves.
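The package syntax can be tried out with a scratch package. This is a minimal sketch, assuming standard package functions; the package name DEMO-FOO and the function package-demo are hypothetical. The qualified name is read from a string at run time because the package must exist before the reader sees demo-foo::bar.

```lisp
;; Sketch only: DEMO-FOO and package-demo are hypothetical names.
(defun package-demo ()
  "Intern a symbol in a scratch package and examine keywords."
  (let* ((pkg (or (find-package "DEMO-FOO")
                  (make-package "DEMO-FOO" :use '())))
         (sym (intern "BAR" pkg)))
    (list (symbol-name sym)                            ; "BAR"
          (eq sym (intern "BAR" pkg))                  ; true: one symbol per name
          (eq (symbol-package sym) pkg)                ; true: the symbol records its package
          (eq sym (read-from-string "demo-foo::bar"))  ; true: the two-colon syntax
          (keywordp :start)                            ; true
          (eq :start (symbol-value :start))            ; true: keywords evaluate to themselves
          (symbol-package (gensym)))))                 ; NIL: gensyms have no package
```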

2.11 Stream Type

A stream is a source or sink of data, typically characters or bytes. Nearly all functions that perform input and output do so with respect to a specified stream. The open function takes a pathname and returns a stream connected to the file specified by the pathname. There are a number of standard streams that are used by default for various purposes, in particular terminal output.

 

2.12 Random-State Type

Random-states are a representation for a seed for the random-number generator. Given a particular random-state, the random number generator will generate the same sequence of numbers. In this implementation a random-state is simply a 16 bit fixnum with the typebits LT_RSTATE.
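The repeatability described above can be demonstrated by copying a random-state and drawing from both copies. This is a minimal sketch using the standard functions make-random-state and random; random-state-demo is a hypothetical name, and the range 1000 and the count of five draws are arbitrary choices.

```lisp
;; Sketch only: random-state-demo is a hypothetical helper name.
(defun random-state-demo ()
  "Return true if two copies of one random-state yield the same numbers."
  (let* ((state-a (make-random-state t))          ; a freshly seeded state
         (state-b (make-random-state state-a)))   ; an independent copy of it
    (flet ((draw (state)
             ;; Collect five numbers below 1000 drawn from STATE.
             (let ((numbers '()))
               (dotimes (i 5 (nreverse numbers))
                 (push (random 1000 state) numbers)))))
      (equal (draw state-a) (draw state-b)))))
```

The copy must be made before any numbers are drawn, since each call to random advances the state it is given.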

 

2.13 Structure Type

Structures are instances of user-defined data types that have a fixed number of named components. They are analogous to structs in the C language. Structures are defined using the defstruct macro. defstruct automatically defines access and constructor functions for the new data type. The constructor function is used to actually create new instances of the structure.

The printed representation of a structure begins with a #S( followed by the name of the struct, the name and contents of each slot and a closing ). The reader also recognizes this format (although the structure must be defined before instances of it can be read in).
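The functions generated by defstruct can be seen in a small example. The following is a minimal sketch; the structure point, its slots x and y, and the function structure-demo are hypothetical names. The exact printed form may vary slightly between implementations, so only its #S( prefix is examined.

```lisp
;; Sketch only: point, its slots and structure-demo are hypothetical names.
(defstruct point
  (x 0)
  (y 0))

(defun structure-demo ()
  "Use the constructor, accessors, predicate and printed form of POINT."
  (let ((p (make-point :x 3 :y 4)))       ; constructor defined by defstruct
    (setf (point-y p) 5)                  ; accessors work with setf
    (list (point-x p)                     ; 3
          (point-y p)                     ; 5
          (and (point-p p) (typep p 'point) t)     ; T: p is of the new type
          (subseq (prin1-to-string p) 0 3)         ; "#S(" begins the printed form
          (point-x (read-from-string "#S(point :x 1 :y 2)")))))  ; 1: read back in
```

The last element is read from a string at run time, after the defstruct has taken effect, which reflects the requirement that the structure be defined before instances of it are read.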

 

2.14 Functional Object Type

Functions are code objects that can be invoked as procedures. Functions can take arguments and return values.

In the most usual case, symbols are used to represent functions; the association is made using the defun macro. These are known colloquially as defuns.

Some functions are essentially anonymous and represented as a list whose car is the symbol lambda. A defun can be thought of as a named lambda or lambdas can be considered anonymous defuns (take your pick).

Some functions are called compiled-functions; in Star Sapphire this means that these have been implemented in C or LISP and translated by a compiler into machine code. The association between the name and the code in this case is made by the native linker in a non-dynamic fashion when the executable is forged.

The result of evaluating the function special form will always be a function.
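Named and anonymous functions can be invoked in the same ways. This is a minimal sketch using the standard operators function, funcall and mapcar; add-one and function-demo are hypothetical names.

```lisp
;; Sketch only: add-one and function-demo are hypothetical names.
(defun add-one (n) (+ n 1))

(defun function-demo ()
  "Call a defun and a lambda directly and through funcall."
  (list (add-one 41)                              ; 42: an ordinary call
        (funcall (function add-one) 41)           ; 42: #'add-one is shorthand for this
        (funcall #'(lambda (n) (* n n)) 6)        ; 36: an anonymous function
        (mapcar #'add-one '(1 2 3))               ; (2 3 4)
        (functionp #'add-one)                     ; true
        (functionp (function (lambda (n) n)))))   ; true
```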

 

2.15 Unreadable Data Objects

Some objects may print in implementation-dependent ways. Such objects cannot necessarily be reliably reconstructed from a printed representation, and so they are usually printed in a format informative to the user but not acceptable to the read function:

#<useful information>

The LISP reader will signal an error on encountering #<.

There are a number of Star Sapphire specific data objects which are printed using this format.

The following unreadable data objects may be seen, especially when using the debug facility. This is not an exhaustive list of such objects, but a sample of the most commonly seen examples.

#<C hexnumber: C hexnumber>

This is the internal representation of a virtual address. The first hex number indicates the page, the second hex number indicates the offset of a cons on the page. For instance:

#<0x0:0x27>

indicates the third cons on the first page in the workspace.

#<LEXVAR boundaries: offset>

This is the printed form of the internal representation of a lexically scoped variable after it has been compiled. This is a reference to an offset in a frame a given number of lexical boundaries below the current environment.

#<FUNCTION name>

This is the printed form of the internal representation of a functional object. More information than just the name may be obtained by setting the global variable *print-funobj* to t.