Chapter 21. Streams

21.1. ANSI Streams
21.1.1. Supported types
21.1.2. Element types
21.1.3. External formats
21.2. C Reference
Streams C dictionary — Common Lisp and C equivalence

21.1. ANSI Streams

21.1.1. Supported types

ECL implements all stream types described in the ANSI standard. Additionally, when configured with the option --enable-clos-streams, ECL includes a version of Gray streams in which any object that implements the appropriate methods (stream-input-p, stream-read-char, etc.) is a valid argument for functions that expect streams, such as read and print.

21.1.2. Element types

ECL distinguishes between two kinds of streams: character streams and byte streams. Character streams accept and produce only characters, read or written one at a time with read-char or write-char, or in chunks with write-sequence or any of the Lisp printer functions. Character operations are conditioned by the external format, as described in Section 21.1.3.
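
As a minimal sketch of the character operations named above, the following portable ANSI Common Lisp function writes to an in-memory string stream one character at a time, in a chunk, and through the printer, and then reads the first character back. The function name char-stream-demo is made up for this illustration.

```lisp
;; Character I/O on in-memory string streams (portable ANSI Common Lisp).
;; CHAR-STREAM-DEMO is a hypothetical name used only for this sketch.
(defun char-stream-demo ()
  "Return the text written to a string stream and its first character."
  (let ((text (with-output-to-string (out)
                (write-char #\H out)        ; one character at a time
                (write-sequence "ello" out) ; a chunk of characters
                (princ 42 out))))           ; a Lisp printer function
    (with-input-from-string (in text)
      ;; READ-CHAR consumes a single character from the input stream.
      (values text (read-char in)))))
```

The same calls apply to file streams opened with the default character element type; only there does the external format come into play.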

ANSI Common Lisp also supports binary streams, in which input and output are performed in chunks of bits. Binary streams are created by calling open with an :element-type that is a subtype of integer; the implementation is free to round that integer type up to the closest size it supports. In particular, ECL rounds the size up to a multiple of a byte. For example, the form (open "foo.bin" :direction :output :element-type '(unsigned-byte 13)) opens the file foo.bin for writing, using 16-bit words as the element type.
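
The following sketch, assuming a writable working directory, writes a few 13-bit values to a binary file and reads them back. The function name binary-round-trip and its arguments are hypothetical; only standard ANSI functions are used.

```lisp
;; Round trip through a binary stream with a 13-bit element type.
;; BINARY-ROUND-TRIP is a hypothetical helper, not part of ECL.
(defun binary-round-trip (path values)
  "Write VALUES as (unsigned-byte 13) elements to PATH and read them back.
Returns the list read and the element type reported by the input stream."
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :element-type '(unsigned-byte 13))
    (dolist (v values)
      (write-byte v out)))
  (with-open-file (in path :direction :input
                           :element-type '(unsigned-byte 13))
    (values (loop for b = (read-byte in nil nil) ; NIL at end of file
                  while b
                  collect b)
            ;; The implementation may report a rounded-up type here.
            (stream-element-type in))))
```

A call such as (binary-round-trip "foo.bin" '(0 1 8191)) returns the original list as its first value. The second value shows how the implementation rounded the requested type; following the paragraph above, ECL would be expected to report a 16-bit type, while other implementations may choose differently.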

21.1.3. External formats

An external format is an encoding for characters that maps each character code to a sequence of bytes, in a one-to-one or one-to-many fashion. External formats are also known as "character encodings" in the programming world and are essential for reading and writing text in different languages and alphabets.

ECL's support for external formats is among the most complete. It covers all of the usual codepages from the Windows and Unix world as well as the more recent UTF-8, UCS-2 and UCS-4 formats, each with big-endian and little-endian variants and with different encodings for the newline character.

However, the set of supported external formats depends on the size of the space of character codes. When ECL is built with Unicode support (the default option), it can represent all known characters from all known codepages, and thus all external formats are supported. When ECL is built with the restricted character set, by contrast, it can only use one codepage (the one provided by the C library), with a few variants for the representation of end-of-line characters.
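
One way to tell which kind of build is running is to inspect the standard constant char-code-limit. The sketch below assumes that a restricted character set has at most 256 character codes; that threshold and the function name wide-character-build-p are assumptions made for this illustration.

```lisp
;; Heuristic check of the character code space (portable ANSI Common Lisp).
;; WIDE-CHARACTER-BUILD-P is a hypothetical name; the 256 threshold is an
;; assumption about the size of a restricted, single-codepage character set.
(defun wide-character-build-p ()
  "Return T when the character code space is larger than 8 bits."
  (> char-code-limit 256))
```

Code that needs a multi-byte format such as :utf-8 could call this function first and fall back to the default external format when it returns NIL.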

In ECL, an external format designator is either a symbol or a list of symbols. The grammar is as follows:

external-format-designator := 
   symbol |
   ( {symbol}+ )

and the table of known symbols is shown below. Note that some symbols (:cr, :little-endian, etc.) only modify other external formats.

Table 21.1. Stream external formats

Symbols                                         Codepage or encoding                               Unicode required
:cr                                             #\Newline is Carriage Return                       No
:crlf                                           #\Newline is Carriage Return followed by Linefeed  No
:lf                                             #\Newline is Linefeed                              No
:little-endian                                  Modify UCS to use little-endian encoding           No
:big-endian                                     Modify UCS to use big-endian encoding              No
:utf-8 :utf8                                    Unicode UTF-8                                      Yes
:ucs-2 :ucs2 :utf-16 :utf16 :unicode            UCS-2 encoding with BOM                            Yes
:ucs-2le :ucs2le :utf-16le                      UCS-2 with little-endian encoding                  Yes
:ucs-2be :ucs2be :utf-16be                      UCS-2 with big-endian encoding                     Yes
:ucs-4 :ucs4 :utf-32 :utf32                     UCS-4 encoding with BOM                            Yes
:ucs-4le :ucs4le :utf-32le                      UCS-4 with little-endian encoding                  Yes
:ucs-4be :ucs4be :utf-32be                      UCS-4 with big-endian encoding                     Yes
:iso-8859-1 :iso8859-1 :latin-1 :cp819 :ibm819  Latin-1 encoding                                   Yes
:iso-8859-2 :iso8859-2 :latin-2 :latin2         Latin-2 encoding                                   Yes
:iso-8859-3 :iso8859-3 :latin-3 :latin3         Latin-3 encoding                                   Yes
:iso-8859-4 :iso8859-4 :latin-4 :latin4         Latin-4 encoding                                   Yes
:iso-8859-5 :cyrillic                           Cyrillic encoding                                  Yes
:iso-8859-6 :arabic :asmo-708 :ecma-114         Arabic encoding                                    Yes
:iso-8859-7 :greek8 :greek :ecma-118            Greek encoding                                     Yes
:iso-8859-8 :hebrew                             Hebrew encoding                                    Yes
:iso-8859-9 :latin-5 :latin5                    Latin-5 encoding                                   Yes
:iso-8859-10 :iso8859-10 :latin-6 :latin6       Latin-6 encoding                                   Yes
:iso-8859-13 :iso8859-13 :latin-7 :latin7       Latin-7 encoding                                   Yes
:iso-8859-14 :iso8859-14 :latin-8 :latin8       Latin-8 encoding                                   Yes
:iso-8859-15 :iso8859-15 :latin-9 :latin9       Latin-9 encoding                                   Yes
:dos-cp437 :ibm-437                             IBM CP 437                                         Yes
:dos-cp850 :ibm-850 :cp850                      IBM CP 850                                         Yes
:dos-cp852 :ibm-852                             IBM CP 852                                         Yes
:dos-cp855 :ibm-855                             IBM CP 855                                         Yes
:dos-cp860 :ibm-860                             IBM CP 860                                         Yes
:dos-cp861 :ibm-861                             IBM CP 861                                         Yes
:dos-cp862 :ibm-862 :cp862                      IBM CP 862                                         Yes
:dos-cp863 :ibm-863                             IBM CP 863                                         Yes
:dos-cp864 :ibm-864                             IBM CP 864                                         Yes
:dos-cp865 :ibm-865                             IBM CP 865                                         Yes
:dos-cp866 :ibm-866 :cp866                      IBM CP 866                                         Yes
:dos-cp869 :ibm-869                             IBM CP 869                                         Yes
:windows-cp932 :windows-932 :cp932              Windows CP 932                                     Yes
:windows-cp936 :windows-936 :cp936              Windows CP 936                                     Yes
:windows-cp949 :windows-949 :cp949              Windows CP 949                                     Yes
:windows-cp950 :windows-950 :cp950              Windows CP 950                                     Yes
:windows-cp1250 :windows-1250 :ms-ee            Windows CP 1250                                    Yes
:windows-cp1251 :windows-1251 :ms-cyrl          Windows CP 1251                                    Yes
:windows-cp1252 :windows-1252 :ms-ansi          Windows CP 1252                                    Yes
:windows-cp1253 :windows-1253 :ms-greek         Windows CP 1253                                    Yes
:windows-cp1254 :windows-1254 :ms-turk          Windows CP 1254                                    Yes
:windows-cp1255 :windows-1255 :ms-hebr          Windows CP 1255                                    Yes
:windows-cp1256 :windows-1256 :ms-arab          Windows CP 1256                                    Yes
:windows-cp1257 :windows-1257 :winbaltrim       Windows CP 1257                                    Yes
:windows-cp1258 :windows-1258                   Windows CP 1258                                    Yes
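
To see what an external format does to the bytes on disk, the following sketch writes a string through a character stream with a given external format and then reads the file back as raw 8-bit bytes. The helper name encoded-bytes and the default file name are hypothetical; the :external-format argument of open is standard, but which designators are accepted depends on the implementation and, in ECL, on whether it was built with Unicode support.

```lisp
;; Inspect the bytes produced by an external format.
;; ENCODED-BYTES and the default PATH are hypothetical names for this sketch.
(defun encoded-bytes (string external-format &optional (path "ef-demo.txt"))
  "Write STRING to PATH using EXTERNAL-FORMAT and return the file's bytes."
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :external-format external-format)
    (write-string string out))
  ;; Reopen the same file as a byte stream to bypass any decoding.
  (with-open-file (in path :direction :input
                           :element-type '(unsigned-byte 8))
    (loop for b = (read-byte in nil nil) ; NIL at end of file
          while b
          collect b)))
```

For instance, (encoded-bytes (string (code-char 233)) :latin-1) shows how the character with code 233 is encoded in Latin-1. On ECL, a list designator following the grammar above, such as '(:utf-8 :crlf), can be passed in the same way to combine an encoding with a newline convention.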