Abstract
CL-WBXML is a library that can read and write WAP Binary XML (WBXML). It has been successfully used to send and receive WBXML-encoded SyncML to and from various cell phones. The code comes with a BSD-style license so you can basically do with it whatever you want.
Download shortcut: http://weitz.de/files/cl-wbxml.tar.gz.
Before you install CL-WBXML you first need to install the FLEXI-STREAMS library unless you already have it.
CL-WBXML comes with a system definition for ASDF so you can install the library with
(asdf:oos 'asdf:load-op :cl-wbxml)
if you've unpacked it in a place where ASDF can find it. Installation via asdf-install should also be possible.
The parser is given a handler object, and it in turn calls generic functions like START-ELEMENT or PROCESSING-INSTRUCTION (which you can specialize for your handlers) while traversing the document.
CL-WBXML comes with a predefined handler class that can be used to create XMLS-like S-expressions from WBXML documents; see MAKE-XMLS-HANDLER. Furthermore, there are default handlers defined for the system class T and they all do nothing.
Here's an example:
<ns:foo xmlns:ns="http://weitz.de/" attr1="val1">text<bar ns:attr2="val2"> </bar>more text</ns:foo>
will be converted to this S-expression:
(("foo" . "http://weitz.de/")
 (("attr1" "val1"))
 "text"
 ("bar" ((("attr2" . "http://weitz.de/") "val2")) " ")
 "more text")
Note that this format is only similar but not identical to the XMLS format because (currently) XMLS doesn't handle namespace-qualified attribute names.
[Generic function]
parse-wbxml source handler &key default-charset tag-tokens attr-tokens => result, publicid, version, charset
Reads and parses a WBXML document and invokes the methods of the handler handler accordingly. tag-tokens and attr-tokens are lists of code pages; default-charset is the one that is to be used if the document's charset isn't specified - it should be specified in a way FLEXI-STREAMS understands. Returns multiple values - the first value of the final call to END-DOCUMENT, the public ID of the document (or NIL), the WBXML version of the document (as a string), and the character set of the document.
source can be a binary/bivalent input stream, a pathname denoting an existing file, or a sequence containing octets.
The code page lists are alists where the car is the number of the code page and the cdr is itself an alist of conses mapping tokens to pseudo-XMLS names (in the case of tag tokens), strings (in the case of attribute value tokens), or lists (in the case of attribute start tokens) where the first element is the pseudo-XMLS name of the attribute and the second element is the value prefix as a string. See the file tokens.lisp for examples.
If the document has a public ID for which CL-WBXML knows the defined code pages, these will be used instead of the supplied tag-tokens and attr-tokens arguments. Currently this is the case for the following public IDs:
"-//SYNCML//DTD SyncML 1.0//EN"
"-//SYNCML//DTD SyncML 1.1//EN"
"-//SYNCML//DTD SyncML 1.2//EN"
"-//SYNCML//DTD DevInf 1.0//EN"
"-//SYNCML//DTD DevInf 1.1//EN"
"-//SYNCML//DTD DevInf 1.2//EN"
[Function]
make-xmls-handler => handler
This function returns a handler which can be used in conjunction with PARSE-WBXML to create pseudo-XMLS documents. Here's an example (using the second example from the WBXML spec):
CL-USER 3 > (defun create-file (&optional (file "/tmp/foo.txt"))
              (with-open-file (out file :direction :output
                                        :element-type 'octet
                                        :if-exists :supersede)
                (setq out (make-flexi-stream out :external-format :utf-8))
                (write-sequence '(1 1 #x6a #x12 #\a #\b #\c 0 #\Space #\E #\n #\t #\e #\r
                                  #\Space #\n #\a #\m #\e #\: #\Space 0 #x47 #xc5 9 #x83 0 5
                                  1 #x88 6 #x86 8 3 #\x #\y #\z 0 #x85 3 #\/ #\s 0 1 #x83 4
                                  #x86 7 #xa 3 #\N 0 1 1 1)
                                out)))
CREATE-FILE
CL-USER 4 > (defun read-file (&optional (file "/tmp/foo.txt"))
              (with-open-file (in file :element-type 'octet)
                (parse-wbxml in (make-xmls-handler)
                             :tag-tokens '((0 . ((5 . "CARD") (6 . "INPUT") (7 . "XYZ") (8 . "DO"))))
                             :attr-tokens '((0 . ((5 . ("STYLE" . "LIST"))
                                                  (6 . ("TYPE"))
                                                  (7 . ("TYPE" . "TEXT"))
                                                  (8 . ("URL" . "http://"))
                                                  (9 . ("NAME"))
                                                  (10 . ("KEY"))
                                                  (#x85 . ".org")
                                                  (#x86 . "ACCEPT")))))))
READ-FILE
CL-USER 5 > (progn (create-file) (read-file))
("XYZ" NIL
 ("CARD" (("NAME" "abc") ("STYLE" "LIST"))
  ("DO" (("TYPE" "ACCEPT") ("URL" "http://xyz.org/s")))
  " Enter name: "
  ("INPUT" (("TYPE" "TEXT") ("KEY" "N")))))
NIL
"1.1"
:UTF-8
Note that you should not re-use pseudo-XMLS handlers - create a new one for each parse.
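The code page alists passed as tag-tokens in the example above are ordinary nested alists, so they can be inspected with plain Common Lisp. The following is a minimal sketch of that layout; the names *MY-TAG-TOKENS* and LOOKUP-TOKEN are hypothetical helpers for illustration and are not part of CL-WBXML.

```lisp
;; Hypothetical names; only the alist layout is taken from the documentation.
(defparameter *my-tag-tokens*
  '((0 . ((5 . "CARD") (6 . "INPUT") (7 . "XYZ") (8 . "DO"))))
  "A single code page (number 0) mapping tag tokens to element names.")

(defun lookup-token (code-pages page token)
  "Return the entry for TOKEN in code page PAGE of CODE-PAGES, or NIL."
  ;; The outer ASSOC selects the code page, the inner one the token.
  (cdr (assoc token (cdr (assoc page code-pages)))))

;; Look up token 7 in code page 0.
(lookup-token *my-tag-tokens* 0 7)
```

A lookup in a code page that is not listed simply yields NIL, because ASSOC on an empty list returns NIL.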
[Generic functions]
start-document handler => whatever
end-document handler => result
These functions are called exactly once (at the start and end respectively) for each WBXML document - they are supposed to be specialized by the user. The return values of START-DOCUMENT are ignored; the first return value of END-DOCUMENT will be the first return value of PARSE-WBXML.
[Generic functions]
start-element handler namespace-uri local-name qname attributes => whatever
end-element handler namespace-uri local-name qname => whatever
These functions are called at the start and end of each XML element the parser encounters; their return values are ignored. local-name is the name of the element and namespace-uri the corresponding namespace URI (or NIL if there is no namespace). qname is the qualified name of the element but can also be NIL if the name came from a pre-defined tag token. attributes is a list of ATTRIBUTE objects representing the element's attributes.
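As an illustration of specializing these generic functions, here is a minimal sketch of a handler that counts elements. It assumes that CL-WBXML is already loaded and that the documented symbols are exported from a package named CL-WBXML; the class ELEMENT-COUNTER and the function COUNT-ELEMENTS are hypothetical names introduced only for this example.

```lisp
;; Assumption: CL-WBXML is loaded and exports its API from the CL-WBXML package.
(defclass element-counter ()
  ((total :initform 0 :accessor element-total))
  (:documentation "Hypothetical handler that counts the elements it sees."))

(defmethod cl-wbxml:start-element ((handler element-counter)
                                   namespace-uri local-name qname attributes)
  (declare (ignore namespace-uri local-name qname attributes))
  ;; Called once per element; the return value is ignored by the parser.
  (incf (element-total handler)))

(defmethod cl-wbxml:end-document ((handler element-counter))
  ;; The first return value becomes the first return value of PARSE-WBXML.
  (element-total handler))

(defun count-elements (source &rest parse-args)
  "Parse SOURCE with a fresh ELEMENT-COUNTER and return the element count.
PARSE-ARGS are passed on to PARSE-WBXML (e.g. :TAG-TOKENS)."
  (values (apply #'cl-wbxml:parse-wbxml source
                 (make-instance 'element-counter)
                 parse-args)))
```

Events that the class does not specialize, such as CHARACTERS, fall through to the default methods on T, which do nothing.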
[Generic function]
characters handler data => whatever
This function is called whenever the parser comes across character data within the body of an XML element. data will usually be a string but it can also be a list of octets (if the OPAQUE token was encountered) or whatever *EXTENSION-FUNCTION* returns (specifically NIL for the default function). The return value of this function is ignored by the parser.
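Because data is not always a string, a CHARACTERS method usually has to check its type. The sketch below collects only string data and ignores everything else; it assumes CL-WBXML is loaded and exports its API from a package named CL-WBXML, and the class TEXT-COLLECTOR is a hypothetical name used for illustration.

```lisp
;; Assumption: CL-WBXML is loaded and exports its API from the CL-WBXML package.
(defclass text-collector ()
  ((chunks :initform '() :accessor collected-chunks))
  (:documentation "Hypothetical handler that gathers the text of a document."))

(defmethod cl-wbxml:characters ((handler text-collector) data)
  ;; DATA may also be a list of octets (OPAQUE) or an extension value,
  ;; so only strings are kept here.
  (when (stringp data)
    (push data (collected-chunks handler))))

(defmethod cl-wbxml:end-document ((handler text-collector))
  ;; Chunks were pushed in reverse order; restore document order and join them.
  (format nil "~{~A~}" (reverse (collected-chunks handler))))
```

Whether opaque data should be decoded or skipped depends on the document type, so this sketch deliberately leaves it out.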
[Generic function]
processing-instruction handler target data => whatever
This generic function is called once for each processing instruction. target and data are both strings, although data can also be NIL. The return value of this function is ignored by the parser.
[Standard class]
attribute
This is the class of those (opaque) objects that represent XML attributes - see START-ELEMENT. Their properties can be queried with the readers described below.
[Readers]
attribute-local-name attribute => local-name
attribute-namespace-uri attribute => namespace-uri
attribute-qname attribute => qname
attribute-value attribute => value
These generic functions can be used to read the respective properties of ATTRIBUTE objects.
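A typical use of these readers is to convert the attributes list received by START-ELEMENT into a plain Lisp structure. The following sketch assumes CL-WBXML is loaded and exports the readers from a package named CL-WBXML; ATTRIBUTE-ALIST is a hypothetical helper, not part of the library.

```lisp
;; Assumption: CL-WBXML is loaded and exports its API from the CL-WBXML package.
(defun attribute-alist (attributes)
  "Turn a list of ATTRIBUTE objects into an alist of (local-name . value)."
  (mapcar (lambda (attribute)
            (cons (cl-wbxml:attribute-local-name attribute)
                  (cl-wbxml:attribute-value attribute)))
          attributes))
```

Such a helper could be called from within a START-ELEMENT method, for example as (attribute-alist attributes). Note that it drops the namespace URI, which matters if two attributes share a local name.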
Encodes the XML document document (in pseudo-XMLS syntax) as WBXML and writes it to target which can be a binary/bivalent output stream, a pathname, or the symbol T in which case the output will be written to an in-memory output stream. The function usually returns NIL, but it returns a vector representing the encoded document if target is T.
major-version and minor-version (integers) denote the WBXML version which should be used - the defaults are 1 and 3. version-string (a string) is another way to specify the version and if this value is not NIL the other version parameters are ignored.
publicid is the public ID (a string) of the document. If force-literal-publicid is true, the public ID is inserted as an index into the string table even if there's a well-known numeric value for it.
charset (default is :UTF8) is the character set that is to be used to encode the document. It should be a keyword that can be understood by FLEXI-STREAMS. tag-tokens and attr-tokens are lists of code pages (and they are ignored for public IDs known to CL-WBXML).
if-exists is the value used when opening a file specified by a pathname. For streams this value is ignored.
If prefer-inline is true, STR_I is used instead of STR_T whenever possible. (Some cell phones seem to have problems with string tables. Oh, well...)
[Accessors]
xmls-name element => name
(setf (xmls-name element) name)
xmls-attributes element => attributes
(setf (xmls-attributes element) attributes)
xmls-children element => children
(setf (xmls-children element) children)
These are convenience methods to access the corresponding parts of an XML element in pseudo-XMLS format.
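To show how these accessors relate to the pseudo-XMLS layout, here is a short sketch that applies them to the S-expression from the example near the top of this document. It assumes CL-WBXML is loaded and exports the accessors from a package named CL-WBXML; the variable *FOO* is a hypothetical name.

```lisp
;; Assumption: CL-WBXML is loaded and exports its API from the CL-WBXML package.
(defparameter *foo*
  (copy-tree
   '(("foo" . "http://weitz.de/")
     (("attr1" "val1"))
     "text"
     ("bar" ((("attr2" . "http://weitz.de/") "val2")) " ")
     "more text"))
  "Fresh copy of the pseudo-XMLS example, so that it is safe to modify.")

(cl-wbxml:xmls-name *foo*)        ; the element name, here a (name . uri) cons
(cl-wbxml:xmls-attributes *foo*)  ; the list of attributes
(cl-wbxml:xmls-children *foo*)    ; text and child elements

;; The accessors are SETF-able, e.g. to append another text child.
(setf (cl-wbxml:xmls-children *foo*)
      (append (cl-wbxml:xmls-children *foo*) (list "even more text")))
```

COPY-TREE is used because modifying a quoted literal has undefined consequences in Common Lisp.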
[Special variable]
*extension-function*
The value of this variable should be a function to handle document-type-specific tokens like EXT_I_1. The function will be called with two arguments - an ID (one of 0, 1, or 2) and a value (a string, an integer, or NIL). The return value of this function is used as an argument to CHARACTERS. The default function always returns NIL.
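Since this is a special variable, it can be rebound around a single parse. The sketch below installs an extension function that turns each extension token into a placeholder string; it assumes CL-WBXML is loaded and exports its API from a package named CL-WBXML, and both DESCRIBE-EXTENSION and PARSE-WITH-PLACEHOLDERS are hypothetical names. What an extension token really means depends on the document type, so a placeholder is only useful for debugging.

```lisp
;; Assumption: CL-WBXML is loaded and exports its API from the CL-WBXML package.
(defun describe-extension (id value)
  "Hypothetical extension function: render the token as a placeholder string."
  ;; ID is 0, 1, or 2; VALUE is a string, an integer, or NIL.
  (format nil "[EXT ~D: ~S]" id value))

(defun parse-with-placeholders (source handler &rest parse-args)
  "Call PARSE-WBXML with DESCRIBE-EXTENSION installed for this parse only."
  (let ((cl-wbxml:*extension-function* #'describe-extension))
    (apply #'cl-wbxml:parse-wbxml source handler parse-args)))
```

With this binding in effect, the handler's CHARACTERS method receives the placeholder string instead of NIL whenever an extension token is encountered.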
$Header: /usr/local/cvsrep/cl-wbxml/doc/index.html,v 1.12 2006/07/25 15:09:04 edi Exp $