Tag: unicode-utf16-extfmt-2009-06-11

code/extfmts.lisp:
  o Add +replacement-character-code+.
pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
pcl/simple-streams/external-formats/utf-16.lisp:
pcl/simple-streams/external-formats/utf-32-be.lisp:
pcl/simple-streams/external-formats/utf-32-le.lisp:
pcl/simple-streams/external-formats/utf-8.lisp:
  o Use +replacement-character-code+ instead of the literal.

Oops. The codepoint type is in the Lisp package.

code/char.lisp:
  o Define CODEPOINT-LIMIT
  o Define CODEPOINT type
code/extfmts.lisp:
code/string.lisp:
code/unidata.lisp:
pcl/simple-streams/external-formats/utf-32.lisp:
pcl/simple-streams/external-formats/utf-8.lisp:
  o Use the CODEPOINT type in declarations.

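For readers unfamiliar with the idea, a code point type can be sketched in portable Common Lisp roughly as follows. This is only an illustration: the names CODEPOINT-SKETCH and VALID-CODEPOINT-P are hypothetical, and the actual definitions in code/char.lisp may differ.

```lisp
;; Hypothetical sketch, not the definition from code/char.lisp.
;; Unicode code points run from 0 to #x10FFFF, so the exclusive
;; upper limit is #x110000.
(deftype codepoint-sketch ()
  "A Unicode code point: an integer in [0, #x110000)."
  '(integer 0 (#x110000)))

(defun valid-codepoint-p (object)
  "Return true if OBJECT is an integer in the Unicode code point range."
  (typep object 'codepoint-sketch))
```

A range type like this is tighter than (unsigned-byte 21), which also admits the values from #x110000 up to #x1FFFFF.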

string.lisp:
  o Add SURROGATEP function to test if something is a surrogate value.
extfmts.lisp:
utf-16-be.lisp:
utf-16-le.lisp:
utf-16.lisp:
utf-32-be.lisp:
utf-32-le.lisp:
utf-32.lisp:
utf-8.lisp:
  o Use SURROGATEP.

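A surrogate test of this kind might look like the sketch below. The name SURROGATE-CODE-P and its optional KIND argument are assumptions made for illustration; the real SURROGATEP in string.lisp may take different arguments. Only the numeric ranges (#xD800 to #xDBFF for high surrogates, #xDC00 to #xDFFF for low surrogates) come from the UTF-16 definition.

```lisp
;; Hypothetical sketch; the signature is an assumption, not CMUCL's SURROGATEP.
(defun surrogate-code-p (code &optional kind)
  "Return true if the integer CODE is a UTF-16 surrogate code unit.
KIND may be :HIGH or :LOW to test for one half only, or NIL for either."
  (declare (type (integer 0 #x10FFFF) code))
  (ecase kind
    ((nil) (<= #xD800 code #xDFFF))    ; any surrogate
    (:high (<= #xD800 code #xDBFF))    ; leading (high) surrogate
    (:low  (<= #xDC00 code #xDFFF))))  ; trailing (low) surrogate
```

The sketch takes integer codes rather than characters so that it runs in any implementation, whatever its CHAR-CODE-LIMIT.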

code/extfmts.lisp:
  o Add docstrings for STRING-ENCODE and STRING-DECODE
docs/cmu-user/unicode.tex:
  o Add documentation for STRING-ENCODE and STRING-DECODE.

Be sure to open the aliases file and the external format implementations using :iso8859-1 format, in case *default-external-format* is something else.

Update from Paul to make the external formats that use invert-tables use tries instead of hash tables.
code/extfmts.lisp:
  o Change (unsigned-byte 31) to (unsigned-byte 21). (Should probably add a codepoint deftype for this.)
  o Use a trie instead of a hash-table for the invert-table stuff
  o Fix a typo in a comment.
pcl/simple-streams/external-formats/iso8859-2.lisp:
pcl/simple-streams/external-formats/macroman.lisp:
  o Use a trie

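To make the trie idea concrete, here is one possible two-level layout for an inverted table. It is a sketch under stated assumptions: the forward table is taken to be a vector mapping each octet to a code point, and the function names INVERT-TABLE-SKETCH and TRIE-LOOKUP as well as the 8-bit split are hypothetical. The layout actually used in code/extfmts.lisp is not described in the log and may differ.

```lisp
;; Hypothetical sketch of a code point -> octet trie; not the CMUCL code.
(defun invert-table-sketch (table)
  "TABLE is a vector mapping octet -> code point.
Return a two-level trie mapping code point -> octet.  The top level is
indexed by the code point's bits above the low 8; each leaf is a
256-entry vector indexed by the low 8 bits.  Leaves are allocated lazily."
  (let ((top (make-array (ash #x110000 -8) :initial-element nil)))
    (dotimes (octet (length table) top)
      (let* ((code (aref table octet))
             (hi (ash code -8))
             (lo (logand code #xFF)))
        (unless (aref top hi)
          (setf (aref top hi) (make-array 256 :initial-element nil)))
        (setf (aref (aref top hi) lo) octet)))))

(defun trie-lookup (trie code)
  "Return the octet that encodes CODE, or NIL if there is none."
  (let ((leaf (aref trie (ash code -8))))
    (and leaf (aref leaf (logand code #xFF)))))
```

For example, with a toy table (vector #x41 #x20AC), looking up #x20AC in the resulting trie yields octet 1, and looking up an unmapped code point yields NIL. A lookup costs two array references and no hashing.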
o Fix typo
o Put local variables onto &aux list.
A nicer implementation of OCTETS-TO-CHAR from Paul.

string.lisp:
  o Add Paul's SURROGATES-TO-CODEPOINT and remove CODEPOINT-FROM-SURROGATES.
  o Change SURROGATES to return characters, not numbers.
  o Update callers of SURROGATES to match.
extfmts.lisp:
  o Update callers of SURROGATES to match.
  o Use CODEPOINT to extract the correct codepoint from a string in EF-STRING-TO-OCTETS and EF-OCTETS-TO-STRING.

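The arithmetic behind converting between a code point and a surrogate pair is fixed by UTF-16 and can be sketched as below. The function names are hypothetical and the sketch works on integers for portability, whereas the log says CMUCL's SURROGATES returns characters.

```lisp
;; Hypothetical sketch of the UTF-16 surrogate arithmetic; not CMUCL's code.
(defun codepoint-to-surrogate-codes (codepoint)
  "Return the high and low surrogate code units for CODEPOINT as two values.
CODEPOINT must lie outside the BMP."
  (declare (type (integer #x10000 #x10FFFF) codepoint))
  (let ((offset (- codepoint #x10000)))          ; a 20-bit value
    (values (+ #xD800 (ash offset -10))          ; top 10 bits
            (+ #xDC00 (logand offset #x3FF)))))  ; low 10 bits

(defun surrogate-codes-to-codepoint (high low)
  "Combine the surrogate code units HIGH and LOW into one code point."
  (declare (type (integer #xD800 #xDBFF) high)
           (type (integer #xDC00 #xDFFF) low))
  (+ #x10000
     (ash (- high #xD800) 10)
     (- low #xDC00)))
```

For U+1F600 the first function returns #xD83D and #xDE00, and the second maps that pair back to #x1F600.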
Update OCTETS-TO-CHAR so that it can return codepoints outside the BMP. In this case, the first char returned is the high surrogate value. A subsequent call returns the low surrogate value. This is done by making the state a cons whose car holds OCTETS-TO-CHAR's own state and whose cdr holds the state for the external format. (Idea based on a suggestion by Paul.)
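The split state described here can be illustrated with a small portable sketch. Everything in it is an assumption made for illustration: NEXT-CODE-UNIT and DECODE-ALL are hypothetical names, the "external format" is simulated by a list of code points held in the cdr of the state, and integers stand in for characters. It is not the OCTETS-TO-CHAR macro itself.

```lisp
;; Hypothetical sketch of the cons-shaped state; not CMUCL's OCTETS-TO-CHAR.
(defun next-code-unit (state next-codepoint)
  "Return the next UTF-16 code unit, or NIL at end of input.
STATE is a cons: its car holds a pending low surrogate (or NIL) and its
cdr is the underlying decoder's own state.  NEXT-CODEPOINT is a function
of STATE that returns the next code point, or NIL when input is exhausted."
  (let ((pending (car state)))
    (cond (pending
           ;; Second call for a non-BMP code point: hand out the low half.
           (setf (car state) nil)
           pending)
          (t
           (let ((codepoint (funcall next-codepoint state)))
             (cond ((null codepoint) nil)
                   ((< codepoint #x10000) codepoint)
                   (t
                    ;; Outside the BMP: return the high surrogate now and
                    ;; remember the low surrogate for the next call.
                    (let ((offset (- codepoint #x10000)))
                      (setf (car state) (+ #xDC00 (logand offset #x3FF)))
                      (+ #xD800 (ash offset -10))))))))))

(defun decode-all (codepoints)
  "Drive NEXT-CODE-UNIT over the list CODEPOINTS and collect the code units."
  (let ((state (cons nil codepoints)))
    (loop for unit = (next-code-unit state (lambda (s) (pop (cdr s))))
          while unit
          collect unit)))
```

Calling (decode-all '(#x41 #x1F600 #x42)) collects four code units: #x41, the surrogate pair #xD83D #xDE00, and #x42. Keeping the pending surrogate in the car means the underlying decoder's state in the cdr never has to know about code units.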
Fix up comment. External formats now do really work on code points, not code units. The conversion from code points to code units (and vice versa) is done at a higher level.

pcl/simple-streams/external-formats/utf-8.lisp:
  o Revert to the previous version where the UTF-8 external format produces full 21-bit codepoints.
code/extfmts.lisp:
  o Modify EF-STRING-TO-OCTETS to process code points and convert them to code units to be stored in our strings.
  o Modify EF-OCTETS-TO-STRING to convert code units from the string to codepoints for processing by the external format.
These need more work, especially with respect to Lisp characters/codeunits, but utf-8 appears to be working fine with surrogate pairs.

Add some comments for the macro DEFINE-EXTERNAL-FORMAT.
Compile the external format. Use compile-from-stream so we don't leave fasls lying around.

More updates from Paul.
code/extfmts.lisp:
  o Fixed bug with shared code between formats
  o Built a cache into the ef-macro functions so it doesn't need to call find-external-format so often at runtime
code/fd-stream-extfmt.lisp:
  o Use the changes in code/extfmts
code/fd-stream.lisp:
  o Removed all the commented-out code in fd-stream which is duplicated in fd-stream-extfmt.


code/extfmts.lisp:
  o Bind *DEFAULT-EXTERNAL-FORMAT* to :iso8859-1 when compiling the new external format code. Then messages from the compiler at least have a chance of getting printed.
  o Removed *compile-verbose*, *compile-progress*, and *gc-verbose*, since the compiler messages are working now. (Should we leave them in?)
pcl/simple-streams/external-formats/utf-8.lisp:
  o Revert to previous version, without LOCALLY.

Turn off *compile-verbose*, *compile-progress*, and *gc-verbose* to minimize output messages when compiling the external format. There's a problem if COMPILE wants to produce output and the external format isn't completely set up yet. (Seems only to be a problem when you change *default-external-format*.) This is a workaround. There ought to be a better solution. This change doesn't solve every issue since compiler notes are still output sometimes.

More updates from Paul. fd-stream-extfmt.lisp actually implements the external formats which now work. Cross-compile works fine.
code/fd-stream-extfmt.lisp:
  o New file implementing external formats
tools/worldcom.lisp:
  o Compile extfmts.lisp before fd-stream, since fd-stream uses some macros from extfmts.
  o Compile fd-stream-extfmt
tools/worldload.lisp:
  o Load fd-stream-extfmt at the end. (Can't load it as part of kernel.core. Not enough is set up yet.)
code/extfmts.lisp:
  o Avoid loading files, etc., early in the boot sequence
  o Add INVERT-TABLE function needed by some formats.
code/fd-stream.lisp:
  o Some cleanups (I think)
  o Fix EOF handling


code/lispinit.lisp:
  o Revert previous change, preserving order of initialization.
Changes from Paul to allow building of the new code from non-unicode version:
  code/extfmts.lisp
  code/fd-stream.lisp

Oops. Fix typo (missing paren).
Oops. Don't know how to read #\U+FFFD yet. Use (code-char #xfffd) instead.
More external format support from Paul Foley. To get external format support I think you need to add :extfmts to *features*. But you can't bootstrap with that feature yet. Initial support for pathname translations so that namestrings can be converted to an appropriate format before being given to the OS. Many, many new external formats added. These changes are all on their own branch for now, until the bootstrap issue is resolved. And also so we don't lose these changes from Paul.
Sync to HEAD branch.
Merge changes from HEAD for the ext-formats search-list change.
Merge changes from HEAD to the unicode-utf16 branch.

Update from Paul Foley.
  o Disable package errors when loading up external formats.
  o A minor patch allowing string-to-octets and vice versa to write into a preallocated array (though they might still allocate a bigger one if necessary).
  o Fix up any confusion between simple-base-string and simple-string so that nothing breaks when/if they're not the same.


Import Paul Foley's external-formats support.
New files:
  o code/extfmts.lisp
  o pcl/simple-streams/external-formats/iso8859-1.lisp
  o pcl/simple-streams/external-formats/void.lisp
code/exports.lisp:
  o Export the new symbols STRING-TO-OCTETS, OCTETS-TO-STRING, *DEFAULT-EXTERNAL-FORMAT*, ENCODE-STRING, and DECODE-STRING from the STREAM package
  o Make the symbols in the EXT package too.
pcl/simple-streams/internal.lisp:
  o Move the implementation of STRING-TO-OCTETS and friends to a new file (extfmts.lisp).
pcl/simple-streams/external-formats/utf-8.lisp:
  o New implementation.
tools/make-main-dist.sh:
  o Create new target directory to hold external formats
  o Copy all of the external formats to the new directory.
tools/pclcom.lisp:
  o Compile new code
tools/worldcom.lisp:
  o Compile code/extfmts.lisp
tools/worldload.lisp:
  o Load code/extfmts.lisp
