Log of /src/code/extfmts.lisp

Revision 1.42
Tue Nov 30 04:09:42 2010 UTC (3 years, 4 months ago) by rtoy
Branch: MAIN
CVS Tags: GIT-CONVERSION, HEAD, cross-sol-x86-2010-12-20, cross-sol-x86-base, cross-sol-x86-merged, snapshot-2010-12, snapshot-2011-01, snapshot-2011-02, snapshot-2011-03, snapshot-2011-04, snapshot-2011-06, snapshot-2011-07, snapshot-2011-09
Branch point for: cross-sol-x86-branch
Changes since 1.41: +4 -3 lines
o *FILENAME-ENCODING* was never getting set.  Fix that.  (From Paul
  Foley.)
o While we're here, canonicalize *FILENAME-ENCODING* using the actual
  external format name.

Revision 1.35.4.7
Thu Sep 9 00:41:03 2010 UTC (3 years, 7 months ago) by rtoy
Branch: RELEASE-20B-BRANCH
CVS Tags: RELEASE_20b
Changes since 1.35.4.6: +7 -5 lines
Correct docstring, from HEAD.

Revision 1.41
Wed Sep 8 02:58:06 2010 UTC (3 years, 7 months ago) by rtoy
Branch: MAIN
CVS Tags: cross-sparc-branch-base, snapshot-2010-11
Branch point for: cross-sparc-branch
Changes since 1.40: +8 -6 lines
Correct docstring for OCTETS-TO-STRING to reflect what actually
happens.

Revision 1.40
Mon Sep 6 19:01:56 2010 UTC (3 years, 7 months ago) by rtoy
Branch: MAIN
Changes since 1.39: +28 -4 lines
Merge changes from 20b-pre2.

Revision 1.35.4.6
Mon Sep 6 15:35:28 2010 UTC (3 years, 7 months ago) by rtoy
Branch: RELEASE-20B-BRANCH
CVS Tags: release-20b-pre2
Changes since 1.35.4.5: +21 -1 lines
code/extfmts.lisp
o Add some comments from Paul Foley on what arguments to DEF-EF-MACRO
  mean.

Revision 1.35.4.5
Mon Sep 6 01:01:27 2010 UTC (3 years, 7 months ago) by rtoy
Branch: RELEASE-20B-BRANCH
Changes since 1.35.4.4: +6 -2 lines
code/extfmts.lisp
o Add new ef-macro index for octets-to-string-counted.  (Forgot to do
  that before.)

bootfiles/20a/boot-2010-08-1.lisp:
o Use this to bootstrap the change (using a cross-compile.)

Revision 1.39
Sat Sep 4 01:03:12 2010 UTC (3 years, 7 months ago) by rtoy
Branch: MAIN
Changes since 1.38: +2 -2 lines
Merge fixes for fast-read-char-string-refill from the 20b-branch.

Revision 1.35.4.4
Thu Sep 2 23:47:31 2010 UTC (3 years, 7 months ago) by rtoy
Branch: RELEASE-20B-BRANCH
Changes since 1.35.4.3: +1 -1 lines
Fix yet another bug in the FAST-READ-CHAR-STRING-REFILL.   This shows
up when running the word break test in
i18n/tests/word-break-test.lisp.

extfmts.lisp:
o Return the number of characters that were actually converted instead
  of the position of the starting point of the output string.

stream.lisp:
o In FAST-READ-CHAR-STRING-REFILL, sometimes, we'll only read one
  octet into the octet buffer, and the octet will be the first octet
  of a multi-octet character.  If this happens, we need to try to read
  some more octets in so that the call to FAST-READ-CHAR-STRING-REFILL
  can return a character.  We only retry once.  If this still fails to
  read enough octets to form a character, we're hosed since we don't
  check for this.  (Should we?)

  Need to refactor this code a bit too.
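
The single retry described above for FAST-READ-CHAR-STRING-REFILL can be sketched as follows.  This is a Python illustration, not the CMUCL code: `refill_chars` is a hypothetical name, and Python's incremental UTF-8 decoder stands in for the stream's octet buffer and external format.

```python
import codecs
import io


def refill_chars(stream, decoder, chunk_size=1):
    """Read octets and decode them into characters.

    If the first read ends inside a multi-octet character and therefore
    produces no characters, read once more.  Only one retry is made,
    mirroring the behaviour described in the log message.
    """
    data = stream.read(chunk_size)
    chars = decoder.decode(data)
    if not chars and data:
        # The octets read so far are only the start of a character;
        # try once to fetch the rest.
        chars = decoder.decode(stream.read(chunk_size))
    return chars


# A two-octet character split across two one-octet reads is recovered.
two_octets = io.BytesIO("\u00e9".encode("utf-8"))
recovered = refill_chars(two_octets, codecs.getincrementaldecoder("utf-8")())

# A three-octet character is still incomplete after one retry, which is
# the unchecked case the log message worries about.
three_octets = io.BytesIO("\u20ac".encode("utf-8"))
incomplete = refill_chars(three_octets, codecs.getincrementaldecoder("utf-8")())
```

With one-octet reads the first call returns the two-octet character, while the second returns an empty string because a single retry is not enough.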

Revision 1.35.4.3
Sun Aug 15 15:07:51 2010 UTC (3 years, 8 months ago) by rtoy
Branch: RELEASE-20B-BRANCH
Changes since 1.35.4.2: +70 -1 lines
Merge fix from HEAD to fix trac #36: file-position wrong.

Revision 1.38
Sun Aug 15 12:04:43 2010 UTC (3 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.37: +70 -1 lines
Fix file-position bug in trac #36.  We add an array to keep track of
the octets consumed for each character.  This array is used to figure
out the file position.  Tests comparing this scheme indicate a
very small slowdown of about 1%, so this seems not to hurt.

Use a cross-compile using the 2010-07 snapshot to build this.  (Same
procedure as used to build the 20b-pre1 release.)

struct.lisp:
o Add new slot OCTET-COUNT to LISP-STREAM to hold the array of octets
  per character.

extfmts.lisp:
o Add OCTETS-TO-STRING-COUNTED, which is like OCTETS-TO-STRING, except
  we need an array in which to store the number of octets consumed for
  each character processed.

fd-stream.lisp:
o Create the octet-count array when creating the lisp stream string buffer.
o In FD-STREAM-FILE-POSITION, use the octet count to count the number
  of octets that have been read but not yet returned to the user.

stream.lisp:
o Use OCTETS-TO-STRING-COUNTED instead of OCTETS-TO-STRING so we keep
  track of octet length of each character processed.
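
The octet-count bookkeeping described in this entry can be illustrated with a small sketch.  It is written in Python and is not the CMUCL implementation: `decode_counted` and `logical_file_position` are hypothetical names, and the sketch assumes well-formed UTF-8 input that does not end in the middle of a character.

```python
def decode_counted(octets: bytes):
    """Decode well-formed UTF-8 and record the octets used per character.

    Returns (chars, counts), where counts[i] is the number of octets
    consumed to produce chars[i].
    """
    chars, counts = [], []
    i = 0
    while i < len(octets):
        lead = octets[i]
        if lead < 0x80:
            n = 1            # 0xxxxxxx
        elif lead >> 5 == 0b110:
            n = 2            # 110xxxxx
        elif lead >> 4 == 0b1110:
            n = 3            # 1110xxxx
        else:
            n = 4            # 11110xxx (input assumed well formed)
        chars.append(octets[i:i + n].decode("utf-8"))
        counts.append(n)
        i += n
    return chars, counts


def logical_file_position(octets_read: int, counts, chars_returned: int) -> int:
    """Position seen by the user: octets read from the file minus the
    octets of characters that were decoded but not yet returned."""
    return octets_read - sum(counts[chars_returned:])


# Four characters of 1, 2, 3 and 4 octets: 10 octets in total.
data = "a\u00e9\u20ac\U0001F600".encode("utf-8")
chars, counts = decode_counted(data)
# After the user has consumed the first two characters, the position is
# 1 + 2 = 3 octets, even though all 10 octets were read into the buffer.
position = logical_file_position(len(data), counts, 2)
```

Storing one small count per buffered character is what lets the position be computed without decoding the buffer a second time.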

Revision 1.35.4.2
Sat Aug 14 23:51:08 2010 UTC (3 years, 8 months ago) by rtoy
Branch: RELEASE-20B-BRANCH
Changes since 1.35.4.1: +7 -4 lines
Merge fixes from trunk to silence some compiler notes and fix bug in
utf-16-be and utf-16-le.

Revision 1.37
Sat Aug 14 23:18:03 2010 UTC (3 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.36: +7 -4 lines
extfmts.lisp:
ascii.lisp:
iso8859-1.lisp:
iso8859-2.lisp:
mac-roman.lisp:
utf-16.lisp:
utf-32-be.lisp:
utf-32-le.lisp:
utf-32.lisp:
utf-8.lisp:
o Inhibit warnings about funcalls to error (fdefinition of symbols).
  I'm tired of seeing the warnings.

utf-16-be.lisp:
utf-16-le.lisp:
o Inhibit warnings about funcalls to error (fdefinition of symbols).
  I'm tired of seeing the warnings.
o Fix bug in FLUSH-STATE:  need to call the OUT function, not the
  ,OUTPUT function!

Revision 1.35.4.1
Wed Aug 4 12:12:09 2010 UTC (3 years, 8 months ago) by rtoy
Branch: RELEASE-20B-BRANCH
Changes since 1.35: +20 -12 lines
Merge some changes from HEAD to keep compiler quieter when compiling
external formats.

Revision 1.36
Wed Aug 4 02:56:36 2010 UTC (3 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.35: +20 -12 lines
Inhibit warnings around funcalls to error.  This was generating too
much compiler noise for something whose speed we don't really care about.

Revision 1.35
Mon Jul 12 13:58:42 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
CVS Tags: release-20b-pre1, snapshot-2010-08, sparc-tramp-assem-2010-07-19, sparc-tramp-assem-base
Branch point for: RELEASE-20B-BRANCH, sparc-tramp-assem-branch
Changes since 1.34: +118 -46 lines
Add a documentation slot to external formats so that we can give a
little information about the format.  Provide a means to get a list of
external formats and to display the documentation.

bootfiles/20a/boot-2010-07-1.lisp:
o Use this bootstrap file when doing a normal build.

code/extfmts.lisp:
o New functions:
  - Add LIST-ALL-EXTERNAL-FORMATS to list all available external formats
    and their corresponding aliases.
  - Add DESCRIBE-EXTERNAL-FORMAT to print some information about the
    given format.
o Add documentation slot to defstruct EXTERNAL-FORMAT.
o Change DEFINE-EXTERNAL-FORMAT macro.  Adds :DOCUMENTATION keyword to
  specify the documentation.  Add :BASE keyword to indicate that the
  external format is based on another format.  (Previously, this
  wasn't needed, but is somewhat incompatible with adding a
  documentation string.)
o Change DEFINE-COMPOSING-EXTERNAL-FORMAT to include :documentation
  keyword to specify the documentation for the format.
o Minor reindentation of some docstrings.
o Make sure documentation strings for external format are marked for
  translation; wrap other strings with intl:gettext to explicitly mark
  them for translations.
o Add docstring for VOID and ISO8859-1 external formats.

code/exports.lisp:
o Export the new symbols LIST-ALL-EXTERNAL-FORMATS and
  DESCRIBE-EXTERNAL-FORMAT.  Import into EXTENSIONS package.

docs/cmu-user/unicode.tex:
o Update docs to include LIST-ALL-EXTERNAL-FORMATS and
  DESCRIBE-EXTERNAL-FORMAT.
o Update docs for DEFINE-EXTERNAL-FORMAT and
  DEFINE-COMPOSING-EXTERNAL-FORMAT to match implementation.

general-info/release-20b.txt:
o Update

external-formats/*.lisp:
o Update with docstrings.
o Add :BASE keyword where needed.

Revision 1.34
Sat Jul 10 22:50:58 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.33: +48 -10 lines
extfmts.lisp:
o Add a simple function to list all external formats.
o Add some docstrings.
o Correctly indent some s-exps.

exports.lisp:
o Update package definitions to export new LIST-ALL-EXTERNAL-FORMATS.

Revision 1.33
Mon Jul 5 22:45:50 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.32: +11 -3 lines
o Fix issue with decoding error call.  The error function takes 3
  args.
o Generate different error messages for surrogate code points and code
  points that are too large.

Revision 1.32
Mon Jul 5 15:52:47 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.31: +3 -3 lines
extfmts.lisp:
o Revert previous incompatible change to STRING-DECODE and
  STRING-ENCODE.  Change the keyword parameters back to optional
  parameters, and make the error parameter the last one.

fd-stream.lisp:
o Update use of STRING-ENCODE.

Revision 1.31
Sat Jul 3 16:44:37 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.30: +25 -7 lines
extfmts.lisp:
o Update comments for the various slots in DEFINE-EXTERNAL-FORMAT.

fd-stream.lisp:
o Declare the types for the CHAR-TO-OCTETS-ERROR and
  OCTETS-TO-CHAR-ERROR slots in FD-STREAM.
o Update docstrings for MAKE-FD-STREAM and OPEN for :DECODING-ERROR
  and :ENCODING-ERROR parameters.

Revision 1.30
Sat Jul 3 13:39:19 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.29: +7 -7 lines
code/extfmts.lisp:
o Add error parameter to flush-state in external format definition so
  we can handle errors when flushing the state to a stream.
o Add optional error parameter to flush-state macro.

code/fd-stream.lisp:
o For the case where an external format has a flush method, EF-FLUSH was
  not calling it correctly.  Update so the output function actually
  works.
o Add error handler to call to flush-state.
o For the case where an external format does not have a flush method,
  output the state value instead of a replacement character so the
  external format can handle any errors.

Revision 1.29
Fri Jul 2 23:06:26 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.28: +5 -5 lines
code/extfmts.lisp:
o Pass the error handler on for composed external formats.

code/fd-stream.lisp:
o Forgot to pass the error-handler to char-to-octets in EF-COUT.
o In MAKE-FD-STREAM slightly change handling of encoding-error and
  decoding-error:
  - If :encoding-error is a character, then the external format will
    use that character whenever an encoding error happens instead of
    the default (internally specified by the external format).
  - If :decoding-error is a character, then the external format will
    use that character whenever a decoding error happens instead of
    the default (internally specified by the external format).  If
    :decoding-error is T, then a cerror is signaled; if continued, the
    Unicode replacement character (#\U+FFFD) is used.
o Fix bug in OPEN:  The :decoding-error and :encoding-error keyword
  parameters were placed in the &aux section by mistake.
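
The three ways of handling a decoding error listed above can be sketched roughly as follows.  This is a Python illustration under stated assumptions, not the CMUCL API: `decode_ascii` and `DEFAULT_REPLACEMENT` are hypothetical, the format's own default is assumed here to be U+FFFD, and raising an exception only approximates a continuable error (Python has no equivalent of continuing from a cerror).

```python
# Assumed default used by this hypothetical format when no handler is given.
DEFAULT_REPLACEMENT = "\ufffd"


def decode_ascii(octets: bytes, decoding_error=None) -> str:
    """Decode 7-bit ASCII with a selectable policy for invalid octets.

    decoding_error is None   -> use the format's own default replacement
    decoding_error is a str  -> use that character as the replacement
    decoding_error is True   -> raise an error (stand-in for cerror)
    """
    out = []
    for pos, octet in enumerate(octets):
        if octet < 0x80:
            out.append(chr(octet))
        elif decoding_error is True:
            raise ValueError(
                f"invalid ASCII octet 0x{octet:02X} at position {pos}")
        elif isinstance(decoding_error, str):
            out.append(decoding_error)
        else:
            out.append(DEFAULT_REPLACEMENT)
    return "".join(out)


with_default = decode_ascii(b"a\xffb")        # format default replacement
with_char = decode_ascii(b"a\xffb", "?")      # caller-chosen replacement
```

Passing the policy down to the decoder, rather than fixing it inside the format, is what allows each stream to choose its own behaviour.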

Revision 1.28
Fri Jul 2 16:29:55 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.27: +7 -7 lines
code/extfmts.lisp:
o The optional error parameter doesn't need to be optional in
  DEFINE-EXTERNAL-FORMAT, EF-STRING-TO-OCTETS, EF-OCTETS-TO-STRING,
  EF-ENCODE and EF-DECODE.

code/fd-stream.lisp:
o Update comments for char-to-octets-error and octets-to-char-error.
o Forgot to pass the error handler to char-to-octets in EF-SOUT and
  EF-STRLEN.

Revision 1.27
Fri Jul 2 02:50:35 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.26: +47 -24 lines
Implement more of the external format error handlers.

code/extfmts.lisp
o Call the error handler for iso8859-1 output.
o In OCTETS-TO-CODEPOINT and CODEPOINT-TO-OCTETS, call the external
  format with the error argument.
o In OCTETS-TO-CHAR
  - Call OCTETS-TO-CODEPOINT with the error handler.
  - For all of the error conditions, call the error handler if
    defined.
o Add error parameter to EF-STRING-TO-OCTETS and EF-ENCODE so we can
  handle errors.  Call CHAR-TO-OCTETS with the error handler.
o Add error parameter to STRING-TO-OCTETS and use it.
o Add error parameter to EF-OCTETS-TO-STRING and EF-DECODE so we can
  handle errors.  Call OCTETS-TO-CHAR with the error handler.
o Add error parameter to OCTETS-TO-STRING and use it.
o In STRING-ENCODE and STRING-DECODE, call the ef function with the
  error handler.
o Change STRING-ENCODE to use keyword args instead of optional args.
  Add error parameter and use it.

code/fd-stream-extfmt.lisp:
o Tell OCTETS-TO-STRING about the error handler stored in the
  fd-stream.

code/fd-stream.lisp:
o OPEN, MAKE-FD-STREAM, and OPEN-FD-STREAM get DECODING-ERROR and
  ENCODING-ERROR keyword arguments for specifying how to handle
  decoding and encoding errors in external formats.

code/stream.lisp:
o Make sure the error handler is called in
  FAST-READ-CHAR-STRING-REFILL.

pcl/simple-streams/external-formats/utf-8.lisp:
o Initial cut at calling the error handler for the various possible
  invalid octet streams for a utf-8 encoding.

Revision 1.26
Wed Jun 30 03:53:28 2010 UTC (3 years, 9 months ago) by rtoy
Branch: MAIN
CVS Tags: snapshot-2010-07
Changes since 1.25: +17 -17 lines
Add initial support for signaling errors in external formats.

o All external formats need an extra required argument for the error
  handler.
o Add optional error parameter to OCTETS-TO-CODEPOINT,
  CODEPOINT-TO-OCTETS, OCTETS-TO-CHAR, and CHAR-TO-OCTETS.

Revision 1.25
Tue Apr 20 17:57:44 2010 UTC (3 years, 11 months ago) by rtoy
Branch: MAIN
CVS Tags: snapshot-2010-05, snapshot-2010-06
Changes since 1.24: +15 -15 lines
Change uses of _"foo" to (intl:gettext "foo").  This is because slime
may get confused with source locations if the reader macros are
installed.

Revision 1.24
Mon Apr 19 02:18:03 2010 UTC (3 years, 11 months ago) by rtoy
Branch: MAIN
Changes since 1.23: +6 -6 lines
Remove _N"" reader macro from docstrings when possible.

Revision 1.23
Fri Mar 19 15:18:58 2010 UTC (4 years ago) by rtoy
Branch: MAIN
CVS Tags: post-merge-intl-branch, snapshot-2010-04
Changes since 1.22: +22 -20 lines
Merge intl-branch 2010-03-18 to HEAD.  To build, you need to use
boot-2010-02-1 as the bootstrap file.  You should probably also use
the new -P option for build.sh to generate and update the po files
while building.

Revision 1.20.4.3
Thu Mar 18 22:17:15 2010 UTC (4 years ago) by rtoy
Branch: intl-branch
Changes since 1.20.4.2: +10 -7 lines
Merge changes from HEAD, update po files.

Revision 1.22
Fri Mar 12 10:39:37 2010 UTC (4 years, 1 month ago) by rtoy
Branch: MAIN
CVS Tags: pre-merge-intl-branch
Changes since 1.21: +10 -11 lines
Wrap the call to compile-from-stream in WITH-STANDARD-IO-SYNTAX to
ensure that we can compile the external format correctly no matter
what the user might have done to readtable and other variables.  This
supersedes the previous change that just bound *readtable* to the
standard read table.

Revision 1.21
Mon Mar 8 20:43:20 2010 UTC (4 years, 1 month ago) by rtoy
Branch: MAIN
CVS Tags: snapshot-2010-03
Changes since 1.20: +5 -1 lines
Bind *readtable* to the standard table when compiling the external
format in case the user has set a non-standard readtable that can't
process the external format.

Revision 1.20.6.1
Thu Feb 25 20:34:49 2010 UTC (4 years, 1 month ago) by rtoy
Branch: intl-2-branch
Changes since 1.20: +22 -20 lines
Restart internationalization work.  This new branch starts with code from
the intl-branch on date 2010-02-12 18:00:00+0500.  This version works
and

LANG=en@piglatin bin/lisp

works (once the piglatin translation is added).

Revision 1.20.4.2
Tue Feb 9 15:18:21 2010 UTC (4 years, 2 months ago) by rtoy
Branch: intl-branch
CVS Tags: intl-branch-2010-03-18-1300, intl-branch-working-2010-02-11-1000, intl-branch-working-2010-02-19-1000
Changes since 1.20.4.1: +20 -20 lines
Mark translatable strings; update cmucl.pot and ko/cmucl.po
accordingly.

Revision 1.20.4.1
Mon Feb 8 17:15:47 2010 UTC (4 years, 2 months ago) by rtoy
Branch: intl-branch
Changes since 1.20: +3 -1 lines
Add (intl:textdomain "cmucl") to the files to set the textdomain.

Revision 1.20
Sun Oct 18 14:21:23 2009 UTC (4 years, 5 months ago) by rtoy
Branch: MAIN
CVS Tags: amd64-dd-start, intl-2-branch-base, intl-branch-base, snapshot-2009-11, snapshot-2009-12, snapshot-2010-01, snapshot-2010-02
Branch point for: amd64-dd-branch, intl-2-branch, intl-branch
Changes since 1.19: +52 -38 lines
Merge changes from unicode-string-buffer-impl-branch which gives
faster reads on external-formats.  This is done by adding an
additional buffer to streams so we can convert the entire in-buffer
into characters all at once.

To build this change, you need to do a cross-compile using
boot-2009-10-1-cross.lisp.  Using that build, do a normal build with
these sources.

For a non-unicode build use boot-2009-10-01.lisp with a 20a
non-unicode build.

code/extfmts.lisp:
o Add another slot to the extfmts for copying the state.
o Modify EF-OCTETS-TO-STRING and OCTETS-TO-STRING to support the
  necessary changes for fast formats.  This is incompatible with the
  previous version because the string is not grown if needed.

code/fd-stream-extfmt.lisp:
o Set *enable-stream-buffer-p* to T so we have fast streams.

code/fd-stream.lisp:
o Add new slots to support fast streams.
o In SET-ROUTINES, initialize the new slots appropriately.
o Update UNREAD-CHAR to be able to back up in the string buffer to
  unread.
o Add implementation to copy the state of an external format.

code/stream.lisp:
o Change %SET-FD-STREAM-EXTERNAL-FORMAT to be able to change formats
  even if we've already converted the buffer with a different format.
  We reconvert the buffer with the old format until we reach the
  current character.  Then the remaining octets are converted using
  the new format and stored in the string buffer.
o Add FAST-READ-CHAR-STRING-REFILL to refill the string buffer, like
  FAST-READ-CHAR-REFILL does for the octet in-buffer.

code/struct.lisp:
o Add new slots to hold the string buffer, the current index, and
  length.  These are needed for the fast formats.

code/sysmacs.lisp:
o Update PREPARE-FOR-FAST-READ-CHAR, DONE-WITH-FAST-READ-CHAR, and
  FAST-READ-CHAR to support the string buffer.

code/string.lisp:
o Microoptimization of SURROGATEP to reduce the number of branches.

general-info/release-20b.txt:
o Update with these changes

pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
pcl/simple-streams/external-formats/utf-16.lisp:
o These formats actually have state, so update them to take and handle
  an initial state.  These are needed if the string buffer ends with a
  leading surrogate and the next string buffer starts with a trailing
  surrogate.  The conversion needs to combine the surrogates together.
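
Why the UTF-16 formats need an initial state can be seen in a small sketch.  It is a Python illustration, not the CMUCL external-format code: `decode_units` is a hypothetical name, input is modelled as a list of 16-bit code units, and malformed sequences are assumed to map to U+FFFD.

```python
REPLACEMENT = 0xFFFD


def decode_units(units, state=None):
    """Convert UTF-16 code units to code points, one batch at a time.

    `state` is a leading surrogate left pending by the previous batch,
    or None.  Returns (codepoints, new_state) so that the caller can
    pass new_state to the next batch.
    """
    out = []
    pending = state
    for u in units:
        if pending is not None:
            hi, pending = pending, None
            if 0xDC00 <= u <= 0xDFFF:
                # Complete pair: combine into one code point.
                out.append(0x10000 + ((hi - 0xD800) << 10) + (u - 0xDC00))
                continue
            # Leading surrogate without a trailing surrogate.
            out.append(REPLACEMENT)
        if 0xD800 <= u <= 0xDBFF:
            pending = u                  # wait for the trailing surrogate
        elif 0xDC00 <= u <= 0xDFFF:
            out.append(REPLACEMENT)      # lone trailing surrogate
        else:
            out.append(u)
    return out, pending


# U+1F600 is the pair D83D DE00; here it is split across two batches.
first, state = decode_units([0x41, 0xD83D])
second, state_after = decode_units([0xDE00, 0x42], state)
# Without the carried state the trailing surrogate cannot be combined.
stateless, _ = decode_units([0xDE00, 0x42])
```

Carrying the pending surrogate from one batch to the next yields U+1F600, whereas decoding the second batch on its own can only produce a replacement character.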

Revision 1.18.4.7
Thu Oct 15 19:37:48 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-impl-branch
Changes since 1.18.4.6: +13 -7 lines
o Update docstring for OCTETS-TO-STRING to match the implementation.
o Use AREF instead of BREF in EF-OCTETS-TO-STRING.

Revision 1.18.4.6
Wed Oct 7 17:53:57 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-impl-branch
Changes since 1.18.4.5: +2 -1 lines
Merge changes from unicode-string-buffer-branch.

Revision 1.18.2.2
Wed Oct 7 16:46:23 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-branch
Changes since 1.18.2.1: +2 -1 lines
extfmts.lisp:
o Add another ef slot to hold a copy-state function.

boot-2009-10-1-cross.lisp:
o Silently change stream::+ef-max+ here.
o Note that this is for the unicode-string-buffer-branch.

Revision 1.18.4.5
Tue Oct 6 12:48:32 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-impl-branch
Changes since 1.18.4.4: +27 -22 lines
In OCTETS-TO-STRING, correct the computation of the end of the
string.  (It caused an error if the string were not given.)

Revision 1.18.4.4
Mon Oct 5 03:58:01 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-impl-branch
Changes since 1.18.4.3: +8 -6 lines
extfmts.lisp:
o Allow caller to specify a state so octets can be converted in
  batches with the appropriate state between them.

stream.lisp:
o Call OCTETS-TO-STRING with the appropriate state from the stream.
o Better declarations in FAST-READ-CHAR-STRING-REFILL.

Revision 1.19
Fri Oct 2 20:15:04 2009 UTC (4 years, 6 months ago) by rtoy
Branch: MAIN
Changes since 1.18: +8 -1 lines
Add docstring for SET-SYSTEM-EXTERNAL-FORMAT.

Revision 1.18.4.3
Fri Oct 2 17:24:13 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-impl-branch
Changes since 1.18.4.2: +3 -3 lines
o Use bref instead of aref in EF-OCTETS-TO-STRING.  (Generated code is
  smaller and better.  Probably some issue with register allocation
  and vop selection.)
o Use SAFETY 0 in EF-OCTETS-TO-STRING, depending on OCTETS-TO-STRING
  to do the correct things.

With these changes, utf16 is now the same speed as 20a, and utf8 is
much faster than 20a.

Revision 1.18.4.2
Sat Sep 26 13:32:26 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-impl-branch
Changes since 1.18.4.1: +2 -2 lines
Declare some variables to get rid of some compiler notes.

Revision 1.18.4.1
Fri Sep 25 04:10:46 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-impl-branch
Changes since 1.18: +14 -14 lines
Merge changes to OCTETS-TO-STRING from unicode-string-buffer-branch.

Revision 1.18.2.1
Fri Sep 25 03:44:19 2009 UTC (4 years, 6 months ago) by rtoy
Branch: unicode-string-buffer-branch
Changes since 1.18: +14 -14 lines
Change OCTETS-TO-STRING to take two extra arguments to specify the
start and end indices of the string where the characters should be
placed.  The string is no longer adjusted in size if there are more
octets than space available in the string.

Revision 1.18
Sat Sep 19 14:12:22 2009 UTC (4 years, 6 months ago) by rtoy
Branch: MAIN
CVS Tags: unicode-string-buffer-base, unicode-string-buffer-impl-base
Branch point for: unicode-string-buffer-branch, unicode-string-buffer-impl-branch
Changes since 1.17: +24 -22 lines
Merge changes from the release-20a branch.

Revision 1.15.2.5
Fri Sep 18 12:38:28 2009 UTC (4 years, 6 months ago) by rtoy
Branch: RELEASE-20A-BRANCH
CVS Tags: RELEASE_20a
Changes since 1.15.2.4: +25 -23 lines
o Simplify EF-OCTETS-TO-STRING and OCTETS-TO-STRING.
o Update docstring for OCTETS-TO-STRING to mention that a given string
  may be adjusted in size.

Revision 1.17
Thu Sep 17 16:15:34 2009 UTC (4 years, 6 months ago) by rtoy
Branch: MAIN
Changes since 1.16: +20 -14 lines
Merge changes from 20a branch.

Revision 1.15.2.4
Thu Sep 17 16:04:21 2009 UTC (4 years, 6 months ago) by rtoy
Branch: RELEASE-20A-BRANCH
Changes since 1.15.2.3: +21 -15 lines
o Fix typo in comment.
o Change EF-OCTETS-TO-STRING to detect end of input if the octet array
  doesn't have enough octets for the very last character.  The index
  of the last octet is returned as the third value.
o Update OCTETS-TO-STRING for the change in EF-OCTETS-TO-STRING.  A
  third value is returned to indicate where we stopped reading from
  the octet array.

Revision 1.16
Wed Sep 9 15:51:27 2009 UTC (4 years, 7 months ago) by rtoy
Branch: MAIN
Changes since 1.15: +83 -16 lines
Merge changes from 20a-pre1 (tag release-20a-pre1) to trunk.

Revision 1.15.2.3
Sat Aug 29 01:41:48 2009 UTC (4 years, 7 months ago) by rtoy
Branch: RELEASE-20A-BRANCH
CVS Tags: release-20a-pre1
Changes since 1.15.2.2: +16 -7 lines
o Use lisp:codepoint type as needed.
o Add docstrings for STRING-TO-OCTETS and OCTETS-TO-STRING.

Revision 1.15.2.2
Fri Aug 28 21:20:36 2009 UTC (4 years, 7 months ago) by rtoy
Branch: RELEASE-20A-BRANCH
Changes since 1.15.2.1: +44 -7 lines
o Fix up doc for DEFINE-EXTERNAL-FORMAT.
o Add doc for COPY-STATE.
o Document DEFINE-COMPOSING-EXTERNAL-FORMAT.

Revision 1.15.2.1
Wed Aug 26 20:41:12 2009 UTC (4 years, 7 months ago) by rtoy
Branch: RELEASE-20A-BRANCH
Changes since 1.15: +25 -4 lines
Fix issue with file-string-length where computing the length changed
the state of the external format when it shouldn't.

code/extfmts.lisp:
o Add new slot to hold function to copy the external-format state.
o Update DEFINE-EXTERNAL-FORMAT to allow COPY-STATE function.
o Add macro to run the copy-state function.

code/fd-stream.lisp:
o In ef-strlen, save the fd-stream co state before computing the
  length and restore the state afterwards.

pcl/simple-streams/external-formats/utf-16.lisp:
pcl/simple-streams/external-formats/utf-32.lisp:
o Add copy-state function to copy the state.  (I think these are the
  only formats that have a state for output.)

Revision 1.15
Wed Aug 26 16:25:41 2009 UTC (4 years, 7 months ago) by rtoy
Branch: MAIN
CVS Tags: release-20a-base
Branch point for: RELEASE-20A-BRANCH
Changes since 1.14: +56 -17 lines
Add support for flushing out any state in an external format when
closing an output stream.  This causes things like

(with-open-file (s "foo" :direction :output :external-format :utf-8)
  (write-char #\u+d800 s))

to output the replacement character instead of creating an empty file.

code/extfmts.lisp:
o Add new slot to efx structure to hold the function to flush the
  state in an external format.
o Add accessor for the flush-state slot.
o Update DEFINE-EXTERNAL-FORMAT to allow specifying the flush
  function.
o Add macro to call the flush-state function.
o Added +EF-FLUSH+
o Use vm::defenum to name the constants instead of the hand-written
  values.
o Export +REPLACEMENT-CHARACTER-CODE+
o Document the slots in an efx structure.

code/fd-stream.lisp:
o Add ef-flush def-ef-macro to flush the state of an external format
  when closing an output file.  If ef-flush-state is NIL, we just call
  EF-COUT to send out the replacement character.  Otherwise, the
  flush-state function is called to handle it.
o When closing an output character stream, call ef-flush to flush any
  state before flushing the buffers of the stream.
o Document the unicode slots in an fd-stream.

code/exports.lisp:
o Export +REPLACEMENT-CHARACTER-CODE+
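
The effect of flushing pending state when an output stream is closed can be sketched as follows.  This is a Python illustration, not the CMUCL code: `Utf8Writer` is a hypothetical class, characters are modelled as 16-bit code units, and the flush is assumed to emit U+FFFD for an unpaired leading surrogate, as in the example in this entry.

```python
class Utf8Writer:
    """Encode 16-bit code units to UTF-8, holding back a leading
    surrogate until its partner arrives."""

    def __init__(self):
        self.octets = bytearray()
        self.pending = None   # leading surrogate waiting for its partner

    def _emit(self, codepoint):
        self.octets += chr(codepoint).encode("utf-8")

    def write_unit(self, unit):
        if self.pending is not None:
            hi, self.pending = self.pending, None
            if 0xDC00 <= unit <= 0xDFFF:
                self._emit(0x10000 + ((hi - 0xD800) << 10) + (unit - 0xDC00))
                return
            self._emit(0xFFFD)            # unpaired leading surrogate
        if 0xD800 <= unit <= 0xDBFF:
            self.pending = unit           # nothing is written yet
        elif 0xDC00 <= unit <= 0xDFFF:
            self._emit(0xFFFD)            # lone trailing surrogate
        else:
            self._emit(unit)

    def close(self):
        """Flush any pending state before returning the final output."""
        if self.pending is not None:
            self.pending = None
            self._emit(0xFFFD)
        return bytes(self.octets)


writer = Utf8Writer()
writer.write_unit(0xD800)               # like (write-char #\u+d800 s)
before_close = bytes(writer.octets)     # still empty: state is pending
after_close = writer.close()            # replacement character is flushed
```

Without the flush in `close`, the pending surrogate would be dropped and the output would stay empty, which is the empty-file behaviour this revision fixes.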

Revision 1.14
Thu Aug 13 13:55:13 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.13: +13 -4 lines
Illegal surrogate sequences (leading surrogate without trailing
surrogate or a lone trailing surrogate) get replaced with the
replacement character.

Revision 1.13
Tue Aug 11 03:30:27 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
CVS Tags: snapshot-2009-08
Changes since 1.12: +7 -2 lines
o Put some comments back in.
o Put back some unicode/unicode-bootstrap conditionals.

Revision 1.12
Mon Aug 10 22:14:26 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.11: +6 -4 lines
Diff to previous 1.11 , to selected 1.20.4.1
Compile and load the external format code.

Revision 1.11 - (view) (annotate) - [select for diffs]
Mon Aug 10 16:47:41 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.10: +136 -110 lines
Diff to previous 1.10 , to selected 1.20.4.1
Fixes from Paul Foley:

o Standard streams no longer change formats when
  *default-external-format* changes.  Use
  stream:set-system-external-format instead, or (setf
  external-format).
o char-to-octets properly handles surrogate characters being written.
o Makes simple-streams work again.

This change needs to be cross-compiled.  2009-07 binaries work for
cross-compiling using the 19e/boot-2008-05-cross-unicode-*.lisp
cross-compile script.

Revision 1.10 - (view) (annotate) - [select for diffs]
Thu Jul 23 21:36:51 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.9: +8 -2 lines
Diff to previous 1.9 , to selected 1.20.4.1
code/extfmts.lisp:
o Move the +ss-ef-foo+ constants to here from strategy.lisp, and
  update them so they don't overlap with existing +ef-foo+ constants.
o Update +ef-max+ accordingly.

pcl/simple-streams/impl.lisp:
o Use +ss-ef-str+ instead of +ef-str+ in simple-stream-strlen.

pcl/simple-streams/strategy.lisp:
o Comment out +ss-ef-foo+ constants.
o Use +ef-max+ instead of +ss-ef-max+, which is no longer defined.
o Fix bugs in %dc-write-chars-fn:
  - Use ef variable
  - Need to call flush-out-buffer, not flush-buffer for dual-channel
    streams.

Revision 1.9 - (view) (annotate) - [select for diffs]
Thu Jun 25 02:18:02 2009 UTC (4 years, 9 months ago) by rtoy
Branch: MAIN
CVS Tags: snapshot-2009-07
Changes since 1.8: +24 -16 lines
Diff to previous 1.8 , to selected 1.20.4.1
Update STRING-ENCODE and STRING-DECODE to handle surrogate pairs
correctly.  Previously, each surrogate was converted individually.
This is wrong; they should be treated as a single codepoint that is
converted.
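
As a sketch of what "a single codepoint" means here, the arithmetic for a UTF-16 surrogate pair is shown below. The names COMBINE-SURROGATES and SPLIT-CODEPOINT are hypothetical and are not the functions used by STRING-ENCODE and STRING-DECODE; they take and return integers so the example is portable.

```lisp
;; Hypothetical helpers for illustration; not the CMUCL definitions.
(defun combine-surrogates (hi lo)
  "Combine a leading surrogate HI and a trailing surrogate LO (both
integer code units) into the single codepoint they encode."
  (declare (type (integer #xD800 #xDBFF) hi)
           (type (integer #xDC00 #xDFFF) lo))
  (+ #x10000
     (ash (- hi #xD800) 10)   ; upper 10 bits of the offset
     (- lo #xDC00)))          ; lower 10 bits of the offset

(defun split-codepoint (code)
  "Return, as two values, the leading and trailing surrogates that
encode CODE, a codepoint outside the BMP."
  (declare (type (integer #x10000 #x10FFFF) code))
  (let ((offset (- code #x10000)))
    (values (+ #xD800 (ash offset -10))
            (+ #xDC00 (logand offset #x3FF)))))
```

Encoding each surrogate on its own would, in UTF-8 for instance, produce two three-octet sequences; combining first with (combine-surrogates #xD83D #xDE00), which gives #x1F600, lets the external format emit one four-octet sequence instead.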

Revision 1.8 - (view) (annotate) - [select for diffs]
Wed Jun 24 16:46:18 2009 UTC (4 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.7: +2 -1 lines
Diff to previous 1.7 , to selected 1.20.4.1
Fix FILE-STRING-LENGTH to handle unicode streams.  From Paul.

Revision 1.7 - (view) (annotate) - [select for diffs]
Sun Jun 21 13:53:59 2009 UTC (4 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.6: +4 -4 lines
Diff to previous 1.6 , to selected 1.20.4.1
Fix bug introduced when converting tables from hashes to ntries.  From
Paul Foley.  This makes mac-roman and derived external formats work
once again.

Revision 1.6 - (view) (annotate) - [select for diffs]
Thu Jun 11 16:03:57 2009 UTC (4 years, 10 months ago) by rtoy
Branch: MAIN
CVS Tags: merged-unicode-utf16-extfmt-2009-06-11, portable-clx-base, portable-clx-import-2009-06-16
Branch point for: portable-clx-branch
Changes since 1.5: +421 -153 lines
Diff to previous 1.5 , to selected 1.20.4.1
Merge Unicode work to trunk.  From label
unicode-utf16-extfmt-2009-06-11.

Revision 1.2.4.3.2.24 - (view) (annotate) - [select for diffs]
Wed Jun 10 16:38:50 2009 UTC (4 years, 10 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
CVS Tags: unicode-utf16-extfmt-2009-06-11
Changes since 1.2.4.3.2.23: +4 -1 lines
Diff to previous 1.2.4.3.2.23 , to branch point 1.2.4.3 , to selected 1.20.4.1
code/extfmts.lisp:
o Add +replacement-character-code+.

pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
pcl/simple-streams/external-formats/utf-16.lisp:
pcl/simple-streams/external-formats/utf-32-be.lisp:
pcl/simple-streams/external-formats/utf-32-le.lisp:
pcl/simple-streams/external-formats/utf-8.lisp:
o Use +replacement-character-code+ instead of the literal.

Revision 1.2.4.3.2.23 - (view) (annotate) - [select for diffs]
Thu May 28 16:06:39 2009 UTC (4 years, 10 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
CVS Tags: unicode-snapshot-2009-06
Changes since 1.2.4.3.2.22: +4 -4 lines
Diff to previous 1.2.4.3.2.22 , to branch point 1.2.4.3 , to selected 1.20.4.1
Oops. The codepoint type is in the Lisp package.

Revision 1.2.4.3.2.22 - (view) (annotate) - [select for diffs]
Wed May 27 20:34:19 2009 UTC (4 years, 10 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.21: +4 -4 lines
Diff to previous 1.2.4.3.2.21 , to branch point 1.2.4.3 , to selected 1.20.4.1
code/char.lisp:
o Define CODEPOINT-LIMIT
o Define CODEPOINT type

code/extfmts.lisp
code/string.lisp
code/unidata.lisp
pcl/simple-streams/external-formats/utf-32.lisp
pcl/simple-streams/external-formats/utf-8.lisp
o Use the CODEPOINT type in declarations.

Revision 1.2.4.3.2.21 - (view) (annotate) - [select for diffs]
Wed May 20 21:47:36 2009 UTC (4 years, 10 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.20: +2 -2 lines
Diff to previous 1.2.4.3.2.20 , to branch point 1.2.4.3 , to selected 1.20.4.1
string.lisp:
o Add SURROGATEP function to test if something is a surrogate value.

extfmts.lisp:
utf-16-be.lisp:
utf-16-le.lisp:
utf-16.lisp:
utf-32-be.lisp:
utf-32-le.lisp:
utf-32.lisp:
utf-8.lisp:
o Use SURROGATEP.

Revision 1.2.4.3.2.20 - (view) (annotate) - [select for diffs]
Mon May 18 23:54:37 2009 UTC (4 years, 10 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.19: +9 -1 lines
Diff to previous 1.2.4.3.2.19 , to branch point 1.2.4.3 , to selected 1.20.4.1
code/extfmts.lisp:
o Add docstrings for STRING-ENCODE and STRING-DECODE

docs/cmu-user/unicode.tex:
o Add documentation for STRING-ENCODE and STRING-DECODE.

Revision 1.2.4.3.2.19 - (view) (annotate) - [select for diffs]
Thu May 14 13:32:16 2009 UTC (4 years, 11 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.18: +4 -3 lines
Diff to previous 1.2.4.3.2.18 , to branch point 1.2.4.3 , to selected 1.20.4.1
Be sure to open the aliases file and the external format
implementations using :iso8859-1 format, in case
*default-external-format* is something else.

Revision 1.2.4.3.2.18 - (view) (annotate) - [select for diffs]
Thu Apr 30 18:52:43 2009 UTC (4 years, 11 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
CVS Tags: unicode-snapshot-2009-05
Changes since 1.2.4.3.2.17: +54 -20 lines
Diff to previous 1.2.4.3.2.17 , to branch point 1.2.4.3 , to selected 1.20.4.1
Update from Paul to make the external formats that use invert-tables
use tries instead of a hash table.

code/extfmts.lisp:
o Change (unsigned-byte 31) to (unsigned-byte 21).  (Should probably
  add a codepoint deftype for this.)
o Use a trie instead of a hash-table for the invert-table stuff
o Fix a typo in a comment.

pcl/simple-streams/external-formats/iso8859-2.lisp:
pcl/simple-streams/external-formats/macroman.lisp:
o Use a trie

Revision 1.2.4.3.2.17 - (view) (annotate) - [select for diffs]
Fri Apr 24 19:18:12 2009 UTC (4 years, 11 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.16: +10 -9 lines
Diff to previous 1.2.4.3.2.16 , to branch point 1.2.4.3 , to selected 1.20.4.1
Use gensyms.

Revision 1.2.4.3.2.16 - (view) (annotate) - [select for diffs]
Fri Apr 24 11:28:45 2009 UTC (4 years, 11 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.15: +8 -8 lines
Diff to previous 1.2.4.3.2.15 , to branch point 1.2.4.3 , to selected 1.20.4.1
o Fix typo
o Put local variables onto &aux list.

Revision 1.2.4.3.2.15 - (view) (annotate) - [select for diffs]
Thu Apr 23 18:04:36 2009 UTC (4 years, 11 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.14: +31 -35 lines
Diff to previous 1.2.4.3.2.14 , to branch point 1.2.4.3 , to selected 1.20.4.1
A nicer implementation of OCTETS-TO-CHAR from Paul.

Revision 1.2.4.3.2.14 - (view) (annotate) - [select for diffs]
Thu Apr 23 15:10:08 2009 UTC (4 years, 11 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.13: +20 -28 lines
Diff to previous 1.2.4.3.2.13 , to branch point 1.2.4.3 , to selected 1.20.4.1
string.lisp:
o Add Paul's SURROGATES-TO-CODEPOINT and remove
  CODEPOINT-FROM-SURROGATES.
o Change SURROGATES to return characters, not numbers.
o Update callers of SURROGATES to match.

extfmts.lisp:
o Update callers of SURROGATES to match.
o Use CODEPOINT to extract the correct codepoint from a string in
  EF-STRING-TO-OCTETS and EF-OCTETS-TO-STRING.

Revision 1.2.4.3.2.13 - (view) (annotate) - [select for diffs]
Wed Apr 22 17:09:43 2009 UTC (4 years, 11 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.12: +34 -7 lines
Diff to previous 1.2.4.3.2.12 , to branch point 1.2.4.3 , to selected 1.20.4.1
Update OCTETS-TO-CHAR so that it can return codepoints outside the
BMP.  In this case, the first char returned is the high surrogate
value.  A subsequent call returns the low surrogate value.

This is done by making the state be a cons whose car is for
OCTETS-TO-CHAR for its own state and whose cdr is the state for the
external format.

(Idea based on a suggestion by Paul.)
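
The cons-shaped state can be sketched as below. This is an assumption-laden illustration, not the OCTETS-TO-CHAR code: NEXT-CODE-UNIT and its DECODE-CODEPOINT argument are hypothetical, and the function returns integer code units instead of characters so that it is portable.

```lisp
;; Hypothetical sketch; the real OCTETS-TO-CHAR has a different
;; signature and returns characters.
(defun next-code-unit (state decode-codepoint)
  "STATE is a cons.  Its car holds a pending trailing surrogate (or
NIL); its cdr belongs to the external format.  DECODE-CODEPOINT is
called with STATE and must return the next codepoint, updating the
cdr as it sees fit."
  (let ((pending (car state)))
    (cond
      ;; A trailing surrogate was saved by the previous call: return
      ;; it now and clear the slot.
      (pending
       (setf (car state) nil)
       pending)
      (t
       (let ((code (funcall decode-codepoint state)))
         (cond
           ;; Outside the BMP: return the leading surrogate and save
           ;; the trailing one for the next call.
           ((> code #xFFFF)
            (let ((offset (- code #x10000)))
              (setf (car state) (+ #xDC00 (logand offset #x3FF)))
              (+ #xD800 (ash offset -10))))
           (t code)))))))
```

In this sketch the cdr could be as simple as a list of already-decoded codepoints, with (lambda (st) (pop (cdr st))) as the decoder. Keeping the two pieces of state in separate halves of the cons means the external format never has to know that a surrogate is pending.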

Revision 1.2.4.3.2.12 - (view) (annotate) - [select for diffs]
Thu Apr 16 20:14:14 2009 UTC (5 years ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.11: +2 -10 lines
Diff to previous 1.2.4.3.2.11 , to branch point 1.2.4.3 , to selected 1.20.4.1
Fix up comment.  External formats now really do work on code points,
not code units.  The conversion from code points to code units (and
vice versa) is done at a higher level.

Revision 1.2.4.3.2.11 - (view) (annotate) - [select for diffs]
Tue Apr 14 21:10:02 2009 UTC (5 years ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.10: +33 -12 lines
Diff to previous 1.2.4.3.2.10 , to branch point 1.2.4.3 , to selected 1.20.4.1
pcl/simple-stream/external-formats/utf-8.lisp:
o Revert to the previous version where the UTF-8 external format
  produces full 21-bit codepoints.

code/extfmts.lisp:
o Modify EF-STRING-TO-OCTETS to process code points and convert them
  to code units to be stored in our strings.
o Modify EF-OCTETS-TO-STRING to convert code units from the string to
  codepoints for processing by the external format.

These need more work, especially with respect to Lisp
characters/codeunits, but utf-8 appears to be working fine with
surrogate pairs.

Revision 1.2.4.3.2.10 - (view) (annotate) - [select for diffs]
Tue Apr 14 02:15:24 2009 UTC (5 years ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.9: +34 -6 lines
Diff to previous 1.2.4.3.2.9 , to branch point 1.2.4.3 , to selected 1.20.4.1
Add some comments for the macro DEFINE-EXTERNAL-FORMAT.

Revision 1.2.4.3.2.9 - (view) (annotate) - [select for diffs]
Sat Mar 28 13:31:42 2009 UTC (5 years ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.8: +5 -4 lines
Diff to previous 1.2.4.3.2.8 , to branch point 1.2.4.3 , to selected 1.20.4.1
Compile the external format.  Use compile-from-stream so we don't
leave fasls lying around.

Revision 1.2.4.3.2.8 - (view) (annotate) - [select for diffs]
Mon Jul 14 14:01:56 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
CVS Tags: unicode-utf16-extfmt-2009-03-27, unicode-utf16-extfmts-pre-sync-2008-11, unicode-utf16-extfmts-sync-2008-12
Changes since 1.2.4.3.2.7: +95 -89 lines
Diff to previous 1.2.4.3.2.7 , to branch point 1.2.4.3 , to selected 1.20.4.1
More updates from Paul.

code/extfmts.lisp:
o Fixed bug with shared code between formats
o Built a cache into the ef-macro functions so it doesn't need to call
  find-external-format so often at runtime

code/fd-stream-extfmt.lisp
o Use the changes in code/extfmts

code/fd-stream.lisp:
o Removed all the commented-out code in fd-stream which is duplicated
  in fd-stream-extfmt.

Revision 1.2.4.3.2.7 - (view) (annotate) - [select for diffs]
Wed Jul 9 15:52:12 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.6: +8 -4 lines
Diff to previous 1.2.4.3.2.6 , to branch point 1.2.4.3 , to selected 1.20.4.1
code/extfmts.lisp:
o Bind *DEFAULT-EXTERNAL-FORMAT* to :iso8859-1 when compiling the
  new external format code.  Then messages from the compiler at least
  have a chance of getting printed.
o Removed *compile-verbose*, *compile-progress*, and *gc-verbose*,
  since the compiler messages are working now.  (Should we leave them
  in?)

pcl/simple-streams/external-formats/utf-8.lisp
o Revert back to previous version, without LOCALLY.

Revision 1.2.4.3.2.6 - (view) (annotate) - [select for diffs]
Tue Jul 8 16:09:06 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.5: +6 -2 lines
Diff to previous 1.2.4.3.2.5 , to branch point 1.2.4.3 , to selected 1.20.4.1
Turn off *compile-verbose*, *compile-progress*, and *gc-verbose* to
minimize output messages when compiling the external format.  There's
a problem if COMPILE wants to produce output and the external format
isn't completely set up yet.  (Seems only to be a problem when you
change *default-external-format*.)

This is a workaround.  There ought to be a better solution.  This
change doesn't solve every issue since compiler notes are still output
sometimes.

Revision 1.2.4.3.2.5 - (view) (annotate) - [select for diffs]
Sat Jul 5 12:37:42 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.4: +58 -16 lines
Diff to previous 1.2.4.3.2.4 , to branch point 1.2.4.3 , to selected 1.20.4.1
More updates from Paul.  fd-stream-extfmt.lisp actually implements the
external formats which now work.

Cross-compile works fine.

code/fd-stream-extfmt.lisp:
o New file implementing external formats

tools/worldcom.lisp:
o Compile extfmts.lisp before fd-stream, since fd-stream uses some
  macros from extfmts.
o Compile fd-stream-extfmt

tools/worldload.lisp:
o Load fd-stream-extfmt at the end.  (Can't load it as part of
  kernel.core.  Not enough is set up yet.)

code/extfmts.lisp:
o Avoid loading files, etc., early in the boot sequence
o Add INVERT-TABLE function needed by some formats.

code/fd-stream.lisp:
o Some cleanups (I think)
o Fix EOF handling

Revision 1.2.4.3.2.4 - (view) (annotate) - [select for diffs]
Wed Jul 2 14:53:44 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.3: +3 -3 lines
Diff to previous 1.2.4.3.2.3 , to branch point 1.2.4.3 , to selected 1.20.4.1
code/lispinit.lisp:
o Revert previous change, preserving order of initialization.

Changes from Paul to allow building of the new code from non-unicode
version:

code/extfmts.lisp
code/fd-stream.lisp

Revision 1.2.4.3.2.3 - (view) (annotate) - [select for diffs]
Wed Jul 2 02:35:58 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.2: +2 -2 lines
Diff to previous 1.2.4.3.2.2 , to branch point 1.2.4.3 , to selected 1.20.4.1
Oops.  Fix typo (missing paren).

Revision 1.2.4.3.2.2 - (view) (annotate) - [select for diffs]
Wed Jul 2 01:27:09 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3.2.1: +3 -2 lines
Diff to previous 1.2.4.3.2.1 , to branch point 1.2.4.3 , to selected 1.20.4.1
Oops.  Don't know how to read #\U+FFFD yet.  Use (code-char #xfffd)
instead.

Revision 1.2.4.3.2.1 - (view) (annotate) - [select for diffs]
Wed Jul 2 01:22:07 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-extfmt-branch
Changes since 1.2.4.3: +178 -71 lines
Diff to previous 1.2.4.3 , to selected 1.20.4.1
More external format support from Paul Foley.

To get external format support I think you need to add :extfmts to
*features*.  But you can't bootstrap with that feature yet.

Initial support for pathname translations so that namestrings can
be converted to an appropriate format before being given to the OS.

Many, many new external formats added.

These changes are all on their own branch for now, until the bootstrap
issue is resolved.  And also so we don't lose these changes from Paul.

Revision 1.2.4.3 - (view) (annotate) - [select for diffs]
Mon Jun 23 15:03:31 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-branch
CVS Tags: unicode-utf16-char-support-2009-03-25, unicode-utf16-char-support-2009-03-26, unicode-utf16-sync-2008-07, unicode-utf16-sync-2008-09, unicode-utf16-sync-2008-11, unicode-utf16-sync-2008-12, unicode-utf16-sync-label-2009-03-16
Branch point for: unicode-utf16-extfmt-branch
Changes since 1.2.4.2: +66 -39 lines
Diff to previous 1.2.4.2 , to branch point 1.2 , to next main 1.42 , to selected 1.20.4.1
Sync to HEAD branch.

Revision 1.5 - (view) (annotate) - [select for diffs]
Fri Jun 20 13:16:33 2008 UTC (5 years, 9 months ago) by rtoy
Branch: MAIN
CVS Tags: RELEASE_19f, label-2009-03-16, label-2009-03-25, merge-sse2-packed, merge-with-19f, release-19f-base, release-19f-pre1, snapshot-2008-07, snapshot-2008-08, snapshot-2008-09, snapshot-2008-10, snapshot-2008-11, snapshot-2008-12, snapshot-2009-01, snapshot-2009-02, snapshot-2009-04, snapshot-2009-05, sse2-base, sse2-checkpoint-2008-10-01, sse2-merge-with-2008-10, sse2-merge-with-2008-11, sse2-packed-2008-11-12, sse2-packed-base
Branch point for: RELEASE-19F-BRANCH, sse2-branch, sse2-packed-branch
Changes since 1.4: +66 -39 lines
Diff to previous 1.4 , to selected 1.20.4.1
Update from Paul:

I've moved some slots out of external-format so they can be shared
between external-formats that are identical in all but some variables.

Also fixed a bug in octets-to-string that made it return one character
short, and used char-code-limit instead of #x100 to determine when
octets-to-char returns a "?", so now it'll work without change on 8 or
16 bit lisps.

Revision 1.2.4.2 - (view) (annotate) - [select for diffs]
Thu Jun 19 21:08:17 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-branch
Changes since 1.2.4.1: +3 -7 lines
Diff to previous 1.2.4.1 , to branch point 1.2 , to selected 1.20.4.1
Merge changes from HEAD for the ext-formats search-list change.

Revision 1.4 - (view) (annotate) - [select for diffs]
Thu Jun 19 20:58:05 2008 UTC (5 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.3: +3 -7 lines
Diff to previous 1.3 , to selected 1.20.4.1
Create a new search-list "ext-formats" that is initialized to
"library:ext-formats/".  This makes it easier to add new directories
where external formats can be found.  The previous use made it
difficult because the formats had to be in the subdirectory
ext-formats.

save.lisp:
o Create and initialize new search-list.

extfmts.lisp:
o Use the new search-list instead of "library:ext-formats/".

Revision 1.2.4.1 - (view) (annotate) - [select for diffs]
Thu Jun 19 03:30:44 2008 UTC (5 years, 9 months ago) by rtoy
Branch: unicode-utf16-branch
Changes since 1.2: +252 -148 lines
Diff to previous 1.2 , to selected 1.20.4.1
Merge changes from HEAD to the unicode-utf16 branch.

Revision 1.3 - (view) (annotate) - [select for diffs]
Thu Jun 19 01:41:34 2008 UTC (5 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.2: +252 -148 lines
Diff to previous 1.2 , to selected 1.20.4.1
New external format stuff from Paul.

bootfiles/19e/boot-2008-06-1.lisp:
o Use this bootfile to compile the change in external-format
  structure.  Just needed to get rid of a restart when compiling pcl.

code/exports.lisp:
o Renames ENCODE-STRING to STRING-ENCODE.  Similarly for
  DECODE-STRING.

code/extfmts.lisp:
pcl/simple-streams/impl.lisp:
pcl/simple-streams/strategy.lisp:
pcl/simple-streams/external-formats/iso8859-1.lisp:
pcl/simple-streams/external-formats/utf-8.lisp:
pcl/simple-streams/external-formats/void.lisp:
o Updated for new external format.  I think the main change is not
  having to do a funcall for each character.

pcl/simple-streams/external-formats/aliases
o New file describing different names for external formats.

pcl/simple-streams/external-formats/crlf.lisp:
o New file for composing external format for CR/LF

pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
o New files supporting UTF-16 BE and LE formats.

tools/make-main-dist.sh:
o Copy over the new files and the aliases file too.

Revision 1.2 - (view) (annotate) - [select for diffs]
Wed Oct 31 14:37:38 2007 UTC (6 years, 5 months ago) by rtoy
Branch: MAIN
CVS Tags: release-19d, release-19e, release-19e-base, release-19e-pre1, release-19e-pre2, snapshot-2007-11, snapshot-2007-12, snapshot-2008-01, snapshot-2008-02, snapshot-2008-03, snapshot-2008-04, snapshot-2008-05, snapshot-2008-06, unicode-utf16-base, unicode-utf16-string-support
Branch point for: release-19e-branch, unicode-utf16-branch
Changes since 1.1: +23 -19 lines
Diff to previous 1.1 , to selected 1.20.4.1
Update from Paul Foley.

o Disable package errors when loading up external formats.
o A minor patch allowing string-to-octets and vice versa to write into
  a preallocated array (though they might still allocate a bigger one
  if necessary).
o Fix up any confusion between simple-base-string and simple-string so
  that nothing breaks when/if they're not the same.

Revision 1.1 - (view) (annotate) - [select for diffs]
Thu Oct 25 15:17:07 2007 UTC (6 years, 5 months ago) by rtoy
Branch: MAIN
Diff to selected 1.20.4.1
Import Paul Foley's external-formats support.

New files:
o code/extfmts.lisp
o pcl/simple-streams/external-formats/iso8859-1.lisp
o pcl/simple-streams/external-formats/void.lisp

code/exports.lisp:
o Export the new symbols STRING-TO-OCTETS, OCTETS-TO-STRING,
  *DEFAULT-EXTERNAL-FORMAT*, ENCODE-STRING, and DECODE-STRING from the
  STREAM package
o Make the symbols in the EXT package too.

pcl/simple-streams/internal.lisp:
o Move the implementation of STRING-TO-OCTETS and friends to a new
  file (extfmts.lisp).

pcl/simple-streams/external-formats/utf-8.lisp:
o New implementation.

tools/make-main-dist.sh:
o Create new target directory to hold external formats
o Copy all of the external formats to the new directory.

tools/pclcom.lisp:
o Compile new code

tools/worldcom.lisp:
o Compile code/extfmts.lisp

tools/worldload.lisp:
o Load code/extfmts.lisp
