Log of /src/code/extfmts.lisp

Revision 1.20
Sun Oct 18 14:21:23 2009 UTC (4 years, 5 months ago) by rtoy
Branch: MAIN
CVS Tags: amd64-dd-start, intl-2-branch-base, intl-branch-base, snapshot-2009-11, snapshot-2009-12, snapshot-2010-01, snapshot-2010-02
Branch point for: amd64-dd-branch, intl-2-branch, intl-branch
Changes since 1.19: +52 -38 lines
Merge changes from unicode-string-buffer-impl-branch which gives
faster reads on external-formats.  This is done by adding an
additional buffer to streams so we can convert the entire in-buffer
into characters all at once.

To build this change, you need to do a cross-compile using
boot-2009-10-1-cross.lisp.  Using that build, do a normal build with
these sources.

For a non-unicode build use boot-2009-10-01.lisp with a 20a
non-unicode build.

code/extfmts.lisp:
o Add another slot to the extfmts for copying the state.
o Modify EF-OCTETS-TO-STRING and OCTETS-TO-STRING to support the
  necessary changes for fast formats.  This is incompatible with the
  previous version because the string is not grown if needed.

code/fd-stream-extfmt.lisp:
o Set *enable-stream-buffer-p* to T so we have fast streams.

code/fd-stream.lisp:
o Add new slots to support fast streams.
o In SET-ROUTINES, initialize the new slots appropriately.
o Update UNREAD-CHAR to be able to back up in the string buffer to
  unread.
o Add implementation to copy the state of an external format.

code/stream.lisp:
o Change %SET-FD-STREAM-EXTERNAL-FORMAT to be able to change formats
  even if we've already converted the buffer with a different format.
  We reconvert the buffer with the old format until we reach the
  current character.  Then the remaining octets are converted using
  the new format and stored in the string buffer.
o Add FAST-READ-CHAR-STRING-REFILL to refill the string buffer, like
  FAST-READ-CHAR-REFILL does for the octet in-buffer.
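
The format switch described for %SET-FD-STREAM-EXTERNAL-FORMAT can be
illustrated with a minimal sketch.  It is written in Python, not Lisp,
and is not the CMUCL code: the name switch_format and its parameters
are hypothetical.  It assumes the raw octet buffer is still available
and that the number of characters already consumed is known.

```python
import codecs

def switch_format(octets, chars_consumed, old_format, new_format):
    """Return the not-yet-read part of `octets`, decoded with `new_format`.

    The buffer is re-decoded with `old_format`, one octet at a time, until
    `chars_consumed` characters have been produced.  That locates the octet
    offset of the current character; everything after it is decoded with
    `new_format`.  (Hypothetical helper, for illustration only.)
    """
    decoder = codecs.getincrementaldecoder(old_format)()
    offset = 0
    consumed = 0
    while consumed < chars_consumed and offset < len(octets):
        # An incomplete multi-octet sequence yields "" until it is complete.
        consumed += len(decoder.decode(octets[offset:offset + 1]))
        offset += 1
    return octets[offset:].decode(new_format, errors="replace")
```

For example, with a buffer holding one UTF-8 character followed by
Latin-1 octets, switch_format(b"\xc3\xa9\xf1x", 1, "utf-8", "latin-1")
skips the two octets of the first character and decodes the rest as
Latin-1.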

code/struct.lisp:
o Add new slots to hold the string buffer, the current index, and
  length.  These are needed for the fast formats.

code/sysmacs.lisp:
o Update PREPARE-FOR-FAST-READ-CHAR, DONE-WITH-FAST-READ-CHAR, and
  FAST-READ-CHAR to support the string buffer.

code/string.lisp:
o Microoptimization of SURROGATEP to reduce the number of branches.

general-info/release-20b.txt:
o Update with these changes

pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
pcl/simple-streams/external-formats/utf-16.lisp:
o These formats actually have state, so update them to take and handle
  an initial state.  This is needed if the string buffer ends with a
  leading surrogate and the next string buffer starts with a trailing
  surrogate.  The conversion needs to combine the surrogates together.
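
Why the UTF-16 formats need an initial state can be shown with a small
sketch.  It is written in Python, not Lisp, and is not the CMUCL
implementation: decode_chunk and its state argument are hypothetical
names.  The assumed model is that input arrives as chunks of 16-bit
code units and that a leading surrogate at the end of one chunk must
be carried into the next.

```python
REPLACEMENT = 0xFFFD  # Unicode replacement character

def is_leading(unit):
    return 0xD800 <= unit <= 0xDBFF

def is_trailing(unit):
    return 0xDC00 <= unit <= 0xDFFF

def decode_chunk(units, state=None):
    """Decode one chunk of UTF-16 code units into code points.

    `state` is a leading surrogate left over from the previous chunk,
    or None.  Returns (codepoints, new_state); pass new_state to the
    next call so a pair split across two chunks is still combined.
    """
    out = []
    pending = state
    for unit in units:
        if pending is not None:
            lead, pending = pending, None
            if is_trailing(unit):
                # Combine the pair into a single code point.
                out.append(0x10000 + ((lead - 0xD800) << 10) + (unit - 0xDC00))
                continue
            out.append(REPLACEMENT)  # leading surrogate with no partner
        if is_leading(unit):
            pending = unit           # wait for the trailing surrogate
        elif is_trailing(unit):
            out.append(REPLACEMENT)  # lone trailing surrogate
        else:
            out.append(unit)
    return out, pending
```

Decoding [0x41, 0xD83D] leaves 0xD83D as the state; feeding that state
into the next chunk [0xDE00, 0x42] produces the single code point
U+1F600 followed by 0x42.  Without the state, the second chunk would
start with a lone trailing surrogate.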

Revision 1.19
Fri Oct 2 20:15:04 2009 UTC (4 years, 6 months ago) by rtoy
Branch: MAIN
Changes since 1.18: +8 -1 lines
Add docstring for SET-SYSTEM-EXTERNAL-FORMAT.

Revision 1.18
Sat Sep 19 14:12:22 2009 UTC (4 years, 6 months ago) by rtoy
Branch: MAIN
CVS Tags: unicode-string-buffer-base, unicode-string-buffer-impl-base
Branch point for: unicode-string-buffer-branch, unicode-string-buffer-impl-branch
Changes since 1.17: +24 -22 lines
Merge changes from the release-20a branch.

Revision 1.17
Thu Sep 17 16:15:34 2009 UTC (4 years, 7 months ago) by rtoy
Branch: MAIN
Changes since 1.16: +20 -14 lines
Merge changes from 20a branch.

Revision 1.16
Wed Sep 9 15:51:27 2009 UTC (4 years, 7 months ago) by rtoy
Branch: MAIN
Changes since 1.15: +83 -16 lines
Merge changes from 20a-pre1 (tag release-20a-pre1) to trunk.

Revision 1.15
Wed Aug 26 16:25:41 2009 UTC (4 years, 7 months ago) by rtoy
Branch: MAIN
CVS Tags: release-20a-base
Branch point for: RELEASE-20A-BRANCH
Changes since 1.14: +56 -17 lines
Add support for flushing out any state in an external format when
closing an output stream.  This causes things like

(with-open-file (s "foo" :direction :output :external-format :utf-8)
  (write-char #\u+d800 s))

to output the replacement character instead of creating an empty file.

code/extfmts.lisp:
o Add new slot to efx structure to hold the function to flush the
  state in an external format.
o Add accessor for the flush-state slot.
o Update DEFINE-EXTERNAL-FORMAT to allow specifying the flush
  function.
o Add macro to call the flush-state function.
o Add +EF-FLUSH+
o Use vm::defenum to name the constants instead of the hand-written
  values.
o Export +REPLACEMENT-CHARACTER-CODE+
o Document the slots in an efx structure.

code/fd-stream.lisp:
o Add ef-flush def-ef-macro to flush the state of an external format
  when closing an output file.  If ef-flush-state is NIL, we just call
  EF-COUT to send out the replacement character.  Otherwise, the
  flush-state function is called to handle it.
o When closing an output character stream, call ef-flush to flush any
  state before flushing the buffers of the stream.
o Document the unicode slots in an fd-stream.

code/exports.lisp:
o Export +REPLACEMENT-CHARACTER-CODE+
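
The flush-on-close behavior in this revision can be sketched as
follows.  This is Python, not Lisp, and not the CMUCL code: the class
Utf8Writer and its methods are hypothetical.  The assumed model is a
UTF-8 output stream that receives 16-bit code units and holds back a
leading surrogate until it sees what follows.

```python
REPLACEMENT_UTF8 = "\ufffd".encode("utf-8")  # b'\xef\xbf\xbd'

class Utf8Writer:
    """Hypothetical UTF-8 output stream fed with UTF-16 code units."""

    def __init__(self):
        self.buffer = bytearray()
        self.pending = None  # leading surrogate waiting for its partner

    def write_unit(self, unit):
        if self.pending is not None:
            lead, self.pending = self.pending, None
            if 0xDC00 <= unit <= 0xDFFF:
                cp = 0x10000 + ((lead - 0xD800) << 10) + (unit - 0xDC00)
                self.buffer += chr(cp).encode("utf-8")
                return
            self.buffer += REPLACEMENT_UTF8  # lead had no partner
        if 0xD800 <= unit <= 0xDBFF:
            self.pending = unit              # nothing is written yet
        elif 0xDC00 <= unit <= 0xDFFF:
            self.buffer += REPLACEMENT_UTF8  # lone trailing surrogate
        else:
            self.buffer += chr(unit).encode("utf-8")

    def close(self):
        # Flush any held state before the buffer is written out; without
        # this step a lone leading surrogate would vanish silently.
        if self.pending is not None:
            self.buffer += REPLACEMENT_UTF8
            self.pending = None
        return bytes(self.buffer)
```

Writing only 0xD800 and then closing gives the three octets of the
replacement character rather than empty output, which mirrors the
with-open-file example in the commit message.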

Revision 1.14
Thu Aug 13 13:55:13 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.13: +13 -4 lines
Illegal surrogate sequences (leading surrogate without trailing
surrogate or a lone trailing surrogate) get replaced with the
replacement character.

Revision 1.13
Tue Aug 11 03:30:27 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
CVS Tags: snapshot-2009-08
Changes since 1.12: +7 -2 lines
o Put some comments back in.
o Put back some unicode/unicode-bootstrap conditionals.

Revision 1.12
Mon Aug 10 22:14:26 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.11: +6 -4 lines
Compile and load the external format code.

Revision 1.11
Mon Aug 10 16:47:41 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.10: +136 -110 lines
Fixes from Paul Foley:

o Standard streams no longer change formats when
  *default-external-format* changes.  Use
  stream:set-system-external-format instead, or (setf
  external-format).
o char-to-octets properly handles surrogate characters being written.
o Makes simple-streams work again.

This change needs to be cross-compiled.  2009-07 binaries work for
cross-compiling using the 19e/boot-2008-05-cross-unicode-*.lisp
cross-compile script.

Revision 1.10
Thu Jul 23 21:36:51 2009 UTC (4 years, 8 months ago) by rtoy
Branch: MAIN
Changes since 1.9: +8 -2 lines
code/extfmts.lisp:
o Move the +ss-ef-foo+ constants to here from strategy.lisp, and
  update them so they don't overlap with existing +ef-foo+ constants.
o Update +ef-max+ accordingly.

pcl/simple-streams/impl.lisp:
o Use +ss-ef-str+ instead of +ef-str+ in simple-stream-strlen.

pcl/simple-streams/strategy.lisp:
o Comment out +ss-ef-foo+ constants.
o Use +ef-max+ instead of +ss-ef-max+, which is no longer defined.
o Fix bugs in %dc-write-chars-fn:
  - Use ef variable
  - Need to call flush-out-buffer, not flush-buffer for dual-channel
    streams.

Revision 1.9
Thu Jun 25 02:18:02 2009 UTC (4 years, 9 months ago) by rtoy
Branch: MAIN
CVS Tags: snapshot-2009-07
Changes since 1.8: +24 -16 lines
Update STRING-ENCODE and STRING-DECODE to handle surrogate pairs
correctly.  Previously, each surrogate was converted individually.
This is wrong; they should be treated as a single codepoint that is
converted.
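
The difference between converting each surrogate individually and
converting the pair as one codepoint can be seen in a short sketch.
It is Python, not Lisp, and does not use STRING-ENCODE itself;
combine_surrogates is a hypothetical helper, and UTF-8 is chosen only
as an example target format.

```python
def combine_surrogates(lead, trail):
    """Combine a UTF-16 surrogate pair into one code point."""
    return 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00)

lead, trail = 0xD83D, 0xDE00  # UTF-16 pair for U+1F600

# Wrong: encode each surrogate as if it were its own code point.
# "surrogatepass" is used only to make Python emit this ill-formed output.
separate = (chr(lead) + chr(trail)).encode("utf-8", "surrogatepass")

# Right: combine the pair first, then encode the single code point.
combined = chr(combine_surrogates(lead, trail)).encode("utf-8")

print(separate.hex(" "))  # six octets, not valid UTF-8
print(combined.hex(" "))  # four octets
```

The per-surrogate result is two three-octet sequences that a strict
UTF-8 decoder rejects; the combined result is the four-octet encoding
of the codepoint.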

Revision 1.8
Wed Jun 24 16:46:18 2009 UTC (4 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.7: +2 -1 lines
Fix FILE-STRING-LENGTH to handle unicode streams.  From Paul.

Revision 1.7
Sun Jun 21 13:53:59 2009 UTC (4 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.6: +4 -4 lines
Fix bug introduced when converting tables from hashes to ntries.  From
Paul Foley.  This makes mac-roman and derived external formats work
once again.

Revision 1.6
Thu Jun 11 16:03:57 2009 UTC (4 years, 10 months ago) by rtoy
Branch: MAIN
CVS Tags: merged-unicode-utf16-extfmt-2009-06-11, portable-clx-base, portable-clx-import-2009-06-16
Branch point for: portable-clx-branch
Changes since 1.5: +421 -153 lines
Merge Unicode work to trunk.  From label
unicode-utf16-extfmt-2009-06-11.

Revision 1.5
Fri Jun 20 13:16:33 2008 UTC (5 years, 9 months ago) by rtoy
Branch: MAIN
CVS Tags: RELEASE_19f, label-2009-03-16, label-2009-03-25, merge-sse2-packed, merge-with-19f, release-19f-base, release-19f-pre1, snapshot-2008-07, snapshot-2008-08, snapshot-2008-09, snapshot-2008-10, snapshot-2008-11, snapshot-2008-12, snapshot-2009-01, snapshot-2009-02, snapshot-2009-04, snapshot-2009-05, sse2-base, sse2-checkpoint-2008-10-01, sse2-merge-with-2008-10, sse2-merge-with-2008-11, sse2-packed-2008-11-12, sse2-packed-base
Branch point for: RELEASE-19F-BRANCH, sse2-branch, sse2-packed-branch
Changes since 1.4: +66 -39 lines
Update from Paul:

I've moved some slots out of external-format so they can be shared
between external-formats that are identical in all but some variables.

Also fixed a bug in octets-to-string that made it return one character
short, and used char-code-limit instead of #x100 to determine when
octets-to-char returns a "?", so now it'll work without change on 8 or
16 bit lisps.

Revision 1.4
Thu Jun 19 20:58:05 2008 UTC (5 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.3: +3 -7 lines
Create a new search-list "ext-formats" that is initialized to
"library:ext-formats/".  This makes it easier to add new directories
where external formats can be found.  The previous use made it
difficult because the formats had to be in the subdirectory
ext-formats.

save.lisp:
o Create and initialize new search-list.

extfmts.lisp:
o Use the new search-list instead of "library:ext-formats/".

Revision 1.3
Thu Jun 19 01:41:34 2008 UTC (5 years, 9 months ago) by rtoy
Branch: MAIN
Changes since 1.2: +252 -148 lines
New external format stuff from Paul.

bootfiles/19e/boot-2008-06-1.lisp:
o Use this bootfile to compile the change in external-format
  structure.  Just needed to get rid of a restart when compiling pcl.

code/exports.lisp:
o Renames ENCODE-STRING to STRING-ENCODE.  Similarly for
  DECODE-STRING.

code/extfmts.lisp:
pcl/simple-streams/impl.lisp:
pcl/simple-streams/strategy.lisp:
pcl/simple-streams/external-formats/iso8859-1.lisp:
pcl/simple-streams/external-formats/utf-8.lisp:
pcl/simple-streams/external-formats/void.lisp:
o Updated for new external format.  I think the main change is not
  having to do a funcall for each character.

pcl/simple-streams/external-formats/aliases
o New file describing different names for external formats.

pcl/simple-streams/external-formats/crlf.lisp:
o New file for composing external format for CR/LF

pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
o New files supporting UTF-16 BE and LE formats.

tools/make-main-dist.sh:
o Copy over the new files and the aliases file too.

Revision 1.2
Wed Oct 31 14:37:38 2007 UTC (6 years, 5 months ago) by rtoy
Branch: MAIN
CVS Tags: release-19d, release-19e, release-19e-base, release-19e-pre1, release-19e-pre2, snapshot-2007-11, snapshot-2007-12, snapshot-2008-01, snapshot-2008-02, snapshot-2008-03, snapshot-2008-04, snapshot-2008-05, snapshot-2008-06, unicode-utf16-base, unicode-utf16-string-support
Branch point for: release-19e-branch, unicode-utf16-branch
Changes since 1.1: +23 -19 lines
Update from Paul Foley.

o Disable package errors when loading up external formats.
o A minor patch allowing string-to-octets and vice versa to write into
  a preallocated array (though they might still allocate a bigger one
  if necessary).
o Fix up any confusion between simple-base-string and simple-string so
  that nothing breaks when/if they're not the same.

Revision 1.1
Thu Oct 25 15:17:07 2007 UTC (6 years, 5 months ago) by rtoy
Branch: MAIN
Import Paul Foley's external-formats support.

New files:
o code/extfmts.lisp
o pcl/simple-streams/external-formats/iso8859-1.lisp
o pcl/simple-streams/external-formats/void.lisp

code/exports.lisp:
o Export the new symbols STRING-TO-OCTETS, OCTETS-TO-STRING,
  *DEFAULT-EXTERNAL-FORMAT*, ENCODE-STRING, and DECODE-STRING from the
  STREAM package
o Make the symbols in the EXT package too.

pcl/simple-streams/internal.lisp:
o Move the implementation of STRING-TO-OCTETS and friends to a new
  file (extfmts.lisp).

pcl/simple-streams/external-formats/utf-8.lisp:
o New implementation.

tools/make-main-dist.sh:
o Create new target directory to hold external formats
o Copy all of the external formats to the new directory.

tools/pclcom.lisp:
o Compile new code

tools/worldcom.lisp:
o Compile code/extfmts.lisp

tools/worldload.lisp:
o Load code/extfmts.lisp
