Add trailing newline.
Some changes to replace calls to gettext with _"" or _N"" for things compiled with and without Unicode. This is needed so that the pot files have the same content for both unicode and non-unicode builds. (The _"" and _N"" are handled by the reader, so things that are conditionalized out still get processed, unlike using gettext.)
o Inhibit warnings from SURROGATEP; I'm tired of seeing the code deletion notes now.
o Tell the compiler what type the first return value of CODEPOINT is. Apparently, the compiler can't figure that out itself.
Add support for Unicode 5.2. The normalization and wordbreak tests pass.
code/string.lisp:
  o In %compose, handle the case where the composite character is outside the BMP and thus needs special handling for our UTF-16 strings.
code/unidata.lisp:
  o CJK Ideograph range has changed in 5.2.
  o Fix bug in build-composition-table. We were not correctly handling the case where the decomposition of a codepoint was outside the BMP. Special care is needed to handle the UTF-16 strings that we use.
  o The keys for the pairwise composition table are the full codepoints, so we need to shift one by 21 bits instead of 16.
tools/build-unidata.lisp:
  o Update minor version to 2.
i18n/BidiMirroring.txt
i18n/CaseFolding.txt
i18n/CompositionExclusions.txt
i18n/DerivedNormalizationProps.txt
i18n/NameAliases.txt
i18n/NormalizationCorrections.txt
i18n/SpecialCasing.txt
i18n/UnicodeData.txt
i18n/WordBreakProperty.txt
i18n/tests/NormalizationTest.txt
i18n/tests/WordBreakTest.txt:
  o Updated from Unicode 5.2.
i18n/unidata.bin:
  o Regenerated from new Unicode 5.2 files.
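The 21-bit shift in the composition-table fix can be illustrated with a small sketch. This is Python rather than the Lisp in code/unidata.lisp, and it assumes the key is built by shifting one codepoint and OR-ing in the other; the log only says "shift one by 21 bits". The names pair_key and unpack_key are hypothetical.

```python
# Illustrative sketch only; not the CMUCL (Lisp) implementation.
MAX_CODEPOINT = 0x10FFFF  # largest Unicode codepoint; it fits in 21 bits


def pair_key(first, second, shift=21):
    """Pack two codepoints into one integer key (hypothetical helper)."""
    return (first << shift) | second


def unpack_key(key, shift=21):
    """Recover the two codepoints from a key built by pair_key."""
    return key >> shift, key & ((1 << shift) - 1)


# With a 16-bit shift, a second codepoint outside the BMP spills into the
# bits that hold the first codepoint, so distinct pairs can collide.
collide_a = pair_key(0x0001, 0x0000, shift=16)
collide_b = pair_key(0x0000, 0x10000, shift=16)

# With a 21-bit shift the same two pairs get distinct keys.
distinct_a = pair_key(0x0001, 0x0000)
distinct_b = pair_key(0x0000, 0x10000)

print(collide_a == collide_b)    # the 16-bit keys are equal
print(distinct_a == distinct_b)  # the 21-bit keys are not
```

Because every codepoint is at most 0x10FFFF, a 21-bit field is wide enough that the packing can always be reversed.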
Add function to convert a sequence of codepoints to a string and a function to convert a string to a list of codepoints.
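As a rough illustration of what such conversions involve when strings are stored as UTF-16, here is a Python sketch that models a string as a list of 16-bit code units. The function names are hypothetical and are not the names added to string.lisp; passing unpaired surrogates through unchanged is also an assumption of this sketch.

```python
# Illustrative sketch only; a "string" here is a list of UTF-16 code units.

def codepoints_to_utf16(codepoints):
    """Encode a sequence of codepoints as UTF-16 code units."""
    units = []
    for cp in codepoints:
        if cp < 0x10000:
            units.append(cp)  # BMP codepoint: a single code unit
        else:
            offset = cp - 0x10000
            units.append(0xD800 + (offset >> 10))    # leading surrogate
            units.append(0xDC00 + (offset & 0x3FF))  # trailing surrogate
    return units


def utf16_to_codepoints(units):
    """Decode UTF-16 code units into a list of codepoints."""
    codepoints = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            low = units[i + 1]
            codepoints.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            codepoints.append(unit)  # BMP unit, or an unpaired surrogate kept as-is
            i += 1
    return codepoints


print([hex(u) for u in codepoints_to_utf16([0x41, 0x1F600])])
# → ['0x41', '0xd83d', '0xde00']
```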
Change uses of _"foo" to (intl:gettext "foo"). This is because slime may get confused with source locations if the reader macros are installed.
Remove _N"" reader macro from docstrings when possible.
Merge intl-branch 2010-03-18 to HEAD. To build, you need to use boot-2010-02-1 as the bootstrap file. You should probably also use the new -P option for build.sh to generate and update the po files while building.
Merge changes from unicode-string-buffer-impl-branch which gives faster reads on external-formats. This is done by adding an additional buffer to streams so we can convert the entire in-buffer into characters all at once.
To build this change, you need to do a cross-compile using boot-2009-10-1-cross.lisp. Using that build, do a normal build with these sources. For a non-unicode build use boot-2009-10-01.lisp with a 20a non-unicode build.
code/extfmts.lisp:
  o Add another slot to the extfmts for copying the state.
  o Modify EF-OCTETS-TO-STRING and OCTETS-TO-STRING to support the necessary changes for fast formats. This is incompatible with the previous version because the string is not grown if needed.
code/fd-stream-extfmt.lisp:
  o Set *enable-stream-buffer-p* to T so we have fast streams.
code/fd-stream.lisp:
  o Add new slots to support fast streams.
  o In SET-ROUTINES, initialize the new slots appropriately.
  o Update UNREAD-CHAR to be able to back up in the string buffer to unread.
  o Add implementation to copy the state of an external format.
code/stream.lisp:
  o Change %SET-FD-STREAM-EXTERNAL-FORMAT to be able to change formats even if we've already converted the buffer with a different format. We reconvert the buffer with the old format until we reach the current character. Then the remaining octets are converted using the new format and stored in the string buffer.
  o Add FAST-READ-CHAR-STRING-REFILL to refill the string buffer, like FAST-READ-CHAR-REFILL does for the octet in-buffer.
code/struct.lisp:
  o Add new slots to hold the string buffer, the current index, and length. These are needed for the fast formats.
code/sysmacs.lisp:
  o Update PREPARE-FOR-FAST-READ-CHAR, DONE-WITH-FAST-READ-CHAR, and FAST-READ-CHAR to support the string buffer.
code/string.lisp:
  o Microoptimization of SURROGATEP to reduce the number of branches.
general-info/release-20b.txt:
  o Update with these changes.
pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
pcl/simple-streams/external-formats/utf-16.lisp:
  o These formats actually have state, so update them to handle an initial state. This is needed if the string buffer ends with a leading surrogate and the next string buffer starts with a trailing surrogate. The conversion needs to combine the surrogates together.
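The need for state in the UTF-16 formats can be sketched as follows. This is a Python illustration, not the CMUCL external-format code: decode_utf16_chunk and its pending argument are hypothetical, and substituting U+FFFD for an unpaired leading surrogate is an assumption made here, not something the log specifies.

```python
# Illustrative sketch only; buffers are lists of UTF-16 code units.

def decode_utf16_chunk(units, pending=None):
    """Decode one buffer of code units.

    `pending` is a leading surrogate carried over from the previous buffer,
    or None. Returns (codepoints, new_pending) so the caller can pass the
    state on to the next buffer.
    """
    codepoints = []
    for unit in units:
        if pending is not None:
            if 0xDC00 <= unit <= 0xDFFF:
                # Combine the carried leading surrogate with this trailing one.
                codepoints.append(
                    0x10000 + ((pending - 0xD800) << 10) + (unit - 0xDC00)
                )
                pending = None
                continue
            # Assumption: an unpaired leading surrogate becomes U+FFFD.
            codepoints.append(0xFFFD)
            pending = None
        if 0xD800 <= unit <= 0xDBFF:
            pending = unit  # wait for the trailing surrogate
        else:
            # BMP unit; a lone trailing surrogate is passed through unchanged.
            codepoints.append(unit)
    return codepoints, pending


# A surrogate pair split across two buffers is still decoded as one codepoint.
first, state = decode_utf16_chunk([0x41, 0xD83D])
second, state = decode_utf16_chunk([0xDE00, 0x42], state)
print([hex(c) for c in first + second])  # → ['0x41', '0x1f600', '0x42']
```

Without the carried state, the first buffer would end in a dangling leading surrogate and the second would begin with a dangling trailing one.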
Oops. Remove old code that didn't support our UTF-16 strings.
Add support for the Unicode word break algorithm for STRING-CAPITALIZE. Not sure about the appropriate interface, though.
code/string.lisp:
  o Add Unicode word break algorithm. Based on Scheme code by William Clinger. Used with permission.
  o Update STRING-CAPITALIZE to take another keyword arg to indicate if we should use the Unicode word break algorithm. Default is not to use the Unicode algorithm.
compiler/fndb.lisp:
  o Update defknown for string-capitalize.
i18n/tests/WordBreakTest.txt:
  o New test file for the word break algorithm.
i18n/tests/word-break-test.lisp:
  o New file to run the word break test.
Use more descriptive argument names for SURROGATEP and SURROGATES-TO-CODEPOINT.
Oops. utf16-string-p was returning NIL if the codepoint was assigned. It should return NIL if the codepoint is NOT assigned.
Clean up a few compiler warnings about unused variables.
Cleanups for non-unicode build.
code/stream.lisp:
  o Only define (setf stream-external-format) for Unicode builds.
  o In stream-external-format, don't try to look up the external format from the fd-stream structure, which doesn't exist in non-unicode builds.
code/strings.lisp:
  o Conditionalize out things that will only work if unicode is available.
tools/worldcom.lisp:
  o Only compile fd-stream-extfmt for unicode builds.
code/string.lisp:
  o Only define STRING-TO-NFD, STRING-TO-NFKD, and STRING-TO-NFKC for Unicode builds. Conditionalize out their support functions too.
  o Update export list to be conditional on Unicode too.
  o Use new name for get-pairwise-composition.
code/exports.lisp:
  o Update export list to be conditional on Unicode for above changes in string.lisp.
code/unidata.lisp:
  o Change name from GET-PAIRWISE-COMPOSITION to UNICODE-PAIRWISE-COMPOSITION to match other Unicode function names.
Merge Unicode work to trunk. From label unicode-utf16-extfmt-2009-06-11.
Instead of ignoring the :element-type argument to MAKE-STRING, we check that it's a valid subtype of character (then ignore it).
From Eric Marsden: Fix some error types to be ANSI compliant.
A few well placed inhibit-warnings declarations to suppress noise in compile-lisp.log. Only 46/130 notes left.
ANSI CL compat. changes:
  o Add an optional environment argument to constantp; ignored by CMUCL.
  o Add the :element-type keyword to make-string.
Merged DTC's patch to string<>=*-body which fixes various problems that arose when :start2 :end2 values were specified.
Fix header boilerplate.
Removed an extra ``)''.
Changed STRING-xxxCASE to not assign arguments.
Changed the WITH-xxx-STRINGs macros to simply use WITH-ARRAY-DATA, now that it is more clever. Also, changed them to accept any STRINGable thing, instead of just strings and symbols. These macros now bind the offset var instead of randomly setting it.
New file header with RCS header FILE-COMMENT.
Moved MIPS branch onto trunk; no merge necessary.