- May 25, 2013
-
-
Raymond Toy authored
functions now live in the new UNICODE package.
src/code/exports.lisp::
  * Export some unicode functions and constants.
src/code/string.lisp::
  * Removed the extended versions of string-upcase and friends.
  * Export surrogates function.
  * Make sure with-one-string is defined so the unicode package can use it.
src/code/unicode.lisp::
  * New file with extended versions of string-upcase and friends.
src/code/unidata.lisp::
  * Export some unicode functions and constants.
src/compiler/fndb.lisp::
  * Update defknowns for string-upcase and friends.
src/tools/worldbuild.lisp::
  * Build unicode.lisp
src/tools/worldcom.lisp::
  * Load unicode.lisp
-
- May 21, 2013
-
-
Raymond Toy authored
CLHS.
-
- May 19, 2013
-
-
Raymond Toy authored
From ticket 81, the tests are now:
{{{
(time (prog1 t (time-rev *s*)))
; Evaluation took:
;   0.49 seconds of real time
;   0.481813 seconds of user run time
;   0.003624 seconds of system run time
;   1,490,776,936 CPU cycles
;   [Run times include 0.13 seconds GC run time]
;   0 page faults and
;   200,073,704 bytes consed.

(time (prog1 t (time-rev *s2*)))
; Evaluation took:
;   0.97 seconds of real time
;   0.965893 seconds of user run time
;   0.005139 seconds of system run time
;   2,980,415,911 CPU cycles
;   [Run times include 0.23 seconds GC run time]
;   0 page faults and
;   400,005,560 bytes consed.
}}}
So the new string-reverse* is 20 times faster for strings without surrogates and 10 times faster for strings containing only surrogates.
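For readers unfamiliar with why surrogates matter when reversing a UTF-16 string, the following is a minimal sketch in portable Common Lisp. It is not the STRING-REVERSE* implementation from this commit. It works on a vector of 16-bit code units rather than on a string, and the names `lead-surrogate-p`, `trail-surrogate-p`, and `reverse-utf16-units` are hypothetical, introduced only for illustration.

```lisp
;; Illustrative sketch only; hypothetical names, not CMUCL's STRING-REVERSE*.
(defun lead-surrogate-p (unit)
  "True if UNIT is a UTF-16 leading (high) surrogate, #xD800 to #xDBFF."
  (<= #xD800 unit #xDBFF))

(defun trail-surrogate-p (unit)
  "True if UNIT is a UTF-16 trailing (low) surrogate, #xDC00 to #xDFFF."
  (<= #xDC00 unit #xDFFF))

(defun reverse-utf16-units (units)
  "Return a fresh vector holding UNITS reversed codepoint by codepoint.
A lead/trail surrogate pair is copied as a unit so it stays in order."
  (let* ((n (length units))
         (result (make-array n :element-type '(unsigned-byte 16)))
         (out n)                        ; next free slot is just below OUT
         (i 0))
    (loop while (< i n)
          do (let ((u (aref units i)))
               (cond ((and (lead-surrogate-p u)
                           (< (1+ i) n)
                           (trail-surrogate-p (aref units (1+ i))))
                      ;; Well-formed pair: copy both units, lead first.
                      (decf out 2)
                      (setf (aref result out) u
                            (aref result (1+ out)) (aref units (1+ i)))
                      (incf i 2))
                     (t
                      ;; BMP code unit or unpaired surrogate: copy as-is.
                      (decf out)
                      (setf (aref result out) u)
                      (incf i)))))
    result))
```

For example, `(reverse-utf16-units (vector #x61 #xD83D #xDE00 #x62))` keeps the pair `#xD83D #xDE00` in lead-then-trail order between the swapped `#x62` and `#x61`. A plain code-unit reversal would instead produce a trailing surrogate followed by a leading one, which is not valid UTF-16.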
-
- Mar 06, 2013
-
-
Raymond Toy authored
-
Raymond Toy authored
-
- Feb 04, 2012
-
-
Raymond Toy authored
{{{:UNICODE-WORD-BREAK}}} keyword parameter that enables the Unicode word-breaking algorithm to determine word boundaries.
-
- Nov 04, 2011
-
-
Raymond Toy authored
-
- Sep 25, 2011
-
-
Raymond Toy authored
entries with just the file path, removing the revision number, date, author and state. The actual information is now computed during compilation and stored in the fasl itself. (See ticket:48.)
-
- Oct 26, 2010
-
-
rtoy authored
-
- Oct 13, 2010
-
-
rtoy authored
compiled with and without Unicode. This is needed so that the pot files have the same content for both unicode and non-unicode builds. (The _"" and _N"" are handled by the reader, so things that are conditionalized out still get processed, unlike using gettext.)
-
- Sep 20, 2010
-
-
rtoy authored
notes now. o Tell the compiler what type the first return value of CODEPOINT is. Apparently, the compiler can't figure that out itself.
-
- Sep 15, 2010
-
-
rtoy authored
code/string.lisp:
  o In %compose, handle the case where the composite character is outside the BMP and thus needs special handling for our UTF-16 strings.
code/unidata.lisp:
  o CJK Ideograph range has changed in 5.2.
  o Fix bug in build-composition-table. We were not correctly handling the case where the decomposition of a codepoint was outside the BMP. Special care is needed to handle the UTF-16 strings that we use.
  o The keys for the pairwise composition table are the full codepoints, so we need to shift one by 21 bits instead of 16.
tools/build-unidata.lisp:
  o Update minor version to 2.
i18n/BidiMirroring.txt
i18n/CaseFolding.txt
i18n/CompositionExclusions.txt
i18n/DerivedNormalizationProps.txt
i18n/NameAliases.txt
i18n/NormalizationCorrections.txt
i18n/SpecialCasing.txt
i18n/UnicodeData.txt
i18n/WordBreakProperty.txt
i18n/tests/NormalizationTest.txt
i18n/tests/WordBreakTest.txt
  o Updated from Unicode 5.2.
i18n/unidata.bin:
  o Regenerated from new Unicode 5.2 files.
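The 21-bit shift mentioned above can be illustrated with a small sketch. Unicode codepoints run up to #x10FFFF, which needs 21 bits, so packing two full codepoints into one integer key with a 16-bit shift lets distinct pairs collide. The function `composition-key` below is hypothetical; which codepoint is shifted and how the real table stores its keys are assumptions made for the example, not details taken from unidata.lisp.

```lisp
;; Illustrative sketch only; COMPOSITION-KEY is a hypothetical helper.
(defun composition-key (cp1 cp2 &optional (shift 21))
  "Pack codepoints CP1 and CP2 into a single integer key.
Assumes CP1 is the one shifted left by SHIFT bits."
  (logior (ash cp1 shift) cp2))

;; With a 16-bit shift, a second codepoint outside the BMP overlaps the
;; bits of the first, so two different pairs give the same key:
;;   (composition-key 1 0 16)       and
;;   (composition-key 0 #x10000 16) are both #x10000.
;; With a 21-bit shift the two fields cannot overlap, because every
;; codepoint is below (ash 1 21).
```

Any shift of at least 21 bits would avoid the overlap; 21 is the smallest that does.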
-
- Sep 13, 2010
-
-
rtoy authored
function to convert a string to a list of codepoints.
-
- Apr 20, 2010
-
-
rtoy authored
may get confused with source locations if the reader macros are installed.
-
- Apr 19, 2010
-
-
rtoy authored
-
- Mar 19, 2010
-
-
rtoy authored
boot-2010-02-1 as the bootstrap file. You should probably also use the new -P option for build.sh to generate and update the po files while building.
-
- Oct 18, 2009
-
-
rtoy authored
faster reads on external-formats. This is done by adding an additional buffer to streams so we can convert the entire in-buffer into characters all at once.
To build this change, you need to do a cross-compile using boot-2009-10-1-cross.lisp. Using that build, do a normal build with these sources. For a non-unicode build use boot-2009-10-01.lisp with a 20a non-unicode build.
code/extfmts.lisp:
  o Add another slot to the extfmts for copying the state.
  o Modify EF-OCTETS-TO-STRING and OCTETS-TO-STRING to support the necessary changes for fast formats. This is incompatible with the previous version because the string is not grown if needed.
code/fd-stream-extfmt.lisp:
  o Set *enable-stream-buffer-p* to T so we have fast streams.
code/fd-stream.lisp:
  o Add new slots to support fast streams.
  o In SET-ROUTINES, initialize the new slots appropriately.
  o Update UNREAD-CHAR to be able to back up in the string buffer to unread.
  o Add implementation to copy the state of an external format.
code/stream.lisp:
  o Change %SET-FD-STREAM-EXTERNAL-FORMAT to be able to change formats even if we've already converted the buffer with a different format. We reconvert the buffer with the old format until we reach the current character. Then the remaining octets are converted using the new format and stored in the string buffer.
  o Add FAST-READ-CHAR-STRING-REFILL to refill the string buffer, like FAST-READ-CHAR-REFILL does for the octet in-buffer.
code/struct.lisp:
  o Add new slots to hold the string buffer, the current index, and length. These are needed for the fast formats.
code/sysmacs.lisp:
  o Update PREPARE-FOR-FAST-READ-CHAR, DONE-WITH-FAST-READ-CHAR, and FAST-READ-CHAR to support the string buffer.
code/string.lisp:
  o Microoptimization of SURROGATEP to reduce the number of branches.
general-info/release-20b.txt:
  o Update with these changes
pcl/simple-streams/external-formats/utf-16-be.lisp:
pcl/simple-streams/external-formats/utf-16-le.lisp:
pcl/simple-streams/external-formats/utf-16.lisp:
  o These formats actually have state, so update them to take and handle an initial state. These are needed if the string buffer ends with a leading surrogate and the next string buffer starts with a trailing surrogate. The conversion needs to combine the surrogates together.
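The need for carried state in the UTF-16 formats can be sketched as follows. This is an illustration in portable Common Lisp, not the code from the external-format files: it decodes vectors of 16-bit code units into lists of codepoints, and the names `combine-surrogates` and `decode-utf16-chunk`, as well as the choice to emit #xFFFD for unpaired surrogates, are assumptions made for the example.

```lisp
;; Illustrative sketch only; hypothetical names and error policy.
(defun combine-surrogates (lead trail)
  "Combine a leading and a trailing surrogate into one codepoint."
  (+ #x10000 (ash (- lead #xD800) 10) (- trail #xDC00)))

(defun decode-utf16-chunk (units pending)
  "Decode UNITS, a vector of 16-bit code units, into a list of codepoints.
PENDING is NIL or a leading surrogate left over from the previous chunk.
Returns two values: the codepoint list and the new pending surrogate."
  (let ((codepoints '()))
    (loop for u across units
          do (cond ((and pending (<= #xDC00 u #xDFFF))
                    ;; The pair may straddle two chunks: finish it here.
                    (push (combine-surrogates pending u) codepoints)
                    (setf pending nil))
                   (t
                    (when pending
                      ;; Leading surrogate with no trailing partner.
                      (push #xFFFD codepoints)
                      (setf pending nil))
                    (cond ((<= #xD800 u #xDBFF)
                           (setf pending u)) ; wait for the next unit
                          ((<= #xDC00 u #xDFFF)
                           (push #xFFFD codepoints)) ; stray trailing surrogate
                          (t
                           (push u codepoints))))))
    (values (nreverse codepoints) pending)))
```

Decoding `(vector #x41 #xD83D)` with no pending state yields the codepoint `#x41` and a pending value of `#xD83D`. Passing that pending value along with the next chunk `(vector #xDE00 #x42)` yields `#x1F600` followed by `#x42`, which is the combination across buffers that the commit message describes.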
-
- Sep 15, 2009
-
-
rtoy authored
-
rtoy authored
STRING-CAPITALIZE. Not sure about the appropriate interface, though.
code/string.lisp:
  o Add Unicode word break algorithm. Based on Scheme code by William Clinger. Used with permission.
  o Update STRING-CAPITALIZE to take another keyword arg to indicate if we should use the Unicode word break algorithm. Default is not to use the Unicode algorithm.
compiler/fndb.lisp:
  o Update defknown for string-capitalize.
i18n/tests/WordBreakTest.txt:
  o New test file for the word break algorithm
i18n/tests/word-break-test.lisp:
  o New file to run the word break test.
-
- Aug 17, 2009
-
-
rtoy authored
SURROGATES-TO-CODEPOINT.
-
- Aug 10, 2009
-
-
rtoy authored
assigned. It should return NIL if the codepoint is NOT assigned.
-
- Jul 13, 2009
-
-
rtoy authored
-
- Jun 16, 2009
-
-
rtoy authored
code/stream.lisp:
  o Only define (setf stream-external-format) for Unicode builds.
  o In stream-external-format, don't try to look up the external format from the fd-stream structure, which doesn't exist in non-unicode builds.
code/strings.lisp:
  o Conditionalize out things that will only work if unicode is available.
tools/worldcom.lisp:
  o Only compile fd-stream-extfmt for unicode builds.
-
rtoy authored
  o Only define STRING-TO-NFD, STRING-TO-NFKD, and STRING-TO-NFKC for Unicode builds. Conditionalize out their support functions too.
  o Update export list to be conditional on Unicode too.
  o Use new name for get-pairwise-composition.
code/exports.lisp:
  o Update export list to be conditional on Unicode for above changes in string.lisp.
code/unidata.lisp:
  o Change name from GET-PAIRWISE-COMPOSITION to UNICODE-PAIRWISE-COMPOSITION to match other Unicode function names.
-
- Jun 11, 2009
-
-
rtoy authored
unicode-utf16-extfmt-2009-06-11.
-
- Apr 11, 2003
-
-
emarsden authored
that it's a valid subtype of character (then ignore it).
-
- Jun 17, 2001
-
-
pw authored
Fix some error types to be ANSI compliant.
-
- Mar 04, 2001
-
-
pw authored
compile-lisp.log. Only 46/130 notes left.
-
- Feb 13, 1998
-
-
dtc authored
o Add an optional environment argument to constantp; ignored by CMUCL.
o Add the :element-type keyword to make-string.
-
- Jul 12, 1996
-
-
ram authored
when :start2 :end2 values were specified.
-
- Oct 31, 1994
-
-
ram authored
-
- Feb 11, 1994
-
-
cvs2git authored
-
- Jan 13, 1993
-
-
cvs2git authored
-
- May 15, 1992
-
-
wlott authored
-
- May 28, 1991
-
-
ram authored
-
- Apr 24, 1991
-
-
ram authored
is more clever. Also, changed it to accept any STRINGable thing, instead of just strings and symbols. These macros now bind the offset var instead of randomly setting it.
-
- Feb 08, 1991
-
-
ram authored
-
- Jul 29, 1990
-
-
wlott authored
foo isn't a simple-string.
-
- May 30, 1990
-
-
cvs2git authored
-
- Apr 11, 1990
-
-
wlott authored
-