  1. May 25, 2013
    • Raymond Toy authored · 49f041ad
      Remove all the extensions to string-upcase and friends. The extended
      functions now live in the new UNICODE package.
      
       src/code/exports.lisp::
       * Export some unicode functions and constants.
      
       src/code/string.lisp::
       * Removed the extended versions of string-upcase and friends.
       * Export surrogates function.
       * Make sure with-one-string is defined so the unicode package can use
         it.
      
       src/code/unicode.lisp::
       * New file with extended versions of string-upcase and friends.
      
       src/code/unidata.lisp::
       * Export some unicode functions and constants.
      
       src/compiler/fndb.lisp::
       * Update defknowns for string-upcase and friends.
      
       src/tools/worldbuild.lisp::
       * Build unicode.lisp
      
       src/tools/worldcom.lisp::
       * Load unicode.lisp
  2. May 21, 2013
  3. May 19, 2013
    • Raymond Toy authored · 78cce51d
      Fix ticket:81 and fix ticket:83.
      From ticket 81, the tests are now:
      
      {{{
      (time (prog1 t (time-rev *s*)))
      ; Evaluation took:
      ;   0.49 seconds of real time
      ;   0.481813 seconds of user run time
      ;   0.003624 seconds of system run time
      ;   1,490,776,936 CPU cycles
      ;   [Run times include 0.13 seconds GC run time]
      ;   0 page faults and
      ;   200,073,704 bytes consed.
      
      (time (prog1 t (time-rev *s2*)))
      ; Evaluation took:
      ;   0.97 seconds of real time
      ;   0.965893 seconds of user run time
      ;   0.005139 seconds of system run time
      ;   2,980,415,911 CPU cycles
      ;   [Run times include 0.23 seconds GC run time]
      ;   0 page faults and
      ;   400,005,560 bytes consed.
      }}}
      
      So the new string-reverse* is 20 times faster for strings without
      surrogates and 10 times faster for strings containing only surrogates.
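
      A minimal sketch of the problem string-reverse* has to solve, not the
      CMUCL implementation: when a UTF-16 string is reversed, each
      leading/trailing surrogate pair must keep its internal order.  The
      names below are hypothetical, and the string is modelled as a vector
      of 16-bit code units so the sketch runs on any Common Lisp.

      ```lisp
      ;; Hypothetical helpers, not taken from src/code/string.lisp.
      (defun high-surrogate-p (unit)
        "True if UNIT is a leading (high) surrogate code unit."
        (<= #xD800 unit #xDBFF))

      (defun low-surrogate-p (unit)
        "True if UNIT is a trailing (low) surrogate code unit."
        (<= #xDC00 unit #xDFFF))

      (defun reverse-utf16 (units)
        "Return a fresh vector holding the code points of UNITS in reverse order."
        (let* ((n (length units))
               (result (make-array n :element-type '(unsigned-byte 16)))
               (src 0))
          (loop while (< src n)
                do (let ((unit (aref units src)))
                     (cond ((and (high-surrogate-p unit)
                                 (< (1+ src) n)
                                 (low-surrogate-p (aref units (1+ src))))
                            ;; Well-formed pair: copy both units in their
                            ;; original order to the mirrored position.
                            (setf (aref result (- n src 2)) unit
                                  (aref result (- n src 1)) (aref units (1+ src)))
                            (incf src 2))
                           (t
                            ;; BMP unit or unpaired surrogate: copy as is.
                            (setf (aref result (- n src 1)) unit)
                            (incf src)))))
          result))
      ```

      Reversing the units for "A", U+1F600, "B" this way yields the units for
      "B", U+1F600, "A", with the pair D83D DE00 left intact.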
  4. Mar 06, 2013
  5. Feb 04, 2012
  6. Nov 04, 2011
  7. Sep 25, 2011
  8. Oct 26, 2010
  9. Oct 13, 2010
    • rtoy authored · b22644d4
      Some changes to replace calls to gettext with _"" or _N"" for things
      compiled with and without Unicode.  This is needed so that the pot
      files have the same content for both unicode and non-unicode builds.
      (The _"" and _N"" are handled by the reader, so things that are
      conditionalized out still get processed, unlike using gettext.)
  10. Sep 20, 2010
  11. Sep 15, 2010
    • rtoy authored · d2b9eace
      Add support for Unicode 5.2. The normalization and wordbreak tests pass.
      code/string.lisp:
      o In %compose, handle the case where the composite character is
        outside the BMP and thus needs special handling for our UTF-16
        strings.
      
      code/unidata.lisp:
      o CJK Ideograph range has changed in 5.2.
      o Fix bug in build-composition-table.  We were not correctly handling
        the case where the decomposition of a codepoint was outside the
        BMP.  Special care is needed to handle the UTF-16 strings that we
        use.
      o The keys for the pairwise composition table are the full codepoints,
        so we need to shift one by 21 bits instead of 16.
      
      tools/build-unidata.lisp
      o Update minor version to 2.
      
      i18n/BidiMirroring.txt
      i18n/CaseFolding.txt
      i18n/CompositionExclusions.txt
      i18n/DerivedNormalizationProps.txt
      i18n/NameAliases.txt
      i18n/NormalizationCorrections.txt
      i18n/SpecialCasing.txt
      i18n/UnicodeData.txt
      i18n/WordBreakProperty.txt
      i18n/tests/NormalizationTest.txt
      i18n/tests/WordBreakTest.txt
      o Updated from Unicode 5.2.
      
      i18n/unidata.bin
      o Regenerated from new Unicode 5.2 files.
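
      The two UTF-16 points above can be sketched as follows.  This is an
      illustration under assumptions, not the code in %compose or
      build-composition-table: the function names are hypothetical, and
      which of the two codepoints is shifted is assumed.  A codepoint
      outside the BMP needs two code units, and a codepoint can be as large
      as #x10FFFF (21 bits), so a 16-bit shift lets two different pairs
      share one key.

      ```lisp
      ;; Hypothetical names; not the CMUCL implementation.
      (defun encode-utf16 (code)
        "Return the UTF-16 code units for codepoint CODE as a list."
        (if (< code #x10000)
            (list code)
            (let ((offset (- code #x10000)))
              ;; Top 10 bits go in the leading surrogate, low 10 bits in
              ;; the trailing surrogate.
              (list (+ #xD800 (ldb (byte 10 10) offset))
                    (+ #xDC00 (ldb (byte 10 0) offset))))))

      (defun composition-key (first second &optional (shift 21))
        "Pack two full codepoints into one integer key.
      Shifting FIRST rather than SECOND is an assumption of this sketch."
        (logior (ash first shift) second))

      ;; With a 16-bit shift these two pairs collide; with 21 bits they do not.
      (defun keys-collide-p (shift)
        (= (composition-key 1 #x10000 shift)
           (composition-key 1 0 shift)))
      ```

      Here (keys-collide-p 16) is true and (keys-collide-p 21) is false,
      which is the reason a 21-bit shift is needed once keys are full
      codepoints.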
  12. Sep 13, 2010
  13. Apr 20, 2010
  14. Apr 19, 2010
  15. Mar 19, 2010
  16. Oct 18, 2009
    • rtoy authored · 392d3e59
      Merge changes from unicode-string-buffer-impl-branch which gives
      faster reads on external-formats.  This is done by adding an
      additional buffer to streams so we can convert the entire in-buffer
      into characters all at once.
      
      To build this change, you need to do a cross-compile using
      boot-2009-10-1-cross.lisp.  Using that build, do a normal build with
      these sources.
      
      For a non-unicode build use boot-2009-10-01.lisp with a 20a
      non-unicode build.
      
      code/extfmts.lisp:
      o Add another slot to the extfmts for copying the state.
      o Modify EF-OCTETS-TO-STRING and OCTETS-TO-STRING to support the
        necessary changes for fast formats.  This is incompatible with the
        previous version because the string is not grown if needed.
      
      code/fd-stream-extfmt.lisp:
      o Set *enable-stream-buffer-p* to T so we have fast streams.
      
      code/fd-stream.lisp:
      o Add new slots to support fast streams.
      o In SET-ROUTINES, initialize the new slots appropriately.
      o Update UNREAD-CHAR to be able to back up in the string buffer to
        unread.
      o Add implementation to copy the state of an external format.
      
      code/stream.lisp:
      o Change %SET-FD-STREAM-EXTERNAL-FORMAT to be able to change formats
        even if we've already converted the buffer with a different format.
        We reconvert the buffer with the old format until we reach the
        current character.  Then the remaining octets are converted using
        the new format and stored in the string buffer.
      o Add FAST-READ-CHAR-STRING-REFILL to refill the string buffer, like
        FAST-READ-CHAR-REFILL does for the octet in-buffer.
      
      code/struct.lisp:
      o Add new slots to hold the string buffer, the current index, and
        length.  These are needed for the fast formats.
      
      code/sysmacs.lisp:
      o Update PREPARE-FOR-FAST-READ-CHAR, DONE-WITH-FAST-READ-CHAR, and
        FAST-READ-CHAR to support the string buffer.
      
      code/string.lisp:
      o Microoptimization of SURROGATEP to reduce the number of branches.
      
      general-info/release-20b.txt:
      o Update with these changes
      
      pcl/simple-streams/external-formats/utf-16-be.lisp:
      pcl/simple-streams/external-formats/utf-16-le.lisp:
      pcl/simple-streams/external-formats/utf-16.lisp:
      o These formats actually have state, so update them to take and handle
        an initial state.  This is needed if the string buffer ends with a
        leading surrogate and the next string buffer starts with a trailing
        surrogate.  The conversion needs to combine the surrogates together.
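
      A minimal sketch of why the UTF-16 formats need state, assuming a
      hypothetical decoder that is called once per buffer.  It is not the
      code in utf-16.lisp: the function name and calling convention are
      invented for illustration, and malformed input is not reported.  A
      leading surrogate at the end of one buffer is returned as the state
      and combined with the trailing surrogate that starts the next buffer.

      ```lisp
      ;; Hypothetical decoder; STATE is a pending leading surrogate or NIL.
      (defun decode-utf16-chunk (units state)
        "Decode the code units in UNITS to a list of codepoints.
      Returns two values: the codepoints and the new state."
        (let ((pending state)
              (result '()))
          (loop for unit across units
                do (cond ((<= #xD800 unit #xDBFF)
                          ;; Leading surrogate: hold it until the next unit.
                          ;; An earlier unpaired one is silently dropped.
                          (setf pending unit))
                         ((and pending (<= #xDC00 unit #xDFFF))
                          ;; Trailing surrogate completes the pending pair.
                          (push (+ #x10000
                                   (ash (- pending #xD800) 10)
                                   (- unit #xDC00))
                                result)
                          (setf pending nil))
                         (t
                          ;; Any other unit is passed through unchanged and
                          ;; clears a stale pending surrogate.
                          (setf pending nil)
                          (push unit result))))
          (values (nreverse result) pending)))
      ```

      Decoding #(#x41 #xD83D) with no state returns the single codepoint
      #x41 and the state #xD83D; passing that state with the next buffer
      #(#xDE00 #x42) returns #x1F600 followed by #x42 and a state of NIL.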
  17. Sep 15, 2009
    • rtoy authored · 5a8aa73a
    • rtoy authored · fc0eb65b
      Add support for the Unicode word break algorithm for
      STRING-CAPITALIZE.  Not sure about the appropriate interface, though.
      
      code/string.lisp:
      o Add Unicode word break algorithm.  Based on Scheme code by William
        Clinger.  Used with permission.
      o Update STRING-CAPITALIZE to take another keyword arg to indicate if
        we should use the Unicode word break algorithm.  Default is not to
        use the Unicode algorithm.
      
      compiler/fndb.lisp:
      o Update defknown for string-capitalize.
      
      i18n/tests/WordBreakTest.txt:
      o New test file for the word break algorithm
      
      i18n/tests/word-break-test.lisp:
      o New file to run the word break test.
  18. Aug 17, 2009
  19. Aug 10, 2009
  20. Jul 13, 2009
  21. Jun 16, 2009
    • rtoy authored · 8f28c28f
      Cleanups for non-unicode build.
      code/stream.lisp:
      o Only define (setf stream-external-format) for Unicode builds.
      o In stream-external-format, don't try to look up the external format
        from the fd-stream structure, which doesn't exist in non-unicode
        builds.
      
      code/string.lisp:
      o Conditionalize out things that will only work if unicode is
        available.
      
      tools/worldcom.lisp:
      o Only compile fd-stream-extfmt for unicode builds.
    • rtoy authored · a826481f
      code/string.lisp:
      o Only define STRING-TO-NFD, STRING-TO-NFKD, and STRING-TO-NFKC for
        Unicode builds.  Conditionalize out their support functions too.
      o Update export list to be conditional on Unicode too.
      o Use new name for get-pairwise-composition.
      
      code/exports.lisp:
      o Update export list to be conditional on Unicode for above changes
        in string.lisp.
      
      code/unidata.lisp:
      o Change name from GET-PAIRWISE-COMPOSITION to
        UNICODE-PAIRWISE-COMPOSITION to match other Unicode function names.
  22. Jun 11, 2009
  23. Apr 11, 2003
  24. Jun 17, 2001
    • pw authored · c840823b
      From Eric Marsden:
      Fix some error types to be ANSI compliant.
  25. Mar 04, 2001
  26. Feb 13, 1998
    • dtc authored · 2e5e2342
      ANSI CL compat. changes:
      o Add an optional environment argument to constantp; ignored by CMUCL.
      o Add the :element-type keyword to make-string.
  27. Jul 12, 1996
  28. Oct 31, 1994
  29. Feb 11, 1994
  30. Jan 13, 1993
  31. May 15, 1992
  32. May 28, 1991
  33. Apr 24, 1991
  34. Feb 08, 1991
  35. Jul 29, 1990
  36. May 30, 1990
  37. Apr 11, 1990