  1. May 25, 2013
    • Remove all the extensions to string-upcase and friends. The extended
      functions now live in the new UNICODE package. · 49f041ad
      Raymond Toy authored
      
       src/code/exports.lisp::
       * Export some unicode functions and constants.
      
       src/code/string.lisp::
       * Removed the extended versions of string-upcase and friends.
       * Export surrogates function.
       * Make sure with-one-string is defined so the unicode package can use
         it.
      
       src/code/unicode.lisp::
       * New file with extended versions of string-upcase and friends.
      
       src/code/unidata.lisp::
       * Export some unicode functions and constants.
      
       src/compiler/fndb.lisp::
       * Update defknowns for string-upcase and friends.
      
       src/tools/worldbuild.lisp::
       * Build unicode.lisp
      
       src/tools/worldcom.lisp::
       * Load unicode.lisp
  2. Mar 05, 2013
  3. Jan 17, 2013
    • Fix ticket:69 · ce037e96
      Raymond Toy authored
      Change *unidata-path* to be a pathname object instead of a namestring.
  4. Nov 18, 2012
  5. Mar 03, 2012
  6. Feb 05, 2012
    • Update to Unicode 6.1.0. · 537cc9bb
      Raymond Toy authored
        src/code/unidata.lisp:: Update Unicode version.
      
        src/tools/build-unidata.lisp:: Update Unicode version and update for
        change of format of NameAliases.txt.
      
        src/i18n/unidata.bin:: Updated with new data.
      
        src/general-info/release-20d.txt:: Updated.
      
        src/i18n/BidiMirroring.txt:: Updated to Unicode 6.1.0.
        src/i18n/CaseFolding.txt:: Updated to Unicode 6.1.0.
        src/i18n/CompositionExclusions.txt:: Updated to Unicode 6.1.0.
        src/i18n/DerivedNormalizationProps.txt:: Updated to Unicode 6.1.0.
        src/i18n/NameAliases.txt:: Updated to Unicode 6.1.0.
        src/i18n/NormalizationCorrections.txt::
        src/i18n/SpecialCasing.txt:: Updated to Unicode 6.1.0.
        src/i18n/UnicodeData.txt:: Updated to Unicode 6.1.0.
        src/i18n/WordBreakProperty.txt:: Updated to Unicode 6.1.0.
        src/i18n/tests/NormalizationTest.txt:: Updated to Unicode 6.1.0.
        src/i18n/tests/WordBreakTest.txt:: Updated to Unicode 6.1.0.
  7. Feb 01, 2012
  8. Nov 04, 2011
  9. Sep 25, 2011
  10. Jun 27, 2011
    • Update to Unicode 6.0.0. · 7aa8a23e
      rtoy authored
      
      code/unidata.lisp:
      o Update unicode version to 6.0.0
      o Add pointer to build-unidata.lisp.
      tools/build-unidata.lisp:
      o Update unicode version to 6.0.0
      o Print out directory path so we can see where we're getting the data
        from.
      
      
      i18n/CaseFolding.txt
      i18n/CompositionExclusions.txt
      i18n/DerivedNormalizationProps.txt
      i18n/NameAliases.txt
      i18n/NormalizationCorrections.txt
      i18n/SpecialCasing.txt
      i18n/UnicodeData.txt
      i18n/WordBreakProperty.txt
      i18n/tests/NormalizationTest.txt
      i18n/tests/WordBreakTest.txt:
      o Update with new files from unicode.org.
  11. Jun 10, 2011
    • Add function to load all unicode data into memory. · 55d7f671
      rtoy authored
      This makes it easy to make an executable image that doesn't need
      unidata.bin around.  (Should we do this for normal cores?  It seems to
      add about 1 MB to the core size.)
      
      code/unidata.lisp:
      o Add LOAD-ALL-UNICODE-DATA to load all unicode data.
      o Add UNICODE-DATA-LOADED-P to check that unicode data has been
        loaded.
      
      code/print.lisp:
      o If unicode data is loaded, don't check for existence of
        *unidata-path*, because we don't need it.
      
      code/exports.lisp:
      o Export LOAD-ALL-UNICODE-DATA.
      
      general-info/release-20c.txt:
      o Update info
  12. May 31, 2011
    • Add -unidata option to specify unidata.bin file. · d9b73849
      rtoy authored
      This change requires a cross-compile.  Use boot-2011-04-01-cross.lisp
      as the cross-compile script.
      
      bootfiles/20b/boot-2011-04-01-cross.lisp:
      o New cross-compile bootstrap file
      
      lisp/lisp.c:
      o Recognize -unidata option and set up *UNIDATA-PATH* appropriately.
      
      code/commandline.lisp:
      o Add defswitch for unidata so we don't get complaints about unknown
        switch.
      
      code/unidata.lisp:
      o Rename +UNIDATA-PATH+ to *UNIDATA-PATH*, since it's not a constant
        anymore.
      o Update code to use new name.
      
      code/print.lisp:
      o Update code to use *UNIDATA-PATH*
      
      compiler/sparc/parms.lisp:
      o Add *UNIDATA-PATH* to list of static symbols.
      o Add back in spare-9 and spare-8 static symbols since we need to do a
        cross-compile for this change anyway.
      
      compiler/x86/parms.lisp:
      o Add *UNIDATA-PATH* to list of static symbols.
      o Reorder the static symbols in a more logical arrangement so that the
        spare symbols are at the end.
      
      i18n/local/cmucl.pot:
      o Update
  13. Apr 02, 2011
  14. Feb 23, 2011
    • Fix bug where cmucl was no longer recognizing things like
      #\latin_small_letter_a. · 23fafac4
      rtoy authored
      This failure is caused by the new SEARCH-DICTIONARY function that
      does partial completion; the UNICODE-NAME-TO-CODEPOINT function
      wasn't aware of the new way.
      
      We could change UNICODE-NAME-TO-CODEPOINT to do the appropriate thing
      with the new way, but I (rtoy) decided it would be nice to have the
      old function around too.  Hence, restore the old version and use it.
  15. Sep 29, 2010
  16. Sep 21, 2010
  17. Sep 20, 2010
  18. Sep 19, 2010
    • o Move %STR, %STRX and %MATCH around so that we can inline them
        (because they're so simple). · 119f21c7
      rtoy authored
      o Add some comments for %STR.
      o Change implementation of %MATCH to be simpler and add comments on
        why we do what we do and explain what happens if we don't.
      o Handle completion of Hangul syllables better:
        - Match "Hangul_S" instead of "Hangul_Syllable" because there's
          #\Hangul_Single_Dot_Tone_Mark.
        - If we match "Hangul_S", try to complete some Hangul syllables so
          we don't fool slime into thinking "Hangul_Syllable_" is the only
          completion.  There are obviously more.
      o Handle completion of CJK Unified Ideographs better by trying to
        complete more so slime isn't fooled into thinking
        "CJK_Unified_Ideograph-" is the only possible completion.
    • o Construction of the Hangul syllable codebook was wrong. To satisfy
        the constraints on the codebook, we just sort them in decreasing
        order of length. · dc4cdb68
      rtoy authored
      o In %MIP, it might happen that MISMATCH returns NIL, which means a
        match.  In this case, don't change the position.
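For readers unfamiliar with it: Common Lisp's MISMATCH returns the index of the first differing element, or NIL when the two sequences match completely. The sketch below illustrates why the NIL case needs its own branch. It is written in Python, not Lisp, and it is not the %MIP code; `mismatch` and `advance` are hypothetical stand-ins, None plays the role of NIL, and the non-matching branch (adding the mismatch index to the position) is an assumption made only for illustration.

```python
from typing import Optional, Sequence


def mismatch(a: Sequence, b: Sequence) -> Optional[int]:
    """Mimic Common Lisp MISMATCH for two whole sequences.

    Returns the index of the first difference, or None (NIL) when
    the sequences are identical.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One sequence is a proper prefix of the other.
        return min(len(a), len(b))
    return None


def advance(position: int, a: Sequence, b: Sequence) -> int:
    """Hypothetical caller: leave the position alone on a full match."""
    m = mismatch(a, b)
    if m is None:
        # A full match: there is no mismatch index to add.
        return position
    # Assumed behaviour for the non-matching case (illustration only).
    return position + m
```

Without the `None` check, the addition in the last line would fail (or, in Lisp, signal a type error) exactly when the strings match.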
  19. Sep 18, 2010
    • Some Hangul syllables were left out of the Hangul syllable dictionary. · f2065a91
      rtoy authored
      Redo this by looping over all codepoints and selecting the codepoints
      that are Hangul syllables.
    • code/unidata.lisp: · 820f2554
      rtoy authored
      o Update constants to Unicode version 5.2.0.
      
      i18n/unidata.bin:
      o Regenerated using Unicode version 5.2.0.
    • code/unidata.lisp: · 3d1d8295
      rtoy authored
      o Just add some comments on why we don't put the dictionaries in
        unidata.bin.
      o Print out some messages when building the hangul and cjk
        dictionaries so the user knows what's happening.
      
      tools/build-unidata.lisp:
      o Add some comments on the various parts of unidata.bin.
  20. Sep 17, 2010
    • exports.lisp: · 9563cc0b
      rtoy authored
      o Export STRING-TO-NFC, UNICODE-COMPLETE, and UNICODE-COMPLETE-NAME.
      
      unidata.lisp:
      o Add explicit exports.
    • Optimize the completion of the Hangul syllables and the CJK unified
      ideographs by using dictionaries. · d4b307df
      rtoy authored
      (Should these dictionaries be part of unidata.bin so they don't have
      to be built at run time?  On the one hand, it makes things simpler,
      but it unnecessarily bloats unidata.bin.  I suspect the Hangul
      syllable and CJK ideograph characters are not used very often.)
      
      o Change NODE-NEXT and CLOSE-NODE to have an optional parameter for
        the dictionary to use.
      o Update UNICODE-COMPLETE-NAME to pass the dictionary to NODE-NEXT and
        CLOSE-NODE.
      o Update UNICODE-COMPLETE to use the hangul syllable dictionary and
        the cjk ideograph dictionary when searching.
      o Fix typo in UNICODE-COMPLETE.
      o Add defvars for dictionaries for hangul syllables and cjk
        ideographs.
      o Add functions to build the hangul and cjk dictionaries.
      o Steal the implementations of BUILD-DICTIONARY, NAME-LOOKUP, and
        ENCODE-NAME from tools/build-unidata.lisp.
    • Add support for character completion. · d4b888a2
      rtoy authored
      This is primarily intended to support character completion for slime.
      The implementation is from Paul Foley, with some slight modifications
      by Raymond Toy to handle a few corner cases.
      
      o Modify SEARCH-DICTIONARY to take optional current and posn
        parameters so that SEARCH-DICTIONARY can be started from a different
        place.
      o Add UNICODE-COMPLETE, which is the main function for character name
        completion.
      o Add other support functions for UNICODE-COMPLETE.
    • o Fix typo in UNICODE-DECOMP. (It's hangul-syllable-p, not
        hangule-syllable-p.) · 34af3581
      rtoy authored
      o Move the computation of *reverse-hangul-choseong*,
        *reverse-hangul-jungseong*, and *reverse-hangul-jongseong* to its
        own routine.  Call it in UNICODE-NAME-TO-CODEPOINT.
  21. Sep 15, 2010
    • Pull out the range tests for CJK Ideographs and Hangul Syllables and
      put the tests into their own functions so that the limits are in one
      place. · 6692aa7e
      rtoy authored
    • Add support for Unicode 5.2. The normalization and wordbreak tests pass. · d2b9eace
      rtoy authored
      code/string.lisp:
      o In %compose, handle the case where the composite character is
        outside the BMP and thus needs special handling for our UTF-16
        strings.
      
      code/unidata.lisp
      o CJK Ideograph range has changed in 5.2.
      o Fix bug in build-composition-table.  We were not correctly handling
        the case where the decomposition of a codepoint was outside the
        BMP.  Special care is needed to handle the UTF-16 strings that we
        use.
      o The keys for the pairwise composition table are the full codepoints,
        so we need to shift one by 21 bits instead of 16.
      
      tools/build-unidata.lisp
      o Update minor version to 2.
      
      i18n/BidiMirroring.txt
      i18n/CaseFolding.txt
      i18n/CompositionExclusions.txt
      i18n/DerivedNormalizationProps.txt
      i18n/NameAliases.txt
      i18n/NormalizationCorrections.txt
      i18n/SpecialCasing.txt
      i18n/UnicodeData.txt
      i18n/WordBreakProperty.txt
      i18n/tests/NormalizationTest.txt
      i18n/tests/WordBreakTest.txt
      o Updated from Unicode 5.2.
      
      i18n/unidata.bin
      o Regenerated from new Unicode 5.2 files.
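The two UTF-16 points in this change can be illustrated with a short sketch. It is written in Python rather than the Lisp of unidata.lisp, and the names `pair_key` and `to_surrogates` are hypothetical; this is not the CMUCL code. Unicode codepoints run up to U+10FFFF and so need 21 bits, which means a key packed with a 16-bit shift can collide once a codepoint lies outside the BMP. A codepoint outside the BMP also occupies two UTF-16 code units (a surrogate pair), which is why composing to such a character needs special handling.

```python
def pair_key(first: int, second: int, shift: int = 21) -> int:
    """Pack two full codepoints into one integer key.

    With shift=21 the second codepoint (at most 0x10FFFF) can never
    spill into the bits that hold the first one.
    """
    return (first << shift) | second


def to_surrogates(codepoint: int) -> tuple[int, int]:
    """Split a codepoint outside the BMP into a UTF-16 surrogate pair."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("codepoint is not outside the BMP")
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return high, low


# A 16-bit shift lets two different pairs share one key:
assert pair_key(0x1, 0x0, shift=16) == pair_key(0x0, 0x10000, shift=16)
# A 21-bit shift keeps them distinct:
assert pair_key(0x1, 0x0) != pair_key(0x0, 0x10000)
```

For example, `to_surrogates(0x1D11E)` gives `(0xD834, 0xDD1E)`, the familiar UTF-16 encoding of U+1D11E.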
  22. Apr 20, 2010
  23. Mar 19, 2010
  24. Sep 11, 2009
    • tools/build-unidata.lisp: · bf4b37ac
      rtoy authored
      o Add support for word break properties.
      o Some cleanup of the code including moving the common code in
        write-ntrie* to write-ntrie.
      
      code/unidata.lisp:
      o Add support for word break properties.
      o UNICODE-WORD-BREAK-CODE and UNICODE-WORD-BREAK return the property
        code and the property keyword for a codepoint, respectively.
      
      i18n/WordBreakProperty.txt:
      o New file for the word break properties.
  25. Jul 10, 2009
    • unidata.lisp: · 176f40f7
      rtoy authored
      o Add *unidata-version* to hold our revision number.
      
      save.lisp:
      o Add Unicode to the herald items.  Just print out the unidata version
        along with the supported Unicode UCD version.
  26. Jul 02, 2009
    • boot-2009-07.lisp: · 67fc4ac5
      rtoy authored
      o Bootstrap file needed to compile this change (because the current
        shrink-vector derive-type optimizer didn't handle union types).
      
      compiler/fndb.lisp:
      o Make the compiler warn if the result of lisp::shrink-vector is not
        used.  This is a problem because the compiler doesn't know that
        shrink-vector destructively modifies the length of a vector.  As a
        partial solution, warn the user if the result of shrink-vector is
        not used.
      
      code/hash-new.lisp:
      code/seq.lisp:
      o Make sure the result of shrink-vector is used, to get rid of a new
        compiler warning.
      
      code/unidata.lisp:
      o Modify %unicode-full-case so that it doesn't use shrink-vector
        anymore.
      
      compiler/seqtran.lisp:
      o Fix shrink-vector derive-type optimizer to handle union types.
      
      tools/build-unidata.lisp:
      o Fix typo that somehow got in.
      o Make sure the result of shrink-vector is used, to get rid of a new
        compiler warning.
  27. Jun 16, 2009
    • code/string.lisp: · a826481f
      rtoy authored
      o Only define STRING-TO-NFD, STRING-TO-NFKD, and STRING-TO-NFKC for
        Unicode builds.  Conditionalize out their support functions too.
      o Update export list to be conditional on Unicode too.
      o Use new name for get-pairwise-composition.
      
      code/exports.lisp:
      o Update export list to be conditional on Unicode for above changes
        in string.lisp.
      
      code/unidata.lisp:
      o Change name from GET-PAIRWISE-COMPOSITION to
        UNICODE-PAIRWISE-COMPOSITION to match other Unicode function names.
  28. Jun 11, 2009