Tuesday, July 3, 2018

UNUM 3.0: Updated to Unicode 11

Version 3.0 of UNUM is now available for downloading. Version 3.0 incorporates the Unicode 11.0.0 standard, released on June 5th, 2018. The update to Unicode adds support for seven new scripts, additional CJK (Chinese, Japanese, and Korean) symbols, 66 new emoji, and assorted symbols such as half-stars for rating systems. There are a total of 137,374 characters in 11.0.0, of which 684 are new since 10.0.0. (UNUM also supports an additional 65 control characters, the ASCII and ISO 8859-1 control codes, which are not assigned graphic code points in the Unicode database.)

This is an incremental update to Unicode. There are no structural changes in how characters are defined in the databases, and other than the presence of the new characters, the operation of UNUM is unchanged.

UNUM also contains a database of HTML named character references (the sequences like “&lt;” you use in HTML source code when you need to represent a character which has a syntactic meaning in HTML or which can't be directly included in a file with the character encoding you're using to write it). There have been no changes to this standard since UNUM 2.2 was released in September 2017, so UNUM 3.0 will behave identically when querying these references except, of course, that numerical references to the new Unicode characters will be interpreted correctly. (Is your browser totally with it? See what it does with “&#x1F9B9;” in an HTML document! And here we go…“🦹”.)
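For readers who want to see how named and numeric character references resolve, here is a minimal sketch using Python's standard `html` module. It is not UNUM and does not use UNUM's database; the variable names are illustrative only. It assumes the emoji shown above is U+1F9B9, one of the characters added in Unicode 11.

```python
import html

# A named character reference: "&lt;" stands for the less-than sign.
named = html.unescape("&lt;")

# Numeric references to the same code point, written in hexadecimal and decimal.
# Assumption: U+1F9B9 is the Unicode 11 emoji used as the example above.
numeric_hex = html.unescape("&#x1F9B9;")
numeric_dec = html.unescape("&#129465;")

print(named)                       # the literal character "<"
print(f"U+{ord(numeric_hex):04X}")  # code point of the decoded character
print(numeric_hex == numeric_dec)  # both spellings decode to the same character
```

Decoding a numeric reference only requires converting the number to a code point, so it works regardless of whether the local fonts or character database know the new character; displaying the glyph or looking up its name is what needs Unicode 11 support.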

