Unix more / better UTF-8
http://www.dragonflybsd.org/release44/
https://news.ycombinator.com/item?id=11248847
https://wiki.gentoo.org/wiki/UTF-8/it

philes... not just high ASCII anymore...
For those who have -not- yet read the canonical UTF-8 advocacy blog post from many years ago, and for whom doing so is relevant, it really is a must-read: http://utf8everywhere.org/

On 3/9/16, grarpamp <grarpamp@gmail.com> wrote:
When I read something like:

"We introduced "short codes", so now codes like "de_DE", "fr_FR", "en_US", "el_GR", etc. These short-codes are generally mapped to 8-bit character sets such as ISO-8859-x, but sometimes they are mapped to UTF-8 if the traditional single-byte encoding doesn't adequately cover the locale anymore (e.g. the currency is not supported)."

I think: people still haven't cottoned on. UTF-8 should be the default, and should vary only if really necessary.

Now I must qualify this statement, since I don't know BSD, nor much about locales. Debian is my friend.
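To make the "currency is not supported" case concrete, here is a minimal sketch using only standard JDK calls. It assumes nothing about how DragonFly maps its locales; the class name EuroEncodingDemo and the helper hex() are made up for illustration. ISO-8859-1 has no euro sign, so encoding U+20AC with it falls back to the replacement byte '?', while UTF-8 encodes it in three bytes.

```java
import java.nio.charset.StandardCharsets;

public class EuroEncodingDemo {
    // Hypothetical helper: render bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // EURO SIGN

        // String.getBytes(Charset) substitutes the charset's replacement
        // byte for characters it cannot map; for ISO-8859-1 that is '?'.
        byte[] latin1 = euro.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = euro.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1: " + hex(latin1)); // 3F
        System.out.println("UTF-8:      " + hex(utf8));   // E2 82 AC
    }
}
```

The single-byte result is silently lossy: the '?' cannot be decoded back to the euro sign, which is the sort of gap a UTF-8 default avoids.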
"Xterm(1) now UTF-8 by default on OpenBSD" - great news! Better late than never...
https://wiki.gentoo.org/wiki/UTF-8/it
philes... not just high ASCII anymore...
Wonders never cease. Now, if only Java could properly handle Unicode characters and had a string class which could properly work with UTF-8: https://zenaan.github.io/zen/javadoc/zen/lang/string.html

Note1: This was motivated by my extreme frustration with Java's Unicode limitations (to the point of not even being able to implement a proper string formatter), by the utf8everywhere.org website, and by having quite some days in a row to figure out why the problem existed in the first place and exactly what -is- Java's problem in this particular regard.

Note2: The documentation at the top of this link is the relevant part; the class is just a note pad...

Note3: I have a pretty solid CodePointCursor.java class (yet to be uploaded), well tested by a uint and tagged string CodePointParser, if anyone actually wants to finish a proper Java string class such as above...
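For anyone who hasn't run into it: one commonly cited Java limitation, which may or may not be the exact problem meant above, is that java.lang.String is indexed by UTF-16 code units rather than by code points. The sketch below uses only standard JDK methods; it is not the zen.lang.string class or the CodePointCursor mentioned above, and the name CodePointDemo and its helpers are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class CodePointDemo {
    // Count Unicode code points rather than UTF-16 code units.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // List each code point as U+XXXX, walking the string by code point.
    static String describe(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which needs a surrogate pair in UTF-16.
        String s = "a\uD83D\uDE00";

        System.out.println(s.length());                                // 3 code units
        System.out.println(codePointLength(s));                        // 2 code points
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 5 bytes
        System.out.println(describe(s));                               // U+0061 U+1F600
    }
}
```

Three different "lengths" for a two-character string is the kind of thing that makes a naive formatter built on length() and charAt() go wrong once text leaves the Basic Multilingual Plane.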
participants (2)
- grarpamp
- Zenaan Harkness