wcwidth-like functions
Submitted by Behdad Esfahbod
Link to original bug (#563503)
Description
We provide all the ingredients in glib to write a wcwidth() function. Namely:
return g_unichar_iszerowidth(c) ? 0 : g_unichar_iswide(c) ? 2 : 1;
However, writing this and writing a string loop around it is uglier than I'd like. I need that in pangofc and pangocairo, which means I had to repeat the following code in two places:
static inline G_GNUC_UNUSED int
pango_unichar_width (gunichar c)
{
  return G_UNLIKELY (g_unichar_iszerowidth (c)) ? 0 :
         G_UNLIKELY (g_unichar_iswide (c)) ? 2 : 1;
}

static G_GNUC_UNUSED glong
pango_utf8_strwidth (const gchar *p)
{
  glong len = 0;

  g_return_val_if_fail (p != NULL, 0);

  while (*p)
    {
      len += pango_unichar_width (g_utf8_get_char (p));
      p = g_utf8_next_char (p);
    }

  return len;
}
Which is short enough and OK. But try writing a non-nul-terminated version and things quickly get hard to get right.
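To illustrate where the length-bounded case gets fiddly, here is a rough sketch in plain C. It is not glib or Pango code: every sketch_* name is hypothetical, the two predicates are tiny stand-ins for g_unichar_iszerowidth() and g_unichar_iswide() that cover only a handful of ranges, and the decoder is deliberately minimal (it does not reject overlong forms). The max argument is assumed to follow the g_utf8_strlen() convention, where a negative value means nul-terminated.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for g_unichar_iszerowidth(); sample ranges only. */
static int
sketch_iszerowidth (uint32_t c)
{
  return (c >= 0x0300 && c <= 0x036F)   /* combining diacritical marks */
      || c == 0x200B;                   /* zero width space */
}

/* Hypothetical stand-in for g_unichar_iswide(); sample ranges only. */
static int
sketch_iswide (uint32_t c)
{
  return (c >= 0x1100 && c <= 0x115F)   /* Hangul Jamo initial consonants */
      || (c >= 0x3040 && c <= 0x30FF)   /* Hiragana and Katakana */
      || (c >= 0x4E00 && c <= 0x9FFF)   /* CJK unified ideographs */
      || (c >= 0xFF01 && c <= 0xFF60);  /* fullwidth forms */
}

/* Decode one UTF-8 sequence from at most `avail` bytes.
 * Returns the sequence length, or 0 if it is malformed or truncated.
 * Continuation bytes are checked in order, so a nul byte stops the scan. */
static size_t
sketch_utf8_decode (const unsigned char *p, size_t avail, uint32_t *out)
{
  size_t len, i;
  uint32_t c;

  if (avail == 0)
    return 0;

  if (p[0] < 0x80)                { len = 1; c = p[0]; }
  else if ((p[0] & 0xE0) == 0xC0) { len = 2; c = p[0] & 0x1F; }
  else if ((p[0] & 0xF0) == 0xE0) { len = 3; c = p[0] & 0x0F; }
  else if ((p[0] & 0xF8) == 0xF0) { len = 4; c = p[0] & 0x07; }
  else
    return 0;

  if (len > avail)
    return 0;                     /* sequence would cross the byte limit */

  for (i = 1; i < len; i++)
    {
      if ((p[i] & 0xC0) != 0x80)
        return 0;
      c = (c << 6) | (p[i] & 0x3F);
    }

  *out = c;
  return len;
}

/* Column width of at most `max` bytes of `p`; max < 0 means nul-terminated.
 * Stops at a nul byte, at the byte limit, or at the first bad sequence. */
static long
sketch_utf8_strwidth (const char *p, long max)
{
  const unsigned char *s = (const unsigned char *) p;
  long width = 0;
  size_t used = 0;

  if (p == NULL)
    return 0;

  while (max < 0 || used < (size_t) max)
    {
      uint32_t c;
      size_t avail, n;

      if (s[used] == 0)
        break;

      avail = max < 0 ? 4 : (size_t) max - used;
      n = sketch_utf8_decode (s + used, avail, &c);
      if (n == 0)
        break;

      width += sketch_iszerowidth (c) ? 0 : sketch_iswide (c) ? 2 : 1;
      used += n;
    }

  return width;
}
```

The part that is easy to get wrong is the last character: when the byte limit falls in the middle of a multi-byte sequence, the loop has to stop without reading past the limit and without counting the partial character. The sketch handles that by telling the decoder how many bytes remain.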
The reason we didn't add a g_unichar_width() to glib was that, depending on the locale, the user may want to use g_unichar_iswide_cjk() instead. Ideally, my Pango code should automatically use iswide_cjk() for CJK locales, but I didn't have the list of such locales in Pango, so I didn't do that.
Note that vte also has all this code. But the requirements there are a bit more restricted, so I don't think we will be able to reuse code anyway.
So, here is one proposal:
- Add g_get_lc_ctype(), which will move code from pango down to glib. The code is:
static gchar *
_pango_get_lc_ctype (void)
{
#ifdef G_OS_WIN32
  /* Somebody might try to set the locale for this process using the
   * LANG or LC_ environment variables. The Microsoft C library
   * doesn't know anything about them. You set the locale in the
   * Control Panel. Setting these env vars won't have any effect on
   * locale-dependent C library functions like ctime(). But just for
   * kicks, do obey LC_ALL, LC_CTYPE and LANG in Pango. (This also makes
   * it easier to test GTK and Pango in various default languages, you
   * don't have to clickety-click in the Control Panel, you can simply
   * start the program with LC_ALL=something on the command line.)
   */
  gchar *p;

  p = getenv ("LC_ALL");
  if (p != NULL)
    return g_strdup (p);

  p = getenv ("LC_CTYPE");
  if (p != NULL)
    return g_strdup (p);

  p = getenv ("LANG");
  if (p != NULL)
    return g_strdup (p);

  return g_win32_getlocale ();
#else
  return g_strdup (setlocale (LC_CTYPE, NULL));
#endif
}
- Add g_unichar_width() that uses the _cjk variant if running under a CJK locale. The list of CJK locales will be lifted from VTE. We can document that the user can decide whether they are running under a CJK locale by testing the return value of g_unichar_width(SOME-AMBIGUOUS-CHAR). So no extra API is needed there.
- Add g_utf8_strwidth() that is similar to g_utf8_strlen() but uses g_unichar_width().
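As a rough illustration of the second item, the locale check and the ambiguous-character probe might look something like the sketch below. Everything named ex_* is hypothetical. The language list (ja, ko, zh) is my assumption and may well differ from the list in VTE, and the predicates are small stand-ins for g_unichar_iszerowidth(), g_unichar_iswide() and g_unichar_iswide_cjk(). The proposed g_unichar_width() would consult the locale internally; the sketch takes it as a flag only so that both behaviours can be shown side by side.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a locale counts as CJK when its language code is ja, ko or zh.
 * The list lifted from VTE may differ. */
static int
ex_locale_is_cjk (const char *lc_ctype)
{
  static const char *const langs[] = { "ja", "ko", "zh" };
  size_t i;

  if (lc_ctype == NULL)
    return 0;

  for (i = 0; i < sizeof langs / sizeof langs[0]; i++)
    {
      if (strncmp (lc_ctype, langs[i], 2) == 0)
        {
          /* Accept "ja", "ja_JP", "ja.UTF-8", "ja@mod", but not "jam". */
          char next = lc_ctype[2];
          if (next == '\0' || next == '_' || next == '.' || next == '@')
            return 1;
        }
    }

  return 0;
}

/* Hypothetical stand-in for g_unichar_iszerowidth(); sample range only. */
static int
ex_iszerowidth (uint32_t c)
{
  return c >= 0x0300 && c <= 0x036F;    /* combining diacritical marks */
}

/* Hypothetical stand-in for g_unichar_iswide(); sample ranges only. */
static int
ex_iswide (uint32_t c)
{
  return (c >= 0x3040 && c <= 0x30FF)   /* Hiragana and Katakana */
      || (c >= 0x4E00 && c <= 0x9FFF);  /* CJK unified ideographs */
}

/* Hypothetical stand-in for g_unichar_iswide_cjk(): everything that is
 * wide, plus a small sample of East Asian Ambiguous characters. */
static int
ex_iswide_cjk (uint32_t c)
{
  return ex_iswide (c)
      || c == 0x00B1                    /* plus-minus sign */
      || (c >= 0x03B1 && c <= 0x03C1);  /* Greek small alpha..rho */
}

/* Width of one character, using the _cjk variant under a CJK locale. */
static int
ex_unichar_width (uint32_t c, int cjk_locale)
{
  if (ex_iszerowidth (c))
    return 0;

  return (cjk_locale ? ex_iswide_cjk (c) : ex_iswide (c)) ? 2 : 1;
}
```

With this shape, a caller that wants to know whether CJK widths are in effect can probe with an ambiguous character such as U+03B1: a result of 2 means the _cjk variant is being used, and 1 means it is not.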
How does it sound? Still too trivial stuff to live in glib?