Commit b5fa5b98, authored and committed by Owen Taylor

Fixes for #58195, based on some ideas from Hidetosh Tajima.

Wed Sep 26 22:34:12 2001  Owen Taylor  <otaylor@redhat.com>

        Fixes for #58195, based on some ideas from Hidetosh Tajima.

        * aclibcharset.m4 glib/libcharset: Add Bruno Haible's
        portable-current charset detection code from libiconv.

        * glib/gutf8.c (g_utf8_get_charset_internal): Rewrite
        to use _g_locale_charset().

        * glib/gutf8.c (_g_charset_get_aliases): Private functions
        to get aliases from libcharset for a particular canonical
        name.

        * glib/gconvert.c: If loading a charset fails, try
        aliases to look for fallbacks.
parent c7896e13
Wed Sep 26 22:34:12 2001 Owen Taylor <otaylor@redhat.com>
Fixes for #58195, based on some ideas from Hidetosh Tajima.
* aclibcharset.m4 glib/libcharset: Add Bruno Haible's
portable-current charset detection code from libiconv.
* glib/gutf8.c (g_utf8_get_charset_internal): Rewrite
to use _g_locale_charset().
* glib/gutf8.c (_g_charset_get_aliases): Private functions
to get aliases from libcharset for a particular canonical
name.
* glib/gconvert.c: If loading a charset fails, try
aliases to look for fallbacks.
2001-09-26 Matthias Clasen <matthiasc@poet.de>
* gmem.c (g_mem_is_system_malloc): Return !vtable_set.
......
dnl From libcharset 1.1
#serial 2
dnl From Bruno Haible.
AC_DEFUN(jm_LANGINFO_CODESET,
[
AC_CHECK_HEADERS(langinfo.h)
AC_CHECK_FUNCS(nl_langinfo)
AC_CACHE_CHECK([for nl_langinfo and CODESET], jm_cv_langinfo_codeset,
[AC_TRY_LINK([#include <langinfo.h>],
[char* cs = nl_langinfo(CODESET);],
jm_cv_langinfo_codeset=yes,
jm_cv_langinfo_codeset=no)
])
if test $jm_cv_langinfo_codeset = yes; then
AC_DEFINE(HAVE_LANGINFO_CODESET, 1,
[Define if you have <langinfo.h> and nl_langinfo(CODESET).])
fi
])
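The macro above only checks that a call to nl_langinfo(CODESET) compiles and links. As a rough illustration of what a program can do once HAVE_LANGINFO_CODESET is defined, here is a minimal sketch. The helper name copy_locale_codeset is hypothetical and not part of libcharset or GLib; the "US-ASCII" fallback mirrors the default used elsewhere in this commit.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the current locale's codeset name into buf.
 * Falls back to "US-ASCII" when the C library reports nothing useful.
 * Returns buf. */
static char *
copy_locale_codeset (char *buf, size_t buflen)
{
  const char *cs;

  assert (buf != NULL && buflen > 0);

  /* Pick up the locale from the environment; without this call the
   * program stays in the "C" locale. */
  setlocale (LC_CTYPE, "");

  cs = nl_langinfo (CODESET);
  if (cs == NULL || *cs == '\0')
    cs = "US-ASCII";

  strncpy (buf, cs, buflen - 1);
  buf[buflen - 1] = '\0';
  return buf;
}
```

The string returned by nl_langinfo() is system dependent, which is why the commit pairs it with an alias table to map it to a canonical name.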
#serial 2
# Test for the GNU C Library, version 2.1 or newer.
# From Bruno Haible.
AC_DEFUN(jm_GLIBC21,
[
AC_CACHE_CHECK(whether we are using the GNU C Library 2.1 or newer,
ac_cv_gnu_library_2_1,
[AC_EGREP_CPP([Lucky GNU user],
[
#include <features.h>
#ifdef __GNU_LIBRARY__
#if (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 1) || (__GLIBC__ > 2)
Lucky GNU user
#endif
#endif
],
ac_cv_gnu_library_2_1=yes,
ac_cv_gnu_library_2_1=no)
]
)
AC_SUBST(GLIBC21)
GLIBC21="$ac_cv_gnu_library_2_1"
]
)
......@@ -2,6 +2,7 @@ dnl ***********************************
dnl *** include special GLib macros ***
dnl ***********************************
builtin(include, acglib.m4)dnl
builtin(include, aclibcharset.m4)dnl
# require autoconf 2.13
AC_PREREQ(2.13)
......@@ -501,6 +502,12 @@ AC_C_BIGENDIAN
AC_CHECK_HEADERS([float.h limits.h pwd.h sys/param.h sys/poll.h sys/select.h])
AC_CHECK_HEADERS([sys/time.h sys/times.h unistd.h values.h stdint.h sched.h])
# Checks for libcharset
jm_LANGINFO_CODESET
jm_GLIBC21
AC_CHECK_HEADERS([stddef.h stdlib.h string.h])
AC_CHECK_FUNCS(setlocale)
AC_MSG_CHECKING(whether make is GNU Make)
STRIP_BEGIN=
STRIP_END=
......@@ -2145,6 +2152,7 @@ Makefile
build/Makefile
build/win32/Makefile
glib/Makefile
glib/libcharset/Makefile
gmodule/gmoduleconf.h
gmodule/Makefile
gobject/Makefile
......
<!-- ##### SECTION ./tmpl/messages.sgml:Long_Description ##### -->
<para>
These functions provide support for logging error messages or messages
used for debugging.
</para>
<para>
There are several built-in levels of messages, defined in #GLogLevelFlags.
These can be extended with user-defined levels.
</para>
<!-- ##### SECTION ./tmpl/messages.sgml:See_Also ##### -->
<para>
</para>
<!-- ##### SECTION ./tmpl/messages.sgml:Short_Description ##### -->
versatile support for logging messages with different levels of importance.
<!-- ##### SECTION ./tmpl/messages.sgml:Title ##### -->
Message Logging
<!-- ##### ENUM GChannelError ##### -->
<para>
......
## Process this file with automake to produce Makefile.in
SUBDIRS=libcharset
INCLUDES = -I$(top_srcdir) -DG_LOG_DOMAIN=g_log_domain_glib \
@GLIB_DEBUG_FLAGS@ -DG_DISABLE_DEPRECATED -DGLIB_COMPILATION
......@@ -141,8 +143,8 @@ if OS_WIN32
export_symbols = -export-symbols glib.def
endif
libglib_1_3_la_LIBADD = @GIO@ @GSPAWN@ @PLATFORMDEP@ @G_LIB_WIN32_RESOURCE@ @ICONV_LIBS@ @G_LIBS_EXTRA@
libglib_1_3_la_DEPENDENCIES = @GIO@ @GSPAWN@ @PLATFORMDEP@ @G_LIB_WIN32_RESOURCE@ @GLIB_DEF@
libglib_1_3_la_LIBADD = libcharset/libcharset.la @GIO@ @GSPAWN@ @PLATFORMDEP@ @G_LIB_WIN32_RESOURCE@ @ICONV_LIBS@ @G_LIBS_EXTRA@
libglib_1_3_la_DEPENDENCIES = libcharset/libcharset.la @GIO@ @GSPAWN@ @PLATFORMDEP@ @G_LIB_WIN32_RESOURCE@ @GLIB_DEF@
libglib_1_3_la_LDFLAGS = \
-version-info $(LT_CURRENT):$(LT_REVISION):$(LT_AGE) \
......
......@@ -53,6 +53,41 @@ g_convert_error_quark()
#error libiconv not in use but included iconv.h is from libiconv
#endif
static gboolean
try_conversion (const char *to_codeset,
const char *from_codeset,
iconv_t *cd)
{
*cd = iconv_open (to_codeset, from_codeset);
if (*cd == (iconv_t)-1 && errno == EINVAL)
return FALSE;
else
return TRUE;
}
static gboolean
try_to_aliases (const char **to_aliases,
const char *from_codeset,
iconv_t *cd)
{
if (to_aliases)
{
const char **p = to_aliases;
while (*p)
{
if (try_conversion (*p, from_codeset, cd))
return TRUE;
p++;
}
}
return FALSE;
}
extern const char **_g_charset_get_aliases (const char *canonical_name);
/**
* g_iconv_open:
* @to_codeset: destination codeset
......@@ -71,8 +106,32 @@ GIConv
g_iconv_open (const gchar *to_codeset,
const gchar *from_codeset)
{
iconv_t cd = iconv_open (to_codeset, from_codeset);
iconv_t cd;
if (!try_conversion (to_codeset, from_codeset, &cd))
{
const char **to_aliases = _g_charset_get_aliases (to_codeset);
const char **from_aliases = _g_charset_get_aliases (from_codeset);
if (from_aliases)
{
const char **p = from_aliases;
while (*p)
{
if (try_conversion (to_codeset, *p, &cd))
return (GIConv)cd;
if (try_to_aliases (to_aliases, *p, &cd))
return (GIConv)cd;
p++;
}
}
if (try_to_aliases (to_aliases, from_codeset, &cd))
return (GIConv)cd;
}
return (GIConv)cd;
}
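The order in which g_iconv_open() tries names can be sketched without iconv at all. The following is an illustration only: pair_opener, open_with_fallback and demo_opener are hypothetical names, and the opener callback stands in for iconv_open(). It tries the names as given, then each source alias (alone and combined with each destination alias), and finally each destination alias with the original source name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for iconv_open(): returns 1 if the pair works. */
typedef int (*pair_opener) (const char *to, const char *from);

/* Try every name in a NULL-terminated list as the destination. */
static int
try_to_list (pair_opener open_pair, const char *const *to_list,
             const char *from, const char **to_used)
{
  if (to_list)
    for (; *to_list; to_list++)
      if (open_pair (*to_list, from))
        {
          *to_used = *to_list;
          return 1;
        }
  return 0;
}

/* Returns 1 and reports the names that worked, or 0 if nothing did. */
static int
open_with_fallback (pair_opener open_pair,
                    const char *to, const char *from,
                    const char *const *to_aliases,
                    const char *const *from_aliases,
                    const char **to_used, const char **from_used)
{
  assert (open_pair && to && from && to_used && from_used);

  *to_used = to;
  *from_used = from;
  if (open_pair (to, from))               /* 1. names exactly as given */
    return 1;

  if (from_aliases)
    for (; *from_aliases; from_aliases++)
      {
        *from_used = *from_aliases;
        if (open_pair (to, *from_aliases)) /* 2. aliased source only */
          return 1;
        if (try_to_list (open_pair, to_aliases, *from_aliases, to_used))
          return 1;                        /* 3. both sides aliased */
      }

  *from_used = from;                       /* 4. aliased destination only */
  return try_to_list (open_pair, to_aliases, from, to_used);
}

/* Demo opener: pretends only ("UTF-8", "ISO8859-1") is supported. */
static int
demo_opener (const char *to, const char *from)
{
  return strcmp (to, "UTF-8") == 0 && strcmp (from, "ISO8859-1") == 0;
}
```

With demo_opener, a request for "UTF-8" from "ISO-8859-1" fails as given but succeeds once the source alias "ISO8859-1" is tried. A real opener would wrap iconv_open() and compare the result against (iconv_t)-1.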
......
......@@ -36,6 +36,8 @@
#undef STRICT
#endif
#include "libcharset/libcharset.h"
#include "glibintl.h"
#define UTF8_COMPUTE(Char, Mask, Len) \
......@@ -348,60 +350,105 @@ g_utf8_strncpy (gchar *dest,
return dest;
}
static gboolean
g_utf8_get_charset_internal (char **a)
{
char *charset = getenv("CHARSET");
G_LOCK_DEFINE_STATIC (aliases);
if (charset && a && ! *a)
*a = charset;
static GHashTable *
get_alias_hash (void)
{
static GHashTable *alias_hash = NULL;
const char *aliases;
if (charset && strstr (charset, "UTF-8"))
return TRUE;
G_LOCK (aliases);
#ifdef HAVE_CODESET
charset = nl_langinfo(CODESET);
if (charset)
if (!alias_hash)
{
if (a && ! *a)
*a = charset;
if (strcmp (charset, "UTF-8") == 0)
return TRUE;
alias_hash = g_hash_table_new (g_str_hash, g_str_equal);
aliases = _g_locale_get_charset_aliases ();
while (*aliases != '\0')
{
const char *canonical;
const char *alias;
const char **alias_array;
int count = 0;
alias = aliases;
aliases += strlen (aliases) + 1;
canonical = aliases;
aliases += strlen (aliases) + 1;
alias_array = g_hash_table_lookup (alias_hash, canonical);
if (alias_array)
{
while (alias_array[count])
count++;
}
alias_array = g_renew (const char *, alias_array, count + 2);
alias_array[count] = alias;
alias_array[count + 1] = NULL;
g_hash_table_insert (alias_hash, (char *)canonical, alias_array);
}
}
#endif
#if 0 /* #ifdef _NL_CTYPE_CODESET_NAME */
charset = nl_langinfo (_NL_CTYPE_CODESET_NAME);
if (charset)
G_UNLOCK (aliases);
return alias_hash;
}
/* As an abuse of the alias table, the following routine gets
* the charsets that are aliases for the canonical name.
*/
const char **
_g_charset_get_aliases (const char *canonical_name)
{
GHashTable *alias_hash = get_alias_hash ();
return g_hash_table_lookup (alias_hash, canonical_name);
}
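The loop in get_alias_hash() walks a table laid out as consecutive "alias\0canonical\0" pairs, ended by an empty string. A minimal sketch of the same walk, without GLib and without a hash table, is given here. The function name collect_aliases and the fixed-size output array are assumptions made for illustration, not how the commit stores the result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan a table of "alias\0canonical\0" pairs,
 * terminated by an empty string, and store up to max_out aliases whose
 * canonical name equals `canonical`. Returns the number stored. */
static size_t
collect_aliases (const char *table, const char *canonical,
                 const char **out, size_t max_out)
{
  size_t count = 0;

  assert (table != NULL && canonical != NULL && out != NULL);

  while (*table != '\0')
    {
      const char *alias;
      const char *canon;

      alias = table;                 /* first string of the pair */
      table += strlen (table) + 1;   /* skip it and its NUL */
      canon = table;                 /* second string of the pair */
      table += strlen (table) + 1;

      if (strcmp (canon, canonical) == 0 && count < max_out)
        out[count++] = alias;
    }

  return count;
}
```

The pointers stored in out refer into the table itself, so the table must outlive them. The commit does the same thing when it stores alias pointers in the hash table without copying.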
static gboolean
g_utf8_get_charset_internal (const char **a)
{
const char *charset = getenv("CHARSET");
if (charset && *charset)
{
if (a && ! *a)
*a = charset;
if (strcmp (charset, "UTF-8") == 0)
*a = charset;
if (charset && strstr (charset, "UTF-8"))
return TRUE;
else
return FALSE;
}
#endif
#ifdef G_PLATFORM_WIN32
if (a && ! *a)
/* The libcharset code tries to be thread-safe without
* a lock, but has a memory leak and a missing memory
* barrier, so we lock for it
*/
G_LOCK (aliases);
charset = _g_locale_charset ();
G_UNLOCK (aliases);
if (charset && *charset)
{
static char codepage[10];
*a = charset;
sprintf (codepage, "CP%d", GetACP ());
*a = codepage;
/* What about codepage 1200? Is that UTF-8? */
return FALSE;
if (charset && strstr (charset, "UTF-8"))
return TRUE;
else
return FALSE;
}
#else
if (a && ! *a)
*a = "US-ASCII";
#endif
/* Assume this for compatibility at present. */
*a = "US-ASCII";
return FALSE;
}
static int utf8_locale_cache = -1;
static char *utf8_charset_cache = NULL;
static const char *utf8_charset_cache = NULL;
/**
* g_get_charset:
......
Makefile.in
Makefile
.deps
.libs
ref-add.sed
ref-del.sed
charset.alias
## Process this file with automake to produce Makefile.in
INCLUDES = \
-DLIBDIR=\"$(libdir)\"
noinst_LTLIBRARIES = libcharset.la
libcharset_la_SOURCES = \
libcharset.h \
localcharset.c
EXTRA_DIST = \
README \
charset.alias \
ref-add.sed \
ref-del.sed \
update.sh \
make-patch.sh
charset_alias = $(DESTDIR)$(libdir)/charset.alias
charset_tmp = $(DESTDIR)$(libdir)/charset.tmp
install-exec-local: all-local
$(mkinstalldirs) $(DESTDIR)$(libdir)
if test -f $(charset_alias); then \
sed -f ref-add.sed $(charset_alias) > $(charset_tmp) ; \
$(INSTALL_DATA) $(charset_tmp) $(charset_alias) ; \
rm -f $(charset_tmp) ; \
else \
if test @GLIBC21@ = no; then \
sed -f ref-add.sed charset.alias > $(charset_tmp) ; \
$(INSTALL_DATA) $(charset_tmp) $(charset_alias) ; \
rm -f $(charset_tmp) ; \
fi ; \
fi
uninstall-local: all-local
if test -f $(charset_alias); then \
sed -f ref-del.sed $(charset_alias) > $(charset_tmp); \
if grep '^# Packages using this file: $$' $(charset_tmp) \
> /dev/null; then \
rm -f $(charset_alias); \
else \
$(INSTALL_DATA) $(charset_tmp) $(charset_alias); \
fi; \
rm -f $(charset_tmp); \
fi
charset.alias: config.charset
$(SHELL) $(srcdir)/config.charset '@host@' > t-$@
mv t-$@ $@
all-local: ref-add.sed ref-del.sed charset.alias
SUFFIXES = .sed .sin
.sin.sed:
sed -e '/^#/d' -e 's/@''PACKAGE''@/@PACKAGE@/g' $< > t-$@
mv t-$@ $@
CLEANFILES = charset.alias ref-add.sed ref-del.sed
The sources are derived from Bruno Haible's libcharset library included
with libiconv:
http://www.gnu.org/software/libiconv
The 'update.sh' script in this directory, when pointed at
the original sources, updates the files in this directory
(and elsewhere in the GLib distribution) to the new version.
The 'make-patch.sh' script in this directory regenerates
the patch files included in this directory from a copy
of the pristine sources and the files in this directory.
The license on the portions from libiconv is reproduced
below.
Owen Taylor
26 September 2001
====
/* Determine a canonical name for the current locale's character encoding.
Copyright (C) 2000-2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU Library General Public License as published
by the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
USA. */
/* Written by Bruno Haible <haible@clisp.cons.org>. */
#! /bin/sh
# Output a system dependent table of character encoding aliases.
#
# Copyright (C) 2000-2001 Free Software Foundation, Inc.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU Library General Public License as published
# by the Free Software Foundation; either version 2, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Library General Public License for more details.
#
# You should have received a copy of the GNU Library General Public
# License along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
# USA.
#
# The table consists of lines of the form
# ALIAS CANONICAL
#
# ALIAS is the (system dependent) result of "nl_langinfo (CODESET)".
# ALIAS is compared in a case sensitive way.
#
# CANONICAL is the GNU canonical name for this character encoding.
# It must be an encoding supported by libiconv. Support by GNU libc is
# also desirable. CANONICAL is case insensitive. Usually an upper case
# MIME charset name is preferred.
# The current list of GNU canonical charset names is as follows.
#
# name used by which systems a MIME name?
# ASCII, ANSI_X3.4-1968 glibc solaris freebsd
# ISO-8859-1 glibc aix hpux irix osf solaris freebsd yes
# ISO-8859-2 glibc aix hpux irix osf solaris freebsd yes
# ISO-8859-3 glibc yes
# ISO-8859-4 osf solaris freebsd yes
# ISO-8859-5 glibc aix hpux irix osf solaris freebsd yes
# ISO-8859-6 glibc aix hpux solaris yes
# ISO-8859-7 glibc aix hpux irix osf solaris yes
# ISO-8859-8 glibc aix hpux osf solaris yes
# ISO-8859-9 glibc aix hpux irix osf solaris yes
# ISO-8859-13 glibc
# ISO-8859-15 glibc aix osf solaris freebsd
# KOI8-R glibc solaris freebsd yes
# KOI8-U glibc freebsd yes
# CP437 dos
# CP775 dos
# CP850 aix osf dos
# CP852 dos
# CP855 dos
# CP856 aix
# CP857 dos
# CP861 dos
# CP862 dos
# CP864 dos
# CP865 dos
# CP866 freebsd dos
# CP869 dos
# CP874 win32 dos
# CP922 aix
# CP932 aix win32 dos
# CP943 aix
# CP949 osf win32 dos
# CP950 win32 dos
# CP1046 aix
# CP1124 aix
# CP1129 aix
# CP1250 win32
# CP1251 glibc win32
# CP1252 aix win32
# CP1253 win32
# CP1254 win32
# CP1255 win32
# CP1256 win32
# CP1257 win32
# GB2312 glibc aix hpux irix solaris freebsd yes
# EUC-JP glibc aix hpux irix osf solaris freebsd yes
# EUC-KR glibc aix hpux irix osf solaris freebsd yes
# EUC-TW glibc aix hpux irix osf solaris
# BIG5 glibc aix hpux osf solaris freebsd yes
# BIG5-HKSCS glibc
# GBK aix osf win32 dos
# GB18030 glibc
# SHIFT_JIS hpux osf solaris freebsd yes
# JOHAB glibc win32
# TIS-620 glibc aix hpux osf solaris
# VISCII glibc yes
# HP-ROMAN8 hpux
# HP-ARABIC8 hpux
# HP-GREEK8 hpux
# HP-HEBREW8 hpux
# HP-TURKISH8 hpux
# HP-KANA8 hpux
# DEC-KANJI osf
# DEC-HANYU osf
# UTF-8 glibc aix hpux osf solaris yes
#
# Note: Names which are not marked as being a MIME name should not be used in
# Internet protocols for information interchange (mail, news, etc.).
#
# Note: ASCII and ANSI_X3.4-1968 are synonymous canonical names. Applications
# must understand both names and treat them as equivalent.
#
# The first argument passed to this file is the canonical host specification,
# CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM
# or
# CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM
host="$1"
os=`echo "$host" | sed -e 's/^[^-]*-[^-]*-\(.*\)$/\1/'`
echo "# This file contains a table of character encoding aliases,"
echo "# suitable for operating system '${os}'."
echo "# It was automatically generated from config.charset."
# List of references, updated during installation:
echo "# Packages using this file: "
case "$os" in
linux* | *-gnu*)
# With glibc-2.1 or newer, we don't need any canonicalization,
# because glibc has iconv and both glibc and libiconv support all
# GNU canonical names directly. Therefore, the Makefile does not
# need to install the alias file at all.