Commit de298ae7 authored by Matthias Clasen, committed by Matthias Clasen

Cleanup converter state after the conversion. Document streaming

2005-08-02  Matthias Clasen  <mclasen@redhat.com>

	* glib/gconvert.c (g_convert_with_iconv, g_convert_with_fallback):
	Cleanup converter state after the conversion. Document streaming
	conversion pitfalls.  (#311337)
parent 9dfc1abf
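The "cleanup converter state" step in the commit message can be sketched outside GLib. The following is a minimal illustration, not the code from this commit: it uses the POSIX iconv() call (g_iconv() takes the same arguments), and the helper names convert_and_flush and convert_string are hypothetical. Once all input has been consumed, it calls iconv() one more time with a NULL input buffer so that a stateful encoder can emit any closing shift sequence and return to its initial state.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper, not GLib code: one-shot conversion that ends by
 * calling iconv() with a NULL input buffer to flush the shift state. */
static char *
convert_and_flush (iconv_t cd, const char *in, size_t in_len, size_t *out_len)
{
  size_t size = in_len + 16;
  char *dest, *outp;
  char *p = (char *) in;
  size_t in_left = in_len;
  size_t out_left;
  int flushing = 0;

  assert (in != NULL && cd != (iconv_t) -1);

  dest = outp = malloc (size);
  if (dest == NULL)
    return NULL;
  out_left = size - 1;                /* keep one byte for the nul */

  for (;;)
    {
      size_t r = iconv (cd, flushing ? NULL : &p, &in_left, &outp, &out_left);

      if (r != (size_t) -1)
        {
          if (flushing)
            break;                    /* input consumed and state flushed */
          flushing = 1;               /* next pass: NULL inbuf flushes state */
          in_left = 0;
        }
      else if (errno == E2BIG)
        {
          /* Output buffer full: double it and retry the same step. */
          size_t used = (size_t) (outp - dest);
          char *grown = realloc (dest, size * 2);

          if (grown == NULL)
            {
              free (dest);
              return NULL;
            }
          size *= 2;
          dest = grown;
          outp = dest + used;
          out_left = size - used - 1;
        }
      else if (errno == EINVAL)
        break;                        /* incomplete trailing character: stop */
      else
        {
          free (dest);                /* EILSEQ or another failure */
          return NULL;
        }
    }

  *outp = '\0';
  if (out_len != NULL)
    *out_len = (size_t) (outp - dest);
  return dest;
}

/* Hypothetical wrapper: opens a descriptor, converts, and closes it again. */
static char *
convert_string (const char *to, const char *from,
                const char *in, size_t in_len, size_t *out_len)
{
  iconv_t cd = iconv_open (to, from);
  char *res;

  if (cd == (iconv_t) -1)
    return NULL;
  res = convert_and_flush (cd, in, in_len, out_len);
  iconv_close (cd);
  return res;
}
```

A caller would use it as `convert_string ("UTF-8", "ISO-8859-1", text, strlen (text), &n)` and free() the result. For stateless target encodings the extra flush call writes nothing, so it is harmless there.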
2005-08-02 Matthias Clasen <mclasen@redhat.com>
* glib/gconvert.c (g_convert_with_iconv, g_convert_with_fallback):
Cleanup converter state after the conversion. Document streaming
conversion pitfalls. (#311337)
2005-08-02 Tor Lillqvist <tml@novell.com>
* tests/refcount/objects.c
......
@@ -463,77 +463,6 @@ close_converter (GIConv converter)
return 0;
}
/**
* g_convert:
* @str: the string to convert
* @len: the length of the string, or -1 if the string is
* nul-terminated<footnote id="nul-unsafe">
<para>
Note that some encodings may allow nul bytes to
occur inside strings. In that case, using -1 for
the @len parameter is unsafe.
</para>
</footnote>.
* @to_codeset: name of character set into which to convert @str
* @from_codeset: character set of @str.
* @bytes_read: location to store the number of bytes in the
* input string that were successfully converted, or %NULL.
* Even if the conversion was successful, this may be
* less than @len if there were partial characters
* at the end of the input. If the error
* #G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
* stored will be the byte offset after the last valid
* input sequence.
* @bytes_written: the number of bytes stored in the output buffer (not
* including the terminating nul).
* @error: location to store the error occurring, or %NULL to ignore
* errors. Any of the errors in #GConvertError may occur.
*
* Converts a string from one character set to another.
*
* Return value: If the conversion was successful, a newly allocated
* nul-terminated string, which must be freed with
* g_free(). Otherwise %NULL and @error will be set.
**/
gchar*
g_convert (const gchar *str,
gssize len,
const gchar *to_codeset,
const gchar *from_codeset,
gsize *bytes_read,
gsize *bytes_written,
GError **error)
{
gchar *res;
GIConv cd;
g_return_val_if_fail (str != NULL, NULL);
g_return_val_if_fail (to_codeset != NULL, NULL);
g_return_val_if_fail (from_codeset != NULL, NULL);
cd = open_converter (to_codeset, from_codeset, error);
if (cd == (GIConv) -1)
{
if (bytes_read)
*bytes_read = 0;
if (bytes_written)
*bytes_written = 0;
return NULL;
}
res = g_convert_with_iconv (str, len, cd,
bytes_read, bytes_written,
error);
close_converter (cd);
return res;
}
/**
* g_convert_with_iconv:
* @str: the string to convert
@@ -553,7 +482,30 @@ g_convert (const gchar *str,
* @error: location to store the error occurring, or %NULL to ignore
* errors. Any of the errors in #GConvertError may occur.
*
* Converts a string from one character set to another.
*
* Note that despite the fact that @bytes_read can return information
* about partial characters, this function is not generally suitable
* for streaming. It may not handle stateful encodings like CP1255
* correctly, since it doesn't keep the @converter state across
* multiple invocations. If you need to do streaming conversions
* which may involve stateful encodings, you have to use g_iconv()
* directly.
*
* Note that you should use g_iconv() for streaming
* conversions<footnote id="streaming-state">
* <para>
* Despite the fact that @bytes_read can return information about partial
* characters, the <literal>g_convert_...</literal> functions
* are not generally suitable for streaming. If the underlying converter
* being used maintains internal state, then this won't be preserved
* across successive calls to g_convert(), g_convert_with_iconv() or
* g_convert_with_fallback(). (An example of this is the GNU C converter
* for CP1255 which does not emit a base character until it knows that
* the next character is not a mark that could combine with the base
* character.)
* </para>
* </footnote>.
*
* Return value: If the conversion was successful, a newly allocated
* nul-terminated string, which must be freed with
@@ -570,13 +522,14 @@ g_convert_with_iconv (const gchar *str,
gchar *dest;
gchar *outp;
const gchar *p;
+const gchar *shift_p = NULL;
gsize inbytes_remaining;
gsize outbytes_remaining;
gsize err;
gsize outbuf_size;
gboolean have_error = FALSE;
+gboolean done = FALSE;
-g_return_val_if_fail (str != NULL, NULL);
g_return_val_if_fail (converter != (GIConv) -1, NULL);
if (len < 0)
@@ -589,45 +542,60 @@ g_convert_with_iconv (const gchar *str,
outbytes_remaining = outbuf_size - 1; /* -1 for nul */
outp = dest = g_malloc (outbuf_size);
- again:
-  err = g_iconv (converter, (char **)&p, &inbytes_remaining, &outp, &outbytes_remaining);
-  if (err == (size_t) -1)
-    {
-      switch (errno)
-        {
-        case EINVAL:
-          /* Incomplete text, do not report an error */
-          break;
-        case E2BIG:
-          {
-            size_t used = outp - dest;
-            outbuf_size *= 2;
-            dest = g_realloc (dest, outbuf_size);
-            outp = dest + used;
-            outbytes_remaining = outbuf_size - used - 1; /* -1 for nul */
-            goto again;
-          }
-        case EILSEQ:
-          if (error)
-            g_set_error (error, G_CONVERT_ERROR, G_CONVERT_ERROR_ILLEGAL_SEQUENCE,
-                         _("Invalid byte sequence in conversion input"));
-          have_error = TRUE;
-          break;
-        default:
-          if (error)
-            g_set_error (error, G_CONVERT_ERROR, G_CONVERT_ERROR_FAILED,
-                         _("Error during conversion: %s"),
-                         g_strerror (errno));
-          have_error = TRUE;
-          break;
-        }
-    }
+  while (!done && !have_error)
+    {
+      err = g_iconv (converter, (char **)&p, &inbytes_remaining, &outp, &outbytes_remaining);
+      if (err == (size_t) -1)
+        {
+          switch (errno)
+            {
+            case EINVAL:
+              /* Incomplete text, do not report an error */
+              break;
+            case E2BIG:
+              {
+                size_t used = outp - dest;
+                outbuf_size *= 2;
+                dest = g_realloc (dest, outbuf_size);
+                outp = dest + used;
+                outbytes_remaining = outbuf_size - used - 1; /* -1 for nul */
+              }
+              break;
+            case EILSEQ:
+              if (error)
+                g_set_error (error, G_CONVERT_ERROR, G_CONVERT_ERROR_ILLEGAL_SEQUENCE,
+                             _("Invalid byte sequence in conversion input"));
+              have_error = TRUE;
+              break;
+            default:
+              if (error)
+                g_set_error (error, G_CONVERT_ERROR, G_CONVERT_ERROR_FAILED,
+                             _("Error during conversion: %s"),
+                             g_strerror (errno));
+              have_error = TRUE;
+              break;
+            }
+        }
+      else
+        {
+          if (!shift_p)
+            {
+              /* call g_iconv with NULL inbuf to cleanup shift state */
+              shift_p = p;
+              p = NULL;
+              inbytes_remaining = 0;
+            }
+          else
+            done = TRUE;
+        }
+    }
+  if (shift_p)
+    p = shift_p;
*outp = '\0';
if (bytes_read)
@@ -658,6 +626,87 @@ g_convert_with_iconv (const gchar *str,
return dest;
}
/**
* g_convert:
* @str: the string to convert
* @len: the length of the string, or -1 if the string is
* nul-terminated<footnote id="nul-unsafe">
<para>
Note that some encodings may allow nul bytes to
occur inside strings. In that case, using -1 for
the @len parameter is unsafe.
</para>
</footnote>.
* @to_codeset: name of character set into which to convert @str
* @from_codeset: character set of @str.
* @bytes_read: location to store the number of bytes in the
* input string that were successfully converted, or %NULL.
* Even if the conversion was successful, this may be
* less than @len if there were partial characters
* at the end of the input. If the error
* #G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
* stored will be the byte offset after the last valid
* input sequence.
* @bytes_written: the number of bytes stored in the output buffer (not
* including the terminating nul).
* @error: location to store the error occurring, or %NULL to ignore
* errors. Any of the errors in #GConvertError may occur.
*
* Converts a string from one character set to another.
*
* Note that despite the fact that @bytes_read can return information
* about partial characters, this function is not generally suitable
* for streaming. It may not handle stateful encodings like CP1255
* correctly, since it doesn't keep the @converter state across
* multiple invocations. If you need to do streaming conversions
* which may involve stateful encodings, you have to use g_iconv()
* directly.
*
* Note that you should use g_iconv() for streaming
* conversions<footnoteref linkend="streaming-state"/>.
*
* Return value: If the conversion was successful, a newly allocated
* nul-terminated string, which must be freed with
* g_free(). Otherwise %NULL and @error will be set.
**/
gchar*
g_convert (const gchar *str,
gssize len,
const gchar *to_codeset,
const gchar *from_codeset,
gsize *bytes_read,
gsize *bytes_written,
GError **error)
{
gchar *res;
GIConv cd;
g_return_val_if_fail (str != NULL, NULL);
g_return_val_if_fail (to_codeset != NULL, NULL);
g_return_val_if_fail (from_codeset != NULL, NULL);
cd = open_converter (to_codeset, from_codeset, error);
if (cd == (GIConv) -1)
{
if (bytes_read)
*bytes_read = 0;
if (bytes_written)
*bytes_written = 0;
return NULL;
}
res = g_convert_with_iconv (str, len, cd,
bytes_read, bytes_written,
error);
close_converter (cd);
return res;
}
/**
* g_convert_with_fallback:
* @str: the string to convert
@@ -688,6 +737,9 @@ g_convert_with_iconv (const gchar *str,
* to @to_codeset in their iconv() functions,
* in which case GLib will simply return that approximate conversion.
*
* Note that you should use g_iconv() for streaming
* conversions<footnoteref linkend="streaming-state"/>.
*
* Return value: If the conversion was successful, a newly allocated
* nul-terminated string, which must be freed with
* g_free(). Otherwise %NULL and @error will be set.
@@ -819,7 +871,7 @@ g_convert_with_fallback (const gchar *str,
have_error = TRUE;
break;
}
-else
+else if (p)
{
if (!fallback)
{
@@ -834,8 +886,9 @@ g_convert_with_fallback (const gchar *str,
save_inbytes = inbytes_remaining - (save_p - p);
p = insert_str;
inbytes_remaining = strlen (p);
+break;
}
-break;
+/* fall thru if p is NULL */
default:
g_set_error (error, G_CONVERT_ERROR, G_CONVERT_ERROR_FAILED,
_("Error during conversion: %s"),
@@ -854,6 +907,12 @@ g_convert_with_fallback (const gchar *str,
inbytes_remaining = save_inbytes;
save_p = NULL;
}
+else if (p)
+{
+/* call g_iconv with NULL inbuf to cleanup shift state */
+p = NULL;
+inbytes_remaining = 0;
+}
else
done = TRUE;
}
......
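The new documentation recommends calling g_iconv() directly for streaming conversions. A rough sketch of what that can look like follows, under stated assumptions: it uses POSIX iconv() (g_iconv() takes the same arguments), the stream_conv type and stream_* functions are hypothetical names, and CARRY_MAX is an assumed bound on the size of one incomplete character. The descriptor stays open across chunks so its internal state survives, bytes of a character split across a chunk boundary are carried into the next call, and the shift state is flushed once at the end of the stream.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

#define CARRY_MAX 16   /* assumed upper bound on one incomplete character */

/* Hypothetical streaming state, not a GLib type. */
typedef struct
{
  iconv_t cd;              /* stays open across chunks, keeping its state */
  char carry[CARRY_MAX];   /* incomplete bytes left over from the last chunk */
  size_t carry_len;
} stream_conv;

static int
stream_open (stream_conv *sc, const char *to, const char *from)
{
  sc->cd = iconv_open (to, from);
  sc->carry_len = 0;
  return sc->cd == (iconv_t) -1 ? -1 : 0;
}

/* Converts one chunk into out[0..out_cap). Returns the number of bytes
 * written, or (size_t) -1 on invalid input or if out is too small. */
static size_t
stream_feed (stream_conv *sc, const char *chunk, size_t chunk_len,
             char *out, size_t out_cap)
{
  size_t in_left, out_left = out_cap, written;
  char *buf, *p, *outp = out;

  assert (sc != NULL && chunk != NULL);

  /* Prepend the bytes carried over from the previous chunk. */
  in_left = sc->carry_len + chunk_len;
  buf = p = malloc (in_left + 1);
  if (buf == NULL)
    return (size_t) -1;
  memcpy (buf, sc->carry, sc->carry_len);
  memcpy (buf + sc->carry_len, chunk, chunk_len);
  sc->carry_len = 0;

  if (iconv (sc->cd, &p, &in_left, &outp, &out_left) == (size_t) -1)
    {
      if (errno != EINVAL || in_left > CARRY_MAX)
        {
          free (buf);
          return (size_t) -1;
        }
      /* EINVAL: the chunk ends inside a character; keep the tail. */
      memcpy (sc->carry, p, in_left);
      sc->carry_len = in_left;
    }

  written = (size_t) (outp - out);
  free (buf);
  return written;
}

/* End of stream: flush the shift state with a NULL input buffer, then close.
 * Returns bytes written, or (size_t) -1 if the input ended mid-character. */
static size_t
stream_finish (stream_conv *sc, char *out, size_t out_cap)
{
  char *outp = out;
  size_t out_left = out_cap;
  size_t r = iconv (sc->cd, NULL, NULL, &outp, &out_left);
  int truncated = sc->carry_len != 0;

  iconv_close (sc->cd);
  if (r == (size_t) -1 || truncated)
    return (size_t) -1;
  return (size_t) (outp - out);
}
```

The sketch keeps the output buffer fixed and treats a full buffer as an error to stay short; a real implementation would more likely grow the buffer or hand back the unconsumed input.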