Re: [PATCH] iconv.3: Clarify the behavior when input is untranslatable

Sorry, ignore this patch.  I forgot to remove Reuben's authorship
when I modified it.  I also forgot to specify v2.

On 5/21/23 12:31, Alejandro Colomar wrote:
> From: Reuben Thomas <rrt@xxxxxxxx>
> 
> The manual page does not fully reflect the behaviour of glibc's
> iconv(3).  The manual page says:
> 
>     The conversion can stop for four reasons:
> 
>     -  An invalid multibyte sequence is encountered in the input.  In
>        this case, it sets errno to EILSEQ and returns (size_t) -1.
>        *inbuf is left pointing to the beginning of the invalid multibyte
>        sequence.
> 
>     [...]
> 
> The phrase "An invalid multibyte sequence is encountered in the input"
> is confusing, because it suggests that it refers only to the validity of
> the input per se (e.g. a non-UTF-8 sequence in input purporting to be
> UTF-8).
> 
> However, according to the original author of the manual page, Bruno
> Haible[1], it also refers to input that cannot be translated to the
> desired output encoding; and indeed, glibc's iconv returns EILSEQ when
> the input cannot be translated, even though it is valid.
> 
> This patch adds language that reflects the actual behavior, by adding an
> explicit bullet that distinguishes this case.
> 
> Link: [1] <https://sourceware.org/bugzilla/show_bug.cgi?id=29913#c4>
> Link: <https://bugzilla.kernel.org/show_bug.cgi?id=217059>
> Reported-by: Reuben Thomas <rrt@xxxxxxxx>
> Cc: Steffen Nurpmeso <steffen@xxxxxxxxxx>
> Cc: Bruno Haible <bruno@xxxxxxxxx>
> Cc: Martin Sebor <msebor@xxxxxxxxxx>
> Signed-off-by: Alejandro Colomar <alx@xxxxxxxxxx>
> 
> ---
>  man3/iconv.3 | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/man3/iconv.3 b/man3/iconv.3
> index 66f59b8c3..6bb27c802 100644
> --- a/man3/iconv.3
> +++ b/man3/iconv.3
> @@ -80,6 +80,14 @@ .SH DESCRIPTION
>  \fI*inbuf\fP
>  is left pointing to the beginning of the invalid multibyte sequence.
>  .IP \[bu]
> +A multibyte sequence is encountered in the input which
> +cannot be translated to the character encoding of the output.
> +In this case,
> +it sets \fIerrno\fP to \fBEILSEQ\fP and returns
> +.IR (size_t)\ \-1 .
> +\fI*inbuf\fP
> +is left pointing to the beginning of the invalid multibyte sequence.
> +.IP \[bu]
>  The input byte sequence has been entirely converted,
>  that is, \fI*inbytesleft\fP has gone down to 0.
>  In this case,
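
For anyone who wants to check the behavior described in the commit
message, here is a minimal sketch.  try_convert() is a hypothetical
helper, not part of the patch; it assumes glibc's iconv(3), that the
input is meant to be UTF-8, and that the "ASCII" charset name is
available.  Other implementations may substitute a replacement
character instead of failing.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): convert `in` from UTF-8
 * to `tocode`.  Returns 0 on success, or the errno value set by
 * iconv_open() or iconv() on failure.  If `consumed` is not NULL, it
 * receives the number of input bytes consumed before iconv() stopped,
 * that is, the offset at which *inbuf was left pointing.
 */
static int
try_convert(const char *tocode, const char *in, size_t *consumed)
{
    char     outbuf[64];
    char     *inp = (char *) in;  /* iconv() takes char **; input is not modified */
    char     *outp = outbuf;
    size_t   inlen = strlen(in);
    size_t   inleft = inlen;
    size_t   outleft = sizeof(outbuf);
    int      err = 0;
    iconv_t  cd;

    cd = iconv_open(tocode, "UTF-8");
    if (cd == (iconv_t) -1)
        return errno;

    /* Save errno right away; iconv_close() below may change it. */
    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t) -1)
        err = errno;

    if (consumed != NULL)
        *consumed = inlen - inleft;

    iconv_close(cd);
    return err;
}
```

Calling try_convert("ASCII", "a\xe2\x82\xacz", &n) passes valid UTF-8
(a EURO SIGN between two ASCII letters) that has no ASCII equivalent,
while try_convert("ASCII", "a\xffz", &n) passes a byte that is not
valid UTF-8 at all.  With glibc, both calls are expected to return
EILSEQ with n == 1, so the caller cannot tell the two cases apart from
errno alone, which is the point the new bullet tries to document.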

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
