[PATCH] iconv.3: Clarify the behavior when input is untranslatable

From: Reuben Thomas <rrt@xxxxxxxx>

The manual page does not fully reflect the behavior of glibc's
iconv(3).  It says:

    The conversion can stop for four reasons:

    -  An invalid multibyte sequence is encountered in the input.  In
       this case, it sets errno to EILSEQ and returns (size_t) -1.
       *inbuf is left pointing to the beginning of the invalid multibyte
       sequence.

    [...]

The phrase "An invalid multibyte sequence is encountered in the input"
is confusing, because it suggests that it refers only to the validity of
the input per se (e.g. a non-UTF-8 sequence in input purporting to be
UTF-8).

However, according to the original author of the manual page, Bruno
Haible[1], it also refers to input that cannot be translated to the
desired output encoding; and indeed, glibc's iconv returns EILSEQ when
the input cannot be translated, even though it is valid.
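
The behavior described above can be observed with a small test
program.  The following is a minimal sketch, not taken from the bug
reports: the helper try_convert() and struct conv_result are
hypothetical names introduced here for illustration, and the charset
names "UTF-8" and "ASCII" are assumed to be available on the system.
It performs one iconv() call and records the return value, errno, and
how far *inbuf advanced.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type for this sketch; not part of any library. */
struct conv_result {
    size_t rc;        /* return value of iconv(), (size_t)-1 on error */
    int    err;       /* errno captured after a failed call, else 0 */
    size_t consumed;  /* input bytes consumed before iconv() stopped */
};

/* Convert srclen bytes of src from encoding `from` to encoding `to`
 * with a single iconv() call and report how the call ended.
 * Assumes srclen <= 64 so that the fixed-size buffers suffice. */
static struct conv_result
try_convert(const char *from, const char *to, const char *src, size_t srclen)
{
    struct conv_result r = { (size_t)-1, 0, 0 };
    char in_copy[64];
    char out[512];

    assert(src != NULL && srclen <= sizeof(in_copy));
    memcpy(in_copy, src, srclen);   /* iconv() takes a non-const char ** */

    iconv_t cd = iconv_open(to, from);   /* note: tocode comes first */
    if (cd == (iconv_t)-1) {
        r.err = errno;
        return r;
    }

    char *in = in_copy;
    size_t inleft = srclen;
    char *outp = out;
    size_t outleft = sizeof(out);

    errno = 0;
    r.rc = iconv(cd, &in, &inleft, &outp, &outleft);
    r.err = (r.rc == (size_t)-1) ? errno : 0;
    r.consumed = (size_t)(in - in_copy);   /* where *inbuf was left */

    iconv_close(cd);
    return r;
}
```

With glibc, calling try_convert("UTF-8", "ASCII", "a\xc3\xa9", 3)
(valid UTF-8 for "a" followed by U+00E9, which ASCII cannot
represent) is expected to fail with err equal to EILSEQ and consumed
equal to 1, that is, with *inbuf left at the start of the
untranslatable sequence.  Other implementations may instead
substitute a replacement character and report success.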

This patch adds language that reflects the actual behavior, by adding an
explicit bullet that distinguishes this case.

Link: [1] <https://sourceware.org/bugzilla/show_bug.cgi?id=29913#c4>
Link: <https://bugzilla.kernel.org/show_bug.cgi?id=217059>
Reported-by: Reuben Thomas <rrt@xxxxxxxx>
Cc: Steffen Nurpmeso <steffen@xxxxxxxxxx>
Cc: Bruno Haible <bruno@xxxxxxxxx>
Cc: Martin Sebor <msebor@xxxxxxxxxx>
Signed-off-by: Alejandro Colomar <alx@xxxxxxxxxx>

---
 man3/iconv.3 | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/man3/iconv.3 b/man3/iconv.3
index 66f59b8c3..6bb27c802 100644
--- a/man3/iconv.3
+++ b/man3/iconv.3
@@ -80,6 +80,14 @@ .SH DESCRIPTION
 \fI*inbuf\fP
 is left pointing to the beginning of the invalid multibyte sequence.
 .IP \[bu]
+A multibyte sequence is encountered in the input which
+cannot be translated to the character encoding of the output.
+In this case,
+it sets \fIerrno\fP to \fBEILSEQ\fP and returns
+.IR (size_t)\ \-1 .
+\fI*inbuf\fP
+is left pointing to the beginning of the untranslatable multibyte sequence.
+.IP \[bu]
 The input byte sequence has been entirely converted,
 that is, \fI*inbytesleft\fP has gone down to 0.
 In this case,
-- 
2.40.1



