[patch] iconv.3: clarify behavior when input is untranslatable

I attach a patch for
https://bugzilla.kernel.org/show_bug.cgi?id=217059 as requested by
Alejandro.

-- 
https://rrt.sc3d.org
From 72b623ee2c32da96a2972a9dce43a554f494c5b8 Mon Sep 17 00:00:00 2001
From: Reuben Thomas <rrt@xxxxxxxx>
Date: Sat, 20 May 2023 12:10:11 +0100
Subject: [PATCH] iconv.3: clarify the behavior when input is untranslatable

See https://bugzilla.kernel.org/show_bug.cgi?id=217059

The man page does not fully reflect the behavior of glibc's iconv. The man
page says:

  The conversion can stop for four reasons:

     1. An invalid multibyte sequence is encountered in the input. In this
     case, it sets errno to EILSEQ and returns (size_t) -1. *inbuf is left
     pointing to the beginning of the invalid multibyte sequence.

The phrase "An invalid multibyte sequence is encountered in the input" is
confusing, because it suggests that it refers only to the validity of the
input per se (e.g. a non-UTF-8 sequence in input purporting to be UTF-8).

However, according to the original author of the man page, Bruno Haible[1],
it also refers to input that cannot be translated to the desired output
encoding; and indeed, glibc's iconv returns EILSEQ when the input cannot be
translated, even though it is valid.

This patch adds language that reflects the actual behavior.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=29913#c4

Signed-off-by: Reuben Thomas <rrt@xxxxxxxx>
Suggested-by: Alejandro Colomar <alx@xxxxxxxxxx>
---
 man3/iconv.3 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/man3/iconv.3 b/man3/iconv.3
index 66f59b8c3..e8694ca12 100644
--- a/man3/iconv.3
+++ b/man3/iconv.3
@@ -73,7 +73,8 @@ to an update to the conversion state without producing any output bytes;
 such input is called a \fIshift sequence\fP.
 The conversion can stop for four reasons:
 .IP \[bu] 3
-An invalid multibyte sequence is encountered in the input.
+A multibyte sequence is encountered in the input which is either invalid,
+or cannot be translated to the character encoding of the output.
 In this case,
 it sets \fIerrno\fP to \fBEILSEQ\fP and returns
 .IR (size_t)\ \-1 .
-- 
2.34.1

