Re: [PATCH] iconv.3: Clarify the behavior when input is untranslatable

Hi Alejandro,

> Please use semantic newlines.  See man-pages(7):

Thanks for explaining. I wondered whether I should use one space or two spaces
after the end of a sentence, but found no precedent for either style. This
explains it :)

> > +In the GNU C library and GNU libiconv, if
> > +.I cd
> > +was created without the suffix
> > +.B //TRANSLIT
> > +or
> > +.BR //IGNORE ,
> > +the conversion is strict: lossy conversions produce this condition.
> > +If the suffix
> > +.B //TRANSLIT
> > +was specified, transliteration can avoid this condition in some cases.
> 
> What do you mean by "can" and "some cases"?

GNU libc and GNU libiconv support transliteration, for example, of "½" to "1/2",
or of "å" to "aa" in a Danish locale. Here I want to hint at the
transliteration facility without going into too much detail.
"transliteration can avoid this condition if there is a transliteration rule
for the multibyte character and it fits the character encoding of the output"
is too detailed, IMO.
Do you have a better wording than "can ... in some cases"?
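
To make the "can ... in some cases" behavior concrete, here is a minimal C
sketch. The helper name convert_from_utf8() is hypothetical and not part of
any library; only iconv_open(), iconv(), and iconv_close() are real calls.
What it returns for an untranslatable character depends on the implementation,
as discussed above, so no particular output is promised here.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): convert the UTF-8
   string `in` to the encoding named by `tocode`, writing a NUL-terminated
   result into `out`.  Returns 0 on success, otherwise the errno value
   reported by iconv_open() or iconv().  */
int
convert_from_utf8(const char *tocode, const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char *inptr = (char *) in;  /* iconv() takes char **; input bytes are not modified */
    size_t inleft = strlen(in);
    char *outptr = out;
    size_t outleft;
    int err = 0;

    if (outsize == 0)
        return E2BIG;
    outleft = outsize - 1;      /* reserve room for the terminating NUL */

    cd = iconv_open(tocode, "UTF-8");
    if (cd == (iconv_t) -1)
        return errno;

    if (iconv(cd, &inptr, &inleft, &outptr, &outleft) == (size_t) -1)
        err = errno;            /* e.g. EILSEQ for an untranslatable character */

    *outptr = '\0';             /* terminate whatever was converted so far */
    iconv_close(cd);
    return err;
}
```

Calling it with tocode "ASCII" and the input "\xc2\xbd" (the UTF-8 encoding of
"½") is expected to fail with EILSEQ on GNU libc and GNU libiconv, whereas
"ASCII//TRANSLIT" may succeed there if a transliteration rule applies. Other
implementations may substitute a fallback character instead of failing.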

> I recommend either using \[aq]*\[aq] for producing valid C code,
> or just having an unquoted *.

I made the requested style changes.

New patch is attached.

From caa04c49e89e64d7e8b52ab878c6dc2cd0cef5b9 Mon Sep 17 00:00:00 2001
From: Bruno Haible <bruno@xxxxxxxxx>
Date: Sun, 21 May 2023 13:05:29 +0200
Subject: [PATCH] List a fifth condition when iconv(3) may stop.

Link: https://sourceware.org/bugzilla/show_bug.cgi?id=29913#c4
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217059
Reported-by: Steffen Nurpmeso <steffen@xxxxxxxxxx>
Reported-by: Reuben Thomas <rrt@xxxxxxxx>
Signed-off-by: Bruno Haible <bruno@xxxxxxxxx>
---
 man3/iconv.3 | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/man3/iconv.3 b/man3/iconv.3
index 66f59b8c3..94441f602 100644
--- a/man3/iconv.3
+++ b/man3/iconv.3
@@ -71,7 +71,7 @@ If the character encoding of the input is stateful, the
 function can also convert a sequence of input bytes
 to an update to the conversion state without producing any output bytes;
 such input is called a \fIshift sequence\fP.
-The conversion can stop for four reasons:
+The conversion can stop for five reasons:
 .IP \[bu] 3
 An invalid multibyte sequence is encountered in the input.
 In this case,
@@ -80,6 +80,39 @@ it sets \fIerrno\fP to \fBEILSEQ\fP and returns
 \fI*inbuf\fP
 is left pointing to the beginning of the invalid multibyte sequence.
 .IP \[bu]
+A multibyte sequence is encountered that is valid but that cannot be
+translated to the character encoding of the output.
+This condition depends on the implementation and on the conversion
+descriptor.
+In the GNU C library and GNU libiconv, if
+.I cd
+was created without the suffix
+.B //TRANSLIT
+or
+.BR //IGNORE ,
+the conversion is strict: lossy conversions produce this condition.
+If the suffix
+.B //TRANSLIT
+was specified, transliteration can avoid this condition in some cases.
+In the musl C library, this condition cannot occur because a conversion to
+.B \[aq]*\[aq]
+is used as a fallback.
+In the FreeBSD, NetBSD, and Solaris implementations of
+.BR iconv (),
+this condition cannot occur either, because a conversion to
+.B \[aq]?\[aq]
+is used as a fallback.
+When this condition is met,
+.BR iconv ()
+sets
+.I errno
+to
+.B EILSEQ
+and returns
+.IR (size_t)\ \-1 .
+.I *inbuf
+is left pointing to the beginning of the unconvertible multibyte sequence.
+.IP \[bu]
 The input byte sequence has been entirely converted,
 that is, \fI*inbytesleft\fP has gone down to 0.
 In this case,
-- 
2.34.1
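
For illustration only, the following C sketch shows how a caller might use
the position of *inbuf after an EILSEQ failure, as the patched text describes.
The function name eilseq_offset() and its return convention are hypothetical.
Note that errno alone does not tell an invalid input sequence apart from a
valid but unconvertible one, since both are reported as EILSEQ.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): returns the byte
   offset in `in` at which iconv() stopped with EILSEQ, -1 if the whole
   string was converted, or -2 on any other error.  */
long
eilseq_offset(const char *tocode, const char *fromcode, const char *in)
{
    char scratch[256];          /* converted output is discarded */
    char *inptr = (char *) in;  /* iconv() takes char **; input bytes are not modified */
    size_t inleft = strlen(in);
    long result = -1;
    iconv_t cd;

    cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t) -1)
        return -2;

    while (inleft > 0) {
        char *outptr = scratch;
        size_t outleft = sizeof scratch;

        if (iconv(cd, &inptr, &inleft, &outptr, &outleft) != (size_t) -1)
            break;              /* all remaining input was converted */
        if (errno == E2BIG)
            continue;           /* scratch buffer full: reuse it and go on */
        /* On EILSEQ, inptr points to the start of the offending sequence. */
        result = (errno == EILSEQ) ? (long) (inptr - in) : -2;
        break;
    }

    iconv_close(cd);
    return result;
}
```

With tocode "ASCII", fromcode "UTF-8", and the input "a\xc2\xbd", an
implementation with strict conversion should report offset 1, while an
implementation that substitutes a fallback character should report -1.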

