Re: [PATCH] iconv.3: Clarify the behavior when input is untranslatable

Hi Bruno,

On 5/21/23 13:11, Bruno Haible wrote:
> Alejandro Colomar wrote:
>> This patch adds language that reflects the actual behavior, by adding an
>> explicit bullet that distinguishes this case.
> 
> That is the right approach. Thanks for taking the initiative.
> 
> But I think that more details should be added, so that programmers are
> not surprised if their program behaves differently on, say, musl libc
> or FreeBSD than on glibc.
> 
> Find attached my take to describe the condition appropriately.

Thanks!

> 
> Bruno
> 

> @@ -80,6 +80,34 @@ .SH DESCRIPTION
>  \fI*inbuf\fP
>  is left pointing to the beginning of the invalid multibyte sequence.
>  .IP \[bu]
> +A multibyte sequence is encountered that is valid but that cannot be
> +translated to the character encoding of the output.  This condition

Please use semantic newlines.  See man-pages(7):
   Use semantic newlines
       In the source of a manual page, new sentences should be started
       on new lines, long sentences should be split into lines at
       clause breaks (commas, semicolons, colons, and so on), and long
       clauses should be split at phrase boundaries.  This convention,
       sometimes known as "semantic newlines", makes it easier to see
       the effect of patches, which often operate at the level of
       individual sentences, clauses, or phrases.


> +depends on the implementation and on the conversion descriptor.
> +In the GNU C library and GNU libiconv, if
> +.I cd
> +was created without the suffix
> +.B //TRANSLIT
> +or
> +.BR //IGNORE ,
> +the conversion is strict: lossy conversions produce this condition.
> +If the suffix
> +.B //TRANSLIT
> +was specified, transliteration can avoid this condition in some cases.

What do you mean by "can" and "some cases"?

> +In the musl C library, this condition cannot occur because a conversion to
> +.B '*'

I recommend either using \[aq]*\[aq], so that the rendered text is
valid C code, or just using an unquoted *.

> +is used as a fallback.
> +In the FreeBSD, NetBSD, and Solaris implementations of
> +.BR iconv ,

.BR iconv (),

> +this condition cannot occur either, because a conversion to
> +.B '?'

Similar stuff here.

> +is used as a fallback.
> +When this condition is met,
> +.B iconv

And here (use .BR iconv (), as above).

> +sets \fIerrno\fP to \fBEILSEQ\fP and returns

.I errno

.B EILSEQ

I know we use \f escapes in other places in the page, but I'll fix
those at some point.  Please use macros for new text.
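
In case it helps when checking the wording against real behavior,
here is a minimal sketch of a test helper for the condition the new
bullet describes.  try_convert() is a hypothetical name of mine, not
part of any library; it only wraps iconv_open(), iconv(), and
iconv_close(), and reports the errno value and how far *inbuf
advanced.  It assumes a NUL-terminated input and a single iconv() call.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch or from any library).
 * Converts the NUL-terminated string `in` from encoding `from` to
 * encoding `to`, writing a NUL-terminated result into `out`.
 * Returns 0 if iconv() succeeded, otherwise the errno value set by
 * iconv_open() or iconv().  *consumed receives the number of input
 * bytes that iconv() consumed before it stopped.
 */
static int
try_convert(const char *to, const char *from, const char *in,
            char *out, size_t outsize, size_t *consumed)
{
    iconv_t  cd;
    char     *inp, *outp;
    size_t   inleft, outleft;
    int      err;

    assert(outsize > 0);        /* need room for the terminating NUL */
    out[0] = '\0';
    *consumed = 0;

    cd = iconv_open(to, from);
    if (cd == (iconv_t) -1)
        return errno;

    inp = (char *) in;          /* iconv() does not modify the input bytes */
    inleft = strlen(in);
    outp = out;
    outleft = outsize - 1;
    err = 0;

    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t) -1)
        err = errno;            /* save before iconv_close() can change it */

    *outp = '\0';
    *consumed = (size_t) (inp - in);
    iconv_close(cd);
    return err;
}
```

Calling it with to = "ASCII", from = "UTF-8", and the input "a"
followed by a euro sign should, if the description in the patch is
right, return EILSEQ with *consumed == 1 on glibc, and return 0 with a
fallback character in the output on musl.  I haven't checked the other
implementations myself.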

Cheers,
Alex

> +.IR (size_t)\ \-1 .
> +\fI*inbuf\fP
> +is left pointing to the beginning of the invalid multibyte sequence.
> +.IP \[bu]
>  The input byte sequence has been entirely converted,
>  that is, \fI*inbytesleft\fP has gone down to 0.
>  In this case,


-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
