Re: strncpy clarify result may not be null terminated

Alejandro Colomar <alx@xxxxxxxxxx> · Sun, 12 Nov 2023 22:45:42 +0100

On Sun, Nov 12, 2023 at 10:00:06PM +0100, Alejandro Colomar wrote:
> On Sun, Nov 12, 2023 at 12:49:44PM -0800, Paul Eggert wrote:
> > [dropping libc-alpha since this is only about the man pages]
> > 
> > On 2023-11-12 02:59, Alejandro Colomar wrote:
> > 
> > > I think the man-pages should go
> > > ahead and write wrapper functions such as strtcpy() and stpecpy()
> > > around libc functions; these wrappers should provide a fast and safe
> > > starting point for most programs.
> > 
> > It's OK for man pages to give these in EXAMPLES sections. However, the man
> > pages currently go too far in this direction. Currently, if I type "man
> > stpecpy", I get a man page with a synopsis and it looks to me like glibc
> > supports stpecpy(3) just like it supports stpcpy(3). But glibc doesn't do
> > that, as stpecpy is merely a man-pages invention: although the source code
> > for stpecpy is in the EXAMPLES section of string_copying(7), you can't use
> > stpecpy in an app without copy-and-pasting the man page's source into your
> > code.
> > 
> > It's not just stpecpy. For example, there is no ustr2stp function in glibc,
> > but "man ustr2stp" acts as if there is one.
> 
> Yeah, I've thought of removing those links.  Will do it.
> 
> > 
> > The man pages should describe the library that exists, not the library that
> > some of us would rather have.
> > 
> > 
> > > It's true that memcpy(3) is the fastest function one can use, but it
> > > requires the programmer to be rather careful with the lengths of the
> > > strings.  I don't think keeping track of all those little details is
> > > what the common programmer should do.
> > 
> > Unfortunately, C is not designed for string use that's that convenient. If
> > you want safe and efficient use of possibly-long C strings, keeping track of
> > lengths is generally the best way to do it.
> > 
> > 
> > > > glibc/strlcpy.c __strlcpy() is there a reason when truncating it overwrites the last byte, twice?
> > > > 
> > > > memcpy (dest, src, size);
> > > > dest[size - 1] = '\0';
> > > 
> > > -1's in the source code make up for off-by-one bugs.
> > 
> > The "dest[size - 1] = '\0';" is there because strlcpy(dst, src, sz) is
> > defined to null-terminate the result if sz!=0, so that particular "-1" isn't
> > a bug. (Perhaps you meant that the strlcpy spec itself is buggy? It wasn't
> > clear to me.)
> 
> I didn't mean this code has a bug.  I meant that writing this code all
> the time is prone to bugs, because one may forget the -1 in some of the
> cases.

Ahh, I hadn't noticed that was part of the implementation of strlcpy(3).
I thought it was some pattern showing how to use memcpy(3) to copy
strings.  I was saying that such a pattern would be a bad thing to write
all the time.
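
For illustration, here is a minimal sketch of what that pattern looks
like when it sits inside a truncating copy function.  The name
copy_trunc() is hypothetical and this is not the glibc source; only the
two lines in the truncation branch come from the snippet quoted above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the glibc implementation.
   Copies src into dest (a buffer of 'size' bytes), truncating if
   needed, and terminates the result whenever size != 0.
   Returns strlen(src), the value strlcpy(3) is specified to return. */
size_t
copy_trunc(char *dest, const char *src, size_t size)
{
	size_t  len = strlen(src);

	if (len < size) {
		/* Everything fits, including the terminating null byte. */
		memcpy(dest, src, len + 1);
	} else if (size != 0) {
		/* Truncation: src has at least 'size' readable bytes here. */
		memcpy(dest, src, size);   /* writes dest[size - 1] once ...  */
		dest[size - 1] = '\0';     /* ... and this writes it again.   */
	}
	return len;
}
```

Written once inside a function, the "size - 1" is easy to review.
Repeated by hand at every call site, it is the kind of detail that
eventually gets forgotten.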

But yeah, inside strlcpy(3) it's fine, and I don't think strlcpy(3) is
bad in that regard.  The only problem I see in strlcpy(3) is the return
value.
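
To make the return-value point concrete, here is a rough sketch of a
wrapper that reports truncation as -1, so callers never have to write
the '>=' comparison themselves.  The name copy_or_fail() is made up for
this example, and the body uses strlen(3) and memcpy(3) instead of
strlcpy(3) only so that it builds on systems whose libc lacks
strlcpy(3).

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical wrapper: same copying behavior as a truncating copy,
   but truncation is reported as -1 instead of via the source length.
   On success, returns the length of the copied string. */
ssize_t
copy_or_fail(char *dst, const char *src, size_t size)
{
	size_t  len = strlen(src);

	if (len >= size) {          /* the only '>=' lives here */
		if (size != 0) {
			memcpy(dst, src, size - 1);
			dst[size - 1] = '\0';
		}
		return -1;
	}
	memcpy(dst, src, len + 1);  /* fits, including the null byte */
	return (ssize_t) len;
}
```

A caller would then check truncation with

	if (copy_or_fail(buf, src, sizeof(buf)) == -1)
		goto truncated;

which is the same shape as the error check for most libc functions.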

> 
> And yes, the strlcpy(3) spec is buggy in that it forces a pattern that
> is prone to off-by-one bugs: to check for truncation, one must use '>=',
> which one may mistype as '>' (or even '==').  It would have been much
> better to return -1 on truncation, to have a simple == -1 check as most
> libc functions.
> 
> Any function that requires writing hundreds of 'size - 1', or hundreds
> of '>=' should at least be wrapped.  If that use is the only intended
> use of the function (as is the case with snprintf(3) and strlcpy(3)),
> it's a bad API.
> 
> Cheers,
> Alex
> 
> > 
> > That "last byte, twice" question is: why is the last argument to memcpy
> > "size" and not "size - 1" which would be equally correct? The answer is
> > performance: memcpy often works faster when copying a number of bytes that
> > is a multiple of a smallish power of two, and "size" is more likely than
> > "size - 1" to be such a multiple.
> > 
> 
> -- 
> <https://www.alejandro-colomar.es/>

-- 
<https://www.alejandro-colomar.es/>