On Wed, Nov 08, 2023 at 08:33:34PM +0100, Alejandro Colomar wrote: > Hi Matthew, > > On Tue, Nov 07, 2023 at 09:12:37PM -0500, Matthew House wrote: > > On Tue, Nov 7, 2023 at 8:21 AM Alejandro Colomar <alx@xxxxxxxxxx> wrote: > > > On Tue, Nov 07, 2023 at 11:52:44AM +0000, Jonny Grant wrote: > > > > We see things differently, I'm on the C standard side on this one. Would any information change your mind? > > > > > > It's difficult to say, but I doubt it. But let me ask you something: > > > In what cases would you find strncpy(3) appropriate to use, and why? > > > Maybe if I understand that it helps. > > > > > > Kind regards, > > > Alex > > > > Man pages aren't read only by people writing new code, but also by people > > reading and modifying existing code. And despite your preferences regarding > > which functions ought to be used to produce strings, it's a widespread (and > > correct) practice to produce a string from the character sequence created > > by strncpy(3). There are two ways of doing this, either by setting the last > > character of the destination buffer to null if you want to produce a > > truncated string, or by testing the last character against zero if you want > > to detect truncation and raise an error. > > It is not strncpy(3) who truncated, but the programmer by adding a NULL Oops. s/NULL/NUL/ > in buff[BUFSIZ - 1]. In the following snippet, strncpy(3) will not > truncate: > > char cs[3]; > > strncpy(cs, "foo", 3); > > And yet your code doing if (cs[2] != '\0') { goto error; } would think > it did. That's because you deformed strncpy(3) to implement a poor > man's strlcpy(3). > > > char cs[3]; > > strncpy(cs, "foo", 3); > cs[2] = '\0'; // The truncation is here, not in strncpy(3). > > > I'm not aware of any alternative to a strncpy(3)-based snippet for > > producing a possibly-truncated copy of a string, except for your preferred > > strlcpy(3) or stpecpy(3), which aren't available to anyone without a > > The Linux kernel has strscpy(3), which is also good, but is not > available to user space. > > > brand-new glibc (nor, by extension, any applications or libraries that want > > libbsd has provided strlcpy(3) since basically forever. It is a very > portable library. You don't need a brand-new glibc for having > strlcpy(3). > > <https://libbsd.freedesktop.org/wiki/> > > > to support people without a brand-new glibc, nor any libraries that want to > > support other platforms like Windows with only ISO C and POSIX-ish > > If you program for Windows, it depends. If you have POSIX available, > you may be able to port libbsd; I don't know. In any case, I don't > case about Windows enough. You could always write your own string- > copying function for Windows. > > > functions); snprintf(3), which has the insidious flaw of not supporting > > more than INT_MAX characters on pain of UB, and also produces a warning if > > the compiler notices the possible truncation; or strlen(3) + min() + > > memcpy(3) + manually adding a null terminator, which is certainly more > > explicit in its intent, and avoids strncpy(3)'s zero-filling behavior if > > that poses a performance problem, but similarly opens up room for > > off-by-one errors. > > More than the performance problem, I'm more worried about the > maintainability of strncpy(3). When 20 years from now, a programmer > reading a piece of code full of strncpy(3) wants to migrate to a sane > function like strlcpy(3) or strcpy(3), the programmer needs to > understand if the zeroing was purposeful or just accidental. Because > by using strlcpy(3), it may start leaking some trailing data if the > trailing of the buffer is meaningful to some program. > > > > > For the sake of reference, I looked into a few big C and C++ projects to > > see how often a strncpy(3)-based snippet was used to produce a truncated > > copy. I found 18 instances in glibc 2.38, 2 in util-linux 2.39.2 (in spite > > of its custom xstrncpy() function), 61 in GNU binutils 2.41, 43 in > > GDB 13.2, 1 in LLVM 17.0.4, 7 in CPython 3.12.0, 99 in OpenJDK 22+22, > > 10 in .NET Runtime 7.0.13, 3 in V8 12.1.82, and 86 in Firefox 120.0. (Note > > that I haven't filtered out vendored dependencies, so there's a little bit > > of double-counting.) It seems like most codebases that don't ban strncpy(3) > > use a derived snippet somewhere or another. Also, I found 3 instances in > > glibc 2.38 and 5 instances in Firefox 120.0 of detecting truncation by > > checking the last character. > > I know. I've been rewriting the code handling strings in shadow-utils > for the last year, and ther was a lot of it. I fixed several small bugs > in the process, so I recommend avoiding it. > > > > > So these two snippets really are widespread, especially among the long tail > > of smaller C and C++ applications and libraries that don't perform enough > > string manipulation that it warrants creating a custom set of more- > > foolproof wrapper functions (at least, in the opinion of their authors). > > > > > Thus, since they're not going away, it would be useful for anyone reading > > the code to understand the concept behind how these two snippets work, that > > the only difference between the strncpy(3)'s special "character sequence" > > and an ordinary C string is an additional null terminator at the end of the > > destination buffer. > > This is part of string_copying(7): > > DESCRIPTION > Terms (and abbreviations) > string (str) > is a sequence of zero or more non‐null characters followed by a > null byte. > > character sequence > is a sequence of zero or more non‐null characters. A program > should never use a character sequence where a string is required. > However, with appropriate care, a string can be used in the place > of a character sequence. > > I think that is very explicit in the difference. strncpy(3) refers to > that page for understanding the differences, so I think it is > documented. > > strncpy(3): > CAVEATS > The name of these functions is confusing. These functions produce a > null‐padded character sequence, not a string (see string_copying(7)). > > > > > In other words, strncpy(3) doesn't create a truncated string, but it > > creates something which can be easily turned into to a truncated string, > > and that's its most relevant quality for most of its uses in existing code. > > Further, apart from snprintf(3), there's no other portable way to produce a > > truncated string without manual arithmetic. Thus, I'd also find it > > Portable is relative. With libbsd, you can port to most POSIX systems. > Windows is another story. > > > reasonable to highlight precisely why strncpy(3)'s output isn't a string > > How about this?: > > diff --git a/man3/stpncpy.3 b/man3/stpncpy.3 > index d4c2ce83d..c80c8b640 100644 > --- a/man3/stpncpy.3 > +++ b/man3/stpncpy.3 > @@ -108,7 +108,10 @@ .SH HISTORY > .SH CAVEATS > The name of these functions is confusing. > These functions produce a null-padded character sequence, > -not a string (see > +not a string. > +While strings have a terminating NUL byte, > +character sequences do not have any terminating byte > +(see > .BR string_copying (7)). > .P > It's impossible to distinguish truncation by the result of the call, > > > > (viz., the lack of a null terminator), instead of trying to insist that its > > output is worlds apart from anything string-related, especially given the > > volume of existing correct code that belies that notion. > > It is not correct code. That code is doing extra work which confuses > maintainers. It is a lot like writing dead code, since you're writing > zeros that nobody is reading, which confuses maintainers. > > Also, I've seen a lot of off-by-one bugs in calls to strncpy(3), so no, > it's not correct code. It's rather dangerous code that just happens to > not be vulnerable most of the time. > > > > > Or, to answer your question, "It's appropriate to keep using strncpy(3) in > > existing code where it's currently used as part of creating a truncated > > string, and it's not especially inappropriate to use strncpy(3) in new code > > as part of creating a truncated string, if the code must support platforms > > without strlcpy(3) or similar, and if the resulting snippets are few enough > > and well-commented enough that they create less mental load than creating > > and maintaining a custom helper function." > > strncpy(3) calls are never well documented. Do you add a comment in > each such call saying "this zeroing is superfluous"? Probably not. > > > > > (As an aside, I find the remark in the man page that "It's impossible to > > distinguish truncation by the result of the call" extremely misleading at > > best, since truncation can easily be distinguished by inspecting the last > > output character.) > > Again, strncpy(3)'s truncation is impossible to detect. What you can > detect is that your construct that resembles strlcpy(3) truncates, which > is a different thing. > > Thanks, > Alex > > > > > Thank you, > > Matthew House > > -- > <https://www.alejandro-colomar.es/> -- <https://www.alejandro-colomar.es/>
Attachment:
signature.asc
Description: PGP signature