On Sat, Aug 15, 2020 at 2:24 PM Joe Perches <joe@xxxxxxxxxxx> wrote:
>
> On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
> > On Sat, Aug 15, 2020 at 9:34 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > > On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
> > > > LLVM implemented a recent "libcall optimization" that lowers calls to
> > > > `sprintf(dest, "%s", str)` where the return value is used to
> > > > `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> > > > in parsing format strings. Calling `sprintf` with overlapping arguments
> > > > was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
> > > >
> > > > `stpcpy` is just like `strcpy` except it returns the pointer to the new
> > > > tail of `dest`. This allows you to chain multiple calls to `stpcpy` in
> > > > one statement.
> > >
> > > O_O What?
> > >
> > > No; this is a _terrible_ API: there is no bounds checking, there are no
> > > buffer sizes. Anything using the example sprintf() pattern is _already_
> > > wrong and must be removed from the kernel. (Yes, I realize that the
> > > kernel is *filled* with this bad assumption that "I'll never write more
> > > than PAGE_SIZE bytes to this buffer", but that's both theoretically
> > > wrong ("640k is enough for anybody") and has been known to be wrong in
> > > practice too (e.g. when suddenly your writing routine is reachable by
> > > splice(2) and you may not have a PAGE_SIZE buffer).
> > >
> > > But we cannot _add_ another dangerous string API. We're already in a
> > > terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This
> > > needs to be addressed up by removing the unbounded sprintf() uses. (And
> > > to do so without introducing bugs related to using snprintf() when
> > > scnprintf() is expected[4].)
> >
> > Well, everything (-next, mainline, stable) is broken right now (with
> > ToT Clang) without providing this symbol. I'm not going to go clean
> > the entire kernel's use of sprintf to get our CI back to being green.
>
> Maybe this should get place in compiler-clang.h so it isn't
> generic and public.

https://bugs.llvm.org/show_bug.cgi?id=47162#c7
and
https://bugs.llvm.org/show_bug.cgi?id=47144
Seem to imply that Clang is not the only compiler that can lower a
sequence of libcalls to stpcpy. Do we want to wait until we have a
fire drill w/ GCC to move such an implementation from
include/linux/compiler-clang.h back in to lib/string.c?

>
> Something like:
>
> ---
>  include/linux/compiler-clang.h | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index cee0c728d39a..6279f1904e39 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -61,3 +61,30 @@
>  #if __has_feature(shadow_call_stack)
>  # define __noscs	__attribute__((__no_sanitize__("shadow-call-stack")))
>  #endif
> +
> +#ifndef __HAVE_ARCH_STPCPY
> +/**
> + * stpcpy - copy a string from src to dest returning a pointer to the new end
> + *          of dest, including src's NULL terminator. May overrun dest.
> + * @dest: pointer to buffer being copied into.
> + *        Must be large enough to receive copy.
> + * @src: pointer to the beginning of string being copied from.
> + *       Must not overlap dest.
> + *
> + * This function exists _only_ to support clang's possible conversion of
> + * sprintf calls to stpcpy.
> + *
> + * stpcpy differs from strcpy in two key ways:
> + * 1. inputs must not overlap.
> + * 2. return value is dest's NUL termination character after copy.
> + *    (for strcpy, the return value is a pointer to src)
> + */
> +
> +static inline char *stpcpy(char __restrict *dest, const char __restrict *src)
> +{
> +	while ((*dest++ = *src++) != '\0') {
> +		; /* nothing */
> +	}
> +	return --dest;
> +}
> +#endif
>
>

-- 
Thanks,
~Nick Desaulniers
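
For reference, here is a minimal userspace sketch of the `sprintf(dest, "%s", str)` to `stpcpy(dest, str) - dest` lowering and of the chaining described in this thread. It is an illustration only, not the kernel implementation or the patch as posted. The names `sketch_stpcpy`, `copy_len`, `join2`, and `sketch_example` are hypothetical. It assumes a C99 compiler and writes `restrict` after the `*`, since the qualifier applies to the pointer itself. Like the quoted code, it does no bounds checking, so the caller must guarantee that `dest` is large enough. Note also that `strcpy` returns its `dest` argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name, chosen to avoid clashing with libc's stpcpy(). */
static char *sketch_stpcpy(char *restrict dest, const char *restrict src)
{
	/* Copy bytes up to and including the NUL terminator. */
	while ((*dest++ = *src++) != '\0')
		; /* nothing */
	/* dest is now one past the NUL; step back so it points at it. */
	return dest - 1;
}

/* The value sprintf(dest, "%s", str) would return, via the lowered form. */
static size_t copy_len(char *dest, const char *str)
{
	return (size_t)(sketch_stpcpy(dest, str) - dest);
}

/* Chaining: the second copy starts where the first one ended. */
static char *join2(char *dest, const char *a, const char *b)
{
	return sketch_stpcpy(sketch_stpcpy(dest, a), b);
}

/* Small usage example; buf is sized by hand, nothing checks it. */
static void sketch_example(void)
{
	char buf[16];
	char *end = join2(buf, "foo", "bar");

	assert(strcmp(buf, "foobar") == 0);
	assert(end == buf + 6);
	assert(copy_len(buf, "abc") == 3);
}
```

The unchecked `buf[16]` in the example is the concern raised above about unbounded string APIs: nothing in these signatures carries the destination size.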