On Wed, Aug 7, 2024 at 1:28 AM Alejandro Colomar <alx@xxxxxxxxxx> wrote: > > Hi Linus, > > Serge let me know about this thread earlier today. > > On 2024-08-05, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Mon, 5 Aug 2024 at 20:01, Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > > > > > > One concern about removing the BUILD_BUG_ON() is that if we extend > > > TASK_COMM_LEN to a larger size, such as 24, the caller with a > > > hardcoded 16-byte buffer may overflow. > > > > No, not at all. Because get_task_comm() - and the replacements - would > > never use TASK_COMM_LEN. > > > > They'd use the size of the *destination*. That's what the code already does: > > > > #define get_task_comm(buf, tsk) ({ \ > > ... > > __get_task_comm(buf, sizeof(buf), tsk); \ > > > > note how it uses "sizeof(buf)". > > In shadow.git, we also implemented macros that are named after functions > and calculate the appropriate number of elements internally. > > $ grepc -h STRNCAT . > #define STRNCAT(dst, src) strncat(dst, src, NITEMS(src)) > $ grepc -h STRNCPY . > #define STRNCPY(dst, src) strncpy(dst, src, NITEMS(dst)) > $ grepc -h STRTCPY . > #define STRTCPY(dst, src) strtcpy(dst, src, NITEMS(dst)) > $ grepc -h STRFTIME . > #define STRFTIME(dst, fmt, tm) strftime(dst, NITEMS(dst), fmt, tm) > $ grepc -h DAY_TO_STR . > #define DAY_TO_STR(str, day, iso) day_to_str(NITEMS(str), str, day, iso) > > They're quite useful, and when implementing them we found and fixed > several bugs thanks to them. > > > Now, it might be a good idea to also verify that 'buf' is an actual > > array, and that this code doesn't do some silly "sizeof(ptr)" thing. > > I decided to use NITEMS() instead of sizeof() for that reason. > (NITEMS() is just our name for ARRAY_SIZE().) > > $ grepc -h NITEMS . > #define NITEMS(a) (SIZEOF_ARRAY((a)) / sizeof((a)[0])) > > > We do have a helper for that, so we could do something like > > > > #define get_task_comm(buf, tsk) \ > > strscpy_pad(buf, __must_be_array(buf)+sizeof(buf), (tsk)->comm) > > We have SIZEOF_ARRAY() for when you want the size of an array: > > $ grepc -h SIZEOF_ARRAY . > #define SIZEOF_ARRAY(a) (sizeof(a) + must_be_array(a)) There is already a similar macro in Linux: /** * ARRAY_SIZE - get the number of elements in array @arr * @arr: array to be sized */ #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) will use it instead of the sizeof(). > > However, I don't think you want sizeof(). Let me explain why: > > - Let's say you want to call wcsncpy(3) (I know nobody should be using > that function, not strncpy(3), but I'm using it as a standard example > of a wide-character string function). > > You should call wcsncpy(dst, src, NITEMS(dst)). > A call wcsncpy(dst, src, sizeof(dst)) is bogus, since the argument is > the number of wide characters, not the number of bytes. > > When translating that to normal characters, you want conceptually the > same operation, but on (normal) characters. That is, you want > strncpy(dst, src, NITEMS(dst)). While strncpy(3) with sizeof() works > just fine because sizeof(char)==1 by definition, it is conceptually > wrong to use it. > > By using NITEMS() (i.e., ARRAY_SIZE()), you get the __must_be_array() > check for free. > > In the end, SIZEOF_ARRAY() is something we very rarely use. It's there > only used in the following two cases at the moment: > > #define NITEMS(a) (SIZEOF_ARRAY((a)) / sizeof((a)[0])) > #define MEMZERO(arr) memzero(arr, SIZEOF_ARRAY(arr)) > > Does that sound convincing? > > For memcpy(3) for example, you do want sizeof(), because you're copying > raw bytes, but with strings, in which characters are conceptually > meaningful elements, NITEMS() makes more sense. > > BTW, I'm working on a __lengthof__ operator that will soon allow using > it on function parameters declared with array notation. That is, > > size_t > f(size_t n, int a[n]) > { > return __lengthof__(a); // This will return n. > } > > If you're interested in it, I'm developing and discussing it here: > <https://inbox.sourceware.org/gcc-patches/20240806122218.3827577-1-alx@xxxxxxxxxx/> > > > > > as a helper macro for this all. > > > > (Although I'm not convinced we generally want the "_pad()" version, > > but whatever). > > We had problems with it in shadow recently. In user-space, it's similar > to strncpy(3) (at least if you wrap it in a macro that makes sure that > it terminates the string with a null byte). > > We had a lot of uses of strncpy(3), from old times where that was used > to copy strings with truncation. I audited all of that code (and > haven't really finished yet), and translated to calls similar to > strscpy(9) (we call it strtcpy(), as it _t_runcates). The problem was > that in some cases the padding was necessary, and in others it was not, > and it was very hard to distinguish those. > > I recommend not zeroing strings unnecessarily, since that will make it > hard to review the code later. E.g., twenty years from now, someone > takes a piece of code with a _pad() call, and has no clue if the zeroing > was for a reason, or for no reason. > > On the other hand, not zeroing may make it easier to explot bugs, so > whatever you think best. In the kernel you may need to be more worried > than in user space. Whatever. :) Good point. I will avoid using the _pad(). -- Regards Yafang