On 6/5/20 14:30, Kees Cook wrote: > On Fri, Jun 05, 2020 at 11:21:42AM -0500, Gustavo A. R. Silva wrote: >> Add zero-length and one-element arrays to the list. >> >> While I continue replacing zero-length and one-element arrays with >> flexible-array members, I need a reference to point people to, so >> they don't introduce more instances of such arrays. And while here, >> add a note to the "open-coded arithmetic in allocator arguments" >> section, on the use of struct_size() and the arrays-to-deprecate >> mentioned here. >> >> Signed-off-by: Gustavo A. R. Silva <gustavoars@xxxxxxxxxx> >> --- >> Changes in v2: >> - Adjust some markup links for readability. >> >> Documentation/process/deprecated.rst | 83 ++++++++++++++++++++++++++++ >> 1 file changed, 83 insertions(+) >> >> diff --git a/Documentation/process/deprecated.rst b/Documentation/process/deprecated.rst >> index 652e2aa02a66c..042c21c968e19 100644 >> --- a/Documentation/process/deprecated.rst >> +++ b/Documentation/process/deprecated.rst >> @@ -85,6 +85,11 @@ Instead, use the helper:: >> >> header = kzalloc(struct_size(header, item, count), GFP_KERNEL); >> >> +NOTE: If you are using struct_size() on a structure containing a zero-length > > Please use: > > .. note:: > OK. > for this kind of "NOTE:" > >> +or a one-element array as a trailing array member, stop using such arrays > > And I think it was likely my language suggestion to say "stop using", but > probably this should be more friendly. How about "please refactor such > arrays ..." > >> +and switch to `flexible arrays <#zero-length-and-one-element-arrays>`_ > > ... to a `flexible array member <#... > >> +instead. >> + > >> See array_size(), array3_size(), and struct_size(), >> for more details as well as the related check_add_overflow() and >> check_mul_overflow() family of functions. >> @@ -200,3 +205,81 @@ All switch/case blocks must end in one of: >> * continue; >> * goto <label>; >> * return [expression]; >> + >> +Zero-length and one-element arrays >> +---------------------------------- >> +Old code in the kernel uses the zero-length and one-element array extensions >> +to the C90 standard, but the `preferred mechanism to declare variable-length > > I'd like to reword this to make an immediate statement about what _should_ > be done, and then move into the details on an as accurate as possible > review of the history of these work-arounds. How about this, which I > mixed some of your earlier paragraphs into: > > > > There is a regular need in the kernel to provide a way to declare having > a dynamically sized set of trailing elements in a structure. Kernel code > should always use `"flexible array members" <https://en.wikipedia.org/wiki/Flexible_array_member>`_ > for these cases. The older style of one-element or zero-length arrays should > no longer be used. > > In older C code, dynamically sized trailing elements were done by specifying > a one-element array at the end of a structure:: > > struct something { > int count; > struct items[1]; > }; > > This led to fragile size calculations via sizeof() (which would need to > remove the size of the single trailing element to get a correct size of > the "header"). A `GNU C extension <https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html>`_ > was introduced to allow for zero-length arrays, to avoid these kinds of > size problems:: > > struct something { > int count; > struct items[0]; > }; > > But this led to other problems, and didn't solve some problems shared by > both styles, like not being to able to detect when such an array is > accidentally being used _not_ at the end of a structure (which could happen > directly, or when such a struct was in unions, structs of structs, etc). > > C99 introduced "flexible array members", which lacks a numeric size for > the array declaration entirely:: > > struct something { > int count; > struct items[]; > }; > > This is the way the kernel expects dynamically sized trailing elements > to be declared. It allows the compiler to generate errors when the > flexible array does not occur last in the structure, which helps to prevent > some kind of `undefined behavior > <https://git.kernel.org/linus/76497732932f15e7323dc805e8ea8dc11bb587cf>`_ > bugs from being inadvertently introduced to the codebase. It also allows > the compiler to correctly analyze array sizes (via sizeof(), > `CONFIG_FORTIFY_SOURCE`, and `CONFIG_UBSAN_BOUNDS`). For instance, > there is no mechanism that warns us that the following application of the > sizeof() operator to a zero-length array always results in zero:: > > struct something { > int count; > struct items[0]; > }; > > struct something *instance; > > instance = kmalloc(struct_size(instance, items, size), GFP_KERNEL); > instance->length = size; > memcpy(instance->items, source, size); > ... > size = sizeof(instance->items); > > > [and then continue with the rest of the details you wrote below...] > >> + >> +At the last line of code above, ``size`` turns out to be zero --when one might have >> +thought differently. Here are a couple examples of this issue: `link 1 >> +<https://git.kernel.org/linus/f2cd32a443da694ac4e28fbf4ac6f9d5cc63a539>`_, >> +`link 2 >> +<https://git.kernel.org/linus/ab91c2a89f86be2898cee208d492816ec238b2cf>`_. >> +On the other hand, `flexible array members have incomplete type, and so the sizeof() >> +operator may not be applied <https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html>`_, >> +so any misuse of such operator will be immediately noticed at build time. >> + >> +With respect to one-element arrays, one has to be acutely aware that `such arrays >> +occupy at least as much space as a single object of the type >> +<https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html>`_, >> +hence they contribute to the size of the enclosing structure. This is prone >> +to error every time people want to calculate the total size of dynamic memory >> +to allocate for a structure containing an array of this kind as a member:: >> + >> + struct something { >> + int length; >> + char data[1]; >> + }; > > And for all of these examples, I'd like to avoid using "char" as the type > for the flex array member, since it doesn't drive home the idea of > having a multiplier (i.e. "length" matches the size of "data") And > similarly, "length" should, I think, be called "count". That way the > bytes is separate from count, but is the result of sizeof(*items) * > count. > >> + >> + struct something *instance; >> + >> + instance = kmalloc(struct_size(instance, data, size - 1), GFP_KERNEL); >> + instance->length = size; >> + memcpy(instance->data, source, size); >> + >> +In the example above, we had to remember to calculate ``size - 1`` when using > > I don't want to get people confused between "size" (in bytes) vs "count" > (of trailing elements). > Yep. I'll change that to avoid confusion. >> +the struct_size() helper, otherwise we would have --unintentionally-- allocated >> +memory for one too many ``data`` objects. The cleanest and least error-prone way >> +to implement this is through the use of a `flexible array member`, which is >> +exemplified at the `beginning <#zero-length-and-one-element-arrays>`_ of this >> +section. >> -- >> 2.27.0 >> > > How does that look? > Got it. Working on the changes right now... Thanks for the feedback! -- Gustavo