On 2019-10-03, Jonathan Corbet <corbet@xxxxxxx> wrote: > [Expanding CC a bit; this is the sort of change I'm reluctant to take > without being sure it reflects what the community thinks.] > > On Wed, 2 Oct 2019 17:14:37 +0200 > Christian Brauner <christian.brauner@xxxxxxxxxx> wrote: > > > Add additional information on how to ensure that syscalls with structure > > arguments are extensible and add a section about naming conventions to > > follow when adding revised versions of already existing syscalls. > > > > Co-Developed-by: Aleksa Sarai <cyphar@xxxxxxxxxx> > > Signed-off-by: Aleksa Sarai <cyphar@xxxxxxxxxx> > > Signed-off-by: Christian Brauner <christian.brauner@xxxxxxxxxx> > > --- > > Documentation/process/adding-syscalls.rst | 82 +++++++++++++++++++---- > > 1 file changed, 70 insertions(+), 12 deletions(-) > > > > diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst > > index 1c3a840d06b9..93e0221fbb9a 100644 > > --- a/Documentation/process/adding-syscalls.rst > > +++ b/Documentation/process/adding-syscalls.rst > > @@ -79,7 +79,7 @@ flags, and reject the system call (with ``EINVAL``) if it does:: > > For more sophisticated system calls that involve a larger number of arguments, > > it's preferred to encapsulate the majority of the arguments into a structure > > that is passed in by pointer. Such a structure can cope with future extension > > -by including a size argument in the structure:: > > +by either including a size argument in the structure:: > > > > struct xyzzy_params { > > u32 size; /* userspace sets p->size = sizeof(struct xyzzy_params) */ > > @@ -87,20 +87,56 @@ by including a size argument in the structure:: > > u64 param_2; > > u64 param_3; > > }; > > + int sys_xyzzy(struct xyzzy_params __user *uarg); > > + /* in case of -E2BIG, p->size is set to the in-kernel size and thus all > > + extensions after that offset are unsupported. */ > > That comment kind of threw me for a loop - this is the first mention of > E2BIG and readers may not just know what's being talked about. Especially > since the comment suggests *not* actually returning an error. I probably could've worded this better -- this comment describes what userspace sees when they use the API (sched_setattr(2) is an example of this style of API). In the case where the kernel doesn't support a requested extension (usize > ksize, and there are non-zero bytes past ksize) then the kernel returns -E2BIG *but also* sets p->size to ksize so that userspace knows what extensions the kernel supports. Maybe I should've replicated more of the details from the kernel-doc for copy_struct_from_user(). > > -As long as any subsequently added field, say ``param_4``, is designed so that a > > -zero value gives the previous behaviour, then this allows both directions of > > -version mismatch: > > +or by including a separate argument that specifies the size:: > > > > - - To cope with a later userspace program calling an older kernel, the kernel > > - code should check that any memory beyond the size of the structure that it > > - expects is zero (effectively checking that ``param_4 == 0``). > > - - To cope with an older userspace program calling a newer kernel, the kernel > > - code can zero-extend a smaller instance of the structure (effectively > > - setting ``param_4 = 0``). > > + struct xyzzy_params { > > + u32 param_1; > > + u64 param_2; > > + u64 param_3; > > + }; > > + /* userspace sets @usize = sizeof(struct xyzzy_params) */ > > + int sys_xyzzy(struct xyzzy_params __user *uarg, size_t usize); > > + /* in case of -E2BIG, userspace has to attempt smaller @usize values > > + to figure out which extensions are unsupported. */ > > Here too. But what I'm really wondering about now is: you're describing > different behavior for what are essentially two cases of the same thing. > Why should the kernel simply accept the smaller size if the size is > embedded in the structure itself, but return an error and force user space > to retry if it's a separate argument? > > I guess maybe because in the latter case the kernel can't easily return > the size it's actually using? I think that should be explicit if so. As above, the -E2BIG only happens if userspace is trying to use an extension that the kernel doesn't support (usize > ksize, non-zero bytes after ksize). The main difference between the two API styles is whether or not userspace gets told what ksize is explicitly in the case of an -E2BIG. Maybe it would be less confusing to only mention one of ways of doing it, but then we have to pick one (and while the newer syscalls [clone3 and openat2] use a separate argument, there are more syscalls which embed it in the struct). -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>
Attachment:
signature.asc
Description: PGP signature