On Wed, Oct 02, 2024 at 03:36:33PM GMT, Alice Ryhl wrote: > On Wed, Oct 2, 2024 at 3:24 PM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > On Wed, Oct 02, 2024 at 12:48:12PM GMT, Arnd Bergmann wrote: > > > On Tue, Oct 1, 2024, at 08:22, Alice Ryhl wrote: > > > > +#[cfg(CONFIG_COMPAT)] > > > > +unsafe extern "C" fn fops_compat_ioctl<T: MiscDevice>( > > > > + file: *mut bindings::file, > > > > + cmd: c_uint, > > > > + arg: c_ulong, > > > > +) -> c_long { > > > > + // SAFETY: The compat ioctl call of a file can access the private > > > > data. > > > > + let private = unsafe { (*file).private_data }; > > > > + // SAFETY: Ioctl calls can borrow the private data of the file. > > > > + let device = unsafe { <T::Ptr as ForeignOwnable>::borrow(private) > > > > }; > > > > + > > > > + match T::compat_ioctl(device, cmd as u32, arg as usize) { > > > > + Ok(ret) => ret as c_long, > > > > + Err(err) => err.to_errno() as c_long, > > > > + } > > > > +} > > > > > > I think this works fine as a 1:1 mapping of the C API, so this > > > is certainly something we can do. On the other hand, it would be > > > nice to improve the interface in some way and make it better than > > > the C version. > > > > > > The changes that I think would be straightforward and helpful are: > > > > > > - combine native and compat handlers and pass a flag argument > > > that the callback can check in case it has to do something > > > special for compat mode > > > > > > - pass the 'arg' value as both a __user pointer and a 'long' > > > value to avoid having to cast. This specifically simplifies > > > the compat version since that needs different types of > > > 64-bit extension for incoming 32-bit values. > > > > > > On top of that, my ideal implementation would significantly > > > simplify writing safe ioctl handlers by using the information > > > encoded in the command word: > > > > > > - copy the __user data into a kernel buffer for _IOW() > > > and back for _IOR() type commands, or both for _IOWR() > > > - check that the argument size matches the size of the > > > structure it gets assigned to > > > > - Handle versioning by size for ioctl()s correctly so stuff like: > > > > /* extensible ioctls */ > > switch (_IOC_NR(ioctl)) { > > case _IOC_NR(NS_MNT_GET_INFO): { > > struct mnt_ns_info kinfo = {}; > > struct mnt_ns_info __user *uinfo = (struct mnt_ns_info __user *)arg; > > size_t usize = _IOC_SIZE(ioctl); > > > > if (ns->ops->type != CLONE_NEWNS) > > return -EINVAL; > > > > if (!uinfo) > > return -EINVAL; > > > > if (usize < MNT_NS_INFO_SIZE_VER0) > > return -EINVAL; > > > > return copy_ns_info_to_user(to_mnt_ns(ns), uinfo, usize, &kinfo); > > } > > > > This is not well-known and noone versions ioctl()s correctly and if they > > do it's their own hand-rolled thing. Ideally, this would be a first > > class concept with Rust bindings and versioning like this would be > > universally enforced. > > Could you point me at some more complete documentation or example of > how to correctly do versioning? So I don't want you to lead astray so if this is out of reach for now I understand but basically we do have the concept of versioning structs by size. So I'm taking an example from the mount_setattr() man page though openat2() would also work: Extensibility In order to allow for future extensibility, mount_setattr() requires the user-space application to specify the size of the mount_attr structure that it is passing. By providing this information, it is possible for mount_setattr() to provide both forwards- and backwards-compatibility, with size acting as an implicit version number. (Because new extension fields will always be appended, the structure size will always increase.) This extensibility design is very similar to other system calls such as perf_setattr(2), perf_event_open(2), clone3(2) and openat2(2). Let usize be the size of the structure as specified by the user-space application, and let ksize be the size of the structure which the kernel supports, then there are three cases to consider: • If ksize equals usize, then there is no version mismatch and attr can be used verbatim. • If ksize is larger than usize, then there are some extension fields that the kernel supports which the user-space application is unaware of. Because a zero value in any added extension field signifies a no-op, the kernel treats all of the extension fields not provided by the user-space application as having zero values. This provides backwards-compatibility. • If ksize is smaller than usize, then there are some extension fields which the user-space application is aware of but which the kernel does not support. Because any extension field must have its zero values signify a no-op, the kernel can safely ignore the unsupported extension fields if they are all zero. If any unsupported extension fields are non-zero, then -1 is returned and errno is set to E2BIG. This provides forwards-compatibility. [...] In essence ioctl()s are already versioned by size because the size of the passed argument is encoded in the ioctl cmd: struct my_struct { __u64 a; }; ioctl(fd, MY_IOCTL, &my_struct); then _IOC_SIZE(MY_IOCTL) will give you the expected size. If the kernel extends the struct to: struct my_struct { __u64 a; __u64 b; }; then the value of MY_IOCTL changes. Most code currently cannot deal with such an extension because it's coded as a simple switch on the ioctl command: switch (cmd) { case MY_IOCTL: /* do something */ break; } So on an older kernel the ioctl would now fail because it won't be able to handle MY_STRUCT with an increased struct my_struct size because the switch wouldn't trigger. The correct way to handle this is to grab the actual ioctl number out from the ioctl command: switch (_IOC_NR(cmd)) { case _IOC_NR(MY_STRUCT): { and then grab the size of the ioctl: size_t usize = _IOC_SIZE(ioctl); perform sanity checks: // garbage if (usize < MY_STRUCT_SIZE_VER0) return -EINVAL; // ¿qué? if (usize > PAGE_SIZE) return -EINVAL; and then copy the stuff via copy_struct_from_user() or copy back out to user via other means. This way you can safely extend ioctl()s in a backward and forward compatible manner and if we can enforce this for new drivers then I think that's what we should do.