On Mon, Dec 03, 2018 at 05:07:53PM -0500, Doug Ledford wrote: > On Mon, 2018-12-03 at 20:58 +0000, Jason Gunthorpe wrote: > > Using the new UVERBS_METHOD_INVOKE_WRITE interface all of the existing > > write() commands can be immediately converted to execute over ioctl. > > > > If -DIOCTL_MODE=ioctl is set then only the IOCTL interface is supported, > > and write will not be called. Otherwise the 'both' mode will auto > > detect the kernel support and use it. > > > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx> > > libibverbs/cmd_fallback.c | 44 ++++++++++++++++++++++++++++++++++++++- > > libibverbs/device.c | 27 ++++++++++++++++++++++++ > > libibverbs/ibverbs.h | 1 + > > 3 files changed, 71 insertions(+), 1 deletion(-) > > > > This needs the matching kernel patch: > > > > https://patchwork.kernel.org/patch/10706127/ > > > > a kernel headers update and the series at: > > > > https://github.com/linux-rdma/rdma-core/pull/434 > > > > To apply. > > And long term, this is what we need to resolve the security issue/no > fork problem... Yes > We leave the write() interface enabled in the kernel for backward > compatibility for the foreseeable future (this is the whole "no > regressions" requirement from Linus...we were required to regress in > order to resolve the security issue, but other than that, we should > require users to update rdma-core to keep working systems working, so > the existing write() interface needs to stay online). Yes > We enable this ioctl-can-wrap-writes feature in rdma-core and in the > kernel Yes, but this patch just provides the code, the default compile option is left as write() only. All the provider teams should test with ioctl() only mode and confirm there aren't any crazy things missed. I was imagining we'd give another month or two for that, the moven the rdma-core default to both (ie prefer ioctl if available) At some point testing of the write() side will diminish, but now that everything is the same flow it shouldn't be too hard to keep working. > New builds of rdma-core are switched to be ioctl only by preference with > a fallback to write if necessary, or are you thinking build it with > -DIOCTL_MODE and force it to require newer kernels? I image the ioctl only mode as an option distros could choose to enable if they feel confident they don't want to support write() only kernels. It speeds up the call path a bit and removes some code from rdma-core, so it is worth doing if you can. > Regardless, we keep the draconian check in place on the write syscall > entry point, so if anyone tries to do any funny stuff with forking an > application that's already opened a uverbs device, and manages to trick > an application to writing to the uverbs device, it fails. But we don't > turn the write() interface off, we just leave that draconian check > forever. yes, basically. In two years we add a pr_once saying the userspace should be udpated or something > Going forward, the only way to fork then use open device work is if you > use the ioctl method exclusively. This is enforced in the kernel simply > by leaving the draconian check on the write syscall entry point and the > new ioctl method you just posted bypasses the write syscall entry point > and directly looks up the method handler for all of the write commands > and then calls the handler directly after setting up the right cmd/resp > udata structs as part of the attr bundle. Yes > In the end, this is all guaranteed safe because the rest of the core > kernel doesn't play setfs(KERN_DS); games with ioctl commands, where as > they might do so on the write() syscall to redirect things under certain > circumstances. Yes, ioctl is allowed to chase pointers, write() and read() are not.. > Have I got all of the long term intentions for this straight while I'm > going over it all? Yep Jason