On Thu, Oct 12, 2023 at 06:36:52AM +0200, Christoph Hellwig wrote: > On Wed, Oct 11, 2023 at 11:04:58AM -0600, Keith Busch wrote: > > > I don't think it's reasonable for the driver to decode every passthrough > > command to validate the data lengths, or reject ones that we don't know > > how to decode. SG_IO doesn't do that either. > > I don't want that either, but what can we do against a (possibly > unprivileged) user corrupting data? The unpriviledged access is kind of recent. Maybe limit the scope of decoding to that usage? We've always known the interface can be misused to corrupt memory and/or data, and it was always user responsibility to use this interface reponsibly. We shouldn't disable something people have relied on for over 10 years just because someone rediscovered ways to break it. It's not like this is a "metadata" specific thing either; you can provide short user space buffers and corrupt memory with regular admin commands, and we have been able to that from day 1. But if you abuse this interface, it was always your fault; the kernel never took responsibility to sanity check your nvme input, and I think it's a bad precedent to start doing it. > SCSI has it much either because it has an explicit data transfer length > (outside the CDB) instead of trying to build it from information that > differs per opcode. One of the many places where it shows that NVMe > is a very sloppy and badly thought out protocol. Yeah, implicit PRP length has often been reported as one of the early protocol "regrets"...