On Mon, Apr 24, 2023 at 09:28:07AM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 24, 2023 at 11:17:55AM +0100, Lorenzo Stoakes wrote:
> > On Mon, Apr 24, 2023 at 02:43:56AM -0700, Christoph Hellwig wrote:
> > > I'm pretty sure DIRECT I/O reads that write into file backed mappings
> > > are out there in the wild.
>
> I wonder if that is really the case? I know people tried this with
> RDMA and it didn't get very far before testing uncovered data
> corruption and kernel crashes.. Maybe O_DIRECT has a much smaller race
> window so people can get away with it?
>
> > I know Jason is keen on fixing this at a fundamental level and this flag is
> > ultimately his suggestion, so it certainly doesn't stand in the way of this
> > work moving forward.
>
> Yeah, the point is to close it off, because while we wish it was
> fixed properly, it isn't. We are still who knows how far away from it.
>
> In the mean time this is a fairly simple way to oops the kernel,
> especially with cases like io_uring and RDMA. So, I view it as a
> security problem.
>
> My general dislike was that io_uring protected itself from the
> security problem and we left all the rest of the GUP users out to dry.
>
> So, my suggestion was to mark the places where we want to allow this,
> eg O_DIRECT, and block everwhere else. Lorenzo, I would significantly
> par back the list you have.

I was being fairly conservative in that list, though we certainly need to
set the flag for /proc/$pid/mem and ptrace to avoid breaking this
functionality (I observed breakpoints breaking without it, which obviously
is a no go :). I'm not sure if there's a more general way we could check
for this though?

A perhaps slightly unpleasant solution might be to not enforce this when
FOLL_FORCE is specified, which is mostly a ptrace + friends thing; then we
could drop all those exceptions.

I wouldn't be totally opposed to dropping it for RDMA too, because I
suspect accessing file-backed mappings for that is pretty iffy.

Do you have a sense of which in the list you feel could be pared back?

> I also suggest we force block it at some kernel lockdown level..
>
> Alternatively, perhaps we abuse FOLL_LONGTERM and prevent it from
> working with filebacked pages since, I think, the ease of triggering a
> bug goes up the longer the pages are pinned.
>

This would solve the io_uring case and it is certainly more of a concern
when the pin is intended to be kept around, though it feels a bit icky as a
non-FOLL_LONGTERM pin could surely be problematic too?

> Jason
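
To make the three options discussed above easier to compare, here is a
rough userspace sketch of the decision each one implies. It is not kernel
code and not taken from the patch: the SK_ flag values are made up, the
opt-in flag name SK_FOLL_ALLOW_FILE_WRITE is hypothetical, and restricting
the check to write pins is an assumption based on the "write into file
backed mappings" wording.

```c
#include <stdbool.h>

/*
 * Illustrative flag bits only. The names echo the discussion, but the
 * values are NOT the kernel's and the opt-in flag is hypothetical.
 */
#define SK_FOLL_WRITE            0x01u
#define SK_FOLL_FORCE            0x02u
#define SK_FOLL_LONGTERM         0x04u
#define SK_FOLL_ALLOW_FILE_WRITE 0x08u /* hypothetical per-caller opt-in */

/* Only write pins of file-backed mappings are of interest (assumption). */
static bool sk_is_risky(unsigned int flags, bool file_backed)
{
	return file_backed && (flags & SK_FOLL_WRITE) != 0;
}

/* Option A: block everywhere except callers that opt in (e.g. O_DIRECT). */
static bool sk_gup_ok_optin(unsigned int flags, bool file_backed)
{
	if (!sk_is_risky(flags, file_backed))
		return true;
	return (flags & SK_FOLL_ALLOW_FILE_WRITE) != 0;
}

/* Option B: as option A, but FOLL_FORCE users (ptrace + friends) are exempt. */
static bool sk_gup_ok_force_exempt(unsigned int flags, bool file_backed)
{
	if (flags & SK_FOLL_FORCE)
		return true;
	return sk_gup_ok_optin(flags, file_backed);
}

/* Option C: only refuse long-term pins; short-lived pins stay allowed. */
static bool sk_gup_ok_longterm_only(unsigned int flags, bool file_backed)
{
	if (!sk_is_risky(flags, file_backed))
		return true;
	return (flags & SK_FOLL_LONGTERM) == 0;
}
```

Under this sketch, option C lets a plain SK_FOLL_WRITE pin of a file-backed
mapping through, which is exactly the non-FOLL_LONGTERM case questioned
above, whereas options A and B refuse it unless the caller opts in or (for
B) passes the force flag.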