Re: [PATCH 4/5] block: Add support for bouncing pinned pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 14-02-23 22:24:33, Christoph Hellwig wrote:
> On Wed, Feb 15, 2023 at 03:59:52PM +1100, Dave Chinner wrote:
> > I don't think this works, especially if the COW mechanism relies on
> > delayed allocation to prevent ENOSPC during writeback. That is, we
> > need a write() or page fault (to run ->page_mkwrite()) after every
> > call to folio_clear_dirty_for_io() in the writeback path to ensure
> > that new space is reserved for the allocation that will occur
> > during a future writeback of that page.
> > 
> > Hence we can't just leave the page dirty on COW filesystems - it has
> > to go through a clean state so that the clean->dirty event can be
> > gated on gaining the space reservation that allows it to be written
> > back again.
> 
> Exactly.  Although if we really want we could do the redirtying without
> formally moving to a clean state, but it certainly would require special
> new code to the same steps as if we were redirtying.

Yes.

> Which is another reason why I'd prefer to avoid all that if we can.
> For that we probably need an inventory of what long term pins we have
> in the kernel tree that can and do operate on shared file mappings,
> and what kind of I/O semantics they expect.

I'm a bit skeptical we can reasonably assess that (as much as I would love
to just not write these pages and be done with it) because a lot of
FOLL_LONGTERM users just pin passed userspace address range, then allow
userspace to manipulate it with other operations, and finally unpin it with
another call. Who knows whether shared pagecache pages are passed in and
what userspace is doing with them while they are pinned? 

We have stuff like io_uring using FOLL_LONGTERM for IO buffers passed from
userspace (e.g. IORING_REGISTER_BUFFERS operation), we have V4L2 which
similarly pins buffers for video processing (and I vaguely remember one
bugreport due to some phone passing shared file pages there to acquire
screenshots from a webcam), and we have various infiniband drivers doing
this (not all of them are using FOLL_LONGTERM but they should AFAICS). We
even have vmsplice(2) that should be arguably using pinning with
FOLL_LONGTERM (at least that's the plan AFAIK) and not writing such pages
would IMO provide an interesting attack vector...

								Honza


-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux