Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA

On Wed, 2019-02-06 at 10:35 -0800, Matthew Wilcox wrote:
> On Wed, Feb 06, 2019 at 01:32:04PM -0500, Doug Ledford wrote:
> > On Wed, 2019-02-06 at 09:52 -0800, Matthew Wilcox wrote:
> > > On Wed, Feb 06, 2019 at 10:31:14AM -0700, Jason Gunthorpe wrote:
> > > > On Wed, Feb 06, 2019 at 10:50:00AM +0100, Jan Kara wrote:
> > > > 
> > > > > MM/FS asks for lease to be revoked. The revoke handler agrees with the
> > > > > other side on cancelling RDMA or whatever and drops the page pins. 
> > > > 
> > > > This takes a trip through userspace since the communication protocol
> > > > is entirely managed in userspace.
> > > > 
> > > > Most existing communication protocols don't have a 'cancel operation'.
> > > > 
> > > > > Now I understand there can be HW / communication failures etc. in
> > > > > which case the driver could either block waiting or make sure future
> > > > > IO will fail and drop the pins. 
> > > > 
> > > > We can always rip things away from the userspace.. However..
> > > > 
> > > > > But under normal conditions there should be a way to revoke the
> > > > > access. And if the HW/driver cannot support this, then don't let it
> > > > > anywhere near DAX filesystem.
> > > > 
> > > > I think the general observation is that people who want to do DAX &
> > > > RDMA want it to actually work, without data corruption, random process
> > > > kills or random communication failures.
> > > > 
> > > > Really, few users would actually want to run in a system where revoke
> > > > can be triggered.
> > > > 
> > > > So.. how can the FS/MM side provide a guarantee to the user that
> > > > revoke won't happen under a certain system design?
> > > 
> > > Most of the cases we want revoke for are things like truncate().
> > > Shouldn't happen with a sane system, but we're trying to avoid users
> > > doing awful things like being able to DMA to pages that are now part of
> > > a different file.
> > 
> > Why is the solution revoke then?  Is there something besides truncate
> > that we have to worry about?  I ask because EBUSY is not currently
> > listed as a return value of truncate, so extending the API to include
> > EBUSY to mean "this file has pinned pages that can not be freed" is not
> > (or should not be) totally out of the question.
> > 
> > Admittedly, I'm coming in late to this conversation, but did I miss the
> > portion where that alternative was ruled out?
> 
> That's my preferred option too, but the preponderance of opinion leans
> towards "We can't give people a way to make files un-truncatable".

Has anyone looked at the laundry list of possible failures truncate
already has?  Among others, ETXTBSY is already in the list, and it
allows someone to make a file un-truncatable by running it.  There's
EPERM for multiple failures.  In order for someone to make a file
un-truncatable using this, they would already need perms to the file
as well as perms to get the direct I/O pin.  If they have the perms to
do it, I see no reason not to allow them to.  If you don't want someone
else to make a file un-truncatable that you want to truncate, then don't
share file perms with them.  What's the difficulty here?  Really,
creating this complex revoke thing to tear down I/O when people really
*don't* want that I/O getting torn down seems like forcing a bad API on
I/O just to avoid what is an entirely natural extension to an existing
API.  You *shouldn't* have the
right to truncate a file that is busy, and ETXTBSY is a perfect example
of that, and an example of the API done right.  This other....
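
To make the comparison concrete, here is a minimal userspace sketch of
how a caller might treat a "file is busy" failure from truncate().
ETXTBSY is a documented truncate(2) error today.  EBUSY as a "this file
has pinned pages" error is only the proposal under discussion and is an
assumption here, as are the names trunc_outcome,
classify_truncate_errno() and try_truncate(), which exist purely for
illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome classes a caller might distinguish after truncate(2). */
enum trunc_outcome {
    TRUNC_OK,      /* file was resized */
    TRUNC_IN_USE,  /* file is busy; caller may retry later */
    TRUNC_FAILED   /* any other error */
};

/* Map an errno value from truncate(2) onto an outcome class. */
static enum trunc_outcome classify_truncate_errno(int err)
{
    switch (err) {
    case 0:
        return TRUNC_OK;
    case ETXTBSY:  /* documented today: file is being executed */
    case EBUSY:    /* ASSUMPTION: proposed "pinned pages" error */
        return TRUNC_IN_USE;
    default:
        return TRUNC_FAILED;
    }
}

/* Attempt the truncate and report which class the result falls in. */
static enum trunc_outcome try_truncate(const char *path, off_t length)
{
    if (truncate(path, length) == 0)
        return TRUNC_OK;
    return classify_truncate_errno(errno);
}
```

Under this sketch a caller would handle TRUNC_IN_USE from a pinned file
the same way it already has to handle a running executable: back off
and retry, or report the file as busy.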

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
