Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 12, 2017 at 11:57 PM, Christoph Hellwig <hch@xxxxxx> wrote:
> On Thu, Oct 12, 2017 at 10:41:39AM -0700, Dan Williams wrote:
>> So, you're jumping into this review at v9 where I've split the patches
>> that take an initial MAP_DIRECT lease out from the patches that take
>> FL_LAYOUT leases at memory registration time. You can see a previous
>> attempt in "[PATCH v8 00/14] MAP_DIRECT for DAX RDMA and userspace
>> flush" which should be in your inbox.
>
> The point is that your problem has absolutely nothing to do with mmap,
> and all with get_user_pages.
>
> get_user_pages on DAX doesn't give the same guarantees as on pagecache
> or anonymous memory, and that is the prbolem we need to fix.  In fact
> I'm pretty sure if we try hard enough (and we might have to try
> very hard) we can see the same problem with plain direct I/O and without
> any RDMA involved, e.g. do a larger direct I/O write to memory that is
> mmap()ed from a DAX file, then truncate the DAX file and reallocate
> the blocks, and we might corrupt that new file.  We'll probably need
> a special setup where there is little other chance but to reallocate
> those used blocks.

I'll take a harder look at this...

> So what we need to do first is to fix get_user_pages vs unmapping
> DAX mmap()ed blocks, be that from a hole punch, truncate, COW
> operation, etc.
>
> Then we need to look into the special case of a long-living non-transient
> get_user_pages that RDMA does - we can't just reject any truncate or
> other operation for that, so that's where something like me layout
> lease suggestion comes into play - but the call that should get the
> least is not the mmap - it's the memory registration call that does
> the get_user_pages.

Yes, mmap is not the place to get the lease for a later
get_user_pages, and my patches do take an additional lease at
get_user_pages / MR init time. However, the mmap call has the
file-descriptor for SIGIO the MR-init call does not. If we delay all
of the setup it to MR time then we need to invent a notification
scheme specific to RDMA which seems like a waste to me when we can
generically signal an event on the fd for any event that effects any
of the vma's on the file. The FL_LAYOUT lease impacts the entire file,
so as far as I can see delaying the notification until MR-init is too
late, too granular, and too RDMA specific.
--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux