Re: direct_access, pinning and truncation

On 10/20/2014 02:01 AM, Dave Chinner wrote:
> On Sun, Oct 19, 2014 at 02:08:07PM +0300, Boaz Harrosh wrote:
>> On 10/10/2014 05:24 PM, Matthew Wilcox wrote:
>> <>
>>>
>>> I'm assuming that we come up with *some* way to solve the missing struct
>>> page problem.  Whether it's restructuring splice, O_DIRECT and RDMA to do
>>> without struct pages, 
>>
>> That makes no sense to me, where will it end? You are doubling the size of the
>> code to have two paths, and there will always be a subsystem you did not touch
>> and is missing support. And why? page was already invented to do exactly what you
>> want, track state of a PFN.
> .....
>>> whether it's coming up
>>> with some other data structure that takes the place of struct page for
>>> DAX ... 
>>
>> Again. Why reinvent the wheel when the old one works perfectly and does
>> everything you want, including the most important aspect. Not adding any
>> new infrastructure, and/or modifying any code. So why even think about it?
>>
>>> doesn't matter for this part of the conversation.
>>>
>>
>> I agree, this does not solve the reference problem, in this case DAX will
>> need a new entry into the FS to communicate delayed free-block. But as Jan
>> pointed out this is not against current FS structure.
>>
>> I think lots of current DAX problems and performance shortcomings can be
>> solved very nicely if we assume we have struct-page for pmem. For example
>> the use of the page-lock instead of the i_mutex we take today.
> 
> Which makes me look at what DAX is intended for.
> 
> DAX is an enabler, allowing us to get direct access to PMEM with
> *existing filesystem technologies*.  I don't want to have to add new
> extent management functions to XFS to add temporary references to
> allow DAX to hold onto extents after an inode has been freed because
> some RDMA app has pinned the PMEM and forgot to let it go. That way
> lies madness for existing filesystems - yes, we can add such warts
> to them, but it's ugly, nasty and needed only by a very, very small
> lunatic fringe of users.
> 

I agree

> IMO, this proposal is way outside the original DAX-replaces-XIP scope;
> I really don't think that requiring extensive modifications to
> filesystems to use DAX is a good idea. Apart from it being contrary to the
> original architectural goal of DAX (which was "enable direct access
> with minimal filesystem implementation impact"), we risk significant
> impact on non-DAX users by requiring architectural changes to the
> underlying filesystems to support DAX.
> 
> So my question is this: at what point do we say "out of scope for
> DAX, make this work with a native PMEM filesystem"?  DAX as it
> stands fills the "95% of what people need" goal with minimal effort;
> our efforts should be focussed on merging what we have, not creeping
> the scope and making it harder to implement and get merged.
> 
> If we want RDMA into PMEM devices or direct IO to/from persistent
> memory, then I'd suggest that this is functionality that belongs in
> native PMEM storage devices/filesystems and should be designed to be
> efficient in that environment from the ground up.
> 

You convinced me. This is out of scope for DAX and is up to the user.
It actually works today; let me explain:

Today, after my patch to pmem, one can just mmap a file and pass the
returned pointer to any RDMA engine one chooses, and it will just work.
With the brd driver and DAX it works today, and even with old XIP.
The problem that remains is a truncate while the file is RDMA mapped.
What the user will need to do is take a lock on the file to thwart any
truncates. For me this is like trashing the block-dev directly while an
FS is mounted, I think. Can a non-root user do this?
Please note that this scenario is possible today with a brd device.
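
A minimal userspace sketch of the scheme above, under stated assumptions.
The mail does not say which lock is meant; this sketch assumes an advisory
flock(2), which only stops truncaters that take the same lock first and
does nothing against a process that ignores the convention. The names
pinned_map, pinned_map_open, pinned_map_close and cooperative_truncate are
hypothetical, and no RDMA call is made: the mapped pointer is simply what
would be handed to the RDMA engine's registration call.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one locked, mapped file. */
struct pinned_map {
	int fd;       /* open file, holds the advisory lock */
	void *addr;   /* pointer that would be given to the RDMA engine */
	size_t len;   /* length of the mapping in bytes */
};

/*
 * Take a shared advisory lock, then map the whole file MAP_SHARED.
 * Returns 0 on success, -1 on failure.
 */
static int pinned_map_open(const char *path, struct pinned_map *pm)
{
	struct stat st;

	assert(path != NULL && pm != NULL);

	pm->fd = open(path, O_RDWR);
	if (pm->fd < 0)
		return -1;

	/* Assumption: every truncater takes LOCK_EX before truncating. */
	if (flock(pm->fd, LOCK_SH) < 0 || fstat(pm->fd, &st) < 0)
		goto fail;
	if (st.st_size == 0)
		goto fail;	/* nothing to map */

	pm->len = (size_t)st.st_size;
	pm->addr = mmap(NULL, pm->len, PROT_READ | PROT_WRITE,
			MAP_SHARED, pm->fd, 0);
	if (pm->addr == MAP_FAILED)
		goto fail;

	/* pm->addr / pm->len would now be registered with the RDMA engine. */
	return 0;
fail:
	close(pm->fd);	/* closing the only fd also drops the flock */
	return -1;
}

/* Unmap first, then drop the lock so truncaters may proceed. */
static void pinned_map_close(struct pinned_map *pm)
{
	munmap(pm->addr, pm->len);
	flock(pm->fd, LOCK_UN);
	close(pm->fd);
}

/*
 * A cooperating truncater: refuses (returns -1) while any mapping
 * holder still has the shared lock, otherwise truncates.
 */
static int cooperative_truncate(const char *path, off_t new_len)
{
	int fd = open(path, O_RDWR);
	int ret = -1;

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
		ret = ftruncate(fd, new_len);
		flock(fd, LOCK_UN);
	}
	close(fd);
	return ret;
}
```

The lock is taken before the mapping is created and released only after
munmap, so a cooperating truncater never sees an unlocked file that is
still mapped. A truncate(2) from a process that skips the lock still goes
through, which is the remaining hole described above.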

> Cheers,
> Dave.
> 

Thanks
Boaz
