Re: Problems with VM_MIXEDMAP removal from /proc/<pid>/smaps

On Fri, Oct 5, 2018 at 6:17 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> On Thu, Oct 4, 2018 at 11:35 PM Johannes Thumshirn <jthumshirn@xxxxxxx> wrote:
> >
> > On Thu, Oct 04, 2018 at 11:25:24PM -0700, Christoph Hellwig wrote:
> > > Since when is an article on some website a promise (of what exactly)
> > > by linux kernel developers?
> >
> > Let's stop it here, this doesn't make any sort of forward progress.
> >
>
> I do think there is some progress we can make if we separate DAX as an
> access mechanism vs DAX as a resource utilization contract. My attempt
> at representing Christoph's position is that the kernel should not be
> advertising / making access mechanism guarantees. That makes sense.
> Even with MAP_SYNC+DAX the kernel reserves the right to write-protect
> mappings at will and trap access into a kernel handler. Additionally,
> whether read(2) / write(2) does anything different behind the scenes
> in DAX mode, or not should be irrelevant to the application.
>
> That said what is certainly not irrelevant is a kernel giving
> userspace visibility and control into resource utilization. Jan's
> MADV_DIRECT_ACCESS lets the application make assumptions about page
> cache utilization, we just need another mechanism to read if a
> mapping is effectively already in that state.

I thought more about this today while reviewing the virtio-pmem driver
that will behave mostly like a DAX-capable pmem device except it will
be implemented by passing host page cache through to the guest as a
pmem device with a paravirtualized / asynchronous flush interface.
MAP_SYNC obviously needs to be disabled for this case, but we still need
to allow some semblance of DAX operation to save allocating page cache
in the guest. The need to explicitly clarify the state of DAX is
growing with the different nuances of DAX operation.

Let's use a new MAP_DIRECT flag to positively assert that a given
mmap() call is setting up a memory mapping without page-cache or
buffered indirection. To be clear, this is not my original MAP_DIRECT
proposal from a while back; it is just a flag to mmap() that causes the
mapping attempt to fail if there is any software buffering fronting
the memory mapping, or any requirement for software to manage flushing
outside of pushing writes through the cpu cache. This way, if we ever
extend MAP_SYNC for a buffered use case we can still definitively assert
that the mapping is "direct". So, MAP_DIRECT would fail for
traditional non-DAX block devices, and for this new virtio-pmem case.
It would also fail for any pmem device where we cannot assert that the
platform will take care of flushing write-pending-queues on power-loss
events.
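
To make the intended usage concrete, here is a rough userspace sketch of
how an application might probe for a direct mapping and fall back to a
buffered one. Assumptions, none of which come from an existing kernel:
the MAP_DIRECT value below is a placeholder bit picked for illustration,
the flag is assumed to be requested together with MAP_SHARED_VALIDATE
(which already rejects flags it does not know with EOPNOTSUPP), and
map_file() / struct file_mapping are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03   /* Linux >= 4.15; value from the uapi headers */
#endif

/*
 * HYPOTHETICAL: MAP_DIRECT is only a proposal in this thread. The value
 * is a placeholder so that the sketch compiles; it is not a real ABI.
 */
#ifndef MAP_DIRECT
#define MAP_DIRECT 0x200000
#endif

struct file_mapping {
	void   *addr;    /* start of the mapping */
	size_t  len;     /* length in bytes */
	int     direct;  /* 1 if the kernel accepted MAP_DIRECT, else 0 */
};

/*
 * Map the whole file read/write. Ask for a "direct" mapping first; if
 * the kernel refuses for any reason (unknown flag, page cache in front
 * of the mapping, ...), retry as a plain shared mapping.
 * Returns 0 on success, -1 on failure.
 */
static int map_file(const char *path, struct file_mapping *out)
{
	struct stat st;
	void *p;
	size_t len;
	int fd, direct;

	assert(path != NULL && out != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) != 0 || st.st_size <= 0) {
		close(fd);
		return -1;
	}
	len = (size_t)st.st_size;

	/* MAP_SHARED_VALIDATE makes the kernel reject unknown flags
	 * instead of silently ignoring them. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
	direct = (p != MAP_FAILED);

	if (!direct)
		p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	close(fd);  /* the mapping stays valid after the fd is closed */
	if (p == MAP_FAILED)
		return -1;

	out->addr = p;
	out->len = len;
	out->direct = direct;
	return 0;
}
```

On any current kernel the first mmap() is expected to fail and the
helper falls back to MAP_SHARED with direct == 0, in which case the
application has to keep using msync() / fsync() for durability. Only
when direct == 1 could it assume there is no page cache behind the
mapping.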


