Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages

On Wed, Feb 5, 2020 at 12:27 PM <jane.chu@xxxxxxxxxx> wrote:
>
> Hello,
>
> I haven't seen response to this proposal, unsure if there is a different
> but related discussion ongoing...
>
> I'd like to express my wish: please make it easier for the pmem
> applications when possible.
>
> If kernel does not clear poison when it could legitimately do so,

The only path where this happens today is write() syscalls in dax
mode. Otherwise, fallocate(PUNCH_HOLE) is currently the only guaranteed
way to trigger error clearing from userspace (outside of sending raw
commands to the device).
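
For illustration, the userspace side of the fallocate(PUNCH_HOLE) route
might look like the sketch below. punch_poisoned_range() is a
hypothetical helper name, not an existing API, and whether the poison
actually gets cleared depends on the filesystem and kernel handling the
later reallocation of those blocks; this only shows the call an
application would make.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/*
 * Hypothetical helper: deallocate [offset, offset + len) of the file
 * behind fd so that the filesystem has to reallocate (and zero) those
 * blocks on the next write or allocation.
 *
 * Returns 0 on success, or -1 with errno set by fallocate(2).
 */
int punch_poisoned_range(int fd, off_t offset, off_t len)
{
	/* Caller bug if the range is empty or negative. */
	assert(offset >= 0 && len > 0);

	/* PUNCH_HOLE is only valid in combination with KEEP_SIZE. */
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}
```

The range passed in would normally be rounded out to filesystem block
boundaries by the caller, since a partial-block punch only zeroes the
partial part rather than freeing the block.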

> applications have to go through lengths to clear poisons.
> For Cloud pmem applications that have upper bound on error recovery
> time, not clearing poison while zeroing-out is quite undesirable.

The complicating factor in all of this is the alignment requirement
for clearing and the inability of native CPU instructions to clear
errors. On current platforms, talking to firmware is required, and that
interface may require 256-byte block clearing. This is why the
implementation glommed on to clearing errors on block-I/O path writes:
we at least knew that all of those I/Os were 512-byte aligned.

This gets better with CPUs that support the movdir64b instruction. In
that case there is still a 64-byte alignment requirement, but there's
no need to talk to the BIOS and therefore no need to talk to a driver.

So we have this awkward dependency on block-device I/O semantics only
because it happened to organize I/O in a way that supports error
clearing.
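
To make the alignment point concrete, here is a minimal sketch of the
arithmetic involved. clearable_span() is a hypothetical helper, not
kernel code: given a byte range and a clearing granularity (64, 256 or
512 in the cases discussed above), it returns the sub-range made up of
whole aligned blocks, which is the only part that could be cleared.
Overflow of offset + len is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: shrink [offset, offset + len) to the largest
 * sub-range [*start, *end) that consists only of whole align-sized,
 * align-aligned blocks. Returns false if no whole block fits.
 */
bool clearable_span(uint64_t offset, uint64_t len, uint64_t align,
		    uint64_t *start, uint64_t *end)
{
	/* align must be a nonzero power of two. */
	assert(align != 0 && (align & (align - 1)) == 0);

	uint64_t first = (offset + align - 1) & ~(align - 1); /* round up */
	uint64_t last = (offset + len) & ~(align - 1);        /* round down */

	if (first >= last)
		return false;

	*start = first;
	*end = last;
	return true;
}
```

With this, a 1000-byte write at offset 100 covers a single whole
512-byte block (512..1024) but fifteen whole 64-byte lines (128..1088),
and a 300-byte write at offset 100 covers no whole 512-byte block at
all.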

Right now the kernel does not install a pte on faults that land on a
page with known poison, but only because the error clearing path is so
convoluted. We could only claim that fallocate(PUNCH_HOLE) cleared
errors because that was guaranteed to send 512-byte aligned zeroes
down the block-I/O path when the fs-blocks got reallocated. In a world
where native CPU instructions can clear errors, the dax write() syscall
case could be covered (modulo 64-byte alignment), and the kernel could
just let the page be mapped so that the application could attempt its
own fine-grained clearing without calling back into the kernel.


