Re: 回复: Re: [RFC PATCH 0/8] dax: Add a dax-rmap tree to support reflink

On 2020/4/28 2:43 PM, Dave Chinner wrote:
On Tue, Apr 28, 2020 at 06:09:47AM +0000, Ruan, Shiyang wrote:

On 2020/4/27 20:28:36, "Matthew Wilcox" <willy@xxxxxxxxxxxxx> wrote:

On Mon, Apr 27, 2020 at 04:47:42PM +0800, Shiyang Ruan wrote:
  This patchset is an attempt to resolve the shared 'page cache' problem
  for fsdax.

  In order to track multiple mappings and indexes on one page, I
  introduced a dax-rmap rb-tree to manage the relationship.  A dax entry
  will be associated more than once if it is shared.  The second time we
  associate this entry, we create this rb-tree and store its root in
  page->private (not used in fsdax).  We insert (->mapping, ->index) in
  dax_associate_entry() and delete it in dax_disassociate_entry().
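
To make the per-page bookkeeping described above concrete, here is a minimal userspace sketch. It is not the patchset's code: `struct dax_page`, `struct owner`, and the `rmap_*` helpers are hypothetical stand-ins, and a plain linked list takes the place of the rb-tree so the example stays short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's struct page or address_space. */
struct owner {
    const void *mapping;   /* which file's address space maps the page */
    unsigned long index;   /* page offset within that file */
    struct owner *next;
};

struct dax_page {
    struct owner *owners;  /* plays the role of page->private in the RFC */
};

/* Record one (mapping, index) owner.
 * Returns false on a duplicate or on allocation failure. */
static bool rmap_associate(struct dax_page *page, const void *mapping,
                           unsigned long index)
{
    assert(page != NULL);
    for (struct owner *o = page->owners; o; o = o->next)
        if (o->mapping == mapping && o->index == index)
            return false;

    struct owner *o = malloc(sizeof(*o));
    if (!o)
        return false;
    o->mapping = mapping;
    o->index = index;
    o->next = page->owners;
    page->owners = o;
    return true;
}

/* Drop one owner. Returns false if the pair was not recorded. */
static bool rmap_disassociate(struct dax_page *page, const void *mapping,
                              unsigned long index)
{
    assert(page != NULL);
    for (struct owner **pp = &page->owners; *pp; pp = &(*pp)->next) {
        struct owner *o = *pp;
        if (o->mapping == mapping && o->index == index) {
            *pp = o->next;
            free(o);
            return true;
        }
    }
    return false;
}

/* Number of owners currently sharing the page. */
static size_t rmap_count(const struct dax_page *page)
{
    size_t n = 0;
    for (const struct owner *o = page->owners; o; o = o->next)
        n++;
    return n;
}
```

Every shared page carries its own owner set in this scheme, which is the per-page overhead questioned in the reply below.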

Do we really want to track all of this on a per-page basis?  I would
have thought a per-extent basis was more useful.  Essentially, create
a new address_space for each shared extent.  Per page just seems like
a huge overhead.

Per-extent tracking sounds like a nice idea to me.  I hadn't thought of
it yet...

But the extent info is maintained by the filesystem.  I think we need a
way to obtain this info from the FS when associating a page.  It may be
a bit complicated.  Let me think about it...

That's why I want the -user of this association- to do a filesystem
callout instead of keeping its own naive tracking infrastructure.
The filesystem can do an efficient, on-demand reverse mapping lookup
from its own extent tracking infrastructure, and there's zero
runtime overhead when there are no errors present.
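
As a rough illustration of an on-demand reverse mapping lookup, the sketch below resolves a physical block to every (inode, file offset) that shares it. The record layout and the names `rmap_rec`, `file_pos`, and `rmap_query` are assumptions made for this example, not an existing filesystem API, and a linear scan stands in for whatever indexed extent structure a real filesystem would use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reverse-mapping record: physical blocks
 * [start, start + len) back file `ino`, beginning at file block `offset`. */
struct rmap_rec {
    unsigned long start;
    unsigned long len;
    unsigned long ino;
    unsigned long offset;
};

/* One owner of a physical block: which file, and where in that file. */
struct file_pos {
    unsigned long ino;
    unsigned long offset;
};

/* Find every owner of physical block `pblk`.
 * Writes at most `max_out` results to `out` and returns the total number
 * of matching records, which may exceed `max_out`. */
static size_t rmap_query(const struct rmap_rec *recs, size_t nr,
                         unsigned long pblk,
                         struct file_pos *out, size_t max_out)
{
    size_t found = 0;

    assert(recs != NULL || nr == 0);
    for (size_t i = 0; i < nr; i++) {
        const struct rmap_rec *r = &recs[i];

        if (pblk >= r->start && pblk - r->start < r->len) {
            if (found < max_out) {
                out[found].ino = r->ino;
                out[found].offset = r->offset + (pblk - r->start);
            }
            found++;
        }
    }
    return found;
}
```

Nothing here runs until an error is reported, so a shared extent costs no extra per-page state during normal operation.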

Hi Dave,

I ran into some difficulties when trying to implement the per-extent rmap tracking. So I re-read your comments and found that I had misunderstood what you described here.

I think what you mean is: we don't need the in-memory dax-rmap tracking now. Just ask the FS for the information about the owners associated with a page when a memory failure occurs. So the per-page (or even per-extent) dax-rmap is unnecessary in this case. Is this right?

Based on this, we only need to store the extent information of a fsdax page in its ->mapping (by looking it up from the FS). Then we obtain the owners of this page (also by looking them up from the FS) when a memory failure or another rmap case occurs.

So a fsdax page is no longer associated with a specific file, but with a FS (or the pmem device). I think this is easier to understand and implement.


--
Thanks,
Ruan Shiyang.

At the moment, this "dax association" is used to "report" a storage
media error directly to userspace. I say "report" because what it
does is kill userspace processes dead. The storage media error
actually needs to be reported to the owner of the storage media,
which in the case of FS-DAX is the filesystem.

That way the filesystem can then look up all the owners of that bad
media range (i.e. the filesystem block it corresponds to) and take
appropriate action. e.g.

- if it falls in filesystem metadata, shut down the filesystem
- if it falls in user data, call the "kill userspace dead" routines
   for each mapping/index tuple the filesystem finds for the given
   LBA at which the media error occurred.
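
The two cases above could be dispatched roughly as follows. This is only a sketch under stated assumptions: `toy_fs`, `block_info`, `media_owner`, and `fs_handle_media_error` are hypothetical names, the per-block table is a stand-in for the filesystem's real extent tracking, and the "free space" case is an addition of this example that the list above does not cover.

```c
#include <assert.h>
#include <stddef.h>

enum range_kind { RANGE_FREE, RANGE_METADATA, RANGE_USER_DATA };
enum fs_action { ACTION_NONE, ACTION_SHUTDOWN, ACTION_KILL_OWNERS };

/* Hypothetical (file, index) tuple that maps a bad block. */
struct media_owner {
    int file_id;
    unsigned long index;
};

struct block_info {
    enum range_kind kind;
    const struct media_owner *owners;  /* owners of a user-data block */
    size_t nr_owners;
};

/* Toy filesystem: a fixed table indexed by block number. */
struct toy_fs {
    const struct block_info *blocks;
    size_t nr_blocks;
    int shut_down;
};

/* Called once per owner; stands in for the "kill userspace dead" routine. */
typedef void (*kill_fn)(const struct media_owner *owner, void *ctx);

/* Report a media error at `block` to the filesystem, which decides what
 * to do based on what the block holds. */
static enum fs_action fs_handle_media_error(struct toy_fs *fs, size_t block,
                                            kill_fn kill, void *ctx)
{
    assert(fs != NULL && kill != NULL);
    if (block >= fs->nr_blocks)
        return ACTION_NONE;

    const struct block_info *bi = &fs->blocks[block];

    switch (bi->kind) {
    case RANGE_METADATA:
        fs->shut_down = 1;             /* metadata is gone: stop the fs */
        return ACTION_SHUTDOWN;
    case RANGE_USER_DATA:
        for (size_t i = 0; i < bi->nr_owners; i++)
            kill(&bi->owners[i], ctx); /* notify every sharer */
        return ACTION_KILL_OWNERS;
    default:
        return ACTION_NONE;            /* assumption: free space needs no action */
    }
}

/* Example callback that only counts how many owners were notified. */
static void count_kill(const struct media_owner *owner, void *ctx)
{
    (void)owner;
    ++*(size_t *)ctx;
}
```

Passing `count_kill` with a `size_t` counter as `ctx` shows how many mapping/index tuples a single bad block would affect.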

Right now if the media error is in filesystem metadata, the
filesystem isn't even told about it. The filesystem can't even shut
down - the error is just dropped on the floor and it won't be until
the filesystem next tries to reference that metadata that we notice
there is an issue.

Cheers,

Dave.





