Re: [PATCH v3 10/25] fsdax: Introduce pgmap_request_folios()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jason Gunthorpe wrote:
> On Mon, Oct 17, 2022 at 01:06:25PM -0700, Dan Williams wrote:
> > Alistair Popple wrote:
> > > 
> > > Dan Williams <dan.j.williams@xxxxxxxxx> writes:
> > > 
> > > [...]
> > > 
> > > > +/**
> > > > + * pgmap_request_folios - activate an contiguous span of folios in @pgmap
> > > > + * @pgmap: host page map for the folio array
> > > > + * @folio: start of the folio list, all subsequent folios have same folio_size()
> > > > + *
> > > > + * Caller is responsible for @pgmap remaining live for the duration of
> > > > + * this call. Caller is also responsible for not racing requests for the
> > > > + * same folios.
> > > > + */
> > > > +bool pgmap_request_folios(struct dev_pagemap *pgmap, struct folio *folio,
> > > > +			  int nr_folios)
> > > 
> > > All callers of pgmap_request_folios() I could see call this with
> > > nr_folios == 1 and pgmap == folio_pgmap(folio). Could we remove the
> > > redundant arguments and simplify it to
> > > pgmap_request_folios(struct folio *folio)?
> > 
> > The rationale for the @pgmap argument being separate is to make clear
> > that the caller must assert that pgmap is alive before passing it to
> > pgmap_request_folios(), and that @pgmap is constant for the span of
> > folios requested.
> 
> The api is kind of weird to take in a folio - it should take in the
> offset in bytes from the start of the pgmap and "create" usable
> non-free folios.

Plumbing that is non-trivial. Recall that in the DAX case the filesystem
can be sitting on top of a stack of device-mapper-dax devices routing to
more than one pgmap, and the pgmap can be made up of discontiguous
ranges. The filesystem knows the offset into the dax-device and then
asks that device to translate that to a pfn. From pfn the optional (see
CONFIG_FS_DAX_LIMITED) pgmap can be retrieved. Once DAX knows what pfn
it wants, while holding a lock that prevents the pgmap from entering its
dying state, it can request to pin that pfn.

I arrived at the interface being folio based because the order of the
pin(s) also needs to be conveyed and the order should be consistent.

> A good design is that nothing has a struct folio * unless it can also
> prove it has a non-zero refcount on it, and that is simply impossible
> for any caller at this point.

I agree that's a good design, and I think it is a bug to make this
request on anything but a zero-refcount folio, but plumbing pgmap
offsets to replace pfn_to_page() would require dax_direct_access() to
shift from a pfn lookup to a pgmap + offset lookup.

CONFIG_FS_DAX_LIMITED is something that needs to be deleted, but for now
it supports DAX without a pgmap, where "limited" is just meant for XIP
(eXecute-In-Place), no GUP support. Once that goes then
dax_direct_access() could evolve into an interface that assumes a pgmap
is present. Until then I think pgmap_request_folios() at least puts out
the immediate fire of making people contend with oddly refcounted DAX
pages.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux