Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t

* Dan Williams <dan.j.williams@xxxxxxxxx> wrote:

> > What is the primary thing that is driving this need? Do we have a 
> > very concrete example?
> 
> My pet concrete example is covered by __pfn_t.  Referencing 
> persistent memory in an md/dm hierarchical storage configuration.  
> Setting aside the thrash to get existing block users to do 
> "bvec_set_page(page)" instead of "bvec->page = page" the onus is on 
> that md/dm implementation and backing storage device driver to 
> operate on __pfn_t.  That use case is simple because there is no use 
> of page locking or refcounting in that path, just dma_map_page() and 
> kmap_atomic().  The more difficult use case is precisely what Al 
> picked up on, O_DIRECT and RDMA.  This patchset does nothing to 
> address those use cases outside of not needing a struct page when 
> they eventually craft a bio.

So why not do a dual approach?

There are code paths where the 'pfn' of a persistent device is mostly 
used as a sector_t equivalent of terabytes of storage, not as an index 
of a memory object.

It's not an address into a cache, it's an index into a huge storage 
space - which happens to be (flash) RAM. For those code paths using 
pfn_t seems natural and using struct page * is a strained (not to 
mention expensive) model.
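
To put rough numbers on "expensive", here is a minimal userspace sketch, 
not kernel code. It assumes 4 KiB pages, 512-byte sectors and a 64-byte 
struct page (common on x86-64, but configuration dependent). The 
sketch_* names are made up for illustration and do not come from this 
series.

```c
#include <stdint.h>

/* Assumptions for illustration only, not taken from the patch set. */
#define SKETCH_PAGE_SHIFT   12   /* 4 KiB pages */
#define SKETCH_SECTOR_SHIFT 9    /* 512-byte sectors */
#define SKETCH_STRUCT_PAGE  64   /* assumed sizeof(struct page), in bytes */

/* Hypothetical helper: first 512-byte sector covered by a page frame. */
static inline uint64_t sketch_pfn_to_sector(uint64_t pfn)
{
	return pfn << (SKETCH_PAGE_SHIFT - SKETCH_SECTOR_SHIFT);
}

/* Hypothetical helper: page frame that contains a given sector. */
static inline uint64_t sketch_sector_to_pfn(uint64_t sector)
{
	return sector >> (SKETCH_PAGE_SHIFT - SKETCH_SECTOR_SHIFT);
}

/* Bytes of struct page metadata needed to describe 'bytes' of media. */
static inline uint64_t sketch_memmap_bytes(uint64_t bytes)
{
	return (bytes >> SKETCH_PAGE_SHIFT) * SKETCH_STRUCT_PAGE;
}
```

Under those assumptions a pfn is just a sector number shifted by 3, and 
sketch_memmap_bytes(1ULL << 40) comes to 16 GiB of struct page per TiB 
of persistent memory.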

For more complex facilities, where persistent memory is used as a 
memory object, especially where the underlying device is true, 
infinitely writable RAM (not flash), treating it as a memory zone, or 
setting up dynamic struct pages, would be the natural approach (with 
the inevitable cost of setup/teardown in the latter case).

I'd say that for anything where the dynamic struct page is torn down 
unconditionally after completion of only a single use, the natural API 
is probably pfn_t, not struct page. Any synchronization is already 
handled at the block request layer, and it's really storage op 
synchronization, not memory access synchronization.

For anything more complex, that maps any of this storage to 
user-space, or exposes it to higher level struct page based APIs, 
etc., where references matter and it's more of a cache with 
potentially multiple users, not an IO space, the natural API is struct 
page.
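
One way to picture the dual approach is a handle that always carries a 
pfn and only optionally a page. This is a purely illustrative userspace 
sketch: sketch_page is a stand-in for struct page, and none of these 
names exist in the kernel or in this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct page: just a refcount. */
struct sketch_page {
	int refcount;
};

/* Hypothetical handle: always has a pfn, only sometimes a page. */
struct sketch_memref {
	uint64_t pfn;
	struct sketch_page *page;	/* NULL for pure "IO space" references */
};

/* 'pfn as sector_t' side: no page, nothing to refcount or lock. */
static inline struct sketch_memref sketch_ref_from_pfn(uint64_t pfn)
{
	struct sketch_memref ref = { .pfn = pfn, .page = NULL };
	return ref;
}

/* 'memory object' side: the caller has set up a page for this pfn. */
static inline struct sketch_memref
sketch_ref_from_page(uint64_t pfn, struct sketch_page *page)
{
	struct sketch_memref ref = { .pfn = pfn, .page = page };
	return ref;
}

/* Take a reference if there is a page to count it on.
 * Returns 1 if a reference was taken, 0 for a pfn-only handle. */
static inline int sketch_ref_get(struct sketch_memref *ref)
{
	if (!ref->page)
		return 0;
	ref->page->refcount++;
	return 1;
}

/* Drop a reference taken by sketch_ref_get(); no-op for pfn-only. */
static inline void sketch_ref_put(struct sketch_memref *ref)
{
	if (ref->page)
		ref->page->refcount--;
}
```

Single-use block IO would only ever build handles with 
sketch_ref_from_pfn(), while anything mapped to user-space or shared 
between users would go through sketch_ref_from_page() and pay for the 
refcounting.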

I'd say that this particular series mostly addresses the 'pfn as 
sector_t' side of the equation, where persistent memory is IO space, 
not memory space, and as such it is the more natural and thus also the 
cheaper/faster approach.

Linus probably disagrees? :-)

Thanks,

	Ingo