On Sun, Jan 31, 2016 at 10:07 AM, Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote:
> On Sun, Jan 31, 2016 at 08:38:20AM -0800, Dan Williams wrote:
>> On Sun, Jan 31, 2016 at 2:55 AM, Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote:
>> > On Sat, Jan 30, 2016 at 11:12:12PM -0700, Ross Zwisler wrote:
>> >> Is there a reason to store pfns instead of kaddrs in the radix tree?
>> >
>> > Once ARM, MIPS and SPARC get supported, they're going to need temporary
>> > kernel addresses assigned to PFNs rather than permanent ones.  Also,
>> > it'll be easier for teardown to delete PFNs associated with a particular
>> > device than kaddrs associated with a particular device.  And it lets
>> > us support more persistent memory on a 32-bit machine (also on a 64-bit
>> > machine, but that's mostly theoretical)
>> >
>> > +/*
>> > + * DAX uses the 'exceptional' entries to store PFNs in the radix tree.
>> > + * Bit 0 is clear (the radix tree uses this for its own purposes).  Bit
>> > + * 1 is set (to indicate an exceptional entry).  Bits 2 & 3 are PFN_DEV
>> > + * and PFN_MAP.  The top two bits denote the size of the entry (PTE, PMD,
>> > + * PUD, one reserved).  That leaves us 26 bits on 32-bit systems and 58
>> > + * bits on 64-bit systems, able to address 256GB and 1024EB respectively.
>> > + */
>> >
>> > It's also pretty cheap to look up the kaddr from the pfn, at least on
>> > 64-bit architectures without cache aliasing problems:
>> >
>> > +static void *dax_map_pfn(pfn_t pfn, unsigned long index)
>> > +{
>> > +	preempt_disable();
>> > +	pagefault_disable();
>> > +	return pfn_to_kaddr(pfn_t_to_pfn(pfn));
>>
>> pfn_to_kaddr() assumes persistent memory is direct mapped which is not
>> always the case.
>
> Yes.  This is just the default implementation of dax_map_pfn() which works
> for most situations.  We can introduce more complex implementations of
> dax_map_pfn() as necessary.  You make another excellent point for why
> we should store PFNs in the radix tree instead of kaddrs :-)

How much complexity do we want to add in support of an fsync/msync
mechanism that is not the recommended way to use DAX?
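
For anyone following the bit layout in the quoted comment, here is a rough
userspace sketch of how such an entry could be packed and unpacked on a
64-bit system.  It is not taken from the patch.  Every name below
(ENTRY_*, entry_pack() and friends) is hypothetical.  The assignment of
PFN_DEV to bit 2 and PFN_MAP to bit 3, the numeric order of PTE/PMD/PUD in
the top two bits, and the 4KB page size used for the capacity arithmetic
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; this mirrors the quoted comment, not the real patch. */
#define ENTRY_BITS        64u
#define ENTRY_EXCEPTIONAL (UINT64_C(1) << 1)  /* bit 1 set, bit 0 stays clear */
#define ENTRY_PFN_DEV     (UINT64_C(1) << 2)  /* assumed: bit 2 */
#define ENTRY_PFN_MAP     (UINT64_C(1) << 3)  /* assumed: bit 3 */
#define ENTRY_FLAG_MASK   (ENTRY_PFN_DEV | ENTRY_PFN_MAP)
#define ENTRY_PFN_SHIFT   4u
#define ENTRY_SIZE_SHIFT  (ENTRY_BITS - 2u)   /* top two bits hold the size */
#define ENTRY_PFN_BITS    (ENTRY_SIZE_SHIFT - ENTRY_PFN_SHIFT)  /* 58 */
#define ENTRY_PFN_MASK    ((UINT64_C(1) << ENTRY_PFN_BITS) - 1)

/* Assumed encoding order; the value 3 is the reserved one. */
enum entry_kind { ENTRY_PTE = 0, ENTRY_PMD = 1, ENTRY_PUD = 2 };

/* Pack a PFN, its PFN_DEV/PFN_MAP flags and its size into one entry. */
static uint64_t entry_pack(uint64_t pfn, uint64_t flags, enum entry_kind kind)
{
	assert(pfn <= ENTRY_PFN_MASK);            /* PFN must fit in 58 bits */
	assert((flags & ~ENTRY_FLAG_MASK) == 0);  /* only bits 2 and 3 allowed */
	return ((uint64_t)kind << ENTRY_SIZE_SHIFT) |
	       (pfn << ENTRY_PFN_SHIFT) | flags | ENTRY_EXCEPTIONAL;
}

/* Recover the PFN from bits 4..61. */
static uint64_t entry_pfn(uint64_t entry)
{
	return (entry >> ENTRY_PFN_SHIFT) & ENTRY_PFN_MASK;
}

/* Recover the size field from the top two bits. */
static enum entry_kind entry_kind_of(uint64_t entry)
{
	return (enum entry_kind)(entry >> ENTRY_SIZE_SHIFT);
}

/*
 * log2 of the addressable bytes for a given entry width: six bits are
 * spent on bookkeeping (bits 0-3 plus the top two), and each PFN covers
 * an assumed 4KB (2^12 byte) page.
 */
static unsigned addressable_log2(unsigned entry_bits)
{
	return (entry_bits - 6u) + 12u;
}
```

With these assumptions addressable_log2(32) is 38, i.e. 2^38 bytes = 256GB,
and addressable_log2(64) is 70, i.e. 2^70 bytes = 1024EB, which matches the
figures in the comment.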