On Mon, Jul 09, 2018 at 02:49:54PM -0700, Dan Williams wrote:
> On Mon, Jul 9, 2018 at 8:44 AM, Keith Busch <keith.busch@xxxxxxxxx> wrote:
> > On Fri, Jul 06, 2018 at 03:25:15PM -0700, Dan Williams wrote:
> >> This is going in the right direction... but still needs to account
> >> for the blk_overlap.
> >>
> >> So, on a given DIMM BLK capacity is allocated from the top of DPA
> >> space going down and PMEM capacity is allocated from the bottom of
> >> the DPA space going up.
> >>
> >> Since BLK capacity is single DIMM, and PMEM capacity is striped you
> >> could get into the situation where one DIMM is fully allocated for
> >> BLK usage and that would shade / remove the possibility to use the
> >> PMEM capacity on the other DIMMs in the PMEM set. PMEM needs all the
> >> same DPAs in all the DIMMs to be free.
> >>
> >> >
> >> > ---
> >> > diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
> >> > index 8d348b22ba45..f30e0c3b0282 100644
> >> > --- a/drivers/nvdimm/dimm_devs.c
> >> > +++ b/drivers/nvdimm/dimm_devs.c
> >> > @@ -536,6 +536,31 @@ resource_size_t nd_blk_available_dpa(struct nd_region *nd_region)
> >> >          return info.available;
> >> >  }
> >> >
> >> > +/**
> >> > + * nd_pmem_max_contiguous_dpa - For the given dimm+region, return the max
> >> > + *                              contiguous unallocated dpa range.
> >> > + * @nd_region: constrain available space check to this reference region
> >> > + * @nd_mapping: container of dpa-resource-root + labels
> >> > + */
> >> > +resource_size_t nd_pmem_max_contiguous_dpa(struct nd_region *nd_region,
> >> > +                                           struct nd_mapping *nd_mapping)
> >> > +{
> >> > +        struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
> >> > +        resource_size_t max = 0;
> >> > +        struct resource *res;
> >> > +
> >> > +        if (!ndd)
> >> > +                return 0;
> >> > +
> >> > +        for_each_dpa_resource(ndd, res) {
> >> > +                if (strcmp(res->name, "pmem-reserve") != 0)
> >> > +                        continue;
> >> > +                if (resource_size(res) > max)
> >>
> >> ...so instead straight resource_size() here you need trim the end of
> >> this "pmem-reserve" resource to the start of the first BLK allocation
> >> in any of the DIMMs in the set.
> >>
> >> See blk_start calculation in nd_pmem_available_dpa().

Okay, I think I've got it this time. __reserve_free_pmem won't reserve a
resource above the first blk resource, so we can't get some fragmented
alternating mix of blk and pmem within a dimm.

With that in mind, if we reserve pmem across all dimms in a region, take
the largest reserved pmem within each dimm, then take the smallest of
those across all dimms, that should be the largest allocatable pmem
extent.
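
Rough sketch of how I picture that min-of-per-dimm-maxima working (not a
real patch; sketch_region_max_pmem() is a made-up name, and it assumes
__reserve_free_pmem() has already inserted the "pmem-reserve" resources
for every dimm in the region):

/*
 * Sketch only: per dimm, find the largest "pmem-reserve" resource,
 * then take the minimum of those maxima across all dimms in the
 * region. Since PMEM is striped, the region can only grow by what
 * the most constrained dimm can contribute.
 */
static resource_size_t sketch_region_max_pmem(struct nd_region *nd_region)
{
	resource_size_t region_max = (resource_size_t) -1;
	int i;

	for (i = 0; i < nd_region->ndr_mappings; i++) {
		struct nd_mapping *nd_mapping = &nd_region->mapping[i];
		struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
		resource_size_t dimm_max = 0;
		struct resource *res;

		if (!ndd)
			return 0;

		/* largest contiguous reserved pmem range in this dimm */
		for_each_dpa_resource(ndd, res) {
			if (strcmp(res->name, "pmem-reserve") != 0)
				continue;
			if (resource_size(res) > dimm_max)
				dimm_max = resource_size(res);
		}

		/* clamp the region to the smallest per-dimm maximum */
		if (dimm_max < region_max)
			region_max = dimm_max;
	}

	return region_max == (resource_size_t) -1 ? 0 : region_max;
}

The inner loop is basically the nd_pmem_max_contiguous_dpa() quoted
above; the outer loop over the mappings is the new part that limits the
result to what every dimm in the interleave set can supply.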