On Thu, 14 Jul 2022 17:01:16 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > In preparation for provisioning CXL regions, add accounting for the DPA > space consumed by existing regions / decoders. Recall, a CXL region is a > memory range comprised from one or more endpoint devices contributing a > mapping of their DPA into HPA space through a decoder. > > Record the DPA ranges covered by committed decoders at initial probe of > endpoint ports relative to a per-device resource tree of the DPA type > (pmem or volatile-ram). > > The cxl_dpa_rwsem semaphore is introduced to globally synchronize DPA > state across all endpoints and their decoders at once. The vast majority > of DPA operations are reads as region creation is expected to be as rare > as disk partitioning and volume creation. The device_lock() for this > synchronization is specifically avoided for concern of entangling with > sysfs attribute removal. > > Co-developed-by: Ben Widawsky <bwidawsk@xxxxxxxxxx> > Signed-off-by: Ben Widawsky <bwidawsk@xxxxxxxxxx> > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> One trivial ordering question inline. I'm not that bothered whether you do anything about it though as it's all very local. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> > --- > drivers/cxl/core/hdm.c | 143 ++++++++++++++++++++++++++++++++++++++++++++---- > drivers/cxl/cxl.h | 2 + > drivers/cxl/cxlmem.h | 13 ++++ > 3 files changed, 147 insertions(+), 11 deletions(-) > > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c > index 650363d5272f..d4c17325001b 100644 > --- a/drivers/cxl/core/hdm.c > +++ b/drivers/cxl/core/hdm.c > @@ -153,10 +153,105 @@ void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds) > } > EXPORT_SYMBOL_NS_GPL(cxl_dpa_debug, CXL); > > +/* > + * Must be called in a context that synchronizes against this decoder's > + * port ->remove() callback (like an endpoint decoder sysfs attribute) > + */ > +static void __cxl_dpa_release(struct cxl_endpoint_decoder *cxled) > +{ > + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); > + struct cxl_dev_state *cxlds = cxlmd->cxlds; > + struct resource *res = cxled->dpa_res; > + > + lockdep_assert_held_write(&cxl_dpa_rwsem); > + > + if (cxled->skip) > + __release_region(&cxlds->dpa_res, res->start - cxled->skip, > + cxled->skip); > + cxled->skip = 0; > + __release_region(&cxlds->dpa_res, res->start, resource_size(res)); Minor but I think the ordering in here is unnecessarily not the opposite of what is going on in __cxl_dpa_reserve() Should be releasing the actual rs first, then releasing the skip. > + cxled->dpa_res = NULL; > +} > + > +static void cxl_dpa_release(void *cxled) > +{ > + down_write(&cxl_dpa_rwsem); > + __cxl_dpa_release(cxled); > + up_write(&cxl_dpa_rwsem); > +} > + > +static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, > + resource_size_t base, resource_size_t len, > + resource_size_t skipped) > +{ > + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); > + struct cxl_port *port = cxled_to_port(cxled); > + struct cxl_dev_state *cxlds = cxlmd->cxlds; > + struct device *dev = &port->dev; > + struct resource *res; > + > + lockdep_assert_held_write(&cxl_dpa_rwsem); > + > + if (!len) > + return 0; > + > + if (cxled->dpa_res) { > + dev_dbg(dev, "decoder%d.%d: existing allocation %pr assigned\n", > + port->id, cxled->cxld.id, cxled->dpa_res); > + return -EBUSY; > + } > + > + if (skipped) { > + res = __request_region(&cxlds->dpa_res, base - skipped, skipped, > + dev_name(&cxled->cxld.dev), 0); > + if (!res) { > + dev_dbg(dev, > + "decoder%d.%d: failed to reserve skipped space\n", > + port->id, cxled->cxld.id); > + return -EBUSY; > + } > + } > + res = __request_region(&cxlds->dpa_res, base, len, > + dev_name(&cxled->cxld.dev), 0); > + if (!res) { > + dev_dbg(dev, "decoder%d.%d: failed to reserve allocation\n", > + port->id, cxled->cxld.id); > + if (skipped) > + __release_region(&cxlds->dpa_res, base - skipped, > + skipped); > + return -EBUSY; > + } > + cxled->dpa_res = res; > + cxled->skip = skipped; > + > + return 0; > +} > + ...