Re: [PATCH 18/46] cxl/hdm: Add support for allocating DPA to an endpoint decoder

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jonathan Cameron wrote:
> On Thu, 23 Jun 2022 19:47:18 -0700
> Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> 
> > The region provisioning flow will roughly follow a sequence of:
> > 
> > 1/ Allocate DPA to a set of decoders
> > 
> > 2/ Allocate HPA to a region
> > 
> > 3/ Associate decoders with a region and validate that the DPA allocations
> >    and topologies match the parameters of the region.
> > 
> > For now, this change (step 1) arranges for DPA capacity to be allocated
> > and deleted from non-committed decoders based on the decoder's mode /
> > partition selection. Capacity is allocated from the lowest DPA in the
> > partition and any 'pmem' allocation blocks out all remaining ram
> > capacity in its 'skip' setting. DPA allocations are enforced in decoder
> > instance order. I.e. decoder N + 1 always starts at a higher DPA than
> > instance N, and deleting allocations must proceed from the
> > highest-instance allocated decoder to the lowest.
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> 
> The error value setting in here might save a few lines, but to me it
> is less readable than setting rc in each error path.
> 
> > ---
> >  Documentation/ABI/testing/sysfs-bus-cxl |   37 +++++++
> >  drivers/cxl/core/core.h                 |    7 +
> >  drivers/cxl/core/hdm.c                  |  160 +++++++++++++++++++++++++++++++
> >  drivers/cxl/core/port.c                 |   73 ++++++++++++++
> >  4 files changed, 275 insertions(+), 2 deletions(-)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> > index 091459216e11..85844f9bc00b 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-cxl
> > +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> > @@ -171,7 +171,7 @@ Date:		May, 2022
> >  KernelVersion:	v5.20
> >  Contact:	linux-cxl@xxxxxxxxxxxxxxx
> >  Description:
> > -		(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
> > +		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
> >  		translates from a host physical address range, to a device local
> >  		address range. Device-local address ranges are further split
> >  		into a 'ram' (volatile memory) range and 'pmem' (persistent
> > @@ -180,3 +180,38 @@ Description:
> >  		when a decoder straddles the volatile/persistent partition
> >  		boundary, and 'none' indicates the decoder is not actively
> >  		decoding, or no DPA allocation policy has been set.
> > +
> > +		'mode' can be written, when the decoder is in the 'disabled'
> > +		state, with either 'ram' or 'pmem' to set the boundaries for the
> > +		next allocation.
> > +
> 
> As before, documentation above this in the file only uses single line break between
> entries.

Yeah, I'll go fix those up as a precursor patch.

> 
> > +
> > +What:		/sys/bus/cxl/devices/decoderX.Y/dpa_resource
> > +Date:		May, 2022
> > +KernelVersion:	v5.20
> > +Contact:	linux-cxl@xxxxxxxxxxxxxxx
> > +Description:
> > +		(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
> > +		and its 'dpa_size' attribute is non-zero, this attribute
> > +		indicates the device physical address (DPA) base address of the
> > +		allocation.
> 
> Why _resource rather than _base or _start?

To mimic PCI and NVDIMM sysfs that calls it a 'resource' address.

> 
> > +
> > +
> > +What:		/sys/bus/cxl/devices/decoderX.Y/dpa_size
> > +Date:		May, 2022
> > +KernelVersion:	v5.20
> > +Contact:	linux-cxl@xxxxxxxxxxxxxxx
> > +Description:
> > +		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
> > +		translates from a host physical address range, to a device local
> > +		address range. The range, base address plus length in bytes, of
> > +		DPA allocated to this decoder is conveyed in these 2 attributes.
> > +		Allocations can be mutated as long as the decoder is in the
> > +		disabled state. A write to 'size' releases the previous DPA
> 
> 'dpa_size' ?

Yes.

> 
> > +		allocation and then attempts to allocate from the free capacity
> > +		in the device partition referred to by 'decoderX.Y/mode'.
> > +		Allocate and free requests can only be performed on the highest
> > +		instance number disabled decoder with non-zero size. I.e.
> > +		allocations are enforced to occur in increasing 'decoderX.Y/id'
> > +		order and frees are enforced to occur in decreasing
> > +		'decoderX.Y/id' order.
> > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> > index 1a50c0fc399c..47cf0c286fc3 100644
> > --- a/drivers/cxl/core/core.h
> > +++ b/drivers/cxl/core/core.h
> > @@ -17,6 +17,13 @@ int cxl_send_cmd(struct cxl_memdev *cxlmd, struct cxl_send_command __user *s);
> >  void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> >  				   resource_size_t length);
> >  
> > +int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
> > +		     enum cxl_decoder_mode mode);
> > +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size);
> > +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
> > +resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
> > +resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled);
> > +
> >  int cxl_memdev_init(void);
> >  void cxl_memdev_exit(void);
> >  void cxl_mbox_init(void);
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index 8805afe63ebf..ceb4c28abc1b 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -248,6 +248,166 @@ static int cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
> >  	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
> >  }
> >  
> > +resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled)
> > +{
> > +	resource_size_t size = 0;
> > +
> > +	down_read(&cxl_dpa_rwsem);
> > +	if (cxled->dpa_res)
> > +		size = resource_size(cxled->dpa_res);
> > +	up_read(&cxl_dpa_rwsem);
> > +
> > +	return size;
> > +}
> > +
> > +resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled)
> 
> Instinct would be to expect this to return the resource, not the start.
> Rename?

I can add _start, but this is only servicing the dpa_resource_show()
sysfs operation, and there 'resource' as a the base address has ABI
history.

> 
> 
> > +{
> > +	resource_size_t base = -1;
> > +
> > +	down_read(&cxl_dpa_rwsem);
> > +	if (cxled->dpa_res)
> > +		base = cxled->dpa_res->start;
> > +	up_read(&cxl_dpa_rwsem);
> > +
> > +	return base;
> > +}
> > +
> > +int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
> > +{
> > +	int rc = -EBUSY;
> > +	struct device *dev = &cxled->cxld.dev;
> > +	struct cxl_port *port = to_cxl_port(dev->parent);
> > +
> > +	down_write(&cxl_dpa_rwsem);
> > +	if (!cxled->dpa_res) {
> > +		rc = 0;
> > +		goto out;
> > +	}
> > +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> > +		dev_dbg(dev, "decoder enabled\n");
> 
> I'd prefer explicit setting of rc = -EBUSY in the two
> 'error' paths to make it really clear when looking at these
> that they are treated as errors.

Ok.

> 
> > +		goto out;
> > +	}
> > +	if (cxled->cxld.id != port->dpa_end) {
> > +		dev_dbg(dev, "expected decoder%d.%d\n", port->id,
> > +			port->dpa_end);
> > +		goto out;
> > +	}
> > +	__cxl_dpa_release(cxled, true);
> > +	rc = 0;
> > +out:
> > +	up_write(&cxl_dpa_rwsem);
> > +	return rc;
> > +}
> > +
> > +int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
> > +		     enum cxl_decoder_mode mode)
> > +{
> > +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +	struct device *dev = &cxled->cxld.dev;
> > +	int rc = -EBUSY;
> 
> As above, I'd prefer seeing error set in each error path rther
> than it being set in a few locations and having to go look
> for which value it currently has.  To me having the
> error code next to the condition is much easier to follow.

Fair enough.

> 
> > +
> > +	switch (mode) {
> > +	case CXL_DECODER_RAM:
> > +	case CXL_DECODER_PMEM:
> > +		break;
> > +	default:
> > +		dev_dbg(dev, "unsupported mode: %d\n", mode);
> > +		return -EINVAL;
> > +	}
> > +
> > +	down_write(&cxl_dpa_rwsem);
> > +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE)
> > +		goto out;
> > +	/*
> > +	 * Only allow modes that are supported by the current partition
> > +	 * configuration
> > +	 */
> > +	rc = -ENXIO;
> > +	if (mode == CXL_DECODER_PMEM && !resource_size(&cxlds->pmem_res)) {
> > +		dev_dbg(dev, "no available pmem capacity\n");
> > +		goto out;
> > +	}
> > +	if (mode == CXL_DECODER_RAM && !resource_size(&cxlds->ram_res)) {
> > +		dev_dbg(dev, "no available ram capacity\n");
> > +		goto out;
> > +	}
> > +
> > +	cxled->mode = mode;
> > +	rc = 0;
> > +out:
> > +	up_write(&cxl_dpa_rwsem);
> > +
> > +	return rc;
> > +}
> > +
> > +int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
> > +{
> > +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> > +	resource_size_t free_ram_start, free_pmem_start;
> > +	struct cxl_port *port = cxled_to_port(cxled);
> > +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> > +	struct device *dev = &cxled->cxld.dev;
> > +	resource_size_t start, avail, skip;
> > +	struct resource *p, *last;
> > +	int rc = -EBUSY;
> > +
> > +	down_write(&cxl_dpa_rwsem);
> > +	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
> > +		dev_dbg(dev, "decoder enabled\n");
> > +		goto out;
> 
> 
> -EBUSY only used in this path, so clearer to me to push that setting down
> to in this  error path.

Ok.

Folded in these changes:

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 85844f9bc00b..3fa6da73751e 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -207,7 +207,7 @@ Description:
 		address range. The range, base address plus length in bytes, of
 		DPA allocated to this decoder is conveyed in these 2 attributes.
 		Allocations can be mutated as long as the decoder is in the
-		disabled state. A write to 'size' releases the previous DPA
+		disabled state. A write to 'dpa_size' releases the previous DPA
 		allocation and then attempts to allocate from the free capacity
 		in the device partition referred to by 'decoderX.Y/mode'.
 		Allocate and free requests can only be performed on the highest
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 47cf0c286fc3..65bcaecec405 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -22,7 +22,7 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size);
 int cxl_dpa_free(struct cxl_endpoint_decoder *cxled);
 resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
-resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled);
+resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
 
 int cxl_memdev_init(void);
 void cxl_memdev_exit(void);
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 0c1ff3c0142f..e9281557781d 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -270,7 +270,7 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled)
 	return size;
 }
 
-resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled)
+resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled)
 {
 	resource_size_t base = -1;
 
@@ -284,9 +284,9 @@ resource_size_t cxl_dpa_resource(struct cxl_endpoint_decoder *cxled)
 
 int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
 {
-	int rc = -EBUSY;
+	struct cxl_port *port = cxled_to_port(cxled);
 	struct device *dev = &cxled->cxld.dev;
-	struct cxl_port *port = to_cxl_port(dev->parent);
+	int rc;
 
 	down_write(&cxl_dpa_rwsem);
 	if (!cxled->dpa_res) {
@@ -295,11 +295,13 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
 	}
 	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
 		dev_dbg(dev, "decoder enabled\n");
+		rc = -EBUSY;
 		goto out;
 	}
-	if (cxled->cxld.id != port->dpa_end) {
+	if (cxled->cxld.id != port->hdm_end) {
 		dev_dbg(dev, "expected decoder%d.%d\n", port->id,
-			port->dpa_end);
+			port->hdm_end);
+		rc = -EBUSY;
 		goto out;
 	}
 	devm_cxl_dpa_release(cxled);
@@ -315,7 +317,7 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct device *dev = &cxled->cxld.dev;
-	int rc = -EBUSY;
+	int rc;
 
 	switch (mode) {
 	case CXL_DECODER_RAM:
@@ -327,19 +329,23 @@ int cxl_dpa_set_mode(struct cxl_endpoint_decoder *cxled,
 	}
 
 	down_write(&cxl_dpa_rwsem);
-	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE)
+	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
+		rc = -EBUSY;
 		goto out;
+	}
+
 	/*
 	 * Only allow modes that are supported by the current partition
 	 * configuration
 	 */
-	rc = -ENXIO;
 	if (mode == CXL_DECODER_PMEM && !resource_size(&cxlds->pmem_res)) {
 		dev_dbg(dev, "no available pmem capacity\n");
+		rc = -ENXIO;
 		goto out;
 	}
 	if (mode == CXL_DECODER_RAM && !resource_size(&cxlds->ram_res)) {
 		dev_dbg(dev, "no available ram capacity\n");
+		rc = -ENXIO;
 		goto out;
 	}
 
@@ -360,11 +366,12 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, unsigned long long size)
 	struct device *dev = &cxled->cxld.dev;
 	resource_size_t start, avail, skip;
 	struct resource *p, *last;
-	int rc = -EBUSY;
+	int rc;
 
 	down_write(&cxl_dpa_rwsem);
 	if (cxled->cxld.flags & CXL_DECODER_F_ENABLE) {
 		dev_dbg(dev, "decoder enabled\n");
+		rc = -EBUSY;
 		goto out;
 	}
 
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 130989846cce..feed86737202 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -215,7 +215,7 @@ static ssize_t dpa_resource_show(struct device *dev, struct device_attribute *at
 			    char *buf)
 {
 	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);
-	u64 base = cxl_dpa_resource(cxled);
+	u64 base = cxl_dpa_resource_start(cxled);
 
 	return sysfs_emit(buf, "%#llx\n", base);
 }



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux