On Fri, Apr 14, 2023 at 11:54:00AM -0700, Dan Williams wrote: > The CXL specification mandates that 4-byte registers must be accessed > with 4-byte access cycles. CXL 3.0 8.2.3 "Component Register Layout and > Definition" states that the behavior is undefined if (2) 32-bit > registers are accessed as an 8-byte quantity. It turns out that at least > one hardware implementation is sensitive to this in practice. The @size > variable results in zero with: > > size = readq(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which)); > > ...and the correct size with: > > lo = readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which)); > hi = readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(which)); > size = (hi << 32) + lo; > > Fixes: d17d0540a0db ("cxl/core/hdm: Add CXL standard decoder enumeration to the core") > Cc: <stable@xxxxxxxxxxxxxxx> > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> I see you got rid of ioread64_hi_lo(), so this can't be happening anywhere else. Are all the other readl, writel usages known to be OK, or do you need review help against the spec? Reviewed-by: Alison Schofield <alison.schofield@xxxxxxxxx> > --- > drivers/cxl/core/hdm.c | 20 +++++++++++++------- > 1 file changed, 13 insertions(+), 7 deletions(-) > > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c > index 35b338b716fe..6fdf7981ddc7 100644 > --- a/drivers/cxl/core/hdm.c > +++ b/drivers/cxl/core/hdm.c > @@ -1,6 +1,5 @@ > // SPDX-License-Identifier: GPL-2.0-only > /* Copyright(c) 2022 Intel Corporation. All rights reserved. */ > -#include <linux/io-64-nonatomic-hi-lo.h> > #include <linux/seq_file.h> > #include <linux/device.h> > #include <linux/delay.h> > @@ -785,8 +784,8 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, > int *target_map, void __iomem *hdm, int which, > u64 *dpa_base, struct cxl_endpoint_dvsec_info *info) > { > + u64 size, base, skip, dpa_size, lo, hi; > struct cxl_endpoint_decoder *cxled; > - u64 size, base, skip, dpa_size; > bool committed; > u32 remainder; > int i, rc; > @@ -801,8 +800,12 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, > which, info); > > ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which)); > - base = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which)); > - size = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which)); > + lo = readl(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which)); > + hi = readl(hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(which)); > + base = (hi << 32) + lo; > + lo = readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which)); > + hi = readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(which)); > + size = (hi << 32) + lo; > committed = !!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED); > cxld->commit = cxl_decoder_commit; > cxld->reset = cxl_decoder_reset; > @@ -865,8 +868,9 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, > return rc; > > if (!info) { > - target_list.value = > - ioread64_hi_lo(hdm + CXL_HDM_DECODER0_TL_LOW(which)); > + lo = readl(hdm + CXL_HDM_DECODER0_TL_LOW(which)); > + hi = readl(hdm + CXL_HDM_DECODER0_TL_HIGH(which)); > + target_list.value = (hi << 32) + lo; > for (i = 0; i < cxld->interleave_ways; i++) > target_map[i] = target_list.target_id[i]; > > @@ -883,7 +887,9 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, > port->id, cxld->id, size, cxld->interleave_ways); > return -ENXIO; > } > - skip = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_SKIP_LOW(which)); > + lo = readl(hdm + CXL_HDM_DECODER0_SKIP_LOW(which)); > + hi = readl(hdm + CXL_HDM_DECODER0_SKIP_HIGH(which)); > + skip = (hi << 32) + lo; > cxled = to_cxl_endpoint_decoder(&cxld->dev); > rc = devm_cxl_dpa_reserve(cxled, *dpa_base + skip, dpa_size, skip); > if (rc) { >