Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@xxxxxxxxxxxxxxx>

On 08.12.2023 17:31, Jeffrey Hugo wrote:
> The SOC_HW_VERSION register in the BHI space is not correctly initialized
> by the device and in many cases contains uninitialized data. The register
> could contain 0xFFFFFFFF which is a special value to indicate a link
> error in PCIe, therefore if observed, we could incorrectly think the
> device is down.
>
> Intercept reads for this register, and provide the correct value - every
> production instance would read 0x60110200 if the device was operating as
> intended.
>
> Fixes: a36bf7af868b ("accel/qaic: Add MHI controller")
> Signed-off-by: Jeffrey Hugo <quic_jhugo@xxxxxxxxxxx>
> Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@xxxxxxxxxxx>
> ---
>  drivers/accel/qaic/mhi_controller.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/accel/qaic/mhi_controller.c b/drivers/accel/qaic/mhi_controller.c
> index 5036e58e7235..1405623b03e4 100644
> --- a/drivers/accel/qaic/mhi_controller.c
> +++ b/drivers/accel/qaic/mhi_controller.c
> @@ -404,8 +404,21 @@ static struct mhi_controller_config aic100_config = {
>
>  static int mhi_read_reg(struct mhi_controller *mhi_cntrl, void __iomem *addr, u32 *out)
>  {
> -	u32 tmp = readl_relaxed(addr);
> +	u32 tmp;
>
> +	/*
> +	 * SOC_HW_VERSION quirk
> +	 * The SOC_HW_VERSION register (offset 0x224) is not reliable and
> +	 * may contain uninitialized values, including 0xFFFFFFFF. This could
> +	 * cause a false positive link down error. Instead, intercept any
> +	 * reads and provide the correct value of the register.
> +	 */
> +	if (addr - mhi_cntrl->regs == 0x224) {
> +		*out = 0x60110200;
> +		return 0;
> +	}
> +
> +	tmp = readl_relaxed(addr);
>  	if (tmp == U32_MAX)
>  		return -EIO;
>
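
For anyone who wants to check the intercept logic outside the kernel, here is a minimal userspace sketch of the quirk. It is not the driver code: sim_read_reg(), the plain array standing in for the register space, and the two macro names are hypothetical, and the final "*out = tmp; return 0;" tail is an assumption about the part of mhi_read_reg() that the hunk does not show. Only the offset 0x224, the value 0x60110200 and the all-ones check come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Offset and value taken from the quoted patch; the macro names are made up. */
#define SOC_HW_VERSION_OFFSET 0x224u
#define SOC_HW_VERSION_VALUE  0x60110200u

/*
 * Hypothetical userspace stand-in for mhi_read_reg(): 'regs' is a plain
 * array modelling the register space and 'offset' is a byte offset into it.
 */
static int sim_read_reg(const uint32_t *regs, size_t offset, uint32_t *out)
{
	uint32_t tmp;

	assert(regs != NULL && out != NULL);

	/* Quirk: skip the unreliable register and return the known-good value. */
	if (offset == SOC_HW_VERSION_OFFSET) {
		*out = SOC_HW_VERSION_VALUE;
		return 0;
	}

	tmp = regs[offset / sizeof(uint32_t)];
	if (tmp == UINT32_MAX)
		return -EIO;	/* all-ones is treated as a link error */

	/* Assumed tail of the function; not visible in the quoted hunk. */
	*out = tmp;
	return 0;
}
```

With the register at 0x224 deliberately set to all-ones, a read of that offset still succeeds and yields 0x60110200, while an all-ones value at any other offset is reported as -EIO.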