Hi Yicong, On Wed, Aug 02, 2023 at 10:39:04AM +0800, Yicong Yang wrote: > On 2023/8/2 6:15, Andi Shyti wrote: > > Hi Yicong, > > > > On Tue, Aug 01, 2023 at 08:46:25PM +0800, Yicong Yang wrote: > >> From: Yicong Yang <yangyicong@xxxxxxxxxxxxx> > >> > >> The controller may be shared with other port, for example the firmware. > >> Handle the interrupt from other sources will cause crash since some > >> data are not initialized. So only handle the interrupt of the driver's > >> transfer and discard others. > >> > >> Signed-off-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx> > > > > Is this a fix? Then, could you please add: > > > > Fixes: d62fbdb99a85 ("i2c: add support for HiSilicon I2C controller") > > Cc: <stable@xxxxxxxxxxxxxxx> # v5.13+ > > > > What kind of crash is this? Is it a NULL pointer dereference? > > I not quite sure this is a fix of the driver. On some use case the controller is > shared between the firmware and the OS and we have no synchronization method > from the hardware. A transfer started by the firmware cause the interrupt handled > by the driver and cause a NULL pointer dereference. So that the firmware is running on a controller and memory, concurrently to the main CPU; which means that you are having some kind of bus arbitration issue with two masters on the bus. Anyway, if we are talking about avoiding a NULL pointer dereference then this is a fix and you need to add the tags above. (No need to resend for this, I can do it for you if you convince me on the part below.) > >> --- > >> drivers/i2c/busses/i2c-hisi.c | 8 ++++++++ > >> 1 file changed, 8 insertions(+) > >> > >> diff --git a/drivers/i2c/busses/i2c-hisi.c b/drivers/i2c/busses/i2c-hisi.c > >> index e067671b3ce2..8328da4bc3ec 100644 > >> --- a/drivers/i2c/busses/i2c-hisi.c > >> +++ b/drivers/i2c/busses/i2c-hisi.c > >> @@ -330,6 +330,14 @@ static irqreturn_t hisi_i2c_irq(int irq, void *context) > >> struct hisi_i2c_controller *ctlr = context; > >> u32 int_stat; > >> > >> + /* > >> + * Don't handle the interrupt if cltr->completion is NULL. We may > >> + * reach here because the interrupt is spurious or the transfer is > >> + * started by another port rather than us. > >> + */ > >> + if (!ctlr->completion) > >> + return IRQ_NONE; > > > > Is this the place you should really check for completion being > > NULL? By reading the code I don't exclude that completion at this > > stage might be NULL. > > > > Can it be that the real fix is this one instead: > > Maybe not. If we handle the case as late as below, we'll operate the hardware > which should be handled by the firmware which start the transfer. So we check > it as early as possible. But if i2c_master_xfer() is not called and we receive an irq, most probably ctrl->completion is NULL. Right? Can this happen? I can't really tell the sequence for enabling/disabling the interrupt in this device. They might happen in hisi_i2c_start_xfer() for enabling and in hisi_i2c_xfer_msg() for desabling at the last message; which makes the scenario above a bit difficult, indeed. Thanks for the explanation, Andi > > @@ -352,7 +352,7 @@ static irqreturn_t hisi_i2c_irq(int irq, void *context) > > * Only use TRANS_CPLT to indicate the completion. On error cases we'll > > * get two interrupts, INT_ERR first then TRANS_CPLT. > > */ > > - if (int_stat & HISI_I2C_INT_TRANS_CPLT) { > > + if (ctrl->completion && (int_stat & HISI_I2C_INT_TRANS_CPLT)) { > > hisi_i2c_disable_int(ctlr, HISI_I2C_INT_ALL); > > hisi_i2c_clear_int(ctlr, HISI_I2C_INT_ALL); > > complete(ctlr->completion); > > > > Anyway, this whole completion management smells a bit racy to me. > > > > Andi > > > >> int_stat = readl(ctlr->iobase + HISI_I2C_INT_MSTAT); > >> hisi_i2c_clear_int(ctlr, int_stat); > >> if (!(int_stat & HISI_I2C_INT_ALL)) > >> -- > >> 2.24.0 > >> > > . > >