On 2023/8/5 7:30, Andi Shyti wrote: > Hi Yicong, > > On Wed, Aug 02, 2023 at 10:39:04AM +0800, Yicong Yang wrote: >> On 2023/8/2 6:15, Andi Shyti wrote: >>> Hi Yicong, >>> >>> On Tue, Aug 01, 2023 at 08:46:25PM +0800, Yicong Yang wrote: >>>> From: Yicong Yang <yangyicong@xxxxxxxxxxxxx> >>>> >>>> The controller may be shared with other port, for example the firmware. >>>> Handle the interrupt from other sources will cause crash since some >>>> data are not initialized. So only handle the interrupt of the driver's >>>> transfer and discard others. >>>> >>>> Signed-off-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx> >>> >>> Is this a fix? Then, could you please add: >>> >>> Fixes: d62fbdb99a85 ("i2c: add support for HiSilicon I2C controller") >>> Cc: <stable@xxxxxxxxxxxxxxx> # v5.13+ >>> >>> What kind of crash is this? Is it a NULL pointer dereference? >> >> I not quite sure this is a fix of the driver. On some use case the controller is >> shared between the firmware and the OS and we have no synchronization method >> from the hardware. A transfer started by the firmware cause the interrupt handled >> by the driver and cause a NULL pointer dereference. > > So that the firmware is running on a controller and memory, > concurrently to the main CPU; which means that you are having > some kind of bus arbitration issue with two masters on the bus. > That is one case. Another case maybe the i2c rtc also locates on the same bus and the efi-rtc will access the i2c bus. > Anyway, if we are talking about avoiding a NULL pointer > dereference then this is a fix and you need to add the tags > above. > > (No need to resend for this, I can do it for you if you convince > me on the part below.) Thanks! > >>>> --- >>>> drivers/i2c/busses/i2c-hisi.c | 8 ++++++++ >>>> 1 file changed, 8 insertions(+) >>>> >>>> diff --git a/drivers/i2c/busses/i2c-hisi.c b/drivers/i2c/busses/i2c-hisi.c >>>> index e067671b3ce2..8328da4bc3ec 100644 >>>> --- a/drivers/i2c/busses/i2c-hisi.c >>>> +++ b/drivers/i2c/busses/i2c-hisi.c >>>> @@ -330,6 +330,14 @@ static irqreturn_t hisi_i2c_irq(int irq, void *context) >>>> struct hisi_i2c_controller *ctlr = context; >>>> u32 int_stat; >>>> >>>> + /* >>>> + * Don't handle the interrupt if cltr->completion is NULL. We may >>>> + * reach here because the interrupt is spurious or the transfer is >>>> + * started by another port rather than us. >>>> + */ >>>> + if (!ctlr->completion) >>>> + return IRQ_NONE; >>> >>> Is this the place you should really check for completion being >>> NULL? By reading the code I don't exclude that completion at this >>> stage might be NULL. >>> >>> Can it be that the real fix is this one instead: >> >> Maybe not. If we handle the case as late as below, we'll operate the hardware >> which should be handled by the firmware which start the transfer. So we check >> it as early as possible. > > But if i2c_master_xfer() is not called and we receive an irq, > most probably ctrl->completion is NULL. Right? Can this happen? > Yes, this is the case. > I can't really tell the sequence for enabling/disabling the > interrupt in this device. They might happen in > hisi_i2c_start_xfer() for enabling and in hisi_i2c_xfer_msg() for > desabling at the last message; which makes the scenario above a > bit difficult, indeed. > The driver will keep the interrupt disabled if no transfer in progress. But since the transfer is driven by the interrupt so if the firmware start the transfer it will enable the interrupt. In such case the driver will receive an interrupt on the Tx FIFO empty, etc and since the transfer is not started by the driver ctlr->completion is not initialized. Thanks, Yicong > Thanks for the explanation, > Andi > >>> @@ -352,7 +352,7 @@ static irqreturn_t hisi_i2c_irq(int irq, void *context) >>> * Only use TRANS_CPLT to indicate the completion. On error cases we'll >>> * get two interrupts, INT_ERR first then TRANS_CPLT. >>> */ >>> - if (int_stat & HISI_I2C_INT_TRANS_CPLT) { >>> + if (ctrl->completion && (int_stat & HISI_I2C_INT_TRANS_CPLT)) { >>> hisi_i2c_disable_int(ctlr, HISI_I2C_INT_ALL); >>> hisi_i2c_clear_int(ctlr, HISI_I2C_INT_ALL); >>> complete(ctlr->completion); >>> >>> Anyway, this whole completion management smells a bit racy to me. >>> >>> Andi >>> >>>> int_stat = readl(ctlr->iobase + HISI_I2C_INT_MSTAT); >>>> hisi_i2c_clear_int(ctlr, int_stat); >>>> if (!(int_stat & HISI_I2C_INT_ALL)) >>>> -- >>>> 2.24.0 >>>> >>> . >>> > . >