On Mon, 2024-01-15 at 02:03 +0000, Ping-Ke Shih wrote: > > > > -----Original Message----- > > From: Deren Wu <deren.wu@xxxxxxxxxxxx> > > Sent: Saturday, January 13, 2024 5:00 PM > > To: Felix Fietkau <nbd@xxxxxxxx>; Lorenzo Bianconi < > lorenzo@xxxxxxxxxx> > > Cc: Sean Wang <sean.wang@xxxxxxxxxxxx>; Soul Huang < > Soul.Huang@xxxxxxxxxxxx>; Ming Yen Hsieh > > <mingyen.hsieh@xxxxxxxxxxxx>; Leon Yen <Leon.Yen@xxxxxxxxxxxx>; > Eric-SY Chang > > <Eric-SY.Chang@xxxxxxxxxxxx>; KM Lin <km.lin@xxxxxxxxxxxx>; Robin > Chiu <robin.chiu@xxxxxxxxxxxx>; CH Yeh > > <ch.yeh@xxxxxxxxxxxx>; Posh Sun <posh.sun@xxxxxxxxxxxx>; Quan Zhou > <quan.zhou@xxxxxxxxxxxx>; Ryder Lee > > <ryder.lee@xxxxxxxxxxxx>; Shayne Chen <shayne.chen@xxxxxxxxxxxx>; > linux-wireless > > <linux-wireless@xxxxxxxxxxxxxxx>; linux-mediatek < > linux-mediatek@xxxxxxxxxxxxxxxxxxx>; Deren Wu > > <deren.wu@xxxxxxxxxxxx> > > Subject: [PATCH 1/2] wifi: mt76: mt7921e: fix use-after-free in > free_irq() > > > > From commit a304e1b82808 ("[PATCH] Debug shared irqs"), there is a > test > > to make sure the shared irq handler should be able to handle the > unexpected > > event after deregistration. For this case, let's apply MT76_REMOVED > flag to > > indicate the device was removed and do not run into the resource > access > > anymore. > > > > BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 > [mt7921e] > > Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115 > > CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 > #10 > > Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I > > EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024 > > Call Trace: > > <TASK> > > dump_stack_lvl+0x6f/0xa0 > > print_address_description.constprop.0+0x1f/0x190 > > ? mt7921_irq_handler+0xd8/0x100 [mt7921e] > > ? mt7921_irq_handler+0xd8/0x100 [mt7921e] > > kasan_report.cold+0x7f/0x11b > > ? mt7921_irq_handler+0xd8/0x100 [mt7921e] > > mt7921_irq_handler+0xd8/0x100 [mt7921e] > > free_irq+0x627/0xaa0 > > devm_free_irq+0x94/0xd0 > > ? devm_request_any_context_irq+0x160/0x160 > > ? kobject_put+0x18d/0x4a0 > > mt7921_pci_remove+0x153/0x190 [mt7921e] > > pci_device_remove+0xa2/0x1d0 > > __device_release_driver+0x346/0x6e0 > > driver_detach+0x1ef/0x2c0 > > bus_remove_driver+0xe7/0x2d0 > > ? __check_object_size+0x57/0x310 > > pci_unregister_driver+0x26/0x250 > > __do_sys_delete_module+0x307/0x510 > > ? free_module+0x6a0/0x6a0 > > ? fpregs_assert_state_consistent+0x4b/0xb0 > > ? rcu_read_lock_sched_held+0x10/0x70 > > ? syscall_enter_from_user_mode+0x20/0x70 > > ? trace_hardirqs_on+0x1c/0x130 > > do_syscall_64+0x5c/0x80 > > ? trace_hardirqs_on_prepare+0x72/0x160 > > ? do_syscall_64+0x68/0x80 > > ? trace_hardirqs_on_prepare+0x72/0x160 > > entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx> > > Closes: > > > https://lore.kernel.org/linux-wireless/CABXGCsOdvVwdLmSsC8TZ1jF0UOg_F_W3wqLECWX620PUkvNk=A@mail.gmail > . > > com/ > > Fixes: 9270270d6219 ("wifi: mt76: mt7921: fix PCI DMA hang after > reboot") > > Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx> > > Signed-off-by: Deren Wu <deren.wu@xxxxxxxxxxxx> > > --- > > drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 1 + > > drivers/net/wireless/mediatek/mt76/mt792x_dma.c | 2 ++ > > 2 files changed, 3 insertions(+) > > > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > > b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > > index 57903c6e4f11..2f04d6658b6b 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > > @@ -387,6 +387,7 @@ static void mt7921_pci_remove(struct pci_dev > *pdev) > > struct mt792x_dev *dev = container_of(mdev, struct > mt792x_dev, mt76); > > > > mt7921e_unregister_device(dev); > > + set_bit(MT76_REMOVED, &mdev->phy.state); > > Can it do below like mt7921_pci_suspend() to safely stop interrupt > handler? > Instead of setting a flag. > > synchronize_irq(pdev->irq); > tasklet_kill(&mdev->irq_tasklet); > Here is the snapshot. The code is trying to direct access this irq handler after deregisering, for IRQF_SHARED case. synchronize_irq() and tasklet_kill() are all done in previous steps. We need to stop the extra call here. If there are any alternative, that would be appreciated. /* * It's a shared IRQ -- the driver ought to be prepared for an IRQ * event to happen even now it's being freed, so let's make sure that * is so by doing an extra call to the handler .... * * ( We do this after actually deregistering it, to make sure that a * 'real' IRQ doesn't run in parallel with our fake. ) */ if (action->flags & IRQF_SHARED) { local_irq_save(flags); action->handler(irq, dev_id); local_irq_restore(flags); } https://elixir.bootlin.com/linux/v6.7/source/kernel/irq/manage.c#L1949 > > > devm_free_irq(&pdev->dev, pdev->irq, dev); > > mt76_free_device(&dev->mt76); > > pci_free_irq_vectors(pdev); > > diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c > > b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c > > index 488326ce5ed4..3893dbe866fe 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt792x_dma.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt792x_dma.c > > @@ -12,6 +12,8 @@ irqreturn_t mt792x_irq_handler(int irq, void > *dev_instance) > > { > > struct mt792x_dev *dev = dev_instance; > > > > If PCI is removed, is it still safe to access 'dev_instance'? > For this case, we are running into devm_free_irq() and the instance is still alive. The irq handler should be destroyed before the PCI is removed. > > + if (test_bit(MT76_REMOVED, &dev->mt76.phy.state)) > > + return IRQ_NONE; > > mt76_wr(dev, dev->irq_map->host_irq_enable, 0); > > > > if (!test_bit(MT76_STATE_INITIALIZED, &dev->mphy.state)) > > -- > > 2.18.0 > > >