Patch "net/mlx5: DPLL, Fix possible use after free after delayed work timer triggers" has been added to the 6.7-stable tree

This is a note to let you know that I've just added the patch titled

    net/mlx5: DPLL, Fix possible use after free after delayed work timer triggers

to the 6.7-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     net-mlx5-dpll-fix-possible-use-after-free-after-dela.patch
and it can be found in the queue-6.7 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 43ade627a4c0a618cbc98d52f1153c24e473e583
Author: Jiri Pirko <jiri@xxxxxxxxxxx>
Date:   Tue Feb 6 17:43:28 2024 +0100

    net/mlx5: DPLL, Fix possible use after free after delayed work timer triggers
    
    [ Upstream commit aa1eec2f546f2afa8c98ec41e5d8ee488165d685 ]
    
    I recently hit the following use-after-free warning:
    
    [ 2169.711665] ==================================================================
    [ 2169.714009] BUG: KASAN: slab-use-after-free in __run_timers.part.0+0x179/0x4c0
    [ 2169.716293] Write of size 8 at addr ffff88812b326a70 by task swapper/4/0
    
    [ 2169.719022] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 6.8.0-rc2jiri+ #2
    [ 2169.720974] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    [ 2169.722457] Call Trace:
    [ 2169.722756]  <IRQ>
    [ 2169.723024]  dump_stack_lvl+0x58/0xb0
    [ 2169.723417]  print_report+0xc5/0x630
    [ 2169.723807]  ? __virt_addr_valid+0x126/0x2b0
    [ 2169.724268]  kasan_report+0xbe/0xf0
    [ 2169.724667]  ? __run_timers.part.0+0x179/0x4c0
    [ 2169.725116]  ? __run_timers.part.0+0x179/0x4c0
    [ 2169.725570]  __run_timers.part.0+0x179/0x4c0
    [ 2169.726003]  ? call_timer_fn+0x320/0x320
    [ 2169.726404]  ? lock_downgrade+0x3a0/0x3a0
    [ 2169.726820]  ? kvm_clock_get_cycles+0x14/0x20
    [ 2169.727257]  ? ktime_get+0x92/0x150
    [ 2169.727630]  ? lapic_next_deadline+0x35/0x60
    [ 2169.728069]  run_timer_softirq+0x40/0x80
    [ 2169.728475]  __do_softirq+0x1a1/0x509
    [ 2169.728866]  irq_exit_rcu+0x95/0xc0
    [ 2169.729241]  sysvec_apic_timer_interrupt+0x6b/0x80
    [ 2169.729718]  </IRQ>
    [ 2169.729993]  <TASK>
    [ 2169.730259]  asm_sysvec_apic_timer_interrupt+0x16/0x20
    [ 2169.730755] RIP: 0010:default_idle+0x13/0x20
    [ 2169.731190] Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 8b 05 9a 7f 1f 02 85 c0 7e 07 0f 00 2d cf 69 43 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 c0 93 04 00
    [ 2169.732759] RSP: 0018:ffff888100dbfe10 EFLAGS: 00000242
    [ 2169.733264] RAX: 0000000000000001 RBX: ffff888100d9c200 RCX: ffffffff8241bd62
    [ 2169.733925] RDX: ffffed109a848b15 RSI: 0000000000000004 RDI: ffffffff8127ac55
    [ 2169.734566] RBP: 0000000000000004 R08: 0000000000000000 R09: ffffed109a848b14
    [ 2169.735200] R10: ffff8884d42458a3 R11: 000000000000ba7e R12: ffffffff83d7d3a0
    [ 2169.735835] R13: 1ffff110201b7fc6 R14: 0000000000000000 R15: ffff888100d9c200
    [ 2169.736478]  ? ct_kernel_exit.constprop.0+0xa2/0xc0
    [ 2169.736954]  ? do_idle+0x285/0x290
    [ 2169.737323]  default_idle_call+0x63/0x90
    [ 2169.737730]  do_idle+0x285/0x290
    [ 2169.738089]  ? arch_cpu_idle_exit+0x30/0x30
    [ 2169.738511]  ? mark_held_locks+0x1a/0x80
    [ 2169.738917]  ? lockdep_hardirqs_on_prepare+0x12e/0x200
    [ 2169.739417]  cpu_startup_entry+0x30/0x40
    [ 2169.739825]  start_secondary+0x19a/0x1c0
    [ 2169.740229]  ? set_cpu_sibling_map+0xbd0/0xbd0
    [ 2169.740673]  secondary_startup_64_no_verify+0x15d/0x16b
    [ 2169.741179]  </TASK>
    
    [ 2169.741686] Allocated by task 1098:
    [ 2169.742058]  kasan_save_stack+0x1c/0x40
    [ 2169.742456]  kasan_save_track+0x10/0x30
    [ 2169.742852]  __kasan_kmalloc+0x83/0x90
    [ 2169.743246]  mlx5_dpll_probe+0xf5/0x3c0 [mlx5_dpll]
    [ 2169.743730]  auxiliary_bus_probe+0x62/0xb0
    [ 2169.744148]  really_probe+0x127/0x590
    [ 2169.744534]  __driver_probe_device+0xd2/0x200
    [ 2169.744973]  device_driver_attach+0x6b/0xf0
    [ 2169.745402]  bind_store+0x90/0xe0
    [ 2169.745761]  kernfs_fop_write_iter+0x1df/0x2a0
    [ 2169.746210]  vfs_write+0x41f/0x790
    [ 2169.746579]  ksys_write+0xc7/0x160
    [ 2169.746947]  do_syscall_64+0x6f/0x140
    [ 2169.747333]  entry_SYSCALL_64_after_hwframe+0x46/0x4e
    
    [ 2169.748049] Freed by task 1220:
    [ 2169.748393]  kasan_save_stack+0x1c/0x40
    [ 2169.748789]  kasan_save_track+0x10/0x30
    [ 2169.749188]  kasan_save_free_info+0x3b/0x50
    [ 2169.749621]  poison_slab_object+0x106/0x180
    [ 2169.750044]  __kasan_slab_free+0x14/0x50
    [ 2169.750451]  kfree+0x118/0x330
    [ 2169.750792]  mlx5_dpll_remove+0xf5/0x110 [mlx5_dpll]
    [ 2169.751271]  auxiliary_bus_remove+0x2e/0x40
    [ 2169.751694]  device_release_driver_internal+0x24b/0x2e0
    [ 2169.752191]  unbind_store+0xa6/0xb0
    [ 2169.752563]  kernfs_fop_write_iter+0x1df/0x2a0
    [ 2169.753004]  vfs_write+0x41f/0x790
    [ 2169.753381]  ksys_write+0xc7/0x160
    [ 2169.753750]  do_syscall_64+0x6f/0x140
    [ 2169.754132]  entry_SYSCALL_64_after_hwframe+0x46/0x4e
    
    [ 2169.754847] Last potentially related work creation:
    [ 2169.755315]  kasan_save_stack+0x1c/0x40
    [ 2169.755709]  __kasan_record_aux_stack+0x9b/0xf0
    [ 2169.756165]  __queue_work+0x382/0x8f0
    [ 2169.756552]  call_timer_fn+0x126/0x320
    [ 2169.756941]  __run_timers.part.0+0x2ea/0x4c0
    [ 2169.757376]  run_timer_softirq+0x40/0x80
    [ 2169.757782]  __do_softirq+0x1a1/0x509
    
    [ 2169.758387] Second to last potentially related work creation:
    [ 2169.758924]  kasan_save_stack+0x1c/0x40
    [ 2169.759322]  __kasan_record_aux_stack+0x9b/0xf0
    [ 2169.759773]  __queue_work+0x382/0x8f0
    [ 2169.760156]  call_timer_fn+0x126/0x320
    [ 2169.760550]  __run_timers.part.0+0x2ea/0x4c0
    [ 2169.760978]  run_timer_softirq+0x40/0x80
    [ 2169.761381]  __do_softirq+0x1a1/0x509
    
    [ 2169.761998] The buggy address belongs to the object at ffff88812b326a00
                    which belongs to the cache kmalloc-256 of size 256
    [ 2169.763061] The buggy address is located 112 bytes inside of
                    freed 256-byte region [ffff88812b326a00, ffff88812b326b00)
    
    [ 2169.764346] The buggy address belongs to the physical page:
    [ 2169.764866] page:000000000f2b1e89 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12b324
    [ 2169.765731] head:000000000f2b1e89 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
    [ 2169.766484] anon flags: 0x200000000000840(slab|head|node=0|zone=2)
    [ 2169.767048] page_type: 0xffffffff()
    [ 2169.767422] raw: 0200000000000840 ffff888100042b40 0000000000000000 dead000000000001
    [ 2169.768183] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
    [ 2169.768899] page dumped because: kasan: bad access detected
    
    [ 2169.769649] Memory state around the buggy address:
    [ 2169.770116]  ffff88812b326900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [ 2169.770805]  ffff88812b326980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [ 2169.771485] >ffff88812b326a00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 2169.772173]                                                              ^
    [ 2169.772787]  ffff88812b326a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ 2169.773477]  ffff88812b326b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [ 2169.774160] ==================================================================
    [ 2169.774845] ==================================================================
    
    I did not manage to reproduce it, but the cause seems clear.
    mlx5_dpll_remove() may call cancel_delayed_work() while the work is
    running, and the work then re-arms itself. In that case, when the
    delay timer later fires to queue the work again, it operates on
    freed memory.
    
    Fix this by using cancel_delayed_work_sync() instead, which makes sure
    that the work has finished by the time it returns.
    
    Fixes: 496fd0a26bbf ("mlx5: Implement SyncE support using DPLL infrastructure")
    Signed-off-by: Jiri Pirko <jiri@xxxxxxxxxx>
    Reviewed-by: Simon Horman <horms@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20240206164328.360313-1-jiri@xxxxxxxxxxx
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
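
The interleaving described in the commit message can be replayed with a
small userspace model. This is only a sketch under stated assumptions:
every model_* name below is hypothetical and is not a kernel API, and the
real workqueue code is reduced to three flags. It assumes the handler is
mid-run when the remove path starts and that the handler always re-arms
itself unless a synchronous cancel is in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dwork {
	bool timer_pending; /* delay timer armed; it queues the work when it fires */
	bool running;       /* handler is currently executing */
	bool canceling;     /* a synchronous cancel is in progress */
};

/* Handler tail: the periodic work re-arms itself unless a sync cancel forbids it. */
static void model_handler_finish(struct model_dwork *w)
{
	w->running = false;
	if (!w->canceling)
		w->timer_pending = true;
}

/* Models cancel_delayed_work(): drops a pending timer, never waits. */
static void model_cancel(struct model_dwork *w)
{
	w->timer_pending = false;
}

/* Models cancel_delayed_work_sync(): blocks re-arming, waits, drops the timer. */
static void model_cancel_sync(struct model_dwork *w)
{
	w->canceling = true;
	if (w->running)
		model_handler_finish(w); /* stands in for waiting on the handler */
	w->timer_pending = false;
	w->canceling = false;
}

/*
 * Replays the remove path while the handler is mid-run.
 * Returns true if a timer is still armed at the point where the
 * owning structure would be freed.
 */
static bool model_remove_leaves_timer(bool use_sync)
{
	struct model_dwork w = {
		.timer_pending = false, /* timer already fired to start this run */
		.running = true,
		.canceling = false,
	};

	if (use_sync) {
		model_cancel_sync(&w);
	} else {
		model_cancel(&w);         /* returns at once, handler still running */
		model_handler_finish(&w); /* handler completes and re-arms */
	}
	assert(!w.running);
	return w.timer_pending;
}
```

In this model, model_remove_leaves_timer(false) reports a timer left armed
at free time, which corresponds to the write in __run_timers flagged by
KASAN, while model_remove_leaves_timer(true) does not.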

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dpll.c b/drivers/net/ethernet/mellanox/mlx5/core/dpll.c
index 2cd81bb32c66..8ce5c8bcda1c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dpll.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dpll.c
@@ -374,7 +374,7 @@ static void mlx5_dpll_remove(struct auxiliary_device *adev)
 	struct mlx5_dpll *mdpll = auxiliary_get_drvdata(adev);
 	struct mlx5_core_dev *mdev = mdpll->mdev;
 
-	cancel_delayed_work(&mdpll->work);
+	cancel_delayed_work_sync(&mdpll->work);
 	mlx5_dpll_mdev_netdev_untrack(mdpll, mdev);
 	destroy_workqueue(mdpll->wq);
 	dpll_pin_unregister(mdpll->dpll, mdpll->dpll_pin,



