Hi Sherry, On Tue, May 10, 2022 at 2:47 AM Sherry Sun <sherry.sun@xxxxxxx> wrote: > > Hi, Bluetooth community: > > When I try to connect the BLE device to i.MX8mm platform on L5.17.0-rc8 kernel, sometimes BLE connection may fail, at which point I can observe an kernel panic (Unable to handle kernel paging request at virtual address 00078024e000003b), then I enabled the KASAN, found there is a use-after-free issue in hci_le_conn_failed(), check the following error logs for details. > There is no such issue with the L5.15 kernel. Has anyone observed the similar issue on L5.17.0-rc8 kernel? Or have any idea regarding this issue? Many thanks. Interesting, this might be a problem with the controller generating a different event than expected: /* BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E page 2261: * * If this event is unmasked and the HCI_LE_Connection_Complete event * is unmasked, only the HCI_LE_Enhanced_Connection_Complete event is * sent when a new connection has been created. */ err = __hci_cmd_sync_status_sk(hdev, HCI_OP_LE_CREATE_CONN, sizeof(cp), &cp, use_enhanced_conn_complete(hdev) ? HCI_EV_LE_ENHANCED_CONN_COMPLETE : HCI_EV_LE_CONN_COMPLETE, conn->conn_timeout, NULL); So if the event is different than expected the event would be processed _before_ create_le_conn_complete which would likely timeout, so perhaps we need a special handling to match multiple events instead of just one, though it worth checking if this is allowed by the spec given that: /* BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E page 1789: * * C24: Mandatory if the LE Controller supports Connection State and either * LE Feature (LL Privacy) or LE Feature (Extended Advertising) is supported */ #define use_enhanced_conn_complete(dev) (ll_privacy_capable(dev) || \ ext_adv_capable(dev)) Perhaps this is not enough since it appears we are getting the enhanced version even if this evaluates to false so perhaps we should test something else, that said we also do: /* If the controller supports LL Privacy feature or LE Extended Adv, * enable the corresponding event. */ if (use_enhanced_conn_complete(hdev)) events[1] |= 0x02; /* LE Enhanced Connection Complete */ So it looks like the controller is generating LE Enhanced Connection Complete even when the host has not unmasked it. > Error logs: > [bluetooth]# connect E3:BD:5B:22:90:68 > Attempting to connect to E3:BD:5B:22:90:68 > [bluetooth]# [ 552.446716] Bluetooth: hci0: Opcode 0x200d failed: -110 > Failed to connect: org.bluez.Error.Failed > [ 552.467725] Bluetooth: hci0: request failed to create LE connection: err -110 > [ 552.481726] ================================================================== > [ 552.488966] BUG: KASAN: use-after-free in hci_le_conn_failed+0x4b0/0x5a0 > [ 552.495695] Read of size 8 at addr ffff0000126da570 by task kworker/u9:0/73 > [ 552.502667] > [ 552.504169] CPU: 2 PID: 73 Comm: kworker/u9:0 Tainted: G O 5.17.0-rc8-next-20220315-00001-g7c4f5ebecd82-dirty #245 > [ 552.515834] Hardware name: FSL i.MX8MM EVK board (DT) > [ 552.520894] Workqueue: hci0 hci_cmd_sync_work > [ 552.525275] Call trace: > [ 552.527727] dump_backtrace.part.0+0x210/0x220 > [ 552.532190] show_stack+0x18/0x6c > [ 552.535518] dump_stack_lvl+0x68/0x84 > [ 552.539199] print_report+0x160/0x5bc > [ 552.542878] kasan_report+0xa4/0xf0 > [ 552.546378] __asan_report_load8_noabort+0x3c/0x50 > [ 552.551185] hci_le_conn_failed+0x4b0/0x5a0 > [ 552.555378] create_le_conn_complete+0x60/0xb0 > [ 552.559836] hci_cmd_sync_work+0x128/0x210 > [ 552.563945] process_one_work+0x60c/0xf04 > [ 552.567971] worker_thread+0x3bc/0xe1c > [ 552.571731] kthread+0x278/0x2e0 > [ 552.574976] ret_from_fork+0x10/0x20 > [ 552.578568] > [ 552.580065] Allocated by task 518: > [ 552.583474] kasan_save_stack+0x2c/0x54 > [ 552.587323] __kasan_kmalloc+0x90/0xbc > [ 552.591086] hci_conn_add+0x70/0xe50 > [ 552.594670] hci_connect_le_scan+0x23c/0x71c > [ 552.598952] l2cap_chan_connect+0xbe8/0x11ec > [ 552.603238] l2cap_sock_connect+0x298/0x57c > [ 552.607434] __sys_connect+0x1e4/0x240 > [ 552.611198] __arm64_sys_connect+0x6c/0xa0 > [ 552.615306] invoke_syscall+0x6c/0x260 > [ 552.619073] el0_svc_common.constprop.0+0x17c/0x280 > [ 552.623962] do_el0_svc+0xa0/0xd0 > [ 552.627292] el0_svc+0x4c/0x8c > [ 552.630360] el0t_64_sync_handler+0xa4/0x130 > [ 552.634649] el0t_64_sync+0x18c/0x190 > [ 552.638325] > [ 552.639821] Freed by task 476: > [ 552.642882] kasan_save_stack+0x2c/0x54 > [ 552.646731] kasan_set_track+0x2c/0x40 > [ 552.650494] kasan_set_free_info+0x28/0x50 > [ 552.654603] __kasan_slab_free+0xec/0x13c > [ 552.658625] kfree+0xd4/0x280 > [ 552.661605] bt_link_release+0x2c/0x3c > [ 552.665366] device_release+0xe4/0x1f0 > [ 552.669131] kobject_put+0x130/0x244 > [ 552.672722] put_device+0x18/0x30 > [ 552.676049] hci_conn_cleanup+0x230/0x4d4 > [ 552.680068] hci_conn_del+0x46c/0x4a4 > [ 552.683740] hci_le_conn_failed+0x318/0x5a0 > [ 552.687935] le_conn_complete_evt.part.0+0x778/0x15a0 > [ 552.693008] hci_le_enh_conn_complete_evt+0x224/0x330 > [ 552.698075] hci_le_meta_evt+0x1cc/0x40c > [ 552.702011] hci_event_packet+0x3e0/0x165c > [ 552.706123] hci_rx_work+0x2e0/0x580 > [ 552.709715] process_one_work+0x60c/0xf04 > [ 552.713735] worker_thread+0x3bc/0xe1c > [ 552.717495] kthread+0x278/0x2e0 > [ 552.720739] ret_from_fork+0x10/0x20 > [ 552.724330] > [ 552.725824] Last potentially related work creation: > [ 552.730708] kasan_save_stack+0x2c/0x54 > [ 552.734557] __kasan_record_aux_stack+0xa0/0xd0 > [ 552.739100] kasan_record_aux_stack_noalloc+0x14/0x20 > [ 552.744168] __queue_work+0x428/0xd14 > [ 552.747846] __queue_delayed_work+0x144/0x220 > [ 552.752216] queue_delayed_work_on+0xa0/0xe0 > [ 552.756498] l2cap_chan_del+0x4b8/0xb50 > [ 552.760343] l2cap_conn_del+0x254/0x694 > [ 552.764187] l2cap_connect_cfm+0x698/0xb10 > [ 552.768299] hci_le_conn_failed+0x2c4/0x5a0 > [ 552.772494] le_conn_complete_evt.part.0+0x778/0x15a0 > [ 552.777565] hci_le_enh_conn_complete_evt+0x224/0x330 > [ 552.782630] hci_le_meta_evt+0x1cc/0x40c > [ 552.786567] hci_event_packet+0x3e0/0x165c > [ 552.790680] hci_rx_work+0x2e0/0x580 > [ 552.794273] process_one_work+0x60c/0xf04 > [ 552.798292] worker_thread+0x3bc/0xe1c > [ 552.802056] kthread+0x278/0x2e0 > [ 552.805295] ret_from_fork+0x10/0x20 > [ 552.808883] > [ 552.810380] Second to last potentially related work creation: > [ 552.816133] kasan_save_stack+0x2c/0x54 > [ 552.819984] __kasan_record_aux_stack+0xa0/0xd0 > [ 552.824526] kasan_record_aux_stack_noalloc+0x14/0x20 > [ 552.829592] __queue_work+0x428/0xd14 > [ 552.833264] delayed_work_timer_fn+0x68/0xa0 > [ 552.837546] call_timer_fn.constprop.0+0x54/0x15c > [ 552.842269] __run_timers.part.0+0x3f4/0x604 > [ 552.846553] run_timer_softirq+0x78/0x110 > [ 552.850580] _stext+0x29c/0x7d8 > [ 552.853734] > [ 552.855229] The buggy address belongs to the object at ffff0000126da000 > [ 552.855229] which belongs to the cache kmalloc-2k of size 2048 > [ 552.867759] The buggy address is located 1392 bytes inside of > [ 552.867759] 2048-byte region [ffff0000126da000, ffff0000126da800) > [ 552.879683] > [ 552.881180] The buggy address belongs to the physical page: > [ 552.886761] page:00000000bc19f159 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x526d8 > [ 552.896083] head:00000000bc19f159 order:3 compound_mapcount:0 compound_pincount:0 > [ 552.903576] flags: 0x3fffc0000010200(slab|head|node=0|zone=0|lastcpupid=0xffff) > [ 552.910910] raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff000008002900 > [ 552.918663] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000 > [ 552.926413] page dumped because: kasan: bad access detected > [ 552.931989] > [ 552.933486] Memory state around the buggy address: > [ 552.938285] ffff0000126da400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > [ 552.945518] ffff0000126da480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > [ 552.952752] >ffff0000126da500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > [ 552.959980] ^ > [ 552.966862] ffff0000126da580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > [ 552.974093] ffff0000126da600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > [ 552.981323] ================================================================== > [ 552.988700] Disabling lock debugging due to kernel taint > [ 552.994223] hci0 hcon 00000000c12ea080 handle 65535 > [ 552.994267] Unable to handle kernel paging request at virtual address 00078024e000003b > [ 553.007176] Mem abort info: > [ 553.009986] ESR = 0x96000004 > [ 553.013078] EC = 0x25: DABT (current EL), IL = 32 bits > [ 553.018460] SET = 0, FnV = 0 > [ 553.021615] EA = 0, S1PTW = 0 > [ 553.024845] FSC = 0x04: level 0 translation fault > [ 553.029772] Data abort info: > [ 553.032726] ISV = 0, ISS = 0x00000004 > [ 553.036617] CM = 0, WnR = 0 > [ 553.039651] [00078024e000003b] address between user and kernel address ranges > [ 553.046850] Internal error: Oops: 96000004 [#1] PREEMPT SMP > [ 553.052450] Modules linked in: moal(O) mlan(O) > [ 553.056948] CPU: 3 PID: 73 Comm: kworker/u9:0 Tainted: G B O 5.17.0-rc8-next-20220315-00001-g7c4f5ebecd82-dirty #245 > [ 553.056965] Hardware name: FSL i.MX8MM EVK board (DT) > [ 553.056974] Workqueue: hci0 hci_cmd_sync_work > [ 553.073677] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) > [ 553.085004] pc : hci_conn_cleanup+0x128/0x4d4 > [ 553.085019] lr : hci_conn_cleanup+0x42c/0x4d4 > [ 553.093734] sp : ffff80000d797ae0 > [ 553.093740] x29: ffff80000d797ae0 x28: ffff00000e85f938 x27: 0000000000000002 > [ 553.104206] x26: 1fffe00001d0bf27 x25: ffff000010a78000 x24: ffff000010a78000 > [ 553.104227] x23: dfff800000000000 x22: ffff0000126da0e8 x21: ffff00000f98d000 > [ 553.104249] x20: ffff0000126da000 x19: a0400127000001dc x18: 0000000000000000 > [ 553.118534] x17: ffff80000ac1f008 x16: ffff80000ac1ed74 x15: 0000000000000007 > [ 553.132821] x14: 0000000000000000 x13: 0000000000000007 x12: ffff6000024db41d > [ 553.139981] x11: 1fffe000024db41c x10: ffff6000024db41c x9 : dfff800000000000 > [ 553.140000] x8 : ffff0000126da0e4 x7 : 0000000000000001 x6 : 0000000000000004 > [ 553.154287] x5 : ffff0000126da0e4 x4 : 0000000000000000 x3 : ffff80000ab714cc > [ 553.154318] x2 : 0000000000000001 x1 : 14080024e000003b x0 : dfff800000000000 > [ 553.168614] Call trace: > [ 553.168625] hci_conn_cleanup+0x128/0x4d4 > [ 553.175094] hci_conn_del+0x46c/0x4a4 > [ 553.178775] hci_le_conn_failed+0x318/0x5a0 > [ 553.182970] create_le_conn_complete+0x60/0xb0 > [ 553.187425] hci_cmd_sync_work+0x128/0x210 > > Message from syslogd@ at Fri Nov 19 17:28:03 2021 ... > : Internal error: Oops: 96000004 [#1] PREEMPT SMP > [ 553.201344] process_one_work+0x60c/0xf04 > [ 553.205379] worker_thread+0x3bc/0xe1c > [ 553.209137] kthread+0x278/0x2e0 > [ 553.212378] ret_from_fork+0x10/0x20 > [ 553.215973] Code: d343fe61 d2d00000 f2fbffe0 f90006b3 (38e06820) > [ 553.222079] ---[ end trace 0000000000000000 ]--- > > Message from syslogd@ at Fri Nov 19 17:28:03 2021 ... > : Code: d343fe61 d2d00000 f2fbffe0 f90006b3 (38e06820) > > Best regards > Sherry > -- Luiz Augusto von Dentz