Re: ucsi debugfs oops (current Linus pre-6.6-rc1)

On 9/5/2023 14:10, Dave Hansen wrote:
I'm having some problems booting Linus's current tree.  The breakage seems
to have come in somewhere between commits 3f86ed6ec0b3 and df0383ffad.

I'm suspecting this commit:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df0383ffad64dc09954a60873c1e202b47f08d90

I'm seeing a null pointer oops on this line:

void ucsi_debugfs_unregister(struct ucsi *ucsi)
{
===>    debugfs_remove_recursive(ucsi->debugfs->dentry);
         kfree(ucsi->debugfs);
}

on this instruction:

     66 0f 1f 00             nop    WORD PTR [rax]
     0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]
     53                      push   rbx
     48 8b 47 38             mov    rax,QWORD PTR [rdi+0x38]
     48 89 fb                mov    rbx,rdi
=>  48 8b 78 20             mov    rdi,QWORD PTR [rax+0x20]
     e8 36 16 26 e1          call   0xffffffffe1261669
     48 8b 7b 38             mov    rdi,QWORD PTR [rbx+0x38]
     5b                      pop    rbx
     e9 5c 79 03 e1          jmp    0xffffffffe1037999

That's the second dereference in the function, so I assume this is
trying to dereference 'debugfs' above.  It appears that this is some
failure/error path out of ucsi_acpi_probe() that's not handled correctly.
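
To make the register arithmetic concrete: the faulting load is
"mov rdi,QWORD PTR [rax+0x20]" with RAX = 0, so the CPU reads address 0x20,
which is exactly what CR2 reports. Below is a minimal user-space sketch of
that calculation. The struct layout and the names sketch_debugfs_entry and
dentry_load_address() are assumptions made up for illustration (chosen so
that the pointer member lands at 0x20 on a 64-bit build); they are not copied
from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dentry; /* opaque here; only a pointer to it is stored */

/*
 * ASSUMPTION: illustrative layout, not the driver's real definition.
 * The fields are picked so that 'dentry' sits at offset 0x20 on LP64.
 */
struct sketch_debugfs_entry {
	uint64_t command;       /* offset 0x00 */
	uint64_t response[2];   /* offset 0x08 */
	uint32_t status;        /* offset 0x18, then 4 bytes of padding */
	struct dentry *dentry;  /* offset 0x20 */
};

static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");

/* Address the CPU would read when loading base->dentry. */
uintptr_t dentry_load_address(const struct sketch_debugfs_entry *base)
{
	return (uintptr_t)base + offsetof(struct sketch_debugfs_entry, dentry);
}
```

With a NULL base, dentry_load_address() yields 0x20, which is consistent
with 'debugfs' itself being NULL rather than 'ucsi' or 'dentry'.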

Probably this:

         if (ACPI_FAILURE(status)) {
                 dev_err(&pdev->dev, "failed to install notify handler\n");
                 ucsi_destroy(ua->ucsi);
                 return -ENODEV;
         }

         ret = ucsi_register(ua->ucsi);

where ucsi_destroy() is called before ucsi_register().  Although I do
_not_ see the dev_err() message anywhere.

If your theory is right, could it be that the printk handler was racing, and that's why the message never showed up?

In any case, I'd think you can add this check at the top of ucsi_debugfs_unregister() to avoid the oops:

if (!ucsi->debugfs)
	return;
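
For anyone who wants to see the effect of that guard outside the kernel,
here is a rough user-space sketch. Everything prefixed with mock_ is a
hypothetical stand-in, not the driver's code, and it assumes (as the theory
above implies) that the debugfs entry is only allocated once registration
runs, so a destroy before register sees a NULL 'debugfs' pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct dentry { int unused; };

struct ucsi_debugfs_entry {
	struct dentry *dentry;
};

struct ucsi {
	struct ucsi_debugfs_entry *debugfs; /* NULL until registration */
};

int removed_dirs; /* counts mock debugfs removals */

/* Mock of debugfs_remove_recursive(): frees the entry and counts the call. */
void mock_debugfs_remove_recursive(struct dentry *dentry)
{
	free(dentry);
	removed_dirs++;
}

/* Mock of the create step: debugfs is not set up yet (calloc zeroes it). */
struct ucsi *mock_ucsi_create(void)
{
	return calloc(1, sizeof(struct ucsi));
}

/* Mock of the debugfs setup assumed to happen during registration. */
int mock_ucsi_debugfs_register(struct ucsi *ucsi)
{
	ucsi->debugfs = calloc(1, sizeof(*ucsi->debugfs));
	if (!ucsi->debugfs)
		return -1;
	ucsi->debugfs->dentry = calloc(1, sizeof(struct dentry));
	return 0;
}

/* Guarded teardown: safe whether or not registration ever happened. */
void mock_ucsi_debugfs_unregister(struct ucsi *ucsi)
{
	if (!ucsi->debugfs)
		return;

	mock_debugfs_remove_recursive(ucsi->debugfs->dentry);
	free(ucsi->debugfs);
	ucsi->debugfs = NULL;
}

/* Mock of the destroy step, which always calls the debugfs teardown. */
void mock_ucsi_destroy(struct ucsi *ucsi)
{
	assert(ucsi != NULL);
	mock_ucsi_debugfs_unregister(ucsi);
	free(ucsi);
}
```

Calling mock_ucsi_destroy() right after mock_ucsi_create() returns without
touching the NULL pointer, while the create, register, destroy sequence
still removes the directory once. The guard only hides the symptom; the
error path that reaches destroy before register would still be worth
tracking down.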


Full oops is below.

I'll try putting some hacks in place to avoid the null pointer.  Also,
please forgive the lack of a bisect for the moment.  This is happening
on my main laptop and it's a mild pain to do bisects on here.

[    4.903493] BUG: kernel NULL pointer dereference, address: 0000000000000020
[    4.905624] #PF: supervisor read access in kernel mode
[    4.907326] #PF: error_code(0x0000) - not-present page
[    4.908993] PGD 0 P4D 0 
[    4.910998] Oops: 0000 [#1] PREEMPT SMP NOPTI
[    4.913077] CPU: 6 PID: 150 Comm: systemd-udevd Not tainted 6.5.0-11704-g3f86ed6ec0b3 #138
[    4.915211] Hardware name: Framework Laptop/FRANBMCP0B, BIOS 03.10 07/19/2022
[    4.917355] RIP: 0010:ucsi_debugfs_unregister+0x11/0x30 [typec_ucsi]
[    4.919705] Code: 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 53 48 8b 47 38 48 89 fb <48> 8b 78 20 e8 36 16 26 e1 48 8b 7b 38 5b e9 5c 79 03 e1 66 66 2e
[    4.921982] RSP: 0018:ffffc900007e7bb8 EFLAGS: 00010246
[    4.924227] RAX: 0000000000000000 RBX: ffff888101b2be00 RCX: 0000000000009a06
[    4.926752] RDX: 0000000000000000 RSI: ffff888104491798 RDI: ffff888101b2be00
[    4.929312] RBP: ffff888101b2be00 R08: 0000000000009906 R09: 00000000000333f0
[    4.931887] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffed
[    4.934451] R13: ffff888102594810 R14: ffff888100653600 R15: ffff888101fa7f78
[    4.937115] FS:  00007f5dd0fb48c0(0000) GS:ffff88906fb80000(0000) knlGS:0000000000000000
[    4.939581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.941308] CR2: 0000000000000020 CR3: 0000000105070005 CR4: 0000000000f70ee0
[    4.943022] PKRU: 55555554
[    4.944731] Call Trace:
[    4.946438]  <TASK>
[    4.948167]  ? __die+0x24/0x70
[    4.949864]  ? page_fault_oops+0x15b/0x440
[    4.951563]  ? acpi_evaluate_object+0x190/0x2f0
[    4.953201]  ? _raw_spin_lock_irqsave+0x28/0x50
[    4.954841]  ? exc_page_fault+0x6e/0x160
[    4.956461]  ? asm_exc_page_fault+0x26/0x30
[    4.958067]  ? ucsi_debugfs_unregister+0x11/0x30 [typec_ucsi]
[    4.959677]  ucsi_destroy+0x12/0x20 [typec_ucsi]
[    4.961298]  ucsi_acpi_probe+0x1cc/0x230 [ucsi_acpi]
[    4.962908]  platform_probe+0x40/0xb0
[    4.964522]  really_probe+0x1a2/0x410
[    4.966110]  __driver_probe_device+0x78/0x160
[    4.967735]  driver_probe_device+0x1e/0x90
[    4.969306]  __driver_attach+0xd6/0x1d0
[    4.970874]  ? __pfx___driver_attach+0x10/0x10
[    4.972449]  bus_for_each_dev+0x79/0xd0
[    4.974022]  bus_add_driver+0x116/0x220
[    4.975600]  driver_register+0x60/0x120
[    4.977169]  ? __pfx_ucsi_acpi_platform_driver_init+0x10/0x10 [ucsi_acpi]
[    4.978762]  do_one_initcall+0x45/0x220
[    4.980367]  ? kmalloc_trace+0x29/0x90
[    4.981952]  do_init_module+0x90/0x260
[    4.983530]  init_module_from_file+0x8b/0xd0
[    4.985087]  idempotent_init_module+0x181/0x240
[    4.986639]  __x64_sys_finit_module+0x5e/0xb0
[    4.988198]  do_syscall_64+0x3c/0x90
[    4.989739]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[    4.991290] RIP: 0033:0x7f5dd16aaa3d
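
One hedged aside on the register dump above: R12 holds 00000000ffffffed,
and if its low 32 bits are read as a signed int they come out as -19, which
is -ENODEV with Linux errno numbering. It is only a guess that R12 carries a
pending return value at this point; the compiler is free to keep anything
there. The helper names below are made up for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* The -19 reading below assumes Linux-style errno numbering. */
static_assert(ENODEV == 19, "sketch assumes ENODEV is 19");

/*
 * Reinterpret the low 32 bits of a register value as a signed int.
 * The narrowing cast wraps on gcc and clang (implementation-defined in C).
 */
int32_t low32_as_signed(uint64_t reg)
{
	return (int32_t)(uint32_t)reg;
}

/* Nonzero if the value falls in the usual negative-errno range. */
int looks_like_errno(int32_t value)
{
	return value < 0 && value >= -4095;
}
```

Feeding 0x00000000ffffffed through low32_as_signed() gives -ENODEV, the same
code the quoted error path returns, although other paths in the probe could
return it too.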



