Re: ucsi debugfs oops (current Linus pre-6.6-rc1)

Hi,

On Tue, Sep 05, 2023 at 02:25:35PM -0500, Mario Limonciello wrote:
> On 9/5/2023 14:10, Dave Hansen wrote:
> > I'm having some problems booting Linus's current tree.  The breakage
> > seems to have been introduced somewhere between commits 3f86ed6ec0b3
> > and df0383ffad.
> > 
> > I'm suspecting this commit:
> > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df0383ffad64dc09954a60873c1e202b47f08d90
> > 
> > I'm seeing a null pointer oops on this line:
> > 
> > void ucsi_debugfs_unregister(struct ucsi *ucsi)
> > {
> > ===>    debugfs_remove_recursive(ucsi->debugfs->dentry);
> >          kfree(ucsi->debugfs);
> > }
> > 
> > on this instruction:
> > 
> >      66 0f 1f 00             nop    WORD PTR [rax]
> >      0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]
> >      53                      push   rbx
> >      48 8b 47 38             mov    rax,QWORD PTR [rdi+0x38]
> >      48 89 fb                mov    rbx,rdi
> > =>  48 8b 78 20             mov    rdi,QWORD PTR [rax+0x20]
> >      e8 36 16 26 e1          call   0xffffffffe1261669
> >      48 8b 7b 38             mov    rdi,QWORD PTR [rbx+0x38]
> >      5b                      pop    rbx
> >      e9 5c 79 03 e1          jmp    0xffffffffe1037999
> > 
> > That's the second dereference in the function, so I assume this is
> > trying to dereference 'debugfs' above.  It appears that this is some
> > failure/error path out of ucsi_acpi_probe() that's not handled correctly.
> > 
> > Probably this:
> > 
> > >          if (ACPI_FAILURE(status)) {
> > >                  dev_err(&pdev->dev, "failed to install notify handler\n");
> > >                  ucsi_destroy(ua->ucsi);
> > >                  return -ENODEV;
> > >          }
> > > 
> > >          ret = ucsi_register(ua->ucsi);
> > 
> > where ucsi_destroy() is called before ucsi_register().  Although I do
> > _not_ see the dev_err() message anywhere.
> 
> If your theory is right could it be that the printk handler was racing and
> that's why it didn't come up?
> 
> In any case I'd think you can add this to ucsi_debugfs_unregister() to avoid
> it.
> 
> if (!ucsi->debugfs)
> 	return;

Thank you guys for the report. I'll prepare a patch for this.
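
For illustration only, here is a minimal user-space sketch of the guard
suggested above. It is not the kernel source and not the final patch: the
struct layouts are simplified stand-ins, and mock_debugfs_remove_recursive()
and removed_calls are hypothetical helpers that replace the real debugfs call
so the behaviour can be checked outside the kernel.

```c
/* User-space mock of a guarded teardown; NOT the kernel implementation. */
#include <assert.h>
#include <stdlib.h>

struct dentry { int unused; };            /* stand-in for the kernel type */

struct ucsi_debugfs_entry {               /* layout simplified for the sketch */
	struct dentry *dentry;
};

struct ucsi {
	struct ucsi_debugfs_entry *debugfs;   /* NULL until the entry is allocated */
};

int removed_calls;                        /* hypothetical: counts mock removals */

/* Hypothetical stand-in for debugfs_remove_recursive(): free and count. */
static void mock_debugfs_remove_recursive(struct dentry *d)
{
	removed_calls++;
	free(d);
}

void ucsi_debugfs_unregister(struct ucsi *ucsi)
{
	assert(ucsi != NULL);                 /* caller must pass a valid object */

	/* Guard: an error path may get here before debugfs was ever set up. */
	if (!ucsi->debugfs)
		return;

	mock_debugfs_remove_recursive(ucsi->debugfs->dentry);
	free(ucsi->debugfs);
	ucsi->debugfs = NULL;                 /* makes a second call harmless */
}
```

With the guard in place, calling ucsi_debugfs_unregister() on an object whose
debugfs pointer is still NULL returns immediately instead of dereferencing it.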


> > Full oops is below.
> > 
> > I'll try putting some hacks in place to avoid the null pointer.  Also,
> > please forgive the lack of a bisect for the moment.  This is happening
> > on my main laptop and it's a mild pain to do bisects on here.
> > 
> > > [    4.903493] BUG: kernel NULL pointer dereference, address: 0000000000000020
> > > [    4.905624] #PF: supervisor read access in kernel mode
> > > [    4.907326] #PF: error_code(0x0000) - not-present page
> > > [    4.908993] PGD 0 P4D 0 
> > > [    4.910998] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > [    4.913077] CPU: 6 PID: 150 Comm: systemd-udevd Not tainted 6.5.0-11704-g3f86ed6ec0b3 #138
> > > [    4.915211] Hardware name: Framework Laptop/FRANBMCP0B, BIOS 03.10 07/19/2022
> > > [    4.917355] RIP: 0010:ucsi_debugfs_unregister+0x11/0x30 [typec_ucsi]
> > > [    4.919705] Code: 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 53 48 8b 47 38 48 89 fb <48> 8b 78 20 e8 36 16 26 e1 48 8b 7b 38 5b e9 5c 79 03 e1 66 66 2e
> > > [    4.921982] RSP: 0018:ffffc900007e7bb8 EFLAGS: 00010246
> > > [    4.924227] RAX: 0000000000000000 RBX: ffff888101b2be00 RCX: 0000000000009a06
> > > [    4.926752] RDX: 0000000000000000 RSI: ffff888104491798 RDI: ffff888101b2be00
> > > [    4.929312] RBP: ffff888101b2be00 R08: 0000000000009906 R09: 00000000000333f0
> > > [    4.931887] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffed
> > > [    4.934451] R13: ffff888102594810 R14: ffff888100653600 R15: ffff888101fa7f78
> > > [    4.937115] FS:  00007f5dd0fb48c0(0000) GS:ffff88906fb80000(0000) knlGS:0000000000000000
> > > [    4.939581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [    4.941308] CR2: 0000000000000020 CR3: 0000000105070005 CR4: 0000000000f70ee0
> > > [    4.943022] PKRU: 55555554
> > > [    4.944731] Call Trace:
> > > [    4.946438]  <TASK>
> > > [    4.948167]  ? __die+0x24/0x70
> > > [    4.949864]  ? page_fault_oops+0x15b/0x440
> > > [    4.951563]  ? acpi_evaluate_object+0x190/0x2f0
> > > [    4.953201]  ? _raw_spin_lock_irqsave+0x28/0x50
> > > [    4.954841]  ? exc_page_fault+0x6e/0x160
> > > [    4.956461]  ? asm_exc_page_fault+0x26/0x30
> > > [    4.958067]  ? ucsi_debugfs_unregister+0x11/0x30 [typec_ucsi]
> > > [    4.959677]  ucsi_destroy+0x12/0x20 [typec_ucsi]
> > > [    4.961298]  ucsi_acpi_probe+0x1cc/0x230 [ucsi_acpi]
> > > [    4.962908]  platform_probe+0x40/0xb0
> > > [    4.964522]  really_probe+0x1a2/0x410
> > > [    4.966110]  __driver_probe_device+0x78/0x160
> > > [    4.967735]  driver_probe_device+0x1e/0x90
> > > [    4.969306]  __driver_attach+0xd6/0x1d0
> > > [    4.970874]  ? __pfx___driver_attach+0x10/0x10
> > > [    4.972449]  bus_for_each_dev+0x79/0xd0
> > > [    4.974022]  bus_add_driver+0x116/0x220
> > > [    4.975600]  driver_register+0x60/0x120
> > > [    4.977169]  ? __pfx_ucsi_acpi_platform_driver_init+0x10/0x10 [ucsi_acpi]
> > > [    4.978762]  do_one_initcall+0x45/0x220
> > > [    4.980367]  ? kmalloc_trace+0x29/0x90
> > > [    4.981952]  do_init_module+0x90/0x260
> > > [    4.983530]  init_module_from_file+0x8b/0xd0
> > > [    4.985087]  idempotent_init_module+0x181/0x240
> > > [    4.986639]  __x64_sys_finit_module+0x5e/0xb0
> > > [    4.988198]  do_syscall_64+0x3c/0x90
> > > [    4.989739]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > > [    4.991290] RIP: 0033:0x7f5dd16aaa3d
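
The register dump is consistent with a NULL 'debugfs' pointer: RAX is 0 and
CR2 is 0x20, which is what the faulting load from [rax+0x20] would touch when
the base is NULL. A rough sketch of that arithmetic follows. The struct name
and its field layout are assumptions, picked only so that the dentry member
lands at offset 0x20 on a 64-bit build; they are not copied from the driver.

```c
/* Sketch: a NULL base plus a member offset gives the faulting address. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dentry;                            /* opaque, as far as this sketch goes */

/* Hypothetical layout; assumes 8-byte pointers and 8-byte uint64_t alignment. */
struct ucsi_debugfs_entry_sketch {
	uint64_t command;                     /* offset 0x00 */
	struct {
		uint64_t low;                     /* offset 0x08 */
		uint64_t high;                    /* offset 0x10 */
	} response;
	uint32_t status;                      /* offset 0x18, then 4 bytes padding */
	struct dentry *dentry;                /* offset 0x20 */
};

/* Address the CPU would read when loading base->dentry. */
uintptr_t dentry_load_address(uintptr_t base)
{
	return base + offsetof(struct ucsi_debugfs_entry_sketch, dentry);
}
```

With base == 0 (the RAX value in the oops), dentry_load_address() yields 0x20
on x86-64, matching the reported fault address.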

-- 
heikki


