Re: kernel NULL pointer dereference: Workqueue: events_unbound nfsd_file_gc_worker, RIP: 0010:svc_wake_up+0x9/0x20

Hi Jeff,

On Sat, Jan 25, 2025 at 05:55:50PM -0500, Jeff Layton wrote:
> On Sat, 2025-01-25 at 21:44 +0100, Salvatore Bonaccorso wrote:
> > Hi Chuck, Jeff, NFSD maintainers,
> > 
> > In Debian we got a report from a user who triggered an issue during
> > package updates where an nfs-kernel-server restart was involved and
> > then hung. The report included a kernel trace of a NULL pointer
> > dereference.
> > 
> > The full report is at:
> > https://bugs.debian.org/1093734
> > 
> > While I was not able to trigger the issue, the provided log is as
> > follows:
> > 
> > 2025-01-21T12:07:01.516291+01:00 $HOST kernel: device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disabled. Duplicate IMA measurements will not be recorded in the IMA log.
> > 2025-01-21T12:07:01.516310+01:00 $HOST kernel: device-mapper: uevent: version 1.0.3
> > 2025-01-21T12:07:01.516312+01:00 $HOST kernel: device-mapper: ioctl: 4.48.0-ioctl (2023-03-01) initialised: dm-devel@xxxxxxxxxxxxxxx
> > 2025-01-21T12:07:13.528044+01:00 $HOST kernel: NFSD: Using nfsdcld client tracking operations.
> > 2025-01-21T12:07:13.528061+01:00 $HOST kernel: NFSD: no clients to reclaim, skipping NFSv4 grace period (net f0000000)
> > 2025-01-21T12:07:17.558915+01:00 $HOST blkmapd[1148]: exit on signal(15)
> > 2025-01-21T12:07:17.574410+01:00 $HOST blkmapd[239859]: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
> > 2025-01-21T12:07:18.015541+01:00 $HOST kernel: BUG: kernel NULL pointer dereference, address: 0000000000000090
> 
> Thanks for the bug report. It's getting late here, so I can only take a
> quick look. svc_wake_up is pretty small:
> 
> void svc_wake_up(struct svc_serv *serv)
> {
>         struct svc_pool *pool = &serv->sv_pools[0];
> 
>         set_bit(SP_TASK_PENDING, &pool->sp_flags);
>         svc_pool_wake_idle_thread(pool);
> }
> 
> pahole on my machine says that struct svc_serv has this at offset 0x90:
> 
> 	struct svc_pool *          sv_pools;             /*  0x90   0x8 */
> 
> So it looks like nn->nfsd_serv was a NULL pointer. That only
> happens when we shut down the server, so this looks like a race between
> filecache garbage collection and shutdown.
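
The offset reasoning above can be checked with a small user-space sketch.
Everything here is a mock: struct mock_svc_serv and its padding array are
hypothetical stand-ins, and only the 0x90 offset of sv_pools is taken from
the pahole output quoted above. It only illustrates why a NULL serv pointer
would fault at address 0x90; it is not the kernel's real layout.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uintptr_t */

struct mock_svc_pool;  /* opaque stand-in for struct svc_pool */

/*
 * Hypothetical layout: the padding array stands in for whatever fields
 * precede sv_pools. Only the 0x90 offset comes from the pahole output.
 */
struct mock_svc_serv {
        unsigned char other_fields[0x90];
        struct mock_svc_pool *sv_pools;
};

/* Fail the build if the mock does not reproduce the reported offset. */
static_assert(offsetof(struct mock_svc_serv, sv_pools) == 0x90,
              "sv_pools must sit at offset 0x90 in the mock");

/*
 * Address that a load of base->sv_pools would touch, computed
 * arithmetically so that nothing is actually dereferenced.
 */
static uintptr_t sv_pools_load_address(uintptr_t base)
{
        return base + offsetof(struct mock_svc_serv, sv_pools);
}
```

With base = 0 (a NULL serv), sv_pools_load_address(0) yields 0x90, which
matches the faulting address 0000000000000090 in the log.
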
> 
> The filecache gets shut down in nfsd_shutdown_net, which gets called
> _after_ setting the nn->nfsd_serv pointer to NULL. We'll have to look
> at whether we can reorder the NULL pointer setting to later, or work
> around this some other way.
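
As a rough illustration of what "work around this some other way" could
look like, here is a user-space sketch of a guard that snapshots the pointer
once and skips the wake-up when it is NULL. All names (mock_serv, mock_net,
mock_wake_up, mock_gc_wake) are hypothetical, and this is not meant to
represent the patch that was submitted. A plain check like this also does
not close the race by itself; real kernel code would still need appropriate
ordering or locking against shutdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct svc_serv and the per-net nfsd state. */
struct mock_serv { int wakeups; };
struct mock_net  { struct mock_serv *serv; };  /* models nn->nfsd_serv */

/* Stand-in for svc_wake_up(): dereferences serv unconditionally. */
static void mock_wake_up(struct mock_serv *serv)
{
        serv->wakeups++;
}

/*
 * Guarded wake-up as the garbage collector might call it: read the
 * pointer once, and bail out if the server has already been torn down.
 * Returns true if a wake-up was issued.
 */
static bool mock_gc_wake(struct mock_net *nn)
{
        struct mock_serv *serv = nn->serv;  /* single snapshot */

        if (!serv)
                return false;
        mock_wake_up(serv);
        return true;
}
```

While nn->serv is set, mock_gc_wake() forwards to mock_wake_up(); once it
has been cleared, the call becomes a no-op instead of a NULL dereference.
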
> 
> Could I trouble you to open a bug for this at bugzilla.kernel.org?

Thanks a lot for your quick response on it and the analysis.

Sure, I can file a bug in bugzilla.kernel.org. I see you submitted a
patch already; do you still want me to do it?

If so, I will try to reference all follow-ups as well so that the
information is not spread across threads.

Thanks a lot for your work!

Regards,
Salvatore



