Re: Debugging stuck mount

On Thu, Aug 08, 2024 at 11:40:07AM GMT, Morten Hein Tiljeset wrote:
> Hi folks,
> 
> I'm trying to debug an issue that occurs sporadically in production where the
> ext4 filesystem on a device, say dm-1, is never fully closed. This is visible
> from userspace only via the existence of /sys/fs/ext4/dm-1 which cryptsetup
> uses to determine that the device is still mounted.
> 
> My initial thought was that it was mounted in some mount namespace, but this is
> not the case. I've used a debugger (drgn) on /proc/kcore to find the
> superblock. I can see that this is kept alive by a single mount which looks
> like this (leaving out all fields that are NULL/empty lists):
> 
> *(struct mount *)0xffff888af92c5cc0 = {
> 	.mnt_parent = (struct mount *)0xffff888af92c5cc0,
> 	.mnt_mountpoint = (struct dentry *)0xffff888850331980,  // an application defined path
> 	.mnt = (struct vfsmount){
> 		.mnt_root = (struct dentry *)0xffff888850331980,    // note: same path as mnt_mountpoint
> 		.mnt_sb = (struct super_block *)0xffff88a89f7bc800, // points to the superblock I want cleaned up
> 		.mnt_flags = (int)134217760,                        // 0x8000020 = MNT_UMOUNT | MNT_RELATIME
> 		.mnt_userns = (struct user_namespace *)init_user_ns+0x0 = 0xffffffffb384b400,
> 	},
> 	.mnt_pcp = (struct mnt_pcp *)0x37dfbfa2c338,
> 	.mnt_instance = (struct list_head){
> 		.next = (struct list_head *)0xffff88a89f7bc8d0,
> 		.prev = (struct list_head *)0xffff88a89f7bc8d0,
> 	},
> 	.mnt_devname = (const char *)0xffff88a7d0fe7cc0 = "/dev/mapper/<my device>_crypt", // maps to /dev/dm-1
> 	.mnt_id = (int)3605,
> }

That's the root mount of the filesystem here.

> 
> In particular I notice that the mount namespace is NULL. As far as I understand
> the only way to get this state is through a lazy unmount (MNT_DETACH). I can at
> least manage to create a similar state by lazily unmounting but keeping the
> mount alive with a shell with CWD inside the mountpoint.
> 
> I've tried to search for the superblock pointer on cwd/root of all tasks, which
> works in my synthetic example but not for the real case. I've had similar
> results searching for the superblock pointer using drgn's fsrefs.py script[1]
> which has support for searching additional kernel data structures.

It's likely held alive by some random file descriptor someone has open.
IOW, try and walk all /proc/<pid>/fd/<nr> in that case and see whether
anything keeps it alive.
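
A minimal sketch of such a walk, in Python. It assumes the filesystem sits
directly on the block device (true for ext4 on dm-1), so a file's st_dev
matches the device node's st_rdev. The function name find_fds_on_device and
the /dev/dm-1 path in the usage line are illustrative, not taken from any
existing tool. Run it as root, otherwise other users' fd directories are
unreadable.

```python
#!/usr/bin/env python3
import os
import sys


def find_fds_on_device(dev, proc="/proc"):
    """Yield (pid, fd, target) for every open fd whose file lives on `dev`.

    `dev` is a device number as found in st_dev. Stat-ing /proc/<pid>/fd/<nr>
    follows the magic link to the open file itself, so this also works for
    files on a mount that has been detached from every mount namespace.
    """
    for pid in filter(str.isdigit, os.listdir(proc)):
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # Process exited, or we lack permission to inspect it.
            continue
        for fd in fds:
            path = os.path.join(fd_dir, fd)
            try:
                if os.stat(path).st_dev != dev:
                    continue
                target = os.readlink(path)
            except OSError:
                # The fd was closed between listdir() and stat().
                continue
            yield int(pid), int(fd), target


# Usage (hypothetical device path): ./find_fds.py /dev/dm-1
if __name__ == "__main__" and len(sys.argv) > 1:
    # st_rdev of the block device node is the st_dev of files on it.
    wanted = os.stat(sys.argv[1]).st_rdev
    for pid, fd, target in find_fds_on_device(wanted):
        print(f"pid {pid} fd {fd} -> {target}")
```

This only sees processes visible in the /proc it is run against, so a holder
in another PID namespace needs the host's /proc. It also will not find
references that are not in any fd table, such as a file descriptor in flight
over a UNIX socket or a memory mapping whose fd has already been closed.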



