Hello,
My apologies if this sounds too amateurish for this mailing list, but I would like to get some insight into a recent issue we are having with our XFS installation.
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c. Caller 0xee27bcac
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2732fe>] xfs_trans_cancel+0x59/0xe3 [xfs]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee27bcac>] xfs_mkdir+0x5bc/0x60b [xfs]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee27bcac>] xfs_mkdir+0x5bc/0x60b [xfs]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee284ee6>] xfs_vn_mknod+0x1a5/0x28f [xfs]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee284fe2>] xfs_vn_mkdir+0x12/0x14 [xfs]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<c10786d0>] vfs_mkdir+0xbd/0x125
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2dc1bc>] nfsd_create+0x297/0x38c [nfsd]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2e5597>] nfsd4_create+0x1ab/0x34c [nfsd]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2e53ec>] nfsd4_create+0x0/0x34c [nfsd]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2e4948>] nfsd4_proc_compound+0x178/0x263 [nfsd]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<c102ebfd>] groups_alloc+0x42/0xae
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2d720f>] nfsd_dispatch+0xd4/0x18f [nfsd]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee180805>] svcauth_unix_set_client+0x16d/0x1a0 [sunrpc]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee17d63f>] svc_process+0x391/0x656 [sunrpc]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2d7756>] nfsd+0x171/0x277 [nfsd]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<ee2d75e5>] nfsd+0x0/0x277 [nfsd]
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: [<c100598f>] kernel_thread_helper+0x7/0x10
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: =======================
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: xfs_force_shutdown(dm-1,0x8) called from line 1139 of file fs/xfs/xfs_trans.c. Return address = 0xee287778
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: Filesystem "dm-1": Corruption of in-memory data detected. Shutting down filesystem: dm-1
Sep 3 04:45:39 ip-10-204-xxx-xxx kernel: Please umount the filesystem, and rectify the problem(s)
We searched and found this list, along with a few patches around kernel 2.6.26-2.6.27 that seem to match our scenario. We were able to log the specific mkdir command that failed and confirmed that it consistently fails with "no space left on device", whereas running mkdir in other directories with large inode numbers did not reproduce the issue. We haven't tried patching or upgrading the kernel yet, but we will do that later.
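For reference, this is roughly how we confirmed the error code (a sketch only; /export/newdir is a placeholder path, and on newer coreutils mkdir may issue the mkdirat syscall instead):

  # reproduce the failure under strace to confirm mkdir returns ENOSPC
  strace -e trace=mkdir,mkdirat mkdir /export/newdir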
As that patch's description points to a bug triggered by ENOSPC, we checked the inode numbers of some directories and files with "ls -li", and some of them are quite close to 2^32.
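This is how we scanned for inode numbers near the 32-bit boundary (a sketch assuming GNU find; /export is a placeholder for our mount point):

  # print the largest inode number under the mount point;
  # 4294967295 (2^32 - 1) is the limit for 32-bit inode numbers
  find /export -xdev -printf '%i\n' | sort -n | tail -1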
So we would like to ascertain whether that is the cause of the ENOSPC in our case. Does it mean 32-bit inodes are no longer adequate for us and we should switch to 64-bit inodes? Would switching avoid this kind of shutdown and let such writes succeed in the future?
And is it true that we don't need a 64-bit OS for 64-bit inodes? How can we tell if our system supports 64-bit inodes?
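If it matters, this is what we understand the switch would look like (a sketch only; we have not tried it yet, and the device and mount point below are placeholders):

  # enable 64-bit inode numbers with the inode64 mount option;
  # as far as we know this needs a full umount/mount on our kernel,
  # not just a remount
  umount /export
  mount -o inode64 /dev/mapper/vg-data /export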
Finally, although we all know that "df -i" is sort of nonsense on XFS, how can it report only 5% inode usage while we have inode numbers that are close to 2^32? What exactly does that 5% mean, or am I looking at inodes the wrong way?
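For completeness, this is how we have been comparing the two views (read-only xfs_db on the device from the log above; icount and ifree are the superblock's allocated and free inode counters):

  # df's percentage versus the on-disk superblock counters
  df -i /export
  # -r opens the device read-only; counters may lag while mounted
  xfs_db -r -c 'sb 0' -c 'print icount' -c 'print ifree' /dev/dm-1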
Thanks in advance for any light anyone can shed on this one.
Regards,
Bernard Chan.
GoAnimate.