Re: ENODATA on list/stat directory

On Thu, Jun 23, 2022 at 02:52:22PM -0500, Clay Gerrard wrote:
> I work on an object storage system, OpenStack Swift, that has always
> used xfs on the storage nodes.  Our system has encountered many
> various disk failures and occasionally apparent file system corruption
> over the years, but we've been noticing something lately that might be
> "new" and I'm considering how to approach the problem.  I'm interested
> to solicit critique on my current thinking/process - particularly from
> xfs experts.
> 
> [root@s8k-sjc3-d01-obj-9 ~]# xfs_bmap
> /srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53
> /srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53:
> No data available
> [root@s8k-sjc3-d01-obj-9 ~]# xfs_db
> /srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53
> /srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53:
> No data available

ENODATA implies that it's trying to access an xattr that doesn't
exist.
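
For the auditor side, a minimal sketch of how code could tell this
failure apart from other stat() errors. The helper names (probe_path,
is_enodata) and the injectable "stat" parameter are hypothetical and only
there for illustration; errno.ENODATA is errno 61 on Linux, which matches
the "[Errno 61]" in the traceback quoted further down.

```python
import errno
import os


def probe_path(path, stat=os.stat):
    """Return None if path can be stat()ed, else the errno of the failure.

    `stat` is injectable (hypothetical convenience) so the error path can
    be exercised without a damaged filesystem.
    """
    try:
        stat(path)
    except OSError as exc:
        return exc.errno
    return None


def is_enodata(path, stat=os.stat):
    """True when stat() on path fails specifically with ENODATA."""
    return probe_path(path, stat=stat) == errno.ENODATA
```

Comparing against errno.ENODATA rather than matching the "No data
available" string keeps the check independent of locale and libc wording.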

> fatal error -- couldn't initialize XFS library
> [root@s8k-sjc3-d01-obj-9 ~]# ls -alhF /srv/node/d21865/quarantined/objects-1/e53
> ls: cannot access
> /srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53:
> No data available
> total 4.0K
> drwxr-xr-x  9 swift swift  318 Jun  7 00:57 ./
> drwxr-xr-x 33 swift swift 4.0K Jun 23 16:10 ../
> d?????????  ? ?     ?        ?            ? f0418758de4baaa402eb301c5bae3e53/

That's the typical ls output when it couldn't stat() an inode. This
typically occurs when the inode has been corrupted. On XFS, at
least, this should result in a corruption warning in the kernel log.

Did you check dmesg for errors?
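
If the check needs to be automated, something along these lines could
filter kernel log lines. This is only a sketch: it assumes XFS messages
carry the usual "XFS (<device>):" prefix, and the keyword list is a guess
at what is worth flagging, not an exhaustive list of what the kernel
prints. The function name xfs_errors is hypothetical.

```python
import re

# XFS kernel messages are prefixed with "XFS (<device>):".
XFS_LINE = re.compile(r"XFS \((?P<dev>[^)]+)\):\s*(?P<msg>.*)")

# Assumed keywords; adjust to whatever your kernels actually log.
KEYWORDS = ("corrupt", "internal error")


def xfs_errors(lines):
    """Return (device, message) pairs for XFS log lines that look like errors."""
    hits = []
    for line in lines:
        match = XFS_LINE.search(line)
        if match is None:
            continue
        message = match.group("msg")
        if any(word in message.lower() for word in KEYWORDS):
            hits.append((match.group("dev"), message))
    return hits
```

The input can be the output of dmesg split into lines, for example
subprocess.check_output(["dmesg"]).decode().splitlines().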

> drwxr-xr-x  2 swift swift   47 May 27 00:43 f04193c31edc9593007471ee5a189e53/
> drwxr-xr-x  2 swift swift   47 May 27 00:43 f0419c711a5a5d01dac6154970525e53/
> drwxr-xr-x  2 swift swift   47 May 27 00:43 f041a2548b9255493d16ba21c19b6e53/
> drwxr-xr-x  2 swift swift   47 Jun  7 00:57 f041aa09d40566d6915a706a22886e53/
> drwxr-xr-x  2 swift swift   39 May 27 00:43 f041ac88bf13e5458a049d827e761e53/
> drwxr-xr-x  2 swift swift   47 May 27 00:43 f041bfd1c234d44b591c025d459a7e53/
> [root@s8k-sjc3-d01-obj-9 ~]# python
> Python 2.7.5 (default, Nov 16 2020, 22:23:17)
> [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import os
> >>> os.stat('/srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OSError: [Errno 61] No data available:
> '/srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53'
> >>> os.listdir('/srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OSError: [Errno 61] No data available:
> '/srv/node/d21865/quarantined/objects-1/e53/f0418758de4baaa402eb301c5bae3e53'
> >>>

Use strace, not the Python interpreter, to find which syscall returned
the error.
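
For example, run something like "strace -f -o /tmp/ls.trace ls -la <dir>"
and look for calls that returned -1 ENODATA. A small sketch of a filter
for that output follows; the function name and the trace file path are
hypothetical, and the regular expression assumes strace's default line
format, "name(args) = -1 ERRNAME (description)", with an optional PID
column.

```python
import re

# Failed syscalls in strace output look like:
#   lstat("/path", 0x7ffd...) = -1 ENODATA (No data available)
# With -f, lines may start with "[pid N]" (terminal) or "N" (-o file).
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"
    r"(?P<call>\w+)\((?P<args>.*)\)\s+=\s+-1\s+(?P<err>E[A-Z0-9]+)\b"
)


def failed_syscalls(lines, err="ENODATA"):
    """Return (syscall, args) pairs for calls that failed with `err`."""
    found = []
    for line in lines:
        match = FAILED_CALL.match(line.strip())
        if match and match.group("err") == err:
            found.append((match.group("call"), match.group("args")))
    return found
```

Feeding it the lines of the trace file shows which syscall, on which
path, produced the error.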

> [root@s8k-sjc3-d01-obj-9 ~]# uname -a
> Linux s8k-sjc3-d01-obj-9.nsv.sjc3.nvmetal.net
> 3.10.0-1160.62.1.el7.x86_64 #1 SMP Tue Apr 5 16:57:59 UTC 2022 x86_64
> x86_64 x86_64 GNU/Linux

That's a RHEL7 kernel. Upstream developers really can't help you
diagnose random weird problems with these kernels - they are
completely custom kernels and so only the vendor can really help you
with diagnosing the root cause of problems such as this. You should
talk to your RH support contact.

> I'd also like to be able to "simulate" this kind of corruption on a
> healthy filesystem so we can test our "quarantine/auditor" code that's
> trying to move these filesystem problems out of the way for the
> consistency engine.  Does anyone have any guess how I could MAKE an
> xfs filesystem produce this kind of behavior on purpose?

Use xfs_db to corrupt a directory inode, then try to read it.
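
A rough sketch of how a test harness might build such an xfs_db
invocation. It only constructs the command line and does not run it. The
assumptions are loud ones: the target is an unmounted scratch filesystem
you can afford to destroy, xfs_db is run in expert mode (-x) so that
"write" is allowed, and the field to damage (core.mode here) is purely an
illustrative guess. Which field, if any, reproduces the ENODATA behaviour
above is not established here. The function names and default values are
hypothetical.

```python
import os


def inode_of(path):
    """Inode number of path, taken while the filesystem is still mounted."""
    return os.stat(path).st_ino


def xfs_db_corrupt_argv(device, inode_number, field="core.mode", value="0"):
    """Build an xfs_db command line that overwrites one inode field.

    Only builds the argv list; the caller decides whether to execute it
    (for example with subprocess.check_call) against an UNMOUNTED
    scratch device.
    """
    return [
        "xfs_db",
        "-x",                                   # expert mode, enables "write"
        "-c", "inode %d" % inode_number,        # select the directory inode
        "-c", "write %s %s" % (field, value),   # overwrite the chosen field
        device,
    ]
```

The intended sequence would be: record the directory's inode number with
inode_of(), unmount, run the generated command, mount again, and then try
to stat or list the directory.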

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


