On Sun, Jan 12, 2014 at 11:53:59AM -0800, Zachary Kotlarek wrote:
>
> On Jan 12, 2014, at 10:47 AM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
>
> > http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> >
> > If this is due to a bug it may have already been fixed.  Note the first
> > two things asked for.
>
>
> Thanks for the pointer.
>
> My kernel's a bit old, but xfsprogs is shiny and new:
>     Linux vera 2.6.39.2 #1 SMP Fri Sep 30 23:55:41 PDT 2011 x86_64 x86_64 x86_64 GNU/Linux
>     xfs_repair version 3.1.11
>
> 2x4 core CPUs
> 8 GB RAM, mostly free (more than 6 GB cached)
>
> Related mount:
>     /dev/lvmsas/tv /mnt/media/TV xfs rw,nosuid,nodev,noexec,relatime,attr2,delaylog,inode64,sunit=1024,swidth=4096,noquota 0 0
>
> Underlying partition:
>     254       31 16252928000 dm-31
>
> Which is a no-frills LVM2 volume allocation over mdadm raid-6.
>
> meta-data=/dev/lvmsas/tv         isize=256    agcount=33, agsize=126975872 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=4063232000, imaxpct=5
>          =                       sunit=128    swidth=512 blks
> naming   =version 2              bsize=4096   ascii-ci=1
> log      =internal               bsize=4096   blocks=521728, version=2
>          =                       sectsz=512   sunit=8 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
> Attempts to access the now-busted files/directories with accents in their paths result in a kernel log like:
>     Jan 11 02:05:39 vera XFS (dm-31): I/O error occurred: meta-data dev dm-31 block 0x3c8ff73e0 ("xfs_trans_read_buf") error 11 buf count 4096

error 11 = EAGAIN/EWOULDBLOCK

That tends to imply that there's some interesting error occurring in
the layers below XFS here. XFS on a kernel that old is not expecting
an EAGAIN error from storage, so it is likely not being captured
properly. There have been bugs in the raid/dm code in the past that
would cause issues like this, and bugs in the XFS error handling
that allowed them to slip through and shut down the filesystem.
For example, this fix made in March 2013:

$ gl -n1 -p c163f9a
commit c163f9a1760229a95d04e37b332de7d5c1c225cd
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Mar 12 23:30:34 2013 +1100

    xfs: ensure we capture IO errors correctly

    Failed buffer readahead can leave the buffer in the cache marked
    with an error. Most callers that then issue a subsequent read on the
    buffer do not zero the b_error field out, and so we may incorrectly
    detect an error during IO completion due to the stale error value
    left on the buffer.

    Avoid this problem by zeroing the error before IO submission. This
    ensures that the only IO errors that are detected are those
    captured from bio submission or completion.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

is probably relevant, but there are many more changes up and down
the stack that may be the cause of your problem. Indeed, the above
fix may simply turn EAGAIN into EIO because there really is
something wrong with that block on disk....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
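
To make the stale-error problem described in that commit message concrete, here is a toy model in Python. It is not kernel code and not how XFS implements it: Buf, failed_readahead() and read_buf() are hypothetical names invented for illustration, and only the b_error field name comes from the commit message. It is a sketch under the assumption that a read reports whatever error value is left on the buffer at completion.

```python
import errno
from dataclasses import dataclass


@dataclass
class Buf:
    """Hypothetical stand-in for a cached buffer; only the error field is modelled."""
    b_error: int = 0


def failed_readahead(buf: Buf) -> None:
    # A failed readahead leaves the buffer in the cache marked with an error.
    buf.b_error = errno.EAGAIN


def read_buf(buf: Buf, io_error: int = 0, clear_stale: bool = True) -> int:
    """Simulate submitting a read and return the error seen at completion.

    io_error is the error (if any) produced by this particular IO.
    clear_stale=False models a caller that does not zero b_error first.
    """
    if clear_stale:
        # The idea of the fix: zero the error before IO submission.
        buf.b_error = 0
    if io_error:
        # An error captured from this IO's submission or completion.
        buf.b_error = io_error
    return buf.b_error
```

With clear_stale=False, a read that itself succeeds still reports the EAGAIN left behind by the earlier readahead. With the error zeroed first, the same read reports 0, and a read that genuinely fails reports its own error (EIO, say) rather than the stale one.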