Files and directories disappear after drive fails out of md/raid array

Hello,

I am running mdadm 3.2.5 on an Ubuntu 12.04 fileserver (kernel 3.2.0-32-generic)
with a 10-drive RAID6 array (10x1TB). Layered on top of the md/raid array are LVM,
DRBD, and an ext4 filesystem, in this configuration:
HDDs <-- md/raid <-- LVM <-- DRBD <-- ext4

Recently, /dev/sdb, one of the disks in the RAID6 array, started failing:
13:49:29 kernel: [17162220.838256] sas: command 0xffff88010628f600, task 0xffff8800466241c0, timed out: BLK_EH_NOT_HANDLED

Around the same time, a few users tried to access a directory on this RAID
array over CIFS, which they had accessed earlier in the day. This time, the
directory appeared empty. I confirmed this from a local shell on the
fileserver, which also showed the directory as empty. At around 13:50, mdadm
dropped /dev/sdb from the RAID array:

13:50:31 mdadm[1897]: Fail event detected on md device /dev/md2, component device /dev/sdb
13:50:31 smbd[3428]: [2014/02/10 13:50:31.226854,  0] smbd/process.c:2439(keepalive_fn)
13:50:31 smbd[13539]: [2014/02/10 13:50:31.227084,  0] smbd/process.c:2439(keepalive_fn)
13:50:31 kernel: [17162282.624858] EXT4-fs error (device drbd0): htree_dirblock_to_tree:587: inode #148638560: block 1189089581: comm smbd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2004033568, rec_len=29801, name_len=99
13:50:31 kernel: [17162282.823733] EXT4-fs error (device drbd0): htree_dirblock_to_tree:587: inode #148638560: block 1189089581: comm smbd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2004033568, rec_len=29801, name_len=99
13:50:31 kernel: [17162282.832886] /build/buildd/linux-3.2.0/drivers/scsi/mvsas/mv_sas.c 1863:port 2 slot 45 rx_desc 3002D has error info8000000080000000.
13:50:31 kernel: [17162282.832920] /build/buildd/linux-3.2.0/drivers/scsi/mvsas/mv_94xx.c 626:command active 30305FFF,  slot [2d].
13:50:31 kernel: [17162282.991884] /build/buildd/linux-3.2.0/drivers/scsi/mvsas/mv_sas.c 1863:port 3 slot 52 rx_desc 30034 has error info8000000080000000.
13:50:31 kernel: [17162282.991892] /build/buildd/linux-3.2.0/drivers/scsi/mvsas/mv_94xx.c 626:command active 302FFFFF,  slot [34].
13:50:31 kernel: [17162282.992072] /build/buildd/linux-3.2.0/drivers/scsi/mvsas/mv_sas.c 1863:port 2 slot 53 rx_desc 30035 has error info8000000080000000.
...
13:52:03 kernel: [17162374.423961] EXT4-fs error (device drbd0): htree_dirblock_to_tree:587: inode #148638560: block 1189089581: comm smbd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2004033568, rec_len=29801, name_len=99
13:52:04 kernel: [17162375.839851] EXT4-fs error (device drbd0): htree_dirblock_to_tree:587: inode #148638560: block 1189089581: comm smbd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2004033568, rec_len=29801, name_len=99
13:52:08 kernel: [17162380.135391] EXT4-fs error (device drbd0): htree_dirblock_to_tree:587: inode #148638560: block 1189089581: comm smbd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2004033568, rec_len=29801, name_len=99
13:52:13 kernel: [17162385.108358] EXT4-fs error (device drbd0): htree_dirblock_to_tree:587: inode #148638560: block 1189089581: comm smbd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2004033568, rec_len=29801, name_len=99
13:52:17 kernel: [17162388.166515] EXT4-fs error (device drbd0): htree_dirblock_to_tree:587: inode #148638560: block 1189089581: comm smbd: bad entry in directory: rec_len % 4 != 0 - offset=0(0), inode=2004033568, rec_len=29801, name_len=99
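
As a side note, the fields printed in these messages can be checked by hand. The
sketch below is only an illustration, not part of the original logs: it redoes the
"rec_len % 4" test named in the message and re-packs the three reported fields in
the order they sit at the start of an ext4 on-disk directory entry (little-endian
32-bit inode, 16-bit rec_len, 8-bit name_len). The variable names are my own.

```python
import struct

# Field values copied from the EXT4-fs error lines above.
inode = 2004033568
rec_len = 29801
name_len = 99

# The check named in the message: rec_len must be a multiple of 4.
remainder = rec_len % 4
print("rec_len % 4 =", remainder)

# Re-pack the fields as they would appear at the start of the directory
# block: little-endian u32 inode, u16 rec_len, u8 name_len (7 bytes total).
entry_prefix = struct.pack("<IHB", inode, rec_len, name_len)
print(entry_prefix)
```

The remainder is 1, which is why the check fails, and the seven re-packed bytes are
all printable ASCII ("  switc"). That may suggest the block returned for this read
held text rather than directory entries, but this is an interpretation of the
numbers and not something the logs themselves confirm.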


However, it was not until around 14:12 that these files reappeared in the directory
(perhaps as a result of my running "find /path/to/mountpoint -type f" to
look for other missing files). At that same time, the ext4 errors above also ceased.
These errors started, as shown above, immediately after /dev/sdb was dropped.
Checksums of the files in the directory match checksums from a backup, so I do not
believe the files were modified. This filesystem was last fsck'ed on 9/27/13, so
less than 6 months ago.
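
For anyone wanting to repeat the checksum comparison, here is a minimal sketch of
one way to do it. It is not the exact procedure I used: SHA-256 is an assumption,
as are the helper names file_digest and compare_trees and the idea that the backup
is available as a directory tree with the same relative layout.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(live_dir, backup_dir):
    """List relative paths under live_dir that are missing from
    backup_dir or whose contents differ from the backup copy."""
    live_dir, backup_dir = Path(live_dir), Path(backup_dir)
    mismatches = []
    for live_file in sorted(p for p in live_dir.rglob("*") if p.is_file()):
        rel = live_file.relative_to(live_dir)
        backup_file = backup_dir / rel
        if (not backup_file.is_file()
                or file_digest(live_file) != file_digest(backup_file)):
            mismatches.append(str(rel))
    return mismatches
```

An empty list from compare_trees(live, backup) means every file under the live
directory has an identical copy in the backup. Note that this only walks the live
tree, so files present in the backup but missing from the live directory would
need a second pass in the other direction.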

Is there any explanation for why these files disappeared? Is there cause for 
concern about the integrity of this filesystem?

Thanks,

Andrew Martin
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



