On Wed, 02 Jul 2014 10:32:41 +0100 Pedro Teixeira <finas@xxxxxxxx> wrote:

> - I'm having the following problem on a raid6 md volume consisting of
> 16 1TB Seagate SSHDs (using kernel 3.15.3 or 3.14.0). mdadm is 3.3.
>
> - Every time I run fsck.ext4 I get the exact same errors
> (...short read). Forcing a repair on the md0 volume shows no errors
> and completes without problems. All disks are active and the volume is
> not degraded, yet I can't get rid of the short read errors on those 16
> blocks, and when the filesystem is mounted the read errors come up
> from time to time, as the blocks are probably in use.
>
> - If I try to read those blocks with dd ( dd if=/dev/md0 of=test.txt
> seek=458227712 count=6 bs=4096 ) it instantly creates a 1.8T file,
> but the file doesn't appear to have anything in it (and the file
> doesn't take up 1.8T on disk, as the disk is much smaller).
>
> - This started happening after a three-disk failure. I
> recovered from that failure by recreating the array with the
> non-failed 13 disks plus the last failed one (events didn't differ
> much). I then re-added the other disks. The failed disks are all
> physically good; I tested them with hdat2 and they don't have read/write
> errors, so I reused them. I don't know why they failed, maybe some
> incompatibility between the SSHDs and the LSI HBA controller.
>
> root@nas3:/# dd if=/dev/md0 of=teste.txt seek=458227712 count=6 bs=4096
> 6+0 records in
> 6+0 records out
> 24576 bytes (25 kB) copied, 0.0019239 s, 12.8 MB/s
> root@nas3:/# ls -lah teste.txt
> -rw-r--r-- 1 root root 1.8T Jul 2 10:22 teste.txt
> root@nas3:/#
>
>
>
> root@nas3:/# cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid6 sde[0] sdq[15] sdp[14] sdo[17] sdn[19] sdm[16]
> sdl[18] sdk[9] sdj[8] sdi[7] sdh[6] sdg[5] sdf[4] sdb[3] sdd[2] sdc[1]
> 13672838144 blocks super 1.2 level 6, 512k chunk, algorithm 2
> [16/16] [UUUUUUUUUUUUUUUU]
>
> - When doing a fsck.ext4 of /dev/md0 it returns the following (and I
> can do it over and over again with the exact same errors):
>
> root@nas3:/# fsck.ext4 -f /dev/md0
> e2fsck 1.42.10 (18-May-2014)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Error reading block 458227712 (Attempt to read block from filesystem
> resulted in short read) while reading inode and block bitmaps. Ignore
> error<y>? yes

Can't possibly happen! (Do worry, I say that a lot - I'm usually wrong).

What sort of computer? In particular, is it 32-bit or 64-bit?

Try using 'dd' to read a few meg at various offsets (1G, 2G, 4G, 6G, 8G, ....)
and find out if there is a pattern: where it can read and where it cannot.

NeilBrown
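The probing suggested above could also be scripted. What follows is only a rough sketch in Python rather than dd (the thread itself uses dd only); the function name probe, the 4 MiB read size and the default offsets are assumptions made for illustration. It positions each read on the input side, which is what dd's skip= does; seek= positions the output file instead.

```python
import os
import sys

MIB = 1024 * 1024
GIB = 1024 * MIB


def probe(path, offsets, length=4 * MIB):
    """Try to read `length` bytes at each byte offset of `path`.

    Returns a list of (offset, bytes_read, error) tuples, where error is
    None or the message of the OSError raised by the read.
    """
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for offset in offsets:
            got = 0
            error = None
            try:
                # pread may return less than requested, so keep going until
                # the full length is read or nothing more comes back.
                while got < length:
                    chunk = os.pread(fd, length - got, offset + got)
                    if not chunk:
                        break
                    got += len(chunk)
            except OSError as exc:
                error = exc.strerror
            results.append((offset, got, error))
    finally:
        os.close(fd)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # First argument: device or file to probe, e.g. /dev/md0.
    # Remaining arguments: offsets in GiB (default 1, 2, 4, 6, 8).
    gib_offsets = [int(a) for a in sys.argv[2:]] or [1, 2, 4, 6, 8]
    for offset, got, error in probe(sys.argv[1], [g * GIB for g in gib_offsets]):
        status = "ok" if got == 4 * MIB and error is None else "SHORT"
        print(f"{offset // GIB:>6} GiB: read {got} bytes {status} {error or ''}")
```

For reference, block 458227712 with 4096-byte blocks is byte offset 458227712 * 4096 = 1876900708352, which is exactly 1748 GiB, so offsets just below and above 1748 would be natural ones to include.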