Re: RAID1 Corruption

Paul Clements wrote:
Hi,

Markus Gehring wrote:

I have a reproducible problem with corrupted data read from a RAID1 array.

Setup:
 HW:
  2 S-ATA-Disks (160GB each) -> /dev/md4 RAID1
  Promise S150 TX4 - Controller
  AMD Sempron 2200+

 SW:
  Fedora Core 3
  Kernel 2.6.10 unpatched
  Samba (for read/write-accesses)
  SW-Raid

Everything works fine with only one drive in the array. Once the second is
synced up, read accesses return corrupted data.

Interesting: if you remove the second disk again, the same files are
read correctly again (no matter whether they were written while only one
disk was in the array or while two were synced)!


This makes it sound like bad data is getting written to the second disk during resync. Could you give more details about your test procedure (a script or list of steps that reproduces the problem would be great)?
1. Set up the array (mdadm -C /dev/md4 -l 1 -n 2 /dev/sdc1 /dev/sdd1)
2. ... resync is running (as I can see with cat /proc/mdstat)
3. mke2fs /dev/md4
4. mount /dev/md4 /home2
5. Copy ~100M of JPGs (~800k each) via samba to the array (/home2/test1/)
6. See that the JPGs are all okay
7. After the resync has finished: copy the same ~100M of JPGs to the array (/home2/test2)
8. See the JPGs damaged (at least in /home2/test2; I didn't check those in test1)
9. Remove one disk again (mdadm /dev/md4 -f /dev/sdd1
   mdadm /dev/md4 -r /dev/sdd1 ... or the same with /dev/sdc1!)
10. See (from the Win client) that the JPGs in /home2/test2 are okay again!
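
For anyone who wants to repeat this, the steps above can be sketched as a script. This is only a sketch under assumptions: the device names and mount point are the ones from the list, but SRC_DIR and DRY_RUN are hypothetical names, and a local cp stands in for the samba copy, so it is not exactly the same test. Steps 6, 8 and 10 (looking at the pictures) are not covered. With DRY_RUN=1, the default, the commands are only printed, because mdadm -C and mke2fs destroy whatever is on the partitions.

```shell
#!/bin/sh
# Sketch of steps 1-5, 7 and 9 from the list above.
# Assumptions: SRC_DIR holds the test JPGs; DRY_RUN=1 only prints the commands.
DRY_RUN=${DRY_RUN:-1}
SRC_DIR=${SRC_DIR:-/home/test}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    run mdadm -C /dev/md4 -l 1 -n 2 /dev/sdc1 /dev/sdd1   # step 1
    run mke2fs /dev/md4                                   # step 3
    run mount /dev/md4 /home2                             # step 4
    run cp -r "$SRC_DIR" /home2/test1                     # step 5 (local copy, not samba)
    # Step 7: wait until /proc/mdstat no longer reports a running resync.
    while [ "$DRY_RUN" != 1 ] && grep -q resync /proc/mdstat; do
        sleep 10
    done
    run cp -r "$SRC_DIR" /home2/test2                     # step 7
    run mdadm /dev/md4 -f /dev/sdd1                       # step 9: mark one disk faulty
    run mdadm /dev/md4 -r /dev/sdd1                       # step 9: remove it from the array
}
```

Calling reproduce with the default settings just lists the commands; a real run would need DRY_RUN=0, root, and partitions that may be wiped.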



I don't think samba is the culprit, but just to be sure, is there any chance you could reproduce the problem without samba in the equation? (From what you say above, I assume all reads and writes are coming from a samba client of some sort?)
I did a quick test:
Copied my test JPG directory from /home/test (where I can see the pics okay) to /home2/test9 and saw the pics damaged. After I copied them back to /home/test9 they stay damaged.
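
A checksum comparison could make a test like this independent of how the pictures look in a viewer. The following is a minimal sketch, assuming GNU md5sum is available; compare_trees is a hypothetical helper name, not an existing tool. It checksums the known-good tree and verifies the copy against those sums, printing every file that differs or is missing.

```shell
#!/bin/sh
# Hypothetical helper: verify that every file under SRC has an identical copy under DST.
# Prints one line per mismatching or missing file; returns 0 only if all files match.
compare_trees() {
    src=$1
    dst=$2
    sums=$(mktemp) || return 2
    # Checksum the source tree, using paths relative to its root.
    ( cd "$src" && find . -type f -exec md5sum {} + ) > "$sums"
    # Re-check the same relative paths inside the copy.
    if report=$(cd "$dst" && md5sum -c "$sums" 2>/dev/null); then
        rc=0
    else
        rc=1
    fi
    # Show only the files that did not verify.
    printf '%s\n' "$report" | grep -v -e ': OK$' -e '^$' || true
    rm -f "$sums"
    return $rc
}
```

For the test described above, the call would be something like compare_trees /home/test /home2/test9. Running it twice on the array would also show whether the same files fail on every read.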


Remarks:
I also saw here that the pics on the syncing /dev/md4 (= /home2) are damaged (on read?) while the drive is still syncing (this is new compared to point 6 above), but this definitely happens less often than once the drive has finished syncing (I saw it for the first time after dealing with the problem for over 2 weeks now).
I have all mounts on SW-RAID1 arrays, but I have never seen problems with md0 (/boot), md1 (/), md2 (swap), md3 (/var).
I have also seen ext3-fs errors (see also Sven Andras's posting from today and 5.1.2005).


Many Thanks,
 Markus

