Hello,

I have been having input/output errors copying data between drives attached to the same controller. I have two 3114 cards, a set of four Seagate 250GB drives (Model: ST3250824NS Rev: 3.AE) and set of 3 Maxtor 300GB drives (Model: 6L300S0 Rev: BACE). This problem is reproducible across all the drives and both controller cards.

The problem is that when copying a file off one drive on the controller to another on the same controller, be it via dd or cp, the file that gets written becomes corrupted along with the filesystem itself. Here is an extract from dmesg:

[12689.451466] attempt to access beyond end of device
[12689.451475] sdb1: rw=0, want=2339438600, limit=488392002
[12689.451480] attempt to access beyond end of device
[12689.451484] sdb1: rw=0, want=18446744056529747976, limit=488392002
[12689.453822] attempt to access beyond end of device
[12689.453831] sdb1: rw=0, want=2339438600, limit=488392002
[12689.453834] Buffer I/O error on device sdb1, logical block 292429824
[12689.453935] attempt to access beyond end of device
[12689.453938] sdb1: rw=0, want=2339438600, limit=488392002
[12689.453941] Buffer I/O error on device sdb1, logical block 292429824

The actual command used was:

    cp ~/hugefile /mnt/sda1
    cp /mnt/sda1/hugefile /mnt/sdb1/
    md5sum /mnt/sda1/hugefile /mnt/sdb1/hugefile

where hugefile is a 4.9GB piped output of "yes 0123456789" on ~/, a PATA drive used for the root filesystem and /home. md5sum calculates the first file checksum fine and errors on the second file.

    ccf5f9052aa1fac3062c3f1920abb1fc  /mnt/sda1/hugefile
    md5sum: /mnt/sdb1/hugefile: Input/output error

The exact same problem happens when the drives are reversed, i.e. the file is copied to sdb1 first then copied/dd'd to sda1, md5sum on sda1 bombs. There is no problem copying the file individually to each drive from ~/hugefile and performing the above test on drives from different controllers. All the drives have been rotated, the same test repeated with exactly the same result.
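For anyone reading the dmesg numbers, here is a rough sketch of how they can be decoded. It is only an illustration: the helper names are made up, and it assumes 512-byte sectors and a 4096-byte filesystem block size (neither is stated in the log). Under those assumptions the failing logical block maps exactly onto the reported "want" sector, and the very large "want" value looks like a negative number wrapped around an unsigned 64-bit counter.

```python
# Sketch only: decode the sector numbers from the dmesg extract.
# Assumptions (not confirmed by the log): 512-byte sectors, 4096-byte fs blocks.
SECTOR_SIZE = 512
FS_BLOCK_SIZE = 4096  # assumed filesystem block size

LIMIT = 488392002                 # "limit=" value for sdb1, in sectors
WANTS = [2339438600, 18446744056529747976]  # "want=" values, in sectors
FAILED_BLOCK = 292429824          # "logical block" from the Buffer I/O error


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    return value - (1 << 64) if value >= (1 << 63) else value


def block_end_sector(block):
    """Sector just past the end of a filesystem block (hypothetical helper)."""
    sectors_per_block = FS_BLOCK_SIZE // SECTOR_SIZE
    return (block + 1) * sectors_per_block


if __name__ == "__main__":
    print("partition size: %d bytes" % (LIMIT * SECTOR_SIZE))
    for want in WANTS:
        signed = as_signed64(want)
        print("want=%d -> signed %d, %.2fx the partition size"
              % (want, signed, signed / LIMIT))
    print("end sector of block %d: %d"
          % (FAILED_BLOCK, block_end_sector(FAILED_BLOCK)))
```

If the assumptions hold, this suggests the out-of-range requests come from corrupted block pointers in the filesystem metadata or data being read back, rather than from the block layer miscalculating offsets, but that is a guess and not something the log proves.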
Each drive has had a complete "badblocks -w -s" performed on it with no problems. I have performed the same test with ext2, ext3 and reiserfs 3.6 and all exhibit the same behaviour: seeking beyond the end of the disk to ludicrously high sectors.

I would like some help tracking down the cause of this problem as I have practically exhausted the methods currently at my disposal - my best guess at the moment is that data being written to one port is being trampled on somehow, but only when there is I/O active on another port. I will continue testing to see if simultaneous writes to multiple drives on a controller cause the same problem.

Thanks for any advice you can give,
Jonathan
Attachment:
lspci.txt.gz
Description: GNU Zip compressed data
Attachment:
dmesg.txt.gz
Description: GNU Zip compressed data