Ok, more info:

1. The bug only happens when accessing 2 disks/devices on the SAME controller.
2. The bug is in stopping the HPC DMA too early.
3. Stopping the DMA too early happens only on reads.

If I apply this patch to the driver

Index: drivers/scsi/sgiwd93.c
===================================================================
RCS file: /cvs/linux/drivers/scsi/sgiwd93.c,v
retrieving revision 1.27
diff -u -r1.27 sgiwd93.c
--- drivers/scsi/sgiwd93.c	2001/03/26 00:38:20	1.27
+++ drivers/scsi/sgiwd93.c	2001/04/09 22:12:10
@@ -183,6 +187,10 @@
 	printk("dma_stop: status<%d> ", status);
 #endif
+	if (hregs->ctrl & HPC3_SCTRL_ACTIVE)
+		printk("DMA still active dir %d bresid %d\n",
+			hdata->dma_dir,
+			SCpnt->SCp.buffers_residual);
 	/* First stop the HPC and flush it's FIFO. */
 	if(hdata->dma_dir) {
 		hregs->ctrl |= HPC3_SCTRL_FLUSH;

I get this output on the console, which means files/data got corrupted.

---------------schnipp---------------------
DMA still active dir 1 bresid 6
DMA still active dir 1 bresid 1
DMA still active dir 1 bresid 3
DMA still active dir 1 bresid 1
DMA still active dir 1 bresid 0
DMA still active dir 1 bresid 0
DMA still active dir 1 bresid 0
DMA still active dir 1 bresid 2
DMA still active dir 1 bresid 0
DMA still active dir 1 bresid 2
---------------schnapp---------------------

This might also be why metadata has never been corrupted: metadata is
mostly not read in large chunks that get dumped to disk again unchecked.

One suspicion I had concerned the modification of the wd33 registers in
the sgi driver (it's the only driver to do so), which is based on the
thesis that in the wd33 driver we act on the current scatter_gather item
and subtract the total length. But this is simply wrong.

Flo
-- 
Florian Lohoff                  flo@rfc822.org             +49-5201-669912
Why is it called "common sense" when nobody seems to have any?