On Tue, 11 Feb 2014 11:11:04 -0600 (CST) Andrew Martin <amartin@xxxxxxxxxxx> wrote:

> Hello,
>
> I am running mdadm 3.2.5 on an Ubuntu 12.04 fileserver with a 10-drive
> RAID6 array (10x1TB). Recently, /dev/sdb started failing:
> Feb 10 13:49:29 myfileserver kernel: [17162220.838256] sas: command 0xffff88010628f600, task 0xffff8800466241c0, timed out: BLK_EH_NOT_HANDLED
>
> Around this same time, a few users attempted to access a directory on
> this RAID array over CIFS, which they had previously accessed earlier
> in the day. When they attempted to access it this time, the directory
> was empty. The emptiness of the folder was confirmed via a local shell
> on the fileserver, which reported the same information. At around
> 13:50, mdadm dropped /dev/sdb from the RAID array:

The directory being empty can have nothing to do with the device failure.
md/raid will never let bad data into the page cache in the manner you suggest.
I cannot explain to you what happened, but I'm absolutely certain it wasn't
something that could be fixed by md dropping any caches.

NeilBrown

> Feb 10 13:50:31 myfileserver mdadm[1897]: Fail event detected on md device /dev/md2, component device /dev/sdb
>
> However, it was not until around 14:15 that these files reappeared in
> the directory. I am guessing that it took this long for the invalid,
> cached read to be flushed from the kernel buffer cache.
>
> The concern with the above behavior is it leaves a potentially large
> window of time during which certain data may not be correctly returned
> from the RAID array. Is it possible for mdadm to automatically flush
> the kernel buffer cache after it drops a drive from the array:
> sync; echo 3 > /proc/sys/vm/drop_caches
>
> This would have caused the data to have been re-read at 13:50, a much
> smaller window of time during which invalid data was present in the
> cache. Or, is there a better suggestion for handling this situation?
>
> Thanks,
>
> Andrew Martin