OK, but in that case bcache is not between your MD RAID and its disks,
so if your disks are dropping out of the MD array, that has to be either
an independent problem, or a very complex bug.
James
On 07/08/15 16:36, Jens-U. Mozdzen wrote:
Hi James,
Quoting "A. James Lewis" <james@xxxxxxxxxx>:
That's interesting, are you putting your MD on top of multiple bcache
devices... rather than bcache on top of an MD device... I wonder what
the rationale behind this is?
Hi James, no such thing here...
bcache is running on top of two MD-RAIDs - RAID6 with 7 spinning
drives and RAID1 with two SSDs.
The stack is, from bottom to top:
- MD-RAID6 data, MD-RAID1 cache
- bcache (/dev/bcache0, used as an LVM PV)
- LVM
- many LVs
- DRBD on top of most of the LVs
- Ext4 on each of the DRBD devices
- SCST / NFS / SMB sharing these file systems
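
The layering above can be written down as a small Python sketch to check what sits between any two layers. This is only an illustration: the STACK labels and the layers_between helper are hypothetical names, and the physical-disk entry at the bottom is implied by the description (7 spinning drives, 2 SSDs) rather than listed in the stack itself.

```python
# Storage stack from the message, listed bottom to top.
# The strings are illustrative labels, not real device paths
# (except /dev/bcache0, which the message names).
STACK = [
    "physical disks (7 spinning drives, 2 SSDs)",  # implied, not listed
    "MD-RAID6 (data) + MD-RAID1 (cache)",
    "bcache (/dev/bcache0)",
    "LVM PV / VG",
    "LVs",
    "DRBD",
    "Ext4",
    "SCST / NFS / SMB",
]


def layers_between(stack, lower, upper):
    """Return the layers strictly between two layers.

    Each layer is matched by the first stack entry containing the
    given substring.
    """
    lo = next(i for i, name in enumerate(stack) if lower in name)
    hi = next(i for i, name in enumerate(stack) if upper in name)
    return stack[lo + 1:hi]


# Nothing sits between the disks and the MD arrays.
print(layers_between(STACK, "physical disks", "MD-RAID"))
# bcache sits between the MD arrays and LVM.
print(layers_between(STACK, "MD-RAID", "LVM"))
```

In this model bcache only appears above the MD arrays, never between an array and its member disks.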
In the referenced incidents, SCST reports that (many) writes failed
due to time-out, and MD reports a single disk faulty. No other traces
in syslog, especially no stalled processes, locking problems or kernel
bugs.
The I/O pattern is highly parallel reads and writes, mostly via SCST.
Regards,
Jens
--
To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html