On 08/29/2012 03:48 AM, Brian Candler wrote:
> Does anyone have any experience running gluster with XFS and MD RAID
> as the backend, and/or LSI HBAs, especially bad experience?
>

We have a few servers with 12-drive LSI RAID controllers that we use for
gluster (running XFS on RHEL 6.2). I don't recall seeing major issues,
but to be fair these particular systems see more hacking/dev/unit-test
work than longevity or stress testing. We are also not using MD in any
way (hardware RAID only).

I'd be happy to throw a similar workload at one of them if you can
describe your configuration in a bit more detail: the specific MD
configuration (RAID level, chunk size, etc.), the XFS format options and
mount options, anything else that might be in the I/O stack (LVM?), and
the specific bonnie++ test you're running (a single instance, or some
kind of looping test?). There's a rough sketch of the commands that
would capture most of this at the end of this message.

> In a test setup (Ubuntu 12.04, gluster 3.3.0, 24 x SATA HD on LSI
> MegaRAID controllers, MD RAID) I can cause XFS corruption just by
> throwing some bonnie++ load at the array - locally, without gluster.
> This happens within hours. The same test run over a week doesn't
> corrupt with ext4.
>
> I've just been bitten by this in production too, on a gluster brick I
> hadn't converted to ext4. I have the details and can post them
> separately if you wish, but the main symptoms were XFS timeout errors
> and stack traces in dmesg, plus XFS corruption (requiring a reboot and
> an xfs_repair that showed lots of errors, almost certainly with some
> data loss).
>

Could you collect the generic data and post it to linux-xfs? Somebody
might be able to read further into the problem via the stack traces. It
also might be worth testing an upstream kernel on your server, if
possible. (Again, a sketch of what to collect is at the end of this
message.)

Brian

> However, this leaves me with some unpalatable conclusions, and I'm not
> sure where to go from here.
>
> (1) XFS is a shonky filesystem, at least in the version supplied in
> Ubuntu kernels. This seems unlikely given its pedigree and the fact
> that it is heavily endorsed by Red Hat for their storage appliance.
>
> (2) Heavy write load in XFS is tickling a bug lower down in the stack
> (either MD RAID or the LSI mpt2sas driver/firmware), but heavy write
> load in ext4 doesn't. This would have to be a gross error, such as
> blocks queued for write being thrown away without being sent to the
> drive.
>
> I guess this is plausible - perhaps the usage pattern of write
> barriers is different, for example. However, I don't want to point the
> finger there without direct evidence either. There are no block I/O
> error events logged in dmesg.
>
> The only way I can think of to pin this down is to find the smallest
> MD RAID array I can reproduce the problem with, and then try to build
> a new system with a different controller card (as MD RAID + JBOD,
> and/or as a hardware RAID array).
>
> However, while I try to see what I can do about that, I would be
> grateful for any other experience people have in this area.
>
> Many thanks,
>
> Brian.
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
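
For reference, the configuration details asked about above could mostly
be captured with something like the following. This is only a sketch:
/dev/md0 and /brick are stand-ins for whatever the actual array device
and brick mount point are, and the bonnie++ line is just one example of
a looping run.

    # /dev/md0 and /brick are placeholders - substitute the real device
    # and mount point.

    # kernel and MD layout
    uname -a
    cat /proc/mdstat
    mdadm --detail /dev/md0        # RAID level, chunk size, layout

    # XFS geometry and mount options
    xfs_info /brick
    grep /brick /proc/mounts

    # example of a looping bonnie++ run: -x 10 repeats the run ten
    # times, -u is the user to run as when started as root (the test
    # directory is assumed to exist and be writable by that user)
    bonnie++ -d /brick/test -u nobody -x 10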
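
Similarly, the generic data for a linux-xfs report would be along these
lines. Again a sketch with the same assumed names; note that
xfs_repair -n (no-modify mode) should only be run against an unmounted
filesystem.

    # placeholders again: /brick is the brick mount point, /dev/md0 the
    # underlying MD device
    uname -a > xfs-report.txt
    dmesg >> xfs-report.txt                    # timeouts and stack traces
    modinfo mpt2sas | grep -i version >> xfs-report.txt
    xfs_info /brick >> xfs-report.txt          # while still mounted
    umount /brick
    xfs_repair -n /dev/md0 >> xfs-report.txt   # report-only, changes nothing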
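
And for the narrowing-down idea in the quoted text, a minimal
small-array reproduction could look roughly like this, with hypothetical
device names and with --level/--chunk matched to whatever the production
array uses:

    # four spare disks (hypothetical names sdb-sde) in a small MD array
    mdadm --create /dev/md1 --level=6 --chunk=512 --raid-devices=4 /dev/sd[b-e]
    mkfs.xfs /dev/md1
    mkdir -p /mnt/test && mount /dev/md1 /mnt/test
    chown nobody /mnt/test
    bonnie++ -d /mnt/test -u nobody -x 10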