On Thu, Jul 09, 2015 at 05:32:50PM +0000, Hogan Whittall wrote:
> Hello,
>
> Recently we encountered a previously-reported issue
> regarding write amplification with MySQL replication and XFS when
> used with certain RAID controllers (In our case, HP P420). That
> issue exactly matches our issue and was documented by someone else
> here - http://oss.sgi.com/archives/xfs/2013-03/msg00133.html -
> but I don't see any resolution. I will say that the problem
> *does not* exist when mkfs.xfs 2.9.6 is used to format the
> filesystem on RHEL6 as that sets sunit=0 and swidth=0 instead of
> setting based on minimum_io_size and optimal_io_size.

The issue is the log stripe unit padding log buffers on log writes.
Your workload likely has lots of fsync() calls, which means log
writes go from being padded to the next sector boundary to being
padded to the next log stripe unit boundary.

> We have systems that are identical in how they are built and
> configured, we can take a RHEL6 box that has the MySQL partition
> formatted with mkfs.xfs v3.1.1 and reproduce the write
> amplification problem with MySQL replication every single time.

Because the more recent mkfs is probably getting sunit/swidth
direct from the hardware via the kernel.

> If we take the same box and format the MySQL partition with
> mkfs.xfs 2.9.6, then bring up MySQL with the exact same
> configuration there is no problem.

Because that version of mkfs doesn't know about the kernel optimum
IO size parameters in sysfs that are set based on hardware mode page
support. Hence older mkfs is not able to set stripe unit defaults
for hardware RAID automatically....

Your other option is to use a small log, so that the log writes end
up being permanently pinned in the RAID BBWC, and so the bandwidth
they consume doesn't matter because it never hits the platters...
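
To put rough numbers on the padding described above, here is a minimal
sketch in Python. The 512-byte sector, 32 KiB log stripe unit and
1000-byte log write are assumed values chosen for illustration, not
figures from this report, and padded_size() is a hypothetical helper,
not anything in XFS.

```python
def padded_size(payload_bytes: int, boundary_bytes: int) -> int:
    """Round payload_bytes up to the next multiple of boundary_bytes."""
    if boundary_bytes <= 0:
        raise ValueError("boundary_bytes must be positive")
    if payload_bytes <= 0:
        return 0
    # Ceiling division, then scale back up to a byte count.
    return -(-payload_bytes // boundary_bytes) * boundary_bytes


# Assumed values for illustration only.
SECTOR_BYTES = 512                 # log writes padded to a sector (sunit=0)
LOG_STRIPE_UNIT_BYTES = 32 * 1024  # log writes padded to a log stripe unit
PAYLOAD_BYTES = 1000               # a small log write forced out by fsync()

sector_padded = padded_size(PAYLOAD_BYTES, SECTOR_BYTES)
lsunit_padded = padded_size(PAYLOAD_BYTES, LOG_STRIPE_UNIT_BYTES)
ratio = lsunit_padded / sector_padded

print(f"padded to sector:          {sector_padded} bytes")
print(f"padded to log stripe unit: {lsunit_padded} bytes")
print(f"log write size ratio:      {ratio:.1f}x")
```

The smaller the log write that each fsync() forces out, the larger
this ratio gets, which is why an fsync-heavy workload such as
replication would show the effect most clearly.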

FWIW, this problem has only been reported for HP RAID hardware, so
I suspect that there is something in the HP RAID firmware that
doesn't handle streaming FUA writes (the log writes) mixed with
other random IO particularly well.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx