Quoting Dave Chinner (2013-06-17 21:58:06)
> On Mon, Jun 17, 2013 at 10:34:57AM -0400, Chris Mason wrote:
> > Quoting Dave Chinner (2013-06-14 22:50:50)
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > >
> > > Doing writeback on lots of little files causes terrible IOPS storms
> > > because of the per-mapping writeback plugging we do. This
> > > essentially causes immediate dispatch of IO for each mapping,
> > > regardless of the context in which writeback is occurring.
> > >
> > > IOWs, running a concurrent write-lots-of-small-4k-files workload
> > > using fsmark on XFS results in a huge number of IOPS being issued
> > > for data writes. Metadata writes are sorted and plugged at a high
> > > level by XFS, so they aggregate nicely into large IOs. However,
> > > data writeback IOs are dispatched in individual 4k IOs, even when
> > > the blocks of two consecutively written files are adjacent.
> > >
> > > Test VM: 8p, 8GB RAM, 4xSSD in RAID0, 100TB sparse XFS filesystem,
> > > metadata CRCs enabled.
> > >
> > > Kernel: 3.10-rc5 + xfsdev + my 3.11 xfs queue (~70 patches)
> >
> > I'm a little worried about this one, just because of the impact on
> > ssds from plugging in the aio code:
>
> I'm testing on SSDs, but the impact has nothing to do with the
> underlying storage.
>
> > https://lkml.org/lkml/2011/12/13/326
>
> Oh, that's a completely different situation - it's application
> submitted IO where IO latency is a determining factor in
> performance.
>
> This is for background writeback, where IO latency is not a primary
> performance issue - maximum throughput is what we are trying to
> achieve here. For writeback, well-formed IO has a much greater
> impact on throughput than low latency IO submission, even for SSD
> based storage.

Very true, but at the same time we do wait for background writeback
sometimes. It's worth a quick test...

> > How exactly was your FS created? I'll try it here.
>
> The host has an XFS filesystem on a md RAID0 of 4x40GB slices off
> larger SSDs:
>
> $ cat /proc/mdstat
> Personalities : [raid0]
> md2 : active raid0 sdb1[0] sde1[3] sdd1[2] sdc1[1]
>       167772032 blocks super 1.2 32k chunks
>
> built with mkfs.xfs defaults. A sparse 100TB file is created and
> then fed to KVM with cache=none,virtio.
>
> The guest formats the 100TB device using default mkfs.xfs parameters
> and uses default mount options, so it's a 100TB filesystem with
> about 150GB of real storage in it...
>
> The underlying hardware controller that the SSDs are attached to is
> limited to roughly 27,000 random 4k write IOPS, and that's the IO
> pattern that writeback is splattering at the device.

Ok, I'll try something less exotic here ;)

-chris
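[Editorial note: for readers unfamiliar with the mechanism being argued
about, the kernel facility in question is the block plugging API,
blk_start_plug()/blk_finish_plug(). Holding one plug across the
writeback of many inodes lets adjacent small data writes merge in the
plug list before dispatch, instead of plugging and unplugging per
mapping. The sketch below is illustrative only and is not Dave's
actual patch; writeback_inodes_plugged() is a hypothetical helper, and
the inode list member name differs between kernel versions.]

#include <linux/blkdev.h>
#include <linux/fs.h>
#include <linux/writeback.h>

/*
 * Illustrative sketch only -- not the patch under discussion.  Hold a
 * single plug across the writeback of a whole batch of dirty inodes so
 * that adjacent 4k writes from consecutively written files can merge
 * in the plug list before dispatch, rather than plugging (and so
 * immediately flushing) each mapping on its own.
 *
 * writeback_inodes_plugged() is a hypothetical helper; the list member
 * is i_io_list in recent kernels (i_wb_list in 3.10-era trees).
 */
static void writeback_inodes_plugged(struct list_head *io_list,
				     struct writeback_control *wbc)
{
	struct inode *inode;
	struct blk_plug plug;

	blk_start_plug(&plug);		/* hold bios on the per-task plug list */

	list_for_each_entry(inode, io_list, i_io_list)
		do_writepages(inode->i_mapping, wbc);

	blk_finish_plug(&plug);		/* single unplug: merged, larger IOs */
}

[The trade-off Chris raises applies here too: a wider plug adds
submission latency for IO held in the plug list, which hurts
application-submitted IO (the aio case he links) but matters much less
for background writeback, where aggregate throughput dominates.]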