On Sat, Apr 30, 2011 at 04:27:48PM +0100, Peter Grandi wrote:
> Regardless, op-journaled file system designs like JFS and XFS
> write small records (way below a stripe set size, and usually
> way below a chunk size) to the journal when they queue
> operations,

XFS will write log-stripe-unit sized records to disk. If the log
buffers are not full, it pads them. Supported log sunit sizes are
up to 256k.

> even if sometimes depending on design and options
> may "batch" the journal updates (potentially breaking safety
> semantics). Also they do small writes when they dequeue the
> operations from the journal to the actual metadata records
> involved.
>
> How bad can this be when the journal is, say, internal for a
> filesystem that is held on a wide-stride RAID6 set? I suspect
> very, very bad, with apocalyptic read-modify-write storms,
> eating IOPS.

Not bad at all, because the journal writes are sequential, and
XFS can have multiple log IOs in progress at once (up to
8 x 256k = 2MB). So in general, while metadata operations are in
progress, XFS will fill full stripes with log IO and you won't
get problems with RMW.

> Where are studies or even just impressions or anecdotes on how
> bad this is?

Just buy decent RAID hardware with a BBWC and journal IO does not
hurt at all.

> Are there instrumentation tools in JFS or XFS that may allow me
> to watch/inspect what is happening with the journal? For Linux
> MD, to see what the rates of stripe r-m-w cases are?

XFS has plenty of event tracing, including all the transaction
reservation and commit accounting. And if you know what you are
looking for, you can see all the log IO and transaction
completion processing in the event traces, too.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
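The padding behaviour Dave describes can be sketched with a little arithmetic. This is an illustrative model only, not XFS source code; the function name and constants are hypothetical, with the 256k log sunit and 8 in-flight log buffers taken from the discussion above.

```python
# Sketch (not XFS code): a partially filled log buffer is padded up
# to the log stripe unit before it is written, so journal writes stay
# stripe-aligned and sequential. Names/values are illustrative.

LOG_SUNIT = 256 * 1024      # example: the maximum supported log sunit
MAX_LOG_BUFFERS = 8         # XFS allows up to 8 log IOs in flight

def padded_log_write_size(bytes_used: int, sunit: int = LOG_SUNIT) -> int:
    """Round a partially filled log buffer up to the stripe unit."""
    if bytes_used == 0:
        return 0
    return ((bytes_used + sunit - 1) // sunit) * sunit

# A small (e.g. 10KB) journal record still goes out as one full
# 256KB sequential write:
assert padded_log_write_size(10 * 1024) == 256 * 1024

# With 8 buffers in flight, up to 2MB of log IO can be outstanding:
assert MAX_LOG_BUFFERS * LOG_SUNIT == 2 * 1024 * 1024
```

The point of the sketch: because each write is rounded to the stripe unit and issued sequentially, the RAID layer sees full, aligned stripe writes rather than the sub-stripe updates that trigger read-modify-write.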