On 26/05/2015 15:50, Barclay Jameson wrote:
Thank you for the great explanation, Zheng! That definitely shows what I was seeing with the bonnie++ test. What bad things would happen if I lowered the config option mds_tick_interval so that the journal is flushed every second or less?
The MDS does various pieces of housekeeping according to that interval, so setting it extremely low will waste some CPU cycles, and flushing the log more often will generate a larger number of smaller IOs. I would be very surprised if decreasing it to approximately 1s was harmful, though.
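For reference, a change like that would normally go in the [mds] section of ceph.conf. This is only a sketch of the fragment: the value of 1 is the one discussed above, not a tested recommendation, and the section placement is an assumption about a typical setup.

```ini
[mds]
# Assumed value from this thread: tick (and so flush the journal) roughly once a second.
mds tick interval = 1
```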
On a busy real world system, other metadata operations will often drive log writes through faster than waiting for a tick.
Does this also mean any custom code written should avoid use of fsync() if writing a large number of files?
You should call it only when your application requires it for consistency, and always expect it to be a high latency operation. Add up the latency from your client to your server and from the server to the disk, and the length of the IO queue on the disk, and then the return leg -- that is the *minimum* time you should expect to wait for an fsync.
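As a rough illustration of that sum, here is a minimal sketch in Python. The function name, the parameters, and the example numbers are all made up for illustration, and it assumes the return leg costs the same as the outbound leg and that every queued IO takes the same time to service.

```python
def min_fsync_latency_us(client_to_server_us, server_to_disk_us,
                         queued_ios, per_io_service_us):
    """Lower bound on fsync latency in microseconds (hypothetical model)."""
    # Outbound leg: client -> server -> disk.
    outbound = client_to_server_us + server_to_disk_us
    # Time spent waiting behind IOs already queued on the disk.
    queue_wait = queued_ios * per_io_service_us
    # Assumption: the return leg costs the same as the outbound leg.
    return_leg = outbound
    return outbound + queue_wait + return_leg


if __name__ == "__main__":
    # Made-up numbers: 500us network hop, 100us server to disk,
    # 4 IOs already queued at 5000us each.
    print(min_fsync_latency_us(500, 100, 4, 5000))  # 21200
```

Even with these fairly generous numbers the floor is over 20ms, and the queue on the disk dominates it.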
For example, a real world workload creating N files in a directory would hopefully call fsync on the directory once at the end, rather than in between every file, unless you really do need to be sure that the dentry for the preceding file will be persistent before you start writing the next file.
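A minimal sketch of that pattern in Python, assuming a POSIX system where a directory can be opened read-only and passed to fsync. The function name and file naming are hypothetical. Note that this only asks for the directory entries to be made persistent; if the file contents also need to be durable, each file still needs its own fsync at that point.

```python
import os
import tempfile


def create_files_then_sync(directory, count):
    """Create `count` small files, then fsync the directory once at the end."""
    for i in range(count):
        path = os.path.join(directory, f"file-{i:04d}.dat")
        with open(path, "wb") as f:
            f.write(b"payload\n")
        # Deliberately no fsync here: nothing needs file i to be
        # persistent before file i+1 is created.

    # One fsync on the directory after all the creates.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        create_files_then_sync(d, 100)
        print(len(os.listdir(d)))
```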
Sometimes it's easier to reason about it in terms of concurrency: if you have a bunch of IOs that you could safely run in parallel in a thread each, then you shouldn't be fsyncing between them, just at the point you join them.
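To make the concurrency framing concrete, here is a hedged sketch in Python: independent writes run in a thread each with no fsync between them, and the syncing happens only once they have all been joined. The helper names are hypothetical, and it assumes a POSIX system where fsync on a freshly opened read-only descriptor flushes data written earlier through another descriptor (true on Linux, not guaranteed everywhere).

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_one(path, data):
    """Write one file; no fsync, since nothing depends on it yet."""
    with open(path, "wb") as f:
        f.write(data)
    return path


def fsync_path(path):
    """fsync a file or directory by path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def parallel_write_then_sync(directory, payloads):
    """Run independent writes in parallel, then sync at the join point."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(write_one, os.path.join(directory, name), data)
                   for name, data in payloads.items()]
        # Join: wait for every write to finish.
        paths = [f.result() for f in futures]

    # Only now, after the join, pay for the syncs.
    for path in paths:
        fsync_path(path)
    fsync_path(directory)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        done = parallel_write_then_sync(d, {"a.dat": b"one", "b.dat": b"two"})
        print(len(done))
```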
John