Try to determine how much of the 200ms avg latency comes from osds vs
the qemu block driver.
It looks like osd.0 performs with low latency, but osd.1 latency is way
too high, and averaged together it comes out to about 200ms. The osd is
backed by btrfs over LVM2. Maybe the issue lies in the choice of backing
fs? All four osds run a similar setup (btrfs over LVM2), though, so I
doubt that is the reason, since osd.0 performs well.
I have read the full log between the osd_op for 3670 and its osd_reply,
and there are a number of pings from other osds (which were responded to
quickly) and a good number of osd_op_reply writes (the osd_sub_op for
these writes came 10 seconds before). So it appears 3670 was delayed by
a backlog of operations.
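
For anyone who wants to repeat this kind of check without reading the
whole log by eye, here is a rough sketch in Python. It assumes you have
already extracted (timestamp, kind, osd, tid) tuples from the logs
yourself; that tuple layout, the "op"/"reply" labels and the function
names are made up for illustration and are not a Ceph log format or API.
It pairs each request with its reply and averages the latency per osd,
which should make a single slow osd stand out from the overall average.

```python
from collections import defaultdict


def op_latencies(events):
    """Pair each request with its reply and collect latencies per osd.

    `events` is an iterable of (timestamp_seconds, kind, osd, tid) tuples,
    where kind is "op" or "reply". This layout is a hypothetical,
    pre-parsed form, not something Ceph emits directly.
    """
    sent = {}                      # (osd, tid) -> time the request was seen
    latencies = defaultdict(list)  # osd -> list of latencies in seconds
    for ts, kind, osd, tid in sorted(events, key=lambda e: e[0]):
        key = (osd, tid)
        if kind == "op":
            sent[key] = ts
        elif kind == "reply" and key in sent:
            latencies[osd].append(ts - sent.pop(key))
    return latencies


def mean_latency_ms(latencies):
    """Average latency per osd, in milliseconds."""
    return {osd: 1000.0 * sum(v) / len(v) for osd, v in latencies.items()}


if __name__ == "__main__":
    # Made-up sample values, only to show the shape of the input.
    sample = [
        (0.000, "op", "osd.0", 1),
        (0.005, "reply", "osd.0", 1),
        (1.000, "op", "osd.1", 3670),
        (1.400, "reply", "osd.1", 3670),
    ]
    for osd, ms in sorted(mean_latency_ms(op_latencies(sample)).items()):
        print(f"{osd}: {ms:.1f} ms")
```

Requests that never get a reply are simply left out of the averages, so
a stuck op will not show up here; you would have to look at what remains
unmatched separately.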
Once the latency is under control, you might look into changing guest
settings to send larger requests and readahead more.
Josh