I'd just like to report the same behaviour on my test cluster with 0.48.

I've set up a single box (SL6.1, kernel 2.6.32-220.23.1) with one mds, one mon and one osd, and replication set to '1' for both data and metadata. Having mounted using ceph-fuse, I'm running a simple fio job to create load:

    [global]
    directory=/mnt/ceph
    size=500M
    rw=read
    ioengine=libaio

    [simple]

I'm then watching the latency with ioping. With rw=read (sequential reads), rw=randread (random reads) or rw=write (sequential writes) I see no problems and the latency sits around 1-2ms. However, with rw=randwrite (random writes) I see the latency jump to between 5 and 60 seconds, and warning lines of the following type appear:

    2012-07-19 10:29:39.417625 osd.0 [WRN] 11 slow requests, 6 included below; oldest blocked for > 54.425766 secs
    [WRN] slow request 54.420958 seconds old, received at 2012-07-19 10:28:44.996584: osd_op(client.4113.0:9153 100000003ed.0000003b [write 847872~4096] 0.dc4b476f snapc 1=[]) v4 currently started
    2012-07-19 10:29:39.417641 osd.0 [WRN] slow request 54.420587 seconds old, received at 2012-07-19 10:28:44.996955: osd_op(client.4113.0:9154 100000003ed.00000000 [write 1175552~4096] 0.44a7cb80 snapc 1=[]) v4 currently started
    [...snip...]

Let me know if there's any more information I can provide that might help with diagnosing the problem (bearing in mind that I'm new to Ceph, so I might need extra notes on generating tests, dumps etc. :) )

Thanks,
Matthew

-- 
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
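
P.S. For anyone wanting to summarise warnings like the ones quoted above, here is a rough Python sketch that pulls the age, object and write extent out of each "slow request" line. It assumes the log format matches the two sample lines in this message; the function name parse_slow_requests and the field names are hypothetical and not part of Ceph.

```python
import re

# Pattern modelled only on the two "slow request" lines quoted in this
# message; other Ceph versions may format these warnings differently.
SLOW_RE = re.compile(
    r"slow request (?P<age>\d+\.\d+) seconds old, "
    r"received at (?P<received>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"osd_op\((?P<client>\S+) (?P<object>\S+) "
    r"\[(?P<op>\w+) (?P<offset>\d+)~(?P<length>\d+)\]"
)


def parse_slow_requests(text):
    """Return one dict per 'slow request' entry found in text."""
    requests = []
    for m in SLOW_RE.finditer(text):
        requests.append({
            "age": float(m.group("age")),        # seconds the request has waited
            "received": m.group("received"),     # timestamp string, left unparsed
            "client": m.group("client"),
            "object": m.group("object"),
            "op": m.group("op"),
            "offset": int(m.group("offset")),    # byte offset within the object
            "length": int(m.group("length")),    # bytes written
        })
    return requests


# The warning lines quoted in this message, used as sample input.
SAMPLE = """\
2012-07-19 10:29:39.417625 osd.0 [WRN] 11 slow requests, 6 included below; oldest blocked for > 54.425766 secs
[WRN] slow request 54.420958 seconds old, received at 2012-07-19 10:28:44.996584: osd_op(client.4113.0:9153 100000003ed.0000003b [write 847872~4096] 0.dc4b476f snapc 1=[]) v4 currently started
2012-07-19 10:29:39.417641 osd.0 [WRN] slow request 54.420587 seconds old, received at 2012-07-19 10:28:44.996955: osd_op(client.4113.0:9154 100000003ed.00000000 [write 1175552~4096] 0.44a7cb80 snapc 1=[]) v4 currently started
"""

for req in parse_slow_requests(SAMPLE):
    print(req["object"], req["op"], req["offset"], req["length"], req["age"])
```

The summary line ("11 slow requests, 6 included below") is deliberately not matched, so only the per-request entries are returned.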