Here is the output for one write_full on the in-memory OSD (the others look more or less the same e.g. most time is waiting for op_applied/op_commit):

{ "description": "osd_op(client.1764910.0:1 eos-root--0 [writefull 0~5] 15.fef59585 e3818)",
  "rmw_flags": 4,
  "received_at": "2013-09-23 23:33:16.536547",
  "age": "3.698191",
  "duration": "0.002225",
  "flag_point": "commit sent; apply or cleanup",
  "client_info": { "client": "client.1764910",
      "tid": 1},
  "events": [
        { "time": "2013-09-23 23:33:16.536706", "event": "waiting_for_osdmap"},
        { "time": "2013-09-23 23:33:16.536807", "event": "reached_pg"},
        { "time": "2013-09-23 23:33:16.536915", "event": "started"},
        { "time": "2013-09-23 23:33:16.536936", "event": "started"},
        { "time": "2013-09-23 23:33:16.537029", "event": "waiting for subops from [1056,1057]"},
        { "time": "2013-09-23 23:33:16.537110", "event": "commit_queued_for_journal_write"},
        { "time": "2013-09-23 23:33:16.537158", "event": "write_thread_in_journal_buffer"},
        { "time": "2013-09-23 23:33:16.537242", "event": "journaled_completion_queued"},
        { "time": "2013-09-23 23:33:16.537269", "event": "op_commit"},
        { "time": "2013-09-23 23:33:16.538547", "event": "sub_op_commit_rec"},
        { "time": "2013-09-23 23:33:16.538573", "event": "op_applied"},
        { "time": "2013-09-23 23:33:16.538715", "event": "sub_op_commit_rec"},
        { "time": "2013-09-23 23:33:16.538754", "event": "commit_sent"},
        { "time": "2013-09-23 23:33:16.538772", "event": "done"}]},

We should probably look at the same output for the JBOD configuration ...

Cheers Andreas.

________________________________________
From: Mark Nelson [mark.nelson@xxxxxxxxxxx]
Sent: 23 September 2013 18:03
To: Andreas Joachim Peters
Cc: Dan van der Ster; Sage Weil; ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: Object Write Latency

On 09/23/2013 10:38 AM, Andreas Joachim Peters wrote:
> We deployed 3 OSDs with an EXT4 using RapidDisk in-memory.
>
> The FS does 140k/s append+sync and the latency is now:
>
> ~1 ms for few byte objects with single replica
> ~2 ms for few byte objects three replica (instead of 65-80ms)

Interesting!  If you look at the slowest operations in the ceph admin socket now with dump_historic_ops, where are those operations spending their time?

>
> This gives probably the base-line of the best you can do with the current implementation.
>
> ==> the 80ms are probably just a 'feature' of the hardware (JBOD disks/controller) and we might try to find some tuning parameters to improve the latency slightly.

Hardware definitely plays a huge part in terms of Ceph performance.  You can run Ceph on just about anything, but it's surprising how different two roughly similar systems can perform.

>
> Could you just explain how the async api functions (is_complete, is_safe) map to the three states
>
> 1) object is transferred from client to all OSDs and is present in memory there
> 2) object is written to the OSD journal
> 3) object is committed from OSD journal to the OSD filesystem
>
> Is it correct that the object is visible by clients only when 3) has happened?

Yes, afaik.

>
> Thanks for your help,
> Andreas.
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
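
To make it easier to see where the time goes in a dump like the one at the top of this mail, here is a rough Python sketch that turns the event timestamps into per-step gaps. The timestamps and event names are copied from that dump; the helper names (parse, event_gaps) and the constants are just made up for illustration and are not part of any Ceph tool. The same idea should work on the JBOD output, assuming it has the same fields.

```python
from datetime import datetime, timedelta

# Values copied from the dump_historic_ops output quoted in this thread.
RECEIVED_AT = "2013-09-23 23:33:16.536547"
EVENTS = [
    ("2013-09-23 23:33:16.536706", "waiting_for_osdmap"),
    ("2013-09-23 23:33:16.536807", "reached_pg"),
    ("2013-09-23 23:33:16.536915", "started"),
    ("2013-09-23 23:33:16.536936", "started"),
    ("2013-09-23 23:33:16.537029", "waiting for subops from [1056,1057]"),
    ("2013-09-23 23:33:16.537110", "commit_queued_for_journal_write"),
    ("2013-09-23 23:33:16.537158", "write_thread_in_journal_buffer"),
    ("2013-09-23 23:33:16.537242", "journaled_completion_queued"),
    ("2013-09-23 23:33:16.537269", "op_commit"),
    ("2013-09-23 23:33:16.538547", "sub_op_commit_rec"),
    ("2013-09-23 23:33:16.538573", "op_applied"),
    ("2013-09-23 23:33:16.538715", "sub_op_commit_rec"),
    ("2013-09-23 23:33:16.538754", "commit_sent"),
    ("2013-09-23 23:33:16.538772", "done"),
]


def parse(ts):
    """Parse a timestamp in the format used by the dump (hypothetical helper)."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")


def event_gaps(received_at, events):
    """Return (event, microseconds since the previous timestamp) pairs.

    The first gap is measured from received_at.
    """
    prev = parse(received_at)
    gaps = []
    for ts, name in events:
        now = parse(ts)
        gaps.append((name, (now - prev) // timedelta(microseconds=1)))
        prev = now
    return gaps


if __name__ == "__main__":
    gaps = event_gaps(RECEIVED_AT, EVENTS)
    for name, usec in gaps:
        print(f"{usec:6d} us  {name}")
    print(f"{sum(u for _, u in gaps):6d} us  total")
```

For this op the gaps add up to 2225 us, which matches the reported duration of 0.002225, and the largest single gap (1278 us) is between op_commit and the first sub_op_commit_rec.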