On 06/15/2012 2:03 PM, Mark Nelson wrote:
On 06/15/2012 12:56 AM, Stefan Priebe - Profihost AG wrote:

Let me preface this by saying that I haven't specifically read through the rados bench code. Having said that, the basic idea is that you have a pipeline where a request is sent from the client to an OSD. If you specify "-t 1", the client will only send a single request at a time, which means that the entire process is serial and you are entirely latency bound.

Now think about what happens when the client sends a request. Before the client gets an acknowledgement, the request must:

1) Go through client-side processing.
2) Travel over the IP network to the destination OSD.
3) Go through all of the queue-processing code on the OSD.
4a) Write the data to the journal (or to the faster of the journal/data disk when using btrfs; note that journal writes may stall if the data disk is too slow and the journal has gotten sufficiently ahead of it).
4b) Complete replication to the other OSDs, based on the pool's replication level and the placement group the data lands in (basically steps 1, 2, 3, 4a, and 5 all over again, with the OSD acting as the client).
5) Send the ack back to the client over the IP network.

If only one request is sent at a time, most of the hardware sits idle while the request makes its way through the pipeline. With multiple concurrent requests, the OSD(s) can better utilize all of the hardware (i.e. some requests can be coming in over the network while others are writing to disk and still others are replicating). You can probably imagine that once you have multiple OSDs on multiple nodes, having concurrent requests in flight helps you even more.
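The latency-bound effect described above can be sketched with a small standalone simulation (this is just an illustration, not rados bench code; the 10 ms per-request latency and the helper names are made up for the example). Each simulated request simply sleeps for one round-trip time; with one request in flight the total runtime is the sum of all latencies, while with many in flight the latencies overlap:

```python
import time
from concurrent.futures import ThreadPoolExecutor

REQUEST_LATENCY = 0.01  # hypothetical 10 ms client->OSD->client round trip

def send_request(i):
    # Stand-in for one write: the client blocks for the full round-trip
    # latency (steps 1-5 above) before the ack comes back.
    time.sleep(REQUEST_LATENCY)
    return i

def run(concurrency, total=32):
    # 'concurrency' plays the role of the "-t" flag: how many requests
    # may be in flight at once.
    start = time.time()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(send_request, range(total)))
    return time.time() - start

serial = run(concurrency=1)      # like "-t 1": entirely latency bound
pipelined = run(concurrency=16)  # like "-t 16": latencies overlap
print(f"serial: {serial:.2f}s, pipelined: {pipelined:.2f}s")
```

In the serial case the runtime is roughly total * REQUEST_LATENCY, while the pipelined case approaches total / concurrency * REQUEST_LATENCY, which is the same reason higher "-t" values improve rados bench throughput on otherwise idle hardware.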
Thanks for your explanation.

Stefan