Actually that didn't illustrate my point very well, since you see individual requests being sent to the driver without waiting for each one to complete. But if you look at the full output, you can see that once the queue is full, you're at the mercy of waiting for individual IOs to complete before you can send new ones. Sometimes it's one at a time, sometimes 3-4 complete and you can insert a few at once. I think any benefit from that is countered by the roundtrip network latency incurred in sending each request and receiving its result.

For the record, I'm not saying this is the entire reason the performance is lower (obviously not, since iSCSI does better), just that when you're talking about high IOPS, adding 100us (best case for gigabit) to each request and response is significant. If an IO takes 25us locally (an SSD can do 40k IOPS or more at a queue depth of 1) and you share that storage over gigabit, you've just increased the latency by an order of magnitude, and as seen there is only so much simultaneous IO going on when the queue depth is raised. Add to that that multipathing isn't doing parallel IO but interleaving, plus the extra traffic for distributed storage.
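
To put rough numbers on that, here's a back-of-envelope sketch in Python using the example figures above (25us local service time, ~100us of added latency per direction on gigabit). These are illustrative assumptions, not measurements:

# Assumed figures from the discussion above, not measurements:
# ~25us local SSD service time (~40k IOPS at QD1),
# ~100us of added latency in each direction over gigabit.

def qd1_iops(local_service_us, one_way_net_us=0.0):
    """Effective IOPS at queue depth 1: each IO must fully complete
    (including any network roundtrip) before the next one is issued."""
    per_io_us = local_service_us + 2 * one_way_net_us
    return 1_000_000 / per_io_us

local = qd1_iops(25)          # ~40,000 IOPS on the local SSD
remote = qd1_iops(25, 100)    # ~4,400 IOPS once the roundtrip is added

print(f"local QD1:  {local:8.0f} IOPS ({1e6/local:.0f}us per IO)")
print(f"remote QD1: {remote:8.0f} IOPS ({1e6/remote:.0f}us per IO)")
print(f"latency multiplier: {(25 + 2 * 100) / 25:.0f}x")

At queue depth 1 that works out to roughly 4,400 IOPS instead of 40,000, i.e. about a 9x hit per IO, which is what I mean by an order of magnitude.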