Re: waiting for sub ops

Hi Sage,

On 13.05.2012 02:15, Sage Weil wrote:
On Fri, 11 May 2012, Stefan Priebe - Profihost AG wrote:
Hi,

while doing some stress testing with bonnie++ I'm always seeing these
messages across all OSDs; here is just an example for osd.2.

All machines are connected with 2x 1Gbit/s bonding mode 6 to a HP switch.

These are just telling you that some operations are taking > 30 seconds.
The 'waiting for sub ops' means that it is waiting for the write/update to
be acked by other replicas.  Either there is some load imbalance (some
osds are more busy than others), or everyone is similarly loaded and the
request queues are just long across the board.

Hmm, but there must be something wrong in my test setup.

1.) the osd bench shows 150MB/s per osd
2.) iperf shows a constant 930Mbit/s per eth
3.) when I write 16GB with dd to the ceph mount I see spikes to 450Mbit/s and drops to 90kb/s for long periods of time. The overall dd speed is then 40Mbit/s
4.) the speed drops to 90kb/s while seeing the
"[WRN] slow request received at 2012-05-13 20:01:55.227811: osd_op(client.4102.1:38432 10000000004.000003ae [write 0~4194304] 0.5f4dfca8 snapc 1=[]) currently waiting for sub ops"
messages.
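
To make it easier to see which operations are stuck and in what state, the warning can be split into its fields. This is only a rough sketch: the pattern is inferred from the single line quoted above, the function name parse_slow_request is made up for illustration, and other Ceph versions may format the message differently.

```python
import re

# Pattern inferred from the one warning quoted above (an assumption,
# not an official Ceph log format definition).
SLOW_REQUEST = re.compile(
    r"slow request received at (?P<received>\S+ [\d:.]+): "
    r"osd_op\((?P<client>\S+) (?P<object>\S+) "
    r"\[(?P<op>\w+) (?P<offset>\d+)~(?P<length>\d+)\].*\) "
    r"currently (?P<state>.+)$"
)


def parse_slow_request(line):
    """Return the fields of a slow request warning as a dict, or None."""
    match = SLOW_REQUEST.search(line.strip().strip('"'))
    if match is None:
        return None
    fields = match.groupdict()
    # offset~length is a byte range, so convert both to integers
    fields["offset"] = int(fields["offset"])
    fields["length"] = int(fields["length"])
    return fields


sample = (
    "[WRN] slow request received at 2012-05-13 20:01:55.227811: "
    "osd_op(client.4102.1:38432 10000000004.000003ae [write 0~4194304] "
    "0.5f4dfca8 snapc 1=[]) currently waiting for sub ops"
)

info = parse_slow_request(sample)
print(info["op"], info["length"] // (1024 * 1024), "MiB,", info["state"])
```

For the line above this reports a 4 MiB write that is still waiting for sub ops, i.e. waiting for the replicas to ack.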

The client shows this in dmesg:
[2012-05-13 19:55:26]  libceph:  tid 38132 timed out on osd2, will reset osd
[2012-05-13 19:55:46]  libceph:  tid 38400 timed out on osd0, will reset osd
[2012-05-13 19:56:31]  libceph:  tid 38886 timed out on osd2, will reset osd
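
To check whether the timeouts cluster on particular OSDs (which would point at a load imbalance rather than everything being equally slow), the dmesg lines can be counted per OSD. Again just a sketch: the pattern is based only on the three lines above, and timeouts_per_osd is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern based on the three dmesg lines quoted above (an assumption
# about the libceph message format, not taken from the kernel source).
TIMEOUT = re.compile(r"libceph:\s+tid (\d+) timed out on (osd\d+)")


def timeouts_per_osd(lines):
    """Count 'timed out' messages per OSD name."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


sample = [
    "[2012-05-13 19:55:26]  libceph:  tid 38132 timed out on osd2, will reset osd",
    "[2012-05-13 19:55:46]  libceph:  tid 38400 timed out on osd0, will reset osd",
    "[2012-05-13 19:56:31]  libceph:  tid 38886 timed out on osd2, will reset osd",
]

counts = timeouts_per_osd(sample)
print(counts.most_common())  # [('osd2', 2), ('osd0', 1)]
```

Three lines are far too few to draw a conclusion from; the count only becomes meaningful over a full test run.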

Greetings and thanks,
Stefan