Hi,
Our setup is this:
4 x OSD nodes:
E5-1630 CPU
32 GB RAM
Mellanox MT27520 56Gbps network cards
SATA controller LSI Logic SAS3008
Storage nodes are connected to SuperMicro chassis: 847E1C-R1K28JBOD
Each node has 2-3 spinning OSDs (6TB drives) and 2 ssd OSDs (240GB drives)
3 monitors running on OSD nodes
ceph hammer 0.94.2
Ubuntu 14.04
cache tier with ecpool (3+1)
We've ran some tests on the cluster and results were promising - speeds,
iops etc as expected, but now we tried to use more than one client for
testing and ran into some problems:
We've created a couple rbd images and mapped them on clients (kernel
rbd) running two rsync processes and one dd on a large number of files
(~ 250 GB of maildirs rsync'ed from one rbd image to the other and a dd
process writing one big 2TB file on another rbd image)
And the speeds now are less than OK, plus we have a lot requests blocked
warnings. We've left the processes to run over night but this morning I
came and none of the processes finished - they are able to write data,
but at a very, very slow rate.
Please help me diagnose this problem, everything seems to work, just
very, very slow... when we ran the tests with fio (librbd engine)
everything seemd fine.. I know that kernel implementation is slower, but
is this normal? I can't understand why are so many requests blocked.
Some diagnostic data:
root@cf01:/var/log/ceph# ceph -s
cluster 3469081f-9852-4b6e-b7ed-900e77c48bb5
health HEALTH_WARN
31 requests are blocked > 32 sec
monmap e1: 3 mons at
{cf01=10.4.10.211:6789/0,cf02=10.4.10.212:6789/0,cf03=10.4.10.213:6789/0}
election epoch 202, quorum 0,1,2 cf01,cf02,cf03
osdmap e1319: 18 osds: 18 up, 18 in
pgmap v933010: 2112 pgs, 19 pools, 10552 GB data, 2664 kobjects
14379 GB used, 42812 GB / 57192 GB avail
2111 active+clean
1 active+clean+scrubbing
client io 0 B/s rd, 12896 kB/s wr, 35 op/s
root@cf01:/var/log/ceph# ceph health detail
HEALTH_WARN 23 requests are blocked > 32 sec; 6 osds have slow requests
1 ops are blocked > 131.072 sec
22 ops are blocked > 65.536 sec
1 ops are blocked > 65.536 sec on osd.2
1 ops are blocked > 65.536 sec on osd.3
1 ops are blocked > 65.536 sec on osd.4
1 ops are blocked > 131.072 sec on osd.7
18 ops are blocked > 65.536 sec on osd.10
1 ops are blocked > 65.536 sec on osd.12
6 osds have slow requests
root@cf01:/var/log/ceph# grep WRN ceph.log | tail -50
2015-08-19 10:23:34.505669 osd.14 10.4.10.213:6810/21207 17942 : cluster
[WRN] 2 slow requests, 2 included below; oldest blocked for > 30.575870 secs
2015-08-19 10:23:34.505796 osd.14 10.4.10.213:6810/21207 17943 : cluster
[WRN] slow request 30.575870 seconds old, received at 2015-08-19
10:23:03.929722: osd_op(client.9203.1:22822591
rbd_data.37e02ae8944a.00000000000180ca [set-alloc-hint object_size
4194304 write_size 4194304,write 0~462848] 5.1c2aff5f ondisk+write
e1319) currently waiting for blocked object
2015-08-19 10:23:34.505803 osd.14 10.4.10.213:6810/21207 17944 : cluster
[WRN] slow request 30.560009 seconds old, received at 2015-08-19
10:23:03.945583: osd_op(client.9203.1:22822592
rbd_data.37e02ae8944a.00000000000180ca [set-alloc-hint object_size
4194304 write_size 4194304,write 462848~524288] 5.1c2aff5f ondisk+write
e1319) currently waiting for blocked object
2015-08-19 10:23:35.489927 osd.1 10.4.10.211:6812/9198 7921 : cluster
[WRN] 1 slow requests, 1 included below; oldest blocked for > 30.112783 secs
2015-08-19 10:23:35.490326 osd.1 10.4.10.211:6812/9198 7922 : cluster
[WRN] slow request 30.112783 seconds old, received at 2015-08-19
10:23:05.376339: osd_op(osd.14.1299:731492
rbd_data.37e02ae8944a.0000000000017a90 [copy-from ver 61293] 4.5aa9fb69
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:36.799861 osd.6 10.4.10.213:6806/22569 22663 : cluster
[WRN] 2 slow requests, 1 included below; oldest blocked for > 60.414800 secs
2015-08-19 10:23:36.800018 osd.6 10.4.10.213:6806/22569 22664 : cluster
[WRN] slow request 60.414800 seconds old, received at 2015-08-19
10:22:36.384453: osd_op(osd.11.1299:705465
rbd_data.37e02ae8944a.0000000000017cd2 [copy-from ver 61156] 4.c961fc05
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:34.652970 osd.7 10.4.10.213:6808/22888 21331 : cluster
[WRN] 16 slow requests, 1 included below; oldest blocked for > 58.978274
secs
2015-08-19 10:23:34.653013 osd.7 10.4.10.213:6808/22888 21332 : cluster
[WRN] slow request 30.703792 seconds old, received at 2015-08-19
10:23:03.948535: osd_op(osd.14.1299:731491
rbd_data.37e02ae8944a.00000000000180ca [copy-get max 8388608] 4.1c2aff5f
ack+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected
e1319) currently started
2015-08-19 10:23:35.653883 osd.7 10.4.10.213:6808/22888 21333 : cluster
[WRN] 17 slow requests, 3 included below; oldest blocked for > 59.979456
secs
2015-08-19 10:23:35.654260 osd.7 10.4.10.213:6808/22888 21334 : cluster
[WRN] slow request 30.737485 seconds old, received at 2015-08-19
10:23:04.916024: MOSDECSubOpWrite(4.1aes1 1319 ECSubWrite(tid=508685,
reqid=osd.15.1301:577673, at_version=1319'5556, trim_to=1318'2500,
trim_rollback_to=1319'5554)) currently started
2015-08-19 10:23:35.654268 osd.7 10.4.10.213:6808/22888 21335 : cluster
[WRN] slow request 30.533515 seconds old, received at 2015-08-19
10:23:05.119993: osd_op(osd.11.1299:705493
rbd_data.37e02ae8944a.0000000000017dcc [copy-from ver 60974] 4.4c845496
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:35.654272 osd.7 10.4.10.213:6808/22888 21336 : cluster
[WRN] slow request 30.255396 seconds old, received at 2015-08-19
10:23:05.398113: MOSDECSubOpWrite(4.169s1 1319 ECSubWrite(tid=614933,
reqid=osd.14.1299:731492, at_version=1319'5541, trim_to=1318'2500,
trim_rollback_to=1319'5538)) currently started
2015-08-19 10:23:36.654759 osd.7 10.4.10.213:6808/22888 21337 : cluster
[WRN] 16 slow requests, 2 included below; oldest blocked for > 60.980631
secs
2015-08-19 10:23:36.654766 osd.7 10.4.10.213:6808/22888 21338 : cluster
[WRN] slow request 60.980631 seconds old, received at 2015-08-19
10:22:35.674053: MOSDECSubOpWrite(4.1a9s1 1319 ECSubWrite(tid=604142,
reqid=osd.16.1272:599035, at_version=1319'5715, trim_to=1318'2700,
trim_rollback_to=1319'5711)) currently started
2015-08-19 10:23:36.654772 osd.7 10.4.10.213:6808/22888 21339 : cluster
[WRN] slow request 60.271394 seconds old, received at 2015-08-19
10:22:36.383289: MOSDECSubOpWrite(4.1a9s1 1319 ECSubWrite(tid=604143,
reqid=osd.16.1272:599036, at_version=1319'5716, trim_to=1318'2700,
trim_rollback_to=1319'5711)) currently started
2015-08-19 10:23:37.654933 osd.7 10.4.10.213:6808/22888 21340 : cluster
[WRN] 15 slow requests, 1 included below; oldest blocked for > 61.980829
secs
2015-08-19 10:23:37.655100 osd.7 10.4.10.213:6808/22888 21341 : cluster
[WRN] slow request 60.225464 seconds old, received at 2015-08-19
10:22:37.429417: MOSDECSubOpWrite(4.5s3 1319 ECSubWrite(tid=563247,
reqid=osd.11.1299:705465, at_version=1319'5545, trim_to=1318'2500,
trim_rollback_to=1319'5539)) currently started
2015-08-19 10:23:42.656604 osd.7 10.4.10.213:6808/22888 21342 : cluster
[WRN] 10 slow requests, 1 included below; oldest blocked for > 66.272631
secs
2015-08-19 10:23:42.657055 osd.7 10.4.10.213:6808/22888 21343 : cluster
[WRN] slow request 30.549952 seconds old, received at 2015-08-19
10:23:12.105968: osd_op(osd.14.1299:731499
rbd_data.37e02ae8944a.00000000000177ee [copy-from ver 66450] 4.a71f73bb
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:35.164040 osd.2 10.4.10.211:6808/10427 7307 : cluster
[WRN] 1 slow requests, 1 included below; oldest blocked for > 30.272625 secs
2015-08-19 10:23:35.164208 osd.2 10.4.10.211:6808/10427 7308 : cluster
[WRN] slow request 30.272625 seconds old, received at 2015-08-19
10:23:04.890780: osd_op(osd.15.1301:577673
rbd_data.37e02ae8944a.0000000000017b2c [copy-from ver 61313] 4.217af5ae
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:43.567637 osd.3 10.4.10.212:6804/31427 7180 : cluster
[WRN] 1 slow requests, 1 included below; oldest blocked for > 60.708292 secs
2015-08-19 10:23:43.567645 osd.3 10.4.10.212:6804/31427 7181 : cluster
[WRN] slow request 60.708292 seconds old, received at 2015-08-19
10:22:42.859163: osd_op(osd.14.1299:731455
rbd_data.37e02ae8944a.0000000000017f5c [copy-from ver 61192] 4.a6121781
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:36.762303 osd.4 10.4.10.212:6806/31912 7895 : cluster
[WRN] 4 slow requests, 1 included below; oldest blocked for > 61.142907 secs
2015-08-19 10:23:36.762685 osd.4 10.4.10.212:6806/31912 7896 : cluster
[WRN] slow request 60.400444 seconds old, received at 2015-08-19
10:22:36.360861: osd_op(osd.16.1272:599036
rbd_data.37e02ae8944a.0000000000017bf5 [copy-from ver 63134] 4.95eb47a9
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:43.765242 osd.4 10.4.10.212:6806/31912 7897 : cluster
[WRN] 4 slow requests, 1 included below; oldest blocked for > 67.403887 secs
2015-08-19 10:23:43.765623 osd.4 10.4.10.212:6806/31912 7898 : cluster
[WRN] slow request 30.849907 seconds old, received at 2015-08-19
10:23:12.914840: osd_op(osd.16.1272:599065
rbd_data.37e02ae8944a.00000000000180ae [copy-get max 8388608] 4.faca2da9
ack+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected
e1319) currently started
2015-08-19 10:23:45.493885 osd.1 10.4.10.211:6812/9198 7923 : cluster
[WRN] 1 slow requests, 1 included below; oldest blocked for > 30.014812 secs
2015-08-19 10:23:45.494246 osd.1 10.4.10.211:6812/9198 7924 : cluster
[WRN] slow request 30.014812 seconds old, received at 2015-08-19
10:23:15.478121: osd_op(osd.16.1272:599072
rbd_data.37e02ae8944a.0000000000017b1c [copy-from ver 65567] 4.bead06e1
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:46.803793 osd.6 10.4.10.213:6806/22569 22665 : cluster
[WRN] 1 slow requests, 1 included below; oldest blocked for > 60.664609 secs
2015-08-19 10:23:46.803971 osd.6 10.4.10.213:6806/22569 22666 : cluster
[WRN] slow request 60.664609 seconds old, received at 2015-08-19
10:22:46.138538: osd_op(osd.11.1299:705468
rbd_data.37e02ae8944a.0000000000017abe [copy-from ver 61142] 4.c0253e05
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:43.657718 osd.7 10.4.10.213:6808/22888 21344 : cluster
[WRN] 12 slow requests, 3 included below; oldest blocked for > 67.273998
secs
2015-08-19 10:23:43.657725 osd.7 10.4.10.213:6808/22888 21345 : cluster
[WRN] slow request 30.728099 seconds old, received at 2015-08-19
10:23:12.929188: osd_op(osd.11.1299:705499
rbd_data.37e02ae8944a.0000000000017d85 [copy-from ver 60964] 4.ba4f1896
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:43.657887 osd.7 10.4.10.213:6808/22888 21346 : cluster
[WRN] slow request 30.675844 seconds old, received at 2015-08-19
10:23:12.981443: MOSDECSubOpRead(4.1a9s1 1319 ECSubRead(tid=604164,
to_read={faca2da9/rbd_data.37e02ae8944a.00000000000180ae/head//4=0,2797408,0},
attrs_to_read=)) currently no flag points reached
2015-08-19 10:23:43.657894 osd.7 10.4.10.213:6808/22888 21347 : cluster
[WRN] slow request 60.771125 seconds old, received at 2015-08-19
10:22:42.886162: MOSDECSubOpWrite(4.181s2 1319 ECSubWrite(tid=566649,
reqid=osd.14.1299:731455, at_version=1319'5518, trim_to=1318'2500,
trim_rollback_to=1319'5514)) currently started
2015-08-19 10:23:44.658111 osd.7 10.4.10.213:6808/22888 21348 : cluster
[WRN] 13 slow requests, 2 included below; oldest blocked for > 68.274757
secs
2015-08-19 10:23:44.658118 osd.7 10.4.10.213:6808/22888 21349 : cluster
[WRN] slow request 30.969646 seconds old, received at 2015-08-19
10:23:13.688399: osd_op(osd.10.1297:558590
rbd_data.37e02ae8944a.0000000000017b0e [copy-from ver 63188] 4.646dd0af
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently started
2015-08-19 10:23:44.658122 osd.7 10.4.10.213:6808/22888 21350 : cluster
[WRN] slow request 30.969351 seconds old, received at 2015-08-19
10:23:13.688695: osd_op(osd.10.1297:558591
rbd_data.37e02ae8944a.0000000000017973 [copy-from ver 63177] 4.fd21a2af
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently started
2015-08-19 10:23:45.658546 osd.7 10.4.10.213:6808/22888 21351 : cluster
[WRN] 13 slow requests, 1 included below; oldest blocked for > 69.275060
secs
2015-08-19 10:23:45.658554 osd.7 10.4.10.213:6808/22888 21352 : cluster
[WRN] slow request 30.156082 seconds old, received at 2015-08-19
10:23:15.502267: MOSDECSubOpWrite(4.e1s2 1319 ECSubWrite(tid=614936,
reqid=osd.16.1272:599072, at_version=1319'5510, trim_to=1318'2500,
trim_rollback_to=1319'5508)) currently started
2015-08-19 10:23:46.658900 osd.7 10.4.10.213:6808/22888 21353 : cluster
[WRN] 13 slow requests, 2 included below; oldest blocked for > 70.275453
secs
2015-08-19 10:23:46.658908 osd.7 10.4.10.213:6808/22888 21354 : cluster
[WRN] slow request 30.776570 seconds old, received at 2015-08-19
10:23:15.882172: osd_op(osd.11.1299:705504
rbd_data.37e02ae8944a.0000000000018028 [copy-from ver 60995] 4.cf21fa96
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently started
2015-08-19 10:23:46.658913 osd.7 10.4.10.213:6808/22888 21355 : cluster
[WRN] slow request 60.273271 seconds old, received at 2015-08-19
10:22:46.385472: MOSDECSubOpWrite(4.5s3 1319 ECSubWrite(tid=563263,
reqid=osd.11.1299:705468, at_version=1319'5546, trim_to=1318'2500,
trim_rollback_to=1319'5539)) currently started
2015-08-19 10:23:49.659552 osd.7 10.4.10.213:6808/22888 21356 : cluster
[WRN] 12 slow requests, 1 included below; oldest blocked for > 73.276082
secs
2015-08-19 10:23:49.659559 osd.7 10.4.10.213:6808/22888 21357 : cluster
[WRN] slow request 30.075904 seconds old, received at 2015-08-19
10:23:19.583468: osd_op(osd.12.1298:558271
rbd_data.37e02ae8944a.0000000000017f03 [copy-from ver 62273] 4.aafb718f
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:51.660604 osd.7 10.4.10.213:6808/22888 21358 : cluster
[WRN] 11 slow requests, 1 included below; oldest blocked for > 65.274645
secs
2015-08-19 10:23:51.661052 osd.7 10.4.10.213:6808/22888 21359 : cluster
[WRN] slow request 30.271682 seconds old, received at 2015-08-19
10:23:21.388434: osd_op(osd.12.1298:558278
rbd_data.37e02ae8944a.0000000000017b74 [copy-from ver 62260] 4.60d7e38f
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
2015-08-19 10:23:53.768995 osd.4 10.4.10.212:6806/31912 7899 : cluster
[WRN] 2 slow requests, 1 included below; oldest blocked for > 60.660749 secs
2015-08-19 10:23:53.769507 osd.4 10.4.10.212:6806/31912 7900 : cluster
[WRN] slow request 60.660749 seconds old, received at 2015-08-19
10:22:53.107765: osd_op(osd.16.1272:599047
rbd_data.37e02ae8944a.000000000001803d [copy-from ver 63185] 4.a327fba9
ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e1319)
currently commit_sent
root@cf01:/var/log/ceph# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-15 10.00000 root sata
-7 3.00000 disktype cf01sata
0 1.00000 osd.0 up 1.00000 1.00000
1 1.00000 osd.1 up 1.00000 1.00000
2 1.00000 osd.2 up 1.00000 1.00000
-9 2.00000 disktype cf02sata
3 1.00000 osd.3 up 1.00000 1.00000
4 1.00000 osd.4 up 1.00000 1.00000
-11 3.00000 disktype cf03sata
5 1.00000 osd.5 up 1.00000 1.00000
6 1.00000 osd.6 up 1.00000 1.00000
7 1.00000 osd.7 up 1.00000 1.00000
-13 2.00000 disktype cf04sata
8 1.00000 osd.8 up 1.00000 1.00000
9 1.00000 osd.9 up 1.00000 1.00000
-14 8.00000 root ssd
-6 2.00000 disktype cf01ssd
10 1.00000 osd.10 up 1.00000 1.00000
11 1.00000 osd.11 up 1.00000 1.00000
-8 2.00000 disktype cf02ssd
12 1.00000 osd.12 up 1.00000 1.00000
13 1.00000 osd.13 up 1.00000 1.00000
-10 2.00000 disktype cf03ssd
14 1.00000 osd.14 up 1.00000 1.00000
15 1.00000 osd.15 up 1.00000 1.00000
-12 2.00000 disktype cf04ssd
17 1.00000 osd.17 up 1.00000 1.00000
16 1.00000 osd.16 up 1.00000 1.00000
-1 18.00000 root default
-2 5.00000 host cf01
-6 2.00000 disktype cf01ssd
10 1.00000 osd.10 up 1.00000 1.00000
11 1.00000 osd.11 up 1.00000 1.00000
-7 3.00000 disktype cf01sata
0 1.00000 osd.0 up 1.00000 1.00000
1 1.00000 osd.1 up 1.00000 1.00000
2 1.00000 osd.2 up 1.00000 1.00000
-3 4.00000 host cf02
-8 2.00000 disktype cf02ssd
12 1.00000 osd.12 up 1.00000 1.00000
13 1.00000 osd.13 up 1.00000 1.00000
-9 2.00000 disktype cf02sata
3 1.00000 osd.3 up 1.00000 1.00000
4 1.00000 osd.4 up 1.00000 1.00000
-4 5.00000 host cf03
-10 2.00000 disktype cf03ssd
14 1.00000 osd.14 up 1.00000 1.00000
15 1.00000 osd.15 up 1.00000 1.00000
-11 3.00000 disktype cf03sata
5 1.00000 osd.5 up 1.00000 1.00000
6 1.00000 osd.6 up 1.00000 1.00000
7 1.00000 osd.7 up 1.00000 1.00000
-5 4.00000 host cf04
-12 2.00000 disktype cf04ssd
17 1.00000 osd.17 up 1.00000 1.00000
16 1.00000 osd.16 up 1.00000 1.00000
-13 2.00000 disktype cf04sata
8 1.00000 osd.8 up 1.00000 1.00000
9 1.00000 osd.9 up 1.00000 1.00000
Cheers,
J
--
Jacek Jarosiewicz
Administrator Systemów Informatycznych
----------------------------------------------------------------------------------------
SUPERMEDIA Sp. z o.o. z siedzibą w Warszawie
ul. Senatorska 13/15, 00-075 Warszawa
Sąd Rejonowy dla m.st.Warszawy, XII Wydział Gospodarczy Krajowego
Rejestru Sądowego,
nr KRS 0000029537; kapitał zakładowy 42.756.000 zł
NIP: 957-05-49-503
Adres korespondencyjny: ul. Jubilerska 10, 04-190 Warszawa
----------------------------------------------------------------------------------------
SUPERMEDIA -> http://www.supermedia.pl
dostep do internetu - hosting - kolokacja - lacza - telefonia
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com