On Wed, Oct 14, 2015 at 1:03 AM, Sage Weil <sweil@xxxxxxxxxx> wrote: > On Mon, 12 Oct 2015, Robert LeBlanc wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA256 >> >> After a weekend, I'm ready to hit this from a different direction. >> >> I replicated the issue with Firefly so it doesn't seem an issue that >> has been introduced or resolved in any nearby version. I think overall >> we may be seeing [1] to a great degree. From what I can extract from >> the logs, it looks like in situations where OSDs are going up and >> down, I see I/O blocked at the primary OSD waiting for peering and/or >> the PG to become clean before dispatching the I/O to the replicas. >> >> In an effort to understand the flow of the logs, I've attached a small >> 2 minute segment of a log I've extracted what I believe to be >> important entries in the life cycle of an I/O along with my >> understanding. If someone would be kind enough to help my >> understanding, I would appreciate it. >> >> 2015-10-12 14:12:36.537906 7fb9d2c68700 10 -- 192.168.55.16:6800/11295 >> >> 192.168.55.12:0/2013622 pipe(0x26c90000 sd=47 :6800 s=2 pgs=2 cs=1 >> l=1 c=0x32c85440).reader got message 19 0x2af81700 >> osd_op(client.6709.0:67 rbd_data.103c74b0dc51.000000000000003a >> [set-alloc-hint object_size 4194304 write_size 4194304,write >> 0~4194304] 0.474a01a9 ack+ondisk+write+known_if_redirected e44) v5 >> >> - ->Messenger has recieved the message from the client (previous >> entries in the 7fb9d2c68700 thread are the individual segments that >> make up this message). >> >> 2015-10-12 14:12:36.537963 7fb9d2c68700 1 -- 192.168.55.16:6800/11295 >> <== client.6709 192.168.55.12:0/2013622 19 ==== >> osd_op(client.6709.0:67 rbd_data.103c74b0dc51.000000000000003a >> [set-alloc-hint object_size 4194304 write_size 4194304,write >> 0~4194304] 0.474a01a9 ack+ondisk+write+known_if_redirected e44) v5 >> ==== 235+0+4194304 (2317308138 0 2001296353) 0x2af81700 con 0x32c85440 >> >> - ->OSD process acknowledges that it has received the write. >> >> 2015-10-12 14:12:36.538096 7fb9d2c68700 15 osd.4 44 enqueue_op >> 0x3052b300 prio 63 cost 4194304 latency 0.012371 >> osd_op(client.6709.0:67 rbd_data.103c74b0dc51.000000000000003a >> [set-alloc-hint object_size 4194304 write_size 4194304,write >> 0~4194304] 0.474a01a9 ack+ondisk+write+known_if_redirected e44) v5 >> >> - ->Not sure excatly what is going on here, the op is being enqueued somewhere.. >> >> 2015-10-12 14:13:06.542819 7fb9e2d3a700 10 osd.4 44 dequeue_op >> 0x3052b300 prio 63 cost 4194304 latency 30.017094 >> osd_op(client.6709.0:67 rbd_data.103c74b0dc51.000000000000003a >> [set-alloc-hint object_size 4194304 write_size 4194304,write >> 0~4194304] 0.474a01a9 ack+ondisk+write+known_if_redirected e44) v >> 5 pg pg[0.29( v 44'703 (0'0,44'703] local-les=40 n=641 ec=1 les/c >> 40/44 32/32/10) [4,5,0] r=0 lpr=32 crt=44'700 lcod 44'702 mlcod 44'702 >> active+clean] >> >> - ->The op is dequeued from this mystery queue 30 seconds later in a >> different thread. > > ^^ This is the problem. Everything after this looks reasonable. Looking > at the other dequeue_op calls over this period, it looks like we're just > overwhelmed with higher priority requests. New clients are 63, while > osd_repop (replicated write from another primary) are 127 and replies from > our own replicated ops are 196. We do process a few other prio 63 items, > but you'll see that their latency is also climbing up to 30s over this > period. > > The question is why we suddenly get a lot of them.. 
maybe the peering on > other OSDs just completed so we get a bunch of these? It's also not clear > to me what makes osd.4 or this op special. We expect a mix of primary and > replica ops on all the OSDs, so why would we suddenly have more of them > here.... I think the tracker issue http://tracker.ceph.com/issues/13482 is related to this thread. Does this mean there is effectively a livelock between client ops and repops? We let clients issue more ops than some OSDs can keep up with, so those OSDs become a bottleneck, while other OSDs may still be idle enough to accept even more client ops; eventually all OSDs end up stuck waiting on the bottlenecked one. That seems plausible, but why would it last this long? (A toy sketch of the starvation I have in mind is below, after the quoted logs.) > > sage > > >> >> 2015-10-12 14:13:06.542912 7fb9e2d3a700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'703 (0'0,44'703] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 crt=44'700 lcod 44'702 mlcod 44'702 active+clean] >> do_op osd_op(client.6709.0:67 rbd_data.103c74b0dc51.000000000000003a >> [set-alloc-hint object_size 4194304 write_size 4194304,write >> 0~4194304] 0.474a01a9 ack+ondisk+write+known_if_redirected e44) v5 >> may_write -> write-ordered flags ack+ondisk+write+known_if_redirected >> >> - ->Not sure what this message is. Look up of secondary OSDs? >> >> 2015-10-12 14:13:06.544999 7fb9e2d3a700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'703 (0'0,44'703] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 crt=44'700 lcod 44'702 mlcod 44'702 active+clean] >> new_repop rep_tid 17815 on osd_op(client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) v5 >> >> - ->Dispatch write to secondary OSDs? >> >> 2015-10-12 14:13:06.545116 7fb9e2d3a700 1 -- 192.168.55.16:6801/11295 >> --> 192.168.55.15:6801/32036 -- osd_repop(client.6709.0:67 0.29 >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 v 44'704) v1 >> -- ?+4195078 0x238fd600 con 0x32bcb5a0 >> >> - ->OSD dispatch write to OSD.0. >> >> 2015-10-12 14:13:06.545132 7fb9e2d3a700 20 -- 192.168.55.16:6801/11295 >> submit_message osd_repop(client.6709.0:67 0.29 >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 v 44'704) v1 >> remote, 192.168.55.15:6801/32036, have pipe. >> >> - ->Message sent to OSD.0. >> >> 2015-10-12 14:13:06.545195 7fb9e2d3a700 1 -- 192.168.55.16:6801/11295 >> --> 192.168.55.11:6801/13185 -- osd_repop(client.6709.0:67 0.29 >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 v 44'704) v1 >> -- ?+4195078 0x16edd200 con 0x3a37b20 >> >> - ->OSD dispatch write to OSD.5. >> >> 2015-10-12 14:13:06.545210 7fb9e2d3a700 20 -- 192.168.55.16:6801/11295 >> submit_message osd_repop(client.6709.0:67 0.29 >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 v 44'704) v1 >> remote, 192.168.55.11:6801/13185, have pipe. >> >> - ->Message sent to OSD.5. 
>> >> 2015-10-12 14:13:06.545229 7fb9e2d3a700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'703 (0'0,44'703] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 crt=44'700 lcod 44'702 mlcod 44'702 active+clean] >> append_log log((0'0,44'703], crt=44'700) [44'704 (44'691) modify >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 by >> client.6709.0:67 2015-10-12 14:12:34.340082] >> 2015-10-12 14:13:06.545268 7fb9e2d3a700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 luod=44'703 lua=44'703 crt=44'700 lcod 44'702 mlcod >> 44'702 active+clean] add_log_entry 44'704 (44'691) modify >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 by >> client.6709.0:67 2015-10-12 14:12:34.340082 >> >> - ->These record the OP in the journal log? >> >> 2015-10-12 14:13:06.563241 7fb9d326e700 20 -- 192.168.55.16:6801/11295 >> >> 192.168.55.11:6801/13185 pipe(0x2d355000 sd=98 :6801 s=2 pgs=12 >> cs=3 l=0 c=0x3a37b20).writer encoding 17337 features 37154696925806591 >> 0x16edd200 osd_repop(client.6709.0:67 0.29 >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 v 44'704) v1 >> >> - ->Writing the data to OSD.5? >> >> 2015-10-12 14:13:06.573938 7fb9d3874700 10 -- 192.168.55.16:6801/11295 >> >> 192.168.55.15:6801/32036 pipe(0x3f96000 sd=176 :6801 s=2 pgs=8 cs=3 >> l=0 c=0x32bcb5a0).reader got ack seq 1206 >= 1206 on 0x238fd600 >> osd_repop(client.6709.0:67 0.29 >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 v 44'704) v1 >> >> - ->Messenger gets ACK from OSD.0 that it reveiced that last packet? >> >> 2015-10-12 14:13:06.613425 7fb9d3874700 10 -- 192.168.55.16:6801/11295 >> >> 192.168.55.15:6801/32036 pipe(0x3f96000 sd=176 :6801 s=2 pgs=8 cs=3 >> l=0 c=0x32bcb5a0).reader got message 1146 0x3ffa480 >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 >> >> - ->Messenger receives ack on disk from OSD.0. >> >> 2015-10-12 14:13:06.613447 7fb9d3874700 1 -- 192.168.55.16:6801/11295 >> <== osd.0 192.168.55.15:6801/32036 1146 ==== >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 ==== >> 83+0+0 (2772408781 0 0) 0x3ffa480 con 0x32bcb5a0 >> >> - ->OSD process gets on disk ACK from OSD.0. >> >> 2015-10-12 14:13:06.613478 7fb9d3874700 10 osd.4 44 handle_replica_op >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 epoch 44 >> >> - ->Primary OSD records the ACK (duplicate message?). Not sure how to >> correlate that to the previous message other than by time. >> >> 2015-10-12 14:13:06.613504 7fb9d3874700 15 osd.4 44 enqueue_op >> 0x120f9b00 prio 196 cost 0 latency 0.000250 >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 >> >> - ->The reply is enqueued onto a mystery queue. >> >> 2015-10-12 14:13:06.627793 7fb9d6afd700 10 -- 192.168.55.16:6801/11295 >> >> 192.168.55.11:6801/13185 pipe(0x2d355000 sd=98 :6801 s=2 pgs=12 >> cs=3 l=0 c=0x3a37b20).reader got ack seq 17337 >= 17337 on 0x16edd200 >> osd_repop(client.6709.0:67 0.29 >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 v 44'704) v1 >> >> - ->Messenger gets ACK from OSD.5 that it reveiced that last packet? >> >> 2015-10-12 14:13:06.628364 7fb9d6afd700 10 -- 192.168.55.16:6801/11295 >> >> 192.168.55.11:6801/13185 pipe(0x2d355000 sd=98 :6801 s=2 pgs=12 >> cs=3 l=0 c=0x3a37b20).reader got message 16477 0x21cef3c0 >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 >> >> - ->Messenger receives ack on disk from OSD.5. 
>> >> 2015-10-12 14:13:06.628382 7fb9d6afd700 1 -- 192.168.55.16:6801/11295 >> <== osd.5 192.168.55.11:6801/13185 16477 ==== >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 ==== >> 83+0+0 (2104182993 0 0) 0x21cef3c0 con 0x3a37b20 >> >> - ->OSD process gets on disk ACK from OSD.5. >> >> 2015-10-12 14:13:06.628406 7fb9d6afd700 10 osd.4 44 handle_replica_op >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 epoch 44 >> >> - ->Primary OSD records the ACK (duplicate message?). Not sure how to >> correlate that to the previous message other than by time. >> >> 2015-10-12 14:13:06.628426 7fb9d6afd700 15 osd.4 44 enqueue_op >> 0x3e41600 prio 196 cost 0 latency 0.000180 >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 >> >> - ->The reply is enqueued onto a mystery queue. >> >> 2015-10-12 14:13:07.124206 7fb9f4e9f700 0 log_channel(cluster) log >> [WRN] : slow request 30.598371 seconds old, received at 2015-10-12 >> 14:12:36.525724: osd_op(client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) currently waiting for subops >> from 0,5 >> >> - ->OP has not been dequeued to the client from the mystery queue yet. >> >> 2015-10-12 14:13:07.278449 7fb9e2d3a700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 luod=44'703 lua=44'703 crt=44'702 lcod 44'702 mlcod >> 44'702 active+clean] eval_repop repgather(0x37ea3cc0 44'704 >> rep_tid=17815 committed?=0 applied?=0 lock=0 >> op=osd_op(client.6709.0:67 rbd_data.103c74b0dc51.000000000000003a >> [set-alloc-hint object_size 4194304 write_size 4194304,write >> 0~4194304] 0.474a01a9 ack+ondisk+write+known_if_redirected e44) v5) >> wants=ad >> >> - ->Not sure what this means. The OP has been completed on all replicas? >> >> 2015-10-12 14:13:07.278566 7fb9e0535700 10 osd.4 44 dequeue_op >> 0x120f9b00 prio 196 cost 0 latency 0.665312 >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 pg >> pg[0.29( v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 >> 32/32/10) [4,5,0] r=0 lpr=32 luod=44'703 lua=44'703 crt=44'702 lcod >> 44'702 mlcod 44'702 active+clean] >> >> - ->One of the replica OPs is dequeued in a different thread >> >> 2015-10-12 14:13:07.278809 7fb9e0535700 10 osd.4 44 dequeue_op >> 0x3e41600 prio 196 cost 0 latency 0.650563 >> osd_repop_reply(client.6709.0:67 0.29 ondisk, result = 0) v1 pg >> pg[0.29( v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 >> 32/32/10) [4,5,0] r=0 lpr=32 luod=44'703 lua=44'703 crt=44'702 lcod >> 44'702 mlcod 44'702 active+clean] >> >> - ->The other replica OP is dequeued in the new thread >> >> 2015-10-12 14:13:07.967469 7fb9efe95700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 lua=44'703 crt=44'702 lcod 44'703 mlcod 44'702 >> active+clean] eval_repop repgather(0x37ea3cc0 44'704 rep_tid=17815 >> committed?=1 applied?=0 lock=0 op=osd_op(client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) v5) wants=ad >> >> - ->Not sure what this does. A thread that joins the replica OPs with >> the primary OP? 
>> >> 2015-10-12 14:13:07.967515 7fb9efe95700 15 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 lua=44'703 crt=44'702 lcod 44'703 mlcod 44'702 >> active+clean] log_op_stats osd_op(client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) v5 inb 4194304 outb 0 rlat >> 0.000000 lat 31.441789 >> >> - ->Logs that the write has been committed to all replicas in the >> primary journal? >> >> Not sure what the rest of these do, nor do I understand where the >> client gets an ACK that the write is committed. >> >> 2015-10-12 14:13:07.967583 7fb9efe95700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 lua=44'703 crt=44'702 lcod 44'703 mlcod 44'702 >> active+clean] sending commit on repgather(0x37ea3cc0 44'704 >> rep_tid=17815 committed?=1 applied?=0 lock=0 >> op=osd_op(client.6709.0:67 rbd_data.103c74b0dc51.000000000000003a >> [set-alloc-hint object_size 4194304 write_size 4194304,write >> 0~4194304] 0.474a01a9 ack+ondisk+write+known_if_redirected e44) v5) >> 0x3a2f0840 >> >> 2015-10-12 14:13:10.351452 7fb9f0696700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 crt=44'702 lcod 44'703 mlcod 44'702 active+clean] >> eval_repop repgather(0x37ea3cc0 44'704 rep_tid=17815 committed?=1 >> applied?=1 lock=0 op[0/1943]client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) v5) wants=ad >> >> 2015-10-12 14:13:10.354089 7fb9f0696700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 crt=44'702 lcod 44'703 mlcod 44'703 active+clean] >> removing repgather(0x37ea3cc0 44'704 rep_tid=17815 committed?=1 >> applied?=1 lock=0 op=osd_op(client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) v5) >> >> 2015-10-12 14:13:10.354163 7fb9f0696700 20 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 crt=44'702 lcod 44'703 mlcod 44'703 active+clean] >> q front is repgather(0x37ea3cc0 44'704 rep_tid=17815 committed?=1 >> applied?=1 lock=0 op=osd_op(client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) v5) >> >> 2015-10-12 14:13:10.354199 7fb9f0696700 20 osd.4 pg_epoch: 44 pg[0.29( >> v 44'704 (0'0,44'704] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 crt=44'702 lcod 44'703 mlcod 44'703 active+clean] >> remove_repop repgather(0x37ea3cc0 44'704 rep_tid=17815 committed?=1 >> applied?=1 lock=0 op=osd_op(client.6709.0:67 >> rbd_data.103c74b0dc51.000000000000003a [set-alloc-hint object_size >> 4194304 write_size 4194304,write 0~4194304] 0.474a01a9 >> ack+ondisk+write+known_if_redirected e44) v5) >> >> 2015-10-12 14:13:15.488448 7fb9e2d3a700 10 osd.4 pg_epoch: 44 pg[0.29( >> v 44'707 (0'0,44'707] local-les=40 n=641 ec=1 les/c 40/44 32/32/10) >> [4,5,0] r=0 lpr=32 luod=44'705 lua=44'705 crt=44'704 lcod 44'704 mlcod >> 44'704 
active+clean] append_log: trimming to 44'704 entries 44'704 >> (44'691) modify >> 474a01a9/rbd_data.103c74b0dc51.000000000000003a/head//0 by >> client.6709.0:67 2015-10-12 14:12:34.340082 >> >> Thanks for hanging in there with me on this... >> >> [1] http://www.spinics.net/lists/ceph-devel/msg26633.html >> -----BEGIN PGP SIGNATURE----- >> Version: Mailvelope v1.2.0 >> Comment: https://www.mailvelope.com >> >> wsFcBAEBCAAQBQJWHCx0CRDmVDuy+mK58QAAXf8P/j6MD52r2DLqOP9hKFAP >> MJUktg8uqK1i8awtuIQhJHAPDZQF8EACOXg6RBuOz75iryCFKAJXk5exLXrE >> pIZqY/0/JCsUEPuQGaMY9GVQNrTeB82F5VIu572i2xeFir4fUEcvllXSeR9O >> CxSgaAncxUYGSXwsiCJ28QhwPCFXtCLACg1eTpghhAcOwY0t+z6ZB3vh+WxB >> B8kRCdee78TVZOgeTnd66aBJUrr21Ir9aPqSm73uY561dyDmyxc4zPq+FDsJ >> kuac+Ky9Lc6rqhxwRptbdx5i/EDzxj96EKEz2v4SFBmvzU8jtZlA8THJ6WlF >> 6lZRpRIMfEqVu4neFcdUIct8+Brf7fuxOI7hbhUL5xq2I6yDSY8E2T8ImRoS >> w8bSrjFV3wmnXSCHnFJPROqdhtlQlH1PkKPBRJeJrkrB1MloX0ybU4hNIr7Q >> 4ZyzeLpD9sgL1vEfUVuCksgiVJhzlFOyqeRHcfpPEnLxyGL/+mLUa5lQ5m5l >> m286ZnsMZGMzAdSA/tsqnTFzL0HbjkiWD/OMU5zThSKW2tZBNWg3xZE5Yia9 >> zAbhxpvxqhKQ7nfmv3xeVJ1GKb9CuzfN9ZIGPltHvpA3rZf3I4+XVlWbbhDZ >> z8Xp8Pw8f7neh89Tv3AT+krM1jrE1ZxOF5A2K4CxBcS3OEMc5UIZ2fy4dHSo >> 0iTE >> =t7nL >> -----END PGP SIGNATURE----- >> ---------------- >> Robert LeBlanc >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >> >> On Thu, Oct 8, 2015 at 11:44 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote: >> > -----BEGIN PGP SIGNED MESSAGE----- >> > Hash: SHA256 >> > >> > Sage, >> > >> > After trying to bisect this issue (all test moved the bisect towards >> > Infernalis) and eventually testing the Infernalis branch again, it >> > looks like the problem still exists although it is handled a tad >> > better in Infernalis. I'm going to test against Firefly/Giant next >> > week and then try and dive into the code to see if I can expose any >> > thing. >> > >> > If I can do anything to provide you with information, please let me know. >> > >> > Thanks, >> > -----BEGIN PGP SIGNATURE----- >> > Version: Mailvelope v1.2.0 >> > Comment: https://www.mailvelope.com >> > >> > wsFcBAEBCAAQBQJWF1QlCRDmVDuy+mK58QAAWLgP/2l+TkcpeKihDxF8h/kw >> > YFffNWODNfOMq8FVDQkQceo2mFCFc29JnBYiAeqW+XPelwuU5S86LG998aUB >> > BvIU4EHaJNJ31X1NCIA7nwi8rXlFYfSG2qQn58+IzqZoWCQM5vD/THISV1rP >> > qQKtoOAEuRxz+vOAJGI1A1xJSOiFwTRjs4LjE1zYjSP26LdEF61D/lb+AVzV >> > ufxi/ci6mAla/4VTAH4VqEviDgC8AbAZnWFGfUPcTUxJQS99kFrfjJnWvgyF >> > V9EmWtQCvhRO74hQLBqspOwdAxEJesPfGcJT1LjR0eEAMWvbGPtaqbSFAEWa >> > jjyy5wP9+4NnGLdhba6UBtLphjqTcl0e2vVwRj0zLhI14moAOlbhIKmZ1Dt+ >> > 1P6vfgOUGvO76xgDMwrVKRoQgWJO/0Tup9+oqInnNYgf4W+ZWsLgLgo7ETAF >> > VcI7LP1wkwAI3lz5YphY/TnKNGs6i+wVjKBamOt3R1yz9WeylaG0T6xgGHrs >> > VugrRSUuO+ND9+mE5EsUgITCZoaavXJESJMb30XkK6hYGB+T/q+hBafc6Wle >> > Jgs+aT2m1erdSyZn0ZC9a6CjWmwJXY6FCSGhE53BbefBxmCFxn+8tVav+Q8W >> > 7s14TntP6ex4ca7eTwGuSXC9FU5fAVa+3+3aXDAC1QPAkeVkXyB716W1XG6b >> > BCFo >> > =GJL4 >> > -----END PGP SIGNATURE----- >> > ---------------- >> > Robert LeBlanc >> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> > >> > >> > On Wed, Oct 7, 2015 at 1:25 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote: >> >> -----BEGIN PGP SIGNED MESSAGE----- >> >> Hash: SHA256 >> >> >> >> We forgot to upload the ceph.log yesterday. It is there now. 
>> >> - ---------------- >> >> Robert LeBlanc >> >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >> >> >> >> >> On Tue, Oct 6, 2015 at 5:40 PM, Robert LeBlanc wrote: >> >>> -----BEGIN PGP SIGNED MESSAGE----- >> >>> Hash: SHA256 >> >>> >> >>> I upped the debug on about everything and ran the test for about 40 >> >>> minutes. I took OSD.19 on ceph1 doen and then brought it back in. >> >>> There was at least one op on osd.19 that was blocked for over 1,000 >> >>> seconds. Hopefully this will have something that will cast a light on >> >>> what is going on. >> >>> >> >>> We are going to upgrade this cluster to Infernalis tomorrow and rerun >> >>> the test to verify the results from the dev cluster. This cluster >> >>> matches the hardware of our production cluster but is not yet in >> >>> production so we can safely wipe it to downgrade back to Hammer. >> >>> >> >>> Logs are located at http://dev.v3trae.net/~jlavoy/ceph/logs/ >> >>> >> >>> Let me know what else we can do to help. >> >>> >> >>> Thanks, >> >>> -----BEGIN PGP SIGNATURE----- >> >>> Version: Mailvelope v1.2.0 >> >>> Comment: https://www.mailvelope.com >> >>> >> >>> wsFcBAEBCAAQBQJWFFwACRDmVDuy+mK58QAAs/UP/1L+y7DEfHqD/5OpkiNQ >> >>> xuEEDm7fNJK58tLRmKsCrDrsFUvWCjiqUwboPg/E40e2GN7Lt+VkhMUEUWoo >> >>> e3L20ig04c8Zu6fE/SXX3lnvayxsWTPcMnYI+HsmIV9E/efDLVLEf6T4fvXg >> >>> 5dKLiqQ8Apu+UMVfd1+aKKDdLdnYlgBCZcIV9AQe1GB8X2VJJhmNWh6TQ3Xr >> >>> gNXDexBdYjFBLu84FXOITd3ZtyUkgx/exCUMmwsJSc90jduzipS5hArvf7LN >> >>> HD6m1gBkZNbfWfc/4nzqOQnKdY1pd9jyoiQM70jn0R5b2BlZT0wLjiAJm+07 >> >>> eCCQ99TZHFyeu1LyovakrYncXcnPtP5TfBFZW952FWQugupvxPCcaduz+GJV >> >>> OhPAJ9dv90qbbGCO+8kpTMAD1aHgt/7+0/hKZTg8WMHhua68SFCXmdGAmqje >> >>> IkIKswIAX4/uIoo5mK4TYB5HdEMJf9DzBFd+1RzzfRrrRalVkBfsu5ChFTx3 >> >>> mu5LAMwKTslvILMxAct0JwnwkOX5Gd+OFvmBRdm16UpDaDTQT2DfykylcmJd >> >>> Cf9rPZxUv0ZHtZyTTyP2e6vgrc7UM/Ie5KonABxQ11mGtT8ysra3c9kMhYpw >> >>> D6hcAZGtdvpiBRXBC5gORfiFWFxwu5kQ+daUhgUIe/O/EWyeD0rirZoqlLnZ >> >>> EDrG >> >>> =BZVw >> >>> -----END PGP SIGNATURE----- >> >>> ---------------- >> >>> Robert LeBlanc >> >>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>> >> >>> >> >>> On Tue, Oct 6, 2015 at 2:36 PM, Robert LeBlanc wrote: >> >>>> -----BEGIN PGP SIGNED MESSAGE----- >> >>>> Hash: SHA256 >> >>>> >> >>>> On my second test (a much longer one), it took nearly an hour, but a >> >>>> few messages have popped up over a 20 window. Still far less than I >> >>>> have been seeing. >> >>>> - ---------------- >> >>>> Robert LeBlanc >> >>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>> >> >>>> >> >>>> On Tue, Oct 6, 2015 at 2:00 PM, Robert LeBlanc wrote: >> >>>>> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>> Hash: SHA256 >> >>>>> >> >>>>> I'll capture another set of logs. Is there any other debugging you >> >>>>> want turned up? I've seen the same thing where I see the message >> >>>>> dispatched to the secondary OSD, but the message just doesn't show up >> >>>>> for 30+ seconds in the secondary OSD logs. >> >>>>> - ---------------- >> >>>>> Robert LeBlanc >> >>>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>> >> >>>>> >> >>>>> On Tue, Oct 6, 2015 at 1:34 PM, Sage Weil wrote: >> >>>>>> On Tue, 6 Oct 2015, Robert LeBlanc wrote: >> >>>>>>> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> Hash: SHA256 >> >>>>>>> >> >>>>>>> I can't think of anything. In my dev cluster the only thing that has >> >>>>>>> changed is the Ceph versions (no reboot). 
What I like is even though >> >>>>>>> the disks are 100% utilized, it is preforming as I expect now. Client >> >>>>>>> I/O is slightly degraded during the recovery, but no blocked I/O when >> >>>>>>> the OSD boots or during the recovery period. This is with >> >>>>>>> max_backfills set to 20, one backfill max in our production cluster is >> >>>>>>> painful on OSD boot/recovery. I was able to reproduce this issue on >> >>>>>>> our dev cluster very easily and very quickly with these settings. So >> >>>>>>> far two tests and an hour later, only the blocked I/O when the OSD is >> >>>>>>> marked out. We would love to see that go away too, but this is far >> >>>>>> (me too!) >> >>>>>>> better than what we have now. This dev cluster also has >> >>>>>>> osd_client_message_cap set to default (100). >> >>>>>>> >> >>>>>>> We need to stay on the Hammer version of Ceph and I'm willing to take >> >>>>>>> the time to bisect this. If this is not a problem in Firefly/Giant, >> >>>>>>> you you prefer a bisect to find the introduction of the problem >> >>>>>>> (Firefly/Giant -> Hammer) or the introduction of the resolution >> >>>>>>> (Hammer -> Infernalis)? Do you have some hints to reduce hitting a >> >>>>>>> commit that prevents a clean build as that is my most limiting factor? >> >>>>>> >> >>>>>> Nothing comes to mind. I think the best way to find this is still to see >> >>>>>> it happen in the logs with hammer. The frustrating thing with that log >> >>>>>> dump you sent is that although I see plenty of slow request warnings in >> >>>>>> the osd logs, I don't see the requests arriving. Maybe the logs weren't >> >>>>>> turned up for long enough? >> >>>>>> >> >>>>>> sage >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>>> Thanks, >> >>>>>>> - ---------------- >> >>>>>>> Robert LeBlanc >> >>>>>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >>>>>>> >> >>>>>>> On Tue, Oct 6, 2015 at 12:32 PM, Sage Weil wrote: >> >>>>>>> > On Tue, 6 Oct 2015, Robert LeBlanc wrote: >> >>>>>>> >> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> Hash: SHA256 >> >>>>>>> >> >> >>>>>>> >> OK, an interesting point. Running ceph version 9.0.3-2036-g4f54a0d >> >>>>>>> >> (4f54a0dd7c4a5c8bdc788c8b7f58048b2a28b9be) looks a lot better. 
I got >> >>>>>>> >> messages when the OSD was marked out: >> >>>>>>> >> >> >>>>>>> >> 2015-10-06 11:52:46.961040 osd.13 192.168.55.12:6800/20870 81 : >> >>>>>>> >> cluster [WRN] 17 slow requests, 3 included below; oldest blocked for > >> >>>>>>> >> 34.476006 secs >> >>>>>>> >> 2015-10-06 11:52:46.961056 osd.13 192.168.55.12:6800/20870 82 : >> >>>>>>> >> cluster [WRN] slow request 32.913474 seconds old, received at >> >>>>>>> >> 2015-10-06 11:52:14.047475: osd_op(client.600962.0:474 >> >>>>>>> >> rbd_data.338102ae8944a.0000000000005270 [read 3302912~4096] 8.c74a4538 >> >>>>>>> >> ack+read+known_if_redirected e58744) currently waiting for peered >> >>>>>>> >> 2015-10-06 11:52:46.961066 osd.13 192.168.55.12:6800/20870 83 : >> >>>>>>> >> cluster [WRN] slow request 32.697545 seconds old, received at >> >>>>>>> >> 2015-10-06 11:52:14.263403: osd_op(client.600960.0:583 >> >>>>>>> >> rbd_data.3380f74b0dc51.000000000001ee75 [read 1016832~4096] 8.778d1be3 >> >>>>>>> >> ack+read+known_if_redirected e58744) currently waiting for peered >> >>>>>>> >> 2015-10-06 11:52:46.961074 osd.13 192.168.55.12:6800/20870 84 : >> >>>>>>> >> cluster [WRN] slow request 32.668006 seconds old, received at >> >>>>>>> >> 2015-10-06 11:52:14.292942: osd_op(client.600955.0:571 >> >>>>>>> >> rbd_data.3380f74b0dc51.0000000000019b09 [read 1034240~4096] 8.e87a6f58 >> >>>>>>> >> ack+read+known_if_redirected e58744) currently waiting for peered >> >>>>>>> >> >> >>>>>>> >> But I'm not seeing the blocked messages when the OSD came back in. The >> >>>>>>> >> OSD spindles have been running at 100% during this test. I have seen >> >>>>>>> >> slowed I/O from the clients as expected from the extra load, but so >> >>>>>>> >> far no blocked messages. I'm going to run some more tests. >> >>>>>>> > >> >>>>>>> > Good to hear. >> >>>>>>> > >> >>>>>>> > FWIW I looked through the logs and all of the slow request no flag point >> >>>>>>> > messages came from osd.163... and the logs don't show when they arrived. >> >>>>>>> > My guess is this OSD has a slower disk than the others, or something else >> >>>>>>> > funny is going on? >> >>>>>>> > >> >>>>>>> > I spot checked another OSD at random (60) where I saw a slow request. It >> >>>>>>> > was stuck peering for 10s of seconds... waiting on a pg log message from >> >>>>>>> > osd.163. 
>> >>>>>>> > >> >>>>>>> > sage >> >>>>>>> > >> >>>>>>> > >> >>>>>>> >> >> >>>>>>> >> -----BEGIN PGP SIGNATURE----- >> >>>>>>> >> Version: Mailvelope v1.2.0 >> >>>>>>> >> Comment: https://www.mailvelope.com >> >>>>>>> >> >> >>>>>>> >> wsFcBAEBCAAQBQJWFAzRCRDmVDuy+mK58QAASRYP/jrbKy5mptq/cSqJvB47 >> >>>>>>> >> F/gEatsqU4/TwyIJg137DQTkONbHKnLgCZqsJLnCZRH8fFqtvY6g/Q/AA7Ks >> >>>>>>> >> ouo5gvbjKM7pOm/uUn8kU44Xe15f/bkVHvWBECZzg8YJwinPAisp5R0m1HBC >> >>>>>>> >> HLvsbeqV00m72TyfsZX4aj7lHdyvcdcIH2EVgX/db092VVXczK4q2gRoNr0Y >> >>>>>>> >> 77BEr2Y/gPj5LM4b/aDG5AWY8dJZRlNz+B1CyLS+kIDXSaAbzul2UbAG6jNE >> >>>>>>> >> KJEVxndMPfHLIdwg55+q8VTMIjqXcCM47cQhWFrKChgVD8byJxpc6E0TqOxs >> >>>>>>> >> 1gtNE8AILoCSYKnwQZan+TBDGxki7rQxzMdNI+NLfhy1Mwd3lSCPsDtD7W/i >> >>>>>>> >> tzNTr6aGz+wr+OPDQV5zrzLaPZYF3FLWN4n6RYNfnDramYzD76v+7kjdW4dE >> >>>>>>> >> 5UVCtE7KGLCZ21fu6sln1b9q6lYXNtohAmAunIdqpo3FmHusRySyZzYKu1+9 >> >>>>>>> >> zg/LHiArD/ddjkPxVWCTFBS17g/bESRcv2MsA30GS8J6k1zlQaLX5KeGg6Ql >> >>>>>>> >> WJSmW8gFfEbXj/7JTrVtQWTdgjsegaySFnDisTWUR/hEM/NuKii4xfjI32M/ >> >>>>>>> >> luUMXHZ8lTHk9C8MfZcpyPGvwp2FliD9LqaWOVPWtWZJcerEWcZVlEApg4qb >> >>>>>>> >> fo5a >> >>>>>>> >> =ahEi >> >>>>>>> >> -----END PGP SIGNATURE----- >> >>>>>>> >> ---------------- >> >>>>>>> >> Robert LeBlanc >> >>>>>>> >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> >>>>>>> >> >> >>>>>>> >> On Tue, Oct 6, 2015 at 6:37 AM, Sage Weil wrote: >> >>>>>>> >> > On Mon, 5 Oct 2015, Robert LeBlanc wrote: >> >>>>>>> >> >> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> >> Hash: SHA256 >> >>>>>>> >> >> >> >>>>>>> >> >> With some off-list help, we have adjusted >> >>>>>>> >> >> osd_client_message_cap=10000. This seems to have helped a bit and we >> >>>>>>> >> >> have seen some OSDs have a value up to 4,000 for client messages. But >> >>>>>>> >> >> it does not solve the problem with the blocked I/O. >> >>>>>>> >> >> >> >>>>>>> >> >> One thing that I have noticed is that almost exactly 30 seconds elapse >> >>>>>>> >> >> between an OSD boots and the first blocked I/O message. I don't know >> >>>>>>> >> >> if the OSD doesn't have time to get it's brain right about a PG before >> >>>>>>> >> >> it starts servicing it or what exactly. >> >>>>>>> >> > >> >>>>>>> >> > I'm downloading the logs from yesterday now; sorry it's taking so long. >> >>>>>>> >> > >> >>>>>>> >> >> On another note, I tried upgrading our CentOS dev cluster from Hammer >> >>>>>>> >> >> to master and things didn't go so well. The OSDs would not start >> >>>>>>> >> >> because /var/lib/ceph was not owned by ceph. I chowned the directory >> >>>>>>> >> >> and all OSDs and the OSD then started, but never became active in the >> >>>>>>> >> >> cluster. It just sat there after reading all the PGs. There were >> >>>>>>> >> >> sockets open to the monitor, but no OSD to OSD sockets. I tried >> >>>>>>> >> >> downgrading to the Infernalis branch and still no luck getting the >> >>>>>>> >> >> OSDs to come up. The OSD processes were idle after the initial boot. >> >>>>>>> >> >> All packages were installed from gitbuilder. >> >>>>>>> >> > >> >>>>>>> >> > Did you chown -R ? >> >>>>>>> >> > >> >>>>>>> >> > https://github.com/ceph/ceph/blob/infernalis/doc/release-notes.rst#upgrading-from-hammer >> >>>>>>> >> > >> >>>>>>> >> > My guess is you only chowned the root dir, and the OSD didn't throw >> >>>>>>> >> > an error when it encountered the other files? If you can generate a debug >> >>>>>>> >> > osd = 20 log, that would be helpful.. thanks! 
>> >>>>>>> >> > >> >>>>>>> >> > sage >> >>>>>>> >> > >> >>>>>>> >> > >> >>>>>>> >> >> >> >>>>>>> >> >> Thanks, >> >>>>>>> >> >> -----BEGIN PGP SIGNATURE----- >> >>>>>>> >> >> Version: Mailvelope v1.2.0 >> >>>>>>> >> >> Comment: https://www.mailvelope.com >> >>>>>>> >> >> >> >>>>>>> >> >> wsFcBAEBCAAQBQJWE0F5CRDmVDuy+mK58QAAaCYQAJuFcCvRUJ46k0rYrMcc >> >>>>>>> >> >> YlrSrGwS57GJS/JjaFHsvBV7KTobEMNeMkSv4PTGpwylNV9Dx4Ad74DDqX4g >> >>>>>>> >> >> 6hZDe0rE+uEI7tW9Lqp+MN7eaU2lDuwLt/pOzZI14jTskUYTlNi3HjlN67mQ >> >>>>>>> >> >> aiX1rbrJL6FFkuMOn/YqHpMbxI5ZOUZc1s7RDhASOPIs4z/CxpDfluW6fZA/ >> >>>>>>> >> >> y8C+pW6zzS9U/6jZwtGhBq4dvDBO41Lxb9WOehD8Aa/Qt6XNDzGw2KEkEkw7 >> >>>>>>> >> >> 8dBc7UFa2Wx3Tnzy238a/nKhtz6O6OrHsroA+HGWwCoxPWjOsz/xOoOmfwp+ >> >>>>>>> >> >> ALkY3id+t2uJEqzbL8/MgJ2RV1A+AZ7W1VWIJUOkDz0wR+KxQsxduHoD6rQy >> >>>>>>> >> >> zg0fj2KSAlmVusYOPM1s1+jBsqNF3wcNxpbRoVuFqk0xMgGPrIdUNdZHg6bs >> >>>>>>> >> >> D5sfkjNKexFe0ifFJ0cfv6UaGIKv4dK2eq3jUKgXHfh/qZmJbEB+zHaqJNyg >> >>>>>>> >> >> CN6w6xu1FHLeVobKAWe5ZzKY5lxw6b8YG+ce/E2dvW73gSASPTvtv68gaT04 >> >>>>>>> >> >> 2SPF9Ql0fERL5EDY9Pc4MHpQVcS0XxxJA69CgnWgaG6fzq2eY7fALeMBVWlB >> >>>>>>> >> >> fRj3zQwqJls/X8JZ3c4P4G0R6DP9bmMwGr++oYc3gWGrvgzxw3N7+ornd0jd >> >>>>>>> >> >> GdXC >> >>>>>>> >> >> =Aigq >> >>>>>>> >> >> -----END PGP SIGNATURE----- >> >>>>>>> >> >> ---------------- >> >>>>>>> >> >> Robert LeBlanc >> >>>>>>> >> >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> >> >>>>>>> >> >> >> >>>>>>> >> >> On Sun, Oct 4, 2015 at 3:04 PM, Robert LeBlanc wrote: >> >>>>>>> >> >> > -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> >> > Hash: SHA256 >> >>>>>>> >> >> > >> >>>>>>> >> >> > I have eight nodes running the fio job rbd_test_real to different RBD >> >>>>>>> >> >> > volumes. I've included the CRUSH map in the tarball. >> >>>>>>> >> >> > >> >>>>>>> >> >> > I stopped one OSD process and marked it out. I let it recover for a >> >>>>>>> >> >> > few minutes and then I started the process again and marked it in. I >> >>>>>>> >> >> > started getting block I/O messages during the recovery. 
>> >>>>>>> >> >> > >> >>>>>>> >> >> > The logs are located at http://162.144.87.113/files/ushou1.tar.xz >> >>>>>>> >> >> > >> >>>>>>> >> >> > Thanks, >> >>>>>>> >> >> > -----BEGIN PGP SIGNATURE----- >> >>>>>>> >> >> > Version: Mailvelope v1.2.0 >> >>>>>>> >> >> > Comment: https://www.mailvelope.com >> >>>>>>> >> >> > >> >>>>>>> >> >> > wsFcBAEBCAAQBQJWEZRcCRDmVDuy+mK58QAALbEQAK5pFiixJarUdLm50zp/ >> >>>>>>> >> >> > 3AGgGBPrieExKmoZZLCoMGfOLfxZDbN2ybtopKDQDfrTqndE/6Xi9UXqTOdW >> >>>>>>> >> >> > jDc9U1wusgG0CKPsY1SMYnB9akvaDwtdh5q5k4VpN2zsG9R6lRojHeNQR3Nf >> >>>>>>> >> >> > 56QevJL4/e5lC3sLhVnxXXi2XKnHCVOHT+PYgNour2ZWt6OTLoFFxuSU3zLN >> >>>>>>> >> >> > OtfXgrFiiNF0mrDpm0gg2l8a8N5SwP9mM233S2U/JiGAqsqoqkfd0okjDenC >> >>>>>>> >> >> > ksesU/n7zordFpfLN3yjL6+X9pQ4YA6otZrq4wWtjWKO/H0b+6iIsf/AE131 >> >>>>>>> >> >> > R6a4Vufndpd3Ce+FNfM+iu3FmKk0KVfDAaF/tIP6S6XUzGVMAbpvpmqNL17o >> >>>>>>> >> >> > boh3wPZEyK+7KiF4Qlt2KoI/FV24Yj8XiyMnKin3MbMYbammb4ER977VH7iI >> >>>>>>> >> >> > sZyelNPSsYmmw/MF+AkA5KVgzQ4DAPflaejIgC5uw3dYKrn2AQE5CE9nN8Gz >> >>>>>>> >> >> > GVVaGItu1Bvrz21QoT9o5v0dZ85zttFvtrKIYgSi4mdpC6XkzUbg9s9EB1/T >> >>>>>>> >> >> > SEY+fau7W7TtiLpzCAIQ3zDvgsvkx2P6tKg5U8e93LVv9B+YI8i8mUxxv1j5 >> >>>>>>> >> >> > PHFi7KTgRUPm1FPMJDSyzvOgqyMj9AzaESl1Na6k529ILFIcyfko0niTT1oZ >> >>>>>>> >> >> > 3EPx >> >>>>>>> >> >> > =UDIV >> >>>>>>> >> >> > -----END PGP SIGNATURE----- >> >>>>>>> >> >> > >> >>>>>>> >> >> > ---------------- >> >>>>>>> >> >> > Robert LeBlanc >> >>>>>>> >> >> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> > >> >>>>>>> >> >> > >> >>>>>>> >> >> > On Sun, Oct 4, 2015 at 7:48 AM, Sage Weil wrote: >> >>>>>>> >> >> >> On Sat, 3 Oct 2015, Robert LeBlanc wrote: >> >>>>>>> >> >> >>> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> >> >>> Hash: SHA256 >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> We are still struggling with this and have tried a lot of different >> >>>>>>> >> >> >>> things. Unfortunately, Inktank (now Red Hat) no longer provides >> >>>>>>> >> >> >>> consulting services for non-Red Hat systems. If there are some >> >>>>>>> >> >> >>> certified Ceph consultants in the US that we can do both remote and >> >>>>>>> >> >> >>> on-site engagements, please let us know. >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> This certainly seems to be network related, but somewhere in the >> >>>>>>> >> >> >>> kernel. We have tried increasing the network and TCP buffers, number >> >>>>>>> >> >> >>> of TCP sockets, reduced the FIN_WAIT2 state. There is about 25% idle >> >>>>>>> >> >> >>> on the boxes, the disks are busy, but not constantly at 100% (they >> >>>>>>> >> >> >>> cycle from <10% up to 100%, but not 100% for more than a few seconds >> >>>>>>> >> >> >>> at a time). There seems to be no reasonable explanation why I/O is >> >>>>>>> >> >> >>> blocked pretty frequently longer than 30 seconds. We have verified >> >>>>>>> >> >> >>> Jumbo frames by pinging from/to each node with 9000 byte packets. The >> >>>>>>> >> >> >>> network admins have verified that packets are not being dropped in the >> >>>>>>> >> >> >>> switches for these nodes. We have tried different kernels including >> >>>>>>> >> >> >>> the recent Google patch to cubic. This is showing up on three cluster >> >>>>>>> >> >> >>> (two Ethernet and one IPoIB). I booted one cluster into Debian Jessie >> >>>>>>> >> >> >>> (from CentOS 7.1) with similar results. 
>> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> The messages seem slightly different: >> >>>>>>> >> >> >>> 2015-10-03 14:38:23.193082 osd.134 10.208.16.25:6800/1425 439 : >> >>>>>>> >> >> >>> cluster [WRN] 14 slow requests, 1 included below; oldest blocked for > >> >>>>>>> >> >> >>> 100.087155 secs >> >>>>>>> >> >> >>> 2015-10-03 14:38:23.193090 osd.134 10.208.16.25:6800/1425 440 : >> >>>>>>> >> >> >>> cluster [WRN] slow request 30.041999 seconds old, received at >> >>>>>>> >> >> >>> 2015-10-03 14:37:53.151014: osd_op(client.1328605.0:7082862 >> >>>>>>> >> >> >>> rbd_data.13fdcb2ae8944a.000000000001264f [read 975360~4096] >> >>>>>>> >> >> >>> 11.6d19c36f ack+read+known_if_redirected e10249) currently no flag >> >>>>>>> >> >> >>> points reached >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> I don't know what "no flag points reached" means. >> >>>>>>> >> >> >> >> >>>>>>> >> >> >> Just that the op hasn't been marked as reaching any interesting points >> >>>>>>> >> >> >> (op->mark_*() calls). >> >>>>>>> >> >> >> >> >>>>>>> >> >> >> Is it possible to gather a lot with debug ms = 20 and debug osd = 20? >> >>>>>>> >> >> >> It's extremely verbose but it'll let us see where the op is getting >> >>>>>>> >> >> >> blocked. If you see the "slow request" message it means the op in >> >>>>>>> >> >> >> received by ceph (that's when the clock starts), so I suspect it's not >> >>>>>>> >> >> >> something we can blame on the network stack. >> >>>>>>> >> >> >> >> >>>>>>> >> >> >> sage >> >>>>>>> >> >> >> >> >>>>>>> >> >> >> >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> The problem is most pronounced when we have to reboot an OSD node (1 >> >>>>>>> >> >> >>> of 13), we will have hundreds of I/O blocked for some times up to 300 >> >>>>>>> >> >> >>> seconds. It takes a good 15 minutes for things to settle down. The >> >>>>>>> >> >> >>> production cluster is very busy doing normally 8,000 I/O and peaking >> >>>>>>> >> >> >>> at 15,000. This is all 4TB spindles with SSD journals and the disks >> >>>>>>> >> >> >>> are between 25-50% full. We are currently splitting PGs to distribute >> >>>>>>> >> >> >>> the load better across the disks, but we are having to do this 10 PGs >> >>>>>>> >> >> >>> at a time as we get blocked I/O. We have max_backfills and >> >>>>>>> >> >> >>> max_recovery set to 1, client op priority is set higher than recovery >> >>>>>>> >> >> >>> priority. We tried increasing the number of op threads but this didn't >> >>>>>>> >> >> >>> seem to help. It seems as soon as PGs are finished being checked, they >> >>>>>>> >> >> >>> become active and could be the cause for slow I/O while the other PGs >> >>>>>>> >> >> >>> are being checked. >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> What I don't understand is that the messages are delayed. As soon as >> >>>>>>> >> >> >>> the message is received by Ceph OSD process, it is very quickly >> >>>>>>> >> >> >>> committed to the journal and a response is sent back to the primary >> >>>>>>> >> >> >>> OSD which is received very quickly as well. I've adjust >> >>>>>>> >> >> >>> min_free_kbytes and it seems to keep the OSDs from crashing, but >> >>>>>>> >> >> >>> doesn't solve the main problem. We don't have swap and there is 64 GB >> >>>>>>> >> >> >>> of RAM per nodes for 10 OSDs. >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> Is there something that could cause the kernel to get a packet but not >> >>>>>>> >> >> >>> be able to dispatch it to Ceph such that it could be explaining why we >> >>>>>>> >> >> >>> are seeing these blocked I/O for 30+ seconds. 
Is there some pointers >> >>>>>>> >> >> >>> to tracing Ceph messages from the network buffer through the kernel to >> >>>>>>> >> >> >>> the Ceph process? >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> We can really use some pointers no matter how outrageous. We've have >> >>>>>>> >> >> >>> over 6 people looking into this for weeks now and just can't think of >> >>>>>>> >> >> >>> anything else. >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> Thanks, >> >>>>>>> >> >> >>> -----BEGIN PGP SIGNATURE----- >> >>>>>>> >> >> >>> Version: Mailvelope v1.1.0 >> >>>>>>> >> >> >>> Comment: https://www.mailvelope.com >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> wsFcBAEBCAAQBQJWEDY1CRDmVDuy+mK58QAARgoP/RcoL1qVmg7qbQrzStar >> >>>>>>> >> >> >>> NK80bqYGeYHb26xHbt1fZVgnZhXU0nN0Dv4ew0e/cYJLELSO2KCeXNfXN6F1 >> >>>>>>> >> >> >>> prZuzYagYEyj1Q1TOo+4h/nOQRYsTwQDdFzbHb/OUDN55C0QGZ29DjEvrqP6 >> >>>>>>> >> >> >>> K5l6sAQzvQDpUEEIiOCkS6pH59ira740nSmnYkEWhr1lxF/hMjb6fFlfCFe2 >> >>>>>>> >> >> >>> h1djM0GfY7vBHFGgI3jkw0BL5AQnWe+SCcCiKZmxY6xiR70FWl3XqK5M+nxm >> >>>>>>> >> >> >>> iq74y7Dv6cpenit6boMr6qtOeIt+8ko85hVMh09Hkaqz/m2FzxAKLcahzkGF >> >>>>>>> >> >> >>> Fh/M6YBzgnX7QBURTC4YQT/FVyDTW3JMuT3RKQdaX6c0iiOsVdkE+iyidWyY >> >>>>>>> >> >> >>> Hr1KzWU23Ur9yBfZ39Y43jrsSiAEwHnKjSqMowSGljdTysNEAAZQhlqZIoHb >> >>>>>>> >> >> >>> JlgpB39ugkHI1H5fZ5b2SIDz32/d5ywG4Gay9Rk6hp8VanvIrBbev+JYEoYT >> >>>>>>> >> >> >>> 8/WX+fhueHt4dqUYWIl3HZ0CEzbXbug0xmFvhrbmL2f3t9XOkDZRbAjlYrGm >> >>>>>>> >> >> >>> lswiJMDueY8JkxSnPvCQrHXqjbCcy9rMG7nTnLFz98rTcHNCwtpv0qVYhheg >> >>>>>>> >> >> >>> 4YRNRVMbfNP/6xsJvG1wVOSQPwxZSPqJh42pDqMRePJl3Zn66MTx5wvdNDpk >> >>>>>>> >> >> >>> l7OF >> >>>>>>> >> >> >>> =OI++ >> >>>>>>> >> >> >>> -----END PGP SIGNATURE----- >> >>>>>>> >> >> >>> ---------------- >> >>>>>>> >> >> >>> Robert LeBlanc >> >>>>>>> >> >> >>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> >> >>>>>>> >> >> >>> On Fri, Sep 25, 2015 at 2:40 PM, Robert LeBlanc wrote: >> >>>>>>> >> >> >>> > We dropped the replication on our cluster from 4 to 3 and it looks >> >>>>>>> >> >> >>> > like all the blocked I/O has stopped (no entries in the log for the >> >>>>>>> >> >> >>> > last 12 hours). This makes me believe that there is some issue with >> >>>>>>> >> >> >>> > the number of sockets or some other TCP issue. We have not messed with >> >>>>>>> >> >> >>> > Ephemeral ports and TIME_WAIT at this point. There are 130 OSDs, 8 KVM >> >>>>>>> >> >> >>> > hosts hosting about 150 VMs. Open files is set at 32K for the OSD >> >>>>>>> >> >> >>> > processes and 16K system wide. >> >>>>>>> >> >> >>> > >> >>>>>>> >> >> >>> > Does this seem like the right spot to be looking? What are some >> >>>>>>> >> >> >>> > configuration items we should be looking at? >> >>>>>>> >> >> >>> > >> >>>>>>> >> >> >>> > Thanks, >> >>>>>>> >> >> >>> > ---------------- >> >>>>>>> >> >> >>> > Robert LeBlanc >> >>>>>>> >> >> >>> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> >>> > >> >>>>>>> >> >> >>> > >> >>>>>>> >> >> >>> > On Wed, Sep 23, 2015 at 1:30 PM, Robert LeBlanc wrote: >> >>>>>>> >> >> >>> >> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> >> >>> >> Hash: SHA256 >> >>>>>>> >> >> >>> >> >> >>>>>>> >> >> >>> >> We were able to only get ~17Gb out of the XL710 (heavily tweaked) >> >>>>>>> >> >> >>> >> until we went to the 4.x kernel where we got ~36Gb (no tweaking). 
It >> >>>>>>> >> >> >>> >> seems that there were some major reworks in the network handling in >> >>>>>>> >> >> >>> >> the kernel to efficiently handle that network rate. If I remember >> >>>>>>> >> >> >>> >> right we also saw a drop in CPU utilization. I'm starting to think >> >>>>>>> >> >> >>> >> that we did see packet loss while congesting our ISLs in our initial >> >>>>>>> >> >> >>> >> testing, but we could not tell where the dropping was happening. We >> >>>>>>> >> >> >>> >> saw some on the switches, but it didn't seem to be bad if we weren't >> >>>>>>> >> >> >>> >> trying to congest things. We probably already saw this issue, just >> >>>>>>> >> >> >>> >> didn't know it. >> >>>>>>> >> >> >>> >> - ---------------- >> >>>>>>> >> >> >>> >> Robert LeBlanc >> >>>>>>> >> >> >>> >> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> >>> >> >> >>>>>>> >> >> >>> >> >> >>>>>>> >> >> >>> >> On Wed, Sep 23, 2015 at 1:10 PM, Mark Nelson wrote: >> >>>>>>> >> >> >>> >>> FWIW, we've got some 40GbE Intel cards in the community performance cluster >> >>>>>>> >> >> >>> >>> on a Mellanox 40GbE switch that appear (knock on wood) to be running fine >> >>>>>>> >> >> >>> >>> with 3.10.0-229.7.2.el7.x86_64. We did get feedback from Intel that older >> >>>>>>> >> >> >>> >>> drivers might cause problems though. >> >>>>>>> >> >> >>> >>> >> >>>>>>> >> >> >>> >>> Here's ifconfig from one of the nodes: >> >>>>>>> >> >> >>> >>> >> >>>>>>> >> >> >>> >>> ens513f1: flags=4163 mtu 1500 >> >>>>>>> >> >> >>> >>> inet 10.0.10.101 netmask 255.255.255.0 broadcast 10.0.10.255 >> >>>>>>> >> >> >>> >>> inet6 fe80::6a05:caff:fe2b:7ea1 prefixlen 64 scopeid 0x20 >> >>>>>>> >> >> >>> >>> ether 68:05:ca:2b:7e:a1 txqueuelen 1000 (Ethernet) >> >>>>>>> >> >> >>> >>> RX packets 169232242875 bytes 229346261232279 (208.5 TiB) >> >>>>>>> >> >> >>> >>> RX errors 0 dropped 0 overruns 0 frame 0 >> >>>>>>> >> >> >>> >>> TX packets 153491686361 bytes 203976410836881 (185.5 TiB) >> >>>>>>> >> >> >>> >>> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 >> >>>>>>> >> >> >>> >>> >> >>>>>>> >> >> >>> >>> Mark >> >>>>>>> >> >> >>> >>> >> >>>>>>> >> >> >>> >>> >> >>>>>>> >> >> >>> >>> On 09/23/2015 01:48 PM, Robert LeBlanc wrote: >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> >> >>> >>>> Hash: SHA256 >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> OK, here is the update on the saga... >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> I traced some more of blocked I/Os and it seems that communication >> >>>>>>> >> >> >>> >>>> between two hosts seemed worse than others. I did a two way ping flood >> >>>>>>> >> >> >>> >>>> between the two hosts using max packet sizes (1500). After 1.5M >> >>>>>>> >> >> >>> >>>> packets, no lost pings. Then then had the ping flood running while I >> >>>>>>> >> >> >>> >>>> put Ceph load on the cluster and the dropped pings started increasing >> >>>>>>> >> >> >>> >>>> after stopping the Ceph workload the pings stopped dropping. >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> I then ran iperf between all the nodes with the same results, so that >> >>>>>>> >> >> >>> >>>> ruled out Ceph to a large degree. I then booted in the the >> >>>>>>> >> >> >>> >>>> 3.10.0-229.14.1.el7.x86_64 kernel and with an hour test so far there >> >>>>>>> >> >> >>> >>>> hasn't been any dropped pings or blocked I/O. Our 40 Gb NICs really >> >>>>>>> >> >> >>> >>>> need the network enhancements in the 4.x series to work well. 
>> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> Does this sound familiar to anyone? I'll probably start bisecting the >> >>>>>>> >> >> >>> >>>> kernel to see where this issue in introduced. Both of the clusters >> >>>>>>> >> >> >>> >>>> with this issue are running 4.x, other than that, they are pretty >> >>>>>>> >> >> >>> >>>> differing hardware and network configs. >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> Thanks, >> >>>>>>> >> >> >>> >>>> -----BEGIN PGP SIGNATURE----- >> >>>>>>> >> >> >>> >>>> Version: Mailvelope v1.1.0 >> >>>>>>> >> >> >>> >>>> Comment: https://www.mailvelope.com >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> wsFcBAEBCAAQBQJWAvOzCRDmVDuy+mK58QAApOMP/1xmCtW++G11qcE8y/sr >> >>>>>>> >> >> >>> >>>> RkXguqZJLc4czdOwV/tjUvhVsm5qOl4wvQCtABFZpc6t4+m5nzE3LkA1rl2l >> >>>>>>> >> >> >>> >>>> AnARPOjh61TO6cV0CT8O0DlqtHmSd2y0ElgAUl0594eInEn7eI7crz8R543V >> >>>>>>> >> >> >>> >>>> 7I68XU5zL/vNJ9IIx38UqdhtSzXQQL664DGq3DLINK0Yb9XRVBlFip+Slt+j >> >>>>>>> >> >> >>> >>>> cB64TuWjOPLSH09pv7SUyksodqrTq3K7p6sQkq0MOzBkFQM1FHfOipbo/LYv >> >>>>>>> >> >> >>> >>>> F42iiQbCvFizArMu20WeOSQ4dmrXT/iecgTfEag/Zxvor2gOi/J6d2XS9ckW >> >>>>>>> >> >> >>> >>>> byEC5/rbm4yDBua2ZugeNxQLWq0Oa7spZnx7usLsu/6YzeDNI6kmtGURajdE >> >>>>>>> >> >> >>> >>>> /XC8bESWKveBzmGDzjff5oaMs9A1PZURYnlYADEODGAt6byoaoQEGN6dlFGe >> >>>>>>> >> >> >>> >>>> LwQ5nOdQYuUrWpJzTJBN3aduOxursoFY8S0eR0uXm0l1CHcp22RWBDvRinok >> >>>>>>> >> >> >>> >>>> UWk5xRBgjDCD2gIwc+wpImZbCtiTdf0vad1uLvdxGL29iFta4THzJgUGrp98 >> >>>>>>> >> >> >>> >>>> sUqM3RaTRdJYjFcNP293H7/DC0mqpnmo0Clx3jkdHX+x1EXpJUtocSeI44LX >> >>>>>>> >> >> >>> >>>> KWIMhe9wXtKAoHQFEcJ0o0+wrXWMevvx33HPC4q1ULrFX0ILNx5Mo0Rp944X >> >>>>>>> >> >> >>> >>>> 4OEo >> >>>>>>> >> >> >>> >>>> =P33I >> >>>>>>> >> >> >>> >>>> -----END PGP SIGNATURE----- >> >>>>>>> >> >> >>> >>>> ---------------- >> >>>>>>> >> >> >>> >>>> Robert LeBlanc >> >>>>>>> >> >> >>> >>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> >> >>>>>>> >> >> >>> >>>> On Tue, Sep 22, 2015 at 4:15 PM, Robert LeBlanc >> >>>>>>> >> >> >>> >>>> wrote: >> >>>>>>> >> >> >>> >>>>> >> >>>>>>> >> >> >>> >>>>> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> >> >>> >>>>> Hash: SHA256 >> >>>>>>> >> >> >>> >>>>> >> >>>>>>> >> >> >>> >>>>> This is IPoIB and we have the MTU set to 64K. There was some issues >> >>>>>>> >> >> >>> >>>>> pinging hosts with "No buffer space available" (hosts are currently >> >>>>>>> >> >> >>> >>>>> configured for 4GB to test SSD caching rather than page cache). I >> >>>>>>> >> >> >>> >>>>> found that MTU under 32K worked reliable for ping, but still had the >> >>>>>>> >> >> >>> >>>>> blocked I/O. >> >>>>>>> >> >> >>> >>>>> >> >>>>>>> >> >> >>> >>>>> I reduced the MTU to 1500 and checked pings (OK), but I'm still seeing >> >>>>>>> >> >> >>> >>>>> the blocked I/O. 
>> >>>>>>> >> >> >>> >>>>> - ---------------- >> >>>>>>> >> >> >>> >>>>> Robert LeBlanc >> >>>>>>> >> >> >>> >>>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 >> >>>>>>> >> >> >>> >>>>> >> >>>>>>> >> >> >>> >>>>> >> >>>>>>> >> >> >>> >>>>> On Tue, Sep 22, 2015 at 3:52 PM, Sage Weil wrote: >> >>>>>>> >> >> >>> >>>>>> >> >>>>>>> >> >> >>> >>>>>> On Tue, 22 Sep 2015, Samuel Just wrote: >> >>>>>>> >> >> >>> >>>>>>> >> >>>>>>> >> >> >>> >>>>>>> I looked at the logs, it looks like there was a 53 second delay >> >>>>>>> >> >> >>> >>>>>>> between when osd.17 started sending the osd_repop message and when >> >>>>>>> >> >> >>> >>>>>>> osd.13 started reading it, which is pretty weird. Sage, didn't we >> >>>>>>> >> >> >>> >>>>>>> once see a kernel issue which caused some messages to be mysteriously >> >>>>>>> >> >> >>> >>>>>>> delayed for many 10s of seconds? >> >>>>>>> >> >> >>> >>>>>> >> >>>>>>> >> >> >>> >>>>>> >> >>>>>>> >> >> >>> >>>>>> Every time we have seen this behavior and diagnosed it in the wild it >> >>>>>>> >> >> >>> >>>>>> has >> >>>>>>> >> >> >>> >>>>>> been a network misconfiguration. Usually related to jumbo frames. >> >>>>>>> >> >> >>> >>>>>> >> >>>>>>> >> >> >>> >>>>>> sage >> >>>>>>> >> >> >>> >>>>>> >> >>>>>>> >> >> >>> >>>>>> >> >>>>>>> >> >> >>> >>>>>>> >> >>>>>>> >> >> >>> >>>>>>> What kernel are you running? >> >>>>>>> >> >> >>> >>>>>>> -Sam >> >>>>>>> >> >> >>> >>>>>>> >> >>>>>>> >> >> >>> >>>>>>> On Tue, Sep 22, 2015 at 2:22 PM, Robert LeBlanc wrote: >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> -----BEGIN PGP SIGNED MESSAGE----- >> >>>>>>> >> >> >>> >>>>>>>> Hash: SHA256 >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> OK, looping in ceph-devel to see if I can get some more eyes. I've >> >>>>>>> >> >> >>> >>>>>>>> extracted what I think are important entries from the logs for the >> >>>>>>> >> >> >>> >>>>>>>> first blocked request. NTP is running all the servers so the logs >> >>>>>>> >> >> >>> >>>>>>>> should be close in terms of time. Logs for 12:50 to 13:00 are >> >>>>>>> >> >> >>> >>>>>>>> available at http://162.144.87.113/files/ceph_block_io.logs.tar.xz >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:06.500374 - osd.17 gets I/O from client >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:06.557160 - osd.17 submits I/O to osd.13 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:06.557305 - osd.17 submits I/O to osd.16 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:06.573711 - osd.16 gets I/O from osd.17 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:06.595716 - osd.17 gets ondisk result=0 from osd.16 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:06.640631 - osd.16 reports to osd.17 ondisk result=0 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:36.926691 - osd.17 reports slow I/O > 30.439150 sec >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:59.790591 - osd.13 gets I/O from osd.17 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:59.812405 - osd.17 gets ondisk result=0 from osd.13 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:56:02.941602 - osd.13 reports to osd.17 ondisk result=0 >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> In the logs I can see that osd.17 dispatches the I/O to osd.13 and >> >>>>>>> >> >> >>> >>>>>>>> osd.16 almost silmutaniously. osd.16 seems to get the I/O right away, >> >>>>>>> >> >> >>> >>>>>>>> but for some reason osd.13 doesn't get the message until 53 seconds >> >>>>>>> >> >> >>> >>>>>>>> later. 
osd.17 seems happy to just wait and doesn't resend the data >> >>>>>>> >> >> >>> >>>>>>>> (well, I'm not 100% sure how to tell which entries are the actual data >> >>>>>>> >> >> >>> >>>>>>>> transfer). >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> It looks like osd.17 is receiving responses to start the communication >> >>>>>>> >> >> >>> >>>>>>>> with osd.13, but the op is not acknowledged until almost a minute >> >>>>>>> >> >> >>> >>>>>>>> later. To me it seems that the message is getting received but not >> >>>>>>> >> >> >>> >>>>>>>> passed to another thread right away or something. This test was done >> >>>>>>> >> >> >>> >>>>>>>> with an idle cluster, a single fio client (rbd engine) with a single >> >>>>>>> >> >> >>> >>>>>>>> thread. >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> The OSD servers are almost 100% idle during these blocked I/O >> >>>>>>> >> >> >>> >>>>>>>> requests. I think I'm at the end of my troubleshooting, so I can use >> >>>>>>> >> >> >>> >>>>>>>> some help. >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> Single Test started about >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:52:36 >> >>>>>>> >> >> >>> >>>>>>>> >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:36.926680 osd.17 192.168.55.14:6800/16726 56 : >> >>>>>>> >> >> >>> >>>>>>>> cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > >> >>>>>>> >> >> >>> >>>>>>>> 30.439150 secs >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:36.926699 osd.17 192.168.55.14:6800/16726 57 : >> >>>>>>> >> >> >>> >>>>>>>> cluster [WRN] slow request 30.439150 seconds old, received at >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:06.487451: >> >>>>>>> >> >> >>> >>>>>>>> osd_op(client.250874.0:1388 rbd_data.3380e2ae8944a.0000000000000545 >> >>>>>>> >> >> >>> >>>>>>>> [set-alloc-hint object_size 4194304 write_size 4194304,write >> >>>>>>> >> >> >>> >>>>>>>> 0~4194304] 8.bbf3e8ff ack+ondisk+write+known_if_redirected e56785) >> >>>>>>> >> >> >>> >>>>>>>> currently waiting for subops from 13,16 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:36.697904 osd.16 192.168.55.13:6800/29410 7 : cluster >> >>>>>>> >> >> >>> >>>>>>>> [WRN] 2 slow requests, 2 included below; oldest blocked for > >> >>>>>>> >> >> >>> >>>>>>>> 30.379680 secs >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:36.697918 osd.16 192.168.55.13:6800/29410 8 : cluster >> >>>>>>> >> >> >>> >>>>>>>> [WRN] slow request 30.291520 seconds old, received at 2015-09-22 >> >>>>>>> >> >> >>> >>>>>>>> 12:55:06.406303: >> >>>>>>> >> >> >>> >>>>>>>> osd_op(client.250874.0:1384 rbd_data.3380e2ae8944a.0000000000000541 >> >>>>>>> >> >> >>> >>>>>>>> [set-alloc-hint object_size 4194304 write_size 4194304,write >> >>>>>>> >> >> >>> >>>>>>>> 0~4194304] 8.5fb2123f ack+ondisk+write+known_if_redirected e56785) >> >>>>>>> >> >> >>> >>>>>>>> currently waiting for subops from 13,17 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:55:36.697927 osd.16 192.168.55.13:6800/29410 9 : cluster >> >>>>>>> >> >> >>> >>>>>>>> [WRN] slow request 30.379680 seconds old, received at 2015-09-22 >> >>>>>>> >> >> >>> >>>>>>>> 12:55:06.318144: >> >>>>>>> >> >> >>> >>>>>>>> osd_op(client.250874.0:1382 rbd_data.3380e2ae8944a.000000000000053f >> >>>>>>> >> >> >>> >>>>>>>> [set-alloc-hint object_size 4194304 write_size 4194304,write >> >>>>>>> >> >> >>> >>>>>>>> 0~4194304] 8.312e69ca ack+ondisk+write+known_if_redirected e56785) >> >>>>>>> >> >> >>> >>>>>>>> currently waiting for subops from 13,14 >> >>>>>>> >> >> >>> >>>>>>>> 2015-09-22 12:58:03.998275 osd.13 192.168.55.12:6804/4574 
>>>>> Single Test started about 2015-09-22 12:52:36
>>>>>
>>>>> 2015-09-22 12:55:36.926680 osd.17 192.168.55.14:6800/16726 56 : cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.439150 secs
>>>>> 2015-09-22 12:55:36.926699 osd.17 192.168.55.14:6800/16726 57 : cluster [WRN] slow request 30.439150 seconds old, received at 2015-09-22 12:55:06.487451: osd_op(client.250874.0:1388 rbd_data.3380e2ae8944a.0000000000000545 [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] 8.bbf3e8ff ack+ondisk+write+known_if_redirected e56785) currently waiting for subops from 13,16
>>>>> 2015-09-22 12:55:36.697904 osd.16 192.168.55.13:6800/29410 7 : cluster [WRN] 2 slow requests, 2 included below; oldest blocked for > 30.379680 secs
>>>>> 2015-09-22 12:55:36.697918 osd.16 192.168.55.13:6800/29410 8 : cluster [WRN] slow request 30.291520 seconds old, received at 2015-09-22 12:55:06.406303: osd_op(client.250874.0:1384 rbd_data.3380e2ae8944a.0000000000000541 [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] 8.5fb2123f ack+ondisk+write+known_if_redirected e56785) currently waiting for subops from 13,17
>>>>> 2015-09-22 12:55:36.697927 osd.16 192.168.55.13:6800/29410 9 : cluster [WRN] slow request 30.379680 seconds old, received at 2015-09-22 12:55:06.318144: osd_op(client.250874.0:1382 rbd_data.3380e2ae8944a.000000000000053f [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] 8.312e69ca ack+ondisk+write+known_if_redirected e56785) currently waiting for subops from 13,14
>>>>> 2015-09-22 12:58:03.998275 osd.13 192.168.55.12:6804/4574 130 : cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.954212 secs
>>>>> 2015-09-22 12:58:03.998286 osd.13 192.168.55.12:6804/4574 131 : cluster [WRN] slow request 30.954212 seconds old, received at 2015-09-22 12:57:33.044003: osd_op(client.250874.0:1873 rbd_data.3380e2ae8944a.000000000000070d [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] 8.e69870d4 ack+ondisk+write+known_if_redirected e56785) currently waiting for subops from 16,17
>>>>> 2015-09-22 12:58:03.759826 osd.16 192.168.55.13:6800/29410 10 : cluster [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.704367 secs
>>>>> 2015-09-22 12:58:03.759840 osd.16 192.168.55.13:6800/29410 11 : cluster [WRN] slow request 30.704367 seconds old, received at 2015-09-22 12:57:33.055404: osd_op(client.250874.0:1874 rbd_data.3380e2ae8944a.000000000000070e [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] 8.f7635819 ack+ondisk+write+known_if_redirected e56785) currently waiting for subops from 13,17
>>>>>
>>>>> Server   IP addr         OSD
>>>>> nodev  - 192.168.55.11 - 12
>>>>> nodew  - 192.168.55.12 - 13
>>>>> nodex  - 192.168.55.13 - 16
>>>>> nodey  - 192.168.55.14 - 17
>>>>> nodez  - 192.168.55.15 - 14
>>>>> nodezz - 192.168.55.16 - 15
>>>>>
>>>>> fio job:
>>>>> [rbd-test]
>>>>> readwrite=write
>>>>> blocksize=4M
>>>>> #runtime=60
>>>>> name=rbd-test
>>>>> #readwrite=randwrite
>>>>> #bssplit=4k/85:32k/11:512/3:1m/1,4k/89:32k/10:512k/1
>>>>> #rwmixread=72
>>>>> #norandommap
>>>>> #size=1T
>>>>> #blocksize=4k
>>>>> ioengine=rbd
>>>>> rbdname=test2
>>>>> pool=rbd
>>>>> clientname=admin
>>>>> iodepth=8
>>>>> #numjobs=4
>>>>> #thread
>>>>> #group_reporting
>>>>> #time_based
>>>>> #direct=1
>>>>> #ramp_time=60
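For anyone trying to reproduce this, a job file like the one above is run directly with fio; the only prerequisites are an fio build with the rbd ioengine compiled in (librbd/librados development headers present at build time) and a readable ceph.conf plus admin keyring on the client. A sketch, assuming the job is saved as rbd-test.fio:

    # Needs fio with rbd engine support, plus /etc/ceph/ceph.conf and the admin keyring.
    fio rbd-test.fio
    # Roughly the same sequential-write load, expressed on the command line instead:
    fio --name=rbd-test --ioengine=rbd --clientname=admin --pool=rbd \
        --rbdname=test2 --rw=write --bs=4M --iodepth=8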
>>>>> Thanks,
>>>>>
>>>>> ----------------
>>>>> Robert LeBlanc
>>>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>>>>>
>>>>> On Tue, Sep 22, 2015 at 8:31 AM, Gregory Farnum wrote:
>>>>>> On Tue, Sep 22, 2015 at 7:24 AM, Robert LeBlanc wrote:
>>>>>>> Is there some way to tell in the logs that this is happening?
>>>>>>
>>>>>> You can search for the (mangled) name _split_collection
>>>>>>
>>>>>>> I'm not seeing much I/O or CPU usage during these times. Is there some way to prevent the splitting? Is there a negative side effect to doing so?
>>>>>>
>>>>>> Bump up the split and merge thresholds. You can search the list for this, it was discussed not too long ago.
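The thresholds Greg is referring to are the FileStore collection split/merge settings. The values below are only illustrative (the defaults of that era are filestore merge threshold = 10 and filestore split multiple = 2, and a PG subdirectory splits once it holds roughly split_multiple * merge_threshold * 16 objects), and raising them only helps if directory splitting really is the culprit:

    # ceph.conf on the OSD hosts -- example values only; restart the OSDs to apply.
    [osd]
        filestore merge threshold = 40
        filestore split multiple = 8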
>>>>>>> We've had I/O block for over 900 seconds, and as soon as the sessions are aborted, they are reestablished and complete immediately.
>>>>>>>
>>>>>>> The fio test is just a seq write; starting it over (rewriting from the beginning) still causes the issue. I suspected that it would not have to create new files and therefore would not split collections. This is on my test cluster with no other load.
>>>>>>
>>>>>> Hmm, that does make it seem less likely if you're really not creating new objects, if you're actually running fio in such a way that it's not allocating new FS blocks (this is probably hard to set up?).
>>>>>>
>>>>>>> I'll be doing a lot of testing today. Which log options and depths would be the most helpful for tracking this issue down?
>>>>>>
>>>>>> If you want to go log diving, "debug osd = 20", "debug filestore = 20", "debug ms = 1" are what the OSD guys like to see. That should spit out everything you need to track exactly what each Op is doing.
>>>>>> -Greg
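Those levels can either go into ceph.conf before restarting the OSDs or be injected into the running daemons for the duration of a test. A sketch, assuming admin access from a node with the client keyring (these levels are very verbose, so revert them once the logs are captured):

    # Persistent, in ceph.conf under [osd]:
    [osd]
        debug osd = 20
        debug filestore = 20
        debug ms = 1

    # Or injected into all running OSDs without a restart:
    ceph tell osd.* injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'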
--
Best Regards,
Wheat