Re: 【cephfs】cephfs hung when scp/rsync large files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

I tried the mds_cache_memory_limit and it doesn’t work.

But unexpectedly,  we found drop_cache can work.

We added a "echo 3 > /proc/sys/vm/drop_caches” cron job.
Let it drop cache every 10 minutes and the problem didn’t occur for a whole night. : )


It seems to work but we can’t understand why. :( 


—
Best Regards
Li, Ning



> On Dec 6, 2018, at 18:46, Li,Ning <lining916740672@xxxxxxxxxx> wrote:
> 
> I tested the dirty_ratio and dirty_background_ratio setting on ceph storage server side.
> It didn’t work.
> 
> 
> The problem reproduced just now. 
> 
> 
> Some process stay in D state for over 10 minitues.
> These osd requests is about to take several  minutes.
> 
> Here are some logs:
> 
> [Thu Dec  6 17:59:18 2018] INFO: task java:11477 blocked for more than 120 seconds.
> [Thu Dec  6 17:59:18 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Thu Dec  6 17:59:18 2018] java            D ffffffff816a4180     0 11477   8391 0x00000180
> [Thu Dec  6 17:59:18 2018]  ffff8807942bbbd0 0000000000000086 ffff8801f7e79fa0 ffff8807942bbfd8
> [Thu Dec  6 17:59:18 2018]  ffff8807942bbfd8 ffff8807942bbfd8 ffff8801f7e79fa0 ffff88086de56cc0
> [Thu Dec  6 17:59:18 2018]  0000000000000000 7fffffffffffffff ffff88047ffa04e8 ffffffff816a4180
> [Thu Dec  6 17:59:18 2018] Call Trace:
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a4180>] ? bit_wait+0x50/0x50
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a6059>] schedule+0x29/0x70
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a3b69>] schedule_timeout+0x239/0x2c0
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81190541>] ? pagevec_lookup_tag+0x21/0x30
> [Thu Dec  6 17:59:18 2018]  [<ffffffffc081f598>] ? ceph_writepages_start+0x1298/0x1520 [ceph]
> [Thu Dec  6 17:59:18 2018]  [<ffffffff810e939c>] ? ktime_get_ts64+0x4c/0xf0
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a4180>] ? bit_wait+0x50/0x50
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a56dd>] io_schedule_timeout+0xad/0x130
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a5778>] io_schedule+0x18/0x20
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a4191>] bit_wait_io+0x11/0x50
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816a3cb5>] __wait_on_bit+0x65/0x90
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81181cb1>] wait_on_page_bit+0x81/0xa0
> [Thu Dec  6 17:59:18 2018]  [<ffffffff810b19d0>] ? wake_bit_function+0x40/0x40
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81181de1>] __filemap_fdatawait_range+0x111/0x190
> [Thu Dec  6 17:59:18 2018]  [<ffffffff8118f42e>] ? do_writepages+0x1e/0x40
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81181e74>] filemap_fdatawait_range+0x14/0x30
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81183db6>] filemap_write_and_wait_range+0x56/0x90
> [Thu Dec  6 17:59:18 2018]  [<ffffffffc082726c>] ceph_fsync+0x6c/0x510 [ceph]
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81137273>] ? __secure_computing+0x73/0x240
> [Thu Dec  6 17:59:18 2018]  [<ffffffff8111f5c6>] ? __audit_syscall_exit+0x1e6/0x280
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81231d85>] do_fsync+0x65/0xa0
> [Thu Dec  6 17:59:18 2018]  [<ffffffff81232083>] SyS_fdatasync+0x13/0x20
> [Thu Dec  6 17:59:18 2018]  [<ffffffff816b1d4e>] tracesys+0xd9/0xde
> [Thu Dec  6 18:03:38 2018] libceph: osd3 10.0.30.111:6808 socket closed (con state OPEN)
> [Thu Dec  6 18:03:39 2018] libceph: osd7 10.0.30.112:6807 socket closed (con state OPEN)
> [Thu Dec  6 18:03:39 2018] libceph: osd6 10.0.30.112:6805 socket closed (con state OPEN)
> 
> 
> cephfs kernel client side osdc:
> 
> REQUESTS 1548 homeless 0
> 124188  osd5    1.7e6c3097      [5,2]/5 [5,2]/5 100004474e2.000006bf    0x400024        1       0'0     write
> 124191  osd5    1.37d4eeac      [5,2]/5 [5,2]/5 100004474e2.000006c2    0x400024        1       0'0     write
> 124192  osd5    1.2506d82d      [5,1]/5 [5,1]/5 100004474e2.000006c3    0x400024        1       0'0     write
> 124198  osd5    1.5326c49c      [5,3]/5 [5,3]/5 100004474e1.00000888    0x400024        1       0'0     write
> 124219  osd5    1.37203f43      [5,2]/5 [5,2]/5 100004474e2.000006cb    0x400024        1       0'0     write
> 124236  osd5    1.1318aeb5      [5,0]/5 [5,0]/5 100002adc43.00000000    0x400024        1       0'0     write
> 124268  osd5    1.b4368bad      [5,0]/5 [5,0]/5 100004474e1.000008a2    0x400024        1       0'0     write
> 124272  osd5    1.a31398ad      [5,0]/5 [5,0]/5 100004474e2.000006dc    0x400024        1       0'0     write
> 124279  osd5    1.f6363aff      [5,0]/5 [5,0]/5 100004474e2.000006e3    0x400024        1       0'0     write
> 124281  osd5    1.e3386fca      [5,2]/5 [5,2]/5 100004474e2.000006e5    0x400024        1       0'0     write
> 124288  osd5    1.a95ff0ac      [5,2]/5 [5,2]/5 100004474e1.000008ab    0x400024        1       0'0     write
> 124290  osd5    1.4e9c48ff      [5,0]/5 [5,0]/5 100004474e1.000008ad    0x400024        1       0'0     write
> 124293  osd5    1.e4721518      [5,2]/5 [5,2]/5 100004474e1.000008b0    0x400024        1       0'0     write
> 124299  osd5    1.fae2c501      [5,0]/5 [5,0]/5 100004474e2.000006ec    0x400024        1       0'0     write
> 124303  osd5    1.708289ad      [5,0]/5 [5,0]/5 100004474e2.000006f0    0x400024        1       0'0     write
> 124312  osd5    1.85e93586      [5,0]/5 [5,0]/5 100004474e1.000008b8    0x400024        1       0'0     write
> 124314  osd5    1.85146f18      [5,2]/5 [5,2]/5 100004474e1.000008ba    0x400024        1       0'0     write
> 124316  osd5    1.b4c5875a      [5,0]/5 [5,0]/5 100004474e2.000006f2    0x400024        1       0'0     write
> 124336  osd5    1.cc35e38e      [5,2]/5 [5,2]/5 100004474e1.000008c5    0x400024        1       0'0     write
> 124339  osd5    1.c99de197      [5,2]/5 [5,2]/5 100004474e2.000006fe    0x400024        1       0'0     write
> 124364  osd5    1.21ea6cb7      [5,2]/5 [5,2]/5 100004474e2.0000070c    0x400024        1       0'0     write
> 124368  osd5    1.411148d8      [5,1]/5 [5,1]/5 100004474e2.00000710    0x400024        1       0'0     write
> 124370  osd5    1.c56129da      [5,1]/5 [5,1]/5 100004474e2.00000712    0x400024        1       0'0     write
> 124380  osd5    1.dd7b1c12      [5,1]/5 [5,1]/5 100004474e1.000008db    0x400024        1       0'0     write
> 124384  osd5    1.1878e328      [5,2]/5 [5,2]/5 100004474e2.00000715    0x400024        1       0'0     write
> 124390  osd5    1.f992feda      [5,1]/5 [5,1]/5 100004474e2.0000071b    0x400024        1       0'0     write
> 124396  osd5    1.84f2e0ff      [5,0]/5 [5,0]/5 100004474e1.000008e0    0x400024        1       0'0     write
> 124398  osd5    1.76e9cbb5      [5,0]/5 [5,0]/5 100004474e1.000008e2    0x400024        1       0'0     write
> 124405  osd5    1.3a4c5001      [5,0]/5 [5,0]/5 100004474e2.0000071f    0x400024        1       0'0     write
> 124413  osd5    1.91f47ada      [5,1]/5 [5,1]/5 100004474e2.00000727    0x400024        1       0'0     write
> 124420  osd5    1.da08fa82      [5,1]/5 [5,1]/5 100004474e1.000008ed    0x400024        1       0'0     write
> 124422  osd5    1.f0dc5efd      [5,3]/5 [5,3]/5 100004474e1.000008ef    0x400024        1       0'0     write
> 124435  osd5    1.28ba590a      [5,2]/5 [5,2]/5 100004474e2.00000732    0x400024        1       0'0     write
> 124442  osd5    1.21439cf       [5,1]/5 [5,1]/5 100004474e1.000008f8    0x400024        1       0'0     write
> 
> 
> ceph server side dumps:
> 
> # ceph daemon osd.5 dump_blocked_ops
> {
>    "ops": [],
>    "complaint_time": 30.000000,
>    "num_blocked_ops": 0
> }
> # ceph daemon osd.5 dump_ops_in_flight
> {
>    "ops": [],
>    "num_ops": 0
> }
> 
> # ceph daemon osd.5 dump_historic_slow_ops
> {
>    "num to keep": 20,
>    "threshold to keep": 10,
>    "Ops": [
>        {
>            "description": "osd_op(client.447197.1:71890 1.b7884c19 1:983211ed:::100004342f9.000015e6:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.573307",
>            "age": 649.893033,
>            "duration": 13.827240,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71890
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.573307",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.573307",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.573309",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.595225",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.595278",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.595291",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:22.970164",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:22.970227",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:22.970273",
>                        "event": "waiting for subops from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323234",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.400524",
>                        "event": "sub_op_commit_rec from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.400541",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.400547",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:71905 1.b5503364 1:26cc0aad:::100004342fb.00000c45:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.620624",
>            "age": 649.845717,
>            "duration": 12.702676,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71905
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.620624",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.620624",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.620626",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.635423",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.635476",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.635485",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.057810",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.057896",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.057945",
>                        "event": "waiting for subops from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.237036",
>                        "event": "sub_op_commit_rec from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323250",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323285",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323300",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:71913 1.9b59c75f 1:fae39ad9:::100004342fb.00000c4d:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.649445",
>            "age": 649.816895,
>            "duration": 12.762881,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71913
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.649445",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.649445",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.649447",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.667713",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.667783",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.667803",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.059423",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.059494",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.059540",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.147873",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412282",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412312",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412326",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:71924 1.43f2a896 1:69154fc2:::100004342f9.000015f9:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.680689",
>            "age": 649.785651,
>            "duration": 12.731666,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71924
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.680689",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.680689",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.680692",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.702583",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.702627",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.702634",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.147879",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.147979",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.148065",
>                        "event": "waiting for subops from 4"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.237049",
>                        "event": "sub_op_commit_rec from 4"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412333",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412349",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412356",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:71935 1.df579f5f 1:faf9eafb:::100004342fb.00000c54:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.751790",
>            "age": 649.714550,
>            "duration": 12.747687,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71935
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.751790",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.751790",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.751792",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.764998",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.765049",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.765059",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.149915",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.149990",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.150037",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.237059",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499434",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499466",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499477",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:71948 1.30315755 1:aaea8c0c:::100004342f9.00001602:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.765065",
>            "age": 649.701276,
>            "duration": 12.734734,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71948
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.765065",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.765065",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.765067",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.789073",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.789125",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.789136",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.237064",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.237142",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.237187",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323299",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499499",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499513",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499799",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:71952 1.e0451baf 1:f5d8a207:::100004342f9.00001606:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.789142",
>            "age": 649.677198,
>            "duration": 12.809745,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71952
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.789142",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.789142",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.789145",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.802180",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.802240",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.802249",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.238710",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.238776",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.238890",
>                        "event": "waiting for subops from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323284",
>                        "event": "sub_op_commit_rec from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598857",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598878",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598888",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:71987 1.87226b4b 1:d2d644e1:::100004342f9.0000161a:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.875449",
>            "age": 649.590892,
>            "duration": 12.723466,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 71987
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.875449",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.875449",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.875450",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.889074",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.889109",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.889113",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323306",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323402",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.323465",
>                        "event": "waiting for subops from 4"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412387",
>                        "event": "sub_op_commit_rec from 4"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598895",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598908",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598915",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72009 1.6e26a0cd 1:b3056476:::100004342f9.00001621:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.951671",
>            "age": 649.514669,
>            "duration": 13.559266,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72009
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.951671",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.951671",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.951673",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.971212",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.971265",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.971274",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.325006",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.325073",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.325120",
>                        "event": "waiting for subops from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.689393",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.510912",
>                        "event": "sub_op_commit_rec from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.510930",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.510937",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72027 1.2675011e 1:7880ae64:::100004342fb.00000c83:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.983790",
>            "age": 649.482551,
>            "duration": 13.631767,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72027
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.983790",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.983790",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.983791",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.997441",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.997486",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.997493",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412425",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412492",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.412539",
>                        "event": "waiting for subops from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.782075",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.615518",
>                        "event": "sub_op_commit_rec from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.615545",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.615556",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72047 1.20d53eaf 1:f57cab04:::100004342f9.00001638:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:10.997499",
>            "age": 649.468842,
>            "duration": 12.894479,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72047
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:10.997499",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.997499",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:10.997500",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.010879",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.010920",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.010925",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499523",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499599",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.499652",
>                        "event": "waiting for subops from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598911",
>                        "event": "sub_op_commit_rec from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.891906",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.891960",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.891977",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72114 1.5c7de55 1:aa7be3a0:::100004342fb.00000cad:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.111490",
>            "age": 649.354851,
>            "duration": 12.780517,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72114
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.111490",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.111490",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.111492",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.127195",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.127236",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.127242",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598920",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.598999",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.599048",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.688752",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.891986",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.892001",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.892007",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72124 1.59941c4b 1:d238299a:::100004342fb.00000cb7:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.127248",
>            "age": 649.339093,
>            "duration": 12.962263,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72124
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.127248",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.127248",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.127250",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.137801",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.137837",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.137842",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.600686",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.600750",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.600795",
>                        "event": "waiting for subops from 4"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.688740",
>                        "event": "sub_op_commit_rec from 4"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089450",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089493",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089511",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72136 1.4c3df65f 1:fa6fbc32:::100004342f9.00001664:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.161272",
>            "age": 649.305069,
>            "duration": 12.928269,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72136
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.161272",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.161272",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.161273",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.176621",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.176665",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.176672",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.688809",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.688878",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.688924",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.781472",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089520",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089534",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089541",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72202 1.de800dcd 1:b3b0017b:::100004342fb.00000cd8:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.235424",
>            "age": 649.230917,
>            "duration": 13.494299,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72202
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.235424",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.235424",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.235425",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.244364",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.244400",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.244406",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.690409",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.690473",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.690519",
>                        "event": "waiting for subops from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.175400",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.729691",
>                        "event": "sub_op_commit_rec from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.729714",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.729723",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72222 1.bc0cdac3 1:c35b303d:::100004342f9.00001694:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.283182",
>            "age": 649.183159,
>            "duration": 12.980151,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72222
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.283182",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.283182",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.283183",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.290043",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.290087",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.290095",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.781478",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.781545",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.781591",
>                        "event": "waiting for subops from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.892055",
>                        "event": "sub_op_commit_rec from 7"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.263304",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.263325",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.263333",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72237 1.567a0855 1:aa105e6a:::100004342f9.000016a3:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.305424",
>            "age": 649.160916,
>            "duration": 13.052176,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72237
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.305424",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.305424",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.305426",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.312298",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.312335",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.312340",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.892064",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.892156",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:23.892216",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089563",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.357583",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.357594",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.357600",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72249 1.960d9955 1:aa99b069:::100004342f9.000016af:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.319524",
>            "age": 649.146817,
>            "duration": 13.038095,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72249
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.319524",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.319524",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.319525",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.327928",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.327967",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.327973",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089569",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089666",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.089730",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.175469",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.357605",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.357614",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.357618",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:72256 1.38b4facd 1:b35f2d1c:::100004342f9.000016b6:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:04:11.334694",
>            "age": 649.131646,
>            "duration": 13.615766,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 72256
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:04:11.334694",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.334694",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.334696",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.340039",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.340076",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:11.340082",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.091372",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.091440",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.091490",
>                        "event": "waiting for subops from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.359630",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.950426",
>                        "event": "sub_op_commit_rec from 6"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.950451",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:04:24.950460",
>                        "event": "done"
>                    }
>                ]
>            }
>        },
>        {
>            "description": "osd_op(client.447197.1:75635 1.751883c0 1:03c118ae:::100004342f9.00001a19:head [write 0~4194304] snapc 1=[] ondisk+write+known_if_redirected e75)",
>            "initiated_at": "2018-12-06 18:05:16.411107",
>            "age": 584.055234,
>            "duration": 10.290467,
>            "type_data": {
>                "flag_point": "commit sent; apply or cleanup",
>                "client_info": {
>                    "client": "client.447197",
>                    "client_addr": "10.0.30.110:0/3181790720",
>                    "tid": 75635
>                },
>                "events": [
>                    {
>                        "time": "2018-12-06 18:05:16.411107",
>                        "event": "initiated"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:16.411107",
>                        "event": "header_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:16.411108",
>                        "event": "throttled"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:16.416864",
>                        "event": "all_read"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:16.416903",
>                        "event": "dispatched"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:16.416909",
>                        "event": "queued_for_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:19.383296",
>                        "event": "reached_pg"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:19.383394",
>                        "event": "started"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:19.383443",
>                        "event": "waiting for subops from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:19.957547",
>                        "event": "op_commit"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:26.701524",
>                        "event": "sub_op_commit_rec from 5"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:26.701557",
>                        "event": "commit_sent"
>                    },
>                    {
>                        "time": "2018-12-06 18:05:26.701574",
>                        "event": "done"
>                    }
>                ]
>            }
>        }
>    ]
> }
> 
> 
>
> Best Regards
> Li, Ning
> 
> 
> 
>> On Dec 6, 2018, at 14:31, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>> 
>> On Thu, Dec 6, 2018 at 2:24 PM Li,Ning <lining916740672@xxxxxxxxxx> wrote:
>>> 
>>> Hi,
>>> 
>>> Overload on the storage nodes that run OSDs is normal  during the whole copy.
>>> 
>>> The memory pressure is on the client side.  I think it may not work to turn the dirty_ratio and dirty_background_ratio to small on the ceph storage server side.
>>> 
>>> 
>>> Anyway,I will have a try.
>>> 
>>> —
>>> Best Regards
>>> Li, Ning
>>> 
>>> 
>>> 
>>>> On Dec 6, 2018, at 11:41, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>>> 
>>>> On Wed, Dec 5, 2018 at 2:33 PM NingLi <lining916740672@xxxxxxxxxx> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> We found that some process writing cephfs  will hang for a long time (> 120s)  when uploading(scp/rsync) large files(totally 50G ~ 300G)to the app node's cephfs mountpoint.
>>>>> 
>>>>> This problem is not always reproduciable. But when this problem occurs, the web(nginx) or some other services can’t be access, it’s embarrassing.
>>>>> 
>>>>> 
>>>>> Please give some tips.
>>>>> 
>>>>> 
>>>>> We are using  ceph version 13.2.1  and  3.10.0-693.el7.x86_64 kernel.
>>>>> 
>>>>> 
>>>>> When this problem occurs, the memory pressure is a bit high.
>>>>> # free -h
>>>>>            total        used        free         shared  buff/cache   available
>>>>> Mem:        30G       11G        230M        3.3M         18G         18G
>>>>> Swap:         0B        0B          0B
>>>>> 
>>>>> # mpstat -P ALL
>>>>> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
>>>>> Average:     all         3.80       0.04   1.46    55.12    0.00    0.08    0.00    0.00    0.00   39.49
>>>>> Average:       0        1.52       0.00    0.51    97.98    0.00    0.00    0.00    0.00    0.00    0.00
>>>>> Average:       1        13.07      0.00   1.51    0.00      0.00    0.00    0.00    0.00    0.00   85.43
>>>>> Average:       2        3.00       0.00    3.00    0.50      0.00    0.00    0.00    0.00    0.00   93.50
>>>>> Average:       3        2.50       0.00    2.00    0.00      0.00    0.00    0.00    0.00    0.00   95.50
>>>>> Average:       4        1.50       0.50    1.50    96.00    0.00    0.50    0.00    0.00    0.00    0.00
>>>>> Average:       5       0.50       0.00    1.01    32.16     0.00    0.00    0.00    0.00    0.00   66.33
>>>>> Average:       6        2.49       0.00    2.49    0.00     0.00    0.50    0.00    0.00    0.00   94.53
>>>>> Average:       7       1.52       0.00    0.51     97.98    0.00    0.00    0.00    0.00    0.00    0.00
>>>>> Average:       8       1.50       0.00    1.50     97.00    0.00    0.00    0.00    0.00    0.00    0.00
>>>>> Average:       9       14.07     0.00    2.01     46.23    0.00    0.00    0.00    0.00    0.00   37.69
>>>>> Average:      10       2.00      0.00    0.50     97.50    0.00    0.00    0.00    0.00    0.00    0.00
>>>>> Average:      11       2.01      0.00    1.01     96.98    0.00    0.00    0.00    0.00    0.00    0.00
>>>>> 
>>>>> We can see some process in state ‘D’ for a long time.  here are 2 fragments in dmesg.
>>>>> 
>>>> 
>>>> try setting vm.dirty_ration and vm.dirty_background_ration of machines
>>>> that run OSDs to smaller value.
>>>> 
>>>>> [Fri Nov 30 15:10:43 2018] INFO: task java:11662 blocked for more than 120 seconds.
>>>>> [Fri Nov 30 15:10:43 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> [Fri Nov 30 15:10:43 2018] java            D ffffffff816a4180     0 11662  10409 0x00000180
>>>>> [Fri Nov 30 15:10:43 2018]  ffff8805f4febbd0 0000000000000082 ffff8807e2edcf10 ffff8805f4febfd8
>>>>> [Fri Nov 30 15:10:43 2018]  ffff8805f4febfd8 ffff8805f4febfd8 ffff8807e2edcf10 ffff88086df56cc0
>>>>> [Fri Nov 30 15:10:43 2018]  0000000000000000 7fffffffffffffff ffff88087ffc3be8 ffffffff816a4180
>>>>> [Fri Nov 30 15:10:43 2018] Call Trace:
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a4180>] ? bit_wait+0x50/0x50
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a6059>] schedule+0x29/0x70
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a3b69>] schedule_timeout+0x239/0x2c0
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81190541>] ? pagevec_lookup_tag+0x21/0x30
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffffc082b598>] ? ceph_writepages_start+0x1298/0x1520 [ceph]
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff810e939c>] ? ktime_get_ts64+0x4c/0xf0
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a4180>] ? bit_wait+0x50/0x50
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a56dd>] io_schedule_timeout+0xad/0x130
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a5778>] io_schedule+0x18/0x20
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a4191>] bit_wait_io+0x11/0x50
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816a3cb5>] __wait_on_bit+0x65/0x90
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81181cb1>] wait_on_page_bit+0x81/0xa0
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff810b19d0>] ? wake_bit_function+0x40/0x40
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81181de1>] __filemap_fdatawait_range+0x111/0x190
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff811b0d79>] ? do_numa_page+0x159/0x1e0
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff8118f42e>] ? do_writepages+0x1e/0x40
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81181e74>] filemap_fdatawait_range+0x14/0x30
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81183db6>] filemap_write_and_wait_range+0x56/0x90
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffffc083326c>] ceph_fsync+0x6c/0x510 [ceph]
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81137273>] ? __secure_computing+0x73/0x240
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff8111f5c6>] ? __audit_syscall_exit+0x1e6/0x280
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81231d85>] do_fsync+0x65/0xa0
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff81232083>] SyS_fdatasync+0x13/0x20
>>>>> [Fri Nov 30 15:10:43 2018]  [<ffffffff816b1d4e>] tracesys+0xd9/0xde
>>>>> 
>>>>> 
>>>>> [Fri Nov 30 15:12:43 2018] INFO: task prometheus:9919 blocked for more than 120 seconds.
>>>>> [Fri Nov 30 15:12:43 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> [Fri Nov 30 15:12:43 2018] prometheus      D ffff8807e87dbad8     0  9919   9879 0x00000080
>>>>> [Fri Nov 30 15:12:43 2018]  ffff8807e87db970 0000000000000082 ffff8803e273cf10 ffff8807e87dbfd8
>>>>> [Fri Nov 30 15:12:43 2018]  ffff8807e87dbfd8 ffff8807e87dbfd8 ffff8803e273cf10 ffff88046e716cc0
>>>>> [Fri Nov 30 15:12:43 2018]  0000000000000000 7fffffffffffffff ffffffff816a4180 ffff8807e87dbad8
>>>>> [Fri Nov 30 15:12:43 2018] Call Trace:
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a4180>] ? bit_wait+0x50/0x50
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a6059>] schedule+0x29/0x70
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a3b69>] schedule_timeout+0x239/0x2c0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff81090007>] ? local_bh_enable+0x17/0x20
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a4180>] ? bit_wait+0x50/0x50
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a56dd>] io_schedule_timeout+0xad/0x130
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a5778>] io_schedule+0x18/0x20
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a4191>] bit_wait_io+0x11/0x50
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816a3eaf>] __wait_on_bit_lock+0x5f/0xc0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff81182094>] __lock_page+0x74/0x90
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff810b19d0>] ? wake_bit_function+0x40/0x40
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff81182a34>] __find_lock_page+0x54/0x70
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff811836e2>] grab_cache_page_write_begin+0x62/0xd0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffffc082a203>] ceph_write_begin+0x43/0xe0 [ceph]
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff811824ce>] generic_file_buffered_write+0x11e/0x2a0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffffc08259c4>] ceph_aio_write+0xac4/0xc60 [ceph]
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff8160417a>] ? inet_recvmsg+0x7a/0xa0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff811feb3d>] do_sync_write+0x8d/0xd0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff811ff5fd>] vfs_write+0xbd/0x1e0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff8120040f>] SyS_write+0x7f/0xe0
>>>>> [Fri Nov 30 15:12:43 2018]  [<ffffffff816b1b49>] system_call_fastpath+0x16/0x1b
>>>>> 
>>>>> 
>>>>> But the cephfs status shows healthy.
>>>>> # ceph -s
>>>>> cluster:
>>>>>  id:     b27bc9a3-2e32-4782-b60f-b7901b0b26e5
>>>>>  health: HEALTH_OK
>>>>> 
>>>>> services:
>>>>>  mon: 2 daemons, quorum skyaxe-storage-0,skyaxe-storage-1
>>>>>  mgr: skyaxe-storage-0(active), standbys: skyaxe-storage-1
>>>>>  mds: cephfs-1/1/1 up  {0=skyaxe-storage-1=up:active}, 1 up:standby
>>>>>  osd: 8 osds: 8 up, 8 in
>>>>> 
>>>>> data:
>>>>>  pools:   2 pools, 512 pgs
>>>>>  objects: 599.7 k objects, 876 GiB
>>>>>  usage:   2.6 TiB used, 13 TiB / 15 TiB avail
>>>>>  pgs:     512 active+clean
>>>>> 
>>>>> io:
>>>>>  client:   3.7 KiB/s wr, 0 op/s rd, 1 op/s wr
>>>>> 
>>>>> 
>>>>> We can see some unfinished requests in osdc.
>>>>> 
>>>>> #  cat /sys/kernel/debug/ceph/xxx/osdc
>>>>> REQUESTS 98 homeless 0
>>>>> 8529    osd1    1.1bfdee5       [1,7]/1 [1,7]/1 100000011be.00000000    0x400024        1       0'0     write
>>>>> 8532    osd1    1.2d81c338      [1,6]/1 [1,6]/1 100000011c0.00000000    0x400024        1       0'0     write
>>>>> 8536    osd1    1.49324f26      [1,4]/1 [1,4]/1 100000011c4.00000000    0x400024        1       0'0     write
>>>>> 8542    osd1    1.1403106f      [1,7]/1 [1,7]/1 100000011c8.00000000    0x400024        1       0'0     write
>>>>> 8549    osd1    1.ea273113      [1,4]/1 [1,4]/1 100000011ce.00000000    0x400024        1       0'0     write
>>>>> 8562    osd1    1.f2d68e9a      [1,5]/1 [1,5]/1 100000011d8.00000000    0x400024        1       0'0     write
>>>>> 8566    osd1    1.1f212283      [1,4]/1 [1,4]/1 100000011db.00000000    0x400024        1       0'0     write
>>>>> 8568    osd1    1.2e1984e7      [1,5]/1 [1,5]/1 100000011de.00000000    0x400024        1       0'0     write
>>>>> 8576    osd1    1.da7e63d0      [1,5]/1 [1,5]/1 100000011e1.00000000    0x400024        1       0'0     write
>>>>> 8581    osd1    1.3b0d280       [1,4]/1 [1,4]/1 100000011e8.00000000    0x400024        1       0'0     write
>>>>> 8582    osd1    1.80227f0b      [1,5]/1 [1,5]/1 100000011ea.00000000    0x400024        1       0'0     write
>>>>> 8593    osd1    1.1efd9391      [1,6]/1 [1,6]/1 100000011ef.00000000    0x400024        1       0'0     write
>>>>> 8594    osd1    1.3db6e880      [1,4]/1 [1,4]/1 100000011f0.00000000    0x400024        1       0'0     write
>>>>> 8611    osd1    1.6d20c41d      [1,4]/1 [1,4]/1 1000000028c.00000000    0x400024        1       0'0     write
>>>>> 8612    osd1    1.6d20c41d      [1,4]/1 [1,4]/1 1000000028c.00000000    0x400024        1       0'0     write
>>>>> 8616    osd1    1.f4d90a4d      [1,7]/1 [1,7]/1 10000001206.00000000    0x400024        1       0'0     write
>>>>> 8619    osd1    1.740514fe      [1,5]/1 [1,5]/1 10000001208.00000000    0x400024        1       0'0     write
>>>>> 8630    osd1    1.d5fbabe0      [1,6]/1 [1,6]/1 10000001216.00000000    0x400024        1       0'0     write
>>>>> 8631    osd1    1.daf28f83      [1,4]/1 [1,4]/1 10000001217.00000000    0x400024        1       0'0     write
>>>>> 8635    osd1    1.d29e01a6      [1,5]/1 [1,5]/1 1000000121a.00000000    0x400024        1       0'0     write
>>>>> 8640    osd1    1.2de0ff63      [1,7]/1 [1,7]/1 10000001220.00000000    0x400024        1       0'0     write
>>>>> 8658    osd1    1.2f91206f      [1,7]/1 [1,7]/1 1000000122c.00000000    0x400064        1       0'0     write,startsync
>>>>> ...
>>>>> 
>> 
>> how long did these osd requests take? If some requests take too long,
>> use 'ceph daemon osd.x dump dump_blocked_ops/dump_ops_in_flight' to
>> check
>> 
>> 
>>>> 
>>>>> Here is our ceph config:
>>>>> 
>>>>> # cat /etc/ceph/ceph.conf
>>>>> [global]
>>>>> fsid = b27bc9a3-2e32-4782-b60f-b7901b0b26e5
>>>>> mon_initial_members = skyaxe-storage-0, skyaxe-storage-1
>>>>> mon_host = 10.0.30.111,10.0.30.112
>>>>> auth_cluster_required = cephx
>>>>> auth_service_required = cephx
>>>>> auth_client_required = cephx
>>>>> 
>>>>> osd_pool_default_size = 2
>>>>> public_network = 10.0.30.0/24
>>>>> cluster_network = 10.0.40.0/24
>>>>> osd_pool_default_min_size = 1
>>>>> 
>>>>> max_open_files = 131072
>>>>> 
>>>>> [osd]
>>>>> osd_journal_size = 56320
>>>>> 
>>>>> journal_max_write_bytes = 1073741824
>>>>> journal_max_write_entries = 10000
>>>>> journal_queue_max_ops = 50000
>>>>> journal_queue_max_bytes = 10737418240
>>>>> 
>>>>> osd_mkfs_type = xfs
>>>>> osd_mkfs_options_xfs = -f
>>>>> osd_mount_options_xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier"
>>>>> 
>>>>> osd_max_write_size = 512
>>>>> osd_client_message_size_cap = 2147483648
>>>>> osd_deep_scrub_stride = 131072
>>>>> osd_op_threads = 8
>>>>> osd_disk_threads = 4
>>>>> osd_map_cache_size = 1024
>>>>> osd_map_cache_bl_size = 128
>>>>> osd_recovery_op_priority = 4
>>>>> osd_recovery_max_active = 10
>>>>> osd_max_backfills = 4
>>>>> 
>>>>> —
>>>>> Best Regards
>>>>> Li, Ning
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Ceph Dev]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux