Re: OSD had suicide timed out

If, in the above case, osd.13 was not too busy to respond (i.e. no resource
shortage), then you need to find out why else osd.5, etc. could not
contact it.
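
If you want to rule out the heartbeat path itself, something along these
lines run from osd.5's host might help (addresses taken from your log, so
adjust to your setup):

# can osd.5's host reach osd.13's heartbeat addresses at all?
ping -c 3 10.12.125.3
ping -c 3 10.12.3.17

# watch the heartbeat traffic on the wire while the flapping is happening
tcpdump -ni any host 10.12.125.3 or host 10.12.3.17

Running with debug_ms = 1 on osd.13's side as well should then show whether
osd.5's ping replies ever make it back to it.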

On Wed, Aug 8, 2018 at 6:47 PM, Josef Zelenka
<josef.zelenka@xxxxxxxxxxxxxxxx> wrote:
> I checked the system load on the host with the OSD that is currently suiciding
> and it's fine; however, I can see noticeably higher IO (around 700), though
> that seems more like a symptom of the constant flapping/attempting to come
> up (it's an SSD-based Ceph cluster, so this shouldn't cause it much harm).
> I had a look at one of the OSDs sending the you_died messages, and it seems
> it's attempting to contact osd.13 but ultimately fails.
>
> 8/0 13574/13574/13574) [5,11] r=0 lpr=13574 crt=13592'3654839 lcod
> 13592'3654838 mlcod 13592'3654838 active+clean] publish_stats_to_osd
> 13593:9552151
> 2018-08-08 10:45:16.112344 7effa1d8c700 15 osd.5 pg_epoch: 13593 pg[14.6( v
> 13592'3654839 (13294'3653334,13592'3654839] local-lis/les=13574/13575 n=945
> ec=126/126 lis/c 13574/13574 les/c/f 13575/13578/0 13574/13574/13574) [5,11]
> r=0 lpr=13574 crt=13592'3654839 lcod 13592'3654838 mlcod 13592'3654838
> active+clean] publish_stats_to_osd 13593:9552152
> 2018-08-08 10:45:16.679484 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966967
> 2018-08-08 10:45:17.818135 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 18 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:17.817238) v4 ==== 2004+0+0 (4218069135 0 0) 0x55bb638ba800 con
> 0x55bb65e79800
> 2018-08-08 10:45:17.818176 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 18 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:17.817238) v4 ==== 2004+0+0 (4218069135 0 0) 0x55bb63cd8c00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:18.919053 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 19 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:18.918149) v4 ==== 2004+0+0 (1428835292 0 0) 0x55bb638bb200 con
> 0x55bb65e79800
> 2018-08-08 10:45:18.919598 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 19 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:18.918149) v4 ==== 2004+0+0 (1428835292 0 0) 0x55bb63cd8a00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:21.679563 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966968
> 2018-08-08 10:45:23.020715 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 20 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:23.018994) v4 ==== 2004+0+0 (1018071233 0 0) 0x55bb63bb7200 con
> 0x55bb65e79800
> 2018-08-08 10:45:23.020837 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 20 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:23.018994) v4 ==== 2004+0+0 (1018071233 0 0) 0x55bb63cd8c00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:26.679513 7eff8e565700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966969
> 2018-08-08 10:45:28.921091 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 21 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:28.920140) v4 ==== 2004+0+0 (2459835898 0 0) 0x55bb638ba800 con
> 0x55bb65e79800
> 2018-08-08 10:45:28.922026 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 21 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:28.920140) v4 ==== 2004+0+0 (2459835898 0 0) 0x55bb63cd8c00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:31.679828 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966970
> 2018-08-08 10:45:33.022697 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 22 ==== osd_ping(ping e13589 stamp 2018-08-08
> 10:45:33.021217) v4 ==== 2004+0+0 (3639738084 0 0) 0x55bb63bb7200 con
> 0x55bb65e79800
>
> Regarding heartbeat messages, all I can see on the failing OSD is "heartbeat
> map is healthy" before the timeout messages. I'm wondering whether this might
> be a pool issue of some sort, as this happens on random OSDs throughout the
> cluster.
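>
> In case it's useful, the failing OSD's admin socket can also be queried while
> the daemon is up, to see what it is actually doing, e.g.:
>
> # run these on osd.13's host while the daemon is up
> ceph daemon osd.13 status
> ceph daemon osd.13 dump_ops_in_flight
> ceph daemon osd.13 dump_historic_ops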
>
>
>
> On 08/08/18 09:45, Brad Hubbard wrote:
>>
>> What is the load like on the osd host at the time and what does the
>> disk utilization look like?
>>
>> Also, what does the transaction look like from one of the osds that
>> sends the "you died" message with debugging osd 20 and ms 1 enabled?
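>>
>> For example, on one of the peers sending you_died (osd.11 here is just an
>> example - substitute the right osd id), at runtime:
>>
>> ceph tell osd.11 injectargs '--debug_osd 20/20 --debug_ms 1/1'
>> # and revert to your previous debug levels once a failure has been captured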
>>
>> On Wed, Aug 8, 2018 at 5:34 PM, Josef Zelenka
>> <josef.zelenka@xxxxxxxxxxxxxxxx> wrote:
>>>
>>> Thank you for your suggestion. I tried it, and it really seems like the other
>>> OSDs think this OSD is dead (if I understand it right); however, the networking
>>> between the nodes seems absolutely fine (no issues in graphs etc.).
>>>
>>>     -13> 2018-08-08 09:13:58.466119 7fe053d41700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.12 10.12.3.17:6807/4624236 81 ==== osd_ping(ping_reply e13452
>>> stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (687351303 0 0)
>>> 0x55731eb73e00
>>> con 0x55731e7d4800
>>>     -12> 2018-08-08 09:13:58.466140 7fe054542700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.11 10.12.3.16:6812/19232 81 ==== osd_ping(ping_reply e13452 stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (687351303 0 0)
>>> 0x55733c391200
>>> con 0x55731e7a5800
>>>     -11> 2018-08-08 09:13:58.466147 7fe053540700  1 --
>>> 10.12.125.3:0/706864
>>> <== osd.11 10.12.125.2:6811/19232 82 ==== osd_ping(you_died e13452 stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (3111562112 0 0)
>>> 0x55731eb66800
>>> con 0x55731e7a4000
>>>     -10> 2018-08-08 09:13:58.466164 7fe054542700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.11 10.12.3.16:6812/19232 82 ==== osd_ping(you_died e13452 stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (3111562112 0 0)
>>> 0x55733c391200
>>> con 0x55731e7a5800
>>>      -9> 2018-08-08 09:13:58.466164 7fe053d41700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.12 10.12.3.17:6807/4624236 82 ==== osd_ping(you_died e13452 stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (3111562112 0 0)
>>> 0x55731eb73e00
>>> con 0x55731e7d4800
>>>      -8> 2018-08-08 09:13:58.466176 7fe053540700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.9 10.12.3.16:6813/10016600 81 ==== osd_ping(ping_reply e13452
>>> stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (687351303 0 0)
>>> 0x55731eb66800
>>> con 0x55731e732000
>>>      -7> 2018-08-08 09:13:58.466200 7fe053d41700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.10 10.12.3.16:6810/2017908 81 ==== osd_ping(ping_reply e13452
>>> stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (687351303 0 0)
>>> 0x55731eb73e00
>>> con 0x55731e796800
>>>      -6> 2018-08-08 09:13:58.466208 7fe053540700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.9 10.12.3.16:6813/10016600 82 ==== osd_ping(you_died e13452 stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (3111562112 0 0)
>>> 0x55731eb66800
>>> con 0x55731e732000
>>>      -5> 2018-08-08 09:13:58.466222 7fe053d41700  1 --
>>> 10.12.3.17:0/706864
>>> <== osd.10 10.12.3.16:6810/2017908 82 ==== osd_ping(you_died e13452 stamp
>>> 2018-08-08 09:13:58.464608) v4 ==== 2004+0+0 (3111562112 0 0)
>>> 0x55731eb73e00
>>> con 0x55731e796800
>>>      -4> 2018-08-08 09:13:59.748336 7fe040531700  1 --
>>> 10.12.3.17:6802/706864
>>> --> 10.12.3.16:6800/1677830 -- mgrreport(unknown.13 +0-0 packed 742
>>> osd_metrics=1) v5 -- 0x55731fa4af00 con 0
>>>      -3> 2018-08-08 09:13:59.748538 7fe040531700  1 --
>>> 10.12.3.17:6802/706864
>>> --> 10.12.3.16:6800/1677830 -- pg_stats(64 pgs tid 0 v 0) v1 --
>>> 0x55733cbf4c00 con 0
>>>      -2> 2018-08-08 09:14:00.953804 7fe0525a1700  1 heartbeat_map
>>> is_healthy
>>> 'OSD::peering_tp thread 0x7fe03f52f700' had timed out after 15
>>>      -1> 2018-08-08 09:14:00.953857 7fe0525a1700  1 heartbeat_map
>>> is_healthy
>>> 'OSD::peering_tp thread 0x7fe03f52f700' had suicide timed out after 150
>>>       0> 2018-08-08 09:14:00.970742 7fe03f52f700 -1 *** Caught signal
>>> (Aborted) **
>>>
>>>
>>> Could it be that the suiciding OSDs are rejecting the ping somehow? I'm
>>> quite confused as to what's really going on here; it seems completely
>>> random to me.
>>>
>>>
>>> On 08/08/18 01:51, Brad Hubbard wrote:
>>>>
>>>> Try to work out why the other osds are saying this one is down. Is it
>>>> because this osd is too busy to respond, or something else?
>>>>
>>>> debug_ms = 1 will show you some message debugging which may help.
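>>>>
>>>> e.g. in ceph.conf on the osd hosts:
>>>>
>>>> [osd]
>>>>     debug ms = 1
>>>>
>>>> or, to avoid restarting anything, at runtime:
>>>>
>>>> ceph tell osd.* injectargs '--debug_ms 1'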
>>>>
>>>> On Tue, Aug 7, 2018 at 10:34 PM, Josef Zelenka
>>>> <josef.zelenka@xxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> To follow up, I did some further digging with debug_osd=20/20 and it appears
>>>>> as if there's no traffic to the OSD, even though it comes up as far as the
>>>>> cluster is concerned (this started happening on another OSD in the cluster
>>>>> today, same symptoms):
>>>>>
>>>>>      -27> 2018-08-07 14:10:55.146531 7f9fce3cd700 10 osd.0 12560
>>>>> handle_osd_ping osd.17 10.12.3.17:6811/19661 says i am down in 12566
>>>>>      -26> 2018-08-07 14:10:55.146542 7f9fcebce700 10 osd.0 12560
>>>>> handle_osd_ping osd.12 10.12.125.3:6807/4624236 says i am down in 12566
>>>>>      -25> 2018-08-07 14:10:55.146551 7f9fcf3cf700 10 osd.0 12560
>>>>> handle_osd_ping osd.13 10.12.3.17:6805/186262 says i am down in 12566
>>>>>      -24> 2018-08-07 14:10:55.146564 7f9fce3cd700 20 osd.0 12559
>>>>> share_map_peer 0x56308a9d0000 already has epoch 12566
>>>>>      -23> 2018-08-07 14:10:55.146576 7f9fcebce700 20 osd.0 12559
>>>>> share_map_peer 0x56308abb9800 already has epoch 12566
>>>>>      -22> 2018-08-07 14:10:55.146590 7f9fcf3cf700 20 osd.0 12559
>>>>> share_map_peer 0x56308abb1000 already has epoch 12566
>>>>>      -21> 2018-08-07 14:10:55.146600 7f9fce3cd700 10 osd.0 12560
>>>>> handle_osd_ping osd.15 10.12.125.3:6813/49064793 says i am down in
>>>>> 12566
>>>>>      -20> 2018-08-07 14:10:55.146609 7f9fcebce700 10 osd.0 12560
>>>>> handle_osd_ping osd.16 10.12.3.17:6801/1018363 says i am down in 12566
>>>>>      -19> 2018-08-07 14:10:55.146619 7f9fcf3cf700 10 osd.0 12560
>>>>> handle_osd_ping osd.11 10.12.3.16:6812/19232 says i am down in 12566
>>>>>      -18> 2018-08-07 14:10:55.146643 7f9fcf3cf700 20 osd.0 12559
>>>>> share_map_peer 0x56308a9d0000 already has epoch 12566
>>>>>      -17> 2018-08-07 14:10:55.146653 7f9fcf3cf700 10 osd.0 12560
>>>>> handle_osd_ping osd.15 10.12.3.17:6812/49064793 says i am down in 12566
>>>>>      -16> 2018-08-07 14:10:55.448468 7f9fcabdd700 10 osd.0 12560
>>>>> tick_without_osd_lock
>>>>>      -15> 2018-08-07 14:10:55.448491 7f9fcabdd700 20 osd.0 12559
>>>>> can_inc_scrubs_pending 0 -> 1 (max 1, active 0)
>>>>>      -14> 2018-08-07 14:10:55.448497 7f9fcabdd700 20 osd.0 12560
>>>>> scrub_time_permit should run between 0 - 24 now 14 = yes
>>>>>      -13> 2018-08-07 14:10:55.448525 7f9fcabdd700 20 osd.0 12560
>>>>> scrub_load_below_threshold loadavg 2.31 < daily_loadavg 2.68855 and <
>>>>> 15m
>>>>> avg 2.63 = yes
>>>>>      -12> 2018-08-07 14:10:55.448535 7f9fcabdd700 20 osd.0 12560
>>>>> sched_scrub
>>>>> load_is_low=1
>>>>>      -11> 2018-08-07 14:10:55.448555 7f9fcabdd700 10 osd.0 12560
>>>>> sched_scrub
>>>>> 15.112 scheduled at 2018-08-07 15:03:15.052952 > 2018-08-07
>>>>> 14:10:55.448494
>>>>>      -10> 2018-08-07 14:10:55.448563 7f9fcabdd700 20 osd.0 12560
>>>>> sched_scrub
>>>>> done
>>>>>       -9> 2018-08-07 14:10:55.448565 7f9fcabdd700 10 osd.0 12559
>>>>> promote_throttle_recalibrate 0 attempts, promoted 0 objects and 0
>>>>> bytes;
>>>>> target 25 obj/sec or 5120 k bytes/sec
>>>>>       -8> 2018-08-07 14:10:55.448568 7f9fcabdd700 20 osd.0 12559
>>>>> promote_throttle_recalibrate  new_prob 1000
>>>>>       -7> 2018-08-07 14:10:55.448569 7f9fcabdd700 10 osd.0 12559
>>>>> promote_throttle_recalibrate  actual 0, actual/prob ratio 1, adjusted
>>>>> new_prob 1000, prob 1000 -> 1000
>>>>>       -6> 2018-08-07 14:10:55.507159 7f9faab9d700 20 osd.0 op_wq(5)
>>>>> _process
>>>>> empty q, waiting
>>>>>       -5> 2018-08-07 14:10:55.812434 7f9fb5bb3700 20 osd.0 op_wq(7)
>>>>> _process
>>>>> empty q, waiting
>>>>>       -4> 2018-08-07 14:10:56.236584 7f9fcd42e700  1 heartbeat_map
>>>>> is_healthy
>>>>> 'OSD::osd_op_tp thread 0x7f9fa7396700' had timed out after 60
>>>>>       -3> 2018-08-07 14:10:56.236618 7f9fcd42e700  1 heartbeat_map
>>>>> is_healthy
>>>>> 'OSD::osd_op_tp thread 0x7f9fb33ae700' had timed out after 60
>>>>>       -2> 2018-08-07 14:10:56.236621 7f9fcd42e700  1 heartbeat_map
>>>>> is_healthy
>>>>> 'OSD::peering_tp thread 0x7f9fba3bc700' had timed out after 15
>>>>>       -1> 2018-08-07 14:10:56.236640 7f9fcd42e700  1 heartbeat_map
>>>>> is_healthy
>>>>> 'OSD::peering_tp thread 0x7f9fba3bc700' had suicide timed out after 150
>>>>>        0> 2018-08-07 14:10:56.245420 7f9fba3bc700 -1 *** Caught signal
>>>>> (Aborted) **
>>>>>    in thread 7f9fba3bc700 thread_name:tp_peering
>>>>>
>>>>> The OSD cyclically crashes and comes back up. I tried modifying the recovery
>>>>> and other timeouts, but no luck - the situation is still the same. Regarding
>>>>> the radosgw, across all nodes, after starting the rgw process, I only get
>>>>> this:
>>>>>
>>>>> 2018-08-07 14:32:17.852785 7f482dcaf700  2
>>>>> RGWDataChangesLog::ChangesRenewThread: start
>>>>>
>>>>> I found this thread in the ceph mailing list
>>>>> (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018956.html),
>>>>> but I'm not sure whether it is the same thing (albeit with the same error),
>>>>> as I don't use S3 ACLs/expiration in my cluster (if it's enabled by default,
>>>>> I'm not aware of it)
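>>>>>
>>>>> For reference, the op thread pool timeouts should be osd_op_thread_timeout
>>>>> and osd_op_thread_suicide_timeout (if I remember the names right - I'm not
>>>>> sure which option covers the peering_tp grace). In ceph.conf they would look
>>>>> something like the lines below, though raising them presumably only hides
>>>>> whatever is stalling the thread:
>>>>>
>>>>> [osd]
>>>>>     # example values only
>>>>>     osd op thread timeout = 60
>>>>>     osd op thread suicide timeout = 300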
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 06/08/18 16:30, Josef Zelenka wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm running a cluster on Luminous (12.2.5), Ubuntu 16.04 - the configuration
>>>>>> is 3 nodes with 6 drives each (though I have encountered this on a different
>>>>>> cluster with similar hardware, only with HDDs instead of SSDs - same usage).
>>>>>> I have recently seen a bug(?) where one of the OSDs suddenly spikes in IOPS
>>>>>> and constantly restarts (apparently trying to load the journal/filemap),
>>>>>> which renders the radosgw (the primary use of this cluster) unable to write.
>>>>>> The only thing that helps is stopping the OSD, but that helps only until
>>>>>> another one does the same thing. Any clue on the cause of this? Logs of the
>>>>>> OSD when it crashes are below. Thanks
>>>>>>
>>>>>> Josef
>>>>>>
>>>>>>    -9920> 2018-08-06 12:12:10.588227 7f8e7afcb700  1 heartbeat_map
>>>>>> is_healthy 'OSD::osd_op_tp thread 0x7f8e56f9a700' had timed out after
>>>>>> 60
>>>>>>    -9919> 2018-08-06 12:12:10.607070 7f8e7a7ca700  1 heartbeat_map
>>>>>> is_healthy 'OSD::osd_op_tp thread 0x7f8e56f9a700' had timed out after
>>>>>> 60
>>>>>> --
>>>>>>       -1> 2018-08-06 14:12:52.428994 7f8e7982b700  1 heartbeat_map
>>>>>> is_healthy 'OSD::osd_op_tp thread 0x7f8e56f9a700' had suicide timed
>>>>>> out
>>>>>> after 150
>>>>>>        0> 2018-08-06 14:12:52.432088 7f8e56f9a700 -1 *** Caught signal
>>>>>> (Aborted) **
>>>>>>    in thread 7f8e56f9a700 thread_name:tp_osd_tp
>>>>>>
>>>>>>    ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a)
>>>>>> luminous
>>>>>> (stable)
>>>>>>    1: (()+0xa7cab4) [0x55868269aab4]
>>>>>>    2: (()+0x11390) [0x7f8e7e51d390]
>>>>>>    3: (()+0x1026d) [0x7f8e7e51c26d]
>>>>>>    4: (pthread_mutex_lock()+0x7d) [0x7f8e7e515dbd]
>>>>>>    5: (Mutex::Lock(bool)+0x49) [0x5586826bb899]
>>>>>>    6: (PG::lock(bool) const+0x33) [0x55868216ace3]
>>>>>>    7: (OSD::ShardedOpWQ::_process(unsigned int,
>>>>>> ceph::heartbeat_handle_d*)+0x844) [0x558682101044]
>>>>>>    8: (ShardedThreadPool::shardedthreadpool_worker(unsigned
>>>>>> int)+0x884)
>>>>>> [0x5586826e27f4]
>>>>>>    9: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
>>>>>> [0x5586826e5830]
>>>>>>    10: (()+0x76ba) [0x7f8e7e5136ba]
>>>>>>    11: (clone()+0x6d) [0x7f8e7d58a41d]
>>>>>>    NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>>>> needed
>>>>>> to interpret this.
>>>>>>
>>>>>> --- logging levels ---
>>>>>>      0/ 5 none
>>>>>>      0/ 0 lockdep
>>>>>>      0/ 0 context
>>>>>>      0/ 0 crush
>>>>>>      1/ 5 mds
>>>>>>      1/ 5 mds_balancer
>>>>>>      1/ 5 mds_locker
>>>>>>      1/ 5 mds_log
>>>>>>      1/ 5 mds_log_expire
>>>>>>      1/ 5 mds_migrator
>>>>>>      0/ 0 buffer
>>>>>>      0/ 0 timer
>>>>>>      0/ 0 filer
>>>>>>      0/ 1 striper
>>>>>>      0/ 0 objecter
>>>>>>      0/ 0 rados
>>>>>>      0/ 0 rbd
>>>>>>      0/ 5 rbd_mirror
>>>>>>      0/ 5 rbd_replay
>>>>>>      0/ 0 journaler
>>>>>>      0/ 0 objectcacher
>>>>>>      0/ 0 client
>>>>>>      0/ 0 osd
>>>>>>      0/ 0 optracker
>>>>>>      0/ 0 objclass
>>>>>>      0/ 0 filestore
>>>>>>      0/ 0 journal
>>>>>>      0/ 0 ms
>>>>>>      0/ 0 mon
>>>>>>      0/ 0 monc
>>>>>>      0/ 0 paxos
>>>>>>      0/ 0 tp
>>>>>>      0/ 0 auth
>>>>>>      1/ 5 crypto
>>>>>>      0/ 0 finisher
>>>>>>      1/ 1 reserver
>>>>>>      1/ 5 heartbeatmap
>>>>>>      0/ 0 perfcounter
>>>>>>      0/ 0 rgw
>>>>>>      1/10 civetweb
>>>>>>      1/ 5 javaclient
>>>>>>      0/ 0 asok
>>>>>>      0/ 0 throttle
>>>>>>      0/ 0 refs
>>>>>>      1/ 5 xio
>>>>>>      1/ 5 compressor
>>>>>>      1/ 5 bluestore
>>>>>>      1/ 5 bluefs
>>>>>>      1/ 3 bdev
>>>>>>      1/ 5 kstore
>>>>>>      4/ 5 rocksdb
>>>>>>      4/ 5 leveldb
>>>>>>      4/ 5 memdb
>>>>>>      1/ 5 kinetic
>>>>>>      1/ 5 fuse
>>>>>>      1/ 5 mgr
>>>>>>      1/ 5 mgrc
>>>>>>      1/ 5 dpdk
>>>>>>      1/ 5 eventtrace
>>>>>>     -2/-2 (syslog threshold)
>>>>>>     -1/-1 (stderr threshold)
>>>>>>     max_recent     10000
>>>>>>     max_new         1000
>>>>>>     log_file /var/log/ceph/ceph-osd.7.log
>>>>>> --- end dump of recent events ---
>>>>>>
>>>>
>>>>
>>>>
>>
>>
>



-- 
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


