Re: Cluster crashing when stopping some host

To me this sounds more like either your MONs no longer had quorum, or your clients didn't have all MONs in their ceph.conf, maybe only the failed one. Is the issue resolved now?
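
Just for reference, a client-side ceph.conf should list all three MONs, not only one. A minimal sketch (the fsid and IPs below are placeholders, adjust to your setup):

    [global]
            fsid = <cluster fsid>
            mon_host = 192.168.1.11,192.168.1.12,192.168.1.13    # dcs1, dcs2, dcs3

And while the cluster is reachable you can check which MONs actually form the quorum with:

    ceph quorum_status -f json-pretty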

Quoting Murilo Morais <murilo@xxxxxxxxxxxxxx>:

Unfortunately I can't verify whether Ceph reports any inactive PGs. As soon
as the second host goes down, practically everything locks up and nothing is
shown, even with "ceph -w". The OSDs are only reported as down once dcs2
comes back.

Note: apparently there was a recent update. In the test environment this
behavior was not happening: dcs1 stayed UP with all services, without
crashing, even with dcs2 DOWN, and kept serving reads and writes, even
without dcs3 added.
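
For the next test, one way to still query the MON state while the normal CLI
hangs might be to hit the surviving MON's admin socket directly (just a
sketch, assuming cephadm-managed daemons named after the hosts):

    # on dcs1, while dcs2 is down
    cephadm enter --name mon.dcs1
    ceph daemon mon.dcs1 mon_status      # admin socket, answers even without quorum

    # or, to avoid the regular CLI hanging indefinitely:
    cephadm shell -- ceph --connect-timeout 5 -s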

### COMMANDS ###
[ceph: root@dcs1 /]# ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         65.49570  root default
-3         32.74785      host dcs1
 0    hdd   2.72899          osd.0       up   1.00000  1.00000
 1    hdd   2.72899          osd.1       up   1.00000  1.00000
 2    hdd   2.72899          osd.2       up   1.00000  1.00000
 3    hdd   2.72899          osd.3       up   1.00000  1.00000
 4    hdd   2.72899          osd.4       up   1.00000  1.00000
 5    hdd   2.72899          osd.5       up   1.00000  1.00000
 6    hdd   2.72899          osd.6       up   1.00000  1.00000
 7    hdd   2.72899          osd.7       up   1.00000  1.00000
 8    hdd   2.72899          osd.8       up   1.00000  1.00000
 9    hdd   2.72899          osd.9       up   1.00000  1.00000
10    hdd   2.72899          osd.10      up   1.00000  1.00000
11    hdd   2.72899          osd.11      up   1.00000  1.00000
-5         32.74785      host dcs2
12    hdd   2.72899          osd.12      up   1.00000  1.00000
13    hdd   2.72899          osd.13      up   1.00000  1.00000
14    hdd   2.72899          osd.14      up   1.00000  1.00000
15    hdd   2.72899          osd.15      up   1.00000  1.00000
16    hdd   2.72899          osd.16      up   1.00000  1.00000
17    hdd   2.72899          osd.17      up   1.00000  1.00000
18    hdd   2.72899          osd.18      up   1.00000  1.00000
19    hdd   2.72899          osd.19      up   1.00000  1.00000
20    hdd   2.72899          osd.20      up   1.00000  1.00000
21    hdd   2.72899          osd.21      up   1.00000  1.00000
22    hdd   2.72899          osd.22      up   1.00000  1.00000
23    hdd   2.72899          osd.23      up   1.00000  1.00000


[ceph: root@dcs1 /]# ceph osd pool ls detail
pool 1 '.mgr' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 26 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'cephfs.ovirt_hosted_engine.meta' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 77 lfor 0/0/47 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 'cephfs.ovirt_hosted_engine.data' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 179 lfor 0/0/47 flags hashpspool max_bytes 107374182400 stripe_width 0 application cephfs
pool 6 '.nfs' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 254 lfor 0/0/252 flags hashpspool stripe_width 0 application nfs
pool 7 'cephfs.ovirt_storage_sas.meta' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 322 lfor 0/0/287 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 8 'cephfs.ovirt_storage_sas.data' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 291 lfor 0/0/289 flags hashpspool stripe_width 0 application cephfs
pool 9 'cephfs.ovirt_storage_iso.meta' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 356 lfor 0/0/325 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 10 'cephfs.ovirt_storage_iso.data' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 329 lfor 0/0/327 flags hashpspool stripe_width 0 application cephfs


[ceph: root@dcs1 /]# ceph osd crush rule dump replicated_rule
{
    "rule_id": 0,
    "rule_name": "replicated_rule",
    "type": 1,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}


[ceph: root@dcs1 /]# ceph pg ls-by-pool cephfs.ovirt_hosted_engine.data
PG    OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  STATE  SINCE  VERSION  REPORTED  UP  ACTING  SCRUB_STAMP  DEEP_SCRUB_STAMP  LAST_SCRUB_DURATION  SCRUB_SCHEDULING
3.0   69  0  0  0  285213095  0  0  10057  active+clean  41m  530'20632  530:39461  [1,23]p1    [1,23]p1    2022-10-13T03:19:33.649837+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T07:24:46.314217+0000
3.1   58  0  0  0  242319360  0  0  10026  active+clean  41m  530'11926  530:21424  [6,19]p6    [6,19]p6    2022-10-13T02:15:23.395162+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T11:42:17.682881+0000
3.2   71  0  0  0  294629376  0  0  10012  active+clean  41m  530'12312  530:25506  [10,16]p10  [10,16]p10  2022-10-13T06:12:48.839013+0000  2022-10-11T21:09:49.405860+0000  1  periodic scrub scheduled @ 2022-10-14T12:35:23.917129+0000
3.3   63  0  0  0  262520832  0  0  10073  active+clean  41m  530'20204  530:42834  [13,11]p13  [13,11]p13  2022-10-13T01:16:17.672947+0000  2022-10-11T16:43:27.935298+0000  1  periodic scrub scheduled @ 2022-10-14T11:48:42.643271+0000
3.4   59  0  0  0  240611328  0  0  10017  active+clean  41m  530'17883  530:32537  [10,22]p10  [10,22]p10  2022-10-12T22:09:09.376552+0000  2022-10-10T15:00:52.196397+0000  1  periodic scrub scheduled @ 2022-10-14T01:16:35.682204+0000
3.5   67  0  0  0  281018368  0  0  10017  active+clean  41m  530'18825  530:31531  [18,3]p18   [18,3]p18   2022-10-12T18:13:50.835870+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T02:17:12.292237+0000
3.6   60  0  0  0  239497216  0  0  10079  active+clean  41m  530'22537  530:34790  [0,21]p0    [0,21]p0    2022-10-12T20:38:44.998414+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T08:12:12.106892+0000
3.7   54  0  0  0  221261824  0  0  10082  active+clean  41m  530'30718  530:37349  [4,12]p4    [4,12]p4    2022-10-12T20:26:54.091307+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-13T20:51:54.792643+0000
3.8   70  0  0  0  293588992  0  0  4527   active+clean  41m  530'4527   530:16905  [11,21]p11  [11,21]p11  2022-10-13T07:16:50.226814+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T13:02:27.444761+0000
3.9   47  0  0  0  192938407  0  0  10065  active+clean  41m  530'11065  530:21345  [19,11]p19  [19,11]p19  2022-10-13T05:05:36.274216+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T08:17:25.165367+0000
3.a   60  0  0  0  251658240  0  0  10044  active+clean  41m  530'14744  530:23145  [18,1]p18   [18,1]p18   2022-10-13T04:29:38.891055+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T11:10:38.556482+0000
3.b   52  0  0  0  209567744  0  0  4949   active+clean  41m  530'4949   530:26757  [7,23]p7    [7,23]p7    2022-10-12T22:08:45.621201+0000  2022-10-10T15:00:36.799456+0000  1  periodic scrub scheduled @ 2022-10-14T02:28:08.061560+0000
3.c   68  0  0  0  276607307  0  0  10003  active+clean  41m  530'18828  530:39884  [18,8]p18   [18,8]p18   2022-10-12T18:25:36.991393+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T00:43:12.804024+0000
3.d   67  0  0  0  272621888  0  0  6708   active+clean  41m  530'8359   530:33988  [13,7]p13   [13,7]p13   2022-10-12T21:42:29.600145+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-13T23:30:29.341646+0000
3.e   68  0  0  0  276746240  0  0  5178   active+clean  41m  530'5278   530:16051  [13,1]p13   [13,1]p13   2022-10-13T05:47:06.004714+0000  2022-10-11T21:04:57.978685+0000  1  periodic scrub scheduled @ 2022-10-14T11:45:33.438178+0000
3.f   65  0  0  0  269307904  0  0  10056  active+clean  41m  530'34965  530:49963  [23,4]p23   [23,4]p23   2022-10-13T08:58:09.493284+0000  2022-10-10T15:00:36.390467+0000  1  periodic scrub scheduled @ 2022-10-14T12:18:58.610252+0000
3.10  66  0  0  0  271626240  0  0  4272   active+clean  41m  530'4431   530:19010  [12,9]p12   [12,9]p12   2022-10-13T03:52:14.952046+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T07:48:12.441144+0000
3.11  58  0  0  0  239075657  0  0  6466   active+clean  41m  530'8563   530:24677  [18,0]p18   [18,0]p18   2022-10-12T22:25:17.255090+0000  2022-10-10T15:00:43.412084+0000  1  periodic scrub scheduled @ 2022-10-14T03:25:34.048845+0000
3.12  45  0  0  0  186254336  0  0  10084  active+clean  41m  530'16084  530:31273  [6,14]p6    [6,14]p6    2022-10-13T03:05:14.109923+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T03:35:11.159743+0000
3.13  68  0  0  0  275124224  0  0  10013  active+clean  41m  530'28676  530:52278  [16,8]p16   [16,8]p16   2022-10-12T21:46:50.747741+0000  2022-10-11T16:48:56.632027+0000  1  periodic scrub scheduled @ 2022-10-14T07:03:49.125496+0000
3.14  58  0  0  0  240123904  0  0  7531   active+clean  41m  530'8212   530:26075  [23,4]p23   [23,4]p23   2022-10-13T04:25:39.131070+0000  2022-10-13T04:25:39.131070+0000  4  periodic scrub scheduled @ 2022-10-14T05:36:16.428326+0000
3.15  59  0  0  0  247382016  0  0  8890   active+clean  41m  530'8890   530:18892  [23,3]p23   [23,3]p23   2022-10-13T04:45:48.156899+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T14:55:14.651919+0000
3.16  57  0  0  0  237285376  0  0  6900   active+clean  41m  530'8766   530:20717  [19,9]p19   [19,9]p19   2022-10-13T00:13:35.716060+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T07:08:16.779024+0000
3.17  56  0  0  0  234303488  0  0  10012  active+clean  41m  530'21461  530:31490  [0,13]p0    [0,13]p0    2022-10-13T07:42:57.775955+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T14:52:30.758744+0000
3.18  47  0  0  0  197132288  0  0  10001  active+clean  41m  530'14783  530:20829  [10,14]p10  [10,14]p10  2022-10-13T00:41:44.050740+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T09:30:02.438044+0000
3.19  50  0  0  0  209715200  0  0  10058  active+clean  41m  499'19880  530:27891  [8,23]p8    [8,23]p8    2022-10-13T10:58:13.948274+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T19:55:12.268345+0000
3.1a  58  0  0  0  240123904  0  0  10037  active+clean  41m  530'36799  530:50997  [16,9]p16   [16,9]p16   2022-10-13T02:03:18.026427+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T04:55:58.684437+0000
3.1b  53  0  0  0  219996160  0  0  10051  active+clean  41m  530'18388  530:29223  [0,22]p0    [0,22]p0    2022-10-12T19:19:25.675030+0000  2022-10-12T19:19:25.675030+0000  4  periodic scrub scheduled @ 2022-10-14T00:21:49.935082+0000
3.1c  66  0  0  0  276762624  0  0  10027  active+clean  41m  530'16327  530:38127  [20,5]p20   [20,5]p20   2022-10-13T00:04:49.227288+0000  2022-10-10T15:00:38.834351+0000  1  periodic scrub scheduled @ 2022-10-14T01:15:26.524544+0000
3.1d  49  0  0  0  201327104  0  0  10020  active+clean  41m  530'26433  530:51593  [17,9]p17   [17,9]p17   2022-10-13T03:49:02.466987+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T09:04:39.909179+0000
3.1e  61  0  0  0  249098595  0  0  8790   active+clean  41m  530'8790   530:17807  [3,21]p3    [3,21]p3    2022-10-12T22:28:19.417597+0000  2022-10-10T15:00:39.474873+0000  1  periodic scrub scheduled @ 2022-10-13T23:49:55.974786+0000
3.1f  53  0  0  0  222056448  0  0  10053  active+clean  41m  530'35776  530:50234  [0,15]p0    [0,15]p0    2022-10-13T07:16:46.787818+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T16:24:45.860894+0000

* NOTE: Omap statistics are gathered during deep scrub and may be
inaccurate soon afterwards depending on utilization. See
http://docs.ceph.com/en/latest/dev/placement-group/#omap-statistics for
further details.

On Thu, 13 Oct 2022 at 13:54, Eugen Block <eblock@xxxxxx> wrote:

Could you share more details? Does ceph report inactive PGs when one
node is down? Please share:
ceph osd tree
ceph osd pool ls detail
ceph osd crush rule dump <rule of affected pool>
ceph pg ls-by-pool <affected pool>
ceph -s

Quoting Murilo Morais <murilo@xxxxxxxxxxxxxx>:

> Thanks for answering.
> Marc, isn't there some mechanism to prevent the I/O pause? At the moment
> I'm not worried about data loss.
> I understand that setting it to replica x1 could work, but I need it to
> be x2.
>
> On Thu, 13 Oct 2022 at 12:26, Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:
>
>>
>> >
>> > I'm having strange behavior on a new cluster.
>>
>> Not strange, by design
>>
>> > I have 3 machines, only two of which have disks. We can name them
>> > dcs1 to dcs3; dcs1 and dcs2 are the machines with the disks.
>> >
>> > I bootstrapped through dcs1, added the other hosts, and left the mgr
>> > on dcs3 only.
>> >
>> > What is happening is that if I take dcs2 down, everything hangs and
>> > becomes unresponsive, including the mount points that were pointed at
>> > dcs1.
>>
>> You have to have disks in 3 machines. (Or set the replication to 1x)
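>> (Per pool that would be something along the lines of
>>     ceph osd pool set <pool> size 1
>> but then you have no redundancy at all.)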
>>







_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



