Eugen, it worked and it didn't. I had to bootstrap on v17.2.3; with v17.2.4 this
behavior still occurs. I ran numerous tests with 3 VMs, two with disks and one
for MON only: on v17.2.4 the cluster simply crashes when one of the hosts with
disks dies, even with three MONs. I don't understand why this happens.
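
For what it's worth, this is roughly how I checked the two points Eugen raises
below (MON quorum and the client-side monitor list) while one of the hosts was
down. It is only a sketch of my lab setup; the fsid and the IP addresses are
placeholders:

  # from a surviving host: do the remaining MONs still form a quorum?
  ceph quorum_status -f json-pretty | grep -A4 quorum_names
  ceph mon stat

  # the clients' /etc/ceph/ceph.conf should list all three MONs, not just one:
  [global]
      fsid = <cluster fsid>
      mon_host = 192.168.1.11,192.168.1.12,192.168.1.13   # dcs1, dcs2, dcs3 (placeholder IPs)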

On Fri, Oct 14, 2022 at 03:53, Eugen Block <eblock@xxxxxx> wrote:

> To me this sounds more like either your MONs didn't have a quorum
> anymore or your clients didn't have all MONs in their ceph.conf, maybe
> just the failed one? Then the issue is resolved now?
>
> Quoting Murilo Morais <murilo@xxxxxxxxxxxxxx>:
>
> > Unfortunately I can't verify whether ceph reports any inactive PGs. As soon
> > as the second host disconnects, practically everything locks up; nothing
> > appears even with "ceph -w". The OSDs are only reported as offline once
> > dcs2 comes back.
> >
> > Note: apparently there was a new update recently. In the test environment
> > this behavior was not happening: dcs1 stayed UP with all services, without
> > crashing, even with dcs2 DOWN, serving reads and writes, and even before
> > dcs3 was added.
> >
> > ### COMMANDS ###
> > [ceph: root@dcs1 /]# ceph osd tree
> > ID  CLASS  WEIGHT    TYPE NAME      STATUS  REWEIGHT  PRI-AFF
> > -1         65.49570  root default
> > -3         32.74785      host dcs1
> >  0    hdd   2.72899          osd.0      up   1.00000  1.00000
> >  1    hdd   2.72899          osd.1      up   1.00000  1.00000
> >  2    hdd   2.72899          osd.2      up   1.00000  1.00000
> >  3    hdd   2.72899          osd.3      up   1.00000  1.00000
> >  4    hdd   2.72899          osd.4      up   1.00000  1.00000
> >  5    hdd   2.72899          osd.5      up   1.00000  1.00000
> >  6    hdd   2.72899          osd.6      up   1.00000  1.00000
> >  7    hdd   2.72899          osd.7      up   1.00000  1.00000
> >  8    hdd   2.72899          osd.8      up   1.00000  1.00000
> >  9    hdd   2.72899          osd.9      up   1.00000  1.00000
> > 10    hdd   2.72899          osd.10     up   1.00000  1.00000
> > 11    hdd   2.72899          osd.11     up   1.00000  1.00000
> > -5         32.74785      host dcs2
> > 12    hdd   2.72899          osd.12     up   1.00000  1.00000
> > 13    hdd   2.72899          osd.13     up   1.00000  1.00000
> > 14    hdd   2.72899          osd.14     up   1.00000  1.00000
> > 15    hdd   2.72899          osd.15     up   1.00000  1.00000
> > 16    hdd   2.72899          osd.16     up   1.00000  1.00000
> > 17    hdd   2.72899          osd.17     up   1.00000  1.00000
> > 18    hdd   2.72899          osd.18     up   1.00000  1.00000
> > 19    hdd   2.72899          osd.19     up   1.00000  1.00000
> > 20    hdd   2.72899          osd.20     up   1.00000  1.00000
> > 21    hdd   2.72899          osd.21     up   1.00000  1.00000
> > 22    hdd   2.72899          osd.22     up   1.00000  1.00000
> > 23    hdd   2.72899          osd.23     up   1.00000  1.00000
> >
> >
> > [ceph: root@dcs1 /]# ceph osd pool ls detail
> > pool 1 '.mgr' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 26 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
> > pool 2 'cephfs.ovirt_hosted_engine.meta' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 77 lfor 0/0/47 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
> > pool 3 'cephfs.ovirt_hosted_engine.data' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 179 lfor 0/0/47 flags hashpspool max_bytes 107374182400 stripe_width 0 application cephfs
> > pool 6 '.nfs' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 254 lfor 0/0/252 flags hashpspool stripe_width 0 application nfs
> > pool 7 'cephfs.ovirt_storage_sas.meta' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 322 lfor 0/0/287 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
> > pool 8 'cephfs.ovirt_storage_sas.data' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 291 lfor 0/0/289 flags hashpspool stripe_width 0 application cephfs
> > pool 9 'cephfs.ovirt_storage_iso.meta' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 356 lfor 0/0/325 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
> > pool 10 'cephfs.ovirt_storage_iso.data' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 329 lfor 0/0/327 flags hashpspool stripe_width 0 application cephfs
> >
> >
> > [ceph: root@dcs1 /]# ceph osd crush rule dump replicated_rule
> > {
> >     "rule_id": 0,
> >     "rule_name": "replicated_rule",
> >     "type": 1,
> >     "steps": [
> >         {
> >             "op": "take",
> >             "item": -1,
> >             "item_name": "default"
> >         },
> >         {
> >             "op": "chooseleaf_firstn",
> >             "num": 0,
> >             "type": "host"
> >         },
> >         {
> >             "op": "emit"
> >         }
> >     ]
> > }
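
(Side note on the rule dump above: with chooseleaf_firstn over type "host" and
only two OSD hosts, every PG of a size-2 pool should end up with one copy on
dcs1 and one on dcs2. A quick spot check for a single PG, e.g. 3.0 from the
listing below; the epoch in the sample output is just an example:)

  ceph pg map 3.0
  # expected output looks roughly like:
  #   osdmap e530 pg 3.0 (3.0) -> up [1,23] acting [1,23]
  # osd.1 lives on dcs1 and osd.23 on dcs2, i.e. one replica per host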
> >
> >
> > [ceph: root@dcs1 /]# ceph pg ls-by-pool cephfs.ovirt_hosted_engine.data
> > PG    OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES      OMAP_BYTES*  OMAP_KEYS*  LOG    STATE         SINCE  VERSION    REPORTED   UP          ACTING      SCRUB_STAMP                      DEEP_SCRUB_STAMP                 LAST_SCRUB_DURATION  SCRUB_SCHEDULING
> > 3.0   69  0  0  0  285213095  0  0  10057  active+clean  41m  530'20632  530:39461  [1,23]p1    [1,23]p1    2022-10-13T03:19:33.649837+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T07:24:46.314217+0000
> > 3.1   58  0  0  0  242319360  0  0  10026  active+clean  41m  530'11926  530:21424  [6,19]p6    [6,19]p6    2022-10-13T02:15:23.395162+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T11:42:17.682881+0000
> > 3.2   71  0  0  0  294629376  0  0  10012  active+clean  41m  530'12312  530:25506  [10,16]p10  [10,16]p10  2022-10-13T06:12:48.839013+0000  2022-10-11T21:09:49.405860+0000  1  periodic scrub scheduled @ 2022-10-14T12:35:23.917129+0000
> > 3.3   63  0  0  0  262520832  0  0  10073  active+clean  41m  530'20204  530:42834  [13,11]p13  [13,11]p13  2022-10-13T01:16:17.672947+0000  2022-10-11T16:43:27.935298+0000  1  periodic scrub scheduled @ 2022-10-14T11:48:42.643271+0000
> > 3.4   59  0  0  0  240611328  0  0  10017  active+clean  41m  530'17883  530:32537  [10,22]p10  [10,22]p10  2022-10-12T22:09:09.376552+0000  2022-10-10T15:00:52.196397+0000  1  periodic scrub scheduled @ 2022-10-14T01:16:35.682204+0000
> > 3.5   67  0  0  0  281018368  0  0  10017  active+clean  41m  530'18825  530:31531  [18,3]p18   [18,3]p18   2022-10-12T18:13:50.835870+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T02:17:12.292237+0000
> > 3.6   60  0  0  0  239497216  0  0  10079  active+clean  41m  530'22537  530:34790  [0,21]p0    [0,21]p0    2022-10-12T20:38:44.998414+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T08:12:12.106892+0000
> > 3.7   54  0  0  0  221261824  0  0  10082  active+clean  41m  530'30718  530:37349  [4,12]p4    [4,12]p4    2022-10-12T20:26:54.091307+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-13T20:51:54.792643+0000
> > 3.8   70  0  0  0  293588992  0  0  4527   active+clean  41m  530'4527   530:16905  [11,21]p11  [11,21]p11  2022-10-13T07:16:50.226814+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T13:02:27.444761+0000
> > 3.9   47  0  0  0  192938407  0  0  10065  active+clean  41m  530'11065  530:21345  [19,11]p19  [19,11]p19  2022-10-13T05:05:36.274216+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T08:17:25.165367+0000
> > 3.a   60  0  0  0  251658240  0  0  10044  active+clean  41m  530'14744  530:23145  [18,1]p18   [18,1]p18   2022-10-13T04:29:38.891055+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T11:10:38.556482+0000
> > 3.b   52  0  0  0  209567744  0  0  4949   active+clean  41m  530'4949   530:26757  [7,23]p7    [7,23]p7    2022-10-12T22:08:45.621201+0000  2022-10-10T15:00:36.799456+0000  1  periodic scrub scheduled @ 2022-10-14T02:28:08.061560+0000
> > 3.c   68  0  0  0  276607307  0  0  10003  active+clean  41m  530'18828  530:39884  [18,8]p18   [18,8]p18   2022-10-12T18:25:36.991393+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T00:43:12.804024+0000
> > 3.d   67  0  0  0  272621888  0  0  6708   active+clean  41m  530'8359   530:33988  [13,7]p13   [13,7]p13   2022-10-12T21:42:29.600145+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-13T23:30:29.341646+0000
> > 3.e   68  0  0  0  276746240  0  0  5178   active+clean  41m  530'5278   530:16051  [13,1]p13   [13,1]p13   2022-10-13T05:47:06.004714+0000  2022-10-11T21:04:57.978685+0000  1  periodic scrub scheduled @ 2022-10-14T11:45:33.438178+0000
> > 3.f   65  0  0  0  269307904  0  0  10056  active+clean  41m  530'34965  530:49963  [23,4]p23   [23,4]p23   2022-10-13T08:58:09.493284+0000  2022-10-10T15:00:36.390467+0000  1  periodic scrub scheduled @ 2022-10-14T12:18:58.610252+0000
> > 3.10  66  0  0  0  271626240  0  0  4272   active+clean  41m  530'4431   530:19010  [12,9]p12   [12,9]p12   2022-10-13T03:52:14.952046+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T07:48:12.441144+0000
> > 3.11  58  0  0  0  239075657  0  0  6466   active+clean  41m  530'8563   530:24677  [18,0]p18   [18,0]p18   2022-10-12T22:25:17.255090+0000  2022-10-10T15:00:43.412084+0000  1  periodic scrub scheduled @ 2022-10-14T03:25:34.048845+0000
> > 3.12  45  0  0  0  186254336  0  0  10084  active+clean  41m  530'16084  530:31273  [6,14]p6    [6,14]p6    2022-10-13T03:05:14.109923+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T03:35:11.159743+0000
> > 3.13  68  0  0  0  275124224  0  0  10013  active+clean  41m  530'28676  530:52278  [16,8]p16   [16,8]p16   2022-10-12T21:46:50.747741+0000  2022-10-11T16:48:56.632027+0000  1  periodic scrub scheduled @ 2022-10-14T07:03:49.125496+0000
> > 3.14  58  0  0  0  240123904  0  0  7531   active+clean  41m  530'8212   530:26075  [23,4]p23   [23,4]p23   2022-10-13T04:25:39.131070+0000  2022-10-13T04:25:39.131070+0000  4  periodic scrub scheduled @ 2022-10-14T05:36:16.428326+0000
> > 3.15  59  0  0  0  247382016  0  0  8890   active+clean  41m  530'8890   530:18892  [23,3]p23   [23,3]p23   2022-10-13T04:45:48.156899+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T14:55:14.651919+0000
> > 3.16  57  0  0  0  237285376  0  0  6900   active+clean  41m  530'8766   530:20717  [19,9]p19   [19,9]p19   2022-10-13T00:13:35.716060+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T07:08:16.779024+0000
> > 3.17  56  0  0  0  234303488  0  0  10012  active+clean  41m  530'21461  530:31490  [0,13]p0    [0,13]p0    2022-10-13T07:42:57.775955+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T14:52:30.758744+0000
> > 3.18  47  0  0  0  197132288  0  0  10001  active+clean  41m  530'14783  530:20829  [10,14]p10  [10,14]p10  2022-10-13T00:41:44.050740+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T09:30:02.438044+0000
> > 3.19  50  0  0  0  209715200  0  0  10058  active+clean  41m  499'19880  530:27891  [8,23]p8    [8,23]p8    2022-10-13T10:58:13.948274+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T19:55:12.268345+0000
> > 3.1a  58  0  0  0  240123904  0  0  10037  active+clean  41m  530'36799  530:50997  [16,9]p16   [16,9]p16   2022-10-13T02:03:18.026427+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T04:55:58.684437+0000
> > 3.1b  53  0  0  0  219996160  0  0  10051  active+clean  41m  530'18388  530:29223  [0,22]p0    [0,22]p0    2022-10-12T19:19:25.675030+0000  2022-10-12T19:19:25.675030+0000  4  periodic scrub scheduled @ 2022-10-14T00:21:49.935082+0000
> > 3.1c  66  0  0  0  276762624  0  0  10027  active+clean  41m  530'16327  530:38127  [20,5]p20   [20,5]p20   2022-10-13T00:04:49.227288+0000  2022-10-10T15:00:38.834351+0000  1  periodic scrub scheduled @ 2022-10-14T01:15:26.524544+0000
> > 3.1d  49  0  0  0  201327104  0  0  10020  active+clean  41m  530'26433  530:51593  [17,9]p17   [17,9]p17   2022-10-13T03:49:02.466987+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T09:04:39.909179+0000
> > 3.1e  61  0  0  0  249098595  0  0  8790   active+clean  41m  530'8790   530:17807  [3,21]p3    [3,21]p3    2022-10-12T22:28:19.417597+0000  2022-10-10T15:00:39.474873+0000  1  periodic scrub scheduled @ 2022-10-13T23:49:55.974786+0000
> > 3.1f  53  0  0  0  222056448  0  0  10053  active+clean  41m  530'35776  530:50234  [0,15]p0    [0,15]p0    2022-10-13T07:16:46.787818+0000  2022-10-10T14:57:31.136809+0000  1  periodic scrub scheduled @ 2022-10-14T16:24:45.860894+0000
> >
> > * NOTE: Omap statistics are gathered during deep scrub and may be
> > inaccurate soon afterwards depending on utilization. See
> > http://docs.ceph.com/en/latest/dev/placement-group/#omap-statistics for
> > further details.
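
(If the cluster answers at all while dcs2 is down, this is roughly how I would
look for the inactive PGs Eugen asks about below; nothing here is specific to
this cluster:)

  ceph health detail             # lists inactive/undersized PGs if there are any
  ceph pg dump_stuck inactive    # PGs that are not serving IO
  ceph pg dump_stuck undersized  # PGs running with fewer copies than "size"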
> >
> > On Thu, Oct 13, 2022 at 13:54, Eugen Block <eblock@xxxxxx> wrote:
> >
> >> Could you share more details? Does ceph report inactive PGs when one
> >> node is down? Please share:
> >> ceph osd tree
> >> ceph osd pool ls detail
> >> ceph osd crush rule dump <rule of affected pool>
> >> ceph pg ls-by-pool <affected pool>
> >> ceph -s
> >>
> >> Quoting Murilo Morais <murilo@xxxxxxxxxxxxxx>:
> >>
> >> > Thanks for answering.
> >> > Marc, but is there no mechanism to prevent the IO pause? At the moment
> >> > I'm not worried about data loss.
> >> > I understand that setting it to replica x1 can work, but I need it to be x2.
> >> >
> >> > On Thu, Oct 13, 2022 at 12:26, Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:
> >> >
> >> >>
> >> >> > I'm having strange behavior on a new cluster.
> >> >>
> >> >> Not strange, by design
> >> >>
> >> >> > I have 3 machines, two of them have the disks. We can name them like
> >> >> > this: dcs1 to dcs3. The dcs1 and dcs2 machines contain the disks.
> >> >> >
> >> >> > I started bootstrapping through dcs1, added the other hosts and left
> >> >> > mgr on dcs3 only.
> >> >> >
> >> >> > What is happening is that if I take down dcs2 everything hangs and
> >> >> > becomes unresponsive, including the mount points that were pointed
> >> >> > to dcs1.
> >> >>
> >> >> You have to have disks in 3 machines. (Or set the replication to 1x)
> >> >>
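
(On Marc's point above, as far as I understand the knobs involved; the pool
name is just one of mine from the listing further up:)

  ceph osd pool get cephfs.ovirt_hosted_engine.data size       # currently 2
  ceph osd pool get cephfs.ovirt_hosted_engine.data min_size   # currently 1
  # size 2 on two OSD hosts means one copy per host, so losing a host leaves
  # each PG with a single copy. min_size 1 should still allow IO on that one
  # copy, while min_size 2 pauses IO until the second copy is back.
  # with a third OSD host the usual setup would be size 3 / min_size 2:
  #   ceph osd pool set cephfs.ovirt_hosted_engine.data size 3
  #   ceph osd pool set cephfs.ovirt_hosted_engine.data min_size 2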
> >> > _______________________________________________
> >> > ceph-users mailing list -- ceph-users@xxxxxxx
> >> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx