Re: PG_AVAILABILITY with one osd down?

What clients experience depends on whether, at that very moment, they need to read from or write to the particular PGs involved in peering.
If their objects are placed in other PGs, their I/O should not be affected.
If clients were doing I/O against the PGs that went into peering, they will notice increased latency. That is the case for Object and RBD.
With CephFS I have no experience.
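
If you want to see what that looks like from the client's side, here is a minimal sketch using the librados Python bindings (python3-rados). It just times synchronous writes; the conffile path and the pool name "testpool" are placeholders for your own environment. During peering the write simply takes longer, it does not fail:

    #!/usr/bin/env python3
    # Rough illustration only: time synchronous writes with librados.
    # Assumes /etc/ceph/ceph.conf, a usable client keyring and a pool
    # named "testpool" (a placeholder; use one of your own pools).
    import time
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('testpool')
    try:
        for i in range(30):
            start = time.monotonic()
            # write_full() blocks until the write is acknowledged; if the
            # object's PG happens to be peering, the call just takes longer.
            ioctx.write_full('latency-probe', b'x' * 4096)
            print('write %02d took %.3f s' % (i, time.monotonic() - start))
            time.sleep(1)
    finally:
        ioctx.close()
        cluster.shutdown()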

Peering of several PGs does not mean the whole cluster is unavailable during that time, only a tiny part of it.
Also, those 6 seconds are the duration of the PG_AVAILABILITY health check warning, not the length of each PG's unavailability.
It is simply the cluster noting that some placement groups performed peering during that window.
In a proper setup and under healthy conditions, a single group peers in a fraction of a second.
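
You can watch how briefly the groups actually sit in the peering state while you stop or start an OSD. A rough polling sketch with the same Python bindings is below; the JSON field names are what recent releases return for "ceph status --format json", but treat them as an assumption for your version:

    #!/usr/bin/env python3
    # Poll the cluster status once a second and print any PG states other
    # than active+clean, e.g. how many PGs are peering at a given moment.
    import json
    import time
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        while True:
            # Same data as "ceph status --format json" on the CLI.
            ret, outbuf, outs = cluster.mon_command(
                json.dumps({'prefix': 'status', 'format': 'json'}), b'')
            status = json.loads(outbuf)
            states = {s['state_name']: s['count']
                      for s in status['pgmap']['pgs_by_state']}
            busy = {k: v for k, v in states.items() if k != 'active+clean'}
            print(time.strftime('%H:%M:%S'), busy or 'all PGs active+clean')
            time.sleep(1)
    except KeyboardInterrupt:
        pass
    finally:
        cluster.shutdown()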

Restarting an OSD causes the same thing, although it is "smoother" than an unexpected death (going into the details would require quite a long elaboration).
If your setup is correct, you should be able to perform a cluster-wide restart of everything, and the only effect visible from the outside would be slightly increased latency.
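
For a planned rolling restart, the usual approach is to set the noout flag, restart one OSD at a time and wait for the placement groups to settle before moving on. A very rough sketch of that loop follows (assuming systemd-managed OSDs with units named ceph-osd@<id> and an admin keyring on the host; the "settled" check is deliberately simplistic):

    #!/usr/bin/env python3
    # Very rough rolling-restart sketch: one OSD at a time, waiting for the
    # placement groups to settle in between. Assumes systemd-managed OSDs
    # (units named ceph-osd@<id>) and an admin keyring on this host.
    import subprocess
    import time

    OSD_IDS = range(72)  # adjust to the OSD ids in your cluster

    def pgs_settled():
        # Simplistic check on "ceph pg stat" output such as
        # "1024 pgs: 1024 active+clean; ...": good enough for a sketch.
        out = subprocess.check_output(['ceph', 'pg', 'stat']).decode()
        return not any(s in out for s in ('peering', 'activating', 'degraded'))

    # Keep stopped OSDs from being marked "out", which would trigger rebalancing.
    subprocess.check_call(['ceph', 'osd', 'set', 'noout'])
    try:
        for osd in OSD_IDS:
            subprocess.check_call(['systemctl', 'restart', 'ceph-osd@%d' % osd])
            time.sleep(5)             # give the daemon a moment to rejoin
            while not pgs_settled():  # wait out the short peering window
                time.sleep(5)
    finally:
        subprocess.check_call(['ceph', 'osd', 'unset', 'noout'])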

Kind regards,
Maks
 

On Sat, 16 Feb 2019 at 21:39, <jesper@xxxxxxxx> wrote:
> Hello,
> your log extract shows that:
>
> 2019-02-15 21:40:08 OSD.29 DOWN
> 2019-02-15 21:40:09 PG_AVAILABILITY warning start
> 2019-02-15 21:40:15 PG_AVAILABILITY warning cleared
>
> 2019-02-15 21:44:06 OSD.29 UP
> 2019-02-15 21:44:08 PG_AVAILABILITY warning start
> 2019-02-15 21:44:15 PG_AVAILABILITY warning cleared
>
> What you saw is the natural consequence of OSD state change. Those two
> periods of limited PG availability (6s each) are related to peering
> that happens shortly after an OSD goes down or up.
> Basically, the placement groups stored on that OSD need peering, so
> the incoming connections are directed to other (alive) OSDs. And, yes,
> during those few seconds the data are not accessible.

Thanks, and please bear with my questions; I'm pretty new to Ceph.
What will clients (CephFS, Object) experience?
Will they just block until the time has passed and then get through, or something else?

Does that mean I'll get 72 x 6 seconds of unavailability when doing
a rolling restart of my OSDs during upgrades and such? Or is a
controlled restart different from a crash?

--
Jesper.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
