Since I'm trying to test different erasure coding plugins and
techniques I don't want the balancer active.
So I tried setting it to none as Eugene suggested, and to my
surprise I did not get any degraded messages at all, and the cluster
was in HEALTH_OK the whole time.
Interesting, maybe the balancer works differently now? Or does it
work differently under heavy load?
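Just for completeness, this is roughly how I would switch it off for
such a test (syntax from memory, please double-check it against your
release):

  ceph balancer status      # shows the current mode and whether it is active
  ceph balancer mode none   # stop it from creating new upmap entries
  ceph balancer off         # or disable the module completely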
The logs you provided indeed mention the balancer many times in lines
like these:
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug
2021-09-17T06:30:01.322+0000 7f66afb28700 0 mon.pech-mon-1@0(leader)
e7 handle_command mon_command({"prefix": "osd pg-upmap-items",
"format": "json", "pgid": "12.309", "id": [311, 344]} v 0) v1
The only suspicious lines I see are these:
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug
2021-09-17T06:30:01.402+0000 7f66b0329700 1 heartbeat_map
reset_timeout 'Monitor::cpu_tp thread 0x7f66b0329700' had timed out
after 0.000000000s
But I'm not sure if this is related. The out OSDs shouldn't have any
impact on this test.
Did you monitor the network saturation during these tests with iftop
or something similar?
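Something along these lines on the OSD hosts while the bench runs
would already tell a lot (the interface name is just an example):

  iftop -n -i eth0    # live per-connection bandwidth, without DNS lookups
  sar -n DEV 1        # or per-interface throughput every second (sysstat)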
Quoting Kai Stian Olstad <ceph+list@xxxxxxxxxx>:
On 16.09.2021 15:51, Josh Baergen wrote:
I assume it's the balancer module. If you write lots of data quickly
into the cluster, the distribution can vary and the balancer will try
to even out the placement.
The balancer won't cause degradation, only misplaced objects.
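For what it's worth, the difference should be visible in the PG
states, e.g. something like this (please double-check the state names
on your release):

  ceph pg ls degraded   # PGs that are actually missing replicas/shards
  ceph pg ls remapped   # PGs that are only being moved to other OSDs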
Since I'm trying to test different erasure coding plugins and
techniques I don't want the balancer active.
So I tried setting it to none as Eugene suggested, and to my
surprise I did not get any degraded messages at all, and the cluster
was in HEALTH_OK the whole time.
Degraded data redundancy: 260/11856050 objects degraded
(0.014%), 1 pg degraded
That status definitely indicates that something is wrong. Check your
cluster logs on your mons (/var/log/ceph/ceph.log) for the cause; my
guess is that you have OSDs flapping (rapidly going down and up again)
due to either overload (disk or network) or some sort of
misconfiguration.
So I enabled the balancer and ran the rados bench again, and the
degraded messages are back.
I guess the equivalent of /var/log/ceph/ceph.log in Cephadm is
journalctl -u
ceph-b321e76e-da3a-11eb-b75c-4f948441dcd@xxxxxxxx-mon-1.service
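I believe the cluster log can also be read through the mons, e.g.

  ceph log last 100                    # recent cluster-log entries
  cephadm logs --name mon.pech-mon-1   # journalctl wrapper, run on the mon host

but I haven't checked whether that matches ceph.log exactly.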
There are no messages about OSDs being marked down, so I don't
understand why this is happening.
I probably need to raise some verbosity level.
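Probably something like this, unless someone has a better suggestion
(option names taken from the docs, the values are just a guess):

  ceph config set mon debug_mon 10   # more detail from the monitors
  ceph config set osd debug_osd 10   # more detail from the OSDs
  ceph config rm mon debug_mon       # and back to defaults afterwards
  ceph config rm osd debug_osd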
I have attached the log from journalctl; it starts at 06:30:00 when I
started the rados bench and includes a few lines after the first
degraded message at 06:31:06.
Just be aware that 15 OSDs are set to out; since I have a problem
with an HBA on one host, all tests have been done with those 15 OSDs
in status out.
--
Kai Stian Olstad
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx