Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

On 17.09.2021 16:10, Eugen Block wrote:
Since I'm trying to test different erasure encoding plugins and techniques I don't want the balancer active. So I tried setting it to none as Eugen suggested, and to my surprise I did not get any degraded messages at all, and the cluster was in HEALTH_OK the whole time.

Interesting, maybe the balancer works differently now? Or it works
differently under heavy load?

It would be strange if the balancer's normal operation put the cluster in a degraded state.
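
For reference, this is roughly what setting the balancer to none looks like (a minimal sketch, not the exact commands from my test run):

  # stop the balancer from moving any PGs during the test
  ceph balancer off
  ceph balancer mode none
  ceph balancer status      # should show the balancer inactive with mode "none"

  # confirm the cluster is healthy before starting the load
  ceph -s
  ceph health detail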


The only suspicious lines I see are these:

 Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug
2021-09-17T06:30:01.402+0000 7f66b0329700  1 heartbeat_map
reset_timeout 'Monitor::cpu_tp thread 0x7f66b0329700' had timed out
after 0.000000000s

But I'm not sure if this is related. The out OSDs shouldn't have any
impact on this test.

Did you monitor the network saturation during these tests with iftop
or something similar?

I did not, so I reran the test this morning.

All the servers have 2x25 Gbit/s NICs bonded with LACP (802.3ad, layer3+4 hash policy).

The peak on the active monitor was 27 Mbit/s, and less on the other two monitors. I also checked the CPU (Xeon 5222, 3.8 GHz) and none of the cores was saturated,
and the network statistics showed no errors or drops.
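
For completeness, the checks were along these lines (bond0 is just an example interface name):

  # live bandwidth on the bond interface during the test
  iftop -i bond0

  # interface counters, errors and drops
  ip -s link show bond0
  sar -n DEV 1

  # per-core CPU usage while the test runs
  mpstat -P ALL 1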


So perhaps there is a bug in the balancer code?
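
If anyone wants to reproduce it, the degraded messages can be followed during the run with something like this (a sketch using the standard health and PG listing commands):

  # follow cluster events and health changes live
  ceph -w

  # or poll the health and list any degraded PGs
  watch -n 5 'ceph health detail'
  ceph pg ls degraded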

--
Kai Stian Olstad
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx