Re: Ceph dashboard reports CephNodeNetworkPacketErrors

Hi David,

Thanks for the quick response!

The bond reports not a single link failure, nor do I register any packet loss with ping. The network cards in the server have already been replaced and the cables are new. With my setup I easily reach 2K IOPS across the cluster, so I don't suspect network congestion when the errors show up at ±300 IOPS and <100 MB/s of usage…
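
For the record, these are roughly the checks I based that on (a quick sketch; bond0 and eno5 are my interfaces, the target host is a placeholder):

# Per-slave link failure count as tracked by the bonding driver
grep -i 'link failure' /proc/net/bonding/bond0

# Kernel error counters per interface
ip -s link show eno5

# Packet loss test against another cluster host (substitute a real hostname)
ping -c 100 <other-host>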

I’ll have a network technician look at the switch; I hope he’ll find a cause for the packet errors…

Greetings,

Dominique.

From: David C. <david.casier@xxxxxxxx>
Sent: Tuesday, 7 November 2023 11:39
To: Dominique Ramaekers <dominique.ramaekers@xxxxxxxxxx>
CC: ceph-users@xxxxxxx
Subject: Re: Ceph dashboard reports CephNodeNetworkPacketErrors

Hi Dominique,

The consistency of the data should not be at risk from a problem like this: Ceph traffic runs over TCP, so errored packets are simply retransmitted. On the other hand, it's still better to solve the network problem.

Perhaps look at the state of bond0:
cat /proc/net/bonding/bond0
as well as the usual network checks.
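
For example, a rough sketch of what I mean by the usual checks (eno5 taken from your output below):

# Hardware error counters reported by the NIC driver
ethtool -S eno5 | grep -iE 'err|crc|drop'

# Negotiated link speed and duplex
ethtool eno5 | grep -E 'Speed|Duplex'

# Kernel-level per-interface error/drop counters
ip -s link show eno5
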
________________________________________________________

Best regards,

David CASIER
________________________________________________________



On Tue, 7 Nov 2023 at 11:20, Dominique Ramaekers <dominique.ramaekers@xxxxxxxxxx> wrote:
Hi,

I've been using Ceph on a 4-host cluster for a year now. I recently discovered the Ceph Dashboard :-)

Now I see that the Dashboard reports CephNodeNetworkPacketErrors (>0.01% or >10 packets/s)...

Although all systems work great, I'm worried.

'ip -s link show eno5' results:
2: eno5: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 7a:3b:79:9c:f6:d1 brd ff:ff:ff:ff:ff:ff permaddr 5c:ba:2c:08:b3:90
    RX:     bytes   packets errors dropped  missed   mcast
     734153938129 645770129  20160       0       0  342301
    TX:     bytes   packets errors dropped carrier collsns
    1085134190597 923843839      0       0       0       0
    altname enp178s0f0

So on average roughly 0.003% RX packet errors!
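
That figure comes straight from the counters above; a small sketch to reproduce it from the kernel's live sysfs statistics (eno5 is my interface):

# RX error percentage computed from the current kernel counters
err=$(cat /sys/class/net/eno5/statistics/rx_errors)
pkt=$(cat /sys/class/net/eno5/statistics/rx_packets)
awk -v e="$err" -v p="$pkt" 'BEGIN { printf "%.4f%% RX errors\n", e / p * 100 }'
# With the counters above: 20160 / 645770129 * 100 ≈ 0.0031%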

All four hosts use the same 10Gb HP switch. The hosts themselves are HP ProLiant Gen10 servers. I would expect 0% packet loss...

Anyway: should I be worried about data consistency, or can Ceph handle this amount of packet errors?

Greetings,

Dominique.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



