Hi,
we recently set up a new Pacific cluster with cephadm and deployed
NFS on two hosts and ingress on two other hosts (ceph orch apply for
nfs and ingress, following the docs page).
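For reference, the service specs looked roughly like this (service
IDs, host names and IPs here are placeholders from memory, not the
exact ones we used):

  service_type: nfs
  service_id: mynfs
  placement:
    hosts:
      - nfs-host1
      - nfs-host2
  spec:
    port: 12049
  ---
  service_type: ingress
  service_id: nfs.mynfs
  placement:
    hosts:
      - ingress-host1
      - ingress-host2
  spec:
    backend_service: nfs.mynfs
    frontend_port: 2049
    monitor_port: 9049
    virtual_ip: 192.0.2.10/24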
So far so good. ESXi connects via NFS 4.1, but the way ingress works
confuses me.
It pins clients statically to one NFS daemon based on their source IP
address. (I know NFS won't like it if a client switches back and forth
all the time, because of reservations.)
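Looking at the haproxy.cfg that cephadm generated on the ingress
hosts, I believe the relevant part is source-IP balancing, something
like this (paraphrased from memory; server names and IPs are
placeholders):

  frontend frontend
      bind *:2049
      mode tcp
      default_backend backend

  backend backend
      mode tcp
      balance source
      hash-type consistent
      server nfs.mynfs.0 192.0.2.21:12049
      server nfs.mynfs.1 192.0.2.22:12049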
Three of our ESXi servers seem to connect to host1, the fourth one to
the other. This leads to a problem in ESXi: it doesn't recognize the
datastore as the same one that the other hosts mounted. I can't find
documentation on how exactly ESXi calculates that identity, but there
must be different information coming from these NFS daemons;
nfs-ganesha doesn't behave exactly the same on the two hosts.
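Is comparing the export definitions the right place to look? E.g.
(exact command names may differ between releases):

  ceph nfs cluster ls
  ceph nfs export ls mynfs --detailed

plus diffing the ganesha.conf inside the two NFS containers.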
Besides that, I wanted to do some failover tests before the cluster
goes live. I stopped one NFS server, but ingress (haproxy) doesn't
seem to care.
On the haproxy stats page both backends are listed with "no check", so
there is no failover for the NFS clients: haproxy never moves them to
the surviving host. The datastores stay disconnected and I'm unable to
mount new ones.
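For comparison, I'd have expected a health-checked backend, something
like this (hand-written sketch, not what cephadm generated):

  backend backend
      mode tcp
      balance source
      hash-type consistent
      server nfs.mynfs.0 192.0.2.21:12049 check inter 2s fall 3 rise 2
      server nfs.mynfs.1 192.0.2.22:12049 check inter 2s fall 3 rise 2

Is there a reason the ingress service omits the check option, or a
supported way to add it?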
How is ingress supposed to detect a failed NFS server, and how do I
make the ganesha daemons present identical information so that clients
treat them as the same server?
Bonus question: why can't keepalived just manage nfs-ganesha on the
two hosts directly, instead of going through haproxy? That would
eliminate an extra network hop. A sketch of what I have in mind
follows.
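Something like this hypothetical keepalived.conf (untested; interface,
VIP and check script are placeholders):

  vrrp_script chk_ganesha {
      script "/usr/bin/pgrep -x ganesha.nfsd"   # crude liveness check
      interval 2
      fall 3
      rise 2
  }

  vrrp_instance VI_NFS {
      state BACKUP
      interface eth0
      virtual_router_id 51
      priority 100
      virtual_ipaddress {
          192.0.2.10/24
      }
      track_script {
          chk_ganesha
      }
  }

The VIP would simply follow a live ganesha, with no proxy in the data
path.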
Hope someone has a few insights on this. I've already spent way too
much time on it to switch to some other solution now.
Best regards,
Andreas