Hi,
we recently set up a new Pacific cluster with cephadm and deployed
NFS on two hosts and ingress on two other hosts (ceph orch apply for
nfs and ingress, following the docs page).
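For reference, the service specs looked roughly like this (service
IDs, host names and IPs here are placeholders from memory, not the
exact ones we used):

  service_type: nfs
  service_id: mynfs
  placement:
    hosts:
      - nfs-host1
      - nfs-host2
  spec:
    port: 12049
  ---
  service_type: ingress
  service_id: nfs.mynfs
  placement:
    hosts:
      - ingress-host1
      - ingress-host2
  spec:
    backend_service: nfs.mynfs
    frontend_port: 2049
    monitor_port: 9049
    virtual_ip: 192.0.2.10/24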
So far so good. ESXi connects via NFS 4.1, but the way ingress works
confuses me.
It pins clients statically to one NFS daemon based on their source IP
address. (I know NFS won't like it if a client switches back and forth
all the time, because of reservations.)
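Looking at the haproxy.cfg that cephadm generated on the ingress
hosts, I believe the relevant part is source-IP balancing, something
like this (paraphrased from memory; server names and IPs are
placeholders):

  frontend frontend
      bind *:2049
      mode tcp
      default_backend backend

  backend backend
      mode tcp
      balance source
      hash-type consistent
      server nfs.mynfs.0 192.0.2.21:12049
      server nfs.mynfs.1 192.0.2.22:12049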
Three of our ESXi servers seem to connect to host1, the fourth one to
the other. This leads to a problem in ESXi: it doesn't recognize the
datastore as the same one that the other hosts mounted. I can't find
documentation on how exactly ESXi calculates that identity, but there
must be different information coming from these NFS daemons;
nfs-ganesha doesn't behave exactly the same on the two hosts.
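Is comparing the export definitions the right place to look? E.g.
(exact command names may differ between releases):

  ceph nfs cluster ls
  ceph nfs export ls mynfs --detailed

plus diffing the ganesha.conf inside the two NFS containers.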
Besides that, I wanted to do some failover tests before the cluster
goes live. I stopped one NFS server, but ingress (haproxy) doesn't
seem to care.
On the haproxy stats page both backends are listed with "no check", so
there is no failover for the NFS clients: haproxy never moves them to
the surviving host. The datastores stay disconnected and I'm unable to
mount new ones.
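For comparison, I'd have expected a health-checked backend, something
like this (hand-written sketch, not what cephadm generated):

  backend backend
      mode tcp
      balance source
      hash-type consistent
      server nfs.mynfs.0 192.0.2.21:12049 check inter 2s fall 3 rise 2
      server nfs.mynfs.1 192.0.2.22:12049 check inter 2s fall 3 rise 2

Is there a reason the ingress service omits the check option, or a
supported way to add it?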
How is ingress supposed to detect a failed NFS server, and how do I
make the ganesha daemons present identical information so that clients
treat them as the same server?
Bonus question: why can't keepalived just manage nfs-ganesha on the
two hosts directly, instead of going through haproxy? That would
eliminate an extra network hop. A sketch of what I have in mind
follows.
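Something like this hypothetical keepalived.conf (untested; interface,
VIP and check script are placeholders):

  vrrp_script chk_ganesha {
      script "/usr/bin/pgrep -x ganesha.nfsd"   # crude liveness check
      interval 2
      fall 3
      rise 2
  }

  vrrp_instance VI_NFS {
      state BACKUP
      interface eth0
      virtual_router_id 51
      priority 100
      virtual_ipaddress {
          192.0.2.10/24
      }
      track_script {
          chk_ganesha
      }
  }

The VIP would simply follow a live ganesha, with no proxy in the data
path.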
Hope someone has a few insights on this. I've already spent way too
much time on it to switch to some other solution now.
Best regards,
Andreas