Here are the yaml files I used to create the NFS and ingress services:
nfs-ingress.yaml:

service_type: ingress
service_id: nfs.xcpnfs
placement:
  count: 2
spec:
  backend_service: nfs.xcpnfs
  frontend_port: 2049
  monitor_port: 9000
  virtual_ip: 172.16.172.199/24

nfs.yaml:

service_type: nfs
service_id: xcpnfs
placement:
  hosts:
    - san1
    - san2
spec:
  port: 20490
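For completeness, this is roughly how I applied them (a sketch from memory, using the standard cephadm orchestrator commands; the file names are simply what I saved the specs as):

# Apply the NFS service first, then the ingress service in front of it
ceph orch apply -i nfs.yaml
ceph orch apply -i nfs-ingress.yaml

# Check that both services show up
ceph orch ls | grep -E 'nfs|ingress'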
Am I missing something here? Is there another mailing list where I
should be asking about this?
On 31/08/2023 10:38 am, Thorne Lawler wrote:
If there isn't any documentation for this yet, can anyone tell me:
* How do I inspect/change my NFS/haproxy/keepalived configuration? (Where I've been looking so far is sketched below.)
* What is it supposed to look like? Does someone have a working example?
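For what it's worth, this is where I have been looking. The orchestrator commands are straightforward; the on-disk paths are my best guess at how cephadm lays out the generated configs, so treat those as assumptions:

# Export the service specs the orchestrator currently holds
ceph orch ls ingress --export
ceph orch ls nfs --export

# See which hosts the haproxy, keepalived and nfs daemons landed on
ceph orch ps --daemon-type haproxy
ceph orch ps --daemon-type keepalived
ceph orch ps --daemon-type nfs

# On each host, the generated configs seem to live under the cluster fsid,
# something like (guessed from my cluster's layout):
#   /var/lib/ceph/<fsid>/haproxy.nfs.xcpnfs.<host>.<id>/haproxy/haproxy.cfg
#   /var/lib/ceph/<fsid>/keepalived.nfs.xcpnfs.<host>.<id>/keepalived.conf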
Thank you.
On 31/08/2023 9:36 am, Thorne Lawler wrote:
Sorry everyone,
Is there any more detailed documentation on the high availability NFS
functionality in current Ceph?
This is a pretty serious sticking point.
Thank you.
On 30/08/2023 9:33 am, Thorne Lawler wrote:
Fellow cephalopods,
I'm trying to get quick, seamless NFS failover happening on my
four-node Ceph cluster.
I followed the instructions here:
https://docs.ceph.com/en/latest/cephadm/services/nfs/#high-availability-nfs
but testing shows that failover doesn't happen. When I placed node 2
("san2") in maintenance mode, the NFS service shut down:
Aug 24 14:19:03 san2 ceph-e2f1b934-ed43-11ec-80fa-04421a1a1d66-nfs-xcpnfs-1-0-san2-datsvq[1962479]: 24/08/2023 04:19:03 : epoch 64b8af5a : san2 : ganesha.nfsd-8[Admin] do_shutdown :MAIN :EVENT :Removing all exports.
Aug 24 14:19:13 san2 bash[3235994]: time="2023-08-24T14:19:13+10:00" level=warning msg="StopSignal SIGTERM failed to stop container ceph-e2f1b934-ed43-11ec-80fa-04421a1a1d66-nfs-xcpnfs-1-0-san2-datsvq in 10 seconds, resorting to SIGKILL"
Aug 24 14:19:13 san2 bash[3235994]: ceph-e2f1b934-ed43-11ec-80fa-04421a1a1d66-nfs-xcpnfs-1-0-san2-datsvq
Aug 24 14:19:13 san2 systemd[1]: ceph-e2f1b934-ed43-11ec-80fa-04421a1a1d66@nfs.xcpnfs.1.0.san2.datsvq.service: Main process exited, code=exited, status=137/n/a
Aug 24 14:19:14 san2 systemd[1]: ceph-e2f1b934-ed43-11ec-80fa-04421a1a1d66@nfs.xcpnfs.1.0.san2.datsvq.service: Failed with result 'exit-code'.
Aug 24 14:19:14 san2 systemd[1]: Stopped Ceph nfs.xcpnfs.1.0.san2.datsvq for e2f1b934-ed43-11ec-80fa-04421a1a1d66.
And that's it. The ingress IP didn't move.
Odder still, the cluster seems to have placed the ingress IP on node 1
(san1) while still using the NFS service on node 2.
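This is how I checked which host has what (assuming I am reading the output correctly):

# Where the orchestrator thinks the NFS and ingress daemons are running
ceph orch ps --service-name nfs.xcpnfs
ceph orch ps --service-name ingress.nfs.xcpnfs

# Which host actually holds the virtual IP right now
ssh san1 ip -br addr | grep 172.16.172.199
ssh san2 ip -br addr | grep 172.16.172.199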
Do I need to connect the NFS service more tightly to the keepalived
and haproxy services, or do I need to expand the ingress service to
refer to multiple NFS services?
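To be clear about what I mean by connecting them more tightly: something like pinning the ingress daemons to the same hosts as the NFS daemons, e.g. the variant below. This is only my guess at the spec I should be using, not something I have tested:

service_type: ingress
service_id: nfs.xcpnfs
placement:
  hosts:
    - san1
    - san2
spec:
  backend_service: nfs.xcpnfs
  frontend_port: 2049
  monitor_port: 9000
  virtual_ip: 172.16.172.199/24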
Thank you.
--
Regards,
Thorne Lawler - Senior System Administrator
*DDNS* | ABN 76 088 607 265
First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
P +61 499 449 170