Can confirm I ran into this bug within the past six months, and we didn't find out until it caused an active outage. I definitely do not recommend keepalived-only mode for NFS exports. To be honest, we ended up going with regular nfs-kernel-server on top of a FUSE mount of MooseFS Community Edition for that install, as it was a WORM media server and CephFS just wasn't a great fit reliability-wise. It's been working wonders across the WAN now, though, keeping all our sites in sync.

Alex Buie
Senior Cloud Operations Engineer
450 Century Pkwy # 100, Allen, TX 75013
D: 469-884-0225 | www.cytracom.com

On Wed, Feb 5, 2025 at 9:10 PM Alexander Patrakov <patrakov@xxxxxxxxx> wrote:

> Hello Devin,
>
> The last time I reviewed the code for orchestrating NFS, I was left
> wondering how the keepalived-only mode can work at all. The reason is
> that I found nothing that guarantees the active NFS server and the
> floating IP address end up on the same node. This might have been an
> old bug, already fixed since then - but I have not retested.
>
> On Thu, Feb 6, 2025 at 4:53 AM Devin A. Bougie <devin.bougie@xxxxxxxxxxx> wrote:
> >
> > Hi, All.
> >
> > We are new to Ceph and looking for general best practices for
> > exporting a CephFS file system over NFS. I see several options in the
> > documentation and have tested several different configurations, but I
> > haven't yet seen much difference in our testing and am not sure which
> > configuration is generally recommended to start with.
> >
> > We have a single CephFS filesystem in our cluster of 10 hosts. Five of
> > our hosts are OSD nodes with the spinning disks that make up our
> > CephFS data pool, and they run only the OSD services (osd, crash,
> > ceph-exporter, node-exporter, and promtail). The other five hosts are
> > "admin" hosts that run everything else (mds, mgr, mon, etc.).
> >
> > Our current setup follows the "HIGH-AVAILABILITY NFS" documentation,
> > which gives us an ingress.nfs.cephfs service with the haproxy and
> > keepalived daemons and an nfs.cephfs service for the actual NFS
> > daemons. If there are no downsides to this approach, are there any
> > recommendations on placement for these two services? Given our
> > cluster, would it be best to run both on the admin nodes? Or would it
> > be better to have the ingress.nfs.cephfs service on the admin nodes
> > and the backend nfs.cephfs services on the OSD nodes?
> >
> > Alternatively, are there advantages to using the "keepalive only" mode
> > (only keepalived, no haproxy)? Or does anyone recommend doing
> > something completely different, like using Pacemaker and Corosync to
> > manage our NFS services?
> >
> > Any recommendations one way or another would be greatly appreciated.
> >
> > Many thanks,
> > Devin
>
> --
> Alexander Patrakov

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
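
For reference, a minimal sketch of the two cephadm service specs the thread is discussing, in the default HA layout (nfs backend plus haproxy/keepalived ingress). The hostnames, virtual IP, and ports below are illustrative assumptions rather than values from the thread, and the keepalive-only variant noted in the comments is the mode Alexander's caveat applies to.

service_type: nfs
service_id: cephfs
placement:
  hosts:
    - admin1   # hypothetical "admin" hosts from the cluster described above
    - admin2
    - admin3
spec:
  port: 12049                  # backend NFS daemons listen off the well-known port
---
service_type: ingress
service_id: nfs.cephfs
placement:
  hosts:
    - admin1
    - admin2
    - admin3
spec:
  backend_service: nfs.cephfs
  frontend_port: 2049          # clients mount the virtual IP on 2049
  monitor_port: 9000           # haproxy status page
  virtual_ip: 192.0.2.100/24   # floating IP managed by keepalived (placeholder)
  # For the "keepalive only" mode discussed above, the documented shape is:
  # drop frontend_port, set "keepalive_only: true" here, and add
  # "virtual_ip: 192.0.2.100" plus "count: 1" to the nfs spec so the single
  # NFS daemon binds the floating address directly (no haproxy in the path).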