For NFS (e.g., as implemented by NFS-Ganesha), the situation is also quite stupid. Without high availability (HA), it works (that is, until you update the NFS-Ganesha version), but corporate architects won't let you deploy any system without HA, because, in their view, non-HA systems are not production-ready by definition. (And BTW, the current NVMe-oF gateway also has no multipath and thus no viable HA.) If you attempt to set up HA for NFS, you will hit at least the following showstoppers.

For NFS v4.1:

* VMware refuses to work, until manual admin intervention, if it sees any change in the "owner" and "scope" fields of the EXCHANGE_ID message between the previous and the current NFS connection.
* NFS-Ganesha sets both fields from the hostname by default, and the patch that makes these fields configurable is "quite recent" (in version 4.3). This matters because otherwise every NFS server fail-over would trip up VMware, thus defeating the point of a high-availability setup. (See the first sketch below, after the conclusion.)
* There is a regression in NFS-Ganesha that manifests as a deadlock (easily triggerable even without Ceph by running xfstests), which is critical because systemd cannot detect and restart deadlocked services. Unfortunately, the last NFS-Ganesha version before the regression (4.0.8) does not contain the patch that allows manipulating the "owner" and "scope" fields.
* Cephadm-based deployments do not set these configuration options anyway.
* If you would like to use the "rados_cluster" NFSv4 recovery backend (used for grace periods), you need to be extra careful with the various "server names", because they are also used to decide whether to end the grace period. If the recovery backend has seen two server names (corresponding to two NFS-Ganesha instances, for scale-out), then both must be up for the grace period to end. If there is only one server name, you are allowed to run only one instance. If you want high availability together with scale-out, you need to be able to schedule two NFS-Ganesha instances (with names like "a" and "b", not corresponding to the names of the hosts where they run) on two out of three available servers. Orchestrators do not do this; you need to implement it on your own. (See the second sketch below.)

For NFS v3:

* NFS-Ganesha opens files and acquires MDS locks just in case, to make sure that another client cannot modify them while the original client might have cached something.
* If NFS-Ganesha crashes or a server reboots, then the other NFS-Ganesha instance, brought up to replace the original one, will also stumble upon these locks, as the MDS recognizes it as a different client. Result: it waits until the locks time out, which is too long (minutes!), as the guest OS in VMware would time out its storage by then.
* To avoid the problem mentioned above and to get seamless fail-over, the replacement instance of NFS-Ganesha must present itself to the MDS as the same client (i.e., with the same fake hostname), but no known orchestrator facilitates this. (See the third sketch below.)

Conclusion: please use iSCSI or sacrifice HA, as there are no working alternatives yet.
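First sketch: roughly what the EXCHANGE_ID-related part of ganesha.conf looks like once you run a version that carries the patch. The option names and values below are my assumption from memory, not copied from a working setup, so verify them against the ganesha-config documentation for your version:

    NFSv4 {
        # Keep these two values identical on every NFS-Ganesha instance
        # that may serve the same clients, so that VMware sees the same
        # "owner" and "scope" in EXCHANGE_ID across a fail-over.
        # (Option names are an assumption; check your version's docs.)
        Server_Owner = "ceph-nfs";
        Server_Scope = "ceph-nfs";
    }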
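Second sketch: approximately how the "rados_cluster" recovery backend is wired up. The important part is that nodeid is a stable logical instance name ("a" on one instance, "b" on the other), not the hostname of whichever server the instance currently runs on. The pool, namespace, and user names are placeholders:

    NFSv4 {
        RecoveryBackend = rados_cluster;
    }

    RADOS_KV {
        # Where the shared grace/recovery database lives (placeholders).
        ceph_conf = "/etc/ceph/ceph.conf";
        userid = "nfs-ganesha";
        pool = ".nfs";
        namespace = "vmware";
        # Stable logical name of THIS instance, independent of the host.
        nodeid = "a";
    }

The set of nodeids that the grace database has seen can be inspected and cleaned up with the ganesha-rados-grace tool (pointed at the same pool and namespace); that set is what decides whether the grace period may end.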
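Third sketch, for the NFS v3 case: if NFS-Ganesha runs in a container, giving the replacement instance "the same fake hostname" comes down to starting it with a fixed hostname that follows the logical instance, not the physical host. A hypothetical podman invocation (image name, paths, and instance name are placeholders; whether a fixed hostname alone is sufficient depends on how the FSAL derives its client identity, so treat this only as the orchestration-level half of the problem):

    # Instance "a" always starts with the same fake hostname, no matter
    # which of the three servers it is scheduled on.
    # (A real NFSv3 deployment needs more ports than shown here.)
    podman run -d --name nfs-ganesha-a --hostname nfs-ganesha-a \
        -p 2049:2049 \
        -v /etc/ganesha-a:/etc/ganesha:Z \
        your-registry/nfs-ganesha:latest

Doing this by hand is trivial, but, as noted above, no known orchestrator keeps such a fixed fake hostname attached to the logical instance across fail-overs.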
On Fri, Jun 28, 2024 at 1:31 AM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
>
> There are folks actively working on this gateway and there's a Slack channel. I haven't used it myself yet.
>
> My understanding is that ESXi supports NFS. Some people have had good success mounting KRBD volumes on a gateway system or VM and re-exporting via NFS.
>
> > On Jun 27, 2024, at 09:01, Drew Weaver <drew.weaver@xxxxxxxxxx> wrote:
> >
> > Howdy,
> >
> > I recently saw that Ceph has a gateway which allows VMWare ESXi to connect to RBD.
> >
> > We had another gateway like this awhile back the ISCSI gateway.
> >
> > The ISCSI gateway ended up being... let's say problematic.
> >
> > Is there any reason to believe that NVMeOF will also end up on the floor and has anyone that uses VMWare extensively evaluated it's viability?
> >
> > Just curious!
> >
> > Thanks,
> > -Drew

--
Alexander Patrakov
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx