Re: who is using nfs-ganesha and cephfs?

Hi all,

Sage Weil <sweil@xxxxxxxxxx> wrote:
> Who is running nfs-ganesha's FSAL to export CephFS?  What has your
> experience been?
>
> (We are working on building proper testing and support for this into
> Mimic, but the ganesha FSAL has been around for years.)

After we had moved most of our file-based data to a CephFS environment, and while suffering from what later turned out to be a (mis-)configuration issue with our existing nfsd server, I decided to give Ganesha a try.

We run a Ceph cluster on three servers with openSUSE Leap 42.3 and Ceph Luminous (latest stable). 2x10G interfaces for intra-cluster communication, 2x1G towards the NFS clients. CephFS metadata is on an SSD pool, the actual data is on SAS HDDs, 12 OSDs. Ganesha version is 2.5.2.0+git.1504275777.a9d23b98f-3.6. All Ganesha/nfsd server services run on one of the servers, which is also a Ceph node.

We run an automated, distributed build & stage environment (tons of gcc compiles on multiple clients, some Java compiles, RPM builds etc.), with (among others) nightly test build runs. These usually take about 8 hours when using kernel nfsd and local storage on the same servers that also provide the Ceph service.

After switching to Ganesha (with CephFS FSAL, Ganesha running on the same server where we originally had run nfsd) and starting test runs of the same workload, we aborted the runs after about 12 hours. By then, only an estimated 60 percent of the job was done.

For comparison, when now using kernel nfsd to serve the CephFS shares (mounting the single CephFS via the kernel FS module on the server that's running nfsd, and exporting multiple sub-directories via nfsd), we see an increase of between zero and eight percent over the original run time.
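
To put the figures above side by side, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers quoted in this mail (about 8 hours baseline, about 12 hours for an estimated 60 percent, zero to eight percent overhead). The linear extrapolation of the aborted Ganesha run is an assumption, not a measurement, and the function and variable names are made up for illustration.

```python
# Rough comparison of the run times quoted in this mail.
BASELINE_HOURS = 8.0           # kernel nfsd + local storage ("about 8 hours")
GANESHA_ELAPSED_HOURS = 12.0   # Ganesha run, aborted after about 12 hours
GANESHA_FRACTION_DONE = 0.60   # estimated share of the job finished by then
NFSD_CEPHFS_OVERHEAD = (0.00, 0.08)  # kernel nfsd + kernel CephFS: 0 to 8 percent


def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate the total run time.

    Assumption: progress is linear over time, which a real build
    workload does not guarantee.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


ganesha_total = projected_total_hours(GANESHA_ELAPSED_HOURS, GANESHA_FRACTION_DONE)
ganesha_slowdown = ganesha_total / BASELINE_HOURS
nfsd_cephfs_hours = tuple(BASELINE_HOURS * (1.0 + o) for o in NFSD_CEPHFS_OVERHEAD)

print(f"Ganesha + CephFS FSAL, projected: {ganesha_total:.1f} h "
      f"({ganesha_slowdown:.1f}x baseline)")
print(f"kernel nfsd + kernel CephFS: {nfsd_cephfs_hours[0]:.2f} h "
      f"to {nfsd_cephfs_hours[1]:.2f} h")
```

Under that linear assumption the Ganesha run would have needed roughly 20 hours, about two and a half times the baseline, while the nfsd-over-CephFS setup stays within well under an hour of it.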

So to us, comparing "Ganesha+CephFS FSAL" to "kernel nfsd with kernel CephFS module", the latter wins. Or to put it the other way around, Ganesha seems unusable to us in its current state, judging by the slowness observed.

Other issues I noticed:

- the directory size, as shown by "ls -l" on the client, was very different from that shown when mounting via nfsd ;)

- "showmount" did not return any entries, which would have caused problems with our dynamic automounter maps later on, had we continued to use Ganesha

Please note that I did not have time to do intensive testing against different Ganesha parameters. The only runs I made were with and without "MaxRead = 1048576; MaxWrite = 1048576;" per share, following some comment about buffer sizes. These changes didn't seem to make much difference, though.

We closely monitor our network and server performance. I could clearly see a huge drop in network traffic (NFS server to clients) when switching from nfsd to Ganesha, and a corresponding increase when switching back to nfsd (sharing the CephFS mount). None of the servers seemed to be under excessive load during these tests, but it was obvious that Ganesha took its share of CPU. Maybe the bottleneck was some single-threaded operations, so Ganesha could not make use of the other, idling cores. But I'm just guessing here.

Regards,
J

