Re: NFS over Ceph

On Mon, Apr 23, 2012 at 9:01 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 23 Apr 2012, Calvin Morrow wrote:
>> I've been testing a couple of different use scenarios with Ceph 0.45
>> (two-node cluster, single mon, active/standby mds).  I have a pair of
>> KVM virtual machines acting as ceph clients, re-exporting RBD block
>> devices over iSCSI and a Ceph mount (mount -t ceph) over NFS.
>>
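(A kernel-client setup like the one described above boils down to a
single mount; a rough sketch, where the monitor address and secret
file are placeholders:)

    # Mount CephFS with the in-kernel client (placeholder mon address).
    mount -t ceph 192.168.0.10:6789:/ /mnt/ceph \
        -o name=admin,secretfile=/etc/ceph/admin.secret
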
>> The iSCSI re-export is going very well.  So far I haven't had any
>> issues to speak of (even while testing Pacemaker-based failover).
>>
>> The NFS re-export isn't going nearly as well.  I'm running into
>> several issues with reliability, speed, etc.  To start with, file
>> operations seem painstakingly long.  Copying over multiple 20 KB files
>> takes > 10 seconds per file.  A "dd if=/dev/zero of=..." goes very
>> fast once the data transfer starts, but actually opening the file
>> can take nearly as long (or longer, depending on size).
>
> Can you try with the 'async' option in your exports file?  I think the
> main problem with the slowness is what nfsd is doing with syncs, but I
> want to confirm that.
>
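
(For anyone following along, that's a one-word toggle in /etc/exports.
A minimal sketch, assuming a /mnt/ceph mountpoint and a 192.168.0.0/24
client network:)

    # /etc/exports: 'sync' makes nfsd commit every operation before
    # replying; 'async' lets it reply before data reaches stable storage.
    /mnt/ceph  192.168.0.0/24(rw,async,fsid=1,no_subtree_check)

    # Re-read /etc/exports without restarting nfsd.
    exportfs -ra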

async didn't make a difference.  I thought this was pretty strange, so I
decided to try mounting a separate directory with the ceph-fuse client
instead of the native kernel client.  The result was a night-and-day
difference.  I pushed a good 79 GB (my home directory) through the
NFS server (sync) attached to the fuse client at an average speed of
~68 MB/sec over consumer gigabit.

Just for completeness, I re-exported the native kernel client (after
verifying it could browse OK, read/write files, etc.) and I was back
to __very__ slow metadata ops (just a simple `ls` takes > 1 min).

Calvin

> Generally speaking, there is an unfortunate disconnect between the NFS and
> Ceph metadata protocols.  Ceph tries to do lots of operations
> asynchronously, syncing periodically and on-demand (e.g., when you
> fsync() a directory).  NFS, OTOH, says you should sync every operation,
> which is usually pretty horrible for performance unless you have NVRAM
> or an SSD or something.
>
> We haven't invested much time/thought into what the best behavior should
> be here... NFS is pretty far down our list at the moment.
>
> sage
>
>>
>> I've also run into cases where the directory mounted as ceph
>> (/mnt/ceph) "hangs" on the NFS server, requiring a reboot of the
>> NFS server.
>>
>> That said, are there any special recommendations regarding exporting
>> Ceph through NFS?  I know that the wiki, and also the kernel source
>> (still present as of 3.3.3), indicate:
>>
>> * NFS export support
>> *
>> * NFS re-export of a ceph mount is, at present, only semireliable.
>> * The basic issue is that the Ceph architecture doesn't lend itself
>> * well to generating filehandles that will remain valid forever.
>>
>> Should I be trying this a different way?  NFS export of a filesystem
>> (ext4 / xfs) on RBD?  Other options?  Also, does the filehandle
>> limitation specified above apply to more than NFS (such as a KVM image
>> using a file on a ceph mount for storage backing)?
>>
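(For comparison, the RBD-backed alternative would look roughly like
this; image name, size, and paths are placeholders:)

    # Create and map an RBD image, put a local fs on it, export that.
    rbd create nfsvol --size 102400      # size in MB (~100 GB)
    rbd map nfsvol                       # shows up as /dev/rbd0
    mkfs.xfs /dev/rbd0
    mount /dev/rbd0 /mnt/nfsvol
    exportfs -o rw,no_subtree_check 192.168.0.0/24:/mnt/nfsvol
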
>> Any insight would be appreciated.
>>
>> Calvin