On Fri, Oct 2, 2015 at 2:42 AM, Goncalo Borges <goncalo@xxxxxxxxxxxxxxxxxxx> wrote:
> Dear CephFS Gurus...
>
> I have a question regarding ceph-fuse and its memory usage.
>
> 1./ My Ceph and CephFS setups are the following:
>
> Ceph:
> a. ceph 9.0.3
> b. 32 OSDs distributed in 4 servers (8 OSDs per server)
> c. 'osd pool default size = 3' and 'osd pool default min size = 2'
> d. All servers running CentOS 6.7
>
> CephFS:
> e. a single MDS
> f. dedicated pools for data and metadata
> g. clients in different locations / sites mounting CephFS via ceph-fuse
> h. All servers and clients running CentOS 6.7
>
> 2./ I have been running fio tests in two CephFS clients:
> - Client A is in the same data center as all OSDs, connected at 1 GbE.
> - Client B is in a different data center (in another city), also connected
>   at 1 GbE. However, I've seen that the connection is problematic, and
>   sometimes the network performance is well below the theoretical 1 Gbps limit.
> - Client A has 24 GB RAM + 98 GB of swap; client B has 48 GB RAM + 98 GB of swap.
>
> 3./ I have been running some fio write tests (with 128 threads) in both
> clients, and surprisingly, the results show that the aggregated throughput
> is better for client B than client A.
>
> CLIENT A results:
> # grep agg fio128threadsALL/fio128write_ioenginelibaio_iodepth64_direct1_bs512K_20151001015558.out
>   WRITE: io=1024.0GB, aggrb=114878KB/s, minb=897KB/s, maxb=1785KB/s,
>          mint=4697347msec, maxt=9346754msec
>
> CLIENT B results:
> # grep agg fio128threadsALL/fio128write_ioenginelibaio_iodepth64_direct1_bs512K_20151001015555.out
>   WRITE: io=1024.0GB, aggrb=483254KB/s, minb=3775KB/s, maxb=3782KB/s,
>          mint=2217808msec, maxt=2221896msec
>
> 4./ If I monitor the memory usage of ceph-fuse during the I/O tests, I see that:
>
> CLIENT A: ceph-fuse does not seem to go beyond 7 GB of VMEM and 1 GB of RMEM.
> CLIENT B: ceph-fuse uses 11 GB of VMEM and 7 GB of RMEM.
>
> 5./ These numbers make me think that caching is playing a critical role in
> these results.
>
> My questions are the following:
>
> a./ Why does CLIENT B use more memory than CLIENT A? My guess is that there
> is a network bottleneck between CLIENT B and the Ceph cluster, and more
> memory is used because of that.

This is weird, and I don't have an explanation. I would be surprised if
network latency was influencing timing enough to create such a dramatic
difference in caching behaviour.

Are both clients running the same version of ceph-fuse and the same version
of the kernel?

> b./ Is the fio write performance better in CLIENT B a consequence of the
> fact that it is using more memory than client A?

Seems a reasonable inference, but it's still all very weird!

> c./ Are there parameters we can set for the CephFS clients to limit the
> amount of memory they can use?

You can limit the caching inside ceph-fuse by setting client_cache_size
(metadata cache entries) and client_oc_size (max data cache). However,
there'll also be some caching inside the kernel (which you can probably
control somehow, but I don't know off the top of my head).
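For example, something like this in the [client] section of ceph.conf on the
clients would cap both caches. The values below are purely illustrative, not
recommendations (client_cache_size is an entry count, client_oc_size is in
bytes), so tune them to your memory budget and check the defaults for your
release:

    [client]
        # illustrative limits only:
        # client_cache_size = max cached metadata entries (inodes/dentries)
        # client_oc_size    = max bytes of file data held in the object cacher
        client_cache_size = 8192
        client_oc_size = 104857600

ceph-fuse only reads its config when it starts, so you'd need to remount the
clients for new limits to take effect.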
Cheers,
John

> Cheers
> Goncalo
>
> --
> Goncalo Borges
> Research Computing
> ARC Centre of Excellence for Particle Physics at the Terascale
> School of Physics A28 | University of Sydney, NSW 2006
> T: +61 2 93511937

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com