Hi Christian,

On 18/12/2015 04:16, Christian Balzer wrote:

>> It seems to me very bad.
> Indeed.
> Firstly let me state that I don't use CephFS and have no clues how this
> influences things and can/should be tuned.

Ok, no problem. Anyway, thanks for your answer. ;)

> That being said, the fio above running in VM (RBD) gives me 440 IOPS
> against a single OSD storage server (replica 1) with 4 crappy HDDs and
> on-disk journals on my test cluster (1Gb/s links).
> So yeah, given your configuration that's bad.

I have tried a quick test with a rados block device (4GB, with an EXT4
filesystem) mounted on the same client node where I'm testing CephFS, and
the same "fio" command gives me ~1400 read/write IOPS. So my problem could
be CephFS-specific, no?
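For anyone who wants to reproduce the numbers: my exact fio command line is
earlier in the thread; the job has roughly this shape (a sketch only, the
file name, size and runtime below are placeholders, not my real values):

    # 4k random read/write against the FUSE-mounted CephFS
    fio --name=bench --filename=/mnt/cephfs/fio.test \
        --rw=randrw --bs=4k --size=1G \
        --ioengine=libaio --direct=1 --iodepth=32 \
        --runtime=60 --time_based --group_reporting

For the RBD comparison, the same job simply points at a file on the mounted
EXT4 filesystem instead of /mnt/cephfs.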
That being said, I don't know if it can be a symptom, but during the bench
the IOPS are displayed in real time and the value does not seem very
constant to me. I sometimes see peaks at 1800 IOPS, then suddenly the value
drops to 800 IOPS and climbs back up to ~1400, etc.

> In comparison I get 3000 IOPS against a production cluster (so not idle)
> with 4 storage nodes. Each with 4 100GB DC S3700 for journals and OS and 8
> SATA HDDs, Infiniband (IPoIB) connectivity for everything.
>
> All of this is with .80.x (Firefly) on Debian Jessie.

Ok, interesting. My cluster is idle, but I have approximately half as many
disks as your cluster, and my SATA disks are directly connected to the
motherboard. So it seems logical to me that I get ~1400 and you get ~3000,
no?

> You want to use atop on all your nodes and look for everything from disks
> to network utilization.
> There might be nothing obvious going on, but it needs to be ruled out.

It's a detail, but I have noticed that atop (on Ubuntu Trusty) doesn't
display the bandwidth percentage of my 10GbE interface. Anyway, I have
tried to inspect the cluster nodes during the CephFS bench, but I have seen
no bottleneck concerning CPU, network or disks.

>> I use Ubuntu 14.04 on each server with the 3.13 kernel (it's the same
>> for the client ceph where I run my bench) and I use Ceph 9.2.0
>> (Infernalis).
>
> I seem to recall that this particular kernel has issues, you might want to
> scour the archives here.

But in my case I use ceph-fuse on the client node, so the client kernel
version is not relevant, I think. And I thought that the kernel version was
not very important on the cluster nodes' side. Am I wrong?

>> On the client, cephfs is mounted via cephfs-fuse with this
>> in /etc/fstab:
>>
>> id=cephfs,keyring=/etc/ceph/ceph.client.cephfs.keyring,client_mountpoint=/ /mnt/cephfs
>> fuse.ceph noatime,defaults,_netdev 0 0
>>
>> I have 5 cluster node servers "Supermicro Motherboard X10SLM+-LN4 S1150"
>> with one 1GbE port for the ceph public network and one 10GbE port for
>> the ceph private network:
>>
> For the sake of latency (which becomes the biggest issues when you're not
> exhausting CPU/DISK), you'd be better off with everything on 10GbE, unless
> you need the 1GbE to connect to clients that have no 10Gb/s ports.

Yes, exactly. My client is 1Gb/s only.

>> - 1 x Intel Xeon E3-1265Lv3
>> - 1 SSD DC3710 Series 200GB (with partitions for the OS, the 3
>> OSD-journals and, just for ceph01, ceph02 and ceph03, the SSD contains
>> too a partition for the workdir of a monitor)
> The 200GB DC S3700 would have been faster, but that's a moot point and not
> your bottleneck for sure.
>
>> - 3 HD 4TB Western Digital (WD) SATA 7200rpm
>> - RAM 32GB
>> - NO RAID controlleur
>
> Which controller are you using?

No controller, the 3 SATA disks of each node are directly connected to the
SATA ports of the motherboard.

> I recently came across an Adaptec SATA3 HBA that delivered only 176 MB/s
> writes with 200GB DC S3700s as opposed to 280MB/s when used with Intel
> onboard SATA-3 ports or a LSI 9211-4i HBA.

Thanks for your help, Christian.

-- 
François Lafont
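PS: For the archives, since atop on Trusty doesn't show the 10GbE
utilization, one way to watch the interface during the bench is to read the
kernel's counters directly (a minimal sketch; "eth1" is a placeholder for
the real 10GbE interface name, and sar comes from the sysstat package):

    # Per-interface throughput, 1-second samples (see rxkB/s, txkB/s):
    sar -n DEV 1

    # Or sample the raw byte counters; the delta between two reads taken
    # one second apart is the throughput in bytes/s:
    cat /sys/class/net/eth1/statistics/rx_bytes
    cat /sys/class/net/eth1/statistics/tx_bytes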