Better late than never, some XFS versus EXT4 test results

Hello,

Re-cap of my new test and staging cluster:
4 nodes running the latest Hammer under Debian Jessie (with sysvinit, kernel
4.6) and manually created OSDs.
InfiniBand (IPoIB) QDR (40Gb/s, about 30Gb/s effective) between all nodes.

2 HDD OSD nodes with 32GB RAM, a fast enough CPU (E5-2620 v3), 2x 200GB DC
S3610 for OS and journals (2 per SSD), and 4x 1TB 2.5" SATAs for OSDs.
For my amusement and edification the OSDs of one node are formatted with
XFS, those of the other with EXT4 (as in all my production clusters).

The 2 SSD OSD nodes have 1x 200GB DC S3610 (OS and 4 journal partitions)
and 2x 400GB DC S3610s (2x 180GB OSD partitions each, so 8 SSD OSDs in
total), same specs as the HDD nodes otherwise.
Again one node is on XFS, the other on EXT4.

Today I added the above 2 SSD nodes and created a pool (future cache tier)
on them. 

First I did some 4M block (default) rados bench runs, with the following
layout:

200GB SSD, 2 journals: 80+% utilization (200MB/s)
400GB SSD, 2 OSDs with external journals (on the SSD above): 50% (200MB/s)
400GB SSD, 2 OSDs with co-located journals in the FS: 100% (400MB/s)

Thus unsurprisingly 400MB/s max speed for the cluster in this config.

With each 400GB SSD having one external and one internal journal, I got
100% usage on the 200GB journal SSD (230MB/s) and about 90% on the OSD
SSDs, resulting in 430MB/s throughput.
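
For anyone wanting to sanity-check those ceilings, here's a
back-of-the-envelope sketch in Python. It assumes a replicated pool of
size 2 with one copy per SSD node and the usual filestore journal
double-write; the device rates are the observed ones from above, not
vendor specs.

  # Sketch only: assumes pool size=2 (one copy per SSD node) and
  # filestore double-write (journal + data). Rates are the observed ones.

  # Config 1: per node, one 400GB SSD carries 2 OSDs with co-located
  # journals (100% busy at ~400MB/s device level), the other 2 OSDs
  # journal to the 200GB SSD (~200MB/s of journal writes).
  colocated_data = 400 / 2        # double-write halves the real data rate
  external_journal_data = 200     # journal writes are 1:1 with data
  per_node_ingest = colocated_data + external_journal_data

  # With size=2 over 2 nodes each node absorbs a full copy, so the
  # client-visible ceiling equals the per-node ingest:
  print(per_node_ingest)          # -> 400.0 MB/s, matching the bench

  # Config 2: the 200GB journal SSD is now saturated at ~230MB/s while
  # the co-located OSDs still deliver roughly 200MB/s of data:
  print(230 + 200)                # -> 430 MB/s, matching the second run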

In a production environment, however, I'd again go with in-line journals,
as matching up both speed and endurance for dedicated SSD journals
basically means big NVMes, with the corresponding price tag.
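
To put a rough number on that, a quick sketch (assumed figures, not
measurements: a journal device absorbing the full write stream of the
OSDs behind it at the bench rate above, and a ~3 DWPD class drive like
the S3610):

  # Rough endurance math for a dedicated journal SSD. Assumed figures:
  # sustained ingest at the bench rate, a 400GB drive, ~3 DWPD rating.
  sustained_mb_s = 400
  writes_per_day_tb = sustained_mb_s * 86400 / 1e6    # ~34.6 TB/day
  drive_tb = 0.4
  dwpd_needed = writes_per_day_tb / drive_tb
  print(round(writes_per_day_tb, 1), round(dwpd_needed))  # 34.6 86
  # A 3 DWPD drive of that size is good for ~1.2 TB/day, so keeping up
  # with fast OSD SSDs in both speed and endurance quickly points at
  # large NVMe journal devices.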

XFS and EXT4 didn't show any significant differences here, both hovering
around 400% IOwait.


The most interesting result was the 4K rados bench on the SSD pool:

The XFS node showed only about a third of the IOwait, at 40%, but also a
significantly lower average throughput per SSD of 120MB/s.
The EXT4 node registered around 120% IOwait, but wrote 160MB/s per SSD on
average.
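
Expressed as 4K write operations, just to make those rates easier to
compare (note this is what the SSDs see, i.e. including journal and
filestore overhead, not client-visible IOPS):

  # Device-level MB/s expressed as 4K writes; this includes journal and
  # filestore overhead, so it is not the client-visible IOPS figure.
  def to_4k_writes(mb_per_s):
      return mb_per_s * 1e6 / 4096

  print(round(to_4k_writes(120)))   # XFS node:  ~29k writes/s per SSD
  print(round(to_4k_writes(160)))   # EXT4 node: ~39k writes/s per SSD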

I'm not sure how to interpret these numbers, though there is another
interesting tidbit below in the HDD tests.
Am I seeing more EXT4 overhead, or is it actually writing more?
FIO runs on the actual FS give a slight (2%) advantage to EXT4, but
nothing like what I'm seeing here.

Idle CPU during those 4K runs drops to 100% (out of 1200) at times, which
matches the 4 OSD processes per node running at about 250% each.
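
For the record, the quick accounting behind that statement (the E5-2620 v3
being 6 cores / 12 threads, hence the 1200%):

  # CPU accounting during the 4K runs; 6c/12t E5-2620 v3 = 1200% total.
  total = 1200
  osd_processes = 4 * 250     # 4 OSDs at ~250% each
  idle_seen = 100             # observed idle troughs
  print(total - osd_processes - idle_seen)  # ~100% left for kernel/IPoIB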


Lastly the HDD OSDs, 4MB block bench (nothing outstanding with the 4K one):
similar throughput per drive; however, according to atop the avio (average
time per I/O request) per HDD is 12ms with XFS and 8ms with EXT4.

Some food for thought, though of minor importance with BlueStore in the
pipeline.

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/