On 26-10-15 14:29, Matteo Dacrema wrote:
> Hi Nick,
>
> I also tried to increase iodepth, but nothing has changed.
>
> With iostat I noticed that the disk is fully utilized and the writes
> per second from iostat match fio's output.

Ceph isn't fully optimized to get the maximum potential out of NVMe SSDs
yet. For example, NVMe SSDs work best with very high queue depths and
many parallel I/Os. Also, be aware that Ceph adds multiple layers to the
whole I/O subsystem, so there will be a performance impact when Ceph sits
in between.

Wido

> Matteo
>
> *From:* Nick Fisk [mailto:nick@xxxxxxxxxx]
> *Sent:* Monday, 26 October 2015 13:06
> *To:* Matteo Dacrema <mdacrema@xxxxxxxx>; ceph-users@xxxxxxxx
> *Subject:* RE: BAD nvme SSD performance
>
> Hi Matteo,
>
> Ceph introduces latency into the write path, so what you are seeing is
> typical. If you increase the iodepth of the fio test you should get
> higher results, until you start maxing out your CPU.
>
> Nick
>
> *From:* ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] *On Behalf
> Of* Matteo Dacrema
> *Sent:* 26 October 2015 11:20
> *To:* ceph-users@xxxxxxxx
> *Subject:* BAD nvme SSD performance
>
> Hi all,
>
> I recently bought two Samsung SM951 256GB NVMe PCIe SSDs and built a
> two-OSD Ceph cluster with min_size = 1.
>
> I tested them with fio and obtained two very different results in the
> following two situations.
>
> This is the command: *fio --ioengine=libaio --direct=1 --name=test
> --filename=test --bs=4k --size=100M --readwrite=randwrite
> --numjobs=200 --group_reporting*
>
> On the OSD host I obtained this result:
>
> *bw=575493KB/s, iops=143873*
>
> On the client host, running the same fio test against a mounted volume,
> I obtained this result:
>
> *bw=9288.1KB/s, iops=2322*
>
> I obtained these results with the journal and data on the same disk,
> and also with the journal on a separate SSD.
>
> I have two OSD hosts, each with 64GB of RAM and 2x Intel Xeon E5-2620 @
> 2.00GHz, and one MON host with 128GB of RAM and 2x Intel Xeon E5-2620 @
> 2.00GHz.
>
> I'm using 10G Mellanox NICs and a switch with jumbo frames.
>
> I also ran other tests with this configuration (see the attached Excel
> workbook).
>
> Hardware configuration for each of the two OSD nodes:
>
> 3x 100GB Intel SSD DC S3700, with three 30GB partitions on every SSD
>
> 9x 1TB Seagate HDD
>
> Results: about *12k* IOPS with 4k bs and the same fio test.
>
> I can't understand where the problem is with the NVMe SSDs.
> Can anyone help me?
>
> Here is the *ceph.conf*:
>
> [global]
> fsid = 3392a053-7b48-49d3-8fc9-50f245513cc7
> mon_initial_members = mon1
> mon_host = 192.168.1.3
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd_pool_default_size = 2
> mon_client_hung_interval = 1.0
> mon_client_ping_interval = 5.0
> public_network = 192.168.1.0/24
> cluster_network = 192.168.1.0/24
> mon_osd_full_ratio = .90
> mon_osd_nearfull_ratio = .85
>
> [mon]
> mon_warn_on_legacy_crush_tunables = false
>
> [mon.1]
> host = mon1
> mon_addr = 192.168.1.3:6789
>
> [osd]
> osd_journal_size = 30000
> journal_dio = true
> journal_aio = true
> osd_op_threads = 24
> osd_op_thread_timeout = 60
> osd_disk_threads = 8
> osd_recovery_threads = 2
> osd_recovery_max_active = 1
> osd_max_backfills = 2
> osd_mkfs_type = xfs
> osd_mkfs_options_xfs = "-f -i size=2048"
> osd_mount_options_xfs = "rw,noatime,inode64,logbsize=256k,delaylog"
> filestore_xattr_use_omap = false
> filestore_max_inline_xattr_size = 512
> filestore_max_sync_interval = 10
> filestore_merge_threshold = 40
> filestore_split_multiple = 8
> filestore_flusher = false
> filestore_queue_max_ops = 2000
> filestore_queue_max_bytes = 536870912
> filestore_queue_committing_max_ops = 500
> filestore_queue_committing_max_bytes = 268435456
> filestore_op_threads = 2
>
> Best regards,
>
> Matteo
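For anyone reproducing the advice in the thread: with the libaio engine,
per-job queue depth is what keeps an NVMe device busy, and fio's default
iodepth is 1, so numjobs=200 submits only one outstanding I/O per job. A
higher-queue-depth variant of the test above might look like this (a
sketch only; the iodepth, numjobs, and size values are illustrative, not
taken from the thread):

    fio --ioengine=libaio --direct=1 --name=qd-test --filename=test \
        --bs=4k --size=1G --readwrite=randwrite \
        --iodepth=32 --numjobs=4 --group_reporting

Four jobs at iodepth=32 keep 128 I/Os in flight, which is closer to how
NVMe devices are designed to be driven than 200 jobs at iodepth=1.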
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
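When OSD-local and client-side numbers diverge this much, it can also
help to benchmark the RADOS layer directly from the client, bypassing
the mounted volume. A minimal sketch with rados bench (the pool name
"rbd", the 60-second runtime, and the concurrency are illustrative):

    rados bench -p rbd 60 write -b 4096 -t 32 --no-cleanup
    rados bench -p rbd 60 rand -t 32

If rados bench from the client reaches IOPS close to the OSD-local
figures, the bottleneck is likely in the volume/RBD client path; if it
is equally slow, the limit sits in the network or the OSD daemons rather
than in the NVMe devices themselves.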