Also, if you really want to get more iops from qemu, you can use multiple disks with iothread enabled.
(I'm able to get 50-60k iops 4k rand read per disk, up to 450k iops with 9 disks.)

In the future, it will be possible in qemu to use multiple iothreads for 1 disk.

----- Original Message -----
From: "aderumier" <aderumier@xxxxxxxxx>
To: "Bill WONG" <wongahshuen@xxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Monday, 7 November 2016 07:46:16
Subject: Re: RBD Block performance vs rbd mount as filesystem

>>Is there any document on how I can compile ceph with jemalloc as well? It looks like ceph with jemalloc performs much better too.

Simply build ceph with --with-jemalloc.
(I'm seeing improvements at really high iops: at something like 300k iops tcmalloc is the limiting factor, and with jemalloc I'm around 450k iops.)

Here is my debian packaging change:

diff --git a/debian/control b/debian/control
index 3e03689..ab23b3b 100644
--- a/debian/control
+++ b/debian/control
@@ -38,7 +38,6 @@ Build-Depends: autoconf,
  libexpat1-dev,
  libfcgi-dev,
  libfuse-dev,
- libgoogle-perftools-dev [i386 amd64 arm64],
  libkeyutils-dev,
  libleveldb-dev,
  libnss3-dev,
diff --git a/debian/rules b/debian/rules
index b705dd6..7db5b9a 100755
--- a/debian/rules
+++ b/debian/rules
@@ -23,7 +23,7 @@ export DEB_HOST_ARCH ?= $(shell dpkg-architecture -qDEB_HOST_ARCH)
 extraopts += --with-ocf --with-nss
 extraopts += --with-debug
 extraopts += --enable-cephfs-java
-
+extraopts += --with-jemalloc
 # rocksdb is not packaged by anyone. build it if we can.
 extraopts += --with-librocksdb-static=check

>>And what's the side effect of debug ms = 0/0?

I don't see any side effect. You just won't have debug information (but as you are in production, it shouldn't be a problem).

>>And it looks like disabling cephx auth is no good for production use... does cephx affect performance a lot?

For me, I still see a 10-20% difference with cephx.
If you only use your ceph cluster for your qemu cluster, I don't see any problem with disabling it.
(This assumes, of course, that your ceph cluster is firewalled, or that network access is only available to your qemu clients.)

Note that changing it online is not possible, so you need to shut down all the clients before making this change.

----- Original Message -----
From: "Bill WONG" <wongahshuen@xxxxxxxxx>
To: "aderumier" <aderumier@xxxxxxxxx>
Cc: "dillaman" <dillaman@xxxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Monday, 7 November 2016 06:35:38
Subject: Re: RBD Block performance vs rbd mount as filesystem

Hi Alexandre,

thank you!
Is there any document on how I can compile ceph with jemalloc as well? It looks like ceph with jemalloc performs much better too.
And what's the side effect of debug ms = 0/0?
And it looks like disabling cephx auth is no good for production use... does cephx affect performance a lot?

On Sat, Nov 5, 2016 at 5:55 PM, Alexandre DERUMIER <aderumier@xxxxxxxxx> wrote:

Here are some tips I use to improve librbd && qemu performance:

- disable cephx auth
- disable debug_ms (I'm jumping from 30k iops to 45k iops, with 4k randread):

[global]
debug ms = 0/0

- compile qemu with jemalloc (--enable-jemalloc)
https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg05265.html

----- Original Message -----
From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
To: "Bill WONG" <wongahshuen@xxxxxxxxx>
Cc: "aderumier" <aderumier@xxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Tuesday, 1 November 2016 02:06:22
Subject: Re: RBD Block performance vs rbd mount as filesystem

For better or worse, I can repeat your "ioping" findings against a
qcow2 image hosted on a krbd-backed volume. The "bad" news is that it
actually isn't even sending any data to the OSDs -- which is why your
latency is shockingly low. When performing a "dd ...
oflag=dsync" against the krbd-backed qcow2 image, I can see lots of IO
being coalesced from 4K writes into larger writes, which is
artificially inflating the stats.

On Mon, Oct 31, 2016 at 11:08 AM, Bill WONG <wongahshuen@xxxxxxxxx> wrote:
> Hi Jason,
>
> it looks the situation is the same, no difference. my ceph.conf is below,
> any comments or improvement required?
> ---
> [global]
> fsid = 106a12b0-5ed0-4a71-b6aa-68a09088ec33
> mon_initial_members = ceph-mon1, ceph-mon2, ceph-mon3
> mon_host = 192.168.8.11,192.168.8.12,192.168.8.13
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
> osd pool default size = 3
> osd pool default min size = 1
> osd pool default pg num = 4096
> osd pool default pgp num = 4096
> osd_crush_chooseleaf_type = 1
> mon_pg_warn_max_per_osd = 0
> max_open_files = 131072
>
> [mon]
> mon_data = /var/lib/ceph/mon/ceph-$id
>
> mon clock drift allowed = 2
> mon clock drift warn backoff = 30
>
> [osd]
> osd_data = /var/lib/ceph/osd/ceph-$id
> osd_journal_size = 20000
> osd_mkfs_type = xfs
> osd_mkfs_options_xfs = -f
> filestore_xattr_use_omap = true
> filestore_min_sync_interval = 10
> filestore_max_sync_interval = 15
> filestore_queue_max_ops = 25000
> filestore_queue_max_bytes = 10485760
> filestore_queue_committing_max_ops = 5000
> filestore_queue_committing_max_bytes = 10485760000
> journal_max_write_bytes = 1073714824
> journal_max_write_entries = 10000
> journal_queue_max_ops = 50000
> journal_queue_max_bytes = 10485760000
> osd_max_write_size = 512
> osd_client_message_size_cap = 2147483648
> osd_deep_scrub_stride = 131072
> osd_op_threads = 8
> osd_disk_threads = 4
> osd_map_cache_size = 1024
> osd_map_cache_bl_size = 128
> osd_mount_options_xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier"
> osd_recovery_op_priority = 4
> osd_recovery_max_active = 10
> osd_max_backfills = 4
> rbd non blocking aio = false
>
> [client]
> rbd_cache = true
> rbd_cache_size = 268435456
> rbd_cache_max_dirty = 134217728
> rbd_cache_max_dirty_age = 5
> ---
>
> On Mon, Oct 31, 2016 at 9:20 PM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>
>> On Sun, Oct 30, 2016 at 5:40 AM, Bill WONG <wongahshuen@xxxxxxxxx> wrote:
>> > any ideas or comments?
>>
>> Can you set "rbd non blocking aio = false" in your ceph.conf and retry
>> librbd? This will eliminate at least one context switch on the read IO
>> path -- which results in increased latency under extremely low queue
>> depths.
>>
>> --
>> Jason
>
>

--
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
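
For reference, the ceph.conf changes suggested in this thread could be collected roughly as follows. This is only a sketch, not a tested configuration. The "debug ms" and "rbd non blocking aio" lines are taken from the messages above. The three auth_*_required = none lines are my assumption of how "disabling cephx auth" would be written, and placing the rbd option under [client] (rather than [osd], as in the posted config) assumes it is read by librbd on the client side.

```ini
[global]
# Turn off messenger debug logging (reported above: 30k -> 45k iops, 4k randread).
debug ms = 0/0

# Assumption: this is how cephx would be disabled. Only reasonable if the
# cluster is firewalled or reachable solely by trusted qemu clients; per the
# thread, all clients must be shut down before this is changed.
auth_cluster_required = none
auth_service_required = none
auth_client_required = none

[client]
# Suggested by Jason to remove one context switch on the librbd read IO path.
# Assumption: as a librbd option it belongs in [client] or [global].
rbd non blocking aio = false
```

After changing these, rerunning the same 4k randread test used before the change would show whether the numbers reported in the thread are reproducible on a given cluster.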