>> if I compiled Ceph from source, then I cannot use ceph-deploy to install the cluster; everything needs to be handled by myself. As I am running CentOS 7, it looks like Ceph suggests using ceph-deploy to deploy the cluster. Is there any pre-compiled package with --with-jemalloc enabled by default?

You can use ceph-deploy to deploy the first time with the Ceph repo, then reinstall the jemalloc package on top.

Or you can build your own repository and tell ceph-deploy install to use it:

ceph-deploy install --repo-url http://my.repo.com/debian-jewel/

But yes, you'll need to build packages manually each time Ceph releases a new version.
(I'm still hoping to have an official ceph-jemalloc repo some day.)

----- Original Message -----
From: "Bill WONG" <wongahshuen@xxxxxxxxx>
To: "aderumier" <aderumier@xxxxxxxxx>
Cc: "dillaman" <dillaman@xxxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Monday, 7 November 2016 11:28:02
Subject: Re: RBD Block performance vs rbd mount as filesystem

Hi Alexandre,

if I compiled Ceph from source, then I cannot use ceph-deploy to install the cluster; everything needs to be handled by myself. As I am running CentOS 7, it looks like Ceph suggests using ceph-deploy to deploy the cluster. Is there any pre-compiled package with --with-jemalloc enabled by default?

On Mon, Nov 7, 2016 at 2:46 PM, Alexandre DERUMIER <aderumier@xxxxxxxxx> wrote:

>> Is there any document you can provide on how I can compile Ceph with jemalloc as well? It looks like Ceph with jemalloc gives much better performance too.

Simply build Ceph with --with-jemalloc.

(I'm seeing improvements at really high IOPS: at something like 300k IOPS tcmalloc becomes the limit, while with jemalloc I'm around 450k IOPS.)

Here is my Debian package rules change:

diff --git a/debian/control b/debian/control
index 3e03689..ab23b3b 100644
--- a/debian/control
+++ b/debian/control
@@ -38,7 +38,6 @@ Build-Depends: autoconf,
  libexpat1-dev,
  libfcgi-dev,
  libfuse-dev,
- libgoogle-perftools-dev [i386 amd64 arm64],
  libkeyutils-dev,
  libleveldb-dev,
  libnss3-dev,
diff --git a/debian/rules b/debian/rules
index b705dd6..7db5b9a 100755
--- a/debian/rules
+++ b/debian/rules
@@ -23,7 +23,7 @@ export DEB_HOST_ARCH ?= $(shell dpkg-architecture -qDEB_HOST_ARCH)
 extraopts += --with-ocf --with-nss
 extraopts += --with-debug
 extraopts += --enable-cephfs-java
-
+extraopts += --with-jemalloc
 # rocksdb is not packaged by anyone. build it if we can.
 extraopts += --with-librocksdb-static=check

>> And what's the side effect of debug ms = 0/0?

I don't see any side effect; you just won't have debug information.
(But since you are in production, that shouldn't be a problem.)

>> And it looks like disabling cephx auth is no good for production use.... does cephx affect performance a lot?

For me, I still see a 10-20% difference with cephx.
If you only use your Ceph cluster for your QEMU cluster, I don't see any problem with disabling it (provided, of course, that your Ceph cluster is firewalled or its network is only reachable by your QEMU clients).

Note that changing it online is not possible, so you need to shut down all the clients before making this change.
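For reference, a minimal ceph.conf sketch of the two changes discussed in this mail (silencing the messenger debug logs and disabling cephx) could look like the snippet below. The option names are the standard ones, but double-check them against your release, and remember that the auth change only takes effect after every daemon and client has been restarted.

[global]
# turn off messenger debug logging
debug ms = 0/0

# disable cephx entirely (only on a firewalled / private cluster network;
# cannot be changed online: stop the clients, then restart all daemons)
auth_cluster_required = none
auth_service_required = none
auth_client_required = none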
----- Original Message -----
From: "Bill WONG" <wongahshuen@xxxxxxxxx>
To: "aderumier" <aderumier@xxxxxxxxx>
Cc: "dillaman" <dillaman@xxxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Monday, 7 November 2016 06:35:38
Subject: Re: RBD Block performance vs rbd mount as filesystem

Hi Alexandre,

thank you! Is there any document you can provide on how I can compile Ceph with jemalloc as well? It looks like Ceph with jemalloc gives much better performance too.

And what's the side effect of debug ms = 0/0?

And it looks like disabling cephx auth is no good for production use.... does cephx affect performance a lot?

On Sat, Nov 5, 2016 at 5:55 PM, Alexandre DERUMIER <aderumier@xxxxxxxxx> wrote:

Here are some tips I use to improve librbd && QEMU performance:

- disable cephx auth

- disable debug_ms (I'm jumping from 30k IOPS to 45k IOPS with 4k randread):

  [global]
  debug ms = 0/0

- compile QEMU with jemalloc (--enable-jemalloc):
  https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg05265.html

----- Original Message -----
From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
To: "Bill WONG" <wongahshuen@xxxxxxxxx>
Cc: "aderumier" <aderumier@xxxxxxxxx>, "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Tuesday, 1 November 2016 02:06:22
Subject: Re: RBD Block performance vs rbd mount as filesystem

For better or worse, I can repeat your "ioping" findings against a qcow2 image hosted on a krbd-backed volume. The "bad" news is that it actually isn't even sending any data to the OSDs, which is why your latency is shockingly low. When performing a "dd ... oflag=dsync" against the krbd-backed qcow2 image, I can see lots of IO being coalesced from 4K writes into larger writes, which is artificially inflating the stats.

On Mon, Oct 31, 2016 at 11:08 AM, Bill WONG <wongahshuen@xxxxxxxxx> wrote:
> Hi Jason,
>
> it looks like the situation is the same, no difference. My ceph.conf is below;
> any comments or improvements required?
> ---
> [global]
> fsid = 106a12b0-5ed0-4a71-b6aa-68a09088ec33
> mon_initial_members = ceph-mon1, ceph-mon2, ceph-mon3
> mon_host = 192.168.8.11,192.168.8.12,192.168.8.13
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
> osd pool default size = 3
> osd pool default min size = 1
> osd pool default pg num = 4096
> osd pool default pgp num = 4096
> osd_crush_chooseleaf_type = 1
> mon_pg_warn_max_per_osd = 0
> max_open_files = 131072
>
> [mon]
> mon_data = /var/lib/ceph/mon/ceph-$id
>
> mon clock drift allowed = 2
> mon clock drift warn backoff = 30
>
> [osd]
> osd_data = /var/lib/ceph/osd/ceph-$id
> osd_journal_size = 20000
> osd_mkfs_type = xfs
> osd_mkfs_options_xfs = -f
> filestore_xattr_use_omap = true
> filestore_min_sync_interval = 10
> filestore_max_sync_interval = 15
> filestore_queue_max_ops = 25000
> filestore_queue_max_bytes = 10485760
> filestore_queue_committing_max_ops = 5000
> filestore_queue_committing_max_bytes = 10485760000
> journal_max_write_bytes = 1073714824
> journal_max_write_entries = 10000
> journal_queue_max_ops = 50000
> journal_queue_max_bytes = 10485760000
> osd_max_write_size = 512
> osd_client_message_size_cap = 2147483648
> osd_deep_scrub_stride = 131072
> osd_op_threads = 8
> osd_disk_threads = 4
> osd_map_cache_size = 1024
> osd_map_cache_bl_size = 128
> osd_mount_options_xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier"
> osd_recovery_op_priority = 4
> osd_recovery_max_active = 10
> osd_max_backfills = 4
> rbd non blocking aio = false
>
> [client]
> rbd_cache = true
> rbd_cache_size = 268435456
> rbd_cache_max_dirty = 134217728
> rbd_cache_max_dirty_age = 5
> ---
>
>
> On Mon, Oct 31, 2016 at 9:20 PM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>
>> On Sun, Oct 30, 2016 at 5:40 AM, Bill WONG <wongahshuen@xxxxxxxxx> wrote:
>> > any ideas or comments?
>>
>> Can you set "rbd non blocking aio = false" in your ceph.conf and retry
>> librbd? This will eliminate at least one context switch on the read IO
>> path, which results in increased latency under extremely low queue
>> depths.
>>
>> --
>> Jason
>
>

--
Jason

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
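A footnote to the benchmarking discussion in this thread: since ioping and dd against a qcow2 file on a krbd volume can be flattered by write coalescing, a fio job using its rbd ioengine exercises librbd directly and avoids that artifact. Below is a minimal sketch only; the pool, image, and client names are placeholders, and the test image must be created beforehand (for example with "rbd create fio-test --size 10240" in the chosen pool).

; fio job sketch: 4k random reads straight through librbd
; (pool, image, and client names are placeholders; create the image first)
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio-test
invalidate=0
rw=randread
bs=4k
runtime=60
time_based

[qd1-latency]
iodepth=1

[qd32-throughput]
stonewall
iodepth=32

The iodepth=1 job shows per-request latency, which is what the ioping comparison was trying to capture; the iodepth=32 job shows how throughput scales once the queue depth rises, which is where the tcmalloc/jemalloc and debug_ms differences discussed earlier tend to show up.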