>> Are you sure? Your config didn't show this.

Yes. I have a dedicated 10GbE network between the ceph nodes. Each ceph node has a separate network with its own 10GbE network card. Do I have to set anything in the config for 10GbE?

>> What kind of devices are they? did you do the journal test?

They are neither NVMe nor SSDs. Each node has 10x3TB SATA Hard Disk Drives (HDD).

-Gencer.

-----Original Message-----
From: Peter Maloney [mailto:peter.maloney@xxxxxxxxxxxxxxxxxxxx]
Sent: Tuesday, July 18, 2017 2:47 PM
To: gencer@xxxxxxxxxxxxx
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Yet another performance tuning for CephFS

On 07/17/17 22:49, gencer@xxxxxxxxxxxxx wrote:
> I have a separate 10GbE network for ceph and another for public.
>
Are you sure? Your config didn't show this.

> No they are not NVMe, unfortunately.
>
What kind of devices are they? Did you do the journal test?
http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

Unlike most tests, with ceph journals, you can't look at the load on the device and decide it's not the bottleneck; you have to test it another way. I had some Micron SSDs I tested which performed poorly, and that test showed them performing poorly too. But from other benchmarks, and disk load during journal tests, they looked ok, which was misleading.

> Do you know any test command that I can try to see if this is the max.
> read speed from rsync?
I don't know how you can improve your rsync test.
>
> Because I tried one thing a few minutes ago. I opened 4 ssh channels
> and ran the rsync command, copying bigfile to different targets in cephfs
> at the same time. Then I looked at the network graphs and saw numbers
> up to 1.09 gb/s. But why can't a single copy/rsync exceed 200 MB/s? What
> prevents it? I really wonder about this.
>
> Gencer.
>
> On 2017-07-17 23:24, Peter Maloney wrote:
>> You should have a separate public and cluster network. And journal or
>> wal/db performance is important... are the devices fast NVMe?
>>
>> On 07/17/17 21:31, gencer@xxxxxxxxxxxxx wrote:
>>
>>> Hi,
>>>
>>> I located and applied almost every tuning setting/config I could find
>>> on the internet. I couldn’t manage to improve my speed by one byte.
>>> It is always the same speed whatever I do.
>>>
>>> I was on jewel, now I tried BlueStore on Luminous. I still get exactly
>>> the same speed from cephfs.
>>>
>>> It doesn’t matter if I disable debug log, or remove the [osd] section
>>> and re-add it as below (see .conf). Results are exactly the same.
>>> Not a single byte is gained from those tunings. I also did tuning
>>> for kernel (sysctl.conf).
>>>
>>> Basics:
>>>
>>> I have 2 nodes with 10 OSDs each and each OSD is a 3TB SATA drive. Each
>>> node has 24 cores and 64GB of RAM. Ceph nodes are connected via
>>> 10GbE NIC. No FUSE used. But tried that too. Same results.
>>>
>>> $ dd if=/dev/zero of=/mnt/c/testfile bs=100M count=10 oflag=direct
>>> 10+0 records in
>>> 10+0 records out
>>> 1048576000 bytes (1.0 GB, 1000 MiB) copied, 5.77219 s, 182 MB/s
>>>
>>> 182MB/s. This is the best speed I get so far. Usually ~170MB/s. Hm..
>>> I get much much much higher speeds on different filesystems. Even
>>> with glusterfs. Is there anything I can do or try?
>>>
>>> Read speed is also around 180-220MB/s but not higher.
>>>
>>> This is what I am using in ceph.conf:
>>>
>>> [global]
>>> fsid = d7163667-f8c5-466b-88df-8747b26c91df
>>> mon_initial_members = server1
>>> mon_host = 192.168.0.1
>>> auth_cluster_required = cephx
>>> auth_service_required = cephx
>>> auth_client_required = cephx
>>> osd mount options = rw,noexec,nodev,noatime,nodiratime,nobarrier
>>> osd mount options xfs = rw,noexec,nodev,noatime,nodiratime,nobarrier
>>>
>>> osd_mkfs_type = xfs
>>> osd pool default size = 2
>>> enable experimental unrecoverable data corrupting features = bluestore rocksdb
>>> bluestore fsck on mount = true
>>> rbd readahead disable after bytes = 0
>>> rbd readahead max bytes = 4194304
>>> log to syslog = false
>>> debug_lockdep = 0/0
>>> debug_context = 0/0
>>> debug_crush = 0/0
>>> debug_buffer = 0/0
>>> debug_timer = 0/0
>>> debug_filer = 0/0
>>> debug_objecter = 0/0
>>> debug_rados = 0/0
>>> debug_rbd = 0/0
>>> debug_journaler = 0/0
>>> debug_objectcatcher = 0/0
>>> debug_client = 0/0
>>> debug_osd = 0/0
>>> debug_optracker = 0/0
>>> debug_objclass = 0/0
>>> debug_filestore = 0/0
>>> debug_journal = 0/0
>>> debug_ms = 0/0
>>> debug_monc = 0/0
>>> debug_tp = 0/0
>>> debug_auth = 0/0
>>> debug_finisher = 0/0
>>> debug_heartbeatmap = 0/0
>>> debug_perfcounter = 0/0
>>> debug_asok = 0/0
>>> debug_throttle = 0/0
>>> debug_mon = 0/0
>>> debug_paxos = 0/0
>>> debug_rgw = 0/0
>>>
>>> [osd]
>>> osd max write size = 512
>>> osd client message size cap = 2147483648
>>> osd mount options xfs = rw,noexec,nodev,noatime,nodiratime,nobarrier
>>>
>>> filestore xattr use omap = true
>>> osd_op_threads = 8
>>> osd disk threads = 4
>>> osd map cache size = 1024
>>> filestore_queue_max_ops = 25000
>>> filestore_queue_max_bytes = 10485760
>>> filestore_queue_committing_max_ops = 5000
>>> filestore_queue_committing_max_bytes = 10485760000
>>> journal_max_write_entries = 1000
>>> journal_queue_max_ops = 3000
>>> journal_max_write_bytes = 1048576000
>>> journal_queue_max_bytes = 1048576000
>>> filestore_max_sync_interval = 15
>>> filestore_merge_threshold = 20
>>> filestore_split_multiple = 2
>>> osd_enable_op_tracker = false
>>> filestore_wbthrottle_enable = false
>>> osd_client_message_size_cap = 0
>>> osd_client_message_cap = 0
>>> filestore_fd_cache_size = 64
>>> filestore_fd_cache_shards = 32
>>> filestore_op_threads = 12
>>>
>>> As I stated above, it doesn’t matter if I have this [osd] section or
>>> not. Results are the same.
>>>
>>> I am open to all suggestions.
>>>
>>> Thanks,
>>>
>>> Gencer.
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@xxxxxxxxxxxxxxxxxxxx
Internet: http://www.brockmann-consult.de
--------------------------------------------
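
On the question of whether anything has to be set in the config for the dedicated 10GbE network: the quoted ceph.conf has no network entries, which is presumably why the reply says the config "didn't show this". A minimal sketch of how the split is usually declared, using Ceph's `public network` and `cluster network` options in [global]. Both subnets here are assumptions for illustration: the first is guessed from mon_host = 192.168.0.1 with an assumed /24 mask, and the second is a purely hypothetical range for the dedicated link.

```
[global]
# Client and monitor traffic. Assumed subnet, guessed from mon_host = 192.168.0.1.
public network = 192.168.0.0/24
# OSD replication and heartbeat traffic. Hypothetical subnet for the
# dedicated 10GbE link; replace with the real one.
cluster network = 10.10.10.0/24
```

The OSDs typically need a restart to pick up a changed network setting, and it may be worth confirming afterwards that they are actually bound to addresses on the intended subnets.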