Not sure if this info is of any help; please be aware I am also just in a testing phase with Ceph.

I don't know how numa=off is interpreted by the OS. If it just hides NUMA, you could still run into the known issues; that is why I have numad running.

Furthermore, I have put an OSD 'out', which also shows a 0 in the reweight column. So I guess your osd.1 is also not participating? If so, wouldn't that be a problem when you are testing 3x replication with only 2 disks?

I get this on SATA 5400rpm disks, replicated pool size 3:

rados bench -p rbd 30 write --id rbd

  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
   20      16       832       816   163.178       180    0.157838   0.387074
   21      16       867       851   162.073       140    0.157289    0.38817
   22      16       900       884   160.705       132    0.224024   0.393674
   23      16       953       937   162.934       212    0.530274   0.388189
   24      16       989       973   162.144       144    0.209806   0.389644
   25      16      1028      1012   161.898       156    0.118438   0.391057
   26      16      1067      1051    161.67       156    0.248463    0.38977
   27      16      1112      1096   162.348       180    0.754184   0.392159
   28      16      1143      1127   160.977       124    0.439342   0.393641
   29      16      1185      1169   161.219       168   0.0801006   0.393004
   30      16      1221      1205   160.644       144    0.224278    0.39363
Total time run:         30.339270
Total writes made:      1222
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     161.111
Stddev Bandwidth:       24.6819
Max bandwidth (MB/sec): 212
Min bandwidth (MB/sec): 120
Average IOPS:           40
Stddev IOPS:            6
Max IOPS:               53
Min IOPS:               30
Average Latency(s):     0.396239
Stddev Latency(s):      0.249998
Max latency(s):         1.29482
Min latency(s):         0.06875

-----Original Message-----
From: Steven Vacaroaia [mailto:stef97@xxxxxxxxx]
Sent: Friday, 2 February 2018 15:25
To: ceph-users
Subject: ceph luminous performance - disks at 100% , low network utilization

Hi,

I have been struggling to get my test cluster to behave (from a performance perspective).

Dell R620, 64 GB RAM, 1 CPU, numa=off, PERC H710, RAID 0, enterprise 10K disks. No SSD - just plain HDD.

Local tests (dd, hdparm) confirm my disks are capable of delivering 200 MB/s. Fio with 15 jobs indicates 100 MB/s.
Ceph tell shows 400 MB/s. Rados bench with 1 thread provides 3 MB/s; rados bench with 32 threads and 2 OSDs (one per server) barely touches 10 MB/s. Adding a third server / OSD improves performance slightly (11 MB/s).

atop shows disk usage at 100% for extended periods of time. Network usage is very low. Nothing else is "red". I have removed all TCP settings and left ceph.conf mostly with defaults.

What am I missing?

Many thanks
Steven

ceph osd tree
 ID CLASS WEIGHT  TYPE NAME   STATUS REWEIGHT PRI-AFF
  0   hdd 0.54529      osd.0      up  1.00000 1.00000
 -5       0.54529 host osd02
  1   hdd 0.54529      osd.1      up        0 1.00000
 -7       0       host osd04
-17       0.54529 host osd05
  2   hdd 0.54529      osd.2      up  1.00000 1.00000

[root@osd01 ~]# ceph tell osd.0 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "bytes_per_sec": 452125657
}

[root@osd01 ~]# ceph tell osd.2 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "bytes_per_sec": 340553488
}

hdparm -tT /dev/sdc

/dev/sdc:
 Timing cached reads:   5874 MB in  1.99 seconds = 2948.51 MB/sec
 Timing buffered disk reads: 596 MB in  3.01 seconds = 198.17 MB/sec

fio --filename=/dev/sdc --direct=1 --sync=1 --rw=write --bs=4k --numjobs=15 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.2.8
Starting 15 processes
Jobs: 15 (f=15): [W(15)] [100.0% done] [0KB/104.9MB/0KB /s] [0/26.9K/0 iops] [eta 00m:00s]

fio --filename=/dev/sdc --direct=1 --sync=1 --rw=write --bs=4k --numjobs=5 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.2.8
Starting 5 processes
Jobs: 5 (f=5): [W(5)] [100.0% done] [0KB/83004KB/0KB /s] [0/20.8K/0 iops] [eta 00m:00s]

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
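[Editor's note] As a sanity check on the numbers in the thread: `ceph tell osd.N bench` reports `bytes_per_sec` in raw bytes, so the two JSON results above can be converted to the MB/s figures quoted in the message. This is a minimal sketch; the dictionary just copies the two values from the output above.

```python
# bytes_per_sec values copied from the `ceph tell osd.N bench` output above
bench = {"osd.0": 452125657, "osd.2": 340553488}

# Convert raw bytes/s to MB/s (1 MB = 2**20 bytes, as rados bench uses)
for osd, bps in bench.items():
    print(f"{osd}: {bps / 2**20:.1f} MB/s")
```

This gives roughly 431 MB/s for osd.0 and 325 MB/s for osd.2, which is consistent with the "Ceph tell shows 400 MB/s" figure, and far above the 10 MB/s seen by rados bench from a client.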