Some info to update: the data disks are 3TB SATA, Model Number: WDC
WD3000FYYZ-01UL1B1. Today I tried setting the reweight of osd.0 to 0.1
and then checked again; iostat turned up some useful data:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.63    0.00    0.48   16.15    0.00   81.75

Device:  rrqm/s wrqm/s   r/s   w/s  rMB/s  wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda        0.00   0.00  0.00  2.00   0.00   1.00  1024.00    39.85 1134.00    0.00 1134.00 500.00 100.00
sda1       0.00   0.00  0.00  0.00   0.00   0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda2       0.00   0.00  0.00  0.00   0.00   0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda3       0.00   0.00  0.00  0.00   0.00   0.00     0.00     1.00    0.00    0.00    0.00   0.00 100.40
sda4       0.00   0.00  0.00  0.00   0.00   0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda5       0.00   0.00  0.00  2.00   0.00   1.00  1024.00    32.32 1134.00    0.00 1134.00 502.00 100.40
sda6       0.00   0.00  0.00  0.00   0.00   0.00     0.00     0.66    0.00    0.00    0.00   0.00  66.40
sda7       0.00   0.00  0.00  0.00   0.00   0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda8       0.00   0.00  0.00  0.00   0.00   0.00     0.00     1.00    0.00    0.00    0.00   0.00 100.00
sda9       0.00   0.00  0.00  0.00   0.00   0.00     0.00     1.00    0.00    0.00    0.00   0.00 100.00
sda10      0.00   0.00  0.00  0.00   0.00   0.00     0.00     1.00    0.00    0.00    0.00   0.00 100.00

root@node-65:~# ls -l /var/lib/ceph/osd/ceph-0
total 62924048
-rw-r--r--   1 root root         487 Oct 12 16:49 activate.monmap
-rw-r--r--   1 root root           3 Oct 12 16:49 active
-rw-r--r--   1 root root          37 Oct 12 16:49 ceph_fsid
drwxr-xr-x 280 root root        8192 Mar 30 11:58 current
-rw-r--r--   1 root root          37 Oct 12 16:49 fsid
lrwxrwxrwx   1 root root           9 Oct 12 16:49 journal -> /dev/sda5
-rw-------   1 root root          56 Oct 12 16:49 keyring
-rw-r--r--   1 root root          21 Oct 12 16:49 magic
-rw-r--r--   1 root root           6 Oct 12 16:49 ready
-rw-r--r--   1 root root           4 Oct 12 16:49 store_version
-rw-r--r--   1 root root          42 Oct 12 16:49 superblock
-rw-r--r--   1 root root 64424509440 Mar 30 10:20 testfio.file
-rw-r--r--   1 root root           0 Mar 28 09:54 upstart
-rw-r--r--   1 root root           2 Oct 12 16:49 whoami

The journal of osd.0 is on sda5, and it is completely saturated: 100%
%util with an await of over 1100 ms. CPU iowait in top is 30% and the
whole system is slow.

So maybe the problem is sda5? It is an INTEL SSDSC2BB120G4; we use two
SSDs for the journals and the system:

root@node-65:~# lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda        8:0    0 111.8G  0 disk
├─sda1     8:1    0    22M  0 part
├─sda2     8:2    0   191M  0 part /boot
├─sda3     8:3    0  43.9G  0 part /
├─sda4     8:4    0   3.8G  0 part [SWAP]
├─sda5     8:5    0  10.2G  0 part
├─sda6     8:6    0  10.2G  0 part
├─sda7     8:7    0  10.2G  0 part
├─sda8     8:8    0  10.2G  0 part
├─sda9     8:9    0  10.2G  0 part
└─sda10    8:10   0  10.2G  0 part
sdb        8:16   0 111.8G  0 disk
├─sdb1     8:17   0    24M  0 part
├─sdb2     8:18   0  10.2G  0 part
├─sdb3     8:19   0  10.2G  0 part
├─sdb4     8:20   0  10.2G  0 part
├─sdb5     8:21   0  10.2G  0 part
├─sdb6     8:22   0  10.2G  0 part
├─sdb7     8:23   0  10.2G  0 part
└─sdb8     8:24   0  50.1G  0 part
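
As far as I understand it, `ceph osd perf` does not probe the disk at
all: each OSD reports its own filestore commit/apply latency counters
to the monitors, so a sick journal device keeps showing huge numbers
even when the OSD takes no client I/O (which would explain high
latencies on OSDs reweighted to 0). A minimal sketch to look at the
raw counters on the OSD node, assuming the default admin socket and
that the counter names match the versions I have seen:

# dump osd.0's internal perf counters and pick out the latency values
root@node-65:~# ceph daemon osd.0 perf dump | python -m json.tool | grep -B1 -A3 latency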
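
Before benchmarking I would also check the SSD's SMART data; these
Intel DC drives expose wear and error counters (attribute names vary
by model, so the grep pattern below is only a sketch):

# quick health check of the journal SSD
root@node-65:~# smartctl -a /dev/sda | grep -i -E 'wearout|wear_leveling|reallocated|error'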
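
To confirm whether the SSD itself is the problem, the usual test for a
journal device is small synchronous writes with fio. WARNING: writing
to /dev/sda5 destroys the live journal, so this is only a sketch of
what I would run with osd.0 stopped and its journal flushed,
recreating the journal afterwards (upstart service names, as on this
node; the fio job name is arbitrary):

root@node-65:~# stop ceph-osd id=0
root@node-65:~# ceph-osd -i 0 --flush-journal
# 4k O_DIRECT+O_SYNC sequential writes at queue depth 1 for 60 seconds
root@node-65:~# fio --name=journal-test --filename=/dev/sda5 --direct=1 --sync=1 \
        --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based
root@node-65:~# ceph-osd -i 0 --mkjournal
root@node-65:~# start ceph-osd id=0

A healthy DC-class SSD should sustain this at well under a millisecond
per write; anything near the 1134 ms await above would point at the
drive (or its firmware/controller) rather than at Ceph.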
2016-03-30 11:17 GMT+08:00 lin zhou <hnuzhoulin2@xxxxxxxxx>:
> Hi, cephers,
>
> Some OSDs show high latency in the output of `ceph osd perf`, but I
> have set the reweight of these OSDs in the crushmap to 0.0, and when
> I check these disks with iostat there is no load.
>
> So how does the command `ceph osd perf` work?
>
> root@node-67:~# ceph osd perf
> osdid fs_commit_latency(ms) fs_apply_latency(ms)
>     0                  9819                10204
>     1                     1                    2
>     2                     1                    2
>     3                  9126                 9496
>     4                     2                    3
>     5                     2                    3
>     6                     0                    0
>     7                     0                    1
>     8                     1                    2
>     9                 32960                33953
>    10                     2                    3
>    11                     1                   31
>    12                 23450                24324
>    13                     0                    1
>    14                     1                    2
>    15                 18882                19622
>    16                     2                    4
>    17                     1                    2
>    18                     3                   46
>
> root@node-67:~# ceph osd tree
> # id    weight  type name       up/down reweight
> -1      103.7   root default
> -2      8.19            host node-65
> 18      2.73                    osd.18  up      1
> 21      0                       osd.21  up      1
> 24      2.73                    osd.24  up      1
> 27      2.73                    osd.27  up      1
> 30      0                       osd.30  up      1
> 33      0                       osd.33  up      1
> 0       0                       osd.0   up      1
> 3       0                       osd.3   up      1
> 6       0                       osd.6   down    0
> 9       0                       osd.9   up      1
> 12      0                       osd.12  up      1
> 15      0                       osd.15  up      1
>
> root@node-65:~# iostat -xm 1 -p sdd,sde,sdf,sdg
> Linux 3.11.0-26-generic (node-65)  03/30/2016  _x86_64_  (40 CPU)
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            3.34    0.00    0.61    2.60    0.00   93.45
>
> Device:  rrqm/s wrqm/s   r/s   w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
> sdd        0.00   0.05  0.08  0.16   0.00   0.01    64.48     0.01  29.11    4.21   42.09   1.70   0.04
> sdd1       0.00   0.05  0.08  0.16   0.00   0.01    64.54     0.01  29.14    4.22   42.09   1.71   0.04
> sde        0.00   0.00  0.04  0.00   0.00   0.00    15.59     0.00   4.00    4.00    0.85   3.86   0.01
> sde1       0.00   0.00  0.04  0.00   0.00   0.00    15.62     0.00   4.01    4.02    0.85   3.88   0.01
> sdf        0.00   0.04  0.39  0.31   0.04   0.01   148.58     0.01  13.58    2.61   27.50   1.17   0.08
> sdf1       0.00   0.04  0.39  0.31   0.04   0.01   148.63     0.01  13.59    2.61   27.50   1.17   0.08
> sdg        0.00   0.06  0.22  0.24   0.02   0.02   162.41     0.01  23.64    2.85   42.89   1.49   0.07
> sdg1       0.00   0.06  0.22  0.24   0.02   0.02   162.49     0.01  23.65    2.86   42.89   1.49   0.07
>
> root@node-65:~# top
>
> top - 11:16:34 up 11 days, 19:26,  4 users,  load average: 4.46, 4.73, 3.66
> Tasks: 588 total,   1 running, 587 sleeping,   0 stopped,   0 zombie
> Cpu(s):  3.6%us,  0.6%sy,  0.0%ni, 93.5%id,  2.1%wa,  0.0%hi,  0.1%si,  0.0%st
> Mem:  131998604k total, 69814800k used, 62183804k free,   216612k buffers
> Swap:   3999740k total,        0k used,  3999740k free, 14093896k cached
>
>   PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 17000 libvirt- 20  0 39.7g  32g  12m S  109 25.7 18417:09 qemu-system-x86
>  8852 root     20  0 2116m 854m 9896 S   20  0.7 303:24.39 ceph-osd
>  9319 root     20  0 2285m 974m  12m S   16  0.8 385:24.69 ceph-osd
>  8797 root     20  0 2298m 929m  10m S   10  0.7 298:14.61 ceph-osd
> 17001 root     20  0     0    0    0 S    4  0.0 384:38.16 vhost-17000
> 24241 nova     20  0  148m  64m 4908 S    2  0.1  89:47.86 nova-network
> 24189 nova     20  0  125m  37m 4880 S    1  0.0  49:48.06 nova-api
>     1 root     20  0 24720 2644 1376 S    0  0.0   4:35.38 init
>    50 root     20  0     0    0    0 S    0  0.0  46:32.66 rcu_sched
>    52 root     20  0     0    0    0 S    0  0.0   1:21.29 rcuos/1
>    64 root     20  0     0    0    0 S    0  0.0   4:34.05 rcuos/13
>    67 root     20  0     0    0    0 S    0  0.0   3:06.16 rcuos/16
>    68 root     20  0     0    0    0 S    0  0.0   2:45.34 rcuos/17
>  4732 root     20  0 1650m 807m  15m S    0  0.6   3:24.89 ceph-osd

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com