The plot thickens:
I checked the C-states and apparently I am operating in C1 with all CPUs online. It turns out the servers were tuned with the latency-performance profile:
# tuned-adm active
Current active profile: latency-performance
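For anyone who wants to double-check, two other ways to see which C-states the cores actually reach (the cpuidle sysfs path is standard on recent kernels; cpupower typically ships in kernel-tools on RHEL/CentOS):

# cpupower idle-info
# grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/{name,usage}

With latency-performance I'd expect only POLL and C1 to show meaningful usage there, which matches the turbostat output below.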
turbostat shows the cores spending ~99% of their time in C1 (CPU%c1), with the deeper C3/C6/C7 states never entered:
Package  Core  CPU  Avg_MHz  %Busy  Bzy_MHz  TSC_MHz  SMI  CPU%c1  CPU%c3  CPU%c6  CPU%c7  CoreTmp  PkgTmp  Pkg%pc2  Pkg%pc3  Pkg%pc6  Pkg%pc7  PkgWatt  RAMWatt  PKG_%  RAM_%
      -     -    -       22   0.84     2600     2400    0   99.16    0.00    0.00    0.00       49      58     0.00     0.00     0.00     0.00    69.51    17.29   0.00   0.00
      0     0    0       39   1.52     2600     2400    0   98.48    0.00    0.00    0.00       48      58     0.00     0.00     0.00     0.00    36.30     8.73   0.00   0.00
      0     0   12       15   0.56     2600     2400    0   99.44
      0     1    2       47   1.81     2600     2400    0   98.19    0.00    0.00    0.00       49
      0     1   14       17   0.66     2600     2400    0   99.34
      0     2    4       31   1.20     2600     2400    0   98.80    0.00    0.00    0.00       47
      0     2   16       18   0.71     2600     2400    0   99.29
      0     3    6       31   1.21     2600     2400    0   98.79    0.00    0.00    0.00       49
      0     3   18       39   1.50     2600     2400    0   98.50
      0     4    8       33   1.27     2600     2400    0   98.73    0.00    0.00    0.00       46
      0     4   20       17   0.64     2600     2400    0   99.36
      0     5   10       32   1.23     2600     2400    0   98.77    0.00    0.00    0.00       48
      0     5   22       20   0.76     2600     2400    0   99.24
      1     0    1       25   0.95     2600     2400    0   99.05    0.00    0.00    0.00       44      52     0.00     0.00     0.00     0.00    33.21     8.56   0.00   0.00
      1     0   13        9   0.34     2600     2400    0   99.66
      1     1    3        9   0.35     2600     2400    0   99.65    0.00    0.00    0.00       42
      1     1   15       11   0.42     2600     2400    0   99.58
      1     2    5       30   1.17     2600     2400    0   98.83    0.00    0.00    0.00       46
      1     2   17        7   0.28     2600     2400    0   99.72
      1     3    7       10   0.40     2600     2400    0   99.60    0.00    0.00    0.00       44
      1     3   19       10   0.37     2600     2400    0   99.63
      1     4    9        9   0.36     2600     2400    0   99.64    0.00    0.00    0.00       45
      1     4   21        7   0.27     2600     2400    0   99.73
      1     5   11       12   0.45     2600     2400    0   99.55    0.00    0.00    0.00       45
      1     5   23       46   1.76     2600     2400    0   98.24
iostat for the SSD shows consistently low write latencies (w_await) and utilization:
# iostat -xd -p sdb 1 1000
Device:  rrqm/s  wrqm/s   r/s    w/s  rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdb        0.00    0.00  0.05  26.78   0.20  2299.53    171.42      0.02   0.64     0.11     0.64   0.08   0.20
sdb        0.00    0.00  0.00  16.00   0.00   392.00     49.00      0.00   0.06     0.00     0.06   0.06   0.10
sdb        0.00    0.00  0.00  74.00   0.00   880.00     23.78      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  56.00   0.00   240.00      8.57      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  44.00   0.00   676.00     30.73      0.00   0.07     0.00     0.07   0.05   0.20
sdb        0.00    0.00  0.00  10.00   0.00    92.00     18.40      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00   6.00   0.00    84.00     28.00      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00   1.00   0.00    20.00     40.00      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  25.00   0.00   212.00     16.96      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  14.00   0.00   100.00     14.29      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00   5.00   0.00   112.00     44.80      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  13.00   0.00   508.00     78.15      0.00   0.15     0.00     0.15   0.15   0.20
sdb        0.00    0.00  0.00  49.00   0.00   820.00     33.47      0.01   0.10     0.00     0.10   0.08   0.40
sdb        0.00    0.00  0.00   7.00   0.00    52.00     14.86      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  18.00   0.00   180.00     20.00      0.00   0.06     0.00     0.06   0.06   0.10
sdb        0.00    0.00  0.00  34.00   0.00   476.00     28.00      0.00   0.06     0.00     0.06   0.06   0.20
sdb        0.00    0.00  1.00  12.00   4.00   156.00     24.62      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  32.00   0.00   940.00     58.75      0.00   0.03     0.00     0.03   0.03   0.10
sdb        0.00    0.00  0.00  13.00   0.00   456.00     70.15      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  37.00   0.00   536.00     28.97      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00   6.00   0.00    60.00     20.00      0.00   0.17     0.00     0.17   0.17   0.10
sdb        0.00    0.00  0.00   3.00   0.00    48.00     32.00      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00  0.00  10.00   0.00  1452.00    290.40      0.00   0.30     0.00     0.30   0.20   0.20
On 11/17/2018 3:42 PM, John Petrini wrote:
You can check whether C-states are enabled with cat /proc/acpi/processor/info; look for "power management: yes/no". If they are enabled, you can check the current C-state of each core with cat /proc/acpi/processor/CPU?/power. C0 is the CPU's normal operating state; any other state means the processor is in a power-saving mode.
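To dump that for every core in one go, something like this should work, assuming your kernel still exposes the legacy ACPI procfs interface (newer kernels only provide /sys/devices/system/cpu/cpu*/cpuidle/ instead):

# for f in /proc/acpi/processor/CPU*/power; do echo "== $f =="; grep 'active state' "$f"; done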
C-states are configured in the BIOS, so a reboot is required to change them. I know that with Dell servers you can trigger the change with omconfig and then issue a reboot for it to take effect; otherwise you'll need to disable it directly in the BIOS.
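For example, on a Dell box with OMSA installed this is roughly the shape of it; take the attribute name as an assumption on my part, since it varies by server generation:

# omconfig chassis biossetup attribute=cstates setting=disabled
# reboot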
As for the SSDs, I would just run iostat and check the iowait. If you see small disk writes causing high iowait, then your SSDs are probably at the end of their life. Ceph journaling is good at destroying SSDs.
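If you want a second data point on wear beyond iostat, the SMART counters are worth a look; the grep pattern here is a guess, since the attribute names are vendor-specific (Intel reports Media_Wearout_Indicator, Samsung reports Wear_Leveling_Count, and so on):

# smartctl -a /dev/sdb | grep -i -E 'wear|life|percent'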