Hi Tyler,
I think we had a user a while back who reported background deletion
work going on after upgrading their OSDs from filestore to bluestore,
due to PGs having been moved around. Is it possible that your cluster
is doing a bunch of work (deletion or otherwise) beyond the regular
client load? I don't remember how to check for this off the top of my
head, but it might be something to investigate. If that's what it is,
we recently added the ability to throttle background deletes:
https://github.com/ceph/ceph/pull/24749
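If it does turn out to be PG removal, something like this might be a
starting point (just a sketch; osd_delete_sleep is the option from the
PR above, and the exact perf counter names vary by release, so verify
against your version):

# look for deletion/removal activity in one OSD's perf counters
ceph daemon osd.0 perf dump | grep -i delete
# throttle background deletes by sleeping between removal transactions
ceph tell 'osd.*' injectargs '--osd_delete_sleep 1'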
If the logs/admin socket don't tell you anything, you could also try
using our wallclock profiler to see what the OSD is spending its time
doing:
https://github.com/markhpc/gdbpmp/
# collect a wallclock profile of the running OSD
./gdbpmp -t 1000 -p `pidof ceph-osd` -o foo.gdbpmp
# read the collected profile back
./gdbpmp -i foo.gdbpmp -t 1
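One caveat: gdb-based sampling briefly stops the OSD's threads each
time it takes a sample, so expect a bit of extra latency on that OSD
while the profile is being collected.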
Mark
On 12/10/18 6:09 PM, Tyler Bishop wrote:
Hi,
I have an SSD-only cluster that I recently converted from filestore to
bluestore, and performance has totally tanked. It was fairly decent
before, with only a little more latency than expected. Since converting
to bluestore, the latency is extremely high: SECONDS. I am trying to
determine if it is an issue with the SSDs or with bluestore treating
them differently than filestore... potential garbage collection? 24+
hrs ???
I am now seeing constant 100% IO utilization on ALL of the devices and
performance is terrible!
IOSTAT

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.37    0.00    0.34   18.59    0.00   79.70

Device:  rrqm/s  wrqm/s    r/s     w/s  rkB/s     wkB/s avgrq-sz avgqu-sz    await  r_await  w_await  svctm  %util
sda        0.00    0.00   0.00    9.50   0.00     64.00    13.47     0.01     1.16     0.00     1.16   1.11   1.05
sdb        0.00   96.50   4.50   46.50  34.00  11776.00   463.14   132.68  1174.84   782.67  1212.80  19.61 100.00
dm-0       0.00    0.00   5.50  128.00  44.00   8162.00   122.94   507.84  1704.93   674.09  1749.23   7.49 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.85    0.00    0.30   23.37    0.00   75.48

Device:  rrqm/s  wrqm/s    r/s     w/s  rkB/s     wkB/s avgrq-sz avgqu-sz    await  r_await  w_await  svctm  %util
sda        0.00    0.00   0.00    3.00   0.00     17.00    11.33     0.01     2.17     0.00     2.17   2.17   0.65
sdb        0.00   24.50   9.50   40.50  74.00  10000.00   402.96    83.44  2048.67  1086.11  2274.46  20.00 100.00
dm-0       0.00    0.00  10.00   33.50  78.00   2120.00   101.06   287.63  8590.47  1530.40 10697.96  22.99 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.81    0.00    0.30   11.40    0.00   87.48

Device:  rrqm/s  wrqm/s    r/s     w/s  rkB/s     wkB/s avgrq-sz avgqu-sz    await  r_await  w_await  svctm  %util
sda        0.00    0.00   0.00    6.00   0.00     40.25    13.42     0.01     1.33     0.00     1.33   1.25   0.75
sdb        0.00  314.50  15.50   72.00 122.00  17264.00   397.39    61.21  1013.30   740.00  1072.13  11.41  99.85
dm-0       0.00    0.00  10.00  427.00  78.00  27728.00   127.26   224.12   712.01  1147.00   701.82   2.28  99.85

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.22    0.00    0.29    4.01    0.00   94.47

Device:  rrqm/s  wrqm/s    r/s     w/s  rkB/s     wkB/s avgrq-sz avgqu-sz    await  r_await  w_await  svctm  %util
sda        0.00    0.00   0.00    3.50   0.00     17.00     9.71     0.00     1.29     0.00     1.29   1.14   0.40
sdb        0.00    0.00   1.00   39.50   8.00  10112.00   499.75    78.19  1711.83  1294.50  1722.39  24.69 100.00
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com