Hi Tyler,
I suspect you have BlueStore DB/WAL on these drives as well, don't you?
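You can double-check what each OSD actually has for block/DB/WAL with
ceph-bluestore-tool, e.g. (the OSD path below is just an example):

ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0

It prints the labels for block, block.db and block.wal if they are present.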
If so, perhaps you are hitting performance issues with f[data]sync requests,
which DB/WAL invoke pretty frequently.
See the following links for details:
https://www.percona.com/blog/2018/02/08/fsync-performance-storage-devices/
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
The latter link shows pretty poor numbers for M500DC drives.
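If you want to reproduce such a test yourself, it boils down to a single-job
4k sync write with fio, roughly like this (the device path is only a
placeholder, and the run destroys data on that device):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test

Drives that are good DB/WAL candidates usually sustain thousands of such
IOPS; a few hundred or less tends to mean trouble.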
Thanks,
Igor
On 12/11/2018 4:58 AM, Tyler Bishop wrote:
Older Crucial/Micron M500/M600
_____________________________________________
Tyler Bishop
EST 2007
O: 513-299-7108 x1000
On Mon, Dec 10, 2018 at 8:57 PM Christian Balzer <chibi@xxxxxxx> wrote:
Hello,
On Mon, 10 Dec 2018 20:43:40 -0500 Tyler Bishop wrote:
> I don't think that's my issue here because I don't see any IO to justify
> the latency. Unless the IO is minimal and it's Ceph issuing a bunch of
> discards to the SSD, and that's causing it to slow down while doing that.
>
What does atop have to say?
Discards/trims are usually visible in it; this is during an fstrim of a
RAID1 /:
---
DSK | sdb | busy 81% | read 0 | write 8587 | MBw/s 2323.4 | avio 0.47 ms |
DSK | sda | busy 70% | read 2 | write 8587 | MBw/s 2323.4 | avio 0.41 ms |
---
The numbers tend to be a lot higher than what the actual interface is
capable of; clearly the SSD is reporting its internal activity.
In any case, it should give good insight into what is going on
activity-wise.
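As for discards from Ceph itself: if I remember correctly, bluestore only
issues them when bdev_enable_discard is set (it defaults to off). You can
check that on a live OSD with something like (osd id is just an example):

ceph daemon osd.0 config get bdev_enable_discard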
Also for posterity and curiosity, what kind of SSDs?
Christian
> Log isn't showing anything useful and I have most debugging disabled.
>
>
>
> On Mon, Dec 10, 2018 at 7:43 PM Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>
> > Hi Tyler,
> >
> > I think we had a user a while back that reported they had background
> > deletion work going on after upgrading their OSDs from filestore to
> > bluestore, due to PGs having been moved around. Is it possible that your
> > cluster is doing a bunch of work (deletion or otherwise) beyond the
> > regular client load? I don't remember how to check for this off the top
> > of my head, but it might be something to investigate. If that's what it
> > is, we just recently added the ability to throttle background deletes:
> >
> > https://github.com/ceph/ceph/pull/24749
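> >
> > If memory serves, the knob that adds is osd_delete_sleep (please check the
> > PR for the exact name); with a build that has it, you can adjust it at
> > runtime, e.g.:
> >
> > ceph tell osd.* injectargs '--osd_delete_sleep 1'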
> >
> >
> > If the logs/admin socket don't tell you anything, you could also try
> > using our wallclock profiler to see what the OSD is spending its time
> > doing:
> >
> > https://github.com/markhpc/gdbpmp/
> >
> >
> > ./gdbpmp -t 1000 -p`pidof ceph-osd` -o foo.gdbpmp
> >
> > ./gdbpmp -i foo.gdbpmp -t 1
> >
> >
> > Mark
> >
> > On 12/10/18 6:09 PM, Tyler Bishop wrote:
> > > Hi,
> > >
> > > I have an SSD-only cluster that I recently converted from filestore to
> > > bluestore, and performance has totally tanked. It was fairly decent
> > > before, with only a little more latency than expected. Since converting
> > > to bluestore the latency is extremely high, SECONDS. I am trying to
> > > determine if it is an issue with the SSDs or with Bluestore treating
> > > them differently than filestore... potential garbage collection? 24+
> > > hrs ???
> > >
> > > I am now seeing constant 100% IO utilization on ALL of the devices and
> > > performance is terrible!
> > >
> > > IOSTAT
> > >
> > > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> > >            1.37    0.00    0.34   18.59    0.00   79.70
> > >
> > > Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> > > sda        0.00    0.00   0.00    9.50    0.00     64.00    13.47     0.01    1.16    0.00    1.16   1.11   1.05
> > > sdb        0.00   96.50   4.50   46.50   34.00  11776.00   463.14   132.68 1174.84  782.67 1212.80  19.61 100.00
> > > dm-0       0.00    0.00   5.50  128.00   44.00   8162.00   122.94   507.84 1704.93  674.09 1749.23   7.49 100.00
> > >
> > > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> > >            0.85    0.00    0.30   23.37    0.00   75.48
> > >
> > > Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> > > sda        0.00    0.00   0.00    3.00    0.00     17.00    11.33     0.01    2.17    0.00    2.17   2.17   0.65
> > > sdb        0.00   24.50   9.50   40.50   74.00  10000.00   402.96    83.44 2048.67 1086.11 2274.46  20.00 100.00
> > > dm-0       0.00    0.00  10.00   33.50   78.00   2120.00   101.06   287.63 8590.47 1530.40 10697.96 22.99 100.00
> > >
> > > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> > >            0.81    0.00    0.30   11.40    0.00   87.48
> > >
> > > Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> > > sda        0.00    0.00   0.00    6.00    0.00     40.25    13.42     0.01    1.33    0.00    1.33   1.25   0.75
> > > sdb        0.00  314.50  15.50   72.00  122.00  17264.00   397.39    61.21 1013.30  740.00 1072.13  11.41  99.85
> > > dm-0       0.00    0.00  10.00  427.00   78.00  27728.00   127.26   224.12  712.01 1147.00  701.82   2.28  99.85
> > >
> > > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> > >            1.22    0.00    0.29    4.01    0.00   94.47
> > >
> > > Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> > > sda        0.00    0.00   0.00    3.50    0.00     17.00     9.71     0.00    1.29    0.00    1.29   1.14   0.40
> > > sdb        0.00    0.00   1.00   39.50    8.00  10112.00   499.75    78.19 1711.83 1294.50 1722.39  24.69 100.00
> > >
> > >
> >
--
Christian Balzer            Network/Systems Engineer
chibi@xxxxxxx               Rakuten Communications
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com