Re: Persistent problem with slow metadata

On Mon, 2020-08-31 at 14:36 +0000, Eugen Block wrote:
> > Disks are utilized roughly between 70 and 80 percent. I'm not sure
> > why operations would slow down as disks become more utilized.
> > If that were the case, I'd expect Ceph to issue a warning.
> 
> It is warning you, that's why you see slow requests. ;-) But just to
> be clear, by utilization I mean more than just the filling level of
> the OSD. Have you watched iostat (or something similar) for your disks
> during usual and high load? Heavy metadata operation on rocksDB
> increases the load on the main device. I'm not sure if you mentioned
> it before, do you have stand-alone OSDs or ones with faster db devices?
> I believe you only mentioned cephfs_metadata on SSD.

Indeed, the DB is stored on the HDDs (no separate db device); only the
metadata pool resides on SSDs.
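
For what it's worth, this is roughly how I've been checking both things
(the OSD id below is just an example from my cluster):

  # does this OSD have a dedicated (non-rotational) DB device?
  ceph osd metadata 0 | grep -E 'bluefs_dedicated_db|bluefs_db_rotational|devices'

  # per-device utilization and latency during usual and high load
  iostat -x 5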

I accidentally stumbled upon someone mentioning that disk write caching
should be disabled to increase performance.
I'm now looking into how to configure that on these controllers:
- PERC H730P Adapter (has cache on the controller)
- Dell HBA330 Adapter (doesn't have cache on the controller)

It is not as easy as running the "hdparm" command :(
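
In case it helps anyone else, this is the direction I'm exploring. For
drives behind the HBA330 the generic tools should apply, while the H730P
presumably needs Dell's own utility; the commands below are untested on
my hardware and the device name is just a placeholder:

  # SATA drive: query, then disable, the on-disk write cache
  hdparm -W /dev/sdb
  hdparm -W 0 /dev/sdb

  # SAS drive: the write cache enable (WCE) bit via sdparm
  sdparm --get=WCE /dev/sdb
  sdparm --clear=WCE /dev/sdb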

I'll also look into the code that does the "truncating", as it may not
be Ceph-friendly :/
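
To see whether those truncates are actually what piles up, I plan to
dump the in-flight operations on the active MDS while slow requests are
being reported (the daemon name below is just my host as an example):

  # run on the node hosting the active MDS
  ceph daemon mds.cephosd01 dump_ops_in_flight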

> > Have I understood correctly that the expectation is that if I used
> > larger drives I wouldn't be seeing these warnings?
> > I can understand that adding more disks would create better
> > parallelisation; that's why I'm asking about larger drives.
> 
> I don't think larger drives would improve that, probably even the  
> opposite, depending on the drives, of course. More drives should  
> scale, yes, but there's more to it.
> 
> 
> Quoting Momčilo Medić <fedorauser@xxxxxxxxxxxxxxxxx>:
> 
> > Hey Eugen,
> > 
> > On Wed, 2020-08-26 at 09:29 +0000, Eugen Block wrote:
> > > Hi,
> > > 
> > > > > root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue
> > > > > wpq
> > > > > root@cephosd01:~# ceph config get mds.cephosd01 osd_op_queue_cut_off
> > > > > high
> > > 
> > > just to make sure, I referred to OSD not MDS settings, maybe check
> > > again?
> > 
> > root@cephosd01:~# ceph config get osd.* osd_op_queue
> > wpq
> > root@cephosd01:~# ceph config get osd.* osd_op_queue_cut_off
> > high
> > root@cephosd01:~# ceph config get mon.* osd_op_queue
> > wpq
> > root@cephosd01:~# ceph config get mon.* osd_op_queue_cut_off
> > high
> > root@cephosd01:~# ceph config get mds.* osd_op_queue
> > wpq
> > root@cephosd01:~# ceph config get mds.* osd_op_queue_cut_off
> > high
> > root@cephosd01:~#
> > 
> > It seems that no matter which daemon I query, the settings are the same.
> > Also, the OSD documentation clearly states[1] that these are the
> > defaults.
> > 
> > > I wouldn't focus too much on the MDS service, 64 GB RAM should be
> > > enough, but you could and should also check the actual RAM usage, of
> > > course. But in our case it's pretty clear that the hard disks are the
> > > bottleneck although we have rocksDB on SSD for all OSDs. We seem to
> > > have a similar use case (we have nightly compile jobs running in
> > > cephfs) just with fewer clients. Our HDDs are saturated especially if
> > > we also run deep-scrubs during the night, but the slow requests have
> > > been reduced since we changed the osd_op_queue settings for our OSDs.
> > > 
> > > Have you checked your disk utilization?
> > 
> > Disks are utilized roughly between 70 and 80 percent. I'm not sure
> > why operations would slow down as disks become more utilized.
> > If that were the case, I'd expect Ceph to issue a warning.
> > 
> > Have I understood correctly that the expectation is that if I used
> > larger drives I wouldn't be seeing these warnings?
> > I can understand that adding more disks would create better
> > parallelisation; that's why I'm asking about larger drives.
> > 
> > Thank you for discussing this with me, it's highly appreciated.
> > 
> > <snip>
> > 
> > [1] https://docs.ceph.com/docs/master/rados/configuration/osd-config-ref/#operations
> > 
> > Kind regards,
> > Momo.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx