Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)


There is no definitive answer with regard to MDS tuning. As is mentioned everywhere, it's about finding the right setup for your specific workload. If you can synthesize your workload (maybe scaled down a bit), try optimizing it in a test cluster without interrupting your developers too much.

But what you haven't explained yet: what exactly are you experiencing as a performance issue? Do you have numbers or a detailed description? Judging from the 'fs status' output, you didn't seem to have much activity going on (around 140 requests per second), but that's probably not the usual traffic? What does ceph report in its client IO output?
Can you paste the 'ceph osd df' output as well?
Do you have dedicated MDS servers, or are they colocated with other services?
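For reference, the information asked for above can be gathered with the standard Ceph CLI on any node with admin access (a sketch; output formats vary slightly between releases):

```shell
# Per-MDS state and request rates, as referenced above
ceph fs status
# Overall health, including the "client io" line
ceph status
# Per-OSD utilization and PG distribution
ceph osd df
```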

Quoting Özkan Göksu <ozkangksu@xxxxxxxxx>:

Hello Eugen.

I read all of your MDS-related topics, and thank you so much for your effort
on this.
There is not much information available, and I couldn't find an MDS tuning
guide at all. It seems that you are the right person to discuss MDS
debugging and tuning with.

Do you have any documents, or could you tell me the proper way to debug
the MDS and clients?
Which debug logs will guide me to understand the limitations and help me
tune according to the data flow?
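Not an authoritative guide, but a common starting point is temporarily raising the MDS debug level via the central config store and watching the daemon log (a sketch; higher levels are very chatty, so revert afterwards):

```shell
# Raise debug verbosity for all MDS daemons (10 is already quite verbose)
ceph config set mds debug_mds 10
ceph config set mds debug_ms 1
# ...inspect /var/log/ceph/ceph-mds.<name>.log on the MDS host, then revert
# to the built-in defaults:
ceph config rm mds debug_mds
ceph config rm mds debug_ms
```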

While searching, I found this:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YO4SGL4DJQ6EKUBUIHKTFSW72ZJ3XLZS/
quote:"A user running VSCodium, keeping 15k caps open.. the opportunistic
caps recall eventually starts recalling those but the (el7 kernel) client
won't release them. Stopping Codium seems to be the only way to release."
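Per-client cap counts like the 15k mentioned in that quote can be inspected on the MDS side, e.g. (a sketch; 'mds.0' stands for whichever rank/daemon is active in your cluster, and the exact field names may differ between releases):

```shell
# List client sessions on the MDS, including how many caps each holds
ceph tell mds.0 session ls
# With jq installed, reduce the output to client id and cap count
ceph tell mds.0 session ls | jq '.[] | {id: .id, num_caps: .num_caps}'
```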

Because of this, I think I also need to experiment on the client side.

My main goal is increasing speed and reducing latency, and I wonder
whether these ideas are correct:
- Maybe I need to increase the client-side cache size, because multiple
users request a lot of objects through each client, and the
client_cache_size=16 default is clearly not enough.
- Maybe I need to increase the client-side maximum cache limits for
objects (client_oc_max_objects=1000 to 10000) and data (client_oc_size=200Mi
to 400Mi).
- The client cache cleaning threshold is not aggressive enough to keep the
free cache size in the desired range. I need to make it more aggressive,
but this should not reduce speed or increase latency.

mds_cache_memory_limit=4Gi to 16Gi
client_oc_max_objects=1000 to 10000
client_oc_size=200Mi to 400Mi
client_permissions=false  # to reduce latency
client_cache_size=16 to 128
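If you do experiment with these, they can be applied centrally and reverted again, e.g. (a sketch using byte values for the sizes; note that the client_oc_* and client_cache_size options apply to ceph-fuse/libcephfs clients, not the kernel client, and whether they help is workload-dependent):

```shell
# Apply the proposed values via the central config store
ceph config set mds mds_cache_memory_limit 17179869184   # 16 GiB
ceph config set client client_oc_max_objects 10000
ceph config set client client_oc_size 419430400          # 400 MiB
ceph config set client client_permissions false
ceph config set client client_cache_size 128
# Revert any of them with: ceph config rm <who> <option>
```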


What do you think?


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
