Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

Hi Chris,

I would strongly advise against using multi-MDS with 5000 clients on luminous. I enabled it on mimic with about 1750 clients, and it depended heavily on luck whether it converged to a stable distribution of dirfrags or ended up doing export_dir operations all the time, completely killing FS performance. Also, even on mimic, where multi-MDS is no longer experimental, it still has a lot of bugs. You will need to monitor the cluster tightly and might be forced to intervene regularly, including going back and forth between single- and multi-MDS (see the sketch below).
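
For watching whether you are in an export storm, I poll the MDS perf counters. A rough, untested Python sketch of the idea (assumptions: you run it on the MDS node, the daemon id is mds.0, and your version exposes the exported_inodes counter in the "mds" section of perf dump):

    # watch_exports.py -- poll the MDS admin socket and print the rate of
    # exported inodes; a persistently high rate means export_dir thrashing.
    import json, subprocess, time

    MDS = "mds.0"  # assumption: adjust to your daemon id

    def exported_inodes():
        out = subprocess.check_output(["ceph", "daemon", MDS, "perf", "dump"])
        return json.loads(out)["mds"]["exported_inodes"]

    prev = exported_inodes()
    while True:
        time.sleep(10)
        cur = exported_inodes()
        print("exported_inodes/s: %.1f" % ((cur - prev) / 10.0))
        prev = cur

Going back to single MDS is then "ceph fs set <fs> max_mds 1" (on luminous you additionally have to deactivate the extra ranks with "ceph mds deactivate").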

My recommendation would be to upgrade to octopus as fast as possible. It's the first version that supports ephemeral pinning, which I would say is pretty much the most useful multi-MDS mode, because it uses a static dirfrag distribution over all MDSes, avoiding the painful export_dir operations.
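
Once you are on octopus and have enabled the MDS option mds_export_ephemeral_distributed (it is off by default there), turning it on for a directory tree is a single xattr on a client mount. A minimal sketch, assuming /mnt/cephfs/home is your mount point:

    # pin_home.py -- enable distributed ephemeral pinning on a directory.
    # Assumptions: /mnt/cephfs/home is a CephFS mount on this client and
    # mds_export_ephemeral_distributed is enabled on the MDSes.
    import os

    # Equivalent to: setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/home
    os.setxattr("/mnt/cephfs/home", "ceph.dir.pin.distributed", b"1")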

You are in the unlucky situation that you will need two upgrades. I think going L->M->O might be the least painful, as it requires only one OSD conversion. If you are a bit more adventurous, you could also aim for L->N->P. Nautilus will probably not solve your performance issue, and any path including nautilus comes with an extra OSD conversion. However, if you are using filestore, you might want to go this route and switch from filestore to bluestore by re-deploying the OSDs once you are on pacific. Upgraded OSDs will get you out of some performance issues, and pacific has fixes for a boatload of FS snapshot issues.
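
To see which route applies to you, you can count what your OSDs actually run on. A quick sketch using "ceph osd metadata" (the osd_objectstore field should already be present on luminous):

    # count_objectstores.py -- count filestore vs bluestore OSDs.
    import json, subprocess
    from collections import Counter

    meta = json.loads(subprocess.check_output(["ceph", "osd", "metadata"]))
    print(Counter(osd.get("osd_objectstore", "unknown") for osd in meta))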

In the meantime, can you roll out something like ganglia on all client and storage nodes and collect network traffic stats? I found the packet report combined with bytes in/out extremely useful for hunting down rogue FS clients. If you use snapshots, kworker CPU load and IO wait on the client node are also indicative of problems with that client.
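
Regarding request_load_avg itself: it shows up per client in the MDS session list, and it is exactly what I would sort by to find the noisy clients. Another untested sketch (assumptions: run on the active MDS node, daemon id mds.0, and the session entries in your luminous version carry request_load_avg and client_metadata):

    # top_clients.py -- rank CephFS clients by request_load_avg.
    import json, subprocess

    MDS = "mds.0"  # assumption: adjust to your daemon id
    sessions = json.loads(subprocess.check_output(
        ["ceph", "daemon", MDS, "session", "ls"]))
    sessions.sort(key=lambda s: s.get("request_load_avg", 0), reverse=True)
    for s in sessions[:10]:
        host = s.get("client_metadata", {}).get("hostname", "?")
        print(s["id"], host, s.get("request_load_avg"))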

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14