Hi all,
I have a cluster running cephfs on Luminous 12.2.4, using 2 active MDSes + 1 standby. I have 3 shares: /projects, /home and /scratch, and I've decided to try manual pinning as described here: http://docs.ceph.com/docs/master/cephfs/multimds/
/projects is pinned to mds.0 (rank 0); /home and /scratch are pinned to mds.1 (rank 1).
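For reference, the pins were set via the ceph.dir.pin virtual extended attribute, something like the following (a sketch; the /mnt/cephfs mount point is an example, not our actual path):

```shell
# Pin /projects to rank 0, /home and /scratch to rank 1
# (assumes the filesystem is mounted at /mnt/cephfs -- example path)
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projects
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/home
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/scratch

# Verify the pin took effect
getfattr -n ceph.dir.pin /mnt/cephfs/projects
```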
Clients mount either via ceph-fuse 12.2.4, or kernel client 4.15.13.
On our test cluster (same version and setup), it works as I think it should. I simulate metadata load via mdtest (up to around 2000 req/s on each MDS, each of which is a VM with 4 cores and 16GB RAM): loads on /projects go to mds.0, and loads on the other shares go to mds.1. Nothing pops up in the logs. I can also successfully reset to no pinning (i.e. the default load balancing) by setting the ceph.dir.pin value to -1, and vice versa. All that happens is this showing up in the logs:

.... mds.mds1-test-ceph2 asok_command: get subtrees (complete)
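The unpin-and-check sequence is just the following (a sketch; the mount point and the mds.mds1-test-ceph2 daemon name are examples from our test cluster):

```shell
# Clear the pin so the directory falls back to the default balancer
setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/home

# Ask the MDS (via its admin socket) which subtrees it knows about
# and which rank is authoritative for each
ceph daemon mds.mds1-test-ceph2 get subtrees
```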
However, on our production cluster, with more powerful MDSes (10 cores @ 3.4GHz, 256GB RAM, much faster networking), I get this in the logs constantly:

2018-04-24 16:29:21.998261 7f02d1af9700 0 mds.1.migrator nicely exporting to mds.0 [dir 0x1000010cd91.1110* /home/ [2,head] auth{0=1017} v=5632699 cv=5632651/5632651 dir_auth=1 state=1611923458|complete|auxsubtree f(v84 55=0+55) n(v245771 rc2018-04-24 16:28:32.830971 b233439385711 423085=383063+40022) hs=55+0,ss=0+0 dirty=1 | child=1 frozen=0 subtree=1 replicated=1 dirty=1 authpin=0 0x55691ccf1c00]
Sometimes (depending on which MDS starts first), I get the same message the other way around, i.e. "mds.0.migrator nicely exporting to mds.1" for the workload that mds.0 should be handling. The message only ever appears on one MDS, never the other, until one of them is restarted.
And we've had a couple of occasions where we get slow requests like this:

7fd401126700 0 log_channel(cluster) log [WRN] : slow request 7681.127406 seconds old, received at 2018-04-20 08:17:35.970498: client_request(client.875554:238655 lookup #0x10038ff1eab/punim0116 2018-04-20 08:17:35.970319 caller_uid=10171, caller_gid=10000{10000,10123,}) currently failed to authpin local pins
This then seems to snowball into thousands of slow requests until mds.0 is restarted. When these slow requests happen, load is fairly low on the active MDSes, although it's possible the users are doing something funky with metadata on production that I can't reproduce with mdtest.
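In case it helps with diagnosis, when the slow requests pile up the stuck operations can be inspected on the affected MDS with something like this (the mds.mds1-prod daemon name is a placeholder, not our real one):

```shell
# Dump the requests currently in flight on the stuck MDS,
# run on the host where that MDS daemon lives
ceph daemon mds.mds1-prod dump_ops_in_flight

# Show the cluster-wide health detail, which lists the slow requests
ceph health detail
```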
Am I doing manual pinning right? Should I even be using it?

Cheers,
Linh
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com