Multi-active MDS cache pressure

Hi,

I'm currently debugging a recurring issue with multi-active MDS. The cluster is still on Nautilus and can't be upgraded at this time. There have been many discussions about "cache pressure" and I was able to find the right settings a couple of times, but before I change too much in this setup I'd like to ask for your opinion. I'll add some information at the end.

We have 16 active MDS daemons spread over 2 servers for one cephfs (8 daemons per server) with mds_cache_memory_limit = 64GB; the MDS servers are mostly idle except for some short peaks. Each of the MDS daemons uses around 2 GB according to 'ceph daemon mds.<MDS> cache status', so we're nowhere near the 64GB limit. There are currently 25 servers that mount the cephfs as clients.

Watching the ceph health output I can see that the reported clients with cache pressure change over time, so they are not actually stuck but just don't respond as quickly as the MDS would like them to (I assume). For some of the mentioned clients I see high values for .recall_caps.value in the 'daemon session ls' output (at the bottom).
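A jq one-liner like this (untested, just to illustrate what I'm looking at; the daemon name is one of ours) pulls out the sessions with the highest recall_caps on a given MDS:

---snip---
# ceph daemon mds.stmailmds01d-3 session ls | jq -r '
    sort_by(-.recall_caps.value) | .[:5][] |
    "\(.id) num_caps=\(.num_caps) recall_caps=\(.recall_caps.value | floor)"'
---snip---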

The docs basically state this: when the MDS needs to shrink its cache (to stay within mds_cache_size), it sends messages to clients to shrink their caches too. The "cache pressure" warning is raised when a client is unresponsive to those requests to release cached inodes, i.e. the client is either unresponsive or has a bug.

To me it doesn't seem like the MDS daemons are anywhere near the cache size limit, so it has to be the clients, right? In a different setup it helped to decrease client_oc_size from 200MB to 100MB, but then there's also client_cache_size with its default of 16K inodes. I'm not sure what the best approach would be here. I'd appreciate any comments on how to size the various cache/caps/threshold configurations.
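For illustration, these are the knobs I'd start with (the values are placeholders to show which options I mean, not tested recommendations):

---snip---
# ceph config set client client_oc_size 104857600    # object cacher: 100 MB instead of the 200 MB default
# ceph config set client client_cache_size 8192      # soft client inode cache limit (default 16384)
# ceph config set mds mds_recall_max_caps 10000      # caps recalled per session per tick (Nautilus default 5000)
# ceph config set mds mds_recall_max_decay_rate 1.0  # recall throttle decay rate (default 2.5)
---snip---

As far as I know the client_* options only apply to ceph-fuse/libcephfs clients and are read at mount time, so clients would have to remount to pick them up.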

Thanks!
Eugen


---snip---
# ceph daemon mds.<MDS> session ls

[...]
    "id": 2728101146,
    "entity": {
      "name": {
        "type": "client",
        "num": 2728101146
      },
[...]
        "nonce": 1105499797
      }
    },
    "state": "open",
    "num_leases": 0,
    "num_caps": 16158,
    "request_load_avg": 0,
    "uptime": 1118066.210318422,
    "requests_in_flight": 0,
    "completed_requests": [],
    "reconnecting": false,
    "recall_caps": {
      "value": 788916.8276369586,
      "halflife": 60
    },
    "release_caps": {
      "value": 8.814981576458962,
      "halflife": 60
    },
    "recall_caps_throttle": {
      "value": 27379.27162576508,
      "halflife": 1.5
    },
    "recall_caps_throttle2o": {
      "value": 5382.261925615086,
      "halflife": 0.5
    },
    "session_cache_liveness": {
      "value": 12.91841737465921,
      "halflife": 300
    },
    "cap_acquisition": {
      "value": 0,
      "halflife": 10
    },
[...]
    "used_inos": [],
    "client_metadata": {
      "features": "0x0000000000003bff",
      "entity_id": "cephfs_client",


# ceph fs status

cephfs - 25 clients
======
+------+--------+----------------+---------------+-------+-------+
| Rank | State  |      MDS       |    Activity   |  dns  |  inos |
+------+--------+----------------+---------------+-------+-------+
|  0   | active | stmailmds01d-3 | Reqs:   89 /s |  375k |  371k |
|  1   | active | stmailmds01d-4 | Reqs:   64 /s |  386k |  383k |
|  2   | active | stmailmds01a-3 | Reqs:    9 /s |  403k |  399k |
|  3   | active | stmailmds01a-8 | Reqs:   23 /s |  393k |  390k |
|  4   | active | stmailmds01a-2 | Reqs:   36 /s |  391k |  387k |
|  5   | active | stmailmds01a-4 | Reqs:   57 /s |  394k |  390k |
|  6   | active | stmailmds01a-6 | Reqs:   50 /s |  395k |  391k |
|  7   | active | stmailmds01d-5 | Reqs:   37 /s |  384k |  380k |
|  8   | active | stmailmds01a-5 | Reqs:   39 /s |  397k |  394k |
|  9   | active |  stmailmds01a  | Reqs:   23 /s |  400k |  396k |
|  10  | active | stmailmds01d-8 | Reqs:   74 /s |  402k |  399k |
|  11  | active | stmailmds01d-6 | Reqs:   37 /s |  399k |  395k |
|  12  | active |  stmailmds01d  | Reqs:   36 /s |  394k |  390k |
|  13  | active | stmailmds01d-7 | Reqs:   80 /s |  397k |  393k |
|  14  | active | stmailmds01d-2 | Reqs:   56 /s |  414k |  410k |
|  15  | active | stmailmds01a-7 | Reqs:   25 /s |  390k |  387k |
+------+--------+----------------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 25.4G | 16.1T |
|   cephfs_data   |   data   | 2078G | 16.1T |
+-----------------+----------+-------+-------+
+----------------+
|  Standby MDS   |
+----------------+
| stmailmds01b-5 |
| stmailmds01b-2 |
| stmailmds01b-3 |
|  stmailmds01b  |
| stmailmds01b-7 |
| stmailmds01b-8 |
| stmailmds01b-6 |
| stmailmds01b-4 |
+----------------+
MDS version: ceph version 14.2.22-404-gf74e15c2e55 (f74e15c2e552b3359f5a51482dfd8b049e262743) nautilus (stable)
---snip---

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


