Re: MDS daemons stuck in resolve, please help

Dan van der Ster <dan@xxxxxxxxxxxxxx> · Tue, 31 Aug 2021 15:26:17 +0200

Hi Frank,

It helps if you start threads reminding us which version you're running.

In nautilus, the caps recall issue (which is AFAIK the main cause
of mds cache overruns) should have been fixed by this PR:
https://github.com/ceph/ceph/pull/39134/files
If you're not running >= 14.2.17 then you should probably just apply
the settings from that PR all together. (Don't worry about the order
in which they are set -- just make the changes within a short window).
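
A minimal sketch of applying the settings in one go. The option values
are an assumption: they are the new defaults as recalled from that PR,
and only 131072 for mds_recall_max_decay_threshold is confirmed in this
thread, so check each one against the PR diff first. The function names
and the APPLY switch are made up for this sketch. By default it only
prints the commands; set APPLY=1 to actually run them.

```shell
#!/bin/sh
# Sketch: set the caps-recall options within one short window.
# ASSUMPTION: these values are the post-PR defaults as recalled here;
# verify each against the PR diff before applying.
recall_settings() {
    cat <<'EOF'
mds_recall_max_caps 30000
mds_recall_max_decay_threshold 131072
mds_recall_global_max_decay_threshold 131072
mds_recall_warning_threshold 262144
mds_cache_trim_threshold 262144
EOF
}

# Print one "ceph config set" command per option.
# The command is executed only when APPLY=1 is set in the environment.
apply_recall_settings() {
    recall_settings | while read -r opt val; do
        cmd="ceph config set mds $opt $val"
        echo "$cmd"
        if [ "${APPLY:-0}" = "1" ]; then
            $cmd
        fi
    done
}

apply_recall_settings
```

Reviewing the printed commands before re-running with APPLY=1 keeps the
whole change inside a short window without typing each option by hand.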

Also, to try to understand your MDS issues -- are you using pinning or
letting metadata move around between MDSs?
A client running find / might wreak havoc if you aren't pinning.
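
For reference, pinning is done by setting the ceph.dir.pin extended
attribute on a directory. The sketch below only prints the setfattr
commands for a round-robin assignment; the mount point and directory
names are placeholders, the function name is made up, and the rank
count of 4 is taken from the fs status output quoted below.

```shell
#!/bin/sh
# Sketch: pin directories to MDS ranks round-robin (dry run, prints only).
# ASSUMPTIONS: /mnt/cephfs/... paths are hypothetical placeholders;
# 4 ranks matches ranks 0-3 in the quoted "ceph fs status" output.
pin_commands() {
    nranks=$1
    shift
    i=0
    for dir in "$@"; do
        # ceph.dir.pin takes the MDS rank that should own the subtree
        echo "setfattr -n ceph.dir.pin -v $((i % nranks)) $dir"
        i=$((i + 1))
    done
}

pin_commands 4 /mnt/cephfs/home /mnt/cephfs/groups /mnt/cephfs/scratch
```

Run the printed commands on a client with the file system mounted once
the assignment looks right.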

-- dan

On Tue, Aug 31, 2021 at 2:13 PM Frank Schilder <frans@xxxxxx> wrote:
>
> I seem to be hit by the problem discussed here: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/AOYWQSONTFROPB4DXVYADWW7V25C3G6Z/
>
> In my case, what helped get the cache size growth somewhat under control was
>
>     ceph config set mds mds_recall_max_caps 10000
>
> I'm not sure about the options mds_recall_max_decay_threshold and mds_recall_max_decay_rate. The description I found is quite vague about the effect of these and the defaults also don't match (mimic output):
>
> # ceph config help mds_recall_max_caps
> mds_recall_max_caps - maximum number of caps to recall from client session in single recall
>   (size_t, advanced)
>   Default: 5000
>   Can update at runtime: true
>   Services: [mds]
>
> # ceph config help mds_recall_max_decay_threshold
> mds_recall_max_decay_threshold - decay threshold for throttle on recalled caps on a session
>   (size_t, advanced)
>   Default: 16384
>   Can update at runtime: true
>   Services: [mds]
>
> # ceph config help mds_recall_max_decay_rate
> mds_recall_max_decay_rate - decay rate for throttle on recalled caps on a session
>   (double, advanced)
>   Default: 2.500000
>   Can update at runtime: true
>   Services: [mds]
>
> I assume higher mds_recall_max_decay_threshold and lower mds_recall_max_decay_rate increase speed of caps recall? What increments would be safe to use? For example, is it really a good idea to go from 16384 to the new default 131072 in one go?
>
> Thanks for any advice and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <frans@xxxxxx>
> Sent: 30 August 2021 21:37:18
> To: ceph-users
> Subject:  Re: MDS daemons stuck in resolve, please help
>
> The MDS cluster came back up again, but I lost a number of standby MDS daemons. I cleared the OSD blacklist, but they do not show up as stand-by daemons again. The daemon itself is running, but does not seem to re-join the cluster. The log shows:
>
> 2021-08-30 21:32:34.896 7fc9e22f8700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
> 2021-08-30 21:32:39.896 7fc9e22f8700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
> 2021-08-30 21:32:44.896 7fc9e22f8700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
> 2021-08-30 21:32:49.897 7fc9e22f8700  1 heartbeat_map is_healthy 'MDSRank' had timed out after 15
>
> I just had another frenzy of MDS fail-overs and am running out of standby daemons. A restart of a "missing" daemon brings it back to life, but I would prefer this to work by itself. Any ideas on what's going on are welcome.
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <frans@xxxxxx>
> Sent: 30 August 2021 21:12:53
> To: ceph-users
> Subject:  MDS daemons stuck in resolve, please help
>
> Hi all,
>
> our MDS cluster got degraded after an MDS had an oversized cache and crashed. Other MDS daemons followed suit and now they are stuck in this state:
>
> [root@gnosis ~]# ceph fs status
> con-fs2 - 1640 clients
> =======
> +------+---------+---------+---------------+-------+-------+
> | Rank |  State  |   MDS   |    Activity   |  dns  |  inos |
> +------+---------+---------+---------------+-------+-------+
> |  0   | resolve | ceph-24 |               | 22.1k | 22.0k |
> |  1   | resolve | ceph-13 |               |  769k |  758k |
> |  2   |  active | ceph-16 | Reqs:    0 /s |  255k |  255k |
> |  3   | resolve | ceph-09 |               | 5624  | 5619  |
> +------+---------+---------+---------------+-------+-------+
> +---------------------+----------+-------+-------+
> |         Pool        |   type   |  used | avail |
> +---------------------+----------+-------+-------+
> |    con-fs2-meta1    | metadata | 1828M | 1767G |
> |    con-fs2-meta2    |   data   |    0  | 1767G |
> |     con-fs2-data    |   data   | 1363T | 6049T |
> | con-fs2-data-ec-ssd |   data   |  239G | 4241G |
> |    con-fs2-data2    |   data   | 10.2T | 5499T |
> +---------------------+----------+-------+-------+
> +-------------+
> | Standby MDS |
> +-------------+
> |   ceph-12   |
> |   ceph-08   |
> |   ceph-23   |
> |   ceph-11   |
> +-------------+
>
> I tried to set max_mds to 1 to no avail. How can I get the MDS daemons back up?
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx