Hi Sake,

The load on my multi-MDS cluster is clearly much lower than yours:

  cluster:
    id:     4563ae9d-e449-4c48-a91c-6801e57e7460
    health: HEALTH_WARN
            2 MDSs report slow requests
            2 MDSs behind on trimming

  services:
    mon: 3 daemons, quorum osd43,osd44,osd45 (age 6d)
    mgr: osd43(active, since 13M), standbys: osd45, osd44
    mds: 3/3 daemons up, 2 standby
    osd: 84 osds: 84 up (since 11d), 84 in (since 5w)

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 8321 pgs
    objects: 236.03M objects, 447 TiB
    usage:   920 TiB used, 262 TiB / 1.2 PiB avail
    pgs:     8259 active+clean
             61   active+clean+scrubbing+deep
             1    active+clean+scrubbing

  io:
    client:   15 MiB/s rd, 2.9 KiB/s wr, 10 op/s rd, 1 op/s wr

I deploy and maintain many clusters, and this happens only on this one; I
haven't found a detailed solution for it so far. I've tried restarting the
MDS, but it takes hours to get back to the active state, and there are
still slow requests in the cluster.

Can I reduce it directly to a single active MDS? Will clients see
interrupted IO while the change is applied?
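For reference, what I have in mind is roughly the following (a minimal
sketch, assuming the filesystem is named "cephfs"; substitute the name
shown by "ceph fs ls"):

  # check the current active ranks and standbys
  ceph fs status cephfs

  # drop to a single active rank; ranks above 0 should export their
  # subtrees back to rank 0 and then stop on their own
  ceph fs set cephfs max_mds 1

  # watch the higher ranks wind down
  ceph fs status cephfs
  ceph status

As far as I understand, a rank being stopped first migrates its subtrees
to the remaining rank, so clients should keep their mounts, but I can't
rule out metadata requests stalling while that migration runs.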
Sake <ceph@xxxxxxxxxxx> wrote on Sun, 31 Dec 2023 at 16:57:
>
> Hi David,
>
> What does your filesystem look like? We have a few folders with a lot of
> subfolders, which are all randomly accessed, and I guess the balancer is
> moving a lot of folders between the MDS nodes.
> We noticed that multiple active MDS daemons don't work in this setup,
> with the same errors you are getting. After restarting the problematic
> MDS, everything is fine for a few hours and then the errors show up
> again. So for now we reverted to one MDS (the load is low during the
> holidays).
> The load on the cluster was also very high with multiple MDS daemons
> (1000+ IOPS and 100+ MB/s of traffic), as if it kept load balancing
> folders across the active MDS nodes. The load is currently around 500
> IOPS and 50 MB/s, or even lower.
>
> After the holidays I'm going to see what I can achieve with manually
> pinning directories to MDS ranks.
>
> Best regards,
> Sake
>
>
> On 31 Dec 2023 09:01, David Yang <gmydw1118@xxxxxxxxx> wrote:
>
> I hope this message finds you well.
>
> I have a CephFS cluster with 3 active MDS daemons, exported via a 3-node
> Samba setup using the kernel client.
>
> Currently two MDS nodes are experiencing slow requests. We have tried
> restarting the MDS; after a few hours the replaying daemons became
> active again, but the slow requests reappeared. The slow requests do not
> seem to come from the clients, but from requests between the MDS nodes.
>
> Looking forward to your prompt response.
>
> HEALTH_WARN 2 MDSs report slow requests; 2 MDSs behind on trimming
> [WRN] MDS_SLOW_REQUEST: 2 MDSs report slow requests
>     mds.osd44(mds.0): 2 slow requests are blocked > 30 secs
>     mds.osd43(mds.1): 2 slow requests are blocked > 30 secs
> [WRN] MDS_TRIM: 2 MDSs behind on trimming
>     mds.osd44(mds.0): Behind on trimming (18642/1024) max_segments: 1024, num_segments: 18642
>     mds.osd43(mds.1): Behind on trimming (976612/1024) max_segments: 1024, num_segments: 976612
>
> mds.0:
>
> {
>     "ops": [
>         {
>             "description": "peer_request:mds.1:1",
>             "initiated_at": "2023-12-31T11:19:38.679925+0800",
>             "age": 4358.8009461359998,
>             "duration": 4358.8009636369998,
>             "type_data": {
>                 "flag_point": "dispatched",
>                 "reqid": "mds.1:1",
>                 "op_type": "peer_request",
>                 "leader_info": { "leader": "1" },
>                 "events": [
>                     { "time": "2023-12-31T11:19:38.679925+0800", "event": "initiated" },
>                     { "time": "2023-12-31T11:19:38.679925+0800", "event": "throttled" },
>                     { "time": "2023-12-31T11:19:38.679925+0800", "event": "header_read" },
>                     { "time": "2023-12-31T11:19:38.679936+0800", "event": "all_read" },
>                     { "time": "2023-12-31T11:19:38.679940+0800", "event": "dispatched" }
>                 ]
>             }
>         },
>         {
>             "description": "peer_request:mds.1:2",
>             "initiated_at": "2023-12-31T11:19:38.679938+0800",
>             "age": 4358.8009326969996,
>             "duration": 4358.8009763549999,
>             "type_data": {
>                 "flag_point": "dispatched",
>                 "reqid": "mds.1:2",
>                 "op_type": "peer_request",
>                 "leader_info": { "leader": "1" },
>                 "events": [
>                     { "time": "2023-12-31T11:19:38.679938+0800", "event": "initiated" },
>                     { "time": "2023-12-31T11:19:38.679938+0800", "event": "throttled" },
>                     { "time": "2023-12-31T11:19:38.679938+0800", "event": "header_read" },
>                     { "time": "2023-12-31T11:19:38.679941+0800", "event": "all_read" },
>                     { "time": "2023-12-31T11:19:38.679991+0800", "event": "dispatched" }
>                 ]
>             }
>         }
>     ],
>     "complaint_time": 30,
>     "num_blocked_ops": 2
> }
>
> mds.1:
>
> {
>     "ops": [
>         {
>             "description": "internal op exportdir:mds.1:1",
>             "initiated_at": "2023-12-31T11:19:34.416451+0800",
>             "age": 4384.38814198,
>             "duration": 4384.3881617610004,
>             "type_data": {
>                 "flag_point": "failed to wrlock, waiting",
>                 "reqid": "mds.1:1",
>                 "op_type": "internal_op",
>                 "internal_op": 5377,
>                 "op_name": "exportdir",
>                 "events": [
>                     { "time": "2023-12-31T11:19:34.416451+0800", "event": "initiated" },
>                     { "time": "2023-12-31T11:19:34.416451+0800", "event": "throttled" },
>                     { "time": "2023-12-31T11:19:34.416451+0800", "event": "header_read" },
>                     { "time": "2023-12-31T11:19:34.416451+0800", "event": "all_read" },
>                     { "time": "2023-12-31T11:19:34.416451+0800", "event": "dispatched" },
>                     { "time": "2023-12-31T11:19:38.679923+0800", "event": "requesting remote authpins" },
>                     { "time": "2023-12-31T11:19:38.693981+0800", "event": "failed to wrlock, waiting" }
>                 ]
>             }
>         },
>         {
>             "description": "internal op exportdir:mds.1:2",
>             "initiated_at": "2023-12-31T11:19:34.416482+0800",
>             "age": 4384.3881117999999,
>             "duration": 4384.3881714600002,
>             "type_data": {
>                 "flag_point": "failed to wrlock, waiting",
>                 "reqid": "mds.1:2",
>                 "op_type": "internal_op",
>                 "internal_op": 5377,
>                 "op_name": "exportdir",
>                 "events": [
>                     { "time": "2023-12-31T11:19:34.416482+0800", "event": "initiated" },
>                     { "time": "2023-12-31T11:19:34.416482+0800", "event": "throttled" },
>                     { "time": "2023-12-31T11:19:34.416482+0800", "event": "header_read" },
>                     { "time": "2023-12-31T11:19:34.416482+0800", "event": "all_read" },
>                     { "time": "2023-12-31T11:19:34.416482+0800", "event": "dispatched" },
>                     { "time": "2023-12-31T11:19:38.679929+0800", "event": "requesting remote authpins" },
>                     { "time": "2023-12-31T11:19:38.693995+0800", "event": "failed to wrlock, waiting" }
>                 ]
>             }
>         }
>     ],
>     "complaint_time": 30,
>     "num_blocked_ops": 2
> }
>
> I can't find any other solution than restarting the MDS service that has
> the slow requests.
>
> Currently the backlog of MDS logs in the metadata pool exceeds 4 TB.
>
> Best regards,
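Coming back to the op dumps quoted above: the blocked ops look like
balancer migrations. mds.1 is stuck on internal "exportdir" (subtree
export) operations waiting for a wrlock, and the peer_requests on mds.0
belong to those same exports, which matches what Sake describes. If more
than one active MDS is kept, statically pinning the top-level directories
should avoid these migrations. A minimal sketch, assuming a kernel mount
at /mnt/cephfs and placeholder directory names (not paths from this
cluster):

  # pin everything under a directory to a single MDS rank
  # (setting the value to -1 removes the pin); paths are examples only
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/share1
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/share2

  # verify the pin
  getfattr -n ceph.dir.pin /mnt/cephfs/share1

Newer releases also offer ephemeral pinning policies (the
ceph.dir.pin.distributed and ceph.dir.pin.random xattrs), which, as I
understand it, spread subdirectories over the ranks without setting
per-directory pins.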
"2023-12-31T11:19:34.416482+0800", > "event": "all_read" > }, > { > "time": "2023-12-31T11:19:34.416482+0800", > "event": "dispatched" > }, > { > "time": "2023-12-31T11:19:38.679929+0800", > "event": "requesting remote authpins" > }, > { > "time": "2023-12-31T11:19:38.693995+0800", > "event": "failed to wrlock, waiting" > } > ] > } > } > ], > "complaint_time": 30, > "num_blocked_ops": 2 > } > > > > I can't find any other solution other than restarting the mds service > with slow requests. > > Currently, the backlog of mds logs in the metadata pool exceeds 4TB. > > Best regards, > _______________________________________________ > ceph-users mailing list -- ceph-users@xxxxxxx > To unsubscribe send an email to ceph-users-leave@xxxxxxx > > _______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx