Re: MDS stuck in stopping state

Hi Gregory,

there might be 2 separate problems. The race condition (1) caused by our workload is something I cannot look at easily; I might get back to it next year. The issue (2) with the blocked stray-bucket ops might be related to us having orphaned deleted snapshots on the file system. It shows up as large-omap warnings (the affected objects are under /strayN) and as untrimmed deleted snaps on one of our data pools (con-fs2-data2):

# ceph osd pool ls detail -f json-pretty | jq '.[] | .pool_name, .removed_snaps'
"con-fs2-meta1"
"[]"
"con-fs2-meta2"
"[2~18e,191~2c,1be~2b6,475~1,477~1,479~1,47b~1,47d~1,47f~1,481~1,483~1,485~1,487~1]"
"con-fs2-data"
"[2~18e,191~2c,1be~2b6,475~1,477~1,479~1,47b~1,47d~1,47f~1,481~1,483~1,485~1,487~1]"
"con-fs2-data-ec-ssd"
"[2~18e,191~2c,1be~2b6,475~1,477~1,479~1,47b~1,47d~1,47f~1,481~1,483~1,485~1,487~1]"
"con-fs2-data2"
"[2d6~1,2d8~1,2da~1,2dc~1,2de~1,2e0~1,2e2~1,2e4~1,2e6~1,2e8~1,2ea~18a,475~1,477~1,479~1,47b~1,47d~1,47f~1,481~1,483~1,485~1,487~1]"
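As far as I understand the output format (this is my own reading of it, not from an official tool), each removed_snaps entry is a start~length pair in hex, so the number of snap IDs still pending removal can be counted with a small helper:

```python
def count_removed_snaps(intervals: str) -> int:
    """Count snap IDs in a removed_snaps string like "[2~18e,191~2c]".

    Each comma-separated entry is start~length, both hexadecimal;
    the length is the number of snap IDs covered by the interval.
    """
    s = intervals.strip("[]")
    if not s:
        return 0
    total = 0
    for entry in s.split(","):
        _start, length = entry.split("~")
        total += int(length, 16)
    return total

# con-fs2-data2 from the listing above
print(count_removed_snaps(
    "[2d6~1,2d8~1,2da~1,2dc~1,2de~1,2e0~1,2e2~1,2e4~1,2e6~1,2e8~1,"
    "2ea~18a,475~1,477~1,479~1,47b~1,47d~1,47f~1,481~1,483~1,485~1,487~1]"
))  # → 414
```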

They seem to take up quite some space in the stray buckets:

# ssh ceph-17 'ceph daemon mds.$(hostname -s) perf dump | grep num_strays\"'
        "num_strays": 793078,

These are related conversations:

- user list thread: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/HKEBXXRMX5WA5Y6JFM34WFPMWTCMPFCG
- starting at message: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/6S5GTKGGBI2Y3QE4E5XJJY2KSSLLX64H/
- more detailed message: https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/IZQS4TSVZJQMA33ZBBMFUUW7ODKY7FMZ/
- tracker issue: https://tracker.ceph.com/issues/52581

I would like to clean this up before upgrading to octopus next year. Unfortunately, I haven't found a resolution so far. Do you know of a way to schedule these snapshots for trimming, or to make the system otherwise aware of their presence? Can I do this manually somehow?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Gregory Farnum <gfarnum@xxxxxxxxxx>
Sent: 13 December 2021 18:22:50
To: Frank Schilder
Cc: ceph-users
Subject: Re:  MDS stuck in stopping state

Hmm. Glad it's working now, at least.

On Mon, Dec 13, 2021 at 9:10 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Dear Gregory,
>
> thanks for your fast response. The situation started worsening shortly after I sent my e-mail and I had to take action. More operations got stuck in the active MDS, leading to a failure of journal trimming. I more or less went through all combinations of what you wrote. In the end, the last option was to fail rank 0. This also crashed rank 1, but then things started to clear up. There were more blocked ops warnings every now and then, but these were temporary. After a bit of watching the 2 MDSes struggle, the "stopping" one finally became standby and everything is up and clean right now.
>
> The reason for this exercise is a (rare?) race condition between dirfrag subtree exports (multi-active MDS) and certain client operations (rename, create), which leads to a complete standstill of the file system (the fs journal size eventually reaches the pool quota). It seems to be caused by different clients (servers) accessing the same sub-dir on the file system concurrently. We observe this with only one workload; no useful information so far.

Please let us know or file a ticket in the tracker if you get enough
details for us to spend some time debugging that. That's definitely
supposed to work fine, of course. :/
-Greg

>
> Thanks again and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Gregory Farnum <gfarnum@xxxxxxxxxx>
> Sent: 13 December 2021 17:39:55
> To: Frank Schilder
> Cc: ceph-users
> Subject: Re:  MDS stuck in stopping state
>
> This looks awkward — just from the ops, it seems mds.1 is trying to
> move some stray items (presumably snapshots of since-deleted files,
> from what you said?) into mds0's stray directory, and then mds.0 tries
> to get auth pins from mds.1 but that fails for some reason which isn't
> apparent from the dump.
>
> Somebody might be able to get farther along by tracing logs of mds.1
> rebooting, but my guess is that rebooting both servers will clear it
> up. You might also try increasing max_mds to 2 and seeing if that jogs
> things loose; I'm not sure what would be less disruptive for you.
> -Greg
>
>
> On Mon, Dec 13, 2021 at 5:37 AM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Hi all, I needed to reduce the number of active MDS daemons from 4 to 1. Unfortunately, the last MDS to stop is stuck in stopping state. Ceph version is mimic 13.2.10. Each MDS has 3 blocked OPS that seem to be related to deleted snapshots; more info below. I have already failed the MDS in stopping state several times in the hope that the operations get flushed out. Before failing rank 0, I would appreciate it if someone could look at this issue and advise on how to proceed safely.
> >
> > Some diagnostic info:
> >
> > # ceph fs status
> > con-fs2 - 1659 clients
> > =======
> > +------+----------+---------+---------------+-------+-------+
> > | Rank |  State   |   MDS   |    Activity   |  dns  |  inos |
> > +------+----------+---------+---------------+-------+-------+
> > |  0   |  active  | ceph-08 | Reqs:  176 /s | 2844k | 2775k |
> > |  1   | stopping | ceph-17 |               | 27.7k |   59  |
> > +------+----------+---------+---------------+-------+-------+
> > +---------------------+----------+-------+-------+
> > |         Pool        |   type   |  used | avail |
> > +---------------------+----------+-------+-------+
> > |    con-fs2-meta1    | metadata |  555M | 1261G |
> > |    con-fs2-meta2    |   data   |    0  | 1261G |
> > |     con-fs2-data    |   data   | 1321T | 5756T |
> > | con-fs2-data-ec-ssd |   data   |  252G | 4035G |
> > |    con-fs2-data2    |   data   |  389T | 5233T |
> > +---------------------+----------+-------+-------+
> > +-------------+
> > | Standby MDS |
> > +-------------+
> > |   ceph-09   |
> > |   ceph-24   |
> > |   ceph-14   |
> > |   ceph-16   |
> > |   ceph-12   |
> > |   ceph-23   |
> > |   ceph-10   |
> > |   ceph-15   |
> > |   ceph-13   |
> > |   ceph-11   |
> > +-------------+
> > MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
> >
> > # ceph status
> >   cluster:
> >     id:
> >     health: HEALTH_WARN
> >             2 MDSs report slow requests
> >
> >   services:
> >     mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26
> >     mgr: ceph-01(active), standbys: ceph-02, ceph-03, ceph-25, ceph-26
> >     mds: con-fs2-2/2/1 up  {0=ceph-08=up:active,1=ceph-17=up:stopping}, 10 up:standby
> >     osd: 1051 osds: 1050 up, 1050 in
> >
> >   data:
> >     pools:   13 pools, 17374 pgs
> >     objects: 1.01 G objects, 1.9 PiB
> >     usage:   2.3 PiB used, 9.2 PiB / 11 PiB avail
> >     pgs:     17352 active+clean
> >              20    active+clean+scrubbing+deep
> >              2     active+clean+scrubbing
> >
> >   io:
> >     client:   129 MiB/s rd, 175 MiB/s wr, 2.57 kop/s rd, 2.77 kop/s wr
> >
> > # ceph health detail
> > HEALTH_WARN 2 MDSs report slow requests
> > MDS_SLOW_REQUEST 2 MDSs report slow requests
> >     mdsceph-08(mds.0): 3 slow requests are blocked > 30 secs
> >     mdsceph-17(mds.1): 3 slow requests are blocked > 30 secs
> >
> > # ssh ceph-08 ceph daemon mds.ceph-08 dump_blocked_ops
> > {
> >     "ops": [
> >         {
> >             "description": "client_request(mds.1:126521 rename #0x100/stray5/1000eec35f7 #0x101/stray5/1000eec35f7 caller_uid=0, caller_gid=0{})",
> >             "initiated_at": "2021-12-13 13:08:59.430597",
> >             "age": 5034.983083,
> >             "duration": 5034.983109,
> >             "type_data": {
> >                 "flag_point": "acquired locks",
> >                 "reqid": "mds.1:126521",
> >                 "op_type": "client_request",
> >                 "client_info": {
> >                     "client": "mds.1",
> >                     "tid": 126521
> >                 },
> >                 "events": [
> >                     {
> >                         "time": "2021-12-13 13:08:59.430597",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:08:59.430597",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:08:59.430597",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:08:59.430601",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:00.730197",
> >                         "event": "dispatched"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.517306",
> >                         "event": "requesting remote authpins"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.557219",
> >                         "event": "failed to xlock, waiting"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.647692",
> >                         "event": "failed to wrlock, waiting"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.663629",
> >                         "event": "waiting for remote wrlocks"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.673789",
> >                         "event": "waiting for remote wrlocks"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.676523",
> >                         "event": "failed to xlock, waiting"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.691962",
> >                         "event": "failed to xlock, waiting"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:09:01.704202",
> >                         "event": "acquired locks"
> >                     }
> >                 ]
> >             }
> >         },
> >         {
> >             "description": "client_request(mds.1:1 rename #0x100/stray5/1000eec35f7 #0x101/stray5/1000eec35f7 caller_uid=0, caller_gid=0{})",
> >             "initiated_at": "2021-12-13 13:31:56.260453",
> >             "age": 3658.153227,
> >             "duration": 3658.153337,
> >             "type_data": {
> >                 "flag_point": "requesting remote authpins",
> >                 "reqid": "mds.1:1",
> >                 "op_type": "client_request",
> >                 "client_info": {
> >                     "client": "mds.1",
> >                     "tid": 1
> >                 },
> >                 "events": [
> >                     {
> >                         "time": "2021-12-13 13:31:56.260453",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260453",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260454",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260461",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260511",
> >                         "event": "dispatched"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260604",
> >                         "event": "requesting remote authpins"
> >                     }
> >                 ]
> >             }
> >         },
> >         {
> >             "description": "client_request(mds.1:993 rename #0x100/stray5/1000eec35f7 #0x101/stray5/1000eec35f7 caller_uid=0, caller_gid=0{})",
> >             "initiated_at": "2021-12-13 13:15:31.979997",
> >             "age": 4642.433683,
> >             "duration": 4642.433850,
> >             "type_data": {
> >                 "flag_point": "requesting remote authpins",
> >                 "reqid": "mds.1:993",
> >                 "op_type": "client_request",
> >                 "client_info": {
> >                     "client": "mds.1",
> >                     "tid": 993
> >                 },
> >                 "events": [
> >                     {
> >                         "time": "2021-12-13 13:15:31.979997",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:15:31.979997",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:15:31.979998",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:15:31.980003",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:15:31.980079",
> >                         "event": "dispatched"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:15:31.980174",
> >                         "event": "requesting remote authpins"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:50.634734",
> >                         "event": "requesting remote authpins"
> >                     }
> >                 ]
> >             }
> >         }
> >     ],
> >     "complaint_time": 30.000000,
> >     "num_blocked_ops": 3
> > }
> >
> > # ssh ceph-17 ceph daemon mds.ceph-17 dump_blocked_ops
> > {
> >     "ops": [
> >         {
> >             "description": "rejoin:mds.1:126521",
> >             "initiated_at": "2021-12-13 13:30:34.602931",
> >             "age": 3791.164314,
> >             "duration": 3791.164335,
> >             "type_data": {
> >                 "flag_point": "dispatched",
> >                 "reqid": "mds.1:126521",
> >                 "op_type": "no_available_op_found",
> >                 "events": [
> >                     {
> >                         "time": "2021-12-13 13:30:34.602931",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:30:34.602931",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:30:34.602932",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:30:34.602978",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:30:34.605856",
> >                         "event": "dispatched"
> >                     }
> >                 ]
> >             }
> >         },
> >         {
> >             "description": "slave_request(mds.1:993.0 authpin)",
> >             "initiated_at": "2021-12-13 13:31:50.634857",
> >             "age": 3715.132388,
> >             "duration": 3715.132451,
> >             "type_data": {
> >                 "flag_point": "dispatched",
> >                 "reqid": "mds.1:993",
> >                 "op_type": "slave_request",
> >                 "master_info": {
> >                     "master": "mds.0"
> >                 },
> >                 "request_info": {
> >                     "attempt": 0,
> >                     "op_type": "authpin",
> >                     "lock_type": 0,
> >                     "object_info": "0x1000eec35f7.head",
> >                     "srcdnpath": "",
> >                     "destdnpath": "",
> >                     "witnesses": "",
> >                     "has_inode_export": false,
> >                     "inode_export_v": 0,
> >                     "op_stamp": "0.000000"
> >                 },
> >                 "events": [
> >                     {
> >                         "time": "2021-12-13 13:31:50.634857",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:50.634857",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:50.634858",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:50.634867",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:50.634893",
> >                         "event": "dispatched"
> >                     }
> >                 ]
> >             }
> >         },
> >         {
> >             "description": "slave_request(mds.1:1.0 authpin)",
> >             "initiated_at": "2021-12-13 13:31:56.260729",
> >             "age": 3709.506516,
> >             "duration": 3709.506631,
> >             "type_data": {
> >                 "flag_point": "dispatched",
> >                 "reqid": "mds.1:1",
> >                 "op_type": "slave_request",
> >                 "master_info": {
> >                     "master": "mds.0"
> >                 },
> >                 "request_info": {
> >                     "attempt": 0,
> >                     "op_type": "authpin",
> >                     "lock_type": 0,
> >                     "object_info": "0x1000eec35f7.head",
> >                     "srcdnpath": "",
> >                     "destdnpath": "",
> >                     "witnesses": "",
> >                     "has_inode_export": false,
> >                     "inode_export_v": 0,
> >                     "op_stamp": "0.000000"
> >                 },
> >                 "events": [
> >                     {
> >                         "time": "2021-12-13 13:31:56.260729",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260729",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260731",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.260743",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2021-12-13 13:31:56.264063",
> >                         "event": "dispatched"
> >                     }
> >                 ]
> >             }
> >         }
> >     ],
> >     "complaint_time": 30.000000,
> >     "num_blocked_ops": 3
> > }
> >
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
