Hi Dan, thanks for looking at this. I had to take action and restarted 2 of our 4 active MDSes, which flushed out the stuck operations. I'm pretty sure it was a real deadlock; most clients were blocked already. There was almost no fs activity and MDS CPU usage was below 2%. We are running a version with the patch for https://tracker.ceph.com/issues/39987, so this must be a different problem and not a regression. Unfortunately, I forgot to dump the MDS cache, which Zheng Yan asked me to do in such cases two years ago. I hope I remember next time this happens.

Best regards and have a good Sunday,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <dan@xxxxxxxxxxxxxx>
Sent: 19 September 2021 14:01
To: Frank Schilder
Cc: ceph-users
Subject: Re: ceph fs service outage: currently failed to authpin, subtree is being exported

Hi Frank,

The only time I've seen something like this was when we accidentally changed a subtree pin to a different rank, causing a huge amount of mds export work to be queued up between MDSs. In that incident, we just waited until it completed... It took around 30 minutes, after which all the log segments trimmed back to normal immediately.

I think in your case you should try to understand (by injecting higher debug_mds values) e.g. if the subtree migration is ongoing, if new migrations are still being triggered by some client activity, and if you can change some config now to slow down any additional work being added to the export queue. Maybe increasing mds_bal_interval to some huge value (like 3600) would be useful to make sure no further exports are initiated.

-- Dan

On Sun, 19 Sep 2021, 10:11 Frank Schilder, <frans@xxxxxx> wrote:

Good day. Our file system is out of operation (mimic 13.2.10).
Our MDSes are choking on an operation:

2021-09-19 02:23:36.432664 mon.ceph-01 mon.0 192.168.32.65:6789/0 185676 : cluster [WRN] Health check failed: 1 MDSs report slow requests (MDS_SLOW_REQUEST)
[...]
2021-09-19 02:23:34.909269 mds.ceph-15 mds.3 192.168.32.79:6872/3838509256 1662 : cluster [WRN] 33 slow requests, 5 included below; oldest blocked for > 32.729621 secs
2021-09-19 02:23:34.909277 mds.ceph-15 mds.3 192.168.32.79:6872/3838509256 1663 : cluster [WRN] slow request 31.104289 seconds old, received at 2021-09-19 02:23:03.804307: client_request(client.44559846:1121833 lookup #0x100000167f6/naikar 2021-09-19 02:23:03.803359 caller_uid=260500, caller_gid=260500{}) currently failed to authpin, subtree is being exported
2021-09-19 02:23:34.909280 mds.ceph-15 mds.3 192.168.32.79:6872/3838509256 1664 : cluster [WRN] slow request 31.104254 seconds old, received at 2021-09-19 02:23:03.804343: client_request(client.44559846:1121834 lookup #0x100000167f6/naikar 2021-09-19 02:23:03.803359 caller_uid=260500, caller_gid=260500{}) currently failed to authpin, subtree is being exported
2021-09-19 02:23:34.909283 mds.ceph-15 mds.3 192.168.32.79:6872/3838509256 1665 : cluster [WRN] slow request 31.104231 seconds old, received at 2021-09-19 02:23:03.804365: client_request(client.44559846:1121835 lookup #0x100000167f6/naikar 2021-09-19 02:23:03.803359 caller_uid=260500, caller_gid=260500{}) currently failed to authpin, subtree is being exported
2021-09-19 02:23:34.909285 mds.ceph-15 mds.3 192.168.32.79:6872/3838509256 1666 : cluster [WRN] slow request 31.104213 seconds old, received at 2021-09-19 02:23:03.804384: client_request(client.44559846:1121836 lookup #0x100000167f6/naikar 2021-09-19 02:23:03.803359 caller_uid=260500, caller_gid=260500{}) currently failed to authpin, subtree is being exported
2021-09-19 02:23:34.909288 mds.ceph-15 mds.3 192.168.32.79:6872/3838509256 1667 : cluster [WRN] slow request 31.104142 seconds old, received at 2021-09-19 02:23:03.804455: client_request(client.44559846:1121837 lookup #0x100000167f6/naikar 2021-09-19 02:23:03.803359 caller_uid=260500, caller_gid=260500{}) currently failed to authpin, subtree is being exported

By now, several thousand authpin operations have been stuck for hours. The file system is basically inoperative and work is piling up:

# ceph health detail
HEALTH_WARN 1 MDSs report slow requests; 2 MDSs behind on trimming; 20 large omap objects
MDS_SLOW_REQUEST 1 MDSs report slow requests
    mdsceph-15(mds.3): 1554 slow requests are blocked > 30 secs
MDS_TRIM 2 MDSs behind on trimming
    mdsceph-23(mds.0): Behind on trimming (7651/128) max_segments: 128, num_segments: 7651
    mdsceph-15(mds.3): Behind on trimming (4888/128) max_segments: 128, num_segments: 4888

I would be grateful for advice on how to get out of this.
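As a hedged sketch (not part of the thread, and assuming mimic-era admin-socket and injectargs syntax, with this cluster's daemon names used only as examples), the diagnostic steps discussed in this thread, including the cache dump Zheng Yan previously asked for, might look like:

```shell
# Sketch only: commands to diagnose "failed to authpin, subtree is being
# exported" on a multi-active-MDS mimic cluster. mds.ceph-15 is the rank
# reporting slow requests in this thread; adjust to your own daemons.

# Show the blocked operations and the flag point each is stuck at:
ceph daemon mds.ceph-15 dump_ops_in_flight

# Show the subtree map, to see which subtrees are mid-migration and
# between which ranks:
ceph daemon mds.ceph-15 get subtrees

# Raise MDS debugging to watch whether the export is making progress
# (remember to lower it again afterwards):
ceph tell mds.ceph-15 injectargs '--debug_mds 10'

# Dan's suggestion: set the balancer interval very high so no further
# exports are initiated while the backlog drains:
ceph tell mds.* injectargs '--mds_bal_interval 3600'

# Before restarting a suspected-deadlocked MDS, dump its cache so the
# state can be analysed post-mortem:
ceph daemon mds.ceph-15 dump cache /tmp/mds.ceph-15.cache
```

Note that injectargs changes are in-memory only and revert when the daemon restarts, which is convenient here: the balancer setting does not need to be cleaned up after the incident.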
Current fs status is:

# ceph fs status
con-fs2 - 1636 clients
=======
+------+--------+---------+---------------+-------+-------+
| Rank | State  |   MDS   |    Activity   |  dns  |  inos |
+------+--------+---------+---------------+-------+-------+
|  0   | active | ceph-23 | Reqs:    2 /s | 2024k | 2019k |
|  1   | active | ceph-12 | Reqs:    0 /s | 1382k | 1374k |
|  2   | active | ceph-08 | Reqs:    0 /s |  998k |  926k |
|  3   | active | ceph-15 | Reqs:    0 /s | 1373k | 1272k |
+------+--------+---------+---------------+-------+-------+
+---------------------+----------+-------+-------+
|         Pool        |   type   |  used | avail |
+---------------------+----------+-------+-------+
|    con-fs2-meta1    | metadata |  102G | 1252G |
|    con-fs2-meta2    |   data   |    0  | 1252G |
|     con-fs2-data    |   data   | 1359T | 6003T |
| con-fs2-data-ec-ssd |   data   |  239G | 4006G |
|    con-fs2-data2    |   data   | 56.0T | 5457T |
+---------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|   ceph-16   |
|   ceph-14   |
|   ceph-13   |
|   ceph-17   |
|   ceph-10   |
|   ceph-24   |
|   ceph-09   |
|   ceph-11   |
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx