Hi,

Sorry to ping this old thread, but we have a few kernel client nodes stuck like this after an outage on their network. The MDSs are running v14.2.11 and the client has kernel 3.10.0-1127.19.1.el7.x86_64.

This is the first time at our lab that clients didn't reconnect after a network issue (but it might also be the first large client network outage since we upgraded from Luminous to Nautilus). It looks identical to Florian's issue:

Feb 08 10:07:23 hpc-qcd027.cern.ch kernel: libceph: mds0 10.32.5.17:6821 socket closed (con state NEGOTIATING)
Feb 08 10:07:51 hpc-qcd027.cern.ch kernel: ceph: get_quota_realm: ino (10004fe5035.fffffffffffffffe) null i_snap_realm

The full kernel log (grepped for ceph) is at https://termbin.com/zdwc

As of now, this client's mountpoint is "stuck": it does not have a session open on mds.0, but it has sessions on mds.1 and mds.2 (see [1] below). I evicted this client from all MDSs, but the client didn't manage to reconnect:

Feb 08 10:20:01 hpc-qcd027.cern.ch kernel: libceph: mds1 188.185.88.47:6801 socket closed (con state OPEN)
Feb 08 10:20:01 hpc-qcd027.cern.ch kernel: libceph: mds2 188.185.88.90:6801 socket closed (con state OPEN)
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: libceph: mds1 188.185.88.47:6801 connection reset
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: libceph: reset on mds1
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: ceph: mds1 closed our session
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: ceph: mds1 reconnect start
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: libceph: mds2 188.185.88.90:6801 connection reset
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: libceph: reset on mds2
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: ceph: mds2 closed our session
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: ceph: mds2 reconnect start
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: ceph: mds1 reconnect denied
Feb 08 10:20:02 hpc-qcd027.cern.ch kernel: ceph: mds2 reconnect denied
Feb 08 10:20:21 hpc-qcd027.cern.ch kernel: ceph: get_quota_realm: ino (10004fe5035.fffffffffffffffe) null i_snap_realm
Feb 08 10:20:51 hpc-qcd027.cern.ch kernel: ceph: get_quota_realm: ino (10004fe5035.fffffffffffffffe) null i_snap_realm
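For reference, a minimal sketch of evicting a session like this from every active rank (the session id is the one shown in [1] below; the rank list assumes three active MDSs):

# evict the stuck client's session on every active MDS rank
# (session id 137564444, as reported by "session ls" -- see [1] below)
for rank in 0 1 2; do
    ceph tell mds.${rank} client evict id=137564444
done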
Here are some logs from mds.0:

# egrep '10.32.3.150|137564444' /var/log/ceph/ceph-mds.cephflax-mds-ca21a8a1c6.log
2021-02-08 09:16:46.875 7f9b22faa700 0 log_channel(cluster) log [WRN] : evicting unresponsive client hpc-qcd027.cern.ch:hpc (137564444), after 304.536 seconds
2021-02-08 09:17:28.326 7f9b28a6c700 0 --1- [v2:188.184.96.191:6800/1781566860,v1:188.184.96.191:6801/1781566860] >> v1:10.32.3.150:0/3998218413 conn(0x562ac988b800 0x562e1a6e8800 :6801 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept we reset (peer sent cseq 1), sending RESETSESSION
2021-02-08 09:17:28.628 7f9b28a6c700 0 --1- [v2:188.184.96.191:6800/1781566860,v1:188.184.96.191:6801/1781566860] >> v1:10.32.3.150:0/3998218413 conn(0x5629ce1c6800 0x56274776f000 :6801 s=OPENED pgs=26571 cs=1 l=0).fault server, going to standby
2021-02-08 09:49:56.318 7f9b28a6c700 0 --1- [v2:188.184.96.191:6800/1781566860,v1:188.184.96.191:6801/1781566860] >> v1:10.32.3.150:0/3998218413 conn(0x5629f5eaa800 0x5627460ab800 :6801 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept peer reset, then tried to connect to us, replacing

Compared with mds.1, where the reconnect succeeded:

# egrep '10.32.3.150|137564444' /var/log/ceph/ceph-mds.cephflax-mds-370212ad58.log
2021-02-08 09:16:42.970 7fc98299b700 0 log_channel(cluster) log [WRN] : evicting unresponsive client hpc-qcd027.cern.ch:hpc (137564444), after 300.629 seconds
2021-02-08 09:17:28.327 7fc987c2f700 0 --1- [v2:188.185.88.47:6800/3666946863,v1:188.185.88.47:6801/3666946863] >> v1:10.32.3.150:0/3998218413 conn(0x55a652fea400 0x55a7220ba800 :6801 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept we reset (peer sent cseq 1), sending RESETSESSION
2021-02-08 09:17:28.414 7fc987c2f700 0 --1- [v2:188.185.88.47:6800/3666946863,v1:188.185.88.47:6801/3666946863] >> v1:10.32.3.150:0/3998218413 conn(0x55a791481000 0x55a6c7a2d800 :6801 s=OPENED pgs=26573 cs=1 l=0).fault server, going to standby
2021-02-08 10:05:12.810 7fc988430700 0 --1- [v2:188.185.88.47:6800/3666946863,v1:188.185.88.47:6801/3666946863] >> v1:10.32.3.150:0/3998218413 conn(0x55a75789d400 0x55a78695b000 :6801 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept peer reset, then tried to connect to us, replacing
2021-02-08 10:05:13.374 7fc97e192700 2 mds.1.server New client session: addr="v1:10.32.3.150:0/3998218413",elapsed=0.057278,throttled=0.000007,status="ACCEPTED",root="/hpcqcd"
2021-02-08 10:20:01.666 7fc98499f700 1 mds.1.242924 Evicting client session 137564444 (v1 10.32.3.150:0/3998218413)
2021-02-08 10:20:01.666 7fc98499f700 0 log_channel(cluster) log [INF] : Evicting client session 137564444 (v1:10.32.3.150:0/3998218413)
2021-02-08 10:20:02.343 7fc988430700 0 --1- [v2:188.185.88.47:6800/3666946863,v1:188.185.88.47:6801/3666946863] >> v1:10.32.3.150:0/3998218413 conn(0x55a7a5d81c00 0x55a7823d1000 :6801 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 accept we reset (peer sent cseq 2), sending RESETSESSION
2021-02-08 10:20:02.345 7fc988430700 0 --1- [v2:188.185.88.47:6800/3666946863,v1:188.185.88.47:6801/3666946863] >> v1:10.32.3.150:0/3998218413 conn(0x55a75bb6a000 0x55a744f69800 :6801 s=OPENED pgs=26588 cs=1 l=0).fault server, going to standby

We have this in the mds config:

mds session blacklist on evict = false
mds session blacklist on timeout = false

The clients are working fine after they are rebooted.
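For completeness, a rough sketch of double-checking that the eviction did not leave the client blacklisted on the OSDs (with the settings above nothing should appear, but it is worth verifying after an incident like this; the daemon name below is one of ours, adjust for your cluster):

# check the effective value on a running MDS (via its admin socket, on the MDS host)
ceph daemon mds.cephflax-mds-ca21a8a1c6 config get mds_session_blacklist_on_evict

# list current OSD blacklist entries; a blacklisted client would show up
# with an address like 10.32.3.150:0/3998218413
ceph osd blacklist ls

# a stray entry could be removed with:
#   ceph osd blacklist rm 10.32.3.150:0/3998218413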
Given the age of this thread -- maybe this is a known issue that has already been solved in newer kernels?

Note that during this incident a few clients also crashed and rebooted -- we are still trying to get the kernel backtrace for those cases, to see if it matches https://tracker.ceph.com/issues/40862.

Thanks!

Dan

[1] session ls:

mds.cephflax-mds-ca21a8a1c6:
[]

mds.cephflax-mds-370212ad58:
[
  {
    "id": 137564444,
    "entity": {
      "name": { "type": "client", "num": 137564444 },
      "addr": { "type": "v1", "addr": "10.32.3.150:0", "nonce": 3998218413 }
    },
    "state": "open",
    "num_leases": 0,
    "num_caps": 0,
    "request_load_avg": 0,
    "uptime": 844.92158032299994,
    "requests_in_flight": 0,
    "completed_requests": 0,
    "reconnecting": false,
    "recall_caps": { "value": 0, "halflife": 60 },
    "release_caps": { "value": 0, "halflife": 60 },
    "recall_caps_throttle": { "value": 0, "halflife": 2.5 },
    "recall_caps_throttle2o": { "value": 0, "halflife": 0.5 },
    "session_cache_liveness": { "value": 0, "halflife": 300 },
    "inst": "client.137564444 v1:10.32.3.150:0/3998218413",
    "completed_requests": [],
    "prealloc_inos": [],
    "used_inos": [],
    "client_metadata": {
      "features": "0x00000000000000ff",
      "entity_id": "hpc",
      "hostname": "hpc-qcd027.cern.ch",
      "kernel_version": "3.10.0-1127.19.1.el7.x86_64",
      "root": "/hpcqcd"
    }
  }
]

mds.cephflax-mds-adccf51169:
[
  {
    "id": 137564444,
    "entity": {
      "name": { "type": "client", "num": 137564444 },
      "addr": { "type": "v1", "addr": "10.32.3.150:0", "nonce": 3998218413 }
    },
    "state": "open",
    "num_leases": 0,
    "num_caps": 0,
    "request_load_avg": 0,
    "uptime": 1761.964491447,
    "requests_in_flight": 0,
    "completed_requests": 0,
    "reconnecting": false,
    "recall_caps": { "value": 0, "halflife": 60 },
    "release_caps": { "value": 0, "halflife": 60 },
    "recall_caps_throttle": { "value": 0, "halflife": 2.5 },
    "recall_caps_throttle2o": { "value": 0, "halflife": 0.5 },
    "session_cache_liveness": { "value": 0, "halflife": 300 },
    "inst": "client.137564444 v1:10.32.3.150:0/3998218413",
    "completed_requests": [],
    "prealloc_inos": [],
    "used_inos": [],
    "client_metadata": {
      "features": "0x00000000000000ff",
      "entity_id": "hpc",
      "hostname": "hpc-qcd027.cern.ch",
      "kernel_version": "3.10.0-1127.19.1.el7.x86_64",
      "root": "/hpcqcd"
    }
  }
]

On Mon, Oct 14, 2019 at 12:03 PM Florian Pritz <florian.pritz@xxxxxxxxxxxxxx> wrote:
>
> On Wed, Oct 02, 2019 at 10:24:41PM +0800, "Yan, Zheng" <ukernel@xxxxxxxxx> wrote:
> > Can you reproduce this? If you can, run 'ceph daemon mds.x session ls'
> > before restarting the mds.
>
> I just managed to run into this issue again. 'ceph daemon mds.x session
> ls' doesn't work because apparently our setup doesn't have the admin
> socket in the expected place. I've therefore used 'ceph tell mds.0
> session ls', which I think should be the same except for how the daemon
> is contacted.
>
> When the issue happens and 2 clients are hanging, 'ceph tell mds.0
> session ls' shows only 9 clients instead of 11. The hanging clients are
> missing from the list. Once they are rebooted, they show up in the
> output.
>
> On a potentially interesting note: the clients that were hanging this
> time are the same ones as last time. They aren't set up any differently
> from the others, as far as I can tell.
>
> Florian
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
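A quick way to spot this state (a client that has a session on some active ranks but is missing from others, as in [1] and in Florian's 9-of-11 observation) is to compare the 'session ls' output across ranks. A rough sketch, assuming jq is available and ranks 0-2 are active:

# print the hostname behind every client session on each active rank;
# a stuck client appears on some ranks but is missing from others
for rank in 0 1 2; do
    echo "== mds.${rank} =="
    ceph tell mds.${rank} session ls | jq -r '.[].client_metadata.hostname'
done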