We've been running into a strange problem that recurs every day or so with
a specific HPC job on a Mimic cluster (13.2.8) using ceph-fuse (14.2.7).
It looks like some cephfs clients get stuck (perhaps deadlocked) trying to
access a file and stop making progress.
Ceph reports the following problems (please ignore the oversized cache, the
large omap objects and the clients failing to respond to cache pressure -
we see those regularly without any ill effects):
[root@cephmon00 ~]# ceph -s
  cluster:
    id:     d7b33135-0940-4e48-8aa6-1d2026597c2f
    health: HEALTH_WARN
            1 MDSs report oversized cache
            1 clients failing to respond to capability release
            5 clients failing to respond to cache pressure
            1 MDSs report slow requests
            2 large omap objects
            Degraded data redundancy: 1/24563062389 objects degraded (0.000%), 1 pg degraded
            1 slow ops, oldest one blocked for 14723 sec, osd.1058 has slow ops

  services:
    mon: 5 daemons, quorum cephmon00,cephmon01,cephmon02,cephmon03,cephmon04
    mgr: cephmon01(active), standbys: cephmon02, cephmon03, cephmon04, cephmon00
    mds: cephfs-1/1/1 up {0=cephmds00=up:active}, 5 up:standby
    osd: 3264 osds: 3264 up, 3264 in

  data:
    pools:   4 pools, 67072 pgs
    objects: 3.10 G objects, 10 PiB
    usage:   17 PiB used, 6.0 PiB / 23 PiB avail
    pgs:     1/24563062389 objects degraded (0.000%)
             67006 active+clean
             40    active+clean+scrubbing+deep
             25    active+clean+scrubbing
             1     active+recovering+degraded

  io:
    client:   501 MiB/s rd, 4.2 GiB/s wr, 1.63 kop/s rd, 6.46 kop/s wr
    recovery: 28 MiB/s, 7 objects/s
[root@cephmon00 ~]# ceph health detail
HEALTH_WARN 1 MDSs report oversized cache; 1 clients failing to respond to capability release; 5 clients failing to respond to cache pressure; 1 MDSs report slow requests; 2 large omap objects; Degraded data redundancy: 1/24563067054 objects degraded (0.000%), 1 pg degraded; 1 slow ops, oldest one blocked for 14723 sec, osd.1058 has slow ops
MDS_CACHE_OVERSIZED 1 MDSs report oversized cache
    mdscephmds00(mds.0): MDS cache is too large (73GB/16GB); 27115370 inodes in use by clients, 3811 stray files
MDS_CLIENT_LATE_RELEASE 1 clients failing to respond to capability release
    mdscephmds00(mds.0): Client worker1003:cephfs failing to respond to capability release client_id: 90142364
MDS_CLIENT_RECALL 5 clients failing to respond to cache pressure
    mdscephmds00(mds.0): Client gw162.flatironinstitute.org failing to respond to cache pressure client_id: 72550178
    mdscephmds00(mds.0): Client ccblin053 failing to respond to cache pressure client_id: 76380719
    mdscephmds00(mds.0): Client rusty2 failing to respond to cache pressure client_id: 79497905
    mdscephmds00(mds.0): Client infolin003:cephfs failing to respond to cache pressure client_id: 88565901
    mdscephmds00(mds.0): Client ceph-nfs1.flatironinstitute.org:cephfs-ro failing to respond to cache pressure client_id: 88899914
MDS_SLOW_REQUEST 1 MDSs report slow requests
    mdscephmds00(mds.0): 7 slow requests are blocked > 30 secs
LARGE_OMAP_OBJECTS 2 large omap objects
    2 large objects found in pool 'cephfs_metadata'
    Search the cluster log for 'Large omap object found' for more details.
PG_DEGRADED Degraded data redundancy: 1/24563067054 objects degraded (0.000%), 1 pg degraded
    pg 2.14b7 is active+recovering+degraded, acting [1058,2098,734]
SLOW_OPS 1 slow ops, oldest one blocked for 14723 sec, osd.1058 has slow ops
The client that is failing to respond to capability release seems to be
the real problem here. It is a ceph-fuse client (14.2.7), and it has a
number of requests stuck in flight to the MDS:
[root@worker1003 ~]# ceph daemon
/var/run/ceph/ceph-client.cephfs.6969.93825141492560.asok mds_requests
{
"request": {
"tid": 3801617,
"op": "open",
"path": "#0x10048b87a3b",
"path2": "",
"ino": "0x10048b87a3b",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 03:42:58.359939",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 3900154,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 04:08:50.572111",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 4111017,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 04:53:29.372694",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 4288428,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 05:39:52.257977",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 4425681,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 06:23:52.201814",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
}
}
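As an aside, the stuck open request (tid 3801617) references the file only
by inode number, 0x10048b87a3b. If it helps to know which file that is, the
path can in principle be recovered from the backtrace xattr on the file's
first data object - a rough sketch, assuming the objects live in
cephfs_data and ceph-dencoder is available on an admin node:

  # Fetch the "parent" (backtrace) xattr from the first object of inode
  # 0x10048b87a3b and decode it to list the file's ancestry (i.e. its path).
  # /tmp/backtrace.bin is just an arbitrary temporary file.
  rados -p cephfs_data getxattr 10048b87a3b.00000000 parent > /tmp/backtrace.bin
  ceph-dencoder type inode_backtrace_t import /tmp/backtrace.bin decode dump_json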
What seems odd is that the last four requests are all lookups of the same
file. The MDS also shows these ops as pending:
[root@cephmds00 ~]# ceph daemon /var/run/ceph/ceph-mds.cephmds00.asok ops
{
"ops": [
{
"description": "client_request(client.90142364:4288428
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
05:39:52.257975 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 05:39:52.258280",
"age": 8189.169056,
"duration": 8189.169070,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:4288428",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 4288428
},
"events": [
{
"time": "2020-02-26 05:39:52.258280",
"event": "initiated"
},
{
"time": "2020-02-26 05:39:52.258280",
"event": "header_read"
},
{
"time": "2020-02-26 05:39:52.258281",
"event": "throttled"
},
{
"time": "2020-02-26 05:39:52.258283",
"event": "all_read"
},
{
"time": "2020-02-26 05:39:52.258362",
"event": "dispatched"
},
{
"time": "2020-02-26 05:39:52.258383",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:4425681
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
06:23:52.201811 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 06:23:52.201698",
"age": 5549.225638,
"duration": 5549.225679,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:4425681",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 4425681
},
"events": [
{
"time": "2020-02-26 06:23:52.201698",
"event": "initiated"
},
{
"time": "2020-02-26 06:23:52.201698",
"event": "header_read"
},
{
"time": "2020-02-26 06:23:52.201699",
"event": "throttled"
},
{
"time": "2020-02-26 06:23:52.201702",
"event": "all_read"
},
{
"time": "2020-02-26 06:23:52.201720",
"event": "dispatched"
},
{
"time": "2020-02-26 06:23:52.201740",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:4111017
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
04:53:29.372691 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 04:53:29.373246",
"age": 10972.054090,
"duration": 10972.054155,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:4111017",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 4111017
},
"events": [
{
"time": "2020-02-26 04:53:29.373246",
"event": "initiated"
},
{
"time": "2020-02-26 04:53:29.373246",
"event": "header_read"
},
{
"time": "2020-02-26 04:53:29.373246",
"event": "throttled"
},
{
"time": "2020-02-26 04:53:29.373249",
"event": "all_read"
},
{
"time": "2020-02-26 04:53:29.373299",
"event": "dispatched"
},
{
"time": "2020-02-26 04:53:29.373323",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90118987:32628772
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
04:09:03.139173 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 04:09:03.138539",
"age": 13638.288797,
"duration": 13638.288884,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90118987:32628772",
"op_type": "client_request",
"client_info": {
"client": "client.90118987",
"tid": 32628772
},
"events": [
{
"time": "2020-02-26 04:09:03.138539",
"event": "initiated"
},
{
"time": "2020-02-26 04:09:03.138539",
"event": "header_read"
},
{
"time": "2020-02-26 04:09:03.138540",
"event": "throttled"
},
{
"time": "2020-02-26 04:09:03.138543",
"event": "all_read"
},
{
"time": "2020-02-26 04:09:03.138569",
"event": "dispatched"
},
{
"time": "2020-02-26 04:09:03.138592",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:3801617 open
#0x10048b87a3b 2020-02-26 03:42:58.359934 caller_uid=1261,
caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 03:42:58.359502",
"age": 15203.067834,
"duration": 15203.067944,
"type_data": {
"flag_point": "failed to xlock, waiting",
"reqid": "client.90142364:3801617",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 3801617
},
"events": [
{
"time": "2020-02-26 03:42:58.359502",
"event": "initiated"
},
{
"time": "2020-02-26 03:42:58.359502",
"event": "header_read"
},
{
"time": "2020-02-26 03:42:58.359502",
"event": "throttled"
},
{
"time": "2020-02-26 03:42:58.359505",
"event": "all_read"
},
{
"time": "2020-02-26 03:42:58.359525",
"event": "dispatched"
},
{
"time": "2020-02-26 03:42:58.359557",
"event": "failed to xlock, waiting"
}
]
}
},
{
"description": "client_request(client.90145864:3374743
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
06:24:02.082413 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 06:24:02.081762",
"age": 5539.345573,
"duration": 5539.345708,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90145864:3374743",
"op_type": "client_request",
"client_info": {
"client": "client.90145864",
"tid": 3374743
},
"events": [
{
"time": "2020-02-26 06:24:02.081762",
"event": "initiated"
},
{
"time": "2020-02-26 06:24:02.081762",
"event": "header_read"
},
{
"time": "2020-02-26 06:24:02.081763",
"event": "throttled"
},
{
"time": "2020-02-26 06:24:02.081766",
"event": "all_read"
},
{
"time": "2020-02-26 06:24:02.081787",
"event": "dispatched"
},
{
"time": "2020-02-26 06:24:02.081804",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:3900154
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
04:08:50.572109 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 04:08:50.572289",
"age": 13650.855047,
"duration": 13650.855205,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:3900154",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 3900154
},
"events": [
{
"time": "2020-02-26 04:08:50.572289",
"event": "initiated"
},
{
"time": "2020-02-26 04:08:50.572289",
"event": "header_read"
},
{
"time": "2020-02-26 04:08:50.572290",
"event": "throttled"
},
{
"time": "2020-02-26 04:08:50.572293",
"event": "all_read"
},
{
"time": "2020-02-26 04:08:50.572318",
"event": "dispatched"
},
{
"time": "2020-02-26 04:08:50.572338",
"event": "failed to rdlock, waiting"
}
]
}
}
],
"num_ops": 7
}
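For reference, client IDs like 90142364 and 90118987 in these ops can be
mapped back to a host/mount by listing the MDS sessions - something along
these lines (the exact field names may vary a little between versions):

  # List all client sessions on the active MDS; each entry carries the client
  # id plus client_metadata (hostname, mount point, ceph version, etc.).
  ceph daemon /var/run/ceph/ceph-mds.cephmds00.asok session ls > /tmp/sessions.json
  # Crude but effective: pull out the entry for a particular client id.
  grep -A 15 '"id": 90118987' /tmp/sessions.json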
There is another client in this list (90118987) that is also trying to
access the same file - here is the mds_requests output from that client
(worker1020):
[root@worker1020 ~]# ceph daemon
/var/run/ceph/ceph-client.cephfs.6958.93825141492560.asok mds_requests
{
"request": {
"tid": 32628772,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 04:09:03.139177",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 32628772,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
}
}
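Since the MDS ops above are all stuck at "failed to rdlock, waiting" or
"failed to xlock, waiting", it might also be worth checking whether the MDS
itself has any RADOS operations outstanding (for example against osd.1058,
which is reporting the slow op). A sketch, using the MDS admin socket:

  # Show the RADOS operations the MDS currently has in flight to the OSDs;
  # anything stuck here for a long time would point at an OSD-side problem
  # rather than a purely client-side one.
  ceph daemon /var/run/ceph/ceph-mds.cephmds00.asok objecter_requests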
The kernel (CentOS 7, 3.10.0-1062.9.1.el7.x86_64) also reports hung tasks
on these clients. From dmesg:
[539762.054205] INFO: task python:1098178 blocked for more than 120 seconds.
[539762.054210] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054212] python D ffff8835f8788640 0 1098178 1098093
0x00000000
[539762.054218] Call Trace:
[539762.054252] [<ffffffffc09abfcf>] ? fuse_do_open+0x14f/0x1e0 [fuse]
[539762.054259] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054263] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054269] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054274] [<ffffffff95d28df6>] ima_file_check+0xa6/0x1b0
[539762.054280] [<ffffffff95c592ca>] do_last+0x59a/0x1290
[539762.054284] [<ffffffff95c5bdbd>] path_openat+0xcd/0x5a0
[539762.054290] [<ffffffff95c4737a>] ? __check_object_size+0x1ca/0x250
[539762.054294] [<ffffffff95c5d72d>] do_filp_open+0x4d/0xb0
[539762.054299] [<ffffffff95c6b207>] ? __alloc_fd+0x47/0x170
[539762.054303] [<ffffffff95c49684>] do_sys_open+0x124/0x220
[539762.054307] [<ffffffff95c4979e>] SyS_open+0x1e/0x20
[539762.054311] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054317] INFO: task python:1098179 blocked for more than 120 seconds.
[539762.054318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054320] python D ffff88847f25ac80 0 1098179 1098093
0x00000000
[539762.054324] Call Trace:
[539762.054328] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054331] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054335] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054344] [<ffffffffc09aeb3b>] fuse_file_aio_write+0xdb/0x390 [fuse]
[539762.054348] [<ffffffff95c49d03>] do_sync_write+0x93/0xe0
[539762.054353] [<ffffffff95c4a7f0>] vfs_write+0xc0/0x1f0
[539762.054357] [<ffffffff95c4b60f>] SyS_write+0x7f/0xf0
[539762.054360] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054366] INFO: task python:1098180 blocked for more than 120 seconds.
[539762.054368] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054369] python D ffff88847f25ac80 0 1098180 1098093
0x00000000
[539762.054372] Call Trace:
[539762.054376] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054379] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054384] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054392] [<ffffffffc09aeb3b>] fuse_file_aio_write+0xdb/0x390 [fuse]
[539762.054398] [<ffffffff95bf1249>] ? handle_pte_fault+0x2b9/0xe20
[539762.054402] [<ffffffff95c49d03>] do_sync_write+0x93/0xe0
[539762.054406] [<ffffffff95c4a7f0>] vfs_write+0xc0/0x1f0
[539762.054410] [<ffffffff95c4b60f>] SyS_write+0x7f/0xf0
[539762.054413] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054420] INFO: task python:1098181 blocked for more than 120 seconds.
[539762.054421] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054423] python D ffff88847f45ac80 0 1098181 1098093
0x00000004
[539762.054426] Call Trace:
[539762.054430] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054433] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054438] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054446] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054452] [<ffffffff96188678>] ? __do_page_fault+0x238/0x500
[539762.054455] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054458] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054461] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054465] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054482] INFO: task python:1098186 blocked for more than 120 seconds.
[539762.054483] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054485] python D ffff88847f45ac80 0 1098186 1098093
0x00000000
[539762.054488] Call Trace:
[539762.054492] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054494] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054499] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054507] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054511] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054514] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054517] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054520] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054529] INFO: task python:1098188 blocked for more than 120 seconds.
[539762.054531] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054532] python D ffff88847f39ac80 0 1098188 1098093
0x00000004
[539762.054536] Call Trace:
[539762.054539] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054542] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054546] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054554] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054559] [<ffffffff96188678>] ? __do_page_fault+0x238/0x500
[539762.054562] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054565] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054568] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054572] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054577] INFO: task python:1098189 blocked for more than 120 seconds.
[539762.054579] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054580] python D ffff88847f29ac80 0 1098189 1098093
0x00000004
[539762.054584] Call Trace:
[539762.054587] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054590] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054595] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054602] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054606] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054609] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054612] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054615] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054626] INFO: task python:1098192 blocked for more than 120 seconds.
[539762.054628] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054630] python D ffff88847f31ac80 0 1098192 1098093
0x00000004
[539762.054633] Call Trace:
[539762.054636] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054639] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054644] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054651] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054656] [<ffffffff96188678>] ? __do_page_fault+0x238/0x500
[539762.054659] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054662] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054665] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054668] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054694] INFO: task python:1098202 blocked for more than 120 seconds.
[539762.054696] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054697] python D ffff88847f49ac80 0 1098202 1098093
0x00000000
[539762.054701] Call Trace:
[539762.054709] [<ffffffffc09abfcf>] ? fuse_do_open+0x14f/0x1e0 [fuse]
[539762.054713] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054715] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054720] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054723] [<ffffffff95d28df6>] ima_file_check+0xa6/0x1b0
[539762.054727] [<ffffffff95c592ca>] do_last+0x59a/0x1290
[539762.054731] [<ffffffff95c5bdbd>] path_openat+0xcd/0x5a0
[539762.054735] [<ffffffff95c4737a>] ? __check_object_size+0x1ca/0x250
[539762.054739] [<ffffffff95c5d72d>] do_filp_open+0x4d/0xb0
[539762.054744] [<ffffffff95c6b207>] ? __alloc_fd+0x47/0x170
[539762.054748] [<ffffffff95c49684>] do_sys_open+0x124/0x220
[539762.054751] [<ffffffff95c4979e>] SyS_open+0x1e/0x20
[539762.054755] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539882.054331] INFO: task python:1098178 blocked for more than 120 seconds.
[539882.054334] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539882.054335] python D ffff8835f8788640 0 1098178 1098093
0x00000004
[539882.054339] Call Trace:
[539882.054367] [<ffffffffc09abfcf>] ? fuse_do_open+0x14f/0x1e0 [fuse]
[539882.054371] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539882.054374] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539882.054377] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539882.054381] [<ffffffff95d28df6>] ima_file_check+0xa6/0x1b0
[539882.054385] [<ffffffff95c592ca>] do_last+0x59a/0x1290
[539882.054387] [<ffffffff95c5bdbd>] path_openat+0xcd/0x5a0
[539882.054391] [<ffffffff95c4737a>] ? __check_object_size+0x1ca/0x250
[539882.054394] [<ffffffff95c5d72d>] do_filp_open+0x4d/0xb0
[539882.054398] [<ffffffff95c6b207>] ? __alloc_fd+0x47/0x170
[539882.054400] [<ffffffff95c49684>] do_sys_open+0x124/0x220
[539882.054402] [<ffffffff95c4979e>] SyS_open+0x1e/0x20
[539882.054405] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
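If userspace stack traces from the stuck ceph-fuse process are useful, they
can be captured with gdb - a sketch, assuming gdb and the ceph debuginfo
packages are installed on the worker node (6969 is the ceph-fuse pid, as in
the admin socket path above):

  # Attach to the ceph-fuse process, dump backtraces of all threads, detach.
  gdb -p 6969 --batch -ex 'thread apply all bt' > /tmp/ceph-fuse-backtraces.txt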
Now, seemingly independently of this, there is a degraded PG in the data
pool (interestingly, NOT in the metadata pool):
[root@cephmon00 ~]# ceph pg dump | grep degraded
dumped all
2.14b7 16292 0 1 0 0
25530224980 0 0 3037 3037
active+recovering+degraded 2020-02-26 08:03:40.561062 1492172'6549513
1492172:10385376 [1058,2098,734] 1058 [1058,2098,734]
1058 1492086'6537090 2020-02-22 20:06:11.092352 1485071'6362109
2020-01-29 21:48:02.821467 0
[root@cephmon00 ~]# ceph osd pool ls detail
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 1
object_hash rjenkins pg_num 32768 pgp_num 32768 last_change 1484983 lfor
0/1355093 flags hashpspool min_write_recency_for_promote 1 stripe_width
0 application cephfs
pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 3
object_hash rjenkins pg_num 512 pgp_num 512 last_change 1484989 flags
hashpspool min_write_recency_for_promote 1 stripe_width 0 application cephfs
pool 5 'rbd_vm' replicated size 3 min_size 2 crush_rule 1 object_hash
rjenkins pg_num 1024 pgp_num 1024 last_change 1487461 flags
hashpspool,selfmanaged_snaps min_write_recency_for_promote 1
stripe_width 0 application rbd
removed_snaps [1~3,5~2]
pool 9 'cephfs_data_ec63' erasure size 9 min_size 7 crush_rule 2
object_hash rjenkins pg_num 32768 pgp_num 32768 last_change 1484986 lfor
0/1355095 flags hashpspool,ec_overwrites stripe_width 24576 application
cephfs
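In case it is useful, the full recovery state of that PG can be pulled with
a pg query (output omitted here):

  # Dump pg 2.14b7's full state, including its recovery progress and peering
  # information for the acting set [1058,2098,734].
  ceph pg 2.14b7 query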
osd.1058 (the primary of pg 2.14b7) keeps complaining about a missing
object copy and a full-object read CRC mismatch:
...
2020-02-26 08:05:44.453 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 missing primary copy of 2:ed28cfe5:::10048b87a3b.00000001:head,
will try copies on 734,2098
2020-02-26 08:05:44.585 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.609 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.609 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 missing primary copy of 2:ed28cfe5:::10048b87a3b.00000001:head,
will try copies on 734,2098
2020-02-26 08:05:44.717 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.741 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.741 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 missing primary copy of 2:ed28cfe5:::10048b87a3b.00000001:head,
will try copies on 734,2098
...
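Note that the object named in these errors, 10048b87a3b.00000001, is a data
object of inode 0x10048b87a3b - the same inode the stuck open request on
worker1003 is blocked on - and, per the log lines above, it lives in the
degraded pg 2.14b7. As a sanity check, the object-to-PG mapping can be
confirmed with:

  # Map the object named in the OSD errors to its PG and acting set; this
  # should come back as pg 2.14b7 on [1058,2098,734].
  ceph osd map cephfs_data 10048b87a3b.00000001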
When I stop this OSD, one of the secondaries takes over as primary, and
then THAT OSD starts complaining about exactly the same issue.
What is most bizarre is that if I power cycle the stuck client nodes, the
whole issue resolves itself, including the degraded PG. So somehow these
stuck ceph-fuse processes are responsible for the degraded PG as well -
I'm not quite sure how. This has happened 4 times in the past few days on
different nodes, so it is definitely not a hardware issue.
Are there any known issues we're running into here, or is this a new
problem?
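Also, would evicting the stuck client sessions from the MDS (rather than
power cycling the nodes) be expected to be a safe way to clear this?
Something like:

  # Possible less drastic workaround: evict the session of the client that
  # is failing to release caps (client_id 90142364, worker1003 above).
  ceph tell mds.0 client evict id=90142364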
Andras