We've been running into a strange problem that recurs every day or so with
a specific HPC job on a Mimic cluster (13.2.8) using ceph-fuse (14.2.7).
It looks like some cephfs clients get stuck (perhaps deadlocked) trying to
access a file and stop making progress.
Ceph reports the following problems (please ignore the oversized cache, the
large omap objects and the clients failing to respond to cache pressure -
we see those regularly without any ill effects):
[root@cephmon00 ~]# ceph -s
  cluster:
    id:     d7b33135-0940-4e48-8aa6-1d2026597c2f
    health: HEALTH_WARN
            1 MDSs report oversized cache
            1 clients failing to respond to capability release
            5 clients failing to respond to cache pressure
            1 MDSs report slow requests
            2 large omap objects
            Degraded data redundancy: 1/24563062389 objects degraded (0.000%), 1 pg degraded
            1 slow ops, oldest one blocked for 14723 sec, osd.1058 has slow ops

  services:
    mon: 5 daemons, quorum cephmon00,cephmon01,cephmon02,cephmon03,cephmon04
    mgr: cephmon01(active), standbys: cephmon02, cephmon03, cephmon04, cephmon00
    mds: cephfs-1/1/1 up {0=cephmds00=up:active}, 5 up:standby
    osd: 3264 osds: 3264 up, 3264 in

  data:
    pools:   4 pools, 67072 pgs
    objects: 3.10 G objects, 10 PiB
    usage:   17 PiB used, 6.0 PiB / 23 PiB avail
    pgs:     1/24563062389 objects degraded (0.000%)
             67006 active+clean
             40    active+clean+scrubbing+deep
             25    active+clean+scrubbing
             1     active+recovering+degraded

  io:
    client:   501 MiB/s rd, 4.2 GiB/s wr, 1.63 kop/s rd, 6.46 kop/s wr
    recovery: 28 MiB/s, 7 objects/s
[root@cephmon00 ~]# ceph health detail
HEALTH_WARN 1 MDSs report oversized cache; 1 clients failing to respond to capability release; 5 clients failing to respond to cache pressure; 1 MDSs report slow requests; 2 large omap objects; Degraded data redundancy: 1/24563067054 objects degraded (0.000%), 1 pg degraded; 1 slow ops, oldest one blocked for 14723 sec, osd.1058 has slow ops
MDS_CACHE_OVERSIZED 1 MDSs report oversized cache
    mdscephmds00(mds.0): MDS cache is too large (73GB/16GB); 27115370 inodes in use by clients, 3811 stray files
MDS_CLIENT_LATE_RELEASE 1 clients failing to respond to capability release
    mdscephmds00(mds.0): Client worker1003:cephfs failing to respond to capability release client_id: 90142364
MDS_CLIENT_RECALL 5 clients failing to respond to cache pressure
    mdscephmds00(mds.0): Client gw162.flatironinstitute.org failing to respond to cache pressure client_id: 72550178
    mdscephmds00(mds.0): Client ccblin053 failing to respond to cache pressure client_id: 76380719
    mdscephmds00(mds.0): Client rusty2 failing to respond to cache pressure client_id: 79497905
    mdscephmds00(mds.0): Client infolin003:cephfs failing to respond to cache pressure client_id: 88565901
    mdscephmds00(mds.0): Client ceph-nfs1.flatironinstitute.org:cephfs-ro failing to respond to cache pressure client_id: 88899914
MDS_SLOW_REQUEST 1 MDSs report slow requests
    mdscephmds00(mds.0): 7 slow requests are blocked > 30 secs
LARGE_OMAP_OBJECTS 2 large omap objects
    2 large objects found in pool 'cephfs_metadata'
    Search the cluster log for 'Large omap object found' for more details.
PG_DEGRADED Degraded data redundancy: 1/24563067054 objects degraded (0.000%), 1 pg degraded
    pg 2.14b7 is active+recovering+degraded, acting [1058,2098,734]
SLOW_OPS 1 slow ops, oldest one blocked for 14723 sec, osd.1058 has slow ops
The client that is failing to respond to capability release seems to be
the real problem here. It is a ceph-fuse client (14.2.7), and it has a
number of requests stuck in flight to the MDS:
[root@worker1003 ~]# ceph daemon
/var/run/ceph/ceph-client.cephfs.6969.93825141492560.asok mds_requests
{
"request": {
"tid": 3801617,
"op": "open",
"path": "#0x10048b87a3b",
"path2": "",
"ino": "0x10048b87a3b",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 03:42:58.359939",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 3900154,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 04:08:50.572111",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 4111017,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 04:53:29.372694",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 4288428,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 05:39:52.257977",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
},
"request": {
"tid": 4425681,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 06:23:52.201814",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 3801617,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
}
}
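As an aside, the stuck open request (tid 3801617) references the file only
by inode number, 0x10048b87a3b. If it helps to know which file that is, the
path can in principle be recovered from the backtrace xattr on the file's
first data object - a rough sketch, assuming the objects live in
cephfs_data and ceph-dencoder is available on an admin node:

  # Fetch the "parent" (backtrace) xattr from the first object of inode
  # 0x10048b87a3b and decode it to list the file's ancestry (i.e. its path).
  # /tmp/backtrace.bin is just an arbitrary temporary file.
  rados -p cephfs_data getxattr 10048b87a3b.00000000 parent > /tmp/backtrace.bin
  ceph-dencoder type inode_backtrace_t import /tmp/backtrace.bin decode dump_json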
What seems odd is that the last four requests are all lookups of the same
file. The MDS also shows these ops as pending:
[root@cephmds00 ~]# ceph daemon /var/run/ceph/ceph-mds.cephmds00.asok ops
{
"ops": [
{
"description": "client_request(client.90142364:4288428
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
05:39:52.257975 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 05:39:52.258280",
"age": 8189.169056,
"duration": 8189.169070,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:4288428",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 4288428
},
"events": [
{
"time": "2020-02-26 05:39:52.258280",
"event": "initiated"
},
{
"time": "2020-02-26 05:39:52.258280",
"event": "header_read"
},
{
"time": "2020-02-26 05:39:52.258281",
"event": "throttled"
},
{
"time": "2020-02-26 05:39:52.258283",
"event": "all_read"
},
{
"time": "2020-02-26 05:39:52.258362",
"event": "dispatched"
},
{
"time": "2020-02-26 05:39:52.258383",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:4425681
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
06:23:52.201811 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 06:23:52.201698",
"age": 5549.225638,
"duration": 5549.225679,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:4425681",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 4425681
},
"events": [
{
"time": "2020-02-26 06:23:52.201698",
"event": "initiated"
},
{
"time": "2020-02-26 06:23:52.201698",
"event": "header_read"
},
{
"time": "2020-02-26 06:23:52.201699",
"event": "throttled"
},
{
"time": "2020-02-26 06:23:52.201702",
"event": "all_read"
},
{
"time": "2020-02-26 06:23:52.201720",
"event": "dispatched"
},
{
"time": "2020-02-26 06:23:52.201740",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:4111017
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
04:53:29.372691 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 04:53:29.373246",
"age": 10972.054090,
"duration": 10972.054155,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:4111017",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 4111017
},
"events": [
{
"time": "2020-02-26 04:53:29.373246",
"event": "initiated"
},
{
"time": "2020-02-26 04:53:29.373246",
"event": "header_read"
},
{
"time": "2020-02-26 04:53:29.373246",
"event": "throttled"
},
{
"time": "2020-02-26 04:53:29.373249",
"event": "all_read"
},
{
"time": "2020-02-26 04:53:29.373299",
"event": "dispatched"
},
{
"time": "2020-02-26 04:53:29.373323",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90118987:32628772
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
04:09:03.139173 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 04:09:03.138539",
"age": 13638.288797,
"duration": 13638.288884,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90118987:32628772",
"op_type": "client_request",
"client_info": {
"client": "client.90118987",
"tid": 32628772
},
"events": [
{
"time": "2020-02-26 04:09:03.138539",
"event": "initiated"
},
{
"time": "2020-02-26 04:09:03.138539",
"event": "header_read"
},
{
"time": "2020-02-26 04:09:03.138540",
"event": "throttled"
},
{
"time": "2020-02-26 04:09:03.138543",
"event": "all_read"
},
{
"time": "2020-02-26 04:09:03.138569",
"event": "dispatched"
},
{
"time": "2020-02-26 04:09:03.138592",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:3801617 open
#0x10048b87a3b 2020-02-26 03:42:58.359934 caller_uid=1261,
caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 03:42:58.359502",
"age": 15203.067834,
"duration": 15203.067944,
"type_data": {
"flag_point": "failed to xlock, waiting",
"reqid": "client.90142364:3801617",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 3801617
},
"events": [
{
"time": "2020-02-26 03:42:58.359502",
"event": "initiated"
},
{
"time": "2020-02-26 03:42:58.359502",
"event": "header_read"
},
{
"time": "2020-02-26 03:42:58.359502",
"event": "throttled"
},
{
"time": "2020-02-26 03:42:58.359505",
"event": "all_read"
},
{
"time": "2020-02-26 03:42:58.359525",
"event": "dispatched"
},
{
"time": "2020-02-26 03:42:58.359557",
"event": "failed to xlock, waiting"
}
]
}
},
{
"description": "client_request(client.90145864:3374743
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
06:24:02.082413 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 06:24:02.081762",
"age": 5539.345573,
"duration": 5539.345708,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90145864:3374743",
"op_type": "client_request",
"client_info": {
"client": "client.90145864",
"tid": 3374743
},
"events": [
{
"time": "2020-02-26 06:24:02.081762",
"event": "initiated"
},
{
"time": "2020-02-26 06:24:02.081762",
"event": "header_read"
},
{
"time": "2020-02-26 06:24:02.081763",
"event": "throttled"
},
{
"time": "2020-02-26 06:24:02.081766",
"event": "all_read"
},
{
"time": "2020-02-26 06:24:02.081787",
"event": "dispatched"
},
{
"time": "2020-02-26 06:24:02.081804",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.90142364:3900154
lookup #0x10048b61f09/tesscut_20200226034255.zip 2020-02-26
04:08:50.572109 caller_uid=1261, caller_gid=1261{1193,1207,1223,1261,})",
"initiated_at": "2020-02-26 04:08:50.572289",
"age": 13650.855047,
"duration": 13650.855205,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.90142364:3900154",
"op_type": "client_request",
"client_info": {
"client": "client.90142364",
"tid": 3900154
},
"events": [
{
"time": "2020-02-26 04:08:50.572289",
"event": "initiated"
},
{
"time": "2020-02-26 04:08:50.572289",
"event": "header_read"
},
{
"time": "2020-02-26 04:08:50.572290",
"event": "throttled"
},
{
"time": "2020-02-26 04:08:50.572293",
"event": "all_read"
},
{
"time": "2020-02-26 04:08:50.572318",
"event": "dispatched"
},
{
"time": "2020-02-26 04:08:50.572338",
"event": "failed to rdlock, waiting"
}
]
}
}
],
"num_ops": 7
}
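For reference, client IDs like 90142364 and 90118987 in these ops can be
mapped back to a host/mount by listing the MDS sessions - something along
these lines (the exact field names may vary a little between versions):

  # List all client sessions on the active MDS; each entry carries the client
  # id plus client_metadata (hostname, mount point, ceph version, etc.).
  ceph daemon /var/run/ceph/ceph-mds.cephmds00.asok session ls > /tmp/sessions.json
  # Crude but effective: pull out the entry for a particular client id.
  grep -A 15 '"id": 90118987' /tmp/sessions.json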
There is another client in this list (90118987) that is also trying to
access the same file - here is the mds_requests output from that client
(worker1020):
[root@worker1020 ~]# ceph daemon
/var/run/ceph/ceph-client.cephfs.6958.93825141492560.asok mds_requests
{
"request": {
"tid": 32628772,
"op": "lookup",
"path": "#0x10048b61f09/tesscut_20200226034255.zip",
"path2": "",
"ino": "0x10048b61f09",
"hint_ino": "0x0",
"sent_stamp": "2020-02-26 04:09:03.139177",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 1,
"got_unsafe": 0,
"uid": 1261,
"gid": 1261,
"oldest_client_tid": 32628772,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0,
"abort_rc": 0
}
}
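Since the MDS ops above are all stuck at "failed to rdlock, waiting" or
"failed to xlock, waiting", it might also be worth checking whether the MDS
itself has any RADOS operations outstanding (for example against osd.1058,
which is reporting the slow op). A sketch, using the MDS admin socket:

  # Show the RADOS operations the MDS currently has in flight to the OSDs;
  # anything stuck here for a long time would point at an OSD-side problem
  # rather than a purely client-side one.
  ceph daemon /var/run/ceph/ceph-mds.cephmds00.asok objecter_requests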
The kernel (CentOS 7, 3.10.0-1062.9.1.el7.x86_64) also reports hung tasks
on these clients. From dmesg:
[539762.054205] INFO: task python:1098178 blocked for more than 120 seconds.
[539762.054210] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054212] python D ffff8835f8788640 0 1098178 1098093
0x00000000
[539762.054218] Call Trace:
[539762.054252] [<ffffffffc09abfcf>] ? fuse_do_open+0x14f/0x1e0 [fuse]
[539762.054259] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054263] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054269] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054274] [<ffffffff95d28df6>] ima_file_check+0xa6/0x1b0
[539762.054280] [<ffffffff95c592ca>] do_last+0x59a/0x1290
[539762.054284] [<ffffffff95c5bdbd>] path_openat+0xcd/0x5a0
[539762.054290] [<ffffffff95c4737a>] ? __check_object_size+0x1ca/0x250
[539762.054294] [<ffffffff95c5d72d>] do_filp_open+0x4d/0xb0
[539762.054299] [<ffffffff95c6b207>] ? __alloc_fd+0x47/0x170
[539762.054303] [<ffffffff95c49684>] do_sys_open+0x124/0x220
[539762.054307] [<ffffffff95c4979e>] SyS_open+0x1e/0x20
[539762.054311] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054317] INFO: task python:1098179 blocked for more than 120 seconds.
[539762.054318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054320] python D ffff88847f25ac80 0 1098179 1098093
0x00000000
[539762.054324] Call Trace:
[539762.054328] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054331] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054335] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054344] [<ffffffffc09aeb3b>] fuse_file_aio_write+0xdb/0x390 [fuse]
[539762.054348] [<ffffffff95c49d03>] do_sync_write+0x93/0xe0
[539762.054353] [<ffffffff95c4a7f0>] vfs_write+0xc0/0x1f0
[539762.054357] [<ffffffff95c4b60f>] SyS_write+0x7f/0xf0
[539762.054360] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054366] INFO: task python:1098180 blocked for more than 120 seconds.
[539762.054368] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054369] python D ffff88847f25ac80 0 1098180 1098093
0x00000000
[539762.054372] Call Trace:
[539762.054376] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054379] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054384] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054392] [<ffffffffc09aeb3b>] fuse_file_aio_write+0xdb/0x390 [fuse]
[539762.054398] [<ffffffff95bf1249>] ? handle_pte_fault+0x2b9/0xe20
[539762.054402] [<ffffffff95c49d03>] do_sync_write+0x93/0xe0
[539762.054406] [<ffffffff95c4a7f0>] vfs_write+0xc0/0x1f0
[539762.054410] [<ffffffff95c4b60f>] SyS_write+0x7f/0xf0
[539762.054413] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054420] INFO: task python:1098181 blocked for more than 120 seconds.
[539762.054421] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054423] python D ffff88847f45ac80 0 1098181 1098093
0x00000004
[539762.054426] Call Trace:
[539762.054430] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054433] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054438] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054446] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054452] [<ffffffff96188678>] ? __do_page_fault+0x238/0x500
[539762.054455] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054458] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054461] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054465] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054482] INFO: task python:1098186 blocked for more than 120 seconds.
[539762.054483] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054485] python D ffff88847f45ac80 0 1098186 1098093
0x00000000
[539762.054488] Call Trace:
[539762.054492] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054494] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054499] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054507] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054511] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054514] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054517] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054520] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054529] INFO: task python:1098188 blocked for more than 120 seconds.
[539762.054531] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054532] python D ffff88847f39ac80 0 1098188 1098093
0x00000004
[539762.054536] Call Trace:
[539762.054539] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054542] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054546] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054554] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054559] [<ffffffff96188678>] ? __do_page_fault+0x238/0x500
[539762.054562] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054565] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054568] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054572] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054577] INFO: task python:1098189 blocked for more than 120 seconds.
[539762.054579] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054580] python D ffff88847f29ac80 0 1098189 1098093
0x00000004
[539762.054584] Call Trace:
[539762.054587] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054590] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054595] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054602] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054606] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054609] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054612] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054615] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054626] INFO: task python:1098192 blocked for more than 120 seconds.
[539762.054628] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054630] python D ffff88847f31ac80 0 1098192 1098093
0x00000004
[539762.054633] Call Trace:
[539762.054636] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054639] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054644] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054651] [<ffffffffc09ac5e6>] fuse_flush+0xb6/0x200 [fuse]
[539762.054656] [<ffffffff96188678>] ? __do_page_fault+0x238/0x500
[539762.054659] [<ffffffff95c47ce7>] filp_close+0x37/0x90
[539762.054662] [<ffffffff95c6b4ac>] __close_fd+0x8c/0xb0
[539762.054665] [<ffffffff95c49803>] SyS_close+0x23/0x50
[539762.054668] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539762.054694] INFO: task python:1098202 blocked for more than 120 seconds.
[539762.054696] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539762.054697] python D ffff88847f49ac80 0 1098202 1098093
0x00000000
[539762.054701] Call Trace:
[539762.054709] [<ffffffffc09abfcf>] ? fuse_do_open+0x14f/0x1e0 [fuse]
[539762.054713] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539762.054715] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539762.054720] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539762.054723] [<ffffffff95d28df6>] ima_file_check+0xa6/0x1b0
[539762.054727] [<ffffffff95c592ca>] do_last+0x59a/0x1290
[539762.054731] [<ffffffff95c5bdbd>] path_openat+0xcd/0x5a0
[539762.054735] [<ffffffff95c4737a>] ? __check_object_size+0x1ca/0x250
[539762.054739] [<ffffffff95c5d72d>] do_filp_open+0x4d/0xb0
[539762.054744] [<ffffffff95c6b207>] ? __alloc_fd+0x47/0x170
[539762.054748] [<ffffffff95c49684>] do_sys_open+0x124/0x220
[539762.054751] [<ffffffff95c4979e>] SyS_open+0x1e/0x20
[539762.054755] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
[539882.054331] INFO: task python:1098178 blocked for more than 120 seconds.
[539882.054334] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[539882.054335] python D ffff8835f8788640 0 1098178 1098093
0x00000004
[539882.054339] Call Trace:
[539882.054367] [<ffffffffc09abfcf>] ? fuse_do_open+0x14f/0x1e0 [fuse]
[539882.054371] [<ffffffff96181929>] schedule_preempt_disabled+0x29/0x70
[539882.054374] [<ffffffff9617f8b7>] __mutex_lock_slowpath+0xc7/0x1d0
[539882.054377] [<ffffffff9617ec8f>] mutex_lock+0x1f/0x2f
[539882.054381] [<ffffffff95d28df6>] ima_file_check+0xa6/0x1b0
[539882.054385] [<ffffffff95c592ca>] do_last+0x59a/0x1290
[539882.054387] [<ffffffff95c5bdbd>] path_openat+0xcd/0x5a0
[539882.054391] [<ffffffff95c4737a>] ? __check_object_size+0x1ca/0x250
[539882.054394] [<ffffffff95c5d72d>] do_filp_open+0x4d/0xb0
[539882.054398] [<ffffffff95c6b207>] ? __alloc_fd+0x47/0x170
[539882.054400] [<ffffffff95c49684>] do_sys_open+0x124/0x220
[539882.054402] [<ffffffff95c4979e>] SyS_open+0x1e/0x20
[539882.054405] [<ffffffff9618dede>] system_call_fastpath+0x25/0x2a
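If userspace stack traces from the stuck ceph-fuse process are useful, they
can be captured with gdb - a sketch, assuming gdb and the ceph debuginfo
packages are installed on the worker node (6969 is the ceph-fuse pid, as in
the admin socket path above):

  # Attach to the ceph-fuse process, dump backtraces of all threads, detach.
  gdb -p 6969 --batch -ex 'thread apply all bt' > /tmp/ceph-fuse-backtraces.txt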
Now, seemingly independently of this, there is a degraded PG in the data
pool (interestingly, NOT in the metadata pool):
[root@cephmon00 ~]# ceph pg dump | grep degraded
dumped all
2.14b7 16292 0 1 0 0
25530224980 0 0 3037 3037
active+recovering+degraded 2020-02-26 08:03:40.561062 1492172'6549513
1492172:10385376 [1058,2098,734] 1058 [1058,2098,734]
1058 1492086'6537090 2020-02-22 20:06:11.092352 1485071'6362109
2020-01-29 21:48:02.821467 0
[root@cephmon00 ~]# ceph osd pool ls detail
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 1
object_hash rjenkins pg_num 32768 pgp_num 32768 last_change 1484983 lfor
0/1355093 flags hashpspool min_write_recency_for_promote 1 stripe_width
0 application cephfs
pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 3
object_hash rjenkins pg_num 512 pgp_num 512 last_change 1484989 flags
hashpspool min_write_recency_for_promote 1 stripe_width 0 application cephfs
pool 5 'rbd_vm' replicated size 3 min_size 2 crush_rule 1 object_hash
rjenkins pg_num 1024 pgp_num 1024 last_change 1487461 flags
hashpspool,selfmanaged_snaps min_write_recency_for_promote 1
stripe_width 0 application rbd
removed_snaps [1~3,5~2]
pool 9 'cephfs_data_ec63' erasure size 9 min_size 7 crush_rule 2
object_hash rjenkins pg_num 32768 pgp_num 32768 last_change 1484986 lfor
0/1355095 flags hashpspool,ec_overwrites stripe_width 24576 application
cephfs
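In case it is useful, the full recovery state of that PG can be pulled with
a pg query (output omitted here):

  # Dump pg 2.14b7's full state, including its recovery progress and peering
  # information for the acting set [1058,2098,734].
  ceph pg 2.14b7 query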
osd.1058 (the primary of pg 2.14b7) keeps complaining about a missing
object copy and a full-object read CRC mismatch:
...
2020-02-26 08:05:44.453 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 missing primary copy of 2:ed28cfe5:::10048b87a3b.00000001:head,
will try copies on 734,2098
2020-02-26 08:05:44.585 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.609 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.609 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 missing primary copy of 2:ed28cfe5:::10048b87a3b.00000001:head,
will try copies on 734,2098
2020-02-26 08:05:44.717 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.741 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 full-object read crc 0x91584ffd != expected 0xbd4cf65d on
2:ed28cfe5:::10048b87a3b.00000001:head
2020-02-26 08:05:44.741 7fffcb3fe700 -1 log_channel(cluster) log [ERR] :
2.14b7 missing primary copy of 2:ed28cfe5:::10048b87a3b.00000001:head,
will try copies on 734,2098
...
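Note that the object named in these errors, 10048b87a3b.00000001, is a data
object of inode 0x10048b87a3b - the same inode the stuck open request on
worker1003 is blocked on - and, per the log lines above, it lives in the
degraded pg 2.14b7. As a sanity check, the object-to-PG mapping can be
confirmed with:

  # Map the object named in the OSD errors to its PG and acting set; this
  # should come back as pg 2.14b7 on [1058,2098,734].
  ceph osd map cephfs_data 10048b87a3b.00000001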
When I stop this OSD, one of the secondaries takes over as primary, and
then THAT OSD starts complaining about exactly the same issue.
What is most bizarre is that if I power cycle the stuck client nodes, the
whole issue resolves itself, including the degraded PG. So somehow these
stuck ceph-fuse processes are responsible for the degraded PG as well -
I'm not quite sure how. This has happened 4 times in the past few days on
different nodes, so it is definitely not a hardware issue.
Are there any known issues we're running into here, or is this a new
problem?
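Also, would evicting the stuck client sessions from the MDS (rather than
power cycling the nodes) be expected to be a safe way to clear this?
Something like:

  # Possible less drastic workaround: evict the session of the client that
  # is failing to release caps (client_id 90142364, worker1003 above).
  ceph tell mds.0 client evict id=90142364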
Andras