We've been running into a strange issue with ceph-fuse on some of our
nodes lately. After a job runs on the node (and finishes or gets
killed), ceph-fuse gets stuck busily requesting objects from the OSDs
even though no process on the node is using cephfs anymore. When this
happens, ceph-fuse uses 2-3 cores, spinning in what looks like an
infinite loop of objecter requests to various OSDs for files that were
presumably opened by a process that is long gone. The same objects get
requested over and over, every few seconds, cycling through a list of
objects. This comes close to saturating the node's 25Gbps NIC. I have a
one-minute log captured with debug_objecter/objectcacher/client/context/finisher
set to 20 that traces this, but I don't see exactly why it is happening.
Can someone with a better understanding of the client -> cache -> OSD
flow share some insight into what is going wrong here?
Here is an example:
2023-02-03T14:42:47.593-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42777727 osd.4349
2023-02-03T14:43:00.056-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42778921 osd.4349
2023-02-03T14:43:02.048-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42779104 osd.4349
2023-02-03T14:43:05.392-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42779408 osd.4349
2023-02-03T14:43:10.076-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42779847 osd.4349
2023-02-03T14:43:13.288-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42780156 osd.4349
2023-02-03T14:43:18.908-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42780697 osd.4349
2023-02-03T14:43:29.056-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42781660 osd.4349
2023-02-03T14:43:33.707-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
42782079 osd.4349
2023-02-03T14:42:41.609-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
42777194 osd.3251
2023-02-03T14:42:49.809-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
42777954 osd.3251
2023-02-03T14:43:07.884-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
42779646 osd.3251
2023-02-03T14:43:16.736-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
42780500 osd.3251
2023-02-03T14:43:22.160-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
42781009 osd.3251
2023-02-03T14:43:31.603-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
42781892 osd.3251
2023-02-03T14:43:35.503-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
42782258 osd.3251
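(A rough rate check, assuming the objecter tids on this client are
consumed almost entirely by these reads: the tid goes from 42777727 at
14:42:47 to 42782079 at 14:43:33, i.e. roughly 4350 ops in ~46 seconds,
or about 95 reads per second. At 24 MiB (0~25165824) per read that is
on the order of 19 Gbps, which lines up with the NIC being close to
saturated.)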
Taking a specific object as an example: 100dbad3fce.000000b3
2023-02-03T14:42:46.293-0500 7fffdbfff700 10 objectcacher readx
extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [729808896,25165824])
2023-02-03T14:42:46.293-0500 7fffdbfff700 10
objectcacher.object(100dbad3fce.000000b3/head) map_read
100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:46.293-0500 7fffdbfff700 20
objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824
left, bh[ 0x7fffb0f461a0 0~25165824 0x7fffb0e4b720 (0) v 0 missing]
waiters = {}
... a few times the above ...
... then an OSD read ...
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 objectcacher readx
extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
2023-02-03T14:42:48.557-0500 7fffdbfff700 10
objectcacher.object(100dbad3fce.000000b3/head) map_read
100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:48.557-0500 7fffdbfff700 20
objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824
left, bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 missing]
waiters = {}
2023-02-03T14:42:48.557-0500 7fffdbfff700 7 objectcacher bh_read on bh[
0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 missing] waiters = {}
outstanding reads 170
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit op 0x7fffb12de170
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
_calc_target epoch 2609365 base 100dbad3fce.000000b3 @9 precalc_pgid 0
pgid 0.0 is_read
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
_calc_target target 100dbad3fce.000000b3 @9 -> pgid 9.502eec40
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter
_calc_target raw pgid 9.502eec40 -> actual 9.6c40s0 acting
[975,3712,536,3462,2276,1365,2726,1648,4053] primary 975
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
_get_session s=0x7fff9824d360 osd=975 3
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter
_op_submit oid 100dbad3fce.000000b3 '@9' '@9' [read 0~25165824] tid
42777830 osd.975
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
get_session s=0x7fff9824d360 osd=975 3
2023-02-03T14:42:48.557-0500 7fffdbfff700 15 client.151672236.objecter
_session_op_assign 975 42777830
2023-02-03T14:42:48.557-0500 7fffdbfff700 15 client.151672236.objecter
_send_op 42777830 to 9.6c40s0 on osd.975
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
put_session s=0x7fff9824d360 osd=975 4
2023-02-03T14:42:48.557-0500 7fffdbfff700 5 client.151672236.objecter
170 in flight
...
2023-02-03T14:42:50.045-0500 7fffdbfff700 10 objectcacher readx
extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [729808896,25165824])
2023-02-03T14:42:50.045-0500 7fffdbfff700 10
objectcacher.object(100dbad3fce.000000b3/head) map_read
100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:50.045-0500 7fffdbfff700 20
objectcacher.object(100dbad3fce.000000b3/head) map_read rx bh[
0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 rx] waiters = {
0->[0x7fffb0798310, ]}
... read reply ...
2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter
ms_dispatch 0x55555e305810 osd_op_reply(42777830 100dbad3fce.000000b3
[read 0~25165824 out=25165824b] v0'0 uv471599 ondisk = 0) v8
2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter
in handle_osd_op_reply
2023-02-03T14:42:51.653-0500 7fffeb7fe700 7 client.151672236.objecter
handle_osd_op_reply 42777830 ondisk uv 471599 in 9.6c40 attempt 0
2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter
op 0 rval 0 len 25165824
2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter
handle_osd_op_reply completed tid 42777830
2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter
_finish_op 42777830
2023-02-03T14:42:51.653-0500 7fffeb7fe700 20 client.151672236.objecter
put_session s=0x7fff9824d360 osd=975 5
2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter
_session_op_remove 975 42777830
2023-02-03T14:42:51.653-0500 7fffeb7fe700 5 client.151672236.objecter
169 in flight
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 finisher(0x55555e43bf90)
finisher_thread doing [0x7fffb0f2ad60,0]
2023-02-03T14:42:51.657-0500 7fffdbfff700 7 objectcacher bh_read_finish
100dbad3fce.000000b3/head tid 42770553 0~25165824 (bl is 25165824)
returned 0 outstanding reads 170
2023-02-03T14:42:51.657-0500 7fffdbfff700 20 objectcacher checking bh
bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 rx] waiters = {
0->[0x7fffb0798310, ]}
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher bh_read_finish
read bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean
firstbyte=5] waiters = {}
2023-02-03T14:42:51.657-0500 7fffdbfff700 10
objectcacher.object(100dbad3fce.000000b3/head) try_merge_bh bh[
0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean
firstbyte=5] waiters = {}
2023-02-03T14:42:51.657-0500 7fffdbfff700 20 objectcacher finishing
waiters 0x7fffb0798310
...
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher readx
extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
2023-02-03T14:42:51.657-0500 7fffdbfff700 10
objectcacher.object(100dbad3fce.000000b3/head) map_read
100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:51.657-0500 7fffdbfff700 20
objectcacher.object(100dbad3fce.000000b3/head) map_read hit bh[
0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean
firstbyte=5] waiters = {}
... a few times the above ...
2023-02-03T14:42:55.085-0500 7fffacff9700 10 objectcacher trim trimming
object[100dbad3fce.000000b3/head oset 0x7fffd2895870 wr 0/0]
2023-02-03T14:42:55.085-0500 7fffacff9700 10 objectcacher close_object
object[100dbad3fce.000000b3/head oset 0x7fffd2895870 wr 0/0]
...
2023-02-03T14:42:55.293-0500 7fffdbfff700 10 objectcacher readx
extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
2023-02-03T14:42:55.293-0500 7fffdbfff700 10
objectcacher.object(100dbad3fce.000000b3/head) map_read
100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:55.293-0500 7fffdbfff700 20
objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824
left, bh[ 0x7fffb1c36480 0~25165824 0x7fffb0dfefa0 (0) v 0 missing]
waiters = {}
... some more times ...
... then eventually another OSD read for the same object - and the cycle
repeats ...
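To summarize my reading of the trace above (an interpretation, not a
confirmed diagnosis): a readx misses in the object cacher, bh_read
issues a full-object OSD read, the reply turns the bh clean and wakes
the waiter, which now hits; but the trimmer closes the object almost
immediately, so the next readx for the same object misses again and
issues yet another OSD read. Below is a tiny toy model of that
eviction-driven re-read loop in plain Python. It is not Ceph code, all
names are made up, and it only models the cache-thrash part, not
whatever keeps issuing the readx calls in the first place:

# Toy model (not Ceph code; all names are illustrative) of the
# eviction-driven re-read loop described above: when the set of objects
# being (re)read is larger than what the cacher will keep, every read
# misses and goes back to the OSD.

class ToyObjectCacher:
    def __init__(self, max_cached_objects):
        self.cache = {}                       # oid -> data ("clean bh")
        self.max_cached_objects = max_cached_objects
        self.osd_reads = 0                    # counts simulated OSD reads

    def readx(self, oid):
        if oid in self.cache:                 # "map_read hit"
            return self.cache[oid]
        data = self._bh_read(oid)             # "map_read miss" -> OSD read
        self.cache[oid] = data                # "bh_read_finish": bh becomes clean
        self._trim()                          # trim may close the object right away
        return data

    def _bh_read(self, oid):
        self.osd_reads += 1                   # would be an objecter op submit
        return "data-for-" + oid

    def _trim(self):
        # Evict oldest objects until back under the limit
        # (corresponds to "trim ... close_object" in the log).
        while len(self.cache) > self.max_cached_objects:
            self.cache.pop(next(iter(self.cache)))

# A working set of 3 objects against a cache that keeps only 2:
# every single readx ends up as an OSD read, forever.
cacher = ToyObjectCacher(max_cached_objects=2)
for _ in range(3):
    for oid in ("0.2b", "0.2c", "0.b3"):      # made-up object names
        cacher.readx(oid)
print(cacher.osd_reads)                       # 9, i.e. one OSD read per readx

Running it prints 9, one OSD read per readx, which is the pattern the
log seems to show at a much larger scale. What I still cannot explain
from the log is what keeps calling readx once the processes are gone.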
The client is ceph-fuse Pacific 16.2.10; the backend cluster is Octopus
15.2.7. The cluster is healthy during this time: no down OSDs, no
recoveries, etc.
Here is the tracker: https://tracker.ceph.com/issues/58640
Any insight/help would be appreciated,
Andras