Hi Andras,

On Sat, Feb 4, 2023 at 1:59 AM Andras Pataki <apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> We've been running into a strange issue with ceph-fuse on some nodes
> lately. After some job runs on the node (and finishes or gets killed),
> ceph-fuse gets stuck busily requesting objects from the OSDs without
> any process on the node using cephfs. When this happens, ceph-fuse
> uses 2-3 cores, spinning in what seems like an infinite loop making
> objecter requests to various OSDs for files that were perhaps
> requested by some process that is long gone. The same object gets
> requested every few seconds, cycling through a list of objects. This
> nearly saturates the node's 25Gbps NIC. I have a one-minute log file
> with debug_objecter/objectcacher/client/context/finisher set to 20
> that traces this, but I don't see why this is happening exactly. Can
> someone with a better understanding of the client->cache->OSD flow
> share some insight into what is going wrong here?
>
> Here is an example:
>
> 2023-02-03T14:42:47.593-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42777727 osd.4349
> 2023-02-03T14:43:00.056-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42778921 osd.4349
> 2023-02-03T14:43:02.048-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42779104 osd.4349
> 2023-02-03T14:43:05.392-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42779408 osd.4349
> 2023-02-03T14:43:10.076-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42779847 osd.4349
> 2023-02-03T14:43:13.288-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42780156 osd.4349
> 2023-02-03T14:43:18.908-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42780697 osd.4349
> 2023-02-03T14:43:29.056-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42781660 osd.4349
> 2023-02-03T14:43:33.707-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid
> 42782079 osd.4349
> 2023-02-03T14:42:41.609-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
> 42777194 osd.3251
> 2023-02-03T14:42:49.809-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
> 42777954 osd.3251
> 2023-02-03T14:43:07.884-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
> 42779646 osd.3251
> 2023-02-03T14:43:16.736-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
> 42780500 osd.3251
> 2023-02-03T14:43:22.160-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
> 42781009 osd.3251
> 2023-02-03T14:43:31.603-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
> 42781892 osd.3251
> 2023-02-03T14:43:35.503-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid
> 42782258 osd.3251
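
To quantify the churn, a rough, untested sketch along these lines can
tally submissions per (object, OSD) pair from the debug_objecter=20
log. It assumes one log entry per line, as in the raw client log (the
entries above are only wrapped for mail); the script name is mine:

    #!/usr/bin/env python3
    # count_resubmits.py: tally objecter read submissions per
    # (object, OSD) pair from a ceph-fuse debug log on stdin. Any
    # object submitted many times within a one-minute capture is a
    # suspect for the resubmission loop described above.
    import re
    import sys
    from collections import Counter

    # Matches e.g.:
    # ... _op_submit oid 10097d5e07f.0000002b '@9' '@9'
    #     [read 0~25165824] tid 42777727 osd.4349
    pat = re.compile(r"_op_submit oid (\S+) .* tid \d+ (osd\.\d+)")

    counts = Counter()
    for line in sys.stdin:
        m = pat.search(line)
        if m:
            counts[m.groups()] += 1

    # Most frequently resubmitted objects first.
    for (oid, osd), n in counts.most_common(20):
        print(f"{n:6d}  {oid}  {osd}")

On the excerpt above it would report 9 submissions of
10097d5e07f.0000002b to osd.4349 and 7 of 10097d5e07f.0000002c to
osd.3251 inside one minute - for a single 24 MiB extent each.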
>
> Taking a specific object as an example: 100dbad3fce.000000b3
>
> 2023-02-03T14:42:46.293-0500 7fffdbfff700 10 objectcacher readx
> extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [729808896,25165824])
> 2023-02-03T14:42:46.293-0500 7fffdbfff700 10
> objectcacher.object(100dbad3fce.000000b3/head) map_read
> 100dbad3fce.000000b3 0~25165824
> 2023-02-03T14:42:46.293-0500 7fffdbfff700 20
> objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824
> left, bh[ 0x7fffb0f461a0 0~25165824 0x7fffb0e4b720 (0) v 0 missing]
> waiters = {}
> ... a few times the above ...
> ... then an OSD read ...
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 10 objectcacher readx
> extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 10
> objectcacher.object(100dbad3fce.000000b3/head) map_read
> 100dbad3fce.000000b3 0~25165824
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 20
> objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824
> left, bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 missing]
> waiters = {}
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 7 objectcacher bh_read on bh[
> 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 missing] waiters = {}
> outstanding reads 170
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit op 0x7fffb12de170
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
> _calc_target epoch 2609365 base 100dbad3fce.000000b3 @9 precalc_pgid 0
> pgid 0.0 is_read
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
> _calc_target target 100dbad3fce.000000b3 @9 -> pgid 9.502eec40
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter
> _calc_target raw pgid 9.502eec40 -> actual 9.6c40s0 acting
> [975,3712,536,3462,2276,1365,2726,1648,4053] primary 975
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
> _get_session s=0x7fff9824d360 osd=975 3
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter
> _op_submit oid 100dbad3fce.000000b3 '@9' '@9' [read 0~25165824] tid
> 42777830 osd.975
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
> get_session s=0x7fff9824d360 osd=975 3
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 15 client.151672236.objecter
> _session_op_assign 975 42777830
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 15 client.151672236.objecter
> _send_op 42777830 to 9.6c40s0 on osd.975
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter
> put_session s=0x7fff9824d360 osd=975 4
> 2023-02-03T14:42:48.557-0500 7fffdbfff700 5 client.151672236.objecter
> 170 in flight
> ...
> 2023-02-03T14:42:50.045-0500 7fffdbfff700 10 objectcacher readx
> extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [729808896,25165824])
> 2023-02-03T14:42:50.045-0500 7fffdbfff700 10
> objectcacher.object(100dbad3fce.000000b3/head) map_read
> 100dbad3fce.000000b3 0~25165824
> 2023-02-03T14:42:50.045-0500 7fffdbfff700 20
> objectcacher.object(100dbad3fce.000000b3/head) map_read rx bh[
> 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 rx] waiters = {
> 0->[0x7fffb0798310, ]}
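
The trace so far is ObjectCacher's read path. In rough Python
pseudocode (readx/map_read/bh_read are the real method names from
src/osdc/ObjectCacher.cc; the rendering and the simplifications are
mine, and map_read/bh_read/assemble here just stand in for the real
C++):

    # Simplified pseudocode sketch of the read path visible in the trace.
    def readx(extent, onfinish):
        hits, misses, rx = map_read(extent)  # split the extent into cached,
                                             # absent, and in-flight bufferheads
        for bh in misses:                    # "map_read miss ... bh[... missing]"
            bh.waiters.append(onfinish)      # park the caller on the bufferhead
            bh_read(bh)                      # "bh_read on bh[...]" builds an op and
                                             # hands it to the objecter: "_op_submit"
        for bh in rx:                        # "map_read rx bh[... rx]": a read is
            bh.waiters.append(onfinish)      # already in flight, so just wait on it
        if misses or rx:
            return 0                         # blocked; onfinish re-drives readx
                                             # once the reply fills the bufferhead
        return assemble(hits)                # pure cache hit: return the data

So every readx that can't be served from cache either starts a 24 MiB
OSD read or queues behind one, and the caller is retried when the data
lands - which is why what happens to the bufferhead between the reply
and the retry matters.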
> ... read reply ...
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter
> ms_dispatch 0x55555e305810 osd_op_reply(42777830 100dbad3fce.000000b3
> [read 0~25165824 out=25165824b] v0'0 uv471599 ondisk = 0) v8
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter
> in handle_osd_op_reply
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 7 client.151672236.objecter
> handle_osd_op_reply 42777830 ondisk uv 471599 in 9.6c40 attempt 0
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter
> op 0 rval 0 len 25165824
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter
> handle_osd_op_reply completed tid 42777830
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter
> _finish_op 42777830
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 20 client.151672236.objecter
> put_session s=0x7fff9824d360 osd=975 5
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter
> _session_op_remove 975 42777830
> 2023-02-03T14:42:51.653-0500 7fffeb7fe700 5 client.151672236.objecter
> 169 in flight
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 10 finisher(0x55555e43bf90)
> finisher_thread doing [0x7fffb0f2ad60,0]
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 7 objectcacher bh_read_finish
> 100dbad3fce.000000b3/head tid 42770553 0~25165824 (bl is 25165824)
> returned 0 outstanding reads 170
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 20 objectcacher checking bh
> bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 rx] waiters = {
> 0->[0x7fffb0798310, ]}
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher bh_read_finish
> read bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean
> firstbyte=5] waiters = {}
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 10
> objectcacher.object(100dbad3fce.000000b3/head) try_merge_bh bh[
> 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean
> firstbyte=5] waiters = {}
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 20 objectcacher finishing
> waiters 0x7fffb0798310
> ...
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher readx
> extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 10
> objectcacher.object(100dbad3fce.000000b3/head) map_read
> 100dbad3fce.000000b3 0~25165824
> 2023-02-03T14:42:51.657-0500 7fffdbfff700 20
> objectcacher.object(100dbad3fce.000000b3/head) map_read hit bh[
> 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean
> firstbyte=5] waiters = {}
> ... a few times the above ...
> 2023-02-03T14:42:55.085-0500 7fffacff9700 10 objectcacher trim trimming
> object[100dbad3fce.000000b3/head oset 0x7fffd2895870 wr 0/0]
> 2023-02-03T14:42:55.085-0500 7fffacff9700 10 objectcacher close_object
> object[100dbad3fce.000000b3/head oset 0x7fffd2895870 wr 0/0]
> ...
> 2023-02-03T14:42:55.293-0500 7fffdbfff700 10 objectcacher readx
> extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
> 2023-02-03T14:42:55.293-0500 7fffdbfff700 10
> objectcacher.object(100dbad3fce.000000b3/head) map_read
> 100dbad3fce.000000b3 0~25165824
> 2023-02-03T14:42:55.293-0500 7fffdbfff700 20
> objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824
> left, bh[ 0x7fffb1c36480 0~25165824 0x7fffb0dfefa0 (0) v 0 missing]
> waiters = {}
> ... some more times ...
> ... then eventually another OSD read for the same object - and the
> cycle repeats ...
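
That miss -> read -> hit -> trim -> miss sequence is easier to see
pulled out per object. Another rough sketch (same one-entry-per-line
assumption as before; the script name and event list are mine):

    #!/usr/bin/env python3
    # object_timeline.py: print the cache events for one object from a
    # ceph-fuse debug log on stdin, in log order, so the
    # miss -> read -> hit -> trim -> miss cycle is visible at a glance.
    import sys

    OID = "100dbad3fce.000000b3"   # the object traced above
    # Only events whose log line carries the object name are matched.
    EVENTS = ("map_read miss", "map_read hit", "osd_op_reply",
              "bh_read_finish", "trim trimming", "close_object")

    for line in sys.stdin:
        if OID not in line:
            continue
        for ev in EVENTS:
            if ev in line:
                print(line.split()[0], ev)   # timestamp + event kind
                break

On the trace above that yields misses at 14:42:46-48, the reply and
bh_read_finish at 14:42:51, hits until the trim/close_object at
14:42:55.085, and a fresh miss at 14:42:55.293 on a brand-new
bufferhead - i.e. the cache throws the 24 MiB away and something
immediately asks for it again.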
>
> The client is ceph-fuse Pacific 16.2.10; the backend cluster is
> Octopus 15.2.7. The cluster is healthy during this time: no down
> OSDs, recoveries, etc.
>
> Here is the tracker: https://tracker.ceph.com/issues/58640
>
> Any insight/help would be appreciated,

Please check the tracker for updates/progress.

> Andras

--
Cheers,
Venky