ceph-fuse in infinite loop reading objects without client requests

We've been running into a strange issue with ceph-fuse on some nodes lately.  After a job runs on the node (and finishes or gets killed), ceph-fuse gets stuck busily requesting objects from the OSDs even though no process on the node is using cephfs anymore.  When this happens, ceph-fuse uses 2-3 cores, spinning in what looks like an infinite loop of objecter requests to various OSDs for files that were presumably opened by a process that is long gone.  The same objects get requested every few seconds, cycling through a list of objects, and this comes close to saturating the node's 25 Gbps NIC.  I have a one-minute log with debug_objecter/objectcacher/client/context/finisher set to 20 that traces this, but I don't see exactly why it is happening.  Can someone with a better understanding of the client -> cache -> OSD flow share some insight into what is going wrong here?
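For completeness, this is roughly how the "nothing is using the mount" claim can be double-checked and how the debug trace can be captured on the running ceph-fuse via its admin socket.  The mount point and socket path below are just examples; the asok name depends on the client name and pid:

    fuser -vm /cephfs                                            # confirm nothing still has the mount open
    ceph daemon /var/run/ceph/ceph-client.admin.asok status      # basic client status via the admin socket
    ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_objecter 20
    ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_objectcacher 20
    ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_client 20
    ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_context 20
    ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_finisher 20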

Here is an example:

2023-02-03T14:42:47.593-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42777727 osd.4349
2023-02-03T14:43:00.056-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42778921 osd.4349
2023-02-03T14:43:02.048-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42779104 osd.4349
2023-02-03T14:43:05.392-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42779408 osd.4349
2023-02-03T14:43:10.076-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42779847 osd.4349
2023-02-03T14:43:13.288-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42780156 osd.4349
2023-02-03T14:43:18.908-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42780697 osd.4349
2023-02-03T14:43:29.056-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42781660 osd.4349
2023-02-03T14:43:33.707-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002b '@9' '@9' [read 0~25165824] tid 42782079 osd.4349
2023-02-03T14:42:41.609-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid 42777194 osd.3251
2023-02-03T14:42:49.809-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid 42777954 osd.3251
2023-02-03T14:43:07.884-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid 42779646 osd.3251
2023-02-03T14:43:16.736-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid 42780500 osd.3251
2023-02-03T14:43:22.160-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid 42781009 osd.3251
2023-02-03T14:43:31.603-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid 42781892 osd.3251
2023-02-03T14:43:35.503-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 10097d5e07f.0000002c '@9' '@9' [read 0~25165824] tid 42782258 osd.3251
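(Side note for anyone reading along: the object name is <inode in hex>.<block number in hex>, so the file that keeps being re-read can be identified.  Mount point and data pool name below are placeholders:)

    find /cephfs -inum $(printf '%d' 0x10097d5e07f)    # map the hex inode back to a path
    rados -p cephfs_data stat 10097d5e07f.0000002b     # sanity-check the object itself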

Taking a specific object as an example: 100dbad3fce.000000b3

2023-02-03T14:42:46.293-0500 7fffdbfff700 10 objectcacher readx extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [729808896,25165824])
2023-02-03T14:42:46.293-0500 7fffdbfff700 10 objectcacher.object(100dbad3fce.000000b3/head) map_read 100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:46.293-0500 7fffdbfff700 20 objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824 left, bh[ 0x7fffb0f461a0 0~25165824 0x7fffb0e4b720 (0) v 0 missing] waiters = {}
... a few times the above ...
... then an OSD read ...
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 objectcacher readx extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 objectcacher.object(100dbad3fce.000000b3/head) map_read 100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824 left, bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 missing] waiters = {}
2023-02-03T14:42:48.557-0500 7fffdbfff700  7 objectcacher bh_read on bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 missing] waiters = {} outstanding reads 170
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter _op_submit op 0x7fffb12de170
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter _calc_target epoch 2609365 base 100dbad3fce.000000b3 @9 precalc_pgid 0 pgid 0.0 is_read
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter _calc_target target 100dbad3fce.000000b3 @9 -> pgid 9.502eec40
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter _calc_target  raw pgid 9.502eec40 -> actual 9.6c40s0 acting [975,3712,536,3462,2276,1365,2726,1648,4053] primary 975
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter _get_session s=0x7fff9824d360 osd=975 3
2023-02-03T14:42:48.557-0500 7fffdbfff700 10 client.151672236.objecter _op_submit oid 100dbad3fce.000000b3 '@9' '@9' [read 0~25165824] tid 42777830 osd.975
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter get_session s=0x7fff9824d360 osd=975 3
2023-02-03T14:42:48.557-0500 7fffdbfff700 15 client.151672236.objecter _session_op_assign 975 42777830
2023-02-03T14:42:48.557-0500 7fffdbfff700 15 client.151672236.objecter _send_op 42777830 to 9.6c40s0 on osd.975
2023-02-03T14:42:48.557-0500 7fffdbfff700 20 client.151672236.objecter put_session s=0x7fff9824d360 osd=975 4
2023-02-03T14:42:48.557-0500 7fffdbfff700  5 client.151672236.objecter 170 in flight
...
2023-02-03T14:42:50.045-0500 7fffdbfff700 10 objectcacher readx extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [729808896,25165824])
2023-02-03T14:42:50.045-0500 7fffdbfff700 10 objectcacher.object(100dbad3fce.000000b3/head) map_read 100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:50.045-0500 7fffdbfff700 20 objectcacher.object(100dbad3fce.000000b3/head) map_read rx bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 rx] waiters = { 0->[0x7fffb0798310, ]}
... read reply ...
2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter ms_dispatch 0x55555e305810 osd_op_reply(42777830 100dbad3fce.000000b3 [read 0~25165824 out=25165824b] v0'0 uv471599 ondisk = 0) v8
2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter in handle_osd_op_reply
2023-02-03T14:42:51.653-0500 7fffeb7fe700  7 client.151672236.objecter handle_osd_op_reply 42777830 ondisk uv 471599 in 9.6c40 attempt 0
2023-02-03T14:42:51.653-0500 7fffeb7fe700 10 client.151672236.objecter  op 0 rval 0 len 25165824
2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter handle_osd_op_reply completed tid 42777830
2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter _finish_op 42777830
2023-02-03T14:42:51.653-0500 7fffeb7fe700 20 client.151672236.objecter put_session s=0x7fff9824d360 osd=975 5
2023-02-03T14:42:51.653-0500 7fffeb7fe700 15 client.151672236.objecter _session_op_remove 975 42777830
2023-02-03T14:42:51.653-0500 7fffeb7fe700  5 client.151672236.objecter 169 in flight
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 finisher(0x55555e43bf90) finisher_thread doing [0x7fffb0f2ad60,0]
2023-02-03T14:42:51.657-0500 7fffdbfff700  7 objectcacher bh_read_finish 100dbad3fce.000000b3/head tid 42770553 0~25165824 (bl is 25165824) returned 0 outstanding reads 170
2023-02-03T14:42:51.657-0500 7fffdbfff700 20 objectcacher checking bh bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (0) v 0 rx] waiters = { 0->[0x7fffb0798310, ]}
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher bh_read_finish read bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean firstbyte=5] waiters = {}
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher.object(100dbad3fce.000000b3/head) try_merge_bh bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean firstbyte=5] waiters = {}
2023-02-03T14:42:51.657-0500 7fffdbfff700 20 objectcacher finishing waiters 0x7fffb0798310
...
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher readx extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
2023-02-03T14:42:51.657-0500 7fffdbfff700 10 objectcacher.object(100dbad3fce.000000b3/head) map_read 100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:51.657-0500 7fffdbfff700 20 objectcacher.object(100dbad3fce.000000b3/head) map_read hit bh[ 0x7fffb123acd0 0~25165824 0x7fffb0e4b720 (25165824) v 0 clean firstbyte=5] waiters = {}
... a few times the above ...
2023-02-03T14:42:55.085-0500 7fffacff9700 10 objectcacher trim trimming object[100dbad3fce.000000b3/head oset 0x7fffd2895870 wr 0/0]
2023-02-03T14:42:55.085-0500 7fffacff9700 10 objectcacher close_object object[100dbad3fce.000000b3/head oset 0x7fffd2895870 wr 0/0]
...
2023-02-03T14:42:55.293-0500 7fffdbfff700 10 objectcacher readx extent(100dbad3fce.000000b3 (179) in @9 0~25165824 -> [50331648,25165824])
2023-02-03T14:42:55.293-0500 7fffdbfff700 10 objectcacher.object(100dbad3fce.000000b3/head) map_read 100dbad3fce.000000b3 0~25165824
2023-02-03T14:42:55.293-0500 7fffdbfff700 20 objectcacher.object(100dbad3fce.000000b3/head) map_read miss 25165824 left, bh[ 0x7fffb1c36480 0~25165824 0x7fffb0dfefa0 (0) v 0 missing] waiters = {}
... some more times ...
... then eventually another OSD read for the same object - and the cycle repeats ...
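So the pattern in the log seems to be: readx misses, bh_read goes out to the OSD, the reply turns it into a map_read hit, then the cacher trims and closes the object, and the next readx misses again.  Purely as a diagnostic (not claiming this is the fix), the loop could be re-tested with the object cacher bypassed, or with readahead constrained in case readahead is what keeps issuing the full-object [read 0~25165824] requests.  The option names are standard ceph-fuse options; the mount point and values below are just examples:

    # remount with the ObjectCacher disabled to see whether the loop still occurs
    ceph-fuse /cephfs --client_oc=false
    # or keep the cache but limit readahead
    ceph-fuse /cephfs --client_readahead_max_periods=1 --client_readahead_max_bytes=4194304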

The client is ceph-fuse Pacific 16.2.10; the back-end cluster is Octopus 15.2.7.  The cluster is healthy during this time: no down OSDs, no recoveries, etc.
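For what it's worth, the health and version mix can be confirmed with the standard commands:

    ceph -s          # overall cluster health at the time
    ceph versions    # per-daemon version breakdown across the cluster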

Here is the tracker:  https://tracker.ceph.com/issues/58640

Any insight/help would be appreciated,

Andras



