Hi all,
Recently my environment (14.2.5) has often had OSD problems, and my FUSE client crashed
with "FAILED assert(ob->last_commit_tid < tid)". My application can no longer work normally.
The sequence of events that triggers this problem is as follows.
Notes:
a. My data pool is EC 4+2.
b. One OSD (osd.x) of pg_1 has gone down, but is not marked down by the monitor until t4.
Event sequence:
t1: op_1 (write) is sent to the primary OSD, which sends 5 shards to 5 OSDs. Only 4 shard replies come back (not counting the primary) because osd.x is down.
t2: Many other operations occur in this PG and are recorded in pg_log.
t3: op_n (write) is sent to the primary OSD, which sends 5 shards to 5 OSDs. Only 4 shard replies come back (not counting the primary) because osd.x is down.
t4: A peer OSD reports the osd.x timeout to the monitor, and osd.x is marked down.
t5: pg_1 starts canceling op_1, op_2 ... op_n and requeueing them to the OSD op_wq.
t6: pg_1 starts peering; during this process op_1 is trimmed from pg_log and the dup map.
t7: pg_1 becomes active and starts reprocessing op_1, op_2 ... op_n.
t8: op_1 is not found in pg_log or the dup map, so it is redone.
t9: op_n is found in pg_log or the dup map and is considered completed, so the OSD reply is returned to the client directly with tid_op_n.
t10: op_1 completes and is returned to the client with tid_op_1. The client then crashes on "assert(ob->last_commit_tid < tid)", because the reply for op_n (with the larger tid) has already been processed.
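To make the sequence above easier to reason about, here is a minimal Python sketch of it. This is a simplified model under my own assumptions, not Ceph code: the names Client, PG, reprocess and simulate are hypothetical, tids are plain integers, and I assume that replies for ops found in pg_log or the dup map reach the client before replies for ops that must be redone.

```python
# Simplified model of the event sequence; NOT Ceph code.
# All class and function names here are hypothetical.


class Client:
    """Models only the per-object commit-order check on the client."""

    def __init__(self):
        self.last_commit_tid = 0

    def on_commit(self, tid):
        # Stands in for assert(ob->last_commit_tid < tid).
        if not self.last_commit_tid < tid:
            raise AssertionError(
                f"FAILED assert(last_commit_tid < tid): "
                f"{self.last_commit_tid} >= {tid}"
            )
        self.last_commit_tid = tid


class PG:
    """Models a PG whose log (pg_log plus dup map) keeps only recent tids."""

    def __init__(self, max_log_entries):
        self.max_log_entries = max_log_entries
        self.log = []  # recorded tids, oldest first

    def record(self, tid):
        self.log.append(tid)

    def trim(self):
        # t6: the oldest entries fall out of the log during peering.
        self.log = self.log[-self.max_log_entries:]

    def reprocess(self, requeued_tids):
        """Return tids in the order their replies reach the client."""
        immediate, redone = [], []
        for tid in requeued_tids:
            if tid in self.log:
                immediate.append(tid)  # t9: considered completed, reply now
            else:
                redone.append(tid)     # t8/t10: redone, reply arrives later
        # Assumption: redone ops finish after the immediate replies.
        return immediate + redone


def simulate(n_ops=5, max_log_entries=3):
    pg = PG(max_log_entries)
    tids = list(range(1, n_ops + 1))
    for tid in tids:             # t1..t3: ops are recorded in the log
        pg.record(tid)
    pg.trim()                    # t6: op_1 (and here op_2) are trimmed
    order = pg.reprocess(tids)   # t7..t10: requeued ops are reprocessed
    client = Client()
    try:
        for tid in order:
            client.on_commit(tid)
    except AssertionError as exc:
        return order, str(exc)
    return order, None


if __name__ == "__main__":
    print(simulate())
```

In this model, the check fails as soon as any requeued op has been trimmed while a later op is still in the log. If nothing is trimmed (for example simulate(n_ops=3, max_log_entries=3)), the replies stay in tid order and the check passes.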
I found a related issue at https://tracker.ceph.com/issues/23827, which has some discussion of this problem,
but I didn't find an effective way to avoid it.
I think the current mechanism for preventing non-idempotent ops from being repeated is flawed; maybe we should redesign it.
What do you think? And if my idea is wrong, what should I do to avoid this problem?
Any response would be greatly appreciated. Thank you!
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx