On Fri, 9 Dec 2016, xxhdx1985126 wrote:
> Thanks for the quick reply:-)
>
> By the way, according to my understanding of the source code, the ondisk
> callback is triggered either when the journal write completes or when
> syscall(SYS_syncfs, fd) completes. Is this right?

The syncfs path only happens if you're using btrfs ('parallel' mode), since
with xfs we don't even write to the fs until the journaled event has
committed.

s

> At 2016-12-07 22:29:55, "Sage Weil" <sage@xxxxxxxxxxxx> wrote:
> >On Wed, 7 Dec 2016, xxhdx1985126 wrote:
> >> Hi, everyone.
> >>
> >> I'm trying to read the source code of ceph, and I found that some code
> >> seems strange.
> >> In ReplicatedBackend::sub_op_modify_reply, there is the following code:
> >>
> >>     if (r->ack_type & CEPH_OSD_FLAG_ONDISK) {
> >>       assert(ip_op.waiting_for_commit.count(from));
> >>       ip_op.waiting_for_commit.erase(from);
> >>       if (ip_op.op) {
> >>         ostringstream ss;
> >>         ss << "sub_op_commit_rec_from_osd." << from.osd;
> >>         ip_op.op->mark_event(ss.str());
> >>       }
> >>     } else {
> >>       assert(ip_op.waiting_for_applied.count(from));
> >>       if (ip_op.op) {
> >>         ostringstream ss;
> >>         ss << "sub_op_applied_rec_from_osd." << from.osd;
> >>         ip_op.op->mark_event(ss.str());
> >>       }
> >>     }
> >>     ip_op.waiting_for_applied.erase(from);
> >>
> >> It seems that the statement "ip_op.waiting_for_applied.erase(from)"
> >> should be in the "else" clause, otherwise items in waiting_for_applied
> >> could be erroneously erased when ack_type is CEPH_OSD_FLAG_ONDISK.
> >>
> >> Why put it outside the "else" clause?
> >
> >The commit/ondisk message implies applied.
> >
> >sage
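
For anyone reading this thread later, the "commit implies applied" rule can be illustrated with a small standalone sketch. This is not the Ceph code: InProgressOp, handle_reply and FLAG_ONDISK are hypothetical stand-ins, replicas are plain ints, and the asserts and event marking from the quoted snippet are left out. It only models why the erase from waiting_for_applied sits outside the else branch.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the flag tested in the quoted code. The value is
// arbitrary and is NOT taken from the Ceph headers.
constexpr unsigned FLAG_ONDISK = 0x1;

// Simplified model of an in-progress op: the replicas that still owe an ack.
struct InProgressOp {
  std::set<int> waiting_for_commit;
  std::set<int> waiting_for_applied;

  bool all_applied() const { return waiting_for_applied.empty(); }
  bool all_committed() const { return waiting_for_commit.empty(); }
};

// Same shape as the quoted branch: a commit (ondisk) ack clears the commit
// wait; either kind of ack clears the applied wait.
void handle_reply(InProgressOp& op, int from, unsigned ack_type) {
  if (ack_type & FLAG_ONDISK) {
    op.waiting_for_commit.erase(from);
  }
  // Commit implies applied, so this runs for both ack types. Erasing a
  // replica that was already removed by an earlier applied ack is a no-op.
  op.waiting_for_applied.erase(from);
}
```

If the erase were moved into the else branch, a replica that only ever sent a commit ack would stay in waiting_for_applied indefinitely in this model.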