On Fri, Oct 13, 2017 at 12:48 AM, zhaomingyue <zhao.mingyue@xxxxxxx> wrote:
> Hi:
> I had met an assert problem like
> bug16279 (http://tracker.ceph.com/issues/16279) when testing pull out disk
> and insert, ceph version 10.2.5, assert(objiter->second->version >
> last_divergent_update)
>
> according to osd log, I think this maybe due to (log.head !=
> *log.log.rbegin.version.version) when some abnormal condition happened, such
> as power off, pull out disk and insert.

I don't think that is supposed to be possible. We apply all changes like
this atomically; FileStore does all its journaling precisely to prevent
partial updates of this kind.

A few other people have reported the same issue on disk pull, so maybe
there's some *other* issue going on, but the correct fix is to prevent
those two values from differing (unless I misunderstand the context).

Given that one of the reporters on that ticket confirms they also had xfs
issues, I find it vastly more likely that something in your kernel
configuration or hardware stack is not writing out data the way it
claims to. Be very, very sure all of that is working correctly!

> In below situation, merge_log would push 234'1034 into divergent list; and
> divergent has only one node; then lead to assert(objiter->second->version >
> last_divergent_update).
>
> olog ---------------- (0'0, 234'1034) olog.head = 234'1034
>
> log  ---------------- (0'0, 234'1034) log.head = 234'1033
>
> I see osd load_pgs code, in function PGLog::read_log(), code like this:
> .....
> for (p->seek_to_first(); p->valid() ; p->next()) {
>   .....
>   log.log.push_back(e);
>   log.head = e.version; // every pg log node
> }
> .....
> log.head = info.last_update;
>
> two doubt:
> first : why set (log.head = info.last_update) after all pg log node
> processed (every node has updated log.head = e.version)?
> second: Whether it can occur that info.last_update is less than
> *log.log.rbegin.version or not and what scene happens?
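
For readers following the quoted question, the mismatch it describes can be modelled with a small self-contained sketch. This is not Ceph code: EVersion, ToyLog, load_log and head_matches_last_entry are hypothetical names that only mimic the shape of the quoted loop, where the head follows each entry and is then overwritten with a separately stored last_update.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <tuple>

// Hypothetical stand-in for an (epoch, version) pair, written epoch'version.
struct EVersion {
  uint32_t epoch = 0;
  uint64_t version = 0;
  EVersion() = default;
  EVersion(uint32_t e, uint64_t v) : epoch(e), version(v) {}
};

inline bool operator==(const EVersion& a, const EVersion& b) {
  return std::tie(a.epoch, a.version) == std::tie(b.epoch, b.version);
}

inline bool operator<(const EVersion& a, const EVersion& b) {
  return std::tie(a.epoch, a.version) < std::tie(b.epoch, b.version);
}

// Toy log: ordered entry versions plus a head that is stored separately.
struct ToyLog {
  std::list<EVersion> entries;
  EVersion head;
};

// Same shape as the quoted loop: head tracks every entry while loading,
// then is replaced by the separately persisted last_update value.
ToyLog load_log(const std::list<EVersion>& on_disk, const EVersion& last_update) {
  ToyLog log;
  for (const EVersion& e : on_disk) {
    log.entries.push_back(e);
    log.head = e;
  }
  log.head = last_update;
  return log;
}

// The invariant under discussion: head equals the newest entry's version.
// An empty log is treated as consistent here, which is a simplification.
bool head_matches_last_entry(const ToyLog& log) {
  return log.entries.empty() || log.head == log.entries.back();
}
```

In this model, loading entries 234'1033 and 234'1034 with last_update = 234'1034 keeps the invariant, while last_update = 234'1033 reproduces the log.head = 234'1033 case from the quoted diagram. The sketch says nothing about how the two values could come to disagree on disk.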

I'm looking at the luminous code base right now and things have changed
a bit, so I don't have the specifics of your question on hand. But the
general reason we change these versions around is that we need to
reconcile the logs across all OSDs. If one OSD has an entry for an
operation that was never returned to the client, we may need to declare
it divergent and undo it.

(In replicated pools, entries are only divergent if the OSD hosting them
was either netsplit from the primary, or else managed to commit
something during a failure event that its peers didn't, and the
operation was then resubmitted under a different ID by the client on
recovery. In erasure-coded pools things are more complicated because we
can only roll operations forward if a quorum of the shards are present.)
-Greg
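
As a rough illustration of the divergent-entry idea above, here is a minimal sketch. It is not the merge_log implementation: Entry, same_version and find_divergent are hypothetical names, and the sketch assumes both logs start at the same tail and that the authoritative log is already known.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical log entry: an (epoch, version) pair and the object it touched.
struct Entry {
  uint32_t epoch;
  uint64_t version;
  std::string object;
};

inline bool same_version(const Entry& a, const Entry& b) {
  return a.epoch == b.epoch && a.version == b.version;
}

// Walk both logs from the shared tail and stop at the first disagreement.
// Every local entry from that point on has no counterpart in the
// authoritative log, so in this toy model it is "divergent" and would
// have to be undone.
std::vector<Entry> find_divergent(const std::vector<Entry>& authoritative,
                                  const std::vector<Entry>& local) {
  std::size_t common = 0;
  while (common < authoritative.size() && common < local.size() &&
         same_version(authoritative[common], local[common])) {
    ++common;
  }
  return std::vector<Entry>(
      local.begin() + static_cast<std::ptrdiff_t>(common), local.end());
}
```

For example, if the local log committed 234'1034 and 234'1035 during a failure while the authoritative log moved on to 235'1034, both local entries come back as divergent. The erasure-coded case, where roll-forward depends on how many shards are present, is deliberately left out.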