Sage,

> As you can see the problem is that the OSDMap's removed snaps shrank somehow.

Should we teach master (infernalis, hammer) to handle such a situation
more gracefully too? Or at least make the OSD fail with a clearer
message?

Best regards,
Alexey

On Tue, Jan 12, 2016 at 4:24 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Tue, 12 Jan 2016, Mykola Golub wrote:
>> On Mon, Jan 11, 2016 at 09:00:18PM -0500, Boris Lukashev wrote:
>> > In case anyone is following the mailing list later on, we spoke in IRC
>> > and Sage provided a patch - http://fpaste.org/309609/52550203/
>>
>> > diff --git a/src/osd/PG.cc b/src/osd/PG.cc
>> > index dc18aec..f9ee23c 100644
>> > --- a/src/osd/PG.cc
>> > +++ b/src/osd/PG.cc
>> > @@ -135,8 +135,16 @@ void PGPool::update(OSDMapRef map)
>> >    name = map->get_pool_name(id);
>> >    if (pi->get_snap_epoch() == map->get_epoch()) {
>> >      pi->build_removed_snaps(newly_removed_snaps);
>> > -    newly_removed_snaps.subtract(cached_removed_snaps);
>> > -    cached_removed_snaps.union_of(newly_removed_snaps);
>> > +    interval_set<snapid_t> intersection;
>> > +    intersection.intersection_of(newly_removed_snaps, cached_removed_snaps);
>> > +    if (!(intersection == cached_removed_snaps)) {
>> > +      newly_removed_snaps.subtract(cached_removed_snaps);
>>
>> Sage, won't it still violate the assert?
>> "intersection != cached_removed_snaps" means that cached_removed_snaps
>> contains snapshots missed in newly_removed_snaps, and we can't subtract?
>
> Oops, yeah, just remove the !.
>
> As you can see the problem is that the OSDMap's removed snaps shrank
> somehow.  If you crank up logging you can see what the competing sets
> are.
>
> An alternative fix/hack would be to modify the monitor to allow the
> snapids that were previously in the set to be added back into the OSDMap.
> That's arguably a better fix, although it's a bit more work.  But, even
> then, something like the above will be needed since there are still
> OSDMaps in the history where the set is smaller.
>
> sage
>
>>
>> > +      cached_removed_snaps.union_of(newly_removed_snaps);
>> > +    } else {
>> > +      lgeneric_subdout(g_ceph_context, osd, 0) << __func__ << " cached_removed_snaps shrank from " << cached_removed_snaps << dendl;
>> > +      cached_removed_snaps = newly_removed_snaps;
>> > +      newly_removed_snaps.clear();
>> > +    }
>> >      snapc = pi->get_snap_context();
>> >    } else {
>> >      newly_removed_snaps.clear();
>>
>> --
>> Mykola Golub
>>
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
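
For anyone reading the archive later, here is a rough standalone sketch of
the branch logic discussed above, with the "!" removed as Sage suggests.
It is not the Ceph code: std::set stands in for interval_set<snapid_t>,
and SnapSet, contains_all, strict_subtract and PoolSnapState are
hypothetical names made up for this illustration. The assert in
strict_subtract models the assumption, taken from Mykola's question, that
subtracting a set that is not fully contained trips an assert.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Hypothetical stand-in for interval_set<snapid_t>: an ordered set of ids.
using SnapSet = std::set<unsigned long>;

// True if every element of `inner` is also present in `outer`.
// Equivalent to "intersection(outer, inner) == inner" in the patch.
bool contains_all(const SnapSet& outer, const SnapSet& inner) {
  return std::includes(outer.begin(), outer.end(),
                       inner.begin(), inner.end());
}

// Models a strict subtract: `b` must be a subset of `a`, else we assert.
void strict_subtract(SnapSet& a, const SnapSet& b) {
  assert(contains_all(a, b));
  for (unsigned long s : b) {
    a.erase(s);
  }
}

// Hypothetical holder for the two sets tracked per pool.
struct PoolSnapState {
  SnapSet cached_removed_snaps;
  SnapSet newly_removed_snaps;

  // Returns false when the removed-snaps set in the new map has shrunk.
  bool update(const SnapSet& removed_in_map) {
    newly_removed_snaps = removed_in_map;
    if (contains_all(newly_removed_snaps, cached_removed_snaps)) {
      // Normal case: the cache is a subset, so subtracting is safe.
      strict_subtract(newly_removed_snaps, cached_removed_snaps);
      cached_removed_snaps.insert(newly_removed_snaps.begin(),
                                  newly_removed_snaps.end());
      return true;
    }
    // Shrank: reset the cache to the map's view instead of asserting.
    cached_removed_snaps = newly_removed_snaps;
    newly_removed_snaps.clear();
    return false;
  }
};
```

With this sketch, feeding {1,2,3}, then {1,2,3,4}, then {1,2} takes the
normal branch twice and the "shrank" branch once, without hitting the
assert in strict_subtract.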