Currently trying to figure out whether c7d92d1d3fe469f5e8e7c35185a670570c665029
matters (any interval where acting contains an incomplete peer could not
have gone active, right?)
-Sam

On Fri, Dec 9, 2016 at 3:40 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> Yeah, that's pretty much where I'm going.
> -Sam
>
> On Fri, Dec 9, 2016 at 3:24 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> On Fri, 9 Dec 2016, Samuel Just wrote:
>>> Currently, never.  However, I'm thinking that we might want to retain
>>> the freedom to not send the structure if it's really big.  And
>>> actually, we won't ever need to extend a received past_intervals
>>> structure to the current epoch, since if the interval changed, we'd
>>> throw out the whole message.
>>
>> I'd say throw out the incremental build code, then, and assert
>> past_intervals is present at notify time; we can re-add something to
>> recalculate the whole past_intervals if it becomes necessary in the
>> future.
>>
>> sage
>>
>>
>>> -Sam
>>>
>>> On Fri, Dec 9, 2016 at 2:16 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>>> > On Fri, 9 Dec 2016, Samuel Just wrote:
>>> >> In particular, we don't need PG::generate_past_intervals duplicating
>>> >> the logic in build_past_intervals_parallel, since constructed PG
>>> >> objects only ever need to maintain a consistent past_intervals
>>> >> structure, never build it from scratch.
>>> >
>>> > Sounds good to me.
>>> >
>>> > My main question is:
>>> >
>>> >> On Fri, Dec 9, 2016 at 11:59 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>> >> > There's code for dealing with some odd past_intervals
>>> >> > configurations, including ones that don't go back far enough and
>>> >> > ones that don't go forward far enough.
>>> >> > I *think* we can simplify this as follows:
>>> >> > 1) Once the PG object is constructed and in memory, past_intervals
>>> >> >    extends from history.last_epoch_started to the PG's current map
>>> >> > 2) On disk, either 1) is true or the set is empty (after an import)
>>> >> >
>>> >> > On boot, the OSD generates past_intervals in parallel for any PGs
>>> >> > without them (and perhaps verifies 1) for the rest).  On receipt of
>>> >> > a Notify creating a PG, the OSD generates the past_intervals
>>> >> > structure before instantiating the PG, using the same process as on
>>> >> > boot -- starting with the included past_intervals if present (it may
>>> >> > not extend to the current map, and so may need to be extended).
>>> >
>>> > When does this actually happen?  If PGs are always in state 1), can we
>>> > instead ensure that a PG notify will always include past_intervals,
>>> > and that a received notify will never require us to go off and do the
>>> > (slow) work of loading up old maps to generate old intervals?
>>> >
>>> > Then, we can focus our efforts on making the past_intervals
>>> > representation compact (e.g., by discarding redundant intervals)...
>>> >
>>> > sage
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html