Re: 7915 is not resolved

Sage,

> As you can see the problem is that the OSDMap's removed snaps shrank somehow.

Should we teach master (infernalis, hammer) to handle such a situation
more gracefully too?
Or at least make the OSD fail with a clearer message?
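
To make the question concrete, here is a minimal standalone sketch of the
check from the patch quoted below, with the `!` removed as agreed. It is not
Ceph code: std::set<uint64_t> stands in for interval_set<snapid_t>, and the
names SnapSet, UpdateResult, is_subset and update_removed_snaps are
hypothetical, chosen only for illustration. The assert models the
containment requirement that makes the unguarded subtract fail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>
#include <iterator>
#include <set>

// Assumption: a plain ordered set stands in for interval_set<snapid_t>.
using SnapSet = std::set<std::uint64_t>;

// Hypothetical result type: what was newly removed, and whether the
// removed-snaps set from the map turned out smaller than the cached one.
struct UpdateResult {
  SnapSet newly_removed;
  bool shrank = false;
};

// True if every element of `inner` is also present in `outer`.
bool is_subset(const SnapSet& inner, const SnapSet& outer) {
  return std::includes(outer.begin(), outer.end(),
                       inner.begin(), inner.end());
}

// Hypothetical helper mirroring the patched branch of PGPool::update().
// `from_map` is the removed-snaps set built from the new map;
// `cached` is the set remembered from the previous update.
UpdateResult update_removed_snaps(SnapSet& cached, const SnapSet& from_map) {
  UpdateResult r;
  if (is_subset(cached, from_map)) {
    // Normal case: the set only grew, so subtracting is safe.
    assert(is_subset(cached, from_map));  // models the subtract precondition
    std::set_difference(from_map.begin(), from_map.end(),
                        cached.begin(), cached.end(),
                        std::inserter(r.newly_removed,
                                      r.newly_removed.end()));
    cached.insert(r.newly_removed.begin(), r.newly_removed.end());
  } else {
    // The set shrank: report it, resync the cache, report nothing new.
    std::cerr << "cached_removed_snaps shrank from " << cached.size()
              << " to " << from_map.size() << " entries\n";
    cached = from_map;
    r.shrank = true;
  }
  return r;
}
```

Calling update_removed_snaps() with a cache of {1,2,3} and a map set of
{1,2,3,4,5} yields {4,5} as newly removed; a following call with {1,2} takes
the "shrank" path and resets the cache instead of tripping the assert.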

Best regards,
     Alexey


On Tue, Jan 12, 2016 at 4:24 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Tue, 12 Jan 2016, Mykola Golub wrote:
>> On Mon, Jan 11, 2016 at 09:00:18PM -0500, Boris Lukashev wrote:
>> > In case anyone is following the mailing list later on, we spoke in IRC
>> > and Sage provided a patch - http://fpaste.org/309609/52550203/
>>
>> > diff --git a/src/osd/PG.cc b/src/osd/PG.cc
>> > index dc18aec..f9ee23c 100644
>> > --- a/src/osd/PG.cc
>> > +++ b/src/osd/PG.cc
>> > @@ -135,8 +135,16 @@ void PGPool::update(OSDMapRef map)
>> >    name = map->get_pool_name(id);
>> >    if (pi->get_snap_epoch() == map->get_epoch()) {
>> >      pi->build_removed_snaps(newly_removed_snaps);
>> > -    newly_removed_snaps.subtract(cached_removed_snaps);
>> > -    cached_removed_snaps.union_of(newly_removed_snaps);
>> > +    interval_set<snapid_t> intersection;
>> > +    intersection.intersection_of(newly_removed_snaps, cached_removed_snaps);
>> > +    if (!(intersection == cached_removed_snaps)) {
>> > +      newly_removed_snaps.subtract(cached_removed_snaps);
>>
>> Sage, won't it still violate the assert?
>> "intersection != cached_removed_snaps" means that cached_removed_snaps
>> contains snapshots missing from newly_removed_snaps, so we can't subtract?
>
> Oops, yeah, just remove the !.
>
> As you can see the problem is that the OSDMap's removed snaps shrank
> somehow.  If you crank up logging you can see what the competing sets
> are.
>
> An alternative fix/hack would be to modify the monitor to allow the
> snapids that were previously in the set to be added back into the OSDMap.
> That's arguably a better fix, although it's a bit more work.  But, even
> then, something like the above will be needed since there are still
> OSDMaps in the history where the set is smaller.
>
> sage
>
>>
>> > +      cached_removed_snaps.union_of(newly_removed_snaps);
>> > +    } else {
>> > +      lgeneric_subdout(g_ceph_context, osd, 0) << __func__ << " cached_removed_snaps shrank from " << cached_removed_snaps << dendl;
>> > +      cached_removed_snaps = newly_removed_snaps;
>> > +      newly_removed_snaps.clear();
>> > +    }
>> >      snapc = pi->get_snap_context();
>> >    } else {
>> >      newly_removed_snaps.clear();
>>
>> --
>> Mykola Golub
>>
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html