Re: 7915 is not resolved

Thank you; I'm pulling those into my branch now and kicking off a build
(rough workflow below). As for upgrading to Hammer - the documentation
looks straightforward enough, but given that this is a Fuel-based
OpenStack deployment, I'm wondering whether you've heard of any
potential compatibility issues with doing so.
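
For reference, the rough workflow I'm using to pull them in looks
something like this (assuming wip-11493-b lives in the main ceph repo;
the remote URL and the commit ids below are my own placeholders, not
anything Sage gave me):

    # add the upstream ceph repo as a remote if it isn't one already (URL assumed)
    git remote add ceph-upstream https://github.com/ceph/ceph.git
    git fetch ceph-upstream wip-11493-b

    # find the two commits, then apply them onto our firefly branch
    git log --oneline FETCH_HEAD
    git cherry-pick <commit-1> <commit-2>   # placeholder ids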

-Boris

On Mon, Jan 11, 2016 at 12:25 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 11 Jan 2016, Boris Lukashev wrote:
>> I ran into an incredibly unpleasant loss of a 5-node, 10-OSD ceph
>> cluster backing our openstack glance and cinder services, just by
>> asking RBD to snapshot one of the volumes.
>> The conditions under which this occurred are as follows: a bash script
>> asked cinder to snapshot RBD volumes in rapid succession (2 of them),
>> which either caused a nova host (which also holds ceph OSDs) to crash
>> or simply coincided with that crash. On reboot of the host, RBD
>> started throwing errors, and once all OSDs were restarted they all
>> failed, crashing with the following:
>>
>>     -1> 2016-01-11 16:37:35.401002 7f16f8449700  5 osd.6 pg_epoch:
>> 84269 pg[2.2c( empty local-les=84219 n=0 ec=1 les/c 84219/84219
>> 84218/84218/84193) [6,8] r=0 lpr=84261 crt=0'0 mlcod 0'0 peering]
>> enter Started/Primary/Peering/GetInfo
>>      0> 2016-01-11 16:37:35.401057 7f16f7c48700 -1
>> ./include/interval_set.h: In function 'void interval_set<T>::erase(T,
>> T) [with T = snapid_t]' thread 7f16f7c48700 time 2016-01-11
>> 16:37:35.398335
>> ./include/interval_set.h: 386: FAILED assert(_size >= 0)
>>
>>  ceph version 0.80.11-19-g130b0f7 (130b0f748332851eb2e3789e2b2fa4d3d08f3006)
>>  1: (interval_set<snapid_t>::subtract(interval_set<snapid_t>
>> const&)+0xb0) [0x79d140]
>>  2: (PGPool::update(std::tr1::shared_ptr<OSDMap const>)+0x656) [0x772856]
>>  3: (PG::handle_advance_map(std::tr1::shared_ptr<OSDMap const>,
>> std::tr1::shared_ptr<OSDMap const>, std::vector<int,
>> std::allocator<int> >&, int, std::vector<int, std::allocator<int> >&,
>> int, PG::RecoveryCtx*)+0x282) [0x772c22]
>>  4: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&,
>> PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>,
>> std::less<boost::intrusive_ptr<PG> >,
>> std::allocator<boost::intrusive_ptr<PG> > >*)+0x292) [0x6548e2]
>>  5: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> >
>> const&, ThreadPool::TPHandle&)+0x20c) [0x6553cc]
>>  6: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> >
>> const&, ThreadPool::TPHandle&)+0x18) [0x69c858]
>>  7: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb01) [0xa5ac71]
>>  8: (ThreadPool::WorkThread::entry()+0x10) [0xa5bb60]
>>  9: (()+0x8182) [0x7f170def5182]
>>  10: (clone()+0x6d) [0x7f170c51447d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>> needed to interpret this.
>>
>> To me, this looks like the snapshot that was being created when the
>> nova host died is causing the assert to fail, since the snap was never
>> completed and is broken.
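>>
>> A toy model of what I think is going on - this is my own simplification
>> for illustration, not the actual interval_set.h or PGPool code, and
>> which copy holds which snaps is only my guess:
>>
>>     // Toy interval set: erase() blindly trusts that the range it is
>>     // asked to remove is present; the size bookkeeping asserts otherwise.
>>     #include <cassert>
>>     #include <cstdint>
>>     #include <map>
>>
>>     struct ToyIntervalSet {
>>       std::map<uint64_t, uint64_t> m;  // interval start -> length
>>       int64_t size = 0;
>>
>>       void insert(uint64_t start, uint64_t len) { m[start] = len; size += len; }
>>
>>       void erase(uint64_t /*start*/, uint64_t len) {
>>         size -= len;
>>         assert(size >= 0);  // analogous to the FAILED assert(_size >= 0) above
>>         // (interval trimming omitted; the assert fires before it matters)
>>       }
>>
>>       void subtract(const ToyIntervalSet& other) {
>>         for (const auto& p : other.m) erase(p.first, p.second);
>>       }
>>     };
>>
>>     int main() {
>>       ToyIntervalSet cached;    // removed snaps the PG has cached
>>       cached.insert(1, 1);      // snap 1
>>
>>       ToyIntervalSet from_map;  // removed snaps per the new OSDMap
>>       from_map.insert(1, 2);    // snaps 1 and 2 - the half-made snapshot
>>
>>       // Subtracting a range that isn't fully present drives the size
>>       // negative, which mirrors the assert we hit during peering.
>>       cached.subtract(from_map);
>>       return 0;
>>     }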
>>
>> http://tracker.ceph.com/issues/11493, which appears very similar, is
>> marked as resolved, but this issue hit us on Saturday with current
>> firefly (deployed via Fuel and updated in place with the 0.80.11 debs).
>
> You can try cherry-picking the two commits in wip-11493-b, which make the
> OSD semi-gracefully tolerate this situation.  This is a bug that's been
> fixed in hammer, but since the inconsistency has already been introduced,
> simply upgrading probably won't resolve it.  Nevertheless, after working
> around this, I'd encourage you to move to hammer, as firefly is at end
> of life.
>
> sage
>
>>
>> What's the way around this? I imagine commenting out that assert may
>> cause more damage (a toy sketch of a safer alternative is below), but
>> we need to get our OSDs, and the RBD data in them, back online. Is
>> there a permanent fix in any branch we can backport? We built this
>> cluster using Fuel, so this affects every Mirantis user, if not every
>> ceph user out there, and the vector into this catastrophic bug is
>> normal daily operation (apparently, a snapshot).
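>>
>> Rather than deleting the assert outright, I assume (and this is just my
>> guess at the shape of a workaround, not necessarily what a proper fix
>> would do) the safer direction is to only remove what is actually
>> present - again a toy stand-in, using a plain std::set of snap ids
>> instead of the real interval_set:
>>
>>     #include <cstdint>
>>     #include <set>
>>
>>     // Drop only the snap ids that are actually in the cached set,
>>     // instead of asserting that every id to be removed exists.
>>     void tolerant_remove(std::set<uint64_t>& cached,
>>                          const std::set<uint64_t>& to_remove) {
>>       for (uint64_t snap : to_remove)
>>         cached.erase(snap);  // erase-by-key is a no-op for missing ids
>>     }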
>>
>> Thank you all for looking over this; advice would be greatly appreciated.