Re: Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

tuomas.juntunen@xxxxxxxxxxxxxxx · Fri, 1 May 2015 18:10:27 +0300 (EEST)

Hi

I deleted the images and img pools and started the OSDs, but they still die.

Here's a log of one of the osd's after this, if you need it.

http://beta.xaasbox.com/ceph/ceph-osd.19.log

Br,
Tuomas

> Thanks man. I'll try it tomorrow. Have a good one.
>
> Br,T
>
> -------- Original message --------
> From: Sage Weil <sage@xxxxxxxxxxxx>
> Date: 30/04/2015  18:23  (GMT+02:00)
> To: Tuomas Juntunen <tuomas.juntunen@xxxxxxxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx, ceph-devel@xxxxxxxxxxxxxxx
> Subject: RE: Upgrade from Giant to Hammer and after some basic
> operations most of the OSD's went down
>
> On Thu, 30 Apr 2015, tuomas.juntunen@xxxxxxxxxxxxxxx wrote:
>> Hey
>>
>> Yes I can drop the images data, you think this will fix it?
>
> It's a slightly different assert that (I believe) should not trigger once
> the pool is deleted.  Please give that a try and if you still hit it I'll
> whip up a workaround.
>
> Thanks!
> sage
>
>>
>> Br,
>>
>> Tuomas
>>
>> > On Wed, 29 Apr 2015, Tuomas Juntunen wrote:
>> >> Hi
>> >>
>> >> I updated to that version and it seems that something did happen: the osd's
>> >> stayed up for a while and 'ceph status' got updated. But then in a couple of
>> >> minutes, they all went down the same way.
>> >>
>> >> I have attached new 'ceph osd dump -f json-pretty' and got a new log from
>> >> one of the osd's with osd debug = 20,
>> >> http://beta.xaasbox.com/ceph/ceph-osd.15.log
>> >
>> > Sam mentioned that you had said earlier that this was not critical data?
>> > If not, I think the simplest thing is to just drop those pools.  The
>> > important thing (from my perspective at least :) is that we understand the
>> > root cause and can prevent this in the future.
>> >
>> > sage
>> >
>> >
>> >>
>> >> Thank you!
>> >>
>> >> Br,
>> >> Tuomas
>> >>
>> >>
>> >>
>> >> -----Original Message-----
>> >> From: Sage Weil [mailto:sage@xxxxxxxxxxxx]
>> >> Sent: 28 April 2015 23:57
>> >> To: Tuomas Juntunen
>> >> Cc: ceph-users@xxxxxxxxxxxxxx; ceph-devel@xxxxxxxxxxxxxxx
>> >> Subject: Re:  Upgrade from Giant to Hammer and after some basic
>> >> operations most of the OSD's went down
>> >>
>> >> Hi Tuomas,
>> >>
>> >> I've pushed an updated wip-hammer-snaps branch.  Can you please try it?
>> >> The build will appear here
>> >>
>> >>
>> >> http://gitbuilder.ceph.com/ceph-deb-trusty-x86_64-basic/sha1/08bf531331afd5e2eb514067f72afda11bcde286
>> >>
>> >> (or a similar url; adjust for your distro).
>> >>
>> >> Thanks!
>> >> sage
>> >>
>> >>
>> >> On Tue, 28 Apr 2015, Sage Weil wrote:
>> >>
>> >> > [adding ceph-devel]
>> >> >
>> >> > Okay, I see the problem.  This seems to be unrelated to the giant ->
>> >> > hammer move... it's a result of the tiering changes you made:
>> >> >
>> >> > > > > > > > > The following:
>> >> > > > > > > > >
>> >> > > > > > > > > ceph osd tier add img images --force-nonempty
>> >> > > > > > > > > ceph osd tier cache-mode images forward
>> >> > > > > > > > > ceph osd tier set-overlay img images
>> >> >
>> >> > Specifically, --force-nonempty bypassed important safety checks.
>> >> >
>> >> > 1. images had snapshots (and removed_snaps)
>> >> >
>> >> > 2. images was added as a tier *of* img, and img's removed_snaps was
>> >> > copied to images, clobbering the removed_snaps value (see
>> >> > OSDMap::Incremental::propagate_snaps_to_tiers)
>> >> >
>> >> > 3. tiering relation was undone, but removed_snaps was still gone
>> >> >
>> >> > 4. on OSD startup, when we load the PG, removed_snaps is initialized
>> >> > with the older map.  Later, in PGPool::update(), we assume that
>> >> > removed_snaps always grows (never shrinks) and we trigger an assert.
>> >> >
>> >> > To fix this I think we need to do 2 things:
>> >> >
>> >> > 1. make the OSD forgiving about removed_snaps getting smaller.  This is
>> >> > probably a good thing anyway: once we know snaps are removed on all
>> >> > OSDs we can prune the interval_set in the OSDMap.  Maybe.
>> >> >
>> >> > 2. Fix the mon to prevent this from happening, *even* when
>> >> > --force-nonempty is specified.  (This is the root cause.)
>> >> >
>> >> > I've opened http://tracker.ceph.com/issues/11493 to track this.
>> >> >
>> >> > sage
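
A minimal sketch of the failure mode described in points 1-4 above,
assuming removed_snaps can be modelled as a plain set of snapshot ids
(Ceph uses an interval_set<snapid_t>). The function and exception names
are hypothetical and only illustrate why a shrinking removed_snaps trips
the subtraction in PGPool::update(); this is not the actual OSD code.

```python
class ShrunkSnapSetError(Exception):
    """Raised when the new removed_snaps lost entries the PG already cached."""


def newly_removed_snaps(cached, current):
    """Return snap ids present in `current` but not yet in `cached`.

    Hypothetical model of the subtraction step: the cached set is removed
    from the set found in the new map, which only works if the new set
    still contains everything that was cached.
    """
    missing = cached - current
    if missing:
        # The situation the OSD asserts on: removed_snaps got smaller.
        raise ShrunkSnapSetError(f"removed_snaps lost {sorted(missing)}")
    return current - cached


# Normal case: the set only grows from one map to the next.
print(newly_removed_snaps({1, 2}, {1, 2, 3}))

# After the tier add, the pool's removed_snaps was overwritten with a
# smaller set, so the cached value is no longer a subset of the new one.
try:
    newly_removed_snaps({1, 2, 3}, {1})
except ShrunkSnapSetError as err:
    print("would assert:", err)
```

In this model, the first proposed fix corresponds to tolerating a
non-empty `missing` set instead of aborting, and the second to never
letting the monitor publish a smaller set in the first place.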
>> >> >
>> >> >
>> >> >
>> >> > > > > > > >
>> >> > > > > > > > > The idea was to make images a tier of img, move the data
>> >> > > > > > > > > to img, then change clients to use the new img pool.
>> >> > > > > > > >
>> >> > > > > > > > Br,
>> >> > > > > > > > Tuomas
>> >> > > > > > > >
>> >> > > > > > > > > Can you explain exactly what you mean by:
>> >> > > > > > > > >
>> >> > > > > > > > > "Also I created one pool for tier to be able to move
>> >> > > > > > > > > data without
>> >> > > > > > > outage."
>> >> > > > > > > > >
>> >> > > > > > > > > -Sam
>> >> > > > > > > > > ----- Original Message -----
>> >> > > > > > > > > From: "tuomas juntunen"
>> >> > > > > > > > > <tuomas.juntunen@xxxxxxxxxxxxxxx>
>> >> > > > > > > > > To: "Ian Colle" <icolle@xxxxxxxxxx>
>> >> > > > > > > > > Cc: ceph-users@xxxxxxxxxxxxxx
>> >> > > > > > > > > Sent: Monday, April 27, 2015 4:23:44 AM
>> >> > > > > > > > > Subject: Re:  Upgrade from Giant to Hammer
>> >> > > > > > > > > and after some basic operations most of the OSD's went
>> >> > > > > > > > > down
>> >> > > > > > > > >
>> >> > > > > > > > > Hi
>> >> > > > > > > > >
>> >> > > > > > > > > Any solution for this yet?
>> >> > > > > > > > >
>> >> > > > > > > > > Br,
>> >> > > > > > > > > Tuomas
>> >> > > > > > > > >
>> >> > > > > > > > >> It looks like you may have hit
>> >> > > > > > > > >> http://tracker.ceph.com/issues/7915
>> >> > > > > > > > >>
>> >> > > > > > > > >> Ian R. Colle
>> >> > > > > > > > >> Global Director
>> >> > > > > > > > >> of Software Engineering Red Hat (Inktank is now part of
>> >> > > > > > > > >> Red Hat!) http://www.linkedin.com/in/ircolle
>> >> > > > > > > > >> http://www.twitter.com/ircolle
>> >> > > > > > > > >> Cell: +1.303.601.7713
>> >> > > > > > > > >> Email: icolle@xxxxxxxxxx
>> >> > > > > > > > >>
>> >> > > > > > > > >> ----- Original Message -----
>> >> > > > > > > > >> From: "tuomas juntunen"
>> >> > > > > > > > >> <tuomas.juntunen@xxxxxxxxxxxxxxx>
>> >> > > > > > > > >> To: ceph-users@xxxxxxxxxxxxxx
>> >> > > > > > > > >> Sent: Monday, April 27, 2015 1:56:29 PM
>> >> > > > > > > > >> Subject:  Upgrade from Giant to Hammer and
>> >> > > > > > > > >> after some basic operations most of the OSD's went down
>> >> > > > > > > > >>
>> >> > > > > > > > >>
>> >> > > > > > > > >>
>> >> > > > > > > > >> I upgraded Ceph from 0.87 Giant to 0.94.1 Hammer
>> >> > > > > > > > >>
>> >> > > > > > > > > >> Then created new pools and deleted some old ones. Also
>> >> > > > > > > > > >> I created one pool for tier to be able to move data
>> >> > > > > > > > > >> without outage.
>> >> > > > > > > > >>
>> >> > > > > > > > > >> After these operations all but 10 OSD's are down and
>> >> > > > > > > > > >> writing this kind of message to the logs; I get more
>> >> > > > > > > > > >> than 100 GB of these in a night:
>> >> > > > > > > > >>
>> >> > > > > > > > >>Â  -19> 2015-04-27 10:17:08.808584 7fd8e748d700Â  5 osd.23
>> >> > > pg_epoch:
>> >> > > >
>> >> > > > > > > > >> 17882 pg[0.189( v 8480'7 (0'0,8480'7] local-les=16609
>> >> > > > > > > > >> n=0
>> >> > > > > > > > >> ec=1 les/c
>> >> > > > > > > > >> 16609/16659
>> >> > > > > > > > >> 16590/16590/16590) [24,3,23] r=2 lpr=17838
>> >> > > > > > > > >> pi=15659-16589/42
>> >> > > > > > > > >> crt=8480'7 lcod
>> >> > > > > > > > >> 0'0 inactive NOTIFY] enter Started
>> >> > > > > > > > >>Â Â Â  -18> 2015-04-27 10:17:08.808596 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[0.189( v 8480'7 (0'0,8480'7] local-les=16609
>> >> > > > > > > > >> n=0
>> >> > > > > > > > >> ec=1 les/c
>> >> > > > > > > > >> 16609/16659
>> >> > > > > > > > >> 16590/16590/16590) [24,3,23] r=2 lpr=17838
>> >> > > > > > > > >> pi=15659-16589/42
>> >> > > > > > > > >> crt=8480'7 lcod
>> >> > > > > > > > >> 0'0 inactive NOTIFY] enter Start
>> >> > > > > > > > >>Â Â Â  -17> 2015-04-27 10:17:08.808608 7fd8e748d700Â  1
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[0.189( v 8480'7 (0'0,8480'7] local-les=16609
>> >> > > > > > > > >> n=0
>> >> > > > > > > > >> ec=1 les/c
>> >> > > > > > > > >> 16609/16659
>> >> > > > > > > > >> 16590/16590/16590) [24,3,23] r=2 lpr=17838
>> >> > > > > > > > >> pi=15659-16589/42
>> >> > > > > > > > >> crt=8480'7 lcod
>> >> > > > > > > > >> 0'0 inactive NOTIFY] state<Start>: transitioning to Stray
>> >> > > > > > > > >>Â Â Â  -16> 2015-04-27 10:17:08.808621 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[0.189( v 8480'7 (0'0,8480'7] local-les=16609
>> >> > > > > > > > >> n=0
>> >> > > > > > > > >> ec=1 les/c
>> >> > > > > > > > >> 16609/16659
>> >> > > > > > > > >> 16590/16590/16590) [24,3,23] r=2 lpr=17838
>> >> > > > > > > > >> pi=15659-16589/42
>> >> > > > > > > > >> crt=8480'7 lcod
>> >> > > > > > > > >> 0'0 inactive NOTIFY] exit Start 0.000025 0 0.000000
>> >> > > > > > > > >>Â Â Â  -15> 2015-04-27 10:17:08.808637 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[0.189( v 8480'7 (0'0,8480'7] local-les=16609
>> >> > > > > > > > >> n=0
>> >> > > > > > > > >> ec=1 les/c
>> >> > > > > > > > >> 16609/16659
>> >> > > > > > > > >> 16590/16590/16590) [24,3,23] r=2 lpr=17838
>> >> > > > > > > > >> pi=15659-16589/42
>> >> > > > > > > > >> crt=8480'7 lcod
>> >> > > > > > > > >> 0'0 inactive NOTIFY] enter Started/Stray
>> >> > > > > > > > >>Â Â Â  -14> 2015-04-27 10:17:08.808796 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[10.181( empty local-les=17879 n=0 ec=17863
>> >> > > > > > > > >> les/c
>> >> > > > > > > > >> 17879/17879
>> >> > > > > > > > >> 17863/17863/17863) [25,5,23] r=2 lpr=17879 crt=0'0
>> >> > > > > > > > >> inactive NOTIFY] exit Reset 0.119467 4 0.000037
>> >> > > > > > > > >>Â Â Â  -13> 2015-04-27 10:17:08.808817 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[10.181( empty local-les=17879 n=0 ec=17863
>> >> > > > > > > > >> les/c
>> >> > > > > > > > >> 17879/17879
>> >> > > > > > > > >> 17863/17863/17863) [25,5,23] r=2 lpr=17879 crt=0'0
>> >> > > > > > > > >> inactive NOTIFY] enter Started
>> >> > > > > > > > >>Â Â Â  -12> 2015-04-27 10:17:08.808828 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[10.181( empty local-les=17879 n=0 ec=17863
>> >> > > > > > > > >> les/c
>> >> > > > > > > > >> 17879/17879
>> >> > > > > > > > >> 17863/17863/17863) [25,5,23] r=2 lpr=17879 crt=0'0
>> >> > > > > > > > >> inactive NOTIFY] enter Start
>> >> > > > > > > > >>Â Â Â  -11> 2015-04-27 10:17:08.808838 7fd8e748d700Â  1
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[10.181( empty local-les=17879 n=0 ec=17863
>> >> > > > > > > > >> les/c
>> >> > > > > > > > >> 17879/17879
>> >> > > > > > > > >> 17863/17863/17863) [25,5,23] r=2 lpr=17879 crt=0'0
>> >> > > > > > > > >> inactive NOTIFY]
>> >> > > > > > > > >> state<Start>: transitioning to Stray
>> >> > > > > > > > >>Â Â Â  -10> 2015-04-27 10:17:08.808849 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[10.181( empty local-les=17879 n=0 ec=17863
>> >> > > > > > > > >> les/c
>> >> > > > > > > > >> 17879/17879
>> >> > > > > > > > >> 17863/17863/17863) [25,5,23] r=2 lpr=17879 crt=0'0
>> >> > > > > > > > >> inactive NOTIFY] exit Start 0.000020 0 0.000000
>> >> > > > > > > > >>Â Â Â Â  -9> 2015-04-27 10:17:08.808861 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[10.181( empty local-les=17879 n=0 ec=17863
>> >> > > > > > > > >> les/c
>> >> > > > > > > > >> 17879/17879
>> >> > > > > > > > >> 17863/17863/17863) [25,5,23] r=2 lpr=17879 crt=0'0
>> >> > > > > > > > >> inactive NOTIFY] enter Started/Stray
>> >> > > > > > > > >>Â Â Â Â  -8> 2015-04-27 10:17:08.809427 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 inactive] exit Reset 7.511623 45 0.000165
>> >> > > > > > > > >>Â Â Â Â  -7> 2015-04-27 10:17:08.809445 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 inactive] enter Started
>> >> > > > > > > > >>Â Â Â Â  -6> 2015-04-27 10:17:08.809456 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 inactive] enter Start
>> >> > > > > > > > >>Â Â Â Â  -5> 2015-04-27 10:17:08.809468 7fd8e748d700Â  1
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 inactive]
>> >> > > > > > > > >> state<Start>: transitioning to Primary
>> >> > > > > > > > >>Â Â Â Â  -4> 2015-04-27 10:17:08.809479 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 inactive] exit Start 0.000023 0 0.000000
>> >> > > > > > > > >>Â Â Â Â  -3> 2015-04-27 10:17:08.809492 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 inactive] enter Started/Primary
>> >> > > > > > > > >>Â Â Â Â  -2> 2015-04-27 10:17:08.809502 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 inactive] enter Started/Primary/Peering
>> >> > > > > > > > >>Â Â Â Â  -1> 2015-04-27 10:17:08.809513 7fd8e748d700Â  5
>> >> > > > > > > > >> osd.23
>> >> > > > pg_epoch:
>> >> > > > >
>> >> > > > > > > > >> 17882 pg[2.189( empty local-les=16127 n=0 ec=1 les/c
>> >> > > > > > > > >> 16127/16344
>> >> > > > > > > > >> 16125/16125/16125) [23,5] r=0 lpr=17838 crt=0'0 mlcod
>> >> > > > > > > > >> 0'0 peering] enter Started/Primary/Peering/GetInfo
>> >> > > > > > > > >>Â Â Â Â Â  0> 2015-04-27 10:17:08.813837 7fd8e748d700 -1
>> >> > > > > > > ./include/interval_set.h:
>> >> > > > > > > > >> In
>> >> > > > > > > > >> function 'void interval_set<T>::erase(T, T) [with T =
>> >> > > snapid_t]'
>> >> > > > > > > > >> thread
>> >> > > > > > > > >> 7fd8e748d700 time 2015-04-27 10:17:08.809899
>> >> > > > > > > > >> ./include/interval_set.h: 385: FAILED assert(_size >=
>> >> > > > > > > > >> 0)
>> >> > > > > > > > >>
>> >> > > > > > > > >>Â  ceph version 0.94.1
>> >> > > > > > > > >> (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
>> >> > > > > > > > >>Â  1: (ceph::__ceph_assert_fail(char const*, char const*,
>> >> > > > > > > > >> int, char
>> >> > > > > > > > >> const*)+0x8b)
>> >> > > > > > > > >> [0xbc271b]
>> >> > > > > > > > >>Â  2:
>> >> > > > > > > > >> (interval_set<snapid_t>::subtract(interval_set<snapid_t
>> >> > > > > > > > >> >
>> >> > > > > > > > >> const&)+0xb0) [0x82cd50]
>> >> > > > > > > > >>Â  3: (PGPool::update(std::tr1::shared_ptr<OSDMap
>> >> > > > > > > > >> const>)+0x52e) [0x80113e]
>> >> > > > > > > > >>Â  4: (PG::handle_advance_map(std::tr1::shared_ptr<OSDMap
>> >> > > > > > > > >> const>, std::tr1::shared_ptr<OSDMap const>,
>> >> > > > > > > > >> const>std::vector<int,
>> >> > > > > > > > >> std::allocator<int> >&, int, std::vector<int,
>> >> > > > > > > > >> std::allocator<int>
>> >> > > > > > > > >> >&, int, PG::RecoveryCtx*)+0x282) [0x801652]
>> >> > > > > > > > >>Â  5: (OSD::advance_pg(unsigned int, PG*,
>> >> > > > > > > > >> ThreadPool::TPHandle&, PG::RecoveryCtx*,
>> >> > > > > > > > >> std::set<boost::intrusive_ptr<PG>,
>> >> > > > > > > > >> std::less<boost::intrusive_ptr<PG> >,
>> >> > > > > > > > >> std::allocator<boost::intrusive_ptr<PG> > >*)+0x2c3)
>> >> > > > > > > > >> [0x6b0e43]
>> >> > > > > > > > >>Â  6: (OSD::process_peering_events(std::list<PG*,
>> >> > > > > > > > >> std::allocator<PG*>
>> >> > > > > > > > >> > const&,
>> >> > > > > > > > >> ThreadPool::TPHandle&)+0x21c) [0x6b191c]
>> >> > > > > > > > >>Â  7: (OSD::PeeringWQ::_process(std::list<PG*,
>> >> > > > > > > > >> std::allocator<PG*>
>> >> > > > > > > > >> > const&,
>> >> > > > > > > > >> ThreadPool::TPHandle&)+0x18) [0x709278]
>> >> > > > > > > > >>Â  8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e)
>> >> > > > > > > > >> [0xbb38ae]
>> >> > > > > > > > >>Â  9: (ThreadPool::WorkThread::entry()+0x10) [0xbb4950]
>> >> > > > > > > > >>Â  10: (()+0x8182) [0x7fd906946182]
>> >> > > > > > > > >>Â  11: (clone()+0x6d) [0x7fd904eb147d]
>> >> > > > > > > > >>
>> >> > > > > > > > > >> Also by monitoring (ceph -w) I get the following
>> >> > > > > > > > > >> messages, also lots of them.
>> >> > > > > > > > > >>
>> >> > > > > > > > > >> 2015-04-27 10:39:52.935812 mon.0 [INF] from='client.? 10.20.0.13:0/1174409' entity='osd.30' cmd=[{"prefix": "osd crush create-or-move", "args": ["host=ceph3", "root=default"], "id": 30, "weight": 1.82}]: dispatch
>> >> > > > > > > > > >> 2015-04-27 10:39:53.297376 mon.0 [INF] from='client.? 10.20.0.13:0/1174483' entity='osd.26' cmd=[{"prefix": "osd crush create-or-move", "args": ["host=ceph3", "root=default"], "id": 26, "weight": 1.82}]: dispatch
>> >> > > > > > > > >>
>> >> > > > > > > > >>
>> >> > > > > > > > > >> This is a cluster of 3 nodes with 36 OSD's, nodes are
>> >> > > > > > > > > >> also mons and mds's to save servers. All run Ubuntu
>> >> > > > > > > > > >> 14.04.2.
>> >> > > > > > > > >>
>> >> > > > > > > > >> I have pretty much tried everything I could think of.
>> >> > > > > > > > >>
>> >> > > > > > > > >> Restarting daemons doesn't help.
>> >> > > > > > > > >>
>> >> > > > > > > > > >> Any help would be appreciated. I can also provide more
>> >> > > > > > > > > >> logs if necessary. They just seem to get pretty large
>> >> > > > > > > > > >> in a few moments.
>> >> > > > > > > > >>
>> >> > > > > > > > >> Thank you
>> >> > > > > > > > >> Tuomas
>> >> > > > > > > > >>
>> >> > > > > > > > >>
>> >> > > > > > > > >> _______________________________________________
>> >> > > > > > > > >> ceph-users mailing list ceph-users@xxxxxxxxxxxxxx
>> >> > > > > > > > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> > > > > > > > >>
>> >> > > > > > > > >>
>> >> > > > > > > > >>
>> >> > > > > > > > >
>> >> > > > > > > > >
>> >> > > > > > > > > _______________________________________________
>> >> > > > > > > > > ceph-users mailing list
>> >> > > > > > > > > ceph-users@xxxxxxxxxxxxxx
>> >> > > > > > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> > > > > > > > >
>> >> > > > > > > > >
>> >> > > > > > > > >
>> >> > > > > > > >
>> >> > > > > > > >
>> >> > > > > > > > _______________________________________________
>> >> > > > > > > > ceph-users mailing list
>> >> > > > > > > > ceph-users@xxxxxxxxxxxxxx
>> >> > > > > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> > > > > > > >
>> >> > > > > > > >
>> >> > > > > > > >
>> >> > > > > > > > _______________________________________________
>> >> > > > > > > > ceph-users mailing list
>> >> > > > > > > > ceph-users@xxxxxxxxxxxxxx
>> >> > > > > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> > > > > > > >
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> > >
>> >> > _______________________________________________
>> >> > ceph-users mailing list
>> >> > ceph-users@xxxxxxxxxxxxxx
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
>> >> >
>> >>
>> >
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com