Re: Fwd: monitor crashing

On Tue, 13 Oct 2015, Loic Dachary wrote:
> https://github.com/ceph/ceph/compare/hammer...wip-ecpool-hammer
> 
> In order to bypass the crush verification, you could:
> 
> ceph tell mon.* injectargs --crushtool /bin/true

Ah, good trick!

	http://tracker.ceph.com/issues/13477

is the ticket, and my fix for master is

	https://github.com/ceph/ceph/pull/6246

sage

> 
> Cheers
> 
> On 13/10/2015 15:41, Sage Weil wrote:
> > On Tue, 13 Oct 2015, Luis Periquito wrote:
> >> The store.db dir is 3.4 GB :(
> >>
> >> Can I do it on my side?
> > 
> > Never mind, I was able to reproduce it from the Bugzilla report.  I've pushed a 
> > branch wip-ecpool-hammer.  Not sure which distro you're on, but packages 
> > will appear at gitbuilder.ceph.com in 30-45 minutes.  This fixes the mon 
> > crash, which will let you delete the pool.  I suggest stopping the OSDs 
> > before starting the mon with this or else they might get pg create 
> > messages and crash too.  Once the pool is removed you can start them 
> > again.  They shouldn't need to be upgraded.
> > 
> > Note that the latest hammer doesn't let you create the pool at all because 
> > it fails the crush safety check (I had to disable the check to reproduce 
> > this), so that's good at least!
> > 
> > sage
> > 
> >>
> >> On Tue, Oct 13, 2015 at 2:25 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> >>> On Tue, 13 Oct 2015, Luis Periquito wrote:
> >>>> Any ideas? I'm growing desperate :(
> >>>>
> >>>> I've tried compiling from source, and including
> >>>> https://github.com/ceph/ceph/pull/5276, but it still crashes on boot
> >>>> of the ceph-mon
> >>>
> >>> If you can email a (link to a) tarball of your mon data directory I'd love
> >>> to extract the osdmap and see why crush is crashing. It's obviously not
> >>> supposed to do that (even with a bad rule).  You can also use
> >>> the ceph-post-file utility.
> >>>
> >>> Thanks!
> >>> sage
> >>>
> >>>
> >>>>
> >>>> ---------- Forwarded message ----------
> >>>> From: Luis Periquito <periquito@xxxxxxxxx>
> >>>> Date: Tue, Oct 13, 2015 at 12:26 PM
> >>>> Subject: Re: monitor crashing
> >>>> To: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> >>>>
> >>>>
> >>>> I'm currently running Hammer (0.94.3). I created an invalid LRC profile
> >>>> (a typo in l=: it should have been l=4 but was l=3, so I don't have
> >>>> enough distinct ruleset-locality buckets) and then created a pool with
> >>>> it. Is there any way to delete this pool? Remember, I can't start the
> >>>> ceph-mon...
> >>>>
> >>>> On Tue, Oct 13, 2015 at 11:56 AM, Luis Periquito <periquito@xxxxxxxxx> wrote:
> >>>>> It seems I've hit this bug:
> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1231630
> >>>>>
> >>>>> Is there any way I can recover this cluster? It worked in our test
> >>>>> cluster, but crashed the production one...
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>
> >>>>
> > 
> 
> -- 
> Loïc Dachary, Artisan Logiciel Libre
> 
> 
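The recovery order Sage describes in the quoted message (stop the OSDs, start the patched mon, delete the pool, start the OSDs again) could be sketched roughly as below. This is only a dry-run outline under stated assumptions: the pool name is hypothetical, and the service commands for stopping and starting daemons are guesses for a sysvinit-style Hammer install, so override them for your distro. It prints the commands by default and does not install the wip-ecpool-hammer packages for you.

```shell
#!/bin/sh
# Dry-run sketch of the recovery order from the thread.
# ASSUMPTIONS (not from the thread): the pool name and the service commands
# below are placeholders; adjust them for your cluster and init system.
POOL="${POOL:-badlrcpool}"                        # hypothetical pool name
STOP_OSDS="${STOP_OSDS:-service ceph stop osd}"   # assumption: sysvinit
START_MON="${START_MON:-service ceph start mon}"  # assumption: sysvinit
START_OSDS="${START_OSDS:-service ceph start osd}" # assumption: sysvinit
DRY_RUN="${DRY_RUN:-1}"                           # 1 = only print commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover() {
    # 1. Stop the OSDs first so they do not receive pg create messages.
    run $STOP_OSDS || return 1
    # 2. Start the mon (with the fixed mon packages already installed).
    run $START_MON || return 1
    # 3. Delete the pool that was created from the invalid LRC profile.
    run ceph osd pool delete "$POOL" "$POOL" --yes-i-really-really-mean-it || return 1
    # 4. Start the OSDs again; per the thread they should not need upgrading.
    run $START_OSDS
}

recover
```

Running it with DRY_RUN=0 would execute the commands instead of printing them; reviewing the printed sequence first seems the safer choice on a production cluster.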

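As background on the l=3 versus l=4 typo at the root of this thread: as I understand the LRC plugin, the k+m chunks are split into sets of size l, and each set is placed in its own ruleset-locality bucket, so a smaller l needs more such buckets. A small sketch of that arithmetic follows. The values k=8 and m=4 are hypothetical, since the thread does not give the real profile, and the helper name lrc_groups is made up for illustration.

```shell
#!/bin/sh
# lrc_groups K M L
# Prints the number of locality groups, (K+M)/L, which is also the number of
# distinct ruleset-locality buckets needed. Fails if K+M is not a multiple
# of L. Hypothetical helper, not part of Ceph.
lrc_groups() {
    k=$1; m=$2; l=$3
    total=$((k + m))
    if [ "$l" -le 0 ] || [ $((total % l)) -ne 0 ]; then
        echo "k+m=$total is not a multiple of l=$l" >&2
        return 1
    fi
    echo $((total / l))
}

lrc_groups 8 4 4   # prints 3
lrc_groups 8 4 3   # prints 4: one more locality bucket is required
```

Comparing the result against the number of buckets of the ruleset-locality type in the CRUSH map before creating the pool would catch this kind of typo early.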