Re: Fwd: monitor crashing

Thanks for all the help, Sage. The cluster is now back to life with
your awesome patch.
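For anyone who finds this thread later: pieced together from the messages below, the recovery boils down to the sketch that follows. The pool name and the service invocation are assumptions (they aren't in the thread, and init systems differ between precise and trusty), and the helper only echoes each command as a dry run; drop the echo to run the steps for real.

```shell
#!/bin/sh
# Recovery sketch pieced together from the thread below. The pool name
# and the service invocation are assumptions; every command is echoed
# ("dry run"), never executed.
POOL="broken-lrc-pool"   # hypothetical name of the pool with the bad LRC profile
CMDS=""
run() { CMDS="$CMDS $*;"; echo "+ $*"; }

# 1. Stop the OSDs first so they don't receive pg create messages for
#    the broken pool and crash too.
run service ceph stop osd

# 2. Loic's trick: have the mon skip the crush verification entirely.
run ceph tell 'mon.*' injectargs --crushtool /bin/true

# 3. With the patched mon (wip-ecpool-hammer) running, delete the pool.
run ceph osd pool delete "$POOL" "$POOL" --yes-i-really-really-mean-it

# 4. Start the OSDs again; they don't need to be upgraded.
run service ceph start osd
```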

On Tue, Oct 13, 2015 at 3:35 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Tue, 13 Oct 2015, Luis Periquito wrote:
>> Hi Sage,
>>
>> awesome help.
>>
>> Sorry for not mentioning it before, but I'm running 2 MONs on precise
>> and 1 MON on trusty. Looking at the status page
>> (http://ceph.com/gitbuilder.cgi) it seems the precise build is
>> failing... Can you have a look?
>
> I've repushed the branch, this time cherry-picking the master fix.  Let me
> know if you run into other problems!
>
> Thanks-
> sage
>
>>
>> thanks,
>>
>>
>> On Tue, Oct 13, 2015 at 2:59 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> > On Tue, 13 Oct 2015, Loic Dachary wrote:
>> >> https://github.com/ceph/ceph/compare/hammer...wip-ecpool-hammer
>> >>
>> >> In order to bypass the crush verification, you could:
>> >>
>> >> ceph tell mon.* injectargs --crushtool /bin/true
>> >
>> > Ah, good trick!
>> >
>> >         http://tracker.ceph.com/issues/13477
>> >
>> > is the ticket, and my fix for master is
>> >
>> >         https://github.com/ceph/ceph/pull/6246
>> >
>> > sage
>> >
>> >>
>> >> Cheers
>> >>
>> >> On 13/10/2015 15:41, Sage Weil wrote:
>> >> > On Tue, 13 Oct 2015, Luis Periquito wrote:
>> >> >> the store.db dir is 3.4GB big :(
>> >> >>
>> >> >> can I do it on my side?
>> >> >
>> >> > Never mind, I was able to reproduce it from the bugzilla.  I've pushed a
>> >> > branch wip-ecpool-hammer.  Not sure which distro you're on, but packages
>> >> > will appear at gitbuilder.ceph.com in 30-45 minutes.  This fixes the mon
>> >> > crash, which will let you delete the pool.  I suggest stopping the OSDs
>> >> > before starting the mon with this or else they might get pg create
>> >> > messages and crash too.  Once the pool is removed you can start them
>> >> > again.  They shouldn't need to be upgraded.
>> >> >
>> >> > Note that the latest hammer doesn't let you create the pool at all because
>> >> > it fails the crush safety check (I had to disable the check to reproduce
>> >> > this), so that's good at least!
>> >> >
>> >> > sage
>> >> >
>> >> >>
>> >> >> On Tue, Oct 13, 2015 at 2:25 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> >> >>> On Tue, 13 Oct 2015, Luis Periquito wrote:
>> >> >>>> Any ideas? I'm growing desperate :(
>> >> >>>>
>> >> >>>> I've tried compiling from source, and including
>> >> >>>> https://github.com/ceph/ceph/pull/5276, but it still crashes on boot
>> >> >>>> of the ceph-mon
>> >> >>>
>> >> >>> If you can email a (link to a) tarball of your mon data directory I'd love
>> >> >>> to extract the osdmap and see why crush is crashing... it's obviously not
>> >> >>> supposed to do that (even with a bad rule).  You can also use
>> >> >>> the ceph-post-file utility.
>> >> >>>
>> >> >>> Thanks!
>> >> >>> sage
>> >> >>>
>> >> >>>
>> >> >>>>
>> >> >>>> ---------- Forwarded message ----------
>> >> >>>> From: Luis Periquito <periquito@xxxxxxxxx>
>> >> >>>> Date: Tue, Oct 13, 2015 at 12:26 PM
>> >> >>>> Subject: Re: monitor crashing
>> >> >>>> To: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
>> >> >>>>
>> >> >>>>
>> >> >>>> I'm currently running Hammer (0.94.3). I created an invalid LRC profile
>> >> >>>> (a typo in the l= parameter: it should have been l=4 but was l=3, so now
>> >> >>>> I don't have enough different ruleset-locality groups) and then created a
>> >> >>>> pool. Is there any way to delete this pool? Remember, I can't start the
>> >> >>>> ceph-mon...
>> >> >>>>
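For reference, a sketch of the kind of profile and pool creation being described above. Only the l=3-versus-l=4 typo comes from the thread; the k/m values, bucket types, and all names are hypothetical, and the commands are echoed (dry run), not executed.

```shell
#!/bin/sh
# Hypothetical reconstruction of the bad profile. Only the l=3-vs-l=4
# typo comes from the thread; k/m values, bucket types, and names are
# made up. Commands are echoed ("dry run"), never executed.
CMDS=""
run() { CMDS="$CMDS $*;"; echo "+ $*"; }

# With e.g. k=8 m=4, l=4 groups the 12 chunks into 3 localities, while
# the typo l=3 asks for 4 distinct ruleset-locality groups - one more
# than this hypothetical crush map provides.
run ceph osd erasure-code-profile set lrc-typo \
    plugin=lrc k=8 m=4 l=3 \
    ruleset-failure-domain=host ruleset-locality=rack

run ceph osd pool create lrc-pool 128 128 erasure lrc-typo
```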
>> >> >>>> On Tue, Oct 13, 2015 at 11:56 AM, Luis Periquito <periquito@xxxxxxxxx> wrote:
>> >> >>>>> It seems I've hit this bug:
>> >> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1231630
>> >> >>>>>
>> >> >>>>> Is there any way I can recover this cluster? It worked in our test
>> >> >>>>> cluster, but crashed the production one...
>> >> >>>> --
>> >> >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> >> >>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> >>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >> >>>>
>> >> >>>>
>> >>
>> >> --
>> >> Loïc Dachary, Artisan Logiciel Libre
>> >>
>> >>


