I would not run the cephfs disaster recovery tools. Your cluster was offline here because it couldn't do some writes, but it should still be self-consistent.
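If what you're after is just a consistency check rather than the recovery tools, the closest thing to an fsck that I know of is the MDS forward scrub, which should be read-only unless you explicitly ask it to repair. Once an MDS is active again, I believe it's invoked roughly like this (with "mds.a" standing in for whatever your active daemon is called):

    ceph daemon mds.a scrub_path / recursive

But as above, I wouldn't expect you to need even that here.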
On Thu, Jun 14, 2018 at 4:52 PM Oliver Schulz <oliver.schulz@xxxxxxxxxxxxxx> wrote:
They are recovered now, looks like it just took a bit
for them to "jump the queue". :-)
Whew ...
I remember something about there now being some kind of
fsck for CephFS. Is that something I can/should run
before I start my MDS daemons again?
Maybe then I can finally reduce my MDS max-rank back
to one (I went to two, but it caused trouble, I think
because we have some clients that are too old).
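(If I'm reading the docs right, going back to a single
rank on Luminous would be something like the following,
with "cephfs" standing in for our filesystem name:

    ceph fs set cephfs max_mds 1
    ceph mds deactivate cephfs:1

Please correct me if that's not the right way to do it.)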
On 14.06.2018 22:47, Gregory Farnum wrote:
> I don't think there's a way to help them. They "should" get priority in
> recovery, but there were a number of bugs with it in various versions
> and forcing that kind of priority without global decision making is
> prone to issues.
>
> But yep, looks like things will eventually become all good now. :)
>
> On Thu, Jun 14, 2018 at 4:39 PM Oliver Schulz
> <oliver.schulz@xxxxxxxxxxxxxx> wrote:
>
> Thanks, Greg!!
>
> I reset all the OSD weights to 1.00, and I think I'm in a much
> better state now. The only trouble left in "ceph health detail" is
>
> PG_DEGRADED Degraded data redundancy: 4/404985012 objects degraded (0.000%), 3 pgs degraded
>     pg 2.47 is active+recovery_wait+degraded+remapped, acting [177,68,187]
>     pg 2.1fd is active+recovery_wait+degraded+remapped, acting [36,83,185]
>     pg 2.748 is active+recovery_wait+degraded, acting [31,8,149]
>
> (There are a lot of misplaced PGs now, obviously.) The interesting
> thing is that my "lost" PG is back, too, with three acting OSDs.
>
> Maybe I dodged the bullet - what do you think?
>
> One question: Is there a way to give recovery of the three
> degraded PGs priority over backfilling the misplaced ones?
> I tried "ceph pg force-recovery" but it didn't seem to have
> any effect, they were still on "recovery_wait", after.
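> (For reference, what I ran was roughly the following, using the PG IDs
> from the health output above:
>
>     ceph pg force-recovery 2.47 2.1fd 2.748
>
> Hopefully that was the right form of the command.)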
>
>
> Cheers,
>
> Oliver
>
>
> On 14.06.2018 22:09, Gregory Farnum wrote:
> > On Thu, Jun 14, 2018 at 4:07 PM Oliver Schulz
> > <oliver.schulz@xxxxxxxxxxxxxx> wrote:
> >
> > Hi Greg,
> >
> > I increased the hard limit and rebooted everything. The
> > PG without acting OSDs still has none, but I also have
> > quite a few PGs that look like this now:
> >
> > pg 1.79c is stuck undersized for 470.640254, current state
> > active+undersized+degraded, last acting [179,154]
> >
> > I had that problem before (only two acting OSDs on a few PGs);
> > I always solved it by setting the primary OSD to out and then
> > back in a few seconds later (resulting in a very quick recovery,
> > then all was fine again). But maybe that's not the ideal solution?
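> > (Concretely, for a PG like 1.79c above, what I did was roughly:
> >
> >     ceph osd out 179
> >     # wait a few seconds for peering to settle, then
> >     ceph osd in 179
> >
> > with osd.179 being the primary of its acting set.)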
> >
> > Here's "ceph pg map" for one of them:
> >
> > osdmap e526060 pg 1.79c (1.79c) -> up [179,154] acting [179,154]
> >
> > I also have two PG's that have only one acting OSD, now:
> >
> > osdmap e526060 pg 0.58a (0.58a) -> up [174] acting [174]
> > osdmap e526060 pg 2.139 (2.139) -> up [61] acting [61]
> >
> > How can I make Ceph assign three OSDs to all of these weird PGs?
> > Before the reboot, they all did have three OSDs assigned (except for
> > the one that has none), and they were not shown as degraded.
> >
> >
> > > If it's the second, then fixing the remapping problem will resolve it.
> > > That's probably/hopefully just by undoing the reweight-by-utilization
> > > changes.
> >
> > How do I best do that? Just set all the weights back to 1.00?
> >
> >
> > Yeah. This is probably the best way to fix up the other undersized PGs —
> > at least, assuming it doesn't result in an over-full PG!
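> > (Something along these lines should reset every override reweight back
> > to 1.00, though I'm writing it from memory, so double-check it first:
> >
> >     for id in $(ceph osd ls); do ceph osd reweight "$id" 1.0; done
> >
> > Alternatively, just reweight the handful of OSDs you know were changed.)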
> >
> > I don't work with overflowing OSDs/clusters often, but my suspicion is
> > you're better off with something like CERN's reweight scripts than using
> > reweight-by-utilization. Unless it's improved without my noticing, that
> > algorithm just isn't very good. :/
> > -Greg
> >
> >
> >
> > Cheers,
> >
> > Oliver
> >
> >
> > P.S.: Thanks so much for helping!
> >
> >
> >
> > On 14.06.2018 21:37, Gregory Farnum wrote:
> > > On Thu, Jun 14, 2018 at 3:26 PM Oliver Schulz
> > > <oliver.schulz@xxxxxxxxxxxxxx> wrote:
> > >
> > > But the contents of the remapped PGs should still be
> > > Ok, right? What confuses me is that they don't
> > > backfill - why don't they "move" where they belong?
> > >
> > > As for the PG hard limit, yes, I ran into this. Our
> > > cluster had been very (very) full, but I wanted the
> > > new OSD nodes to use bluestore, so I updated to
> > > Luminous before I added the additional storage. I
> > > temporarily increased the pg hard limit and after
> > > a while (and after adding the new OSDs) the cluster
> > > seemed to be in a decent state again. Afterwards,
> > > I set the PG hard limit back to normal.
> > >
> > > I don't have a "too many PGs per OSD" health warning,
> > > currently - should I still increase the PG hard limit?
> > >
> > >
> > > Well, it's either the hard limit getting hit, or the fact that the PG
> > > isn't getting mapped to any OSD and there not being an existing primary
> > > to take responsibility for remapping it.
> > >
> > > If it's the second, then fixing the remapping problem will resolve it.
> > > That's probably/hopefully just by undoing the reweight-by-utilization
> > > changes.
> > >
> > >
> > > On 14.06.2018 20:58, Gregory Farnum wrote:
> > > > Okay, I can’t tell you what happened to that one pg, but you’ve got
> > > > another 445 remapped pgs and that’s not a good state to be in. It was
> > > > probably your use of reweight-by-utilization. :/ I am pretty sure
> > > > the missing PG and remapped ones have the same root cause, and it’s
> > > > possible but by no means certain fixing one will fix the others.
> > > >
> > > >
> > > > ...oh, actually, the most likely cause just came up in an unrelated
> > > > conversation. You’ve probably run into the pg overdose protection that
> > > > was added in luminous. Check the list archives for the exact name, but
> > > > you’ll want to increase the pg hard limit and restart the osds that
> > > > exceeded the previous/current setting.
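> > > > (From memory, the options involved are mon_max_pg_per_osd and
> > > > osd_max_pg_per_osd_hard_ratio; as a rough example only, something like
> > > >
> > > >     [global]
> > > >     mon_max_pg_per_osd = 300
> > > >     osd_max_pg_per_osd_hard_ratio = 3
> > > >
> > > > in ceph.conf, followed by restarting the affected OSDs, but do check
> > > > the archives for the exact names and sensible values.)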
> > > > -Greg
> > > > On Thu, Jun 14, 2018 at 2:33 PM Oliver Schulz
> > > > <oliver.schulz@xxxxxxxxxxxxxx> wrote:
> > > >
> > > > I'm not running the balancer, but I did reweight-by-utilization
> > > > a few times recently.
> > > >
> > > > "ceph osd tree" and "ceph -s" say:
> > > >
> > > > https://gist.github.com/oschulz/36d92af84851ec42e09ce1f3cacbc110
> > > >
> > > >
> > > >
> > > > On 14.06.2018 20:23, Gregory Farnum wrote:
> > > > > Well, if this pg maps to no osds, something has certainly gone wrong
> > > > > with your crush map. What’s the crush rule it’s using, and what’s the
> > > > > output of “ceph osd tree”?
> > > > > Are you running the manager’s balancer module or something that might
> > > > > be putting explicit mappings into the osd map and broken it?
> > > > >
> > > > > I’m not certain off-hand about the pg reporting, but I believe if it’s
> > > > > reporting the state as unknown that means *no* running osd which
> > > > > contains any copy of that pg. That’s not something which ceph could do
> > > > > on its own without failures of osds. What’s the output of “ceph -s”?
> > > > > On Thu, Jun 14, 2018 at 2:15 PM Oliver Schulz
> > > > > <oliver.schulz@xxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Dear Greg,
> > > > >
> > > > > no, it's a very old cluster (continuous operation since 2013,
> > > > > with multiple extensions). It's a production cluster and
> > > > > there's about 300TB of valuable data on it.
> > > > >
> > > > > We recently updated to luminous and added more OSDs (a month
> > > > > ago or so), but everything seemed OK since then. We didn't have
> > > > > any disk failures, but we had trouble with the MDS daemons
> > > > > in the last few days, so there were a few reboots.
> > > > >
> > > > > Is it somehow possible to find this "lost" PG again? Since
> > > > > it's in the metadata pool, large parts of our CephFS directory
> > > > > tree are currently unavailable. I turned the MDS daemons off
> > > > > for now ...
> > > > >
> > > > >
> > > > > Cheers
> > > > >
> > > > > Oliver
> > > > >
> > > > > On 14.06.2018 19:59, Gregory Farnum wrote:
> > > > > > Is this a new cluster? Or did the crush map change somehow
> > > > > > recently? One way this might happen is if CRUSH just failed
> > > > > > entirely to map a pg, although I think if the pg exists anywhere
> > > > > > it should still be getting reported as inactive.
> > > > > > On Thu, Jun 14, 2018 at 8:40 AM Oliver Schulz
> > > > > > <oliver.schulz@xxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Dear all,
> > > > > >
> > > > > > I have a serious problem with our Ceph cluster: One of our PGs
> > > > > > somehow ended up in this state (reported by "ceph health detail"):
> > > > > >
> > > > > >     pg 1.XXX is stuck inactive for ..., current state unknown,
> > > > > >     last acting []
> > > > > >
> > > > > > Also, "ceph pg map 1.xxx" reports:
> > > > > >
> > > > > >     osdmap e525812 pg 1.721 (1.721) -> up [] acting []
> > > > > >
> > > > > > I can't use "ceph pg 1.XXX query", it just hangs with no output.
> > > > > >
> > > > > > All OSDs are up and in, I have MON quorum, all other PGs seem
> > > > > > to be fine.
> > > > > >
> > > > > > How can I diagnose/fix this? Unfortunately, the PG in question is
> > > > > > part of the CephFS metadata pool ...
> > > > > >
> > > > > > Any help would be very, very much appreciated!
> > > > > >
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Oliver
> > > > > >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com