Re: 5 pgs inactive, 5 pgs incomplete

Sorry for pressing, but it would help us a lot if someone with deeper
knowledge could confirm that marking the PG as complete on the
secondary OSD will not render the whole CephFS pool unusable. We are
aware that it could mean some files will be lost or inconsistent, but
we want to be sure it will not affect all of the data in the pool or
the pool as a whole.
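
For reference, this is roughly the sequence we would run on the OSD
that still has the PG's data (a sketch only; the OSD id, PG id, and
backup file path are placeholders), exporting the PG first as a backup
before marking it complete:

% systemctl stop ceph-osd@##
% ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-## \
      --pgid <ID> --op export --file /root/pg-<ID>.export   # keep a backup copy of the PG
% ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-## \
      --pgid <ID> --op mark-complete
% systemctl start ceph-osd@##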

On Thu, Aug 20, 2020 at 11:55 AM Martin Palma <martin@xxxxxxxx> wrote:
>
> On one pool, which was only a test pool, we investigated both OSDs
> which host the inactive and incomplete PG with the following command:
>
> % ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-## --pgid
> <ID> --op list
>
> On the primary OSD for the PG we saw no output, but on the secondary
> we got output. So we marked that PG as complete on that OSD, which
> resolved the inactive/incomplete PG for that pool.
>
> The other PGs belong to our main CephFS pool, and we fear that by
> doing the above we could lose access to the whole pool and its data.
>
> On Thu, Aug 20, 2020 at 11:49 AM Martin Palma <martin@xxxxxxxx> wrote:
> >
> > Yes, we already did that, but since the OSD does not exist anymore we
> > get the following error:
> >
> > % ceph osd lost 81 --yes-i-really-mean-it
> > Error ENOENT: osd.81 does not exist
> >
> > So we do not know how to make the PGs recognize that OSD 81 does not
> > exist anymore...
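> >
> > The only workaround we can think of (not sure it is advisable, so
> > consider this a hypothetical sketch) would be to recreate the OSD id
> > just so it can be marked lost, and remove it again afterwards:
> >
> > % ceph osd create $(uuidgen) 81          # recreate id 81 only so it can be marked lost
> > % ceph osd lost 81 --yes-i-really-mean-it
> > % ceph osd rm 81                         # remove the placeholder entry again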
> >
> > On Thu, Aug 20, 2020 at 11:41 AM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> > >
> > > Did you already mark osd.81 as lost?
> > >
> > > AFAIU you need to `ceph osd lost 81`, and *then* you can try the
> > > osd_find_best_info_ignore_history_les option.
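> > >
> > > A sketch of how I would set that option on the primary OSD of the
> > > stuck PG (the OSD id is a placeholder, and I would revert it to
> > > false once the PG has peered):
> > >
> > > % ceph tell osd.<primary-id> injectargs '--osd_find_best_info_ignore_history_les=true'
> > >
> > > If the injected value does not take effect before peering, setting
> > > it in ceph.conf and restarting that OSD is another way to apply it.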
> > >
> > > -- dan
> > >
> > >
> > > On Thu, Aug 20, 2020 at 11:31 AM Martin Palma <martin@xxxxxxxx> wrote:
> > > >
> > > > All inactive and incomplete PGs are blocked by OSD 81, which does
> > > > not exist anymore:
> > > > ...
> > > > "down_osds_we_would_probe": [
> > > >                 81
> > > >             ],
> > > >             "peering_blocked_by": [],
> > > >             "peering_blocked_by_detail": [
> > > >                 {
> > > >                     "detail": "peering_blocked_by_history_les_bound"
> > > >                 }
> > > >             ]
> > > > ...
> > > >
> > > > Here the full output: https://pastebin.com/V5EPZ0N7
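> > > >
> > > > The snippet above is from the recovery_state section of the query
> > > > output; assuming jq is installed, something like this pulls out just
> > > > that part:
> > > >
> > > > % ceph pg <ID> query | jq '.recovery_state'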
> > > >
> > > > On Thu, Aug 20, 2020 at 10:58 AM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Something else to help debugging is
> > > > >
> > > > > ceph pg 17.173 query
> > > > >
> > > > > at the end it should say why the pg is incomplete.
> > > > >
> > > > > -- dan
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Aug 20, 2020 at 10:01 AM Eugen Block <eblock@xxxxxx> wrote:
> > > > > >
> > > > > > Hi Martin,
> > > > > >
> > > > > > have you seen this blog post [1]? It describes how to recover from
> > > > > > inactive and incomplete PGs (on a size 1 pool). I haven't tried any of
> > > > > > that, but it could be worth a try. Apparently it would only work if the
> > > > > > affected PGs have 0 objects, but that seems to be the case, right?
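> > > > > >
> > > > > > If I remember the post correctly (I have not verified this), the gist
> > > > > > is to recreate the empty PGs, something along the lines of:
> > > > > >
> > > > > > % ceph osd force-create-pg <pgid>
> > > > > >
> > > > > > which obviously only makes sense for PGs that really contain 0 objects.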
> > > > > >
> > > > > > Regards,
> > > > > > Eugen
> > > > > >
> > > > > > [1]
> > > > > > https://medium.com/opsops/recovering-ceph-from-reduced-data-availability-3-pgs-inactive-3-pgs-incomplete-b97cbcb4b5a1
> > > > > >
> > > > > >
> > > > > > Quoting Martin Palma <martin@xxxxxxxx>:
> > > > > >
> > > > > > > If any Ceph consultants are reading this, please feel free to contact
> > > > > > > me off-list. We are looking for someone who can help us; of course we
> > > > > > > will pay.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Aug 17, 2020 at 12:50 PM Martin Palma <martin@xxxxxxxx> wrote:
> > > > > > >>
> > > > > > >> After doing some research, I suspect the problem is that an OSD was
> > > > > > >> removed while the cluster was backfilling.
> > > > > > >>
> > > > > > >> Now the inactive and incomplete PGs all list the same (removed) OSD in
> > > > > > >> their "down_osds_we_would_probe" output, and peering is blocked by
> > > > > > >> "peering_blocked_by_history_les_bound". We tried setting
> > > > > > >> "osd_find_best_info_ignore_history_les = true", but with no success;
> > > > > > >> the OSDs stay in a peering loop.
> > > > > > >>
> > > > > > >> On Mon, Aug 17, 2020 at 9:53 AM Martin Palma <martin@xxxxxxxx> wrote:
> > > > > > >> >
> > > > > > >> > Here is the output with all OSD up and running.
> > > > > > >> >
> > > > > > >> > ceph -s: https://pastebin.com/5tMf12Lm
> > > > > > >> > ceph health detail: https://pastebin.com/avDhcJt0
> > > > > > >> > ceph osd tree: https://pastebin.com/XEB0eUbk
> > > > > > >> > ceph osd pool ls detail: https://pastebin.com/ShSdmM5a
> > > > > > >> >
> > > > > > >> > On Mon, Aug 17, 2020 at 9:38 AM Martin Palma <martin@xxxxxxxx> wrote:
> > > > > > >> > >
> > > > > > >> > > Hi Peter,
> > > > > > >> > >
> > > > > > >> > > Over the weekend another host went down due to power problems and was
> > > > > > >> > > restarted, so these outputs also include some "Degraded data
> > > > > > >> > > redundancy" messages. In addition, one OSD crashed due to a disk error.
> > > > > > >> > >
> > > > > > >> > > ceph -s: https://pastebin.com/Tm8QHp52
> > > > > > >> > > ceph health detail: https://pastebin.com/SrA7Bivj
> > > > > > >> > > ceph osd tree: https://pastebin.com/nBK8Uafd
> > > > > > >> > > ceph osd pool ls detail: https://pastebin.com/kYyCb7B2
> > > > > > >> > >
> > > > > > >> > > No, the pool with the inactive+incomplete PGs is not an EC pool.
> > > > > > >> > >
> > > > > > >> > > ceph osd crush dump | jq '[.rules, .tunables]': https://pastebin.com/gqDTjfat
> > > > > > >> > >
> > > > > > >> > > Best,
> > > > > > >> > > Martin
> > > > > > >> > >
> > > > > > >> > > On Sun, Aug 16, 2020 at 3:44 PM Peter Maloney
> > > > > > >> > > <peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > > > > >> > > >
> > > > > > >> > > > Dear Martin,
> > > > > > >> > > >
> > > > > > >> > > > Can you provide some details?
> > > > > > >> > > >
> > > > > > >> > > > ceph -s
> > > > > > >> > > > ceph health detail
> > > > > > >> > > > ceph osd tree
> > > > > > >> > > > ceph osd pool ls detail
> > > > > > >> > > >
> > > > > > >> > > > If it's EC (you implied it's not) also show the crush rules... and may
> > > > > > >> > > > as well include the tunables (because greatly raising
> > > > > > >> > > > choose_total_tries, e.g. to 200, may be the solution to your problem):
> > > > > > >> > > > ceph osd crush dump | jq '[.rules, .tunables]'
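> > > > > > >> > > >
> > > > > > >> > > > If tunables turn out to be the issue, the usual way to raise
> > > > > > >> > > > choose_total_tries is to edit the decompiled crush map, roughly
> > > > > > >> > > > like this (a sketch; test the new map before injecting it):
> > > > > > >> > > >
> > > > > > >> > > > % ceph osd getcrushmap -o crush.bin
> > > > > > >> > > > % crushtool -d crush.bin -o crush.txt
> > > > > > >> > > > # edit crush.txt: tunable choose_total_tries 200
> > > > > > >> > > > % crushtool -c crush.txt -o crush.new
> > > > > > >> > > > % ceph osd setcrushmap -i crush.new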
> > > > > > >> > > >
> > > > > > >> > > > Peter
> > > > > > >> > > >
> > > > > > >> > > > On 8/16/20 1:18 AM, Martin Palma wrote:
> > > > > > >> > > > > Yes, but that didn't help. After some time they have blocked
> > > > > > >> > > > > requests again and remain inactive and incomplete.
> > > > > > >> > > > >
> > > > > > >> > > > > On Sat, 15 Aug 2020 at 16:58, <ceph@xxxxxxxxxx> wrote:
> > > > > > >> > > > >
> > > > > > >> > > > >> Did you try restarting the said OSDs?
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > >> Hth
> > > > > > >> > > > >>
> > > > > > >> > > > >> Mehmet
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > >> On 12 August 2020 21:07:55 CEST, Martin Palma <martin@xxxxxxxx> wrote:
> > > > > > >> > > > >>
> > > > > > >> > > > >>>> Are the OSDs online? Or do they refuse to boot?
> > > > > > >> > > > >>> Yes. They are up and running and not marked as down or out of the
> > > > > > >> > > > >>> cluster.
> > > > > > >> > > > >>>> Can you list the data with ceph-objectstore-tool on these OSDs?
> > > > > > >> > > > >>> If you mean the "list" operation: yes, it works on the PG and
> > > > > > >> > > > >>> produces output, for example:
> > > > > > >> > > > >>> $ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-63 --pgid 22.11a --op list
> > > > > > >> > > > >>> ["22.11a",{"oid":"1001c1ee04f.00000007","key":"","snapid":-2,"hash":3825189146,"max":0,"pool":22,"namespace":"","max":0}]
> > > > > > >> > > > >>> ["22.11a",{"oid":"1000448667f.00000000","key":"","snapid":-2,"hash":4294951194,"max":0,"pool":22,"namespace":"","max":0}]
> > > > > > >> > > > >>> ...
> > > > > > >> > > > >>> If I run "ceph pg ls incomplete", only one PG in the output has
> > > > > > >> > > > >>> objects... all the others have 0 objects.
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > --
> > > > > > >> > > > --------------------------------------------
> > > > > > >> > > > Peter Maloney
> > > > > > >> > > > Brockmann Consult GmbH
> > > > > > >> > > > www.brockmann-consult.de
> > > > > > >> > > > Chrysanderstr. 1
> > > > > > >> > > > D-21029 Hamburg, Germany
> > > > > > >> > > > Tel: +49 (0)40 69 63 89 - 320
> > > > > > >> > > > E-mail: peter.maloney@xxxxxxxxxxxxxxxxxxxx
> > > > > > >> > > > Amtsgericht Hamburg HRB 157689
> > > > > > >> > > > Geschäftsführer Dr. Carsten Brockmann
> > > > > > >> > > > --------------------------------------------
> > > > > > >> > > >
> > > > > >
> > > > > >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



