Re: PG auto repair with BlueStore

While I also believe it to be perfectly safe on a BlueStore cluster
(especially since osd_scrub_auto_repair_num_errors is there in case
more is wrong than your usual bit rot), we don't run any cluster
with this option at the moment. We had it enabled for some time before
we backported the OOM-read-error stuff on some clusters.
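
To illustrate the role of osd_scrub_auto_repair_num_errors mentioned above,
here is a minimal sketch of the decision as I understand it: auto repair is
skipped when a scrub finds more errors than the threshold. The function name
and the default of 5 are assumptions made for illustration, and this is not
the actual OSD code.

```python
def should_auto_repair(num_errors, auto_repair_enabled, max_errors=5):
    """Return True if a scrub result would qualify for automatic repair.

    Hypothetical helper. max_errors stands in for
    osd_scrub_auto_repair_num_errors; the default of 5 is an assumption.
    """
    if not auto_repair_enabled:
        # osd_scrub_auto_repair is false: never repair automatically.
        return False
    if num_errors == 0:
        # Nothing to repair on a clean scrub.
        return False
    # Too many errors suggests something worse than bit rot, so leave
    # the PG for an operator to inspect instead of repairing it.
    return num_errors <= max_errors
```

With these assumptions, should_auto_repair(2, True) qualifies for repair,
while should_auto_repair(6, True) does not.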

But there's a small operational issue with auto repair at the moment:
with this option enabled, a PG without any scrub errors will
occasionally get the repair flag set during scrubbing, for some reason,
which triggers a health error.

We've had a quick look at the code and couldn't figure out how the
repair flag gets set in some cases on perfectly healthy PGs. Does it
maybe only get set for a very short time while finishing up the scrub
and that's not always picked up in time?
Anyway, a potential workaround for this would be to maybe remove the
repair state from the conditions for the PG_DAMAGED warning?
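
The workaround suggested above could look roughly like the sketch below. It
assumes PG states are given as "+"-joined strings (for example
"active+clean+scrubbing+deep+repair") and that the damaged states are
inconsistent, snaptrim_error and repair. Both the state set and the function
name are illustrative assumptions, not the monitor's real code.

```python
# Assumed set of states that count as damage in this sketch.
DAMAGED_STATES = frozenset({"inconsistent", "snaptrim_error", "repair"})


def is_pg_damaged(pg_state, ignore_repair=False):
    """Return True if a PG state string would count as damaged.

    Hypothetical helper. With ignore_repair=True the repair state alone
    no longer counts, which is the proposed workaround.
    """
    flags = set(pg_state.split("+"))
    damaged = DAMAGED_STATES - {"repair"} if ignore_repair else DAMAGED_STATES
    return bool(flags & damaged)
```

In this sketch a healthy PG that briefly carries the repair flag is no longer
reported when ignore_repair=True, while a PG in the inconsistent state still
is.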

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Fri, 16 Nov 2018 at 08:49, Mark Schouten <mark@xxxxxxxx> wrote:
>
>
> Which, as a user, is very surprising to me too.
> --
>
> Mark Schouten  | Tuxis Internet Engineering
> KvK: 61527076  | http://www.tuxis.nl/
> T: 0318 200208 | info@xxxxxxxx
>
>
>
>
> ----- Original Message -----
>
>
> From: Wido den Hollander (wido@xxxxxxxx)
> Date: 16-11-2018 08:25
> To: Mark Schouten (mark@xxxxxxxx)
> Cc: Ceph Users (ceph-users@xxxxxxxx)
> Subject: Re:  PG auto repair with BlueStore
>
>
> On 11/15/18 7:45 PM, Mark Schouten wrote:
> > As a user, I’m very surprised that this isn’t a default setting.
> >
>
> That is because you can also have FileStore OSDs in a cluster, on which
> such an auto-repair is not safe.
>
> Wido
>
> > Mark Schouten
> >
> >> On 15 Nov 2018 at 18:40, Wido den Hollander <wido@xxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> This question is actually still outstanding. Is there any good reason to
> >> keep auto repair for scrub errors disabled with BlueStore?
> >>
> >> I couldn't think of a reason when using size=3 and min_size=2, so just
> >> wondering.
> >>
> >> Thanks!
> >>
> >> Wido
> >>
> >>> On 8/24/18 8:55 AM, Wido den Hollander wrote:
> >>> Hi,
> >>>
> >>> osd_scrub_auto_repair still defaults to false and I was wondering how we
> >>> think about enabling this feature by default.
> >>>
> >>> Would we say it's safe to enable this with BlueStore?
> >>>
> >>> Wido
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> ceph-users@xxxxxxxxxxxxxx
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>
>
>
>