Re: Refuse OSD removal if still up or acting for PG

> On 23 March 2016 at 23:18, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> 
> 
> On Wed, 23 Mar 2016, wido@xxxxxxxx wrote:
> > Hi,
> > 
> > This week I got a call to recover a Ceph cluster where somebody ran 
> > 'ceph osd rm X' for OSDs which were still holding PGs.
> > 
> > He removed multiple OSDs, and together they held all the replicas for a 
> > certain PG.
> > 
> > This raised the question: Should we refuse an rm for an OSD which is still 
> > up or acting for a PG?
> > 
> > If not, what would the use-case be for removing an OSD from the OSDMap 
> > when it is still up or acting?
> > 
> > I would say that recovery/backfill has to be finished before we allow an 
> > OSD to be removed.
> 
> This seems reasonable, as long as there is a --yes-i-really-mean-it flag 
> to force it.
> 
> There are several options, though:
> 
> 1- OSD must not be up.  Probably doesn't protect from much.
> 2- OSD must not be in the up set for any OSD.  This will prevent you from 
> removing just one replica of a PG.

I think you mean: "OSD must not be in the up set for any PG" ?

> 3- OSD must not be the only up OSD (or, must not bring up set to < 
> min_size).
> 
> None of these really tells you which OSDs the PG is stored on, though. 

Sure, but it protects you from doing stupid things, like what happened in this
case. I had to use ceph-objectstore-tool to recover the PG and inject it again.

> The mon doesn't actually know that--only the primary does.  Either we can 
> try to cram that info into pg_stat_t, or we can accept that we can't make 
> a precise condition and instead just settle on something simple.  Like 1 & 
> 2?

Doesn't the mon know? The PG map contains almost all the information. An
iteration through the PG map where you check if the OSD is in the "up" set for
any of the PGs would be enough, right?
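
To make the idea concrete, here is a rough sketch in Python rather than the
mon's C++. The dict layout and the function names are made up for
illustration; they are not the real PG map structures or any existing Ceph
API. It only shows the shape of the check: walk the PGs, refuse the removal if
the OSD is still in an up or acting set, and let a force flag override that.

```python
# Hypothetical stand-in for the PG map: pgid -> {"up": [...], "acting": [...]}.
# The real mon structures look different; this only illustrates the check.
def pgs_referencing_osd(pg_map, osd_id):
    """Return the sorted pgids whose up or acting set contains osd_id."""
    return sorted(
        pgid
        for pgid, sets in pg_map.items()
        if osd_id in sets["up"] or osd_id in sets["acting"]
    )


def may_remove_osd(pg_map, osd_id, force=False):
    """Refuse removal while any PG still maps to the OSD, unless forced."""
    return force or not pgs_referencing_osd(pg_map, osd_id)


# Example data, invented for this sketch.
pg_map = {
    "1.0": {"up": [0, 1], "acting": [0, 1]},
    "1.1": {"up": [1, 2], "acting": [1, 2]},
    "1.2": {"up": [2, 3], "acting": [2, 0]},
}
```

With this data, osd 0 would be refused because PGs 1.0 and 1.2 still
reference it, while an OSD that appears in no up or acting set could be
removed without the force flag.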

In this case the replication level was set to 2 where the user thought it was
set to 3. So he thought it was safe to remove 2 machines at once and remove the
OSDs as well. Again, user error, but let's protect users from destroying their
data as much as possible.
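
For the min_size variant (option 3), a similar sketch, again in Python with
invented names and data: given the up set per PG, check which PGs would drop
below min_size if a batch of OSDs were removed at once. As Sage points out,
the up set does not tell you where the data is actually stored, so this is a
guard rail under that assumption, not a precise condition.

```python
def pgs_below_min_size(pg_up_sets, removed_osds, min_size):
    """Return sorted pgids whose up set would shrink below min_size.

    pg_up_sets is a hypothetical mapping of pgid -> list of OSD ids in the
    up set; it is not a real Ceph data structure.
    """
    removed = set(removed_osds)
    return sorted(
        pgid
        for pgid, up in pg_up_sets.items()
        if len([osd for osd in up if osd not in removed]) < min_size
    )


# Pool with replication level 2: every PG has two OSDs in its up set.
pg_up_sets = {"1.0": [0, 1], "1.1": [1, 2], "1.2": [2, 3]}
```

Here removing osd 0 and osd 1 together leaves PG 1.0 with no up OSD at all,
which is the situation described above, while removing only osd 0 keeps every
PG at one up OSD or more.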

Wido

> 
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


