On Fri, Jul 28, 2017 at 9:42 PM, Peter Maloney
<peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:
> Hello Dan,
>
> Based on what I know and what people told me on IRC, this means basically
> the condition that the OSD is neither up nor acting for any PG. One person
> (fusl on IRC) said there was an unfound-objects bug when he had size = 1;
> he also said that if reweight (and I assume crush weight) is 0 it will
> surely be safe, but possibly it won't be otherwise.
>
> So here I took my bc-ceph-reweight-by-utilization.py script, which already
> parses `ceph pg dump --format=json` (for up, acting, bytes, and PG counts)
> and `ceph osd df --format=json` (for weight and reweight), gutted the
> unneeded parts, and changed the report to show the condition I described
> as True or False per OSD. The ceph auth therefore needs to allow `ceph pg
> dump` and `ceph osd df`. The script is attached.
>
> The script doesn't assume you're OK with acting lower than size, or that
> you care about min_size; it just assumes you want the OSD completely
> empty.

Thanks for this script. In fact, I am trying to use the min_size/size-based
removal heuristics. If we were able to wait until an OSD is completely
empty, then I suppose I could just set the crush weight to 0 and wait for
HEALTH_OK. For our procedures I'm trying to shortcut this with an earlier
device removal.

Cheers, Dan

> Sample output:
>
> Real cluster:
>
> root@cephtest:~ # ./bc-ceph-empty-osds.py -a
> osd_id  weight reweight pgs_old     bytes_old pgs_new     bytes_new empty
>      0 4.00099  0.61998      38 1221853911536      38 1221853911536 False
>      1 4.00099  0.59834      43 1168531341347      43 1168531341347 False
>      2 4.00099  0.79213      44 1155260814435      44 1155260814435 False
>     27 4.00099  0.69459      39 1210145117377      39 1210145117377 False
>     30 6.00099  0.73933      56 1691992924542      56 1691992924542 False
>     31 6.00099  0.81180      64 1810503842054      64 1810503842054 False
> ...
>
> Test cluster with some -nan and 0's in the crush map:
>
> root@tceph1:~ # ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE VAR  PGS
>  4 1.00000        0      0      0      0 -nan -nan   0
>  1 0.06439  1.00000 61409M 98860k 61313M 0.16 0.93  47
>  0 0.06438  1.00000 61409M   134M 61275M 0.22 1.29  59
>  2 0.06439  1.00000 61409M 82300k 61329M 0.13 0.77  46
>  3       0        0      0      0      0 -nan -nan   0
>               TOTAL 179G   311M   179G   0.17
> MIN/MAX VAR: 0.77/1.29  STDDEV: 0.04
>
> root@tceph1:~ # ./bc-ceph-empty-osds.py
> osd_id  weight reweight pgs_old bytes_old pgs_new bytes_new empty
>      3 0.00000  0.00000       0         0       0         0 True
>      4 1.00000  0.00000       0         0       0         0 True
> root@tceph1:~ # ./bc-ceph-empty-osds.py -a
> osd_id  weight reweight pgs_old bytes_old pgs_new bytes_new empty
>      0 0.06438  1.00000      59  46006167      59  46006167 False
>      1 0.06439  1.00000      47  28792306      47  28792306 False
>      2 0.06439  1.00000      46  17623485      46  17623485 False
>      3 0.00000  0.00000       0         0       0         0 True
>      4 1.00000  0.00000       0         0       0         0 True
>
> The "old" vs. "new" suffixes refer to where the data sits now and where it
> will sit after recovery completes, respectively. (That is the magic that
> made my reweight script efficient compared to the official reweight
> script.)
>
> I have not used such a method in the past myself... my cluster is small,
> so I have always just let recovery finish completely instead. I hope you
> find it useful and that it develops from there.
>
> Peter
>
>
> On 07/28/17 15:36, Dan van der Ster wrote:
>
> Hi all,
>
> We are trying to outsource the disk replacement process for our ceph
> clusters to some non-expert sysadmins.
> We could really use a tool that reports whether a Ceph OSD *would* or
> *would not* be safe to stop, e.g.
>
> # ceph-osd-safe-to-stop osd.X
> Yes, it would be OK to stop osd.X
>
> (which of course means that no PGs would go inactive if osd.X were to
> be stopped).
>
> Does anyone have such a script that they'd like to share?
>
> Thanks!
> Dan
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
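For reference, both conditions discussed in the thread can be sketched
against the pg_stats array that `ceph pg dump --format=json` returns. The
sketch below is not Peter's attached script; the key names ("up",
"acting") follow that JSON layout but vary between Ceph releases, and the
single pool-wide min_size parameter is a simplifying assumption, so treat
this as an illustration rather than a working tool:

```python
#!/usr/bin/env python3
"""Illustrative checks for the two conditions in the thread:
 - Peter's emptiness condition: the OSD is neither up nor acting
   for any PG, so removing it moves no data.
 - Dan's safe-to-stop condition: no PG would go inactive, i.e.
   every PG the OSD serves keeps at least min_size acting members."""

import json


def osd_is_empty(pg_stats, osd_id):
    """True if osd_id appears in neither 'up' nor 'acting' of any PG."""
    return not any(osd_id in pg["up"] or osd_id in pg["acting"]
                   for pg in pg_stats)


def safe_to_stop(pg_stats, osd_id, min_size):
    """True if stopping osd_id leaves every PG it serves with at
    least min_size acting replicas (assumes one pool-wide min_size)."""
    for pg in pg_stats:
        remaining = [o for o in pg["acting"] if o != osd_id]
        if osd_id in pg["acting"] and len(remaining) < min_size:
            return False
    return True


# Hypothetical data in the shape of the pg_stats array:
pg_stats = json.loads("""[
    {"pgid": "1.0", "up": [0, 1], "acting": [0, 1]},
    {"pgid": "1.1", "up": [1, 2], "acting": [1, 2]}
]""")

print(osd_is_empty(pg_stats, 3))     # osd.3 holds no PGs -> True
print(safe_to_stop(pg_stats, 1, 1))  # each PG keeps one replica -> True
print(safe_to_stop(pg_stats, 1, 2))  # PGs would drop below min_size -> False
```

The emptiness check is the stricter of the two: it only passes once
recovery has fully drained the OSD, whereas the safe-to-stop check passes
as soon as stopping the OSD would leave all of its PGs active, which is
the earlier shortcut Dan describes.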