Re: ceph osd safe to remove

On 08/03/17 11:05, Dan van der Ster wrote:
On Fri, Jul 28, 2017 at 9:42 PM, Peter Maloney
<peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:
Hello Dan,

Based on what I know and what people told me on IRC, this basically means
the condition that the OSD is neither up nor acting for any PG. One person
(fusl on IRC) said he hit an unfound-objects bug when he had size = 1, and
he also said that if reweight (and I assume crush weight) is 0 it will
surely be safe, but possibly it won't be otherwise.

So I took my bc-ceph-reweight-by-utilization.py script, which already
parses `ceph pg dump --format=json` (for up, acting, bytes, and PG counts)
and `ceph osd df --format=json` (for weight and reweight), gutted out the
unneeded parts, and changed the report to show the condition I described as
True or False per OSD. The ceph auth therefore needs to allow `ceph pg dump`
and `ceph osd df`. The script is attached.

The script doesn't assume you're ok with acting lower than size, or care
about min_size, and just assumes you want the OSD completely empty.
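
For reference, here is a stripped-down sketch of that check: parse
`ceph pg dump --format=json` and report an OSD as safe once it appears in
no PG's up or acting set. The key lookups are assumptions about the JSON
layout (it differs between Ceph releases), not code taken from the
attached script:

#!/usr/bin/env python3
# Sketch of the "safe to remove" check: an OSD is considered drained when it
# appears in neither the "up" nor the "acting" set of any PG.
# Assumption: pg_stats sits either at the top level of the JSON or nested
# under "pg_map", depending on the Ceph release.
import json
import subprocess

def osds_in_use():
    out = subprocess.check_output(["ceph", "pg", "dump", "--format=json"])
    data = json.loads(out)
    stats = data.get("pg_stats") or data.get("pg_map", {}).get("pg_stats", [])
    busy = set()
    for pg in stats:
        busy.update(pg.get("up", []))
        busy.update(pg.get("acting", []))
    return busy

if __name__ == "__main__":
    osd_id = 12  # hypothetical example id
    print("osd.%d safe to remove: %s" % (osd_id, osd_id not in osds_in_use()))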
Thanks for this script. In fact, I am trying to use the
min_size/size-based removal heuristics. If we were able to wait until an
OSD is completely empty, then I suppose we could just set the crush weight
to 0 and wait for HEALTH_OK. For our procedures I'm trying to shortcut this
with an earlier device removal.

Cheers, Dan
Well, what this is intended for is that you can set some OSDs' weight to 0, then later set others to 0, and so on, and before all of them are finished you can already remove the ones the script identifies as drained (no PGs are on that disk, even if other PGs are still being moved onto other disks). So it's a shortcut, but one gained through knowledge, not by sacrificing redundancy.
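
As a rough illustration of that staged workflow (the OSD ids are
hypothetical, the reweight call is the standard `ceph osd crush reweight`
CLI command, and the drain check is the same sketch as above):

#!/usr/bin/env python3
# Rough sketch: drain a batch of OSDs by setting their crush weight to 0,
# then report each one as removable as soon as it no longer appears in any
# PG's up/acting set, without waiting for the whole batch to finish.
import json
import subprocess
import time

OSDS_TO_DRAIN = [12, 13, 14]  # hypothetical example ids

def osds_in_use():
    # Same check as in the earlier sketch; the JSON key names are assumptions.
    data = json.loads(subprocess.check_output(
        ["ceph", "pg", "dump", "--format=json"]))
    stats = data.get("pg_stats") or data.get("pg_map", {}).get("pg_stats", [])
    return {o for pg in stats for o in pg.get("up", []) + pg.get("acting", [])}

for osd_id in OSDS_TO_DRAIN:
    # Standard CLI call; requires the matching ceph auth caps.
    subprocess.check_call(["ceph", "osd", "crush", "reweight",
                           "osd.%d" % osd_id, "0"])

remaining = set(OSDS_TO_DRAIN)
while remaining:
    for osd_id in sorted(remaining - osds_in_use()):
        print("osd.%d holds no PGs; it can be removed now" % osd_id)
        remaining.discard(osd_id)
    if remaining:
        time.sleep(60)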

And I wasn't sure what you preferred... I definitely prefer to have my full size achieved, not just min_size, before I remove anything. It's like how you don't run RAID5 on large disks but use RAID6 instead, and only replace one disk at a time so you still have redundancy.

What do you use such that keeping redundancy isn't important?
