On 11/22/18 6:12 PM, Marco Gaiarin wrote:
> Hello, Paweł Sadowski! On that day you wrote...
>
>> From your osd tree it looks like you used 'ceph osd reweight'.
>
> Yes, and I supposed I was doing the right thing! Now I've tried to
> lower the weight of the OSD to be decommissioned, using:
>
>   ceph osd reweight 2 0.95
>
> leading to an OSD tree like:
>
>   root@blackpanther:~# ceph osd tree
>   ID WEIGHT   TYPE NAME                UP/DOWN REWEIGHT PRIMARY-AFFINITY
>   -1 21.83984 root default
>   -2  5.45996     host capitanamerica
>    0  1.81999         osd.0                 up  1.00000          1.00000
>    1  1.81999         osd.1                 up  1.00000          1.00000
>   10  0.90999         osd.10                up  1.00000          1.00000
>   11  0.90999         osd.11                up  1.00000          1.00000
>   -3  5.45996     host vedovanera
>    2  1.81999         osd.2                 up  0.95000          1.00000
>    3  1.81999         osd.3                 up  1.00000          1.00000
>    4  0.90999         osd.4                 up  1.00000          1.00000
>    5  0.90999         osd.5                 up  1.00000          1.00000
>   -4  5.45996     host deadpool
>    6  1.81999         osd.6                 up  1.00000          1.00000
>    7  1.81999         osd.7                 up  1.00000          1.00000
>    8  0.90999         osd.8                 up  1.00000          1.00000
>    9  0.90999         osd.9                 up  1.00000          1.00000
>   -5  5.45996     host blackpanther
>   12  1.81999         osd.12                up  0.04999          1.00000
>   13  1.81999         osd.13                up  0.04999          1.00000
>   14  0.90999         osd.14                up  0.04999          1.00000
>   15  0.90999         osd.15                up  0.04999          1.00000
>
> and, after rebalancing, to:
>
>   root@blackpanther:~# ceph -s
>       cluster 8794c124-c2ec-4e81-8631-742992159bd6
>        health HEALTH_WARN
>               6 pgs stuck unclean
>               recovery 4/2550363 objects degraded (0.000%)
>               recovery 11282/2550363 objects misplaced (0.442%)
>        monmap e6: 6 mons at {0=10.27.251.7:6789/0,1=10.27.251.8:6789/0,2=10.27.251.11:6789/0,3=10.27.251.12:6789/0,4=10.27.251.9:6789/0,blackpanther=10.27.251.2:6789/0}
>               election epoch 2750, quorum 0,1,2,3,4,5 blackpanther,0,1,4,2,3
>        osdmap e7300: 16 osds: 16 up, 16 in; 6 remapped pgs
>         pgmap v54737590: 768 pgs, 3 pools, 3299 GB data, 830 kobjects
>               9870 GB used, 12474 GB / 22344 GB avail
>               4/2550363 objects degraded (0.000%)
>               11282/2550363 objects misplaced (0.442%)
>                    761 active+clean
>                      6 active+remapped
>                      1 active+clean+scrubbing
>     client io 13476 B/s rd, 654 kB/s wr, 95 op/s
>
> Why are there PGs in the 'stuck unclean' state?

This is most probably due to the big difference in weights between your
hosts (the new one has a 20x lower weight than the old ones), which in
combination with the straw algorithm is a 'known' issue. You could try
to increase choose_total_tries in your crush map from 50 to some bigger
number. The best option IMO would be to switch to straw2 (which will
cause some rebalancing) and then use 'ceph osd crush reweight' (instead
of 'ceph osd reweight') in small steps to slowly rebalance data onto
the new OSDs.

--
PS
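For reference, the choose_total_tries change can be made by
round-tripping the CRUSH map through crushtool. A minimal sketch, run
from a node with admin access; the file names and the value 100 are
only illustrative, not from this thread:

  # dump and decompile the current CRUSH map
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt

  # edit crushmap.txt and raise the tunable in the 'tunables' section:
  #   tunable choose_total_tries 100    # was 50; 100 is an arbitrary example

  # recompile and inject the modified map
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new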
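The straw-to-straw2 switch can be done in the same decompiled map,
assuming every client and daemon in the cluster is recent enough to
understand straw2 (Hammer or later). Expect some data movement as soon
as the new map is injected:

  # in crushmap.txt, change the algorithm line of each bucket:
  #   alg straw    ->    alg straw2
  # then recompile and inject as above
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new

(Newer Ceph releases also offer
'ceph osd crush set-all-straw-buckets-to-straw2' to do this in a single
command.)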
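As for the small-step rebalance: 'ceph osd reweight' only sets a 0..1
override applied on top of CRUSH placement, while 'ceph osd crush
reweight' changes the weight CRUSH itself uses, which is why the latter
is the right tool when permanently adding capacity. A sketch of the
gradual approach for one of the new OSDs; the starting weight, step
sizes, and pauses are illustrative, and 1.81999 is the final target
taken from the tree above:

  # start the new OSD with a small CRUSH weight and a normal override
  ceph osd crush reweight osd.12 0.1
  ceph osd reweight 12 1.0

  # raise the CRUSH weight in small increments, waiting for the
  # cluster to return to HEALTH_OK between steps
  ceph osd crush reweight osd.12 0.5
  ceph osd crush reweight osd.12 1.0
  ceph osd crush reweight osd.12 1.81999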
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com