Re: New OSD with weight 0, rebalance still happen...

On 11/22/18 6:12 PM, Marco Gaiarin wrote:
Greetings, Paweł Sadowsk! On that day you wrote...

From your osd tree it looks like you used 'ceph osd reweight'.
Yes, and I assumed I was doing the right thing!

Now, I've tried to lower the weight of the OSD to be decommissioned, using:
	ceph osd reweight 2 0.95

leading to an OSD tree like:

 root@blackpanther:~# ceph osd tree
 ID WEIGHT   TYPE NAME               UP/DOWN REWEIGHT PRIMARY-AFFINITY 
 -1 21.83984 root default                                              
 -2  5.45996     host capitanamerica                                   
  0  1.81999         osd.0                up  1.00000          1.00000 
  1  1.81999         osd.1                up  1.00000          1.00000 
 10  0.90999         osd.10               up  1.00000          1.00000 
 11  0.90999         osd.11               up  1.00000          1.00000 
 -3  5.45996     host vedovanera                                       
  2  1.81999         osd.2                up  0.95000          1.00000 
  3  1.81999         osd.3                up  1.00000          1.00000 
  4  0.90999         osd.4                up  1.00000          1.00000 
  5  0.90999         osd.5                up  1.00000          1.00000 
 -4  5.45996     host deadpool                                         
  6  1.81999         osd.6                up  1.00000          1.00000 
  7  1.81999         osd.7                up  1.00000          1.00000 
  8  0.90999         osd.8                up  1.00000          1.00000 
  9  0.90999         osd.9                up  1.00000          1.00000 
 -5  5.45996     host blackpanther                                     
 12  1.81999         osd.12               up  0.04999          1.00000 
 13  1.81999         osd.13               up  0.04999          1.00000 
 14  0.90999         osd.14               up  0.04999          1.00000 
 15  0.90999         osd.15               up  0.04999          1.00000 

and, after rebalancing, to this cluster status:

 root@blackpanther:~# ceph -s
    cluster 8794c124-c2ec-4e81-8631-742992159bd6
     health HEALTH_WARN
            6 pgs stuck unclean
            recovery 4/2550363 objects degraded (0.000%)
            recovery 11282/2550363 objects misplaced (0.442%)
     monmap e6: 6 mons at {0=10.27.251.7:6789/0,1=10.27.251.8:6789/0,2=10.27.251.11:6789/0,3=10.27.251.12:6789/0,4=10.27.251.9:6789/0,blackpanther=10.27.251.2:6789/0}
            election epoch 2750, quorum 0,1,2,3,4,5 blackpanther,0,1,4,2,3
     osdmap e7300: 16 osds: 16 up, 16 in; 6 remapped pgs
      pgmap v54737590: 768 pgs, 3 pools, 3299 GB data, 830 kobjects
            9870 GB used, 12474 GB / 22344 GB avail
            4/2550363 objects degraded (0.000%)
            11282/2550363 objects misplaced (0.442%)
                 761 active+clean
                   6 active+remapped
                   1 active+clean+scrubbing
  client io 13476 B/s rd, 654 kB/s wr, 95 op/s

Why are there PGs in the 'stuck unclean' state?

This is most probably due to the big difference in weights between your hosts (the new one has a ~20x lower weight than the old ones), which, in combination with the straw algorithm, is a 'known' issue. You could try increasing choose_total_tries in your CRUSH map from the default 50 to a bigger number. The best option, IMO, would be to switch to straw2 buckets (which will cause some rebalancing) and then use 'ceph osd crush reweight' (instead of 'ceph osd reweight') in small steps to slowly rebalance data onto the new OSDs.
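
Roughly, that could look like the sketch below (the filenames, the example OSD id, and the weight step are just placeholders, adjust them for your cluster):

    # dump and decompile the current CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # in crushmap.txt: raise "tunable choose_total_tries 50" to e.g. 100,
    # and/or change "alg straw" to "alg straw2" in the bucket definitions

    # recompile and inject the edited map
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new

    # then grow the CRUSH weight of a new OSD in small steps, e.g.
    ceph osd crush reweight osd.12 0.2
    # wait for recovery to finish, then repeat with a higher weight
    # until the target CRUSH weight is reached

Injecting a straw2 map and each crush reweight step will move some data, so it's worth waiting for the cluster to settle between steps.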

-- 
PS
