Thank you for your reply, but after I changed the pool's min_size to 4, the PGs were still unable to recover.

ceph -s
  cluster:
    id:     5e527773-9873-4100-bcce-19a1eaf6e496
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum a
    mgr: x(active)
    osd: 12 osds: 9 up, 9 in

  data:
    pools:   1 pools, 32 pgs
    objects: 0 objects, 0 bytes
    usage:   9238 MB used, 82921 MB / 92160 MB avail
    pgs:     26 active+undersized
             6  active+clean

ceph osd pool ls detail
pool 1 'ec' erasure size 6 min_size 4 origin_min_size 0 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 last_change 79 flags hashpspool stripe_width 16384

ceph osd tree
ID  CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
 -1       12.00000 root default
 -5        3.00000     host host0
  0   ssd  1.00000         osd.0    down        0 1.00000
  1   ssd  1.00000         osd.1    down        0 1.00000
  2   ssd  1.00000         osd.2    down        0 1.00000
 -7        3.00000     host host1
  3   ssd  1.00000         osd.3      up  1.00000 1.00000
  4   ssd  1.00000         osd.4      up  1.00000 1.00000
  5   ssd  1.00000         osd.5      up  1.00000 1.00000
 -9        3.00000     host host2
  6   ssd  1.00000         osd.6      up  1.00000 1.00000
  7   ssd  1.00000         osd.7      up  1.00000 1.00000
  8   ssd  1.00000         osd.8      up  1.00000 1.00000
-11        3.00000     host host3
  9   ssd  1.00000         osd.9      up  1.00000 1.00000
 10   ssd  1.00000         osd.10     up  1.00000 1.00000
 11   ssd  1.00000         osd.11     up  1.00000 1.00000

--------------
ningt0509@xxxxxxxxx

>
>On 24/11/18 09:04, ningt0509@xxxxxxxxx wrote:
>> There are four hosts in the environment, the storage pool uses EC 4+2, and the CRUSH rule is configured to select two OSDs from each host. When I shut down one host, all of its OSDs are marked out, but the PGs cannot return to active+clean. Why can the PGs not be mapped to OSDs on another host? Is there a problem with this configuration?
>>
>> ID  CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
>>  -1       30.00000 root default
>>  -5        7.00000     host host0
>>   0   ssd  1.00000         osd.0    down        0 1.00000
>>   1   ssd  1.00000         osd.1    down        0 1.00000
>>   2   ssd  1.00000         osd.2    down        0 1.00000
>>   3   ssd  1.00000         osd.3    down        0 1.00000
>>   4   ssd  1.00000         osd.4    down        0 1.00000
>>   5   ssd  1.00000         osd.5    down        0 1.00000
>>   6   ssd  1.00000         osd.6    down        0 1.00000
>>  -7        7.00000     host host1
>>   7   ssd  1.00000         osd.7      up  1.00000 1.00000
>>   8   ssd  1.00000         osd.8      up  1.00000 1.00000
>>   9   ssd  1.00000         osd.9      up  1.00000 1.00000
>>  10   ssd  1.00000         osd.10     up  1.00000 1.00000
>>  11   ssd  1.00000         osd.11     up  1.00000 1.00000
>>  12   ssd  1.00000         osd.12     up  1.00000 1.00000
>>  13   ssd  1.00000         osd.13     up  1.00000 1.00000
>>  -9        8.00000     host host2
>>  14   ssd  1.00000         osd.14     up  1.00000 1.00000
>>  15   ssd  1.00000         osd.15     up  1.00000 1.00000
>>  16   ssd  1.00000         osd.16     up  1.00000 1.00000
>>  17   ssd  1.00000         osd.17     up  1.00000 1.00000
>>  18   ssd  1.00000         osd.18     up  1.00000 1.00000
>>  19   ssd  1.00000         osd.19     up  1.00000 1.00000
>>  20   ssd  1.00000         osd.20     up  1.00000 1.00000
>>  21   ssd  1.00000         osd.21     up  1.00000 1.00000
>> -11        8.00000     host host3
>>  29        1.00000         osd.29     up  1.00000 1.00000
>>  22   ssd  1.00000         osd.22     up  1.00000 1.00000
>>  23   ssd  1.00000         osd.23     up  1.00000 1.00000
>>  24   ssd  1.00000         osd.24     up  1.00000 1.00000
>>  25   ssd  1.00000         osd.25     up  1.00000 1.00000
>>  26   ssd  1.00000         osd.26     up  1.00000 1.00000
>>  27   ssd  1.00000         osd.27     up  1.00000 1.00000
>>  28   ssd  1.00000         osd.28     up  1.00000 1.00000
>>
>>   cluster:
>>     id:     d24174ae-a1bf-43f9-a8f3-a10246988ab7
>>     health: HEALTH_WARN
>>             Reduced data availability: 413 pgs inactive
>>             Degraded data redundancy: 414 pgs undersized
>>
>>   services:
>>     mon: 1 daemons, quorum a
>>     mgr: x(active)
>>     osd: 30 osds: 23 up, 23 in; 3 remapped pgs
>>
>>   data:
>>     pools:   1 pools, 512 pgs
>>     objects: 0 objects, 0 bytes
>>     usage:   24026 MB used, 206 GB / 230 GB avail
>>     pgs:     80.664% pgs not active
>>              413 undersized+peered
>>              96  active+clean
>>              2   active+clean+remapped
>>              1   active+undersized+remapped
>>
>> The Ceph environment configuration is as follows:
>>
>> Crush rule:
>> rule ec_4_2 {
>>         id 1
>>         type erasure
>>         min_size 3
>>         max_size 6
>>         step set_chooseleaf_tries 5
>>         step set_choose_tries 400
>>         step take default
>>         step choose indep 0 type host
>>         step chooseleaf indep 2 type osd
>>         step emit
>> }
>>
>> Pool:
>> pool 1 'ec_4_2' erasure size 6 min_size 5 origin_min_size 0 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 last_change 94 flags hashpspool stripe_width 16384
>>
>> --------------
>> ningt0509@xxxxxxxxx
>
>Try temporarily setting your pool min_size to 4 rather than 5 to kick-start the recovery.
>I believe this is a feature/bug: EC pools require min_size chunks to be available before starting recovery, rather than just k chunks.
>
>Maged
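
For what it's worth, the arithmetic behind Maged's suggestion can be sketched as follows. This is a minimal illustration using the numbers from this thread (k=4, m=2, two chunks per host), not Ceph code:

```python
# EC profile from the thread: k = 4 data chunks + m = 2 coding chunks.
k, m = 4, 2
size = k + m                  # 6 chunks per PG
chunks_per_host = 2           # the CRUSH rule places 2 chunks on each chosen host

# Only 3 of the 4 hosts hold chunks of any given PG (6 chunks / 2 per host).
hosts_used = size // chunks_per_host

# Worst case: the failed host held 2 chunks of this PG.
surviving_chunks = size - chunks_per_host   # 4 chunks left

# Reads need only k chunks, so the data itself is still recoverable:
assert surviving_chunks >= k

# But a PG only goes (and stays) active when at least min_size chunks are up.
print(surviving_chunks >= 5)  # min_size 5 -> False: PG stuck undersized+peered
print(surviving_chunks >= 4)  # min_size 4 -> True:  PG can go active again
```

Note that even with min_size lowered, whether the missing chunks can actually be remapped depends on the CRUSH rule being able to find enough distinct hosts/OSDs for all 6 chunks, which is a separate question from min_size.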