On 24/11/18 09:04, ningt0509@xxxxxxxxx wrote:
There are four hosts in the environment, the storage pool uses EC 4+2, and the CRUSH rule is configured to select two OSDs from each host. When I shut down one host, all of its OSDs are marked down and out, but the PGs cannot return to active+clean. Why can't the PGs be mapped to OSDs on the other hosts? Is there a problem with this setup?
ID  CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
 -1       30.00000 root default
 -5        7.00000     host host0
  0   ssd  1.00000         osd.0    down        0 1.00000
  1   ssd  1.00000         osd.1    down        0 1.00000
  2   ssd  1.00000         osd.2    down        0 1.00000
  3   ssd  1.00000         osd.3    down        0 1.00000
  4   ssd  1.00000         osd.4    down        0 1.00000
  5   ssd  1.00000         osd.5    down        0 1.00000
  6   ssd  1.00000         osd.6    down        0 1.00000
 -7        7.00000     host host1
  7   ssd  1.00000         osd.7      up  1.00000 1.00000
  8   ssd  1.00000         osd.8      up  1.00000 1.00000
  9   ssd  1.00000         osd.9      up  1.00000 1.00000
 10   ssd  1.00000         osd.10     up  1.00000 1.00000
 11   ssd  1.00000         osd.11     up  1.00000 1.00000
 12   ssd  1.00000         osd.12     up  1.00000 1.00000
 13   ssd  1.00000         osd.13     up  1.00000 1.00000
 -9        8.00000     host host2
 14   ssd  1.00000         osd.14     up  1.00000 1.00000
 15   ssd  1.00000         osd.15     up  1.00000 1.00000
 16   ssd  1.00000         osd.16     up  1.00000 1.00000
 17   ssd  1.00000         osd.17     up  1.00000 1.00000
 18   ssd  1.00000         osd.18     up  1.00000 1.00000
 19   ssd  1.00000         osd.19     up  1.00000 1.00000
 20   ssd  1.00000         osd.20     up  1.00000 1.00000
 21   ssd  1.00000         osd.21     up  1.00000 1.00000
-11        8.00000     host host3
 29        1.00000         osd.29     up  1.00000 1.00000
 22   ssd  1.00000         osd.22     up  1.00000 1.00000
 23   ssd  1.00000         osd.23     up  1.00000 1.00000
 24   ssd  1.00000         osd.24     up  1.00000 1.00000
 25   ssd  1.00000         osd.25     up  1.00000 1.00000
 26   ssd  1.00000         osd.26     up  1.00000 1.00000
 27   ssd  1.00000         osd.27     up  1.00000 1.00000
 28   ssd  1.00000         osd.28     up  1.00000 1.00000
  cluster:
    id:     d24174ae-a1bf-43f9-a8f3-a10246988ab7
    health: HEALTH_WARN
            Reduced data availability: 413 pgs inactive
            Degraded data redundancy: 414 pgs undersized

  services:
    mon: 1 daemons, quorum a
    mgr: x(active)
    osd: 30 osds: 23 up, 23 in; 3 remapped pgs

  data:
    pools:   1 pools, 512 pgs
    objects: 0 objects, 0 bytes
    usage:   24026 MB used, 206 GB / 230 GB avail
    pgs:     80.664% pgs not active
             413 undersized+peered
             96  active+clean
             2   active+clean+remapped
             1   active+undersized+remapped
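For reference, a couple of commands that can show why these PGs are stuck (the PG id below is only an example; pick one from the dump_stuck output):

    ceph pg dump_stuck inactive    # list PGs stuck in a non-active state
    ceph pg 1.0 query              # show up/acting sets and peering state for one PG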
The Ceph environment configuration is as follows:
CRUSH rule:

rule ec_4_2 {
        id 1
        type erasure
        min_size 3
        max_size 6
        step set_chooseleaf_tries 5
        step set_choose_tries 400
        step take default
        step choose indep 0 type host
        step chooseleaf indep 2 type osd
        step emit
}
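One way to check what this rule actually maps to, including after the failed host's OSDs are weighted out, is to test the compiled CRUSH map offline. The file name below is just an example; rule id 1 matches the rule above:

    ceph osd getcrushmap -o crushmap.bin
    # show the OSDs chosen for a range of sample inputs with 6 chunks per PG
    crushtool -i crushmap.bin --test --rule 1 --num-rep 6 --show-mappings
    # simulate the host0 outage by zero-weighting its OSDs for the test run,
    # and report any mappings that come back with fewer than 6 OSDs
    crushtool -i crushmap.bin --test --rule 1 --num-rep 6 --show-bad-mappings \
        --weight 0 0 --weight 1 0 --weight 2 0 --weight 3 0 \
        --weight 4 0 --weight 5 0 --weight 6 0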
Pool:
pool 1 'ec_4_2' erasure size 6 min_size 5 origin_min_size 0 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 last_change 94 flags hashpspool stripe_width 16384
--------------
ningt0509@xxxxxxxxx
Try temporarily setting your pool min_size to 4 rather than 5 to kick
start the recovery.
I believe this is a feature/bug where EC pools require min_size chunks
to be available before recovery starts, rather than just k chunks.
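In practice that would be something like the following, using the pool
name from the dump above (and setting it back once the cluster is
healthy again):

    ceph osd pool set ec_4_2 min_size 4   # allow I/O and recovery with only k=4 chunks available
    ceph osd pool set ec_4_2 min_size 5   # restore the safer k+1 value after recovery completes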
Maged