Re: undersized pgs after removing smaller OSDs

Resolution confirmed!

$ ceph -s
  cluster:
    id:     eea7b78c-b138-40fc-9f3e-3d77afb770f0
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum desktop,mon1,nuc2
    mgr: desktop(active), standbys: mon1
    osd: 3 osds: 3 up, 3 in
 
  data:
    pools:   19 pools, 372 pgs
    objects: 54243 objects, 71722 MB
    usage:   129 GB used, 27812 GB / 27941 GB avail
    pgs:     372 active+clean


On Tue, Jul 18, 2017 at 8:47 PM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
Ah, that was the problem!

So I edited the CRUSH map (http://docs.ceph.com/docs/master/rados/operations/crush-map/), setting a weight of 10.000 for all three 10TB OSD hosts. The instant result was that all the pgs that had only 2 OSDs picked up a third one, and the cluster started rebalancing the data. I trust it will complete in time and I'll be good to go!
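For anyone following along, the edit/recompile cycle from that doc page goes roughly like this (file names here are just placeholders):

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt
  (edit crushmap.txt: set "item osd.N weight 10.000" under each host bucket)
$ crushtool -c crushmap.txt -o crushmap-new.bin
$ ceph osd setcrushmap -i crushmap-new.bin

Since each of these hosts holds a single OSD, I believe a plain "ceph osd crush reweight osd.N 10.0" per OSD would have achieved the same result.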

New OSD tree:
$ ceph osd tree
ID WEIGHT   TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 30.00000 root default                                    
-5 10.00000     host osd1                                   
 3 10.00000         osd.3      up  1.00000          1.00000 
-6 10.00000     host osd2                                   
 4 10.00000         osd.4      up  1.00000          1.00000 
-2 10.00000     host osd3                                   
 0 10.00000         osd.0      up  1.00000          1.00000 

Kudos to Brad Hubbard for steering me in the right direction!


On Tue, Jul 18, 2017 at 8:27 PM Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
ID WEIGHT   TYPE NAME
-5  1.00000     host osd1
-6  9.09560     host osd2
-2  9.09560     host osd3

The weight allocated to host "osd1" should presumably be the same as
the other two hosts?

Dump your crushmap and take a good look at it, specifically the
weighting of "osd1".
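Something like this will give you a readable copy, and crushtool can also show any pgs it fails to map to 3 OSDs (file paths are just examples):

$ ceph osd getcrushmap -o /tmp/cm.bin
$ crushtool -d /tmp/cm.bin -o /tmp/cm.txt
$ crushtool -i /tmp/cm.bin --test --rule 0 --num-rep 3 --show-bad-mappings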


On Wed, Jul 19, 2017 at 11:48 AM, Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
> I also tried ceph pg query, but it gave no helpful recommendations for any
> of the stuck pgs.
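> For example, with one of the pg ids from dump_stuck below:
> $ ceph pg 88.3 query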
>
>
> On Tue, Jul 18, 2017 at 7:45 PM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
>>
>> Problem:
>> I have some pgs with only two OSDs instead of the three that all the other
>> pgs have. This is causing active+undersized+degraded status.
>>
>> History:
>> 1. I started with 3 hosts, each with 1 OSD process (min_size 2) for a 1TB
>> drive.
>> 2. Added 3 more hosts, each with 1 OSD process for a 10TB drive.
>> 3. Removed the original 3 1TB OSD hosts from the osd tree (reweight 0, wait,
>> stop, remove, del osd&host, rm; see the command sketch after this list).
>> 4. After the last OSD was reweighted to 0, the cluster never returned to
>> active+clean; some pgs went undersized instead, but I went on with the
>> removal anyway, leaving me stuck with 5 undersized pgs.
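>>
>> (For reference, the per-OSD removal in step 3 went roughly like the standard
>> sequence below; osd.1 and the host bucket name are examples, and service
>> names will differ by setup:)
>> $ ceph osd crush reweight osd.1 0
>>   (wait for backfill to finish)
>> $ systemctl stop ceph-osd@1
>> $ ceph osd crush remove osd.1
>> $ ceph auth del osd.1
>> $ ceph osd crush remove <old-host-bucket>
>> $ ceph osd rm 1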
>>
>> Things tried that didn't help (rough command forms are sketched after this
>> list):
>> * Gave it time to go away on its own.
>> * Replaced the replicated default.rgw.buckets.data pool with an erasure-coded
>> 2+1 version.
>> * ceph osd lost 1 (and 2)
>> * ceph pg repair (on the pgs from dump_stuck)
>> * Googled 'ceph pg undersized' and similar searches for help.
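>>
>> Rough command forms for the above (the EC profile name is arbitrary; pg ids
>> are from dump_stuck below):
>> $ ceph osd erasure-code-profile set ec-21 k=2 m=1
>> $ ceph osd pool create default.rgw.buckets.data 256 256 erasure ec-21
>> $ ceph osd lost 1 --yes-i-really-mean-it
>> $ ceph pg repair 88.3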
>>
>> Current status:
>> $ ceph osd tree
>> ID WEIGHT   TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 19.19119 root default
>> -5  1.00000     host osd1
>>  3  1.00000         osd.3      up  1.00000          1.00000
>> -6  9.09560     host osd2
>>  4  9.09560         osd.4      up  1.00000          1.00000
>> -2  9.09560     host osd3
>>  0  9.09560         osd.0      up  1.00000          1.00000
>> $ ceph pg dump_stuck
>> ok
>> PG_STAT STATE                      UP    UP_PRIMARY ACTING ACTING_PRIMARY
>> 88.3    active+undersized+degraded [4,0]          4  [4,0]              4
>> 97.3    active+undersized+degraded [4,0]          4  [4,0]              4
>> 85.6    active+undersized+degraded [4,0]          4  [4,0]              4
>> 87.5    active+undersized+degraded [0,4]          0  [0,4]              0
>> 70.0    active+undersized+degraded [0,4]          0  [0,4]              0
>> $ ceph osd pool ls detail
>> pool 70 'default.rgw.rgw.gc' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 548 flags hashpspool
>> stripe_width 0
>> pool 83 'default.rgw.buckets.non-ec' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 576 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 85 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 652 flags hashpspool
>> stripe_width 0
>> pool 86 'default.rgw.data.root' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 653 flags hashpspool
>> stripe_width 0
>> pool 87 'default.rgw.gc' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 654 flags hashpspool
>> stripe_width 0
>> pool 88 'default.rgw.lc' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 600 flags hashpspool
>> stripe_width 0
>> pool 89 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 655 flags hashpspool
>> stripe_width 0
>> pool 90 'default.rgw.users.uid' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 662 flags hashpspool
>> stripe_width 0
>> pool 91 'default.rgw.users.email' replicated size 3 min_size 2 crush_rule
>> 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 660 flags hashpspool
>> stripe_width 0
>> pool 92 'default.rgw.users.keys' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 659 flags hashpspool
>> stripe_width 0
>> pool 93 'default.rgw.buckets.index' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 664 flags
>> hashpspool stripe_width 0
>> pool 95 'default.rgw.intent-log' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 656 flags hashpspool
>> stripe_width 0
>> pool 96 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 657 flags hashpspool
>> stripe_width 0
>> pool 97 'default.rgw.usage' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 658 flags hashpspool
>> stripe_width 0
>> pool 98 'default.rgw.users.swift' replicated size 3 min_size 2 crush_rule
>> 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 661 flags hashpspool
>> stripe_width 0
>> pool 99 'default.rgw.buckets.extra' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 663 flags
>> hashpspool stripe_width 0
>> pool 100 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 4 pgp_num 4 last_change 651 flags hashpspool stripe_width 0
>> pool 101 'default.rgw.reshard' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 1529 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 103 'default.rgw.buckets.data' erasure size 3 min_size 2 crush_rule 1
>> object_hash rjenkins pg_num 256 pgp_num 256 last_change 2106 flags
>> hashpspool stripe_width 8192
>>
>> I'll keep on googling, but I'm open to advice!
>>
>> Thank you,
>>
>> Roger
>>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



--
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
