Re: undersized pgs after removing smaller OSDs

I would go with the weight that was originally assigned to them. That way it stays in line with how new osds will be weighted.


On Wed, Jul 19, 2017, 9:17 AM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
David,

Thank you. I have it currently as...

$ ceph osd df
ID WEIGHT   REWEIGHT SIZE   USE    AVAIL  %USE VAR  PGS 
 3 10.00000  1.00000  9313G 44404M  9270G 0.47 1.00 372 
 4 10.00000  1.00000  9313G 46933M  9268G 0.49 1.06 372 
 0 10.00000  1.00000  9313G 41283M  9273G 0.43 0.93 372 
               TOTAL 27941G   129G 27812G 0.46          
MIN/MAX VAR: 0.93/1.06  STDDEV: 0.02

The above output shows the size not as 10TB but as 9313G. So should I reweight each one as 9.313, or as the TiB value 9.09560?
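(If my math is right, 9313G is GiB, and 9313 / 1024 ≈ 9.0947 TiB, which is essentially the 9.09560 figure; the small gap looks like rounding in the df output.)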


On Tue, Jul 18, 2017 at 11:18 PM David Turner <drakonstein@xxxxxxxxx> wrote:

I would recommend sticking with the weight of 9.09560 for the osds, as that is the TiB size of the osds that ceph defaults to, as opposed to the TB size. New osds will have their weights based on the TiB value. What is your `ceph osd df` output, just to see what things look like? Hopefully very healthy.
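Something like this should do it (an untested sketch; osd ids taken from your tree):

$ ceph osd crush reweight osd.0 9.09560
$ ceph osd crush reweight osd.3 9.09560
$ ceph osd crush reweight osd.4 9.09560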


On Tue, Jul 18, 2017, 11:16 PM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
Resolution confirmed!

$ ceph -s
  cluster:
    id:     eea7b78c-b138-40fc-9f3e-3d77afb770f0
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum desktop,mon1,nuc2
    mgr: desktop(active), standbys: mon1
    osd: 3 osds: 3 up, 3 in
 
  data:
    pools:   19 pools, 372 pgs
    objects: 54243 objects, 71722 MB
    usage:   129 GB used, 27812 GB / 27941 GB avail
    pgs:     372 active+clean


On Tue, Jul 18, 2017 at 8:47 PM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
Ah, that was the problem!

So I edited the crushmap (http://docs.ceph.com/docs/master/rados/operations/crush-map/) with a weight of 10.000 for all three 10TB OSD hosts. The instant result was that all the pgs with only 2 OSDs picked up a third OSD as the cluster started rebalancing the data. I trust it will complete in time and I'll be good to go!
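For the archives, the edit cycle from that doc page is roughly this (file names are arbitrary):

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt
$ # edit the host weights in crushmap.txt, then recompile and inject it:
$ crushtool -c crushmap.txt -o crushmap-new.bin
$ ceph osd setcrushmap -i crushmap-new.bin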

New OSD tree:
$ ceph osd tree
ID WEIGHT   TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 30.00000 root default                                    
-5 10.00000     host osd1                                   
 3 10.00000         osd.3      up  1.00000          1.00000 
-6 10.00000     host osd2                                   
 4 10.00000         osd.4      up  1.00000          1.00000 
-2 10.00000     host osd3                                   
 0 10.00000         osd.0      up  1.00000          1.00000 

Kudos to Brad Hubbard for steering me in the right direction!


On Tue, Jul 18, 2017 at 8:27 PM Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
ID WEIGHT   TYPE NAME
-5  1.00000     host osd1
-6  9.09560     host osd2
-2  9.09560     host osd3

The weight allocated to host "osd1" should presumably be the same as
the other two hosts?

Dump your crushmap and take a good look at it, specifically the
weighting of "osd1".
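e.g.:

$ ceph osd crush dump | less

and check the weight on the "osd1" bucket (or decompile it with crushtool, per the docs).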


On Wed, Jul 19, 2017 at 11:48 AM, Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
> I also tried ceph pg query, but it gave no helpful recommendations for any
> of the stuck pgs.
>
>
> On Tue, Jul 18, 2017 at 7:45 PM Roger Brown <rogerpbrown@xxxxxxxxx> wrote:
>>
>> Problem:
>> I have some pgs with only two OSDs instead of 3 like all the other pgs
>> have. This is causing active+undersized+degraded status.
>>
>> History:
>> 1. I started with 3 hosts, each with 1 OSD process (min_size 2) for a 1TB
>> drive.
>> 2. Added 3 more hosts, each with 1 OSD process for a 10TB drive.
>> 3. Removed the original 3 1TB OSD hosts from the osd tree (reweight 0,
>> wait, stop, remove, del osd&host, rm; sketched in full below).
>> 4. The last OSD to be removed would never return to active+clean after
>> reweight 0. It returned undersized instead, but I went on with removal
>> anyway, leaving me stuck with 5 undersized pgs.
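>>
>> Roughly, that sequence (N standing in for each old osd id, HOST for its
>> emptied host bucket) looked like:
>>
>> $ ceph osd crush reweight osd.N 0   # drain; wait for active+clean
>> $ systemctl stop ceph-osd@N
>> $ ceph osd crush remove osd.N
>> $ ceph osd crush remove HOST
>> $ ceph auth del osd.N
>> $ ceph osd rm N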
>>
>> Things tried that didn't help:
>> * Gave it time to go away on its own.
>> * Replaced the replicated default.rgw.buckets.data pool with an
>> erasure-coded 2+1 version.
>> * ceph osd lost 1 (and 2)
>> * ceph pg repair (on the pgs from dump_stuck; example below)
>> * Googled 'ceph pg undersized' and similar searches for help.
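>>
>> The repair attempts were one call per stuck pgid from the dump_stuck
>> output below, e.g.:
>>
>> $ ceph pg repair 88.3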
>>
>> Current status:
>> $ ceph osd tree
>> ID WEIGHT   TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 19.19119 root default
>> -5  1.00000     host osd1
>>  3  1.00000         osd.3      up  1.00000          1.00000
>> -6  9.09560     host osd2
>>  4  9.09560         osd.4      up  1.00000          1.00000
>> -2  9.09560     host osd3
>>  0  9.09560         osd.0      up  1.00000          1.00000
>> $ ceph pg dump_stuck
>> ok
>> PG_STAT STATE                      UP    UP_PRIMARY ACTING ACTING_PRIMARY
>> 88.3    active+undersized+degraded [4,0]          4  [4,0]              4
>> 97.3    active+undersized+degraded [4,0]          4  [4,0]              4
>> 85.6    active+undersized+degraded [4,0]          4  [4,0]              4
>> 87.5    active+undersized+degraded [0,4]          0  [0,4]              0
>> 70.0    active+undersized+degraded [0,4]          0  [0,4]              0
>> $ ceph osd pool ls detail
>> pool 70 'default.rgw.rgw.gc' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 548 flags hashpspool
>> stripe_width 0
>> pool 83 'default.rgw.buckets.non-ec' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 576 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 85 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 652 flags hashpspool
>> stripe_width 0
>> pool 86 'default.rgw.data.root' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 653 flags hashpspool
>> stripe_width 0
>> pool 87 'default.rgw.gc' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 654 flags hashpspool
>> stripe_width 0
>> pool 88 'default.rgw.lc' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 600 flags hashpspool
>> stripe_width 0
>> pool 89 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 655 flags hashpspool
>> stripe_width 0
>> pool 90 'default.rgw.users.uid' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 662 flags hashpspool
>> stripe_width 0
>> pool 91 'default.rgw.users.email' replicated size 3 min_size 2 crush_rule
>> 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 660 flags hashpspool
>> stripe_width 0
>> pool 92 'default.rgw.users.keys' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 659 flags hashpspool
>> stripe_width 0
>> pool 93 'default.rgw.buckets.index' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 664 flags
>> hashpspool stripe_width 0
>> pool 95 'default.rgw.intent-log' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 656 flags hashpspool
>> stripe_width 0
>> pool 96 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 657 flags hashpspool
>> stripe_width 0
>> pool 97 'default.rgw.usage' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 4 pgp_num 4 last_change 658 flags hashpspool
>> stripe_width 0
>> pool 98 'default.rgw.users.swift' replicated size 3 min_size 2 crush_rule
>> 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 661 flags hashpspool
>> stripe_width 0
>> pool 99 'default.rgw.buckets.extra' replicated size 3 min_size 2
>> crush_rule 0 object_hash rjenkins pg_num 4 pgp_num 4 last_change 663 flags
>> hashpspool stripe_width 0
>> pool 100 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
>> rjenkins pg_num 4 pgp_num 4 last_change 651 flags hashpspool stripe_width 0
>> pool 101 'default.rgw.reshard' replicated size 3 min_size 2 crush_rule 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 1529 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 103 'default.rgw.buckets.data' erasure size 3 min_size 2 crush_rule 1
>> object_hash rjenkins pg_num 256 pgp_num 256 last_change 2106 flags
>> hashpspool stripe_width 8192
>>
>> I'll keep on googling, but I'm open to advice!
>>
>> Thank you,
>>
>> Roger
>>
>



--
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
