Philippe,
I have a master-branch version of the code to test. The Nautilus
backport https://github.com/ceph/ceph/pull/31956 should be the same.

Using your OSDMap, the code in the master branch, and some additional
changes to osdmaptool, I was able to balance your cluster. The
osdmaptool changes simulate the mgr's active balancer behavior. It
never took more than 0.13991 seconds to calculate the upmaps for a
round, and that was on a virtual machine used for development. It took
35 rounds, with a maximum of 10 upmaps per round for each crush rule's
set of pools. With the default 1-minute sleep inside the mgr, that
would take 35 minutes. Obviously, recovery/backfill has to finish
before the cluster settles into the new configuration. In total it
needed 397 additional upmaps and removed 8.

Because all pools for a given crush rule are balanced together, you can
see that this is more balanced than Rich's configuration using
Luminous. This balancer code is subject to change before the next
Nautilus point release.
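
For reference, the offline run can be reproduced with something along
these lines. This is only a sketch: it assumes an osdmaptool built with
the changes from the PR above, osdmap.bin is the map you attached, the
output file name is a placeholder, and the deviation value is
illustrative.

  # Generate proposed upmaps offline against the attached map.
  # --upmap-max 10      : at most 10 upmaps per run (one round's worth)
  # --upmap-deviation 1 : aim for no more than 1 PG of deviation per OSD
  osdmaptool osdmap.bin --upmap out.txt --upmap-max 10 --upmap-deviation 1

  # out.txt contains proposed "ceph osd pg-upmap-items ..." commands;
  # review them before running anything against a live cluster.
  cat out.txt
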
Final layout:
osd.0 pgs 146
osd.1 pgs 146
osd.2 pgs 146
osd.3 pgs 146
osd.4 pgs 146
osd.5 pgs 146
osd.6 pgs 146
osd.7 pgs 146
osd.8 pgs 146
osd.9 pgs 146
osd.10 pgs 146
osd.11 pgs 146
osd.12 pgs 74
osd.13 pgs 74
osd.14 pgs 73
osd.15 pgs 74
osd.16 pgs 74
osd.17 pgs 74
osd.18 pgs 73
osd.19 pgs 74
osd.20 pgs 73
osd.21 pgs 73
osd.22 pgs 74
osd.23 pgs 73
osd.24 pgs 73
osd.25 pgs 75
osd.26 pgs 74
osd.27 pgs 74
osd.28 pgs 73
osd.29 pgs 73
osd.30 pgs 73
osd.31 pgs 73
osd.32 pgs 74
osd.33 pgs 73
osd.34 pgs 73
osd.35 pgs 74
osd.36 pgs 74
osd.37 pgs 74
osd.38 pgs 74
osd.39 pgs 74
osd.40 pgs 73
osd.41 pgs 73
osd.42 pgs 73
osd.43 pgs 73
osd.44 pgs 74
osd.45 pgs 73
osd.46 pgs 73
osd.47 pgs 73
osd.48 pgs 73
osd.49 pgs 73
osd.50 pgs 73
osd.51 pgs 73
osd.52 pgs 75
osd.53 pgs 59
osd.54 pgs 74
osd.55 pgs 74
osd.56 pgs 74
osd.57 pgs 73
osd.58 pgs 74
osd.59 pgs 74
osd.60 pgs 74
osd.61 pgs 74
osd.62 pgs 73
osd.63 pgs 74
osd.64 pgs 73
osd.65 pgs 74
osd.66 pgs 74
osd.67 pgs 74
osd.68 pgs 73
osd.69 pgs 74
osd.70 pgs 73
osd.71 pgs 73
osd.72 pgs 73
osd.73 pgs 73
osd.74 pgs 73
osd.75 pgs 73
osd.76 pgs 73
osd.77 pgs 73
osd.78 pgs 73
osd.79 pgs 73
osd.80 pgs 73
osd.81 pgs 73
osd.82 pgs 73
osd.83 pgs 73
osd.84 pgs 73
osd.85 pgs 73
osd.86 pgs 73
osd.87 pgs 73
osd.88 pgs 73
osd.89 pgs 73
osd.90 pgs 73
osd.91 pgs 73
osd.92 pgs 73
osd.93 pgs 73
osd.94 pgs 73
osd.95 pgs 73
osd.96 pgs 73
osd.97 pgs 73
osd.98 pgs 73
osd.99 pgs 73
osd.100 pgs 146
osd.101 pgs 146
osd.102 pgs 146
osd.103 pgs 146
osd.104 pgs 146
osd.105 pgs 146
osd.106 pgs 146
osd.107 pgs 146
osd.108 pgs 146
osd.109 pgs 146
osd.110 pgs 146
osd.111 pgs 146
osd.112 pgs 73
osd.113 pgs 73
osd.114 pgs 73
osd.115 pgs 73
osd.116 pgs 73
osd.117 pgs 73
osd.118 pgs 73
osd.119 pgs 73
osd.120 pgs 73
osd.121 pgs 73
osd.122 pgs 73
osd.123 pgs 73
osd.124 pgs 73
osd.125 pgs 73
osd.126 pgs 73
osd.127 pgs 74
osd.128 pgs 73
osd.129 pgs 73
osd.130 pgs 73
osd.131 pgs 73
osd.132 pgs 73
osd.133 pgs 73
osd.134 pgs 73
osd.135 pgs 73
David
On 12/10/19 9:59 PM, Philippe D'Anjou wrote:
Given that I was told it's an issue of too few PGs, I am raising and
testing this, although my SSDs, which have about 150 PGs each, are also
not well distributed.

I attached my OSDMap; I'd appreciate it if you could run your test on
it like you did with the other guy, so I know whether this will ever
distribute equally or not.

If you're busy I understand that too; just ignore this in that case.

Thanks either way. I have been dealing with this for months now and it
is getting frustrating.

Best regards
On Tuesday, December 10, 2019, 03:53:17 EET, David Zafman
<dzafman@xxxxxxxxxx> wrote:
Please file a tracker issue with the symptoms and examples. Please
attach your OSDMap (ceph osd getmap > osdmap.bin).

Note that https://github.com/ceph/ceph/pull/31956 has the Nautilus
version of the improved upmap code. It also changes osdmaptool to match
the mgr behavior, so that one can observe the behavior of the upmap
balancer offline.
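
For example, something along these lines shows what the balancer would
propose without touching the cluster (a sketch only; file names are
placeholders and the flag value is just an example):

  # Export the current map and compute upmaps from it offline.
  ceph osd getmap > osdmap.bin
  osdmaptool osdmap.bin --upmap preview.txt --upmap-max 10
  # preview.txt holds the proposed "ceph osd pg-upmap-items ..." commands.
  cat preview.txt
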
Thanks
David
On 12/8/19 11:04 AM, Philippe D'Anjou wrote:
> It's only getting worse after raising PGs now.
>
> Anything between:
> 96 hdd 9.09470 1.00000 9.1 TiB 4.9 TiB 4.9 TiB 97 KiB 13 GiB 4.2 TiB 53.62 0.76 54 up
>
> and
>
> 89 hdd 9.09470 1.00000 9.1 TiB 8.1 TiB 8.1 TiB 88 KiB 21 GiB 1001 GiB 89.25 1.27 87 up
>
> How is that possible? I dont know how much more proof I need to