You have a lot of useless PGs, yet they carry the same "weight" as the useful ones. If those pools are unused, you can:
- drop them
- raise npr_archive's pg_num using the freed PGs

As npr_archive owns 97% of your data, it should get ~97% of your PGs (which is ~8000).

The balancer module is still quite useful.

On 04/30/2019 08:02 PM, Shain Miley wrote:
> Here is the per-pool pg_num info:
>
> 'data'               pg_num 64
> 'metadata'           pg_num 64
> 'rbd'                pg_num 64
> 'npr_archive'        pg_num 6775
> '.rgw.root'          pg_num 64
> '.rgw.control'       pg_num 64
> '.rgw'               pg_num 64
> '.rgw.gc'            pg_num 64
> '.users.uid'         pg_num 64
> '.users.email'       pg_num 64
> '.users'             pg_num 64
> '.usage'             pg_num 64
> '.rgw.buckets.index' pg_num 128
> '.intent-log'        pg_num 8
> '.rgw.buckets'       pg_num 64
> 'kube'               pg_num 512
> '.log'               pg_num 8
>
> Here is the df output:
>
> GLOBAL:
>     SIZE        AVAIL      RAW USED     %RAW USED
>     1.06PiB     306TiB     778TiB       71.75
> POOLS:
>     NAME                   ID     USED        %USED     MAX AVAIL     OBJECTS
>     data                   0      11.7GiB     0.14      8.17TiB       3006
>     metadata               1      0B          0         8.17TiB       0
>     rbd                    2      43.2GiB     0.51      8.17TiB       11147
>     npr_archive            3      258TiB      97.93     5.45TiB       82619649
>     .rgw.root              4      1001B       0         8.17TiB       5
>     .rgw.control           5      0B          0         8.17TiB       8
>     .rgw                   6      6.16KiB     0         8.17TiB       35
>     .rgw.gc                7      0B          0         8.17TiB       32
>     .users.uid             8      0B          0         8.17TiB       0
>     .users.email           9      0B          0         8.17TiB       0
>     .users                 10     0B          0         8.17TiB       0
>     .usage                 11     0B          0         8.17TiB       1
>     .rgw.buckets.index     12     0B          0         8.17TiB       26
>     .intent-log            17     0B          0         5.45TiB       0
>     .rgw.buckets           18     24.2GiB     0.29      8.17TiB       6622
>     kube                   21     1.82GiB     0.03      5.45TiB       550
>     .log                   22     0B          0         5.45TiB       176
>
> The stuff in the data pool and the rgw pools is old data that we used
> for testing... if you guys think that removing everything outside of rbd
> and npr_archive would make a significant impact, I will give it a try.
>
> Thanks,
>
> Shain
>
> On 4/30/19 1:15 PM, Jack wrote:
>> Hi,
>>
>> I see that you are using RGW.
>> RGW comes with many pools, but most of them are used for metadata and
>> configuration; those do not store much data.
>> Such pools do not need more than a couple of PGs each (I use pg_num = 8).
>>
>> You need to allocate your PGs to the pools that actually store the data.
>>
>> Please do the following, to let us know more:
>>
>> Print the pg_num per pool:
>> for i in $(rados lspools); do echo -n "$i: "; ceph osd pool get $i pg_num; done
>>
>> Print the usage per pool:
>> ceph df
>>
>> Also, instead of doing a "ceph osd reweight-by-utilization", check out
>> the balancer plugin:
>> http://docs.ceph.com/docs/mimic/mgr/balancer/
>>
>> Finally, in Nautilus, pg_num can now upscale and downscale automatically.
>> See
>> https://ceph.com/rados/new-in-nautilus-pg-merging-and-autotuning/
>>
>> On 04/30/2019 06:34 PM, Shain Miley wrote:
>>> Hi,
>>>
>>> We have a cluster with 235 OSDs running version 12.2.11 with a
>>> combination of 4 and 6 TB drives. The data distribution across OSDs
>>> varies from 52% to 94%.
>>>
>>> I have been trying to figure out how to get this a bit more balanced, as
>>> we are running into 'backfillfull' issues on a regular basis.
>>>
>>> I've tried adding more PGs... but this did not seem to do much in terms
>>> of the imbalance.
>>>
>>> Here is the end output from 'ceph osd df':
>>>
>>> MIN/MAX VAR: 0.73/1.31  STDDEV: 7.73
>>>
>>> We have 8199 PGs total, with 6775 of them in the pool that has 97% of the
>>> data.
>>>
>>> The other pools are not really used (data, metadata, .rgw.root,
>>> .rgw.control, etc). I have thought about deleting those unused pools so
>>> that most, if not all, of the PGs are being used by the pool with the
>>> majority of the data.
>>>
>>> However... before I do that... is there anything else I can do or try in
>>> order to see if I can balance out the data more uniformly?
>>>
>>> Thanks in advance,
>>>
>>> Shain
>>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
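
[Editor's note] Jack's sizing rule above (give each pool a share of the PG budget proportional to its share of the data, rounded to a power of two, with a small floor such as pg_num = 8 for metadata-only pools) can be sketched in Python. The usage figures come from the `ceph df` output in this thread; the 8192 budget (close to the cluster's current 8199 PGs), the floor of 8, and the `recommend_pg_nums` helper are illustrative assumptions, not a Ceph API:

```python
import math

# Per-pool usage from the "ceph df" output in this thread, normalised to GiB.
# Zero-usage metadata pools are omitted; they would all land on the floor.
usage_gib = {
    "npr_archive": 258 * 1024,  # 258 TiB
    "rbd": 43.2,
    ".rgw.buckets": 24.2,
    "data": 11.7,
    "kube": 1.82,
}

PG_BUDGET = 8192   # assumed total PG budget; the cluster currently has ~8199
MIN_PG = 8         # Jack's floor for pools that store little or no data

def recommend_pg_nums(usage, budget, floor=MIN_PG):
    """Split the PG budget in proportion to data stored, rounding each
    pool to the nearest power of two (Ceph prefers power-of-two pg_num)."""
    total = sum(usage.values())
    targets = {}
    for pool, used in usage.items():
        raw = budget * used / total
        if raw < floor:
            targets[pool] = floor
        else:
            targets[pool] = max(floor, 2 ** round(math.log2(raw)))
    return targets

targets = recommend_pg_nums(usage_gib, PG_BUDGET)
print(targets["npr_archive"])  # -> 8192: ~97% of the budget, as Jack suggests
```

Note that on Luminous (12.2.x), pg_num can only be increased, never decreased; PG merging only arrived in Nautilus. So the freed PGs have to come from deleting the unused pools before raising npr_archive's pg_num, which is exactly the order Jack proposes.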