Re: New pool created with 2048 pg_num not executed


 



After backfilling was complete, I was able to increase pg_num and pgp_num on the empty pool cfs_data in increments of 128 all the way up to 2048. That worked fine.

This is not working for the filled pool, cephfs_data.

It was at pg_num 187, pgp_num 59, and I am trying to increase that in small increments:

set nobackfill
set norebalance

then increase.

It does not go beyond this:

pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 195 pgp_num 67 pg_num_target 256 pgp_num_target 195 autoscale_mode off last_change 3319 lfor 0/3089/3315 flags hashpspool,bulk stripe_width 0 target_size_ratio 1 application cephfs

So does this mean I can only go in increments of 8 and then have to wait for rebalancing/backfill each time? If so, it will take several months, given that the pool already holds 90 million objects.

Fortunately, the data is cold backup only, so if I cannot find another way to increment in larger steps, I will have to delete the pool and start over.
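
For reference, the stepwise procedure described above can be written down as a dry-run sketch. It only prints the ceph commands instead of running them. The helper plan_pg_steps and the step size are illustrative assumptions, not part of Ceph; the pool name and the numbers are taken from the output above.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands for raising
# pg_num/pgp_num in small steps. plan_pg_steps is a hypothetical helper.
plan_pg_steps() {
    pool=$1; current=$2; target=$3; step=$4
    while [ "$current" -lt "$target" ]; do
        current=$((current + step))
        # Never overshoot the requested target.
        if [ "$current" -gt "$target" ]; then current=$target; fi
        echo "ceph osd pool set $pool pg_num $current"
        echo "ceph osd pool set $pool pgp_num $current"
    done
}

echo "ceph osd set nobackfill"
echo "ceph osd set norebalance"
plan_pg_steps cephfs_data 195 256 8
echo "ceph osd unset norebalance"
echo "ceph osd unset nobackfill"
```

With a step of 8, going from 195 to 2048 would take 232 such rounds, each followed by a wait for backfill, which is where the concern about the total time comes from.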



ceph status
  cluster:
    id:     c3f53dc2-6fec-11ed-8f82-8d92bac89f1e
    health: HEALTH_WARN
            4 clients failing to advance oldest client/flush tid
            nobackfill,norebalance,noscrub,nodeep-scrub flag(s) set
            161 pgs not deep-scrubbed in time
            148 pgs not scrubbed in time
            1 pools have pg_num > pgp_num

  services:
    mon: 6 daemons, quorum i01,i02,i03,i04,i05,i06 (age 24h)
    mgr: i05.cubljm(active, since 41h), standbys: i02.yshlju, i03.fxfpta, i04.bjgfeu, i06.blyjkk, i01.nbmavd
    mds: 6/6 daemons up
    osd: 71 osds: 71 up (since 24h), 71 in (since 24h); 21 remapped pgs
         flags nobackfill,norebalance,noscrub,nodeep-scrub

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 212 pgs
    objects: 90.33M objects, 41 TiB
    usage:   128 TiB used, 510 TiB / 639 TiB avail
    pgs:     24452271/270990393 objects misplaced (9.023%)
             191 active+clean
             21  active+remapped+backfilling

ceph osd df tree
 ID  CLASS     WEIGHT  REWEIGHT     SIZE  RAW USE     DATA     OMAP     META    AVAIL   %USE   VAR  PGS  STATUS  TYPE NAME
 -1         638.52063         -  639 TiB  128 TiB  128 TiB  137 GiB  507 GiB  510 TiB  20.08  1.00    -          root default
 -3         100.05257         -  100 TiB   22 TiB   22 TiB   28 GiB   83 GiB   78 TiB  22.13  1.10    -              host i01
  0  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB  5.8 GiB  6.2 GiB  7.8 TiB  14.22  0.71    8  up              osd.0
  7  hdd      9.09569   1.00000  9.1 TiB  656 GiB  653 GiB    1 KiB  2.7 GiB  8.5 TiB   7.04  0.35    3  up              osd.7
 13  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB  2.9 GiB  4.5 GiB  7.8 TiB  14.20  0.71    7  up              osd.13
 19  hdd      9.09569   1.00000  9.1 TiB  3.2 TiB  3.2 TiB    1 KiB   10 GiB  5.9 TiB  35.40  1.76   16  up              osd.19
 25  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB    1 KiB  3.9 GiB  7.8 TiB  14.19  0.71    6  up              osd.25
 31  hdd      9.09569   1.00000  9.1 TiB  3.2 TiB  3.2 TiB  8.3 GiB   14 GiB  5.9 TiB  35.50  1.77   18  up              osd.31
 38  hdd      9.09569   1.00000  9.1 TiB  2.6 TiB  2.6 TiB  5.6 GiB  9.0 GiB  6.5 TiB  28.35  1.41   14  up              osd.38
 44  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB  5.6 GiB  6.9 GiB  7.2 TiB  21.28  1.06   11  up              osd.44
 50  hdd      9.09569   1.00000  9.1 TiB  2.6 TiB  2.6 TiB    1 KiB  9.3 GiB  6.5 TiB  28.36  1.41   13  up              osd.50
 56  hdd      9.09569   1.00000  9.1 TiB  1.5 TiB  1.5 TiB    1 KiB  5.7 GiB  7.6 TiB  16.34  0.81    6  up              osd.56
 62  hdd      9.09569   1.00000  9.1 TiB  2.6 TiB  2.6 TiB    1 KiB   10 GiB  6.5 TiB  28.52  1.42   12  up              osd.62
 -5         103.69336         -  104 TiB   19 TiB   18 TiB   20 GiB   76 GiB   85 TiB  17.85  0.89    -              host i02
  5  hdd      9.09569   1.00000  9.1 TiB  660 GiB  655 GiB  2.9 GiB  2.8 GiB  8.5 TiB   7.09  0.35    4  up              osd.5
 11  hdd      7.27739   1.00000  7.3 TiB  2.6 TiB  2.6 TiB    1 KiB  7.9 GiB  4.7 TiB  35.34  1.76   12  up              osd.11
 12  hdd      9.09569   1.00000  9.1 TiB  662 GiB  659 GiB    1 KiB  3.6 GiB  8.4 TiB   7.11  0.35    3  up              osd.12
 18  hdd      9.09569   1.00000  9.1 TiB  669 GiB  659 GiB  5.7 GiB  3.8 GiB  8.4 TiB   7.18  0.36    5  up              osd.18
 24  hdd      7.27739   1.00000  7.3 TiB  3.2 TiB  3.2 TiB    1 KiB  9.7 GiB  4.1 TiB  44.24  2.20   16  up              osd.24
 30  hdd      7.27739   1.00000  7.3 TiB  2.6 TiB  2.6 TiB  3.0 GiB  8.2 GiB  4.7 TiB  35.42  1.76   13  up              osd.30
 36  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB  2.8 GiB   12 GiB  7.2 TiB  21.33  1.06   10  up              osd.36
 42  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB    1 KiB  4.1 GiB  7.8 TiB  14.14  0.70    6  up              osd.42
 48  hdd      9.09569   1.00000  9.1 TiB   82 MiB   28 MiB    1 KiB   54 MiB  9.1 TiB      0     0    0  up              osd.48
 54  hdd      9.09569   1.00000  9.1 TiB  663 GiB  657 GiB  2.9 GiB  3.2 GiB  8.4 TiB   7.11  0.35    4  up              osd.54
 60  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB    1 KiB  6.5 GiB  7.8 TiB  14.49  0.72    6  up              osd.60
 66  hdd      9.09569   1.00000  9.1 TiB  3.0 TiB  3.0 TiB  2.8 GiB   14 GiB  6.1 TiB  33.06  1.65   13  up              osd.66
-13         109.14825         -  109 TiB   21 TiB   21 TiB   26 GiB   82 GiB   88 TiB  19.27  0.96    -              host i03
  4  hdd      9.09569   1.00000  9.1 TiB  659 GiB  657 GiB    1 KiB  2.6 GiB  8.5 TiB   7.08  0.35    3  up              osd.4
 10  hdd      9.09569   1.00000  9.1 TiB  1.5 TiB  1.5 TiB    1 KiB  6.2 GiB  7.6 TiB  16.82  0.84    6  up              osd.10
 17  hdd      9.09569   1.00000  9.1 TiB  2.5 TiB  2.5 TiB  2.8 GiB   10 GiB  6.6 TiB  27.52  1.37   10  up              osd.17
 23  hdd      9.09569   1.00000  9.1 TiB  2.6 TiB  2.6 TiB  2.8 GiB  8.3 GiB  6.5 TiB  28.33  1.41   13  up              osd.23
 29  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB    1 KiB  5.3 GiB  7.8 TiB  14.16  0.71    6  up              osd.29
 35  hdd      9.09569   1.00000  9.1 TiB  7.4 GiB  261 MiB  5.6 GiB  1.5 GiB  9.1 TiB   0.08  0.00    3  up              osd.35
 41  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB  2.8 GiB  7.0 GiB  7.2 TiB  21.28  1.06   10  up              osd.41
 47  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB  2.9 GiB  6.4 GiB  7.2 TiB  21.27  1.06   10  up              osd.47
 53  hdd      9.09569   0.89999  9.1 TiB  2.8 TiB  2.8 TiB  2.9 GiB   14 GiB  6.3 TiB  31.07  1.55   13  up              osd.53
 59  hdd      9.09569   1.00000  9.1 TiB  3.2 TiB  3.2 TiB  2.9 GiB   10 GiB  5.9 TiB  35.39  1.76   17  up              osd.59
 65  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB    1 KiB  4.7 GiB  7.8 TiB  14.13  0.70    6  up              osd.65
 70  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB  2.8 GiB  4.6 GiB  7.8 TiB  14.17  0.71    7  up              osd.70
-11         109.14825         -  109 TiB   19 TiB   19 TiB   26 GiB   77 GiB   90 TiB  17.46  0.87    -              host i04
  2  hdd      9.09569   1.00000  9.1 TiB  2.6 TiB  2.6 TiB  2.8 GiB  8.6 GiB  6.5 TiB  28.32  1.41   13  up              osd.2
  6  hdd      9.09569   1.00000  9.1 TiB  2.1 TiB  2.0 TiB  5.8 GiB  9.4 GiB  7.0 TiB  22.65  1.13   12  up              osd.6
 14  hdd      9.09569   1.00000  9.1 TiB  2.8 TiB  2.8 TiB  2.9 GiB   13 GiB  6.3 TiB  30.53  1.52   14  up              osd.14
 20  hdd      9.09569   1.00000  9.1 TiB  660 GiB  658 GiB    1 KiB  2.6 GiB  8.5 TiB   7.09  0.35    3  up              osd.20
 26  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB  2.8 GiB  6.5 GiB  7.2 TiB  21.23  1.06   10  up              osd.26
 32  hdd      9.09569   1.00000  9.1 TiB  661 GiB  658 GiB    1 KiB  2.3 GiB  8.5 TiB   7.09  0.35    3  up              osd.32
 39  hdd      9.09569   1.00000  9.1 TiB  660 GiB  657 GiB    1 KiB  2.2 GiB  8.5 TiB   7.08  0.35    3  up              osd.39
 45  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB  2.8 GiB  4.5 GiB  7.8 TiB  14.12  0.70    7  up              osd.45
 51  hdd      9.09569   1.00000  9.1 TiB  7.3 GiB  152 MiB  5.8 GiB  1.4 GiB  9.1 TiB   0.08  0.00    2  up              osd.51
 57  hdd      9.09569   1.00000  9.1 TiB  2.6 TiB  2.6 TiB  2.9 GiB  9.1 GiB  6.5 TiB  28.33  1.41   13  up              osd.57
 64  hdd      9.09569   1.00000  9.1 TiB  2.0 TiB  2.0 TiB    1 KiB  9.5 GiB  7.1 TiB  21.78  1.08    9  up              osd.64
 69  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB    1 KiB  8.0 GiB  7.2 TiB  21.27  1.06    9  up              osd.69
 -9         109.14825         -  109 TiB   22 TiB   22 TiB   20 GiB   90 GiB   87 TiB  20.02  1.00    -              host i05
  1  hdd      9.09569   1.00000  9.1 TiB  3.2 TiB  3.2 TiB  2.9 GiB   13 GiB  5.9 TiB  35.41  1.76   16  up              osd.1
  9  hdd      9.09569   1.00000  9.1 TiB  658 GiB  656 GiB    1 KiB  2.3 GiB  8.5 TiB   7.07  0.35    3  up              osd.9
 15  hdd      9.09569   1.00000  9.1 TiB  3.6 GiB  100 MiB  2.9 GiB  611 MiB  9.1 TiB   0.04  0.00    1  up              osd.15
 21  hdd      9.09569   1.00000  9.1 TiB  1.4 TiB  1.4 TiB  5.7 GiB  6.2 GiB  7.7 TiB  15.06  0.75    8  up              osd.21
 27  hdd      9.09569   1.00000  9.1 TiB  730 GiB  725 GiB    1 KiB  4.2 GiB  8.4 TiB   7.83  0.39    3  up              osd.27
 34  hdd      9.09569   1.00000  9.1 TiB  2.7 TiB  2.7 TiB  2.8 GiB   11 GiB  6.4 TiB  29.94  1.49   13  up              osd.34
 40  hdd      9.09569   1.00000  9.1 TiB  658 GiB  656 GiB    1 KiB  2.4 GiB  8.5 TiB   7.07  0.35    3  up              osd.40
 46  hdd      9.09569   1.00000  9.1 TiB  2.8 TiB  2.8 TiB    1 KiB  9.3 GiB  6.3 TiB  30.39  1.51   13  up              osd.46
 52  hdd      9.09569   1.00000  9.1 TiB  2.0 TiB  2.0 TiB    1 KiB  7.4 GiB  7.1 TiB  22.23  1.11    9  up              osd.52
 58  hdd      9.09569   1.00000  9.1 TiB  2.1 TiB  2.0 TiB  2.8 GiB  8.9 GiB  7.0 TiB  22.61  1.13   10  up              osd.58
 63  hdd      9.09569   0.89999  9.1 TiB  3.0 TiB  3.0 TiB    1 KiB   12 GiB  6.1 TiB  33.09  1.65   12  up              osd.63
 68  hdd      9.09569   1.00000  9.1 TiB  2.7 TiB  2.7 TiB  3.0 GiB   12 GiB  6.4 TiB  29.50  1.47   14  up              osd.68
 -7         107.32996         -  107 TiB   26 TiB   25 TiB   17 GiB  101 GiB   82 TiB  23.84  1.19    -              host i06
  3  hdd      9.09569   0.29999  9.1 TiB  439 GiB  432 GiB    1 KiB  7.4 GiB  8.7 TiB   4.72  0.23    0  up              osd.3
  8  hdd      9.09569   1.00000  9.1 TiB  666 GiB  656 GiB  5.7 GiB  3.8 GiB  8.4 TiB   7.15  0.36    5  up              osd.8
 16  hdd      9.09569   1.00000  9.1 TiB  3.2 TiB  3.2 TiB    1 KiB  9.7 GiB  5.9 TiB  35.35  1.76   15  up              osd.16
 22  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB    1 KiB  7.2 GiB  7.2 TiB  21.23  1.06    9  up              osd.22
 28  hdd      9.09569   1.00000  9.1 TiB  1.9 TiB  1.9 TiB    1 KiB  6.5 GiB  7.2 TiB  21.25  1.06    9  up              osd.28
 33  hdd      7.27739   1.00000  7.3 TiB  1.9 TiB  1.9 TiB    1 KiB  6.0 GiB  5.3 TiB  26.50  1.32   10  up              osd.33
 37  hdd      9.09569   1.00000  9.1 TiB  3.9 TiB  3.9 TiB  5.7 GiB   14 GiB  5.2 TiB  42.54  2.12   20  up              osd.37
 43  hdd      9.09569   1.00000  9.1 TiB  3.0 TiB  3.0 TiB    1 KiB  9.9 GiB  6.1 TiB  33.05  1.65   12  up              osd.43
 49  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB  2.9 GiB  4.7 GiB  7.8 TiB  14.20  0.71    7  up              osd.49
 55  hdd      9.09569   1.00000  9.1 TiB  2.1 TiB  2.1 TiB    1 KiB  8.8 GiB  7.0 TiB  23.09  1.15    9  up              osd.55
 61  hdd      9.09569   1.00000  9.1 TiB  1.3 TiB  1.3 TiB    1 KiB  6.4 GiB  7.8 TiB  14.30  0.71    6  up              osd.61
 67  hdd      9.09569   0.89999  9.1 TiB  3.9 TiB  3.9 TiB  3.0 GiB   17 GiB  5.2 TiB  43.22  2.15   21  up              osd.67
                          TOTAL  639 TiB  128 TiB  128 TiB  137 GiB  507 GiB  510 TiB  20.08
MIN/MAX VAR: 0/2.20  STDDEV: 11.36


On 14.12.22 22:58, Frank Schilder wrote:
Hi Martin,

I can't find the output of

ceph osd df tree
ceph status

anywhere. I thought you posted it, but well. Could you please post the output of these commands?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Martin Buss <mbuss7004@xxxxxxxxx>
Sent: 14 December 2022 22:02:43
To: Frank Schilder; ceph-users@xxxxxxx
Cc: Eugen Block
Subject: Re:  Re: New pool created with 2048 pg_num not executed

Hi Frank,

Thanks for joining in on this. Setting target_max_misplaced_ratio to 1
does not help.

Regards,
Martin

On 14.12.22 21:32, Frank Schilder wrote:
Hi Eugen: déjà vu again?

I think the way autoscaler code in the MGRs interferes with operations is extremely confusing.

Could this be the same issue I and somebody else had a while ago? Even though autoscaler is disabled, there are parts of it in the MGR still interfering. One of the essential config options was target_max_misplaced_ratio, which needs to be set to 1 if you want to have all PGs created regardless of how many objects are misplaced.

The thread was https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/WST6K5A4UQGGISBFGJEZS4HFL2VVWW32

In addition, the PG splitting will stop if recovery IO is going on (some objects are degraded).
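
As a rough illustration of the throttle being discussed, the sketch below compares the misplaced fraction reported by ceph status in this thread (24452271 of 270990393 objects) with target_max_misplaced_ratio. The helper names are hypothetical, and 0.05 is assumed here as the option's default. The real decision is made inside the MGR and may involve more than this single comparison.

```shell
#!/bin/sh
# Hypothetical helpers; the object counts come from the "ceph status"
# output posted in this thread.

# Print the misplaced percentage with three decimals.
misplaced_pct() {
    awk -v m="$1" -v t="$2" 'BEGIN { printf "%.3f\n", 100 * m / t }'
}

# Succeed (exit 0) when misplaced/total is above the given ratio.
over_limit() {
    awk -v m="$1" -v t="$2" -v r="$3" 'BEGIN { exit !(m / t > r) }'
}

misplaced_pct 24452271 270990393    # prints 9.023

if over_limit 24452271 270990393 0.05; then
    echo "above a ratio of 0.05: further pgp_num changes may be held back"
fi

# Commands to inspect or change the option on a live cluster
# (shown as comments so this sketch stays a dry run):
#   ceph config get mgr target_max_misplaced_ratio
#   ceph config set mgr target_max_misplaced_ratio 1
```

With the ratio set to 1, the same comparison no longer triggers, which is why that setting was the first thing to try.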

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Martin Buss <mbuss7004@xxxxxxxxx>
Sent: 14 December 2022 19:32
To: ceph-users@xxxxxxx
Subject:  Re: New pool created with 2048 pg_num not executed

Will do, that will take another day or so.

Could this have anything to do with osd_pg_bits, which defaults to 6?
Some operators seem to be working with 8 or 11.

Can you explain what this option means? I could not quite understand
from the documentation.

Thanks!

On 14.12.22 16:11, Eugen Block wrote:
Then I'd suggest waiting until the backfilling is done and then
reporting back if the PGs are still not created. I don't have any
information about the ML admin, sorry.

Quoting Martin Buss <mbuss7004@xxxxxxxxx>:

The cephfs_data pool was autoscaling while it was being filled; the
mismatched numbers are a result of that autoscaling.

The cluster status is WARN because some old data is still
backfilling on cephfs_data.

The issue is the newly created pool 9, cfs_data, which is stuck at
pg_num 1152.

PS: Can you help me get in touch with the list admin so I can have
that post, which includes private info, deleted?

On 14.12.22 15:41, Eugen Block wrote:
I'm wondering why the cephfs_data pool has mismatching pg_num and
pgp_num:

pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off

Does disabling the autoscaler leave it like that when you disable it
in the middle of scaling? What is the current 'ceph status'?


Quoting Martin Buss <mbuss7004@xxxxxxxxx>:

Hi Eugen,

thanks, sure, below:

pg_num stuck at 1152 and pgp_num stuck at 1024

Regards,

Martin

ceph config set global mon_max_pg_per_osd 400

ceph osd pool create cfs_data 2048 2048 --pg_num_min 2048
pool 'cfs_data' created

pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off
last_change 3099 lfor 0/3089/3096 flags hashpspool,bulk stripe_width
0 target_size_ratio 1 application cephfs
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode off
last_change 2942 lfor 0/0/123 flags hashpspool stripe_width 0
pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application
cephfs
pool 3 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 2943
flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1
application mgr
pool 9 'cfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 1152 pgp_num 1024 pg_num_target 2048
pgp_num_target 2048 autoscale_mode off last_change 3198 lfor
0/0/3198 flags hashpspool stripe_width 0 pg_num_min 2048
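
To follow how far a pool is from its target, the relevant fields can be pulled out of a line like the one above. This is only a sketch: pool_field is a hypothetical helper, and it assumes the space-separated "name value" layout seen in the 'ceph osd pool ls detail' output in this thread.

```shell
#!/bin/sh
# pool_field NAME: read one pool line on stdin and print the value that
# directly follows the field NAME (exact match, so pg_num does not
# match pgp_num or pg_num_target).
pool_field() {
    tr ' ' '\n' | awk -v f="$1" 'prev == f { print; exit } { prev = $0 }'
}

# Sample line shortened from the listing in this thread.
line="pool 9 'cfs_data' replicated size 3 min_size 2 pg_num 1152 pgp_num 1024 pg_num_target 2048 pgp_num_target 2048 autoscale_mode off"

echo "$line" | pool_field pg_num           # prints 1152
echo "$line" | pool_field pg_num_target    # prints 2048
```

On a live cluster the input line would come from something like `ceph osd pool ls detail | grep "'cfs_data'"`.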



On 14.12.22 15:10, Eugen Block wrote:
Hi,

are there already existing pools in the cluster? Can you share your
'ceph osd df tree' as well as 'ceph osd pool ls detail'? It sounds
like ceph is trying to stay within the limit of mon_max_pg_per_osd
(default 250).
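
A quick way to sanity-check this is to add up pg_num times replica size over all pools and divide by the number of OSDs. The sketch below does that with the pool numbers posted elsewhere in this thread (71 OSDs, all pools size 3, cfs_data at its requested 2048). The helper pgs_per_osd is hypothetical, and the assumption that the monitor's check works roughly like this average is mine, not confirmed in the thread.

```shell
#!/bin/sh
# pgs_per_osd OSD_COUNT PG_NUM SIZE [PG_NUM SIZE ...]
# Prints the average number of PG replicas per OSD (integer division).
pgs_per_osd() {
    osds=$1; shift
    total=0
    while [ $# -ge 2 ]; do
        total=$((total + $1 * $2))
        shift 2
    done
    echo $((total / osds))
}

# cephfs_data 187, cephfs_metadata 16, .mgr 1, cfs_data 2048; size 3 each
pgs_per_osd 71 187 3 16 3 1 3 2048 3    # prints 95
```

By this estimate the cluster would stay well below 250 per OSD even with all 2048 PGs created.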

Regards,
Eugen

Quoting Martin Buss <mbuss7004@xxxxxxxxx>:

Hi,

on Quincy, I created a new pool:

ceph osd pool create cfs_data 2048 2048

The cluster has 6 hosts and 71 OSDs.

The autoscaler is off. I find it strange that the pool is created
with pg_num 1152 and pgp_num 1024, with 2048 shown only as the new
target. I cannot get this pool to actually reach pg_num 2048 and
pgp_num 2048.

What config option am I missing that prevents me from growing the
pool to 2048? Although I specified the same value for pg_num and
pgp_num, they differ.

I would appreciate some help and guidance.

Thank you,

Martin
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx






