After backfilling was complete, I was able to increase pg_num and
pgp_num on the empty pool cfs_data in increments of 128, all the way up
to 2048; that was fine.
This is not working for the filled pool (pg_num 187, pgp_num 59).
I am trying to increase it in small increments:
set nobackfill
set norebalance
then increase pg_num
It does not go beyond this:
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 195 pgp_num 67 pg_num_target 256
pgp_num_target 195 autoscale_mode off last_change 3319 lfor 0/3089/3315
flags hashpspool,bulk stripe_width 0 target_size_ratio 1 application cephfs
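For reference, the increase attempts described above correspond roughly to this command sequence (pool name taken from the output above; the target value is illustrative, not a recommendation):

```shell
# Pause data movement so the increase can be applied without backfill churn
ceph osd set nobackfill
ceph osd set norebalance

# Request a higher pg_num (illustrative value); pgp_num is raised
# gradually by the mgr on recent releases, or can be set explicitly
ceph osd pool set cephfs_data pg_num 256

# Re-enable data movement afterwards
ceph osd unset nobackfill
ceph osd unset norebalance
```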
So does this mean I can only go in increments of 8 and then have to wait
for rebalancing/backfill? If so, it will take several months, given
that the pool is already filled with 90 million objects.
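To put the "several months" worry in numbers, a small back-of-the-envelope sketch (plain arithmetic, not a Ceph API; the step size of 8 is what the cluster showed above):

```python
import math

current_pg = 187   # pg_num of the filled pool right now
target_pg = 2048   # desired pg_num
step = 8           # observed increase per round before waiting on backfill

rounds = math.ceil((target_pg - current_pg) / step)
print(rounds)  # 233 rounds of increase + backfill
```

Even at only a few hours of backfill per round, that adds up to months.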
Fortunately, the data is cold backup only, so if I cannot find another
way to increase pg_num in larger steps, I will have to delete the pool
and start over.
ceph status
  cluster:
    id:     c3f53dc2-6fec-11ed-8f82-8d92bac89f1e
    health: HEALTH_WARN
            4 clients failing to advance oldest client/flush tid
            nobackfill,norebalance,noscrub,nodeep-scrub flag(s) set
            161 pgs not deep-scrubbed in time
            148 pgs not scrubbed in time
            1 pools have pg_num > pgp_num

  services:
    mon: 6 daemons, quorum i01,i02,i03,i04,i05,i06 (age 24h)
    mgr: i05.cubljm(active, since 41h), standbys: i02.yshlju, i03.fxfpta, i04.bjgfeu, i06.blyjkk, i01.nbmavd
    mds: 6/6 daemons up
    osd: 71 osds: 71 up (since 24h), 71 in (since 24h); 21 remapped pgs
         flags nobackfill,norebalance,noscrub,nodeep-scrub

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 212 pgs
    objects: 90.33M objects, 41 TiB
    usage:   128 TiB used, 510 TiB / 639 TiB avail
    pgs:     24452271/270990393 objects misplaced (9.023%)
             191 active+clean
             21  active+remapped+backfilling
ceph osd df tree
ID  CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
-1        638.52063 -        639 TiB 128 TiB 128 TiB 137 GiB 507 GiB 510 TiB 20.08 1.00 -          root default
-3        100.05257 -        100 TiB 22 TiB  22 TiB  28 GiB  83 GiB  78 TiB  22.13 1.10 -          host i01
0   hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 5.8 GiB 6.2 GiB 7.8 TiB 14.22 0.71 8   up     osd.0
7   hdd   9.09569   1.00000  9.1 TiB 656 GiB 653 GiB 1 KiB   2.7 GiB 8.5 TiB 7.04  0.35 3   up     osd.7
13  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 2.9 GiB 4.5 GiB 7.8 TiB 14.20 0.71 7   up     osd.13
19  hdd   9.09569   1.00000  9.1 TiB 3.2 TiB 3.2 TiB 1 KiB   10 GiB  5.9 TiB 35.40 1.76 16  up     osd.19
25  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 1 KiB   3.9 GiB 7.8 TiB 14.19 0.71 6   up     osd.25
31  hdd   9.09569   1.00000  9.1 TiB 3.2 TiB 3.2 TiB 8.3 GiB 14 GiB  5.9 TiB 35.50 1.77 18  up     osd.31
38  hdd   9.09569   1.00000  9.1 TiB 2.6 TiB 2.6 TiB 5.6 GiB 9.0 GiB 6.5 TiB 28.35 1.41 14  up     osd.38
44  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 5.6 GiB 6.9 GiB 7.2 TiB 21.28 1.06 11  up     osd.44
50  hdd   9.09569   1.00000  9.1 TiB 2.6 TiB 2.6 TiB 1 KiB   9.3 GiB 6.5 TiB 28.36 1.41 13  up     osd.50
56  hdd   9.09569   1.00000  9.1 TiB 1.5 TiB 1.5 TiB 1 KiB   5.7 GiB 7.6 TiB 16.34 0.81 6   up     osd.56
62  hdd   9.09569   1.00000  9.1 TiB 2.6 TiB 2.6 TiB 1 KiB   10 GiB  6.5 TiB 28.52 1.42 12  up     osd.62
-5        103.69336 -        104 TiB 19 TiB  18 TiB  20 GiB  76 GiB  85 TiB  17.85 0.89 -          host i02
5   hdd   9.09569   1.00000  9.1 TiB 660 GiB 655 GiB 2.9 GiB 2.8 GiB 8.5 TiB 7.09  0.35 4   up     osd.5
11  hdd   7.27739   1.00000  7.3 TiB 2.6 TiB 2.6 TiB 1 KiB   7.9 GiB 4.7 TiB 35.34 1.76 12  up     osd.11
12  hdd   9.09569   1.00000  9.1 TiB 662 GiB 659 GiB 1 KiB   3.6 GiB 8.4 TiB 7.11  0.35 3   up     osd.12
18  hdd   9.09569   1.00000  9.1 TiB 669 GiB 659 GiB 5.7 GiB 3.8 GiB 8.4 TiB 7.18  0.36 5   up     osd.18
24  hdd   7.27739   1.00000  7.3 TiB 3.2 TiB 3.2 TiB 1 KiB   9.7 GiB 4.1 TiB 44.24 2.20 16  up     osd.24
30  hdd   7.27739   1.00000  7.3 TiB 2.6 TiB 2.6 TiB 3.0 GiB 8.2 GiB 4.7 TiB 35.42 1.76 13  up     osd.30
36  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 2.8 GiB 12 GiB  7.2 TiB 21.33 1.06 10  up     osd.36
42  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 1 KiB   4.1 GiB 7.8 TiB 14.14 0.70 6   up     osd.42
48  hdd   9.09569   1.00000  9.1 TiB 82 MiB  28 MiB  1 KiB   54 MiB  9.1 TiB 0     0    0   up     osd.48
54  hdd   9.09569   1.00000  9.1 TiB 663 GiB 657 GiB 2.9 GiB 3.2 GiB 8.4 TiB 7.11  0.35 4   up     osd.54
60  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 1 KiB   6.5 GiB 7.8 TiB 14.49 0.72 6   up     osd.60
66  hdd   9.09569   1.00000  9.1 TiB 3.0 TiB 3.0 TiB 2.8 GiB 14 GiB  6.1 TiB 33.06 1.65 13  up     osd.66
-13       109.14825 -        109 TiB 21 TiB  21 TiB  26 GiB  82 GiB  88 TiB  19.27 0.96 -          host i03
4   hdd   9.09569   1.00000  9.1 TiB 659 GiB 657 GiB 1 KiB   2.6 GiB 8.5 TiB 7.08  0.35 3   up     osd.4
10  hdd   9.09569   1.00000  9.1 TiB 1.5 TiB 1.5 TiB 1 KiB   6.2 GiB 7.6 TiB 16.82 0.84 6   up     osd.10
17  hdd   9.09569   1.00000  9.1 TiB 2.5 TiB 2.5 TiB 2.8 GiB 10 GiB  6.6 TiB 27.52 1.37 10  up     osd.17
23  hdd   9.09569   1.00000  9.1 TiB 2.6 TiB 2.6 TiB 2.8 GiB 8.3 GiB 6.5 TiB 28.33 1.41 13  up     osd.23
29  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 1 KiB   5.3 GiB 7.8 TiB 14.16 0.71 6   up     osd.29
35  hdd   9.09569   1.00000  9.1 TiB 7.4 GiB 261 MiB 5.6 GiB 1.5 GiB 9.1 TiB 0.08  0.00 3   up     osd.35
41  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 2.8 GiB 7.0 GiB 7.2 TiB 21.28 1.06 10  up     osd.41
47  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 2.9 GiB 6.4 GiB 7.2 TiB 21.27 1.06 10  up     osd.47
53  hdd   9.09569   0.89999  9.1 TiB 2.8 TiB 2.8 TiB 2.9 GiB 14 GiB  6.3 TiB 31.07 1.55 13  up     osd.53
59  hdd   9.09569   1.00000  9.1 TiB 3.2 TiB 3.2 TiB 2.9 GiB 10 GiB  5.9 TiB 35.39 1.76 17  up     osd.59
65  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 1 KiB   4.7 GiB 7.8 TiB 14.13 0.70 6   up     osd.65
70  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 2.8 GiB 4.6 GiB 7.8 TiB 14.17 0.71 7   up     osd.70
-11       109.14825 -        109 TiB 19 TiB  19 TiB  26 GiB  77 GiB  90 TiB  17.46 0.87 -          host i04
2   hdd   9.09569   1.00000  9.1 TiB 2.6 TiB 2.6 TiB 2.8 GiB 8.6 GiB 6.5 TiB 28.32 1.41 13  up     osd.2
6   hdd   9.09569   1.00000  9.1 TiB 2.1 TiB 2.0 TiB 5.8 GiB 9.4 GiB 7.0 TiB 22.65 1.13 12  up     osd.6
14  hdd   9.09569   1.00000  9.1 TiB 2.8 TiB 2.8 TiB 2.9 GiB 13 GiB  6.3 TiB 30.53 1.52 14  up     osd.14
20  hdd   9.09569   1.00000  9.1 TiB 660 GiB 658 GiB 1 KiB   2.6 GiB 8.5 TiB 7.09  0.35 3   up     osd.20
26  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 2.8 GiB 6.5 GiB 7.2 TiB 21.23 1.06 10  up     osd.26
32  hdd   9.09569   1.00000  9.1 TiB 661 GiB 658 GiB 1 KiB   2.3 GiB 8.5 TiB 7.09  0.35 3   up     osd.32
39  hdd   9.09569   1.00000  9.1 TiB 660 GiB 657 GiB 1 KiB   2.2 GiB 8.5 TiB 7.08  0.35 3   up     osd.39
45  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 2.8 GiB 4.5 GiB 7.8 TiB 14.12 0.70 7   up     osd.45
51  hdd   9.09569   1.00000  9.1 TiB 7.3 GiB 152 MiB 5.8 GiB 1.4 GiB 9.1 TiB 0.08  0.00 2   up     osd.51
57  hdd   9.09569   1.00000  9.1 TiB 2.6 TiB 2.6 TiB 2.9 GiB 9.1 GiB 6.5 TiB 28.33 1.41 13  up     osd.57
64  hdd   9.09569   1.00000  9.1 TiB 2.0 TiB 2.0 TiB 1 KiB   9.5 GiB 7.1 TiB 21.78 1.08 9   up     osd.64
69  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 1 KiB   8.0 GiB 7.2 TiB 21.27 1.06 9   up     osd.69
-9        109.14825 -        109 TiB 22 TiB  22 TiB  20 GiB  90 GiB  87 TiB  20.02 1.00 -          host i05
1   hdd   9.09569   1.00000  9.1 TiB 3.2 TiB 3.2 TiB 2.9 GiB 13 GiB  5.9 TiB 35.41 1.76 16  up     osd.1
9   hdd   9.09569   1.00000  9.1 TiB 658 GiB 656 GiB 1 KiB   2.3 GiB 8.5 TiB 7.07  0.35 3   up     osd.9
15  hdd   9.09569   1.00000  9.1 TiB 3.6 GiB 100 MiB 2.9 GiB 611 MiB 9.1 TiB 0.04  0.00 1   up     osd.15
21  hdd   9.09569   1.00000  9.1 TiB 1.4 TiB 1.4 TiB 5.7 GiB 6.2 GiB 7.7 TiB 15.06 0.75 8   up     osd.21
27  hdd   9.09569   1.00000  9.1 TiB 730 GiB 725 GiB 1 KiB   4.2 GiB 8.4 TiB 7.83  0.39 3   up     osd.27
34  hdd   9.09569   1.00000  9.1 TiB 2.7 TiB 2.7 TiB 2.8 GiB 11 GiB  6.4 TiB 29.94 1.49 13  up     osd.34
40  hdd   9.09569   1.00000  9.1 TiB 658 GiB 656 GiB 1 KiB   2.4 GiB 8.5 TiB 7.07  0.35 3   up     osd.40
46  hdd   9.09569   1.00000  9.1 TiB 2.8 TiB 2.8 TiB 1 KiB   9.3 GiB 6.3 TiB 30.39 1.51 13  up     osd.46
52  hdd   9.09569   1.00000  9.1 TiB 2.0 TiB 2.0 TiB 1 KiB   7.4 GiB 7.1 TiB 22.23 1.11 9   up     osd.52
58  hdd   9.09569   1.00000  9.1 TiB 2.1 TiB 2.0 TiB 2.8 GiB 8.9 GiB 7.0 TiB 22.61 1.13 10  up     osd.58
63  hdd   9.09569   0.89999  9.1 TiB 3.0 TiB 3.0 TiB 1 KiB   12 GiB  6.1 TiB 33.09 1.65 12  up     osd.63
68  hdd   9.09569   1.00000  9.1 TiB 2.7 TiB 2.7 TiB 3.0 GiB 12 GiB  6.4 TiB 29.50 1.47 14  up     osd.68
-7        107.32996 -        107 TiB 26 TiB  25 TiB  17 GiB  101 GiB 82 TiB  23.84 1.19 -          host i06
3   hdd   9.09569   0.29999  9.1 TiB 439 GiB 432 GiB 1 KiB   7.4 GiB 8.7 TiB 4.72  0.23 0   up     osd.3
8   hdd   9.09569   1.00000  9.1 TiB 666 GiB 656 GiB 5.7 GiB 3.8 GiB 8.4 TiB 7.15  0.36 5   up     osd.8
16  hdd   9.09569   1.00000  9.1 TiB 3.2 TiB 3.2 TiB 1 KiB   9.7 GiB 5.9 TiB 35.35 1.76 15  up     osd.16
22  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 1 KiB   7.2 GiB 7.2 TiB 21.23 1.06 9   up     osd.22
28  hdd   9.09569   1.00000  9.1 TiB 1.9 TiB 1.9 TiB 1 KiB   6.5 GiB 7.2 TiB 21.25 1.06 9   up     osd.28
33  hdd   7.27739   1.00000  7.3 TiB 1.9 TiB 1.9 TiB 1 KiB   6.0 GiB 5.3 TiB 26.50 1.32 10  up     osd.33
37  hdd   9.09569   1.00000  9.1 TiB 3.9 TiB 3.9 TiB 5.7 GiB 14 GiB  5.2 TiB 42.54 2.12 20  up     osd.37
43  hdd   9.09569   1.00000  9.1 TiB 3.0 TiB 3.0 TiB 1 KiB   9.9 GiB 6.1 TiB 33.05 1.65 12  up     osd.43
49  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 2.9 GiB 4.7 GiB 7.8 TiB 14.20 0.71 7   up     osd.49
55  hdd   9.09569   1.00000  9.1 TiB 2.1 TiB 2.1 TiB 1 KiB   8.8 GiB 7.0 TiB 23.09 1.15 9   up     osd.55
61  hdd   9.09569   1.00000  9.1 TiB 1.3 TiB 1.3 TiB 1 KiB   6.4 GiB 7.8 TiB 14.30 0.71 6   up     osd.61
67  hdd   9.09569   0.89999  9.1 TiB 3.9 TiB 3.9 TiB 3.0 GiB 17 GiB  5.2 TiB 43.22 2.15 21  up     osd.67
                    TOTAL    639 TiB 128 TiB 128 TiB 137 GiB 507 GiB 510 TiB 20.08
MIN/MAX VAR: 0/2.20 STDDEV: 11.36
On 14.12.22 22:58, Frank Schilder wrote:
Hi Martin,
I can't find the output of
ceph osd df tree
ceph status
anywhere. I thought you posted it, but well. Could you please post the output of these commands?
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Martin Buss <mbuss7004@xxxxxxxxx>
Sent: 14 December 2022 22:02:43
To: Frank Schilder; ceph-users@xxxxxxx
Cc: Eugen Block
Subject: Re: Re: New pool created with 2048 pg_num not executed
Hi Frank,
thanks for coming in on this; setting target_max_misplaced_ratio to 1
does not help.
Regards,
Martin
On 14.12.22 21:32, Frank Schilder wrote:
Hi Eugen: déjà vu again?
I think the way autoscaler code in the MGRs interferes with operations is extremely confusing.
Could this be the same issue I and somebody else had a while ago? Even though autoscaler is disabled, there are parts of it in the MGR still interfering. One of the essential config options was target_max_misplaced_ratio, which needs to be set to 1 if you want to have all PGs created regardless of how many objects are misplaced.
The thread was https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/WST6K5A4UQGGISBFGJEZS4HFL2VVWW32
In addition, the PG splitting will stop if recovery IO is going on (some objects are degraded).
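A minimal sketch of the knob described above (assuming the option is read by the MGR, as in the linked thread; the value 1 effectively disables the misplaced-ratio throttle on PG splits):

```shell
# Let PG splitting proceed regardless of the fraction of misplaced objects
ceph config set mgr target_max_misplaced_ratio 1

# Confirm the running value
ceph config get mgr target_max_misplaced_ratio
```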
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Martin Buss <mbuss7004@xxxxxxxxx>
Sent: 14 December 2022 19:32
To: ceph-users@xxxxxxx
Subject: Re: New pool created with 2048 pg_num not executed
Will do; that will take another day or so.
Can this have anything to do with osd_pg_bits, which defaults to 6?
Some operators seem to be working with 8 or 11.
Can you explain what this option means? I could not quite understand it
from the documentation.
Thanks!
On 14.12.22 16:11, Eugen Block wrote:
Then I'd suggest waiting until the backfilling is done and then
reporting back if the PGs are still not created. I don't have
information about the ML admin, sorry.
Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
cephfs_data has been autoscaling while filling; the mismatched numbers
are a result of that autoscaling.
The cluster status is WARN as there is still some old stuff
backfilling on cephfs_data.
The issue is the newly created pool 9, cfs_data, which is stuck at 1152
pg_num.
PS: can you help me get in touch with the list admin so I can have
that post, including private info, deleted?
On 14.12.22 15:41, Eugen Block wrote:
I'm wondering why the cephfs_data pool has mismatching pg_num and
pgp_num:
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off
Does disabling the autoscaler leave it like that when you disable it
in the middle of scaling? What is the current 'ceph status'?
Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
Hi Eugen,
thanks, sure, below:
pg_num stuck at 1152 and pgp_num stuck at 1024
Regards,
Martin
ceph config set global mon_max_pg_per_osd 400
ceph osd pool create cfs_data 2048 2048 --pg_num_min 2048
pool 'cfs_data' created
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 187 pgp_num 59 autoscale_mode off
last_change 3099 lfor 0/3089/3096 flags hashpspool,bulk stripe_width
0 target_size_ratio 1 application cephfs
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode off
last_change 2942 lfor 0/0/123 flags hashpspool stripe_width 0
pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application
cephfs
pool 3 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 2943
flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1
application mgr
pool 9 'cfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 1152 pgp_num 1024 pg_num_target 2048
pgp_num_target 2048 autoscale_mode off last_change 3198 lfor
0/0/3198 flags hashpspool stripe_width 0 pg_num_min 2048
On 14.12.22 15:10, Eugen Block wrote:
Hi,
are there already existing pools in the cluster? Can you share your
'ceph osd df tree' as well as 'ceph osd pool ls detail'? It sounds
like ceph is trying to stay within the limit of mon_max_pg_per_osd
(default 250).
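As a rough sanity check of that limit (plain arithmetic, not a Ceph call): each PG in a replicated pool lands on `size` OSDs, so a single 2048-PG, size-3 pool on 71 OSDs contributes on average:

```python
pg_num = 2048  # PGs requested for the new pool
size = 3       # replicated size
osds = 71      # OSDs in this cluster

per_osd = pg_num * size / osds
print(round(per_osd, 1))  # 86.5 PG replicas per OSD from this pool alone
```

That is well under 250 for this pool by itself, but the limit applies to the total across all pools, including PGs still being created or moved.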
Regards,
Eugen
Zitat von Martin Buss <mbuss7004@xxxxxxxxx>:
Hi,
on quincy, I created a new pool
ceph osd pool create cfs_data 2048 2048
6 hosts 71 osds
autoscaler is off; I find it kind of strange that the pool is
created with pg_num 1152 and pgp_num 1024, with 2048 mentioned as
the new target. I cannot manage to actually make this pool reach
pg_num 2048 and pgp_num 2048.
What config option am I missing that does not allow me to grow the
pool to 2048? Although I specified that pg_num and pgp_num be the same,
they are not.
I would appreciate some help and guidance.
Thank you,
Martin
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx