[root@ctplmon1 ~]# ceph osd dump | grep pool
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 320144 flags hashpspool stripe_width 0 pg_num_min 1 application mgr,mgr_devicehealth
pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 320144 lfor 0/18964/18962 flags hashpspool stripe_width 0 application rgw
pool 3 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 320144 lfor 0/127672/127670 flags hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 320144 lfor 0/59850/59848 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 320144 lfor 0/51538/51536 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
pool 6 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 315285 lfor 0/127830/127828 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
pool 7 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 320144 lfor 0/76474/76472 flags hashpspool stripe_width 0 application rgw
pool 9 'default.rgw.buckets.data' erasure profile ec-32-profile size 5 min_size 4 crush_rule 1 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode on last_change 320144 lfor 0/127784/214408 flags hashpspool,ec_overwrites stripe_width 12288 application rgw
pool 10 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 320144 flags hashpspool,bulk stripe_width 0 application cephfs
pool 11 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 4 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 320144 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs

---

Right now there are around 200 OSDs (5.5T) in the cluster, with around 25
waiting to be added.

Rok

On Mon, Dec 23, 2024 at 4:16 PM Anthony D'Atri <anthony.datri@xxxxxxxxx>
wrote:

>
> > autoscale_mode for pg is on for a particular pool
> > (default.rgw.buckets.data) and EC 3-2 is used. During the pool's
> > lifetime I've seen the PG number change automatically once,
>
> pg_num for a given pool likes to be a power of 2, so either the relative
> usage of pools or the overall cluster fillage has to change substantially
> for a change to be triggered in many cases.
>
> > but now I am also considering changing the PG number manually after
> > backfill completes.
>
> If you do, be sure to disable the autoscaler for that pool.
>
> > Right now pg_num 512 pgp_num 512 is used and I am considering changing
> > it to 1024. Do you think that would be too aggressive?
>
> Depends on how many OSDs you have and what the rest of the pools are
> like. Send us
>
> `ceph osd dump | grep pool`
>
> These days, assuming that your OSDs are BlueStore, chances are that going
> higher on pg_num won't cause issues.
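For reference, a minimal sketch of the commands this advice implies, using
the pool name from the dump above; the target of 1024 is the value under
discussion here, not a recommendation:

  # Turn the autoscaler off for this pool so it does not fight the
  # manual change.
  ceph osd pool set default.rgw.buckets.data pg_autoscale_mode off
  # Raise pg_num; on Nautilus and later releases, pgp_num is stepped
  # up gradually in the background to follow pg_num.
  ceph osd pool set default.rgw.buckets.data pg_num 1024

The split proceeds incrementally, so the new placement groups backfill
over time rather than all at once.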
>
> > Rok
> >
> > On Sun, Dec 22, 2024 at 8:46 PM Alwin Antreich <alwin.antreich@xxxxxxxx>
> > wrote:
> >
> >> Hi Rok,
> >>
> >> On Sun, 22 Dec 2024 at 20:19, Rok Jaklič <rjaklic@xxxxxxxxx> wrote:
> >>
> >>> First I tried osd reweight, waited a few hours, then osd crush
> >>> reweight, then pg-upmap from Laimis. The crush reweight seemed the
> >>> most effective, but not for all the OSDs I tried.
> >>>
> >>> Uh, I've probably set osd_max_backfills (ceph config set osd
> >>> osd_max_backfills) to a high number in the past; is it better to
> >>> reduce it to 1 in steps, since so much backfilling is already going
> >>> on?
> >>
> >> Every time a backfill finishes, a new one will be placed in the queue.
> >> The number of backfills won't go down as long as you don't lower the
> >> setting. You can adjust it and see whether it improves the backfill
> >> process or not (wait an hour or two).
> >>
> >>> Output of commands in attachment.
> >>
> >> There seems to be a low number of PGs for the RGW data pool compared
> >> to the number of OSDs, though whether this is really an issue depends
> >> on the EC profile and the size of a shard (`ceph pg <id> query`). In
> >> general the number of PGs matters, because too few of them makes each
> >> one grow larger. Backfilling a large PG then takes longer and more
> >> easily tilts the usage of OSDs, since the algorithm places PGs
> >> pseudo-randomly without taking their size into account.
> >>
> >> Should you need to adjust the number of PGs, I'd wait until the
> >> backfilling to the HDDs has finished, as the change will create more
> >> data movement.
> >>
> >> Cheers,
> >> Alwin
> >> croit GmbH, https://croit.io/
> >>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
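For reference, a minimal sketch of stepping the backfill throttle down as
suggested above; the values are illustrative, and note that on Quincy and
later with the mClock scheduler, changing osd_max_backfills may have no
effect unless osd_mclock_override_recovery_settings is enabled:

  # Check the current value first.
  ceph config get osd osd_max_backfills
  # Step it down gradually, watching recovery throughput for an hour
  # or two between changes.
  ceph config set osd osd_max_backfills 2
  ceph config set osd osd_max_backfills 1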