Hello,
Well, yes: I think I have to edit the CRUSH rule and change its
item_name.
To be precise, I think I need to modify this bucket in the decompiled
CRUSH map:
root bmeta {
        id -4                   # do not change unnecessarily
        id -254 class hdd       # do not change unnecessarily
        id -256 class ssd       # do not change unnecessarily
        # weight 0.000
        alg straw2
        hash 0                  # rjenkins1
        item gor-bmeta weight 0.000
}
I think the root and the item have to be changed to match the other
host I moved the OSDs to.
What do you think?
Best,
Malte
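For readers following along, editing a decompiled CRUSH map is usually a
four-step round trip: export, decompile, edit by hand, then recompile and
inject. The sketch below only assembles the command lines and prints them;
it does not run anything. The file names and the helper name
crush_edit_round_trip are assumptions made for illustration, and whether
editing the map is the right fix for this cluster is not settled in the
thread.

```python
import shlex


def crush_edit_round_trip(workdir="."):
    """Return the argv lists for a CRUSH map export/edit/import round trip.

    Nothing is executed here; the caller decides whether to run the commands.
    The file names are hypothetical.
    """
    binary, text, rebuilt = (f"{workdir}/crush.{ext}" for ext in ("bin", "txt", "new"))
    return [
        ["ceph", "osd", "getcrushmap", "-o", binary],   # export the compiled map
        ["crushtool", "-d", binary, "-o", text],        # decompile to editable text
        # ... edit the text file by hand at this point ...
        ["crushtool", "-c", text, "-o", rebuilt],       # recompile the edited text
        ["ceph", "osd", "setcrushmap", "-i", rebuilt],  # inject the new map
    ]


steps = crush_edit_round_trip("/tmp")
for argv in steps:
    print(shlex.join(argv))
```

Keeping the original binary map around makes it possible to inject it
again if the edited map behaves unexpectedly.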
On 18.10.23 at 11:30, Malte Stroem wrote:
Hello Eugen,
I was wrong. I am sorry.
The PGs are not empty and orphaned.
Most of the PGs are empty but a few are indeed used.
And the pool for these PGs is still there. It is the metadata pool of
the erasure coded pool for RBDs. The cache tier pool was removed
successfully.
So now we need to empty these OSDs.
The device class was SSD. I changed it to HDD and moved the OSDs inside
the Crush tree to the other HDD OSDs of the host.
I need to move the PGs away from the OSDs to other OSDs but I do not
know how to do it.
Is using pg-upmap the solution?
Is using the objectstore-tool the solution?
Is moving the OSDs inside Crush to the right place the solution?
Is migrating the metadata pool to another pool with a different CRUSH
rule the solution?
The crush rule of this metadata pool looks like this:
{
    "rule_id": 8,
    "rule_name": "rbd-meta",
    "ruleset": 6,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -4,
            "item_name": "bmeta"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
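To make explicit what this rule does, here is a minimal sketch that reads
the rule dump and reports which root bucket it takes from and which bucket
type is used as the failure domain. The JSON is copied from this message;
the helper name summarize_rule is an assumption made for illustration, and
the sketch only handles simple rules with one take step and one choose
step.

```python
import json

# Rule dump copied from the message; nothing here talks to a cluster.
RULE_JSON = """
{
    "rule_id": 8,
    "rule_name": "rbd-meta",
    "ruleset": 6,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {"op": "take", "item": -4, "item_name": "bmeta"},
        {"op": "chooseleaf_firstn", "num": 0, "type": "host"},
        {"op": "emit"}
    ]
}
"""


def summarize_rule(rule):
    """Return (root bucket name, failure-domain type) of a simple rule."""
    take = next(s for s in rule["steps"] if s["op"] == "take")
    choose = next(s for s in rule["steps"] if s["op"].startswith("choose"))
    return take["item_name"], choose["type"]


rule = json.loads(RULE_JSON)
root, domain = summarize_rule(rule)
print(f"rule {rule['rule_name']!r} takes from root {root!r}, "
      f"one replica per {domain}")
```

Read this way, the rule places every replica under the root bmeta, so the
pool stays on whatever OSDs sit below that root.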
When I stop one of the OSDs, the cluster status becomes degraded.
How can I move the PGs away from these OSDs?
How can I make the pool use other OSDs?
By changing the CRUSH rule?
Best,
Malte
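One way to read the last question is: create a second replicated rule
under a different root and point the metadata pool at it, so the PGs
backfill onto other OSDs. The sketch below only builds the two command
lines for that and prints them; it executes nothing. The pool name, the
new rule name, and the root "default" are hypothetical placeholders that
do not appear in this thread, and the thread does not confirm that this is
the right approach for this cluster.

```python
import shlex


def plan_rule_switch(pool, new_rule, root, failure_domain="host", device_class=None):
    """Build (but do not run) the commands to create a rule and assign it to a pool."""
    create = ["ceph", "osd", "crush", "rule", "create-replicated",
              new_rule, root, failure_domain]
    if device_class:
        # The device class is an optional trailing argument.
        create.append(device_class)
    assign = ["ceph", "osd", "pool", "set", pool, "crush_rule", new_rule]
    return [create, assign]


# All names below are assumptions for illustration only.
plan = plan_rule_switch("rbd-meta-pool", "rbd-meta-hdd", "default", "host", "hdd")
for argv in plan:
    print(shlex.join(argv))
```

Changing a pool's rule triggers data movement, so it is worth trying on a
test cluster first.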
On 05.10.23 at 11:35, Malte Stroem wrote:
Hello Eugen, Hello Joachim,
@Joachim: Interesting! And you got empty PGs, too? How did you solve
the problem?
@Eugen: This is one of our biggest clusters, and we are in the process
of migrating from Nautilus to Octopus and from CentOS to Ubuntu.
The cache tier pool's OSDs were still version 14 OSDs. Most of the
other OSDs are version 15 already.
So I tested the command:
ceph-objectstore-tool --data-path /path/to/osd --op remove --pgid 3.0 --force
in a test cluster environment and this worked fine.
But the test scenario was not comparable to our production environment,
and the PG there was not empty.
I have not yet found a way to reproduce the same situation in the test
scenario.
Best,
Malte
On 05.10.23 at 11:03, Eugen Block wrote:
I know, I know... but since we are already using it (for years) I
have to check how to remove it safely, maybe as long as we're on
Pacific. ;-)
Quoting Joachim Kraftmayer - ceph ambassador
<joachim.kraftmayer@xxxxxxxxx>:
@Eugen
We saw the same problems 8 years ago. I can only recommend
never using cache tiering in production.
At Cephalocon this was part of my talk and as far as I remember
cache tiering will also disappear from ceph soon.
Cache tiering has been deprecated in the Reef release as it has
lacked a maintainer for a very long time. This does not mean it will
be certainly removed, but we may choose to remove it without much
further notice.
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/
Regards, Joachim
___________________________________
ceph ambassador DACH
ceph consultant since 2012
Clyso GmbH - Premier Ceph Foundation Member
https://www.clyso.com/
On 05.10.23 at 10:02, Eugen Block wrote:
Which ceph version is this? I'm trying to understand how removing a
pool leaves the PGs of that pool... Do you have any logs or
something from when you removed the pool?
We'll have to deal with a cache tier in the foreseeable future too,
so this is quite relevant for us. Maybe I'll try to
reproduce it in a test cluster first.
Are those SSDs exclusively for the cache tier or are they used by
other pools as well? If they were used only for the cache tier you
should be able to just remove them without any risk. But as I said,
I'd rather try to understand before purging them.
Quoting Malte Stroem <malte.stroem@xxxxxxxxx>:
Hello Eugen,
yes, we followed the documentation and everything worked fine. The
cache is gone.
Removing the pool worked well. Everything is clean.
The PGs are empty and active+clean.
Possible solutions:
1.
ceph pg {pg-id} mark_unfound_lost delete
I do not think this is the right way, since it is meant for PGs in
the unfound state. But it might work anyway.
2.
Set the following for the three disks:
ceph osd lost {osd-id}
I am not sure how the cluster will react to this.
3.
ceph-objectstore-tool --data-path /path/to/osd --op remove --pgid 3.0 --force
Now, will the cluster accept the removed PG status?
4.
The three disks are still present in the crush rule, class ssd,
each OSD under its own host entry.
What if I remove them from crush?
Do you have a better idea, Eugen?
Best,
Malte
On 04.10.23 at 09:21, Eugen Block wrote:
Hi,
just for clarity, you're actually talking about the cache tier as
described in the docs [1]? And you followed the steps until 'ceph
osd tier remove cold-storage hot-storage' successfully? And the
pool has been really deleted successfully ('ceph osd pool ls
detail')?
[1]
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#removing-a-cache-tier
Quoting Malte Stroem <malte.stroem@xxxxxxxxx>:
Hello,
we removed an SSD cache tier and its pool.
The PGs of the pool still exist.
The cluster is healthy.
The PGs are empty and they reside on the cache tier pool's SSDs.
We would like to take out the disks, but that is not possible. The
cluster still sees the PGs and responds with a HEALTH_WARN.
Because of the replication factor of three, there are still 128 PGs on
three of the 24 OSDs. We were able to remove the other OSDs.
Summary:
- pool removed
- 3 x 128 empty PGs still exist
- 3 of 24 OSDs still exist
How is it possible to remove these empty and healthy PGs?
The only way I found was something like:
ceph pg {pg-id} mark_unfound_lost delete
Is that the right way?
Some output of:
ceph pg ls-by-osd 23
PG   OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  STATE         SINCE  VERSION  REPORTED         UP            ACTING        SCRUB_STAMP                      DEEP_SCRUB_STAMP
3.0  0        0         0          0        0      0            0           0    active+clean  27h    0'0      2627265:196316   [15,6,23]p15  [15,6,23]p15  2023-09-28T12:41:52.982955+0200  2023-09-27T06:48:23.265838+0200
3.1  0        0         0          0        0      0            0           0    active+clean  9h     0'0      2627266:19330    [6,23,15]p6   [6,23,15]p6   2023-09-29T06:30:57.630016+0200  2023-09-27T22:58:21.992451+0200
3.2  0        0         0          0        0      0            0           0    active+clean  2h     0'0      2627265:1135185  [23,15,6]p23  [23,15,6]p23  2023-09-29T13:42:07.346658+0200  2023-09-24T14:31:52.844427+0200
3.3  0        0         0          0        0      0            0           0    active+clean  13h    0'0      2627266:193170   [6,15,23]p6   [6,15,23]p6   2023-09-29T01:56:54.517337+0200  2023-09-27T17:47:24.961279+0200
3.4  0        0         0          0        0      0            0           0    active+clean  14h    0'0      2627265:2343551  [23,6,15]p23  [23,6,15]p23  2023-09-29T00:47:47.548860+0200  2023-09-25T09:39:51.259304+0200
3.5  0        0         0          0        0      0            0           0    active+clean  2h     0'0      2627265:194111   [15,6,23]p15  [15,6,23]p15  2023-09-29T13:28:48.879959+0200  2023-09-26T15:35:44.217302+0200
3.6  0        0         0          0        0      0            0           0    active+clean  6h     0'0      2627265:2345717  [23,15,6]p23  [23,15,6]p23  2023-09-29T09:26:02.534825+0200  2023-09-27T21:56:57.500126+0200
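Since the whole question depends on whether every remaining PG really is
empty, a listing like this one can be checked mechanically. The sketch
below parses the plain-text output, with each PG joined onto one line, and
lists the PGs whose OBJECTS column is zero. The two sample rows are copied
from this message; the helper name parse_pg_rows and the fixed column
positions are assumptions based on the header shown here and may differ
between Ceph releases.

```python
# Two rows from the listing, each joined onto a single line.
SAMPLE = """\
PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE SINCE VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP
3.0 0 0 0 0 0 0 0 0 active+clean 27h 0'0 2627265:196316 [15,6,23]p15 [15,6,23]p15 2023-09-28T12:41:52.982955+0200 2023-09-27T06:48:23.265838+0200
3.1 0 0 0 0 0 0 0 0 active+clean 9h 0'0 2627266:19330 [6,23,15]p6 [6,23,15]p6 2023-09-29T06:30:57.630016+0200 2023-09-27T22:58:21.992451+0200
"""


def parse_pg_rows(text):
    """Parse whitespace-separated PG rows; skip the header and short lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row starts with a PG id such as "3.0" and has at least 15 columns.
        if len(fields) < 15 or "." not in fields[0]:
            continue
        rows.append({
            "pgid": fields[0],
            "objects": int(fields[1]),
            "state": fields[9],
            "acting": fields[14],
        })
    return rows


rows = parse_pg_rows(SAMPLE)
empty = [r["pgid"] for r in rows if r["objects"] == 0]
print(f"{len(empty)} of {len(rows)} listed PGs hold no objects: {', '.join(empty)}")
```

Running the same check over the full listing of each remaining OSD would
show whether any PG still holds objects before an OSD is taken out.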
Best regards,
Malte
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx