Hi,
On 20/05/2024 17:29, Anthony D'Atri wrote:
On May 20, 2024, at 12:21 PM, Matthew Vernon <mvernon@xxxxxxxxxxxxx> wrote:
This has left me with a single sad pg:
[WRN] PG_AVAILABILITY: Reduced data availability: 1 pg inactive
pg 1.0 is stuck inactive for 33m, current state unknown, last acting []
.mgr pool perhaps.
I think so.
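(For reference, the number before the dot in a PG ID is the pool ID, so

ceph osd lspools

should confirm whether pool 1 is indeed .mgr; `ceph pg 1.0 query` might also be informative, though with no acting set it may not return much.)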
ceph osd tree shows that CRUSH picked up my racks OK, e.g.:
-3 45.11993 rack B4
-2 45.11993 host moss-be1001
1 hdd 3.75999 osd.1 up 1.00000 1.00000
Please send the entire first 10 lines or so of `ceph osd tree`
root@moss-be1001:/# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-7 176.11194 rack F3
-6 176.11194 host moss-be1003
2 hdd 7.33800 osd.2 up 1.00000 1.00000
3 hdd 7.33800 osd.3 up 1.00000 1.00000
6 hdd 7.33800 osd.6 up 1.00000 1.00000
9 hdd 7.33800 osd.9 up 1.00000 1.00000
12 hdd 7.33800 osd.12 up 1.00000 1.00000
13 hdd 7.33800 osd.13 up 1.00000 1.00000
16 hdd 7.33800 osd.16 up 1.00000 1.00000
19 hdd 7.33800 osd.19 up 1.00000 1.00000
I passed this config to bootstrap with --config:
[global]
osd_crush_chooseleaf_type = 3
Why did you set that? 3 is an unusual value. AIUI most of the time the only reason to change this option is if one is setting up a single-node sandbox - and perhaps localpools create a rule using it. I suspect this is at least part of your problem.
I wanted to have rack as the failure domain rather than host, i.e. to
ensure that each replica goes in a different rack (academic at the
moment, as I have 3 hosts, one in each rack, but important for future
expansion).
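(Rack is type 3 in the default CRUSH type hierarchy (osd=0, host=1, chassis=2, rack=3, ...), which is why I picked that value.) If the bootstrap option turns out to be the culprit, my understanding is that the same goal could be reached after bootstrap with a rack-level replicated rule instead, roughly (the rule name here is just illustrative):

ceph osd crush rule create-replicated replicated_rack default rack
ceph osd pool set <pool> crush_rule replicated_rack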
Once the cluster was up I used an osd spec file that looked like:
service_type: osd
service_id: rrd_single_NVMe
placement:
label: "NVMe"
spec:
data_devices:
rotational: 1
db_devices:
model: "NVMe"
Is it your intent to use spinners for payload data and SSD for metadata?
Yes.
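To double-check that, something along these lines should work (osd.2 is just an example ID from the tree above, and osd_spec.yml stands in for whatever the spec file is actually called):

ceph orch apply osd -i osd_spec.yml --dry-run   # preview which data/db devices the spec would pick up
ceph osd metadata 2 | grep bluefs_db            # if a separate DB device was created, bluefs_db_rotational should be 0 (i.e. on the NVMe)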
Regards,
Matthew
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx