Most OSDs down and all PGs unknown after P2V migration

I run a small single-node Ceph cluster for home file storage, deployed
with cephadm (I had planned to scale it out once I got more equipment).
It was running bare-metal, and I attempted a physical-to-virtual
migration to a Proxmox VM. Since then, all of my PGs show as "unknown".
Initially after a boot the OSDs appear to be up, but after a while most
of them go down; I assume some sort of timeout in the OSD start
process. The systemd units (and podman containers) are still running
and appear to be happy, and I don't see anything unusual in their logs.
I'm relatively new to Ceph, so I don't really know where to go from
here. Can anyone provide any guidance?
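
For reference, this is roughly how I've been checking the daemons so
far (osd.3 is just an example; the unit and container names are the
ones from the systemd output further down):

```
# Check the systemd unit and its podman container
systemctl status ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service
podman ps | grep osd.3

# Follow the daemon's journal
journalctl -u ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service -f

# cephadm can also pull the container logs directly
cephadm logs --fsid 768819b0-a83f-11ee-81d6-74563c5bfc7b --name osd.3
```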

The logs for the monitor and one of the OSDs can be found here -
https://gitlab.com/-/snippets/4793143
And here is the output from a few commands that might be useful. 

ceph -s
```
  cluster:
    id:     768819b0-a83f-11ee-81d6-74563c5bfc7b
    health: HEALTH_WARN
            Reduced data availability: 545 pgs inactive
            139 pgs not deep-scrubbed in time
            17 slow ops, oldest one blocked for 1668 sec, mon.fileserver has slow ops

  services:
    mon: 1 daemons, quorum fileserver (age 28m)
    mgr: fileserver.rgtdvr(active, since 28m), standbys: fileserver.gikddq
    osd: 17 osds: 5 up (since 116m), 5 in (since 10m)

  data:
    pools:   3 pools, 545 pgs
    objects: 1.97M objects, 7.5 TiB
    usage:   7.7 TiB used, 1.4 TiB / 9.1 TiB avail
    pgs:     100.000% pgs unknown
             545 unknown
```

ceph osd df
```
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP    META     AVAIL    %USE   VAR   PGS  STATUS
 0    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0    down
 1    hdd  3.63869         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0    down
 3    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  112    down
 4    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  117    down
 5    hdd  3.63869         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0    down
 6    hdd  3.63869         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0    down
 7    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0    0    down
 8    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  106    down
20    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  115    down
21    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0   94    down
22    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0   98    down
23    hdd  1.81940         0      0 B      0 B      0 B     0 B      0 B      0 B      0     0  109    down
24    hdd  1.81940   1.00000  1.8 TiB  1.6 TiB  1.6 TiB   4 KiB  3.0 GiB  186 GiB  90.00  1.06  117      up
25    hdd  1.81940   1.00000  1.8 TiB  1.6 TiB  1.6 TiB  10 KiB  2.8 GiB  220 GiB  88.18  1.04  114      up
26    hdd  1.81940   1.00000  1.8 TiB  1.5 TiB  1.5 TiB   9 KiB  2.8 GiB  297 GiB  84.07  0.99  109      up
27    hdd  1.81940   1.00000  1.8 TiB  1.4 TiB  1.4 TiB   7 KiB  2.5 GiB  474 GiB  74.58  0.88   98      up
28    hdd  1.81940   1.00000  1.8 TiB  1.6 TiB  1.6 TiB  10 KiB  3.0 GiB  206 GiB  88.93  1.04  115      up
                       TOTAL  9.1 TiB  7.7 TiB  7.7 TiB  42 KiB   14 GiB  1.4 TiB  85.15
MIN/MAX VAR: 0.88/1.06  STDDEV: 5.65
```

ceph pg stat
```
545 pgs: 545 unknown; 7.5 TiB data, 7.7 TiB used, 1.4 TiB / 9.1 TiB avail
```

systemctl | grep ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b
```
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@xxxxxxxxxxxxxxxxxxxxxxxxxxxxice    loaded active     running   Ceph alertmanager.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@xxxxxxxxxxxxxxxxxxxxxxxxxxxxvice   loaded active     running   Ceph ceph-exporter.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@crash.fileserver.service           loaded active     running   Ceph crash.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@grafana.fileserver.service         loaded active     running   Ceph grafana.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@mgr.fileserver.gikddq.service      loaded active     running   Ceph mgr.fileserver.gikddq for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@mgr.fileserver.rgtdvr.service      loaded active     running   Ceph mgr.fileserver.rgtdvr for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@mon.fileserver.service             loaded active     running   Ceph mon.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.0.service                      loaded active     running   Ceph osd.0 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.1.service                      loaded active     running   Ceph osd.1 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.20.service                     loaded active     running   Ceph osd.20 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.21.service                     loaded active     running   Ceph osd.21 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.22.service                     loaded active     running   Ceph osd.22 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.23.service                     loaded active     running   Ceph osd.23 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.24.service                     loaded active     running   Ceph osd.24 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.25.service                     loaded active     running   Ceph osd.25 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.26.service                     loaded active     running   Ceph osd.26 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.27.service                     loaded active     running   Ceph osd.27 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.28.service                     loaded active     running   Ceph osd.28 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service                      loaded active     running   Ceph osd.3 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.4.service                      loaded active     running   Ceph osd.4 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.5.service                      loaded active     running   Ceph osd.5 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.6.service                      loaded active     running   Ceph osd.6 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.7.service                      loaded active     running   Ceph osd.7 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.8.service                      loaded active     running   Ceph osd.8 for 768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@prometheus.fileserver.service      loaded active     running   Ceph prometheus.fileserver for 768819b0-a83f-11ee-81d6-74563c5bfc7b
system-ceph\x2d768819b0\x2da83f\x2d11ee\x2d81d6\x2d74563c5bfc7b.slice        loaded active     active    Slice /system/ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b
ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b.target                             loaded active     active    Ceph cluster 768819b0-a83f-11ee-81d6-74563c5bfc7b
```
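
In case it helps, this is roughly what I intend to watch on the next
reboot to catch the moment the OSDs get marked down (osd.3 again as an
example; the unit names are the ones listed above):

```
# Watch the cluster log and OSD up/down counts live
ceph -w

# In a second terminal, follow one OSD's journal while it drops out
journalctl -fu ceph-768819b0-a83f-11ee-81d6-74563c5bfc7b@osd.3.service

# Afterwards, list which OSDs the monitors consider down
ceph health detail
ceph osd tree
```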