# ceph orch ps | grep osd.34
osd.34    balin    running (14m)    108s ago    8M    75.3M    793M    17.2.6    b1a23658afad    5b9dbea262c7
# ceph osd tree | grep 34
34    hdd    1.81940    osd.34    down    0    1.00000

I really need help with this since I don't know what more to look at.

Thanks in advance,

Nicola

On 13/06/23 08:35, Nicola Mori wrote:
Dear Ceph users,

after a host reboot one of the OSDs is now stuck down (and out). I tried several times to restart it and even to reboot the host, but it still remains down.

# ceph -s
  cluster:
    id:     b1029256-7bb3-11ec-a8ce-ac1f6b627b45
    health: HEALTH_WARN
            4 OSD(s) have spurious read errors
            (muted: OSD_SLOW_PING_TIME_BACK OSD_SLOW_PING_TIME_FRONT)

  services:
    mon: 5 daemons, quorum bofur,balin,aka,romolo,dwalin (age 16h)
    mgr: bofur.tklnrn(active, since 16h), standbys: aka.wzystq, balin.hvunfe
    mds: 2/2 daemons up, 1 standby
    osd: 104 osds: 103 up (since 16h), 103 in (since 13h); 4 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 529 pgs
    objects: 18.85M objects, 41 TiB
    usage:   56 TiB used, 139 TiB / 195 TiB avail
    pgs:     68130/150150628 objects misplaced (0.045%)
             522 active+clean
             4   active+remapped+backfilling
             3   active+clean+scrubbing+deep

  io:
    recovery: 46 MiB/s, 21 objects/s

The host is reachable (its other OSDs are in) and from the systemd logs of the OSD I don't see anything wrong:

$ sudo systemctl status ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@osd.34
● ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@osd.34.service - Ceph osd.34 for b1029256-7bb3-11ec-a8ce-ac1f6b627b45
   Loaded: loaded (/etc/systemd/system/ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2023-06-12 17:00:25 CEST; 15h ago
 Main PID: 36286 (bash)
    Tasks: 11 (limit: 152154)
   Memory: 20.0M
   CGroup: /system.slice/system-ceph\x2db1029256\x2d7bb3\x2d11ec\x2da8ce\x2dac1f6b627b45.slice/ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@osd.34.service
           ├─36286 /bin/bash /var/lib/ceph/b1029256-7bb3-11ec-a8ce-ac1f6b627b45/osd.34/unit.run
           └─36657 /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-osd --privileged --group-add=disk --init --name ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45-osd-34 --pids-limit=0 -e CONTAINER_IMAGE=snack14/ceph-wizard@sha>

Jun 12 17:00:25 balin systemd[1]: Started Ceph osd.34 for b1029256-7bb3-11ec-a8ce-ac1f6b627b45.
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-34
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-34 --no-mon-config --dev /dev/mapper/ceph--9a4c3927--d3da--4b49--80fe--6cdc00c7897c-osd--block--36d2f793--e5c7--4247--a314--bcc40389d50d
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -h ceph:ceph /dev/mapper/ceph--9a4c3927--d3da--4b49--80fe--6cdc00c7897c-osd--block--36d2f793--e5c7--4247--a314--bcc40389d50d
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-6
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/ln -s /dev/mapper/ceph--9a4c3927--d3da--4b49--80fe--6cdc00c7897c-osd--block--36d2f793--e5c7--4247--a314--bcc40389d50d /var/lib/ceph/osd/ceph-34/block
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-34
Jun 12 17:00:27 balin bash[36306]: --> ceph-volume raw activate successful for osd ID: 34
Jun 12 17:00:29 balin bash[36657]: debug 2023-06-12T15:00:29.066+0000 7f818e356540 -1 Falling back to public interface

I'd need some help to understand how to fix this.

Thank you,

Nicola
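[Note: "ceph orch ps" only reports the systemd unit / container state, while "ceph osd tree" reflects the monitors' OSD map, so a daemon can show as "running" yet never finish booting into the cluster and be marked up. The OSD's own log usually says where it is stuck. A few checks that might help narrow this down; the daemon, unit, and container names are taken from the output above, and whether the container output lands in the journal or in docker depends on how logging is configured on the host:

# cephadm logs --name osd.34                                        <- journal of the osd.34 unit on balin
# journalctl -u ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@osd.34.service
# docker logs ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45-osd-34      <- container stdout/stderr, if not in the journal
# ceph osd dump | grep '^osd.34 '                                   <- what the monitors currently record for osd.34
# cephadm enter --name osd.34
# ceph daemon osd.34 status                                         <- admin socket, run from inside the container

This is only a sketch of where to look next, not a fix.]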
--
Nicola Mori, Ph.D.
INFN sezione di Firenze
Via Bruno Rossi 1, 50019 Sesto F.no (Italy)
+390554572660
mori@xxxxxxxxxx