Re: OSD stuck down


 



Hi,

did you check the MON logs? They should contain some information about why the OSD was marked down and out. You could also just try to mark it in yourself; does that change anything?

$ ceph osd in 34
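To see when and why the MONs marked the OSD down, something like the following may help (just a sketch; `mon.balin` is an example daemon name taken from your `ceph -s` output below, adjust to whichever MON you query, and these commands obviously need to run against your live cluster):

```
# Show recent cluster log entries mentioning osd.34
$ ceph log last 1000 info cluster | grep osd.34

# Or search a MON's own (journald-backed) log via cephadm
$ cephadm logs --name mon.balin | grep -i osd.34
```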

I would also take another look into the OSD logs:

cephadm logs --name osd.34
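It can also be worth checking whether the daemon actually responds and what state the OSD map records for it (again only a sketch, adjust the ID as needed):

```
# Does the OSD daemon answer on its admin interface?
$ ceph tell osd.34 version

# What does the OSD map say about it (up/down, in/out, weight)?
$ ceph osd dump | grep '^osd.34'

# Which host and addresses does the cluster associate with it?
$ ceph osd find 34
```

If `ceph tell osd.34 version` hangs or fails while the container is running, that would point at a network/heartbeat problem rather than a crashed daemon.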



Quoting Nicola Mori <mori@xxxxxxxxxx>:

Dear Ceph users,

after a host reboot one of the OSDs is now stuck down (and out). I tried several times to restart it and even to reboot the host, but it still remains down.

# ceph -s
  cluster:
    id:     b1029256-7bb3-11ec-a8ce-ac1f6b627b45
    health: HEALTH_WARN
            4 OSD(s) have spurious read errors
            (muted: OSD_SLOW_PING_TIME_BACK OSD_SLOW_PING_TIME_FRONT)

  services:
    mon: 5 daemons, quorum bofur,balin,aka,romolo,dwalin (age 16h)
    mgr: bofur.tklnrn(active, since 16h), standbys: aka.wzystq, balin.hvunfe
    mds: 2/2 daemons up, 1 standby
    osd: 104 osds: 103 up (since 16h), 103 in (since 13h); 4 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 529 pgs
    objects: 18.85M objects, 41 TiB
    usage:   56 TiB used, 139 TiB / 195 TiB avail
    pgs:     68130/150150628 objects misplaced (0.045%)
             522 active+clean
             4   active+remapped+backfilling
             3   active+clean+scrubbing+deep

  io:
    recovery: 46 MiB/s, 21 objects/s



The host is reachable (its other OSDs are in) and from the systemd logs of the OSD I don't see anything wrong:

$ sudo systemctl status ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@osd.34
● ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@osd.34.service - Ceph osd.34 for b1029256-7bb3-11ec-a8ce-ac1f6b627b45
   Loaded: loaded (/etc/systemd/system/ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2023-06-12 17:00:25 CEST; 15h ago
 Main PID: 36286 (bash)
    Tasks: 11 (limit: 152154)
   Memory: 20.0M
   CGroup: /system.slice/system-ceph\x2db1029256\x2d7bb3\x2d11ec\x2da8ce\x2dac1f6b627b45.slice/ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45@osd.34.service
           ├─36286 /bin/bash /var/lib/ceph/b1029256-7bb3-11ec-a8ce-ac1f6b627b45/osd.34/unit.run
           └─36657 /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-osd --privileged --group-add=disk --init --name ceph-b1029256-7bb3-11ec-a8ce-ac1f6b627b45-osd-34 --pids-limit=0 -e CONTAINER_IMAGE=snack14/ceph-wizard@sha>

Jun 12 17:00:25 balin systemd[1]: Started Ceph osd.34 for b1029256-7bb3-11ec-a8ce-ac1f6b627b45.
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-34
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-34 --no-mon-config --dev /dev/mapper/ceph--9a4c3927--d3da--4b49--80fe--6cdc00c7897c-osd--block--36d2f793--e5c7--4247--a314--bcc40389d50d
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -h ceph:ceph /dev/mapper/ceph--9a4c3927--d3da--4b49--80fe--6cdc00c7897c-osd--block--36d2f793--e5c7--4247--a314--bcc40389d50d
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-6
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/ln -s /dev/mapper/ceph--9a4c3927--d3da--4b49--80fe--6cdc00c7897c-osd--block--36d2f793--e5c7--4247--a314--bcc40389d50d /var/lib/ceph/osd/ceph-34/block
Jun 12 17:00:27 balin bash[36306]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-34
Jun 12 17:00:27 balin bash[36306]: --> ceph-volume raw activate successful for osd ID: 34
Jun 12 17:00:29 balin bash[36657]: debug 2023-06-12T15:00:29.066+0000 7f818e356540 -1 Falling back to public interface


I'd need some help to understand how to fix this.
Thank you,

Nicola


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



