HELP NEEDED: cephadm adopt osd crash

Hi,

We've already converted two PRODUCTION storage nodes running Octopus to cephadm without any problem.

On the third one, we only managed to convert one OSD.

[root@server4 osd]# cephadm adopt --style legacy --name osd.0
Found online OSD at //var/lib/ceph/osd/ceph-0/fsid
objectstore_type is bluestore
Stopping old systemd unit ceph-osd@0...
Disabling old systemd unit ceph-osd@0...
Moving data...
Chowning content...
Chowning /var/lib/ceph/<fsid replaced>/osd.0/block...
Renaming /etc/ceph/osd/0-2d973f03-82f3-499f-b5dc-d4c28dbe1b3d.json -> /etc/ceph/osd/0-2d973f03-82f3-499f-b5dc-d4c28dbe1b3d.json.adopted-by-cephadm
Disabling host unit ceph-volume@ simple unit...
Moving logs...
Creating new units...

For all the others, we get this error:

[root@server4 osd]# cephadm adopt --style legacy --name osd.17
Found online OSD at //var/lib/ceph/osd/ceph-17/fsid
objectstore_type is bluestore
Stopping old systemd unit ceph-osd@17...
Disabling old systemd unit ceph-osd@17...
Moving data...
Traceback (most recent call last):
  File "/sbin/cephadm", line 6251, in <module>
    r = args.func()
  File "/sbin/cephadm", line 1458, in _default_image
    return func()
  File "/sbin/cephadm", line 4027, in command_adopt
    command_adopt_ceph(daemon_type, daemon_id, fsid);
  File "/sbin/cephadm", line 4170, in command_adopt_ceph
    os.rmdir(data_dir_src)
OSError: [Errno 16] Device or resource busy: '//var/lib/ceph/osd/ceph-17'


The directory /var/lib/ceph/osd/ceph-17 is now empty.
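
Errno 16 on os.rmdir() usually means the path is still a mountpoint rather than an ordinary directory, so my guess is that the old data directory was still mounted when cephadm tried to remove it. If that guess is right, something like this should confirm and clear it (untested sketch on my side):

[root@server4 osd]# findmnt /var/lib/ceph/osd/ceph-17   # still a mountpoint? (my assumption)
[root@server4 osd]# umount /var/lib/ceph/osd/ceph-17    # only if findmnt shows a leftover mount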

The directory /var/lib/ceph/<fsid>/osd.17 contains:

[root@server4 osd.17]# ls -l
total 72
-rw-r--r-- 1 ceph ceph  411 Jan 29  2018 activate.monmap
-rw-r--r-- 1 ceph ceph    3 Jan 29  2018 active
lrwxrwxrwx 1 root root   10 Nov  8 15:54 block -> /dev/sdad2
-rw-r--r-- 1 ceph ceph   37 Jan 29  2018 block_uuid
-rw-r--r-- 1 ceph ceph    2 Jan 29  2018 bluefs
-rw-r--r-- 1 ceph ceph   37 Jan 29  2018 ceph_fsid
-rw-r--r-- 1 ceph ceph 1226 Nov  8 15:53 config
-rw-r--r-- 1 ceph ceph   37 Jan 29  2018 fsid
-rw------- 1 ceph ceph   57 Jan 29  2018 keyring
-rw-r--r-- 1 ceph ceph    8 Jan 29  2018 kv_backend
-rw-r--r-- 1 ceph ceph   21 Jan 29  2018 magic
-rw-r--r-- 1 ceph ceph    4 Jan 29  2018 mkfs_done
-rw-r--r-- 1 ceph ceph    6 Jan 29  2018 ready
-rw------- 1 ceph ceph    3 Nov  8 14:47 require_osd_release
-rw-r--r-- 1 ceph ceph    0 Jan 13  2020 systemd
-rw-r--r-- 1 ceph ceph   10 Jan 29  2018 type
-rw------- 1 root root   22 Nov  8 15:53 unit.image
-rw------- 1 root root 1042 Nov  8 16:30 unit.poststop
-rw------- 1 root root 1851 Nov  8 16:30 unit.run
-rw-r--r-- 1 ceph ceph    3 Jan 29  2018 whoami
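
Since block still points at /dev/sdad2, I assume the bluestore data itself is intact. If I understand the tooling correctly, the on-disk label can be double-checked with ceph-bluestore-tool (a read-only check, but treat this as a sketch):

[root@server4 osd.17]# ceph-bluestore-tool show-label --dev /dev/sdad2   # should print the OSD's fsid and whoami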

When trying to start or redeploy osd.17, podman inspect complains about a non-existent image:

2022-11-08 16:58:58,503 7f930fab3740 DEBUG Running command: /bin/podman inspect --format {{.Id}},{{.Config.Image}},{{.Image}},{{.Created}},{{index .Config.Labels "io.ceph.version"}} ceph-<fsid replaced>-osd.17
2022-11-08 16:58:58,591 7f930fab3740 DEBUG /bin/podman: stderr Error: error getting image "ceph-<fsid replaced>-osd.17": unable to find a name and tag match for ceph-<fsid replaced>-osd.17 in repotags: no such image
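
If the only remaining problem is that the container image is missing locally, my plan (a sketch based on the cephadm docs, not yet tried; <image> stands for whatever unit.image contains) would be:

[root@server4 osd.17]# cat unit.image                     # recorded container image name
[root@server4 osd.17]# podman pull <image>                # <image> = content of unit.image (placeholder)
[root@server4 osd.17]# cephadm unit --fsid <fsid> --name osd.17 start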

Is there a way to save osd.17 and manually create the podman image?

Thanks in advance,

Patrick


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



