After carefully reading the cephadm code, I believe the adoption process
from the official document can be applied to my cluster with a few small
changes, and I have now successfully converted the cluster.

First of all, I changed the Docker image: I use lsync to sync changed
service files to an external volume, and I no longer share
/var/lib/containers with the host. Anyone interested can find the code
in the repo below:

https://github.com/yuchangyuan/cephadm-container

I generally followed the adoption process, but before adopting any Ceph
daemon I had to stop that daemon on the host manually, because the
"cephadm" command cannot stop a host daemon from inside the container.
The adoption command also refuses to start the generated Ceph daemon
service because firewalld.service is missing, so I had to start the
service manually.

For OSD adoption, I updated /etc/ceph/ceph.conf and removed all
'osd data' lines. Also, the original Docker image I used is
'ceph/ceph-daemon', which does not create a '/var/lib/ceph/osd/ceph-{ID}'
directory on the host, so I had to create that directory manually.

I do not use cephx auth; I have the following 3 lines of config in my
ceph.conf:

```
auth client required = none
auth cluster required = none
auth service required = none
```

but the output of "ceph config generate-minimal-conf" only includes the
first line, which makes the mgr and other daemons fail to start. So I
ran "ceph cephadm set-extra-ceph-conf" to insert the latter 2 lines.

Finally, a few questions remain: the newly deployed MDS uses
'docker.io/ceph/daemon-base:latest-pacific-devel', while the other
daemons (mon, mgr & osd) use 'quay.io/ceph/ceph:v16'. Why are different
images used, and how can I make the MDS use the
'quay.io/ceph/daemon-base:latest-pacific-devel' image?
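In case it helps anyone trying the same thing, the per-daemon sequence is
roughly the sketch below, plus the one-time extra-conf step at the end.
It is only an illustration of the order of the manual steps described
above: the OSD id, the legacy unit name and the /tmp/extra.conf path are
placeholders (my old units are NixOS-generated, so yours will differ),
and the exact "set-extra-ceph-conf" invocation may vary with your Ceph
version.

```
#!/bin/sh
# Rough sketch of the manual adoption steps for one OSD (id 3 as an example).
# Beforehand, remove any 'osd data' lines from /etc/ceph/ceph.conf.

# 1. Stop the legacy daemon on the host first; "cephadm adopt" cannot stop
#    it from inside the cephadm container. The unit name depends on how the
#    daemon was managed before (a plain ceph-osd@ unit is assumed here).
systemctl stop ceph-osd@3

# 2. OSD only: make sure the data directory exists on the host, since the
#    old ceph/ceph-daemon based setup never created it there.
mkdir -p /var/lib/ceph/osd/ceph-3

# 3. Adopt the daemon (run inside the cephadm container).
cephadm adopt --style legacy --name osd.3

# 4. Adoption refuses to start the generated unit because firewalld.service
#    is missing, so start it by hand (cephadm names the new unit
#    ceph-<fsid>@<daemon>.service).
fsid=$(ceph fsid)
systemctl start "ceph-${fsid}@osd.3.service"

# 5. One-time step with cephx disabled: push the two auth lines that
#    "ceph config generate-minimal-conf" drops.
cat > /tmp/extra.conf <<EOF
auth cluster required = none
auth service required = none
EOF
ceph cephadm set-extra-ceph-conf -i /tmp/extra.conf
```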
Yu Changyuan <reivzy@xxxxxxxxx> writes:

> I run a small ceph cluster(3 mon on 3 node, 7 osd on 2 node) at home
> with custom setup, and I think cephadm is the future, so I want to
> convert this cluster to cephadm.
>
> My cluster setup is complex compare to standard deployment, the cluster
> is created in early days, so the it is deployed manually, and later I
> make all ceph daemons run inside container(using ceph/daemon) with
> podman to decouple with the host system(is NixOS), and manage container
> startup with NixOS using systemd service(service file is generated with
> nix expression).
>
> I think some OS files need to be mutable to make cephadm work properly,
> for example, /etc/ceph/ceph.conf need to be writable by cephadm. This
> is how we config most Linux distros, but not NixOS, which is basically
> all system files is immutable, include /etc.
>
> So I plan to run cephadm in a container, with "--privileged=true" and
> "--net=host", and ssh listen on port '23' to avoid conflict with host,
> and create a dummy 'ntp.service' which only run 'sleep inf' to cheat
> cephadm, because I have chrony on host system. Maybe /dev need to bind
> mount from host.
>
> I have already build the image and successfully run 'cephadm check-host'
> in the container. Official document for cephadm adoption
> process(https://docs.ceph.com/en/latest/cephadm/adoption/) lack details
> so I am not sure whether my unusual setup cluster can be convert
> successfully or not. so I need some suggestion for further steps of
> convertion.
>
> Below is some details of what I have already done:
>
> Dockerfile:
> ```
> FROM fedora:36
>
> RUN dnf -y install \
>       systemd openssh-server openssh-clients cephadm podman containernetworking-plugins && \
>     dnf clean all
>
> RUN ssh-keygen -f /etc/ssh/ssh_host_rsa_key -N '' -t rsa && \
>     ssh-keygen -f /etc/ssh/ssh_host_ed25519_key -N '' -t ed25519 && \
>     sed -i -e 's/^.*pam_loginuid.so.*$/session optional pam_loginuid.so/' /etc/pam.d/sshd && \
>     sed -i -e 's/^.*Port 22/Port 23/' /etc/ssh/sshd_config
>
> EXPOSE 23
>
> RUN (for i in \
>       systemd-network-generator.service \
>       rpmdb-migrate.service \
>       rpmdb-rebuild.service \
>       getty@tty1.service \
>       remote-fs.target \
>       systemd-resolved.service \
>       systemd-oomd.service \
>       systemd-network-generator.service \
>       dnf-makecache.timer \
>       fstrim.timer; do \
>       rm -f /etc/systemd/system/*.wants/$i; \
>     done)
>
> COPY ./ntp.service /etc/systemd/system
>
> RUN (cd /etc/systemd/system/multi-user.target.wants; ln -s ../ntp.service)
>
> RUN mkdir -p /etc/ceph && \
>     mkdir -p /var/lib/containers && \
>     mkdir -p /var/lib/ceph && \
>     mkdir -p /var/log/ceph && \
>     mkdir -p /root/.ssh && chown 700 /root/.ssh
>
> VOLUME /etc/ceph
> VOLUME /var/lib/containers
> VOLUME /var/lib/ceph
> VOLUME /var/log/ceph
> VOLUME /root/.ssh
>
> CMD ["/sbin/init"]
> ```
>
> and below is ntp.service file:
> ```
> [Unit]
> After=network.target
>
> [Service]
> ExecStart=/bin/sleep inf
> Restart=always
> Type=simple
> ```
>
> I start tag the image build from above Dockerfile with name 'cephadm',
> and "--security-opt=seccomp=unconfined" option is necessary for podman
> build to work.
>
> Then I start container with below script:
> ```
> #!/bin/sh
>
> mkdir -p /var/log/ceph
> mkdir -p /etc/ceph/ssh
>
> podman run --rm -d \
>     --net=host \
>     --privileged=true \
>     --name=cephadm \
>     -v /var/lib/containers:/var/lib/containers:z \
>     -v /var/lib/ceph:/var/lib/ceph:z \
>     -v /var/log/ceph:/var/log/ceph:z \
>     -v /etc/ceph:/etc/ceph:z \
>     -v /etc/ceph/ssh:/root/.ssh:z \
>     cephadm
> ```
>
> finally run "podman exec -it cephadm cephadm host-check" will generate
> below output:
> ```
> podman (/usr/bin/podman) version 4.0.3 is present
> systemctl is present
> lvcreate is present
> Unit ntp.service is enabled and running
> Host looks OK
> ```
>
> and logs in /var/log/ceph/cephadm.log is also looks good.

--
Best wishes ~

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx