Hi,

While trying to deploy a cluster with cephadm 19.2.1 and Docker 28.0.1, I get this error:

-------
# cephadm --image opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 bootstrap --mon-ip 10.248.35.143 --registry-json /root/reg.json --allow-fqdn-hostname --initial-dashboard-user admin --initial-dashboard-password P@ssw0rd1404 --dashboard-password-noupdate --ssh-user cephadmin
Verifying ssh connectivity using standard pubkey authentication ...
Adding key to cephadmin@localhost authorized_keys...
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
docker (/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 15d9eaee-fbe0-11ef-ad63-005056a83619
Verifying IP 10.248.35.143 port 3300 ...
Verifying IP 10.248.35.143 port 6789 ...
Mon IP `10.248.35.143` is in CIDR network `10.248.35.0/24`
Mon IP `10.248.35.143` is in CIDR network `10.248.35.0/24`
Internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Logging into custom registry.
Pulling custom registry login info from /root/reg.json.
Pulling container image opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1...
Non-zero exit code 125 from /bin/docker run --rm --ipc=host --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 -e NODE_NAME=opcpmfpsksa0403 opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 --version
ceph: stderr docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: can't copy bootstrap data to pipe: write init-p: broken pipe: unknown
ceph: stderr
ceph: stderr Run 'docker run --help' for more information
RuntimeError: Failed command: /bin/docker run --rm --ipc=host --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 -e NODE_NAME=opcpmfpsksa0403 opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 --version

***************
Cephadm hit an issue during cluster installation. Current cluster files will be deleted automatically.
To disable this behaviour you can pass the --no-cleanup-on-failure flag.
In case of any previous broken installation, users must use the following command to completely delete the broken cluster:

> cephadm rm-cluster --force --zap-osds --fsid <fsid>

for more information please refer to https://docs.ceph.com/en/latest/cephadm/operations/#purging-a-cluster
***************

Deleting cluster with fsid: 15d9eaee-fbe0-11ef-ad63-005056a83619
Traceback (most recent call last):
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 2628, in _rollback
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 446, in _default_image
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 2763, in command_bootstrap
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/cephadmlib/container_types.py", line 429, in run
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/cephadmlib/call_wrappers.py", line 310, in call_throws
RuntimeError: Failed command: /bin/docker run --rm --ipc=host --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 -e NODE_NAME=opcpmfpsksa0403 opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 --version

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 5581, in <module>
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 5569, in main
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 2657, in _rollback
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 4391, in _rm_cluster
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 4317, in get_ceph_cluster_count
FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/ceph'
-----

So I checked everything, including the docker, containerd, and runc versions, and found only this log
in journalctl -u docker:

-------
Mar 02 06:23:00 opcpmfpsksa0403 systemd[1]: Starting Docker Application Container Engine...
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:00.533468075Z" level=info msg="Starting up"
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:00.535018237Z" level=info msg="OTEL tracing is not configured, using no-op tracer provider"
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:00.566901059Z" level=info msg="[graphdriver] using prior storage driver: overlay2"
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:00.567653709Z" level=info msg="Loading containers: start."
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:01.325500958Z" level=info msg="Loading containers: done."
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:01.344752401Z" level=warning msg="Not using native diff for overlay2, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled" storage-driver=overlay2
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:01.344892780Z" level=info msg="Docker daemon" commit=bbd0a17 containerd-snapshotter=false storage-driver=overlay2 version=28.0.1
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:01.344968676Z" level=info msg="Initializing buildkit"
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:01.393298555Z" level=info msg="Completed buildkit initialization"
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:01.401141531Z" level=info msg="Daemon has completed initialization"
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]: time="2025-03-02T06:23:01.401227515Z" level=info msg="API listen on /run/docker.sock"
Mar 02 06:23:01 opcpmfpsksa0403 systemd[1]: Started Docker Application Container Engine.
Mar 08 05:42:50 opcpmfpsksa0403 dockerd[18363]: time="2025-03-08T05:42:50.468402014Z" level=error msg="copy stream failed" error="reading from a closed fifo" stream=stderr
Mar 08 05:42:50 opcpmfpsksa0403 dockerd[18363]: time="2025-03-08T05:42:50.468433232Z" level=error msg="copy stream failed" error="reading from a closed fifo" stream=stdout
Mar 08 05:42:51 opcpmfpsksa0403 dockerd[18363]: time="2025-03-08T05:42:51.103965419Z" level=error msg="Handler for POST /v1.48/containers/40b1295b9067eb570d01b1509c59593f29e7ad61fb61e8ed4a82166441d52d53/start returned error: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: can't copy bootstrap data to pipe: write init-p: broken pipe: unknown"
--------

We are running Oracle Linux 9.5 with kernel 5.15.0-305.176.4.el9uek.x86_64. We have searched everywhere but cannot work out what is happening. Does anybody know what is going on, or how to resolve it?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx