Re: cephadm bootstrap failed with docker

Hi,

Googling a bit turns up several potential causes. Do you build your own container images, or did you just push the upstream image to your private registry? Some people got this working by changing their containerd version. If you share your versions of docker, containerd, runc etc., maybe someone could help. We use podman and haven't had any issues yet; maybe you could try podman to check whether it works? The commands below show one way to collect that information.
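
A minimal sketch of what to collect and post, assuming the standard docker, containerd and runc CLIs are on the PATH:

# docker version
# docker info | grep -iE 'containerd|runc|init'
# containerd --version
# runc --version
# uname -r

If you want to try podman, installing it and stopping docker should be enough for cephadm to pick it up on the next bootstrap attempt (I believe cephadm prefers podman when both are installed):

# dnf install -y podman
# systemctl disable --now docker docker.socket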

Quoting farhad kh <farhad.khedriyan@xxxxxxxxx>:

Hi, when trying to deploy a cluster with cephadm version 19.2.1 and Docker
version 28.0.1, I get this error:
-------
# cephadm    --image opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 bootstrap
--mon-ip 10.248.35.143 --registry-json /root/reg.json
 --allow-fqdn-hostname --initial-dashboard-user admin
--initial-dashboard-password P@ssw0rd1404   --dashboard-password-noupdate
 --ssh-user cephadmin
Verifying ssh connectivity using standard pubkey authentication ...
Adding key to cephadmin@localhost authorized_keys...
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
docker (/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 15d9eaee-fbe0-11ef-ad63-005056a83619
Verifying IP 10.248.35.143 port 3300 ...
Verifying IP 10.248.35.143 port 6789 ...
Mon IP `10.248.35.143` is in CIDR network `10.248.35.0/24`
Mon IP `10.248.35.143` is in CIDR network `10.248.35.0/24`
Internal network (--cluster-network) has not been provided, OSD replication
will default to the public_network
Logging into custom registry.
Pulling custom registry login info from /root/reg.json.
Pulling container image opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1...
Non-zero exit code 125 from /bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint ceph
--init -e CONTAINER_IMAGE=opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 -e
NODE_NAME=opcpmfpsksa0403 opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 --version
ceph: stderr docker: Error response from daemon: failed to create task for
container: failed to create shim task: OCI runtime create failed: runc
create failed: unable to start container process: can't copy bootstrap data
to pipe: write init-p: broken pipe: unknown
ceph: stderr
ceph: stderr Run 'docker run --help' for more information
RuntimeError: Failed command: /bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint ceph
--init -e CONTAINER_IMAGE=opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 -e
NODE_NAME=opcpmfpsksa0403 opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 --version


        ***************
        Cephadm hit an issue during cluster installation. Current cluster
files will be deleted automatically.
        To disable this behaviour you can pass the --no-cleanup-on-failure
flag. In case of any previous
        broken installation, users must use the following command to
completely delete the broken cluster:

        > cephadm rm-cluster --force --zap-osds --fsid <fsid>

        for more information please refer to
https://docs.ceph.com/en/latest/cephadm/operations/#purging-a-cluster
        ***************


Deleting cluster with fsid: 15d9eaee-fbe0-11ef-ad63-005056a83619
Traceback (most recent call last):
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 2628, in
_rollback
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 446, in
_default_image
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 2763, in
command_bootstrap
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/cephadmlib/container_types.py",
line 429, in run
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/cephadmlib/call_wrappers.py",
line 310, in call_throws
RuntimeError: Failed command: /bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint ceph
--init -e CONTAINER_IMAGE=opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 -e
NODE_NAME=opcpmfpsksa0403 opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 --version

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 5581, in
<module>
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 5569, in main
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 2657, in
_rollback
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 4391, in
_rm_cluster
  File "/tmp/tmpfe1bt8s9.cephadm.build/app/__main__.py", line 4317, in
get_ceph_cluster_count
FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/ceph'
-----

Because of that I checked everything, including the Docker, containerd and runc
versions, and the only relevant log I found is this, in journalctl -u docker:

-------
Mar 02 06:23:00 opcpmfpsksa0403 systemd[1]: Starting Docker Application
Container Engine...
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:00.533468075Z" level=info msg="Starting up"
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:00.535018237Z" level=info msg="OTEL tracing is not
configured, using no-op tracer provider"
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:00.566901059Z" level=info msg="[graphdriver] using
prior storage driver: overlay2"
Mar 02 06:23:00 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:00.567653709Z" level=info msg="Loading containers:
start."
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:01.325500958Z" level=info msg="Loading containers:
done."
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:01.344752401Z" level=warning msg="Not using native
diff for overlay2, this may cause degraded performance for building images:
kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled" storage-driver=overlay2
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:01.344892780Z" level=info msg="Docker daemon"
commit=bbd0a17 containerd-snapshotter=false storage-driver=overlay2
version=28.0.1
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:01.344968676Z" level=info msg="Initializing buildkit"
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:01.393298555Z" level=info msg="Completed buildkit
initialization"
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:01.401141531Z" level=info msg="Daemon has completed
initialization"
Mar 02 06:23:01 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-02T06:23:01.401227515Z" level=info msg="API listen on
/run/docker.sock"
Mar 02 06:23:01 opcpmfpsksa0403 systemd[1]: Started Docker Application
Container Engine.
Mar 08 05:42:50 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-08T05:42:50.468402014Z" level=error msg="copy stream failed"
error="reading from a closed fifo" stream=stderr
Mar 08 05:42:50 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-08T05:42:50.468433232Z" level=error msg="copy stream failed"
error="reading from a closed fifo" stream=stdout
Mar 08 05:42:51 opcpmfpsksa0403 dockerd[18363]:
time="2025-03-08T05:42:51.103965419Z" level=error msg="Handler for POST
/v1.48/containers/40b1295b9067eb570d01b1509c59593f29e7ad61fb61e8ed4a82166441d52d53/start
returned error: failed to create task for container: failed to create shim
task: OCI runtime create failed: runc create failed: unable to start
container process: can't copy bootstrap data to pipe: write init-p: broken
pipe: unknown"
--------
We use Oracle Linux 9.5 with kernel 5.15.0-305.176.4.el9uek.x86_64. We have
searched everywhere but cannot understand what is happening. Does anybody know
how to resolve this, or what is going on?
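
For reference, the exact command cephadm fails on can be re-run by hand to rule
out cephadm itself (image and node name taken from the output above):

# /bin/docker run --rm --ipc=host --stop-signal=SIGTERM --ulimit nofile=1048576 \
    --net=host --entrypoint ceph --init \
    -e CONTAINER_IMAGE=opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 \
    -e NODE_NAME=opcpmfpsksa0403 \
    opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 --version

If a simpler variant such as

# docker run --rm --entrypoint bash opkbhfpsksp0101.p.fnst/ceph/ceph:v19.2.1 -c 'echo ok'

fails with the same "write init-p: broken pipe" error, the problem would be in
docker/runc itself rather than in anything specific to the cephadm command line.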
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



