Re: Node Exporter keeps failing while upgrading cluster in air-gapped (isolated) environment.

I wouldn't worry about the image the config option gives you right now. The
one in your local registry looks like the same version. For isolated
deployments like this, the default options aren't going to work, as they'll
always point to images that require internet access to pull. I'd just
update the config setting to point at the image in your local registry and
then run `ceph orch redeploy node-exporter`.
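
For example, with the registry address and tag from your output below (a
sketch; adjust if your local tag differs):

ceph config set mgr mgr/cephadm/container_image_node_exporter 192.168.1.10:5000/prometheus/node-exporter:v1.5.0
ceph orch redeploy node-exporter

It may also be worth a quick sanity check that the host can actually pull
from the local registry, e.g. `podman pull 192.168.1.10:5000/prometheus/node-exporter:v1.5.0` on sky-blue.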

> We executed the following command, but it is in an error state:
> root@node-01:~# ceph config get mgr mgr/cephadm/container_image_node_exporter 192.168.1.10:5000/prometheus/node-exporter:v1.5.0


I don't think you can pass a new value to the `config get` command. You'll
have to run `config set` instead, and then verify it was updated properly
by running the `config get` command without the value argument.
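
That is, after the `config set`, running

ceph config get mgr mgr/cephadm/container_image_node_exporter

should now print 192.168.1.10:5000/prometheus/node-exporter:v1.5.0 instead
of the quay.io image.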

On Tue, Jul 16, 2024 at 12:09 PM Saif Mohammad <samdto987@xxxxxxxxx> wrote:

> Hello Adam,
>
> Thanks for the prompt response.
> We have the below image in our private registry for node-exporter:
>
> 192.168.1.10:5000/prometheus/node-exporter   v1.5.0   0da6a335fe13   19 months ago   22.5MB
>
> But upon Ceph upgrade, we are getting the mentioned image
> (quay.io/prometheus/node-exporter:v1.5.0) when executing the following command:
> root@node-01:~# ceph config get mgr mgr/cephadm/container_image_node_exporter
> quay.io/prometheus/node-exporter:v1.5.0
>
> I am a bit unsure which image I need to work with: the one in the
> private registry (192.168.1.10:5000/prometheus/node-exporter), or the one
> we get after executing the "ceph config" command, i.e.
> quay.io/prometheus/node-exporter:v1.5.0.
>
>
>
> For every monitoring-stack component image, do we require internet access
> no matter where the image is located, even if it is in the private
> registry?
>
> We executed the following command, but it is in an error state:
> root@node-01:~# ceph config get mgr mgr/cephadm/container_image_node_exporter 192.168.1.10:5000/prometheus/node-exporter:v1.5.0
>
> Sharing some important outputs:
> root@node-01:~# ceph health detail
> HEALTH_WARN 1 failed cephadm daemon(s)
> [WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
>     daemon node-exporter.sky-blue on sky-blue is in error state
>
> Journalctl logs:
>
> Jul 16 12:01:18 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Failed with result 'exit-code'.
> Jul 16 12:01:18 sky-blue systemd[1]: Failed to start Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
> Jul 16 12:01:28 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Scheduled restart job, restart counter is at 4.
> Jul 16 12:01:28 sky-blue systemd[1]: Stopped Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
> Jul 16 12:01:28 sky-blue systemd[1]: Starting Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774...
> Jul 16 12:01:29 sky-blue bash[115638]: Trying to pull quay.io/prometheus/node-exporter:v1.5.0...
> Jul 16 12:02:29 sky-blue bash[115638]: Error: initializing source docker://quay.io/prometheus/node-exporter:v1.5.0: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 3.214.40.167:443: i/o timeout
> Jul 16 12:02:29 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Control process exited, code=exited, status=125/n/a
> Jul 16 12:02:29 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Failed with result 'exit-code'.
> Jul 16 12:02:29 sky-blue systemd[1]: Failed to start Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
> Jul 16 12:02:39 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Scheduled restart job, restart counter is at 5.
> Jul 16 12:02:39 sky-blue systemd[1]: Stopped Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
> Jul 16 12:02:39 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Start request repeated too quickly.
> Jul 16 12:02:39 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Failed with result 'exit-code'.
> Jul 16 12:02:39 sky-blue systemd[1]: Failed to start Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
>
> It is attempting to start the following image:
> quay.io/prometheus/node-exporter:v1.5.0
>
> root@sky-blue:/var/lib/ceph/3d444cb6-435b-11ef-bdc9-f30dc5fa2774/node-exporter.sky-blue# cat unit.run
> set -e
> # node-exporter.sky-blue
> ! /usr/bin/podman rm -f ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter.sky-blue 2> /dev/null
> ! /usr/bin/podman rm -f ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter-sky-blue 2> /dev/null
> ! /usr/bin/podman rm -f --storage ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter-sky-blue 2> /dev/null
> ! /usr/bin/podman rm -f --storage ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter.sky-blue 2> /dev/null
> /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --init --name ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter-sky-blue --user 65534 --security-opt label=disable -d --log-driver journald --conmon-pidfile /run/ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service-pid --cidfile /run/ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service-cid --cgroups=split -e CONTAINER_IMAGE=quay.io/prometheus/node-exporter:v1.5.0 -e NODE_NAME=ocean-bay -e CEPH_USE_RANDOM_NONCE=1 -v /var/lib/ceph/3d444cb6-435b-11ef-bdc9-f30dc5fa2774/node-exporter.sky-blue/etc/node-exporter:/etc/node-exporter:Z -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/rootfs:ro -v /etc/hosts:/etc/hosts:ro quay.io/prometheus/node-exporter:v1.5.0 --no-collector.timex --web.listen-address=:9100 --path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/rootfs
>
>
> Please guide us on how to resolve this.
>
> Regards,
> Mohammad Saif
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



