Re: Make ceph orch daemons reboot safe

The other hosts are still online and the cluster only lost 1/3 of its services. 
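
To verify that, I'd check from one of the surviving hosts which mgr is
active (commands from memory, not from this exact session):

ceph mgr stat
ceph orch status
ceph -s

Presumably it's the active mgr on another host that keeps the orchestrator
usable.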



> On 16.09.2023 at 12:53, Eugen Block <eblock@xxxxxx> wrote:
> 
> I don’t have time to look into all the details, but I’m wondering how you seem to be able to start mgr services with the orchestrator if all mgr daemons are down. The orchestrator is a mgr module, so that’s a bit weird, isn’t it?
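> (A quick sanity check would be something like 'ceph mgr stat' or 'ceph orch
> status' from any host with an admin keyring -- if really no mgr were
> running, I'd expect those to report no active mgr or simply hang.)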
> 
> Quoting Boris Behrens <bb@xxxxxxxxx>:
> 
>> Hi Eugen,
>> the test clusters where we started with plain Ceph and where the adoption
>> went straightforward are working fine.
>> 
>> But this test cluster was all over the place: an old update via the
>> orchestrator was still in the pipeline, the adoption process had been
>> stopped a year ago and only got picked up again now, and so on and so forth.
>> 
>> But now we have it clean, at least we think it's clean.
>> 
>> After a reboot, the services are not available. I have to start them via
>> ceph orch:
>> root@0cc47a6df14e:~# systemctl list-units | grep ceph
>>  ceph-crash.service
>>                loaded active running   Ceph crash dump collector
>>  ceph-fuse.target
>>                loaded active active    ceph target allowing to start/stop
>> all ceph-fuse@.service instances at once
>>  ceph-mds.target
>>               loaded active active    ceph target allowing to start/stop
>> all ceph-mds@.service instances at once
>>  ceph-mgr.target
>>               loaded active active    ceph target allowing to start/stop
>> all ceph-mgr@.service instances at once
>>  ceph-mon.target
>>               loaded active active    ceph target allowing to start/stop
>> all ceph-mon@.service instances at once
>>  ceph-osd.target
>>               loaded active active    ceph target allowing to start/stop
>> all ceph-osd@.service instances at once
>>  ceph-radosgw.target
>>               loaded active active    ceph target allowing to start/stop
>> all ceph-radosgw@.service instances at once
>>  ceph.target
>>               loaded active active    All Ceph clusters and services
>> root@0cc47a6df14e:~# ceph orch start mgr
>> Scheduled to start mgr.0cc47a6df14e.nvjlcx on host '0cc47a6df14e'
>> Scheduled to start mgr.0cc47a6df330.aznjao on host '0cc47a6df330'
>> Scheduled to start mgr.0cc47aad8ce8.ifiydp on host '0cc47aad8ce8'
>> root@0cc47a6df14e:~# ceph orch start mon
>> Scheduled to start mon.0cc47a6df14e on host '0cc47a6df14e'
>> Scheduled to start mon.0cc47a6df330 on host '0cc47a6df330'
>> Scheduled to start mon.0cc47aad8ce8 on host '0cc47aad8ce8'
>> root@0cc47a6df14e:~# ceph orch start osd.all-flash-over-1tb
>> Scheduled to start osd.2 on host '0cc47a6df14e'
>> Scheduled to start osd.5 on host '0cc47a6df14e'
>> Scheduled to start osd.3 on host '0cc47a6df330'
>> Scheduled to start osd.0 on host '0cc47a6df330'
>> Scheduled to start osd.4 on host '0cc47aad8ce8'
>> Scheduled to start osd.1 on host '0cc47aad8ce8'
>> root@0cc47a6df14e:~# systemctl list-units | grep ceph
>>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df14e.nvjlcx.service
>>                                       loaded active running   Ceph
>> mgr.0cc47a6df14e.nvjlcx for 03977a23-f00f-4bb0-b9a7-de57f40ba853
>>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mon.0cc47a6df14e.service
>>                                        loaded active running   Ceph
>> mon.0cc47a6df14e for 03977a23-f00f-4bb0-b9a7-de57f40ba853
>>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.2.service
>>                                       loaded active running   Ceph osd.2
>> for 03977a23-f00f-4bb0-b9a7-de57f40ba853
>>  ceph-crash.service
>>                                        loaded active running   Ceph crash
>> dump collector
>>  system-ceph\x2d03977a23\x2df00f\x2d4bb0\x2db9a7\x2dde57f40ba853.slice
>>                                       loaded active active
>> system-ceph\x2d03977a23\x2df00f\x2d4bb0\x2db9a7\x2dde57f40ba853.slice
>>  ceph-fuse.target
>>                                        loaded active active    ceph target
>> allowing to start/stop all ceph-fuse@.service instances at once
>>  ceph-mds.target
>>                                       loaded active active    ceph target
>> allowing to start/stop all ceph-mds@.service instances at once
>>  ceph-mgr.target
>>                                       loaded active active    ceph target
>> allowing to start/stop all ceph-mgr@.service instances at once
>>  ceph-mon.target
>>                                       loaded active active    ceph target
>> allowing to start/stop all ceph-mon@.service instances at once
>>  ceph-osd.target
>>                                       loaded active active    ceph target
>> allowing to start/stop all ceph-osd@.service instances at once
>>  ceph-radosgw.target
>>                                       loaded active active    ceph target
>> allowing to start/stop all ceph-radosgw@.service instances at once
>>  ceph.target
>>                                       loaded active active    All Ceph
>> clusters and services
>> root@0cc47a6df14e:~# systemctl status
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df14e.nvjlcx.service
>> ● ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df14e.nvjlcx.service
>> - Ceph mgr.0cc47a6df14e.nvjlcx for 03977a23-f00f-4bb0-b9a7-de57f40ba853
>>     Loaded: loaded
>> (/etc/systemd/system/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@.service;
>> enabled; vendor preset: enabled)
>>     Active: active (running) since Sat 2023-09-16 09:18:53 UTC; 51s ago
>>    Process: 4828 ExecStartPre=/bin/rm -f
>> /run/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df14e.nvjlcx.service-pid
>> /run/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df1>
>>    Process: 4829 ExecStart=/bin/bash
>> /var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/mgr.0cc47a6df14e.nvjlcx/unit.run
>> (code=exited, status=0/SUCCESS)
>>   Main PID: 5132 (conmon)
>>      Tasks: 36 (limit: 309227)
>>     Memory: 512.0M
>>     CGroup:
>> /system.slice/system-ceph\x2d03977a23\x2df00f\x2d4bb0\x2db9a7\x2dde57f40ba853.slice/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df14e.nvjlcx.service
>>             ├─container
>>             │ ├─5136 /dev/init -- /usr/bin/ceph-mgr -n
>> mgr.0cc47a6df14e.nvjlcx -f --setuser ceph --setgroup ceph
>> --default-log-to-file=false --default-log-to-journald=true --default-log>
>>             │ └─5139 /usr/bin/ceph-mgr -n mgr.0cc47a6df14e.nvjlcx -f
>> --setuser ceph --setgroup ceph --default-log-to-file=false
>> --default-log-to-journald=true --default-log-to-stderr=fa>
>>             └─supervisor
>>               └─5132 /usr/libexec/podman/conmon --api-version 1 -c
>> 0165b4f78867ad284cc65fbece46013e6547a2f3ecf99cc7ffb8b720f705ee66 -u
>> 0165b4f78867ad284cc65fbece46013e6547a2f3ecf99cc7ff>
>> 
>> Sep 16 09:19:04 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> 2023-09-16T09:19:04.333+0000 7f4fcc0a91c0 -1 mgr[py] Module alert>
>> Sep 16 09:19:04 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> 2023-09-16T09:19:04.501+0000 7f4fcc0a91c0 -1 mgr[py] Module iosta>
>> Sep 16 09:19:05 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> 2023-09-16T09:19:05.249+0000 7f4fcc0a91c0 -1 mgr[py] Module orche>
>> Sep 16 09:19:05 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> 2023-09-16T09:19:05.481+0000 7f4fcc0a91c0 -1 mgr[py] Module rbd_s>
>> Sep 16 09:19:06 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> [16/Sep/2023:09:19:06] ENGINE Bus STARTING
>> Sep 16 09:19:06 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> CherryPy Checker:
>> Sep 16 09:19:06 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> The Application mounted at '' has an empty config.
>> Sep 16 09:19:06 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> Sep 16 09:19:06 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> [16/Sep/2023:09:19:06] ENGINE Serving on http://:::9283
>> Sep 16 09:19:06 0cc47a6df14e.f00f.gridscale.dev
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df14e-nvjlcx[5132]:
>> [16/Sep/2023:09:19:06] ENGINE Bus STARTED
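>>
>> What I still want to double-check (just a guess on my side) is whether the
>> per-cluster units are actually enabled for boot, roughly like this:
>>
>> root@0cc47a6df14e:~# systemctl is-enabled ceph.target
>> root@0cc47a6df14e:~# systemctl is-enabled ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target
>> root@0cc47a6df14e:~# systemctl is-enabled ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mon.0cc47a6df14e.service
>>
>> and, if any of those come back "disabled", enabling them by hand:
>>
>> root@0cc47a6df14e:~# systemctl enable ceph.target ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target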
>> 
>> This seems to be the cephadm log:
>> 
>> cephadm ['adopt', '--style', 'legacy', '--name', 'osd.3']
>> 2023-09-15 11:32:44,290 7fef7b041740 INFO Pulling container image
>> quay.io/ceph/ceph:v17...
>> 2023-09-15 11:32:47,128 7fef7b041740 INFO Found online OSD at
>> //var/lib/ceph/osd/ceph-3/fsid
>> 2023-09-15 11:32:47,129 7fef7b041740 INFO objectstore_type is bluestore
>> 2023-09-15 11:32:47,150 7fef7b041740 INFO Stopping old systemd unit
>> ceph-osd@3...
>> 2023-09-15 11:32:48,560 7fef7b041740 INFO Disabling old systemd unit
>> ceph-osd@3...
>> 2023-09-15 11:32:49,157 7fef7b041740 INFO Moving data...
>> 2023-09-15 11:32:49,158 7fef7b041740 DEBUG move file
>> '//var/lib/ceph/osd/ceph-3/require_osd_release' ->
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/require_osd_release'
>> 2023-09-15 11:32:49,158 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/require_osd_release`
>> 2023-09-15 11:32:49,158 7fef7b041740 DEBUG symlink
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/block' ->
>> '/dev/ceph-66d3bb27-cd5c-4897-aa76-684bc46d1c8b/osd-block-4bfc2101-e9b2-468d-8f54-a05f080ebdfe'
>> 2023-09-15 11:32:49,158 7fef7b041740 DEBUG move file
>> '//var/lib/ceph/osd/ceph-3/ready' ->
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/ready'
>> 2023-09-15 11:32:49,159 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/ready`
>> 2023-09-15 11:32:49,159 7fef7b041740 DEBUG move file
>> '//var/lib/ceph/osd/ceph-3/type' ->
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/type'
>> 2023-09-15 11:32:49,159 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/type`
>> 2023-09-15 11:32:49,159 7fef7b041740 DEBUG move file
>> '//var/lib/ceph/osd/ceph-3/fsid' ->
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/fsid'
>> 2023-09-15 11:32:49,159 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/fsid`
>> 2023-09-15 11:32:49,160 7fef7b041740 DEBUG move file
>> '//var/lib/ceph/osd/ceph-3/ceph_fsid' ->
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/ceph_fsid'
>> 2023-09-15 11:32:49,160 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/ceph_fsid`
>> 2023-09-15 11:32:49,160 7fef7b041740 DEBUG move file
>> '//var/lib/ceph/osd/ceph-3/keyring' ->
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/keyring'
>> 2023-09-15 11:32:49,160 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/keyring`
>> 2023-09-15 11:32:49,160 7fef7b041740 DEBUG move file
>> '//var/lib/ceph/osd/ceph-3/whoami' ->
>> '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/whoami'
>> 2023-09-15 11:32:49,161 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/whoami`
>> 2023-09-15 11:32:49,161 7fef7b041740 DEBUG Remove dir
>> `//var/lib/ceph/osd/ceph-3`
>> 2023-09-15 11:32:49,166 7fef7b041740 INFO Chowning content...
>> 2023-09-15 11:32:49,171 7fef7b041740 DEBUG chown: stdout changed ownership
>> of '/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/block' from
>> root:root to 167:167
>> 2023-09-15 11:32:49,172 7fef7b041740 INFO Chowning
>> /var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/block...
>> 2023-09-15 11:32:49,172 7fef7b041740 INFO Disabling host unit ceph-volume@
>> lvm unit...
>> 2023-09-15 11:32:49,649 7fef7b041740 DEBUG systemctl: stderr Removed
>> /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-3-4bfc2101-e9b2-468d-8f54-a05f080ebdfe.service.
>> 2023-09-15 11:32:49,650 7fef7b041740 DEBUG copy file `//etc/ceph/ceph.conf`
>> -> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/config`
>> 2023-09-15 11:32:49,650 7fef7b041740 DEBUG chown 167:167
>> `/var/lib/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/osd.3/config`
>> 2023-09-15 11:32:49,650 7fef7b041740 INFO Moving logs...
>> 2023-09-15 11:32:49,651 7fef7b041740 DEBUG move file
>> '//var/log/ceph/ceph-osd.3.log' ->
>> '/var/log/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/ceph-osd.3.log'
>> 2023-09-15 11:32:49,651 7fef7b041740 DEBUG chown 167:167
>> `/var/log/ceph/03977a23-f00f-4bb0-b9a7-de57f40ba853/ceph-osd.3.log`
>> 2023-09-15 11:32:49,651 7fef7b041740 INFO Creating new units...
>> 2023-09-15 11:32:50,803 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-console-messages.conf ...
>> 2023-09-15 11:32:50,803 7fef7b041740 DEBUG sysctl: stdout kernel.printk = 4
>> 4 1 7
>> 2023-09-15 11:32:50,803 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-ipv6-privacy.conf ...
>> 2023-09-15 11:32:50,803 7fef7b041740 DEBUG sysctl: stdout
>> net.ipv6.conf.all.use_tempaddr = 2
>> 2023-09-15 11:32:50,803 7fef7b041740 DEBUG sysctl: stdout
>> net.ipv6.conf.default.use_tempaddr = 2
>> 2023-09-15 11:32:50,803 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-kernel-hardening.conf ...
>> 2023-09-15 11:32:50,803 7fef7b041740 DEBUG sysctl: stdout
>> kernel.kptr_restrict = 1
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-link-restrictions.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_hardlinks = 1
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_symlinks = 1
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-magic-sysrq.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout kernel.sysrq = 176
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-network-security.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> net.ipv4.conf.default.rp_filter = 2
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> net.ipv4.conf.all.rp_filter = 2
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-ptrace.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> kernel.yama.ptrace_scope = 1
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/10-zeropage.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout vm.mmap_min_addr
>> = 65536
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/30-ceph-osd.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout fs.aio-max-nr =
>> 1048576
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout kernel.pid_max =
>> 4194304
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /usr/lib/sysctl.d/50-coredump.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> kernel.core_pattern = |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c
>> %h
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /usr/lib/sysctl.d/50-default.conf ...
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> net.ipv4.conf.default.promote_secondaries = 1
>> 2023-09-15 11:32:50,804 7fef7b041740 DEBUG sysctl: stdout
>> net.ipv4.ping_group_range = 0 2147483647
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
>> net.core.default_qdisc = fq_codel
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_regular = 1
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_fifos = 1
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /usr/lib/sysctl.d/50-pid-max.conf ...
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout kernel.pid_max =
>> 4194304
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/90-ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-osd.conf ...
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout fs.aio-max-nr =
>> 1048576
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout kernel.pid_max =
>> 4194304
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.d/99-sysctl.conf ...
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /usr/lib/sysctl.d/protect-links.conf ...
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_fifos = 1
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_hardlinks = 1
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_regular = 2
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
>> fs.protected_symlinks = 1
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
>> /etc/sysctl.conf ...
>> 2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stderr sysctl: setting
>> key "net.ipv4.conf.all.promote_secondaries": Invalid argument
>> 2023-09-15 11:32:51,469 7fef7b041740 DEBUG Non-zero exit code 1 from
>> systemctl reset-failed ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3
>> 2023-09-15 11:32:51,469 7fef7b041740 DEBUG systemctl: stderr Failed to
>> reset failed state of unit
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3.service: Unit
>> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3.service not loaded.
>> 2023-09-15 11:32:51,954 7fef7b041740 DEBUG systemctl: stderr Created
>> symlink
>> /etc/systemd/system/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target.wants/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3.service
>> → /etc/systemd/system/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@.service.
>> 2023-09-15 11:32:54,331 7fef7b041740 DEBUG firewalld does not appear to be
>> present
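>>
>> If I read that log correctly, the daemon unit only gets wired into the
>> per-cluster target (the "Created symlink ... .target.wants ..." line), so
>> boot-time start would depend on the chain ceph.target -> ceph-<fsid>.target
>> -> daemon units being enabled. A rough way to inspect that chain (untested
>> on this host):
>>
>> root@0cc47a6df14e:~# systemctl list-dependencies ceph.target
>> root@0cc47a6df14e:~# ls /etc/systemd/system/ceph.target.wants/
>> root@0cc47a6df14e:~# ls /etc/systemd/system/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target.wants/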
>> 
>>> On Sat, 16 Sept 2023 at 10:25, Eugen Block <eblock@xxxxxx> wrote:
>>> 
>>> That sounds a bit strange to me, because on all clusters we have adopted
>>> so far the previous systemd units were successfully converted into new
>>> units targeting the pods. This process should also have been logged
>>> (stdout, probably in the cephadm.log as well), resulting in "enabled"
>>> systemd units. Can you paste the output of 'systemctl status
>>> ceph-<FSID>@mon.<MON>'? If you have it, please also share the logs
>>> from the adoption process.
>>> What I did notice in a test cluster a while ago was that I had to
>>> reboot a node where I had to "play around" a bit with removed and
>>> redeployed osd containers. At some point they didn't react to
>>> systemctl commands anymore, but a reboot fixed that. But I haven't
>>> seen that in a production cluster yet, so some more details would be
>>> useful.
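>>> For example, something along these lines (adjust FSID and daemon name):
>>>
>>> systemctl status ceph-<FSID>@mon.<MON>.service
>>> systemctl is-enabled ceph-<FSID>@mon.<MON>.service
>>>
>>> The adoption steps should also be in /var/log/ceph/cephadm.log on the host
>>> where 'cephadm adopt' was run, if I remember correctly.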
>>> 
>>> Quoting Boris Behrens <bb@xxxxxxxxx>:
>>> 
>>> > Hi,
>>> > is there a way to have the pods start again after reboot?
>>> > Currently I need to start them by hand via ceph orch start mon/mgr/osd/...
>>> >
>>> > I imagine this will lead to a lot of headache when the ceph cluster gets a
>>> > powercycle and the mon pods will not start automatically.
>>> >
>>> > I've spun up a test cluster and there the pods start very fast. On the
>>> > legacy test cluster, which got adopted to cephadm, they do not.
>>> >
>>> > Cheers
>>> >  Boris
>> 
>> 
>> --
>> This time, as an exception, the self-help group "UTF-8 problems" will meet
>> in the large hall.
> 
> 
> 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



