Here's the reason they exit:
7f1605dc9700 -1 osd.97 486896 _committed_osd_maps marked down 6 >
osd_max_markdown_count 5 in last 600.000000 seconds, shutting down
If an OSD flaps (marked down, then up) 6 times in 10 minutes, it
exits. (This is a safety measure.)
It's normally caused by a network issue -- other OSDs are telling the
mon that it is down, but then the OSD itself tells the mon that it's
up!
Cheers, Dan
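The guard described above can be sketched as a simplified model (not the actual OSD code; the defaults mirror the osd_max_markdown_count=5 / osd_max_markdown_period=600 values visible in the log line):

```python
def should_shutdown(markdown_times, now,
                    max_markdown_count=5, markdown_period=600.0):
    """Simplified model of the OSD flap guard: exit once the OSD has
    been marked down more than max_markdown_count times within the last
    markdown_period seconds (defaults mirror osd_max_markdown_count=5
    and osd_max_markdown_period=600)."""
    recent = [t for t in markdown_times if now - t <= markdown_period]
    return len(recent) > max_markdown_count

# Marked down six times within ten minutes -> shuts down, matching the
# log's "marked down 6 > osd_max_markdown_count 5 in last 600.000000 seconds".
print(should_shutdown([0, 100, 200, 300, 400, 500], now=500))  # True
```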
On Mon, Mar 7, 2022 at 10:36 PM Boris Behrens <bb@xxxxxxxxx> wrote:
Hi,
we've had problems with OSDs being marked as offline ever since we
updated to Octopus, and we hoped the issue would be fixed with the
latest patch. We only see this kind of problem on Octopus, and there
only on the big S3 cluster.
* Hosts are all Ubuntu 20.04 and we've set the txqueuelen to 10k
* Network interfaces are 20 Gbit (2x10 in an 802.3ad encap3+4 bond)
* We only use the frontend network.
* All disks are spinning; some have block.db devices.
* All disks are bluestore.
* Configs are mostly defaults.
* We've set the OSDs to restart=always without a limit, because we had
the problem of unavailable PGs when two OSDs are marked as offline and
they share PGs.
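For reference, the "restart=always without a limit" setup described above is typically done with a systemd drop-in; a sketch only, with an assumed drop-in path, and RestartSec guessed from the ~10 s gap between exit and restart in the logs:

```ini
# /etc/systemd/system/ceph-osd@.service.d/override.conf (assumed path)
[Unit]
# Disable the start rate limit so systemd never gives up restarting
StartLimitIntervalSec=0

[Service]
Restart=always
RestartSec=10s
```

Apply with `systemctl daemon-reload` after editing.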
But since we installed the latest patch we have been experiencing more
OSD downs and even crashes.
I tried to remove as many duplicated lines as possible.
Is the numa error a problem?
Why do OSD daemons not respond to heartbeats? I mean, even when the
disk is totally loaded with IO, the system itself should still answer
heartbeats, or am I missing something?
I really hope some of you can point me in the right direction to solve
this nasty problem.
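For context on the heartbeat_check lines further down: a peer is reported unresponsive once no ping reply has arrived before its deadline. A toy model (not the actual OSD code; the 20 s grace here is an assumption matching the default osd_heartbeat_grace):

```python
def peer_unresponsive(last_reply, now, grace=20.0):
    """Toy model of heartbeat_check: a peer OSD is flagged once 'now'
    is past its deadline (last ping reply + grace). The 20 s default
    mirrors osd_heartbeat_grace."""
    return now > last_reply + grace

# Last reply at t=0, checked at t=23 -> flagged, analogous to the
# "heartbeat_check: no reply from ... (oldest deadline ...)" log lines.
print(peer_unresponsive(0.0, 23.0))  # True
```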
This is what the latest crash looks like:
Mar 07 17:44:15 s3db18 ceph-osd[4530]: 2022-03-07T17:44:15.099+0000 7f5f05d2a700 -1 osd.161 489755 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
...
Mar 07 17:49:07 s3db18 ceph-osd[4530]: 2022-03-07T17:49:07.678+0000 7f5f05d2a700 -1 osd.161 489774 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Mar 07 17:53:07 s3db18 ceph-osd[4530]: *** Caught signal (Aborted) **
Mar 07 17:53:07 s3db18 ceph-osd[4530]: in thread 7f5ef1501700 thread_name:tp_osd_tp
Mar 07 17:53:07 s3db18 ceph-osd[4530]: ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 1: (()+0x143c0) [0x7f5f0d4623c0]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 2: (pthread_kill()+0x38) [0x7f5f0d45ef08]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 3: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, unsigned long)+0x471) [0x55a699a01201]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 4: (ceph::HeartbeatMap::reset_timeout(ceph::heartbeat_handle_d*, unsigned long, unsigned long)+0x8e) [0x55a699a0199e]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 5: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x3f0) [0x55a699a224b0]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 6: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55a699a252c4]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 7: (()+0x8609) [0x7f5f0d456609]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 8: (clone()+0x43) [0x7f5f0cfc0163]
Mar 07 17:53:07 s3db18 ceph-osd[4530]: 2022-03-07T17:53:07.387+0000 7f5ef1501700 -1 *** Caught signal (Aborted) **
Mar 07 17:53:07 s3db18 ceph-osd[4530]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
...
Mar 07 17:53:09 s3db18 systemd[1]: ceph-osd@161.service: Main process exited, code=killed, status=6/ABRT
Mar 07 17:53:09 s3db18 systemd[1]: ceph-osd@161.service: Failed with result 'signal'.
Mar 07 17:53:19 s3db18 systemd[1]: ceph-osd@161.service: Scheduled restart job, restart counter is at 1.
Mar 07 17:53:19 s3db18 systemd[1]: Stopped Ceph object storage daemon osd.161.
Mar 07 17:53:19 s3db18 systemd[1]: Starting Ceph object storage daemon osd.161...
Mar 07 17:53:19 s3db18 systemd[1]: Started Ceph object storage daemon osd.161.
Mar 07 17:53:20 s3db18 ceph-osd[4009440]: 2022-03-07T17:53:20.498+0000 7f9617781d80 -1 Falling back to public interface
Mar 07 17:53:33 s3db18 ceph-osd[4009440]: 2022-03-07T17:53:33.906+0000 7f9617781d80 -1 osd.161 489778 log_to_monitors {default=true}
Mar 07 17:53:34 s3db18 ceph-osd[4009440]: 2022-03-07T17:53:34.206+0000 7f96106f2700 -1 osd.161 489778 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
...
Mar 07 18:58:12 s3db18 ceph-osd[4009440]: 2022-03-07T18:58:12.717+0000 7f96106f2700 -1 osd.161 489880 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
And this is what it looks like when OSDs get marked out:
Mar 03 19:29:04 s3db13 ceph-osd[5792]: 2022-03-03T19:29:04.857+0000 7f16115e0700 -1 osd.97 485814 heartbeat_check: no reply from [XX:22::65]:6886 osd.124 since back 2022-03-03T19:28:41.250692+0000 front 2022-03-03T19:28:41.250649+0000 (oldest deadline 2022-03-03T19:29:04.150352+0000)
...130 time...
Mar 03 21:55:37 s3db13 ceph-osd[5792]: 2022-03-03T21:55:37.844+0000 7f16115e0700 -1 osd.97 486383 heartbeat_check: no reply from [XX:22::65]:6941 osd.124 since back 2022-03-03T21:55:12.514627+0000 front 2022-03-03T21:55:12.514649+0000 (oldest deadline 2022-03-03T21:55:36.613469+0000)
Mar 04 00:00:05 s3db13 ceph-osd[5792]: 2022-03-04T00:00:05.035+0000 7f1613080700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror (PID: 1385079) UID: 0
Mar 04 00:00:05 s3db13 ceph-osd[5792]: 2022-03-04T00:00:05.047+0000 7f1613080700 -1 received signal: Hangup from (PID: 1385080) UID: 0
Mar 04 00:06:00 s3db13 sudo[1389262]: ceph : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a --json=o /dev/sde
Mar 04 00:06:00 s3db13 sudo[1389262]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 04 00:06:00 s3db13 sudo[1389262]: pam_unix(sudo:session): session closed for user root
Mar 04 00:06:01 s3db13 sudo[1389287]: ceph : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/sbin/nvme ata smart-log-add --json /dev/sde
Mar 04 00:06:01 s3db13 sudo[1389287]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 04 00:06:01 s3db13 sudo[1389287]: pam_unix(sudo:session): session closed for user root
Mar 05 00:00:10 s3db13 ceph-osd[5792]: 2022-03-05T00:00:10.213+0000 7f1613080700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror (PID: 2406262) UID: 0
Mar 05 00:00:10 s3db13 ceph-osd[5792]: 2022-03-05T00:00:10.237+0000 7f1613080700 -1 received signal: Hangup from (PID: 2406263) UID: 0
Mar 05 00:08:03 s3db13 sudo[2411721]: ceph : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/sbin/smartctl -a --json=o /dev/sde
Mar 05 00:08:03 s3db13 sudo[2411721]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 05 00:08:04 s3db13 sudo[2411721]: pam_unix(sudo:session): session closed for user root
Mar 05 00:08:04 s3db13 sudo[2411725]: ceph : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/sbin/nvme ata smart-log-add --json /dev/sde
Mar 05 00:08:04 s3db13 sudo[2411725]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 05 00:08:04 s3db13 sudo[2411725]: pam_unix(sudo:session): session closed for user root
Mar 05 19:19:49 s3db13 ceph-osd[5792]: 2022-03-05T19:19:49.189+0000 7f160fddd700 -1 osd.97 486852 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Mar 05 19:21:18 s3db13 ceph-osd[5792]: 2022-03-05T19:21:18.377+0000 7f160fddd700 -1 osd.97 486858 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Mar 05 19:21:45 s3db13 ceph-osd[5792]: 2022-03-05T19:21:45.304+0000 7f16115e0700 -1 osd.97 486863 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 since back 2022-03-05T19:21:21.762744+0000 front 2022-03-05T19:21:21.762723+0000 (oldest deadline 2022-03-05T19:21:45.261347+0000)
Mar 05 19:21:46 s3db13 ceph-osd[5792]: 2022-03-05T19:21:46.260+0000 7f16115e0700 -1 osd.97 486863 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 since back 2022-03-05T19:21:21.762744+0000 front 2022-03-05T19:21:21.762723+0000 (oldest deadline 2022-03-05T19:21:45.261347+0000)
Mar 05 19:21:47 s3db13 ceph-osd[5792]: 2022-03-05T19:21:47.252+0000 7f16115e0700 -1 osd.97 486863 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 since back 2022-03-05T19:21:21.762744+0000 front 2022-03-05T19:21:21.762723+0000 (oldest deadline 2022-03-05T19:21:45.261347+0000)
Mar 05 19:22:59 s3db13 ceph-osd[5792]: 2022-03-05T19:22:59.636+0000 7f160fddd700 -1 osd.97 486869 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Mar 05 19:23:33 s3db13 ceph-osd[5792]: 2022-03-05T19:23:33.439+0000 7f16115e0700 -1 osd.97 486872 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4.97748d0d (undecoded) ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:23:34 s3db13 ceph-osd[5792]: 2022-03-05T19:23:34.458+0000 7f16115e0700 -1 osd.97 486872 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4.97748d0d (undecoded) ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:23:35 s3db13 ceph-osd[5792]: 2022-03-05T19:23:35.434+0000 7f16115e0700 -1 osd.97 486872 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 since back 2022-03-05T19:23:09.928097+0000 front 2022-03-05T19:23:09.928150+0000 (oldest deadline 2022-03-05T19:23:35.227545+0000)
...
Mar 05 19:23:48 s3db13 ceph-osd[5792]: 2022-03-05T19:23:48.386+0000 7f16115e0700 -1 osd.97 486872 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4.97748d0d (undecoded) ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:23:49 s3db13 ceph-osd[5792]: 2022-03-05T19:23:49.362+0000 7f16115e0700 -1 osd.97 486872 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 since back 2022-03-05T19:23:09.928097+0000 front 2022-03-05T19:23:09.928150+0000 (oldest deadline 2022-03-05T19:23:35.227545+0000)
Mar 05 19:23:49 s3db13 ceph-osd[5792]: 2022-03-05T19:23:49.362+0000 7f16115e0700 -1 osd.97 486872 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4.97748d0d (undecoded) ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:23:50 s3db13 ceph-osd[5792]: 2022-03-05T19:23:50.358+0000 7f16115e0700 -1 osd.97 486873 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4.97748d0d (undecoded) ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:23:51 s3db13 ceph-osd[5792]: 2022-03-05T19:23:51.330+0000 7f16115e0700 -1 osd.97 486874 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4:b0b12ee9:::gc.22:head [call rgw_gc.rgw_gc_queue_list_entries in=46b] snapc 0=[] RETRY=9 ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:23:52 s3db13 ceph-osd[5792]: 2022-03-05T19:23:52.326+0000 7f16115e0700 -1 osd.97 486874 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4:b0b12ee9:::gc.22:head [call rgw_gc.rgw_gc_queue_list_entries in=46b] snapc 0=[] RETRY=9 ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:23:53 s3db13 ceph-osd[5792]: 2022-03-05T19:23:53.338+0000 7f16115e0700 -1 osd.97 486874 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4:b0b12ee9:::gc.22:head [call rgw_gc.rgw_gc_queue_list_entries in=46b] snapc 0=[] RETRY=9 ondisk+retry+read+known_if_redirected e486872)
Mar 05 19:25:02 s3db13 ceph-osd[5792]: 2022-03-05T19:25:02.342+0000 7f160fddd700 -1 osd.97 486878 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Mar 05 19:25:33 s3db13 ceph-osd[5792]: 2022-03-05T19:25:33.569+0000 7f16115e0700 -1 osd.97 486880 get_health_metrics reporting 2 slow ops, oldest is osd_op(client.2304224857.0:4271104 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486879)
...
Mar 05 19:25:44 s3db13 ceph-osd[5792]: 2022-03-05T19:25:44.476+0000 7f16115e0700 -1 osd.97 486880 get_health_metrics reporting 3 slow ops, oldest is osd_op(client.2304224857.0:4271104 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486879)
Mar 05 19:25:45 s3db13 ceph-osd[5792]: 2022-03-05T19:25:45.456+0000 7f16115e0700 -1 osd.97 486880 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 ever on either front or back, first ping sent 2022-03-05T19:25:25.281582+0000 (oldest deadline 2022-03-05T19:25:45.281582+0000)
Mar 05 19:25:45 s3db13 ceph-osd[5792]: 2022-03-05T19:25:45.456+0000 7f16115e0700 -1 osd.97 486880 get_health_metrics reporting 3 slow ops, oldest is osd_op(client.2304224857.0:4271104 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486879)
...
Mar 05 19:26:08 s3db13 ceph-osd[5792]: 2022-03-05T19:26:08.363+0000 7f16115e0700 -1 osd.97 486880 get_health_metrics reporting 3 slow ops, oldest is osd_op(client.2304224857.0:4271104 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486879)
Mar 05 19:26:09 s3db13 ceph-osd[5792]: 2022-03-05T19:26:09.371+0000 7f16115e0700 -1 osd.97 486880 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 ever on either front or back, first ping sent 2022-03-05T19:25:25.281582+0000 (oldest deadline 2022-03-05T19:25:45.281582+0000)
Mar 05 19:26:09 s3db13 ceph-osd[5792]: 2022-03-05T19:26:09.375+0000 7f16115e0700 -1 osd.97 486880 get_health_metrics reporting 3 slow ops, oldest is osd_op(client.2304224857.0:4271104 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486879)
Mar 05 19:26:10 s3db13 ceph-osd[5792]: 2022-03-05T19:26:10.383+0000 7f16115e0700 -1 osd.97 486881 get_health_metrics reporting 3 slow ops, oldest is osd_op(client.2304224857.0:4271104 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486879)
Mar 05 19:26:11 s3db13 ceph-osd[5792]: 2022-03-05T19:26:11.407+0000 7f16115e0700 -1 osd.97 486882 get_health_metrics reporting 1 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4:b0b12ee9:::gc.22:head [call rgw_gc.rgw_gc_queue_list_entries in=46b] snapc 0=[] RETRY=11 ondisk+retry+read+known_if_redirected e486879)
Mar 05 19:26:12 s3db13 ceph-osd[5792]: 2022-03-05T19:26:12.399+0000 7f16115e0700 -1 osd.97 486882 get_health_metrics reporting 1 slow ops, oldest is osd_op(client.2304224848.0:3139913 4.d 4:b0b12ee9:::gc.22:head [call rgw_gc.rgw_gc_queue_list_entries in=46b] snapc 0=[] RETRY=11 ondisk+retry+read+known_if_redirected e486879)
Mar 05 19:27:24 s3db13 ceph-osd[5792]: 2022-03-05T19:27:24.975+0000 7f160fddd700 -1 osd.97 486887 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Mar 05 19:27:58 s3db13 ceph-osd[5792]: 2022-03-05T19:27:58.114+0000 7f16115e0700 -1 osd.97 486890 get_health_metrics reporting 4 slow ops, oldest is osd_op(client.2304235452.0:811825 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486889)
...
Mar 05 19:28:08 s3db13 ceph-osd[5792]: 2022-03-05T19:28:08.137+0000 7f16115e0700 -1 osd.97 486890 get_health_metrics reporting 4 slow ops, oldest is osd_op(client.2304235452.0:811825 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486889)
Mar 05 19:28:09 s3db13 ceph-osd[5792]: 2022-03-05T19:28:09.125+0000 7f16115e0700 -1 osd.97 486890 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 ever on either front or back, first ping sent 2022-03-05T19:27:48.548094+0000 (oldest deadline 2022-03-05T19:28:08.548094+0000)
Mar 05 19:28:09 s3db13 ceph-osd[5792]: 2022-03-05T19:28:09.125+0000 7f16115e0700 -1 osd.97 486890 get_health_metrics reporting 4 slow ops, oldest is osd_op(client.2304235452.0:811825 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486889)
...
Mar 05 19:28:29 s3db13 ceph-osd[5792]: 2022-03-05T19:28:29.060+0000 7f16115e0700 -1 osd.97 486890 get_health_metrics reporting 4 slow ops, oldest is osd_op(client.2304235452.0:811825 4.d 4.97748d0d (undecoded) ondisk+retry+write+known_if_redirected e486889)
Mar 05 19:28:30 s3db13 ceph-osd[5792]: 2022-03-05T19:28:30.040+0000 7f16115e0700 -1 osd.97 486890 heartbeat_check: no reply from [XX:22::60]:6834 osd.171 ever on either front or back, first ping sent 2022-03-05T19:27:48.548094+0000 (oldest deadline 2022-03-05T19:28:08.548094+0000)
Mar 05 19:29:43 s3db13 ceph-osd[5792]: 2022-03-05T19:29:43.696+0000 7f1605dc9700 -1 osd.97 486896 _committed_osd_maps marked down 6 > osd_max_markdown_count 5 in last 600.000000 seconds, shutting down
Mar 05 19:29:43 s3db13 ceph-osd[5792]: 2022-03-05T19:29:43.700+0000 7f1613080700 -1 received signal: Interrupt from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
Mar 05 19:29:43 s3db13 ceph-osd[5792]: 2022-03-05T19:29:43.700+0000 7f1613080700 -1 osd.97 486896 *** Got signal Interrupt ***
Mar 05 19:29:43 s3db13 ceph-osd[5792]: 2022-03-05T19:29:43.700+0000 7f1613080700 -1 osd.97 486896 *** Immediate shutdown (osd_fast_shutdown=true) ***
Mar 05 19:29:44 s3db13 systemd[1]: ceph-osd@97.service: Succeeded.
Mar 05 19:29:54 s3db13 systemd[1]: ceph-osd@97.service: Scheduled restart job, restart counter is at 1.
Mar 05 19:29:54 s3db13 systemd[1]: Stopped Ceph object storage daemon osd.97.
Mar 05 19:29:54 s3db13 systemd[1]: Starting Ceph object storage daemon osd.97...
Mar 05 19:29:54 s3db13 systemd[1]: Started Ceph object storage daemon osd.97.
Mar 05 19:29:55 s3db13 ceph-osd[3236773]: 2022-03-05T19:29:55.116+0000 7f5852f74d80 -1 Falling back to public interface
Mar 05 19:30:34 s3db13 ceph-osd[3236773]: 2022-03-05T19:30:34.746+0000 7f5852f74d80 -1 osd.97 486896 log_to_monitors {default=true}
--
The self-help group "UTF-8 problems" will, as an exception, meet in the big hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx