Hi Torkil,
cephadm does regular checks, for example some 'ceph-volume' calls to
see if all assigned disks have actually been deployed as OSDs, and so
on. That's why "random" containers are created and destroyed. I
don't have a complete list of checks, though. You should be able to
match those timestamps against /var/log/ceph/cephadm.log; here's an
example from one of my clusters:
# cephadm.log
2024-02-01 00:05:16,491 7f1265f64740 DEBUG
--------------------------------------------------------------------------------
cephadm ['--image', 'registry.domain/ebl/ceph-upstream@sha256:057e08bf8d2d20742173a571bc28b65674b055bebe5f4c6cd488c1a6fd51f685', '--timeout', '895', 'ceph-volume', '--fsid', '201a2fbc-ce7b-44a3-9ed7-39427972083b', '--', 'inventory', '--format=json-pretty', '--filter-for-batch']
# syslog:
2024-02-01T00:05:17.335128+01:00 nautilus2 podman[2526232]: 2024-02-01 00:05:17.334873422 +0100 CET m=+0.210837970 container create 1731d65c011e4178380d70ed662f4c36dd6fea6429adc382ff1978007be770a8 (image=registry.domain/ebl/ceph-upstream@sha256:057e08bf8d2d20742173a571bc28b65674b055bebe5f4c6cd488c1a6fd51f685, name=cranky_hermann, GIT_BRANCH=HEAD, GIT_COMMIT=0396eef90bef641b676c164ec7a3876f45010308,
...
A couple of seconds later it runs "cephadm list-networks", then
"cephadm ls", and so on. Basically, it's a consistency check.
Regards,
Eugen
Quoting Torkil Svensgaard <torkil@xxxxxxxx>:
So it seems some ceph housekeeping spawns containers without giving
them a name, and that causes this in the journal:
"
Feb 01 04:10:07 dopey podman[766731]: 2024-02-01 04:10:07.967786606 +0100 CET m=+0.043987882 container create 95967a040795bd61588dcfdc6ba5daf92553cd2cb3ecd7318cd8b16c1b15782d (image=quay.io/ceph/ceph@sha256:1793ff3af6ae74527c86e1a0b22401e9c42dc08d0ebb8379653be07db17d0007, name=practical_hypatia, org.label-schema.vendor=CentOS, GIT_BRANCH=HEAD, maintainer=Guillaume Abrioux <gabrioux@xxxxxxxxxx>, io.buildah.version=1.29.1, org.label-schema.license=GPLv2, GIT_CLEAN=True, GIT_COMMIT=0396eef90bef641b676c164ec7a3876f45010308, ceph=True, RELEASE=HEAD, GIT_REPO=https://github.com/ceph/ceph-container.git, CEPH_POINT_RELEASE=-18.2.0, org.label-schema.build-date=20231212, org.label-schema.name=CentOS Stream 8 Base Image, org.label-schema.schema-version=1.0)
...
Feb 01 04:10:08 dopey practical_hypatia[766758]: 167 167
...
Feb 01 04:10:08 dopey systemd[1]: libpod-conmon-95967a040795bd61588dcfdc6ba5daf92553cd2cb3ecd7318cd8b16c1b15782d.scope: Deactivated successfully
"
Best regards,
Torkil
On 01/02/2024 08:24, Torkil Svensgaard wrote:
We have ceph (currently 18.2.0) log to an rsyslog server with the
following file name format:
template (name="DynaFile" type="string"
string="/tank/syslog/%fromhost-ip%/%hostname%/%programname%.log")
Around May 25th last year something changed: instead of just the
usual program log names, we are also getting a lot of different logs
with weird names. Here's an ls excerpt:
"
...
-rw-------. 1 root root 517K Feb 1 05:42 hardcore_raman.log
-rw-------. 1 root root 198K Feb 1 05:42 sleepy_moser.log
-rw-------. 1 root root 203K Feb 1 05:42 friendly_gagarin.log
-rw-------. 1 root root 1.1K Feb 1 05:42 goofy_hypatia.log
-rw-------. 1 root root 164K Feb 1 05:42 kind_chebyshev.log
-rw-------. 1 root root 11K Feb 1 05:42 magical_archimedes.log
-rw-------. 1 root root 373K Feb 1 05:42 busy_bardeen.log
-rw-------. 1 root root 178K Feb 1 05:42 trusting_euler.log
-rw-------. 1 root root 526K Feb 1 05:42 inspiring_golick.log
-rw-------. 1 root root 369K Feb 1 06:12 condescending_ganguly.log
-rw-------. 1 root root 191K Feb 1 06:12 mystifying_torvalds.log
-rw-------. 1 root root 475K Feb 1 06:12 charming_nash.log
-rw-------. 1 root root 168K Feb 1 06:12 zealous_sinoussi.log
-rw-------. 1 root root 325K Feb 1 06:12 amazing_booth.log
-rw-------. 1 root root 516K Feb 1 06:12 great_ardinghelli.log
-rw-------. 1 root root 313K Feb 1 06:12 magical_bell.log
-rw-------. 1 root root 22K Feb 1 06:12 nifty_swartz.log
-rw-------. 1 root root 426 Feb 1 06:12 upbeat_beaver.log
-rw-------. 1 root root 166K Feb 1 06:13 funny_lederberg.log
-rw-------. 1 root root 164K Feb 1 06:13 frosty_murdock.log
-rw-------. 1 root root 374K Feb 1 06:13 elastic_banach.log
-rw-------. 1 root root 308K Feb 1 06:13 inspiring_cohen.log
-rw-------. 1 root root 176K Feb 1 06:13 angry_wu.log
-rw-------. 1 root root 662 Feb 1 06:42 admiring_kalam.log
-rw-------. 1 root root 3.1K Feb 1 06:43 thirsty_colden.log
-rw-------. 1 root root 4.5M Feb 1 07:01 run-parts.log
-rw-------. 1 root root 16M Feb 1 07:01 CROND.log
-rw-------. 1 root root 109M Feb 1 07:06 python3.log
-rw-------. 1 root root 3.4M Feb 1 07:29 systemd-journald.log
-rw-------. 1 root root 596M Feb 1 07:34 sudo.log
-rw-------. 1 root root 549 Feb 1 07:44 interesting_rosalind.log
-rw-------. 1 root root 342K Feb 1 07:45 beautiful_hamilton.log
-rw-------. 1 root root 348K Feb 1 07:45 cool_ride.log
-rw-------. 1 root root 15G Feb 1 07:45 conmon.log
-rw-------. 1 root root 395K Feb 1 07:45 compassionate_satoshi.log
-rw-------. 1 root root 11K Feb 1 07:45 hardcore_noether.log
-rw-------. 1 root root 223K Feb 1 07:45 wizardly_johnson.log
-rw-------. 1 root root 270M Feb 1 07:49 sshd.log
-rw-------. 1 root root 111M Feb 1 07:49 systemd-logind.log
-rw-------. 1 root root 1.6G Feb 1 07:50 systemd.log
-rw-------. 1 root root 119M Feb 1 07:54 rsyslogd.log
-rw-------. 1 root root 94G Feb 1 07:55 ceph-8ee2d228-ed21-4580-8bbf-064.log
-rw-------. 1 root root 1.1G Feb 1 07:56 podman.log
-rw-------. 1 root root 1.8G Feb 1 07:58 ceph-mgr.log
-rw-------. 1 root root 213G Feb 1 07:58 ceph-osd.log
-rw-------. 1 root root 48G Feb 1 07:58 ceph-mon.log
"
Those are container names or something like that? The file contents
seem to be assorted bits from the ceph disk tooling (ceph-volume?):
"
# cat goofy_hypatia.log
Jun 7 04:12:43 dopey goofy_hypatia[3224681]: 167 167
Jun 24 09:00:08 dopey goofy_hypatia[2319188]: --> passed data devices: 22 physical, 0 LVM
Jun 24 09:00:08 dopey goofy_hypatia[2319188]: --> relative data size: 1.0
Jun 24 09:00:08 dopey goofy_hypatia[2319188]: --> All data devices are unavailable
Jun 24 09:00:08 dopey goofy_hypatia[2319188]: []
Sep 13 14:22:10 dopey goofy_hypatia[2027428]: --> passed data devices: 22 physical, 0 LVM
Sep 13 14:22:10 dopey goofy_hypatia[2027428]: --> relative data size: 1.0
Sep 13 14:22:10 dopey goofy_hypatia[2027428]: --> All data devices are unavailable
2023-11-11T03:36:33+01:00 dopey goofy_hypatia[411330]: --> passed data devices: 2 physical, 0 LVM
2023-11-11T03:36:33+01:00 dopey goofy_hypatia[411330]: --> relative data size: 1.0
2023-11-11T03:36:33+01:00 dopey goofy_hypatia[411330]: --> All data devices are unavailable
2023-12-26T09:38:59+01:00 dopey goofy_hypatia[893177]: 167 167
2023-12-26T10:41:14+01:00 dopey goofy_hypatia[1057467]: 167 167
2024-01-05T21:00:37+01:00 dopey goofy_hypatia[3395314]: {}
2024-02-01T05:42:17+01:00 dopey goofy_hypatia[845150]: 167 167
"
Has anyone else had this issue? Any suggestions on how to get a real
program name instead?
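The only workaround I can think of so far would be to catch those
names on the rsyslog server before the DynaFile action, something
like this (untested sketch, pattern and target file are placeholders):
"
# route anything tagged with a podman-style adjective_surname name
# into one catch-all file instead of a file per random name
if re_match($programname, '^[a-z]+_[a-z0-9]+$') then {
    action(type="omfile" file="/tank/syslog/podman-containers.log")
    stop
}
"
That would obviously also swallow any legitimate program whose name
happens to match the pattern, though.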
Best regards,
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: torkil@xxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx