UPGRADE_REDEPLOY_DAEMON: Upgrading daemon failed

Hi folks,

We upgraded one of our clusters from Pacific to Quincy. Everything worked fine, but cephadm complains that one OSD was not upgraded:

[WRN] UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.15 on host osd-dmz-k5-1 failed.
    Upgrade daemon: osd.15: cephadm exited with an error code: 1, stderr: Redeploy daemon osd.15 ...
Failed to trim old cgroups /sys/fs/cgroup/system.slice/system-ceph\x2df852c3fc\x2d05a0\x2d11e8\x2dbae7\x2d77689751e5e7.slice/ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service
Non-zero exit code 1 from systemctl start ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15
systemctl: stderr Job for ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service failed because the control process exited with error code.
systemctl: stderr See "systemctl status ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service" and "journalctl -xeu ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service" for details.
Traceback (most recent call last):
  File "/var/lib/ceph/f852c3fc-05a0-11e8-bae7-77689751e5e7/cephadm.8b92cafd937eb89681ee011f9e70f85937fd09c4bd61ed4a59981d275a1f255b", line 9679, in <module>
    main()
  File "/var/lib/ceph/f852c3fc-05a0-11e8-bae7-77689751e5e7/cephadm.8b92cafd937eb89681ee011f9e70f85937fd09c4bd61ed4a59981d275a1f255b", line 9667, in main
    r = ctx.func(ctx)
  File "/var/lib/ceph/f852c3fc-05a0-11e8-bae7-77689751e5e7/cephadm.8b92cafd937eb89681ee011f9e70f85937fd09c4bd61ed4a59981d275a1f255b", line 2168, in _default_image
    return func(ctx)
  File "/var/lib/ceph/f852c3fc-05a0-11e8-bae7-77689751e5e7/cephadm.8b92cafd937eb89681ee011f9e70f85937fd09c4bd61ed4a59981d275a1f255b", line 5992, in command_deploy
    deploy_daemon(ctx, ctx.fsid, daemon_type, daemon_id, c, uid, gid,
  File "/var/lib/ceph/f852c3fc-05a0-11e8-bae7-77689751e5e7/cephadm.8b92cafd937eb89681ee011f9e70f85937fd09c4bd61ed4a59981d275a1f255b", line 3301, in deploy_daemon
    deploy_daemon_units(ctx, fsid, uid, gid, daemon_type, daemon_id,
  File "/var/lib/ceph/f852c3fc-05a0-11e8-bae7-77689751e5e7/cephadm.8b92cafd937eb89681ee011f9e70f85937fd09c4bd61ed4a59981d275a1f255b", line 3558, in deploy_daemon_units
    call_throws(ctx, ['systemctl', 'start', unit_name])
  File "/var/lib/ceph/f852c3fc-05a0-11e8-bae7-77689751e5e7/cephadm.8b92cafd937eb89681ee011f9e70f85937fd09c4bd61ed4a59981d275a1f255b", line 1806, in call_throws
    raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')
RuntimeError: Failed command: systemctl start ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15: Job for ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service failed because the control process exited with error code.
See "systemctl status ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service" and "journalctl -xeu ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service" for details.
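
A side note on the "Failed to trim old cgroups" line: the `\x2d` sequences in the cgroup path are systemd's escaping of `-` in the slice name, so the slice refers to the same cluster fsid as the unit. A minimal sketch to decode such a name follows; the helper name `systemd_unescape` is made up for illustration and it only handles single-byte ASCII escapes:

```python
import re


def systemd_unescape(name: str) -> str:
    """Decode systemd-style \\xNN escapes (e.g. \\x2d for '-') in a unit or slice name.

    Assumption: only ASCII escapes occur, as in the cgroup path from the log.
    """
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        name,
    )


# Slice name copied from the cgroup path in the error output.
slice_name = r"system-ceph\x2df852c3fc\x2d05a0\x2d11e8\x2dbae7\x2d77689751e5e7.slice"
print(systemd_unescape(slice_name))
```

The decoded name contains the fsid f852c3fc-05a0-11e8-bae7-77689751e5e7, matching the unit that failed to start.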

The OSD in question seems to be running fine:

systemctl status ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service
● ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service - Ceph osd.15 for f852c3fc-05a0-11e8-bae7-77689751e5e7
     Loaded: loaded (/etc/systemd/system/ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2024-11-16 10:02:27 CET; 1 week 2 days ago
   Main PID: 24583 (conmon)
      Tasks: 67 (limit: 76281)
     Memory: 6.0G
        CPU: 9h 57min 20.017s
     CGroup: /system.slice/system-ceph\x2df852c3fc\x2d05a0\x2d11e8\x2dbae7\x2d77689751e5e7.slice/ceph-f852c3fc-05a0-11e8-bae7-77689751e5e7@osd.15.service
             ├─libpod-payload-3e6ba1f01ad8ca4c20c08a9984bdd983f43b9a15a0ec1b452b4d17c9f5ef519e
             │ ├─24586 /dev/init -- /usr/bin/ceph-osd -n osd.15 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false
             │ └─24588 /usr/bin/ceph-osd -n osd.15 -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-journald=true --default-log-to-stderr=false
             └─supervisor
               └─24583 /usr/bin/conmon --api-version 1 -c 3e6ba1f01ad8ca4c20c08a9984bdd983f43b9a15a0ec1b452b4d17c9f5ef519e -u 3e6ba1f01ad8ca4c20c08a9984bdd983f43b9a15a0ec1b452b4d17c9f5ef519e -r /usr/bin/crun -b /var/lib/containers/storage/overlay-containers/3e6ba1f01ad8ca4c20c08a9984bdd983f43b9a15a0ec1b452b4d17c9f5ef519e/userdata -p /run/containers/storage/over>

Nov 25 11:23:14 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: (Original Log Time 2024/11/25-10:23:14.904662) [db/memtable_list.cc:628] [default] Level-0 commit table #794120: memtable #1 done
Nov 25 11:23:14 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: (Original Log Time 2024/11/25-10:23:14.904710) EVENT_LOG_v1 {"time_micros": 1732530194904694, "job": 1660, "event": "flush_finished", "output_compression": "NoCompression", "lsm_state": [2, 1, 8, 44, 0, 0, 0], "immutable_memtables": 0}
Nov 25 11:23:14 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: (Original Log Time 2024/11/25-10:23:14.904789) [db/db_impl/db_impl_compaction_flush.cc:233] [default] Level summary: files[2 1 8 44 0 0 0] max score 0.78
Nov 25 11:23:14 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: [db/db_impl/db_impl_files.cc:415] [JOB 1660] Try to delete WAL files size 255924988, prev total WAL file size 256244157, number of live WAL files 2.
Nov 25 11:23:14 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: [file/delete_scheduler.cc:69] Deleted file db/794117.log immediately, rate_bytes_per_sec 0, total_trash_size 0 max_trash_db_ratio 0.250000
Nov 25 11:23:14 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: (Original Log Time 2024/11/25-10:23:14.905401) [db/db_impl/db_impl_compaction_flush.cc:2818] Compaction nothing to do
Nov 25 11:32:34 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: [db/db_impl/db_impl.cc:901] ------- DUMPING STATS -------
Nov 25 11:32:34 osd-dmz-k5-1 ceph-osd[24588]: rocksdb: [db/db_impl/db_impl.cc:903]
                                              ** DB Stats **
                                              Uptime(secs): 783001.8 total, 600.0 interval
                                              Cumulative writes: 24M writes, 97M keys, 24M commit groups, 1.0 writes per commit group, ingest: 119.22 GB, 0.16 MB/s
                                              Cumulative WAL: 24M writes, 11M syncs, 2.03 writes per sync, written: 119.22 GB, 0.16 MB/s
                                              Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
                                              Interval writes: 17K writes, 61K keys, 17K commit groups, 1.0 writes per commit group, ingest: 95.37 MB, 0.16 MB/s
                                              Interval WAL: 17K writes, 8473 syncs, 2.01 writes per sync, written: 0.09 MB, 0.16 MB/s
                                              Interval stall: 00:00:0.000 H:M:S, 0.0 percent

                                              ** Compaction Stats [default] **
                                              Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
                                              ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                L0      2/0    9.86 MB   0.5      0.0     0.0      0.0       1.9      1.9       0.0   1.0      0.0     29.6     66.03             64.32       505    0.131       0      0
                                                L1      1/0   66.88 MB   0.7      4.6     1.9      2.7       3.3      0.7       0.0   1.8     71.1     52.1     65.56             60.72       126    0.520    100M  4156K
                                                L2      8/0   450.76 MB   0.8      7.1     0.7      6.4       6.8      0.4       0.0  10.3     53.7     51.6    135.52            118.93        16    8.470    190M  1298K
                                                L3     44/0    2.65 GB   0.1      0.7     0.3      0.4       0.4     -0.0       0.0   1.3     79.5     43.3      8.89              7.81         4    2.223     28M    17M
                                               Sum     55/0    3.17 GB   0.0     12.3     2.9      9.5      12.4      3.0       0.0   6.5     45.8     46.2    276.01            251.78       651    0.424    318M    22M
                                               Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0     44.0      0.12              0.12         1    0.124       0      0

                                              ** Compaction Stats [default] **
                                              Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
                                              -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                               Low      0/0    0.00 KB   0.0     12.3     2.9      9.5      10.5      1.0       0.0   0.0     60.2     51.4    209.98            187.46       146    1.438    318M    22M
                                              High      0/0    0.00 KB   0.0      0.0     0.0      0.0       1.9      1.9       0.0   0.0      0.0     29.5     66.00             64.32       504    0.131       0      0
                                              User      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0    102.0      0.02              0.00         1    0.025       0      0
                                              Uptime(secs): 783001.9 total, 600.0 interval
                                              Flush(GB): cumulative 1.906, interval 0.005
                                              AddFile(GB): cumulative 0.000, interval 0.000
                                              AddFile(Total Files): cumulative 0, interval 0
                                              AddFile(L0 Files): cumulative 0, interval 0
                                              AddFile(Keys): cumulative 0, interval 0
                                              Cumulative compaction: 12.45 GB write, 0.02 MB/s write, 12.35 GB read, 0.02 MB/s read, 276.0 seconds
                                              Interval compaction: 0.01 GB write, 0.01 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.1 seconds
                                              Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

                                              ** File Read Latency Histogram By Level [default] **

                                              ** Compaction Stats [default] **
                                              Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
                                              ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                L0      2/0    9.86 MB   0.5      0.0     0.0      0.0       1.9      1.9       0.0   1.0      0.0     29.6     66.03             64.32       505    0.131       0      0
                                                L1      1/0   66.88 MB   0.7      4.6     1.9      2.7       3.3      0.7       0.0   1.8     71.1     52.1     65.56             60.72       126    0.520    100M  4156K
                                                L2      8/0   450.76 MB   0.8      7.1     0.7      6.4       6.8      0.4       0.0  10.3     53.7     51.6    135.52            118.93        16    8.470    190M  1298K
                                                L3     44/0    2.65 GB   0.1      0.7     0.3      0.4       0.4     -0.0       0.0   1.3     79.5     43.3      8.89              7.81         4    2.223     28M    17M
                                               Sum     55/0    3.17 GB   0.0     12.3     2.9      9.5      12.4      3.0       0.0   6.5     45.8     46.2    276.01            251.78       651    0.424    318M    22M
                                               Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0

                                              ** Compaction Stats [default] **
                                              Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
                                              -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                               Low      0/0    0.00 KB   0.0     12.3     2.9      9.5      10.5      1.0       0.0   0.0     60.2     51.4    209.98            187.46       146    1.438    318M    22M
                                              High      0/0    0.00 KB   0.0      0.0     0.0      0.0       1.9      1.9       0.0   0.0      0.0     29.5     66.00             64.32       504    0.131       0      0
                                              User      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0    102.0      0.02              0.00         1    0.025       0      0
                                              Uptime(secs): 783001.9 total, 0.0 interval
                                              Flush(GB): cumulative 1.906, interval 0.000
                                              AddFile(GB): cumulative 0.000, interval 0.000
                                              AddFile(Total Files): cumulative 0, interval 0
                                              AddFile(L0 Files): cumulative 0, interval 0
                                              AddFile(Keys): cumulative 0, interval 0
                                              Cumulative compaction: 12.45 GB write, 0.02 MB/s write, 12.35 GB read, 0.02 MB/s read, 276.0 seconds


How do I fix this? We tried redeploying the OSD, but without success.
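
In case it helps anyone reproduce the checks: the systemctl error itself points at `systemctl status` and `journalctl -xeu` for the affected unit. The following is only a sketch that pulls the daemon and host out of the health warning and prints those two commands; the function names are hypothetical and the fsid is the one from the logs in this mail:

```python
import re

# Values copied from the health warning and unit names in this mail.
FSID = "f852c3fc-05a0-11e8-bae7-77689751e5e7"
WARNING = (
    "[WRN] UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.15 "
    "on host osd-dmz-k5-1 failed."
)


def parse_warning(line: str) -> tuple[str, str]:
    """Return (daemon, host) from an UPGRADE_REDEPLOY_DAEMON warning line."""
    m = re.search(r"Upgrading daemon (\S+) on host (\S+) failed", line)
    if m is None:
        raise ValueError("not an UPGRADE_REDEPLOY_DAEMON warning")
    return m.group(1), m.group(2)


def followup_commands(fsid: str, daemon: str) -> list[str]:
    """Build the two commands that the systemctl error message refers to."""
    unit = f"ceph-{fsid}@{daemon}.service"
    return [f"systemctl status {unit}", f"journalctl -xeu {unit}"]


daemon, host = parse_warning(WARNING)
print(f"run on {host}:")
for cmd in followup_commands(FSID, daemon):
    print("  " + cmd)
```

The script only prints the commands; it does not run anything on the host.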

Best regards
Felix


------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Jülich GmbH
52425 Jülich
Sitz der Gesellschaft: Jülich
Eingetragen im Handelsregister des Amtsgerichts Düren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Stefan Müller
Geschäftsführung: Prof. Dr. Astrid Lambrecht (Vorsitzende),
Karsten Beneke (stellv. Vorsitzender), Prof. Dr. Ir. Pieter Jansens
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------