Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck


 



Hi,

> Someone told me that we could just destroy the FileStore OSDs and recreate them as BlueStore, even though the cluster is partially upgraded. So I guess I’ll just do that. (Unless someone here tells me that that’s a terrible idea :))

I would agree, rebuilding seems a reasonable approach here. The SUSE documentation [1] doesn't even allow FileStore OSDs before an upgrade to a cephadm-managed cluster. I haven't tried to upgrade FileStore-based clusters, so I can't tell whether this is related to FileStore in general or is a different issue. Anyway, moving towards BlueStore is inevitable, so at some point you'll need to rebuild anyway. :-)
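Before rebuilding, it may help to list which OSDs are still on FileStore. The following is a minimal sketch, assuming the JSON output of `ceph osd metadata` is a list of per-OSD entries carrying `id` and `osd_objectstore` fields; the function names are illustrative and not part of any Ceph API.

```python
import json
import subprocess
from typing import List


def filestore_osds(metadata_json: str) -> List[int]:
    """Return the sorted IDs of OSDs that report FileStore as their object store."""
    entries = json.loads(metadata_json)
    return sorted(
        entry["id"]
        for entry in entries
        # OSDs without metadata (e.g. down ones) are skipped by .get()
        if entry.get("osd_objectstore") == "filestore"
    )


def fetch_osd_metadata() -> str:
    """Run `ceph osd metadata` on an admin node and return its JSON output."""
    result = subprocess.run(
        ["ceph", "osd", "metadata", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On a node with an admin keyring, `filestore_osds(fetch_osd_metadata())` would give the list of OSD IDs to rebuild, one at a time, waiting for the cluster to recover in between.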

Regards,
Eugen

[1] https://documentation.suse.com/ses/7.1/single-html/ses-deployment/#before-upgrade


Quoting Mark Schouten <mark@xxxxxxxx>:

Hi,

Thanks. Someone told me that we could just destroy the FileStore OSDs and recreate them as BlueStore, even though the cluster is partially upgraded. So I guess I’ll just do that. (Unless someone here tells me that that’s a terrible idea :))

—
Mark Schouten, CTO
Tuxis B.V.
mark@xxxxxxxx / +31 318 200208


------ Original Message ------
From "Eugen Block" <eblock@xxxxxx>
To ceph-users@xxxxxxx
Date 2/7/2023 4:58:11 PM
Subject Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

Hi,

I don't really have an answer, but there was a bug in the snap mapper [1], and [2] is supposed to verify consistency. Octopus is EOL, however, so you might need to upgrade directly to Pacific. That's what we did on multiple clusters (N --> P) a few months back. I'm not sure whether that would just work when you already have a couple of Octopus daemons; maybe you can try it on a test cluster.

Regards,
Eugen

[1] https://tracker.ceph.com/issues/56147
[2] https://github.com/ceph/ceph/pull/47388

Quoting Mark Schouten <mark@xxxxxxxx>:

Hi,

I’m seeing the same thing …

With debug logging enabled I see this:
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy converted 1410 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy converted 1440 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy converted 1470 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy converted 1500 keys

It ends at 1500 keys. And nothing happens.
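To compare where the conversion stalls across restarts or across OSDs, the debug log can be scanned for the last reported key count. This is a rough sketch that only assumes the message format shown in the log lines above; the function name is made up for illustration.

```python
import re
from typing import Optional

# Matches the debug message emitted during the legacy snap_mapper conversion.
CONVERTED = re.compile(r"snap_mapper\.convert_legacy converted (\d+) keys")


def last_converted_keys(log_text: str) -> Optional[int]:
    """Return the key count from the last convert_legacy message, or None."""
    last = None
    for match in CONVERTED.finditer(log_text):
        last = int(match.group(1))
    return last
```

If the value is identical after every restart, the conversion is probably stalling at the same key each time rather than making slow progress.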

I’m now stuck with a cluster that has 4 OSDs on Octopus, 10 on Nautilus, and one down. A hint on how to work around this would be welcome :)
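For keeping track of such a mixed-version state, the `osd` section of the `ceph versions` JSON output can be summarised per release. This is a sketch only: it assumes the version strings look like the one in the OSD startup log ("ceph version 15.2.17 (<hash>) octopus (stable)"), and the function name is hypothetical. OSDs that are down do not report a version and will not be counted.

```python
import json
import re
from collections import Counter
from typing import Dict

# Extracts the release name that sits between the commit hash and "(stable)".
RELEASE = re.compile(r"\) (\w+) \(")


def osd_release_counts(versions_json: str) -> Dict[str, int]:
    """Count running OSDs per release name from `ceph versions` JSON output."""
    osd_versions = json.loads(versions_json).get("osd", {})
    counts: Counter = Counter()
    for version_string, number in osd_versions.items():
        match = RELEASE.search(version_string)
        # Fall back to the raw string if the format is not as assumed.
        counts[match.group(1) if match else version_string] += number
    return dict(counts)
```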

—
Mark Schouten, CTO
Tuxis B.V.
mark@xxxxxxxx / +31 318 200208


------ Original Message ------
From "Jan Pekař - Imatic" <jan.pekar@xxxxxxxxx>
To ceph-users@xxxxxxx
Date 1/12/2023 5:53:02 PM
Subject OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

Hi all,

I have a problem upgrading Nautilus to Octopus on my OSD.

Upgrading the mon and mgr went fine, but the first OSD got stuck at

2023-01-12T09:25:54.122+0100 7f49ff3eae00 1 osd.0 126556 init upgrade snap_mapper (first start as octopus)

and there was no activity after that for more than 48 hours. No disk activity.

I restarted the OSD many times and nothing changed.

It is an old FileStore OSD on an XFS filesystem. Is the upgrade to snap mapper 2 reliable? What is the OSD waiting for? Can I start the OSD without the upgrade and get the cluster healthy with the old snap structure? Or should I skip the Octopus upgrade and go directly to Pacific (is some bug backport missing)?

Thank you for your help; I'm sending some logs below.

Log shows

2023-01-09T19:12:49.471+0100 7f41f60f1e00 0 ceph version 15.2.17 (694d03a6f6c6e9f814446223549caf9a9f60dba0) octopus (stable), process ceph-osd, pid 2566563
2023-01-09T19:12:49.471+0100 7f41f60f1e00 0 pidfile_write: ignore empty --pid-file
2023-01-09T19:12:49.499+0100 7f41f60f1e00 -1 missing 'type' file, inferring filestore from current/ dir
2023-01-09T19:12:49.531+0100 7f41f60f1e00 0 starting osd.0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-09T19:12:49.531+0100 7f41f60f1e00 -1 Falling back to public interface
2023-01-09T19:12:49.871+0100 7f41f60f1e00 0 load: jerasure load: lrc load: isa
2023-01-09T19:12:49.875+0100 7f41f60f1e00 0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.883+0100 7f41f60f1e00 0 osd.0:0.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00 0 osd.0:1.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00 0 osd.0:2.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00 0 osd.0:3.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00 0 osd.0:4.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00 0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.927+0100 7f41f60f1e00 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice() is disabled via 'filestore splice' config option
2023-01-09T19:12:49.983+0100 7f41f60f1e00 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2023-01-09T19:12:49.983+0100 7f41f60f1e00 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2023-01-09T19:12:50.015+0100 7f41f60f1e00 0 filestore(/var/lib/ceph/osd/ceph-0) start omap initiation
2023-01-09T19:12:50.079+0100 7f41f60f1e00 1 leveldb: Recovering log #165531
2023-01-09T19:12:50.083+0100 7f41f60f1e00 1 leveldb: Level-0 table #165533: started
2023-01-09T19:12:50.235+0100 7f41f60f1e00 1 leveldb: Level-0 table #165533: 1598 bytes OK
2023-01-09T19:12:50.583+0100 7f41f60f1e00 1 leveldb: Delete type=0 #165531
2023-01-09T19:12:50.615+0100 7f41f60f1e00 1 leveldb: Delete type=3 #165529
2023-01-09T19:12:51.339+0100 7f41f60f1e00 0 filestore(/var/lib/ceph/osd/ceph-0) mount(1861): enabling WRITEAHEAD journal mode: checkpoint is not enabled
2023-01-09T19:12:51.379+0100 7f41f60f1e00 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block size 4096 bytes, directio = 1, aio = 1
2023-01-09T19:12:51.931+0100 7f41f60f1e00 -1 journal do_read_entry(243675136): bad header magic
2023-01-09T19:12:51.939+0100 7f41f60f1e00 1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block size 4096 bytes, directio = 1, aio = 1
2023-01-09T19:12:51.943+0100 7f41f60f1e00 1 filestore(/var/lib/ceph/osd/ceph-0) upgrade(1466)
2023-01-09T19:12:52.015+0100 7f41f60f1e00 1 osd.0 126556 init upgrade snap_mapper (first start as octopus)

lsof shows

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ceph-osd 225860 ceph cwd DIR 9,127 4096 2 /
ceph-osd 225860 ceph rtd DIR 9,127 4096 2 /
ceph-osd 225860 ceph txt REG 9,127 31762544 5021 /usr/bin/ceph-osd
ceph-osd 225860 ceph mem REG 8,70 2147237 68104224 /var/lib/ceph/osd/ceph-0/current/omap/165546.ldb
ceph-osd 225860 ceph mem REG 8,70 2147792 68104190 /var/lib/ceph/osd/ceph-0/current/omap/165545.ldb
ceph-osd 225860 ceph mem REG 8,70 2147689 68104240 /var/lib/ceph/osd/ceph-0/current/omap/165466.ldb
ceph-osd 225860 ceph mem REG 8,70 2142721 68102679 /var/lib/ceph/osd/ceph-0/current/omap/165544.ldb
ceph-osd 225860 ceph mem REG 8,70 2142677 68104239 /var/lib/ceph/osd/ceph-0/current/omap/165465.ldb
ceph-osd 225860 ceph mem REG 8,70 2144979 68078254 /var/lib/ceph/osd/ceph-0/current/omap/165543.ldb
ceph-osd 225860 ceph mem REG 8,70 2143705 68163491 /var/lib/ceph/osd/ceph-0/current/omap/165526.ldb
ceph-osd 225860 ceph mem REG 8,70 2141468 68163492 /var/lib/ceph/osd/ceph-0/current/omap/165527.ldb
ceph-osd 225860 ceph mem REG 8,70 145986 68018644 /var/lib/ceph/osd/ceph-0/current/omap/165541.ldb
ceph-osd 225860 ceph mem REG 8,70 2143434 68163490 /var/lib/ceph/osd/ceph-0/current/omap/165525.ldb
ceph-osd 225860 ceph mem REG 8,70 2136002 68122351 /var/lib/ceph/osd/ceph-0/current/omap/165472.ldb
ceph-osd 225860 ceph mem REG 8,70 1965262 68119647 /var/lib/ceph/osd/ceph-0/current/omap/165467.ldb
ceph-osd 225860 ceph mem REG 8,70 2145206 68104229 /var/lib/ceph/osd/ceph-0/current/omap/165464.ldb
ceph-osd 225860 ceph mem REG 8,70 61600 68002130 /var/lib/ceph/osd/ceph-0/current/omap/165536.ldb
ceph-osd 225860 ceph mem REG 8,70 352689 67945734 /var/lib/ceph/osd/ceph-0/current/omap/165530.ldb
ceph-osd 225860 ceph mem REG 8,70 82500 67480519 /var/lib/ceph/osd/ceph-0/current/omap/165539.ldb
ceph-osd 225860 ceph mem REG 9,127 189200 2878 /usr/lib/ceph/erasure-code/libec_isa.so
ceph-osd 225860 ceph mem REG 9,127 2021136 3033 /usr/lib/ceph/erasure-code/libec_lrc.so
ceph-osd 225860 ceph mem REG 9,127 360544 2942 /usr/lib/ceph/erasure-code/libec_jerasure.so
ceph-osd 225860 ceph mem REG 9,127 55792 15227 /lib/x86_64-linux-gnu/libnss_files-2.28.so
ceph-osd 225860 ceph mem REG 9,127 22368 133997 /usr/lib/x86_64-linux-gnu/liburcu-common.so.6.0.0
ceph-osd 225860 ceph mem REG 9,127 42976 133996 /usr/lib/x86_64-linux-gnu/liburcu-cds.so.6.0.0
ceph-osd 225860 ceph mem REG 9,127 51128 149821 /usr/lib/x86_64-linux-gnu/liblttng-ust-tracepoint.so.0.0.0
ceph-osd 225860 ceph mem REG 8,70 33977 68116674 /var/lib/ceph/osd/ceph-0/current/omap/165547.ldb
ceph-osd 225860 ceph DEL REG 0,19 64261192 /[aio]
ceph-osd 225860 ceph DEL REG 0,19 64261191 /[aio]
ceph-osd 225860 ceph mem REG 9,127 158400 2523 /lib/x86_64-linux-gnu/liblzma.so.5.2.4
ceph-osd 225860 ceph mem REG 9,127 135064 12635 /lib/x86_64-linux-gnu/libnl-3.so.200.26.0
ceph-osd 225860 ceph mem REG 9,127 488272 133248 /usr/lib/x86_64-linux-gnu/libnl-route-3.so.200.26.0
ceph-osd 225860 ceph mem REG 9,127 35808 15263 /lib/x86_64-linux-gnu/librt-2.28.so
ceph-osd 225860 ceph mem REG 9,127 55512 133042 /usr/lib/x86_64-linux-gnu/libunwind.so.8.0.1
ceph-osd 225860 ceph mem REG 9,127 30776 16463 /lib/x86_64-linux-gnu/libuuid.so.1.3.0
ceph-osd 225860 ceph mem REG 9,127 1820400 15070 /lib/x86_64-linux-gnu/libc-2.28.so
ceph-osd 225860 ceph mem REG 9,127 100712 7687 /lib/x86_64-linux-gnu/libgcc_s.so.1
ceph-osd 225860 ceph mem REG 9,127 1579448 15202 /lib/x86_64-linux-gnu/libm-2.28.so
ceph-osd 225860 ceph mem REG 9,127 1570256 133576 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
ceph-osd 225860 ceph mem REG 9,127 112800 133289 /usr/lib/x86_64-linux-gnu/librdmacm.so.1.1.22.1
ceph-osd 225860 ceph mem REG 9,127 104496 133287 /usr/lib/x86_64-linux-gnu/libibverbs.so.1.5.22.1
ceph-osd 225860 ceph mem REG 9,127 149704 5527 /lib/x86_64-linux-gnu/libudev.so.1.6.13
ceph-osd 225860 ceph mem REG 9,127 146968 15251 /lib/x86_64-linux-gnu/libpthread-2.28.so
ceph-osd 225860 ceph mem REG 9,127 3044224 135371 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
ceph-osd 225860 ceph mem REG 9,127 14592 15128 /lib/x86_64-linux-gnu/libdl-2.28.so
ceph-osd 225860 ceph mem REG 9,127 93000 15255 /lib/x86_64-linux-gnu/libresolv-2.28.so
ceph-osd 225860 ceph mem REG 9,127 301288 134118 /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4.5.3
ceph-osd 225860 ceph mem REG 9,127 14184 134682 /usr/lib/x86_64-linux-gnu/libaio.so.1.0.1
ceph-osd 225860 ceph mem REG 9,127 117184 2738 /lib/x86_64-linux-gnu/libz.so.1.2.11
ceph-osd 225860 ceph mem REG 9,127 121184 136027 /usr/lib/x86_64-linux-gnu/liblz4.so.1.8.3
ceph-osd 225860 ceph mem REG 9,127 31104 133334 /usr/lib/x86_64-linux-gnu/libsnappy.so.1.1.7
ceph-osd 225860 ceph mem REG 9,127 378160 133086 /usr/lib/x86_64-linux-gnu/libleveldb.so.1d.20
ceph-osd 225860 ceph mem REG 9,127 256120 7650 /lib/x86_64-linux-gnu/libfuse.so.2.9.9
ceph-osd 225860 ceph mem REG 9,127 343008 16890 /lib/x86_64-linux-gnu/libblkid.so.1.1.0
ceph-osd 225860 ceph mem REG 8,70 1546 68022625 /var/lib/ceph/osd/ceph-0/current/omap/165542.ldb
ceph-osd 225860 ceph mem REG 8,70 1598 67109142 /var/lib/ceph/osd/ceph-0/current/omap/165533.ldb
ceph-osd 225860 ceph mem REG 9,127 35112 133884 /usr/lib/x86_64-linux-gnu/liburcu-bp.so.6.0.0
ceph-osd 225860 ceph mem REG 9,127 165632 15000 /lib/x86_64-linux-gnu/ld-2.28.so
ceph-osd 225860 ceph 0r CHR 1,3 0t0 6 /dev/null
ceph-osd 225860 ceph 1u unix 0x00000000ab7b4943 0t0 64259884 type=STREAM
ceph-osd 225860 ceph 2u unix 0x00000000ab7b4943 0t0 64259884 type=STREAM
ceph-osd 225860 ceph 3u a_inode 0,14 0 11090 [eventpoll]
ceph-osd 225860 ceph 4r FIFO 0,13 0t0 64259943 pipe
ceph-osd 225860 ceph 5w FIFO 0,13 0t0 64259943 pipe
ceph-osd 225860 ceph 6u a_inode 0,14 0 11090 [eventpoll]
ceph-osd 225860 ceph 7r FIFO 0,13 0t0 64259944 pipe
ceph-osd 225860 ceph 8w FIFO 0,13 0t0 64259944 pipe
ceph-osd 225860 ceph 9u a_inode 0,14 0 11090 [eventpoll]
ceph-osd 225860 ceph 10r FIFO 0,13 0t0 64259945 pipe
ceph-osd 225860 ceph 11w FIFO 0,13 0t0 64259945 pipe
ceph-osd 225860 ceph 12w REG 9,127 8765 3314 /var/log/ceph/ceph-osd.0.log
ceph-osd 225860 ceph 13r FIFO 0,13 0t0 64259950 pipe
ceph-osd 225860 ceph 14w FIFO 0,13 0t0 64259950 pipe
ceph-osd 225860 ceph 15u unix 0x0000000097fdc46a 0t0 64259951 /var/run/ceph/ceph-osd.0.asok type=STREAM
ceph-osd 225860 ceph 24r FIFO 0,13 0t0 64261188 pipe
ceph-osd 225860 ceph 25w FIFO 0,13 0t0 64261188 pipe
ceph-osd 225860 ceph 26r FIFO 0,13 0t0 64261189 pipe
ceph-osd 225860 ceph 27w FIFO 0,13 0t0 64261189 pipe
ceph-osd 225860 ceph 28uW REG 8,70 37 21 /var/lib/ceph/osd/ceph-0/fsid
ceph-osd 225860 ceph 29r DIR 8,70 262 16 /var/lib/ceph/osd/ceph-0
ceph-osd 225860 ceph 30r DIR 8,70 24576 27 /var/lib/ceph/osd/ceph-0/current
ceph-osd 225860 ceph 31u REG 8,70 10 28 /var/lib/ceph/osd/ceph-0/current/commit_op_seq
ceph-osd 225860 ceph 32uW REG 8,70 0 67108882 /var/lib/ceph/osd/ceph-0/current/omap/LOCK
ceph-osd 225860 ceph 33w REG 8,70 813568 68035819 /var/lib/ceph/osd/ceph-0/current/omap/165540.log
ceph-osd 225860 ceph 34w REG 8,70 32068 68072533 /var/lib/ceph/osd/ceph-0/current/omap/MANIFEST-165538
ceph-osd 225860 ceph 36u BLK 8,65 0t0 422 /dev/sde1
ceph-osd 225860 ceph 37u REG 8,70 538 134217745 /var/lib/ceph/osd/ceph-0/current/meta/DIR_E/DIR_D/DIR_C/osd\\usuperblock__0_23C2FCDE__none
ceph-osd 225860 ceph 38u REG 8,70 22453 1075330317 /var/lib/ceph/osd/ceph-0/current/meta/DIR_4/DIR_B/DIR_4/osdmap.126556__0_DF0524B4__none

strace shows

strace: Process 225860 attached
futex(0x555f3e1c05c8, FUTEX_WAIT_PRIVATE, 0, NULL


-- ============
Ing. Jan Pekař
jan.pekar@xxxxxxxxx
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
https://www.imatic.cz | +420326555326
============
--
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx