Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

Hi,

Thanks. Someone told me that we could just destroy the FileStore OSDs and recreate them as BlueStore, even though the cluster is partially upgraded. So I guess I'll just do that (unless someone here tells me that's a terrible idea :)).

—
Mark Schouten, CTO
Tuxis B.V.
mark@xxxxxxxx / +31 318 200208


------ Original Message ------
From "Eugen Block" <eblock@xxxxxx>
To ceph-users@xxxxxxx
Date 2/7/2023 4:58:11 PM
Subject Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

Hi,

I don't really have an answer, but there was a bug with the snap mapper [1]. [2] is supposed to verify consistency, but Octopus is EOL, so you might need to upgrade directly to Pacific. That's what we did on multiple clusters (N --> P) a few months back. I'm not sure whether it would just work if you already have a couple of Octopus daemons; maybe you can try it on a test cluster.

Regards,
Eugen

[1] https://tracker.ceph.com/issues/56147
[2] https://github.com/ceph/ceph/pull/47388

Quoting Mark Schouten <mark@xxxxxxxx>:

Hi,

I’m seeing the same thing …

With debug logging enabled I see this:
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy converted 1410 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy converted 1440 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy converted 1470 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy converted 1500 keys

It stops at 1500 keys, and then nothing happens.
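
For anyone who wants to check how far the conversion got on their own OSD, here is a minimal sketch that scans a debug log for the convert_legacy lines shown above and reports the most recent key count. It assumes the log format quoted in this message; the function name `last_converted_count` is only illustrative and is not part of Ceph.

```python
import re

# Matches debug lines such as:
#   ... 10  snap_mapper.convert_legacy converted 1500 keys
CONVERT_RE = re.compile(r"snap_mapper\.convert_legacy converted (\d+) keys")


def last_converted_count(lines):
    """Return the key count from the most recent convert_legacy line, or None."""
    last = None
    for line in lines:
        match = CONVERT_RE.search(line)
        if match:
            last = int(match.group(1))
    return last


# Sample taken from the log excerpt quoted above.
sample = [
    "2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy converted 1470 keys",
    "2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy converted 1500 keys",
]
print(last_converted_count(sample))
```

If the reported count stays the same across restarts, that would suggest the conversion stalls at the same point each time rather than progressing slowly.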

I'm now stuck with a cluster that has 4 OSDs on Octopus, 10 on Nautilus, and one down. A hint on how to work around this is welcome :)

—
Mark Schouten, CTO
Tuxis B.V.
mark@xxxxxxxx / +31 318 200208


------ Original Message ------
From "Jan Pekař - Imatic" <jan.pekar@xxxxxxxxx>
To ceph-users@xxxxxxx
Date 1/12/2023 5:53:02 PM
Subject OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

Hi all,

I have a problem upgrading my OSDs from Nautilus to Octopus.

Upgrading the mons and mgrs went fine, but the first OSD got stuck at

2023-01-12T09:25:54.122+0100 7f49ff3eae00  1 osd.0 126556 init  upgrade snap_mapper (first start as octopus)

and there was no activity after that for more than 48 hours, including no disk activity.

I restarted the OSD many times and nothing changed.

It is an old FileStore OSD on an XFS filesystem. Is the upgrade to snap mapper 2 reliable? What is the OSD waiting for? Can I start the OSD without the upgrade and get the cluster healthy with the old snap structure? Or should I skip the Octopus upgrade and go directly to Pacific (is some bug backport missing)?

Thank you for your help. I'm sending some logs below.

Log shows

2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 ceph version 15.2.17  (694d03a6f6c6e9f814446223549caf9a9f60dba0) octopus (stable),  process ceph-osd, pid 2566563
2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 pidfile_write: ignore  empty --pid-file
2023-01-09T19:12:49.499+0100 7f41f60f1e00 -1 missing 'type' file,  inferring filestore from current/ dir
2023-01-09T19:12:49.531+0100 7f41f60f1e00  0 starting osd.0  osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-09T19:12:49.531+0100 7f41f60f1e00 -1 Falling back to public  interface
2023-01-09T19:12:49.871+0100 7f41f60f1e00  0 load: jerasure load:  lrc load: isa
2023-01-09T19:12:49.875+0100 7f41f60f1e00  0  filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:0.OSDShard using  op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:1.OSDShard using  op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:2.OSDShard using  op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:3.OSDShard using  op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:4.OSDShard using  op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0  filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0  genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:  FIEMAP ioctl is disabled via 'filestore fiemap' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0  genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:  SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole'  config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0  genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:  splice() is disabled via 'filestore splice' config option
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0  genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:  syncfs(2) syscall fully supported (by glibc and kernel)
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0  xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature:  extsize is disabled by conf
2023-01-09T19:12:50.015+0100 7f41f60f1e00  0  filestore(/var/lib/ceph/osd/ceph-0) start omap initiation
2023-01-09T19:12:50.079+0100 7f41f60f1e00  1 leveldb: Recovering log #165531
2023-01-09T19:12:50.083+0100 7f41f60f1e00  1 leveldb: Level-0 table  #165533: started
2023-01-09T19:12:50.235+0100 7f41f60f1e00  1 leveldb: Level-0 table  #165533: 1598 bytes OK
2023-01-09T19:12:50.583+0100 7f41f60f1e00  1 leveldb: Delete type=0 #165531

2023-01-09T19:12:50.615+0100 7f41f60f1e00  1 leveldb: Delete type=3 #165529

2023-01-09T19:12:51.339+0100 7f41f60f1e00  0  filestore(/var/lib/ceph/osd/ceph-0) mount(1861): enabling  WRITEAHEAD journal mode: checkpoint is not enabled
2023-01-09T19:12:51.379+0100 7f41f60f1e00  1 journal _open  /var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block  size 4096 bytes, directio = 1, aio = 1
2023-01-09T19:12:51.931+0100 7f41f60f1e00 -1 journal  do_read_entry(243675136): bad header magic
2023-01-09T19:12:51.939+0100 7f41f60f1e00  1 journal _open  /var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block  size 4096 bytes, directio = 1, aio = 1
2023-01-09T19:12:51.943+0100 7f41f60f1e00  1  filestore(/var/lib/ceph/osd/ceph-0) upgrade(1466)
2023-01-09T19:12:52.015+0100 7f41f60f1e00  1 osd.0 126556 init  upgrade snap_mapper (first start as octopus)
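
To quantify how long the OSD has been silent since its last log line, a small sketch along these lines may help. It assumes the timestamp format seen in the log above (ISO date, milliseconds, numeric UTC offset); the helper names `parse_ts` and `stalled_for` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout assumed from the log above, e.g. 2023-01-09T19:12:52.015+0100
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_ts(line):
    """Parse the leading timestamp of a log line; return None if there is none."""
    parts = line.split(maxsplit=1)
    token = parts[0] if parts else ""
    try:
        return datetime.strptime(token, TS_FORMAT)
    except ValueError:
        return None


def stalled_for(lines, now):
    """Return the time elapsed between the last timestamped line and `now`."""
    stamps = [ts for ts in map(parse_ts, lines) if ts is not None]
    if not stamps:
        return None
    return now - stamps[-1]


log_tail = [
    "2023-01-09T19:12:51.943+0100 7f41f60f1e00  1  filestore(/var/lib/ceph/osd/ceph-0) upgrade(1466)",
    "2023-01-09T19:12:52.015+0100 7f41f60f1e00  1 osd.0 126556 init  upgrade snap_mapper (first start as octopus)",
]
# In practice `now` would be datetime.now(timezone.utc); a fixed value keeps the example reproducible.
now = datetime(2023, 1, 11, 19, 12, 52, 15000, tzinfo=timezone(timedelta(hours=1)))
print(stalled_for(log_tail, now))
```

Passing a timezone-aware `now` matters here, because the parsed log timestamps carry a UTC offset and cannot be subtracted from a naive datetime.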

lsof shows

COMMAND     PID USER   FD      TYPE             DEVICE SIZE/OFF        NODE NAME
ceph-osd 225860 ceph  cwd       DIR              9,127 4096          2 /
ceph-osd 225860 ceph  rtd       DIR              9,127 4096          2 /
ceph-osd 225860 ceph  txt       REG              9,127 31762544        5021 /usr/bin/ceph-osd
ceph-osd 225860 ceph  mem       REG               8,70  2147237  68104224 /var/lib/ceph/osd/ceph-0/current/omap/165546.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2147792  68104190 /var/lib/ceph/osd/ceph-0/current/omap/165545.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2147689  68104240 /var/lib/ceph/osd/ceph-0/current/omap/165466.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2142721  68102679 /var/lib/ceph/osd/ceph-0/current/omap/165544.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2142677  68104239 /var/lib/ceph/osd/ceph-0/current/omap/165465.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2144979  68078254 /var/lib/ceph/osd/ceph-0/current/omap/165543.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2143705  68163491 /var/lib/ceph/osd/ceph-0/current/omap/165526.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2141468  68163492 /var/lib/ceph/osd/ceph-0/current/omap/165527.ldb
ceph-osd 225860 ceph  mem       REG               8,70   145986  68018644 /var/lib/ceph/osd/ceph-0/current/omap/165541.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2143434  68163490 /var/lib/ceph/osd/ceph-0/current/omap/165525.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2136002  68122351 /var/lib/ceph/osd/ceph-0/current/omap/165472.ldb
ceph-osd 225860 ceph  mem       REG               8,70  1965262  68119647 /var/lib/ceph/osd/ceph-0/current/omap/165467.ldb
ceph-osd 225860 ceph  mem       REG               8,70  2145206  68104229 /var/lib/ceph/osd/ceph-0/current/omap/165464.ldb
ceph-osd 225860 ceph  mem       REG               8,70    61600  68002130 /var/lib/ceph/osd/ceph-0/current/omap/165536.ldb
ceph-osd 225860 ceph  mem       REG               8,70   352689  67945734 /var/lib/ceph/osd/ceph-0/current/omap/165530.ldb
ceph-osd 225860 ceph  mem       REG               8,70    82500  67480519 /var/lib/ceph/osd/ceph-0/current/omap/165539.ldb
ceph-osd 225860 ceph  mem       REG              9,127 189200        2878 /usr/lib/ceph/erasure-code/libec_isa.so
ceph-osd 225860 ceph  mem       REG              9,127 2021136        3033 /usr/lib/ceph/erasure-code/libec_lrc.so
ceph-osd 225860 ceph  mem       REG              9,127 360544        2942 /usr/lib/ceph/erasure-code/libec_jerasure.so
ceph-osd 225860 ceph  mem       REG              9,127 55792       15227 /lib/x86_64-linux-gnu/libnss_files-2.28.so
ceph-osd 225860 ceph  mem       REG              9,127 22368      133997 /usr/lib/x86_64-linux-gnu/liburcu-common.so.6.0.0
ceph-osd 225860 ceph  mem       REG              9,127 42976      133996 /usr/lib/x86_64-linux-gnu/liburcu-cds.so.6.0.0
ceph-osd 225860 ceph  mem       REG              9,127 51128      149821 /usr/lib/x86_64-linux-gnu/liblttng-ust-tracepoint.so.0.0.0
ceph-osd 225860 ceph  mem       REG               8,70    33977  68116674 /var/lib/ceph/osd/ceph-0/current/omap/165547.ldb
ceph-osd 225860 ceph  DEL       REG               0,19 64261192 /[aio]
ceph-osd 225860 ceph  DEL       REG               0,19 64261191 /[aio]
ceph-osd 225860 ceph  mem       REG              9,127 158400        2523 /lib/x86_64-linux-gnu/liblzma.so.5.2.4
ceph-osd 225860 ceph  mem       REG              9,127 135064       12635 /lib/x86_64-linux-gnu/libnl-3.so.200.26.0
ceph-osd 225860 ceph  mem       REG              9,127 488272      133248 /usr/lib/x86_64-linux-gnu/libnl-route-3.so.200.26.0
ceph-osd 225860 ceph  mem       REG              9,127 35808       15263 /lib/x86_64-linux-gnu/librt-2.28.so
ceph-osd 225860 ceph  mem       REG              9,127 55512      133042 /usr/lib/x86_64-linux-gnu/libunwind.so.8.0.1
ceph-osd 225860 ceph  mem       REG              9,127 30776       16463 /lib/x86_64-linux-gnu/libuuid.so.1.3.0
ceph-osd 225860 ceph  mem       REG              9,127 1820400       15070 /lib/x86_64-linux-gnu/libc-2.28.so
ceph-osd 225860 ceph  mem       REG              9,127 100712        7687 /lib/x86_64-linux-gnu/libgcc_s.so.1
ceph-osd 225860 ceph  mem       REG              9,127 1579448       15202 /lib/x86_64-linux-gnu/libm-2.28.so
ceph-osd 225860 ceph  mem       REG              9,127 1570256      133576 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
ceph-osd 225860 ceph  mem       REG              9,127 112800      133289 /usr/lib/x86_64-linux-gnu/librdmacm.so.1.1.22.1
ceph-osd 225860 ceph  mem       REG              9,127 104496      133287 /usr/lib/x86_64-linux-gnu/libibverbs.so.1.5.22.1
ceph-osd 225860 ceph  mem       REG              9,127 149704        5527 /lib/x86_64-linux-gnu/libudev.so.1.6.13
ceph-osd 225860 ceph  mem       REG              9,127 146968       15251 /lib/x86_64-linux-gnu/libpthread-2.28.so
ceph-osd 225860 ceph  mem       REG              9,127 3044224      135371 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
ceph-osd 225860 ceph  mem       REG              9,127 14592       15128 /lib/x86_64-linux-gnu/libdl-2.28.so
ceph-osd 225860 ceph  mem       REG              9,127 93000       15255 /lib/x86_64-linux-gnu/libresolv-2.28.so
ceph-osd 225860 ceph  mem       REG              9,127 301288      134118 /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4.5.3
ceph-osd 225860 ceph  mem       REG              9,127 14184      134682 /usr/lib/x86_64-linux-gnu/libaio.so.1.0.1
ceph-osd 225860 ceph  mem       REG              9,127 117184        2738 /lib/x86_64-linux-gnu/libz.so.1.2.11
ceph-osd 225860 ceph  mem       REG              9,127 121184      136027 /usr/lib/x86_64-linux-gnu/liblz4.so.1.8.3
ceph-osd 225860 ceph  mem       REG              9,127 31104      133334 /usr/lib/x86_64-linux-gnu/libsnappy.so.1.1.7
ceph-osd 225860 ceph  mem       REG              9,127 378160      133086 /usr/lib/x86_64-linux-gnu/libleveldb.so.1d.20
ceph-osd 225860 ceph  mem       REG              9,127 256120        7650 /lib/x86_64-linux-gnu/libfuse.so.2.9.9
ceph-osd 225860 ceph  mem       REG              9,127 343008       16890 /lib/x86_64-linux-gnu/libblkid.so.1.1.0
ceph-osd 225860 ceph  mem       REG               8,70     1546  68022625 /var/lib/ceph/osd/ceph-0/current/omap/165542.ldb
ceph-osd 225860 ceph  mem       REG               8,70     1598  67109142 /var/lib/ceph/osd/ceph-0/current/omap/165533.ldb
ceph-osd 225860 ceph  mem       REG              9,127 35112      133884 /usr/lib/x86_64-linux-gnu/liburcu-bp.so.6.0.0
ceph-osd 225860 ceph  mem       REG              9,127 165632       15000 /lib/x86_64-linux-gnu/ld-2.28.so
ceph-osd 225860 ceph    0r      CHR                1,3 0t0           6 /dev/null
ceph-osd 225860 ceph    1u     unix 0x00000000ab7b4943      0t0  64259884 type=STREAM
ceph-osd 225860 ceph    2u     unix 0x00000000ab7b4943      0t0  64259884 type=STREAM
ceph-osd 225860 ceph    3u  a_inode               0,14 0      11090  [eventpoll]
ceph-osd 225860 ceph    4r     FIFO               0,13      0t0  64259943 pipe
ceph-osd 225860 ceph    5w     FIFO               0,13      0t0  64259943 pipe
ceph-osd 225860 ceph    6u  a_inode               0,14 0      11090  [eventpoll]
ceph-osd 225860 ceph    7r     FIFO               0,13      0t0  64259944 pipe
ceph-osd 225860 ceph    8w     FIFO               0,13      0t0  64259944 pipe
ceph-osd 225860 ceph    9u  a_inode               0,14 0      11090  [eventpoll]
ceph-osd 225860 ceph   10r     FIFO               0,13      0t0  64259945 pipe
ceph-osd 225860 ceph   11w     FIFO               0,13      0t0  64259945 pipe
ceph-osd 225860 ceph   12w      REG              9,127 8765        3314 /var/log/ceph/ceph-osd.0.log
ceph-osd 225860 ceph   13r     FIFO               0,13      0t0  64259950 pipe
ceph-osd 225860 ceph   14w     FIFO               0,13      0t0  64259950 pipe
ceph-osd 225860 ceph   15u     unix 0x0000000097fdc46a      0t0  64259951 /var/run/ceph/ceph-osd.0.asok type=STREAM
ceph-osd 225860 ceph   24r     FIFO               0,13      0t0  64261188 pipe
ceph-osd 225860 ceph   25w     FIFO               0,13      0t0  64261188 pipe
ceph-osd 225860 ceph   26r     FIFO               0,13      0t0  64261189 pipe
ceph-osd 225860 ceph   27w     FIFO               0,13      0t0  64261189 pipe
ceph-osd 225860 ceph   28uW     REG               8,70 37          21 /var/lib/ceph/osd/ceph-0/fsid
ceph-osd 225860 ceph   29r      DIR               8,70 262          16 /var/lib/ceph/osd/ceph-0
ceph-osd 225860 ceph   30r      DIR               8,70 24576          27 /var/lib/ceph/osd/ceph-0/current
ceph-osd 225860 ceph   31u      REG               8,70 10          28 /var/lib/ceph/osd/ceph-0/current/commit_op_seq
ceph-osd 225860 ceph   32uW     REG               8,70        0  67108882 /var/lib/ceph/osd/ceph-0/current/omap/LOCK
ceph-osd 225860 ceph   33w      REG               8,70   813568  68035819 /var/lib/ceph/osd/ceph-0/current/omap/165540.log
ceph-osd 225860 ceph   34w      REG               8,70    32068  68072533 /var/lib/ceph/osd/ceph-0/current/omap/MANIFEST-165538
ceph-osd 225860 ceph   36u      BLK               8,65 0t0         422 /dev/sde1
ceph-osd 225860 ceph   37u      REG               8,70      538  134217745  /var/lib/ceph/osd/ceph-0/current/meta/DIR_E/DIR_D/DIR_C/osd\\usuperblock__0_23C2FCDE__none
ceph-osd 225860 ceph   38u      REG               8,70    22453  1075330317  /var/lib/ceph/osd/ceph-0/current/meta/DIR_4/DIR_B/DIR_4/osdmap.126556__0_DF0524B4__none

strace shows

strace: Process 225860 attached
futex(0x555f3e1c05c8, FUTEX_WAIT_PRIVATE, 0, NULL


-- ============
Ing. Jan Pekař
jan.pekar@xxxxxxxxx
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
https://www.imatic.cz | +420326555326
============
--
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx