Re: Stuck in upgrade process to reef

Hi Jan,

this doesn't look like RocksDB corruption but rather like some BlueStore metadata inconsistency. Also, the assertion backtrace in the new log looks completely different from the original one. So, in an attempt to find a systematic pattern, I'd suggest running fsck with verbose logging for every failing OSD. Relevant command line:

CEPH_ARGS="--log-file osd.N.log --debug-bluestore 5/20" bin/ceph-bluestore-tool --path <path-to-osd> --command fsck

This is unlikely to fix anything; it's rather a way to collect logs to get better insight.
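If you have many failing OSDs on the node, that can be scripted. A minimal sketch, assuming a non-containerized layout with OSD data under /var/lib/ceph/osd/ceph-N, OSD ids 0-11, and ceph-bluestore-tool on the PATH (with a cephadm deployment you'd run the same command inside `cephadm shell --name osd.N`; adjust ids and paths for your cluster):

```shell
#!/bin/sh
# Build the fsck command line for one OSD id and data path.
fsck_cmd() {
  osd_id="$1"; osd_path="$2"
  printf 'CEPH_ARGS="--log-file osd.%s.log --debug-bluestore 5/20" ceph-bluestore-tool --path %s --command fsck\n' \
    "$osd_id" "$osd_path"
}

# Print the command for each OSD first so you can review it;
# pipe the output to `sh` once the ids and paths are right.
for id in 0 1 2 3 4 5 6 7 8 9 10 11; do
  fsck_cmd "$id" "/var/lib/ceph/osd/ceph-$id"
done
```

Each run leaves a separate osd.N.log to compare across OSDs.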


Additionally, you might want to run a similar fsck for a couple of healthy OSDs - I'm curious whether it succeeds, as I have a feeling that the problem with the crashing OSDs was hidden before the upgrade and was revealed by it rather than caused by it.


Thanks,

Igor

On 12/29/2023 3:28 PM, Jan Marek wrote:
Hello Igor,

I'm attaching the part of the syslog created while starting OSD.0.

Many thanks for help.

Sincerely
Jan Marek

On Wed, Dec 27, 2023 at 04:42:56 CET, Igor Fedotov wrote:
Hi Jan,

IIUC the attached log is for ceph-kvstore-tool, right?

Can you please share full OSD startup log as well?


Thanks,

Igor

On 12/27/2023 4:30 PM, Jan Marek wrote:
Hello,

I have a problem with my Ceph cluster (3x mon nodes, 6x OSD nodes; every
OSD node has 12 rotational disks and one NVMe device for the
BlueStore DB). Ceph was installed by the ceph orchestrator and the
OSDs use BlueFS storage.

I started the upgrade from version 17.2.6 to 18.2.1 by
invoking:

ceph orch upgrade start --ceph-version 18.2.1

After upgrading the mon and mgr processes, the orchestrator tried to
upgrade the first OSD node, but its OSDs keep falling down.

I've stopped the upgrade process, but I have one OSD node
completely down.

After the upgrade I got some error messages and found
/var/lib/ceph/crashxxxx directories; I'm attaching the files
I found there to this message.

Please, can you advise what I can do now? It seems that RocksDB
is either incompatible or corrupted :-(

Thanks in advance.

Sincerely
Jan Marek

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
