Hi Igor,

it has the SSD OSDs down; the HDD OSDs are running just fine. For now I don't want to make a bad situation worse and will wait for recovery to finish. The inactive PGs are activating very slowly.

By the way, there are 2 out of 4 OSDs up in the replicated 4(2) pool. Why are PGs even inactive here? This "feature" is new in octopus; I reported it as a bug about 2 months ago. Testing with mimic, I cannot reproduce this problem: https://tracker.ceph.com/issues/56995

I found this in the syslog, maybe it helps:

kernel: task:bstore_kv_sync state:D stack: 0 pid:3646032 ppid:3645340 flags:0x00000000
kernel: Call Trace:
kernel:  __schedule+0x2a2/0x7e0
kernel:  schedule+0x4e/0xb0
kernel:  io_schedule+0x16/0x40
kernel:  wait_on_page_bit_common+0x15c/0x3e0
kernel:  ? __page_cache_alloc+0xb0/0xb0
kernel:  wait_on_page_bit+0x3f/0x50
kernel:  wait_on_page_writeback+0x26/0x70
kernel:  __filemap_fdatawait_range+0x98/0x100
kernel:  ? __filemap_fdatawrite_range+0xd8/0x110
kernel:  file_fdatawait_range+0x1a/0x30
kernel:  sync_file_range+0xc2/0xf0
kernel:  ksys_sync_file_range+0x41/0x80
kernel:  __x64_sys_sync_file_range+0x1e/0x30
kernel:  do_syscall_64+0x3b/0x90
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7ffbb6f77ae7
kernel: RSP: 002b:00007ffba478c3c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000115
kernel: RAX: ffffffffffffffda RBX: 000000000000002d RCX: 00007ffbb6f77ae7
kernel: RDX: 0000000000002000 RSI: 000000015f849000 RDI: 000000000000002d
kernel: RBP: 000000015f849000 R08: 0000000000000000 R09: 0000000000002000
kernel: R10: 0000000000000007 R11: 0000000000000293 R12: 0000000000002000
kernel: R13: 0000000000000007 R14: 0000000000000001 R15: 0000560a1ae20380
kernel: INFO: task bstore_kv_sync:3646117 blocked for more than 123 seconds.
kernel: Tainted: G E 5.14.13-1.el7.elrepo.x86_64 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

It is quite possible that this was the moment when these OSDs got stuck and were marked down. The time stamp is about right.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
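
A quick way to cross-check the 4(2) pool situation described above is to compare the pool's replication settings with what a stuck PG itself reports. This is only a minimal sketch, with <pool> and <pgid> as placeholders for the affected pool and one of the inactive PGs:

    ceph osd pool get <pool> size        # should report 4 for this pool
    ceph osd pool get <pool> min_size    # should report 2
    ceph pg dump_stuck inactive          # list the PGs that refuse to activate
    ceph pg <pgid> query                 # up/acting sets plus the recovery_state explaining the block

If min_size really is 2 and two replicas are up, the PGs should be able to activate, which would be consistent with the regression reported in the tracker issue above rather than a pool-policy problem.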
________________________________________
From: Igor Fedotov <igor.fedotov@xxxxxxxx>
Sent: 06 October 2022 13:45:17
To: Frank Schilder; ceph-users@xxxxxxx
Subject: Re: OSD crashes during upgrade mimic->octopus

From your response to Stefan I'm getting that one of the two damaged hosts has all OSDs down and unable to start. Is that correct?

If so, you can reboot it with no problem and proceed with manual compaction [and other experiments] quite "safely" for the rest of the cluster.

On 10/6/2022 2:35 PM, Frank Schilder wrote:
> Hi Igor,
>
> I can't access these drives. They have an OSD or LVM process hanging in D-state. Any attempt to do something with these gets stuck as well.
>
> I somehow need to wait for recovery to finish and protect the still running OSDs from crashing similarly badly.
>
> After we have full redundancy again and service is back, I can add the setting osd_compact_on_start=true and start rebooting servers. Right now I need to prevent the ship from sinking.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Igor Fedotov <igor.fedotov@xxxxxxxx>
> Sent: 06 October 2022 13:28:11
> To: Frank Schilder; ceph-users@xxxxxxx
> Subject: Re: OSD crashes during upgrade mimic->octopus
>
> IIUC the OSDs that expose "had timed out after 15" are failing to start up. Is that correct, or did I miss something? I meant trying compaction for them...
>
> On 10/6/2022 2:27 PM, Frank Schilder wrote:
>> Hi Igor,
>>
>> thanks for your response.
>>
>>> And what's the target Octopus release?
>> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
>>
>> I'm afraid I don't have the luxury right now to take OSDs down or add extra load with an on-line compaction. I would really appreciate a way to make the OSDs more crash-tolerant until I have full redundancy again. Is there a setting that increases the ops timeout, or is there a way to restrict the load to tolerable levels?
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Igor Fedotov <igor.fedotov@xxxxxxxx>
>> Sent: 06 October 2022 13:15
>> To: Frank Schilder; ceph-users@xxxxxxx
>> Subject: Re: OSD crashes during upgrade mimic->octopus
>>
>> Hi Frank,
>>
>> you might want to compact RocksDB with ceph-kvstore-tool for those OSDs which are showing
>>
>> "heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f1886536700' had timed out after 15"
>>
>> I have seen such an error pretty often after bulk data removal and the severe DB performance drop that follows.
>>
>> Thanks,
>> Igor
>
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
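
Spelled out as commands, the offline compaction Igor suggests and the two settings discussed in the thread would look roughly like this on an Octopus cluster. This is only a sketch: OSD id 0, the default /var/lib/ceph/osd/ceph-0 path of a non-containerized deployment, and the 60-second value are placeholders; the OSD must be stopped for the offline compaction; and osd_op_thread_timeout (default 15 s) is assumed to be the setting behind the "had timed out after 15" warning:

    # offline RocksDB compaction of a single stopped OSD
    systemctl stop ceph-osd@0
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
    systemctl start ceph-osd@0

    # or have OSDs compact their DB automatically on the next restart
    ceph config set osd osd_compact_on_start true

    # raise the op thread heartbeat timeout if the warnings are only transient
    ceph config set osd osd_op_thread_timeout 60

Raising the timeout only papers over the symptom; the compaction is what addresses the slow RocksDB behind it.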