Hi Igor,

it has the SSD OSDs down; the HDD OSDs are running just fine. For now I don't want to make a bad situation worse and will wait for recovery to finish. The inactive PGs are activating very slowly.

By the way, there are 2 out of 4 OSDs up in the replicated 4(2) pool. Why are PGs even inactive here? This "feature" is new in octopus; I reported it as a bug about 2 months ago. Testing with mimic, I cannot reproduce this problem: https://tracker.ceph.com/issues/56995

I found this in the syslog, maybe it helps:

kernel: task:bstore_kv_sync state:D stack: 0 pid:3646032 ppid:3645340 flags:0x00000000
kernel: Call Trace:
kernel:  __schedule+0x2a2/0x7e0
kernel:  schedule+0x4e/0xb0
kernel:  io_schedule+0x16/0x40
kernel:  wait_on_page_bit_common+0x15c/0x3e0
kernel:  ? __page_cache_alloc+0xb0/0xb0
kernel:  wait_on_page_bit+0x3f/0x50
kernel:  wait_on_page_writeback+0x26/0x70
kernel:  __filemap_fdatawait_range+0x98/0x100
kernel:  ? __filemap_fdatawrite_range+0xd8/0x110
kernel:  file_fdatawait_range+0x1a/0x30
kernel:  sync_file_range+0xc2/0xf0
kernel:  ksys_sync_file_range+0x41/0x80
kernel:  __x64_sys_sync_file_range+0x1e/0x30
kernel:  do_syscall_64+0x3b/0x90
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7ffbb6f77ae7
kernel: RSP: 002b:00007ffba478c3c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000115
kernel: RAX: ffffffffffffffda RBX: 000000000000002d RCX: 00007ffbb6f77ae7
kernel: RDX: 0000000000002000 RSI: 000000015f849000 RDI: 000000000000002d
kernel: RBP: 000000015f849000 R08: 0000000000000000 R09: 0000000000002000
kernel: R10: 0000000000000007 R11: 0000000000000293 R12: 0000000000002000
kernel: R13: 0000000000000007 R14: 0000000000000001 R15: 0000560a1ae20380
kernel: INFO: task bstore_kv_sync:3646117 blocked for more than 123 seconds.
kernel: Tainted: G E 5.14.13-1.el7.elrepo.x86_64 #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

It is quite possible that this was the moment when these OSDs got stuck and were marked down. The time stamp is about right.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
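
A quick way to cross-check the 4(2) pool situation described above is to compare the pool's replication settings with what a stuck PG itself reports. This is only a minimal sketch, with <pool> and <pgid> as placeholders for the affected pool and one of the inactive PGs:

    ceph osd pool get <pool> size        # should report 4 for this pool
    ceph osd pool get <pool> min_size    # should report 2
    ceph pg dump_stuck inactive          # list the PGs that refuse to activate
    ceph pg <pgid> query                 # up/acting sets plus the recovery_state explaining the block

If min_size really is 2 and two replicas are up, the PGs should be able to activate, which would be consistent with the regression reported in the tracker issue above rather than a pool-policy problem.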
________________________________________
From: Igor Fedotov <igor.fedotov@xxxxxxxx>
Sent: 06 October 2022 13:45:17
To: Frank Schilder; ceph-users@xxxxxxx
Subject: Re: OSD crashes during upgrade mimic->octopus

From your response to Stefan I'm getting that one of the two damaged hosts has all OSDs down and unable to start. Is that correct?

If so, you can reboot it with no problem and proceed with manual compaction [and other experiments] quite "safely" for the rest of the cluster.

On 10/6/2022 2:35 PM, Frank Schilder wrote:
> Hi Igor,
>
> I can't access these drives. They have an OSD or LVM process hanging in D-state. Any attempt to do something with these gets stuck as well.
>
> I somehow need to wait for recovery to finish and protect the still running OSDs from crashing similarly badly.
>
> After we have full redundancy again and service is back, I can add the setting osd_compact_on_start=true and start rebooting servers. Right now I need to prevent the ship from sinking.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Igor Fedotov <igor.fedotov@xxxxxxxx>
> Sent: 06 October 2022 13:28:11
> To: Frank Schilder; ceph-users@xxxxxxx
> Subject: Re: OSD crashes during upgrade mimic->octopus
>
> IIUC the OSDs that expose "had timed out after 15" are failing to start up. Is that correct, or did I miss something? I meant trying compaction for them...
>
> On 10/6/2022 2:27 PM, Frank Schilder wrote:
>> Hi Igor,
>>
>> thanks for your response.
>>
>>> And what's the target Octopus release?
>> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
>>
>> I'm afraid I don't have the luxury right now to take OSDs down or add extra load with an on-line compaction. I would really appreciate a way to make the OSDs more crash-tolerant until I have full redundancy again. Is there a setting that increases the ops timeout, or is there a way to restrict the load to tolerable levels?
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Igor Fedotov <igor.fedotov@xxxxxxxx>
>> Sent: 06 October 2022 13:15
>> To: Frank Schilder; ceph-users@xxxxxxx
>> Subject: Re: OSD crashes during upgrade mimic->octopus
>>
>> Hi Frank,
>>
>> you might want to compact RocksDB with ceph-kvstore-tool for those OSDs which are showing
>>
>> "heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f1886536700' had timed out after 15"
>>
>> I have seen such an error pretty often after bulk data removal and the severe DB performance drop that follows.
>>
>> Thanks,
>> Igor
>
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
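
Spelled out as commands, the offline compaction Igor suggests and the two settings discussed in the thread would look roughly like this on an Octopus cluster. This is only a sketch: OSD id 0, the default /var/lib/ceph/osd/ceph-0 path of a non-containerized deployment, and the 60-second value are placeholders; the OSD must be stopped for the offline compaction; and osd_op_thread_timeout (default 15 s) is assumed to be the setting behind the "had timed out after 15" warning:

    # offline RocksDB compaction of a single stopped OSD
    systemctl stop ceph-osd@0
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
    systemctl start ceph-osd@0

    # or have OSDs compact their DB automatically on the next restart
    ceph config set osd osd_compact_on_start true

    # raise the op thread heartbeat timeout if the warnings are only transient
    ceph config set osd osd_op_thread_timeout 60

Raising the timeout only papers over the symptom; the compaction is what addresses the slow RocksDB behind it.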