Re: Scope of Pacific 16.2.6 OMAP Keys Bug?

Hey Jay,

First of all, I'd like to mention that there have been two OMAP naming scheme changes since Nautilus:

1) per-pool OMAPs

2) per-pg OMAPs

Both are applied during BlueStore repair/quick-fix, so it may be that you performed the first conversion but not the second.

You might want to set bluestore_warn_on_no_per_pg_omap to true and inspect the ceph health alerts to learn whether the per-pg format is still missing on some OSDs in your cluster.
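
For example, a minimal sketch using the standard ceph CLI (the exact name/wording of the resulting health warning may vary between releases):

  ceph config set osd bluestore_warn_on_no_per_pg_omap true
  ceph health detail   # look for a BLUESTORE_NO_PER_PG_OMAP-style warning listing affected OSDs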

Alternatively, you can inspect the DB manually on an offline OSD with ceph-kvstore-tool: ceph-kvstore-tool bluestore-kv <path-to-osd> get S per_pool_omap

(S, per_pool_omap)
00000000  32                                                |2|
00000001

An ASCII '2' there means the per-pg format is already applied. I can't explain why you're observing no issues if that's the case, though...
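
If it helps, here is an end-to-end sketch for a single OSD (the OSD id and data path are placeholders for your environment; the OSD has to be stopped while ceph-kvstore-tool accesses its DB):

  systemctl stop ceph-osd@0
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 get S per_pool_omap
  systemctl start ceph-osd@0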


As for your other questions: you can stay on 16.2.6 as long as you don't run BlueStore repair/quick-fix, i.e. the relevant setting is false and nobody runs the relevant ceph-bluestore-tool commands manually.
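
To double-check that on your side, something along these lines should do (assuming the option is managed via the central config database; if it's set in ceph.conf instead, verify it there):

  ceph config get osd bluestore_fsck_quick_fix_on_mount
  ceph config set osd bluestore_fsck_quick_fix_on_mount false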


And you mentioned bluestore_fsck_quick_fix_on_mount was set to true until now - I'm curious whether you had any OSD restarts while that setting was true?


Thanks,

Igor

On 1/19/2022 4:28 AM, Jay Sullivan wrote:
https://tracker.ceph.com/issues/53062

Can someone help me understand the scope of the OMAP key bug linked above? I’ve been using 16.2.6 for three months and I don’t _think_ I’ve seen any related problems.

I upgraded my Nautilus (then 14.2.21) clusters to Pacific (16.2.4) in mid-June. One of my clusters was born in Jewel and has made all of the even-numbered releases to Pacific. I skipped over 16.2.5 and upgraded to 16.2.6 in mid-October. It looks like the aforementioned OMAP bug was discovered shortly after, on/around October 20th. My clusters had bluestore_fsck_quick_fix_on_mount set to true until about 10 minutes ago. I _think_ all of my OSDs did the OMAP conversion when I upgraded from Nautilus to Pacific back in June (I remember it taking several minutes per spinning OSD).

Questions for my sanity:

   *   Do I need to upgrade to 16.2.7 ASAP? Or can I wait until my next regular maintenance window?
   *   What is the risk of staying on 16.2.6 if I have bluestore_fsck_quick_fix_on_mount set to false?
   *   If I don’t have OSDs crashing, how would I know if I was impacted by the bug?

Thanks! ❤

~Jay

--
Jay Sullivan
Rochester Institute of Technology
jay.sullivan@xxxxxxx

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



