Hi Tom,

responding to this briefly so that people are in the loop; I'll have
more details in a blog post that I hope to get around to writing.

On 12/08/2019 11:34, Thomas Byrne - UKRI STFC wrote:
>> And bluestore should refuse to start if the configured limit is > 4GB.
>> Or something along those lines...
>
> Just on this point - Bluestore OSDs will fail to start with an
> osd_max_object_size >=4GB with a helpful error message about the
> Bluestore hard limit. I was mildly amused when I discovered that
> luminous OSDs can start with osd_max_object_size = 4GB - 1 byte, but
> mimic OSDs require it to be <= 4GB - 2 bytes to start without an error.
> I haven't checked to see if nautilus OSDs require <= 4GB - 3 bytes yet.

Yes, but that doesn't help users much for clusters where very large
objects already exist. Even in Luminous, osd_max_object_size defaults to
128M, but if an OSD already has objects larger than that, it will still
happily start up and serve data with FileStore, and crash any newly
added BlueStore OSDs unfortunate enough to be mapped to a PG with one or
more objects that are 4GiB or larger.

The pending PR to make this a scrub error even on FileStore OSDs
(https://github.com/ceph/ceph/pull/29579) mitigates this issue, but it
will still come as a surprise to people who have just updated to a
version including that fix and suddenly see tons of scrub errors. They
could easily be forgiven for assuming they've run into a regression that
involves false positives on scrub. "Hey, none of these errors were here
before the upgrade, surely there's a problem with the software rather
than my data!"

We've progressed further in the interim, and it appears that I can give
the all-clear on a couple of concerns that we had:

1. It looks like these objects were not created by an RBD going haywire,
   but by something actually using librados to create them, presumably
   long before the cluster ever went into production.

2. I am not changing the subject line, so that I don't mess up people's
   list archives if their MUA doesn't thread correctly based on
   In-Reply-To or References, but it's now evident that this is *not*
   related to bug #38724; it is really just due to objects being too
   large for BlueStore, like Sage said in his first reply.

Thanks for the answer. By the way, I have been imploring all my
colleagues to watch your Cephalocon talk,[1] which was excellent.

Cheers,
Florian

[1] https://youtu.be/niFNZN5EKvE
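
For anyone who wants to check a pool for such objects before adding
BlueStore OSDs, here is a minimal sketch using the python-rados binding.
It is not what we ran, just an illustration: the constant name, both
function names and the default conffile path are my own assumptions, it
only looks at the default namespace, and stat-ing every object in a big
pool will be slow.

```python
# Sketch only: flag objects that are too large for BlueStore.
# BLUESTORE_MAX_OBJECT_SIZE and both function names are illustrative,
# not identifiers taken from Ceph itself.

BLUESTORE_MAX_OBJECT_SIZE = 4 * 1024**3  # 4 GiB, the hard limit discussed above


def oversized(objects, limit=BLUESTORE_MAX_OBJECT_SIZE):
    """Return the (name, size) pairs whose size is at or above the limit."""
    return [(name, size) for name, size in objects if size >= limit]


def pool_object_sizes(pool, conffile="/etc/ceph/ceph.conf"):
    """Yield (name, size) for each object in a pool (default namespace only)."""
    import rados  # imported lazily so oversized() is usable without a cluster

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            for obj in ioctx.list_objects():
                # Ioctx.stat() returns a (size, mtime) tuple.
                size, _mtime = ioctx.stat(obj.key)
                yield obj.key, size
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Something like oversized(pool_object_sizes("mypool")), with "mypool" being
a placeholder pool name, would then list the candidates to deal with
before any BlueStore OSD gets mapped to their PGs.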