Hi,

On Tue, Sep 8, 2020 at 11:20 PM David Orman <ormandj@xxxxxxxxxxxx> wrote:
>
> Every time we look at them, we see the same checksum (0x6706be76):

This looks a lot like: https://tracker.ceph.com/issues/22464

Some more context on this, since I built the work-around for that issue:

* the checksum is for a block of all zeroes
* it seemed to happen when memory ran low
* it is *NOT* related to swap: it happened on systems with swap disabled
  and no file-backed mmapped memory (BlueStore-only servers without
  non-OSD disks)
* it only showed up on some kernel versions
* retrying the read did resolve it; two consecutive read failures were
  very rare, and it never survived three retries
* the root cause was never found, as I never managed to reliably
  reproduce it on test setups where I could play around with bisecting
  the kernel :(

Here's the patch that added the read retries:
https://github.com/ceph/ceph/pull/23273/files

What you can do is:

1. check the performance counter bluestore_reads_with_retries on the
   affected OSDs; it should be non-zero if you are hitting this bug
2. increase the setting bluestore_retry_disk_reads (default 3) to see
   if that helps

(A rough CLI sketch for both steps is at the very bottom of this mail.)

Anyways, what you are seeing might be something completely different
from whatever caused this bug... but it's worth playing around with the
retry option.

Paul

> That said, we've got the following versions in play (cluster was
> created with 15.2.3):
>
> ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c)
> octopus (stable)
>
> This is a containerized cephadm installation, in case it's relevant.
> Distribution is Ubuntu 18.04.04, kernel is the HWE kernel:
>
> Linux ceph02 5.4.0-42-generic #46~18.04.1-Ubuntu SMP Fri Jul 10
> 07:21:24 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> A repair operation 'fixes' it. These are occurring across many PGs,
> on the various different servers, and we see no indication of any
> hardware related issues.
>
> Any ideas what to do next?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
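
A minimal sketch of the two steps above, assuming an Octopus cluster
and using "osd.2" as a placeholder id for an affected OSD; on a
containerized cephadm install, run these from inside "cephadm shell":

    # 1. read the counter; non-zero means the OSD has retried reads.
    #    Octopus exposes admin-socket commands via "ceph tell"; on
    #    older releases run "ceph daemon osd.2 perf dump" on the
    #    OSD's own host instead.
    ceph tell osd.2 perf dump | grep bluestore_reads_with_retries

    # 2. raise the retry count for all OSDs (default is 3); restart
    #    the OSDs if the change does not take effect at runtime.
    ceph config set osd bluestore_retry_disk_reads 5

    # verify the value one daemon actually sees
    ceph config show osd.2 bluestore_retry_disk_reads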