It's this:
https://tracker.ceph.com/issues/51948

The fix just landed in 4.18.0-305.19.1:
https://access.redhat.com/errata/RHSA-2021:3548

On Tue, 21 Sep 2021, 19:35 Marc, <Marc@xxxxxxxxxxxxxxxxx> wrote:

> I do not have access to this page. Maybe others do not either, so it is
> better to paste its content here.
>
> > -----Original Message-----
> > From: Patrick Donnelly <pdonnell@xxxxxxxxxx>
> > Sent: Tuesday, 21 September 2021 19:30
> > To: David Schulz <dschulz@xxxxxxxxxxx>
> > Cc: ceph-users@xxxxxxx
> > Subject: *****SPAM***** Re: Corruption on cluster
> >
> > Hi Dave,
> >
> > On Tue, Sep 21, 2021 at 1:20 PM David Schulz <dschulz@xxxxxxxxxxx>
> > wrote:
> > >
> > > Hi Everyone,
> > >
> > > For a couple of weeks I've been battling a corruption in CephFS that
> > > happens when a writer on one node writes a line and calls sync, as is
> > > typical with logging, and the file is corrupted when the same file
> > > that is being written is read from another client.
> > >
> > > The cluster is Nautilus 14.2.9 and the clients are all kernel clients
> > > mounting the filesystem with the CentOS 8.4 kernel
> > > 4.18.0-305.10.2.el8_4.x86_64. Bluestore OSDs and erasure coding are
> > > both used. The cluster was upgraded from Mimic (the first installed
> > > version) at some point.
> > >
> > > Here is a little python3 program that triggers the issue:
> > >
> > > import os
> > > import time
> > >
> > > fh=open("test.log", "a")
> > >
> > > while True:
> > >     start = time.time()
> > >     fh.writelines("test2\n")
> > >     end = time.time()
> > >     fh.flush()
> > >     junk=os.getpid()
> > >     fh.writelines(f"took {(end - start)}\n")
> > >     fh.flush()
> > >     time.sleep(1)
> > >
> > > If I run this on one client and repeatedly run "wc -l" on the file
> > > from a different client, wc shows two different behaviours: sometimes
> > > NULL bytes get scribbled in the file and the next line of output is
> > > appended after them, and other times the file gets truncated.
> > >
> > > I did update from 14.2.2 to 14.2.9 (I had a clone of the 14.2.9
> > > repo on hand). I read the release notes and there did seem to be some
> > > related fixes between 14.2.2 and 14.2.9 but nothing after 14.2.9.
> > >
> > > I can't seem to find any references to a problem like this anywhere.
> > > Does anyone have any ideas?
> >
> > You're probably hitting this bug:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1996680
> >
> > Try upgrading your kernel.
> >
> > --
> > Patrick Donnelly, Ph.D.
> > He / Him / His
> > Principal Software Engineer
> > Red Hat Sunnyvale, CA
> > GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
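
For anyone who wants to confirm the symptom from the reading client before and after the kernel upgrade, here is a minimal sketch of a checker that could stand in for the repeated "wc -l". It assumes the reproducer quoted above is appending to test.log on another node; the names check_log and watch are hypothetical and not part of any Ceph tooling.

```python
import time


def check_log(path):
    """Return (line_count, nul_count, size_in_bytes) for the file at path."""
    with open(path, "rb") as fh:
        data = fh.read()
    return data.count(b"\n"), data.count(b"\x00"), len(data)


def watch(path, interval=1.0, rounds=60):
    """Poll path and report NUL bytes or a file that shrank between reads."""
    last_size = 0
    for _ in range(rounds):
        lines, nuls, size = check_log(path)
        if nuls:
            print(f"{path}: {nuls} NUL byte(s) in {size} bytes, {lines} lines")
        if size < last_size:
            print(f"{path}: shrank from {last_size} to {size} bytes")
        last_size = size
        time.sleep(interval)


if __name__ == "__main__":
    # Assumed file name, taken from the reproducer in the quoted message.
    watch("test.log")
```

An append-only log should never contain NUL bytes or get smaller, so any output from this script on a second client points at the corruption described in the thread.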