Thanks for sharing this. Following this thread, I realized that we are also affected by this bug. We have had multiple reports of corrupted TensorBoard event files, which I think are caused by it.

We are using Ubuntu 20.04. The affected versions should be HWE kernels > 5.11 and < 5.11.0-34. The fix for the Ubuntu kernel is here:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal/commit/fs/ceph/addr.c?h=hwe-5.11&id=353cafd20b8c28423aeec0c474dab80dbcec3c44

We are now working on upgrading every client to 5.11.0-34-generic.

Weiwen Hu

From: Nathan Fish<mailto:lordcirth@xxxxxxxxx>
Sent: September 9, 2021 2:41
To: ceph-users<mailto:ceph-users@xxxxxxx>
Subject: Re: Data loss on appends, prod outage

The bug appears to have already been reported:
https://tracker.ceph.com/issues/51948

Also, it should be noted that the write append bug does sometimes occur when writing from a single client, so controlling write patterns is not sufficient to stop data loss.

On Wed, Sep 8, 2021 at 1:39 PM Frank Schilder <frans@xxxxxx> wrote:
>
> Can you make the devs aware of the regression?
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Nathan Fish <lordcirth@xxxxxxxxx>
> Sent: 08 September 2021 19:33
> To: ceph-users
> Subject: Re: Data loss on appends, prod outage
>
> Rolling back to kernel 5.4 has resolved the issue.
>
> On Tue, Sep 7, 2021 at 3:51 PM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Hi Nathan,
> >
> > > Is this the bug you are referring to?
> > > https://tracker.ceph.com/issues/37713
> >
> > yes, it's one of them. I believe there were more such reports.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx