On Fri, Jun 30, 2023 at 11:39 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Thu, Jun 29, 2023 at 10:31 PM Ignat Korchagin <ignat@xxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Jun 29, 2023 at 7:14 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
> > >
> > > [add the xfs lts maintainers]
> > >
> > > On Thu, Jun 29, 2023 at 05:34:00PM +0100, Matthew Wilcox wrote:
> > > > On Thu, Jun 29, 2023 at 05:09:41PM +0100, Daniel Dao wrote:
> > > > > Hi Dave and Derrick,
> > > > >
> > > > > We are tracking down some corruptions on xfs for our rocksdb workload,
> > > > > running on kernel 6.1.25. The corruptions were detected by rocksdb
> > > > > block checksum. The workload seems to share some similarities
> > > > > with the multi-threaded write workload described in
> > > > > https://lore.kernel.org/linux-fsdevel/20221129001632.GX3600936@xxxxxxxxxxxxxxxxxxx/
> > > > >
> > > > > Can we backport the patch series to stable since it seemed to fix data
> > > > > corruptions ?
> > > >
> > > > For clarity, are you asking for permission or advice about doing this
> > > > yourself, or are you asking somebody else to do the backport for you?
> > >
> > > Nobody's officially committed to backporting and testing patches for
> > > 6.1; are you (Cloudflare) volunteering?
> >
> > Yes, we have applied them on top of 6.1.36, will be gradually
> > releasing to our servers and will report back if we see the issues go
> > away
> >
>
> Getting feedback back from Cloudflare production servers is awesome
> but it's not enough.
>
> The standard for getting xfs LTS backports approved is:
> 1. Test the backports against regressions with several rounds of fstests
>    check -g auto on selected xfs configurations [1]
> 2. Post the backport series to xfs list and get an ACK from upstream
>    xfs maintainers
>
> We have volunteers doing this work for 5.4.y, 5.10.y and 5.15.y.
> We do not yet have a volunteer to do that work for 6.1.y.
>
> The question is whether you (or your team) are volunteering to
> do that work for 6.1.y xfs backports to help share the load?

We are not a big team and apart from other internal project work our
efforts are focused on fixing this issue in production, because it
affects many teams and workloads. If we confirm that these patches fix
the issue in production, we will definitely consider dedicating some
work to ensure they are officially backported. But if not - we would
be required to search for a fix first before we can commit to any
work.

So, IOW - can we come back to you a bit later on this after we get the
feedback from production?

> If your employer is interested in running reliable and stable xfs
> code with 6.1.y LTS, I recommend that you seriously consider
> this option, because for the time being, it doesn't look like any
> of us are able to perform this role.
>
> For testing, you could establish your own baseline for 6.1.y or, you
> could run kdevops and use the baseline already established by
> other testers for the selected xfs configurations [1].
>
> I can help you get up to speed with kdevops if you like.

This looks interesting (regardless of this project). We will explore
it and come back with questions, if any.

>
> Thanks,
> Amir.
>
> [1] https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/6.1.0-rc6/xfs/unassigned

Ignat