On Tue, Aug 29, 2023 at 06:15:36PM +0100, Jose M Calhariz wrote:
>
> Hi,
>
> I have been chasing a data corruption problem under heavy load on 4
> servers that I have at my care. First I thought of an hardware
> problem because it only happen with RAID 6 disks. So I reported to Debian:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032391
>
> Further research pointed to be the XFS the common pattern, not an
> hardware issue. So I made an informal query to a friend in a software
> house that relies heavily on XFS about his thought on this issue. He
> made reference to several problems fixed on kernel 6.2 and a
> discussion on this mailing list about back porting the fixes to 6.1
> kernel.
>
> With this information I have tried the latest kernel at that time on
> Debian testing over Debian v12 and I could not reproduce the
> problem. So I made another bug report:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040416
>
> My questions to this mailing list:
>
> - Have anyone experienced under Debian or with vanilla kernels
>   corruption under heavy load on XFS?

Yes. There were a rash of corruption problems that got fixed in 6.2:

https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git/tag/?h=xfs-6.2-merge-8

My guess with no other information is either the write invalidation
problem in iomap; or maybe COW extent allocations racing with the log.

Most of these haven't been backported to 6.1 because our only choices as
a community were (a) let a dumb bot shovel in patches with zero QA or
(b) try to scare up volunteers to backport things to LTS kernels.
(a) wasn't acceptable, but then with (b)...

> - Should I stop waiting for the fixes being back ported to vanilla
>   6.1 and run the latest kernel from Debian testing anyway? Taking
>   notice that kernels from testing have less security updates on time
>   than stable kernels, specially security issues with limited
>   disclosure.

...there isn't really a designated 6.1 LTS backport engineer right now.
A couple folks from Cloudflare; Amir Goldstein; and Ted Ts'o have been
sharing the work when they have spare time.

--D

> I am happy to provide more info about my setup or my stability tests
> that fail under XFS.
>
>
> Kind regards
> Jose M Calhariz
>
> --
> --
> A false friend never curses you out
>
> A true friend has already called you every swear word
> that exists - and even invented a few new ones