On Mon, Aug 26, 2024 at 11:40:57PM GMT, Kent Overstreet wrote:
> On Tue, Aug 27, 2024 at 04:36:33AM GMT, Matthew Wilcox wrote:
> > On Mon, Aug 26, 2024 at 11:29:52PM -0400, Kent Overstreet wrote:
> > > We had a report of corruption on nixos, on tests that build a system
> > > image, it bisected to the patch that enabled buffered writes without
> > > taking the inode lock:
> > >
> > > https://evilpiepirate.org/git/bcachefs.git/commit/?id=7e64c86cdc6c
> > >
> > > It appears that dirty folios are being dropped somehow; corrupt files,
> > > when checked against good copies, have ranges of 0s that are 4k aligned
> > > (modulo 2k, likely a misaligned partition).
> > >
> > > Interestingly, it only triggers for QEMU - the test fails pretty
> > > consistently and we have a lot of nixos users, we'd notice (via nix
> > > store verifies) if the corruption was more widespread. We believe it
> > > only triggers with QEMU's snapshots mode (but don't quote me on that).
> >
> > Just to be crystal clear here, the corruption happens while running
> > bcachefs in the qemu guest, and it doesn't matter what the host
> > filesystem is?
> >
> > Or did I misunderstand, and it occurs while running anything inside qemu
> > on top of a bcachefs host?
>
> The host is running bcachefs, backing qemu's disk image.
>
> (And I'm using nested virtualization for bisecting, it's been a lot to
> keep straight).

Also, the size of the missing data is not a power of two - it's not a
single folio.
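
For anyone wanting to check their own images against the description above, here is a rough sketch (not the tooling used for the bisect) of how the zeroed ranges could be located by comparing a corrupt file against a good copy. The function names, the 512-byte comparison granularity and the 4096-byte page size are all assumptions made for illustration. It reports each range's offset modulo 4k and whether its length is a power of two.

```python
def dropped_ranges(good, bad, sector=512):
    """Return (offset, length) for runs of sectors that are all zero in
    `bad` but differ from `good`. `sector` is an assumed granularity."""
    ranges = []
    n = min(len(good), len(bad))
    start = None
    for off in range(0, n, sector):
        g = good[off:off + sector]
        b = bad[off:off + sector]
        # Sectors that are legitimately zero in both copies are not counted.
        dropped = b != g and not any(b)
        if dropped:
            if start is None:
                start = off
        elif start is not None:
            ranges.append((start, off - start))
            start = None
    if start is not None:
        ranges.append((start, n - start))
    return ranges


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def describe(ranges, page=4096):
    """Return (offset, length, offset % page, length_is_power_of_two)."""
    return [(off, length, off % page, is_power_of_two(length))
            for off, length in ranges]


if __name__ == "__main__":
    # Synthetic data only: 32k of nonzero bytes with 12k zeroed at offset 6k.
    good = bytes([1]) * (32 * 1024)
    bad = bytearray(good)
    bad[6144:6144 + 12288] = bytes(12288)
    for row in describe(dropped_ranges(good, bytes(bad))):
        print(row)
```

In the synthetic case the range starts 2k past a 4k boundary and is 12k long, i.e. not a power of two, which matches the shape described in this thread. Real files would be read with open(path, "rb").read() or in chunks for large images.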