On Tue, Nov 28, 2017 at 11:27 PM, Ashlie Martinez <ashmrtn@xxxxxxxxxx> wrote:
> On Tue, Nov 28, 2017 at 2:45 PM, Theodore Ts'o <tytso@xxxxxxx> wrote:
>> On Tue, Nov 28, 2017 at 07:04:54AM -0600, Ashlie Martinez wrote:
>>> No biggie, part of the reason this was so hard for me to wrap my head
>>> around is I don't have a physical machine that I can reproduce this on
>>> (and I never got around to getting a GCE instance to test on). Not
>>> being able to poke around a reproducing system makes it a little bit
>>> harder for me to reason about :)
>>
>> This does reproduce easily using kvm-xfstests[1]; using gce-xfstests
>> was not necessary. That's actually how I debugged it, since kvm
>> starts up in under 5 seconds, while starting up a cloud VM takes a bit
>> longer. So if you want a quick edit/compile/debug cycle, or if you
>> attach a debugger to the running kernel, using kvm-xfstests is the
>> right tool to use. 99% of the command syntax and test appliance
>> implementation is the same between kvm-xfstests and gce-xfstests.
>
> Unfortunately this timing bug only reproduces on some machines. Xiao
> and I have been unable to reproduce this bug (I've tried kvm-xfstests,
> my own kvm VMs, VMs without kvm, VMs with/without virtio drivers, and
> another bare metal system). generic/456 basically sets up a race
> condition between a kernel flusher thread and triggering dm-flakey, so
> I think things like system load, core count, etc. might cause
> different test results.
>

For what it's worth, I wasn't able to reproduce on my kvm-xfstests machine
either. With 2 cores, virtio over LVM/SSD. Didn't try to play with parameters.

Amir.