On Fri, Mar 07, 2025 at 06:51:23AM -0500, Kent Overstreet wrote:
>
> Better bisection algorithm? Standard bisect does really badly when fed
> noisy data, but it wouldn't be hard to fix that: after N successive
> passes or fails, which is unlikely because bisect tests are coinflips,
> backtrack and gather more data in the part of the commit history where
> you don't have much.

My general approach when handling a test failure is to run the
reproducer 5-10 times on the original commit where the failure was
detected, to see if the reproducer is reliable.  Once it's been
established whether the failure reproduces 100% of the time, or only
some fraction of the time, say 25% of the time, we can establish how
many times we need to run the reproducer before we can conclude that a
particular commit is "good".  The first time we detect a failure, we
can declare the commit "bad", even if it happens on the 2nd out of the
25 tries that we might need for a particularly flaky test.

Maybe this is something Syzbot could implement?

And if someone is familiar with the Go language, patches to implement
this in gce-xfstests's ltm server would be great!  It's something I've
wanted to do, but I haven't gotten around to implementing it so that it
can be fully automated.  Right now, ltm's git branch watcher reruns any
failing test 5 times, so I get an idea of whether a failure is flaky or
not.  I'll then manually run a potentially flaky test 30 times, and
based on how reliable or flaky the test failure happens to be, I tell
gce-xfstests to do a bisect running each test N times, without having
it stop once the test fails.  It wastes a bit of test resources, but
since it doesn't block my personal time (results land in my inbox when
the bisect completes), it hasn't risen to the top of my todo list.

Cheers,

					- Ted
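
For illustration, here is a minimal sketch of the stopping rule described above. It is not code from gce-xfstests or Syzbot. The function names (runs_needed, classify_commit) and the max_false_good threshold are hypothetical; the message gives no confidence level, so the 0.1% default is purely an assumption. It also assumes each run fails independently with the same probability on a bad commit. Under those assumptions, the chance of n clean runs on a bad commit is (1 - p)^n, so we need the smallest n that pushes this below the threshold.

```python
import math


def runs_needed(fail_rate, max_false_good=0.001):
    """Smallest n such that (1 - fail_rate)**n <= max_false_good.

    fail_rate: estimated probability that one run fails on a bad commit.
    max_false_good: assumed acceptable chance of calling a bad commit
    "good" (hypothetical default, not taken from the original message).
    """
    if not 0.0 < fail_rate <= 1.0:
        raise ValueError("fail_rate must be in (0, 1]")
    if fail_rate == 1.0:
        return 1  # a fully reliable reproducer needs a single run
    return math.ceil(math.log(max_false_good) / math.log(1.0 - fail_rate))


def classify_commit(run_test, max_runs):
    """Run the test up to max_runs times, stopping at the first failure.

    run_test: callable returning True on pass, False on failure.
    Returns ("bad", k) if run k failed, otherwise ("good", max_runs).
    """
    for attempt in range(1, max_runs + 1):
        if not run_test():
            return "bad", attempt
    return "good", max_runs


if __name__ == "__main__":
    n = runs_needed(0.25)  # failure reproduces about 25% of the time
    print(n)               # 25 with the assumed 0.1% threshold
    outcomes = iter([True, False])  # simulated runs: pass, then fail
    print(classify_commit(lambda: next(outcomes), n))  # ('bad', 2)
```

With a 25% failure rate and the assumed threshold this happens to give 25 runs. Stopping at the first failure means a bad commit usually costs far fewer runs than that; only good commits pay for the full count.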