Thanks a lot for your suggestions and patience. This is great guidance for an ext4 newbie!

On Tue, Sep 13, 2022 at 12:33 AM Theodore Ts'o <tytso@xxxxxxx> wrote:
>
> Hi,
>
> So first of all, I would recommend that you learn how to use
> kvm-xfstests.  The reason for this is that kvm-xfstests is very useful
> for testing any changes that you make.  The same test appliance can be
> used for testing file systems for Android and using Google Compute
> Engine VM's (which is one of the best ways to use it).  Please take a
> look at these references:
>
> https://thunk.org/gce-xfstests
> https://github.com/tytso/xfstests-bld/blob/master/Documentation/what-is-xfstests.md
> https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
> https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-xfstests.md
>
> In addition to using this as a quick "playground" where you can test
> patches, this can also be a good way to (for example) test syzbot
> reports.
>
> Another thing which you could potentially do is to manually backport
> ext4 patches which didn't automatically get applied because the
> patch required some adjustments (or required backporting some
> additional commits, etc.) to fix a particular problem.  So for
> example, you could try running xfstests using the latest 5.10.y or
> 5.15.y stable kernels, since as we fix bugs, we often add tests to
> check for regressions.  For example, if you look at the header of the
> test ext4/058, you'll find:
>
> # Set 256 blocks in a block group, then inject I/O pressure,
> # it will trigger off kernel BUG in ext4_mb_mark_diskspace_used
> #
> # Regression test for commit
> # a08f789d2ab5 ext4: fix bug_on ext4_mb_use_inode_pa
>
> So if you find out that a particular test fails on an LTS kernel
> (e.g., 5.15.y or 5.10.y), but it passes on upstream, it could be that
> a missing commit needs to be backported.
> We don't currently have anyone doing this on a regular basis for the
> LTS kernels (I may do this once every few months, when I have time),
> so this could be a good way for you to contribute and also learn more
> about ext4 as you go.
>
> Finally, I'll note that although I do run xfstests regularly, and will
> reject patches that cause regressions, there are still some tests
> that fail.  For example, here is my latest test report:
>
> TESTRUNID: ltm-20220912073217
> KERNEL:    kernel 6.0.0-rc4-xfstests #760 SMP PREEMPT_DYNAMIC Mon Sep 12 07:23:13 EDT 2022 x86_64
> CMDLINE:   full --kernel gs://gce-xfstests/kernel.deb
> CPUS:      4
> MEM:       7680
>
> ext4/4k: 515 tests, 27 skipped, 4093 seconds
> ext4/1k: 511 tests, 2 failures, 40 skipped, 5095 seconds
>   Flaky: generic/475: 40% (2/5)   generic/476: 40% (2/5)
> ext4/ext3: 507 tests, 115 skipped, 3514 seconds
> ext4/encrypt: 493 tests, 3 failures, 129 skipped, 2583 seconds
>   Failures: generic/681 generic/682 generic/691
> ext4/nojournal: 510 tests, 4 failures, 94 skipped, 3610 seconds
>   Failures: ext4/301 ext4/304 generic/455
>   Flaky: generic/077: 40% (2/5)
> ext4/ext3conv: 512 tests, 27 skipped, 3650 seconds
> ext4/adv: 512 tests, 3 failures, 34 skipped, 3860 seconds
>   Failures: generic/475 generic/477
>   Flaky: generic/455: 80% (4/5)
> ext4/dioread_nolock: 513 tests, 27 skipped, 4235 seconds
> ext4/data_journal: 511 tests, 2 failures, 87 skipped, 3647 seconds
>   Failures: generic/231 generic/455
> ext4/bigalloc: 489 tests, 2 failures, 34 skipped, 3904 seconds
>   Failures: generic/455 shared/298
> ext4/bigalloc_1k: 488 tests, 2 failures, 51 skipped, 3826 seconds
>   Failures: generic/455 shared/298
> ext4/dax: 502 tests, 127 skipped, 2520 seconds
> Totals: 6135 tests, 792 skipped, 80 failures, 0 errors, 44288s
>
> (This was done by using gce-xfstests, which is a cloud VM variant of
> kvm-xfstests.
> The equivalent run would take roughly 12 to 24 hours using
> kvm-xfstests; gce-xfstests runs on multiple VMs, so the wall clock
> time needed is perhaps two to two and a half hours.)
>
> In general, I try very hard to make sure that ext4/4k (ext4 with the
> default 4k block size) is free of failures when running the xfstests
> "auto" group.  However, you'll see that there are other configs where
> there are failures, some of which have been around for a while.
> The challenge is that these are often bugs that more senior ext4
> developers have looked at for, say, an hour or two, and then said,
> "I have higher priority fires to fight".  So these might not be the
> best test failures to ask an ext4 newbie to debug.  That being said,
> if you don't mind a bit (or a lot) of frustration, it could be that
> you might be able to root-cause some of these failed tests.
>
> (But testing the LTS kernels might be a better place to start.)
>
> Cheers,
>
> - Ted
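
For the LTS-versus-upstream comparison suggested above, here is a minimal sketch of how I might diff two test summaries to find backport candidates. It assumes the summary format shown in the report above (a per-config line followed by optional "Failures:" and "Flaky:" lines). The function names parse_report and backport_candidates are my own, not part of xfstests-bld, and the LTS excerpt is made up purely for illustration; it is not a real test result.

```python
import re

# A config line looks like "ext4/4k: 515 tests, 27 skipped, 4093 seconds".
CONFIG_RE = re.compile(r"^(ext4/\S+):\s+\d+ tests")
# Test names look like "ext4/058", "generic/455" or "shared/298".
TEST_RE = re.compile(r"\b(?:ext4|generic|shared)/\d+\b")


def parse_report(text):
    """Map each config to the set of tests listed as Failures or Flaky."""
    results = {}
    config = None
    for raw in text.splitlines():
        line = raw.strip()
        match = CONFIG_RE.match(line)
        if match:
            config = match.group(1)
            results[config] = set()
        elif config and line.startswith(("Failures:", "Flaky:")):
            results[config].update(TEST_RE.findall(line))
    return results


def backport_candidates(lts, upstream):
    """Return tests that fail on LTS but are clean upstream, per config.

    A config missing from the upstream report is treated as having no
    upstream failures, so all of its LTS failures are reported.
    """
    candidates = {}
    for config, failed in lts.items():
        only_lts = failed - upstream.get(config, set())
        if only_lts:
            candidates[config] = sorted(only_lts)
    return candidates


# Excerpt taken from the upstream report quoted above.
UPSTREAM = """\
ext4/4k: 515 tests, 27 skipped, 4093 seconds
ext4/nojournal: 510 tests, 4 failures, 94 skipped, 3610 seconds
  Failures: ext4/301 ext4/304 generic/455
  Flaky: generic/077: 40% (2/5)
"""

# HYPOTHETICAL LTS excerpt, invented for this example only.
LTS = """\
ext4/4k: 515 tests, 1 failures, 27 skipped, 4093 seconds
  Failures: ext4/058
ext4/nojournal: 510 tests, 4 failures, 94 skipped, 3610 seconds
  Failures: ext4/301 ext4/304 generic/455
"""

for config, tests in backport_candidates(
        parse_report(LTS), parse_report(UPSTREAM)).items():
    print(f"{config}: {' '.join(tests)}")
```

For each test this prints, the next step would be to read the test's header for a "Regression test for commit" note, as in the ext4/058 example, and check whether that commit is present in the LTS tree. Tests that are flaky upstream are deliberately counted as upstream failures here, so they are not reported as backport candidates.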