> -----Original Message----- > From: Kees Cook > Sent: Sunday, June 14, 2020 12:18 PM > To: David Gow <davidgow@xxxxxxxxxx> > Cc: Vitor Massaru Iha <vitor@xxxxxxxxxxx>; KUnit Development <kunit-dev@xxxxxxxxxxxxxxxx>; Shuah Khan > <skhan@xxxxxxxxxxxxxxxxxxx>; open list:KERNEL SELFTEST FRAMEWORK <linux-kselftest@xxxxxxxxxxxxxxx>; Linux Kernel Mailing List > <linux-kernel@xxxxxxxxxxxxxxx>; Brendan Higgins <brendanhiggins@xxxxxxxxxx>; linux-kernel-mentees@xxxxxxxxxxxxxxxxxxxxxxxxx; > linux@xxxxxxxxxxxxxxxxxx > Subject: Re: RFC - kernel selftest result documentation (KTAP) > > On Sat, Jun 13, 2020 at 02:51:17PM +0800, David Gow wrote: > > On Sat, Jun 13, 2020 at 6:36 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > > Regarding output: > > > > > > [ 36.611358] TAP version 14 > > > [ 36.611953] # Subtest: overflow > > > [ 36.611954] 1..3 > > > ... > > > [ 36.622914] # overflow_calculation_test: s64: 21 arithmetic tests > > > [ 36.624020] ok 1 - overflow_calculation_test > > > ... > > > [ 36.731096] # overflow_shift_test: ok: (s64)(0 << 63) == 0 > > > [ 36.731840] ok 2 - overflow_shift_test > > > ... > > > [ 36.750294] kunit_try_catch: vmalloc: allocation failure: 18446744073709551615 bytes, mode:0xcc0(GFP_KERNEL), > nodemask=(null),cpuset=/,mems_allowed=0 > > > ... > > > [ 36.805350] # overflow_allocation_test: devm_kzalloc detected saturation > > > [ 36.807763] ok 3 - overflow_allocation_test > > > [ 36.807765] ok 1 - overflow > > > > > > A few things here.... > > > > Tim Bird has just sent out an RFC for a "KTAP" specification, which > > we'll hope to support in KUnit: > > Ah-ha! Thanks for the heads-up. :) > > > https://lore.kernel.org/linux- > kselftest/CY4PR13MB1175B804E31E502221BC8163FD830@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#u > > *thread split/merge* > > This is coming from: > https://lore.kernel.org/linux-kselftest/CABVgOSnofuJQ_fiCL-8KdKezg3Hnqk3A+X509c4YP_toKeBVBg@xxxxxxxxxxxxxx/ > But I'm attempting a thread jump... ;) > > > That's probably where we'll end up trying to hash out exactly what > > this format should be. Fortunately, I don't think any of these will > > require any per-test work, just changes to the KUnit implementation. > > Yup, good. > > > > - On the outer test report, there is no "plan" line (I was expecting > > > "1..1"). Technically it's optional, but it seems like the information > > > is available. :) > > > > There's work underway to support this, but it's hit a few minor snags: > > https://lkml.org/lkml/2020/2/27/2155 > > Okay, cool. It's not critical, I don't think. > > > > - The subtest should have its own "TAP version 14" line, and it should > > > be using the diagnostic line prefix for the top-level test (this is > > > what kselftest is doing). > > > > Alas, TAP itself hasn't standardised subtests. Personally, I think it > > doesn't fundamentally matter which way we do this (I actually prefer > > the way we're doing it currently slightly), but converging with what > > kselftest does would be ideal. > > I see the KTAP RFC doesn't discuss subtests at all, but kselftest actually > already handles subtests. I strongly feel that line-start formatting is > the correct way to deal with this, with each subtest having it's own > self-contained KTAP. This allows for several important features: > > - the subtest, run on its own, needs no knowledge about its execution > environment: it simply emits its standard KTAP output. > > - subtest output can be externally parsed separably, without any > knowledge or parsing of the enclosing test. I agree with both of these, but will save additional comments for the thread on my KTAP doc RFC. > > For example, with my recent series[1], "make -C seccomp run_tests" > produces: > > TAP version 13 > 1..2 > # selftests: seccomp: seccomp_bpf > # TAP version 13 > # 1..77 > # # Starting 77 tests from 6 test cases. > # # RUN global.mode_strict_support ... > # # OK global.mode_strict_support > # ok 1 global.mode_strict_support > ... > # ok 77 TSYNC.two_siblings_not_under_filter > # # FAILED: 73 / 77 tests passed. > # # Totals: pass:72 fail:4 xfail:1 xpass:0 skip:0 error:0 > not ok 1 selftests: seccomp: seccomp_bpf # exit=1 > # selftests: seccomp: seccomp_benchmark > # > not ok 2 selftests: seccomp: seccomp_benchmark # TIMEOUT > > > > - There is no way to distinguish top-level TAP output from kernel log > > > lines. I think we should stick with the existing marker, which is > > > "# ", so that kernel output has no way to be interpreted as TAP > > > details -- unless it intentionally starts adding "#"s. ;) > > > > At the moment, we're doing this in KUnit tool by stripping anything > > before "TAP version 14" (e.g., the timestamp), and then only incuding > > lines which parse correctly (are a test plan, result, or a diagnostic > > line beginning with '#'). > > This has worked pretty well thus far, and we do have the ability to > > get results from debugfs instead of the kernel log, which won't have > > the same problems. > > > > It's worth considering, though, particularly since our parser should > > handle this anyway without any changes. > > For the KTAP parsing, actually think it's very important to include > kernel log lines in the test output (as diagnostic lines), since they > are "unexpected" (they fail to have the correct indentation) and may > provide additional context for test failures when reading log files. > (As in the "vmalloc: allocation failure" line in the quoted section > above, to be included as a diagnostic line for test #3.) > > > > - There is no summary line (to help humans). For example, the kselftest > > > API produces a final pass/fail report. > > > > Currently, we're relying on the kunit.py script to produce this, but > > it shouldn't be impossible to add to the kernel, particularly once the > > "KUnit Executor" changes mentioned above land. > > Cool. Yeah, it's not required, but I think there are two use cases: > humans running a single test (where is a summary is valuable, especially > for long tests that scroll off the screen), and automation (where it can > ignore the summary, as it will produce its own in a regularized fashion). > > > > Taken together, I was expecting the output to be: > > > > > > [ 36.611358] # TAP version 14 > > > [ 36.611953] # 1..1 > > > [ 36.611958] # # TAP version 14 > > > [ 36.611954] # # 1..3 > > > ... > > > [ 36.622914] # # # overflow_calculation_test: s64: 21 arithmetic tests > > > [ 36.624020] # # ok 1 - overflow_calculation_test > > > ... > > > [ 36.731096] # # # overflow_shift_test: ok: (s64)(0 << 63) == 0 > > > [ 36.731840] # # ok 2 - overflow_shift_test > > > ... > > > [ 36.750294] kunit_try_catch: vmalloc: allocation failure: 18446744073709551615 bytes, mode:0xcc0(GFP_KERNEL), > nodemask=(null),cpuset=/,mems_allowed=0 > > > ... > > > [ 36.805350] # # # overflow_allocation_test: devm_kzalloc detected saturation > > > [ 36.807763] # # ok 3 - overflow_allocation_test > > > [ 36.807763] # # # overflow: PASS > > > [ 36.807765] # ok 1 - overflow > > > [ 36.807767] # # kunit: PASS > > > > > > But I assume there are threads on this that I've not read... :) > > > > These discussions all seem to be coming to a head now, so this is > > probably just the kick we need to prioritise this a bit more. The KTAP > > thread hasn't covered all of these (particularly the subtest stuff) > > yet, so I confess I hadn't realised the extent of the divergence > > between KUnit and kselftest here. > > Cool. Yeah, I'd like to keep things as close as possible. In looking at > this again, I wonder if perhaps it would be better to change the > "indent" from "# " to " ", which might make things more readable for > both dmesg and terminal output: > > [ 36.611358] TAP version 14 > [ 36.611953] 1..1 > [ 36.611958] TAP version 14 > [ 36.611954] 1..3 > ... > [ 36.622914] # overflow_calculation_test: s64: 21 arithmetic tests > [ 36.624020] ok 1 - overflow_calculation_test > ... > [ 36.731096] # overflow_shift_test: ok: (s64)(0 << 63) == 0 > [ 36.731840] ok 2 - overflow_shift_test > ... > [ 36.750294] kunit_try_catch: vmalloc: allocation failure: 18446744073709551615 bytes, mode:0xcc0(GFP_KERNEL), > nodemask=(null),cpuset=/,mems_allowed=0 > ... > [ 36.805350] # overflow_allocation_test: devm_kzalloc detected saturation > [ 36.807763] ok 3 - overflow_allocation_test > [ 36.807763] # overflow: PASS > [ 36.807765] ok 1 - overflow > [ 36.807767] # kunit: PASS > > As it happens, subtests are a bit rare in kselftests right now, but > for kunit, the "common" output (IIUC) will fundamentally be a top-level > test running all the subtests, so we should get it right. :) Personally, as a human I find the space prefix a bit easier to read. However, I think that in "normal" kernel log output it is unusual for a line to be prefixed with a hash (#), so this might be easier to both visually distinguish and for automated parsing. So I'm torn. I don't have a strong opinion on space vs. hash prefix for indicating sub-test. I think the KUnit convention of matching whatever was the prefix of the "TAP version 14" line is clever, but it would be nice to just pick a prefix and be done with it. IMHO. -- Tim > -Kees > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=selftests/harness/tap > > -- > Kees Cook