Just a quick note that there's been a lot of good discussion. I have an
updated draft of the document, but I need to review the flurry of comments
from today, and I'm busy getting my slides ready for a conference. So I
just wanted to give a heads-up that I'll be working on this (responding to
comments and hopefully posting an updated draft) early next week. Thanks
for the feedback.
 -- Tim

> -----Original Message-----
> From: Frank Rowand <frowand.list@xxxxxxxxx>
>
> On 2020-06-16 23:05, David Gow wrote:
> > On Wed, Jun 17, 2020 at 11:36 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> >>
> >> On Wed, Jun 17, 2020 at 02:30:45AM +0000, Bird, Tim wrote:
> >>> Agreed. You only need machine-parsable data if you expect the CI
> >>> system to do something more with the data than just present it.
> >>> What that would be (something common to all tests, or at least to
> >>> many of them) is unclear. Maybe there are patterns in the diagnostic
> >>> data that could lead to higher-level analysis, or even automated
> >>> fixes, that don't become apparent if the data is unstructured. But
> >>> it's hard to know until you have lots of data. I think just getting
> >>> the other things consistent is a good priority right now.
> >>
> >> Yeah. I think the main place for this is performance analysis, but I
> >> think that's a separate system entirely. TAP is really strictly
> >> yes/no, whereas performance analysis is a whole other thing. The only
> >> other thing I can think of is some kind of feature analysis, but that
> >> would be built out of the standard yes/no output. I.e., if I create a
> >> test that checks for specific security mitigation features
> >> (*cough*LKDTM*cough*) and have a dashboard that shows features down
> >> one axis and architectures and/or kernel versions on the other axes,
> >> then I get a pretty picture. But it's still being built out of the
> >> yes/no info.
> >>
> >> *shrug*
> >>
> >> I think diagnostics should be expressly non-machine-oriented.
> >
> > So from the KUnit side, we sort of have three kinds of diagnostic lines:
> >
> > - Lines printed directly from tests (typically using kunit_info() or
> > similar functions): as I understand it, these are basically the
> > equivalent of what kselftest typically uses diagnostics for --
> > test-specific, human-readable messages. I don't think we need/want to
> > parse these much.
> >
> > - Kernel messages during test execution: if we get the results by
> > scraping the kernel log (which is still the default for KUnit, though
> > there is also a debugfs interface), other kernel messages can be
> > interleaved with the results. Sometimes these are irrelevant things
> > happening on another thread; sometimes they're something directly
> > related to the test which we'd like to capture (KASAN errors, for
> > instance). I don't think we want these to be machine-oriented, but we
> > may want to be able to filter them out.
>
> This is an important conceptual difference between testing a user-space
> program (which is the environment TAP was initially created for) and
> testing kernel code. This difference should be addressed in the KTAP
> standard. As noted above, a kernel test case may call into other kernel
> code, and that other kernel code generates messages that end up in the
> test output.
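To make the interleaving problem concrete, scraped output can look
something like the sketch below. This is purely illustrative (the suite
and test names, the KASAN line, and the messages are invented, not taken
from a real KUnit run); the point is just that result lines,
test-generated diagnostics, and unrelated kernel messages all arrive on
the same console stream:

  TAP version 14
  1..2
  # example_suite: initializing fake test fixture
  ok 1 - example_alloc_test
  BUG: KASAN: use-after-free in fake_driver_remove+0x130/0x1a0
  not ok 2 - example_remove_test

Here the "# ..." line is a diagnostic printed by the test itself, the
KASAN splat is an interleaved kernel message (possibly the very thing the
test meant to trigger), and only the "ok"/"not ok" lines are actual
results.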
>
> One issue with these kernel messages is that they may be warnings or
> errors, and to anyone other than the test creator it is probably hard to
> determine whether those warnings and errors are reporting bugs or whether
> they are expected results triggered by the test.
>
> I created a solution to report what error(s) were expected for a test,
> and a tool to validate whether the error(s) occurred or not. This is
> currently in the devicetree unittests, but the exact implementation
> should be discussed in the KUnit context, and it should be included in
> the KTAP spec.
>
> I can describe the current implementation and start a discussion of any
> issues in this thread, or I can start a new thread. Whichever seems
> appropriate to everyone.
>
> -Frank
>
> > - Expectation failures: as Brendan mentioned, KUnit will print some
> > diagnostic messages for individual assertion/expectation failures,
> > including the expected and actual values. We'd ideally like to be able
> > to identify and parse these, but keeping them human-readable is
> > definitely also a goal.
> >
> > Now, to be honest, I doubt that the distinction here would be of much
> > use to kselftest, but it could be nice to not go out of our way to make
> > parsing some diagnostic lines impossible. That being said, personally
> > I'm all for avoiding the YAML-for-diagnostic-messages stuff and
> > sticking to something simple and line-based, possibly standardising
> > the format of a few common diagnostic messages (e.g.,
> > assertions/expected values/etc.) in a way that's both human-readable
> > and parsable if possible.
> >
> > I agree that there's a lot of analysis that is possible with just the
> > yes/no data. There's probably some fancy correlation one could do even
> > with unstructured diagnostic logs, so I don't think overstructuring
> > things is a necessity by any means. Where we have different tests doing
> > similar sorts of things, though, consistency in message formatting
> > could help even if things are not explicitly parsed. Ensuring that
> > logging helper functions and the like spit things out in the same
> > format is probably a good first step down that path.
> >
> > Cheers,
> > -- David
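On Frank's expected-error point above: one way such an annotation could
work (this is a hypothetical sketch, not the format the devicetree
unittests actually use; Frank's description of the real implementation is
the authoritative one) is a pair of diagnostic lines that bracket the
kernel message the test expects to trigger, so a post-processing tool can
pair them up and flag anything unexpected:

  # example_test: EXPECT-BEGIN: OF: /fake-node: could not find property 'reg'
  OF: /fake-node: could not find property 'reg'
  # example_test: EXPECT-END: OF: /fake-node: could not find property 'reg'
  ok 3 - example_missing_property_test

A checker script could then verify that every EXPECT-BEGIN/EXPECT-END pair
encloses a matching kernel message and that no unannotated warnings or
errors slipped through.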
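On David's closing point about helper functions: pinning the format down
in one small helper is cheap. Below is a minimal sketch (the names
ktap_diag() and report_expect_failure() are invented for illustration;
they are not existing kselftest or KUnit APIs) of a userspace helper that
emits every diagnostic with the same "# " prefix and every expectation
failure in the same three-line shape, so the output stays human-readable
but is also easy to grep or parse later:

  #include <stdio.h>
  #include <stdarg.h>

  /* Print one TAP diagnostic line: "# " prefix, caller's message, newline. */
  static void ktap_diag(const char *fmt, ...)
  {
          va_list ap;

          printf("# ");
          va_start(ap, fmt);
          vprintf(fmt, ap);
          va_end(ap);
          printf("\n");
  }

  /*
   * Report an expectation failure in a fixed shape: a header line with
   * file/line and the failed expression, then the expected and actual
   * values on their own lines.
   */
  #define report_expect_failure(expr_str, expected, actual)           \
          do {                                                         \
                  ktap_diag("EXPECTATION FAILED at %s:%d: %s",         \
                            __FILE__, __LINE__, (expr_str));           \
                  ktap_diag("    expected: %ld", (long)(expected));    \
                  ktap_diag("    actual:   %ld", (long)(actual));      \
          } while (0)

A test would then do something like
report_expect_failure("buf_size == 16", 16, buf_size); and every failure,
from every test that uses the helper, lands in the log in the same
greppable form.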