Re: [PATCH v2 0/9] ci: make Git's GitHub workflow output much more helpful

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Mar 24 2022, Victoria Dye wrote:
> 
>> Johannes Schindelin via GitGitGadget wrote:
>> [...]
>>> Is this the best UI we can have for test failures in CI runs? I hope we can
>>> do better. Having said that, this patch series presents a pretty good start,
>>> and offers a basis for future improvements.
>>>
>>
>> I think these are really valuable improvements over our current state, but I
>> also understand the concerns about performance elsewhere in this thread
>> (it's really slow to load for me as well, and scrolling/expanding the log
>> groups can be a bit glitchy in my browser). That said, I think there are a
>> couple ways you could improve the load time without sacrificing the (very
>> helpful) improvements you've made to error log visibility. I experimented a
>> bit (example result [1]) and came up with some things that may help:
>>
>> * group errors by test file, rather than by test case (to reduce
>>   parsing/rendering time for lots of groups).
>> * print the verbose logs only for the failed test cases (to massively cut
>>   down on the size of the log, particularly when there's only a couple
>>   failures in a test file with a lot of passing tests).
>> * skip printing the full text of the test in 'finalize_test_case_output'
>>   when creating the group, i.e., use '$1' instead of '$*' (in both passing
>>   and failing tests, this information is already printed via some other
>>   means).
>>
>> If you wanted to make sure a user could still access the full failure logs
>> (i.e., including the "ok" test results), you could print a link to the
>> artifacts page as well - that way, all of the information we currently
>> provide to users can still be found somewhere.
>>
>> [1] https://github.com/vdye/git/runs/5666973267
> 
> Thanks a lot for trying to address those concerns.
> 
> I took a look at this and it definitely performs better, although in
> this case the overall output is ~3k lines.
> 
> I'd be curious to see how it performs on some of the cases discussed in
> earlier threads of >~50k lines, although it looks like in this case that
> would require failures to be really widespread in the test suite.
> 

Unfortunately, I don't have a direct comparison to that (the longest I found
elsewhere in the thread was ~33k lines [1], but those failures came from
strange interactions on the 'shears/seen' branch of Git for Windows that I
couldn't easily replicate). If it helps, though, here's a 1:1 comparison of
my "experiment" branch's forced test failures with and without the
optimizations I tried (without optimization, the total log is ~28k lines):

without optimization: https://github.com/vdye/git/runs/5696305589 with
optimization: https://github.com/vdye/git/runs/5666973267

So it's definitely faster - it still takes a couple seconds to load, but not
so long that my browser struggles with it (which was my main issue with the
original approach).

[1] https://github.com/dscho/git/runs/4840190622

> I just looked at this briefly, but looking at the branch I see you
> removed the "checking known breakage of[...]" etc. from the non-GitHub
> markdown output, I didn't spot how that was related/needed.
> 

It was mostly just another attempt to cut down on extraneous output (since,
if a test fails, the test definition is printed after the failure, so we
would end up with the same information twice). 

That said, if that were to be incorporated here, it'd need to be smarter
than what I tried - my change removed it entirely from the '.out' logs, and
it means that any test that *does* pass wouldn't have its test definition
logged anywhere (I think). The ideal situation would be the extraneous test
definition is only removed from the '.markup' files, but that's probably a
change better saved for a future patch/series.

>>> Johannes Schindelin (9):
>>>   ci: fix code style
>>>   ci/run-build-and-tests: take a more high-level view
>>>   ci: make it easier to find failed tests' logs in the GitHub workflow
>>>   ci/run-build-and-tests: add some structure to the GitHub workflow
>>>     output
>>>   tests: refactor --write-junit-xml code
>>>   test(junit): avoid line feeds in XML attributes
>>>   ci: optionally mark up output in the GitHub workflow
>>>   ci: use `--github-workflow-markup` in the GitHub workflow
>>>   ci: call `finalize_test_case_output` a little later
>>>
>>
>> The organization of these commits makes the series a bit confusing to read,
>> mainly due to the JUnit changes in the middle. Patches 5-6 don't appear to
>> be dependent on patches 1-4, so they could be moved to the beginning (after
>> patch 1). With that change, I think this series would flow more smoothly:
>> "Cleanup/non-functional change" -> "JUnit XML improvements" -> "Log UX
>> improvements".
> 
> Have you had a change to look at the approach my suggestion of an
> alternate approach to the early part of this series takes?:
> https://lore.kernel.org/git/cover-00.25-00000000000-20220221T143936Z-avarab@xxxxxxxxx/
> 
> I.e. to not build up ci/lib.sh to know to group the "build" etc. within
> the "run-build-and-test" step, but instead just to pull those to the
> top-level by running separate build & test steps.
> 

I looked at it a while ago, but I actually had a similar issue following
that series as I did this one; it's difficult to tell what's cleanup, what's
refactoring unrelated to this series, and what's an explicit difference in
approach compared with this series. 

Revisiting it now, I did the same thing I did with dscho's series: ran your
branch with some forced test failures and looked at the results [2]. Based
on that, there are a couple of helpful things I see in your series that
contribute to the same overarching goal as this dscho's:

* Separating build & test into different steps.
    * This makes it more immediately obvious to a user whether the issue was
      a compiler error or a test failure. Since test failures can only even
      happen if the compilation passes, this doesn't create (another)
      situation where the relevant failure information is in a different
      step than the auto-expanded failing one.
* Separating 'lib.sh --build' and 'make' into different steps. 
    * I was initially unsure of the value of this (conceptually, wouldn't
      they both be part of "build"?), but I eventually understood it to be
      "setup the environment for [build|test]" followed by "run the
      [build|test]". Since the main thing dscho's series is addressing is
      information visibility, I like that this similarly "unburies" the
      environment configuration at the beginning of build/test.

Those changes are great (and they probably have some positive impact on load
times). But as far as I can tell, nothing else in your series directly
addresses the main problem dscho is fixing here, which is that the verbose
failure logs are effectively hidden from the user (unless you know exactly
where to look). As a result, it doesn't really fit as a "replacement" to
this one for me. Honestly, my ideal "final form" of all of this may be a
combination of both series, having the CI steps:

- setup build environment
- run build (make)
- setup test environment
- run test (make test) & print failure logs

You can still pull the 'make' executions out of 'run-build-and-test.sh', but
I think the "& print failure logs" part of the 'test' step (i.e., the added
'|| handle_failed_tests') is the critical piece that, although it would slow
things down to some extent (and, of course, it's subjective where the "too
slow" line is), it would relevant failure information a whole lot more
accessible. That's the real "value-add" of this series for me, if only
because I know it would have helped me a bunch of times in the past - I
absolutely believe it would similarly help new contributors in the future.

[2] https://github.com/vdye/git/runs/5695895629



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux