Re: Raptor Engineering dedicating resources to KVM on PowerNV + KVM CI/CD

Shawn Anastasio <sanastasio@xxxxxxxxxxxxxxxxxxxxx> · Wed, 8 Jan 2025 11:17:13 -0600

Hi Alex,

On 1/7/25 5:45 AM, Alex Williamson wrote:
> Hi,
> 
> What are you supposing the value to the community is for a CI pipeline
> that always fails?  Are you hoping the community will address the
> failing tests or monitor the failures to try to make them not become
> worse?

The failing tests are all isolated to issues with the specific AMD
graphics hardware that the test machine uses for the VFIO and host
GPU tests, and the cause is likely in the amdgpu driver itself. We have
filed bugs with the amdgpu folks.

The non-failing tests, however, are valuable for regression monitoring.
They include VM boot smoke tests for both little endian and big endian
ppc64/pseries targets, as well as the vfio-*-attach tests that ensure
hardware can be successfully bound to the vfio-pci driver on a PowerNV
host. The test artifacts also include full dmesg output from the host
and, when applicable, the guest machine to assist with debugging.
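
For illustration only, here is a minimal sketch of the kind of check an
attach test could end with: reading the sysfs "driver" symlink of a PCI
device and confirming that it points at vfio-pci. This is not the code
the CI pipeline runs. The function names, the sysfs_root parameter, and
the example device address are assumptions made for the sketch.

```python
import os


def bound_driver(bdf, sysfs_root="/sys"):
    """Return the name of the driver bound to PCI device `bdf`, or None.

    `bdf` is a PCI address such as "0000:01:00.0" (hypothetical example).
    `sysfs_root` is a parameter added for this sketch so the function can
    be pointed at a fake tree instead of the real /sys.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "driver")
    # A device with no driver bound has no "driver" symlink at all.
    if not os.path.islink(link):
        return None
    # The symlink target ends in the driver's directory name.
    return os.path.basename(os.readlink(link))


def is_bound_to_vfio(bdf, sysfs_root="/sys"):
    """True if the device is currently bound to the vfio-pci driver."""
    return bound_driver(bdf, sysfs_root) == "vfio-pci"
```

On a real host the call would simply be is_bound_to_vfio("0000:01:00.0")
with the address of the device under test.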

The data could definitely be presented in an easier-to-digest way that
makes it obvious which failures are regressions and which are due to
the aforementioned amdgpu issues, so that's an area for improvement.
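
One possible shape for that presentation, sketched under assumptions:
each failing test is compared against a baseline run and a hand-kept
list of tests known to fail because of the hardware or driver issue.
The test names, the "pass"/"fail" strings, and the classify() helper
are all hypothetical and do not describe the existing pipeline.

```python
def classify(results, baseline, known_hw_failures):
    """Sort test results into buckets for a summary report.

    results, baseline: dict mapping test name -> "pass" or "fail"
    known_hw_failures: set of test names expected to fail because of
                       the known hardware/driver issue
    """
    report = {"pass": [], "known_hw_issue": [], "regression": [], "other_failure": []}
    for name, status in sorted(results.items()):
        if status == "pass":
            report["pass"].append(name)
        elif name in known_hw_failures:
            # Failing, but already tracked as a hardware/driver problem.
            report["known_hw_issue"].append(name)
        elif baseline.get(name) == "pass":
            # Passed in the baseline and fails now: a regression.
            report["regression"].append(name)
        else:
            # Failing with no passing baseline and no known cause.
            report["other_failure"].append(name)
    return report


# Hypothetical test names, for illustration only.
baseline = {"boot-ppc64le": "pass", "boot-ppc64": "pass", "host-gpu": "fail"}
results = {"boot-ppc64le": "pass", "boot-ppc64": "fail", "host-gpu": "fail"}
summary = classify(results, baseline, known_hw_failures={"host-gpu"})
```

In this example summary["regression"] contains only "boot-ppc64", while
"host-gpu" lands in summary["known_hw_issue"].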

> 
> I would imagine that CI against key developer branches or linux-next
> would be more useful than finding problems after we've merged with
> mainline, but it's not clear there's any useful baseline here to
> monitor for regressions.  Thanks,
>

That's a good point -- I'll definitely look into adding at least
linux-next, as well as any other branch requests from developers.

> Alex

Thanks,
Shawn