Re: Running ttm_device_test leads to list_add corruption. prev->next should be next (ffffffffc05cd428), but was 6b6b6b6b6b6b6b6b. (prev=ffffa0b1a5c034f0) (kernel 6.7.5)

Hi Maxime,

On 21.02.24 at 15:41, Maxime Ripard wrote:
Hi Christian,

On Tue, Feb 20, 2024 at 04:03:57PM +0100, Christian König wrote:
On 20.02.24 at 15:56, Maxime Ripard wrote:
On Tue, Feb 20, 2024 at 02:28:53PM +0100, Christian König wrote:
[SNIP]
This kunit test is not meant to be run on real hardware, but rather just as
a standalone kunit test within user mode Linux. I was assuming that it
doesn't even compile on bare metal.

We should probably either double check the kconfig options to prevent
compiling it or modify the test so that it can run on real hardware as well.
I think any cross-compiled kunit run will be impossible to differentiate
from running on real hardware. We should just make it work there.
The problem is that what the unit test basically does is register and
destroy a dummy device to see if initialization and teardown of the global
pools work correctly.

If you run on real hardware and have a real device
I assume you mean a real DRM device backed by TTM here, right?

Right.

in addition to the dummy device, the reference count of the global
pool never drops to zero, so it is never torn down.

So running this test just doesn't make any sense in that environment.
Any idea how to work around that?
I've added David, Brendan and Rae in Cc.

To sum up the problem, your tests are relying on the mock device created
to run a kunit test to be the sole DRM device in the system. But if you
compile a kernel with the kunit tests enabled and boot that on real
hardware, then that assumption might no longer hold and things
break apart. Is that a fair description?

Yes, exactly that.
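
To make the failure mode concrete, here is a minimal user-space sketch of the reference counting involved. It is only a model under stated assumptions: the struct and every function name below are hypothetical and are not the TTM API. It just shows why a register-and-destroy test can only observe the teardown of the global pool when the dummy device is the sole holder of a reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are the TTM API. */
struct global_pool {
	unsigned int refcount;   /* one reference per registered device */
	bool initialized;        /* true while the shared pool state exists */
	unsigned int teardowns;  /* how often the pool was actually torn down */
};

/* Registering a device takes a reference; the first one sets the pool up. */
static void model_device_init(struct global_pool *pool)
{
	if (pool->refcount++ == 0)
		pool->initialized = true;
}

/* Removing a device drops a reference; only the last one tears the pool down. */
static void model_device_fini(struct global_pool *pool)
{
	assert(pool->refcount > 0);
	if (--pool->refcount == 0) {
		pool->initialized = false;
		pool->teardowns++;
	}
}

/*
 * Mimics the test: register and destroy a dummy device, then report
 * whether the global pool was torn down as the test expects.
 */
static bool model_dummy_device_test(struct global_pool *pool)
{
	unsigned int before = pool->teardowns;

	model_device_init(pool);
	model_device_fini(pool);
	return pool->teardowns == before + 1;
}
```

With a zeroed pool (the UML-like case) model_dummy_device_test() reports a teardown. If model_device_init() has already been called once for a "real" device, the same call reports none, because the count only falls back to one.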


If so, maybe we could detect if it's running under qemu or UML (if
that's something we can do in the first place), and then extend
kunit_attributes to only run that test if it's in a simulated
environment.

Yeah, as I said, AMD's CI is running those tests with UML only and I strongly assume Intel is doing the same.

In my reply to the reporter of the bug I provided a patch which limits the tests to (UML || COMPILE_TEST), which as far as I can see is the easiest option for now.

We could detect that we are not in UML and skip the device test, but that's also rather pointless. Better not to provide the option in the first place.

Regards,
Christian.


Maxime

