Re: [PATCH] common/config: use modprobe -w when supported

Eric Sandeen <sandeen@xxxxxxxxxx> · Wed, 4 Dec 2024 22:35:45 -0600

On 12/4/24 6:26 PM, Luis Chamberlain wrote:
> Eric, I saw you mentioning on IRC you didn't understand *why*
> the patient module remover was added. Even though I thought the
> commit log explained it, let me summarize again: fix tons of
> flaky tests which assume module removal is being done correctly.
> It is not and fixing this is a module specific issue like with
> scsi_debug as documented in the commit log bugzilla references.
> So any sane test suite thing relying on module removal should use
> something like modprobe -w <timeout-in-ms>.

Well, I was having a sad because the upshot of all of that was
that when xfs could not be removed at all, because it was the root
filesystem, the end result was something like this:

    --- tests/xfs/435.out	2024-11-21 05:13:04.000000000 -0500
    +++ /var/lib/xfstests/results//xfs/435.out.bad	2024-11-21 05:14:47.949206141 -0500
    @@ -3,3 +3,4 @@
     Create a many-block file
     Remount to check recovery
     See if we leak
    +custom patient module removal for xfs timed out waiting for refcnt to become 0 using timeout of 50

which is kind of nonsense, but that probably has more to do with
the test not realizing /before it starts/ that the module cannot
be removed and it should not even try.

Darrick fixed that with:

[PATCH 2/2] xfs/43[4-6]: implement impatient module reloading

but it's starting to feel like a bit of a house of cards by now.
We might need a more robust framework for determining whether
a module is removable /at all/ before we decide to wait patiently
for something that can never happen?
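
To illustrate the kind of up-front check I mean, here is a rough sketch
only. The helper name and its optional mount-table argument are made up
for this example and are not part of fstests. It only covers the case
where a filesystem module is pinned by a mount (like xfs as the root
filesystem), not any other reason a module might be unremovable:

```shell
# Hypothetical helper, not an existing fstests function.
# $1: filesystem type (e.g. xfs)
# $2: mount table to inspect (assumed default: /proc/mounts)
# Returns 0 if no mounted filesystem of that type is listed, so removal
# is at least worth attempting; returns 1 if one is mounted.
fs_module_maybe_removable()
{
	# Field 3 of each mount table line is the filesystem type.
	# awk exits 0 when a match is found; the leading ! inverts that.
	! awk -v t="$1" '$3 == t { found = 1 } END { exit !found }' \
		"${2:-/proc/mounts}"
}
```

A test could call something like this before it starts and _notrun when
it returns non-zero, instead of timing out on the refcnt later.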

-Eric