Re: [PATCH v3 00/35] Memory allocation profiling

Kent Overstreet <kent.overstreet@xxxxxxxxx> · Wed, 14 Feb 2024 11:38:14 -0500

On Wed, Feb 14, 2024 at 11:20:26AM +0100, Vlastimil Babka wrote:
> On 2/14/24 00:08, Kent Overstreet wrote:
> > And, as I keep saying: that alloc_hooks() macro will also get us _per
> > callsite fault injection points_, and we really need that because - if
> > you guys have been paying attention to other threads - whenever moving
> > more stuff to PF_MEMALLOC_* flags comes up (including adding
> > PF_MEMALLOC_NORECLAIM), the issue of small allocations not failing and
> > not being testable keeps coming up.
> 
> How exactly do you envision the fault injection to help here? The proposals
> are about scoping via a process flag, and the process may then call just
> about anything under that scope. So if our tool is per callsite fault
> injection points, how do we know which callsites to enable to focus the
> fault injection on the particular scope?

So the question with fault injection is - how do we integrate it into
our existing tests?

We need fault injection that we can integrate into our existing tests
because that's the only way to get the code coverage we need - writing
new tests that cover all the error paths isn't going to happen, and
wouldn't work as well anyways.

But the trouble with injecting memory allocation failures is that
they'll result in errors bubbling up to userspace, and in unpredictable
ways.

We _definitely_ cannot enable random memory allocation faults for the
entire kernel at runttme - or rather we _could_, and that would actually
be great to do as a side project; but that's not something we can do in
our existing automated tests because the results will be completely
unpredictable. If we did that the goal would be to just make sure the
kernel doesn't explode - but what we actually want is for our automated
pass/fail tests to still pass; we need to constrain what will fail.

So we need at a minumum to be able to only enable memory allocation
failures for the code we're interested in testing (file/module) -
enabling memory allocation failures in some other random subsystem we're
not developing or looking at isn't what we want.

Beyond that, it's very much subsystem dependent. For bcachefs, my main
strategy has been to flip on random (1%) memory allocation failures
after the filesystem has mounted. During startup, we do a ton of
allocations (I cover those with separate tests), but after startup we
should be able to run normally in the precence of allocation failures
without ever returning an error to userspace - so that's what I'm trying
to test.