Re: [PATCH v2 4/4] livepatch/shadow: Add garbage collection of shadow variables

Petr Mladek <pmladek@xxxxxxxx> · Thu, 2 Feb 2023 14:58:31 +0100

On Tue 2023-01-31 13:17:31, Josh Poimboeuf wrote:
> On Tue, Jan 31, 2023 at 03:23:58PM +0100, Petr Mladek wrote:
> > > > +
> > > >  void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
> > > >  void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
> > > >  		       size_t size, gfp_t gfp_flags, void *ctor_data);
> > > 
> > > I find the type-based interface to be unnecessarily clunky and
> > > confusing:
> > > 
> > > - It obscures the fact that uniqueness is determined by ID, not by type.
> > > 
> > > - It's weird to be passing around the type even after it has been
> > >   registered: e.g., why doesn't klp remember the ctor/dtor?
> > 
> > ctor/dtor must be implemented in each livepatch. klp_shadow_register()
> > would need to remember all registered implementations.
> > klp_shadow_alloc/get/free would need to find and choose one.
> > It would add an extra overhead and non-determines.
> 
> There's really no need to even "register" the constructor.  It can
> continue to just be passed as needed to the alloc functions.

Sure. But it looks weird to handle the constructor another way then
the destructor. I mean that if we registrer the destructor then we
should registrer the constructor as well.

I see an analogy with C++ here. klp_shadow_register_type() is similar
to defining a class. And both constructor and destructor are defined
in the class definition.

A custom constructor in C++ might allow to pass extra parameters
needed for initialization. In our case, these are obj, size,
gpf_flags, ctor_data.

> (Side note, I don't see why klp_shadow_alloc() even needs a constructor
> as it's typically used when its corresponding object gets allocated, so
> its initialization doesn't need to be atomic.  klp_shadow_get_or_alloc()
> on the other hand, does need a constructor since it's used for attaching
> to already-existing objects.)

If we agree that klp_shadow_get_or_alloc() needed a constructor then
klp_shadow_alloc() should do the same. Different behavior would add
more confusion from my POV.

> The thing really complicating this whole mess of an API is the
> destructor.  The problem is that it can change when replacing
> livepatches, so we can't just remember it in the registration.
> 
> At least at Red Hat, I don't think we've ever used a shadow destructor.
> Its real world use seems somewhere between rare and non-existent.

I checked SUSE livepatches and we use the destructor once
for mutex_destroy(). I guess that it made the livepatch more safe.
Also it might have been useful for testing.

> I guess a destructor would be needed if the constructor (or some other
> initialization) allocated additional memory associated with the shadow
> variable.  That data would need to be freed.

It seems that the constructor is primary used to initialize the
structure members, for example, locks.

The commit e91c2518a5d22a07642f35 ("livepatch: Initialize shadow
variables safely by a custom callback") mentions as an example
list_head. The constructor might want to link it into another list.
In that case, the destructor would do the opposite.

> But in that case couldn't an additional shadow variable be used for the
> additional memory?  Then we could just get rid of this idea of the
> destructor and this API could become much more straightforward.
> 
> Alternatively, each livepatch could just have an optional global
> destructor which is called for every object which needs destructing.  It
> could then use the ID to decide what needs done (or pass it off to a
> more specific destructor).

Honestly, I do not think that it would really make things easier
for people developing the livepatches.

Summary:

My understanding is that you do not like the following things:

1. Problem: Having both struct klp_shadow_type and
	 struct klp_shadow_type_reg.

   Proposal: Get rid of struct klp_shadow_type_reg. Instead, add
	a list_head into struct klp_shadow_type and use it for
	registration and reference counting.

2. Problem: Using struct klp_shadow_type * instead of ID in
	the klp_shadow API.

   Solution:
	a) Find the shadow_type by ID.
	b) Get rid of klp_shadow_type completely

3. Problem: The API is obscure in general and this makes it even worse

   Solution:
       a) Do not pass constructor in klp_shadow_alloc()
       b) Do not support destructor
       c) Have global destructor

Requirements:

We agree that a constructor is needed at least for
klp_shadow_get_or_alloc().

The garbage collection solves a real problem. It provides a real
solution instead of custom hacks.

My opinion:

The shadow API is used rarely so that livepatch developer do not
have much experience with it.

The shadow API is usually used for some tricky things. And it is
used for custom livepatch-specific modifications that do not
get much review.

An ideal API should be simple. But we agree that the constructor is
needed in klp_shadow_get_or_alloc(). It defines the basic
complexity of the API.

The API should be predictable. IMHO, klp_shadow_alloc() should handle
the constructor the same way as klp_shadow_get_or_alloc().

Also it would help when the API is symmetric. I would keep the
destructor. And if register destructor then we should register
constructor as well.

I do not see big difference between passing ID or struct
klp_shadow_type * in the API. IMHO, it is better than a common
destructor.

Summary: I would get rid of struct klp_shadow_type_reg and
	keep the rest as is.

Proposal:

If we can't find an agreenment. Then the decision should be made
by people actually using the API. On the SUSE side, it is Nicolai
and Marcos.

IMHO, the main question is whether we need custom destructors.
We could avoid struct klp_shadow_type if the destructors are
not needed...

> > > - I don't understand the exposed klp_shadow_{un}register() interfaces.
> > >   What's the point of forcing their use if there's no garbage
> > >   collection?
> > 
> > I probably do not understand it correctly.
> > 
> > Normal livepatch developers do not need to use klp_shadow_{un}register().
> > They are called transparently in __klp_enable_patch()/klp_complete_transition()
> > and klp_module_coming()/klp_cleanup_module_patches_limited().
> > 
> > The main reason why they are exposed via include/linux/livepatch.h
> > and EXPORT_SYMBOL_GPL() is the selftest lib/livepatch/test_klp_shadow_vars.c.
> > 
> > Well, the selftest probably could be bundled into a livepatch to go
> > around this.
> 
> In that case, at the very least they should be prefixed with underscores
> and the function comment should make it clear they shouldn't be called
> under normal usage.

Good point. Well, I think that we should start testing the klp_shadow API using
livepatches. So that it won't be necessary to export the registration API.

Best Regards,
Petr