Re: unload and reload modules with patched function

Song Liu <songliubraving@xxxxxx> · Fri, 22 Jul 2022 16:53:28 +0000

Hi Petr, 

> On Jul 22, 2022, at 3:40 AM, Petr Mladek <pmladek@xxxxxxxx> wrote:
> 
> On Wed 2022-07-20 23:57:25, Song Liu wrote:
>> Hi folks,
>> 
>> While testing livepatch kernel modules, we found that if a kernel module has
>> patched functions, we cannot unload and load it again (rmmod, then insmod).
>> This hasn't happened in production yet, but it feels very risky. We use
>> automation (chef to be specific) to handle kernel modules and livepatch.
>> It is totally possible some weird sequence of operations would cause insmod
>> errors on thousands of servers. Therefore, we would like to fix this issue
>> before it hits us hard.
>> 
>> A bit of searching with the error message shows it was a known issue [1], and
>> a few options are discussed:
>> 
>> "Possible ways to fix it:
>> 
>> 1) Remove the error check in apply_relocate_add(). I don't think we
>> should do this, because the error is actually useful for detecting
>> corrupt modules. And also, powerpc has the similar error so this
>> wouldn't be a universal solution.
>> 
>> 2) In klp_unpatch_object(), call an arch-specific arch_unpatch_object()
>> which reverses any arch-specific patching: on x86, clearing all
>> relocation targets to zero; on powerpc, converting the instructions
>> after relative link branches to nops. I don't think we should do
>> this because it's not a global solution and requires fidgety
>> arch-specific patching code.
>> 
>> 3) Don't allow patched modules to be removed. I think this makes the
>> most sense. Nobody needs this functionality anyway (right?).
>> "
> 
> Just for completeness there is one more possibility. We have sometimes
> discussed a split of the livepatch module into per-object
> (vmlinux + per-module) modules. So that modules can be loaded and
> unloaded together with the respective livepatch counter parts.
> 
> I have played with this idea some (years) ago. It was quite
> complicated because of the consistency model. If I remember correctly
> the main challenges were:
> 
> 1. The livepatch module must be loaded together with all related
> livepatch modules for all loaded modules before the transition
> is started.
> 
> 2. If any module is loaded then it must wait in MODULE_STATE_COMMING
> until the related livepatch module is loaded and the livepatch
> ftrace callbacks applied.
> 
> 3. The naming is a nightmare.
> 
> 
> Ad 1. and 2.: It needs some "hacks" in the module loader. It requires
> calling modprobe from kernel code which some people hate.
> 
> Ad 3: Livepatch is a module. The per-object livepatch is set of related
> modules. The livepatch modules do livepatch vmlinux and "normal"
> modules. It is easy to get lost in the terms. Especially it hard
> to distinguish "livepatched modules" and "livepatch modules"
> in code (variable and function names) and comments.

+1 on naming is so hard for this. 

> 
> 
> I have never published the POC because it was not finished and it got
> less important after removing the most of the arch-specific code.
> I could put it somewhere when anyone is interested.
> 
> Anyway, I think that it is _not_ the way to go. IMHO, the split
> livepatch modules bring more problems than they solve.

Thanks for sharing these experiences! Let's hope option 2) or 3)
would fix the issue. 

Song