Re: Minimalistic consistency model (one step above null model)

Josh Poimboeuf <jpoimboe@xxxxxxxxxx> · Fri, 14 Nov 2014 13:58:06 -0600

On Fri, Nov 14, 2014 at 07:27:22AM +0100, Vojtech Pavlik wrote:
> On Thu, Nov 13, 2014 at 11:22:10PM -0600, Josh Poimboeuf wrote:
> 
> > > The really trivial solution is:
> > > 
> > > 
> > > 	static void lpc_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > > 					struct ftrace_ops *ops, struct pt_regs *regs)
> > > 	{
> > > 		struct lpc_func *func = ops->private;
> > > 
> > > 		if (!within_patched_set(parent_ip))
> > > 			regs->ip = func->new_addr;
> > > 	}
> > > 
> > > 
> > > Obviously, within_patched_set() would need an efficient data structure,
> > > like an interval tree, to quickly decide whether the parent ip address
> > > is within the patched set.
> > > 
> > > This is enough to make sure that function calls are consistent.
> > > 
> > > No stack checking, no stop_machine(), no forcing threads through
> > > userspace.
> > > 
> > > A disadvantage is that it doesn't give an indication when patching is
> > > over and the trampoline can be removed. For that, I currently don't see
> > > a better method than tracking kernel exits and/or stack analysis.
> > > 
> > > What do you think?
> > 
> > Interesting idea.  But the hard part is figuring out when the patching
> > is complete.  Once you implement that, it might not be so minimal.
> 
> True.
> 
> The same applies to the current livepatch implementation, though.  You
> cannot tell if all threads have left the functions you've patched or
> whether they're remaining on the stack and as such cannot remove patches
> safely nor start transforming data.

Yes, the current livepatch code can't remove patches.  But it can at
least disable them.  If you follow the rule that function prototype and
data changes aren't allowed, then you can do a crude cumulative patch
upgrade by disabling the old patch module and loading a new one.  You
can also do incremental patching, if desired.  This is a workable first
solution, I think.  It's a placeholder until we can get a real
consistency model agreed upon and implemented.

As a temporary solution, I don't see it being much of a problem that you
can't change function prototypes.  In most cases I think you can work
around that with a little bit of creativity.  In fact, this limitation
is really a benefit as it keeps the code very simple.

Once you add the above minimal approach, by definition you have to start
allowing function prototype changes.  And then, for each patch you need
to start tracking its patched function set and its patching progress,
which complicates things considerably when you have to deal with
multiple patches.  So I think this starts to look like a not-so-minimal
consistency model, with the only benefit being that you can change
function prototypes.

> The above implementation of extending the current livepatch to handle
> function prototype changes can be further extended to handle multiple
> patches 'in progress' by only jumping to a 'next level' patch if coming
> from a function not changed by that level.
> 
> 
> > BTW, here's a very simple way to implement LEAVE_PATCHED_SET +
> > SWITCH_THREAD:
> > 
> > for each thread {
> >     if (!on_cpu(thread) and !on_stack(thread, funcs) {
> >         switch thread to new universe
> >     } else if (unfreezable kthread) {
> >         retry and/or fail (kthread is either CPU-bound or is using patched func)
> >     } else {
> >         freeze thread
> > 	if (!on_stack(thread, funcs))
> > 	    switch thread to new universe
> > 	else
> > 	    fail (patch affects freezer/schedule path)
> >     }
> > }
> 
> I think this could work. You need to make sure you can prevent a process
> from waking up, however, to avoid race conditions in the first if().

Yeah, this is the tricky part.  I'm not sure how to do it properly
without hacking the scheduler.  I believe this race also exists with
Masami's kpatch implementation.

> A very similar method could be used to detect if the minimalistic
> approach above finished patching, trampolines can be removed, data
> patching started, etc.

True, but then what's the advantage of doing all that instead of a
proper LEAVE_PATCHED_SET SWITCH_THREAD approach?

> > Note this approach will fail in a few cases, like if you have a
> > CPU-bound kthread (but we can keep retrying in that case), if a patched
> > function is in use by a sleeping kthread, or if the patch affects the
> > freeze/schedule path.
> 
> None of the so far proposed methods can complete patching if a patched
> function is in use by a sleeping kthread, or even any thread. So the
> limitations you show aren't a huge problem.
>
> > But the code is simple and converges to the patched state quickly.
> > 
> > Thoughts?
> 
> You could avoid the freezing (and thus userspace impact) by tracking
> syscall (or freezer) entry/exit like kGraft does.

Yes, in the rare cases where we'd have to freeze a running thread, there
is a user space performance impact.  It might be useful to measure it.
I'd imagine it's much better than stop_machine.

> Any process cpu-bound in userspace can be migrated immediately

Good point.  I wonder if there's a race-free way to do that without
modifying the syscall code.

> and any proces executing syscalls can be migrated when on the
> kernel/userspace boundary. And processes that are sleeping are already
> handled.

Yes, but now you're getting into LEAVE_KERNEL territory.  And we'd have
to modify syscall and kthread code again...

The benefits of the freezer approach are that it kills two birds with
one stone in a very simple way, and doesn't necessarily require an "in
progress" patching state like options b and c.

The drawbacks are the rare performance impact and the fact that it will
fail sometimes.  Most of the failures would be predictable ahead of
time, since I think(?) the freezer code path is consistent.

But it would be hard to predict which functions would be used by a
sleeping kthread.  I suppose we could avoid the unpredictable failures
by setting an "in progress" state instead of returning an error, but
that eliminates one of its advantages.  Another approach to this problem
would be to continue with the work of converting kthreads to workqueues
and/or making kthreads freezable.

But... maybe let's forget all that.  Here's yet another idea which is
currently my favorite :-)

1. Install an ftrace handler at the beginning of every to-be-patched
   function which converts the current thread to the new universe if
   there are no other patched functions on its stack.

2. Run the following on each cpu (smp_call_function):
	for each thread
		if task_cpu(task) == this cpu
			check backtrace and convert thread to new universe if possible

   [ This introduces some scheduling latency, but it's much better than
   stop_machine. ]

3. Send fake signals to any remaining old universe tasks, and wait for
   the ftrace handler to convert them.  Then repeat step 2 to catch
   those stragglers for which a signal didn't cause it to run the same
   code path again.  Now the patching is complete (no "in progress"
   state needed).

3. (Alternative) Have an "in progress" state.  Periodically check if all
   tasks have been converted and repeat step 2 if not.  User can
   optionally send signals to tasks to get them to exit out of the old
   functions.

-- 
Josh
--
To unsubscribe from this list: send the line "unsubscribe live-patching" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html