On Wed, 2021-01-20 at 11:27 +0900, Hajime Tazaki wrote:
> +void __weak subarch_cpu_idle(void)
> +{
> +}
> +
>  void arch_cpu_idle(void)
>  {
>  	cpu_tasks[current_thread_info()->cpu].pid = os_getpid();
>  	um_idle_sleep();
> +	subarch_cpu_idle();

Not sure that belongs into this patch in the first place, but wouldn't
it make some sense to move the um_idle_sleep() call into
subarch_cpu_idle(), so LKL (or apps using it) can get full control?

> +/*
> + * This structure is used to get access to the "LKL CPU" that allows us to run
> + * Linux code. Because we have to deal with various synchronization requirements
> + * between idle thread, system calls, interrupts, "reentrancy", CPU shutdown,
> + * imbalance wake up (i.e. acquire the CPU from one thread and release it from
> + * another), we can't use a simple synchronization mechanism such as (recursive)
> + * mutex or semaphore. Instead, we use a mutex and a bunch of status data plus a
> + * semaphore.
> + */

Honestly, some of that documentation, and perhaps even the whole API
for LKL, feels like it should come earlier in the series. E.g. now
here I see all those lkl_mutex_lock() calls (where, btw, the
documentation doesn't always match the function name), so you *don't*
have the function ops pointer struct anymore?

It'd be nice to have some high-level view of what applications *must*
do, and what they *can* do, at the beginning of this series.

> + *
> + * This algorithm assumes that we never have more the MAX_THREADS
> + * requesting CPU access.
> + */
> + #define MAX_THREADS 1000000

What implications does that value have? It seems several orders of
magnitude too large?

> +static int __cpu_try_get_lock(int n)
> +{
> +	lkl_thread_t self;
> +
> +	if (__sync_fetch_and_add(&cpu.shutdown_gate, n) >= MAX_THREADS)
> +		return -2;

Feels like that could be some kind of enum here instead of -2 and -1
and all that magic.
> +	/* when somebody holds a lock, sleep until released,
> +	 * with obtaining a semaphore (cpu.sem)
> +	 */

nit:

	/*
	 * use this comment style
	 */

> +void switch_threads(jmp_buf *me, jmp_buf *you)
> +{
> +	/* NOP */
> +}

Why, actually? Again, this goes back to the high-level design thing I
alluded to above, but it's not clear to me why you need idle (which
implies you're running the scheduler) but not this (which implies
you're *not* running the scheduler)?

johannes