Re: [PATCH] MIPS: Fix ejtag handler on SMP

Paul Burton <paul.burton@xxxxxxxx> · Fri, 8 Jun 2018 15:37:05 -0700

Hi Heiher,

On Fri, Jun 08, 2018 at 01:51:09PM +0800, r@xxxxxx wrote:
> diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
> index 37b9383eacd3..3804afd878f8 100644
> --- a/arch/mips/kernel/genex.S
> +++ b/arch/mips/kernel/genex.S
> @@ -354,16 +354,56 @@ NESTED(ejtag_debug_handler, PT_SIZE, sp)
>  	sll	k0, k0, 30	# Check for SDBBP.
>  	bgez	k0, ejtag_return
>  
> +#ifdef CONFIG_SMP
> +1:	PTR_LA	k0, ejtag_debug_buffer_spinlock
> +	ll	k0, 0(k0)
> +	bnez	k0, 1b
> +	PTR_LA	k0, ejtag_debug_buffer_spinlock
> +	sc	k0, 0(k0)
> +	beqz	k0, 1b
> +# ifdef CONFIG_WEAK_REORDERING_BEYOND_LLSC
> +	sync
> +# endif
> +
> +	PTR_LA	k0, ejtag_debug_buffer
> +	LONG_S	k1, 0(k0)
> +
> +	mfc0	k1, CP0_EBASE
> +	andi	k1, MIPS_EBASE_CPUNUM
> +	PTR_SLL	k1, LONGLOG
> +	PTR_LA	k0, ejtag_debug_buffer_per_cpu
> +	PTR_ADDU k0, k1

Neat - I like the concept of using the spinlock for just a short window
to free up k1, then using a per-CPU buffer.

Unfortunately it's not going to be as simple as using EBase.CPUNum, for
at least 2 reasons:

  - EBase.CPUNum doesn't always equal the Linux CPU number. For example
    in many systems which implement multi-threading ASE CPUNum is
    actually just a concatenation of some fixed number of bits
    specifying the core number and some fixed number of bits specifying
    the VP(E) number. So for example we might have a system with 2 cores
    each of which have 2 threads, but whose numbering looks like this:

      Linux CPU Number | Core Number | VP(E) Number | EBase.CPUNum
      -----------------|-------------|--------------|-------------
                     0 |  0b00 / 0x0 |   0b00 / 0x0 | 0b0000 / 0x0
		     1 |  0b00 / 0x0 |   0b01 / 0x1 | 0b0001 / 0x1
		     2 |  0b01 / 0x1 |   0b00 / 0x0 | 0b0100 / 0x4
		     3 |  0b01 / 0x1 |   0b01 / 0x1 | 0b0101 / 0x5

    This means that it might be the case that EBase.CPUNum >= NR_CPUS,
    which would result in a buffer overflow here.

    There are other ways this could happen too, for example if the
    kernel is configured with CONFIG_NR_CPUS=2 and run on a system which
    actually has more than 2 CPUs we'd have problems if the kernel was
    actually running on CPUs besides the ones CPUNum refers to as 0 & 1.

  - Some newer MIPS CPUs such as the I6500 introduce the concept of
    clusters, which are a layer of CPU topology beyond cores. In these
    systems EBase.CPUNum may not be aware of clusters at all, so for
    example the first CPU in each cluster may both have EBase.CPUNum
    equal to zero. MIPSr6 introduced the GlobalNumber register to
    resolve this problem.

One option would be to load the CPU number the same way
smp_processor_id() does with:

    lw	k1, TI_CPU(gp)

This has the disadvantage that things may go wrong if we take an EJTAG
exception before the gp register has been setup when a CPU is onlined,
but that should hopefully be a very small window of time. I think that
may be our best option.

Thanks,
    Paul