Re: [RFC] MIPS: Add cacheinfo support

Leonid Yegoshin <Leonid.Yegoshin@xxxxxxxxxx> · Thu, 8 Dec 2016 17:01:53 -0800

On 12/08/2016 04:28 PM, Justin Chen wrote:
> Thanks for the comments, Leonid.
>
> We should consider the scope of this patch. The information we are
> trying to expose to userspace is limited to the struct cacheinfo
> located at include/linux/cacheinfo.h (of course this can always be
> expanded). So the question is what information would be useful to
> expose to userspace.
> Some justification for exposing the current information in the
> cacheinfo struct could be: (Pulled from another email chain)
> "Agreed. So far I have got requests from GCC, JIT and graphics guys.
> IIUC they need this to support cache flushing for user applications like
> gcc trampolines and JIT compilers. I am also told that having knowledge
> of cache architecture can help optimal code strategies, though I don't
> have much details on that."
> https://patchwork.kernel.org/patch/5867721/
>
> There may be justification for including the points you mentioned
> above, but I believe that is outside the scope of this patch. The
> cache information exposed in this patch is limited, but I do not
> believe it is useless. The points above can be added, but that will
> require a rework of the base cacheinfo driver
> (drivers/base/cacheinfo.c).

Justin,

The CACHE instruction is not available in user space; only SYNCI is
(on MIPS R2+), for trampolines.
Any operation with CACHE requires a syscall.

As for SYNCI (trampoline, L1D->L1I), the following information is
needed in user space:

    1. The L1 line size (available via RDHWR $x, $1). It is available
now directly from the CPU, but it may be better to supply it from the
kernel because some cores do not have it.

    2. A flag indicating that L1I is NOT coherent with L1D, so SYNCI is
needed, and that SYNCI is available.
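
To make the two points concrete, here is a rough user-space sketch of how
a trampoline/JIT writer might use the line size. The helper names
(synci_step, lines_covering, sync_icache_range) are made up for
illustration and are not taken from the patch. The MIPS path is an
untested assumption guarded by #if; everywhere else it just falls back
to the compiler builtin __builtin___clear_cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read the SYNCI step (RDHWR register $1) on
 * MIPS R2+. Returns 0 when this build cannot query it. */
static size_t synci_step(void)
{
#if defined(__mips__) && defined(__mips_isa_rev) && (__mips_isa_rev >= 2)
    unsigned long step;
    __asm__ __volatile__("rdhwr %0, $1" : "=r"(step));
    return (size_t)step;
#else
    return 0;
#endif
}

/* Number of cache lines covering [addr, addr + len) for a
 * power-of-two line size. */
static size_t lines_covering(uintptr_t addr, size_t len, size_t line)
{
    if (len == 0 || line == 0)
        return 0;
    assert((line & (line - 1)) == 0);   /* line must be a power of two */
    uintptr_t first = addr & ~(uintptr_t)(line - 1);
    uintptr_t last = (addr + len - 1) & ~(uintptr_t)(line - 1);
    return (size_t)((last - first) / line) + 1;
}

/* Hypothetical helper: make freshly written code in [start, start + len)
 * visible to instruction fetch. */
static void sync_icache_range(void *start, size_t len, size_t line)
{
    if (len == 0)
        return;
#if defined(__mips__) && defined(__mips_isa_rev) && (__mips_isa_rev >= 2)
    if (line != 0) {
        uintptr_t p = (uintptr_t)start & ~(uintptr_t)(line - 1);
        uintptr_t end = (uintptr_t)start + len;
        for (; p < end; p += line)      /* one SYNCI per cache line */
            __asm__ __volatile__("synci 0(%0)" : : "r"(p) : "memory");
        __asm__ __volatile__("sync" : : : "memory");
        /* A real implementation also needs an instruction hazard
         * barrier (e.g. JR.HB) before jumping to the new code. */
        return;
    }
#endif
    (void)line;
    /* Portable fallback: let the compiler/runtime pick the mechanism. */
    __builtin___clear_cache((char *)start, (char *)start + len);
}
```

A caller would do something like sync_icache_range(buf, n, synci_step())
after writing the trampoline. The sketch does not cover point 2: it has
no way to know whether L1I is already coherent with L1D, which is
exactly the flag the kernel would have to supply.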

Knowledge of the L1/L2 sizes is not needed for a regular application.
Still, if an application wants to take advantage of the cache sizes,
that information can be supplied.

But it is unreliable, because the app may be rescheduled onto a
different kind of core with a different L1 size (in a heterogeneous
system, BTW). That can be fixed by setting affinity, of course (I am
not sure whether it can be done reliably in a big.LITTLE approach). But
that requires the application to know and understand the system's CPU
structure. So why should we allow all that for anything besides
informational purposes? It undermines all the kernel's effort and
optimization for performance and power saving.
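
For illustration only, a sketch of how an application could notice this
heterogeneity from the generic cacheinfo sysfs files
(/sys/devices/system/cpu/cpuN/cache/indexM/size). The function names are
invented here, and treating a given index as "the L1 data cache" is an
assumption: a careful reader would check the level and type files in the
same directory first.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a sysfs cache "size" string such as "32K\n" into bytes.
 * Returns 0 if no number is found. */
static unsigned long parse_cache_size(const char *s)
{
    char *end;
    unsigned long v = strtoul(s, &end, 10);

    if (end == s)
        return 0;
    if (*end == 'K')
        v *= 1024UL;
    else if (*end == 'M')
        v *= 1024UL * 1024UL;
    return v;
}

/* Size in bytes of cache index `idx` on `cpu`, or 0 if the file is
 * missing or unreadable. */
static unsigned long cache_size_of(int cpu, int idx)
{
    char path[128], buf[32];
    unsigned long v = 0;
    FILE *f;

    snprintf(path, sizeof path,
             "/sys/devices/system/cpu/cpu%d/cache/index%d/size", cpu, idx);
    f = fopen(path, "r");
    if (!f)
        return 0;
    if (fgets(buf, sizeof buf, f))
        v = parse_cache_size(buf);
    fclose(f);
    return v;
}

/* Returns 1 if any two CPUs in 0..ncpu-1 that report a size for cache
 * index `idx` disagree, else 0. CPUs without the file are skipped. */
static int cache_sizes_differ(int ncpu, int idx)
{
    unsigned long ref = 0;

    for (int cpu = 0; cpu < ncpu; cpu++) {
        unsigned long v = cache_size_of(cpu, idx);

        if (v == 0)
            continue;
        if (ref == 0)
            ref = v;
        else if (v != ref)
            return 1;
    }
    return 0;
}
```

If cache_sizes_differ() returns 1, any tuning based on a single size
value holds only for as long as the task stays on the core it was
measured on, which is the point made above.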

Regards,
- Leonid.