Re: [PATCH v2 5/6] mips: use per-mm page to execute FP branch delay slots

Ralf Baechle <ralf@xxxxxxxxxxxxxx> · Fri, 4 Jul 2014 10:52:46 +0200

On Fri, Jul 04, 2014 at 09:06:41AM +0100, Paul Burton wrote:

> Yes, I think it would. The reason I went with the per-mm approach though
> was to try to avoid so much overhead. I suppose we could possibly
> allocate the page on demand so that threads which don't use FP don't pay
> for it, and maybe use the shrinker interface to free the page if we run
> low on memory and aren't currently executing from it. Though it would
> mean that the FP branch delay "emulation" could fail if memory is tight,
> but I suppose that's no worse than now where it could blow the (user)
> stack.
> 
> I'll try to get a v3 out at some point soon.

The actual piece of code that needs to be installed is tiny.  So the page
could be shared between many threads.  In fact a single page would
suffice for most processes; only heavily threaded processes would need
more slots than a single page provides, in which case more pages could
be allocated or the process could sleep until a slot becomes available.

Assuming the smallest supported page size of 4k and slots of 128 bytes
(that is the largest S-cache line size in common use), that's 32 slots
per page.
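
To make the arithmetic concrete, here is a rough userspace sketch of how
such a page could be carved into slots and tracked with a bitmap.  This
is only an illustration, not proposed kernel code: struct emupage,
slot_alloc(), slot_free() and slot_offset() are hypothetical names, and
locking and the "sleep until a slot is free" path are left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, taken from the figures above. */
#define EMUPAGE_SIZE   4096u                       /* smallest page size */
#define SLOT_SIZE      128u                        /* one S-cache line   */
#define SLOTS_PER_PAGE (EMUPAGE_SIZE / SLOT_SIZE)  /* 4096 / 128 = 32    */

/* Hypothetical per-page state: bit n set means slot n is in use.
 * 32 slots fit exactly in one 32-bit word. */
struct emupage {
	uint32_t used;
};

/* Claim the lowest free slot; returns its index, or -1 if the page is
 * full (where a caller would allocate another page or sleep). */
static int slot_alloc(struct emupage *p)
{
	unsigned int i;

	for (i = 0; i < SLOTS_PER_PAGE; i++) {
		if (!(p->used & (UINT32_C(1) << i))) {
			p->used |= UINT32_C(1) << i;
			return (int)i;
		}
	}
	return -1;
}

/* Release a slot previously returned by slot_alloc(). */
static void slot_free(struct emupage *p, int slot)
{
	assert(slot >= 0 && (unsigned int)slot < SLOTS_PER_PAGE);
	p->used &= ~(UINT32_C(1) << slot);
}

/* Byte offset of a slot within the page. */
static unsigned int slot_offset(int slot)
{
	return (unsigned int)slot * SLOT_SIZE;
}
```

With these sizes the last slot starts at offset 31 * 128 = 3968, and a
33rd concurrent user of the page gets -1 back.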

I'm also wondering how insane emulation would be.  We already have the
capability to emulate a fair fraction of the instruction set.

  Ralf