Re: [PATCH v2 5/6] mips: use per-mm page to execute FP branch delay slots

Paul Burton <paul.burton@xxxxxxxxxx> · Fri, 4 Jul 2014 10:06:01 +0100

On Fri, Jul 04, 2014 at 10:52:46AM +0200, Ralf Baechle wrote:
> On Fri, Jul 04, 2014 at 09:06:41AM +0100, Paul Burton wrote:
> 
> > Yes, I think it would. The reason I went with the per-mm approach though
> > was to try to avoid so much overhead. I suppose we could possibly
> > allocate the page on demand so that threads which don't use FP don't pay
> > for it, and maybe use the shrinker interface to free the page if we run
> > low on memory and aren't currently executing from it. Though it would
> > mean that the FP branch delay "emulation" could fail if memory is tight,
> > but I suppose that's no worse than now where it could blow the (user)
> > stack.
> > 
> > I'll try to get a v3 out at some point soon.
> 
> The actual piece of code that needs to be installed is tiny.  So the page
> could be shared between many threads.  In fact a single page would
> suffice for most processes and only threads would require more slots
> than provided by a single page so more pages could be allocated or the
> process could sleep until a slot becomes available.

You just roughly described the v2 patch that we're replying to :)

The problem is how to reliably free the frame after it has been used.
I can see ways to do it, but none that are particularly "nice".

> Assuming the smallest supported page size of 4k and slots of 128 bytes
> (that is the largest S-cache line size in common use) that's 32 slots.

Why S-cache line sized slots? I suppose it could simplify updating the
page slightly at the cost of space.
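
For reference, the arithmetic quoted above (4k page, 128 byte slots, 32
slots) maps onto a single 32-bit bitmap. What follows is only a rough
userspace sketch under that assumption, not code from the patch: the
names frame_page, frame_slot_alloc, frame_slot_free and
frame_slot_offset are made up for illustration, and it ignores locking
and the question of when a slot can safely be freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_PAGE_SIZE 4096u	/* smallest supported page size, as quoted */
#define FRAME_SLOT_SIZE 128u	/* slot size suggested in the thread */
#define FRAME_NUM_SLOTS (FRAME_PAGE_SIZE / FRAME_SLOT_SIZE)

/* One bit per slot, so the whole page fits in a single 32-bit word. */
_Static_assert(FRAME_NUM_SLOTS == 32, "bitmap below assumes 32 slots");

/* Hypothetical bookkeeping for one page of slots. */
struct frame_page {
	uint32_t used;	/* bit n set => slot n is in use */
};

/* Claim the lowest free slot; return its index, or -1 if the page is full. */
static int frame_slot_alloc(struct frame_page *fp)
{
	for (unsigned int i = 0; i < FRAME_NUM_SLOTS; i++) {
		if (!(fp->used & (UINT32_C(1) << i))) {
			fp->used |= UINT32_C(1) << i;
			return (int)i;
		}
	}
	return -1;	/* caller would allocate another page or sleep */
}

/* Release a slot that was previously returned by frame_slot_alloc(). */
static void frame_slot_free(struct frame_page *fp, int slot)
{
	assert(slot >= 0 && (unsigned int)slot < FRAME_NUM_SLOTS);
	assert(fp->used & (UINT32_C(1) << slot));
	fp->used &= ~(UINT32_C(1) << slot);
}

/* Byte offset of a slot from the start of the page. */
static size_t frame_slot_offset(int slot)
{
	return (size_t)slot * FRAME_SLOT_SIZE;
}
```

With smaller slots the same scheme would need a wider bitmap, which is
the space/simplicity trade-off in question.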

> I'm also wondering how insane emulation would be.  We already have the
> capability to emulate a fair fraction of the instruction set.

Yeah, and I'm reasonably sure we're going to need some more once MIPSr6
is supported. I guess (perhaps only for the short term?) it could be
done in stages - if systems include ASEs or cop2 that the emulation
didn't implement then it could fall back to the current emuframe code.
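
As a rough illustration of the staged approach, the decision could be a
per-instruction check that picks between full emulation and the
existing emuframe path. This is a sketch only: bd_strategy,
emu_supports and bd_pick_strategy are hypothetical names, and the
check looks at nothing but the COP2 major opcode as a stand-in for
"something the emulator doesn't implement". A real check would also
have to cover the cop2 load/store opcodes and any ASE instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* The major opcode is held in bits 31..26 of a MIPS instruction word. */
#define MIPS_MAJOR_OP(insn)	((uint32_t)(insn) >> 26)
#define MIPS_OP_COP2		0x12u

enum bd_strategy {
	BD_EMULATE,	/* handle the delay slot instruction in software */
	BD_EMUFRAME,	/* fall back to executing it from a frame */
};

/* Hypothetical: does the software emulator cover this instruction? */
static bool emu_supports(uint32_t insn)
{
	/* Stand-in policy: everything except the COP2 major opcode. */
	return MIPS_MAJOR_OP(insn) != MIPS_OP_COP2;
}

/* Pick how to execute the instruction in the FP branch delay slot. */
static enum bd_strategy bd_pick_strategy(uint32_t insn)
{
	return emu_supports(insn) ? BD_EMULATE : BD_EMUFRAME;
}
```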

I'm in two minds about this - it sounds crazy but perhaps it's the most
sane option available :)

Paul