On Fri, Jul 04, 2014 at 10:52:46AM +0200, Ralf Baechle wrote:
> On Fri, Jul 04, 2014 at 09:06:41AM +0100, Paul Burton wrote:
>
> > Yes, I think it would. The reason I went with the per-mm approach though
> > was to try to avoid so much overhead. I suppose we could possibly
> > allocate the page on demand so that threads which don't use FP don't pay
> > for it, and maybe use the shrinker interface to free the page if we run
> > low on memory and aren't currently executing from it. Though it would
> > mean that the FP branch delay "emulation" could fail if memory is tight,
> > but I suppose that's no worse than now where it could blow the (user)
> > stack.
> >
> > I'll try to get a v3 out at some point soon.
>
> The actual piece of code that needs to be installed is tiny. So the page
> could be shared between many threads. In fact a single page would
> suffice for most processes and only threads would require more slots
> than provided by a single page so more pages could be allocated or the
> process could sleep until a slot becomes available.

You just roughly described the v2 patch that we're replying to :) The
problem is how to reliably free the frame after it has been used. I can
see ways to do it, but none that are particularly "nice".

> Assuming the smallest supported page size of 4k and slots of 128 bytes
> (that is the largest S-cache line size in common use) that's 32 slots.

Why S-cache line sized slots? I suppose it could simplify updating the
page slightly at the cost of space.

> I'm also wondering how insane emulation would be. We already have the
> capability to emulate a fair fraction of the instruction set.

Yeah, and I'm reasonably sure we're going to need some more once MIPSr6
is supported. I guess (perhaps only for the short term?) it could be
done in stages - if systems include ASEs or cop2 that the emulation
didn't implement then it could fall back to the current emuframe code.
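
To make the slot arithmetic concrete (4k page / 128 byte slots = 32
slots), here is a rough userspace sketch of a one-page slot bitmap. It
is only an illustration, not code from the v2 patch: struct slot_page,
slot_alloc(), slot_free() and slot_offset() are made-up names, there is
no locking, and it says nothing about the hard part, which is knowing
when a frame can safely be freed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u                      /* smallest supported page size */
#define SLOT_BYTES 128u                       /* slot size suggested in the thread */
#define NUM_SLOTS  (PAGE_BYTES / SLOT_BYTES)  /* 4096 / 128 = 32 */

/* One bit per slot, so every slot must fit in the 32-bit bitmap. */
static_assert(NUM_SLOTS <= 32, "bitmap must cover every slot");

/* Hypothetical per-page state: a set bit means the slot is in use. */
struct slot_page {
	uint32_t used;
};

/* Claim the lowest free slot; returns its index, or -1 if the page is full. */
int slot_alloc(struct slot_page *p)
{
	for (unsigned int i = 0; i < NUM_SLOTS; i++) {
		uint32_t bit = UINT32_C(1) << i;

		if (!(p->used & bit)) {
			p->used |= bit;
			return (int)i;
		}
	}
	return -1;
}

/* Release a slot; out-of-range indices are ignored. */
void slot_free(struct slot_page *p, int slot)
{
	if (slot >= 0 && (unsigned int)slot < NUM_SLOTS)
		p->used &= ~(UINT32_C(1) << slot);
}

/* Byte offset of a slot from the start of the page. */
unsigned int slot_offset(int slot)
{
	return (unsigned int)slot * SLOT_BYTES;
}
```

When slot_alloc() returns -1 the caller would have to pick between the
two options Ralf mentions: allocate another page or sleep until a slot
is released.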

I'm in two minds about this - it sounds crazy but perhaps it's the most
sane option available :)

Paul