Re: anybody tried NPTL?

Daniel Jacobowitz <dan@xxxxxxxxxx> · Mon, 23 Aug 2004 13:44:47 -0400

On Mon, Aug 23, 2004 at 07:12:57PM +0200, Ralf Baechle wrote:
> Thiemo and have been compiling various pieces of code with different
> gcc versions trying to find the best possible register for that purpose.
> We used code bloat as (weak ...) indicator for register pressure.  It
> turned out that $t9 was the best choice for all tested compiler versions;
> thanks to the much improved register allocation of newer gcc the choice
> of a particular register made far less difference on recent compilers
> than on older compilers.
> 
> I've also implemented a fast system call for reading the thread registers.
> Benchmarks did show that to have about half the latency of a regular
> syscall; the hope was if gcc was doing clever optimization that overhead
> would effectivly become zero.
> 
> I was favoring this low-overhead syscall approach because it would avoid
> the loss of a register thus leaving performance of non-threaded code
> unchanged but other developers generally favor the permanent allocation
> of $t9 as a thread register.

Personally, I favor doing the low-overhead syscall for o32 and then
moving to the new ABI that MIPS is talking about with a thread
register.  I'm not sure what to do about n32/n64.

> Other crazy ideas did include a per-thread mapping containing the thread
> pointer - and possibly more information in the future.

Does MIPS have an efficient way to do this for SMP?

> On the positive side if we had multiple register sets on a MIPSxx V2
> processor we could exploit that to get rid of this overheade and do
> other nice optimizations for TLB reload also.  Unfortunately these
> register sets are optional feature of the architecture only.

That's more or less what was talked about for ARM v6.

-- 
Daniel Jacobowitz