Re: [PATCH] Document LWS ABI.

> The question is "Are you OK with the existing ABI?" :-)

No.  As I understand it, r2 doesn't need to be clobbered because
glibc doesn't currently clobber it.  So, using it in the LWS code
would cause an ABI break.  That's one register back to userspace.

I want to keep r19 and r27 for userspace so the PIC register doesn't
have to be saved and restored in the asm (linux-atomic.c is compiled
as PIC code).  You can have r29.

That leaves three free registers for the LWS code: r22, r23 and r29.
The LWS ABI has r1, r20-r26 and r28-r31.  Userspace has two call-clobbered
registers free across the asm in PIC code, and three in non-PIC code.
That's enough to efficiently perform the error comparisons.

The asm would be more efficient if the registers used for lws_mem,
lws_old and lws_new were not written to.  The kernel writes to them
only when clipping the arguments for a call from the 32-bit runtime
on a 64-bit kernel.  As it stands, the lws_mem, lws_old and lws_new
arguments get reloaded every time around the EAGAIN loop.  This is
the crucial code in the compare and swap:

        /* The load and store could fail */
1:      ldw     0(%sr3,%r26), %r28
        sub,<>  %r28, %r25, %r0
2:      stw     %r24, 0(%sr3,%r26)
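
Read as C, those three instructions amount to roughly the following.
This is only a model of the semantics, not the kernel's code: the name
cas32_model is made up for illustration, and it assumes the surrounding
LWS code has already serialized access to the word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the three-instruction sequence; not kernel code.
 * mem ~ r26 (lws_mem), old ~ r25 (lws_old), new_val ~ r24 (lws_new),
 * return value ~ r28.  Atomicity is assumed to come from the caller. */
static uint32_t cas32_model(uint32_t *mem, uint32_t old, uint32_t new_val)
{
    assert(mem != NULL);
    uint32_t seen = *mem;   /* 1: ldw 0(%sr3,%r26), %r28 */
    if (seen == old)        /* sub,<> nullifies the stw when r28 != r25 */
        *mem = new_val;     /* 2: stw %r24, 0(%sr3,%r26) */
    return seen;            /* caller compares this against old */
}
```

The store only happens when the loaded value matches lws_old, and the
loaded value is what goes back to the caller either way.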

The sub,<> instruction uses a 32-bit compare/subtract condition, so
clipping r25 isn't necessary.  Similarly, the stw instruction
ignores the most significant 32 bits of r24.  The value in r26 does
need clipping, but you have three free registers, and it looks like r1
is also free at this point in the code.  You can deposit the least
significant 32 bits of r26 into a field of zeros in another register
in one instruction.
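
To spell out why only the address needs clipping, here is a small C
model of the three cases.  The helper names are invented for this
sketch, and the 64-bit values stand in for what a 32-bit caller might
leave in the upper halves of the registers on a 64-bit kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of "depd,z %r26,63,32,%r1": keep the low 32 bits of the source
 * and zero everything above them. */
static uint64_t clip_address(uint64_t r26)
{
    return r26 & 0xffffffffULL;
}

/* Model of a 32-bit compare condition: only the low words take part,
 * so junk in the upper half of r25 does not matter. */
static int low_words_equal(uint64_t r28, uint64_t r25)
{
    return (uint32_t)r28 == (uint32_t)r25;
}

/* Model of stw: only the low 32 bits of the source reach memory,
 * so junk in the upper half of r24 does not matter either. */
static void store_word(uint32_t *mem, uint64_t r24)
{
    assert(mem != NULL);
    *mem = (uint32_t)r24;
}
```

Only clip_address changes the value that is actually used; the other
two operations already discard the upper half on their own.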

It looks like lws_compare_and_swap64 and lws_compare_and_swap32 become
more or less functionally identical.  The above would become something
like:

#ifdef CONFIG_64BIT
        depd,z  %r26,63,32,%r1
1:      ldw     0(%sr3,%r1), %r28
        sub,<>  %r28, %r25, %r0
2:      stw     %r24, 0(%sr3,%r1)
#else
1:      ldw     0(%sr3,%r26), %r28
        sub,<>  %r28, %r25, %r0
2:      stw     %r24, 0(%sr3,%r26)
#endif

The argument clipping in the current code would be removed.  As a result,
the branch to lws_compare_and_swap can be eliminated in the 64-bit path.

It's my impression that the tightness of the loop for the compare/exchange
operation is important.
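
For reference, the shape of the loop I have in mind looks roughly like
this in C.  It is a sketch under assumptions: lws_cas_model is a made-up
stand-in for the LWS call that reports -EAGAIN a fixed number of times
before doing the swap, and cas_retry is not the code in linux-atomic.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel side: busy_count is how many
 * times the call reports -EAGAIN before it performs the swap. */
struct lws_model {
    int busy_count;
};

static long lws_cas_model(struct lws_model *k, uint32_t *mem,
                          uint32_t old, uint32_t new_val, uint32_t *seen)
{
    if (k->busy_count > 0) {
        k->busy_count--;
        return -EAGAIN;         /* lock was busy, caller must retry */
    }
    *seen = *mem;
    if (*seen == old)
        *mem = new_val;
    return 0;
}

/* Retry loop: everything inside the do/while runs again on each
 * -EAGAIN, so whatever has to be set up per call is paid per retry. */
static uint32_t cas_retry(struct lws_model *k, uint32_t *mem,
                          uint32_t old, uint32_t new_val, int *attempts)
{
    uint32_t seen = 0;
    long err;

    assert(k != NULL && mem != NULL && attempts != NULL);
    *attempts = 0;
    do {
        (*attempts)++;
        err = lws_cas_model(k, mem, old, new_val, &seen);
    } while (err == -EAGAIN);
    return seen;
}
```

If the argument registers survive the call, nothing but the call itself
needs to sit inside that loop.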

Dave
-- 
J. David Anglin                                  dave.anglin@xxxxxxxxxxxxxx
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)