Re: SYSENTER and libraries

Rene Herman <rene.herman@xxxxxxxxx> · Thu, 25 Jan 2007 03:15:06 +0100

On 01/25/2007 12:18 AM, Gaurav Dhiman wrote:

I am also studying the OpenSolaris code these days and just wanted to 
share something about it ....  Solaris libraries use a better mechanism 
for invoking system calls, one which makes the library independent of 
the kernel.

Really, the way you describe is not only not better; it's significantly 
worse.

The library has an invalid instruction for invoking the system call. So
when the first system call is made on the system, an invalid opcode
exception is generated, and in its handler the Solaris kernel checks
whether it is the specific invalid instruction used in the lib. If so,
it simply replaces that invalid instruction with the valid instruction
for making a system call (either SYSENTER, if the processor supports
it, else an INT instruction).

This means the Solaris kernel has to patch up every instance of this 
kernel-trap in the userland code. It's only once per specific address 
but as many times per process as that process has the trap instruction 
embedded. Especially since each write to userland will be invalidating 
icache, this could become quite expensive.
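To make the "once per address, but once for every embedded trap" cost concrete, here is a toy model in C. It is not Solaris code: the opcode values, struct process_image, handle_invalid_opcode() and run_once() are all made up for illustration, and the "instruction stream" is just an array of bytes.

```c
#include <stddef.h>

/* Hypothetical one-byte "opcodes" for the toy instruction stream. */
enum { OP_OTHER, OP_TRAP_PLACEHOLDER, OP_SYSENTER, OP_INT };

struct process_image {
    unsigned char *text;    /* toy userland text, one opcode per byte */
    size_t len;
    unsigned long faults;   /* invalid-opcode exceptions taken        */
    unsigned long patches;  /* writes into userland text              */
};

/* Toy stand-in for the invalid-opcode handler: patch only the site
 * that faulted, choosing the entry instruction the CPU supports. */
static void handle_invalid_opcode(struct process_image *p, size_t pc,
                                  int cpu_has_sysenter)
{
    p->faults++;
    if (p->text[pc] == OP_TRAP_PLACEHOLDER) {
        p->text[pc] = cpu_has_sysenter ? OP_SYSENTER : OP_INT;
        p->patches++;
    }
}

/* "Execute" the whole text once. Every still-unpatched placeholder
 * faults; sites patched on an earlier pass run without a fault. */
void run_once(struct process_image *p, int cpu_has_sysenter)
{
    for (size_t pc = 0; pc < p->len; pc++) {
        if (p->text[pc] == OP_TRAP_PLACEHOLDER)
            handle_invalid_opcode(p, pc, cpu_has_sysenter);
    }
}
```

With three placeholders in the text, the first run_once() takes three faults and does three writes to userland text; a second pass takes none. A real process pays that for every call site it actually reaches.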

I think on OpenSolaris INT 0x91 is used rather than INT 0x80 as done 
on Linux .....

In this way the library is more independent of the kernel and need
not be changed if we change the mechanism for invoking the system
call. We give full control to the kernel to decide how to invoke the
system call, and the kernel decides it in a better way .....

Linux on the other hand has the vsyscall code: userland code generated 
by the kernel and mapped into the process' address space. The process 
just calls into that kernel-generated code to make a syscall. So here 
it's _also_ the kernel that decides how to call into it and can change 
that around completely without any impact on userland.
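If you want to see where the kernel put that code in your own process, the auxiliary vector tells you. A small sketch, assuming a glibc new enough to have getauxval() (2.16 or later); the function names vdso_base(), vsyscall_entry() and print_kernel_entry() are mine. AT_SYSINFO_EHDR is the base of the kernel-mapped image; AT_SYSINFO is, as far as I know, only filled in on 32-bit x86, so expect 0 for it elsewhere.

```c
#include <elf.h>       /* AT_SYSINFO, AT_SYSINFO_EHDR */
#include <stdio.h>
#include <sys/auxv.h>  /* getauxval(), assumes glibc >= 2.16 */

/* Base address of the kernel-provided image mapped into this process,
 * or 0 if the kernel did not pass one. */
unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Address of the kernel's syscall entry stub. Only expected to be
 * non-zero on 32-bit x86; getauxval() returns 0 when it is absent. */
unsigned long vsyscall_entry(void)
{
    return getauxval(AT_SYSINFO);
}

/* Print both values for the current process. */
void print_kernel_entry(void)
{
    printf("kernel-mapped image at %#lx\n", vdso_base());
    printf("syscall entry stub at  %#lx\n", vsyscall_entry());
}
```

Call print_kernel_entry() from main() to see the addresses; they may differ from run to run when address space randomization is on.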

This in fact is the point of why that code was introduced -- the 
Pentium 4 was really slow on int 0x80 and people wanted it to use sysenter. 
Not all CPUs on which Linux runs support sysenter (and someone might've 
wanted to use SYSCALL on AMD instead) and it's obviously not a good idea 
to have each userland program figure out the best way to call into the 
kernel itself, so this vsyscall method was thought up.

Userland just calls into it, blissfully unaware of the actual method the 
kernel has set up there to enter itself. And blissfully unaware that the 
method it will be using when running under kernel version N+1 tomorrow 
might be completely different.
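The idea can be modelled in plain C as one indirect call through a pointer that only the "kernel" side ever sets. This is a toy, not the real vsyscall page: the stubs do not enter the kernel at all, every name here (entry_stub, kernel_setup_entry(), user_syscall(), last_entry_method()) is made up, and the preference order sysenter, then syscall, then int 0x80 is just an assumption for the example.

```c
enum entry_method { ENTRY_INT80, ENTRY_SYSENTER, ENTRY_SYSCALL };

/* Records which stub ran last, so the effect is observable. */
static enum entry_method last_method;

/* Stand-ins for the real entry sequences; they only record the
 * method and hand the syscall number back. */
static long enter_int80(long nr)    { last_method = ENTRY_INT80;    return nr; }
static long enter_sysenter(long nr) { last_method = ENTRY_SYSENTER; return nr; }
static long enter_syscall(long nr)  { last_method = ENTRY_SYSCALL;  return nr; }

/* The single entry point, standing in for the kernel-mapped code. */
static long (*entry_stub)(long nr) = enter_int80;

/* "Kernel" side: decide once which method the CPU should use. */
void kernel_setup_entry(int has_sysenter, int has_syscall)
{
    if (has_sysenter)
        entry_stub = enter_sysenter;
    else if (has_syscall)
        entry_stub = enter_syscall;
    else
        entry_stub = enter_int80;
}

/* "Userland" side: always the same call, whatever sits behind it. */
long user_syscall(long nr)
{
    return entry_stub(nr);
}

enum entry_method last_entry_method(void)
{
    return last_method;
}
```

user_syscall() never changes. Calling kernel_setup_entry() with different flags swaps the method underneath it, which is all "kernel version N+1" would have to do.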

See, no hackish patching (all over the place or even just once, as 
Solaris could do if it uses one common kernel entrypoint), one common 
kernel entrypoint and all the advantages you speak of with the kernel 
deciding the best method (including the best method to restart system 
calls when they were interrupted by signals).

Rene.

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/