Re: [RFC PATCH] getcpu_cache system call: caching current CPU number (x86)

On 18.07.2015 02:33, Andy Lutomirski wrote:
On Fri, Jul 17, 2015 at 4:28 PM, Ondřej Bílka <neleai@xxxxxxxxx> wrote:
On Fri, Jul 17, 2015 at 11:48:14AM -0700, Linus Torvalds wrote:

On x86, if you want per-cpu memory areas, you should basically plan on
using segment registers instead (although other odd state has been
used - there have been people who use segment limits etc. rather than
the *pointer* itself, preferring to use "lsl" to get percpu data. You
could also imagine hiding things in the vector state somewhere if you
control your environment well enough).

That's correct; the problem is that you need some sort of hack like this
on archs that would otherwise need a syscall to get the tid or access a
TLS variable.

On x86-64 and other archs that have a register for TLS this could be
implemented relatively easily.

The kernel needs to allocate

int running_cpu_for_tid[32768];

On context switch it atomically writes to this table

running_cpu_for_tid[tid] = cpu;

Userspace can access this table read-only, as an mmapped file.
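
The writer and reader sides of such a table could look roughly like the sketch below. It is a userspace analogue using C11 atomics, not kernel code: the names on_context_switch and read_running_cpu are hypothetical, and a real kernel would use its own primitives and export the table through a mapping rather than a plain array.

```c
#include <assert.h>
#include <stdatomic.h>

#define TID_TABLE_SIZE 32768   /* table size taken from the proposal */

/* Stand-in for the kernel-owned table; an ordinary array in this sketch. */
_Atomic int running_cpu_for_tid[TID_TABLE_SIZE];

/* Hypothetical hook: what the scheduler would do when `tid` starts
 * running on `cpu`. A single relaxed store means a reader never
 * observes a torn value. */
void on_context_switch(int tid, int cpu)
{
    assert(tid >= 0 && tid < TID_TABLE_SIZE);
    atomic_store_explicit(&running_cpu_for_tid[tid], cpu,
                          memory_order_relaxed);
}

/* Reader side: one relaxed load of the entry for `tid`. */
int read_running_cpu(int tid)
{
    assert(tid >= 0 && tid < TID_TABLE_SIZE);
    return atomic_load_explicit(&running_cpu_for_tid[tid],
                                memory_order_relaxed);
}
```

As with any cached CPU number, the value read can already be stale by the time it is used, since the thread may migrate right after the load.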

Then userspace just needs to access it with three indirections like:

__thread int tid;

char caches[CPU_MAX];
#define getcpu_cache caches[tid >= 32768 ? get_cpu() : running_cpu_for_tid[tid]]
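
Spelled out as a function, the lookup might look like the following. This is only a sketch under assumptions: CPU_MAX is given an arbitrary value, slow_get_cpu is a hypothetical stub standing in for the syscall fallback, and the table is a plain array instead of a read-only mapping.

```c
#include <assert.h>

#define TID_TABLE_SIZE 32768   /* table size taken from the proposal */
#define CPU_MAX 64             /* arbitrary limit, for illustration only */

/* Stand-in for the kernel-maintained, read-only mapped table. */
int running_cpu_for_tid[TID_TABLE_SIZE];

/* Per-CPU slots owned by userspace. */
char caches[CPU_MAX];

/* Hypothetical slow path (a getcpu-style syscall); stubbed to CPU 0. */
int slow_get_cpu(void)
{
    return 0;
}

/* Return the per-CPU slot for thread `tid`. Tids that do not fit in
 * the table take the slow path instead of indexing out of bounds. */
char *percpu_slot(int tid)
{
    int cpu;

    if (tid < 0 || tid >= TID_TABLE_SIZE)
        cpu = slow_get_cpu();
    else
        cpu = running_cpu_for_tid[tid];

    assert(cpu >= 0 && cpu < CPU_MAX);
    return &caches[cpu];
}
```

The three indirections are visible here: the thread-local tid, the table entry, and the caches slot.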

With a more complicated kernel interface you could eliminate one
indirection: the table would be an array of void * instead, and a thread
could make a syscall to register the value to be used for each thread.

Or we implement per-cpu segment registers so you can point gs directly
at percpu data.  This is conceptually easy and has no weird ABI
issues.  All it needs is an implementation and some good tests.

I think the API should be "set gsbase to x + y*(cpu number)".  On
x86_64, userspace just allocates a big swath of virtual space and
populates it as needed.
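
The address arithmetic behind "x + y*(cpu number)" is small enough to sketch. The function name percpu_gsbase is hypothetical, and nothing below touches the real gs base; it only computes the value the kernel would load for a given CPU under that rule.

```c
#include <assert.h>
#include <stdint.h>

/* Address the kernel would load into the gs base while the thread runs
 * on `cpu`, given a userspace-chosen base `x` and stride `y`. */
uintptr_t percpu_gsbase(uintptr_t x, uintptr_t y, unsigned cpu)
{
    /* Reject parameters for which x + y*cpu would wrap around. */
    assert(cpu == 0 || y <= (UINTPTR_MAX - x) / cpu);
    return x + y * (uintptr_t)cpu;
}
```

For example, with x = 0x1000 and y = 0x100, CPU 3 maps to 0x1300. Userspace would pick x as the start of its reserved virtual range and y as the size of one per-CPU area.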

I proposed exactly that design last year:
https://lwn.net/Articles/611946/

--Andy


