----- On Jan 11, 2016, at 11:27 PM, Ben Maurer bmaurer@xxxxxx wrote:

> One disadvantage of only allowing one is that high performance server
> applications tend to statically link. It'd suck to have to go through whatever
> type of relocation we'd need to pull this out of glibc. But if there's only one
> registration allowed, a statically linked app couldn't create its own if glibc
> might use it some day.

One idea I have would be to let the kernel reserve some space either after
the first stack address (for a stack growing down) or at the beginning of
the allocated TLS area for each thread in copy_thread_tls(), by fiddling
with sp or the TLS base address when creating a thread. In theory, this
would allow always returning the same address, and the memory would exist
as long as the thread exists. Not sure whether it may have unforeseen
impact though.

Thoughts ?

Thanks,

Mathieu

>
> Sent from my iPhone
>
>> On Jan 11, 2016, at 6:46 PM, Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
>>
>>> On Tue, Jan 12, 2016 at 12:49:18AM +0000, Mathieu Desnoyers wrote:
>>> ----- On Jan 11, 2016, at 6:03 PM, Josh Triplett josh@xxxxxxxxxxxxxxxx wrote:
>>>
>>>>> On Mon, Jan 11, 2016 at 10:38:28PM +0000, Seymour, Shane M wrote:
>>>>> I have some concerns and suggestions for you about this.
>>>>>
>>>>> What's to stop someone in user space from requesting an arbitrarily
>>>>> large number of CPU # cache locations that the kernel needs to allocate
>>>>> memory to track, and each time the task migrates to a new CPU it needs
>>>>> to update them all? Could you use it to dramatically slow down a
>>>>> system/task switching? Should there be a ulimit type value or a sysctl
>>>>> setting to limit the number that you're allowed to register per-task?
>>>>
>>>> The documented behavior of the syscall allows only one location per
>>>> thread, so the kernel can track that one and only address rather easily
>>>> in the task_struct. Allowing dynamic allocation definitely doesn't seem
>>>> like a good idea.
>>>
>>> The current implementation now allows more than one location per
>>> thread. Which piece of documentation states that only one location
>>> per thread is allowed ? This was indeed the case for the prior
>>> implementations, but I moved to implementing a linked list of
>>> cpu_cache areas per thread to allow the getcpu_cache system call to
>>> be used by more than a single shared object within a given program.
>>
>> Ah, I missed that change.
>>
>>> Without the linked list, as soon as more than one shared object tries
>>> to register its cache, the first one will prohibit all others from
>>> doing so.
>>>
>>> We could perhaps try to document that this system call should only
>>> ever be used by *libc, and all libraries and applications should
>>> then use the libc TLS cache variable, but it seems rather fragile,
>>> and any app/lib could try to register its own cache.
>>
>> That does seem a bit fragile, true; on the other hand, the linked-list
>> approach would allow userspace to allocate an unbounded amount of kernel
>> memory, without any particular control on it. That doesn't seem
>> reasonable. Introducing an rlimit or similar for this seems like
>> massive overkill, and hardcoding a fixed limit breaks the 0-1-infinity
>> rule.
>>
>> Given that any registered location will always provide the same value,
>> allowing only a single registration doesn't seem *too* problematic;
>> libc-based programs can use the libc implementation, and non-libc-based
>> programs can register a location themselves. And users of this API will
>> already likely want to use some TLS mechanism, which already interacts
>> heavily with libc (set_thread_area/clone).
>>
>> Allowing only one registration at a time seems preferable to introducing
>> another way to allocate kernel resources on a process's behalf.
>>
>> - Josh Triplett

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com