On February 27, 2016 6:15:01 AM PST, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote: >----- On Feb 27, 2016, at 1:24 AM, H. Peter Anvin hpa@xxxxxxxxx wrote: > >> On 02/26/16 16:40, Mathieu Desnoyers wrote: >>>> >>>> I think it would be a good idea to make this a general pointer for >the kernel to >>>> be able to write per thread state to user space, which obviously >can't be done >>>> with the vDSO. >>>> >>>> This means the libc per thread startup should query the kernel for >the size of >>>> this structure and allocate thread local data accordingly. We can >then grow >>>> this structure if needed without making the ABI even more complex. >>>> >>>> This is more than a system call: this is an entirely new way for >userspace to >>>> interact with the kernel. Therefore we should make it a general >facility. >>> >>> I'm really glad to see I'm not the only one seeing potential for >>> genericity here. :-) This is exactly what I had in mind >>> last year when proposing the thread_local_abi() system call: >>> a generic way to register an extensible per-thread data structure >>> so the kernel can communicate with user-space and vice-versa. >>> >>> Rather than having the libc query the kernel for size of the >structure, >>> I would recommend that libc tells the kernel the size of the >thread-local >>> ABI structure it supports. The idea here is that both the kernel and >libc >>> need to know about the fields in that structure to allow a two-way >>> interaction. Fields known only by either the kernel or userspace >>> are useless for a given thread anyway. This way, libc could >statically >>> define the structure. >> >> Big fat NOPE there. Why? Because it means that EVERY interaction >with >> this memory, no matter how critical, needs to be conditionalized. >> Furthermore, userspace != libc. Applications or higher-layer >libraries >> might have more information than the running libc about additional >> fields, but with your proposal libc would gate them. > >Good point! > >> >> As far as the kernel providing the size in the structure (alone) -- I >> *really* hope you can see what is wrong with that!! That doesn't >mean >> we can't provide it in the structure as well, and that too might >avoid >> the skipped libc problem. > >Indeed, libc would need to query the size before it can allocate >the structure. > >> >>> I would be tempted to also add "features" flags, so both user-space >>> and the kernel could tell each other what they support: user-space >>> would announce the set of features it supports, and it could also >>> query the kernel for the set of supported features. One simple >approach >>> would be to use a uint64_t as type for those feature flags, and >>> reserve the last bit for extending to future flags if we ever have >>> more than 64. >>> >>> Thoughts ? >> >> It doesn't seem like it would hurt, although the size of the flags >field >> could end up being an issue. > >I'm concerned that this thread-local ABI structure may become messy. >Let's just imagine how we would first introduce a "cpu_id" field >(int32_t), >and eventually add a "seqnum" field for rseq in the future (unsigned >long). > >Both fields need to be read with single-copy semantics as volatile >reads, and both need to be naturally aligned. However, I'm tempted >to use the "packed" attribute on the structure since it's an ABI >between kernel and user-space. A pretty bad example of what this >could become, due to alignment constraints, looks like: > >/* This structure needs to be aligned on pointer size. */ >struct thread_local_abi { > int32_t cpu_id; > int32_t __unused1; > unsigned long seqnum; > /* Add new fields at the end. */ >} __attribute__((packed)); > >And this is just a start. It may become messier as we append >new fields in the future. > >The main argument I currently see in favor of having this >meta system call for all per-thread features is to only >maintain a single pointer in the kernel task_struct rather >than one per thread-local feature. > >If the goal is really to keep the burden on the task struct >small, we could use kmalloc()/kfree() to allocate and free an >array of pointers to the various per-thread features, rather >than putting them directly in task_struct. We could keep a >mask of the enabled features in the task struct too (which >we will likely have to do even if we go the the thread-local >ABI meta system call). > >Having this per-task allocated pointer array at kernel-level >would allow us to have one system call per feature, with clear >semantics, without evolving a messy thread-local ABI structure >due to all sorts of alignment constraints. > >Thoughts ? > >Thanks, > >Mathieu I think you are worried about problems which we have already solved many, many times - structures are very common in the user space ABI and we know how to deal with this. And when you say: > However, I'm tempted > to use the "packed" attribute on the structure > since it's an ABI > between kernel and user-space. and mention "unsigned long" in a user space ABI all I can think of that you really have not followed the issues of user space ABI design as they have evolved over the last 20 years. Simply put: non-problem. -- Sent from my Android device with K-9 Mail. Please excuse brevity and formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-api" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html