On Thu, Oct 12, 2017 at 4:03 PM, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> This new cpu_opv system call executes a vector of operations on behalf
> of user-space on a specific CPU with preemption disabled. It is inspired
> from readv() and writev() system calls which take a "struct iovec" array
> as argument.
>
> The operations available are: comparison, memcpy, add, or, and, xor,
> left shift, and right shift. The system call receives a CPU number from
> user-space as argument, which is the CPU on which those operations need
> to be performed. All preparation steps such as loading pointers, and
> applying offsets to arrays, need to be performed by user-space before
> invoking the system call. The "comparison" operation can be used to
> check that the data used in the preparation step did not change between
> preparation of system call inputs and operation execution within the
> preempt-off critical section.
>
> The reason why we require all pointer offsets to be calculated by
> user-space beforehand is because we need to use get_user_pages_fast() to
> first pin all pages touched by each operation. This takes care of
> faulting-in the pages. Then, preemption is disabled, and the operations
> are performed atomically with respect to other thread execution on that
> CPU, without generating any page fault.
>
> A maximum limit of 16 operations per cpu_opv syscall invocation is
> enforced, so user-space cannot generate a too long preempt-off critical
> section. Each operation is also limited a length of PAGE_SIZE bytes,
> meaning that an operation can touch a maximum of 4 pages (memcpy: 2
> pages for source, 2 pages for destination if addresses are not aligned
> on page boundaries).
>
> If the thread is not running on the requested CPU, a new
> push_task_to_cpu() is invoked to migrate the task to the requested CPU.
> If the requested CPU is not part of the cpus allowed mask of the thread,
> the system call fails with EINVAL.
> After the migration has been
> performed, preemption is disabled, and the current CPU number is checked
> again and compared to the requested CPU number. If it still differs, it
> means the scheduler migrated us away from that CPU. Return EAGAIN to
> user-space in that case, and let user-space retry (either requesting the
> same CPU number, or a different one, depending on the user-space
> algorithm constraints).

This series seems to get more complicated every time, and it's been so
long that I've mostly forgotten all the details. I would have sworn we
had a solution that got single-stepping right without any complicated
work like this in the kernel, and that had at most a minor performance
hit relative to the absolutely fastest solution. I'll try to dig it up.
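
For anyone skimming the thread, here is a rough user-space model of the
operation-vector semantics as I read them from the quoted description.
It is only a sketch, not the proposed kernel interface: the names
opv_op, opv_run, OPV_MAX_OPS and OPV_MAX_LEN are hypothetical, only
three of the listed operations are modelled, 4096 stands in for
PAGE_SIZE, and the "index + 1 of the failing comparison" return value
is my own convention. Page pinning, CPU migration and preempt-off
execution are not modelled at all.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPV_MAX_OPS 16     /* per-call limit quoted in the description */
#define OPV_MAX_LEN 4096   /* assumption: stands in for PAGE_SIZE */

/* Hypothetical subset of the operations listed in the description. */
enum opv_type { OPV_COMPARE, OPV_MEMCPY, OPV_ADD };

struct opv_op {
	enum opv_type type;
	void *dst;         /* destination, or first comparison operand */
	const void *src;   /* source, or second comparison operand */
	size_t len;        /* bytes touched; must be 8 for OPV_ADD */
	int64_t count;     /* addend, used by OPV_ADD only */
};

/*
 * Run the vector in order. Returns 0 on success, -EINVAL on a malformed
 * vector, or (index + 1) of the first comparison that did not match, in
 * which case no later operation has been executed.
 */
static int opv_run(const struct opv_op *ops, int nr)
{
	int i;

	if (ops == NULL || nr < 0 || nr > OPV_MAX_OPS)
		return -EINVAL;

	/* Validate everything up front, before any side effect. */
	for (i = 0; i < nr; i++) {
		if (ops[i].len > OPV_MAX_LEN || ops[i].dst == NULL)
			return -EINVAL;
		if (ops[i].type != OPV_ADD && ops[i].src == NULL)
			return -EINVAL;
		if (ops[i].type == OPV_ADD && ops[i].len != sizeof(int64_t))
			return -EINVAL;
	}

	/* In the proposal, this loop would be the preempt-off section. */
	for (i = 0; i < nr; i++) {
		int64_t v;

		switch (ops[i].type) {
		case OPV_COMPARE:
			if (memcmp(ops[i].dst, ops[i].src, ops[i].len) != 0)
				return i + 1;
			break;
		case OPV_MEMCPY:
			memmove(ops[i].dst, ops[i].src, ops[i].len);
			break;
		case OPV_ADD:
			memcpy(&v, ops[i].dst, sizeof(v));
			v = (int64_t)((uint64_t)v + (uint64_t)ops[i].count);
			memcpy(ops[i].dst, &v, sizeof(v));
			break;
		default:
			return -EINVAL;
		}
	}
	return 0;
}
```

A caller would typically put a comparison first, so that the later
operations only run if the value it sampled while preparing the vector
is still current, and would rebuild the vector and retry otherwise.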