On Thu, Apr 25, 2019 at 6:14 PM Dmitry Safonov <dima@xxxxxxxxxx> wrote: > > As it has been discussed on timens RFC, adding a new conditional branch > `if (inside_time_ns)` on VDSO for all processes is undesirable. > It will add a penalty for everybody as branch predictor may mispredict > the jump. Also there are instruction cache lines wasted on cmp/jmp. > > Those effects of introducing time namespace are very much unwanted > having in mind how much work have been spent on micro-optimisation > vdso code. > > The propose is to allocate a second vdso code with dynamically > patched out (disabled by static_branch) timens code on boot time. > > Allocate another vdso and copy original code. [...] > diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c > index 80cbb2167eba..6aae9c0d400d 100644 > --- a/arch/x86/entry/vdso/vma.c > +++ b/arch/x86/entry/vdso/vma.c [...] > static vm_fault_t vdso_fault(const struct vm_special_mapping *sm, > struct vm_area_struct *vma, struct vm_fault *vmf) > { > const struct vdso_image *image = vma->vm_mm->context.vdso_image; > + unsigned long offset = vmf->pgoff << PAGE_SHIFT; > > if (!image || (vmf->pgoff << PAGE_SHIFT) >= image->size) > return VM_FAULT_SIGBUS; > > - vmf->page = virt_to_page(image->text + (vmf->pgoff << PAGE_SHIFT)); > + if (current_timens_offsets() && image->text_timens) I'm pretty sure that accessing `current` in here is wrong. AFAIK this fault handler can be invoked on remote processes, through interfaces like /proc/$pid/mem and process_vm_readv(); in that case, the kernel should install a page based on the time namespace of the target process, not based on the time namespace of the caller. > + vmf->page = vmalloc_to_page(image->text_timens + offset); > + else > + vmf->page = virt_to_page(image->text + offset); > + > get_page(vmf->page); > return 0; > } [...]