Running posix cpu timers in hard interrupt context has a few downsides:

 - For PREEMPT_RT it cannot work, as the expiry code needs to take
   sighand lock, which is a 'sleeping spinlock' in RT.

 - For fine grained accounting it's just wrong to run this in the
   context of the timer interrupt, because that way process specific
   cpu time is accounted to the timer interrupt.

There is no hard requirement to run the expiry code in hard interrupt
context. The posix cpu timers are an approximation anyway, so having
them expired and evaluated in task work context does not really make
them worse.

That unearthed the fact that KVM does not handle pending task work
before entering a VM, which delays that task work until the vCPU thread
goes all the way back to user space (qemu).

The series implements the necessary handling for x86/KVM and switches
the posix cpu timer expiry into task work for x86. The posix timer
modification is conditional on a selectable config switch, as it
requires that task work is handled in KVM.

The available tests pass and no problematic difference has been
observed.

Thanks,

	tglx

8<--------------------
 arch/x86/kvm/x86.c             |    8 ++++-
 arch/x86/Kconfig               |    1
 include/linux/sched.h          |    3 ++
 include/linux/tracehook.h      |   15 ++++++++++
 kernel/task_work.c             |   19 ++++++++++++
 kernel/time/Kconfig            |    5 +++
 kernel/time/posix-cpu-timers.c |   61 ++++++++++++++++++++++++++++++-----------
 7 files changed, 95 insertions(+), 17 deletions(-)
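
For readers unfamiliar with the pattern, the deferral described above can
be modelled in plain user-space C. This is only an illustrative sketch,
not code from the series: struct task, struct work, timer_tick(),
run_pending_work() and enter_guest() are hypothetical names, and the
real kernel code additionally has to deal with locking and interrupt
safety, which are left out here. The "interrupt" side merely queues a
per-task callback, and the callback runs later in task context, either
on the way back to user space or, as the KVM part requires, before
entering the guest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct task;
typedef void (*work_fn)(struct task *t);

struct work {
    work_fn fn;          /* callback to run in task context */
    struct work *next;   /* next entry in the pending list */
    bool queued;         /* guards against double queueing */
};

struct task {
    struct work *pending;    /* deferred callbacks, newest first */
    struct work timer_work;  /* embedded, so queueing never allocates */
    int expiries_handled;    /* how often the expiry callback ran */
    int guest_entries;       /* how often the "guest" was entered */
};

/* Stands in for the expensive expiry handling (signals, rearming). */
static void handle_expiry(struct task *t)
{
    t->expiries_handled++;
}

void task_init(struct task *t)
{
    *t = (struct task){0};
    t->timer_work.fn = handle_expiry;
}

/* "Hard interrupt" side: only queue the work, do not run it. */
void timer_tick(struct task *t)
{
    struct work *w = &t->timer_work;

    if (w->queued)
        return;          /* already pending, nothing to do */
    w->queued = true;
    w->next = t->pending;
    t->pending = w;
}

/* Task context: run everything that was deferred. */
void run_pending_work(struct task *t)
{
    while (t->pending) {
        struct work *w = t->pending;

        t->pending = w->next;
        w->next = NULL;
        w->queued = false;
        w->fn(t);
    }
}

/* Model of one vCPU loop iteration: drain work before guest entry. */
void enter_guest(struct task *t)
{
    run_pending_work(t);
    assert(t->pending == NULL);
    t->guest_entries++;
}
```

In this model, two calls to timer_tick() followed by one enter_guest()
run the expiry callback exactly once, and only from enter_guest().
Without the drain in enter_guest(), the callback would stay queued for
as long as the thread keeps looping in the guest, which is the delay
the cover letter describes.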