On 10/29/2014 03:05 PM, Waiman Long wrote:
> On 10/27/2014 05:22 PM, Waiman Long wrote:
>> On 10/27/2014 02:04 PM, Peter Zijlstra wrote:
>>> On Mon, Oct 27, 2014 at 01:38:20PM -0400, Waiman Long wrote:
>>>> On 10/24/2014 04:54 AM, Peter Zijlstra wrote:
>>>>> On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long wrote:
>>>>>> Since enabling paravirt spinlock will disable unlock function
>>>>>> inlining, a jump label can be added to the unlock function
>>>>>> without adding patchsites all over the kernel.
>>>>>
>>>>> But you don't have to. My patches allowed for the inline to remain,
>>>>> again reducing the overhead of enabling PV spinlocks while running
>>>>> on a real machine.
>>>>>
>>>>> Look at:
>>>>>
>>>>>   http://lkml.kernel.org/r/20140615130154.213923590@xxxxxxxxx
>>>>>
>>>>> In particular this hunk:
>>>>>
>>>>> Index: linux-2.6/arch/x86/kernel/paravirt_patch_64.c
>>>>> ===================================================================
>>>>> --- linux-2.6.orig/arch/x86/kernel/paravirt_patch_64.c
>>>>> +++ linux-2.6/arch/x86/kernel/paravirt_patch_64.c
>>>>> @@ -22,6 +22,10 @@ DEF_NATIVE(pv_cpu_ops, swapgs, "swapgs")
>>>>>  DEF_NATIVE(, mov32, "mov %edi, %eax");
>>>>>  DEF_NATIVE(, mov64, "mov %rdi, %rax");
>>>>>
>>>>> +#if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUE_SPINLOCK)
>>>>> +DEF_NATIVE(pv_lock_ops, queue_unlock, "movb $0, (%rdi)");
>>>>> +#endif
>>>>> +
>>>>>  unsigned paravirt_patch_ident_32(void *insnbuf, unsigned len)
>>>>>  {
>>>>>  	return paravirt_patch_insns(insnbuf, len,
>>>>> @@ -61,6 +65,9 @@ unsigned native_patch(u8 type, u16 clobb
>>>>>  		PATCH_SITE(pv_cpu_ops, clts);
>>>>>  		PATCH_SITE(pv_mmu_ops, flush_tlb_single);
>>>>>  		PATCH_SITE(pv_cpu_ops, wbinvd);
>>>>> +#if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUE_SPINLOCK)
>>>>> +		PATCH_SITE(pv_lock_ops, queue_unlock);
>>>>> +#endif
>>>>>
>>>>>  	patch_site:
>>>>>  		ret = paravirt_patch_insns(ibuf, len, start, end);
>>>>>
>>>>> That makes sure to overwrite the callee-saved call to the
>>>>> pv_lock_ops::queue_unlock with the immediate asm "movb $0, (%rdi)".
>>>>>
>>>>> Therefore you can retain the inlined unlock with hardly (there might
>>>>> be some NOP padding) any overhead at all. On PV it reverts to a
>>>>> callee saved function call.
>>>>
>>>> My concern is that spin_unlock() can be called in many places,
>>>> including loadable kernel modules. Is the paravirt_patch_ident_32()
>>>> function able to patch all of them in reasonable time? How about a
>>>> kernel module loaded later at run time?
>>>
>>> modules should be fine, see arch/x86/kernel/module.c:module_finalize()
>>> -> apply_paravirt().
>>>
>>> Also note that the 'default' text is an indirect call into the paravirt
>>> ops table which routes to the 'right' function, so even if the text
>>> patching would be 'late' calls would 'work' as expected, just slower.
>>
>> Thanks for letting me know about that. I have this concern because your
>> patch didn't change the current configuration of disabling unlock
>> inlining when paravirt_spinlock is enabled. With that, I think it is
>> worthwhile to reduce the performance delta between the PV and non-PV
>> kernel on bare metal.
>
> I am sorry that the unlock call sites patching code doesn't work in a
> virtual guest. Your pvqspinlock patch did an unconditional patching even
> in a virtual guest. I added a check for paravirt_spinlocks_enabled, but
> it turned out that some spin_unlock() calls seemed to be made before
> paravirt_spinlocks_enabled is set. As a result, some call sites were
> still patched, resulting in missed wakeups and a system hang.
>
> At this point, I am going to leave out that change from my patch set
> until we can figure out a better way of doing that.

Below is a partial kernel log from a KVM guest running with the unlock call site patching code:

[    0.438006] native_patch: patch out pv_queue_unlock!
[    0.438565] native_patch: patch out pv_queue_unlock!
[    0.439006] native_patch: patch out pv_queue_unlock!
[    0.439638] native_patch: patch out pv_queue_unlock!
[    0.440052] native_patch: patch out pv_queue_unlock!
[    0.441006] native_patch: patch out pv_queue_unlock!
[    0.441566] native_patch: patch out pv_queue_unlock!
[    0.442035] ftrace: allocating 24168 entries in 95 pages
[    0.451208] Switched APIC routing to physical flat.
[    0.453202] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.454002] smpboot: CPU0: Intel QEMU Virtual CPU version 1.5.3 (fam: 06, model: 06, stepping: 03)
[    0.456000] Performance Events: Broken PMU hardware detected, using software events only.
[    0.456003] Failed to access perfctr msr (MSR c1 is 0)
[    0.457151] KVM setup paravirtual spinlock
[    0.460039] NMI watchdog: disabled (cpu0): hardware events not enabled

As the log shows, some unlock call sites were patched before the KVM setup code set the paravirt_spinlocks_enabled flag.
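
To make the ordering problem concrete, here is a toy user-space model of it. This is only a sketch, not kernel code: pv_spinlocks_enabled stands in for paravirt_spinlocks_enabled, and patch_site(), count_mispatched() and the enum are hypothetical names made up for illustration. Each call site is patched according to the value of the flag at the moment it is patched, so any site handled before the hypervisor setup code runs ends up with the native unlock.

```c
/* Toy user-space model of the patch-ordering problem; NOT kernel code.
 * Every name below is a hypothetical stand-in used for illustration. */
#include <stdbool.h>
#include <stddef.h>

enum unlock_kind { UNLOCK_PV_CALL, UNLOCK_NATIVE };

/* Stand-in for the paravirt_spinlocks_enabled flag. */
static bool pv_spinlocks_enabled;

/* Decide how one call site is patched, using the flag's value right now. */
static enum unlock_kind patch_site(void)
{
	return pv_spinlocks_enabled ? UNLOCK_PV_CALL : UNLOCK_NATIVE;
}

/* Patch the first `early` sites, then run the (modelled) hypervisor setup
 * that sets the flag, then patch the remaining sites.  Returns the number
 * of sites left with the native unlock although PV spinlocks are in use. */
static size_t count_mispatched(enum unlock_kind *sites, size_t n, size_t early)
{
	size_t i, bad = 0;

	pv_spinlocks_enabled = false;
	for (i = 0; i < n && i < early; i++)
		sites[i] = patch_site();

	pv_spinlocks_enabled = true;	/* models "KVM setup paravirtual spinlock" */
	for (; i < n; i++)
		sites[i] = patch_site();

	for (i = 0; i < n; i++)
		if (sites[i] == UNLOCK_NATIVE)
			bad++;
	return bad;
}
```

In this model, calling count_mispatched() on 10 sites with 7 of them patched early leaves 7 sites on the native unlock, while patching nothing early leaves none. The real fix presumably needs either the flag to be set before any call site is patched or the early sites to be revisited, but that is a guess on my part.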

-Longman