On Mon, 2013-06-17 at 10:37 -0400, Konrad Rzeszutek Wilk wrote:
> On Sun, Jun 16, 2013 at 11:39:48PM +0100, Ben Hutchings wrote:
> > On Fri, 2013-06-14 at 10:11 -0400, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Jun 14, 2013 at 02:41:33AM +0100, Ben Hutchings wrote:
> > > > On Fri, 2013-06-14 at 02:30 +0100, Ben Hutchings wrote:
> > > > > On Tue, 2013-06-11 at 12:35 -0700, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> > > > > > This is a note to let you know that I've just added the patch titled
> > > > > >
> > > > > > xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU.
> > > > > >
> > > > > > to the 3.9-stable tree which can be found at:
> > > > > > http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> > > > > >
> > > > > > The filename of the patch is:
> > > > > > xen-smp-fixup-nohz-per-cpu-data-when-onlining-an-offline-cpu.patch
> > > > > > and it can be found in the queue-3.9 subdirectory.
> > > > > >
> > > > > > If you, or anyone else, feels it should not be added to the stable tree,
> > > > > > please let <stable@xxxxxxxxxxxxxxx> know about it.
> > > > > [...]
> > > > > > That was OK (albeit the API for play_dead assumes that the CPU
> > > > > > stays dead and never returns) but with commit 4b0c0f294
> > > > > > (tick: Cleanup NOHZ per cpu data on cpu down) that is no longer safe
> > > > > > as said commit resets the ts->inidle which at the start of the
> > > > > > cpu_idle loop was set.
> > > > > [...]
> > > > >
> > > > > This also needs to be applied to earlier branches, because commit
> > > > > 4b0c0f294 was applied to all of them:
> > > > >
> > > > > 2.6.32: d31e3b9 tick: Cleanup NOHZ per cpu data on cpu down
> > > > > 3.0: b9cbfd2 tick: Cleanup NOHZ per cpu data on cpu down
> > > > > 3.2: d9202d6 tick: Cleanup NOHZ per cpu data on cpu down
> > > > > 3.4: 33b7cfc tick: Cleanup NOHZ per cpu data on cpu down
> > > > > 3.9: c25c0eb tick: Cleanup NOHZ per cpu data on cpu down
> > > >
> > > > ...though, at least for 3.2, it's not at all obvious how to backport it
> > > > as tick_nohz_idle_enter() doesn't exist. Konrad, can you provide a
> > >
> > > Ooops.
> > > > backport of this, or should the backported commit 4b0c0f294 be reverted
> > > > in older stable branches?
> > >
> > > I would say just hold off applying this particular patch (the
> > > xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU) to the
> > > older trees and let 4b0c0f294 go forth. I can take a look at doing a
> > > backport but it might be more invasive.
> >
> > Just to be clear: you're recommending to leave stable branches as they
> > are for now, despite this regression under Xen?
>
> It is either fix the baremetal or break Xen. Or give me a week and I can try
> to fix both by cobbling up a patch.
>
> The functionality that this will break (from Xen standpoint) is the CPU hotplug
> support. Meaning if a user does 'xl vcpu-set <guest> vCPUs' it will trigger
> the splash WARN. It will still work, just a nasty WARN will pop up.

OK, that's not as bad as I thought.

> I am not thrilled about it, but I am buried with other bugs (and deadlines)
> and don't yet have the time to make a nice patch that would solve it.
>
> Is there a third option of delaying 4b0c0f294 a bit to give
> me some time?

As I said, it's already been applied. So it could be reverted in a later
stable release, but it sounds like by then you'll have a fix for Xen
anyway.

Ben.

-- 
Ben Hutchings
Humans are not rational beings; they are rationalising beings.
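
For readers following the thread, the failure mode quoted above (the cleanup on CPU down resetting ts->inidle, and the idle loop then carrying on when play_dead returns under Xen) can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: the struct and every *_model function are hypothetical stand-ins, the cleanup is assumed to zero the whole per-CPU structure, and the WARN is assumed to fire when the idle loop exits with inidle cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for the kernel's per-CPU tick state. */
struct tick_sched_model {
	int inidle;       /* set while the CPU is inside the idle loop */
	int other_state;  /* stands in for everything else the cleanup resets */
};

/* Models entering the idle loop: mark the CPU as idle. */
static void idle_enter_model(struct tick_sched_model *ts)
{
	assert(ts != NULL);
	ts->inidle = 1;
}

/* Models the cleanup on CPU down (assumption: state is zeroed wholesale). */
static void cpu_down_cleanup_model(struct tick_sched_model *ts)
{
	assert(ts != NULL);
	memset(ts, 0, sizeof(*ts));
}

/* Models leaving the idle loop; returns true if the WARN would fire. */
static bool idle_exit_model(struct tick_sched_model *ts)
{
	bool warn;

	assert(ts != NULL);
	warn = !ts->inidle;   /* assumption: exit expects inidle to be set */
	ts->inidle = 0;
	return warn;
}

/*
 * Models a CPU whose "play dead" path returns: it is taken offline,
 * comes back, and continues the same idle loop iteration.
 * reenter_idle selects whether the idle-enter fixup is applied.
 */
static bool offline_then_online_model(bool reenter_idle)
{
	struct tick_sched_model ts = { 0, 0 };

	idle_enter_model(&ts);          /* top of the idle loop */
	cpu_down_cleanup_model(&ts);    /* CPU taken offline while "dead" */
	if (reenter_idle)
		idle_enter_model(&ts);  /* fixup once the CPU is back */
	return idle_exit_model(&ts);    /* idle loop carries on */
}
```

Under these assumptions, offline_then_online_model(false) reports the warning and offline_then_online_model(true) does not. That mirrors why the fix leans on tick_nohz_idle_enter(), and why it is awkward to backport to 3.2, where that function does not exist.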