On 15.12.2010, at 01:00, Scott Wood wrote: > On Wed, 15 Dec 2010 00:29:40 +0100 > Alexander Graf <agraf@xxxxxxx> wrote: > >> On 15.12.2010, at 00:17, Scott Wood wrote: >> >>> So that once KVM has an interrupt to deliver, and sees that critical is >>> engaged, it knows that the next magic page store will resolve things. >>> Either it is a store to critical, and KVM can now deliver the >>> interrupt -- or it is some other store (scratch or MSR itself) and thus >>> int_pending has not yet been checked. >>> >>> I don't think it would be a problem for live patching. It just seems a >>> bit icky. >> >> Oh, because you'd only trap stores, but no writes? Yep, that would work. > > "writes" or "loads"? :-) s/writes/loads/ :). Sorry, it's 1am here :). > >> I actually like that idea. It's probably the cleanest we can get away with without deep modifications of the guest. Single-step is always icky. > > Well, there's another complication -- if we trap on the final store to > end the critical section, the critical section won't actually be ended > until after that instruction executes. Which won't happen until we set > the page to read/write and let it go. So we'd have to look at the > instruction to see what it's doing. Yep, which is why I proposed the thing below. We'd have to emulate the uncrit store and then automatically inject the interrupt because we're not in the crit section anymore. > >> Thinking about the whole thing - can't we create an "interrupt notification page"? Some page that is always mapped read-only when interrupts are available, but read-write when they're not? > Then we could just do an unconditional store after the crit section is done and everyone's happy. > > I'd limit it to interrupts that were deferred due to critical, > to avoid unnecessary MMU manipulation, and unnecessary traps when doing > mtmsr/wrtee if there's an interrupt pending and old EE = new EE = zero > (assuming the guest doesn't use a separate restore path for that case). Hrm, we already do treat EE 0 -> 1 changes differently in the code: /* Check if we have to fetch an interrupt */ lwz r31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0) cmpwi r31, 0 beq+ no_check /* Check if we may trigger an interrupt */ andi. r30, r30, MSR_EE beq no_check SCRATCH_RESTORE /* Nag hypervisor */ kvm_emulate_mtmsrd_orig_ins: tlbsync b kvm_emulate_mtmsrd_branch So the only thing we need to do is to get rid of the MAGIC_INT check and unconditionally store a random register into the notification page (-2*PAGE_SIZE) instead of the tlbsync instruction (sure, things there are slightly more compllicated because we actually use the real mtmsr, but you get the point). > > But otherwise sounds reasonable, if we're willing to change the > interface that much. Does it even need to be read-only, or could it be > entirely unmapped when there's a pending interrupt? I don't see why we shouldn't add this to the interface. But I'd leave this in the back of our heads for a couple more days so in case we end up coming up with an even better idea, we rather implement that one :) It could be read-only or unmapped - doesn't really matter. It has to be a store, because otherwise we'd clobber a guest register. It could however be a single physical page shared by all guests. We don't care about the contents written into that page anyways. Alex -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html