Re: Reset problem vs. MMIO emulation, hypercalls, etc...

On 08/07/2012 03:14 PM, David Gibson wrote:
> On Tue, Aug 07, 2012 at 11:46:35AM +0300, Avi Kivity wrote:
>> On 08/07/2012 04:32 AM, David Gibson wrote:
>> > On Tue, Aug 07, 2012 at 06:57:57AM +1000, Benjamin Herrenschmidt wrote:
>> >> On Mon, 2012-08-06 at 13:13 +1000, David Gibson wrote:
>> >> > So, I'm still trying to nut out the implications for H_CEDE, and think
>> >> > if there are any other hypercalls that might want to block the guest
>> >> > for a time.  We were considering blocking H_PUT_TCE if qemu devices
>> >> > had active dma maps on the previously mapped iovas.  I'm not sure if
>> >> > the discussions that led to the inclusion of the qemu IOMMU code
>> >> > decided that was wholly unnecessary or just not necessary for the
>> >> > time being.
>> >> 
>> >> For "sleeping hcalls" they will simply have to set exit_request to
>> >> complete the hcall from the kernel perspective, leaving us in a state
>> >> where the kernel is about to restart at srr0 + 4, along with some other
>> >> flag (stop or halt) to actually freeze the vcpu.
>> >> 
>> >> If such an "async" hcall decides to return an error, it can then set
>> >> gpr3 directly using ioctls before restarting the vcpu.
>> > 
>> > Yeah, I'd pretty much convinced myself of that by the end of
>> > yesterday.  I hope to send patches implementing these fixes today.
>> > 
>> > There are also some questions about why our in-kernel H_CEDE works
>> > kind of differently from x86's hlt instruction implementation (which
>> > comes out to qemu unless the irqchip is in-kernel as well).  I don't
>> > think we have an urgent problem there though.
>> 
>> It's the other way round, hlt sleeps in the kernel unless the irqchip is
>> not in the kernel.
> 
> That's the same as what I said.

I meant to stress that the normal way which other archs should emulate
is sleep-in-kernel.

> 
> We never have irqchip in kernel (because we haven't written that yet)
> but we still sleep in-kernel for CEDE.  I haven't spotted any problem
> with that, but now I'm wondering if there is one, since x86 doesn't do
> it in what seems like the analogous situation.
> 
> It's possible this works because our decrementer (timer) interrupts
> are different at the core level from external interrupts coming from
> the PIC, and *are* handled in kernel, but I haven't actually followed
> the logic to work out if this is the case.
> 
>>  Meaning the normal state of things is to sleep in
>> the kernel (whether or not you have an emulated interrupt controller in
>> the kernel -- the term irqchip in kernel is overloaded for x86).
> 
> Uh.. overloaded in what way?

On x86, irqchip-in-kernel means that the local APICs, the IOAPIC, and
the two PICs are emulated in the kernel.  Now the IOAPIC and the PICs
correspond to non-x86 interrupt controllers, but the local APIC is more
tightly coupled to the core.  Interrupt acceptance by the core is an
operation that involves synchronous communication with the local APIC:
the APIC presents the interrupt, the core accepts it based on the value
of the interrupt enable flag and possibly a register (CR8), then the
APIC updates the ISR and IRR.
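
To make the handshake above concrete, here is a toy model in C.  It is
only a sketch: the toy_lapic and toy_core types and all function names
are made up for illustration and are not KVM or qemu code.  It also
assumes a simplified priority rule (a vector is masked when its
priority class, vector >> 4, is less than or equal to CR8) and ignores
the priority of interrupts that are already in service.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy local APIC: one bit per vector in IRR (requested) and ISR (in service). */
struct toy_lapic {
    uint8_t irr[32];
    uint8_t isr[32];
};

/* Toy core state: the interrupt enable flag and CR8 (task priority class). */
struct toy_core {
    bool    if_flag;
    uint8_t cr8;      /* 0..15 */
};

/* A device (or another APIC) raises a vector: mark it pending in IRR. */
static void lapic_request(struct toy_lapic *apic, uint8_t vec)
{
    apic->irr[vec / 8] |= (uint8_t)(1u << (vec % 8));
}

/* Highest pending vector in IRR, or -1 if nothing is pending. */
static int lapic_highest_irr(const struct toy_lapic *apic)
{
    for (int vec = 255; vec >= 0; vec--)
        if (apic->irr[vec / 8] & (1u << (vec % 8)))
            return vec;
    return -1;
}

/*
 * One synchronous acceptance step.  The APIC presents its highest
 * pending vector, the core decides based on IF and CR8, and only on
 * acceptance does the APIC move the vector from IRR to ISR.
 * Returns the accepted vector, or -1 if nothing was accepted.
 */
static int core_try_accept(const struct toy_core *core, struct toy_lapic *apic)
{
    int vec = lapic_highest_irr(apic);        /* APIC presents */
    if (vec < 0)
        return -1;
    if (!core->if_flag)                       /* interrupts disabled */
        return -1;
    if ((vec >> 4) <= core->cr8)              /* masked by CR8 (simplified) */
        return -1;

    uint8_t bit = (uint8_t)(1u << (vec % 8));
    apic->irr[vec / 8] &= (uint8_t)~bit;      /* APIC updates IRR ... */
    apic->isr[vec / 8] |= bit;                /* ... and ISR */
    return vec;
}
```

The point of the model is that core_try_accept() needs both the core
state and the APIC state in the same step, which is why the two cannot
sit on opposite sides of the kernel/userspace boundary without
synchronizing.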

The upshot is that if the local APIC is in userspace, interrupts must be
synchronous with vcpu execution, so that KVM_INTERRUPT is a vcpu ioctl
and HLT is emulated in userspace (so that local APIC emulation can check
if an interrupt wakes it up or not).  As soon as the local APIC is
emulated in the kernel, HLT can be emulated there as well, and
interrupts become asynchronous (KVM_IRQ_LINE, a vm ioctl).
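
As a rough illustration of the two paths from the userspace side, the
sketch below wraps the two ioctls.  The wrapper names
(queue_irq_sync, set_irq_line_async) are hypothetical; the ioctls and
structs come from <linux/kvm.h>.  It assumes the caller already holds a
vcpu fd or a vm fd, and for the second path that the VM has an
in-kernel irqchip.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Synchronous path (local APIC in userspace): KVM_INTERRUPT is issued
 * on the vcpu fd, so it is tied to that vcpu's execution.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int queue_irq_sync(int vcpu_fd, uint32_t vector)
{
    struct kvm_interrupt intr;

    memset(&intr, 0, sizeof(intr));
    intr.irq = vector;
    return ioctl(vcpu_fd, KVM_INTERRUPT, &intr);
}

/*
 * Asynchronous path (irqchip in kernel): KVM_IRQ_LINE is issued on the
 * vm fd and only sets the level of an interrupt line; the kernel
 * decides when a vcpu actually takes the interrupt.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_irq_line_async(int vm_fd, uint32_t gsi, int level)
{
    struct kvm_irq_level lvl;

    memset(&lvl, 0, sizeof(lvl));
    lvl.irq = gsi;
    lvl.level = level ? 1u : 0u;
    return ioctl(vm_fd, KVM_IRQ_LINE, &lvl);
}
```

The difference in file descriptor is the whole story here: the first
call needs a particular vcpu, the second only needs the VM and can be
made from any thread.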

So irqchip_in_kernel, for most discussions, really means whether
interrupt queuing is synchronous or asynchronous.  It has nothing to do
with the interrupt controllers per se.  All non-x86 archs always have
irqchip_in_kernel() in this sense.

Peter has started to fix up this naming mess in qemu.  I guess we should
do the same for the kernel (except for ABIs) and document it, because it
keeps generating confusion.

-- 
error compiling committee.c: too many arguments to function