On Wed, May 06, 2009 at 10:35:27AM +0800, Sheng Yang wrote:
> On Tuesday 05 May 2009 20:46:04 Michael S. Tsirkin wrote:
> > On Tue, May 05, 2009 at 07:49:10AM -0300, Marcelo Tosatti wrote:
> > > On Tue, May 05, 2009 at 01:34:50PM +0300, Michael S. Tsirkin wrote:
> > > > On Tue, May 05, 2009 at 07:19:45AM -0300, Marcelo Tosatti wrote:
> > > > > On Tue, May 05, 2009 at 12:51:36PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Apr 27, 2009 at 10:30:17PM +0800, Sheng Yang wrote:
> > > > > > > > > > > If guest can write to the real device MSI-X table
> > > > > > > > > > > directly, it would cause chaos on interrupt delivery, for
> > > > > > > > > > > what guest see is totally different with what's host
> > > > > > > > > > > see...
> > > > > > > > > >
> > > > > > > > > > Obviously.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > >
> > > > > > What's the reason that this page is unmapped from the qemu memory
> > > > > > space? Specifically what do these lines do:
> > > > > > int offset = r_dev->msix_table_addr - real_region->base_addr;
> > > > > > ret = munmap(region->u.r_virtbase + offset, TARGET_PAGE_SIZE);
> > > > >
> > > > > I believe this allows accesses to this page (the MSI-X table), which
> > > > > is part of the guest address space (through kvm memory slots), to be
> > > > > trapped by qemu.
> > > > >
> > > > > Since there is no actual page in this guest address, KVM treats
> > > > > accesses as MMIO and forwards them to QEMU.
> > > >
> > > > I thought about this too.
> > > > But why is this necessary for assigned MSI-X but not for emulated
> > > > devices such as e.g. e1000? All e1000 does seems to be
> > > > cpu_register_physical_memory ...
> > >
> > > Because there is no registered (kvm) memory slot for the range which
> > > e1000 registers its MMIO? Not sure about the address of the MSI-X table
> > > page, but you could achieve the same effect by splitting the slot which
> > > it lives in two, with a 1 page hole between them.
> >
> > You could also move the emulated MSI-X table, sticking it on top of the
> > existing BAR. Since PCI config includes the pointer to the table,
> > a driver that reads this pointer will continue to work.
>
> One BAR can contain more than a MSI-X table... The PCI spec only said the
> other information should be page aligned and can't in the same page of MSI-X
> table(except PBA). I think this method make thing more complicate, we don't
> want to and can't trap other informations in the same BAR...

The trick I was suggesting was to increase the BAR size.
Let's assume we have a real BAR of size 1 MByte with the MSI-X table at
offset 0. We report to the guest a BAR of size 2 MByte with the MSI-X table
at offset 1 MByte. We then trap all accesses from 1 MByte to 2 MByte and
copy them to the MSI-X table; the lower 1 MByte stays directly mapped.

> > Of course, there's no guarantee that guest drivers don't just hard-code
> > this offset.
>
> I think this mostly won't happen.
>
> > > BTW this is why you can't map the MSI-X table page directly, you want
> > > accesses to be trapped.
> >
> > BTW current design won't work if the base page size is > 4K, will it?
> > The hole covers a page, so you'll get faults outside the MSI-X table.
>
> Yes. One entry for MSI-X is 16bytes, one page can contain 256 entries. Well, I
> haven't see a device get more than 100 entries, but for this limitation, maybe
> we should limit MSI-X max entries to 256 (rather than 512 entries
> now)temporarily...

Drivers might not have a clean fallback path if the number of entries
becomes smaller.

Another problem arises if TARGET_PAGE_SIZE is > 4K. The PCI spec only asks
devices to reserve 4K of space for the table, so you will accidentally
trap accesses not related to MSI-X.

-- 
MST
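
P.S. To make the BAR-size trick and the page-size concern concrete, here is
a rough sketch in C. It is not qemu or kvm code: struct shadow_bar and every
function name below are hypothetical, and placing the table at the same
relative offset in the upper half is my assumption (the example in this mail
only covers a table at offset 0). The page-size helper also assumes the
unmapped hole starts exactly at the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE 16u    /* bytes per MSI-X table entry */
#define MSIX_RESERVED   4096u  /* space reserved for the table, per the thread */

/* Hypothetical description of an assigned-device BAR (not a qemu type). */
struct shadow_bar {
    uint64_t real_size;       /* size of the real BAR, a power of two */
    uint64_t real_table_off;  /* MSI-X table offset inside the real BAR */
};

/* Size reported to the guest: twice the real BAR. */
static uint64_t guest_bar_size(const struct shadow_bar *b)
{
    /* BAR sizes are powers of two, so doubling gives a valid size. */
    assert(b->real_size != 0 && (b->real_size & (b->real_size - 1)) == 0);
    return b->real_size * 2;
}

/* Table offset reported to the guest: moved into the upper half. */
static uint64_t guest_table_off(const struct shadow_bar *b)
{
    return b->real_size + b->real_table_off;
}

/* Only the upper half is trapped; the lower half stays directly mapped. */
static bool guest_access_is_trapped(const struct shadow_bar *b, uint64_t off)
{
    return off >= b->real_size && off < guest_bar_size(b);
}

/* Translate a trapped guest offset back to an offset in the real BAR. */
static uint64_t trapped_to_real_off(const struct shadow_bar *b, uint64_t off)
{
    assert(guest_access_is_trapped(b, off));
    return off - b->real_size;
}

/* How many MSI-X entries fit in one page of the given size. */
static uint64_t msix_entries_per_page(uint64_t page_size)
{
    return page_size / MSIX_ENTRY_SIZE;
}

/* Bytes of non-MSI-X space caught by a one-page hole at the table. */
static uint64_t unrelated_bytes_trapped(uint64_t page_size)
{
    return page_size > MSIX_RESERVED ? page_size - MSIX_RESERVED : 0;
}
```

With a 1 MByte real BAR and the table at offset 0, the guest would see a
2 MByte BAR with the table at 1 MByte, and a guest access at 1 MByte + 16
would land on offset 16 of the real table (entry 1). A 4K page holds 256
entries and traps nothing extra, while a 64K page would trap 60K of
unrelated BAR space.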