Re: [PATCH 2/2] device-assignment: Allow PCI to manage the option ROM

On Fri, Oct 08, 2010 at 09:12:52AM -0600, Alex Williamson wrote:
> On Fri, 2010-10-08 at 10:40 +0200, Michael S. Tsirkin wrote:
> > On Thu, Oct 07, 2010 at 10:02:25PM -0600, Alex Williamson wrote:
> > > On Fri, 2010-10-08 at 00:45 +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 07, 2010 at 11:34:01AM -0600, Alex Williamson wrote:
> > > > > On Thu, 2010-10-07 at 19:18 +0200, Michael S. Tsirkin wrote:
> > > > > > On Mon, Oct 04, 2010 at 03:26:30PM -0600, Alex Williamson wrote:
> > > > > > > --- a/hw/device-assignment.c
> > > > > > > +++ b/hw/device-assignment.c
> > > > > ...
> > > > > > > @@ -1644,58 +1621,64 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
> > > > > > >   */
> > > > > > >  static void assigned_dev_load_option_rom(AssignedDevice *dev)
> > > > > > >  {
> > > > > > > -    int size, len, ret;
> > > > > > > -    void *buf;
> > > > > > > +    char name[32], rom_file[64];
> > > > > > >      FILE *fp;
> > > > > > > -    uint8_t i = 1;
> > > > > > > -    char rom_file[64];
> > > > > > > +    uint8_t val;
> > > > > > > +    struct stat st;
> > > > > > > +    void *ptr;
> > > > > > > +
> > > > > > > +    /* If loading ROM from file, pci handles it */
> > > > > > > +    if (dev->dev.romfile || !dev->dev.rom_bar)
> > > > > > > +        return;
> > > > > > >  
> > > > > > >      snprintf(rom_file, sizeof(rom_file),
> > > > > > >               "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
> > > > > > >               dev->host.seg, dev->host.bus, dev->host.dev, dev->host.func);
> > > > > > >  
> > > > > > > -    if (access(rom_file, F_OK))
> > > > > > > +    if (stat(rom_file, &st)) {
> > > > > > >          return;
> > > > > > > +    }
> > > > > > >  
> > > > > > 
> > > > > > Just a note that stat on the ROM sysfs file returns the window size,
> > > > > > not the ROM size, so this allocates more RAM than the ROM really
> > > > > > needs. The real size is what fread returns.
> > > > > > 
> > > > > > Do we care?
> > > > > 
> > > > > That was my intention with using stat.  I thought that by default the
> > > > > ROM BAR should match physical hardware, so even if the contents could be
> > > > > rounded down to a smaller size, we maintain the size of the physical
> > > > > device.  To use the minimum size, the contents could be extracted using
> > > > > pci-sysfs and passed with the romfile option, or the ROM could be
> > > > > disabled altogether with the rombar=0 option.  Sound reasonable?
> > > > > Thanks,
> > > > > 
> > > > > Alex
> > > > 
> > > > For the BAR size, yes, but we do not need a buffer full of 0xff that is
> > > > never accessed. Let's have the buffer size match the real ROM and avoid
> > > > wasting memory; this can easily come to megabytes.
> > > > Makes sense?
> > > 
> > > I tend to doubt that hardware vendors are going to waste money putting
> > > seriously oversized eeproms on devices.  It does seem pretty typical to
> > > find graphics cards with 128K ROM BARs where the actual ROM squeezes
> > > just under 64K, but that's a long way from megabytes of wasted memory.
> > > The only device I have with a ROM BAR in the megabytes is an 82576, but
> > > it comes up as an invalid rom through pci-sysfs, so we skip it.  I
> > > assume that just means someone was lazy and didn't bother to fuse a
> > > transistor that disables the ROM BAR, leaving it at its maximum
> > > aperture w/ no eeprom to back it.  Anyone know?  Examples to the
> > > contrary welcome.
> > > 
> > > So I think the question comes down to whether there's any value to
> > > trying to exactly mimic the resource layout of the device.  I'm doubtful
> > > that there is, but at the potential cost of 10-100s of KBs of memory, I
> > > thought it might be worthwhile.  If you feel strongly otherwise, I'll
> > > follow-up with a patch to size it by the actual readable contents.
> > > Thanks,
> > > 
> > > Alex
> > 
> > I actually agree sizing ROM BAR exactly the same as the device
> > is a good idea. I just thought we can save the extra memory
> > by not allocating the RAM in question, and writing code
> > to return 0xff on reads within the BAR but outside ROM.
> > And no, I don't feel strongly about this optimization.
> > 
> 
> Ok, so you're looking for something like below.  We can no longer map
> the ROM into the guest, but it's a ROM, so we don't care about speed.

Why can't we map the ROM? Map the full pages and leave the 0xff region unmapped.
Such a region exists because the BAR size is a power of 2.
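
To make the split concrete, here is a rough sketch of the arithmetic, not
code from the patch. The struct, the function name and the 4 KiB page size
are all made up for illustration; the caller is assumed to pass the real
ROM length and the power-of-2 BAR size.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the real value is target dependent. */
#define ROM_PAGE_SIZE 4096u

/* Hypothetical helper type, not part of QEMU. */
struct rom_layout {
    size_t mapped_size;    /* whole pages backed by RAM holding the ROM */
    size_t unmapped_size;  /* rest of the BAR, would read back as 0xff */
};

static struct rom_layout rom_layout_for(size_t rom_size, size_t bar_size)
{
    struct rom_layout l;

    /* A BAR is a nonzero power of 2 and must be able to hold the ROM. */
    assert(bar_size != 0 && (bar_size & (bar_size - 1)) == 0);
    assert(rom_size <= bar_size);

    /* Round the ROM length up to whole pages, clamped to the BAR. */
    l.mapped_size = (rom_size + ROM_PAGE_SIZE - 1) &
                    ~(size_t)(ROM_PAGE_SIZE - 1);
    if (l.mapped_size > bar_size) {
        l.mapped_size = bar_size;
    }

    /* The tail of the last mapped page still needs to be filled with 0xff. */
    l.unmapped_size = bar_size - l.mapped_size;
    return l;
}
```

For the graphics card case mentioned earlier (a ROM of about 60000 bytes
behind a 128K BAR) this gives 15 mapped pages and 68K left unmapped.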

> Here's the big problem... it breaks migration.  The ramblock live
> migration code isn't going to deal well with migration from a VM with a
> BAR sized ramblock to a ROM sized ramblock (likewise the reverse).

You mean cross-version migration? Otherwise, why wouldn't both
sides be ROM sized?

> So we could do it for passthrough devices since they can't migrate anyway,
> but then we have to go back to separate code to handle assigned device
> ROMs vs emulated device ROMs.

I think this is based on the assumption that we do not map the ROM.
If we do map it, then most of the code stays the same;
we just add 0xff handling for the pages after the end of the ROM.
These are typically never accessed anyway.
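
The 0xff handling could look roughly like the sketch below. It is
standalone C with made-up names (struct rom_image, rom_read), not the QEMU
callbacks, and it assumes little-endian assembly of multi-byte reads. The
point is only the bounds check: every byte at or past rom_size reads as
0xff, including the part of a multi-byte access that straddles the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ROM buffer and its real length. */
struct rom_image {
    const uint8_t *data;
    size_t rom_size;
};

/*
 * Read len bytes (1, 2 or 4) at offset addr within the BAR.
 * Bytes beyond the real ROM contents read back as 0xff.
 */
static uint32_t rom_read(const struct rom_image *rom, size_t addr,
                         unsigned len)
{
    uint32_t val = 0;
    unsigned i;

    for (i = 0; i < len; i++) {
        uint8_t b = 0xff;

        /* Written this way so addr + i cannot wrap around. */
        if (addr < rom->rom_size && i < rom->rom_size - addr) {
            b = rom->data[addr + i];
        }
        val |= (uint32_t)b << (8 * i);
    }
    return val;
}
```

Note the comparison is effectively "offset >= rom_size", since the last
valid byte is at rom_size - 1.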

> Good idea, but I don't think it's worth
> the effort.  Thanks,
> 
> Alex
> 
> Not Signed-off, Not to be applied...
> 
> diff --git a/hw/device-assignment.c b/hw/device-assignment.c
> index 26cb797..94561ef 100644
> --- a/hw/device-assignment.c
> +++ b/hw/device-assignment.c
> @@ -1622,6 +1622,7 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
>  static void assigned_dev_load_option_rom(AssignedDevice *dev)
>  {
>      char name[32], rom_file[64];
> +    size_t size;
>      FILE *fp;
>      uint8_t val;
>      struct stat st;
> @@ -1654,20 +1655,23 @@ static void assigned_dev_load_option_rom(AssignedDevice *dev)
>      if (fwrite(&val, 1, 1, fp) != 1) {
>          goto close_rom;
>      }
> +
> +    fseek(fp, 0, SEEK_END);
> +    size = ftell(fp);

I don't think this works. Looking at the kernel code:
loff_t
generic_file_llseek_unlocked(struct file *file, loff_t offset, int origin)
{
        struct inode *inode = file->f_mapping->host;

        switch (origin) {
        case SEEK_END:
                offset += inode->i_size;

So this still seems to be the BAR size; you really need the size returned by
fread.
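
Something along these lines is what I mean by using the fread size. This
is only a sketch with a made-up helper name (read_rom_contents): read into
a buffer of at most the window size and take the byte count that fread
reports as the ROM size, instead of trusting stat or fseek/ftell.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Read up to max_size bytes from fp into buf and return how many bytes
 * were actually read. max_size would be the window size from stat; the
 * return value is the length to treat as the real ROM size.
 */
static size_t read_rom_contents(FILE *fp, void *buf, size_t max_size)
{
    size_t total = 0;
    size_t n;

    /* Loop because a single fread may return fewer bytes than requested. */
    while (total < max_size) {
        n = fread((char *)buf + total, 1, max_size - total, fp);
        if (n == 0) {
            break;  /* end of contents or read error */
        }
        total += n;
    }
    return total;
}
```

A return of 0 would still be handled as the "cannot read" error case, as
the existing code does.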


>      fseek(fp, 0, SEEK_SET);
>  
>      snprintf(name, sizeof(name), "%s.rom", dev->dev.qdev.info->name);
> -    dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, st.st_size);
> +    dev->dev.rom_offset = qemu_ram_alloc(&dev->dev.qdev, name, size);
> +    dev->dev.rom_size = size;
>      ptr = qemu_get_ram_ptr(dev->dev.rom_offset);
> -    memset(ptr, 0xff, st.st_size);
>  
> -    if (!fread(ptr, 1, st.st_size, fp)) {
> +    if (!fread(ptr, 1, size, fp)) {
>          fprintf(stderr, "pci-assign: Cannot read from host %s\n"
>                  "\tDevice option ROM contents are probably invalid "
>                  "(check dmesg).\n\tSkip option ROM probe with rombar=0, "
>                  "or load from file with romfile=\n", rom_file);
>          qemu_ram_free(dev->dev.rom_offset);
> -        dev->dev.rom_offset = 0;
> +        dev->dev.rom_offset = dev->dev.rom_size = 0;
>          goto close_rom;
>      }
>  
> diff --git a/hw/pci.c b/hw/pci.c
> index 07e9661..bd15eb7 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1973,9 +1973,49 @@ static uint8_t pci_find_capability_list(PCIDevice *pdev, uint8_t cap_id,
>      return next;
>  }
>  
> -void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr, pcibus_t size, int type)
> +static uint32_t rom_readb(void *opaque, target_phys_addr_t addr)
>  {
> -    cpu_register_physical_memory(addr, size, pdev->rom_offset);
> +    PCIDevice *pdev = opaque;
> +
> +    if (addr >= pdev->rom_size)
> +         return 0xff;
> +
> +    return *(uint8_t *)qemu_get_ram_ptr(pdev->rom_offset + addr);
> +}
> +
> +static uint32_t rom_readw(void *opaque, target_phys_addr_t addr)
> +{
> +    PCIDevice *pdev = opaque;
> +
> +    if (addr + 1 >= pdev->rom_size)
> +         return 0xffff;
> +
> +    return *(uint16_t *)qemu_get_ram_ptr(pdev->rom_offset + addr);
> +}
> +
> +static uint32_t rom_readl(void *opaque, target_phys_addr_t addr)
> +{
> +    PCIDevice *pdev = opaque;
> +
> +    if (addr + 3 >= pdev->rom_size)
> +         return 0xffffffff;
> +
> +    return *(uint32_t *)qemu_get_ram_ptr(pdev->rom_offset + addr);
> +}
> +
> +static CPUReadMemoryFunc * const rom_reads[] = {
> +    &rom_readb, &rom_readw, &rom_readl
> +};
> +
> +static CPUWriteMemoryFunc * const rom_writes[] = { NULL, NULL, NULL };
> +
> +void pci_map_option_rom(PCIDevice *pdev, int region_num, pcibus_t addr,
> +                        pcibus_t size, int type)
> +{
> +    int m;
> +
> +    m = cpu_register_io_memory(rom_reads, rom_writes, pdev);
> +    cpu_register_physical_memory(addr, size, m);
>  }
>  
>  /* Add an option rom for the device */
> @@ -2016,9 +2056,7 @@ static int pci_add_option_rom(PCIDevice *pdev)
>                       __FUNCTION__, pdev->romfile);
>          return -1;
>      }
> -    if (size & (size - 1)) {
> -        size = 1 << qemu_fls(size);
> -    }
> +    pdev->rom_size = size;
>  
>      if (pdev->qdev.info->vmsd)
>          snprintf(name, sizeof(name), "%s.rom", pdev->qdev.info->vmsd->name);
> @@ -2030,6 +2068,11 @@ static int pci_add_option_rom(PCIDevice *pdev)
>      load_image(path, ptr);
>      qemu_free(path);
>  
> +    /* Round up size for the BAR */
> +    if (size & (size - 1)) {
> +        size = 1 << qemu_fls(size);
> +    }
> +
>      pci_register_bar(pdev, PCI_ROM_SLOT, size,
>                       0, pci_map_option_rom);
>  
> @@ -2042,7 +2085,7 @@ static void pci_del_option_rom(PCIDevice *pdev)
>          return;
>  
>      qemu_ram_free(pdev->rom_offset);
> -    pdev->rom_offset = 0;
> +    pdev->rom_offset = pdev->rom_size = 0;
>  }
>  
>  /* Reserve space and add capability to the linked list in pci config space */
> diff --git a/hw/pci.h b/hw/pci.h
> index 9ee8db3..ed87b1a 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -187,6 +187,7 @@ struct PCIDevice {
>      /* Location of option rom */
>      char *romfile;
>      ram_addr_t rom_offset;
> +    size_t rom_size;
>      uint32_t rom_bar;
>  
>      /* How much space does an MSIX table need. */
> 