Re: [RESEND][PATCH] PCI: Reassign page-aligned memory resources to device for pci passthrough.

Thank you for your reply.
See below.

On Wed, 19 Nov 2008 15:08:34 -0800
Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:

> On Wednesday, November 12, 2008 9:11 pm Yuji Shimada wrote:
> > > Like Matthew said, reassigning at runtime should be possible, but can be
> > > a little trickier since your configuration has to disallow driver binding
> > > (or unbind everything) until after you've done your reassignments.  For
> > > v12n setups that doesn't seem wholly unreasonable, but probably isn't as
> > > convenient as simply doing it at startup time.
> >
> > Reassigning at runtime requires suspending the device driver. The
> > driver should hide the suspension from the upper layers. To achieve
> > this, the driver should queue requests from the upper layers while it
> > is suspended.
> 
> Yuji-san, after talking with a few more v12n folks, I think it would be best 
> to bite the bullet and move over to a runtime re-allocation scheme, rather 
> than the boot time scheme you have here.  This has the advantage of not 
> relying on specific bus numbers (which will be reordered on many machines), 
> and also provides more flexibility to users who want to re-configure their 
> guest device assignments w/o rebooting.  I'm not sure if there's a guest 
> assignment ioctl that you could hook the reallocation into, if not you might 
> want to add a sysfs file or something to allow the user to disable/re-enable 
> the device with the larger alignment constraints.

I agree that runtime re-allocation would be useful, but I have no plans
to write a patch for it.

> > Currently Linux does not have such a mechanism, so reassigning at
> > runtime is a big challenge.
> 
> I think you'd have to unbind the driver altogether in this case.  But if 
> you're assigning a device to a guest, there shouldn't be any host driver 
> bound to it, right?

The device is not bound to a regular host driver, but it is bound to a
dummy driver called 'pciback'.

> Maybe I'm not understanding the problem here though, can 
> you provide a concrete example?

Please consider a dual-port NIC or a dual-port HBA. One port is assigned
to the guest, and the other port is used by the host. When I need to
reassign resources to all devices behind a PCI-PCI bridge, I have to
stop the device driver that is bound to the port used by the host.

> > > > +		for (i=0; i < PCI_NUM_RESOURCES; i++) {
> > > > +			r = &dev->resource[i];
> > > > +			if (!(r->flags & IORESOURCE_MEM))
> > > > +				continue;
> > > > +
> > > > +			r->end = r->end - r->start;
> > > > +			r->start = 0;
> > >
> > > Do you also want to make sure the size is at least a page here? Otherwise
> > > the page you assign to the guest might contain another device too, right?
> > >  Also, increasing the size to at least a page here should make the other
> > > alignment checks in setup-bus.c and setup-res.c unnecessary, since size
> > > alignment is the default...
> >
> > That's right.
> >
> > But, if I make sure the size is at least a page (make sure "r->end -
> > r->start >= PAGE_SIZE"), then /proc/iomem and
> > /sys/bus/pci/devices/dddd:bb:dd.f/resource will not show real size.
> 
> But it *will* show the actual iomem that the device has assigned to it.  
> You're right that it doesn't reflect the actual decode range of the device, 
> but any software that wanted to get that could do its own BAR sizing on it, 
> or if we really needed to we could save the original size somewhere.
> 
> > I can make sure no other resources fall into the same page by
> > specifying the devices that have small resources with "pci=pagealignmem=".
> >
> > So I'd like to keep this.
> 
> I think the advantage of rounding up to a page outweighs the disadvantage, 
> since it means we don't have to change the core resource code (it's already 
> fragile enough, it would be best if we could avoid touching it).

I am willing to modify my patch to make sure the size is at least a page, if 
boot-time re-allocation is acceptable.
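
To illustrate the rounding discussed above, here is a minimal user-space
sketch, not the actual patch. The struct and the SKETCH_* names are
hypothetical stand-ins for the kernel's struct resource, PAGE_SIZE and
IORESOURCE_MEM, and a 4 KiB page size is assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE      4096ULL  /* assumption: 4 KiB pages */
#define SKETCH_IORESOURCE_MEM 0x200UL  /* stand-in for the kernel's IORESOURCE_MEM */

/* Hypothetical stand-in for the kernel's struct resource. */
struct sketch_resource {
	uint64_t start;
	uint64_t end;          /* inclusive, as in the kernel */
	unsigned long flags;
};

/*
 * Prepare a memory resource for reassignment: keep its size, but pad it
 * to at least one page so no other device can share that page, then
 * rebase it to start at 0 as the quoted loop does.
 */
static void sketch_page_align(struct sketch_resource *r)
{
	uint64_t size;

	if (!(r->flags & SKETCH_IORESOURCE_MEM))
		return;                     /* leave I/O port resources alone */

	size = r->end - r->start + 1;       /* end is inclusive */
	if (size < SKETCH_PAGE_SIZE)
		size = SKETCH_PAGE_SIZE;

	r->start = 0;
	r->end = size - 1;
}
```

With this, a 256-byte BAR at 0xfebf0000-0xfebf00ff would come out as
start 0, end 0xfff, while a BAR that is already a page or larger keeps
its original size.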

Thanks,
--
Yuji Shimada

