Re: [RESEND][PATCH] PCI: Reassign page-aligned memory resources to device for pci passthrough.

On Tuesday, November 25, 2008 9:51 pm Yuji Shimada wrote:
> Thank you for your reply.
> See below.
>
> On Wed, 19 Nov 2008 15:08:34 -0800
>
> Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:
> > On Wednesday, November 12, 2008 9:11 pm Yuji Shimada wrote:
> > > > Like Matthew said, reassigning at runtime should be possible, but can
> > > > be a little trickier since your configuration has to disallow driver
> > > > binding (or unbind everything) until after you've done your
> > > > reassignments.  For v12n setups that doesn't seem wholly
> > > > unreasonable, but probably isn't as convenient as simply doing it at
> > > > startup time.
> > >
> > > Reassigning at runtime needs suspending device driver. Device driver
> > > should hide the fact of suspending from upper layer. To achieve this,
> > > device driver should queue requests from upper layer during
> > > suspending.
> >
> > Yuji-san, after talking with a few more v12n folks, I think it would be
> > best to bite the bullet and move over to a runtime re-allocation scheme,
> > rather than the boot time scheme you have here.  This has the advantage
> > of not relying on specific bus numbers (which will be reordered on many
> > machines), and also provides more flexibility to users who want to
> > re-configure their guest device assignments w/o rebooting.  I'm not sure
> > if there's a guest assignment ioctl that you could hook the reallocation
> > into, if not you might want to add a sysfs file or something to allow the
> > user to disable/re-enable the device with the larger alignment
> > constraints.
>
> I have no plan to create the patch for runtime re-allocation, while I
> agree with you it is useful.

I hope someone from one of the virtualization projects will step up to do 
this...

> > > Currently linux does not have such mechanism. So reassigning at
> > > runtime is big challenge.
> >
> > I think you'd have to unbind the driver altogether in this case.  But if
> > you're assigning a device to a guest, there shouldn't be any host driver
> > bound to it, right?
>
> The device is not bound by host driver, though it is bound by dummy
> driver called 'pciback'.
>
> > Maybe I'm not understanding the problem here though, can
> > you provide a concrete example?
>
> Please consider dual port NIC or dual port HBA. One port will be
> assigned to guest, other port is used by host. When I need to reassign
> resources to all device behind PCI-PCI bridge, I have to stop device
> driver which binds to port used by host.

Right, maybe I assume too much here; my guess was that a configuration like 
that would be specified in some sort of boot scripts, to prevent host drivers 
from binding to specific sub functions if needed, that way you'd avoid the 
need to unbind.

> > > I can make sure no other resources fall into the same page,
> > > specifying device which have small resource, with "pci=pagealignmem=".
> > >
> > > So I'd like to keep this.
> >
> > I think the advantage of rounding up to a page outweighs the
> > disadvantage, since it means we don't have to change the core resource
> > code (it's already fragile enough, it would be best if we could avoid
> > touching it).
>
> I am willing to modify my patch to make sure the size is at least a page,
> if boot-time re-allocation is acceptable.

I don't have a problem in principle with boot time re-allocation, but in order 
to make it acceptable for upstream we'll have to figure out a way of dealing 
with the problem of changing PCI bus numbers (and therefore boot parameters) 
across kernel versions, firmware whims, and hardware changes.  And yes, to 
make it less invasive I'd like to round up to a page size as well.
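
To illustrate what rounding up to a page would mean for a small resource, 
here is a minimal sketch. The helper name round_up_to_page and the 4 KiB page 
size in the usage note are assumptions for illustration only; they are not 
taken from the patch, and in-kernel code would more likely use an existing 
alignment macro such as PAGE_ALIGN().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): round a resource size up to
 * a whole number of pages. page_size must be a non-zero power of two.
 */
static uint64_t round_up_to_page(uint64_t size, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* Add page_size - 1, then clear the low bits to land on a boundary. */
	return (size + page_size - 1) & ~(page_size - 1);
}
```

With a 4096-byte page, a 256-byte BAR would be treated as 4096 bytes, so no 
other resource can be placed in the same page.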

I was hoping that a runtime scheme would be acceptable to you, since that 
avoids the problem entirely.  It doesn't sound like runtime will work for you 
though, so we're left with the bus numbering problem.

Can you think of any solutions?  Matthew suggested simply reassigning all 
devices; though I can imagine that this would be problematic on systems where 
we don't pick up all the system I/O resources used by the firmware for 
example, and it might also cause us to run out of space much sooner than we 
would otherwise.  Another option might be to limit it to specific PCI device 
classes, or some combination of class, vendor and device ID.
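
As a rough sketch of that last option, selecting devices by class, vendor and 
device ID instead of by bus number could look something like the following. 
Everything here is hypothetical (struct realign_rule, struct dev_ident, 
MATCH_ANY, should_realign); none of it is existing kernel API, and the IDs in 
the usage note are only example values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MATCH_ANY 0xffffffffu	/* hypothetical wildcard for vendor/device */

/* Identity of a device as read from config space (hypothetical layout). */
struct dev_ident {
	uint32_t class_code;	/* 24-bit class code: base, sub, prog-if */
	uint16_t vendor;
	uint16_t device;
};

/* One selection rule; a class_mask of 0 matches every class. */
struct realign_rule {
	uint32_t class_code;
	uint32_t class_mask;
	uint32_t vendor;	/* MATCH_ANY or a 16-bit vendor ID */
	uint32_t device;	/* MATCH_ANY or a 16-bit device ID */
};

static bool rule_matches(const struct realign_rule *r,
			 const struct dev_ident *d)
{
	if ((d->class_code & r->class_mask) != (r->class_code & r->class_mask))
		return false;
	if (r->vendor != MATCH_ANY && r->vendor != d->vendor)
		return false;
	if (r->device != MATCH_ANY && r->device != d->device)
		return false;
	return true;
}

/* Return true if any rule selects this device for page-aligned reassignment. */
static bool should_realign(const struct realign_rule *rules, size_t n,
			   const struct dev_ident *d)
{
	for (size_t i = 0; i < n; i++)
		if (rule_matches(&rules[i], d))
			return true;
	return false;
}
```

A rule such as { 0x020000, 0xffff00, MATCH_ANY, MATCH_ANY } would select every 
Ethernet controller regardless of where it sits in the bus hierarchy, so the 
selection would not change when bus numbers are reordered.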

-- 
Jesse Barnes, Intel Open Source Technology Center

