Re: [RESEND][PATCH] PCI: Reassign page-aligned memory resources to device for pci passthrough.

Hi Arvind,

On Thu, 4 Dec 2008 15:31:48 -0800 (PST)
arvind vasudev <arvind_vasudev2000@xxxxxxxxx> wrote:

> Yuji-san,
> 
> I am trying to do something similar to what you are doing, except
> that in my case I would like to reassign the resources at a 1MB
> boundary for some specific devices. The question I had with this
> approach is: what if the allocation of the resources fails? If the
> requested resource, along with the alignment, is more than the space
> that was allocated for the device and the bridges above it, wouldn't
> that cause a potential problem which may require re-balancing of the
> entire resource tree?

That's right.
The bridge's resource window should be re-assigned and expanded if needed.
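
To illustrate the failure case, here is a rough sketch of the arithmetic
in Python. It is only an illustration, not kernel code: the function
names, the 4 KiB page size, and the example addresses are all
assumptions made up for this example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def place_in_window(window_start, window_end, first_free, size, alignment):
    """Try to place a resource of `size` bytes inside a bridge window.

    The window covers [window_start, window_end] inclusive, and
    `first_free` is the lowest address not yet used by another resource.
    Returns the aligned start address, or None if the resource does not
    fit and the window would have to be expanded.
    """
    start = align_up(max(window_start, first_free), alignment)
    end = start + size - 1
    return start if end <= window_end else None


# Hypothetical 1 MiB bridge window with the first 4 KiB already in use.
WINDOW_START, WINDOW_END = 0xE0000000, 0xE00FFFFF
FIRST_FREE = 0xE0001000

# A 256-byte resource aligned to a page still fits in the window.
page_aligned = place_in_window(WINDOW_START, WINDOW_END, FIRST_FREE,
                               0x100, PAGE_SIZE)

# The same resource aligned to 1 MiB lands past the end of the window.
mib_aligned = place_in_window(WINDOW_START, WINDOW_END, FIRST_FREE,
                              0x100, 0x100000)
```

In the second case the function returns None, which corresponds to the
situation where the bridge window itself has to grow.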

In theory, we can do this by hot-removing the bridge, and hot-adding it.

But the current fakephp driver does not call "pci_bus_size_bridges", so
we can't expand the resource window.


Note: the current fakephp driver does not call
"pci_bus_assign_resources" either. There is a patch to call
"pci_bus_assign_resources", and my approach depends on it.

Thanks,
--
Yuji Shimada

> 
> Thanks,
> Arvind.
> 
> 
> 
> ----- Original Message ----
> From: Yuji Shimada <shimada-yxb@xxxxxxxxxxxxxxx>
> To: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> Cc: linux-pci@xxxxxxxxxxxxxxx; "Zhao, Yu" <yu.zhao@xxxxxxxxx>
> Sent: Thursday, December 4, 2008 12:43:14 AM
> Subject: Re: [RESEND][PATCH] PCI: Reassign page-aligned memory resources to device for pci passthrough.
> 
> What do you think about supporting both boot-time and runtime re-assignment?
> 
> I found that we can re-assign resources using the fakephp driver. What
> I need to do is enhance my patch so that the "pagealignmem" parameter
> can be set at runtime.
> 
> I think it is good to create "/proc/bus/pci/pagealignmem".
> 
> Re-assignment can be done as follows.
> 
> 1. Set the "pagealignmem" parameter to the SSSS:DD:BB.F of the device at runtime.
> 2. Hot-remove the device.
> 3. Hot-add it.
> 
> 
> If you agree with me, I will submit a new patch.
> 
> Thanks,
> --
> Yuji Shimada
> 
> On Mon, 1 Dec 2008 12:55:19 -0800
> Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:
> 
> > On Tuesday, November 25, 2008 9:51 pm Yuji Shimada wrote:
> > > Thank you for your reply.
> > > See below.
> > >
> > > On Wed, 19 Nov 2008 15:08:34 -0800
> > >
> > > Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:
> > > > On Wednesday, November 12, 2008 9:11 pm Yuji Shimada wrote:
> > > > > > Like Matthew said, reassigning at runtime should be possible, but can
> > > > > > be a little trickier since your configuration has to disallow driver
> > > > > > binding (or unbind everything) until after you've done your
> > > > > > reassignments.  For v12n setups that doesn't seem wholly
> > > > > > unreasonable, but probably isn't as convenient as simply doing it at
> > > > > > startup time.
> > > > >
> > > > > Reassigning at runtime requires suspending the device driver. The
> > > > > device driver should hide the suspension from the upper layer. To
> > > > > achieve this, the device driver should queue requests from the
> > > > > upper layer while it is suspended.
> > > >
> > > > Yuji-san, after talking with a few more v12n folks, I think it would be
> > > > best to bite the bullet and move over to a runtime re-allocation scheme,
> > > > rather than the boot time scheme you have here.  This has the advantage
> > > > of not relying on specific bus numbers (which will be reordered on many
> > > > machines), and also provides more flexibility to users who want to
> > > > re-configure their guest device assignments w/o rebooting.  I'm not sure
> > > > if there's a guest assignment ioctl that you could hook the reallocation
> > > > into, if not you might want to add a sysfs file or something to allow the
> > > > user to disable/re-enable the device with the larger alignment
> > > > constraints.
> > >
> > > I have no plan to create the patch for runtime re-allocation, while I
> > > agree with you it is useful.
> > 
> > I hope someone from one of the virtualization projects will step up to do 
> > this...
> > 
> > > > > Currently Linux does not have such a mechanism, so reassigning at
> > > > > runtime is a big challenge.
> > > >
> > > > I think you'd have to unbind the driver altogether in this case.  But if
> > > > you're assigning a device to a guest, there shouldn't be any host driver
> > > > bound to it, right?
> > >
> > > The device is not bound to a host driver, though it is bound to a
> > > dummy driver called 'pciback'.
> > >
> > > > Maybe I'm not understanding the problem here though, can
> > > > you provide a concrete example?
> > >
> > > Please consider a dual-port NIC or a dual-port HBA. One port will be
> > > assigned to the guest, and the other port is used by the host. When I
> > > need to reassign resources to all devices behind a PCI-PCI bridge, I
> > > have to stop the device driver which is bound to the port used by
> > > the host.
> > 
> > Right, maybe I assume too much here; my guess was that a configuration like 
> > that would be specified in some sort of boot scripts, to prevent host drivers 
> > from binding to specific sub functions if needed, that way you'd avoid the 
> > need to unbind.
> > 
> > > > > I can make sure no other resources fall into the same page by
> > > > > specifying the devices which have small resources with
> > > > > "pci=pagealignmem=".
> > > > >
> > > > > So I'd like to keep this.
> > > >
> > > > I think the advantage of rounding up to a page outweighs the
> > > > disadvantage, since it means we don't have to change the core resource
> > > > code (it's already fragile enough, it would be best if we could avoid
> > > > touching it).
> > >
> > > I am willing to modify my patch to make sure the size is at least a page,
> > > if boot-time re-allocation is acceptable.
> > 
> > I don't have a problem in principle with boot time re-allocation, but in order 
> > to make it acceptable for upstream we'll have to figure out a way of dealing 
> > with the problem of changing PCI bus numbers (and therefore boot parameters) 
> > across kernel versions, firmware whims, and hardware changes.  And yes, to 
> > make it less invasive I'd like to round up to a page size as well.
> > 
> > I was hoping that a runtime scheme would be acceptable to you, since that 
> > avoids the problem entirely.  It doesn't sound like runtime will work for you 
> > though, so we're left with the bus numbering problem.
> > 
> > Can you think of any solutions?  Matthew suggested simply reassigning all 
> > devices; though I can imagine that this would be problematic on systems where 
> > we don't pick up all the system I/O resources used by the firmware for 
> > example, and it might also cause us to run out of space much sooner than we 
> > would otherwise.  Another option might be to limit it to specific PCI device 
> > classes, or some combination of class, vendor and device ID.
> > 
> > -- 
> > Jesse Barnes, Intel Open Source Technology Center