Hi Arvind,

On Thu, 4 Dec 2008 15:31:48 -0800 (PST)
arvind vasudev <arvind_vasudev2000@xxxxxxxxx> wrote:

> Yuji-san,
>
> I am trying to do something similar to what you are trying to do, except in
> my case, I would like to reassign the resources at a 1MB boundary,
> for some specific devices. The question I had with this approach is
> that, what if the allocation of the resources fails? If the
> requested resource along with the alignment is more than the space
> that was allocated for the device, and the bridges above
> it. Wouldn't that cause a potential problem which may require
> re-balancing of the entire resource tree?

That's right. The bridge's resource window should be re-assigned and
expanded if needed.

In theory, we can do this by hot-removing the bridge and hot-adding
it again. But the current fakephp driver does not call
"pci_bus_size_bridges", so we can't expand the resource window.

Note: the current fakephp driver does not call
"pci_bus_assign_resources" either. There is a patch to call
"pci_bus_assign_resources". My approach depends on it.

Thanks,
--
Yuji Shimada

>
> Thanks,
> Arvind.
>
>
>
> ----- Original Message ----
> From: Yuji Shimada <shimada-yxb@xxxxxxxxxxxxxxx>
> To: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> Cc: linux-pci@xxxxxxxxxxxxxxx; "Zhao, Yu" <yu.zhao@xxxxxxxxx>
> Sent: Thursday, December 4, 2008 12:43:14 AM
> Subject: Re: [RESEND][PATCH] PCI: Reassign page-aligned memory resources to device for pci passthrough.
>
> What do you think about supporting both boottime and runtime re-assignment.
>
> I find we can re-assign resources using fakephp driver. What I need to
> do is enhancing my patch to set "pagealignmem" parameter at runtime.
>
> I think it is good to create "/proc/bus/pci/pagealignmem".
>
> Re-assignment can be done as follows.
>
> 1. Sets SSSS:DD:BB.F of the device to "pagealignmem" parameter at runtime.
> 2. Hot-remove the device
> 3. Hot-add it.
>
>
> If you agree with me, I will submit new patch.
>
> Thanks,
> --
> Yuji Shimada
>
> On Mon, 1 Dec 2008 12:55:19 -0800
> Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:
>
> > On Tuesday, November 25, 2008 9:51 pm Yuji Shimada wrote:
> > > Thank you for your reply.
> > > See below.
> > >
> > > On Wed, 19 Nov 2008 15:08:34 -0800
> > >
> > > Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:
> > > > On Wednesday, November 12, 2008 9:11 pm Yuji Shimada wrote:
> > > > > > Like Matthew said, reassigning at runtime should be possible, but can
> > > > > > be a little trickier since your configuration has to disallow driver
> > > > > > binding (or unbind everything) until after you've done your
> > > > > > reassignments. For v12n setups that doesn't seem wholly
> > > > > > unreasonable, but probably isn't as convenient as simply doing it at
> > > > > > startup time.
> > > > >
> > > > > Reassigning at runtime needs suspending device driver. Device driver
> > > > > should hide the fact of suspending from upper layer. To achieve this,
> > > > > device driver should queue requests from upper layer during
> > > > > suspending.
> > > >
> > > > Yuji-san, after talking with a few more v12n folks, I think it would be
> > > > best to bite the bullet and move over to a runtime re-allocation scheme,
> > > > rather than the boot time scheme you have here. This has the advantage
> > > > of not relying on specific bus numbers (which will be reordered on many
> > > > machines), and also provides more flexibility to users who want to
> > > > re-configure their guest device assignments w/o rebooting. I'm not sure
> > > > if there's a guest assignment ioctl that you could hook the reallocation
> > > > into, if not you might want to add a sysfs file or something to allow the
> > > > user to disable/re-enable the device with the larger alignment
> > > > constraints.
> > > >
> > > I have no plan to create the patch for runtime re-allocation, while I
> > > agree with you it is useful.
> >
> > I hope someone from one of the virtualization projects will step up to do
> > this...
> >
> > > > > Currently linux does not have such mechanism. So reassigning at
> > > > > runtime is big challenge.
> > > >
> > > > I think you'd have to unbind the driver altogether in this case. But if
> > > > you're assigning a device to a guest, there shouldn't be any host driver
> > > > bound to it, right?
> > > >
> > > The device is not bound by host driver, though it is bound by dummy
> > > driver called 'pciback'.
> > >
> > > > Maybe I'm not understanding the problem here though, can
> > > > you provide a concrete example?
> > > >
> > > Please consider dual port NIC or dual port HBA. One port will be
> > > assigned to guest, other port is used by host. When I need to reassign
> > > resources to all device behind PCI-PCI bridge, I have to stop device
> > > driver which binds to port used by host.
> >
> > Right, maybe I assume too much here; my guess was that a configuration like
> > that would be specified in some sort of boot scripts, to prevent host drivers
> > from binding to specific sub functions if needed, that way you'd avoid the
> > need to unbind.
> >
> > > > > I can make sure no other resources fall into the same page,
> > > > > specifying device which have small resource, with "pci=pagealignmem=".
> > > > >
> > > > > So I'd like to keep this.
> > > >
> > > > I think the advantage of rounding up to a page outweighs the
> > > > disadvantage, since it means we don't have to change the core resource
> > > > code (it's already fragile enough, it would be best if we could avoid
> > > > touching it).
> > > >
> > > I am willing to modify my patch to make sure the size is at least a page,
> > > if boot-time re-allocation is acceptable.
> >
> > I don't have a problem in principle with boot time re-allocation, but in order
> > to make it acceptable for upstream we'll have to figure out a way of dealing
> > with the problem of changing PCI bus numbers (and therefore boot parameters)
> > across kernel versions, firmware whims, and hardware changes. And yes, to
> > make it less invasive I'd like to round up to a page size as well.
> >
> > I was hoping that a runtime scheme would be acceptable to you, since that
> > avoids the problem entirely. It doesn't sound like runtime will work for you
> > though, so we're left with the bus numbering problem.
> >
> > Can you think of any solutions? Matthew suggested simply reassigning all
> > devices; though I can imagine that this would be problematic on systems where
> > we don't pick up all the system I/O resources used by the firmware for
> > example, and it might also cause us to run out of space much sooner than we
> > would otherwise. Another option might be to limit it to specific PCI device
> > classes, or some combination of class, vendor and device ID.
> >
> > --
> > Jesse Barnes, Intel Open Source Technology Center
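
For reference, the two arithmetic questions raised in this thread (rounding a
small resource up to a page, and whether a resource with a larger alignment
such as 1MB still fits in the bridge window) can be sketched in plain C. This
is only an illustration under stated assumptions: align_up(), struct window
and fits_in_window() are hypothetical names made up for this sketch, not
kernel interfaces, and the sketch does not model how the PCI core actually
sizes or assigns bridge windows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round value up to the next multiple of align.
 * Assumption: align is a non-zero power of two (e.g. 4KB page, 1MB). */
static uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Hypothetical model of a bridge memory window, inclusive on both ends. */
struct window {
    uint64_t start;
    uint64_t end;
};

/* Try to place a resource of bar_size bytes at the first suitably aligned
 * address at or after 'from'. The size is rounded up to the alignment too,
 * so that no other resource can share the last page/block.
 * Returns true and stores the address in *placed if it fits in the window,
 * false if the window would have to be expanded. */
static bool fits_in_window(const struct window *w, uint64_t from,
                           uint64_t bar_size, uint64_t align,
                           uint64_t *placed)
{
    uint64_t base = align_up(from, align);
    uint64_t size = align_up(bar_size, align);

    if (size == 0 || base < w->start || base > w->end)
        return false;
    if (size - 1 > w->end - base)
        return false;

    *placed = base;
    return true;
}
```

With a 1MB window at 0xfe000000 and the next free address at 0xfe000100, a
256-byte resource fits when aligned to a 4KB page, but not when aligned to
1MB. The second case is the one where the bridge window itself would need to
be re-sized.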