What do you think about supporting both boot-time and runtime
re-assignment?

I found that we can re-assign resources using the fakephp driver. What I
need to do is enhance my patch so that the "pagealignmem" parameter can
be set at runtime. I think it would be good to create
"/proc/bus/pci/pagealignmem".

Re-assignment can be done as follows.

1. Set the SSSS:DD:BB.F of the device in the "pagealignmem" parameter
   at runtime.
2. Hot-remove the device.
3. Hot-add it.

If you agree with me, I will submit a new patch.

Thanks,
--
Yuji Shimada

On Mon, 1 Dec 2008 12:55:19 -0800
Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:

> On Tuesday, November 25, 2008 9:51 pm Yuji Shimada wrote:
> > Thank you for your reply.
> > See below.
> >
> > On Wed, 19 Nov 2008 15:08:34 -0800
> >
> > Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:
> > > On Wednesday, November 12, 2008 9:11 pm Yuji Shimada wrote:
> > > > > Like Matthew said, reassigning at runtime should be possible, but can
> > > > > be a little trickier since your configuration has to disallow driver
> > > > > binding (or unbind everything) until after you've done your
> > > > > reassignments. For v12n setups that doesn't seem wholly
> > > > > unreasonable, but probably isn't as convenient as simply doing it at
> > > > > startup time.
> > > >
> > > > Reassigning at runtime needs suspending device driver. Device driver
> > > > should hide the fact of suspending from upper layer. To achieve this,
> > > > device driver should queue requests from upper layer during
> > > > suspending.
> > >
> > > Yuji-san, after talking with a few more v12n folks, I think it would be
> > > best to bite the bullet and move over to a runtime re-allocation scheme,
> > > rather than the boot time scheme you have here. This has the advantage
> > > of not relying on specific bus numbers (which will be reordered on many
> > > machines), and also provides more flexibility to users who want to
> > > re-configure their guest device assignments w/o rebooting.
> > > I'm not sure
> > > if there's a guest assignment ioctl that you could hook the reallocation
> > > into, if not you might want to add a sysfs file or something to allow the
> > > user to disable/re-enable the device with the larger alignment
> > > constraints.
> >
> > I have no plan to create the patch for runtime re-allocation, while I
> > agree with you it is useful.
>
> I hope someone from one of the virtualization projects will step up to do
> this...
>
> > > > Currently linux does not have such mechanism. So reassigning at
> > > > runtime is big challenge.
> > >
> > > I think you'd have to unbind the driver altogether in this case. But if
> > > you're assigning a device to a guest, there shouldn't be any host driver
> > > bound to it, right?
> >
> > The device is not bound by host driver, though it is bound by dummy
> > driver called 'pciback'.
> >
> > > Maybe I'm not understanding the problem here though, can
> > > you provide a concrete example?
> >
> > Please consider dual port NIC or dual port HBA. One port will be
> > assigned to guest, other port is used by host. When I need to reassign
> > resources to all device behind PCI-PCI bridge, I have to stop device
> > driver which binds to port used by host.
>
> Right, maybe I assume too much here; my guess was that a configuration like
> that would be specified in some sort of boot scripts, to prevent host drivers
> from binding to specific sub functions if needed, that way you'd avoid the
> need to unbind.
>
> > > > I can make sure no other resources fall into the same page,
> > > > specifying device which have small resource, with "pci=pagealignmem=".
> > > >
> > > > So I'd like to keep this.
> > >
> > > I think the advantage of rounding up to a page outweighs the
> > > disadvantage, since it means we don't have to change the core resource
> > > code (it's already fragile enough, it would be best if we could avoid
> > > touching it).
> >
> > I am willing to modify my patch to make sure the size is at least a page,
> > if boot-time re-allocation is acceptable.
>
> I don't have a problem in principle with boot time re-allocation, but in order
> to make it acceptable for upstream we'll have to figure out a way of dealing
> with the problem of changing PCI bus numbers (and therefore boot parameters)
> across kernel versions, firmware whims, and hardware changes. And yes, to
> make it less invasive I'd like to round up to a page size as well.
>
> I was hoping that a runtime scheme would be acceptable to you, since that
> avoids the problem entirely. It doesn't sound like runtime will work for you
> though, so we're left with the bus numbering problem.
>
> Can you think of any solutions? Matthew suggested simply reassigning all
> devices; though I can imagine that this would be problematic on systems where
> we don't pick up all the system I/O resources used by the firmware for
> example, and it might also cause us to run out of space much sooner than we
> would otherwise. Another option might be to limit it to specific PCI device
> classes, or some combination of class, vendor and device ID.
>
> --
> Jesse Barnes, Intel Open Source Technology Center
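
The three-step runtime procedure proposed at the top of this message
could be scripted roughly as in the sketch below. It is only an
illustration under stated assumptions: "/proc/bus/pci/pagealignmem" is
the file proposed in this thread and does not exist yet, and the
hot-remove and hot-add control files are passed in by the caller because
their exact location and semantics under fakephp are not specified here.
The function name `reassign_device` and the example address are
hypothetical.

```python
from pathlib import Path


def reassign_device(address, param_file, remove_file, rescan_file):
    """Sketch of the proposed runtime re-assignment sequence.

    address      -- PCI address string of the device (format as accepted
                    by the proposed "pagealignmem" parameter)
    param_file   -- ASSUMPTION: the proposed /proc/bus/pci/pagealignmem
    remove_file  -- ASSUMPTION: a hotplug control file; writing "0"
                    hot-removes the device
    rescan_file  -- ASSUMPTION: a hotplug control file; writing "1"
                    hot-adds the device again
    """
    steps = [
        (Path(param_file), address),  # 1. mark the device for page alignment
        (Path(remove_file), "0"),     # 2. hot-remove the device
        (Path(rescan_file), "1"),     # 3. hot-add it, so resources are re-assigned
    ]
    for path, value in steps:
        path.write_text(value)
    # Return what was written, in order, so callers can log or verify it.
    return [(str(path), value) for path, value in steps]
```

The order matters: the address has to be registered before the device is
hot-added again, so that the page-aligned assignment is applied when its
resources are allocated. The control files are kept as parameters rather
than hard-coded so that the same sequence can be tried against whichever
hotplug interface is actually available.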