Re: Fw: Re: [RESEND][PATCH] PCI: Reassign page-aligned memory resources to device for pci passthrough.

On Wednesday, November 12, 2008 9:11 pm Yuji Shimada wrote:
> > Like Matthew said, reassigning at runtime should be possible, but can be
> > a little trickier since your configuration has to disallow driver binding
> > (or unbind everything) until after you've done your reassignments.  For
> > v12n setups that doesn't seem wholly unreasonable, but probably isn't as
> > convenient as simply doing it at startup time.
>
> Reassigning at runtime requires suspending the device driver. The
> driver should hide the suspension from the upper layers. To achieve
> this, the driver should queue requests from the upper layers while it
> is suspended.

Yuji-san, after talking with a few more v12n folks, I think it would be best 
to bite the bullet and move over to a runtime re-allocation scheme, rather 
than the boot time scheme you have here.  This has the advantage of not 
relying on specific bus numbers (which will be reordered on many machines), 
and also provides more flexibility to users who want to re-configure their 
guest device assignments w/o rebooting.  I'm not sure if there's a guest 
assignment ioctl that you could hook the reallocation into; if not, you might 
want to add a sysfs file or something to allow the user to disable/re-enable 
the device with the larger alignment constraints.

> Currently Linux does not have such a mechanism, so reassigning at
> runtime is a big challenge.

I think you'd have to unbind the driver altogether in this case.  But if 
you're assigning a device to a guest, there shouldn't be any host driver 
bound to it, right?  Maybe I'm not understanding the problem here, though; can 
you provide a concrete example?

> > > +		for (i=0; i < PCI_NUM_RESOURCES; i++) {
> > > +			r = &dev->resource[i];
> > > +			if (!(r->flags & IORESOURCE_MEM))
> > > +				continue;
> > > +
> > > +			r->end = r->end - r->start;
> > > +			r->start = 0;
> >
> > Do you also want to make sure the size is at least a page here? Otherwise
> > the page you assign to the guest might contain another device too, right?
> >  Also, increasing the size to at least a page here should make the other
> > alignment checks in setup-bus.c and setup-res.c unnecessary, since size
> > alignment is the default...
>
> That's right.
>
> But, if I make sure the size is at least a page (make sure "r->end -
> r->start >= PAGE_SIZE"), then /proc/iomem and
> /sys/bus/pci/devices/dddd:bb:dd.f/resource will not show the real size.

But it *will* show the actual iomem that the device has assigned to it.  
You're right that it doesn't reflect the actual decode range of the device, 
but any software that wanted to get that could do its own BAR sizing on it, 
or if we really needed to we could save the original size somewhere.
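
To illustrate the "do its own BAR sizing" part, here is a rough userspace
sketch of the arithmetic only, assuming a 32-bit memory BAR and the usual
write-all-ones-then-read-back probe.  The helper name and the mask macro are
made up for this example and are not taken from the patch or from the kernel.

```c
#include <stdint.h>

/* The low four bits of a 32-bit memory BAR are type/prefetch flags,
 * not address bits, so they are masked off before sizing. */
#define MEM_BAR_FLAG_MASK 0xFu

/*
 * Hypothetical helper: given the value read back from a 32-bit memory
 * BAR after writing 0xFFFFFFFF to it, return the number of bytes the
 * device actually decodes.  Returns 0 for an unimplemented BAR, which
 * reads back with no address bits set.
 */
static uint32_t mem_bar_decode_size(uint32_t readback)
{
	uint32_t addr_bits = readback & ~MEM_BAR_FLAG_MASK;

	if (addr_bits == 0)
		return 0;

	/* The writable address bits form a mask; its two's complement
	 * is the size of the decoded region. */
	return ~addr_bits + 1u;
}
```

So a device whose BAR reads back as 0xFFFFFF00 decodes 256 bytes, no matter
how large the resource we hand it in /proc/iomem is.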

> I can make sure no other resources fall into the same page by
> specifying the devices that have small resources with "pci=pagealignmem=".
>
> So I'd like to keep this.

I think the advantage of rounding up to a page outweighs the disadvantage, 
since it means we don't have to change the core resource code (it's already 
fragile enough, it would be best if we could avoid touching it).
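
Roughly what I have in mind, as an untested userspace sketch rather than a
real patch: the struct below is a stand-in for the start/end fields of struct
resource, the orig_size field is a hypothetical place to save the original
size, and the 4 KiB page size is an assumption.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the relevant fields of struct resource. */
struct sketch_resource {
	uint64_t start;
	uint64_t end;		/* inclusive, as in struct resource */
	uint64_t orig_size;	/* hypothetical: decode size before rounding */
};

/*
 * Reset the resource to a zero-based range so it can be reassigned, as
 * the quoted hunk does, but grow anything smaller than a page to a full
 * page first.  Since resources are aligned to their size by default,
 * this gives page alignment without extra alignment checks.
 */
static void sketch_page_align(struct sketch_resource *r)
{
	uint64_t size = r->end - r->start + 1;

	r->orig_size = size;	/* keep the real size for anyone who needs it */
	if (size < SKETCH_PAGE_SIZE)
		size = SKETCH_PAGE_SIZE;

	r->start = 0;
	r->end = size - 1;
}
```

Resources that are already a page or larger keep their size and are only
rebased to start at 0.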

Thanks,
Jesse