Re: [RFC PATCH] pci: add hook for architectures to disble SR-IOV at runtime

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ben,

On 07.10.2011 [08:48:20 +0200], Benjamin Herrenschmidt wrote:
> 
> > Are you saying that Power is intrinsically unable to support SR-IOV?
> 
> The 'pseries' platform on power is, at least with the current
> hypervisor, as the HV doesn't allocate resources for the IOV BAR and the
> kernel is not allowed to mess around with PCI resources allocation.
> 
> In the long run, they -may- fix that, but currently, those machines must
> not enable SR-IOV or bad things happen (sriov_init enables VFs which
> causes the PF driver to fail as well).
> 
> > Or is it merely that we can't use SR-IOV on this particular
> > machine/configuration because the BAR assignments are fixed and there
> > isn't enough space for the VF BARs?
> >
> > I would expect the latter, and therefore it seems wrong to disable it
> > across the board.  Perhaps on another machine with different PCI
> > resource assignments, there *would* be space for the VF BARs and
> > SR-IOV could be used.
> 
> No. We can never modify the resource allocation and we know that on all
> current pseries platforms, the HV will not allocate the VF BAR.
> 
> The patch is not ideal though. Ideally, what you'd need to do is:

To be clear here, "the patch" you refer to is the one in this thread? Or
the other one that skips VF BARs in the pci_enable path?

>  - Detect that the VF BAR hasn't been assigned
> 
>  - Combine that with the knowledge that you cannot re-assign resources
>    on the platform (this is a flag in asm-generic/pci-flags.h but that's
>    not available on all archs unfortunately).
> 
> And based on that, don't try to mess around with VFs etc... in
> sriov_init().
> 
> The problem is that damn flag. Those PCI flags are currently used by
> several archs but exclusively from within the arch code and aren't
> defined on x86 for example.
> 
> Maybe we could wrap that into a:
> 
> 	pcibios_can_assign_resources()
> 
> Which I could make return 0 when running on such a platform.

That would, I think, end up looking similar to my patch in this thread
with a weak symbol at the topmost level?

Would the weak symbol decided to look at the PCI flags based upon a
compile-time #ifdef, e.g.,

#ifdef CONFIG_PCI
bool __attribute__((weak)) pcibios_can_assign_resources()
{
#ifdef PCI_PROBE_ONLY
	if (pci_has_flag(PCI_PROBE_ONLY))
		return false;
	return true;
#else
	return true;
#endif
}
#else
bool __attribute__((weak)) pcibios_can_assign_resources()
{
	return false;
}
#endif

?

> > > I defined a weak version of this function that returns true if
> > > CONFIG_PCI_IOV is set (which I think should reflect the current setup
> > > that CONFIG_PCI_IOV represents an unconditional support of SR-IOV) and
> > > false otherwise. The only architecture that implements the hook is
> > > powerpc, which uses a machine callback to decide if a platform supports
> > > SR-IOV or not. I only defined one such callback, for the pseries
> > > platform, and return false unconditionally there.
> > 
> > I think this approach is a band-aid that happens to cover up the
> > problem.  I'd rather fix the PF enable path so it doesn't depend on VF
> > BAR allocation.  That seems like a much cleaner approach to me.  I
> > think your pci_select_bars() patch is acceptable for now, even though
> > I would still like to see us move all the VF BAR stuff out of the
> > pci_dev and into the pci_sriov.
> 
> It's more than just the BAR allocation it seems. The code in
> sriov_init() also unconditionally sets the VF count and change the
> system page size setting, which appears to break some adapters when the
> VF BAR hasn't been assigned. Either that or the attempt at changing the
> page size breaks the firmware or the driver (Nish, have you tried just
> preventing that part ?)

sriov_init() definitely has issues.

I haven't had the time to dive down into sriov_init() to determine which
call it is that actually causes the break. I will try to find some time
to do that either today or on Monday.

> In any case, it makes no sense to turn VFs on etc... that way, I think a
> lot of that code should be moved to be done lazily when the driver
> actually tries to enable a VF. That way the check for the validity of
> the IOV resource can be done at that point too.

Minimally, I think we can remove:

        pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, total);

from sriov_init() as this is going to be overridden by sriov_enable()'s

        pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, nr_virtfn);

?

Thanks,
Nish

-- 
Nishanth Aravamudan <nacc@xxxxxxxxxx>
IBM Linux Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux