On Tue, Jan 07, 2020 at 02:34:35PM -0600, Bjorn Helgaas wrote: > On Mon, Jan 06, 2020 at 03:48:06PM +0000, Nicholas Johnson wrote: > > Remove checks for resource size in extend_bridge_window(). This is > > necessary to allow the pci_bus_distribute_available_resources() to > > function when the kernel parameter pci=hpmemsize=nn[KMG] is used to > > allocate resources. Because the kernel parameter sets the size of all > > hotplug bridges to be the same, there are problems when nested hotplug > > bridges are encountered. Fitting a downstream hotplug bridge with size X > > and normal bridges with non-zero size Y into parent hotplug bridge with > > size X is impossible, and hence the downstream hotplug bridge needs to > > shrink to fit into its parent. > > s/extend_bridge_window()/adjust_bridge_window()/ above > s/to allow the/to allow/ Okay > > If this patch allows pci_bus_distribute_available_resources() to > function when pci=hpmemsize=nn is used, what happens *before* this > patch? The text implies that pci_bus_distribute_available_resources() > doesn't function, but what happens? Do we try to assign a downstream > bridge requiring X+n inside an upstream window of size X and the > assignment fails, leaving the downstream bridge unusable? I could add something similar to this to the log: The hpmemsize is applied to add_size of every hotplug bridge, even nested ones. Say we set hpmemsize=256M, the upstream hotplug bridge gets 256M. Then when we hot-add a Thunderbolt device with daisy chaining, the new nested bridge also gets 256M and this will not fit because some further space has been consumed by the endpoints in the Thunderbolt device. Hence, we cannot extend. It works for Mika because he is interested in the cases when the firmware assigns the resources, hence hpmemsize=2M (default) and it does not cause problems, unless we run out of space and need to go below 2M. > > > Add check for if bridge is extended or shrunken and reflect that in the > > call to pci_dbg(). > > > > Reset the resource if its new size is zero (if we have run out of a > > bridge window resource) to prevent the PCI resource assignment code from > > attempting to assign a zero-sized resource. > > > > Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@xxxxxxxxxxxxxx> > > --- > > drivers/pci/setup-bus.c | 17 ++++++++++++----- > > 1 file changed, 12 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c > > index 0c51f4937..e7e57bf72 100644 > > --- a/drivers/pci/setup-bus.c > > +++ b/drivers/pci/setup-bus.c > > @@ -1836,18 +1836,25 @@ static void adjust_bridge_window(struct pci_dev *bridge, struct resource *res, > > struct list_head *add_list, > > resource_size_t new_size) > > { > > - resource_size_t add_size; > > + resource_size_t add_size, size = resource_size(res); > > > > if (res->parent) > > return; > > > > - if (resource_size(res) >= new_size) > > - return; > > + if (new_size > size) { > > + add_size = new_size - size; > > + pci_dbg(bridge, "bridge window %pR extended by %pa\n", res, > > + &add_size); > > + } else if (new_size < size) { > > + add_size = size - new_size; > > + pci_dbg(bridge, "bridge window %pR shrunken by %pa\n", res, > > + &add_size); > > + } > > Where's the patch that changes the caller so "new_size" may be smaller > than "size"? I guess it must be "[3/3] PCI: Consider alignment of > hot-added bridges ..." because that's the only one that makes a > non-trivial change, right? As above, there was always a possibility of the new_size being smaller. For some reason, 1M is assigned to bridges, even if nothing is below them (for example, unused non hotplug bridges in a Thunderbolt dock). It may be an edge case if we are low on space, but theoretically it can happen. Also, when writing this, Mika was not interested in using hpmemsize, which, when used, will cause new_size to be smaller than the current size (actual size and add_size combined). So it does not need a patch to cause "new_size" to be smaller than "size" - just a change in user behaviour to use pci=hpmemsize. > > > - add_size = new_size - resource_size(res); > > - pci_dbg(bridge, "bridge window %pR extended by %pa\n", res, &add_size); > > res->end = res->start + new_size - 1; > > remove_from_list(add_list, res); > > + if (!new_size) > > + reset_resource(res); > > I consider reset_resource() to be deprecated because it throws away > res->flags, which tells us what kind of resource it is > (mem/io/32-bit/64-bit/prefetchable). We learn this during > enumeration, and we shouldn't forget the information until we remove > the device. I will look at this, but I distinctly remember doing this because of IO BARs which would run out and cause the device not to be enabled. Can you please comment on IORESOURCE_UNSET and what effect that would have if applied to the flags. Also, can you suggest any other solution other than making it handle zero-sized resources better? I agree, that would be ideal to make it handle zero-sized resources, but given the state of drivers/pci/setup-bus.c, I feel like it will just recursively open up more cans of worms until we metaphorically stack overflow. And I am happy to go down that path. But at some point I feel we will have to make some compromises / stop-gap measures to apply some patches and make progress, before going down that road. Would you agree? > > If the resource assignment code doesn't do the right thing with a > zero-sized resource, I think we should fix that code. Clearing the > resource struct does nothing with the hardware BAR or window > registers, so the BAR/window remains enabled unless we do something > more. If we don't need a window and we want to disable it, we can do > that, but it requires writing special values to the hardware > registers. > > Bjorn https://lkml.org/lkml/2020/1/7/1544 You describe this as "black magic code", what appears to be the assignment code which handles lists of resources. And I agree. I believe it is in both our interests to avoid using add_size because nobody understands how these are handled. There may be bugs, and there is definitely lots of complexity involved. I believe simplicity is key. Hence why these changes in this series: - Change resource size directly instead of using add_size - Aside from the currently known cases of needing to shrink the resource, we cannot know that there will not be more cases of this in the future. There is no need for preventing it from shrinking - we have an available size for the bridge window, and if that happens to be smaller than the bridge window, we have no choice but to shrink. I believe this makes the check unnecessary and warrants removal. Regards, Nicholas