On Thu, May 23, 2013 at 4:16 PM, Yinghai Lu <yinghai@xxxxxxxxxx> wrote: > On Thu, May 23, 2013 at 2:23 PM, Benjamin Herrenschmidt > <benh@xxxxxxxxxxxxxxxxxxx> wrote: >> On Thu, 2013-05-23 at 14:00 -0700, Yinghai Lu wrote: >>> Yes, that is what patch try to do. Keep the mmio allocation from first >>> try, and only retry with io port. >> >> Can you describe more precisely what happens with the *current* code ? >> >> Ie. The first pass allocates my device MMIO regions fine. The second >> pass them spew some error messages about some mem allocation. I can >> still observe *something* being assigned to devices and in the (limited) >> setup I've been able to test with, so far, the devices seemed to still >> work, so I don't have a good grasp of the extent of the risk here. > > after bus_size_bridge, will have two list > a. head that is for must-have resource > b. realloc_head that is for optional size. > > then __assign_resources_sorted: > in the first try, > clone head list to saved list, then update head list with optional size > from realloc_head list. and try to assign resources, fail resources > will be added to local_fail_head. > > retry: > if local_fail_head is empty, mean must+optional are all allocated, so done. > > if local_fail_head is not empty, will restore head list from saved list. > then try to assign resource, fail resources will be add to global resources. > then try to extend (reassign) resources with optional size from realloc_size. > >> >> Is there a chance that this failed "second pass" actually prevents >> proper allocation of resources ? > > failed "second pass", should only fail to extend optional resources > like resource > for SRIOV, because fail to allocate resource compactly. > > in your case, "second pass" assign must-have, then extend optional way. > > pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d01081ffffff] > pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01082000000-0x3d010837fffff pref] > pci 0001:00:00.0: BAR 7: can't assign io (size 0x3000) ==> cause first try fail > pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d010817fffff] > pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01081800000-0x3d010827fffff pref] > ===> go extend now > pci 0001:00:00.0: BAR 8: reassigned [mem 0x3d01082800000-0x3d010847fffff] > pci 0001:00:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref] > ==> luckly it will extend ok, but it will stay in the MIDDLE. > > then come the poor 0001:01:00.0 > > pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01082800000-0x3d01083ffffff] > pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010817fffff pref] > pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01084000000-0x3d0108403ffff] > pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000) ===> cause first try fail > pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01082800000-0x3d010837fffff] > pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d01080ffffff pref] > pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01083800000-0x3d0108383ffff] > ===> go extend optional now > pci 0001:01:00.0: BAR 8: can't assign mem (size 0x1000000) > pci 0001:01:00.0: failed to add 800000 res[8]=[mem > 0x3d01082800000-0x3d010837fffff] > pci 0001:01:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref] > pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000) > ===> so BAR8 can not extend now...because other lay on the middle... > > > So we need to patch that will prevent us to fall into "allocate must > and extend it" > trap as it will not allocate resource efficiently. > > otherwise in your case, devices pci0001:01:00.0 will have problem to use SRIOV. Whew. I think I finally see what you're getting at. From the log in Ben's very first email, it looks like we initially assign two 24MB MEM windows to the 0001:01:00.0 bridge: pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01081800000-0x3d01082ffffff] pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010817fffff pref] After the reassignment brouhaha, the prefetchable window is unchanged, but the non-prefetchable window has been reduced in size to 8MB: pci 0001:01:00.0: PCI bridge to [bus 02-0d] pci 0001:01:00.0: bridge window [mem 0x3d01081800000-0x3d01081ffffff] pci 0001:01:00.0: bridge window [mem 0x3d01080000000-0x3d010817fffff pref] That should not happen, because the only reason we're reassigning is for IO space, and we shouldn't be touching the MEM assignments. I agree that we should fix that. I do not agree that your current patches are the right way to fix this. You are trying to make a relatively small patch that we can wedge into v3.10 in a hurry. I would prefer that we work out a cleaner solution that simplifies reassignment, *then* figure out the best way to backport it as needed. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html