PCI VF driver with long delay in remove routine

Hi All,

I've been investigating some delays in the i40e and i40evf drivers when interacting with large numbers of VF devices. Specifically, I've been trying to eliminate the long delays that result from looping over many VFs in sequence, such as when loading, unloading, or resetting them.

One place I've been looking at is sriov_disable(), which runs when the VFs are removed. This function loops over all of the VFs and removes them one at a time.
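
For reference, the flow is roughly the following (paraphrased from drivers/pci/iov.c, not an exact quote):

static void sriov_disable(struct pci_dev *dev)
{
        int i;

        for (i = 0; i < dev->sriov->num_VFs; i++)
                pci_iov_remove_virtfn(dev, i);

        /* ... then clear VF Enable, update num_VFs, etc. ... */
}

pci_iov_remove_virtfn() ends up unbinding the VF's driver, so each VF driver's ->remove() callback (and any delays in it) runs to completion before the loop moves on to the next VF.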

My VF driver's remove routine has a few msleep(50) calls, added to delay freeing memory until the shutdown procedures have finished. I've been told by other developers that these delays are necessary and that I can't simply remove them.
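
To make the shape of the problem concrete, here is a stripped-down sketch of what such a remove routine looks like (the my_vf_* names are made up for illustration; this is not the actual i40evf code):

static void my_vf_remove(struct pci_dev *pdev)
{
        struct my_vf_adapter *adapter = pci_get_drvdata(pdev);

        /* ask the PF/firmware to start tearing down our resources */
        my_vf_request_shutdown(adapter);

        /* wait for outstanding admin queue traffic to drain before we free
         * the memory it may still reference */
        msleep(50);

        my_vf_free_admin_queue(adapter);

        /* let the hardware settle before unmapping and freeing the rest */
        msleep(50);

        my_vf_free_resources(adapter);
        pci_disable_device(pdev);
}

With a couple of these 50ms sleeps per VF, removing a hundred or more VFs back to back spends most of that 13 seconds doing nothing but sleeping.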

The result is that removing the i40e PF driver, or disabling all of its VFs, can take an inordinate amount of time (upwards of 13 seconds).

In other code locations, I was able to avoid this by amortizing the flow: start the shutdown process for all VFs, wait once, and then finish the shutdown for each of them. In this case I can't do that, because the flow is split between the PCI IOV core code and the VF driver code.
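
Roughly, the amortized flow in those places looks like this (again with made-up helper names):

/* start teardown on every VF first ... */
for (i = 0; i < num_vfs; i++)
        my_vf_start_shutdown(&vf[i]);

/* ... pay the settle time once for all of them ... */
msleep(50);

/* ... and only then free each VF's resources */
for (i = 0; i < num_vfs; i++)
        my_vf_finish_shutdown(&vf[i]);

That turns N * 50ms into a single 50ms wait, but it only works when one piece of code drives the whole loop.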

I'm writing to ask if anyone has solved similar issues with other drivers. I've thought of a few possible solutions:

1) changing the code in sriov_disable() somehow so that it can remove the VFs in parallel. One possibility is to use a workqueue (a rough sketch follows this list). This obviously adds a lot of complexity to the remove path.

2) some sort of 2-stage removal setup where we can notify a VF that it will be shutting down soon, so that it can get the delay portion of its teardown started early and then finish shutting down afterwards.

3) modifying the driver so that it can unlink and hold references to the memory it needs to free, and then can simply handle this task in a separate workqueue with a reference counted structure of some sort.
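
For (1), the rough shape I have in mind is to push the expensive part (the driver unbind, which is where the VF driver's ->remove() and its sleeps run) onto an unbound workqueue so the per-VF delays overlap, and leave only the cheap device teardown in the serial loop. A sketch, glossing over error handling and any locking against concurrent rescans:

#include <linux/device.h>
#include <linux/pci.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct vf_unbind_work {
        struct work_struct work;
        struct pci_dev *vf_dev;
};

static void vf_unbind_fn(struct work_struct *work)
{
        struct vf_unbind_work *w = container_of(work, struct vf_unbind_work, work);

        /* runs the VF driver's ->remove(), including its msleep()s,
         * concurrently with the other work items */
        device_release_driver(&w->vf_dev->dev);
        pci_dev_put(w->vf_dev);
        kfree(w);
}

static void unbind_all_vfs_in_parallel(struct pci_dev **virtfn, int num_vfs)
{
        struct workqueue_struct *wq;
        int i;

        wq = alloc_workqueue("sriov_unbind", WQ_UNBOUND, 0);
        if (!wq)
                return; /* caller falls back to the existing serial path */

        for (i = 0; i < num_vfs; i++) {
                struct vf_unbind_work *w = kzalloc(sizeof(*w), GFP_KERNEL);

                if (!w)
                        continue;
                w->vf_dev = pci_dev_get(virtfn[i]);
                INIT_WORK(&w->work, vf_unbind_fn);
                queue_work(wq, &w->work);
        }

        /* destroy_workqueue() flushes first, so every ->remove() has
         * finished by the time this returns */
        destroy_workqueue(wq);
}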

I've implemented (1), but it doesn't seem like a good solution because it fixes the problem in only one place, and I think only when the VFs are on bare metal (I'm not sure how it would work when the VFs are assigned to VMs). I've thought about (2), but that seems to require some new core infrastructure, and I'm not sure how it should be handled.

I think (3) might be the best route, but I'm really not sure what sort of guarantees the remove routine needs to make. What must be fully shut down by the time this function exits if some of the work is deferred to a work task?
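
To make (3) concrete, here is the earlier my_vf_remove() sketch reworked (again, entirely hypothetical names): everything the delayed free still needs is unlinked from the adapter and handed to a work item, which waits out the delay and then drops the last reference.

#include <linux/delay.h>
#include <linux/kref.h>
#include <linux/pci.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct my_vf_deferred {
        struct kref kref;
        struct work_struct work;
        void *rings;                    /* whatever must outlive ->remove() */
};

struct my_vf_adapter {
        struct my_vf_deferred *deferred;
        /* ... */
};

static void my_vf_deferred_release(struct kref *kref)
{
        struct my_vf_deferred *d = container_of(kref, struct my_vf_deferred, kref);

        kfree(d->rings);
        kfree(d);
}

static void my_vf_deferred_fn(struct work_struct *work)
{
        struct my_vf_deferred *d = container_of(work, struct my_vf_deferred, work);

        msleep(50);                     /* the settle time moves out of ->remove() */
        kref_put(&d->kref, my_vf_deferred_release);
}

static void my_vf_remove(struct pci_dev *pdev)
{
        struct my_vf_adapter *adapter = pci_get_drvdata(pdev);
        struct my_vf_deferred *d = adapter->deferred; /* kref_init()ed at probe */

        /* quiesce the VF: stop queues, disable interrupts, etc. */

        /* hand the adapter's reference to the work item; any other path
         * still using d would hold its own kref */
        adapter->deferred = NULL;
        INIT_WORK(&d->work, my_vf_deferred_fn);
        schedule_work(&d->work);

        /* ->remove() itself returns without sleeping */
        pci_disable_device(pdev);
}

The open question is exactly what the PCI core and driver core assume has been torn down by the time ->remove() returns, and whether a pending work item like this is allowed to outlive it (and, worse, the module).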

I feel like I can't be the first person to have run into this, and I know it will only get worse as newer parts support more and more virtual functions. Imagine the case where someone has loaded a few hundred or a few thousand VFs and shutdown suddenly starts taking minutes.

Thanks,
Jake




