Re: [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count

Alexander Duyck <alexander.duyck@xxxxxxxxx> · Thu, 11 Mar 2021 18:53:16 -0800

On Thu, Mar 11, 2021 at 3:21 PM Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> On Thu, Mar 11, 2021 at 01:49:24PM -0800, Alexander Duyck wrote:
> > > We don't need to invent new locks and new complexity for something
> > > that is trivially solved already.
> >
> > I am not wanting a new lock. What I am wanting is a way to mark the VF
> > as being stale/offline while we are performing the update. With that
> > we would be able to apply similar logic to any changes in the future.
>
> I think we should hold off doing this until someone comes up with HW
> that needs it. The response time here is microseconds, it is not worth
> any complexity

I disagree. Take a look at section 8.5.3 in the NVMe document that was
linked to earlier:
https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4a-2020.03.09-Ratified.pdf

This is exactly what they are doing and I think it makes a ton of
sense. The VF has to be taken "offline" before you are allowed to
start changing resources on it. Supporting that here would cost one
extra sysfs file, and the mechanism has uses beyond just the
configuration of MSI-X vectors.

Beyond that sysfs file, we would maybe modify the "dead" device flag
to be "offline", and we could make this work with minimal changes to
the patch set you already have. We could probably toggle to "offline"
while holding just the VF lock. To toggle the VF back to being
"online" we might need to take the PF device lock as well, since the
PF is ultimately responsible for guaranteeing we have the resources.
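
To make the locking concrete, here is a rough userspace sketch of the
flow I have in mind. It is only a model: pthread mutexes stand in for
the device locks, and every name in it (pf_model, vf_model,
vf_set_offline, vf_request_msix, vf_set_online) is made up for
illustration and does not exist in the kernel or in your patch set.
The PF-before-VF lock ordering and the -EBUSY/-ENOSPC return values
are assumptions as well.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a PF that owns a pool of MSI-X vectors. */
struct pf_model {
	pthread_mutex_t lock;
	unsigned int free_vectors;	/* vectors not assigned to any VF */
};

/* Hypothetical model of a VF; "offline" plays the role of the flag. */
struct vf_model {
	pthread_mutex_t lock;
	struct pf_model *pf;
	bool offline;
	unsigned int msix_count;	/* vectors currently committed */
	unsigned int msix_req;		/* count requested while offline */
};

/* Taking the VF offline only needs the VF lock. */
void vf_set_offline(struct vf_model *vf)
{
	pthread_mutex_lock(&vf->lock);
	vf->offline = true;
	pthread_mutex_unlock(&vf->lock);
}

/* Resource changes are rejected unless the VF is offline. */
int vf_request_msix(struct vf_model *vf, unsigned int count)
{
	int ret = 0;

	pthread_mutex_lock(&vf->lock);
	if (!vf->offline)
		ret = -EBUSY;
	else
		vf->msix_req = count;
	pthread_mutex_unlock(&vf->lock);
	return ret;
}

/*
 * Going back online takes the PF lock first, then the VF lock, because
 * the PF has to confirm the pool can cover the request. On failure the
 * VF stays offline and keeps its old vector count.
 */
int vf_set_online(struct vf_model *vf)
{
	struct pf_model *pf = vf->pf;
	unsigned int avail;
	int ret = 0;

	pthread_mutex_lock(&pf->lock);
	pthread_mutex_lock(&vf->lock);
	avail = pf->free_vectors + vf->msix_count;
	if (vf->msix_req > avail) {
		ret = -ENOSPC;
	} else {
		pf->free_vectors = avail - vf->msix_req;
		vf->msix_count = vf->msix_req;
		vf->offline = false;
	}
	pthread_mutex_unlock(&vf->lock);
	pthread_mutex_unlock(&pf->lock);
	return ret;
}
```

The point of the sketch is that the only path needing the PF lock is
the transition back to "online"; everything else is serialized by the
VF lock and the flag.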

Another way to think of this is that we are essentially pulling a
device back after we have already allocated the VFs, reconfiguring
it, and then pushing it back out for usage. Having a flag that we
could set on the VF device to say it is "under construction",
"being modified", or "not ready for use" would be quite useful I
would think.