Re: [net-next v5 03/15] devlink: Introduce PCI SF port flavour and port attribute

On Wed, 2020-12-16 at 15:59 -0800, Jakub Kicinski wrote:
> On Wed, 16 Dec 2020 03:42:51 +0000 Parav Pandit wrote:
> > > From: Jakub Kicinski <kuba@xxxxxxxxxx>
> > > So subfunctions don't have a VF id but they may have a
> > > controller?
> > >  
> > Right. SF can be on external controller.
> >  
> > > Can you tell us more about the use cases and deployment models
> > > you're
> > > intending to support? Let's not add attributes and info which
> > > will go unused.
> > >   
> > External will be used the same way how it is used for PF and VF.
> > 
> > > How are SFs supposed to be used with SmartNICs? Are you assuming
> > > single
> > > domain of control?  
> > No. it is not assumed. SF can be deployed from smartnic to external
> > host.
> > A user has to pass appropriate controller number, pf number
> > attributes during creation time.
> 
> My problem with this series is that I've gotten some real life
> application exposure over the last year, and still I have no idea 
> who is going to find this feature useful and why.
> 
> That's the point of my questions in the previous email - what
> are the use cases, how are they going to operate.
> 

The main focus of this feature is scalability: we want to run
thousands of containers/VMs. This is useful for both the SmartNIC and
bare-metal hypervisor worlds, where security and control are exclusive to
the eswitch manager, be it the SmartNIC embedded CPU or the x86
hypervisor.

The deployment model is identical to SR-IOV; the only difference is the
instantiation model of the SF, which is the main discussion point of this
series (I hope), and which to my taste is very modest and minimal.
Once the SF is instantiated, nothing from that point on is new: the SF
exposes the standard Linux netdev/rdma interfaces, identical to what a VF
does. Most likely you will assign them to a namespace and pass them
through to a container, or assign them (not direct assignment) to a VM
via the virt stack, or create a vdpa instance and pass it to a virtio
interface.

There are endless use cases for the netdev stack, for customers who want
high-scale virtualized/containerized environments, with thousands of
network functions that can deliver high speed and full offload
accelerators: native XDP, crypto, encap/decap, and HW filtering and
processing pipeline capabilities.

I have a long list of customers with various different applications,
and I am not even talking about the rdma and vdpa customers! Those
customers just can't wait to leave SR-IOV behind and scale up!

This feature has a lot of value to netdev users precisely because of its
minimal footprint on the netdev stack (to be honest, there is no change
in netdev, only a thin API layer in devlink) and the immediate and
effortless benefit of deploying multiple (accelerated) netdevs at scale.


> It's hard to review an API without knowing the use of it. iproute2
> is low level plumbing.
> 

I don't know how to put this, so let me try:
A) SR-IOV model
echo 128 > /sys/class/net/eth0/device/sriov_numvfs
unbind vf

ip set vf attribute x
configure representor .. 
deploy vf/netdev/rdma interface into the container

B) SF model
You do (everything under the devlink umbrella/switchdev):
for i in {1..1024} ; do
devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum $i
devlink port sf $i set attribute x
done

# from here on, it is identical to a VF
configure representor
deploy sf/netdev/rdma interfaces into a container

B is more scalable and gives the user more visibility and control.
After you create the SFs, deployment and use cases are identical to
the SR-IOV VF use cases.

See the improvement ? :)
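
To make the SF creation step concrete without touching real hardware, here
is a minimal dry-run sketch that only prints the per-SF creation commands.
The device address, pfnum and the "devlink port add" syntax are taken from
the example above; the helper name sf_add_cmds and the count argument are
hypothetical, used for illustration only.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) one "devlink port add" per SF.
# DEV and PFNUM mirror the example in this mail; sf_add_cmds is a
# hypothetical helper, not part of devlink or iproute2.
DEV="pci/0000:06:00.0"
PFNUM=0

sf_add_cmds() {
    # $1 = number of SFs to create, numbered 1..$1
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        echo "devlink port add $DEV flavour pcisf pfnum $PFNUM sfnum $i"
        i=$((i + 1))
    done
}

# Print the commands for three SFs.
sf_add_cmds 3
```

Piping the output into "sh" on a system that actually supports the pcisf
flavour would run the commands; printing them first makes it easy to
review what is about to be created.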

> Here the patch is adding the ability to apparently create a SF on 
> a remote controller. If you haven't thought that use case through
> just don't allow it until you know how it will work.
> 

We have thought the use case through; it is no different from the
local controller use case. The code is uniform; we would need to work
hard to block a remote controller :) ..
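
As a rough illustration of that uniformity, the sketch below builds the
creation command for a local and for an external controller. It assumes
the controller number is passed as a "controller N" attribute on the same
"devlink port add" line; that spelling is an assumption here, since this
thread only says that the user passes the controller number and pf number
at creation time. The helper name sf_add_cmd and the sfnum value are
hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: the local and external controller cases differ by one
# attribute only. The "controller" keyword is an ASSUMPTION about the
# command-line spelling; sf_add_cmd is a hypothetical helper.
sf_add_cmd() {
    # $1 = sfnum, $2 = optional controller number (omit for local)
    cmd="devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum $1"
    if [ -n "${2:-}" ]; then
        cmd="$cmd controller $2"
    fi
    echo "$cmd"
}

sf_add_cmd 88      # SF on the local controller
sf_add_cmd 88 1    # SF on external controller 1
```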

> > > It seems that the way the industry is moving the major
> > > use case for SmartNICs is bare metal.
> > > 
> > > I always assumed nested eswitches when thinking about SmartNICs,
> > > what
> > > are you intending to do?
> > >  
> > Mlx5 doesn't support nested eswitch. SF can be deployed on the
> > external controller PCI function.
> > But this interface neither limited nor enforcing nested or flat
> > eswitch.
> >  
> > > What are your plans for enabling this feature in user space
> > > project?  
> > Do you mean K8s plugin or iproute2? Can you please tell us what
> > user space project?
> 
> That's my question. For SR-IOV it'd be all the virt stacks out there.
> But this can't do virt. So what can it do?
> 

You are thinking of VF direct assignment, but don't forget that
virt handles netdev assignment to a VM perfectly fine, and an SF has a
netdev.

And don't get me started on the weird virt handling of SR-IOV VFs; the
whole thing is a big mess :) It shouldn't be a de facto standard that
we need to follow..



