Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool

On 2020/12/2 2:24 PM, Parav Pandit wrote:

From: Jason Wang <jasowang@xxxxxxxxxx>
Sent: Wednesday, December 2, 2020 11:21 AM

On 2020/12/2 12:53 PM, Parav Pandit wrote:
From: Yongji Xie <xieyongji@xxxxxxxxxxxxx>
Sent: Wednesday, December 2, 2020 9:00 AM

On Tue, Dec 1, 2020 at 11:59 PM Parav Pandit <parav@xxxxxxxxxx> wrote:

From: Yongji Xie <xieyongji@xxxxxxxxxxxxx>
Sent: Tuesday, December 1, 2020 7:49 PM

On Tue, Dec 1, 2020 at 7:32 PM Parav Pandit <parav@xxxxxxxxxx>
wrote:

From: Yongji Xie <xieyongji@xxxxxxxxxxxxx>
Sent: Tuesday, December 1, 2020 3:26 PM

On Tue, Dec 1, 2020 at 2:25 PM Jason Wang
<jasowang@xxxxxxxxxx>
wrote:
On 2020/11/30 3:07 PM, Yongji Xie wrote:
Thanks for adding me, Jason!

Now I'm working on a v2 patchset for VDUSE (vDPA Device in
Userspace) [1]. This tool is very useful for the vduse device.
So I'm considering integrating this into my v2 patchset.
But there is one problem:

In this tool, the vdpa device config action and enable action are
combined into one netlink msg: VDPA_CMD_DEV_NEW. But in the vduse
case, it needs to be split because a chardev should be
created and opened by a userspace process before we enable
the vdpa device (call vdpa_register_device()).

So I'd like to know whether it's possible (or whether there are
plans) to add two new netlink msgs, something like
VDPA_CMD_DEV_ENABLE and VDPA_CMD_DEV_DISABLE, to make the config
path more flexible.
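
A minimal sketch of the proposed two-step flow, added for illustration only. It is a toy Python model, not kernel code: the class, method, and state names are hypothetical, and only VDPA_CMD_DEV_NEW, VDPA_CMD_DEV_ENABLE, VDPA_CMD_DEV_DISABLE and vdpa_register_device() come from the message above. It assumes that enabling should fail until a userspace process has opened the char device.

```python
from enum import Enum


class State(Enum):
    CONFIGURED = "configured"  # after the config step (VDPA_CMD_DEV_NEW)
    ENABLED = "enabled"        # after the proposed VDPA_CMD_DEV_ENABLE


class VduseDeviceModel:
    """Toy model of the config/enable split; all names are hypothetical."""

    def __init__(self, name):
        self.name = name
        self.state = State.CONFIGURED
        self.chardev_opened = False

    def open_chardev(self):
        # Stands in for the userspace process opening the char device.
        self.chardev_opened = True

    def enable(self):
        # Models VDPA_CMD_DEV_ENABLE, the point at which the kernel
        # would call vdpa_register_device().
        if not self.chardev_opened:
            raise RuntimeError("char device not opened by userspace yet")
        self.state = State.ENABLED

    def disable(self):
        # Models VDPA_CMD_DEV_DISABLE: back to configured-but-not-enabled.
        self.state = State.CONFIGURED
```

With this model, calling enable() before open_chardev() raises an error, while the order config, open, enable succeeds.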
Actually, we discussed such an intermediate step in an earlier
discussion. It looks to me like VDUSE could be one of the users
of this.
Or I wonder whether we can switch to using an anonymous
inode (fd) for VDUSE and then fetch it via a VDUSE_GET_DEVICE_FD
ioctl?
Yes, we can. Actually the current implementation in VDUSE is
like this. But it seems this is still an intermediate step.
The fd should be bound to a name or something else which needs
to be configured beforehand.
The name could be specified via netlink. It looks to me like the
real issue is that until the device is connected to a userspace
process, it can't be used. So we also need to fail the enabling
if it isn't opened.
Yes, that's true. So you mean we can first try to fetch the fd
bound to a name/vduse_id via a VDUSE_GET_DEVICE_FD ioctl, then use
the name/vduse_id as an attribute to create the vdpa device? It
looks fine to me.
I probably do not understand this well. I tried reading patch [1]
and a few things do not look correct, as below.
Creating the vdpa device on the bus device and destroying the
device from
the workqueue seems unnecessary and racy.
It seems vduse driver needs
This is something that should be done as part of the vdpa dev add
command, instead of connecting the two sides separately and
ensuring race-free access to it.
So VDUSE_DEV_START and VDUSE_DEV_STOP should possibly be
avoided.
Yes, we can avoid these two ioctls with the help of the management
tool.
$ vdpa dev add parentdev vduse_mgmtdev type net name foo2

When the above command is executed, it creates the necessary vdpa
device foo2 on the bus.
When the user binds the foo2 device to the vduse driver, the
probe() creates the respective char device to access it from user
space.

I see. So vduse cannot work with any existing vdpa devices like ifc,
mlx5 or
netdevsim.
It has its own implementation similar to fuse with its own backend of
choice.
More below.

But vduse driver is not a vdpa bus driver. It works like vdpasim
driver, but offloads the data plane and control plane to a user space
process.
In that case to draw parallel lines,

1. netdevsim:
(a) create resources in kernel sw
(b) datapath simulates in kernel

2. ifc + mlx5 vdpa dev:
(a) creates resource in hw
(b) data path is in hw

3. vduse:
(a) creates resources in userspace sw
(b) data path is in user space.
hence creates data path resources for user space.
So the char device is created and removed as a result of vdpa device creation.

For example,
$ vdpa dev add parentdev vduse_mgmtdev type net name foo2

The above command will create a char device for user space.

A similar command for ifc/mlx5 would have created a similar channel
for the rest of the config commands in hw.
vduse channel = char device, eventfd etc.
ifc/mlx5 hw channel = bar, irq, command interface etc.
Netdev sim channel = sw direct calls
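
The comparison above can be written down as a small lookup table. This is only an illustrative sketch: the `Backend` dataclass, the `BACKENDS` dict and `describe()` are hypothetical names, and the field values simply restate the three cases listed in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    resources: str  # where the device resources are created
    data_path: str  # where the data path runs
    channel: str    # how config commands reach the backend


# Values restate the comparison in the message; nothing new is claimed.
BACKENDS = {
    "netdevsim": Backend("kernel sw", "kernel (simulated)", "sw direct calls"),
    "ifc/mlx5": Backend("hw", "hw", "bar, irq, command interface"),
    "vduse": Backend("userspace sw", "user space", "char device, eventfd"),
}


def describe(driver):
    """Return a one-line summary for a driver listed in BACKENDS."""
    b = BACKENDS[driver]
    return (f"{driver}: resources in {b.resources}, "
            f"data path in {b.data_path}, channel = {b.channel}")
```

For example, `describe("vduse")` summarizes the vduse row in one line.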

Does it make sense?
In my understanding, to make vdpa work, we need a backend (datapath
resources) and a frontend (a vdpa device attached to a vdpa bus). In
the above example, it looks like we use the command "vdpa dev add ..."
to create a backend, so do we need another command to create a
frontend?
For a block device there is certainly some backend to process the IOs.
Sometimes the backend has to be set up first, before its front end is exposed.
"vdpa dev add" is the front end command which connects to the backend
(implicitly) for a network device.
vhost->vdpa_block_device->backend_io_processor (usr,hw,kernel).

And it needs a way to connect to the backend when explicitly specified
at creation time.
Something like,
$ vdpa dev add parentdev vdpa_vduse type block name foo3 handle <uuid>
In the above example, some vendor device specific unique handle is passed
based on the backend setup in hardware/user space.
In the 3 examples below, the vdpa block simulator connects to a backend
block device or file.
$ vdpa dev add parentdev vdpa_blocksim type block name foo4 blockdev /dev/zero

$ vdpa dev add parentdev vdpa_blocksim type block name foo5 blockdev /dev/sda2 size=100M offset=10M

$ vdpa dev add parentdev vdpa_block filebackend_sim type block name foo6 file /root/file_backend.txt

Or maybe the backend connects when the created vdpa device is bound to
the driver.
Can vduse attach to the created vdpa block device through the char device
and establish the channel to receive IOs, and to set up the block config space?


I think it can work.

Another thing I wonder is whether we should consider more than one VDUSE
parentdev (or management dev). This would allow us to have separate devices
implemented via different processes.
Multiple parentdevs should be possible per driver. For example, mlx5_vdpa.ko will create multiple parent devs, one for each PCI VF or SF.
vdpa dev add can certainly use one parent/mgmt dev to create multiple vdpa devices.
Not sure why we need to create multiple parent devs for that.
I guess there is just one parent/mgmt dev for VDUSE. What will each mgmtdev do differently?
Will the demux of IOs and events be at the individual char dev level?


It could be something like how it works for different hardware vendors. E.g. IFCVF and mlx5 will register different parentdevs. For userspace, we need to allow different software vendors to manage their instances individually.

Thanks



If yes, VDUSE ioctl needs to be extended to register/unregister parentdev.
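
A rough sketch of what per-vendor parentdevs could look like, added for illustration. It is a toy Python model, not the VDUSE ioctl interface: `ParentdevRegistry` and all of its methods are hypothetical, and the rule that device names are unique across all parentdevs is an assumption of this sketch.

```python
class ParentdevRegistry:
    """Toy model: each parentdev (management dev) owns its vdpa devices."""

    def __init__(self):
        self._parentdevs = {}  # parentdev name -> set of vdpa device names

    def register(self, parentdev):
        # Stands in for a hypothetical "register parentdev" operation.
        if parentdev in self._parentdevs:
            raise ValueError(f"{parentdev} is already registered")
        self._parentdevs[parentdev] = set()

    def unregister(self, parentdev):
        # Refuse to drop a parentdev that still owns devices.
        if self._parentdevs[parentdev]:
            raise RuntimeError(f"{parentdev} still has vdpa devices")
        del self._parentdevs[parentdev]

    def dev_add(self, parentdev, name):
        # Assumption: device names are unique across all parentdevs.
        if any(name in devs for devs in self._parentdevs.values()):
            raise ValueError(f"device {name} already exists")
        self._parentdevs[parentdev].add(name)

    def dev_del(self, parentdev, name):
        self._parentdevs[parentdev].remove(name)

    def devices(self, parentdev):
        return sorted(self._parentdevs[parentdev])
```

Each software vendor would then add and remove devices only through the parentdev it registered, which is the kind of separation described above.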

Thanks


Thanks,
Yongji
