Re: [PATCH linux-next 2/9] vdpa: Introduce query of device config layout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2021/3/1 8:06 下午, Sean Mooney wrote:
On Mon, 2021-03-01 at 11:28 +0100, Adrian Moreno wrote:
On 3/1/21 8:50 AM, Jason Wang wrote:
On 2021/3/1 3:29 下午, Parav Pandit wrote:
From: Jason Wang <jasowang@xxxxxxxxxx>
Sent: Monday, March 1, 2021 9:29 AM

On 2021/2/26 4:50 下午, Parav Pandit wrote:
From: Jason Wang <jasowang@xxxxxxxxxx>
Sent: Friday, February 26, 2021 1:56 PM


On 2021/2/26 1:32 下午, Parav Pandit wrote:
From: Jason Wang <jasowang@xxxxxxxxxx>
Sent: Wednesday, February 24, 2021 2:18 PM

On 2021/2/24 2:18 下午, Parav Pandit wrote:
+
+    if (nla_put_u16(msg,
VDPA_ATTR_DEV_NET_CFG_MAX_VQP,
+            config->max_virtqueue_pairs))
+        return -EMSGSIZE;
We probably need to check VIRTIO_NET_F_RSS here.
Yes. Will use it in v2.

+    if (nla_put_u8(msg,
VDPA_ATTR_DEV_NET_CFG_RSS_MAX_KEY_LEN,
+               config->rss_max_key_size))
+        return -EMSGSIZE;
+
+    rss_field = le16_to_cpu(config->rss_max_key_size);
+    if (nla_put_u16(msg,
VDPA_ATTR_DEV_NET_CFG_RSS_MAX_IT_LEN,
rss_field))
+        return -EMSGSIZE;
+
+    hash_types = le32_to_cpu(config->supported_hash_types);
+    if (nla_put_u32(msg,
VDPA_ATTR_DEV_NET_CFG_RSS_HASH_TYPES,
+            config->supported_hash_types))
+        return -EMSGSIZE;
+    return 0;
+}
+
+static int vdpa_dev_net_config_fill(struct vdpa_device *vdev,
+struct sk_buff *msg) {
+    struct virtio_net_config config = {};
+
+    vdev->config->get_config(vdev, 0, &config, sizeof(config));
Do we need to sync with other possible get_config calls here?
To readers of get_config() is ok. No need to sync.
However, I think set_config() and get_config() should sync otherwise
get_config can get partial value.
Will probably have to add vdpa_device->config_access_lock().
Probably. And the set_config() should be synchronized with the
requrest that comes from vDPA bus.
Yes, a rw semaphore is good enough.
Device config fields can be change from the device side anyway, such as
link status anyway.
Sync using lock shouldn’t be problem.

This makes me think whether we should go back to two phases method,
create and enable.

The vDPA device is only registred after enabling, then there's no
need to sync with vDPA bus, and mgmt is not allowed to change config
after enalbing?
In that case we should be able to disable it as well. Disable should take the
device of the bus.
I find it weird to have this model to linger around the device without
anchoring on the bus.
For example device is not yet enabled, so it is not anchored on the bus, but
its name is still have to remain unique.


Can we do some synchornization between vdpa device allocation and vdpa
device registier to solve the naming issue?
mgmt tool querying the device config space after vdpa device is in use is real.
So I don't see solving it any differently.

Now upper layers of vdpa to not access the device on the placed on the vdpa
bus, can be taken care by existing driver code using
echo 0 > /sys/bus/vdpa/drivers_autoprobe

By default vdpa device placed on the bus wont bind to any driver (net/vhost etc).
1. Decision to bind to which driver is done after config of the vdpa device is
done.
2. orchestration sw does create and config before it binds to the right driver
3. which driver to bind to decided based on the use case of that individual
vdpa device
For example,
(a) vdpa0 binds to net driver to have Netdev for container
(b) vdpa1 binds to vhost driver to map vdpa device to VM

I think it should work.
Adding Adrian, Maxime and Sean for more comments.
Is this a strong requirement or just a optional operation?


So my understanding is that something needs to be done to make sure:

1) the vpda device is not bound to virtio-vdpa
2) config is not changed if it has been set by management


This might break some existing logic that expects netdev or vhost-vdpa device to
exist before the some config (e.g:mac address) is set.
the orcestration software should not require permission to do dirver binding and in the case of openstack will
not have the capablity to bind the vdpa device to a dirvier in the current proposed design.
the ordering we are expecting to support in openstack at least initally is that the adminsitrator
will use either systemd service files or udev rules to allocate the VF and vdpa device staictly at boot.

in the case of openstack we will expect all vdpa devies that will be used by openstack(nova) for assignment
to vms will be prebound by the adminstrators to vhost-vdpa. openstack will simply track the /dev/vhost* paths
corallated to the vdpa devices whose parent VF are allowed by the pci passthough whitelist.

if the adminstrator binds the vdpa device to virtio-vdpa so that virtio-net-pci is used on the host nova and libvirt
will not correct that and the vm boot will fail. but provided you do not list that parent VF in the whitelist nova
will also ignore it. so if you want to mix and match the vdpa device driver that should not break anything provided
you do not list its parent in the pci whitelist.

but one thing to keep in mind is whlie we are not planning to create VF/VDPA devices dynamiclaly at lesat not for the initall
implentation we do expact to be able to configure attibute of the vdpa device after tehy are create like any other nic.
for example the mac, mtu, link-state should all be configureable at runtime via normal ip route2 commands.


So in the case of vDPA, this task is expected to be done by vdpa tool. (There's no guarantee that vDPA is a VF).



at the morment we are encounterng issue with the mac adress specificly which qemu is hopefully going to work around since
libvirt and nova cannot currently configure the mac address of a vhost-vdpa device so we are relying on qemu
to take the mac adress form the qemu command line so that it does not end up as all 0 macs in the guest.


Note that a kernel patch is posted by Eli[1] which assign a randomized mac. This should solve the issue in another way.


as some one that works on orcestration level i would have expect that setting the mac/mtu on either the vf or the netdev
representor to match the deireed mac/mtu in the guest would have been sufficent to alter the vdpa device confirutation.
likely  the netdev representor would make the most sence to alter to match the desired mac so that the mac adress see on the ovs
bridge matches the mac of the vm.


Speaking for genreal vDPA, it's not guaranteed that the vendor will implmenet the device with a representor. So vDPA tool is expected to be the only way to configure the device.

Thanks

[1] https://www.spinics.net/lists/linux-virtualization/msg48451.html


the current case where the vdpa representor netdev seams to have no bareing on the vdpa device
config makes debugging much more difficalt and is likely to confuse monitoring software. fortunetly openstack is not matching on the mac
address in this spefic case to identify the netdev represntor port but we have other backend that do look at the mac. in the case of ovs
we store the uuid of the neutron port in the ovsdb when adding a port to ovs so we rely on that not the mac but  for legacy mode
sriov which we hope to support with vdpa eventulally after we comple the hardware offloaed ovs support, currenlty relys on the mac in some cases.

i hope that answered your questions?
we are aware that we likely will have to support vdpa device creation at some point if we intend to add support for vdpa devices
created form subfunctions or intels siov howver that was not ready when we started this work so it was declared out of scope and
the curernt static approch was taken as the only path we could enable before code free in 10 days.

regards sean.


Thanks


https://www.spinics.net/lists/linux-virtualization/msg48451.html

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization




[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux