On 8/17/2022 7:14 AM, Si-Wei Liu wrote:
Yes it is a little messy, and we can not check _F_VERSION_1 because of transitional devices, so maybe this is the best we can do for now

On 8/16/2022 2:08 AM, Zhu, Lingshan wrote:

While the fix is fine, the comment is misleading in giving readers false hope. This is in vdpa_dev_net_config_fill(), the vdpa tool query path, instead of calls from the VMM dealing with vhost/virtio plumbing specifics. I think what's missing today in vdpa core is the detection of guest type (legacy, transitional, or modern) regarding endianness through F_VERSION_1 and legacy interface access, the latter of which would need some assistance from VMM for sure. However, the presence of information via the vdpa tool query is totally orthogonal. I don't get a good reason for why it has to couple with endianness. How is vdpa tool user space supposed to tweak it? I don't get it...

On 8/16/2022 3:58 PM, Si-Wei Liu wrote:

https://www.spinics.net/lists/netdev/msg837114.html

The challenge is that the status field is virtio16, not le16, so le16_to_cpu(xxx) is wrong anyway. However we can not tell whether it is a LE or BE device from struct vdpa_device, so for most cases, we assume it is LE, and leave this comment.

On 8/15/2022 6:58 PM, Zhu, Lingshan wrote:

Can you provide the context or link? It shouldn't work like this, otherwise it is breaking uABI. E.g.
how will a legacy/BE supporting kernel/device be backward compatible with older vdpa tool (which has knowledge of this endianness implication/assumption from day one)?

On 8/16/2022 7:32 AM, Si-Wei Liu wrote:

the spec RFC894 says it is 1500, see https://www.rfc-editor.org/rfc/rfc894.txt

On 8/15/2022 2:26 AM, Zhu Lingshan wrote:

Noted there's a typo in the above "The *maximum* length of the data field of a packet sent over an Ethernet is 1500 octets ..." and the RFC was written in 1984.

Some fields of virtio-net device config space are conditional on the feature bits, the spec says:

"The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC is set)"

"max_virtqueue_pairs only exists if VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is set"

"mtu only exists if VIRTIO_NET_F_MTU is set"

so we should read MTU, MAC and MQ in the device config space only when these feature bits are offered.

For MQ, if both VIRTIO_NET_F_MQ and VIRTIO_NET_F_RSS are not set, the virtio device should have one queue pair as default value, so when userspace queries the number of queue pairs, it should return mq=1 rather than zero.

For MTU, if VIRTIO_NET_F_MTU is not set, we should not read MTU from the device config space.

RFC894 <A Standard for the Transmission of IP Datagrams over Ethernet Networks> says:

"The minimum length of the data field of a packet sent over an Ethernet is 1500 octets, thus the maximum length of an IP datagram sent over an Ethernet is 1500 octets.
Implementations are encouraged to support full-length packets"

This could be a larger number for sure, we are trying to find out the min value for Ethernet here, to support 1500 octets, MTU should be 1500 at least, so I assume 1500 should be the default value for MTU

Apparently that is no longer true with the introduction of Jumbo size frame later in the 2000s. I'm not sure what is the point of mentioning this ancient RFC. It doesn't say default MTU of any Ethernet NIC/switch should be 1500 in either case.

virtio spec says: "The virtio network device is a virtual ethernet card",

Right,... but it doesn't say the default is 1500. At least, not in an explicit way. Why can't it be 1492 or even lower? In practice, if the network backend has a MTU higher than 1500, there's nothing wrong for guest to configure default MTU more than 1500.

so the default MTU value should be 1500 for virtio-net.

same as above

As said, there's no virtio spec defined value for MTU. Please leave this field out if feature VIRTIO_NET_F_MTU is not negotiated.

For MAC, the spec says: "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac entry indicates the “physical” address of the network card, otherwise the driver would typically generate a random local MAC address." So there is no default MAC address if VIRTIO_NET_F_MAC is not set.

This commit introduces functions vdpa_dev_net_mtu_config_fill() and vdpa_dev_net_mac_config_fill() to fill MTU and MAC. It also fixes vdpa_dev_net_mq_config_fill() to report correct MQ when _F_MQ is not present.

These functions should check device features rather than driver features, and struct vdpa_device is not needed as a parameter

The test & userspace tool output:

Feature bits VIRTIO_NET_F_MTU, VIRTIO_NET_F_RSS, VIRTIO_NET_F_MQ and VIRTIO_NET_F_MAC can be masked out by hardcode.

However, it is challenging to "disable" the related fields in the HW device config space, so let's just assume the values are meaningless if the feature bits are not set.
Before this change, when feature bits for RSS, MQ, MTU and MAC are not set, iproute2 output:

$vdpa vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false mtu 1500 negotiated_features

without this commit, function vdpa_dev_net_config_fill() reads all config space fields unconditionally, so let's assume the MAC and MTU are meaningless, and it checks MQ with driver_features, so we don't see max_vq_pairs.

After applying this commit, when feature bits for MQ, RSS, MAC and MTU are not set, iproute2 output:

$vdpa dev config show vdpa0
vdpa0: link up link_announce false max_vq_pairs 1 mtu 1500 negotiated_features

As explained above:
Here is no MAC, because VIRTIO_NET_F_MAC is not set, and there is no default value for MAC. It shows max_vq_pairs = 1 because even without MQ feature, a functional virtio-net must have one queue pair. mtu = 1500 is the default value as ethernet required.

This commit also adds supplementary comments for __virtio16_to_cpu(true, xxx) operations in vdpa_dev_net_config_fill() and vdpa_fill_stats_rec()

Signed-off-by: Zhu Lingshan <lingshan.zhu@xxxxxxxxx>
---
 drivers/vdpa/vdpa.c | 60 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 13 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index efb55a06e961..a74660b98979 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -801,19 +801,44 @@ static int vdpa_nl_cmd_dev_get_dumpit(struct sk_buff *msg, struct netlink_callba
 	return msg->len;
 }

-static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
-				       struct sk_buff *msg, u64 features,
+static int vdpa_dev_net_mq_config_fill(struct sk_buff *msg, u64 features,
 				       const struct virtio_net_config *config)
 {
 	u16 val_u16;

-	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
-		return 0;
+	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0 &&
+	    (features & BIT_ULL(VIRTIO_NET_F_RSS)) == 0)
+		val_u16 = 1;
+	else
+		val_u16 = __virtio16_to_cpu(true, config->max_virtqueue_pairs);

-	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
 	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
 }

+static int vdpa_dev_net_mtu_config_fill(struct sk_buff *msg, u64 features,
+					const struct virtio_net_config *config)
+{
+	u16 val_u16;
+
+	if ((features & BIT_ULL(VIRTIO_NET_F_MTU)) == 0)
+		val_u16 = 1500;

same as above

we have discussed it in another thread, and this comment is suggested by MST.

You can leave it as a TODO for kernel (vdpa core limitation), but AFAIK there's nothing userspace needs to do to infer the endianness. IMHO it's the kernel's job to provide an abstraction rather than rely on userspace guessing it.

+	else
+		val_u16 = __virtio16_to_cpu(true, config->mtu);
+
+	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16);
+}
+
+static int vdpa_dev_net_mac_config_fill(struct sk_buff *msg, u64 features,
+					const struct virtio_net_config *config)
+{
+	if ((features & BIT_ULL(VIRTIO_NET_F_MAC)) == 0)
+		return 0;
+	else
+		return nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR,
+			       sizeof(config->mac), config->mac);
+}
+
+
 static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *msg)
 {
 	struct virtio_net_config config = {};
@@ -822,18 +847,16 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));

-	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
-		    config.mac))
-		return -EMSGSIZE;
+	/*
+	 * Assume little endian for now, userspace can tweak this for
+	 * legacy guest support.

-Siwei

Thanks

-Siwei

+	 */
+	val_u16 = __virtio16_to_cpu(true, config.status);

 	val_u16 = __virtio16_to_cpu(true, config.status);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
 		return -EMSGSIZE;

-	val_u16 = __virtio16_to_cpu(true, config.mtu);
-	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
-		return -EMSGSIZE;
-
 	features_driver = vdev->config->get_driver_features(vdev);
 	if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features_driver,
 			      VDPA_ATTR_PAD))
@@ -846,7 +869,13 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 			      VDPA_ATTR_PAD))
 		return -EMSGSIZE;

-	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
+	if (vdpa_dev_net_mac_config_fill(msg, features_device, &config))
+		return -EMSGSIZE;
+
+	if (vdpa_dev_net_mtu_config_fill(msg, features_device, &config))
+		return -EMSGSIZE;
+
+	return vdpa_dev_net_mq_config_fill(msg, features_device, &config);
 }

 static int
@@ -914,6 +943,11 @@ static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct sk_buff *msg,
 	}
 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));

+	/*
+	 * Assume little endian for now, userspace can tweak this for
+	 * legacy guest support.
+	 */
+

Ditto.

same as above

Thanks

Thanks,
-Siwei

 	max_vqp = __virtio16_to_cpu(true, config.max_virtqueue_pairs);
 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, max_vqp))
 		return -EMSGSIZE;
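
For readers following the review, the rule being argued over (report a config field only when its feature bit is offered by the device, fall back to one queue pair when neither MQ nor RSS is offered, and decode 16-bit fields with an explicit endianness) can be sketched in standalone userspace C. This is only a rough sketch and not the kernel code: struct net_cfg_wire, struct net_cfg_report, wire16_to_cpu() and net_cfg_fill() are hypothetical names, and the little_endian flag stands in for the guest-type detection that the thread says vdpa core does not have today. It leaves MTU out when VIRTIO_NET_F_MTU is absent, as the review asks, whereas the patch as posted reports 1500 in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Feature bit numbers as defined by the virtio-net specification. */
#define VIRTIO_NET_F_MTU 3
#define VIRTIO_NET_F_MAC 5
#define VIRTIO_NET_F_MQ  22
#define VIRTIO_NET_F_RSS 60

/* Hypothetical: raw config bytes, in the order the device exposes them. */
struct net_cfg_wire {
	uint8_t mac[6];
	uint8_t status[2];
	uint8_t max_virtqueue_pairs[2];
	uint8_t mtu[2];
};

/* Hypothetical: what a query would report back to userspace. */
struct net_cfg_report {
	bool has_mac;
	bool has_mtu;
	uint8_t mac[6];
	uint16_t mtu;
	uint16_t max_vq_pairs;
};

/* Byte-wise decode, so the result does not depend on host endianness. */
static uint16_t wire16_to_cpu(bool little_endian, const uint8_t b[2])
{
	return little_endian ? (uint16_t)(b[0] | (b[1] << 8))
			     : (uint16_t)(b[1] | (b[0] << 8));
}

static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features & (1ULL << bit)) != 0;
}

/* Fill the report from device features, not driver features. */
static void net_cfg_fill(uint64_t device_features, bool little_endian,
			 const struct net_cfg_wire *cfg,
			 struct net_cfg_report *out)
{
	assert(cfg != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	/* mac is only valid when VIRTIO_NET_F_MAC is offered. */
	if (has_feature(device_features, VIRTIO_NET_F_MAC)) {
		out->has_mac = true;
		memcpy(out->mac, cfg->mac, sizeof(out->mac));
	}

	/* Assumption: omit mtu entirely when VIRTIO_NET_F_MTU is absent. */
	if (has_feature(device_features, VIRTIO_NET_F_MTU)) {
		out->has_mtu = true;
		out->mtu = wire16_to_cpu(little_endian, cfg->mtu);
	}

	/* Without MQ and RSS a virtio-net device still has one queue pair. */
	if (has_feature(device_features, VIRTIO_NET_F_MQ) ||
	    has_feature(device_features, VIRTIO_NET_F_RSS))
		out->max_vq_pairs = wire16_to_cpu(little_endian,
						  cfg->max_virtqueue_pairs);
	else
		out->max_vq_pairs = 1;
}
```

A caller would pass the device feature bits and the raw config bytes to net_cfg_fill() and then emit only the fields whose has_* flag is set, plus max_vq_pairs, which is always meaningful.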