On Tue, Mar 8, 2022 at 10:13 PM Eli Cohen <elic@xxxxxxxxxx> wrote:
> -----Original Message-----
> From: Si-Wei Liu <si-wei.liu@xxxxxxxxxx>
> Sent: Tuesday, March 8, 2022 8:16 AM
> To: Eli Cohen <elic@xxxxxxxxxx>
> Cc: mst@xxxxxxxxxx; jasowang@xxxxxxxxxx; virtualization@lists.linux-
> foundation.org; eperezma@xxxxxxxxxx; amorenoz@xxxxxxxxxx;
> lvivier@xxxxxxxxxx; sgarzare@xxxxxxxxxx; Parav Pandit <parav@xxxxxxxxxx>
> Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying vendor statistics
>
>
>
> On 3/6/2022 11:57 PM, Eli Cohen wrote:
> >
> >> -----Original Message-----
> >> From: Si-Wei Liu <si-wei.liu@xxxxxxxxxx>
> >> Sent: Saturday, March 5, 2022 12:34 AM
> >> To: Eli Cohen <elic@xxxxxxxxxx>
> >> Cc: mst@xxxxxxxxxx; jasowang@xxxxxxxxxx; virtualization@lists.linux-
> >> foundation.org; eperezma@xxxxxxxxxx; amorenoz@xxxxxxxxxx;
> >> lvivier@xxxxxxxxxx; sgarzare@xxxxxxxxxx; Parav Pandit
> >> <parav@xxxxxxxxxx>
> >> Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying vendor
> >> statistics
> >>
> >> Sorry, I somehow missed this after my break. Please see comments in line.
> >>
> >> On 2/16/2022 10:46 PM, Eli Cohen wrote:
> >>> On Wed, Feb 16, 2022 at 10:49:26AM -0800, Si-Wei Liu wrote:
> >>>> On 2/16/2022 12:00 AM, Eli Cohen wrote:
> >>>>> Allows to read vendor statistics of a vdpa device. The specific
> >>>>> statistics data is received by the upstream driver in the form of
> >>>>> an (attribute name, attribute value) pairs.
> >>>>>
> >>>>> An example of statistics for mlx5_vdpa device are:
> >>>>>
> >>>>> received_desc - number of descriptors received by the virtqueue
> >>>>> completed_desc - number of descriptors completed by the virtqueue
> >>>>>
> >>>>> A descriptor using indirect buffers is still counted as 1. In
> >>>>> addition, N chained descriptors are counted correctly N times as
> >>>>> one
> >> would expect.
> >>>>> A new callback was added to vdpa_config_ops which provides the
> >>>>> means for the vdpa driver to return statistics results.
> >>>>>
> >>>>> The interface allows for reading all the supported virtqueues,
> >>>>> including the control virtqueue if it exists.
> >>>>>
> >>>>> Below are some examples taken from mlx5_vdpa which are introduced
> >>>>> in the following patch:
> >>>>>
> >>>>> 1. Read statistics for the virtqueue at index 1
> >>>>>
> >>>>> $ vdpa dev vstats show vdpa-a qidx 1
> >>>>> vdpa-a:
> >>>>> queue_type tx queue_index 1 received_desc 3844836 completed_desc
> >>>>> 3844836
> >>>>>
> >>>>> 2. Read statistics for the virtqueue at index 32 $ vdpa dev vstats
> >>>>> show vdpa-a qidx 32
> >>>>> vdpa-a:
> >>>>> queue_type control_vq queue_index 32 received_desc 62
> >>>>> completed_desc
> >>>>> 62
> >>>>>
> >>>>> 3. Read statisitics for the virtqueue at index 0 with json output
> >>>>> $ vdpa -j dev vstats show vdpa-a qidx 0 {"vstats":{"vdpa-a":{
> >>>>>
> >> "queue_type":"rx","queue_index":0,"name":"received_desc","value":4177
> >> 76,\
> >>>>> "name":"completed_desc","value":417548}}}
> >>>>>
> >>>>> 4. Read statistics for the virtqueue at index 0 with preety json
> >>>>> output $ vdpa -jp dev vstats show vdpa-a qidx 0 {
> >>>>> "vstats": {
> >>>>> "vdpa-a": {
> >>>>>
> >>>>> "queue_type": "rx",
> >>>> I wonder where this info can be inferred? I don't see relevant
> >>>> change in the patch series that helps gather the
> VDPA_ATTR_DEV_QUEUE_TYPE?
> >>>> Is this an arbitrary string defined by the vendor as well? If so,
> >>>> how does the user expect to consume it?
> >>> The queue tupe is deduced from the index and whether we have a
> >>> virtqueue. Even numbers are rx, odd numbers are tx and if there is
> >>> CVQ, the last one is CVQ.
> >> OK, then VDPA_ATTR_DEV_QUEUE_TYPE attribute introduced in this patch
> >> might not be useful at all?
> > Right, will remove.
> >
> >> And how do you determine in the vdpa tool if CVQ is negotiated or
> >> not?
> > I make a netlink call to get the same information as " vdpa dev config show"
> retrieves. I use the negotiated features to determine if a CVQ is available. If it
> is, the number of VQs equals the control VQ index. So there are two netlink
> calls under the hood.
> The lock vdpa_dev_mutex won't hold across the two separate netlink calls, and
> it may end up with inconsistent state - theoretically things could happen like
> that the first call gets CVQ negotiated, but the later call for
> get_vendor_vq_stats() on the cvq might get -EINVAL due to device reset. Can
> the negotiated status and stat query be done within one single netlink call?
I see your concern.
The only reason I do the extra call is to know if we have a control VQ and what
index it is, just to print a descriptive string telling if it's a either rx, tx or control VQ.
So the cure can be simple. Let's have a new attribute that returns the type of
virtqueue. I think Jason did not like the idea of communicating the kind of VQ
from kernel to userspace but under these circumstances, maybe he would approve.
Jason?
I may miss something but I still don't get the reason for this. I think the atom could be done by using a single netlink call. Then the userspace could deduce the queue type based on the negotiated features.
For returning the type, it means the vdpa core has the device specific knowledge which still seems sub-optimal.
Thanks
>
> What worried me is that the queue index being dynamic and depended on
> negotiation status would make host admin user quite hard to follow. The guest
> may or may not advertise F_MQ and/or F_CTRL_VQ across various phases, e.g.
> firmware (UEFI), boot loader (grub) till OS driver is up and running, which can
> be agnostic to host admin. For most of the part it's not easy to script and
> predict the queue index which can change from time to time. Can we define
> the order of host predictable queue index, which is independent from any
> guest negotiated state?
>
> >
> >> Looks to me there are still some loose end I don't quite yet
> >> understand.
> >>
> >>
> >>>>> "queue_index": 0,
> > I think this can be removed since the command is for a specific index.
> >
> >>>>> "name": "received_desc",
> >>>>> "value": 417776,
> >>>>> "name": "completed_desc",
> >>>>> "value": 417548
> >>>> Not for this kernel patch, but IMHO it's the best to put the name &
> >>>> value pairs in an array instead of flat entries in json's
> >>>> hash/dictionary. The hash entries can be re-ordered deliberately by
> >>>> external json parsing tool, ending up with inconsistent stat values.
> >> This comment is missed for some reason. Please change the example in
> >> the log if you agree to address it in vdpa tool. Or justify why
> >> keeping the order for json hash/dictionary is fine.
> > Sorry for skipping this comment.
> > Do you mean to present the information like:
> > "received_desc": 417776,
> > "completed_desc": 417548,
> I mean the following presentation:
>
> $ vdpa -jp dev vstats show vdpa-a qidx 0 {
> "vstats": {
> "vdpa-a": {
> "queue_stats": [{
> "queue_index": 0,
> "queue_type": "rx",
> "stat_name": [ "received_desc","completed_desc" ],
> "stat_value": [ 417776,417548 ],
> }]
> }
> }
> }
>
> I think Parav had similar suggestion, too.
>
> Thanks,
> -Siwei
>
> >
> >> Thanks,
> >> -Siwei
> >>
> >>>> Thanks,
> >>>> -Siwei
> >>>>> }
> >>>>> }
> >>>>> }
> >>>>>
> >>>>> Signed-off-by: Eli Cohen <elic@xxxxxxxxxx>
> >>>>> ---
> >>>>> drivers/vdpa/vdpa.c | 129
> >> ++++++++++++++++++++++++++++++++++++++
> >>>>> include/linux/vdpa.h | 5 ++
> >>>>> include/uapi/linux/vdpa.h | 7 +++
> >>>>> 3 files changed, 141 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> >>>>> 9846c9de4bfa..d0ff671baf88 100644
> >>>>> --- a/drivers/vdpa/vdpa.c
> >>>>> +++ b/drivers/vdpa/vdpa.c
> >>>>> @@ -909,6 +909,74 @@ vdpa_dev_config_fill(struct vdpa_device
> >>>>> *vdev,
> >> struct sk_buff *msg, u32 portid,
> >>>>> return err;
> >>>>> }
> >>>>> +static int vdpa_fill_stats_rec(struct vdpa_device *vdev, struct
> >>>>> +sk_buff
> >> *msg,
> >>>>> + struct genl_info *info, u32 index) {
> >>>>> + int err;
> >>>>> +
> >>>>> + if (nla_put_u32(msg, VDPA_ATTR_DEV_QUEUE_INDEX, index))
> >>>>> + return -EMSGSIZE;
> >>>>> +
> >>>>> + err = vdev->config->get_vendor_vq_stats(vdev, index, msg, info-
> >>> extack);
> >>>>> + if (err)
> >>>>> + return err;
> >>>>> +
> >>>>> + return 0;
> >>>>> +}
> >>>>> +
> >>>>> +static int vendor_stats_fill(struct vdpa_device *vdev, struct sk_buff
> *msg,
> >>>>> + struct genl_info *info, u32 index) {
> >>>>> + int err;
> >>>>> +
> >>>>> + if (!vdev->config->get_vendor_vq_stats)
> >>>>> + return -EOPNOTSUPP;
> >>>>> +
> >>>>> + err = vdpa_fill_stats_rec(vdev, msg, info, index);
> >>>>> + if (err)
> >>>>> + return err;
> >>>>> +
> >>>>> + return 0;
> >>>>> +}
> >>>>> +
> >>>>> +static int vdpa_dev_vendor_stats_fill(struct vdpa_device *vdev,
> >>>>> + struct sk_buff *msg,
> >>>>> + struct genl_info *info, u32 index) {
> >>>>> + u32 device_id;
> >>>>> + void *hdr;
> >>>>> + int err;
> >>>>> + u32 portid = info->snd_portid;
> >>>>> + u32 seq = info->snd_seq;
> >>>>> + u32 flags = 0;
> >>>>> +
> >>>>> + hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> >>>>> + VDPA_CMD_DEV_VSTATS_GET);
> >>>>> + if (!hdr)
> >>>>> + return -EMSGSIZE;
> >>>>> +
> >>>>> + if (nla_put_string(msg, VDPA_ATTR_DEV_NAME, dev_name(&vdev-
> >>> dev))) {
> >>>>> + err = -EMSGSIZE;
> >>>>> + goto undo_msg;
> >>>>> + }
> >>>>> +
> >>>>> + device_id = vdev->config->get_device_id(vdev);
> >>>>> + if (nla_put_u32(msg, VDPA_ATTR_DEV_ID, device_id)) {
> >>>>> + err = -EMSGSIZE;
> >>>>> + goto undo_msg;
> >>>>> + }
> >>>>> +
> >>>>> + err = vendor_stats_fill(vdev, msg, info, index);
> >>>>> +
> >>>>> + genlmsg_end(msg, hdr);
> >>>>> +
> >>>>> + return err;
> >>>>> +
> >>>>> +undo_msg:
> >>>>> + genlmsg_cancel(msg, hdr);
> >>>>> + return err;
> >>>>> +}
> >>>>> +
> >>>>> static int vdpa_nl_cmd_dev_config_get_doit(struct sk_buff
> >>>>> *skb, struct
> >> genl_info *info)
> >>>>> {
> >>>>> struct vdpa_device *vdev;
> >>>>> @@ -990,6 +1058,60 @@ vdpa_nl_cmd_dev_config_get_dumpit(struct
> >> sk_buff *msg, struct netlink_callback *
> >>>>> return msg->len;
> >>>>> }
> >>>>> +static int vdpa_nl_cmd_dev_stats_get_doit(struct sk_buff *skb,
> >>>>> + struct genl_info *info)
> >>>>> +{
> >>>>> + struct vdpa_device *vdev;
> >>>>> + struct sk_buff *msg;
> >>>>> + const char *devname;
> >>>>> + struct device *dev;
> >>>>> + u32 index;
> >>>>> + int err;
> >>>>> +
> >>>>> + if (!info->attrs[VDPA_ATTR_DEV_NAME])
> >>>>> + return -EINVAL;
> >>>>> +
> >>>>> + if (!info->attrs[VDPA_ATTR_DEV_QUEUE_INDEX])
> >>>>> + return -EINVAL;
> >>>>> +
> >>>>> + devname = nla_data(info->attrs[VDPA_ATTR_DEV_NAME]);
> >>>>> + msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> >>>>> + if (!msg)
> >>>>> + return -ENOMEM;
> >>>>> +
> >>>>> + index = nla_get_u32(info->attrs[VDPA_ATTR_DEV_QUEUE_INDEX]);
> >>>>> + mutex_lock(&vdpa_dev_mutex);
> >>>>> + dev = bus_find_device(&vdpa_bus, NULL, devname,
> >> vdpa_name_match);
> >>>>> + if (!dev) {
> >>>>> + NL_SET_ERR_MSG_MOD(info->extack, "device not found");
> >>>>> + err = -ENODEV;
> >>>>> + goto dev_err;
> >>>>> + }
> >>>>> + vdev = container_of(dev, struct vdpa_device, dev);
> >>>>> + if (!vdev->mdev) {
> >>>>> + NL_SET_ERR_MSG_MOD(info->extack, "unmanaged vdpa
> >> device");
> >>>>> + err = -EINVAL;
> >>>>> + goto mdev_err;
> >>>>> + }
> >>>>> + err = vdpa_dev_vendor_stats_fill(vdev, msg, info, index);
> >>>>> + if (!err)
> >>>>> + err = genlmsg_reply(msg, info);
> >>>>> +
> >>>>> + put_device(dev);
> >>>>> + mutex_unlock(&vdpa_dev_mutex);
> >>>>> +
> >>>>> + if (err)
> >>>>> + nlmsg_free(msg);
> >>>>> +
> >>>>> + return err;
> >>>>> +
> >>>>> +mdev_err:
> >>>>> + put_device(dev);
> >>>>> +dev_err:
> >>>>> + mutex_unlock(&vdpa_dev_mutex);
> >>>>> + return err;
> >>>>> +}
> >>>>> +
> >>>>> static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
> >>>>> [VDPA_ATTR_MGMTDEV_BUS_NAME] = { .type =
> NLA_NUL_STRING },
> >>>>> [VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING
> }, @@ -
> >> 997,6
> >>>>> +1119,7 @@ static const struct nla_policy
> >> vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
> >>>>> [VDPA_ATTR_DEV_NET_CFG_MACADDR] =
> NLA_POLICY_ETH_ADDR,
> >>>>> /* virtio spec 1.1 section 5.1.4.1 for valid MTU range */
> >>>>> [VDPA_ATTR_DEV_NET_CFG_MTU] =
> NLA_POLICY_MIN(NLA_U16, 68),
> >>>>> + [VDPA_ATTR_DEV_QUEUE_INDEX] = NLA_POLICY_RANGE(NLA_U32, 0,
> >> 65535),
> >>>>> };
> >>>>> static const struct genl_ops vdpa_nl_ops[] = { @@ -1030,6
> >>>>> +1153,12 @@ static const struct genl_ops vdpa_nl_ops[] = {
> >>>>> .doit = vdpa_nl_cmd_dev_config_get_doit,
> >>>>> .dumpit = vdpa_nl_cmd_dev_config_get_dumpit,
> >>>>> },
> >>>>> + {
> >>>>> + .cmd = VDPA_CMD_DEV_VSTATS_GET,
> >>>>> + .validate = GENL_DONT_VALIDATE_STRICT |
> >> GENL_DONT_VALIDATE_DUMP,
> >>>>> + .doit = vdpa_nl_cmd_dev_stats_get_doit,
> >>>>> + .flags = GENL_ADMIN_PERM,
> >>>>> + },
> >>>>> };
> >>>>> static struct genl_family vdpa_nl_family __ro_after_init = {
> >>>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index
> >>>>> 2de442ececae..274203845cfc 100644
> >>>>> --- a/include/linux/vdpa.h
> >>>>> +++ b/include/linux/vdpa.h
> >>>>> @@ -275,6 +275,9 @@ struct vdpa_config_ops {
> >>>>> const struct vdpa_vq_state *state);
> >>>>> int (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
> >>>>> struct vdpa_vq_state *state);
> >>>>> + int (*get_vendor_vq_stats)(struct vdpa_device *vdev, u16 idx,
> >>>>> + struct sk_buff *msg,
> >>>>> + struct netlink_ext_ack *extack);
> >>>>> struct vdpa_notification_area
> >>>>> (*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
> >>>>> /* vq irq is not expected to be changed once DRIVER_OK is set
> >>>>> */ @@ -466,4 +469,6 @@ struct vdpa_mgmt_dev {
> >>>>> int vdpa_mgmtdev_register(struct vdpa_mgmt_dev *mdev);
> >>>>> void vdpa_mgmtdev_unregister(struct vdpa_mgmt_dev *mdev);
> >>>>> +#define VDPA_INVAL_QUEUE_INDEX 0xffff
> >>>>> +
> >>>>> #endif /* _LINUX_VDPA_H */
> >>>>> diff --git a/include/uapi/linux/vdpa.h b/include/uapi/linux/vdpa.h
> >>>>> index 1061d8d2d09d..c5f229a41dc2 100644
> >>>>> --- a/include/uapi/linux/vdpa.h
> >>>>> +++ b/include/uapi/linux/vdpa.h
> >>>>> @@ -18,6 +18,7 @@ enum vdpa_command {
> >>>>> VDPA_CMD_DEV_DEL,
> >>>>> VDPA_CMD_DEV_GET, /* can dump */
> >>>>> VDPA_CMD_DEV_CONFIG_GET, /* can dump */
> >>>>> + VDPA_CMD_DEV_VSTATS_GET,
> >>>>> };
> >>>>> enum vdpa_attr {
> >>>>> @@ -46,6 +47,12 @@ enum vdpa_attr {
> >>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, /* u64 */
> >>>>> VDPA_ATTR_DEV_MGMTDEV_MAX_VQS, /*
> u32 */
> >>>>> VDPA_ATTR_DEV_SUPPORTED_FEATURES, /* u64 */
> >>>>> +
> >>>>> + VDPA_ATTR_DEV_QUEUE_INDEX, /* u16 */
> >>>>> + VDPA_ATTR_DEV_QUEUE_TYPE, /* string */
> >>>>> + VDPA_ATTR_DEV_VENDOR_ATTR_NAME, /* string */
> >>>>> + VDPA_ATTR_DEV_VENDOR_ATTR_VALUE, /* u64 */
> >>>>> +
> >>>>> /* new attributes must be added above here */
> >>>>> VDPA_ATTR_MAX,
> >>>>> };
_______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization