> -----Original Message----- > From: Nathan Chancellor <natechancellor@xxxxxxxxx> > Sent: Thursday, April 30, 2020 2:02 AM > To: Michael Kelley <mikelley@xxxxxxxxxxxxx> > Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang > <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger > <sthemmin@xxxxxxxxxxxxx>; Wei Liu <wei.liu@xxxxxxxxxx>; linux- > hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux- > kernel@xxxxxxxxxxxxxxx; clang-built-linux@xxxxxxxxxxxxxxxx; Sami > Tolvanen <samitolvanen@xxxxxxxxxx> > Subject: Re: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type > > Hi Michael, > > On Thu, Apr 30, 2020 at 12:06:09AM +0000, Michael Kelley wrote: > > From: Nathan Chancellor <natechancellor@xxxxxxxxx> Sent: Tuesday, > > April 28, 2020 10:55 AM > > > > > > Do note that netvsc_xmit still returns int because netvsc_xmit has a > > > potential return from netvsc_vf_xmit, which does not return > > > netdev_tx_t because of the call to dev_queue_xmit. > > > > > > I am not sure if that is an oversight that was introduced by commit > > > 0c195567a8f6e ("netvsc: transparent VF management") or if everything > > > works properly as it is now. > > > > > > My patch is purely concerned with making the definition match the > > > prototype so it should be NFC aside from avoiding the CFI panic. > > > > > > > While it probably works correctly now, I'm not too keen on just > > changing the return type for netvsc_start_xmit() and assuming the > > 'int' that is returned from netvsc_xmit() will be correctly mapped to > > the netdev_tx_t enum type. While that mapping probably happens > > correctly at the moment, this really should have a more holistic fix. > > While it might work correctly, I am not sure that the mapping is correct, > hence that comment. > > netdev_tx_t is an enum with two acceptable types, 0x00 and 0x10. Up until > commit 0c195567a8f6e ("netvsc: transparent VF management"), netvsc_xmit > was guaranteed to return something of type netdev_tx_t. > > However, after said commit, we could return anything from netvsc_vf_xmit, > which in turn calls dev_queue_xmit then __dev_queue_xmit which will > return either an error code (-ENOMEM or > -ENETDOWN) or something from __dev_xmit_skb, which appears to be > NET_XMIT_SUCCESS, NET_XMIT_DROP, or NET_XMIT_CN. > > It does not look like netvsc_xmit or netvsc_vf_xmit try to convert those > returns to netdev_tx_t in some way; netvsc_vf_xmit just passes the return > value up to netvsc_xmit, which is the part that I am unsure about... > > > Nathan -- are you willing to look at doing the more holistic fix? Or > > should we see about asking Haiyang Zhang to do it? > > I would be fine trying to look at a more holistic fix but I know basically nothing > about this subsystem. I am unsure if something like this would be acceptable > or if something else needs to happen. > Otherwise, I'd be fine with you guys taking a look and just giving me > reported-by credit. Here is more info regarding Linux network subsystem: As said in "include/linux/netdevice.h", drivers are allowed to return any codes from the three different namespaces. And hv_netvsc needs to support "transparent VF", and calls netvsc_vf_xmit >> dev_queue_xmit which returns qdisc return codes, and errnos like -ENOMEM, etc. These are compliant with the guideline below: 79 /* 80 * Transmit return codes: transmit return codes originate from three different 81 * namespaces: 82 * 83 * - qdisc return codes 84 * - driver transmit return codes 85 * - errno values 86 * 87 * Drivers are allowed to return any one of those in their hard_start_xmit() Also, ndo_start_xmit function pointer is used by upper layer functions which can handles three types of the return codes. For example, in the calling stack: ndo_start_xmit << netdev_start_xmit << xmit_one << dev_hard_start_xmit(): The function dev_hard_start_xmit() uses dev_xmit_complete() to handle the return codes. It handles three types of the return codes correctly. 3483 struct sk_buff *dev_hard_start_xmit(struct sk_buff *first, struct net_device *dev, 3484 struct netdev_queue *txq, int *ret) 3485 { 3486 struct sk_buff *skb = first; 3487 int rc = NETDEV_TX_OK; 3488 3489 while (skb) { 3490 struct sk_buff *next = skb->next; 3491 3492 skb_mark_not_on_list(skb); 3493 rc = xmit_one(skb, dev, txq, next != NULL); 3494 if (unlikely(!dev_xmit_complete(rc))) { 3495 skb->next = next; 3496 goto out; 3497 } 3498 3499 skb = next; 3500 if (netif_tx_queue_stopped(txq) && skb) { 3501 rc = NETDEV_TX_BUSY; 3502 break; 3503 } 3504 } 3505 3506 out: 3507 *ret = rc; 3508 return skb; 3509 } 118 /* 119 * Current order: NETDEV_TX_MASK > NET_XMIT_MASK >= 0 is significant; 120 * hard_start_xmit() return < NET_XMIT_MASK means skb was consumed. 121 */ 122 static inline bool dev_xmit_complete(int rc) 123 { 124 /* 125 * Positive cases with an skb consumed by a driver: 126 * - successful transmission (rc == NETDEV_TX_OK) 127 * - error while transmitting (rc < 0) 128 * - error while queueing to a different device (rc & NET_XMIT_MASK) 129 */ 130 if (likely(rc < NET_XMIT_MASK)) 131 return true; 132 133 return false; 134 } Regarding "a more holistic fix", I believe the return type of ndo_start_xmit should be int, because of three namespaces of the return codes. This means to change all network drivers. I'm not proposing to do this big change right now. So I have no objection of your patch. Thanks, - Haiyang Reviewed-by: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>