>-----Original Message-----
>From: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
>Sent: Wednesday, July 5, 2023 8:06 PM
>To: Souradeep Chakrabarti <schakrabarti@xxxxxxxxxxxxx>; souradeep
>chakrabarti <schakrabarti@xxxxxxxxxxxxxxxxxxx>
>Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
><haiyangz@xxxxxxxxxxxxx>; wei.liu@xxxxxxxxxx; Dexuan Cui
><decui@xxxxxxxxxxxxx>; davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx;
>kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; Long Li <longli@xxxxxxxxxxxxx>; Ajay
>Sharma <sharmaajay@xxxxxxxxxxxxx>; leon@xxxxxxxxxx;
>cai.huoqing@xxxxxxxxx; ssengar@xxxxxxxxxxxxxxxxxxx; vkuznets@xxxxxxxxxx;
>tglx@xxxxxxxxxxxxx; linux-hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
>linux-kernel@xxxxxxxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx;
>stable@xxxxxxxxxxxxxxx
>Subject: Re: [EXTERNAL] Re: [PATCH V4 net] net: mana: Fix MANA VF unload
>when host is unresponsive
>
>[You don't often get email from aleksander.lobakin@xxxxxxxxx. Learn why this is
>important at https://aka.ms/LearnAboutSenderIdentification ]
>
>From: Souradeep Chakrabarti <schakrabarti@xxxxxxxxxxxxx>
>Date: Mon, 3 Jul 2023 19:55:06 +0000
>
>>
>>
>>> -----Original Message-----
>>> From: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
>>> Sent: Monday, July 3, 2023 10:18 PM
>
>[...]
>
>>>>  	for (i = 0; i < apc->num_queues; i++) {
>>>>  		txq = &apc->tx_qp[i].txq;
>>>> -
>>>> -		while (atomic_read(&txq->pending_sends) > 0)
>>>> +		while (atomic_read(&txq->pending_sends) > 0 &&
>>>> +		       time_before(jiffies, timeout)) {
>>>>  			usleep_range(1000, 2000);
>>>> +		}
>>>>  	}
>>>
>>> 120 seconds by 2 msec step is 60000 iterations, by 1 msec is 120000
>>> iterations. I know usleep_range() often is much less precise, but still.
>>> Do you really need that much time? Has this been measured during the
>>> tests that it can take up to 120 seconds or is it just some random
>>> value that "should be enough"?
>>> If you really need 120 seconds, I'd suggest using a timer / delayed
>>> work instead of wasting resources.
>> Here the intent is not waiting for 120 seconds, rather to avoid
>> checking the pending_sends of each tx queue for an indefinite time
>> before freeing the sk_buffs.
>> The pending_sends of a tx queue can only decrease when
>> mana_poll_tx_cq() gets called on a completion notification; it is
>> increased by xmit.
>>
>> In this particular bug, apc->port_is_up is not set to false, causing
>> xmit to keep increasing pending_sends for the queue while
>> mana_poll_tx_cq() never gets called for the queue.
>>
>> The comment in mana_dealloc_queues() mentions it:
>>
>> 2346         /* No packet can be transmitted now since apc->port_is_up is false.
>> 2347          * There is still a tiny chance that mana_poll_tx_cq() can re-enable
>> 2348          * a txq because it may not timely see apc->port_is_up being cleared
>> 2349          * to false, but it doesn't matter since mana_start_xmit() drops any
>> 2350          * new packets due to apc->port_is_up being false.
>>
>> The value 120 seconds has been decided here based on maximum number of
>> queues
>
>This is quite opposite to what you're saying above. How should I connect these
>two:
>
>Here the intent is not waiting for 120 seconds
>
>+
>
>The value 120 seconds has been decided here based on maximum number of
>queues
>
>?
>Can cleaning the Tx queues really last for 120 seconds?
>My understanding is that timeouts need to be sensible and not go to the outer
>space. What is the medium value you got during the tests?
>
Under normal conditions each queue takes only a few milliseconds, so based on
the maximum number of queues allowed for our h/w the total wait won't go
beyond a second. The 120 s case only happens during some rare, unexpected NIC
HW issue, so this timeout will only trigger in a very rare scenario.
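
To make the intent concrete, below is roughly how the bounded wait is meant to
behave. This is only a condensed sketch, not the exact patch: the timeout
initialization and the 120 * HZ constant are assumptions taken from the
discussion above, and the names follow the quoted hunks.

	/* Assumed upper bound; in practice the loop exits within ~1 s. */
	unsigned long timeout = jiffies + 120 * HZ;
	int i;

	for (i = 0; i < apc->num_queues; i++) {
		struct mana_txq *txq = &apc->tx_qp[i].txq;

		/* Wait for in-flight sends to complete, but never forever:
		 * if the host stops delivering completions, give up once
		 * the timeout expires instead of hanging the unload.
		 */
		while (atomic_read(&txq->pending_sends) > 0 &&
		       time_before(jiffies, timeout))
			usleep_range(1000, 2000);
	}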
>> are allowed in this specific hardware, it is a safe assumption.
>>>
>>>>
>>>> +	for (i = 0; i < apc->num_queues; i++) {
>>>> +		txq = &apc->tx_qp[i].txq;
>>>> +		cq = &apc->tx_qp[i].tx_cq;
>>>
>>> cq can be just &txq->tx_cq.
>> mana_txq structure does not have a pointer to mana_cq.
>
>Sorry, misread, my bad.
>
>>>
>>>> +		while (atomic_read(&txq->pending_sends)) {
>>>> +			skb = skb_dequeue(&txq->pending_skbs);
>>>> +			mana_unmap_skb(skb, apc);
>>>> +			napi_consume_skb(skb, cq->budget);
>>>
>>> (you already have comment about this one)
>>>
>>>> +			atomic_sub(1, &txq->pending_sends);
>>>> +		}
>>>> +	}
>>>>  	/* We're 100% sure the queues can no longer be woken up, because
>>>>  	 * we're sure now mana_poll_tx_cq() can't be running.
>>>>  	 */
>>>
>>> Thanks,
>>> Olek
>Thanks,
>Olek
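
For reference, the drain step added by the second hunk, condensed into one
place (again only a sketch mirroring the quoted code, assuming the bounded
wait above has already expired): once the timeout is hit, no completion will
ever arrive for the leftover skbs, so the driver unmaps and frees them itself
before tearing the queues down. This is what keeps the "We're 100% sure the
queues can no longer be woken up" comment valid even when the host never
answers.

	for (i = 0; i < apc->num_queues; i++) {
		struct mana_txq *txq = &apc->tx_qp[i].txq;
		struct mana_cq *cq = &apc->tx_qp[i].tx_cq;
		struct sk_buff *skb;

		/* Anything still counted in pending_sends will never be
		 * completed by the HW now, so release it on the driver side.
		 */
		while (atomic_read(&txq->pending_sends)) {
			skb = skb_dequeue(&txq->pending_skbs);
			mana_unmap_skb(skb, apc);
			napi_consume_skb(skb, cq->budget);
			atomic_sub(1, &txq->pending_sends);
		}
	}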