On 8/11/20 9:03 AM, Muneendra Kumar M wrote:
> Hi Hannes,
>>>
>>> Hmm. Wouldn't it make more sense to introduce a new port state 'marginal'
>>> for this? We might want/need to introduce additional error recovery
>>> mechanisms here, so having a new state might be easier in the long run
>>> ...
>>
>>> Additionally, from my understanding the FPIN events will be generated
>>> with a certain frequency. So we could model the new 'marginal' state
>>> similar to the dev_loss_tmo mechanism; start a timer whenever the
>>> 'marginal' state is being set, and clear the state back to 'running'
>>> if the state hasn't been refreshed within that timeframe.
>>> That would give us an automatic state reset back to running, and
>>> quite easy to implement from userland.
>>
>> Thanks for the review.
>> I have a small doubt.
>> When the port state moves from marginal to running state does it mean
>> we expect traffic from the path?
>>
>> We don't expect traffic; rather we _allow_ traffic.
>> But moving from marginal to running means that we didn't receive FPIN
>> events, and the path should be considered healthy again.
>> So from that perspective it should be back to normal operations.
>
>
> But this could apply only to FPIN-Congestion. Only in this case (FPIN-CN)
> will FPIN events be generated with a certain frequency.
> But for FPIN-LI this is not the case.
> FPIN-LI is used to inform about marginal paths, which need manual
> intervention to recover.
> And for FPIN-LI the path should be re-enabled on any link bounce
> (portdisable followed by portenable), which would correlate to a cable/SFP
> change.
> For now, however, we are addressing FPIN-LI primarily.
>
Ah. So that changes things slightly; I had hoped we could address things
systematically, but with link integrity issues we don't have any other
choice but to replace the cable (i.e. wait for user interaction).
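
To spell out the two recovery models discussed above (timer-based reset
for periodically refreshed FPIN events, link bounce or manual reset for
FPIN-LI), here is a rough, untested userspace sketch in C. All names
(port_state, port_event, port_handle_event, port_tick) are made up for
illustration and are not existing kernel or multipath-tools symbols. The
rule that a congestion event never turns a sticky 'marginal' state into a
timed one is likewise only an assumption.

```c
#include <stdbool.h>

/* Hypothetical names throughout; only 'running' and 'marginal' come from the thread. */
enum port_state { PORT_RUNNING, PORT_MARGINAL };

enum port_event {
    EV_FPIN_CN,        /* congestion notification, expected to repeat */
    EV_FPIN_LI,        /* link integrity notification */
    EV_SET_MARGINAL,   /* manual set by the admin */
    EV_CLEAR_MARGINAL, /* manual reset by the admin */
    EV_LINK_BOUNCE     /* portdisable followed by portenable */
};

struct port {
    enum port_state state;
    bool timed;    /* true if 'marginal' may expire on its own */
    long deadline; /* expiry time in seconds, valid only when timed */
};

/* Apply one event; 'now' and 'tmo' are in seconds. */
void port_handle_event(struct port *p, enum port_event ev, long now, long tmo)
{
    switch (ev) {
    case EV_FPIN_CN:
        /* Assumption: keep a sticky (LI or manual) marginal state sticky. */
        if (p->state == PORT_MARGINAL && !p->timed)
            break;
        p->state = PORT_MARGINAL;
        p->timed = true;
        p->deadline = now + tmo; /* every refresh restarts the timer */
        break;
    case EV_FPIN_LI:
    case EV_SET_MARGINAL:
        p->state = PORT_MARGINAL;
        p->timed = false; /* needs a link bounce or a manual reset */
        break;
    case EV_CLEAR_MARGINAL:
    case EV_LINK_BOUNCE:
        p->state = PORT_RUNNING;
        p->timed = false;
        break;
    }
}

/* Periodic check: a timed marginal state falls back to running on expiry. */
void port_tick(struct port *p, long now)
{
    if (p->state == PORT_MARGINAL && p->timed && now >= p->deadline) {
        p->state = PORT_RUNNING;
        p->timed = false;
    }
}
```

With this, an EV_FPIN_CN that is not refreshed within tmo seconds drops
back to 'running' on the next port_tick(), while an EV_FPIN_LI stays
'marginal' until EV_LINK_BOUNCE or EV_CLEAR_MARGINAL arrives.
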
But still I'm in favour of the 'marginal' state; that one could be set
manually (or by an FPIN-LI event), and would need to be reset either
manually or by link reset.
And it has the advantage of being easier to implement :-)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Kernel Storage Architect
hare@xxxxxxx                   +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer