On Sat, Jun 05, 2021 at 01:33:07AM +0100, Russell King (Oracle) wrote:
> On Sat, Jun 05, 2021 at 01:34:55AM +0200, Pali Rohár wrote:
> > But as this is really confusing what each mode means for Linux, I would
> > suggest that documentation for these modes in ethernet-controller.yaml
> > file (or in any other location) could be extended. I see that it is
> > really hard to find exact information what these modes mean and what is
> > their meaning in DTS / kernel.
>
> We have been adding documentation to:
>
> Documentation/networking/phy.rst
>
> for each of the modes that have had issues. The 2500base-X entry
> hasn't been updated yet, as the question whether it can have in-band
> signaling is unclear (there is no well defined standard for this.)
>
> Some vendors state that there is no in-band signalling in 2500base-X.
> Others (e.g. Xilinx) make it clear that it is optional. Others don't
> say either way, and when testing hardware, it appears to be functional.
>
> So, coming up with a clear definition for this, when we have no real
> method in the DT file to say "definitely do not use in-band" is a tad
> difficult.

If you use phylink, doesn't the lack of
    managed = "in-band-status";
mean "definitely do not use in-band"?

> It started out as described - literally, 1000base-X multiplied by 2.5x.
> There are setups where that is known to work - namely GPON SFPs that
> support 2500base-X. What that means is that we know the GPON SFP
> module negotiates in-band AN with 2500base-X. However, we don't know
> whether the module will work if we disable in-band AN.

Pardon my ignorance, but what is inside a GPON ONT module? Just a laser
and some amplifiers? So it would still be the MAC PCS negotiating flow
control with the remote link partner? That's a different use case from a
PHY transmitting the negotiated link modes to the MAC.

> There is hardware out there as well which allows one to decide whether
> to use in-band AN with 2500base-X or not.
> Xilinx is one such vendor who explicitly documents this. Marvell on
> the other hand do not prohibit in-band AN with mvneta, neither do they
> explicitly state it is permitted. In at least one of their PHY
> documents, they suggest it isn't supported if the MAC side is
> operating in 2500base-X.
>
> Others (NXP) take the position that in-band AN is not supported at
> 2500base-X speeds. I think a few months ago, Vladimir persuaded me
> that we should disable in-band AN for 2500base-X - I had forgotten
> about the Xilinx documentation I had which shows that it's optional.
> (Practically, it's optional in hardware with 1000base-X too, but then
> it's not actually conforming with 802.3's definition of 1000base-X.)

I don't think it is me who persuaded you, but rather the reality exposed
by Marek Behun that the Marvell switches and PHYs don't support clause 37
in-band AN either, at least when connected to one another; only the
mvneta appears to do something with the GPON modules:
https://lore.kernel.org/netdev/20210113011823.3e407b31@xxxxxxxxxx/

I tend to agree, though. We should prevent in-band AN from being
requested on implementations where we know it will not work. That
includes any NXP products.

In the case of DPAA1, this uses the same PCS block as DPAA2, ENETC,
Felix and Seville, so if it were to use phylink and the common
drivers/net/pcs/pcs-lynx.c driver, then the comment near the
lynx_pcs_link_up_2500basex() function would equally apply to it too
(I hope this answers Pali's original question).

> The result is, essentially, a total mess. 2500base-X is not a standards
> defined thing, so different vendors have gone off and done different
> things.

Correct, I cannot find any document mentioning what 2500base-x is
either, while I can find documents mentioning SGMII at 2500Mbps.
https://patents.google.com/patent/US7356047B1/en

This Cisco patent does say a few things, like the fact that the link for
10/100/1000/2500 SGMII should operate at 3.125 Gbaud, and there should be
a rate adaptation unit separate from the PCS block, which should split a
frame into 2 segments, and say for 1Gbps, the first segment should have
its octets repeated twice, and the second segment should have its octets
repeated 3 times.

This patent also does _not_ say how the in-band autonegotiation code word
should be adapted to switch between 10/100/1000/2500. Which makes the
whole patent kind of useless as the basis for a standard for real life
products.

NXP does _not_ follow that patent (we cannot perform symbol replication
in that way, and in fact I would be surprised if anyone does, given the
lack of a way to negotiate between them), and with the limited knowledge
I currently have, that is the only thing I would call "SGMII-2500".
So Cisco "SGMII-2500" does in theory exist, but in practice it is a bit
mythical given what is currently public knowledge.

By the "genus proximum et differentia specifica" criterion, what we have
according to Linux terminology is 2500base-x (whatever that might be, we
at least know the baud rate and the coding scheme) without in-band AN.
We don't seem to have any characteristic that would make the "genus
proximum" be Cisco SGMII (i.e. we can't operate at any other speeds via
symbol replication). But that is ok given the actual use, for example we
achieve the lower speeds using PAUSE frames sent by the PHY.

> Sometimes it's amazing that you can connect two devices together and
> they will actually talk to each other!

This is not so surprising to me, if you consider the fact that these
devices were built to common sense specs communicated over email between
engineers at different companies. There aren't really that many
companies building these things.
The fact that the standards bodies haven't kept up and unified the
implementations is a different story.

I can agree that the chosen name is confusing. What it is is an
overclocked serial GMII (in the sense that it is intended as a MAC-to-PHY
link), with no intended relation to Cisco SGMII. Being intended as a
MAC-to-PHY link, clause 37 AN does not make sense because flow control is
100% managed by the PHY (negotiated over the copper side, as well as used
for rate adaptation). So there _is_ some merit in calling it something
with "serial" and "GMII" in the name, it is just describing what it is.

Using this interface type over a PHY-less fiber SFP+ module (therefore
using it to its BASE-X name) works by virtue of the fact that the
signaling/coding is compatible, but it wasn't intended that way,
otherwise it would have had support for clause 37 flow control
resolution.
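
For reference, the numbers quoted in this thread can be checked with a
few lines of arithmetic. This is only an illustrative sketch, not taken
from any driver or from the patent text: the helper names are
hypothetical, it assumes 8b/10b coding (which is what gives 1000base-X
its 1.25 Gbaud line rate), and it assumes the two segments described in
the patent are equal halves of the frame, which the patent summary above
does not actually state.

```python
# Illustrative arithmetic only; function names are hypothetical.

def line_rate_gbaud(data_rate_gbps: float) -> float:
    """Serial line rate under 8b/10b coding: 10 line bits per 8 data bits."""
    return data_rate_gbps * 10 / 8


def replicate(frame: bytes, first: int = 2, second: int = 3) -> bytes:
    """Repeat octets `first` times in segment 1 and `second` times in segment 2.

    ASSUMPTION: the frame is split at its midpoint. The description above
    only says "2 segments", so the split point is a guess for illustration.
    """
    mid = len(frame) // 2
    out = bytearray()
    for i, octet in enumerate(frame):
        out.extend([octet] * (first if i < mid else second))
    return bytes(out)


if __name__ == "__main__":
    print(line_rate_gbaud(1.0))   # 1000base-X line rate in Gbaud
    print(line_rate_gbaud(2.5))   # the same coding clocked 2.5x faster
    frame = bytes(range(64))
    # Average expansion of a 1Gbps frame carried on the 2.5Gbps link
    print(len(replicate(frame)) / len(frame))
```

With equal halves, repeating octets 2x and then 3x expands a frame by
2.5x on average, which is how a 1Gbps stream would fill a 2.5Gbps link
under that scheme. It says nothing about how the in-band code word would
advertise the speed, which is the gap pointed out above.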