On Tue, Aug 27, 2024 8:19 PM, Andrew Lunn wrote: > > > More details please. > > > > > > Linux assume it is driving the hardware. Your firmware cannot be > > > touching any registers which will clear on read. QSFP states that > > > registers 3-31 of page 0 are all clear on read, for example. The > > > firmware should also not be setting any registers, otherwise you can > > > confuse Linux which assumes registers it set stay set, because it is > > > controlling the hardware. > > > > > > Your firmware also needs to handle that Linux can change the page. If > > > the firmware changes the page, it must restore it back to whatever > > > page Linux selected, etc. > > > > > > The fact you are submitting this for net suggests you have seen real > > > issues. Please describe what those issues are. > > > > The error log shows: > > > > [257681.367827] sfp sfp.1025: Host maximum power 1.0W > > [257681.370813] txgbe 0000:04:00.1: 31.504 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x4 link at 0000:02:02.0 (capable > > of 63.008 Gb/s with 8.0 GT/s PCIe x8 link) > > [257681.373364] txgbe 0000:04:00.1 enp4s0f1: renamed from eth0 > > [257681.434719] txgbe 0000:04:00.1 enp4s0f1: configuring for inband/10gbase-r link mode > > [257681.676747] sfp sfp.1025: EEPROM base structure checksum failure: 0x63 != 0x1f > > [257681.676755] sfp EE: 00000000: 03 04 07 10 00 00 01 00 00 00 00 06 67 02 00 00 ............g... > > [257681.676757] sfp EE: 00000010: 1e 0f 00 00 46 69 62 65 72 53 74 6f 72 65 20 64 ....FiberStore d > > [257681.676759] sfp EE: 00000020: 20 20 20 20 00 00 1b 21 53 46 50 2d 31 30 47 53 ...!SFP-10GS > > [257681.676760] sfp EE: 00000030: 52 2d 38 35 20 20 20 20 41 20 20 20 03 52 00 1f R-85 A .R.. > > [257681.676762] sfp EE: 00000040: 00 81 cd 5b df 25 0a bd 40 f6 c6 ce 47 8e ff ff ...[.%..@...G... > > [257681.676763] sfp EE: 00000050: 10 d8 24 33 44 8e ff ff 10 41 b0 9a ff ff ff ff ..$3D....A...... > > > > It looks like some fields are read incorrectly. For comparison, I printed the > > SFP info when it loaded correctly: > > > > [260908.194533] sfp EE: 00000000: 03 04 07 10 00 00 01 00 00 00 00 06 67 02 00 00 ............g... > > [260908.194536] sfp EE: 00000010: 1e 0f 00 00 46 69 62 65 72 53 74 6f 72 65 20 20 ....FiberStore > > [260908.194538] sfp EE: 00000020: 20 20 20 20 00 00 1b 21 53 46 50 2d 31 30 47 53 ...!SFP-10GS > > [260908.194540] sfp EE: 00000030: 52 2d 38 35 20 20 20 20 41 20 20 20 03 52 00 1f R-85 A .R.. > > [260908.194541] sfp EE: 00000040: 40 63 bd df 40 8e ff ff 10 41 b0 9a ff ff ff ff @c..@....A...... > > [260908.194543] sfp EE: 00000050: 10 58 5b 29 41 8e ff ff 10 41 b0 9a ff ff ff ff .X[)A....A...... > > [260908.198205] sfp sfp.1025: module FiberStore SFP-10GSR-85 rev A sn G1804125607 dc 180605 > > > > Since the read mechanism of I2C is to write the offset and read command > > first, and then read the target address. I think it's possible that the different > > offsets be written at the same time, from Linux and firmware. > > O.K, that is bad. The SFP is totally unreliable... > > You however have still not answered my question. What is the firmware > accessing? How does it handle pages? > > The hack you have put in place is per i2c transaction. But accessing > pages is likely to be multiple transactions. One to change the page, > followed by a few reads/writes in the new page, then maybe followed by > a transactions to return to page 0. Do you mean the bus address A0 or A2? Firmware accesses I2C just like driver, but it only change the page once per full transaction, during a possession of the semaphore. What you fear seems unlikely to happen. > I think your best solution is to simply take the mutex and never > release it. Block your firmware from accessing the SFP. Firmware accesses the SFP in order to provide information to the BMC. So it cannot simply be blocked.