On 28.08.2023 17:10, Jean Delvare wrote: > On Mon, 28 Aug 2023 15:27:47 +0200, Jean Delvare wrote: >> On Sun, 27 Aug 2023 19:14:38 +0200, Heiner Kallweit wrote: >>> We set SMBHSTCNT_LAST_BYTE whilst the host is receiving the last byte. >>> Apparently the host checks for SMBHSTCNT_LAST_BYTE once it received >>> a byte, in order to determine whether to ack the byte or not. >>> So SMBHSTCNT_LAST_BYTE doesn't have to be set before the host starts >>> receiving the last byte. >> >> How is this not racy? >> >> In the interrupt-driven case, at the end of a block read transaction, >> we set SMBHSTCNT_LAST_BYTE at the end of i801_isr_byte_done(), then >> return to i801_isr() where we write 1 to SMBHSTSTS_BYTE_DONE to clear >> it. This lets the controller handle the last byte with the knowledge >> that this is the last byte. >> >> However, in the poll-driven case, SMBHSTSTS_BYTE_DONE is being cleared >> at the end of the loop in i801_block_transaction_byte_by_byte(), then >> at the beginning of the next iteration, we write SMBHSTCNT_LAST_BYTE, >> then wait for completion. If the controller is super fast (or, to be >> more realistic, the i2c-i801 driver gets preempted between writing >> SMBHSTSTS_BYTE_DONE and writing SMBHSTCNT_LAST_BYTE) then the byte may >> have been already read and acked, before we have the time to let the >> controller know that no ACK should be sent. This looks racy. Am I >> missing something? > > I made a little experiment which, I think, proves my point. > > Firstly, I loaded the i2c-i801 driver with disable_features=0x12, to > make sure the poll-based byte-by-byte code path is used. The EEPROM > data starts with: > > 0 1 2 3 4 5 6 7 8 9 a b c d e f > 00: 92 13 0b 02 04 21 02 01 03 52 01 08 0a 00 fc 00 > > The following commands read the first 4 bytes with an I2C Block Read > command, then fetch the next byte from the EEPROM (without specifying > the offset): > > # /usr/local/sbin/i2cget -y 4 0x53 0 i 4 ; /usr/local/sbin/i2cget -y 4 0x53 > 0x92 0x13 0x0b 0x02 > 0x04 > > As you can see, I get the 5 first bytes of the EEPROM, as expected. > > Then I added an arbitrary delay in the driver where I think the race > condition exists: > > --- a/drivers/i2c/busses/i2c-i801.c > +++ b/drivers/i2c/busses/i2c-i801.c > @@ -684,8 +684,10 @@ static int i801_block_transaction_byte_b > > if (i == 1) > outb_p(smbcmd | SMBHSTCNT_START, SMBHSTCNT(priv)); > - else if (smbcmd & SMBHSTCNT_LAST_BYTE) > + else if (smbcmd & SMBHSTCNT_LAST_BYTE) { > + usleep_range(10000, 20000); > outb_p(smbcmd, SMBHSTCNT(priv)); > + } > > status = i801_wait_byte_done(priv); > if (status) > > I loaded the modified i2c-i801 driver, still with > disable_features=0x12. Running the same commands again, I now get: > > # /usr/local/sbin/i2cget -y 4 0x53 0 i 4 ; /usr/local/sbin/i2cget -y 4 0x53 > 0x92 0x13 0x0b 0x02 > 0x21 > > So I get EEPROM bytes 0-3 then it jumps to offset 5 directly. This > means that the EEPROM started sending the byte at offset 4 at the end > of the I2C Block Read transfer, due to the controller sending an ACK > after the byte at offset 3. > > For the code to be safe, we need to set SMBHSTCNT_LAST_BYTE *before* > clearing SMBHSTSTS_BYTE_DONE. > I think you're right with the preemption scenario. I'll submit a fix for it. > Note: the transfers do not fail, only the internal register pointer of > the EEPROM gets screwed, so this is probably not an issue for devices > which don't rely on an internal register pointer, only EEPROM-like > devices are affected. >