From: Chris Packham
> Sent: 14 March 2021 21:26
>
> On 12/03/21 10:25 pm, David Laight wrote:
> > From: Linuxppc-dev Guenter Roeck
> >> Sent: 11 March 2021 21:35
> >>
> >> On 3/11/21 1:17 PM, Chris Packham wrote:
> >>> On 11/03/21 9:18 pm, Wolfram Sang wrote:
> >>>>> Bummer. What is really weird is that you see clock stretching under
> >>>>> CPU load. Normally clock stretching is triggered by the device, not
> >>>>> by the host.
> >>>> One example: Some hosts need an interrupt per byte to know if they
> >>>> should send ACK or NACK. If that interrupt is delayed, they stretch the
> >>>> clock.
> >>>>
> >>> It feels like something like that is happening. Looking at the T2080
> >>> Reference manual there is an interesting timing diagram (Figure 14-2 if
> >>> someone feels like looking it up). It shows SCL low between the ACK for
> >>> the address and the data byte. I think if we're delayed in sending the
> >>> next byte we could violate Ttimeout or Tlow:mext from the SMBUS spec.
> >>>
> >> I think that really leaves you only two options that I can see:
> >> Rework the driver to handle critical actions (such as setting TXAK,
> >> and everything else that might result in clock stretching) in the
> >> interrupt handler, or rework the driver to handle everything in
> >> a high priority kernel thread.
> >
> > I'm not sure a high priority kernel thread will help.
> > Without CONFIG_PREEMPT (which has its own set of nasties)
> > a RT process won't be scheduled until the processor it last
> > ran on does a reschedule.
> > I don't think a kernel thread will be any different from a
> > user process running under the RT scheduler.
> >
> > I'm trying to remember the smbus spec (without remembering the I2C one).
>
> For those following along the spec is available here[0]. I know there's
> a 3.0 version[1] as well but the devices I'm dealing with are from a 2.0
> vintage.
>
> > While basically a clock+data bit-bang the slave is allowed to drive
> > the clock low to extend a cycle.
> > It may be allowed to do this at any point?
>
> From what I can see it's actually the master extending the clock. Or
> more accurately holding it low between the address and data bytes (which
> from the T2080 reference manual looks expected). I think this may cause
> a strictly compliant SMBUS device to determine that Tlow:mext has been
> violated.

Yes, the spec does seem to assume that if a signal is stable for 20ms
something has gone 'horribly wrong'.
I wasn't worried about that, our fpga does the whole transaction as
a single command.
None of our slaves generate interrupts - so it is purely master/slave.

If you run your process under the RT scheduler it is unlikely
that pre-emption will be delayed by long enough to stop the process
running for 10ms.
I've seen >1ms delays (testing RTP audio), but most of the long
loops have a cond_resched() in them.

...
> Probably depends on the device implementation. I've got multiple other
> I2C/SMBUS devices and the LM81 seems to be the one that objects.

I bet most don't implement any of the timeouts.
I found one interesting pmbus device.
Sometimes it would detect a STOP condition because the data
line went high when it tri-stated its output driver in response
to the rising clock edge!
So it saw the same clock edge twice.

> [0] - http://www.smbus.org/specs/smbus20.pdf
> [1] - https://pmbus.org/Assets/PDFS/Public/SMBus_3_0_20141220.pdf

I should have both those - I've copied them to the directory where
I'd look for them first!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
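
For reference, the two timeouts discussed in this thread can be written
down as a small check. This is only a sketch: the macro and function names
are hypothetical, and the limits (Tlow:mext of 10ms maximum, Ttimeout of
25ms minimum) are the values from the SMBus 2.0 timing table as I read it,
so check them against the spec[0] before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed SMBus 2.0 limits, in microseconds (verify against the spec). */
#define SMBUS_TLOW_MEXT_MAX_US   10000u  /* max cumulative master clock-low extension per byte */
#define SMBUS_TTIMEOUT_MIN_US    25000u  /* a device may time out once SCL is low this long */

/* True if the master held SCL low for longer than Tlow:mext within one byte. */
static bool smbus_tlow_mext_violated(uint32_t master_low_us)
{
	return master_low_us > SMBUS_TLOW_MEXT_MAX_US;
}

/* True if SCL has been low long enough that a compliant device may reset. */
static bool smbus_device_may_timeout(uint32_t scl_low_us)
{
	return scl_low_us >= SMBUS_TTIMEOUT_MIN_US;
}
```

With these numbers a 1ms scheduling delay between the address and data
bytes is harmless, while a delay of more than 10ms already exceeds
Tlow:mext even though it is still well short of Ttimeout.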