On Tue, 15 Sep 2020, at 02:13, Guenter Roeck wrote:
> On 9/14/20 5:28 AM, Andrew Jeffery wrote:
> > Hello,
> >
> > While working with system designs making use of TI's UCD90320 Power
> > Sequencer we've found that communication with the device isn't terribly
> > reliable.
> >
> > It appears that back-to-back transfers where commands addressed to the
> > device are put onto the bus with intervals between STOP and START in the
> > neighbourhood of 250us or less can cause bad behaviour. This primarily
> > happens during driver probe while scanning the device to determine its
> > capabilities.
> >
> > We have observed the device causing excessive clock stretches and bus
> > lockups, and also corruption of the device's volatile state (requiring it
> > to be reset). The latter is particularly disruptive in that the controlled
> > rails are brought down either by:
> >
> > 1. The corruption causing a fault condition, or
> > 2. Asserting the device's reset line to recover
> >
> > A further observation is that pacing transfers to the device appears to
> > mitigate the bad behaviour. We're in discussion with TI to better
> > understand the limitations and at least get the behaviour documented.
> >
> > This short series implements the mitigation in terms of a throttle in the
> > i2c_client associated with the device's driver. Before the first
> > communication with the device in the probe() of ucd9000 we configure the
> > i2c_client to throttle transfers with a minimum of a 1ms delay (with the
> > delay exposed as a module parameter).
> >
> > The series is RFC for several reasons:
> >
> > The first is to suss out feelings on the general direction. The problem is
> > pretty unfortunate - are there better ways to implement the mitigation?
> >
> > If there aren't, then:
> >
> > I'd like thoughts on whether we want to account for i2c-dev clients.
> > Implementing throttling in i2c_client feels like a solution-by-proxy as the
> > throttling is really a property of the targeted device, but we don't have a
> > coherent representation between platform devices and devices associated
> > with i2c-dev clients. At the moment we'd have to resort to address-based
> > lookups for platform data stashed in the transfer functions.
> >
> > Next is that I've only implemented throttling for SMBus devices. I don't
> > yet have a use-case for throttling non-SMBus devices so I'm not sure it's
> > worth poking at it, but would appreciate thoughts there.
> >
> > Further, I've had a bit of a stab at dealing with atomic transfers that's
> > not been tested. Hopefully it makes sense.
> >
> > Finally I'm also interested in feedback on exposing the control in a little
> > more general manner than having to implement a module parameter in all
> > drivers that want to take advantage of throttling. This isn't a big problem
> > at the moment, but if anyone has thoughts there then I'm happy to poke at
> > those too.
> >
>
> As mentioned in patch 2/2, I don't think a module parameter is a good idea.
> I think this should be implemented on driver level, similar to zl6100.c,
> it should be limited to affected devices and not be user controllable.
>
> In respect to implementation in the i2c core vs in drivers: So far we
> encountered this problem for some Zilker labs devices and for some LTC
> devices. While the solution needed here looks similar to the solution
> implemented for Zilker labs devices, the solution for LTC devices is
> different. I am not sure if an implementation in the i2c core is
> desirable. It looks quite invasive to me, and it won't solve the problem
> for all devices since it isn't always a simple "wait <n> microseconds
> between accesses". For example, some devices may require a wait after
> a write but not after a read, or a wait only after certain commands (such
> as commands writing to an EEPROM).
> Other devices may require a mechanism
> different to "wait a certain period of time". It seems all but impossible
> to implement a generic mechanism on i2c level.

So I think it could be handled with an optional i2c client callback, e.g.:

struct i2c_client {
	...
	bool (*prepare_device)(const struct i2c_client *client);
};

This way the logic to delay is kept inside the driver, catering to both the
Zilker and the LTC devices. If the problem exists only after specific
operations then we can stash some state in the client in the same way I've
done in patch 1, test that state in the callback and only do the
"preparation" if it's necessary.

I can knock that up and post another RFC, just so we can get a feel for how
that solution looks.

Andrew
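
To make the callback idea above a little more concrete, here is a rough
userspace mock-up of how it might fit together. It is not kernel code and not
the eventual patch: struct mock_client, throttle_prepare(), mock_transfer()
and the simulated clock fake_now_us are all hypothetical names invented for
illustration. Only the prepare_device callback name and signature come from
the proposal, and the minimum-gap policy mirrors the "pace the transfers"
mitigation described in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated clock in microseconds; stands in for real time in this mock. */
static uint64_t fake_now_us;

/* Hypothetical userspace stand-in for struct i2c_client; not kernel code. */
struct mock_client {
	uint64_t last_xfer_us;	/* when the previous transfer finished */
	uint64_t min_gap_us;	/* spacing the device needs; 0 means none */
	bool xferred;		/* false until the first transfer completes */
	/* Optional hook, run by the "core" before every transfer. */
	bool (*prepare_device)(const struct mock_client *client);
};

/* Driver-side policy: wait out whatever remains of the minimum gap. */
static bool throttle_prepare(const struct mock_client *client)
{
	uint64_t elapsed;

	if (!client->xferred)
		return true;	/* nothing to pace against yet */

	elapsed = fake_now_us - client->last_xfer_us;
	if (elapsed < client->min_gap_us)
		fake_now_us += client->min_gap_us - elapsed;	/* "sleep" */

	return true;
}

/* Core-side transfer path: call the hook only if the driver set one. */
static int mock_transfer(struct mock_client *client, uint64_t duration_us)
{
	assert(client != NULL);

	if (client->prepare_device && !client->prepare_device(client))
		return -1;	/* device not ready, transfer refused */

	fake_now_us += duration_us;	/* the transfer itself takes time */
	client->last_xfer_us = fake_now_us;
	client->xferred = true;
	return 0;
}
```

A driver for an affected device would set prepare_device and min_gap_us in
its probe path, and drivers that leave the hook NULL see no change in the
transfer path. A device that only needs a pause after certain commands could
record that in the client state and have its callback return early otherwise.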