On Fri, Mar 13, 2015 at 09:03:33AM +0100, Alexander Sverdlin wrote: > Hello! > > On 12/03/15 14:16, ext Grygorii.Strashko@xxxxxxxxxx wrote: > >> There is one big problem in the current design: ISR accesses the controller > >> > registers in parallel with i2c_davinci_xfer_msg() in process context. The whole > >> > logic is not obvious, many operations are performed in process context while > >> > ISR is always enabled and does something asynchronous even while it's not > >> > expected. We have faced these races on 4-cores Keystone chip. Some examples: > >> > > >> > - when non-existing device is accessed first comes NAK IRQ, then ARDY IRQ. After > >> > NAK we already jump out of wait_for_completion_timeout() and depending on how > >> > lucky we are ARDY IRQ will access MDR register in the middle of some other > >> > operation in process context; > >> > > >> > - STOP condition is triggered in many places in the driver, in ISR, in > >> > i2c_davinci_xfer_msg(), but there is no code which guarantees that STOP will > >> > be really completed. We have seen many STOP conditions simply missing in > >> > back-to-back transfers, when next i2c_davinci_xfer_msg() call simply overwrites > >> > MDR register while STOP is still not generated. > >> > > >> > So, make the design more robust and obvious: > >> > - leave hot path (buffers management) in ISR, remove other registers access from > >> > ISR; > >> > - introduce second synchronization point, to make sure that STOP condition is > >> > really generated and it's safe to begin next transfer; > >> > - simplify the state machine; > >> > - enable IRQs only when they are expected, disable them in ISR when transfer is > >> > completed/failed; > >> > - STOP is normally set simultaneously with START condition (in case of last > >> > message); only special case when STOP is additionally generated is received NAK > >> > -- this case is handled separately. > > I'm not sure about this change (- It's too significant and definitely will need more review & Tested-by. > > Maybe you can offer this patch the customers who suddenly cannot access the devices on the bus until reboot... > Because it's not a "bus lockup". > > > We need to be careful with it, especially taking into account DAVINCI_I2C_MDR_RM mode and future > > changes like https://lkml.org/lkml/2014/5/1/348. > > > > May be you can split it? > > I can may be split it into two patches, but I'm not sure about the result. Each of them will only solve > 50% of the problem and then nobody will see a clear benefit applying only one. But what I can offer you is > to spend the same effort on rebasing the "DAVINCI_I2C_MDR_RM mode" patch you are referring to. I can rebase > it and take it into my series if you wish. So, shall I take this into i2c/for-next?
Attachment:
signature.asc
Description: Digital signature