Re: I2C bus driver TIMEDOUT because of PM autosuspend

On 3. 12. 19 02:30, anish singh wrote:
On Fri, Nov 29, 2019 at 12:53 PM Primoz Beltram
<primoz.beltram@xxxxxxxxx> wrote:
I am analysing a problem with an I2C bus driver where the I2C bus ends
up completely blocked. The Linux driver in question is
/drivers/i2c/busses/i2c-xiic.c.
The problem is difficult to reproduce; it happens very rarely. So far I
have seen that the main precondition is very heavy I2C traffic on the bus.
In my case this is achieved/reproduced via netdev driving SFP LEDs via
/sys/class/leds/ (via gpio-pca953x). I generate traffic with iperf3.
Network traffic is on a 10Gbps EMAC. The Linux kernel is 4.14.0.
What I saw from debugging this problem is that the I2C bus gets blocked
when wait_event_timeout() completes because of a timeout. The timeout
handling in this driver is probably not robust enough (the bus should not
remain blocked), but at this moment these are just my speculations (I
don't know enough details).
Check with a Saleae logic analyzer what happens on the I2C bus.

Looking at the driver code and at the bus on an oscilloscope, I saw that
SCL within a single I2C data transfer sequence can be interrupted for
very long delays, e.g. up to hundreds of usec (SCL is 100kHz). I started
to suspect that the PM autosuspend delay could play some role here. There
are only two delays in the driver code, the first in wait_event_timeout()
and the second in setting the autosuspend delay. The case is a bit strange
because with very busy I2C traffic, PM autosuspend should not be triggered
at all. Additionally, if I lower the PM timeout, e.g. from 1000 (default)
to 100, I hit the problem sooner (the wait until the problem hits is on
the order of n*10 minutes).
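
To put the figures above in proportion, here is a small user-space C
sketch (not driver code) that converts between time and SCL clock
periods. The function names are made up for illustration, and the
9-clocks-per-byte figure (8 data bits plus ACK) ignores START/STOP
conditions and clock stretching.

```c
#include <stdint.h>

/* Length of one SCL clock period in nanoseconds for a bus clock in Hz. */
uint32_t scl_period_ns(uint32_t scl_hz)
{
    return 1000000000u / scl_hz;
}

/* Number of whole SCL periods that fit into a stall of stall_us microseconds. */
uint32_t stall_in_scl_periods(uint32_t stall_us, uint32_t scl_hz)
{
    return (uint32_t)(((uint64_t)stall_us * scl_hz) / 1000000u);
}

/*
 * Approximate time on the wire, in microseconds, for nbytes bytes.
 * Assumes 9 SCL periods per byte (8 data bits + ACK) and nothing else.
 */
uint32_t bytes_on_wire_us(uint32_t nbytes, uint32_t scl_hz)
{
    return (uint32_t)(((uint64_t)nbytes * 9u * 1000000u) / scl_hz);
}
```

At 100 kHz one SCL period is 10 usec, so a stall of, say, 300 usec
(stall_in_scl_periods(300, 100000)) corresponds to 30 missing clock
periods. If the value 1000 is the millisecond argument given to
pm_runtime_set_autosuspend_delay(), that delay spans about 100000 SCL
periods, which is several orders of magnitude longer than the stalls seen
on the scope.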

It looks to me that PM autosuspend is playing some role here.

Power management options in my .config:
# CONFIG_SUSPEND is not set
# CONFIG_PM is not set
CONFIG_ARCH_SUSPEND_POSSIBLE=y

I intentionally did not put all the detailed descriptions of the embedded
system and test setup here (long list), because the main reason for this
post is:

The workaround that works for me/customer (at the moment) is to take PM
autosuspend out of play in the driver code, either by increasing the PM
delay from 1000 to 10000 or by disabling autosuspend (commenting out the
call to pm_runtime_put_autosuspend() in xiic_xfer()).

But I would like to raise and discuss this issue with the maintainer of
the code, or with others.
The reason/source of the problem can be much more complex and in some
other place.

So my question is whom I should contact: is it the M: entry in the
MAINTAINERS file, the MODULE_AUTHOR, ...?
You can certainly add the author in the loop, but I am afraid
you won't get any help as this would be specific to your board. So
the best option is to check with the SoC vendor who has written your I2C
bus driver. Or it could be an issue with your I2C client; in that
case show them your Saleae logic analyzer logs to see
if they can figure out anything wrong.

Thanks for the reply and suggestions.

My first suspicion was signal integrity on the PCB, but if I add some
debug prints to the i2c-xiic driver (e.g. build with DEBUG defined), the
problem is no longer reproducible (not a single timeout completion in
wait_event_timeout()).

A signal integrity problem therefore does not look credible to me.

For my system I fixed the problem in the i2c-xiic driver (in the timeout
handling, so that it does not leave the bus blocked).
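
The actual change is not shown in this thread. Purely as an illustration
of the idea (return to a clean idle state on timeout instead of staying
busy), here is a user-space C model. All names (model_bus,
model_bus_reinit, ...) are hypothetical and none of this is i2c-xiic
code; the only kernel fact it relies on is that wait_event_timeout()
returns 0 when the timeout elapsed with the condition still false.

```c
#include <stddef.h>
#include <errno.h>

enum bus_state { BUS_IDLE, BUS_BUSY };

/* Hypothetical software view of a bus controller. */
struct model_bus {
    enum bus_state state;
    const void *pending_msg;  /* message in flight, NULL when none */
    unsigned int resets;      /* how often the controller was re-initialised */
};

/* Stand-in for re-initialising the controller and dropping stale state. */
void model_bus_reinit(struct model_bus *bus)
{
    bus->state = BUS_IDLE;
    bus->pending_msg = NULL;
    bus->resets++;
}

/* Begin a transfer; refuses while a previous one is still marked busy. */
int model_bus_start(struct model_bus *bus, const void *msg)
{
    if (bus->state != BUS_IDLE)
        return -EBUSY;
    bus->state = BUS_BUSY;
    bus->pending_msg = msg;
    return 0;
}

/*
 * Finish a transfer. 'remaining' mimics the wait_event_timeout() return
 * value: 0 means the wait timed out. On timeout the model re-initialises
 * the controller so the next transfer does not see a busy bus.
 */
int model_bus_finish(struct model_bus *bus, long remaining)
{
    if (remaining == 0) {
        model_bus_reinit(bus);
        return -ETIMEDOUT;
    }
    bus->state = BUS_IDLE;
    bus->pending_msg = NULL;
    return 0;
}
```

Without the model_bus_reinit() call in the timeout branch, every later
model_bus_start() would return -EBUSY, which is the "bus completely
blocked" symptom in this model.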

I also found a contact and filed a report with the SoC vendor.

WBR Primoz
How to proceed.

WBR Primoz


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


