TWIMC: this mail is primarily send for documentation purposes and for regzbot, my Linux kernel regression tracking bot. These mails usually contain '#forregzbot' in the subject, to make them easy to spot and filter. On 08.09.22 15:59, Peter Rosin wrote: > 2022-09-08 at 14:06, Thorsten Leemhuis wrote: >> Hi, this is your Linux kernel regression tracker. Top-posting for once, >> to make this easily accessible to everyone. >> >> Peter, Codrin, could you help me out here please: I still have the >> regression report from Peter that started this thread in the list of >> tracked issues. From Peter's last msg quoted below it seems the thread >> just faded out without the regression being fixed. Or was it? If not: >> what can we do to finally get this resolved? > > No, it is not resolved that I know of. We are only writing during > production, but are working around it by verifying and looping back. > Sometimes it takes surprisingly long for the loop to finish, but > it's not a huge deal. But it is of course not completely satisfying > either... > > Reading is never a problem, so post-production behavior is sane. FWIW, in that case: #regzbot backburner: a proper fix will take some time and reporter has a work-around >> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) >> >> P.S.: As the Linux kernel's regression tracker I deal with a lot of >> reports and sometimes miss something important when writing mails like >> this. If that's the case here, don't hesitate to tell me in a public >> reply, it's in everyone's interest to set the public record straight. >> >> On 30.06.22 09:44, Peter Rosin wrote: >>> 2022-06-10 at 22:51, Peter Rosin wrote: >>>> 2022-06-10 at 09:35, Codrin.Ciubotariu@xxxxxxxxxxxxx wrote: >>>>> On 09.06.2022 17:28, Peter Rosin wrote: >>>>>> >>>>>> I have not actually bisected this issue but reverting the effects of >>>>>> patch a4bd8da893a3 ("ARM: dts: at91: sama5d3: add i2c gpio pinctrl") >>>>>> makes the problem go away. >>>>>> >>>>>> I.e. I need something like this in my dts >>>>>> >>>>>> &i2c2 { >>>>>> status = "okay"; >>>>>> >>>>>> pinctrl-names = "default"; >>>>>> /delete-property/ pinctrl-1; >>>>>> /delete-property/ sda-gpios; >>>>>> /delete-property/ scl-gpios; >>>>>> >>>>>> eeprom@50 { >>>>>> compatible = "st,24c64", "atmel,24c64"; >>>>>> reg = <0x50>; >>>>>> wp-gpios = <&filter_gpio 7 GPIO_ACTIVE_HIGH>; >>>>>> }; >>>>>> }; >>>>>> >>>>>> for multi-page eeprom writes to not time out (a page is 32 bytes on this >>>>>> eeprom). >>>>>> >>>>>> For reference, the current defaults for this SoC/I2C-bus, that I modify, >>>>>> are: >>>>>> >>>>>> pinctrl-names = "default", "gpio"; >>>>>> pinctrl-0 = <&pinctrl_i2c2>; >>>>>> pinctrl-1 = <&pinctrl_i2c2_gpio>; >>>>>> sda-gpios = <&pioA 18 GPIO_ACTIVE_HIGH>; >>>>>> scl-gpios = <&pioA 19 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>; >>>>>> >>>>>> I suspect that the underlying reason is that the bus recovery takes >>>>>> too long and that the at24 eeprom driver gives up prematurely. I doubt >>>>>> that this is chip specific, but I don't know that. >>>>>> >>>>>> I can work around the issue in user space with by writing in 4 byte >>>>>> chunks, like so >>>>>> >>>>>> dd if=source.file of=/sys/bus/i2c/devices/2-0050/eeprom obs=4 >>>>>> >>>>>> but that is really ugly and gets slow too, about 20 seconds to program >>>>>> the full 8kB eeprom. With the above in my dts it takes a second or >>>>>> so (a bit more with dynamic debug active). >>>>>> >>>>>> >>>>>> If I run >>>>>> >>>>>> dd if=source.file of=/sys/bus/i2c/devices/2-0050/eeprom >>>>>> >>>>>> with a source.file of 8kB and the upstream dts properties in place, I can >>>>>> collect the following debug output from at24, i2c-core and i2c-at91: >>>>>> >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@0 --> 0 (-23170) >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@32 --> -121 (-23169) >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@32 --> 0 (-23168) >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@64 --> -121 (-23168) >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@64 --> 0 (-23167) >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@96 --> -121 (-23167) >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: controller timed out >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@96 --> -110 (-23155) >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: controller timed out >>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery >>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@96 --> -110 (-23143) >>>>>> >>>>>> And then there is no more action. I.e. only a couple of 32 byte pages >>>>>> are written. >>>>>> >>>>>> With the above mentioned dts override in place I instead get this, which is >>>>>> a lot more sensible: >>>>>> >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@0 --> 0 (753629) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@32 --> -121 (753629) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@32 --> 0 (753630) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@64 --> -121 (753630) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@64 --> 0 (753631) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@96 --> -121 (753631) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@96 --> 0 (753632) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@128 --> -121 (753632) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@128 --> 0 (753633) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@160 --> -121 (753633) >>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@160 --> 0 (753634) >>>>>> ... snip ... >>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8128 --> -121 (753883) >>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8128 --> 0 (753884) >>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: received nack >>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8160 --> -121 (753884) >>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages: >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes. >>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer complete >>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8160 --> 0 (753885) >>>>> >>>>> could you please apply this patch-set [1] and let us know if it >>>>> addresses your issue? >>>>> >>>>> Thanks and best regards, >>>>> Codrin >>>>> >>>>> https://patchwork.ozlabs.org/project/linux-i2c/list/?series=255408 >>>> >>>> That series does indeed help! I'll reply with a tested-by etc on the >>>> first two patches, I can't test patch 3/3 with my sama5d3 board... >>>> >>>> Thank you very much! >>> >>> Since replying to the actual patches do not work for me, I'm writing here >>> instead. Sorry about that. As stated above, it /seems/ to work much better >>> with these patches. But I fooled myself and there is still some remaining >>> trouble. It is not uncommon that the second (32-byte) page in the eeprom >>> is not written correctly for whatever reason. I do not know why it's >>> always the second page that gets corrupted, but this is a bad problem since >>> the failure is completely silent. >>> >>> Cheers, >>> Peter >> >> #regzbot poke >