Hi, Thorsten here, the Linux kernel's regression tracker. Seems this
regression is still unfixed (please correct me if I'm wrong), so I'm back
with another comment:

On 15.12.22 19:50, Conor.Dooley@xxxxxxxxxxxxx wrote:
> On 15/12/2022 17:53, Thorsten Leemhuis wrote:
>> On 08.09.22 15:59, Peter Rosin wrote:
>>> 2022-09-08 at 14:06, Thorsten Leemhuis wrote:
>>>>
>>>> Peter, Codrin, could you help me out here please: I still have the
>>>> regression report from Peter that started this thread in the list of
>>>> tracked issues. From Peter's last msg quoted below it seems the thread
>>>> just faded out without the regression being fixed. Or was it? If not:
>>>> what can we do to finally get this resolved?
>>>
>>> No, it is not resolved that I know of. We are only writing during
>>> production, but are working around it by verifying and looping back.
>>> Sometimes it takes surprisingly long for the loop to finish, but
>>> it's not a huge deal. But it is of course not completely satisfying
>>> either...
>>>
>>> Reading is never a problem, so post-production behavior is sane.
>>
>> I still have this regression that Peter reported in late July on my
>> list. :-(
>>
>> Codrin (and maybe Wolfram), could you provide an update please? Afaics
>> this is the state of things (please correct me if I'm wrong!): In an
>> earlier mail
>> (https://lore.kernel.org/lkml/38dedc92-62a2-7365-6fda-95d6404be749@xxxxxxxxxx/
>> ) of this thread Peter stated that the following patch set Codrin posted
>> mid 2021 helped:
>> https://lore.kernel.org/all/20210727111554.1338832-1-codrin.ciubotariu@xxxxxxxxxxxxx/
>
> IIRC (and I may well be wrong as it is not my neck of the woods) Codrin is
> no longer at Microchip. Nicolas, do you know who has taken over this driver?

Nicolas didn't reply afaics, but I just found that he mentioned in
https://lore.kernel.org/all/176099e2-cbff-1987-f59a-2ca618a9c92a@xxxxxxxxxxxxx/
that Codrin left. Did anyone else take over his duties and that patchset?
Or should I file this under "regressions that were bisected[1], but
nevertheless not fixed"? I'd hate to do that when patches to resolve it
are actually available and got stuck in review...

[1] to a change from Kamel Bouhara iirc

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

>> There were a few review comments from Wolfram, but looks like then
>> things stalled. Can we somehow get this rolling again to finally get
>> this regression fixed?
>>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>
>> P.S.: As the Linux kernel's regression tracker I deal with a lot of
>> reports and sometimes miss something important when writing mails like
>> this. If that's the case here, don't hesitate to tell me in a public
>> reply, it's in everyone's interest to set the public record straight.
>>
>>>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>>>
>>>> P.S.: As the Linux kernel's regression tracker I deal with a lot of
>>>> reports and sometimes miss something important when writing mails like
>>>> this. If that's the case here, don't hesitate to tell me in a public
>>>> reply, it's in everyone's interest to set the public record straight.
>>>>
>>>> On 30.06.22 09:44, Peter Rosin wrote:
>>>>> 2022-06-10 at 22:51, Peter Rosin wrote:
>>>>>> 2022-06-10 at 09:35, Codrin.Ciubotariu@xxxxxxxxxxxxx wrote:
>>>>>>> On 09.06.2022 17:28, Peter Rosin wrote:
>>>>>>>>
>>>>>>>> I have not actually bisected this issue but reverting the effects of
>>>>>>>> patch a4bd8da893a3 ("ARM: dts: at91: sama5d3: add i2c gpio pinctrl")
>>>>>>>> makes the problem go away.
>>>>>>>>
>>>>>>>> I.e. I need something like this in my dts
>>>>>>>>
>>>>>>>> &i2c2 {
>>>>>>>> 	status = "okay";
>>>>>>>>
>>>>>>>> 	pinctrl-names = "default";
>>>>>>>> 	/delete-property/ pinctrl-1;
>>>>>>>> 	/delete-property/ sda-gpios;
>>>>>>>> 	/delete-property/ scl-gpios;
>>>>>>>>
>>>>>>>> 	eeprom@50 {
>>>>>>>> 		compatible = "st,24c64", "atmel,24c64";
>>>>>>>> 		reg = <0x50>;
>>>>>>>> 		wp-gpios = <&filter_gpio 7 GPIO_ACTIVE_HIGH>;
>>>>>>>> 	};
>>>>>>>> };
>>>>>>>>
>>>>>>>> for multi-page eeprom writes to not time out (a page is 32 bytes on this
>>>>>>>> eeprom).
>>>>>>>>
>>>>>>>> For reference, the current defaults for this SoC/I2C-bus, that I modify,
>>>>>>>> are:
>>>>>>>>
>>>>>>>> 	pinctrl-names = "default", "gpio";
>>>>>>>> 	pinctrl-0 = <&pinctrl_i2c2>;
>>>>>>>> 	pinctrl-1 = <&pinctrl_i2c2_gpio>;
>>>>>>>> 	sda-gpios = <&pioA 18 GPIO_ACTIVE_HIGH>;
>>>>>>>> 	scl-gpios = <&pioA 19 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
>>>>>>>>
>>>>>>>> I suspect that the underlying reason is that the bus recovery takes
>>>>>>>> too long and that the at24 eeprom driver gives up prematurely. I doubt
>>>>>>>> that this is chip specific, but I don't know that.
>>>>>>>>
>>>>>>>> I can work around the issue in user space by writing in 4 byte
>>>>>>>> chunks, like so
>>>>>>>>
>>>>>>>> 	dd if=source.file of=/sys/bus/i2c/devices/2-0050/eeprom obs=4
>>>>>>>>
>>>>>>>> but that is really ugly and gets slow too, about 20 seconds to program
>>>>>>>> the full 8kB eeprom. With the above in my dts it takes a second or
>>>>>>>> so (a bit more with dynamic debug active).
>>>>>>>>
>>>>>>>>
>>>>>>>> If I run
>>>>>>>>
>>>>>>>> 	dd if=source.file of=/sys/bus/i2c/devices/2-0050/eeprom
>>>>>>>>
>>>>>>>> with a source.file of 8kB and the upstream dts properties in place, I can
>>>>>>>> collect the following debug output from at24, i2c-core and i2c-at91:
>>>>>>>>
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@0 --> 0 (-23170)
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@32 --> -121 (-23169)
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@32 --> 0 (-23168)
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@64 --> -121 (-23168)
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@64 --> 0 (-23167)
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@96 --> -121 (-23167)
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: controller timed out
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@96 --> -110 (-23155)
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: controller timed out
>>>>>>>> Jun 9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>>>>> Jun 9 15:56:34 me20 kernel: at24 2-0050: write 32@96 --> -110 (-23143)
>>>>>>>>
>>>>>>>> And then there is no more action. I.e. only a couple of 32 byte pages
>>>>>>>> are written.
>>>>>>>>
>>>>>>>> With the above mentioned dts override in place I instead get this, which is
>>>>>>>> a lot more sensible:
>>>>>>>>
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@0 --> 0 (753629)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@32 --> -121 (753629)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@32 --> 0 (753630)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@64 --> -121 (753630)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@64 --> 0 (753631)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@96 --> -121 (753631)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@96 --> 0 (753632)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@128 --> -121 (753632)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@128 --> 0 (753633)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@160 --> -121 (753633)
>>>>>>>> Jun 9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:53 me20 kernel: at24 2-0050: write 32@160 --> 0 (753634)
>>>>>>>> ... snip ...
>>>>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8128 --> -121 (753883)
>>>>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8128 --> 0 (753884)
>>>>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8160 --> -121 (753884)
>>>>>>>> Jun 9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>>>>> Jun 9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>>>>> Jun 9 15:48:55 me20 kernel: at24 2-0050: write 32@8160 --> 0 (753885)
>>>>>>>
>>>>>>> could you please apply this patch-set [1] and let us know if it
>>>>>>> addresses your issue?
>>>>>>>
>>>>>>> Thanks and best regards,
>>>>>>> Codrin
>>>>>>>
>>>>>>> https://patchwork.ozlabs.org/project/linux-i2c/list/?series=255408
>>>>>>
>>>>>> That series does indeed help! I'll reply with a tested-by etc on the
>>>>>> first two patches, I can't test patch 3/3 with my sama5d3 board...
>>>>>>
>>>>>> Thank you very much!
>>>>>
>>>>> Since replying to the actual patches does not work for me, I'm writing here
>>>>> instead. Sorry about that. As stated above, it /seems/ to work much better
>>>>> with these patches. But I fooled myself and there is still some remaining
>>>>> trouble. It is not uncommon that the second (32-byte) page in the eeprom
>>>>> is not written correctly for whatever reason. I do not know why it's
>>>>> always the second page that gets corrupted, but this is a bad problem since
>>>>> the failure is completely silent.
>>>>>
>>>>> Cheers,
>>>>> Peter
>>>>
>>>> #regzbot poke
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>> #regzbot poke
>
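
For illustration, the "verifying and looping back" workaround Peter mentions
above could be sketched in user space roughly as follows. This is only a
sketch under assumptions: the helper name write_with_verify, the retry count
and the delay are made up for this example and do not come from the thread;
the only detail taken from the report is the sysfs eeprom path used in the
dd commands.

```python
import time


def write_with_verify(path, data, max_attempts=10, delay=0.05):
    """Write data to path, read it back, and retry until both match.

    Returns the number of attempts that were needed. Raises OSError if
    the contents still differ after max_attempts (hypothetical limit).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # Write the whole image from offset 0; errors such as a
            # timeout from the driver surface as OSError on write/close.
            with open(path, "r+b") as f:
                f.seek(0)
                f.write(data)
            # Read back the same number of bytes for comparison.
            with open(path, "rb") as f:
                readback = f.read(len(data))
        except OSError:
            readback = None
        if readback == data:
            return attempt
        # Give the device a moment before looping back.
        time.sleep(delay)
    raise OSError(f"{path}: contents still differ after {max_attempts} attempts")
```

On the board from the report this would be called along the lines of
write_with_verify("/sys/bus/i2c/devices/2-0050/eeprom", image), with image
holding the bytes of source.file. Reading back is what catches the silent
corruption of the second page described above, at the cost of one extra
read per attempt.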