Regression: at24 eeprom writing times out on sama5d3

Peter Rosin peda at axentia.se
Thu Sep 8 06:59:10 PDT 2022


Hi!

2022-09-08 at 14:06, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker. Top-posting for once,
> to make this easily accessible to everyone.
> 
> Peter, Codrin, could you help me out here please: I still have the
> regression report from Peter that started this thread in the list of
> tracked issues. From Peter's last msg quoted below it seems the thread
> just faded out without the regression being fixed. Or was it? If not:
> what can we do to finally get this resolved?

No, it is not resolved that I know of. We are only writing during
production, but are working around it by verifying and looping back.
Sometimes it takes surprisingly long for the loop to finish, but
it's not a huge deal. But it is of course not completely satisfying
either...

Reading is never a problem, so post-production behavior is sane.

Cheers,
Peter

> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> 
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.
> 
> On 30.06.22 09:44, Peter Rosin wrote:
>> 2022-06-10 at 22:51, Peter Rosin wrote:
>>> 2022-06-10 at 09:35, Codrin.Ciubotariu at microchip.com wrote:
>>>> On 09.06.2022 17:28, Peter Rosin wrote:
>>>>>
>>>>> I have not actually bisected this issue but reverting the effects of
>>>>> patch a4bd8da893a3 ("ARM: dts: at91: sama5d3: add i2c gpio pinctrl")
>>>>> makes the problem go away.
>>>>>
>>>>> I.e. I need something like this in my dts
>>>>>
>>>>> &i2c2 {
>>>>>          status = "okay";
>>>>>
>>>>>          pinctrl-names = "default";
>>>>>          /delete-property/ pinctrl-1;
>>>>>          /delete-property/ sda-gpios;
>>>>>          /delete-property/ scl-gpios;
>>>>>
>>>>>          eeprom at 50 {
>>>>>                  compatible = "st,24c64", "atmel,24c64";
>>>>>                  reg = <0x50>;
>>>>>                  wp-gpios = <&filter_gpio 7 GPIO_ACTIVE_HIGH>;
>>>>>          };
>>>>> };
>>>>>
>>>>> for multi-page eeprom writes to not time out (a page is 32 bytes on this
>>>>> eeprom).
>>>>>
>>>>> For reference, the current defaults for this SoC/I2C-bus, that I modify,
>>>>> are:
>>>>>
>>>>>          pinctrl-names = "default", "gpio";
>>>>>          pinctrl-0 = <&pinctrl_i2c2>;
>>>>>          pinctrl-1 = <&pinctrl_i2c2_gpio>;
>>>>>          sda-gpios = <&pioA 18 GPIO_ACTIVE_HIGH>;
>>>>>          scl-gpios = <&pioA 19 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
>>>>>
>>>>> I suspect that the underlying reason is that the bus recovery takes
>>>>> too long and that the at24 eeprom driver gives up prematurely. I doubt
>>>>> that this is chip specific, but I don't know that.
>>>>>
>>>>> I can work around the issue in user space with by writing in 4 byte
>>>>> chunks, like so
>>>>>
>>>>> dd if=source.file of=/sys/bus/i2c/devices/2-0050/eeprom obs=4
>>>>>
>>>>> but that is really ugly and gets slow too, about 20 seconds to program
>>>>> the full 8kB eeprom. With the above in my dts it takes a second or
>>>>> so (a bit more with dynamic debug active).
>>>>>
>>>>>
>>>>> If I run
>>>>>
>>>>> dd if=source.file of=/sys/bus/i2c/devices/2-0050/eeprom
>>>>>
>>>>> with a source.file of 8kB and the upstream dts properties in place, I can
>>>>> collect the following debug output from at24, i2c-core and i2c-at91:
>>>>>
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 0 --> 0 (-23170)
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 32 --> -121 (-23169)
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 32 --> 0 (-23168)
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 64 --> -121 (-23168)
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 64 --> 0 (-23167)
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 96 --> -121 (-23167)
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: controller timed out
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 96 --> -110 (-23155)
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:56:34 me20 kernel: at91_i2c f801c000.i2c: controller timed out
>>>>> Jun  9 15:56:34 me20 kernel: i2c i2c-2: Trying i2c bus recovery
>>>>> Jun  9 15:56:34 me20 kernel: at24 2-0050: write 32 at 96 --> -110 (-23143)
>>>>>
>>>>> And then there is no more action. I.e. only a couple of 32 byte pages
>>>>> are written.
>>>>>
>>>>> With the above mentioned dts override in place I instead get this, which is
>>>>> a lot more sensible:
>>>>>
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 0 --> 0 (753629)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 32 --> -121 (753629)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 32 --> 0 (753630)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 64 --> -121 (753630)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 64 --> 0 (753631)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 96 --> -121 (753631)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 96 --> 0 (753632)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 128 --> -121 (753632)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 128 --> 0 (753633)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 160 --> -121 (753633)
>>>>> Jun  9 15:48:53 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:53 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:53 me20 kernel: at24 2-0050: write 32 at 160 --> 0 (753634)
>>>>> ... snip ...
>>>>> Jun  9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:48:55 me20 kernel: at24 2-0050: write 32 at 8128 --> -121 (753883)
>>>>> Jun  9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:55 me20 kernel: at24 2-0050: write 32 at 8128 --> 0 (753884)
>>>>> Jun  9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: received nack
>>>>> Jun  9 15:48:55 me20 kernel: at24 2-0050: write 32 at 8160 --> -121 (753884)
>>>>> Jun  9 15:48:55 me20 kernel: i2c i2c-2: at91_xfer: processing 1 messages:
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer: write 34 bytes.
>>>>> Jun  9 15:48:55 me20 kernel: at91_i2c f801c000.i2c: transfer complete
>>>>> Jun  9 15:48:55 me20 kernel: at24 2-0050: write 32 at 8160 --> 0 (753885)
>>>>
>>>> could you please apply this patch-set [1] and let us know if it 
>>>> addresses your issue?
>>>>
>>>> Thanks and best regards,
>>>> Codrin
>>>>
>>>> https://patchwork.ozlabs.org/project/linux-i2c/list/?series=255408
>>>
>>> That series does indeed help! I'll reply with a tested-by etc on the
>>> first two patches, I can't test patch 3/3 with my sama5d3 board...
>>>
>>> Thank you very much!
>>
>> Since replying to the actual patches do not work for me, I'm writing here
>> instead. Sorry about that. As stated above, it /seems/ to work much better
>> with these patches. But I fooled myself and there is still some remaining
>> trouble. It is not uncommon that the second (32-byte) page in the eeprom
>> is not written correctly for whatever reason. I do not know why it's
>> always the second page that gets corrupted, but this is a bad problem since
>> the failure is completely silent.
>>
>> Cheers,
>> Peter
> 
> #regzbot poke



More information about the linux-arm-kernel mailing list