[PATCH v2] i2c: rk3x: fix potential spinlock recursion on poll

Jensen Huang jensenhuang at friendlyarm.com
Fri Dec 22 00:22:26 PST 2023


On Wed, Dec 20, 2023 at 1:23 AM Wolfram Sang <wsa at kernel.org> wrote:
>
> On Thu, Dec 07, 2023 at 04:21:59PM +0800, Jensen Huang wrote:
> > Possible deadlock scenario (on reboot):
> > rk3x_i2c_xfer_common(polling)
> >     -> rk3x_i2c_wait_xfer_poll()
> >         -> rk3x_i2c_irq(0, i2c);
> >             --> spin_lock(&i2c->lock);
> >             ...
> >         <rk3x i2c interrupt>
> >         -> rk3x_i2c_irq(0, i2c);
> >             --> spin_lock(&i2c->lock); (deadlock here)
> >
> > Store the IRQ number and disable/enable it around the polling transfer.
> > This patch has been tested on NanoPC-T4.
> >
> > Signed-off-by: Jensen Huang <jensenhuang at friendlyarm.com>
> > Reviewed-by: Heiko Stuebner <heiko at sntech.de>
> > Reviewed-by: Andi Shyti <andi.shyti at kernel.org>
>
> Applied to for-current, thanks!
>
> But I'd like to see the follow-up patches somewhen which have been
> discussed in this thread.

I've made some attempts, such as removing irqsave/restore, but it
didn't meet my expectations and
requires further testing. Therefore, I may not be able to submit such
a patch in the short term. However,
if someone else submits a new patch, I'd be happy to test it on rk3328/rk3399.


On Fri, Dec 8, 2023 at 8:17 PM Dmitry Osipenko
<dmitry.osipenko at collabora.com> wrote:
>
> On 12/8/23 11:53, Jensen Huang wrote:
> > On Fri, Dec 8, 2023 at 12:00 AM Dmitry Osipenko
> > <dmitry.osipenko at collabora.com> wrote:
> >>
> >> On 12/7/23 17:10, Dragan Simic wrote:
> >>> On 2023-12-07 10:25, Jensen Huang wrote:
> >>>> On Thu, Dec 7, 2023 at 4:37 PM Dragan Simic <dsimic at manjaro.org> wrote:
> >>>>>
> >>>>> On 2023-12-07 09:21, Jensen Huang wrote:
> >>>>>> Possible deadlock scenario (on reboot):
> >>>>>> rk3x_i2c_xfer_common(polling)
> >>>>>>     -> rk3x_i2c_wait_xfer_poll()
> >>>>>>         -> rk3x_i2c_irq(0, i2c);
> >>>>>>             --> spin_lock(&i2c->lock);
> >>>>>>             ...
> >>>>>>         <rk3x i2c interrupt>
> >>>>>>         -> rk3x_i2c_irq(0, i2c);
> >>>>>>             --> spin_lock(&i2c->lock); (deadlock here)
> >>>>>>
> >>>>>> Store the IRQ number and disable/enable it around the polling
> >>>>> transfer.
> >>>>>> This patch has been tested on NanoPC-T4.
> >>>>>
> >>>>> In case you haven't already seen the related discussion linked below,
> >>>>> please have a look.  I also added more people to the list of recipients,
> >>>>> in an attempt to make everyone aware of the different approaches to
> >>>>> solving this issue.
> >>>>>
> >>>>> https://lore.kernel.org/all/655177f4.050a0220.d85c9.3ba0@mx.google.com/T/#m6fc9c214452fec6681843e7f455978c35c6f6c8b
> >>>>
> >>>> Thank you for providing the information. I hadn't seen this link before.
> >>>> After carefully looking into the related discussion, it appears that
> >>>> Dmitry Osipenko is already working on a suitable patch. To avoid
> >>>> duplication
> >>>> or conflicts, my patch can be discarded.
> >>>
> >>> Thank you for responding so quickly.  Perhaps it would be best to hear
> >>> from Dmitry as well, before discarding anything.  It's been a while
> >>> since Dmitry wrote about working on the patch, so he might have
> >>> abandoned it.
> >>
> >> This patch is okay. In general, will be better to have IRQ disabled by
> >> default like I did in my variant, it should allow to remove the spinlock
> >> entirely. Of course this also can be done later on in a follow up
> >> patches. Jensen, feel free to use my variant of the patch, add my
> >> s-o-b+co-developed tags to the commit msg if you'll do. Otherwise I'll
> >> be able to send my patch next week.
> >
> > Thank you for the suggestion. I've updated the patch to your variant, and
> > as confirmed by others, reboots are functioning correctly. I measured the
> > overhead of enable_irq/disable_irq() by calculating ktime in the
> > updated version,
> > and on rk3399, the minimum delta I observed was 291/875 ns. This extra
> > cost may impact most interrupt-based transfers. Therefore, I personally lean
> > towards the current v2 patch and handle the spinlock and irqsave/restore in
> > a follow up patch. I'd like to hear everyone's thoughts on this.
>
> Please don't use ktime for perf measurements, ktime itself is a slow API
> and it should be 200us that ktime takes itself. Also, the 0.2us is
> practically nothing, especially compared to I2C transfers measured in ms.
>
> I'm fine with keeping your v2 variant for the bug fix if you prefer
> that. Thanks for addressing the issue :)

Thank you for clarifying the situation with ktime. Honestly, I also
believe that the overhead
of disable_irq/enable_irq should be minimal. Considering that system
frequency adjustments
involve I2C transfers, I conducted this 'somewhat imprecise' measurements.


--
Best regards,
Jensen



More information about the linux-arm-kernel mailing list