[PATCH RFC] regmap: maple: Switch to use irq-safe locking

Mark Brown broonie at kernel.org
Wed Aug 14 12:04:02 PDT 2024


On Wed, Aug 14, 2024 at 01:20:21AM +0300, Cristian Ciocaltea wrote:

> Commit 3d59c22bbb8d ("drm/rockchip: vop2: Convert to use maple tree
> register cache") enabled the use of maple tree register cache in
> Rockchip VOP2 driver.  However, building the kernel with lockdep support
> indicates locking rules violation when trying to unload the rockchipdrm
> module:

> [ 48.360258] ========================================================
> [ 48.360829] WARNING: possible irq lock inversion dependency detected
> [ 48.361400] 6.11.0-rc1 #40 Not tainted
> [ 48.361743] --------------------------------------------------------
> [ 48.362311] modprobe/685 just changed the state of lock:
> [ 48.362790] ffff0000087fa798 (&mt->ma_lock){+...}-{2:2}, at: regcache_maple_exit+0x6c/0xe0

Please think hard before including complete backtraces in upstream
reports, they are very large and contain almost no useful information
relative to their size so often obscure the relevant content in your
message. If part of the backtrace is usefully illustrative (it often is
for search engines if nothing else) then it's usually better to pull out
the relevant sections.

> The problem is that the regmap lock could be taken by an IRQ context,
> interrupting the irq-unsafe maple tree lock, which may result in a lock
> inversion deadlock scenario.

> Switch to use irq-safe locking in the maple tree register cache.

I'd have a bigger question here which is why the driver is using a
dynamically allocated register cache in a hardirq context, especially
with no defaults provided?  Anything except the flat cache might do
allocations at runtime which might include in interrupt context unless
the caller is very careful and since the lockdep warning triggered it's
clear that this driver isn't.  The core will be doing atomic allocations
for MMIO but that's not something we want to be doing as a matter of
course...  I would generally expect drivers to try to ensure that any
registers are cached outside of the interrupt handler, usually by
specifying defaults or touching all registers during setup.

Without having done a full analysis it also looks like the marking of
volatile registers isn't right, it's not immediately clear that the
interrupt status and clear registers are volatile and they ought to be.
None of the registers accessed in interrupt context look like they
should be cached at all unless there's something triggered via the DRM
vblank calls.

It might be safer to fall back to the rbtree cache for this device since
rbtree doesn't force an extra level of locking on us, though like I say
I'm not convinced that what the driver is doing with caching is a super
good idea.  Though probably what the driver is doing should work.

> Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache")

This is obvious nonsense.  If anything it'd be the conversion to maple
tree, though there were issues before then as it was a conversion from
rbtree that was what added the extra locking.  Please only add Fixes
tags if there's a clear link between the issue and the commit being
pointed at.

> +#define mas_lock_irq(mas, flags) spin_lock_irqsave(&((mas)->tree->ma_lock), flags)
> +#define mas_unlock_irq(mas, flags) spin_unlock_irqrestore(&((mas)->tree->ma_lock), flags)

It's clearly not appropriate to add these outside of the maple tree
code, especially with this naming - this should be with the other
mas_lock() stuff in the maple tree header so added as a separate commit.
It also doesn't seem like a super good idea to unconditionally force all
maple tree users to save interrupt state whenever they need to do
allocations, the spinlock is a bit heavyweight already and this just
escalates it.

My first thought here is that if we've got a regmap using spinlocks for
the regmap lock and a maple tree cache we should arrange things so that
the maple tree lock is used for the regmap's lock.  That would however
involve some unpleasant abstraction violation, and possibly some macro
fun since we'd need to elide the locking from the cache itself when
using the same lock at the regmap level.  I think that's going to be a
case of choosing the least unpleasant option.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-rockchip/attachments/20240814/f8a862cd/attachment.sig>


More information about the Linux-rockchip mailing list