[PATCH] locking/lockdep: Save and display stack trace of held locks

Kobe-CP Wu Kobe-CP.Wu at mediatek.com
Fri Jun 28 02:42:40 PDT 2019

On Mon, 2019-06-24 at 13:08 +0200, Peter Zijlstra wrote:
> On Mon, Jun 24, 2019 at 04:00:59PM +0800, Kobe Wu wrote:
> > Save the stack trace of held locks when lock_acquire() is invoked
> > and display the stack trace when lockdep_print_held_locks() is
> > invoked. The runtime stack trace of held locks are helpful in
> > analyzing code flow and lockdep's warning.
> > 
> > Save stack trace of each held lock will increase runtime overhead
> > and memory consumption. The operation will be activated under
> > CONFIG_LOCKDEP_TRACE_HELD_LOCK. So the impact will only occur
> Yeah, I don't see the point of this. If you cannot find where the lock
> got taken you've bigger problems.

Here are some examples that show why the saved stack traces are helpful.


It provides more information for debug_show_all_locks().
debug_show_all_locks() is called by other debug functions and is helpful
for analyzing problems and understanding system status.

If a task sleeps for a long time while holding several locks, it may
cause problems. The saved stack trace makes it easy to see how the task
went to sleep and gives a starting point for analyzing why it keeps
sleeping: since the task went to sleep after the trace was taken, we can
investigate what happened after that point in the code flow.

In the following example, (&f->f_pos_lock) is a lock we cannot
acquire for a long time. We want to know what happened to
logd.klogd/344: why does it keep sleeping? With only the final function
entry and the task name, it is hard to understand the code flow and the
conditions involved.

Showing all locks held in the system:
1 lock held by logd.klogd/344:
 #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffff8182a8ae04>] __fdget_pos


It also provides more information for "BUG: task/pid still has locks held!".

The original warning is as follows. It is not easy to know where the
lock was taken, because the lock could be used in many functions and
could be wrapped by other functions.

[ BUG: vpud/657 still has locks held! ]
4.9.117+ #2 Tainted: G S      W  O   
1 lock held by vpud/657 on CPU#6:
 #0:  (&dev->enc_mutex){+.+.+.}, at: [<ffffff8c5ca3ca74>] venc_lock
Call trace:
[<ffffff8c5be8d190>] dump_backtrace+0x0/0x2bc
[<ffffff8c5be8d188>] show_stack+0x18/0x20
[<ffffff8c5c268274>] dump_stack+0xa8/0xf8
[<ffffff8c5bf4c1e4>] debug_check_no_locks_held+0x174/0x17c
[<ffffff8c5bf9e1dc>] futex_wait_queue_me+0xf4/0x16c
[<ffffff8c5bf9bc44>] futex_wait+0x14c/0x334
[<ffffff8c5bf9a534>] do_futex+0x10c/0x16d0
[<ffffff8c5bf9f688>] compat_SyS_futex+0x100/0x158
[<ffffff8c5be83e00>] el0_svc_naked+0x34/0x38

With a saved stack trace for reference, it becomes easy to understand
and resolve the problem:

1 lock held by vpud/657/[0] on CPU#6:
 #0:  (&dev->enc_mutex){+.+.+.}, at: [<ffffff8c5ca3ca74>] venc_lock

For the same reason, the saved stack traces are also helpful in analyzing
"BUG: held lock freed!" and "INFO: possible recursive locking detected".
