CAAM: kernel BUG at drivers/crypto/caam/jr.c:230! (and dma-coherent query)
Sascha Hauer
sha at pengutronix.de
Wed Mar 3 14:56:56 GMT 2021
On Wed, Mar 03, 2021 at 12:26:32PM +0200, Horia Geantă wrote:
> Adding some people in the loop, maybe they could help in understanding
> why lack of "dma-coherent" property for a HW-coherent device could lead to
> unexpected / strange side effects.
>
> On 3/1/2021 5:22 PM, Sascha Hauer wrote:
> > Hi All,
> >
> > I am on a Layerscape LS1046a using Linux-5.11. The CAAM driver sometimes
> > crashes during the run-time self tests with:
> >
> >> kernel BUG at drivers/crypto/caam/jr.c:247!
> >> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> >> Modules linked in:
> >> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-20210225-3-00039-g434215968816-dirty #12
> >> Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (C300) board (DT)
> >> pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> >> pc : caam_jr_dequeue+0x98/0x57c
> >> lr : caam_jr_dequeue+0x98/0x57c
> >> sp : ffff800010003d50
> >> x29: ffff800010003d50 x28: ffff8000118d4000
> >> x27: ffff8000118d4328 x26: 00000000000001f0
> >> x25: ffff0008022be480 x24: ffff0008022c6410
> >> x23: 00000000000001f1 x22: ffff8000118d4329
> >> x21: 0000000000004d80 x20: 00000000000001f1
> >> x19: 0000000000000001 x18: 0000000000000020
> >> x17: 0000000000000000 x16: 0000000000000015
> >> x15: ffff800011690230 x14: 2e2e2e2e2e2e2e2e
> >> x13: 2e2e2e2e2e2e2020 x12: 3030303030303030
> >> x11: ffff800011700a38 x10: 00000000fffff000
> >> x9 : ffff8000100ada30 x8 : ffff8000116a8a38
> >> x7 : 0000000000000001 x6 : 0000000000000000
> >> x5 : 0000000000000000 x4 : 0000000000000000
> >> x3 : 00000000ffffffff x2 : 0000000000000000
> >> x1 : 0000000000000000 x0 : 0000000000001800
> >> Call trace:
> >> caam_jr_dequeue+0x98/0x57c
> >> tasklet_action_common.constprop.0+0x164/0x18c
> >> tasklet_action+0x44/0x54
> >> __do_softirq+0x160/0x454
> >> __irq_exit_rcu+0x164/0x16c
> >> irq_exit+0x1c/0x30
> >> __handle_domain_irq+0xc0/0x13c
> >> gic_handle_irq+0x5c/0xf0
> >> el1_irq+0xb4/0x180
> >> arch_cpu_idle+0x18/0x30
> >> default_idle_call+0x3c/0x1c0
> >> do_idle+0x23c/0x274
> >> cpu_startup_entry+0x34/0x70
> >> rest_init+0xdc/0xec
> >> arch_call_rest_init+0x1c/0x28
> >> start_kernel+0x4ac/0x4e4
> >> Code: 91392021 912c2000 d377d8c6 97f24d96 (d4210000)
> >
> > The driver iterates over the descriptors in the output ring and matches them
> > with the ones it has previously queued. If it doesn't find a matching
> > descriptor it complains with the BUG_ON() seen above. What I see sometimes is
> > that the address in the output ring is 0x0, the job status in this case is
> > 0x40000006 (meaning DECO Invalid KEY command). It seems that the CAAM doesn't
> > write the descriptor address to the output ring at least in some error cases.
> > When we don't have the descriptor address of the failed descriptor we have no
> > way to find it in the list of queued descriptors, thus we also can't find the
> > callback for that descriptor. This looks very unfortunate, anyone else seen
> > this or has an idea what to do about it?
> >
> > I haven't investigated yet which job actually fails and why. Of course that would
> > be my ultimate goal to find that out.
> >
> This looks very similar to an earlier report from Greg.
> He confirmed that adding "dma-coherent" property to the "crypto" DT node
> fixes the issue:
> https://lore.kernel.org/linux-crypto/74f664f5-5433-d322-4789-3c78bdb814d8@kernel.org
> Patch rebased on v5.11 is at the bottom. Does it work for you too?
Indeed this seems to solve it for me as well, you can add my
Tested-by: Sascha Hauer <s.hauer at pengutronix.de>
However, there seem to be two problems: first, that "DECO Invalid KEY
command" actually occurs, and second, that the dequeue code currently
can't handle a NULL pointer in the output ring.
Do you think that the occurrence of a NULL pointer is also a coherency
issue?
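To make the second problem concrete, here is a minimal user-space sketch of
the matching step as I understand it. It is not the code from jr.c: the names
struct pending_job, struct out_entry and match_completion() are made up for
illustration, and the sketch returns -1 at the point where the real driver
hits the BUG_ON().

```c
#include <stddef.h>
#include <stdint.h>

#define RING_DEPTH 4

/* One job the driver has queued and is still waiting on (hypothetical layout). */
struct pending_job {
	uint64_t desc_dma;	/* bus address of the descriptor handed to the hardware */
	int in_use;		/* nonzero while the job is outstanding */
	uint32_t status;	/* filled in when the job completes */
};

/* One completion entry as written by the hardware into the output ring. */
struct out_entry {
	uint64_t desc_dma;	/* address of the finished descriptor */
	uint32_t status;	/* job status word */
};

/*
 * Look up the pending job whose descriptor address matches a completion
 * entry. Returns the slot index on success, or -1 if no outstanding job
 * has that address (the case the real driver treats as a BUG).
 */
static int match_completion(struct pending_job *jobs, size_t n,
			    const struct out_entry *done)
{
	for (size_t i = 0; i < n; i++) {
		if (jobs[i].in_use && jobs[i].desc_dma == done->desc_dma) {
			jobs[i].status = done->status;
			jobs[i].in_use = 0;
			return (int)i;
		}
	}
	return -1;
}
```

With two outstanding jobs at (made-up) addresses 0x1000 and 0x2000, a
completion for 0x2000 is matched to its slot, while a completion carrying
address 0x0 and status 0x40000006 matches nothing. Both jobs then stay
outstanding, and there is no way to tell which callback should receive the
error.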
Sascha
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |