[PATCH] arm64: restore get_current() optimisation
Jon Hunter
jonathanh at nvidia.com
Thu Mar 2 03:35:06 PST 2017
Hi Mark,
On 03/01/17 18:27, Mark Rutland wrote:
> Hi Catalin,
>
> My THREAD_INFO_IN_TASK series had an unintended performance regression in
> get_current() / current_thread_info(). Could you please take the below as a
> fix for the next rc?
>
> Thanks,
> Mark.
>
> ---->8----
> Commit c02433dd6de32f04 ("arm64: split thread_info from task stack")
> inverted the relationship between get_current() and
> current_thread_info(), with sp_el0 now holding the current task_struct
> rather than the current thread_info. The new implementation of
> get_current() prevents the compiler from being able to optimize repeated
> calls to either, resulting in a noticeable penalty in some
> microbenchmarks.
>
> This patch restores the previous optimisation by implementing
> get_current() in the same way as our old current_thread_info(), using a
> non-volatile asm statement.
>
> Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> Cc: Will Deacon <will.deacon at arm.com>
> Cc: Catalin Marinas <catalin.marinas at arm.com>
> Reported-by: Davidlohr Bueso <dbueso at suse.de>
> ---
> arch/arm64/include/asm/current.h | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/current.h b/arch/arm64/include/asm/current.h
> index f2bcbe2..86c4041 100644
> --- a/arch/arm64/include/asm/current.h
> +++ b/arch/arm64/include/asm/current.h
> @@ -9,9 +9,17 @@
>
> struct task_struct;
>
> +/*
> + * We don't use read_sysreg() as we want the compiler to cache the value where
> + * possible.
> + */
> static __always_inline struct task_struct *get_current(void)
> {
> - return (struct task_struct *)read_sysreg(sp_el0);
> + unsigned long sp_el0;
> +
> + asm ("mrs %0, sp_el0" : "=r" (sp_el0));
> +
> + return (struct task_struct *)sp_el0;
> }
>
> #define current get_current()
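
To illustrate the distinction the commit message draws between a volatile and a non-volatile asm statement, here is a minimal user-space sketch. It is not the kernel code. It assumes GCC or Clang extended asm, the helper names (read_plain, read_volatile, sum_plain, sum_volatile) are hypothetical, and an empty asm template that copies its input stands in for "mrs %0, sp_el0", which is not expected to be executable from user space.

```c
/*
 * Hypothetical stand-ins for a system-register read. The empty asm
 * template copies `src` into `val` through a register ("0" ties the
 * input to output operand 0), so the sketch builds on any architecture.
 */
static inline unsigned long read_plain(unsigned long src)
{
	unsigned long val;

	/*
	 * Non-volatile asm: the compiler may treat the result as a pure
	 * function of the inputs, so identical statements can be merged,
	 * hoisted out of loops, or dropped if the result is unused.
	 */
	__asm__ ("" : "=r" (val) : "0" (src));

	return val;
}

static inline unsigned long read_volatile(unsigned long src)
{
	unsigned long val;

	/*
	 * Volatile asm: the compiler must keep every statement that is
	 * reached, so repeated reads are not merged.
	 */
	__asm__ __volatile__ ("" : "=r" (val) : "0" (src));

	return val;
}

/* Two reads that an optimising compiler is free to fold into one. */
unsigned long sum_plain(unsigned long src)
{
	return read_plain(src) + read_plain(src);
}

/* Two reads that must both be emitted. */
unsigned long sum_volatile(unsigned long src)
{
	return read_volatile(src) + read_volatile(src);
}
```

Both functions return the same value; the difference only shows up in the generated code (for example, by comparing the output of "cc -O2 -S" for the two). The asm in the patch has no inputs at all, so the compiler is allowed to assume it yields the same value every time within a function.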
I noticed that with v4.10 I am seeing the following panic ...
[ 184.523390] Unable to handle kernel paging request at virtual address ffff8001bb7a2800
[ 184.531316] pgd = ffff8000b96b1000
[ 184.534711] [ffff8001bb7a2800] *pgd=0000000000000000
[ 184.539670] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 184.545231] Modules linked in:
[ 184.548285] CPU: 2 PID: 1407 Comm: tinymix Not tainted 4.10.0-00017-g50bc4a8b2868 #19
[ 184.556104] Hardware name: Google Pixel C (DT)
[ 184.560540] task: ffff8000bb558c80 task.stack: ffff8000b9648000
[ 184.566458] PC is at regcache_flat_read+0x14/0x20
[ 184.571155] LR is at regcache_read+0x50/0x78
[ 184.575417] pc : [<ffff0000085d0c6c>] lr : [<ffff0000085cefa8>] pstate: 400001c5
[ 184.582802] sp : ffff8000b964b970
[ 184.586108] x29: ffff8000b964b970 x28: ffff8000b9584800
[ 184.591412] x27: ffff8000b964bcc8 x26: ffff8000b9461000
[ 184.596716] x25: 0000000000000000 x24: 0000000000000000
[ 184.602019] x23: 00000000ffff8000 x22: ffff8000b964ba1c
[ 184.607322] x21: ffff8000b964ba1c x20: 00000000ffff8000
[ 184.612626] x19: ffff8000bb7dc400 x18: 0000000000000000
[ 184.617928] x17: 0000000000000001 x16: ffff0000081f79e8
[ 184.623230] x15: 0000000000497000 x14: 0000000000000000
[ 184.628532] x13: 0000000000000001 x12: 0000000005cc6000
[ 184.633835] x11: 0000000000000000 x10: ffff8000bc16bf00
[ 184.639138] x9 : 0000000000000000 x8 : 0000000000000000
[ 184.644441] x7 : ffff8000bff68908 x6 : 0000000000000000
[ 184.649742] x5 : ffff000008fc9f00 x4 : ffff8000bb7aa800
[ 184.655044] x3 : 0000000000000002 x2 : ffff8000b964ba1c
[ 184.660347] x1 : 000000003fffe000 x0 : 0000000000000000
[ 184.665650]
[ 184.667137] Process tinymix (pid: 1407, stack limit = 0xffff8000b9648000)
[ 184.673913] Stack: (0xffff8000b964b970 to 0xffff8000b964c000)
[ 184.679649] b960: ffff8000b964b9a0 ffff0000085cce60
[ 184.687469] b980: ffff8000bb7dc400 ffff8000bb7dc400 00000000ffff8000 ffff0000085cd104
[ 184.695288] b9a0: ffff8000b964b9d0 ffff0000085cd218 ffff8000b964ba8f ffff8000bb7dc400
[ 184.703109] b9c0: 00000000bc1d14a0 00000000ffff8000 ffff8000b964ba20 ffff0000085ce1d8
[ 184.710929] b9e0: ffff8000bb7dc400 00000000ffff8000 00000000bc1d14a0 00000000ffff8000
[ 184.718748] ba00: ffff8000b964ba8f 0000000000000000 ffff8000bb7dc400 ffff0000085ce1e8
[ 184.726567] ba20: ffff8000b964ba70 ffff000008856c44 ffff000008ffbff0 ffff000008ffbe08
[ 184.734386] ba40: 0000000000000001 ffff8000b964bb08 ffff8000b964bb28 0000000000000000
[ 184.742206] ba60: ffff000008ffc020 ffff00000884e700 ffff8000b964ba90 ffff00000884e7f4
[ 184.750026] ba80: ffff8000b964ba80 00ff8000b964ba80 ffff8000b964bb40 ffff00000884eb2c
[ 184.757846] baa0: ffff8000b9584748 0000000000000008 ffff8000b9583900 ffff000008ffbe08
[ 184.765666] bac0: ffff000008ffaa30 ffff8000b964bcc8 0000000000000003 0000000000000002
[ 184.773485] bae0: 0000000000000003 ffff000008ffaa20 ffff8000b964bb20 ffff000008d6ede8
[ 184.781303] bb00: ffff8000bb7dc400 ffff8000b9544710 ffff8000b9544710 ffff8000b964bb18
[ 184.789122] bb20: ffff8000b964bb18 ffff8000b964bb28 ffff8000b964bb28 ffff00000884ebbc
[ 184.796942] bb40: ffff8000b964bb80 ffff00000884eb9c ffff000008ffbe08 ffff8000b9583900
[ 184.804762] bb60: ffff000008ffbe58 0000000000000001 ffff000008ffaa20 0000000000000001
[ 184.812581] bb80: ffff8000b964bbc0 ffff00000886bd04 0000000000000001 ffff8000b9583900
[ 184.820402] bba0: ffff8000b964bcf0 ffff8000b964bcf0 ffff000009062000 ffff000008b0a390
[ 184.828220] bbc0: ffff8000b964bcf0 ffff000008830110 ffff8000bc33b000 ffff8000bc1d1000
[ 184.836039] bbe0: 00000000ffffffff ffff8000b96a9800 ffff8000bc1d14a0 ffff8000bc1d1870
[ 184.843858] bc00: 0000000000000123 000000000000001d ffff000008982000 ffff8000bb558c80
[ 184.851677] bc20: ffff8000b964bd40 0000000000000000 0000000000000001 ffff000008830b24
[ 184.859496] bc40: ffff000008b0a390 ffff8000bc33b000 ffff8000bb7b9520 ffff8000bb7b9400
[ 184.867316] bc60: 0000000200000139 0000024000000040 78754d2000000440 ffff8000b9583900
[ 184.875137] bc80: 3f30031f00000240 0000000000000000 0000000000000000 0000000000000000
[ 184.882956] bca0: ffff8000b9583900 ff1cf31300000440 ffff800000000000 ffff000008830038
[ 184.890777] bcc0: ffff8000bc33b000 ffff8000b9583900 0f1f03ff00000040 ffff800000000001
[ 184.898597] bce0: ffff8000bc1d14a0 ffff00000818f4e4 ffff8000b964bd70 ffff000008830610
[ 184.906417] bd00: ffff8000bc33b000 0000000000000000 0000fffffdbf4308 ffff8000bc1d1000
[ 184.914236] bd20: ffff8000b96a9800 000000000000001d ffff000008982000 0000000000000000
[ 184.922055] bd40: ffff8000b964bd70 ffff0000088305d0 00000000c4c85513 0000fffffdbf4308
[ 184.929875] bd60: 0000fffffdbf4308 0000000000000000 ffff8000b964be00 ffff0000081f7354
[ 184.937694] bd80: ffff8000b9665600 0000fffffdbf4308 ffff8000b969b238 0000000000000003
[ 184.945514] bda0: 00000000c4c85513 0000fffffdbf4308 0000000000000123 0000000092000047
[ 184.953333] bdc0: 000000003a0f1018 ffff8000b964bec0 0000000060000000 0000000000000024
[ 184.961152] bde0: 0000000092000047 000000003a0f1018 0000000000000020 ffff8000bb558c80
[ 184.968972] be00: ffff8000b964be80 ffff0000081f7a74 0000000000000000 ffff8000b9665600
[ 184.976792] be20: ffff8000b9665600 0000000000000003 00000000c4c85513 0000000000415230
[ 184.984612] be40: ffff8000b964be80 ffff0000081f7a28 0000000000000000 ffff8000b9665600
[ 184.992432] be60: ffff8000b9665600 0000000000000003 00000000c4c85513 ffff0000081f7a0c
[ 185.000253] be80: 0000000000000000 ffff000008082f30 0000000000000000 00008000b70ac000
[ 185.008072] bea0: ffffffffffffffff 000000000041c51c 0000000080000000 0000000000000015
[ 185.015892] bec0: 0000000000000003 00000000c4c85513 0000fffffdbf4308 0000000000000010
[ 185.023712] bee0: fffffffffffffff0 0000000000000040 000000000000003f 0000000000000000
[ 185.031530] bf00: 000000000000001d 0000000000000004 0101010101010101 0000000000000005
[ 185.039350] bf20: ffffffffffffffff 0000000000499000 0000000000499000 0000000000497000
[ 185.047169] bf40: 0000fffffdbf4b68 0000000000000001 0000000000000000 00000000004001a0
[ 185.054988] bf60: 0000000000000000 00000000004001a0 0000000000000000 0000000000000000
[ 185.062807] bf80: 000000000040559c 00000000004054e4 0000000000000000 0000000000000000
[ 185.070627] bfa0: 0000000000000000 0000fffffdbf42e0 0000000000402998 0000fffffdbf42e0
[ 185.078447] bfc0: 000000000041c51c 0000000080000000 0000000000000003 000000000000001d
[ 185.086265] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 185.094083] Call trace:
[ 185.096525] Exception stack(0xffff8000b964b7a0 to 0xffff8000b964b8d0)
[ 185.102954] b7a0: ffff8000bb7dc400 0001000000000000 ffff8000b964b970 ffff0000085d0c6c
[ 185.110774] b7c0: ffff8000b964b7e0 ffff0000080fb520 0000000000000001 0000002af6734537
[ 185.118594] b7e0: ffff8000b964b810 ffff0000080e8f40 ffff8000bc16be80 00000000000008f4
[ 185.126415] b800: ffff8000b964b830 ffff0000080eeb14 ffff8000bbaa0d00 ffff8000bff688e0
[ 185.134234] b820: ffff8000b964b830 ffff0000080eeb28 ffff8000b964b850 ffff0000080e94f4
[ 185.142054] b840: 0000000000000000 000000003fffe000 ffff8000b964ba1c 0000000000000002
[ 185.149873] b860: ffff8000bb7aa800 ffff000008fc9f00 0000000000000000 ffff8000bff68908
[ 185.157693] b880: 0000000000000000 0000000000000000 ffff8000bc16bf00 0000000000000000
[ 185.165512] b8a0: 0000000005cc6000 0000000000000001 0000000000000000 0000000000497000
[ 185.173331] b8c0: ffff0000081f79e8 0000000000000001
[ 185.178203] [<ffff0000085d0c6c>] regcache_flat_read+0x14/0x20
[ 185.183939] [<ffff0000085cce60>] _regmap_read+0x98/0xe8
[ 185.189155] [<ffff0000085cd218>] _regmap_update_bits+0xa0/0xf0
[ 185.194978] [<ffff0000085ce1d8>] regmap_update_bits_base+0x60/0x90
[ 185.201152] [<ffff000008856c44>] snd_soc_component_update_bits+0x24/0x40
[ 185.207843] [<ffff00000884e7f4>] dapm_power_widgets+0x474/0x730
[ 185.213751] [<ffff00000884eb2c>] soc_dapm_mux_update_power.isra.29+0x7c/0xa0
[ 185.220787] [<ffff00000884eb9c>] snd_soc_dapm_mux_update_power+0x4c/0x88
[ 185.227479] [<ffff00000886bd04>] tegra210_xbar_put_value_enum+0x1b4/0x228
[ 185.234256] [<ffff000008830110>] snd_ctl_elem_write+0x110/0x188
[ 185.240165] [<ffff000008830610>] snd_ctl_ioctl+0xd0/0x798
[ 185.245557] [<ffff0000081f7354>] do_vfs_ioctl+0xa4/0x738
[ 185.250859] [<ffff0000081f7a74>] SyS_ioctl+0x8c/0xa0
[ 185.255818] [<ffff000008082f30>] el0_svc_naked+0x24/0x28
[ 185.261121] Code: 52800000 b941c883 f9410084 1ac32421 (b8615881)
[ 185.267223] ---[ end trace 5f6a6332822eca30 ]---
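
As a cross-check on the dump above, the faulting address can be reproduced from the register values. This is a sketch under one assumption: the faulting instruction word in the Code: line (b8615881), decoded by hand, is "ldr w1, [x4, w1, uxtw #2]", i.e. a 32-bit load from x4 plus the zero-extended w1 scaled by 4. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Effective address of a load of the assumed form
 * "ldr wT, [xN, wM, uxtw #2]": base + (zero-extended 32-bit index << 2).
 */
uint64_t scaled_index_address(uint64_t base, uint32_t index)
{
	/* Widen before shifting so the top bits of the index are kept. */
	return base + ((uint64_t)index << 2);
}
```

With x4 = ffff8000bb7aa800 as the base and w1 = 3fffe000 as the index, this gives ffff8001bb7a2800, which matches the address in the "Unable to handle kernel paging request" line. The scaled index, 0xffff8000, is also the value held in x20 and x23.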
Bisecting the panic ends up at this patch, and reverting it on top of v4.10 prevents the panic from
occurring.
This occurs when I start playing audio on Tegra210 using tinymix. I do have some out-of-tree
patches for Tegra audio that I am using when seeing this but I have been using those for
probably a year or so, as I am gradually upstreaming bits.
I am a bit flummoxed by the above. Any thoughts?
Cheers
Jon
--
nvpublic