Re: ❌ FAIL: Test report for kernel 5.13.0-rc7 (arm-next, 8ab9b1a9)

Robin Murphy robin.murphy at arm.com
Fri Jun 25 04:02:52 PDT 2021


On 2021-06-25 10:52, Veronika Kabatova wrote:
[...]
>>>         ❌ stress: stress-ng
>>
>> Oh no, this looks like another alignment fault in memcpy:
>>
>> [13330.651903] Unable to handle kernel paging request at virtual address ffff8000534705ff
>> [13330.651914] Mem abort info:
>> [13330.651918]   ESR = 0x96000021
>> [13330.651922]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [13330.651928]   SET = 0, FnV = 0
>> [13330.651931]   EA = 0, S1PTW = 0
>> [13330.651933]   FSC = 0x21: alignment fault
>> [13330.651938] Data abort info:
>> [13330.651940]   ISV = 0, ISS = 0x00000021
>> [13330.651941]   CM = 0, WnR = 0
>> [13330.651943] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000f3e6b000
>> [13330.651945] [ffff8000534705ff] pgd=1000008ffcfff003, p4d=1000008ffcfff003, pud=100000088e57d003, pmd=10000008d0aeb003, pte=006800008021370f
>> [13330.651956] Internal error: Oops: 96000021 [#1] SMP
>> [13330.651961] Modules linked in: unix_diag binfmt_misc fcrypt sm4_generic crc32_generic md4 michael_mic nhpoly1305_neon nhpoly1305 poly1305_generic libpoly1305 poly1305_neon rmd160 sha3_generic sm3_generic streebog_generic wp512 blowfish_generic blowfish_common cast5_generic des_generic libdes chacha_generic chacha_neon libchacha camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common dm_thin_pool dm_persistent_data dm_bio_prison nvme nvme_core ipmi_watchdog ipmi_poweroff loop tun af_key crypto_user scsi_transport_iscsi xt_multiport ip_gre ip_tunnel gre overlay xt_CONNSECMARK xt_SECMARK nft_counter xt_state xt_conntrack nft_compat ah6 ah4 nft_objref nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink jfs sctp ip6_udp_tunnel udp_tunnel dm_log_writes dm_flakey rfkill mlx5_ib ib_uverbs ib_core sunrpc coresight_etm4x i2c_smbus coresight_replicator coresight_tpiu coresight_tmc joydev mlx5_core acpi_ipmi psample ipmi_ssif mlxfw !
>>   ipmi_devintf
>> [13330.652076]  ipmi_msghandler coresight_funnel thunderx2_pmu coresight vfat fat fuse zram ip_tables xfs ast crct10dif_ce i2c_algo_bit ghash_ce drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm gpio_xlp i2c_xlp9xx uas usb_storage aes_neon_bs [last unloaded: nvmet]
>> [13330.652123] CPU: 115 PID: 188446 Comm: stress-ng Tainted: G           OEL    5.13.0-rc7 #1
>> [13330.652129] Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.15 05/08/2020
>> [13330.652133] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
>> [13330.652139] pc : __memcpy+0x168/0x250
>> [13330.652150] lr : memory_read_from_buffer+0x58/0x80
>> [13330.652161] sp : ffff800063ef3c20
>> [13330.652163] x29: ffff800063ef3c20 x28: ffff0008b1380000 x27: 0000000000000000
>> [13330.652170] x26: 0000000000000000 x25: 0000000000000000 x24: ffff00080a960fe0
>> [13330.652176] x23: ffff800063ef3d28 x22: 000000000000063f x21: ffff800063ef3c88
>> [13330.652181] x20: 000000000000063f x19: 000000000000063f x18: 0000000000000000
>> [13330.652186] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>> [13330.652191] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
>> [13330.652196] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
>> [13330.652200] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
>> [13330.652206] x5 : ffff000d0fb0063f x4 : ffff80005347063f x3 : ffff000d0fb005c0
>> [13330.652212] x2 : ffffffffffffffef x1 : ffff800053470600 x0 : ffff000d0fb00000
>> [13330.652218] Call trace:
>> [13330.652221]  __memcpy+0x168/0x250
>> [13330.652225]  acpi_data_show+0x5c/0x8c
>> [13330.652232]  sysfs_kf_bin_read+0x78/0xa0
>> [13330.652238]  kernfs_file_read_iter+0x9c/0x1a4
>> [13330.652241]  kernfs_fop_read_iter+0x34/0x50
>> [13330.652244]  new_sync_read+0xdc/0x154
>> [13330.652253]  vfs_read+0x158/0x1e4
>> [13330.652260]  ksys_read+0x64/0xec
>> [13330.652266]  __arm64_sys_read+0x28/0x34
>> [13330.652273]  invoke_syscall+0x50/0x120
>> [13330.652280]  el0_svc_common.constprop.0+0x4c/0xd4
>> [13330.652284]  do_el0_svc+0x30/0x9c
>> [13330.652286]  el0_svc+0x2c/0x54
>> [13330.652294]  el0t_64_sync_handler+0x1a4/0x1b0
>> [13330.652296]  el0t_64_sync+0x19c/0x1a0
>> [13330.652303] Code: a984346c a9c4342c f1010042 54fffee8 (a97c3c8e)
>> [13330.652307] ---[ end trace 227d4380f57145d4 ]---
>>
>> So maybe this issue isn't limited to weird modules, after all...
>>
> 
> It ran on the machine from the same set that we were able to reproduce
> it on previously. If you or anyone else have an idea on how to stabilize
> the reproducibility or have a debug patch we'll be happy to try it.

Possibly it depends on the individual machines' firmware exactly how the 
relevant bits of their ACPI tables are aligned in memory?

I've started digging into that callstack - it may not be a "weird 
module" but it's definitely crusty ACPI code... a238317ce818 ("ACPI: 
Clean up acpi_os_map/unmap_memory() to eliminate __iomem.") looks 
frankly a bit questionable in its decision to blithely cast away 
__iomem, but then the rationale in aafc65c731fe ("ACPI: add arm64 to the 
platforms that use ioremap") seems particularly dubious on top of that 
(especially given this end result).

At a wild guess, I'm wondering if this may be sufficient:

----->8-----
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 327e1b4eb6b0..f5d26b102fbe 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -277,7 +277,7 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size size)
         return NULL;
  }

-#if defined(CONFIG_IA64) || defined(CONFIG_ARM64)
+#if defined(CONFIG_IA64)
  /* ioremap will take care of cache attributes */
  #define should_use_kmap(pfn)   0
  #else
-----8<-----

otherwise we might need to refactor acpi_os_map_memory() to properly use 
memremap() instead of bodging the ioremap() path.

My actual ACPI-capable dev board has a trashed filesystem at the moment, 
but I guess I can try an ACPI VM with QEMU to poke further.

Robin.


More information about the linux-arm-kernel mailing list