arm64: csdlock at early boot due to slow serial (?)
Mark Rutland
mark.rutland at arm.com
Thu Jul 3 09:31:09 PDT 2025
On Thu, Jul 03, 2025 at 03:13:26PM +0100, Breno Leitao wrote:
> On Thu, Jul 03, 2025 at 11:28:50AM +0100, Mark Rutland wrote:
> > On Wed, Jul 02, 2025 at 10:10:21AM -0700, Breno Leitao wrote:
> > > I'm observing two unusual behaviors during the boot process on my SBSA
> > > ARM machine, with upstream kernel (6.16-rc4):
> >
> > Can you say which SoC in particular that is? Knowing that would help to
> > identify whether there's some known erratum, clocking issue, etc.
>
> This is a custom-made rack-mounted machine based on the Grace CPU. Here is
> some info about the hardware:
>
> # lscpu:
> Vendor ID: ARM
> Model name: Neoverse-V2
> Model: 0
> Thread(s) per core: 1
> Core(s) per socket: 72
> Socket(s): 1
> Stepping: r0p0
>
> # /proc/cpuinfo
> processor : 71
> BogoMIPS : 2000.00
> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti
> CPU implementer : 0x41
> CPU architecture: 8
> CPU variant : 0x0
> CPU part : 0xd4f
> CPU revision : 0
>
> # lshw
> description: Rack Mount Chassis
> product: <Internal name>
> vendor: Quanta
> version: <Internal name>
> width: 64 bits
> capabilities: smbios-3.6.0 dmi-3.6.0 smp sve_default_vector_length tagged_addr_disabled
> configuration: boot=normal chassis=rackmount family=Default string sku=Default string uuid=...
>
> How do I find the SoC exactly?

From what you've told me above, the SoC is Nvidia Grace; what they call
the CPU is the whole SoC.

> > Likewise that might imply more folk to add to Cc.

I've added Ankit and Besar, since they've both worked on some system
level bits on Grace, and might have an idea.

Ankit, Besar, are you aware of any UART issues on Grace (as described in
Breno's messages below), or do you know of anyone who might have an
idea?

Thanks,
Mark.

> > [...]
> >
> > > At timestamp 9.69 seconds, the serial console is still flushing messages from
> > > 0.92 seconds, indicating that the initial 9-second gap is spent looping in
> > > cpu_relax(), about 20,000 times per message, which is clearly suboptimal.
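
For a sense of scale, here is a rough sketch of how long a console line
takes on the wire. The baud rate (115200), the 8N1 framing (10 bits per
character) and the 80-character line length are all assumptions for
illustration; none of them is stated in this thread, and the helper names
are made up.

```c
#include <stdint.h>

/*
 * Time, in microseconds, to shift nchars characters out of a UART.
 * baud and bits_per_char are ASSUMED values supplied by the caller,
 * e.g. 115200 and 10 (1 start + 8 data + 1 stop bit).
 */
uint64_t wire_time_us(uint64_t nchars, uint64_t baud, uint64_t bits_per_char)
{
    return nchars * bits_per_char * 1000000ULL / baud;
}

/* Number of characters the line can carry in the given number of seconds. */
uint64_t chars_in_seconds(uint64_t seconds, uint64_t baud, uint64_t bits_per_char)
{
    return seconds * baud / bits_per_char;
}
```

Under those assumptions, wire_time_us(80, 115200, 10) comes to roughly
6.9 ms per line and chars_in_seconds(9, 115200, 10) to about 100 KB, so
whether a 9-second backlog is plausible depends on how much text the
early boot log actually contains and on the real line rate.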
> > >
> > > Further debugging revealed the following sequence with the pl011 registers:
> > >
> > > 1) uart_console_write()
> > > 2) REG_FR has BUSY | RXFE | TXFF for a while (~1k cpu_relax())
> > > 3) RXFE and TXFF are cleared, and BUSY stays on for another 17k-19k cpu_relax()
> > >
> > > Michael has reported a hardware issue where the BUSY bit could get
> > > stuck (see commit d8a4995bcea1: "tty: pl011: Work around QDF2400 E44 stuck BUSY
> > > bit"), which is very similar. TXFE goes down, but BUSY is(?) still stuck for a long time.
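
The sequence above can be modelled in user space with something like the
sketch below. The flag values are the PL011 UARTFR bit positions; the
struct fake_uart model, its thresholds and the function names are
hypothetical and only mimic the reported behaviour. This is not the
driver's code.

```c
#include <stdint.h>
#include <stdio.h>

/* PL011 flag register (UARTFR) bits. */
#define FR_BUSY 0x008u
#define FR_RXFE 0x010u
#define FR_TXFF 0x020u
#define FR_TXFE 0x080u

/* Hypothetical UART model: flags clear after a fixed number of FR reads. */
struct fake_uart {
    unsigned long reads;            /* FR reads performed so far */
    unsigned long fifo_flag_reads;  /* RXFE | TXFF stay set for this many reads */
    unsigned long busy_reads;       /* BUSY stays set for this many reads */
};

uint32_t fake_read_fr(struct fake_uart *u)
{
    unsigned long n = u->reads++;
    uint32_t fr = 0;

    if (n < u->fifo_flag_reads)
        fr |= FR_RXFE | FR_TXFF;
    if (n < u->busy_reads)
        fr |= FR_BUSY;
    return fr;
}

/* Spin until BUSY clears (or max_spins is hit); return the spin count. */
unsigned long wait_while_busy(struct fake_uart *u, unsigned long max_spins)
{
    unsigned long spins = 0;

    while ((fake_read_fr(u) & FR_BUSY) && spins < max_spins)
        spins++;    /* the kernel would call cpu_relax() here */
    return spins;
}

/* Render the set flags as text, e.g. "BUSY | RXFE | TXFF". len must be > 0. */
const char *fr_decode(uint32_t fr, char *buf, size_t len)
{
    static const struct { uint32_t bit; const char *name; } bits[] = {
        { FR_BUSY, "BUSY" }, { FR_RXFE, "RXFE" },
        { FR_TXFF, "TXFF" }, { FR_TXFE, "TXFE" },
    };
    size_t off = 0;

    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(fr & bits[i].bit))
            continue;
        int n = snprintf(buf + off, len - off, "%s%s",
                         off ? " | " : "", bits[i].name);
        if (n < 0 || (size_t)n >= len - off)
            break;  /* output truncated; stop appending */
        off += (size_t)n;
    }
    return buf;
}
```

With fifo_flag_reads = 1000 and busy_reads = 19000 the model spins 19000
times before BUSY drops, which is the same order as the 17k-19k
iterations reported. Logging the decoded FR value together with a
timestamp on real hardware would show whether BUSY tracks the time
needed to shift the characters out or stays set well beyond it.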
> >
> > Looking at the commit message, that was an issue with a "custom
> > (non-PrimeCell) implementation of the SBSA UART" present on QDF2400. I
> > assume that was something that Qualcomm Datacenter Technologies designed
> > themselves.
> >
> > It's possible that your SoC has a similar issue with whatever IP block
> > is being used as the UART, but the issue in that commit certainly
> > doesn't apply to most PL011 / SBSA-UART implementations.
>
> That makes total sense. Decoding SPCR I see the following:
>
> # iasl -d spcr.dat
> Intel ACPI Component Architecture
> ASL+ Optimizing Compiler/Disassembler version 20210604
> Copyright (c) 2000 - 2021 Intel Corporation
>
> File appears to be binary: found 56 non-ASCII characters, disassembling
> Binary file appears to be a valid ACPI table, disassembling
> Input file spcr.dat, Length 0x50 (80) bytes
> ACPI: SPCR 0x0000000000000000 000050 (v02 NVIDIA A M I 00000001 ARMH 00010000)
> Acpi Data Table [SPCR] decoded
> Formatted output: spcr.dsl - 2624 bytes
>
> Thanks,
> --breno