[GIT PULL] arm64 updates for 6.1-rc1

Catalin Marinas catalin.marinas at arm.com
Fri Nov 11 03:15:11 PST 2022


On Tue, Nov 08, 2022 at 10:58:16PM +0530, Amit Pundir wrote:
> On Tue, 25 Oct 2022 at 18:08, Amit Pundir <amit.pundir at linaro.org> wrote:
> > On Wed, 12 Oct 2022 at 17:24, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > On Sat, Oct 08, 2022 at 08:28:26PM +0530, Amit Pundir wrote:
> > > > On Wed, 5 Oct 2022 at 20:11, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > > > Will Deacon (2):
> > > > >       arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
> > > >
> > > > This patch broke AOSP on Dragonboard 845c (SDM845). I don't see any
> > > > relevant crash in the attached log and device silently reboots into
> > > > USB crash dump mode. The crash is fairly reproducible on db845c. I
> > > > could trigger it twice in 5 reboots and it always crashes at the same
> > > > point during the boot process. Reverting this patch fixes the crash.
> > > >
> > > > I'm happy to test any debug patches that would help narrow
> > > > down this breakage.
[...]
> > Further narrowed down the breakage to the userspace daemon rmtfs
> > https://github.com/andersson/rmtfs. Is there anything specific in the
> > userspace code that I should be paying attention to?

Since you don't see a crash in the logs and the system simply restarts,
I suspect a deadlock that triggers the watchdog. We have an erratum
(826319), but that's for Cortex-A53. IIUC, SDM845 has Kryo 3xx series
CPUs which, based on some random Google searches, are derived from
A75/A55. Unfortunately, the MIDR_EL1 register doesn't match the Arm Ltd
numbering, so I have no idea what CPUs these are by looking at the boot
log.
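
For reference, a minimal userspace sketch of how a MIDR_EL1 value from
the boot log splits into fields. The field layout is the architectural
one; the macro and helper names are made up for this example and are not
kernel API.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural MIDR_EL1 field layout (names are local to this sketch). */
#define MIDR_IMPLEMENTER(m)	(((uint32_t)(m) >> 24) & 0xffu)
#define MIDR_VARIANT(m)		(((uint32_t)(m) >> 20) & 0xfu)
#define MIDR_ARCHITECTURE(m)	(((uint32_t)(m) >> 16) & 0xfu)
#define MIDR_PARTNUM(m)		(((uint32_t)(m) >> 4) & 0xfffu)
#define MIDR_REVISION(m)	((uint32_t)(m) & 0xfu)

/* Implementer codes: 0x41 ('A') is Arm Ltd, 0x51 ('Q') is Qualcomm. */
#define IMP_ARM		0x41u
#define IMP_QCOM	0x51u

/* Hypothetical helper: format the decoded fields into buf. */
static int midr_describe(uint32_t midr, char *buf, size_t len)
{
	return snprintf(buf, len,
			"implementer=0x%02x variant=0x%x arch=0x%x part=0x%03x rev=0x%x",
			(unsigned int)MIDR_IMPLEMENTER(midr),
			(unsigned int)MIDR_VARIANT(midr),
			(unsigned int)MIDR_ARCHITECTURE(midr),
			(unsigned int)MIDR_PARTNUM(midr),
			(unsigned int)MIDR_REVISION(midr));
}
```

For example, 0x410fd034 decodes to implementer 0x41 (Arm Ltd), part
0xd03 (Cortex-A53), r0p4. A value with implementer 0x51 uses Qualcomm's
own part numbering, so the part field alone does not say which Arm core,
if any, the CPU is derived from.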

I wouldn't be surprised if you hit a similar bug, though I couldn't find
anything close in the A55 errata notice.

While we could revert commit c44094eee32f ("arm64: dma: Drop cache
invalidation from arch_dma_prep_coherent()"), if you hit a real hardware
issue, it may also trigger in other scenarios where we only do cache
cleaning (without invalidation), such as arch_sync_dma_for_device(). So
I'd rather get to the bottom of this and potentially enable the
workaround for this chipset.

You could give it a quick try by adding the MIDR ranges for SDM845 to
struct midr_range workaround_clean_cache[].
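
A rough standalone model of what such a range list does, loosely
following the model/revision matching the kernel uses. This is only a
sketch: every sk_ and SK_ name is invented here, and the Qualcomm entry
uses a placeholder part number that has to be replaced with the values
the SDM845 cores actually report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model = implementer | architecture | part number; rv = variant | revision. */
#define SK_MIDR_MODEL_MASK	0xff0ffff0u
#define SK_MIDR_REV_MASK	0x00f0000fu

#define SK_MIDR_MODEL(imp, part) \
	(((uint32_t)(imp) << 24) | (0xfu << 16) | ((uint32_t)(part) << 4))
#define SK_MIDR_RV(var, rev)	(((uint32_t)(var) << 20) | (uint32_t)(rev))

struct sk_midr_range {
	uint32_t model;
	uint32_t rv_min;
	uint32_t rv_max;
};

#define SK_CORTEX_A53		SK_MIDR_MODEL(0x41, 0xd03)
/* Placeholder only: NOT a confirmed SDM845 part number. */
#define SK_HYPOTHETICAL_QCOM	SK_MIDR_MODEL(0x51, 0x800)

static const struct sk_midr_range sk_clean_cache[] = {
	/* Cortex-A53 r0p0..r0p2, the range covered by erratum 826319 */
	{ SK_CORTEX_A53, SK_MIDR_RV(0, 0), SK_MIDR_RV(0, 2) },
	/* Hypothetical Qualcomm core, all variants and revisions */
	{ SK_HYPOTHETICAL_QCOM, SK_MIDR_RV(0, 0), SK_MIDR_RV(0xf, 0xf) },
	{ 0, 0, 0 },	/* sentinel */
};

/* Return true if midr falls in any range of the sentinel-terminated list. */
static bool sk_midr_in_list(uint32_t midr, const struct sk_midr_range *r)
{
	uint32_t model = midr & SK_MIDR_MODEL_MASK;
	uint32_t rv = midr & SK_MIDR_REV_MASK;

	for (; r->model; r++) {
		if (model == r->model && rv >= r->rv_min && rv <= r->rv_max)
			return true;
	}
	return false;
}
```

With this list, sk_midr_in_list(0x410fd032, sk_clean_cache) matches
(A53 r0p2) while 0x410fd034 (A53 r0p4) does not. Covering all variants
and revisions for the Qualcomm entry is a deliberate choice for a quick
experiment, not something to merge without an erratum number.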

After that I suggest you raise it with Qualcomm to investigate. Normally
we ask for an erratum number to enable a workaround and it's only
Qualcomm that can provide one here.

-- 
Catalin


