Why is the ARM SMMU v1/v2 put into bypass mode on kexec?

Fri Mar 22 12:52:52 PDT 2024

On 2024-03-22 15:55:29, Will Deacon wrote:
> Hey Jason,
> 
> On Tue, Mar 19, 2024 at 02:50:07PM -0300, Jason Gunthorpe wrote:
> > On Tue, Mar 19, 2024 at 03:47:56PM +0000, Will Deacon wrote:
> > 
> > > Right, it's hard to win if DMA-active devices weren't quiesced properly
> > > by the outgoing kernel. Either the SMMU is left in abort (leading to the
> > > problems you list above) or the SMMU is left in bypass (leading to possible
> > > data corruption). Which is better?
> > 
> > For whatever reason (and I really don't like this design) a lot of work
> > was done on x86 so that devices continue to work as-is right up until
> > the crash kernel does the first DMA operation, including having the
> > crash kernel non-disruptively inherit and retain the IOMMU
> > configuration (e.g. see the translation_pre_enabled() stuff in the
> > Intel driver).
> 
> Right, I'm also not thrilled about trying to implement that :)
> What we have at the moment seems to be good enough to avoid folks
> complaining about it.
> 
> For the case Tyler is reporting, though, I _think_ it's just a standard
> kexec() rather than a crashkernel.

That's correct.

Tyler