revert scope for 5.8, was Re: dma-pool fixes

Amit Pundir amit.pundir at linaro.org
Sat Aug 1 07:57:04 EDT 2020


On Sat, 1 Aug 2020 at 14:27, Christoph Hellwig <hch at lst.de> wrote:
>
> On Sat, Aug 01, 2020 at 01:20:07AM -0700, David Rientjes wrote:
> > To follow-up on this, the introduction of the DMA atomic pools in 5.8
> > fixes an issue for any AMD SEV enabled guest that has a driver that
> > requires atomic DMA allocations (for us, nvme) because runtime decryption
> > of memory allocated through the DMA API may block.  This manifests itself
> > as "sleeping in invalid context" BUGs for any confidential VM user in
> > cloud.
> >
> > I unfortunately don't have Amit's device to be able to independently debug
> > this issue and certainly could not have done a better job at working the
> > bug than Nicolas and Christoph have done so far.  I'm as baffled by the
> > results as anybody else.
> >
> > I fully understand the no regressions policy.  I'd also ask that we
> > consider that *all* SEV guests are currently broken if they use nvme or
> > any other driver that does atomic DMA allocations.  It's an extremely
> > serious issue for cloud.  If there is *anything* that I can do to make
> > forward progress on this issue for 5.8, including some of the workarounds
> > above that Amit requested, I'd be very happy to help.  Christoph will make
> > the right decision for DMA in 5.8, but I simply wanted to state how
> > critical working SEV guests are to users.
>
> I'm between a rock and a hard place here.  If we simply want to revert
> commits as-is to make sure both the Raspberry Pi 4 and the phone do
> not regress we'll have to go all the way back and revert the whole SEV
> pool support.  I could try a manual revert of the multiple pool
> support, but it is very late for that.

Hi, I found the problematic memory region. It was a memory
chunk reserved/removed in the downstream tree but was
seemingly reserved upstream for different drivers. I failed to
calculate the length of the total region reserved downstream
correctly. And there was still a portion of memory left unmarked,
which I should have marked as reserved in my testing earlier
today.

Sorry for all the noise and thanks Nicolas, Christoph and David
for your patience.

Regards,
Amit Pundir


>
> Or maybe Linus has decided to cut a -rc8 which would give us a little
> more time.


