next/master bisection: baseline.login on rk3288-rock2-square
Marc Zyngier
maz at kernel.org
Thu Feb 4 07:26:44 EST 2021
On 2021-02-04 10:55, Ard Biesheuvel wrote:
> (cc Marc)
>
> On Thu, 4 Feb 2021 at 11:48, Russell King - ARM Linux admin
> <linux at armlinux.org.uk> wrote:
>>
>> On Thu, Feb 04, 2021 at 11:27:16AM +0100, Ard Biesheuvel wrote:
>> > Hi Russell,
>> >
>> > If Guillaume is willing to do the experiment, and it fixes the issue,
>> > it proves that rk3288 is relying on the flush before the MMU is
>> > disabled, and so in that case, the fix is trivial, and we can just
>> > apply it.
>> >
>> > If the experiment fails (which would mean rk3288 does not tolerate the
>> > cache maintenance being performed after cache off), it is going to be
>> > hairy, and so it will definitely take more time.
>> >
>> > So in the latter case (or if Guillaume does not get back to us), I
>> > think reverting my queued fix is the only sane option. But in that
>> > case, may I suggest that we queue the revert of the original by-VA
>> > change for v5.12 so it gets lots of coverage in -next, and allows us
>> > an opportunity to come up with a proper fix in the same timeframe, and
>> > backport the revert and the subsequent fix as a pair? Otherwise, we'll
>> > end up in the situation where v5.10.x until today has by-va, v5.10.x-y
>> > has set/way, and v5.10y+ has by-va again. (I don't think we care about
>> > anything before that, given that v5.4 predates any of this)
>>
>> I'm suggesting dropping your fix (9052/1) and reverting
>> "ARM: decompressor: switch to by-VA cache maintenance for v7 cores"
>> which gets us to a point where _both_ regressions are fixed.
>>
>
> I understand, but we don't know whether doing so might regress other
> platforms that were added in the meantime.
>
>> I'm of the opinion that the by-VA patch was incorrect when it was
>> merged (it caused a regression), and it's only a performance
>> improvement.
>
> It is a correctness improvement, not a performance improvement.
>
> Without that change, the 32-bit ARM kernel cannot boot bare metal on
> platforms with a system cache such as 8040 or SynQuacer, and can only
> boot under KVM on such systems because of the special handling of
> set/way instructions by the host.

I agree. With set/way CMOs, there is no way to reach the PoC if
it is beyond the system cache, leading to an unbootable kernel.
This is actually pretty well documented in the architecture,
and it did bite us for the first time on XGene-1, 7 years ago.
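
To make the distinction concrete, here is a minimal sketch of what by-VA
maintenance amounts to: walk the address range one cache line at a time
and issue one operation per line. The helper names (for_each_cache_line,
count_line) and the 64-byte line size used in the example are assumptions
for illustration only; this is not the decompressor's actual code. On a
real v7 core the per-line callback would issue DCCIMVAC
(mcr p15, 0, <Rt>, c7, c14, 1), which is defined to act to the PoC,
whereas a set/way operation only targets one specific cache level.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: called once per cache line address. */
typedef void (*line_op_t)(uintptr_t line_addr, void *ctx);

/*
 * Invoke `op` for every cache line overlapping [start, start + size).
 * `line_size` must be a power of two. Returns the number of lines visited.
 * On real hardware, `op` would perform the by-VA clean+invalidate and the
 * caller would follow the loop with a DSB; here it is left abstract.
 */
static size_t for_each_cache_line(uintptr_t start, size_t size,
                                  size_t line_size, line_op_t op, void *ctx)
{
    uintptr_t addr, end;
    size_t visited = 0;

    if (size == 0)
        return 0;

    addr = start & ~((uintptr_t)line_size - 1); /* align down to a line */
    end = start + size;

    for (; addr < end; addr += line_size) {
        op(addr, ctx);
        visited++;
    }
    return visited;
}

/* Stand-in for the maintenance instruction: just counts invocations. */
static void count_line(uintptr_t line_addr, void *ctx)
{
    (void)line_addr;
    (*(size_t *)ctx)++;
}
```

Because each operation names an address rather than a set and way of one
particular cache, it does not matter how many levels of cache (system
cache included) sit between the CPU and the PoC.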

In retrospect, having KVM handle set/way CMOs was a mistake,
as it just papered over the problem for the sake of running older
32bit guests. It violated the principle of KVM/arm being strictly
architectural and provided unrealistic expectations. I'll take the
blame for this.
Thanks,
M.
--
Jazz is not dead. It just smells funny...