next/master bisection: baseline.login on rk3288-rock2-square
guillaume.tucker at collabora.com
Thu Feb 4 06:32:05 EST 2021
On 04/02/2021 10:33, Guillaume Tucker wrote:
> On 04/02/2021 10:27, Ard Biesheuvel wrote:
>> On Thu, 4 Feb 2021 at 11:06, Russell King - ARM Linux admin
>> <linux at armlinux.org.uk> wrote:
>>> On Thu, Feb 04, 2021 at 10:07:58AM +0100, Ard Biesheuvel wrote:
>>>> On Thu, 4 Feb 2021 at 09:43, Guillaume Tucker
>>>> <guillaume.tucker at collabora.com> wrote:
>>>>> Hi Ard,
>>>>> Please see the bisection report below about a boot failure on
>>>>> rk3288 with next-20210203. It was also bisected on
>>>>> imx6q-var-dt6customboard with next-20210202.
>>>>> Reports aren't automatically sent to the public while we're
>>>>> trialing new bisection features on kernelci.org but this one
>>>>> looks valid.
>>>>> The kernel is most likely crashing very early on, so there's
>>>>> nothing in the logs. Please let us know if you need some help
>>>>> with debugging or trying a fix on these platforms.
>>>> Thanks for the report.
>>> I want to send my fixes branch today which includes your regression
>>> fix that caused this regression.
>>> As this is proving difficult to fix, I can only drop your fix from
>>> my fixes branch - and given that this seems to be problematical, I'm
>>> tempted to revert the original change at this point which should fix
>>> both of these regressions - and then we have another go at getting rid
>>> of the set/way instructions during the next cycle.
>> Hi Russell,
>> If Guillaume is willing to do the experiment, and it fixes the issue,
> Yes, I'm running some tests with that fix now and should have
> some results shortly.
Yes, it does fix the issue:
with Ard's fix applied to this test branch:
It's worth mentioning that the issue only happens with kernels
built with Clang. As you can see there are several other arm
platforms failing with clang-11 builds but booting fine with
Here's a sample build log:
make -j18 ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- LLVM=1 CC="ccache clang" zImage
I believe it should be using the GNU assembler as LLVM_IAS=1 is
not defined, but there may be something more subtle about it.
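
As a rough way to check that assumption against any given build log, here is
a minimal sketch. It assumes that kernels of this period only use Clang's
integrated assembler when LLVM_IAS=1 is passed explicitly on the make command
line. The helper name assembler_for is hypothetical and is not part of any
kernel or KernelCI tooling; it only inspects the arguments and does not look
at what the toolchain actually ran.

```shell
# Hypothetical helper: guess which assembler a kernel make invocation
# selects, based only on whether LLVM_IAS=1 is among its arguments.
# Assumption: the integrated assembler is opt-in for this kernel version.
assembler_for() {
    for arg in "$@"; do
        if [ "$arg" = "LLVM_IAS=1" ]; then
            echo "integrated"
            return 0
        fi
    done
    # No LLVM_IAS=1 found: assume the GNU assembler from CROSS_COMPILE.
    echo "gnu"
}

# The command line from the sample build log above.
assembler_for make -j18 ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- \
    LLVM=1 CC="ccache clang" zImage
```

This only reflects the command line; confirming which assembler really
produced the objects would still need a look at the verbose build output.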
>> it proves that rk3288 is relying on the flush before the MMU is
>> disabled, and so in that case, the fix is trivial, and we can just
>> apply it.
>> If the experiment fails (which would mean rk3288 does not tolerate the
>> cache maintenance being performed after cache off), it is going to be
>> hairy, and so it will definitely take more time.
>> So in the latter case (or if Guillaume does not get back to us), I
>> think reverting my queued fix is the only sane option. But in that
>> case, may I suggest that we queue the revert of the original by-VA
>> change for v5.12 so it gets lots of coverage in -next, and allows us
>> an opportunity to come up with a proper fix in the same timeframe, and
>> backport the revert and the subsequent fix as a pair? Otherwise, we'll
>> end up in the situation where v5.10.x until today has by-va, v5.10.x-y
>> has set/way, and v5.10y+ has by-va again. (I don't think we care about
>> anything before that, given that v5.4 predates any of this)
>> But in the end, I'm happy to go along with whatever works best for you.