Re: ❌ FAIL: Test report for kernel 5.10.0-rc5 (arm-next)

Xiao Ni xni at redhat.com
Fri Nov 27 05:13:50 EST 2020



On 11/27/2020 05:09 PM, Will Deacon wrote:
> [+Mark, since this is basically hammering on his lockdep/tracing fixes]
>
> On Fri, Nov 27, 2020 at 07:58:13AM -0000, CKI Project wrote:
>> We ran automated tests on a recent commit from this kernel tree:
>>
>>         Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
>>              Commit: 770428a3d2c3 - arm64: sdei: fix arm64_{enter,exit}_nmi() calls
>>
>> The results of these automated tests are provided below.
>>
>>      Overall result: FAILED (see details below)
>>               Merge: OK
>>             Compile: OK
>>               Tests: FAILED
>>
>> All kernel binaries, config files, and logs are available for download here:
>>
>>    https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2020/11/26/618759
>>
>> One or more kernel tests failed:
>>
>>      aarch64:
>>       ❌ LTP
> This guy is just perf_event_open02, and I'm not worried about it. The
> overhead of PROVE_LOCKING and friends is likely the culprit (and I see
> stuff in dmesg about throttling the sample period).
>
>>       ❌ storage: software RAID testing
> This one has an interesting RCU splat, but it looks like a straight-up
> bug to me:
>
>    drivers/md/raid10.c:1732 suspicious rcu_dereference_check() usage!
>
> as there appear to be two calls to rcu_dereference() outside of a read-side
> critical section introduced relatively recently by bcc90d280465 ("md/raid10:
> improve raid10 discard request").
>
> I've added the folks from that commit. Full logs at:
>
> https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/datawarehouse-public/2020/11/26/618759/build_aarch64_redhat%3A1038231/tests/storage_software_RAID_testing/9107170_aarch64_2_dmesg.log
>
> But anyway, Mark, your stuff seems to be holding up. Thanks.
>
> Will
>
Hi Will

Thanks for pointing this out.

rdev must be valid at this point, because rdev->nr_pending has already
been incremented before this for loop whenever a flush bio needs to be
sent to this rdev. So the patch below should fix this problem.
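
To illustrate the reasoning, here is a minimal userspace sketch of the
pattern: look the pointer up and take a reference inside a read-side
section, then use a plain load later while the reference is held. It is
only a model, not kernel code. Every name in it (rdev_model,
mirror_model, model_read_lock, model_checked_deref and so on) is made up
for this example, and the "splat" is just a counter standing in for the
lockdep warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct rdev_model {
    int nr_pending;   /* models the rdev->nr_pending reference count */
};

struct mirror_model {
    struct rdev_model *rdev;
};

static int read_depth;   /* > 0 while inside a modeled read-side section */
static int splat_count;  /* how many times the modeled check complained */

static void model_read_lock(void)   { read_depth++; }
static void model_read_unlock(void) { read_depth--; }

/* Models a checked dereference: complains when no read-side section is held. */
static struct rdev_model *model_checked_deref(struct rdev_model **pp)
{
    if (read_depth == 0)
        splat_count++;
    return *pp;
}

/* First pass: look up the device inside a read-side section and pin it. */
static int model_pin(struct mirror_model *m)
{
    struct rdev_model *rdev;
    int pinned = 0;

    model_read_lock();
    rdev = model_checked_deref(&m->rdev);
    if (rdev) {
        rdev->nr_pending++;
        pinned = 1;
    }
    model_read_unlock();
    return pinned;
}

/* Second pass: the device is pinned, so a plain load is used instead. */
static struct rdev_model *model_use_pinned(struct mirror_model *m)
{
    assert(m->rdev && m->rdev->nr_pending > 0); /* reference must be held */
    return m->rdev;
}

/* Runs both passes, then one checked load outside a section. */
int model_demo(void)
{
    struct rdev_model dev = { 0 };
    struct mirror_model m = { &dev };

    splat_count = 0;
    if (model_pin(&m))
        (void)model_use_pinned(&m);     /* plain load, no complaint */
    (void)model_checked_deref(&m.rdev); /* complaint: no section held */
    return splat_count;
}
```

In this model only the checked load made outside the read-side section
is counted, which is the situation the lockdep report describes. Whether
a plain load is the right form in raid10.c itself is for the md
maintainers to judge.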

Hi, Song, do you think the following patch is good?


diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 3153183..7324a7d 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1729,9 +1729,8 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
         for (disk = 0; disk < geo->raid_disks; disk++) {
                 sector_t dev_start, dev_end;
                 struct bio *mbio, *rbio = NULL;
-               struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
-               struct md_rdev *rrdev = rcu_dereference(
-                       conf->mirrors[disk].replacement);
+               struct md_rdev *rdev = conf->mirrors[disk].rdev;
+               struct md_rdev *rrdev = conf->mirrors[disk].replacement;

                 /*
                  * Now start to calculate the start and end address for each disk.

Regards
Xiao




More information about the linux-arm-kernel mailing list