bcm2835-rng: Performance regression since 96cb9d055445

Thorsten Leemhuis regressions at leemhuis.info
Wed Aug 30 03:38:03 PDT 2023


/me gets the impression he has to chime in here

On 25.08.23 14:14, Stefan Wahren wrote:
> Am 25.08.23 um 13:26 schrieb Jason A. Donenfeld:
>> On Fri, Aug 25, 2023 at 01:14:55PM +0200, Stefan Wahren wrote:
>
>>> i didn't find the time to fix the performance regression in bcm2835-rng
>>> which affects Raspberry Pi 0 - 3, so report it at least. AFAIK the first
>>> report about this issue was here [1] and identified the offending
>>> commit:
>>>
>>> 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of
>>>    cpu_relax()")
>>>
>>> #regzbot introduced: 96cb9d055445
>>>
>>> I was able to reproduce this issue with a Raspberry Pi 3 B+ on Linux
>>> 6.5-rc6 (arm64/defconfig).
>>>
>>> Before:
>>> time sudo dd if=/dev/hwrng of=/dev/urandom count=1 bs=4096 status=none
>>>
>>> real    3m29,002s
>>> user    0m0,018s
>>> sys    0m0,054s
>> That's not surprising. But also, does it matter? That script has
>> *always* been wrong. Writing to /dev/urandom like that has *never*
>> ensured that those bytes are taken into account immediately after. It's
>> just not how that interface works. So any assumptions based on that are
>> bogus, and that line effectively does nothing.
>>
>> Fortunately, however, the kernel itself incorporates hwrng output into
>> the rng pool, so you don't need to think about doing it yourself.
>>
>> So go ahead and remove that line from your script.
>
> Thanks for your explanation. Unfortunately this isn't my script.

And I assume it's in the standard install of the RpiOS or similarly
widespread?

> I'm
> just a former BCM2835 maintainer and interested that more user stick to
> the mainline kernel instead of the vendor ones. I will try to report the
> script owner.

thx

>> Now as far as the "regression" goes, we've made an already broken
>> userspace script take 3 minutes longer than usual, but it still does
>> eventually complete, so it's not making boot impossible or something.
>> How this relates to the "don't break userspace" rule might be a matter
>> of opinion.

Yup, but I'd say it bad enough to qualify as regression. If it would be
something like 10 seconds it might be something different, but 3 minutes
will look like a hang to many people, and I'm pretty sure that's
something Linus doesn't want to see. But let's not involve him for now
and first try to solve this differently.

>> If you think it does, maybe send a patch to Herbert reducing
>> that sleep from 1000 to 100 and stating why with my background above,
>> and see if he agrees it's worth fixing.

Stefan, did you try to see how long it would take when the sleep time is
reduced? I guess that might be our best chance to solve this, as
reverting the culprit afaics would lead to regressions for others.

/me wonders if the sleep time could even be reduced futher that 100

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> Or, if removing that line from
>> your scripts is good enough for you, that's also fine by me.
> Now i agree that the provided example isn't the proper way to handle
> /dev/urandom. Unfortunately most of the Raspberry Pi users doesn't care
> about such details. In their eyes the mainline kernel is "broken" and
> this is one argument to go back to the vendor ones.
> 
> Beside of the /dev/urandom abuse, i'm not convinced that sleeping for 1
> sec is a good choice. This HRNG IP is used in embedded devices (e.g.
> Router) and mostly without any other good source of entropy. So i think
> it's worth to define a reasonable value.




More information about the linux-arm-kernel mailing list