bcm2835-rng: Performance regression since 96cb9d055445

Phil Elwell phil at raspberrypi.com
Wed Aug 30 03:44:31 PDT 2023


Hi all,

On Wed, 30 Aug 2023 at 11:38, Thorsten Leemhuis
<regressions at leemhuis.info> wrote:
>
> /me gets the impression he has to chime in here
>
> On 25.08.23 14:14, Stefan Wahren wrote:
> > Am 25.08.23 um 13:26 schrieb Jason A. Donenfeld:
> >> On Fri, Aug 25, 2023 at 01:14:55PM +0200, Stefan Wahren wrote:
> >
> >>> i didn't find the time to fix the performance regression in bcm2835-rng
> >>> which affects Raspberry Pi 0 - 3, so report it at least. AFAIK the first
> >>> report about this issue was here [1] and identified the offending
> >>> commit:
> >>>
> >>> 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of
> >>>    cpu_relax()")
> >>>
> >>> #regzbot introduced: 96cb9d055445
> >>>
> >>> I was able to reproduce this issue with a Raspberry Pi 3 B+ on Linux
> >>> 6.5-rc6 (arm64/defconfig).
> >>>
> >>> Before:
> >>> time sudo dd if=/dev/hwrng of=/dev/urandom count=1 bs=4096 status=none
> >>>
> >>> real    3m29,002s
> >>> user    0m0,018s
> >>> sys    0m0,054s
> >> That's not surprising. But also, does it matter? That script has
> >> *always* been wrong. Writing to /dev/urandom like that has *never*
> >> ensured that those bytes are taken into account immediately after. It's
> >> just not how that interface works. So any assumptions based on that are
> >> bogus, and that line effectively does nothing.
> >>
> >> Fortunately, however, the kernel itself incorporates hwrng output into
> >> the rng pool, so you don't need to think about doing it yourself.
> >>
> >> So go ahead and remove that line from your script.
> >
> > Thanks for your explanation. Unfortunately this isn't my script.
>
> And I assume it's in the standard install of the RpiOS or similarly
> widespread?
>
> > I'm
> > just a former BCM2835 maintainer and interested that more user stick to
> > the mainline kernel instead of the vendor ones. I will try to report the
> > script owner.
>
> thx
>
> >> Now as far as the "regression" goes, we've made an already broken
> >> userspace script take 3 minutes longer than usual, but it still does
> >> eventually complete, so it's not making boot impossible or something.
> >> How this relates to the "don't break userspace" rule might be a matter
> >> of opinion.
>
> Yup, but I'd say it bad enough to qualify as regression. If it would be
> something like 10 seconds it might be something different, but 3 minutes
> will look like a hang to many people, and I'm pretty sure that's
> something Linus doesn't want to see. But let's not involve him for now
> and first try to solve this differently.
>
> >> If you think it does, maybe send a patch to Herbert reducing
> >> that sleep from 1000 to 100 and stating why with my background above,
> >> and see if he agrees it's worth fixing.
>
> Stefan, did you try to see how long it would take when the sleep time is
> reduced? I guess that might be our best chance to solve this, as
> reverting the culprit afaics would lead to regressions for others.
>
> /me wonders if the sleep time could even be reduced futher that 100

FYI, this is how I tackled it downstream:

https://github.com/raspberrypi/linux/commit/6a825ed68f75bd768e31110ba825b75c5c09cf2d

I can send a patch if it looks appropriate for upstream use.

Phil



More information about the linux-arm-kernel mailing list