bcm2835-rng: Performance regression since 96cb9d055445
Jason A. Donenfeld
Jason at zx2c4.com
Wed Aug 30 04:37:39 PDT 2023
On Wed, Aug 30, 2023 at 11:44:31AM +0100, Phil Elwell wrote:
> Hi all,
>
> On Wed, 30 Aug 2023 at 11:38, Thorsten Leemhuis
> <regressions at leemhuis.info> wrote:
> >
> > /me gets the impression he has to chime in here
> >
> > On 25.08.23 14:14, Stefan Wahren wrote:
> > > Am 25.08.23 um 13:26 schrieb Jason A. Donenfeld:
> > >> On Fri, Aug 25, 2023 at 01:14:55PM +0200, Stefan Wahren wrote:
> > >
> > >>> i didn't find the time to fix the performance regression in bcm2835-rng
> > >>> which affects Raspberry Pi 0 - 3, so report it at least. AFAIK the first
> > >>> report about this issue was here [1] and identified the offending
> > >>> commit:
> > >>>
> > >>> 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of
> > >>> cpu_relax()")
> > >>>
> > >>> #regzbot introduced: 96cb9d055445
> > >>>
> > >>> I was able to reproduce this issue with a Raspberry Pi 3 B+ on Linux
> > >>> 6.5-rc6 (arm64/defconfig).
> > >>>
> > >>> Before:
> > >>> time sudo dd if=/dev/hwrng of=/dev/urandom count=1 bs=4096 status=none
> > >>>
> > >>> real 3m29,002s
> > >>> user 0m0,018s
> > >>> sys 0m0,054s
> > >> That's not surprising. But also, does it matter? That script has
> > >> *always* been wrong. Writing to /dev/urandom like that has *never*
> > >> ensured that those bytes are taken into account immediately after. It's
> > >> just not how that interface works. So any assumptions based on that are
> > >> bogus, and that line effectively does nothing.
> > >>
> > >> Fortunately, however, the kernel itself incorporates hwrng output into
> > >> the rng pool, so you don't need to think about doing it yourself.
> > >>
> > >> So go ahead and remove that line from your script.
> > >
> > > Thanks for your explanation. Unfortunately this isn't my script.
> >
> > And I assume it's in the standard install of the RpiOS or similarly
> > widespread?
> >
> > > I'm
> > > just a former BCM2835 maintainer and interested that more user stick to
> > > the mainline kernel instead of the vendor ones. I will try to report the
> > > script owner.
> >
> > thx
> >
> > >> Now as far as the "regression" goes, we've made an already broken
> > >> userspace script take 3 minutes longer than usual, but it still does
> > >> eventually complete, so it's not making boot impossible or something.
> > >> How this relates to the "don't break userspace" rule might be a matter
> > >> of opinion.
> >
> > Yup, but I'd say it bad enough to qualify as regression. If it would be
> > something like 10 seconds it might be something different, but 3 minutes
> > will look like a hang to many people, and I'm pretty sure that's
> > something Linus doesn't want to see. But let's not involve him for now
> > and first try to solve this differently.
> >
> > >> If you think it does, maybe send a patch to Herbert reducing
> > >> that sleep from 1000 to 100 and stating why with my background above,
> > >> and see if he agrees it's worth fixing.
> >
> > Stefan, did you try to see how long it would take when the sleep time is
> > reduced? I guess that might be our best chance to solve this, as
> > reverting the culprit afaics would lead to regressions for others.
> >
> > /me wonders if the sleep time could even be reduced futher that 100
>
> FYI, this is how I tackled it downstream:
>
> https://github.com/raspberrypi/linux/commit/6a825ed68f75bd768e31110ba825b75c5c09cf2d
>
> I can send a patch if it looks appropriate for upstream use.
>
> Phil
Upstream discussion: https://lore.kernel.org/all/ZOoe0lOR9zpcAw5I@zx2c4.com/
More information about the linux-arm-kernel
mailing list