[PATCH 6/6] arm64: module: rework module VA range selection

Mark Rutland mark.rutland at arm.com
Tue May 9 08:16:33 PDT 2023


On Tue, May 09, 2023 at 04:38:05PM +0200, Ard Biesheuvel wrote:
> On Tue, 9 May 2023 at 16:18, Mark Rutland <mark.rutland at arm.com> wrote:
> >
> > On Tue, May 09, 2023 at 03:00:50PM +0100, Mark Rutland wrote:
> > > On Tue, May 09, 2023 at 01:40:12PM +0200, Ard Biesheuvel wrote:
> > > > On Tue, 9 May 2023 at 13:15, Mark Rutland <mark.rutland at arm.com> wrote:
> > > > > +       if (kernel_size >= SZ_2G) {
> > > > > +               pr_warn("Kernel is too large to support modules (%llu bytes)\n",
> > > > > +                       kernel_size);
> > > > > +               return 0;
> > > > > +       }
> > > > >
> > > > >         if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
> > > > > -               /*
> > > > > -                * Randomize the module region over a 2 GB window covering the
> > > > > -                * kernel. This reduces the risk of modules leaking information
> > > > > -                * about the address of the kernel itself, but results in
> > > > > -                * branches between modules and the core kernel that are
> > > > > -                * resolved via PLTs. (Branches between modules will be
> > > > > -                * resolved normally.)
> > > > > -                */
> > > > > -               module_range = SZ_2G - (u64)(_end - _stext);
> > > > > -               module_alloc_base = max((u64)_end - SZ_2G, (u64)MODULES_VADDR);
> > > > > +               pr_info("2G module region forced by RANDOMIZE_MODULE_REGION_FULL\n");
> > > > > +       } else if (kernel_size >= SZ_128M) {
> > > >
> > > > I suppose this bound is somewhat arbitrary? I mean, if kernel_size
> > > > were SZ_128M-SZ_4K, we'd have the exact same problem, and end up using
> > > > the 2G region all the same, just with a different diagnostic message?
> > >
> > > That's a fair point, and that's also true for the 2G boundary.
> > >
> > > Since the useful bound is arbitrary, it's probably better to log how many pages
> > > we could potentially use.
> > >
> > > I'll have a go at doing that instead.
> >
> > Locally I've modified this to always log the number of pages available for !PLT
> > and PLT, and removed the 128M and 2G boundary messages, so we'll always get
> > something like:
> >
> > [    0.487632] Modules: 2G module region forced by RANDOMIZE_MODULE_REGION_FULL
> > [    0.487914] Modules: 0 pages potentially available for non-PLT usage
> > [    0.487997] Modules: 514320 pages potentially available for PLT usage
> >
> > Does that sound ok?
> 
> I think 'in range' is more self explanatory than 'potentially
> available' 

Sure; I'm happy to change that.

> but I wonder whether we need this at all tbh.
> 
> With PLT support always enabled, the available module range is always
> 2G minus the size of the kernel image. The only difference is whether
> we will make an effort to stay within 128 MiB if the kernel's size
> permits it, but most people will not notice that at all.
> 
> So the only case to worry about here is where the kernel image is
> pathologically large, to the extent that it uses up all the module
> space too. Given that we are dealing with the first known case where
> the module space of 128 MiB is being exhausted, maybe we should WARN
> rather loudly when the kernel size exceeds 2G-128M but stay silent
> otherwise.
> 
> In any case, I don't have a strong preference one way or the other if
> stuff just works for the majority of users.

FWIW, I agree that this isn't going to be a problem for the majority of users.

The reason I care is that I regularly build hilariously large kernels, and
occasionally get bug reports from people doing similarly, and I'd like those to
work where possible, and self-diagnose otherwise.

For example, when I test with Syzkaller I enable a bunch of compile-time
instrumentation and regularly end up with ~200M-300M kernel images. In the
past I have built 900M+ images too, so 2G doesn't seem impossible for some
testing config (its practical usefulness aside).

Given that, I'd prefer to consistently log the information rather than WARN()
in the 2G-128M case, lest we start getting regression reports when people
inevitably manage to build a test kernel that large.
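
For illustration, the "always log" approach could be sketched in userspace C
roughly as below. This is not the code from the patch: pages_in_range() and
report_module_ranges() are hypothetical names, a 4K page size is assumed, and
the non-PLT count is simply reported as 0 when the 2G region is forced, to
match the sample output quoted above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SZ_4K   UINT64_C(0x00001000)
#define SZ_128M UINT64_C(0x08000000)
#define SZ_2G   UINT64_C(0x80000000)

/*
 * Hypothetical helper: number of whole pages left in a window of
 * `window` bytes once `kernel_size` bytes are taken by the kernel image.
 * Returns 0 when the kernel fills (or overflows) the window.
 */
uint64_t pages_in_range(uint64_t window, uint64_t kernel_size,
			uint64_t page_size)
{
	if (kernel_size >= window)
		return 0;
	return (window - kernel_size) / page_size;
}

/*
 * Hypothetical reporting routine: always print both figures instead of
 * warning only at the 128M or 2G boundaries. `force_2g` stands in for
 * CONFIG_RANDOMIZE_MODULE_REGION_FULL (an assumption of this sketch).
 */
void report_module_ranges(uint64_t kernel_size, int force_2g)
{
	uint64_t direct = force_2g ? 0 :
		pages_in_range(SZ_128M, kernel_size, SZ_4K);
	uint64_t plt = pages_in_range(SZ_2G, kernel_size, SZ_4K);

	printf("Modules: %" PRIu64 " pages in range for non-PLT usage\n",
	       direct);
	printf("Modules: %" PRIu64 " pages in range for PLT usage\n", plt);
}
```

With a 300M image, for example, this reports 0 pages for non-PLT usage and
447488 pages for PLT usage, so an oversized test kernel describes its own
situation without tripping a WARN().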

Thanks,
Mark.


