[PATCH v3] riscv: cif: reduce shadow stack size limit from 4GB to 2GB
Zong Li
zong.li at sifive.com
Tue May 19 00:04:47 PDT 2026
On Mon, May 18, 2026 at 5:57 PM David Laight
<david.laight.linux at gmail.com> wrote:
>
> On Mon, 18 May 2026 11:54:32 +0800
> Zong Li <zong.li at sifive.com> wrote:
>
> > On Sat, May 16, 2026 at 3:16 AM David Laight
> > <david.laight.linux at gmail.com> wrote:
> > >
> > > On Fri, 15 May 2026 22:29:05 +0800
> > > Zong Li <zong.li at sifive.com> wrote:
> > >
> > > > On Fri, May 15, 2026 at 5:24 PM David Laight
> > > > <david.laight.linux at gmail.com> wrote:
> > > > >
> > > > > On Fri, 15 May 2026 11:42:45 +0800
> > > > > Zong Li <zong.li at sifive.com> wrote:
> > > > >
> > > > > > On Thu, May 14, 2026 at 4:56 PM David Laight
> > > > > ..
> > > > > > > > I also don't understand the rationale for just /2 and the 2G upper limit.
> > > > > > > You need 512 nested function calls to even use 4k.
> > > > > > > That would have to be quite deep recursion.
> > > > > >
> > > > > > > During the discussions about the ARM GCS v3 series, the community pointed
> > > > > > out that a 4G shadow stack might be too large. This size is hard to
> > > > > > support in memory-constrained environments like Android. However, the
> > > > > > size cannot be too small either, or we might face stack overflow
> > > > > > issues. At that time, a perfect size was not decided.
> > > > >
> > > > > It is only VA not real memory so shouldn't make much difference to memory
> > > > > use (except for nommu where the actual memory has to be allocated).
> > > > >
> > > >
> > > > You raise a valid point that shadow stacks are primarily a VA
> > > > allocation. However, in Linux, the memory overcommit mechanism creates
> > > > a practical link between VA allocation and physical memory capacity.
> > > > As I mentioned in the commit message, memory allocation will fail when
> > > > the overcommit mode is set to OVERCOMMIT_GUESS or OVERCOMMIT_NEVER.
> > > >
> > > > > In __vm_enough_memory():
> > > > >     if (pages > totalram_pages() + total_swap_pages)
> > > > >             goto error;
> > > >
> > > > Many page requests for VA will fail if the requested size exceeds the
> > > > system's total RAM plus Swap. On memory-constrained systems,
> > > > allocating a massive 4GB shadow stack per thread would immediately
> > > > trigger this error.
> > >
> > > But reducing the size by half makes little difference.
> > > You'd need a much bigger reduction to make any real difference.
> > >
> >
> > I agree with you that a smaller size would cover more cases. I am very
> > open to your ideas regarding the size. Would you prefer to use 1GB or
> > 512MB as the default instead?
> > As I mentioned in my previous emails, using 2GB seems to be a safe
> > starting point. This is because it is already accepted by the
> > community and the Android system (in GCS implementation).
> > Additionally, although the CFI feature doesn't support 32-bit systems
> > yet, normal 32-bit systems can only support up to 4GB of physical
> > memory. If the default shadow stack size is 4GB, it would be almost
> > impossible to run on a 32-bit system. Using at least 2GB can help
> > avoid this issue in the future. If you don't have a preferred default
> > value, maybe we could start with 2G?
>
> I've no real idea - note that the rlimit value should be small for 32bit
> (or at least the actual stack is small regardless of the rlimit value).
> The 2G is just an upper bound - probably matching the 4G upper bound
> for the 64bit stack itself.
>
> Don't focus on the 2G limit, but on the rlimit(STACK)/2 (or rather the size
> of the normal stack).
> On my systems the default soft limit is 8M, a 4M shadow stack supports 512k
> nested function calls (64bit) - none of which can have any local data.
> In reality programs that use a lot of stack allocate large buffers on stack,
> they don't have silly depths of recursive functions with no local data.
>
> I've just looked at vmlinux.o - which won't be representative of userspace!
> While there are a lot of functions with small stack frame (sub $0x10,%rsp)
> they tend to have saved a few registers in stack first.
> The majority will have a stack delta of over 64 bytes.
> That corresponds to rlimit(STACK)/8 and even that is conservative.
>
> I'd suspect you could safely halve that again.

Thank you for the detailed analysis and the empirical data.

With RLIMIT_STACK/8, the corresponding upper limit could be SZ_512M,
which also better accommodates memory-constrained platforms. This
would provide:
- 128K call depth on typical systems (8MB rlimit)
- 75% memory savings in multi-threaded environments
- More reasonable for memory-constrained systems

While I appreciate your suggestion that RLIMIT_STACK/16 might also be
sufficient, let's try the RLIMIT_STACK/8 approach first. This gives us
a good balance between memory efficiency and safety margin, and we can
always revisit if real-world usage data suggests we can go smaller.

I'll update the patch to use RLIMIT_STACK/8 with SZ_512M cap.

Thank you again for the thorough analysis!
>
> Actually would it be possible to initially just allocate one page?
> If you get an overflow fault on the shadow stack I think you can
> safely reallocate it at an entirely different user virtual address.
> That would remove all the problems over committing a lot of swap.
> Most threads will never do the 512 nested calls needed to blow the stack.
>
> -- David
>
> >
> >
> > > -- David
> > >
> > > >
> > > > > But 32bit programs with lots of threads can run out of VA.
> > > > > Increasing the stack VA size by 50% might even give problems for 64bit
> > > > > > programs - if they are already reducing the thread stack size to avoid
> > > > > running out of VA.
> > > > >
> > > > > I've not checked, but pthread_attr_setstacksize() sets a limit for the
> > > > > > thread stack size (which would otherwise default to rlimit(STACK)).
> > > > > I don't believe it should update the rlimit value itself.
> > > > > In which case you are using the wrong size.
> > > > >
> > > > > But for a thread with a very reduced stack (say 128k) you probably only
> > > > > need 1 page of shadow stack, any more could easily lead to running out
> > > > > of VA.
> > > > >
> > > > > -- David
> > > >
> > >
>