[PATCH v2] um: Fix stack pointer alignment

YiFei Zhu zhuyifei1999 at gmail.com
Mon Apr 19 21:36:11 BST 2021


On Mon, Apr 19, 2021 at 2:41 PM Johannes Berg <johannes at sipsolutions.net> wrote:
> On Mon, 2021-04-19 at 10:32 -0500, YiFei Zhu wrote:
> > On a side note, musl is unaffected by this issue because it forces
> > 16 byte alignment via `and $-16,%rsi` in its clone wrapper.
> > Similarly, glibc i386 is also uneffected because it has

I realize I forgot to fix the typo here. Can you amend it or shall I send a v3?

> > `andl $0xfffffff0, %ecx`.
>
> I wonder if this isn't really a glibc bug?
>
> After all, the man page states no alignment restrictions, except when
> documenting the error codes:
>
> EINVAL
> stack is not aligned to a suitable boundary for  this  architecture.
> For example, on aarch64, stack must be a multiple of 16.

This could be considered a glibc bug for not forcing alignment, yeah,
considering musl does it for both x86_32 and x86_64, while glibc does
it only for x86_32 and not x86_64. However, I'm not aware of anything
that says something like "it's libc's duty to align the stack pointer
passed to clone()".
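
For illustration, the effect of the `and $-16,%rsi` quoted above can be
sketched in C. This is only a minimal sketch: align_stack_down() is a
hypothetical name I made up here, not a function from musl, glibc, or UML,
and it assumes the stack grows down so that rounding down stays inside the
allocation.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from any libc): round a stack top down to a
 * 16-byte boundary. Masking with ~15 clears the low four bits, which is
 * the same operation as `and $-16` on the register holding the pointer.
 */
static uintptr_t align_stack_down(uintptr_t sp)
{
	return sp & ~(uintptr_t)15;
}

/* Hypothetical helper: nonzero if sp is already 16-byte aligned. */
static int stack_is_aligned16(uintptr_t sp)
{
	return (sp & 15) == 0;
}
```

A caller that cannot rely on libc doing this could apply
align_stack_down() to the stack top before passing it to clone().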

Speaking of aarch64, it looks like that man page text might be out of
date. I was trying to find where this is enforced and could not
quickly find the code, so I did a quick search and found commit
e6d9a5254333 ("arm64: do not enforce strict 16 byte alignment to stack
pointer"), along with two related discussions [1][2]. It seems that the
error entry was added to the man page around the same time as the
check was removed, and the check was there only because a misaligned
stack would have caused SIGBUS when the clone returns. On x86, though,
a non-16-byte-aligned push / pop would not SIGBUS, unlike on aarch64...

[1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/1462985814-16146-1-git-send-email-colin.king@canonical.com/
[2] https://lore.kernel.org/linux-man/571E731A.6050809@canonical.com/

> > To reproduce this bug, enable CONFIG_UML_RTC. uml_rtc will call
> > add_sigio_fd which will then cause write_sigio_thread to go
> > into segfault loop.
>
> It must also depend on the glibc version, because I've definitely been
> testing UML_RTC on 64-bit, on Fedora 32 at the time.
>

Hmm. Interesting. I can't seem to find anything suggesting Fedora has
a patch that would align the stack within clone() [3][4]. I also got a
Fedora 32 docker image and could not see any alignment in the
disassembly of clone, and the gcc version installed by yum is
10.2.1-9.fc32, which is supposed to be affected by this issue... weird.
I would expect this to fail outright. I'm considering compiling UML
inside this container to see what is going on.

[3] https://github.com/bminor/glibc/commits/master/sysdeps/unix/sysv/linux/x86_64/clone.S
[4] https://src.fedoraproject.org/rpms/glibc/tree/rawhide

YiFei Zhu


