[PATCH v2] um: Fix stack pointer alignment

Johannes Berg johannes at sipsolutions.net
Tue Apr 20 07:50:42 BST 2021


Hi,

Sorry, went to sleep after sending the other mail last night :)

On Mon, 2021-04-19 at 15:36 -0500, YiFei Zhu wrote:
> On Mon, Apr 19, 2021 at 2:41 PM Johannes Berg <johannes at sipsolutions.net> wrote:
> > On Mon, 2021-04-19 at 10:32 -0500, YiFei Zhu wrote:
> > > On a side note, musl is unaffected by this issue because it forces
> > > 16 byte alignment via `and $-16,%rsi` in its clone wrapper.
> > > Similarly, glibc i386 is also uneffected because it has
> 
> I realize I forgot to fix the typo here. Can you amend it or shall I send a v3?
> 

Please resend - I most likely won't be the one applying the patch,
Richard will. It's just easier that way.

> 
> This could be considered a glibc bug that it doesn't force alignment,
> yeah, considering musl does it for both x86_32 and x86_64, and glibc
> does it for only x86_32 and not x86_64. However, I'm unaware of
> anything saying something like "it's libc's duty to align the stack
> pointer to clone()".
> 

True. And the man page does document alignment restrictions at least in
some cases.
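
For illustration only (this is not part of the patch): a minimal sketch of
what forcing the alignment on the caller side could look like in C, with the
same effect as the `and $-16,%rsi` in musl's clone wrapper. The helper names
align_stack_top() and stack_top_for_clone() are hypothetical, and the sketch
assumes a downward-growing stack as on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: round a stack top down to a 16-byte boundary.
 * Masking with ~15 is the same operation as `and $-16` on the pointer.
 */
static inline void *align_stack_top(void *top)
{
	return (void *)((uintptr_t)top & ~(uintptr_t)15);
}

/*
 * Hypothetical helper: compute an aligned top-of-stack for a buffer that
 * is going to be handed to clone(). The stack grows down, so the top is
 * base + size, rounded down so it never points past the buffer.
 */
static inline void *stack_top_for_clone(void *base, size_t size)
{
	assert(size >= 16);
	return align_stack_top((char *)base + size);
}
```

Rounding down rather than up costs at most 15 bytes of the buffer, but
guarantees the result stays inside it.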

> Speaking of aarch64, it looks like that message might be out of date.

Might very well be. I found another reference to "126-bit alignment"
(and have since sent a patch to fix that to say "128-bit"), but I don't
know whether any of that is really accurate.

> > 
> > It must also depend on the glibc version, because I've definitely been
> > testing UML_RTC on 64-bit, on Fedora 32 at the time.
> > 
> 
> Hmm. Interesting. I can't seem to find anything suggesting Fedora has
> a patch that would align the stack within clone() [3][4]. I also got a
> Fedora 32 docker image and could not see the aligning from disassembly
> of clone, and the gcc version installed by yum is 10.2.1-9.fc32, which
> is supposed to be affected by this issue... weird. I would expect this
> to fail outright. I'm considering compiling uml inside this container
> to see what is going on.

Hm, wait. It could also be a *compiler* version thing, right?

And actually, most of my testing probably wasn't on Fedora 32, but
rather on a somewhat dated version of nixos, with gcc 9.3. I could even
give you the description file to reproduce the exact environment, but I
doubt it's really worthwhile. Let's just fix the bug?

johannes



