arch/x86: Fix stack alignment for user threads

The x86_64 SysV ABI requires 16-byte alignment for the stack pointer
during execution of normal code.  That means that on entry to an
ABI-compatible C function (which is reached via a CALL instruction
that pushes the return address) the RSP register must be MISaligned by
exactly 8 bytes.  The kernel mode thread setup got this right, but we
missed the equivalent condition in userspace entry.
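
The arithmetic described above can be sketched in a few lines of C. This
is an illustration only, not the code from the patch: the helper name
initial_entry_rsp() and both macros are hypothetical, and the sketch
assumes the stack top is passed in as a plain 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_ALIGN   16u /* SysV x86_64 stack alignment */
#define RET_ADDR_SIZE 8u  /* bytes a CALL would push for the return address */

/* Hypothetical helper (not from the patch): compute the RSP value to
 * load before jumping to a C function that is entered without a CALL.
 */
static uint64_t initial_entry_rsp(uint64_t stack_top)
{
	assert(stack_top >= STACK_ALIGN);

	/* Round down to a 16-byte boundary, as if a CALL were about to run. */
	uint64_t aligned = stack_top & ~(uint64_t)(STACK_ALIGN - 1u);

	/* Reserve the slot a CALL would have filled, so RSP % 16 == 8. */
	return aligned - RET_ADDR_SIZE;
}
```

For any stack top, the result satisfies RSP % 16 == 8, which is the
state a compiler-generated function prologue expects on entry.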

The end result was a misaligned stack, which is surprisingly robust
for most use.  But recent toolchains have started doing some more
elaborate vectorization, and the resulting SSE instructions started
failing in userspace on the misaligned loads.

Note that there's a comment about optimization: we're doing the stack
alignment in the "wrong place" and are needlessly wasting bytes in
some cases.  We should see the raw stack boundaries where we are
setting up RSP values.  Add a FIXME to this effect, but don't touch
anything as this patch is a targeted bugfix.

Also fix a somewhat embarrassing 32-bit-ism that would have truncated
the address of a userspace stack that we tried to put above 4G.
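
As a rough illustration of that kind of truncation (the exact expression
fixed by the patch is not shown here, so the function names below are
hypothetical), passing a 64-bit address through a 32-bit integer silently
drops everything above 4G:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration (not the patched code): what happens when a
 * 64-bit stack address passes through a 32-bit variable.
 */
static uint64_t through_32bit(uint64_t addr)
{
	uint32_t narrow = (uint32_t)addr; /* upper 32 bits are discarded */

	return narrow;
}

/* Small self-check showing the effect on addresses below and above 4G. */
static void truncation_example(void)
{
	/* Below 4G the value survives unchanged, which hides the bug. */
	assert(through_32bit(0x0000fff0ULL) == 0x0000fff0ULL);

	/* Above 4G the high bits vanish and the address points elsewhere. */
	assert(through_32bit(0x100002000ULL) == 0x00002000ULL);
}
```

Keeping the value in a pointer-sized type such as uintptr_t or uint64_t
on a 64-bit target avoids the loss.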

Fixes #31018

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Andy Ross 2021-02-03 12:56:51 -08:00 committed by Anas Nashif
commit cce5ff1510
2 changed files with 18 additions and 1 deletions

@@ -12,6 +12,14 @@
extern void x86_sse_init(struct k_thread *); /* in locore.S */
/* FIXME: This exists to make space for a "return address" at the top
 * of the stack. Obviously this is unused at runtime, but is required
 * for alignment: stacks at runtime should be 16-byte aligned, and a
 * CALL will therefore push a return address that leaves the stack
 * misaligned. Effectively we're wasting 8 bytes here to undo (!) the
 * alignment that the upper level code already tried to do for us. We
 * should clean this up.
 */
struct x86_initial_frame {
/* zeroed return address for ABI */
uint64_t rip;