arch/x86: Fix stack alignment for user threads

The x86_64 SysV ABI requires 16-byte alignment for the stack pointer during execution of normal code. That means that on entry to an ABI-compatible C function (which is reached via a CALL instruction that pushes the return address) the RSP register must be MISaligned by exactly 8 bytes.

The kernel mode thread setup got this right, but we missed the equivalent condition in userspace entry. The end result was a misaligned stack, which is surprisingly robust for most use. But recent toolchains have started doing some more elaborate vectorization, and the resulting SSE instructions started failing in userspace on the misaligned loads.

Note that there's a comment about optimization: we're doing the stack alignment in the "wrong place" and are needlessly wasting bytes in some cases. We should see the raw stack boundaries where we are setting up RSP values. Add a FIXME to this effect, but don't touch anything as this patch is a targeted bugfix.

Also fix a somewhat embarrassing 32-bit-ism that would have truncated the address of a userspace stack that we tried to put above 4G.

Fixes #31018

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
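
For illustration only, the alignment rule described in the commit message can be sketched as a small helper. This is not the code in the patch: `entry_rsp` is a hypothetical name, and the sketch assumes the stack top is passed as a full 64-bit value (keeping it 64 bits wide also avoids the kind of truncation mentioned for stacks above 4G).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): compute the RSP value to
 * load before jumping, rather than CALLing, into an ABI-compatible
 * C function.
 */
static inline uint64_t entry_rsp(uint64_t stack_top)
{
	/* Round down to a 16-byte boundary, the alignment the ABI
	 * expects immediately before a CALL instruction.
	 */
	uint64_t aligned = stack_top & ~UINT64_C(0xf);

	/* Reserve the 8-byte slot that a CALL would have filled with a
	 * return address, so the callee sees RSP % 16 == 8.
	 */
	uint64_t rsp = aligned - sizeof(uint64_t);

	assert(rsp % 16 == 8);
	return rsp;
}
```

With this sketch, a stack top of `0x1000` would give an entry RSP of `0xff8`. Storing the result in a 32-bit variable would drop the upper bits of any address above 4G, which is why the value stays `uint64_t` throughout.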
2021-02-03 12:56:51 -08:00 · 2021-02-03 12:56:51 -08:00 · cce5ff1510
commit cce5ff1510
parent a980762f70
2 changed files with 18 additions and 1 deletions
--- a/arch/x86/core/intel64/thread.c
+++ b/arch/x86/core/intel64/thread.c
@@ -12,6 +12,14 @@

 extern void x86_sse_init(struct k_thread *); /* in locore.S */

+/* FIXME: This exists to make space for a "return address" at the top
+ * of the stack.  Obviously this is unused at runtime, but is required
+ * for alignment: stacks at runtime should be 16-byte aligned, and a
+ * CALL will therefore push a return address that leaves the stack
+ * misaligned.  Effectively we're wasting 8 bytes here to undo (!) the
+ * alignment that the upper level code already tried to do for us.  We
+ * should clean this up.
+ */
 struct x86_initial_frame {
 	/* zeroed return address for ABI */
 	uint64_t rip;