The set of interrupt stacks is now expressed as an array. We
also define the idle threads and their associated stacks this
way. This allows for iteration in cases where we have multiple
CPUs.
There is now a centralized declaration in kernel_internal.h.
On uniprocessor systems, z_interrupt_stacks has one element
and can be used in the same way as _interrupt_stack.
The IRQ stack for CPU 0 is now set in init.c instead of in
arch code.
The extern definition of the main thread stack is now removed,
this doesn't need to be in a header.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Change to code to use the automatically generated DT_INST_*
defines and remove the now unneeded configs and fixups.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
If IO APIC is in logical destination mode, local APICs compare their
logical APIC ID defined in LDR (Logical Destination Register) with
the destination code sent with the interrupt to determine whether or not
to accept the incoming interrupt.
This patch programs LDR in xAPIC mode to support IO APIC logical mode.
The local APIC ID from local APIC ID register can't be used as the
'logical APIC ID' because LAPIC ID may not be consecutive numbers hence
it makes it impossible for LDR to encode 8 IDs within 8 bits.
This patch chooses 0 for BSP, and for APs, cpu_number which is the index
to x86_cpuboot[], which ultimately assigned in z_smp_init[].
Signed-off-by: Zide Chen <zide.chen@intel.com>
The callee-saved registers have been separated out and will not
be saved/restored if exception debugging is shut off.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The context switch implementation forgot to save the current flag
state of the old thread, so on resume the flags would be restored to
whatever value they had at the last interrupt preemption or thread
initialization. In practice this guaranteed that the interrupt enable
bit would always be wrong, becuase obviously new threads and preempted
ones have interrupts enabled, while arch_switch() is always called
with them masked. This opened up a race between exit from
arch_switch() and the final exit path in z_swap().
The other state bits weren't relevant -- the oddball ones aren't used
by Zephyr, and as arch_switch() on this architecture is a function
call the compiler would have spilled the (caller-save) comparison
result flags anyway.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Use of the _current_cpu pointer cannot be done safely in a preemptible
context. If a thread is preempted and migrates to another CPU, the
old CPU record will be wrong.
Add a validation assert to the expression that catches incorrect
usages, and fix up the spots where it was wrong (most important being
a few uses of _current outside of locks, and the arch_is_in_isr()
implementation).
Note that the resulting _current expression now requires locking and
is going to be somewhat slower. Longer term it's going to be better
to augment the arch API to allow SMP architectures to implement a
faster "get current thread pointer" action than this default.
Note also that this change means that "_current" is no longer
expressible as an lvalue (long ago, it was just a static variable), so
the places where it gets assigned now assign to _current_cpu->current
instead.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add TRACING_ISR Kconfig to help high latency backend working well.
Currently the ISR tracing hook function is put at the begining and
ending of ISR wrapper, when there is ISR needed in the tracing path
(especially tracing backend), it will cause tracing buffer easily
be exhausted if async tracing method enabled. Also it will increase
system latency if all the ISRs are traced. So add TRACING_ISR to
enable/disable ISR tracing here. Later a filter out mechanism based
on irq number will be added.
Signed-off-by: Wentong Wu <wentong.wu@intel.com>
There was a bug where double-dispatch of a single thread on multiple
SMP CPUs was possible. This can be mind-bending to diagnose, so when
CONFIG_ASSERT is enabled add an extra instruction to __resume (the
shared code path for both interupt return and context switch) that
poisons the shared RIP of the now-running thread with a recognizable
invalid value.
Now attempts to run the thread again will crash instantly with a
discoverable cookie in their instruction pointer, and this will remain
true until it gets a new RIP at the next interrupt or switch.
This is under CONFIG_ASSERT because it meets the same design goals of
"a cheap test for impossible situations", not because it's part of the
assertion framework.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The original intent was that the output handle be written through the
pointer in the second argument, though not all architectures used that
scheme. As it turns out, that write is becoming a synchronization
signal, so it's no longer optional.
Clarify the documentation in arch_switch() about this requirement, and
add an instruction to the x86_64 context switch to implement it as
original envisioned.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Implement a set of per-cpu trampoline stacks which all
interrupts and exceptions will initially land on, and also
as an intermediate stack for privilege changes as we need
some stack space to swap page tables.
Set up the special trampoline page which contains all the
trampoline stacks, TSS, and GDT. This page needs to be
present in the user page tables or interrupts don't work.
CPU exceptions, with KPTI turned on, are treated as interrupts
and not traps so that we have IRQs locked on exception entry.
Add some additional macros for defining IDT entries.
Add special handling of locore text/rodata sections when
creating user mode page tables on x86-64.
Restore qemu_x86_64 to use KPTI, and remove restrictions on
enabling user mode on x86-64.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
generated_dts_board.h is pretty redundant and confusing as a name. Call
it devicetree.h instead.
dts.h would be another option, but DTS stands for "devicetree source"
and is the source code format, so it's a bit confusing too.
The replacement was done by grepping for 'generated_dts_board' and
'GENERATED_DTS_BOARD'.
Two build diagram and input-output SVG files were updated as well, along
with misc. documentation.
hal_ti, mcuboot, and ci-tools updates are included too, in the west.yml
update.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
KPTI is still work-in-progress on x86_64. Don't allow
user mode to be enabled unless the SOC/board configuration
indicates that the CPU in use is invulnerable to meltdown
attacks.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
See CVE-2019-1125. We mitigate this by adding an 'lfence'
upon interrupt/exception entry after the decision has been
made whether it's necessary to invoke 'swapgs' or not.
Only applies to x86_64, 32-bit doesn't use swapgs.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- In early boot, enable the syscall instruction and set up
necessary MSRs
- Add a hook to update page tables on context switch
- Properly initialize thread based on whether it will
start in user or supervisor mode
- Add landing function for system calls to execute the
desired handler
- Implement arch_user_string_nlen()
- Implement logic for dropping a thread down to user mode
- Reserve per-CPU storage space for user and privilege
elevation stack pointers, necessary for handling syscalls
when no free registers are available
- Proper handling of gs register considerations when
transitioning privilege levels
Kernel page table isolation (KPTI) is not yet implemented.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This code:
1) Doesn't work
2) Hasn't ever been enabled by default
3) We mitigate Spectre V2 via Extended IBRS anyway
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We use a fixed value of 32 as the way interrupts/exceptions
are setup in x86_64's locore.S do not lend themselves to
Kconfig configuration of the vector to use.
HW-based kernel oops is now permanently on, there's no reason
to make it optional that I can see.
Default vectors for IPI and irq offload adjusted to not
collide.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is causing problems, as if we create a thread in
a system call we will *not* be using the kernel page
tables if CONFIG_KPTI=n.
Just don't fiddle with this page's permissions; we don't
need it as a guard area anyway since we have a stack
guard placed immediately before it, and this page
is unused if user mode isn't active.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Nothing too fancy here, we try as much as possible to
use the same register layout as the C calling convention.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
These are now common code, all are related to user mode
threads. The rat's nest of ifdefs in ia32's arch_new_thread
has been greatly simplified, there is now just one hook
if user mode is turned on.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
z_x86_thread_page_tables_get() now works for both user
and supervisor threads, returning the kernel page tables
in the latter case. This API has been up-leveled to
a common header.
The per-thread privilege elevation stack initial stack
pointer, and the per-thread page table locations are no
longer computed from other values, and instead are stored
in thread->arch.
A problem where the wrong page tables were dumped out
on certain kinds of page faults has been fixed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Add two new non-static APIs for dumping out the
page table entries for a specified memory address,
and move to the main MMU code. Has debugging uses
when trying to figure out why memory domains are not
set up correctly.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We don't need to set up GDT data descriptors for setting
%gs. Instead, we use the x86 MSRs to set GS_BASE and
KERNEL_GS_BASE.
We don't currently allow user mode to set %gs on its own,
but later on if we do, we have everything set up to issue
'swapgs' instructions on syscall or IRQ.
Unused entries in the GDT have been removed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
These were previously assumed to always be fatal.
We can't have the faulting thread's XMM registers
clobbered, so put the SIMD/FPU state onto the stack
as well. This is fairly large (512 bytes) and the
execption stack is already uncomfortably small, so
increase to 2K.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Same deal as in commit 41713244b3 ("kconfig: Remove '# Hidden' comments
on promptless symbols"). I forgot to do a case-insensitive search.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
Runtime stack traces (at least as currently implemented)
don't work on x86_64 normally as RBP is treated as a general-
purpose register. Depend on CONFIG_NO_OPTIMIZATIONS to enable
this on 64-bit.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
qemu_x86_64 will exit the emulator on a fatal system error,
like qemu_x86 already does.
Improves CI times when tests fail since sanitycheck will not
need to wait for the timeout to expire.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We now dump more information for less common cases,
and this is now centralized code for 32-bit/64-bit.
All of this code is now correctly wrapped around
CONFIG_EXCEPTION_DEBUG. Some cruft and unused defines
removed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We need a size_t and not a u32_t for partition sizes,
for 64-bit compatibility.
Additionally, app_memdomain.h was also casting the base
address to a u32_t instead of a uintptr_t.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is causing problems, as if we create a thread in
a system call we will *not* be using the kernel page
tables if CONFIG_KPTI=n, resulting in a crash when
the later call to copy_page_tables() tries to initialize
the PDPT (which is in the same page as the privilege
stack).
Just don't fiddle with this page's permissions; we don't
need it as a guard area anyway since we have a stack
guard placed immediately before it, and this page
is unused if user mode isn't active.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Bool symbols implicitly default to 'n'.
A 'default n' can make sense e.g. in a Kconfig.defconfig file, if you
want to override a 'default y' on the base definition of the symbol. It
isn't used like that on any of these symbols though.
Also replace some
config
prompt "foo"
bool/int
with the more common shorthand
config
bool/int "foo"
See the 'Style recommendations and shorthands' section in
https://docs.zephyrproject.org/latest/guides/kconfig/index.html.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
The races are believed to be resolved with the patch to
irq_offload(). Allow the MMU to be turned on and enable
it for qemu_x86_64.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This commit inlines the direct ISR functions that were previously
implemented in irq_manage.c, since the PR #20119 resolved the circular
dependency between arch.h and kernel_structs.h described in the issue
#3056.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
Promote the private z_arch_* namespace, which specifies
the interface between the core kernel and the
architecture code, to a new top-level namespace named
arch_*.
This allows our documentation generation to create
online documentation for this set of interfaces,
and this set of interfaces is worth treating in a
more formal way anyway.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Some code for unwinding stacks and z_x86_fatal_error()
now in a common C file, suitable for both modes.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>