This function iterates over the thread's memory domain
and updates page tables based on it. We need to be holding
z_mem_domain_lock while this happens.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The boot code of x86_64 initializes the stack (if enabled)
with a hard-coded size for the ISR stack. However,
the stack being used does not have to be the ISR stack,
and can be any defined stacks. So pass in the actual size
of the stack so the stack can be initialized properly.
Fixes#21843
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
We no longer plan to support a split address space with
the kernel in high memory and per-process address spaces.
Because of this, we can simplify some things. System RAM
is now always identity mapped at boot.
We no longer require any virtual-to-physical translation
for page tables, and can remove the dual-mapping logic
from the page table generation script since we won't need
to transition the instruction point off of physical
addresses.
CONFIG_KERNEL_VM_BASE and CONFIG_KERNEL_VM_LIMIT
have been removed. The kernel's address space always
starts at CONFIG_SRAM_BASE_ADDRESS, of a fixed size
specified by CONFIG_KERNEL_VM_SIZE.
Driver MMIOs and other uses of k_mem_map() are still
virtually mapped, and the later introduction of demand
paging will result in only a subset of system RAM being
a fixed identity mapping instead of all of it.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
In order to be possible to debug usermode threads need to be able
issue breakpoint and debug exceptions. To do this it is necessary to
set DPL bits to, at least, the same CPL level.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
The x86 paging code has been rewritten to support another paging mode
and non-identity virtual mappings.
- Paging code now uses an array of paging level characteristics and
walks tables using for loops. This is opposed to having different
functions for every paging level and lots of #ifdefs. The code is
now more concise and adding new paging modes should be trivial.
- We now support 32-bit, PAE, and IA-32e page tables.
- The page tables created by gen_mmu.py are now installed at early
boot. There are no longer separate "flat" page tables. These tables
are mutable at any time.
- The x86_mmu code now has a private header. Many definitions that did
not need to be in public scope have been moved out of mmustructs.h
and either placed in the C file or in the private header.
- Improvements to dumping page table information, with the physical
mapping and flags all shown
- arch_mem_map() implemented
- x86 userspace/memory domain code ported to use the new
infrastructure.
- add logic for physical -> virtual instruction pointer transition,
including cleaning up identity mappings after this takes place.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
These stacks are appropriate for threads that run purely in
supervisor mode, and also as stacks for interrupt and exception
handling.
Two new arch defines are introduced:
- ARCH_KERNEL_STACK_GUARD_SIZE
- ARCH_KERNEL_STACK_OBJ_ALIGN
New public declaration macros:
- K_KERNEL_STACK_RESERVED
- K_KERNEL_STACK_EXTERN
- K_KERNEL_STACK_DEFINE
- K_KERNEL_STACK_ARRAY_DEFINE
- K_KERNEL_STACK_MEMBER
- K_KERNEL_STACK_SIZEOF
If user mode is not enabled, K_KERNEL_STACK_* and K_THREAD_STACK_*
are equivalent.
Separately generated privilege elevation stacks are now declared
like kernel stacks, removing the need for K_PRIVILEGE_STACK_ALIGN.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This commit renames the x86 Kconfig `CONFIG_{EAGER,LAZY}_FP_SHARING`
symbol to `CONFIG_{EAGER,LAZY}_FPU_SHARING`, in order to align with the
recent `CONFIG_FP_SHARING` to `CONFIG_FPU_SHARING` renaming.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
This operation is formally defined as rounding down a potential
stack pointer value to meet CPU and ABI requirments.
This was previously defined ad-hoc as STACK_ROUND_DOWN().
A new architecture constant ARCH_STACK_PTR_ALIGN is added.
Z_STACK_PTR_ALIGN() is defined in terms of it. This used to
be inconsistently specified as STACK_ALIGN or STACK_PTR_ALIGN;
in the latter case, STACK_ALIGN meant something else, typically
a required alignment for the base of a stack buffer.
STACK_ROUND_UP() only used in practice by Risc-V, delete
elsewhere.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The set of interrupt stacks is now expressed as an array. We
also define the idle threads and their associated stacks this
way. This allows for iteration in cases where we have multiple
CPUs.
There is now a centralized declaration in kernel_internal.h.
On uniprocessor systems, z_interrupt_stacks has one element
and can be used in the same way as _interrupt_stack.
The IRQ stack for CPU 0 is now set in init.c instead of in
arch code.
The extern definition of the main thread stack is now removed,
this doesn't need to be in a header.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Use of the _current_cpu pointer cannot be done safely in a preemptible
context. If a thread is preempted and migrates to another CPU, the
old CPU record will be wrong.
Add a validation assert to the expression that catches incorrect
usages, and fix up the spots where it was wrong (most important being
a few uses of _current outside of locks, and the arch_is_in_isr()
implementation).
Note that the resulting _current expression now requires locking and
is going to be somewhat slower. Longer term it's going to be better
to augment the arch API to allow SMP architectures to implement a
faster "get current thread pointer" action than this default.
Note also that this change means that "_current" is no longer
expressible as an lvalue (long ago, it was just a static variable), so
the places where it gets assigned now assign to _current_cpu->current
instead.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Implement a set of per-cpu trampoline stacks which all
interrupts and exceptions will initially land on, and also
as an intermediate stack for privilege changes as we need
some stack space to swap page tables.
Set up the special trampoline page which contains all the
trampoline stacks, TSS, and GDT. This page needs to be
present in the user page tables or interrupts don't work.
CPU exceptions, with KPTI turned on, are treated as interrupts
and not traps so that we have IRQs locked on exception entry.
Add some additional macros for defining IDT entries.
Add special handling of locore text/rodata sections when
creating user mode page tables on x86-64.
Restore qemu_x86_64 to use KPTI, and remove restrictions on
enabling user mode on x86-64.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- In early boot, enable the syscall instruction and set up
necessary MSRs
- Add a hook to update page tables on context switch
- Properly initialize thread based on whether it will
start in user or supervisor mode
- Add landing function for system calls to execute the
desired handler
- Implement arch_user_string_nlen()
- Implement logic for dropping a thread down to user mode
- Reserve per-CPU storage space for user and privilege
elevation stack pointers, necessary for handling syscalls
when no free registers are available
- Proper handling of gs register considerations when
transitioning privilege levels
Kernel page table isolation (KPTI) is not yet implemented.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We use a fixed value of 32 as the way interrupts/exceptions
are setup in x86_64's locore.S do not lend themselves to
Kconfig configuration of the vector to use.
HW-based kernel oops is now permanently on, there's no reason
to make it optional that I can see.
Default vectors for IPI and irq offload adjusted to not
collide.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
These are now common code, all are related to user mode
threads. The rat's nest of ifdefs in ia32's arch_new_thread
has been greatly simplified, there is now just one hook
if user mode is turned on.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
z_x86_thread_page_tables_get() now works for both user
and supervisor threads, returning the kernel page tables
in the latter case. This API has been up-leveled to
a common header.
The per-thread privilege elevation stack initial stack
pointer, and the per-thread page table locations are no
longer computed from other values, and instead are stored
in thread->arch.
A problem where the wrong page tables were dumped out
on certain kinds of page faults has been fixed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We don't need to set up GDT data descriptors for setting
%gs. Instead, we use the x86 MSRs to set GS_BASE and
KERNEL_GS_BASE.
We don't currently allow user mode to set %gs on its own,
but later on if we do, we have everything set up to issue
'swapgs' instructions on syscall or IRQ.
Unused entries in the GDT have been removed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We now dump more information for less common cases,
and this is now centralized code for 32-bit/64-bit.
All of this code is now correctly wrapped around
CONFIG_EXCEPTION_DEBUG. Some cruft and unused defines
removed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Promote the private z_arch_* namespace, which specifies
the interface between the core kernel and the
architecture code, to a new top-level namespace named
arch_*.
This allows our documentation generation to create
online documentation for this set of interfaces,
and this set of interfaces is worth treating in a
more formal way anyway.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Some code for unwinding stacks and z_x86_fatal_error()
now in a common C file, suitable for both modes.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This commit refactors kernel and arch headers to establish a boundary
between private and public interface headers.
The refactoring strategy used in this commit is detailed in the issue
This commit introduces the following major changes:
1. Establish a clear boundary between private and public headers by
removing "kernel/include" and "arch/*/include" from the global
include paths. Ideally, only kernel/ and arch/*/ source files should
reference the headers in these directories. If these headers must be
used by a component, these include paths shall be manually added to
the CMakeLists.txt file of the component. This is intended to
discourage applications from including private kernel and arch
headers either knowingly and unknowingly.
- kernel/include/ (PRIVATE)
This directory contains the private headers that provide private
kernel definitions which should not be visible outside the kernel
and arch source code. All public kernel definitions must be added
to an appropriate header located under include/.
- arch/*/include/ (PRIVATE)
This directory contains the private headers that provide private
architecture-specific definitions which should not be visible
outside the arch and kernel source code. All public architecture-
specific definitions must be added to an appropriate header located
under include/arch/*/.
- include/ AND include/sys/ (PUBLIC)
This directory contains the public headers that provide public
kernel definitions which can be referenced by both kernel and
application code.
- include/arch/*/ (PUBLIC)
This directory contains the public headers that provide public
architecture-specific definitions which can be referenced by both
kernel and application code.
2. Split arch_interface.h into "kernel-to-arch interface" and "public
arch interface" divisions.
- kernel/include/kernel_arch_interface.h
* provides private "kernel-to-arch interface" definition.
* includes arch/*/include/kernel_arch_func.h to ensure that the
interface function implementations are always available.
* includes sys/arch_interface.h so that public arch interface
definitions are automatically included when including this file.
- arch/*/include/kernel_arch_func.h
* provides architecture-specific "kernel-to-arch interface"
implementation.
* only the functions that will be used in kernel and arch source
files are defined here.
- include/sys/arch_interface.h
* provides "public arch interface" definition.
* includes include/arch/arch_inlines.h to ensure that the
architecture-specific public inline interface function
implementations are always available.
- include/arch/arch_inlines.h
* includes architecture-specific arch_inlines.h in
include/arch/*/arch_inline.h.
- include/arch/*/arch_inline.h
* provides architecture-specific "public arch interface" inline
function implementation.
* supersedes include/sys/arch_inline.h.
3. Refactor kernel and the existing architecture implementations.
- Remove circular dependency of kernel and arch headers. The
following general rules should be observed:
* Never include any private headers from public headers
* Never include kernel_internal.h in kernel_arch_data.h
* Always include kernel_arch_data.h from kernel_arch_func.h
* Never include kernel.h from kernel_struct.h either directly or
indirectly. Only add the kernel structures that must be referenced
from public arch headers in this file.
- Relocate syscall_handler.h to include/ so it can be used in the
public code. This is necessary because many user-mode public codes
reference the functions defined in this header.
- Relocate kernel_arch_thread.h to include/arch/*/thread.h. This is
necessary to provide architecture-specific thread definition for
'struct k_thread' in kernel.h.
- Remove any private header dependencies from public headers using
the following methods:
* If dependency is not required, simply omit
* If dependency is required,
- Relocate a portion of the required dependencies from the
private header to an appropriate public header OR
- Relocate the required private header to make it public.
This commit supersedes #20047, addresses #19666, and fixes#3056.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
There are two set of code supporting x86_64: x86_64 using x32 ABI,
and x86 long mode, and this consolidates both into one x86_64
architecture and SoC supporting truly 64-bit mode.
() Removes the x86_64:x32 architecture and SoC, and replaces
them with the existing x86 long mode arch and SoC.
() Replace qemu_x86_64 with qemu_x86_long as qemu_x86_64.
() Updates samples and tests to remove reference to
qemu_x86_long.
() Renames CONFIG_X86_LONGMODE to CONFIG_X86_64.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The page tables to use are now stored in the cpuboot struct.
For the first CPU, we set to the flat page tables, and then
update later in z_x86_prep_c() once the runtime tables have
been generated.
For other CPUs, by the time we get to z_arch_start_cpu()
the runtime tables are ready do go, and so we just install
them directly.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- Bring in CONFIG_X86_MMU and some related defines to
common X86 Kconfig
- Don't set ARCH_HAS_USERSPACE for intel64 yet when
X86_MMU is enabled
- Uplevel x86_mmu.c to common code
- Add logic for handling PML4 table and generating PDPTs
- move z_x86_paging_init() to common kernel_arch_func.h
- Uplevel inclusion of mmustructs.h to common x86 arch.h,
both need it for memory domain defines
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Duplicate definitions elsewhere have been removed.
A couple functions which are defined by the arch interface
to be non-inline, but were implemented inline by native_posix
and intel64, have been moved to non-inline.
Some missing conditional compilation for z_arch_irq_offload()
has been fixed, as this is an optional feature.
Some massaging of native_posix headers to get everything
in the right scope.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The PDPT was moved to the stack area since it has alignment
requirements, but never removed from here.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This will be used for both 32-bit and 64-bit mode.
This header gets pulled in by x86's arch/cpu.h, so put
it in include/arch/x86/.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
include/sys/arch_inlines.h will contain all architecture APIs
that are used by public inline functions and macros,
with implementations deriving from include/arch/cpu.h.
kernel/include/arch_interface.h will contain everything
else, with implementations deriving from
arch/*/include/kernel_arch_func.h.
Instances of duplicate documentation for these APIs have been
removed; implementation details have been left in place.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
It's possible to have multiple processors configured without using the
SMP scheduler, so don't make definitions dependent on CONFIG_SMP.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Add duplicate per-CPU data structures (x86_cpuboot, tss, stacks, etc.)
for up to 4 total CPUs, add code in locore and z_arch_start_cpu().
The test board, qemu_x86_long, now defaults to 2 CPUs.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
A new 'struct x86_cpuboot' is created as well as an instance called
'x86_cpuboot[]' which contains per-CPU boot data (initial stack,
entry function/arg, selectors, etc.). The locore now consults this
table to set up per-CPU registers, etc. during early boot.
Also, rename tss.c to cpu.c as its scope is growing.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
There's no need to qualify the 64-bit CS/DS selectors, and the GS and
TR selectors are renamed CPU0_GS and CPU0_TR as they are CPU-specific.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The linker script was missing symbols that defined the boundaries
of kernel memory segments (_image_rom_end, etc.). These are added
so that core/memmap.c can properly account for those segments.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
And set qemu_x86_long board to build with CONFIG_SMP=y by default.
Apparently two benchmark tests - latency_measure and sys_kernel -
do not work with the SMP scheduler, so those tests are disabled.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This patch is a preparatory step in enabling the MMU in
long mode; no steps are taken to implement long mode support.
We introduce struct x86_page_tables, which represents the
top-level data structure for page tables:
- For 32-bit, this will contain a four-entry page directory
pointer table (PDPT)
- For 64-bit, this will (eventually) contain a page map level 4
table (PML4)
In either case, this pointer value is what gets programmed into
CR3 to activate a set of page tables. There are extra bits in
CR3 to set for long mode, we'll get around to that later.
This abstraction will allow us to use the same APIs that work
with page tables in either mode, rather than hard-coding that
the top level data structure is a PDPT.
z_x86_mmu_validate() has been re-written to make it easier to
add another level of paging for long mode, to support 2MB
PDPT entries, and correctly validate regions which span PDPTE
entries.
Some MMU-related APIs moved out of 32-bit x86's arch.h into
mmustructs.h.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is part of the core kernel -> architecture interface and
has been renamed z_arch_kernel_init().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
z_set_thread_return_value is part of the core kernel -> arch
interface and has been renamed to z_arch_thread_return_value_set.
z_set_thread_return_value_with_data renamed to
z_thread_return_value_set_with_data for consistency.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>