In PAE boot tables the __mmu_tables_start points to page directory
pointer (PDPT). Enable the PAE by updating the CR4.PAE and
IA32_EFER.NXE bits.
JIRA:ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Created structures and unions needed to enable the software to
access these tables.
Also updated the helper macros to ease the usage of the MMU page
tables.
JIRA: ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
If CONFIG_X86_PAE_MODE is enabled for the build, then gen_mmu.py
would generate the boot time page tables in PAE format.
This supports 3 level paging i.e Page Directory Pointer(PDPT), Page
Directory(PD) and Page Table(PT). Each Page Table Entry(PTE) maps to
a 4KB region. Each Page Directory Entry(PDE) maps a 2MB region.
Each Page Directory Pointer Entry(PDPTE) maps to a 1GB region.
JIRA: ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Page Address Extension(PAE) page tables would be used
if this option is enabled.
JIRA:ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
During swap the required page tables are configured. The outgoing
thread's memory domain pages are reset and the incoming thread's
memory domain is loaded. The pages are configured if userspace
is enabled and if memory domain has been initialized before
calling swap.
GH-3852
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Currently this is defined as a k_thread_stack_t pointer.
However this isn't correct, stacks are defined as arrays. Extern
references to k_thread_stack_t doesn't work properly as the compiler
treats it as a pointer to the stack array and not the array itself.
Declaring as an unsized array of k_thread_stack_t doesn't work
well either. The least amount of confusion is to leave out the
pointer/array status completely, use pointers for function prototypes,
and define K_THREAD_STACK_EXTERN() to properly create an extern
reference.
The definitions for all functions and struct that use
k_thread_stack_t need to be updated, but code that uses them should
be unchanged.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Looking up the PTE flags was page faulting if the address wasn't
marked as present in the page directory, since there is no page table
for that directory entry.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
At very low optimization levels, the call to
K_THREAD_STACK_BUFFER doesn't get inlined, overflowing the
tiny stack.
Replace with _ARCH_THREAD_STACK_BUFFER() which on x86 is
just a macro.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
For 'rep stosl' ECX isn't a size value, it's how many times to repeat
the 4-byte string copy operation.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Some our Zephyr tools don't like seeing UTF-8 characters, as reported in
issue #4131) so a quick scan and replace for UTF-8 characters in .rst,
.h, and Kconfig files using "file --mime-encoding" (excluding the /ext
folders) finds these files to tweak.
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
- syscall.h now contains those APIs needed to support invoking calls
from user code. Some stuff moved out of main kernel.h.
- syscall_handler.h now contains directives useful for implementing
system call handler functions. This header is not pulled in by
kernel.h and is intended to be used by C files implementing kernel
system calls and driver subsystem APIs.
- syscall_list.h now contains the #defines for system call IDs. This
list is expected to grow quite large so it is put in its own header.
This is now an enumerated type instead of defines to make things
easier as we introduce system calls over the new few months. In the
fullness of time when we desire to have a fixed userspace/kernel ABI,
this can always be converted to defines.
Some new code added:
- _SYSCALL_MEMORY() macro added to check memory regions passed up from
userspace in handler functions
- _syscall_invoke{7...10}() inline functions declare for invoking system
calls with more than 6 arguments. 10 was chosen as the limit as that
corresponds to the largest arg list we currently have
which is for k_thread_create()
Other changes
- auto-generated K_SYSCALL_DECLARE* macros documented
- _k_syscall_table in userspace.c is not a placeholder. There's no
strong need to generate it and doing so would require the introduction
of a third build phase.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
A quick look at "man syscall" shows that in Linux, all architectures
support at least 6 argument system calls, with a few supporting 7. We
can at least do 6 in Zephyr.
x86 port modified to use EBP register to carry the 6th system call
argument.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
* Instead of a common system call entry function, we instead create a
table mapping system call ids to handler skeleton functions which are
invoked directly by the architecture code which receives the system
call.
* system call handler prototype specified. All but the most trivial
system calls will implement one of these. They validate all the
arguments, including verifying kernel/device object pointers, ensuring
that the calling thread has appropriate access to any memory buffers
passed in, and performing other parameter checks that the base system
call implementation does not check, or only checks with __ASSERT().
It's only possible to install a system call implementation directly
inside this table if the implementation has a return value and requires
no validation of any of its arguments.
A sample handler implementation for k_mutex_unlock() might look like:
u32_t _syscall_k_mutex_unlock(u32_t mutex_arg, u32_t arg2, u32_t arg3,
u32_t arg4, u32_t arg5, void *ssf)
{
struct k_mutex *mutex = (struct k_mutex *)mutex_arg;
_SYSCALL_ARG1;
_SYSCALL_IS_OBJ(mutex, K_OBJ_MUTEX, 0, ssf);
_SYSCALL_VERIFY(mutex->lock_count > 0, ssf);
_SYSCALL_VERIFY(mutex->owner == _current, ssf);
k_mutex_unlock(mutex);
return 0;
}
* the x86 port modified to work with the system call table instead of
calling a common handler function. fixed an issue where registers being
changed could confuse the compiler has been fixed; all registers, even
ones used for parameters, must be preserved across the system call.
* a new arch API for producing a kernel oops when validating system call
arguments added. The debug information reported will be from the system
call site and not inside the handler function.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- _arch_user_mode_enter() implemented
- _arch_is_user_context() implemented
- _new_thread() will honor K_USER option if passed in
- System call triggering macros implemented
- _thread_entry_wrapper moved and now looks for the next function to
call in EDI
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- There's no point in building up "validity" (declared volatile for some
strange reason), just exit with false return value if any of the page
directory or page table checks don't come out as expected
- The function was returning the opposite value as its documentation
(0 on success, -EPERM on failure). Documentation updated.
- This function will only be used to verify buffers from user-space.
There's no need for a flags parameter, the only option that needs to
be passed in is whether the buffer has write permissions or not.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We shouldn't be imposing any policy here, we do not yet use these in
Zephyr. Zero these at boot and otherwise leave alone.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
In various places, a private _thread_entry_t, or the full prototype
were being used. Be consistent and use the same typedef everywhere.
Signen-off-by: Andrew Boie <andrew.p.boie@intel.com>
Helper macros to ease the usage of the MMU page table structures.
Added Macros to get Page table address and Page Table Entry
values.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Most x86 exceptions that don't already have their own handlers
are fairly rare, but with the introduction of userspace
people will be seeing General Protection Faults much more
often. Report it as text so that users unfamiliar with x86
internals will know what is happening.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Previously, this was only done if an essential thread self-exited,
and was a runtime check that generated a kernel panic.
Now if any thread has k_thread_abort() called on it, and that thread
is essential to the system operation, this check is made. It is now
an assertion.
_NANO_ERR_INVALID_TASK_EXIT checks and printouts removed since this
is now an assertion.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Years of iterative development had made this function more complicated
than it needed to be. Fixed some errors in the documentation as well.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The API/Variable names in timing_info looks very speicific to
platform (like systick etc), whereas these variabled are used
across platforms (nrf/arm/quark).
So this patch :-
1. changing API/Variable names to generic one.
2. Creating some of Macros whose implimentation is platform
depenent.
Jira: ZEP-2314
Signed-off-by: Youvedeep Singh <youvedeep.singh@intel.com>
The value of the PTE (starting_pte_num) was not
calulated correctly. If size of the buffer exceeded 4KB,
the buffer validation API was failing.
JIRA: ZEP-2489
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
The API name space for Bluetooth is bt_* and BT_* so it makes sense to
align the Kconfig name space with this. The additional benefit is that
this also makes the names shorter. It is also in line with what Linux
uses for Bluetooth Kconfig entries.
Some Bluetooth-related Networking Kconfig defines are renamed as well
in order to be consistent, such as NET_L2_BLUETOOTH.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
As luck would have it, the TSS for the main IA task has
all the information we need, populate an exception stack
frame with it.
The double-fault handler just stashes data and makes the main
hardware thread runnable again, and processing of the
exception continues from there.
We check the first byte before the faulting ESP value to see
if the stack pointer had run up to a non-present page, a sign
that this is a stack overflow and not a double fault for
some other reason.
Stack overflows in kernel mode are now recoverable for non-
essential threads, with the caveat that we hope we weren't in
a critical section updating kernel data structures when it
happened.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Configuring the RAM/ROM regions will be the same for all
x86 targets as this is done with linker symbols.
Peripheral configuration left at the SOC level.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The CPU first checks the page directory entry for write
or user permissions on a particular page before looking
at the page table entry.
If a region configured all pages to be non user accessible,
and this was changed for a page within it to be accessible,
the PDE would not be updated and any access would still
return a page fault.
The least amount of runtime logic to deal with this is to
indicate at build time that some pages within a region may
be marked writable or user accessible at runtime, and to
pre-set the flags in the page directory entry accordingly.
The driving need for this is the region configuration for
kernel memory, which will have user permissions set at
runtime for stacks and user-configured memory domains.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Page faults will additionally dump out some interesting
page directory and page table flags for the faulting
memory address.
Intended to help determine whether the page tables have been
configured incorrectly as we enable memory protection features.
This only happens if CONFIG_EXCEPTION_DEBUG is turned on.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Historically, stacks were just character buffers and could be treated
as such if the user wanted to look inside the stack data, and also
declared as an array of the desired stack size.
This is no longer the case. Certain architectures will create a memory
region much larger to account for MPU/MMU guard pages. Unfortunately,
the kernel interfaces treat both the declared stack, and the valid
stack buffer within it as the same char * data type, even though these
absolutely cannot be used interchangeably.
We introduce an opaque k_thread_stack_t which gets instantiated by
K_THREAD_STACK_DECLARE(), this is no longer treated by the compiler
as a character pointer, even though it really is.
To access the real stack buffer within, the result of
K_THREAD_STACK_BUFFER() can be used, which will return a char * type.
This should catch a bunch of programming mistakes at build time:
- Declaring a character array outside of K_THREAD_STACK_DECLARE() and
passing it to K_THREAD_CREATE
- Directly examining the stack created by K_THREAD_STACK_DECLARE()
which is not actually the memory desired and may trigger a CPU
exception
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This will trigger a page fault if the guard area
is written to. Since the exception itself will try
to write to the memory, a double fault will be triggered
and we will do an IA task switch to the df_tss and panic.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Subsequent patches will set this guard page as unmapped,
triggering a page fault on access. If this is due to
stack overflow, a double fault will be triggered,
which we are now capable of handling with a switch to
a know good stack.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We now create a special IA hardware task for handling
double faults. This has a known good stack so that if
the kernel tries to push stack data onto an unmapped page,
we don't triple-fault and reset the system.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We will need this for stack memory protection scenarios
where a writable GDT with Task State Segment descriptors
will be used. The addresses of the TSS segments cannot be
put in the GDT via preprocessor magic due to architecture
requirments that the address be split up into different
fields in the segment descriptor.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This has one use-case: configuring the double-fault #DF
exception handler to do an IA task switch to a special
IA task with a known good stack, such that we can dump
diagnostic information and then panic.
Will be used for stack overflow detection in kernel mode,
as otherwise the CPU will triple-fault and reset.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>