A fix for this issue is in progress, meanwhile warn the user that
they may be susceptible to this problem if they enable user mode on
an x86-based target that is not known to be immune.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
when a current thread is added to a memory domain the pages/sections
must be configured immediately.
A problem occurs when we add a thread to current and then drop
down to usermode. In such a case memory domain will become active
the next time a swap occurs.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Add an architecure specfic code for the memory domain
configuration. This is needed to support a memory domain API
k_mem_domain_add_thread.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Save the required scratch pad register (in this case only edx)
before calling the C function.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
When CONFIG_X86_MMU is enabled for arduino 101 the start address
should be aligned to 4kB. If not aligned the page tables would not
be created and the build fails.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Introducing CMake is an important step in a larger effort to make
Zephyr easy to use for application developers working on different
platforms with different development environment needs.
Simplified, this change retains Kconfig as-is, and replaces all
Makefiles with CMakeLists.txt. The DSL-like Make language that KBuild
offers is replaced by a set of CMake extentions. These extentions have
either provided simple one-to-one translations of KBuild features or
introduced new concepts that replace KBuild concepts.
This is a breaking change for existing test infrastructure and build
scripts that are maintained out-of-tree. But for FW itself, no porting
should be necessary.
For users that just want to continue their work with minimal
disruption the following should suffice:
Install CMake 3.8.2+
Port any out-of-tree Makefiles to CMake.
Learn the absolute minimum about the new command line interface:
$ cd samples/hello_world
$ mkdir build && cd build
$ cmake -DBOARD=nrf52_pca10040 ..
$ cd build
$ make
PR: zephyrproject-rtos#4692
docs: http://docs.zephyrproject.org/getting_started/getting_started.html
Signed-off-by: Sebastian Boe <sebastian.boe@nordicsemi.no>
During swap the required page tables are configured. The outgoing
thread's memory domain pages are reset and the incoming thread's
memory domain is loaded. The pages are configured if userspace
is enabled and if memory domain has been initialized before
calling swap.
GH-3852
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
This is intended for memory-constrained systems and will save
4K per thread, since we will no longer reserve room for or
activate a kernel stack guard page.
If CONFIG_USERSPACE is enabled, stack overflows will still be
caught in some situations:
1) User mode threads overflowing stack, since it crashes into the
kernel stack page
2) Supervisor mode threads overflowing stack, since the kernel
stack page is marked non-present for non-user threads
Stack overflows will not be caught:
1) When handling a system call
2) When the interrupt stack overflows
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Besides the fact that we did not have that for the current supported
boards, that makes sense for this new, virtualized mode, that is meant
to be run on top of full-fledged x86 64 CPUs.
By having xAPIC mode access only, Jailhouse has to intercept those MMIO
reads and writes, in order to examine what they do and arbitrate if it's
safe or not (e.g. not all values are accepted to ICR register). This
means that we can't run away from having a VM-exit event for each and
every access to APIC memory region and this impacts the latency the
guest OS observes over bare metal a lot.
When in x2APIC mode, Jailhouse does not require VM-exits for MSR
accesses other that writes to the ICR register, so the latency the guest
observes is reduced to almost zero.
Here are some outputs of the the command line
$ sudo ./tools/jailhouse cell stats tiny-demo
on a Jailhouse's root cell console, for one of the Zephyr demos using
LOAPIC timers, left for a couple of seconds:
Statistics for tiny-demo cell (x2APIC root, x2APIC inmate)
COUNTER SUM PER SEC
vmexits_total 7 0
vmexits_management 3 0
vmexits_cr 2 0
vmexits_cpuid 1 0
vmexits_msr 1 0
vmexits_exception 0 0
vmexits_hypercall 0 0
vmexits_mmio 0 0
vmexits_pio 0 0
vmexits_xapic 0 0
vmexits_xsetbv 0 0
Statistics for tiny-demo cell (xAPIC root, xAPIC inmate)
COUNTER SUM PER SEC
vmexits_total 4087 40
vmexits_xapic 4080 40
vmexits_management 3 0
vmexits_cr 2 0
vmexits_cpuid 1 0
vmexits_msr 1 0
vmexits_exception 0 0
vmexits_hypercall 0 0
vmexits_mmio 0 0
vmexits_pio 0 0
vmexits_xsetbv 0 0
Statistics for tiny-demo cell (xAPIC root, x2APIC inmate)
COUNTER SUM PER SEC
vmexits_total 4087 40
vmexits_msr 4080 40
vmexits_management 3 0
vmexits_cr 2 0
vmexits_cpuid 1 0
vmexits_exception 0 0
vmexits_hypercall 0 0
vmexits_mmio 0 0
vmexits_pio 0 0
vmexits_xapic 0 0
vmexits_xsetbv 0 0
See that under x2APIC mode on both Jailhouse/root-cell and guest, the
interruptions from the hypervisor are minimal. That is not the case when
Jailhouse is on xAPIC mode, though. Note also that, as a plus, x2APIC
accesses on the guest will map to xAPIC MMIO on the hypervisor just
fine.
Signed-off-by: Gustavo Lima Chaves <gustavo.lima.chaves@intel.com>
This is an introductory port for Zephyr to be run as a Jailhouse
hypervisor[1]'s "inmate cell", on x86 64-bit CPUs (running on 32-bit
mode). This was tested with their "tiny-demo" inmate demo cell
configuration, which takes one of the CPUs of the QEMU-VM root cell
config, along with some RAM and serial controller access (it will even
do nice things like reserving some L3 cache for it via Intel CAT) and
Zephyr samples:
- hello_world
- philosophers
- synchronization
The final binary receives an additional boot sequence preamble that
conforms to Jailhouse's expectations (starts at 0x0 in real mode). It
will put the processor in 32-bit protected mode and then proceed to
Zephyr's __start function.
Testing it is just a matter of:
$ mmake -C samples/<sample_dir> BOARD=x86_jailhouse JAILHOUSE_QEMU_IMG_FILE=<path_to_image.qcow2> run
$ sudo insmod <path to jailhouse.ko>
$ sudo jailhouse enable <path to configs/qemu-x86.cell>
$ sudo jailhouse cell create <path to configs/tiny-demo.cell>
$ sudo mount -t 9p -o trans/virtio host /mnt
$ sudo jailhouse cell load tiny-demo /mnt/zephyr.bin
$ sudo jailhouse cell start tiny-demo
$ sudo jailhouse cell destroy tiny-demo
$ sudo jailhouse disable
$ sudo rmmod jailhouse
For the hello_world demo case, one should then get QEMU's serial port
output similar to:
"""
Created cell "tiny-demo"
Page pool usage after cell creation: mem 275/1480, remap 65607/131072
Cell "tiny-demo" can be loaded
CPU 3 received SIPI, vector 100
Started cell "tiny-demo"
***** BOOTING ZEPHYR OS v1.9.0 - BUILD: Sep 12 2017 20:03:22 *****
Hello World! x86
"""
Note that the Jailhouse's root cell *has to be started in xAPIC
mode* (kernel command line argument 'nox2apic') in order for this to
work. x2APIC support and its reasoning will come on a separate commit.
As a reminder, the make run target introduced for x86_jailhouse board
involves a root cell image with Jailhouse in it, to be launched and then
partitioned (with >= 2 64-bit CPUs in it).
Inmate cell configs with no JAILHOUSE_CELL_PASSIVE_COMMREG flag
set (e.g. apic-demo one) would need extra code in Zephyr to deal with
cell shutdown command responses from the hypervisor.
You may want to fine tune CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC for your
specific CPU—there is no detection from Zephyr with regard to that.
Other config differences from pristine QEMU defaults worth of mention
are:
- there is no HPET when running as Jailhouse guest. We use the LOAPIC
timer, instead
- there is no PIC_DISABLE, because there is no 8259A PIC when running
as a Jailhouse guest
- XIP makes no sense also when running as Jailhouse guest, and both
PHYS_RAM_ADDR/PHYS_LOAD_ADD are set to zero, what tiny-demo cell
config is set to
This opens up new possibilities for Zephyr, so that usages beyond just
MCUs come to the table. I see special demand coming from
functional-safety related use cases on industry, automotive, etc.
[1] https://github.com/siemens/jailhouse
Reference to Jailhouse's booting preamble code:
Origin: Jailhouse
License: BSD 2-Clause
URL: https://github.com/siemens/jailhouse
commit: 607251b44397666a3cbbf859d784dccf20aba016
Purpose: Dual-licensing of inmate lib code
Maintained-by: Zephyr
Signed-off-by: Gustavo Lima Chaves <gustavo.lima.chaves@intel.com>
Some "random" drivers are not drivers at all: they just implement the
function `sys_rand32_get()`. Move those to a random subsystem in
preparation for a reorganization.
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
In PAE boot tables the __mmu_tables_start points to page directory
pointer (PDPT). Enable the PAE by updating the CR4.PAE and
IA32_EFER.NXE bits.
JIRA:ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Created structures and unions needed to enable the software to
access these tables.
Also updated the helper macros to ease the usage of the MMU page
tables.
JIRA: ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
If CONFIG_X86_PAE_MODE is enabled for the build, then gen_mmu.py
would generate the boot time page tables in PAE format.
This supports 3 level paging i.e Page Directory Pointer(PDPT), Page
Directory(PD) and Page Table(PT). Each Page Table Entry(PTE) maps to
a 4KB region. Each Page Directory Entry(PDE) maps a 2MB region.
Each Page Directory Pointer Entry(PDPTE) maps to a 1GB region.
JIRA: ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Page Address Extension(PAE) page tables would be used
if this option is enabled.
JIRA:ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
During swap the required page tables are configured. The outgoing
thread's memory domain pages are reset and the incoming thread's
memory domain is loaded. The pages are configured if userspace
is enabled and if memory domain has been initialized before
calling swap.
GH-3852
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Currently this is defined as a k_thread_stack_t pointer.
However this isn't correct, stacks are defined as arrays. Extern
references to k_thread_stack_t doesn't work properly as the compiler
treats it as a pointer to the stack array and not the array itself.
Declaring as an unsized array of k_thread_stack_t doesn't work
well either. The least amount of confusion is to leave out the
pointer/array status completely, use pointers for function prototypes,
and define K_THREAD_STACK_EXTERN() to properly create an extern
reference.
The definitions for all functions and struct that use
k_thread_stack_t need to be updated, but code that uses them should
be unchanged.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Looking up the PTE flags was page faulting if the address wasn't
marked as present in the page directory, since there is no page table
for that directory entry.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
At very low optimization levels, the call to
K_THREAD_STACK_BUFFER doesn't get inlined, overflowing the
tiny stack.
Replace with _ARCH_THREAD_STACK_BUFFER() which on x86 is
just a macro.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
For 'rep stosl' ECX isn't a size value, it's how many times to repeat
the 4-byte string copy operation.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Some our Zephyr tools don't like seeing UTF-8 characters, as reported in
issue #4131) so a quick scan and replace for UTF-8 characters in .rst,
.h, and Kconfig files using "file --mime-encoding" (excluding the /ext
folders) finds these files to tweak.
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
- syscall.h now contains those APIs needed to support invoking calls
from user code. Some stuff moved out of main kernel.h.
- syscall_handler.h now contains directives useful for implementing
system call handler functions. This header is not pulled in by
kernel.h and is intended to be used by C files implementing kernel
system calls and driver subsystem APIs.
- syscall_list.h now contains the #defines for system call IDs. This
list is expected to grow quite large so it is put in its own header.
This is now an enumerated type instead of defines to make things
easier as we introduce system calls over the new few months. In the
fullness of time when we desire to have a fixed userspace/kernel ABI,
this can always be converted to defines.
Some new code added:
- _SYSCALL_MEMORY() macro added to check memory regions passed up from
userspace in handler functions
- _syscall_invoke{7...10}() inline functions declare for invoking system
calls with more than 6 arguments. 10 was chosen as the limit as that
corresponds to the largest arg list we currently have
which is for k_thread_create()
Other changes
- auto-generated K_SYSCALL_DECLARE* macros documented
- _k_syscall_table in userspace.c is not a placeholder. There's no
strong need to generate it and doing so would require the introduction
of a third build phase.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
A quick look at "man syscall" shows that in Linux, all architectures
support at least 6 argument system calls, with a few supporting 7. We
can at least do 6 in Zephyr.
x86 port modified to use EBP register to carry the 6th system call
argument.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
* Instead of a common system call entry function, we instead create a
table mapping system call ids to handler skeleton functions which are
invoked directly by the architecture code which receives the system
call.
* system call handler prototype specified. All but the most trivial
system calls will implement one of these. They validate all the
arguments, including verifying kernel/device object pointers, ensuring
that the calling thread has appropriate access to any memory buffers
passed in, and performing other parameter checks that the base system
call implementation does not check, or only checks with __ASSERT().
It's only possible to install a system call implementation directly
inside this table if the implementation has a return value and requires
no validation of any of its arguments.
A sample handler implementation for k_mutex_unlock() might look like:
u32_t _syscall_k_mutex_unlock(u32_t mutex_arg, u32_t arg2, u32_t arg3,
u32_t arg4, u32_t arg5, void *ssf)
{
struct k_mutex *mutex = (struct k_mutex *)mutex_arg;
_SYSCALL_ARG1;
_SYSCALL_IS_OBJ(mutex, K_OBJ_MUTEX, 0, ssf);
_SYSCALL_VERIFY(mutex->lock_count > 0, ssf);
_SYSCALL_VERIFY(mutex->owner == _current, ssf);
k_mutex_unlock(mutex);
return 0;
}
* the x86 port modified to work with the system call table instead of
calling a common handler function. fixed an issue where registers being
changed could confuse the compiler has been fixed; all registers, even
ones used for parameters, must be preserved across the system call.
* a new arch API for producing a kernel oops when validating system call
arguments added. The debug information reported will be from the system
call site and not inside the handler function.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- _arch_user_mode_enter() implemented
- _arch_is_user_context() implemented
- _new_thread() will honor K_USER option if passed in
- System call triggering macros implemented
- _thread_entry_wrapper moved and now looks for the next function to
call in EDI
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>