Now that device_api attribute is unmodified at runtime, as well as all
the other attributes, it is possible to switch all device driver
instance to be constant.
A coccinelle rule is used for this:
@r_const_dev_1
disable optional_qualifier
@
@@
-struct device *
+const struct device *
@r_const_dev_2
disable optional_qualifier
@
@@
-struct device * const
+const struct device *
Fixes#27399
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
The x86 paging code has been rewritten to support another paging mode
and non-identity virtual mappings.
- Paging code now uses an array of paging level characteristics and
walks tables using for loops. This is opposed to having different
functions for every paging level and lots of #ifdefs. The code is
now more concise and adding new paging modes should be trivial.
- We now support 32-bit, PAE, and IA-32e page tables.
- The page tables created by gen_mmu.py are now installed at early
boot. There are no longer separate "flat" page tables. These tables
are mutable at any time.
- The x86_mmu code now has a private header. Many definitions that did
not need to be in public scope have been moved out of mmustructs.h
and either placed in the C file or in the private header.
- Improvements to dumping page table information, with the physical
mapping and flags all shown
- arch_mem_map() implemented
- x86 userspace/memory domain code ported to use the new
infrastructure.
- add logic for physical -> virtual instruction pointer transition,
including cleaning up identity mappings after this takes place.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The address was being truncated because we were using
32-bit registers. CONFIG_MMU is always enabled on 64-bit,
remove the #ifdef.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Move tracing switched_in and switched_out to the architecture code and
remove duplications. This changes swap tracing for x86, xtensa.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
This set of functions seem to be there just because of historical
reasons, stemming from Kbuild. They are non-obvious and prone to errors,
so remove them in favor of the `_ifdef()` ones with an explicit
`CONFIG_` condition.
Script used:
git grep -l _if_kconfig | xargs sed -E -i
"s/_if_kconfig\(\s*(\w*)/_ifdef(CONFIG_\U\1\E \1/g"
Signed-off-by: Carles Cufi <carles.cufi@nordicsemi.no>
These stacks are appropriate for threads that run purely in
supervisor mode, and also as stacks for interrupt and exception
handling.
Two new arch defines are introduced:
- ARCH_KERNEL_STACK_GUARD_SIZE
- ARCH_KERNEL_STACK_OBJ_ALIGN
New public declaration macros:
- K_KERNEL_STACK_RESERVED
- K_KERNEL_STACK_EXTERN
- K_KERNEL_STACK_DEFINE
- K_KERNEL_STACK_ARRAY_DEFINE
- K_KERNEL_STACK_MEMBER
- K_KERNEL_STACK_SIZEOF
If user mode is not enabled, K_KERNEL_STACK_* and K_THREAD_STACK_*
are equivalent.
Separately generated privilege elevation stacks are now declared
like kernel stacks, removing the need for K_PRIVILEGE_STACK_ALIGN.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This now takes a stack pointer as an argument with TLS
and random offsets accounted for properly.
Based on #24467 authored by Flavio Ceolin.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The core kernel computes the initial stack pointer
for a thread, properly aligning it and subtracting out
any random offsets or thread-local storage areas.
arch_new_thread() no longer needs to make any calculations,
an initial stack frame may be placed at the bounds of
the new 'stack_ptr' parameter passed in. This parameter
replaces 'stack_size'.
thread->stack_info is now set before arch_new_thread()
is invoked, z_new_thread_init() has been removed.
The values populated may need to be adjusted on arches
which carve-out MPU guard space from the actual stack
buffer.
thread->stack_info now has a new member 'delta' which
indicates any offset applied for TLS or random offset.
It's used so the calculations don't need to be repeated
if the thread later drops to user mode.
CONFIG_INIT_STACKS logic is now performed inside
z_setup_new_thread(), before arch_new_thread() is called.
thread->stack_info is now defined as the canonical
user-accessible area within the stack object, including
random offsets and TLS. It will never include any
carved-out memory for MPU guards and must be updated at
runtime if guards are removed.
Available stack space is now optimized. Some arches may
need to significantly round up the buffer size to account
for page-level granularity or MPU power-of-two requirements.
This space is now accounted for and used by virtue of
the Z_THREAD_STACK_SIZE_ADJUST() call in z_setup_new_thread.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
MISRA-C wants the parameter names in a function implementaion
to match the names used by the header prototype.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
arch_new_thread() passes along the thread priority and option
flags, but these are already initialized in thread->base and
can be accessed there if needed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
MISRA-C directive 4.10 requires that files being included must
prevent itself from being included more than once. So add
include guards to the offset files, even though they are C
source files.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The hardware stack overflow feature requires
CONFIG_THREAD_STACK_INFO enabled in order to distingush
stack overflows from other causes when we get an exception.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
A hack was required for the loapic code due to the address
range not being in DTS. A bug was filed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This driver code uses PCIe and doesn't use Zephyr's
device model, so we can't use the nice DEVICE_MMIO macros.
Set stuff up manually instead using device_map().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This currently only supports identity paging; there's just
enough here for device_map() calls to work.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Adding just the cache flush function for x86. The name
arch_cache_flush comply with API names in include/cache.h
Signed-off-by: Aastha Grover <aastha.grover@intel.com>
The page table initialization needs a populated PCI MMIO
configuration, and that is lazy-evaluated. We aren't guaranteed that
a driver already hit that path, so be sure to call it explicitly.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Right now x86_64 doesn't install handlers for vectors that aren't
populated by Zephyr code. Add a tiny spurious interrupt handler that
logs the error and triggers a fatal error, like other platforms do.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This patch is almost entirely aesthetics, designed to isolate the
variant configurations to a simple macro API (just IN/OUT), reduce
complexity derived from code pasted out of the larger ns16550 driver,
and keep the complexity out of the (very simple!) core code. Useful
when hacking on the driver in contexts where it isn't working yet.
The sole behavioral change here is that I've removed the runtime
printk hook installation in favor of defining an
arch_printk_char_out() function which overrides the weak-linked
default (that is, we don't need to install a hook, we can be the
default hook at startup).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Various cleanups to the x86 early serial driver, mostly with the goal
of simplifying its deployment during board bringup (which is really
the only reason it exists in the first place):
+ Configure it =y by default. While there are surely constrained
environments that will want to disable it, this is a TINY driver,
and it serves a very important role for niche tasks. It should be
built always to make sure it works everywhere.
+ Decouple from devicetree as much as possible. This code HAS to work
during board bringup, often with configurations cribbed from other
machines, before proper configuration gets written. Experimentally,
devicetree errors tend to be easy to make, and without a working
console impossible to diagnose. Specify the device via integer
constants in soc.h (in the case of IOPORT access, we already had
such a symbol) so that the path from what the developer intends to
what the code executes is as short and obvious as possible.
Unfortunately I'm not allowed to remove devicetree entirely here,
but at least a developer adding a new platform will be able to
override it in an obvious way instead of banging blindly on the
other side of a DTS compiler.
+ Don't try to probe the PCI device by ID to "verify". While this
sounds like a good idea, in practice it's just an extra thing to get
wrong. If we bail on our early console because someone (yes, that's
me) got the bus/device/function right but typoed the VID/DID
numbers, we're doing no one any favors.
+ Remove the word-sized-I/O feature. This is a x86 driver for a PCI
device. No known PC hardware requires that UART register access be
done in dword units (in fact doing so would be a violation of the
PCI specifciation as I understand it). It looks to have been cut
and pasted from the ns16550 driver, remove.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The default page table (the architecturally required one used for
entrance to long mode, before the OS page tables get assembled) was
mapping the first 4G of memory.
Extend this to 512G by fully populating the second level page table.
We have devices now (up_squared) which have real RAM mapped above 4G.
There's really no good reason not to do this, the page is present
always anyway.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
A last minute "cleanup" to the EFI startup path (on a system where I
had SMP disabled) moved the load of the x86_cpuboot[0] entry into RBP
into the main startup code, which is wrong because on auxiliary CPUs
that's already set up by the 16/32 bit entry code to point to the
OTHER entries.
Put it back where it belongs.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This is a first cut on a tool that will convert a built Zephyr ELF
file into an EFI applciation suitable for launching directly from the
firmware of a UEFI-capable device, without the need for an external
bootloader.
It works by including the Zephyr sections into the EFI binary as
blobs, then copying them into place on startup.
Currently, it is not integrated in the build. Right now you have to
build an image for your target (up_squared has been tested) and then
pass the resulting zephyr.elf file as an argument to the
arch/x86/zefi/zefi.py script. It will produce a "zephyr.efi" file in
the current directory.
This involved a little surgery in x86_64 to copy over some setup that
was previously being done in 32 bit mode to a new EFI entry point.
There is no support for 32 bit UEFI targets for toolchain reasons.
See the README for more details.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The traditional IO Port configuration mechanism was technically
deprecated about 15 years ago when PCI Express started shipping.
While frankly the MMIO support is significantly more complicated and
no more performant in practice, Zephyr should have support for current
standards. And (particularly complicated) devices do exist in the
wild whose extended capability pointers spill beyond the 256 byte area
allowed by the legacy mechanism. Zephyr will want drivers for those
some day.
Also, Windows and Linux use MMIO access, which means that's what
system vendors validate.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The existing minimal ACPI implementation was enough to find the MADT
table for dumping CPU info. Enhance it with a slightly less minimal
implementation that can fetch any table, supports the ACPI 2.0 XSDT
directory (technically required on 64 bit systems so tables can live
>4G) and provides definitions for the MCFG table with the PCI
configuration pointers.
Note that there is no use case right now for high performance table
searching, so the "init" step has been removed and tables are probed
independently from scratch for each one requested (there are only
two).
Note also that the memory to which these tables point is not
understood by the Zephyr MMU configuration, so in long mode all ACPI
calls have to be done very early, before z_x86_paging_init() (or on a
build with the MMU initialization disabled).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
If we get a page fault in early boot context, before
main thread is started, page faults were being
incorrectly reported as stack overflows.
z_x86_check_stack_bounds() needs to consider the
interrupt stack as the correct stack for this context.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This helps distingush between fatal errors if logging isn't
enabled.
As detailed in comments, pass a reason code which controls
the QEMU process' return value.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
x86_64's __resume path 'poisons' the incoming thread's
saved RIP value with a special 0xB9 value, to catch
re-use of thread objects across CPUs in SMP. Add a check
and printout for this when handling fatal errors, and
treat as a kernel panic.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
If KPTI is not enabled, the current value of CR3 is the correct
page tables when the exception happened in all cases.
If KPTI is enabled, and the excepting thread was in user mode,
then a page table switch happened and the current value of CR3
is not the page tables when the fault happened. Get it out of the
thread object instead.
Fixes two problems:
- Divergent exception loop if we crash when _current is a dummy
thread or its page table pointer stored in the thread object is
NULL or uninitialized
- Printing the wrong CR3 value on exceptions from user mode in
the register dump
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
In one of the ASSERT() statement, the PHYS_RAM_ADDR (alias
of DT_REG_ADDR()) may be interpreted by the compiler as
long long int when it's large than 0x7FFFFFFF, but is
paired with %x, resulting in compiler warning. Fix this
by type casting it to uintptr_t and use %lx instead.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
On x86_64, the arch_timing_* variables are not set which
results in incorrect values being used in the timing_info
benchmarks. So instrument the code for those values.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The SoCs usually have devices that are accessed through MMIO.
This requires the corresponding regions to be marked readable
and writable in the MMU or else accesses will result in page
faults.
This adds a function which can be implemented in the SoC code to
specify those pages to be added to MMU.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The integers used for pointer calculation were u32_t.
Change them to uintptr_t to be compatible with 64-bit.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
x86-32 thread objects require special alignment since they
contain a buffer that is passed to fxsave/fxrstor instructions.
This fell over if the dummy thread is created in a stack frame.
Implement a custom swap to main for x86 which still uses a
dummy thread, but in an unused part of the interrupt stack
with proper alignment.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
If IO APIC is in logical destination mode, local APICs compare their
logical APIC ID defined in LDR (Logical Destination Register) with
the destination code sent with the interrupt to determine whether or not
to accept the incoming interrupt.
This patch programs LDR in xAPIC mode to support IO APIC logical mode.
The local APIC ID from local APIC ID register can't be used as the
'logical APIC ID' because LAPIC ID may not be consecutive numbers hence
it makes it impossible for LDR to encode 8 IDs within 8 bits.
This patch chooses 0 for BSP, and for APs, cpu_number which is the index
to x86_cpuboot[], which ultimately assigned in z_smp_init[].
Signed-off-by: Zide Chen <zide.chen@intel.com>
This commit renames the x86 Kconfig `CONFIG_{EAGER,LAZY}_FP_SHARING`
symbol to `CONFIG_{EAGER,LAZY}_FPU_SHARING`, in order to align with the
recent `CONFIG_FP_SHARING` to `CONFIG_FPU_SHARING` renaming.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
This commit renames the Kconfig `FP_SHARING` symbol to `FPU_SHARING`,
since this symbol specifically refers to the hardware FPU sharing
support by means of FPU context preservation, and the "FP" prefix is
not fully descriptive of that; leaving room for ambiguity.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
This expands the early_serial to support MMIO UART, in addition to
port I/O, by duplicating part of the hardware initialization from
the NS16550 UART driver. This allows enabling of early console on
hardware with MMIO-based UARTs.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
x86_64 supports 4 levels of interrupt nesting, with
the interrupt stack divided up into sub-stacks for
each nesting level.
Unfortunately, the initial interrupt stack pointer
on the first CPU was not taking into account reserved
space for guard areas, causing a stack overflow exception
when attempting to use the last interrupt nesting level,
as that page had been set up as a stack guard.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We need to lock interrupts before setting the thread's
stack pointer to the trampoline stack. Otherwise, we
could unexpectedly take an interrupt on this stack
instead of the thread stack as intended.
The specific problem happens at the end of the interrupt,
when we switch back to the thread stack and call swap.
Doing this on a per-cpu trampoline stack instead of the
thread stack causes data corruption.
Fixes: #24869
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Replace DT_PHYS_RAM_ADDR and DT_RAM_SIZE with DT_REG_ADDR/DT_REG_SIZE
for the DT_CHOSEN(zephyr_sram) node.
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
This commit renames the Kconfig `FLOAT` symbol to `FPU`, since this
symbol only indicates that the hardware Floating Point Unit (FPU) is
used and does not imply and/or indicate the general availability of
toolchain-level floating point support (i.e. this symbol is not
selected when building for an FPU-less platform that supports floating
point operations through the toolchain-provided software floating point
library).
Moreover, given that the symbol that indicates the availability of FPU
is named `CPU_HAS_FPU`, it only makes sense to use "FPU" in the name of
the symbol that enables the FPU.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
This operation is formally defined as rounding down a potential
stack pointer value to meet CPU and ABI requirments.
This was previously defined ad-hoc as STACK_ROUND_DOWN().
A new architecture constant ARCH_STACK_PTR_ALIGN is added.
Z_STACK_PTR_ALIGN() is defined in terms of it. This used to
be inconsistently specified as STACK_ALIGN or STACK_PTR_ALIGN;
in the latter case, STACK_ALIGN meant something else, typically
a required alignment for the base of a stack buffer.
STACK_ROUND_UP() only used in practice by Risc-V, delete
elsewhere.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The core kernel z_setup_new_thread() calls into arch_new_thread(),
which calls back into the core kernel via z_new_thread_init().
Move everything that doesn't have to be in z_new_thread_init() to
z_setup_new_thread() and convert to an inline function.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Those are used only in tests, so remove them from kernel Kconfig and set
them in the tests that use them directly.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The set of interrupt stacks is now expressed as an array. We
also define the idle threads and their associated stacks this
way. This allows for iteration in cases where we have multiple
CPUs.
There is now a centralized declaration in kernel_internal.h.
On uniprocessor systems, z_interrupt_stacks has one element
and can be used in the same way as _interrupt_stack.
The IRQ stack for CPU 0 is now set in init.c instead of in
arch code.
The extern definition of the main thread stack is now removed,
this doesn't need to be in a header.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Change to code to use the automatically generated DT_INST_*
defines and remove the now unneeded configs and fixups.
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
If IO APIC is in logical destination mode, local APICs compare their
logical APIC ID defined in LDR (Logical Destination Register) with
the destination code sent with the interrupt to determine whether or not
to accept the incoming interrupt.
This patch programs LDR in xAPIC mode to support IO APIC logical mode.
The local APIC ID from local APIC ID register can't be used as the
'logical APIC ID' because LAPIC ID may not be consecutive numbers hence
it makes it impossible for LDR to encode 8 IDs within 8 bits.
This patch chooses 0 for BSP, and for APs, cpu_number which is the index
to x86_cpuboot[], which ultimately assigned in z_smp_init[].
Signed-off-by: Zide Chen <zide.chen@intel.com>
The callee-saved registers have been separated out and will not
be saved/restored if exception debugging is shut off.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The context switch implementation forgot to save the current flag
state of the old thread, so on resume the flags would be restored to
whatever value they had at the last interrupt preemption or thread
initialization. In practice this guaranteed that the interrupt enable
bit would always be wrong, becuase obviously new threads and preempted
ones have interrupts enabled, while arch_switch() is always called
with them masked. This opened up a race between exit from
arch_switch() and the final exit path in z_swap().
The other state bits weren't relevant -- the oddball ones aren't used
by Zephyr, and as arch_switch() on this architecture is a function
call the compiler would have spilled the (caller-save) comparison
result flags anyway.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add TRACING_ISR Kconfig to help high latency backend working well.
Currently the ISR tracing hook function is put at the begining and
ending of ISR wrapper, when there is ISR needed in the tracing path
(especially tracing backend), it will cause tracing buffer easily
be exhausted if async tracing method enabled. Also it will increase
system latency if all the ISRs are traced. So add TRACING_ISR to
enable/disable ISR tracing here. Later a filter out mechanism based
on irq number will be added.
Signed-off-by: Wentong Wu <wentong.wu@intel.com>
There was a bug where double-dispatch of a single thread on multiple
SMP CPUs was possible. This can be mind-bending to diagnose, so when
CONFIG_ASSERT is enabled add an extra instruction to __resume (the
shared code path for both interupt return and context switch) that
poisons the shared RIP of the now-running thread with a recognizable
invalid value.
Now attempts to run the thread again will crash instantly with a
discoverable cookie in their instruction pointer, and this will remain
true until it gets a new RIP at the next interrupt or switch.
This is under CONFIG_ASSERT because it meets the same design goals of
"a cheap test for impossible situations", not because it's part of the
assertion framework.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The original intent was that the output handle be written through the
pointer in the second argument, though not all architectures used that
scheme. As it turns out, that write is becoming a synchronization
signal, so it's no longer optional.
Clarify the documentation in arch_switch() about this requirement, and
add an instruction to the x86_64 context switch to implement it as
original envisioned.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Implement a set of per-cpu trampoline stacks which all
interrupts and exceptions will initially land on, and also
as an intermediate stack for privilege changes as we need
some stack space to swap page tables.
Set up the special trampoline page which contains all the
trampoline stacks, TSS, and GDT. This page needs to be
present in the user page tables or interrupts don't work.
CPU exceptions, with KPTI turned on, are treated as interrupts
and not traps so that we have IRQs locked on exception entry.
Add some additional macros for defining IDT entries.
Add special handling of locore text/rodata sections when
creating user mode page tables on x86-64.
Restore qemu_x86_64 to use KPTI, and remove restrictions on
enabling user mode on x86-64.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
generated_dts_board.h is pretty redundant and confusing as a name. Call
it devicetree.h instead.
dts.h would be another option, but DTS stands for "devicetree source"
and is the source code format, so it's a bit confusing too.
The replacement was done by grepping for 'generated_dts_board' and
'GENERATED_DTS_BOARD'.
Two build diagram and input-output SVG files were updated as well, along
with misc. documentation.
hal_ti, mcuboot, and ci-tools updates are included too, in the west.yml
update.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
See CVE-2019-1125. We mitigate this by adding an 'lfence'
upon interrupt/exception entry after the decision has been
made whether it's necessary to invoke 'swapgs' or not.
Only applies to x86_64, 32-bit doesn't use swapgs.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- In early boot, enable the syscall instruction and set up
necessary MSRs
- Add a hook to update page tables on context switch
- Properly initialize thread based on whether it will
start in user or supervisor mode
- Add landing function for system calls to execute the
desired handler
- Implement arch_user_string_nlen()
- Implement logic for dropping a thread down to user mode
- Reserve per-CPU storage space for user and privilege
elevation stack pointers, necessary for handling syscalls
when no free registers are available
- Proper handling of gs register considerations when
transitioning privilege levels
Kernel page table isolation (KPTI) is not yet implemented.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This code:
1) Doesn't work
2) Hasn't ever been enabled by default
3) We mitigate Spectre V2 via Extended IBRS anyway
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We use a fixed value of 32 as the way interrupts/exceptions
are setup in x86_64's locore.S do not lend themselves to
Kconfig configuration of the vector to use.
HW-based kernel oops is now permanently on, there's no reason
to make it optional that I can see.
Default vectors for IPI and irq offload adjusted to not
collide.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is causing problems, as if we create a thread in
a system call we will *not* be using the kernel page
tables if CONFIG_KPTI=n.
Just don't fiddle with this page's permissions; we don't
need it as a guard area anyway since we have a stack
guard placed immediately before it, and this page
is unused if user mode isn't active.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Nothing too fancy here, we try as much as possible to
use the same register layout as the C calling convention.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
These are now common code, all are related to user mode
threads. The rat's nest of ifdefs in ia32's arch_new_thread
has been greatly simplified, there is now just one hook
if user mode is turned on.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
z_x86_thread_page_tables_get() now works for both user
and supervisor threads, returning the kernel page tables
in the latter case. This API has been up-leveled to
a common header.
The per-thread privilege elevation stack initial stack
pointer, and the per-thread page table locations are no
longer computed from other values, and instead are stored
in thread->arch.
A problem where the wrong page tables were dumped out
on certain kinds of page faults has been fixed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Add two new non-static APIs for dumping out the
page table entries for a specified memory address,
and move to the main MMU code. Has debugging uses
when trying to figure out why memory domains are not
set up correctly.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We don't need to set up GDT data descriptors for setting
%gs. Instead, we use the x86 MSRs to set GS_BASE and
KERNEL_GS_BASE.
We don't currently allow user mode to set %gs on its own,
but later on if we do, we have everything set up to issue
'swapgs' instructions on syscall or IRQ.
Unused entries in the GDT have been removed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
These were previously assumed to always be fatal.
We can't have the faulting thread's XMM registers
clobbered, so put the SIMD/FPU state onto the stack
as well. This is fairly large (512 bytes) and the
execption stack is already uncomfortably small, so
increase to 2K.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Runtime stack traces (at least as currently implemented)
don't work on x86_64 normally as RBP is treated as a general-
purpose register. Depend on CONFIG_NO_OPTIMIZATIONS to enable
this on 64-bit.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
qemu_x86_64 will exit the emulator on a fatal system error,
like qemu_x86 already does.
Improves CI times when tests fail since sanitycheck will not
need to wait for the timeout to expire.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We now dump more information for less common cases,
and this is now centralized code for 32-bit/64-bit.
All of this code is now correctly wrapped around
CONFIG_EXCEPTION_DEBUG. Some cruft and unused defines
removed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We need a size_t and not a u32_t for partition sizes,
for 64-bit compatibility.
Additionally, app_memdomain.h was also casting the base
address to a u32_t instead of a uintptr_t.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is causing problems, as if we create a thread in
a system call we will *not* be using the kernel page
tables if CONFIG_KPTI=n, resulting in a crash when
the later call to copy_page_tables() tries to initialize
the PDPT (which is in the same page as the privilege
stack).
Just don't fiddle with this page's permissions; we don't
need it as a guard area anyway since we have a stack
guard placed immediately before it, and this page
is unused if user mode isn't active.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This commit inlines the direct ISR functions that were previously
implemented in irq_manage.c, since the PR #20119 resolved the circular
dependency between arch.h and kernel_structs.h described in the issue
#3056.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
Promote the private z_arch_* namespace, which specifies
the interface between the core kernel and the
architecture code, to a new top-level namespace named
arch_*.
This allows our documentation generation to create
online documentation for this set of interfaces,
and this set of interfaces is worth treating in a
more formal way anyway.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Some code for unwinding stacks and z_x86_fatal_error()
now in a common C file, suitable for both modes.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
When compiling the components under the arch directory, the compiler
include paths for arch and kernel private headers need to be specified.
This was previously done by adding 'zephyr_library_include_directories'
to CMakeLists.txt file for every component under the arch directory,
and this resulted in a significant amount of duplicate code.
This commit uses the CMake 'include_directories' command in the root
CMakeLists.txt to simplify specification of the private header include
paths for all the arch components.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
This commit refactors kernel and arch headers to establish a boundary
between private and public interface headers.
The refactoring strategy used in this commit is detailed in the issue
This commit introduces the following major changes:
1. Establish a clear boundary between private and public headers by
removing "kernel/include" and "arch/*/include" from the global
include paths. Ideally, only kernel/ and arch/*/ source files should
reference the headers in these directories. If these headers must be
used by a component, these include paths shall be manually added to
the CMakeLists.txt file of the component. This is intended to
discourage applications from including private kernel and arch
headers either knowingly and unknowingly.
- kernel/include/ (PRIVATE)
This directory contains the private headers that provide private
kernel definitions which should not be visible outside the kernel
and arch source code. All public kernel definitions must be added
to an appropriate header located under include/.
- arch/*/include/ (PRIVATE)
This directory contains the private headers that provide private
architecture-specific definitions which should not be visible
outside the arch and kernel source code. All public architecture-
specific definitions must be added to an appropriate header located
under include/arch/*/.
- include/ AND include/sys/ (PUBLIC)
This directory contains the public headers that provide public
kernel definitions which can be referenced by both kernel and
application code.
- include/arch/*/ (PUBLIC)
This directory contains the public headers that provide public
architecture-specific definitions which can be referenced by both
kernel and application code.
2. Split arch_interface.h into "kernel-to-arch interface" and "public
arch interface" divisions.
- kernel/include/kernel_arch_interface.h
* provides private "kernel-to-arch interface" definition.
* includes arch/*/include/kernel_arch_func.h to ensure that the
interface function implementations are always available.
* includes sys/arch_interface.h so that public arch interface
definitions are automatically included when including this file.
- arch/*/include/kernel_arch_func.h
* provides architecture-specific "kernel-to-arch interface"
implementation.
* only the functions that will be used in kernel and arch source
files are defined here.
- include/sys/arch_interface.h
* provides "public arch interface" definition.
* includes include/arch/arch_inlines.h to ensure that the
architecture-specific public inline interface function
implementations are always available.
- include/arch/arch_inlines.h
* includes architecture-specific arch_inlines.h in
include/arch/*/arch_inline.h.
- include/arch/*/arch_inline.h
* provides architecture-specific "public arch interface" inline
function implementation.
* supersedes include/sys/arch_inline.h.
3. Refactor kernel and the existing architecture implementations.
- Remove circular dependency of kernel and arch headers. The
following general rules should be observed:
* Never include any private headers from public headers
* Never include kernel_internal.h in kernel_arch_data.h
* Always include kernel_arch_data.h from kernel_arch_func.h
* Never include kernel.h from kernel_struct.h either directly or
indirectly. Only add the kernel structures that must be referenced
from public arch headers in this file.
- Relocate syscall_handler.h to include/ so it can be used in the
public code. This is necessary because many user-mode public codes
reference the functions defined in this header.
- Relocate kernel_arch_thread.h to include/arch/*/thread.h. This is
necessary to provide architecture-specific thread definition for
'struct k_thread' in kernel.h.
- Remove any private header dependencies from public headers using
the following methods:
* If dependency is not required, simply omit
* If dependency is required,
- Relocate a portion of the required dependencies from the
private header to an appropriate public header OR
- Relocate the required private header to make it public.
This commit supersedes #20047, addresses #19666, and fixes#3056.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
Use this short header style in all Kconfig files:
# <description>
# <copyright>
# <license>
...
Also change all <description>s from
# Kconfig[.extension] - Foo-related options
to just
# Foo-related options
It's clear enough that it's about Kconfig.
The <description> cleanup was done with this command, along with some
manual cleanup (big letter at the start, etc.)
git ls-files '*Kconfig*' | \
xargs sed -i -E '1 s/#\s*Kconfig[\w.-]*\s*-\s*/# /'
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
There are two set of code supporting x86_64: x86_64 using x32 ABI,
and x86 long mode, and this consolidates both into one x86_64
architecture and SoC supporting truly 64-bit mode.
() Removes the x86_64:x32 architecture and SoC, and replaces
them with the existing x86 long mode arch and SoC.
() Replace qemu_x86_64 with qemu_x86_long as qemu_x86_64.
() Updates samples and tests to remove reference to
qemu_x86_long.
() Renames CONFIG_X86_LONGMODE to CONFIG_X86_64.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The page tables to use are now stored in the cpuboot struct.
For the first CPU, we set to the flat page tables, and then
update later in z_x86_prep_c() once the runtime tables have
been generated.
For other CPUs, by the time we get to z_arch_start_cpu()
the runtime tables are ready do go, and so we just install
them directly.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
- Bring in CONFIG_X86_MMU and some related defines to
common X86 Kconfig
- Don't set ARCH_HAS_USERSPACE for intel64 yet when
X86_MMU is enabled
- Uplevel x86_mmu.c to common code
- Add logic for handling PML4 table and generating PDPTs
- move z_x86_paging_init() to common kernel_arch_func.h
- Uplevel inclusion of mmustructs.h to common x86 arch.h,
both need it for memory domain defines
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Program text, rodata, and data need different MMU
permissions. Split out rodata and data from the program
text, updating the linker script appropriately.
Region size symbols added to the linker script, so these
can later be used with MMU_BOOT_REGION().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Duplicate definitions elsewhere have been removed.
A couple functions which are defined by the arch interface
to be non-inline, but were implemented inline by native_posix
and intel64, have been moved to non-inline.
Some missing conditional compilation for z_arch_irq_offload()
has been fixed, as this is an optional feature.
Some massaging of native_posix headers to get everything
in the right scope.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The intel64 switch implementation doesn't actually use a switch handle
per se, just the raw thread struct pointers which get stored into the
handle field. This works fine for normally initialized threads, but
when switching out of a dummy thread at initialization, nothing has
initialized that field and the code was dumping registers into the
bottom of memory through the resulting NULL pointer.
Fix this by skipping the load of the field value and just using an
offset instead to get the struct address, which is actually slightly
faster anyway (a SUB immediate instruction vs. the load).
Actually for extra credit we could even move the switch_handle field
to the top of the thread struct and eliminate the instruction
entirely, though if we did that it's probably worth adding some
conditional code to make the switch_handle field disappear entirely.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Line up everything nicely, add leading '0x' to hex
addresses, and remove redundant newlines. Add
whitespace between the register name and contents
so the contents can be easily selected from a terminal.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The struct definitions for pdpt, pd, and pt entries has been
removed:
- Bitfield ordering in a struct is implementation dependent,
it can be right-to-left or left-to-right
- The two different structures for page directory entries were
not being used consistently, or when the type of the PDE
was unknown
- Anonymous structs/unions are GCC extensions
Instead these are now u64_t, with bitwise operations used to
get/set fields.
A new set of inline functions for fetcing various page table
structures has been implemented, replacing the older macros.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This hasn't been necessary since we dropped support for 32-bit
non-PAE page tables. Replace it with u64_t and scrub any
unnecessary casts left behind.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This will be used for both 32-bit and 64-bit mode.
This header gets pulled in by x86's arch/cpu.h, so put
it in include/arch/x86/.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
include/sys/arch_inlines.h will contain all architecture APIs
that are used by public inline functions and macros,
with implementations deriving from include/arch/cpu.h.
kernel/include/arch_interface.h will contain everything
else, with implementations deriving from
arch/*/include/kernel_arch_func.h.
Instances of duplicate documentation for these APIs have been
removed; implementation details have been left in place.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Set the NXE bit in the EFER MSR so that the NX bit can
be set in page tables. Otherwise, the NX bit is treated
as reserved and leads to a fault if set.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
It's possible to have multiple processors configured without using the
SMP scheduler, so don't make definitions dependent on CONFIG_SMP.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
In non-SMP MP situations, the interrupt stacks might not exist, so
do not assume they do. Instead, initialize the TSS IST1 from the
cpuboot[] vector (meaning, on APs, the stack from z_arch_start_cpu).
Eliminates redundancy at the same time.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This is the Wrong Thing(tm) with SMP enabled. Previously this
worked because interrupts would be re-enabled in the interrupt
entry sequence, but this is no longer the case.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
was ignoring the rest of the expression, though the effect was
harmless (including unreachable code in some builds).
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Add duplicate per-CPU data structures (x86_cpuboot, tss, stacks, etc.)
for up to 4 total CPUs, add code in locore and z_arch_start_cpu().
The test board, qemu_x86_long, now defaults to 2 CPUs.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Take a dummy first argument, so that the BSP entry point (z_x86_prep_c)
has the same signature as the AP entry point (smp_init_top).
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
A new 'struct x86_cpuboot' is created as well as an instance called
'x86_cpuboot[]' which contains per-CPU boot data (initial stack,
entry function/arg, selectors, etc.). The locore now consults this
table to set up per-CPU registers, etc. during early boot.
Also, rename tss.c to cpu.c as its scope is growing.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
There's no need to qualify the 64-bit CS/DS selectors, and the GS and
TR selectors are renamed CPU0_GS and CPU0_TR as they are CPU-specific.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
In some places the code was being overly pedantic; e.g., there is no
need to load our own 32-bit descriptors because the loader's are fine
for our purposes. We can defer loading our own segments until 64-bit.
The sequence is re-ordered to faciliate code sharing between the BSP
and APs when SMP is enabled (all BSP-specific operations occur before
the per-CPU initialization).
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This is really just to facilitate CPU bootstrap code between
the BSP and the APs, moving the clear operation out of the way.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
In the general case, the local APIC can't be treated as a normal device
with a single boot-time initialization - on SMP systems, each CPU must
initialize its own. Hence the initialization proper is separated from
the device-driver initialization, and said initialization is called
from the early startup-assembly code when appropriate.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The 32-bit and 64-bit assembly startup sequences share quite a
bunch of common code, so it's factored out into one file to avoid
repeating ourselves (and potentially falling out of sync).
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The linker script was missing symbols that defined the boundaries
of kernel memory segments (_image_rom_end, etc.). These are added
so that core/memmap.c can properly account for those segments.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Elevate the previously 32-bit-only z_x86_prep_c() function to common
code, so both 32-bit and 64-bit arches now enter the kernel this way.
Minor changes to prep_c.c to make it build with the SMP scheduler on.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This patch is a preparatory step in enabling the MMU in
long mode; no steps are taken to implement long mode support.
We introduce struct x86_page_tables, which represents the
top-level data structure for page tables:
- For 32-bit, this will contain a four-entry page directory
pointer table (PDPT)
- For 64-bit, this will (eventually) contain a page map level 4
table (PML4)
In either case, this pointer value is what gets programmed into
CR3 to activate a set of page tables. There are extra bits in
CR3 to set for long mode, we'll get around to that later.
This abstraction will allow us to use the same APIs that work
with page tables in either mode, rather than hard-coding that
the top level data structure is a PDPT.
z_x86_mmu_validate() has been re-written to make it easier to
add another level of paging for long mode, to support 2MB
PDPT entries, and correctly validate regions which span PDPTE
entries.
Some MMU-related APIs moved out of 32-bit x86's arch.h into
mmustructs.h.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This makes it clearer that this is an API that is expected
to be implemented at the architecture level.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This metric shows when the system first enters an idle
state, which has already been recorded in the arch-
independent implementation of the idle thread.
Only x86 was doing this.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is part of the core kernel -> architecture interface and
has been renamed z_arch_kernel_init().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
z_set_thread_return_value is part of the core kernel -> arch
interface and has been renamed to z_arch_thread_return_value_set.
z_set_thread_return_value_with_data renamed to
z_thread_return_value_set_with_data for consistency.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
k_cpu_idle() and k_cpu_atomic_idle() were being directly
implemented by arch code.
Rename these implementations to z_arch_cpu_idle() and
z_arch_cpu_atomic_idle(), and call them from new inline
function definitions in kernel.h.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is part of the core kernel -> architecture interface
and is appropriately renamed z_arch_is_in_isr().
References from test cases changed to k_is_in_isr().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is part of the core kernel -> architecture interface
and should have a leading prefix z_arch_.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Global variables related to timing information have been
renamed to be prefixed with z_arch, with naming arranged
in increasing order of specificity.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
ACPI is predominantly x86, and only currently implemented on x86,
but it is employed on other architectures, so rename accordingly.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Simple naming change, since MULTIBOOT is clear enough by itself and
"namespacing" it to X86 is unnecessary and/or inappropriate.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
x86 has more complex memory maps than most Zephyr targets. A mechanism
is introduced here to manage such a map, and some methods are provided
to populate it (e.g., Multiboot).
The x86_info tool is extended to display memory map data.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Originally, the multiboot info struct was copied in the early assembly
language code. This code is moved to a C function in multiboot.c for
two reasons:
1. It's about to get more complicated, as we want the ability to use
a multiboot-provided memory map if available, and
2. this will faciliate its sharing between 32- and 64-bit subarches.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Implement a simple ACPI parser with enough functionality to
enumerate CPU cores and determine their local APIC IDs.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Various C and Assembly modules
make function calls to z_sys_trace_*. These merely call
corresponding functions sys_trace_*. This commit
is to simplify these by making direct function calls
to the sys_trace_* functions from these modules.
Subsequently, the z_sys_trace_* functions are removed.
Signed-off-by: Mrinal Sen <msen@oticon.com>
This used to be part of the "restore always" set of registers because
__swap was expected to return a value. No longer required, so RAX is
moved to the volatile registers and we save a few cycles occasionally.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
A space is allocated in the TSS for per-CPU variables. At present,
this is only a 'struct _cpu *' to find the _kernel CPU struct. The
locore routines are rewritten to find _current and _nested via this
pointer rather than referencing the _kernel global directly.
This is obviously in preparation for SMP support.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This function call was erroneously inserted between the instruction
that set the Z flag and the instruction that tested the Z flag. The
call is moved up a few instructions where it can't junk CPU state.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Declare the 64-bit TSS as a struct, and define the instance in C.
Add a data segment selector that overlaps the TSS and keep that
loaded in GS so we can access the TSS via a segment-override prefix.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This is largely a conceptual change rather than an actual change.
Instead of using an array of interrupt stacks (one for each IRQ
nesting level), we use one interrupt stack and subdivide it. The
effect is the same, but this is more in line with the Zephyr model
of one ISR stack per CPU (as reflected in init.c).
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Like its 32-bit sibling, the 64-bit code should EOI inline rather than
invoking a function. Defeats the performance advantages of x2APIC.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The boot time measurement sample was giving bogus values on x86: an
assumption was made that the system timer is in sync with the CPU TSC,
which is not the case on most x86 boards.
Boot time measurements are no longer permitted unless the timer source
is the local APIC. To avoid issues of TSC scaling, the startup datum
has been forced to 0, which is in line with the ARM implementation
(which is the only other platform which supports this feature).
Cleanups along the way:
As the datum is now assumed zero, some variables are removed and
calculations simplified. The global variables involved in boot time
measurements are moved to the kernel.h header rather than being
redeclared in every place they are referenced. Since none of the
measurements actually use 64-bit precision, the samples are reduced
to 32-bit quantities.
In addition, this feature has been enabled in long mode.
Fixes: #19144
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
There are not enough bits in k_thread.thread_state with SMP enabled,
and the field is (should be) private to the scheduler, anyway. So
move state bits to the _thread_arch where they belong.
While we're at it, refactor some offset data w/r/t _thread_arch
because it can be shared between 32- and 64-bit subarches.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
k_thread.thread_state (or rather, _thread_base.thread_state) should be
private to the kernel/scheduler, so flags previously stored there are
moved to _thread_arch where the belong.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
We don't need to save the ABI caller-save registers here, because
we don't preempt threads from nested IRQ contexts.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This is a naive implementation which does "eager" context switching
for floating-point context, which, of course, introduces performance
concerns. Other approaches have security concerns, SMP implications,
and impact the x86 arch and Zephyr project as a whole. Discussion is
needed, so punting with the straightforward solution for now.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Fleshed out z_arch_esf_t and added code to build this frame when
exceptions occur. Created a separate small stack for exceptions and
shifted the initialization code to use this instead of the IRQ stack.
Moved IRQ stack(s) to irq.c.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The IRQ_OFFLOAD_VECTOR config option is also moved to the arch level,
as it is shared between both 32- and 64-bit subarches.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Using the arch Kconfig here, instead of kernel/Kconfig. Intel64 with
the SysV ABI requires some pretty big stacks. These 4K-8K defaults
are arguably a bit small, but the Zephyr defaults are REALLY too small.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
First "complete" version of Intel64 support for x86. Compilation of
apps for supported boards (read: up_squared) with CONFIG_X86_LONGMODE=y
is now working. Booting, device drivers, interrupts, scheduling, etc.
appear to be functioning properly. Beware that this is ALHPA quality,
not ready for production use, but the port has advanced far enough that
it's time to start working through the test suite and samples, fleshing
out any missing features, and squashing bugs.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Widen the integer to pointer size before conversion, to make
explicit the intent (and silence the compiler warning). Also
fix a minor bug involving a duplicate (and thus dead) store.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This patch adds basic build infrastructure, definitions, a linker
script, etc. to use the Zephyr and 0.10.1 SDK to build a 64-bit
ELF binary suitable for use with GRUB to minimally bootstrap an
Apollo Lake (e.g., UpSquared) board. The resulting binary can hardly
be called a Zephyr kernel as it is lacking most of the glue logic,
but it is a starting point to flesh those out in the x86 tree.
The "kernel" builds with a few harmless warnings, both with GCC from
the Zephyr SDK and with ICC (which is currently being worked on in
a separate branch). These warnings are either related to pointer size
differences (since this is an LP64 build) and/or dummy functions
that will be replaced with working versions shortly.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The IA32 and Intel64 subarchitectures will generate different offset
symbols, so they are refactored. No functional change.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The _irq_to_interrupt_vector[] array shouldn't be accessed directly,
as there is a macro for this.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
From the Jailhouse days, this has been a function call. That's silly.
We now inline the EOI in the ISR when in x2APIC mode. Also clean up
z_irq_controller_eoi(), so it now uses the inline macros.
Also, we now enable x2APIC on up_squared by default.
Fixes: #17133
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Assembly language start code will enter here, which sets up
early kernel initialization and then calls z_cstart() when
finished.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Removes very complex boot-time generation of page tables
with a much simpler runtime generation of them at bootup.
For those x86 boards that enable the MMU in the defconfig,
set the number of page pool pages appropriately.
The MMU_RUNTIME_* flags have been removed. They were an
artifact of the old page table generation and did not
correspond to any hardware state.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Makes the code that defines stacks, and code referencing
areas within the stack object, much clearer.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Previously, context switching on x86 with memory protection
enabled involved walking the page tables, de-configuring all
the partitions in the outgoing thread's memory domain, and
then configuring all the partitions in the incoming thread's
domain, on a global set of page tables.
We now have a much faster design. Each thread has reserved in
its stack object a number of pages to store page directories
and page tables pertaining to the system RAM area. Each
thread also has a toplevel PDPT which is configured to use
the per-thread tables for system RAM, and the global tables
for the rest of the address space.
The result of this is on context switch, at most we just have
to update the CR3 register to the incoming thread's PDPT.
The x86_mmu_api test was making too many assumptions and has
been adjusted to work with the new design.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The current API was assuming too much, in that it expected that
arch-specific memory domain configuration is only maintained
in some global area, and updates to domains that are not currently
active have no effect.
This was true when all memory domain state was tracked in page
tables or MPU registers, but no longer works when arch-specific
memory management information is stored in thread-specific areas.
This is needed for: #13441#13074#15135
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
These turned out to be quite useful when debugging MMU
issues, commit them to the tree. The output format is
virtually the same as gen_mmu_x86.py's verbose output.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Currently page tables have to be re-computed in
an expensive operation on context switch. Here we
reserve some room in the page tables such that
we can have per-thread page table data, which will
be much simpler to update on context switch at
the expense of memory.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Has the same effect of catching stack overflows, but
makes debugging with GDB simpler since we won't get
errors when inspecting such regions. Making these
areas non-present was more than we needed, read-only
is sufficient.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Adapted from similar code in the x86_64 port.
Useful when debugging boot problems on actual x86
hardware if a JTAG isn't handy or feasible.
Turn this on for qemu_x86.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
It looks like, at some point in the past, initializing thread stacks
was the responsibility of the arch layer. After that was centralized,
we forgot to remove the related conditional header inclusion. Fixed.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This is now called z_arch_esf_t, conforming to our naming
convention.
This needs to remain a typedef due to how our offset generation
header mechanism works.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We introduce a new z_fatal_print() API and replace all
occurrences of exception handling code to use it.
This routes messages to the logging subsystem if enabled.
Otherwise, messages are sent to printk().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
* z_NanoFatalErrorHandler() is now moved to common kernel code
and renamed z_fatal_error(). Arches dump arch-specific info
before calling.
* z_SysFatalErrorHandler() is now moved to common kernel code
and renamed k_sys_fatal_error_handler(). It is now much simpler;
the default policy is simply to lock interrupts and halt the system.
If an implementation of this function returns, then the currently
running thread is aborted.
* New arch-specific APIs introduced:
- z_arch_system_halt() simply powers off or halts the system.
* We now have a standard set of fatal exception reason codes,
namespaced under K_ERR_*
* CONFIG_SIMPLE_FATAL_ERROR_HANDLER deleted
* LOG_PANIC() calls moved to k_sys_fatal_error_handler()
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Before, attempting to induce a kernel oops would instead
lead to a general protection fault as the interrupt vector
was at DPL=0.
Now we allow by setting DPL=3. We restrict the allowable
reason codes to either stack overflows or kernel oops; we
don't want user mode to be able to create a kernel panic,
or fake some other kind of exception.
Fixes an issue where the stack canary test case was triggering
a GPF instead of a stack check exception on x86.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
For the x86 architecture the z_arch_float_disable() is only
implemented when building with CONFIG_LAZY_FP_SHARING, so we
make z_arch_float_disable() return -ENOSYS when we build with
FLOAT and FP_SHARING but on an x86 platform where
LAZY_FP_SHARING is not supported.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
This file merely declares external functions referenced only
by ia32/cache.c, so the declarations are inlined instead.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This file was used to generate offsets for host tools that are no
longer in use, so it's removed and the offsets are no longer generated.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Over time, this has been reduced to a few functions dealing solely
with floating-point support, referenced only from core/ia32/float.c.
Thus they are moved into that file and the header is eliminated.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The MVIC is no longer supported, and only the APIC-based interrupt
subsystem remains. Thus this layer of indirection is unnecessary.
This also corrects an oversight left over from the Jailhouse x2APIC
implementation affecting EOI delivery for direct ISRs only.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This header is currently IA32-specific, so move it into the subarch
directory and update references to it.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Making room for the Intel64 subarch in this tree. This header is
32-bit specific and so it's relocated, and references rewritten
to find it in its new location.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This file is 32-bit specific, so it is moved into the ia32/ directory
and references to it are updated accordingly.
Also, SP_ARG* definitions are no longer used, so they are removed.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Eliminate definitions for MSRs that we don't use. Centralize the
definitions for the MSRs that we do use, including their fields.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Work around a testcase problem, where we want to check some
logic for the bounds check bypass mitigation in the common
kernel code. By changing the ifdef to the x86-specific option
for these lfence instructions, we avoid IAMCU build errors
but still test the common code.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
move misc/reboot.h to power/reboot.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
move misc/util.h to sys/util.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
move misc/printk.h to sys/printk.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
move misc/__assert.h to sys/__assert.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
move tracing.h to debug/tracing.h and
create a shim for backward-compatibility.
No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.
Related to #16539
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
This was only enabled by the MVIC, which in turn was only used
by the Quark D2000, which has been removed.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
The Quark D2000 is the only x86 with an MVIC, and since support for
it has been dropped, the interrupt controller is orphaned. Removed.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
These constants do not need global exposure, as they're only
referenced in the reboot API implementation. Also their names
are trimmed to fit into the X86-arch-specific namespace.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This appears to date all the way back to the initial import
and is used in exactly one place if DEBUG is on. Removed.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Previously the existing EFLAGS was used as a base which was
then manipulated accordingly. This is unnecessary as the bits
preserved contain no useful state related to the new thread.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Found a few annoying typos and figured I better run script and
fix anything it can find, here are the results...
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Create source directory for IA32-subarch specific files, and move
qualifying files to that subdirectory.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
This commit adds the architecture-specific implementation
of k_float_disable() for ARM and x86.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
The real-mode startup code is trivially changed to refer to MSR
definitions in include/arch/x86/msr.h, rather than its ad-hoc ones.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Light reorganization. All MSR definitions and manipulation functions
are consolidated into one header. The names are changed to use an
X86_* prefix instead of IA32_* which is misleading/incorrect.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
drivers/interrupt_controller/i8259.c is not a driver; it exists
solely to disable the i8259s when the configuration calls for it.
The six-byte sequence to mask the controllers is moved to crt0.S
and the pseudo-driver is removed.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
A basic display driver is added for a generic 32-bpp framebuffer.
Glue logic is added to the x86 arch to request the intitialization
of a linear framebuffer by the Multiboot loader (GRUB) and connect
it to this generic driver.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
When booting using GRUB, some useful information about the environment
is given to us via a boot information structure. We've not made any
use of this information so far, but the x86 framebuffer driver will.
A skeletal definition of the structure is given, and provisions are
made to preserve its contents at boot if the configuration requires it.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
We do have a multi-architecture latency benchmark now, this one was x86
only, was never used or compiled in and is out-dated.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The only use of the BOOTLOADER_UNKNOWN config option is on x86, where
it controls whether a multiboot header is embedded in the output.
This patch renames the option to be more descriptive, and makes it
an x86-specific option, rather than a Zephyr top-level option.
This also enables X86_MULTIBOOT by default, since the header only
occupies 12-16 bytes of memory and is (almost always) harmless.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Update the name of mem-domain API function to add a partition
so that it complies with the 'z_' prefix convention. Correct
the function documentation.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
It's relatively hard to figure out what thread a crash happens in
from the crash dump. E.g, it's usually not immediately possible to
find it out from linker map due to the fact that static symbols are
not there (https://sourceware.org/bugzilla/show_bug.cgi?id=16566).
So, try to do it as easy if possible, by just printing thread name
in a dump, if thread names are enabled at all.
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
A parallel PCI implementation ("pcie") is added with features for PCIe.
In particular, message-signaled interrupts (MSI) are supported, which
are essential to the use of any non-trivial PCIe device.
The NS16550 UART driver is modified to use pcie.
pcie is a complete replacement for the old PCI support ("pci"). It is
smaller, by an order of magnitude, and cleaner. Both pci and pcie can
(and do) coexist in the same builds, but the intent is to rework any
existing drivers that depend on pci and ultimately remove pci entirely.
This patch is large, but things in mirror are smaller than they appear.
Most of the modified files are configuration-related, and are changed
only slightly to accommodate the modified UART driver.
Deficiencies:
64-bit support is minimal. The code works fine with 64-bit capable
devices, but will not cooperate with MMIO regions (or MSI targets) that
have high bits set. This is not needed on any current boards, and is
unlikely to be needed in the future. Only superficial changes would
be required if we change our minds.
The method specifying PCI endpoints in devicetree is somewhat kludgey.
The "right" way would be to hang PCI devices off a topological tree;
while this would be more aesthetically pleasing, I don't think it's
worth the effort, given our non-standard use of devicetree.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Update the files which contain no license information with the
'Apache-2.0' SPDX license identifier. Many source files in the tree are
missing licensing information, which makes it harder for compliance
tools to determine the correct license.
By default all files without license information are under the default
license of Zephyr, which is Apache version 2.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The results were incorrect because the timer was firing the
interrupts before the measurement was made.
Fixes: GH-14556
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
This macro is slated for complete removal, as it's not possible
on arches with an MPU stack guard to know the true buffer bounds
without also knowing the runtime state of its associated thread.
As removing this completely would be invasive to where we are
in the 1.14 release, demote to a private kernel Z_ API instead.
The current way that the macro is being used internally will
not cause any undue harm, we just don't want any external code
depending on it.
The final work to remove this (and overhaul stack specification in
general) will take place in 1.15 in the context of #14269Fixes: #14766
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Rename reserved function names in arch/ subdirectory. The Python
script gen_priv_stacks.py was updated to follow the 'z_' prefix
naming.
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
The legacy struct s_coopFloatReg was never being used, though it was
an empty struct (not wasting space), some symbols were being generate
for it.
Nevertheless, neither C99 nor C11 allow empty structs, so this
was also a violation to the C standards.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
MISRA defines a serie of essential types, boolean, signed/unsigned
integers, float, ... and operations must respect these essential types.
MISRA-C rule 10.1
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
BIT macro uses an unsigned int avoiding implementation-defined behavior
when shifting signed types.
MISRA-C rule 10.1
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
This adds a compiler option -fno-inline for code coverage on
architectures which supports doing code coverage. This also
modifies the ALWAYS_INLINE macro to not do any inlining. This
needs to be done so code coverage can count the number of
executions to the correct lines.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This commit cleans up names of system power management functions by
assuring that:
- all functions start with 'sys_pm_' prefix
- API functions which should not be exposed to the user start with '_'
- name of the function hints at its purpose
Signed-off-by: Piotr Mienkowski <piotr.mienkowski@gmail.com>
Speculative execution side channel attacks can read the
entire FPU/SIMD register state on affected Intel Core
processors, see CVE-2018-3665.
We now have two options for managing floating point
context between threads on x86: CONFIG_EAGER_FP_SHARING
and CONFIG_LAZY_FP_SHARING.
The mitigation is to unconditionally save/restore these
registers on context switch, instead of the lazy sharing
algorithm used by CONFIG_LAZY_FP_SHARING.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Update reserved function names starting with one underscore, replacing
them as follows:
'_k_' with 'z_'
'_K_' with 'Z_'
'_handler_' with 'z_handl_'
'_Cstart' with 'z_cstart'
'_Swap' with 'z_swap'
This renaming is done on both global and those static function names
in kernel/include and include/. Other static function names in kernel/
are renamed by removing the leading underscore. Other function names
not starting with any prefix listed above are renamed starting with
a 'z_' or 'Z_' prefix.
Function names starting with two or three leading underscores are not
automatcally renamed since these names will collide with the variants
with two or three leading underscores.
Various generator scripts have also been updated as well as perf,
linker and usb files. These are
drivers/serial/uart_handlers.c
include/linker/kobject-text.ld
kernel/include/syscall_handler.h
scripts/gen_kobject_list.py
scripts/gen_syscall_header.py
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
We add two points where we add lfences to disable
speculation:
* In the memory buffer validation code, which takes memory
addresses and sizes from userspace and determins whether
this memory is actually accessible.
* In the system call landing site, after the system call ID
has been validated but before it is used.
Kconfigs have been added to enable these checks if the CPU
is not known to be immune on X86.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
On x86, if a supervisor thread belonging to a memory domain
adds a new partition to that domain, subsequent context switches
to another thread in the same domain, or dropping itself to user
mode, does not have the correct setup in the page tables.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We need a copy of the flags field for ever PTE we are
updating, we can't just keep OR-ing in the address
field.
Fixes issues seen when setting flags for memory regions
larger than a page.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
During speculative execution, non-present pages are treated
as valid, which may expose their contents through side
channels.
Any non-present PTE will now have its address bits zeroed,
such that any speculative reads to them will go to the NULL
page.
The expected hit on performance is so minor that this is
enabled at all times.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Retpolines were never completely implemented, even on x86.
Move this particular Kconfig to only concern itself with
the assembly code, and don't default it on ever since we
prefer SSBD instead.
We can restore the common kernel-wide CONFIG_RETPOLINE once
we have an end-to-end implementation.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
PAE page tables (the only kind we support) have 512
entries per page directory, not 1024.
Fixes: #13838
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is an integral part of userspace and cannot be used
on its own. Fold into the main userspace configuration.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
If the faulting context is in user mode, then we are
not on the same stack due to HW-level stack switching
on privilege elevation, and the faulting ESP is on
the stack itself.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The code did not consider privilege level stack switches.
We have the original stack pointer in the NANO_ESF,
just use that.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We now have a dedicated function to test whether
a memory region is withing the boundary of the
faulting context's stack buffer.
We use this to determine whether a page or double fault
was due to ESP being outside the bounds of the stack,
as well as when unwinding stack frames to print debug
output.
Fixes two issues:
- Stack overflows in user mode being incorrectly reported
as just page fault exceptions
- Exceptions that occur when unwinding corrupted stacks
The type of fault which triggered the stack overflow
logic (double or page fault) is now always shown.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The code wasn't checking if the memory address to check
corresponded to a non-present page directory pointer
table entry.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Upon hard/soft irq or exception entry/exit, handle transitions
off or onto the trampoline stack, which is the only stack that
can be used on the kernel side when the shadow page table
is active. We swap page tables when on this stack.
Adjustments to page tables are now as follows:
- Any adjustments for stack memory access now are always done
to the user page tables
- Any adjustments for memory domains are now always done to
the user page tables
- With KPTI, resetting a page now clears the present bit
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
In the event of a double fault, we do a HW task switch to
a special _df_tss hardware task which resets the stack
pointer to the interrupt stack and otherwise restores
the main hardware task to a runnable state so that
_df_handler_bottom() can run.
However, we need to make sure that _df_handler_bottom()
runs with interrupts locked, otherwise another IRQ could
corrupt the interrupt stack resulting in undefined
behavior.
We have very little stack space to work with in this
context, just zero it. It's a fatal error for the thread
in any event.
Fixes: #7291
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This commit changes the names of SYS_POWER_DEEP_SLEEP* Kconfig
options in order to match SYS_POWER_LOW_POWER_STATE* naming
scheme.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
At boot, user threads were being granted access to the entire
app shared memory section. This is incorrect; user threads should
have no access until they are added to a memory domain, which
may contain partitions defined within it.
Change from MMU_ENTRY_USER (which grants permission at boot)
to MMU_ENTRY_RUNTIME_USER (which indicates that the pages may
be granted to user mode at runtime, but not at boot).
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We want a _Swap() variant that can atomically release/restore a
spinlock state in addition to the legacy irqlock. The function as it
was is now named "_Swap_irqlock()", while _Swap() now refers to a
spinlock and takes two arguments. The former will be going away once
existing users (not that many! Swap() is an internal API, and the
long port away from legacy irqlocking is going to be happening mostly
in drivers) are ported to spinlocks.
Obviously on uniprocessor setups, these produce identical code. But
SMP requires that the correct API be used to maintain the global lock.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This was never a long-term solution, more of a gross hack
to get test cases working until we could figure out a good
end-to-end solution for memory domains that generated
appropriate linker sections. Now that we have this with
the app shared memory feature, and have converted all tests
to remove it, delete this feature.
To date all userspace APIs have been tagged as 'experimental'
which sidesteps deprecation policies.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This diverges from policy for all of our other arches
and C libraries. Global access to the malloc arena
may not be desirable.
Forthcoming patch will expose, for all C libraries, a
k_mem_partition with the malloc arena which can be
added to domains as desired.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is a separate data section which needs to be copied into
RAM.
Most arches just use the kernel's _data_copy(), but x86 has its
own optimized copying code.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
PAE tables introduce the NX bit which is very desirable
from a security perspetive, back in 1995.
PAE tables are larger, but we are not targeting x86 memory
protection for RAM constrained devices.
Remove the old style 32-bit tables to make the x86 port
easier to maintain.
Renamed some verbosely named data structures, and fixed
incorrect number of entries for the page directory
pointer table.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This patch adds all the required hooks needed in the kernel to
get the coverage reports from x86 SoCs.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
The operation was shifiting bit using a signed constant in the left
operand. Use BIT macro to do it properly.
MISRA-C rule 12.2
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
In C90 was introduced function prototype, that allows argument types
to be checked against parameter types, though it is not necessary
specify names for the parameters. MISRA-C requires names for function
prototype parameters, it claims that names can provide useful
information regarding the function interface.
MISRA-C rule 8.2
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
When __ASSERT is not enabled there is an attribution to the variable
total_partitions and it is never used.
MISRA-C rule 2.2
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
There is a function called _thread_entry defined in
lib/thread_entry.c. Just changing name to fix MISRA-C violation.
MISRA-C rule 5.8
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Previously, this was only built if CONFIG_EXCEPTION_DEBUG
was enabled, but CONFIG_USERSPACE needs it too for validating
strings sent in from user mode.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
If dynamic interrupts are enabled, a set of trampoline stubs
are generated which transfer control to a common dynamic
interrupt handler function, which then looks up the proper
handler and parameter and then executes the interrupt.
Based on the prior x86 dynamic interrupt implementation which
was removed from the kernel some time ago, and adapted to
changes in the common interrupt handling code, build system,
and IDT generation tools.
An alternative approach could be to read the currently executing
vector out of the APIC, but this is a much slower operation.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
struct k_thread already has a pointer type k_tid_t, there is no need for
this definition to tcs.
Less symbols/names make the code cleaner and more readable.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Instead of checking every time we hit the low-level context switch
path to see if the new thread has a "partner" with which it needs to
share time, just run the slice timer always and reset it from the
scheduler at the points where it has already decided a switch needs to
happen. In TICKLESS_KERNEL situations, we pay the cost of extra timer
interrupts at ~10Hz or whatever, which is low (note also that this
kind of regular wakeup architecture is required on SMP anyway so the
scheduler can "notice" threads scheduled by other CPUs). Advantages:
1. Much simpler logic. Significantly smaller code. No variance or
dependence on tickless modes or timer driver (beyond setting a
simple timeout).
2. No arch-specific assembly integration with _Swap() needed
3. Better performance on many workloads, as the accounting now happens
at most once per timer interrupt (~5 Hz) and true rescheduling and
not on every unrelated context switch and interrupt return.
4. It's SMP-safe. The previous scheme kept the slice ticks as a
global variable, which was an unnoticed bug.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
MISRA-C requires that all declarations of a specific function, or
object, use the same names and type qualifiers.
MISRA-C rule 8.3
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Under GNU C, sizeof(void) = 1. This commit merely makes it explicit u8.
Pointer arithmetics over void types is:
* A GNU C extension
* Not supported by Clang
* Illegal across all ISO C standards
See also: https://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html
Signed-off-by: Mark Ruvald Pedersen <mped@oticon.com>
Added LOG_PANIC to fault handlers to ensure that log is flush and
logger processes messages in a blocking way in fault handler.
Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
_k_syscall_table is an array of function pointers and is declared as
such in C sources, this makes it an STT_OBJECT[0] in the symbol
table. But when the same symbol is declared in assembly, it is
declared to be a function, which would make the symbol an STT_FUNC.
When linking with LTO this type inconsistency results in the warning:
real-ld: Warning: type of symbol `_k_syscall_table' changed from 2 to
1 in /tmp/cc84ofK0.ltrans8.ltrans.o
To fix this warning we declare the table with GDATA instead of GTEXT,
which will change the type from 'function' to 'object'.
[0]
https://docs.oracle.com/cd/E19455-01/816-0559/chapter6-79797/index.html
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>
Move to more generic tracing hooks that can be implemented in different
ways and do not interfere with the kernel.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Define generic interface and hooks for tracing to replace
kernel_event_logger and existing tracing facilities with something more
common.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
irq_lock returns an unsigned int, though, several places was using
signed int. This commit fix this behaviour.
In order to avoid this error happens again, a coccinelle script was
added and can be used to check violations.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
There exist two symbols that became equivalent when PR #9383 was
merged; _SYSCALL_LIMIT and K_SYSCALL_LIMIT. This patch deprecates the
redundant _SYSCALL_LIMIT symbol.
_SYSCALL_LIMIT was initally introduced because before PR #9383 was
merged K_SYSCALL_LIMIT was an enum, which couldn't be included into
assembly files. PR #9383 converted it into a define, which can be
included into assembly files, making _SYSCALL_LIMIT redundant.
Likewise for _SYSCALL_BAD.
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>
Consistently use
config FOO
bool/int/hex/string "Prompt text"
instead of
config FOO
bool/int/hex/string
prompt "Prompt text"
(...and a bunch of other variations that e.g. swapped the order of the
type and the 'prompt', or put other properties between them).
The shorthand is fully equivalent to using 'prompt'. It saves lines and
avoids tricking people into thinking there is some semantic difference.
Most of the grunt work was done by a modified version of
https://unix.stackexchange.com/questions/26284/how-can-i-use-sed-to-replace-a-multi-line-string/26290#26290, but some
of the rarer variations had to be converted manually.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
Split out the arch specific syscall code to reduce include pollution
from other arch related headers. For example on ARM its possible to get
errno.h included via SoC specific headers. Which created an interesting
compile issue because of the order of syscall & errno/errno syscall
inclusion.
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
Uses fixup infrastructure to safely abort if we get a page
fault while measuring a string passed in from user mode.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Summary: revised attempt at addressing issue 6290. The
following provides an alternative to using
CONFIG_APPLICATION_MEMORY by compartmentalizing data into
Memory Domains. Dependent on MPU limitations, supports
compartmentalized Memory Domains for 1...N logical
applications. This is considered an initial attempt at
designing flexible compartmentalized Memory Domains for
multiple logical applications and, with the provided python
script and edited CMakeLists.txt, provides support for power
of 2 aligned MPU architectures.
Overview: The current patch uses qualifiers to group data into
subsections. The qualifier usage allows for dynamic subsection
creation and affords the developer a large amount of flexibility
in the grouping, naming, and size of the resulting partitions and
domains that are built on these subsections. By additional macro
calls, functions are created that help calculate the size,
address, and permissions for the subsections and enable the
developer to control application data in specified partitions and
memory domains.
Background: Initial attempts focused on creating a single
section in the linker script that then contained internally
grouped variables/data to allow MPU/MMU alignment and protection.
This did not provide additional functionality beyond
CONFIG_APPLICATION_MEMORY as we were unable to reliably group
data or determine their grouping via exported linker symbols.
Thus, the resulting decision was made to dynamically create
subsections using the current qualifier method. An attempt to
group the data by object file was tested, but found that this
broke applications such as ztest where two object files are
created: ztest and main. This also creates an issue of grouping
the two object files together in the same memory domain while
also allowing for compartmenting other data among threads.
Because it is not possible to know a) the name of the partition
and thus the symbol in the linker, b) the size of all the data
in the subsection, nor c) the overall number of partitions
created by the developer, it was not feasible to align the
subsections at compile time without using dynamically generated
linker script for MPU architectures requiring power of 2
alignment.
In order to provide support for MPU architectures that require a
power of 2 alignment, a python script is run at build prior to
when linker_priv_stacks.cmd is generated. This script scans the
built object files for all possible partitions and the names given
to them. It then generates a linker file (app_smem.ld) that is
included in the main linker.ld file. This app_smem.ld allows the
compiler and linker to then create each subsection and align to
the next power of 2.
Usage:
- Requires: app_memory/app_memdomain.h .
- _app_dmem(id) marks a variable to be placed into a data
section for memory partition id.
- _app_bmem(id) marks a variable to be placed into a bss
section for memory partition id.
- These are seen in the linker.map as "data_smem_id" and
"data_smem_idb".
- To create a k_mem_partition, call the macro
app_mem_partition(part0) where "part0" is the name then used to
refer to that partition. This macro only creates a function and
necessary data structures for the later "initialization".
- To create a memory domain for the partition, the macro
app_mem_domain(dom0) is called where "dom0" is the name then
used for the memory domain.
- To initialize the partition (effectively adding the partition
to a linked list), init_part_part0() is called. This is followed
by init_app_memory(), which walks all partitions in the linked
list and calculates the sizes for each partition.
- Once the partition is initialized, the domain can be
initialized with init_domain_dom0(part0) which initializes the
domain with partition part0.
- After the domain has been initialized, the current thread
can be added using add_thread_dom0(k_current_get()).
- The code used in ztests ans kernel/init has been added under
a conditional #ifdef to isolate the code from other tests.
The userspace test CMakeLists.txt file has commands to insert
the CONFIG_APP_SHARED_MEM definition into the required build
targets.
Example:
/* create partition at top of file outside functions */
app_mem_partition(part0);
/* create domain */
app_mem_domain(dom0);
_app_dmem(dom0) int var1;
_app_bmem(dom0) static volatile int var2;
int main()
{
init_part_part0();
init_app_memory();
init_domain_dom0(part0);
add_thread_dom0(k_current_get());
...
}
- If multiple partitions are being created, a variadic
preprocessor macro can be used as provided in
app_macro_support.h:
FOR_EACH(app_mem_partition, part0, part1, part2);
or, for multiple domains, similarly:
FOR_EACH(app_mem_domain, dom0, dom1);
Similarly, the init_part_* can also be used in the macro:
FOR_EACH(init_part, part0, part1, part2);
Testing:
- This has been successfully tested on qemu_x86 and the
ARM frdm_k64f board. It compiles and builds power of 2
aligned subsections for the linker script on the 96b_carbon
boards. These power of 2 alignments have been checked by
hand and are viewable in the zephyr.map file that is
produced during build. However, due to a shortage of
available MPU regions on the 96b_carbon board, we are unable
to test this.
- When run on the 96b_carbon board, the test suite will
enter execution, but each individaul test will fail due to
an MPU FAULT. This is expected as the required number of
MPU regions exceeds the number allowed due to the static
allocation. As the MPU driver does not detect this issue,
the fault occurs because the data being accessed has been
placed outside the active MPU region.
- This now compiles successfully for the ARC boards
em_starterkit_em7d and em_starterkit_em7d_v22. However,
as we lack ARC hardware to run this build on, we are unable
to test this build.
Current known issues:
1) While the script and edited CMakeLists.txt creates the
ability to align to the next power of 2, this does not
address the shortage of available MPU regions on certain
devices (e.g. 96b_carbon). In testing the APB and PPB
regions were commented out.
2) checkpatch.pl lists several issues regarding the
following:
a) Complex macros. The FOR_EACH macros as defined in
app_macro_support.h are listed as complex macros needing
parentheses. Adding parentheses breaks their
functionality, and we have otherwise been unable to
resolve the reported error.
b) __aligned() preferred. The _app_dmem_pad() and
_app_bmem_pad() macros give warnings that __aligned()
is preferred. Prior iterations had this implementation,
which resulted in errors due to "complex macros".
c) Trailing semicolon. The macro init_part(name) has
a trailing semicolon as the semicolon is needed for the
inlined macro call that is generated when this macro
expands.
Update: updated to alternative CONFIG_APPLCATION_MEMORY.
Added config option CONFIG_APP_SHARED_MEM to enable a new section
app_smem to contain the shared memory component. This commit
seperates the Kconfig definition from the definition used for the
conditional code. The change is in response to changes in the
way the build system treats definitions. The python script used
to generate a linker script for app_smem was also midified to
simplify the alignment directives. A default linker script
app_smem.ld was added to remove the conditional includes dependency
on CONFIG_APP_SHARED_MEM. By addining the default linker script
the prebuild stages link properly prior to the python script running
Signed-off-by: Joshua Domagalski <jedomag@tycho.nsa.gov>
Signed-off-by: Shawn Mosley <smmosle@tycho.nsa.gov>
Add an LLVM backend and a clang toolchain variant to support building
with llvm coming with popular Linux distributions.
This has been tested with X86 boards:
- quark_d2000_crb
- quark_se_c1000_devboard/Arduino 101
Use:
export ZEPHYR_TOOLCHAIN_VARIANT=clang
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Bool symbols implicitly default to 'n'.
A 'default n' can make sense e.g. in a Kconfig.defconfig file, if you
want to override a 'default y' on the base definition of the symbol. It
isn't used like that on any of these symbols though.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
intList has been populated with the number of isrs, aka interrupts,
but nothing has not been using this information so we drop it and
everything used to construct it.
Signed-off-by: Sebastian Bøe <sebastian.boe@nordicsemi.no>
The original implementation of CONFIG_THREAD_MONITOR would
try to leverage a thread's initial stack layout to provide
the entry function with arguments for any given thread.
This is problematic:
- Some arches do not have a initial stack layout suitable for
this
- Some arches never enabled this at all (riscv32, nios2)
- Some arches did not enable this properly
- Dropping to user mode would erase or provide incorrect
information.
Just spend a few extra bytes to store this stuff directly
in the k_thread struct and get rid of all the arch-specific
code for this.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Setting bit CR0.WP (bit 16) will inhibit supervisor threads from
writing to RO pages. It's a necessary flag to be set, and the constant
name CR0_PAGING_ENABLE didn't reflect the fact that the 16th bit was
being set.
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
The metairq feature exposed the fact that all of our arch code (and a
few mistaken spots in the scheduler too) was trying to interpret
"preemptible" threads independently.
As of the scheduler rewrite, that logic is entirely within sched.c and
doing it externally is redundant. And now that "cooperative" threads
can be preempted, it's wrong and produces test failures when used with
metairq threads.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
In order to mitigate against Spectre V4, add an option that will, at
boot time, verify if the CPU supports the SPEC_CTRL MSR; if so, it'll
attempt to disable the feature.
More information can be found in chapter 4 (Speculative Store Bypass
Mitigation) of the "Speculative Execution Side Channel Mitigations"
document, version 2, published by Intel: https://goo.gl/nocTcj
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
Rename _MsrRead() and _MsrWrite() to _x86_msr_read() and
_x86_msr_write() respectively.
Given that these functions are essentially implemented in assembly.
make them static inline. They can be inlined by the compiler quite
well, most of the time incurring in space savings due to better
handling of the cobbled registers.
Also simplifies the inline assembly, using constraints instead of
moving registers ourselves. Should shave off a few bytes from code
using these functions.
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
Normally a syscall would check the current privilege level and then
decide to go to _impl_<syscall> directly or go through a
_handler_<syscall>.
__ZEPHYR_SUPERVISOR__ is a compiler optimization flag which will
make all the system calls from the arch files directly link
to the _impl_<syscall>. Thereby reducing the overhead of checking the
privileges.
In the previous implementation all the source files would be compiled
by zephyr_source() rule. This means that zephyr_* is a catchall CMake
library for source files that can be built purely with the include
paths, defines, and other compiler flags that all zephyr source
files uses. This states that adding one extra compiler flag for only
one complete directory would fail.
This limitation can be overcome by using zephyr_libray* APIs. This
creates a library for the required directories and it also supports
directory level properties.
Hence we use zephyr_library* to create a new library with
macro _ZEPHYR_SUPERVISOR_ for the optimization.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
patch removes Kconfig defines for RAM and ROM size in x86. Instead
these values are derived from dts.
Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com>
MPU devices that enforce power-of-two alignment now
specify the size of the buffer used for the newlib heap.
This buffer will be properly aligned and a pointer
exposed in a kernel header, such that it can be added
to a user thread's memory domain configuration if
necessary.
MPU devices that don't have these restrictions allocate
the heap as normal.
In all cases, if an MPU/MMU region needs to be programmed,
the z_newlib_get_heap_bounds() API will return the necessary
information.
Given how precious MPU regions are, no automatic programming
of the MPU is done; applications will need to do this as
needed in their memory domain configurations.
On x86, the x86 MMU-specific code has been moved to arch/x86
using the new z_newlib_get_heap_bounds() API.
Fixes: #6814
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
In order to mitigate Spectre variant 2 (branch target injection), use
retpolines for indirect jumps and calls.
The newly-added hidden CONFIG_X86_NO_SPECTRE flag, which is disabled
by default, must be set by a x86 SoC if its CPU performs speculative
execution. Most targets supported by Zephyr do not, so this is
set to "y" by default.
A new setting, CONFIG_RETPOLINE, has been added to the "Security
Options" sections, and that will be enabled by default if
CONFIG_X86_NO_SPECTRE is disabled.
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
If we enable CONFIG_DEBUG_INFO, then we need to fixup the stack
on thread entry so that the EFLAGS value in the EBP slot doesn't
confuse the debugger or any runtime stack unwinding code.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The comment was obsolete; we simply do not allow use of the FPU or
vector math in ISRs. There is no desire to add such support, doing
this is properly offloaded to a worker thread.
Fixes#5283.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The xtensa-asm2 work included a patch that added nano_internal.h
includes in lots of places that needed to have _Swap defined, because
it had to break a cycle and this no longer got pulled in from the arch
headers.
Unfortunately those new includes created new and more amusing cycles
elsewhere which led to breakage on other platforms.
Break out the _Swap definition (only) into a separate header and use
that instead. Cleaner. Seems not to have any more hidden gotchas.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
During compile of lwm2m_client using qemu_x86, the following build
warning was noticed:
zephyr/arch/x86/core/excstub.S:132:2: warning: "/*" within comment [-Wcomment]
/*
In commit ff42bdd0a0 ("debug: remove option GDB_INFO"), the comment tag
was omitted. Fix the comment end tag.
Signed-off-by: Michael Scott <michael@opensourcefoundries.com>
This feature is X86 only and is not used or being tested. It is legacy
feature and no one can prove it actually works. Remove it until we have
proper documentation and samples and multi architecture support.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
This feature is X86 only and is not used or being tested. It is legacy
feature and no one can prove it actually works. Remove it until we have
proper documentation and samples and multi architecture support.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
when a current thread is added to a memory domain the pages/sections
must be configured immediately.
A problem occurs when we add a thread to current and then drop
down to usermode. In such a case memory domain will become active
the next time a swap occurs.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Add an architecure specfic code for the memory domain
configuration. This is needed to support a memory domain API
k_mem_domain_add_thread.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Save the required scratch pad register (in this case only edx)
before calling the C function.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Introducing CMake is an important step in a larger effort to make
Zephyr easy to use for application developers working on different
platforms with different development environment needs.
Simplified, this change retains Kconfig as-is, and replaces all
Makefiles with CMakeLists.txt. The DSL-like Make language that KBuild
offers is replaced by a set of CMake extentions. These extentions have
either provided simple one-to-one translations of KBuild features or
introduced new concepts that replace KBuild concepts.
This is a breaking change for existing test infrastructure and build
scripts that are maintained out-of-tree. But for FW itself, no porting
should be necessary.
For users that just want to continue their work with minimal
disruption the following should suffice:
Install CMake 3.8.2+
Port any out-of-tree Makefiles to CMake.
Learn the absolute minimum about the new command line interface:
$ cd samples/hello_world
$ mkdir build && cd build
$ cmake -DBOARD=nrf52_pca10040 ..
$ cd build
$ make
PR: zephyrproject-rtos#4692
docs: http://docs.zephyrproject.org/getting_started/getting_started.html
Signed-off-by: Sebastian Boe <sebastian.boe@nordicsemi.no>
During swap the required page tables are configured. The outgoing
thread's memory domain pages are reset and the incoming thread's
memory domain is loaded. The pages are configured if userspace
is enabled and if memory domain has been initialized before
calling swap.
GH-3852
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
This is intended for memory-constrained systems and will save
4K per thread, since we will no longer reserve room for or
activate a kernel stack guard page.
If CONFIG_USERSPACE is enabled, stack overflows will still be
caught in some situations:
1) User mode threads overflowing stack, since it crashes into the
kernel stack page
2) Supervisor mode threads overflowing stack, since the kernel
stack page is marked non-present for non-user threads
Stack overflows will not be caught:
1) When handling a system call
2) When the interrupt stack overflows
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is an introductory port for Zephyr to be run as a Jailhouse
hypervisor[1]'s "inmate cell", on x86 64-bit CPUs (running on 32-bit
mode). This was tested with their "tiny-demo" inmate demo cell
configuration, which takes one of the CPUs of the QEMU-VM root cell
config, along with some RAM and serial controller access (it will even
do nice things like reserving some L3 cache for it via Intel CAT) and
Zephyr samples:
- hello_world
- philosophers
- synchronization
The final binary receives an additional boot sequence preamble that
conforms to Jailhouse's expectations (starts at 0x0 in real mode). It
will put the processor in 32-bit protected mode and then proceed to
Zephyr's __start function.
Testing it is just a matter of:
$ mmake -C samples/<sample_dir> BOARD=x86_jailhouse JAILHOUSE_QEMU_IMG_FILE=<path_to_image.qcow2> run
$ sudo insmod <path to jailhouse.ko>
$ sudo jailhouse enable <path to configs/qemu-x86.cell>
$ sudo jailhouse cell create <path to configs/tiny-demo.cell>
$ sudo mount -t 9p -o trans/virtio host /mnt
$ sudo jailhouse cell load tiny-demo /mnt/zephyr.bin
$ sudo jailhouse cell start tiny-demo
$ sudo jailhouse cell destroy tiny-demo
$ sudo jailhouse disable
$ sudo rmmod jailhouse
For the hello_world demo case, one should then get QEMU's serial port
output similar to:
"""
Created cell "tiny-demo"
Page pool usage after cell creation: mem 275/1480, remap 65607/131072
Cell "tiny-demo" can be loaded
CPU 3 received SIPI, vector 100
Started cell "tiny-demo"
***** BOOTING ZEPHYR OS v1.9.0 - BUILD: Sep 12 2017 20:03:22 *****
Hello World! x86
"""
Note that the Jailhouse's root cell *has to be started in xAPIC
mode* (kernel command line argument 'nox2apic') in order for this to
work. x2APIC support and its reasoning will come on a separate commit.
As a reminder, the make run target introduced for x86_jailhouse board
involves a root cell image with Jailhouse in it, to be launched and then
partitioned (with >= 2 64-bit CPUs in it).
Inmate cell configs with no JAILHOUSE_CELL_PASSIVE_COMMREG flag
set (e.g. apic-demo one) would need extra code in Zephyr to deal with
cell shutdown command responses from the hypervisor.
You may want to fine tune CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC for your
specific CPU—there is no detection from Zephyr with regard to that.
Other config differences from pristine QEMU defaults worth of mention
are:
- there is no HPET when running as Jailhouse guest. We use the LOAPIC
timer, instead
- there is no PIC_DISABLE, because there is no 8259A PIC when running
as a Jailhouse guest
- XIP makes no sense also when running as Jailhouse guest, and both
PHYS_RAM_ADDR/PHYS_LOAD_ADD are set to zero, what tiny-demo cell
config is set to
This opens up new possibilities for Zephyr, so that usages beyond just
MCUs come to the table. I see special demand coming from
functional-safety related use cases on industry, automotive, etc.
[1] https://github.com/siemens/jailhouse
Reference to Jailhouse's booting preamble code:
Origin: Jailhouse
License: BSD 2-Clause
URL: https://github.com/siemens/jailhouse
commit: 607251b44397666a3cbbf859d784dccf20aba016
Purpose: Dual-licensing of inmate lib code
Maintained-by: Zephyr
Signed-off-by: Gustavo Lima Chaves <gustavo.lima.chaves@intel.com>
In PAE boot tables the __mmu_tables_start points to page directory
pointer (PDPT). Enable the PAE by updating the CR4.PAE and
IA32_EFER.NXE bits.
JIRA:ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
Created structures and unions needed to enable the software to
access these tables.
Also updated the helper macros to ease the usage of the MMU page
tables.
JIRA: ZEP-2511
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>