As we have removed MPU_STACK_GUARD for ARC_MPU_VER 2, we also
need to remove ARCH_HAS_STACK_PROTECTION for boards with
ARC_MPU_VER 2 and no hardware stack checking. Related commit:
commit (arch: arc: remove MPU_STACK_GUARD for ARC_MPU_VER 2)
in pull request #24021
Signed-off-by: Watson Zeng <zhiwei@synopsys.com>
Since the removal of Quark-based boards, there are no users of
Minute-IA. Also, the generic x86 SoC is not exactly Minute-IA,
so change it to use the fairly safe CPU_ATOM instead.
Fixes #14442
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Can only be written at the highest Exception level implemented.
For example, if EL3 is the highest implemented Exception level,
CNTFRQ_EL0 can only be written at EL3.
Also move z_arm64_el_highest_plat_init so that it is only called when
running at the highest implemented Exception level.
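A minimal sketch of the intended gating, assuming a hypothetical
is_el_highest() helper and a platform-provided frequency value:

    /* Program the counter frequency only at the highest implemented
     * EL; at any lower EL the write has no architectural effect.
     */
    if (is_el_highest()) {
        uint64_t freq = plat_get_cntfrq();   /* hypothetical helper */

        __asm__ volatile("msr cntfrq_el0, %0" : : "r" (freq));
        z_arm64_el_highest_plat_init();
    }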
Signed-off-by: Peng Fan <peng.fan@nxp.com>
This patch adds the code managing the syscalls. The privileged stack
is set up before jumping into the real syscall.
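A conceptual C sketch of the dispatch step (names illustrative; the
real implementation is in assembly):

    typedef uintptr_t (*syscall_fn_t)(uintptr_t a1, uintptr_t a2);
    extern const syscall_fn_t syscall_table[];   /* per-ID handlers */
    extern const uintptr_t syscall_limit;        /* number of entries */

    /* Entered with SP already switched to the thread's privileged
     * stack; validate the ID, then jump into the real syscall.
     */
    static uintptr_t dispatch(uintptr_t a1, uintptr_t a2, uintptr_t id)
    {
        if (id >= syscall_limit) {
            id = 0;   /* illustrative "bad syscall" slot */
        }
        return syscall_table[id](a1, a2);
    }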
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
This leverages the AT (address translation) instruction to test for
a given access permission. The result is then provided in the PAR_EL1
register.
Thanks to @jharris-intel for the suggestion.
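A hedged sketch of the idea: AT S1E0R performs a stage-1 EL0 read
translation, and PAR_EL1.F (bit 0) is set on failure.

    /* Return true if EL0 could read the given address. */
    static bool user_can_read(const void *addr)
    {
        uint64_t par;

        __asm__ volatile("at s1e0r, %1; isb; mrs %0, par_el1"
                         : "=r" (par) : "r" (addr) : "memory");
        return (par & 1) == 0;
    }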
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Introduce the arch_user_string_nlen() assembly routine and the necessary
C code bits.
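For reference, the C-side contract of the routine is the standard arch
interface signature, used roughly like this:

    /* given: const char *user_str, size_t maxlen */
    int err;
    size_t len = arch_user_string_nlen(user_str, maxlen, &err);

    if (err != 0) {
        /* the memory was unmapped or not user-readable */
    }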
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
User mode is only allowed to induce oopses and stack check failures via
software-triggered system fatal exceptions.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
The arch_is_user_context() function relies on the content of the
tpidrro_el0 register to determine whether we are in user context or not.
This register is set to '1' when in EL1 and set back to '0' when user
threads are running in userspace.
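A minimal sketch of the resulting check under the scheme above (not
necessarily the exact Zephyr code):

    static inline bool arch_is_user_context(void)
    {
        uint64_t tpidrro;

        /* reads as 0 only while a user thread runs in EL0 */
        __asm__ volatile("mrs %0, tpidrro_el0" : "=r" (tpidrro));
        return tpidrro == 0U;
    }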
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Introduce the first pieces needed to schedule user threads by defining
two different code paths for kernel and user threads.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
If EL2 is implemented but we're skipping EL2, we should still
do EL2 init. Otherwise we end up with a bunch of things still
at their (unknown) reset values.
This in particular causes problems when different
cores have different virtual timer offsets.
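For instance, the virtual timer offset resets to an UNKNOWN value, so
something along these lines must run even when dropping straight to
EL1 (a sketch, not the full init sequence):

    /* Zero CNTVOFF_EL2 so all cores agree on the virtual count. */
    __asm__ volatile("msr cntvoff_el2, xzr");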
Signed-off-by: James Harris <james.harris@intel.com>
There are several issues with the current implementation of the
{inc,dec}_nest_counter macros.
The first problem is that it's internally using a call to a misplaced
function called z_arm64_curr_cpu() (for some unknown reason hosted in
irq_manage.c) that could potentially clobber the caller-saved registers
without any notice to the user of the macro.
The second problem is that, since these are macros, the clobbered
registers should be specified at the calling site; this is not
possible with the current implementation.
To fix these issues and make the call quicker, this patch rewrites the
code in assembly leveraging the availability of the _curr_cpu array. It
now clobbers only two registers passed from the calling site.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Null-pointer exception detection using DWT is currently incompatible
with the openocd runner's current default implementation, which leaves
debug mode on by default.
As a consequence, on all targets that use the openocd runner,
null-pointer exception detection using DWT will generate an assert,
and all tests fail on such platforms.
Disable this until the openocd behavior is fixed (#32984) and enable
the MPU-based solution for now.
Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org>
When we reach this code in interrupt context, our upper GPRs contain a
cross-stack call that may still include some registers from the
interrupted thread. Those need to go out to memory before we can do
our cache coherence dance here.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Both new thread creation and context switch had the same mistake in
cache management: the bottom of the stack (the "unused" region between
the lower memory bound and the live stack pointer) needs to be
invalidated before we switch, because otherwise any dirty lines we
might have left over can get flushed out on top of the same thread on
another CPU that is putting live data there.
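A hedged sketch of the needed invalidation, using the stack_info
bounds from the thread struct and the cache-invalidate helper from the
cache layer introduced later in this series:

    /* given: struct k_thread *thread, and sp = its saved stack pointer.
     * Drop any dirty lines in the dead region below the live SP so
     * they cannot be written back over another CPU's live data.
     */
    void *bottom = (void *)thread->stack_info.start;
    size_t dead = (uintptr_t)sp - thread->stack_info.start;

    z_xtensa_cache_inv(bottom, dead);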
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The Xtensa L1 cache layer has straightforward semantics accessible
via single instructions that operate on cache lines via physical
addresses. These are very amenable to inlining.
Unfortunately the Xtensa HAL layer requires function calls to do this,
leading to significant code waste at the calling site, an extra frame
on the stack and needless runtime instructions for situations where
the call is over a constant region that could elide the loop. This is
made even worse because the HAL library is not built with
-ffunction-sections, so pulling in even one of these tiny cache
functions has the effect of importing a 1500-byte object file into the
link!
Add our own tiny cache layer to include/arch/xtensa/cache.h and use
that instead.
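A minimal sketch of the kind of inline this enables, assuming the
XCHAL_DCACHE_LINESIZE constant from the core headers:

    /* Write back every dirty line covering [addr, addr + bytes). */
    static inline void cache_flush_sketch(void *addr, size_t bytes)
    {
        const size_t step = XCHAL_DCACHE_LINESIZE;
        char *line = (char *)((uintptr_t)addr & ~(step - 1));
        char *end = (char *)addr + bytes;

        for (; line < end; line += step) {
            /* dhwb: data cache hit writeback */
            __asm__ volatile("dhwb %0, 0" : : "r" (line) : "memory");
        }
    }

With a constant region the compiler can unroll or elide the loop
entirely, which is exactly what the function-call HAL interface
prevented.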
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Back when I started work on this stuff, I had a set of notes on
register windows that slowly evolved into something that looks like
formal documentation. There really isn't any overview-style
documentation of this stuff on the public internet, so it couldn't
hurt to commit it here for posterity.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Instead of passing the crt1 _start function as the entry code for
auxiliary CPUs, use a tiny assembly stub instead which can avoid the
runtime testing needed to skip the work in _start. All the crt1 code
was doing was clearing BSS (which must not happen on a second CPU) and
setting the stack pointer (which is wrong on the second CPU).
This allows us to clean out the SMP code in crt1.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The kernel passes the CPU's interrupt stack expecting that the CPU
will start on it, so do that. Pass the initial stack pointer from the SOC
layer in the variable "z_mp_stack_top" and set it in the assembly
startup before calling z_mp_entry().
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The xtensa atomics layer was written with hand-coded assembly that had
to be called as functions. That's needlessly slow, given that the low
level primitives are a two-instruction sequence. Ideally the compiler
should see this as an inline to permit it to better optimize around
the needed barriers.
There was also a bug with the atomic_cas function, which had a loop
internally instead of returning the old value synchronously on a
failed swap. That's benign right now because our existing spin lock
does nothing but retry it in a tight loop anyway, but it's incorrect
per spec and would have caused a contention hang with more elaborate
algorithms (for example a spinlock with backoff semantics).
Remove the old implementation and replace with a much smaller inline C
one based on just two assembly primitives.
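The core primitive looks roughly like this: a sketch of the SCOMPARE1
+ S32C1I pair, where S32C1I stores only if memory matches SCOMPARE1
and always returns the old memory value in the register.

    static inline int32_t xtensa_cas(int32_t *addr, int32_t oldval,
                                     int32_t newval)
    {
        /* Two instructions: set the compare value, then do the
         * conditional store; 'newval' comes back holding the value
         * that was actually in memory.
         */
        __asm__ volatile("wsr %1, SCOMPARE1; s32c1i %0, %2, 0"
                         : "+r" (newval)
                         : "r" (oldval), "r" (addr)
                         : "memory");
        return newval;
    }

A failed swap is reported synchronously via the return value, which is
what the old looping implementation got wrong.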
This patch also contains a little bit of refactoring: each atomic
implementation scheme has been split out into a separate header, and
the ATOMIC_OPERATIONS_CUSTOM kconfig has been renamed to
ATOMIC_OPERATIONS_ARCH to better capture what it means.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was a bunch of dead historical cruft floating around in the
arch/xtensa tree, left over from older code versions. It's time to do
a cleanup pass. This is entirely refactoring and size optimization,
no behavior changes on any in-tree devices should be present.
Among the more notable changes:
+ xtensa_context.h offered an elaborate API to deal with a stack frame
and context layout that we no longer use.
+ xtensa_rtos.h was entirely dead code
+ xtensa_timer.h was a parallel abstraction layer implementing in the
architecture layer what we're already doing in our timer driver.
+ The architecture thread structs (_callee_saved and _thread_arch)
aren't used by current code, and had dead fields that were removed.
Unfortunately for standards compliance and C++ compatibility it's
not possible to leave an empty struct here, so they have a single
byte field.
+ xtensa_api.h was really just some interrupt management inlines used
by irq.h, so fold that code into the outer header.
+ Remove the stale assembly offsets. This architecture doesn't use
that facility.
All told, more than a thousand lines have been removed. Not bad.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
With _kernel_offset_to_nested, we are only able to access the nested
counter of the first CPU. Since we are going to support SMP, we need
to access the per-CPU nested counter.
To get the current CPU, introduce z_arm64_curr_cpu for assembly usage,
because arch_curr_cpu() cannot be used from assembly code.
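The helper itself can be a trivial C wrapper (a sketch;
arch_curr_cpu() is the inline that assembly cannot call directly):

    struct _cpu *z_arm64_curr_cpu(void)
    {
        return arch_curr_cpu();
    }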
Signed-off-by: Peng Fan <peng.fan@nxp.com>
This patch adds a weak sys_arch_reboot() function to avoid build
errors with CONFIG_REBOOT=y. Some SoCs already have their own reboot
function, but others (e.g. QEMU boards) faced a build error.
- openisa_rv32m1: no change
- riscv-ite: its function did nothing, so remove it and use the
  arch/riscv function
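The fallback is a no-op with the usual signature (a sketch):

    /* Weak default so CONFIG_REBOOT=y links on SoCs without their
     * own implementation; does nothing.
     */
    void __weak sys_arch_reboot(int type)
    {
        ARG_UNUSED(type);
    }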
Signed-off-by: Katsuhiro Suzuki <katsuhiro@katsuster.net>
There is no strict reason to use assembly for the reset routine. Move as
much code as possible to C code using the proper helpers.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
The naming of registers and bit-fields in the cpu.h file is incoherent and
messy. Refactor the whole file using the proper suffixes for bits,
shifts and masks.
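For illustration, the convention is along these lines (register and
field names made up for the example):

    /* <REGISTER>_<FIELD>_BIT / _SHIFT / _MASK */
    #define SPSR_EL1_MODE_SHIFT   0
    #define SPSR_EL1_MODE_MASK    (0xfUL << SPSR_EL1_MODE_SHIFT)
    #define SCTLR_EL1_M_BIT       BIT(0)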
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
For some unknown reason, the pagetable address for _df_tss.cr3
did not get translated from virtual to physical. However,
the translation is done if the pointer to the page table is obtained
through a reference to the first array element (instead of simply
through the name of the array). Without CR3 pointing to the page
table via its physical address, double fault handling does not work.
So fix this by being explicit with the page table pointer.
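In effect, the change is of this shape (page-table symbol name assumed
for the sketch):

    /* before: the bare array name did not get translated */
    _df_tss.cr3 = (uint32_t)z_x86_kernel_ptables;

    /* after: referencing the first element makes the
     * virtual-to-physical translation happen
     */
    _df_tss.cr3 = (uint32_t)&z_x86_kernel_ptables[0];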
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
When adding a new thread to memory domain, there is a NULL check
to figure out if a thread is being migrated to another memory
domain. However, the NULL check is AFTER physical-to-virtual
address translation which means (NULL + offset) != NULL anymore.
This results in calling reset_region() with an invalid page table
pointer. Fix this by doing the NULL check before address
translation.
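A sketch of the corrected ordering (names illustrative):

    struct k_mem_domain *domain = thread->mem_domain_info.mem_domain;

    /* Decide on migration BEFORE translating: translating first
     * turns NULL into a non-NULL (NULL + offset) pointer.
     */
    if (domain != NULL) {
        pentry_t *ptables =
            (pentry_t *)z_mem_phys_addr(domain->arch.ptables);

        reset_region(ptables, start, size);
    }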
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
When linking in virtual address space, we still need physical
addresses in SRAM to be mapped so the platform can boot from physical
memory and access structures necessary for boot (e.g. GDT and
IDT). So we need to enlarge the reserved space for the page table
to accommodate this.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
We have been having the assumption that the physical memory
is identity-mapped to virtual address space. However, with
the ability to set CONFIG_KERNEL_VM_BASE separately from
CONFIG_SRAM_BASE_ADDRESS, the assumption is no longer valid.
This changes the boot code in x86 32-bit, so that once
the page table is loaded, we can proceed with executing in
the virtual address space. So do a long jump to the virtual
address just before calling z_x86_prep_c. From this point on,
code execution is in virtual address space.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
When linking in virtual address space, we still need physical
addresses in SRAM to be mapped so the platform can boot from physical
memory and access structures necessary for boot (e.g. GDT and
IDT). So identity map the kernel in SRAM.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
When the kernel is mapped into virtual address space
that is different than the physical address space,
the dynamic GDT generation uses the virtual addresses.
However, the GDT is required at boot before the
page table is loaded, while the virtual addresses are still
invalid. So make sure GDT generation is using
physical addresses.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
There is an assumption made in the page table generation code
that the kernel would occupy the same physical and virtual
addresses. However, we may want to map the kernel into
a virtual address space which differs from the kernel's physical
address space. For example, with demand paging enabled on
kernel code and data, we can accommodate kernel that is
larger than physical memory size, and may want to utilize
a bigger virtual address space. So add address translation
in the gen_mmu.py script for this.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds virtual address translation to a few variables
used in crt0.S. This is needed as they are linked at
virtual addresses, but before the page table is loaded they
are not reachable via those addresses and must be
referenced via physical addresses.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
When feeding &z_shared_kernel_page_start directly to
Z_X86_PHYS_ADDR(), the compiler would complain about the array
subscript being out of bounds when linking in virtual address
space. So cast it to uintptr_t first before translation.
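That is, roughly:

    /* cast first so the compiler no longer sees an array subscript */
    uintptr_t phys =
        Z_X86_PHYS_ADDR((uintptr_t)&z_shared_kernel_page_start);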
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Each vector slot has room for 32 instructions. The exception context
saving needs 15 instructions already. Rather than duplicating those
instructions in each out-of-line exception routine, let's store
them directly in the vector table. That vector space is otherwise
wasted anyway. Move the z_arm64_enter_exc macro into vector_table.S
as this is the only place where it should be used.
To further reduce code size, let's make z_arm64_exit_exc into a
function of its own to avoid code duplication again. It is put in
vector_table.S as this is the most logical location to go with its
z_arm64_enter_exc counterpart.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
a0 is used as a scratch register. Restore the value of a0 (the return
address) from the stack frame before spilling registers on the stack.
Signed-off-by: Shubham Kulkarni <shubham.kulkarni@espressif.com>
Assert if the null-pointer dereferencing detection (via DWT) is
enabled when the processor is in debug mode, because the debug
monitor exception cannot be triggered in debug mode (i.e. the
behavior is unpredictable). Add a note in the Kconfig definition
of the null-pointer detection implementation via DWT, stressing
that the solution requires the core to be in normal mode.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
We introduce build-time asserts for
CONFIG_CORTEX_M_DEBUG_NULL_POINTER_EXCEPTION_PAGE_SIZE
to verify that the user-supplied value is, as required by
the Kconfig symbol specification, a power of 2.
For the MPU-based implementation of null-pointer detection
we can use an existing macro for the build-time assert,
since the region for catching null-pointer exceptions
is a regular MPU region, with different restrictions,
depending on the MPU architecture. For the DWT-based
implementation, we introduce a custom build-time assert.
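The DWT-side check is the standard power-of-two idiom, roughly:

    /* a power of 2 has exactly one bit set */
    BUILD_ASSERT((CONFIG_CORTEX_M_DEBUG_NULL_POINTER_EXCEPTION_PAGE_SIZE &
                  (CONFIG_CORTEX_M_DEBUG_NULL_POINTER_EXCEPTION_PAGE_SIZE - 1)) == 0,
                 "Not a power of 2");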
We also add a run-time ASSERT for the MPU-based
implementation in ARMv8-M platforms, which require
that the null pointer exception detection page is
already mapped by the MPU.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
By design, the DebugMonitor exception is only employed
for null-pointer dereferencing detection, and enabling
that feature is not supported in Non-Secure builds. So
when enabling the DebugMonitor exception, assert that
it is not targeting the Non Secure domain.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
Enable the null-pointer dereferencing detection by default
throughout the test-suite. Explicitly disable this for the
gen_isr_table test which needs to perform vector table reads.
Disable null-pointer exception detection on the qemu_cortex_m3
board, as DWT is not emulated by QEMU on this platform.
Additionally, disable null-pointer exception detection on
mps2_an521 (QEMU target), as DWT is not present and the MPU-based
solution won't work, since the target does not have
the area 0x0 - 0x400 mapped, but QEMU still permits
read access.
Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>