The custom memory range checks should be implemented at the SoC or
board level, as these checks are SoC/board specific. So remove
them from the architecture level.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Add ARM64_PAGE_SIZE Kconfig choice allowing 4KB, 16KB and 64KB
page sizes. The MMU code already derived all constants from
PAGE_SIZE_SHIFT so most of the infrastructure was ready.
Changes:
- Add ARM64_PAGE_SIZE choice (4KB default, 16KB, 64KB) in Kconfig
- Derive PAGE_SIZE_SHIFT from CONFIG_MMU_PAGE_SIZE in mmu.h
- Select proper TCR granule bits (TG0/TG1) per page size in mmu.c
- Round ARCH_THREAD_STACK_RESERVED up to page alignment so that
the user-accessible stack buffer starts on a page boundary
- Fix MEM_REGION_ALLOC in mem_protect test to use CONFIG_MMU_PAGE_SIZE
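
As a rough illustration of the items above, the sketch below derives the
page shift from the page size, picks the TCR granule field values, and
rounds a reserved size up to a page boundary. The helper names are
hypothetical and are not the ones used in mmu.h or mmu.c; only the
TG0/TG1 field encodings and positions are taken from the Arm architecture.

```c
#include <stdint.h>

/* Field positions in TCR_EL1 (architectural); helper names are made up. */
#define TCR_TG0_SHIFT 14
#define TCR_TG1_SHIFT 30

/* log2 of the page size: 4KB -> 12, 16KB -> 14, 64KB -> 16. */
static unsigned int page_size_shift(uint32_t page_size)
{
    unsigned int shift = 0;

    while ((1u << shift) < page_size) {
        shift++;
    }
    return shift;
}

/* TG0 encoding: 0b00 = 4KB, 0b10 = 16KB, 0b01 = 64KB. */
static uint64_t tcr_tg0_bits(uint32_t page_size)
{
    switch (page_size) {
    case 16384: return 2ULL << TCR_TG0_SHIFT;
    case 65536: return 1ULL << TCR_TG0_SHIFT;
    default:    return 0ULL << TCR_TG0_SHIFT; /* 4KB */
    }
}

/* TG1 uses a different encoding: 0b10 = 4KB, 0b01 = 16KB, 0b11 = 64KB. */
static uint64_t tcr_tg1_bits(uint32_t page_size)
{
    switch (page_size) {
    case 16384: return 1ULL << TCR_TG1_SHIFT;
    case 65536: return 3ULL << TCR_TG1_SHIFT;
    default:    return 2ULL << TCR_TG1_SHIFT; /* 4KB */
    }
}

/* Round a size up to the next page boundary (page_size is a power of two). */
static uint32_t round_up_to_page(uint32_t size, uint32_t page_size)
{
    return (size + page_size - 1u) & ~(page_size - 1u);
}
```

Note that TG0 and TG1 do not share an encoding, which is why the granule
bits have to be selected per field rather than computed once.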
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Drop Synopsis / Designware from ARC's full name and just go with "ARC"
which should be obvious enough of a name when listed alongside other
processor architectures.
Signed-off-by: Benjamin Cabé <benjamin@zephyrproject.org>
arch_secondary_cpu_init() never returns (it ends with fn(arg) into the
scheduler) but its definition lacks FUNC_NORETURN. The compiler
generates a PACIASP/AUTIASP pair and turns the final fn(arg) into a
tail-call: AUTIASP followed by BR. The AUTIASP causes a PAC
authentication failure (FPAC exception) on secondary CPUs.
Fix by marking the definition FUNC_NORETURN with CODE_UNREACHABLE,
matching the extern declaration. The compiler then generates a plain
BLR without the AUTIASP epilogue.
Also fix the function signature to take no arguments, matching the
extern declaration and actual call sites, and move both
arch_secondary_cpu_init() and z_arm64_mm_init() declarations into
boot.h instead of scattering extern declarations across source files.
Also remove the dead arch_cache_init() call in z_arm64_secondary_prep_c()
that was placed after the noreturn call and could never execute. It is
absent from the primary CPU path in z_prep_c() and the implementation
is empty on arm64 anyway.
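
A minimal host-side sketch of the noreturn pattern, assuming GCC/Clang
attributes in place of Zephyr's FUNC_NORETURN and CODE_UNREACHABLE. All
names are hypothetical, the function takes fn/arg as parameters only to
keep the example small, and setjmp/longjmp merely stands in for "the
scheduler took over" so the example can be run:

```c
#include <setjmp.h>

static jmp_buf back_to_caller; /* lets the demo regain control */
static int seen_arg;

/* Hypothetical entry function: like a scheduler entry, it never returns. */
static void fake_entry(void *arg)
{
    seen_arg = *(int *)arg;
    longjmp(back_to_caller, 1);
}

/*
 * The attribute is on the definition, so the compiler knows no return
 * path (and hence no epilogue) is needed after fn(arg).
 */
__attribute__((noreturn))
static void secondary_init_sketch(void (*fn)(void *), void *arg)
{
    fn(arg);
    __builtin_unreachable(); /* plays the role of CODE_UNREACHABLE */
}

static int run_sketch(void)
{
    static int value = 42;

    if (setjmp(back_to_caller) == 0) {
        secondary_init_sketch(fake_entry, &value);
    }
    return seen_arg;
}
```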
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
On a cache incoherent system, we need to make sure the caching
of stack space is properly flushed to memory when creating new
threads. This is especially important if the thread starts
running on a CPU other than the one initializing the thread.
Without flushing, the other CPU would not have the up-to-date
data to correctly start the thread.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
arch/riscv/core/thread.c does not use any stdio symbols.
Therefore, remove the unused include.
No functional change.
Signed-off-by: Mirai SHINJO <oss@mshinjo.com>
The generic ARM MPU nocache-memory cleanup path assumes Cortex-M
SCB dcache support whenever it needs to clean and invalidate
cache state before programming MPU regions.
That is correct for integrated ARCH_CACHE systems, but not for
cache backends such as NXP LMEM on RT11xx CM4 targets. Those
targets can select CPU_HAS_DCACHE and NOCACHE_MEMORY while using
a non-ARCH cache backend, which makes the direct SCB dcache
symbols unavailable and breaks builds in z_arm_mpu_init().
Keep the direct CMSIS SCB_CleanInvalidateDCache() call under the
ARCH_CACHE guard — since we already test SCB->CCR the integrated
cache controller is known to be present — and use the generic
cache API for other cache backends. This preserves the existing
integrated-cache behavior while allowing non-ARCH cache backends
to participate in the same MPU cleanup path.
Signed-off-by: Holt Sun <holt.sun@nxp.com>
With kernel coherence enabled, it is possible that the stack has
been allocated in an uncached area. This has implications for
performance, as memory accesses are not cached.
This adds a Kconfig option to force the indicated stack pointer of
the allocated thread stack object to be in a cached area.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Add the full name "OpenRISC" for the OpenRISC architecture so it shows
up correctly in the documentation.
Signed-off-by: Benjamin Cabé <benjamin@zephyrproject.org>
Implement the custom stack guard using the Andes StackSafe hardware
stack protection. It triggers an exception on stack overflow when the
stack pointer exceeds the configured limit.
Signed-off-by: Rick Tsao <rick592@andestech.com>
Add architecture-level support for a custom stack guard on RISC-V,
preventing stack overflow at the hardware level.
This framework allows vendors to implement the custom stack guard
using their own vendor-specific stack protection hardware, providing
flexibility for different RISC-V cores.
A new config option, CUSTOM_STACK_GUARD, allows users to enable this
stack guard on supported RISC-V cores.
Signed-off-by: Rick Tsao <rick592@andestech.com>
Clear the TLS base pointer (r10) in arch_kernel_init.
Allocate the TLS area in arch_tls_stack_setup.
Set the TLS base pointer register (r10) in arch_new_thread.
Set ARCH_HAS_THREAD_LOCAL_STORAGE for config OPENRISC.
Signed-off-by: Keith Packard <keithp@keithp.com>
This patch adds support for the OpenRISC 1000 (or1k) architecture: a
MIPS-like open hardware ISA which was first introduced in 2000.
The thread switching implementation uses the modern Zephyr thread "switch"
architecture.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Add XTENSA_BACKTRACE_EXCEPTION_DUMP_HOOK Kconfig option for sending
backtrace through exception dump hook.
This commit also disables the printk backtrace dumping if Kconfig
option EXCEPTION_DUMP_HOOK_ONLY is set.
Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
Add Kconfig option EXCEPTION_DUMP_HOOK_ONLY. If the option is selected
the exception dumps are sent only to the exception hook. Sometimes even
the attempt to log in the exception routine may hang the system.
Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
The new exception dump hooks provide helper functions for draining or
flushing the accumulated dump data. These helpers let the backend
deal intelligently with the often excessive amount of data on
limited-bandwidth interfaces.
These calls are placed specifically for SOF application, but AFAIK SOF
is the most widely used Zephyr application running on Xtensa.
The helpers do not have any effect if CONFIG_EXCEPTION_DUMP_HOOK is
not set.
Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
Add hooks for delivering exception dump prints over a specialized
interface. If CONFIG_EXCEPTION_DUMP_HOOK=y then a client program can
set function pointers for printing, flushing, and draining exception
generated prints.
These hooks were implemented for SOF usage, but should be generic
enough to implement alternative exception reporting on any platform.
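
The general shape of such a hook interface can be sketched as below. This
is only an illustration: the struct, the setter, and the backend are
hypothetical names, and the real hook signatures may differ. In this
sketch "flush" just counts a push to the transport and "drain" empties
the buffer; the actual semantics are up to the backend.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook table with the three kinds of callbacks. */
struct dump_hooks {
    void (*print)(const char *msg);
    void (*flush)(void);
    void (*drain)(void);
};

static struct dump_hooks hooks; /* all NULL until a client registers */

static void set_dump_hooks(const struct dump_hooks *h)
{
    hooks = *h;
}

/* Exception-path helpers: do nothing when no hook is registered. */
static void dump_line(const char *msg)
{
    if (hooks.print != NULL) {
        hooks.print(msg);
    }
}

static void dump_flush(void)
{
    if (hooks.flush != NULL) {
        hooks.flush();
    }
}

static void dump_drain(void)
{
    if (hooks.drain != NULL) {
        hooks.drain();
    }
}

/* Example backend that accumulates text in a small buffer. */
static char backend_buf[128];
static size_t backend_len;
static int backend_flushes;

static void backend_print(const char *msg)
{
    size_t n = strlen(msg);

    if (backend_len + n < sizeof(backend_buf)) {
        memcpy(&backend_buf[backend_len], msg, n + 1);
        backend_len += n;
    }
}

static void backend_flush(void)
{
    backend_flushes++;
}

static void backend_drain(void)
{
    backend_len = 0;
    backend_buf[0] = '\0';
}
```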
Signed-off-by: Jyri Sarha <jyri.sarha@linux.intel.com>
This reduces the typical number of instructions executed on interrupt by
one and saves an additional 3-4 instructions on syscall, by two
related optimizations.
* The top bit of `mcause` indicates an interrupt, and the RISC-V ISA
specification suggests checking the sign of `mcause` to separate
interrupts from exceptions. Doing so saves one instruction in
generating an intermediate value to compare against and comparing to
zero instead. In the exception branch, this doesn't modify the
temporary value and saves one instruction in not needing to reload
with the value of `mcause`.
* Loading a register with `CONFIG_RISCV_MCAUSE_EXCEPTION_MASK` and
masking `mcause` with that requires two instructions at minimum, and
three if the mask is too large to fit into a single instruction.
Since the first optimization leaves the temporary value of `mcause`
unmodified and it is known that the interrupt bit is clear after the
branch to `is_interrupt`, reloading and masking the value of `mcause`
can be skipped entirely.
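
The same idea expressed in C, as a sketch rather than the actual handler
(which is assembly). It assumes an RV32 view of `mcause` where bit 31 is
the interrupt bit; the function names are made up for illustration:

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Sign test: the top bit of mcause is the interrupt bit, so "negative"
 * means interrupt. No mask constant has to be materialized first.
 * (The unsigned-to-signed cast wraps on all common compilers.)
 */
static bool mcause_is_interrupt(uint32_t mcause)
{
    return (int32_t)mcause < 0;
}

/* Mask-and-compare form, shown only for comparison. */
static bool mcause_is_interrupt_masked(uint32_t mcause)
{
    return (mcause & 0x80000000u) != 0u;
}

/*
 * On the exception path the interrupt bit is known to be clear, so the
 * untouched mcause value can be used as the exception code directly
 * (assuming the remaining upper bits read as zero).
 */
static uint32_t exception_code(uint32_t mcause)
{
    return mcause;
}
```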
Signed-off-by: Peter Marheine <pmarheine@chromium.org>
This option was formerly enabled by sy1xx, but all supported SoCs now
appear to use the standard behavior, so this support can be removed.
Signed-off-by: Peter Marheine <pmarheine@chromium.org>
The RXv2 and RXv3 cores have an FPU in the CPU.
This enables building with FPU instructions for the RX140, RX261 and RX26T.
Signed-off-by: Duy Nguyen <duy.nguyen.xa@renesas.com>
When CONFIG_RISCV_ALWAYS_SWITCH_THROUGH_ECALL and CONFIG_PMP_STACK_GUARD
are enabled, the first context switch enables the stack guard
(mstatus.MPRV and MPP) in is_kernel_syscall. However, there is no proper
catch-all PMP entry during early kernel initialization.
This change uses CONFIG_PMP_KERNEL_MODE_DYNAMIC (selected by
CONFIG_MEM_ATT, CONFIG_PMP_NO_LOCK_GLOBAL, and CONFIG_PMP_STACK_GUARD) to
configure a catch-all PMP entry in pmp initialization.
Although a catch-all entry is not required when
CONFIG_RISCV_ALWAYS_SWITCH_THROUGH_ECALL is disabled, using it keeps the
PMP setup simpler and more consistent.
Signed-off-by: Jimmy Zheng <jimmyzhe@andestech.com>
ARMv8-R AArch32 cores determine the CPU start address on reset from
RVBAR (Reset Vector Base Address Register), which only stores bits
[31:5] — bits [4:0] are RES0. Any firmware or boot-loader that
programs RVBAR from the ELF entry point will silently truncate
a non-aligned address to a 32-byte boundary, causing the CPU to
begin executing at the wrong location.
Whether __start lands on a 32-byte boundary depends on the size of
code sections placed before it, which changes with Kconfig options.
This makes the failure non-deterministic: a build may work today and
break after enabling an unrelated feature like logging.
Force 32-byte alignment on z_arm_reset/__start for ARMv8-R so the
entry point survives RVBAR truncation on any SoC.
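
The truncation can be illustrated with a few lines of C. The helper names
are hypothetical; only the "bits [4:0] are dropped" behavior comes from
the description above:

```c
#include <stdint.h>
#include <stdbool.h>

#define RVBAR_ALIGN 32u /* bits [4:0] are RES0 */

/* Address the core actually starts from after RVBAR drops bits [4:0]. */
static uint32_t rvbar_effective(uint32_t entry)
{
    return entry & ~(RVBAR_ALIGN - 1u);
}

/* True when the entry point is unchanged by the truncation. */
static bool entry_survives_rvbar(uint32_t entry)
{
    return rvbar_effective(entry) == entry;
}
```

For example, an entry point at 0x00100014 would be truncated to
0x00100000, whereas 0x00100020 is preserved.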
Signed-off-by: Appana Durga Kedareswara rao <appana.durga.kedareswara.rao@amd.com>
Add interrupt locking to arc_core_mpu_buffer_validate() to make it atomic.
The function iterates through MPU regions using bank selection, which
requires multiple register accesses. Without interrupt protection, an
interrupt or context switch during iteration can corrupt the bank
selection state, causing incorrect region lookups and spurious access
denials.
Signed-off-by: Mohamed Moawad <moawad@synopsys.com>
We will make use of the .exc_return member during walk_stackframe() to
know whether we have extended stack or standard stack.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
is_fatal_error is used to determine whether an exception is
a fatal one. In the default switch case for exception handling,
is_fatal_error needs to be set true. However, setting this
variable was done after stack bound check. So if stack bound
check fails, is_fatal_error is never set. So set the variable
earlier before the stack bound check.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
For !TRACING, most arch_cpu_idle and arch_cpu_atomic_idle implementations
rely on the weak stub implementations in
subsys/tracing/tracing_none.c. This works, but arch_cpu_idle sits in a
hot code path, so it should be as efficient as possible.
Take the riscv implementation for example,
Before the patch:
80000a66 <arch_cpu_idle>:
80000a66: 1141 addi sp,sp,-16
80000a68: c606 sw ra,12(sp)
80000a6a: 37c5 jal 80000a4a <sys_trace_idle>
80000a6c: 10500073 wfi
80000a70: 3ff1 jal 80000a4c <sys_trace_idle_exit>
80000a72: 47a1 li a5,8
80000a74: 3007a073 csrs mstatus,a5
80000a78: 40b2 lw ra,12(sp)
80000a7a: 0141 addi sp,sp,16
80000a7c: 8082 ret
NOTE: the sys_trace_idle and sys_trace_idle_exit are just stubs when
!TRACING
After the patch:
80000a62 <arch_cpu_idle>:
80000a62: 10500073 wfi
80000a66: 47a1 li a5,8
80000a68: 3007a073 csrs mstatus,a5
80000a6c: 8082 ret
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Some ARMv6-M and ARMv8-M Baseline cores indeed support MPU
(CPU_HAS_ARM_MPU in soc Kconfig), so the exclusion should
not be based on ARMV6_M_ARMV8_M_BASELINE.
Signed-off-by: Andy Lin <andylinpersonal@gmail.com>
The ARM Architecture Reference Manual (DDI 0487) requires a context
synchronization event (ISB) between modifying SVE trap control registers
(CPTR_EL3.EZ, CPTR_EL2.TZ, CPACR_EL1.ZEN) and accessing the
corresponding ZCR_ELx registers: "The effect of the change is guaranteed
to be observable only after a Context synchronization event."
Without the ISB, the processor may still observe the old trap
configuration and generate an UNDEFINED exception on the ZCR write.
This also fixes the EL2 SVE initialization for non-VHE mode
(HCR_EL2.E2H=0): CPTR_EL2 bits [17:16] (ZEN) are RES0 in non-VHE
format and must not be set. SVE trapping at EL2 in non-VHE mode is
controlled by the TZ bit (bit 8) instead. The previous code wrote the
VHE-format ZEN bits which is architecturally UNPREDICTABLE in non-VHE
mode. Match the Linux kernel sequence (arch/arm64/include/asm/el2_setup.h).
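
A sketch of the bit selection described above, as a pure function over a
CPTR_EL2 value. The function name is hypothetical. On real hardware the
write must still be followed by an ISB before ZCR_EL2 is accessed, which
is not shown here because it needs inline assembly:

```c
#include <stdint.h>
#include <stdbool.h>

#define CPTR_EL2_TZ_BIT   (1ULL << 8)  /* non-VHE format: trap SVE when set */
#define CPTR_EL2_ZEN_MASK (3ULL << 16) /* VHE format: 0b11 = no SVE trap */

/* Return the CPTR_EL2 value that stops SVE accesses from trapping to EL2. */
static uint64_t cptr_el2_enable_sve(uint64_t cptr, bool e2h)
{
    if (e2h) {
        /* VHE (HCR_EL2.E2H=1): ZEN lives in bits [17:16]. */
        return cptr | CPTR_EL2_ZEN_MASK;
    }
    /* Non-VHE: bits [17:16] are RES0; clear TZ instead. */
    return cptr & ~CPTR_EL2_TZ_BIT;
}
```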
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Extend the ARM Cortex-M coredump arch block to version 3 with metadata
that provides the offset to the callee_saved struct within k_thread.
This enables the coredump GDB stub to accurately retrieve callee-saved
registers (r4-r11) for non-faulting threads during multi-thread
debugging.
Signed-off-by: Mark Holden <mholden@meta.com>
The arch_dcache_enable() and arch_icache_enable() functions could
cause system crashes when called on caches that were already enabled.
This occurs because arch_dcache_invd_all() invalidates the entire
cache without first flushing dirty data, leading to memory corruption
when the cache was previously enabled.
This scenario happens in cache tests where test setup calls
sys_cache_data_enable(), but the SoC early init hook has already
enabled caches during boot.
Fix by checking the SCTLR register before performing cache operations:
- If D-cache is already enabled, perform clean+invalidate instead of
just invalidate to preserve dirty cache lines
- If I-cache is already enabled, perform invalidate only (no dirty
lines in I-cache)
- If cache is not enabled, proceed with normal enable sequence
This makes the enable functions safe to call multiple times without
risking data corruption or system crashes.
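
The decision logic above can be sketched as follows. This is a host-side
model, not the actual implementation: `fake_sctlr` stands in for the real
system register, the functions only report which operation would be
performed, and all names are hypothetical. The C and I bit positions are
the architectural SCTLR ones.

```c
#include <stdint.h>

#define SCTLR_C_BIT (1u << 2)  /* data cache enable */
#define SCTLR_I_BIT (1u << 12) /* instruction cache enable */

enum cache_op {
    OP_NORMAL_ENABLE, /* cache was off: run the usual enable sequence */
    OP_CLEAN_INVD,    /* D-cache already on: clean + invalidate */
    OP_INVD,          /* I-cache already on: invalidate only */
};

static uint32_t fake_sctlr; /* stand-in for the real register */

static enum cache_op dcache_enable_sketch(void)
{
    if ((fake_sctlr & SCTLR_C_BIT) != 0u) {
        /* Already enabled: dirty lines must be written back first. */
        return OP_CLEAN_INVD;
    }
    fake_sctlr |= SCTLR_C_BIT;
    return OP_NORMAL_ENABLE;
}

static enum cache_op icache_enable_sketch(void)
{
    if ((fake_sctlr & SCTLR_I_BIT) != 0u) {
        /* Already enabled: the I-cache holds no dirty lines. */
        return OP_INVD;
    }
    fake_sctlr |= SCTLR_I_BIT;
    return OP_NORMAL_ENABLE;
}
```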
Signed-off-by: Appana Durga Kedareswara rao <appana.durga.kedareswara.rao@amd.com>
relocate_vector_table is called as part of z_arm_reset.
This is considered early-boot code before XIP.
At this stage, the program might not have access to optimized
compiler APIs that reside in FLASH.
Thus, it's better for relocate_vector_table to use arch_early_memcpy.
Signed-off-by: Shreyas Shankar <s-shankar@ti.com>
Add calls to sys_trace_idle_exit before leaving idle state
to track CPU load.
Extend CPU_LOAD to CPU_AARCH32_CORTEX_R and CPU_AARCH32_CORTEX_A, thus
we can support CPU_LOAD for all CPU_CORTEX.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
ASM is notoriously harder to maintain than C and requires core specific
adaptation which impairs even more the readability of the code.
As for performance, there is no difference in the generated code
between the ASM and C versions.
ASM version:
<arch_cpu_idle>:
f57ff04f dsb sy
e320f003 wfi
f1080080 cpsie i
f57ff06f isb sy
e12fff1e bx lr
<arch_cpu_atomic_idle>:
f10c0080 cpsid i
f57ff04f dsb sy
e320f002 wfe
e3500000 cmp r0, #0
1a000000 bne 102ca8 <_irq_disabled>
f1080080 cpsie i
<_irq_disabled>:
e12fff1e bx lr
C version:
<arch_cpu_idle>:
f57ff04f dsb sy
e320f003 wfi
f1080080 cpsie i
f57ff06f isb sy
e12fff1e bx lr
<arch_cpu_atomic_idle>:
f10c0080 cpsid i
f57ff04f dsb sy
e320f002 wfe
e3500000 cmp r0, #0
112fff1e bxne lr
f1080080 cpsie i
e12fff1e bx lr
As can be seen, the C version uses 'bxne lr' to return directly in the
IRQ-disabled case, costing one less instruction than the ASM version. So
from this PoV, the C version not only improves readability and
maintainability but also generates better code.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
ASM is notoriously harder to maintain than C and requires core specific
adaptation which impairs even more the readability of the code.
There's a bug in current arch_cpu_atomic_idle asm version:
tst x0, #(DAIF_IRQ_BIT) //here Z := (DAIF_IRQ_BIT == 0)
beq _irq_disabled //jump to _irq_disabled when Z is set
msr daifclr, #(DAIFCLR_IRQ_BIT)
_irq_disabled:
ret
As can be seen, the asm code jumps to _irq_disabled when Z is set, but
per the aarch64 architecture reference, DAIF_IRQ == 0 means the IRQ is
unmasked, i.e. enabled. So the asm logic here is wrong. I fixed this bug
in the C version. This shows the benefit of ASM -> C.
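
The two predicates can be compared in plain C. This is only a sketch with
hypothetical function names; the 0x80 value of the IRQ bit matches the
`tst x0, #0x80` seen in the disassembly of this commit:

```c
#include <stdint.h>
#include <stdbool.h>

#define DAIF_IRQ_BIT (1u << 7) /* set = IRQ masked, clear = IRQ unmasked */

/* Correct test: re-enable IRQs only if they were unmasked in the key. */
static bool should_unmask_irq(uint32_t key)
{
    return (key & DAIF_IRQ_BIT) == 0u;
}

/*
 * What the old asm effectively did: "beq" skipped the unmask when the
 * bit was clear, so the unmask ran only when IRQs had been masked.
 */
static bool old_asm_unmasks_irq(uint32_t key)
{
    return (key & DAIF_IRQ_BIT) != 0u;
}
```

The two functions disagree for every key, which is the inverted logic
described above.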
As for performance, apart from the bug fix above, there is no
difference in the generated code between the ASM and C versions.
ASM version:
<arch_cpu_idle>:
d5033f9f dsb sy
d503207f wfi
d50342ff msr daifclr, #0x2
d65f03c0 ret
<arch_cpu_atomic_idle>:
d50342df msr daifset, #0x2
d5033fdf isb
d503205f wfe
f279001f tst x0, #0x80
54000040 b.eq 1001d10 <_irq_disabled> // b.none
d50342ff msr daifclr, #0x2
<_irq_disabled>:
d65f03c0 ret
C version:
<arch_cpu_idle>:
d5033f9f dsb sy
d503207f wfi
d50342ff msr daifclr, #0x2
d65f03c0 ret
<arch_cpu_atomic_idle>:
d50342df msr daifset, #0x2
d5033fdf isb
d503205f wfe
37380040 tbnz w0, #7, 1001d0c <arch_cpu_atomic_idle+0x14>
d50342ff msr daifclr, #0x2
d65f03c0 ret
And as can be seen, the C version uses the tbnz instruction to test the
bit and branch. Unlike TST, TBNZ does not affect the Z, N, C, or V flags
in the processor state. So apart from the bug fix, the C version looks a
bit better than the asm version.
Other architectures such as x86, riscv, rx, xtensa, mips and even arm
cortex_m also use a C version for cpu_idle, so ASM -> C is safe.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
This allows to distinguish between f16 storage format support
(CONFIG_FP16) and actual f16 arithmetic capability.
CONFIG_FP16_ARITHMETIC requires either MVE float (ARMV8_1_M_MVEF) or a
Cortex-A core (CPU_CORTEX_A).
Signed-off-by: Martin Jäger <martin.jaeger@a-labs.io>
USE_SWITCH is a new feature and needs more testing before enabling it by
default. While all tests in upstream Zephyr CI passed, keeping this
config disabled helps in getting the majority of the work in without
causing regressions on upstream boards that are not tested in CI.
Signed-off-by: Sudan Landge <sudan.landge@arm.com>
Fix the issues below when trying to build hello world with armclang:
```
Error: L6218E: Undefined symbol z_arm_exc_exit (referred from reset.o).
Error: L6218E: Undefined symbol z_arm_int_exit (referred from reset.o).
Error: L6218E: Undefined symbol z_arm_pendsv (referred from reset.o).
```
Signed-off-by: Sudan Landge <sudan.landge@arm.com>
orr fix is as reported in review:
```
The add causes a crash with IAR tools as the address loaded to r8
already has the lowest bit set, and the add causes it to be set to ARM
mode. The orr instruction works fine with both scenarios
```
`UDF 0` seems to break on IAR but `UDF #0` works for all.
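
The add/orr difference from the review comment can be reproduced with
plain integer arithmetic. The function names are hypothetical and the
addresses are arbitrary examples:

```c
#include <stdint.h>

/* Setting the Thumb bit with an add: wrong if bit 0 is already set. */
static uint32_t set_thumb_with_add(uint32_t addr)
{
    return addr + 1u;
}

/* Setting the Thumb bit with an orr: idempotent. */
static uint32_t set_thumb_with_orr(uint32_t addr)
{
    return addr | 1u;
}
```

With an address such as 0x08000101 that already has bit 0 set, the add
yields 0x08000102 (bit 0 clear, i.e. ARM mode on branch), while the orr
leaves it at 0x08000101.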
Signed-off-by: Sudan Landge <sudan.landge@arm.com>