Currently the lazy fpu saving algorithm in arm64 is using the fpu_owner
pointer from the cpu structure to understand the owner of the context
in the cpu and save it in case someone different from the owner is
accessing the fpu.
The memory-consistency semantics on SMP systems are error-prone, and
future reworks of the current code might miss a barrier, which could
lead to inconsistent state across cores. To overcome the issue, use
atomics to hide the complexity and be sure that the code behaves as
intended.
While there, add some isb barriers after writes to cpacr_el1, following
the guidance of ARM ARM specs about writes on system registers.
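A minimal sketch of the ownership handover, using C11 atomics rather than
the kernel's own atomic API. The names struct thread, struct cpu and
fpu_claim() are hypothetical stand-ins, and the register save is reduced
to a counter:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread with a saved FPU context. */
struct thread {
	int fpu_saved; /* counts how often this context was flushed to memory */
};

/* Hypothetical stand-in for the per-CPU structure. */
struct cpu {
	_Atomic(struct thread *) fpu_owner; /* thread whose FPU state is live */
};

/*
 * Make `next` the FPU owner of `cpu`. The exchange is one atomic
 * read-modify-write, so no other core can observe a half-updated owner
 * and no hand-placed barrier is needed around it.
 * Returns the previous owner (possibly NULL).
 */
static struct thread *fpu_claim(struct cpu *cpu, struct thread *next)
{
	struct thread *prev = atomic_exchange(&cpu->fpu_owner, next);

	if (prev != NULL && prev != next) {
		/* A real port would save the FPU registers of `prev` here. */
		prev->fpu_saved++;
	}
	return prev;
}
```

Claiming the FPU for the thread that already owns it leaves the saved
context untouched; only a change of owner triggers a save.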
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
This will avoid unconditionally pulling z_riscv_switch() into the build
as it is not used, reducing the resulting binary some more.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
With Zephyr now always using `int main(void)`, there's no longer any need
for this definition. The last remaining use which gated the declaration of
_posix_zephyr_main isn't necessary as adding that declaration
unconditionally is harmless.
Signed-off-by: Keith Packard <keithp@keithp.com>
cpu_node_list does not hold the correct mapping between cpu id and mpid
when the core boot sequence does not follow the DTS cpu node sequence.
This causes SGIs not to be delivered to the right target.
Add the cpu_map array to hold the correct mapping between cpu id and
mpid.
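A minimal sketch of such a table, assuming each core records its own mpid
under the logical id it was given at boot. MAX_CPUS, struct cpu_map_entry
and the two helpers are hypothetical; only the cpu_map name comes from the
change itself:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 4u /* hypothetical core count */

struct cpu_map_entry {
	bool valid;    /* set once the core has booted and registered */
	uint64_t mpid; /* affinity value read by that core at boot */
};

static struct cpu_map_entry cpu_map[MAX_CPUS];

/* Called by a core while it boots, with the logical id assigned to it. */
static int cpu_map_record(unsigned int cpu_id, uint64_t mpid)
{
	if (cpu_id >= MAX_CPUS) {
		return -1;
	}
	cpu_map[cpu_id].mpid = mpid;
	cpu_map[cpu_id].valid = true;
	return 0;
}

/* Look up the SGI target for a logical cpu id; -1 if unknown. */
static int cpu_map_lookup(unsigned int cpu_id, uint64_t *mpid)
{
	if (cpu_id >= MAX_CPUS || mpid == NULL || !cpu_map[cpu_id].valid) {
		return -1;
	}
	*mpid = cpu_map[cpu_id].mpid;
	return 0;
}
```

Because the table is filled in boot order rather than DTS order, the
lookup stays correct even when the two orders differ.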
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
Each core should init its own stack during reset when SMP is enabled,
but must not touch the others. The current init results in each core
starting to init the stack from the same address, which breaks the others.
Fix the issue by setting a correct start address.
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
The LOG system uses unaligned access instructions, which cause an
alignment exception before the MPU is enabled. Remove the LOG print
before the MPU is enabled to avoid this issue.
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
This trick turns out also to be needed by the abort/join code.
Promote it to a more formal-looking internal API and clean up the
documentation to (hopefully) clarify the exact behavior and better
explain the need.
This is one of the more... enchanted bits of the scheduler, and while
the trick is IMHO pretty clean, it remains a big SMP footgun.
Signed-off-by: Andy Ross <andyross@google.com>
For secure EL2 to be entered the EEL2 bit in SCR_EL3 must be set. This
should only be set if Zephyr has not been configured for NS mode only,
if the device is currently in secure EL3, and if secure EL2 is supported
via the SEL2 field in ID_AA64PFR0_EL1. Added logic to enable EEL2 if all
conditions are met.
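A minimal sketch of the three-way check, written as a pure function so it
can be exercised off-target. The function name and its boolean parameters
are hypothetical, and how the secure state is detected is left to the
caller. The bit positions (SCR_EL3.EEL2 at bit 18, ID_AA64PFR0_EL1.SEL2 at
bits [39:36]) follow the Arm architecture reference manual as far as I am
aware:

```c
#include <stdbool.h>
#include <stdint.h>

#define SCR_EEL2_BIT            (1ULL << 18) /* SCR_EL3.EEL2 */
#define ID_AA64PFR0_SEL2_SHIFT  36           /* ID_AA64PFR0_EL1.SEL2 */
#define ID_AA64PFR0_SEL2_MASK   0xFULL

/*
 * Return the SCR_EL3 value to write back. EEL2 is set only when the
 * build is not NS-only, the core is currently secure, and the CPU
 * reports secure EL2 support.
 */
static uint64_t scr_el3_apply_eel2(uint64_t scr, uint64_t id_aa64pfr0,
				   bool ns_only_build, bool in_secure_state)
{
	bool sel2_supported =
		((id_aa64pfr0 >> ID_AA64PFR0_SEL2_SHIFT) &
		 ID_AA64PFR0_SEL2_MASK) != 0;

	if (!ns_only_build && in_secure_state && sel2_supported) {
		scr |= SCR_EEL2_BIT;
	}
	return scr;
}
```

If any one of the three conditions fails, the value is returned unchanged.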
Signed-off-by: Chad Karaginides <quic_chadk@quicinc.com>
This reverts commit f0b458a619.
This is a pointless change that simply increases footprint.
Existing code already supports compilation without multithreading.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Per the ARMv8 architecture document, modification of the system control
register is a context-changing operation. Context-changing operations are
only guaranteed to be seen after a context synchronization event.
An ISB is a context synchronization event. One has been placed after
each SCTLR modification. The issue was found when running at full speed on
target.
Signed-off-by: Chad Karaginides <quic_chadk@quicinc.com>
Allow builds which have CONFIG_MULTITHREADING disabled.
This reduces code footprint, which is handy for
constrained targets such as bootloaders.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
Allow builds which have CONFIG_MULTITHREADING disabled.
This reduces code footprint, which is handy for
constrained targets such as bootloaders.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
This makes MCUboot build as a Zephyr application,
providing an optional 2nd stage bootloader to the
IDF bootloader, which is used by default.
This provides more flexibility when building
and loading multiple images and aims to
bring a better DX to users by using sysbuild.
MCUboot and applications now have separate
linker scripts.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
Let's consider CPU1 waiting on a spinlock already taken by CPU2.
It is possible for CPU2 to invoke the FPU and trigger an FPU exception
when the FPU context for CPU2 is not live on that CPU. If the FPU context
for the thread on CPU2 is still held in CPU1's FPU then an IPI is sent
to CPU1 asking to flush its FPU to memory.
But if CPU1 is spinning on a lock already taken by CPU2, it won't see
the pending IPI as IRQs are disabled. CPU2 won't get its FPU state
restored and won't complete the required work to release the lock.
Let's prevent this deadlock scenario by looking for pending FPU IPI from
the spinlock loop using the arch_spin_relax() hook.
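A minimal sketch of the idea with C11 atomics. The pending flag, the flush
counter and the function names are hypothetical stand-ins for the real IPI
state and the arch_spin_relax() hook:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: set by another CPU that wants our FPU flushed. */
static atomic_bool fpu_ipi_pending;
static int fpu_flush_count;

/* Stand-in for writing the live FPU context back to memory. */
static void flush_local_fpu(void)
{
	fpu_flush_count++;
}

/*
 * Relax hook run on every failed lock attempt. IRQs are masked while
 * spinning, so the request is polled here instead of being delivered
 * as an interrupt.
 */
static void spin_relax(void)
{
	if (atomic_exchange(&fpu_ipi_pending, false)) {
		flush_local_fpu();
	}
}

static void spin_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set(lock)) {
		spin_relax(); /* lets the lock holder make progress */
	}
}

static void spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}
```

The exchange clears the request and reports it in one step, so a request
is honored exactly once even if the loop iterates many times.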
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Let's consider CPU1 waiting on a spinlock already taken by CPU2.
It is possible for CPU2 to invoke the FPU and trigger an FPU exception
when the FPU context for CPU2 is not live on that CPU. If the FPU context
for the thread on CPU2 is still held in CPU1's FPU then an IPI is sent
to CPU1 asking to flush its FPU to memory.
But if CPU1 is spinning on a lock already taken by CPU2, it won't see
the pending IPI as IRQs are disabled. CPU2 won't get its FPU state
restored and won't complete the required work to release the lock.
Let's prevent this deadlock scenario by looking for a pending FPU IPI
from the arch_spin_relax() hook and honor it.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The compiler is not able to emit a proper DSB operation for ARM64. Move
to the arch-specific implementation and use assembly code instead.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Handle the case where z_float_enable() is called with a NULL thread.
Signed-off-by: Dong Wang <dong.d.wang@intel.com>
Signed-off-by: Qipeng Zha <qipeng.zha@intel.com>
This adds code to always map data TLB for VECBASE so that
we would be dealing with fewer data TLB misses during
exception handling. With VECBASE always mapped, there is
no need to pre-load anymore.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This moves the TLB miss handling to the C exception handler.
This also allows us to handle page faults (for example,
unmapped pages) during this time as any more exceptions
handled in the C handler will not trigger the double
exception handler but the same C handler.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Instead of being able to arbitrarily set the PTEVADDR for page
table, this provides choices (currently just one). This is in
preparation to enable handling memory management exception in
C code. For that to work, we will need to pre-load the page
table address (PTEVADDR) for the memory page containing
exception code and data (containing jump addresses), and
various stacks. This is to preempt any TLB misses during handling
the level 1 interrupt code. If a TLB miss is encountered during
handling of level 1 interrupt, we will be thrown into double
exception handling code where we will get stuck in infinite
loop. However, in order to pre-load the page table entries,
PTEVADDR needs to be calculated. This requires the use of
PTEVADDR base which cannot be loaded via l32r, as we may cause
a data TLB miss. So we must be able to grab the PTEVADDR base
address strictly within code, and must be without any data
load. So change CONFIG_XTENSA_MMU_PTEVADDR to be based on a
choice so we can have a pre-defined bit shift value for the shift
operation. This shift value will be used in exception handling
code.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Add a build option to tell if memory should be mapped in both cached
and uncached regions.
If the memory is in neither the cached nor the uncached region, it is not
double mapped.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Initial support for Xtensa MMU version 3. It uses a two-level page
table, relying on the fact that the page table lives in the virtual
address space. Only the top level (page directory) is wired in the TLB,
to avoid second-level page misses.
The mapped memory is completely fragmented into multiple sections; maybe
we can find a better way in the future.
The exception handler is where we effectively map the memory. The way it
works is:
1) SW tries to access some memory address.
2) The address is not mapped, so the MMU tries the auto-refill by
looking up the page table.
3) The page table contents are not mapped (remember, just the top-level
page is mapped).
4) An exception is triggered. In the exception handler we try to read the
portion of the page table that maps the original address.
5) That address is not mapped, so the MMU tries the auto-refill again.
This time, though, the address is mapped by the top-level page, which is
properly mapped. (The top-level page maps the page table itself.)
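The reason step 5 always lands in the top-level page can be checked with
a little address arithmetic. This is a sketch under stated assumptions:
4 KB pages, 4-byte page table entries, and a page table placed at a
4 MB-aligned virtual base. The PTEVADDR_BASE value and the helper names
are hypothetical:

```c
#include <stdint.h>

#define PAGE_SHIFT    12u         /* assumed 4 KB pages */
#define PTE_SIZE      4u          /* assumed 4-byte entries */
#define PTEVADDR_BASE 0x20000000u /* hypothetical, 4 MB aligned */

/* Virtual address of the PTE that maps `vaddr`. */
static uint32_t pte_vaddr(uint32_t vaddr)
{
	return PTEVADDR_BASE + (vaddr >> PAGE_SHIFT) * PTE_SIZE;
}

/* Start of the 4 KB page containing `vaddr`. */
static uint32_t page_of(uint32_t vaddr)
{
	return vaddr & ~((1u << PAGE_SHIFT) - 1u);
}

/*
 * The page of PTEs that maps the page table region itself. This is
 * the only page that has to be wired in the TLB.
 */
static uint32_t top_level_page(void)
{
	return page_of(pte_vaddr(PTEVADDR_BASE));
}
```

For any address v, pte_vaddr(v) lies inside the page table region, so
pte_vaddr(pte_vaddr(v)) falls inside top_level_page(). The second refill
therefore never misses, which is what ends the chain in step 5.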
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Unlike the tracing module, which is mainly for debug usage, this
allows runtime profiling of IRQ performance data and is
intended to be enabled in product releases, since a platform
can choose to make it work with a lightweight protocol.
To use it, enable this option and implement runtime_irq_stats()
in platform code. For example, the Intel ISH platform implements it
with the SHMI protocol to allow the host to profile IRQ stats.
Signed-off-by: Qipeng Zha <qipeng.zha@intel.com>
Until now iterable sections APIs have been part of the toolchain
(common) headers. They are not strictly related to a toolchain, they
just rely on the linker providing support for sections. Most files relied
on indirect includes to access the API; now it is included as needed.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
The commit 434ca63e2f introduced the
Cortex-A and Cortex-R CPU type dependency to `CONFIG_FP16` based on
the reasoning that the hardware half-precision support is only
available on them.
While it is true that the _hardware_ half-precision support is limited
to these targets, the compiler will provide the _software_ emulation
for the targets that lack the hardware half-precision support, as
mentioned in 41fd6e003c (the original
commit that introduced `CONFIG_FP16`).
Signed-off-by: Stephanos Ioannidis <stephanos.ioannidis@nordicsemi.no>
In z_xtensa_backtrace_print the parameter depth is checked for <= 0.
There is no need to check it again later. Also, since the variable is
not used after the while loop, we can use the parameter directly without
an additional variable.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Enable single-threading support for the riscv architecture.
Add z_riscv_switch_to_main_no_multithreading function for
supporting single-threading.
Single-threading does not work with PMP_STACK_GUARD enabled,
because single-threading does not use context switching,
while the privileged mode transition that PMP depends on implicitly
presupposes context switching. This is a contradiction.
Thus, disable PMP_STACK_GUARD when MULTITHREADING is not enabled.
Signed-off-by: TOKITA Hiroshi <tokita.hiroshi@fujitsu.com>
The Intel ISH SoC can't reboot via the RST_CNT register,
so make sys_arch_reboot a weak function to allow
a different arch reboot implementation in the SoC layer.
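A minimal sketch of the weak-symbol pattern, using the GCC/Clang attribute
directly. The body and the last_reboot_type variable are hypothetical
placeholders; the real default performs the actual reset and does not
return:

```c
/* Hypothetical bookkeeping so the sketch has an observable effect. */
static int last_reboot_type = -1;

/*
 * Default arch-level reboot. Marked weak so that a SoC layer can
 * provide a non-weak sys_arch_reboot() in another source file; the
 * linker then picks the SoC version and drops this one.
 */
__attribute__((weak)) void sys_arch_reboot(int type)
{
	/* A real implementation would trigger the reset here. */
	last_reboot_type = type;
}
```

No change is needed at the call sites: they keep calling
sys_arch_reboot(), and the override is resolved at link time.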
Signed-off-by: Dong Wang <dong.d.wang@intel.com>
Signed-off-by: Qipeng Zha <qipeng.zha@intel.com>
The "cross stack call" mechanism has intermediate states where the
stack frames are not valid for our own interrupt entry code, which
causes corruption if an interrupt races at exactly the right time.
Leave interrupts masked until just before the call.
The fix is mildly complicated by the fact that we RELY on nested window
exception frames to spill registers from the interruptee, so have to
do the masking with PS.INTLEVEL, which requires a register to save its
contents, which we don't have since everything needs to happen in one
4-register window. But thankfully our Zephyr-reserved EPS register is
guaranteed to be available through this process.
Fixes #57009
Signed-off-by: Andy Ross <andyross@google.com>
With paging configured, a physical address must be used, as
paging is not yet enabled here.
From the IA manual, the LDMXCSR instruction description is:
Loads the source operand into the MXCSR control/status
register. The source operand is a 32-bit memory location.
Signed-off-by: Qipeng Zha <qipeng.zha@intel.com>
Use the common exit() provided by libc so we get standard behavior
across all architectures. Only implement a special exit when
XT_SIMULATOR is defined.
Signed-off-by: Kumar Gala <kumar.gala@intel.com>
Local variables in ASM macros work differently for the GNU and MWDT
toolchains. With the GNU toolchain they are local to each macro
instance, but with MWDT they are local to the file where the macro
is used.
To avoid issues when the macro is used multiple times in one file, let's
align _st32_huge_offset to have the same behaviour with the GNU and MWDT
toolchains.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Evgeniy Paltsev <PaltsevEvgeniy@gmail.com>
The backtrace requires a valid stack pointer to start
printing backtraces. So if there is no stack pointer
being passed in, skip printing backtraces.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>