This adds a new kconfig and corresponding code to allow flushing
auto-refill data TLBs when page tables are swapped (e.g. during
context switching). This is mainly used to avoid the multi-hit TLB
exception raised by certain memory access patterns. If memory is
only marked for user mode access but is not inside a memory domain,
accessing that page in kernel mode results in a TLB entry being
filled with the kernel ASID. When going back into user mode, access
to the memory results in another TLB entry being filled with
the user mode ASID. Now there are two entries for the same memory
page, and the multi-hit TLB exception will be raised if that
memory page is accessed. This type of access is better served
by using memory partitions and memory domains to share data. It is
not prohibited, but it is highly discouraged.
The code is wrapped in a kconfig because of the execution
penalty: flushed TLB entries have to be refilled even when the
flush was not needed. So only enable this if necessary.
Fixes #88772
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Since the necessary register values are now pre-computed and
stored in the memory domain struct, we can use them directly
in various assembly locations, thus replacing the function
call to xtensa_swap_update_page_tables().
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
When context switching and dealing with non-nested interrupts,
the context to be restored is saved in the thread stack.
When userspace is enabled, this means saving context into
the user stacks for user threads. This allows PS values to be
manipulated externally by setting PS.RING in the saved PS
value to 0, resulting in granting kernel access privilege when
the thread is restored. To prevent this, we store the PS value
into the thread struct instead, where user threads cannot
manipulate it. Note that nested interrupts and syscalls do
not use the user stack but the interrupt stack and the thread
privileged stack respectively, neither of which is accessible
in user mode.
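To illustrate the concern, the sketch below models a saved PS value as a
plain integer. The bit positions follow the Xtensa PS layout (RING in
bits 7:6); the helper names are hypothetical and are not taken from the
actual change.

```c
#include <stdint.h>

/* PS.RING occupies bits 7:6 of the Xtensa PS register. */
#define PS_RING_SHIFT 6
#define PS_RING_MASK  (0x3u << PS_RING_SHIFT)

/* Hypothetical helper: extract the privilege ring from a PS value. */
static inline uint32_t ps_ring(uint32_t ps)
{
	return (ps & PS_RING_MASK) >> PS_RING_SHIFT;
}

/*
 * Hypothetical model of the attack: a user thread that can write to
 * the saved PS on its own stack clears RING, so the restored thread
 * would run in ring 0 (kernel privilege).
 */
static inline uint32_t ps_forge_kernel_ring(uint32_t saved_ps)
{
	return saved_ps & ~PS_RING_MASK;
}
```

Keeping the saved PS in the thread struct removes the user-writable copy
that `ps_forge_kernel_ring()` stands in for.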
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This extends arch_cohere_stacks() to handle privileged stacks of
user threads when userspace is enabled.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Towards the end of interrupt handling, and before restoring
context, we would spill all register windows. This requires
A0 and A1 to be restored from the saved context so that spilling
works correctly. However, when coherence is enabled,
window spilling has already been done earlier, so there is
no need to spill the register windows again, and hence
no need to restore A0 and A1. They will be restored again
before returning from interrupt anyway.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Instead of computing all the needed register values when
swapping page tables, we can pre-compute those values when
the memory domain is first initialized. This should save some
time during context switching.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
There is no need for ZSR_FLUSH when threads are pin only
(CONFIG_SCHED_CPU_MASK_PIN_ONLY=y), so there is no need to
reserve it.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Xtensa ISA mentions that the stack always needs to be aligned on
16 bytes. So we need to pad the stack frame so that its size is
also a multiple of 16 bytes when dealing with interrupts. Otherwise
the stack would no longer be 16-byte aligned once we push the stack
frame onto the stack during interrupt handling.
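The padding rule can be sketched as a round-up to the next multiple of
16. This is a minimal illustration; the macro and function names are
hypothetical, not the ones used in the tree.

```c
#include <stddef.h>

#define STACK_ALIGN 16u

/* Round a raw frame size up to the next multiple of 16 bytes. */
static inline size_t frame_size_padded(size_t raw_size)
{
	return (raw_size + (STACK_ALIGN - 1u)) & ~(size_t)(STACK_ALIGN - 1u);
}
```

If the stack pointer is 16-byte aligned on entry, subtracting
`frame_size_padded(n)` keeps it aligned for any `n`.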
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
When crossing stacks during interrupt handling, we do two call4
instructions to pivot to the interrupt stack, with arguments to
these two call4 stashed in A6, A10 and A11. However, A4-A11 may be
marked as invalid in the register file, and accessing them would
result in a window overflow. At that point, A0 and A1 are not
set up to handle window overflows, so registers would be
stashed in an incorrect location, resulting in incorrect
values being restored during window underflow. So move
the code around a bit to restore A0 and A1 properly before
accessing A4-A11.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The logic to swap page tables or MPU entries is moved to the end
of the cross stack call, where execution is still on the interrupt
stack instead of the thread stack. The old logic was calling
the swap functions on the outgoing thread stack, which is not
desirable.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
During arch_cohere_stacks(), the used portion of the outgoing
thread's stack is flushed from cache, and then the unused portion
is invalidated. However, this results in the cache line at
the stack pointer being flushed and then invalidated due to
how sys_cache_data_*() operates. If we are swapping back to
the same thread (e.g. after handling an interrupt), this cache
line will need to be retrieved again from main memory since
it has already been invalidated. This creates unnecessary
data movement between cache and main memory. So create our own
versions of the cache flushing and invalidation routines just for
arch_cohere_stacks(). A bonus is that these work directly with
bounding addresses and skip the size calculation, which should
save a little execution time.
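A rough sketch of the idea, assuming a 64-byte cache line and a stack
that grows downwards. The function names and the no-op line operations
are placeholders, not the routines added by this change; on real
hardware they would be the cache writeback and invalidate instructions.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64u /* assumed data cache line size */

struct cohere_counts {
	size_t invalidated;
	size_t flushed;
};

/* Placeholders for the per-line cache operations. */
static void invalidate_line(uintptr_t addr) { (void)addr; }
static void flush_line(uintptr_t addr) { (void)addr; }

/*
 * Walk the stack by bounding addresses. The line holding the stack
 * pointer belongs to the flushed (used) range only, so it is never
 * flushed and then invalidated. stack_start and stack_end are
 * assumed to be line-aligned.
 */
static struct cohere_counts cohere_stack_sketch(uintptr_t stack_start,
						uintptr_t stack_end,
						uintptr_t sp)
{
	struct cohere_counts c = { 0, 0 };
	uintptr_t split = sp & ~(uintptr_t)(LINE_SIZE - 1u);

	for (uintptr_t line = stack_start; line < split; line += LINE_SIZE) {
		invalidate_line(line);
		c.invalidated++;
	}
	for (uintptr_t line = split; line < stack_end; line += LINE_SIZE) {
		flush_line(line);
		c.flushed++;
	}
	return c;
}
```

Each cache line is visited exactly once, and both loops compare
addresses directly instead of converting to a size first.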
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This removes _xstack_call1_* trampoline as we can simply use
callx4 to jump to the interrupt handler.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Before the cross stack call is set up correctly, we cannot allow
interrupts to be triggered, or they may interfere with register
window spilling since we are clobbering registers needed for
that to work. However, there was a brief period where higher
level interrupts could fire because the code wrote to PS with
a lowered interrupt mask before raising it again. So rework
that part to avoid writing PS with an intermediate value, and
keep interrupts masked until everything is set up correctly
before they are enabled again.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Syscall entrance makes use of ODD_REG_SAVE but it does not
really need to save FPU registers as it is technically
the same thread and same context. So extract call to
FPU_REG_SAVE to interrupt handling code.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds ODD_REG_RESTORE as a counterpart to ODD_REG_SAVE.
Both the code in interrupt handling and syscall exit have
been refactored to use this new macro.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds two parameters for ODD_REG_SAVE for scratch register
and BSA pointer, thus allowing a bit more flexibility on how
it can be called.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Allow SoCs to implement a custom per-core initialization function by
selecting `CONFIG_SOC_PER_CORE_INIT_HOOK` and implementing
`soc_per_core_init_hook()`.
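A minimal sketch of how such a hook might look from the SoC side. The
`#define` stands in for the Kconfig-generated symbol, and
`per_core_init_sketch()` and `soc_core_ready` are hypothetical names
used only to make the example self-contained.

```c
#include <stdbool.h>

/* Stand-in for the symbol normally generated from Kconfig. */
#define CONFIG_SOC_PER_CORE_INIT_HOOK 1

static bool soc_core_ready; /* hypothetical SoC state */

/* SoC-provided hook, run once on each core. */
void soc_per_core_init_hook(void)
{
	soc_core_ready = true;
}

/* Hypothetical arch-side caller; the real call site is in arch code. */
static void per_core_init_sketch(void)
{
#ifdef CONFIG_SOC_PER_CORE_INIT_HOOK
	soc_per_core_init_hook();
#endif
}
```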
Signed-off-by: Maxim Adelman <imax@meta.com>
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
For code clarity, remove unnecessary `return` statements
in functions with a void return type, since they don't affect
control flow.
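A small before/after illustration of the pattern being cleaned up (the
function names are made up for this example):

```c
static int counter;

/* Before: the trailing return does nothing. */
static void bump_before(void)
{
	counter++;
	return;
}

/* After: same behavior, one statement fewer. */
static void bump_after(void)
{
	counter++;
}
```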
Signed-off-by: Pisit Sawangvonganan <pisit@ndrsolution.com>
This adds a kconfig to enable making the interrupts
non-preemptible by other interrupts. Enabling this will set
the INTLEVEL to the max non-debug level before clearing
the EXCM bit.
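As a sketch of the resulting PS value, assuming the usual PS layout
(INTLEVEL in bits 3:0, EXCM in bit 4). `MAX_NON_DEBUG_LEVEL` is a
hypothetical constant; the real value depends on the core
configuration.

```c
#include <stdint.h>

#define PS_INTLEVEL_MASK 0xFu
#define PS_EXCM_BIT      (1u << 4)

/* Hypothetical: highest non-debug interrupt level on this core. */
#define MAX_NON_DEBUG_LEVEL 5u

/*
 * Compute the PS value to use once the handler is ready to clear
 * EXCM. With the non-preemptible option, INTLEVEL is raised first so
 * that no other interrupt can be taken after EXCM is cleared.
 */
static inline uint32_t ps_enter_handler(uint32_t ps, int non_preemptible)
{
	if (non_preemptible) {
		ps = (ps & ~PS_INTLEVEL_MASK) | MAX_NON_DEBUG_LEVEL;
	}
	return ps & ~PS_EXCM_BIT;
}
```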
Signed-off-by: Christopher J. Champagne <christopher.j.champagne@intel.com>
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Newer Xtensa toolchain needs to include xtensa-types.h so that
macros in tie.h can be used without compilation errors.
The exact version needing this is unknown but first encountered
in RJ-2023.2. Tested with older toolchain and that did not
cause any compilation errors so just include xtensa-types.h
if xt-clang is used. Haven't seen newer toolchains being
generated with xcc, so skip that for now.
Note that Zephyr SDK and the public HAL in Zephyr do not provide
this file.
Signed-off-by: Anthony Giardina <anthony.giardina@intel.com>
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
arch_kernel_init() was misused for all architecture initialization code
that is done in prep_c and prior to cstart on other architectures.
arch_kernel_init() is late in the init process and comes after EARLY
init level, making xtensa have a very special boot path.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Check that the stack frame pointer is valid before dumping
any registers while handling exceptions. If the pointer is
invalid, anything it points to will probably also be
invalid. Accessing it may result in another access
violation.
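One way such a check could look, as a sketch only. The function name,
the alignment requirement and the idea of checking against a single
known-good region are assumptions for illustration, not the actual
implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: the frame must be non-NULL, 4-byte aligned and
 * fit entirely inside [region_start, region_end).
 */
static bool frame_ptr_is_valid(uintptr_t frame, size_t frame_size,
			       uintptr_t region_start, uintptr_t region_end)
{
	if (frame == 0 || (frame & 0x3u) != 0) {
		return false;
	}
	if (frame < region_start || frame > region_end) {
		return false;
	}
	return frame_size <= region_end - frame;
}
```

A caller would only dump registers through the frame pointer when this
returns true, and print a short notice otherwise.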
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
There are actually no triple faults on Xtensa. Once PS.EXCM is
set, any new exceptions keep going through the double exception
vector. However, our exception code needs to unmask
PS.EXCM to enable register window operations. So after that,
any new exceptions will go through the kernel or user vectors
depending on PS.UM. If there are continuous faults, execution may
keep ping-ponging between the double and kernel/user exception
vectors and never get resolved. Since we stash DEPC
during double exception, and the stashed one is only cleared
once the double exception has been processed, we can use
the stashed DEPC value to detect if the next exception could
be considered a triple fault. If such a case exists, simply
jump to an infinite loop, or quit the simulator, or invoke
debugger.
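The detection can be modelled roughly as below. This is one possible
reading of the description, with hypothetical names, and it assumes
that a stashed value of zero means no double exception is in flight.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stashed DEPC; zero is assumed to mean "nothing being processed". */
static uint32_t stashed_depc;

/*
 * Called on entry to the double exception path. Returns true when a
 * previous double exception has not finished processing, which is
 * what this sketch treats as a triple fault.
 */
static bool double_exc_enter(uint32_t depc)
{
	if (stashed_depc != 0) {
		return true;
	}
	stashed_depc = depc;
	return false;
}

/* Called once the double exception has been fully processed. */
static void double_exc_done(void)
{
	stashed_depc = 0;
}
```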
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds a new function xtensa_mem_kernel_has_access() to
determine if a memory region can be accessed by kernel threads.
This allows checking that memory is validly mapped before accessing
it, to avoid relying on page faults to detect invalid access.
Also fixed an issue with arch_buffer_validate() on MPU where
it may return okay even if the incoming memory region has no
corresponding entry in the MPU table.
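The arch_buffer_validate() fix amounts to denying access when no table
entry covers the region. A simplified model follows; the region table
and function name are hypothetical and do not reflect the real MPU data
structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct region {
	uintptr_t start;
	uintptr_t end;      /* exclusive */
	bool kernel_access;
};

/* Hypothetical table standing in for the MPU entries. */
static const struct region regions[] = {
	{ 0x1000, 0x2000, true },
	{ 0x4000, 0x5000, false },
};

static bool kernel_has_access_sketch(uintptr_t addr, size_t size)
{
	for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
		const struct region *r = &regions[i];

		if (addr >= r->start && addr < r->end &&
		    size <= r->end - addr) {
			return r->kernel_access;
		}
	}
	/* No entry covers the buffer: deny instead of reporting okay. */
	return false;
}
```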
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
If there are any TLB misses during interrupt handling,
the user, kernel or double exception vector will be triggered
for the miss, and DEPC and EXCCAUSE are overwritten as the TLB
misses are handled in the assembly code before execution
returns to the original vector code. Because of this, the
DEPC and EXCCAUSE read in the C handler are not the ones
that triggered the original exception (for example, a level-1
interrupt). So stash both DEPC and EXCCAUSE such that
the original cause of the exception is visible in the C handler.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Make `struct arch_esf` compulsory for all architectures by
declaring it in the `arch_interface.h` header.
After this commit, the named struct `z_arch_esf_t` is only used
internally to generate offsets, and is slated to be removed
from the `arch_interface.h` header in the future.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Namespaced the generated headers with `zephyr` to prevent
potential conflict with other headers.
Introduce a temporary Kconfig `LEGACY_GENERATED_INCLUDE_PATH`
that is enabled by default. This allows the developers to
continue the use of the old include paths for the time being
until it is deprecated and eventually removed. The Kconfig will
generate a build-time warning message, similar to the
`CONFIG_TIMER_RANDOM_GENERATOR`.
Updated the includes path of in-tree sources accordingly.
Most of the changes here are scripted; check the PR for more
info.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
We can use some extra bits available for SW implementation to
save original permissions and avoid duplicating the kernel page tables
for the default memory domain.
When duplicating the page table to a new domain, we just ensure
that the original mapping is restored.
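A sketch of the save and restore idea on a single page table entry. The
bit layout (attributes in bits 3:0, software-defined field in bits
11:6) and the helper names are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed PTE layout for this sketch. */
#define PTE_ATTR_MASK 0x0000000Fu
#define PTE_SW_SHIFT  6
#define PTE_SW_MASK   0x00000FC0u

/* Set new attributes while stashing the original ones in the SW bits. */
static inline uint32_t pte_set_attr_keep_original(uint32_t pte,
						  uint32_t new_attr)
{
	uint32_t orig = pte & PTE_ATTR_MASK;

	pte &= ~(PTE_ATTR_MASK | PTE_SW_MASK);
	return pte | (orig << PTE_SW_SHIFT) | (new_attr & PTE_ATTR_MASK);
}

/* Put the stashed attributes back, e.g. when copying to a new domain. */
static inline uint32_t pte_restore_original(uint32_t pte)
{
	uint32_t orig = (pte & PTE_SW_MASK) >> PTE_SW_SHIFT;

	pte &= ~(PTE_ATTR_MASK | PTE_SW_MASK);
	return pte | (orig & PTE_ATTR_MASK);
}
```

Because the original attributes travel with the entry itself, no second
copy of the kernel page tables is needed to recover them.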
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
A workaround to avoid icache corruption was added in commit be881d4cf2
("arch: xtensa: add isync to interrupt vector").
This patch implements a different workaround by adding custom logic to
idle entry on affected Intel ADSP platforms. To safely enter "waiti"
when clock gating is enabled, we need to ensure icache is both unlocked
and invalidated upon entry.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Simple rename to align with the kernel naming scheme. This is being
used throughout the tree, especially in the architecture code.
As this is not a private API internal to kernel, prefix it
appropriately with K_.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
arch_interface.h is for architecture and should not be
under sys/. So move it under include/zephyr/arch/.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
On Intel ADSP platforms, additional "isync" is needed in interrupt
vector to synchronize icache when core is woken up from deeper
sleep state by an interrupt. This is only needed if DSP clock
gating is enabled.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Adds the necessary code required to unconditionally save/restore the
HiFi AE registers. The macros xchal_cp1_load and xchal_cp1_store
are defined in the Xtensa HAL.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Updates the xtensa_irq_base_save_area structure to include space
for saving/restoring the HiFi AudioEngine registers used by CP1.
The starting address of these HiFi AE registers also needs to be
referenced from assembly, so it is added to the set of
symbols for which we need an offset to be auto-generated.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Some Xtensa cores do not support NMI, so XCHAL_HAVE_NMI=0 and
XCHAL_NMILEVEL won't be defined at all, causing
arch/xtensa/include/xtensa-asm2-s.h to fail with a compilation error.
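The general shape of the guard can be sketched as follows. The
`#define` stands in for a core configuration without NMI, and
`nmi_level_or_negative()` is a hypothetical helper, not code from the
header.

```c
/* Stand-in for a core configuration without NMI support. */
#define XCHAL_HAVE_NMI 0
/* XCHAL_NMILEVEL is deliberately left undefined, as on such cores. */

/*
 * Only touch XCHAL_NMILEVEL when the core has an NMI; the
 * preprocessor drops the other branch, so the missing macro is
 * never seen by the compiler.
 */
static inline int nmi_level_or_negative(void)
{
#if XCHAL_HAVE_NMI
	return XCHAL_NMILEVEL;
#else
	return -1;
#endif
}
```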
Fixes: #67855
Signed-off-by: Maciej Kusio <maciejkusio@meta.com>
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
There is no need to sync after every xtlb invalidation. Sync only
once, after all TLB autofill ways have been invalidated.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
wsr.UPPERCASE can lead to compiler errors when UPPERCASE matches
a macro defined in the special register header file.
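One way the collision can arise is when the instruction text passes
through the C preprocessor. In the sketch below, `EPC1` is defined as a
number purely as a stand-in for a special register header, and the
stringifying macros are illustrative.

```c
#define STR(x)  #x
#define XSTR(x) STR(x) /* expands macros in x before stringifying */

/* Stand-in for a special register header defining EPC1 as a number. */
#define EPC1 177

/* The uppercase name is macro-expanded, giving an invalid mnemonic. */
static const char *const upper_form = XSTR(wsr.EPC1); /* "wsr.177" */

/* The lowercase name matches no macro and is left alone. */
static const char *const lower_form = XSTR(wsr.epc1); /* "wsr.epc1" */
```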
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
rsr.UPPERCASE can lead to compiler errors when UPPERCASE matches
a macro defined in the special register header file.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
This follows the idea to remove any z_ prefix. Since MMU has
a large number of these, separate out these changes into one
commit to ease review effort.
Since these no longer have the z_ prefix, they need proper doxygen
documentation. So add that too.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
z_xtensa_dump_stack() and z_xtensa_exccause() are both arch
internal functions that should not be exposed in public API.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Header files under arch/xtensa/include are considered internal
to architecture. There is really no need for two places to
house architecture internal header files.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Simply to provide some consistency in file naming under
arch/xtensa.
These are all internally used files and are not public.
So there is no need to provide a deprecation path for
them.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
xtensa-asm2.h only contains the function declaration of
xtensa_init_stack() which is only used in one file. So
make the actual implementation a static function in that
file. Also there is really no need to expose stack init
function as arch public API. So remove xtensa-asm2.h.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>