Commit graph

642 commits

Author SHA1 Message Date
Daniel Leung
482f9fbbfb xtensa: pad IRQ stack frame to be 16 bytes aligned
Xtensa ISA mentions that stack always needs to be aligned on
16 bytes. So we need to pad the stack frame to be also 16 bytes
aligned when dealing with interrupts. Or else the stack would
not be 16 bytes aligned when we add the stack frame to stack
during interrupt handling.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
7229b00433 xtensa: window may overflow during irq stack crossing
When crossing stack during interrupt handling, we do two call4
to pivot to the interrupt stack, with arguments to these two
call4 stashed in A6, A10 and A11. However, A4-A11 may be marked
as invalid in the register file, and accessing them would
result in window overflowing. At that point, A0 and A1 are not
setup to handle window overflows, and will result in registers
being stashed in incorrect location, resulting in incorrect
value being restored during window underflowing. So move around
the code a bit to restore A0 and A1 properly before accessing
A4-A11.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
9610f1a229 xtensa: mmu: do not force small vector table
Do not force CONFIG_XTENSA_SMALL_VECTOR_TABLE_ENTRY when MMU
is enabled. It is up to the SoC to decide whether they need
this.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
774b73c8a4 xtensa: userspace: swap page table at end of cross stack call
The logic to swap page tables or MPU entries is moved to the end
of cross stack call, since it is still running in the interrupt
stack instead of the thread stack. The old logic was calling
the swap functions in the outgoing thread stack, which is not
desirable.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
e0752c2938 xtensa: userspace: swap page table earlier in xtensa_switch
When switching page table and MPU enties, we need to call
corresponding functions via call4 which would move the register
window. So we need to do that earlier before we start saving
context into stack by manipulating stack pointer manually which
definitely would interfere with window spilling.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
6874cdcf19 xtensa: fix unneeded cache invalidation in arch_cohere_stacks()
During arch_cohere_stacks(), the used portion of the outgoing
thread is cache flushed, and then the unused portion of cache
invalidated. However, this results in the cache line at
the stack pointer being flushed and then invalidated due to
how sys_cache_data_*() operates. If we are swapping back to
the same thread (e.g. after handling interrupt), this cache
line will need to be retrieved again from main memory since
it has already been invalidated. This creates unnecessary
data move between cache and main memory. So create our own
version of cache flushing and invalidation routines just for
arch_cohere_stacks(). Bouns is that these work directly with
bounding addresses and skips the size calculation which should
save a little bit amount of execution time.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
9fb87cf586 xtensa: remove second cross stack trampoline
This removes _xstack_call1_* trampoline as we can simply use
callx4 to jump to the interrupt handler.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
3f8e4d7400 xtensa: rework IRQ masking during IRQ entrance
Before cross stack call is setup correctly, we cannot allow
interrupts to be triggered or it may interfere with register
window spilling since we are clobbering registers needed for
that to work. However, there was a brief period where higher
level interrupts could fire due to code writing to PS with
lowered interrupt mask before raising it again. So rework
that part to avoid writing PS with intermediate value, and
now we mask interrupt until everything is setup correctly
before interrupt is enabled again.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
dcf75319e2 xtensa: set pointer to privileged stack only if user mode
When initializing the stack at thread creation, we should not
set the pointer to privileged stack pointer yet as the thread
can be a kernel thread. Only when a thread is transitioning to
user mode, then we need to set the pointer to point to
the privileged stack. This is a purely semantic change and
should not affect any functionality.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
d725f37d5e xtensa: userspace: calculate PC earlier at syscall entry
This moves the calculation of the return PC earlier in syscall
entry. There is no need to stash it into BSA, load it, do
the calculation and save it back. We can do the calculation
first and save it in BSA at the same time.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
1203ed05d9 xtensa: syscall: fix setting up PS for window spilling
We should not perform a straight OR operation on INTLEVEL as
we have no idea what existing PS.INTLEVEL is. Also, to avoid
any interferences, we disable interrupts very early during
syscall entrance. So we can remove the OR operation as
PS.INTLEVEL will still have all interrupts masked. Note that
we do not really need to OR PS_WOE into PS as we currently
only support windowed ABI which must have PS_WOE set in PS
anyway.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
7f693679b8 xtensa: move FPU_REG_SAVE out of ODD_REG_SAVE
Syscall entrance makes use of ODD_REG_SAVE but it does not
really need to save FPU registers as it is technically
the same thread and same context. So extract call to
FPU_REG_SAVE to interrupt handling code.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
c13c499557 xtensa: add ODD_REG_RESTORE
This adds ODD_REG_RESTORE as a counterpart to ODD_REG_SAVE.
Both the code in interrupt handling and syscall exit have
been refactored to use this new macro.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
fc85aface1 xtensa: parameterized ODD_REG_SAVE
This adds two parameters for ODD_REG_SAVE for scratch register
and BSA pointer, thus allowing a bit more flexibility on how
it can be called.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
10f8882eec xtensa: userspace: remove saving HiFi registers for syscalls
This removes the call to _xtensa_hifi_save() to save the HiFi
registers during syscalls. During syscall, we are not doing
context switching, and technically it is still the same thread.
There is no need to save HiFi registers.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
d9f6882071 xtensa: rsync before reading ZSR_FLUSH
The Xtensa ISA reference manual says to do rsync after wsr to
make sure register is updated before rsr. So do that.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
a4367eb514 xtensa: remove CONFIG_XTENSA_INVALIDATE_MEM_DOMAIN_TLB_ON_SWAP
Remove CONFIG_XTENSA_INVALIDATE_MEM_DOMAIN_TLB_ON_SWAP as it is
remnant from early MMU enabling work which is not needed as
the page table code is different from early version where
the PTEVADDR would be the same for all memory domains.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
ee9ddb0eeb xtensa: remove CALC_PTEVADDR_BASE and PRELOAD_PTEVADDR macros
These assembly macros are not being used at all. So remove
them.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-17 00:57:19 +02:00
Daniel Leung
833bb667b2 xtensa: fix thread_page_tables_get being unused
thread_page_tables_get() is only used when userspace is
enabled. So move it with userspace #ifdef, or else
compiler would complain about it being unused.

Fixes #88421

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-04-15 19:09:52 +02:00
Luca Burelli
1b48f74346 llext: avoid redundant 'ldr_parm' checks
Since commit 0aa6b1c9de, the 'ldr_parm' pointer is guaranteed to be
valid inside all the functions of the llext_load() call tree.

This commit fixes the only exception of llext_copy_strings(), which was
not passed the 'ldr_parm' pointer, and remove the redundant checks.

No functional change is intended by this commit.

Signed-off-by: Luca Burelli <l.burelli@arduino.cc>
2025-03-17 19:58:15 +01:00
Daniel Leung
c925b0ecd5 xtensa: mmu: fix incorrect caching attrs on double mapping
Inside map_memory() with double mapping enabled, we should not
be mapping the memory with the incoming attributes as-is since
the incoming address may be on un-cached region but with
caching attribute. So we need to sanitize the attributes
according to the incoming address.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2025-03-14 01:01:32 +01:00
Guennadi Liakhovetski
c29e0d3480 llext: xtensa: disable relative relocations for pre-located
When building pre-located LLEXTs of ET_DYN type (shared ELF object)
on Xtensa, all the R_XTENSA_RELATIVE relocations are already correct,
the current code actually breaks them by mobing the value from the
target address-space to a storage range address. Simply removing the
recalculation solves the issue.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2025-02-20 21:05:37 +01:00
Nicolas Pitre
46aa6717ff Revert "arch: deprecate _current"
Mostly a revert of commit b1def7145f ("arch: deprecate `_current`").

This commit was part of PR #80716 whose initial purpose was about providing
an architecture specific optimization for _current. The actual deprecation
was sneaked in later on without proper discussion.

The Zephyr core always used _current before and that was fine. It is quite
prevalent as well and the alternative is proving rather verbose.
Furthermore, as a concept, the "current thread" is not something that is
necessarily architecture specific. Therefore the primary abstraction
should not carry the arch_ prefix.

Hence this revert.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2025-01-10 07:49:08 +01:00
Andy Ross
8b39d4a613 arch/xtensa: Add build-time validation of cache line kconfig
Xtensa cache line sizes aren't an obtuse area of pedantry like they
are in x86.  Different cores already in Zephyr are already using
variant cache line sizes (64 and 128 bytes are both common).

And I tripped over this by using the wrong value because the kconfig
was being inherited (incorrectly) from a default somewhere.

Xtensa exposes the correct value in core-isa.h (well, unless the
toolchain/hal gets messed up).  Add a check to make sure that our
platform kconfig gets it right.

Note that qemu/dc233c was already getting this wrong, leaving the
value at the kconfig default of zero.  That was benign (qemu doesn't
provide any cache emulation for incoherent DMA), but needs to be
fixed.

Signed-off-by: Andy Ross <andyross@google.com>
2025-01-06 20:33:04 +01:00
Guennadi Liakhovetski
00b6ae5bbc llext: xtensa: fix RELATIVE relocations
With dynamically linked / shared ELF objects the displacement
between file offsets and memory addresses can differ between
sections. Therefore we cannot use .text for all such relocations and
have to locate the respective section instead.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-12-17 20:55:15 +01:00
Guennadi Liakhovetski
92287c4e8e llext: xtensa: (cosmetic) simplify a calculation
To obtain a section header reference we don't need to calculate its
location in the ELF image, we already have a pointer to all section
headers, just use them.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-12-17 20:55:15 +01:00
Guennadi Liakhovetski
024f43b1f0 llext: move a calculation to a more logical location
Currently llext_link_plt() calculates offsets to relocation addresses
relative to the .text section and passes them to
arch_elf_relocate_global() and arch_elf_relocate_local(), where then
.text memory address is added to them. Instead it's more logical to
concentrate the entire calculation in the caller, which then also
removes the assumption, that all sections have the same VMA - LMA
offset from those functions.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-12-17 20:55:15 +01:00
Guennadi Liakhovetski
acbd183b72 llext: fix .bss on heap
With writable LLEXT but without pre-assigned section addresses
.bss section offset in its ELF header will not match .bss eventual
location as it's allocated on the heap. Use llext_loaded_sect_ptr()
to get a correct address in both cases.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-12-17 20:55:15 +01:00
Daniel Leung
42b209f816 xtensa: optimize syscall helper functions
The original idea of the syscall helpers is to workaround
an issue where the compiler could not correctly model register
usage with inline functions. It was supposed to be only used
when there are more then 3 arguments to syscalls. However,
during the original MMU code development, the helper got
expanded to cover "less then 2 arguments syscalls" as a way
for debugging syscall handling code and was never removed.
So fix that now by limiting the helper for syscalls with
more than 3 arguments.

Moreover, instead of using one helper with 6 arguments, now
we have separate implementations for both 4 and 5 arguments
syscall helpers. Now the compiler does not have to generate
code to always handle 6 arguments which saves a few code bytes
as there is no need to pass extra zeroes as arguments.

This has been verified to work with xt-clang RI-2022.10.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-12-04 14:16:15 -05:00
Yong Cong Sin
b1def7145f arch: deprecate _current
`_current` is now functionally equals to `arch_curr_thread()`, remove
its usage in-tree and deprecate it instead of removing it outright,
as it has been with us since forever.

Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
2024-11-23 20:12:24 -05:00
Guennadi Liakhovetski
4f0cb90c59 LLEXT: fix advanced uses of detached sections
When detached sections are used, STB_GLOBAL relocations also have to
be processed depending on the relocation type. This commit unified
STB_GLOBAL and STB_LOCAL flows by making them use the same relocation
type parser, adds support for R_XTENSA_GLOB_DAT and R_XTENSA_JMP_SLOT
type relocations on Xtensa and fixes STT_SECTION address calculation
for such sections.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-11-16 15:28:00 -05:00
Guennadi Liakhovetski
ce32eea24a LLEXT: Xtensa: fix logging level
Use the same logging level as in the rest of LLEXT.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-11-16 15:28:00 -05:00
Yong Cong Sin
de347a4e07 init: support per-core init hook
Allow SoC to implement their custom per-core initialization function by
selecting `CONFIG_SOC_PER_CORE_INIT_HOOK` and implement
`soc_per_core_init_hook()`.

Signed-off-by: Maxim Adelman <imax@meta.com>
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
2024-11-16 14:04:25 -05:00
Daniel Leung
cdb9166b81 cmake: toolchain/xcc,xt-clang: env vars for multiple cores
To use Xtensa toolchain, various environment variables must be
set so the executables can find necessary files and what core
to compile for. This becomes an annoyance when you have to
test multiple boards with different cores. You have to use
one set of environment variables per core. Twister cannot test
them all in one setting, and it is especially annoying doing
west builds. So enhance the environment variables handling so
that TOOLCHAIN_VER and XTENSA_CORE can be replaced by
TOOLCHAIN_VAR_<board> and XTENSA_CORE_<board> where <board>
is the normalized board target (think <board>.yaml). CMake
will then figure out the core ID for the toolchain to use.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-10-31 09:26:00 -05:00
Tahsin Mutlugun
94f307ab75 arch: xtensa: rename processor state save and restore calls
_xtos_pso_savearea and _xtos_core_restore_nw have been renamed to
xthal_pso_savearea and xthal_core_restore_nw in recent Xtensa
toolchains. Rename them in reset_vector.S to prevent linker errors
when building Zephyr with Xtensa toolchain.

Signed-off-by: Tahsin Mutlugun <Tahsin.Mutlugun@analog.com>
2024-10-29 07:07:42 -05:00
Guennadi Liakhovetski
18419b618f llext: xtensa: fix relocations with multiple SLOT0_OP
Support for R_XTENSA_SLOT0_OP was implemented with a relocation table
test case, containing only one entry at offset 0. For multiple
entries .r_addend has to be taken into account.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-10-17 10:49:50 -04:00
Daniel Leung
4f4cc4de08 xtensa: fix typo userpsace to userspace
s/userpsace/userspace/

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-10-08 18:10:03 -04:00
Daniel Leung
8638e411e7 xtensa: fix insecure userspace warning wording
"in behave of" -> "on behalf of", which is more accurate of
what is actually happening in the code.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-10-03 11:41:40 +01:00
Guennadi Liakhovetski
d676d469bd LLEXT: Xtensa: add support for L32R relocation
When building LLEXT for Xtensa with custom sections the compiler
can leave unresolved references in them. Then this has to be done by
the LLEXT core during linking. This commit adds linking support for
the L32R Xtensa instruction.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
2024-09-29 21:21:24 +02:00
Daniel Flodin
746c59c82a arch: kernel: lib: toolchain: Standardize TLS keyword
Up until now, the `__thread` keyword has been used for declaring
variables as Thread local storage. However, `__thread` is a GNU
specific keyword which thus limits compatibility with other
toolchains (for instance IAR).

This PR intoduces a new macro `Z_THREAD_LOCAL` which expands to the
corresponding C11, C23 or C++11 standard keyword based on the standard
that is specified during compilation, else it uses the old `__thread`
keyword.

Signed-off-by: Daniel Flodin <daniel.flodin@iar.com>
2024-09-23 10:01:48 +02:00
Daniel Leung
efb2a354a0 xtensa: coredump: support dumping privilege stack
Adds the bits to support dumping privilege stack during
coredump.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-09-21 11:29:39 +02:00
Pisit Sawangvonganan
9955b0bd6f style: arch: remove unnecessary return statements
For code clarity, remove unnecessary `return` statements
in functions with a void return type they don't affect control flow.

Signed-off-by: Pisit Sawangvonganan <pisit@ndrsolution.com>
2024-09-20 11:06:55 +02:00
Anas Nashif
7e225efab7 arch: initialize irq_offload during boot, do not use SYS_INIT
Do not use SYS_INIT for initializing irq_offload when enabled, instead
using a new interface that is called during the boot process for all
architectures.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-09-17 20:05:22 -04:00
Anas Nashif
6ec74d02b6 cache: add new interface arch_cache_init() for initializing cache
Add a new call for initializing cache on architectures that need that.
Avoid using SYS_INIT for this and instead call the hook in a fixed place
and run if implemented.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-09-17 20:05:22 -04:00
Christopher J. Champagne
63051bf71a xtensa: add kconfig to allow non-preemptible interrupts
This adds a kconfig to enable making the interrupts
non-preemptible by other interrupts. Enabling this will set
the INTLEVEL to the max non-debug level before clearing
the EXCM bit.

Signed-off-by: Christopher J. Champagne <christopher.j.champagne@intel.com>
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-09-17 14:54:19 -04:00
Daniel Leung
da5f7e1816 xtensa: mpu: update hardware if manipulating current domain
If adding/removing to the domain of the current running
thread, we need to update the hardware MPU regions or else
the addition or removal would not be reflected to current
running thread.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-09-16 09:55:53 +02:00
Daniel Leung
6bd0dcf920 xtensa: mpu: make sure write to MPU regions is atomic
This adds a spinlock to make sure writing to hardware MPU
regions is atomic, and cannot be interrupted until all
regions are written to hardware.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-09-16 09:55:53 +02:00
Anas Nashif
e260d03686 init: introduce soc and board hooks
Introduce soc and board hooks to replace arch specific code
and replace usages of SYS_INIT for platform initialization.

include/zephyr/platform/hooks.h introduces the hooks to be implemented
by boards and SoCs.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-09-09 10:07:33 +02:00
Daniel Leung
55ee97c7d2 xtensa: implements arch_thread_priv_stack_space_get
This implements arch_thread_priv_stack_space_get() so this can
be used to figure out how much privileged stack space is used.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-08-28 06:50:30 -04:00
Daniel Leung
1dc02fcbda xtensa: initialize privileged stack during thread init
This adds the bits to initialize the privileged stack for
each thread during thread initialization. This prevents
information leaking if the thread stack is reused, and
also aids in calculating stack space usage during system
calls.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-08-28 06:50:30 -04:00