The interface to flush fpu is not unique to one architecture, make it a
generic, optional interface that can be implemented (and overriden) by a
platform.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Currently the lazy fpu saving algorithm in arm64 is using the fpu_owner
pointer from the cpu structure to understand the owner of the context
in the cpu and save it in case someone different from the owner is
accessing the fpu.
The semantics for memory consistency across smp systems is quite prone
to errors and reworks on the current code might miss some barriers that
could lead to inconsistent state across cores, so to overcome the issue,
use atomics to hide the complexity and be sure that the code will behave
as intended.
While there, add some isb barriers after writes to cpacr_el1, following
the guidance of ARM ARM specs about writes on system registers.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Change for loops of the form:
for (i = 0; i < CONFIG_MP_NUM_CPUS; i++)
...
to
unsigned int num_cpus = arch_num_cpus();
for (i = 0; i < num_cpus; i++)
...
We do the call outside of the for loop so that it only happens once,
rather than on every iteration.
Signed-off-by: Kumar Gala <kumar.gala@intel.com>
Following zephyr's style guideline, all if statements, including single
line statements shall have braces.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
In order to bring consistency in-tree, migrate all arch code to the new
prefix <zephyr/...>. Note that the conversion has been scripted, refer
to zephyrproject-rtos#45388 for more details.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
Make it optimal without the need for an SVC/exception roundtrip on
every context switch. Performance numbers from tests/benchmarks/sched:
Before:
unpend 85 ready 58 switch 258 pend 231 tot 632 (avg 699)
After:
unpend 85 ready 59 switch 115 pend 138 tot 397 (avg 478)
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Every va_start() currently triggers a FPU access trap if FPU is not
already used. This is due to the fact that va_start() must copy FPU
registers that are used for float argument passing into the va_list
object. Flushing the FPU context to its owner and granting access to
the current thread is wasteful if this is only for va_start(),
especially since in most cases there are simply no FP arguments
being passed by the caller.
This is made even worse with exception code (syscalls, IRQ handlers,
etc.) where the exception code has to be resumed with interrupts
disabled upon FPU access as there is no provision for preserving an
interrupted exception mode's FPU context.
Fix those issues by simply simulating the sequence of STR instructions
that the va_start() generates without actually granting FPU access.
We limit ourselves only to exception context to keep changes to a
minimum for now.
This also allows for reverting the ARM64 exception in the nested IRQ
test as it now works properly even if FPU_SHARING is enabled.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
This adds FPU sharing support with a lazy context switching algorithm.
Every thread is allowed to use FPU/SIMD registers. In fact, the compiler
may insert FPU reg accesses in anycontext to optimize even non-FP code
unless the -mgeneral-regs-only compiler flag is used, but Zephyr
currently doesn't support such a build.
It is therefore possible to do FP access in IRS as well with this patch
although IRQs are then disabled to prevent nested IRQs in such cases.
Because the thread object grows in size, some tests have to be adjusted.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>