Commit graph

5896 commits

Author SHA1 Message Date
Nicolas Pitre
247d2c8e3b riscv: move the tp register from caller-saved to callee-saved
This is a per-thread register that gets updated only when context
switching. No need to load and save it on every exception entry.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
50c0df1bd2 riscv: align struct __esf properly
The minimum stack alignment is 16. Therefore, the stack space to store
a struct __esf object must be rounded up to the next 16-byte boundary.

It is not sufficient to do the rounding on the __z_arch_esf_t_SIZEOF
definition. When the stack is constructed in arch_new_thread() it is
also necessary to do the rounding there too.

Let's make the structure itself carry the alignment attribute instead to
make it work in all cases.

While at it, remove the unused _K_THREAD_NO_FLOAT_SIZEOF definition.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
df852a0b77 riscv: implement CONFIG_IRQ_OFFLOAD_NESTED
It can easily be done now, so why not. Suffice to increment the nested
count like with actual IRQs.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
cb5221c087 riscv: irq_offload: simpler implementation
Get rid of all those global variables and IRQ locking.
Use the regular IRQ exit path to let tests validate preemption properly.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
a50c433012 riscv: exception code mega simplification and optimization
Complete revamp of the exception entry code, including syscall handling.
Proper syscall frame exception trigger. Many correctness fixes, hacks
removal, etc. etc.

I tried to make this into several commits, but this stuff is all
inter-related and a pain to split.

The diffstat summary:

 14 files changed, 250 insertions(+), 802 deletions(-)

Binary size (before):

   text	   data	    bss	    dec	    hex	filename
   1104	      0	      0	   1104	    450	isr.S.obj
     64	      0	      0	     64	     40	userspace.S.obj

Binary size (after):

   text	   data	    bss	    dec	    hex	filename
    600	      0	      0	    600	    258	isr.S.obj
     36	      0	      0	     36	     24	userspace.S.obj

Run of samples/userspace/syscall_perf (before):

*** Booting Zephyr OS build zephyr-v3.0.0-325-g3748accae018  ***
Main Thread started; qemu_riscv32
Supervisor thread started
User thread started
Supervisor thread(0x80010048):       384 cycles	     509 instructions
User thread(0x80010140):           77312 cycles	   77437 instructions

Run of samples/userspace/syscall_perf (after):

*** Booting Zephyr OS build zephyr-v3.0.0-326-g4c877a2753b3  ***
Main Thread started; qemu_riscv32
Supervisor thread started
User thread started
Supervisor thread(0x80010048):       384 cycles	     509 instructions
User thread(0x80010138):            7040 cycles     7165 instructions

Yes, that's more than a 10x speed-up!

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
bfb7919ed0 riscv: better abstraction for register-wide FP load/store opcodes
Same rationale as preceding commit. Let's create pseudo-instructions in
assembly scope to make the code more uniform and readable.

Furthermore the definition of COPY_ESF_FP() was wrong as the width of
floating point registers vary not according to CONFIG_64BIT but
CONFIG_CPU_HAS_FPU_DOUBLE_PRECISION. It is therefore wrong to use
lr/sr (previously RV_OP_LOADREG/RV_OP_STOREREG) and a regular temporary
register to transfer such content.

Note: There are far more efficient ways to copy FP context around but
      such optimisations will come separately.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
1fd79b3ef4 riscv: better abstraction for register-wide load/store opcodes
Those are prominent enough that having RV_OP_LOADREG and RV_OP_STOREREG
shouting at you all over the place is rather unpleasant and bad taste.

Let's create pseudo-instructions of our own with assembler macros
rather than preprocessor defines and only in assembly scope.
This makes the asm code way more uniform and readable.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
94f39e5a80 riscv: fix wrong access width in assembly code
The thread->base.user_options field is an uint8_t. Access it using lb.
A "copy" of it is made into __esf.fp_state. Make that field an uint8_t
too and access it with lb/sb.

_callee_saved.fcsr is an uint32_t. Access it with lw/sw.
Ditto for is_user_mode.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
9ed17943b9 riscv: use simplest asm expression when possible
Let's take advantage of assembler pseudoinstructions:

- convert `addi rd, rs, 0` to `mv rd, rs`
- convert `jal x0, somewhere` to `j somewhere`
- convert `csrrs x0, csrreg, rs` to `csrs csrreg, rs`
- convert `fscsr x0, rs` to `fscsr rs`

And simplify zero offsets to simply 0.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
f2bb937547 Revert "arch/riscv: Get current CPU properly instead of assuming single CPU"
This reverts commit 8686ab5472.

The purpose of this commit will be reintroduced later on top of
a cleaner codebase.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
442ab22bdc Revert "arch/riscv: Use arch_switch() for context swap"
This reverts commit be28de692c.

The purpose of this commit will be reintroduced later on top of
a cleaner codebase.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
13a7047ea9 Revert "arch/riscv: Do not use irq_lock() on arch_irq_offload"
This reverts commit b0458201cc.

The purpose of this commit will be reintroduced later on top of
a cleaner codebase.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-21 07:28:05 -04:00
Nicolas Pitre
47e4a4487f arm64: simplify the code around the call to z_get_next_switch_handle()
Remove the special SMP workaround and the extra wrapper.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-18 13:32:49 -04:00
Nazar Kazakov
f483b1bc4c everywhere: fix typos
Fix a lot of typos

Signed-off-by: Nazar Kazakov <nazar.kazakov.work@gmail.com>
2022-03-18 13:24:08 -04:00
Julien Massot
1e538607b8 arch: arm: aarch32: Do not relocate vector table on ARMv8-R
ARMv8-R allows to set the vector table address using VBAR
register, so there is no need to relocate it.

Move away vector_table setting from reset.S and move it to
relocate vector table function as it's done for Cortex-M
CPU.

Signed-off-by: Julien Massot <julien.massot@iot.bzh>
2022-03-17 15:57:15 -05:00
Jaxson Han
7ea0591d30 arm64: v8r: Enable AARCH64_IMAGE_HEADER by default
Enable AARCH64_IMAGE_HEADER by default and fix the relevant warning

Signed-off-by: Jaxson Han <jaxson.han@arm.com>
2022-03-16 09:19:44 -05:00
Corey Wharton
72afd96c9b arch: riscv: ensure fcsr is cleared on thread start or FPU enable
Ensure fcsr is always initially cleared for FPU enabled threads.

Signed-off-by: Corey Wharton <xodus7@cwharton.com>
2022-03-16 10:25:50 +01:00
Nicolas Pitre
2ef47509c3 arm64: simplify user mode transition code
It is not necessary to go through the full exception exit code.
This is simpler, smaller and faster.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-15 22:24:22 -04:00
Nicolas Pitre
8affac64a7 arm64: improved arch_switch() implementation
Make it optimal without the need for an SVC/exception  roundtrip on
every context switch. Performance numbers from tests/benchmarks/sched:

Before:
unpend   85 ready   58 switch  258 pend  231 tot  632 (avg  699)

After:
unpend   85 ready   59 switch  115 pend  138 tot  397 (avg  478)

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-15 22:24:22 -04:00
Nicolas Pitre
bd941bcc68 arm64: implement CONFIG_IRQ_OFFLOAD_NESTED
It can easily be done now, so why not. Suffice to increment the nested
count like with actual IRQs.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-14 22:03:05 -04:00
Nicolas Pitre
90fcef4254 arm64: irq_offload: simpler implementation
Get rid of all those global variables and scheduler locking.
Use the reguler IRQ exit path to let tests properly validate preemption.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-14 22:03:05 -04:00
Nicolas Pitre
9d0bcfa884 arm64: isr_wrapper.S: tiny assembly optimization
Save one instruction in the ISR hot path.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-03-14 22:03:05 -04:00
Nazar Kazakov
9713f0d47c everywhere: fix typos
Fix a lot of typos

Signed-off-by: Nazar Kazakov <nazar.kazakov.work@gmail.com>
2022-03-14 20:22:24 -04:00
Jaxson Han
65d7e64e06 board: arm64: fvp_baser_aemv8r: Fix misc SMP issues
Add CONFIG_SMP to fvp_baser_aemv8r_smp board.
Fix compile warnings by adding missing header file in arm_mpu.c.

Signed-off-by: Jaxson Han <jaxson.han@arm.com>
2022-03-11 11:00:05 +01:00
Jaxson Han
3122b9ed10 arm64: smp: Fix broadcast_ipi issue
This commit mainly fixes the broadcast_ipi issue when one core broadcast
ipi to other cores using gic_raise_sgi. The issue doesn't affect the
functionality of Zephyr SMP but will happen when Zephyr runs on Xen.
Suppose Xen provides 4 CPUs to the Zephyr guest, for example, when cpu0
broadcasts ipi to the rest of the cores, the mask should be 0xE(0b1110),
but for now, Zephyr will send 0xFFFE. So for Xen, it will receive a
target list containing many invalid CPUs which don't exist. My solution
is: to generate the target list according to the online CPUs.

Signed-off-by: Jaxson Han <jaxson.han@arm.com>
2022-03-11 11:00:05 +01:00
Julien Massot
7a510245c9 arch: arm: cortex_a_r: Add support to start in HYP mode
The ARMv8-R processors always boot into Hyp mode (EL2)

To enter EL1:
Program the HACTLR register because it defaults
to only allowing EL2 accesses. HACTLR controls
whether EL1 can access memory region registers and CPUACTLR.
Program the SPSR before entering EL1.
Other registers default to allowing accesses at EL1 from reset.
Set VBAR to the correct location for the vector table.
Set ELR to point to the entry point of the EL1 code and call ERET.

Signed-off-by: Julien Massot <julien.massot@iot.bzh>
2022-03-11 10:59:48 +01:00
Julien Massot
59aae63f51 arch: arm: Add support for Cortex-R52
Cortex-R52 is an ARMv8-R processor with AArch32 profile.

Signed-off-by: Julien Massot <julien.massot@iot.bzh>
2022-03-11 10:59:48 +01:00
Fabio Baltieri
cfa0205c6f arm: cortex-m: add an option to trap unaligned access
Cortex-M mainline cores have an option to generate a fault on word and
halfword unaligned access [1], this patch adds a Kconfig option for
enabling the feature.

[1] https://developer.arm.com/documentation/dui0552/a/cortex-m3-peripherals/system-control-block/configuration-and-control-register

Signed-off-by: Fabio Baltieri <fabiobaltieri@google.com>
2022-03-10 13:47:41 -05:00
Gerard Marull-Paretas
a87c811ec9 arch: x86: use DEVICE_DT_GET_ONE
Improve code by using DEVICE_DT_GET_ONE instead of device_get_binding,
since the intel_vt_d device instance can be obtained at compile time.

Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
2022-03-10 13:45:59 -05:00
Gerard Marull-Paretas
dffaf5375c kconfig: tweak Kconfig prompts
Tweak some Kconfig prompts after the removal of "Enable...".

Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
2022-03-09 15:35:54 +01:00
Gerard Marull-Paretas
95fb0ded6b kconfig: remove Enable from boolean prompts
According to Kconfig guidelines, boolean prompts must not start with
"Enable...". The following command has been used to automate the changes
in this patch:

sed -i "s/bool \"[Ee]nables\? \(\w\)/bool \"\U\1/g" **/Kconfig*

Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
2022-03-09 15:35:54 +01:00
Jaxson Han
fd231e32e9 arm64: Fix booting issue with FVP V8R >= 11.16.16
In the Armv8R AArch64 profile[1], the Armv8R AArch64 is always in secure
mode. But the FVP_BaseR_AEMv8R before version 11.16.16 doesn't strictly
follow this rule. It still has some non-secure registers
(e.g. CNTHP_CTL_EL2).

Since version 11.16.16, the FVP_BaseR_AEMv8R has fixed this issue. The
CNTHP_XXX_EL2 registers have been changed to CNTHPS_XXX_EL2. So the
FVP_BaseR_AEMv8R (version >= 11.16.16) cannot boot Zephyr. This patch
will fix it.

[1] https://developer.arm.com/documentation/ddi0600/latest/

Signed-off-by: Jaxson Han <jaxson.han@arm.com>
Change-Id: If986f34dc080ae7a8b226bba589b6fe616a4260b
2022-03-08 11:09:13 +01:00
Krzysztof Chruscinski
47ae656cc1 all: Deprecate UTIL_LISTIFY and replace with LISTIFY
UTIL_LISTIFY is deprecated. Replacing it with LISTIFY.

Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
2022-03-08 11:03:30 +01:00
Ederson de Souza
2aab236c12 arch/riscv: Add IPI support
Use CLINT to send interrupts to another CPU. SMP support is kinda
incomplete without it.

This patch only enables it for riscv-privilege platforms - specifically,
"virt" one.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2022-02-25 19:13:50 -05:00
Ederson de Souza
b0458201cc arch/riscv: Do not use irq_lock() on arch_irq_offload
With SMP, it's the wrong with to do, according to
3b145c0d4b.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2022-02-25 19:13:50 -05:00
Ederson de Souza
d9ab35577b arch/riscv: Boot secondary CPUs for SMP support
Secondary CPUs are now initialised and made available to the system. If
the system has more CPUs than configured via CONFIG_MP_NUM_CPUS, those
are still left looping as before.

Some implementations of `soc_interrupt_init` also changed to use
`arch_irq_lock` instead of `irq_lock`.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2022-02-25 19:13:50 -05:00
Ederson de Souza
be28de692c arch/riscv: Use arch_switch() for context swap
Enable `arch_switch()` as preparation for SMP support. This patch
doesn't try to keep support for old style context swap - only switch
based swap is supported, to keep things simple.

A fair amount of refactoring was done in this patch, specially regarding
the code that decides what to do about the ISR. In RISC-V, ECALL
instructions are used to signalize several events, such as user space
system calls, forced syscall, IRQ offload, return from syscall and
context switch. All those handled by the ISR - which also handles
interrupts. After refactor, this "dispatching" step is done at the
beginning of ISR (just after saving generic registers).

As with other platforms, the thread object itself is used as the thread
"switch handle" for the context swap.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2022-02-25 19:13:50 -05:00
Ederson de Souza
8686ab5472 arch/riscv: Get current CPU properly instead of assuming single CPU
isr.S code currently gets CPU information from global `_kernel` assuming
there's only one CPU. In order to prepare for upcoming SMP support,
change code to actually get current CPU information.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2022-02-25 19:13:50 -05:00
Ederson de Souza
fdf7c96994 arch/riscv: Implement arch_curr_cpu()
Implement function that will be necessary for upcoming SMP support.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2022-02-25 19:13:50 -05:00
Bradley Bolen
c0dd594d4d arch: arm: aarch32: Change CPU_CORTEX_R kconfig option
Change the CPU_CORTEX_R kconfig option to CPU_AARCH32_CORTEX_R to
distinguish the armv7 version from the armv8 version of Cortex-R.

Signed-off-by: Bradley Bolen <bbolen@lexmark.com>
2022-02-23 08:14:15 -06:00
Tomasz Bursztyka
0c9ce49d2a arch/x86: Fix MSI MAP destination
When Zephyr runs directly on actual hardware, it will be always
directing MSI messages to BSP (BootStrap Processor). This was fine until
Zephyr could be ran on virtualizor that may NOT run it on BSP.

So directing MSI messages on current processor. If Zephyr runs on actual
hardware, it will be BSP since such setup is always made at boot time by
the BSP. On other use case it will be whatever is relevant at that time.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2022-02-22 10:35:39 -05:00
Tomasz Bursztyka
0affb29572 arch/x86: Add a CPUID function to get initial APIC ID
Depending on whether X2APIC is enabled or not, it will be safer to grab
such ID from the right place.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2022-02-22 10:35:39 -05:00
Tomasz Bursztyka
7ea9b169f7 arch/x86: Have a dedicated place for CPUID related functions
This will centralize CPUID related accessors. There was no need for it
so far, but this is going to change.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2022-02-22 10:35:39 -05:00
Carles Cufi
e83a13aabf kconfig: Rename the TEST_EXTRA stack size option to align with the rest
All stack sizes should end with STACK_SIZE.

Signed-off-by: Carles Cufi <carles.cufi@nordicsemi.no>
2022-02-22 08:23:05 -05:00
Carlo Caione
240c975ad4 core: z_data_copy does not depend on CONFIG_XIP
When XIP is not enabled, z_data_copy() already falls back to an empty
function. No need to ifdef it.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2022-02-22 10:22:53 +01:00
Andy Ross
73453a39d1 arch: Add IRQ_OFFSET_NESTED feature
The x86 and xtensa implementations of irq_offload() invoke synchronous
interrupts on the local CPU, and are therefore safe to use from within
an interrupt context.  This is a cheap and portable way to exercise
nested interrupts, which are otherwise highly platform-dependent to
test.  Add a kconfig to signal the capability.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2022-02-21 22:10:03 -05:00
Andy Ross
c174ade4a1 arch/xtensa: Rework irq_offload: automatic config, SMP-safe
The Xtensa implementation of arch_irq_offload() required that the user
select the correct interrupt manually, and would race with itself if
invoked from separate CPUs (it was saved here by the main
irq_offload() function which has a semaphore to serialize access).

Use the new gen_zsr.py script to automatically detect the highest
available software interrupt, and keep a per-CPU set of
callback/parameter pointers.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2022-02-21 22:10:03 -05:00
Hou Zhiqiang
1fca05b7f8 arm64: cache: Fix data corruption issue on DCACHE range invalidation
Currently, the DCACHE range invalidation can cause data corruption when
the ends of the given range is not aligned to a full cache line.

Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
2022-02-21 22:00:16 -05:00
Nicolas Pitre
34d425fbe5 arm64: switch to the IRQ stack during ISR execution
Avoid executing ISRs using the thread stack as it might not be sized
for that. Plus, we do have IRQ stacks already set up for us.

The non-nested IRQ context is still (and has to be) saved on the thread
stack as the thread could be preempted.

The irq_offload case is never nested and always invoked with the
sched_lock held so it can be simplified a bit.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-02-21 21:53:23 -05:00
Nicolas Pitre
6381ee7391 arm64: update _current_cpu->nested properly
This is an uint32_t so the proper register width must be used, otherwise
the adjacent structure member will be overwritten (didn't happen in
practice because of struct member alignment but still). This makes the
inc_nest_counter and dec_nest_counter macros rather unwieldy, especially
with upcoming changes, so let's just remove them.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2022-02-21 21:53:23 -05:00