Commit graph

4409 commits

Daniel Leung
03b413712a x86: gen_mmu: double map physical/virtual memory at top level
This reuses the page directory pointer table (PAE=y) or page
directory (PAE=n) to point to the next-level page directory table
(PAE=y) or page tables (PAE=n) to identity-map the physical
memory. This gets rid of the extra memory needed to host
the extra mappings which are only used at boot. Following
patches will have code to actually unmap physical memory
during the boot process, so this avoids wasting some
memory.
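
A C sketch of the idea (hypothetical names and flag values; the
actual change lives in the gen_mmu.py generator):

    #include <stdint.h>

    typedef uint64_t pentry_t;      /* one page-table entry        */
    #define ENTRY_FLAGS 0x3ULL      /* hypothetical present+write  */
    #define TOP_SHIFT   30          /* PAE: each PDPT slot is 1 GB */

    static inline int slot_for(uintptr_t addr)
    {
        return (int)(addr >> TOP_SHIFT) & 0x3;
    }

    /* Point both the identity (phys) slot and the virtual slot at
     * the same next-level table: the boot-time identity map then
     * needs no extra page-table pages of its own.
     */
    static void double_map_top(pentry_t *top, pentry_t *next,
                               uintptr_t phys, uintptr_t virt)
    {
        top[slot_for(phys)] = (uintptr_t)next | ENTRY_FLAGS;
        top[slot_for(virt)] = (uintptr_t)next | ENTRY_FLAGS;
    }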

Since no extra memory needs to be reserved, this also reverts
commit ee3d345c09
("x86: mmu: reserve more space for page table if linking in virt").

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-16 15:03:44 -04:00
Daniel Leung
dd0748a979 x86: gen_mmu: use constants to refer to page level...
...instead of magic numbers. This makes the code a tiny bit
easier to read.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-16 15:03:44 -04:00
Daniel Leung
b95b8fb075 x86: gen_mmu: allow more verbose messages
This allows specifying a second --verbose on the command line to
enable more messages. Two new ones have been added to aid
in debugging the code for mapping and setting permissions on
a single page.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-16 15:03:44 -04:00
Daniel Leung
4bab992e80 x86: gen_mmu: consolidate map() and identity_map()
Consolidate map() and identity_map() as they are mostly
the same.
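
A hedged sketch of the consolidation (hypothetical signatures): an
identity map is just a map whose virtual-to-physical offset is zero.

    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SIZE 4096
    void map_page(uintptr_t virt, uintptr_t phys, unsigned int flags);

    /* One mapping routine parameterized by the virt-phys delta */
    static void map_range(uintptr_t phys, size_t size,
                          ptrdiff_t virt_offset, unsigned int flags)
    {
        for (size_t off = 0; off < size; off += PAGE_SIZE) {
            map_page(phys + off + virt_offset, phys + off, flags);
        }
    }

    /* identity_map() becomes the offset-zero case */
    static void identity_map(uintptr_t phys, size_t size,
                             unsigned int flags)
    {
        map_range(phys, size, 0, flags);
    }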

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-16 15:03:44 -04:00
Daniel Leung
e211d3a999 kernel: remove CONFIG_KERNEL_LINK_IN_VIRT
There actually is no need for a separate kconfig here, as
the kernel VM address and SRAM address can be used to figure
out if the kernel is linked in virtual address space.
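
A sketch of the inference using the existing kconfig symbols (the
exact macro spellings here are assumptions):

    /* Infer "linked in virt" instead of a dedicated kconfig: if the
     * kernel's VM base differs from the SRAM base, the image is
     * linked at a virtual address.
     */
    #if defined(CONFIG_MMU) && \
        (CONFIG_KERNEL_VM_BASE != CONFIG_SRAM_BASE_ADDRESS)
    #define Z_VM_KERNEL 1
    #endif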

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-16 15:03:44 -04:00
Daniel Leung
273a5e670b x86: remove usage of CONFIG_KERNEL_LINK_IN_VIRT
There is no need to use this kconfig, as the phys-to-virt
offset is enough to figure out if the kernel is linked in
virtual address space in gen_mmu.py.

For code, use Z_VM_KERNEL instead.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-16 15:03:44 -04:00
Daniel Leung
d39012a590 x86: use Z_MEM_*_ADDR instead of Z_X86_*_ADDR
With the introduction of Z_MEM_*_ADDR for physical<->virtual
address translation, there is no need to have x86 specific
versions.
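
A sketch of the generic macros, assuming a single constant
phys-to-virt offset (names follow the Z_MEM_* convention):

    #define Z_MEM_VM_OFFSET \
        (CONFIG_KERNEL_VM_BASE - CONFIG_SRAM_BASE_ADDRESS)
    #define Z_MEM_PHYS_ADDR(virt) ((virt) - Z_MEM_VM_OFFSET)
    #define Z_MEM_VIRT_ADDR(phys) ((phys) + Z_MEM_VM_OFFSET)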

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-16 15:03:44 -04:00
Nicolas Pitre
f062490c7e aarch64: mmu: add TLB flushing on mapping changes
Pretty crude for now, as we always invalidate the entire set.
It remains to be seen if more fine-grained TLB flushing is worth
the added complexity given this ought to be a relatively rare event.
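
A sketch of the crude variant (full stage-1 EL1 invalidate with the
usual barrier sequence):

    static inline void invalidate_tlb_all(void)
    {
        __asm__ volatile (
            "dsb ishst\n"      /* ensure table updates are visible */
            "tlbi vmalle1\n"   /* invalidate all EL1 TLB entries   */
            "dsb ish\n"
            "isb\n"
            ::: "memory");
    }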

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2021-03-16 08:43:19 -04:00
Carlo Caione
a010651c65 aarch64: mmu: Add initial support for memory domains
Introduce the basic support code for memory domains. Each domain
is associated with a top-level page table which is a copy of the
global kernel one. When a partition is added, the corresponding
memory range is made private before its mapping is adjusted.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2021-03-16 08:43:19 -04:00
Nicolas Pitre
c77ffebb24 aarch64: mmu: apply proper locking
We need to protect against concurrent modifications to page tables and
their use counts.

It would have been nice to have one lock per domain, but we heavily
share page tables across domains. Hence the global lock.
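
A sketch of the pattern with Zephyr's spinlock API (the wrapper
function name is hypothetical):

    #include <spinlock.h>

    static struct k_spinlock xlat_lock;  /* tables shared across domains */

    void mmu_update_region(void)
    {
        k_spinlock_key_t key = k_spin_lock(&xlat_lock);

        /* ... modify page tables and their use counts ... */

        k_spin_unlock(&xlat_lock, key);
    }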

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2021-03-16 08:43:19 -04:00
Nicolas Pitre
e4cd3d4292 aarch64: mmu: code to split/combine page tables
Two scenarios are possible.

privatize_page_range:

Affected pages are made private if they're not already. This means
a whole new page-table branch starting from the top may be allocated,
with content shared with the reference page tables except for the
private range, where content is duplicated.

globalize_page_range:

That's the reverse operation, where pages for the given range are
shared with the reference page tables and pages that are no longer
needed are freed.

When changing a domain mapping the range needs to be privatized first.

When changing a global mapping the range needs to be globalized last.

This way page table sharing across domains is maximized and memory
usage remains optimal.
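
A hedged sketch of the resulting call pattern (hypothetical
signatures throughout):

    #include <stddef.h>
    #include <stdint.h>

    struct pt;   /* opaque page-table set */

    void privatize_page_range(struct pt *dom, struct pt *ref,
                              uintptr_t addr, size_t size);
    void globalize_page_range(struct pt *dom, struct pt *ref,
                              uintptr_t addr, size_t size);
    void set_range_attrs(struct pt *t, uintptr_t addr, size_t size,
                         uint32_t attrs);

    /* Per-domain change: privatize first, then modify */
    static void domain_change(struct pt *dom, struct pt *ref,
                              uintptr_t addr, size_t size, uint32_t attrs)
    {
        privatize_page_range(dom, ref, addr, size);
        set_range_attrs(dom, addr, size, attrs);
    }

    /* Global change: modify the reference tables, globalize last */
    static void global_change(struct pt *dom, struct pt *ref,
                              uintptr_t addr, size_t size, uint32_t attrs)
    {
        set_range_attrs(ref, addr, size, attrs);
        globalize_page_range(dom, ref, addr, size);
    }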

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2021-03-16 08:43:19 -04:00
Nicolas Pitre
402636153d aarch64: mmu: factor out table expansion code
Make the allocation, population and linking of a new table into
a function of its own for easier code reuse.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2021-03-16 08:43:19 -04:00
Eugeniy Paltsev
8165f3ad80 ARC: cleanup instruction cache initialization
As of today, during Zephyr startup we:
 - invalidate I$
 - disable I$
 - enable I$

Given that we don't need to have the I$ disabled during any
initialization period, and that ARC processors have caches enabled
after reset, the I$ disabling/enabling is excessive, so we can
drop it.

This also aligns the I$ initialization on ARC with other
projects like U-Boot and the Linux kernel.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2021-03-12 18:29:07 -05:00
Watson Zeng
5c3e7e3cb7 arch: arc: remove ARCH_HAS_STACK_PROTECTION for ARC_MPU_VER 2
As we have removed MPU_STACK_GUARD for ARC_MPU_VER 2, we also
need to remove ARCH_HAS_STACK_PROTECTION for boards with
ARC_MPU_VER 2 and no hardware stack checking; see the related
commit ("arch: arc: remove MPU_STACK_GUARD for ARC_MPU_VER 2")
in pull request #24021.

Signed-off-by: Watson Zeng <zhiwei@synopsys.com>
2021-03-11 08:57:01 -05:00
Daniel Leung
6cac92ad52 x86: remove CONFIG_CPU_MINUTEIA
Since the removal of Quark-based boards, there are no users of
Minute-IA. Also, the generic x86 SoC is not exactly Minute-IA,
so change it to use the fairly safe CPU_ATOM.

Fixes #14442

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-11 06:37:02 -05:00
Peng Fan
b4f5b9e237 aarch64: reset: initialize CNTFRQ_EL0 in the highest EL
CNTFRQ_EL0 can only be written at the highest implemented Exception level.
For example, if EL3 is the highest implemented Exception level,
CNTFRQ_EL0 can only be written at EL3.

Also move z_arm64_el_highest_plat_init to be called when is_el_highest is true.
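
A C sketch of the guard (helper names here are assumptions; the
actual code lives in the reset path):

    #include <stdbool.h>
    #include <stdint.h>

    bool is_el_highest(void);                 /* hypothetical helper */
    void z_arm64_el_highest_plat_init(void);  /* per the message     */

    static inline void write_cntfrq_el0(uint64_t hz)
    {
        __asm__ volatile ("msr cntfrq_el0, %0" :: "r" (hz));
    }

    void counter_init(uint64_t hz)
    {
        /* Writes only take effect at the highest implemented EL */
        if (is_el_highest()) {
            write_cntfrq_el0(hz);
            z_arm64_el_highest_plat_init();
        }
    }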

Signed-off-by: Peng Fan <peng.fan@nxp.com>
2021-03-11 12:24:18 +01:00
Carlo Caione
dacd176991 aarch64: userspace: Implement syscalls
This patch adds the code managing the syscalls. The privileged stack
is set up before jumping into the real syscall.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-10 14:52:50 -05:00
Nicolas Pitre
f2995bcca2 aarch64: arch_buffer_validate() implementation
This leverages the AT (address translation) instruction to test for
a given access permission. The result is then provided in the PAR_EL1
register.

Thanks to @jharris-intel for the suggestion.
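
A hedged sketch of the probe for the read case (the real routine
also needs to handle write probes and whole ranges):

    #include <stdbool.h>
    #include <stdint.h>

    static bool user_can_read(uintptr_t addr)
    {
        uint64_t par;

        /* AT S1E0R: stage-1 translation as an unprivileged read */
        __asm__ volatile ("at s1e0r, %0" :: "r" (addr));
        __asm__ volatile ("isb");
        __asm__ volatile ("mrs %0, par_el1" : "=r" (par));

        return (par & 1ULL) == 0;  /* PAR_EL1.F == 0 -> access OK */
    }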

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2021-03-10 14:52:50 -05:00
Carlo Caione
9ec1c1a793 aarch64: userspace: Introduce arch_user_string_nlen
Introduce the arch_user_string_nlen() assembly routine and the necessary
C code bits.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2021-03-10 14:52:50 -05:00
Carlo Caione
a7a3e800bf aarch64: fatal: Restrict oops-es when in user-mode
User mode is only allowed to induce oopses and stack check failures via
software-triggered system fatal exceptions.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-10 14:52:50 -05:00
Carlo Caione
6978160427 aarch64: userspace: Introduce arch_is_user_context
The arch_is_user_context() function relies on the content of the
tpidrro_el0 register to determine whether we are in user context or not.

This register is set to '1' when in EL1 and set back to '0' when user
threads are running in userspace.
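
A sketch matching that description (zero means user context):

    #include <stdbool.h>
    #include <stdint.h>

    static inline bool arch_is_user_context(void)
    {
        uint64_t tpidrro;

        /* 1 while in EL1, 0 while a user thread runs in EL0 */
        __asm__ volatile ("mrs %0, tpidrro_el0" : "=r" (tpidrro));
        return tpidrro == 0;
    }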

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-10 14:52:50 -05:00
Carlo Caione
6cf0d000e8 aarch64: userspace: Introduce skeleton code for user-threads
Introduce the first pieces needed to schedule user threads by defining
two different code paths for kernel and user threads.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-10 14:52:50 -05:00
Carlo Caione
a7d3d2e0b1 aarch64: fatal: Add arch_syscall_oops hook
Add the arch_syscall_oops hook for the AArch64.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-10 14:52:50 -05:00
James Harris
4e1926d508 arch: aarch64: do EL2 init in EL3 if necessary
If EL2 is implemented but we're skipping EL2, we should still
do EL2 init. Otherwise we end up with a bunch of things still
at their (unknown) reset values.

This in particular causes problems when different
cores have different virtual timer offsets.

Signed-off-by: James Harris <james.harris@intel.com>
2021-03-10 06:50:36 -05:00
Carlo Caione
8388794c9b aarch64: Rename z_arm64_get_cpu_id macro
The z_arm64_* prefix should not be used for macros. Rename it.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-09 04:52:40 -05:00
Carlo Caione
bdbe33b795 aarch64: Rework {inc,dec}_nest_counter
There are several issues with the current implementation of the
{inc,dec}_nest_counter macros.

The first problem is that it's internally using a call to a misplaced
function called z_arm64_curr_cpu() (for some unknown reason hosted in
irq_manage.c) that could potentially clobber the caller-saved registers
without any notice to the user of the macro.

The second problem is that, being macros, the clobbered registers
should be specified at the calling site; this is not possible given
the current implementation.

To fix these issues and make the call quicker, this patch rewrites the
code in assembly leveraging the availability of the _curr_cpu array. It
now clobbers only two registers passed from the calling site.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-09 04:52:40 -05:00
Erwan Gouriou
19314514e6 arch/arm: cortex_m: Disable DWT based null-pointer exception detection
Null-pointer exception detection using DWT is currently incompatible
with the openocd runner's default implementation, which leaves debug
mode on by default.
As a consequence, on all targets that use the openocd runner,
null-pointer exception detection using DWT will generate an assert,
and all tests fail on such platforms.

Disable this until the openocd behavior is fixed (#32984) and enable
the MPU-based solution for now.

Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org>
2021-03-08 19:19:14 -05:00
Andy Ross
ae4f7a1a06 arch/xtensa: Remember to spill windows in arch_cohere_stacks()
When we reach this code in interrupt context, our upper GPRs contain a
cross-stack call that may still include some registers from the
interrupted thread.  Those need to go out to memory before we can do
our cache coherence dance here.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Andy Ross
b28da4a3b7 arch/xtensa: Invalidate bottom of outbound stacks
Both new thread creation and context switch had the same mistake in
cache management: the bottom of the stack (the "unused" region between
the lower memory bound and the live stack pointer) needs to be
invalidated before we switch, because otherwise any dirty lines we
might have left over can get flushed out on top of the same thread on
another CPU that is putting live data there.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Andy Ross
64cf33952d arch/xtensa: Add non-HAL caching primitives
The Xtensa L1 cache layer has straightforward semantics accessible via
single instructions that operate on cache lines via physical
addresses.  These are very amenable to inlining.

Unfortunately the Xtensa HAL layer requires function calls to do this,
leading to significant code waste at the calling site, an extra frame
on the stack and needless runtime instructions for situations where
the call is over a constant region that could elide the loop.  This is
made even worse because the HAL library is not built with
-ffunction-sections, so pulling in even one of these tiny cache
functions has the effect of importing a 1500-byte object file into the
link!

Add our own tiny cache layer to include/arch/xtensa/cache.h and use
that instead.
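
A hedged sketch of such an inline primitive (flush+invalidate by
line, using the DHWBI hit-op):

    #include <stddef.h>
    #include <xtensa/config/core-isa.h>  /* XCHAL_DCACHE_LINESIZE */

    /* Constant sizes let the compiler unroll or elide the loop */
    static inline void cache_flush_inv(void *addr, size_t bytes)
    {
        for (size_t off = 0; off < bytes;
             off += XCHAL_DCACHE_LINESIZE) {
            __asm__ volatile ("dhwbi %0, 0"
                              :: "r" ((char *)addr + off) : "memory");
        }
    }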

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Andy Ross
d0c538e9a2 arch/xtensa: Add an arch-internal README on register windows
Back when I started work on this stuff, I had a set of notes on
register windows that slowly evolved into something that looks like
formal documentation.  There really isn't any overview-style
documentation of this stuff on the public internet, so it couldn't
hurt to commit it here for posterity.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Andy Ross
a230fafde5 arch/xtensa: soc/intel_adsp: Rework MP code entry
Instead of passing the crt1 _start function as the entry code for
auxiliary CPUs, use a tiny assembly stub instead which can avoid the
runtime testing needed to skip the work in _start.  All the crt1 code
was doing was clearing BSS (which must not happen on a second CPU) and
setting the stack pointer (which is wrong on the second CPU).

This allows us to clean out the SMP code in crt1.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Andy Ross
613594e68c soc/intel_adsp: Use the correct MP stack pointer
The kernel passes the CPU's interrupt stack expecting that the CPU
will start on it, so do that.  Pass the initial stack pointer from the SOC
layer in the variable "z_mp_stack_top" and set it in the assembly
startup before calling z_mp_entry().

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Andy Ross
820c94e5dd arch/xtensa: Inline atomics
The xtensa atomics layer was written with hand-coded assembly that had
to be called as functions.  That's needlessly slow, given that the low
level primitives are a two-instruction sequence.  Ideally the compiler
should see this as an inline to permit it to better optimize around
the needed barriers.

There was also a bug with the atomic_cas function, which had a loop
internally instead of returning the old value synchronously on a
failed swap.  That's benign right now because our existing spin lock
does nothing but retry it in a tight loop anyway, but it's incorrect
per spec and would have caused a contention hang with more elaborate
algorithms (for example a spinlock with backoff semantics).

Remove the old implementation and replace with a much smaller inline C
one based on just two assembly primitives.
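
A sketch of the CAS built on that two-instruction sequence (WSR to
SCOMPARE1 sets the expected value, then S32C1I stores iff memory
still matches and returns the value actually found):

    #include <stdint.h>

    static inline int32_t xtensa_cas(int32_t *addr, int32_t cmp,
                                     int32_t newval)
    {
        __asm__ volatile ("wsr %1, SCOMPARE1\n\t"
                          "s32c1i %0, %2, 0"
                          : "+r" (newval)
                          : "r" (cmp), "r" (addr)
                          : "memory");
        return newval;  /* old value; equals cmp on success */
    }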

This patch also contains a little bit of refactoring: the
implementation scheme has been split out into a separate header for
each variant, and the ATOMIC_OPERATIONS_CUSTOM kconfig has been
renamed to ATOMIC_OPERATIONS_ARCH to better capture what it means.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Andy Ross
eb1ef50b6b arch/xtensa: General cleanup, remove dead code
There was a bunch of dead historical cruft floating around in the
arch/xtensa tree, left over from older code versions.  It's time to do
a cleanup pass.  This is entirely refactoring and size optimization;
there should be no behavior changes on any in-tree devices.

Among the more notable changes:

+ xtensa_context.h offered an elaborate API to deal with a stack frame
  and context layout that we no longer use.

+ xtensa_rtos.h was entirely dead code

+ xtensa_timer.h was a parallel abstraction layer implementing in the
  architecture layer what we're already doing in our timer driver.

+ The architecture thread structs (_callee_saved and _thread_arch)
  aren't used by current code, and had dead fields that were removed.
  Unfortunately, for standards compliance and C++ compatibility it's
  not possible to leave an empty struct here, so they have a single
  byte field.

+ xtensa_api.h was really just some interrupt management inlines used
  by irq.h, so fold that code into the outer header.

+ Remove the stale assembly offsets.  This architecture doesn't use
  that facility.

All told, more than a thousand lines have been removed.  Not bad.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2021-03-08 11:14:27 -05:00
Peng Fan
e27c9c7c52 arch: arm64: select SCHED_IPI_SUPPORTED when SMP enabled
Select SCHED_IPI_SUPPORTED when SMP is enabled.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
2021-03-06 07:36:37 -05:00
Peng Fan
a2ea20dd6d arch: arm: aarch64: add SMP support
With timer/gic/cache support added, we can now add SMP support
and bring up the secondary cores.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
2021-03-06 07:36:37 -05:00
Peng Fan
14b9b752be arch: arm: aarch64: add arch_dcache_range
Add arch_dcache_range() to support flushing and invalidating a range.
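
A hedged usage sketch; the signature and op names here are
assumptions:

    #include <stddef.h>
    #include <stdint.h>

    int arch_dcache_range(void *addr, size_t size, int op);
    #define K_CACHE_WB   1   /* assumed: flush (write back) */
    #define K_CACHE_INVD 2   /* assumed: invalidate         */

    static uint8_t dma_buf[256];

    void dma_example(void)
    {
        arch_dcache_range(dma_buf, sizeof(dma_buf), K_CACHE_WB);
        arch_dcache_range(dma_buf, sizeof(dma_buf), K_CACHE_INVD);
    }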

Signed-off-by: Peng Fan <peng.fan@nxp.com>
2021-03-06 07:36:37 -05:00
Peng Fan
e10d9364d0 arch: arm64: irq/switch: accessing nested using _cpu_t
With _kernel_offset_to_nested, we are only able to access the nested
counter of the first CPU. Since we are going to support SMP, we need
to access the nested counter of the current CPU.

To get the current CPU, introduce z_arm64_curr_cpu for asm usage,
because arch_curr_cpu cannot be called from asm code.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
2021-03-06 07:36:37 -05:00
Peng Fan
251b1d39ac arch: arm: aarch64: export z_arm64_mmu_init for SMP
Export z_arm64_mmu_init for SMP usage

Signed-off-by: Peng Fan <peng.fan@nxp.com>
2021-03-06 07:36:37 -05:00
Peng Fan
6182330fc3 arm: core: aarch64: save switch_handle
Save old_thread to switch_handle for wait_for_thread usage

Signed-off-by: Peng Fan <peng.fan@nxp.com>
2021-03-06 07:36:37 -05:00
Ioannis Glaropoulos
191c3088af arm: cortex_m: fix arguments to dwt_init() function
Fix the call to z_arm_dwt_init(): remove the NULL argument.

Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
2021-03-05 18:13:22 -06:00
Katsuhiro Suzuki
e58e2767f8 arch: riscv: add common stub reboot function
This patch adds a weak sys_arch_reboot() function to avoid a build
error with CONFIG_REBOOT=y. Some SoCs already have their own reboot
function, but others (e.g. qemu boards) hit a build error; a sketch
of such a stub is below.

- openisa_rv32m1: no change
- riscv-ite: its reboot function does nothing; remove it and use the
  arch/riscv function
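
A sketch of such a weak stub, assuming Zephyr's __weak and
ARG_UNUSED helpers:

    #include <toolchain.h>

    /* Weak default so CONFIG_REBOOT=y links on SoCs that don't
     * provide their own implementation; SoC code can override it.
     */
    void __weak sys_arch_reboot(int type)
    {
        ARG_UNUSED(type);
    }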

Signed-off-by: Katsuhiro Suzuki <katsuhiro@katsuster.net>
2021-03-04 11:09:51 -06:00
Carlo Caione
9d908c78fa aarch64: Rewrite reset code using C
There is no strict reason to use assembly for the reset routine. Move as
much code as possible to C code using the proper helpers.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-04 06:51:48 -05:00
Carlo Caione
bba7abe975 aarch64: Use helpers instead of inline assembly
No need to rely on inline assembly when helpers are available.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-04 06:51:48 -05:00
Carlo Caione
a2226f5200 aarch64: Fix registers naming in cpu.h
The naming of registers and bit-fields in the cpu.h file is incoherent
and messy. Refactor the whole file using the proper suffixes for bits,
shifts and masks.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2021-03-04 06:51:48 -05:00
Daniel Leung
90722ad548 x86: gen_idt: fix some pylint issues
Fixes some issues identified by pylint.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-03 20:10:22 -05:00
Daniel Leung
685e6aa2e4 x86: gen_mmu: fix some pylint issues
Fixes some issues identified by pylint.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-03 20:10:22 -05:00
Daniel Leung
8cfdd91d54 x86: ia32/fatal: be explicit on pointer math with _df_tss.cr3
For some unknown reason, the page table address for _df_tss.cr3
did not get translated from virtual to physical. However, the
translation is done if the pointer to the page table is obtained
through a reference to the first array element (instead of simply
through the name of the array). Without CR3 pointing to the page
table via its physical address, double fault handling does not
work. So fix this by being explicit with the page table pointer.
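
A sketch of the shape of the fix (the array name and struct shape
here are placeholders):

    #include <stdint.h>

    extern unsigned char ptables[];            /* placeholder name */
    extern struct { uint32_t cr3; } _df_tss;   /* shape only       */

    void df_tss_set_cr3(void)
    {
        /* The bare array name stayed virtual; taking the address
         * of the first element gets the virt->phys fixup applied.
         */
        _df_tss.cr3 = (uint32_t)(uintptr_t)&ptables[0];
    }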

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-03 20:10:22 -05:00
Daniel Leung
fa6d7cecb5 x86: mmu/mem_domain: don't translate address before null check
When adding a new thread to memory domain, there is a NULL check
to figure out if a thread is being migrated to another memory
domain. However, the NULL check is AFTER physical-to-virtual
address translation which means (NULL + offset) != NULL anymore.
This results in calling reset_region() with an invalid page table
pointer. Fix this by doing the NULL check before address
translation.
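
A minimal sketch of the reordering, with hypothetical names and a
placeholder offset:

    #include <stddef.h>
    #include <stdint.h>

    typedef uint64_t pentry_t;
    #define VM_OFFSET 0x40000000UL  /* placeholder phys->virt delta */
    #define VIRT(p) ((pentry_t *)((uintptr_t)(p) + VM_OFFSET))

    void reset_region(pentry_t *ptables);  /* named in the message */

    void on_domain_add_thread(pentry_t *old_ptables_phys)
    {
        /* NULL-check first: after translation, (NULL + offset)
         * != NULL and reset_region() would get a bogus pointer.
         */
        if (old_ptables_phys == NULL) {
            return;  /* fresh thread, not migrating between domains */
        }
        reset_region(VIRT(old_ptables_phys));
    }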

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2021-03-03 20:10:22 -05:00