Commit graph

1302 commits

Author SHA1 Message Date
Andrew Boie
d2a72273b7 x86: add support for common page tables
We provide an option for low-memory systems to use a single set
of page tables for all threads. This is only supported if
KPTI and SMP are disabled. This configuration saves a considerable
amount of RAM, especially if multiple memory domains are used,
at a cost of context switching overhead.

Some caching techniques are used to reduce the amount of context
switch updates; the page tables aren't updated if switching to
a supervisor thread, and the page table configuration of the last
user thread switched in is cached.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-11-05 09:33:40 -05:00
Andrew Boie
a15be58019 x86: move page table reservation macros
We don't need this for stacks any more and only use this
for pre-calculating the boot page tables size. Move to C
code, this doesn't need to be in headers anywhere.

Names adjusted for conciseness.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-11-05 09:33:40 -05:00
Andrew Boie
1367c4a4b7 x86: don't reserve room for page tables in stack
These are handled at the memory domain level now.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-11-05 09:33:40 -05:00
Andrew Boie
b8242bff64 x86: move page tables to the memory domain level
- z_x86_userspace_enter() for both 32-bit and 64-bit now
  call into C code to clear the stack buffer and set the
  US bits in the page tables for the memory range.

- Page tables are now associated with memory domains,
  instead of having separate page tables per thread.
  A spinlock protects write access to these page tables,
  and read/write access to the list of active page
  tables.

- arch_mem_domain_init() implemented, allocating and
  copying page tables from the boot page tables.

- struct arch_mem_domain defined for x86. It has
  a page table link and also a list node for iterating
  over them.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-11-05 09:33:40 -05:00
Carlo Caione
7b7c328f7a aarch64: mmu: Enable support for unprivileged EL0
The current MMU code is assuming that both kernel and threads are both
running in EL1, not supporting EL0. Extend the support to EL0 by adding
the missing attribute to mirror the access / execute permissions to EL0.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2020-11-04 13:58:19 -08:00
Ioannis Glaropoulos
47e87d8459 arch: arm: cortex_m: implement functionality for ARCH core regs init
Implement the functionality for configuring the
architecture core registers to their warm reset
values upon system initialization. We enable the
support of the feature in the Cortex-M architecture.

Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
2020-11-02 15:02:24 +01:00
Carlo Caione
b3ff89bd51 arch: arm64: Remove _BIT suffix
This is redundant and not coherent with the rest of the file. Thus
remove the _BIT suffix from the bit field names.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2020-11-02 12:04:35 +01:00
Carlo Caione
8941f9a00c x86: mmustructs: Fix define typo
Fix typo s/Z_X96_MMU_RW/Z_X86_MMU_RW/

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2020-10-29 16:44:22 -04:00
Daniel Leung
2c8625ea7e xtensa: remove errno_var from strcut _thread_arch
The errno_var copy in Xtensa's struct is not being used at all
for errno (as there is already one in struct k_thread).
So remove it.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-10-24 10:52:00 -07:00
Daniel Leung
8a79ce1428 riscv: add support for thread local storage
Adds the necessary bits to initialize TLS in the stack
area and sets up CPU registers during context switch.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-10-24 10:52:00 -07:00
Daniel Leung
388725870f arm: cortex_m: add support for thread local storage
Adds the necessary bits to initialize TLS in the stack
area and sets up CPU registers during context switch.

Note that since Cortex-M does not have the thread ID or
process ID register needed to store TLS pointer at runtime
for toolchain to access thread data, a global variable is
used instead.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-10-24 10:52:00 -07:00
Daniel Leung
778c996831 arm: cortex_r: add support for thread local storage
Adds the necessary bits to initialize TLS in the stack
area and sets up CPU registers during context switch.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-10-24 10:52:00 -07:00
Daniel Leung
df77e2af8b arm64: add support for thread local storage
Adds the necessary bits to initialize TLS in the stack
area and sets up CPU registers during context switch.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-10-24 10:52:00 -07:00
Daniel Leung
4b38392ded x86: add support for thread local storage
Adds the necessary bits to initialize TLS in the stack
area and sets up CPU registers during context switch.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-10-24 10:52:00 -07:00
Daniel Leung
53ac1ee6fa x86_64: add support for thread local storage
Adds the necessary bits to initialize TLS in the stack
area and sets up CPU registers during context switch.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-10-24 10:52:00 -07:00
Andy Ross
0e83961b21 arch/xtensa: soc/xtensa/intel_adsp: Enable KERNEL_COHERENCE
Implement the kernel "coherence" API on top of the linker
cached/uncached mapping work.

Add Xtensa handling for the stack coherence API.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2020-10-21 06:38:53 -04:00
Carlo Caione
2f3962534a arch: arm64: Remove new thread entry wrapper
Instead of having some special stack frame when first scheduling new
thread and a new thread entry wrapper to pull out the needed data, we
can reuse the context restore code by adapting the initial stack frame.

This reduces the lines of code and simplify the code at the expense of a
slightly bigger initial stack frame.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2020-10-06 10:25:56 -04:00
Eugeniy Paltsev
9b0ef4f19a ARC: MWDT: drop redundant stack checking
MWDT toolchain has Stackcheck_alloca option enabled by default.
So it adds stack checking in addition to Zephyr's stack checking.

As it is completely redundant let's drop it.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-10-02 11:32:12 +02:00
Yuguo Zou
df4b7803a1 arch: arc: unify different versions of MPU registers
Previously MPU registers macros are only defined within its own header
files and could not be used by other part of program. This commit unify
them together.

Signed-off-by: Yuguo Zou <yuguo.zou@synopsys.com>
2020-10-02 11:31:34 +02:00
Aastha Grover
83b9f69755 code-guideline: Fixing code violation 10.4 Rule
Both operands of an operator in the arithmetic conversions
performed shall have the same essential type category.

Changes are related to converting the integer constants to the
unsigned integer constants

Signed-off-by: Aastha Grover <aastha.grover@intel.com>
2020-10-01 17:13:29 -04:00
Tomasz Bursztyka
d98f7b1895 arch/x86: Optimize ACPI RSDP lookup
As well as normalizing its signature declaration through header.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2020-10-01 11:16:40 -07:00
Tomasz Bursztyka
4ff1885f69 arch/x86: Move ACPI structures to header file
Let's have all specified ACPI structures in the central header.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2020-10-01 11:16:40 -07:00
Tomasz Bursztyka
c7787c623e arch/x86: Cleanup ACPI structure attributes names
No need to mix super short version of names with other structures
having full name. Let's follow a more relevant naming where each and
every attribute name is self-documenting then. (such as s/id/apic_id
etc...)

Also make CONFIG_ACPI usable through IS_ENABLED by enclosing exposed
functions with ifdef CONFIG_ACPI.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2020-10-01 11:16:40 -07:00
Luke Starrett
169e7c5e75 arch: arm64: Fix arm64 crash dump output
- x0/x1 register printing is reversed
- The error stack frame struct (z_arch_esf_t) had the SPSR and ELR in
  the wrong position, inconsistent with the order these regs are pushed
  to the stack in z_arm64_svc.  This caused all register printing to be
  skewed by two.
- Verified by writing known values (abcd0000 -> abcd000f) to x0 - x15
  and then forcing a data abort.

Signed-off-by: Luke Starrett <luke.starrett@gmail.com>
2020-10-01 07:29:27 -04:00
Andrew Boie
391935bea1 x86_64: add dedicated MEMORY area for locore
This had been hacked into the linker script, define
a proper MEMORY region for it.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-30 14:14:07 -07:00
Andrew Boie
7f4901b8a6 x86: 32-bit: remove mmu region list
This related to old infrastructure which has been removed
from Zephyr.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-30 14:14:07 -07:00
Andrew Boie
3807c51e4e x86: add common memory.ld
We need the same logic for each SOC, instead of copypasting
things just put this in a common file. This approach still
leaves the door open for custom memory layouts if desired.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-30 14:14:07 -07:00
Andrew Boie
27e00f497a x86: 32-bit: set _image_text_start properly
Strictly speaking, should bee a virtual, not physical address.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-30 14:14:07 -07:00
Andrew Boie
c3c7f6c6d3 x86: don't define _image_rom_* unless XIP
Meaningless if we are not a XIP system and are running
from RAM.

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2020-09-30 14:14:07 -07:00
Eugeniy Paltsev
8e1f40a632 ARC: linker: merge GNU and MWDT linker scripts
As discussed in #22668, there is additional risk ascociated with
splitting linker files, as one may update one script and not be
aware of the other. Especially related to updating GNU ld, and
not mwdt could break code for mwdt unnoticed, as mwdt is not
part of CI.

Let's create a single entry point for linker template.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-18 09:49:09 -04:00
Ioannis Glaropoulos
d19d6a76a7 arch: arm: aarch64: remove non-applicable linker section
The non-secure callable functions' section is only applicable
to Cortex-M with TrustZone-M extension. Remove it from AARCH64
linker script. (CONFIG_ARM_FIRMWARE_HAS_SECURE_ENTRY_FUNCTIONS
is only enabled for Cortex-M so this is a no-op, but still, it
is a useful cleanup.)

Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
2020-09-14 19:17:04 -05:00
Anas Nashif
6e27478c3d benchmarking: remove execution benchmarking code
This code had one purpose only, feed timing information into a test and
was not used by anything else. The custom trace points unfortunatly were
not accurate and this test was delivering informatin that conflicted
with other tests we have due to placement of such trace points in the
architecture and kernel code.

For such measurements we are planning to use the tracing functionality
in a special mode that would be used for metrics without polluting the
architecture and kernel code with additional tracing and timing code.

Furthermore, much of the assembly code used had issues.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-09-05 13:28:38 -05:00
Daniel Leung
80fb6538b3 x86: use =A as output for RDTSC on x86-32
The timing_info benchmark on qemu_x86 shows this is a bit faster.

Before:
  START - Time Measurement
  Timing results: Clock frequency: 1000 MHz
  Context switch                               : 896 cycles ,   895 ns
  Interrupt latency                            : 768 cycles ,   767 ns
  Tick overhead                                :14912 cycles , 14911 ns
  Thread creation                              :18688 cycles , 18687 ns
  Thread abort (non-running)                   :49216 cycles , 49215 ns
  Thread abort (_current)                      :55616 cycles , 55615 ns
  Thread suspend                               :11072 cycles , 11071 ns
  Thread resume                                :10272 cycles , 10271 ns
  Thread yield                                 :12213 cycles , 12212 ns
  Thread sleep                                 :17984 cycles , 17983 ns
  Heap malloc                                  :21702 cycles , 21701 ns
  Heap free                                    :15176 cycles , 15175 ns
  Semaphore take with context switch           :19168 cycles , 19167 ns
  Semaphore give with context switch           :18400 cycles , 18399 ns
  Semaphore take without context switch        :2208 cycles ,  2207 ns
  Semaphore give without context switch        :4704 cycles ,  4703 ns
  Mutex lock                                   :1952 cycles ,  1951 ns
  Mutex unlock                                 :7936 cycles ,  7935 ns
  Message queue put with context switch        :20320 cycles , 20319 ns
  Message queue put without context switch     :5792 cycles ,  5791 ns
  Message queue get with context switch        :22112 cycles , 22111 ns
  Message queue get without context switch     :5312 cycles ,  5311 ns
  Mailbox synchronous put                      :27936 cycles , 27935 ns
  Mailbox synchronous get                      :23392 cycles , 23391 ns
  Mailbox asynchronous put                     :11808 cycles , 11807 ns
  Mailbox get without context switch           :20416 cycles , 20415 ns
  Drop to user mode                            :643712 cycles , 643711 ns
  User thread creation                         :652096 cycles , 652095 ns
  Syscall overhead                             :2720 cycles ,  2719 ns
  Validation overhead k_object init            :4256 cycles ,  4255 ns
  Validation overhead k_object permission      :4224 cycles ,  4223 ns
  Time Measurement finished

After:
  START - Time Measurement
  Timing results: Clock frequency: 1000 MHz
  Context switch                               : 896 cycles ,   895 ns
  Interrupt latency                            : 768 cycles ,   767 ns
  Tick overhead                                :14752 cycles , 14751 ns
  Thread creation                              :18464 cycles , 18463 ns
  Thread abort (non-running)                   :48992 cycles , 48991 ns
  Thread abort (_current)                      :55552 cycles , 55551 ns
  Thread suspend                               :10848 cycles , 10847 ns
  Thread resume                                :10048 cycles , 10047 ns
  Thread yield                                 :12213 cycles , 12212 ns
  Thread sleep                                 :17984 cycles , 17983 ns
  Heap malloc                                  :21702 cycles , 21701 ns
  Heap free                                    :15176 cycles , 15175 ns
  Semaphore take with context switch           :19104 cycles , 19103 ns
  Semaphore give with context switch           :18368 cycles , 18367 ns
  Semaphore take without context switch        :1984 cycles ,  1983 ns
  Semaphore give without context switch        :4480 cycles ,  4479 ns
  Mutex lock                                   :1728 cycles ,  1727 ns
  Mutex unlock                                 :7712 cycles ,  7711 ns
  Message queue put with context switch        :20224 cycles , 20223 ns
  Message queue put without context switch     :5568 cycles ,  5567 ns
  Message queue get with context switch        :22016 cycles , 22015 ns
  Message queue get without context switch     :5088 cycles ,  5087 ns
  Mailbox synchronous put                      :27840 cycles , 27839 ns
  Mailbox synchronous get                      :23296 cycles , 23295 ns
  Mailbox asynchronous put                     :11584 cycles , 11583 ns
  Mailbox get without context switch           :20192 cycles , 20191 ns
  Drop to user mode                            :643616 cycles , 643615 ns
  User thread creation                         :651872 cycles , 651871 ns
  Syscall overhead                             :2464 cycles ,  2463 ns
  Validation overhead k_object init            :4032 cycles ,  4031 ns
  Validation overhead k_object permission      :4000 cycles ,  3999 ns
  Time Measurement finished

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-09-05 13:28:38 -05:00
Daniel Leung
c234821861 x86: use LFENCE instead of CPUID before reading TSC for x86_64
According to Intel 64 and IA-32 Architectures Software
Developer’s Manual, volume 3, chapter 8.2.5, LFENCE provides
a more efficient method of controlling memory ordering than
the CPUID instruction. So use LFENCE here, as all 64-bit
CPUs have LFENCE.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2020-09-05 13:28:38 -05:00
Eugeniy Paltsev
05b6468a73 ARC: linker: add more place for optimization
Do not force linker to place text sections after each other
to have more freedom to optimize.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-05 10:22:56 -05:00
Wayne Ren
cc897a5198 ARC: add linker script template for metware toolchain
Add linker script template for MWDT toolchain (linker-mwdt.ld)
Move linker.ld to linker-gnu.ld (without changes)

The "linker.ld" is wraper now.

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
2020-09-05 10:22:56 -05:00
Eugeniy Paltsev
0227056066 ARC: change direct IRQ declaration for metaware toolchain
* change direct IRQ declaration for metaware toolchain
* drop unused irq-related definitions

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-05 10:22:56 -05:00
Eugeniy Paltsev
932e178007 ARC: use MWDT intrinsics to access aux regs in case of MWDT toolchain
Metaware doesn't support gcc's builtins so use corresponding intrinsics
instead.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-05 10:22:56 -05:00
Wayne Ren
ef224ce1cd ARC: make the assembly codes compatible
Make the assembly codes compatible with both GNU
and Metaware toolchain.

* replace ".balign" with ".align"
  ".align" assembler directive is supposed by all
  ARC toolchains and it is implemented in a same
  way across ARC toolchains.
* replace "mov_s __certain_reg" with "mov __certain_reg"
  Even though GCC encodes those mnemonics and even real
  HW executes them according to PRM these are restricted
  ones for mov_s and CCAC rightfully refuses to accept
  such mnemonics. So for compatibility and clarity sake
  we switch to 32-bit mov instruction which allows use
  of all those instructions.
* Add "%%" prefix while accessing registers from inline
  ASM as it is required by MWDT.
* Drop "@" prefix while accessing symbols (defined in C
  code) from ASM code as it is required by MWDT.

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>

/#
2020-09-05 10:22:56 -05:00
Eugeniy Paltsev
596cd869c3 ARC: sys_io: rewrite to C code
Replace ASM sys_io implementation with identical C code for ARC.
This significantly improves portability, i.e. compiler decides
which instructions to use for a particular CPU and / or
configuration.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-05 10:22:56 -05:00
Wayne Ren
d67475ab6e ARC: handle the difference of assembly macro definition
GNU toolchain and MWDT (Metware) toolchain have different style
for accessing arguments in assembly macro. Implement the
preprocessor macro to handle the difference.

Make all ASM macros in swap_macros.h compatible for both ARC
toolchains.

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-05 10:22:56 -05:00
Carlo Caione
df4aa230c8 arch: arm64: Use _arch_switch() API
Switch to the _arch_switch() API that is required for an SMP-aware
scheduler instead of using the old arch_swap mechanism.

SMP is not supported yet but this is a necessary step in that direction.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
2020-09-05 12:06:38 +02:00
Flavio Ceolin
5408f3102d debug: x86: Add gdbstub for X86
It implements gdb remote protocol to talk with a host gdb during the
debug session. The implementation is divided in three layers:

1 - The top layer that is responsible for the gdb remote protocol.
2 - An architecture specific layer responsible to write/read registers,
    set breakpoints, handle exceptions, ...
3 - A transport layer to be used to communicate with the host

The communication with GDB in the host is synchronous and the systems
stops execution waiting for instructions and return its execution after
a "continue" or "step" command. The protocol has an exception that is
when the host sends a packet to cause an interruption, usually triggered
by a Ctrl-C. This implementation ignores this instruction though.

This initial work supports only X86 using uart as backend.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2020-09-02 20:54:57 -04:00
Peter Bigot
039e3edcda Revert "posix: linker: Wrap rodata and rwdata in sections."
This reverts commit b51eeb03f4.

The linker script is now putting read-only material in writable
segments, which causes glib with -D_FORTIFY_SOURCE=2 to abort.

Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
2020-09-02 14:46:01 -04:00
Ioannis Glaropoulos
4ec7725110 arch: arm: cortex-m: Modify ARM-only API for IRQ target state mgmt
we modify the ARM Cortex-M only API for managing the
security target state of the NVIC IRQs. We remove the
internal ASSERT checking allowing to call the API for
non-implemented NVIC IRQ lines. However we still give the
option to the user to check the success of the IRQ target
state setting operation by allowing the API function to
return the resulting target state.

Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
2020-09-02 15:01:30 +02:00
Tomasz Bursztyka
93cd336204 arch: Apply dynamic IRQ API change
Switching to constant parameter.

Fixes #27399

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2020-09-02 13:48:13 +02:00
Tomasz Bursztyka
6df8b3995e irq: Change dynamic API to take a constant parameter
All ISRs are meant to take a const struct device pointer, but to
simplify the change let's just move the parameter to constant and that
should be fine.

Fixes #27399

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2020-09-02 13:48:13 +02:00
Tomasz Bursztyka
84942e4fbc irq: Change offload API to take a constant parameter
All ISRs are meant to take a const struct device pointer, but to
simplify the change let's just move the parameter to constant and that
should be fine.

Fixes #27399

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2020-09-02 13:48:13 +02:00
Eugeniy Paltsev
7547b44441 ARC: use generic bitops implementation
There is no need in custom, partially ASM bitops implementation
for ARC, we can use generic one.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-01 13:36:48 +02:00
Eugeniy Paltsev
874d9426b3 ARM: aarch64: use generic bitops implementation
aarch64 has bitops implementation fully identical to generic one.
So drop redundant code.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
2020-09-01 13:36:48 +02:00