lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kB or on 64-bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not API-compatible with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-)legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
/*
 * Copyright (c) 2019 Intel Corporation
 *
 * SPDX-License-Identifier: Apache-2.0
 */
#include <zephyr/sys/sys_heap.h>
#include <zephyr/sys/util.h>
#include <zephyr/sys/heap_listener.h>
#include <zephyr/kernel.h>
#include <string.h>
#include "heap.h"
#ifdef CONFIG_MSAN
#include <sanitizer/msan_interface.h>
#endif
#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
static inline void increase_allocated_bytes(struct z_heap *h, size_t num_bytes)
{
	h->allocated_bytes += num_bytes;
	h->max_allocated_bytes = MAX(h->max_allocated_bytes, h->allocated_bytes);
}
#endif
static void *chunk_mem(struct z_heap *h, chunkid_t c)
{
	chunk_unit_t *buf = chunk_buf(h);
	uint8_t *ret = ((uint8_t *)&buf[c]) + chunk_header_bytes(h);

	CHECK(!(((uintptr_t)ret) & (big_heap(h) ? 7 : 3)));

	return ret;
}
|
|
|
|
|
2020-06-25 17:18:21 -04:00
|
|
|
static void free_list_remove_bidx(struct z_heap *h, chunkid_t c, int bidx)
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
{
|
|
|
|
struct z_heap_bucket *b = &h->buckets[bidx];
|
|
|
|
|
2019-09-24 20:47:24 -04:00
|
|
|
CHECK(!chunk_used(h, c));
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
CHECK(b->next != 0);
|
2021-12-11 22:58:38 -05:00
|
|
|
CHECK(h->avail_buckets & BIT(bidx));
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
2019-10-23 19:43:16 -04:00
|
|
|
if (next_free_chunk(h, c) == c) {
|
|
|
|
/* this is the last chunk */
|
2021-12-11 22:58:38 -05:00
|
|
|
h->avail_buckets &= ~BIT(bidx);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
b->next = 0;
|
|
|
|
} else {
|
2019-09-24 20:47:24 -04:00
|
|
|
chunkid_t first = prev_free_chunk(h, c),
|
|
|
|
second = next_free_chunk(h, c);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
|
|
|
b->next = second;
|
2019-09-24 23:20:53 -04:00
|
|
|
set_next_free_chunk(h, first, second);
|
|
|
|
set_prev_free_chunk(h, second, first);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
	}

#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
	h->free_bytes -= chunksz_to_bytes(h, chunk_size(h, c));
#endif
}

static void free_list_remove(struct z_heap *h, chunkid_t c)
{
	if (!solo_free_header(h, c)) {
		int bidx = bucket_idx(h, chunk_size(h, c));

		free_list_remove_bidx(h, c, bidx);
	}
}

static void free_list_add_bidx(struct z_heap *h, chunkid_t c, int bidx)
{
	struct z_heap_bucket *b = &h->buckets[bidx];

	if (b->next == 0U) {
		CHECK((h->avail_buckets & BIT(bidx)) == 0);

		/* Empty list, first item */
		h->avail_buckets |= BIT(bidx);
		b->next = c;
		set_prev_free_chunk(h, c, c);
		set_next_free_chunk(h, c, c);
	} else {
		CHECK(h->avail_buckets & BIT(bidx));

		/* Insert before (!) the "next" pointer */
		chunkid_t second = b->next;
		chunkid_t first = prev_free_chunk(h, second);

		set_prev_free_chunk(h, c, first);
		set_next_free_chunk(h, c, second);
		set_next_free_chunk(h, first, c);
		set_prev_free_chunk(h, second, c);
	}

#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
	h->free_bytes += chunksz_to_bytes(h, chunk_size(h, c));
#endif
}

static void free_list_add(struct z_heap *h, chunkid_t c)
{
	if (!solo_free_header(h, c)) {
		int bidx = bucket_idx(h, chunk_size(h, c));

		free_list_add_bidx(h, c, bidx);
	}
}

/* Splits a chunk "lc" into a left chunk and a right chunk at "rc".
 * Leaves both chunks marked "free"
 */
static void split_chunks(struct z_heap *h, chunkid_t lc, chunkid_t rc)
{
	CHECK(rc > lc);
	CHECK(rc - lc < chunk_size(h, lc));

	chunksz_t sz0 = chunk_size(h, lc);
	chunksz_t lsz = rc - lc;
	chunksz_t rsz = sz0 - lsz;

	set_chunk_size(h, lc, lsz);
	set_chunk_size(h, rc, rsz);
	set_left_chunk_size(h, rc, lsz);
	set_left_chunk_size(h, right_chunk(h, rc), rsz);
}
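As a sanity check on the split arithmetic: the left chunk keeps the units up to rc and the right chunk gets the remainder, so the two sizes always sum to the original chunk size. A hedged standalone sketch (the `demo_*` names are hypothetical, not heap API):

```c
#include <assert.h>

/* Mirror of the split size arithmetic on plain integers:
 * chunk IDs double as offsets in 8-byte chunk units.
 */
static unsigned int demo_left_size(unsigned int lc, unsigned int rc)
{
	return rc - lc;			/* left chunk spans [lc, rc) */
}

static unsigned int demo_right_size(unsigned int lc, unsigned int rc,
				    unsigned int sz0)
{
	return sz0 - (rc - lc);		/* remainder goes to the right chunk */
}
```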

/* Does not modify free list */
static void merge_chunks(struct z_heap *h, chunkid_t lc, chunkid_t rc)
{
	chunksz_t newsz = chunk_size(h, lc) + chunk_size(h, rc);

	set_chunk_size(h, lc, newsz);
	set_left_chunk_size(h, right_chunk(h, rc), newsz);
}

static void free_chunk(struct z_heap *h, chunkid_t c)
{
	/* Merge with free right chunk? */
	if (!chunk_used(h, right_chunk(h, c))) {
		free_list_remove(h, right_chunk(h, c));
		merge_chunks(h, c, right_chunk(h, c));
	}

	/* Merge with free left chunk? */
	if (!chunk_used(h, left_chunk(h, c))) {
		free_list_remove(h, left_chunk(h, c));
		merge_chunks(h, left_chunk(h, c), c);
		c = left_chunk(h, c);
	}

	free_list_add(h, c);
}

/*
 * Return the closest chunk ID corresponding to given memory pointer.
 * Here "closest" is only meaningful in the context of sys_heap_aligned_alloc()
 * where wanted alignment might not always correspond to a chunk header
 * boundary.
 */
static chunkid_t mem_to_chunkid(struct z_heap *h, void *p)
{
	uint8_t *mem = p, *base = (uint8_t *)chunk_buf(h);

	return (mem - chunk_header_bytes(h) - base) / CHUNK_UNIT;
}

void sys_heap_free(struct sys_heap *heap, void *mem)
{
	if (mem == NULL) {
		return; /* ISO C free() semantics */
	}

	struct z_heap *h = heap->heap;
	chunkid_t c = mem_to_chunkid(h, mem);

	/*
	 * This should catch many double-free cases.
	 * This is cheap enough so let's do it all the time.
	 */
	__ASSERT(chunk_used(h, c),
		 "unexpected heap state (double-free?) for memory at %p", mem);

	/*
	 * It is easy to catch many common memory overflow cases with
	 * a quick check on this and next chunk header fields that are
	 * immediately before and after the freed memory.
	 */
	__ASSERT(left_chunk(h, right_chunk(h, c)) == c,
		 "corrupted heap bounds (buffer overflow?) for memory at %p",
		 mem);

	set_chunk_used(h, c, false);
#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
	h->allocated_bytes -= chunksz_to_bytes(h, chunk_size(h, c));
#endif

#ifdef CONFIG_SYS_HEAP_LISTENER
	heap_listener_notify_free(HEAP_ID_FROM_POINTER(heap), mem,
				  chunksz_to_bytes(h, chunk_size(h, c)));
#endif

	free_chunk(h, c);
}

size_t sys_heap_usable_size(struct sys_heap *heap, void *mem)
{
	struct z_heap *h = heap->heap;
	chunkid_t c = mem_to_chunkid(h, mem);
	size_t addr = (size_t)mem;
	size_t chunk_base = (size_t)&chunk_buf(h)[c];
	size_t chunk_sz = chunk_size(h, c) * CHUNK_UNIT;

	return chunk_sz - (addr - chunk_base);
}

static chunkid_t alloc_chunk(struct z_heap *h, chunksz_t sz)
{
	int bi = bucket_idx(h, sz);
	struct z_heap_bucket *b = &h->buckets[bi];

	CHECK(bi <= bucket_idx(h, h->end_chunk));

	/* First try a bounded count of items from the minimal bucket
	 * size.  These may not fit, trying (e.g.) three means that
	 * (assuming that chunk sizes are evenly distributed[1]) we
	 * have a 7/8 chance of finding a match, thus keeping the
	 * number of such blocks consumed by allocation higher than
	 * the number of smaller blocks created by fragmenting larger
	 * ones.
	 *
	 * [1] In practice, they are never evenly distributed, of
	 * course.  But even in pathological situations we still
	 * maintain our constant time performance and at worst see
	 * fragmentation waste of the order of the block allocated
	 * only.
	 */
	if (b->next != 0U) {
		chunkid_t first = b->next;
		int i = CONFIG_SYS_HEAP_ALLOC_LOOPS;

		do {
			chunkid_t c = b->next;

			if (chunk_size(h, c) >= sz) {
				free_list_remove_bidx(h, c, bi);
				return c;
			}
			b->next = next_free_chunk(h, c);
			CHECK(b->next != 0);
		} while (--i && b->next != first);
	}

	/* Otherwise pick the smallest non-empty bucket guaranteed to
	 * fit and use that unconditionally.
	 */
	uint32_t bmask = h->avail_buckets & ~BIT_MASK(bi + 1);

	if (bmask != 0U) {
		int minbucket = __builtin_ctz(bmask);
		chunkid_t c = h->buckets[minbucket].next;

		free_list_remove_bidx(h, c, minbucket);
		CHECK(chunk_size(h, c) >= sz);
		return c;
	}

	return 0;
}
|
|
|
|
|
|
|
|
void *sys_heap_alloc(struct sys_heap *heap, size_t bytes)
|
|
|
|
{
|
2021-01-19 14:38:56 -05:00
|
|
|
struct z_heap *h = heap->heap;
|
2022-01-05 14:06:12 -08:00
|
|
|
void *mem;
|
2021-01-19 14:38:56 -05:00
|
|
|
|
2025-06-02 11:27:57 -07:00
|
|
|
if (bytes == 0U) {
|
2020-06-10 08:18:17 -07:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2025-06-02 11:27:57 -07:00
|
|
|
chunksz_t chunk_sz = bytes_to_chunksz(h, bytes, 0);
|
2020-07-23 12:57:22 -07:00
|
|
|
chunkid_t c = alloc_chunk(h, chunk_sz);
|
2025-06-02 11:27:57 -07:00
|
|
|
|
2020-08-20 16:47:11 -07:00
|
|
|
if (c == 0U) {
|
2020-06-26 12:18:35 -04:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Split off remainder if any */
|
2020-07-23 12:57:22 -07:00
|
|
|
if (chunk_size(h, c) > chunk_sz) {
|
|
|
|
split_chunks(h, c, c + chunk_sz);
|
|
|
|
free_list_add(h, c + chunk_sz);
|
2020-06-26 12:18:35 -04:00
|
|
|
}
|
|
|
|
|
|
|
|
set_chunk_used(h, c, true);
|
2022-01-05 14:06:12 -08:00
|
|
|
|
|
|
|
mem = chunk_mem(h, c);
|
|
|
|
|
2021-10-29 15:49:07 +08:00
|
|
|
#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
|
2022-03-29 19:07:43 +02:00
|
|
|
increase_allocated_bytes(h, chunksz_to_bytes(h, chunk_size(h, c)));
|
2021-10-29 15:49:07 +08:00
|
|
|
#endif
|
2022-01-05 14:06:12 -08:00
|
|
|
|
|
|
|
#ifdef CONFIG_SYS_HEAP_LISTENER
|
|
|
|
heap_listener_notify_alloc(HEAP_ID_FROM_POINTER(heap), mem,
|
|
|
|
chunksz_to_bytes(h, chunk_size(h, c)));
|
|
|
|
#endif
|
|
|
|
|
2022-08-16 11:42:53 -07:00
|
|
|
IF_ENABLED(CONFIG_MSAN, (__msan_allocated_memory(mem, bytes)));
|
2022-01-05 14:06:12 -08:00
|
|
|
return mem;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
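The bucket selection described above (a log2-sized array of power-of-two
buckets) can be sketched standalone. This is an illustrative stand-in, not
the real `bucket_idx()` helper; `demo_bucket_idx` and its unit-count
argument are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: with one free list per power-of-two size class,
 * picking a candidate list is just floor(log2()) of the chunk size in
 * allocation units. */
static int demo_bucket_idx(size_t chunk_units)
{
	int b = 0;

	while ((chunk_units >> 1) != 0) {
		chunk_units >>= 1;
		b++;
	}
	return b;
}
```

A request first samples the list for its own size class (whose chunks might
be slightly too small), then falls back to the next bucket up, where every
chunk is guaranteed large enough, matching the bounded search described above.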

void *sys_heap_noalign_alloc(struct sys_heap *heap, size_t align, size_t bytes)
{
	ARG_UNUSED(align);

	return sys_heap_alloc(heap, bytes);
}

void *sys_heap_aligned_alloc(struct sys_heap *heap, size_t align, size_t bytes)
{
	struct z_heap *h = heap->heap;
	size_t gap, rew;

	/*
	 * Split align and rewind values (if any).
	 * We allow for one bit of rewind in addition to the alignment
	 * value to efficiently accommodate z_alloc_helper().
	 * So if e.g. align = 0x28 (32 | 8) this means we align to a 32-byte
	 * boundary and then rewind 8 bytes.
	 */
	rew = align & -align;
	if (align != rew) {
		align -= rew;
		gap = MIN(rew, chunk_header_bytes(h));
	} else {
		if (align <= chunk_header_bytes(h)) {
			return sys_heap_alloc(heap, bytes);
		}
		rew = 0;
		gap = chunk_header_bytes(h);
	}
	__ASSERT((align & (align - 1)) == 0, "align must be a power of 2");

	if (bytes == 0) {
		return NULL;
	}

	/*
	 * Find a free block that is guaranteed to fit.
	 * We over-allocate to account for alignment and then free
	 * the extra allocations afterwards.
	 */
	chunksz_t padded_sz = bytes_to_chunksz(h, bytes, align - gap);
	chunkid_t c0 = alloc_chunk(h, padded_sz);

	if (c0 == 0) {
		return NULL;
	}
	uint8_t *mem = chunk_mem(h, c0);

	/* Align allocated memory */
	mem = (uint8_t *) ROUND_UP(mem + rew, align) - rew;
	chunk_unit_t *end = (chunk_unit_t *) ROUND_UP(mem + bytes, CHUNK_UNIT);

	/* Get corresponding chunks */
	chunkid_t c = mem_to_chunkid(h, mem);
	chunkid_t c_end = end - chunk_buf(h);

	CHECK(c >= c0 && c < c_end && c_end <= c0 + padded_sz);

	/* Split and free unused prefix */
	if (c > c0) {
		split_chunks(h, c0, c);
		free_list_add(h, c0);
	}

	/* Split and free unused suffix */
	if (right_chunk(h, c) > c_end) {
		split_chunks(h, c, c_end);
		free_list_add(h, c_end);
	}

	set_chunk_used(h, c, true);

#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
	increase_allocated_bytes(h, chunksz_to_bytes(h, chunk_size(h, c)));
#endif

#ifdef CONFIG_SYS_HEAP_LISTENER
	heap_listener_notify_alloc(HEAP_ID_FROM_POINTER(heap), mem,
				   chunksz_to_bytes(h, chunk_size(h, c)));
#endif

	IF_ENABLED(CONFIG_MSAN, (__msan_allocated_memory(mem, bytes)));

	return mem;
}
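The align/rewind encoding handled at the top of `sys_heap_aligned_alloc()`
can be demonstrated in isolation. `demo_align_ptr` and `DEMO_ROUND_UP` are
illustrative stand-ins for the real pointer math, not part of the sys_heap
API:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: the lowest set bit of 'align' is the rewind, the
 * remaining bits are the true power-of-two alignment. */
#define DEMO_ROUND_UP(x, a) ((((uintptr_t)(x)) + ((uintptr_t)(a) - 1)) \
			     & ~((uintptr_t)(a) - 1))

static uintptr_t demo_align_ptr(uintptr_t mem, uintptr_t align)
{
	uintptr_t rew = align & -align;	/* lowest set bit */

	if (align != rew) {
		align -= rew;		/* what's left is the alignment */
	} else {
		rew = 0;		/* plain power-of-two request */
	}
	/* The returned pointer plus 'rew' lands on an 'align' boundary */
	return DEMO_ROUND_UP(mem + rew, align) - rew;
}
```

For example, `align = 0x28` (32 | 8) from address 100 rounds 108 up to the
32-byte boundary 128 and rewinds 8, returning 120, so that 120 + 8 is
32-byte aligned.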

static bool inplace_realloc(struct sys_heap *heap, void *ptr, size_t bytes)
{
	struct z_heap *h = heap->heap;

	chunkid_t c = mem_to_chunkid(h, ptr);
	size_t align_gap = (uint8_t *)ptr - (uint8_t *)chunk_mem(h, c);

	chunksz_t chunks_need = bytes_to_chunksz(h, bytes, align_gap);

	if (chunk_size(h, c) == chunks_need) {
		/* We're good already */
		return true;
	}

	if (chunk_size(h, c) > chunks_need) {
		/* Shrink in place, split off and free unused suffix */
#ifdef CONFIG_SYS_HEAP_LISTENER
		size_t bytes_freed = chunksz_to_bytes(h, chunk_size(h, c));
#endif

#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
		h->allocated_bytes -=
			(chunk_size(h, c) - chunks_need) * CHUNK_UNIT;
#endif

		split_chunks(h, c, c + chunks_need);
		set_chunk_used(h, c, true);
		free_chunk(h, c + chunks_need);

#ifdef CONFIG_SYS_HEAP_LISTENER
		heap_listener_notify_alloc(HEAP_ID_FROM_POINTER(heap), ptr,
					   chunksz_to_bytes(h, chunk_size(h, c)));
		heap_listener_notify_free(HEAP_ID_FROM_POINTER(heap), ptr,
					  bytes_freed);
#endif

		return true;
	}

	chunkid_t rc = right_chunk(h, c);

	if (!chunk_used(h, rc) &&
	    (chunk_size(h, c) + chunk_size(h, rc) >= chunks_need)) {
		/* Expand: split the right chunk and append */
		chunksz_t split_size = chunks_need - chunk_size(h, c);

#ifdef CONFIG_SYS_HEAP_LISTENER
		size_t bytes_freed = chunksz_to_bytes(h, chunk_size(h, c));
#endif

#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
		increase_allocated_bytes(h, split_size * CHUNK_UNIT);
#endif

		free_list_remove(h, rc);

		if (split_size < chunk_size(h, rc)) {
			split_chunks(h, rc, rc + split_size);
			free_list_add(h, rc + split_size);
		}

		merge_chunks(h, c, rc);
		set_chunk_used(h, c, true);

#ifdef CONFIG_SYS_HEAP_LISTENER
		heap_listener_notify_alloc(HEAP_ID_FROM_POINTER(heap), ptr,
					   chunksz_to_bytes(h, chunk_size(h, c)));
		heap_listener_notify_free(HEAP_ID_FROM_POINTER(heap), ptr,
					  bytes_freed);
#endif

		return true;
	}

	return false;
}

void *sys_heap_realloc(struct sys_heap *heap, void *ptr, size_t bytes)
{
	/* special realloc semantics */
	if (ptr == NULL) {
		return sys_heap_alloc(heap, bytes);
	}
	if (bytes == 0) {
		sys_heap_free(heap, ptr);
		return NULL;
	}

	if (inplace_realloc(heap, ptr, bytes)) {
		return ptr;
	}

	/* In-place realloc was not possible: fall back to allocate and copy. */
	void *ptr2 = sys_heap_alloc(heap, bytes);

	if (ptr2 != NULL) {
		size_t prev_size = sys_heap_usable_size(heap, ptr);

		memcpy(ptr2, ptr, MIN(prev_size, bytes));
		sys_heap_free(heap, ptr);
	}
	return ptr2;
}
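The special cases at the top of `sys_heap_realloc()` mirror classic
`realloc()` semantics. A minimal sketch, using the C library allocator as a
stand-in for the sys_heap calls (`demo_realloc` is hypothetical):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of the special-case semantics using malloc/free/realloc as a
 * stand-in for sys_heap_alloc()/sys_heap_free(): */
static void *demo_realloc(void *ptr, size_t bytes)
{
	if (ptr == NULL) {
		return malloc(bytes);	/* realloc(NULL, n) acts as alloc */
	}
	if (bytes == 0) {
		free(ptr);		/* realloc(p, 0) acts as free */
		return NULL;
	}
	return realloc(ptr, bytes);	/* otherwise resize, possibly moving */
}
```

As in the function above, a successful resize preserves the original
contents up to the smaller of the old and new sizes.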

void *sys_heap_aligned_realloc(struct sys_heap *heap, void *ptr,
			       size_t align, size_t bytes)
{
	/* special realloc semantics */
	if (ptr == NULL) {
		return sys_heap_aligned_alloc(heap, align, bytes);
	}
	if (bytes == 0) {
		sys_heap_free(heap, ptr);
		return NULL;
	}

	__ASSERT((align & (align - 1)) == 0, "align must be a power of 2");

	if ((align == 0 || ((uintptr_t)ptr & (align - 1)) == 0) &&
	    inplace_realloc(heap, ptr, bytes)) {
		return ptr;
	}

	/*
	 * Either ptr is not sufficiently aligned for in-place realloc or
	 * in-place realloc was not possible: fall back to allocate and copy.
	 */
	void *ptr2 = sys_heap_aligned_alloc(heap, align, bytes);

	if (ptr2 != NULL) {
		size_t prev_size = sys_heap_usable_size(heap, ptr);

		memcpy(ptr2, ptr, MIN(prev_size, bytes));
		sys_heap_free(heap, ptr);
	}
	return ptr2;
}

void sys_heap_init(struct sys_heap *heap, void *mem, size_t bytes)
{
	IF_ENABLED(CONFIG_MSAN, (__sanitizer_dtor_callback(mem, bytes)));

	if (IS_ENABLED(CONFIG_SYS_HEAP_SMALL_ONLY)) {
		/* Must fit in a 15 bit count of CHUNK_UNIT */
		__ASSERT(bytes / CHUNK_UNIT <= 0x7fffU, "heap size is too big");
	} else {
		/* Must fit in a 31 bit count of CHUNK_UNIT */
		__ASSERT(bytes / CHUNK_UNIT <= 0x7fffffffU, "heap size is too big");
	}

	/* Reserve the end marker chunk's header */
	__ASSERT(bytes > heap_footer_bytes(bytes), "heap size is too small");
	bytes -= heap_footer_bytes(bytes);

	/* Round the start up, the end down */
	uintptr_t addr = ROUND_UP(mem, CHUNK_UNIT);
	uintptr_t end = ROUND_DOWN((uint8_t *)mem + bytes, CHUNK_UNIT);
	chunksz_t heap_sz = (end - addr) / CHUNK_UNIT;

	CHECK(end > addr);
	__ASSERT(heap_sz > chunksz(sizeof(struct z_heap)), "heap size is too small");

	struct z_heap *h = (struct z_heap *)addr;

	heap->heap = h;
	h->end_chunk = heap_sz;
	h->avail_buckets = 0;

#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
	h->free_bytes = 0;
	h->allocated_bytes = 0;
	h->max_allocated_bytes = 0;
#endif

#if CONFIG_SYS_HEAP_ARRAY_SIZE
	sys_heap_array_save(heap);
#endif

	int nb_buckets = bucket_idx(h, heap_sz) + 1;
	chunksz_t chunk0_size = chunksz(sizeof(struct z_heap) +
					nb_buckets * sizeof(struct z_heap_bucket));

	__ASSERT(chunk0_size + min_chunk_size(h) <= heap_sz, "heap size is too small");

	for (int i = 0; i < nb_buckets; i++) {
		h->buckets[i].next = 0;
	}

	/* chunk containing our struct z_heap */
	set_chunk_size(h, 0, chunk0_size);
	set_left_chunk_size(h, 0, 0);
	set_chunk_used(h, 0, true);

	/* chunk containing the free heap */
2021-03-17 21:53:25 -04:00
|
|
|
set_chunk_size(h, chunk0_size, heap_sz - chunk0_size);
|
2019-09-25 16:06:26 -04:00
|
|
|
set_left_chunk_size(h, chunk0_size, chunk0_size);
|
2019-09-26 01:59:35 -04:00
|
|
|
|
|
|
|
/* the end marker chunk */
|
2021-03-17 21:53:25 -04:00
|
|
|
set_chunk_size(h, heap_sz, 0);
|
|
|
|
set_left_chunk_size(h, heap_sz, heap_sz - chunk0_size);
|
|
|
|
set_chunk_used(h, heap_sz, true);
|
2019-09-26 01:59:35 -04:00
|
|
|
|
2019-09-25 16:06:26 -04:00
|
|
|
free_list_add(h, chunk0_size);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
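The three-chunk layout this formatting produces (a used metadata chunk, one large free chunk, a used zero-size end marker) can be modeled and sanity-checked standalone; the arrays, sizes, and names below are toy stand-ins for the real packed headers:

```c
/* Toy model of a freshly formatted heap: chunk 0 holds the heap
 * metadata and is marked used, the rest of the heap is one free
 * chunk, and a used zero-size end marker stops merges from running
 * past the heap. Offsets and sizes are in 8-byte units; all names
 * and constants here are illustrative.
 */
#include <assert.h>
#include <stdbool.h>

#define HEAP_SZ   32 /* total heap size in units (assumed) */
#define CHUNK0_SZ  2 /* units reserved for the metadata chunk (assumed) */

static int  csize[HEAP_SZ + 1];
static int  cleft[HEAP_SZ + 1];
static bool cused[HEAP_SZ + 1];

static void format_heap(void)
{
	/* chunk containing the heap metadata */
	csize[0] = CHUNK0_SZ;
	cleft[0] = 0;
	cused[0] = true;

	/* chunk containing the free heap */
	csize[CHUNK0_SZ] = HEAP_SZ - CHUNK0_SZ;
	cleft[CHUNK0_SZ] = CHUNK0_SZ;
	cused[CHUNK0_SZ] = false;

	/* the end marker chunk */
	csize[HEAP_SZ] = 0;
	cleft[HEAP_SZ] = HEAP_SZ - CHUNK0_SZ;
	cused[HEAP_SZ] = true;
}
```

Walking right from chunk 0 by adding sizes must land exactly on the end marker, and each chunk's left-size must mirror its neighbor's size; those are the invariants the real validation framework checks across the whole heap.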