lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
/*
|
|
|
|
* Copyright (c) 2019 Intel Corporation
|
|
|
|
*
|
|
|
|
* SPDX-License-Identifier: Apache-2.0
|
|
|
|
*/
|
2022-05-06 11:23:05 +02:00
|
|
|
#include <zephyr/sys/sys_heap.h>
|
|
|
|
#include <zephyr/sys/util.h>
|
|
|
|
#include <zephyr/kernel.h>
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
#include "heap.h"
|
|
|
|
|
|
|
|
/* White-box sys_heap validation code. Uses internal data structures.
|
|
|
|
* Not expected to be useful in production apps. This checks every
|
|
|
|
* header field of every chunk and returns true if the totality of the
|
|
|
|
* data structure is a valid heap. It doesn't necessarily tell you
|
|
|
|
* that it is the CORRECT heap given the history of alloc/free calls
|
|
|
|
* that it can't inspect. In a pathological case, you can imagine
|
|
|
|
* something scribbling a copy of a previously-valid heap on top of a
|
|
|
|
* running one and corrupting it. YMMV.
|
|
|
|
*/
|
|
|
|
|
2020-06-26 10:37:40 -04:00
|
|
|
#define VALIDATE(cond) do { if (!(cond)) { return false; } } while (0)
|
|
|
|
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
static bool in_bounds(struct z_heap *h, chunkid_t c)
|
|
|
|
{
|
2020-06-26 10:37:40 -04:00
|
|
|
VALIDATE(c >= right_chunk(h, 0));
|
2021-05-26 18:44:14 +09:00
|
|
|
VALIDATE(c < h->end_chunk);
|
2021-03-17 21:05:49 -04:00
|
|
|
VALIDATE(chunk_size(h, c) < h->end_chunk);
|
2020-06-26 10:37:40 -04:00
|
|
|
return true;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
static bool valid_chunk(struct z_heap *h, chunkid_t c)
|
|
|
|
{
|
2020-06-26 10:37:40 -04:00
|
|
|
VALIDATE(chunk_size(h, c) > 0);
|
2021-03-17 21:05:49 -04:00
|
|
|
VALIDATE(c + chunk_size(h, c) <= h->end_chunk);
|
2020-06-26 10:37:40 -04:00
|
|
|
VALIDATE(in_bounds(h, c));
|
|
|
|
VALIDATE(right_chunk(h, left_chunk(h, c)) == c);
|
|
|
|
VALIDATE(left_chunk(h, right_chunk(h, c)) == c);
|
2020-06-26 11:11:04 -04:00
|
|
|
if (chunk_used(h, c)) {
|
|
|
|
VALIDATE(!solo_free_header(h, c));
|
|
|
|
} else {
|
2020-06-26 10:37:40 -04:00
|
|
|
VALIDATE(chunk_used(h, left_chunk(h, c)));
|
|
|
|
VALIDATE(chunk_used(h, right_chunk(h, c)));
|
2020-06-26 11:11:04 -04:00
|
|
|
if (!solo_free_header(h, c)) {
|
|
|
|
VALIDATE(in_bounds(h, prev_free_chunk(h, c)));
|
|
|
|
VALIDATE(in_bounds(h, next_free_chunk(h, c)));
|
|
|
|
}
|
2020-06-26 10:37:40 -04:00
|
|
|
}
|
|
|
|
return true;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Validate multiple state dimensions for the bucket "next" pointer
|
|
|
|
* and see that they match. Probably should unify the design a
|
|
|
|
* bit...
|
|
|
|
*/
|
|
|
|
static inline void check_nexts(struct z_heap *h, int bidx)
|
|
|
|
{
|
|
|
|
struct z_heap_bucket *b = &h->buckets[bidx];
|
|
|
|
|
2021-12-11 22:58:38 -05:00
|
|
|
bool emptybit = (h->avail_buckets & BIT(bidx)) == 0;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
bool emptylist = b->next == 0;
|
2019-10-23 19:43:16 -04:00
|
|
|
bool empties_match = emptybit == emptylist;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
|
|
|
(void)empties_match;
|
|
|
|
CHECK(empties_match);
|
|
|
|
|
|
|
|
if (b->next != 0) {
|
|
|
|
CHECK(valid_chunk(h, b->next));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-11-12 00:12:29 -05:00
|
|
|
static void get_alloc_info(struct z_heap *h, size_t *alloc_bytes,
|
|
|
|
size_t *free_bytes)
|
2021-11-11 15:37:04 +08:00
|
|
|
{
|
2021-11-12 00:12:29 -05:00
|
|
|
chunkid_t c;
|
2021-11-11 15:37:04 +08:00
|
|
|
|
2021-11-12 00:12:29 -05:00
|
|
|
*alloc_bytes = 0;
|
|
|
|
*free_bytes = 0;
|
2021-11-11 15:37:04 +08:00
|
|
|
|
2021-11-12 00:12:29 -05:00
|
|
|
for (c = right_chunk(h, 0); c < h->end_chunk; c = right_chunk(h, c)) {
|
2021-11-11 15:37:04 +08:00
|
|
|
if (chunk_used(h, c)) {
|
2021-11-12 00:12:29 -05:00
|
|
|
*alloc_bytes += chunksz_to_bytes(h, chunk_size(h, c));
|
|
|
|
} else if (!solo_free_header(h, c)) {
|
|
|
|
*free_bytes += chunksz_to_bytes(h, chunk_size(h, c));
|
2021-11-11 15:37:04 +08:00
|
|
|
}
|
2021-11-12 00:12:29 -05:00
|
|
|
}
|
2021-11-11 15:37:04 +08:00
|
|
|
}
|
|
|
|
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
bool sys_heap_validate(struct sys_heap *heap)
|
|
|
|
{
|
|
|
|
struct z_heap *h = heap->heap;
|
|
|
|
chunkid_t c;
|
|
|
|
|
2020-06-26 10:37:40 -04:00
|
|
|
/*
|
|
|
|
* Walk through the chunks linearly, verifying sizes and end pointer.
|
|
|
|
*/
|
2021-05-26 18:44:14 +09:00
|
|
|
for (c = right_chunk(h, 0); c < h->end_chunk; c = right_chunk(h, c)) {
|
2020-06-26 10:37:40 -04:00
|
|
|
if (!valid_chunk(h, c)) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
}
|
2021-03-17 21:05:49 -04:00
|
|
|
if (c != h->end_chunk) {
|
2020-06-26 10:37:40 -04:00
|
|
|
return false; /* Should have exactly consumed the buffer */
|
|
|
|
}
|
|
|
|
|
2021-11-11 15:37:04 +08:00
|
|
|
#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
|
|
|
|
/*
|
|
|
|
* Validate sys_heap_runtime_stats_get API.
|
|
|
|
* Iterate all chunks in sys_heap to get total allocated bytes and
|
|
|
|
* free bytes, then compare with the results of
|
|
|
|
* sys_heap_runtime_stats_get function.
|
|
|
|
*/
|
2021-11-12 00:12:29 -05:00
|
|
|
size_t allocated_bytes, free_bytes;
|
|
|
|
struct sys_heap_runtime_stats stat;
|
2021-11-11 15:37:04 +08:00
|
|
|
|
2021-11-12 00:12:29 -05:00
|
|
|
get_alloc_info(h, &allocated_bytes, &free_bytes);
|
|
|
|
sys_heap_runtime_stats_get(heap, &stat);
|
|
|
|
if ((stat.allocated_bytes != allocated_bytes) ||
|
|
|
|
(stat.free_bytes != free_bytes)) {
|
2021-11-11 15:37:04 +08:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
/* Check the free lists: entry count should match, empty bit
|
|
|
|
* should be correct, and all chunk entries should point into
|
|
|
|
* valid unused chunks. Mark those chunks USED, temporarily.
|
|
|
|
*/
|
2021-03-17 21:05:49 -04:00
|
|
|
for (int b = 0; b <= bucket_idx(h, h->end_chunk); b++) {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
chunkid_t c0 = h->buckets[b].next;
|
2020-05-27 11:26:57 -05:00
|
|
|
uint32_t n = 0;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
|
|
|
check_nexts(h, b);
|
|
|
|
|
|
|
|
for (c = c0; c != 0 && (n == 0 || c != c0);
|
2019-09-24 20:47:24 -04:00
|
|
|
n++, c = next_free_chunk(h, c)) {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
if (!valid_chunk(h, c)) {
|
|
|
|
return false;
|
|
|
|
}
|
2019-09-24 23:20:53 -04:00
|
|
|
set_chunk_used(h, c, true);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
|
2021-12-11 22:58:38 -05:00
|
|
|
bool empty = (h->avail_buckets & BIT(b)) == 0;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
bool zero = n == 0;
|
|
|
|
|
|
|
|
if (empty != zero) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (empty && h->buckets[b].next != 0) {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-06-26 10:37:40 -04:00
|
|
|
/*
|
|
|
|
* Walk through the chunks linearly again, verifying that all chunks
|
2020-06-26 11:11:04 -04:00
|
|
|
* but solo headers are now USED (i.e. all free blocks were found
|
|
|
|
* during enumeration). Mark all such blocks UNUSED and solo headers
|
|
|
|
* USED.
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
*/
|
2019-09-24 20:47:24 -04:00
|
|
|
chunkid_t prev_chunk = 0;
|
2021-05-26 18:44:14 +09:00
|
|
|
for (c = right_chunk(h, 0); c < h->end_chunk; c = right_chunk(h, c)) {
|
2020-06-26 11:11:04 -04:00
|
|
|
if (!chunk_used(h, c) && !solo_free_header(h, c)) {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
return false;
|
|
|
|
}
|
2019-09-25 16:06:26 -04:00
|
|
|
if (left_chunk(h, c) != prev_chunk) {
|
|
|
|
return false;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
2019-09-24 20:47:24 -04:00
|
|
|
prev_chunk = c;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
2020-06-26 11:11:04 -04:00
|
|
|
set_chunk_used(h, c, solo_free_header(h, c));
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
2021-03-17 21:05:49 -04:00
|
|
|
if (c != h->end_chunk) {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
return false; /* Should have exactly consumed the buffer */
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Go through the free lists again checking that the linear
|
|
|
|
* pass caught all the blocks and that they now show UNUSED.
|
|
|
|
* Mark them USED.
|
|
|
|
*/
|
2021-03-17 21:05:49 -04:00
|
|
|
for (int b = 0; b <= bucket_idx(h, h->end_chunk); b++) {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
chunkid_t c0 = h->buckets[b].next;
|
|
|
|
int n = 0;
|
|
|
|
|
|
|
|
if (c0 == 0) {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2019-09-24 20:47:24 -04:00
|
|
|
for (c = c0; n == 0 || c != c0; n++, c = next_free_chunk(h, c)) {
|
|
|
|
if (chunk_used(h, c)) {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
return false;
|
|
|
|
}
|
2019-09-24 23:20:53 -04:00
|
|
|
set_chunk_used(h, c, true);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Now we are valid, but have managed to invert all the in-use
|
|
|
|
* fields. One more linear pass to fix them up
|
|
|
|
*/
|
2021-05-26 18:44:14 +09:00
|
|
|
for (c = right_chunk(h, 0); c < h->end_chunk; c = right_chunk(h, c)) {
|
2019-09-24 23:20:53 -04:00
|
|
|
set_chunk_used(h, c, !chunk_used(h, c));
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct z_heap_stress_rec {
|
2021-03-22 08:01:59 -04:00
|
|
|
void *(*alloc_fn)(void *arg, size_t bytes);
|
|
|
|
void (*free_fn)(void *arg, void *p);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
void *arg;
|
|
|
|
size_t total_bytes;
|
|
|
|
struct z_heap_stress_block *blocks;
|
|
|
|
size_t nblocks;
|
|
|
|
size_t blocks_alloced;
|
|
|
|
size_t bytes_alloced;
|
2020-05-27 11:26:57 -05:00
|
|
|
uint32_t target_percent;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
};
|
|
|
|
|
|
|
|
struct z_heap_stress_block {
|
|
|
|
void *ptr;
|
|
|
|
size_t sz;
|
|
|
|
};
|
|
|
|
|
|
|
|
/* Very simple LCRNG (from https://nuclear.llnl.gov/CNP/rng/rngman/node4.html)
|
|
|
|
*
|
|
|
|
* Here to guarantee cross-platform test repeatability.
|
|
|
|
*/
|
2020-05-27 11:26:57 -05:00
|
|
|
static uint32_t rand32(void)
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
{
|
2020-05-27 11:26:57 -05:00
|
|
|
static uint64_t state = 123456789; /* seed */
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
|
|
|
state = state * 2862933555777941757UL + 3037000493UL;
|
|
|
|
|
2020-05-27 11:26:57 -05:00
|
|
|
return (uint32_t)(state >> 32);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
static bool rand_alloc_choice(struct z_heap_stress_rec *sr)
|
|
|
|
{
|
|
|
|
/* Edge cases: no blocks allocated, and no space for a new one */
|
|
|
|
if (sr->blocks_alloced == 0) {
|
|
|
|
return true;
|
|
|
|
} else if (sr->blocks_alloced >= sr->nblocks) {
|
|
|
|
return false;
|
2021-04-28 10:53:38 -07:00
|
|
|
} else {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
2021-04-28 10:53:38 -07:00
|
|
|
/* The way this works is to scale the chance of choosing to
|
|
|
|
* allocate vs. free such that it's even odds when the heap is
|
|
|
|
* at the target percent, with linear tapering on the low
|
|
|
|
* slope (i.e. we choose to always allocate with an empty
|
|
|
|
* heap, allocate 50% of the time when the heap is exactly at
|
|
|
|
* the target, and always free when above the target). In
|
|
|
|
* practice, the operations aren't quite symmetric (you can
|
|
|
|
* always free, but your allocation might fail), and the units
|
|
|
|
* aren't matched (we're doing math based on bytes allocated
|
|
|
|
* and ignoring the overhead) but this is close enough. And
|
|
|
|
* yes, the math here is coarse (in units of percent), but
|
|
|
|
* that's good enough and fits well inside 32 bit quantities.
|
|
|
|
* (Note precision issue when heap size is above 40MB
|
|
|
|
* though!).
|
|
|
|
*/
|
|
|
|
__ASSERT(sr->total_bytes < 0xffffffffU / 100, "too big for u32!");
|
|
|
|
uint32_t full_pct = (100 * sr->bytes_alloced) / sr->total_bytes;
|
|
|
|
uint32_t target = sr->target_percent ? sr->target_percent : 1;
|
|
|
|
uint32_t free_chance = 0xffffffffU;
|
|
|
|
|
|
|
|
if (full_pct < sr->target_percent) {
|
|
|
|
free_chance = full_pct * (0x80000000U / target);
|
|
|
|
}
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
2021-04-28 10:53:38 -07:00
|
|
|
return rand32() > free_chance;
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Chooses a size of block to allocate, logarithmically favoring
|
|
|
|
* smaller blocks (i.e. blocks twice as large are half as frequent
|
|
|
|
*/
|
|
|
|
static size_t rand_alloc_size(struct z_heap_stress_rec *sr)
|
|
|
|
{
|
|
|
|
ARG_UNUSED(sr);
|
|
|
|
|
|
|
|
/* Min scale of 4 means that the half of the requests in the
|
|
|
|
* smallest size have an average size of 8
|
|
|
|
*/
|
|
|
|
int scale = 4 + __builtin_clz(rand32());
|
|
|
|
|
2021-12-11 22:58:38 -05:00
|
|
|
return rand32() & BIT_MASK(scale);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Returns the index of a randomly chosen block to free */
|
|
|
|
static size_t rand_free_choice(struct z_heap_stress_rec *sr)
|
|
|
|
{
|
|
|
|
return rand32() % sr->blocks_alloced;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* General purpose heap stress test. Takes function pointers to allow
|
|
|
|
* for testing multiple heap APIs with the same rig. The alloc and
|
|
|
|
* free functions are passed back the argument as a context pointer.
|
|
|
|
* The "log" function is for readable user output. The total_bytes
|
|
|
|
* argument should reflect the size of the heap being tested. The
|
|
|
|
* scratch array is used to store temporary state and should be sized
|
|
|
|
* about half as large as the heap itself. Returns true on success.
|
|
|
|
*/
|
2021-03-22 08:01:59 -04:00
|
|
|
void sys_heap_stress(void *(*alloc_fn)(void *arg, size_t bytes),
|
|
|
|
void (*free_fn)(void *arg, void *p),
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
void *arg, size_t total_bytes,
|
2020-05-27 11:26:57 -05:00
|
|
|
uint32_t op_count,
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
void *scratch_mem, size_t scratch_bytes,
|
|
|
|
int target_percent,
|
|
|
|
struct z_heap_stress_result *result)
|
|
|
|
{
|
|
|
|
struct z_heap_stress_rec sr = {
|
2021-03-22 08:01:59 -04:00
|
|
|
.alloc_fn = alloc_fn,
|
|
|
|
.free_fn = free_fn,
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
.arg = arg,
|
|
|
|
.total_bytes = total_bytes,
|
|
|
|
.blocks = scratch_mem,
|
|
|
|
.nblocks = scratch_bytes / sizeof(struct z_heap_stress_block),
|
|
|
|
.target_percent = target_percent,
|
|
|
|
};
|
|
|
|
|
|
|
|
*result = (struct z_heap_stress_result) {0};
|
|
|
|
|
2020-05-27 11:26:57 -05:00
|
|
|
for (uint32_t i = 0; i < op_count; i++) {
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
if (rand_alloc_choice(&sr)) {
|
|
|
|
size_t sz = rand_alloc_size(&sr);
|
2021-03-22 08:01:59 -04:00
|
|
|
void *p = sr.alloc_fn(sr.arg, sz);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
|
|
|
|
result->total_allocs++;
|
|
|
|
if (p != NULL) {
|
|
|
|
result->successful_allocs++;
|
|
|
|
sr.blocks[sr.blocks_alloced].ptr = p;
|
|
|
|
sr.blocks[sr.blocks_alloced].sz = sz;
|
|
|
|
sr.blocks_alloced++;
|
|
|
|
sr.bytes_alloced += sz;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
int b = rand_free_choice(&sr);
|
|
|
|
void *p = sr.blocks[b].ptr;
|
|
|
|
size_t sz = sr.blocks[b].sz;
|
|
|
|
|
|
|
|
result->total_frees++;
|
|
|
|
sr.blocks[b] = sr.blocks[sr.blocks_alloced - 1];
|
|
|
|
sr.blocks_alloced--;
|
|
|
|
sr.bytes_alloced -= sz;
|
2021-03-22 08:01:59 -04:00
|
|
|
sr.free_fn(sr.arg, p);
|
lib/os: Add sys_heap, a new/simpler/faster memory allocator
The existing mem_pool implementation has been an endless source of
frustration. It's had alignment bugs, it's had racy behavior. It's
never been particularly fast. It's outrageously complicated to
configure statically. And while its fragmentation resistance and
overhead on small blocks is good, it's space efficiencey has always
been very poor due to the four-way buddy scheme.
This patch introduces sys_heap. It's a more or less conventional
segregated fit allocator with power-of-two buckets. It doesn't expose
its level structure to the user at all, simply taking an arbitrarily
aligned pointer to memory. It stores all metadata inside the heap
region. It allocates and frees by simple pointer and not block ID.
Static initialization is trivial, and runtime initialization is only a
few cycles to format and add one block to a list header.
It has excellent space efficiency. Chunks can be split arbitrarily in
8 byte units. Overhead is only four bytes per allocated chunk (eight
bytes for heaps >256kb or on 64 bit systems), plus a log2-sized array
of 2-word bucket headers. No coarse alignment restrictions on blocks,
they can be split and merged (in units of 8 bytes) arbitrarily.
It has good fragmentation resistance. Freed blocks are always
immediately merged with adjacent free blocks. Allocations are
attempted from a sample of the smallest bucket that might fit, falling
back rapidly to the smallest block guaranteed to fit. Split memory
remaining in the chunk is always returned immediately to the heap for
other allocation.
It has excellent performance with firmly bounded runtime. All
operations are constant time (though there is a search of the smallest
bucket that has a compile-time-configurable upper bound, setting this
to extreme values results in an effectively linear search of the
list), objectively fast (about a hundred instructions) and amenable to
locked operation. No more need for fragile lock relaxation trickery.
It also contains an extensive validation and stress test framework,
something that was sorely lacking in the previous implementation.
Note that sys_heap is not a compatible API with sys_mem_pool and
k_mem_pool. Partial wrappers for those (now-) legacy APIs will appear
later and a deprecation strategy needs to be chosen.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2019-07-17 09:58:25 -07:00
|
|
|
}
|
|
|
|
result->accumulated_in_use_bytes += sr.bytes_alloced;
|
|
|
|
}
|
|
|
|
}
|
2020-06-25 15:16:10 -04:00
|
|
|
|
|
|
|
/*
|
2021-03-15 22:30:20 -04:00
|
|
|
* Print heap info for debugging / analysis purpose
|
2020-06-25 15:16:10 -04:00
|
|
|
*/
|
2021-03-15 22:30:20 -04:00
|
|
|
void heap_print_info(struct z_heap *h, bool dump_chunks)
|
2020-06-25 15:16:10 -04:00
|
|
|
{
|
2021-03-17 21:05:49 -04:00
|
|
|
int i, nb_buckets = bucket_idx(h, h->end_chunk) + 1;
|
2021-03-15 22:30:20 -04:00
|
|
|
size_t free_bytes, allocated_bytes, total, overhead;
|
2020-06-25 15:16:10 -04:00
|
|
|
|
2021-03-15 22:30:20 -04:00
|
|
|
printk("Heap at %p contains %d units in %d buckets\n\n",
|
2021-03-17 21:05:49 -04:00
|
|
|
chunk_buf(h), h->end_chunk, nb_buckets);
|
2020-06-25 15:16:10 -04:00
|
|
|
|
2021-03-15 22:30:20 -04:00
|
|
|
printk(" bucket# min units total largest largest\n"
|
|
|
|
" threshold chunks (units) (bytes)\n"
|
|
|
|
" -----------------------------------------------------------\n");
|
2020-06-25 15:16:10 -04:00
|
|
|
for (i = 0; i < nb_buckets; i++) {
|
|
|
|
chunkid_t first = h->buckets[i].next;
|
2021-03-17 21:53:25 -04:00
|
|
|
chunksz_t largest = 0;
|
2020-06-25 15:16:10 -04:00
|
|
|
int count = 0;
|
|
|
|
|
|
|
|
if (first) {
|
|
|
|
chunkid_t curr = first;
|
|
|
|
do {
|
|
|
|
count++;
|
2021-03-15 22:30:20 -04:00
|
|
|
largest = MAX(largest, chunk_size(h, curr));
|
2020-06-25 15:16:10 -04:00
|
|
|
curr = next_free_chunk(h, curr);
|
|
|
|
} while (curr != first);
|
|
|
|
}
|
2021-03-15 22:30:20 -04:00
|
|
|
if (count) {
|
2021-03-17 21:53:25 -04:00
|
|
|
printk("%9d %12d %12d %12d %12zd\n",
|
2021-03-15 22:30:20 -04:00
|
|
|
i, (1 << i) - 1 + min_chunk_size(h), count,
|
2021-03-17 22:21:14 -04:00
|
|
|
largest, chunksz_to_bytes(h, largest));
|
2021-03-15 22:30:20 -04:00
|
|
|
}
|
2020-06-25 15:16:10 -04:00
|
|
|
}
|
|
|
|
|
2021-03-15 22:30:20 -04:00
|
|
|
if (dump_chunks) {
|
|
|
|
printk("\nChunk dump:\n");
|
2021-11-12 00:12:29 -05:00
|
|
|
for (chunkid_t c = 0; ; c = right_chunk(h, c)) {
|
2021-03-17 21:53:25 -04:00
|
|
|
printk("chunk %4d: [%c] size=%-4d left=%-4d right=%d\n",
|
2021-03-15 22:30:20 -04:00
|
|
|
c,
|
|
|
|
chunk_used(h, c) ? '*'
|
|
|
|
: solo_free_header(h, c) ? '.'
|
|
|
|
: '-',
|
|
|
|
chunk_size(h, c),
|
|
|
|
left_chunk(h, c),
|
|
|
|
right_chunk(h, c));
|
2021-11-12 00:12:29 -05:00
|
|
|
if (c == h->end_chunk) {
|
|
|
|
break;
|
|
|
|
}
|
2020-06-25 15:16:10 -04:00
|
|
|
}
|
|
|
|
}
|
2021-03-15 22:30:20 -04:00
|
|
|
|
2021-11-12 00:12:29 -05:00
|
|
|
get_alloc_info(h, &allocated_bytes, &free_bytes);
|
2021-03-17 21:05:49 -04:00
|
|
|
/* The end marker chunk has a header. It is part of the overhead. */
|
|
|
|
total = h->end_chunk * CHUNK_UNIT + chunk_header_bytes(h);
|
2021-03-15 22:30:20 -04:00
|
|
|
overhead = total - free_bytes - allocated_bytes;
|
|
|
|
printk("\n%zd free bytes, %zd allocated bytes, overhead = %zd bytes (%zd.%zd%%)\n",
|
|
|
|
free_bytes, allocated_bytes, overhead,
|
|
|
|
(1000 * overhead + total/2) / total / 10,
|
|
|
|
(1000 * overhead + total/2) / total % 10);
|
2020-06-25 15:16:10 -04:00
|
|
|
}
|
|
|
|
|
2021-03-15 22:30:20 -04:00
|
|
|
void sys_heap_print_info(struct sys_heap *heap, bool dump_chunks)
|
2020-06-25 15:16:10 -04:00
|
|
|
{
|
2021-03-15 22:30:20 -04:00
|
|
|
heap_print_info(heap->heap, dump_chunks);
|
2020-06-25 15:16:10 -04:00
|
|
|
}
|
2021-10-29 15:49:07 +08:00
|
|
|
|
|
|
|
#ifdef CONFIG_SYS_HEAP_RUNTIME_STATS
|
|
|
|
|
|
|
|
int sys_heap_runtime_stats_get(struct sys_heap *heap,
|
|
|
|
struct sys_heap_runtime_stats *stats)
|
|
|
|
{
|
|
|
|
if ((heap == NULL) || (stats == NULL)) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
stats->free_bytes = heap->heap->free_bytes;
|
|
|
|
stats->allocated_bytes = heap->heap->allocated_bytes;
|
2022-03-29 19:07:43 +02:00
|
|
|
stats->max_allocated_bytes = heap->heap->max_allocated_bytes;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int sys_heap_runtime_stats_reset_max(struct sys_heap *heap)
|
|
|
|
{
|
|
|
|
if (heap == NULL) {
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
heap->heap->max_allocated_bytes = heap->heap->allocated_bytes;
|
2021-10-29 15:49:07 +08:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
#endif
|