tests: latency_measure: Update README.txt

Updates the README.txt with more current sample output.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2024-12-17 16:04:15 -08:00
commit 60b38d50db
parent b3b94731d5
1 changed file with 296 additions and 172 deletions
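
The README paragraph added by this commit notes that benchmark output can be saved and exported to third-party tools for comparison. The following is a minimal sketch of that workflow in Python. It assumes the console output has been saved as plain text in the same layout as the sample output in the diff below. The function names (`parse_results`, `to_csv`, `compare`) are hypothetical and are not part of the benchmark or of Zephyr.

```python
import csv
import io
import re

# Matches one result line of the saved console output, for example:
#   thread.create.kernel.from.kernel  - Create thread  :  382 cycles ,  3191 ns :
LINE_RE = re.compile(
    r"^\s*(?P<name>\S+)\s+-\s+(?P<desc>.*?)\s*:\s*"
    r"(?P<cycles>\d+)\s+cycles\s*,\s*(?P<ns>\d+)\s+ns\s*:\s*$"
)


def parse_results(text):
    """Return one dict per result line; other lines are ignored."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rows.append({
                "name": match.group("name"),
                "description": match.group("desc"),
                "cycles": int(match.group("cycles")),
                "ns": int(match.group("ns")),
            })
    return rows


def to_csv(rows):
    """Render parsed rows as CSV text for import into a charting tool."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["name", "description", "cycles", "ns"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def compare(base, other):
    """Map test name to cycle delta (other - base) for tests in both runs."""
    base_cycles = {row["name"]: row["cycles"] for row in base}
    return {
        row["name"]: row["cycles"] - base_cycles[row["name"]]
        for row in other
        if row["name"] in base_cycles
    }


# Two lines each from the default and stack-canaries sample output.
default_log = """
        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     315 cycles ,     2625 ns :
        thread.create.kernel.from.kernel         - Create thread                                      :     382 cycles ,     3191 ns :
"""
canaries_log = """
        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     485 cycles ,     4042 ns :
        thread.create.kernel.from.kernel         - Create thread                                      :     585 cycles ,     4883 ns :
"""

default_rows = parse_results(default_log)
canaries_rows = parse_results(canaries_log)
print(to_csv(default_rows))
print(compare(default_rows, canaries_rows))
```

The same `compare` call works for two runs of one configuration taken on different Zephyr versions, since it only matches rows by test name.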
--- a/tests/benchmarks/latency_measure/README.rst
+++ b/tests/benchmarks/latency_measure/README.rst
@@ -44,7 +44,9 @@ For example, the following will build this project with userspace support:
 The following table summarizes the purposes of the different extra
 configuration files that are available to be used with this benchmark.
 A tester may mix and match them allowing them different scenarios to
-be easily compared the default.
+be easily compared the default. Benchmark output can be saved and subsequently
+exported to third party tools to compare and chart performance differences
+both between configurations as well as across Zephyr versions.

 +-----------------------------+------------------------------------+
 | prj.canaries.conf           | Enable stack canaries              |
@@ -54,184 +56,306 @@ be easily compared the default.
 | prj.userspace.conf          | Enable userspace support           |
 +-----------------------------+------------------------------------+

-Sample output of the benchmark (without userspace enabled)::
+Sample output of the benchmark using the defaults::

-        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     329 cycles ,     2741 ns :
-        thread.yield.cooperative.ctx.k_to_k      - Context switch via k_yield                         :     329 cycles ,     2741 ns :
-        isr.resume.interrupted.thread.kernel     - Return from ISR to interrupted thread              :     363 cycles ,     3033 ns :
-        isr.resume.different.thread.kernel       - Return from ISR to another thread                  :     404 cycles ,     3367 ns :
-        thread.create.kernel.from.kernel         - Create thread                                      :     404 cycles ,     3374 ns :
-        thread.start.kernel.from.kernel          - Start thread                                       :     423 cycles ,     3533 ns :
-        thread.suspend.kernel.from.kernel        - Suspend thread                                     :     428 cycles ,     3574 ns :
-        thread.resume.kernel.from.kernel         - Resume thread                                      :     350 cycles ,     2924 ns :
-        thread.abort.kernel.from.kernel          - Abort thread                                       :     339 cycles ,     2826 ns :
-        fifo.put.immediate.kernel                - Add data to FIFO (no ctx switch)                   :     269 cycles ,     2242 ns :
-        fifo.get.immediate.kernel                - Get data from FIFO (no ctx switch)                 :     128 cycles ,     1074 ns :
-        fifo.put.alloc.immediate.kernel          - Allocate to add data to FIFO (no ctx switch)       :     945 cycles ,     7875 ns :
-        fifo.get.free.immediate.kernel           - Free when getting data from FIFO (no ctx switch)   :     575 cycles ,     4792 ns :
-        fifo.get.blocking.k_to_k                 - Get data from FIFO (w/ ctx switch)                 :     551 cycles ,     4592 ns :
-        fifo.put.wake+ctx.k_to_k                 - Add data to FIFO (w/ ctx switch)                   :     660 cycles ,     5500 ns :
-        fifo.get.free.blocking.k_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :     553 cycles ,     4608 ns :
-        fifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :     655 cycles ,     5458 ns :
-        lifo.put.immediate.kernel                - Add data to LIFO (no ctx switch)                   :     280 cycles ,     2341 ns :
-        lifo.get.immediate.kernel                - Get data from LIFO (no ctx switch)                 :     133 cycles ,     1116 ns :
-        lifo.put.alloc.immediate.kernel          - Allocate to add data to LIFO (no ctx switch)       :     945 cycles ,     7875 ns :
-        lifo.get.free.immediate.kernel           - Free when getting data from LIFO (no ctx switch)   :     580 cycles ,     4833 ns :
-        lifo.get.blocking.k_to_k                 - Get data from LIFO (w/ ctx switch)                 :     553 cycles ,     4608 ns :
-        lifo.put.wake+ctx.k_to_k                 - Add data to LIFO (w/ ctx switch)                   :     655 cycles ,     5458 ns :
-        lifo.get.free.blocking.k_to_k            - Free when getting data from LIFO (w/ ctx switch)   :     550 cycles ,     4583 ns :
-        lifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :     655 cycles ,     5458 ns :
+        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     315 cycles ,     2625 ns :
+        thread.yield.cooperative.ctx.k_to_k      - Context switch via k_yield                         :     315 cycles ,     2625 ns :
+        isr.resume.interrupted.thread.kernel     - Return from ISR to interrupted thread              :     289 cycles ,     2416 ns :
+        isr.resume.different.thread.kernel       - Return from ISR to another thread                  :     374 cycles ,     3124 ns :
+        thread.create.kernel.from.kernel         - Create thread                                      :     382 cycles ,     3191 ns :
+        thread.start.kernel.from.kernel          - Start thread                                       :     394 cycles ,     3291 ns :
+        thread.suspend.kernel.from.kernel        - Suspend thread                                     :     289 cycles ,     2416 ns :
+        thread.resume.kernel.from.kernel         - Resume thread                                      :     339 cycles ,     2833 ns :
+        thread.abort.kernel.from.kernel          - Abort thread                                       :     339 cycles ,     2833 ns :
+        fifo.put.immediate.kernel                - Add data to FIFO (no ctx switch)                   :     214 cycles ,     1791 ns :
+        fifo.get.immediate.kernel                - Get data from FIFO (no ctx switch)                 :     134 cycles ,     1124 ns :
+        fifo.put.alloc.immediate.kernel          - Allocate to add data to FIFO (no ctx switch)       :     834 cycles ,     6950 ns :
+        fifo.get.free.immediate.kernel           - Free when getting data from FIFO (no ctx switch)   :     560 cycles ,     4666 ns :
+        fifo.get.blocking.k_to_k                 - Get data from FIFO (w/ ctx switch)                 :     510 cycles ,     4257 ns :
+        fifo.put.wake+ctx.k_to_k                 - Add data to FIFO (w/ ctx switch)                   :     590 cycles ,     4923 ns :
+        fifo.get.free.blocking.k_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :     510 cycles ,     4250 ns :
+        fifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :     585 cycles ,     4875 ns :
+        lifo.put.immediate.kernel                - Add data to LIFO (no ctx switch)                   :     214 cycles ,     1791 ns :
+        lifo.get.immediate.kernel                - Get data from LIFO (no ctx switch)                 :     120 cycles ,     1008 ns :
+        lifo.put.alloc.immediate.kernel          - Allocate to add data to LIFO (no ctx switch)       :     831 cycles ,     6925 ns :
+        lifo.get.free.immediate.kernel           - Free when getting data from LIFO (no ctx switch)   :     555 cycles ,     4625 ns :
+        lifo.get.blocking.k_to_k                 - Get data from LIFO (w/ ctx switch)                 :     502 cycles ,     4191 ns :
+        lifo.put.wake+ctx.k_to_k                 - Add data to LIFO (w/ ctx switch)                   :     585 cycles ,     4875 ns :
+        lifo.get.free.blocking.k_to_k            - Free when getting data from LIFO (w/ ctx switch)   :     513 cycles ,     4275 ns :
+        lifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :     585 cycles ,     4881 ns :
        events.post.immediate.kernel             - Post events (nothing wakes)                        :     225 cycles ,     1875 ns :
-        events.set.immediate.kernel              - Set events (nothing wakes)                         :     225 cycles ,     1875 ns :
-        events.wait.immediate.kernel             - Wait for any events (no ctx switch)                :     130 cycles ,     1083 ns :
-        events.wait_all.immediate.kernel         - Wait for all events (no ctx switch)                :     135 cycles ,     1125 ns :
-        events.wait.blocking.k_to_k              - Wait for any events (w/ ctx switch)                :     573 cycles ,     4783 ns :
-        events.set.wake+ctx.k_to_k               - Set events (w/ ctx switch)                         :     784 cycles ,     6534 ns :
-        events.wait_all.blocking.k_to_k          - Wait for all events (w/ ctx switch)                :     589 cycles ,     4916 ns :
-        events.post.wake+ctx.k_to_k              - Post events (w/ ctx switch)                        :     795 cycles ,     6626 ns :
-        semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :     125 cycles ,     1041 ns :
+        events.set.immediate.kernel              - Set events (nothing wakes)                         :     230 cycles ,     1923 ns :
+        events.wait.immediate.kernel             - Wait for any events (no ctx switch)                :     120 cycles ,     1000 ns :
+        events.wait_all.immediate.kernel         - Wait for all events (no ctx switch)                :     110 cycles ,      917 ns :
+        events.wait.blocking.k_to_k              - Wait for any events (w/ ctx switch)                :     514 cycles ,     4291 ns :
+        events.set.wake+ctx.k_to_k               - Set events (w/ ctx switch)                         :     754 cycles ,     6291 ns :
+        events.wait_all.blocking.k_to_k          - Wait for all events (w/ ctx switch)                :     528 cycles ,     4400 ns :
+        events.post.wake+ctx.k_to_k              - Post events (w/ ctx switch)                        :     765 cycles ,     6375 ns :
+        semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :      59 cycles ,      492 ns :
        semaphore.take.immediate.kernel          - Take a semaphore (no blocking)                     :      69 cycles ,      575 ns :
-        semaphore.take.blocking.k_to_k           - Take a semaphore (context switch)                  :     494 cycles ,     4116 ns :
-        semaphore.give.wake+ctx.k_to_k           - Give a semaphore (context switch)                  :     599 cycles ,     4992 ns :
-        condvar.wait.blocking.k_to_k             - Wait for a condvar (context switch)                :     692 cycles ,     5767 ns :
-        condvar.signal.wake+ctx.k_to_k           - Signal a condvar (context switch)                  :     715 cycles ,     5958 ns :
-        stack.push.immediate.kernel              - Add data to k_stack (no ctx switch)                :     166 cycles ,     1391 ns :
-        stack.pop.immediate.kernel               - Get data from k_stack (no ctx switch)              :      82 cycles ,      691 ns :
-        stack.pop.blocking.k_to_k                - Get data from k_stack (w/ ctx switch)              :     499 cycles ,     4166 ns :
-        stack.push.wake+ctx.k_to_k               - Add data to k_stack (w/ ctx switch)                :     645 cycles ,     5375 ns :
-        mutex.lock.immediate.recursive.kernel    - Lock a mutex                                       :     100 cycles ,      833 ns :
-        mutex.unlock.immediate.recursive.kernel  - Unlock a mutex                                     :      40 cycles ,      333 ns :
-        heap.malloc.immediate                    - Average time for heap malloc                       :     627 cycles ,     5225 ns :
-        heap.free.immediate                      - Average time for heap free                         :     432 cycles ,     3600 ns :
+        semaphore.take.blocking.k_to_k           - Take a semaphore (context switch)                  :     450 cycles ,     3756 ns :
+        semaphore.give.wake+ctx.k_to_k           - Give a semaphore (context switch)                  :     509 cycles ,     4249 ns :
+        condvar.wait.blocking.k_to_k             - Wait for a condvar (context switch)                :     578 cycles ,     4817 ns :
+        condvar.signal.wake+ctx.k_to_k           - Signal a condvar (context switch)                  :     630 cycles ,     5250 ns :
+        stack.push.immediate.kernel              - Add data to k_stack (no ctx switch)                :     107 cycles ,      899 ns :
+        stack.pop.immediate.kernel               - Get data from k_stack (no ctx switch)              :      80 cycles ,      674 ns :
+        stack.pop.blocking.k_to_k                - Get data from k_stack (w/ ctx switch)              :     467 cycles ,     3899 ns :
+        stack.push.wake+ctx.k_to_k               - Add data to k_stack (w/ ctx switch)                :     550 cycles ,     4583 ns :
+        mutex.lock.immediate.recursive.kernel    - Lock a mutex                                       :      83 cycles ,      692 ns :
+        mutex.unlock.immediate.recursive.kernel  - Unlock a mutex                                     :      44 cycles ,      367 ns :
+        heap.malloc.immediate                    - Average time for heap malloc                       :     610 cycles ,     5083 ns :
+        heap.free.immediate                      - Average time for heap free                         :     425 cycles ,     3541 ns :
        ===================================================================
        PROJECT EXECUTION SUCCESSFUL


-Sample output of the benchmark (with userspace enabled)::
+Sample output of the benchmark with stack canaries enabled::

-        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     970 cycles ,     8083 ns :
-        thread.yield.preemptive.ctx.u_to_u       - Context switch via k_yield                         :    1260 cycles ,    10506 ns :
-        thread.yield.preemptive.ctx.k_to_u       - Context switch via k_yield                         :    1155 cycles ,     9632 ns :
-        thread.yield.preemptive.ctx.u_to_k       - Context switch via k_yield                         :    1075 cycles ,     8959 ns :
-        thread.yield.cooperative.ctx.k_to_k      - Context switch via k_yield                         :     970 cycles ,     8083 ns :
-        thread.yield.cooperative.ctx.u_to_u      - Context switch via k_yield                         :    1260 cycles ,    10506 ns :
-        thread.yield.cooperative.ctx.k_to_u      - Context switch via k_yield                         :    1155 cycles ,     9631 ns :
-        thread.yield.cooperative.ctx.u_to_k      - Context switch via k_yield                         :    1075 cycles ,     8959 ns :
-        isr.resume.interrupted.thread.kernel     - Return from ISR to interrupted thread              :     415 cycles ,     3458 ns :
-        isr.resume.different.thread.kernel       - Return from ISR to another thread                  :     985 cycles ,     8208 ns :
-        isr.resume.different.thread.user         - Return from ISR to another thread                  :    1180 cycles ,     9833 ns :
-        thread.create.kernel.from.kernel         - Create thread                                      :     989 cycles ,     8249 ns :
-        thread.start.kernel.from.kernel          - Start thread                                       :    1059 cycles ,     8833 ns :
-        thread.suspend.kernel.from.kernel        - Suspend thread                                     :    1030 cycles ,     8583 ns :
-        thread.resume.kernel.from.kernel         - Resume thread                                      :     994 cycles ,     8291 ns :
-        thread.abort.kernel.from.kernel          - Abort thread                                       :    2370 cycles ,    19751 ns :
-        thread.create.user.from.kernel           - Create thread                                      :     860 cycles ,     7167 ns :
-        thread.start.user.from.kernel            - Start thread                                       :    8965 cycles ,    74713 ns :
-        thread.suspend.user.from.kernel          - Suspend thread                                     :    1400 cycles ,    11666 ns :
-        thread.resume.user.from.kernel           - Resume thread                                      :    1174 cycles ,     9791 ns :
-        thread.abort.user.from.kernel            - Abort thread                                       :    2240 cycles ,    18666 ns :
-        thread.create.user.from.user             - Create thread                                      :    2105 cycles ,    17542 ns :
-        thread.start.user.from.user              - Start thread                                       :    9345 cycles ,    77878 ns :
-        thread.suspend.user.from.user            - Suspend thread                                     :    1590 cycles ,    13250 ns :
-        thread.resume.user.from.user             - Resume thread                                      :    1534 cycles ,    12791 ns :
-        thread.abort.user.from.user              - Abort thread                                       :    2850 cycles ,    23750 ns :
-        thread.start.kernel.from.user            - Start thread                                       :    1440 cycles ,    12000 ns :
-        thread.suspend.kernel.from.user          - Suspend thread                                     :    1219 cycles ,    10166 ns :
-        thread.resume.kernel.from.user           - Resume thread                                      :    1355 cycles ,    11292 ns :
-        thread.abort.kernel.from.user            - Abort thread                                       :    2980 cycles ,    24834 ns :
-        fifo.put.immediate.kernel                - Add data to FIFO (no ctx switch)                   :     315 cycles ,     2625 ns :
-        fifo.get.immediate.kernel                - Get data from FIFO (no ctx switch)                 :     209 cycles ,     1749 ns :
-        fifo.put.alloc.immediate.kernel          - Allocate to add data to FIFO (no ctx switch)       :    1040 cycles ,     8667 ns :
-        fifo.get.free.immediate.kernel           - Free when getting data from FIFO (no ctx switch)   :     670 cycles ,     5583 ns :
-        fifo.put.alloc.immediate.user            - Allocate to add data to FIFO (no ctx switch)       :    1765 cycles ,    14709 ns :
-        fifo.get.free.immediate.user             - Free when getting data from FIFO (no ctx switch)   :    1410 cycles ,    11750 ns :
-        fifo.get.blocking.k_to_k                 - Get data from FIFO (w/ ctx switch)                 :    1220 cycles ,    10168 ns :
-        fifo.put.wake+ctx.k_to_k                 - Add data to FIFO (w/ ctx switch)                   :    1285 cycles ,    10708 ns :
-        fifo.get.free.blocking.k_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :    1235 cycles ,    10291 ns :
-        fifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :    1340 cycles ,    11167 ns :
-        fifo.get.free.blocking.u_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :    1715 cycles ,    14292 ns :
-        fifo.put.alloc.wake+ctx.k_to_u           - Allocate to add data to FIFO (w/ ctx switch)       :    1665 cycles ,    13876 ns :
-        fifo.get.free.blocking.k_to_u            - Free when getting data from FIFO (w/ ctx siwtch)   :    1565 cycles ,    13042 ns :
-        fifo.put.alloc.wake+ctx.u_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :    1815 cycles ,    15126 ns :
-        fifo.get.free.blocking.u_to_u            - Free when getting data from FIFO (w/ ctx siwtch)   :    2045 cycles ,    17042 ns :
-        fifo.put.alloc.wake+ctx.u_to_u           - Allocate to add data to FIFO (w/ ctx switch)       :    2140 cycles ,    17834 ns :
-        lifo.put.immediate.kernel                - Add data to LIFO (no ctx switch)                   :     309 cycles ,     2583 ns :
-        lifo.get.immediate.kernel                - Get data from LIFO (no ctx switch)                 :     219 cycles ,     1833 ns :
-        lifo.put.alloc.immediate.kernel          - Allocate to add data to LIFO (no ctx switch)       :    1030 cycles ,     8583 ns :
-        lifo.get.free.immediate.kernel           - Free when getting data from LIFO (no ctx switch)   :     685 cycles ,     5708 ns :
-        lifo.put.alloc.immediate.user            - Allocate to add data to LIFO (no ctx switch)       :    1755 cycles ,    14625 ns :
-        lifo.get.free.immediate.user             - Free when getting data from LIFO (no ctx switch)   :    1405 cycles ,    11709 ns :
-        lifo.get.blocking.k_to_k                 - Get data from LIFO (w/ ctx switch)                 :    1229 cycles ,    10249 ns :
-        lifo.put.wake+ctx.k_to_k                 - Add data to LIFO (w/ ctx switch)                   :    1290 cycles ,    10751 ns :
-        lifo.get.free.blocking.k_to_k            - Free when getting data from LIFO (w/ ctx switch)   :    1235 cycles ,    10292 ns :
-        lifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :    1310 cycles ,    10917 ns :
-        lifo.get.free.blocking.u_to_k            - Free when getting data from LIFO (w/ ctx switch)   :    1715 cycles ,    14293 ns :
-        lifo.put.alloc.wake+ctx.k_to_u           - Allocate to add data to LIFO (w/ ctx siwtch)       :    1630 cycles ,    13583 ns :
-        lifo.get.free.blocking.k_to_u            - Free when getting data from LIFO (w/ ctx switch)   :    1554 cycles ,    12958 ns :
-        lifo.put.alloc.wake+ctx.u_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :    1805 cycles ,    15043 ns :
-        lifo.get.free.blocking.u_to_u            - Free when getting data from LIFO (w/ ctx switch)   :    2035 cycles ,    16959 ns :
-        lifo.put.alloc.wake+ctx.u_to_u           - Allocate to add data to LIFO (w/ ctx siwtch)       :    2125 cycles ,    17709 ns :
-        events.post.immediate.kernel             - Post events (nothing wakes)                        :     295 cycles ,     2458 ns :
-        events.set.immediate.kernel              - Set events (nothing wakes)                         :     300 cycles ,     2500 ns :
-        events.wait.immediate.kernel             - Wait for any events (no ctx switch)                :     220 cycles ,     1833 ns :
-        events.wait_all.immediate.kernel         - Wait for all events (no ctx switch)                :     215 cycles ,     1791 ns :
-        events.post.immediate.user               - Post events (nothing wakes)                        :     795 cycles ,     6625 ns :
-        events.set.immediate.user                - Set events (nothing wakes)                         :     790 cycles ,     6584 ns :
-        events.wait.immediate.user               - Wait for any events (no ctx switch)                :     740 cycles ,     6167 ns :
-        events.wait_all.immediate.user           - Wait for all events (no ctx switch)                :     740 cycles ,     6166 ns :
-        events.wait.blocking.k_to_k              - Wait for any events (w/ ctx switch)                :    1190 cycles ,     9918 ns :
-        events.set.wake+ctx.k_to_k               - Set events (w/ ctx switch)                         :    1464 cycles ,    12208 ns :
-        events.wait_all.blocking.k_to_k          - Wait for all events (w/ ctx switch)                :    1235 cycles ,    10292 ns :
-        events.post.wake+ctx.k_to_k              - Post events (w/ ctx switch)                        :    1500 cycles ,    12500 ns :
-        events.wait.blocking.u_to_k              - Wait for any events (w/ ctx switch)                :    1580 cycles ,    13167 ns :
-        events.set.wake+ctx.k_to_u               - Set events (w/ ctx switch)                         :    1630 cycles ,    13583 ns :
-        events.wait_all.blocking.u_to_k          - Wait for all events (w/ ctx switch)                :    1765 cycles ,    14708 ns :
-        events.post.wake+ctx.k_to_u              - Post events (w/ ctx switch)                        :    1795 cycles ,    14960 ns :
-        events.wait.blocking.k_to_u              - Wait for any events (w/ ctx switch)                :    1375 cycles ,    11459 ns :
-        events.set.wake+ctx.u_to_k               - Set events (w/ ctx switch)                         :    1825 cycles ,    15209 ns :
-        events.wait_all.blocking.k_to_u          - Wait for all events (w/ ctx switch)                :    1555 cycles ,    12958 ns :
-        events.post.wake+ctx.u_to_k              - Post events (w/ ctx switch)                        :    1995 cycles ,    16625 ns :
-        events.wait.blocking.u_to_u              - Wait for any events (w/ ctx switch)                :    1765 cycles ,    14708 ns :
-        events.set.wake+ctx.u_to_u               - Set events (w/ ctx switch)                         :    1989 cycles ,    16583 ns :
-        events.wait_all.blocking.u_to_u          - Wait for all events (w/ ctx switch)                :    2085 cycles ,    17376 ns :
-        events.post.wake+ctx.u_to_u              - Post events (w/ ctx switch)                        :    2290 cycles ,    19084 ns :
-        semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :     220 cycles ,     1833 ns :
-        semaphore.take.immediate.kernel          - Take a semaphore (no blocking)                     :     130 cycles ,     1083 ns :
-        semaphore.give.immediate.user            - Give a semaphore (no waiters)                      :     710 cycles ,     5917 ns :
-        semaphore.take.immediate.user            - Take a semaphore (no blocking)                     :     655 cycles ,     5458 ns :
-        semaphore.take.blocking.k_to_k           - Take a semaphore (context switch)                  :    1135 cycles ,     9458 ns :
-        semaphore.give.wake+ctx.k_to_k           - Give a semaphore (context switch)                  :    1244 cycles ,    10374 ns :
-        semaphore.take.blocking.k_to_u           - Take a semaphore (context switch)                  :    1325 cycles ,    11048 ns :
-        semaphore.give.wake+ctx.u_to_k           - Give a semaphore (context switch)                  :    1610 cycles ,    13416 ns :
-        semaphore.take.blocking.u_to_k           - Take a semaphore (context switch)                  :    1499 cycles ,    12499 ns :
-        semaphore.give.wake+ctx.k_to_u           - Give a semaphore (context switch)                  :    1434 cycles ,    11957 ns :
-        semaphore.take.blocking.u_to_u           - Take a semaphore (context switch)                  :    1690 cycles ,    14090 ns :
-        semaphore.give.wake+ctx.u_to_u           - Give a semaphore (context switch)                  :    1800 cycles ,    15000 ns :
-        condvar.wait.blocking.k_to_k             - Wait for a condvar (context switch)                :    1385 cycles ,    11542 ns :
-        condvar.signal.wake+ctx.k_to_k           - Signal a condvar (context switch)                  :    1420 cycles ,    11833 ns :
-        condvar.wait.blocking.k_to_u             - Wait for a condvar (context switch)                :    1537 cycles ,    12815 ns :
-        condvar.signal.wake+ctx.u_to_k           - Signal a condvar (context switch)                  :    1950 cycles ,    16250 ns :
-        condvar.wait.blocking.u_to_k             - Wait for a condvar (context switch)                :    2025 cycles ,    16875 ns :
-        condvar.signal.wake+ctx.k_to_u           - Signal a condvar (context switch)                  :    1715 cycles ,    14298 ns :
-        condvar.wait.blocking.u_to_u             - Wait for a condvar (context switch)                :    2313 cycles ,    19279 ns :
-        condvar.signal.wake+ctx.u_to_u           - Signal a condvar (context switch)                  :    2225 cycles ,    18541 ns :
-        stack.push.immediate.kernel              - Add data to k_stack (no ctx switch)                :     244 cycles ,     2041 ns :
-        stack.pop.immediate.kernel               - Get data from k_stack (no ctx switch)              :     195 cycles ,     1630 ns :
-        stack.push.immediate.user                - Add data to k_stack (no ctx switch)                :     714 cycles ,     5956 ns :
-        stack.pop.immediate.user                 - Get data from k_stack (no ctx switch)              :    1009 cycles ,     8414 ns :
-        stack.pop.blocking.k_to_k                - Get data from k_stack (w/ ctx switch)              :    1234 cycles ,    10291 ns :
-        stack.push.wake+ctx.k_to_k               - Add data to k_stack (w/ ctx switch)                :    1360 cycles ,    11333 ns :
-        stack.pop.blocking.u_to_k                - Get data from k_stack (w/ ctx switch)              :    2084 cycles ,    17374 ns :
-        stack.push.wake+ctx.k_to_u               - Add data to k_stack (w/ ctx switch)                :    1665 cycles ,    13875 ns :
-        stack.pop.blocking.k_to_u                - Get data from k_stack (w/ ctx switch)              :    1544 cycles ,    12874 ns :
-        stack.push.wake+ctx.u_to_k               - Add data to k_stack (w/ ctx switch)                :    1850 cycles ,    15422 ns :
-        stack.pop.blocking.u_to_u                - Get data from k_stack (w/ ctx switch)              :    2394 cycles ,    19958 ns :
-        stack.push.wake+ctx.u_to_u               - Add data to k_stack (w/ ctx switch)                :    2155 cycles ,    17958 ns :
-        mutex.lock.immediate.recursive.kernel    - Lock a mutex                                       :     155 cycles ,     1291 ns :
-        mutex.unlock.immediate.recursive.kernel  - Unlock a mutex                                     :      57 cycles ,      475 ns :
-        mutex.lock.immediate.recursive.user      - Lock a mutex                                       :     665 cycles ,     5541 ns :
-        mutex.unlock.immediate.recursive.user    - Unlock a mutex                                     :     585 cycles ,     4875 ns :
-        heap.malloc.immediate                    - Average time for heap malloc                       :     640 cycles ,     5341 ns :
-        heap.free.immediate                      - Average time for heap free                         :     436 cycles ,     3633 ns :
+        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     485 cycles ,     4042 ns :
+        thread.yield.cooperative.ctx.k_to_k      - Context switch via k_yield                         :     485 cycles ,     4042 ns :
+        isr.resume.interrupted.thread.kernel     - Return from ISR to interrupted thread              :     545 cycles ,     4549 ns :
+        isr.resume.different.thread.kernel       - Return from ISR to another thread                  :     590 cycles ,     4924 ns :
+        thread.create.kernel.from.kernel         - Create thread                                      :     585 cycles ,     4883 ns :
+        thread.start.kernel.from.kernel          - Start thread                                       :     685 cycles ,     5716 ns :
+        thread.suspend.kernel.from.kernel        - Suspend thread                                     :     490 cycles ,     4091 ns :
+        thread.resume.kernel.from.kernel         - Resume thread                                      :     569 cycles ,     4749 ns :
+        thread.abort.kernel.from.kernel          - Abort thread                                       :     629 cycles ,     5249 ns :
+        fifo.put.immediate.kernel                - Add data to FIFO (no ctx switch)                   :     439 cycles ,     3666 ns :
+        fifo.get.immediate.kernel                - Get data from FIFO (no ctx switch)                 :     320 cycles ,     2674 ns :
+        fifo.put.alloc.immediate.kernel          - Allocate to add data to FIFO (no ctx switch)       :    1499 cycles ,    12491 ns :
+        fifo.get.free.immediate.kernel           - Free when getting data from FIFO (no ctx switch)   :    1230 cycles ,    10250 ns :
+        fifo.get.blocking.k_to_k                 - Get data from FIFO (w/ ctx switch)                 :     868 cycles ,     7241 ns :
+        fifo.put.wake+ctx.k_to_k                 - Add data to FIFO (w/ ctx switch)                   :     991 cycles ,     8259 ns :
+        fifo.get.free.blocking.k_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :     879 cycles ,     7325 ns :
+        fifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :     990 cycles ,     8250 ns :
+        lifo.put.immediate.kernel                - Add data to LIFO (no ctx switch)                   :     429 cycles ,     3582 ns :
+        lifo.get.immediate.kernel                - Get data from LIFO (no ctx switch)                 :     320 cycles ,     2674 ns :
+        lifo.put.alloc.immediate.kernel          - Allocate to add data to LIFO (no ctx switch)       :    1499 cycles ,    12491 ns :
+        lifo.get.free.immediate.kernel           - Free when getting data from LIFO (no ctx switch)   :    1220 cycles ,    10166 ns :
+        lifo.get.blocking.k_to_k                 - Get data from LIFO (w/ ctx switch)                 :     863 cycles ,     7199 ns :
+        lifo.put.wake+ctx.k_to_k                 - Add data to LIFO (w/ ctx switch)                   :     985 cycles ,     8208 ns :
+        lifo.get.free.blocking.k_to_k            - Free when getting data from LIFO (w/ ctx switch)   :     879 cycles ,     7325 ns :
+        lifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :     985 cycles ,     8208 ns :
+        events.post.immediate.kernel             - Post events (nothing wakes)                        :     420 cycles ,     3501 ns :
+        events.set.immediate.kernel              - Set events (nothing wakes)                         :     420 cycles ,     3501 ns :
+        events.wait.immediate.kernel             - Wait for any events (no ctx switch)                :     280 cycles ,     2334 ns :
+        events.wait_all.immediate.kernel         - Wait for all events (no ctx switch)                :     270 cycles ,     2251 ns :
+        events.wait.blocking.k_to_k              - Wait for any events (w/ ctx switch)                :     919 cycles ,     7665 ns :
+        events.set.wake+ctx.k_to_k               - Set events (w/ ctx switch)                         :    1310 cycles ,    10924 ns :
+        events.wait_all.blocking.k_to_k          - Wait for all events (w/ ctx switch)                :     954 cycles ,     7950 ns :
+        events.post.wake+ctx.k_to_k              - Post events (w/ ctx switch)                        :    1340 cycles ,    11166 ns :
+        semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :     110 cycles ,      917 ns :
+        semaphore.take.immediate.kernel          - Take a semaphore (no blocking)                     :     180 cycles ,     1500 ns :
+        semaphore.take.blocking.k_to_k           - Take a semaphore (context switch)                  :     755 cycles ,     6292 ns :
+        semaphore.give.wake+ctx.k_to_k           - Give a semaphore (context switch)                  :     812 cycles ,     6767 ns :
+        condvar.wait.blocking.k_to_k             - Wait for a condvar (context switch)                :    1027 cycles ,     8558 ns :
+        condvar.signal.wake+ctx.k_to_k           - Signal a condvar (context switch)                  :    1040 cycles ,     8666 ns :
+        stack.push.immediate.kernel              - Add data to k_stack (no ctx switch)                :     220 cycles ,     1841 ns :
+        stack.pop.immediate.kernel               - Get data from k_stack (no ctx switch)              :     205 cycles ,     1716 ns :
+        stack.pop.blocking.k_to_k                - Get data from k_stack (w/ ctx switch)              :     791 cycles ,     6599 ns :
+        stack.push.wake+ctx.k_to_k               - Add data to k_stack (w/ ctx switch)                :     870 cycles ,     7250 ns :
+        mutex.lock.immediate.recursive.kernel    - Lock a mutex                                       :     175 cycles ,     1459 ns :
+        mutex.unlock.immediate.recursive.kernel  - Unlock a mutex                                     :      61 cycles ,      510 ns :
+        heap.malloc.immediate                    - Average time for heap malloc                       :    1060 cycles ,     8833 ns :
+        heap.free.immediate                      - Average time for heap free                         :     899 cycles ,     7491 ns :
        ===================================================================
        PROJECT EXECUTION SUCCESSFUL
+
+The sample output above (stack canaries enabled) shows longer times than
+the default build. Not only does each stack frame in the call tree have
+its own stack canary check, but enabling this feature also affects which
+routines the compiler chooses to inline.
+
+Sample output of the benchmark with object core enabled::
+
+        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     740 cycles ,     6167 ns :
+        thread.yield.cooperative.ctx.k_to_k      - Context switch via k_yield                         :     740 cycles ,     6167 ns :
+        isr.resume.interrupted.thread.kernel     - Return from ISR to interrupted thread              :     284 cycles ,     2374 ns :
+        isr.resume.different.thread.kernel       - Return from ISR to another thread                  :     784 cycles ,     6541 ns :
+        thread.create.kernel.from.kernel         - Create thread                                      :     714 cycles ,     5958 ns :
+        thread.start.kernel.from.kernel          - Start thread                                       :     819 cycles ,     6833 ns :
+        thread.suspend.kernel.from.kernel        - Suspend thread                                     :     704 cycles ,     5874 ns :
+        thread.resume.kernel.from.kernel         - Resume thread                                      :     761 cycles ,     6349 ns :
+        thread.abort.kernel.from.kernel          - Abort thread                                       :     544 cycles ,     4541 ns :
+        fifo.put.immediate.kernel                - Add data to FIFO (no ctx switch)                   :     211 cycles ,     1766 ns :
+        fifo.get.immediate.kernel                - Get data from FIFO (no ctx switch)                 :     132 cycles ,     1108 ns :
+        fifo.put.alloc.immediate.kernel          - Allocate to add data to FIFO (no ctx switch)       :     850 cycles ,     7091 ns :
+        fifo.get.free.immediate.kernel           - Free when getting data from FIFO (no ctx switch)   :     565 cycles ,     4708 ns :
+        fifo.get.blocking.k_to_k                 - Get data from FIFO (w/ ctx switch)                 :     947 cycles ,     7899 ns :
+        fifo.put.wake+ctx.k_to_k                 - Add data to FIFO (w/ ctx switch)                   :    1015 cycles ,     8458 ns :
+        fifo.get.free.blocking.k_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :     950 cycles ,     7923 ns :
+        fifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :    1010 cycles ,     8416 ns :
+        lifo.put.immediate.kernel                - Add data to LIFO (no ctx switch)                   :     226 cycles ,     1891 ns :
+        lifo.get.immediate.kernel                - Get data from LIFO (no ctx switch)                 :     123 cycles ,     1033 ns :
+        lifo.put.alloc.immediate.kernel          - Allocate to add data to LIFO (no ctx switch)       :     848 cycles ,     7066 ns :
+        lifo.get.free.immediate.kernel           - Free when getting data from LIFO (no ctx switch)   :     565 cycles ,     4708 ns :
+        lifo.get.blocking.k_to_k                 - Get data from LIFO (w/ ctx switch)                 :     951 cycles ,     7932 ns :
+        lifo.put.wake+ctx.k_to_k                 - Add data to LIFO (w/ ctx switch)                   :    1010 cycles ,     8416 ns :
+        lifo.get.free.blocking.k_to_k            - Free when getting data from LIFO (w/ ctx switch)   :     959 cycles ,     7991 ns :
+        lifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :    1010 cycles ,     8422 ns :
+        events.post.immediate.kernel             - Post events (nothing wakes)                        :     210 cycles ,     1750 ns :
+        events.set.immediate.kernel              - Set events (nothing wakes)                         :     230 cycles ,     1917 ns :
+        events.wait.immediate.kernel             - Wait for any events (no ctx switch)                :     120 cycles ,     1000 ns :
+        events.wait_all.immediate.kernel         - Wait for all events (no ctx switch)                :     150 cycles ,     1250 ns :
+        events.wait.blocking.k_to_k              - Wait for any events (w/ ctx switch)                :     951 cycles ,     7932 ns :
+        events.set.wake+ctx.k_to_k               - Set events (w/ ctx switch)                         :    1179 cycles ,     9833 ns :
+        events.wait_all.blocking.k_to_k          - Wait for all events (w/ ctx switch)                :     976 cycles ,     8133 ns :
+        events.post.wake+ctx.k_to_k              - Post events (w/ ctx switch)                        :    1190 cycles ,     9922 ns :
+        semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :      59 cycles ,      492 ns :
+        semaphore.take.immediate.kernel          - Take a semaphore (no blocking)                     :      69 cycles ,      575 ns :
+        semaphore.take.blocking.k_to_k           - Take a semaphore (context switch)                  :     870 cycles ,     7250 ns :
+        semaphore.give.wake+ctx.k_to_k           - Give a semaphore (context switch)                  :     929 cycles ,     7749 ns :
+        condvar.wait.blocking.k_to_k             - Wait for a condvar (context switch)                :    1010 cycles ,     8417 ns :
+        condvar.signal.wake+ctx.k_to_k           - Signal a condvar (context switch)                  :    1060 cycles ,     8833 ns :
+        stack.push.immediate.kernel              - Add data to k_stack (no ctx switch)                :      90 cycles ,      758 ns :
+        stack.pop.immediate.kernel               - Get data from k_stack (no ctx switch)              :      86 cycles ,      724 ns :
+        stack.pop.blocking.k_to_k                - Get data from k_stack (w/ ctx switch)              :     910 cycles ,     7589 ns :
+        stack.push.wake+ctx.k_to_k               - Add data to k_stack (w/ ctx switch)                :     975 cycles ,     8125 ns :
+        mutex.lock.immediate.recursive.kernel    - Lock a mutex                                       :     105 cycles ,      875 ns :
+        mutex.unlock.immediate.recursive.kernel  - Unlock a mutex                                     :      44 cycles ,      367 ns :
+        heap.malloc.immediate                    - Average time for heap malloc                       :     621 cycles ,     5183 ns :
+        heap.free.immediate                      - Average time for heap free                         :     422 cycles ,     3516 ns :
+        ===================================================================
+        PROJECT EXECUTION SUCCESSFUL
+
+The sample output above (object core and statistics enabled) shows longer
+times than the default build when context switching is involved. Blanket
+enabling of the object cores, as was done here, results in the additional
+gathering of thread statistics each time a thread is switched in or out. The
+gathering of these statistics can be controlled both at project
+configuration time and at runtime.
+
+Sample output of the benchmark with userspace enabled::
+
+        thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     975 cycles ,     8125 ns :
+        thread.yield.preemptive.ctx.u_to_u       - Context switch via k_yield                         :    1303 cycles ,    10860 ns :
+        thread.yield.preemptive.ctx.k_to_u       - Context switch via k_yield                         :    1180 cycles ,     9834 ns :
+        thread.yield.preemptive.ctx.u_to_k       - Context switch via k_yield                         :    1097 cycles ,     9144 ns :
+        thread.yield.cooperative.ctx.k_to_k      - Context switch via k_yield                         :     975 cycles ,     8125 ns :
+        thread.yield.cooperative.ctx.u_to_u      - Context switch via k_yield                         :    1302 cycles ,    10854 ns :
+        thread.yield.cooperative.ctx.k_to_u      - Context switch via k_yield                         :    1180 cycles ,     9834 ns :
+        thread.yield.cooperative.ctx.u_to_k      - Context switch via k_yield                         :    1097 cycles ,     9144 ns :
+        isr.resume.interrupted.thread.kernel     - Return from ISR to interrupted thread              :     329 cycles ,     2749 ns :
+        isr.resume.different.thread.kernel       - Return from ISR to another thread                  :    1014 cycles ,     8457 ns :
+        isr.resume.different.thread.user         - Return from ISR to another thread                  :    1223 cycles ,    10192 ns :
+        thread.create.kernel.from.kernel         - Create thread                                      :     970 cycles ,     8089 ns :
+        thread.start.kernel.from.kernel          - Start thread                                       :    1074 cycles ,     8957 ns :
+        thread.suspend.kernel.from.kernel        - Suspend thread                                     :     949 cycles ,     7916 ns :
+        thread.resume.kernel.from.kernel         - Resume thread                                      :    1004 cycles ,     8374 ns :
+        thread.abort.kernel.from.kernel          - Abort thread                                       :    2734 cycles ,    22791 ns :
+        thread.create.user.from.kernel           - Create thread                                      :     832 cycles ,     6935 ns :
+        thread.start.user.from.kernel            - Start thread                                       :    9023 cycles ,    75192 ns :
+        thread.suspend.user.from.kernel          - Suspend thread                                     :    1312 cycles ,    10935 ns :
+        thread.resume.user.from.kernel           - Resume thread                                      :    1187 cycles ,     9894 ns :
+        thread.abort.user.from.kernel            - Abort thread                                       :    2597 cycles ,    21644 ns :
+        thread.create.user.from.user             - Create thread                                      :    2144 cycles ,    17872 ns :
+        thread.start.user.from.user              - Start thread                                       :    9399 cycles ,    78330 ns :
+        thread.suspend.user.from.user            - Suspend thread                                     :    1504 cycles ,    12539 ns :
+        thread.resume.user.from.user             - Resume thread                                      :    1574 cycles ,    13122 ns :
+        thread.abort.user.from.user              - Abort thread                                       :    3237 cycles ,    26981 ns :
+        thread.start.kernel.from.user            - Start thread                                       :    1452 cycles ,    12102 ns :
+        thread.suspend.kernel.from.user          - Suspend thread                                     :    1143 cycles ,     9525 ns :
+        thread.resume.kernel.from.user           - Resume thread                                      :    1392 cycles ,    11602 ns :
+        thread.abort.kernel.from.user            - Abort thread                                       :    3372 cycles ,    28102 ns :
+        fifo.put.immediate.kernel                - Add data to FIFO (no ctx switch)                   :     239 cycles ,     1999 ns :
+        fifo.get.immediate.kernel                - Get data from FIFO (no ctx switch)                 :     184 cycles ,     1541 ns :
+        fifo.put.alloc.immediate.kernel          - Allocate to add data to FIFO (no ctx switch)       :     920 cycles ,     7666 ns :
+        fifo.get.free.immediate.kernel           - Free when getting data from FIFO (no ctx switch)   :     650 cycles ,     5416 ns :
+        fifo.put.alloc.immediate.user            - Allocate to add data to FIFO (no ctx switch)       :    1710 cycles ,    14256 ns :
+        fifo.get.free.immediate.user             - Free when getting data from FIFO (no ctx switch)   :    1440 cycles ,    12000 ns :
+        fifo.get.blocking.k_to_k                 - Get data from FIFO (w/ ctx switch)                 :    1209 cycles ,    10082 ns :
+        fifo.put.wake+ctx.k_to_k                 - Add data to FIFO (w/ ctx switch)                   :    1230 cycles ,    10250 ns :
+        fifo.get.free.blocking.k_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :    1210 cycles ,    10083 ns :
+        fifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :    1260 cycles ,    10500 ns :
+        fifo.get.free.blocking.u_to_k            - Free when getting data from FIFO (w/ ctx siwtch)   :    1745 cycles ,    14547 ns :
+        fifo.put.alloc.wake+ctx.k_to_u           - Allocate to add data to FIFO (w/ ctx switch)       :    1600 cycles ,    13333 ns :
+        fifo.get.free.blocking.k_to_u            - Free when getting data from FIFO (w/ ctx siwtch)   :    1550 cycles ,    12922 ns :
+        fifo.put.alloc.wake+ctx.u_to_k           - Allocate to add data to FIFO (w/ ctx switch)       :    1795 cycles ,    14958 ns :
+        fifo.get.free.blocking.u_to_u            - Free when getting data from FIFO (w/ ctx siwtch)   :    2084 cycles ,    17374 ns :
+        fifo.put.alloc.wake+ctx.u_to_u           - Allocate to add data to FIFO (w/ ctx switch)       :    2135 cycles ,    17791 ns :
+        lifo.put.immediate.kernel                - Add data to LIFO (no ctx switch)                   :     234 cycles ,     1957 ns :
+        lifo.get.immediate.kernel                - Get data from LIFO (no ctx switch)                 :     189 cycles ,     1582 ns :
+        lifo.put.alloc.immediate.kernel          - Allocate to add data to LIFO (no ctx switch)       :     935 cycles ,     7791 ns :
+        lifo.get.free.immediate.kernel           - Free when getting data from LIFO (no ctx switch)   :     650 cycles ,     5416 ns :
+        lifo.put.alloc.immediate.user            - Allocate to add data to LIFO (no ctx switch)       :    1715 cycles ,    14291 ns :
+        lifo.get.free.immediate.user             - Free when getting data from LIFO (no ctx switch)   :    1445 cycles ,    12041 ns :
+        lifo.get.blocking.k_to_k                 - Get data from LIFO (w/ ctx switch)                 :    1219 cycles ,    10166 ns :
+        lifo.put.wake+ctx.k_to_k                 - Add data to LIFO (w/ ctx switch)                   :    1230 cycles ,    10250 ns :
+        lifo.get.free.blocking.k_to_k            - Free when getting data from LIFO (w/ ctx switch)   :    1210 cycles ,    10083 ns :
+        lifo.put.alloc.wake+ctx.k_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :    1260 cycles ,    10500 ns :
+        lifo.get.free.blocking.u_to_k            - Free when getting data from LIFO (w/ ctx switch)   :    1744 cycles ,    14541 ns :
+        lifo.put.alloc.wake+ctx.k_to_u           - Allocate to add data to LIFO (w/ ctx siwtch)       :    1595 cycles ,    13291 ns :
+        lifo.get.free.blocking.k_to_u            - Free when getting data from LIFO (w/ ctx switch)   :    1544 cycles ,    12874 ns :
+        lifo.put.alloc.wake+ctx.u_to_k           - Allocate to add data to LIFO (w/ ctx siwtch)       :    1795 cycles ,    14958 ns :
+        lifo.get.free.blocking.u_to_u            - Free when getting data from LIFO (w/ ctx switch)   :    2080 cycles ,    17339 ns :
+        lifo.put.alloc.wake+ctx.u_to_u           - Allocate to add data to LIFO (w/ ctx siwtch)       :    2130 cycles ,    17750 ns :
+        events.post.immediate.kernel             - Post events (nothing wakes)                        :     285 cycles ,     2375 ns :
+        events.set.immediate.kernel              - Set events (nothing wakes)                         :     290 cycles ,     2417 ns :
+        events.wait.immediate.kernel             - Wait for any events (no ctx switch)                :     235 cycles ,     1958 ns :
+        events.wait_all.immediate.kernel         - Wait for all events (no ctx switch)                :     245 cycles ,     2042 ns :
+        events.post.immediate.user               - Post events (nothing wakes)                        :     800 cycles ,     6669 ns :
+        events.set.immediate.user                - Set events (nothing wakes)                         :     811 cycles ,     6759 ns :
+        events.wait.immediate.user               - Wait for any events (no ctx switch)                :     780 cycles ,     6502 ns :
+        events.wait_all.immediate.user           - Wait for all events (no ctx switch)                :     770 cycles ,     6419 ns :
+        events.wait.blocking.k_to_k              - Wait for any events (w/ ctx switch)                :    1210 cycles ,    10089 ns :
+        events.set.wake+ctx.k_to_k               - Set events (w/ ctx switch)                         :    1449 cycles ,    12082 ns :
+        events.wait_all.blocking.k_to_k          - Wait for all events (w/ ctx switch)                :    1250 cycles ,    10416 ns :
+        events.post.wake+ctx.k_to_k              - Post events (w/ ctx switch)                        :    1475 cycles ,    12291 ns :
+        events.wait.blocking.u_to_k              - Wait for any events (w/ ctx switch)                :    1612 cycles ,    13435 ns :
+        events.set.wake+ctx.k_to_u               - Set events (w/ ctx switch)                         :    1627 cycles ,    13560 ns :
+        events.wait_all.blocking.u_to_k          - Wait for all events (w/ ctx switch)                :    1785 cycles ,    14875 ns :
+        events.post.wake+ctx.k_to_u              - Post events (w/ ctx switch)                        :    1790 cycles ,    14923 ns :
+        events.wait.blocking.k_to_u              - Wait for any events (w/ ctx switch)                :    1407 cycles ,    11727 ns :
+        events.set.wake+ctx.u_to_k               - Set events (w/ ctx switch)                         :    1828 cycles ,    15234 ns :
+        events.wait_all.blocking.k_to_u          - Wait for all events (w/ ctx switch)                :    1585 cycles ,    13208 ns :
+        events.post.wake+ctx.u_to_k              - Post events (w/ ctx switch)                        :    2000 cycles ,    16666 ns :
+        events.wait.blocking.u_to_u              - Wait for any events (w/ ctx switch)                :    1810 cycles ,    15087 ns :
+        events.set.wake+ctx.u_to_u               - Set events (w/ ctx switch)                         :    2004 cycles ,    16705 ns :
+        events.wait_all.blocking.u_to_u          - Wait for all events (w/ ctx switch)                :    2120 cycles ,    17666 ns :
+        events.post.wake+ctx.u_to_u              - Post events (w/ ctx switch)                        :    2315 cycles ,    19291 ns :
+        semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :     125 cycles ,     1042 ns :
+        semaphore.take.immediate.kernel          - Take a semaphore (no blocking)                     :     125 cycles ,     1042 ns :
+        semaphore.give.immediate.user            - Give a semaphore (no waiters)                      :     645 cycles ,     5377 ns :
+        semaphore.take.immediate.user            - Take a semaphore (no blocking)                     :     680 cycles ,     5669 ns :
+        semaphore.take.blocking.k_to_k           - Take a semaphore (context switch)                  :    1140 cycles ,     9500 ns :
+        semaphore.give.wake+ctx.k_to_k           - Give a semaphore (context switch)                  :    1174 cycles ,     9791 ns :
+        semaphore.take.blocking.k_to_u           - Take a semaphore (context switch)                  :    1350 cycles ,    11251 ns :
+        semaphore.give.wake+ctx.u_to_k           - Give a semaphore (context switch)                  :    1542 cycles ,    12852 ns :
+        semaphore.take.blocking.u_to_k           - Take a semaphore (context switch)                  :    1512 cycles ,    12603 ns :
+        semaphore.give.wake+ctx.k_to_u           - Give a semaphore (context switch)                  :    1382 cycles ,    11519 ns :
+        semaphore.take.blocking.u_to_u           - Take a semaphore (context switch)                  :    1723 cycles ,    14360 ns :
+        semaphore.give.wake+ctx.u_to_u           - Give a semaphore (context switch)                  :    1749 cycles ,    14580 ns :
+        condvar.wait.blocking.k_to_k             - Wait for a condvar (context switch)                :    1285 cycles ,    10708 ns :
+        condvar.signal.wake+ctx.k_to_k           - Signal a condvar (context switch)                  :    1315 cycles ,    10964 ns :
+        condvar.wait.blocking.k_to_u             - Wait for a condvar (context switch)                :    1547 cycles ,    12898 ns :
+        condvar.signal.wake+ctx.u_to_k           - Signal a condvar (context switch)                  :    1855 cycles ,    15458 ns :
+        condvar.wait.blocking.u_to_k             - Wait for a condvar (context switch)                :    1990 cycles ,    16583 ns :
+        condvar.signal.wake+ctx.k_to_u           - Signal a condvar (context switch)                  :    1640 cycles ,    13666 ns :
+        condvar.wait.blocking.u_to_u             - Wait for a condvar (context switch)                :    2313 cycles ,    19280 ns :
+        condvar.signal.wake+ctx.u_to_u           - Signal a condvar (context switch)                  :    2170 cycles ,    18083 ns :
+        stack.push.immediate.kernel              - Add data to k_stack (no ctx switch)                :     189 cycles ,     1582 ns :
+        stack.pop.immediate.kernel               - Get data from k_stack (no ctx switch)              :     194 cycles ,     1624 ns :
+        stack.push.immediate.user                - Add data to k_stack (no ctx switch)                :     679 cycles ,     5664 ns :
+        stack.pop.immediate.user                 - Get data from k_stack (no ctx switch)              :    1014 cycles ,     8455 ns :
+        stack.pop.blocking.k_to_k                - Get data from k_stack (w/ ctx switch)              :    1209 cycles ,    10083 ns :
+        stack.push.wake+ctx.k_to_k               - Add data to k_stack (w/ ctx switch)                :    1235 cycles ,    10291 ns :
+        stack.pop.blocking.u_to_k                - Get data from k_stack (w/ ctx switch)              :    2050 cycles ,    17089 ns :
+        stack.push.wake+ctx.k_to_u               - Add data to k_stack (w/ ctx switch)                :    1575 cycles ,    13125 ns :
+        stack.pop.blocking.k_to_u                - Get data from k_stack (w/ ctx switch)              :    1549 cycles ,    12916 ns :
+        stack.push.wake+ctx.u_to_k               - Add data to k_stack (w/ ctx switch)                :    1755 cycles ,    14625 ns :
+        stack.pop.blocking.u_to_u                - Get data from k_stack (w/ ctx switch)              :    2389 cycles ,    19916 ns :
+        stack.push.wake+ctx.u_to_u               - Add data to k_stack (w/ ctx switch)                :    2095 cycles ,    17458 ns :
+        mutex.lock.immediate.recursive.kernel    - Lock a mutex                                       :     165 cycles ,     1375 ns :
+        mutex.unlock.immediate.recursive.kernel  - Unlock a mutex                                     :      80 cycles ,      668 ns :
+        mutex.lock.immediate.recursive.user      - Lock a mutex                                       :     685 cycles ,     5711 ns :
+        mutex.unlock.immediate.recursive.user    - Unlock a mutex                                     :     615 cycles ,     5128 ns :
+        heap.malloc.immediate                    - Average time for heap malloc                       :     626 cycles ,     5224 ns :
+        heap.free.immediate                      - Average time for heap free                         :     427 cycles ,     3558 ns :
+        ===================================================================
+        PROJECT EXECUTION SUCCESSFUL
+
+The sample output above (userspace enabled) shows longer times than
+the default build. Enabling userspace adds runtime overhead on each call
+that operates on a kernel object, because the call must determine whether
+the caller is in user or kernel space and consequently whether a system
+call is needed.
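
Since each result line follows the same ``name - description : N cycles , M ns :`` layout, saved output from two configurations can be compared with a short script. The following is a minimal sketch, not part of the benchmark itself: the names ``parse_results``, ``slowdown`` and ``implied_clock_mhz`` are hypothetical helpers, and the two sample strings are copied from the stack canaries and userspace listings above. It assumes the line layout stays exactly as shown.

```python
import re

# Matches one result line of the form
#   <name> - <description> : <N> cycles , <M> ns :
# A leading "+" or "-" diff marker is tolerated so that lines pasted
# straight from a patch can also be parsed.
LINE_RE = re.compile(
    r"^\s*[+-]?\s*(?P<name>\S+)\s+-\s+(?P<desc>.*?)\s*:"
    r"\s*(?P<cycles>\d+)\s+cycles\s*,\s*(?P<ns>\d+)\s+ns\s*:\s*$"
)

# Two lines each, copied from the sample listings above.
CANARIES = """
thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     485 cycles ,     4042 ns :
semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :     110 cycles ,      917 ns :
"""

USERSPACE = """
thread.yield.preemptive.ctx.k_to_k       - Context switch via k_yield                         :     975 cycles ,     8125 ns :
semaphore.give.immediate.kernel          - Give a semaphore (no waiters)                      :     125 cycles ,     1042 ns :
"""


def parse_results(text):
    """Return {test name: (cycles, ns)} for every result line in text."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # separators and status lines do not match and are skipped
            results[match["name"]] = (int(match["cycles"]), int(match["ns"]))
    return results


def slowdown(baseline, other):
    """Return the cycle-count ratio other/baseline for tests present in both."""
    return {
        name: other[name][0] / baseline[name][0]
        for name in baseline
        if name in other
    }


def implied_clock_mhz(cycles, ns):
    """Clock frequency implied by one cycles/ns pair, in MHz."""
    return cycles * 1000.0 / ns


if __name__ == "__main__":
    base = parse_results(CANARIES)
    user = parse_results(USERSPACE)
    for name, ratio in slowdown(base, user).items():
        print(f"{name}: {ratio:.2f}x")
    print(f"implied clock: {implied_clock_mhz(*user['thread.yield.preemptive.ctx.k_to_k']):.0f} MHz")
```

The dictionaries returned by ``parse_results`` can be written out as CSV or passed to a charting tool. The cycles-to-nanoseconds ratio in the samples works out to roughly 120 MHz, which is a quick sanity check that two saved runs came from targets with the same clock rate before their cycle counts are compared.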