On dual-core STM32H7, the Cortex-M4 core is supposed to wait until the
Cortex-M7 initializes the system before starting to execute. CM7 should
signal this by locking a specific HSEM, which CM4 should poll until locked.
However, the logic on the Cortex-M4 side was reading the "RLR" register of
HSEM, which *locks the semaphore on read* - in turn, this makes the CM4
start directly since it sees that the semaphore is locked (by itself).
Use proper LL API to read HSEM status - which will read the "R" register
instead - to make sure CM4 doesn't begin execution earlier than it should.
Suggested-by: hglassdyb
Signed-off-by: Mathieu Choplain <mathieu.choplain@st.com>