Move the relocation offset calculation out of assembler and into C. This
also paves the way for the upcoming init sequence simplification by adding
the board_init_f_r flash to RAM transitional function
--
Changes for v2:
- Added commit message
- Minor adjustment to new stack address comment