SiFive - August 21, 2017
All Aboard, Part 2: Relocations in ELF Toolchains
Our first stop in exploring the RISC-V toolchain is an overview of ELF relocations and how the toolchain uses them. To keep this post from getting too long, we'll leave linker relaxations and their impact on performance for a follow-up; the example has been carefully constructed to be unrelaxable so as to avoid confusion. We're also only going to discuss the relocations used by statically linked executables, skipping position-independent executables and thread-local storage. Like linker relaxation, each of those warrants a whole post of its own, and there will be a lot more to come about relocations in later blog posts.
An Example of a Relocation in a C Program
Relocations are a concept that exists due to the split between the compiler and the linker that is present in most toolchains. While the specifics of this article apply only to ELF-based RISC-V toolchains (i.e., GCC+binutils or LLVM), the general concept of relocations also exists in compilers further afield, such as HotSpot. Since relocations exist to pass information between the compiler and linker, let's first look at how a simple program is compiled. Take the following C code:
long global_symbol[2];
int main() {
return global_symbol[0] != 0;
}
Even though a single GCC invocation can produce a binary for this simple case, under the covers the GCC driver script is actually running the preprocessor, then the compiler, then the assembler and finally the linker. The --save-temps argument to GCC allows users to see all these intermediate files, and is a useful argument for poking around inside the toolchain.
$ riscv64-unknown-linux-gnu-gcc relocation.c -o relocation -O3 --save-temps
Each step in this run of the GCC wrapper script generates a file:
relocation.i: The preprocessed source, which expands any preprocessor directives (things like #include or #ifdef).
relocation.s: The output of the actual compiler, which is an assembly file (a text file in the RISC-V assembly format).
relocation.o: The output of the assembler, which is an un-linked object file (an ELF file, but not an executable ELF).
relocation: The output of the linker, which is a linked executable (an executable ELF file).
The first step is to run the preprocessor. Since this is a simple source file with no preprocessor macros, the preprocessor run is pretty boring: all it does is emit some directives to be used if debugging information is later generated:
$ cat relocation.i
# 1 "relocation.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/scratch/palmer/work/upstream/riscv-gnu-toolchain/build/install/sysroot/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "relocation.c"
long global_symbol[2];
int main() {
return global_symbol[0] != 0;
}
The preprocessed output is then fed through the compiler, which generates an assembly file. It is at this point that we begin to see why relocations are necessary. This file is plain text containing RISC-V assembly code and is therefore easy to read, so let's take a look right now:
$ cat relocation.s
main:
lui a5,%hi(global_symbol)
ld a0,%lo(global_symbol)(a5)
snez a0,a0
ret
If you're not accustomed to reading the assembly output from RISC-V's GCC port then this might look a bit odd: there's an additional pair of addressing modes that aren't listed anywhere in the RISC-V instruction manual and don't really look like they could be sensibly implemented in hardware: %hi(global_symbol) and %lo(global_symbol)(a5).
These addressing modes exist to allow the compiler to address global symbols. The fundamental problem with addressing global symbols is that the compiler must emit assembly instructions in order to access said symbols, but the actual addresses of those symbols cannot be known until link time, so encoding them at compile time is an impossible task. As a concrete example, try to figure out what bits the compiler would emit for the lui that addresses global_symbol.
Relocations resolve this discrepancy: when the compiler is unable to know the bits that should be emitted as part of a particular instruction, it instead just emits arbitrary bits for that instruction and also emits a relocation entry. This relocation entry points to the bits that will be emitted and contains enough information for the linker to fill out those bits.
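To make the idea of a relocation entry a little more concrete, here is a minimal sketch of what such a record looks like in a 64-bit ELF file. The field layout mirrors the standard Elf64_Rela record and the type numbers are the ones assigned by the RISC-V ELF psABI; the names reloc_entry, reloc_info, reloc_sym and reloc_type are made up for this illustration and are not part of any toolchain.

```c
#include <stdint.h>

/* Relocation type numbers as assigned by the RISC-V ELF psABI. */
enum {
    R_RISCV_HI20   = 26,
    R_RISCV_LO12_I = 27,
    R_RISCV_RELAX  = 51
};

/* Same field layout as Elf64_Rela; the struct name is illustrative only. */
struct reloc_entry {
    uint64_t r_offset;  /* where in the section the bits to patch live */
    uint64_t r_info;    /* symbol index (high 32 bits), type (low 32 bits) */
    int64_t  r_addend;  /* constant added to the symbol's address */
};

/* Pack a symbol table index and a relocation type into r_info. */
static inline uint64_t reloc_info(uint32_t sym_index, uint32_t type)
{
    return ((uint64_t)sym_index << 32) | type;
}

/* Recover the symbol table index from r_info. */
static inline uint32_t reloc_sym(uint64_t info)
{
    return (uint32_t)(info >> 32);
}

/* Recover the relocation type from r_info. */
static inline uint32_t reloc_type(uint64_t info)
{
    return (uint32_t)(info & 0xffffffffu);
}
```

The offset says which bits to patch, the symbol index says whose address to use, and the type says how that address is squeezed into the instruction.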
The specifics of this are probably best explained by example, so let's go through the simple program above to see how it all works. The next link in the toolchain is the assembler, which takes in the assembly file from above and produces an ELF object file that has not yet been linked. You can examine these object files with objdump, which I've done below:
$ riscv64-unknown-linux-gnu-objdump -d -t -r relocation.o
relocation.o: file format elf64-littleriscv
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 relocation.c
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .text.startup 0000000000000000 .text.startup
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text.startup 000000000000000e main
0000000000000010 O *COM* 0000000000000008 global_symbol
Disassembly of section .text.startup:
0000000000000000 <main>:
0: 000007b7 lui a5,0x0
0: R_RISCV_HI20 global_symbol
0: R_RISCV_RELAX *ABS*
4: 0007b503 ld a0,0(a5) # 0 <main>
4: R_RISCV_LO12_I global_symbol
4: R_RISCV_RELAX *ABS*
8: 00a03533 snez a0,a0
c: 8082 ret
Now is the first point at which you get to explicitly see relocations (which are only shown when the -r argument is passed to objdump). Here we can see four RISC-V-specific relocations in two pairs: an R_RISCV_HI20 + R_RISCV_RELAX pair for the lui and an R_RISCV_LO12_I + R_RISCV_RELAX pair for the ld. The R_RISCV_RELAX relocations exist solely to signify that it is legal to perform linker relaxation on the previous relocation. Since we're not talking about linker relaxation in this blog entry, we can just ignore those entries for now.
The other two relocations pair explicitly with an addressing mode present in the RISC-V ISA: R_RISCV_HI20 pairs with a U-format immediate while R_RISCV_LO12_I pairs with an I-format immediate. In general, you'll find that every addressing mode with an immediate will have at least one relocation that fills out that immediate -- sometimes there'll be a handful more if that instruction format is used to link against more complicated forms of symbols as well (for example, PIC or TLS relocations).
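As a rough sketch of what "filling out an immediate" means at the bit level, the two helpers below place a value into the immediate field of a U-format and an I-format instruction word. The bit positions come from the RISC-V base ISA; the function names patch_utype and patch_itype are hypothetical and this is not the code binutils uses.

```c
#include <stdint.h>

/* U-format (lui, auipc): the 20-bit immediate occupies instruction bits 31:12. */
static inline uint32_t patch_utype(uint32_t insn, uint32_t imm20)
{
    /* Keep opcode and rd (bits 11:0), replace the immediate field. */
    return (insn & 0x00000fffu) | ((imm20 & 0xfffffu) << 12);
}

/* I-format (loads, addi, jalr): the 12-bit immediate occupies bits 31:20. */
static inline uint32_t patch_itype(uint32_t insn, int32_t imm12)
{
    /* Keep opcode, rd, funct3 and rs1 (bits 19:0), replace the immediate field. */
    return (insn & 0x000fffffu) | (((uint32_t)imm12 & 0xfffu) << 20);
}
```

For instance, patching the ld placeholder 0007b503 from the listing above with an offset of 56 yields 0387b503, and patching the lui placeholder 000007b7 with 0x12 yields 000127b7.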
Before we get too deep into relocations, let's quickly examine how the toolchain works when it's possible to fill out a relocation correctly. The next link in the toolchain is the linker, which consumes the relocations generated by the assembler to fill out the relevant bits in the output ELF executable. The program now has all the glibc startup code, so it's become quite large. Thus, I'm only posting the relevant snippets below:
$ riscv64-unknown-linux-gnu-objdump -d -t -r relocation
relocation: file format elf64-littleriscv
SYMBOL TABLE:
0000000000012038 g O .bss 0000000000000010 global_symbol
...
Disassembly of section .text:
0000000000010330 <main>:
10330: 67c9 lui a5,0x12
10332: 0387b503 ld a0,56(a5) # 12038 <global_symbol>
10336: 00a03533 snez a0,a0
1033a: 8082 ret
As you can see, the symbol table now has an actual address for global_symbol, the instructions that were referenced by the relocations have some non-zero bits filled out to reference global_symbol, and the relocations have been dropped from the ELF file as they're no longer necessary. That last part is only strictly the case because we have a statically linked symbol; relocating dynamic symbols is deferred to the loader.
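The immediates the linker chose can be reproduced by hand. The sketch below splits an absolute address into the upper part used by lui and the signed lower part used by the load; hi20 and lo12 are hypothetical helper names, and this is an illustration of the arithmetic rather than the linker's actual code. The 0x800 rounding is needed because the 12-bit offset is sign-extended by the hardware, so the upper part has to compensate whenever bit 11 of the address is set.

```c
#include <stdint.h>

/* Upper 20 bits for lui, rounded so a signed low part can reach the address. */
static inline uint32_t hi20(uint64_t addr)
{
    return (uint32_t)(((addr + 0x800) >> 12) & 0xfffff);
}

/* Signed low 12 bits for the load's offset field, in the range [-2048, 2047]. */
static inline int32_t lo12(uint64_t addr)
{
    int32_t low = (int32_t)(addr & 0xfff);
    return low >= 0x800 ? low - 0x1000 : low;
}
```

For the address 0x12038 from the symbol table, hi20 gives 0x12 and lo12 gives 56, matching the lui a5,0x12 and ld a0,56(a5) in the disassembly.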
The relocation truncated to fit Error Message
Now that you know a bit about what relocations are, we can discuss most people's only exposure to relocations: the relocation truncated to fit error message that appears when linking. It's hard to explain this message to people who don't understand relocations, but if you understand what a relocation is then it's not actually that tricky of an error message.
In order to explain the error message, we'll start with an extremely simple program. In this case we don't want anything from the C library to show up in our error message, so we're defining _start instead of main and then avoiding any standard library objects by passing -nostdlib -nostartfiles to GCC. This program won't actually work, but it'll serve to explain what's going on. Moving the text section with -Wl,-Ttext-segment,0x80000000 is what actually triggers the error; you'll see why below:
$ cat reloc_fail.c
long global_symbol;
int _start() {
return global_symbol;
}
$ riscv64-unknown-linux-gnu-gcc reloc_fail.c -o reloc_fail -O3 -nostartfiles -nostdlib --save-temps -Wl,-Ttext-segment,0x80000000
reloc_fail.o: In function `_start':
reloc_fail.c:(.text+0x0): relocation truncated to fit: R_RISCV_HI20 against symbol `global_symbol' defined in COMMON section in reloc_fail.o
/scratch/palmer/work/20170725-binutils-2.29/install/bin/../lib/gcc/riscv64-unknown-linux-gnu/7.1.1/../../../../riscv64-unknown-linux-gnu/bin/ld: final link failed: Symbol needs debug section which does not exist
collect2: error: ld returned 1 exit status
On the surface this looks like a super scary error message: there are all sorts of references to temporary objects; the mention of symbols, sections and relocations; and an odd message about debug sections. This is usually the point at which people give up and call a toolchain hacker, but with your newfound knowledge of relocations you should be able to figure out what's going on here.
First, let's focus on only the important part of the error message and ignore all the cruft that's not actually relevant. The actual error you want to look at here is:
reloc_fail.c:(.text+0x0): relocation truncated to fit: R_RISCV_HI20 against symbol `global_symbol'
which simply states that the compiler generated an R_RISCV_HI20 relocation against the symbol global_symbol, but that the linker was unable to fit the symbol's full address into the bits specified by that relocation. The phrase "truncated to fit" is a bit odd: what the linker is actually saying is that the address would have to be truncated to fit into the bits allocated by the relocation, but since this is an error the linker isn't really truncating anything.
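The check behind this message can be sketched in a few lines. This is a simplified model under stated assumptions, not the binutils implementation: fits_signed and hi20_fits_rv64 are hypothetical names, and the code assumes that right-shifting a negative 64-bit value is an arithmetic shift, which is what GCC and Clang provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if value can be stored in a signed field of the given width
   without losing any bits, i.e. without being truncated. */
static inline bool fits_signed(int64_t value, unsigned bits)
{
    int64_t min = -((int64_t)1 << (bits - 1));
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    return value >= min && value <= max;
}

/* Simplified HI20 check for RV64: the rounded upper part of the address
   must fit in lui's 20-bit field, which the hardware sign-extends. */
static inline bool hi20_fits_rv64(uint64_t addr)
{
    int64_t upper = (int64_t)(addr + 0x800) >> 12;
    return fits_signed(upper, 20);
}
```

Under this model an address such as 0x12038 passes the check, while 0x800010c0 does not, which is the situation the linker reports as an overflow.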
In order to start really delving into the "why" of the error message, we need to first look at the input to the linker, which in this case is the object file generated by the assembler. As in the example above, we need the relocation because the compiler needs to reference a global symbol whose address it can't know.
$ riscv64-unknown-linux-gnu-objdump -d -r reloc_fail.o
reloc_fail.o: file format elf64-littleriscv
Disassembly of section .text:
0000000000000000 <_start>:
0: 000007b7 lui a5,0x0
0: R_RISCV_HI20 global_symbol
0: R_RISCV_RELAX *ABS*
4: 0007a503 lw a0,0(a5) # 0 <_start>
4: R_RISCV_LO12_I global_symbol
4: R_RISCV_RELAX *ABS*
8: 8082 ret
We can't actually see the linker output because it's impossible to link this file. Since I hate doing arithmetic by hand, I instead just went ahead and modified the linker to omit the range check when performing relocations with the patch shown below:
$ git diff
diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 3c04507623c3..f8a97411de35 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -1492,8 +1492,6 @@ perform_relocation (const reloc_howto_type *howto,
case R_RISCV_GOT_HI20:
case R_RISCV_TLS_GOT_HI20:
case R_RISCV_TLS_GD_HI20:
- if (ARCH_SIZE > 32 && !VALID_UTYPE_IMM (RISCV_CONST_HIGH_PART (value)))
- return bfd_reloc_overflow;
value = ENCODE_UTYPE_IMM (RISCV_CONST_HIGH_PART (value));
break;
With the above patch, the linker can generate an incorrect executable that we can inspect, which I've shown below:
$ riscv64-unknown-linux-gnu-objdump -d -t reloc_fail
reloc_fail: file format elf64-littleriscv
SYMBOL TABLE:
00000000800000b0 l d .text 0000000000000000 .text
00000000800010c0 l d .bss 0000000000000000 .bss
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 l df *ABS* 0000000000000000 reloc_fail.c
00000000800018ba g .text 0000000000000000 __global_pointer$
00000000800010c0 g O .bss 0000000000000008 global_symbol
00000000800000b0 g F .text 000000000000000a _start
00000000800010ba g .bss 0000000000000000 __bss_start
00000000800010ba g .bss 0000000000000000 _edata
00000000800010c8 g .bss 0000000000000000 _end
Disassembly of section .text:
00000000800000b0 <_start>:
800000b0: 800017b7 lui a5,0x80001
800000b4: 0c07a503 lw a0,192(a5) # ffffffff800010c0 <__global_pointer$+0xfffffffefffff806>
800000b8: 8082 ret
As we can clearly see, the instructions that load the value of global_symbol do not actually match the address of global_symbol as listed by the symbol table, which is exactly what the relocation truncated to fit error message is trying to say. In the particular case of the R_RISCV_HI20 + R_RISCV_LO12_I relocation pair, the largest absolute address that can be generated is just under 0x80000000 (2 GiB). Remember that U-type immediates are signed on RISC-V: on RV64 the result of the lui is sign-extended, so any larger absolute address overflows, which is why the load above targets 0xffffffff800010c0 rather than 0x800010c0.
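The mismatch can be reproduced with a little arithmetic. The sketch below models only the two instructions involved on RV64; lui_rv64 and load_address are hypothetical names used for illustration and are not part of any toolchain.

```c
#include <stdint.h>

/* Value lui leaves in a 64-bit register: imm20 << 12, sign-extended from bit 31. */
static inline uint64_t lui_rv64(uint32_t imm20)
{
    uint64_t v = (uint64_t)(imm20 & 0xfffffu) << 12;
    if (v & 0x80000000u)
        v |= 0xffffffff00000000ull;
    return v;
}

/* Address a load uses: base register plus a sign-extended 12-bit offset. */
static inline uint64_t load_address(uint64_t base, int32_t offset12)
{
    return base + (uint64_t)(int64_t)offset12;
}
```

With the immediates from the disassembly above, load_address(lui_rv64(0x80001), 192) evaluates to 0xffffffff800010c0, the address objdump prints in its comment, rather than the 0x800010c0 listed in the symbol table.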
While every architecture performs some relocations when linking, RISC-V leverages the linker's relocation infrastructure more aggressively than any other architecture, so these sorts of issues may crop up more frequently than in other ports. We'll be talking a lot about relocations on the blog, as they frequently drive other toolchain design issues.