Understanding and writing an LLVM compiler back-end
Bruno Cardoso Lopes
bruno.cardoso@gmail.com
Embedded Linux Conference 2009
San Francisco, CA
Agenda
• What’s LLVM?
• LLVM design
• The back-end
• Why LLVM?
• Who’s using
What's LLVM? Basics
• Low Level Virtual Machine
• A virtual instruction set
• A compiler infrastructure suite with aggressive optimizations.
A virtual instruction set
• Low-level representation, but with high-level type information
• 3-address-code-like representation
• RISC-based, language-independent, with SSA information. The on-disk representation is called bitcode.
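To make the three-address, SSA-style shape more concrete, here is a minimal C sketch. It is not LLVM code: the function name `diff_of_squares` and the temporaries `t0` to `t2` are made up for illustration. Each statement applies at most one operator to two operands, and every temporary is assigned exactly once, which is roughly the shape the virtual instruction set takes for straight-line code. The IR in the comments is approximate.

```c
/* Illustrative only: C written in a three-address, single-assignment style.
 * The names diff_of_squares, t0, t1 and t2 are hypothetical. */
int diff_of_squares(int a, int b) {
    const int t0 = a + b;   /* roughly: %t0 = add i32 %a, %b   */
    const int t1 = a - b;   /* roughly: %t1 = sub i32 %a, %b   */
    const int t2 = t0 * t1; /* roughly: %t2 = mul i32 %t0, %t1 */
    return t2;              /* roughly: ret i32 %t2            */
}
```

Calling `diff_of_squares(5, 3)` computes (5 + 3) * (5 - 3). Declaring the temporaries `const` is just a way to make the single-assignment property visible in C.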
A virtual instruction set
int dummy(int a) {
  return a+3;
}
define i32 @dummy(i32 %a) nounwind readnone {
entry:
  %0 = add i32 %a, 3
  ret i32 %0
}
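One way to read the IR for `dummy` is as a short list of three-address instructions over virtual registers. The following toy model in C sketches that idea. It is an assumption-laden illustration, not LLVM's in-memory IR or API: the names `ToyOp`, `ToyInst`, `toy_run` and `toy_dummy` are hypothetical, and only the two operations used by `dummy` are modelled.

```c
#include <stddef.h>

/* Toy model of a three-address instruction; this is NOT LLVM's IR or API.
 * All names here (ToyOp, ToyInst, toy_run, toy_dummy) are hypothetical. */
typedef enum { TOY_ADD, TOY_RET } ToyOp;

typedef struct {
    ToyOp op;
    int dest; /* virtual register written (unused by TOY_RET) */
    int lhs;  /* virtual register read */
    int imm;  /* immediate second operand (unused by TOY_RET) */
} ToyInst;

/* Executes the instructions over a small register file.
 * Register 0 holds the argument %a; indices must stay below 8. */
int toy_run(const ToyInst *code, size_t n, int a) {
    int regs[8] = {0};
    regs[0] = a;
    for (size_t i = 0; i < n; ++i) {
        switch (code[i].op) {
        case TOY_ADD:
            regs[code[i].dest] = regs[code[i].lhs] + code[i].imm;
            break;
        case TOY_RET:
            return regs[code[i].lhs];
        }
    }
    return 0; /* no ret instruction was reached */
}

/* Encodes the two instructions of @dummy and runs them. */
int toy_dummy(int a) {
    const ToyInst code[] = {
        { TOY_ADD, 1, 0, 3 }, /* %0 = add i32 %a, 3 (register 1 plays %0) */
        { TOY_RET, 0, 1, 0 }, /* ret i32 %0 */
    };
    return toy_run(code, sizeof code / sizeof code[0], a);
}
```

Here `toy_dummy(4)` walks the same two steps as the IR: an add into a fresh register, then a return of that register.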
A compiler infrastructure suite
• Compiler front-end
  • llvm-gcc
  • clang
• IR and tools to handle it
• JIT and static back-ends.
Front-end → IR → Back-end
Front-end: llvm-gcc
• GCC-based front-end: llvm-gcc
• GENERIC to LLVM instead of GIMPLE
• GIMPLE is only an in-memory IR
• GCC is not modular (intentionally)
Front-end: llvm-gcc
• Very mature and supports Java, Ada, FORTRAN, C, C++ and ObjC.
• A cross-compiler is needed for a non-native target.
$ llvm-gcc -O2 -c clown.c -emit-llvm -o clown.bc
$ llvm-extract -func=bozo < clown.bc | llvm-dis
LLVM assembly:
define float @bozo(i32 %lhs, i32 %rhs, float %w) nounwind {
entry:
  %0 = sdiv i32 %lhs, %rhs
  %1 = sitofp i32 %0 to float
  %2 = mul float %1, %w
  ret float %2
}
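The slide does not show the contents of clown.c. A C function along the following lines would plausibly produce the IR for `bozo` shown above; treat it as a reconstruction from the IR, not as the original source.

```c
/* Hypothetical reconstruction of bozo from its IR; the real clown.c is not shown. */
float bozo(int lhs, int rhs, float w) {
    /* Integer division first (sdiv), then the int-to-float conversion
     * (sitofp), then a floating-point multiply by w. */
    return (float)(lhs / rhs) * w;
}
```

For instance, `bozo(7, 2, 1.5f)` divides 7 by 2 as integers, converts the quotient 3 to float, and multiplies by 1.5. The explicit cast only makes the `sitofp` step visible; C would perform the same conversion implicitly.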