## Tuesday, August 29, 2017

### GCC target description macros and functions

This is part four of a series “Writing a GCC back end”.

The functionality that cannot be handled by the machine description in machine.md is implemented using target macros and functions in machine.h and machine.c. Starting from an existing back end that is similar to your target will give you a reasonable implementation for most of these, but you will need to update the parts related to the register file and addressing modes so that they correctly describe your architecture. This blog post describes the relevant macros and functions, with examples from the Moxie back end.

### Inspecting the RTL

Some of the macros and functions are given RTL expressions that they need to inspect in order to do what is expected. RTL expressions are represented by a type rtx that is accessed using macros. Assume we have a variable
rtx x;

containing the expression
(plus:SI (reg:SI 100) (const_int 42))

We can now access the different parts of it as
• GET_CODE(x) returns the operation PLUS
• GET_MODE(x) returns the operation’s machine mode SImode
• XEXP(x,0) returns the first operand – an rtx corresponding to (reg:SI 100)
• XEXP(x,1) returns the second operand – an rtx corresponding to (const_int 42)
XEXP() returns an rtx, which is not the right way to get at the value of a leaf expression (such as reg); those values are accessed using type-specific macros. For example, the integer value from (const_int 42) is accessed using INTVAL(), and the register number from (reg:SI 100) is accessed using REGNO().

For an example of how this is used in the back end, suppose our machine has load and store instructions that accept an address that is a constant, a register, or a register plus a constant in the range [-32768, 32767], and we need a function that, given an address expression, tells whether it is a valid address for the target. We could write that function as
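bool
valid_address_p (rtx x)  /* The function name is illustrative.  */
{
  /* Constant addresses.  */
  if (GET_CODE (x) == SYMBOL_REF
      || GET_CODE (x) == LABEL_REF
      || GET_CODE (x) == CONST)
    return true;
  /* A single register.  */
  if (REG_P (x))
    return true;
  /* A register plus a 16-bit signed constant.  */
  if (GET_CODE (x) == PLUS
      && REG_P (XEXP (x, 0))
      && CONST_INT_P (XEXP (x, 1))
      && IN_RANGE (INTVAL (XEXP (x, 1)), -32768, 32767))
    return true;
  return false;
}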
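
To experiment with the shape of such a check outside GCC, it can be mimicked on a toy expression type. The sketch below is only an illustration: `toy_rtx` and the `TOY_*` macros are hypothetical stand-ins for `rtx`, `GET_CODE`, `XEXP` and `INTVAL` (the real definitions live in GCC's rtl.h and look different), and only one kind of constant address is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's RTL codes (hypothetical, illustration only).  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_SYMBOL_REF };

typedef struct toy_rtx
{
  enum toy_code code;
  struct toy_rtx *op[2];   /* Operands of a TOY_PLUS.  */
  long value;              /* Register number or integer value.  */
} toy_rtx;

#define TOY_GET_CODE(X) ((X)->code)
#define TOY_XEXP(X, N)  ((X)->op[N])
#define TOY_INTVAL(X)   ((X)->value)

/* The same checks as the address function above, on the toy type.  */
bool
toy_valid_address_p (const toy_rtx *x)
{
  assert (x != NULL);
  if (TOY_GET_CODE (x) == TOY_SYMBOL_REF)
    return true;
  if (TOY_GET_CODE (x) == TOY_REG)
    return true;
  if (TOY_GET_CODE (x) == TOY_PLUS
      && TOY_GET_CODE (TOY_XEXP (x, 0)) == TOY_REG
      && TOY_GET_CODE (TOY_XEXP (x, 1)) == TOY_CONST_INT
      && TOY_INTVAL (TOY_XEXP (x, 1)) >= -32768
      && TOY_INTVAL (TOY_XEXP (x, 1)) <= 32767)
    return true;
  return false;
}

/* Build (plus (reg REGNO) (const_int OFFSET)) in local variables and
   run the check on it.  */
bool
toy_check_reg_plus_offset (long regno, long offset)
{
  toy_rtx reg = { TOY_REG, { NULL, NULL }, regno };
  toy_rtx cst = { TOY_CONST_INT, { NULL, NULL }, offset };
  toy_rtx sum = { TOY_PLUS, { &reg, &cst }, 0 };
  return toy_valid_address_p (&sum);
}
```

Calling `toy_check_reg_plus_offset (100, 42)` models the expression (plus:SI (reg:SI 100) (const_int 42)) from earlier in this post; an offset of 32768 falls just outside the accepted range.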


### Registers

The register names are specified as
#define REGISTER_NAMES              \
{                                 \
"$fp", "$sp", "$r0", "$r1",     \
"$r2", "$r3", "$r4", "$r5",     \
"$r6", "$r7", "$r8", "$r9",     \
"$r10", "$r11", "$r12", "$r13", \
"?fp", "?ap", "$pc", "?cc"      \
}

The names are used for generating the assembly files, and for the GCC extension that lets programmers place variables in specified registers
register int *foo asm ("$r12");

The Moxie architecture has only 18 registers, but the back end adds two fake registers, ?fp and ?ap, for the frame pointer and the argument pointer (used to access the function’s argument list). This is a common strategy in GCC back ends to simplify code generation and the elimination of unneeded frame and argument pointers, and there is a mechanism (ELIMINABLE_REGS) that rewrites these fake registers to real registers (such as the stack pointer).

The GCC RTL contains virtual registers (which are called pseudo registers in GCC) before the real registers (called hard registers) are allocated. Registers are represented as integers, where 0 is the first hard register, 1 is the second hard register, etc., and the pseudo registers follow after the last hard register. The back end specifies the number of hard registers by specifying the first pseudo register
#define FIRST_PSEUDO_REGISTER 20
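
The numbering scheme can be made concrete with a small standalone sketch. The helper names below (`is_hard_register`, `hard_register_name`, `pseudo_register_index`) are hypothetical and not part of GCC; the table simply mirrors the REGISTER_NAMES list shown earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIRST_PSEUDO_REGISTER 20

/* Mirrors the REGISTER_NAMES list: index = hard register number.  */
static const char *const register_names[FIRST_PSEUDO_REGISTER] = {
  "$fp", "$sp", "$r0", "$r1",
  "$r2", "$r3", "$r4", "$r5",
  "$r6", "$r7", "$r8", "$r9",
  "$r10", "$r11", "$r12", "$r13",
  "?fp", "?ap", "$pc", "?cc"
};

/* Hard registers are numbered below FIRST_PSEUDO_REGISTER.  */
bool
is_hard_register (unsigned int regno)
{
  return regno < FIRST_PSEUDO_REGISTER;
}

/* Return the assembly name of a hard register, or NULL for a pseudo.  */
const char *
hard_register_name (unsigned int regno)
{
  return is_hard_register (regno) ? register_names[regno] : NULL;
}

/* Position of a pseudo register counted from the first pseudo.  */
unsigned int
pseudo_register_index (unsigned int regno)
{
  assert (!is_hard_register (regno));
  return regno - FIRST_PSEUDO_REGISTER;
}
```

With these definitions, register number 2 maps to the name $r0, while register 100 (as in the (reg:SI 100) example) is a pseudo register and has no assembly name.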

Some of the registers, such as the program counter $pc, cannot be used by the register allocator, so we need to specify which registers the register allocator must avoid
#define FIXED_REGISTERS \
{                     \
1, 1, 0, 0,         \
0, 0, 0, 0,         \
0, 0, 0, 0,         \
0, 0, 0, 1,         \
1, 1, 1, 1          \
}

where 1 means that the register cannot be used by the register allocator. Similarly, the register allocator needs to know which registers may be changed by calling a function
#define CALL_USED_REGISTERS \
{                         \
1, 1, 1, 1,             \
1, 1, 1, 1,             \
0, 0, 0, 0,             \
0, 0, 1, 1,             \
1, 1, 1, 1              \
}

where 1 means that the register is clobbered by function calls.

It is possible to specify the order in which the register allocator allocates the registers. The Moxie back end does not do this, but it could have done it with something like the code below, which would use register 2 ($r0) for the first allocated register, register 3 ($r1) for the second allocated register, etc.
#define REG_ALLOC_ORDER                    \
{                                        \
/* Call-clobbered registers */         \
2, 3, 4, 5, 6, 7, 14,                  \
/* Call-saved registers */             \
8, 9, 10, 11, 12, 13,                  \
/* Registers not for general use. */   \
0, 1, 15, 16, 17, 18, 19               \
}

For an example of where this can be used, consider an architecture with 16 registers and “push multiple” and “pop multiple” instructions that push/pop the last n registers. The instructions are designed to store call-saved registers when calling a function, and it makes sense to allocate the call-saved registers in reverse order so that the push/pop instructions save as few registers as possible
#define REG_ALLOC_ORDER                            \
{                                                \
/* Call-clobbered registers */                 \
0, 1, 2, 3,                                    \
/* Call-saved registers */                     \
15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4       \
}

There are usually restrictions on how registers can be used (e.g. the Moxie $pc register cannot be used in arithmetic instructions), and these restrictions are described by register classes.
Each register class defines a set of registers that can be used in the same way (for example, that can be used in arithmetic instructions). There are three standard register classes that must be defined
• ALL_REGS – containing all of the registers.
• NO_REGS – containing no registers.
• GENERAL_REGS – used for the ‘r’ and ‘g’ constraints.
and the back end can add its own as needed. The register classes are implemented as
enum reg_class
{
NO_REGS,
GENERAL_REGS,
SPECIAL_REGS,
CC_REGS,
ALL_REGS,
LIM_REG_CLASSES
};

#define N_REG_CLASSES LIM_REG_CLASSES

#define REG_CLASS_NAMES             \
{                                 \
"NO_REGS",                      \
"GENERAL_REGS",                 \
"SPECIAL_REGS",                 \
"CC_REGS",                      \
"ALL_REGS"                      \
}

#define REG_CLASS_CONTENTS                           \
{                                                  \
{ 0x00000000 }, /* Empty */                      \
{ 0x0003FFFF }, /* $fp, $sp, $r0 to $r13, ?fp */ \
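
Each entry in REG_CLASS_CONTENTS is a bit mask in which bit n is set when hard register n belongs to the class. As a rough sketch of how such a mask relates to register numbers, the standalone code below builds a mask from a range of register numbers and tests membership. The helper names (`reg_range_mask`, `reg_in_class_p`) are hypothetical and not GCC functions, and the sketch assumes at most 32 hard registers so that one 32-bit word is enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return a mask with the bits for registers FIRST..LAST (inclusive) set.
   Assumes register numbers below 32.  */
uint32_t
reg_range_mask (unsigned int first, unsigned int last)
{
  assert (first <= last && last < 32);
  uint32_t mask = 0;
  for (unsigned int r = first; r <= last; r++)
    mask |= (uint32_t) 1 << r;
  return mask;
}

/* Return true if hard register REGNO is a member of CLASS_MASK.  */
bool
reg_in_class_p (uint32_t class_mask, unsigned int regno)
{
  assert (regno < 32);
  return (class_mask >> regno) & 1;
}
```

With the register numbering used in this post, `reg_range_mask (0, 17)` gives 0x0003FFFF, the value on the second line of the table, and register 18 ($pc) is not a member of that mask.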


### gcc/config/machine

The main part of the back end is located in gcc/config/machine. It consists of eight different components, each implemented in a separate file:
• machine.h is included all over the compiler and contains macros defining properties of the target, such as the size of integers and pointers, number of registers, data alignment rules, etc.
• GCC implements a generic back end where machine.c can override most of the functionality. The back end is written in C,[1] so the virtual functions are handled manually with function pointers in a structure, and machine.c overrides the defaults using code of the form
#undef TARGET_FRAME_POINTER_REQUIRED
#define TARGET_FRAME_POINTER_REQUIRED ft32_frame_pointer_required
static bool
ft32_frame_pointer_required (void)
{
  return cfun->calls_alloca;
}

• machine-protos.h contains prototypes for the external functions defined in machine.c.
• machine.opt adds target-specific command-line options to the compiler using a record format specifying the option name, properties, and a documentation string for the --help output. For example,
msmall-data-limit=
Target Joined Separate UInteger Var(g_switch_value) Init(8)
-msmall-data-limit=N    Put global and static data smaller than <number> bytes into a special section.

adds a command-line option -msmall-data-limit that has the default value 8 and whose value is stored in an unsigned variable named g_switch_value.
• machine.md, predicates.md, and constraints.md contain the machine description consisting of rules for instruction selection and register allocation, pipeline description, and peephole optimizations. These will be covered in detail in parts 3–7 of this series.
• machine-modes.def defines extra machine modes for use in the low-level IR. A “machine mode” in GCC terminology defines the size and representation of a data object; that is, it is a data type. Extra modes are typically used for condition codes and vectors.
The GCC configuration is very flexible and everything can be overridden, so some back ends look slightly different as they, for example, add several .opt files by setting extra_options in config.gcc.
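
The function-pointer mechanism described for machine.c above can be pictured with a small standalone model. This is only a sketch of the pattern: every name in it (`struct toy_target`, the `TOY_*` macros, `toy_targetm`, the `stack_alignment` hook) is hypothetical, and `toy_calls_alloca` stands in for per-function state such as cfun->calls_alloca. In GCC itself the structure is `struct gcc_target`, and the back end instantiates it with `TARGET_INITIALIZER` after redefining the `TARGET_*` macros it wants to override.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a target hook structure (hypothetical names).  */
struct toy_target
{
  bool (*frame_pointer_required) (void);
  int (*stack_alignment) (void);
};

/* Generic defaults, used unless a back end overrides them.  */
bool toy_default_frame_pointer_required (void) { return false; }
int toy_default_stack_alignment (void) { return 8; }

#define TOY_FRAME_POINTER_REQUIRED toy_default_frame_pointer_required
#define TOY_STACK_ALIGNMENT toy_default_stack_alignment

/* The "back end" part: override one hook, keep the other default.  */
static bool toy_calls_alloca = true;  /* Stand-in for per-function state.  */

#undef TOY_FRAME_POINTER_REQUIRED
#define TOY_FRAME_POINTER_REQUIRED my_frame_pointer_required
static bool
my_frame_pointer_required (void)
{
  return toy_calls_alloca;
}

/* The macros are expanded here, so the override is picked up.  */
#define TOY_INITIALIZER { TOY_FRAME_POINTER_REQUIRED, TOY_STACK_ALIGNMENT }
struct toy_target toy_targetm = TOY_INITIALIZER;

/* Generic code only ever calls through the structure.  */
bool
toy_needs_frame_pointer (void)
{
  assert (toy_targetm.frame_pointer_required != NULL);
  return toy_targetm.frame_pointer_required ();
}
```

The design choice is that the generic code never names a back-end function directly; it only calls through the structure, so a back end changes behaviour just by redefining a macro before the initializer is expanded.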

### gcc/common/config/machine

The gcc/common/config/machine directory contains a file machine-common.c that can add/remove optimization passes, change the defaults for --param values, etc.

Many back ends do not need to do anything here, and this file can be disabled by setting
target_has_targetm_common=no

in config.gcc.

### libgcc/config.host

The libgcc config.host works in the same way as config.gcc, but with different variables.

The only variable that must be set is cpu_type that specifies machine. Most targets also set extra_parts that specifies extra object files to include in the library and tmake_file that contains makefile fragments that add extra functionality (such as soft-float support).

A typical configuration for a simple target looks something like
cpu_type=ft32
tmake_file="$tmake_file t-softfp"
extra_parts="$extra_parts crti.o crtn.o crtbegin.o crtend.o"


### libgcc/config/machine

The libgcc/config/machine directory contains extra files that may be needed for the target architecture. Simple implementations typically contain only a crti.S and a crtn.S (crtbegin/crtend and the makefile support for building all of these have default implementations) and a file sfp-machine.h containing defaults for the soft-float implementation.

1. GCC is written in C++03 these days, but the structure has not been changed since it was written in C.

## Friday, August 4, 2017

### Writing a GCC back end

It is surprisingly easy to design a CPU (see for example Colin Riley’s blog series) and I was recently asked how hard it is to write a GCC back end for your new architecture. That too is easy — provided you have done it once before. But the first time is quite painful...

I plan to write some blog posts in the coming weeks that will try to ease the pain by showing what is involved in creating a “working” back end that is capable of compiling simple functions, giving some pointers on how to proceed to make it production-ready, and in general providing the overview I would have liked before I started developing my back end. (GCC has a good reference manual, “GNU Compiler Collection Internals”, describing everything you need to know, but it is a bit overwhelming when you start.)

The series will cover the following (I’ll update the list with links to the posts as they become available)
1. The structure of a GCC back end
• Which files you need to write/modify
2. Getting started with a GCC back end
• Pointers to resources describing how to set up the initial back end
3. Low-level IR and basic code generation
• How the low-level IR works
• How the IR is lowered to instructions
• How to write simple instruction patterns
4. Target description macros and functions
• Working with the RTL
• Describing the registers (names, register classes, allocation order, ...)
5. More about instruction patterns
• define_expand
• The unspec and unspec_volatile expression codes
• Attributes
6. Improving the performance
• Cost model
• Peephole optimization
• Tuning the optimization passes
7. Instruction scheduling
• Describing the processor pipeline