How to Use the Plan 9 C Compiler

Rob Pike

rob@plan9.bell-labs.com

Introduction

The C compiler on Plan 9 is a wholly new program; in fact it was the first piece of software written for what would eventually become Plan 9 from Bell Labs. Programmers familiar with existing C compilers will find a number of differences in both the language the Plan 9 compiler accepts and in how the compiler is used.

The compiler is really a set of compilers, one for each architecture — MIPS, SPARC, Motorola 68020, Intel 386, etc. — that accept a dialect of ANSI C and efficiently produce fairly good code for the target machine. There is a packaging of the compiler that accepts strict ANSI C for a POSIX environment, but this document focuses on the native Plan 9 environment, that in which all the system source and almost all the utilities are written.

Source

The language accepted by the compilers is the core ANSI C language with some modest extensions, a greatly simplified preprocessor, a smaller library that includes system calls and related facilities, and a completely different structure for include files.

Official ANSI C accepts the old (K&R) style of declarations for functions; the Plan 9 compilers are more demanding. Without an explicit compiler flag (-B), whose use is discouraged, the compilers insist on new-style function declarations, that is, prototypes for function arguments. The function declarations in the libraries’ include files are all in the new style, so the interfaces are checked at compile time. For C programmers who have not yet switched to function prototypes, the clumsy syntax may seem repellent, but the payoff in stronger typing is substantial. Those who wish to import existing software to Plan 9 are urged to use the opportunity to update their code.
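As a small illustration of what the new style buys, here is a sketch in portable ANSI C. It is not taken from the Plan 9 source, and the function names scale and triple are invented for the example. With the prototype in scope, each call is checked against the declared argument types and converted where necessary.

```c
/* New-style (prototype) declaration: the argument types are part of
 * the declaration, so every call is checked against them. */
double scale(double x, int factor);

/* New-style definition.  The old K&R form would have been
 *     double scale(x, factor) double x; int factor; { ... }
 * which tells callers nothing about the argument types. */
double
scale(double x, int factor)
{
    return x * factor;
}

/* Because the prototype is in scope, the int constant 3 passed as the
 * first argument is converted to double before the call. */
double
triple(void)
{
    return scale(3, 3);
}
```

Calling scale with, say, a string as its second argument is a compile-time error under this declaration; with an old-style declaration the mistake would not be diagnosed.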

The compilers include an integrated preprocessor that accepts the familiar #include, #define for macros both with and without arguments, #undef, #line, #ifdef, #ifndef, and #endif. It supports neither #if nor ##, although it does honor a few #pragmas. The #if directive was omitted because it greatly complicates the preprocessor, is never necessary, and is usually abused. Conditional compilation in general makes code hard to understand; the Plan 9 source uses it sparingly. Also, because the compilers remove dead code, regular if statements with constant conditions are more readable equivalents to many #ifs. To compile imported code ineluctably fouled by #if there is a separate command, /bin/cpp, that implements the complete ANSI C preprocessor specification.
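The point about constant conditions can be sketched as follows. This is an illustrative fragment in portable ANSI C, not code from the Plan 9 source; the names Debug and bufsize are hypothetical. Where other code might wrap a line in #if DEBUG, an ordinary if tests a compile-time constant, and a compiler that removes dead code discards the branch that cannot run.

```c
/* Compile-time constant, declared as an enum rather than as a
 * preprocessor symbol to be tested with #if. */
enum { Debug = 0 };

/* Hypothetical example: number of bytes to allocate for a buffer.
 * When Debug is nonzero the buffer gets a 16-byte guard region. */
int
bufsize(int n)
{
    if(Debug)
        return n + 16;
    return n;
}
```

Unlike text hidden by #if, both branches are still parsed and type-checked, so the disabled code cannot silently rot.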

Include files fall into two groups: machine-dependent and machine-independent. The machine-independent files occupy the directory /sys/include; the others are placed in a directory appropriate to the machine, such as /mips/include. The compiler searches for include files first in the machine-dependent directory and then in the machine-independent directory. At the time of writing there are thirty-one machine-independent include files and two (per machine) machine-dependent ones: <ureg.h> and <u.h>. The first describes the layout of registers on the system stack, for use by the debugger. The second defines some architecture-dependent types such as jmp_buf for setjmp and the va_arg and va_list macros for handling arguments to variadic functions, as well as a set of typedef abbreviations for unsigned short and so on.
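For readers who have not used the variadic macros, here is a minimal sketch of a function that walks its argument list. It is written in portable ANSI C, where the macros come from <stdarg.h>; under Plan 9 they would come from <u.h> as described above. The function sum is invented for the example.

```c
#include <stdarg.h>    /* portable C; Plan 9 programs get these from <u.h> */

/* Hypothetical example: add up the n int arguments that follow n. */
int
sum(int n, ...)
{
    va_list arg;
    int i, total;

    total = 0;
    va_start(arg, n);          /* start after the last named argument */
    for(i = 0; i < n; i++)
        total += va_arg(arg, int);
    va_end(arg);
    return total;
}
```

A call such as sum(3, 1, 2, 3) passes the count first, since the function has no other way to learn how many arguments follow.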

Here is an excerpt from /68020/include/u.h:

#define nil     ((void*)0)
typedef unsigned short  ushort;
typedef unsigned char   uchar;
typedef unsigned long   ulong;
typedef unsigned int    uint;
typedef   signed char   schar;
typedef long long       vlong;
typedef long    jmp_buf[2];
#define JMPBUFSP    0
#define JMPBUFPC    1
#define JMPBUFDPC   0

Plan 9 programs use nil for the name of the zero-valued pointer. The type vlong is the largest integer type available; on most architectures it is a 64-bit value. A couple of other types in <u.h> are u32int, which is guaranteed to have exactly 32 bits (a possibility on all the supported architectures) and mpdigit, which is used by the multiprecision math package <mp.h>. The #define constants permit an architecture-independent (but compiler-dependent) implementation of stack-switching using setjmp and longjmp.
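The following sketch shows how nil and the abbreviated type names read in ordinary code. So that it compiles without <u.h>, it repeats three declarations from the excerpt above; a real Plan 9 program would include <u.h> instead. The helper bytesum is hypothetical.

```c
/* Local copies of declarations from the <u.h> excerpt, so the
 * example compiles on systems that lack that header. */
#define nil     ((void*)0)
typedef unsigned char   uchar;
typedef unsigned long   ulong;

/* Hypothetical helper: add up the n bytes at p.
 * A nil pointer is treated as an empty buffer. */
ulong
bytesum(uchar *p, int n)
{
    ulong s;
    int i;

    if(p == nil)
        return 0;
    s = 0;
    for(i = 0; i < n; i++)
        s += p[i];
    return s;
}
```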

Every Plan 9 C program begins

#include <u.h>

because all the other installed header files use the typedefs declared in <u.h>.

In strict ANSI C, include files are grouped to collect related functions in a single file: one for string functions, one for memory functions, one for I/O, and none for system calls. Each include file is protected by an #ifdef to guarantee its contents are seen by the compiler only once. Plan 9 takes a different approach. Other than a few include files that define external formats such as archives, the files in /sys/include correspond to libraries. If a program is using a library, it includes the corresponding header. The default C library comprises string functions, memory functions, and so on, largely as in ANSI C, some formatted I/O routines, plus all the system calls and related functions. To use these functions, one must #include the file <libc.h>, which in turn must follow <u.h>, to define their prototypes for the compiler. Here is the complete source to the traditional first C program:

#include <u.h>
#include <libc.h>

void
main(void)
{
    print("hello world\n");
    exits(0);
}

The print routine and its relatives fprint and sprint resemble the similarly-named functions in Standard I/O but are not attached to a specific I/O library. In Plan 9 main is not integer-valued; it should call exits, which takes a string argument (or null; here ANSI C promotes the 0 to a char*). All these functions are, of course, documented in the Programmer’s Manual.
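Because the exit status is a string rather than an integer, code that inspects a status tests the string itself. The sketch below is portable ANSI C, not Plan 9 library code. It assumes the usual convention that a null pointer or an empty string reports success and any other string describes a failure; the helper name succeeded is invented for the example, and the manual remains the authority on the exact rules.

```c
/* Hypothetical helper: interpret an exit status string.
 * Assumption: a null pointer or an empty string means success;
 * any other string is a description of what went wrong. */
int
succeeded(char *status)
{
    return status == 0 || status[0] == '\0';
}
```

Under that assumption, the call exits(0) in the program above reports success, while something like exits("open failed") would report an error that the parent process can read as text.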

To use printf, <stdio.h> must be included to define the function prototype for printf:

#include <u.h>
#include <libc.h>
#include <stdio.h>

void
main(int argc, char *argv[])
{
    printf("%s: hello world; argc = %d\n", argv[0], argc);
    exits(0);
}

In practice, Standard I/O is not used much in Plan 9. I/O libraries are discussed in a later section of this document.

There are libraries for handling regular expressions, raster graphics, windows, and so on, and each has an associated include file. The manual for each library states which include files are needed. The files are not protected against multiple inclusion and themselves contain no nested #includes. Instead the programmer is expected to sort out the requirements and to #include the necessary files once at the top of each source file. In practice this is trivial: this way of handling include files is so straightforward that it is rare for a source file to contain more than half a dozen #includes.

The compilers do their own register allocation so the register keyword is ignored. For different reasons, volatile and const are also ignored.

To make it easier to share code with other systems, Plan 9 has a version of the compiler that accepts strict ANSI C for a POSIX environment, as mentioned in the introduction.
