Changes to the Programming Environment

in the

Fourth Release of Plan 9

Rob Pike

rob@plan9.bell-labs.com

Introduction

The fourth release of Plan 9 includes changes at many levels of the system, with repercussions in the libraries and program interfaces. This document summarizes the changes and describes how existing programs must be modified to run in the new release. It is not exhaustive, of course; for further detail about any of the topics refer to the manual pages, as always.

Programmers new to Plan 9 may find valuable tidbits here, but the real audience for this paper is those with a need to update applications and servers written in C for earlier releases of the Plan 9 operating system.

9P, NAMELEN, and strings

The underlying file service protocol for Plan 9, 9P, retains its basic form but has had a number of adjustments to deal with longer file names and error strings, new authentication mechanisms, and to make it more efficient at evaluating file names. The change to file names affects a number of system interfaces; because file name elements are no longer of fixed size, they can no longer be stored as arrays.

9P used to be a fixed-format protocol with NAMELEN-sized byte arrays representing file name elements. Now it is a variable-format protocol, as described in intro(5), in which strings are represented by a count followed by that many bytes. Thus, the string ken would previously have occupied 28 (NAMELEN) bytes in the message; now it occupies 5: a two-byte count followed by the three bytes of ken and no terminal zero. (And of course, a name could now be much longer.) A similar format change has been made to stat buffers: they are no longer DIRLEN bytes long but instead have variable size prefixed by a two-byte count. And in fact the entire 9P message syntax has changed: every message now begins with a message length field that makes it trivial to break the byte stream into messages without parsing them, so aux/fcall is gone. A new library entry point, read9pmsg, makes it easy for user-level servers to break the client data stream into 9P messages. All servers should switch from using read (or the now-gone getS) to using read9pmsg.
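To make the counted-string format concrete, here is a minimal sketch in portable C. It is not taken from the Plan 9 libraries: the names put9pstring and msgready are hypothetical, and the sketch assumes, following intro(5), that integers are little-endian and that the four-byte length at the start of each message counts itself. Real servers should call read9pmsg rather than anything like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Encode s as a counted string: a two-byte little-endian count
 * followed by that many bytes, with no terminating zero.
 * Returns the number of bytes written, or 0 if s does not fit.
 * (put9pstring is a hypothetical name, not a library routine.)
 */
size_t
put9pstring(uint8_t *p, size_t space, const char *s)
{
    size_t n;

    assert(p != NULL && s != NULL);
    n = strlen(s);
    if(n > 0xFFFF || space < 2 + n)
        return 0;
    p[0] = (uint8_t)(n & 0xFF);
    p[1] = (uint8_t)((n >> 8) & 0xFF);
    memcpy(p + 2, s, n);
    return 2 + n;
}

/*
 * Decide whether buf, holding the first `have' bytes of a stream,
 * begins with a complete message.
 * Returns 1 and sets *len if so, 0 if more data is needed,
 * and -1 if the length field is impossibly small.
 * (msgready is a hypothetical name; assumes a four-byte
 * little-endian length that includes its own four bytes.)
 */
int
msgready(const uint8_t *buf, size_t have, size_t *len)
{
    uint32_t n;

    assert(buf != NULL && len != NULL);
    if(have < 4)
        return 0;
    n = (uint32_t)buf[0] | (uint32_t)buf[1] << 8
        | (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
    if(n < 4)
        return -1;
    if(n > have)
        return 0;
    *len = n;
    return 1;
}
```

Encoding ken this way yields the five bytes 03 00 6b 65 6e, matching the count of 5 given above. The framing check never looks past the length field, which is the point: messages can be separated without parsing them.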

This change to 9P affects the way strings are handled by the kernel and throughout the system. The consequences are primarily that fixed-size arrays have been replaced by pointers and counts in a variety of system interfaces. Most programs will need at least some adjustment to the new style. In summary: NAMELEN is gone, except as a vestige in the authentication libraries, where it has been rechristened ANAMELEN. DIRLEN and ERRLEN are also gone. All programs that mention these constants will need to be fixed.

The simplest place to see this change is in the errstr system call, which no longer assumes a buffer of length ERRLEN but now requires a byte-count argument:

char buf[...];
errstr(buf, sizeof buf);

The buffer can be any size you like. For convenience, the kernel stores error strings internally as 256-byte arrays, so if you like (but it’s not required) you can use the defined constant ERRMAX=256 as a good buffer size. Unlike the old ERRLEN (which had value 64), ERRMAX is advisory, not mandatory, and is not part of the 9P specification.

With names, stat buffers, and directories, there isn’t even an echo of a fixed-size array any more.

Directories and wait messages

With strings now variable-length, a number of system calls needed to change: errstr, stat, fstat, wstat, fwstat, and wait are all affected, as is read when applied to directories.

As far as directories are concerned, most programs don’t use the system calls directly anyway, since those calls operate on the machine-independent form; instead, programs call the machine-dependent Dir routines dirstat, dirread, etc. These used to fill user-provided fixed-size buffers; now they return objects allocated by malloc (which must therefore be freed after use). To ‘stat’ a file:

Dir *d;

d = dirstat(filename);
if(d == nil){
    fprint(2, "can't stat %s: %r\n", filename);
    exits("stat");
}
use(d);
free(d);

A common new bug is to forget to free a Dir returned by dirstat.
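Why a single free is enough can be illustrated with a sketch in portable C. It assumes, as a design choice and not as a description of the actual library source, that the structure and its variable-length strings are placed in one allocation. Entry and newentry are hypothetical stand-ins, not Plan 9 names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Entry is a hypothetical stand-in for a structure such as Dir. */
typedef struct Entry Entry;
struct Entry {
    unsigned long length;
    char *name;     /* points into the same allocation as the Entry */
    char *owner;    /* likewise */
};

/*
 * Allocate an Entry together with copies of its strings in a single
 * block, so the caller releases everything with one free().
 * Returns NULL if malloc fails.
 */
Entry*
newentry(const char *name, const char *owner, unsigned long length)
{
    size_t nn, no;
    Entry *e;
    char *p;

    assert(name != NULL && owner != NULL);
    nn = strlen(name) + 1;      /* sizes include the terminating zero */
    no = strlen(owner) + 1;
    e = malloc(sizeof *e + nn + no);
    if(e == NULL)
        return NULL;
    p = (char*)&e[1];           /* string storage starts just past the struct */
    e->length = length;
    e->name = p;
    memcpy(e->name, name, nn);
    e->owner = p + nn;
    memcpy(e->owner, owner, no);
    return e;
}
```

With this layout the strings can be of any length, there is no fixed-size array anywhere, and the caller's obligation is exactly one free of the returned pointer.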

Dirfstat and Dirfwstat work pretty much as before, but changes to 9P make it possible to exercise finer-grained control on what fields of the Dir are to be changed; see stat(2) and stat(5) for details.

Reading a directory works in a similar way to dirstat, with dirread allocating and filling in an array of Dir structures. The return value is the number of elements of the array. The arguments to dirread now include a pointer to a Dir* to be filled in with the address of the allocated array:

Dir *d;
int i, n;

while((n = dirread(fd, &d)) > 0){
    for(i=0; i<n; i++)
        use(&d[i]);
    free(d);
}

A new library function, dirreadall, has the same form as dirread but returns the entire directory in one call:

n = dirreadall(fd, &d);
for(i=0; i<n; i++)
    use(&d[i]);
free(d);

If your program insists on using the underlying stat system call or its relatives, or wants to operate directly on the machine-independent format returned by stat or read, it will need to be modified. Such programs are rare enough that we’ll not discuss them here beyond referring to the man page stat(2) for details. Be aware, though, that it used to be possible to regard the buffer returned by stat as a byte array that began with the zero-terminated name of the file; this is no longer true. With very rare exceptions, programs that call stat would be better recast to use the dir routines or, if their goal is just to test the existence of a file, access.

Similar changes have affected the wait system call. In fact, wait is no longer a system call but a library routine that calls the new await system call and returns a newly allocated machine-dependent Waitmsg structure:

Waitmsg *w;

w = wait();
if(w == nil)
    error("wait: %r");
print("pid is %d; exit string %s\n", w->pid, w->msg);
free(w);

The exit string w->msg may be empty but it will never be a nil pointer. Again, don’t forget to free the structure returned by wait. If all you need is the pid, you can call waitpid, which reports just the pid and doesn’t return an allocated structure:

int pid;

pid = waitpid();
if(pid < 0)
    error("wait: %r");
print("pid is %d\n", pid);

Quoted strings and tokenize

Wait gives us a good opportunity to describe how the system copes with all this free-format data. Consider the text returned by the await system call, which includes a set of integers (pids and times) and a string (the exit status). This information is formatted free-form; here is the statement in the kernel that generates the message:

n = snprint(a, n, "%d %lud %lud %lud %q",
    wq->w.pid,
    wq->w.time[TUser], wq->w.time[TSys], wq->w.time[TReal],
    wq->w.msg);
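For readers without a Plan 9 system at hand, the following portable C sketch imitates the convention used for the last field: a string is wrapped in rc-style single quotes when it is empty or contains white space or a quote, embedded quotes are doubled, and a matching splitter undoes the quoting. The names quotefield and splitfields are hypothetical, and the quoting rule is a simplified assumption; this is not the library's print or parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if s must be quoted: empty, or contains white space or a quote. */
static int
needsquote(const char *s)
{
    if(*s == '\0')
        return 1;
    for(; *s != '\0'; s++)
        if(*s == ' ' || *s == '\t' || *s == '\n' || *s == '\'')
            return 1;
    return 0;
}

/*
 * Copy src to dst, quoting it if necessary and doubling embedded quotes.
 * Returns the length written, or -1 if dst is too small (dst untouched).
 */
int
quotefield(char *dst, size_t size, const char *src)
{
    size_t n, need;
    const char *s;
    int q;

    assert(dst != NULL && src != NULL);
    q = needsquote(src);
    need = q ? 2 : 0;
    for(s = src; *s != '\0'; s++)
        need += (*s == '\'') ? 2 : 1;
    if(need + 1 > size)
        return -1;
    n = 0;
    if(q)
        dst[n++] = '\'';
    for(s = src; *s != '\0'; s++){
        if(*s == '\'')
            dst[n++] = '\'';    /* a quote is written as two quotes */
        dst[n++] = *s;
    }
    if(q)
        dst[n++] = '\'';
    dst[n] = '\0';
    return (int)n;
}

/*
 * Split s in place into at most max white-space-separated fields,
 * removing quotes as it goes. Returns the number of fields found.
 */
int
splitfields(char *s, char **fld, int max)
{
    int n, inquote;
    char *w;

    assert(s != NULL && fld != NULL);
    for(n = 0; n < max; ){
        while(*s == ' ' || *s == '\t' || *s == '\n')
            s++;
        if(*s == '\0')
            break;
        fld[n++] = w = s;       /* w writes the unquoted field over the input */
        inquote = 0;
        while(*s != '\0'){
            if(*s == '\''){
                if(inquote && s[1] == '\''){
                    *w++ = '\'';    /* '' inside quotes is one quote */
                    s += 2;
                }else{
                    inquote = !inquote;
                    s++;
                }
                continue;
            }
            if(!inquote && (*s == ' ' || *s == '\t' || *s == '\n'))
                break;
            *w++ = *s++;
        }
        if(*s != '\0')
            s++;                /* step past the separator before ending the field */
        *w = '\0';
    }
    return n;
}
```

Given the text 42 10 20 30 'out of memory', splitfields with room for five fields returns 5, and the fifth field is the string out of memory with the quotes removed.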

Note the use of %q to produce a quoted-string representation of the exit status. The %q format is like %s but will wrap rc-style single quotes around the string if it contains white space or is otherwise ambiguous. The library routine tokenize can be used to parse data formatted this way: it splits white-space-separated fields but understands the %q quoting conventions. Here is how the wait library routine builds its Waitmsg from the data returned by await:

Waitmsg*
wait(void)
{
    int n, l;
    char buf[512], *fld[5];
    Waitmsg *w;

    n = await(buf, sizeof buf-1);
    if(n < 0)
        return nil;
    buf[n] = '\0';
    if(tokenize(buf, fld, nelem(fld)) != nelem(fld)){
        werrstr("couldn't parse wait message");
        return nil;
    }
    l = strlen(fld[4])+1;
