The Organization of Networks in Plan 9

Dave Presotto

Phil Winterbottom

presotto,philw@plan9.bell-labs.com

ABSTRACT

In a distributed system networks are of paramount importance. This paper describes the implementation, design philosophy, and organization of network support in Plan 9. Topics include network requirements for distributed systems, our kernel implementation, network naming, user interfaces, and performance. We also observe that much of this organization is relevant to current systems.

1. Introduction

Plan 9 [Pike90] is a general-purpose, multi-user, portable distributed system implemented on a variety of computers and networks. What distinguishes Plan 9 is its organization. The goals of this organization were to reduce administration and to promote resource sharing. One of the keys to its success as a distributed system is the organization and management of its networks.

A Plan 9 system comprises file servers, CPU servers and terminals. The file servers and CPU servers are typically centrally located multiprocessor machines with large memories and high speed interconnects. A variety of workstation-class machines serve as terminals connected to the central servers using several networks and protocols. The architecture of the system demands a hierarchy of network speeds matching the needs of the components. Connections between file servers and CPU servers are high-bandwidth point-to-point fiber links. Connections from the servers fan out to local terminals using medium speed networks such as Ethernet [Met80] and Datakit [Fra80]. Low speed connections via the Internet and the AT&T backbone serve users in Oregon and Illinois. Basic Rate ISDN data service and 9600 baud serial lines provide slow links to users at home.

Since CPU servers and terminals use the same kernel, users may choose to run programs locally on their terminals or remotely on CPU servers. The organization of Plan 9 hides the details of system connectivity, allowing both users and administrators to configure their environment to be as distributed or centralized as they wish. Simple commands support the construction of a locally represented name space spanning many machines and networks. At work, users tend to use their terminals like workstations, running interactive programs locally and reserving the CPU servers for data- or compute-intensive jobs such as compiling and computing chess endgames. At home or when connected over a slow network, users tend to do most work on the CPU server to minimize traffic on the slow links. The goal of the network organization is to provide the same environment to the user wherever resources are used.

2. Kernel Network Support

Networks play a central role in any distributed system. This is particularly true in Plan 9 where most resources are provided by servers external to the kernel. The importance of the networking code within the kernel is reflected by its size; of 25,000 lines of kernel code, 12,500 are network and protocol related. Networks are continually being added and the fraction of code devoted to communications is growing. Moreover, the network code is complex. Protocol implementations consist almost entirely of synchronization and dynamic memory management, areas demanding subtle error recovery strategies. The kernel currently supports Datakit, point-to-point fiber links, an Internet (IP) protocol suite and ISDN data service. The variety of networks and machines has raised issues not addressed by other systems running on commercial hardware supporting only Ethernet or FDDI.

2.1. The File System protocol

A central idea in Plan 9 is the representation of a resource as a hierarchical file system. Each process assembles a view of the system by building a name space [Needham] connecting its resources. File systems need not represent disc files; in fact, most Plan 9 file systems have no permanent storage. A typical file system dynamically represents some resource like a set of network connections or the process table. Communication between the kernel, device drivers, and local or remote file servers uses a protocol called 9P. The protocol consists of 17 messages describing operations on files and directories. Kernel resident device and protocol drivers use a procedural version of the protocol while external file servers use an RPC form. Nearly all traffic between Plan 9 systems consists of 9P messages. 9P relies on several properties of the underlying transport protocol. It assumes messages arrive reliably and in sequence and that delimiters between messages are preserved. When a protocol does not meet these requirements (for example, TCP does not preserve delimiters) we provide mechanisms to marshal messages before handing them to the system.
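One common way to restore message boundaries on a byte stream such as TCP is to prefix each message with its length. The following Python sketch illustrates that idea only. The 4-byte big-endian prefix and the names frame, read_exact, and read_messages are assumptions made for this example; they are not the marshaling format Plan 9 itself uses.

```python
import io
import struct

# Assumption for illustration: each message is preceded by its length,
# encoded as a 4-byte big-endian unsigned integer.

def frame(message: bytes) -> bytes:
    """Return the message with a length prefix, ready to write to a stream."""
    return struct.pack(">I", len(message)) + message

def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes, or raise EOFError if the stream ends early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended inside a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def read_messages(stream):
    """Yield each framed message from a byte stream, in order."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream between two messages
        if len(header) < 4:
            # A short read split the prefix; fetch the rest of it.
            header += read_exact(stream, 4 - len(header))
        (length,) = struct.unpack(">I", header)
        yield read_exact(stream, length)

# Three messages written back to back are recovered as three messages.
wire = frame(b"first") + frame(b"") + frame(b"third")
recovered = list(read_messages(io.BytesIO(wire)))
```

Because the reader always knows how many bytes belong to the current message, the receiving side sees one message per delivery even though the stream itself carries no delimiters.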

A kernel data structure, the channel, is a handle to a file server. Operations on a channel generate the following 9P messages. The session and attach messages authenticate a connection, established by means external to 9P, and validate its user. The result is an authenticated channel referencing the root of the server. The clone message makes a new channel identical to an existing channel, much like the dup system call. A channel may be moved to a file on the server using a walk message to descend each level in the hierarchy. The stat and wstat messages read and write the attributes of the file referenced by a channel. The open message prepares a channel for subsequent read and write messages to access the contents of the file. Create and remove perform the actions implied by their names on the file referenced by the channel. The clunk message discards a channel without affecting the file.
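The effect of clone, walk, open, read, and clunk on a channel can be modeled with a toy in-memory file tree. The Python sketch below is an illustration under loud assumptions: the Channel class, the attach function, and the nested-dictionary server are hypothetical, authentication is omitted, and nothing here reflects the kernel's actual data structures or the 9P wire format.

```python
class Channel:
    """Toy stand-in for a channel: a handle to one node of a server's tree."""

    def __init__(self, tree, path=()):
        self.tree = tree          # nested dicts; a dict is a directory, bytes a file
        self.path = tuple(path)   # names walked so far, starting at the root
        self.is_open = False
        self.clunked = False

    def _node(self):
        if self.clunked:
            raise ValueError("channel was clunked")
        node = self.tree
        for name in self.path:
            node = node[name]
        return node

    def clone(self):
        """clone: make a new channel referring to the same file."""
        self._node()
        return Channel(self.tree, self.path)

    def walk(self, name):
        """walk: move this channel down one level of the hierarchy."""
        node = self._node()
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(name)
        self.path += (name,)
        return self

    def open(self):
        """open: prepare the channel for subsequent reads."""
        self._node()
        self.is_open = True
        return self

    def read(self):
        """read: return the contents of the file the channel refers to."""
        node = self._node()
        if not self.is_open:
            raise PermissionError("channel is not open")
        if isinstance(node, dict):
            raise IsADirectoryError("/".join(self.path))
        return node

    def clunk(self):
        """clunk: discard the channel; the file itself is unaffected."""
        self.clunked = True


def attach(tree):
    """attach: return a channel on the root of the server (no authentication)."""
    return Channel(tree)


# Clone the root channel, then walk the clone one level at a time.
server = {"net": {"tcp": {"2": {"ctl": b"2"}}}}
root = attach(server)
ctl = root.clone().walk("net").walk("tcp").walk("2").walk("ctl").open()
contents = ctl.read()
ctl.clunk()
```

Cloning before walking leaves the root channel in place, so the same root can be reused to reach other files on the server.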

A kernel resident file server called the mount driver converts the procedural version of 9P into RPCs. The mount system call provides a file descriptor, which can be a pipe to a user process or a network connection to a remote machine, to be associated with the mount point. After a mount, operations on the file tree below the mount point are sent as messages to the file server. The mount driver manages buffers, packs and unpacks parameters from messages, and demultiplexes among processes using the file server.

2.2. Kernel Organization

The network code in the kernel is divided into three layers: hardware interface, protocol processing, and program interface. A device driver typically uses streams to connect the two interface layers. Additional stream modules may be pushed on a device to process protocols. Each device driver is a kernel-resident file system. Simple device drivers serve a single level directory containing just a few files; for example, we represent each UART by a data and a control file.

cpu% cd /dev
cpu% ls -l eia*
--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia1
--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia1ctl
--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia2
--rw-rw-rw- t 0 bootes bootes 0 Jul 16 17:28 eia2ctl
cpu%

The control file is used to control the device; writing the string b1200 to /dev/eia1ctl sets the line to 1200 baud.
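Since the control interface is an ordinary file, any program that can write a string can configure the line. A minimal Python sketch follows; the function name set_line_speed is hypothetical, and the assumption that other speeds follow the same b<rate> pattern as b1200 is ours, not something stated above.

```python
def set_line_speed(ctl_path: str, baud: int) -> None:
    """Write a control string such as 'b1200' to a UART control file.

    Assumption: speeds other than 1200 use the same 'b<rate>' form.
    """
    with open(ctl_path, "w") as ctl:
        ctl.write(f"b{baud}")
```

On a Plan 9 machine the call would be set_line_speed("/dev/eia1ctl", 1200); elsewhere the function can be exercised against any writable file.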

Multiplexed devices present a more complex interface structure. For example, the LANCE Ethernet driver serves a two level file tree (Figure 1) providing

∙ device control and configuration

∙ user-level protocols like ARP

∙ diagnostic interfaces for snooping software.

The top directory contains a clone file and a directory for each connection, numbered 1 to n. Each connection directory corresponds to an Ethernet packet type. Opening the clone file finds an unused connection directory and opens its ctl file. Reading the control file returns the ASCII connection number; the user process can use this value to construct the name of the proper connection directory. In each connection directory, files named ctl, data, stats, and type provide access to the connection. Writing the string connect 2048 to the ctl file sets the packet type to 2048 and configures the connection to receive all IP packets sent to the machine. Subsequent reads of the file type yield the string 2048. The data file accesses the media; reading it returns the next packet of the selected type. Writing the file queues a packet for transmission after prepending a packet header containing the source address and packet type. The stats file returns ASCII text containing the interface address, packet input/output counts, error statistics, and general information about the state of the interface.
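The clone sequence described above can be sketched in a few lines of Python. This is an illustration rather than the interface's definitive usage: the function names reserve_connection and select_packet_type are hypothetical, and the sketch assumes the connection stays reserved only while the file opened through clone remains open, which is why that file object is returned to the caller.

```python
import os

def reserve_connection(net_dir: str):
    """Open the clone file and locate the connection directory it names.

    Returns (clone_file, connection_directory). The caller should keep
    clone_file open for as long as the connection is in use (assumption).
    """
    clone = open(os.path.join(net_dir, "clone"), "r")
    number = clone.read().strip()        # ASCII connection number
    return clone, os.path.join(net_dir, number)

def select_packet_type(conn_dir: str, packet_type: int) -> None:
    """Write 'connect <type>' to the connection's ctl file."""
    with open(os.path.join(conn_dir, "ctl"), "w") as ctl:
        ctl.write(f"connect {packet_type}")
```

A caller would pass the directory served by the driver as net_dir, then call select_packet_type(conn_dir, 2048) to receive IP packets and read them from the data file in conn_dir.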

If several connections on an interface are configured for a particular packet type, each receives a copy of the incoming packets. The special packet type -1 selects all packets. Writing the strings promiscuous and connect -1 to the ctl file configures a conversation to receive all packets on the Ethernet.
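The delivery rule, one copy per matching connection with -1 matching every type, can be modeled in memory. The Python sketch below is a toy under stated assumptions: the Interface class and its method names are hypothetical, and it ignores addressing and promiscuous mode entirely.

```python
ALL_TYPES = -1   # the special packet type that selects all packets

class Interface:
    """Toy model of per-connection packet delivery on one interface."""

    def __init__(self):
        self.connections = {}   # connection number -> (packet type, queue)

    def connect(self, number: int, packet_type: int) -> None:
        """Configure a connection to receive packets of one type."""
        self.connections[number] = (packet_type, [])

    def receive(self, packet_type: int, payload: bytes) -> None:
        """Queue a copy of an incoming packet on every matching connection."""
        for wanted, queue in self.connections.values():
            if wanted == packet_type or wanted == ALL_TYPES:
                queue.append(payload)

    def pending(self, number: int) -> int:
        """Return how many packets are waiting on a connection."""
        return len(self.connections[number][1])

    def read(self, number: int) -> bytes:
        """Return the next queued packet for a connection."""
        return self.connections[number][1].pop(0)
```

With two connections set to type 2048 and a third set to -1, a single incoming packet of type 2048 is queued three times, once per connection.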

Although the driver interface may seem elaborate, the representation of a device as a set of files using ASCII strings for communication has several advantages. Any mechanism supporting remote access to files immediately allows a remote machine to use our interfaces as gateways. Using ASCII strings to control the interface avoids byte-order problems, ensures a uniform representation for devices on the same machine, and even allows devices to be accessed remotely. Representing dissimilar devices by the same set of files allows common tools to serve several networks or interfaces. Programs like stty are replaced by echo and shell redirection.

2.3. Protocol devices

Network connections are represented as pseudo-devices called protocol devices. Protocol device drivers exist for the Datakit URP protocol and for each of the Internet IP protocols TCP, UDP, and IL. IL, described below, is a new communication protocol used by Plan 9 for transmitting file system RPCs. All protocol devices look identical, so user programs contain no network-specific code.

Each protocol device driver serves a directory structure similar to that of the Ethernet driver. The top directory contains a clone file and a directory for each connection numbered 0 to n. Each connection directory contains files to control one connection and to send and receive information. A TCP connection directory looks like this:

cpu% cd /net/tcp/2
cpu% ls -l
--rw-rw---- I 0 ehg    bootes 0 Jul 13 21:14 ctl
--rw-rw---- I 0 ehg    bootes 0 Jul 13 21:14 data
--rw-rw---- I 0 ehg    bootes 0 Jul 13 21:14 listen
cpu%
