Security in Plan 9

Russ Cox, MIT LCS

Eric Grosse, Bell Labs

Rob Pike, Bell Labs

Dave Presotto, Avaya Labs and Bell Labs

Sean Quinlan, Bell Labs

{rsc,ehg,rob,presotto,seanq}@plan9.bell-labs.com

ABSTRACT

The security architecture of the Plan 9™ operating system has recently been redesigned to address some technical shortcomings. This redesign also provided an opportunity to make the system more convenient to use securely. Plan 9 has thus improved in two ways not usually seen together: it has become more secure and easier to use.

The central component of the new architecture is a per-user self-contained agent called factotum. Factotum securely holds a copy of the user’s keys and negotiates authentication protocols, on behalf of the user, with secure services around the network. Concentrating security code in a single program offers several advantages including: ease of update or repair to broken security software and protocols; the ability to run secure services at a lower privilege level; uniform management of keys for all services; and an opportunity to provide single sign on, even to unchanged legacy applications. Factotum has an unusual architecture: it is implemented as a Plan 9 file server.

1. Introduction

Secure computing systems face two challenges: first, they must employ sophisticated technology that is difficult to design and prove correct; and second, they must be easy for regular people to use. The question of ease of use is sometimes neglected, but it is essential: weak but easy-to-use security can be more effective than strong but difficult-to-use security if it is more likely to be used. People lock their front doors when they leave the house, knowing full well that a burglar is capable of picking the lock (or avoiding the door altogether); yet few would accept the cost and awkwardness of a bank vault door on the house even though that might reduce the probability of a robbery. A related point is that users need a clear model of how the security operates (if not how it actually provides security) in order to use it well; for example, the clarity of a lock icon on a web browser is offset by the confusing and typically insecure steps for installing X.509 certificates.

The security architecture of the Plan 9 operating system [Pike95] has recently been redesigned to make it both more secure and easier to use. By security we mean three things: first, the business of authenticating users and services; second, the safe handling, deployment, and use of keys and other secret information; and third, the use of encryption and integrity checks to safeguard communications from prying eyes.

The old security architecture of Plan 9 had several engineering problems in common with other operating systems. First, it had an inadequate notion of security domain. Once a user provided a password to connect to a local file store, the system required that the same password be used to access all the other file stores. That is, the system treated all network services as belonging to the same security domain.

Second, the algorithms and protocols used in authentication, by nature tricky and difficult to get right, were compiled into the various applications, kernel modules, and file servers. Changes and fixes to a security protocol required that all components using that protocol be recompiled, or at least relinked, and restarted.

Third, the file transport protocol, 9P [Pike93], that forms the core of the Plan 9 system, had its authentication protocol embedded in its design. This meant that fixing or changing the authentication used by 9P required deep changes to the system. If someone were to find a way to break the protocol, the system would be wide open and very hard to fix.

These and a number of lesser problems, combined with a desire for more widespread use of encryption in the system, spurred us to rethink the entire security architecture of Plan 9.

The centerpiece of the new architecture is an agent, called factotum, that handles the user’s keys and negotiates all security interactions with system services and applications. Like a trusted assistant with a copy of the owner’s keys, factotum does all the negotiation for security and authentication. Programs no longer need to be compiled with cryptographic code; instead they communicate with factotum agents that represent distinct entities in the cryptographic exchange, such as a user and server of a secure service. If a security protocol needs to be added, deleted, or modified, only factotum needs to be updated for all system services to be kept secure.

Building on factotum, we modified secure services in the system to move user authentication code into factotum; made authentication a separable component of the file server protocol; deployed new security protocols; designed a secure file store, called secstore, to protect our keys but make them easy to get when they are needed; designed a new kernel module to support transparent use of Transport Layer Security (TLS) [RFC2246]; and began using encryption for all communications within the system. The overall architecture is illustrated in Figure 1a.

Figure 1a. Components of the security architecture. Each box is a (typically) separate machine; each ellipse a process. The ellipses labeled FX are factotum processes; those labeled PX are the pieces and proxies of a distributed program. The authentication server is one of several repositories for users’ security information that factotum processes consult as required. Secstore is a shared resource for storing private information such as keys; factotum consults it for the user during bootstrap.

Secure protocols and algorithms are well understood and are usually not the weakest link in a system’s security. In practice, most security problems arise from buggy servers, confusing software, or administrative oversights. It is these practical problems that we are addressing. Although this paper describes the algorithms and protocols we are using, they are included mainly for concreteness. Our main intent is to present a simple security architecture built upon a small trusted code base that is easy to verify (whether by manual or automatic means), easy to understand, and easy to use.

Although it is a subjective assessment, we believe we have achieved our goal of ease of use. That we have achieved our goal of improved security is supported by our plan to move our currently private computing environment onto the Internet outside the corporate firewall. The rest of this paper explains the architecture and how it is used, to show why a system that is easy to use securely is also safe enough to run in the open network.

2. An Agent for Security

One of the primary reasons for the redesign of the Plan 9 security infrastructure was to remove the authentication method both from the applications and from the kernel. Cryptographic code is large and intricate, so it should be packaged as a separate component that can be repaired or modified without altering or even relinking applications and services that depend on it. If a security protocol is broken, it should be trivial to repair, disable, or replace it on the fly. Similarly, it should be possible for multiple programs to use a common security protocol without embedding it in each program.

Some systems use dynamically linked libraries (DLLs) to address these configuration issues. The problem with this approach is that it leaves security code in the same address space as the program using it. The interactions between the program and the DLL can therefore accidentally or deliberately violate the interface, weakening security. Also, a program using a library to implement secure services must run at a privilege level necessary to provide the service; separating the security to a different program makes it possible to run the services at a weaker privilege level, isolating the privileged code to a single, more trustworthy component.

Following the lead of the SSH agent [Ylon96], we give each user an agent process responsible for holding and using the user’s keys. The agent program is called factotum because of its similarity to the proverbial servant with the power to act on behalf of his master because he holds the keys to all the master’s possessions. It is essential that factotum keep the keys secret and use them only in the owner’s interest. Later we’ll discuss some changes to the kernel to reduce the possibility of factotum leaking information inadvertently.
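The agent idea can be illustrated with a minimal sketch. This is not factotum's code or any of its protocols: the HMAC challenge-response below is a stand-in chosen for brevity, the names Agent, verify, and client_login are hypothetical, and the agent is an in-process object here, whereas factotum is a separate process. The point it shows is only that the application relays bytes and never sees the key.

```python
import hashlib
import hmac
import os


class Agent:
    """Toy stand-in for a per-user agent (hypothetical, not factotum).

    It holds the secret and hands out only proofs derived from it,
    never the secret itself.
    """

    def __init__(self, secret: bytes) -> None:
        self._secret = secret  # stays inside the agent

    def respond(self, challenge: bytes) -> bytes:
        # Prove knowledge of the secret by keying a MAC over the challenge.
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()


def verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Service-side check; assumes the service shares the same secret."""
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


def client_login(agent: Agent, challenge: bytes) -> bytes:
    # The application contains no cryptographic code of its own:
    # it only passes the challenge to the agent and returns the answer.
    return agent.respond(challenge)
```

Because the cryptography lives only in Agent, replacing the protocol in this sketch would mean changing Agent and verify while client_login stays untouched, which mirrors the motivation given above.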

Factotum is implemented, like most Plan 9 services, as a file server. It is conventionally mounted upon the directory /mnt/factotum, and the files it serves there are analogous to virtual devices that provide access to, and control of, the services of the factotum. The next few sections describe the design of factotum and how it operates with the other pieces of Plan 9 to provide security services.

2.1. Logging in

To make the discussions that follow more concrete, we begin with a couple of examples showing how the Plan 9 security architecture appears to the user. These examples both involve a user gre logging in after booting a local machine. The user may or may not have a secure store in which all his keys are kept. If he does, factotum will prompt him for the password to the secure store and obtain keys from it, prompting only when a key isn’t found in the store. Otherwise, factotum must prompt for each key.
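The lookup behavior just described (use a key from the store when one is found, prompt otherwise) can be sketched as follows. This is an illustration under stated assumptions, not factotum's implementation: keys are modeled as plain attribute dictionaries, the names matches, find_key, and prompt are hypothetical, and keeping a newly entered key in memory afterwards is an assumption of the sketch.

```python
from typing import Callable, Dict, List

Key = Dict[str, str]  # hypothetical model: a key is a set of attr=value pairs


def matches(key: Key, wanted: Key) -> bool:
    """A key matches when it carries every requested attribute/value pair."""
    return all(key.get(name) == value for name, value in wanted.items())


def find_key(keys: List[Key], wanted: Key,
             prompt: Callable[[Key], Key]) -> Key:
    """Return a held key matching `wanted`, prompting only on a miss."""
    for key in keys:
        if matches(key, wanted):
            return key
    new_key = prompt(wanted)  # ask the user for the missing key
    keys.append(new_key)      # assumption: remember it for later requests
    return new_key
```

A user without a secure store corresponds to starting with an empty keys list, so every first request falls through to prompt; a user with a store starts with the list already populated.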

In the typescripts, \n represents a literal newline character typed to force a default response. User input is in italics, and long lines are folded and indented to fit.

This first example shows a user logging in without help from the secure store. First, factotum prompts for a user name that the local kernel will use:

user[none]: gre

(Default responses appear in square brackets.) The kernel then starts accessing local resources and requests, through factotum, a user/password pair to do so:

!Adding key: dom=cs.bell-labs.com

    proto=p9sk1

user[gre]: \n
