Figure 1. The ClamAV virus scanner. Circles represent processes,
rectangles represent files and directories, and rounded rectangles
represent devices. Arrows represent the expected data flow for a
well-behaved virus scanner.
[Figure components: AV Helper, AV Scanner, User TTY, Update Daemon, /tmp, User Data, Virus DB, Network.]
Some of these attacks can be mitigated by running the
scanner with its own user ID in a chroot jail [7]. However, doing
so requires highly privileged, application-specific code to set
up the chroot environment, and risks breaking the scanner
or one of its helper programs due to missing files [7]. Other
attacks, such as those involving sockets or System V IPC,
can be prevented only by modifying the kernel to restrict
certain system calls. Unfortunately, devising an appropriate
policy in terms of system call arguments is an error-prone
task, which, if incorrectly done, risks leaking private data or
interfering with operation of a legitimate scanner.
A better way to specify the desired policy is in terms of
where information should flow—namely, along the arrows in
the figure. While Linux cannot enforce such a policy, HiStar
can. Figure 2 shows our port of ClamAV to HiStar. There are
two differences from Linux. First, we have labeled files with
private user data as tainted. Tainting a file prevents its
contents from flowing to any untainted component, including the
network. The second difference from Linux is that we have
launched the scanner from a new, 110-line program called
wrap, which has untainting privileges. Wrap untaints the
virus scanner’s result and reports it back to the user. The scanner cannot read tainted user files without first tainting itself.
Once tainted, it can no longer convey information to the network or the update daemon. As long as wrap is correctly implemented, ClamAV cannot leak the contents of the files it scans.
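The policy in Figure 2 can be illustrated with a small model. The Python below is only a sketch under simplifying assumptions, not HiStar's interface: taint is reduced to a single boolean, and the names `Component`, `can_flow`, `read`, and `can_untaint` are hypothetical, chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    tainted: bool = False
    can_untaint: bool = False  # hypothetical: models wrap's untainting privilege


def can_flow(src: Component, dst: Component) -> bool:
    """Toy rule: tainted data may reach only tainted or privileged components."""
    return (not src.tainted) or dst.tainted or dst.can_untaint


def read(reader: Component, source: Component) -> None:
    """Model a read as a flow from source to reader; refuse it if disallowed."""
    if not can_flow(source, reader):
        raise PermissionError(
            f"{reader.name} must taint itself to read {source.name}"
        )


user_data = Component("User Data", tainted=True)
virus_db = Component("Virus DB")
network = Component("Network")
tty = Component("User TTY")
scanner = Component("AV Scanner")
wrap = Component("wrap", can_untaint=True)

assert can_flow(virus_db, scanner)     # untainted data flows freely
try:
    read(scanner, user_data)           # an untainted scanner cannot read user files
except PermissionError:
    scanner.tainted = True             # so the scanner taints itself first
read(scanner, user_data)               # now the read is allowed

assert not can_flow(scanner, network)  # a tainted scanner cannot reach the network
assert can_flow(scanner, wrap)         # wrap may receive the scanner's result
assert can_flow(wrap, tty)             # and relay it to the user's terminal
```

In this model the only component that must be trusted is `wrap`: every other path from the tainted scanner to an untainted component is rejected by the single check in `can_flow`.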
Although HiStar’s tainting mechanism appears simple
at a high level, making it work in practice requires addressing a number of challenges. First, there are myriad ways
in which data can leak out onto the network, as illustrated
above with Linux. How would an operating system like
HiStar know to check the taint of the data being leaked for
each and every one of them? Second, a typical OS kernel
already provides a wide range of protection mechanisms,
including user IDs, process memory protection, chroot jails,
and so on. How can we avoid further complicating the kernel with yet another mechanism, or at the very least, avoid unexpected interactions between the many disparate protection
mechanisms? Finally, managing the tainting of files and
the untainting privileges requires a separate mechanism,
which can equally well be the target of attacks. One answer
is to allow only the system administrator (root) access to
this mechanism, but doing so both hampers the ability of
other applications to use this mechanism and increases the
amount of fully privileged code running as root.
Figure 2. ClamAV running in HiStar. Lightly shaded components
are tainted, which prevents them from conveying any information
to untainted (unshaded) components. The strongly shaded wrap
has untainting privileges, allowing it to relay the scanner’s output
to the terminal.
[Figure components: AV Helper, AV Scanner, wrap, User TTY, Update Daemon, Private /tmp, User Data, Virus DB, Network.]
HiStar addresses these challenges with three key ideas.
First, instead of implementing a traditional Unix interface, the
kernel provides a lower-level interface, consisting of six types
of kernel objects and a small number of operations that make
any information flows between objects explicit. This provides
a correspondingly small number of places where the kernel
must perform data flow checks. Second, the only protection
mechanism provided by the kernel is an information flow control mechanism, which generalizes the intuition behind taint.
All other forms of protection, including Unix user IDs, process memory protection, and tainting itself, are implemented
in terms of information flow control. This both reduces the
amount of trusted kernel code and avoids any ambiguity about
how the mechanisms will form a coherent policy. Finally,
HiStar’s information flow control mechanism is egalitarian,
meaning that it can be used by any process, not just by the superuser, which further reduces the amount of fully trusted code.
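One way to picture how a single flow check can stand in for several protection mechanisms is to replace the boolean taint with a set of named categories. The sketch below is a deliberate simplification for illustration, not HiStar's actual label format or kernel interface; `Obj`, `new_category`, `can_flow`, and the `owns` field are hypothetical names. It assumes that a flow is allowed when every category on the source that the source does not own is either present on the destination or owned by it.

```python
import itertools
from dataclasses import dataclass

_ids = itertools.count(1)


def new_category(hint: str) -> str:
    """Any process may allocate a fresh category; no superuser is involved."""
    return f"{hint}#{next(_ids)}"


@dataclass(frozen=True)
class Obj:
    name: str
    label: frozenset = frozenset()  # categories tainting this object
    owns: frozenset = frozenset()   # categories this object may untaint


def can_flow(src: Obj, dst: Obj) -> bool:
    """Assumed rule: src's unowned categories must be on dst or owned by dst."""
    return (src.label - src.owns) <= (dst.label | dst.owns)


alice = new_category("alice")  # stands in for a Unix user ID
scan = new_category("scan")    # stands in for the virus-scan taint

alice_shell = Obj("alice shell", owns=frozenset({alice, scan}))
alice_file = Obj("alice file", label=frozenset({alice}))
bob_shell = Obj("bob shell")
scanner = Obj("scanner", label=frozenset({alice, scan}))
network = Obj("network")

assert can_flow(alice_file, alice_shell)    # the owner can read her own file
assert not can_flow(alice_file, bob_shell)  # user-ID-style isolation
assert can_flow(alice_file, scanner)        # a tainted scanner may read the file
assert not can_flow(scanner, network)       # but cannot leak it to the network
assert can_flow(scanner, alice_shell)       # the owner can untaint the result
```

Here user isolation and scan taint are both expressed through `can_flow`, and the categories are created by ordinary code rather than by a privileged administrator, which is the sense of "egalitarian" used above.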
Though we used the virus scanner as an example, many
security problems can be couched in terms of information
flow. For example, protecting users’ private profiles on a
Web site often boils down to ensuring one person’s information (Social Security number, credit card, etc.) cannot
be sent to another user’s browser. Protecting against trojan
horses means ensuring network payloads do not affect the
contents of system files. Protecting passwords means ensuring that whatever code verifies them can reveal only the single bit signifying whether or not authentication succeeded.
The rest of this paper describes how HiStar provides a new,
Unix-like environment in which small amounts of code can
secure much larger, untrusted applications by enforcing
such policies.
2. DESIGN
The HiStar kernel is organized around six object types,
shown in Figure 3: a segment (a variable-length byte array
similar to a file), an address space (a mapping from virtual
memory addresses to segment object names), a network