news
Science | DOI: 10.1145/1646353.1646361
type theory
comes of Age
Type systems are moving beyond the realm of data structure
and into more complex domains like security and networking.
WHEn THE pHIlOSOpHEr Bertrand Russell in- vented type theory at he beginning of the 20th century, he could
hardly have imagined that his solution
to a simple logic paradox—defining
the set of all sets not in themselves—
would one day shape the trajectory of
21st century computer science.
Once the province of mathematicians and social scientists, type theory
has gained momentum in recent years
as a powerful tool for ensuring data
consistency and error-free program
execution in modern commercial programming languages like C#, Java,
Ruby, Haskell, and others. And thanks
to recent innovations in the field, type
systems are now moving beyond the
realm of data structure and into more
complex domains like security and networking.
First, a quick primer. In programming languages, a type constitutes a
definition of a set of values (for example,
“all integers”), and the allowable operations on those values (for example,
addition and multiplication). A type
system ensures the correct behavior of
any program routine by enforcing a set
of predetermined behaviors. For example, in a multiplication routine, a type
system might guarantee that a program
will only accept arguments in the form
of numerical values. When other values
appear—like a date or a text string—
the system will return an error. For
programmers, type systems help prevent undetected execution errors. For
language implementers, they optimize
execution and storage efficiency. For example, in Java integers are represented
in the form of 32 bits, while doubles
are represented as 64 bits. So, when a
Java routine multiplies two numbers,
the type system guarantees they are either integers or doubles. Without that
Benjamin c. Pierce,
university of Pennsylvania.
guarantee, the runtime would need to
conduct an expensive check to determine what kinds of numbers were being multiplied before it could complete
the routine.
What distinguishes a type system
from more conventional program-level
verification? First, a type system must
be “decidable”; that is, the checking
should happen mechanically at the earliest opportunity (although this does not
have to happen at compilation time; it
can also be deferred to runtime). A type
system should also be transparent; that
is to say, a programmer should be able
to tell whether a program is valid or not
regardless of the particular checking algorithm being used. Finally, a “sound”
type system prevents a program from
performing any operation outside its
semantics, like manipulating arbitrary
memory locations.
Languages without a sound type
system are sometimes called unsafe or
weakly typed languages. Perhaps the
best-known example of a weakly typed
Alex Wright
system is C. While C does provide types,
its type checking system has been intentionally compromised to provide direct
access to low-level machine operations
using arbitrary pointer arithmetic, casting, and explicit allocation and deallo-cation. However, these maneuvers are
fraught with risk, sometimes resulting
in programs riddled with bugs like buffer overflo ws and dangling pointers that
can cause security vulnerabilities.
By contrast, languages like Java,
C# , Ruby, Javascript, Python, ML, and
Haskell are strongly typed (or “type
safe”). Their sound type systems catch
any type system violations as early as
possible, freeing the programmer to
focus debugging efforts solely on valid
program operations.
static and Dynamic systems
Broadly speaking, type systems come in
two flavors: static and dynamic. Statically typed languages catch almost all
errors at compile time, while dynamically typed languages check most errors at runtime. The past 20 years have
seen the dominance of statically typed
languages like Java, C# , Scala, ML, and
Haskell. In recent years, however, dynamically typed languages like Scheme,
Smalltalk, Ruby, Javascript, Lua, Perl,
and Python have gained in popularity
for their ease of extending programs at
runtime by adding new code, new data,
or even manipulating the type system at
runtime.
Statically typed languages have restrictions and annotations that make
it possible to check most type errors at
compile time. The information used
by the type checker can also be used by
tools that help with program text-edit-ing and refactoring, which is a considerable advantage for large modular programs. Moreover, static type systems
enable change. For example, when an
important data structure definition is