108 COMMUNICATIONS OF THE ACM | DECEMBER 2018 | VOL. 61 | NO. 12
major browser vendors and an online community group to
build a common solution for high-performance
applications.a
While the Web is the primary motivation for WebAssembly,
its design—despite the name—carefully avoids any depen-
dencies on the Web. It is an open standard intended for
embedding in a broad variety of environments, and other
such embeddings are already being developed.
To our knowledge, WebAssembly also is the first industrial-strength language that has been designed with
a formal semantics from the start. It not only demonstrates
the “real world” feasibility of applying formal techniques,
but also that they lead to a remarkably clean and simple
design.
2. A TOUR OF THE LANGUAGE
Even though WebAssembly is a binary code format, we define
it as a programming language with syntax and structure. As
we will see, that makes it easier to explain and understand
and moreover, allows us to apply well-established formal
techniques for defining its semantics and for reasoning
about it. Hence, Figure 1 presents WebAssembly in terms of
a grammar for its abstract syntax.
2. 1. Basics
Let us start by introducing a few unsurprising concepts
before diving into less obvious ones in the following.
Modules. A WebAssembly binary takes the form of a
module. It contains definitions for functions, globals, tables, and
memories.b Definitions may be exported or imported.
While a module corresponds to the static representation
of a program, a module’s dynamic representation is an
instance, complete with its mutable state. Instantiating a
module requires providing definitions for all imports, which
may be exports from previously created instances. Computations is initiated by invoking an exported function.
Modules provide both encapsulation and sandboxing:
because a client can only access the exports of a module,
other internals are protected from tampering; dually, a
module can only interact with its environment through its
imports which are provided by a client, so that the client has
full control over the capabilities given to a module. Both
these aspects are essential ingredients to the safety of
WebAssembly.
Functions. The code in a module is organized into individual functions, taking parameters and returnng results as
defined by its function type. Functions can call each other,
including recursively, but are not first class and cannot be
nested. The call stack for execution is not exposed, and thus
cannot be directly accessed by a running WebAssembly program, even a buggy or malicious one.
Instructions. WebAssembly is conceptually based on a
stack machine: code for a function consists of a sequence of
instructions that manipulate values on an implicit operand
stack. However, thanks to the type system (Section 3. 2), the
layout of the operand stack can be statically determined at
any point in the code, so that implementations can compile
the data flow between instructions directly without ever
materializing the operand stack. The stack organization is
merely a way to achieve a compact program representation,
as it has been shown to be smaller than a register machine.
Traps. Some instructions may produce a trap, which
immediately aborts the current computation. Traps cannot
(currently) be handled by WebAssembly code, but an embed-
der will typically provide means to handle this condition, for
example, by reifying them as JavaScript exceptions.
Machine types. WebAssembly has only four basic value
types t to compute with. These are integers and IEEE 754
floating point numbers, each with 32 or 64 bits, as available
in common hardware. Most WebAssembly instructions pro-
vide simple operators on these data types. The grammar in
Figure 1 conveniently distinguishes categories such as unary
and binary operators, tests, comparisons, and conversions.
Like hardware, WebAssembly makes no distinction between
signed and unsigned integer types. Instead, where it mat-
ters, a sign extension suffix _u or _s to an instruction selects
either unsigned or two’s complement signed behavior.
Variables. Functions can declare mutable local variables,
which essentially provides an infinite set of zero-initialized
virtual registers. A module may also declare typed global vari-
ables that can be either mutable or immutable and require
an explicit initializer. Importing globals allows a limited
b WebAssembly’s text format closely resembles this syntax. For brevity we
omit minor features regarding initialization of modules.
(value types) t ::= i32 | i64 | f32 | f64
(packed types) pt ::= i8 | i16 | i32
(function types) ft ::= t∗ → t∗
(global types) gt ::= mut? t
unopiN ::= clz | ctz | popcnt
unopfN ::= neg | abs | ceil | floor | trunc | nearest | sqrt
binopiN ::= add | sub | mul | div_sx | rem_sx |
and | or | xor | shl | shr_sx | rotl | rotr
binopfN ::= add | sub | mul | div | min | max | copysign
testopiN ::= eqz
relopiN ::= eq | ne | lt_sx | gt_sx | le_sx | ge_sx
relopfN ::= eq | ne | lt | gt |le | ge
cvtop ::= convert | reinterpret
sx ::= s | u
(instructions) e ::= unreachable | nop | drop | select |
block ft e∗ end | loop ft e∗end | if ft e∗ else e∗ end |
br i | br_if i | br_table i+ | return | call i | call_indirect ft |
get_local i | set_local i | tee_local i | get_global i |
set_global i | t.load (pt_sx) a o | t.store pt a o |
memory.size | memory.grow | t.const c |
t.unopt | t.binopt | t.testopt | t.relopt | t.cvtop t_sx?
(functions) func ::= ex∗ func ft local t∗ e∗ | ex∗ func ft im
(globals) glob ::= ex∗ global gt e∗ | ex∗ global gt im
(tables) tab ::= ex∗ table n i ∗ | ex∗ table n im
(memories) mem ::= ex∗ memory n | ex∗ memory n im
(imports) im ::= import “name” “name”
(exports) ex ::= export “name”
(modules) mod ::= module func∗ glob∗ tab mem?
Figure 1. WebAssembly abstract syntax.