3. 2. Validation
On the Web, code is fetched from untrusted sources and
must be validated. Validation rules for WebAssembly are
defined succinctly as a type system. This type system is, by
design, embarrassingly simple, and designed to be efficiently checkable in a single linear pass.
Typing rules. Again, we utilize standard techniques for
defining our semantics declaratively, this time via a system of
natural deduction rules. 12 Figure 3 shows an excerpt of the
rules for typing instructions. They collectively define a
judgement C e : ft, which can be read as “instruction e is valid with
type ft under the assumptions embodied in the context C.”
The context records the types of all declarations that are in
scope at a given point in a program. The type of an instruction is a function type that specifies its required input stack
and the provided output stack.
Each rule consists of a conclusion (the part below the
bar) and a possibly empty list of premises (the pieces above
the bar). It can be read as a big implication: the conclusion
holds if all premises hold. One rule exists for each instruction, defining when it is well-typed. A program is valid if
and only if the rules can inductively derive that it is
well-typed.
For example, the rules for constants and simple numeric
operators are trivial axioms, since they do not even require a
premise: an instruction of the form t.binop always has type t
t → t, that is, consumes two operands of type t and pushes
one. The rules for control constructs require that their type
matches the explicit annotation ft, and they extend the context with a local label when checking the inner block. Label
types are looked up in the context when typing branch
Configurations. In general, execution operates relative to
a global store s as well as the current function’s frame f. Both
are defined in the upper part of Figure 2.
The store models the global state of a program and is a
record of the lists of function, global, table and memory
instances that have been allocated. An index into one of the
store components is called an address a. We use notation
like sfunc(a) to look up the function at address a. A module
instance then maps static indices i that occur in instructions
to their respective dynamic addresses in the store.
To that end, each frame carries—besides the state of
the function’s local variables—a link to the module instance
it resides in, applying notational short-hands like ftable for
( fmodule)table. Essentially, every function instance is a closure
over the function’s module instance. An implementation
can eliminate these closures by specializing generated
machine code to a module instance.
For example, for a direct call i, the respective function
instance is looked up in the store through the frame of the
caller. Similarly, for an indirect call, the current table is looked
up and the callee’s address is taken from there (if the function’s type ft does not match the expected type, then a trap is
generated). Both kinds of calls reduce to a common administrative instruction call fi performing a call to a known function
instance; reducing that further creates a respective frame for
the callee and initializes its locals.
Together, the triple s; f; e of store, frame, and instruction
sequence forms a configuration that represents the complete
state of the WebAssembly abstract machine at a given point
in time. Reduction rules, in their full generality, then rewrite
configurations not just instruction sequences.
(store) s ::= {func fi∗, global v∗, table ti∗, mem mi∗}
(frames) f ::= {module m, local v∗}
(module instances) m ::= {func a∗, global a*, table a?, mem a?}
(function instances) fi ::= {module m, code func } (where func is not an import and has all exports ex∗ erased)
(table instances) ti ::= (a?)∗
(memory instances) mi ::= b∗
(values) v ::= t.const c
(administrative operators) e ::= . . . | trap | call fi | labeln{e∗} e∗ end | framen{f} e∗ end
nop ε
v drop ε
( t.constc)t.unop t.constunopt(c)
t.const binopt(c1, c2) ( t.constc1) ( t.constC2) t.binop
vn block (tn 1 → tm 2 ) e∗ end
labeln{loop(tn 1 → tm 2 ) e∗ end} vn e∗ end vn loop (tn 1 → tm 2 ) e∗ end
labelm{ε} vn e∗ end
labeln{e∗} v∗ end
labeln{e∗} Li [vn (br i)] end
framen{f} Lk [vn return] end
v∗
v
vn e∗
vn
vn
where L0 ::= v∗ [_] e∗ and Lk+ 1 ::= v∗ labeln{e∗} Lk end e∗
f; (get_local i)
f; v (set_local i)
s' ; f0; framen{f'} e' end
f' ; ε
s; f; (call i) call sfunc(ffunc(i))
if flocal(i) = v
if f' = f with local(i) = v
if s; f; e∗ s'; f'; e'∗
call sfunc(stable(ftable(i)) s; f; ( i32.const i) (call_indirect ft)
s; f;( i32.const i) (call_indirect ft)
if sfunc (stable(ftable)(i))code = (func ft local t∗ e*)
otherwisetrap
vn (call fi) framem{module fimodule, local vn ( t.const 0)k} e∗ end . . .
framen{f} vn end | ... where ficode = (func (tn 1 → tm 2 ) local tk e∗)
s; f0; framen{f} e∗ end
→
→
→
→
→
→
→
→
→
→
→
→
→
→
→
→
→→
Figure 2. Small-step reduction rules (Excerpt).