differences; we use the term “space” to highlight these differences and avoid confusion with the “process” and “thread”
abstractions that Determinator emulates at the user level.
A Determinator space cannot outlive its parent, and a
space can directly interact only with its immediate parent
and children via three system calls described below. The
kernel provides no file systems, writable shared memory, or
other abstractions that imply globally shared state.
Only the root space has direct access to nondeterministic
inputs via I/O devices, such as console input or clocks. Other
spaces access I/O devices only indirectly via parent/child
interactions or via privileges delegated by the root space.
A parent space can thus control all nondeterministic inputs
into any unprivileged space subtree, for example, logging
inputs for future replay. This space hierarchy also creates
a performance bottleneck for I/O-bound applications, a
design limitation we leave to address in future work.
3. 2. system call APi
Determinator spaces interact only as a result of processor
traps and three system calls—Put, Get, and Ret, shown in
Table 1. Arguments allow multiple related operations to be
combined in one system call. A Put call can, for example, initialize a child’s registers, copy a virtual memory range into
the child, set permissions on the target range, snapshot the
child’s address space, and start the child executing.
Each space has a namespace of child spaces. User-level
code manages this namespace, specifying any child number
in a Get or Put; the kernel creates this child if it does not exist.
If the child did exist and was running, the kernel blocks the
parent until the child stops via Ret or a trap. These cooperative “rendezvous” semantics ensure that spaces synchronize
only at well-defined points in both spaces’ execution.
figure 2. the kernel’s hierarchy of spaces, each containing private
register and virtual memory state.
table 1. system calls comprising Determinator’s kernel APi.
Call interacts with
Put Child space
Get Child space
Ret Parent space
Copy register state and/or virtual memory range
into child, and optionally start child executing
Copy register state, virtual memory range, and/
or changes since the last snapshot out of a child
Stop and wait for parent to issue a Get or Put.
Processor traps also cause implicit Ret
User-level code can specify a Copy option in a Get or Put
call to copy a range of virtual memory between the invoking
space and a child. The kernel uses copy-on-write to optimize
large copies and avoid physically copying read-only pages.
A Snap option similarly uses copy-on-write to save a reference snapshot of a child space’s entire virtual address space.
The Put call also supports a Merge option, which is like
Copy, except that the kernel copies only bytes that
differ between the child’s current and reference snapshots
into the parent space, leaving other bytes in the parent
untouched. The kernel also detects conflicts: if a byte
changed in both the child’s and parent’s spaces since the
snapshot, the kernel generates an exception, treating a
conflict as a programming error like an illegal memory
access or divide-by-zero. Determinator’s user-level runtime
uses Merge to give multithreaded processes the illusion of
shared memory, as described later.
Finally, the Ret system call stops the calling space, returning control to the space’s parent. Exceptions such as divide-by-zero also cause an implicit Ret, providing the parent a
status code indicating why the child stopped.
To facilitate debugging or protect against looping children,
a parent can start a child with an instruction limit, forcing
the child to trap back to the parent after executing a specific
number of instructions. Counting instructions instead of
“real time” preserves determinism, while enabling spaces to
“quantize” a child’s execution to implement deterministic
scheduling at user level.
3. 3. Distribution via space migration
Space hierarchies can span not only multiple CPUs in a
multiprocessor/multicore system but also multiple nodes in
a homogeneous cluster. While distribution is semantically
transparent to applications, an application may have to be
designed with distribution in mind to perform well.
To support distribution, the kernel interprets the higher
order bits in each space’s child namespace as a node
number. When a space invokes Put or Get, the kernel first
migrates the calling space’s state and control flow to the
node specified in the child number argument, before creating or interacting with some child on that node indicated
in the remaining child number bits. Figure 3 illustrates
a space migrating between two nodes and managing children on each.
figure 3. A space migrating between two nodes and starting child
spaces on each node.
Child (0, 1)
Child ( 1,0)
Child ( 1, 1)
Cluster Node 0
Cluster Node 1