figure 1. C pseudocode for lock-step time simulation, which contains
a data race in standard concurrency models but is bug-free under
Determinator.
struct actor_state actor[nactors];
main()
initialize all elements of actor[] array
for (time = 0; ; time++)
for (i = 0; i < nactors; i++)
if (thread_fork(i) == in_child)
// child thread to process actor[i]
examine state of nearby actors
update state of actor[i] accordingly
thread_exit();
for (i = 0; i < nactors; i++)
thread_join(i);
main thread forks a child thread per actor, then synchronizes by joining all child threads. The child thread code to
update each actor is “inline” within the main() function, a
convenience of Unix fork() that Determinator extends to
shared memory threads.
Each child thread reads the “prior” state of any or all
actors in the array, then updates the state of its assigned
actor in-place, without explicit copying or additional synchronization. With standard threads this code has a read/
write race: each child thread may see an arbitrary mix of
“old” and “new” states as it examines other actors in the
array. Under Determinator, however, this code is correct
and race-free. Each child thread reads only its private working copy of the actors array, which is untouched except by the
child thread itself, since the main thread forked that child.
As the main thread rejoins all its child threads, Determinator
merges each child’s actor array updates back into the main
thread’s working copy, for use in the next time step.
Traditional write/write races become conflicts in this
model. If two child threads concurrently write to the same
actor array element, for example, Determinator detects this
conflict and signals a runtime exception when the main
thread attempts to join the second conflicting child. In the
conventional model, by contrast, either thread might “win”
this timing-dependent race, silently propagating a likely
erroneous value throughout the computation. Running
this code under a conventional deterministic scheduler
causes the “winner” to be decided based on a synthetic,
reproducible time metric (e.g., instruction count) rather
than real time, but the race remains and may still manifest
or vanish because of slight changes in inputs or instruction
path lengths.
2. 3. A race-free synchronization APi
Standard threads can behave nondeterministically even in
a correctly locked program with no memory access races.
Two threads might acquire a lock in any order, for example,
leading to high-level data races.
2 This nondeterminism is
inherent in the lock abstraction: we can record and replay or
synthesize a lock acquisition schedule,
22 but such a schedule
is still arbitrary and unpredictable to the developer.
2. 4. Race-free system namespaces
Current operating system APIs often introduce nondeterminism unintentionally by exposing shared namespaces
implicitly synchronized by locks. Execution timing affects
the pointers returned by malloc() or mmap() or the
file numbers returned by open() in multithreaded Unix
processes, and the process IDs returned by fork() or
the file names returned by mktemp() in single-threaded
processes. Even if only one thread actually uses a given
memory block, file, process ID, or temporary file, the
assignment of these names from a shared namespace is
inherently nondeterministic.
Determinator’s API avoids creating shared namespaces
with system-chosen names, instead favoring thread-private namespaces with application-chosen names.
Applications, not the system, choose virtual addresses for
allocated memory and process IDs for children. This principle ensures that naming a resource reveals no shared
state information other than what the application itself
provided. Since implicitly shared namespaces often cause
multiprocessor contention, designing system APIs to avoid
this implicit sharing may be synergistic with recent multicore scalability work.
10
3. the DeteRmiNAtoR KeRNeL
We now briefly outline the Determinator kernel’s design and
API. Applications normally do not use this API directly, but
rather use the user-level runtime described in Section 4.
3. 1. spaces
Determinator executes application code within a hierarchy of
spaces, illustrated in Figure 2. Each space consists of CPU register state for a single control flow, a virtual address space containing code and working data, and optionally a snapshot of a
prior version of this address space. A Determinator space is
analogous to a single-threaded Unix process, with important