sites import these files and display cross-project credit, that is, the volunteer’s total credit across all projects.
Even with the modest number (60) of current projects, locating
them, reading their web sites, and attaching to a chosen set
is tedious, and will become infeasible if the number of projects grows to hundreds or thousands.
BOINC provides a framework for dealing with this problem. A
level of indirection can be placed between client and projects. Instead
of being attached directly to projects, the client can be attached to a
web service called an account manager. The client periodically communicates with the account manager, passing it account credentials
and receiving a list of projects to attach to.
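The exchange between client and account manager can be sketched as follows. This is a minimal illustration rather than BOINC's actual protocol: the `AccountManager` and `Client` classes, the `sync` and `poll` methods, and the example URLs are all hypothetical, and an in-memory object stands in for the web service.

```python
from dataclasses import dataclass, field


@dataclass
class AccountManager:
    """Hypothetical stand-in for the account manager web service."""
    # Maps an account credential to the project URLs selected for it.
    selections: dict = field(default_factory=dict)

    def sync(self, credential):
        # Reply with the projects this account should be attached to.
        return list(self.selections.get(credential, []))


@dataclass
class Client:
    credential: str
    attached: set = field(default_factory=set)

    def poll(self, manager):
        """One periodic exchange: send credentials, receive the project list."""
        wanted = set(manager.sync(self.credential))
        to_attach = wanted - self.attached
        to_detach = self.attached - wanted
        self.attached = wanted
        return to_attach, to_detach


manager = AccountManager(
    {"alice-key": ["https://project-a.example", "https://project-b.example"]}
)
client = Client("alice-key")
added, removed = client.poll(manager)
```

Because the client only ever follows the returned list, changing a volunteer's selections on the account manager is enough to change what the client attaches to at its next poll.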
This framework has been used by third-party developers to create
“one-stop shopping” web sites, where volunteers can read summaries
of all existing BOINC projects and can attach to a set of them by
checking boxes. The framework could also be used for delegation of
project selection, analogous to mutual funds. For example, volunteers
wanting to support cancer research could attach to an American
Cancer Society account manager. American Cancer Society experts
would then select a dynamic weighted “portfolio” of meritorious cancer-related volunteer projects.
The BOINC client software lets volunteers attach to projects and
monitor the progress of jobs.
All HPC paradigms involve human factors, but in volunteer computing these factors are particularly crucial and complex. To begin with,
why do people volunteer?
This question is currently being studied rigorously. Evidence suggests that there are several motivational factors. One such factor is to
support scientific goals, such as curing diseases, finding extraterrestrial
life, or predicting climate change. Another factor is community. Some
volunteers enjoy participating in the online communities and social
networks that form, through message boards and other web features,
around volunteer computing projects. Yet another factor is the credit incentive. Some volunteers are interested
in the performance of computer systems, and they use volunteer computing to quantify and publicize the performance of their computers.
There have been attempts to commercialize volunteer computing
by paying participants, directly or via a lottery, and reselling the computing power. These efforts have failed because the potential buyers,
such as pharmaceutical companies, are unwilling to have their data on
computers outside of their control.
To attract and retain volunteers, a project must perform a variety
of human functions. It must develop web content describing its
research goals, methods, and credentials. It must provide volunteers
with periodic updates (via web or email) on its scientific progress. It
must manage the moderation of its web site’s message boards to
ensure that they remain positive and useful. It must publicize itself by
whatever media are available—mass media, alumni magazines, blogs,
social networking sites, and so on.
Volunteers must trust projects, but projects cannot trust volunteers. From a project’s perspective, volunteers are effectively anonymous. If a volunteer behaves maliciously, for example by intentionally
falsifying computational results, the project has no way to identify and
punish the offender. In other HPC paradigms, such offenders can be
identified and disciplined or fired.
Volunteer computing poses a number of technical problems. For the
most part, these problems are addressed by BOINC, and scientists
need not be concerned with them.
Heterogeneity. The volunteer computer population is extremely
diverse in terms of hardware (processor type and speed, RAM, disk
space), software (operating system and version) and networking
(bandwidth, proxies, firewalls). BOINC provides scheduling mechanisms that assign jobs to the hosts that can best handle them.
However, projects still generally need to compile applications for several platforms (Windows 32 and 64 bit, Mac OS X, Linux 32 and 64
bit, various GPU platforms). This difficulty may soon be reduced by
running applications in virtual machines.
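As a rough illustration of matching jobs to heterogeneous hosts, the sketch below filters jobs by platform, memory, and disk requirements. The `Host` and `Job` types, their fields, and the platform labels are hypothetical. A real scheduler would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class Host:
    platform: str   # hypothetical label, e.g. "linux_x86_64"
    ram_mb: int
    disk_mb: int


@dataclass
class Job:
    name: str
    platforms: frozenset  # platforms the application was compiled for
    ram_mb: int
    disk_mb: int


def can_run(host, job):
    """True if the host has a matching platform and enough RAM and disk."""
    return (
        host.platform in job.platforms
        and host.ram_mb >= job.ram_mb
        and host.disk_mb >= job.disk_mb
    )


def eligible_jobs(host, jobs):
    """Jobs that this host is able to handle, in their original order."""
    return [job for job in jobs if can_run(host, job)]
```

The `platforms` field reflects the point above: a job can only go to hosts for which the project has compiled an application version.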
Sporadic availability and churn. Volunteer computers are not dedicated. The time intervals when a computer is on, and when BOINC is
allowed to compute, are sporadic and generally unpredictable. BOINC
tracks these factors and uses them in estimating job completion
times. In addition, computers are constantly joining and leaving the
pool of a given project. BOINC must address the fact that computers
with many jobs in progress may disappear forever.
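To show how availability can enter a completion-time estimate, the sketch below stretches a job's pure compute time by the fraction of time the host is on and the fraction of that time in which computing is allowed. The function and parameter names are hypothetical, and the formula is a simplification, not BOINC's actual estimator.

```python
def estimated_completion_seconds(job_flops, host_flops_per_sec,
                                 on_fraction, active_fraction):
    """Estimate wall-clock seconds to finish a job on a part-time host.

    on_fraction:     fraction of wall-clock time the computer is powered on
    active_fraction: fraction of powered-on time computing is allowed
    """
    if not (0 < on_fraction <= 1 and 0 < active_fraction <= 1):
        raise ValueError("fractions must be in the range (0, 1]")
    compute_seconds = job_flops / host_flops_per_sec
    # Only on_fraction * active_fraction of wall-clock time does useful work.
    return compute_seconds / (on_fraction * active_fraction)
```

For example, a job needing 10 hours of computation on a host that is on half the time, and allowed to compute during half of that, would be estimated at 40 hours of wall-clock time.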
Result validation. Because volunteer computers are anonymous
and untrusted, BOINC cannot assume that job results are correct, or
that the claimed credit is accurate. One general way of dealing with
this is replication: that is, send a copy of each job to multiple computers; compare the results; accept the result if the replicas agree; otherwise issue additional replicas. This is complicated by the fact that
different computers often do floating-point calculations differently, so
that there is no unique correct result.
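The replicate-and-compare step can be sketched as below. The function names and the `quorum` parameter are hypothetical, and the tolerance-based comparison is only one assumed way of coping with floating-point differences, not the approach attributed to BOINC in this article.

```python
import math


def results_agree(a, b, rel_tol=1e-6):
    """Treat two numeric result vectors as equal if all entries are close."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def validate(results, quorum=2, rel_tol=1e-6):
    """Return a result that at least `quorum` replicas agree on, else None.

    A return value of None means more replicas should be issued.
    """
    for candidate in results:
        # The candidate counts toward its own quorum.
        matches = sum(
            1 for other in results if results_agree(candidate, other, rel_tol)
        )
        if matches >= quorum:
            return candidate
    return None
```

With two disagreeing replicas, `validate` returns `None`. Once a third replica arrives that matches one of them, that result is accepted.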
BOINC addresses this with a mechanism called homogeneous
redundancy that sends instances of a given job to numerically identical computers. In addition, replication has the drawback that it
reduces throughput by at least 50 percent. To address this, BOINC has
a mechanism called adaptive replication that identifies trustworthy
hosts and replicates their jobs only occasionally.
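The idea behind adaptive replication can be illustrated with a simple trust rule. This sketch assumes a host becomes trusted after a run of consecutive valid results and is then spot-checked with a small probability. The class, function names, threshold, and probability are hypothetical, not BOINC's actual policy.

```python
import random


class HostRecord:
    """Tracks how many valid results a host has returned in a row."""

    def __init__(self):
        self.consecutive_valid = 0

    def record(self, valid):
        if valid:
            self.consecutive_valid += 1
        else:
            # Any invalid result resets the host to untrusted.
            self.consecutive_valid = 0


def needs_replication(host, trust_threshold=10, spot_check_prob=0.1, rng=random):
    """Always replicate for untrusted hosts; only occasionally for trusted ones."""
    if host.consecutive_valid < trust_threshold:
        return True
    return rng.random() < spot_check_prob


def expected_copies(spot_check_prob):
    """Average executions per job sent to a trusted host."""
    return 1 + spot_check_prob
```

Under these assumptions, a job sent to a trusted host costs about 1.1 executions on average instead of at least 2, which recovers most of the throughput lost to full replication.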
Scalability. Large volunteer projects can involve a million hosts
and millions of jobs processed per day. This is beyond the capabilities
of grid and cluster systems. BOINC addresses this using an efficient
server architecture that can be distributed across multiple machines.
The server is based on a relational database, so BOINC leverages
advances in scalability and availability of database systems. The communication architecture uses exponential backoff after failures, so that
Spring 2010/ Vol. 16, No. 3