binary problem is avoided through use
of static linking.
The ability to make atomic changes
is also a very powerful feature of the
monolithic model. A developer can
make a major change touching hundreds or thousands of files across the
repository in a single consistent operation. For instance, a developer can
rename a class or function in a single
commit and yet not break any builds.
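As a rough illustration of the atomic rename described above, the following minimal sketch models a repository as an in-memory mapping from file paths to contents. The function name `atomic_rename`, the file paths, and the sample contents are all hypothetical and are not taken from Google's tooling. The idea shown is only that every file is rewritten into a new snapshot in one step, so no reader ever sees a half-renamed tree.

```python
import re


def atomic_rename(repo, old_name, new_name):
    """Return a new snapshot with every whole-word use of old_name renamed.

    repo maps file paths to file contents. The input snapshot is never
    mutated, so a failure partway through leaves the original untouched.
    """
    pattern = re.compile(rf"\b{re.escape(old_name)}\b")
    new_snapshot = {}
    for path, text in repo.items():
        # A callable replacement avoids backslash interpretation in new_name.
        new_snapshot[path] = pattern.sub(lambda match: new_name, text)
    return new_snapshot


# Hypothetical three-file repository: a definition, a caller, and a note.
repo = {
    "lib/shapes.py": "def area(r):\n    return 3.14159 * r * r\n",
    "app/main.py": "from lib.shapes import area\nprint(area(2))\n",
    "app/notes.txt": "areas are not renamed\n",
}
updated = atomic_rename(repo, "area", "circle_area")
```

The definition and its caller change together in `updated`, while `repo` still holds the previous consistent state. A real rename would need language-aware tooling rather than a regular expression; the sketch only conveys the all-or-nothing property.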
The availability of all source code
in a single repository, or at least on a
centralized server, makes it easier for
the maintainers of core libraries to perform testing and performance benchmarking for high-impact changes before they are committed. This approach
is useful for exploring and measuring
the value of highly disruptive changes.
One concrete example is an experiment
to evaluate the feasibility of converting
Google data centers to support non-x86
With the monolithic structure of
the Google repository, a developer
never has to decide where the repository boundaries lie. Engineers never
need to “fork” the development of
a shared library or merge across repositories to update copied versions
of code. Team boundaries are fluid.
When project ownership changes or
plans are made to consolidate systems, all code is already in the same
repository. This environment makes
it easy to do gradual refactoring and
reorganization of the codebase. The
change to move a project and update all dependencies can be applied
atomically to the repository, and the
development history of the affected
code remains intact and available.
Another attribute of a monolithic
repository is that the layout of the codebase is easily understood, as it is organized in a single tree. Each team has
a directory structure within the main
tree that effectively serves as a project’s own namespace. Each source file
can be uniquely identified by a single
string: a file path that optionally includes a revision number. When browsing
the codebase, it is easy to understand
how any source file fits into the big picture of the repository.
The Google codebase is constantly
evolving. More complex codebase
modernization efforts (such as updat-
This section outlines and expands
upon both the advantages of a monolithic codebase and the costs related to
maintaining such a model at scale.
Advantages. Supporting the ultra-large scale of Google’s codebase while
maintaining good performance for
tens of thousands of users is a challenge, but Google has embraced the
monolithic model due to its compelling advantages.
Most important, it supports:
• Unified versioning, one source of truth;
• Extensive code sharing and reuse;
• Simplified dependency management;
• Atomic changes;
• Large-scale refactoring;
• Collaboration across teams;
• Flexible team boundaries and code
• Code visibility and clear tree structure providing implicit team
A single repository provides unified
versioning and a single source of truth.
There is no confusion about which repository hosts the authoritative version
of a file. If one team wants to depend
on another team’s code, it can depend
on it directly. The Google codebase includes a wealth of useful libraries, and
the monolithic repository leads to extensive code sharing and reuse.
The Google build system⁵ makes it
easy to include code across directories, simplifying dependency management. Changes to the dependencies
of a project trigger a rebuild of the
dependent code. Since all code is versioned in the same repository, there
is only ever one version of the truth,
and no concern about independent
versioning of dependencies.
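To make the rebuild behavior concrete, here is a minimal sketch of how the set of targets affected by a change could be computed from a dependency graph. It is not the Google build system's algorithm. The function `targets_to_rebuild` and the target labels are assumptions chosen for illustration. The approach is a breadth-first walk over reverse dependencies.

```python
from collections import deque


def targets_to_rebuild(deps, changed):
    """Return every target that depends, directly or transitively, on changed.

    deps maps each target to the list of targets it depends on directly.
    """
    # Invert the graph: for each target, record who depends on it.
    dependents = {}
    for target, requirements in deps.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(target)

    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical targets; the labels are illustrative only.
deps = {
    "//app:server": ["//lib:rpc", "//lib:log"],
    "//lib:rpc": ["//base:strings"],
    "//lib:log": ["//base:strings"],
    "//tools:lint": [],
    "//base:strings": [],
}
affected = targets_to_rebuild(deps, "//base:strings")
```

In this example a change to the base library marks both intermediate libraries and the server for rebuilding, while the unrelated lint tool is left alone. Because every target lives in one tree, the graph is complete and no dependent can be missed.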
Most notably, the model allows
Google to avoid the “diamond dependency” problem (see Figure 8) that occurs when A depends on B and C, both
B and C depend on D, but B requires
version D.1 and C requires version D.2.
In most cases it is now impossible to
build A. For the base library D, it can
become very difficult to release a new
version without causing breakage,
since all its callers must be updated
at the same time. Updating is difficult
when the library callers are hosted in
In the open source world, dependencies are commonly broken by library updates, and finding library versions that all work together can be a
challenge. Updating the versions of
dependencies can be painful for developers, and delays in updating create
technical debt that can become very
expensive. In contrast, with a monolithic source tree it makes sense, and
is easier, for the person updating a library to update all affected dependencies at the same time. The technical
debt incurred by dependent systems is
paid down immediately as changes are
made. Changes to base libraries are instantly propagated through the dependency chain into the final products that
rely on the libraries, without requiring
a separate sync or migration step.
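The diamond dependency described earlier can be sketched as a small check over declared version requirements. This is an illustrative model only. The function `find_version_conflicts`, the data layout, and the version strings other than D.1 and D.2 are assumptions, and real package managers resolve versions in more sophisticated ways.

```python
def find_version_conflicts(requirements):
    """Return dependencies that are required at more than one version.

    requirements maps a package to {dependency: required version}.
    The result maps each conflicting dependency to {version: [requirers]}.
    """
    wanted = {}
    for package, deps in requirements.items():
        for dep, version in deps.items():
            wanted.setdefault(dep, {}).setdefault(version, []).append(package)
    return {dep: versions for dep, versions in wanted.items() if len(versions) > 1}


# Separate repositories: B and C pin different versions of D.
multi_repo = {
    "A": {"B": "1.0", "C": "1.0"},
    "B": {"D": "D.1"},
    "C": {"D": "D.2"},
}
conflicts = find_version_conflicts(multi_repo)

# Monolithic tree: every package builds against the single current version.
monorepo = {
    "A": {"B": "head", "C": "head"},
    "B": {"D": "head"},
    "C": {"D": "head"},
}
no_conflicts = find_version_conflicts(monorepo)
```

In the first case D is flagged because B and C disagree, which is the situation that prevents A from building. In the second case there is only one version of everything, so the check finds nothing to report.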
Note that the diamond-dependency
problem can exist at the source/API
level, as described here, as well as
between binaries.¹² At Google, the
Figure 8. Diamond dependency problem.