vide better performance but may not
have a stable API. The Bazel Buildfarm
project (https://github.com/bazelbuild/bazel-buildfarm)
implements this API.
Implementing a Cache Service
An HTTP service that supports PUT and
GET methods with URLs in forms similar to the second example in the previous section can be used by Bazel as the
remote cache service. A few successful
implementations have been reported.
Google Cloud Storage (https://cloud.google.com/storage/)
is the easiest to set up if you are already a user.
It is fully managed, and you are billed
depending on storage needs and network traffic. This option provides good
network latency and bandwidth if your
development environment and build
infrastructure are already hosted in
Google Cloud. It might not be a good
option if you have network restrictions
or the build infrastructure is not located in the same region. Similarly, Amazon S3
(Simple Storage Service; https://aws.amazon.com/s3/) can be used.
For onsite installation, nginx (https://nginx.org/en/)
with the WebDAV (Web Distributed Authoring and Versioning)
module (http://nginx.org/en/docs/http/ngx_http_dav_module.html)
will be the simplest to set up but lacks data replication
and other reliability properties if
installed on a single machine.
The accompanying figure shows an
example system architecture implementation
of a distributed Hazelcast (https://hazelcast.com/)
cache service (https:// as-a-service/) running in Kubernetes
(https://kubernetes.io/). Hazelcast is a
distributed in-memory cache running in
a JVM (Java Virtual Machine). It is used
as a CaaS (cache-as-a-service) with support
for the HTTP/1.1 interface. In
the figure, two instances of Hazelcast
nodes are deployed using Kubernetes
and configured with asynchronous data
replication within the cluster. A Kubernetes
Service (https://kubernetes.io/ service/)
is configured to expose a port
for the HTTP service, which is load-balanced
within the Hazelcast cluster.
Access metrics and data on the health
of the JVM are collected via JMX (Java
Management Extensions). This example
architecture is more reliable than a
single-machine installation and easily
scalable in terms of QPS (queries per
second) and storage capacity.
You can also implement your
own HTTP cache service to suit your
needs. Implementing the gRPC interface
for a remote cache server is
another possible option, but the APIs
are still under development.
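As a rough illustration of how little is needed for a custom HTTP cache, the following is a minimal in-memory sketch using only the Python standard library. It assumes that action results are addressed under /ac/<hash> and output blobs under /cas/<hash>, and that every PUT request carries a Content-Length header. The names CacheHandler and make_server are hypothetical, and the sketch has no persistence, authentication, or eviction, so it is a starting point rather than a production service.

```python
import re
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical in-memory store; a real service would persist blobs
# to disk or to an object store.
_STORE = {}
_LOCK = threading.Lock()

# Assumption: keys look like /ac/<hex digest> or /cas/<hex digest>.
_VALID_PATH = re.compile(r"^/(ac|cas)/[0-9a-f]+$")


class CacheHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not _VALID_PATH.match(self.path):
            self.send_error(400, "bad cache path")
            return
        with _LOCK:
            blob = _STORE.get(self.path)
        if blob is None:
            # A miss is reported as 404; the client then builds locally.
            self.send_error(404, "cache miss")
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(blob)))
        self.end_headers()
        self.wfile.write(blob)

    def do_PUT(self):
        if not _VALID_PATH.match(self.path):
            self.send_error(400, "bad cache path")
            return
        # Assumption: the client always sends Content-Length.
        length = int(self.headers.get("Content-Length", 0))
        blob = self.rfile.read(length)
        with _LOCK:
            _STORE[self.path] = blob
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        # Silence per-request logging in this sketch.
        pass


def make_server(host="127.0.0.1", port=0):
    """Create the server; port=0 lets the OS pick a free port."""
    return ThreadingHTTPServer((host, port), CacheHandler)
```

Calling make_server(port=8080).serve_forever() would start the service on a fixed port. Recent Bazel releases can then be pointed at it with --remote_cache=http://localhost:8080; older releases used differently named flags.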
In all implementations of the cache
service it is important to consider cache
eviction. The action cache and CAS will
grow indefinitely since Bazel does not
perform any deletions. Controlling the
storage footprint is always a good idea.
The example Hazelcast implementation in the figure can be configured to
use a least recently used eviction policy
with a cap on the number of cache objects together with an expiration policy.
Users have also reported success with
random eviction and by emptying the
cache daily. In any case, recording metrics about cache size and cache hit ratio will be useful for fine-tuning.
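The combination of a least-recently-used policy, a cap on the number of objects, and an expiration time can be sketched in a few lines. This is an illustrative model only, not Hazelcast's implementation; the class name LruTtlCache and its parameters are hypothetical. It also counts hits and misses, since those are the metrics suggested above for tuning.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """LRU cache with a cap on entry count and a per-entry expiration."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries = OrderedDict()  # key -> (expires_at, blob)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        item = self._entries.get(key)
        if item is None or item[0] <= self._clock():
            # Missing or expired: drop any stale entry and count a miss.
            self._entries.pop(key, None)
            self.misses += 1
            return None
        # Mark the entry as most recently used.
        self._entries.move_to_end(key)
        self.hits += 1
        return item[1]

    def put(self, key, blob):
        self._entries[key] = (self._clock() + self._ttl, blob)
        self._entries.move_to_end(key)
        # Evict least recently used entries until under the cap.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A cap on the number of objects is simple to reason about but does not bound bytes stored; a cap on total blob size would be a natural variation when object sizes vary widely.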
Following the best practices outlined
here will avoid incorrect results and
maximize the cache hit rate. The first
best practice is to write your build rules
without any side effects. Bazel tries very
hard to ensure hermeticity by requiring
the user to explicitly declare input files
to any build rule. When the build rules
are translated to actions, input files are
known and must be present during execution. Actions are executed in a sandbox
by default, and then Bazel checks that
all the declared output files are created.
You can, however, still write a build rule
with side effects using genrule or a custom action written in the Skylark language
(skylark/language.html), which is used for extensions. An example is writing to the temporary directory and using the temporary
files in a subsequent action. Undeclared
side effects will not be cached and might
cause flaky build failures regardless of
whether remote cache is used.
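To see why undeclared inputs matter for caching, consider a deliberately simplified model in which the cache key is a hash over the command line and the declared input files only. This is not Bazel's actual digest format; the functions action_key and cached_run, and the example command, are hypothetical. The point is that anything the action reads without declaring it cannot influence the key, so a stale result can be served.

```python
import hashlib


def action_key(command, declared_inputs):
    """Hash the command line and the declared inputs (path -> bytes)."""
    h = hashlib.sha256()
    for arg in command:
        h.update(arg.encode())
        h.update(b"\0")
    # Sort paths so the key does not depend on dictionary order.
    for path in sorted(declared_inputs):
        h.update(path.encode())
        h.update(b"\0")
        h.update(hashlib.sha256(declared_inputs[path]).digest())
    return h.hexdigest()


def cached_run(cache, command, declared_inputs, undeclared_state):
    """Return the cached output, or 'execute' the action and store it."""
    key = action_key(command, declared_inputs)
    if key not in cache:
        # Stand-in for running the action: the output depends on
        # something that was never declared as an input.
        cache[key] = b"output built with " + undeclared_state
    return cache[key]


cache = {}
command = ["compile", "a.c"]      # hypothetical command line
inputs = {"a.c": b"int x;"}       # the only declared input

first = cached_run(cache, command, inputs, b"toolchain-v1")
# The undeclared state changed, but the key did not, so the old
# output is reused.
second = cached_run(cache, command, inputs, b"toolchain-v2")
```

In this model, second is identical to first even though the undeclared state differs, which is the kind of incorrect reuse that declaring every input is meant to prevent.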
Some built-in rules such as cc_library
and cc_binary have implicit dependencies on the toolchain
installed on the system and on system
libraries. Because they are not explicitly declared as inputs to an action,
they are not included in the computation of the action digest for looking up
the action cache. This can lead to the
reuse of object files compiled with a
and test system