merchant silicon building blocks. A Saturn chassis supports 12 linecards to provide a 288-port non-blocking switch. These chassis are coupled with new Pluto single-chip ToR switches; see Figure 7. In the default configuration, a Pluto supports 20 servers with 4x10G provisioned to the cluster fabric, for an average bandwidth of 2 Gbps per server. For more bandwidth-hungry servers, we could configure the Pluto ToR with 8x10G uplinks and 16x10G to servers, providing 5 Gbps to each server. Importantly, servers could burst at 10 Gbps across the fabric for the first time.
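As a quick sanity check on these numbers, the average per-server bandwidth is simply total uplink capacity divided by server count; the short Python sketch below (ours, not from the original text) reproduces both configurations.

def avg_server_bw_gbps(uplinks: int, servers: int, link_gbps: int = 10) -> float:
    """Average fabric bandwidth per server = total uplink capacity / servers."""
    return uplinks * link_gbps / servers

# Default configuration: 20 servers, 4x10G uplinks -> 2 Gbps per server.
assert avg_server_bw_gbps(uplinks=4, servers=20) == 2.0
# Bandwidth-hungry configuration: 16 servers, 8x10G uplinks -> 5 Gbps per server.
assert avg_server_bw_gbps(uplinks=8, servers=16) == 5.0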
3.3. Jupiter: A 40G datacenter-scale fabric
As bandwidth requirements per server continued to grow, so
did the need for uniform bandwidth across all clusters in the
datacenter. With the advent of dense 40G capable merchant
silicon, we could consider expanding our Clos fabric across
the entire datacenter subsuming the inter-cluster networking
layer. This would potentially enable an unprecedented pool
of compute and storage for application scheduling. Critically,
the unit of maintenance could be kept small enough relative
to the size of the fabric that most applications could now be
agnostic to network maintenance windows, unlike previous generations of the network.
Jupiter, our next-generation datacenter fabric, needed to scale to more than 6x the size of our largest existing fabric.
Unlike previous iterations, we set a requirement for incremental deployment of new network technology because
the cost in resource stranding and downtime was too high.
Upgrading networks by simply forklifting existing clusters
stranded hosts already in production. With Jupiter, new
technology would need to be introduced into the network
in situ. Hence, the fabric must support heterogeneous
hardware and speeds.
At Jupiter scale, we had to design the fabric around individual building blocks; see Figure 8. Our unit of deployment is a Centauri chassis, a 4RU chassis housing two linecards, each with two switch chips with 16x40G ports, controlled by a separate CPU linecard. Each port could be configured in 4x10G or 40G mode. There were no backplane data connections between these chips; all ports were accessible on the front panel of the chassis.

We employed the Centauri switch as a ToR switch, with each of the four chips serving a subnet of machines. In one ToR configuration, we configured each chip with 48x10G to servers and 16x10G to the fabric. Servers could be configured with 40G burst bandwidth for the first time in production (see Table 2). Four Centauris made up a Middle Block (MB) for use in the aggregation block. The logical topology of an MB was a 2-stage blocking network, with 256x10G links available for ToR connectivity and 64x40G available for connectivity to the rest of the fabric through the spine. Each ToR chip connects to eight such MBs with dual redundant 10G links. The dual redundancy aids fast fabric reconvergence for the common case of single link failure or maintenance.
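This port arithmetic can be checked mechanically. The Python sketch below is ours, not from the original text; the split of the Middle Block's remaining ports into inter-stage wiring is inferred from the stated totals rather than taken from the paper.

CHIP_PORTS_40G = 16   # each switch chip exposes 16x40G ports
BREAKOUT = 4          # each 40G port can instead run as 4x10G

# ToR configuration: 48x10G to servers + 16x10G to the fabric
# exactly consumes one chip's capacity when fully broken out.
capacity_10g = CHIP_PORTS_40G * BREAKOUT              # 64x10G per chip
assert 48 + 16 == capacity_10g

# Server-facing vs. fabric-facing capacity gives 3:1 oversubscription.
print(f"ToR oversubscription: {48 * 10 / (16 * 10):.0f}:1")

# MB accounting: 4 Centauri chassis x 4 chips = 16 chips, 256x40G raw ports.
mb_ports_40g = 4 * 4 * CHIP_PORTS_40G                 # 256
tor_facing_ports = 256 // BREAKOUT                    # 256x10G = 64 ports in 4x10G mode
spine_facing_ports = 64                               # 64x40G toward the spine
internal_ports = mb_ports_40g - tor_facing_ports - spine_facing_ports
print(f"Ports left to wire the two MB stages together: {internal_ports}")  # 128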
Figure 7. Components of a Saturn fabric. A 24x10G Pluto ToR switch and a 12-linecard 288x10G Saturn chassis (including logical topology) built from the same switch chip; four Saturn chassis housed in two racks, cabled with fiber (right).
Figure 8. Building blocks used in the Jupiter topology: the Middle Block (MB); the aggregation block, with eight MBs and 512x40G links to 256 spine blocks; and the spine block, with 128x40G links down to 64 aggregation blocks.
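The link counts in Figure 8 are mutually consistent. The sketch below is ours and assumes each aggregation block spreads its uplinks evenly across all spine blocks; under that assumption it checks the dual-redundant 40G connectivity in both directions and that both sides of the fabric count the same 32,768 40G links.

AGG_BLOCKS, SPINE_BLOCKS = 64, 256
AGG_UPLINKS = 512        # 40G links per aggregation block toward the spine
SPINE_DOWNLINKS = 128    # 40G links per spine block toward aggregation blocks

# Even spreading yields dual-redundant 40G connectivity in both directions.
assert AGG_UPLINKS // SPINE_BLOCKS == 2      # 2x40G from each agg block to each spine block
assert SPINE_DOWNLINKS // AGG_BLOCKS == 2    # 2x40G from each spine block to each agg block

# Both sides must count the same number of fabric links: 32,768x40G.
assert AGG_BLOCKS * AGG_UPLINKS == SPINE_BLOCKS * SPINE_DOWNLINKS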
Figure 9. Jupiter Middle blocks housed in racks.