Producing these plots should become routine. Your monitoring tool
might be able to provide you with
some of these methods already. To get
the others, figure out how to export the
relevant data and import it into the
software tool of your choice (Python, R,
or Excel). Play around with these visualizations and see how your machine data looks.
To discover more visualization methods, check out the Seaborn gallery.18
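As a rough illustration of the export-and-import step, the following sketch loads comma-separated data into NumPy and prints a few basic properties. The two-column layout (timestamp, request rate) and the inline sample values are assumptions made for this example; a real export from a monitoring tool may be laid out differently.

```python
import io

import numpy as np

# Hypothetical export with two columns: timestamp, request rate in rps.
# In practice this would be a file written by your monitoring tool.
exported = io.StringIO(
    "1,850\n"
    "2,1210\n"
    "3,990\n"
    "4,1875\n"
)

# Column 0 is assumed to be the timestamp, column 1 the measured value.
data = np.genfromtxt(exported, delimiter=",")
rates = data[:, 1]

summary = {
    "count": int(rates.size),
    "min": float(rates.min()),
    "max": float(rates.max()),
}
print(summary)
```

Once the values are available as an array, they can be handed to whichever plotting routine you prefer.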
Histograms in IT operations have two
different roles: as a visualization method
and as an aggregation method.
To gain a complete understanding
of histograms, let’s start by building
one for the Web request-rate data discussed previously. The listing in Figure
5 contains a complete implementation, discussed step by step here.
1. The first step in building a histogram is to choose a range of values
that should be covered. To make this
choice you need some prior knowledge about the dataset. Minimum and
maximum values are popular choices
in practice. In this example the value
range is [500, 2200].
2. Next, the value range is partitioned into bins. Bins are often of
equal size, but there is no need to follow this convention. The bin partition
is represented here by a sequence of
bin boundaries (line 4).
3. Count how many samples of the
given dataset are contained in each
bin (lines 6–10). A value that lies on the
boundary between two bins is assigned to the higher bin.
4. Finally, produce a bar chart, where
each bar is based on one bin, and the
bar height is equal to the sample count
divided by the bin width (lines 11–13).
The division by bin width is an important normalization, since otherwise
the bar area is not proportional to the
sample count. Figure 5 shows the resulting histogram.
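The counting and normalization steps can also be cross-checked with NumPy's `np.histogram`. The sketch below uses a small synthetic sample set (not the article's dataset) together with the bin boundaries from the listing, and confirms that the bar areas add up to the total sample count after dividing by bin width.

```python
import numpy as np

# Synthetic stand-in for the request-rate samples (not the article's data).
samples = np.array([520, 700, 750, 910, 1200, 1499, 1500, 1900, 2100, 2150])
bins = np.array([500, 700, 800, 900, 1000, 1500, 1800, 2000, 2200])

# Count samples per bin. Inner bins are half-open [left, right), so the
# value 700 lands in [700, 800), i.e. the higher of the two adjacent bins.
counts, edges = np.histogram(samples, bins=bins)

# Normalize by bin width so that bar area is proportional to sample count.
widths = np.diff(edges)
heights = counts / widths

# Area of each bar is height * width, which recovers the per-bin count.
areas = heights * widths
print(counts.tolist())
print(float(areas.sum()))
```

One difference to keep in mind: `np.histogram` treats the last bin as closed on both ends, so a sample exactly equal to the upper range limit is counted, whereas a strict `x < upper` comparison would drop it.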
Different choices in selecting the
range and bin boundaries of a histogram
can affect its appearance considerably.
Figure 6 shows a histogram with 100 bins
for the same data. Note that it closely
resembles a rug plot. At the other extreme, choosing a single bin results in a histogram with a single bar whose
height is equal to the sample density.
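To make the two extremes concrete, here is a minimal sketch that computes normalized bar heights for 1 bin and for 100 equal-width bins. The samples are synthetic (uniformly distributed over the value range from the text), and the helper name `bar_heights` is introduced only for this illustration.

```python
import numpy as np

# Synthetic samples, uniform over the value range used in the text.
rng = np.random.default_rng(seed=0)
value_range = (500, 2200)
samples = rng.uniform(value_range[0], value_range[1], size=1000)


def bar_heights(data, bin_count):
    """Return count / bin width for equal-width bins over value_range."""
    counts, edges = np.histogram(data, bins=bin_count, range=value_range)
    return counts / np.diff(edges)


single = bar_heights(samples, 1)
fine = bar_heights(samples, 100)

# With one bin, the bar height is the overall sample density:
# total sample count divided by the width of the value range.
density = len(samples) / (value_range[1] - value_range[0])
print(single[0], density)

# With 100 bins, each bar covers only a few samples, so heights vary a lot.
print(len(fine), fine.min(), fine.max())
```

In both cases the total bar area equals the number of samples; only the way that area is distributed across bars changes.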
Figure 4. Scatter plots of request rates of two database nodes.
(Axis label: Node-1 Request Rate in rps.)
Figure 5. Result of a manual histogram implementation.
 1 from matplotlib import pyplot as plt
 2 import numpy as np
 3 X = np.genfromtxt("DataSets/RequestRates.csv", delimiter=",")[:, 1]
 4 bins = [500, 700, 800, 900, 1000, 1500, 1800, 2000, 2200]  # bin boundaries
 5 bin_count = len(bins) - 1
 6 sample_counts = [0] * bin_count
 7 for x in X:
 8     for i in range(bin_count):
 9         if (bins[i] <= x) and (x < bins[i + 1]):  # boundary values go to the higher bin
10             sample_counts[i] += 1
11 bin_widths = [float(bins[i + 1] - bins[i]) for i in range(bin_count)]
12 bin_heights = [count / width for count, width in zip(sample_counts, bin_widths)]
13 plt.bar(bins[:bin_count], width=bin_widths, height=bin_heights, align="edge")
(Histogram plot; x-axis: Request Rate in rps.)