DOI: 10.1145/1498765.1498789
By Michael P. Mesnier, Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, and Gregory R. Ganger
Relative fitness is a new approach to modeling the performance of storage devices (e.g., disks and RAID arrays). In contrast to a conventional model, which predicts the performance of an application’s I/O on a given device, a relative fitness model predicts performance differences between devices. The result is significantly more accurate predictions.
Relative fitness: the fitness of a genotype compared with another in the same gene system.
Managing storage within a data center can be surprisingly complex and costly. Large data centers have numerous storage devices of varying capability, and one must decide which application data sets (e.g., database tables, web server content) to store on which devices. Sadly, the state-of-the-art in Information Technology (IT) requires much of this to be done manually. At best, this results in an overworked system administrator. However, it can also lead to suboptimal performance and wasted resources.
Many researchers believe that automated storage management [2, 5] is one way to offer some relief to administrators. In particular, application workloads can be automatically assigned to storage devices. Doing so requires accurate predictions of how a workload will perform on a given device, and a model of a storage device can be used to make these predictions. Specifically, one trains a model to predict the performance of a device as a function of the I/O characteristics of a given workload [1, 7, 11, 13]. Common I/O characteristics include an application’s read/write ratio, I/O pattern (random or sequential), and I/O request size.
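To make the three characteristics named above concrete, the following minimal Python sketch summarizes a small I/O trace. The `Request` record, the `characterize` helper, and the rule that a request is "sequential" when it starts exactly where the previous one ended are all assumptions made for illustration; they are not taken from any particular modeling tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    op: str      # "R" for read, "W" for write (hypothetical encoding)
    offset: int  # starting byte offset on the device
    size: int    # request size in bytes


def characterize(trace: List[Request]) -> Dict[str, float]:
    """Summarize a trace into three common workload characteristics."""
    if not trace:
        raise ValueError("trace must contain at least one request")
    n = len(trace)
    reads = sum(1 for r in trace if r.op == "R")
    # Assumption: a request is sequential if it begins where the
    # previous request ended; everything else counts as random.
    sequential = sum(
        1 for prev, cur in zip(trace, trace[1:])
        if cur.offset == prev.offset + prev.size
    )
    return {
        "read_fraction": reads / n,
        "sequential_fraction": sequential / (n - 1) if n > 1 else 0.0,
        "mean_request_size": sum(r.size for r in trace) / n,
    }


# A made-up four-request trace.
trace = [
    Request("R", 0, 4096),
    Request("R", 4096, 4096),
    Request("W", 1_000_000, 8192),
    Request("R", 8192, 4096),
]
summary = characterize(trace)
print(summary)
```

A conventional device model would take a summary like this as its input and return a predicted performance figure for one specific device.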
Though it sounds simple, such modeling has not been realized in practice, primarily because of the difficulty of obtaining workload characteristics that are good predictors of performance, yet also suitable for use in a model. For example, the I/O request size of an application is often approximated with an average, as opposed to the actual distribution (e.g., bimodal). Although such approximations reduce modeling complexity, they can lead to inaccurate predictions.
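The loss of information from averaging can be seen in a few lines of Python. The workload below is hypothetical (half 4 KiB requests, half 256 KiB requests), chosen only to show how a bimodal distribution collapses into an average that describes none of the actual requests.

```python
from collections import Counter
from statistics import mean

KIB = 1024

# Hypothetical bimodal workload: 500 small and 500 large requests.
sizes = [4 * KIB] * 500 + [256 * KIB] * 500

average = mean(sizes)        # the single number a simple model would keep
histogram = Counter(sizes)   # the distribution the average hides

# Count requests whose size is within 10% of the average.
near_average = sum(1 for s in sizes if abs(s - average) <= 0.1 * average)

print(f"average request size: {average / KIB:.0f} KiB")
print(f"distinct sizes: {sorted(histogram)}")
print(f"requests within 10% of the average: {near_average}")
```

The average is 130 KiB, yet no request in the workload is anywhere near that size, so a model fed only the average would be reasoning about a request that never occurs.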
This article describes a new modeling approach called relative fitness modeling [9, 10]. A relative fitness model uses observations (performance and resource utilization) from one storage device to predict the performance of another, thereby reducing the dependence on workload characteristics. Figure 1 illustrates relative fitness modeling for two hypothetical devices A and B.
The insight behind relative fitness modeling is best obtained through analogy. When predicting your grade in a college course (a useful prediction during enrollment), it is helpful to know the grade received by a peer (his performance) and the number of hours he worked each week to achieve that grade (his resource utilization). Naturally, our own performance for a certain task is a complex function of the characteristics of the task and our ability. However, we have learned to make predictions relative to the experiences of others with similar abilities, because it is easier.
Applying the analogy, two storage devices may behave similarly enough to be reasonable predictors for each other. For example, they may have similar RAID levels, caching algorithms, or hardware platforms. As such, their performance may be related. Even dissimilar devices may be related in some ways (e.g., for a given workload type, one usually performs well and the other poorly). The objective of relative fitness modeling is to learn such relationships.
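The idea of learning such a relationship can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: this section does not name a learning algorithm, so a one-nearest-neighbor lookup is swapped in purely for brevity. The class name, the feature layout (read fraction, sequential fraction, and utilization observed on A, each already scaled to [0, 1]), and all measurements are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Sample:
    features_a: Sequence[float]  # observations made while running on device A
    perf_a: float                # performance measured on A (e.g., MB/s)
    perf_b: float                # same workload's performance measured on B


class RelativeFitnessModel:
    """Predicts B's performance relative to A (a scaling factor)."""

    def fit(self, samples: List[Sample]) -> "RelativeFitnessModel":
        if not samples:
            raise ValueError("at least one training sample is required")
        # The training target is the ratio perf_b / perf_a.
        self._points = [
            (tuple(s.features_a), s.perf_b / s.perf_a) for s in samples
        ]
        return self

    def predict_fitness(self, features_a: Sequence[float]) -> float:
        # Stand-in learner: reuse the ratio of the closest training sample.
        _, fitness = min(
            self._points, key=lambda p: math.dist(p[0], features_a)
        )
        return fitness

    def predict_perf_b(self, features_a: Sequence[float], perf_a: float) -> float:
        # Scale the performance observed on A by B's relative fitness.
        return perf_a * self.predict_fitness(features_a)


# Made-up training data: [read fraction, sequential fraction, A's utilization].
training = [
    Sample([1.0, 0.9, 0.3], perf_a=80.0, perf_b=120.0),  # B is 1.5x A
    Sample([0.5, 0.1, 0.9], perf_a=10.0, perf_b=5.0),    # B is 0.5x A
    Sample([0.0, 0.8, 0.5], perf_a=40.0, perf_b=40.0),   # B matches A
]
model = RelativeFitnessModel().fit(training)

# A new workload observed only on A.
predicted_b = model.predict_perf_b([0.9, 0.8, 0.35], perf_a=70.0)
print(predicted_b)
```

The model's output is a ratio rather than an absolute number, so the performance actually observed on A carries much of the information that workload characteristics alone would otherwise have to supply.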
Storage performance modeling is a heavily researched area, including analytical models [11], statistical or probabilistic models [1, 7], and machine learning models [10, 13]. Models are either white-box or black-box. White-box models use knowledge of the internals of a storage device (e.g., drives, controllers, and caches), and black-box models do not. Given the complexity of modern-day storage devices [12], black-box approaches are becoming increasingly attractive.
Figure 1: Using sample workloads, a model learns to predict how the performance of a workload changes between two devices (A and B). To predict the performance of a new workload on B, the workload characteristics, performance, and resource utilization (as measured on device A) are input into the model of B. The prediction is a performance scaling factor, which we refer to as B’s “relative fitness.”
[Figure 1 diagram. Step 1, model differences between devices A and B: training data from Device A and Device B feeds a model learning algorithm, which produces the relative fitness model of B. Step 2, use the model to predict the performance of B: A’s workload characteristics, A’s performance, and A’s resource utilization are input to the relative fitness model of B, which outputs B’s relative fitness.]
A previous version of this research paper was published in the Proceedings of the International Conference on Measurement and Modeling of Computer Systems (San Diego, CA, June 2007), ACM, NY.
April 2009 | Vol. 52 | No. 4 | Communications of the ACM