Richard Huang (2007)
Automatic Resource Specification Generation for Resource Selection in Large-Scale Distributed Environments
Ph.D. Thesis, University of California San Diego.
As the number of available resources in large-scale distributed environments grows, resource selection becomes a key challenge. First, we show why explicit resource selection is necessary to optimize application performance. Although several middleware systems provide resource selection services, a user still faces a difficult question: “What should I ask for?” Since most users end up using naïve and suboptimal resource specifications, we propose an automated way to answer this question. We present an automated resource specification generator that, given a DAG-structured workflow application, generates an appropriate resource specification, including the number of resources, the range of clock rates among the resources, and network connectivity. Our resource specification generator is composed of a size prediction model and a scheduling heuristic prediction model. The size prediction model employs application structure information as well as an optional utility function that trades off cost and performance. Through extensive simulation experiments covering different types of applications, resource conditions, and scheduling heuristics, we show that our model consistently leads to close-to-optimal application performance and often reduces resource usage. Further, we construct a model that predicts the optimal scheduling heuristic and can be used in conjunction with the size prediction model. Lastly, we show how our resource specification generator can be used in practice to generate resource specifications for three real-world resource selection systems and to offer alternative resource specifications when the best resource request cannot be fulfilled.