
John Brevik, Daniel Nurmi, and Rich Wolski (2005)

SCHMIB: Segregating Clusters Hierarchically Making Improved Bounds

University of California, Santa Barbara, Technical Report (2005-27), Santa Barbara, CA.

Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users of these batch-queued resources can choose among different queues (with different charging rates), potentially on several different machines to which they have access. In such a situation, the amount of time a user's job waits in any one batch queue can significantly affect the overall time from job submission to job completion. It thus becomes desirable to predict how long a job submitted at a given time can expect to wait in the queue. Further, it is natural to expect that attributes of an incoming job, specifically the number of processors requested and the amount of time requested, might affect that job's wait time. Previous work has shown that it is possible to determine meaningful upper bounds on queuing delay using a simple non-parametric technique, particularly when site administrators provide information on how jobs should be grouped by processor count. In this work, we explore the possibility of generating more accurate predictions by automatically grouping jobs with similar attributes using model-based clustering. Moreover, we implement this clustering technique for a time series of jobs so that predictions of future wait times can be generated in real time.
Using trace-based simulation on data from seven machines across the country over a nine-year period, comprising over one million job records, we show that clustering either by requested time or by requested number of processors generally produces more accurate predictions than the earlier, more naive approaches, that automatic clustering outperforms administrator-determined clusterings, and that clustering by requested time or by the product of requested nodes and requested execution time is substantially more effective than clustering by requested number of processors.
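
The abstract does not spell out the "simple non-parametric technique" used to bound queuing delay, so the following Python sketch is an illustration under stated assumptions, not the report's implementation. It uses a standard order-statistic argument: if the observed wait times are treated as independent draws from one distribution, the number of them falling below the true q-quantile follows a Binomial(n, q) law, so a suitably chosen sorted observation is an upper bound on that quantile with a chosen confidence. The function name `quantile_upper_bound`, its parameters, and the 0.95 defaults are hypothetical choices made for this example.

```python
from math import exp, lgamma, log


def quantile_upper_bound(waits, quantile=0.95, confidence=0.95):
    """Upper confidence bound on a quantile of wait time (illustrative sketch).

    Assumes the waits are i.i.d. samples and 0 < quantile < 1. Returns the
    smallest order statistic x_(k) such that P(Binomial(n, quantile) <= k - 1)
    is at least `confidence`, or None if the history is too short.
    """
    n = len(waits)
    ordered = sorted(waits)
    log_q, log_1mq = log(quantile), log(1.0 - quantile)
    cdf = 0.0
    for k in range(1, n + 1):
        j = k - 1  # add P(Binomial(n, quantile) == j), computed in log space
        log_pmf = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                   + j * log_q + (n - j) * log_1mq)
        cdf += exp(log_pmf)
        if cdf >= confidence:
            return ordered[k - 1]
    return None  # not enough observations for the requested confidence


if __name__ == "__main__":
    # Hypothetical wait times in seconds, for demonstration only.
    history = [30, 45, 60, 120, 300, 600, 900, 1800, 3600, 7200] * 10
    print(quantile_upper_bound(history))
```

In a clustered setting, one could apply such a function separately to the wait-time history of each group of similar jobs (for example, grouped by requested time), so that each incoming job is bounded using only the history of jobs that resemble it. How the groups are formed is the subject of the report and is not reproduced here.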
