
Thara Angskun, George Bosilca, Graham E Fagg, Jelena Pjesivac-Grbovic, and Jack Dongarra (2007)

Reliability Analysis of Self-Healing Network using Discrete-Event Simulation

In: Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07), Rio de Janeiro, Brazil.

The number of processors embedded in high-performance computing platforms is continuously increasing to accommodate users' desire to solve larger and more complex problems. However, as the number of components increases, so does the probability of failure. Thus, both the scalability and the fault tolerance of software are important issues in this field.


Reliability analysis is needed to ensure that software remains dependable, especially under failure conditions. Discrete-event simulation offers an attractive alternative to traditional Markov-based analytical models, which often have an intractably large state space. In this paper, we analyze the reliability of a self-healing network developed for parallel runtime environments using discrete-event simulation. The network is designed to support the transmission of messages across multiple nodes while protecting against node and process failures. Results demonstrate the flexibility of a discrete-event simulation approach for studying network behavior under failure conditions and with various protocol parameters, message types, and routing algorithms.
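
As a rough illustration of the general technique named in the abstract, the following Python sketch estimates message delivery probability with a toy discrete-event simulation. It is not the model used in the paper: the chain topology, the exponential failure times, the rule that a sender skips failed neighbors, and every name in the code (`simulate_once`, `estimate_delivery`, `fail_rate`, `hop_delay`) are assumptions made for this example only.

```python
import heapq
import random


def simulate_once(rng, n_nodes, dst, fail_rate, hop_delay):
    """Run one toy simulation; return True if the message reaches node dst.

    Assumptions (hypothetical, not from the paper): nodes 0..n_nodes-1 form
    a chain, node 0 is the sender and never fails, and every other node
    fails once at an exponentially distributed time.
    """
    events = []  # min-heap of (time, sequence number, kind, node)
    seq = 0
    for node in range(1, n_nodes):
        t = rng.expovariate(fail_rate) if fail_rate > 0 else float("inf")
        heapq.heappush(events, (t, seq, "fail", node))
        seq += 1
    # The message is at the sender (node 0) at time 0.
    heapq.heappush(events, (0.0, seq, "arrive", 0))
    seq += 1

    alive = [True] * n_nodes
    while events:
        now, _, kind, node = heapq.heappop(events)
        if kind == "fail":
            alive[node] = False
            continue
        # kind == "arrive": the message reaches `node` at time `now`.
        if not alive[node]:
            return False  # the receiver failed while the message was in flight
        if node == dst:
            return True
        # Simplified "self-healing" step: skip over failed intermediate nodes.
        nxt = node + 1
        while nxt < dst and not alive[nxt]:
            nxt += 1
        if not alive[nxt]:
            return False  # the destination itself has failed
        heapq.heappush(events, (now + hop_delay, seq, "arrive", nxt))
        seq += 1
    return False


def estimate_delivery(n_nodes, dst, fail_rate, hop_delay, trials, seed=0):
    """Estimate delivery probability as the fraction of successful trials."""
    rng = random.Random(seed)
    ok = sum(simulate_once(rng, n_nodes, dst, fail_rate, hop_delay)
             for _ in range(trials))
    return ok / trials


if __name__ == "__main__":
    for rate in (0.0, 0.01, 0.1):
        p = estimate_delivery(8, 5, rate, hop_delay=1.0, trials=5000, seed=1)
        print(f"fail_rate={rate}: estimated delivery probability {p:.3f}")
```

Changing the routing rule or the failure distribution only requires editing the event handlers, which hints at why a simulation approach can be more flexible than enumerating the states of an analytical model.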


by Charles Koelbel last modified 2008-05-09 08:06
