VGrADS Execution System Plans
February 10, 2004

Editor: Andrew A. Chien, with input from Henri Casanova, Rich Wolski, Jack Dongarra, Fran Berman, Dan Reed, Carl Kesselman, Yang-seok Kee, Alex Olugbile, Richard Huang

What is a Virtual Grid?
- something that accelerates and improves decision making about resources
  - provides structure for information collection by the execution system
  - provides structure for efficient presentation to the application / libraries / programming system
  - enables the scope of resource monitoring and scheduling to be reduced, improving scalability
  - enables proactive and reactive resource monitoring and acquisition to improve properties of performance, reliability, stability, security, etc.
- focus is on resource management, selection, allocation, and binding
  - many attributes can be represented, including security and reliability as well as traditional performance measures
  - improved capability enables better decision making and scaling to larger resource environments
- introduces two views of resources
  - those currently being used by the application/system
  - those being monitored and selected against

How are Virtual Grid abstractions defined?
There are many possibilities.
- top-down (from the application)
  - prescriptive in describing a desired virtual environment
  - driving the virtualization
  - all of the interaction of these then managed transparently in the underlying grid system
- bottom-up (from the resources and structures)
  - system resources organized into virtual grids
  - enhanced properties with respect to those grids
  - rapid selection and binding
  - resource properties
  - communication structure properties
  - aggregates over these, such as reliability and quality of service

Oracle Style Example: Local association of a mainframe and collection of desktops
- Application needs of these are associated in a virtual grid, which can then be tied to underlying physical resources
- If no underlying resource managers couple these resources together (a mainframe and set of workstations), then you might select from a pool of mainframes and from a pool of units of 100s of workstations
- If the mainframes and workstations are precoupled together, this may lead to overly specialized, inflexible resource organizations

A Straw-man Virtual Grid Abstraction

Application / Program Preparation System Facing View
- The Virtual Grid Abstraction is an abstract description of a set of resources; this is the basis of the performance contract between the PPS and the Execution System
- The application exports this view, and it is realized in a Virtual Grid to support the abstract parallel machine, abstract component machine, or another abstract machine
- Examples: uniform symmetrically connected machine, uniform mesh machine, database and cluster, big bag of processors, fully-connected, mesh-connected, pipeline, tree, others.
- How general a description language do we NEED? Or WANT?
- Virtual Grid Services: raw compute execution interface, high level specification of virtual grid views, communication primitives, and interfaces to request a new virtual grid
- Application Required Services: callbacks for CHECKPOINTING/SYNCH/LOGGING and RESCHEDULING
- All are exposed as SERVICES so they can be used electively
- Can we meaningfully support use of the system and modification at multiple levels of abstraction?

Virtual Grid Runtime View
- Modules which implement the VG abstractions from an underlying pool of resources that is too large, unreliable, and only partially accessible, with varying characteristics and varying connectivity
- Exploit the resource classes and statistical characterization to form complex ensembles of resources which implement the virtual grid view required by the programming system. Certain things are presumed not to matter.
- In this sense, the resource classes can form another layer of the system, one that is resource and resource-structure centric; if they are trusted, they can enable the virtual grid realization systems to achieve higher performance and capability.
- Built-in performance monitoring to determine the efficacy of the realization of the abstractions, and application callbacks for CHECKPOINTING and RESCHEDULING or other forms of adaptation by the application in the event of imminent failure or an inability to meet the requirements of the Virtual Grid abstraction view
- Where do ideas of transparent fault-tolerance fit? Can we embed this technology, for example, in a parallel virtual grid abstraction implementation that provides fault-tolerance?
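
The interplay between the application-facing description, the built-in performance monitoring, and the CHECKPOINTING and RESCHEDULING callbacks can be pictured with a small sketch. Everything in it is an assumption made for illustration: the names VirtualGridSpec, VirtualGridMonitor, and observe, the single min_rate contract term, and the two thresholds are hypothetical and are not part of any VGrADS interface.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Topology(Enum):
    # Shapes borrowed from the examples in the application-facing view.
    BAG = "bag of processors"
    FULLY_CONNECTED = "fully-connected"
    MESH = "mesh-connected"
    PIPELINE = "pipeline"
    TREE = "tree"


@dataclass
class VirtualGridSpec:
    """Hypothetical application-facing description of a virtual grid."""
    topology: Topology
    nodes: int
    min_rate: float  # assumed contract term: required work units per second (> 0)


@dataclass
class VirtualGridMonitor:
    """Hypothetical runtime-side monitor that checks the contract."""
    spec: VirtualGridSpec
    on_checkpoint: Callable[[], None]
    on_reschedule: Callable[[VirtualGridSpec], None]
    warn_fraction: float = 0.9  # assumed: below this, the contract is at risk
    fail_fraction: float = 0.5  # assumed: below this, the contract is violated
    events: List[str] = field(default_factory=list)

    def observe(self, measured_rate: float) -> str:
        """Compare one measurement with the contract and adapt if needed."""
        ratio = measured_rate / self.spec.min_rate
        if ratio < self.fail_fraction:
            # Save state first, then ask for a new realization of the spec.
            self.on_checkpoint()
            self.on_reschedule(self.spec)
            status = "reschedule"
        elif ratio < self.warn_fraction:
            # Performance is degrading: checkpoint but keep running.
            self.on_checkpoint()
            status = "checkpoint"
        else:
            status = "ok"
        self.events.append(status)
        return status
```

An application would construct a VirtualGridSpec, register its two callbacks with the monitor, and let the runtime call observe with each new measurement. Keeping the callbacks as plain functions mirrors the point above that these services are used electively: an application that cannot checkpoint could pass a no-op.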

Virtual Grid Runtime Implementation Requires:
- continuous activity
- intelligent scoping mechanisms to focus interest and collect relevant dynamic information
- proactive acquisition of resources to meet current and anticipated needs
- planning to adapt to anticipated and unanticipated changes in the underlying resource environment

Resource Classes
- Characterization and organization of resources
- Requires short and long-term monitoring and analysis of resources
- Many Open Questions
  - What are the meaningful/useful resource classes?
  - How do we support both a large scale of resources and a refined classification?
  - Is this a multi-classification?
  - Is this centralized or decentralized or both?

The Virtual Grid / VGrADS runtime lives between the Virtual Grid PPS and Runtime View layers and implements efficient information services, scheduling, and fault-tolerance.

These are broad challenges.

How do We Make Progress?
- take familiar and important applications/workloads and explore issues
  - what type of virtual grid might an application specify
  - how might we exploit these attributes for better selection/scheduling, etc.
  - initial work on EOL and speeding critical phases
- take typical resource configurations and elicit structure
  - what are the likely configurations of resources in a grid
  - what are their characteristics (static, dynamic)
  - do these naturally fall into a structured classification with respect to the needs
  - how might we use a virtual grid mechanism to reduce the number of resources that need be considered
- explore the performance of grid information systems
  - what types of information can be provided, with what resolution and accuracy
  - how do these properties scale
  - what techniques are there to make the gathering and distribution of different types of information more efficient and scalable
- What might these capabilities look like? How would they be presented?

What might we deliver?
- to the program preparation system team?
- In the immediate term?
  (next 12 months)
- For the site visit in year 3 (24 months)
- Towards the end of the project (48 months)

What are the implications of the framework?
- How do these affect the interfaces to the program preparation system?
- How do they affect the functionality needed in the program preparation system?

How do they relate to the existing GrADS infrastructure?
- Gradsoft took a single general-purpose view
- VGrADS takes a specialist view in presentation to the application
- Underlying elements may be shared
- Abstract performance model, not per-application performance models

Year-by-Year Research Milestones and Tasks

Year 1 Milestones and Tasks:
  Execution System/Virtualization:
    - Prototype Resource Virtualization and Abstraction Classes [V1]
    - Virtual Scheduling requirements study [V2]
  Execution System/Performance Provisioning:
    - Initial time-space reasoning for contracts and signatures [PP1]
  Execution System/Grid Economy:
    - Develop rudimentary simulation of VGrADS resource allocation mechanisms. [GE1]
    - Begin the exploration of Tatonnement, Smale's method, and Continuous-Price Double auctions using simulation. [GE2]
  Execution System/Fault Tolerance:
    - Experimental measurement of Grid & cluster reliability [FT1]

Year 2 Milestones and Tasks:
  Execution System/Virtualization:
    - Prototype Virtual Grid examples defined [V3]
    - Prototype virtual scheduler [V4]
  Execution System/Performance Provisioning:
    - Extended time-space reasoning for contracts and signatures [PP2]
  Execution System/Grid Economy:
    - Determine initial pricing conditions and pricing methods that prevent multiple equilibria. [GE3]
    - Verify stability results using simulation environment.
      [GE4]
  Execution System/Fault Tolerance:
    - Prototype fault tolerant library [FT2]

Year 3 Milestones and Tasks:
  Execution System/Virtualization:
    - Novel resource selection and virtual scheduling strategy experiments with application kernels on virtual grid environments [V5]
  Execution System/Performance Provisioning:
    - Limited tunable performance/fault-tolerance capabilities [PP3]
  Execution System/Grid Economy:
    - Begin designing experiments to test pricing techniques using VGrADS framework. [GE5]
    - Continue simulation experiments to evaluate resource allocation efficiency. [GE6]
  Execution System/Fault Tolerance:
    - Consider novel techniques [FT3]

Year 4 Milestones and Tasks:
  Execution System/Virtualization:
    - Improved Resource virtualization and virtual scheduling approaches based on experiments with additional virtual grid environments [V6]
  Execution System/Performance Provisioning:
    - Extended tunable performance/fault-tolerance capabilities [PP4]
  Execution System/Grid Economy:
    - Convert GridSAT to use the pricing-based resource allocation. [GE7]
    - Conduct empirical investigation of pricing scheme and its effect on resource allocation stability using GridSAT as driving application. [GE8]
  Execution System/Fault Tolerance:
    - Implement novel techniques (e.g. diskless checkpointing) [FT4]

Year 5 Milestones and Tasks:
  Execution System/Virtualization:
    - Continue to improve Resource virtualization and virtual scheduling approaches based on more application kernel experiments [V7]
    - Integrate Resource Virtualization and Abstraction Classes into VGrADS software [V8]
    - Integrate virtual scheduling strategies into VGrADS software [V9]
  Execution System/Performance Provisioning:
    - Limited validation and assessment [PP5]
  Execution System/Grid Economy:
    - Design experiment to investigate allocation efficiency under various pricing schemes. [GE9]
    - Target second VGrADS-enabled application (to be determined) as a driving application. [GE10]
    - Verify using both GridSAT and second application.
      [GE11]
  Execution System/Fault Tolerance:
    - Limited validation and assessment [FT5]