You can see Dan Hyde's images of the conference at
URL: http://www.eg.bucknell.edu/~hyde/TeachingCluster/index.html
RATIONALE for comp.distributed
Networks in general, and the internet specifically, have been evolving from star topologies of thin clients or dumb terminals connected to central servers into collections of highly connected nodes, many having significant compute resources, storage, and peripherals, along with human presence. Likewise, internet tools and protocols have evolved from being primarily a mechanism to "push" (via email) or "pull" (via web browser) untyped data into supporting more interactive, semantic, and bi-directional relationships. These changes have prompted different communities to (re-)explore the potential of sharing and exploiting collections of heterogeneous, geographically distributed resources such as computers, data, people, and scientific instruments in a secure and consistent manner, usually lacking any central control or authority. These efforts are often described with terms like "peer-to-peer" ("p2p") and "grids", and can serve to virtualize enterprises by blurring the significance of physical location.
The full Request For Discussion (RFD) can be found at http://groups.google.com/groups?selm=1003963639.19922%40isc.org&output=gplain
1st Workshop on Novel Uses of System Area Networks (SAN-1), held in conjunction with the 8th International Symposium on High Performance Computer Architecture (HPCA-8), Cambridge, Massachusetts, February 2, 2002
URL: http://www.csl.cornell.edu/SAN-1/
Today's data-driven high-performance computer technologies demand reliable delivery systems that combine high levels of computing, storage, I/O, and network communication performance. Due to the growth of Internet-driven applications like digital libraries, virtual laboratories, video on demand, e-commerce, web services, and collaborative systems, issues such as storage capacity and access speed have become critical in the design of today's computer systems.
High Performance Mass Storage and Parallel I/O fills the need for a readily accessible single reference source on the subject of high-performance, large-scale storage and delivery systems, specifically the use of Redundant Arrays of Inexpensive Disks (RAID) accessed through a parallel input/output (I/O) architecture. The authors, all internationally recognized experts in the field, have combined the best of the current literature on the subject with important information on emerging technologies and future trends.
Topics covered include:
For further information and sample chapters (RAID and InfiniBand) browse the book website: URL: http://www.buyya.com/superstorage
The US Sandia National Laboratories has released its Cplant system software that enables clusters of off-the-shelf desktop computers to act co-operatively as a supercomputer.
This open-source release is intended to allow researchers free access to the body of research and development that created Sandia's scalable, Linux-based, off-the-shelf computer, according to Sandia manager Neil Pundit. Cplant is modelled on the system software that Sandia developed for the ASCI Red supercomputer built by Intel and installed at the Laboratory's Albuquerque site in 1997. ASCI Red currently ranks number three in the Top500 list of the world's fastest computers, published on 21 June.
Sandia's Cplant hardware comprises the largest known sets of Linux clusters for parallel computing. These sets are made up of Compaq Alpha processors and Myrinet interconnects. The largest cluster within Cplant has more than 1,500 Alpha nodes.
The hope, says Pundit, is that modifications and enhancements made by researchers elsewhere will enrich the system software, and that these improvements will be communicated back to Sandia. Release 1.0 totals approximately 43 MB. Requesters must agree to software licensing terms before downloading.
The software can be downloaded from the Cplant website at URL: http://www.cs.sandia.gov/cplant
URL: http://www.osc.edu/press/releases/2001/clusterv41.shtml
InfiniBand, the successor to current PCI (peripheral component interconnect) connections, promises to change the way companies utilize their computers. See the article at ACM TechNews.
URL: http://www.acm.org/technews/articles/2001-3/0716m.html#item10
Computational Clusters, Grids, and Peer-to-Peer (P2P) networks have emerged as popular paradigms for next-generation parallel and distributed computing. They enable the aggregation of distributed resources for solving large-scale, data-intensive problems in science, engineering, and commerce. In Grid and P2P computing environments, the resources are geographically distributed across multiple administrative domains, managed and owned by different organizations with different policies, and interconnected by wide-area networks or the Internet. This introduces several resource management and application scheduling challenges, such as security, resource and policy heterogeneity, failures, continuously changing resource conditions, and political issues. Resource management and scheduling systems for Grid computing therefore need to manage resources and application execution according to the requirements of resource consumers and owners, and to continuously adapt to changes in resource conditions.
The management and scheduling of resources in such large-scale distributed systems is complex and therefore demands sophisticated tools for analysing and fine-tuning algorithms before applying them to real systems. Simulation appears to be the only feasible way to analyse algorithms on large-scale distributed systems of heterogeneous resources. Unlike experiments on the real system in real time, simulation avoids the overhead of coordinating real resources without making the analysis mechanism unnecessarily complex. Simulation also makes it possible to study very large hypothetical problems that would otherwise require a large number of active users and resources, which are very hard to coordinate and assemble in a research environment at that scale purely for investigation purposes.
To address these issues, we have proposed and developed a Java-based Grid simulation toolkit called GridSim. The toolkit, built on a basic discrete-event simulation system called JavaSim, provides facilities for modeling and simulation of Grid resources (both time-shared and space-shared high-performance computers) and network connectivity with different capabilities and configurations. Resources can be modeled to exist in different time zones, as in real environments, so that they exhibit different load and cost conditions. GridSim enables the creation of tasks for application models such as task farming and provides interfaces for assigning them to resources. These features can be used to develop resource brokers or Grid schedulers that help in the design and evaluation of resource management and scheduling algorithms. We have used the GridSim toolkit to implement a Nimrod-G-like Grid resource broker that supports deadline- and budget-constrained cost and time minimization scheduling algorithms for executing task-farming applications.
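To give a rough feel for the kind of experiment described above, the sketch below simulates deadline- and budget-constrained, cost-minimizing scheduling of a task-farming application over heterogeneous resources. It is a minimal, self-contained Java illustration written for this newsletter under stated assumptions; it does not use GridSim's actual classes or API, and all names (Resource, Task, schedule, the example MIPS ratings and prices) are hypothetical.

// Hypothetical sketch of deadline- and budget-constrained task-farm scheduling.
// Illustration only; this is NOT the GridSim API.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BrokerSketch {

    // A compute resource with a processing rate and a usage price.
    static class Resource {
        final String name;
        final double mips;          // processing rate (million instructions per second)
        final double costPerSec;    // price charged per second of CPU time
        double busyUntil = 0.0;     // simulated time at which the resource becomes free

        Resource(String name, double mips, double costPerSec) {
            this.name = name;
            this.mips = mips;
            this.costPerSec = costPerSec;
        }
    }

    // One independent task of a task-farming application.
    static class Task {
        final int id;
        final double lengthMI;      // task size in million instructions
        Task(int id, double lengthMI) { this.id = id; this.lengthMI = lengthMI; }
    }

    // Cost-minimizing strategy: consider cheaper resources first and assign each
    // task to the cheapest resource that can still finish it before the deadline
    // without exhausting the budget.
    static void schedule(List<Task> tasks, List<Resource> resources,
                         double deadline, double budget) {
        resources.sort(Comparator.comparingDouble((Resource r) -> r.costPerSec));
        double spent = 0.0;

        for (Task t : tasks) {
            boolean placed = false;
            for (Resource r : resources) {
                double runTime = t.lengthMI / r.mips;
                double finish = r.busyUntil + runTime;   // tasks run one after another per resource
                double cost = runTime * r.costPerSec;
                if (finish <= deadline && spent + cost <= budget) {
                    r.busyUntil = finish;
                    spent += cost;
                    placed = true;
                    System.out.printf("task %d -> %s, finishes at %.1fs, cost %.2f%n",
                            t.id, r.name, finish, cost);
                    break;
                }
            }
            if (!placed) {
                System.out.printf("task %d cannot be scheduled within deadline/budget%n", t.id);
            }
        }
        System.out.printf("total spent: %.2f of budget %.2f%n", spent, budget);
    }

    public static void main(String[] args) {
        List<Resource> resources = new ArrayList<>();
        resources.add(new Resource("R0 (cheap, slow)", 300, 1.0));
        resources.add(new Resource("R1 (fast, expensive)", 1200, 5.0));

        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < 10; i++) tasks.add(new Task(i, 600)); // ten identical tasks

        schedule(tasks, resources, /* deadline = */ 10.0, /* budget = */ 40.0);
    }
}

With these example figures the broker fills the cheap resource until the deadline would be missed and then spills the remaining tasks onto the faster, more expensive one, which is the behaviour a deadline- and budget-constrained cost-minimization policy is meant to exhibit; a simulator such as GridSim lets such policies be compared over many hypothetical resource and load configurations.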
The project members are: Rajkumar Buyya and Manzur Murshed from Monash University, Melbourne, Australia.
For further information on the GridSim project and to download the GridSim toolkit, visit: http://www.csse.monash.edu.au/~rajkumar/gridsim/