January 25 University of Tennessee
Innovative Computing Laboratory
Computer Science Department
Jack Dongarra
Windows Cluster
Project
· Detailed description of project exists
People
Jack Dongarra
George Bosilca
Dave Cronk
Graham Fagg
Julien Langou
Piotr Luszczek
Projects:
· Numerical Linear Algebra Algorithms and Software
o LAPACK, ScaLAPACK, ATLAS
o Self Adapting Numerical Algorithms (SANS) Effort
o Generic Code Optimization
o LAPACK For Clusters – easy access to clusters
· Heterogeneous Distributed Computing
o NetSolve, FT-MPI, Open-MPI
· Performance Evaluation
o PAPI, HPC Challenge, Top500
· Software Repositories
o Netlib
· Project Plan
· has clear timeline and description of expected outcomes
LAPACK
• Used by Matlab, Mathematica, Numeric Python, ...
• Tuned versions provided by vendors: AMD, Apple, Compaq, Cray, Fujitsu, Hewlett-Packard, Hitachi, IBM, Intel, MathWorks, NAG, NEC, PGI, SUN, Visual Numerics, by Microsoft, and by most Linux distributions (Fedora, Debian, Cygwin, ...).
• Ongoing work: performance, accuracy, extended precision, ease of use
ScaLAPACK
• Parallel implementation of LAPACK that scales on parallel hardware from tens to hundreds to thousands of processors
• Ongoing work: match the functionality of current LAPACK
• Ongoing work: target new architectures and new parallel environments, for example a port to the Microsoft HPC cluster solution
LAPACK for Clusters (LFC)
• Makes most of ScaLAPACK's functionality available from serial clients (Matlab, Python, Mathematica)
FT-MPI and Open-MPI
· Defines the behavior of MPI in the event that a failure occurs at the process level.
· FT-MPI is based on MPI 1.3 (plus some MPI 2 features), with a fault-tolerant model similar to that of PVM.
o Complete reimplementation, not based on other implementations.
· Gives the application the possibility to recover from a process failure.
· A regular, non-fault-tolerant MPI program will run using FT-MPI.
· What FT-MPI does not do:
o Recover user data (e.g. automatic checkpointing)
o Provide transparent fault-tolerance
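Because FT-MPI leaves the recovery of user data to the application, the application has to keep its own checkpoints. The following is a minimal sketch of that idea in plain Python. It does not use FT-MPI or any MPI library, and the names `run_with_recovery`, `ProcessFailure`, and `fail_at` are hypothetical, introduced only for illustration. A simulated failure interrupts an iterative sum, and the loop resumes from the last checkpoint instead of from the start.

```python
import copy


class ProcessFailure(Exception):
    """Hypothetical stand-in for the loss of a process."""


def run_with_recovery(n_steps, checkpoint_every, fail_at):
    """Sum 0..n_steps-1, surviving simulated failures at the steps in fail_at.

    State is saved every `checkpoint_every` steps. After a failure the
    computation restarts from the last saved state (application-level
    recovery, which FT-MPI leaves to the user).
    """
    state = {"step": 0, "total": 0}
    checkpoint = copy.deepcopy(state)
    pending_failures = set(fail_at)
    restarts = 0

    while state["step"] < n_steps:
        try:
            step = state["step"]
            if step in pending_failures:
                pending_failures.discard(step)  # each failure fires once
                raise ProcessFailure(f"simulated failure at step {step}")
            state["total"] += step
            state["step"] = step + 1
            if state["step"] % checkpoint_every == 0:
                checkpoint = copy.deepcopy(state)
        except ProcessFailure:
            # Work done since the last checkpoint is lost and must be redone.
            state = copy.deepcopy(checkpoint)
            restarts += 1

    return state["total"], restarts


total, restarts = run_with_recovery(n_steps=10, checkpoint_every=3, fail_at=[4, 8])
print(total, restarts)
```

The checkpoint interval trades overhead against lost work: a shorter interval saves state more often but redoes fewer steps after a failure.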
Performance evaluation tools
· Performance Application Programming Interface (PAPI)
o A portable library to access hardware counters found on processors
o Provides a standardized list of performance metrics
· KOJAK (joint with Felix Wolf)
o Software package for the automatic performance analysis of parallel applications
§ Message passing and multi-threading (MPI and/or OpenMP)
§ Parallel performance
§ CPU and memory performance
Posters for Related Projects
Hardware Configuration
| Team HPC | Dual Core 4GB AMD Opterons |
| Team HPC Turnkey Beowulf-Class Supercomputer | 26 4GB AMD Opteron DC Compute Nodes, 1 Head Node |
| CPU Manufacturer | AMD |
| CPU Model | Opteron 265 |
| CPU Speed | 1.8 GHz |
| Number of nodes | 26 |
| Number of cores (per CPU) | 2 |
| Interconnect(s) | InfiniBand, Myrinet, GigE |
| Item Description | QTY |
| 26 Compute Nodes | |
| Supermicro H8DCE Motherboard | 26 |
| 3U Chassis w/ 350W PS with PCI-E riser & Slide Rails | 26 |
| AMD Opteron 265 1.8GHz with Heatsink | 52 |
| 4GB PC3200 Registered/ECC DDR (1GB x 4 total memory) | 104 |
| 80GB 7200rpm SATA 8MB cache HDD | 26 |
| ATI Rage on board | 26 |
| Dual Gigabit Ethernet Integrated on board | 26 |
| One Year Standard Warranty | 26 |
| Opteron Linux Installed and Tested | 26 |
| Built, Tested & Configured | 26 |
| Torque, Kick-Start Utility & Web-Based Monitoring Software | |
| Head Node, 4GB per Node | |
| Supermicro H8DCE Motherboard | 1 |
| 3U Chassis w/ PS and Slide Rails | 1 |
| AMD Opteron 265 1.8GHz with Heatsink and Fan | 2 |
| 4GB PC3200 Registered/ECC DDR (1GB x 4 total memory) | 4 |
| DVD Combo Drive | 1 |
| ATI Rage on board | 1 |
| Dual Gigabit Ethernet Integrated on board | 1 |
| 42U APC Rack Enclosure with perforated doors, sides and levelers | 2 |
| APC Masterswitch 3 Phase 208 | 2 |
| Wiring Harness | 1 |
| 1U All in one KB, VIDEO and MOUSE | 1 |
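
The quantities in the hardware list can be combined into cluster-wide totals. The sketch below is plain Python arithmetic over the compute-node figures (26 nodes, 52 dual-core Opteron 265 processors at 1.8 GHz, 104 DIMMs of 1GB each). The peak figure assumes 2 double-precision floating-point operations per cycle per core, which is an assumption about the processor and is not stated in the configuration. The function name `cluster_totals` is hypothetical.

```python
def cluster_totals(nodes=26, cpus=52, cores_per_cpu=2, clock_ghz=1.8,
                   dimms=104, gb_per_dimm=1, flops_per_cycle=2):
    """Aggregate the compute-node figures from the hardware list.

    flops_per_cycle is an assumed value, not taken from the list.
    """
    cores = cpus * cores_per_cpu
    memory_gb = dimms * gb_per_dimm
    return {
        "cpus_per_node": cpus // nodes,
        "cores": cores,
        "memory_gb": memory_gb,
        "memory_gb_per_node": memory_gb // nodes,
        # Theoretical peak only; measured performance will be lower.
        "peak_gflops": round(cores * clock_ghz * flops_per_cycle, 1),
    }


totals = cluster_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```

The head node is left out of these totals because it is normally used for job submission and management rather than computation.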