The New Frontiers of HPC
Sergio Re
Sales & Marketing Manager
Silicon Graphics Italia
SGI Today
A company focused on innovation
More than $500M in revenue
More than 1,700 people
800+ customer-facing staff
300+ engineers in R&D
6,000+ active customers in more than 50 countries worldwide
Our technical strengths
• The world's most advanced Linux HPC systems
• A unique scalable architecture
• A global shared-memory system
• File systems and storage sharing
• Services and consulting
Revenue by market segment
Revenue Mix: Services 48%, Core 36%, Legacy 16%
Geographic Contribution: Americas 59%, Europe 25%, Rest of World 16%
Market segments: Sciences 36%, Defense & Intelligence 35%, Engineering Analysis 20%, Enterprise Business Management/Media 9%
SGI Systems
Highly Integrated and Massively Scalable
High-Performance Computing
Advanced Graphics
Storage
Joint Development With Partners
SGI and Intel collaborate on system design requirements for large-scale computing on Itanium and Xeon CPUs.
SGI maintains a close working relationship with Novell on Linux support for scalability, performance, and support for SGI servers and storage systems.
SGI works closely with Red Hat on Linux support, with a special emphasis on security and adherence to standards.
SGI has contributed thousands of lines of code to the Linux community, including code that supports large-scale computing, reliability and stability.
SGI and Oracle are jointly developing, selling and marketing enterprise solutions for data-intensive problems.
Computing requirements
Resources: CPU, Memory, I/O
Workloads:
• Weather simulation
• General-purpose compute applications
• Database
• Signal processing
• Web server
• Media streaming
SGI Advanced HPC Platform
SGI Workflow Ready Solution: Workflow Continuum
Intel® Itanium® 2 based
• Large SMP Workflow: SGI® Altix® 4700
• Midrange SMP/Cluster Workflow: SGI® Altix® 450
Intel® Xeon® based
• Cluster Workflow: SGI® Altix® XE, ICE 8200
Application Appliance: SGI® f1200
• Common Linux OS & Development Tools
• SGI Scalable shared file servers & storage solutions
• Common Workload Management Tools
• Systems Management & Monitoring
SGI: Complete HPC Solution on Linux®
SGI ProPack™ for Linux and Third Party Tools
• Compilers:
– Intel C++ and Fortran Compilers for Linux
– GNU Compiler for C and Fortran 77
• Libraries:
– SGI Message Passing Toolkit
– SGI Scientific Computing Software Library
– SGI Flexible File I/O
– Intel Math Kernel Library
– Intel Integrated Performance Primitives
– NAG C, Parallel, Fortran, Fortran SMP, F90
• Automated Parallelization Tools:
– Parallel Software Products ParaWise
• Open Source Development Tools:
– Linuxlinks.org, Freshmeat.net, SourceForge®.net
• SGI Data Management Software
– CXFS™ cluster File System
– DMF hierarchical storage management
• Debuggers:
– Intel Debugger
– Etnus® TotalView®
– GNU GDB
– Allinea Software Distributed Debugging Tool
• Performance and Analysis Tools:
– Intel VTune™ Performance Analyzer
– Intel Trace Analyzer and Trace Collector
– SGI Performance Co-Pilot™
– SGI pfmon and profile.pl
– SGI Histx
• Other SGI ProPack Tools
– REACT 4.2 (real-time support)
– XVM
– NUMA tools (cpuset, dlock, dplace)
– Embedded Support Partner (ESP)
– Graphics support
Standard Linux Distributions
• Novell SUSE™ Linux Enterprise Server 9 and 10
• Red Hat® Enterprise Linux 4
ICE 8200
Breakthrough Reliability
• Carlsbad IRU: 128 cores and no cables
• Redundant, hot swap power and cooling
• Fully Buffered DIMMs to reduce transient errors
• Blade design provides rapid serviceability
• InfiniBand backplane for high signal reliability
Front View
• (16) 2-Socket Nodes
• (2) 4x DDR IB Switch Blades, (1) 24-Port IB switch ASIC per blade
• (7+1) 1625W 12VDC Output Front-End Power Supplies
• (1) Chassis Management Controller (CMC)
• 10U 24-inch EIA Form Factor (17.50-in H x 22.5-in W x 32-in D)
ICE 8200
Breakthrough Performance Density
Up to 512 Cores and 6 TFlops per Rack
• Each 42U rack (30” W x 40” D) has:
– (4) IRUs with (16) 2-Socket Carlsbad Nodes each
– (128) DP Intel® Xeon® sockets
– (48) 4x DDR IB ports on (4) backplanes for torus
• 19” standard rack also supported
• SGI offers optional chilled water-cooled units for use in large system configurations
• 39.5kW (high-bin SKUs + (4) FB DIMMs/socket)
– 31.6kW (assuming 80% system-level derate)
• Rack weight ~2050 lb (246 lb/ft² footprint)
[Rack diagram: four IRUs, each with (16) Carlsbad blades and an L1 display]
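The per-rack figures on this slide can be cross-checked with a little arithmetic. The following is a minimal sketch in Python, not an SGI tool: the value of four cores per socket is an assumption inferred from the slide's own numbers (512 cores across 128 sockets), and all variable names are illustrative.

```python
# Cross-check of the ICE 8200 per-rack figures quoted on the slide.
IRUS_PER_RACK = 4            # from the slide
NODES_PER_IRU = 16           # from the slide
SOCKETS_PER_NODE = 2         # from the slide
CORES_PER_SOCKET = 4         # assumption: 512 cores / 128 sockets
PEAK_TFLOPS_PER_RACK = 6.0   # from the slide
RACK_POWER_KW = 39.5         # from the slide
DERATE = 0.80                # from the slide (80% system-level derate)

sockets = IRUS_PER_RACK * NODES_PER_IRU * SOCKETS_PER_NODE
cores = sockets * CORES_PER_SOCKET
cores_per_iru = NODES_PER_IRU * SOCKETS_PER_NODE * CORES_PER_SOCKET
gflops_per_core = PEAK_TFLOPS_PER_RACK * 1000 / cores
derated_kw = RACK_POWER_KW * DERATE

print("sockets per rack:", sockets)
print("cores per rack:", cores)
print("cores per IRU:", cores_per_iru)
print("peak GFlops per core:", round(gflops_per_core, 1))
print("derated rack power (kW):", round(derated_kw, 1))
```

Under that assumption the socket count, core count, cores per IRU and derated power all agree with the figures quoted on the reliability and density slides.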
SGI Scalable ccNUMA Architecture
Basic Node Interconnect
[Diagram: CPUs with caches attach through interface chips to physical memory; the interface chips are joined by the NUMAlink interconnect]
Open Systems Scalable Infrastructure
[Diagram: the NUMAlink™ Interconnect Fabric links CPU nodes (caches, interface chip, physical memory), RASC™ (FPGA) blades, scalable GPUs, and general purpose I/O interfaces attached through TIO chips]
SGI® Altix® 450
• “Plug and Solve” Blade Form Factor
• Half-rack or Full-rack
• 5U ‘IRU’ Chassis
– Chassis-only option
– 3rd party rack option
• 2 to 38 Sockets
– 4 to 76 cores
• 608 GB SSI Memory
– Increasing to 912GB in 1HCY07
[Diagram: 5U ‘Individual Rack Unit’ chassis with NUMAlink connections, single and double blade slots, I/O slots and power supplies]
Bringing it Together: Solution Components
SGI® Altix® and Altix® XE
• Scalable servers, clusters, and supercomputers
• Cost-efficient, reliable Altix XE clusters with leading density, power efficiency
• Advanced scalability to 512 processors per Altix server and 128TB globally addressable memory per system
• Complete Linux solution for HPC
ISC Star-P™
• Interactive parallel computing platform
• Bridges MATLAB and Altix servers
• Works with familiar desktop tools, while leveraging an HPC for computationally-intensive tasks
• Automatic and transparent, no new programming
Images courtesy of Silicon Graphics, Inc; Interactive Supercomputing Inc.
How Star-P works
• Star-P consists of desktop & server software
• Desktop software – Star-P Client
– Overloads or intercepts desktop tool functions
– Connects and communicates to/with server software – securely
• Server software – Star-P Server on SGI® Altix® or SGI® Altix® XE
– Manages and directs resources – memory, CPUs and I/O
– Contains world class libraries for parallel execution
– User & Session management
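The client/server split described on this slide (familiar operators intercepted on the desktop, work carried out on the server) can be illustrated with a toy sketch. Every class and method name below is hypothetical and written only for this illustration; it is not the Star-P API, and the "server" here is just an in-process object standing in for a remote machine.

```python
# Toy illustration of "overload on the client, execute on the server".
# All names are hypothetical; this is NOT the Star-P API.

class ToyServer:
    """Stands in for the server side: owns the data and does the work."""

    def __init__(self):
        self._store = {}
        self._next_id = 0

    def put(self, values):
        # Store a copy of the data and hand back an opaque handle.
        handle = self._next_id
        self._next_id += 1
        self._store[handle] = list(values)
        return handle

    def get(self, handle):
        return list(self._store[handle])

    def elementwise(self, op, left, right):
        a, b = self._store[left], self._store[right]
        if len(a) != len(b):
            raise ValueError("length mismatch")
        return self.put(op(x, y) for x, y in zip(a, b))


class RemoteArray:
    """Client-side proxy: holds only a handle, never the data itself."""

    def __init__(self, server, handle):
        self.server = server
        self.handle = handle

    @classmethod
    def from_values(cls, server, values):
        return cls(server, server.put(values))

    def __add__(self, other):
        # The familiar "+" is intercepted and forwarded to the server.
        handle = self.server.elementwise(
            lambda x, y: x + y, self.handle, other.handle
        )
        return RemoteArray(self.server, handle)

    def fetch(self):
        # Data returns to the desktop only when explicitly requested.
        return self.server.get(self.handle)


server = ToyServer()
a = RemoteArray.from_values(server, [1, 2, 3])
b = RemoteArray.from_values(server, [10, 20, 30])
total = a + b
print(total.fetch())
```

The point of the sketch is that the user's expression (`a + b`) is unchanged; only the type of the operands decides where the computation runs.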
What’s the Value
• Desktop Users:
– No change in religion
– Interactivity
• On parallel machines
• For large data
– No reprogramming
• No C, Fortran, MPI
– Reduced run times
• Not hours or weeks
– Continued model optimization
• Organization
– Collapse development cycles
– Reduce costs
– Broaden usage
– Shorten solution time
– Accelerate research
Parallel Development Takes Too Long
• Months or years are spent porting from desktops to parallel systems
• No interactivity on parallel machines from the desktop
• Little ability to iterate
• Long compute times for batch runs: hours to days
• Analyst’s ability to optimize the model is limited
Using Star-P – Serial operations
• Use MATLAB
– File Editor
– Profiler
– Debugger
– Array Editor
– Desktop
– Visualization
– Small Calculations
• Running Star-P does not affect normal MATLAB environment
• Problems that can be solved on desktop stay on desktop
SGI® RASC™ Solution:
Simplifies Development & Improves Programmer Efficiency
• Gnu Debugger (GDB): FPGA aware; simultaneous debugging of both the CPU based app and the FPGA accelerated app
• RASC Abstraction Layer (RASCAL): SGI provided; enables serial or parallel FPGA scaling
• RASC API and Core Services Library: provides tools to develop reconfigurable computing elements in a multi-user, multiprocessing environment
• 3rd Party HLL Development Tools: Mitrionics Mitrion C, Impulse-C and ROCCC
• Synplicity Synplify Pro and Xilinx Synthesis Technology: supported within RASC environment; for advanced incremental and modular design methodologies
How Do FPGAs Differ from Traditional CPUs?
Identify RASC appropriate algorithm
[Chart: Compare Application Run Time %’s. Bars for App 1 to App 5; y-axis: % of Runtime (0% to 100%); series: Algorithm, Memory Calls, Branch inst.]
Export Algorithm to RASC
[Chart: Application Run-Time Comparison. Job Run Time for the Traditional Method (CPU only) versus the RC100 Method (key algorithm running on FPGA), with the difference in algorithm execution time marked as Time Savings]
Directly map computationally intensive algorithms to hardware with RC100 technology
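How much job run time is saved by moving one key algorithm onto an FPGA depends on the share of run time that algorithm accounts for, which is the standard Amdahl's law relationship. The sketch below is illustrative only: the 80% fraction and 20x kernel speedup are made-up inputs, not measurements from the slide, and the function name is hypothetical.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-job speedup when only part of the run time is accelerated.

    accelerated_fraction: share of the original run time spent in the offloaded
                          algorithm (between 0.0 and 1.0).
    kernel_speedup:       how many times faster that algorithm runs once offloaded.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    # New run time, as a fraction of the original run time.
    remaining = (1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup
    return 1.0 / remaining


# Hypothetical inputs: 80% of run time in the key algorithm, 20x faster on the FPGA.
example = overall_speedup(0.80, 20.0)
print(f"overall speedup: {example:.2f}x")
print(f"run time saved: {(1 - 1 / example) * 100:.0f}%")
```

This is why the first step on the slide is to identify applications in which the algorithm, rather than memory calls or branch instructions, dominates run time.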
SGI® RASC™ RC100 Blade
[Block diagram labels: two V4LX200 FPGAs, SRAM banks, two TIO chips, NL4 links, SSP links, PCI, Loader, Selmap]
SGI Workflow Ready Solution – Segment Example – Fluid Structure Interaction (FSI)
Any combination of Altix servers & XE sharing storage resources
CAPABILITY
• SGI Altix 450/4700: SMP & super head node
• Minimize time-to-solution for the largest & most demanding problems
CAPACITY
• SGI Altix XE 1200/1300 (x86-64) clusters
• Cost-effective solution & performance leader for most analyses
Interconnect and storage
• InfiniBand or GigE Fabric
• Fibre Channel or InfiniBand or GigE Fabric
• SGI® Scalable NAS (and other shared file servers)
SGI Solution: FSI Workload
1. Altix XE
• Modest memory addressability (~24GB+/core)
2. Altix 450/4700
• Large memory addressability (~48GB/core)
• Option for B/W blades
3. Storage
• High Speed SAS (~250GB/core)
• 4 SAS disks per XE node
Optimally meet the diverse needs of all workloads or procurement drivers
Advanced
Graphics
Storage
SGI® InfiniteStorage Hardware
Multi-purpose RAID Systems
Streaming Real-time RAID
4500
6700
• Max performance
• 4Gb FC or IB
• Enterprise S/W
• FC RAID / SATA
Low-cost SATA RAID
350
• Multiple high resolution streaming
• Isochronous
• 4 Gb connectivity
• 500GB SATA drives
Ultra-dense RAID
10000
4000
• 4Gb Fibre Channel
• Ultimate Price/Performance
• FC RAID / SATA
• Ultra-high density
• Tape complement or replacement
• One rack – 240 TB
SAN
NAS
• Completely integrated
• Easy to deploy
• Grows with customers’ business
JBOD
120
• Easy to deploy modular scalability
Data Management Software Stack
Storage Product Integration
DMF
CXFS
NFS
RDMA accelerated (NAS)
XFS
SGI System
Altix Servers
3rd Party
Disks, Fiber Channel Switches
InfiniteStorage Appliance Manager
SGI® InfiniteStorage: the solutions
SGI® InfiniteStorage Data Migration Facility (DMF) transparently migrates files from online storage to near-line storage according to the assigned time-based criteria
• This lowers TCO
• Increases ROI and productivity
• Is easier to manage
• Reduces the risk of data loss
• Protects the initial investment
• Combines data availability with data security
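The idea of migrating files by a time-based rule can be sketched in a few lines. This is only an illustration of age-based selection, not DMF's actual policy engine or configuration; the function name, the catalogue structure and the 90-day threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def select_for_migration(last_access, now, max_idle):
    """Return the paths whose last access is older than max_idle.

    last_access: dict mapping a file path to its last-access datetime
                 (a hypothetical catalogue, not a DMF data structure).
    now:         the reference time for the policy decision.
    max_idle:    timedelta after which a file becomes a near-line candidate.
    """
    return sorted(
        path for path, accessed in last_access.items()
        if now - accessed > max_idle
    )


# Hypothetical catalogue of three files and a 90-day idle threshold.
now = datetime(2007, 6, 1)
catalogue = {
    "/data/run_2005.dat": datetime(2005, 11, 3),
    "/data/run_2007.dat": datetime(2007, 5, 20),
    "/data/mesh_2006.dat": datetime(2006, 12, 24),
}
candidates = select_for_migration(catalogue, now, timedelta(days=90))
print(candidates)
```

In this sketch only the two files idle for more than 90 days are selected; the recently used file stays on online storage.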
SGI® InfiniteStorage Shared File System CXFS™
• All files are shared
• They are not copied
• No space is wasted
• Time is saved
• Money is saved
[Diagram: shared Files A through I]
Designed for: Decision Support Centres, Surveillance/Homeland Security/Crisis, C4I battlefield command and control
Interested? It is called the Pixelfusion Environment
Media Fusion Process
[Diagram: Input (Native Render, Network IP Streams, Local) feeds Fusion, which feeds Output (Render to Pipes, Local Display, Stream to IP Streams, Record/Retrieve, Network)]
HPC Technology Investment Strategy
• Packaging
– Consistency
– Density & Reliability
– Energy Efficiency
• Interconnect
– Reduce Cost
– Increase Value
• Data Management
Oracle® TimesTen: In-Memory DB Customer Benchmark Results
Government Customer’s Data & Tests (May ’06)
Incumbent 96GB System vs. Altix 960GB
• Ingest order-records: 5x improvement
• Ingest person-records: 12x improvement
• Query: 1 per sec (est.) vs. 91K/sec
• Join Data: 1 every min vs. 13K/sec
• Sub-Query: 1 every 5 mins vs. 2.5K/sec
SGI Altix Servers Support More Memory
System               Maximum Memory   Memory/Core
SGI Altix 4700       128 TB           128 GB/core
IBM p595             2 TB             32 GB/core
HP Superdome         2 TB             16 GB/core
Sun Enterprise 25K   1 TB             8 GB/core
Source: Ideas International, Inc. – February 2007
• SGI Altix 4700 supports more memory
• Fewer cores are required to support the same level of memory
• Lower TCO:
• Spend less on processors
• Spend less on software licenses
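The claim that fewer cores are needed for the same amount of memory follows directly from the Memory/Core column. The sketch below works this out for a 1 TB target; it assumes binary units (1 TB = 1024 GB) and that memory scales linearly with core count at the quoted ratio, which is a simplification. The names are illustrative.

```python
import math

GB_PER_TB = 1024  # assumption: binary terabytes

# Memory/Core figures from the table (GB per core).
gb_per_core = {
    "SGI Altix 4700": 128,
    "IBM p595": 32,
    "HP Superdome": 16,
    "Sun Enterprise 25K": 8,
}


def cores_needed(memory_gb, per_core_gb):
    """Smallest number of cores that can carry memory_gb at the given ratio."""
    return math.ceil(memory_gb / per_core_gb)


target_gb = 1 * GB_PER_TB
needed = {name: cores_needed(target_gb, ratio) for name, ratio in gb_per_core.items()}

for name, cores in needed.items():
    print(f"{name}: {cores} cores for 1 TB")
```

Where software is licensed per core or per processor, the difference in core count carries over to licence cost, which is the TCO argument the slide makes.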
SGI Altix 4700 Requires Less Floorspace
Dense System Packaging is one of SGI’s Core Competencies
System Footprint
System           Dimensions    Area
SGI Altix 4700   45” x 26”     1170 sq in
HP SuperDome     60” x 48”     2880 sq in
Sun E25K         65” x 33”     2145 sq in
IBM p595         52” x 31”     1612 sq in
[Chart: width, depth and area for each system]
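The area figures are the product of the two footprint dimensions quoted for each system, in square inches. A short sketch reproduces them and expresses each footprint relative to the Altix 4700; the variable names are illustrative, and the slide does not say which dimension is width and which is depth, so the code does not distinguish them.

```python
# Footprint dimensions in inches, as quoted on the slide.
footprints_in = {
    "SGI Altix 4700": (45, 26),
    "HP SuperDome": (60, 48),
    "Sun E25K": (65, 33),
    "IBM p595": (52, 31),
}

# Area in square inches is the product of the two dimensions.
areas_sq_in = {name: a * b for name, (a, b) in footprints_in.items()}

# Footprint of each system relative to the Altix 4700.
baseline = areas_sq_in["SGI Altix 4700"]
relative = {name: round(area / baseline, 2) for name, area in areas_sq_in.items()}

for name in footprints_in:
    print(f"{name}: {areas_sq_in[name]} sq in ({relative[name]}x Altix 4700)")
```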
SGI Altix Innovative Power Architecture
SGI Servers are Twice as Efficient
Typical Power Architecture (server)
• AC is converted to 48VDC, then down to the board voltages (1.85v, 12v, 3.3v, 1.2v) on the board and additional boards
• Stage efficiencies: ~80%, ~80%, ~70%
• 80% x 80% x 70% = 45% efficiency
SGI Power Architecture (SGI Altix Server)
• No 48V conversion
• High Efficiency Power Converters feed the board voltages (1.85v, 12v, 3.3v, 1.2v) on the board and additional boards
• Stage efficiencies: 90%, 85%
• 90% x 85% = 76% efficiency
• Proprietary Power Design
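End-to-end efficiency of a power chain is the product of its stage efficiencies. The sketch below applies that to the per-stage figures shown in the two diagrams (80%, 80%, 70% for the typical chain; 90% and 85% for the SGI chain). The 10 kW load is a made-up example value, and the function names are illustrative.

```python
from math import prod

# Per-stage conversion efficiencies as labelled in the two diagrams.
typical_stages = [0.80, 0.80, 0.70]
sgi_stages = [0.90, 0.85]


def chain_efficiency(stages):
    """Overall efficiency of conversion stages connected in series."""
    return prod(stages)


def wall_power_kw(load_kw, stages):
    """AC power drawn to deliver load_kw at the end of the chain."""
    return load_kw / chain_efficiency(stages)


typical_eff = chain_efficiency(typical_stages)   # about 0.448
sgi_eff = chain_efficiency(sgi_stages)           # about 0.765

load_kw = 10.0  # hypothetical load, for illustration only
print(f"typical chain: {typical_eff:.1%}, {wall_power_kw(load_kw, typical_stages):.1f} kW at the wall")
print(f"SGI chain:     {sgi_eff:.1%}, {wall_power_kw(load_kw, sgi_stages):.1f} kW at the wall")
```

The products come to roughly 0.45 and 0.76, matching the efficiency figures quoted on the slide.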
Interconnect Strategy: Reduce Cost & Increase Capability
• NUMAlink4 (Today)
– Custom copper cable
– Custom signaling
– Custom protocol
– $450 (5m)
• NUMAlink5: hw extension of IFB (’09)
– COTS InfiniBand 12x copper cable
– COTS serdes
– Custom protocol (higher capability)
– $150 (5m)
Copper: Weight becoming significant
Picture credit: LRZ
SGI NumaLink System Architecture
Very Large Shared Memory
• Globally Addressable
• Low Latency
• High Bandwidth
• Many Ports
[Diagram: CPUs and an MPU attached to the shared memory]
Sergio Re
– Tel. 02.36547100
– E-mail [email protected]
– WWW.SGI.COM