LHC Computing and Tier2
The needs - Work in progress - The technologies (GRID)
ATLAS as the reference experiment (example)
Results achieved - Plans
The INFN commitment, national and local
Mixed Italian-English, with extensive re-use of slides…
These slides are available at http://www.mi.infn.it/~perini/milano/calcolo-lhc-tier2
4 July 2005
Milano
L.Perini
Outline of the presentation
• LHC, computing for the experiments
– Computing needs, then status and plans
• The GRID technology
– What it is and how it comes into play
• The computing models
– Planned resources and their organisation (Tier1,2,3)
– INFN planning and upcoming milestones
Specifically for ATLAS
• Checks of the model and results achieved
• The INFN Tier2s
• The role of Milano and its Tier2 prototype
CERN
CERN (founded 1954) = "Conseil Européen pour la Recherche Nucléaire", the "European Organisation for Nuclear Research"
Particle Physics
Annual budget: ~1000 MSFr (~700 M€)
Staff members: 2650 + 225 Fellows, + 270 Associates
Member states: 20
+ 6000 CERN users
Large Hadron Collider: 27 km circumference tunnel
Particle physics data
From raw data to physics results
[Figure: from raw data to physics results. Labels: Raw data; Detector response (apply calibration, alignment); Convert to physics quantities; Interaction with detector material (pattern recognition, particle identification); Fragmentation, decay (physics analysis); Basic physics (e+ e- → Z0 → f f̄); Results. Stages: Reconstruction, Analysis, Simulation (Monte-Carlo).]
Challenge 1: Large, distributed community
~5000 physicists around the world, around the clock
"Offline" software effort: 1000 person-years per experiment
Software life span: 20 years
Experiments shown: ATLAS, CMS, LHCb
LHC users and participating institutes
LCG: The worldwide Grid project
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
CMS
ATLAS is not one experiment
Physics topics: Extra Dimensions, Higgs, Heavy Ion Physics, SUSY, Electroweak, B physics, QCD
First data for physics in 2008
Challenge 2: Data Volume
Annual data storage: 12-14 PetaBytes/year
50 CD-ROM = 35 GB (a 6 cm stack)
CD stack with 1 year LHC data: ~20 km
For comparison: Balloon (30 km), Concorde (15 km), Mt. Blanc (4.8 km)
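The CD-stack comparison can be checked with a few lines of arithmetic. This is only a sketch using the figures on the slide (50 CDs = 35 GB = 6 cm); the decimal conversion 1 PB = 10^6 GB is an assumption, and the function name is illustrative.

```python
# Figures quoted on the slide
CDS_PER_STACK = 50          # "50 CD-ROM"
STACK_CAPACITY_GB = 35.0    # "= 35 GB"
STACK_HEIGHT_CM = 6.0       # "6 cm"

GB_PER_PB = 1_000_000       # assumption: decimal units (1 PB = 10^6 GB)


def cd_stack_height_km(petabytes: float) -> float:
    """Height in km of a stack of CDs holding `petabytes` of data."""
    gb_per_cd = STACK_CAPACITY_GB / CDS_PER_STACK   # 0.7 GB per CD
    cm_per_cd = STACK_HEIGHT_CM / CDS_PER_STACK     # 0.12 cm per CD
    n_cds = petabytes * GB_PER_PB / gb_per_cd
    return n_cds * cm_per_cd / 100_000              # cm -> km


for pb in (12, 14):
    print(f"{pb} PB/year -> {cd_stack_height_km(pb):.1f} km of CDs")
```

Under these assumptions 12-14 PB/year gives a stack of roughly 20 to 24 km, in line with the "~20 km" quoted on the slide.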
Challenge 3: Find the Needle in a Haystack
From all interactions to the HIGGS: 9 orders of magnitude!
Rare phenomena, huge background
Complex events
Therefore: Provide mountains of CPU
Calibration, Reconstruction, Simulation, Analysis
For LHC computing, some 100 Million SPECint2000 are needed!
Produced by Intel today in ~6 hours
1 SPECint2000 = 0.1 SPECint95 = 1 CERN-unit = 4 MIPS
A 3 GHz Pentium 4 has ~1000 SPECint2000
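To make the scale concrete, the requirement can be converted into a number of processors. This is a rough sketch based only on the two figures quoted on the slide (100 million SPECint2000 needed, ~1000 SPECint2000 per 3 GHz Pentium 4); the helper name is illustrative.

```python
REQUIRED_SI2K = 100_000_000   # "some 100 Million SPECint2000"
SI2K_PER_PENTIUM4 = 1000      # "a 3 GHz Pentium 4 has ~1000 SPECint2000"
MIPS_PER_SI2K = 4             # "1 SPECint2000 = ... = 4 MIPS"


def processors_needed(required_si2k: int, si2k_per_cpu: int) -> int:
    """Smallest whole number of CPUs that covers the requirement."""
    return -(-required_si2k // si2k_per_cpu)   # ceiling division


n_cpus = processors_needed(REQUIRED_SI2K, SI2K_PER_PENTIUM4)
print(f"{n_cpus} Pentium 4 class processors")
print(f"{REQUIRED_SI2K * MIPS_PER_SI2K:.1e} MIPS in total")
```

With these numbers the requirement corresponds to about 100,000 processors of that class.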
The CERN Computing Centre
~2,400 processors
~200 TBytes of disk
~12 PB of magnetic tape
Even with technology-driven improvements in performance and costs, CERN can provide nowhere near enough capacity for LHC!
What is the Grid?
• Resource Sharing
– On a global scale, across the labs/universities
• Secure Access
– Needs a high level of trust
• Resource Use
– Load balancing, making most efficient use
• The "Death of Distance"
– Requires excellent networking
– 5.44 Gbps: 1.1 TB in 30 min.
• Open Standards
– Allow constructive distributed development
• There is not (yet) a single Grid
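The networking figure quoted on this slide (5.44 Gbps, 1.1 TB in 30 min.) can be cross-checked with a short calculation. This is a sketch that assumes the rate is sustained for the whole 30 minutes; whether the slide's "TB" is decimal or binary is not stated, so both are computed.

```python
def transferred_bytes(rate_gbps: float, minutes: float) -> float:
    """Bytes moved at a sustained rate given in gigabits per second."""
    bytes_per_second = rate_gbps * 1e9 / 8
    return bytes_per_second * minutes * 60


n_bytes = transferred_bytes(5.44, 30)
tb_decimal = n_bytes / 1e12      # 1 TB  = 10^12 bytes
tib_binary = n_bytes / 2 ** 40   # 1 TiB = 2^40 bytes

print(f"{tb_decimal:.3f} TB (decimal), {tib_binary:.3f} TiB (binary)")
```

The result is about 1.2 TB in decimal units, or about 1.1 when counted in binary units, so the two numbers on the slide are mutually consistent.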
How will it work?
The GRID middleware:
• Finds convenient places for the scientist's "job" (computing task) to be run
• Optimises use of the widely dispersed resources
• Organises efficient access to scientific data
• Deals with authentication to the different sites that the scientists will be using
• Interfaces to local site authorisation and resource allocation policies
• Runs the jobs
• Monitors progress
• Recovers from problems
… and ….
Tells you when the work is complete and transfers the result back!
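The first steps in the list (find a convenient place, respect site policies, stay close to the data) can be illustrated with a toy matchmaker. This is a deliberately simplified sketch and not the actual LCG/EGEE middleware: the `Site` and `Job` classes, the site names, and the ranking rule (prefer sites that hold the data, then the most free CPUs) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    free_cpus: int
    datasets: set = field(default_factory=set)      # data held locally
    accepted_vos: set = field(default_factory=set)  # local authorisation policy


@dataclass
class Job:
    owner_vo: str   # virtual organisation (experiment) of the submitter
    dataset: str    # input data the job needs


def choose_site(job, sites):
    """Return the preferred site for the job, or None if no site accepts it."""
    candidates = [
        s for s in sites
        if job.owner_vo in s.accepted_vos and s.free_cpus > 0
    ]
    if not candidates:
        return None
    # Prefer sites that already hold the data, then the most free CPUs.
    return max(candidates, key=lambda s: (job.dataset in s.datasets, s.free_cpus))


sites = [
    Site("site-a", 10, {"dataset-1"}, {"atlas"}),
    Site("site-b", 200, set(), {"atlas", "cms"}),
    Site("site-c", 500, {"dataset-1"}, {"cms"}),
]
job = Job("atlas", "dataset-1")
print(choose_site(job, sites).name)
```

Here the ATLAS job goes to `site-a`: `site-c` holds the data but does not accept ATLAS jobs, and `site-b` has more free CPUs but would need the data to be copied first.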
The LHC Computing Grid Project - LCG
Collaboration
LHC Experiments
Grid projects: Europe, US
Regional & national centres
Choices
Adopt Grid technology.
Go for a "Tier" hierarchy.
Use Intel CPUs in standard PCs
Use LINUX operating system.
Goal
Prepare and deploy the computing environment to help the experiments analyse the data from the LHC detectors.
[Figure: the Tier hierarchy. CERN Tier 0 and CERN Tier 1; Tier 1 centres (France, Italy, UK, USA, Japan, Germany, Taipei); Tier2 centres (Lab a, Lab b, Lab c, Lab m, Uni a, Uni b, Uni n, Uni x, Uni y); Tier3 (physics department); Desktop. Overlays: grid for a regional group; grid for a physics study group.]
But in practice?
• Everything is by now very real: we are actually implementing what the preceding English slides describe
– Those were deliberately in a "popularising, promotional" style
• So I now report part of what I showed a week ago to the CdS (and more)
I will not explain again what a GRID is: that was done last year, and GRID is by now a real, working infrastructure (even if not yet working as we would like), and a BIG one.
The question is no longer "will GRID work?"; it is time to make it work the way the experiments want, and quickly, because 2008 is close.
• First of all, the networks (Mauro)
TODAY: the TDRs!!!
• The software and computing of the LHC experiments and of LCG have reached a first important milestone:
– The Technical Design Reports for the Computing & Software of the 4 experiments and of LCG were completed and made available 2 weeks ago
– The 5 documents were released together, which is significant:
• LCG http://lcg.web.cern.ch/LCG/tdr/ and ATLAS http://doc.cern.ch//archive/electronic/cern/preprints/lhcc/public/lhcc2005-022.pdf
• The related Memoranda of Understanding (MoU) are ready in draft form, and the signing process is expected to start at the next Computing RRB in October
– The process of defining the INFN Tier2s should also come to a head around October
• Sites and characteristics, in terms of hardware as well as infrastructure and service (personnel…)
• Slides on the computing model (ATLAS) as reported in the TDRs follow
Processing Roles
Tier-0:
• First pass processing on express/calibration physics stream
• 24-48 hours later, process full physics data stream with reasonable calibrations
• Curate the full RAW data and curate the first pass processing
• Getting good calibrations in a short time is a challenge
• There will be large data movement from T0 to T1s
Tier-1s:
• Reprocess 1-2 months after arrival with better calibrations
• Curate and reprocess all resident RAW at year end with improved calibration and software
• Curate and allow access to all the reprocessed data sets
• There will be large data movement from T1 to T1 and significant data movement from T1 to T2
Tier-2s:
• Simulation will take place at T2s
• Simulated data stored at T1s
• There will be significant data movement from T2s to T1s
• Need partnerships to plan networking
• Must have fail-over to other sites
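The data movements named on this slide can be written down as a small lookup table, which is handy when planning which network links matter for each tier. This is only an illustrative encoding; the "large" and "significant" labels are the slide's own wording, while the dictionary layout and the helper are assumptions.

```python
# (source tier, destination tier) -> volume, as worded on the slide
DATA_FLOWS = {
    ("T0", "T1"): "large",
    ("T1", "T1"): "large",
    ("T1", "T2"): "significant",
    ("T2", "T1"): "significant",
}


def outgoing(tier: str) -> list:
    """Tiers that receive data from the given tier, in alphabetical order."""
    return sorted(dst for (src, dst) in DATA_FLOWS if src == tier)


for tier in ("T0", "T1", "T2"):
    print(tier, "->", outgoing(tier))
```

In this encoding the Tier-1s are the only tier exchanging data in every direction, which matches their role as the hub for reprocessed and simulated data.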
Analysis Roles
Analysis model broken into two components
• Scheduled central production of augmented AOD, tuples & TAG collections from ESD
– Done at T1s
– Derived files moved to other T1s and to T2s
• Chaotic user analysis of (shared) augmented AOD streams, tuples, new selections etc. and individual user simulation and CPU-bound tasks matching the official MC production
– Done at T2s and local facility
– Modest job traffic between T2s
Required CERN Resources
• Includes Heavy Ions data
• No replacements included
Tier-0 Growth
                      2007      2008      2009      2010      2011      2012
Total Disk (TB)     164.69    388.35    388.35    530.02    530.02    530.02
Total Tape (TB)    1956.68   6658.58  11360.48  17745.54  24130.61  30515.68
Total CPU (kSI2k)  1826.29   6351.82   6351.82   8705.70   8705.70   8705.70
CERN T1AF Growth
                      2007      2008      2009      2010      2011      2012
Total Disk (TB)     751.17   1994.94   2608.06   3813.56   4638.36   5463.17
Total Tape (TB)     208.09    583.62    854.53   1304.43   1678.37   2052.30
Total CPU (kSI2k)   973.51   2975.19   4513.70   7446.50   9834.00  12221.50
Required Non-CERN Resources
The growth is challenging!
Tier-1 Cloud Growth
                       2007      2008      2009      2010      2011      2012
Total Disk (TB)    5541.362  16207.46  24223.35  43384.97  58896.52  74408.07
Total Tape (TB)    3015.246  10796.33  20101.14  33523.45  48998.36  66525.88
Total CPU (kSI2k)  8137.721  28803.27  51382.47  86595.49  130571.6  180652.5
Tier-2 Cloud Growth
                       2007      2008      2009      2010      2011      2012
Disk (TB)           3213.21  10344.57  17511.25  27421.78  37332.32  47250.97
CPU (kSI2k)         7306.48  22374.28  34203.47  55449.91  73557.39  91664.87
Reality check
Snapshot of 2008 Tier-1 status
• The requirement is not completely matched by the current pledges
• The table presents ATLAS's estimate of the lower bound on the Tier-1 resources available
• Recent indications suggest the CPU shortage may be met, but the storage remains a problem
• Recovery strategies reduce access to data
• The disk shortage is serious
Summary Tier1s, split 2008, ATLAS
                 Offered   Required   Balance
CPU (kSI2K)        20397      27000      -24%
Disk (Tbytes)      10076      15500      -35%
Tape (Pbytes)      10.29       10.1        3%
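The "Balance" figures appear to be the relative difference between offered and required resources. The sketch below assumes that definition, (offered - required) / required, which is not stated on the slide, and uses the rounded values shown there.

```python
def balance(offered: float, required: float) -> float:
    """Relative surplus (positive) or shortfall (negative) of the pledges."""
    return (offered - required) / required


# Offered and required 2008 Tier-1 resources for ATLAS, as shown on the slide
rows = {
    "CPU (kSI2K)": (20397, 27000),
    "Disk (Tbytes)": (10076, 15500),
    "Tape (Pbytes)": (10.29, 10.1),
}

for name, (offered, required) in rows.items():
    print(f"{name}: {balance(offered, required):+.1%}")
```

With this definition the CPU and disk balances reproduce the slide (-24% and -35%). The tape balance comes out near +2% rather than the 3% shown, possibly because the slide was computed from unrounded inputs.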
Grid Projects
Until deployments provide interoperability the experiments must provide it themselves
ATLAS must span 3 major Grid deployments
EGEE
• Proposal submitted to EU IST 6th framework
• Project started April 1st 2004
• Total approved budget of approximately 32 M€ over 2 years
• Activities:
– Deployment and operation of Grid Infrastructure (SA1)
– Re-engineering of grid middleware (WSRF environment) (JRA1)
– Dissemination, Training and Applications (NA4)
• Italy takes part in all 3 areas of activity, with total funding of 4.7 M€ (all the INFN funding is used to pay personnel salaries, apart from the overhead retained centrally by INFN)
• 11 regional federations covering 70 partners in 26 countries
JRA1 middleware, SA1, NA and the INFN role
• Objectives of JRA1
– Provide robust, supportable middleware components
– Integrate grid services to provide a consistent functional basis for the EGEE grid infrastructure
• 5 partners, approx. 16% of total project budget
• The Italy-Czech cluster is the largest one; Prelz is its manager
• Almost 28 FTE (across 34 people) work on JRA1 in INFN today, and 23 FTE are paid from EGEE funds
• SA1 = GRID infrastructure: it coincides with the EGEE deployment (hardware and people in charge)
• SA1 funds 28 FTE in INFN; adding the NA4 activities that bring new applications into EGEE gives the total:
• 53 FTE on INFN contracts (art.23 and art.2222) funded by EGEE (6 in Milano)
• The negotiation phase of EGEE2 has started: a further two-year project that will begin when EGEE ends, and that we expect will give INFN funding similar to that of EGEE.
GRID and Tiers in ATLAS
• The INFN investment (and ours in Milano) in the GRID projects is motivated by their use in the LHC experiments
– The Milano Tier2 prototype is declared a 100% ATLAS resource: ATLAS is its priority user, but it is open to everyone else for non-priority use
• This follows a general LCG rule, already in use as you will see shortly
– The ATLAS computing model (TDR!) and all the productions of the last year are based on the use of Grid (INFN share = 10.5% of the ATLAS total)
Some ATLAS-specific slides illustrating this point follow…
But what exists now, and how does it work?
From the presentation given by the ATLAS spokesman two weeks ago to the LCG POB
(the POB is the oversight board with the funding agencies: a member of the INFN executive board and the foreign equivalents)
Already shown to the CdS
ATLAS Production System
A big problem is data management
Must cope with >= 3 Grid catalogues
Demands even greater for analysis
[Figure: production system architecture. Components shown: ProdDB, AMI, Don Quijote (Data Management System), Windmill supervisors (super), and the executors (exe) Lexor, Dulcinea, Capone and LSF, communicating via jabber and soap, with an RLS catalogue for each of LCG, NG and Grid3.]
ATLAS on the LCG/EGEE Grid
Input prepared by D Barberis (ATLAS Computing Coordinator), and
with direct contributions from L Perini, G Poulard, O Smirnova,
and many others
LCG POB 20th June 2005, P Jenni
Massive productions on 3 Grids
July-September 2004: DC2 Geant-4 simulation (long jobs)
40% on LCG/EGEE Grid, 30% on Grid3 and 30% on NorduGrid
October-December 2004: DC2 digitization and reconstruction (short jobs)
February-May 2005: Rome production (mix of jobs as digitization and reconstruction was started as soon as samples had been simulated)
65% on LCG/EGEE Grid, 24% on Grid3, 11% on NorduGrid
My addition: CPU consumed for "ROME" as of 20 May, with the simulation phase ~finished (jobs were added later, but they used little CPU)
Grid3: 80 kSI2K.years
NorduGrid: 22 kSI2K.years
LCG-tot: 178 kSI2K.years
Total: 280 kSI2K.years
Note that most of this CPU was used in about 40 days, and that the results were used for physics analysis
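The CPU figures quoted for the Rome production can be turned into per-Grid shares of CPU time. This is a small sketch using only the three numbers on the slide; the variable names are illustrative.

```python
# CPU consumed for the Rome production, in kSI2k.years (figures from the slide)
cpu_ksi2k_years = {"Grid3": 80, "NorduGrid": 22, "LCG": 178}

total = sum(cpu_ksi2k_years.values())
shares = {grid: used / total for grid, used in cpu_ksi2k_years.items()}

print(f"Total: {total} kSI2k.years")
for grid, share in shares.items():
    print(f"{grid}: {share:.0%}")
```

The three contributions add up to the 280 kSI2k.years quoted as the total. By CPU time the split is roughly 64% LCG, 29% Grid3 and 8% NorduGrid, close to but not identical with the 65/24/11 split quoted for the Rome production, which may be counted on a different basis (for instance number of jobs).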
Rome production statistics
73 data sets containing 6.1M events simulated and reconstructed (without pile-up)
Total simulated data: 8.5M events
Pile-up done later (for 1.3M events done, 50K reconstructed)
[Pie chart: Number of Jobs. LCG 34%, LCG-CG 31%, Grid3 24%, NorduGrid 11%]
Rome ATLAS Physics Workshop (6-11 June 2005)
441 registered participants
(preliminary) 91 entries (out of about 100 talks), 21 F plus 70 M
Milano's presence was significant (both LAr and Pixel): 1 session organiser and 3 speakers
Note that 75% of the work on LCG was done by people from Milano!!!
This is the first successful use of the grid by a large user community, which has however also revealed several shortcomings which now need to be fixed, as LHC turn-on is only two years ahead!
Very instructive comments from the user feedback have been presented at the Workshop (obviously this was one of the main themes and purposes of the meeting)
All this is available on the Web
ATLAS Rome production: countries (sites)
• Austria (1)
• Canada (3)
• CERN (1)
• Czech Republic (2)
• Denmark (3)
• France (4)
• Germany (1+2)
• Greece (1)
• Hungary (1)
• Italy (17)
• Netherlands (2)
• Norway (2)
• Poland (1)
• Portugal (1)
• Russia (2)
• Slovakia (1)
• Slovenia (1)
• Spain (3)
• Sweden (5)
• Switzerland (1+1)
• Taiwan (1)
• UK (8)
• USA (19)
22 countries, 84 sites
Subtotals shown on the slide: 17 countries, 51 sites; 7 countries, 14 sites
ATLAS Rome production: Number of Jobs
573315 jobs, 22 countries, 84 sites
[Pie chart: number of jobs per site, with a legend listing the individual sites; the labelled slices are between 4% and 6% each.]
Rome production on the LCG/EGEE Grid
[Pie chart: number of jobs per LCG/EGEE site, with a legend listing the individual sites; the largest slice is 25%.]
Experience with LCG-2 Operations
Support for our productions was excellent from the CERN-IT-EIS team
(Simone Campana in particular)
Other LCG/EGEE structures were effectively invisible (GOC, ROCs, GGUS etc)
- no communication line between experiments and the
Grid Operations Centres
- operational trouble info always through the EIS group
- sites scheduled major upgrades or downtimes during our productions
- no concept of “service” for the service providers yet!
- many sites consider themselves as part of a test structure set up
(and funded) by EGEE
- but we consider the LCG Grid as an operational service for us!
- many sites do not have the concept of “permanent disk storage” in a
Storage Element
- if they change something in the filing system, our catalogue has to be
updated
Current plans for EGEE/gLite
We are ready to test new components as soon as they are released from the internal
certification process
So far we had only the File Transfer Service and the LCG File Catalogue
- Both being actively tested by our Distributed Data Management group
(M.Branco & D.Cameron)
- FTS will be field-tested by Service Challenge 3 starting in July
- LFC is in our plan for the new DDM system to be deployed during
the Summer
Neither the new Workload Management System nor the new Computing Element releases
are available yet
- But we are ready to test them at large, and tests have started at some
places for some components
As soon as the performance is acceptable we will ask to have them deployed
(… but this cannot be a blank check)
Plans for further productions
Summer 2005:
- Deployment of new Distributed Data Management system
- Test and deployment of new version of the Production System
- Decision on the baseline for the Distributed Analysis System
Autumn 2005:
- Software release 11 (including trigger and calibration chain)
- Start of production of simulated events for detector commissioning
- Start of production of simulated events with miscalibrated and
misaligned conditions for Computing System Commissioning
Winter 2006:
- Computing System Commissioning (formerly called DC3)
- Progressive tests of all system components, increasing complexity
and scale
- Detailed planning under discussion this month
Italian commitment to LHC computing
• Amount of resources proportional to the fraction of Italian physicists in each experiment and to the responsibilities in computing
– INFN/ATLAS=11%, INFN/CMS=15% ….
• Tier1 at CNAF
• Tier2s at no more than 9 sites in total
– The hardware resources in the Tier2s of an experiment in Italy do not depend significantly on the number of its Tier2s (if there are 3, each will be 4/3 of what it would be if there were 4)
• Slides on these topics follow, as presented by the LHC computing and Grid referees to CSN1 in May
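The 4/3 remark follows directly from splitting a fixed total among fewer sites. A minimal sketch, assuming the total is divided equally among an experiment's Tier2s (the equal split is an assumption made here for illustration):

```python
from fractions import Fraction


def per_site_share(total, n_sites: int) -> Fraction:
    """Resources per Tier2 when a fixed total is split equally among n sites."""
    return Fraction(total) / n_sites


# The same total split among 3 sites instead of 4
ratio = per_site_share(1, 3) / per_site_share(1, 4)
print(ratio)
```

The ratio is 4/3 whatever the total, which is why the number of Tier2s changes the size of each site but not the experiment's overall hardware.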
INFN funding
• The Tier1 was funded in 2003 with 22 M€
– This is the reference figure for the total investment, including the infrastructure.
– Tenders authorised case by case by the referees
– About 5 M€ spent on the infrastructure and the farms
– The capacities stated in the MoU must be achievable with the allocated funds
– Support for the non-LHC experiments must be included.
• The Tier2s will have to be funded from the budgets of national scientific committees 1 and 3.
– It is important to keep the right level of scientific scrutiny
– Total reference figure: half of the Tier1 = 11 M€
– "No TDR, no Tier2", F. Ferroni, March 2005
– The first Tier2 seeds were funded by the INFNGrid special project
• Reference period: 2005-2010
The Tier1
Tier1 CNAF: All Experiments. Resources resulting from a cap of 14000 kEuro
                    2005   2006   2007   2008    2009    2010
CPU (kSI2K)         1818   2800   3600   4800    7600   11500
Disk (TB)            507    850   1300   1900    3500    5700
Tapes (TB)           800    900   1000   1550    4100    6100
Contingency (%)*            20%    30%    40%     50%     50%
* Included in the above numbers
Tier1 CNAF: All Experiments. Resources resulting from a cap of 17000 kEuro
                    2005   2006   2007   2008    2009    2010
CPU (kSI2K)         1818   2900   3800   7800   10500   17000
Disk (TB)            507    950   1650   3500    5000    8100
Tapes (TB)           800    800   1300   3400    5700    8700
Contingency (%)*            20%    30%    40%     50%     50%
* Included in the above numbers
• The two plans, of 14 and 17 M€, were presented.
– To be digested and optimised
– Spending for 2005 (capacity for 2006) is in any case around 2 M€
– ~800 kSI2K of CPU, ~350 TB of disk, ~nothing for tapes
• Use of the Tier1 is growing markedly
– Service challenges positive
• OK to proceed with the tenders
– The Tier1 has little manpower
First round of shares
Tier1 share: Alice 22%, Atlas 32%, CMS 35%, LHCb 11%
Tier2 share: Alice 25%, Atlas 23%, CMS 45%, LHCb 7%
Experiment share of Tier1 in terms of COST (FTE)
         2006   2007   2008   2009   2010
Alice     20%    21%    21%    22%    22%
Atlas     31%    30%    29%    30%    30%
CMS       35%    35%    36%    34%    35%
LHCb      14%    14%    14%    14%    14%
Total    100%   100%   100%   100%   100%
Experiment share of Tier1 (CORE)
         2006   2007   2008   2009   2010
Alice     20%    21%    21%    22%    22%
Atlas     35%    34%    34%    34%    34%
CMS       36%    37%    37%    35%    36%
LHCb       8%     8%     8%     9%     8%
Total    100%   100%   100%   100%   100%
Experiment share of Tier2 in terms of COST (FTE)
         2006   2007   2008   2009   2010
Alice     27%    25%    25%    24%    25%
Atlas     22%    24%    24%    22%    22%
CMS       45%    44%    44%    47%    48%
LHCb       7%     7%     7%     6%     6%
Total    100%   100%   100%   100%   100%
The share
• The sharing of INFN resources among the experiments raises tempers
– No algorithm is perfect
– All the numbers will have to be revised
• First round obtained with the following criteria:
– Total INFN resources divided among the experiments using FTE or Core contributions
– The resources of each experiment split between Tier1 and Tier2 on the basis of the experiment's computing model
– Fine structure of the different resources (cpu/disk/tape) used in the calculation
The Tier2s
Provisional list of Tier2s
Bari      ALICE, CMS
Catania   ALICE
CNAF      LHCb
LNF       ATLAS
LNL       ALICE, CMS
Milano    ATLAS
Napoli    ATLAS
Pisa      CMS
Roma1     ATLAS, CMS
Torino    ALICE
Principles of the Tier2 (note: my additions report official statements made elsewhere)
• The Tier2 serves a reference community, but it is a resource for the whole experiment
– It is not a group farm
• The resources of the Tier2 must be shared:
– Accessibility via Grid for the experiment
– Sharing with other local experiments
– But (my addition) also with global ones, via grid, following LCG rules, at a lower priority set by the reference experiment
• The director must commit to supporting the Tier2:
– space: an average Tier2 hosts between 200 and 500 machines + disk
– infrastructure: it must already be largely available and supported by the host structure (power + cooling)
– a technical support document will be available soon
– manpower: a minimum of 2 FTE spread over 4 people
• But (my addition) in addition to this the reference experiment will have to provide:
– the person responsible for the Tier2
– manpower for installation, validation etc. of the experiment software, monitoring of operation etc.
Next steps (referees' proposal of 21 June)
• Early July:
– Document describing the requirements for setting up a Tier2 (referees)
– Tier2 plan of the experiments (experiments)
• September:
– Technical design of the Tier2s by the Sezioni, in consultation with the computing committee
– Discussion in committees 1 & 3 of the Tier2 plans, their number and size
– Integrated document describing the Tier2s
• October:
– Approval of the Tier2s
Cost evaluation
Tier2 ATLAS INFN (4?)
CPU cells give kSI2K - N. dual-processor boxes(?); (new) is the purchase in the current year
                     2006      2007      2008       2009        2010        Tot. K€
CPU (new)            200-80    350-100   1782-400   1420-250    2709-350
CPU (tot)            450-240   880-340   2532-740   3832-1000   6261-1100
CPU K€               116       133       446        241         325         1261
Disks (TB) (new)     50        177       836        847         1334
Disks (TB) (tot)     80        368       1212       2039        3194
Disks K€             113       248       736        466         454         2017
Tot. K€              229       381       1182       707         779         3278
Notes on the preceding table
• The number of dual-processor boxes is approximate
– Calculated by me assuming 1 CPU = 1.25 kSI2k in 2006 and a factor 2 increase in power every 2 years
• Pessimistic start: the number of CPUs is probably overestimated by 30%
• There are 4 candidate ATLAS Tier2s:
– Dual-processor boxes in one Tier2 in 2010: 200-275, and 800 TB
• Note that from 2010 onwards the number of CPUs no longer grows linearly: CPUs more than 4 years old are replaced by CPUs 4 times more powerful
• Delays in the LHC start have a large effect
– One of 6 months with respect to the table now seems certain
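The rule of thumb behind the box counts (1 CPU = 1.25 kSI2k in 2006, power doubling every 2 years, 2 CPUs per box) can be written as a short function. This is a sketch of that stated assumption only; treating the doubling as a smooth exponential in the odd years is an extra assumption made here, and the function names are illustrative.

```python
KSI2K_PER_CPU_2006 = 1.25     # "1 cpu = 1.25 kSI2k" in 2006
DOUBLING_PERIOD_YEARS = 2     # power doubles every 2 years
CPUS_PER_BOX = 2              # dual-processor boxes


def ksi2k_per_cpu(year: int) -> float:
    """Power of one CPU bought in `year` (smooth exponential growth assumed)."""
    return KSI2K_PER_CPU_2006 * 2 ** ((year - 2006) / DOUBLING_PERIOD_YEARS)


def boxes_for(new_ksi2k: float, year: int) -> float:
    """Dual-processor boxes needed to buy `new_ksi2k` of capacity in `year`."""
    return new_ksi2k / (CPUS_PER_BOX * ksi2k_per_cpu(year))


print(boxes_for(200, 2006))                         # 2006 purchase of 200 kSI2k
print(ksi2k_per_cpu(2010) / ksi2k_per_cpu(2006))    # gain over 4 years
```

With these assumptions the 2006 entry of the table (200 kSI2k, 80 boxes) is reproduced exactly, and a CPU bought 4 years later is 4 times more powerful, as the note on replacements says. For the later years this simple formula gives somewhat fewer boxes than the table, which the slide itself describes as approximate.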
A typical INFN Tier-2
Roberto Barbera
Università di Catania and INFN Catania - Italy
ALICE Experiment
Meeting with the INFN Grid Referees, Roma, 21.06.2005
The "Grid room" (general remarks)
Considerations for 150 1U boxes and 600 TB of disk (5-6 racks)
Dedicated space of at least 30-40 m2, preferably larger to allow for services and "garage" space
Direct cabling to the electrical substation of the Sezione/Department
Cable size (cross-section) able to carry power well beyond 100 kW (including air conditioning)
If the "grid room" is new, better not to fit a raised floor (arguments against it: the weight of the racks, the advantage of integrated structured-cabling solutions)
Remote control
Generator set (better to have one)
400 kVA
very low levels of interference with the UPS at the 7th, 11th and 13th harmonics
can be interfaced via RS232 with the APC system
fuel consumption: 30 l/h
Remote security control
Activities and human resources (indicative values, based on the Catania example)
System management: 1 FTE
Storage management (GPFS): 0.5 FTE
Management of local network services (router, DNS, VPN, VLAN, firewall, web, etc.): 0.5 FTE
Costs (indicative values based on the Catania example, VAT included)
Computing room, general electrical systems, air-conditioning system, fire-protection system: ~100 k€
Generator set: ~47 k€
UPS, racks, environmental sensors, socket control, remote management and shutdown system: ~72 k€
(integrated "Data cube" UPS, full-optional: ~150 k€)
KVM: ~25 k€
Network equipment: ~60 k€
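The indicative infrastructure costs can be added up to get an order of magnitude for one site. This is a sketch using the approximate figures from the slide; treating the integrated "Data cube" UPS as a replacement for the ~72 k€ UPS item is an assumption, since the slide only lists it in parentheses.

```python
# Indicative costs in k€ (VAT included), from the Catania example
costs_keur = {
    "computing room, electrical, air conditioning, fire protection": 100,
    "generator set": 47,
    "UPS, racks, sensors, socket control, remote management": 72,
    "KVM": 25,
    "network equipment": 60,
}
DATA_CUBE_UPS_KEUR = 150   # integrated "Data cube" option

base_total = sum(costs_keur.values())

# Assumption: the "Data cube" replaces the 72 k€ UPS item rather than adding to it
ups_item = costs_keur["UPS, racks, sensors, socket control, remote management"]
alt_total = base_total - ups_item + DATA_CUBE_UPS_KEUR

print(f"Baseline: ~{base_total} k€, with integrated UPS: ~{alt_total} k€")
```

Under these assumptions the infrastructure alone is of order 300 to 400 k€ per site, before any CPU or disk is bought.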
Conclusion - 1
Milano: status and prospects
• Today the proto-Tier2 has 50 CPUs and 7 TB of disk
– A further 30 kSI2k (20-24 CPUs) and 4 TB are already funded
• Milano was the first INFN-ATLAS T2 in LCG
– It is now the first to take part in the LCG Service Challenges, to prepare the Tier services for 2008…
• Milano has a primary role in the development and management of the GRID-ATLAS interface
– D. Rebatto is the author of Lexor (the LCG executor), L.Perini coordinated the Rome production on LCG, S.Resconi works on the development of distributed analysis…
• Adaptation of the machine room is already partly done, and further funding is already available
Conclusion - 2
Milano: status and prospects
• A Tier2 in Milano would be a commitment for the Sezione and for ATLAS Milano
• My proposal is naturally that this commitment be taken on (I am available to take responsibility for the Tier2), and my conviction and my commitment is that it is an investment that will bear fruit both for the Sezione and for ATLAS-Milano
• But I do not hide that for both it is also an effort and a challenge
• The Milano Tier2 must not only be available for local synergies but also act as the entry point into GRID for local activities
• Subject of course to the priority of ATLAS, which at certain times (but not always!) will be pressing.
– I think INFN computing will tend to reorganise itself around the Tier2s with GRID
– It will be a long process, but it should bring more effectiveness and more computing resources for all INFN activities
– I would like Milano to develop further its role in this New World… and I am available to work on it as best I can
Backup Slides
Milano GRID: role and people
• Milano has a very significant role, being:
– One of the 4 Italian proto-Tier2s involved in LCG from the start (reference experiment ATLAS, for which L.Perini is national computing coordinator and international GRID co-convener)
• Staff (INFN, and with a prospect of permanence) to be committed to the Tier2 is a necessary condition for INFN approval of a Tier2; at present we rely on the EGEE-SA1 contracts at 100% (Resconi, Vaccarossa, DaCruz)
– The site of one of the middleware development clusters in EGEE
• Cluster Manager F.Prelz (50%) and personnel 100% funded by EGEE: art.23 (M. Mezzadri, D.Rebatto) and art.2222 (E.Martelli, G.Fiorentino, E.Molinari)
• In addition, contacts established with a group of computer scientists from our university (and Crema) led by B.Apolloni. A computer science PhD student already works with us (M.Marchi, with a dedicated INFN grant); percentage and association being defined
• People from the experiments (formally 0%)
• ATLAS: L.Perini (60%), C.Troncon (20%), T.Lari (20%)
– On computing (and physics analysis) joint LAr-Pixel work has already started, based on common infrastructure, including local infrastructure, and common software
• CMS: M.Bonesini (30%), CMS Milano-1?