LCG TDR
An attempt at a critical presentation
Notes and comments by L. Perini and C. Grandi
A first version was presented to the INFN GRID referees on 9 September
Some changes were introduced following the 9 September feedback
The statements prepared by INFN for the LCG-EGEE discussion with CERN management are reported at the end
22 September 2005
CSN1@Napoli
L. Perini
The document
2 EXPERIMENTS' REQUIREMENTS
  2.1 Logical Dataflow and Workflow
  2.2 Event Simulation
  2.3 Resource Expectations
  2.4 Baseline Requirements
  2.5 Online Requirements
  2.6 Analysis Requirements and Plans
  2.7 Start-up Scenario
3 BASIC LCG ARCHITECTURE
  3.1 Grid Architecture and Services
  3.2 Tier-0 Architecture
  3.3 Tier-1 Architecture
  3.4 Tier-2 Architecture
4 TECHNOLOGY AND INFRASTRUCTURE
  4.1 EGEE Middleware
  4.2 Grid Standards and Interoperability
  4.3 Grid Operations and Regional Centre Service-Level Agreements
  4.4 Life-Cycle Support – Management of Deployment and Versioning
  4.5 Fabric Technology – Status and Expected Evolution
  4.6 Databases – Distributed Deployment
  4.7 Initial Software Choices at CERN
  4.8 Initial Hardware Choices at CERN
  4.9 Hardware Life-Cycle
  4.10 Costing
  4.11 Networking
  4.12 Security
5 COMMON APPLICATIONS
  5.1 High-Level Requirements for LCG Applications Software
  5.2 Software Architecture
  5.3 Operating System Platforms
  5.4 Core Software Libraries
  5.5 Data Management
  5.6 Event Simulation
  5.7 Software Development Infrastructure and Services
  5.8 Project Organization and Schedule
6 EXPERIENCE: PROTOTYPES AND EVALUATIONS
  6.1 Data Challenges
  6.2 Service Challenges
  6.3 ARDA
7 PLANS
  7.1 Baseline Services
  7.2 Phase-2 Planning
8 PROJECT ORGANIZATION AND MANAGEMENT
  8.1 High-Level Committees
  8.2 Participating Institutes
  8.3 Interactions and Dependencies
The drafting of the TDR
• EDITORIAL BOARD (LCG, experiments, Tier-1s)
  – I. Bird, K. Bos, N. Brook, D. Duellmann, C. Eck, I. Fisk, D. Foster, B. Gibbard, C. Grandi, F. Grey, J. Harvey, A. Heiss, F. Hemmer, S. Jarp, R. Jones, D. Kelsey, J. Knobloch, M. Lamanna, H. Marten, P. Mato Vila, F. Ould-Saada, B. Panzer-Steindel, L. Perini, L. Robertson, Y. Schutz, U. Schwickerath, J. Shiers, T. Wenaus
• Method
  – Each part assigned to 1 or 2 writers
  – Anything "contentious" removed from the text
  – Final cosmetics
• Result (my assessment)
  – Some parts of informative text (perhaps flat, but useful)
  – Weak planning, decisions not taken (never discussed), ambiguities
  – Some parts risk being misleading (what is written is not false, but it can hide problems)
But, given the chosen timescale, it was difficult to obtain something much better and, above all, more useful
Contents of the talk
• Survey of the relevant information included in the TDR
  – Reusing Jurgen Knobloch's slides from the LHCC at the end of June, when possible
• Indication of problems still open
• Indication of points at risk of being "misleading"
  – In my and/or Claudio's judgement, also on the basis of the feedback received after the 9 September meeting
Not exhaustive… mostly in the form of comments on Jurgen's slides
Organizational Structure for Phase 2
[Diagram: LHC Committee (LHCC) – scientific review; Computing Resources Review Board (C-RRB) – funding agencies; Collaboration Board (CB) – experiments and regional centres; Overview Board (OB); Management Board (MB) – management of the project; Grid Deployment Board (GDB) – coordination of Grid operation; Architects Forum – coordination of common applications.]
The MB and the GDB are almost identical in composition. The MB is too large to function well.
The MB membership includes the LCG Project Leader, the
Technical Heads of the Tier-1 centres, the leaders of the major
activities managed by the Board, the Computing Co-ordinator of
each LHC Experiment, the Chair of the Grid Deployment Board
(GDB), a Scientific Secretary and other members as decided from
time to time by the Board.
The Grid Deployment Board (GDB) is the forum within the Project where the
computing managements of the experiments and the regional computing centres
discuss and take, or prepare, the decisions necessary for planning, deploying, and
operating the LHC Computing Grid. Its membership includes: as voting members —
one person from each country with a regional computing centre providing resources
to an LHC experiment (usually a senior manager from the largest such centre in the
country), a representative of each of the experiments; as non-voting members — the
Computing Co-ordinators of the experiments, the LCG Project Leader, and leaders of
formal activities and projects of the Collaboration. The Chair of the GDB is elected
by the voting members of the board for a two-year term. The GDB may co-opt
additional non-voting members as it deems necessary. The GDB reports to the LCG
Management Board, which normally meets immediately after the GDB, ratifying the
decisions prepared by the GDB.
Cooperation with other projects
• Network Services
  – LCG will be one of the most demanding applications of national research backbone networks such as the pan-European network, GÉANT
  EGEE seems to be put on the same footing as the external providers
• Grid Software
  – Globus, Condor and VDT have provided key components of the middleware used. Key members participate in OSG and EGEE
  – Enabling Grids for E-sciencE (EGEE) includes a substantial middleware activity.
• Grid Operational Groupings
  – The majority of the resources used are made available as part of the EGEE Grid (~140 sites, 12,000 processors). EGEE also supports Core Infrastructure Centres and Regional Operations Centres. (Roughly true!)
  – The US LHC programmes contribute to and depend on the Open Science Grid (OSG). Formal relationship with LCG through the US-ATLAS and US-CMS computing projects.
  – The Nordic Data Grid Facility (NDGF) will begin operation in 2006. Prototype work is based on the NorduGrid middleware ARC.
See next slide
Grid Operational Groupings
The computing resources will be committed by funding agencies through a
Memorandum of Understanding, which will specify the capacity to be provided at
individual centres. These centres are organized into three major operational
groupings: the EGEE Grid, the Open Science Grid, and the Nordic Data Grid
Facility. Each of these groups uses a specific base set of middleware tools and has
its own Grid operations infrastructure. The body governing the overall operational
policy and strategy for the LHC project is the Grid Deployment Board (GDB). This
has national representation, usually from the major centre(s) in each country. The
GDB will agree on the basic services to be provided at each centre with the aim of
providing a consistent environment for each experiment across the different
operational groupings.
But EGEE risks not having a deployment of its OWN, given that the management of SA1 is LCG, which has to serve the interests of the three Grids and not those of EGEE.
The EGEE Grid
………
The long-term idea is that EGEE will evolve into an organization that will
provide core operation for a science Grid in Europe and perhaps further
afield, rather akin to the role of GÉANT in research networking. However,
the EGEE project is at present funded only until March 2006. It is
expected that the project will be extended for a further period of two
years, which means that it would terminate at the beginning of the first full
year of LHC operation. It is therefore important that LCG maintains
its role in the core operation, and prepares a fall-back plan in the event
that the EU-subsidized evolution beyond EGEE does not materialize or
does not fulfil LCG's requirements. This is clearly a difficult strategy,
with significant risk, but the long-term advantages of a multiscience Grid
infrastructure receiving significant non-HEP funding must be taken into
account.
Maintenance is a big long-term problem
The Hierarchical Model
• Tier-0 at CERN
  – Record RAW data (1.25 GB/s ALICE)
  – Distribute a second copy to the Tier-1s
  – Calibrate and do first-pass reconstruction
• Tier-1 centres (11 defined)
  – Manage permanent storage – RAW, simulated, processed
  – Capacity for reprocessing, bulk analysis
• Tier-2 centres (>~100 identified)
  – Monte Carlo event simulation
  – End-user analysis
• Tier-3
  – Facilities at universities and laboratories
  – Access to data and processing in Tier-2s, Tier-1s
  – Outside the scope of the project
Tier-1s

Tier-1 Centre                        Experiments served with priority
TRIUMF, Canada                       ATLAS
GridKA, Germany                      ALICE, ATLAS, CMS, LHCb
CC-IN2P3, France                     ALICE, ATLAS, CMS, LHCb
CNAF, Italy                          ALICE, ATLAS, CMS, LHCb
SARA/NIKHEF, NL                      ALICE, ATLAS, LHCb
Nordic Data Grid Facility (NDGF)     ALICE, ATLAS, CMS
ASCC, Taipei                         ATLAS, CMS
RAL, UK                              ALICE, ATLAS, CMS, LHCb
BNL, US                              ATLAS
FNAL, US                             CMS
PIC, Spain                           ATLAS, CMS, LHCb
The Eventflow

Experiment   Rate [Hz]   RAW [MB]   ESD/rDST/RECO [MB]   AOD [kB]   Monte Carlo [MB/evt]   Monte Carlo [% of real]
ALICE HI     100         12.5       2.5                  250        300                     100
ALICE pp     100         1          0.04                 4          0.4                     100
ATLAS        200         1.6        0.5                  100        2                       20
CMS          150         1.5        0.25                 50         2                       100
LHCb         2000        0.025      0.025                0.5        –                       20
An obvious, and large, uncertainty…
50 days of running in 2007
10^7 seconds/year of pp from 2008 on → ~10^9 events/experiment
10^6 seconds/year of heavy ions
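As a quick consistency check of the numbers above (and of the 1.25 GB/s ALICE figure on the Hierarchical Model slide), here is a minimal back-of-the-envelope sketch in Python, using only the rates, event sizes and running times quoted on these slides:

```python
# Back-of-the-envelope check of the Eventflow numbers quoted above.
# Rates [Hz] and RAW sizes [MB] come from the table; running time per year
# (1e7 s pp, 1e6 s heavy ion) from the bullet list.

MB = 1e6          # bytes
PB = 1e15         # bytes

experiments = {            # name: (rate_hz, raw_mb, seconds_per_year)
    "ALICE HI": (100, 12.5, 1e6),
    "ALICE pp": (100, 1.0, 1e7),
    "ATLAS":    (200, 1.6, 1e7),
    "CMS":      (150, 1.5, 1e7),
    "LHCb":     (2000, 0.025, 1e7),
}

for name, (rate, raw_mb, seconds) in experiments.items():
    events = rate * seconds                     # events per nominal year
    raw_volume = rate * raw_mb * MB * seconds   # RAW bytes per nominal year
    print(f"{name:9s} {events:.1e} events/year, "
          f"{raw_volume / PB:.2f} PB RAW/year, "
          f"{rate * raw_mb:.3g} MB/s RAW rate")

# ATLAS and CMS come out at ~2e9 and ~1.5e9 events/year, consistent with the
# "~10^9 events/experiment" quoted above; ALICE HI gives 100 Hz x 12.5 MB
# = 1.25 GB/s, the Tier-0 recording rate quoted on the Hierarchical Model slide.
```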
CPU Requirements
[Stacked bar chart: CPU requirements in MSI2000 for 2007–2010, broken down into the CERN, Tier-1 and Tier-2 shares of each experiment (ALICE, ATLAS, CMS, LHCb); 58% pledged.]
Disk Requirements
[Stacked bar chart: disk requirements in PB for 2007–2010, broken down into the CERN, Tier-1 and Tier-2 shares of each experiment; 54% pledged.]
Tape Requirements
[Stacked bar chart: tape requirements in PB for 2007–2010, broken down into the CERN and Tier-1 shares of each experiment; 75% pledged.]
Experiments' Requirements
• Single Virtual Organization (VO) across the Grid
• Standard interfaces for Grid access to Storage Elements (SEs) and Computing Elements (CEs)
• Need for a reliable Workload Management System (WMS) to efficiently exploit distributed resources
• Non-event data such as calibration and alignment data, but also detector construction descriptions, will be held in databases
  – read/write access to central (Oracle) databases at Tier-0 and read access at Tier-1s, with a local database cache at Tier-2s (see the sketch after this list)
  Scalability of database access is not guaranteed: a possible problem
• Analysis scenarios and specific requirements are still evolving
  – Prototype work is in progress (ARDA)
• Online requirements are outside the scope of LCG, but there are connections:
  – Raw data transfer and buffering
  – Database management and data export
  – Some potential use of Event Filter Farms for offline processing
  So far no real test at a significant scale
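To make the tiered database model above concrete, here is a minimal sketch, in Python, of how a job might resolve conditions data: write and read at Tier-0, a read-only replica at a Tier-1, and a local cache at a Tier-2. All class and method names are hypothetical illustrations, not part of any LCG or COOL software.

```python
# Hypothetical sketch of the tiered conditions-database access model:
# read/write at Tier-0, read-only replica at Tier-1, local cache at Tier-2.
# Names are illustrative only; no real LCG/COOL API is implied.

class CentralDB:
    """Stands in for the central (Oracle) conditions DB at Tier-0."""
    def __init__(self):
        self._data = {}

    def write(self, key, value):          # writing is allowed only at Tier-0
        self._data[key] = value

    def read(self, key):
        return self._data[key]


class Tier1Replica:
    """Read-only view of the Tier-0 database, as replicated to a Tier-1."""
    def __init__(self, central):
        self._central = central

    def read(self, key):
        return self._central.read(key)


class Tier2Cache:
    """Local cache at a Tier-2: reads go upstream only on a cache miss."""
    def __init__(self, upstream):
        self._upstream = upstream
        self._cache = {}

    def read(self, key):
        if key not in self._cache:
            self._cache[key] = self._upstream.read(key)
        return self._cache[key]


# Example: a calibration written once at Tier-0, then read through the tiers.
tier0 = CentralDB()
tier0.write(("ECAL", "run_1234"), {"gain": 1.02})

tier1 = Tier1Replica(tier0)
tier2 = Tier2Cache(tier1)

print(tier2.read(("ECAL", "run_1234")))   # first read: fetched via the Tier-1
print(tier2.read(("ECAL", "run_1234")))   # second read: served from the cache
```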
Architecture – Grid services
OK on SRM, but many functionalities are still missing, e.g. the disk-only back-end is weak. DPM?
POSIX-like I/O is problematic: NFS is not adequate; with proprietary solutions (rfio, dcap, etc.) it is difficult to have site-independent applications. GFAL or gLiteIO could solve the problem, but they are weak.
• Storage Element
  – Mass Storage System (MSS) (CASTOR, Enstore, HPSS, dCache, etc.)
  – Storage Resource Manager (SRM) provides a common way to access the MSS, independent of the implementation
  – File Transfer Services (FTS), provided e.g. by GridFTP or srmCopy
• Computing Element
  Nothing has really been tested as far as roles and the use of accounting are concerned
  – Interface to the local batch system, e.g. Globus gatekeeper
  – Accounting, status query, job monitoring
• Virtual Organization Management
  – Virtual Organization Management Services (VOMS)
  – Authentication and authorization based on the VOMS model
• Grid Catalogue Services
  At least 2 interfaces (DLI, St.Ind.) and many (too many) implementations: RLS, LFC, Fireman. Sufficient performance has not been demonstrated -> the individual experiments develop their own
  – Mapping of Globally Unique Identifiers (GUIDs) to local file names (see the sketch after this list)
  – Hierarchical namespace, access control
• Interoperability
  – EGEE and OSG both use the Virtual Data Toolkit (VDT)
  – Different implementations are hidden by common interfaces
  The Grids do not interoperate: for data management ATLAS has its own DQ service; for job submission, nothing…
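A minimal sketch, in Python, of the catalogue idea referred to above: mapping a GUID to a logical file name and to site-local replicas. This illustrates the concept only; the class, method names, hostnames and paths are made up and do not correspond to the RLS, LFC or Fireman interfaces.

```python
# Illustrative sketch of a Grid file catalogue: a GUID identifies a file,
# a logical file name (LFN) gives it a human-readable, hierarchical name,
# and each replica is a site-local physical name (SURL/PFN).
# Purely pedagogical; not the RLS, LFC or Fireman API.

import uuid


class FileCatalogue:
    def __init__(self):
        self._lfn_to_guid = {}     # hierarchical namespace -> GUID
        self._replicas = {}        # GUID -> list of site-local names

    def register_file(self, lfn):
        guid = str(uuid.uuid4())
        self._lfn_to_guid[lfn] = guid
        self._replicas[guid] = []
        return guid

    def add_replica(self, guid, surl):
        self._replicas[guid].append(surl)

    def replicas(self, lfn):
        """Resolve an LFN to all of its site-local replicas."""
        return list(self._replicas[self._lfn_to_guid[lfn]])


# Example: one logical file with (hypothetical) replicas at two Tier-1 sites.
cat = FileCatalogue()
guid = cat.register_file("/grid/atlas/dc2/evgen/file001.root")
cat.add_replica(guid, "srm://tier1-a.example.org/atlas/dc2/file001.root")
cat.add_replica(guid, "srm://tier1-b.example.org/atlas/dc2/file001.root")
print(cat.replicas("/grid/atlas/dc2/evgen/file001.root"))
```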
Baseline Services
Mandate
The goal of the working group is to forge an agreement between the experiments and the
LHC regional centres on the baseline services to be provided to support the computing
models for the initial period of LHC running, which must therefore be in operation by
September 2006.
The services concerned are those that supplement the basic services for which there is
already general agreement and understanding (e.g. provision of operating system services,
local cluster scheduling, compilers, ..) and which are not already covered by other LCG
groups such as the Tier-0/1 Networking Group or the 3D Project. …
Members
Experiments: ALICE: L. Betev, ATLAS: M. Branco, A. de Salvo, CMS: P. Elmer, S. Lacaprara, LHCb: P. Charpentier, A. Tsaregorodtsev
Projects: ARDA: J. Andreeva, Apps Area: D. Düllmann, gLite: E. Laure
Sites: F. Donno (It), A. Waananen (Nordic), S. Traylen (UK), R. Popescu, R. Pordes (US)
Chair: I. Bird, Secretary: M. Schulz
Timescale: 15 February to 17 June 2005
Baseline Services – preliminary priorities

Service                               ALICE   ATLAS   CMS   LHCb
Storage Element                       A       A       A     A
Basic transfer tools                  A       A       A     A
Reliable file transfer service        A       A       A/B   A
Catalogue services                    B       B       B     B
Catalogue and data management tools   C       C       C     C
Compute Element                       A       A       A     A
Workload Management                   B/C     A       A     C
VO agents                             A       A       A     A
VOMS                                  A       A       A     A
Database services                     A       A       A     A
Posix-I/O                             C       C       C     C
Application software installation     C       C       C     C
Job monitoring tools                  C       C       C     C
Reliable messaging service            C       C       C     C
Information system                    A       A       A     A
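The priority matrix above is easy to manipulate programmatically; a small Python sketch follows (the values are copied from the table, while the query helper is only an illustration, not part of any LCG tool):

```python
# The Baseline Services priority table as a small data structure.
# Values are copied from the table above; the query helper is only an
# illustration of how such an agreement matrix can be used.

EXPERIMENTS = ["ALICE", "ATLAS", "CMS", "LHCb"]

PRIORITIES = {
    "Storage Element":                     ["A", "A", "A", "A"],
    "Basic transfer tools":                ["A", "A", "A", "A"],
    "Reliable file transfer service":      ["A", "A", "A/B", "A"],
    "Catalogue services":                  ["B", "B", "B", "B"],
    "Catalogue and data management tools": ["C", "C", "C", "C"],
    "Compute Element":                     ["A", "A", "A", "A"],
    "Workload Management":                 ["B/C", "A", "A", "C"],
    "VO agents":                           ["A", "A", "A", "A"],
    "VOMS":                                ["A", "A", "A", "A"],
    "Database services":                   ["A", "A", "A", "A"],
    "Posix-I/O":                           ["C", "C", "C", "C"],
    "Application software installation":   ["C", "C", "C", "C"],
    "Job monitoring tools":                ["C", "C", "C", "C"],
    "Reliable messaging service":          ["C", "C", "C", "C"],
    "Information system":                  ["A", "A", "A", "A"],
}


def services_with_priority(experiment, wanted="A"):
    """Services to which `experiment` assigned (at least partly) `wanted`."""
    col = EXPERIMENTS.index(experiment)
    return [svc for svc, row in PRIORITIES.items()
            if wanted in row[col].split("/")]


print(services_with_priority("LHCb", "A"))
print(services_with_priority("ALICE", "C"))
```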
The table does not exhaust what the experiments need
• From the Baseline Services document itself (p. 25):
  Specific services where the group has identified that more work is needed to understand and specify requirements are the following:
  – VO-specific agent service machines, and the requirements of the services
  – Experiment (and other) application software installation mechanisms
  – Monitoring tools
  – Reliable messaging service
  – VO policy management – priorities, access control, etc.
  – Information service and developments
Architecture – Tier-0
[Diagram: Tier-0 fabric network – a 2.4 Tb/s core with Gigabit, 10-Gigabit and double 10-Gigabit Ethernet links; WAN access; 10 Gb/s to 32×1 Gb/s; connections to the experimental areas, the campus network and a distribution layer; ~6000 CPU servers × 8000 SPECint2000 (2008) and ~2000 tape and disk servers.]
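A tiny arithmetic sketch in Python, using only the figures kept in the placeholder above, to express them in more familiar units:

```python
# Simple arithmetic on the Tier-0 fabric numbers quoted above (2008 figures).

cpu_servers = 6000
si2k_per_server = 8000
print(f"Installed CPU capacity: {cpu_servers * si2k_per_server / 1e6:.0f} MSI2000")

core_tbps = 2.4                      # backbone core capacity, Tb/s
print(f"Core capacity: {core_tbps * 1e12 / 8 / 1e9:.0f} GB/s")
```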
Tier-0 components
• Batch system (LSF) manages the CPU resources
• Shared file system (AFS)
• Disk pool and mass storage (MSS) manager (CASTOR)
• Extremely Large Fabric management system (ELFms)
  – Quattor – system administration – installation and configuration
  – LHC Era MONitoring (LEMON) system, server/client based
  – LHC-Era Automated Fabric (LEAF) – high-level commands to sets of nodes
• CPU servers – 'white boxes', Intel processors, (Scientific) Linux
• Disk storage – Network Attached Storage (NAS) – mostly mirrored
• Tape storage – currently STK robots – future system under evaluation
• Network – fast Gigabit Ethernet switches connected to multi-gigabit backbone routers
Technology – Middleware
• Currently, the LCG-2 middleware is deployed at more than 100 sites
  – It originated from Condor, EDG, Globus, VDT, and other projects
• It will now evolve to include functionality of the gLite middleware provided by the EGEE project, which has just been made available
• In the TDR, we describe the basic functionality of the LCG-2 middleware as well as the enhancements expected from gLite components
• Site services include security, the Computing Element (CE), the Storage Element (SE), Monitoring and Accounting Services – currently available both from LCG-2 and gLite
• VO services such as the Workload Management System (WMS), File Catalogues, Information Services and File Transfer Services exist in both flavours (LCG-2 and gLite), maintaining close relations with VDT, Condor and Globus
Note: the experiments also have their own solutions… VO boxes etc.
Technology – Fabric Technology
• Moore's law still holds for processors and disk storage
  – For CPUs and disks we count a lot on the evolution of the consumer market
  – For processors we expect an increasing importance of 64-bit architectures and multicore chips
• The cost break-even point between disk and tape storage will not be reached for the initial LHC computing
  – Mass storage (tapes and robots) is still a computer-centre item with computer-centre pricing
  – It is too early to conclude on new tape drives and robots
• Networking has seen a rapid evolution recently
  – Ten-gigabit Ethernet is now in the production environment
  – Wide-area networking can already count on 10 Gb/s connections between Tier-0 and the Tier-1s. This will move gradually to the Tier-1 – Tier-2 connections.
Common Physics Applications
• Core software libraries
  – SEAL–ROOT merger
  – Scripting: CINT, Python
  – Mathematical libraries
  – Fitting, MINUIT (in C++)
• Data management
  – POOL: ROOT I/O for bulk data, RDBMS for metadata
  – Conditions database – COOL
  ROOT–POOL compatibility still missing
• Event simulation
  – Event generators: generator library (GENSER)
  – Detector simulation: GEANT4 (ATLAS, CMS, LHCb)
  – Physics validation: compare GEANT4, FLUKA, test beam
• Software development infrastructure
  – External libraries
  – Software development and documentation tools
  – Quality assurance and testing
  – Project portal: Savannah
[Diagram: experiment software structure – event generators, simulation, full event reconstruction and analysis programs built on the experiment frameworks, using persistency, database, file catalogue, conditions, calibration, data management, distributed analysis, batch and interactive services.]
[Diagram: ROOT component structure – Core, Foundation/Utilities/OS binding, I/O, MathLibs, GUI, PluginMgr, Dictionary, Interpreter, Collections, 2D/3D Graphics, Geometry, Histograms, Fitters, NTuple, Physics.]
Prototypes
• It is important that the hardware and software systems developed in the framework of LCG be exercised in more and more demanding challenges
• Data Challenges were recommended by the 'Hoffmann Review' of 2001. They have now been carried out by all experiments. Though the main goal was to validate the distributed computing model and to gradually build up the computing systems, the results have also been used for physics performance studies and for detector, trigger and DAQ design. Limitations of the Grids have been identified and are being addressed.
• Presently, a series of Service Challenges aims at realistic end-to-end testing of experiment use cases over an extended period, leading to stable production services.
• The project 'A Realisation of Distributed Analysis for LHC' (ARDA) is developing end-to-end prototypes of distributed analysis systems using the EGEE middleware gLite for each of the LHC experiments.
Data Challenges
• ALICE
  – PDC04, using AliEn services either native or interfaced to the LCG Grid: 400,000 jobs run, producing 40 TB of data for the Physics Performance Report.
  – PDC05: event simulation, first-pass reconstruction, transmission to Tier-1 sites, second-pass reconstruction (calibration and storage), analysis with PROOF – using Grid services from LCG SC3 and AliEn.
• ATLAS
  – Used tools and resources from LCG, NorduGrid and Grid3 at 133 sites in 30 countries, with over 10,000 processors; 235,000 jobs produced more than 30 TB of data using an automatic production system.
• CMS
  – 100 TB of simulated data reconstructed at a rate of 25 Hz, distributed to the Tier-1 sites and reprocessed there.
• LHCb
  – LCG provided more than 50% of the capacity for the first data challenge in 2004-2005. The production used the DIRAC system.
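As a rough consistency check, the CMS figure above (100 TB reconstructed at 25 Hz) can be combined with the nominal Monte Carlo event size from the Eventflow table; a small Python sketch, purely indicative since the 2004 data-challenge samples need not match the nominal 2008 event sizes:

```python
# Rough cross-check of the CMS data-challenge numbers, combining the
# "100 TB reconstructed at 25 Hz" figure above with the nominal CMS Monte
# Carlo event size (2 MB/evt) from the Eventflow table. Indicative only.

TB = 1e12
MB = 1e6

sample_size = 100 * TB
mc_event_size = 2 * MB          # nominal CMS Monte Carlo size per event
reco_rate_hz = 25

n_events = sample_size / mc_event_size
reco_time_s = n_events / reco_rate_hz

print(f"~{n_events:.1e} events")                                        # ~5e7
print(f"~{reco_time_s / 86400:.0f} days of continuous reconstruction at 25 Hz")
```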
Service Challenges
• A series of Service Challenges (SC) sets out to successively approach the production needs of the LHC
• While SC1 did not meet the goal of transferring continuously for 2 weeks at a rate of 500 MB/s, SC2 exceeded the goal (500 MB/s) by sustaining a throughput of 600 MB/s to 7 sites.
• SC3 will start now, using gLite middleware components, with disk-to-disk throughput tests, 10 Gb/s networking of Tier-1s to CERN, and an SRM (1.1) interface to managed storage at the Tier-1s. The goal is to achieve 150 MB/s disk-to-disk and 60 MB/s to managed tape. There will also be Tier-1 to Tier-2 transfer tests.
• SC4 aims to demonstrate that all requirements, from raw data taking to analysis, can be met at least 6 months prior to data taking. The aggregate rate out of CERN is required to be 1.6 GB/s to tape at the Tier-1s.
• The Service Challenges will turn into production services for the experiments.
Possible conflicts between LCG solutions and "experiment" solutions
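To put the throughput targets above into perspective, a small Python sketch converts them into daily volumes and compares the SC4 aggregate with a single 10 Gb/s link; all input rates are the ones quoted on this slide, the rest is straightforward arithmetic:

```python
# Convert the Service Challenge throughput targets into daily data volumes
# and compare the SC4 aggregate rate with a single 10 Gb/s link.
# All rates are the ones quoted on this slide.

MB, GB, TB = 1e6, 1e9, 1e12
DAY = 86400  # seconds

targets = {
    "SC1/SC2 goal (500 MB/s)": 500 * MB,
    "SC2 achieved (600 MB/s)": 600 * MB,
    "SC3 disk-to-disk per Tier-1 (150 MB/s)": 150 * MB,
    "SC3 to tape per Tier-1 (60 MB/s)": 60 * MB,
    "SC4 aggregate out of CERN (1.6 GB/s)": 1.6 * GB,
}

for name, rate in targets.items():
    print(f"{name}: {rate * DAY / TB:.1f} TB/day")

# A 10 Gb/s link carries at most 1.25 GB/s, so the SC4 aggregate of 1.6 GB/s
# out of CERN necessarily spreads over several Tier-1 links.
ten_gbit_in_bytes = 10e9 / 8
print(f"SC4 aggregate = {1.6 * GB / ten_gbit_in_bytes:.2f} x a fully loaded 10 Gb/s link")
```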
Key dates for Service Preparation
Sep 05 – SC3 Service Phase
May 06 – SC4 Service Phase
Sep 06 – Initial LHC Service in stable operation
Apr 07 – LHC Service commissioned
[Timeline 2005–2008: SC3, SC4, LHC Service Operation; cosmics, first beams, first physics, full physics run]
• SC3 – Reliable base service – most Tier-1s, some Tier-2s – basic experiment software chain – grid data throughput 1 GB/s, including mass storage 500 MB/s (150 MB/s and 60 MB/s at Tier-1s)
• SC4 – All Tier-1s, major Tier-2s – capable of supporting the full experiment software chain incl. analysis – sustain nominal final grid data throughput (~1.5 GB/s mass storage throughput)
• LHC Service in Operation – September 2006 – ramp up to full operational capacity by April 2007 – capable of handling twice the nominal data throughput
ARDA – A Realisation of Distributed Analysis for LHC
• Distributed analysis on the Grid is the most difficult and least defined topic
• ARDA sets out to develop end-to-end analysis prototypes using the LCG-supported middleware.
ARDA has been of little use (except perhaps for LHCb): in 2005 analysis was a "premature" problem with respect to the state of the Grid
• ALICE uses the AliROOT framework based on PROOF.
• ATLAS has used DIAL services with the gLite prototype as backend.
• CMS has prototyped the 'ARDA Support for CMS Analysis Processing' (ASAP), which is used by several CMS physicists for daily analysis work.
• LHCb has based its prototype on GANGA, a common project between ATLAS and LHCb.
Topics to be discussed related to EGEE-II and the WLCG MoU and TDR
Statements prepared by INFN for the LCG-EGEE discussion with CERN management
Following the 9 September discussion and its follow-up
-1-
• Relationship between EGEE and WLCG
  – The relationship between EGEE and WLCG creates an asymmetry between EGEE and the other "Grid Operational Groupings" mentioned in 8.3. The US OSG and the Nordic NDGF, like NAREGI and UNICORE/DEISA, have an operational structure aimed at deploying and promoting the middleware they produce; on the other side, the EGEE operational structure is under the leadership of WLCG, which sees mediating with OSG and NDGF as a main part of its mission. This situation weakens the common European EGEE middleware development effort and is reflected in the EGEE project itself (see the EGEE-II proposal), where SA3, although a main part of EGEE-II, seems to be mandated to consider the JRA1 middleware on the same footing as that produced by external providers.
-2-
• Future CERN role in middleware development
  – In EGEE-II CERN is contributing almost nothing to the middleware development effort (~3 FTEs out of 60). Together with its withdrawal from OMII-Europe, this seems to indicate that CERN has definitively renounced leading, or even contributing to the maintenance of, European excellence in the Grid middleware domain. The consequences of this choice may impact badly on all the national middleware efforts previously related to CERN.
-3-
• Future evolution plans to make Europe ready for analysis
  – Several important points of the computing models of the LHC experiments have been only partially tested in a significant way and on a realistic scale; this is especially true for analysis, which is the crucial point of the Grid-based models. No credible evolutionary or fall-back plans for the evolution of the Grid middleware have been provided, either by the experiments or by WLCG. The ARDA effort on analysis was of little use in this respect: concentrating on a "theoretical" analysis while the basic functionality and reliability needed for the much simpler environment of organized production were still lacking was not the best path. The WLCG MoU explicitly refers to an EU infrastructure based on EGEE; however, no clear indications or plans are provided on how this could evolve from the current situation, in particular in relation to point 1).
-4-
• Middleware and EU e-Infrastructure long-term maintenance plan
  – A plan for providing the maintenance of the middleware after the end of the EGEE-II project is largely missing, and in particular it is unclear whether CERN is still aiming to play a role in this. At present EGEE also provides a large fraction of the manpower needed for the operation of the European Grid and of the centres that are its nodes. The maintenance of the European Grid infrastructure for continued efficient operation needs to be addressed at the same level as the maintenance of the middleware, and neither is really dealt with in the WLCG TDR.
-5-
• Experiments' Grid integration activities
  – The important point of an internal CERN team, discussed in the previous meeting with the DG, supporting the experiments in interfacing their software to the gLite/EGEE-based Grid and in running efficiently on a common EU infrastructure with common European middleware, has not been addressed. The role of the CERN computing team in CMS is insignificant: none of the key managerial computing positions is occupied by CERN staff, and the overall US dominance has not been sufficiently balanced. This reflects badly on the EGEE middleware development effort. The lack of a specific internal, CERN-based, EGEE-focused effort inside the experiments in this field explains, at least in part, why the experiments have often preferred to develop their own implementations instead of relying on the common ones provided by EGEE/LCG.
-6-
• Experiment Integration Support Group
  – The present EIS group has worked well and is highly appreciated by the experiments, thanks also to its location in SA1 and the consequent in-depth knowledge of service configuration and ability to intervene quickly on feedback. In the TDR and the MoU, the plans for the future of a support of the kind provided by the EIS group still seem unspecified. Support to the EIS team is one of the main contributions of INFN to WLCG.