AMS Data Handling and INFN
P.G. Rancoita
Perugia 11/12/2002
AMS Ground Segment
Data flow in AMS-02
• High Rate (Scientific + Calibration): 3-4 Mbit/s
• Slow Rate (House Keeping): 16 kbit/s
• NASA ancillary data: 1 kbit/s
• Total Volume: 30 - 41 GB/day, i.e. 11 - 15 TB/year
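As a rough cross-check of the total volume, the three rates can be converted into a daily and yearly volume. This is only a sketch: it assumes continuous data taking (86400 s/day, 365 days/year) and binary units (1 GB = 2^30 bytes, 1 TB = 1024 GB), none of which is stated on the slide; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400
GIB = 2**30  # assumption: the slide's "GB" is a binary gigabyte


def daily_volume_gb(high_rate_mbit_s, slow_rate_kbit_s=16, ancillary_kbit_s=1):
    """Daily data volume in GB for the three downlink rates on the slide."""
    bits_per_s = high_rate_mbit_s * 1e6 + (slow_rate_kbit_s + ancillary_kbit_s) * 1e3
    return bits_per_s * SECONDS_PER_DAY / 8 / GIB


for rate in (3, 4):
    per_day = daily_volume_gb(rate)
    per_year_tb = per_day * 365 / 1024  # assumption: 1 TB = 1024 GB
    print(f"{rate} Mbit/s: {per_day:.1f} GB/day, {per_year_tb:.1f} TB/year")
```

Under these assumptions the result is roughly 30 and 40 GB/day, in line with the 30 - 41 GB/day range quoted above.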
Data volume in AMS-02
• Archived Data
1. Event Summary Data: 44 TB/year
2. Event Tag: 0.6 TB/year
3. Total (+Raw and ancillary): 56 - 60 TB/year
• Data on direct access
1. Event Summary Data: 8.3 TB/year
2. Event Tag: 0.6 TB/year
• Total data volume (3 years): 180 TB
• Namely 180 GB/day
Events and Ev. rate
• Expected average rate of accepted events: about 200 Hz, which means about (1.5-2)x10^10 events in 3 years
• Typical reconstructed event length: less than about 6.5-7 kB
• Total storage for ESD: about 130 TB
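The three figures above hang together arithmetically. A minimal sketch, assuming continuous data taking at the average rate, a 365-day year and decimal units (1 kB = 10^3 bytes, 1 TB = 10^12 bytes); the function name is illustrative:

```python
SECONDS_PER_YEAR = 365 * 86_400


def esd_storage_tb(rate_hz, years, event_size_kb):
    """Return (number of events, ESD storage in TB) for continuous data taking."""
    n_events = rate_hz * years * SECONDS_PER_YEAR
    storage_tb = n_events * event_size_kb * 1e3 / 1e12
    return n_events, storage_tb


for size_kb in (6.5, 7.0):
    n_events, storage = esd_storage_tb(200, 3, size_kb)
    print(f"{size_kb} kB/event: {n_events:.2e} events, {storage:.0f} TB")
```

With 200 Hz over 3 years the event count falls inside the quoted (1.5-2)x10^10 range, and 6.5-7 kB per event gives a storage figure close to the quoted 130 TB.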
Data budget in AMS-02
Data (TB) / Year  1998   2001   2002   2003   2004   2005   2006   2007   2008   2009   Total
Raw               0.20   ---    ---    ---    ---    0.5    15     15     15     0.5    46.2
ESD               0.30   ---    ---    ---    ---    1.5    44     44     44     1.5    135.3
Tags              0.05   ---    ---    ---    ---    0.1    0.6    0.6    0.6    0.1    2.0
Total             0.55   ---    ---    ---    ---    2.1    59.6   59.6   59.6   2.1    183.5
MC                0.11   1.7    8.0    8.0    8.0    8.0    44     44     44     44     210.4
Grand Total       0.66   1.7    8.0    8.0    8.0    10.1   104    104    104    46.1   ~400
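The budget rows can be checked against each other: Total should be Raw + ESD + Tags, and Grand Total should be Total + MC. The sketch below does this for the years that have real-data entries; the variable and function names are illustrative, and the values are copied from the budget as read here.

```python
# Columns: 1998, 2005, 2006, 2007, 2008, 2009 (years with real-data entries)
RAW = [0.20, 0.5, 15, 15, 15, 0.5]
ESD = [0.30, 1.5, 44, 44, 44, 1.5]
TAGS = [0.05, 0.1, 0.6, 0.6, 0.6, 0.1]
TOTAL = [0.55, 2.1, 59.6, 59.6, 59.6, 2.1]
MC = [0.11, 8.0, 44, 44, 44, 44]
GRAND = [0.66, 10.1, 104, 104, 104, 46.1]


def max_deviation(parts, quoted):
    """Largest absolute difference between the summed parts and the quoted row."""
    sums = [sum(column) for column in zip(*parts)]
    return max(abs(s - q) for s, q in zip(sums, quoted))


print(f"Total vs Raw+ESD+Tags: {max_deviation([RAW, ESD, TAGS], TOTAL):.2f}")
print(f"Grand Total vs Total+MC: {max_deviation([TOTAL, MC], GRAND):.2f}")
```

The first check agrees to within floating-point noise; the second differs by about 0.4, which is what one would expect if 103.6 was rounded to 104 for 2006-2008.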
AMS Ground Segment: External Communications
[Figure: block diagram of the external communications among the ground centers. Labels recoverable from the figure:
centers: POIC @ MSFC (AL), POCC, Science Operations Center, AMS Remote Center, AMS Stations (AMS Ground Centers);
links and data streams: HOSC Web Server and xterm, commands, monitoring and H&S data, flight ancillary data, AMS science data (selected), commands archive, TReK workstations, "voice" loop, video distribution, buffer data and retransmit to SOC;
functions: RT data, commanding, monitoring, NRT analysis, NRT data processing, production farm, PC farm, GSE, MC production, primary storage, archiving, distribution, data mirror, science analysis, data server, analysis facilities.]
AMS-02 Ground Facilities
• POIC @ Marshall (MSFC)
• POCC @ JSFC / MSFC / MIT / CERN
• (A)SOC @ CERN
• Remote Center - Italian Ground Segment
• Laboratories
Payload Operation and Integration Center (POIC)
• POIC @ Marshall SFC (Huntsville, AL)
• Receives data from ISS
• Buffers data until retransmission to (A)SOC
• Forwards monitoring and meta-data to POCC
• Transmits commands from POCC to AMS
• Runs unattended 24 h/day, 7 days/week
• Must buffer ~ 2 weeks of data, i.e. about 600 GByte
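The buffer requirement is consistent with two weeks of data at the total daily volume of 30 - 41 GB/day quoted at the start of the talk. A minimal sketch of that sizing (the function name is illustrative, and the 600 GByte figure is read here as the two-week volume rounded up):

```python
def buffer_size_gb(days, daily_volume_gb):
    """Storage needed to hold `days` of data at a constant daily volume."""
    return days * daily_volume_gb


low = buffer_size_gb(14, 30)   # lower end of the quoted daily volume
high = buffer_size_gb(14, 41)  # upper end of the quoted daily volume
print(f"Two weeks of data: {low} to {high} GB")
```

Both ends of the range stay below 600 GB, so the quoted buffer leaves a small margin even at the highest rate.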
Payload Operation Control Center (POCC)
• POCC @ JSFC, MSFC, MIT, CERN
• Receives data from POIC @ MSFC
• Monitors data and runs quality control program
• Processes ~ 10% of data in near real time
• Originates and transmits commands to AMS through POIC
• Requires scientists on shift
(AMS) Science Operation Center [(A)SOC]
• Complete Data Repository (Raw + Reco)
• Production of Reconstructed data
• Re-processing / Re-calibration of data
• Meta-data Repository and Command archive
• Production and management of MC events
• Monte Carlo Repository
• Scientific Data Analysis Facility
AMS Science Operation Center Computing Facilities
[Figure: block diagram of the (A)SOC computing facilities. Labels recoverable from the figure:
Production Farm (cells labelled #1, #2, #8): PC Linux 2x2GHz+ nodes, PC Linux Server (2x2GHz, SCSI RAID), Tape Servers, Disk Servers, Gigabit Switch (1 Gbit/sec);
MC Data Server and Data Server: Disk Servers, simulated data, AMS data, NASA data, metadata;
Archiving and Staging: 2xSMP (Q, SUN);
Analysis Facilities: PC Linux 2x2GHz+, Gigabit Switch (1 Gbit/sec).]
AMS Italian Ground Segment (IGS)
• Get data (raw + reco + meta-data) from (A)SOC
• Complete Mirror and Meta-data repository: Master Copy of the full Data set
• Monte Carlo production (20%)
• Support the local user community for Data Analysis
Italian Ground Segment Facilities
• Italian Ground Segment Data Storage (IGSDS): complete mirror data and meta-data repository, namely the MASTER COPY of the full AMS Data Set
• Data Transfer Facility (DTF)
• Data Transfer Management and Survey (DTMS)
• Monte Carlo contribution (20%)
Data Transfer to IGS
• Involved: DTF, IGSDS, DTMS
• DTF (CERN): access Data at (A)SOC and transfer to IGSDS
• IGSDS (TBD): receive and store Data
• DTMS (Milano): watch over the Data transfer
• Network required: 32 Mbit/s
DATA Transfer Dev.
Data transfer
The new release of Data Transfer has been running for 20 weeks.
The only stops were due to power outages at CERN.
Data transfer
• “Production rate” = 2.2 Mbit/s
• Sustainable production rate = 8 Mbit/s (80% of available bandwidth)
• This is thanks to a forking mechanism and bbftp’s efficient bandwidth usage
• Consistency between the Milano and CERN Data Transfer DBs = 100%
• Data that had to be retransmitted = 0.2%
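The slides do not show how the forking mechanism is implemented. The following is only an illustrative sketch of the general idea: several transfers run in parallel, and any file whose transfer fails is retransmitted. It uses a thread pool rather than forked processes, and `send` is a hypothetical stand-in for one bbftp transfer, not a real bbftp API.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer_all(files, send, workers=4, max_attempts=3):
    """Send every file with a pool of workers, retrying failed transfers.

    `send(name) -> bool` is a hypothetical callable standing in for a single
    file transfer. Returns (delivered, failed, retransmitted) lists.
    """
    def attempt(name):
        # Try the same file up to max_attempts times.
        for n in range(1, max_attempts + 1):
            if send(name):
                return name, n
        return name, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(attempt, files))

    delivered = [name for name, n in results if n is not None]
    failed = [name for name, n in results if n is None]
    retransmitted = [name for name, n in results if n is not None and n > 1]
    return delivered, failed, retransmitted
```

The fraction `len(retransmitted) / len(delivered)` would then correspond to the kind of retransmission figure quoted above, and the `failed` list is what a bookkeeping database would need to flag.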
Data transfer: present work
• Test bbftp’s variable TCP parameters (done)
• Release a new version of “our” bbftp (minor changes on authorization and error reporting) (done)
• Test the system in a more reliable environment (no power outages…)
• Implement automatic recovery
• Set up a GUI (Graphical User Interface) to start/stop the system
• Complete Web monitoring tools
AMS Italian Ground Segment
Data Storage at IGSDS
• Place: TBD
• Archived Data: 180 TB (3 years)
• On-line Data: ~ 2 TB (1-2 weeks)
Cost description
• Costs for the Central AMS Ground Segment (POIC + POCC + (A)SOC)
Central Production Facility
• The Central Production Facility (CPF) will be dedicated to data reconstruction.
• The CPF will be physically hosted at CERN and is part of the (A)SOC.
• The CPF requirements are divided into storage and CPU (and DB servers).
HW and costs of the AMS Data Handling: Central Production Facility
As for computing power, the equivalent of the following will be needed:
• 50 dual 1.5 GHz boxes, 1 GB RAM
• Processing storage: 10 TB
Central Production Facility
Based on current costs and the present understanding of cost trends, the facility is expected to cost, over the period 2004-2006:
• CPF: 350 kUS$
• DB Servers: 50 kUS$
• Event Storage: 200 kUS$
POCC, Marshall (POIC), Analysis
Based on current costs and the present understanding of cost trends, the expected costs are:
• Marshall: 55 kUS$
• POCC (x2): 150 kUS$
• Analysis: 55 kUS$
Additional expenses
• Expenses in 2000-2001 for prototypes and initial set-up: 150 kUS$
• Running costs & upgrades 2007-2008: 150 kUS$
Total (personnel excluded): 1160 kUS$
About 20% of this (+ VAT) is expected to come from INFN: 277 k€
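The quoted total can be reproduced by summing the hardware items listed on this and the preceding cost slides. The sketch below stops at the 20% share before VAT: the VAT rate and the US$/€ conversion are not given in the talk, so the 277 k€ figure is not recomputed. The dictionary keys are illustrative labels.

```python
# Hardware cost items in kUS$, as listed on the cost slides
COSTS_KUSD = {
    "CPF": 350,
    "DB servers": 50,
    "Event storage": 200,
    "Marshall (POIC)": 55,
    "POCC (x2)": 150,
    "Analysis": 55,
    "Prototypes and initial set-up (2000-2001)": 150,
    "Running costs and upgrades (2007-2008)": 150,
}

total_kusd = sum(COSTS_KUSD.values())
infn_share_before_vat = 0.20 * total_kusd  # VAT and currency conversion not applied

print(f"Total: {total_kusd} kUS$")
print(f"20% share before VAT: {infn_share_before_vat:.0f} kUS$")
```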
Personnel estimates for the AMS Data Handling
• The expenditure for personnel (in addition to physicists) to be dedicated to data handling in the period 2003-2008 is being formalized.
• The personnel consists of system administrators and SW and HW engineers. The estimates in person-years are:
• POCC: about 8.5
• (A)SOC: about 15.3
• User’s support group: about 15.6 (including personnel dedicated to specific items such as storage)
• Total: about 39.4 person-years
• Assuming a cost of 50 k€ per person-year gives about 1970 k€, 20% of which (about 390 k€) should be an INFN contribution.
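The personnel estimate is a short chain of arithmetic, sketched below with the figures from the slide (the variable names are illustrative):

```python
PERSON_YEARS = {
    "POCC": 8.5,
    "(A)SOC": 15.3,
    "User's support group": 15.6,
}
COST_PER_PERSON_YEAR_KEUR = 50  # assumed on the slide
INFN_FRACTION = 0.20

total_person_years = sum(PERSON_YEARS.values())
total_cost_keur = total_person_years * COST_PER_PERSON_YEAR_KEUR
infn_contribution_keur = INFN_FRACTION * total_cost_keur

print(f"{total_person_years:.1f} person-years")
print(f"{total_cost_keur:.0f} kEUR total, {infn_contribution_keur:.0f} kEUR from INFN")
```

The 20% share comes out at 394 k€, which the slide rounds to about 390 k€.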
Cost description
• Costs for the Italian Ground Segment, covering DTF, DTMS and IGSDS
DTF
The DATA TRANSFER system will have an INFN front-end at CERN, with a dedicated system to “fetch” the data and transfer them to Italy, to the MASTER COPY repository.
The system is based on:
• Client/Server architecture (SSL)
• bbftp
• MySQL
DTF (cont.)
This system will require:
• 1 server, AMD 1.5 GHz
• 1.5 TB on RAID disk (SCSI)
• 32 Mb/s CERN → IGS
• Cost, including maintenance and replacement of the servers: about 50 k€ + VAT over the period 2004-2008
Bandwidth requirements: (4 R + 8 NT) + (2 R + 4 NT) rt + 2 (SR+CAS) = 20 Mb/s
DTMS
High performance server, with fast CPU and high I/O throughput.
I/O Buffer
• Capacity equivalent to 7 days of data taking, to recover from any connectivity failure
• 1.5 Tbytes
Network
• High speed network connections to CPF. Must be consistent with a flux of 3 days worth of data: 32 Mb/s
• Each facility (DTF and DTMS) costs about 27 k€ + VAT up to 2008
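The I/O buffer size can be related to the 180 GB/day figure quoted earlier in the talk. The sketch below assumes that figure is the volume flowing to the Italian Ground Segment, and uses decimal units (1 GB = 10^9 bytes); it also estimates how long a 32 Mb/s link would need to clear a backlog while new data keeps arriving. How the talk itself derives 32 Mb/s is not stated, so this is one possible reading only, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def buffer_tb(days, daily_gb=180):
    """Buffer needed to hold `days` of data, in TB."""
    return days * daily_gb / 1000


def mean_rate_mbit_s(daily_gb=180):
    """Average link rate needed just to keep up with the daily volume."""
    return daily_gb * 1e9 * 8 / SECONDS_PER_DAY / 1e6


def catch_up_days(outage_days, link_mbit_s=32, daily_gb=180):
    """Days needed to clear a backlog while new data keeps arriving."""
    spare = link_mbit_s - mean_rate_mbit_s(daily_gb)
    return outage_days * mean_rate_mbit_s(daily_gb) / spare


print(f"7-day buffer: {buffer_tb(7):.2f} TB")
print(f"Mean rate: {mean_rate_mbit_s():.1f} Mbit/s")
print(f"Catch-up after a 7-day outage: {catch_up_days(7):.1f} days")
```

Under these assumptions a 7-day buffer needs about 1.3 TB, which fits in the quoted 1.5 Tbytes, and a 32 Mb/s link has roughly twice the capacity needed to keep up with the daily volume.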
DATA STORAGE: Italian MASTER COPY
2 high performance servers, with fast CPU and high I/O throughput.
I/O Buffer: capacity equivalent to about 3 days of data taking, to recover from any connectivity failure (0.5 Tbytes)
On-line storage: RAID system (1 Tbytes)
Off-line storage: tapes or similar (e.g. LTO), 180 Tbytes
Off-line robotics staging area: depending on the robot solution adopted, it varies between a few percent and 10% of the stored data (10 Tbytes)
Network: high speed network connections to CPF. Must be consistent with a flux of 3 days worth of data (32 Mb/s)
Cost (2002 price based on LTO): 355 k€ + VAT
Summary of costs for the INFN part
for the contribution to the Central Ground Segment (CERN) and the IGS share related to Data Transfer and Master Copy, for the period 2003-2008
• HW to AMS central ground segment: 277 k€
• Personnel ((A)SOC, POCC, etc.): 394 k€
• Total cost: 671 k€ (VAT included)
• HW (IGSDS) for 200 TB storage: 428 k€
• HW DTF and DTMS: 63 k€
• Total cost: 491 k€
• Grand Total (2003-2008): 1162 k€
• No cost for IGSDS facility (infrastructure and personnel) is included
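The subtotals and the grand total in this summary add up as follows (a trivial check, using the figures as quoted; the variable names are illustrative):

```python
# All values in kEUR, as quoted in the summary
central_hw = 277
central_personnel = 394
igsds_hw = 428
dtf_dtms_hw = 63

central_total = central_hw + central_personnel
igs_total = igsds_hw + dtf_dtms_hw
grand_total = central_total + igs_total

print(central_total, igs_total, grand_total)
```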