Publication patterns in HEP computing
M. G. Pia1, T. Basaglia2, Z. W. Bell3, P. V. Dressendorfer4
1INFN Genova, Genova, Italy
2CERN, Geneva, Switzerland
3ORNL, Oak Ridge, TN, USA
4IEEE, Piscataway, NJ, USA
NSS 2012, Anaheim, CA
Maria Grazia Pia, INFN Genova
Analysis topics
What they publish
How much
Where
Citations
Technology vs physics
Software vs hardware
Software/DAQ-trigger
General tools
− Geant4
− ROOT
HEP experiments
− LEP
 ALEPH, DELPHI, L3, OPAL
− CDF
− BaBar
− LHC
 ALICE, ATLAS, CMS, LHCb, TOTEM
Grid computing
− LCG
No time to report all the results
Data sources
Thomson-Reuters: ISI Web of Knowledge
− CERN subscription: since 1970, conference database not included
− Search by keywords, collaboration name
Journal web sites
− IEEE TNS
− NIM, Comp. Phys. Comm. (Elsevier)
− JINST (IOP/SISSA)
➤ Full-text searches
CERN databases
− CERN Document System
− Greybook
Years: 1982-2011 (LEP), 1992-2011 (BaBar, LHC)
− Reproducible sample for citation analysis
Publication analysis extended to 30 September 2012
Data sample
Contamination
− Non-pertinent entries in the data sample
Omission
− Pertinent papers are not included in the data sample
➩ Cross-checks
− WoS/CDS, WoS/publishers’ web sites
WoS inconsistencies and errors
− Total number of citations includes Conference database
− Proceedings papers: false classifications and omissions
➩ Manually corrected whenever possible
Automated analysis (whenever possible)
Manual evaluation: abstracts and full-text papers
− Some degree of subjectivity
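The cross-checks between sources can be pictured as a set comparison between two samples of the same collection. The following is a minimal sketch, assuming each paper carries a unique identifier such as a DOI; the function name `cross_check` and the identifiers are hypothetical and do not describe the procedure actually used for this analysis.

```python
def cross_check(wos_ids, cds_ids):
    """Compare two samples of paper identifiers (e.g. DOIs).

    Papers found in only one source are candidates for contamination
    or omission; they still need manual review.
    """
    wos, cds = set(wos_ids), set(cds_ids)
    return {
        "only_in_wos": sorted(wos - cds),
        "only_in_cds": sorted(cds - wos),
        "in_both": sorted(wos & cds),
    }

# Hypothetical identifiers, for illustration only.
result = cross_check(["p1", "p2", "p3"], ["p2", "p3", "p4"])
print(result)
```

Entries appearing in only one source are not errors by themselves; they only mark where manual evaluation of abstracts and full texts is needed.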
S. Agostinelli et al., Geant4: a simulation toolkit
NIM A, vol. 506, no. 3, pp. 250-303, 2003
3301 citations (20 October 2012)
Most cited CERN publication in WoS (excluding Rev. Part. Properties)
J. Allison et al., Geant4 Developments and Applications
IEEE Trans. Nucl. Sci., vol. 53, no. 1, pp. 270-278, 2006
665 citations (20 October 2012)
Many papers cite the NIM paper, but they omit citing the TNS one,
even though both are indicated in http://cern.ch/geant4
Many papers that use Geant4 do not cite either reference
Citation analysis: until end 2011
Born from LHC experimental requirements
Multidisciplinary sources of citations
[Plot: citations per year, 2003-2011, for the Geant4 NIM and Geant4 TNS papers]
[Plot: Geant4 NIM, citing journals: NIM A, Phys. Rev. D, TNS, Phys. Rev. Lett., Med. Phys., Phys. Med. Biol., Phys. Rev. C, Phys. Lett. B, NIM B, JINST, EPJC, Astrop. Phys., JHEP, J. Phys. G, Appl. Radiat. Isot., Radiat. Meas., J. Korean Phys. Soc., Radiat. Prot. Dosim.]
[Plot: Geant4 NIM, citing collaborations (legend: LHC, HEP, Other): BaBar, ATLAS, CMS, LHCb, HARP, CDF, LUNA, MiniBooNE, n_TOF, BES III, JET EFDA, ALICE, ISOLDE]
Plot annotations: 30% Physics; 75% citations (plot); 16% citations (plot); 19% citations from collaborations
R. Brun and F. Rademakers, ROOT - An object oriented data analysis framework
NIM A, vol. 389, no. 1-2, pp. 81-86, 1997
AIHENP Workshop proceedings paper
584 citations (20 October 2012)
I. Antcheva et al., ROOT - A C++ framework for petabyte data storage, statistical analysis and visualization
Comp. Phys. Comm., vol. 180, no. 12, pp. 2499-2512, 2009
32 citations (20 October 2012)
Citation analysis: until end 2011
[Plot: citations per year, 1997-2011, for the ROOT Proc. and ROOT CPC papers]
[Plot: ROOT Proc., citing journals: Astropart. Phys., Lect. Notes Comp., NIM B, JHEP, Med. Phys., EPJC, Phys. Med. Biol., JINST, Phys. Rev. D, Phys. Rev. C, Comp. Phys. Comm., TNS, NIM A]
Plot annotation: 75% citations
Field of citing journals    Geant4 %    ROOT %
Technology                  30.3        49.6
Physics                     29.9        18.2
BioMedical                  13.9        6.0
[Plot: citing collaborations: CMS, CDF, CLAS, D0, n_TOF, T2K, ATLAS, ALICE, BABAR, RISING, R3B, PHOBOS, KIMS, JET-EFDA, HADES, H1, GLAST, BELLE, AUGER]
8% of all citations from collaborations
HEP experiments
LEP
• ALEPH
• DELPHI
• L3
• OPAL
CDF
BaBar
LHC
• ALICE
• ATLAS
• CMS
• LHCb
• TOTEM
Start of run
CDF: 1985/1988
LEP: 1989
BaBar: 1999
LHC: 2008
[Plot: collaboration members per experiment; y-axis: Members, 0-3500; x-axis: Experiment]
Time distribution
Start of run
CDF: 1985/1988
LEP: 1989
BaBar: 1999
LHC: 2008
[Plot: number of publications vs. publication year, 1985-2010; series: All, LEP, BaBar, LHC, CDF]
[Plot: publications vs. operation year, −20 to 20; series: LEP, BaBar, LHC, CDF]
Rescaled w.r.t. year of start of run
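Rescaling with respect to the start of run amounts to shifting each publication year by the start year of the experiment. A minimal sketch, assuming per-year publication counts are available as a dictionary; the function name `rescale_years` and the counts are hypothetical, and taking 1985 for CDF (the first of its two quoted years) is an assumption.

```python
# Start-of-run years as quoted on the slide.
# Assumption: for CDF (1985/1988) the first year is used.
START_OF_RUN = {"CDF": 1985, "LEP": 1989, "BaBar": 1999, "LHC": 2008}

def rescale_years(counts_by_year, experiment):
    """Shift publication years so that year 0 is the start of run."""
    start = START_OF_RUN[experiment]
    return {year - start: n for year, n in counts_by_year.items()}

# Hypothetical counts, for illustration only (not read off the plots).
lhc_counts = {2006: 10, 2008: 25, 2011: 40}
rescaled = rescale_years(lhc_counts, "LHC")
print(rescaled)
```

Negative keys then correspond to publications that appeared before data taking began, which is what allows experiments from different decades to be overlaid on one axis.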
Time distribution
[Plot: publications/member vs. year, 1985-2010; series: LEP, BaBar, LHC, CDF]
[Plot: publications/member vs. operation year, −20 to 20; series: LEP, BaBar, LHC, CDF]
Same as previous slide, rescaled by the number of experiment members
Share of hardware, software and DAQ-trigger publications
[Plot: number of publications per experiment, 0-600; categories: general, physics, hardware, DAQ-trigger, software]
[Plot: technological publications per experiment, y-axis 0.0-1.0]
Experiments shown: ALEPH, DELPHI, L3, OPAL, BaBar, ALICE, ATLAS, CMS, LHCb, TOTEM, CDF
Physics publications
[Plot: physics publications, number of publications per experiment, 0-500]
[Plot: physics publications/member, number of publications/members per experiment, 0.0-1.0]
Experiments shown: ALEPH, DELPHI, L3, OPAL, BaBar, ALICE, ATLAS, CMS, LHCb, TOTEM, CDF
LEP experiments completed their life-cycle
LHC experiments: at an early stage of their physics production
Technological publications
[Plot: technological publications, number of publications per experiment, 0-300; series: Software, DAQ-trigger, Hardware]
[Plot: technological publications/member, number of publications/members per experiment, 0.00-0.18; series: Software, DAQ-trigger, Hardware]
Experiments shown: ALEPH, DELPHI, L3, OPAL, BaBar, ALICE, ATLAS, CMS, LHCb, TOTEM, CDF
Roughly constant trends, once the number of publications is
normalized to the number of collaborators
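The normalization to the number of collaborators is a simple per-member rate. A minimal sketch with hypothetical experiment names and numbers; `per_member` is an illustrative helper, not part of the original analysis.

```python
def per_member(publications, members):
    """Normalize publication counts by collaboration size."""
    return {
        name: publications[name] / members[name]
        for name in publications
        if members.get(name)  # skip experiments without a member count
    }

# Hypothetical numbers, for illustration only (not read off the plots).
publications = {"A": 120, "B": 30}
members = {"A": 3000, "B": 600}
rates = per_member(publications, members)
print(rates)
```

In this made-up case the larger collaboration publishes four times as many papers, yet the per-member rates are close, which is the kind of pattern described as roughly constant.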
Software vs. hardware
[Plot: hardware/software publications, ratio per experiment, 0-18]
[Plot: DAQ-trigger/software publications, ratio per experiment, 0-8]
Experiments shown: ALEPH, DELPHI, L3, OPAL, BaBar, ALICE, ATLAS, CMS, LHCb, TOTEM, CDF
Hardware publications: approximately 4 times more than software
DAQ-trigger publications: approximately 1.3 times more than software
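The two ratios can be turned into approximate shares of technological publications. This is a worked example under the assumption that hardware, DAQ-trigger and software together make up all technological publications and that the quoted ratios are representative averages.

```python
# Ratios to software quoted above (approximate values).
ratio_to_software = {"hardware": 4.0, "DAQ-trigger": 1.3, "software": 1.0}

# Each share is the ratio divided by the sum of all ratios.
total = sum(ratio_to_software.values())
shares = {kind: r / total for kind, r in ratio_to_software.items()}

for kind, share in shares.items():
    print(f"{kind}: {share:.0%}")
```

Under these assumptions software would account for roughly one technological publication in six.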
Journals: LEP and LHC
LEP: dominated by physics publications
LHC: still dominated by technological publications
[Plots: number of publications vs. citations, 0-100; panels: Hardware, DAQ-Trigger, Software, Physics]
Panel annotations "0 citations": 6%, 20%, 28%, 27%
Citations of the most cited paper
ALEPH: 340
DELPHI: 309
L3: 509
OPAL: 473
BaBar: 859
ALICE: 116
CMS: 129
LHCb: 101
TOTEM: 35
ATLAS: ATLAS pixel detector electronics and sensors: 185
The most cited papers are often the general reference papers about the detector published by each experiment
References
[Plots: number of publications vs. number of references, 0-100; panels: Hardware, DAQ-Trigger, Software, Physics]
Physics papers cite more references than technological papers
Bibliographical entries in software papers are often web sites
More references
more citations
Sources of citations of physics papers
Samples in plots account for >90% of citations
[Plots: citations (%) by citing journal; LEP panel: DELPHI, ALEPH; LHC panel: CMS, ATLAS]
Citing journals shown: Phys. Rev. D, Phys. Lett. B, JHEP, EPJC, Nucl. Phys. B, Nucl. Phys. B Proc. Suppl., Phys. Rev. Lett., Acta Phys. Pol. B, Phys. Rev. C, Z. Phys. C, Nucl. Phys. A, Int. J. Mod. Phys. A, Mod. Phys. Lett. A, J. Phys. G, NIM A, Progr. Theor. Phys. Suppl., Phys. Rep., New J. Phys., JINST, Phys. Atom. Nucl., J. Cosm. Astrop. Phys., Ann. Rev. Nucl. Part. Sci.
Citations to HEP physics papers mostly come from journals specialized
in HEP and a few related fields (astroparticle and nuclear physics)
Sources of citations of technological papers
[Plot: LEP (DELPHI, ALEPH), citations (%) by citing journal: NIM A, EPJC, Phys. Lett. B, TNS, Nucl. Phys. B Proc. Suppl., Z. Phys. C, Comp. Phys. Comm., Nucl. Phys. B, Phys. Rev. D, JHEP, Acta Phys. Pol. B, Int. J. Mod. Phys. A, Phys. Rep., Ann. Rev. Nucl. Part. Sci., Rev. Mod. Phys., Rep. Prog. Phys.]
[Plot: LHC (CMS, ATLAS), citations (%) by citing journal: NIM A, TNS, JINST, EPJC, Phys. Rev. D, JHEP, Nucl. Phys. B Proc. Suppl., Phys. Lett. B, Comp. Phys. Comm., Int. J. Mod. Phys. A]
Citations from HEP physics and technology journals
2008-2011
More refined analysis of technological papers published since the start of the LHC run
[Plot: TNS 2008-2011, number of papers for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC]
[Plot: NIM 2008-2011, number of papers for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC]
Categories: Hardware, Software, DAQ-trigger
2008-2011
[Plots: TNS, number of self-citations and number of outside citations for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC]
[Plots: NIM A, number of self-citations and number of outside citations for ATLAS, CMS, LHCb, ALICE, TOTEM, LHC]
Categories: Hardware, Software, DAQ-trigger
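The slides do not state how self-citations were identified. The sketch below assumes one plausible definition: a citation counts as a self-citation when the citing paper is signed by the same collaboration as the cited paper. The function `split_citations` and the sample data are hypothetical.

```python
from collections import Counter

def split_citations(cited_collab, citing_collabs):
    """Count self- and outside citations for one cited paper.

    citing_collabs holds, for each citing paper, the collaboration that
    signed it, or None when it is not a collaboration paper.
    Assumption: same collaboration means self-citation.
    """
    counts = Counter(
        "self" if c == cited_collab else "outside" for c in citing_collabs
    )
    return counts["self"], counts["outside"]

# Hypothetical citing papers, for illustration only.
self_n, outside_n = split_citations("ATLAS", ["ATLAS", "CMS", None, "ATLAS"])
print(self_n, outside_n)
```

A definition based on overlapping author lists would give different counts, especially for papers signed by small subgroups of a collaboration.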
Conclusions
Software is largely underrepresented in HEP scholarly literature w.r.t. hardware
Publication patterns appear similar in the LEP and LHC era
Citation patterns are different for publications by HEP experiments and about general software tools
Publish!
…and don’t forget to cite