Tutorial of the Statistical Toolkit
Geant4
Workshop
Vancouver, September 2003
Alberto Ribon
CERN
http://www.ge.infn.it/geant4/analysis/TandA
Alberto Ribon
What is the Test & Analysis project?
Test & Analysis is a project to develop a statistical analysis system
for usage in Geant4 testing
physics validation
Main application areas
regression testing
system testing
Provide tools to compare Geant4 simulation results with
reference data
– equivalent reference distributions (for instance, regression testing)
– experimental measurements
– data libraries from reference distribution sources
– functions deriving from theoretical calculations or from fits
Alberto Ribon
The performed tests were established
with the help of:
Vladimir Ivanchenko (CERN)
Team of Geant4 Test
Michel Maire (IN2P3/LAPP)
Coordinators:
Maria Grazia Pia (INFN, Genova)
Andreas Pfeiffer (CERN)
Realisation of tests:
Pablo Cirrone (INFN Catania)
Susanna Guatelli (Genova, CERN)
Alfonso Mantero (Genova, ESA)
Luciano Pandola (INFN, Gran Sasso)
Automatisation/production of tests:
Sandra Parlati (INFN, Gran Sasso)
Statistical analysis:
Stefania Donadio (INFN, Genova)
Barbara Mascialino (INFN, Genova)
Maria Grazia Pia (INFN, Genova)
Andreas Pfeiffer (CERN)
Alberto Ribon (CERN)
Paolo Viarengo (IST, Genova)
Alberto Ribon
test50
Electromagnetic physics:
G4LowE
G4Standard
G4Penelope
Particle CSDA range
Particle Stopping Power
Transmission coefficient
Backscattering coefficient
Gamma Attenuation coefficient
Gamma processes’ cross section
Particle range
Bremmstrahlung energy spectrum
Multiple scattering test
Energy deposition in absorber material
etc.
Hadronic physics
Proton Bragg Peak
Study of possible tests to
perform
Flexibility
Plug-in
Alberto Ribon
X-ray Attenuation Coefficient - Ge
2N-L=26.3 – =23 - p=0.29
2N-S=27.9 – =23 - p=0.22
NIST-XCOM
G4 Standard
G4 LowE
Alberto Ribon
CSDA Range – Al –G4LowE
Regression
testing
NIST-ESTAR
Geant4-05-00
Geant4-05-02
Alberto Ribon
Backscattering low energies - Au
Alberto Ribon
Download/setup the Statistical Toolkit
http://www.ge.infn.it/geant4/analysis/HEPstatistics
Download :
StatisticsTesting-V1-00-00.tgz
tarball
You also need to install the following libraries:
1)
GSL (GNU Scientific Library)
2)
AIDA
3)
Anaphe
3.0
5.0.5
Edit the script buildAll.py to set the proper GSL path
GSL_DIR=/afs/cern.ch/sw/lhcxx/specific/redhat73/gcc3.2/
PublicDomainPackages/2.0.0/
Run the script: $ ./buildAll.py
Alberto Ribon
Statistiscal Tests available
Currently the available statistical tests are:
1)
Chi2 test (for binned distributions);
2)
Kolmogorov-Smirnov test
(for unbinned distributions).
For the next future:
3)
Cramer-von Mises test (for both binned and unbinned distributions);
4)
Anderson-Darling test (for both binned and unbinned distributions);
…
Alberto Ribon
The AIDA classes you need to know
AIDA::IHistogram1D : for 1-dimensional binned distribution
AIDA::ICloud1D
: for 1-dimensional unbinned distribution
AIDA::IDataPointSet : vector of IDataPoint;
IDataPoint : is a vector of IMeasurement;
IMeasurement : (value, errorPlus, errorMinus).
Alberto Ribon
The StatisticsTesting classes you need to know
namespace StatisticsTesting {
template < class Algorithm >
// For instance: Chi2ComparisonAlgorithm ,
//
KolmogorovSmirnovComparisonAlgorithm .
class StatisticsComparator {
public:
ComparisonResult compare (const AIDA::IDataPointSet & dps1, const AIDA::IDataPointSet & dps2);
ComparisonResult compare (const AIDA::IHistogram1D & histo1, const AIDA::IHistogram1D & histo2);
ComparisonResult compare (const AIDA::ICloud1D & cloud1, const AIDA::ICloud1D & cloud2);
...
}
class ComparisonResult {
public:
double distance( );
double quality ( );
double ndf
( );
...
}
Alberto Ribon
Chi2 test between histograms
#Include “AIDA/AIDA.h”
#include “StatisticsTesting/StatisticsComparator.h”
#include “Chi2ComparisonAlgorithm.h”
#include “ComparisonResult.h”
using namespace StatisticsTesting;
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af -> createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf -> create() );
std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) );
AIDA::IHistogram1D& hA = *( hf->createHistogram1D( "A", 100, 0.0, 50.0) );
AIDA::IHistogram1D& hB = *( hf->createHistogram1D( "B", 100, 0.0, 50.0) );
hA.fill( 15.7 );
...
hB.fill( 23.4 );
...
StatisticsComparator< Chi2ComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( hA, hB );
std::cout << “ distance=“ << result.distance() << “ ndf=“ << result.ndf() << “ p-value=“ << result.quality();
Alberto Ribon
Kolmorogov-Smirnov test between clouds
#Include “AIDA/AIDA.h”
#include “StatisticsTesting/StatisticsComparator.h”
#include “KolmogorovSmirnovComparisonAlgorithm.h"
#include “ComparisonResult.h”
using namespace StatisticsTesting;
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af -> createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf -> create() );
std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) );
AIDA::ICloud1D& cloudA = *( hf->createCloud1D( "A" ) );
AIDA::ICloud1D& cloudB = *( hf->createCloud1D( "B" ) );
cloudA.fill( 15.7 );
...
cloudB.fill( 23.4 );
...
StatisticsComparator< KolmogorovSmirnovComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( cloudA, cloudB );
std::cout << “ K-S distance=“ << result.distance() << “ p-value=“ << result.quality();
Alberto Ribon
Example of an XML data file
<?xml version=“1.0” encoding=“ISO-8859-1”?>
<!DOCTYPE aida SYSTEM http://aida.freehep.org/schemas/3.0/aida.dtd >
<aida version=“3.0”>
<implementation package=“Anaphe” version=“5.0.0”/>
<dataPointSet dimension=“2” name=“attenuation coefficient” path=“/”
title=“attenuation coefficient in Ge”>
<annotation>
<item key=“Title” value=“attenuation coefficient in Ge”/>
<item key=“Name” value=“attenuation coefficient”/>
<item key=“Size” value=“24”/>
</annotation>
<dataPoint>
<measurement value=“1.000e-03”/>
<measurement errorMinus=“9.465e+01” errorPlus=“9.465e+01” value=“1.893e+03” />
</dataPoint>
<measurement value=“1.500e-03/>
<measurement errorMinus=“1.3555e+02” errorPlus=“2.7375e+02” value=“5.475e+03”/>
</dataPoint>
…
<dataPoint>
<measurement value=“1.500e+01/>
<measurement errorMinus=“1.670e-03” errorPlus=“1.670e-03” value=“3.340e-02”/>
</dataPoint>
</dataPointSet>
</aida>
Alberto Ribon
How to write a XML data file
aFact = AIDA_createAnalysisFactory();
treeFact = aFact -> createTreeFactory();
…
theTree = treeFact -> create(“test50.xml”, “xml”, false, true, “uncompress” );
dataPointFactory = aFact -> createDataPointSetFactory( *theTree );
particleTransmissionDataPoint = dataPointFactory -> create(“Transmission test”, 2);
…
particleTransmissionDataPoint -> addPoint();
AIDA::IDataPoint* point = particleTransmissionDataPoint -> point( PointNumber );
AIDA::IMeasurement* coordinateX = point -> coordinate( 0 );
coordinateX -> setValue( primaryParticleEnergy );
AIDA::IMeasurement* coordinateY = point -> coordinate( 1 );
coordinateY -> setValue( TransFraction );
coordinateY -> setErrorPlus( TransError );
coordinateY -> setErrorMinus( TransError );
…
theTree -> commit();
theTree -> close();
Alberto Ribon
How to compare XML dataPointSets
#Include “AIDA/AIDA.h”
#include “StatisticsTesting/StatisticsComparator.h”
#include “Chi2ComparisonAlgorithm.h”
#include “ComparisonResult.h”
using namespace StatisticsTesting;
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af -> createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf -> create() );
std::auto_ptr<AIDA::IDataPointSetFactory> dpsf( af -> createDataPointSetFactory( *tree ) );
std::auto_ptr<AIDA::ITree> treeXML1( tf -> create(“gamma_lowE_Ge.xml”, “xml”, true, false);
std::auto_ptr<AIDA::ITree> treeXML2( tf -> create(“NIST_attenuationGamma_Ge.xml”, true, false);
AIDA::IDataPointSet & dps1 = * ( dynamic_cast<AIDA::IDataPointSet*>( treeXML1 -> find(“Gamma
attenuation coefficient test” ) ) );
AIDA::IDataPointSet & dps2 = * ( dynamic_cast<AIDA::IDataPointSet*>( treeXML2 -> find(“ Gamma
attenuation coefficient test” ) ) );
std::cout << “ title ” << dps1.title() << “ size=“ << dps1.size() << “ dimension=“ << dps1.dimension()
StatisticsComparator< Chi2ComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( dps1, dps2 );
std::cout << “ distance=“ << result.distance() << “ ndf=“ << result.ndf() << “ p-value=“ << result.quality();
Alberto Ribon
Conclusions
The Statistical Toolkit provides
already some important functionalities;
It is already used in test50 for Geant4
physics tests and regression;
it is simple to install and to use;
we are working on the documentation;
other statistical tests are under development;
various long-term extensions are foreseen …
Alberto Ribon