Analisi Statistica dei Dati per HEP (Laboratorio)
• 2013-03-25
• Elementi di C++
• Introduzione a ROOT
• 2013-04-11, Laboratorio Informatico
• ROOT warm up
• 2013-04-17, Laboratorio Informatico
• Introduzione a RooFit
• Primo esercizio con RooFit
• 2013-05-02, Laboratorio Informatico
• RooFit (Workspace, Factory, Composite Model)
• 2013-05-16, Laboratorio Informatico
• Introduzione a RooStats
• 2013-05-23, Laboratorio Informatico
• RooStats , TMVA
/57
RECAP: Esercizio RooStats
[1] roostats_ex1.C
• Si modifichi il modello:
Specificare i componenti del modello per i tool statistici di roostat: osservabile e
parametro di interesse. Utilizzare il numero di eventi di segnale come unico parametro di
interesse. Fissare costanti tutti gli altri parametri del modello.
Importare la configurazione nel workspace e salvare su file.
•
Si modifichi l’uso del modello:
- leggere il modelConfig dal workspace
esempio: ModelConfig* mc = (ModelConfig*) w.obj("ModelConfig");
- calcolare un Confidence Interval utilizzando il ProfileLikelihoodCalculator
- Disegnare il profilo della likelihood e sovrapporre l’intervallo
- calcolare la discovery significance utilizzando il profilelikelihoodcalculator come test di
ipotesi
- scrivere sulla console i limiti dell’intervallo e la significatività
[2] roostats_ex1.C
• Aggiungere l’intervallo calcolato con Feldman-Cousins
suggerimento : guardare il codice in :
$ROOTSYS/tuturials/roostats/IntervalExamples.C
/57
void makeModel(RooWorkspace& w )
{
// Construct model here
w.factory("Exponential:bkg(x[110,120], expr('(-1./tau)',tau[10,-1000,1000]) )");
w.factory("Gaussian:sig(x, mean[115,105,120], sigma[1.0,0,10])");
w.factory("SUM:model( Ns[100,0,120]*sig, Nb[1000,0,10000]*bkg)");
// Getting variables and functions out of a workspace
RooAbsPdf * model = w.pdf("model");
RooRealVar * x = w.var("x"); // the observable
RooRealVar * Ns = w.var("Ns"); // the parameter
x->setBins(30);
// set constant parameters
w.var("mean")->setConstant(true);
w.var("sigma")->setConstant(true);
w.var("tau")->setConstant(true);
w.var("Nb")->setConstant(true);
//specify components of model for statistical tools
ModelConfig modelConfig("ModelConfig");
modelConfig.SetWorkspace(w);
//set components using the name of ws objects
modelConfig.SetPdf( "model" );
modelConfig.SetObservables("x");
modelConfig.SetParametersOfInterest("Ns");
//import modelConfig into workspace too
w.import(modelConfig);
// generate the data (nsig = 50, nbkg=1000)
w.var("Ns")->setVal(50);
w.var("Nb")->setVal(1000);
// use fixed random numbers for reproducibility
RooRandom::randomGenerator()->SetSeed(111);
int N = 1100 ;
// will generate accordint to total S+B events
RooDataSet * data = model->generate( *x );
data->Print();
non metto N
// add data to the workspace
data->SetName("data");
w.import(*data);
// write the workspace in the file
/57
void useModel(RooWorkspace& w )
{
// Getting variables and functions out of a workspace
RooAbsPdf * model = w.pdf("model");
RooRealVar * x = w.var("x"); // the observable
RooRealVar * Ns = w.var("Ns"); // the parameter
RooDataSet * data = w.data("data"); // the data
Chiama il ProfileLikelihoodCalculator
Passando gli argomenti (dati, modelconfig)
// get the modelConfig out of the
ModelConfig* mc = (ModelConfig*) w.obj("ModelConfig");
// create the class using data and model
ProfileLikelihoodCalculator plc(*data, *mc);
// set the confidence level
plc.SetConfidenceLevel(0.683);
// compute the interval
LikelihoodInterval* interval = plc.GetInterval();
double lowerLimit = interval->LowerLimit(*Ns);
double upperLimit = interval->UpperLimit(*Ns);
// plot the interval
LikelihoodIntervalPlot plot(interval);
plot.Draw();
// Calculate the significance: SL2 estimator
Ns->setVal(0);
plc.SetNullParameters(*Ns);
HypoTestResult* hypotest = plc.GetHypoTest();
double SL2 = hypotest->Significance();
cout << "Limits: [" << lowerLimit << " - " << upperLimit << "]" << endl;
cout << "Significance (SL2): " << SL2 << endl;
}
/57
68%
Ns
Limits: [27.9087 - 67.6207] 68% C.L.
Significance (SL2): 2.50742
Ricordare che Nb era fissato costante, così come gli altri parametri del modello (tau, mass, sigma).
/57
void useModel(RooWorkspace& w )
{
// Getting variables and functions out of a workspace
RooAbsPdf * model = w.pdf("model");
RooRealVar * x = w.var("x"); // the observable
RooRealVar * Ns = w.var("Ns"); // the parameter
RooDataSet * data = w.data("data"); // the data
ESERCIZIO 2
Intervallo Feldman-Cousins
Chiama il FeldmanCousins calculator
Passando sempre gli argomenti (dati, modelconfig)
// get the modelConfig out of the
ModelConfig* mc = (ModelConfig*) w.obj("ModelConfig");
FeldmanCousins fc(*data, *modelConfig);
fc.SetConfidenceLevel( 0.95 );
fc.SetNBins(40); // number of points to test per parameter
fc.UseAdaptiveSampling(true); // make it go faster
// The PDF could be extended and this could be removed
// fc.FluctuateNumDataEntries(false);
// Proof
// ProofConfig pc(*wspace, 4, "workers=4", kFALSE); // proof-lite
//ProofConfig pc(w, 8, "localhost"); // proof cluster at "localhost"
// ToyMCSampler* toymcsampler = (ToyMCSampler*) fc.GetTestStatSampler();
// toymcsampler->SetProofConfig(&pc); // enable proof
PointSetInterval* interval = (PointSetInterval*) fc.GetInterval();
std::cout << “Feldman-Cousins Limits: ["<<
interval->LowerLimit(*Ns) << " , " <<
interval->UpperLimit(*Ns) << "]" << endl;
}
/57
=== Using the following for ModelConfig ===
Observables:
RooArgSet:: = (x)
Parameters of Interest: RooArgSet:: = (Ns)
PDF:
RooAddPdf::model[ Ns * sig + Nb * bkg ]
FeldmanCousins: ntoys per point: adaptive
FeldmanCousins: nEvents per toy will fluctuate about expectation
FeldmanCousins: Model has no nuisance parameters
FeldmanCousins: # points to test = 40
Ns
Feldman Cousins Interval is [16.5 , 85.5 ]
ProfileLikeLihood Interval is [9.94475 , 87.7451]
Significance (SL2): 2.50742
95% C.L.
95% C.L.
Ricordare che Nb era fissato costante, così come gli altri parametri del modello (tau, mass, sigma).
/57
void makeModel(RooWorkspace& w )
{
// Construct model here
w.factory("Exponential:bkg(x[110,120], expr('(-1./tau)',tau[10,-1000,1000]) )");
w.factory("Gaussian:sig(x, mean[115,105,120], sigma[1.0,0,10])");
w.factory("SUM:model( Ns[100,0,120]*sig, Nb[1000,0,10000]*bkg)");
// Getting variables and functions out of a workspace
RooAbsPdf * model = w.pdf("model");
RooRealVar * x = w.var("x"); // the observable
RooRealVar * Ns = w.var("Ns"); // the parameter
x->setBins(30);
VARIAZIONE
Introduciamo Nuisance Parameters
// set constant parameters
w.var("mean")->setConstant(true);
w.var("sigma")->setConstant(true);
// define set of nuisance parameters
w.defineSet("nuisParams", "tau,Nb");
//specify components of model for statistical tools
ModelConfig modelConfig("ModelConfig");
modelConfig.SetWorkspace(w);
//set components using the name of ws objects
modelConfig.SetPdf( "model" );
modelConfig.SetObservables("x");
modelConfig.SetParametersOfInterest("Ns");
modelConfig.SetNuisanceParameters(*w.set("nuisParams"));
//import modelConfig into workspace too
w.import(modelConfig);
/57
Ns
Feldman Cousins Interval is [16.5
, 100.5 ] 95% C.L.
ProfileLikeLihood Interval is [10.1149 - 106.863] 95% C.L.
Significance (SL2): 2.3894
/57
VARIAZIONE
Intervallo Bayesian
// example use of BayesianCalculator
// now we also need to specify a prior in the ModelConfig
wspace->factory("Uniform::prior(mu)");
modelConfig->SetPriorPdf(*wspace->pdf("prior"));
// example usage of BayesianCalculator
BayesianCalculator bc(*data, *modelConfig);
bc.SetConfidenceLevel( confidenceLevel);
SimpleInterval* bcInt = bc.GetInterval();
Chiama il BayesianCalculator
Passando sempre gli argomenti (dati, modelconfig)
/57
We create the class and configure it. Since the numerical integration can be tricky, it is better in this case to set a
reasonable range from the nuisance parameters which will be integrated to obtain the marginalised posterior function. A
sensible choice is to use for example an interval using 10-sigma around the best fit values for the nuisance parameters,
RooRealVar *nuisPar = w->var("...");
nuisPar->setRange(nuisPar->getVal() - 10*nuisPar->getError(), nuisPar->getVal() + 10* nuisPar->getError() ) ;
It is also recommended to use this class on a model which has a binned dataset or not having too many events
(<~ 100), otherwise it will take a long time to obtain the result.
We create the class can set the type of interval (central, upper/lower limit or shortest) and the type of
integration method
BayesianCalculator bayesianCalc(*data,*mc);
bayesianCalc.SetConfidenceLevel(0.683); // 68% interval
// set the type of interval (not really needed for central which is the default)
bayesianCalc.SetLeftSideTailFraction(0.5); // for central interval
//bayesianCalc.SetLeftSideTailFraction(0.); // for upper limit
//bayesianCalc.SetShortestInterval(); // for shortest interval
// set the integration type (not really needed for the default ADAPTIVE)
// possible alternative values are "VEGAS" , "MISER", or "PLAIN" (MC integration from libMathMore)
// "TOYMC" (toy MC integration, work when nuisances exist and they have a constraints pdf)
TString integrationType = "";
// this is needed if using TOYMC
if (integrationType.Contains("TOYMC") ) {
RooAbsPdf * nuisPdf = RooStats::MakeNuisancePdf(*mc, "nuisance_pdf");
if (nuisPdf) bayesianCalc.ForceNuisancePdf(*nuisPdf);
}
bayesianCalc.SetIntegrationType(integrationType);
/57
// compute interval by scanning the posterior function
// it is done by default when computing shortest intervals
bayesianCalc.SetScanOfPosterior(100);
RooRealVar* firstPOI = (RooRealVar*) mc->GetParametersOfInterest()->first();
SimpleInterval* interval = bayesianCalc.GetInterval();
if (!interval) {
cout << "Error computing Bayesian interval - exit " << endl;
return;
}
double lowerLimit = interval->LowerLimit();
double upperLimit = interval->UpperLimit();
cout << "\n68% interval on " <<firstPOI->GetName()<<" is : ["<<
lowerLimit << ", "<<
upperLimit <<"] "<<endl;
// draw plot of posterior function
RooPlot * plot = bayesianCalc.GetPosteriorPlot();
if (plot) plot->Draw();
/57
TMVA
/57
Scarica

Document