## Statistical Data Analysis for High Energy Physics
## Graduiertenkolleg der Uni Freiburg "Physik an Hadron-Beschleunigern"

Glen Cowan, Physics Department, Royal Holloway, University of London, e-mail: g.cowan@rhul.ac.uk

**Dates and times:** See here.

**Course description:** The series of lectures will cover the
statistical methods used in searches for new phenomena in a particle
physics experiment. Statistical tests will be formally defined and
used to quantify the level of agreement between a specified model and
the observed data. Specifically, one tries to reject the Standard
Model in such a test, as this will indicate the discovery of something
new. Even in the absence of a discovery, we would like to say what
possible signal models one may exclude by setting limits on their
parameters. Several procedures for doing this will be discussed,
including CLs, Power-Constrained Limits (PCL), Bayesian, and
Feldman-Cousins methods. The lectures will focus on frequentist
methods, but the Bayesian approach will be addressed as well. In both
cases the role of systematic uncertainties will be
emphasized. Computer tutorials will provide a practical exposure to
the procedures covered in the lectures.
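
As a small illustration of one of the limit-setting procedures named above, here is a minimal Python sketch of CLs for a single counting experiment. It assumes n follows a Poisson distribution with mean s + b, with b known exactly, and takes CLs as the ratio P(n ≤ n_obs | s + b) / P(n ≤ n_obs | b). The function names and the example numbers are assumptions made for illustration; this is not the code used in the tutorials.

```python
import math


def poisson_cdf(n, mean):
    """P(k <= n) for a Poisson variable with the given mean."""
    if mean <= 0.0:
        return 1.0
    return sum(
        math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
        for k in range(n + 1)
    )


def cls(n_obs, s, b):
    """CLs for a counting experiment (hypothetical helper, known b)."""
    cl_sb = poisson_cdf(n_obs, s + b)  # probability of n <= n_obs under s + b
    cl_b = poisson_cdf(n_obs, b)       # probability of n <= n_obs under b only
    return cl_sb / cl_b


# Illustrative numbers only: a signal value s is excluded at 95% CL if CLs < 0.05.
n_obs, s, b = 0, 3.0, 5.0
print("CLs =", cls(n_obs, s, b), " excluded:", cls(n_obs, s, b) < 0.05)
```

Dividing by the background-only probability keeps the test from excluding signal models to which the experiment has little sensitivity; for n_obs = 0 the result reduces to exp(-s), independent of b.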

**Lecture Notes** (approx. by day and still evolving):

**Exercises:** Here are the statistics exercises used for the
London course (problem sheets 1 to 3 were on computing).
We will select from these:

For the computer exercises on Tuesday we will compute discovery and exclusion significances using the code here (1 July 2011, bug fixed in fitPar.cc to allow negative muHat). You can download everything from this tarball. There is a note that describes the mathematics behind the exercises, and more details will be given in the session.

**RooStats:**

We will also solve some problems using the RooStats package, which is based on RooFit and ROOT. Here are some useful links:

- The RooStats Wiki
- Information on RooFit
- The RooStats class definitions
- The RooFit pdf class definitions
- The RooFit core class definitions

The directory $ROOTSYS/tutorials/roostats contains many tutorials, e.g., IntervalExamples.C. The file IntervalExamples.cc shows the modifications needed to turn this macro into a standalone C++ program, which can be built with the makefile here (download the files and type gmake).

SimpleCount.C is a RooStats macro that illustrates the problem of observing n events assumed to follow a Poisson distribution with mean s + b. Here s is the expected number of signal events (the parameter of interest) and b is the expected number of background events. In the present version, b is treated as a constant, and the macro finds, for a given observed value of n, the p-value of the background-only hypothesis and also limits on the signal parameter s based on a two-sided test.
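
To make the first of these quantities concrete: the p-value of the background-only hypothesis (s = 0) is the probability to observe n_obs or more events when the mean is b. The Python sketch below is not the RooStats macro. The function names and the example numbers (n_obs = 5, b = 0.5) are assumptions chosen for illustration, and the conversion to a significance Z uses the one-sided Gaussian convention Z = Φ⁻¹(1 − p).

```python
import math
from statistics import NormalDist


def poisson_pmf(k, mean):
    """Poisson probability P(k; mean), evaluated in log space."""
    if mean <= 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def p_value_background_only(n_obs, b):
    """P(n >= n_obs | s = 0) for n ~ Poisson(b), with b known exactly."""
    return 1.0 - sum(poisson_pmf(k, b) for k in range(n_obs))


def significance(p):
    """Equivalent one-sided Gaussian significance Z for a p-value."""
    if p >= 1.0:
        return float("-inf")  # no excess at all
    return NormalDist().inv_cdf(1.0 - p)


# Illustrative numbers only.
p = p_value_background_only(5, 0.5)
print("p-value =", p, " Z =", significance(p))
```

The subtraction from 1 loses precision for very small p-values; for such cases one would sum the upper tail directly.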

SimpleCount2.C extends the previous example with a Monte Carlo calculation of the one-sided upper limit. The standalone C++ version of this macro is SimpleCount.cc, which can be built with this GNUmakefile.
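
SimpleCount2.C obtains the one-sided limit with Monte Carlo. For a single Poisson count with known b, a limit of the same kind can also be found from exact Poisson sums, which is what the following Python sketch does instead: the upper limit s_up is taken as the value of s for which P(n ≤ n_obs; s + b) = 1 − CL. The function names, the bisection solver, the 95% default, and the choice to return 0 when every s ≥ 0 is already excluded are all assumptions made for illustration.

```python
import math


def poisson_cdf(n, mean):
    """P(k <= n) for a Poisson variable with the given mean."""
    if mean <= 0.0:
        return 1.0
    return sum(
        math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
        for k in range(n + 1)
    )


def upper_limit(n_obs, b, cl=0.95):
    """Classical one-sided upper limit on s for n ~ Poisson(s + b), b known."""
    alpha = 1.0 - cl
    if poisson_cdf(n_obs, b) < alpha:
        return 0.0  # even s = 0 is excluded; clamp at the physical boundary
    lo, hi = 0.0, 10.0
    while poisson_cdf(n_obs, hi + b) > alpha:  # widen until the root is bracketed
        hi *= 2.0
    for _ in range(100):  # bisection on the monotonically falling CDF
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + b) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative numbers only.
print("95% CL upper limit on s:", upper_limit(3, 1.0))
```

A Monte Carlo version would replace `poisson_cdf` by the fraction of simulated experiments with n ≤ n_obs, which becomes necessary once the test statistic has no simple closed-form distribution.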

In the practical sessions we will extend this example to the case where b is not known exactly but is constrained by a measurement m, which is assumed to follow a Poisson distribution with mean tau*b, where the scale factor tau is a known constant.
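
For this extended model the likelihood is the product of two Poisson terms, L(s, b) = Pois(n; s + b) · Pois(m; τb). The Python sketch below evaluates it and computes a discovery significance from the profile likelihood ratio, using the asymptotic approximation Z ≈ √q0 in place of a full frequentist (toy Monte Carlo) calculation. The function names and the example numbers are assumptions made for illustration; the estimators used are ŝ = n − m/τ and b̂ = m/τ for the unconditional fit, and b̂ = (n + m)/(1 + τ) for the fit with s = 0.

```python
import math


def log_poisson(k, mean):
    """Logarithm of the Poisson probability P(k; mean)."""
    if mean <= 0.0:
        return 0.0 if k == 0 else -math.inf
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def log_likelihood(s, b, n, m, tau):
    """ln L(s, b) for n ~ Poisson(s + b) and m ~ Poisson(tau * b)."""
    return log_poisson(n, s + b) + log_poisson(m, tau * b)


def discovery_significance(n, m, tau):
    """Asymptotic Z = sqrt(q0) for testing s = 0 (hypothetical helper)."""
    s_hat = n - m / tau           # unconditional ML estimators
    b_hat = m / tau
    if s_hat <= 0.0:
        return 0.0                # no upward fluctuation: q0 is set to zero
    b_cond = (n + m) / (1.0 + tau)  # ML estimator of b when s is fixed to 0
    q0 = 2.0 * (log_likelihood(s_hat, b_hat, n, m, tau)
                - log_likelihood(0.0, b_cond, n, m, tau))
    return math.sqrt(max(q0, 0.0))


# Illustrative numbers only.
print("Z =", discovery_significance(n=20, m=10, tau=2.0))
```

For small event counts the asymptotic approximation can be poor, and the distribution of q0 would then have to be determined with Monte Carlo.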

The complete macro for computing upper limits according to the full frequentist procedure can be found here.

Some material has been adapted from a course for postgraduate students at the University of London. The complete set of lecture notes for that course plus other resources can be found here.

If we have time we may also look in more depth at multivariate methods, e.g., using the lectures here (shown earlier at CERN and the University of Mainz):

**Some books:**

- G. Cowan, Statistical Data Analysis, Clarendon Press, Oxford, 1998.
- R. J. Barlow, A Guide to the Use of Statistical Methods in the Physical Sciences, John Wiley, 1989.
- F. James, Statistical Methods in Experimental Physics, 2nd Edition, World Scientific, 2006.
- S. Brandt, Statistical and Computational Methods in Data Analysis, Springer, New York, 1998.
- L. Lyons, Statistics for Nuclear and Particle Physics, CUP, 1986.

You can also download the sections on probability, statistics, and Monte Carlo (pdf files) from the Review of Particle Physics by the Particle Data Group.
