## Statistical Methods for Discovery and Limits
## Workshop on Data Combination and Limit Setting 2011

Glen Cowan, Physics Department, Royal Holloway, University of London, e-mail: g.cowan@rhul.ac.uk

**Dates and times:** See here.

**Course description:** The series of lectures will cover the
statistical methods used in searches for new phenomena in a particle
physics experiment. Statistical tests will be formally defined and
used to quantify the level of agreement between a specified model and
the observed data. Specifically, one tries to reject the Standard
Model in such a test, as this will indicate the discovery of something
new. Even in the absence of a discovery, we would like to say what
possible signal models one may exclude by setting limits on their
parameters. Several procedures for doing this will be discussed,
including CLs, Power-Constrained Limits (PCL), Bayesian, and
Feldman-Cousins methods. The lectures will focus on frequentist
methods, but the Bayesian approach will be addressed as well. In both
cases the role of systematic uncertainties will be
emphasized. Computer tutorials will provide a practical exposure to
the procedures covered in the lectures.

**Lecture Notes**:

**Computer Exercises:** Some standalone C++ code to compute
discovery and exclusion significances is here. You can download everything
from this tarball. There is a note that
describes the mathematics behind the exercises, and more details will
be given in the session.

You can also try using the routine runSigCalc_MC.cc instead of runSigCalc.cc (edit the makefile to link the new one). This routine calculates the distribution of the test statistic qmu using Monte Carlo and from it finds the p-value pmu. By computing pmu as a function of mu one can find the value of mu where pmu = 5%, which gives the 95% confidence level upper limit.
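The Monte Carlo procedure can be sketched as follows. This is not the code from runSigCalc_MC.cc. It is a minimal illustration that assumes a single-bin counting experiment with n ~ Poisson(mu*s + b) and a known background b. The function names `qMu`, `pMuMC` and `upperLimitScan`, the fixed seed and the simple grid scan are all choices made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Test statistic q_mu for upper limits, assuming n ~ Poisson(mu*s + b) with b known.
// q_mu = -2 ln lambda(mu) if mu_hat <= mu, and 0 otherwise, with mu_hat = (n - b)/s.
double qMu(double mu, int n, double s, double b) {
    const double nu = mu * s + b;            // expected number of events under mu
    if (n >= nu) return 0.0;                 // upward fluctuation: no evidence against mu
    const double logTerm = (n > 0) ? n * std::log(n / nu) : 0.0;
    return 2.0 * (nu - n + logTerm);
}

// p-value of hypothesis mu: fraction of toy experiments with q_mu >= q_mu,obs.
double pMuMC(double mu, int nObs, double s, double b, int nToys, unsigned seed) {
    assert(nToys > 0 && mu * s + b > 0.0);
    std::mt19937 rng(seed);
    std::poisson_distribution<int> pois(mu * s + b);
    const double qObs = qMu(mu, nObs, s, b);
    int nAbove = 0;
    for (int i = 0; i < nToys; ++i) {
        if (qMu(mu, pois(rng), s, b) >= qObs) ++nAbove;
    }
    return static_cast<double>(nAbove) / nToys;
}

// Scan mu on a grid and return the first value with p_mu < alpha.
// Returns -1 if no such value is found up to muMax.
double upperLimitScan(int nObs, double s, double b, double alpha,
                      double muMax, double step, int nToys) {
    assert(step > 0.0);
    for (int i = 1; i * step <= muMax; ++i) {
        const double mu = i * step;
        if (pMuMC(mu, nObs, s, b, nToys, 12345u) < alpha) return mu;
    }
    return -1.0;
}
```

For nObs = 0, s = 1 and b = 0.5 the exact answer is pmu = exp(-(mu + 0.5)), which gives a quick check of the Monte Carlo estimate. The grid scan only brackets the limit to within one step; interpolating pmu between grid points would refine it.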

**RooStats:**
We may also have time to solve some
problems using the RooStats package, which
is based on RooFit and ROOT. Here are some useful links:

- The RooStats Wiki.
- Information on RooFit
- The RooStats class definitions
- The RooFit pdf class definitions
- The RooFit core class definitions

The directory $ROOTSYS/tutorials/roostats contains many tutorials, e.g., IntervalExamples.C. The file IntervalExamples.cc shows the modifications needed to make this a standalone C++ program, which can be built with the makefile here (download files and type gmake).

SimpleCount.C is a RooStats macro that illustrates the problem of observing n events assumed to follow a Poisson distribution with mean s + b. Here s is the expected number of signal events (the parameter of interest) and b is the expected number of background events. In the present version, b is treated as a constant, and the macro finds, for a given observed value of n, the p-value of the background-only hypothesis and also limits on the signal parameter s based on a two-sided test.
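For orientation, the p-value of the background-only hypothesis in this counting problem can also be computed without RooStats. The sketch below is independent of SimpleCount.C. It assumes b is known exactly and takes the p-value to be P(n >= n_obs | s = 0). The function names `poissonPValue` and `significance` are made up for this example, and the Gaussian significance Z is found by simple bisection because the C++ standard library has no inverse normal CDF.

```cpp
#include <cassert>
#include <cmath>

// p-value of the background-only hypothesis: P(n >= nObs) for n ~ Poisson(b).
// Computed as 1 - P(n < nObs), so very small p-values lose numerical precision.
double poissonPValue(int nObs, double b) {
    assert(nObs >= 0 && b > 0.0);
    double term = std::exp(-b);   // P(n = 0)
    double cdfBelow = 0.0;        // accumulates P(n < nObs)
    for (int k = 0; k < nObs; ++k) {
        cdfBelow += term;
        term *= b / (k + 1);      // P(n = k+1) from P(n = k)
    }
    return 1.0 - cdfBelow;
}

// Equivalent Gaussian significance Z = Phi^{-1}(1 - p), found by bisection
// on the upper-tail probability of the standard Gaussian.
double significance(double p) {
    assert(p > 0.0 && p < 1.0);
    double lo = -10.0, hi = 40.0;
    for (int i = 0; i < 200; ++i) {
        const double mid = 0.5 * (lo + hi);
        const double tail = 0.5 * std::erfc(mid / std::sqrt(2.0));
        if (tail > p) lo = mid; else hi = mid;   // tail decreases as Z grows
    }
    return 0.5 * (lo + hi);
}
```

A call such as `significance(poissonPValue(n, b))` then converts an observed count into a significance in units of standard deviations.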

SimpleCount2.C adds to the previous example a calculation of the one-sided upper limit by using Monte Carlo. The standalone C++ version of this macro is SimpleCount.cc, which can be built with this GNUmakefile.
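A useful cross-check of a Monte Carlo upper limit is the exact calculation, which is possible in this one-bin problem when b is fixed. The sketch below is not taken from SimpleCount2.C. It assumes a one-sided test based on n alone, so that the upper limit on s at confidence level 1 - alpha is the value for which P(n <= n_obs | s + b) = alpha. The names `poissonCdf` and `upperLimitExact` are chosen for this example.

```cpp
#include <cassert>
#include <cmath>

// Cumulative Poisson probability P(n <= nObs) for the given mean.
double poissonCdf(int nObs, double mean) {
    double term = std::exp(-mean);   // P(n = 0)
    double sum = term;
    for (int k = 1; k <= nObs; ++k) {
        term *= mean / k;            // P(n = k) from P(n = k-1)
        sum += term;
    }
    return sum;
}

// Upper limit on s: solve P(n <= nObs | s + b) = alpha by bisection.
// If even s = 0 is excluded (a strong downward fluctuation), this sketch returns 0.
double upperLimitExact(int nObs, double b, double alpha) {
    assert(nObs >= 0 && b >= 0.0 && alpha > 0.0 && alpha < 1.0);
    if (poissonCdf(nObs, b) <= alpha) return 0.0;
    double lo = 0.0, hi = 1.0;
    while (poissonCdf(nObs, hi + b) > alpha) hi *= 2.0;   // bracket the root
    for (int i = 0; i < 100; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (poissonCdf(nObs, mid + b) > alpha) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

For n_obs = 0 and b = 0 the result reduces to -ln(alpha), about 3.0 at alpha = 0.05. The case where the function returns 0 corresponds to excluding all values of s, which is one of the situations that CLs and power-constrained limits are designed to handle.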

In the practical sessions we will extend this example to the case where b is not known exactly but is constrained by a measurement m, which is assumed to follow a Poisson distribution with mean tau*b, where the scale factor tau is a known constant.
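Written out, the model described above corresponds to the following likelihood, assuming n and m are independent (the notation with a double hat for the conditional estimator is a choice made here):

$$
L(s, b) = \frac{(s+b)^{n}}{n!}\, e^{-(s+b)} \; \frac{(\tau b)^{m}}{m!}\, e^{-\tau b} .
$$

Setting $\partial \ln L / \partial b = 0$ for a fixed value of $s$ gives the quadratic equation $(1+\tau)\, b^{2} - \left[\, n + m - (1+\tau)\, s \,\right] b - m s = 0$, whose positive root is the conditional maximum-likelihood estimator of $b$:

$$
\hat{\hat{b}}(s) = \frac{n + m - (1+\tau)\, s + \sqrt{\left[\, n + m - (1+\tau)\, s \,\right]^{2} + 4\,(1+\tau)\, m\, s}}{2\,(1+\tau)} .
$$

For $s = 0$ this reduces to $\hat{\hat{b}}(0) = (n+m)/(1+\tau)$, while the unconditional estimators are $\hat{s} = n - m/\tau$ and $\hat{b} = m/\tau$.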

The complete macro for computing upper limits according to the full frequentist procedure can be found here.

Some material has been adapted from a course for postgraduate students at the University of London. The complete set of lecture notes for that course plus other resources can be found here.

Glen Cowan