Lecture 1: jet algorithms Lecture 2: jet substructure

MadGraph School
Shanghai - November 2015
Jets
Lecture 1: jet algorithms
Lecture 2: jet substructure
Matteo Cacciari
LPTHE Paris
Why jets
A jet is something that happens in high energy events:
a collimated bunch of hadrons flying roughly in the same direction
We could eyeball the collimated bunches, but it becomes impractical with millions of events
The classification of particles into jets is best done using a clustering algorithm
Matteo Cacciari - LPTHE
MadGraph School - Shanghai - October 2015
The pervasiveness of jets
‣ ATLAS and CMS have each published 400+ papers since 2010
‣ More than half of these papers make use of jets
‣ 60% of the search papers make use of jets
Plot by G. Salam (Source: INSPIRE. Results may vary when employing different search keywords)
Why are jets so important?
Taming reality
[Diagram: QCD predictions (multileg + PS) on one side, real data on the other, connected through jets]
One purpose of a ‘jet clustering’ algorithm is to reduce the complexity of the final state, simplifying many hadrons to simpler objects that one can hope to calculate
Jet clustering algorithm
A jet algorithm maps the momenta of the final state particles into the momenta of a certain number of jets:
{p_i} (particles, 4-momenta, calorimeter towers, ...) ➙ jet algorithm ➙ {j_k} (jets)
Most algorithms contain a resolution parameter, R, which controls the extension of the jet
Jet Definition
A jet algorithm + its parameters (e.g. R) + a recombination scheme = a Jet Definition
/// define a jet definition
JetDefinition jet_def(JetAlgorithm jet_algorithm,
double R,
RecombinationScheme rec_sch = E_scheme);
Two main classes of jet algorithms
‣ Sequential recombination algorithms
Bottom-up approach: combine particles starting from closest ones
How? Choose a distance measure, iterate recombination until few objects are left, call them jets
Works because of the mapping closeness ⇔ QCD divergence
Examples: Jade, kt, Cambridge/Aachen, anti-kt, ...
‣ Cone algorithms
Top-down approach: find coarse regions of energy flow
How? Find stable cones (i.e. cones whose axis coincides with the sum of momenta of the particles in them)
Works because QCD only modifies energy flow on small scales
Examples: JetClu, MidPoint, ATLAS cone, CMS cone, SISCone, ...
A little history
‣ Cone-type jets were introduced first in QCD in the 1970s (Sterman-Weinberg ’77)
‣ In the 1980s cone-type jets were adapted for use in hadron colliders (Spp̄S, Tevatron...) ➙ iterative cone algorithms
‣ LEP was a golden era for jets: new algorithms and many relevant calculations during the 1990s
  ‣ Introduction of the ‘theory-friendly’ kt algorithm
    ‣ sequential recombination type algorithm, IRC safe
    ‣ it allows for all-order resummation of jet rates
  ‣ Several accurate calculations in perturbative QCD of jet properties: rates, jet mass, thrust, ...
e+e− kt (Durham) algorithm
[Catani, Dokshitzer, Olsson, Turnock, Webber ’91]
Distance: $y_{ij} = \frac{2 \min(E_i^2, E_j^2)\,(1 - \cos\theta_{ij})}{Q^2}$ (Q: total energy of the event)
In the collinear limit, the numerator reduces to the relative transverse momentum (squared) of the two particles, hence the name of the algorithm
‣ Find the minimum ymin of all yij
‣ If ymin is below some jet resolution threshold ycut, recombine i and j into a single new particle (‘pseudojet’), and repeat
‣ If no yij < ycut is left, all remaining particles are jets
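The distance and the recombination test used in the steps above can be sketched in a few lines. This is a minimal illustration, assuming the standard Durham form y_ij = 2 min(E_i², E_j²)(1 − cos θ_ij)/Q²; the function names `durham_yij` and `should_recombine` are hypothetical and are not part of any library.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Durham (e+e- kt) distance between two particles.
// Ei, Ej: particle energies; cos_theta: cosine of their opening angle;
// Q: total energy of the event (assumed normalisation).
double durham_yij(double Ei, double Ej, double cos_theta, double Q) {
  assert(Q > 0.0);
  const double e_min = std::min(Ei, Ej);
  return 2.0 * e_min * e_min * (1.0 - cos_theta) / (Q * Q);
}

// A pair is recombined only while its distance is below the resolution ycut.
bool should_recombine(double yij, double ycut) { return yij < ycut; }
```

For small angles 2(1 − cos θ) ≈ θ², so the numerator becomes (E_min θ)², the squared relative transverse momentum mentioned above.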
e+e− kt (Durham) algorithm in action
Characterise events in terms of number of jets (as a function of ycut)
[Figure: panels labelled 2-jet, 3-jet, 4-jet, 5-jet]
Resummed calculations for distributions of ycut are doable with the kt algorithm
e+e− kt (Durham) algorithm v. QCD
kt is a sequential recombination type algorithm
One key feature of the kt algorithm is its relation to the structure of QCD divergences:
The yij distance is the inverse of the emission probability
‣ The kt algorithm roughly inverts the QCD branching sequence
(the pair which is recombined first is the one with the largest
probability to have branched)
‣ The history of successive clusterings has physical meaning
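The relation stated in these bullets can be made explicit with the usual soft and collinear approximation for a branching k → ij. The formula is not reproduced in this transcript, so the notation below (E for energies, θ_ij for the opening angle, Q for the total energy) is assumed rather than taken from the slide:

```latex
\frac{dP_{k\to ij}}{dE_i\, d\theta_{ij}} \sim \frac{\alpha_s}{\min(E_i, E_j)\,\theta_{ij}},
\qquad
y_{ij} \simeq \frac{\min(E_i^2, E_j^2)\,\theta_{ij}^2}{Q^2}
\quad (\theta_{ij} \ll 1)
```

Pairs with small y_ij are those with small min(E_i, E_j) θ_ij, which is where the emission probability is largest. Recombining the smallest y_ij first therefore roughly undoes the most likely branching first.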
Jet challenges at the LHC
The LHC environment differs from the LEP one (and even the Tevatron) in many respects
‣ Number of final state particles much larger (order 10^3)
  ➙ needs a fast algorithm
‣ Many higher order calculations (NLO, NNLO) available
  ➙ needs an IRC-safe algorithm
‣ Presence of background (underlying event and pileup)
  ➙ needs small/known susceptibility and/or ability to subtract background
‣ Jets often initiated by a large-momentum heavy particle
  ➙ needs capability to distinguish a boosted-object jet from a QCD jet
IRC safety
An observable is infrared and collinear safe if,
in the limit of a collinear splitting, or the emission of an
infinitely soft particle, the observable remains unchanged:
$O(X; p_1, \ldots, p_n, p_{n+1} \to 0) \to O(X; p_1, \ldots, p_n)$
$O(X; p_1, \ldots, p_n \parallel p_{n+1}) \to O(X; p_1, \ldots, p_n + p_{n+1})$
This property ensures cancellation of real and virtual divergences
in higher order calculations
If we wish to be able to calculate a jet rate in perturbative QCD
the jet algorithm that we use must be IRC safe:
soft emissions and collinear splittings must not change the hard jets
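The collinear half of this definition can be illustrated with two toy observables. The sketch below is an assumption-laden example, not taken from the lecture: `Particle`, `scalar_pt_sum`, `multiplicity` and `collinear_split` are hypothetical names. The scalar sum of transverse momenta is unchanged by an exactly collinear splitting, while the particle multiplicity is not, so only the former is collinear safe.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Particle { double px, py, pz, E; };

// Collinear-safe example: scalar sum of transverse momenta.
double scalar_pt_sum(const std::vector<Particle>& ps) {
  double sum = 0.0;
  for (const auto& p : ps) sum += std::hypot(p.px, p.py);
  return sum;
}

// Collinear-unsafe example: number of particles.
std::size_t multiplicity(const std::vector<Particle>& ps) { return ps.size(); }

// Replace particle `index` by two exactly collinear fragments carrying
// momentum fractions z and 1 - z of the parent.
std::vector<Particle> collinear_split(std::vector<Particle> ps,
                                      std::size_t index, double z) {
  assert(index < ps.size() && z > 0.0 && z < 1.0);
  const Particle p = ps[index];
  ps[index] = {z * p.px, z * p.py, z * p.pz, z * p.E};
  ps.push_back({(1.0 - z) * p.px, (1.0 - z) * p.py, (1.0 - z) * p.pz,
                (1.0 - z) * p.E});
  return ps;
}
```

Applying `collinear_split` to any event leaves `scalar_pt_sum` unchanged up to rounding but increases `multiplicity` by one.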
Janus Jets
From Wikipedia:
In ancient Roman religion and myth, Janus (Latin: Ianus, pronounced [ˈiaː.nus])
is the god of beginnings and transitions, and thereby of gates, doors, passages,
endings and time. He is usually depicted as having two faces, since he looks to
the future and to the past. The Romans named the month of January (Ianuarius)
in his honor.
Like Janus, jets can serve two purposes:
‣ Observables: to be defined, calculated, measured
‣ Tools: to be used to extract specific properties of the final state
Different clustering algorithms have different properties and characteristics
that can make them more or less appropriate for each of these tasks
Jets as tools
‣ Background characterisation and subtraction
‣ Mass reconstruction
‣ Remove soft contamination from a hard jet
‣ Tag heavy objects originating the jet
(Boosted) jet studies at the LHC
Lily Asquith, summary talk at BOOST 2015
Essentially none of these tools existed as recently as seven years ago
Glossary
What | i.e. | When | Ref.
AKT | Anti-kt algorithm | 2008 | 0802.1189
CA | Cambridge/Aachen algorithm | 1999 | 9907280
BDRS | mass-drop tagger, includes filtering | 2008 | 0802.2470
trimmed | Trimming, tagger/groomer | 2009 | 0912.1342
pruned | Pruning, tagger/groomer | 2009 | 0903.5081
HTT | HepTopTagger | 2009 | 0910.5472
N-subjettiness | jet shape function, used in tagging | 2010 | 1011.2268
WTA | Winner-Take-All (recombination scheme) | 2013 | 1310.7584
one-pass | choice of axis for N-subjettiness | 2010 |
JVT | Jet Vertex Tagger (used in pileup subtr.) | 2014 |
ρ | background density (used in pileup subtr.) | 2007 | 0707.1378
D2 | jet shape function, used in tagging | 2014 | 1409.6298
PUPPI | particle-by-particle pileup subtr. | 2014 | 1407.6013
Soft Drop | tagger/groomer | 2014 | 1402.2657
Outline
Lecture 1
‣ Algorithms, speed
‣ Infrared and collinear safety
‣ Background (pileup)
Lecture 2
‣ Substructure
The common wisdom circa 2005
‣ Cone algorithms are IRC unsafe
  ➙ because, to make them reasonably fast, they were usually implemented via approximate methods using seeds
‣ Sequential recombination algorithms (i.e. kt) are slow and too susceptible to background contamination
  ➙ because they scale like N^3
  ➙ because they tend to collect soft particles up to large distances from centre
  ➙ because they were often run with R=1 and compared to cones with R=0.5!
Geometry
The solution to the speed problem came from considering the clustering problem from a geometrical rather than from a combinatorial point of view
‣ Sequential recombination algorithms could be implemented with O(N^2) or even O(N ln N) complexity rather than O(N^3)
[MC, Salam, 2006]
‣ Cone algorithms could be implemented exactly (and therefore made IRC safe) with O(N^2 ln N) rather than O(N 2^N) complexity
[Salam, Soyez 2007]
The kt algorithm and its siblings
$d_{ij} = \min(k_{ti}^{2p}, k_{tj}^{2p})\, \frac{\Delta y^2 + \Delta\phi^2}{R^2}, \qquad d_{iB} = k_{ti}^{2p}$
p = 1: kt algorithm
S. Catani, Y. Dokshitzer, M. Seymour and B. Webber, Nucl. Phys. B406 (1993) 187
S.D. Ellis and D.E. Soper, Phys. Rev. D48 (1993) 3160
p = 0: Cambridge/Aachen algorithm
Y. Dokshitzer, G. Leder, S. Moretti and B. Webber, JHEP 08 (1997) 001
M. Wobisch and T. Wengler, hep-ph/9907280
p = -1: anti-kt algorithm
MC, G. Salam and G. Soyez, arXiv:0802.1189
NB: in anti-kt, pairs with a hard particle will cluster first: if no other hard particles are close by, the algorithm will give perfect cones
Quite ironically, a sequential recombination algorithm is the ‘perfect’ cone algorithm
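The generalised-kt distances for the three values of p can be written as one pair of functions. This is a stand-alone sketch that does not use FastJet; the struct `PtYPhi` and the functions `delta_r2`, `dij` and `diB` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Transverse momentum, rapidity and azimuth of a particle (pt > 0 assumed).
struct PtYPhi { double pt, y, phi; };

// Squared distance in the rapidity-azimuth plane, with the azimuthal
// difference wrapped into [-pi, pi].
double delta_r2(const PtYPhi& a, const PtYPhi& b) {
  const double two_pi = 2.0 * std::acos(-1.0);
  const double dy = a.y - b.y;
  const double dphi = std::remainder(a.phi - b.phi, two_pi);
  return dy * dy + dphi * dphi;
}

// Pairwise distance: p = 1 (kt), p = 0 (Cambridge/Aachen), p = -1 (anti-kt).
double dij(const PtYPhi& a, const PtYPhi& b, double p, double R) {
  assert(R > 0.0 && a.pt > 0.0 && b.pt > 0.0);
  const double wa = std::pow(a.pt, 2.0 * p);
  const double wb = std::pow(b.pt, 2.0 * p);
  return std::min(wa, wb) * delta_r2(a, b) / (R * R);
}

// Beam distance.
double diB(const PtYPhi& a, double p) { return std::pow(a.pt, 2.0 * p); }
```

For a hard particle (pt = 100) and a soft one (pt = 1) separated by ΔR = 0.5 with R = 1, the pair distance is set by the soft particle for p = 1 (0.25) and by the hard one for p = -1 (2.5e-5), which is why anti-kt clusters around hard particles first.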
IRC safety of generalised-kt algorithms
$d_{ij} = \min(k_{ti}^{2p}, k_{tj}^{2p})\, \frac{\Delta y^2 + \Delta\phi^2}{R^2}, \qquad d_{iB} = k_{ti}^{2p}$
p > 0
  New soft particle (kt → 0) means that d → 0 ➙ clustered first, no effect on jets
  New collinear particle (Δy² + Δφ² → 0) means that d → 0 ➙ clustered first, no effect on jets
p = 0
  New soft particle (kt → 0) can be new jet of zero momentum ➙ no effect on hard jets
  New collinear particle (Δy² + Δφ² → 0) means that d → 0 ➙ clustered first, no effect on jets
p < 0
  New soft particle (kt → 0) means d → ∞ ➙ clustered last or new zero-jet, no effect on hard jets
  New collinear particle (Δy² + Δφ² → 0) means that d → 0 ➙ clustered first, no effect on jets
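The soft-particle argument can be checked numerically with a naive clustering loop. The sketch below is a deliberately simple O(N^3) implementation of inclusive generalised-kt clustering with four-momentum addition as recombination; it is not how FastJet implements it, and `Pseudo`, `pt2`, `rap`, `phi` and `cluster` are hypothetical names. Inputs are assumed to have pt > 0 and E > |pz|.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Minimal four-momentum; pt > 0 and E > |pz| are assumed for every input.
struct Pseudo { double px, py, pz, E; };

double pt2(const Pseudo& p) { return p.px * p.px + p.py * p.py; }
double rap(const Pseudo& p) { return 0.5 * std::log((p.E + p.pz) / (p.E - p.pz)); }
double phi(const Pseudo& p) { return std::atan2(p.py, p.px); }

// Naive inclusive generalised-kt clustering. Every pass scans all pairs,
// so the total cost grows like N^3.
std::vector<Pseudo> cluster(std::vector<Pseudo> ps, double p, double R) {
  assert(R > 0.0);
  const double two_pi = 2.0 * std::acos(-1.0);
  std::vector<Pseudo> jets;
  while (!ps.empty()) {
    double dmin = std::numeric_limits<double>::infinity();
    std::size_t bi = 0, bj = 0;  // bi == bj flags a beam distance as the minimum
    for (std::size_t i = 0; i < ps.size(); ++i) {
      const double wi = std::pow(pt2(ps[i]), p);  // kt_i^(2p), i.e. d_iB
      if (wi < dmin) { dmin = wi; bi = bj = i; }
      for (std::size_t j = i + 1; j < ps.size(); ++j) {
        const double wj = std::pow(pt2(ps[j]), p);
        const double dy = rap(ps[i]) - rap(ps[j]);
        const double dphi = std::remainder(phi(ps[i]) - phi(ps[j]), two_pi);
        const double d = std::min(wi, wj) * (dy * dy + dphi * dphi) / (R * R);
        if (d < dmin) { dmin = d; bi = i; bj = j; }
      }
    }
    if (bi == bj) {
      jets.push_back(ps[bi]);  // d_iB is smallest: i becomes a final jet
    } else {
      // d_ij is smallest: merge j into i by adding four-momenta
      ps[bi] = {ps[bi].px + ps[bj].px, ps[bi].py + ps[bj].py,
                ps[bi].pz + ps[bj].pz, ps[bi].E + ps[bj].E};
    }
    ps.erase(ps.begin() + static_cast<std::ptrdiff_t>(bj));
  }
  return jets;
}
```

Clustering two back-to-back hard particles with and without an extra very soft particle near one of them gives two jets in both cases for p = 1, 0 and -1, with jet transverse momenta that differ only by the soft particle's momentum.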
IRC safe algorithms
Algorithm | Type | Distance / definition | Property | Reference | Speed
kt | SR | dij = min(kti^2, ktj^2) ΔRij^2/R^2 | hierarchical in rel. pt | Catani et al ‘91; Ellis, Soper ‘93 | N ln N
Cambridge/Aachen | SR | dij = ΔRij^2/R^2 | hierarchical in angle | Dokshitzer et al ‘97; Wengler, Wobisch ‘98 | N ln N
anti-kt | SR | dij = min(kti^-2, ktj^-2) ΔRij^2/R^2 | gives perfectly conical hard jets | MC, Salam, Soyez ’08 (Delsart, Loch) | N^(3/2)
SISCone | Seedless iterative cone with split-merge | | gives ‘economical’ jets | Salam, Soyez ‘07 | N^2 ln N
‘second-generation’ algorithms
All are available in FastJet, http://fastjet.fr
(As well as many IRC unsafe ones)
Jet clustering in FastJet
/// define a jet definition
JetDefinition jet_def(JetAlgorithm jet_algorithm,
double R,
RecombinationScheme rec_sch = E_scheme);
jet_algorithm can be any one of the four IRC safe algorithms, or also
most of the old IRC-unsafe ones, for legacy purposes
/// create a ClusterSequence, extract the jets
ClusterSequence cs(input_particles, jet_def);
vector<PseudoJet> jets = sorted_by_pt(cs.inclusive_jets());
...
// pt of hardest jet
double pt_hardest = jets[0].pt();
...
// constituents of hardest jet
vector<PseudoJet> constits = jets[0].constituents();
FastJet speed
Time needed to cluster an event with N particles
[Figure: clustering time (s), from 10^-7 to 10^2, versus N, from 1 to 10^7, for kt, Cambridge/Aachen, anti-kt and SISCone; Intel i5 760, FastJet 3.0.1, R=0.7; marked regions: hadron collisions, hadron collisions with pileup, PbPb collisions]
Hard jets and background
In a realistic set-up, underlying event (UE) and pile-up (PU) from multiple collisions produce many soft particles which can ‘contaminate’ the hard jet
Pileup
[Figure: 78-vertex event from CMS, https://cds.cern.ch/record/1479324]
Pileup can deposit several tens of GeV (or even hundreds, in a heavy ion collision) into a medium-sized jet
Hard jets and background
How are the hard jets modified by the background?
‣ Susceptibility (how much bkgd gets picked up) ➙ Jet areas
‣ Resiliency (how much the original jet changes) ➙ Backreaction
Jet areas
A jet’s area is defined as the extent of the region where infinitesimally soft particles get clustered into the jet
In more detail, a jet’s active area is the extent of the region where a distribution of infinitesimally soft particles, that can also cluster among themselves, is clustered into the jet
A jet’s active area measures a jet’s sensitivity to contamination from soft particles like underlying event and pileup
Jets do not necessarily have cone-shaped profiles
[Figure: jet areas in the same event for kt, Cam/Aachen, SISCone and anti-kt]
Resiliency: backreaction
“How (much) a jet changes when immersed in a background”
[Figure: the same jet without background and with background, illustrating backreaction loss and backreaction gain]
Resiliency: backreaction
[Figure: 1/N dN/dpt (GeV^-1) versus pt(B) (GeV), from -20 to 15, showing pT loss and pT gain for SISCone (f=0.75), Cam/Aachen, kt and anti-kt; Pythia 6.4, LHC (high lumi), 2 hardest jets, pt,jet > 1 TeV, |y| < 2, R = 1]
Anti-kt jets are much more resilient to changes from background immersion
(NB. Backreaction is a minimal issue in pp background and at large pt.
Can be much more important in Heavy Ion collisions)
Anti-kt jets and background
Anti-kt jets maximise resiliency, and their regular shapes make them easier to correct for detector-related effects
Default choice of all LHC collaborations
The IRC safe algorithms
Algorithm | Speed | Regularity | UE/pileup contamination | Backreaction | Hierarchical substructure
kt | ☺☺☺ | ☂ | ☂☂ | ☁☁ | ☺☺
Cambridge/Aachen | ☺☺☺ | ☂ | ☂ | ☁☁ | ☺☺☺
anti-kt | ☺☺☺ | ☺☺ | ☁/☺ | ☺☺ | ✘
SISCone | ☺ | ☁ | ☺☺ | ☁ | ✘
Array of tools with different characteristics.
Pick the right one for the job
Hard jets and background
Modifications of the hard jet
$\Delta p_t = \rho A \pm \left(\sigma\sqrt{A} + \sigma_\rho A + \rho\sqrt{\langle A^2\rangle - \langle A\rangle^2}\right) + \Delta p_t^{BR}$
ρ: background momentum density (per unit area)
ρA and its fluctuations: background ‘susceptibility’
Δp_t^BR: back-reaction ‘resiliency’
Background subtraction
Two main approaches to background subtraction
‣ Jet-based
  ‣ Cluster the full event, determine the event-specific (ρ) and jet-specific (A) quantities, and subtract the relevant contamination from a given observable
  ‣ Pros: largely unbiased subtraction
  ‣ Cons: slow, potentially larg(er) residual uncertainty
  ‣ Examples: ‘jet area/median’ in FastJet, GenericSubtractor for jet shapes, JetFFMoments for fragmentation functions, ...
‣ Particle-based
  ‣ Produce a reduced event by dropping some of the particles. Cluster this reduced event, and calculate the observables from it
  ‣ Pros: fast, often small(er) residual uncertainty
  ‣ Cons: not natively unbiased, can depend on choice of parameters
  ‣ Examples: ConstituentSubtractor, SoftKiller, PUPPI, ...
Background subtraction: jet-based
Correction of a jet transverse momentum
$p_T^{\text{hard jet, corrected}} = p_T^{\text{hard jet, raw}} - \rho \times \text{Area}_{\text{hard jet}}$
MC, Salam, 0707.1378
If ρ is measured on an event-by-event basis, and each jet subtracted individually, this procedure
will remove many fluctuations and generally improve the resolution of, say, a mass peak
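$\Delta p_t = \rho A \pm \left(\sigma\sqrt{A} + \sigma_\rho A + \rho\sqrt{\langle A^2\rangle - \langle A\rangle^2}\right) + \Delta p_t^{BR}$
Irreducible fluctuations (the ± term): uncertainty of the subtraction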
Needs two ingredients: ρ and Ajet
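Given the two ingredients, the correction itself is a one-line formula. The sketch below assumes that ρ is estimated as the median of pt/area over the jets of the event, in the spirit of the ‘jet area/median’ approach named earlier; it is a stand-alone illustration, not FastJet's implementation, and `JetSummary`, `median_rho` and `subtract_pt` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Transverse momentum and area of one jet (area > 0 assumed).
struct JetSummary { double pt, area; };

// Event-by-event estimate of rho: median of pt/area over all jets.
// The median is barely affected by the few hard jets in the event.
double median_rho(const std::vector<JetSummary>& jets) {
  assert(!jets.empty());
  std::vector<double> density;
  for (const auto& j : jets) {
    assert(j.area > 0.0);
    density.push_back(j.pt / j.area);
  }
  std::sort(density.begin(), density.end());
  const std::size_t n = density.size();
  return n % 2 ? density[n / 2] : 0.5 * (density[n / 2 - 1] + density[n / 2]);
}

// pt(corrected) = pt(raw) - rho * area
double subtract_pt(const JetSummary& jet, double rho) {
  return jet.pt - rho * jet.area;
}
```

With one hard jet of 120 GeV and four soft jets, all of area 0.5, whose densities are 8, 10, 11 and 12 GeV per unit area, the median is 11 and the hard jet is corrected to 114.5 GeV; the hard jet itself does not pull the estimate of ρ upwards.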
Jet areas
Definition of a specific jet area
and its calculation for each jet in FastJet
/// specify where to place the ghosts, their area, how many times
/// to repeat the clustering
double rapmax = 2.5;
int nrepeat = 1;
double ghost_area = 0.01;
GhostedAreaSpec gas(rapmax, nrepeat, ghost_area);
/// use this configuration to define the area
/// A sensible default for gas is provided
AreaDefinition area_def(active_area, gas);
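/// construct cluster sequence with areas
ClusterSequenceArea csa(input_particles, jet_def, area_def);
vector<PseudoJet> jets = sorted_by_pt(csa.inclusive_jets());
....
// area of the hardest jet
double area_hardest = jets[0].area();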
Background estimation and subtraction
// constructor for a background estimator
JetMedianBackgroundEstimator bge(Selector sel,
JetDefinition jet_def,
AreaDefinition area_def);
// an alternative (faster) background estimator
//GridMedianBackgroundEstimator bge(Selector sel, grid_step);
bge.set_particles(input_particles);
....
// extract rho estimation
double rho = bge.rho(jet);
// define a subtractor
Subtractor sub(&bge);
// apply it to a jet (or a vector of jets)
PseudoJet subtracted_jet = sub(jet);
Numerical jet shape correction
Soyez et al. 1211.2811
A generic jet shape
(a function of the momenta of all
constituents of a jet) is modified
by the addition of pileup
Correct it by calculating numerically the derivatives that enter its Taylor
expansion and subtracting (this generalises the jet area/median subtraction for transverse mom.)
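The expansion is not reproduced in this transcript. A sketch of its structure, in notation that is assumed here and may differ from the cited paper (V is the shape, ρ the pileup density, p_{t,g} the transverse momentum scale of the ghosts, A_g the area of one ghost, and any dependence on particle masses is neglected):

```latex
V_{\text{jet,sub}} = V_{\text{jet}} - \rho\, V_{\text{jet}}^{[1]}
  + \frac{1}{2}\rho^2\, V_{\text{jet}}^{[2]} + \cdots,
\qquad
V_{\text{jet}}^{[k]} \equiv A_g^k
  \left.\frac{\partial^k V_{\text{jet}}(p_{t,g})}{\partial p_{t,g}^k}\right|_{p_{t,g}=0}
```

For the transverse momentum itself the first derivative is the jet area and higher derivatives vanish, so the expansion reduces to the area/median correction pt − ρA.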
[Formula annotations: pileup momentum density; numerical derivative w.r.t. ghosts' momenta]
Shape subtraction
Pileup subtraction from jet shapes using
GenericSubtractor from fjcontrib
#include "ExampleShapes.hh"
#include "GenericSubtractor.hh"
// define a specific jet shape
contrib::Angularity ang(1.0); // angularity with alpha=1.0
// define a generic subtractor
// ... construct a background estimator bge_rho ...
contrib::GenericSubtractor gen_sub(&bge_rho);
// compute the subtracted shape
double subtracted_ang = gen_sub(ang, jet);
Particle-level pileup removal
Pileup removal using SoftKiller from fjcontrib
(SoftKiller progressively removes soft particles until ρ of event is zero)
#include "SoftKiller.hh"
// define SoftKiller
double grid_size = 0.4;
contrib::SoftKiller soft_killer(rapmax, grid_size);
// apply it to the full event
double pt_thresh; // receives the pt threshold below which particles were killed
vector<PseudoJet> soft_killed_event;
soft_killer.apply(full_event, soft_killed_event, pt_thresh);
// proceed with clustering and calculating shapes with
// the reduced soft_killed_event
ClusterSequence cs(soft_killed_event, .....);
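To make the idea behind the threshold concrete, here is a stand-alone sketch that does not use fjcontrib. It assumes the threshold is the median, over the cells of a rapidity-azimuth grid, of the hardest pt found in each cell (empty cells count as zero), so that after the cut at least half of the cells are empty and the median density vanishes. `Hit`, `soft_killer_threshold` and `apply_soft_killer` are hypothetical names, and removing particles exactly at the threshold is a choice made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Particle kinematics; phi is assumed to lie in [0, 2*pi).
struct Hit { double pt, y, phi; };

// Median over grid cells of the hardest pt in each cell.
double soft_killer_threshold(const std::vector<Hit>& event, double rapmax,
                             double grid_size) {
  assert(rapmax > 0.0 && grid_size > 0.0);
  const double two_pi = 2.0 * std::acos(-1.0);
  const std::size_t ny = static_cast<std::size_t>(
      std::max(1.0, std::round(2.0 * rapmax / grid_size)));
  const std::size_t nphi = static_cast<std::size_t>(
      std::max(1.0, std::round(two_pi / grid_size)));
  std::vector<double> cell_max(ny * nphi, 0.0);
  for (const auto& h : event) {
    if (std::fabs(h.y) >= rapmax) continue;  // outside the grid
    const std::size_t iy = std::min(
        ny - 1, static_cast<std::size_t>((h.y + rapmax) / (2.0 * rapmax) * ny));
    const std::size_t ip = std::min(
        nphi - 1, static_cast<std::size_t>(h.phi / two_pi * nphi));
    double& m = cell_max[iy * nphi + ip];
    m = std::max(m, h.pt);
  }
  std::sort(cell_max.begin(), cell_max.end());
  const std::size_t n = cell_max.size();
  return n % 2 ? cell_max[n / 2] : 0.5 * (cell_max[n / 2 - 1] + cell_max[n / 2]);
}

// Keep only particles strictly above the threshold.
std::vector<Hit> apply_soft_killer(const std::vector<Hit>& event,
                                   double pt_threshold) {
  std::vector<Hit> kept;
  for (const auto& h : event)
    if (h.pt > pt_threshold) kept.push_back(h);
  return kept;
}
```

The reduced event returned by `apply_soft_killer` would then be passed to the clustering step in place of the full event.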
Comparisons of pileup subtractions
from the CERN pileup mitigation workshop, https://indico.cern.ch/event/306155/
[Figure (PRELIMINARY): dispersion σΔO (GeV), from 0 to 3, versus bias <ΔO> (GeV), from -2 to 2; CHS event, jet m, pt=020; anti-kt R=0.4; 30, 60, 100, 140 PU; methods compared: areasub, constitsub, safeareasub, soft killer, puppi, safenpcsub, lin cl. (Rsub=0.3); smaller dispersion and smaller bias are better]
Recap of lecture 1
‣ A number of different IRC-safe jet algorithms exist
‣ They all try to be good proxies for hard partons, but they have
different characteristics, especially with respect to soft particles
‣ Jets from all algorithms inevitably suffer from pileup contamination
‣ Techniques exist to subtract it, either at jet-level, or at particle-level
‣ Both the jet algorithms and many pileup subtraction techniques are packaged either in FastJet or in fjcontrib contributions
‣ Use of standard algorithms and packages (either directly or through interfaces) should be preferred, as it ensures reproducibility
http://fastjet.fr
http://fastjet.hepforge.org/contrib/