Ph.D. thesis
Università degli Studi di Milano-Bicocca
Facoltà di Scienze Matematiche Fisiche e Naturali
Scuola di Dottorato di Scienze
Corso di Dottorato di Ricerca in Fisica e Astronomia

CMS Tracking Performance and Sensitivity for the MSSM A → ττ → eµE̸_T decay

Coordinator: Prof. Claudio Destri
Tutor: Dott. Sandra Malvezzi
Ph.D. thesis of Giuseppe B. Cerati
Student ID (Matricola) 700084
XXI cycle, Academic Year 2007-2008

Contents

Introduction
1 The CMS Experiment
   1.1 The LHC accelerator
   1.2 The CMS detector
2 Tracking in CMS
   2.1 Overview of Tracking in CMS
   2.2 Definition of Track
   2.3 Combinatorial Kalman Filter
   2.4 Road Search
   2.5 Cosmic Track Finder
   2.6 The Final Track Fit
3 Tracking Validation Tools in CMSSW
   3.1 Overview
   3.2 Tracking Performance with Simulated Data
   3.3 Implementation
4 Outlier Rejection in the Final Track Fit
   4.1 Implementation of the Algorithm
   4.2 Characterization of the Rejected Hits
   4.3 Impact on Tracking Performance
   4.4 Conclusions
5 Tracking Efficiency with Cosmic Data during Commissioning
   5.1 CRUZET
   5.2 Efficiency Estimate using Tracker Data only
   5.3 Results
6 Summary on Tracking
7 Motivations for the Minimal Supersymmetric Standard Model
   7.1 Open Problems of the Standard Model
   7.2 A brief introduction to Supersymmetry and MSSM
   7.3 Higgs particles in the MSSM
8 Search for the Heavy Neutral CP-odd Higgs Boson A
   8.1 Exclusion limits from LEP and CDF
   8.2 Production at LHC
9 Sensitivity for the MSSM A → ττ → eµE̸_T decay in CMS
   9.1 Introduction
   9.2 Event reconstruction
   9.3 Signal and Background samples
   9.4 Crucial issues
   9.5 Event selection
   9.6 Results
Conclusions
Bibliography
Acknowledgements

Introduction

By the end of the 20th century, high energy physicists had developed the Standard Model (SM), a powerful theory capable of describing with great precision the behavior of the fundamental particles (matter constituents and force carriers) from the MeV scale up to the order of 100 GeV. The predictions of the Standard Model have been successfully tested for about thirty years at many colliders, such as LEP, HERA and the Tevatron.
In particular, the mass and width of the Z and W bosons were measured at LEP, the parton distribution functions in the proton were studied at HERA, while at the Tevatron the last piece of the fermionic picture of the SM was discovered: the top quark.

Besides explaining the dynamics and the interactions of the elementary particles, the SM has raised some more fundamental problems that are being addressed by the experiments of the 21st century. The most important questions are what the origin of the particles' mass is, what the unobserved part of the universe is made of, why matter has prevailed over antimatter, what happened during the Big Bang, whether there are hidden dimensions and whether there is a symmetry between fermions and bosons.

The puzzle on which the largest number of physicists have been working is the origin of mass. According to the Higgs theory, particles are massive because of their interaction with a new field, called the Higgs field. In the simplest model of the Higgs theory just one new particle is foreseen, the Higgs boson. A more complex Higgs particle spectrum is predicted by other theories that extend the Standard Model, such as the Minimal Supersymmetric Standard Model (MSSM).

The search for MSSM Higgs bosons is an extremely important topic, because their discovery would not only address the question of the origin of mass, but would also reveal the presence of a fermion-boson symmetry (SUSY) in nature. Furthermore, as the MSSM predicts the existence of a weakly interacting massive particle called the neutralino, it would give a clue about the nature of dark matter.

All these issues will be investigated at the LHC, the Large Hadron Collider built in the LEP tunnel, where proton beams will collide at the unprecedented center of mass energy of 14 TeV and at a luminosity of 10³⁴ cm⁻² s⁻¹. The experiments built at the LHC are ATLAS, CMS, LHC-b and ALICE.
LHC-b and ALICE are dedicated to the physics of the b quark and to the study of heavy ion collisions respectively. Their research focuses on the matter-antimatter asymmetry and on a new state of matter, the quark-gluon plasma. ATLAS and CMS, the so-called "general purpose" experiments, will look for answers to all the open questions in high energy physics and, among many new particles, will seek the MSSM Higgs bosons.

ATLAS and CMS have different designs and different approaches, mostly due to the choice of the magnetic field: toroidal for ATLAS (plus a small inner solenoidal field), solenoidal for CMS. The CMS solenoid is the largest ever built, reaching a field of almost 4 T and containing both the tracker and the calorimeters. The tracker, being the biggest all-silicon detector in the world, is expected to be extremely efficient and precise even in a very hostile environment such as that of the LHC collisions. The identification and the precise measurement of leptons and conversion photons, the tagging of b and τ jets, the primary and secondary vertex reconstruction and the track-based correction to calorimeter measurements are the key features necessary to achieve the CMS physics goals. The track reconstruction is the basis for all these tasks; thus, the improvement of the tracking performance and the development of the tools needed to estimate it are fundamental for the outcome of the whole experiment. CMS is therefore an experiment at the cutting edge of detector technology, exploring the open questions of contemporary science.

Focusing on the track reconstruction and the discovery potential of MSSM Higgs bosons, this thesis addresses some of the most important topics both from the detector and the physics analysis point of view. First, in the context of the CMS tracking, the Track Validation Tool has been developed.
This tool analyzes simulated data and produces a fully configurable set of plots in order to estimate the tracking performance and compare tracks reconstructed with different algorithms or software releases. It has shown that the CMS tracking has full efficiency, low fake rate and high resolution both for momentum and impact parameter measurements. Then, an algorithm that removes biased and incorrect hits from the reconstructed tracks has been implemented. The effect of this algorithm is to enhance the track quality by correcting the track parameters, reducing the track χ² and improving the track purity. Finally, the tracking performance has been evaluated with the first real data collected by the full tracker detector in its final configuration: a method to evaluate the cosmic ray tracking efficiency using tracker information only has been developed, leading to very good results (∼ 90%) matching Monte Carlo predictions.

The second part of the thesis is dedicated to the discovery potential of the MSSM Heavy Neutral CP-odd Higgs boson in the decay channel A → ττ → eµE̸_T in CMS. The τ decay channel is particularly favored in the MSSM because the coupling to A is enhanced by a factor tan β with respect to the SM. The two main production processes of A at the LHC are considered, namely the b quark associated and the direct gluon fusion processes. The main backgrounds are Z/γ* → ττ and tt̄ events, but Wt, WW and bbll events also provide a non-negligible contribution after the selection criteria are applied. A key feature of this analysis is the tagging of b jets: several strategies are considered, based on either requiring or vetoing a given number of b-tags, leading to significant results for each strategy, but with different signal and background contributions.
The analysis is performed for tan β = 30 and mA = 160 GeV at an integrated luminosity ∫L = 10 fb⁻¹; this channel is shown to be promising for the discovery of A and for the measurement of its mass and width. Finally, a mass scan for the same values of tan β and luminosity has been performed: CMS is sensitive to the A search in the mass range 140 ≤ mA ≤ 300 GeV.

Chapter 1
The CMS Experiment

1.1 The LHC accelerator

The Large Hadron Collider (LHC) [1] is the proton-proton (p-p) collider at CERN. It will collide protons with a center of mass energy √s = 14 TeV and a design luminosity L = 10³⁴ cm⁻² s⁻¹. The first LHC single beams successfully circulated through the whole collider on the morning of 10 September 2008. Although the initial beam commissioning was progressing extremely well, during the commissioning of the main dipole circuit in sector 3-4 on 19 September (without beam), a number of magnets were damaged in an incident that saw a large amount of helium released into the tunnel. Due to the repair of the damaged region and the planned winter shutdown, the next circulating beams in the LHC are foreseen for Spring 2009 at the earliest. During the first 3 years of data taking, the luminosity is expected to be 2 × 10³³ cm⁻² s⁻¹ (the so-called "low luminosity" phase), while at start-up the LHC is foreseen to run at L = 10³¹-10³² cm⁻² s⁻¹.

The LHC is installed in the LEP tunnel and the available CERN accelerators are employed in the injection chain: the proton beam, exiting a small linear accelerator at 50 MeV, will be injected into the PS at 1.4 GeV, then into the SPS at 25 GeV, and finally into the LHC ring at 450 GeV (Fig. 1.1). One of the critical aspects in accelerating the protons up to an energy of 7 TeV is the required bending magnetic field which, for the LHC bending radius (R ∼ 2780 m), is about 8.4 T.
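The 8.4 T figure can be cross-checked with the standard magnetic rigidity relation for a unit-charge particle, p [GeV/c] ≈ 0.2998 · B [T] · R [m]. The short sketch below is only an illustration: the function name is hypothetical, and the relation itself is textbook physics rather than something spelled out in this section.

```python
# Magnetic rigidity for a particle of unit charge:
#   p [GeV/c] = 0.299792458 * B [T] * R [m]
# (assumption: textbook relation, not taken from the thesis text)
GEV_PER_TESLA_METRE = 0.299792458


def bending_field_tesla(momentum_gev: float, radius_m: float) -> float:
    """Dipole field needed to keep a unit-charge particle on a circle of given radius."""
    return momentum_gev / (GEV_PER_TESLA_METRE * radius_m)


if __name__ == "__main__":
    # 7 TeV protons on the bending radius quoted in the text (R ~ 2780 m)
    print(f"B = {bending_field_tesla(7000.0, 2780.0):.2f} T")
```

With the values quoted in the text (7 TeV, R ∼ 2780 m) this gives a field very close to 8.4 T.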
This magnetic field will be provided by the 1232 LHC superconducting dipole magnets, each 14.2 m long, placed in the eight curved sections which connect the straight sections of the LHC ring. The superconducting magnets use a Nb-Ti conductor, cooled down to 1.9 K by means of superfluid helium. The choice of a p-p collider requires two separate magnetic chambers which, for economic reasons, lie in the same mechanical structure and cryostat.

The high luminosity of the LHC is obtained by a high frequency of bunch crossing and by a high number of protons per bunch: two beams of protons with an energy of 7 TeV, circulating in two different vacuum chambers, will each contain 2808 bunches filled with about 1.15 × 10¹¹ protons. The beams will cross at the interaction point at a rate of 40 MHz, with a spread of 7.5 cm along the beam axis and 15 µm in the transverse directions. The main machine parameters are summarized in Table 1.1.

Figure 1.1: Scheme of the LHC injection chain.

The operating conditions at the LHC are extremely challenging for the experiments. The p-p total inelastic cross section at √s = 14 TeV is about 80 mb, several orders of magnitude larger than the typical cross section for events with large momentum transfer. Most of the inelastic events consist of soft p-p interactions characterized by outgoing particles with a low transverse momentum. These events are referred to as minimum bias. It is expected that each bunch crossing will produce about 20 minimum bias events in the high luminosity phase and 5 minimum bias events in the low luminosity phase. Hence, each interesting event will be read out entangled with a large number of minimum bias events, which constitute the pile-up. The high interaction rate (∼ 10⁹ events/s) and the high bunch crossing frequency impose stringent requirements on the data acquisition and trigger systems and on the detectors.
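The interaction rate and the pile-up quoted above follow, to a good approximation, from rate = σ · L divided by the bunch crossing frequency. The sketch below is a rough estimate under stated assumptions: it uses the nominal 40 MHz crossing rate and ignores the fact that not every bunch slot is filled, and the function names are hypothetical.

```python
MB_TO_CM2 = 1e-27  # 1 millibarn expressed in cm^2


def interaction_rate_hz(sigma_mb: float, luminosity_cm2_s: float) -> float:
    """Inelastic interaction rate: cross section times instantaneous luminosity."""
    return sigma_mb * MB_TO_CM2 * luminosity_cm2_s


def mean_pileup(sigma_mb: float, luminosity_cm2_s: float,
                crossing_rate_hz: float = 40e6) -> float:
    """Average number of inelastic interactions per bunch crossing.

    Assumes every crossing at the nominal 40 MHz rate contains colliding bunches.
    """
    return interaction_rate_hz(sigma_mb, luminosity_cm2_s) / crossing_rate_hz


if __name__ == "__main__":
    for label, lumi in (("high luminosity", 1e34), ("low luminosity", 2e33)):
        rate = interaction_rate_hz(80.0, lumi)
        print(f"{label}: rate = {rate:.1e} Hz, pile-up = {mean_pileup(80.0, lumi):.0f}")
```

Under these assumptions the high luminosity phase gives a rate of order 10⁹ Hz and about 20 interactions per crossing, as in the text; the low luminosity estimate comes out at roughly 4, close to the 5 quoted above, the difference depending on the assumed cross section and on the fraction of filled bunches.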
The trigger has to provide a high rejection factor, while maintaining a high efficiency in selecting the interesting events. The detectors must have a fast response time (25-50 ns) and a fine granularity (and therefore a large number of readout channels) in order to minimize the pile-up effect. Furthermore, the high flux of particles coming from the p-p interactions implies that each component of the detector, including the read-out electronics, has to be radiation resistant.

Table 1.1: The relevant LHC parameters for p-p collisions.

Beam parameters
   Beam energy                                 7 TeV
   Maximum luminosity                          10³⁴ cm⁻² s⁻¹
   Time between collisions                     25 ns
   Bunch length                                7.7 cm
   RMS beam radius at the interaction point    16.7 µm
Technical parameters
   Ring length                                 26658.9 m
   Radiofrequency                              400.8 MHz
   Number of bunches                           2808
   Number of dipoles                           1232
   Dipole magnetic field                       8.33 T

1.2 The CMS detector

The Compact Muon Solenoid [2, 3, 4] is the general purpose experiment installed at Point 5 of the Large Hadron Collider; it is currently under commissioning and waiting for the first beam collisions foreseen in 2009. The CMS structure is a typical one for experiments at colliders: a cylindrical central section (the barrel) closed at its ends by two caps (the endcaps), as sketched in Fig. 1.2.

Figure 1.2: CMS overview (labelled components: superconducting solenoid, silicon tracker, pixel detector, preshower, electromagnetic calorimeter, hadronic calorimeter, very-forward calorimeter, muon detectors).

The main design component of CMS is its large superconducting magnet producing a 3.8 T field and allowing for a compact structure: within the solenoid the inner tracking system and both the electromagnetic and the hadron calorimeters are installed, while the muon chambers are interleaved with the iron return yoke.
The coordinate system in CMS is chosen with the z axis along the beam direction, the x axis directed toward the center of the LHC ring and the y axis directed upward, orthogonally to the z and x axes. Given the cylindrical structure of CMS, a convenient and commonly used coordinate system is r, φ, η, where r is the distance from the z axis, φ is the azimuthal angle in the xy plane and η is the pseudorapidity defined as

$$\eta = -\ln \tan\frac{\theta}{2} \tag{1.1}$$

where θ is the angle with respect to the beam axis. The use of pseudorapidity instead of the polar angle is motivated by the fact that the difference in pseudorapidity between two particles is invariant under Lorentz boosts along the beam axis. CMS is characterized by high hermeticity, with full φ coverage and a pseudorapidity coverage up to |η| = 5.

1.2.1 The Magnet

A key feature of the CMS experiment is its high axial magnetic field. The magnet system of CMS [5] is composed of three main parts: the superconducting solenoid, the barrel return yoke and the endcap return yoke. The 3.8 T magnetic field allows an efficient measurement of the muon momentum up to a pseudorapidity of 2.4. The return yoke is made of iron and contains the muon detectors. It is a 12-sided cylindrical structure with a total length of about 11 m, divided into five rings of about 2.5 m each. It has an outer diameter of 14 m and a total weight of about 7000 tons. Each ring is divided into three iron layers between which the muon detectors are inserted. The thickness of the border layers is 630 mm and the middle layer is 295 mm thick. Each endcap is composed of three independent disks; the outermost is 300 mm thick, the others 600 mm. The superconducting coil is housed in a vacuum tank and kept at the temperature of liquid helium. The vacuum tank is supported only by the central barrel ring of the yoke and in turn supports the calorimeter system (ECAL and HCAL) and the tracker.
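To give a feel for the pseudorapidity values used in the acceptance limits above (|η| < 2.4 for the muon momentum measurement, |η| up to 5 for the overall coverage), the definition of Eq. 1.1 and its inverse can be evaluated numerically. This is a minimal sketch; the function names are illustrative only.

```python
import math


def pseudorapidity(theta_rad: float) -> float:
    """Eq. 1.1: eta = -ln tan(theta / 2), with theta measured from the beam axis."""
    return -math.log(math.tan(theta_rad / 2.0))


def polar_angle(eta: float) -> float:
    """Inverse of Eq. 1.1: theta = 2 * atan(exp(-eta)), in radians."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    for eta in (0.0, 1.479, 2.4, 5.0):
        print(f"eta = {eta:5.3f}  ->  theta = {math.degrees(polar_angle(eta)):6.2f} deg")
```

For instance, η = 0 corresponds to a direction perpendicular to the beam, η = 2.4 to a polar angle of about 10°, and η = 5 to less than 1° from the beam axis.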
1.2.2 The Tracker

The CMS tracker [6] is the subdetector closest to the interaction point, placed in the 3.8 T magnetic field of the superconducting solenoid. It is designed to determine the interaction vertex, measure with high precision the momentum of the charged particles and identify the presence of secondary vertices. The tracker must be able to operate without degrading its performance in the harsh radiation environment of the LHC. The CMS collaboration has adopted silicon technology for the whole tracker. Three regions can be defined according to the charged particle flux at different radii at high luminosity:

• Closest to the interaction vertex, where the particle flux is highest (∼ 10⁷ s⁻¹ at r ∼ 10 cm), pixel detectors are placed. The size of the pixel cells is ∼ 100 × 150 µm², leading to an occupancy < 10⁻⁴ per pixel per LHC crossing.

• In the intermediate region (20 < r < 55 cm) the particle flux is lower, allowing for the use of silicon microstrip detectors with a minimum cell size of ∼ 10 cm × 80 µm and resulting in an occupancy of 2-3% per LHC crossing.

• The outermost region is characterized by sufficiently low fluxes to allow larger-pitch silicon microstrips with a maximum cell size of ∼ 25 cm × 80 µm, keeping the occupancy to ∼ 1%.

The pixel detector consists of three barrel layers and two endcap disks on each side (Fig. 1.3). The barrel layers are located at radii of 4.4 cm, 7.3 cm and 10.2 cm and are 53 cm long. The two end disks, extending from 6 to 15 cm in radius, are placed on each side at |z| = 34.5 cm and 46.5 cm. The pixel bulk is 270 µm thick. This design provides at least three 3D measurement points per track in the |η| < 2.4 region for tracks originating from the central interaction point. The total number of cells is about 66 million, organized in about 16000 modules of 52 columns and 80 rows. The total active area is close to 1 m².
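The pixel channel count and active area quoted above are mutually consistent, as a short calculation shows. The sketch below simply takes the figures of the text at face value (about 16000 units of 52 × 80 cells, cell size ∼ 100 × 150 µm²); the function names are hypothetical.

```python
def pixel_count(n_units: int, n_columns: int, n_rows: int) -> int:
    """Total number of pixel cells for identical units of n_columns x n_rows cells."""
    return n_units * n_columns * n_rows


def active_area_m2(n_pixels: int, pitch_x_um: float, pitch_y_um: float) -> float:
    """Total sensitive area in square metres, given the cell size in micrometres."""
    return n_pixels * (pitch_x_um * 1e-6) * (pitch_y_um * 1e-6)


if __name__ == "__main__":
    n_pixels = pixel_count(16000, 52, 80)
    print(f"pixels: {n_pixels / 1e6:.1f} million")
    print(f"active area: {active_area_m2(n_pixels, 100.0, 150.0):.2f} m^2")
```

With these inputs the count is about 66.6 million cells and the area about 1.0 m², in line with the "about 66 million" and "close to 1 m²" stated in the text.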
The pixel resolution is improved thanks to the charge sharing due to the Lorentz drift and to the non-zero incident angle with respect to the module surface:

• in the barrel, the resolution in r-φ is improved thanks to a Lorentz angle of about 32° (cluster size ∼ 2), while the cluster size along the z coordinate is 1-7 depending on the incident angle;

• for the forward pixels, a turbine geometry of 20° was chosen to improve the resolution in r thanks to the Lorentz effect and in r-φ thanks to the non-zero incident angle (average cluster size 2).

The expected resolutions for unirradiated sensors are less than 15 µm in the transverse direction for the barrel and 15-30 µm for the barrel longitudinal direction and for the endcap disks. Test beam data proved that, even after an irradiation dose greater than 40 Mrad (3 years of LHC at design luminosity), the detector performance, both in terms of efficiency and resolution, remains remarkable.

Figure 1.3: The inner pixel detector. The three barrel layers and the two disks of the endcap with blades arranged in a turbine-like shape are visible.

The strip tracker is divided into four sub-detectors: the Tracker Inner Barrel (TIB), the Tracker Outer Barrel (TOB), the Tracker Inner Disks (TID) and the Tracker End Cap (TEC). In the barrel, strips are parallel to the beam axis, while in the endcaps they have a radial orientation. On the whole the strip tracker is made of about 10 million channels for an active area close to 198 m². The TIB is made of four layers with about 25 < r < 50 cm and |z| < 65 cm. The first two layers are made of double-sided (stereo) modules with a tilt angle of 100 mrad and provide a measurement in both r-φ and r-z coordinates. The TOB is composed of six layers from r ∼ 55 cm to r ∼ 110 cm and with |z| ≲ 110 cm. For the TOB, too, the innermost two layers are stereo.
The TIB single-point resolution is 23-34 µm in the r-φ and 230 µm in the r-z coordinate; TOB resolutions are 35-52 µm in r-φ and 530 µm in z. The three TID and the nine TEC disks extend in the regions with 75 < |z| < 110 cm and 125 < |z| < 280 cm respectively. The two innermost TID and TEC rings and the fifth TEC ring are stereo. The sensors of the TIB, the TID and the three innermost TEC rings are 320 µm thick, while those of the TOB and the outer TEC rings are 500 µm thick. The CMS tracker, therefore, provides full coverage for |η| < 2.4 with more than 10 high-resolution measurement points, among which at least 5 provide a 3-dimensional position measurement.

Figure 1.4: Pseudorapidity coverage of the CMS tracker. 2D measurement layers are displayed in red, 3D in blue.

The main drawback of the all-silicon CMS tracker is the large amount of material due to detector modules, support structure, cooling plant, cables and electronic devices. The total material budget in terms of radiation length is estimated to rise up to 1.8 for η ∼ 1.5, corresponding to about 0.5 interaction lengths (Fig. 1.5).

1.2.3 The Electromagnetic Calorimeter

The electromagnetic calorimeter (ECAL) measures the energy of electrons and photons. The design of the CMS ECAL [7] was driven by the requirements imposed by the search for the Higgs boson in the channel H → γγ, where a peak in the di-photon invariant mass, located at the Higgs mass, has to be distinguished from a continuous background. A good resolution and a fine granularity are therefore required: both of them improve the invariant mass resolution of the di-photon system by improving respectively the energy and the angle measurement of the two photons. The fine granularity also helps to obtain a good π⁰/γ separation.
Figure 1.5: Material budget of the tracker as a function of η, expressed in terms of radiation length X0 (a) and in terms of interaction length λ0 (b), with the contributions of support, sensitive material, cables, cooling, electronics, other and air shown separately. The peak around η = 1.5 corresponds to the cables and services of the tracker.

In order to provide high energy resolution, ECAL is placed inside the solenoid: hence a compact calorimeter is required. ECAL is a hermetic, homogeneous calorimeter made of lead tungstate (PbWO4) crystals, with 61200 crystals mounted in the central barrel part and 7324 crystals in each endcap (Fig. 1.6). The choice of lead tungstate scintillating crystals was driven by their characteristics: they have a short radiation length (X0 = 0.89 cm) and a small Molière radius (RM = 2.2 cm), they are fast (80% of the scintillation light is emitted within 25 ns) and they are radiation hard. However, the relatively low light yield (30 γ/MeV) requires the use of photodetectors with intrinsic gain that can operate in a magnetic field. In the barrel, silicon avalanche photodiodes (APDs) are used as photodetectors, while vacuum phototriodes (VPTs) have been chosen for the endcaps. In addition, the sensitivity of the response of both the crystals and the APDs to temperature changes requires temperature stability. In order to preserve the ECAL energy resolution, a water cooling system guarantees long-term stability at the level of 0.1 °C.

The barrel region has a pseudorapidity coverage of |η| < 1.479. It has an inner radius of 129 cm and is structured in 36 supermodules, each containing 1700 crystals, covering half the barrel length and a 20° angle in φ.
Each supermodule is divided along η into four modules, which in turn are made of submodules, the basic alveolar assembly units, containing 5 × 2 crystals each. The barrel crystals have a front face cross-section of ∼ 22 × 22 mm² and a length of 230 mm, corresponding to 25.8 X0. To prevent particles from escaping through the dead regions between the crystals, the crystal axes are oriented with a 3° tilt with respect to the pointing geometry. The granularity of the barrel is ∆φ × ∆η = 0.0175 × 0.0175 and the crystals are grouped, from the readout point of view, into 5 × 5 arrays corresponding to the trigger towers.

The endcaps cover the pseudorapidity region 1.48 < |η| < 3.0, ensuring precision measurements up to |η| < 2.5. The endcap crystals have dimensions of 28.6 × 28.6 × 220 mm³. Each endcap is structured in two "Dees" consisting of semi-circular aluminum plates from which structural units of 5 × 5 crystals, known as "supercrystals", are cantilevered.

Figure 1.6: Scheme of the barrel and of the endcaps of the CMS ECAL.

A preshower device, whose principal aim is to identify neutral pions in the endcaps within 1.653 < |η| < 2.6, is placed in front of the crystal calorimeter. The active elements are two planes of silicon strip detectors which lie behind disks of lead absorber at depths of 2 X0 and 3 X0.

1.2.4 The Hadronic Calorimeter

The hadron calorimeter (HCAL) [8], placed just outside the electromagnetic calorimeter, plays a major role in the reconstruction of jets and missing energy. Its resolution has to guarantee a good reconstruction of the di-jet invariant mass and an efficient measurement of the missing energy, which represents an effective signature in many channels of physics beyond the Standard Model.
Similarly to the other subdetectors, HCAL has to provide good hermeticity, which is critical for determining the missing energy, and a fine granularity to allow for a clear separation of di-jets from resonance decays and to improve the resolution on the di-jet invariant mass. Moreover, it has to provide a number of interaction lengths sufficient to contain the energetic particles from high transverse momentum jets. The dynamic range has to be large in order to detect signals ranging from that of a single minimum ionizing muon up to an energy of 3 TeV.

The pseudorapidity region |η| < 3 is covered by the barrel (up to |η| < 1.74) and the two endcaps. The HCAL is composed of brass absorber layers interleaved with plastic scintillator layers, 4 mm thick, used as active medium. The absorber layer thickness ranges from 60 mm in the barrel to 80 mm in the endcaps. In terms of interaction lengths λ, the barrel ranges from 5.46 λ at |η| = 0 up to 10.82 λ at |η| = 1.3; the endcaps correspond on average to ∼ 10 λ. The scintillator in each layer is divided into tiles with a granularity matching that of the ECAL trigger towers (∆η × ∆φ = 0.0875 × 0.0875) and the light is collected by wavelength shifters.

The two hadronic forward calorimeters improve the HCAL hermeticity, covering the pseudorapidity region 3 < |η| < 5. They are placed at 11.15 m from the interaction point, outside the magnetic field. Due to the extremely harsh radiation environment a different detection technique is used: a grid of radiation-hard quartz fibers is embedded in an iron absorber.

1.2.5 The Muon System

In CMS, the muon detectors are placed beyond the calorimeters and the solenoid. The muon system [9] consists of four active stations interleaved with the iron absorber layers which constitute the return yoke for the magnetic field. The muon system has three functions: muon identification, momentum measurement, and triggering.
Good muon momentum resolution and trigger capability are enabled by the high field solenoidal magnet and its flux-return yoke. The latter also serves as a hadron absorber for the identification of muons. Three different types of detectors are employed: drift tubes (DT) in the barrel region, cathode strip chambers (CSC) in the endcaps and, in addition to DT and CSC, resistive plate chambers (RPC) in both regions.

In the barrel region, where the neutron-induced background is small, the muon rate is low, and the magnetic field is uniform and mostly contained in the steel yoke, drift chambers with standard rectangular drift cells are used. The barrel drift tube (DT) chambers cover the pseudorapidity region |η| < 1.2 and are organized into 4 stations interspersed among the layers of the flux return plates. The first 3 stations each contain 8 chambers, in 2 groups of 4, which measure the muon coordinate in the r-φ bending plane, and 4 chambers which provide a measurement in the z direction, along the beam line. The fourth station does not contain the z-measuring planes.

In the two endcap regions of CMS, where the muon rates and background levels are high and the magnetic field is large and non-uniform, the muon system uses cathode strip chambers (CSC). With their fast response time, fine segmentation, and radiation resistance, the CSCs identify muons between |η| values of 0.9 and 2.4. There are 4 stations of CSCs in each endcap, with chambers positioned perpendicular to the beam line and interspersed between the flux return plates. The cathode strips of each chamber run radially outward and provide a precision measurement in the r-φ bending plane. The anode wires run approximately perpendicular to the strips and are also read out in order to provide measurements of η and of the beam-crossing time of a muon.
Each 6-layer CSC provides robust pattern recognition for rejection of non-muon backgrounds and efficient matching of hits to those in other stations and to the CMS inner tracker.

A crucial characteristic of the DT and CSC subsystems is that they can each trigger on the pT of muons with good efficiency and high background rejection. A complementary, dedicated trigger system consisting of resistive plate chambers (RPC) was added in both the barrel and endcap regions. The RPCs provide a fast, independent, and highly-segmented trigger with a sharp pT threshold over a large portion of the rapidity range (|η| < 1.6) of the muon system. The RPCs are double-gap chambers, operated in avalanche mode to ensure good operation at high rates. They produce a fast response, with good time resolution but coarser position resolution than the DTs or CSCs. They also help to resolve ambiguities in attempting to make tracks from multiple hits in a chamber.

1.2.6 The Trigger

At the nominal LHC luminosity, the expected event rate is about 10⁹ Hz. Given the typical size of a raw event (∼ 1 MB), it is not possible to record the information for each event. Indeed, the event rate is largely dominated by soft p-p interactions with particles of low transverse momentum. The triggering system must have a large reduction factor and at the same time maintain a high efficiency on the potentially interesting events, reducing the rate down to 100 Hz, which is the maximum sustainable rate for storing events. The trigger system consists of two main steps: a Level 1 Trigger and a High Level Trigger. The basic concepts are described in the following.

The Level 1 trigger

The Level 1 trigger [10] (L1) reduces the rate of selected events down to 50 (100) kHz for the low (high) luminosity running. The full data are stored in pipelines of processing elements while waiting for the trigger decision. The L1 decision has to be taken in 3.2 µs.
If the L1 accepts the event, the data are processed by the High Level Trigger. To deal with the 25 ns bunch spacing, the L1 trigger has to take a decision in a time too short to read data from the whole detector; therefore it employs calorimetric and muon data only, since the tracker algorithms are too complex for this purpose. The Level 1 trigger is organized into a Calorimeter Trigger and a Muon Trigger, whose information is transferred to the Global Trigger, which takes the decision.

The Calorimeter Trigger is based on trigger towers, arrays of 5 × 5 crystals in ECAL, which match the granularity of the HCAL towers. The trigger towers are grouped in calorimetric regions of 4 × 4 trigger towers. The Calorimeter Trigger identifies, from the calorimetric region information, the best four candidates of each of the following classes: electrons and photons, central jets, forward jets and τ-jets identified from the profile of the deposited energy. The information on these objects is passed to the Global Trigger, together with the measured missing ET.

The Muon Trigger is performed separately for each muon detector. The information is then merged and the best four muon candidates are transferred to the Global Trigger.

The Global Trigger takes the accept/reject decision exploiting both the characteristics of the single objects and those of their combinations.

The High Level Trigger

The High Level Trigger [11] (HLT) reduces the output rate down to 100 Hz. The idea of the HLT software is regional reconstruction on demand: only those objects in the useful regions are reconstructed and the uninteresting events are rejected as soon as possible. This leads to the development of three "virtual trigger" levels: at the first level only the full information of the muon system and of the calorimeters is used, at the second level the information of the tracker pixels is added, and at the third and final level the full event information is available.
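The trigger requirements described in this section can be summarized with a few lines of arithmetic. The sketch below uses only numbers quoted in the text (10⁹ Hz input rate, ∼ 1 MB per event, 100 kHz L1 output at high luminosity, 100 Hz to storage, 3.2 µs L1 latency, 25 ns bunch spacing); the function names are hypothetical and the results are order-of-magnitude estimates, not design values.

```python
def rejection_factor(rate_in_hz: float, rate_out_hz: float) -> float:
    """Ratio between the input and output rates of a trigger stage."""
    return rate_in_hz / rate_out_hz


def pipeline_depth(latency_s: float, bunch_spacing_s: float) -> int:
    """Number of bunch crossings to buffer while waiting for the L1 decision."""
    return round(latency_s / bunch_spacing_s)


def bandwidth_mb_per_s(rate_hz: float, event_size_mb: float) -> float:
    """Data throughput for a given event rate and event size."""
    return rate_hz * event_size_mb


if __name__ == "__main__":
    print("L1 rejection    :", rejection_factor(1e9, 100e3))
    print("HLT rejection   :", rejection_factor(100e3, 100.0))
    print("total rejection :", rejection_factor(1e9, 100.0))
    print("pipeline depth  :", pipeline_depth(3.2e-6, 25e-9), "crossings")
    print("storage rate    :", bandwidth_mb_per_s(100.0, 1.0), "MB/s")
```

Read this way, the L1 has to reject roughly four orders of magnitude and the HLT three more, while the front-end pipelines have to hold the data of 128 bunch crossings during the L1 latency.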
Chapter 2

Tracking in CMS

Tracking is one of the most important tasks for CMS because most of the reconstructed physics objects depend on it: tracks not only provide the momentum measurement for charged particles (muons, electrons and charged hadrons), but are also the input for primary and secondary vertex reconstruction and for b- and τ-tagging algorithms; they are used to reconstruct photons converted into e+e− pairs in the tracker volume and are fundamental for some jet-energy correction algorithms. On the other hand, tracking in CMS is also a very complex task, because LHC collisions will produce of the order of a thousand tracks at every bunch crossing, corresponding to about 2.5 tracks/cm² every 25 ns at R = 4 cm, where the first pixel barrel layer is placed. In addition, the large material budget of the all-silicon tracker makes the reconstruction of tracks from particles that can suffer bremsstrahlung (e.g. electrons) or nuclear interactions (e.g. pions) even harder. Therefore, to obtain the best results in each physics analysis in CMS, efficient and precise tracking in different kinds of events is needed.

2.1 Overview of Tracking in CMS

The CMS tracking group has developed a complex tracking strategy that leads to optimal track reconstruction performance for any physics object use case. In particular, this is achieved by a modular approach that divides the track reconstruction process into three steps [12, 13, 14]:

• Seeding: hit pairs or triplets are selected, providing an initial estimate of the track parameters.
• Pattern Recognition: starting from the track seed, the hits corresponding to that track are searched for in the whole tracker.
• Final Fit: once all the track hits have been found, they are fitted to extract the best estimate of the track parameters.

Currently, in CMS several tracking algorithms are available.
Each algorithm defines its own seeding and pattern recognition procedures, while, to provide a coherent track definition, the final fit is the same for all the algorithms. Two “general purpose” algorithms have been developed, namely the Combinatorial Kalman Filter and the Road Search. Other algorithms are instead dedicated to the reconstruction of special kinds of tracks: the Cosmic Track Finder for the reconstruction of cosmic rays, the Gaussian Sum Filter for electrons and the Deterministic Annealing Filter for tracks in high energy jets.

After the tracks are produced, a set of quality filters [15] is applied, providing a quality label for each reconstructed track. These filters apply cuts based on the transverse compatibility of the track with the beam line, its longitudinal compatibility with the interaction vertices and the track χ². The cut strength depends on the track pT, η and, above all, on the number of hits: basically no quality cuts are needed for tracks with many hits (at least 10 crossed layers), while tighter cuts are applied to tracks with fewer hits. The quality flags are Pre Filter, Loose Quality, Tight Quality and High Purity, the first being the label for tracks before any filtering is applied, the last being the most severe quality selection (Table 2.1).

Table 2.1: Examples of High Purity quality cuts for three different reconstructed tracks. σ(α) is the expected resolution on the parameter α for that track. For the definition of the track parameters, see §2.2.

Track                               χ²/ndof   |dxy| [cm]   |∆z| [cm]   |dxy|/σ(dxy)   |∆z|/σ(z)
pT = 0.7 GeV, η = 0.8, nhits = 5    < 4.5     < 0.02       < 0.04      < 16           < 16
pT = 3 GeV, η = 0.8, nhits = 7      < 6.3     < 0.06       < 0.15      < 62           < 62
pT = 3 GeV, η = 0.8, nhits = 15     < 13.5    < 1.2        < 3.1       < 1300         < 1300

Also, in order to reconstruct low momentum and secondary particle tracks, an iterative tracking [16] approach has been developed.
After a first reconstruction step, severe cuts are applied in order to keep the fake rate at a very low level; the hits attached to the selected tracks are then removed, and a second track reconstruction is performed with the remaining hits and looser tracking cuts. This procedure can be further iterated (the default sequence performs three steps).

Finally, to complete this introduction to CMS tracking, it is worth recalling that tracks are also used in the High Level Trigger (HLT). For this purpose, tracks can be reconstructed very quickly using pixel hits only, or using both pixels and strips but in a limited region of the tracker (usually a region pointed to by calorimeter towers or by tracks in the muon chambers). In the following, the items most relevant for this thesis are reviewed in more detail.

2.2 Definition of Track

A track is defined as a set of measurement points (hits) in the tracker that, once properly fitted, provide an estimate (with corresponding errors) of a charged particle trajectory. The trajectory of a charged particle in a magnetic field is a helix and can be geometrically described by five parameters. These parameters are evaluated at a given reference position v = (vx, vy, vz) along the track, which, for tracks from LHC collisions, is the point of closest approach to the beam axis. In addition, some of them are evaluated as distances with respect to another point, the beam spot [17] bs = (bsx, bsy, bsz). The choice of parameters currently made in CMSSW is the following [18]:

1. qoverp = q/|p| = signed inverse of the momentum [1/GeV].
2. λ = π/2 − θ, where θ is the polar angle at the point v.
3. φ = azimuthal angle at v.
4. dxy = −(vx − bsx) · sin φ + (vy − bsy) · cos φ [cm]. Geometrically, dxy is the signed distance in the XY plane between the straight line passing through (vx, vy) with azimuthal angle φ and the point (bsx, bsy).
5. dsz = (vz − bsz) · cos λ − ((vx − bsx) · cos φ + (vy − bsy) · sin φ) · sin λ [cm].
The dsz parameter is the signed distance in the SZ plane between the straight line l passing through (vx, vy, vz) with angles (φ, λ) and the projection of the bs point on the SZ plane. The S axis is defined by the projection of l on the XY plane.

Other important quantities are the transverse impact parameter d0, the longitudinal impact parameter dz and the transverse momentum pT, which can be expressed as a function of the above parameters as d0 = −dxy, dz = dsz/cos λ and pT = |p| sin θ.

2.3 Combinatorial Kalman Filter

The Combinatorial Kalman Filter (CKF), also known as Combinatorial Track Finder (CTF), is the default CMS track reconstruction algorithm, as it provides the best performance both in terms of physics results and computing time.

The CKF seeding looks for hit pairs and triplets on consecutive tracker layers or, to account for hit reconstruction and detector acceptance inefficiencies, on layers separated by at most one layer without hits. Pairs and triplets have to be compatible with tracks coming from the beam interaction region and with a minimum pT value. By default, both the innermost pixel and strip layers are considered: even if the track density is higher in the inner tracker layers, the pixel hits are particularly useful since their high granularity leads to low occupancy and they provide a 3-D measurement; strip hits, instead, are useful to maximize the seeding efficiency at large η thanks to the larger geometrical acceptance of the strip detector.

The pattern recognition proceeds iteratively, starting from the track parameter estimate on the seed layer and including the information of the successive detection layers one by one. First, the layers which are compatible with the initial seed trajectory are determined. The trajectory is then extrapolated to these layers, accounting for magnetic field, multiple scattering and energy loss in the traversed material.
Several hits on the new layer may be compatible with the predicted trajectory, and thus one new trajectory candidate per compatible hit is created. In addition, one further trajectory candidate is created without any reconstructed hit in that layer, but with a fake hit, called invalid hit, used to account for the material effects in a layer where the track has produced no hits. Each trajectory is then updated with the corresponding hit according to the Kalman filter formalism, by combining the predicted trajectory parameters and the hit measurement as a weighted mean. This procedure is repeated until the outermost layer of the tracker is reached. Several parameters can be tuned to properly configure the CKF algorithm; for example, the maximum number of candidates that are propagated at each step, the maximum number of invalid hits and the minimum transverse momentum.

Figure 2.1: Schematic view of the CKF seeding with hit pairs (a): considering two inner tracker layers and starting from a hit in the outer layer, hits in a φ window corresponding to a minimum pT and compatible with the beam spot are searched for in the inner layer. Schematic view of the CKF pattern recognition (b): starting from the seed estimate, the compatible hits are iteratively processed, each leading to a trajectory candidate. The candidate on the left is stopped because no more compatible hits are found; the one on the right is stopped because, after adding the last hit, its χ² is too high. Red dots are compatible measurements, black dots are not compatible.

The default CKF behavior is optimized to reconstruct tracks in collision events. In order to also reconstruct cosmic tracks during commissioning runs, a customized version of it has been implemented. In particular, this adapted algorithm makes use of seeding from the outer top layers (global y coordinate > 0) and of loose cuts on the number of invalid hits per track.
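The update and branching steps of the CKF pattern recognition can be illustrated with a one-dimensional toy model. The Python sketch below is a simplified illustration and not the CMSSW implementation: the real filter acts on the five-parameter track state and its covariance matrix, whereas here the state is a single coordinate with a variance. The function names, the dictionary layout and the χ² cut value are hypothetical.

```python
def kalman_update_1d(x_pred, var_pred, x_meas, var_meas):
    """Combine a predicted coordinate and a hit measurement as a weighted mean.

    Returns the updated coordinate, its variance and the chi2 of the hit
    with respect to the prediction.
    """
    w_pred = 1.0 / var_pred
    w_meas = 1.0 / var_meas
    var_upd = 1.0 / (w_pred + w_meas)
    x_upd = var_upd * (w_pred * x_pred + w_meas * x_meas)
    chi2 = (x_meas - x_pred) ** 2 / (var_pred + var_meas)
    return x_upd, var_upd, chi2


def extend_candidate(x_pred, var_pred, hits, var_meas, max_chi2):
    """Build one new candidate per compatible hit plus one invalid-hit candidate."""
    candidates = []
    for x_meas in hits:
        x_upd, var_upd, chi2 = kalman_update_1d(x_pred, var_pred, x_meas, var_meas)
        if chi2 < max_chi2:
            candidates.append(
                {"x": x_upd, "var": var_upd, "chi2": chi2, "valid_hit": True}
            )
    # Candidate with no hit on this layer: the prediction is kept unchanged.
    candidates.append({"x": x_pred, "var": var_pred, "chi2": 0.0, "valid_hit": False})
    return candidates


if __name__ == "__main__":
    # Prediction at 0.0 with variance 4.0; two hits on the layer, variance 1.0 each.
    for cand in extend_candidate(0.0, 4.0, [1.0, 10.0], 1.0, max_chi2=10.0):
        print(cand)
```

In this toy case the hit at 1.0 is compatible and pulls the state towards the more precise measurement, the hit at 10.0 fails the χ² cut, and the invalid-hit candidate keeps the prediction unchanged.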
2.4 Road Search

Road Search (RS) is the other general purpose tracking algorithm in CMS. The algorithm treats the CMS tracker in terms of rings, where a ring contains all tracker modules at a given r−z position, spanning 360 degrees in φ. Its approach differs from that of the CKF starting from the seeding step: in this case, the seed hits are not searched for in consecutive layers, but in the inner and in the outer rings of the strip detector. The pattern recognition is based on a predetermined set of roads, which are lines in the r−z plane consistent with the trajectory of particles from the beam spot. The roads connecting the seed hits are considered and all the hits along the road are collected in a cloud. Then, starting from the low occupancy layers, the most compatible hits in the cloud are selected.

Figure 2.2: Schematic view of the RS seeding regions (a): inner (red) and outer (blue) rings. Schematic representation of the Road Search algorithm (b): in (a) a trajectory is drawn through the two circled seed hits. All hits within a window around the trajectory (shaded region) are collected in the cloud. In (b) a new trajectory is built inside-out using only hits on the low occupancy layers of the cloud, resulting in the fitted trajectory in (c). This trajectory is extrapolated back to the innermost layer and then hits on the higher occupancy layers are tested (d). The best hit on each layer is used to yield the fitted track (e).

The RS algorithm has also been customized for cosmic ray tracking. By construction, RS reconstructs tracks in one half only (top or bottom) of the tracker; a cosmic muon may cross both tracker halves and lead to two different RS tracks: these two tracks are merged into a single track if they match.

2.5 Cosmic Track Finder

The Cosmic Track Finder (CosmicTF) is an algorithm specialized for the reconstruction of cosmic ray tracks. It is designed to work for low hit multiplicity and single track events.
All pairs of hits (triplets in the case of cosmic runs with magnetic field) in the outer tracker layers (both with y < 0 and y > 0) are considered as seeds, each corresponding to a trajectory candidate. The trajectory candidate is iteratively propagated to every hit in the tracker according to their order along the vertical (y) direction. If a hit is compatible, it is added to the trajectory. Once all the hits have been considered, the final fit is performed on the trajectory candidate.

Figure 2.3: Schematic view of the CosmicTF seeding (a): all the hit pairs in the outer tracker layers are seeds; in the example the outer bottom pair is the seed. Schematic view of the CosmicTF pattern recognition (b): the hits are sorted in the y direction, the compatible ones (red) are added to the trajectory, while the incompatible ones (green) are disregarded.

After all seeds have been processed, the found tracks are analyzed and only the best track is kept, according to the highest number of crossed layers, the highest number of hits and the lowest track χ².

2.6 The Final Track Fit

The final fit was the first task undertaken in this thesis work. The initial task was the porting from the old CMS reconstruction software framework (ORCA [20]) to the new one (CMSSW [19]); afterwards it evolved into the maintenance of this code and the development of new features related to the final fit.

The final fit is based on the Kalman filtering technique and is the last step of the track reconstruction process, common to all the CMS tracking algorithms. It provides the most accurate measurement of the track parameters after the seeding and the pattern recognition procedures. For each trajectory, the pattern recognition step results in a collection of hits and an estimate of the track parameters.
At this stage, the determination of the track parameters is still not optimal, since it is accurate only at the last hit of the trajectory and since the estimate can be biased by constraints applied during the seeding stage. Therefore the trajectory is refitted using a least-squares approach, implemented as a combination of a standard Kalman filter and smoother.

The final fit starts on the first seed hit layer, usually the innermost, with the estimate of the track parameters obtained during the pattern recognition. The corresponding covariance matrix is scaled by a large factor in order to avoid any bias. This estimate is updated with the measurement provided by the first hit. The trajectory is then propagated (outwards in the case of inner seeding) to the next hit surface, taking into account both the energy loss and the multiple scattering. For each valid hit, the position estimate is re-evaluated using the current values of the track parameters: the information on the incidence angle increases the precision of the measurement, especially in the pixel modules. The predicted track parameters on this surface and their covariance matrix are updated with the hit measurement. This sequence is repeated until the last hit has been reached. At this point, a new Kalman filter fit (called smoothing) is initialized with the result of the first one (except for the covariance matrix, which is scaled by a large factor) and is performed in the opposite direction. During the smoothing, for each hit, seven different track parameter estimates are available:

• Forward predicted state: track parameter estimate obtained during the first fit by propagating the information from the first (n − 1) hits to the n-th hit surface.
• Backward predicted state: track parameter estimate obtained during the smoothing step by propagating the information from the last (N − n) hits to the n-th hit surface.
• Combined predicted state: weighted mean combination of forward and backward predicted states. This is the most precise estimate of the track parameters on the n-th hit surface that does not take into account the n-th hit measurement.
• Forward updated state: estimate obtained by updating the forward predicted state with the hit measurement.
• Backward updated state: estimate obtained by updating the backward predicted state with the hit measurement.
• Combined forward updated state: weighted mean combination of forward updated and backward predicted states.
• Combined backward updated state: weighted mean combination of backward updated and forward predicted states.

The hit position is re-evaluated using the combined predicted state, thus leading to the best hit position estimate. In addition, a hit χ², evaluating the compatibility of the hit position with the predicted state position, can be computed between the hit measurement and the combined predicted state.

This filtering and smoothing procedure yields optimal estimates of the parameters at the surface associated with each hit and, in particular, at the first and the last hit of the trajectory. Estimates on other surfaces are then derived by extrapolation from the closest hit. In particular, an extrapolation of the track parameters from the innermost hit to the point of closest approach to the beam line is performed in order to evaluate them according to the track definition (§2.2).

2.6.1 Track Refitting

Some use cases require performing a new final fit on already reconstructed Tracks, to re-evaluate the track parameters after some conditions have changed but without performing the whole track reconstruction sequence again. The most typical example is alignment studies, where tracks are refitted after modifying the geometry of the detectors. The only difference between a “standard” final fit and a refit is the initial evaluation of the track parameters.
The pattern recognition output is a TrackCandidate object, which stores a starting trajectory state on the surface of the first hit; this estimate has enlarged track parameter errors in order to unbias the fit. A reconstructed track, instead, does not retain this information. Therefore, a new starting state is built by propagating the track parameters, defined at the point of closest approach to the beam line, to the surface of the first hit and then rescaling the errors. This new state is very close to the TrackCandidate one: in fact, when the Refitter is run under the same conditions (e.g. same geometry) as the first final fit, the differences between any parameter of the produced Tracks are orders of magnitude smaller than the estimated errors on that parameter.

Refitting with Constraints

The possibility to refit a track using additional constraints is very important in many situations: for example, it can be used to set the momentum of the track in cases where the refitted hits are too few to determine it properly. Other applications could be a beam-spot or a generic vertex constraint. Therefore, the possibility to add constraints to the track refitter has been provided. An important feature of this implementation is that it does not simply add the constraint at the end of the fit, but treats the constraint as an additional ordinary hit: in this way all the measurement points are affected by the presence of the constraining hit. The main idea is that the constraint is a new kind of hit with user-defined values of parameters and associated errors. At the moment there are two different kinds of constraints that can be applied:

• the momentum magnitude constraint
• the vertex constraint.

The user has to specify the constraint to apply by producing an association map between the track to be refitted and the constraint to be applied.
Once this map is provided to the refitter, it finds the correct position in the hit vector for the constraining hit, computes a proper starting state and then performs a new final fit in the usual way. This approach provides the maximum flexibility, since the constraints are completely defined by the user.

Chapter 3

Tracking Validation Tools in CMSSW

3.1 Overview

The goal of the CMS Track Validation Tool is to provide an official evaluation of the tracking performance. For this purpose, an analysis program has been written that, given the collections of reconstructed and simulated tracks in the event, provides a set of plots summarizing the performance of track reconstruction. A set of macros compares the histograms produced by the validation tool on different samples, with different reconstruction algorithms or with different releases. Comparing the tracking performance under different conditions is very useful to monitor it for every new software release and to evaluate the impact of new reconstruction algorithms on track reconstruction.

The Track Validation Tool is composed of three elements:

• Track Associators: they are used to establish whether a reconstructed track matches a simulated track. The association is performed according to different criteria, which can be divided into methods that compare the parameters of reconstructed and simulated tracks (association by χ²) and methods that check the provenance of the track hits (association by hits). Two kinds of association are possible: the association of reconstructed to simulated tracks (RecoToSim) and the association of simulated to reconstructed tracks (SimToReco). RecoToSim and SimToReco may have different association requirements.
• Track Filters: the input collection can be selected with three different filters: a reconstructed track filter and two simulated track filters, one for efficiency and one for the fake rate studies.
The filter for the reconstructed tracks can be used to study tracks with a particular topology or a particular pT, or to select them according to the tracking algorithm and the quality label. The default cuts are reported in Table 3.1: in this case, all cuts other than the quality requirement are effectively dummy, because the request for High Purity tracks is already a severe selection. For the evaluation of the efficiency, the filter has to select the simulated tracks which are expected to be reconstructed by the tracking algorithm under test. As an example, the filter cuts for the default CMS tracking are reported in Table 3.2. For the fake rate estimate, instead, the simulated track collection does not need to pass severe cuts, so the cuts used for the efficiency are loosened or no filter at all is applied.
• MultiTrackValidator: it is the analysis program itself.

The analysis performed in the MultiTrackValidator is divided into four steps. First, the track collections are selected with the dedicated filters. The selection cuts can be applied either before or during the MultiTrackValidator program execution. The second step consists of the association of the reconstructed and simulated tracks (in case the association map is not already provided as external input). For each specified track associator, the SimToReco association is performed between the collection of simulated tracks filtered for efficiency studies and the reconstructed tracks, while the RecoToSim association is performed between the reconstructed tracks and the simulated tracks selected for fake rate studies. Then, for each simulated track selected for the efficiency studies, the SimToReco association map is scanned looking for the matching reconstructed track: the efficiency is computed as the number of matched tracks divided by the number of simulated tracks per (η, pT or number of hits) bin.
Finally, for every reconstructed track in the input collection, the RecoToSim association map is scanned looking for the matching simulated track: the fake rate is computed per bin as the fraction of reconstructed tracks with no matching simulated track, i.e. one minus the number of matched tracks divided by the number of reconstructed tracks. In addition, when a reconstructed track is associated to a simulated one, their parameters are compared, providing the input for the residual distribution plots. The resolutions are computed by fitting each x-axis bin of the residual vs (η, pT or number of hits) 2D plots with a Gaussian function and filling new 1D histograms, defined with the same x-axis binning, with the corresponding fit width.

Table 3.1: Default selection cuts for the reco::Tracks.

algorithm    any            η      −5 < η < 5
quality      High Purity    pT     > 0.1 GeV
χ²           < 10⁴          dxy    < 120 cm
n hits       ≥ 3            dz     < 300 cm

Table 3.2: Default selection cuts for the TrackingParticles used for efficiency studies. Only signal means that the TrackingParticle is not from a pile-up event, and V is the particle production point.

only signal      true    η      −2.4 < η < 2.4
q ≠ 0            true    pT     > 0.9 GeV
particle type    all     Vxy    < 3.5 cm
n hits           ≥ 0     Vz     < 30 cm

In this chapter, the tracking performance with simulated data is first reviewed; then the details of the implementation of the track associators and of the validation program are presented.

3.2 Tracking Performance with Simulated Data

For each new software release, a set of reference samples (RelVal samples) is produced to monitor the reconstruction performance with that release. Every time new RelVal samples are available, the CMS tracking group uses the MultiTrackValidator program to analyze them, thus keeping track of the tracking performance with respect to new software and algorithmic developments and providing an “official” estimate of the expected tracking performance for various kinds of events. The most important variables and distributions, taken as benchmarks for the tracking performance, are introduced and discussed here. In Fig.
3.1, the efficiency for particle gun events is presented: for single muons the efficiency is close to 100% over the whole Tracker acceptance range, while, because of nuclear interactions, about 10% of single pions yield too few tracker measurements and thus cannot be reconstructed.

Figure 3.1: Single particle gun events efficiency vs η for pT = 1, 10 and 100 GeV: muons (a) and pions (b).

The track parameter resolutions for single muon events are reported in Fig. 3.2. In the central region, the pT resolution (Fig. 3.2(e)) is better than 1% for single muons with pT ≤ 10 GeV, while for large |η| values it worsens because of material effects, reduced lever arm and lower hit resolution. The resolution on the transverse (Fig. 3.2(a)) and longitudinal (Fig. 3.2(b)) impact parameter is at the level of a few tens of µm or better for pT ≥ 10 GeV.

The efficiency and fake rate for tt̄ events without and with low luminosity pile-up are reported in Fig. 3.3. The efficiency vs η (Fig. 3.3(a)) is consistent with that obtained for single pion events (Fig. 3.1(b)) and is about 90%, while the fake rate is below 5% (Fig. 3.3(b)). Low luminosity pile-up leads to a small reduction of the efficiency, while it increases the fake rate by about a factor of two. Most inefficiencies and fakes are due to wrongly reconstructed low momentum tracks (Fig. 3.3(c) and (d)) and tracks with a small number of hits (Fig. 3.3(e) and (f)).
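The efficiency and fake rate shown in these plots are per-bin ratios built from the association results. The Python sketch below illustrates the bookkeeping, taking the efficiency as the fraction of selected simulated tracks with a matching reconstructed track and the fake rate as the fraction of reconstructed tracks without a matching simulated track. It is a schematic illustration, not the MultiTrackValidator code: the function names and the list-based inputs are assumptions made for the example.

```python
from bisect import bisect_right


def ratio_per_bin(values_all, values_pass, edges):
    """Histogram both lists with the same bin edges and return pass/all per bin.

    Bins are [low, high); a bin with no entries in the denominator yields None.
    """
    def fill(values):
        counts = [0] * (len(edges) - 1)
        for v in values:
            i = bisect_right(edges, v) - 1
            if 0 <= i < len(counts):
                counts[i] += 1
        return counts

    n_all = fill(values_all)
    n_pass = fill(values_pass)
    return [p / a if a else None for a, p in zip(n_all, n_pass)]


def efficiency(sim_values, sim_is_matched, edges):
    """Matched simulated tracks divided by all selected simulated tracks, per bin."""
    matched = [v for v, m in zip(sim_values, sim_is_matched) if m]
    return ratio_per_bin(sim_values, matched, edges)


def fake_rate(reco_values, reco_is_matched, edges):
    """Unmatched reconstructed tracks divided by all reconstructed tracks, per bin."""
    unmatched = [v for v, m in zip(reco_values, reco_is_matched) if not m]
    return ratio_per_bin(reco_values, unmatched, edges)


if __name__ == "__main__":
    eta_edges = [-2.5, 0.0, 2.5]
    print(efficiency([-1.0, -0.5, 0.5, 1.0], [True, False, True, True], eta_edges))
    print(fake_rate([-1.0, 0.3, 0.7, 1.2], [True, True, False, True], eta_edges))
```

The same helper can be reused with pT or number-of-hits values in place of η, since only the binned variable changes.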
Summarizing, the expected performance of CMS tracking is characterized by a high efficiency (above 99% for single muons, around 90% for tt̄ events), a low fake rate (smaller than 10% in tt̄ events with low luminosity pile-up) and great precision in the track parameter measurement (impact parameter resolution smaller than 10 µm at high momenta and pT resolution of the order of 1%).

Figure 3.2: Resolution of track parameters vs η for single muon events with pT = 1, 10, 100 GeV: dxy [µm] (a), dz [µm] (b), φ [mrad] (c), cot θ [10⁻³] (d) and pT [%] (e).
Figure 3.3: Validation plots for tt̄ events without and with low luminosity pile-up for high purity tracks with pT > 0.9 GeV: efficiency and fake rate vs η (a)-(b), vs pT (c)-(d) and vs number of hits (e)-(f).

3.3 Implementation

3.3.1 Track Associators

In order to evaluate the performance of tracking algorithms, it is necessary to establish whether a reconstructed track is produced by the passage of a charged particle through the tracker detector or is due to a combinatorial effect of randomly positioned hits (fake track). For simulated data this task corresponds to checking whether a reconstructed track is associated to a simulated track. Within the CMS software framework, the association between a reconstructed track (reco::Track) and a simulated track object (TrackingParticle) is performed by the TrackAssociators.

In the base track associator class (TrackAssociatorBase), two pure virtual methods are defined:

• associateRecoToSim: associates reco::Tracks to TrackingParticles, returning a RecoToSimCollection.
• associateSimToReco: associates TrackingParticles to reco::Tracks, returning a SimToRecoCollection.
For each reco::Track (TrackingParticle), the RecoToSimCollection (SimToRecoCollection) association map stores the vector of associated TrackingParticles (reco::Tracks) and the quality of the association. The vector is ordered from the element with the best quality to the element with the worst. The association methods are implemented with a general interface, such that any concrete track object that inherits from reco::Track (e.g. reco::GsfTrack) can be used in the association methods¹. Concrete track associator classes have to implement these methods.

TrackAssociatorByHits

The TrackAssociatorByHits associates reco::Tracks and TrackingParticles on the basis of the number of hits they share. Thus, it basically checks that the hits found during the pattern recognition correspond to a unique simulated track. The hit level association is performed by the TrackerHitAssociator. Given a reconstructed hit, it provides the vector of simulated hits that merged into the reconstructed hit and the ids of the simulated tracks that produced the simulated hits. Within this context, a simulated hit is defined by the entry point of the simulated track in the detector, the energy lost by the particle, its id, its particle type and the process that generated it. The number of hits shared between a reco::Track and a TrackingParticle corresponds to the number of reconstructed hits from that track which, according to the TrackerHitAssociator, are associated to the id of that TrackingParticle.

The TrackAssociatorByHits is a configurable object that can be adapted to any use case by changing the configuration file parameters. These parameters can be grouped into two categories: the parameters tuning the association criteria and those specifying how the number of simulated hits is computed. The first set of parameters is:
• AbsoluteNumberOfHits: false (default) means that the association quality is the fraction of shared hits; true means that the association quality is the absolute number of shared hits.
• Purity_SimToReco: purity cut for the SimToReco association. The purity is defined as the number of shared hits divided by the number of reco::Track hits. The default value is 0.75.
• Quality_SimToReco: quality cut for the SimToReco association (default value 0.5). If AbsoluteNumberOfHits is false, the quality is defined as the number of shared hits divided by the number of TrackingParticle hits, and a TrackingParticle is associated if the quality is greater than Quality_SimToReco and the purity is greater than Purity_SimToReco. If AbsoluteNumberOfHits is true, the quality is defined as the number of shared hits, the purity cut has no effect and a TrackingParticle is associated if the quality is greater than Quality_SimToReco.
• Cut_RecoToSim: quality cut for the RecoToSim association (default value 0.75). The RecoToSim quality is defined as the number of shared hits divided by the number of reco::Track hits, the same as the SimToReco purity. A reco::Track is associated if the quality is greater than Cut_RecoToSim.
• SimToRecoDenominator: by default, for the SimToReco association the quality fraction is defined as the number of shared hits divided by the number of TrackingParticle hits (SimToRecoDenominator = "sim"). If SimToRecoDenominator = "reco", the fraction is defined as the number of shared hits divided by the number of reco::Track hits.

¹ This is accomplished by defining two versions of the association methods, one that takes as input an edm::Handle<edm::View<reco::Track> > and an edm::Handle<TrackingParticleCollection>, the other taking an edm::RefToBaseVector<reco::Track> and an edm::RefVector<TrackingParticleCollection>.
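The SimToReco quality options listed above can be made concrete with a short Python sketch. It covers only the counting of shared hits and the quality definition controlled by AbsoluteNumberOfHits and SimToRecoDenominator; the purity cut and the RecoToSim direction are left out for brevity. The functions and the representation of a reconstructed hit as the set of simulated-track ids that contributed to it are assumptions made for the example and do not reproduce the CMSSW classes.

```python
def count_shared_hits(reco_hit_sim_ids, tp_id):
    """Count the reconstructed hits whose contributing simulated-track ids include tp_id.

    reco_hit_sim_ids: one set of simulated-track ids per reconstructed hit,
    playing the role of the hit-level association result.
    """
    return sum(1 for ids in reco_hit_sim_ids if tp_id in ids)


def sim_to_reco_quality(n_shared, n_sim_hits, n_reco_hits,
                        absolute_number_of_hits=False, denominator="sim"):
    """Return the SimToReco association quality.

    absolute_number_of_hits=True: quality is the number of shared hits.
    Otherwise: shared hits divided by the simulated ("sim") or
    reconstructed ("reco") number of hits.
    """
    if absolute_number_of_hits:
        return float(n_shared)
    n_total = n_sim_hits if denominator == "sim" else n_reco_hits
    return n_shared / n_total if n_total else 0.0


def passes_cut(quality, cut):
    """An association is kept only if the quality is strictly greater than the cut."""
    return quality > cut


if __name__ == "__main__":
    # Four reconstructed hits; the second one merges contributions from ids 7 and 9.
    hits = [{7}, {7, 9}, {9}, {7}]
    shared = count_shared_hits(hits, tp_id=7)
    q_sim = sim_to_reco_quality(shared, n_sim_hits=6, n_reco_hits=len(hits))
    q_reco = sim_to_reco_quality(shared, 6, len(hits), denominator="reco")
    print(shared, q_sim, q_reco, passes_cut(q_sim, 0.5))
```

In this example three of the four reconstructed hits are shared with the simulated track, so the quality is 0.5 with the "sim" denominator (six simulated hits) and 0.75 with the "reco" denominator; with a cut of 0.5 the first value does not pass, because the comparison is strict.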
The track reconstruction algorithms have been designed as flexible, easily configurable processes: in particular, several of the available features lead to a different number and type of reconstructed hits per track. Therefore, in order to correctly compute the number of TrackingParticle hits expected to be reconstructed by any tracking configuration, the TrackAssociatorByHits has to account for all these options and provide a complete set of parameters to handle them:
• UseGrouped: during the track pattern recognition two different behaviors are possible: consider at most one hit per tracker layer (TrajectoryBuilder) or all the hits in case of overlapping layers (GroupedTrajectoryBuilder, default). If UseGrouped is true, the number of TrackingParticle hits is computed counting all its simulated hits in the tracker; if it is false, only one simulated hit per layer is counted.
• UseSplitting: for timing purposes, during the pattern recognition the hits in the double-sided strip layers are merged into one matched hit. Before the final fit, these hits can be split again into their two (mono and stereo) contributions (default). If UseSplitting is false, only one simulated hit in the double-sided strip layers counts for the number of TrackingParticle hits; if true, both mono and stereo hits contribute.
• UsePixels: the track reconstruction can optionally be performed considering only strip hits (pixel-less tracking). If UsePixels is true, the pixel hits are counted for the number of TrackingParticle hits; if false, they are not.
• ThreeHitTracksAreSpecial: if true (default), the tracks with a number of hits equal to three are associated only if all their hits are shared with the correct TrackingParticle.

TrackAssociatorByChi2

The TrackAssociatorByChi2 associates tracks and TrackingParticles on the basis of the χ² (per degree of freedom) computed between the track parameters of the reco::Track and those of the TrackingParticle.
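As a sketch of the quantity involved, and assuming a five-parameter track representation in which only the covariance matrix C of the reconstructed track enters the comparison (an assumption made here for illustration, not a statement about the actual code), the χ² per degree of freedom can be written as:

```latex
\frac{\chi^2}{\mathrm{ndof}} \;=\; \frac{1}{5}\,
  \left(\mathbf{p}^{\mathrm{reco}} - \mathbf{p}^{\mathrm{sim}}\right)^{T}
  C^{-1}
  \left(\mathbf{p}^{\mathrm{reco}} - \mathbf{p}^{\mathrm{sim}}\right)
\qquad\text{and, for diagonal } C:\qquad
\frac{\chi^2}{\mathrm{ndof}} \;=\; \frac{1}{5}\sum_{i=1}^{5}
  \frac{\left(p_i^{\mathrm{reco}} - p_i^{\mathrm{sim}}\right)^2}{\sigma_i^2}
```

Here p^reco and p^sim are the reconstructed and simulated parameter vectors expressed at the same reference point, and σ_i² are the diagonal elements of C. In the diagonal case the expression is simply the mean of the squared pulls of the five parameters.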
If the χ² value is below the cut value, they are associated. The TrackAssociatorByChi2 essentially evaluates the compatibility of the reconstructed track parameters with the simulated ones and provides an estimate of the reconstructed track quality. The χ² is stored in the association map as −χ² because the association map is sorted by decreasing association quality: in this way the first element in the map is still the one with the best quality, i.e. the lowest χ².

The following settings can be specified via the configuration file:
• chi2cut: value of the applied cut (default = 25).
• onlyDiagonal: false by default. Set it to true to use only the diagonal terms of the track parameter covariance matrix.
• beamSpot: name of the module that produced the BeamSpot (default: offlineBeamSpot). The BeamSpot is needed by the TrackAssociatorByChi2 because the TrackingParticle contains the information about the position and the momentum at the point where the particle is created, while the reconstructed track parameters are defined at the point of closest approach to the beam line. Thus, the BeamSpot is used for the propagation of the TrackingParticle parameters from the production point to the point of closest approach to the beam line (note that a similar propagation is also used in TrackProducer, see §2.6).

Seed Association

In order to allow for performance studies of the first two steps of the track reconstruction process, seeding and pattern recognition, the TrackAssociatorBase class also allows the association of TrajectorySeeds and TrackCandidates to TrackingParticles: it defines the corresponding association maps and the associateSimToReco (RecoToSim) methods. These methods have dummy virtual implementations in the TrackAssociatorBase class, so the concrete TrackAssociators can implement their own TrajectorySeed or TrackCandidate association. At the moment, only the TrackAssociatorByHits implements the seed association.
The association algorithm is the same as in the track association, with only the following differences:
• The input and output collections contain TrajectorySeeds instead of reco::Tracks.
• Not only the RecoToSim, but also the SimToReco association fraction is always defined as the number of shared hits divided by the number of reconstructed hits.
• No requirement is made on the purity.

Note that, as the seeds have two or three hits, the default settings (AbsoluteNumberOfHits = false, MinHitCut = 0.5, ThreeHitTracksAreSpecial = true) allow the association only if 100% of the seed hits are shared with the same TrackingParticle: a two-hit seed can share only 0%, 50% or 100% of its hits, and only 100% is greater than the 0.5 cut, while a three-hit seed is required to share all three hits.

3.3.2 Track Validation

The MultiTrackValidator analysis program is fully configurable, as it allows for the choice of the input collection, of the track associator to be used and of several other options. Like the TrackAssociators, the Track Validation Tool makes use of an interface that accepts as input for the validation not only the standard reco::Tracks, but also the electron reco::GsfTracks. MultiTrackValidator takes as input one or more .root files containing previously reconstructed tracks and produces an output file containing the plots. The main configurable parameters of the MultiTrackValidator are:
• label, label_tp_effic and label_tp_fake are the names of the input collections. The first is the vector of the reconstructed track collections to analyze, the others are the collections of TrackingParticles used for efficiency studies and for the fake rate evaluation respectively.
• beamSpot is the module that produced the beam spot to which the track parameters are referred.
• UseAssociators is a flag to control how the association between reco::Tracks and TrackingParticles is performed.
If the association has already been done and the association map is already stored in the input file, UseAssociators has to be set to false; otherwise, to perform a new association during the validation process, it has to be set to true.
• associatormap is the name of the module that produced the association map stored in the input file.
• TrackingParticleSelectionForEfficiency is a filter used to select the TrackingParticles for the evaluation of tracking efficiency in case UseAssociators is false.
• associators is the list of the associators to be used.
• min and max define the pseudorapidity range to be explored, while nint is the number of intervals it is divided into; minpT, maxpT, nintpT and minHit, maxHit, nintHit have the same meaning as min, max, nint but for studies vs pT and vs the number of hits.
• useFabsEta is a flag to fill plots vs the absolute value of the pseudorapidity or vs its signed value.
• useInvPt is a flag to fill plots vs the inverse of the transverse momentum.
• outputFile is the name of the output file containing the performance histograms.

As many different track collections can be processed with different track associators during the same program execution, the output file is organized so as to separate the plots corresponding to each case: several directories are created according to the names in the label and associators vector parameters of the configuration file. Every directory contains the same set of histograms, but filled using a different track collection and a different track associator. For example, a directory named general_AssociatorByHits contains the validation plots obtained with the default CMS track collection, labeled as "generalTracks", and the TrackAssociatorByHits. The plots created by the validation tool can be grouped into different categories:
• Global tracking performances:
– Number of associated and fake reconstructed tracks per event.
– Number of total reco::Tracks, of associated tracks (simToReco and recoToSim) and of simulated tracks vs η, vs pT and vs number of hits.
– Efficiency vs η, vs pT and vs number of hits.
– Fake rate vs η, vs pT and vs number of hits.
– Number of reconstructed vs number of simulated tracks (2D plot).
• Number of hits, χ² and charge distributions:
– Track χ²/ndof and χ²/ndof probability distributions.
– Average track χ²/ndof vs η and χ²/ndof vs η 2D plot.
– Number of valid and lost hits per track: total distributions and vs η.
– Track χ²/ndof vs number of hits, number of valid hits vs η and number of lost hits vs η (2D plots).
– Track charge distribution.
• Pulls and residuals:
– η residual.
– Pull plots of track pT and θ, φ, dxy, dz and q/p parameters.
– Average width of Gaussian fits to the track parameter pull plots vs η.
– Width of Gaussian fits to the track parameter pull plots vs η (2D plots).
– Average ∆pT/pT vs η.
• Resolution of track parameters:
– Average width of Gaussian fits to ∆η, ∆pT/pT, ∆cot θ, ∆φ, ∆dxy and ∆dz distributions vs η and vs pT.
– ∆η, ∆pT/pT, ∆cot θ, ∆φ, ∆dxy and ∆dz distributions vs η and vs pT (2D plots).
• Track association:
– Fraction of shared hits and number of shared hits distributions (TrackAssociatorByHits only).
– Track association χ² and probability of association χ² distributions (TrackAssociatorByChi2 only).
• Cross-checks with simulation:
– pT and η distributions of simulated tracks.
– Number of simulated tracks per event.
– Transverse position of production vertices of simulated tracks.

Seed Validation

The TrackerSeedValidator is a tool that produces a set of histograms useful to test, validate and debug the track seeds. The seed validator probes the seeding performance by comparing every TrajectorySeed with the corresponding TrackingParticle. TrajectorySeeds are matched to TrackingParticles using the seed association provided by the TrackAssociatorByHits.
The evaluation of the efficiency and the purity at the seeding level is very important, since most of the fake reconstructed tracks correspond to badly reconstructed seeds, and because a high number of fake or duplicated seeds heavily degrades the global timing of the track reconstruction process. The TrackerSeedValidator is a branch that originates from the MultiTrackValidator. They inherit from the same base class (MultiTrackValidatorBase), where the common functionalities are stored. A seed object mainly consists of a vector of reconstructed hits (two or three) and a rough estimate of the track parameters on the surface of the outermost hit. In order to compare the seed track parameters with the TrackingParticle ones, the seed parameters are also propagated to the point of closest approach to the beam line. The TrackerSeedValidator works in the same way as the MultiTrackValidator. The only difference is that the parameter label refers to the module that produced the seeds². The output file of the TrackerSeedValidator has the same features as that of the MultiTrackValidator, except that the distributions refer to the seeds instead of the tracks and that the resolution plots are not produced, as they are not meaningful in this case.

² Actually, in the case of the TrackerSeedValidator, an extra parameter, called TTRHBuilder, is added. It defines the component name of the TransientTrackingRecHitBuilder to be used in the TrackerSeedValidator. The default value of this parameter is "WithTrackAngle".

Chapter 4 Outlier Rejection in the Final Track Fit

As discussed in §3.2, tracking in CMS has excellent performance, characterized by high efficiency, purity and resolutions. Nevertheless, the quality of some tracks can be improved: the pulls (defined as residuals divided by their errors) of the track parameters show long non-Gaussian tails.
In other words, sometimes the reconstructed tracks have parameters significantly different from the corresponding simulated ones. This can happen for several reasons: the track can include noise hits, hits belonging to another track or hits generated by δ rays or by other complex processes. Such hits worsen the quality of reconstructed tracks because they provide an incorrect or an inaccurate measurement. These hits are likely to give a large χ² during the final fit: if they were rejected and the fit repeated, the quality of the tracks would improve [13]. The Outlier Rejection algorithm [21] is intended to remove from the final fit the hits with a large χ². The algorithm is designed to fulfill the following criteria:
• Do not worsen the tracking efficiency.
• Decrease the fake rate.
• Improve as much as possible the quality of the tracks.
• Keep the CPU time at a reasonable level.

4.1 Implementation of the Algorithm

The Outlier Rejection is performed during the Final Track Fit (§2.6). The final fit is the last element of the track reconstruction process and is common to all the tracking algorithms in CMS; therefore the Outlier Rejection can be used by all the algorithms. The Outlier Rejection algorithm makes use of the hit χ², defined as the χ² deviation between the hit position and the combined track state on the hit surface. The combined state is the weighted mean of the forward and backward predicted states: it is the best estimate of the track parameters without taking into account the hit that is being processed.

The algorithm works as follows. After the smoothing step, the resulting track is analyzed. If there is at least one hit that has a χ² greater than a χ² threshold, then the hit with the largest χ² is removed and the final fit is restarted. Technically, the hit is removed by replacing it with an invalid hit, which by definition does not provide any measurement but still accounts for the material effects of the hit layer during the propagation.
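A minimal sketch of this loop is given below. It is a toy model and not the CMSSW code: every hit is a one-dimensional measurement of the same quantity with a common uncertainty sigma, the combined state seen by a hit is replaced by the mean of the other valid hits, an invalid hit is represented by None, and the names toy_hit_chi2, reject_outliers, chi2_cut and min_hits are hypothetical. The two stopping conditions are the ones listed in the text that follows, and the breaking of tracks with consecutive invalid hits is not modelled.

```python
def toy_hit_chi2(hits, index, sigma=1.0):
    """chi2 of hit `index` with respect to the mean of the other valid hits."""
    others = [v for i, v in enumerate(hits) if v is not None and i != index]
    prediction = sum(others) / len(others)
    # Variance of the residual: hit error plus error on the prediction.
    variance = sigma ** 2 + sigma ** 2 / len(others)
    return (hits[index] - prediction) ** 2 / variance


def reject_outliers(measurements, chi2_cut, min_hits, sigma=1.0):
    """Return the hit list with outliers set to None, or None if the track is rejected."""
    hits = list(measurements)
    if chi2_cut < 0:
        # A cut of -1 switches the rejection off.
        return hits
    while True:
        valid = [i for i, v in enumerate(hits) if v is not None]
        if len(valid) < max(min_hits, 2):
            # Too few hits left: the track is rejected.
            return None
        chi2 = {i: toy_hit_chi2(hits, i, sigma) for i in valid}
        worst = max(chi2, key=chi2.get)
        if chi2[worst] <= chi2_cut:
            # No hit above the cut: the iteration stops.
            return hits
        # Replace the worst hit with an invalid hit and repeat the "fit".
        hits[worst] = None


track = [0.1, -0.2, 0.0, 0.3, 8.0, -0.1]
print(reject_outliers(track, chi2_cut=20.0, min_hits=3))
```

With these toy values only the fifth measurement, which lies far from all the others, is replaced by an invalid hit; all remaining hits stay below the cut once it has been removed.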
The procedure is iterated until one of the following cases is found:
1. No hit has a χ² above the chosen cut value.
2. The remaining number of hits falls below a minimum number of hits. In this case, the track is rejected.

During this iteration, if a track has two or more consecutive invalid hits, the hits before the first hit of the invalid sequence are kept and the others are discarded. The hit order is given by the track candidate direction. The motivation for breaking these tracks is to separate a primary track from its products, which are expected to be reconstructed during later steps of the iterative tracking (Fig. 4.1).

Figure 4.1: Example of a track that benefits from the breaking after two consecutive invalid hits are found during Outlier Rejection. The two consecutive invalid hits are caused by the decay of a primary track into one or more secondary tracks: keeping only the hits before the first invalid hit corresponds to reconstructing only the primary track.

At the end of this procedure the useless¹ invalid hits at the beginning or at the end of the track are removed. The cut value on the hit χ², the minimum number of remaining hits, and the boolean flags that switch on and off the breaking of the tracks with consecutive invalid hits and the cleaning of the invalid hits at the beginning or at the end of the tracks are parameters that can be easily set via the configuration file. If the χ² threshold is set to -1, the Outlier Rejection is switched off.

4.2 Characterization of the Rejected Hits

Hits rejected by the Outlier Rejection algorithm can be classified into five categories: Bad Hits, Good Hits, δ-ray Hits, Not Shared Hits and Shared Hits. Considering a reconstructed track which is not fake according to the TrackAssociatorByHits, a TrackingParticle is associated to it and each hit of the track can be associated to a vector of simulated hits (SimHits) by means of the hit-level associator.
The categories above correspond to the following cases:
• Good Hits: only one SimHit is associated to the reconstructed hit and it comes from the correct simulated track (i.e. the same TrackingParticle associated to the track).
• Bad Hits: none of the associated SimHits is from the correct TrackingParticle.
• δ-ray Hits: more than one SimHit is associated and all of them are from the correct TrackingParticle. All the SimHits (except the first one) are produced by an electron ionization process.
• Not Shared Hits: also in this case more than one SimHit is associated and all of them are from the correct TrackingParticle, but not all are δ rays.
• Shared Hits: more than one SimHit is associated, at least one coming from the correct TrackingParticle and at least one not.

¹ Actually, invalid hits in the layers between the first hit and the beam line should be introduced to correctly take into account the material during the propagation to the point of closest approach to the beam line (where the track parameters are defined), but this is a general issue valid for all the tracks and not only for those with outliers.

Good Hits are correct and unbiased: they are by far the majority of the hits associated to non-fake tracks and, in principle, should not be removed. Bad Hits are wrong measurements, and thus the Outlier Rejection is expected to reject as many of them as possible. The other three categories correspond to hits that, even if for different causes, provide biased or imprecise measurements. The convenience of rejecting them, of course, depends on how much they are biased with respect to their ideal position. The hit χ² distributions for all the categories above are reported in Fig. 4.2. As expected, Bad Hits have the broadest distribution and the highest mean value, while Good Hits have the sharpest distribution and the lowest mean value.
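The five categories can be expressed as a small decision function. The sketch below is only an illustration of the definitions given in this section: the function name, the category labels and the representation of a SimHit as a (TrackingParticle id, produced-by-δ-ray) pair are hypothetical, and a reconstructed hit with no SimHit at all is assumed here to count as a Bad Hit.

```python
def classify_hit(sim_hits, correct_tp_id):
    """Classify one reconstructed hit of a non-fake track.

    sim_hits: list of (tp_id, is_delta_ray) pairs, in the order returned by
    the hit-level associator (hypothetical representation).
    correct_tp_id: id of the TrackingParticle associated to the track.
    """
    from_correct = [tp_id == correct_tp_id for tp_id, _ in sim_hits]
    if not any(from_correct):
        # Also covers the case of no SimHit at all (assumption).
        return "Bad"
    if not all(from_correct):
        return "Shared"
    if len(sim_hits) == 1:
        return "Good"
    # More than one SimHit, all from the correct TrackingParticle.
    if all(is_delta for _, is_delta in sim_hits[1:]):
        return "Delta-ray"
    return "Not Shared"


print(classify_hit([(7, False), (7, True)], correct_tp_id=7))
```

The order of the checks matters: the test on the TrackingParticle ids comes first, so that the δ-ray flag is only inspected for hits whose SimHits all belong to the correct TrackingParticle.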
As discussed in the next section, the effect of Outlier Rejection is to increase the fraction of Good Hits and decrease the fraction of Bad Hits, while the other categories are mainly unaffected. The fractions of hits per tracker layer corresponding to the five categories are shown in Fig. 4.3 and Fig. 4.4. All the results reported in this chapter have been obtained with CMSSW_2_0_X releases.

4.3 Impact on Tracking Performance

Tracking performance with Outlier Rejection is studied on various data samples.

4.3.1 Results with tt̄ sample

tt̄ events are characterized by a high multiplicity of tracks (≥ 80 tracks per event) and contain jets of different energy as well as isolated tracks; tracks are produced by several types of particles, like muons, electrons or charged pions. For these reasons, tt̄ events constitute an optimal test-bed for Outlier Rejection. Comparing the track collections obtained with and without Outlier Rejection, many differences can be found:
• rejected hits: the same track, reconstructed with a different cut value, differs only by the rejected outlier hits.
• lost hits: after Outlier Rejection the track loses other hits because of fitting failures or track breaking for consecutive invalid hits.
• gained hits: the track gains hits because, after Outlier Rejection, the fit does not fail anymore.
• lost tracks: after Outlier Rejection, the track is not present in the collection anymore.
• gained tracks: a track that was not present in the original collection is found in the collection after Outlier Rejection.
• lost track-association: after Outlier Rejection, a track that was associated is now not associated.
• gained track-association: a track that was not associated is associated.

Figure 4.2: Hit χ² distributions for the hits in the associated (non-fake) tracks, reconstructed without Outlier Rejection: (a) all hits in the collection, (b) Good Hits, (c) Bad Hits, (d) δ-ray Hits, (e) Not Shared Hits and (f) Shared Hits (sample used: tt̄, 500 events, track quality: Pre Filter tracks). Horizontal axis: hit χ², from 0 to 100. Statistics quoted in the panels:

Panel   Category          Entries   Mean    RMS
(a)     All Hits          769161    2.728   6.002
(b)     Good Hits         653690    2.472   5.208
(c)     Bad Hits           20246    9.77    15.54
(d)     δ-ray Hits         76568    2.878   6.528
(e)     Not Shared Hits     3566    4.062   7.836
(f)     Shared Hits        15091    4.605   9.005

Figure 4.3: Fraction of hits per category per tracker layer in the track collection without Outlier Rejection (blue) and in the collection with χ² cut = 20 (red): (a) Good Hits, (b) Bad Hits, (c) δ-ray Hits (sample used: tt̄, 500 events, track quality: Pre Filter tracks).

Figure 4.4: Fraction of hits per category per tracker layer in the track collection without Outlier Rejection (blue) and in the collection with χ² cut = 20 (red): (a) Not Shared Hits, (b) Shared Hits (sample used: tt̄, 500 events, track quality: Pre Filter tracks).
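The numbers of entries quoted in the panels of Fig. 4.2 can be cross-checked against the hit fractions reported for disabled Outlier Rejection (cut = -1) in Table 4.1. The short Python sketch below uses only those quoted entries; the variable and function names are illustrative.

```python
# Entries quoted in the Fig. 4.2 panels (tt-bar sample, Pre Filter tracks,
# Outlier Rejection disabled).
entries = {
    "good": 653690,
    "bad": 20246,
    "delta": 76568,
    "not shared": 3566,
    "shared": 15091,
}
total_hits = 769161  # entries of the "All Hits" panel


def category_fractions(entries, total):
    """Percentage of hits per category, rounded to one decimal place."""
    return {name: round(100.0 * n / total, 1) for name, n in entries.items()}


# The five categories are exclusive and exhaust the hit collection.
print(sum(entries.values()) == total_hits)
print(category_fractions(entries, total_hits))
```

The five categories add up exactly to the total number of hits, and the resulting percentages (85.0, 2.6, 10.0, 0.5 and 2.0) reproduce the cut = -1 entries of Table 4.1.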
To avoid the complication of analyzing how frequently each case happens, the most convenient approach is to look at the fraction of hits per category and at the number of associated and fake tracks as a function of the applied χ² cut. The effect of Outlier Rejection is to increase the fraction of Good Hits and reduce the fraction of Bad Hits in the track collection. The hit fractions for the other categories are not much affected (Tables 4.1, 4.2).

Table 4.1: Fraction of the hit number per category as a function of the applied χ² cut (sample used: tt̄, 500 events, track quality: Pre Filter tracks).

cut   tot hits   good [%]   bad [%]   delta [%]   not shared [%]   shared [%]
 -1   769161     85.0       2.6       10.0        0.5              2.0
  5   662279     86.7       1.1       10.0        0.4              1.7
 10   725108     86.3       1.5       10.0        0.4              1.8
 15   738166     86.0       1.7       10.0        0.4              1.9
 20   744478     85.9       1.8       10.0        0.4              1.9
 25   747813     85.7       1.9       10.0        0.4              1.9
 30   749945     85.7       2.0       10.0        0.4              1.9
 35   751612     85.6       2.0       10.0        0.4              1.9
 40   752740     85.6       2.1       10.0        0.4              1.9
 45   753603     85.5       2.1       10.0        0.5              1.9
 50   754351     85.5       2.1       10.0        0.5              1.9
 55   755034     85.5       2.1       10.0        0.5              1.9
 60   755465     85.5       2.1       10.0        0.5              1.9

In the case of Pre Filter tracks, starting from ∼102 associated tracks and ∼38 fakes per event for disabled Outlier Rejection and using the TrackAssociatorByHits, the effect of activating Outlier Rejection is to decrease both the number of associated tracks and the number of fake tracks, but the number of fakes decreases 2−3 times more than the number of good tracks (Fig. 4.5). For High Purity tracks, instead, the number of associated tracks increases for χ² cut values above 15 and the number of fake tracks decreases for values below 25 (Fig. 4.6). The average numbers of associated and fake tracks per event for disabled Outlier Rejection are 84.4 and 0.6 respectively.

The effect of Outlier Rejection on tracking performance can also be estimated using the standard tracking validation tool for tracks reconstructed with different χ² cut values.
Table 4.2: Fraction of the hit number per category as a function of the applied χ² cut (sample used: tt̄, 500 events, track quality: High Purity tracks).

cut   tot hits   good [%]   bad [%]   delta [%]   not shared [%]   shared [%]
 -1   710949     86.6       1.5       9.8         0.4              1.8
  5   628661     87.7       0.7       9.8         0.3              1.6
 10   693321     87.2       1.0       9.8         0.4              1.7
 15   706079     87.0       1.2       9.8         0.4              1.7
 20   711706     86.8       1.3       9.8         0.4              1.8
 25   714450     86.7       1.3       9.8         0.4              1.8
 30   715979     86.7       1.4       9.8         0.4              1.8
 35   716970     86.6       1.4       9.8         0.4              1.8
 40   717460     86.6       1.4       9.8         0.4              1.8
 45   717786     86.6       1.5       9.8         0.4              1.8
 50   717884     86.6       1.5       9.8         0.4              1.8
 55   718107     86.6       1.5       9.8         0.4              1.8
 60   718088     86.6       1.5       9.8         0.4              1.8

Figure 4.5: Difference in the number of associated (blue) and fake (red) tracks per event with respect to the sample without Outlier Rejection as a function of the applied χ² cut (sample used: tt̄, 500 events, track quality: Pre Filter tracks, associator: TrackAssociatorByHits).

Figure 4.6: Difference in the number of associated (blue) and fake (red) tracks per event with respect to the sample without Outlier Rejection as a function of the applied χ² cut (sample used: tt̄, 500 events, track quality: High Purity tracks, associator: TrackAssociatorByHits).

The plots obtained with Pre Filter tracks are reported in Fig. 4.7. Using the TrackAssociatorByHits, the effect of Outlier Rejection is to reduce both the efficiency (less than 3% for a cut value of 20) and the fake rate (up to 10% for the same cut). Using the TrackAssociatorByChi2, instead, the efficiency increases by a few percent, showing that the quality of the reconstructed tracks is improved by Outlier Rejection. The average χ²/n.d.o.f. is reduced and becomes more compatible with the expected value of one. Of course, the effect of Outlier Rejection on High Purity tracks is less striking because these tracks have already passed very selective quality criteria (Fig. 4.8).
Nevertheless, it is worth noting that the fake rate evaluated with the TrackAssociatorByHits is reduced by about 1% for some η values (χ² cut equal to 20) and that the χ²/n.d.o.f. is significantly closer to one. The improvement in the track quality can be observed by looking at the parameter pulls for the tracks which had at least one hit rejected and were reconstructed both with disabled Outlier Rejection and with a χ² cut value equal to 20; for all the track parameters, the pull plot in case of Outlier Rejection shows a more Gaussian behavior (Fig. 4.9). The plots related to resolutions are mainly unaffected by the Outlier Rejection algorithm. In fact, Outlier Rejection mainly improves the quality of the tracks that populate the tails of the track parameter pulls and, when resolution values are computed, pulls are fitted with a Gaussian: its width provides the resolution value and the pull tails have a small impact on the results of the fit.

The impact of Outlier Rejection on the tracking computing time has also been evaluated. Considering seeding, pattern recognition, final fit and track collection filtering, for tt̄ events, the total computing time per event is about 7 s. The final fit alone takes about 2.5-3 s per event. Outlier Rejection increases the final fit time by about 20% for a χ² cut value of 10, 11% for 20 and 4% for 50. Therefore, for a cut value of 20, the impact on the total tracking computing time is an increase of the order of 5%.
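The last estimate follows from simple arithmetic on the timings quoted above, which the short Python sketch below reproduces. The function name is illustrative, and the calculation assumes that only the final fit time changes when Outlier Rejection is enabled.

```python
def total_time_increase(fit_increase, fit_time_s, total_time_s=7.0):
    """Relative increase of the total tracking time per event.

    fit_increase: relative increase of the final fit time (e.g. 0.11 for 11%).
    fit_time_s:   final fit time per event without Outlier Rejection, in seconds.
    total_time_s: total tracking time per event, in seconds.
    """
    return fit_increase * fit_time_s / total_time_s


# Increases of the final fit time quoted in the text for three chi2 cut values.
for cut, fit_increase in [(10, 0.20), (20, 0.11), (50, 0.04)]:
    low = total_time_increase(fit_increase, 2.5)
    high = total_time_increase(fit_increase, 3.0)
    print(f"cut={cut}: +{100 * low:.1f}% to +{100 * high:.1f}% of the total time")
```

For a cut value of 20 this gives an increase between about 3.9% and 4.7% of the total time, consistent with the quoted order of 5%; for cut values of 10 and 50 the same arithmetic gives roughly 7-9% and 1.4-1.7% respectively.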
Figure 4.7: Impact of the Outlier Rejection algorithm on tracking performance: (a) tracking efficiency evaluated with the TrackAssociatorByHits vs η, (b) fake rate evaluated with the TrackAssociatorByHits vs η, (c) tracking efficiency evaluated with the TrackAssociatorByChi2 vs η, (d) fake rate evaluated with the TrackAssociatorByChi2 vs η and (e) mean track χ²/n.d.o.f. vs η for different χ² cut values (sample used: tt̄, 500 events, track quality: Pre Filter tracks). Curves are shown for cut = -1, 10, 20, 30, 40, 50 and 60; the horizontal axis is |η|.
Figure 4.8: Impact of the Outlier Rejection algorithm on tracking performance: (a) tracking efficiency evaluated with the TrackAssociatorByHits vs η, (b) fake rate evaluated with the TrackAssociatorByHits vs η, (c) tracking efficiency evaluated with the TrackAssociatorByChi2 vs η, (d) fake rate evaluated with the TrackAssociatorByChi2 vs η and (e) mean track χ²/n.d.o.f. vs η for different χ² cut values (sample used: tt̄, 500 events, track quality: High Purity tracks). Curves are shown for cut = -1, 10, 20, 30, 40, 50 and 60; the horizontal axis is |η|.
Figure 4.9: Impact of the Outlier Rejection algorithm on tracking performance: comparison of the track parameter pulls for the tracks with at least one rejected hit which are present both in the collection without Outlier Rejection and in the collection with χ² cut = 20 (sample used: tt̄, 500 events, track quality: High Purity tracks, associator: TrackAssociatorByHits). Panels: (a) dxy pull, (b) dz pull, (c) φ pull, (d) θ pull, (e) q/p pull; each panel overlays the curves labeled cut=-1 and cut=20, with the horizontal axis ranging from -25 to 25. Each histogram has 2976 entries; the RMS and the Gaussian fit width (Sigma) quoted in the statistics boxes are:

Pull   Histograms (Old / Out)             RMS (Old)   RMS (Out)   Sigma (Old)       Sigma (Out)
dxy    histoD0Old / histoD0Out            3.198       2.205       1.121 ± 0.022     1.025 ± 0.021
dz     histoDzOld / histoDzOut            3.682       2.373       0.9403 ± 0.0191   0.9405 ± 0.0192
φ      histoPhiOld / histoPhiOut          2.949       2.516       1.406 ± 0.029     1.185 ± 0.024
θ      histoThetaOld / histoThetaOut      3           2.449       0.9631 ± 0.0210   0.9487 ± 0.0192
q/p    histoQoverpOld / histoQoverpOut    3.725       2.819       2.457 ± 0.058     1.415 ± 0.030

4.3.2 Results with 3000 − 3500 GeV QCD jets sample

The performance of the Outlier Rejection algorithm has also been analyzed on a 3000 − 3500 GeV QCD jets sample. A sample of extremely high-energy jets contains many tracks in a very narrow cone, where the probability for the pattern recognition to select a wrong hit is high. In this case, the best performance is obtained by disabling the trajectory breaking in case of two consecutive invalid hits, because this requirement turns out to be too severe for this kind of events and would lead to an efficiency loss. Results obtained without trajectory breaking are reported in Fig. 4.10. Unfortunately the sample has low statistics and thus the uncertainties are large; however, it seems clear that, mostly in the barrel region, the performance improvement thanks to Outlier Rejection is remarkable: the efficiency increase and the fake rate reduction are at the level of a few percent both with the TrackAssociatorByHits and with the TrackAssociatorByChi2, while the average χ² is significantly closer to one.

4.4 Conclusions

The Outlier Rejection algorithm considerably improves the performance of Pre Filter tracks and also has a positive impact on High Purity tracks. A χ² cut value in the range [20,35] increases the efficiency and reduces both the fake rate and the number of bad hits in the track collection. Further developments could take into account some new features of local reconstruction, such as pixel quality or template fit probability.
[Figure 4.10, panels: (a) and (c) efficiency vs η; (b) and (d) fake rate vs η; (e) mean χ² vs η. Horizontal axis: |η|. Curves for cut = -1, 10, 20, 30, 40, 50, 60.]

Figure 4.10: Impact of the Outlier Rejection algorithm on tracking performance: (a) tracking efficiency evaluated with the TrackAssociatorByHits vs η, (b) fake rate evaluated with the TrackAssociatorByHits vs η, (c) tracking efficiency evaluated with the TrackAssociatorByChi2 vs η, (d) fake rate evaluated with the TrackAssociatorByChi2 vs η and (e) mean track χ²/n.d.o.f. vs η for different χ² cut values (sample used: 3000−3500 GeV QCD jets, 100 events, track quality: High Purity tracks).

Chapter 5
Tracking Efficiency with Cosmic Data during Commissioning

The evaluation of the track reconstruction performance on real data is crucial to check the detector and reconstruction status, measure inefficiencies and understand detector calibration or alignment problems. The first chance to test the track reconstruction performance with the full tracker detector occurred during the commissioning phase at P5.

The tracker detector commissioning has been performed in several stages and in different places for the various parts of the detector. The integration of the four strip sub-detectors¹ was completed at the Tracker Integration Facility (TIF) at CERN in 2006 and 2007, where the first runs on cosmic ray data were performed as well [22].
2161 modules, ∼15% of all the sub-detectors, were sandwiched between scintillator counters providing a trigger coincidence signal. A 5 cm thick lead plate, placed on top of the lower scintillators, served as a shield against tracks with momentum below 200 MeV. For completeness, a trigger configuration is displayed in Fig. 5.1. During the runs, the operating temperature was gradually decreased from +15 °C to −15 °C. At the TIF, more than 4 million events were collected and, for the first time in CMS, processed to reconstruct tracks. The total number of collected tracks is about 2.3M and similar performance results are obtained for the three tracking algorithms used (CKF, RS, CosmicTF). Many analysis methods were developed during the TIF data-taking: as the tracker was the only detector used for these runs, all these methods are based on the tracker data only.

After the TIF operations, in December 2007, the whole silicon strip detector was installed at the CMS site. The barrel pixel detector was built, assembled and tested at PSI (Zurich), while the forward pixel detector was built and assembled at FNAL (Chicago) and later commissioned at the TIF. During Summer 2008, the pixel detectors were inserted into the CMS experiment at P5. Since then, the commissioning of the whole tracker detector with cosmic ray data has been carried out.

¹ The silicon strip modules have been produced in several parts of Europe and of the U.S., then they have been assembled at CERN (TOB), Italy (TIB and TID) and Germany (TEC).

Figure 5.1: Layout of the trigger scintillator position C. The x-y view is shown on the left side, the r-z view is shown on the right. The straight lines connecting the active areas of the top and bottom scintillation counters indicate the acceptance region. In the x-y view, the active TOB modules are shown in contrasting colors while the active TIB area is framed in black.
5.1 CRUZET

CRUZET, an acronym for Cosmic RUn at ZEro Tesla, is the first global run making use of the full CMS detector, trigger and Data Quality Monitor (DQM). The tracker detector joined the CRUZET3 data-taking (7-14 July 2008) with the strip detector only, and the CRUZET4 data-taking (18-25 August 2008) with both the strip and pixel detectors. The tracker operated in the same conditions foreseen for the collision runs: the raw data from the Front-End Driver (FED) boards are collected and promptly reconstructed; these data are then processed to update the alignment and calibration constants and finally re-reconstructed making use of the new conditions.

During these global runs, the tracking algorithms have been used to reconstruct tracks in the whole tracker detector for the first time. The track parameter distributions from the three algorithms are consistent with expectations. In particular, CosmicTF reconstructs the highest number of tracks, especially in the end-cap region, while the CKF and RS algorithms behave similarly. RS, however, reconstructs more tracks at dz = −100 cm because it also uses seeds from the bottom part of the tracker (Fig. 5.2(a)). The η distributions show a peak due to the presence of the shaft (Fig. 5.2(b)), while the φ distributions have a peak for vertical tracks at φ = −π/2 (Fig. 5.2(c)).

The CRUZET performance [23] has been evaluated using the methods developed at the TIF for the standalone tracker analyses. New analyses, exploiting data from other detectors, have also been performed. In particular, the tracking efficiency has been evaluated through two different methods:
• Efficiency using StandAlone muons. Tracks are reconstructed in the muon chambers and, for those pointing to the tracker volume, a matching tracker track is looked for.
• Efficiency using tracker data only. This method, based on the TIF tracking efficiency estimate, is presented in this section.
Figure 5.2: Distribution of the reconstructed track parameters dz (a), η (b) and φ (c) for the three tracking algorithms used to reconstruct cosmic tracks. Some RS tracks are wrongly reconstructed with φ > 0: for such tracks the φ value has to be corrected as φ → φ − π.

5.2 Efficiency Estimate using Tracker Data only

At the TIF only a slice of the strip tracker was installed; thus, the method developed to evaluate the efficiency there considers two independently reconstructed classes of tracks: TOB tracks, which are seeded in the outer TOB layers and tracked in the TOB only, and TIB tracks, seeded in the inner TIB layers and tracked in the TIB only. Two efficiencies are computed: ε(TIB|TOB), the probability to find a matching TIB track for a given TOB track, and, vice versa, ε(TOB|TIB). The two tracks are considered matched when ∆φ < 5σφ, where σφ is the expected azimuthal angle resolution. Only the events with just one reconstructed track are considered.

At P5 the full tracker is installed. Therefore, two independent reconstruction processes can be performed in the top and bottom halves of the tracker. Such an approach is interesting since, at least for the barrel region, the tracks produced by LHC collisions will also be reconstructed in one half of the tracker only. The work-flow of this analysis is:
• Reprocess the data to reconstruct the tracks using the hits in only one half of the tracker.
• Select a high quality reference track.
• Look for a track matching the reference track in the opposite half.
• Evaluate the efficiency.

5.2.1 Track Reconstruction from Top/Bottom Seeds

The tracking algorithms have been customized to allow for cosmic track reconstruction in a single half of the tracker. Tracks have to be independently reconstructed in the top half (Top tracks) and bottom half (Bottom tracks).
For this purpose, it is necessary to modify the tracking algorithms (in the version used to reconstruct cosmic ray tracks, see §2.3-§2.5) already at the seeding level. The details for each algorithm are the following.
• CKF: the CKF algorithm for cosmic tracks makes use, by default, of seeds in the outer layers of the tracker top half (global y coordinate > 0), and then searches for other hits moving downwards along the particle momentum. This is the correct behavior for the reconstruction of Top tracks. Before the final fit, a filter removing the bottom half hits is added. An option for using seed hits with y < 0 only is added to allow for the reconstruction of Bottom tracks. In this case tracks are reconstructed in the direction opposite to the particle momentum. After the pattern recognition, the same filter is applied to hits with y > 0.
• RS: RS seeds have hits both in the top and bottom halves of the tracker. The Top/Bottom track reconstruction is implemented by selecting the seeds with hits only in the correct half of the tracker. The option, described in §2.4, which allows for merging the cosmic tracks in the opposite halves, is not needed in this context. The same hit filter used for CKF is applied to RS tracks too. By default, the RS algorithm reconstructs tracks from the inner layers to the outer ones, forcing the track direction to be along the particle momentum: therefore, tracks in the top half, for which inside-out means opposite to momentum, have an incorrect convention for the momentum direction (the true values are obtained as φ → φ − π and η → −η). For this study, the RS code has been modified to reconstruct tracks with the correct direction in case of top seeding.
• CosmicTF: the CosmicTF is also seeded from both halves of the tracker. Therefore, only seed hits in one half of the tracker are considered and, in case of bottom seeding, tracks are reconstructed in the direction opposite to the particle momentum.
Instead of using the hit filter, hits in the other half of the tracker are discarded during the pattern recognition.

All three algorithms are now properly customized to reconstruct Top and Bottom tracks with independent processes and the correct momentum direction. The dz distributions and the RS φ distributions are shown in Fig. 5.3: Top and Bottom tracks have different dz acceptance regions, and the customized versions of the RS algorithm always reconstruct the correct φ sign. The hit positions for Top and Bottom tracks are reported in Fig. 5.4-5.6. CKF hits are in one half of the tracker only. CosmicTF and RS hit filters, instead, are not fully efficient; however, the hit fraction in the other half is negligible.

5.2.2 Analysis Implementation

Two efficiencies can be computed: ε(T|B) and ε(B|T), which correspond to the probability to find a matching Top track for a given reference Bottom track, and vice versa. Clearly, this method relies on good-quality, non-fake reference tracks pointing to the acceptance region of the other half of the detector. Only events with exactly one track in the reference collection are considered. The following requirements are then applied:
• Nhits ≥ 7
• χ²/ndof ≤ 10
• Nlay ≥ 5
where Nlay is the number of silicon strip layers crossed by the projection of the track in the opposite half of the tracker. The layer propagation can actually return more than one compatible layer; only the first compatible layer per iteration is considered for the Nlay computation. Once a reference Top (Bottom) track is found, the Bottom (Top) track collection is probed looking for a matching track. For this analysis, two tracks match if the difference between their azimuthal angles is smaller than 0.05 rad.
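The selection and matching steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the CMSSW implementation: the `Track` container, the function names, the wrapping of ∆φ into [−π, π) and the simple binomial uncertainty are all assumptions introduced here; only the cut values (Nhits ≥ 7, χ²/ndof ≤ 10, Nlay ≥ 5, |∆φ| < 0.05 rad, exactly one reference track per event) come from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical minimal track record (not a CMSSW class)."""
    phi: float        # azimuthal angle [rad]
    n_hits: int       # number of hits on the track
    chi2_ndof: float  # normalized chi2 of the fit
    n_lay: int = 0    # layers crossed by the projection in the opposite half


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi) (wrapping is an assumption)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def is_good_reference(trk):
    """Reference-track requirements quoted in the text."""
    return trk.n_hits >= 7 and trk.chi2_ndof <= 10.0 and trk.n_lay >= 5


def efficiency(events, max_dphi=0.05):
    """Return (efficiency, binomial error, number of reference tracks).

    `events` is an iterable of (reference_tracks, probe_tracks) pairs,
    e.g. (Bottom tracks, Top tracks) when estimating eps(T|B).
    """
    n_ref = 0
    n_match = 0
    for refs, probes in events:
        if len(refs) != 1:  # only events with exactly one reference track
            continue
        ref = refs[0]
        if not is_good_reference(ref):
            continue
        n_ref += 1
        if any(abs(delta_phi(ref.phi, p.phi)) < max_dphi for p in probes):
            n_match += 1
    if n_ref == 0:
        return float("nan"), float("nan"), 0
    eff = n_match / n_ref
    # Simple binomial uncertainty; the thesis does not state which estimator it uses.
    err = math.sqrt(eff * (1.0 - eff) / n_ref)
    return eff, err, n_ref
```

Swapping the roles of the two collections in `events` gives the other efficiency, ε(B|T).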
[Figure 5.3, panels: (a)-(c) dz distributions, horizontal axis dz [cm]; (d) φ distribution, horizontal axis φ [rad]. Entries from the statistics boxes: CosmicTF 973950, CosmicTF_Top 535588, CosmicTF_Bot 542101; CKF 569553, CKF_Top 549991, CKF_Bot 581177; RS 566977, RS_Top 400972, RS_Bot 413745.]

Figure 5.3: dz distribution for the reconstruction algorithms in their default cosmic tracking configuration (blue), for Top (red) and Bottom (green) tracking: CosmicTF (a), CKF (b) and RS (c). φ distribution for the RS algorithm (d).

[Figure 5.4, panels: hitsValidPosXYTop and hitsValidPosXYBot (axes x [cm], y [cm]); hitsValidPosZRTop and hitsValidPosZRBot (axes r [cm], z [cm]).]

Figure 5.4: Hit position in the XY plane for CosmicTF Top (a) and Bottom (b) tracks. Hit position in the RZ plane for CosmicTF Top (c) and Bottom (d) tracks.
[Figure 5.5, panels: hitsValidPosXYTop and hitsValidPosXYBot (axes x [cm], y [cm]); hitsValidPosZRTop and hitsValidPosZRBot (axes r [cm], z [cm]).]

Figure 5.5: Hit position in the XY plane for CKF Top (a) and Bottom (b) tracks. Hit position in the RZ plane for CKF Top (c) and Bottom (d) tracks.

[Figure 5.6, panels: hitsValidPosXYTop and hitsValidPosXYBot (axes x [cm], y [cm]); hitsValidPosZRTop and hitsValidPosZRBot (axes r [cm], z [cm]).]

Figure 5.6: Hit position in the XY plane for RS Top (a) and Bottom (b) tracks. Hit position in the RZ plane for RS Top (c) and Bottom (d) tracks.

5.3 Results

This method has been applied to both MC and CRUZET4 data, providing the efficiency plots as a function of track parameters and applied cuts. The results are reported in Table 5.1, Fig. 5.7 and Fig. 5.8.

Table 5.1: Top/Bottom and Bottom/Top efficiencies. Values are in %.

Algorithm   Sample     ε(T|B)        ε(B|T)
CosmicTF    CRUZET4    89.7 ± 0.4    90.3 ± 0.3
CosmicTF    MC         92.8 ± 1.0    91.9 ± 0.7
CKF         CRUZET4    90.2 ± 0.5    90.8 ± 0.4
CKF         MC         93.3 ± 1.2    92.5 ± 0.9
RS          CRUZET4    86.2 ± 0.4    84.9 ± 0.4
RS          MC         87.3 ± 0.8    87.4 ± 0.8

Measured CRUZET results are essentially consistent with Monte Carlo. Some small differences remain, which can be explained by a not yet complete commissioning of the detector. The overall efficiency is ≥ 90% for CosmicTF and CKF and around 85% for RS.
For vertical tracks in the central region of the tracker it can reach values of ≥ 95% and ≥ 90% respectively.

The dependence on the reference track selection cuts is further investigated (Fig. 5.9). The efficiency increases, as expected, with the number of hits and layers crossed by the projection of the reference track, while it decreases for increasing χ² values. Results are quite stable in time (Fig. 5.9(d)).

The same method is applied using Top and Bottom tracks reconstructed with different algorithms. Results are shown in Fig. 5.10 and Fig. 5.11, proving that the efficiencies are almost independent of the algorithm used for the reference tracks and thus providing a good consistency check.

[Figure 5.7, panels: ε(B|T) vs φ (φ [rad]), ε(B|T) vs η, ε(B|T) vs dxy (dxy [cm]), ε(B|T) vs dz (dz [cm]). Legend: ckf, ckf-mc, cosmic, cosmic-mc, rs, rs-mc.]

Figure 5.7: ε(B|T) vs azimuthal angle φ (a), pseudorapidity η (b), transverse impact parameter dxy (c) and longitudinal impact parameter dz (d).

[Figure 5.8, panels: ε(T|B) vs φ (φ [rad]), ε(T|B) vs η, ε(T|B) vs dxy (dxy [cm]), ε(T|B) vs dz (dz [cm]). Legend: ckf, ckf-mc, cosmic, cosmic-mc, rs, rs-mc.]

Figure 5.8: ε(T|B) vs azimuthal angle φ (a), pseudorapidity η (b), transverse impact parameter dxy (c) and longitudinal impact parameter dz (d).
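The size of the data/MC differences quoted in Table 5.1 can be made explicit with a short calculation. The sketch below is an illustration rather than part of the original analysis: it assumes that the quoted uncertainties are Gaussian and uncorrelated between data and simulation, and the dictionary layout and function names are hypothetical. The numerical values are copied from Table 5.1.

```python
import math

# Efficiencies in % from Table 5.1, stored as (value, uncertainty).
TABLE_5_1 = {
    ("CosmicTF", "T|B"): {"data": (89.7, 0.4), "mc": (92.8, 1.0)},
    ("CosmicTF", "B|T"): {"data": (90.3, 0.3), "mc": (91.9, 0.7)},
    ("CKF", "T|B"): {"data": (90.2, 0.5), "mc": (93.3, 1.2)},
    ("CKF", "B|T"): {"data": (90.8, 0.4), "mc": (92.5, 0.9)},
    ("RS", "T|B"): {"data": (86.2, 0.4), "mc": (87.3, 0.8)},
    ("RS", "B|T"): {"data": (84.9, 0.4), "mc": (87.4, 0.8)},
}


def data_mc_pull(data, mc):
    """Return (difference, combined uncertainty, difference / uncertainty).

    Assumes independent Gaussian uncertainties, added in quadrature.
    """
    diff = data[0] - mc[0]
    sigma = math.hypot(data[1], mc[1])
    return diff, sigma, diff / sigma


def all_pulls(table=TABLE_5_1):
    """Map each (algorithm, efficiency) key to its data-MC pull."""
    return {key: data_mc_pull(v["data"], v["mc"])[2] for key, v in table.items()}


if __name__ == "__main__":
    for (algo, kind), pull in all_pulls().items():
        print(f"{algo:9s} eps({kind}): (data - MC) / sigma = {pull:+.2f}")
```

Under these assumptions every CRUZET4 value lies below the corresponding MC value, by less than three combined standard deviations in all six cases.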
[Figure 5.9, panels: ε(T|B) vs Nhits (number of hits), ε(T|B) vs Nlay (number of layers), ε(T|B) vs χ² (track χ²/n.d.o.f.), ε(T|B) vs run (run number). Legend: ckf, cosmic, rs.]

Figure 5.9: Dependence of ε(T|B) on the applied cut for the reference Bottom track selection (number of track hits (a), number of Top layers crossed by the track projection (b), track χ² (c)) and dependence on the run number (d).

[Figure 5.10, panels: ε(B|T) vs dxy, horizontal axis dxy [cm]. Legends list the Top/Bottom algorithm combinations, e.g. cosmicTop ckfBot, ckfTop rsBot, rsTop cosmicBot.]

Figure 5.10: ε(B|T) vs dxy using the same algorithm for Bottom and Top tracks (a) and reconstructing Top tracks with: CosmicTF (b), CKF (c) and RS (d).

[Figure 5.11, panels: ε(T|B) vs dxy, horizontal axis dxy [cm]. Legends list the Top/Bottom algorithm combinations, e.g. ckfTop cosmicBot, cosmicTop ckfBot, rsTop rsBot.]

Figure 5.11: ε(T|B) vs dxy using the same algorithm for Bottom and Top tracks (a) and reconstructing Bottom tracks with: CosmicTF (b), CKF (c) and RS (d).

Chapter 6
Summary on Tracking

In the first part of this thesis, the track reconstruction issues in CMS have been addressed.
Tracking is one of the fundamental elements of the event reconstruction, being the basis for most higher level algorithms. The porting of the final fit code from the old software framework to the new one has been completed and new features have been implemented. Then, starting from some tools developed to check the results of the final fit algorithm, the CMS Track Validation Tool has been developed. This fully configurable analysis program evaluates the tracking performance for any tracking algorithm and software release. The Validation Tool showed that the quality of some tracks is not optimal. Thus, the Outlier Rejection algorithm, which improves the tracking performance by removing the large-χ² hits during the final track fit, has been developed.

The first cosmic ray data, taken during the tracker commissioning at the CMS site, have been analyzed. In particular, the tracking efficiency using tracker information only has been evaluated. The efficiency turns out to be of the order of 90% for all the tracking algorithms and matches the Monte Carlo prediction. Such results are remarkable, especially considering that the tracker is still in the commissioning phase and that both the detector and the algorithms are not designed for cosmic ray track reconstruction.

In conclusion, the CMS detector is now taking data in a cosmic global run. These data are used for detector commissioning and to evaluate the reconstruction performance. They prove that the experiment, and in particular the tracker detector, is ready for the upcoming beam collisions and for the physics challenges of the Large Hadron Collider. One of the most appealing challenges, the search for the MSSM Higgs boson, is addressed in the second part of the thesis.

Chapter 7
Motivations for the Minimal Supersymmetric Standard Model

The goal of this chapter is to give a short review of the theoretical background for the search for the Heavy Neutral CP-odd MSSM Higgs Boson A.
First, the open problems of the Standard Model and the need for a Higgs boson will be recalled. Then, the motivations for Supersymmetry, and in particular for the MSSM, will be briefly presented. Finally, the Higgs particle spectrum in the MSSM will be described.

7.1 Open Problems of the Standard Model

The Standard Model (SM) [24, 25, 26, 27, 28] is a quantum theory that describes how all the known fundamental particles interact via the strong, weak and electromagnetic forces. In its present formulation, the SM is a gauge theory with an $SU(3)_C \times SU(2)_L \times U(1)_Y$ symmetry. The SM particle masses and interactions have been tested at many collider experiments, proving that the SM provides an incredibly successful description of Nature up to energies of the order of 100 GeV, with no hints of additional structures. Certainly, a new framework will be required at the Planck scale $M_P = (8\pi G_{\rm Newton})^{-1/2} = 2.4 \times 10^{18}$ GeV, where quantum gravitational effects become important, but, because of some open issues, it seems reasonable that New Physics can be discovered at the TeV scale.

The main open problem of the Standard Model is the origin of the fundamental particle masses. In fact, the gauge symmetry forbids writing a mass term for the gauge bosons. Fermionic mass terms are also not possible, because they would mix the left- and right-handed fields, which have different transformation properties, and therefore would produce an explicit breaking of the gauge symmetry. Thus, the SM Lagrangian only contains massless fields. Nevertheless, experimental observations show that both the elementary fermions and the weak force bosons (W and Z) are massive. Also, they show that the electromagnetic and weak forces have similar behaviors at high energies, while, for energies below the TeV scale, their symmetry is broken. As the Electro-Weak Symmetry Breaking (EWSB) occurs at the scale of ∼100 GeV, new phenomena are expected in the TeV range or below.
Other open problems of the SM are the explanation of dark matter, the unification of the forces at high energies, the quantum description of gravity and the origin of the predominance of matter over anti-matter.

7.1.1 The Higgs Mechanism

The Higgs Theory [29, 30] predicts the presence of a new field which breaks the electroweak symmetry, gives masses to fermions and disturbs neither gravity nor electromagnetism. The simplest version of the Higgs field has been included in the SM and consists of an SU(2) doublet of complex scalar fields:

\[ \phi = \begin{pmatrix} \phi_\alpha \\ \phi_\beta \end{pmatrix} = \sqrt{\frac{1}{2}} \begin{pmatrix} \phi_1 + i\phi_2 \\ \phi_3 + i\phi_4 \end{pmatrix} \tag{7.1} \]

(where $\phi_j$ are four real scalar fields) that contributes to the Lagrangian with the following potential term:

\[ V(\phi) = \mu^2 \phi^\dagger\phi + \frac{\lambda}{2} (\phi^\dagger\phi)^2 \tag{7.2} \]

with $\mu^2 < 0$ and $\lambda > 0$. When a particular minimum of the potential is chosen, the gauge symmetry is spontaneously broken (Fig. 7.1). In fact, $V(\phi)$ is gauge invariant, but the field acquires a nonzero vacuum expectation value (v.e.v.):

\[ \langle \phi^\dagger\phi \rangle = v^2 = -\frac{\mu^2}{\lambda} . \tag{7.3} \]

Figure 7.1: Schematic view of the shape of the Higgs potential V. It is symmetric with respect to zero, but, in order to minimize it, a non-zero field value has to be assumed and thus the symmetry is spontaneously broken.

Expanding the field about this minimum, it turns out that a particular gauge can be chosen so that, out of the four scalar fields, only one massive physical degree of freedom remains, the Higgs boson H. The other three massless degrees of freedom are Goldstone bosons, which can be regarded as the longitudinal polarizations of the weak force bosons $W^\pm$, $Z$, thus providing them with mass.

A Yukawa interaction between the fermions and the Higgs field $\phi$ is allowed in the Lagrangian: this interaction term, after the EWSB, results in a fermionic mass term. With this mechanism, both the bosons and the fermions acquire a mass value that is proportional to their coupling to the Higgs field.
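The minimization behind eq. (7.3) can be written out explicitly. The following short derivation starts from the potential of eq. (7.2); the shorthand $\rho = \phi^\dagger\phi$ and the parametrization of the doublet in terms of a single real field $h$ are notational assumptions introduced here for illustration and are not taken from the text.

```latex
\begin{align*}
V(\rho) &= \mu^2 \rho + \frac{\lambda}{2}\,\rho^2 ,
  \qquad \rho \equiv \phi^\dagger\phi \ge 0 , \\
\frac{dV}{d\rho} &= \mu^2 + \lambda\rho = 0
  \;\Longrightarrow\;
  \rho_0 = -\frac{\mu^2}{\lambda} \equiv v^2 > 0
  \quad (\mu^2 < 0,\ \lambda > 0) , \\
\frac{d^2V}{d\rho^2} &= \lambda > 0 ,
  \qquad
  V(\rho_0) = -\frac{\lambda}{2}\,v^4 < V(0) = 0 . \\[4pt]
\text{With } \phi &= \begin{pmatrix} 0 \\ v + h/\sqrt{2} \end{pmatrix}
  \ (h \text{ real}): \quad
  \rho = \Bigl(v + \tfrac{h}{\sqrt{2}}\Bigr)^2 , \quad
  \left.\frac{d\rho}{dh}\right|_{h=0} = \sqrt{2}\,v , \\
\left.\frac{d^2V}{dh^2}\right|_{h=0}
  &= \underbrace{\left.\frac{dV}{d\rho}\right|_{\rho_0}}_{=\,0}\frac{d^2\rho}{dh^2}
   + \left.\frac{d^2V}{d\rho^2}\right|_{\rho_0}
     \left(\frac{d\rho}{dh}\right)^2
   = \lambda \cdot 2v^2 = 2\lambda v^2 .
\end{align*}
```

The point $\rho = 0$ is therefore not the minimum, and the curvature along $h$ reproduces the dependence on $\lambda$ and $v$ of the Higgs mass quoted in eq. (7.5), under the normalization assumed above.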
Summarizing, adding the Higgs potential to the SM Lagrangian, it is possible to preserve the gauge invariance of the Lagrangian itself ($SU(3)_C \times SU(2)_L \times U(1)_Y$). When the Higgs field assumes a minimum of the potential, this symmetry is spontaneously broken: the gauge invariance reduces to $SU(3)_C \times U(1)_{em}$, fermions and bosons acquire a mass proportional to their coupling to the Higgs:

\[ M_V^2 = g_{\phi VV}\,\frac{v}{2} , \qquad m_f = h_f\, v \tag{7.4} \]

and there is only one additional particle in the SM spectrum, the Higgs boson H, whose mass depends on $\lambda$ and $v$:

\[ m^2_{H_{SM}} = 2\lambda v^2 . \tag{7.5} \]

The Z and W masses are experimentally well known:

\[ M_Z = 91.1875 \pm 0.0021\ \mathrm{GeV}, \qquad M_W = 80.398 \pm 0.025\ \mathrm{GeV} . \tag{7.6} \]

From these values, the electroweak mixing angle $\theta_W$ and the Higgs v.e.v. can be obtained:

\[ \sin^2\theta_W = 1 - \frac{M_W^2}{M_Z^2} = 0.223, \qquad v = 246\ \mathrm{GeV} . \tag{7.7} \]

The value of $v$ is commonly referred to as the electroweak scale.

From a theoretical point of view, this mechanism solves in a simple way the problem of the elementary particle masses and of the Electro-Weak symmetry breaking. Still, the Higgs boson has not been observed experimentally, and it also entails some theoretical difficulties. The most important is the Hierarchy Problem.

7.1.2 The Hierarchy Problem

The SM can be regarded as an effective theory, valid in a certain range of energies, from ∼MeV up to an unknown scale $\Lambda$. Its low energy quantities (masses, couplings) are expected to be functions of the parameters of a more fundamental theory valid at a scale $Q > \Lambda$. The Higgs boson mass is subject to one-loop quantum corrections (Fig. 7.2), depending on the Higgs coupling to bosons and fermions, their masses and the cutoff scale $\Lambda$:

\[ \Delta m^2 = \frac{|\lambda_f|^2}{16\pi^2} \left[ -2\Lambda^2 + m_f^2 \log\frac{\Lambda}{m_f} \right] + \frac{\lambda_S}{16\pi^2} \left[ \Lambda^2 - m_S^2 \log\frac{\Lambda}{m_S} \right] \tag{7.8} \]

[Figure 7.2, diagram labels: (a) loop of fermions $f_L$, $f_R$ attached to two external h lines; (b) loop of scalars $S_L + S_R$ attached to two external h lines.]

Figure 7.2: Fermionic (a) and bosonic (b) contributions to the Higgs mass one-loop quantum correction.

In Fig. 7.3 the latest fit of the Higgs mass in the SM is shown.
The χ² has a minimum at around 100 GeV, although the direct search bound is $m_H > 114$ GeV. The indirect limit implies that $m_H \le 246$ GeV at 95% C.L. [34]. Thus, the SM Higgs boson is expected to have a mass value well below the TeV scale.

Figure 7.3: The Higgs mass constraints in the SM.

For this reason, the quantum correction to the Higgs mass cannot be too large. From eq. (7.8) it is clear that this correction is quadratically divergent with respect to $\Lambda$. Therefore, to keep the correction small, either $\Lambda$ should be ≤ 1 TeV or an extreme fine tuning of the parameters should occur.

7.2 A brief introduction to Supersymmetry and MSSM

The main idea leading to Supersymmetry [32, 33] (SUSY) is to find a solution for the Hierarchy Problem. Referring to eq. (7.8), if there were two scalars for each fermion, then it would read:

\[ \Delta m^2 = \frac{\lambda_S - |\lambda_f|^2}{8\pi^2}\,\Lambda^2 + O\!\left(\log\frac{\Lambda}{m}\right) + \dots \tag{7.9} \]

In addition, if their couplings were $\lambda_S = |\lambda_f|^2$, then the quadratic divergence would exactly cancel:

\[ \Delta m^2 = O\!\left(\log\frac{\Lambda}{m}\right) + \dots \tag{7.10} \]

Therefore, if a symmetry between fermions and bosons existed that, for each fermion, implied the presence of a boson with the same coupling to the Higgs (and vice versa), then the Hierarchy Problem would be solved. Such a hypothetical symmetry is called Supersymmetry. It can be shown that the generators Q of such a symmetry must have the following properties:
• $Q|\mathrm{Boson}\rangle = |\mathrm{Fermion}\rangle$ and $Q|\mathrm{Fermion}\rangle = |\mathrm{Boson}\rangle$.
• they are spinors, thus carrying spin 1/2.
• they satisfy the following (anti)commutation rules: $\{Q, Q^\dagger\} = P^\mu$, $\{Q, Q\} = \{Q^\dagger, Q^\dagger\} = 0$ and $[P^\mu, Q] = [P^\mu, Q^\dagger] = 0$, where $P^\mu$ is the four-momentum generator of space-time translations.

The effect of Q is to transform SM particles into their Supersymmetric partners, called Superpartners. The pair of a particle and its Superpartner is called a Supermultiplet. From the properties above, it turns out that, for each Supermultiplet, the following rules hold:
• Superpartners have the same couplings.
• Superpartners have the same mass.
• Supermultiplets are made of one state of helicity $\lambda_{max}$ and a state with helicity $\lambda_{min} = \lambda_{max} - 1/2$.

Two kinds of Supermultiplets can be defined: Chiral and Vector Supermultiplets. Chiral (or Matter) Supermultiplets are composed of a state with $\lambda_{max} = 1/2$ and another one with $\lambda_{min} = 0$. The simplest example is a Weyl fermion and two real (one complex) scalars, called sfermion. Vector (or Gauge) Supermultiplets, instead, have $\lambda_{max} = 1$ and $\lambda_{min} = 1/2$ like, for example, a gauge vector boson and a fermion called gaugino.

Using these rules, the simplest Supersymmetric extension of the SM does not add any particle to the SM spectrum, except the Superpartners of the SM particles. This model is called the Minimal Supersymmetric Standard Model (MSSM). The Chiral Supermultiplets are reported in Table 7.1, while the Vector ones are reported in Table 7.2.

Table 7.1: Chiral Supermultiplets in the Minimal Supersymmetric Standard Model. The spin-0 fields are complex scalars, and the spin-1/2 fields are left-handed two-component Weyl fermions.

Names                                      spin 0                               spin 1/2                                       SU(3)_C, SU(2)_L, U(1)_Y
squarks, quarks (×3 families)    $Q$        $(\tilde{u}_L\ \tilde{d}_L)$         $(u_L\ d_L)$                                   $(\mathbf{3}, \mathbf{2}, \frac{1}{6})$
                                 $\bar{u}$  $\tilde{u}^*_R$                      $u^\dagger_R$                                  $(\bar{\mathbf{3}}, \mathbf{1}, -\frac{2}{3})$
                                 $\bar{d}$  $\tilde{d}^*_R$                      $d^\dagger_R$                                  $(\bar{\mathbf{3}}, \mathbf{1}, \frac{1}{3})$
sleptons, leptons (×3 families)  $L$        $(\tilde{\nu}\ \tilde{e}_L)$         $(\nu\ e_L)$                                   $(\mathbf{1}, \mathbf{2}, -\frac{1}{2})$
                                 $\bar{e}$  $\tilde{e}^*_R$                      $e^\dagger_R$                                  $(\mathbf{1}, \mathbf{1}, 1)$
Higgs, higgsinos                 $H_u$      $(H_u^+\ H_u^0)$                     $(\tilde{H}_u^+\ \tilde{H}_u^0)$               $(\mathbf{1}, \mathbf{2}, +\frac{1}{2})$
                                 $H_d$      $(H_d^0\ H_d^-)$                     $(\tilde{H}_d^0\ \tilde{H}_d^-)$               $(\mathbf{1}, \mathbf{2}, -\frac{1}{2})$

Table 7.2: Gauge Supermultiplets in the Minimal Supersymmetric Standard Model.

Names                spin 1/2                               spin 1             SU(3)_C, SU(2)_L, U(1)_Y
gluino, gluon        $\tilde{g}$                            $g$                $(\mathbf{8}, \mathbf{1}, 0)$
winos, W bosons      $\tilde{W}^\pm\ \tilde{W}^0$           $W^\pm\ W^0$       $(\mathbf{1}, \mathbf{3}, 0)$
bino, B boson        $\tilde{B}^0$                          $B^0$              $(\mathbf{1}, \mathbf{1}, 0)$

The only difference with respect to the SM is the presence of two Higgs doublets instead of one. In fact, in the SM Lagrangian, H gives mass to up-type quarks, while its conjugate $H^c$ gives mass to down-type fermions.
In a Supersymmetric model, the complex conjugate would make the Lagrangian not invariant under the Supersymmetric transformation; therefore, two different scalar fields are needed to give mass to up- and down-type fermions.

It is beyond the purposes of this introduction to give a detailed description of a generic Supersymmetric Lagrangian, so only the details needed to understand the properties of the MSSM Higgs bosons will be addressed. The most general Supersymmetric Lagrangian (eq. (7.11)) is composed of three terms: one for the Chiral Supermultiplets ($\mathcal{L}_{chiral}$), one for the Gauge Supermultiplets ($\mathcal{L}_{gauge}$) and an additional term ($\mathcal{L}_{add}$) with other gauge invariant interaction terms formed out of the fields in Chiral and Gauge Supermultiplets not taken into account in covariant derivatives.

\[ \mathcal{L} = \mathcal{L}_{chiral} + \mathcal{L}_{gauge} + \mathcal{L}_{add} \tag{7.11} \]

In particular, defining the scalar fields as $\phi$ and the Weyl spinors as $\psi$, the Chiral term can be written in terms of a function of the scalar fields called the Superpotential W as:

\[ \mathcal{L}_{chiral} = -\partial^\mu \phi^{*i} \partial_\mu \phi_i - i\psi^{\dagger i} \bar{\sigma}^\mu \partial_\mu \psi_i - \frac{1}{2} \left( W^{ij} \psi_i \psi_j + W^*_{ij} \psi^{\dagger i} \psi^{\dagger j} \right) - W^i W^*_i , \tag{7.12} \]

where $W$, $W^i$ and $W^{ij}$ are defined as:

\[ W = \frac{1}{2} M^{ij} \phi_i \phi_j + \frac{1}{6} y^{ijk} \phi_i \phi_j \phi_k \tag{7.13} \]

\[ W^i = \frac{\delta W}{\delta \phi_i} = M^{ij} \phi_j + \frac{1}{2} y^{ijk} \phi_j \phi_k , \qquad W^{ij} = \frac{\delta^2 W}{\delta \phi_i\, \delta \phi_j} = M^{ij} + y^{ijk} \phi_k \tag{7.14} \]

The matrices $M^{ij}$ and $y^{ijk}$ are totally symmetric in their indices and correspond to the fermion mass matrix and the Yukawa couplings respectively. In the MSSM case, the Superpotential is:

\[ W_{MSSM} = \bar{u}\, y_u\, Q H_u - \bar{d}\, y_d\, Q H_d - \bar{e}\, y_e\, L H_d + \mu H_u H_d . \tag{7.15} \]

The objects $H_u$, $H_d$, $Q$, $L$, $\bar{u}$, $\bar{d}$, $\bar{e}$ appearing here are chiral scalar Superfields corresponding to the chiral Supermultiplets in Table 7.1. The dimensionless Yukawa coupling parameters $y_u$, $y_d$, $y_e$ are 3 × 3 matrices in family space. The $\mu$ term in eq. (7.15) is the Supersymmetric version of the Higgs boson mass in the Standard Model.
As already stated, no other Higgs mass terms are allowed, because terms like $H_u^* H_u$ or $H_d^* H_d$ in the Superpotential would lead to a Lagrangian that is not invariant under Supersymmetry and are thus forbidden. Direct fermion mass terms are also forbidden in the Lagrangian, so fermions acquire mass from the Yukawa couplings, through terms such as:

\[ \frac{\partial^2 W}{\partial \tilde{u}_R^* \, \partial \tilde{u}_L} \, \bar{u} u = \bar{u} y_u u H_u^0 \qquad (7.16) \]

In general, the couplings due to the Superpotential have the form reported in Fig. 7.4, where dashed lines correspond to scalars and full lines to fermions. The Supersymmetric solution of the Hierarchy Problem implies that fermions and scalars in the same Supermultiplet have the same couplings: $\lambda_S = |\lambda_f|^2$, or $y^{ijm} y^*_{klm} = |y_{ijk}|^2$. Also, from the symmetry of the matrix $M$ it turns out that $(M^2)_i^{\ j} = M^*_{ik} M^{kj}$, and thus they have the same masses.

Figure 7.4: The Superpotential interaction vertices in a Supersymmetric theory: (a) scalar-fermion-fermion Yukawa interaction $y^{ijk}$; (b) quartic scalar interaction $y^{ijm} y^*_{klm}$; (c) (scalar)$^3$ interaction vertex $M^*_{im} y^{jkm}$; (d) fermion mass term $M^{ij}$; (e) scalar squared-mass term $M^*_{ik} M^{kj}$.

The $\mu$ term in eq. (7.15) provides the higgsino fermion mass contribution to the Lagrangian:

\[ L_{higgsino\ mass} = -\mu \left( \tilde{H}_u^+ \tilde{H}_d^- - \tilde{H}_u^0 \tilde{H}_d^0 \right) + c.c. \qquad (7.17) \]

as well as the Higgs squared-mass terms:

\[ L_{supersymmetric\ Higgs\ mass} = -|\mu|^2 \left( |H_u^0|^2 + |H_u^+|^2 + |H_d^0|^2 + |H_d^-|^2 \right) \qquad (7.18) \]

The Supersymmetric Higgs mass term cannot induce a spontaneous Electro-Weak Symmetry Breaking (EWSB), because $H_u^0 = H_d^0 = 0$ is a minimum for the potential of eq. (7.18). Of course, EWSB is a necessary feature of a consistent theoretical description and, in order to be an acceptable theory, Supersymmetry has to account for it.
Moreover, an exact Supersymmetry would imply that the Supersymmetric particles have the same mass as their SM partners, while, for example, no experimental hint of a Supersymmetric electron has been seen at a mass of 0.5 MeV. These facts suggest that, if Supersymmetry is a valid theory, it must be spontaneously broken at energies below the TeV scale. Therefore, unless a specific mechanism of Supersymmetry breaking is known, no information on the spectrum can be obtained.

The cancellation of quadratic divergences in eq. (7.8) relies on the equality of the couplings and not on the equality of the masses of particles and Superpartners. Soft Supersymmetry Breaking terms, which give different masses to SM particles and their Superpartners but preserve the structure of the couplings of the theory, can therefore be included in the Lagrangian. In the MSSM the Soft Supersymmetry Breaking contribution to the Lagrangian is:

\[
\begin{aligned}
L_{soft}^{MSSM} = &-\frac{1}{2} \left( M_3 \tilde{g}\tilde{g} + M_2 \tilde{W}\tilde{W} + M_1 \tilde{B}\tilde{B} + c.c. \right) \\
&- \left( \tilde{\bar{u}}\, a_u \tilde{Q} H_u - \tilde{\bar{d}}\, a_d \tilde{Q} H_d - \tilde{\bar{e}}\, a_e \tilde{L} H_d + c.c. \right) \\
&- \tilde{Q}^\dagger m_Q^2 \tilde{Q} - \tilde{L}^\dagger m_L^2 \tilde{L} - \tilde{\bar{u}}\, m_{\bar{u}}^2\, \tilde{\bar{u}}^\dagger - \tilde{\bar{d}}\, m_{\bar{d}}^2\, \tilde{\bar{d}}^\dagger - \tilde{\bar{e}}\, m_{\bar{e}}^2\, \tilde{\bar{e}}^\dagger \\
&- m_{H_u}^2 H_u^* H_u - m_{H_d}^2 H_d^* H_d - \left( b H_u H_d + c.c. \right)
\end{aligned}
\qquad (7.19)
\]

In eq. (7.19), $M_3$, $M_2$ and $M_1$ are the gluino, wino and bino mass terms (the adjoint-representation gauge indices of the gauginos and the gauge indices of the chiral fields are suppressed). The second line in eq. (7.19) contains the (scalar)$^3$ couplings. Each of $a_u$, $a_d$, $a_e$ is a complex 3 × 3 matrix in family space, with dimensions of [mass]. They are in one-to-one correspondence with the Yukawa couplings of the Superpotential. The third line consists of squark and slepton mass terms of type $(m^2)_i^{\ j}$. Finally, the last line of eq. (7.19) contains the Supersymmetry-breaking contributions to the Higgs potential.

The soft Supersymmetry breaking terms add to the theory a huge number of parameters (105), which are all expected to be of the order of 100 GeV - 1 TeV.
These parameters are actually constrained by several experimental results, like individual lepton number conservation, CP violation and $K^0$ mixing, and by some theoretical arguments, like the assumption that the sector generating the soft-breaking terms is flavor-blind.

The Higgs sector of the MSSM, after including the Soft Supersymmetry Breaking terms, will be described in the following section. Before going into it, it is worth recalling why Supersymmetry is an interesting extension of the Standard Model: it provides an elegant solution to the Higgs Hierarchy problem and also predicts the presence of a weakly interacting, stable and massive particle, the neutralino (a mixture of the neutral gauginos and higgsinos), which turns out to be a good Dark Matter candidate. Finally, Supersymmetry is compatible with the Grand Unification Theories: the SM couplings tend to converge at high energies, but unification is quantitatively ruled out, while in the MSSM it can be reached at $\alpha_{GUT} \simeq 0.04$ and $M_{GUT} \simeq 10^{16}$ GeV (Fig. 7.5). For these reasons the search for Supersymmetric particles has been a hot topic at the LEP and TEVATRON experiments, and will hopefully have a definitive answer at the LHC.

7.3 Higgs particles in the MSSM

In the MSSM, the description of electroweak symmetry breaking is slightly complicated by the fact that there are two complex Higgs doublets, $H_u = (H_u^+, H_u^0)$ and $H_d = (H_d^0, H_d^-)$. The scalar potential for the Higgs scalar fields in the MSSM is given by

\[
\begin{aligned}
V = &\ (|\mu|^2 + m_{H_u}^2)(|H_u^0|^2 + |H_u^+|^2) + (|\mu|^2 + m_{H_d}^2)(|H_d^0|^2 + |H_d^-|^2) \\
&+ \left[ b\, (H_u^+ H_d^- - H_u^0 H_d^0) + c.c. \right] \\
&+ \frac{1}{8} (g^2 + g'^2)(|H_u^0|^2 + |H_u^+|^2 - |H_d^0|^2 - |H_d^-|^2)^2 + \frac{1}{2} g^2 \left| H_u^+ H_d^{0*} + H_u^0 H_d^{-*} \right|^2
\end{aligned}
\qquad (7.20)
\]

The minimum of this potential should break electroweak symmetry down to electromagnetism, $SU(2)_L \times U(1)_Y \to U(1)_{EM}$, in agreement with experiment.
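Using the freedom to make gauge transformations to take $H_u^+ = H_d^- = 0$, and $H_u^0$ and $H_d^0$ real and positive at the minimum, the scalar potential becomes:

\[ V = (|\mu|^2 + m_{H_u}^2)|H_u^0|^2 + (|\mu|^2 + m_{H_d}^2)|H_d^0|^2 - (b\, H_u^0 H_d^0 + c.c.) + \frac{1}{8}(g^2 + g'^2)(|H_u^0|^2 - |H_d^0|^2)^2 \qquad (7.21) \]

Figure 7.5: Gauge coupling running as a function of energy. The solid line is the SM, the dotted (dashed) line is for the MSSM with 1 TeV (10 TeV) SUSY mass scale.

The vacuum expectation values of $H_u^0$ and $H_d^0$, $v_u = \langle H_u^0 \rangle$ and $v_d = \langle H_d^0 \rangle$, are related to the mass of the $Z^0$ boson and the electroweak gauge couplings:

\[ m_Z^2 = \frac{1}{2} (g^2 + g'^2)(v_u^2 + v_d^2) \ \Rightarrow\ v^2 \equiv v_u^2 + v_d^2 \simeq (174\ \mathrm{GeV})^2 \qquad (7.22) \]

The ratio between the vacuum expectation values is defined as $\tan\beta$:

\[ \tan\beta \equiv \frac{v_u}{v_d} \qquad (7.23) \]

After applying the minimization conditions and diagonalizing the mass matrices, the following Higgs mass eigenstates are found:

1. Two CP-odd neutral scalars
\[ \begin{pmatrix} G^0 \\ A \end{pmatrix} = \sqrt{2} \begin{pmatrix} \sin\beta & -\cos\beta \\ \cos\beta & \sin\beta \end{pmatrix} \begin{pmatrix} \mathrm{Im}\, H_u^0 \\ \mathrm{Im}\, H_d^0 \end{pmatrix} \qquad (7.24) \]

2. Two charged scalars
\[ \begin{pmatrix} G^+ \\ H^+ \end{pmatrix} = \begin{pmatrix} \sin\beta & -\cos\beta \\ \cos\beta & \sin\beta \end{pmatrix} \begin{pmatrix} H_u^+ \\ H_d^{-*} \end{pmatrix} \qquad (7.25) \]

3. Two CP-even neutral scalars
\[ \begin{pmatrix} h^0 \\ H \end{pmatrix} = \sqrt{2} \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \mathrm{Re}\, H_u^0 - v_u \\ \mathrm{Re}\, H_d^0 - v_d \end{pmatrix} \qquad (7.26) \]

$G^\pm$ and $G^0$ are the Goldstone bosons that give mass to the $W^\pm$ and $Z$ bosons, while $H^\pm$, $h^0$, $H$ and $A$ are the physical degrees of freedom. At tree level, the masses and the other parameters of the theory are commonly expressed as a function of $\tan\beta$ and $m_A$. The Higgs masses are:

\[ m_A^2 = 2b / \sin(2\beta) = 2|\mu|^2 + m_{H_u}^2 + m_{H_d}^2 \qquad (7.27) \]
\[ m_{h^0,H}^2 = \frac{1}{2} \left( m_A^2 + m_Z^2 \mp \sqrt{(m_A^2 - m_Z^2)^2 + 4 m_Z^2 m_A^2 \sin^2(2\beta)} \right) \qquad (7.28) \]
\[ m_{H^\pm}^2 = m_A^2 + m_W^2 \qquad (7.29) \]

while the mixing angle $\alpha$ is determined by

\[ \frac{\sin 2\alpha}{\sin 2\beta} = -\frac{m_H^2 + m_{h^0}^2}{m_H^2 - m_{h^0}^2}, \qquad \frac{\tan 2\alpha}{\tan 2\beta} = \frac{m_A^2 + m_Z^2}{m_A^2 - m_Z^2} \qquad (7.30) \]

The couplings of $h$, $H$ and $A$ to standard particles are the same as in the Standard Model, rescaled by $\alpha$- and $\beta$-dependent factors (Table 7.3). It is worth noting that the couplings of down-type fermions to $A$ are enhanced by a factor $\tan\beta$.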
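As an illustration, the tree-level relations (7.28)-(7.30) can be evaluated numerically. The following Python lines are only a minimal sketch: the function name, the returned dictionary and the numerical values of $m_Z$ and $m_W$ are assumptions introduced here and are not taken from the analysis code. The expression used for $\cos 2\alpha$ is obtained by dividing the two relations of eq. (7.30) and using $m_H^2 + m_{h^0}^2 = m_A^2 + m_Z^2$.

```python
import math

M_Z = 91.1876  # GeV, Z boson mass (assumed input value, not quoted in the text)
M_W = 80.4     # GeV, W boson mass (assumed input value, not quoted in the text)


def tree_level_higgs_sector(m_a, tan_beta, m_z=M_Z, m_w=M_W):
    """Tree-level MSSM Higgs masses (GeV) and mixing angle alpha (rad).

    Hypothetical helper: inputs are m_A (GeV) and tan(beta), as in the text.
    """
    beta = math.atan(tan_beta)
    sin2b = math.sin(2.0 * beta)
    cos2b = math.cos(2.0 * beta)
    ma2, mz2 = m_a ** 2, m_z ** 2

    # Square root appearing in eq. (7.28); it equals m_H^2 - m_h^2.
    root = math.sqrt((ma2 - mz2) ** 2 + 4.0 * mz2 * ma2 * sin2b ** 2)
    if root == 0.0:
        raise ValueError("degenerate point: m_H = m_h, alpha is undefined")

    # Eq. (7.28): minus sign for h, plus sign for H.
    mh2 = max(0.5 * (ma2 + mz2 - root), 0.0)  # guard against rounding below 0
    mH2 = 0.5 * (ma2 + mz2 + root)

    # Eq. (7.29): charged Higgs mass.
    m_hpm = math.sqrt(ma2 + m_w ** 2)

    # Eq. (7.30): sin(2 alpha) as written; cos(2 alpha) = sin(2 alpha) / tan(2 alpha).
    sin2a = -sin2b * (ma2 + mz2) / root
    cos2a = -cos2b * (ma2 - mz2) / root
    alpha = 0.5 * math.atan2(sin2a, cos2a)  # alpha lies in (-pi/2, 0)

    return {
        "m_h": math.sqrt(mh2),
        "m_H": math.sqrt(mH2),
        "m_Hpm": m_hpm,
        "alpha": alpha,
        "beta": beta,
    }
```

Calling `tree_level_higgs_sector(200.0, 30.0)`, for instance, returns a light scalar just below $m_Z$ and a heavy scalar almost degenerate with $A$; radiative corrections, which are sizeable for $m_h$, are deliberately not included in this sketch.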
Table 7.3: $\alpha$- and $\beta$-dependent scale factors for Higgs couplings to SM particles in the Minimal Supersymmetric Standard Model.

       $d\bar{d}, s\bar{s}, b\bar{b}$; $e^+e^-, \mu^+\mu^-, \tau^+\tau^-$      $u\bar{u}, c\bar{c}, t\bar{t}$      $W^+W^-, ZZ$
$h$    $-\sin\alpha / \cos\beta$                                                $\cos\alpha / \sin\beta$            $\sin(\beta - \alpha)$
$H$    $\cos\alpha / \cos\beta$                                                 $\sin\alpha / \sin\beta$            $\cos(\beta - \alpha)$
$A$    $-i\gamma_5 \tan\beta$                                                   $-i\gamma_5 \cot\beta$              $0$

The following relations hold between the Higgs masses:

\[ m_{H^\pm}^2 \ge m_A^2, \qquad m_h \le m_A \le m_H, \qquad m_h < |\cos 2\beta|\, m_Z \qquad (7.31) \]

The last relation, which holds at tree level, is actually ruled out by LEP results. After adding radiative corrections, the upper bound is relaxed to $m_h \lesssim 130$ GeV, compatible with LEP results and within the LHC reach. The values of the Higgs masses as a function of $m_A$ and $\tan\beta$ are shown in Fig. 7.6.

Figure 7.6: Higgs masses as a function of $m_A$ for $\tan\beta = 3, 30$.

Some limiting cases are worth discussing:

• $m_A \gg m_Z$, the decoupling limit. As can be seen from eqs. (7.28)-(7.30), in this limit $\alpha \approx \beta - \pi/2$ and $\sin^2(\beta - \alpha) \approx 1$. Then $h^0$ has a low mass and the same couplings as a SM Higgs boson of the same mass. $A$, $H$ and $H^\pm$ are much heavier, forming an isospin doublet almost degenerate both in mass and in couplings.
• Low $m_A$ and large $\tan\beta$. In this scenario $\cos^2(\beta - \alpha) \approx 1$, $H$ has the same couplings as a SM Higgs boson and $A$ is degenerate with $h$.

Therefore, for large $\tan\beta$ values, $A$ is always degenerate with one of the two CP-even neutral Higgs bosons, $h$ or $H$.

When radiative corrections are considered, some additional parameters have to be taken into account: $M_{SUSY}$, $M_2$, $\mu$, $A$ and $m_{\tilde{g}}$. $M_{SUSY}$ is a soft SUSY-breaking mass parameter and represents a common mass for all scalar fermions (sfermions) at the electroweak scale. Similarly, $M_2$ represents a common SU(2) gaugino mass at the electroweak scale. The "Higgs mass parameter" $\mu$ is the strength of the Supersymmetric Higgs mixing; $A = A_t = A_b$ is a common trilinear Higgs-squark coupling at the electroweak scale and $m_{\tilde{g}}$ is the gluino mass. Three of these parameters define the stop and sbottom mixing parameters $X_t = A - \mu \cot\beta$ and $X_b = A - \mu \tan\beta$.
In addition to all these MSSM parameters, the top quark mass also has a strong impact on the predictions through radiative corrections. The parameters for several benchmark scenarios [35] are reported in Table 7.4. The present analysis is performed in the $m_h$-max scenario, where the stop mixing parameter is set to a large value, $X_t = 2 M_{SUSY}$. This scenario is designed to maximize the theoretical upper bound on $m_h$ for a given $\tan\beta$. It thus provides the largest parameter space in the $m_h$ direction and conservative exclusion limits for $\tan\beta$.

Table 7.4: Parameters defining the main MSSM benchmark scenarios.

Scenario                  $M_{SUSY}$ (GeV)   $M_2$ (GeV)   $\mu$ (GeV)   $m_{\tilde{g}}$ (GeV)   $X_t$ (GeV)
$m_h$-max                 1000               200           -200          800                     $2 M_{SUSY}$
no-mixing                 1000               200           -200          800                     0
large-$\mu$               400                400           1000          200                     -300
gluophobic                350                300           300           500                     -750
small-$\alpha_{eff}$      800                500           2000          500                     -1100

Chapter 8
Search for the Heavy Neutral CP-odd Higgs Boson A

In this chapter the experimental search for the MSSM Higgs bosons is introduced. The exclusion limits, in terms of the MSSM parameters, obtained so far at LEP and TEVATRON are presented first. The expected signal production processes and decays at the LHC are then discussed.

8.1 Exclusion limits from LEP and CDF

The Large Electron Positron Collider (LEP) started its operations in 1989 with a center-of-mass energy of 91 GeV, at the Z peak. Later it was upgraded to a maximum energy of 209 GeV, allowing for the production of W pairs, and continued working until the end of the year 2000. Combining the results of the four LEP experiments, the search for the MSSM Higgs bosons has been finalized, leading to 95% CL exclusion limits in the $\tan\beta$ vs $m_A$ plane [34]. Several MSSM scenarios have been investigated, both CP-conserving and CP-violating. For consistency, only the results in the CP-conserving $m_h$-max scenario are presented here.
In $e^+e^-$ collisions at the LEP energies, the main production processes of $h$, $H$ and $A$ are the Higgs-strahlung processes $e^+e^- \to hZ$ (or $HZ$, when allowed in the parameter space) and the pair production processes $e^+e^- \to hA$ (or $HA$). The Higgs-strahlung and pair production cross-sections are complementary: at the LEP energies, the process $e^+e^- \to hZ$ is typically more abundant for small $\tan\beta$ values, while $e^+e^- \to hA$ dominates at large $\tan\beta$.

The $h$ boson decays mainly to fermion pairs, with only a small fraction of $WW^*$ and $ZZ^*$ decays, since its mass is below the thresholds of the corresponding on-shell processes. The $A$ boson also decays predominantly to fermion pairs, independently of its mass, since its coupling to vector bosons is zero at leading order. For $\tan\beta > 1$, decays of $h$ and $A$ to $b\bar{b}$ and $\tau^+\tau^-$ pairs are preferred, while the decays to $c\bar{c}$ become important for $\tan\beta < 1$.

In each of the four LEP experiments, the data analysis is performed in several steps. A preselection is applied to reduce some of the large backgrounds, in particular those from two-photon processes. The remaining background, mainly from the production of fermion pairs and of $WW$ or $ZZ$, is further reduced by more selective cuts and by applying multivariate techniques. For the two production processes, searches have been carried out considering several final state topologies. For the Higgs-strahlung process the topologies taken into account are the same used in the search for the SM Higgs boson:

• the four-jet topology, $(h \to b\bar{b})(Z \to q\bar{q})$, in which the invariant mass of two jets is close to the $Z$ mass while the other two jets are tagged as $b$.
• the missing energy topology, $(h \to b\bar{b}, \tau^+\tau^-)(Z \to \nu\bar{\nu})$, in which the event consists of two $b$- or $\tau$-jets and a large amount of missing energy compatible with $m_Z$.
• the leptonic final state, $(h \to b\bar{b})(Z \to e^+e^-, \mu^+\mu^-)$, in which the invariant mass of the two leptons is close to $m_Z$.
• the final states with $\tau$ leptons, $(h \to \tau^+\tau^-)(Z \to q\bar{q})$ and $(h \to b\bar{b}, \tau^+\tau^-)(Z \to \tau^+\tau^-)$, in which either the $\tau^+\tau^-$ or the $q\bar{q}$ pair has an invariant mass close to $m_Z$.

In the case of the pair production process, $e^+e^- \to hA$, the principal signal topologies at LEP are:

• the four-$b$ final state $(A \to b\bar{b})(h \to b\bar{b})$.
• the mixed final states $(A \to \tau^+\tau^-)(h \to b\bar{b})$ and $(A \to b\bar{b})(h \to \tau^+\tau^-)$.
• the four-$\tau$ final state $(A \to \tau^+\tau^-)(h \to \tau^+\tau^-)$.
• the Higgs cascade decay, $e^+e^- \to hA \to (hh)h$, which gives rise to event topologies ranging from six $b$-jets to six $\tau$ leptons.

After selection, the combined data are compared to a large number of simulated configurations, generated separately for the background-only hypothesis and for the signal-plus-background hypothesis. The ratio $Q = L_{s+b} / L_b$ of the corresponding likelihoods is used as the test statistic. For an assumed top quark mass of $m_t = 174.3$ GeV, the exclusion limits found at the 95% confidence level are interpreted in the considered MSSM scenarios. The exclusions for the $m_h$-max scenario are shown in Fig. 8.1. In the region with $\tan\beta \lesssim 5$, the exclusion is provided mainly by the Higgs-strahlung process, which gives a lower bound of about 114 GeV on $m_h$. At high $\tan\beta$, the pair production process gives the main contribution, providing limits of 92.8 and 93.4 GeV for $m_h$ and $m_A$ respectively. The data also exclude some domains of $\tan\beta$ (Fig. 8.2). For $m_t = 174.3$ GeV, the range $0.7 < \tan\beta < 2$ is excluded.

Figure 8.1: Exclusions, at 95% CL (light-green) and at 99.7% CL (dark-green), in the case of the CP-conserving $m_h$-max benchmark scenario, for $m_t = 174.3$ GeV. The figure shows the theoretically inaccessible domains (yellow) and the regions excluded by this search, in four projections of the MSSM parameters: (a): $(m_h, m_A)$; (b): $(m_h, \tan\beta)$; (c): $(m_A, \tan\beta)$; (d): $(m_{H^\pm}, \tan\beta)$.
The dashed lines indicate the boundaries of the regions which are expected to be excluded, at 95% CL, on the basis of Monte Carlo simulations with no signal. In the $(m_h, \tan\beta)$ projection (plot (b)), the upper boundary of the parameter space is indicated for four values of the top quark mass; from left to right: $m_t$ = 169.3, 174.3, 179.3 and 183.0 GeV.

Figure 8.2: Domains of $\tan\beta$ which are excluded at the 95% CL (light-gray or light-green) and at the 99.7% CL (dark-green), in the case of the CP-conserving $m_h$-max benchmark scenario, as a function of the assumed top quark mass.

The TEVATRON $p\bar{p}$ collider at 1.96 TeV has been operating since 1987 and will take data at least until the end of 2009. Two experiments were built at this collider, CDF and D0. As they are currently taking data, no definitive combined results for the search of the MSSM Higgs particles have been published by the two experiments. Therefore, in the following, only the preliminary results obtained by CDF, with an integrated luminosity of 1.8 fb$^{-1}$, in the search for the CP-odd MSSM Higgs boson $A$ (and the corresponding mass-degenerate CP-even $h$ or $H$) decaying into a $\tau$ pair will be presented [36, 37].

At hadron colliders (see §8.2) there are two dominant production mechanisms for neutral MSSM Higgs bosons: gluon fusion, $gg \to \phi$, and associated production, $gg \to b\bar{b}\phi$, where $\phi$ can denote any of $h$, $A$, $H$. The leading decay modes are $b\bar{b}$ (∼ 90%) and $\tau^+\tau^-$ (∼ 10%). Despite the smaller branching fraction, Higgs searches in the di-$\tau$ channel have the advantage that they do not suffer from the large multi-jet backgrounds that affect $\phi \to b\bar{b}$. The di-$\tau$ channel has been inclusively analyzed in three final states: $\tau_e\tau_h$, $\tau_\mu\tau_h$ and $\tau_\mu\tau_e$, where $\tau_e$, $\tau_\mu$ and $\tau_h$ are short-hand notations for the decay modes $\tau \to e\nu_e\nu_\tau$, $\tau \to \mu\nu_\mu\nu_\tau$ and $\tau \to \mathrm{hadrons}\ \nu_\tau$ respectively. The dominant, irreducible background in the final sample of selected events is $Z/\gamma^*$ production with subsequent decays to $\tau$ pairs.
The second largest contribution comes from multi-jet events with gluon or quark jets mis-identified as $\tau_h$. Additional backgrounds considered are $Z \to ee, \mu\mu$, $WW$, $WZ$, $ZZ$, $W\gamma$, $Z\gamma$ and $t\bar{t}$ production. The number of expected SM background events and the number of observed events in the data after applying all selection criteria are summarized in Table 8.1.

Table 8.1: Predicted backgrounds and observed events after all selection cuts in the $\tau_e\tau_h$, $\tau_\mu\tau_h$ and $\tau_\mu\tau_e$ channels at CDF with $\int L = 1.8$ fb$^{-1}$. The quoted errors are statistical only. For the jet fakes source, in the channels including $\tau_h$ the uncertainty is included in the systematics.

source                 $\tau_e\tau_h$     $\tau_\mu\tau_h$    $\tau_e\tau_\mu$
$Z \to \tau\tau$       1376.9 ± 8.3       1353.7 ± 8.1        604.8 ± 5.5
$Z \to ee, \mu\mu$     69.7 ± 2.0         107.3 ± 2.3         19.2 ± 0.9
di-boson events        4.3 ± 0.1          3.3 ± 0.05          11.4 ± 0.1
$t\bar{t}$             3.7 ± 0.1          3.0 ± 0.07          9.1 ± 0.1
jet fakes              466.5              283.6               57.3 ± 3.3
Sum BG                 1921.1             1750.8              701.9
DATA                   1979               1666                726

To probe for a possible Higgs signal, a binned likelihood fit of the partially reconstructed mass of the di-tau system ($m_{vis}$, defined as the invariant mass of the visible $\tau$-decay products and $\not\!E_T$) has been performed. An example fit for $m_A = 140$ GeV is reported in Fig. 8.3. No signal evidence has been observed in the range 90 GeV < $m_A$ < 250 GeV and the exclusion limits at 95% CL on the production cross-section times the branching ratio are set as in Fig. 8.4. Considering four benchmark scenarios (the standard version, with $\mu = -200$ GeV, and the variant with positive $\mu$ of the $m_h$-max and no-mixing scenarios), these results are converted into exclusion regions in the $\tan\beta$ vs $m_A$ plane, as shown in Fig. 8.5 for the standard $m_h$-max scenario.

Figure 8.3: Partially reconstructed di-$\tau$ mass. The normalization of the backgrounds and of the signal ($m_A = 140$ GeV) corresponds to the fit results for signal exclusion at 95% CL.

Figure 8.4: Observed and expected limits at 95% CL for Higgs production cross-section times branching fraction to $\tau$ pairs at CDF.
Figure 8.5: Excluded region in the $\tan\beta$ vs $m_A$ plane for the $m_h$-max scenario with $\mu < 0$.

Figure 8.6: Production processes for the MSSM Higgs bosons at the LHC: gluon fusion (a), bottom quark associated production (b), vector boson fusion (c), Higgs-strahlung (d). The last two processes are forbidden for $A$, which does not couple to $W$ and $Z$.

8.2 Production at LHC

Before addressing the experimental search for the $A$ boson at CMS, it is worth summarizing the expected phenomenology of the MSSM Higgs bosons at the LHC. The predicted cross section for the production of the MSSM Higgs bosons at the LHC varies by several orders of magnitude as a function of the tree-level parameters $m_A$ and $\tan\beta$. The main production processes are gluon fusion, $gg \to \phi$, and $b$-quark associated production, $gg \to b\bar{b}\phi$ (with $\phi = h/H/A$) (Fig. 8.6). A contribution to the $b\bar{b}\phi$ final state also comes from the process $q\bar{q} \to b\bar{b}\phi$, but it is highly suppressed with respect to $gg \to b\bar{b}\phi$ and therefore will not be considered in the rest of this work.

At low $\tan\beta$ values, the dominant production process for the CP-odd Higgs boson is $gg \to A$, while at large $\tan\beta$ values it is $gg \to b\bar{b}A$. Nevertheless, for $\tan\beta = 30$, the gluon fusion cross section is not completely suppressed, being just about one order of magnitude smaller than the associated production cross section for $A$ masses in the range 100 GeV ≤ $m_A$ ≤ 200 GeV (Fig. 8.7). The CP-even Higgs bosons $h$ and $H$ are mainly produced through a direct production process for small values of $\tan\beta$, while for high values the dominant production process for $H$ is $gg \to b\bar{b}H$ (Fig. 8.8).

The MSSM Higgs boson couplings are reported in Table 7.3. From these couplings, it is clear that, for large $\tan\beta$ values, the decays into down-type fermions are preferred.
On the other hand, for low values of $\tan\beta$ other channels may contribute and, in particular, if the Higgs boson is heavy, the $t\bar{t}$ decay may become dominant. The resulting branching ratios in the $m_h$-max scenario for $\tan\beta = 3$ and 30 as a function of the Higgs masses are presented in Figs. 8.9-8.11.

Figure 8.7: Neutral CP-odd MSSM Higgs production cross sections at the LHC for gluon fusion $gg \to A$ and the associated production $gg, q\bar{q} \to b\bar{b}A / t\bar{t}A$, including all known QCD corrections, for $\tan\beta = 3$ (a) and $\tan\beta = 30$ (b).

Figure 8.8: Neutral CP-even MSSM Higgs production cross sections at the LHC for gluon fusion $gg \to h/H$, vector-boson fusion $qq \to qqVV \to qqh/qqH$, Higgs-strahlung $q\bar{q} \to V^* \to hV/HV$ and the associated production $gg, q\bar{q} \to b\bar{b}h / b\bar{b}H / t\bar{t}h / t\bar{t}H$, including all known QCD corrections, for $\tan\beta = 3$ (a) and $\tan\beta = 30$ (b).

Figure 8.9: Branching ratios of the MSSM Higgs boson $A$ for non-SUSY decay modes as a function of its mass for $\tan\beta = 3, 30$ and maximal mixing.

Figure 8.10: Branching ratios of the MSSM Higgs boson $h$ for non-SUSY decay modes as a function of its mass for $\tan\beta = 3, 30$ and maximal mixing.

Figure 8.11: Branching ratios of the MSSM Higgs boson $H$ for non-SUSY decay modes as a function of its mass for $\tan\beta = 3, 30$ and maximal mixing.

Therefore, in the $m_h$-max scenario, the exclusion regions from the LEP and CDF results constrain the values of $\tan\beta$ and $m_A$ to the range $5 < \tan\beta \lesssim 50$ and $m_A \gtrsim 92$ GeV. For such ranges, the main expected $A$ production processes at the LHC are gluon fusion and $b$-quark associated production. For most regions in the $\tan\beta$ vs $m_A$ plane, the dominant decay channels are $A \to b\bar{b}$ and $A \to \tau\tau$. The latter is the most promising channel for the discovery of the MSSM Higgs bosons, both from a theoretical point of view, because it is less sensitive to Supersymmetric radiative corrections [35], and from an experimental point of view, as will be discussed at the beginning of the next section.
Chapter 9
Sensitivity for the MSSM $A \to \tau\tau \to e\mu\not\!E_T$ decay in CMS

9.1 Introduction

The $A \to \tau\tau$ decay is the favoured channel for the discovery of the MSSM Higgs bosons at the LHC because the $A$ coupling to $\tau$s is enhanced by a factor $\tan\beta$ and because, even if the $A \to b\bar{b}$ branching ratio is almost ten times higher, the $b$ decay channel is overwhelmed by the QCD multi-jet background [38]. In the present analysis, in order to have a clean final state, the two $\tau$s are required to decay into leptons of different flavor, thus ending up with one electron, one oppositely charged muon and missing transverse energy due to four neutrinos. In particular, this final state does not contain $\tau$ jets, thus avoiding the difficult task of distinguishing them from $b$ and QCD jets.

The $A \to \tau\tau$ channel has already been extensively studied at CMS [39, 40, 41, 42, 43], leading to 5σ discovery regions below $\tan\beta = 20$ for $m_A < 300$ GeV and $\int L = 30$ fb$^{-1}$ (Fig. 9.1). In particular, an analysis in the decay channel $A \to \tau\tau \to e\mu\not\!E_T$ has already been performed in 2006, suggesting the discovery region shown in Fig. 9.2. Since then, new reconstruction and analysis techniques have become available in the new CMS software framework [19]. Moreover, this previous analysis, like the other searches for the Heavy Neutral MSSM Higgs boson $A$ in CMS, has been performed considering only the main signal contribution, coming from the associated production process, thus neglecting the contribution from gluon fusion. The goal of the present work, therefore, is to investigate new strategies for the $A$ boson search, to test the new reconstruction algorithms and to take into account the contribution from the $gg \to A$ process.

In this chapter, after a brief review of the algorithms exploited to reconstruct the physics objects and an overview of the signal and background samples used in this analysis, the key issues of this analysis are investigated.
In fact, the results of the analysis depend on how the problems of the poor measurement of the Missing Transverse Energy, of the low number of $b$-tags in the $gg \to b\bar{b}A$ sample and of the mis-identification of $\tau$-jets as leptons are addressed. Then, the cuts and strategies for the rejection of the background and the selection of the signal contributions are described and, finally, the obtained results are reported.

Figure 9.1: The 5σ discovery regions for the neutral Higgs bosons $\phi$ ($\phi = h, H, A$) produced in association with $b$ quarks, $pp \to b\bar{b}\phi$, with the $\phi \to \mu\mu$ and $\phi \to \tau\tau$ decay modes in the $m_h$-max scenario.

Figure 9.2: Final results of the 2006 analysis: the 5σ discovery reach for heavy neutral Higgs bosons $H$ and $A$ decaying via $\tau\tau$ to the $e + \mu$ final state. The dots give the 5σ limit for the studied values of the Higgs mass. The fast simulation result is also shown.

9.2 Event reconstruction

A clean and accurate reconstruction of the final state is the key to the outcome of any analysis. For this reason, to avoid backgrounds containing more than two leptons, the event is required to have exactly one muon and one electron with opposite charges.

9.2.1 Trigger

The trigger table for the CMSSW releases used to produce the samples is developed for an instantaneous luminosity $L = 10^{32}$ cm$^{-2}$ s$^{-1}$. Among the available trigger bits, the most convenient and conservative choice is to make a minimal request of at least one lepton in the event. The isolated paths have lower $p_T$ thresholds but require isolation already at trigger level, while the relaxed paths do not check isolation but have higher $p_T$ thresholds. For this analysis, the lepton $p_T$ and isolation cuts are studied and applied off-line and therefore, at the trigger level, an inclusive choice has been made: the trigger selection is performed with a logical OR between the isolated and relaxed single muon and single electron trigger bits. The non-isolated single muon trigger bit has a $p_T$ threshold of 16 GeV, while the isolated one has a threshold of 11 GeV.
The non-isolated single electron trigger bit has a threshold of 17 GeV, the isolated one of 15 GeV. A mixed electron-muon trigger is also available, but it is not expected to be necessary for the present analysis: its advantage is a lower $p_T$ threshold with respect to the single lepton paths, and the latter already have $p_T$ thresholds well below the offline $p_T$ cuts.

9.2.2 Leptons

Muons [44] are reconstructed with the globalMuon sequence, which matches tracks reconstructed in the muon chambers with a tracker track. The hits in the two track segments are refitted together, providing the best estimate of the muon parameters. This procedure ensures a very strong signature, and therefore no other muon identification algorithms are used. Muons are required to be isolated both in the tracker (no other tracks with $p_T > 1$ GeV in a cone with $\Delta R = 0.4$) and in the calorimeter (the sum of the HCAL and ECAL deposits within a cone with $\Delta R = 0.4$ has to be less than 4 GeV).

The sequence used to reconstruct electrons [45] is called pixelMatchGsfElectrons. Starting from an ECAL super cluster, a pixel track seed matching the super cluster is searched for. If the seed is found, the pattern recognition is performed with the Combinatorial Kalman Filter algorithm in a loose cut configuration, while the final fit is performed with the Gaussian Sum Filter (GSF) [46]. GSF is a fitting algorithm dedicated to electrons that accounts for the electron bremsstrahlung energy loss. As reconstructed pions are often mis-identified as electrons, electrons are required to pass the tight version of the category-based electron-id algorithm [47]. Electron isolation is also required: no tracks with $p_T > 1.5$ GeV are allowed in a cone with $0.02 < \Delta R < 0.2$ around the electron; the ECAL deposit within a cone with $\Delta R = 0.3$ is required to be $< 0.05 \times E_{SuperCluster}$, while the HCAL deposit in a cone with $0.15 < \Delta R < 0.3$ has to be $< 0.2 \times E_{SuperCluster}$.
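The isolation requirements listed above can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions, not the CMSSW implementation: the dictionary-based objects, their keys (`pt`, `eta`, `phi`, `energy`, `sc_energy`) and the function names are hypothetical, the list of tracks passed for the muon is assumed not to contain the muon track itself, and the ECAL deposits passed for the electron are assumed not to contain the electron super cluster.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R, with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def _dr(a, b):
    # Convenience wrapper for two objects carrying "eta" and "phi" keys.
    return delta_r(a["eta"], a["phi"], b["eta"], b["phi"])


def muon_is_isolated(muon, other_tracks, calo_deposits):
    """Tracker and calorimeter isolation for a muon (cuts quoted in the text).

    other_tracks: tracks other than the muon track (assumption of this sketch).
    calo_deposits: ECAL and HCAL deposits, each with an "energy" in GeV.
    """
    # Tracker: no other track with pT > 1 GeV inside Delta R < 0.4.
    for trk in other_tracks:
        if trk["pt"] > 1.0 and _dr(muon, trk) < 0.4:
            return False
    # Calorimeter: ECAL + HCAL sum inside Delta R < 0.4 below 4 GeV.
    calo_sum = sum(d["energy"] for d in calo_deposits if _dr(muon, d) < 0.4)
    return calo_sum < 4.0


def electron_is_isolated(ele, tracks, ecal_deposits, hcal_deposits):
    """Tracker, ECAL and HCAL isolation for an electron (cuts quoted in the text)."""
    # Tracker: no track with pT > 1.5 GeV in the annulus 0.02 < Delta R < 0.2.
    for trk in tracks:
        if trk["pt"] > 1.5 and 0.02 < _dr(ele, trk) < 0.2:
            return False
    # ECAL: deposit inside Delta R < 0.3 below 5% of the super cluster energy.
    ecal_sum = sum(d["energy"] for d in ecal_deposits if _dr(ele, d) < 0.3)
    # HCAL: deposit in 0.15 < Delta R < 0.3 below 20% of the super cluster energy.
    hcal_sum = sum(d["energy"] for d in hcal_deposits if 0.15 < _dr(ele, d) < 0.3)
    return ecal_sum < 0.05 * ele["sc_energy"] and hcal_sum < 0.2 * ele["sc_energy"]
```

The inner veto cone of the electron track isolation ($\Delta R > 0.02$) is what removes the electron's own track in this sketch, which is why no explicit track removal is needed there.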
For a more detailed discussion about the need for strong lepton isolation and identification requests, see §9.4.1. The request for one muon and one electron in the event, identified and isolated according to the above criteria, can be considered as a sort of preselection for this analysis. Unless otherwise stated, the plots in §9.3.1 are obtained for the events satisfying this request.

9.2.3 Missing Transverse Energy

Besides leptons, the other reconstructed object used in the invariant mass calculation is the Missing Transverse Energy ($\not\!E_T$) [48]. As the vectorial sum of the transverse momenta of the particles before and after the collision is zero, the unbalance in the total $p_T$ of the reconstructed objects is a measurement of the transverse momenta of the neutrinos (and of any other undetectable particle). The precision of this measurement is crucial for this analysis: the determination of the mean value and of the width of the mass peak is highly dependent on the accuracy of the $\not\!E_T$ measurement.

A first estimate of $\not\!E_T$ is provided by the sum of the calorimeter tower energies. This measurement is very raw and needs some corrections. A first correction is the addition of the muon energy, which is not accounted for in the raw $\not\!E_T$ because muons lose almost no energy in the calorimeters. This measurement is called Type-0 $\not\!E_T$. Another improvement applied to the $\not\!E_T$ determination is the correction for jet energies. As discussed in §9.4.3, the jet correction that gives the highest benefit to $\not\!E_T$ is Zero Suppression followed by the Jet Plus Track algorithm [49][50].
This correction is applied to the uncorrected Iterative Cone 5 jets [51]. It accounts for the calorimeter cells where some energy is deposited, but not enough to exceed the threshold set for the tower making (Zero Suppression), and it replaces the calorimeter energy deposit with the more precise measurement of the momentum of the tracks pointing to that deposit (Jet Plus Track). The $\not\!E_T$ measurement obtained after the jet correction is applied is called Type-1 $\not\!E_T$.

9.2.4 Jets

Jets [51, 52] are not directly used to compute the invariant mass of the Higgs boson, but they are useful because, as the jet activity of the signal events is different from that of many backgrounds, it is worth vetoing on the total number of jets in the event. Furthermore, they are the input to the $b$-tagging algorithms. To be consistent with the Type-1 $\not\!E_T$ used in this analysis, jets are corrected with the Zero Suppression and Jet Plus Track algorithms. A $p_T$ threshold of 30 GeV is applied to the corrected jets and they are required to be central ($|\eta| < 2.5$). Jets within a cone with $\Delta R < 0.3$ centered on the lepton direction are not considered.

9.2.5 b-tagging

A key element for this analysis is $b$-tagging [53]. It is primarily needed to reject the background and, secondarily, it may be useful to select the $gg \to b\bar{b}A$ production process. Several $b$-tagging algorithms are available: according to the studies described in §9.4.2, the algorithm with the best performance for this analysis is TrackCountingHighEfficiency [54]. The discriminator value provided by the TrackCountingHighEfficiency algorithm is the three-dimensional impact parameter significance of the track in the jet with the second highest impact parameter significance. The discriminator threshold value optimizing efficiency and fake rate is 2.

9.2.6 Collinear Approximation

Because of the presence of four neutrinos in the final state, it is not possible to compute the Higgs invariant mass with an exact formula.
The Missing Transverse Energy, in fact, is a measurement of the sum of the transverse momenta of all the invisible particles, but it does not measure the z component. Some experiments, for example CDF, compute the visible mass of the Higgs, i.e. the invariant mass of all the measured quantities (lepton three-momenta and E̸T). This approach does not provide a direct measurement of the Higgs mass, which has to be inferred from MC studies. The method used in the present analysis, called Collinear Approximation, can be used to directly estimate the reconstructed Higgs mass. It is based on the fact that the τs produced by the Higgs decay are highly boosted, and thus the directions of all the final state leptons are close to the τ direction. Therefore, as shown in Fig. 9.3, this approximation assumes that the τs have the same direction as the measured leptons.

Figure 9.3: Schematic view of the transverse momenta of the Higgs and of its decay products in the collinear approximation.

The visible fractions of the τ transverse momenta, xτ→l, are defined as the fractions of the two τ momenta which are carried by the reconstructed charged leptons:

\[ x_{\tau\to l} = \frac{p_{l}}{p_{\tau\to l}} = \frac{p_{T}^{\,l}}{p_{T}^{\,\tau\to l}}, \qquad l = e, \mu \tag{9.1} \]

The transverse momentum of the Higgs is the vector sum of the charged lepton and neutrino transverse momenta:

\[ p_{\mathrm{Higgs},j} = p_{e,j} + p_{\mu,j} + \not{E}_{T,j}, \qquad j = x, y \tag{9.2} \]

It can be shown that, under this approximation, xτ→l can be expressed in terms of the transverse momentum of the Higgs boson and the transverse momenta of the charged leptons:

\[ x_{\tau\to e} = \frac{p_{e,x}\,p_{\mu,y} - p_{e,y}\,p_{\mu,x}}{p_{\mathrm{Higgs},x}\,p_{\mu,y} - p_{\mathrm{Higgs},y}\,p_{\mu,x}}, \qquad x_{\tau\to\mu} = \frac{p_{e,x}\,p_{\mu,y} - p_{e,y}\,p_{\mu,x}}{p_{\mathrm{Higgs},y}\,p_{e,x} - p_{\mathrm{Higgs},x}\,p_{e,y}} \tag{9.3} \]

This reconstruction method works only if the τs are not emitted back-to-back in the transverse plane, since in that configuration both the numerators and the denominators of Eq. 9.3 vanish. For τ decays the reconstruction must yield 0 < xτ→l < 1.
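As an illustration of Eq. 9.2 and 9.3, the following minimal Python sketch computes the visible fractions from the transverse components of the two leptons and of the missing transverse energy, and then applies the mass relation of Eq. 9.4. The function names, the tuple-based inputs and the numerical tolerance are assumptions made for this example; this is not the analysis code.

```python
import math

def visible_fractions(p_e, p_mu, met, eps=1e-12):
    """Return (x_e, x_mu) from Eq. 9.3, or None if the system is degenerate.

    p_e, p_mu and met are hypothetical (px, py) tuples in GeV.
    """
    # Eq. 9.2: Higgs transverse momentum = charged leptons + missing E_T
    ph_x = p_e[0] + p_mu[0] + met[0]
    ph_y = p_e[1] + p_mu[1] + met[1]
    num = p_e[0] * p_mu[1] - p_e[1] * p_mu[0]
    den_e = ph_x * p_mu[1] - ph_y * p_mu[0]
    den_mu = ph_y * p_e[0] - ph_x * p_e[1]
    # Back-to-back leptons make the denominators vanish: no solution
    if abs(den_e) < eps or abs(den_mu) < eps:
        return None
    return num / den_e, num / den_mu

def collinear_mass(m_emu, x_e, x_mu):
    """Eq. 9.4, defined only for physical fractions 0 < x < 1."""
    if not (0.0 < x_e < 1.0 and 0.0 < x_mu < 1.0):
        return None
    return m_emu / math.sqrt(x_e * x_mu)

# Toy event built with x_e = 0.5 and x_mu = 0.4 (neutrinos collinear to leptons)
fractions = visible_fractions((30.0, 10.0), (-20.0, 25.0), (0.0, 47.5))
```

In the toy event the neutrino momenta are constructed to be exactly collinear with the leptons, so the sketch recovers the input fractions; with reconstructed quantities the E̸T resolution smears both fractions and therefore the mass.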
Once the visible fractions of the τ momenta are known, the invariant mass of the τ pair can be evaluated as

\[ m_{\mathrm{Higgs}} = m_{\tau\tau} = \frac{m_{e\mu}}{\sqrt{x_{\tau\to e}\,x_{\tau\to\mu}}} \tag{9.4} \]

where meµ is the invariant mass of the two charged leptons in the final state.

9.3 Signal and Background samples

As a preparation for the first data taking, the CMS collaboration has performed a computing and analysis exercise, called CSA07, to test the data flow foreseen for the first years of LHC operations. Even though the final goal of this analysis differs from the purpose of the CSA07 exercise, most of the samples used have been produced during the massive Monte Carlo data production for CSA07.

The signal samples have been produced in the two main production processes, gg → bbA and gg → A, forcing A to decay into two taus. Even though only the leptonic final state is considered, the τ decay is not constrained to the exclusive channel, so that the mis-identification of τ jets as leptons can be estimated. Consistently with the expectations for the MSSM Higgs boson production described in §8, the signal has been generated with tanβ = 30 for various masses, ranging from 100 GeV up to 800 GeV for gg → bbA and from 100 GeV up to 200 GeV for gg → A. The signal samples used are summarized in Table 9.1.

As the final state under investigation contains one electron, one muon and missing transverse energy, the backgrounds that can yield at least two leptons of different flavor have been considered. In order to have samples corresponding to luminosities of the same order of magnitude, for two backgrounds a private production with the Fast Simulation has been used instead of the CSA07 samples. The background samples used are listed in Table 9.2.

Signal and background samples mainly correspond to luminosities of the same order of magnitude. To avoid large scale correction factors, a natural choice for this analysis is to assume ∫L = 10 fb−1.
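The relation between the number of generated events, the cross section and the equivalent integrated luminosity can be sketched as follows. The function names are hypothetical, and the formula ∫L = N / (σ × BR) is the standard definition rather than something spelled out in the text; the inputs in the example are taken from Tables 9.1 and 9.2.

```python
def equivalent_luminosity_fb(n_events, sigma_pb, branching_ratio=1.0):
    """Integrated luminosity in fb^-1 corresponding to a generated sample."""
    lumi_pb = n_events / (sigma_pb * branching_ratio)  # in pb^-1
    return lumi_pb / 1000.0                            # 1 fb^-1 = 1000 pb^-1

def weight_to_target(n_events, sigma_pb, branching_ratio=1.0, target_fb=10.0):
    """Per-event weight that rescales a sample to the target luminosity."""
    return target_fb / equivalent_luminosity_fb(n_events, sigma_pb, branching_ratio)

# gg -> A -> tau tau, MA = 100 GeV (Table 9.1): 224.1 pb x 10.9%, 163398 events
lumi_signal = equivalent_luminosity_fb(163398, 224.1, 0.109)
# tW inclusive (Table 9.2): 62 pb, 443790 events
lumi_tw = equivalent_luminosity_fb(443790, 62.0)
```

Samples whose equivalent luminosity is close to 10 fb−1 get weights of order one, which is the motivation given above for the choice of the reference luminosity.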
Other potential sources of background can be cross-checked by running on the CSA07 data soups¹. No major contamination arises: the events surviving the selection cuts are already considered in the separate samples, except for a tiny contribution from W+jets.

1 There are three CSA07 data soups: Chowder, containing the ALPGEN W+jet, Z+jet and tt̄+jet samples; Stew, containing lepton-enriched QCD, bottomonia and charmonia; Gumbo, containing QCD, Photon+jets and MinBias.

9.3.1 Event variables and cuts

The signal final state is characterized by the presence of one muon, one electron, missing transverse energy and, in the case of gg → bbA, b-tagged jets. Therefore it is interesting to look at the properties of the leptons, of the E̸T and of the jets in the signal and in the main background samples. As far as the leptons are concerned, the most interesting distributions are the pT, the difference between the φ angles of the muon and the electron (∆φ) and the combined impact parameter significance, defined as:

\[ \sigma = \sqrt{\left(\frac{d_{0e}}{\delta d_{0e}}\right)^{2} + \left(\frac{d_{0\mu}}{\delta d_{0\mu}}\right)^{2}} \tag{9.5} \]

where d0l and δd0l are the value and the error of the lepton impact parameter. The τ lepton has a mean lifetime of ∼ 290 × 10−15 s (cτ = 87.1 µm) and, coming from the Higgs decay, the τs are highly boosted. Therefore, they are expected to travel a few millimeters before decaying, and thus their daughters are significantly displaced from the primary vertex. The distributions of the main leptonic variables are shown in Fig. 9.4. From these plots, it is already clear that a minimum pT cut would reduce the Drell-Yan background, while ∆φ and σ cuts would reduce the other backgrounds.

Table 9.1: Signal samples. Cross sections and branching ratios have been computed with FeynHiggs [55]. Generator used: Pythia [56]; Data-set: CMSSW 167-CSA07.

Process            MA [GeV]   σ × BR [pb]      Events    ∫L [fb−1]
gg → A → ττ        100        224.1 × 10.9%    163398    6.69
gg → A → ττ        120        95.3 × 11.2%     188994    17.71
gg → A → ττ        140        45.0 × 11.5%     169994    32.85
gg → A → ττ        160        23.1 × 11.8%     177796    65.23
gg → A → ττ        180        12.6 × 11.9%     159593    106.44
gg → A → ττ        200        7.3 × 12.1%      181996    206.04
gg → bbA → ττ      100        868.4 × 10.9%    180196    1.90
gg → bbA → ττ      120        506.6 × 11.2%    201994    3.56
gg → bbA → ττ      140        313.8 × 11.5%    203994    5.65
gg → bbA → ττ      160        203.6 × 11.8%    194994    8.12
gg → bbA → ττ      180        137.3 × 11.9%    194794    11.92
gg → bbA → ττ      200        95.4 × 12.1%     167196    14.48
gg → bbA → ττ      300        21.5 × 11.8%     190596    75.13
gg → bbA → ττ      500        2.7 × 9.7%       198997    759.82
gg → bbA → ττ      800        0.3 × 7.9%       152396    6430.21

Figure 9.4: Characterization of leptons: (a) muon pT, (b) electron pT, (c) ∆φ between e and µ, (d) combined impact parameter significance.

In Fig. 9.5 the missing transverse energy distributions are shown. The Drell-Yan background has a softer E̸T distribution than the signal, while the other backgrounds have a harder one.

Figure 9.5: E̸T distribution for the signal and for the main backgrounds.

Table 9.2: Background samples and cross sections: tt̄ [4], Z/γ∗ → ττ [56], tW [57], WW [58], WZ [59], ZZ [60] and bbll [61].

Process                      Generator       Data-set               σ [pb]   Events     ∫L [fb−1]
bbll                         CompHEP [61]    CMSSW 167-CSA07        830      1931449    13.22
tW inclusive                 TopRex [62]     CMSSW 167-CSA07        62       443790     7.16
WW inclusive                 Pythia          CMSSW 167-CSA07        114.3    845260     7.40
WZ inclusive                 Pythia          CMSSW 167-CSA07        49.9     363290     7.28
ZZ inclusive                 Pythia          CMSSW 167-CSA07        16.1     136112     8.45
Z/γ∗ → ττ, Mττ > 10 GeV      Pythia          CMSSW 1612 Fast Sim    7559     9948432    1.32
tt̄                           Pythia          CMSSW 1612 Fast Sim    840      5899999    7.02

It is also interesting to look at the number of reconstructed jets per event, their pT, the b-tagging discriminator distribution and the number of b-tags per event. The corresponding plots are shown in Fig. 9.6. Wt and tt̄ have the largest jet and b-tagging activity, while the signal has a jet activity more similar to the Drell-Yan and WW backgrounds. It is worth noting that, although for the b-tagging discriminator distribution the gg → bbA sample behaves like Wt and tt̄ (Fig. 9.6(c)), it has a much lower number of b-tags per event (Fig. 9.6(d)), just slightly more than the other samples. This behavior is not obvious and will be addressed in §9.4.2.
Figure 9.6: Characterization of jets: (a) number of reconstructed jets per event, (b) jet pT, (c) discriminator for the TrackCountingHighEfficiency algorithm and (d) number of b-tags per event.

Finally, the distributions of the reconstructed τ momentum fractions xτ→e and xτ→µ are shown in Fig. 9.7. As expected from their topology, the signal and the Drell-Yan samples yield physical values (0 < xτ→l < 1) more often than the other backgrounds.

Figure 9.7: Distribution of the reconstructed τ momentum fractions xτ→e (a) and xτ→µ (b).

9.3.2 Fast Sim - Full Sim comparison

In order to save computing time and to simulate and reconstruct events quickly, the Fast Simulation makes use of approximations and parametrizations, thus avoiding many simulation and reconstruction steps performed in the Full Simulation. Therefore, the Fast Simulation is suitable for producing high-statistics samples, but its results are less accurate than those of the Full Simulation. Before trusting the Fast Simulation data for this analysis, a detailed comparison with a small sample of Full Simulation data has been performed. Fast and Full Simulation data have been produced with the same generator and the same CMSSW release cycle, so that the comparison is straightforward.
The first important difference is that, in the CMSSW 1_6_X releases, the trigger is not implemented in the Fast Simulation. Having estimated that only a few percent of the events can pass all the selection cuts without the required trigger bits, it was decided to skip the trigger requirement for the Fast Simulation. This can lead to an overestimation of the tt̄ and Drell-Yan backgrounds at the percent level. Looking at the distributions of the most important variables for this analysis, the Fast Simulation and the Full Simulation show very good agreement both for the characteristics of the leptons (Fig. 9.8) and for the properties of the jets (Fig. 9.9). A small difference can be noticed in the pT of the electron (Fig. 9.8(b)), but it occurs at values well below the trigger and selection cuts, so it can be neglected. Significant differences between Fast and Full Simulation are observed in the combined impact parameter significance distribution (Fig. 9.10). σ (Eq. 9.5) is computed from the muon and electron impact parameter values and their errors. The impact parameter values obtained with Full and Fast Simulation are compatible for both the electron and the muon, while the errors are not (Fig. 9.11).
A plausible explanation for the different behavior of the impact parameter errors is that the track reconstruction for the Fast Simulation is performed starting from a Gaussian smearing of the true simulated hits, without accounting for the long non-Gaussian tails of the hit position residual distributions and without considering the presence of fake hits or hits wrongly associated to the track. The impact parameter measurement is driven by the hits in the innermost layers, which are the most problematic because they have the highest occupancy. Neglecting the contribution of such hits in the Fast Simulation results in an underestimation of the impact parameter error. The effect of the error underestimate is to overestimate the combined impact parameter significance (Fig. 9.10) and, in the end, if a minimum cut on this variable is used, it would imply a significant overestimation of the background produced with the Fast Simulation.

In order to correct this bias, the most natural solution is to scale the impact parameter error in the Fast Simulation data, thus pushing the mean value of the Fast Simulation error distribution up to approximately the mean value of the Full Simulation one. The chosen scale factors are 1.5 for the electron and 1.7 for the muon. After the scaling, the resulting impact parameter significance distribution has a behavior compatible with the Full Simulation (Fig. 9.12 and Table 9.3). With this correction, the Fast Simulation can be considered as "validated" for the purposes of this analysis, and will be used throughout it.

Figure 9.8: Fast Simulation - Full Simulation comparison: muon (a) and electron (b) pT and ∆φeµ (c).

Figure 9.9: Fast Simulation - Full Simulation comparison: number of jets per event (a), jet pT distribution (b), number of b-tags per event (c).
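The effect of the error scaling on the combined impact parameter significance of Eq. 9.5 can be sketched as below. The scale factors are the ones quoted in the text (1.5 for the electron, 1.7 for the muon); the function name, the argument layout and the example values are assumptions made for illustration only.

```python
import math

# Scale factors applied to the Fast Simulation impact parameter errors
ERROR_SCALE = {"e": 1.5, "mu": 1.7}

def combined_ip_significance(d0_e, err_e, d0_mu, err_mu, fast_sim=False):
    """Combined impact parameter significance (Eq. 9.5).

    All inputs share the same length unit (e.g. cm). When fast_sim is True
    the errors are inflated by ERROR_SCALE before taking the ratios.
    """
    if fast_sim:
        err_e = err_e * ERROR_SCALE["e"]
        err_mu = err_mu * ERROR_SCALE["mu"]
    # Quadrature sum of the electron and muon significances
    return math.hypot(d0_e / err_e, d0_mu / err_mu)

# Hypothetical lepton pair: the scaled errors lower the significance
raw = combined_ip_significance(0.006, 0.004, 0.004, 0.002)
scaled = combined_ip_significance(0.006, 0.004, 0.004, 0.002, fast_sim=True)
```

Since the errors enter the denominators, inflating them always reduces σ, which is the direction needed to bring the Fast Simulation distribution back towards the Full Simulation one.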
Figure 9.10: Fast Simulation - Full Simulation comparison: combined impact parameter distribution.

Table 9.3: Mean and RMS values of the combined impact parameter significance distribution for the Full Simulation and the Fast Simulation before and after the scaling of the impact parameter errors.

Sample                 mean   RMS
DY Full Sim            2.5    1.8
DY Fast Sim            3.2    2.2
DY Fast Sim scaled     2.3    1.8
tt̄ Full Sim            1.8    1.2
tt̄ Fast Sim            2.4    1.7
tt̄ Fast Sim scaled     1.6    1.3
Figure 9.11: Fast Simulation - Full Simulation comparison: impact parameter value for muon (a) and electron (b) and impact parameter error for muon (c) and electron (d).

Figure 9.12: Fast Simulation - Full Simulation comparison: combined impact parameter distribution after the scaling of the impact parameter error.

9.4 Crucial issues

This analysis is very delicate in some aspects that, if not properly addressed, can lead to a low signal-to-noise ratio, a poor measurement of the Higgs mass or an incorrect or biased estimate of the production cross section. In particular, a feed-through of τ jet events into the final selection, if not properly taken into account, would imply an overestimate of the signal. Also, b-tagging is a very challenging task for the signal events and, if not properly studied, would make it difficult to suppress the background and to weight the two signal contributions. Finally, a precise E̸T measurement is fundamental to isolate a Higgs mass peak from the background.

9.4.1 Lepton mis-identification

Simulated samples contain fully leptonic, fully hadronic and semileptonic di-τ decay final states. Thus, the rate of mis-identification of τ jets as leptons can be studied. In order to check whether loose selection requirements on the lepton identification and isolation are sufficient to exclude hadronic and semileptonic τ decay events from the final selection, the following test has been performed using the gg → bbA → ττ sample with MA = 160 GeV.
Leptons are required to be isolated from other tracks (muons: no tracks with pT > 1 GeV in a cone with ∆R = 0.4; electrons: no tracks with pT > 1.5 GeV in a 0.02 < ∆R < 0.2 cone), electrons are identified with the loose fixed threshold electron-id and a preliminary set of cuts is applied (pT,e and pT,µ > 20 GeV, σ > 2, 100° < ∆φ < 170°, #jets = [0, 1], #b-tags = [0, 1]). With these selection criteria, the final mass plot contains about 15% of events with a feed-through of τ jets (Fig. 9.13).

Figure 9.13: Feed-through of τ jets using tracker isolation only and loose fixed threshold electron-id. The white area represents the total entries in the invariant mass plot, while the red area represents the entries with mis-identified leptons.

Clearly such a high rate of feed-through is not acceptable for a precise estimate of the signal contribution in the e-µ final state. The largest fake contribution comes from the mis-identification of τ jet pions as electrons. To reduce it, a stronger electron-id can be exploited. The highest purity is obtained with the tight category based electron-id. Using it with the same set of cuts, the feed-through drops to the 5% level.

An additional requirement of a calo-based lepton isolation leads to a further reduction of the feed-through. Requiring that the sum of the HCAL and ECAL deposits within a cone with ∆R = 0.4 around the muon is less than 4 GeV, and that the energy deposits in ECAL and HCAL around the electron are less than 0.05 and 0.2 times the electron super cluster energy respectively, the feed-through rate drops to 3%.

9.4.2 b-tagging

The capability to tag b jets is crucial for the background suppression and for the signal process selection (gg → A and gg → bbA). Both bbA and tt̄ have two generated b quarks per event. However, as can be seen in Fig.
9.6(d), the number of b-tags per event is very different: in almost 75% of tt̄ events there is at least one b-tag, while more than 85% of bbA events have no b-tags. This difference can be understood by looking at the properties of the b quarks in the two samples (Fig. 9.14).

Figure 9.14: (a) pT distribution of the b partons in bbA (black) and tt̄ (red) samples. (b) η distribution of b quarks in the same samples.

The b quarks in the tt̄ sample have a hard pT spectrum, peaking at ∼ 40 GeV, and a narrow η distribution, while those in the signal sample peak at pT = 5 GeV and have a broad η distribution. These topological characteristics imply that in bbA events only a few b quarks end up in a taggable reconstructed jet. The CMS b-tagging group has developed a validation tool that evaluates the performance of b-tagging algorithms. The inputs to this tool are the reconstructed jets with pT > 30 GeV and |η| < 2.4, which can be considered as a sort of definition of taggable jets. The validation tool also makes use of an associator that matches each reconstructed jet to the parton that originated it². Using this associator, the number of reconstructed jets matching a b quark per event has been studied and the result is reported in Fig. 9.15. The difference between the two samples is remarkable: in 80% of the bbA events the b quarks do not generate any reconstructed jet (this happens in just 10% of tt̄ events).
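The counting of taggable jets associated to a b quark can be illustrated with a simplified stand-in for the associator. The taggable-jet thresholds (pT > 30 GeV, |η| < 2.4) come from the text; the ∆R matching criterion, the cone size of 0.5 and all names are assumptions made for this sketch, which does not reproduce the actual JetFlavourIdentifier logic.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the phi difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def n_b_matched_taggable_jets(jets, b_quarks, cone=0.5):
    """Count taggable jets with a generated b quark inside the cone.

    jets: list of (pt, eta, phi); b_quarks: list of (eta, phi).
    cone = 0.5 is an assumed matching radius, not a value from the text.
    """
    count = 0
    for pt, eta, phi in jets:
        taggable = pt > 30.0 and abs(eta) < 2.4
        if taggable and any(delta_r(eta, phi, b_eta, b_phi) < cone
                            for b_eta, b_phi in b_quarks):
            count += 1
    return count

# Toy event: only the first jet is both taggable and close to a b quark
jets = [(45.0, 0.5, 1.0), (20.0, 0.1, 0.2), (60.0, 3.0, -2.0)]
b_quarks = [(0.6, 1.1), (0.1, 0.25), (3.0, -2.0)]
n_matched = n_b_matched_taggable_jets(jets, b_quarks)
```

In the toy event the second b quark gives a jet below the pT threshold and the third one is too forward, which mirrors the soft and broad-η b quark kinematics of the bbA sample described above.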
Clearly this shows that the small number of b-tags in the associated production signal events is not due to a low b-tagging efficiency but to the topological characteristics of the b partons. In fact, even with an ideal and fully efficient b-tagging algorithm, the number of b-tags per event could not exceed 0.2.

2 Basically, this associator, called JetFlavourIdentifier, checks whether among the generated particles there is a b quark in the jet cone extrapolated to the primary vertex.

Figure 9.15: Number of reconstructed jets with pT > 30 GeV and |η| < 2.4 associated to a b parton per event in bbA (black) and tt̄ (red) samples.

Since b-tagging is a tough task, the performance of the various algorithms provided by the b-tagging group³ has been privately evaluated (and cross-checked with the official b-tagging validation tool) in order to find the most suitable one for this analysis. It turns out that the best performance is provided by the TrackCountingHighEfficiency algorithm (Table 9.4).

Table 9.4: b-tagging performance for the TrackCountingHighEfficiency algorithm on the tt̄ background and on the two signal production processes with MA = 160 GeV before (first three rows) and after the preselection request of one e and one µ in the event (last three rows).

sample             # jets b assoc   # jets non-b assoc   # b-tags   b effic.   ucsd effic.   fake rate
tt̄                 1.5              2.8                  1.5        74%        15%           27%
gg → bbA           0.3              1.7                  0.3        65%        9%            45%
gg → A             0.02             2.1                  0.14       70%        6%            92%
tt̄ (e+µ)           1.3              0.8                  1          72%        13%           10%
gg → bbA (e+µ)     0.2              0.4                  0.2        68%        15%           28%
gg → A (e+µ)       0.01             0.8                  0.05       66%        6%            86%

It is worth noting that the fake rate is higher in the signal samples than in tt̄ and that, after requiring a muon and an electron in the event, the fake rate is reduced, suggesting that many fakes come from the mis-tagging of τ jets.
Of course, in the gg → A sample, as no b quarks are expected, the b-tags are almost all fakes. Nevertheless, before the preselection, the total number of b-tags in the gg → A sample is half that in the gg → bbA sample. After the preselection request, this fraction is still high, being around 25%.

3 The algorithms tested were: TrackCountingHighEfficiency, TrackCountingHighPurity, JetProbability and ImpactParameterMVA.

These results show that the use of b-tagging in this analysis is not trivial and that it has to be carefully considered. In particular, a request of one b-tag in the event highly reduces the gg → bbA sample, while it suppresses, even if not completely, the gg → A. Actually, in both the production processes, most of the signal has no b-tags and, in order to reduce the tt̄ background, a b-tagging veto can be exploited.

9.4.3 Missing Transverse Energy measurement

As already said in §9.2.3, the E̸T resolution is a key issue for this analysis because it is the most delicate term in the collinear approximation invariant mass formula. Since Type-0 E̸T is too rough for this analysis, several jet correction algorithms⁴ have been tested in order to improve the E̸T resolution. The best results are obtained with the combination of the Zero Suppression and the Jet Plus Track correction (JetPlusTrackZSPCorJetIcone5). A comparison between the JetPlusTrackZSPCorJetIcone5 and the Monte Carlo correction for Iterative Cone 5 jets (MCJetCorJetIcone5) is reported in Fig. 9.16.
Figure 9.16: Comparison between E̸T corrected with JetPlusTrackZSPCorJetIcone5 (blue) and with MCJetCorJetIcone5 (red). (a) Residual of the reconstructed E̸T with respect to the E̸T calculated at generator level. (b) Invariant mass plot using the two correction algorithms (sample: gg → bbA with MA = 160 GeV; the cuts used are described in §9.5.2, soft strategy).

The residual with respect to the E̸T calculated at generator level (Fig. 9.16(a)) has shorter tails for the JetPlusTrackZSPCorJetIcone5 correction. Also, it is less biased, having a mean value close to zero (0.6 GeV for JetPlusTrackZSPCorJetIcone5, 6.0 GeV for MCJetCorJetIcone5). Consequently, the invariant mass distribution (Fig. 9.16(b)) obtained with JetPlusTrackZSPCorJetIcone5 is narrower than the one obtained with MCJetCorJetIcone5 (Gaussian fit σ equal to 44.8 GeV and 66.3 GeV, respectively). Such an improvement is fundamental to be able to distinguish the signal peak on top of the background contributions. The only drawback of the JetPlusTrackZSPCorJetIcone5 correction seems to be a tendency to underestimate the mean Higgs mass value (equal to 152.2 GeV for a generated 160 GeV Higgs in the example above). Possible improvements for this correction algorithm are being investigated.
4 The tested jet corrections are: MCJetCorJetIcone5, MCJetCorJetMcone5, MCJetCorJetFastjet6 and JetPlusTrackZSPCorJetIcone5.

9.5 Event selection

9.5.1 Summary of Selection Cuts

A review of the variables that can be used to apply cuts is the following:

• Trigger: the trigger request of a single isolated or relaxed electron or muon has been described in §9.2.1.

• µ, e (qµ · qe < 0): in order to suppress the backgrounds with more than two leptons in the final state, only the events with exactly one electron and one muon with opposite charges are considered.

• Iso + Id: strong identification and isolation requirements are fundamental to reduce the feed-through of τ jets, as discussed in §9.4.1.

• pT,e, pT,µ: a lower cut on the lepton pT is needed to suppress the contribution from the Z/γ∗ → ττ background (Fig. 9.4(a) and 9.4(b)). This cut has a small effect on the invariant mass distribution: the harder the cut, the higher the reconstructed mass values.

• σ: both the signal and the backgrounds peak at combined impact parameter significance values of ∼ 2 (Fig. 9.4(d)). Nevertheless, the signal shows longer tails, and a minimum cut on σ is useful to reject many background events (mostly tt̄). Cutting at high values of σ is not advisable, since it would reject too many signal events and would degrade the invariant mass distribution: the requirement of highly displaced leptons is in contrast with the collinearity assumption used to reconstruct the mass.

• #jets: the jet activity of the signal samples is much lower (∼ 74% of events with no jets for gg → bbA and ∼ 62% for gg → A) than that of the tt̄ and Wt backgrounds (∼ 10% and ∼ 13%), while it is slightly higher than in the WW and Z/γ∗ → ττ samples (∼ 80% and ∼ 84%). Therefore, a veto on the number of jets is useful to reduce tt̄ and Wt without rejecting many signal events.

• #b-tags: As discussed in §9.4.2, the request on the number of b-tags is the key cut for this analysis.
Several strategies based on this selection criterion can be developed in order to suppress a particular background or to select the signal from the associated production process. These strategies will be discussed in the next section.

• ∆φmin: As shown in Fig. 9.4(c), when the τs come from the decay of the same neutral boson (signal and Drell-Yan samples) the directions of the final state leptons tend to be back-to-back. In the other cases, the directions are not correlated, and the ∆φ distribution is almost flat. A minimum cut on ∆φ is thus useful to reduce the contribution from all the backgrounds except Z/γ∗.

• ∆φmax: On the other hand, as discussed in §9.2.6, if the two leptons are exactly back-to-back, the collinear approximation is not valid. Therefore, a cut on the maximum value of ∆φ is needed. Of course, if this cut value is very high, few signal events are rejected but their mass is not accurately reconstructed; conversely, if the cut is too low, the reconstructed mass has a better resolution, but most of the signal events are lost.

• 0 < x < 1: Another consistency check of the collinear approximation is provided by the reconstructed τ momentum fractions: x < 0 and x > 1 are unphysical values. This request is also useful to reduce some backgrounds (see Fig. 9.7).

9.5.2 Analysis Strategies

The application of the b-tagging selection cut is not obvious, as it can lead to different signal and background contributions. Therefore, three different strategies have been developed:

• soft strategy: #b-tags = [0, 1]. Inclusive approach: it keeps high statistics for the signal and applies no severe reduction of the background. A gluon fusion signal contribution survives also in this case (∼ 19%).

• b-one strategy: #b-tags = [1]. This strategy has an exclusive approach, as it suppresses the Drell-Yan background and the gg → A signal, and is dominated by the tt̄ and gg → bbA contributions.

• b-zero strategy: #b-tags = [0].
A selective approach is accomplished by strongly reducing the tt̄ background. The signal has a significant contribution from both production processes (∼ 17% of the total).

The list of cuts used for each strategy is summarized in Table 9.5.

Table 9.5: Selection cuts for the three strategies.

cut             b-one     b-zero    soft
trigger         single e OR µ, iso OR relaxed (all strategies)
eId             tight category based (all strategies)
Iso             track and calo based (all strategies)
pT,e > [GeV]    28        32        30
pT,µ > [GeV]    23        27        25
σ >             2         2         2
#jets           [0,1]     [0,1]     [0,1]
#b-tags         [1]       [0]       [0,1]
∆φmin [°]       0         120       100
∆φmax [°]       175       175       170
xe,µ            (0,1)     (0,1)     (0,1)

The samples can be grouped into three categories according to the shape of their invariant mass plot:

• z-shape: all the backgrounds with a Z boson: Z/γ*, bbll, WZ, ZZ. It is dominated by the Drell-Yan contribution.
• tt̄-shape: all the backgrounds with two W bosons in the final state: tt̄, tW, WW. The main contribution is tt̄.
• signal: the gg → bbA and gg → A samples.

For each category, the invariant mass distributions after the selection cuts are reported in Fig. 9.17-9.19, while the cut efficiencies are listed in Table 9.6-9.8.

[Figure 9.17, four panels showing entries/25 GeV versus mττ [GeV]: (a) z-shape background (sum, DY, bbll, WZ, ZZ); (b) tt̄-shape background (sum, ttbar, Wt, WW); (c) signal for mA = 160 GeV (gg→bbA, gg→A); (d) signal+background (all, bck sum, bckz, bckt).]
Figure 9.17: Invariant mass plot after the selection cuts for the soft strategy: z-shape (a) and tt̄-shape (b) backgrounds, signal contribution (c) and sum of signal and background (d).
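As an illustration only, the kinematic and topological cuts of Table 9.5, together with the total and relative efficiencies quoted in the cut-efficiency tables, could be sketched in Python as follows. The event record (a dictionary with keys such as `pt_e`, `dphi`, `n_btags`), the function names and the inclusive treatment of the ∆φ boundaries are assumptions made for this sketch; they are not taken from the CMSSW analysis code, and the trigger, identification and isolation steps are omitted.

```python
# Illustrative sketch only: event fields and function names are hypothetical,
# not taken from the CMSSW analysis code.

# Kinematic and topological cuts of Table 9.5 (trigger, eId and Iso omitted).
STRATEGIES = {
    "b-one":  dict(pt_e=28.0, pt_mu=23.0, sigma=2.0, jets=(0, 1), btags=(1, 1), dphi=(0.0, 175.0)),
    "b-zero": dict(pt_e=32.0, pt_mu=27.0, sigma=2.0, jets=(0, 1), btags=(0, 0), dphi=(120.0, 175.0)),
    "soft":   dict(pt_e=30.0, pt_mu=25.0, sigma=2.0, jets=(0, 1), btags=(0, 1), dphi=(100.0, 170.0)),
}


def cut_sequence(s):
    """Ordered (name, predicate) pairs, following the order of the efficiency tables."""
    return [
        ("pT,e",     lambda ev: ev["pt_e"] > s["pt_e"]),
        ("pT,mu",    lambda ev: ev["pt_mu"] > s["pt_mu"]),
        ("sigma",    lambda ev: ev["sigma"] > s["sigma"]),
        ("#jets",    lambda ev: s["jets"][0] <= ev["n_jets"] <= s["jets"][1]),
        ("#b-tags",  lambda ev: s["btags"][0] <= ev["n_btags"] <= s["btags"][1]),
        # Inclusive boundaries on delta-phi (in degrees) are an assumption.
        ("dphi_min", lambda ev: ev["dphi"] >= s["dphi"][0]),
        ("dphi_max", lambda ev: ev["dphi"] <= s["dphi"][1]),
        ("x_e,mu",   lambda ev: 0.0 < ev["x_e"] < 1.0 and 0.0 < ev["x_mu"] < 1.0),
    ]


def cut_flow(events, strategy):
    """Apply the cuts sequentially.

    Returns a list of (cut name, total efficiency [%], relative efficiency [%]),
    where 'total' is relative to the initial sample and 'relative' is relative
    to the events surviving the previous cut.
    """
    n_start = len(events)
    surviving = list(events)
    rows = []
    for name, passes in cut_sequence(STRATEGIES[strategy]):
        n_before = len(surviving)
        surviving = [ev for ev in surviving if passes(ev)]
        total = 100.0 * len(surviving) / n_start if n_start else 0.0
        relative = 100.0 * len(surviving) / n_before if n_before else 0.0
        rows.append((name, total, relative))
    return rows
```

Calling `cut_flow(events, "soft")` on a list of such event dictionaries returns one row per cut, in the same "total (relative)" spirit as the efficiency tables; multiplying the last total efficiency by the initial number of events gives the surviving yield.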
[Figure 9.18, four panels showing entries/25 GeV versus mττ [GeV]: (a) z-shape background (sum, DY, bbll, WZ, ZZ); (b) tt̄-shape background (sum, ttbar, Wt, WW); (c) signal for mA = 160 GeV (gg→bbA, gg→A); (d) signal+background (all, bck sum, bckz, bckt).]
Figure 9.18: Invariant mass plot after the selection cuts for the b-one strategy: z-shape (a) and tt̄-shape (b) backgrounds, signal contribution (c) and sum of signal and background (d).

[Figure 9.19, four panels showing entries/25 GeV versus mττ [GeV]: (a) z-shape background (sum, DY, bbll, WZ, ZZ); (b) tt̄-shape background (sum, ttbar, Wt, WW); (c) signal for mA = 160 GeV (gg→bbA, gg→A); (d) signal+background (all, bck sum, bckz, bckt).]
Figure 9.19: Invariant mass plot after the selection cuts for the b-zero strategy: z-shape (a) and tt̄-shape (b) backgrounds, signal contribution (c) and sum of signal and background (d).

Table 9.6: Total and relative (in parentheses) selection cut efficiencies for the soft strategy. Nevents is the number of events for ∫L = 10 fb⁻¹.
sample                  Z/γ*               bbll               WZ                 ZZ
Nevents (before cuts)   75590000           1460800            499000             499000
ε(trigger) [%]          100 (100)          68.94 (17.0)       16.98 (68.9)       12.45 (12.4)
ε(e±,µ∓,eId) [%]        0.59 (0.6)         1.48 (5.5)         0.93 (2.2)         0.40 (3.2)
ε(Iso) [%]              0.20 (32.9)        0.34 (10.4)        0.10 (23.1)        0.05 (12.6)
ε(pT,e) [%]             0.03 (13.1)        0.10 (57.8)        0.06 (28.2)        0.02 (44.1)
ε(pT,µ) [%]             5·10⁻³ (20.7)      0.01 (57.6)        0.03 (11.6)        8·10⁻³ (36.7)
ε(σ) [%]                2·10⁻³ (45.9)      7·10⁻³ (32.5)      0.01 (60.2)        4·10⁻³ (45.4)
ε(#jets) [%]            2·10⁻³ (93.4)      5·10⁻³ (84.2)      9·10⁻³ (81.5)      3·10⁻³ (80.0)
ε(#b-tags) [%]          2·10⁻³ (100)       5·10⁻³ (100)       9·10⁻³ (100)       3·10⁻³ (100)
ε(∆φmin) [%]            2·10⁻³ (85.0)      4·10⁻³ (75.0)      7·10⁻³ (68.9)      1·10⁻³ (50.0)
ε(∆φmax) [%]            1·10⁻³ (45.8)      3·10⁻³ (87.5)      6·10⁻³ (71.2)      1·10⁻³ (100)
ε(xe,µ) [%]             5·10⁻⁴ (35.2)      1·10⁻³ (14.3)      1·10⁻³ (26.9)      0 (0)
Nevents (after cuts)    235.6              10.6               4.1                0

sample                  tt̄                 tW                 WW
Nevents (before cuts)   8400000            620000             1143000
ε(trigger) [%]          100 (100)          30.81 (22.3)       22.32 (22.3)
ε(e±,µ∓,eId) [%]        6.50 (6.4)         3.63 (11.3)        2.52 (11.3)
ε(Iso) [%]              0.62 (9.5)         0.75 (30.2)        0.76 (30.2)
ε(pT,e) [%]             0.42 (68.4)        0.58 (67.8)        0.52 (67.8)
ε(pT,µ) [%]             0.31 (72.1)        0.45 (74.0)        0.38 (74.0)
ε(σ) [%]                0.08 (25.5)        0.15 (31.8)        0.12 (31.8)
ε(#jets) [%]            0.03 (43.7)        0.12 (96.3)        0.12 (96.3)
ε(#b-tags) [%]          0.03 (100)         0.12 (100)         0.12 (100)
ε(∆φmin) [%]            0.03 (74.1)        0.10 (80.6)        0.09 (80.6)
ε(∆φmax) [%]            0.02 (81.9)        0.08 (79.2)        0.08 (79.2)
ε(xe,µ) [%]             5·10⁻³ (22.2)      0.01 (4.7)         4·10⁻³ (4.7)
Nevents (after cuts)    379.1              81.2               40.5

sample                  gg→bbA             gg→A
Nevents (before cuts)   259080             40640
ε(trigger) [%]          29.24 (29.2)       28.43 (28.4)
ε(e±,µ∓,eId) [%]        4.00 (13.6)        4.07 (14.3)
ε(Iso) [%]              1.34 (33.6)        1.40 (34.5)
ε(pT,e) [%]             0.66 (49.1)        0.70 (50.0)
ε(pT,µ) [%]             0.30 (44.7)        0.35 (50.0)
ε(σ) [%]                0.18 (60.5)        0.22 (62.2)
ε(#jets) [%]            0.17 (95.1)        0.19 (88.7)
ε(#b-tags) [%]          0.17 (100)         0.19 (100)
ε(∆φmin) [%]            0.17 (98.5)        0.18 (93.6)
ε(∆φmax) [%]            0.08 (46.6)        0.10 (55.9)
ε(xe,µ) [%]             0.03 (44.1)        0.05 (51.7)
Nevents (after cuts)    89.1               21.4

Table 9.7: Total and relative selection cut efficiencies for the b-one strategy.
Nevents is the number of events for ∫L = 10 fb⁻¹; relative efficiencies are given in parentheses.

sample                  Z/γ*               bbll               WZ                 ZZ
Nevents (before cuts)   75590000           1460800            499000             499000
ε(trigger) [%]          100 (100)          68.94 (68.9)       17.00 (17.0)       12.45 (12.5)
ε(e±,µ∓,eId) [%]        0.59 (0.6)         1.48 (2.2)         0.93 (5.5)         0.40 (3.2)
ε(Iso) [%]              0.19 (32.9)        0.34 (23.1)        0.10 (10.4)        0.05 (12.6)
ε(pT,e) [%]             0.03 (16.3)        0.11 (33.2)        0.06 (63.2)        0.02 (47.1)
ε(pT,µ) [%]             7·10⁻³ (24.4)      0.02 (15.5)        0.04 (58.1)        0.01 (40.6)
ε(σ) [%]                3·10⁻³ (43.2)      0.01 (60.3)        0.01 (33.3)        4·10⁻³ (46.2)
ε(#jets) [%]            3·10⁻³ (94.0)      9·10⁻³ (85.9)      0.01 (83.7)        3·10⁻³ (66.7)
ε(#b-tags) [%]          1·10⁻⁵ (0.3)       2·10⁻³ (18.8)      0 (0)              0 (0)
ε(∆φmin) [%]            1·10⁻⁵ (100)       2·10⁻³ (100)       0 (0)              0 (0)
ε(∆φmax) [%]            1·10⁻⁵ (100)       2·10⁻³ (100)       0 (0)              0 (0)
ε(xe,µ) [%]             0 (0)              7·10⁻⁴ (42.4)      0 (0)              0 (0)
Nevents (after cuts)    0                  10.6               0                  0

sample                  tt̄                 tW                 WW
Nevents (before cuts)   8400000            620000             1143000
ε(trigger) [%]          100 (100)          30.81 (30.8)       22.32 (22.3)
ε(e±,µ∓,eId) [%]        6.50 (6.5)         3.63 (11.8)        2.52 (11.3)
ε(Iso) [%]              0.62 (9.5)         0.75 (20.7)        0.76 (30.2)
ε(pT,e) [%]             0.44 (71.7)        0.60 (80.4)        0.55 (72.4)
ε(pT,µ) [%]             0.33 (75.1)        0.48 (79.4)        0.43 (77.2)
ε(σ) [%]                0.08 (25.2)        0.16 (32.7)        0.14 (31.8)
ε(#jets) [%]            0.04 (43.8)        0.13 (79.9)        0.13 (96.5)
ε(#b-tags) [%]          0.02 (51.4)        0.06 (46.3)        2·10⁻³ (1.8)
ε(∆φmin) [%]            0.02 (100)         0.06 (100)         2·10⁻³ (100)
ε(∆φmax) [%]            0.02 (93.8)        0.05 (92.2)        2·10⁻³ (90.0)
ε(xe,µ) [%]             4·10⁻³ (20.3)      0.01 (21.5)        2·10⁻⁴ (11.1)
Nevents (after cuts)    286.8              64.4               2.7

sample                  gg→bbA             gg→A
Nevents (before cuts)   259080             40640
ε(trigger) [%]          29.24 (29.2)       28.43 (28.4)
ε(e±,µ∓,eId) [%]        4.00 (13.7)        4.07 (14.3)
ε(Iso) [%]              1.34 (33.6)        1.40 (34.5)
ε(pT,e) [%]             0.73 (54.5)        0.78 (55.4)
ε(pT,µ) [%]             0.36 (49.1)        0.42 (53.6)
ε(σ) [%]                0.22 (61.2)        0.26 (61.9)
ε(#jets) [%]            0.21 (95.6)        0.23 (88.0)
ε(#b-tags) [%]          0.02 (9.5)         5·10⁻³ (2.2)
ε(∆φmin) [%]            0.02 (100)         5·10⁻³ (100)
ε(∆φmax) [%]            0.02 (92.3)        5·10⁻³ (100)
ε(xe,µ) [%]             0.01 (69.4)        3·10⁻³ (66.7)
Nevents (after cuts)    33.3               1.4

Table 9.8: Total and relative selection cut efficiencies for the b-zero strategy.
Nevents is the number of events for ∫L = 10 fb⁻¹; relative efficiencies are given in parentheses.

sample                  Z/γ*               bbll               WZ                 ZZ
Nevents (before cuts)   75590000           1460800            499000             499000
ε(trigger) [%]          100 (100)          68.9 (68.9)        17.0 (17.0)        12.5 (12.5)
ε(e±,µ∓,eId) [%]        0.59 (0.6)         1.48 (2.2)         0.93 (5.5)         0.40 (3.2)
ε(Iso) [%]              0.19 (32.9)        0.34 (23.1)        0.10 (10.4)        0.05 (12.6)
ε(pT,e) [%]             0.02 (10.3)        0.08 (24.1)        0.05 (53.0)        0.02 (41.2)
ε(pT,µ) [%]             3·10⁻³ (16.4)      7·10⁻³ (9.0)       0.03 (55.4)        6·10⁻³ (28.6)
ε(σ) [%]                2·10⁻³ (49.4)      4·10⁻³ (55.9)      9·10⁻³ (31.1)      3·10⁻³ (50.0)
ε(#jets) [%]            2·10⁻³ (90.7)      3·10⁻³ (76.3)      7·10⁻³ (84.4)      2·10⁻³ (75.0)
ε(#b-tags) [%]          2·10⁻³ (99.3)      1·10⁻³ (70.5)      7·10⁻³ (100)       2·10⁻³ (100)
ε(∆φmin) [%]            1·10⁻³ (75.2)      2·10⁻³ (72.1)      5·10⁻³ (63.0)      7·10⁻⁴ (33.3)
ε(∆φmax) [%]            7·10⁻⁴ (63.3)      1·10⁻³ (90.3)      4·10⁻³ (94.1)      7·10⁻⁴ (100)
ε(xe,µ) [%]             2·10⁻⁴ (34.8)      4·10⁻⁴ (25.0)      8·10⁻⁴ (18.8)      0 (0)
Nevents (after cuts)    182.4              5.3                4.1                0

sample                  tt̄                 tW                 WW
Nevents (before cuts)   8400000            620000             1143000
ε(trigger) [%]          100 (100)          30.8 (30.8)        22.3 (22.3)
ε(e±,µ∓,eId) [%]        6.50 (6.5)         3.63 (11.8)        2.52 (11.3)
ε(Iso) [%]              0.62 (9.5)         0.75 (20.7)        0.76 (30.2)
ε(pT,e) [%]             0.40 (65.5)        0.56 (75.0)        0.48 (63.2)
ε(pT,µ) [%]             0.28 (69.2)        0.42 (74.9)        0.34 (71.6)
ε(σ) [%]                0.07 (25.8)        0.14 (32.7)        0.11 (31.9)
ε(#jets) [%]            0.03 (43.9)        0.11 (79.9)        0.11 (96.3)
ε(#b-tags) [%]          0.02 (49.2)        0.06 (53.8)        0.10 (98.1)
ε(∆φmin) [%]            0.01 (65.0)        0.05 (78.0)        0.08 (73.2)
ε(∆φmax) [%]            9·10⁻³ (89.3)      0.04 (86.8)        0.07 (88.0)
ε(xe,µ) [%]             2·10⁻³ (23.0)      0.01 (14.0)        5·10⁻³ (6.9)
Nevents (after cuts)    167.6              30.8               52.7

sample                  gg→bbA             gg→A
Nevents (before cuts)   259080             40640
ε(trigger) [%]          29.24 (29.2)       28.42 (28.4)
ε(e±,µ∓,eId) [%]        4.00 (13.7)        4.07 (14.3)
ε(Iso) [%]              1.34 (33.6)        1.40 (34.5)
ε(pT,e) [%]             0.60 (44.5)        0.64 (45.4)
ε(pT,µ) [%]             0.25 (41.2)        0.30 (46.6)
ε(σ) [%]                0.15 (59.0)        0.18 (61.2)
ε(#jets) [%]            0.14 (94.4)        0.16 (86.7)
ε(#b-tags) [%]          0.13 (91.8)        0.15 (97.5)
ε(∆φmin) [%]            0.12 (98.4)        0.14 (89.4)
ε(∆φmax) [%]            0.08 (66.8)        0.10 (70.1)
ε(xe,µ) [%]             0.03 (39.1)        0.04 (43.3)
Nevents (after cuts)    83.8               17.0

9.6 Results

9.6.1 Fitting with known background shapes

For all the strategies, a significant signal contribution survives after the selection cuts are applied. The significance of the observed Higgs events can be estimated either through a simple counting of background and signal events or through a fit that discriminates the signal contribution from the background contamination.

In order to disentangle the signal contribution from the background, a first fitting strategy [64], which assumes a complete knowledge of the background shapes, has been developed. The two background contributions (z-shape and tt̄-shape) are first fitted separately with a Landau function⁵. The resulting fit functions have clearly distinct parameters: the z-shape Landau peaks at values close to the Z mass and is quite sharp, while the tt̄-shape Landau peaks at higher values (about 200 GeV) and has a large width. The signal contribution is parametrized as a Gaussian.

The three contributions (z-shape, tt̄-shape, signal) are then summed and a global fit to the resulting histogram is performed. In this fit, for each selection strategy, the most probable value (MPV) and the width of the background Landau functions are assumed to be known and are thus used as fixed parameters. The free parameters are the background weights (Wz, Wtt̄) and the signal mean (mA), width (σA) and weight (WA). Results are reported in Fig. 9.20-9.21. In all cases, the weights of the three contributions are consistent with one within the errors and the mass is compatible with the expected value of mA = 160 GeV, while the signal width is slightly underestimated. The resulting significances [65] are summarized in Table 9.9.

Table 9.9: Significances for the three strategies as estimated from event counting or from the fit, with and without statistical errors (mA = 160 GeV, tan β = 30 and ∫L = 10 fb⁻¹).
strategy   σcount   σcount+stat   σfit   σfit+stat
b-one      1.7      1.2           1.8    1.1
b-zero     4.6      3.3           5.3    2.7
soft       4.6      3.3           5.3    3.2

⁵ In the b-one strategy case, the z-shape background is suppressed and therefore its content is added to the tt̄-shape contribution.

[Figure 9.20, four panels showing entries/25 GeV versus mττ [GeV], with the fit results read from the statistics boxes. z-shape background fit: χ²/ndf = 77.34/37, A = 385.4 ± 38.1, MPV = 87.84 ± 2.18, w = 16.16 ± 1.25. tt̄-shape background fit: χ²/ndf = 56.3/37, A = 244.3 ± 16.4, MPV = 200.7 ± 5.4, w = 53.67 ± 2.94. Signal fit: χ²/ndf = 28.11/37, A = 24.03 ± 2.87, m = 154.2 ± 4.3, σ = 44 ± 3.0. Combined fit: χ²/ndf = 53.26/35, Wz = 0.9731 ± 0.1314, Wtt̄ = 0.9677 ± 0.0670, WA = 1.294 ± 0.307, mA = 153.3 ± 9.4, σA = 35.94 ± 5.79.]
Figure 9.20: Fit with known background shapes for the soft strategy.

[Figure 9.21, two panels (a) and (b).]
Figure 9.21: Global fit with known background shapes for the b-one (a) and b-zero (b) strategies.

9.6.2 Toy Monte Carlo

The uncertainties arising from the fitting procedure are evaluated by performing a set of toy experiments (Monte Carlo trials). For simplicity, they have been carried out for the soft strategy only. The toy experiments are implemented with the RooFit package [63]. The sum of the z-shape, tt̄-shape and signal fit functions in Fig. 9.20 is taken as the probability density function (pdf) for the toy Monte Carlo. At each trial, a Monte Carlo data sample is randomly built from the pdf. The sample yield is equal to the number of events expected at ∫L = 10 fb⁻¹. A global fit is then performed on the sample: the Landau parameters are fixed, while the weights and the signal mass and width are left free.
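The logic of such a toy study can be sketched in a few lines of standard-library Python. This is a deliberately simplified stand-in, not the RooFit implementation used in the thesis: the Landau shapes are replaced by the Moyal approximation, the signal mean and width are kept fixed so that only the three yields are fitted, the fit is a weighted linear least-squares fit with variances taken from the generating model, and the pull is taken in its usual form (fitted − true)/error. The shape parameters and yields are the rounded values quoted for the soft strategy; all function names are hypothetical.

```python
import math
import random

N_BINS = 40
CENTERS = [12.5 + 25.0 * i for i in range(N_BINS)]  # 25 GeV bins, 0-1000 GeV


def moyal(x, mpv, w):
    """Moyal density, used here as an assumed stand-in for the Landau shape."""
    z = (x - mpv) / w
    return math.exp(-0.5 * (z + math.exp(-z))) / (math.sqrt(2.0 * math.pi) * w)


def gauss(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def template(pdf, *pars):
    """Binned shape, normalised to unit sum over the histogram range."""
    vals = [pdf(x, *pars) for x in CENTERS]
    norm = sum(vals)
    return [v / norm for v in vals]


TEMPLATES = [
    template(moyal, 87.84, 16.16),  # z-shape background
    template(moyal, 200.7, 53.67),  # tt-shape background
    template(gauss, 154.2, 44.0),   # signal (shape fixed in this sketch)
]
TRUE_YIELDS = [244.2, 484.2, 105.9]
EXPECTED = [sum(y * t[i] for y, t in zip(TRUE_YIELDS, TEMPLATES)) for i in range(N_BINS)]


def poisson(rng, mu):
    """Knuth's multiplication method; adequate for the bin contents used here."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def invert(m):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


# Normal equations of the weighted least-squares fit; COV is the covariance
# matrix of the fitted yields (variances taken from the generating model).
H = [[sum(TEMPLATES[a][i] * TEMPLATES[b][i] / EXPECTED[i] for i in range(N_BINS))
      for b in range(3)] for a in range(3)]
COV = invert(H)


def fit_yields(counts):
    """Fit the three yields with all shapes fixed."""
    g = [sum(TEMPLATES[a][i] * counts[i] / EXPECTED[i] for i in range(N_BINS)) for a in range(3)]
    return [sum(COV[a][b] * g[b] for b in range(3)) for a in range(3)]


def run_toys(n_toys, seed=1):
    """Generate Poisson-fluctuated histograms, fit each, and collect the pulls."""
    rng = random.Random(seed)
    errors = [math.sqrt(COV[a][a]) for a in range(3)]
    pulls = [[], [], []]
    for _ in range(n_toys):
        counts = [poisson(rng, mu) for mu in EXPECTED]
        fitted = fit_yields(counts)
        for a in range(3):
            pulls[a].append((fitted[a] - TRUE_YIELDS[a]) / errors[a])
    return pulls
```

In this linear setting the pulls should scatter around zero with unit width; in the full fit, where the signal mass and width are also free, departures from that behaviour indicate biases or misestimated errors.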
The results for 5000 toy experiments are reported in Fig. 9.22-9.23. The signal parameters (yield, mean and sigma) do not show worrisome biases with respect to the nominal values. The background yields are reasonably consistent with the expectations.

[Figure 9.22, four panels. (a) PDF models, events/25 GeV versus mττ [GeV]: Signal (Nevs = 105.9), Bckg1 (Nevs = 244.2), Bckg2 (Nevs = 484.2), Model (Nevs = 834.2). (b) Last model fit, events/25 GeV versus mττ [GeV]. (c) Pull Yield Bckg1: 5000 entries, Gaussian fit mean = −0.2925 ± 0.0153, sigma = 1.018 ± 0.011. (d) Pull Yield Bckg2: 5000 entries, Gaussian fit mean = −0.03004 ± 0.01313, sigma = 0.9143 ± 0.0098.]
Figure 9.22: Toy Monte Carlo assuming the background shapes as known. (a): pdf shape for the toy Monte Carlo trials. (b): fit for the last toy sample. (c): pull distribution for the z-shape background yield. (d): pull distribution for the tt̄-shape background yield.

[Figure 9.23, three panels. (a) Pull Mean Signal: 5000 entries, Gaussian fit mean = −0.2161 ± 0.0181, sigma = 1.217 ± 0.017. (b) Pull Sigma Signal: 5000 entries, Gaussian fit mean = 0.1804 ± 0.0160, sigma = 1.013 ± 0.014. (c) Pull Yield Signal: 5000 entries, Gaussian fit mean = 0.339 ± 0.015, sigma = 1.011 ± 0.012.]
Figure 9.23: Pull distributions for the signal parameters obtained with the fit with known background shapes: mean (a), sigma (b) and yield (c).

9.6.3 Mass Scan

The same analysis has been performed for different values of mA, from 140 GeV to 800 GeV, with tan β = 30 and ∫L = 10 fb⁻¹. The results in terms of significance are summarized in Table 9.10.

Table 9.10: Significance for 140 ≤ mA ≤ 300 GeV from event counting or from the fit, with and without statistical error (tan β = 30 and ∫L = 10 fb⁻¹).

mA [GeV]   σcount   σcount+stat   σfit   σfit+stat
140        2.4      1.7           2.4    1.5
160        4.6      3.3           5.3    3.2
200        3.7      2.7           4.1    2.1
300        2.2      1.6           -      -

The highest sensitivity is reached in the range 160 ≤ mA ≤ 200 GeV, with a significance (with statistical errors) of the order of 2-3σ. For mA > 200 GeV the production cross section decreases fast and, even if the background is suppressed, the signal contribution is small. A significance estimate based on event counting is still possible up to mA = 300 GeV. At lower mass values, instead, even if the production cross section is higher, a large Drell-Yan background contamination leads to a lower sensitivity.

9.6.4 Fit constraining the ratio of the background yields

The previous results rely on a very good knowledge of the background shapes. Such a scenario is probably too optimistic, since the background shapes in real data may differ from those extracted from Monte Carlo. In addition, the background shapes are affected by the choice of the selection cuts, and cannot be immediately inferred from an analysis with loose selection criteria. Within the CMS collaboration, data-driven methods to estimate the background shapes have already been developed and could be applied to the present analysis [66]. Yet, they would not provide a perfect knowledge of the background shapes. Therefore, it has to be proven that the signal can be extracted from a fit where the background shapes are also free parameters.

A possible strategy is the following. The samples are selected with a set of loose cuts (loose strategy, see Table 9.11), leading to the results in Fig. 9.24. The background fit parameters differ from those of the soft strategy by less than 20%.

Table 9.11: Summary of loose strategy selection cuts.

cut             loose
trigger         single e OR µ, iso OR relaxed
eId             tight category based
Iso             track and calo based
pT,e > [GeV]    20
pT,µ > [GeV]    20
σ >             0
#jets           [0,2]
#b-tags         [0,1]
∆φmin [°]       0
∆φmax [°]       176
xe,µ            (0,1)

[Figure 9.24, four panels showing entries/25 GeV versus mττ [GeV]: (a) z-shape background (sum, DY, bbll, WZ, ZZ); (b) tt̄-shape background (sum, ttbar, Wt, WW); z-shape background fit: χ²/ndf = 1547/37, A = 9219 ± 170.3, MPV = 76.46 ± 0.44, w = 13.79 ± 0.18; tt̄-shape background fit: χ²/ndf = 106.4/37, A = 1927 ± 42.5, MPV = 195.2 ± 2.2, w = 65.27 ± 1.21.]
Figure 9.24: Background shapes obtained with loose selection cuts.

These values are used as the starting point for a fit where the background shapes are free parameters. The fit converges when a Gaussian constraint on the ratio between the number of z-shape and tt̄-shape events is provided. This ratio can be obtained by comparing the number of events selected with the loose strategy in Monte Carlo and in real data, and by assuming that the cut efficiencies for the soft strategy scale consistently for simulated and measured data. The result is reported in Fig. 9.25 and Table 9.12. All the parameters are consistent with the expected values within the errors, which, not surprisingly, are quite large on the yield estimates. Therefore, this fit strategy is reliable if the background parameters can be estimated within ∼ 20% of the true values for data after the final selection cuts.
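The text does not spell out how the Gaussian constraint enters the fit. One common form, given here only as an assumed illustration, adds a penalty term on the yield ratio to the negative log-likelihood. The symbols R₀ and σ_R, denoting the externally estimated ratio and its uncertainty, are introduced for this sketch and do not appear in the original.

```latex
% Assumed notation (not from the original text):
%   N_z, N_{t\bar t}   fitted z-shape and tt-shape yields
%   R_0, \sigma_R      externally estimated yield ratio and its uncertainty
-\ln \mathcal{L}_{\mathrm{constr}}
  = -\ln \mathcal{L}_{\mathrm{fit}}
  + \frac{1}{2}\left(\frac{N_z / N_{t\bar t} - R_0}{\sigma_R}\right)^{2}
```

For orientation, the expected yields quoted in Table 9.12 correspond to a ratio of 244.2/484.2 ≈ 0.504, while the fitted yields give 221.8/452.6 ≈ 0.490.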
[Figure 9.25, combined fit, events/25 GeV versus mττ [GeV], showing data and fit. Signal: yield = 186.6 ± 88.0, mA = 152.1 ± 9.2, σ = 36.4 ± 5.9. z-shape: yield = 221.8 ± 38.5, MPV = 84.2 ± 4.8, w = 14.5 ± 2.5. tt̄-shape: yield = 452.6 ± 73.6, MPV = 217.6 ± 29.2, w = 57.2 ± 7.8.]
Figure 9.25: Fit with constraint on the background yield ratio for the soft strategy.

Table 9.12: Parameters obtained from the fit with constraint on the background yield ratio and corresponding expected values.

                  Yield          mA/MPV         σA/width
signal (exp)      105.9±10.3     154.2±4.3      44.0±3.0
signal (fit)      186.6±88.0     152.1±9.2      36.4±5.9
z-shape (exp)     244.2±15.6     87.8±2.2       16.2±1.3
z-shape (fit)     221.8±38.5     84.2±4.8       14.5±2.5
tt̄-shape (exp)    484.2±22.0     200.7±5.4      53.7±2.9
tt̄-shape (fit)    452.6±73.6     217.6±29.2     57.2±7.8

This fitting procedure has also been tested with toy Monte Carlo experiments: some biases on the fit parameters are caused by the larger freedom given to the input parameters and by the yield ratio constraint, which pushes the two backgrounds in the same direction. In particular, the signal yield tends to be overestimated.

In conclusion, a possible strategy with free background shapes has also been identified. It seems to be sensitive to a signal excess already with 10 fb⁻¹ of integrated luminosity; the fit converges to the expected values if a reasonable input knowledge of the background shapes and an additional constraint on the fit are provided.

Conclusions

The A → ττ decay is a powerful channel for the MSSM Higgs discovery. Significances of 2-3σ have been estimated for the exclusive leptonic channel (A → ττ → eµE̸T) in the mass range 140-300 GeV and for tan β = 30 already at 10 fb⁻¹ integrated luminosity (Fig. 9.26).

[Figure 9.26, significance (σ) versus mA [GeV] for L = 10 fb⁻¹ and tan β = 30: significance from counting and significance from fit.]
Figure 9.26: Significance as a function of mA (tan β = 30 and ∫L = 10 fb⁻¹). Statistical errors are included.

The efficiency of each reconstruction and selection step has been evaluated. Key points of the selection strategies have been identified: b-tagging, missing transverse energy, lepton identification and isolation. They are potential sources of systematic effects, which will need to be properly estimated. The dominant uncertainty is expected from the E̸T measurement, which will depend on the actual calorimeter response and on the applied missing energy algorithm. Such issues, first addressed in the TDR, are still under study within the collaboration.

Global fits of the final selected samples have been performed to extract possible signal events from the background. Various fit strategies have been proposed, according to more severe or more relaxed assumptions on the background shapes, and systematic biases in the fit have been investigated via toy Monte Carlo analysis.

This work suggests that A → ττ → eµE̸T is a promising channel for the MSSM Higgs search. It has to be recalled that the final analysis will include the CP-even H boson, which, for the mA and tan β values considered in this analysis, is expected to be degenerate with A in terms of mass, coupling values and production cross section.

Bibliography

[1] L. Evans, P. Bryant, LHC Machine, JINST 3:S08001, 2008.
[2] R. Adolphi et al., The CMS experiment at the CERN-LHC, JINST 3:S08004, 2008.
[3] CMS Collaboration, CMS Physics TDR Volume 1, CERN-LHCC-2006-001 (2006).
[4] CMS Collaboration, CMS Physics TDR Volume 2, CERN-LHCC-2006-021 (2006).
[5] CMS Collaboration, The Magnet Project Technical Design Report, CERN/LHCC 97-010 (1997).
[6] CMS Collaboration, The Tracker Project Technical Design Report, CERN/LHCC 98-006 (1998). Addendum CERN/LHCC 2000-016.
[7] CMS Collaboration, The Electromagnetic Calorimeter Technical Design Report, CERN/LHCC 97-033 (1997). Addendum CERN/LHCC 2002-027.
[8] CMS Collaboration, The Hadron Calorimeter Technical Design Report, CERN/LHCC 97-031 (1997).
[9] CMS Collaboration, The Muon Project Technical Design Report, CERN/LHCC 97-32 (1997).
[10] CMS Collaboration, CMS. The TriDAS project. Technical design report, vol. 1: The trigger systems, CERN-LHCC-2000-038 (2000).
[11] CMS Collaboration, CMS: The TriDAS project. Technical design report, Vol. 2: Data acquisition and high-level trigger, CERN-LHCC-2002-026 (2002).
[12] S. Cucciarelli, M. Konecki, D. Kotlinski, T. Todorov, Track reconstruction, primary vertex finding and seed generation with the Pixel Detector, CMS NOTE-2006/026.
[13] R. Fruhwirth, Application of Kalman filtering to track and vertex fitting, Nucl.Instrum.Meth.A262:444-450, 1987.
[14] W. Adam, B. Mangano, Th. Speer, T. Todorov, Track Reconstruction in the CMS tracker, CMS NOTE-2006/041.
[15] P. Azzurri, B. Mangano, Optimal filtering of fake tracks, CMS IN-2008/017.
[16] M. Pioppi, Iterative tracking, CMS IN-2007/065.
[17] T. Miao, H. Wenzel, F. Yumiceva, N. Leioatts, Beam Position Determination using Tracks, CMS NOTE-2007/021.
[18] A. Strandlie, W. Wittek, Propagation of Covariance Matrices of Track Parameters in Homogeneous Magnetic Fields in CMS, CMS NOTE-2006/001.
[19] The CMS Offline SW Guide, https://twiki.cern.ch/twiki/bin/view/CMS/SWGuide.
[20] CMS OO Reconstruction, http://cmsdoc.cern.ch/orca/.
[21] G. B. Cerati, Outlier Rejection during the Final Track Fit, CMS IN-2008/007.
[22] W. Adam et al., Track Reconstruction with Cosmic Ray Data at the Tracker Integration Facility, to be published on Journal of Instrumentation.
[23] G. B. Cerati, Tracking performance with cosmic rays in CMS, to be included in the proceedings of the 11th Topical Seminar on Innovative Particle and Radiation Detectors (IPRD08), which will be published in Nuclear Physics B Proceedings Supplement.
[24] S. Weinberg, Phys. Rev. Lett. 19 (1967) 1264.
[25] A. Salam, Elementary Particle Theory, ed. N. Svartholm (Almquist and Wiksells, Stockholm, 1968), 367.
[26] M.E. Peskin, D.V.
Schroeder, An Introduction to Quantum Field Theory, Perseus Books (1995).
[27] F. Halzen, A. Martin, Quarks and Leptons: An Introductory Course in Modern Particle Physics, John Wiley and Sons, New York, 1984.
[28] C. Quigg, Gauge Theories of the Strong, Weak, and Electromagnetic Interactions, The Benjamin/Cummings Publishing Company, London, 1983.
[29] P.W. Higgs, Phys. Lett. 12 (1964) 132.
[30] S. Dawson, Introduction to the Physics of the Higgs Boson, hep-ph/9411325v1.
[31] H. Zheng, A Renormalization Group Analysis of the Higgs Boson with Heavy Fermions and Compositeness, hep-ph/9602340.
[32] S. P. Martin, A Supersymmetry Primer, hep-ph/9709356.
[33] D. I. Kazakov, Beyond the Standard Model (in search of Supersymmetry), hep-ph/0012288v2.
[34] S. Schael et al., Search for neutral MSSM Higgs bosons at LEP, Eur.Phys.J.C47:547-587, 2006.
[35] M. S. Carena, S. Heinemeyer, C. E. M. Wagner, G. Weiglein, MSSM Higgs boson searches at the Tevatron and the LHC: Impact of different benchmark scenarios, Eur.Phys.J.C45:797-814, 2006.
[36] CDF Collaboration, Search for Neutral MSSM Higgs Bosons Decaying to Tau Pairs with 1.8 fb⁻¹ of Data, CDF note 9071, 2007.
[37] CDF Collaboration, Search for Higgs Bosons Produced in Association with b-Quarks, CDF Note 9284 v1.0, 2008.
[38] J. Fernandez, Search for MSSM heavy neutral Higgs bosons in the four-b final state, CMS NOTE-2006/80.
[39] S. Lehti, Study of MSSM H/A → ττ → eµ + X in CMS, CMS NOTE-2006/101.
[40] S. Gennai, A. Nikitenko, L. Wendland, Search for MSSM heavy neutral Higgs boson in ττ → two jet decay mode, CMS NOTE-2006/126.
[41] R. Kinnunen, S. Lehti, Search for the heavy neutral MSSM Higgs bosons with the H/A → τ⁺τ⁻ → electron + jet decay mode, CMS NOTE-2006/075.
[42] A. Kalinowski, M. Konecki, D. Kotlinski, Search for MSSM Heavy Neutral Higgs Boson in tau + tau -> mu + jet Decay Mode, CMS NOTE-2006/105.
[43] R. Kinnunen, S. Lehti, F. Moortgat, A. Nikitenko, M.
Spira, Measurement of the H/A -> tau tau Cross Section and Possible Constraints on Tan(beta), CMS NOTE-2004/027.
[44] G. Abbiendi et al., Muon Reconstruction in the CMS Detector, CMS AN-2008/097.
[45] S. Baffioni et al., Electron reconstruction in CMS, CMS NOTE-2006/040.
[46] C. Charlot, C. Rovelli, Y. Sirois, Reconstruction of Electron Tracks Using Gaussian Sum Filter in CMS, CMS AN-2005/011.
[47] J. Branson, M. Gallinaro, P. Ribeiro, R. Salerno, M. Sani, A cut based method for electron identification in CMS, CMS AN-2008/082.
[48] S. Esen et al., E̸T Performance in CMS, CMS AN-2007/041.
[49] D. Green, O. Kodolova, I. Vardanian, S. Kunori, A. Nikitenko, T. Virdee, Energy Flow Objects and Usage of Tracks for Energy Measurement in CMS, CMS Note 2002/036.
[50] O. Kodolova, G. Bruno, I. Vardanian, A. Nikitenko, L. Fano, Jet Energy Correction with Charged Particle Tracks in CMS, CMS Note 2004/015.
[51] A. Ulyanov et al., Jet Reconstruction and Performance in the CMS Detector, CMS AN-2005/053.
[52] S. Esen et al., Plans for Jet Energy Corrections at CMS, CMS AN-2007/055.
[53] I. Tomalin, b Tagging in CMS, CMS CR-2007/041.
[54] A. Rizzi, F. Palla, G. Segneri, Track impact parameter based b-tagging with CMS, CMS NOTE-2006/019.
[55] S. Heinemeyer, W. Hollik, G. Weiglein, FeynHiggs: A program for the calculation of the masses of the neutral CP-even Higgs bosons in the MSSM, Comput. Phys. Commun. 124 (2000) 76-89, arXiv:hep-ph/9812320 (www.feynhiggs.de).
[56] T. Sjostrand et al., High-Energy-Physics Event Generation with PYTHIA 6.1, arXiv:hep-ph/0010017v1 (http://home.thep.lu.se/ torbjorn/Pythia.html).
[57] A. Belyaev, E. Boos, Single top quark tW + X production at the CERN LHC: A Closer look, Phys.Rev.D63:034012, 2001.
[58] C. Charlot et al., Search Strategy for a Standard Model Higgs Boson Decaying to Two W Bosons in the Fully Leptonic Final State, CMS AN-2008/039.
[59] S. Haywood et al., Electroweak Physics, hep-ph/0003275v1.
[60] J. M. Campbell, R. K.
Ellis, An update on vector boson pair production at hadron colliders, Phys.Rev.D60:113006, 1999.
[61] A. Pukhov et al., CompHEP - a package for evaluation of Feynman diagrams and integration over multi-particle phase space. User's manual for version 33, hep-ph/9908288.
[62] S. R. Slabospitsky, L. Sonnenschein, TopReX generator (version 3.25). Short manual, Comput. Phys. Commun. 148 (2002), hep-ph/0201292.
[63] W. Verkerke, D. Kirkby, The RooFit toolkit for data modeling, arXiv:physics/0306116v1 (http://roofit.sourceforge.net/).
[64] ROOT: an Object Oriented Data Analysis Framework, http://root.cern.ch/.
[65] S. Bityukov, Uncertainty, Systematics, Limits, http://cmsdoc.cern.ch/bityukov.
[66] U. Langenegger, P. Trüb, A. Starodumov, Standard Model Higgs Boson Search in the Decay H → τ⁺τ⁻ → ℓ⁺ℓ⁻E̸T, CMS AN-2008/053.

Acknowledgments

First, I would like to thank the whole CMS Milano-Bicocca group for the opportunity to undertake this thesis work. In particular, I would like to acknowledge the pixel group: our boss Luigi Moroni and my tutor Sandra Malvezzi, for sharing their knowledge with me, for the passion for research they communicate and for their patience towards me; Daniele Pedrini, Dario Menasce, Silvano Sala and Marco Rovere, for their competent advice and their helpfulness.

Many thanks also to all the ECAL Milano-Bicocca people, to the computing administrators Paolo Dini and Luca Carbone, to my Ph.D. fellows Roberto Salerno, Martina Malberti, Valentina Tancini, Leonardo Sala and Silvia Taroni, and to the ex-Milano people Lorenzo Uplegger and Mauro Dinardo, for their kind support.

Also, I would like to thank Wolfgang Adam, Boris Mangano, Kevin Burkett and Chiara Genta for the profitable discussions on tracking issues, and Sasha Nikitenko, Chiara Mariotti and Guillermo Gomez-Ceballos for their precious help with the Higgs analysis.

Special thanks to Stefano Magni for teaching me the art of programming and, most importantly, a smart approach to research work.