Lecture 1.1 What is PAT and How to use it? Content
● A short reminder of the CMS EDM and Analysis Workflow
● The answer to the question: What is PAT?
● An introduction to the PAT DataFormats
● Configuration of the PAT DataFormats
● An introduction to the PAT Workflow
● Support and Documentation

PAT Tutorial June 2010

Reminder of the Event Data Model
● Configurable edm::Modules communicate with/via the EventContent.
● Same file structure (i.e. ROOT) for: Gen-Sim-Digi-Reco-Analysis.
● Single framework for Reconstruction (POGs) and Analysis (PAGs).

Typical CMS Analysis Workflow
● Prompt reconstruction at Tier-0.
● Central skims at Tier-1's.
● Users run cmsRun at Tier-2's:
  ● Perform high level analysis steps.
  ● Preselect events.
  ● Write their own user-defined EventContent to private T2/T3 space.
● The latter step might be iterated.
● Copy reduced datasets to your favorite machine.
● Run your final analysis/produce plots.

PAT helps you to create a user-defined EventContent.

What is the Physics Analysis Toolkit
● PAT is a toolkit that is part of the CMSSW framework.
● It serves as a well-tested and supported common ground for group and user analyses.
● It facilitates reproducibility and comprehensibility of analyses.
● It is an interface between the sometimes complicated EDM and the simple mind of the common user.
● You can view it as a common language between CMS analysts: if another CMS analyst describes a PAT analysis to you, you can easily follow what he/she is talking about.

Three Aspects of PAT

Interface
● between RECO expertise & Analysis Level
● simplifies access via DataFormats
● channels expertise (via POG & PAG contacts)

Common Tool
● approved algorithms & sensible defaults
● synergy (everybody can profit from recent developments)
● quick start into analysis for beginners
● crossing point between POGs & PAGs ('vertical integration')

Common Format
● facilitates transfer & comparisons
● PAG common configurations
● sustained provenance

Facilitated Access to Event Information
● Do you know how to access this event information within the EDM?
  [Diagram: reco::Candidate surrounded by the information associated with it: Object Id, Cluster shapes; Isolation (different from defaults); Correction Factors, Object Resolutions; JetFlavor; Generator Match, Trigger Match; Associated Tracks, JetCharge; BTag Algorithms, TagInfos; more, ...]
● With PAT Candidates you get this just by calling member functions!
● Note: Each PAT Candidate IS a corresponding reco::RecoCandidate (and more).

The PAT Data Formats
● All pat::Objects inherit from their corresponding reco::RecoCandidates.
● A PAT Candidate is a reco::RecoCandidate PLUS more.

PAT Candidate Member Functions
Check the documentation: SWGuidePATDataFormats

Combine Flexibility and User Friendliness
● You can choose for yourself whether you really need all the extra information that the PAT Candidates provide.
● Still, you don't need to know how EDM/PAT manages this access for you under the hood.
[Diagram labels: Flexibility, User Friendliness, Maximal Configuration]
● The key is: configuration of DataFormats by cfi file! (E.g. for pat::Jets).

Configuration of PAT DataFormats
You can configure the content of the DataFormats yourself (example: pat::Jet)!

    import FWCore.ParameterSet.Config as cms

    patJets = cms.EDProducer("PATJetProducer",
        ...
        # embedding of AOD items
        embedCaloTowers = cms.bool(False),
        embedPFCandidates = cms.bool(False),
        # jet energy corrections
        addJetCorrFactors = cms.bool(True),
        jetCorrFactorsSource = cms.VInputTag("patJetCorrFactors"),
        # btag information
        addBTagInfo = cms.bool(True),
        addDiscriminators = cms.bool(True),
        discriminatorSources = cms.VInputTag( ... ),
        # clone tag infos; ATTENTION: these take lots of space!
        # usually the discriminators from the default algos
        # are sufficient
        addTagInfos = cms.bool(True),
        tagInfoSources = cms.VInputTag( ... ),
        # track association
        addAssociatedTracks = cms.bool(True),
        trackAssociationSource = cms.InputTag("ak5JetTracksAssociatorAtVertex"),
        # jet charge
        addJetCharge = cms.bool(True),
        jetChargeSource = cms.InputTag("patJetCharge"),
        # add jet ID
        addJetID = cms.bool(True),
        jetIDMap = cms.InputTag("ak5JetID"),
    )

Size: 14 kB/event (for ttbar)

The PAT Workflow
Have a look at: SWGuidePATWorkflow
● Pre-Production: steps before PAT Candidate creation
● PAT Candidate creation
● Main collection (w/o cleaning)
● Main collection (with cleaning)
This workflow is mirrored by the structure of the python directory in the PatAlgos package (don't be shy, check it out!).

EventContent of the default PAT Tuple
● Have a look at patEventContent_cff.py.
● Have a look at patTemplate_cfg.py.
● Size: 20 kB/event (for ttbar)
But decide for yourself what your PAT Tuple should look like (add reco::Tracks or reco::GenParticles to the EventContent, or BTag information to the jets, etc.).

The concept of Maximal Configuration
● Configure your own DataFormats via embedding (see Lecture 2.2/Exercise 06).
● Configure your workflow via tools that PAT provides (see Lecture 2.1/Exercise 05).
● Add any extra info you need to the EventContent.
● Apply selections via the StringCutParser.

The Code Location

DataFormats/PatCandidates
● Definition of all PAT Candidates.
● pat::Photon, pat::Electron, pat::Muon, pat::Tau, pat::Jet, pat::MET, ...

PhysicsTools/PatAlgos
● Implementation and filling of all data formats.
● Definition of common workflow and PAT tools.

PhysicsTools/PatUtils
● Definition of common tools and helper functions used in PatAlgos.

PhysicsTools/PatExamples
● Location of many examples, e.g. all non-trivial examples used during this Tutorial.

Development
PAT is part of any CMSSW release. We recommend using it from the release!
Have a look at: SWGuidePATRecipes

Development (cont'd)
If you already want to use features/fixes that will go into the next release, follow the PAT release notes in the corresponding development branch.

Support
Check the main entry page of PAT in the software guide: SWGuidePAT
A short extract of possible support:
● Lecturers & Tutors
● Hypernews
● Community
● POG/PAG contacts
● Developers
● The quite developed PAT Documentation!

Documentation
● SWGuidePAT/WorkBookPAT: Main documentation pages.
● WorkBookPATDataFormats: Description of all PAT Candidates.
● WorkBookPATWorkflow: Description of the PAT workflow.
● WorkBookPATConfiguration: Description of the configuration of PAT.
● SWGuidePATTools: Description of all PAT tools.
● WorkBookPATTutorial: Tutorials and examples to get started.
● SWGuidePATRecipes: Installation recipes.
● SWGuidePATEventSize: Tools for event size estimate.
And last but not least: This Tutorial and/or former Tutorials...

Exercises
By now you should be prepared to do the following Exercises on WorkBookPATTutorial:
● Exercise 1 (WorkBookPATDocNavigationExercise)
  The PAT Documentation is one of the best-maintained parts of the WorkBook. Knowing your documentation and how to use it can speed up your learning curve enormously. Learn more about the PAT Documentation and how to make effective use of it.
● Exercise 2 (WorkBookTupleCreationExercise)
  Learn how the default PAT tuple is produced, so that you are prepared to produce your own PAT tuples.
● Exercise 3 (WorkBookTupleCrapExercise)
  This is part of the crab tutorial. Once you are doing large-scale analyses you will need crab.
Have Fun!
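
As a small aside on the "Configuration of PAT DataFormats" slide: the idea of starting from sensible defaults and switching off what you do not need can be sketched in plain Python, without CMSSW. This is only an illustration under stated assumptions. The dictionary copies the boolean flags and their default values from the slide, while the names DEFAULT_PAT_JET_FLAGS, customise_flags and slim_jet_flags are hypothetical and not part of PAT. In a real configuration the same idea would typically be expressed by overriding the corresponding parameters of the patJets module.

```python
# Plain-Python stand-in for the boolean patJets parameters shown on the slide.
# No CMSSW import is used; the dict only mimics the cfi defaults.
DEFAULT_PAT_JET_FLAGS = {
    "embedCaloTowers": False,
    "embedPFCandidates": False,
    "addJetCorrFactors": True,
    "addBTagInfo": True,
    "addDiscriminators": True,
    "addTagInfos": True,
    "addAssociatedTracks": True,
    "addJetCharge": True,
    "addJetID": True,
}


def customise_flags(defaults, **overrides):
    """Return a copy of `defaults` with `overrides` applied.

    Hypothetical helper (not a PAT tool): unknown names and non-boolean
    values are rejected, so a typo cannot silently add a new parameter.
    """
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError("unknown parameter(s): " + ", ".join(unknown))
    for name, value in overrides.items():
        if not isinstance(value, bool):
            raise TypeError(name + " must be True or False")
    customised = dict(defaults)  # leave the defaults untouched
    customised.update(overrides)
    return customised


# Drop the space-hungry tag infos and the jet charge, keep everything else.
slim_jet_flags = customise_flags(
    DEFAULT_PAT_JET_FLAGS, addTagInfos=False, addJetCharge=False
)

# List what would still be stored with each jet in this toy model.
for name in sorted(slim_jet_flags):
    if slim_jet_flags[name]:
        print(name)
```

The defaults are copied rather than modified in place, so several differently slimmed variants can be derived from the same starting point and compared.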