DB2-Optimierung und SQL-Performance

Transcription

Kapitel 3: SQL – Die „Structured Query Language“ bei DB2
Standards, Tipps und Grundlagen
zum Umgang mit DB2/SQL
und anderen SQL -Dialekten
DB2-Optimierung
und
SQL-Performance
Ausgabe 6: 2006 (inkl. DB2V7/V8)
© S.K. Consulting Services
++49 8106 994390
www.sk-consulting.de
Seite 1 von 127
Inhaltsverzeichnis
INHALTSVERZEICHNIS
1
Vorwort
8
2
Grundsätzliches zu DB2 und Performance
10
2.1
Optimierungspotentiale bei DB2
10
2.2
Vorgehensweise beim Tuning
11
2.2.1 Anhaltspunkte und Analysedaten für Tuning
2.2.2 Accounting und “statistics records”
11
11
2.3
Grundvoraussetzungen für DB2/UDB Performance
12
2.4
Voraussetzungen für SQL-Performance bei DB2
13
2.5
Möglichkeiten und Maßnahmen zur SQL-Optimierung
15
2.5.1 Systemtechnische Aktivitäten
2.5.2 Anwendungsbezogene Maßnahmen
2.5.3 Die Tuningpotentiale des DB2-Systems
16
16
17
3
SQL - Die „Structured Query Language“ bei DB2
18
3.1
Relationale Sprachelemente und Operationen bei SQL
20
3.1.1
3.1.2
3.1.3
3.1.4
20
21
22
23
3.2
Die relationale Funktion "SELEKTION"
Die relationale Funktion "PROJEKTION"
Die relationale Funktion "JOIN"
Relationale Mengenoperationen-Zusammenfassung
Generelle Überlegungen und Voraussetzungen für SQL Performance 24
3.2.1
3.2.2
3.2.3
3.2.4
3.2.5
3.2.6
3.2.7
3.2.8
3.2.9
3.2.10
3.2.11
3.2.12
3.2.13
3.2.14
3.2.15
3.2.16
3.2.17
3.2.18
3.2.19
3.2.20
3.2.21
„Dynamic SQL“
Datenbankobjekte und ihre Struktur
Tabellen und Tablespaces
Indexe
Primary und Clustering Indexes
“Index Only”-Zugriffe auf VARCHAR Spalten
Verzögerte Objektdefinitionen
Aufwand und Kosten von Indexes
Empfehlungen zu Sortierungen
SQL-Abfragen mit Subqueries
Komplexität von Queries
Spalten-Funktionen
Formulieren von Prädikaten
Die Verwendung von „scalar functions“
Neuordnen der Tabellenfolge in der FROM Klausel
„list prefetch“
Uncommitted read“
“row level locks”
Freigabe von Locks
“lock escalation”
Materialized query tables(MQT's)und AQR
3.2.21.1 Empfehlungen für das Design von "materialized query tables"
3.2.21.2 Empfehlungen für das Design von zugehörigen "base tables"
3.2.22 Der DB2-Katalog
++49 8106 994390
24
24
25
26
26
26
26
26
28
29
29
30
30
31
31
31
31
32
32
32
32
33
34
34
Seite 2 von 127
Inhaltsverzeichnis
4
SQL-Tuning und Performance bei DB2
35
4.1
Die neuen Limits bei DB2 Version 8
36
4.2
Die DB2 SQL Engine
37
4.3
SQL-Tuning und die Logik von Abfragen
39
4.3.1
4.3.2
4.3.3
4.3.4
4.3.5
4.3.6
39
40
42
42
43
44
4.4
“constant propagation”
Eliminieren von „totem Code“
Zusammenfassen von Konstanten („constant folding“)
“case-insensitive” Suchen
„Sargability“
"Join transitive closure"
DB2 SQL und Performance
45
4.4.1 Grundsätzliche Empfehlungen zu DB2-SQL
45
4.4.1.1 Suche die kleinste „row“-Menge
46
4.4.1.2 Lies nur die Spalten, die wirklich benötigt werden
46
4.4.1.3 Reduziere die Anzahl der SQL-Statements
49
4.4.1.4 Kodiere Prädikate, die möglichst selektiv sind
53
4.4.1.5 Beachte die Qualität von DB2-SQL Abfragen
54
4.4.1.6 Nutze „stage1“-Prädikate
54
4.4.1.7 Verwende nie generische SQL-Statements
55
4.4.1.8 Vermeide unnötige SORT-Abläufe
55
4.4.1.9 Sortiere nur die erforderlichen Spalten
56
4.4.1.10 Benutze die ON-Klausel für alle JOIN-Prädikate
58
4.4.1.11 Vermeide UNIONs
58
4.4.1.12 Nutze JOINs anstatt „subqueries“
61
4.4.1.13 Komplemetärmengen bei „outer joins“
62
4.4.1.14 Kodiere die selektivsten Prädikate zuerst
63
4.4.1.15 Nutze erprobte Methoden zur Existenzprüfung(EXISTS)
64
4.4.1.16 Subqueries sind zu „tunen“
66
4.4.1.17 Vermeide alles, was nicht unbedingt notwendig ist
67
4.4.2 Modifikation von SQL-Statements
68
4.4.2.1 Häufigste Modifikationen durch das DB2
68
4.4.2.2 Modifikationen über zusätzliche Prädikate („transitive closure“) 68
4.4.2.3 Vereinfachung der JOIN-Verarbeitung
69
4.4.2.4 Beeinflussung der Reihenfolgen bei OUTER-JOINs
69
4.4.2.5 Subquery-Transformation in JOINs
69
4.4.2.6 Auswahl der „outer table“ bei JOINs
70
4.4.2.7 Ausschalten von Indizes
71
4.4.2.8 Beeinflussen der IX-Nutzung
72
4.4.2.9 Beeinflussen von „outer table“ Auswahl und JOIN-Methode
72
4.4.3 Restrukturierung von UNION- durch CASE-Ausdrücke
75
4.4.4 Spezielle Techniken
77
4.4.4.1 CASE in Prädikaten
77
4.4.4.2 CASE in UPDATE – Anweisungen
79
4.4.4.3 CASE zum Vermeiden von Rechen- oder anderen Fehlern
81
4.4.4.4 CASE zum Eliminieren von UNION-Klauseln
82
4.4.4.5 Funktionsequivalente Ausdrücke zur CASE-Klausel
83
4.4.4.6 GROUP BY für „single pass“
84
4.4.4.7 GROUP BY für beide Seiten
85
4.4.5 ORDER BY und SORT-Vorgänge bei DB2
87
4.4.5.1 ORDER BY und Vermeiden von Sorts (seit V7)
87
4.4.5.2 Nutzung von „Local Storage“
89
4.4.5.3 Buffer Pool Storage Nutzung
89
++49 8106 994390
Seite 3 von 127
4.5
4.6
4.7
Inhaltsverzeichnis
4.4.5.4 DASD Nutzung
4.4.5.5 Nutzung von Work Files
4.4.5.6 Berechnen der SORT Pool Größe
4.4.5.7 "Secondary Extents"
4.4.5.8 Weitere Überlegungen: „Sort Assist“
4.4.5.9 Weitere Überlegungen: „Destructive Reads“
4.4.5.10 Weitere Überlegungen:” Locks auf Work Files“
4.4.5.11 Wann wird ein Sort erforderlich?
4.4.5.12 Größe und Anzahl von “Sort Work Files“
4.4.5.13 Isolieren der DSNDB07 im eigenen Bufferpool
4.4.5.14 Benutzen von “Temporary Tables”
4.4.5.15 Weitere Nutzer der Workfile Database
4.4.5.16 Index-Unterstützung bei SORTs
4.4.5.17 Eliminieren nicht erforderlicher Spalten
4.4.5.18 “Sort Bufferpool Thresholds”
4.4.5.19 Nutzung des Hiperpools für Sortvorgänge
4.4.5.20 Einschalten eines DASD Cache
4.4.5.21 Definition eines 32 KB Bufferpools
4.4.5.22 Überlegungen zum „Data Sharing“
4.4.6 Einsatz des „cartesian join“ („star-join“)
4.4.6.1 Das „star join“ Schema
4.4.6.2 Wann wird ein “star join” Schema genutzt ?
4.4.6.3 Beispiele: Query mit drei “dimension tables“
4.4.6.4 Empfehlungen zur Erstellung von IX für "star join queries"
4.4.6.5 Bestimmen der Spaltenreihenfolge in einem Index für ein "star
schema"
89
90
91
91
92
92
92
93
93
94
94
94
94
95
95
95
96
96
97
98
98
99
100
102
Mehr Performance für SQL-Operationen in Programmen ab V8
106
4.5.1
4.5.2
4.5.3
4.5.4
4.5.5
„scalar full select“
„multiple DISTINCT‘s“
„multi row FETCH“ und „multi row INSERT“
„common table expressions“
Rekursives SQL
106
107
108
110
111
Mehr Performance für Modifikationsoperationen ab V8
113
4.6.1 „multi row INSERT“
4.6.2 INSERT innerhalb eines SELECT
4.6.3 „multi row FETCH“ und „positioned UPDATE/DELETE"
113
115
116
Do’s und Don’ts bei SQL in Kürze
117
4.7.1 Grundsätzliche Empfehlungen zu SQL
4.7.2 Tipps und Hinweise
4.7.2.1 In Programmen „verbotene“ SQL- Anweisungen
4.7.2.2 SQL - Anweisungen, die keine Indexbenutzung zulassen
4.7.2.3 SQL-Anweisungen, die eine Indexnutzung zulassen,
4.7.2.4 SQL - Anweisungen, die ungünstig formuliert sind
4.7.2.5 SQL – Anweisungen mit ungünstigen JOIN-Formulierungen
4.7.2.6 Vermeiden arithmetischer Ausdrücke in einem Prädikat
4.7.3 DB2 SQL – Nutzungsrichtlinien (Zusammenfassung)
4.7.3.1 Allgemeines
4.7.3.2 ORDER BY und GROUP BY
4.7.3.3 JOIN-Tuning
4.7.3.4 Subquery Tuning
117
118
118
118
119
119
120
121
122
122
123
123
123
++49 8106 994390
103
Seite 4 von 127
Inhaltsverzeichnis
5
Die relationalen Komponenten in DB2
124
5.1
Das RDS - Relational Data Systems
124
5.2
DM - Data Manager
124
5.3
BM - Buffer Manager
125
5.4
VSAM (außerhalb von DB2)
125
6
Die Verarbeitung von Prädikaten
126
6.1
Indexable Predicates
128
6.2
Wie wird der Index durchsucht ?
130
6.2.1 "Sargeable und indexable Predicates"(Übersicht)
6.2.2 Sargeable & indexable Predicates / Beispiele
6.2.3 Prädikate auf Sargeable & indexable umformulieren (Beispiele)
131
132
134
DB2 "Access Path"-Auswahl
135
6.3
6.3.1
6.3.2
6.3.3
6.3.4
6.3.5
6.3.6
6.3.7
6.3.8
6.3.9
6.3.10
6.3.11
6.3.12
Access-Path-Selection bei Tablespace- / Table-Scan
136
Access-Path-Selection bei "non-matching" Index
137
Access-Path-Selection bei "matching" Index
138
Access-Path-Selection bei "One-Fetch" Index-Scan
139
Access-Path-Selection bei "Index-only" Zugriff
139
DB2 "Access Path"-Selection (Zusammenfassung)
139
DB2 "Access Path"-Auswahl : "List Prefetch"
140
DB2 "Access Path": "Multiple Index Access" / UNION (OR)
141
DB2 "Access Path": "Multiple Index Access" / INTERSECT (AND) 142
DB2 "Access Path"-Auswahl : "Multiple Index Access" / AND + OR 143
DB2 - EXPLAIN für MI-Zugriffe
144
DB2 "Access Path"-Auswahl : JOINs / "nested loop"
145
6.3.12.1 Ablauf des "nested loop"-JOIN
146
6.3.12.2 Performancetipp
146
6.3.13 DB2 "Access Path"-Auswahl : JOINs / "merge scan"
147
6.3.13.1 Vorgehensweise
148
6.3.13.2 Performancetipp
148
6.4
Spezielle Techniken zum Beeinflussen der Pfadauswahl bei DB2
149
6.4.1 Die Informationen über Zugriffspfade
150
6.4.2 Minimieren des "overhead" durch Anfordern von wenigen "rows" 151
6.4.2.1 Was beinhaltet OPTIMIZE FOR n ROWS ?
151
6.4.2.2 OPTIMIZE FOR 1 ROW zum Vermeiden von SORTs?
151
6.4.2.3 Wie wird OPTIMIZE FOR n ROWS in einer CLI Applikation
genutzt?
151
6.4.2.4 Wieviele “rows” können mit OPTIMIZE FOR n ROWS gelesen
werden?
151
6.4.2.5 Wann ist OPTIMIZE FOR n ROWS effizient?
152
6.4.2.6 Anfordern einer begrenzten Zahl von "rows"
153
6.4.2.7 Abhängigkeit OPTIMIZE FOR n ROWS und FETCH FIRST n
ROWS ONLY
154
6.4.3 Nutzung der "cardinality" Klausel
155
6.4.4 Reduzieren der Anzahl "matching columns"
156
6.4.5 Neuanordnen der Tabellenreihenfolge
158
6.4.6 Update der Katalogstatistiken
158
++49 8106 994390
Seite 5 von 127
Inhaltsverzeichnis
6.4.6.1 Modifizieren des Katalogs (Anpassen "correlated columns")
6.4.6.2 Update des Katalogs wegen “join” mit “table functions”
6.4.7
Nutzung von Subsystem Parametern
6.4.7.1 Favorisieren von "matching index" Zugriffen
6.4.7.2 Optimieren von Queries mit IN-List Prädikaten
6.4.7.3 Favorisieren eines indizierten Zugriffs
6.4.7.4 Beispiel einer "Column correlation"
6.4.7.5 Feststellen von "column correlations"
6.4.7.6 Einflüsse von „column correlations“
6.4.7.7 Was tun bei “column correlations”?
159
159
160
160
160
161
161
162
163
165
7
Analyse der Zugriffspfade und DB2-Optimizer Informationen
166
7.1
Faktoren der Entscheidung für den DB2-Optimizer
167
7.2 Der Optimierungsvorgang und EXPLAIN
7.2.1
7.2.2
7.2.3
7.2.4
7.2.5
7.3
168
Prädikate und Prädikatkategorien
Die Filter
Filterfaktoren(FF)
PLAN_TABLE und EXPLAIN
Voraussetzungen für effizientes EXPLAIN
168
169
170
173
174
Informationen, die nicht in der PLAN_TABLE stehen
175
7.3.1
7.3.2
7.4
7.4.1
7.4.2
175
176
178
178
179
Die DSN_STATEMNT_TABLE
Einflüsse auf die Kostenkategorien
Verbesserungen für "stage 2"-Prädikate
"view" – Materialisierung
"nested table expression" – Materialisierung
8
Behandlung langlaufender Queries/Statements
180
8.1
Explain und PLAN_TABLE prüfen
180
8.2
DSN_STATEMNT_TABLE einbeziehen
180
8.3
SQL-Statement überprüfen
180
8.4
Prüfen der Struktur des Datenmodells
181
8.5
RUNSTATS-Statistik-Spalten des Katalogs überprüfen
181
8.6
Monitor einsetzen und Ergebnisse überprüfen (z.B. DB2PM)
181
8.7
SQL TRACE REPORT überprüfen (z.B. DB2PM)
182
9
Messwerte und Daten für SQL-Performance
183
9.1
Bufferpool-Hit Ratio und Maximal Unreferenced Pool Age (MUPA)
183
9.2
DB2-Tools und die Ermittlung von Performance-Werten
185
9.3
Statistikdaten des Katalogs
187
9.4
Statistikdaten und ihre Auswirkung auf das DB2 Optimizing
188
9.5
Manipulation von Statistikwerten
197
9.6
DB2-Accounting-Zeiten
200
9.7
SQL Query-Typen: "I/O Bound" und "CPU Bound"
202
++49 8106 994390
Seite 6 von 127
10
Inhaltsverzeichnis
SQL-Query und Tuning Empfehlungen (Zusammenfassung)
203
10.1 Richtlinien zur Leistungsoptimierung
203
10.2 Kodieren Sie SQL nur für die erforderlichen Ergebnisse
205
10.3 Wird die erwartete Performance nicht erreicht
205
10.4 Prädikate werden in „stages“ verarbeitet
206
10.5 Weitere Empfehlungen zu SQL-Queries
207
10.6 ORDER BY / GROUP BY
207
10.7 JOIN Tuning – Allgemein
207
10.8 Subquery Tuning
208
11
209
Anhang
11.1 „builtin-functions“
210
11.2 Abbildungsverzeichnis
220
11.3 Index
223
11.4 Glossar
227
11.5 Literaturhinweise
264
++49 8106 994390
Seite 7 von 127
1
Vorwort
„Information steht heute und auch in Zukunft im Mittelpunkt wirtschaftlichen
Handelns. Information wurde zur treibenden Kraft der Informationsgesellschaft ... „
Das Zitat von John Naisbitt über die Ressource “Information” in seinem Bestseller
„Megatrends“ von 1988(!) sagt in Kürze immer noch alles über die Bedeutung der Information in unserer Gesellschaft aus.
Information ist ein denkbar abstrakter Stoff, der leichter, effizienter und produktiver
verwendet werden kann, wenn er geordnet und seinem sinnvollen Zusammenhang
gemäß dargestellt und angeboten wird. Datenbankmanagementsysteme (DBMS)
sind die Werkzeuge, mit denen Informationen strukturiert, verwaltet und
bedarfsgerecht aufbereitet, wieder geliefert werden können sollen. Über sie werden
moderne Informationssysteme erst möglich.
DB2/UDB von IBM ist eines dieser Datenbanksysteme, das in einer modernen ITUmgebung in der Lage ist, Informationsarchitekturen und -systeme über und für die
gesamte Unternehmenshierarchie umfassend möglich zu machen.
Informationsverarbeitung ist dann effizient, wenn die richtigen Informationen zum
richtigen Zeitpunkt am richtigen Ort sind. Dazu bedarf es einer sorgfältigen
Planung, einer technisch perfekten Implementierung und einer ständigen Kontrolle
und Abstimmung. Die Datenbank als Informationsspeicher muss in der Lage sein, die
gestellten Anforderungen sicher, konsistent und schnell zu erfüllen: Manche
Informationen sind eben nur dann wertvoll, wenn sie hochaktuell sind. Und - jeder
Nutzer spezifischer Informationen kann seine eigenen individuellen und subjektiven
Ansprüche an diese Ressource “Information” stellen. Dies erfordert seitens der
Technik hochperformante und flexible, aber auch stabile und sichere Systeme.
DB2/UDB bietet Möglichkeiten, so eingestellt zu werden, dass alle erforderlichen
Aktivitäten und Anwendungen auf effizienteste Art und Weise bedient werden
könnten. Dazu müssen alle (System-)Parameter optimal gewählt und die
Datenstrukturen nach sorgfältiger Analyse in die physische DB2-Umgebung
implementiert werden. Dies gilt umso mehr, als mit der Ausweitung der
Informationstechnik die Komplexität der Information selbst und die Quantität
angebotener Datenmengen ständig zunimmt, andererseits die Informationsqualität
weiter verbessert und die verfügbaren Informationen immer effektiver und genauer
dargeboten werden sollen.
Insbesondere aber gilt es, Applikationen architektonisch so abzubilden und
Programme so zu schreiben, dass die Performanceziele erreicht werden und ein
Minimum an Kosten entstehen.
Denn: Ein Datenbanksystem selbst bringt den Unternehmen noch keinen oder
nur geringen Nutzen. Dieser entsteht erst aus der intensiven Nutzung der
verfügbaren Information und der daraus resultierenden betriebswirtschaftlichen
Wertschöpfung: Je mehr Nutzung, desto mehr Nutzen und umso besser für das
Unternehmen. Die Erkenntnis, dass der Unternehmenserfolg, wie bei den bekannten
„klassischen“ Produktionsfaktoren - Finanzen, Material, Anlagen und Personal unmittelbar von einer erschöpfenden und werteffizienten Verwertung dieser „fünften
Kraft“ – Information - abhängt, führte zur Suche nach neuen Konzepten in einem
neuen betriebswirtschaftlichen Umfeld - der Informationswirtschaft. Im Zentrum
dieser wirtschaftlichen Aspekte steht die Informationstechnologie - ihre
Möglichkeiten, ihre Produkte. Die Erwartungen an die Leistungsfähigkeit eines DBMS
sind folglich enorm hoch.
++49 8106 994390
Seite 8 von 127
Kapitel 2: Grundsätzliches zu DB2 und Performance
In dieser Handbuchserie werden unter DB2 Performance-Gesichtspunkten alle
wichtigen Fragen zu und die Möglichkeiten in Hinsicht auf das Produkt DB2
thematisiert. Die Serie besteht aus folgenden Büchern:
01_Die Umgebung von DB2 – Eine Architekturübersicht
02_DB2 und das Relationenmodell von Dr. Codd
03_DB2-Optimierung und SQL-Performance
04_Physisches DB-Design und DB2-Performance
05_DB2 und effiziente Anwendungsentwicklung
06_Administration von DB2 Umgebungen
07_Tunig-Beispiele zu DB2: Erfahrungen aus der Praxis
08_DB2 im Client-Server Umfeld
09_Tools und hilfreiche Produkte zu DB2
Die gesamte Handbuch-Serie stellt sich nicht in Form von „Manuals“ im Sinne von
Systemdokumentation dar – diese werden vom Hersteller sowieso angeboten.
Vielmehr ist beabsichtigt, DB2 unter Nutzbarkeits- und Performance-Gesichtspunkten
möglichst umfassend zu beleuchten.
Die Serie ist für Kenner, nicht in erster Linie für Neulinge im Umgang mit DB2,
konzipiert.
Dieses Handbuch beschäftigt sich mit dem Thema: „DB2 und SQL-Performance“.
Es soll als Leitfaden dienen, das SQL Statements ursprünglich, richtig und effizient
zu entwickeln, zu testen und zukünftig optimal schreiben und einstellen zu können immer mit dem Ziel, höchstmöglicher Performance in allen direkt betroffenen und
umliegenden Betrachtungsfeldern.
Viel Spaß beim Lesen und viel Erfolg bei der Nutzung von IBM’s DB2/UDB.
Mit freundlichen Grüßen
S.K. Consulting Services GmbH
Sepp Kraus
Für die Mitarbeit an diesem Handbuch bedanken wir uns insbesondere bei den
Firmen
ARAL AG, Bochum
AXA Versicherungen, Köln
BMW AG, München
Quelle AG, Fürth
Itellium GmbH & Co, Fürth
IT-Verlag, Sauerlach b. München
++49 8106 994390
Seite 9 von 127
2
2.1
Grundsätzliches zu DB2 und Performance
Optimierungspotentiale bei DB2
Die Optimierungspotentiale bei relationalen Datenbanksystemen unterscheiden sich
generell – auch zwischen DB2 und Oracle, SQL Server und SYBASE - nur minimal.
Sicher ist, dass die höchsten Potentiale, um diese relationalen Datenbanksysteme
schneller zu machen im Bereich der Abfragesprache SQL und damit im Umfeld der
Anwendungsentwicklung und der Programme zu suchen ist (siehe Grafik unten).
Eine weitere Fehlerquelle ist das physische DB-Design, gefolgt von der Einstellung
der Systemparameter im DB2 selbst und im Betriebssystem (OS/390, z/OS, AIX,
UNIX usw.)
Empfehlenswert ist es natürlich im Tuningfall dort zuerst zu suchen, wo das größte
Potential zum Lösen der Tuningaufgaben existiert. Man darf dann nur die anderen
Bereiche nicht vergessen.
In diesem Handbuch werden vorrangig die Problematiken der Sprache SQL und
der damit zusammenhängenden Performance“ behandelt. Die Problematik des
physischen Designs in DB2 findet man im Band "Physisches DB Design und DB2
Performance" aus dieser Reihe „Tuning und Performance für DB2-Umgebungen“.
2 = DB2
System
(10%)
3 = phys.
DBDB-Design
(20%)
4 = Anwendung
(60%)
1
2
3
4
1 = OS
System
(10%)
Bild-01: Tuningpotentiale bei DB2
++49 8106 994390
Seite 10 von 127
2.2
Vorgehensweise beim Tuning
Grundsätzlich ist Tuning ein iterativer Vorgang:
Schritt_1:
Analyse der
Details
Schritt_2:
Erarbeiten einer
Lösung
Schritt_3:
Test der Lösung
Schritt_4:
Vergleich der
Ergebnisse
Wiederholung des Prozesses bis zum besten Resultat
Bild-02: Vorgehensweise beim Tuning
In allen Schritten ist jede mögliche Maßnahme zum Erreichen des Tuning-Ziels
erlaubt.
2.2.1
Anhaltspunkte und Analysedaten für Tuning
Anhaltspunkte für Tuning bieten bei DB2 für OS/390 oder z/OS folgende Messdaten
2.2.2
1.
Elapsed Time Analysis und Tuning
2.
CPU Time Tuning: Aufwände für Select, Insert, Update, Delete, Dynamic
Bind, DB2 Traces, Distributed/ Stored Procedure, DB2 Data Compression
3.
Buffer Pool, Locking, EDM Pool, Work File, LOB, DBM1 Virtual
Storage
Accounting und “statistics records”
Alle “accounting” und Statistikdaten sind bei DB2 relativ einfach und kostengünstig zu
erhalten. Sie sind vor allem nützlich für ein kontinuierliches Monitoring der
Performance und das daraus erforderliche Tuning.
Für eine erste Analyse genügen meistens:
Der „Accounting report” (nicht “trace”) pro “connection type” oder
Plan und
Der „Accounting Report” (nicht “trace”) für dieselbe Zeitspanne
Diese Fakten sollten die ersten sein, die betrachtet werden, wenn ein DB2
Performance Problem auftaucht.
Beispiel:
DB2PM Command Eingabe zum Erhalt der entsprechenden passenden Daten:
DB2PM
STATISTICS REPORT LAYOUT (LONG), und
DB2PM
ACCOUNTING REPORT LAYOUT (LONG)
ORDER (CONNTYPE) EXCLUDE (PACKAGE(*))
zur Gruppierung über “thread connection type”, wie
TSO, CICS, DB2CALL, IMS, APPL- DIR, SYST- DIR, usw., oder
DB2PM
ACCOUNTING REPORT LAYOUT (LONG)
ORDER (PLANNAME) und
INCLUDE
(DB2ID (xxxx)) FROM (03/ 11/ 00,10: 00: 00.00) TO (.......)
Weitere Anhaltspunkte zur Analyse der DB2-Faktoren unter Pkt 9.6 ff.
++49 8106 994390
Seite 11 von 127
2.3
Grundvoraussetzungen für DB2/UDB Performance
1.
Stellen Sie sicher, dass genügend Plattenplatz vorhanden ist. (6-10
Laufwerke pro CPU ist für den Anfang genug).
Jeder "table space's container" sollte alle verfügbaren Platten
erreichen können. Einige "table spaces", wie zum Beispiel
SYSCATSPACE und alle mit einer geringen Anzahl von Tabellen sollten
nicht über alle möglichen "Disks" gestreut werden, wogegen die TS mit
einer großen Userzahl oder auch "temporary tables" möglichst über den
gesamten "diskpool" gestreut sein sollten.
2.
Bufferpools sollten einen Nutzungsgrad des verfügbaren Speichers von
ca. 75% (bei OLTP Anwendungen) oder 50% (bei OLAP Anwendungen)
ausweisen.
3.
RUNSTATS sollte auf allen Tabellen, inklusive der Systemtabellen
(Katalog) durchgeführt sein. Gegebenenfalls sollte man den "Design
Advisor" nutzen, um eine Empfehlung und ein "review" für die Indizes
bezüglich ihrer SQL" workloads" zu erhalten.
4.
Man kann auch den "Configuration Advisor" nutzen, um den "Database
Manager" und die Datenbank für die entsprechenden Applikationen zu
konfigurieren.
5.
Logging sollte auf separaten "high-speed Disks" erfolgen. Diese
sollten über den DB2 Konfigurationsparameter NEWLOGPATH
identifiziert werden.
6.
"Concurrency" kann durch häufige "commits" verbessert werden (SQL
Statement Tuning).
Der Parameter SORTHEAP sollte höher eingestellt werden, um so "sort
overflows" zu vermeiden (DBM und DB Konfiguration).
7.
Der Tablespace Typ für den "System catalog table space" sollte SMS
sein und "temporary table spaces" und "DMS raw (device)" oder "File"
sollte für den Rest definiert sein.
Empfohlen wird die Ausführung der Prozedur "db2empfa", um die "multipage file allocation" für die SMS Tablespaces einzuschalten. Dies erlaubt
den SMS Tablespaces in einem Zug per "extent" zu wachsen (anstatt
"page"-weise). Das kann massive INSERT-Operationen und
Sortvorgänge, die die Platten nutzen, erheblich beschleunigen.
8.
Man nutze "parameter markers" für sich wiederholende SQLStatements (SQL Statement Tuning).
++49 8106 994390
Seite 12 von 127
2.4
Voraussetzungen für SQL-Performance bei DB2
Es ist bekannt, dass sich Tuning- und Performance-Maßnahmen auch bei
relationalen Systemen bis auf die Applikationsentwicklung auswirken. Es gilt auch
hier, dass die ineffiziente Nutzung von Systemressourcen durch
Anwendungsprogramme über systemtechnische Einstellungen nicht korrigiert
werden kann.
Entwickler müssen deshalb:
Verständnis für die Interna der DB2-Umgebung besitzen
ein tiefes Wissen über DB2-Tuning-Ansätze und OptimizerVerhalten haben
Das Fundament für gute Performance kann nur über
entsprechende Maßnahmen beim System-Design in Daten- und
Funktionsentwurf erreicht werden
Weitere den Leistungsdurchsatz beeinflussende Faktoren sind:
1.
Bestimmte Benutzergruppen
Die grob einzuteilenden Benutzergruppen, die diese Frage aus der Sicht
des Anwenderverhaltens problematisch werden lassen, sind
End-User mit allen Erwartungen/Anforderungen
•
•
•
•
in allen denkbaren und nicht planbaren Datenkonstellationen
mit komfortablen Oberflächen
mit guten, unverzüglichen Antwortzeiten
mit permanenter Verfügbarkeit
Anwendungsentwickler mit ihrer Qualifikation
•
•
•
•
mit dem Wissen über Vorgehensmethoden und -techniken
mit dem Verständnis komplexer Zusammenhänge
mit Verständnis für interne, systemtechnische Zusammenhänge und Konsequenzen
unter dem Aspekt des Einsatzes und der Handhabung
von Tools
Administratoren mit ihrem Qualitätsanspruch
•
•
•
Planung und Kontrolle optimierter Ressource-Nutzung
Sicherstellen aller möglichen und notwendigen „security“Aspekte
Nutzung effizienter und sicherer Administrationswerkzeuge
++49 8106 994390
Seite 13 von 127
2.
Methodeneinsatz
In den aus der Praxis entlehnten Erfahrungen mit Performanceproblemen
weisen die meisten auf unsystematisches Vorgehen in der
Anwendungsentwicklung hin (siehe auch Grafik im Kapitel „DB2
Anwendungsentwicklung“ / Pkt. I: Übersicht). Sinnvollerweise sollte beim
Vorgehen in der AE (= Anwendungsentwicklung ) auf folgende Faktoren
besonderes Augenmerk gelegt werden:
Einsatz einer fundierten Vorgehens- und
Systementwicklungsmethodik und deren Kontrolle
Festlegung objektivierbarer und sinnvoller PerformanceZielsetzungen
Permanente Berücksichtigung aktueller PerformanceErkenntnisse
3.
Technologie-Einsatz
Hoher Komfort verlangt nach hohem Ressourceneinsatz. Dennoch sollen
die Ressourcen angemessen sein. „Übergroße Schuhe hindern einen am
Laufen ebenso wie zu kleine....“
Dabei ist es entscheidend, dass auf keiner der unterschiedlichen
Ressourcen- und Technologieebenen Engpässe auftreten:
angemessene Hardware
abgestimmtes Betriebssystem und systemnahe Software
moderater Einsatz von Standard-Software-Systemen
Unterstützung von Individualanwendungen
Nutzung von Performance-Tools
++49 8106 994390
Seite 14 von 127
2.5
Möglichkeiten und Maßnahmen zur SQL-Optimierung
Wie in jedem Datenbanksystem sind auch bei DB2 die Tuningmöglichkeiten auf
einige, aber komplexe und sinnvolle Maßnahmen beschränkt. Die
Tuningmaßnahmen in den einzelnen Bereichen gehorchen jedoch den allgemeinen
Gesetzen des Systemtunings und diese sind:
1.
Performance entsteht nicht von selbst!
2.
Performance ist niemals statisch!
3.
Performance ist zu definieren und damit PLANBAR
4.
"benchmarks" sind meist SUBJEKTIV und
damit im Einzelfall NICHT aussagefähig !!!!!
5.
Performance wird erreicht durch das Zusammenwirken mehrerer
(annähernd) GLEICHWERTIGER Faktoren:
a)
realistisches, "sauberes" Informationsmodell
b)
optimale Umsetzung in die physische Umgebung
c)
systematische Anwendungsentwicklung
d)
effiziente Anwendungsprogramme
e)
optimale Einstellung der DBMS-Parameter
f)
entsprechende Änderung der OS-Parameter
g)
ständige Überwachung der Produktionsumgebung
- Datenadministration
- Datenbankadministration
- "Monitoring"
h)
entsprechender Hardware-Einsatz
Die Tuning-Möglichkeiten bei DB2 lassen sich dann grob unterteilen in:
Systemtechnische Aktivitäten
Anwendungsbezogene Maßnahmen
Die Tuningpotentiale des DB2-Systems
++49 8106 994390
Seite 15 von 127
2.5.1
Systemtechnische Aktivitäten
Zu den systemtechnischen Maßnahmen, die in den direkten Zuständigkeitsbereich
der Datenbankadministratoren (DBAs) beispielsweise für DB2/MVS fallen, gehören:
•
Optimierung der Generierungsparameter für MVS, CICS, IMS-DB und
TSO.
•
Autorisierungskonzept.
•
Connection- und Thread-Nutzung
•
Optimierung der Generierungsparameter für DB2, wie z.B.:
-
•
Festlegung der Optionen für physische DB2-Objekte, wie z.B.:
-
2.5.2
Bufferpool-Größe und -Nutzung
EDM-PooI-Größe
Lock-Definitionen (IRLM)
LOG-Definifionen.
Storagegroup / User defined VSAM-Datasets
DB2-Databases
Tablespaces
Indizes
Packages, Collections und Pläne.
•
Re- bzw. Umorganisation der physischen Datenspeicherung.
•
Anlegen, Ändern oder Löschen von Indizes.
•
Beeinflussung des DB2-Zugriffspfades durch Manipulation von KatalogStatistik-Spalten.
•
Permanente Überwachung des Systemverhaltens, Starten von Utilities,
wie z.B. RUNSTATS,
•
Durchführung gezielter REBIND-Maßnahmen.
Anwendungsbezogene Maßnahmen
Unter anwendungsbezogenen Maßnahmen versteht man:
•
logische und physische Datenmodellierung mit Festlegung der
Benutzer-DB2-Objekte (auch Denormalisierung, falls erforderlich).
•
Einsatzentscheidungen für: Tabellen, Views, Synonyme und Aliase.
•
Veränderungen der Datenablage mit Auswirkung auf die logische
Ebene (z.B. Aufteilen langer Zeilen, Kompression, Änderung von
Datentypen).
•
Festlegung und Test von SQL-Statements (z.B. durch EXPLAIN nach
Ausführung von RUNSTATS).
•
Umschreiben von Queries (Abfragen und Manipulationen) in
effizienterer Form.
•
Festlegung von "constraints", "triggers", UDF’s und Prozeduren
++49 8106 994390
Seite 16 von 127
2.5.3
Die Tuningpotentiale des DB2-Systems
Die Tuningpotentiale des DB2-Systems selbst liegen vor allem in folgenden
Bereichen:
•
MVS-Prioritäten-Steuerung
•
Adressraum-Nutzung
•
Paging/Swapping
•
Interne Ressource-Nutzung
•
Generierungsparameter(ZPARMS)
•
Connection/Thread-Nutzung
•
Anzahl parallele Threads
•
Autorisierungs-Konzepte
•
MVS- und DB2-Systemparameter
•
Cross-Memory- und System-Kommunikation
•
Bufferpool-Größe und Nutzung
•
LOG-Management
•
LOCK-Management
•
Interne Ressource-Nutzung:
- Anzahl intern zu haltender Zeilen (Materialisierungen)
•
Definition der Daten-Zugriffspfade:
-
Einfache Zugriffspfade
Page Set Scan, Index-Nutzung
Komplexe Zugriffspfade
Join, Subqueries
•
Filtermöglichkeiten und Aufwand bei der Bearbeitung vorgegebener
SQL-Prädikate.
Andere
Andere
•
Anzahl zu übertragender Pages = Cl’s
•
VSAM-Optionen
•
DB2-DDL-Optionen
•
Page-Nutzung
•
Freespace-Zuordnung
•
Daten-Zusammenlegung
•
Daten-Verteilung(DDF)
•
Speicherhierarchien
•
Index-Definition und –Nutzung
Trägersysteme
Trägersysteme
SystemSystemServices
Services
DatabaseDatabaseServices
Services
IRLM
IRLM-Services
Services
DDFDDFServices
Services
VSAM
Bild-03: Die DB2-Services im Überblick
++49 8106 994390
Seite 17 von 127
3
SQL - Die „Structured Query Language“ bei DB2
SQL besteht aus folgenden Kategorien, die sich wiederum in ihren Sprachelementen
unterscheiden:
DDL
DML
DCL
Data Definition Language
Data Manipulation Language
Data Control Language
DDL
DML
DCL
CREATE
DROP
ALTER
DELETE
LABEL
COMMENT
SELECT
INSERT
UPDATE
GRANT
REVOKE
Bild-04: Übersicht über die SQL-Sprachelemente
Während DDL und DCL in Richtung des "environment management" von DB2 zielen,
kann die DML als das User-Interface der Sprache SQL bezeichnet werden. Dabei ist
nicht die Menge der Sprachelemente entscheidend, sondern deren Kombinierbarkeit.
Sie macht die Mächtigkeit von SQL aus. DB2 deckt damit die DML-Anforderungen im
Relationenmodell ab:
Es gibt keine Auswirkung der physischen Speicherungsgegebenheiten auf
-
die Formulierung von SQL, z. B. TS-Formen, Indizes usf.
SQL als nicht-prozedurale Sprache
die Qualität von DB2-SQL: alle Sprachelemente sind
Mengenoperationen
Und: SQL enthält Sprachkonstrukte für Projektion, Selektion, Join. SQL bietet eine
Vielzahl "eingebauter Funktionen" ("builtin functions" und "scalar functions" ) für
bool´sche Operationen, für spezielle Prädikate und "date / time" - Arithmetik.
++49 8106 994390
Seite 18 von 127
Fast alle SQL-DML-Befehle können in Form von "views" abgelegt werden. Die
wichtigsten Sprachelemente der SQL-DML finden Sie in der folgenden Übersicht.
Lesen
Ändern
SELECT
INSERT
UPDATE
DELETE
eingebaute Funktionen SUM
MAX, MIN, AVG
DISTINCT
COUNT
GROUP BY
HAVING
GROUP BY
HAVING
spezielle Aussagen
Sortieren
ORDER BY
Bool´sche Operatoren AND
OR
NOT
Sperren
LOCK
Arithmetische Operatoren
Gruppieren
Vergleichsoperatoren
= , >=, <=
^=, ==, <>
>,<
LIKE
UNION
IN, ANY , ALL
BETWEEN
EXISTS
+
/
(
Spez. Arithmetik
,
,
)
*
YEAR, MONTH,
DAY, DAYS
HOUR, MINUTE,
MICROSECOND,
CURRENT
DATE, TIME, DAY
WEEK,
"Scalar Functions"
SECOND, LENGTH, VALUE, SUBSTR, ||
CHAR, INT, HEX, DEC,
FLOAT, DIGITS
TIMESTAMP
NULLIF, CASE
COALESCE
Weitere Funktionen(Beispiele):
CLOB, BLOB
ABS, ROUND,
ACOS, ASIN,
"Table Functions"
ATAN, COS,
MQREADALL,
LOWER, UPPER
MQRECEIVEALL
LTRIM, RTRIM,
RAND, REPEAT
SIGN, STRIP, TAN,
TRUNC....
Bild-05: Grobe Übersicht über die SQL-Funktionen
Eine Liste aller derzeit in DB2 implementierten “builtin-functions“ mit Beispielen findet
man im Anhang unter Pkt.11.1 ff.
++49 8106 994390
Seite 19 von 127
3.1
Relationale Sprachelemente und Operationen bei SQL
SQL nutzt algebraische Mengenfunktionen zur Qualifikation der Daten. Die
Grundelemente sind dabei:
3.1.1
PROJEKTION
Auswahl bestimmter Spalten
SELEKTION
Auswahl bestimmter Zeilen aufgrund von Dateninhalten
- auch anhand verknüpfter Suchkriterien
JOIN
Zusammenführen von Daten aus mehreren Tabellen
Die relationale Funktion "SELEKTION"
Die Funktion "SELEKTION" meint in der relationalen Algebra die Auswahl bestimmter
Zeilen, z. B. aus TAB A
A
B
C
D
E
a1
a2
a3
a4
b1
b2
b3
b4
c1
c2
c3
c4
d1
d2
d3
d4
e1
e2
e3
e4
A
B
C
D
E
a2
a4
b2
b4
c2
c4
d2
d4
e2
e4
ergibt
In SQL lautet die Formulierung:
SELECT
*
FROM
TABA
WHERE
A = 'a2'
OR
A = 'a4'
Bild-06: Die Selektion im RDB-Modell
++49 8106 994390
Seite 20 von 127
3.1.2
Die relationale Funktion "PROJEKTION"
Die "PROJEKTION" im Sinne der relationalen Algebra bedeutet die Auswahl
bestimmter Spalten aus einer Relation, z.B. Spalte A, C, D
A
B
C
D
E
a1
a2
a3
a4
b1
b2
b3
b4
c1
c2
c3
c4
d1
d2
d3
d4
e1
e2
e3
e4
A
C
D
a1
a2
a3
a4
c1
c2
c3
c4
d1
d2
d3
d4
ergibt a)
häufig erfolgt auch eine
Mischung aus “Selektion”
und „Projektion“
ergibt b)
A
C
D
a2
a4
c2
c4
d2
d4
In SQL lautet die Formulierung:
a)
SELECT
FROM
A, C, D
TABA
b)
SELECT
FROM
WHERE
OR
A, C, D
TABA
A = 'a2'
A = 'a4'
Bild-07: Die Projektion im RDB-Modell
++49 8106 994390
Seite 21 von 127
3.1.3
Die relationale Funktion "JOIN"
Die Funktion JOIN basiert auf dem Verbinden von Tabellen = Bilden des
“kartesischen Produkts”. Diese Mengenfunktion ist mathematisch korrekt, aber
informationstechnisch von minderem Wert.
Deshalb bildet man bei SQL-JOINs nicht das kartesische Produkt, sondern vielmehr
eine Intersektionsmenge (= Durchschnittsmenge). Dazu muss der Anwender wissen,
über welche Attribute (Spalten) die Verbindung zur jeweils anderen Tabelle
hergestellt werden kann. Zusammengesetzte Schlüssel müssen dazu vollständig
qualifiziert werden. Im Beispiel wird über A verknüpft.
TABA
A
B
C
D
E
a1
a2
a3
a4
b1
b2
b3
b4
c1
c2
c3
c4
d1
d2
d3
d4
e1
e2
e3
e4
nicht im Ergebnis
(out of join)
TABB
ergibt als karthesisches
Produkt die Permutation aus
beiden Tabellen
- als JOIN Menge aber
folgendes Resultat:
F
G
H
a1
a2
a4
g1
g2
g3
h1
h2
h3
A
B
C
D
E
F
G
H
a1
a2
a4
b1
b2
b4
c1
c2
c4
d1
d2
d4
e1
e2
e4
a1
a2
a4
g1
g2
g3
h1
h2
h3
In SQL lautet die JOIN-Formulierung:
SELECT
FROM
,
WHERE
*
TABA
TABB
TABA.A
=
TABB.F
Bild-08: Der Join im RDB-Modell
++49 8106 994390
Seite 22 von 127
3.1.4
Relationale Mengenoperationen-Zusammenfassung
Alle Mengenoperationen des Realtionenmodells sind in genau diesem beschrieben
und diese sind als die folgenden relationalen Operationen in SQL formulierbar.
Untermenge ( SELECTION,
PROJECTION )
Schnittmenge ( (INNER) JOIN,
INTERSECT )
Vereinigungsmenge
( UNION, FULL
OUTER JOIN )
Ausschlussmenge ( ... NOT IN ...
NOT EXISTS.. , LEFT
oder RIGHT OUTER
JOIN)
Bild-09: Funktionen im RDB-Modell (Zusammenfassung)
++49 8106 994390
Seite 23 von 127
3.2
Generelle Überlegungen und Voraussetzungen für SQL Performance
In diesem Kapitel werden zunächst einige Basisfragen zum Thema "Do's and Don'ts
bei DB2 SQL" behandelt und damit soll mit einigen Gerüchten von vorneherein
aufgeräumt werden.
3.2.1
„Dynamic SQL“
Dynamisches SQL ist zunächst kein Problem – aber die mehrfache Nutzung
desselben SQL-Statements kann zu unnötigen Verzögerungen im Vergleich zu
statischen SQL-Anweisungen führen.
Deshalb sollten dynamische SQL-Anweisungen in einem speziellen Cache abgelegt
werden. In diesem Fall prüft das DB2 erst den Cache-Inhalt, wenn ein PREPARE
öfter dasselbe SQL-Statement benötigen sollte. Wird das SQL-Statement im Cache
gefunden, kann es wieder verwendet werden und das System erspart sich einen
„overhead“ von bis zu 80% CPU Verbrauch für zusätzliche Prepare's.
Benutzt man aber das so genannte „dynamic SQL-cacheing“ dann muss der EDMPool ebenfalls entsprechend angepasst werden. EDM-Pool Statistiken helfen das
„cacheing“ zu tunen (siehe IFCID-Informationen).
Wichtig ist es zudem, dass die SQL Queries dann als "parameter marked" Queries
an das DB2 gegeben werden. Das bedeutet nichts anderes, als dass keine
"hostvariablen direkt in die Query-Formulierungen eingebettet werden sollen, sondern
alle Stellen, an denen eine Hostvariable stehen kann mit einem "?" vorbelegt werden.
Beispiel:
SELECT
FROM
WHERE
AND
......
tab
t
t. c1
>
?
t.c6 BETWEEN ? AND ?
Erst der Aufruf der Query versorgt die "Parametermarker" mit Variablenwerten:
EXECUTE sql_statement USING :hvc1, :hvc61, :hvc62;
3.2.2
Datenbankobjekte und ihre Struktur
Oft trifft man gerade in Standardsoftware-Umgebungen schwierige und riesige
Anzahlen von DB-Objekten an. DWH- und/oder SAP-Anwendungen erzeugen für
sich oft mehr als 18.000 Objekte. In der Version DB2V7 installiert beispielsweise
Peoplesoft mit seiner Anwendung alleine 5 eigene DB2-Datenbanken(!).
Nun stellt in DB2 eine Datenbank nicht ein physisches Objekt dar, sondern eine "DB2
Database" bedeutet vielmehr eine Unterteilung der gesamten DB2-Objekte in
kleinere überschaubare Mengen von Objekten. Empfehlenswert wären nicht mehr als
20 Tablespaces pro "Database".
Der Grund hierfür ist die Größe des DBD (Database Descriptor). DBD-Sperren
erfolgen bei DB2 auf Datenbank-Ebene. Der DBD ist es auch, der, wann immer
Änderungen erfolgen, auf die Log-Datei geschrieben wird. Der DBD muss im
Speicher resident sein und im EDM-Pool in einem zusammenhängenden
Speicherabschnitt liegen.
++49 8106 994390
Seite 24 von 127
3.2.3
Tabellen und Tablespaces
Bei einer großen Menge von Tabellen, so ab ca. 1000 Tabellen, sollten die Daten in
Gruppen eingeteilt werden. I/O Strategien und die Verwaltbarkeit der DBUmgebung bestimmen das physische Design mit.
Obwohl “multiple table” Tablespaces einfach zu implementieren sind, sind doch
einige Punkte bezüglich Größe und physischer Strukturierung zu beachten und
empfehlenswert:
Anzahl pages
Tablespace Typ
n > 10.000
Partitionieren
n > 1.000
eine einzelne “table” pro segmented TS reicht
128 < n < 1.000
mehr als eine Tabelle in einem segmented TS
n < 128
mehr als eine Tabelle in einem segmented TS
Bild-10: Übersicht über Strukturierungsempfehlungen zum TS
Aber: 100 Tabellen für einen “shared segmented TS” sollten das Maximum sein.
Extrem wichtig in diesem Zusammenhang ist auch die Größe der TB-Segmente.
Anzahl pages
SEGSIZE
n <= 28
4 bis 28, ähnlich der Anzahl pages
28 < n < 128
32
n >= 128
64
Bild-11: Übersicht über Segmentierungsempfehlungen zum TS
Alle gruppierten Tabellen sollten sorgfältig dahingehend untersucht werden, ob sie
sich in ihren fundamentalen Charakteristiken ähneln. Manche Arbeits- und temporäre
Tabellen (gerade von Softwareherstellern wie z. B. Peoplesoft, SAP usw.) haben
sehr unterschiedliche Eigenschaften. Diese können sein:
•
•
•
•
Daten für die Clients
"Codetabellen"
Referenzen und
“lookup tables”
Die Gesamtgröße eines STS sollte immer kleiner 1.000 pages sein (Größere TS
sind als PTS zu definieren), um unnötige "prefetch"-Aktivitäten zu vermeiden.
Eine Methode hier gegenzusteuern ist sicherzustellen, dass der VPSEQT Parameter
auf 0 gesetzt ist und somit diese "prefetch"-Aktivitäten weitgehend ausgeschaltet
sind.
Zusätzlich sollte die “queuing” Methode auf FIFO gesetzt werden (ab DB2V6), um
den Overhead für das “latch”-Management zu vermeiden.
++49 8106 994390
Seite 25 von 127
3.2.4
Indexe
Index-Tuning ist eine Aufgabe, die in den DB2-Installationen dringend zu empfehlen
ist.
Beispielsweise umfassen viele Indexe nicht die erforderlichen Spalten, um die Wahl
eines besten Pfades zuzulassen: Einige Indizes enthalten die Spalten in der falschen
Spaltenreihenfolge oder in der falschen Sortierreichenfolge. Gerade für JOINOperationen sind aber gute und richtige Indexe unerlässlich.
3.2.5
Primary und Clustering Indexes
Einmal gesetzte Primary Indexes sollten niemals verändert werden (auch nicht bei
SAP oder Peoplesoft usw.). Änderungen am PI gefährden nicht nur die Eindeutigkeit
von Daten, sondern können auch RI-Bedingungen betreffen.
Ebenso sollte der “clustering Index” sorgfältig ausgewählt werden. Nicht immer ist es
geschickt den PK auch als “clustering Index” einzusetzen.
3.2.6
“Index Only”-Zugriffe auf VARCHAR Spalten
Wegen der vermehrten Nutzung von VARCHAR-Spalten im Index, wurde nach der
Version 5 ein „Feature“ für „index only“-Zugriffe auf solche Felder implementiert.
Dies ist nur möglich, da die Längeninformation dieser Spalten nun in den Index
mitaufgenommen wird. Somit kann auch auf indizierte VARCHAR-Felder zugegriffen
werden, ohne dazu die Datenpage lesen zu müssen. Dabei wird abhängig von der
Länge der eingetragenen Werte die maximale Länge mit Leerzeichen aufgefüllt.
Gesteuert wird dies über den neuen DSNZPARM RETVLCFK=YES.
Dazu müssen bei der Migration auf die neuen Versionen 6 oder 7 die
Anwendungsprogramme neu mit BIND behandelt werden.
In der DB2 Version 8 gibt es zum Thema VARCHAR- und Indexverarbeitung weitere
Verbesserungen ("padding").
3.2.7
Verzögerte Objektdefinitionen
Zusätzlich gibt es seit der DB2 Version 6 ein neues „Feature“, das es zulässt, TS und
Indizes im Nachhinein zu definieren(DEFER).
Im Gegensatz zu früher, müssen nun nicht mehr alle TS und Indexes zum Zeitpunkt
der Installation eines Systems (SAP, Peoplesoft...) definiert sein. Dies war früher
unabdingbar, auch wenn diese Indizes noch nicht genutzt wurden. Dies soll
Installationsvorgänge beschleunigen und die DBA einfacher machen.
3.2.8
Aufwand und Kosten von Indexes
Bevor man mit dem Anlegen von Indexes beginnt, sollte man sich Gedanken über
den Aufwand zur Pflege dieser Indexes machen und folgende Themen überlegen:
Indexes benötigen Speicherplatz - große Indexes viel Speicherplatz.
Jeder Index benötigt einen eigenen Indexspace und darunter liegende
VSAM Datasets. Es existiert eine Einschränkungen in der Anzahl der
offenen Datasets im MVS Betriebssystem (1000 Address Spaces
(=default)).
Ein Index muss bei jeder Datenänderungen mitgepflegt werden, um
die Änderungen in seinen Basistabellen zu reflektieren. Wenn eine
++49 8106 994390
Seite 26 von 127
UPDATE SQL-Anweisung eine Spalte verändert und die Spalte
Bestandteil eines Index ist, muss der Index ebenfalls verändert werden.
Die Gesamtzeit für die Pflege steigt somit entsprechend.
Indexes müssen während des Ladens einer Table erstellt werden das kostet Zeit. Sie müssen und können aus ihrer Basis-Tabelle
wiederhergestellt werden, wenn der Tablespace wiederhergestellt werden
muss, dies verbraucht zusätzlich Zeit.
Indexes können seit DB2 V7/V8 mit der Funktion REBUILD jederzeit
aus den Daten der zugehörigen Tabelle wiederhergestellt werden
nachdem diese Funktion dem Utility REORG beigeordnet ist, ist es mehr
als hilfreich, dass dieser im „online“-Modus laufen kann, d.h. der laufende
Betrieb von DB2 wird während der REORG-Zeit kaum behindert.
Empfehlung:
Das Design der Indexe sollte Bestandteil des Database Design sein und nicht
vernachlässigt werden. Treten bei SQL-Anweisungen Performance-Probleme auf,
stellt man sich zunächst folgende Fragen:
1.
2.
3.
Würde das Hinzufügen einer Spalte zu einem Index einer
Anweisung erlauben Index-Only-Zugriffe zu nutzen ?
Wird ein neuer Index benötigt ?
Ist der (bisherige) Index-Aufbau korrekt ?
++49 8106 994390
Seite 27 von 127
3.2.9
Empfehlungen zu Sortierungen
Häufig kann man Sortierungen vermeiden, wenn Index Keys in der Reihenfolge
vorliegen, die in ORDER BY, GROUP BY, einer Join-Operation, oder bei einem
DISTINCT in einer Column-Funktion benötigt werden. In anderen Fällen,
beispielsweise bei Einsatz des List Sequential Prefetch stehen im Index keine
sinnvollen Sortierungen zur Verfügung und die selektierten Daten müssen
zwangsläufig sortiert werden.
Sollte es zwingend erforderlich sein, Sortierungen zu verhindern, so sollte man die
Anlage eines passenden Index für die erforderlichen Spalten erwägen und
gegebenenfalls die OPTIMIZE FOR n ROW- Anweisung einsetzen (siehe auch Pkt.
6.4.2 ff).
Beispiel:
SELECT
FROM
WHERE
ORDER BY
C1, C2, C3
T1
C1
>
1
C1
OPTIMIZE FOR 1 ROW;
Ein aufsteigender Index auf der Spalte C1 kann eine Sortierung vermeiden helfen.
Ein Index auf C1 + C2 + C3 erfüllt den gleichen Zweck, ermöglicht aber gleichzeitig
einen Index-Only-Zugriff.
Man beachte in diesem Zusammenhang aber auch die Hinweise im Abschnitt
"Aufwand und Kosten von Indexes", bevor mit neuen Indizes versucht wird, SortOperationen zu vermeiden: Nicht alle Sorts bedeuten eine Behinderung.
Ist beispielsweise ein Index nicht effizient genug und werden sehr viele Rows
qualifiziert, kann der Optimizer einen anderen Zugriffspfad wählen und stattdessen
die Daten ohne Indexzugriff selektieren, anschließend sortieren, und damit u. U.
erheblich kostengünstiger arbeiten.
Faktoren, die man beachten muss, weil sie die Sort-Performance beeinflussen
und Techniken, die Sorts verbessern können, sind:
die eingesetzten Prädikate sollten die Daten liefern, die man benötigt: Jede
Einschränkung eines Auswahlergebnisses, die Reduzierung des Result
Sets, usw. reduziert auch den Sort-Aufwand.
wenn Sorts durchgeführt werden, hängt die Row-Länge von der Anzahl der
selektierten Ergebnisspalten ab. Die Reduktion der Spalten erhöht die
Performance eines Sortvorgangs, wobei vor allem der Umfang der Daten
und des Workspaces eine Rolle spielen. Generelle Vorschläge zur
Reduzierung der Sort-Row-Länge:
1.
2.
3.
Wenn VARCHAR-Spalten nicht benötigt werden, dann verzichte man
auf sie. VARCHARs im Index werden mit Blanks auf ihre maximale
Länge aufgefüllt (gilt seit DB2 V8 nicht mehr – IX können nun
„gepadded“ sein, d.h. die Blanks werden nicht im IX gespeichert).
Minimieren der Anzahl von Sort Key Columns,
Minimieren der Anzahl der Sort Data columns.
Workfiles verfügen über ein vielfältiges Einsatzspektrum und haben eine
Wechselwirkung zur Sort-Performance. Es gilt über Global Temporary
Tables und Materialized Views nachzudenken. Der Systemadministrator
++49 8106 994390
Seite 28 von 127
sollte ausreichend physischen Platz bereitstellen und diese Workspaces in
einen eigenen Bufferpool legen. Die Isolierung von anderen Objekten
vereinfacht das Monitoring und Tuning der Sort-Performance.
Anwendungen die Global Temporary Tables (GT‘J) nutzen, belegen
Workfile-Space bei einem COMMIT oder ROLLBACK. Wenn Sorts und GTTNutzung gleichzeitig erfolgen, dann benötigt man sehr wahrscheinlich mehr
Work-Space.
Der Systemadministrator sollte die Bufferpool-Größe für Workfile Buffers
erhöhen, wenn die Prefetch Rate 4 Pages oder weniger beträgt.
Bei der Nutzung von STOGROUP‘s sollte nur jeweils ein Volume je Storage
Group genutzt werden. Zusätzliche Volumes werden erst genutzt, wenn das
erste Volume vollständig belegt wurde.
Nie mehr als ein physisches Workfile Dataset je DASD Volume anlegen.
Die Größe des Sort Bufferpool beeinflusst die Sort-Performance. Je größer
der Buffer, desto größer die Effizienz von Sorts.
Die Planung der Konfiguration sollte so erfolgen, dass minimale I/O
Contention auf den I/O Paths zu den physischen Workfiles sicherzustellen.
Eine Verteilung der Workfiles auf unterschiedliche Disk Paths hilft meist.
Sind Statistiken nicht aktuell, sollte man diese mit dem RUNSTATS Utility
auf den aktuellen Stand bringen.
3.2.10 SQL-Abfragen mit Subqueries
Nutzen Sie Input (Host) Variablen in Prädikaten Ihrer Static SQL Query? Variable
wie Parameter-Marker erlauben keine Auskunft über mögliche Werte zur BIND- und
Ausführungszeit. DB2 nutzt deshalb den sog. Filterfaktor um den besten Zugriffspfad
für ein SQL-Statement zu bestimmen.
Wenn sich dieser Zugriffspfad als ineffizient herausstellen sollte, könnte man
eine erneute Überprüfung (REOPT) für langlaufende Queries (>10 secs
Elapsed Time) zur Laufzeit veranlassen.
Das Binden mit EXPLAIN-Option veranlasst ein "Static Explain". Um bereits
vor dem Bind den voraussichtlichen Zugriffspfad von Explain ermitteln zu
lassen, also einen "Dynamic EXPLAIN" auszuführen, extrahiert man das
jeweilige Statement aus dem Programm und ersetzt die Host-Variablen
durch Konstante. Der "Dynamic Explain" wird sich dann wie ein "Static
Explain" verhalten.
3.2.11 Komplexität von Queries
Man sollte sicherstellen, dass die SQL Query so einfach und effizient wie möglich
formuliert ist.
Die Auswahl nicht benötigter Spalten und unnötige ORDER BY oder GROUP BY
Anweisungen sind unbedingt zu vermeiden.
Dennoch sollte man diese Aussage nicht missverstehen und SQL-Funktionen oder
gar relationale Funktionen selbst programmieren. DB2 wird in jedem Fall jede
Datenbankaufgabe schneller lösen können, als eine vergleichbare programmierte
Sequenz. – wer kennt schon die DB-Umgebung besser als das System DB2 selbst?
++49 8106 994390
Seite 29 von 127
3.2.12 Spalten-Funktionen
Werden Column Functions eingesetzt, sollten diese so einfach wie möglich
gestaltet sein, damit die Wahrscheinlichkeit, dass sie bereits aufgeführt werden,
wenn die Daten beschafft werden, und nicht erst im Anschluss daran, möglichst hoch
ist. Grundsätzlich kann man sagen, dass Column Functions am effizientesten sind,
wenn sie nicht erst während der Sort-Phase des Statements ausgeführt werden.
Um Column Functions bereits während des Datenzugriffs zu ermöglichen, müssen
folgende Bedingungen vorliegen:
GROUP BY benötigt keinen Sort (EXPLAIN-Output prüfen).
Kein Stage-2-Prädikat verwenden. Dies ist in der Anwendung zu formulieren.
Keine Distinct-Set Funktion (wie z.B. COUNT(DISTINCT C1)).
Beinhaltet die Query einen Join, dann beziehen sich alle Set Functions auf
die letzte Tabelle des Joins (EXPLAIN-Output prüfen).
Alle Column Functions beziehen sich auf eine einzige Column ohne
arithmetischen Ausdruck, ausgenommen die Column Functions VARIANCE
and STDDEV, die niemals während einer „Retrieval“-Aktion ausgeführt
werden können.
3.2.13 Formulieren von Prädikaten
Manche Formulierungen in den Prädikaten schließen eine Indexnutzung bei
DB2 aus (siehe Pkt.: 4.7.2.2 und 4.7.2.3 ff.) .
Deshalb folgende Empfehlungen:
• Prädikate die Indexes nutzen könnten, sollten bevorzugt werden
• Unabsichtlich redundante oder unnötige Prädikate sind zu vermeiden
• Deklarierte Längen von Host Variablen darstellen: Die Länge deklarierter
Host-Variablen (HV) darf nicht länger sein als das Attribut der Datenspalte,
mit der die Host-Variable korrespondiert. Wenn die HV-Länge größer ist,
wird das Prädikat ‘Stage-2‘ und kann niemals passendes Prädikat für einen
Index Scan werden.
Folgende Host Variable und SQL Tabellenspalte sei angenommen:
Assembler Declaration
SQL definition
MYHOSTV
COL1 DECIMAL(6,3)
DS
PLn ‘value‘
Bild-12: Deklaration von HOST-Variablen
Die Präzision der Host-Variablen beträgt 2n-1. Bei n = 4 und Wert = ‘123.123‘ würde
das Prädikat, wie nachfolgend dargestellt, kein passendes sein, weil die Präzisionen
(7,0 und 6,3) unterschiedlich sind:
... WHERE
COL1 =
:MYHOSTV
Eine Möglichkeit solche ineffizienten Prädikate zu vermeiden besteht darin, HostVariablen ohne Längenoption zu versehen, also
MYHOSTV
DS
P‘l23.l23‘
Dies garantiert die identische Attributdefinition wie die der SQL Spalte.
++49 8106 994390
Seite 30 von 127
3.2.14 Die Verwendung von „scalar functions“
Die Verwendung von skalaren Funktionen, wie SUM, MAX, MIN, AVG, COUNT,
LENGTH, VALUE, CHAR, DATE, DECIMAL, DIGITS usw., sollte immer mit Vorsicht
zu erfolgen.
Es ist genauestens zu prüfen, ob Programmfunktionen in Anwendungsprogrammen
nicht dieselbe Wirkung und Funktionalität besitzen, ohne den DB2-Kernel unnötig zu
belasten.
DB2 ist ein Meister in relationaler Funktionalität aber kein Konvertieroder Rechenprogramm.
3.2.15 Neuordnen der Tabellenfolge in der FROM Klausel
Die Reihenfolge der Tabellen oder Views in der FROM Klausel kann die Auswahl des
Zugriffspfads beeinflussen. Wenn die Query langsam läuft kann dies deshalb sein,
weil die "join sequence" ineffizient ist. Man kann die "join sequence" innerhalb eines
Query-Blocks aus der Spalte PLANNO in der PLAN_TABLE ersehen (siehe auch Pkt.
7.2 ff.). Eine Neuanordnung der Tabellen oder Views in der FROM Klausel kann zu
einer besseren Performance der Query führen.
Es sollte Dabei darauf geachtet werden, dass in den verschiedenen JOIN-Verfahren
immer die Tabelle/View mit den kleineren Join-Resultaten als äußere Tabelle beim
Join verwendet wird.
3.2.16 „list prefetch“
“Index screening” ist genau die richtige Medizin für exzessive “list prefetch”Operationen. „List prefetch“ erfolgt immer, wenn Indizes nicht genau zur WHEREKlausel passen.
Beispiel:
Index: c1, c2, c3, c4
WHERE c1 = xx AND c3 = yy AND c4 = zz
Bisher wurde nur die c1 Spalte genutzt, um die RID’s für den „list prefetch“ zu finden.
Jetzt werden auch die Spalten c3 und c4 überprüft, um die entsprechenden RID’s vor
dem „list prefetch“ zu eliminieren. Damit wird die Last für den RID Pool reduziert,
über die ansonsten „RID Pool failures“ und andere Probleme hervorgerufen werden.
3.2.17 Uncommitted read“
Wenn irgend möglich sollte das “isolation level” uncommitted read”(UR) gesetzt
werden. UR vermeidet unnötigen „lock overhead“. Am einfachsten wird die
Nutzung von UR bei „read-only“ Daten. Jede Tabelle, die als „read-only“ erkannt und
analysiert ist sollte im SELECT-Statement die Klausel WITH UR enthalten.
Bei langlaufenden Queries auf den Clients können nach Erfahrungen so bis zu 30%
CPU-Zeit gespart werden.
++49 8106 994390
Seite 31 von 127
3.2.18 “row level locks”
Man sollte “row-level locking“ vermeiden wo immer es geht.
Oft wird RLL („row level locking“) genutzt, um Probleme bei der Parallelverarbeitung
zu vermeiden. Meist jedoch erzeugt diese Vorgehensweise mehr Probleme als sie
löst.
Besonders, da diese Verfahrensweise zu einer potentiellen Steigerung von
„deadlock“-Situationen führt, indem sie mehr als einen Prozess pro Page zulässt.
3.2.19 Freigabe von Locks
In n-tier Umgebungen verursachen “locks” die Belegung von aktiven oder inaktiven
„threads“ nach einem Commit. In der Komponente DB2-Connect kann man den
Parameter cursorhold auf 0 (kein “default”) setzen, was dazu führt, dass Sperren
nach einem „commit“ aufgehoben werden.
Zusätzlich dazu sollte man den Parameter autocommit überprüfen, der
unterschiedlich, z.B. in ERP-Systemen, eingesetzt werden kann. So setzen manche
ERP Systeme den Parameter autocommit auf 0, um „commits“ zwischen den SQLStatements zu unterbinden, da alle „commits“ in der Applikation ausgelöst werden.
In anderen Fällen ist der autocommit unerlässlich, z.B. in Tuxedo-Umgebungen.
3.2.20 “lock escalation”
Bei bestimmten TS ist es sinnvoll “lock escalation” auszuschalten. „Lock
escalation“ kann in einigen Situation durchaus zu einer echten Performance-Bremse
werden. Wenn man LOCKMAX in der TS-Definition auf 0 setzt, schaltet man „lock
escalation“ für diesen Tablespace aus.
Zudem sollte LOCKSIZE auf PAGE oder ROW gesetzt sein.
Bei "partitioned table spaces" (PTS) sollte das „selective partition locking“(SPL)
eingeschaltet sein (LOCKPART YES). Dies verursacht nur bei den benutzten
Partitions Sperrungen, nicht aber auf allen anderen – auch den nicht verwendeten.
Bedingungen, die SPL verhindern, sind:
• Type 1 Index wird im „access path“ verwendet
• Der Plan wurde mit ACQUIRE(ALLOCATE) gebunden
• Der TS wurde mit LOCKSIZE TABLESPACE erstellt
• LOCK TABLE IN EXCLUSIVE MODE wurde ohne PART Angabe gesetzt.
3.2.21 "materialized query tables"(MQT's) und "automatic query rewrite"(AQR)
Materialized query tables sind Tabelle, die Informationen enthalten, die aus
anderen Tabellen gewonnen werden. MQT's speichern Resultate aus
vorangehenden Queries , die aufwendige Joins und Aggregationsoperationen
durchführen.
Indem die gewonnene, zusammengefasste Information aufbewahrt und vorgehalten
wird, können MQT's folgende Query-Verarbeitung vereinfachen und die Performance
von "dynamic SQL queries" erheblich verbessern. MQT's sind deshalb besonders oft
in "data warehousing applications" zu finden.
Automatic query rewrite (AQR) ist der Prozess bei DB2, der zur Verarbeitung von
Daten aus einer MQT führt. Wird AQR zugelassen, dann entscheidet DB2 selbst, ob
++49 8106 994390
Seite 32 von 127
es eine "dynamic query" oder einen Teil daraus über die Nutzung einer "materialized
query table" schneller erledigen kann. Wenn ja, wird DB2 die Query so umformulieren
("rewrite"), dass anstatt der originalen Tabelle(n) die MQT verwendet werden kann.
Dabei ist zu beachten, dass eine MQT Query-Resultate enthalten kann, die nicht "ad
hoc"-aktuell sind – vor allem, wenn die betroffenen "base tables" nach der Erstellung
der MQT öfter geändert wurden.
Zu diesem Thema gibt es in der einschlägigen Literatur folgende Punkte
nachzulesen:
•
•
•
•
•
•
Einführung in MQT's und AOR
Definition einer MQT
Aufbau von MQT's
Security und MQT's
Nutzung von AQR
Beispiele zu MQT’s und AOR
3.2.21.1 Empfehlungen für das Design von "materialized query tables"
Die folgenden Empfehlungen betreffen direkt das Design von MQT's:
•
"Aggregate Functions" sollten strategisch im Fullselect einer "materialized
query table"-Definition berücksichtigt werden:
-
-
COUNT(*) und SUM(Ausdrücke)
SUM(expression*expression) nur , wenn auf VAR(expression),
STDDEV(expression), VAR_SAMP(expression), oder
STDDEV_SAMP(expression) abgefragt werden soll.
COUNT(expression) zusätzlich zu COUNT(*), fall der Ausdruck "nullable"
sein sollte.
MIN(expression) und MAX(expression), wenn dies abgefragt wird
NICHT: AVG(expression), VAR(expression), oder STDDEV(expression)
direkt, falls eine der folgenden Parameter Kombinationen genutzt werden
sollte:
SUM(expression), SUM(expression*expression) und COUNT(*)
SUM(expression), SUM(expression*expression) und
COUNT(expression)
DB2 kann AVG(expression), VAR(expression) und STDDEV(expression) aus
SUM(expression), SUM(expression*expression) und der zugehörigen
COUNT "aggregate function" ableiten.
•
Der "foreign key" einer "dimension table" einer GROUP BY Klausel einer MQT
Definition sollte ebenfalls in der MQT vorhanden sein; z.B. wenn PGROUP.ID
existiert, sollte auch PGROUP.LINEID existieren. DB2 kann dann die MQT
verwenden, um eine Summierung auf der Ebenen LINEID vorzunehmen,
ohne über PGROUP erneut zu "joinen"
•
Alle "higher-level columns" in der MQT sollten vorhanden sein, da DB2 die
funktionale Abhängigkeit einer denormalisierten "denormalized dimension
table" nicht kennt; z.B. CITY in einer GROUP BY Klausel impliziert auch
STATE und COUNTRY für diese Klausel. Ebenso wie MONTH in der GROUP
BY Klausel ebenfalls YEAR impliziert.
•
HAVING Klauseln haben in der MQT nichts verloren. Eine "materialized query
table" mit HAVING in ihrer Definition besitzt nur eine limitierte Verwendbarkeit
während des "query rewrite".
•
Erzeugen Sie IX auf “materialized query tables”, wie Sie sie auch auf den
Basistabellen anlegen würden.
++49 8106 994390
Seite 33 von 127
3.2.21.2 Empfehlungen für das Design von zugehörigen "base tables"
Die folgenden Empfehlungen betreffen die Strategien zum Design der Basistabellen
und dienen der Performancesteigerung und einer erhöhten Anwendbarkeit von
“materialized query tables”:
•
"referential integrity" sollte wann immer möglich entweder als ENFORCED
oder NOT ENFORCED definiert sein. Ebenso sollte ein IX als UNIQUE
definiert werden, wenn er „eindeutig“ ist.
•
Alle "base table" Spalten sollten wo immer möglich als NOT NULL definiert
sein, sodass COUNT(X) genauso möglich wird wie COUNT(*). In diesem Fall
muss man nicht auch noch COUNT(X) für jede "nullable column" X in einer
MQT Definition vorsehen. Falls nötig sollte ein spezieller Wert definiert
werden, der den NULL-Wert ersetzt.
•
Normalisierte Tabellen sollten denormalisierten Tabellen im "base tables"Umfeld vorgezogen werden. Nutzt man normalisierte Tabellen, so kann man
"non-primary key columns" für eine MQT eher vernachlässigen und spart
damit nebenbei auch noch Speicherplatz.
DB2 kann das Fehlen von "non-primary key columns" kompensieren, indem
es diese Spalten über einen Join auf eine "dimension table" immer noch
gewinnen kann. Ist Performance ein Aspekt, so kann man MQT's auch so
definieren, dass die so genannten "snowflake dimensions" denormalisiert
dargestellt werden.
3.2.22 Der DB2-Katalog
Es ist wichtig, den Katalog sauber und ordentlich zu verwalten. Normalerweise
nämlich ist die Anzahl „packages“ und DB2-Objekte in einer Produktionsumgebung
nicht unerheblich.
Der Katalog sollte keine unnötigen Objekte enthalten und ebenfalls von Zeit zu
Zeit reorganisiert werden. Dies ist Aufgabe einer leistungsfähigen DBA.
Erfahrungen zeigen eine Reduktion von I/O’s um ca. 10 bis 12 Prozent und eine
Reduktion von „elapsed time“ um ca. 11 bis 14 Prozent nach einer
Reorganisation.
In einigen Fällen, z.B. bei starker Fragmentierung wurden sogar bis zu 50%
Reduktion der I/O Tätigkeit nach einem REORG des Katalogs erzielt.
4
SQL-Tuning und Performance bei DB2
Das Tuning von SQL Queries kann folgende Aktivitäten erfordern:
•
Hinzufügen und/oder Ändern von Indizes
•
Anpassen der Spaltengrössen in Tabellen
•
Neuschreiben von Queries
Insbesondere Queries, die Bestandteil von angepasstem Code, generierten Abfragen
und benutzerverfasste Queries bedürfen häufig eines Tuning. Insbesondere bei
Reporting Aufgaben überschreiten die generierten Queries schon einmal die erlaubte
Anzahl von 15 Tabellen in einem Join in der DB2 Version 5.
Seit DB2 Version 6 sind bis zu 255 Tabellen in einem SQL-Statement verarbeitbar,
aber nur 15 „joined“-Tabellen in einem SQL-Statementblock erlaubt. Ab DB2
++49 8106 994390
Seite 34 von 127
Kapitel 11: Anhang – Index
Version 8 sind es bis zu 225 Basistabellen, die in einer einzigen "joined
expression" stehen können.
Joins über Spalten mit ungleicher Länge führten bis zur Version 5 zu PerformanceProblemen. Seit DB2 Version 6 kann DB2 dies über zusätzliche interne Funktionen
kompensieren. Dennoch sind nach der Migration betroffene Indizes anzupassen. In
Version 8 wird DB2 über "query rewrite"-Funktionalitäten derartige User-Fehler
weitgehend beheben können.
Neue „features“ bestehender oder neuer Funktionen in neuen Releases von DB2
werden seltenst in „alten“ Queries genutzt und eingebaut, obwohl DB2 daraus
Vorteile ziehen könnte.
Nur allzu wenig wird z.B. die Verwendungsmöglichkeit von CASE – Konstrukten
beachtet. Mit CASE kann man in SQL „UNION Konstrukte“ minimieren/eliminieren
bzw. restrukturieren.
In einem 5-fachen UNION Block beispielsweise kann alleine über den Weg von
CASE-Formulierungen eine Reduzierung von „elapsed time“ und CPU-Zeit von
bis zu 80% erzielt werden.
++49 8106 994390
Seite 35 von 127
4.1
Die neuen Limits bei DB2 Version 8
Folgende Begrenzungen in DB2 Version 8 haben sich im Vergleich zu den
vorhergehenden Versionen verändert:
•
Die Größe des Virtual Storage von 2 GB (231 Bytes) auf 16 EB (=
Exabytes = 264 Bytes) das sind 16 Milliarden GB(!)
•
Maximale Länge von Table- & Viewnames von 18 auf 128 Zeichen
•
Maximale Länge der Columnnames von 18 auf 30 Zeichen
•
Die maximale SQL Statement Länge von 32K auf 2 MB
•
Die Länge von “Character Literals” von 255 auf 32704 Zeichen
•
Die Anzahl Tables in einem Join von 15 auf 225
•
Die Maximale Anzahl Partitions in einem PTS von 254 auf 4096
•
Die Anzahl von “Active logs” von 31 auf 93
•
Die Anzahl “Archive logs“ von 1000 auf 10,000
Zusätzlich wurde einige der wichtigsten internen (systemtechnischen) Grenzen
aufgehoben:
• Veränderungen im Maschinencode (64-Bit Architekturen)
• Mehr Realspeicher (reduziert das "paging")
• Mehr "threads",
• größere Bufferpools,
• mehr Anwendungen parallel
• mehr Objekt
• größere Objekte usw.
Für komplexe Joins sollten auch die ZPARMs in der DB2-Umgebung entsprechend
angepasst werden:
Bild-13a: Neue SQL-wirksame „ZPARMs“ in DB2 V8
++49 8106 994390
Seite 36 von 127
4.2
Die DB2 SQL Engine
Im Folgenden wird die generelle Verarbeitung von SQL-Statements in DB2
dargestellt und welche Fragestellungen der Optimizer zu beantworten hat, bevor er
über die Zugriffe auf eine DB2-Tabelle entscheidet.
SORT ?
Wieviele Rows?
RID Sort ?
View Materialized?
Table Expression ?
BIND/Optimize
ORDER BY
Calling
CallingProcess
Process
Colum function ?
DSNDB07
Relational Data Server
Filtering ?
(Non-sargable
SORT?
Stage 2)
D
B
2
E
N
G
I
N
E
RESULT SETS
Global Temp Tables
Column Functions ?
Data Manager
Filtering ?
(Sargable
Stage 1)
Welcher Index ?
Welche Spalten ?
Wieviele „matching“ ?
Index Only ?
DB2 Catalog
DB2 Directory
DB2 Indexe
DB2 Tabellen
Table Scan
Oder Partition Scan ?
Bild-14: Die DB2-SQL „Engine“
Die kritischen Fragen sind:
•
•
•
•
•
•
•
•
•
•
•
•
Werden Indexe genutzt ?
Wird „list prefetch“ angesteuert ?
Sind alle Spalten im Index korrekt ?
Wie viele der Spalten passen zu den Queries (matching) ?
Werden weniger als alle „columns“ verwendet, gibt es Statistiken über die
Kardinalität dieses „subsets“ ?
Wurde „index-only“ Zugriff verwendet – oder sollte er verwendet werden ?
Wurden „column functions“ in Stage1 , Stage2 oder anderswo aufgelöst ?
Könnten sie
eventuell während der Datenwiedergewinnung aufgelöst werden ?
Wird ein ORDER BY über einen Index bedient oder wird ein SORT
ausgelöst?
Wie viele „rows“ werden von Stage1 an Stage2 übergeben ?
Wird ein View genutzt und materialisiert ?
Wird eine „table expression“ genutzt und materialisiert ?
Wurde alles, was möglich ist für die Daten getan, bevor sie ans Programm
zurückgegeben werden ?
++49 8106 994390
Seite 37 von 127
Dies alles resultiert in einem einfachen Grundsatz:
Sei so selektiv wie möglich !
Nach dem Lesen von Index und Daten sollten soviel Daten wie möglich über Stage1Prädikate gefiltert werden.
Dies macht einige Schwierigkeiten, da mit jedem neuen Release von DB2 immer
mehr Prädikate auf Stage1 gedrückt werden, dennoch aber heißt dies nicht, dass
wenn ein Prädikat in Stage1 potentiell „indexable“ ist, es auch in Stage1 verarbeitet
wird (siehe auch Pkt. 6.2.1 ff). Jede Filterung, die nicht in Stage1 erledigt werden
kann geht an Stage2 zur Erledigung – an den RDS Layer.
Viele dieser Funktionen, wie „scalar functions“, Aggregation, Sortieren und anderes
könnte genauso gut in Stage1 verarbeitet werden, wenn SQL anders formuliert
wäre.
Noch häufiger wird in Stage2 keinerlei Verarbeitung erledigt und die Daten werden
lediglich an das User-Programm weitergegeben.
Grundsätzlich gilt: Alles, was nicht in Stage1 verarbeitet werden kann, wird in
Stage2 erledigt.
++49 8106 994390
Seite 38 von 127
4.3
SQL-Tuning und die Logik von Abfragen
Unabhängig vom Datenbanksystem und vom damit zusammenhängenden SQLDialekt, gibt es Regeln der mathematischen Logik, nach denen man alle logischen
Anfragen (auch in SQL) ohne die Gefahr des Verfälschens von Ergebnissen
umformulieren kann. Hier einige Beispiele:
4.3.1
“constant propagation”
Das Transitivitätsgesetz in der mathematischen Logik stellt folgendes fest:
IF
AND
THEN
AND NOT
( A <vergleichsoperator> B ) IS true
( B <vergleichsoperator> C ) IS true
( A <vergleichsoperator> C ) IS true
( A <vergleichsoperator> C ) IS false
<vergleichsoperator> kann dabei sein: =, >, >=, <, <=
NICHT aber: <> oder LIKE (!)
Bild-15: Transitivitätsgesetz der mathematischen Logik
Das Transitivitätsgesetz führt zur Erkenntnis, dass man z. B. C mit B ersetzen kann,
ohne eine Änderung der Bedeutung eines Ausdrucks zu verursachen. Diese
Substitution einer Konstanten nennt man „constant propagation“.
Die folgenden zwei Ausdrücke meinen dasselbe, aber die zweite Formulierung ist
besser, da ein „column“-Name durch ein Literal(5) ersetzt wird:
Fall-1:
SELECT ….
FROM
WHERE
AND
AND
tab_1
t1.col1
t1.col2
t1.col1
t1
<
=
=
t1.col2
t1.col3
5
Fall-2:
SELECT ….
FROM
WHERE
AND
AND
tab_1
5
t1.col2
t1.col1
t1
<
=
=
t1.col2
t1.col3
5
Bild-16: Transitivitätsgesetz der mathematischen Logik – Beispiel-1
Der Ausdruck in Fall-2 wird auch Transformation des Ausdrucks in Fall1 genannt.
Einige gute DBMS-Systeme führen diese Transformation eigenständig und
automatisch durch (siehe Pkt. 4.4.2.1).
++49 8106 994390
Seite 39 von 127
Manche DBMS versuchen keine Transformationen, wenn die Gefahr besteht, dass
eine Query dadurch langsamer werden könnte, z. B. eine Transformation von
Ausdrücken mit NOT:
Fall3:
SELECT ….
FROM
WHERE
AND NOT
tab_1
t1
t1.col1
=
( t1.col3 =
t1.col2 )
Manuelle Lösung:
SELECT ….
FROM
WHERE
AND
AND
tab_1
t1.col1
t1.col3
t1.col2
5
7
OR
t1.col1 =
t1
=
5
<> 7
<> 5
Bild-17: Transitivitätsgesetz der mathematischen Logik – Beispiel-2
Das transformierte Statement ist in DB2 ca. 50% schneller als das Original. In
anderen Worten: Manchmal zahlt es sich aus Transformationen selbst vorzunehmen.
„constant propagation“ funktioniert manchmal nicht mit „float“ Daten: Sie können
sowohl „größer als“ und „gleich“ sein, wenn numerische Vergleiche gemacht werden.
Wenn die Methode möglich ist, kann man von einem Performance-Gewinn von ca.
5/8 ausgehen.
Andererseits funktioniert die „constant propagation“ oft nicht bei CHAR-Ausdrücken.
Wenn sie möglich ist, ist auch eine Performance-Verbesserung von 4/8 drin.
4.3.2
Eliminieren von „totem Code“
In manchen alten SQL-Programmen findet man auf beiden Seiten des Operators
Literale, wie im folgenden Beispiel:
SELECT
FROM
WHERE
AND
*
tab_1
0
t1.col1
t1
=
=
1
‘I hope we never execute this’
Bild-18: Eliminieren von „totem Code“
Die Query liefert immer 0 “rows” zurück. DBMS-Systeme könnten die gesamte
WHERE-Klausel ignorieren. Aber: Nicht alle tun dies. - Ein Test ergab, dass die
Query-Formulierung
SELECT
FROM
*
tab_1
t1
ohne WHERE-Klausel einen Performancegewinn von ca. 5/8 gegenüber einer
Klause, wie der oben gezeigten, brachte.
++49 8106 994390
Seite 40 von 127
Nun sind die beiden Queries ja nicht identisch. Dennoch sollte eine Query, die 0
„rows“ liefert schneller sein, als eine, die ein Resultat mit mehreren „rows“ abzugeben
hat und zudem einen TS Scan durchführen muss.
Das Beispiel zeigt, dass nicht alle DBMS „false“-Bedingungen und ihre abhängigen
Konditionen in der PREPARE-Phase eliminieren.
Dennoch sind die DBMS ziemlich hartnäckig im Eliminieren von „always-true“
Konditionen. Beispielsweise kann man die Rückgabe von Werten mit geringer
Präzision steuern, indem man eine zusätzliche Bedingung in die WHERE-Klausel mit
aufnimmt.
Beispiel:
SELECT
FROM
WHERE
OR
*
tab_1
t1
( 77 / 10 = 7.7 AND t1.col1 / 10 = 7.7 )
( 77 / 10 = 7 AND t1.col1 * 10 = 77 )
Bild-19: Zusätzliche Bedingung in der WHERE-Klausel – Beispiel
Der Performancegewinn kann in diesem Fall wiederum bis zu 5/8 betragen.
Natürlich kann es zunächst keine gute Idee sein, redundanten Code in ein SQLStatement einzubauen.
Aber:
Angenommen eine Spalte ist eine „indizierte NOT NULL“-Spalte. Dann
könnte man aber das folgende Statement wie folgt transformieren:
SELECT
*
SELECT
FROM
WHERE
*
FROM tab_1
t1
In:
tab_1 t1
indexed_column > 0
Bild-20: Zusätzliche Bedingung in der WHERE-Klausel mit „UNIQUE NOT NULL“ – Beispiel
Dies wiederum veranlasst DB2 beispielsweise zu einer Suche nach einem
passenden und möglicherweise über einen Index.
++49 8106 994390
Seite 41 von 127
4.3.3
Zusammenfassen von Konstanten („constant folding“)
Jeder, der schon einmal C programmiert hat weiß, dass der Ausdruck x = 1+1-1-1 zu
x=0 zum Compile-Zeitpunkt umformuliert wird. Da mag es überraschen, dass die
SQL-fähigen DBMS Systeme folgende Ausdrücke nicht ebenfalls auf die gleiche
Weise behandeln:
… WHERE
col1
… WHERE
5
… WHERE
col1
… CAST ( 1 AS INTEGER )
… WHERE
‚a’
+ 0
+ 0.0
IN ( 1, 3, 3 )
|| ‚b’
Bild-21: Zusammenfassen von Konstanten in der WHERE-Klausel – Beispiel
Findet man in altem SQL-Code solche Formulierungen, dann lasse man sie so wie
sie sind. Sie sind eventuell bzw. sicher geschrieben, um Indexzugriffe zu verhindern,
den Datentyp einer „result column“ zu ändern, eine Limitierung in der Zeilengröße zu
überwinden oder – na ja.
Wie auch immer, die Empfehlung für Transformation lautet hier:
… WHERE
a - 3
= 5
… WHERE
a
= 8
nach
umzuwandeln.
4.3.4
“case-insensitive” Suchen
Die meisten Datenbanken unterscheiden bei der Suche nach Objekten und Daten
zwischen Groß- und Kleinschreibung (MS-Access beispielsweise nicht). Die
meisten User verwenden für die so genannte „case-insensitive“ Suche die Funktion
UPPER.
Dies kann jedoch schon in dem Augenblick ein Fehler sein, wo mit anderen
Buchstaben als denen der reinen Lateinschrift gearbeitet wurde.
Beispiel:
… WHERE
… UPPER( ‘résumé’ )
Bei der Transformation geht Information verloren. Die obige Anweisung liefert
RESUME als Ergebnis. Und das ist die Bedeutung des Wortes von „curriculum
vitae“(engl.) zu „starte erneut“. So scheint es besser, die Funktion LOWER zu
verwenden, da so keine Information verloren gehen kann.
Beispiel:
… WHERE
LOWER(col1) = ‘résumé’
Bild-22: „case insensitive“ Suche – Beispiel
++49 8106 994390
Seite 42 von 127
Ist man unschlüssig, ob Daten in Groß- oder Kleinschreibung bzw. sogar gemischt
auf der Datenbank abgespeichert sind, so sollte man grundsätzlich Funktionen auf
Spalten in der WHERE-Klausel vermeiden.
Beispiel:
WHERE
OR
col1 = ‘SMITH’
col1 = ‘Smith’
Diese Formulierung ist immer noch langsam. Die Empfehlung ist, das DBMS nur
dann zum „case-sensitive search“ zu veranlassen, wenn dies erforderlich ist.
WHERE
OR
col1 = ‘SMITH’
( ‘SMITH’ <> ‘Smith’ AND
col1 = ‘Smith’ )
Dies bringt eine ungefähre Verbesserung der Performance von geschätzt ca. 3/8.
4.3.5
„Sargability“
Die ideale Form eines SQL-Prädikats hat die Form:
<column> <vergleichsoperator> <literal>
IBM nennt solche Prädikate „sargable“. SARG ist eine Abkürzung für „Search
ARGument. Später wurde die Bedeutung auf „kann im Index gesucht werden“
erweitert. Wie auch immer: Es ergibt sich daraus eine Empfehlung die heißt:
Die linke Seite eines Such-Prädikates sollte möglichst ein einfacher
Spaltenname sein; die rechte Seite sollte einen einfachen Suchbegriff enthalten!
Dem folgend transformieren alle DBMS die folgenden Ausdrücke:
5
=
<col1> - 3 =
<col1>
- <col2>
nach:
nach:
<col1> =
<col1> =
5;
- <col2> + 3;
Bild-25: „Sargability“
Dies kann zu einem Performanceerfolg von ca. 4/8 Verbesserung führen.
++49 8106 994390
Seite 43 von 127
4.3.6
"Join transitive closure"
Auch die Methode des "transitive closure" zum Verbinden von Spalten, stellt eine
von SQL unabhängige Methode der Umformung von Queries dar. Sie ist eine der
Methoden der mathematischen Logik, logische Ausdrücke aufzulösen und zu
berechnen, ohne die Logik zu verändern.
Sie könnte auf der Optimizer-Ebene jederzeit implementiert werden
Die folgende Query spezifiziert einen "equi-join" auf t1.c11 und t2.c21, und einen
"equi-join" auf t2.c21 und t3.c31:
select
from
where
and
and
*
t1, t2, t3
t1.c11 = t2.c21
t2.c21 = t3.c31
t3.c31 = 1
Ohne "join transitive closure" wäre die einzige richtige Reihenfolge der Tabellen (t1,
t2, t3), (t2, t1, t3), (t2, t3, t1) und (t3, t2, t1). Fügt man die Joins auf t1.c11 =
t3.31hinzu, wird der Optimizer versuchen, die Liste der Möglichkeiten zum Join um
folgende Aspekte zu erweitern: (t1, t3, t2) und (t3, t1, t2).
Über die "transitive closure" Regel kann die Bedingung t3.c31 = 1 auch auf die "join
columns" von t1 und t2 angewendet werden.
"Transitive closure" wird normalerweise ausschließlich auf "equ-joins" angewendet.
"Join transitive closure" wird nicht verwendet für:
•
"Non-equi-joins"; beispielsweise, t1.c1 > t2.c2
•
Equi-joins die einen Ausdruck beinhalten; z.B. t1.c1 = t2.c1 + 5
•
Equi-joins mit einer or Klausel
•
Outer joins, z.B. t1.c11 <> t2.c2 oder left join oder right join
•
Joins über "subquery" Grenzen hinweg
•
Joins zum Prüfen von "referential integrity" oder der „with check
option“ auf "views"
•
Spalten mit "incompatible datatypes"
++49 8106 994390
Seite 44 von 127
4.4
DB2 SQL und Performance
Folgende Empfehlungen zur Konstruktion von SQL-Queries gelten immer dann,
wenn eine der Forderungen Performance heißt. Die Aufzählung ist nicht unbedingt
komplett, weist aber den Weg und die Methode zum Tuning von SQL.
4.4.1
Grundsätzliche Empfehlungen zu DB2-SQL
1. Holen Sie das absolute Minimum an Daten in die DB2 „Engine“.
2. Filtern Sie alle „non-qualifying“ Daten aus der Resultatsmenge
3. Geben Sie nur das absolute Minimum an Daten an den
„requestor“ zurück
4. Erledigen Sie die Verarbeitung in der kleinsten Menge
funktionaler SQL-Statements
5. Erledigen Sie soviel Funktionalität wie möglich im SQLStatement (nicht im Programm)
Bild-26: DB2 SQL: Grundsätzliche Empfehlungen
Alle diese Maßnahmen führen dazu, dass SQL letztlich mit einem Minimum an
Ressourcen-Verbrauch ablaufen kann.
Beispiel:
DECLARE
testcurs1 CURSOR
SELECT
abtnr
,
abtname
,
abtltnr
FROM
abteilung
.....
FETCH testcurs1
INTO :hvn, :hvm :hvo
IF
sqlcode = 0 THEN
DO
LOOP
DECLARE
testcurs2 CURSOR
SELECT
persnr
FROM
mitarbeiter
WHERE
persnr = :hvo
AND ausbst > 20
...........
FETCH testcurs2
INTO :hv3
...........
IF
sqlcode = 0 THEN <fetch erneut>
ELSE
<loop erneut>
endif
endloop
endif
.........
++49 8106 994390
Seite 45 von 127
… das ist so nicht optimal !!!!
Subselect:
SELECT
,
FROM
WHERE
abtnr
abtname
abteilung
abtltnr IN
( SELECT
FROM
WHERE
persnr,
mitarbeiter
ausbst > 20 )
JOIN:
SELECT
,
FROM
,
WHERE
AND
abtnr
abtname
abteilung
mitarbeiter
abtltnr =
ausbst > 20
persnr,
Bild-26: Evaluation eines „result sets“ – Beispiel – 1
4.4.1.1 Suche die kleinste „row“-Menge
Nur die Daten sollten den „requestor“/Programm erreichen, die unbedingt gebraucht
werden. Selten kommt es vor, dass alle Daten in ein Programm zu übergeben sind.
Häufig ist dies ein Problem des physischen Design, weil Indexes fehlen oder aus
anderen Gründen, die in der Implementierung liegen. Ein einfaches Beispiel wird im
Pkt. 4.4.1.2 ff geliefert. Oft wird ähnliches SQL von Code-Generatoren erzeugt.
4.4.1.2 Lies nur die Spalten, die wirklich benötigt werden
Beispiel:
SELECT
,
,
FROM
WHERE
kd#
name
betrag
kunde
kd#
= :hv
Genügen würde:
SELECT
,
FROM
WHERE
name
betrag
kunde
kd#
= :hv
Bild-27: Evaluation eines „result sets“ – Beispiel - 2
++49 8106 994390
Seite 46 von 127
Es gibt schlichtweg keinen Grund, die „kd#“ zu selektieren, wenn sie bereits
bekannt ist – und das muss sie in diesem Fall ja wohl sein, sonst wäre die
„hostvariable“ :hv nicht zu besetzen gewesen(!) Der Zugriffspfad basiert auf diesen
Prädikaten. Um diese zusätzliche Spalte im Resultat unnötigerweise trotzdem
darzustellen, geschieht folgendes:
1.
Lesen aus der „Page“ in den Bufferpool
2.
Übertragen aus dem Bufferpool in eine „User Work Area“
3.
Übergabe an Stage2
4.
Übergabe über XMS an das Programm/QMF/SPUFI
Dies bedeutet in jedem Fall unnötige Beschäftigung der CPU.
Hier ist es nur eine „row“, aber wie viele Transaktionen und „rows“ und wie häufig
passiert dies pro Tag ? – Nicht das einzelne SQL-Statement verursacht Schäden im
DB2-Umfeld, vielmehr der Effekt der multiplen Nutzung lässt die Ressourcen knapp
werden und den Aufwand ansteigen. Sollte das vorgenannte Statement in einem
SQL vorkommen, das mehrere „rows“ zurückgibt und dies noch im Falle einer „child
table“ zur Kundentabelle, dann wird die Situation bedenklich.
Oft tritt der Fall ein, dass das Resultat sortiert angeboten werden muss und die
Spalte ist erforderlich zur Nutzung bestimmter Indexes:
Beispiel:
SELECT ac.kd#
,
ar.rechn#
,
iv.betrag
,
it.teile#
,
it.teile_beschreibung
,
it.teile_kosten
FROM kunde
ac
,
auftrag
ar
,
rechnung
iv
,
teile
it
WHERE ac.kd#
=
:hv1
AND
ac.kd#
=
ar.kd#
AND
ar.rechn#
=
iv.rechn#
AND
iv.teile#
=
it.teile#
ORDER BY
ac.kd#, ar.rechn#, iv.teile#
Es gibt in diesem Fall keine eindeutige Interpretation dessen, was passiert. Die
Query bezieht sich auf eine einzelne „kd#“ und so kann es sein, dass die Spalte „kd#“
im SELECT nicht erforderlich ist.
Es kann auch sein, dass die „kd#“ im ORDER BY SORT-Probleme verursacht. Dies
ist ein Join über 4 Tabellen und die Chance, dass die „kd#‘ im ORDER BY für die
Auswahl bestimmter Indexes zur Vermeidung von SORTs nötig ist, ist ziemlich
unwahrscheinlich.
++49 8106 994390
Seite 47 von 127
Diese Query könnte besser wie folgt geschrieben sein:
SELECT ar.kd#
,
ar.rechn#
,
iv.betrag
,
it.teile#
,
,
it.teile_kosten
FROM auftrag
ar
,
rechnung
iv
,
teile
it
WHERE ar.kd#
=
:hv1
AND
ar.rechn#
=
iv.rechn#
AND
iv.teile#
=
it.teile#
ORDER BY
ar.kd#, ar.rechn#, iv.teile#
Das Ergebnis ist identisch mit der ersten Query. Die Kundentabelle wird allerdings
nicht mehr verwendet und folglich auch nicht im JOIN eingebunden. Der JOIN ist ein
Join über 3 Tabellen anstatt über vier.
Trotz der vielfältigen Möglichkeiten, die aufgrund der vielfältigen Variablen für diese
Query bestehen, könnte man eine weitere Variante für ein besseres SQL wie folgt
vermuten:
SELECT iv.rechn#
,
iv.betrag
,
it.teile#
,
,
it.teile_kosten
FROM auftrag
ar
,
rechnung
iv
,
teile
it
WHERE ar.kd#
=
:hv1
AND
ar.rechn#
=
iv.rechn#
AND
iv.teile#
=
it.teile#
ORDER BY
iv.rechn#, iv.teile#
Hier wurde die „kd#“ aus dem SELECT-Statement genommen, da der Wert bereits in
der Hostvariablen „hv1“ enthalten ist. Es werden viele „rows“ zurückgegeben und so
reduziert die Wegnahme der „kd#“ den Aufwand für die Query enorm.
Zudem befinden sich nur noch zwei Spalten im ORDER BY anstatt drei und sortiert
werden nur noch fünf Spalten anstatt sechs (siehe SELECT-Statement).
Dies wiederum verkleinert die Anforderungen für SORT im Bufferpool und die
I/O Aktivitäten dort. Es könnte jetzt auch der Fall eintreten, dass der SORT-Pool für
das Sortieren ausreicht, während dies beim ersten Beispiel unwahrscheinlich(er)
ist.
++49 8106 994390
Seite 48 von 127
4.4.1.3 Reduziere die Anzahl der SQL-Statements
Jedes SQL-Statement bedeutet einen CALL an das DB2-Subsystem, was wiederum
„overhead““ bedeutet. Dieser „overhead“ hängt eng mit dem Typ des SQL-Statement
zusammen; die folgende Tabelle stammt von Akira Shibamiya und Michael Hannan
(Australien):
SQL-Typ
SQL-Kosten
zus. Kosten
CICS
OPEN und
CLOSE
zusammen
0.33 ms
„task switch“
+ 0.4 ms
FETCH
0.12 ms
„task switch“
+ 0.2 ms
Einzelner
SELECT
0.11 ms plus
„overhead“ wie
im FETCH
„task switch“
+ 0.2 ms
weitere Faktoren
Plus „column
overhead“ Class 2
accounting usw.
Bild-31: Kosten von SQL-Statements(Übersicht)
Diese Kosten basieren auf dem Einsatz eines IBM 3090-1870J Rechners und
müssen für andere Umgebungen entsprechend angepasst werden.
Das Problem sind nicht die Millisekunden, sondern die Tatsache, dass
redundante und unnötige SQLs den „overhead“ sprunghaft ansteigen
lassen.
Die bekanntesten und schlimmsten Szenarios für wiederholte und unnötige SQLFormulierungen findet man in den bekannten Fällen der „Have you seen Jones“
Diskussionen, in denen wiederholte SQL-Statements für dieselbe „row“ abgesetzt,
jedoch immer andere Spalten zurückgegeben werden.
Die Korrektur hierzu ist nicht so einfach, wie sie zunächst scheint. Hier ein Beispiel in
Pseudocode:
SELECT
Name und Adresse von Jones
DO irgendwas
IF fertig mit Jones THEN beende
ELSE SELECT
Telefon und Geburtstag von Jones
DO
irgendwas
ELSE SELECT
Spitzname, Alias, email_adr
ELSE …..
Bild-33: Reduktion der Anzahl von SQL-Statements – Beispiel: „Have you seen Jones“
++49 8106 994390
Seite 49 von 127
Eine “quick & dirty” Lösung ist zu einfach und trifft das Problem nicht genau:
SELECT
name, adresse, telefon, geburtstag, spitzname, alias,
email_adr usw...
Das eigentliche Problem liegt im prozentualen Anteil an Verarbeitung auf jeder
Ebene. Wenn beispielsweise der Übergang auf den zweiten SELECT nur in 10% der
Zeit erfolgt, dann werden alle „columns“ in 90% der Zeit ohne Grund
wiedergewonnen. Wird die Query nur einmal pro Zeiteinheit benutzt, ist dies
akzeptabel. Aber, wenn es sich um eine häufig auftretende Transaktion handelt, dann
hat man ein Problem.
Wenn die normale Verarbeitung durch alle funktionalen Ebenen läuft, dann kann das
SQL-Statement zu einem einzigen zusammengefasst werden, das alle Spalten
zurückgibt.
Manchmal jedoch ist es nicht offensichtlich, wo Redundanz eigentlich entsteht. – Es
gibt keinen Fehler im SQL-Statement, aber es handelt sich um den klassischen Fall
des so genannten „cut and paste“ SQL, was immer zu einem Desaster führen muss.
In unserem Fall existieren in einem Programm drei Cursor. Jeder wird über „cut
and paste“ vom Original aus aufgebaut. Jeder Cursor basiert auf einem eigenen „rule
set“ und gibt ein anderes Resultat zurück. Das SQL dieser drei Cursor wird nun zu
einem SQL SELECT zusammengefasst. Die Unterschiede der folgenden SQL
SELECTs bestehen in der WHERE - Klausel und den Hostvariablen, die die
Ergebnisse aufnehmen:
SELECT Statement1:
SELECT
FROM
WHERE
AND
AND
AND
AND
AND
SUM(cumqty)
INTO :ws-cat-1
kd_konto
ar
teile_typ
=
:ws-item-typ
teile_mengen-klasse =
:ws-amount-class
gruppen#
=
:ws-group-num
teile_jahr
=
:ws-teile-jahr
transaktion_code
IN
(208, 400, 434, 441)
transaktion_datum
=
( SELECT
MAX(transaktion_datum)
FROM
kd_konto )
SELECT Statement2:
SELECT
FROM
WHERE
AND
AND
AND
AND
AND
SUM(cumqty)
INTO :ws-cat-2
kd_konto
ar
teile_typ
=
:ws-item-typ
:ws-amount-class
gruppen#
=
:ws-group-num
teile_jahr
=
:ws-teile-jahr
transaktion_code
IN
( 210, 211, 221, 223, 224, 225 )
transaktion_datum
=
( SELECT
FROM
kd_konto )
++49 8106 994390
Seite 50 von 127
SELECT Statement3:
SELECT
FROM
WHERE
AND
AND
AND
AND
IN
AND
SUM(cumqty)
INTO :ws-cat-3
kd_konto
ar
teile_typ
=
:ws-item-typ
:ws-amount-class
gruppen#
=
:ws-group-num
teile_jahr
=
:ws-teile-jahr
transaktion_code
( 034, 100, 104, 105, 106, 332, 334, 341 )
transaktion_datum
=
( SELECT
FROM
kd_konto )
Bild-34: Redundanz in SQL-Statements – Beispiel
Die Performanceverbesserung entsteht bei der Zusammenfassung der
wiederholten SQL-Statements zu einem „single“-Statement (Cursor), sodass drei
Suchvorgänge auf eine Tabelle zu einem einzigen konsolidiert werden, z. B.:
SELECT Statements(konsolidiert):
SELECT
SUM( CASE
WHEN transaktion_code = 034 ... = 341
THEN cumqty END )
SUM( CASE WHEN transaktion_code = 208 ... = 441
THEN cumqty END )
SUM( CASE WHEN transaktion_code = 210 ... = 225
THEN cumqty END )
INTO
:ws-cat-1, :ws-cat-2, :ws-cat-3
FROM
kd_konto
ar
WHERE
teile_typ
=
:ws-item-typ
AND
:ws-amount-class
AND
gruppen#
=
:ws-group-num
AND
teile_jahr
=
:ws-teile-jahr
AND
transaktion_code
IN
(208, 400, 434, 441, 210, 211, 221, 223, 224, 225,
034, 100, 104, 105, 106, 332, 334, 341 )
AND
transaktion_datum
=
( SELECT
FROM
kd_konto )
Bild-35: Redundanz in SQL-Statements – Lösung
Wiederholbare oder redundante SQL Statements gibt es innerhalb eines Prozesses
normalerweise nicht allzu viele. Aber geht man davon aus, dass ein Prozess
mehrfach wiederholt wird, dann schon, z.B. täglich häufig angestoßene OLTP
Transaktionen.
Man stelle sich vor, eine Aktion, wie z.B. „get current date“ wird pro Tag zwei
millionenmal ausgeführt!
++49 8106 994390
Seite 51 von 127
Man könnte viel CPU-Zeit für Calls sparen, wenn man diese Aktion unterbinden
könnte. Auch wenn es sich um ein 24x7 System handelt, kann man derartige SQLCalls kontrollieren. Die fundamentalen Regeln hierzu lauten:
Make it work !
Make it work right !
Make it work fast !
Standards in Unternehmen verhindern oft die Verwendung eines „single“ SQL
Statements und erlauben lediglich die Nutzung von Cursor-Verarbeitung.
Der zusätzliche Ressourcenverbrauch kann abgeschätzt werden, indem man die o.g.
Zeiten für derartige Aktionen betrachtet, die nur eine oder 0 „rows“ zurückgeben.
OPEN
FETCH
CLOSE
CURSOR curs
curs INTO .....
curs
Gegenüber
SELECT ... FROM <table>;
Bild-36: Unnötiger Ressourcenverbrauch durch CURSOR – Beispiel
In beiden Fällen werden Cursor genutzt. Im zweiten Fall jedoch wird der Cursor
innerhalb von DB2 aufgebaut, geöffnet und geschlossen ohne Kosten für XMS und
zugehörige „entry“- und „exit“-Code-Strecken zu verursachen. Wann immer nur 0
oder 1 „row“ von einem SQL-Statement als Ergebnis erwartet wird, sollte niemals ein
Cursor kodiert werden. In OLTP Transaktionen mit hoher Nutzungsfrequenz würde
so der „overhead“ beachtlich.
Kurz und gut: Ein „single“ SQL-Statement sollte verwendet werden, wenn
1.
Normalerweise nur 0 oder 1 row zurückgegeben wird
2.
Wenn kein SQLCODE –811 möglich ist
3.
Wenn kein unnötiger „overhead“ erzeugt wird
(über „prefetch“ oder „staging“ ....)
Ein weiterer häufig vorkommender Fall sollte an dieser Stelle noch erwähnt werden:
Falls eine OLTP Transaktion viele SQL-Statements enthält, so verursacht sie „task
switch overhead“ für jeden dieser Aufrufe. Für gute Performance ist es besser diese
gesamte Arbeit in eine „stored procedure“ zu verlagern.
Oft können derartige OLTP-SQL-Statements über JOIN, inner/outer, usw.
miteinander kombiniert werden.
Dies verursacht nicht nur innerhalb von DB2 weniger Aufwand, es reduziert die
„task switches“ und den SQL-„overhead“ einer Transaktion.
++49 8106 994390
Seite 52 von 127
4.4.1.4 Kodiere Prädikate, die möglichst selektiv sind
Oft wird versucht nach der Methode vorzugehen: Ich kodiere das SQL und DB2
soll sehen, wie es damit zurechtkommt. Diese Vorgehensweise ignoriert die
Performance-Ansprüche an SQL vollständig. Als Beispiel seien die u.a. Queries
dargestellt, die alle die gleiche Antwort liefern:
Query 1:
SELECT
FROM
WHERE
s.sname
s
s.s#
IN
( SELECT
WHERE
sp.s# FROM sp
sp.p# = ‘P2’ )
Query 2:
SELECT
FROM
WHERE
s.sname
s
EXISTS
( SELECT
WHERE
AND
*
FROM sp
sp.p# = ‘P2’
sp.s# = s.s# )
Query 3:
SELECT
FROM
DISTINCT s.sname
s
INNER JOIN sp
ON s.s#
=
sp.s#
WHERE
sp.p# =
‘P2’
Bild-37: Selektive Prädikate – Beispiel
Leider werden heute immer noch (DB2-)Entwicklungsansätze praktiziert die lauten:
„Bring es durch den Precompiler, wenn es Daten zurückliefert - übergib es an
die Produktion“. Enge Projekttermine forcieren genau diese Vorgehensweise.
Betrachtet man allerdings die o.g. Queries, so ist es auch sehr schwer zu bestimmen,
welche von ihnen die beste Filterung besitzt und am wenigsten „overhead“ erzeugt.
Zum einen hängt die Aussage eng mit dem genutzten DB2 Release zusammen in
deren Historie es immer wieder Verbesserungen zu genau dem vorliegenden
Problem gab. – So kann man zunächst nicht sagen welche Formulierung am besten
ist, zumal man nicht erkennen kann, welche Indexes existieren, wie die Kardinalität
der Daten ist und welche Optimierungstechniken angewendet werden.
Man weiß nur, die Antworten sind identisch, - und das ist nicht genug.
++49 8106 994390
Seite 53 von 127
4.4.1.5 Beachte die Qualität von DB2-SQL Abfragen
Einen Anhaltspunkt für die Qualität einer Query kann schon das Wissen um die
Art, wie DB2 Prädikate verarbeitet, liefern. Dazu muss man die Art und Weise
kennen, mit welcher Gewichtung DB2 SQL-Prädikate verarbeitet:
Schritt-1: Indexable Prädikate werden verarbeitet
A. „matching predicates“ auf Index Spalten werden benutzt, wenn der
Index zugegriffen wird
1. Alle „ist gleich“-Prädikate und IN-Listen mit einem einzigen Wert
2. Alle „range predicates“ und Spalten mit NOT NULL Definition
3. Alle anderen Prädikatstypen
B.
„Indexscreening“ wird eingeleitet
Schritt-2:
Weitere Stage1 Datenprädikate werden als nächstes
angewendet, erkannt nach dem Zugriff auf Datenpages
Schritt-3:
Die Stage2 Prädikate werden auf die zurückgegebenen Daten„rows“ angewendet
Schritt-4:
Alle restlichen Prädikate werden bestimmt und in der
Reihenfolge in der sie in der Query stehen, bearbeitet.
4.4.1.6 Nutze „stage1“-Prädikate
Der beste Garant für gute Performance ist die Nutzung von Stage1-Prädikaten. Die
Tabelle aus dem DB2-Manual (siehe auch Pkt. 6.2.1 ff) zeigt „indexable“ Prädikate.
Dies heißt jedoch nicht, dass solche Statements wirklich einen Index nutzen oder in
Stage1 ablaufen – es müssen auch alle anderen Anforderungen an Stage1 Prädikate
erfüllt sein. Als erstes wird die Syntax des Prädikates zur Entscheidung
herangezogen, ob in Stage1 verarbeitet werden kann. Als zweites wirken Typ und
Länge von Konstanten im Prädikat.
Wann immer, obwohl als Stage1 Prädikat klassifiziert, ein solches Prädikat
NACH einer Join Operation bearbeitet wird, wird es zu einem Stage2 Prädikat.
Alle „indexable“ Prädikate sind Stage1,
aber nicht alle Stage1 Prädikate sind „indexable“!!
++49 8106 994390
Seite 54 von 127
4.4.1.7 Verwende nie generische SQL-Statements
Generische SQL-Statements bedeuten jedenfalls, dass etwas geliefert wird, was
nicht nötig ist. Meist wird SQL nur als „I/O Layer“ verwendet. Das folgende Beispiel
stammt aus der Praxis:
Modulname:
GET_CUSTOMER(ws-acct-#, struktur )
Code:
SELECT *
INTO
WHERE
Programmcode:
GET-CUSTOMER:
CALL get_customer (kdnr, kd_satz)
IF kd_code NOT = x
THEN GO TO GET-CUSTOMER
FROM kunde
:struktur
kd_nr = :ws-kd#
Dieses Beispiel verletzt alle Regeln über die bisher bezüglich SQL Performance
diskutiert wurde. Generisches SQL besitzt Logiken, die nur spezifische Daten oder
Datenzusammenhänge darstellen und die „business-rule“ im Programm
implementiert.
Generisches SQL und generische I/O Layer gehören bei DB2 nicht
zur Kategorie „hochperformante“ Systeme.
4.4.1.8 Vermeide unnötige SORT-Abläufe
Die Forderung nach möglichst wenigen SORT-Vorgängen gilt für alle Anwendungen,
ganz besonders für Systeme, für die höchste Performance erwartet wird. Wann
immer SORTs unvermeidbar sind, sollten sie über passende Indizes unterstützt sein.
Sortvorgänge werden bei DB2 durch folgende SQL-Anweisungen ausgelöst:
•
•
•
•
•
GROUP BY
ORDER BY
DISTINCT
UNION
Join Expressions
Eine Sortierung für Joins ist ein klares Indiz dafür, dass ein passender Index für den
JOIN-Vorgang fehlt. Wird ein SORT nach dem Join erforderlich, kann man nur
hoffen, dass die Ergebnismenge möglichst klein ist. Das schlimmste Szenario ergibt
sich aus einem SORT in einem Cursor von dem das Programm nur einen Teil der
gelesenen Daten verarbeitet. In diesem Fall sollte der SORT-„overhead“ unbedingt
verhindert werden. Es ist klar, dass, wenn im SQL 100 Sätze sortiert, aber nur 10
verarbeitet werden, dies keine Probleme bereitet, solange die Daten über gute
Prädikate spezifiziert wurden. Und: Es gibt keine Regel, die sagt, wann „zu viele“
Daten sortiert werden, da dies ursächlich mit den Parametern des DB2-Subsystems
zusammenhängt.
Aber es gibt einen Punkt, an dem überlegt werden muss, ob Sort als externer
Vorgang oder anders implementiert werden sollten. Die Antwort ist einfach:
In Tests sollte mit dem gesamten erforderlichen Datenumfang
gearbeitet werden, um eine möglichst gute Lösung zu finden.
++49 8106 994390
Seite 55 von 127
4.4.1.9 Sortiere nur die erforderlichen Spalten
Wenn DB2 Daten sortiert, benötigt es die Spalten, nach denen sortiert werden soll,
zweimal im SQL-Statement:
Query-1:
SELECT
FROM
WHERE
AND
ORDER BY
nachname, ort, staat, bundesland, planet
demograf_daten
planet
=
:hplanet
bundesland
=
:hbundesland
planet, bundesland, staat, ort, nachname
Bild-38: Sortieren mit SQL – Beispiel 1
Angenommen es gibt keinen Index auf dieser Tabelle, so kann man relativ schnell
sehen, dass hier ein Problem existiert. Planet und ort sind bereits bekannt. Sind die
fünf Spalten im ORDER BY indiziert, so würden diese beiden Spalten trotzdem nicht
benötigt werden. Die Query könnte wie folgt neu formuliert werden:
Query-2:
SELECT
FROM
WHERE
AND
ORDER BY
nachname, ort, staat
demograf_daten
planet
=
bundesland
=
staat, ort, nachname
:hplanet
:hbundesland
Der Vorteil hierbei ist die Eliminierung von unnötig zurückgegebenen Daten.
In dieser Variante könnte es aber passieren, dass alle „nachname“-Felder aller
Individuen innerhalb eines Bundeslandes sortiert werden. Betrachten wir die
ursprüngliche Query (Query1), so kann man annehmen, dass, wenn ein Index für das
Selektieren der Daten genutzt wird, kein SORT mehr erforderlich ist.
Im Fall „Query1“ sind die Daten unmittelbar innerhalb eines Cursor verfügbar, im Fall
„Query2“ müssen Query und SORT komplett sein, bevor die erste „row“ geliefert
werden kann. Hier noch eine dritte Alternative (ab DB2V6):
Query-3:
SELECT
FROM
WHERE
AND
ORDER BY
nachname, ort, staat
demograf_daten
planet
=
:hplanet
bundesland
=
:hbundesland
planet, bundesland, staat, ort, nachname
++49 8106 994390
Seite 56 von 127
In diesem Fall könnte ein SORT vermieden werden, wenn ein Index über fünf Spalten
zur Wiedergewinnung der Daten in der geforderten Reihenfolge genutzt wird. Es
gäbe keinen „overhead“ für zwei zusätzlich gelesene Spalten, die an ein Programm
übergeben würden. Aber was wenn die o.g. Lösungen nicht für eine Verarbeitung zur
Verfügung stehen ?
Es gibt noch weitere Aspekte, Queries zu untersuchen, wenn SORTs unerlässlich
und zu viele Spalten betroffen sind. Hier ein Beispiel mit GROUP BY, das natürlich
einen SORT-Vorgang anstößt.
Man könnte hier „nested table expressions“ verwenden, um unnötige SORTs zu
vermeiden.
Query-4:
SELECT
FROM
WHERE
GROUP BY
a.abtnr, a.abtname, a.ug_abtnr, count(*)
abteilung a, mitarbeiter m
m.abtnr
=
a.abtnr
a.abtnr, a.abtname, a.ug_abtnr
Die Spalten „abtnr“ und „ug_abtnr“ seien eindeutig für jede Abteilung und somit
müssen sie nicht im GROUP BY mitangegeben werden. In diesem Fall kann man
über „nested table expression“ Zeit einsparen, indem man weniger Daten sortiert,
oder über Indexzugriff gruppiert.
Query-5:
SELECT
FROM
WHERE
a.abtnr, a.abtname, a.ug_abtnr, count_erg
abteilung
a
( SELECT
m.abtnr
AS cnt_abt
COUNT(*)
AS count_erg
FROM
mitarbeiter
m
GROUP BY m.abtnr )
AS temp
a.abtnr
=
cnt_abt
Sollten Sorts in beiden Fällen erforderlich sein, so gibt es folgende Unterschiede:
Query-4:
a.abtnr, a.abtname und a.ug_abtnr sind „sort-key“ und a.abtnr,
a.abtname, a.ug_abtnr und COUNT(*) stellen die
Datenanforderung dar
Query-5:
m.abtnr ist „sort-key“ und m.abtnr und COUNT(*)stellen die
Datenanforderung dar
Die Chancen für Query-5 den SORT innerhalb des Sortpools bewerkstelligen zu
können sind ungleich höher als für Query-4.
Die Schlüsselfelder für einen SORT müssen nämlich in der Regel separat am
Anfang der „row“, die sortiert werden soll, stehen, und damit werden sie zu groß
für einen „incore“-SORT.
++49 8106 994390
Seite 57 von 127
4.4.1.10 Benutze die ON-Klausel für alle JOIN-Prädikate
Grundsätzlich gibt es zwei Formate für „inner join“-Formulierungen. Ein Format ist der
„natural join“ und die andere Möglichkeit ist die Formulierung über die ONBedingung. Es gibt keinen Grund mehr, die „natural join“-Form weiterhin zu
benutzen, vor allem, da die INNER JOIN Ausdrücke neu und damit mit zusätzlichen
Vorteilen seitens des DB2 ausgestattet sind. Das alleine reicht aber nicht aus,
vielmehr ist die Dokumentationsfähigkeit bei der ON-Formulierung besser: Man kann
nicht so leicht ein JOIN-Prädikat vergessen.
Query-1:
SELECT
FROM
WHERE
AND
bundesland, region, v.persnr, gesamt_ums
vertrieb
v
Mitarbeiter
m
v.persnr
=
m.persnr
v.bundesland =
:hbundesland
Query-2:
SELECT
FROM
WHERE
bundesland, region, v.persnr, gesamt_ums
vertrieb v
INNER JOIN mitarbeiter m
ON
v.persnr
=
m.persnr
v.bundesland =
:hbundesland
Bild-42: ON-Klausel als JOIN-Prädikat – Beispiel
In der zweiten Query wird man nur schwerlich ein JOIN-Prädikat vergessen. Die
Verbesserungen der ON-Klausel in DB2V6 waren umfangreich und viele davon
betrafen direkt die Performance. Die Filterung erfolgt direkt beim JOIN-Prozess, nicht
vorher und nicht nachher.
Alle Prädikate können auch in der ON-Klausel angegeben werden.
Ausnahme: „subqueries“
4.4.1.11 Vermeide UNIONs
UNION-Verarbeitung ist teuer, insbesondere, wenn jeder UNION-Block dieselbe
Tabelle verarbeitet. Zusätzlich werden die meisten UNION ohne ALL geschrieben,
was DB2 dazu veranlasst, mehrfach vorkommende Zeilen aus dem Ergebnis
auszuschließen (wie bei DISTINCT) obwohl eventuell gar keine Dubletten existieren.
Wie auch immer: UNION ohne ALL kann SORT-Vorgänge erforderlich machen (siehe
Pkt. 4.4.1.8 ff). Outer Join Formulierungen lassen die meisten UNIONFormulierungen überflüssig werden.
Zusätzlich dazu können viele UNION-Angaben durch die Einführung von CASEExpressions ersetzt werden. UNION-Formulierungen stellen für den Optimizer eine
schwierige Semantik dar. Sie kann nur selten gut optimiert bzw. automatisch
umformuliert werden.
++49 8106 994390
Seite 58 von 127
Beispiel (eine Query mit 4 UNION-Blöcken):
Query-1:
SELECT
FROM
WHERE
AND
GROUP BY
se.typ, cj.code, cj.gruppe, cj.tdatum,
dm.klasse, sum(cj.menge), 0, 0, 0
cj
INNER JOIN se
ON
cj.typ =
se.typ
INNER JOIN dm
ON
cj.klasse =
dm.klasse
INNER JOIN tc
ON
cj.code =
tc.code
tc.basis_code =
‘I’
cj.tdatum
=
( SELECT
MAX( cj2.tdatum)
FROM
cj
AS
cj2
WHERE
<correlated predicates> )
se.typ, cj.code, cj,gruppe, cj.tdatum, dm.klasse
UNION ALL
SELECT
FROM
WHERE
AND
GROUP BY
dm.klasse, 0, sum(cj.menge), 0, 0
cj
INNER JOIN se
ON
cj.typ =
se.typ
INNER JOIN dm
ON
cj.klasse =
dm.klasse
INNER JOIN tc
ON
cj.code =
tc.code
tc.basis_code =
‘DE’
cj.tdatum
=
( SELECT
MAX( cj2.tdatum)
FROM
cj
AS
cj2
WHERE
UNION ALL
SELECT
FROM
WHERE
AND
GROUP BY
dm.klasse, 0, 0, sum(cj.menge), 0
cj
INNER JOIN se
ON
cj.typ =
se.typ
INNER JOIN dm
ON
cj.klasse =
dm.klasse
INNER JOIN tc
ON
cj.code =
tc.code
tc.sec_code
=
‘I’
cj.tdatum
=
( SELECT
MAX( cj2.tdatum)
FROM
cj
AS
cj2
WHERE
UNION ALL
SELECT
FROM
WHERE
AND
GROUP BY
ORDER BY
dm.klasse, 0, 0, 0, sum(cj.menge)
cj
INNER JOIN se
ON
cj.typ =
se.typ
INNER JOIN dm
ON
cj.klasse =
dm.klasse
INNER JOIN tc
ON
cj.code =
tc.code
tc.sec_code
=
‘DE’
cj.tdatum
=
( SELECT
MAX( cj2.tdatum)
FROM
cj
AS
cj2
WHERE
2, 3, 4
++49 8106 994390
Seite 59 von 127
Eine Lösung kann es über die Verwendung von CASE-Expressions geben:
Lösung:
SELECT
,
,
,
,
FROM
WHERE
AND
GROUP BY
ORDER BY
se.typ, cj.code, cj.gruppe, cj.tdatum, dm.klasse
SUM (CASE WHEN tc.basis_code =
‘I’
THEN
cj.menge
END)
SUM (CASE WHEN tc.basis_code =
‘DE’
THEN
cj.menge
END)
SUM (CASE WHEN tc.sec_code =
‘I’
THEN
cj.menge
END)
SUM (CASE WHEN tc.sec_code =
‘I’
THEN
cj.menge
END)
cj
INNER JOIN se
ON
cj.typ =
se.typ
INNER JOIN dm
ON
cj.klasse =
dm.klasse
INNER JOIN tc
ON
cj.code =
tc.code
(
tc.basis_code
IN
( ‘I’, ‘DE’ )
OR tc.sec_code
IN
( ‘I’, ‘DE’ ))
cj.tdatum
=
( SELECT
MAX( cj2.tdatum)
FROM
cj
AS
cj2
WHERE
2, 3, 4
Bild-43a: CASE und INNER JOIN anstatt UNION – Beispiel
Ein UNION mit NOT EXISTS kann auch als Alternative mit einem OUTER JOIN
substituiert werden:
SELECT
a.abt_nr
,
a.abt_name
,
m.nachname
FROM
v_mitarbeiter
M
,
v_abteilung
A
WHERE
a.abt_ltnr = m.pers_nr
UNION ALL
SELECT
a.abt_nr
,
a.abt_name
,
'** UNBEKANNT **'
FROM
v_abteilung
A
WHERE NOT EXISTS
( SELECT
*
FROM
v_mitarbeiter X
WHERE
x.pers_nr = a.abt_ltnr
)
ORDER BY 2;
----------------------------------------------- ist identisch mit….
----------------------------------------------SELECT
a.abt_nr
,
a.abt_name
,
VALUE(m.nachname,'** UNBEKANNT **')
FROM
v_abteilung A LEFT OUTER JOIN v_mitarbeiter M
ON a.abt_ltnr = m.pers_nr
ORDER BY 2;
Bild-43b: OUTER JOIN anstatt UNION mit EXISTS – Beispiel
++49 8106 994390
Seite 60 von 127
4.4.1.12 Nutze JOINs anstatt „subqueries“
Diese Aussage gilt seit den Anfängen von DB2: Joins übertreffen „subqueries“ in
Hinsicht auf die Performance beinahe in jedweder Konstellation. Dabei ist das
WARUM relativ einfach zu verstehen und auch, wie der Ablauf erfolgt. Es gibt im
JOIN immer mehrere Alternativen bei der Wahl des Zugriffspfads und DB2 kann auch
zwei und mehr Tabellen simultan bearbeiten.
Aber warum findet man dann so viele „subqueries“ anstelle von Joins in
Applikationen?
Joins werden gemeinhin vermieden, weil es Gerüchte über die schlechte Performance von Joins oder sogenannte „inhouse“-Standards gibt, die Joins „verbieten“.
Zunächst gibt es zwei Formen von „subqueries“: „correlated“ und „non-correlated“
Subqueries.
Query-1(„non-correlated“):
SELECT
ta.col1, ta.col2
FROM
ta
WHERE
NOT EXISTS ( SELECT 1
WHERE
FROM tb
tb.col1 =
:hv1 )
Query-2(„correlated“):
SELECT
ta.col1, ta.col2
FROM
ta
WHERE
ta.col3
=
‘ABC’
AND
NOT EXISTS ( SELECT 1
WHERE
FROM tb
tb.col2 =
ta.col2 )
Bild-44: „correlated“ und „non-correlated“ Subqueries – Beispiel
Es gibt allerdings auch Situationen, in denen eine “correlated” Subquery bessere
Performance aufweist als ein Join.
Dies ist begründet in der in DB2 implementierten „lookaside“ Technik für
„caching“. Der Cache behält das Ergebnis einer „subquery“ und wenn dieselben
Werte erneut an die „subquery“ gegeben werden sollen, können die gleichen
korrekten Werte aus dem Cache verwendet werden.
Grundsätzlich sollte schon aus Performance-Gründen ein Join kodiert werden. Es
ist zwar nicht offensichtlich, aber auch NOT EXISTS und NOT IN können als Joins
formuliert werden. Manchmal konvertiert DB2 „non-correlated“ IN-Subqueries in
einen Join, doch vor der DB2 Version 6 führte das nicht unbedingt zu besseren
Ergebnissen in der Performance.
Normalerweise leiten ein EXISTS mit „correlated“ Subquery und IN Prädikate mit
„non-correlated“ Subqueries eine Stage2 Verarbeitung ein. Werden solche Queries
in Joins konvertiert, so kann die „inner table“ die Prädikate auf Stage1 verarbeiten.
Die Regel hierzu ist einfach:
Ein Join ist einer Subquery - in der Regel - immer vorzuziehen
++49 8106 994390
Seite 61 von 127
4.4.1.13 Komplemetärmengen bei „outer joins“
Häufig wird bei der Verwendung von „outer joins“ nicht nach der „outer join“ Menge,
sondern nach der echten Komplementärmenge gesucht. Bis heute wurde eine
derartige Anfrage mit „NOT EXISTS“ – Formulierungen getätigt, wie das folhende
Beispiel zeigt:
EXPLAIN PLAN SET QUERYNO = 646110 FOR
SELECT DISTINCT A.TESNR
,A.PARTNR
,A.UKZEMPF
,A.FILEMPF
,B.AONR
FROM Tabelle1 A, Tabelle2 B
WHERE A.DATWE = '1900-01-01'
AND NOT EXISTS (SELECT 1
FROM Tabelle3 C WHERE C.TESNR = A.TESNR
AND C.UKZ = :WEQUIT.UKZ
AND C.FIL = :WEQUIT.FIL)
AND A.ABRUFNR = B.ABRUFNR
AND A.ABRUFPOS = B.ABRUFPOS
AND A.UKZEMPF = B.UKZEMPF
AND A.FILEMPF = B.FILEMPF
AND A.PARTNR
= B.PARTNR
AND B.ABTEMPF = :VAPOS.ABTEMPF
AND A.UKZEMPF = :VAWAPO.UKZEMPF
AND A.FILEMPF = :VAWAPO.FILEMPF
AND A.DATWA
BETWEEN :U-WADATUM-VON AND :U-WADATUM-BIS
WITH UR
Bild-44a: „outer join“ und Komplementärmenge (mit NOT EXISTS) – Beispiel
Einfacher und effizienter kann man diese Query aber über die direkte Formulierung
der Komplementärmenge laufen lassen:
EXPLAIN PLAN SET QUERYNO = 646113 FOR
SELECT DISTINCT A.TESNR
,A.PARTNR
-,A.UKZEMPF
-- IST DAS NOETIG ??
-,A.FILEMPF
-- SIEHE WHERE KLAUSEL
,B.AONR
FROM DB.Tabelle1 A INNER JOIN DB.Tabelle2 B
ON A.ABRUFNR = B.ABRUFNR
AND A.ABRUFPOS = B.ABRUFPOS
AND A.UKZEMPF = B.UKZEMPF
AND A.FILEMPF = B.FILEMPF
AND A.PARTNR
= B.PARTNR
LEFT OUTER JOIN DB.Tabelle3 C
ON NOT ( C.TESNR = A.TESNR
AND C.UKZ = ?
--:WEQUIT.UKZ
AND C.FIL = ? ) --:WEQUIT.FIL
WHERE A.DATWE = '1900-01-01'
.....
WITH UR
Bild-44b: „outer join“ und Komplementärmenge (ohne NOT EXISTS) – Beispiel
Dabei ist es wichtig darauf zu achten, "leere Mengen" und "NULLS" dennoch
addressieren zu können!
++49 8106 994390
Seite 62 von 127
Dies dokumentiert die DSN_STATEMNT_TABLE in DB2:
QUERY
NO
-------646110
646113
PROGNAME
-------DSQBESQL
DSQBESQL
GROUP
MEMBR
-----
CST
EXPLAIN
TIME
---------2005-11-28
2005-11-28
STMT
TYPE
---SELE
SELE
CAT
GRY
--B
B
PROC
MS
----39
5
PROC
SU
----277
30
REASON
----------HOST VARIAB
HOST VARIAB
Aber nicht vergessen: Alles das ist eine Option, die davon ausgeht, dass die Queries
über passende Indizes und aktuelle Statistiken optimal unterstützt werden!
4.4.1.14 Kodiere die selektivsten Prädikate zuerst
Oft hört man, sich um die Statement-Reihenfolgen in den Prädikaten zu kümmern sei
Zeit- und Geldvergeudung. Das ist nicht richtig. In vielen SQL Statements werden
Prädikate auf dieselbe Verarbeitungsebene definiert.
Die Verarbeitung aber erfolgt von oben nach unten. Je eher ein Prädikat verarbeitet
werden kann, desto effizienter ist es. So sind Stage1 Prädikate besser als Stage2
Verarbeitung, da sie früher erkannt und bearbeitet werden können.
Alle SQL Queries sollten so kodiert sein, dass die am meisten restriktiven
Prädikate zuerst erkannt werden können. Damit verringern sich die Verarbeitungskosten für spätere Verarbeitungsschritte. Dies gilt insbesonders für die Nutzung von
„dynamic SQL“ in JAVA, VBA, VBA.NET usw., die in Programmen dynamische
Strings erzeugen, in denen sich „native“ SQL befindet.
Das „dynamic cache“ „feature“ von DB2 kann in solchen Fällen sehr hilfreich sein, da
damit mehrfache „prepares“ für dasselbe, häufig genutzte SQL – Statement
vermieden werden kann. Dazu müssen aber die dynamischen SQL-Statements
Parameter Marker („?“) verwenden und nicht die Hostvariablen direkt .
Public static void main()
{
Connection con = DriverManager.getConnection()
PreparedStatement stmt = null;
stmt = con.prepareStatement („SELECT name FROM person WHERE id = 25“);
ResultSet rs = stmt.executeQuery();
rs.print();
}
Try
{
Class.forName("COM.ibm.db2.jdbc.app.DB2Driver").newInstance();
Connection con = DriverManager.getConnection(“jdbc:db2:db “);
PreparedStatement stmt = null;
stmt = con.prepareStatement (“SELECT name FROM person WHERE id = ?“);
stmt.setInt (1, 25);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
System.out.println (rs.getString(1));
}
Catch (SQLExeption se){ System.out.println(“Error:“ + se)}
Bild-44c: „JDBC“ und Parameter Marker – Beispiel
++49 8106 994390
Seite 63 von 127
4.4.1.15 Nutze erprobte Methoden zur Existenzprüfung(EXISTS)
Hierzu gibt es eine Reihe von Untersuchungen, u.a von M. Hannan (Australien). Es
gibt diverse Lösungen zu diesem Problem und man muss genau hinsehen, um die
richtige Methode zu wählen. In vielen Fällen ist die folgende Formulierung die beste,
vorausgesetzt, dass KEYCOL die erste Spalte in einem der Indizes auf der Tabelle
TAB darstellt:
Query-Beispiel-1(im Programm):
SELECT
FROM
WHERE
‘1’
INTO
tab
tab.scol01
:hostvar1
=
:hostvar2
IF sqlcode = 0 OR sqlcode = -811 then Existenzpruefung positiv
IF sqlcode = +100 then Existenzpruefung negativ
Bild-45: Existenzprüfung im Programm – Beispiel-1
Unsicher wäre ein Test auf die Hostvariable :hostvar1, da bei einem SQLCODE –
811 die Inhalte der Hostvariablen nicht vorhersagbar wären. Dies gilt für alle
negativen SQLCODEs! Ist als Zugriffspfadverarbeitung „list prefetch“ bestimmt, hat
man das Problem, dass der „list prefetch“ erst beendet sein muss, bevor man ein
Resultat erhält. Dies ist nicht nur schlecht für die Antwortzeiten, sondern auch
belastend für die DB2-Ressourcen.
Was aber ist im Falle eines „sequential prefetch“? - Der „sequential prefetch“
besteht aus einer Menge von I/O-Operationen, deren erster Schritt synchron abläuft
und der die Antworten unmittelbar an die einzelne SQL-Query zurückgibt. Das
Problem wäre ein ein TS-Scan, der nur zur Existenzprüfung diente. Das wiederum
hängt davon ab wessen Existenz geprüft wird. Sucht SQL in einer kleinen „Code“Tabelle, mit 100 „rows“ pro DB2 Page, dann ist ein „sequential prefetch“ durchaus in
Ordnung, wenn kein Index zur Verfügung steht.
Hierfür generelle Regeln aufzustellen ist schwer, da sich die Fälle sehr unterschiedlich darstellen können. Dennoch: Folgende DB2-Aktionen sollten nicht zur
Existenzprüfung genutzt werden:
•
•
•
•
•
•
„Sequential prefetch“
„Dynamic prefetch“
„List prefetch“,
jede Art von Materialisierung
Jede Art von SORTs
„Hybrid“ und/oder „merge/scan“ Joins
Diese Funktionen erzeugen nachweislich einen zu großen „overhead“ ...
Es gibt einen anderen Weg, Existenzprüfungen effizienter zu gestalten, falls kein
adäquater Zugriffspfad ausgewiesen wurde: Man schreibe die Query als Cursor und
nutze die Anweisung OPTIMIZE FOR 1 ROW, idealerweise mit UR als Isolation
Level.
Das funktioniert auf jeden Fall, benötigt aber drei Calls auf das DB2 und das
bedeutet extra durchlaufenen Code, der aus Performance-Gründen vermieden
werden sollte. Eine weitere Query-Formulierung wird oft empfohlen, ist aber nur in
wenigen Fällen wirklich nützlich:
++49 8106 994390
Seite 64 von 127
Query-Beispiel-2:
SELECT
‘1’
INTO
:hostvar1
FROM
sysibm.sysdummy1 a
WHERE
EXISTS
( SELECT
‘1’
FROM
tab
WHERE
a.ibmreqg >= ‘ ‘
AND
<filter/matching prädikate> )
Bild-46: Existenzprüfung im Programm – Beispiel 2
Diese Query-Formulierung wird von einem „single“ SQL Statement normalerweise
übertroffen. Also kann die Regel nur lauten:
Man nutze „single“ SQL zur Existenzprüfung. Es ist jedoch
sicherzustellen, dass keine Zugriffsprobleme auftreten können.
Aber auch das Beispiel-2 ist nicht unbedingt zu empfehlen und es gibt Fälle, in
denen die Lösung mit der Cursor-Formulierung und der OPTIMIZE-.Klausel das
„single“ SELECT nicht übertreffen kann, weil „nonmatching index scans“ zu
lang sind. Eine weitere Empfehlung lautet also:
Der Prozess sollte nicht für ein und dieselbe Prüfung wiederholt
werden müssen.
Beispiel:
Ein Batch-Prozess mit im Programm implementierter RI kann durchaus verkürzt
werden, vor allem, wenn es sich nicht um den normalen, großen INSERT in eine
„child-table“ einer einzelnen „parent-table“ handelt. In diesem Fall schicken mehrere
Queues Daten an den INSERT-Prozess, der eine einzige Existenzprüfung durchführt.
Wenn also wiederholt gleiche Objekte von den Queues kommen, wird ein unnötiger
„existence check“ mehrfach durchgeführt. Die Lösung ist einfach: Ein separater
„existence check“ wird für jede Queue durchgeführt, um zu sehen, ob der nächste
Eintrag auf der Queue dieselbe „parent row“ hat wie der aktuelle.
++49 8106 994390
Seite 65 von 127
4.4.1.16 Subqueries sind zu „tunen“
Obgleich es keine allgemeinen Regeln für die Entscheidung gibt, ob und wie
Subqueries einzusetzen sind, hier eine generelle Hilfestellung:
Wenn möglich, ist es besser einen Join als eine Subquery nutzen, da
eine Subquery die Reihenfolge, in der auf Tables zugegriffen wird, streng
festlegt. Mit einem Join wählt DB2 die jeweils bessere TableZugriffsreihenfolge.
Wenn wirksame Indexes auf Tables in einer Subquery zur Verfügung
stehen, ist eine Correlated Subquery die wirksamste Art einer möglichen
Subquery.
Wenn keine passenden Indexes auf Tables in der Subquery vorhanden
sind, dürfte eine Non-Correlated Subquery der wirksamste Zugriff sein.
Wenn in einer Query mehrere Subqueries vorkommen, stellt man sicher,
dass die Subqueries in einer sinnvollen Reihenfolge angelegt sind.
Betrachten wir folgende Query auf die Tabelle MY_TABLE:
SELECT
FROM
WHERE
AND
*
MY_TABLE
TYPE
PARTS
IN (subquery 1)
IN (subquery 2);
Bild-47: Tuning von Subqueries – Beispiel
Unter der Voraussetzung, dass Subquery 1 und Subquery 2 vom selben Typ von
Subquery (correlated bzw. non-correlated) sind, wird DB2 die Subqueryprädikate in
der Reihenfolge auflösen, in der Sie in der WHERE-Bedingung angelegt sind.
Annahme:
Subquery 1 schließt 10% aller Rows aus, Subquery 2 schließt 80% aller Rows aus.
Und: Das Prädikat in Subquery 1 (P1) wird 1000 mal ausgeführt, und das Prädikat in
Subquery 2 (P2) wird im Anschluss daran 900 mal ausgeführt, was eine Summe von
1900 Prädikat-Prüfungen ergibt. Wenn man die Reihenfolge der Subqueries
verändert, wird P2 zwar ebenfalls 1000 mal ausgeführt, jedoch nur noch mit 20% der
Daten aus der Gesamtmenge, was sich in lediglich 200 Prüfungen in P1 niederschlägt. Die Anzahl der Prädikat-Prüfungen sinkt somit auf 1200.
Es scheint also die Reihenfolge P2 und dann P1 günstiger als die Reihenfolge P1
und dann P2 - unter der Voraussetzung, dass der Zeitaufwand je Prädikat gleichartig
ist. Ist der Zeitaufwand der Prädikate sehr unterschiedlich, müssen die Alternativen
im Einzelnen geprüft werden.
Wenn man sich nicht sicher ist, muss man die Funktion EXPLAIN auf die Query und
auf beide Variationen mit Correlated und einer Non-Correlated Subquery anwenden.
Nach Prüfung des EXPLAIN-Ergebnisses erfolgt die Untersuchung der vorliegenden
Struktur der Daten und damit sollte man in der Lage sein, das günstigste Statement
zu bestimmen.
Dieses Vorgehen kann für alle Typen von Prädikaten angewandt werden, da
Subquery Prädikate über das Potential verfügen, ein Vielfaches von Prozessor- und
I/O-Zeit gegenüber anderen Prädikatsformulierungen zu verschlingen. Man prüft
deshalb immer die Reihenfolge der Subqueries in einem SQL-Statement.
++49 8106 994390
Seite 66 von 127
Aber Achtung:
DB2 führt alle Non-Correlated Subquery-Prädikate vor den
Correlated Subquery-Prädikaten aus, unabhängig von deren
Reihenfolge in der Query.
Zum Tuning von Subqueries gibt es zusammenfassend folgende Empfehlungen:
1.
Vergleich der Subquery-Aufwände mit einer gleichwertigen JOIN-Lösung
2.
Reduktion des "subquery-nestings". Die Subqueries werden in der
Reihenfolge ihrer
Definition ausgeführt. Ggf. muss hierfür die Reihenfolge geändert werden falls logisch möglich
3.
anstatt einer "non-correlated" Subquery sollte womöglich
eine "correlated" Subquery
4.
verwendet werden
5.
Die Effizienz-Reihenfolge bei Subqueries lautet wie folgt:
a)
b)
c)
Correlated Subquery mit effizientem Index
"non-correlated" Subquery
Correlated Subquery ohne Index
Bild-48: Tuning von Subqueries – Zusammenfassung
4.4.1.17 Vermeide alles, was nicht unbedingt notwendig ist
So lautet die letzte Regel für Performance; d.h. man sollte nicht große Datenmengen
von DB2 lesen, wenn man nur 10 Zeilen für das Ergebnis braucht. Oft sind dies
Dinge, die man nur durch „tracing“ bemerken kann. Aber man bedenke: „Overhead“
stiehlt anderen Usern/Tasks die Ressourcen und die Beseitigung solcher
„Ressourcenvernichter“ kann zu erhöhtem Transaktionsdurchsatz insgesamt führen.
Cursor in Online-Programmen sollten immer eine Klausel
OPTIMIZE FOR 10 ROWS haben.
Anmerkung:
Ein „List prefetch“ wird oft als schlecht eingestuft. Er kann aber
sowohl gut als auch schlecht sein, wie jedes beliebige andere
„feature“. In OLTP-Umgebungen ist ein „list prefetch“ meist von
Vorteil. Die Aussage ist stark vom Datenumfang und der Art der
Datennutzung abhängig.
++49 8106 994390
Seite 67 von 127
4.4.2
Modifikation von SQL-Statements
Eine Modifikation von SQL-Statements im Sinne einer verbesserten Performance und
Verarbeitungsfähigkeit können das DB2 selbst oder der User vornehmen.
Nicht berücksichtigt ist im Folgenden die Fähigkeit von DB2 Version 8 „automatic
query rewriting(AQR)“ über den Einsatz von MQT’s („materialized query tables“) zu
betreiben.
4.4.2.1 Häufigste Modifikationen durch das DB2
Seit einigen Versionen modifiziert DB2 selbst bestimmte SQL-Formulierungen im
Sinne seines Optimizers. So werden vom DB2 folgende Aktionen angestossen:
•
IN-Listen mit einer Ausprägung zum „=“ Prädikat
Beispiel: WHERE kto_nr IN (12345)
wird zu WHERE kto_nr = 12345
•
OR-Prädikate, die sich auf dieselbe Spalte beziehen werden in eine IN-Liste
umgewandelt
Beispiel: WHERE kto_nr = 123 OR kto_nr = 345 OR kto_nr = 567
wird zu
WHERE kto_nr IN ( 123, 345, 567 )
4.4.2.2 Modifikationen über zusätzliche Prädikate („transitive closure“)
Das Einfügen weiterer Filterkriterien(nur bei <equal>, <not equal> und „ranges“
(inklusive BETWEEN, nicht bei LIKE und IN...) ist eine weitere Methode, Queries zu
verändern:
•
JOIN-Prädikat
Beispiel: WHERE
AND
AND
•
Lokale Prädikate
Beispiel: WHERE
AND
AND
•
a.projekt# = b.projekt#
a.projekt# BETWEEN 100 AND 200
b.projekt# BETWEEN 100 AND 200
neuer_preis
neuer_preis
alter_preis
=
>
>
alter_preis
8000
8000
ON-Prädikate(„outer join“)
Beispiel: ON
AND
AND
a.projekt# = b.projekt#
a.projekt# BETWEEN 100 AND 200
b.projekt# BETWEEN 100 AND 200
Einfügungen werden immer generiert (auch wenn bereits manuell eingefügt) falls die
u.a. Bedingungen gegeben sind:
1.
Es liegt ein <equal>-Prädikat vor
2.
Ein anderes Prädikat filtert eine dieser „columns“ weiter aus (nur bei
<equal> und <range> Prädikaten)
++49 8106 994390
Seite 68 von 127
4.4.2.3 Vereinfachung der JOIN-Verarbeitung
DB2 generiert eigenständig JOIN-Operationen um, wenn dies zu einem korrekten
Ergebnis führt. So wird aus einem
FULL OUTER JOIN ein LEFT OUTER JOIN generiert
LEFT oder RIGHT OUTER JOIN kann ein INNER JOIN werden
Dies kann natürlich nur unter bestimmten Bedingungen geschehen und so muss DB2
aus bestimmten Klauseln erkennen, ob ein Umformulieren von SQL-Statements
möglich ist oder nicht.
Erkennen aufgrund der WHERE-Klausel
Für Queries, in denen eine einfachere Join-Logik korrekt ist:
bei „boolean terms“
Ergebnis der Prädikat-Analyse ist „false“, wenn NULL resultiert
Beispiel:
SELECT .....
FROM t1 FULL OUTER JOIN t2
ON t1.c1 = t2.c1
WHERE t1.c5
BETWEEN 4 AND 5
SELECT .........
FROM t1 LEFT OUTER JOIN t2
ON t1.c1 = t2.c1
WHERE t1.c5
BETWEEN 4 AND 5
Beispiel:
SELECT .....
FROM t1 RIGHT OUTER JOIN t2
ON t1.c1 = t2.c1
WHERE t1.c5
BETWEEN 4 AND 5
SELECT .........
FROM t1 INNER JOIN t2
ON t1.c1 = t2.c1
WHERE t1.c5
BETWEEN 4 AND 5
Erkennen aufgrund der ON-Klausel
4.4.2.4 Beeinflussung der Reihenfolgen bei OUTER-JOINs
Die JOIN-Operationen werden von DB2 in ihrer Reihenfolge umgeneriert (schon seit
DB2 Version 3.1).
4.4.2.5 Subquery-Transformation in JOINs
Eine Subquery kann ohne Verlust der geltenden Logik unter bestimmten
Bedingungen in einen JOIN umformuliert werden
1.
Wenn sie mit IN, = ANY, = SOME aktiviert wird
Beispiel:
SELECT .....
FROM projekt
WHERE
projekt_typ IN
( SELECT
projekt_typ FROM projekt_typen_tab
WHERE
labor_ort IN ( ’München’, ’Frankfurt’ )
AND
projekt_typ = ’ENTW’ )
SELECT .....
FROM projekt p, projekt_typen_tab pt
WHERE
p.projekt_typ = pt.projekt_typ
AND pt.labor_ort IN ( ’München’, ’Frankfurt’ )
AND projekt_typ
= ’ENTW’
++49 8106 994390
Seite 69 von 127
2. Wenn eine non-correlated subquery mit NOT, IN oder mit <> ALL
vorgegeben ist, kann unter bestimmten Bedingungen ein NOT EXISTS
mit „correlated subquery“ formuliert werden.
Die Bedingungen sind:
−
Die ersetzbare „non-correlated subquery“ macht keinen „matching
index scan“
−
Die Spalte, auf die in der subquery referenziert wird, muss mit
einem „matching index scan“ unterstützbar sein
−
Die Cardinality muss hoch sein und es dürfen keine lokalen
Prädikate auf der Spalte existieren
−
Die Formate der Outer Query und der Subquery müssen
übereinstimmen und mit NOT NULL definiert sein
−
Die subquery darf keine GROUP BY, DISTINCT oder „column
functions“ enthalten
−
Das SQL-Statement führt keine Datenmanipulation aus (INSERT,
UPDATE, DELETE)
4.4.2.6 Auswahl der „outer table“ bei JOINs
Die Tabelle, auf die in einem JOIN mit einem UNIQUE Index zugegriffen wird, wird
zur „outer table“ in der internen („cursor“-gesteuerten) JOIN-Verarbeitung von DB2.
Dabei stellt sich das Problem eines programmtechnischen „loop“. Logisch ergibt sich
in den Konstellationen „t1 als outer table“ oder „t2 als outer table“ kein Unterschied.
Technisch aber erfolgt immer ein OPEN CURSOR auf die jeweilige Tabelle, deren
Zeilen abgearbeitet sind. So kann man folgende Rechnung aufmachen:
1. Besteht die „outer table“ aus 1000 Zeilen und die „inner table“ aus 100 Zeilen,
ergeben sich 1000 OPEN CURSOR t_outer_cursor, um alle Zeilen der „inner
table“ wieder positionieren und mit der nächsten Zeile der „outer table“
vergleichen zu können.
2. Besteht die „outer table“ aus 100 Zeilen und die „inner table“ aus 1000 Zeilen,
ergeben sich 100 OPEN CURSOR t_outer_cursor, um alle Zeilen der „inner table“
wieder positionieren und mit der nächsten Zeile der „outer table“ vergleichen zu
können.
Ein OPEN CURSOR aber ist eine aufwändige und zeitintensive Operation für DB2.
Also wird DB2 beim „optimizing“ die Tabelle, von der das kleinste Resultat an Anzahl
Zeilen zu erwarten ist als „outer table“ verwenden.
Beispiel:
SELECT
FROM
,
WHERE
AND
.....
projekt p
projekt_typen_tab pt
p.projekt_typ = pt.projekt_typ
p.projekt_nr = ’AD2100’
(PK) d.h. „outer table“ ist
projekt p
Was aber geschieht, wenn für den Optimizer kein eindeutiges Entscheidungskriterium erkennbar ist? – Dann ist der User - respektive die Formulierung der SQLQuery – gefragt (siehe auch Pkt. 3.2.15, 4.4.2.9 u.a.).
++49 8106 994390
Seite 70 von 127
4.4.2.7 Ausschalten von Indizes
Gibt es mehr als einen Index und das DB-System entscheidet sich für den
ungünstigsten, kann die IX-Nutzung durch Umsetzen von Stage-1 auf Stage-2
Prädikate ausgeschaltet und so die Wahl des günstigeren IX forciert werden.
Beispiel:
SELECT
FROM
,
WHERE
AND
AND
p.projekt#, p.stermin, p.projekttyp
projekt
p
projekttyp
t
p.projekttyp
= t.projekttyp
p.stermin
> ’1.1.2002’
←
p.durchfuehrungsort = ’München’ ←
IX günstig
IX ungünstig
(wird aber
angenommenerweise
vom DB2 gewählt)
IX wird ausgeschaltet:
SELECT
FROM
,
WHERE
AND
AND
p.projekt#, p.stermin, p.projekttyp
projekt
p
projekttyp
t
p.projekttyp =
t.projekttyp
p.stermin
>
’1.1.2002’
( p.durchfuehrungsort = ’München’
OR 0 = 1 )
←
IX günstig aktiv (!)
←
IX ungünstig wird
abgeklemmt
Bild-49: Ausschalten von Indizes
Dieses Verfahren kann für alle Datentypen ohne weitere Nachteile angewandt
werden. Die Ergebnismenge wird in keinem Fall beeinflusst.
Alternativen sind:
− Bei numerischen Werten:
Addition/Subtraktion von 0 +0 / -0
Multiplikation/Division mit 1
− Bei alphanumerischen Spalten
CONCAT mit Leerstring bzw. || ’ ’
− Generell
anstelle OR 0 = 1 kann auch OR 0 <> 0
stehen (mit dem Vorteil eines höheren
Filterfaktors !!!)
++49 8106 994390
Seite 71 von 127
4.4.2.8 Beeinflussen der IX-Nutzung
Durch die Vorgabe OPTIMIZE FOR 1 ROW kann sich die Nutzung von Indexen
verändern.
Beispiel:
SELECT
p.projekt#, p.stermin, p.projekttyp, p.name
FROM
projekt
p,
projekttyp
t
WHERE
p.projekttyp
=
t.projekttyp
AND
p.stermin
>
’1.1.2002’
AND
p.durchfuehrungsort =
’München’
OPTIMIZE FOR 1 ROW
Bild-50: OPTIMIZE-Klausel - Beispiel
Durch Änderung von Korrelationsnamen im SELECT kann ein Index für die
ORDER BY Anforderung genutzt werden
Beispiel:
SELECT
FROM
WHERE
AND
AND
ORDER BY
p.projekt#, p.stermin, T.projekttyp, p.name
projekt
p,
projekttyp
t
p.projekttyp
=
t.projekttyp
p.stermin
>
’1.1.2002’
’München’
projekttyp, name
Bild-51: Indexnutzung beim ORDER BY – Beispiel
4.4.2.9 Beeinflussen von „outer table“ Auswahl und JOIN-Methode
Über die Verwendung weiterer lokaler Prädikate kann ein JOIN-Prozess möglicherweise günstiger gestaltet werden.
Grundregel: Je mehr Prädikate sich auf eine Tabelle beziehen, desto eher wird
sie als „outer table“ favorisiert.
Beispiel:
SELECT
FROM
,
WHERE
AND
AND
AND
p.projekt#, p.stermin, t.projekttyp, p.name
projekt
p
projekttyp
t
p.projekttyp =
t.projekttyp
p.stermin
>
’1.1.2002’
p.durchfuehrungsort = ’München’
t.maxdauer
=
t.maxdauer
ggf. könnten noch weitere
Einfügungen vorgenommen
werden (Achtung: nur für
NOT NULL-Spalten)
Bild-52: Beeinflussen der OUTER-Table Auswahl – Beispiel 1
++49 8106 994390
Seite 72 von 127
Durch die Vorgabe OPTIMIZE FOR 1 ROW können sich „outer table“ und JOINMethode verändern. Wird ORDER BY vorgegeben, so wird der IX favorisiert, der die
SORT-Anforderung erfüllt. Damit wird auch die zugrundeliegende Tabelle zur „outer
table“
Beispiel:
SELECT
FROM
projekt
p
,
projekttyp
t
WHERE
p.projekttyp
=
t.projekttyp
AND
p.stermin
>
’1.1.2002’
AND
p.durchfuehrungsort = ’München’
OPTIMIZE FOR 1 ROW
Durch Änderung von Korrelationsnamen in der SELECT- bzw. WHERE-Klausel
können sich „outer table“ und JOIN-Methode verändern.
Beispiel:
SELECT
FROM
,
WHERE
AND
AND
p.projekt#, p.stermin, t.projekttyp, p.name
projekt
p
projekttyp
t
p.projekttyp
=
t.projekttyp
p.stermin
>
’1.1.2002’
’München’
Beispiel:
SELECT
FROM
,
WHERE
AND
AND
AND
projekt
p
projekttyp
t
p.projekttyp
=
t.projekttyp
p.stermin
>
’1.1.2002’
’München’
t.projekttyp
=
‘ENTW’
anstatt p.projekttyp
= ...
++49 8106 994390
Seite 73 von 127
Der Optimizer generiert bei LIKE- und IN-Prädikaten keine redundanten Prädikate.
Daher sollten bei JOINs Redundanzen ins Statement aufgenommen werden.
Beispiel:
SELECT
FROM
,
WHERE
AND
AND
projekt
p
projekttyp
t
p.projekttyp
=
t.projekttyp
p.projekttyp
LIKE ‘ENT%’
t.projekttyp
LIKE ‘ENT%’
von der IBM
empfohlen....
++49 8106 994390
Seite 74 von 127
4.4.3
Restrukturierung von UNION- durch CASE-Ausdrücke
Zum einen sind UNION's für das SQL aufwendige Operationen
("Vereinigungsmengen"), zum anderen verursachen sie beim Weglassen der ALLKlausel SORT-Vorgänge für die einzelnen Untermengen - wegen der Option
"distinct", die im UNION impliziert ist.
UNION-Ausdrücke sollten alle dahingehend untersucht werden, ob sie durch CASEAusdrücke bzw. „outer join“-Formulierungen ersetzt werden können.
Beispiel:
SELECT
sum(amount) AS tot1 , 0 , 0 , 0 , 0
FROM
tab_1 t1,
tab_2 t2,
tab_3
WHERE
t1.col1 =
t2.col2
AND
t1.col1 =
t3.col3
AND
t1.col2 =
t4.col4
AND
t1.colx IN
( ‘X’, ‘Y’, ‘Z’ )
UNION ALL
SELECT
0 , sum(amount) AS tot2 , 0 , 0 , 0
FROM
tab_1 t1,
tab_2 t2,
tab_3
WHERE
t1.col1 =
t2.col2
AND
t1.col1 =
t3.col3
AND
t1.col2 =
t4.col4
AND
t1.colx IN
( ‘A’, ‘B’, ‘C’ )
UNION ALL
SELECT
0 , 0 , sum(amount) AS tot3 , 0 , 0
FROM
tab_1 t1,
tab_2 t2,
tab_3
WHERE
t1.col1 =
t2.col2
AND
t1.col1 =
t3.col3
AND
t1.col2 =
t4.col4
AND
t1.colx IN
( ‘E’, ‘F’, ‘G’ )
UNION ALL
SELECT
0 , 0 , 0 , sum(amount) AS tot4 , 0
FROM
tab_1 t1,
tab_2 t2,
tab_3
WHERE
t1.col1 =
t2.col2
AND
t1.col1 =
t3.col3
AND
t1.col2 =
t4.col4
AND
t1.colx IN
( ‘Q’, ‘R’, ‘S’ )
UNION ALL
SELECT
0 , 0 , 0 , 0 , sum(amount) AS tot5
FROM
tab_1 t1,
tab_2 t2,
tab_3
WHERE
t1.col1 =
t2.col2
AND
t1.col1 =
t3.col3
AND
t1.col2 =
t4.col4
AND
t1.colx IN
( ‘T’, ‘U’, ‘V’ )
ORDER BY 2 , 4 , 3
t3,
tab_4
t4
t3,
tab_4
t4
t3,
tab_4
t4
t3,
tab_4
t4
t3,
tab_4
t4
Bild-57: Die UNION / CASE Alternative
++49 8106 994390
Seite 75 von 127
…. und hier dieselbe Query mit effizienten CASE-Statements:
SELECT
sum( CASE
END)
sum( CASE
END)
sum( CASE
END)
sum( CASE
END)
sum( CASE
END)
FROM
,
tab_2
,
tab_3
,
tab_4
WHERE
AND
AND
AND
WHEN (t1.colx =
THEN AMOUNT
‘X’
OR
t1.colx = ‘Y’ ….)
WHEN (t1.colx =
THEN AMOUNT
‘A’
OR
t1.colx = ‘B’ ….)
WHEN (t1.colx =
THEN AMOUNT
‘E’
OR
t1.colx = ‘F’ ….)
WHEN (t1.colx =
THEN AMOUNT
‘Q’
OR
t1.colx = ‘R’ ….)
WHEN (t1.colx =
THEN AMOUNT
‘T’
OR
t1.colx = ‘U’ ….)
tab_1
t2
t3
t4
t1.col1
t1.col1
t1.col2
t1.colx
t1
=
=
=
IN
t2.col2
t3.col3
t4.col4
( ‘A’ , ‘B’ , ‘C’ , ‘E’ , ‘F’ , ‘G’ , ‘Q’ , ‘R’ , ‘S’ ,
‘T’, ‘U’, ‘V’ , ‘X’ , ‘Y’ , ‘Z’ )
ORDER BY 2 , 4 , 3
Bild-58: Die CASE Alternative zum UNION
++49 8106 994390
Seite 76 von 127
4.4.4
Spezielle Techniken
Hier einige Techniken, mit denen man die Logik innerhalb der „DB2 engine“ im Sinne
von Performance positiv beeinflussen kann.
4.4.4.1 CASE in Prädikaten
Die Nutzung von CASE ist entgegen manchen Aussagen nicht grundsätzlich von
Übel. Benutzt man den CASE-Ausdruck allerdings innerhalb eines WHEREPrädikates, verursacht man nicht unerheblichen Stage-2 Overhead.
Im folgenden sollen mehrfache Queries oder auch „dynamisches SQL“ durch
komplexe SQL-Formulierungen eliminiert werden. Die Formel hier für ist einfach:
•
•
Lassen Sie SQL arbeiten
•
Ermöglichen Sie der
„engine:
DB2 soll dieses
optimieren
1. Das Anwenden der Prädikate
2. Das Durchführen der JOINs
3. Das Vermeiden/Durchführen von
SORTs
4. Das Durchführen von Berechnungen
5. Das Durchführen von Datentransformationen
Die Methode zum Aufbau von Komplexen SQL-Formulierungen lautet:
1. Sammeln aller Anforderungen
2. Auflisten der Tabellen für:
• Datenanzeige
• Datenfilterung
• Joins
3. Erzeugen der JOIN-Prädikate
4. Erzeugen der lokalen Prädikate
5. Erzeugen der „look-up“ Subqueries
6. Hinzufügen der Datentransformationen im SELECT
7. Hinzufügen von GROUP BY / HAVING (falls erforderlich)
8. Hinzufügen von FULL/LEFT Join (falls erforderlich)
9. Hinzufügen von „table expressions“ (falls erforderlich)
10. Zusammenfügen der Komponenten
11. Test
12. EXPLAIN – Tuning (falls erforderlich)
++49 8106 994390
Seite 77 von 127
Beispiel:
Alle :ws Hostvariablen, die zu Vergleichen genutzt werden, sind auf <blank> oder
andere <defaults> gesetzt, bevor die Query das „filtering“ einschaltet.
SELECT …..
FROM
....
WHERE ….
------------------------------------------------------------------- SELECT account nummern gruppe
------------------------------------------------------------------AND
( CASE WHEN :ws-acct-num-range
=
‚Y’
THEN acct_nummer
ELSE ‚
‚
END )
BETWEEN
:ws-acct-low
AND :ws-acct-high
------------------------------------------------------------------- SELECT gruppen nummern gruppe
------------------------------------------------------------------AND
( CASE WHEN :ws-group-num-range =
‚Y’
THEN ( grp_num_id
CONCAT
Sub_grp_num_id
CONCAT
DIGITS(grp_num_seq
)
ELSE ‚
‚
END )
BETWEEN
:ws-grp-low
AND :ws-grp-high
------------------------------------------------------------------- SELECT serielle nummern gruppe
------------------------------------------------------------------AND
( CASE WHEN :ws-serial-search
THEN serial_code
ELSE ‚ ‚
END )
=
:ws-serial-code
AND
THEN domain
ELSE ‚ ‚
END )
=
:ws-domain
AND
THEN serial_num
ELSE 000000000
END )
BETWEEN
:ws-ser-low
AND
=
‚Y’
=
‚Y’
=
‚Y’
:ws-ser-high
Bild-59: CASE in Prädikaten – Beispiel
++49 8106 994390
Seite 78 von 127
4.4.4.2 CASE in UPDATE – Anweisungen
CASE Ausdrücke können in vielen Fällen erfolgreich genutzt werden. Der beste Weg,
um dafür ein Gefühl zu entwickeln ist, eineige der folgenden Beispiele zu betrachten.
CASE Ausdrücke können die Logik, die über UNION, UNION ALL oder über
komplexe OR-Klausel formuliert wurde, eliminieren. Die Methode teilt die Ergebnisse
in mehrere Rückgabewerte ein, was DB2 dazu zwingen kann, mehrfach dieselben
Datenpages lesen zu müssen. Im Gegensatz dazu, kann man dieselben Daten mit
einer CASE "expression" und "one pass" durch die Daten gewinnen. Deshalb können
CASE Ausdrücke die "elapsed time" von Queries signifikant senken.
Dies gilt auch für UPDATEs, wie folgende Beispiele zeigen.
1.
Die Gehaltserhöhung hängt von der Jobklasse ab und bestimmte Klassen
erhalten somit 10%, andere 8%, weitere 5% und wieder andere 3.5%
Erhöhung. Man kann hier mehrere UPDATE Statements formulieren, aber auch
nur ein UPDATE mit einer CASE Expression.
UPDATE
SET
WHERE
TBEMPLOYEE
SALARY = SALARY * 1.10
JOB IN (‘MANAGER’ , ‘SUPRVSR’) ;
UPDATE
SET
WHERE
TBEMPLOYEE
JOB IN (‘DBA’ , ‘SYS PROG’) ;
UPDATE TBEMPLOYEE
SET
WHERE JOB = ‘PRGRMR’ ;
UPDATE
SET
WHERE
TBEMPLOYEE
JOB NOT IN (‘MANAGER’ , ‘SUPRVSR’, ‘DBA’ ,
‘SYS PROG’, ‘PRGRMR’) ;
UPDATE
SET
TBEMPLOYEE
SALARY =
CASE
WHEN JOB IN (‘MANAGER’ , ‘SUPRVSR’)
THEN SALARY * 1.10
WHEN JOB IN (‘DBA’ , ‘SYS PROG’)
THEN SALARY * 1.08
WHEN JOB = ‘PRGRMR’
THEN SALARY * 1.05
ELSE SALARY * 1.035
END ;
Bild-59a: Ein Update mit einer CASE "expression" und "one pass" auf die Daten (Beispiel-1)
2.
Im folgenden Beispiel wird das Feld ORDERSTATUS in der Tabelle TBORDER
abhängig vom Wert in der Spalte ORDERDATE modigiziert. Man lann dies mit
drei SQL Statements und drei Durchläufe durch die Daten erreichen, oder mit
einem SQL Statement und einem entsprechenden CASE Ausdruck und einem
Durchluaf durch die Daten. Zusätzlich dazu vereinfacht sich die Logik.
++49 8106 994390
Seite 79 von 127
UPDATE
SET
WHERE
TBORDER
ORDERSTATUS = '0'
ORDERDATE <= CURRENT DATE + 30 DAYS ;
UPDATE
SET
WHERE
AND
TBORDER
ORDERSTATUS = '1'
ORDERDATE > CURRENT DATE + 30 DAYS
ORDERDATE <= CURRENT DATE + 60 DAYS ;
UPDATE
SET
WHERE
TBORDER
ORDERSTATUS = '2'
ORDERDATE > CURRENT DATE + 60 DAYS ;
UPDATE
SET
TBORDER
ORDERSTATUS =
CASE
WHEN ORDERDATE <= CURRENT DATE + 30 DAYS
THEN ‘0’
WHEN (ORDERDATE > CURRENT DATE + 30 DAYS
AND ORDERDATE <= CURRENT DATE + 60 DAYS)
THEN '1'
WHEN ORDERDATE > CURRENT DATE + 60 DAYS
THEN '2'
END ;
Bild 59b: Update implementiert mit CASE Ausdruck (Beispiel-2)
++49 8106 994390
Seite 80 von 127
4.4.4.3 CASE zum Vermeiden von Rechen- oder anderen Fehlern
Im nächsten Beispiel wird eine "CASE expression" verwendet, um das Problem
"Division durch 0" zu vermeiden. Es zeigt eine Akkumulations- bzw. Summierungsoperation. In der ersten Query wäre es möglich, einen Fehler zu bekommen, das die
.......
++49 8106 994390
Seite 81 von 127
++49 8106 994390
Seite 82 von 127
++49 8106 994390
Seite 83 von 127
11.4 Index
„
„single“ SQL-Statement
48
A
access path
30
ACCESSTYPE
95, 110, 111, 112, 113, 114, 115,
119, 155
Accounting
160, 166, 167
ACQUIRE
30
AE 13
Application Elapsed Time
167
Application Elapsed TIme
167
autocommit
29, 30
B
BIND
27, 143, 162
BM
101, 102
bool´sche Operationen
17
Boolean Term
103, 145
BSDS
189
BT 103, 145
buffer pool
189, 201
Bufferpool
15, 16, 27, 102, 121, 158, 159
Bufferpool-Grösse
27
C
CASE-Expression
54
catalog visibility
161
CICS
15
CLOSE
218
clustering
192
clustering index
24
Column Functions
28
COMMIT
218
compound
144
Compound
103, 144
correlated
55, 56, 59, 60, 97, 103, 108, 121
Correlated Subquery
59, 60, 172
CPU-Time
161
deadlock
DELETE
Disk Paths
DISPLAY-Command
Distinct-Set
DM
DML
DSNZPARM
DWH
Dynamic EXPLAIN
29
160
27
160
28
101, 103, 104
17, 18
25
23
27
E
EDITPROC
15
EDM-Pool
23
Elapsed
27, 161, 167
Elapsed Time
27, 161, 167
Elapsed Wait Time
167
End Users Response Time
166
EXPLAIN 15, 27, 28, 60, 95, 96, 110, 114, 115, 119,
121, 123, 125, 155, 160
EXPLAIN-Option
27
EXPLAIN-Output
28
F
FIFO
Filterfaktor
Filterfaktoren
foreign key
free space
FREQVAL
Funktionsentwurf
24
27
171
198, 201, 211, 215
193, 201
162
12
G
Generierungsparameter
15
GET PAGE
101
GET PAGES 158, Siehe Get Page, Siehe Get Page,
Siehe Get Page
Global Temporary Tables
26, 27
Governor
160
GROUP BY
18, 26, 27, 28, 168, 171
H
D
Hit Ratio
database
186, 187, 191, 193, 196, 197, 198, 199,
204, 205, 209, 211, 212, 214, 215, 216, 218,
219, 220, 221, 222, 224
DATABASE
160
data-pages
111
DB2-Elapsed Time
167
DB2-Estimator
161
DB2I
161
DB2PM
160, 166
DB2-Systemparameter
16
DB2-Tuning
12
DB2-Zugriffspfad
15
DBA
25, 162
DBD
23, 196, 197
DB-Design
8, 9
DCL
17
DDF
16
DDL
16, 17, 95
++49 8106 994390
158
I
I/O Contention
27
I/O Paths
27
IFCID
203, 204
IFI 160
IMS-DB
15
Index
16, 25, 26, 28, 60, 94, 95, 96, 97, 103, 104,
105, 106, 107, 110, 111, 112, 113, 114, 115,
116, 117, 118, 119, 121, 123, 145, 153, 162, 171
INDEX ONLY
109, 114, 115
Index Scan
28
Index screening
29
indexable
103, 105, 106, 108, 109
Indexable
103, 105, 144
indexable Prädikate
103
Indexable Predicates
106
Seite 84 von 127
Indexe
Indexes
INDEXONLY
Index-Only-Zugriff
Indexscreening
Index-Screening
indizierbar
INSERT
Intersektionsmenge
IRLM
24, 123
24, 25, 26, 28, 59, 115, 172
114, 115
26
103
106
109
17, 18, 160
21
15, 204, 205, 207, 222
J
Join 16, 17, 26, 28, 59, 101, 110, 120, 122, 144, 172
JOIN 19, 21, 22, 24, 59, 60, 97, 103, 110, 121, 144,
155, 171, 172
JOIN Tuning
171
Join-Operation
26
JOIN-Prädikat
144
Joins
28, 31, 56, 144
K
karthesische
21
karthesische Produkt
21
Katalog
15, 95, 142, 143, 145, 160, 161
KEYCARD
162
Korrelation
144
N
nested loop
nested table expression
nicht-indizierbar
non-correlated
Non-Correlated
Non-Correlated Subquery
non-Indexable
Non-Indexable
non-matching index scan
non-sargable
NON-SARGABLE
NON-uniform distribution
NULL
120, 121, 123
53
109
56, 59
59, 60, 172
59, 60, 172
103
144
111, 112
103, 109
104
145
198
O
ONE MERGE PASS
123
One-Fetch index scan
114
Optimierungsmöglichkeiten
14
Optimizer
12, 26, 101, 103, 110, 143, 160, 162
ORDER BY
26, 27, 112, 168, 171, 221
ORDER BY / GROUP BY
171
OS/390 9, 192, 199, 209, 210, 214, 217, 223, 227,
228, 229
overhead
23, 29, 45, 48
P
L
latch
24
leaf-pages
111, 112, 113
List prefetch
29
LIST PREFETCH
110, 115, 116, 117, 118
List Sequential Prefetch
26
lnstrumentation Facility
160
LOB
188, 189, 199, 207, 208, 209, 211, 219, 224
Local
144
LOCATION
160
lock escalation
207
Lock escalation
30
LOCK TABLE
30
LOCKMAX
30
locks
29
LOCKSIZE
30
M
MATCHCOLS
95, 106, 112, 113, 114, 155
matching
105, 110, 112, 113, 115, 153
matching index scan
111, 112, 113, 114
Materialisierung
16
materialization
153
Materialized Views
26
MERGE SCAN
122, 123
METHOD
95, 121, 123
Methodeneinsatz
13
MIXOPSEQ
119
MI-Zugriff
119
MODIFY
204
multiple index
110
multiple index access
103
MUPA
158, 159
MVS
15, 16, 25, 161
++49 8106 994390
page
29
page set
204, 211, 212, 216, 219, 222
Page Set Scan
16
Page-Marker
27
Paging
16
Paging/Swapping
16
parse-tree
103, 144
parsing
103
Parsing
144
partitioned
145
partitioned IX
145
Performance 12, 13, 14, 25, 26, 27, 31, 96, 103, 153,
160, 162, 166
physische DB2-Objekte
15
PK 24
PLAN TABLE
155
PLAN_TABLE 106, 110, 111, 112, 113, 114, 155,
160
Prädikat 28, 60, 95, 98, 103, 105, 106, 123, 144, 145
Prädikate 16, 17, 26, 28, 60, 101, 102, 103, 104, 106,
111, 112, 113, 114, 123, 144, 153, 171
Prädikatskatekorien
144
prefetch
24, 110, 111, 112, 119
Prefetch
26, 27, 102, 110, 115, 119
PREFETCH 110, 114, 115, 116, 117, 118, 119, 155
Prefetch Rate
27
PREPARE
23, 95
primary key
198, 201, 211, 212, 215, 223
Projektion
17, 20
PTS
145
Q
Queries
15, 27, 103, 144, 155, 168, 171
Seite 85 von 127
R
RDM
101, 104
RDS
101, 102, 103, 104, 120, 121
REBIND
15, 162, 210
RECOVER
216
REOPT
27
REORG
30, 215
Resource Limit Facility
160
Response Time
166
Result
26, 101
Result Set
26
result table
103, 153
RID
29, 107, 115, 116, 117, 118, 119, 214, 216
RID Pool
29
RLF
160
RLIMIT
160
ROLLBACK
218
ROWID
145
row-level locking
29
RUNSTATS
15, 27, 160, 162
RUNSTATS Utility
27
RUNSTATS-Utility
160
S
SAP
23
sargeable
103, 109
sargeable Prädikate
103
segmented TS
23, 24
SELECT 17, 18, 19, 20, 21, 26, 59, 95, 96, 97, 98,
109, 111, 112, 113, 119, 121, 144, 160
Selektion
17, 20, 95
sequential Prefetch
110
Service units
160
Set Functions
28
simple
144
Simple
103, 144
Sort
26, 27, 28, 123
SORT
33, 43, 44, 51, 52, 53, 115, 155
Sort Bufferpool
27
Sort Data columns
26
Sort Key Columns
26
Sort-Aufwand
26
Sort-Performance
26
Sort-Row-Länge
26
space map
211, 216
SQL 8, 9, 186, 187, 189, 191, 192, 194, 196, 197,
199, 200, 201, 202, 210, 212, 213, 214, 215,
216, 218, 219, 220, 221, 222, 223, 224, 227,
228, 229
Stage1
103, 104, 106, 144
Stage2
103, 104, 106, 145
Stage-2
28
Stage2 - Prädikate
104
Static Explain
27
STOGROUP
27
++49 8106 994390
stored procedure
STOSPACE
STOSPACE-Utility
STS
subqueries
Subquery
subquery-nesting
SYSADM
System-Design
48
160
160
24
54, 56, 97
59, 60, 101, 103, 144, 172
60
213
12
T
Table
25, 59, 94, 101, 107, 110, 111, 121, 123
TABLE
189, 194, 197
Tablespace
25, 110, 111
tablespace scan
111
TABLESPACE SCAN
120
task switch overhead
48
Technologie-Einsatz
13
Thread
15, 16, 167
THREAD
160, 167
threads
29
timeout
222
Total Transit Time
166
TRACE
160
TSO
15
Tuningpotentiale
14, 16
U
uniform distribution
unique index
UNIX
UPDATE
Update Statistik
UR
UTILITY
145
203, 223
9, 228
17, 18, 25, 160
160
29
160
V
VARCHAR
VARCHAR-Spalten
view
virtual storage
Virtual Storage
Visual Explain
VPSEQT
VSAM
24, 26, 191
26
186, 194, 209, 221, 222, 224
186, 196, 202, 207
192, 209, 224
125, 160
24
15, 16, 25, 102
W
Workfile Dataset
Workfile-Space
27
27
Z
z/OS
9, 223, 225, 227, 228
Seite 86 von 127
11.5
Kapitel 11: Anhang – Glossar
Glossar
A
abend.
.
abend reason code.
Abnormal end of task
A 4-byte hexadecimal code that uniquely identifies a problem with
DB2. A complete list of DB2 abend reason codes and their
explanations is contained in DB2 Messages and Codes.
abnormal end of task (abend). Termination of a task, job, or subsystem because of an error condition
that recovery facilities cannot resolve during execution.
access method services.
The facility that is used to define and reproduce VSAM key-sequenced
data sets.
access path.
The path that is used to locate data that is specified in SQL
statements. An access path can be indexed or sequential.
active log.
The portion of the DB2 log to which log records are written as they are
generated. The active log always contains the most recent log records,
whereas the archive log holds those records that are older and no
longer fit on the active log.
address space.
A range of virtual storage pages that is identified by a number (ASID)
and a collection of segment and page tables that map the virtual
pages to real pages of the computer’s memory
address space connection.
The result of connecting an allied address space to DB2. Each
address space that contains a task that is connected to DB2 has
exactly one address space connection, even though more than one
task control block (TCB) can be present. See also allied address
space and task control block.
agent.
As used in DB2, the structure that associates all processes that are
involved in a DB2 unit of work. An allied agent is generally
synonymous with an allied thread. System agents are units of work
that process independently of the allied agent, such as prefetch
processing, deferred writes, and service tasks.
alias.
An alternative name that can be used in SQL statements to refer to a
table or view in the same or a remote DB2 subsystem.
allied address space.
An area of storage that is external to DB2 and that is connected to
DB2. An allied address space is capable of requesting DB2 services.
allied thread.
A thread that originates at the local DB2 subsystem and that can
access data at a remote DB2 subsystem.
ambiguous cursor.
A database cursor that is not defined with the FOR FETCH ONLY
clause or the FOR<UPDATE OF clause, is not defined on a read-only
result table, is not the target of a WHERE CURRENT clause on an
SQL UPDATE or DELETE statement, and is in a plan or package that
contains either PREPARE or EXECUTE IMMEDIATE SQL statements.
APAR.
Authorized program analysis report.
APAR fix corrective service.
A temporary correction of a DB2 defect. The correction is temporary,
because it is usually replaced at a later date by a more permanent
correction, such as a program temporary fix (PTF).
APF.
Authorized program facility.
API.
Application programming interface.
++49 8106 994390
Seite 87 von 127
APPL.
A VTAM network definition statement that is used to define DB2 to
VTAM as an application program that uses SNA LU 6.2 protocols.
application.
A program or set of programs that performs a task; for example, a
payroll application.
application-directed connection.
A connection that an application manages using the SQL CONNECT
statement.
application plan.
The control structure that is produced during the bind process. DB2
uses the application plan to process SQL statements that it encounters
during statement execution.
application process.
The unit to which resources and locks are allocated. An application
process involves the execution of one or more programs.
application programming interface (API).
A functional interface that is supplied by the operating system or by a
separately orderable licensed program that allows an application
program that is written in a high-level language to use specific data or
functions of the operating system or licensed program.
application requester.
The component on a remote system that generates DRDA requests
for data on behalf of an application. An application requester accesses
a DB2 database server using the DRDA application-directed protocol.
application server.
The target of a request from a remote application. In the DB2
environment, the application server function is provided by the
distributed data facility and is used to access DB2 data from remote
applications
archive log.
The portion of the DB2 log that contains log records that have been
copied from the active log.
ASCII.
An encoding scheme that is used to represent strings in many
environments, typically on PCs and workstations. Contrast with
EBCDIC and Unicode
.
attachment facility.
An interface between DB2 and TSO, IMS, CICS, or batch address
spaces. An attachment facility allows application programs to access
DB2.
attribute.
A characteristic of an entity. For example, in database design, the
phone number of an employee is one of that employee’s attributes.
authorization ID.
A string that can be verified for connection to DB2 and to which a set
of privileges is allowed. It can represent an individual, an
organizational group, or a function, but DB2 does not determine this
representation.
authorized program analysis report (APAR).
A report of a problem that is caused by a suspected defect in a current
release of an IBM licensed program.
authorized program facility (APF). A facility that permits the identification of programs that are
authorized to use restricted functions.
auxiliary index.
An index on an auxiliary table in which each index entry refers to a
LOB.
auxiliary table.
A table that stores columns outside the table in which they are defined.
Contrast with base table.
++49 8106 994390
Seite 88 von 127
B
backward log recovery.
The fourth and final phase of restart processing during which DB2
scans the log in a backward direction to apply UNDO log records for
all aborted changes.
base table.
(1) A table that is created by the SQL CREATE TABLE statement and
that holds persistent data. Contrast with result table and temporary
table.
(2) A table containing a LOB column definition. The actual LOB
column data is not stored with the base table. The base table contains
a row identifier for each row and an indicator column for each of its
LOB columns. Contrast with auxiliary table.
base table space.
A table space that contains base tables.
basic sequential access method (BSAM). An access method for storing or retrieving data blocks in a
continuous sequence, using either a sequential access or a direct
access device.
before trigger.
A trigger that is defined with the trigger activation time BEFORE.
binary large object (BLOB).
A sequence of bytes where the size of the value ranges from 0 bytes
to 2 GB.1. Such a string does not have an associated CCSID.
binary string.
A sequence of bytes that is not associated with a CCSID. For
example, the BLOB data type is a binary string.
bind.
The process by which the output from the SQL precompiler is
converted to a usable control structure, often called an access plan,
application plan, or package. During this process, access paths to the
data are selected and some authorization checking is performed. The
types of bind are:
automatic bind. (More correctly, automatic rebind)A process by which
SQL statements are bound automatically (without a user issuing a
BIND command) when an application process begins execution and
the bound application plan or package it requires is not valid.
dynamic bind. A process by which SQL statements are bound as
they are entered.
incremental bind. A process by which SQL statements are bound
during the execution of an application process, because they could not
be bound during the bind process, and VALIDATE(RUN) was
specified.
static bind. A process by which SQL statements are bound after they
have been precompiled. All static SQL statements are prepared for
execution at the same time.
BLOB.
Binary large object
BMP.
Batch Message Processing (IMS).
bootstrap data set (BSDS).
A VSAM data set that contains name and status information for DB2,
as well as RBA range specifications, for all active and archive log data
sets. It also contains passwords for the DB2 directory and catalog, and
lists of conditional restart and checkpoint records.
BSAM.
Basic sequential access method.
BSDS.
Bootstrap data set.
buffer pool.
Main storage that is reserved to satisfy the buffering requirements for
one or more table spaces or indexes.
built-in function.
A function that DB2 supplies. Contrast with user-defined function.
++49 8106 994390
Seite 89 von 127
C
CAF.
Call attachment facility.
call attachment facility (CAF). A DB2 attachment facility for application programs that run in TSO or
MVS batch. The CAF is an alternative to the DSN command processor
and provides greater control over the execution environment.
cascade delete.
The way in which DB2 enforces referential constraints when it deletes
all descendent rows of a deleted parent row.
cast function.
A function that is used to convert instances of a (source) data type into
instances of a different (target) data type. In general, a cast function
has the name of the target data type. It has one single argument
whose type is the source data type; its return type is the target data
type.
catalog.
In DB2, a collection of tables that contains descriptions of objects
such as tables, views, and indexes.
catalog table.
Any table in the DB2 catalog.
CCSID.
Coded character set identifier.
CDB.
Communications database.
CEC.
Central electronic complex. See central processor complex.
central electronic complex (CEC).
See central processor complex.
central processor complex (CPC).
A physical collection of hardware (such as an ES/3090) that consists
of main storage, one or more central processors, timers, and
channels.
character large object (CLOB). A sequence of bytes representing single-byte characters or a mixture
of single- and double-byte characters where the size of the value can
be up to 2 GB.1. In general, character large object values are used
whenever a character string might exceed the limits of the VARCHAR
type.
character set.
A defined set of characters.
character string.
A sequence of bytes that represent bit data, single-byte characters, or
a mixture of single-byte and multibyte characters.
check constraint.
See table check constraint.
check integrity.
The condition that exists when each row in a table conforms to the
table check constraints that are defined on that table. Maintaining
check integrity requires DB2 to enforce table check constraints on
operations that add or change data.
check pending.
A state of a table space or partition that prevents its use by some
utilities and some SQL statements because of rows that violate
referential constraints, table check constraints, or both.
checkpoint.
A point at which DB2 records internal status information on the DB2
log; the recovery process uses this information if DB2 abnormally
terminates.
CI.
Control interval.
++49 8106 994390
Seite 90 von 127
CICS.
Represents one of the following products:
CICS Transaction Server for OS/390:
Customer Information Control System Transaction Server for
OS/390
CICS/ESA:
Customer Information Control System/Enterprise Systems
Architecture
CICS/MVS:
Customer Information Control System/Multiple Virtual Storage
CICS attachment facility.
A DB2 subcomponent that uses the MVS subsystem interface (SSI)
and cross storage linkage to process requests from CICS to DB2 and
to coordinate resource commitment.
CIDF.
Control interval definition field.
claim.
A notification to DB2 that an object is being accessed. Claims prevent
drains from occurring until the claim is released, which usually occurs
at a commit point. Contrast with drain.
claim class.
A specific type of object access that can be one of the following:
Cursor stability (CS)
Repeatable read (RR)
Write
claim count.
A count of the number of agents that are accessing an object.
class of service.
A VTAM term for a list of routes through a network, arranged in an
order of preference for their use.
clause.
In SQL, a distinct part of a statement, such as a SELECT clause or a
WHERE clause.
client.
See requester.
CLIST.
Command list. A language for performing TSO tasks.
CLOB.
Character large object.
CLPA.
Create link pack area.
clustering index.
An index that determines how rows are physically ordered in a table
space.
coded character set.
A set of unambiguous rules that establish a character set and the oneto-one relationships between the characters of the set and their coded
representations.
coded character set identifier (CCSID).
A 16-bit number that uniquely identifies a coded representation of
graphic characters. It designates an encoding scheme identifier and
one or more pairs consisting of a character set identifier and an
associated code page identifier.
column.
The vertical component of a table. A column has a name and a
particular data type (for example, character, decimal, or integer).
column function.
An operation that derives its result by using values from one or more
rows. Contrast with scalar function.
"come from" checking.
An LU 6.2 security option that defines a list of authorization IDs that
are allowed to connect to DB2 from a partner LU.
command.
A DB2 operator command or a DSN subcommand. A command is
distinct from an SQL statement.
++49 8106 994390
Seite 91 von 127
command recognition character (CRC).
A character that permits an MVS console operator or an IMS
subsystem user to route DB2 commands to specific DB2 subsystems.
commit.
The operation that ends a unit of work by releasing locks so that the
database changes that are made by that unit of work can be perceived
by other processes.
commit point.
A point in time when data is considered consistent.
committed phase.
The second phase of the multisite update process that requests all
participants to commit the effects of the logical unit of work.
common service area (CSA).
In MVS, a part of the common area that contains data areas that are
addressable by all address spaces.
communications database (CDB).
A set of tables in the DB2 catalog that are used to establish
conversations with remote database management systems.
comparison operator.
A token (such as =, >, <) that is used to specify a relationship between
two values.
compression dictionary.
The dictionary that controls the process of compression and
decompression. This dictionary is created from the data in the table
space or table space partition.
concurrency.
The shared use of resources by more than one application process at
the same time.
conditional restart.
A DB2 restart that is directed by a user-defined conditional restart
control record (CRCR).
connection ID.
An identifier that is supplied by the attachment facility and that is
associated with a specific address space connection.
consistency token.
A timestamp that is used to generate the version identifier for an
application. See also version.
constraint.
A rule that limits the values that can be inserted, deleted, or updated in
a table. See referential constraint, table check constraint, and
uniqueness constraint.
control interval (CI).
A fixed-length area or direct access storage in which VSAM stores
records and creates distributed free space. Also, in a key-sequenced
data set or file, the set of records pointed to by an entry in the
sequence-set index record. The control interval is the unit of
information that VSAM transmits to or from direct access storage. A
control interval always includes an integral number of physical records.
control interval definition field (CIDF).
In VSAM, a field located in the 4 bytes at the end of each control
interval; it describes the free space, if any, in the control interval.
conversation.
Communication, which is based on LU 6.2 or Advanced Program-toProgram Communication (APPC), between an application and a
remote transaction program over an SNA logical unit-to-logical unit
(LU-LU) session that allows communication while processing a
transaction.
coordinator.
The system component that coordinates the commit or rollback of a
unit of work that includes work that is done on one or more other
systems.
correlated columns.
A relationship between the value of one column and the value of
another column.
++49 8106 994390
Seite 92 von 127
correlated subquery.
A subquery (part of a WHERE or HAVING clause) that is applied to a
row or group of rows of a table or view that is named in an outer
subselect statement.
correlation ID.
An identifier that is associated with a specific thread. In TSO, it is
either an authorization ID or the job name.
correlation name.
An identifier that designates a table, a view, or individual rows of a
table or view within a single SQL statement. It can be defined in any
FROM clause or in the first clause of an UPDATE or DELETE
statement.
cost category.
A category into which DB2 places cost estimates for SQL statements
at the time the statement is bound. A cost estimate can be placed in
either of the following cost categories: v A: Indicates that DB2 had
enough information to make a cost estimate without using default
values. v B: Indicates that some condition exists for which DB2 was
forced to use default values for its estimate. The cost category is
externalised in the COST_CATEGORY column of the
DSN_STATEMNT_TABLE when a statement is explained.
CPC.
Central processor complex.
CRC.
Command recognition character.
CRCR.
Conditional restart control record. See also conditional restart.
create link pack area (CLPA).
An option used during IPL to initialise the link pack pageable area.
created temporary table.
A table that holds temporary data and is defined with the SQL
statement CREATE GLOBAL TEMPORARY TABLE. Information
about created temporary tables is stored in the DB2 catalog, so this
kind of table is persistent and can be shared across application
processes. Contrast with declared temporary table. See also
temporary table.
cross-memory linkage.
A method for invoking a program in a different address space. The
invocation is synchronous with respect to the caller.
CS.
Cursor stability.
CSA.
Common service area.
CT.
Cursor table.
current status rebuild.
The second phase of restart processing during which the status of the
subsystem is reconstructed from information on the log.
cursor.
A named control structure that an application program uses to point to
a row of interest within some set of rows, and to retrieve rows from the
set, possibly making updates or deletions.
cursor stability (CS).
The isolation level that provides maximum concurrency without the
ability to read uncommitted data. With cursor stability, a unit of work
holds locks only on its uncommitted changes and on the current row of
each of its cursors.
cursor table (CT).
The copy of the skeleton cursor table that is used by an executing
application process.
cycle.
A set of tables that can be ordered so that each table is a descendent
of the one before it, and the first table is a descendent of the last table.
A self-referencing table is a cycle with a single member.
++49 8106 994390
Seite 93 von 127
D
DASD.
Direct access storage device.
database.
A collection of tables, or a collection of table spaces and index spaces.
database access thread.
A thread that accesses data at the local subsystem on behalf of a
remote subsystem.
database administrator (DBA). An individual who is responsible for designing, developing, operating,
safeguarding, maintaining, and using a database.
database descriptor (DBD).
An internal representation of a DB2 database definition, which reflects
the data definition that is in the DB2 catalog. The objects that are
defined in a database descriptor are table spaces, tables, indexes,
index spaces, and relationships.
database management system (DBMS).
A software system that controls the creation, organization, and
modification of a database and the access to the data stored within it.
database request module (DBRM).
A data set member that is created by the DB2 precompiler and that
contains information about SQL statements. DBRMs are used in the
bind process.
database server.
The target of a request from a local application or an intermediate
database server. In the DB2 environment, the database server
function is provided by the distributed data facility to access DB2 data
from local applications, or from a remote database server that acts as
an intermediate database server.
DATABASE 2 Interactive (DB2I).
The DB2 facility that provides for the execution of SQL statements,
DB2 (operator) commands, programmer commands, and utility
invocation.
data definition name (ddname).
The name of a data definition (DD) statement that corresponds to a
data control block containing the same name.
Data Language/I (DL/I).
The IMS data manipulation language; a common high-level interface
between a user application and IMS.
data sharing.
The ability of two or more DB2 subsystems to directly access and
change a single set of data.
data sharing group.
A collection of one or more DB2 subsystems that directly access and
change the same data while maintaining data integrity.
data sharing member.
A DB2 subsystem that is assigned by XCF services to a data sharing
group.
data space.
A range of up to 2 GB of contiguous virtual storage addresses that a
program can directly manipulate. Unlike an address space, a data
space can hold only data; it does not contain common areas, system
data, or programs.
data type.
An attribute of columns, literals, host variables, special registers, and
the results of functions and expressions.
date.
A three-part value that designates a day, month, and year.
date duration.
decimal integer that represents a number of years, months, and days.
datetime value.
A value of the data type DATE, TIME, or TIMESTAMP.
++49 8106 994390
Seite 94 von 127
DBA.
Database administrator.
DBCLOB.
Double-byte character large object.
DBCS.
Double-byte character set.
DBD.
Database descriptor.
DBID.
Database identifier.
DBMS.
Database management system.
DBRM.
Database request module.
DB2 catalog.
Tables that are maintained by DB2 and contain descriptions of DB2
objects, such as tables, views, and indexes.
DB2 command.
An instruction to the DB2 subsystem allowing a user to start or stop
DB2, to display information on current users, to start or stop
databases, to display information on the status of databases, and so
on.
DB2 for VSE & VM.
The IBM DB2 relational database management system for the VSE
and VM operating systems.
DB2I.
DATABASE 2 Interactive.
DB2I Kanji Feature.
The tape that contains the panels and jobs that allow a site to display
DB2I panels in Kanji.
DB2 PM.
DATABASE 2 Performance Monitor.
DCLGEN.
Declarations generator.
DDF.
Distributed data facility.
ddname.
Data definition name.
deadlock.
Unresolvable contention for the use of a resource such as a table or
an index.
declarations generator (DCLGEN).
A subcomponent of DB2 that generates SQL table declarations and
COBOL, C, or PL/I data structure declarations that conform to the
table. The declarations are generated from DB2 system catalog
information. DCLGEN is also a DSN subcommand.
declared temporary table.
A table that holds temporary data and is defined with the SQL
statement DECLARE GLOBAL TEMPORARY TABLE. Information
about declared temporary tables is not stored in the DB2 catalog, so
this kind of table is not persistent and can only be used by the
application process that issued the DECLARE statement. Contrast
with created temporary table. See also temporary table.
default value.
A predetermined value, attribute, or option that is assumed when no
other is explicitly specified.
deferred write.
The process of asynchronously writing changed data pages to disk.
degree of parallelism.
The number of concurrently executed operations that are initiated to
process a query.
delete rule.
The rule that tells DB2 what to do to a dependent row when a parent
row is deleted. For each relationship, the rule might be CASCADE,
RESTRICT, SET NULL, or NO ACTION.
++49 8106 994390
Seite 95 von 127
dependent.
An object (row, table, or table space) that has at least one parent. The
object is also said to be a dependent (row, table, or table space) of its
parent. See parent row, parent table, parent table space.
dependent row.
A row that contains a foreign key that matches the value of a primary
key in the parent row.
dependent table.
A table that is a dependent in at least one referential constraint.
descendent.
An object that is a dependent of an object or is the dependent of a
descendent of an object.
descendent row.
A row that is dependent on another row, or a row that is a descendent
of a dependent row.
descendent table.
A table that is a dependent of another table, or a table that is a
descendent of a dependent table.
DFHSM.
Data Facility Hierarchical Storage Manager.
DFP.
Data Facility Product (in MVS).
dimension.
A data category such as time, products, or markets. The elements of a
dimension are referred to as members. Dimensions offer a very
concise, intuitive way of organizing and selecting data for retrieval,
exploration, and analysis. See also dimension table.
dimension table.
The representation of a dimension in a star schema. Each row in a
dimension table represents all of the attributes for a particular member
of the dimension. See also dimension, star schema, and star join.
direct access storage device (DASD).
A device in which access time is independent of the location of the
data.
directory.
The DB2 system database that contains internal objects such as
database descriptors and skeleton cursor tables.
distinct type.
A user-defined data type that is internally represented as an existing
type (its source type), but is considered to be a separate and
incompatible type for semantic purposes.
Distributed Computing Environment MVS/ESA ™ (DCE MVS/ESA).
A set of technologies that are provided by the Open Software
Foundation to implement distributed computing.
distributed data facility (DDF). A set of DB2 components through which DB2 communicates with
another RDBMS.
Distributed Relational Database Architecture (DRDA).
A connection protocol for distributed relational database processing
that is used by IBM’s relational database products. DRDA includes
protocols for communication between an application and a remote
relational database management system, and for communication
between relational database management systems.
DL/I.
Data Language/I.
double-byte character large object (DBCLOB).
A sequence of bytes representing double-byte characters where the
size of the values can be up to 2 GB. In general, double-byte character
large object values are used whenever a double-byte character string
might exceed the limits of the VARGRAPHIC type.
++49 8106 994390
Seite 96 von 127
double-byte character set (DBCS).
A set of characters, which are used by national languages such as
Japanese and Chinese, that have more symbols than can be
represented by a single byte. Each character is 2 bytes in length.
Contrast with single-byte character set and multibyte character set.
drain.
The act of acquiring a locked resource by quiescing access to that
object.
drain lock.
A lock on a claim class that prevents a claim from occurring.
DRDA.
Distributed Relational Database Architecture.
DRDA access.
An open method of accessing distributed data that you can use to can
connect to another database server to execute packages that were
previously bound at the server location. You use the SQL CONNECT
statement or an SQL statement with a three-part name to identify the
server. Contrast with private protocol access.
DSN.
(1) The default DB2 subsystem name.
(2) The name of the TSO command processor of DB2.
(3) The first three characters of DB2 module and macro names.
duration.
A number that represents an interval of time. See date duration,
labeled duration, and time duration.
dynamic SQL.
SQL statements that are prepared and executed within an application
program while the program is executing. In dynamic SQL, the SQL
source is contained in host language variables rather than being
coded into the application program. The SQL statement can change
several times during the application program’s execution
++49 8106 994390
Seite 97 von 127
E
EA-enabled table space.
A table space or index space that is enabled for extended
addressability and that contains individual partitions (or pieces, for
LOB table spaces) that are greater than 4 GB.
EBCDIC.
Extended binary coded decimal interchange code. An encoding
scheme that is used to represent character data in the OS/390, MVS,
VM, VSE, and OS/400 ® environments. Contrast with ASCII and
Unicode.
EDM pool.
A pool of main storage that is used for database descriptors,
application plans, authorization cache, application packages, and
dynamic statement caching.
EID.
Event identifier.
embedded SQL.
SQL statements that are coded within an application program. See
static SQL.
enclave.
In Language Environment, an independent collection of routines, one
of which is designated as the main routine. An enclave is similar to a
program or run unit.
EOM.
End of memory.
EOT.
End of task.
Equi join.
A join operation in which the join-condition has the form expression =
expression.
error page range.
A range of pages that are considered to be physically damaged. DB2
does not allow users to access any pages that fall within this range.
ESDS.
Entry sequenced data set.
ESMT.
External subsystem module table (in IMS).
EUR.
IBM European Standards.
exception table.
A table that holds rows that violate referential constraints or table
check constraints that the CHECK DATA utility finds.
exclusive lock.
A lock that prevents concurrently executing application processes from
reading or changing data. Contrast with share lock.
exit routine.
A user-written (or IBM-provided default) program that receives control
from DB2 to perform specific functions. Exit routines run as extensions
of DB2.
expression.
An operand or a collection of operators and operands that yields a
single value.
extended recovery facility (XRF).
A facility that minimizes the effect of failures in MVS, VTAM, the host
processor, or high-availability applications during sessions between
high-availability applications and designated terminals. This facility
provides an alternative subsystem to take over sessions from the
failing subsystem.
external function.
A function for which the body is written in a programming language
that takes scalar argument values and produces a scalar result for
each invocation. Contrast with sourced function, built-in function, and
SQL function.
++49 8106 994390
Seite 98 von 127
external routine.
A user-defined function or stored procedure that is based on code that
is written in an external programming language.
External subsystem module table (ESMT).
The name of the external subsystem module table, which specifies
which attachment modules must be loaded by IMS.
F
fallback.
The process of returning to a previous release of DB2 after attempting
or completing migration to a current release.
field procedure.
A user-written exit routine that is designed to receive a single value
and transform (encode or decode) it in any way the user can specify.
filter factor.
A number between zero and one that estimates the proportion of rows
in a table for which a predicate is true.
fixed-length string.
A character or graphic string whose length is specified and cannot be
changed. Contrast with varying-length string.
foreign key.
A column or set of columns in a dependent table of a constraint
relationship. The key must have the same number of columns, with the
same descriptions, as the primary key of the parent table Each foreign
key value must either match a parent key value in the related parent
table or be null.
forward log recovery.
The third phase of restart processing during which DB2 processes the
log in a forward direction to apply all REDO log records.
free space. T
he total amount of unused space in a page; that is, the space that is
not used to store records or control information is free space.
full outer join.
The result of a join operation that includes the matched rows of both
tables that are being joined and preserves the unmatched rows of both
tables. See also join.
function.
A mapping, embodied as a program (the function body), invocable by
means of zero or more input values (arguments), to a single value (the
result). See also column function and scalar function. Functions can
be user-defined, built-in, or generated by DB2. (See built-in function,
cast function, external function, sourced function, SQL function, and
user-defined function.)
++49 8106 994390
Seite 99 von 127
G
GB.
Gigabyte (1 073 741 824 bytes).
GBP.
Group buffer pool.
generalized trace facility (GTF).
An MVS service program that records significant system events such
as I/O interrupts, SVC interrupts, program interrupts, or external
interrupts.
generic resource name.
A name that VTAM uses to represent several application programs
that provide the same function in order to handle session distribution
and balancing in a Sysplex environment.
getpage.
An operation in which DB2 accesses a data page.
GIMSMP.
The load module name for the System Modification
Program/Extended, a basic tool for installing, changing, and controlling
changes to programming systems.
graphic string.
A sequence of DBCS characters.
gross lock.
The shared, update,orexclusive mode locks on a table, partition, or
table space.
group buffer pool (GBP).
A coupling facility cache structure that is used by a data sharing group
to cache data and to ensure that the data is consistent for all
members.
GTF.
Generalized trace facility.
H
help panel.
A screen of information presenting tutorial text to assist a user at the
terminal.
hiperspace.
A range of up to 2 GB of contiguous virtual storage addresses that a
program can use as a buffer. Like a data space, a hiperspace can hold
user data; it does not contain common areas or system data. Unlike an
address space or a data space, data in a hiperspace is not directly
addressable. To manipulate data in a hiperspace, bring the data into
the address space in 4-KB blocks.
home address space.
The area of storage that MVS currently recognizes as dispatched.
host language.
A programming language in which you can embed SQL statements.
host program.
An application program that is written in a host language and that
contains embedded SQL statements.
host structure.
In an application program, a structure that is referenced by embedded
SQL statements.
host variable.
In an application program, an application variable that is referenced by
embedded SQL statements.
HSM.
Hierarchical storage manager.
++49 8106 994390
Seite 100 von 127
I
ICF.
Integrated catalog facility.
IDCAMS.
An IBM program that is used to process access method services
commands. It can be invoked as a job or jobstep, from a TSO terminal,
or from within a user’s application program.
IDCAMS LISTCAT.
A facility for obtaining information that is contained in the access
method services catalog.
identify.
A request that an attachment service program in an address space
that is separate from DB2 issues via the MVS subsystem interface to
inform DB2 of its existence and to initiate the process of becoming
connected to DB2.
identity column.
A column that provides a way for DB2 to automatically generate a
numeric value for each row. The generated values are unique if
cycling is not used. Identity columns are defined with the AS
IDENTITY clause. Uniqueness of values can be ensured by defining a
single-column unique index using the identity column. A table can
have no more than one identity column.
IFCID.
Instrumentation facility component identifier.
IFI.
Instrumentation facility interface.
IFI call.
An invocation of the instrumentation facility interface (IFI) by means of
one of its defined functions.
IFP.
IMS Fast Path.
image copy.
An exact reproduction of all or part of a table space. DB2 provides
utility programs to make full image copies (to copy the entire table
space) or incremental image copies (to copy only those pages that
have been modified since the last image copy).
IMS.
Information Management System.
IMS attachment facility.
A DB2 subcomponent that uses MVS subsystem interface (SSI)
protocols and cross-memory linkage to process requests from IMS to
DB2 and to coordinate resource commitment.
IMS DB.
Information Management System Database.
IMS TM.
Information Management System Transaction Manager.
in-abort.
A status of a unit of recovery. If DB2 fails after a unit of recovery
begins to be rolled back, but before the process is completed, DB2
continues to back out the changes during restart.
in-commit.
A status of a unit of recovery. If DB2 fails after beginning its phase 2
commit processing, it "knows," when restarted, that changes made to
data are consistent. Such units of recovery are termed in-commit.
independent.
An object (row, table, or table space) that is neither a parent nor a
dependent of another object.
index.
A set of pointers that are logically ordered by the values of a key.
Indexes can provide faster access to data and can enforce uniqueness
on the rows in a table.
index key.
The set of columns in a table that is used to determine the order of
index entries.
++49 8106 994390
Seite 101 von 127
index partition.
A VSAM data set that is contained within a partitioning index space.
index space.
A page set that is used to store the entries of one index.
indicator variable.
A variable that is used to represent the null value in an application
program. If the value for the selected column is null, a negative value
is placed in the indicator variable.
indoubt.
A status of a unit of recovery. If DB2 fails after it has finished its phase
1 commit processing and before it has started phase 2, only the
commit coordinator knows if an individual unit of recovery is to be
committed or rolled back. At emergency restart, if DB2 lacks the
information it needs to make this decision, the status of the unit of
recovery is indoubt until DB2 obtains this information from the
coordinator. More than one unit of recovery can be indoubt at restart.
indoubt resolution.
The process of resolving the status of an indoubt logical unit of work to
either the committed or the rollback state.
inflight.
A status of a unit of recovery. If DB2 fails before its unit of recovery
completes phase 1 of the commit process, it merely backs out the
updates of its unit of recovery at restart. These units of recovery are
termed inflight.
inner join.
The result of a join operation that includes only the matched rows of
both tables being joined. See also join.
inoperative package.
A package that cannot be used because one or more user-defined
functions or procedures that the package depends on were dropped.
Such a package must be explicitly rebound. Contrast with invalid
package.
install.
The process of preparing a DB2 subsystem to operate as an MVS
subsystem.
installation verification scenario.
A sequence of operations that exercises the main DB2 functions and
tests whether DB2 was correctly installed.
instrumentation facility component identifier (IFCID).
A value that names and identifies a trace record of an event that can
be traced. As a parameter on the START TRACE and MODIFY
TRACE commands, it specifies that the corresponding event is to be
traced.
instrumentation facility interface (IFI).
A programming interface that enables programs to obtain online trace
data about DB2, to submit DB2 commands, and to pass data to DB2.
Interactive System Productivity Facility (ISPF).
An IBM licensed program that provides interactive dialog services.
intermediate database server. The target of a request from a local application or a remote application
requester that is forwarded to another database server. In the DB2
environment, the remote request is forwarded transparently to another
database server if the object that is referenced by a three-part name
does not reference the local location.
internal resource lock manager (IRLM).
An MVS subsystem that DB2 uses to control communication and
database locking.
invalid package.
A package that depends on an object (other than a user-defined
function) that is dropped. Such a package is implicitly rebound on
invocation. Contrast with inoperative package.
++49 8106 994390
Seite 102 von 127
IRLM.
Internal resource lock manager.
ISO.
International Standards Organization.
isolation level.
The degree to which a unit of work is isolated from the updating
operations of other units of work. See also cursor stability, read
stability, repeatable read, and uncommitted read.
ISPF.
Interactive System Productivity Facility.
ISPF/PDF.
Interactive System Productivity Facility/Program Development Facility.
J
Japanese Industrial Standards Committee (JISC).
An organization that issues standards for coding character sets.
Java Archive (JAR).
A file format that is used for aggregating many files into a single file.
JCL.
Job control language.
JES.
MVS Job Entry Subsystem.
JIS.
Japanese Industrial Standard.
job control language (JCL).
A control language that is used to identify a job to an operating system
and to describe the job’s requirements.
Job Entry Subsystem (JES).
An IBM licensed program that receives jobs into the system and
processes all output data that is produced by the jobs.
join.
A relational operation that allows retrieval of data from two or more
tables based on matching column values. See also equijoin, full outer
join, inner join, left outer join, outer join, and right outer join.
K
KB.
Kilobyte (1024 bytes).
Kerberos.
A network authentication protocol that is designed to provide strong
authentication for client/server applications by using secret-key
cryptography.
Kerberos ticket.
A transparent application mechanism that transmits the identity of an
initiating principal to its target. A simple ticket contains the principal’s
identity, a session key, a timestamp, and other information, which is
sealed using the target’s secret key.
key.
A column or an ordered collection of columns identified in the
description of a table, index, or referential constraint.
key-sequenced data set (KSDS).
A VSAM file or data set whose records are loaded in key sequence
and controlled by an index.
KSDS.
Key-sequenced data set.
++49 8106 994390
Seite 103 von 127
L
labeled duration.
A number that represents a duration of years, months, days, hours,
minutes, seconds, or microseconds.
large object (LOB).
A sequence of bytes representing bit data, single-byte characters,
double-byte characters, or a mixture of single- and double-byte
characters. A LOB can be up to 2 GB.1 byte in length. See also BLOB,
CLOB, and DBCLOB.
latch.
A DB2 internal mechanism for controlling concurrent events or the use
of system resources.
LCID.
Log control interval definition.
LDS.
Linear data set.
leaf page.
A page that contains pairs of keys and RIDs and that points to actual
data. Contrast with nonleaf page.
left outer join.
tables that are being joined, and that preserves the unmatched rows of
the first table. See also join.
linear data set (LDS).
A VSAM data set that contains data but no control information. A linear
data set can be accessed as a byte-addressable string in virtual
storage.
linkage editor.
A computer program for creating load modules from one or more
object modules or load modules by resolving cross references among
the modules and, if necessary, adjusting addresses.
link-edit.
The action of creating a loadable computer program using a linkage
editor.
L-lock.
Logical lock.
load module.
A program unit that is suitable for loading into main storage for
execution. The output of a linkage editor.
LOB.
Large object.
LOB lock.
A lock on a LOB value.
LOB table space.
A table space that contains all the data for a particular LOB column in
the related base table.
local subsystem.
The unique RDBMS to which the user or application program is
directly connected (in the case of DB2, by one of the DB2 attachment
facilities).
lock.
A means of controlling concurrent events or access to data. DB2
locking is performed by the IRLM.
lock duration.
The interval over which a DB2 lock is held.
lock escalation.
The promotion of a lock from a row, page, or LOB lock to a table
space lock because the number of page locks that are concurrently
held on a given resource exceeds a preset limit.
locking.
The process by which the integrity of data is ensured. Locking
prevents concurrent users from accessing inconsistent data.
lock mode.
A representation for the type of access that concurrently running
programs can have to a resource that a DB2 lock is holding.
The resource that is controlled by a DB2 lock.
lock object.
++49 8106 994390
Seite 104 von 127
lock promotion.
The process of changing the size or mode of a DB2 lock to a higher
level.
lock size.
The amount of data controlled by a DB2 lock on table data; the value
can be a row, a page, a LOB, a partition, a table, or a table space.
log.
A collection of records that describe the events that occur during DB2
execution and that indicate their sequence. The information thus
recorded is used for recovery in the event of a failure during DB2
execution.
logical index partition.
The set of all keys that reference the same data partition.
logical lock (L-lock).
The lock type that transactions use to control intra- and inter-DB2 data
concurrency between transactions. Contrast with physical lock (Plock).
logical recovery pending (LRECP).
The state in which the data and the index keys that reference the data
are inconsistent.
logical unit.
An access point through which an application program accesses the
SNA network in order to communicate with another application
program.
logical unit of work (LUW).
The processing that a program performs between synchronization
points.
logical unit of work identifier (LUWID).
A name that uniquely identifies a thread within a network. This name
consists of a fully-qualified LU network name, an LUW instance
number, and an LUW sequence number.
log initialization.
The first phase of restart processing during which DB2 attempts to
locate the current end of the log.
log record sequence number (LRSN).
A number that DB2 generates and associates with each log record.
DB2 also uses the LRSN for page versioning. The LRSNs that a
particular DB2 data sharing group generates form a strictly increasing
sequence for each DB2 log and a strictly increasing sequence for each
page across the DB2 group.
log truncation.
A process by which an explicit starting RBA is established. This RBA
is the point at which the next byte of log data is to be written.
long string.
A string whose actual length, or a varying-length string whose
maximum length, is greater than 255 bytes or 127 double-byte
characters. Any LOB column, LOB host variable, or expression that
evaluates to a LOB is considered a long string.
LRECP.
Logical recovery pending.
LRH.
Log record header.
LRSN.
Log record sequence number.
LUW.
Logical unit of work.
LUWID.
#Logical unit of work identifier.
++49 8106 994390
Seite 105 von 127
M
materialize.
(1)
(2)
The process of putting rows from a view or nested table
expression into a work file for additional processing by a query.
The placement of a LOB value into contiguous storage.
Because LOB values can be very large, DB2 avoids
materializing LOB data until doing so becomes absolutely
necessary.
MB.
Megabyte (1 048 576 bytes).
migration.
The process of converting a DB2 subsystem with a previous release of
DB2 to an updated or current release. In this process, you can acquire
the functions of the updated or current release without losing the data
you created on the previous release.
mixed data string.
A character string that can contain both single-byte and double-byte
characters.
MLPA.
Modified link pack area.
MODEENT.
A VTAM macro instruction that associates a logon mode name with a
set of parameters representing session protocols. A set of MODEENT
macro instructions defines a logon mode table.
mode name.
A VTAM name for the collection of physical and logical characteristics
and attributes of a session.
MPP.
Message processing program (in IMS).
MSS.
Mass Storage Subsystem.
MTO.
Master terminal operator.
multibyte character set (MBCS).
A character set that represents single characters with more than a
single byte. Contrast with single-byte character set and double-byte
character set. See also Unicode.
multisite update.
Distributed relational database processing in which data is updated in
more than one location within a single unit of work.
must-complete.
A state during DB2 processing in which the entire operation must be
completed to maintain data integrity.
MVS.
Multiple Virtual Storage.
MVS/ESA.
Multiple Virtual Storage/Enterprise Systems Architecture.
++49 8106 994390
Seite 106 von 127
N
nested table expression.
A fullselect in a FROM clause (surrounded by parentheses).
network identifier (NID).
The network ID that is assigned by IMS or CICS, or if the connection
type is RRSAF, the OS/390 RRS unit of recovery ID (URID).
NID.
Network ID.
nonleaf page.
A page that contains keys and page numbers of other pages in the
index (either leaf or nonleaf pages). Nonleaf pages never point to
actual data.
nonpartitioning index.
Any index that is not a partitioning index.
NPI
see nonpartitioning index.
NRE.
Network recovery element.
NUL.
In C, a single character that denotes the end of the string.
null.
A special value that indicates the absence of information.
NUL-terminated host variable. A varying-length host variable in which the end of the data is indicated
by the presence of a NUL terminator.
NUL terminator.
In C, the value that indicates the end of a string. For character strings,
the NUL terminator is X'00'.
O
OASN (origin application schedule number).
In IMS, a 4-byte number that is assigned sequentially to each IMS
schedule since the last cold start of IMS. The OASN is used as an
identifier for a unit of work. In an 8-byte format, the first 4 bytes contain
the schedule number and the last 4 bytes contain the number of IMS
sync points (commit points) during the current schedule. The OASN is
part of the NID for an IMS connection.
OBID.
Data object identifier.
originating task.
In a parallel group, the primary agent that receives data from other
execution units (referred to as parallel tasks) that are executing
portions of the query in parallel.
OS/390.
Operating System/390.
OS/390 OpenEdition Distributed Computing Environment (OS/390 OE DCE).
A set of technologies that are provided by the Open Software
Foundation to implement distributed computing.
outer join.
tables that are being joined and preserves some or all of the
unmatched rows of the tables that are being joined. See also join.
++49 8106 994390
Seite 107 von 127
P
package.
An object containing a set of SQL statements that have been statically
bound and that is available for processing. A package is sometimes
also called an application package.
package list.
An ordered list of package names that may be used to extend an
application plan.
package name.
The name of an object that is created by a BIND PACKAGE or
REBIND PACKAGE command. The object is a bound version of a
database request module (DBRM). The name consists of a location
name, a collection ID, a package ID, and a version ID.
page.
A unit of storage within a table space (4 KB, 8 KB, 16 KB, or 32 KB) or
index space (4 KB). In a table space, a page contains one or more
rows of a table. In a LOB table space, a LOB value can span more
than one page, but no more than one LOB value is stored on a page.
page set.
Another way to refer to a table space or index space. Each page set
consists of a collection of VSAM data sets.
Parallel group.
A set of consecutive operations that executed in parallel and that have
the same number of parallel tasks.
parallel I/O processing.
A form of I/O processing in which DB2 initiates multiple concurrent
requests for a single user query and performs I/O processing
concurrently (in parallel) on multiple data partitions.
Parallel Sysplex.
A set of MVS systems that communicate and cooperate with each
other through certain multisystem hardware components and software
services to process customer workloads.
parallel task.
The execution unit that is dynamically created to process a query in
parallel. It is implemented by an MVS service request block.
parent row.
A row whose primary key value is the foreign key value of a
dependent row.
parent table.
A table whose primary key is referenced by the foreign key of a
dependent table.
parent table space.
A table space that contains a parent table. A table space containing a
dependent of that table is a dependent table space.
participant.
An entity other than the commit coordinator that takes part in the
commit process. The term participant is synonymous with agent in
SNA.
partition.
A portion of a page set. Each partition corresponds to a single,
independently extendable data set. Partitions can be extended to a
maximum size of 1, 2, or 4 GB, depending on the number of partitions
in the partitioned page set. All partitions of a given page set have the
same maximum size.
partitioning index.
The index that represents the partitioning values
partitioned data set (PDS).
A data set in direct access storage that is divided into partitions, which
are called members. Each partition can contain a program, part of a
program, or data. The term partitioned data set is synonymous with
program library.
partitioned page set.
A partitioned table space or an index space. Header pages, space
map pages, data pages, and index pages reference data only within
the scope of the partition.
++49 8106 994390
Seite 108 von 127
partitioned table space.
A table space that is subdivided into parts (based on index key
range), each of which can be processed independently by utilities.
partner logical unit.
An access point in the SNA network that is connected to the local DB2
subsystem by way of a VTAM conversation.
PCT.
Program control table (in CICS).
PDS.
Partitioned data set.
PI
see partitioning index.
piece.
A data set of a nonpartitioned page set.
physical consistency.
The state of a page that is not in a partially changed state.
plan.
See application plan.
plan allocation.
The process of allocating DB2 resources to a plan in preparation for
execution.
plan name.
The name of an application plan.
plan segmentation.
The dividing of each plan into sections. When a section is needed, it is
independently brought into the EDM pool.
PLT.
Program list table (in CICS).
point of consistency.
A time when all recoverable data that an application accesses is
consistent with other data. The term point of consistency is
synonymous with sync point or commit point.
postponed abort UR.
A unit of recovery that was inflight or in-abort, was interrupted by
system failure or cancellation, and did not complete backout during
restart.
PPT.
(1) Processing program table (in CICS).
(2) Program properties table (in MVS).
precompilation.
A processing of application programs containing SQL statements that
takes place before compilation. SQL statements are replaced with
statements that are recognized by the host language compiler. Output
from this precompilation includes source code that can be submitted to
the compiler and the database request module (DBRM) that is input to
the bind process.
predicate.
An element of a search condition that expresses or implies a
comparison operation.
prefix.
A code at the beginning of a message or record.
primary authorization ID.
The authorization ID used to identify the application process to DB2.
primary index.
An index that enforces the uniqueness of a primary key.
primary key.
In a relational database, a unique, nonnull key that is part of the
definition of a table. A table cannot be defined as a parent unless it
has a unique key or primary key.
principal.
An entity that can communicate securely with another entity. In
Kerberos, principals are represented as entries in the Kerberos
registry database and include users, servers, computers, and others.
principal name.
The name by which a principal is known to the DCE security services.
private connection.
A communications connection that is specific to DB2.
++49 8106 994390
Seite 109 von 127
private protocol access.
A method of accessing distributed data by which you can direct a
query to another DB2 system. Contrast with DRDA access.
private protocol connection.
A DB2 private connection of the application process. See also private
connection.
privilege.
The capability of performing a specific function, sometimes on a
specific object. The term includes:
explicit privileges, which have names and are held as the result of
SQL GRANT and REVOKE statements. For example, the SELECT
privilege.
implicit privileges, which accompany the ownership of an object,
such as the privilege to drop a synonym one owns, or the holding of an
authority, such as the privilege of SYSADM authority to terminate any
utility job.
privilege set.
For the installation SYSADM ID, the set of all possible privileges. For
any other authorization ID, the set of all privileges that are recorded for
that ID in the DB2 catalog.
process.
In DB2, the unit to which DB2 allocates resources and locks.
Sometimes called an application process, a process involves the
execution of one or more programs. The execution of an SQL
statement is always associated with some process. The means of
initiating and terminating a process are dependent on the environment.
program.
A single compilable collection of executable statements in a
programming language.
program temporary fix (PTF).
A solution or bypass of a problem that is diagnosed as a result of a
defect in a current unaltered release of a licensed program. An
authorized program analysis report (APAR) fix is corrective service for
an existing problem. A PTF is preventive service for problems that
might be encountered by other users of the product. A PTF is
temporary, because a permanent fix is usually not incorporated into
the product until its next release.
protected conversation.
A VTAM conversation that supports two-phase commit flows.
PTF.
Program temporary fix.
++49 8106 994390
Seite 110 von 127
Q
QMF.
Query Management Facility.
QSAM.
Queued sequential access method.
query block.
The part of a query that is represented by one of the FROM clauses.
Each FROM clause can have multiple query blocks, depending on
DB2’s internal processing of the query.
query CP parallelism.
Parallel execution of a single query, which is accomplished by using
multiple tasks. See also Sysplex query parallelism.
query I/O parallelism.
Parallel access of data, which is accomplished by triggering multiple
I/O requests within a single query.
queued sequential access method (QSAM).
An extended version of the basic sequential access method (BSAM).
When this method is used, a queue of data blocks is formed. Input
data blocks await processing, and output data blocks await transfer to
auxiliary storage or to an output device.
++49 8106 994390
Seite 111 von 127
R
RACF.
Resource Access Control Facility, which is a component of the
SecureWay Security Server for OS/390.
RAMAC.
IBM family of enterprise disk storage system products.
RBA.
Relative byte address.
RCT.
Resource control table (in CICS attachment facility).
RDB.
Relational database.
RDBMS.
Relational database management system.
RDBNAM.
Relational database name.
RDF.
Record definition field.
read stability (RS).
An isolation level that is similar to repeatable read but does not
completely isolate an application process from all other concurrently
executing application processes. Under level RS, an application that
issues the same query more than once might read additional rows that
were inserted and committed by a concurrently executing application
process.
rebind.
The creation of a new application plan for an application program that
has been bound previously. If, for example, you have added an index
for a table that your application accesses, you must rebind the
application in order to take advantage of that index.
record.
The storage representation of a row or other data.
record identifier (RID).
A unique identifier that DB2 uses internally to identify a row of data in
a table stored as a record. Compare with row ID.
record identifier (RID) pool.
An area of main storage above the 16-MB line that is reserved for
sorting record identifiers during list prefetch processing.
recovery.
The process of rebuilding databases after a system failure.
recovery log.
A collection of records that describes the events that occur during DB2
execution and indicates their sequence. The recorded information is
used for recovery in the event of a failure during DB2 execution.
recovery pending (RECP).
A condition that prevents SQL access to a table space that needs to
be recovered.
recovery token.
An identifier for an element that is used in recovery (for example, NID
or URID).
RECP.
Recovery pending.
redo.
A state of a unit of recovery that indicates that changes are to be
reapplied to the DASD media to ensure data integrity.
referential constraint.
The requirement that nonnull values of a designated foreign key are
valid only if they equal values of the primary key of a designated table.
referential integrity.
The state of a database in which all values of all foreign keys are valid.
Maintaining referential integrity requires the enforcement of referential
constraints on all operations that change the data in a table upon
which the referential constraints are defined.
++49 8106 994390
Seite 112 von 127
referential structure.
A set of tables and relationships that includes at least one table and,
for every table in the set, all the relationships in which that table
participates and all the tables to which it is related.
registry.
See registry database.
registry database.
A database of security information about principals, groups,
organizations, accounts, and security policies.
relational database (RDB).
A database that can be perceived as a set of tables and manipulated
in accordance with the relational model of data.
relational database management system (RDBMS).
A collection of hardware and software that organizes and provides
access to a relational database.
relational database name (RDBNAM).
A unique identifier for an RDBMS within a network. In DB2, this must
be the value in the LOCATION column of table SYSIBM.LOCATIONS
in the CDB. DB2 publications refer to the name of another RDBMS as
a LOCATION value or a location name.
relationship.
A defined connection between the rows of a table or the rows of two
tables. A relationship is the internal representation of a referential
constraint.
relative byte address (RBA).
The offset of a data record or control interval from the beginning of the
storage space that is allocated to the data set or file to which it
belongs.
remigration.
The process of returning to a current release of DB2 following a
fallback to a previous release. This procedure constitutes another
migration process.
remote attach request.
A request by a remote location to attach to the local DB2 subsystem.
Specifically, the request that is sent is an SNA Function Management
Header 5.
remote subsystem.
Any RDBMS, except the local subsystem, with which the user or
application can communicate. The subsystem need not be remote in
any physical sense, and might even operate on the same processor
under the same MVS system.
reoptimization.
The DB2 process of reconsidering the access path of an SQL
statement at run time; during reoptimization, DB2 uses the values of
host variables, parameter markers, or special registers.
REORG pending (REORP).
A condition that restricts SQL access and most utility access to an
object that must be reorganized.
REORP.
REORG pending.
repeatable read (RR).
The isolation level that provides maximum protection from other
executing application programs. When an application program
executes with repeatable read protection, rows referenced by the
program cannot be changed by other programs until the program
reaches a commit point.
request commit.
The vote that is submitted to the prepare phase if the participant has
modified data and is prepared to commit or roll back.
requester.
The source of a request to access data at a remote server. In the DB2
environment, the requester function is provided by the distributed data
facility.
resource allocation.
The part of plan allocation that deals specifically with the database
resources.
++49 8106 994390
Seite 113 von 127
resource control table (RCT).
A construct of the CICS attachment facility, created by site-provided
macro parameters, that defines authorization and access attributes for
transactions or transaction groups.
resource definition online.
A CICS feature that you use to define CICS resources online without
assembling tables.
A portion of DB2 code that prevents dynamic manipulative SQL
statements from exceeding specified time limits. The resource limit
facility is sometimes called the governor.
resource limit facility (RLF).
resource limit specification table.
A site-defined table that specifies the limits to be enforced by the
resource limit facility.
restart pending (RESTP).
A restrictive state of a page set or partition that indicates that restart
(backout) work needs to be performed on the object. All access to the
page set or partition is denied except for access by the: v RECOVER
POSTPONED command v Automatic online backout (which DB2
invokes after restart if the system parameter LBACKOUT=AUTO)
result table.
The set of rows that are specified by a SELECT statement.
RID.
Record identifier.
RID pool.
Record identifier pool.
right outer join.
tables that are being joined and preserves the unmatched rows of the
second join operand. See also join.
RLF.
Resource limit facility.
RMID.
Resource manager identifier.
RO.
Read-only access.
rollback.
The process of restoring data changed by SQL statements to the state
at its last commit point. All locks are freed. Contrast with commit.
root page.
The page of an index page set that follows the first index space map
page. A root page is the highest level (or the beginning point) of the
index.
routine.
A term that refers to either a user-defined function or a stored
procedure.
row.
The horizontal component of a table. A row consists of a sequence of
values, one for each column of the table.
ROWID.
Row identifier.
row identifier (ROWID).
A value that uniquely identifies a row. This value is stored with the row
and never changes.
row lock.
A lock on a single row of data.
RRE.
Residual recovery entry (in IMS).
RRSAF.
Recoverable Resource Manager Services attachment facility. RRSAF
is a DB2 subcomponent that uses OS/390 Transaction Management
and Recoverable Resource Manager Services to coordinate resource
commitment between DB2 and all other resource managers that also
use OS/390 RRS in an OS/390 system.
RS.
Read stability.
RTT.
Resource translation table.
++49 8106 994390
Seite 114 von 127
S
savepoint.
A named entity that represents the state of data and schemas at a
particular point in time within a unit of work. SQL statements exist to
set a savepoint, release a savepoint, and restore data and schemas to
the state that the savepoint represents. The restoration of data and
schemas to a savepoint is usually referred to as rolling back to a
savepoint.
scalar function.
An SQL operation that produces a single value from another value and
is expressed as a function name, followed by a list of arguments that
are enclosed in parentheses. Contrast with column function.
schema.
A logical grouping for user-defined functions, distinct types, triggers,
and stored procedures. When an object of one of these types is
created, it is assigned to one schema, which is determined by the
name of the object. For example, the following statement creates a
distinct type T in schema C: CREATEDISTINCT TYPEC.T ...
SDWA.
S ystem diagnostic work area.
search condition.
A criterion for selecting rows from a table. A search condition consists
of one or more predicates.
secondary authorization ID.
An authorization ID that has been associated with a primary
authorization ID by an authorization exit routine.
section.
The segment of a plan or package that contains the executable
structures for a single SQL statement. For most SQL statements, one
section in the plan exists for each SQL statement in the source
program. However, for cursor-related statements, the DECLARE,
OPEN, FETCH, and CLOSE statements reference the same section
because, they each refer to the SELECT statement that is named in
the DECLARE CURSOR statement. SQL statements such as
COMMIT, ROLLBACK, and some SET statements do not use a
section.
segmented table space.
A table space that is divided into equal-sized groups of pages called
segments. Segments are assigned to tables so that rows of different
tables are never stored in the same segment.
self-referencing constraint.
A referential constraint that defines a relationship in which a table is a
dependent of itself.
self-referencing table.
A table with a self-referencing constraint.
sequential data set.
A non-DB2 data set whose records are organized on the basis of their
successive physical positions, such as on magnetic tape. Several of
the DB2 database utilities require sequential data sets.
sequential prefetch.
A mechanism that triggers consecutive asynchronous I/O operations.
Pages are fetched before they are required, and several pages are
read with a single I/O operation.
server.
The target of a request from a remote requester. In the DB2
environment, the server function is provided by the distributed data
facility, which is used to access DB2 data from remote applications.
service class.
An eight-character identifier that is used by MVS Workload Manager to
associate customer performance goals with a particular DDF thread or
stored procedure. A service class is also used to classify work on
parallelism assistants.
session.
A link between two nodes in a VTAM network.
++49 8106 994390
Seite 115 von 127
session protocols.
The available set of SNA communication requests and responses.
share lock.
A lock that prevents concurrently executing application processes
from changing data, but not from reading data. Contrast with exclusive
lock.
short string.
A string whose actual length, or a varying-length string whose
maximum length, is 255 bytes (or 127 double-byte characters) or less.
Regardless of length, a LOB string is not a short string.
sign-on.
A request that is made on behalf of an individual CICS or IMS
application process by an attachment facility to enable DB2 to verify
that it is authorized to use DB2 resources.
simple page set.
A nonpartitioned page set. A simple page set initially consists of a
single data set (page set piece). If and when that data set is extended
to 2 GB, another data set is created, and so on up to a total of 32 data
sets. DB2 considers the data sets to be a single contiguous linear
address space containing a maximum of 64 GB. Data is stored in the
next available location within this address space without regard to any
partitioning scheme.
simple table space.
A table space that is neither partitioned nor segmented.
single-byte character set (SBCS).
A set of characters in which each character is represented by a single
byte. Contrast with double-byte character set or multibyte character
set.
SMF.
System management facility.
SMP/E.
System Modification Program/Extended.
SMS.
Storage Management Subsystem.
SNA.
Systems Network Architecture.
SNA network. T
he part of a network that conforms to the formats and protocols of
Systems Network Architecture (SNA).
sourced function.
A function that is implemented by another built-in or user-defined
function that is already known to the database manager. This function
can be a scalar function or a column (aggregating) function; it returns
a single value from a set of values (for example, MAX or AVG).
Contrast with built-in function, external function, and SQL function.
special register.
A storage area that DB2 defines for an application process to use for
storing information that can be referenced in SQL statements.
Examples of special registers are USER and CURRENT DATE.
SPUFI.
SQL Processor Using File Input.
SQL.
Structured Query Language.
SQL authorization ID (SQL ID). The authorization ID that is used for checking dynamic SQL
statements in some situations.
SQLCA.
SQL communication area.
SQL communication area (SQLCA).
A structure that is used to provide an application program with
information about the execution of its SQL statements.
SQLDA.
SQL descriptor area.
SQL descriptor area (SQLDA). A structure that describes input variables, output variables, or the
columns of a result table.
++49 8106 994390
Seite 116 von 127
SQL/DS.
Structured Query Language/Data System. This product is now
obsolete and has been replaced by DB2/UDB for VSE & VM.
SQL function.
A user-defined function in which the CREATE FUNCTION statement
contains the source code. The source code is a single SQL expression
that evaluates to a single value. The SQL user-defined function can
return only one parameter.
SQL processing conversation. Any conversation that requires access of DB2 data, either through an
application or by dynamic query requests.
SQL Processor Using File Input (SPUFI).
SQL Processor Using File Input. A facility of the TSO attachment
subcomponent that enables the DB2I user to execute SQL statements
without embedding them in an application program.
SQL routine.
A user-defined function or stored procedure that is based on code that
is written in SQL.
SSI.
Subsystem interface (in MVS).
SSM.
Subsystem member.
stand-alone.
An attribute of a program that means it is capable of executing
separately from DB2, without using DB2 services.
star join.
A method of joining a dimension column of a fact table to the key
column of the corresponding dimension table. See also join,
dimension, and star schema.
star schema.
The combination of a fact table (which contains most of the data) and
a number of dimension tables. See also star join, dimension, and
dimension table.
For a dynamic SQL statement, the character string form of the
statement.
statement string.
static SQL.
SQL statements, embedded within a program, that are prepared
during the program preparation process (before the program is
executed). After being prepared, the SQL statement does not change
(although values of host variables that are specified by the statement
might change).
storage group.
A named set of disks on which DB2 data can be stored.
stored procedure.
A user-written application program that can be invoked through the
use of the SQL CALL statement.
string.
See character string or graphic string.
Structured Query Language (SQL).
A standardized language for defining and manipulating data in a
relational database.
subcomponent.
A group of closely related DB2 modules that work together to provide
a general function.
subpage.
The unit into which a physical index page can be divided.
subquery.
A SELECT statement within the WHERE or HAVING clause of another
SQL statement; a nested SQL statement.
subselect.
That form of a query that does not include ORDER BY clause,
UPDATE clause, or UNION operators.
subsystem.
A distinct instance of a relational database management system
(RDBMS).
++49 8106 994390
Seite 117 von 127
sync point.
See commit point.
synonym.
In SQL, an alternative name for a table or view. Synonyms can be
used only to refer to objects at the subsystem in which the synonym is
defined.
Sysplex.
See Parallel Sysplex.
Sysplex query parallelism.
Parallel execution of a single query that is accomplished by using
multiple tasks on more than one DB2 subsystem. See also query CP
parallelism.
system administrator.
The person at a computer installation who designs, controls, and
manages the use of the computer system.
system agent.
A work request that DB2 creates internally such as prefetch
processing, deferred writes, and service tasks.
system conversation.
The conversation that two DB2 subsystems must establish to process
system messages before any distributed processing can begin.
system diagnostic work area (SDWA).
The data that is recorded in a SYS1.LOGREC entry that describes a
program or hardware error.
system-directed connection.
A connection that an RDBMS manages by processing SQL statements
with three-part names.
System Modification Program/Extended (SMP/E).
A tool for making software changes in programming systems (such as
DB2) and for controlling those changes.
Systems Network Architecture (SNA).
The description of the logical structure, formats, protocols, and
operational sequences for transmitting information through and
controlling the configuration and operation of networks.
SYS1.DUMPxx data set.
A data set that contains a system dump.
SYS1.LOGREC.
A service aid that contains important information about program and
hardware errors.
++49 8106 994390
Seite 118 von 127
T
table.
A named data object consisting of a specific number of columns and
some number of unordered rows. See also base table or temporary
table.
table check constraint.
A user-defined constraint that specifies the values that specific
columns of a base table can contain.
table function.
A function that receives a set of arguments and returns a table to the
SQL statement that references the function. A table function can be
referenced only in the FROM clause of a subselect.
table space.
A page set that is used to store the records in one or more tables.
table space set.
A set of table spaces and partitions that should be recovered together
for one of these reasons: v Each of them contains a table that is a
parent or descendent of a table in one of the others. v The set
contains a base table and associated auxiliary tables. A table space
set can contain both types of relationships.
task control block (TCB).
A control block that is used to communicate information about tasks
within an address space that are connected to DB2. An address space
can support many task connections (as many as one per task), but
only one address space connection. See also address space
connection.
TB.
Terabyte (1 099 511 627 776 bytes).
TCB.
Task control block (in MVS).
temporary table.
A table that holds temporary data; for example, temporary tables are
useful for holding or sorting intermediate results from queries that
contain a large number of rows. The two kinds of temporary table,
which are created by different SQL statements, are the created
temporary table and the declared temporary table. Contrast with result
table. See also created temporary table and declared temporary table.
thread.
The DB2 structure that describes an application’s connection, traces
its progress, processes resource functions, and delimits its
accessibility to DB2 resources and services. Most DB2 functions
execute under a thread structure. See also allied thread and database
access thread.
three-part name.
The full name of a table, view, or alias. It consists of a location name,
authorization ID, and an object name, separated by a period.
time.
A three-part value that designates a time of day in hours, minutes, and
seconds.
time duration.
A decimal integer that represents a number of hours, minutes, and
seconds.
timeout.
Abnormal termination of either the DB2 subsystem or of an
application because of the unavailability of resources. Installation
specifications are set to determine both the amount of time DB2 is to
wait for IRLM services after starting, and the amount of time IRLM is to
wait if a resource that an application requests is unavailable. If either
of these time specifications is exceeded, a timeout is declared.
Time-Sharing Option (TSO).
An option in MVS that provides interactive time sharing from remote
terminals.
++49 8106 994390
Seite 119 von 127
timestamp.
A seven-part value that consists of a date and time. The timestamp is
expressed in years, months, days, hours, minutes, seconds, and
microseconds.
TMP.
Terminal Monitor Program.
to-do.
A state of a unit of recovery that indicates that the unit of recovery’s
changes to recoverable DB2 resources are indoubt and must either be
applied to the DASD media or backed out, as determined by the
commit coordinator.
trace.
A DB2 facility that provides the ability to monitor and collect DB2
monitoring, auditing, performance, accounting, statistics, and
serviceability (global) data.
TSO.
Time-Sharing Option.
TSO attachment facility.
A DB2 facility consisting of the DSN command processor and DB2I.
Applications that are not written for the CICS or IMS environments can
run under the TSO attachment facility.
type 1 indexes.
Indexes that were created by a release of DB2 before DB2 Version 4
or that are specified as type 1 indexes in Version 4. Contrast with type
2 indexes. As of Version 7, type 1 indexes are no longer supported.
type 2 indexes.
Indexes that are created on a release of DB2 after Version 6 or that
are specified as type 2 indexes in Version 4 or later.
++49 8106 994390
Seite 120 von 127
U
UDF.
User-defined function.
UDT.
User-defined data type. In DB2 for OS/390 and z/OS, the term distinct
type is used instead of user-defined data type. See distinct type.
uncommitted read (UR).
The isolation level that allows an application to read uncommitted
data.
undo.
A state of a unit of recovery that indicates that the changes the unit of
recovery made to recoverable DB2 resources must be backed out.
Unicode.
A standard that parallels the ISO-10646 standard. Several
implementations of the Unicode standard exist, all of which have the
ability to represent a large percentage of the characters contained in
the many scripts that are used throughout the world.
union.
An SQL operation that combines the results of two select statements.
Unions are often used to merge lists of values that are obtained from
several tables.
unique constraint.
An SQL rule that no two values in a primary key, or in the key of a
unique index, can be the same.
unique index.
An index which ensures that no identical key values are stored in a
table.
unlock.
The act of releasing an object or system resource that was previously
locked and returning it to general availability within DB2.
UR.
Uncommitted read.
URE.
Unit of recovery element.
URID (unit of recovery ID).
The LOGRBA of the first log record for a unit of recovery. The URID
also appears in all subsequent log records for that unit of recovery.
user-defined data type (UDT).
See distinct type.
user-defined function (UDF).
A function that is defined to DB2 by using the CREATE FUNCTION
statement and that can be referenced thereafter in SQL statements. A
user-defined function can be an external function, a sourced function,
or an SQL function. Contrast with built-in function. UT. Utility-only
access.
++49 8106 994390
Seite 121 von 127
V
value.
The smallest unit of data that is manipulated in SQL.
varying-length string.
A character or graphic string whose length varies within set limits.
Contrast with fixed-length string.
version.
A member of a set of similar programs, DBRMs, packages, or LOBs.
A version of a program
is the source code that is produced by precompiling the program. The
program version is identified by the program name and a timestamp
(consistency token).
A version of a DBRM
is the DBRM that is produced by precompiling a program. The DBRM
version is identified by the same program name and timestamp as a
corresponding program version.
A version of a package
is the result of binding a DBRM within a particular database system.
The package version is identified by the same program name and
consistency token as the DBRM.
A version of a LOB
is a copy of a LOB value at a point in time. The version number for a
LOB is stored in the auxiliary index entry for the LOB.
view.
An alternative representation of data from one or more tables. A view
can include all or some of the columns that are contained in tables on
which it is defined.
Virtual Storage Access Method (VSAM).
An access method for direct or sequential processing of fixed- and
varying-length records on direct access devices. The records in a
VSAM data set or file can be organized in logical sequence by a key
field (key sequence), in the physical sequence in which they are
written on the data set or file (entry-sequence), or by relative-record
number.
Virtual Telecommunications Access Method (VTAM).
An IBM licensed program that controls communication and the flow of
data in an SNA network.
VSAM.
Virtual storage access method.
VTAM.
Virtual Telecommunication Access Method (in MVS).
++49 8106 994390
Seite 122 von 127
W
WLM application environment. An MVS Workload Manager attribute that is associated with one or
more stored procedures. The WLM application environment
determines the address space in which a given DB2 stored procedure
runs.
write to operator (WTO).
An optional user-coded service that allows a message to be written to
the system console operator informing the operator of errors and
unusual system conditions that may need to be corrected.
WTO.
Write to operator.
WTOR.
Write to operator (WTO) with reply.
X
XMS
Cross memory Service
XRF.
Extended recovery facility.
Z
z/OS.
An operating system for the eServer product line that supports 64-bit
real storage.
++49 8106 994390
Seite 123 von 127
11.6
Kapitel 11: Anhang – Literaturhinweis
Literaturhinweise
DB2 Universal Database Server for OS/390 und z/OS Version 7 “product
libraries”:
DB2 for OS/390 and z/OS
v DB2 Administration Guide, SC26-9931
v DB2 Application Programming and SQL Guide, SC26-9933
v DB2 Application Programming Guide and Reference for Java, SC26-9932
v DB2 Command Reference, SC26-9934
v DB2 Data Sharing: Planning and Administration, SC26-9935
v DB2 Data Sharing Quick Reference Card, SX26-3846
v DB2 Diagnosis Guide and Reference, LY37-3740
v DB2 Diagnostic Quick Reference Card, LY37-3741
v DB2 Image, Audio, and Video Extenders Administration and Programming, SC26-9947
v DB2 Installation Guide, GC26-9936
v DB2 Licensed Program Specifications, GC26-9938
v DB2 Messages and Codes, GC26-9940
v DB2 ODBC Guide and Reference, SC26-9941
v DB2 Reference for Remote DRDA Requesters and Servers, SC26-9942
v DB2 Reference Summary, SX26-3847
v DB2 Release Planning Guide, SC26-9943
v DB2 SQL Reference, SC26-9944
v DB2 Text Extender Administration and Programming, SC26-9948
v DB2 Utility Guide and Reference, SC26-9945
v DB2 What's New? GC26-9946
v DB2 XML Extender for OS/390 and z/OS Administration and Programming, SC27-9949
v DB2 Program Directory, GI10-8182
DB2 Administration Tool
v DB2 Administration Tool for OS/390 and z/OS User’s Guide, SC26-9847
DB2 Buffer Pool Tool
v DB2 Buffer Pool Tool for OS/390 and z/OS User’s Guide and Reference, SC26-9306
Net.Data ®
v Net.Data Library: Administration and Programming Guide for OS/390 and z/OS
v Net.Data Library: Language Environment Interface Reference
v Net.Data Library: Messages and Codes
v Net.Data Library: Reference
DB2 PM for OS/390
v DB2 PM for OS/390 Batch User's Guide, SC27-0857
v DB2 PM for OS/390 Command Reference, SC27-0855
v DB2 PM for OS/390 General Information, GC27-0852
v DB2 PM for OS/390 Installation and Customization, SC27-0860
v DB2 PM for OS/390 Messages, SC27-0856
v DB2 PM for OS/390 Online Monitor User's Guide, SC27-0858
v DB2 PM for OS/390 Report Reference Volume 1, SC27-0853
v DB2 PM for OS/390 Report Reference Volume 2, SC27-0854
v DB2 PM for OS/390 Using the Workstation Online Monitor, SC27-0859
v DB2 PM for OS/390 Program Directory, GI10-8223
Query Management Facility (QMF ™ )
v Query Management Facility: Developing QMF Applications, SC26-9579
v Query Management Facility: Getting Started with QMF on Windows, SC26-9582
v Query Management Facility: High Peformance Option User’s Guide for OS/390 and
z/OS, SC26-9581
v Query Management Facility: Installing and Managing QMF on OS/390 and z/OS, GC269575
++49 8106 994390
Seite 124 von 127
CICS Transaction Server for OS/390
v CICS Application Programming Guide, SC33-1687
v CICS External Interfaces Guide, SC33-1703
v CICS DB2 Guide, SC33-1939
v CICS Resource Definition Guide, SC33-1684
Data Facility Data Set Services
v Data Facility Data Set Services: User's Guide and Reference, SC26-4388
Database Design
v DB2 Design and Development Guide by Gabrielle Wiorkowski and David Kull, Addison
Wesley, ISBN 0-20158-049-7
v Handbook of Relational Database Design by C. Fleming and B. Von Halle, Addison
Wesley, ISBN 0-20111-434-8
v DB2 DB Design und physische Strukturen, Sepp Kraus, S.K. Consulting Services GmbH
DB2 Connect ®
v DB2 Connect Enterprise Edition for OS/2 and Windows: Quick Beginnings, GC09-2953
v DB2 Connect Enterprise Edition for UNIX: Quick Beginnings, GC09-2952
v DB2 Connect Personal Edition Quick Beginnings, GC09-2967
v DB2 Connect User's Guide, SC09-2954
DB2 Red Books
v DB2 UDB Server for OS/390 Version 6 Technical Update, SG24-6108-00
v DB2 for z/OS and OS/390 Version 7: Using the Utilities Suite, SG24-6289
v DB2 UDB Server for OS/390 and z/OS Version 7 Presentation Guide, SG24-6121
v DB2 Java Stored Procedures -- Learning by Example, SG24-5945
v DB2 UDB for OS/390 Version 7 Performance Topics, SG24-5351
v DB2 for OS/390 Capacity Planning, SG24-2244
v DB2 for OS/390 and Continuous Availability, SG24-5486
v Parallel Sysplex Configuration: Cookbook, SG24-2076-00
v DB2 for OS390 Application Design for High Performance, GG24-2233
v Storage Management with DB2 for OS/390, SG24-5462
DB2 Universal Database for UNIX, Windows, OS/2
v DB2 UDB Administration Guide: Planning, SC09-2946
v DB2 UDB Administration Guide: Implementation, SC09-2944
v DB2 UDB Administration Guide: Performance, SC09-2945
v DB2 UDB Administrative API Reference, SC09-2947
v DB2 UDB Application Building Guide, SC09-2948
v DB2 UDB Application Development Guide, SC09-2949
v DB2 UDB CLI Guide and Reference, SC09-2950
v DB2 UDB SQL Getting Started, SC09-2973
v DB2 UDB SQL Reference Volume 1, SC09-2974
v DB2 UDB SQL Reference Volume 2, SC09-2975
Distributed Relational Database Architecture ™
v Data Stream and OPA Reference, SC31-6806
v IBM SQL Reference, SC26-8416
v Open Group Technical Standard
v DRDA Version 2 Vol. 1: Distributed Relational Database Architecture (DRDA)
v DRDA Version 2 Vol. 2: Formatted Data Object Content Architecture
v DRDA Version 2 Vol. 3: Distributed Data Management Architecture
Parallel Sysplex ® Library
v OS/390 Parallel Sysplex Application Migration, GC28-1863
v System/390 MVS Sysplex Hardware and Software Migration, GC28-1862
v OS/390 Parallel Sysplex Overview: An Introduction to Data Sharing and Parallelism,
GC28-1860
v OS/390 Parallel Sysplex Systems Management, GC28-1861
++49 8106 994390
Seite 125 von 127
v OS/390 Parallel Sysplex Test Report, GC28-1963
Storage Management
v DFSMS/MVS Storage Management Library: Implementing System-Managed Storage,
SC26-3123
v MVS/ESA Storage Management Library: Leading a Storage Administration Group,
SC26-3126
v MVS/ESA Storage Management Library: Managing Data, SC26-3124
v MVS/ESA Storage Management Library: Managing Storage Groups, SC26-3125
System/370 ™ and System/390 ®
v ESA/370 Principles of Operation, SA22-7200
v ESA/390 Principles of Operation, SA22-7201
v System/390 MVS Sysplex Hardware and Software Migration, GC28-1210
++49 8106 994390
Seite 126 von 127
ENDE
++49 8106 994390
Seite 127 von 127

DB2-Optimierung und SQL-Performance

Transcription

Similar documents

Tcl - b.i.b.

MicroStation Tutorials - ETHZ D

Trigger

Traik I{tafes - Myakka Valley Ranches

Datenbankpraktikum SQL - Datenbanken und Informationssysteme

Datenbanken - DHBW Stuttgart

Inhalt - rh

Thema: 3D-Modellierung

SQL-Grundlagen - Abteilung Datenbanken Leipzig

CARDY für IRIS Business Card Reader

MySQL Tutorial als PDF

Protokoll der Fachschaftsratsitzung vom 14.04.2011

ID: 995 2007-06-06: für alle Familien Kindergeld in Höhe von 154