RDMs 8.4 SQL User`s Guide - Online Documentation

RDM Server 8.4
SQL User's Guide
Trademarks
Raima®, Raima Database Manager®, RDM®, RDM Embedded® and RDM Server® are trademarks of Raima Inc. and may be
registered in the United States of America and/or other countries. All other names referenced herein may be trademarks of
their respective owners.
This guide may contain links to third-party Web sites that are not under the control of Raima Inc. and Raima Inc. is not
responsible for the content on any linked site. If you access a third-party Web site mentioned in this guide, you do so at your
own risk. Inclusion of any links does not imply Raima Inc. endorsement or acceptance of the content of those third-party sites.
Contents
1. Introduction
    1.1 Overview of Supported SQL Features
    1.2 About This Manual
2. A Language for Describing a Language
3. A Simple Interactive SQL Scripting Utility
4. RDM Server SQL Language Elements
    4.1 Identifiers
    4.2 Reserved Words
    4.3 Constants
        Numeric Constants
        String Constants
        Date, Time, and Timestamp Constants
        System Constants
5. Administrating an SQL Database
    5.1 Device Administration
    5.2 User Administration
    5.3 Database and File Maintenance
        5.3.1 Database Initialization
        5.3.2 Extension Files
        5.3.3 Flushing Inmemory Database Files
        5.3.4 SQL Optimization Statistics
    5.4 Security Logging
    5.5 Miscellaneous Administrative Functions
        5.5.1 Login/Logout Procedures
        5.5.2 RDM Server Console Notifications
6. Defining a Database
    6.1 Create Database
    6.2 Create File
    6.3 Create Table
        6.3.1 Table Declarations
        6.3.2 Table Column Declarations
            Data Types
            Default and Auto-Incremented Values
            Column Constraints
        6.3.3 Table Constraint Declarations
        6.3.4 Primary and Foreign Key Relationships
        6.3.5 System-Assigned Primary Key Values
    6.4 Create Index
    6.5 Create Join
    6.6 Compiling an SQL DDL Specification
    6.7 Modifying an SQL DDL Specification
        6.7.1 Adding Tables, Indexes, Joins to a Database
        6.7.2 Dropping Tables and Indexes from a Database
        6.7.3 Altering Databases and Tables
        6.7.4 Schema Versions
    6.8 Example SQL DDL Specifications
        6.8.1 Sales and Inventory Databases
        6.8.2 Antiquarian Bookshop Database
        6.8.3 National Science Foundation Awards Database
    6.9 Database Instances
        6.9.1 Creating a Database Instance
        6.9.2 Using Database Instances
        6.9.3 Stored Procedures and Views
        6.9.4 Drop Database Instance
        6.9.5 Restrictions
7. Retrieving Data from a Database
    7.1 Simple Queries
    7.2 Conditional Row Retrieval
        7.2.1 Retrieving Data from a Range
        7.2.2 Retrieving Data from a List
        7.2.3 Retrieving Data by Wildcard Checking
        7.2.4 Retrieving Rows by Rowid
    7.3 Retrieving Data from Multiple Tables
        7.3.1 Old Style Join Specifications
            Inner Joins
            Outer Joins
            Correlation Names
            Column Aliases
        7.3.2 Extended Join Specifications
    7.4 Sorting the Rows of the Result Set
    7.5 Retrieving Computational Results
        7.5.1 Simple Expressions
        7.5.2 Built-in (Scalar) Functions
        7.5.3 Conditional Column Selection
        7.5.4 Formatting Column Expression Result Values
    7.6 Performing Aggregate (Grouped) Calculations
    7.7 String Expressions
    7.8 Nested Queries (Subqueries)
        7.8.1 Single-Value Subqueries
        7.8.2 Multi-Valued Subqueries
        7.8.3 Correlated Subqueries
        7.8.4 Existence Check Subqueries
    7.9 Using Temporary Tables to Hold Intermediate Results
    7.10 Other Select Statement Features
    7.11 Unions of Two or More Select Statements
        7.11.1 Specifying Unions
        7.11.2 Union Examples
8. Inserting, Updating, and Deleting Data in a Database
    8.1 Transactions
        8.1.1 Transaction Start
        8.1.2 Transaction Commit
        8.1.3 Transaction Savepoint
        8.1.4 Transaction Rollback
    8.2 Inserting Data
        8.2.1 Insert Values
        8.2.2 Insert from Select
        8.2.3 Importing Data into a Table
        8.2.4 Exporting Data from a Table
    8.3 Updating Data
    8.4 Deleting Data
9. Database Triggers
    9.1 Trigger Specification
    9.2 Trigger Execution
    9.3 Trigger Security
    9.4 Trigger Examples
    9.5 Accessing Trigger Definitions
10. Shared (Multi-User) Database Access
    10.1 Locking in SQL
        10.1.1 Row-Level Locking
        10.1.2 Table-Level Locking
        10.1.3 Lock Timeouts and Deadlock
    10.2 Transaction Modes
11. Stored Procedures and Views
    11.1 Stored Procedures
        11.1.1 Create a Stored Procedure
        11.1.2 Call (Execute) a Stored Procedure
    11.2 Views
        11.2.1 Create View
        11.2.2 Retrieving Data from a View
        11.2.3 Updateable Views
        11.2.4 Drop View
        11.2.5 Views and Database Security
12. SQL Database Access Security
    12.1 Command Access Privileges
        12.1.1 Grant Command Access Privileges
        12.1.2 Revoke Command Access Privileges
    12.2 Database Access Privileges
        12.2.1 Grant Table Access Privileges
        12.2.2 Revoke Table Access Privileges
13. Using SQL in a C Application Program
    13.1 Overview of the RDM Server SQL API
    13.2 Programming Guidelines
    13.3 ODBC API Usage Elements
        13.3.1 Header Files
        13.3.2 Data Types
        13.3.3 Use of Handles
        13.3.4 Buffer Arguments
    13.4 SQL C Application Development
        13.4.1 RDM Server SQL and ODBC
        13.4.2 Connecting to RDM Server
        13.4.3 Basic SQL Statement Processing
        13.4.4 Using Parameter Markers
        13.4.4 Premature Statement Termination
        13.4.5 Retrieving Date/Time Values
        13.4.6 Retrieving Decimal Values
        13.4.7 Retrieving Decimal Data
        13.4.8 Status and Error Handling
        13.4.9 Select Statement Processing
        13.4.10 Positioned Update and Delete
    13.5 Using Cursors and Bookmarks
        13.5.1 Using Cursors
            Rowset
            Types of Cursors
        13.5.2 Static Cursors
            Using Static Cursors
            Limitations on Static Cursors
        13.5.3 Using Bookmarks
            Activate a Bookmark
            Turn Off a Bookmark
            Retrieve a Bookmark
            Return to a Bookmark
        13.5.4 Retrieving Blob Data
14. Developing SQL Server Extensions
    14.1 User-Defined Functions (UDF)
        14.1.1 UDF Implementation
            UDF Module Header Files
            Function udfDescribeFcns
            SQL Data VALUE Container Description
            Function udfInit
            Function udfCheck
            Function udfFunc
            Function udfReset
            Function udfCleanup
        14.1.2 Using a UDF as a Trigger
        14.1.3 Invoking a UDF
            Calling an Aggregate UDF
            Calling a Scalar UDF
        14.1.4 UDF Support Library
    14.2 User-Defined Procedures
        14.2.1 UDP Implementation
            Function udpDescribeFcns
            Function ModInit
            Function udpInit
            Function udpCheck
            Function udpExecute
            Function udpMoreResults
            Function udpColData
            Function udpCleanup
            Function ModCleanup
        14.2.2 Calling a UDP
    14.3 Login or Logout UDP Example
    14.4 Transaction Triggers
        14.4.1 Transaction Trigger Registration
        14.4.2 Transaction Trigger Implementation
15. Query Optimization
    Overview of the Query Optimization Process
    Cost-Based Optimization
    Update Statistics
    Restriction Factors
    Table Access Methods
    Sequential File Scan
    Direct Access Retrieval
    Indexed Access Retrieval
    Index Scan
    Primary To Foreign Key Join
    Foreign To Primary Key Join
    Foreign Thru Indexed/Rowid Primary Key Predefined Join
    Optimizable Expressions
    How the Optimizer Determines the Access Plan
    Selecting Among Alternative Access Methods
    Selecting the Access Order
    Sorting and Grouping
    Returning the Number of Rows in a Table
    Select * From Table
    Query Construction Guidelines
    User Control Over Optimizer Behavior
    User-Specified Expression Restriction Factor
    User-Specified Index
    Optimizer Iteration Threshold (OptLimit)
    Enabling Automatic Insertion of Redundant Conditionals
    Checking Optimizer Results
    Retrieving the Execution Plan (SQLShowPlan)
    Using the SqlDebug Configuration Parameter
    Limitations
    Optimization of View References
    Merge-Scan Join Operation is Not Supported
    Subquery Transformation (Flattening) Unsupported
1. Introduction
The RDM Server SQL User's Guide shows application developers how to build C applications that
use the RDM Server SQL database language. Developers who have SQL experience will find much here
that is already familiar. While this guide is not intended to provide complete training in SQL, it
does give the novice SQL programmer enough information to get a good start on RDM Server SQL programming.
Other SQL-related RDM Server documentation includes:
SQL Language Reference: A complete description of the SQL language and statements provided in RDM Server.
SQL C API Reference: Descriptions of all SQL-related C application programming interface (API) functions.
ODBC User's Guide: Describes the use of ODBC with RDM Server SQL.
JDBC User's Guide: Describes the use of the RDM Server JDBC API.
ADO.NET User's Guide: Describes the use of the RDM Server ADO.NET API.
1.1 Overview of Supported SQL Features
RDM Server supports a subset of the ISO/IEC 9075 2003 SQL standard including referential integrity and column constraint
checks as well as extensions that provide transparent network model database support and full relational access to combined
model databases. Specific RDM Server SQL features include the following.
• Full automatic referential integrity checking.
• Automatic checking of column and table constraints that conform to the SQL standard column and table constraint features.
• Support for b-tree and hash indexes. Support for optional indexes that can be activated on-demand is also provided.
• Ability to specify high-performance, pre-defined joins using the proprietary create join DDL statement. Used with foreign and primary key specifications to indicate that direct access methods are to be used in maintaining inter-table relationships.
• Support for the definition of standard SQL triggers.
• Searched and positioned update and delete used in conjunction with the RDM Server SQL ODBC API.
• Support for date, time, and timestamp data types.
• A full complement of built-in scalar functions that include math, string, and date manipulation capabilities.
• Support for null column values.
• Data insertion statements. RDM Server SQL provides the insert values statement to insert a single row into a specified
table. Your application can use the insert from select statement to insert one or more rows from one table into another.
The insert from file statement can be used to perform a bulk load from data contained in an ASCII text file.
• Support for select statements including group by, order by, subqueries, unions, and extended join syntax specification.
• Support for database security through standard grant and revoke statements.
• Full transaction processing capabilities, including the capability for partial rollbacks.
• Ability to create multiple instances of the same database schema.
• Capability to define and access C structure and array columns manipulated using the RDM Server Core API (d_ prefix
functions).
• A cost-based query optimizer that uses data distribution statistics to generate query execution plans based on use of
indexes, predefined joins, and direct access.
• Support for user-defined functions (UDF) that can be used in SQL statements. UDFs are extension modules that implement scalar and/or aggregate functions. You can extend the SQL functionality of the server, for example, by writing a
function that does bitwise operations, or a function that performs an aggregate calculation (e.g., standard deviation) not
provided in the built-in functions.
• Support for stored procedures written in SQL and user-defined procedures (UDP) written in C that execute on the database server.
1.2 About This Manual
The RDM Server SQL User's Guide is organized into the following chapters.
• Chapter 2, "A Language for Describing a Language" describes the "meta-language" that is used to represent SQL statement syntax.
• Chapter 3, "A Simple Interactive SQL Scripting Utility" introduces a simple, command-line utility called "rsql" that
can be used to interactively execute RDM Server SQL statements. We encourage you to use it to execute for yourself
many of the SQL examples provided in this document.
• Chapter 4, "RDM Server SQL Language Elements" defines the basic elements of RDM Server SQL, including identifiers,
reserved words, and constants.
• Chapter 5, "Administrating an SQL Database" provides descriptions of the SQL statements that can be used to perform
a variety of administration functions such as creating and dropping users and devices.
• Chapter 6, "Defining a Database" explains how to create an SQL database definition (called a schema) using SQL database definition language statements. Also described is how one goes about making changes to an existing database
definition that contains data. The SQL DDL specifications for the example databases used throughout this manual are
provided as well.
• Chapter 7, "Retrieving Data from a Database" describes all of the query capabilities available in the RDM
Server SQL select statement as well as how to specify a union of two or more select statements. It also describes how
you can predefine a specific query using the create view statement.
• Chapter 8, "Inserting, Updating, and Deleting Data in a Database" explains the use of the SQL insert, update, and
delete statements.
• Chapter 9, "Database Triggers" provides a detailed description of how to implement database triggers in which predetermined database actions can be automatically "triggered" whenever certain database modifications occur.
• Chapter 10, "Shared (Multi-User) Database Access" describes the important features of RDM Server SQL that
can be used to manage concurrent access to a database from multiple users in order to balance high access
performance with the need to guarantee the integrity of the database data (through transactions).
• Chapter 11, "Stored Procedures and Views" shows you how to develop SQL stored procedures which encapsulate one or more SQL statements in a single, parameterized procedure. Stored procedures are pre-compiled, which avoids having to recompile the statements each time they are executed.
• Chapter 12, "SQL Database Access Security" explains the use of the SQL grant and revoke statements in order
to restrict access to portions of the database or restrict the use of certain SQL commands for specific users.
• Chapter 13, "Using SQL in a C Application Program" provides detailed guidelines on how to write an RDM Server
SQL application in the C programming language using the SQL C API functions (based on ODBC with several non-ODBC extensions also provided).
• Chapter 14, "Developing SQL Server Extensions" contains how-to guidelines for writing C-language server extensions
for use by SQL. These include user-defined functions (UDF), user-defined procedures (UDP), user-defined import/export filters (IEF), login/logout procedures, and transaction triggers.
• Chapter 15, "Query Optimization" provides a detailed description of how the RDM Server SQL query optimizer
determines the "best" way to execute a particular query. Don't skip this chapter! Writing efficient and correct select
statements is not always easy to do. Moreover, the "optimizer" is also not as smart as that particular designation may
lead you to think. The more you understand how queries are optimized, the better able you will be to not only create
quality queries but also to figure out why certain queries do not work quite the way you thought they should.
2. A Language for Describing a Language
SQL stands for "Structured Query Language." You have probably seen many different methods used in programming manuals
to show how to use a specific programming language. The two most common methods use syntax flow diagrams and what is
known as Backus-Naur Form (BNF) which is a formal language for describing a programming language. In this document we
use a simplified BNF method that seeks to represent the language in a way that closely matches the way you will code your
own SQL statements for your application.
For example, the following select statement:
select sale_name, company, city, state
from salesperson natural join customer;
can be described by this syntax rule:
select_stmt:
select identifier [, identifier]… from identifier [natural join identifier] ;
where "select_stmt" is the name of the rule (sometimes called a non-terminal); the bold-faced identifiers select, from, natural,
and join are key words (sometimes called terminal symbols); identifier is like a function argument that stands in place of a
user-specified value (technically, it too is the name of a rule that is matched by any user-specified value that begins with a letter followed by any sequence consisting of letters, digits, and the underscore ("_") character). Rule names are identifiers and
their definitions are specified by giving the rule name beginning in column 1 and terminating the rule with a colon (":") as
shown above.
There are also special meta-symbols that are part of the syntax descriptor language. Two are shown in the above select_stmt
syntax rule. The brackets ("[" and "]") enclose optional elements. The ellipsis ("…") specifies that the preceding item can be
repeated zero or more times. Other meta-symbols include a vertical bar (i.e., an "or" symbol) that is used to separate alternative
elements and braces ("{" and "}") which enclose a set of alternatives from which one must always be matched. All other special characters (e.g., the "," and ";" in the select_stmt rule) are considered to be part of the language definition. Meta-symbols
that are themselves part of the language will be enclosed in single quotes (e.g., '[') in the syntax rule.
Rule names can be used in other rules. For example, the syntax for a stored procedure that can contain multiple select statements could be described by the following rule:
create_proc:
create procedure identifier as
select_stmt[; select_stmt]…
end proc;
In order to make the syntax more readable, any non-bold, italicized name is considered to be matched as an identifier. Thus,
the select_stmt rule can also be written as follows…
select_stmt:
select colname [, colname]… from tabname [natural join tabname] ;
where colname represents identifiers that correspond to table column names and tabname represents identifiers that correspond to table names.
Some italicized terms are used to match specific text patterns. E.g., number matches any text pattern that can be used to represent a number (either integer or decimal) and integer matches any pattern that represents an integer number.
These rules are summarized in the table below.
Table 2-1. Syntax Description Language Elements
Syntax Element               Description
keyword (bold)               Bold-faced words that identify the special words used in the language that specify
                             actions and usage. Sometimes called reserved words. Examples: select, insert,
                             create, using.
identifier (italic)          Italicized word corresponding to an identifier: sequences of letters, digits, and
                             "_" that begin with a letter.
number                       Any text that corresponds to an integer or decimal number.
integer                      Any text that corresponds to an integer.
[option1 | option2]          A selection in which either nothing or option1 or option2 is specified.
{option1 | option2}          Either option1 or option2 must be specified.
element…                     Repeat element zero or more times.
identifier (normal face)     Normal-faced identifiers correspond to the names of syntax rules. Syntax rules are
                             defined by the name starting in column 1 and ending with a ":".
Text for programming and SQL examples is shown in courier font in a shaded box as in the following example.
RSQL Utility - RDM Server 8.4.1 [22-Mar-2012]
A Raima Database Manager Utility
Copyright (c) 1992-2012 Raima Inc.. All Rights Reserved.
Enter ? for list of interface commands.
001 rsql: .c 1 p admin secret
Connected to RDM Server Version 8.4.1 [22-Mar-2012]
*** using statement handle 1 of connection 1
001 rsql: select * from salesperson;
sale_id  sale_name             dob         commission  region
BCK      Kennedy, Bob          1956-10-29  0.075       0
BNF      Flores, Bob           1943-07-17  0.100       0
BPS      Stouffer, Bill        1952-11-21  0.080       2
CMB      Blades, Chris         1958-09-08  0.080       3
DLL      Lister, Dave          1999-08-30  0.075       3
ERW      Wyman, Eliska         1959-05-18  0.075       1
GAP      Porter, Greg          1949-03-03  0.080       1
GSN      Nash, Gail            1954-10-20  0.070       3
JTK      Kirk, James           2100-08-30  0.075       3
SKM      McGuire, Sidney       1947-12-02  0.070       1
SSW      Williams, Steve       1944-08-30  0.075       3
SWR      Robinson, Stephanie   1968-10-11  0.070       0
WAJ      Jones, Walter         1960-06-15  0.070       2
WWW      Warren, Wayne         1953-04-29  0.075       2
002 rsql:
3. A Simple Interactive SQL Scripting Utility
Okay, we know that this is the world of point-and-click, easy-to-use applications. In fact, many abound for doing just that
with SQL. So what value can there possibly be in providing a text-based, command-line-oriented, interactive SQL utility?
Well, for one thing, you can keep both hands on the keyboard and never have to touch the mouse! Novel concept isn’t it? It
also has provided us here at Raima with something that was easy to write and is easily ported to any platform. Hence, the
interface works identically on all platforms. It also provides us (and, presumably, you as well) with the ability to generate test
cases that can be easily and automatically executed. Since we also share the source code to the program, it allows you to
more easily see how to call the RDM Server SQL API functions without getting bogged down by object-oriented layers and
user-interface calls. There is an educational benefit as well. You will more effectively learn how to properly formulate SQL
statements by actually typing them in than by simply pointing to icons that do the job for you.
The name of this program is rsql (the standalone version is named rsqls). To start rsql, open an OS command window and
enter a command that conforms to the following syntax. Note that an RDM Server that manages the SQL databases to be
accessed must be running and available.
Table 3-1. RSQL Command Options
rsql [-? | -h] [-B] [-V] [-e] [-u] [-c num] [-H num] [-s num] [-w num] [-l num]
     [-b [@hostname:port]] [startupfile [arg]…]
Option               Description
-h                   Display command usage information.
-B                   Do not display program banner on startup.
-V                   Display operating system version information.
-e                   Do not echo commands contained in a script file.
-u                   Display result set column headings in upper case.
-c num               Set maximum number of possible connections to num.
-H num               Set size of statement history list to num.
-s num               Set maximum number of statement handles per connection to num.
-w num               Set page width to num characters.
-l num               Set number of lines per display page to num.
-o filename          Output errors to filename.
startupfile [arg]…   Name of text file containing startup rsql/SQL commands and any needed script file
                     arguments (see .r command below).
4. RDM Server SQL Language Elements
This section defines all of the basic elements of RDM Server SQL that have been used throughout this User's Guide including
identifiers, reserved words and constants.
4.1 Identifiers
Identifiers are used to name a wide variety of SQL language objects including databases, tables, columns, indexes, joins,
devices, views, and stored procedures. An identifier is formed as a combination of letters, digits, and the underscore character
('_'), always beginning with a letter or an underscore. An identifier in RDM Server can be from 1 to 32 characters in length.
Unless otherwise noted in the User's Guide, identifiers are case-insensitive (upper and lower case characters are indistinguishable). Thus, CUSTOMER, customer, and Customer all refer to the same item. An identifier cannot be a reserved word
(see below).
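A C application that builds SQL text from user-supplied names may want to check a name against these rules before sending a statement to the server. The following is a minimal sketch, not part of the RDM Server API: the function names is_valid_identifier and identifiers_equal are hypothetical, ASCII letters are assumed, and the reserved-word list is not consulted.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_IDENT_LEN 32  /* maximum identifier length stated in Section 4.1 */

/* Hypothetical helper: returns 1 if name is lexically valid per Section 4.1
 * (1 to 32 characters, first character a letter or underscore, remaining
 * characters letters, digits, or underscores), otherwise 0.
 * The reserved words of Table 4-1 are NOT checked here. */
int is_valid_identifier(const char *name)
{
    size_t i, len;

    if (name == NULL)
        return 0;
    len = strlen(name);
    if (len < 1 || len > MAX_IDENT_LEN)
        return 0;
    if (!isalpha((unsigned char)name[0]) && name[0] != '_')
        return 0;
    for (i = 1; i < len; i++) {
        if (!isalnum((unsigned char)name[i]) && name[i] != '_')
            return 0;
    }
    return 1;
}

/* Hypothetical helper: case-insensitive comparison, reflecting the rule that
 * CUSTOMER, customer, and Customer refer to the same item.
 * Returns 1 when the two names are equal, otherwise 0. */
int identifiers_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}
```

A name that passes this check can still be rejected by the server, for example when it collides with a reserved word.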
4.2 Reserved Words
Reserved words are predefined identifiers that have special meaning in RDM Server SQL. As with identifiers, RDM Server
SQL does not distinguish between uppercase and lowercase letters in reserved words. Table 4-1 lists the RDM Server SQL
reserved words. Some of the listed words are not described in this document but have been retained for compatibility with
other SQL systems. Note that none of the words listed in this table can be used in any context other than that indicated by the use
of the word in the SQL grammar.
Table 4-1. RDM Server SQL Reserved Words
ABS                COUNTS               HAVING           NAME           SET
ACOS               CREATE               HEADINGS         NATURAL        SHARED
ACTONFAIL          CROSS                HOUR             NEW            SHORT
ADD                CURDATE              IF               NEXT           SHOW
ADMIN              CURRENCY             IFNULL           NOINIT         SIGN
ADMINISTRATOR      CURRENT              IGNORE           NOINITIALIZE   SIN
AFTER              CURRENT_DATE         IMPORT           NON_VIRTUAL    SMALLINT
AGE                CURRENT_TIME         IN               NOSORT         SOME
AGGREGATE          CURRENT_TIMESTAMP    INDEX            NOT            SQRT
ALL                CURTIME              INIT             NOTIFY         START
ALTER              C_DATA               INITIALIZE       NOW            STATEMENT
AND                DATA                 INMEMORY         NULL           SUBSTRING
ANY                DATABASE             INNER            NULLIF         SUM
AS                 DATABASES            INSERT           NUMERIC        SWITCH
ASC                DATE                 INSTANCE         NUMRETRIES     TABLE
ASCENDING          DAYOFMONTH           INT              OBJECT         TABLES
ASCII              DAYOFWEEK            INT16            OCTET_LENGTH   TAN
ASIN               DAYOFYEAR            INT32            OF             THEN
ATAN               DB_ADDR              INT64            OFF            THROUGH
ATAN2              DEACTIVATE           INT8             OLD            TIME
ATOMIC             DEBUG                INTEGER          ON             TIMESTAMP
AUTHORIZATION      DEC                  INTO             ONE            TINYINT
AUTO               DECIMAL              IP               ONLY           TO
AUTOCOMMIT         DEFAULT              IPADDR           OPEN           TODAY
AUTOLOG            DELETE               IS               OPTION         TRAILING
AUTOSTART          DESC                 ISOLATION        OPTIONAL       TRIGGER
AVG                DESCENDING           JOIN             OPT_LIMIT      TRIM
BEFORE             DEVICE               KEY              OR             TRUE
BEGIN              DIAGNOSTICS          LARGE            ORDER          TRUNCATE
BETWEEN            DISABLE              LAST             OUTER          TYPEOF
BIGINT             DISPLAY              LCASE            OWNER          UCASE
BINARY             DISTINCT             LEADING          PAGESIZE       UINT16
BIT                DOUBLE               LEFT             PARAM          UINT32
BLOB               DROP                 LENGTH           PARAMETER      UINT64
BOOLEAN            EACH                 LEVEL            PI             UINT8
BOTH               ELSE                 LIKE             POSITION       UNICODE
BTREE              ENABLE               LN               PRECISION      UNION
BUT                ENCRYPTION           LOCALTIME        PRIMARY        UNIQUE
BY                 END                  LOCALTIMESTAMP   PROC           UNLOCK
BYTE               ERRORS               LOCATE           PROCEDURE      UNSIGNED
CALL               ESCAPE               LOCK             PUBLIC         UPDATE
CASCADE            EXCLUSIVE            LOG              QUARTER        UPPER
CASE               EXEC                 LOGFILE          RAND           USE
CAST               EXECUTE              LOGGING          REAL           USER
CEIL               EXISTS               LOGIN            REFERENCES     USING
CEILING            EXP                  LOGOUT           REFERENCING    VALUES
CHAR               EXTENSION            LONG             REMOVE         VARBINARY
CHARACTER          FALSE                LOWER            REP            VARBYTE
CHARACTER_LENGTH   FILE                 LTRIM            REPEAT         VARCHAR
CHAR_LENGTH        FILTER               MARK             REPLACE        VARYING
CHECK              FIRST                MASTER           REVOKE         VIRTUAL
CLOSE              FLOAT                MASTERALIAS      RIGHT          WAITSECONDS
COALESCE           FLOOR                MAX              ROLLBACK       WCHAR
COLUMN             FLUSH                MAXCACHESIZE     ROUND          WCHARACTER
COMMANDS           FOR                  MAXPGS           ROW            WEEK
COMMIT             FOREIGN              MAXTRANS         ROWID          WHEN
COMMITTED          FROM                 MEMBER           ROWS           WHERE
COMPARE            FULL                 MIN              RTRIM          WITH
CONCAT             FUNCTION             MINIMUM          RUN            WORK
CONVERT            FUNCTIONS            MINUTE           SAVE           WVARCHAR
COS                GRANT                MOD              SAVEPOINT      XML
COT                GROUP                MODE             SECOND         YEAR
COUNT              HASH                 MONTH            SELECT
4.3 Constants
An RDM Server SQL constant is a number or string value that is used in a statement. The following sections describe how to
specify each type of constant value.
Numeric Constants
The RDM Server SQL numeric data types are smallint, integer, float, double, and decimal. Numeric constants are formed as
specified in the following syntax.
numeric_constant:
[+|-]digits[.digits]
digits:
d[d]...
d:
0|1|2|3|4|5|6|7|8|9
If you specify a constant with a decimal portion (that is, [.digits]), RDM Server stores the constant as a decimal. If you do not
use the decimal part, the constant is stored as an integer.
The following examples show several types of numeric constants.
1021
-50
3.14159
453.75
-81.75
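The numeric_constant rule and its integer/decimal distinction can be sketched as a small recognizer in C. This is an illustration of the grammar only, under the assumption that the whole input must match the rule; the names const_kind and classify_numeric_constant are hypothetical and are not RDM Server API names. Exponential forms are deliberately rejected because they belong to a separate rule.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum const_kind { CONST_INVALID = 0, CONST_INTEGER, CONST_DECIMAL };

/* Matches the "digits" rule: one or more decimal digits.
 * Returns a pointer just past the digits, or NULL if there are none. */
static const char *skip_digits(const char *p)
{
    if (!isdigit((unsigned char)*p))
        return NULL;
    while (isdigit((unsigned char)*p))
        p++;
    return p;
}

/* Classifies text against  [+|-]digits[.digits] :
 * CONST_DECIMAL if a decimal portion is present, CONST_INTEGER if not,
 * CONST_INVALID if text does not match the rule in its entirety. */
enum const_kind classify_numeric_constant(const char *text)
{
    const char *p = text;
    enum const_kind kind = CONST_INTEGER;

    if (p == NULL)
        return CONST_INVALID;
    if (*p == '+' || *p == '-')      /* optional sign */
        p++;
    p = skip_digits(p);              /* mandatory integer digits */
    if (p == NULL)
        return CONST_INVALID;
    if (*p == '.') {                 /* optional decimal portion */
        p = skip_digits(p + 1);
        if (p == NULL)
            return CONST_INVALID;
        kind = CONST_DECIMAL;
    }
    return (*p == '\0') ? kind : CONST_INVALID;
}
```

Under this reading of the grammar, forms such as "5." and ".5" do not match, because digits are required on both sides of the decimal point.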
Floating-point constants (data type real, float, or double) can be specified as a numeric_constant or as an exponential
constant formed as specified below.
exponential_constant:
[+|-]digits[.digits]{E|e}[+|-]ddd
Shown below are several examples of floating-point constants.
6.02E23
1.8E5
-3.776143e-12
String Constants
ASCII string constants are formed by enclosing the characters in the string inside single quotation marks ('string') or double
quotation marks ("string"). To form a wide character (Unicode) string constant, the initial quotation mark must be immediately
preceded by "L". If the string itself contains the quotation mark character used to delimit it, that character must be immediately preceded by a
backslash (\). To include a backslash character in the string, enter a double backslash (\\).
The following are examples of string constants.
"This is an ASCII string constant"
L"This is a Unicode string constant"
"this string contains \"quotation\" marks"
'this string contains "quotation" marks too'
'this string contains a backslash (\\)'
The default maximum length of an RDM Server SQL string constant is 256 characters. You can change this value by modifying the MaxString configuration parameter in the [SQL] section of rdmserver.ini. Refer to RDM Server Installation / Administration Guide for more information.
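When a C program composes SQL text that embeds arbitrary character data, the escaping rules for string constants have to be applied to that data. The sketch below shows one way to do so for a double-quoted ASCII constant. The function name quote_sql_string is hypothetical (it is not an RDM Server API function), and the sketch does not check the result against the MaxString limit.

```c
#include <stddef.h>

/* Hypothetical helper: writes src into dst as a double-quoted string
 * constant, placing a backslash before every '"' and '\' character.
 * Returns the length of the result (excluding the terminating NUL),
 * or -1 if dst_size is too small to hold it. */
int quote_sql_string(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    if (src == NULL || dst == NULL || dst_size < 3)
        return -1;                       /* need room for "" plus NUL */
    dst[n++] = '"';                      /* opening delimiter */
    for (; *src != '\0'; src++) {
        if (*src == '"' || *src == '\\') {
            if (n + 1 >= dst_size)       /* always keep one byte for NUL */
                return -1;
            dst[n++] = '\\';             /* escape character */
        }
        if (n + 1 >= dst_size)
            return -1;
        dst[n++] = *src;
    }
    if (n + 1 >= dst_size)
        return -1;
    dst[n++] = '"';                      /* closing delimiter */
    dst[n] = '\0';
    return (int)n;
}
```

For input text such as say "hi", the function produces "say \"hi\"", matching the third string constant example in this section. Where the API supports it, passing values through parameter markers avoids the need to quote data at all.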
Date, Time, and Timestamp Constants
The following syntax shows the formats for date, time, and timestamp constants.
date_constant:
        date "YYYY-MM-DD"
    |   @"[YY]YY-MM-DD"
time_constant:
        time "HH:MM[:SS[.dddd]]"
    |   @"HH:MM[:SS[.dddd]]"
timestamp_constant:
        timestamp "YYYY-MM-DD HH:MM[:SS[.dddd]]"
    |   @"YYYY-MM-DD [HH:MM[:SS[.dddd]]]"
The formats following the date, time, and timestamp keywords conform to the SQL standard. In the format for date constants,
YYYY is the year (you must specify all four digits), MM is the month number (1 to 12), and DD is the day of the month (1 to
31). The @ symbol represents a nonstandard alternative. When only two digits are specified for the year using the nonstandard format, the century is assumed to be 1900 where YY is greater than or equal to 50; where YY is less than 50 in this
format, the century is assumed to be 2000.
In the format for time constants, HH is hours (0 to 23), MM is minutes (0 to 59), SS is seconds (0 to 59), and .dddd is the fractional part of a second, with up to four decimal places of accuracy. If you specify more than four places, the value rounds to
four places. The format for timestamp constants simply combines the formats for date and time constants.
You can use three alternative characters as separators in declaring date, time, and timestamp constants. Besides hyphen ("-"),
RDM Server accepts slash ("/") and period (".").
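The two-digit year rule and the alternative separators for the nonstandard @ date format can be illustrated with a short C sketch. It mirrors only the rules stated in this section; the names expand_two_digit_year, parse_alt_date, and struct ymd are hypothetical, and whether the server accepts mixed separators within one constant or validates the day against the month is not stated here, so the sketch simply allows any of the three separators in either position and checks only the 1 to 31 range.

```c
#include <ctype.h>
#include <stddef.h>

struct ymd { int year, month, day; };   /* hypothetical result type */

/* Century rule for two-digit years: 50..99 -> 19YY, 00..49 -> 20YY.
 * Returns -1 if yy is outside 0..99. */
int expand_two_digit_year(int yy)
{
    if (yy < 0 || yy > 99)
        return -1;
    return (yy >= 50) ? 1900 + yy : 2000 + yy;
}

/* Reads 1..max_digits decimal digits; returns 0 on success, -1 if none. */
static int read_digits(const char **pp, int max_digits, int *value, int *count)
{
    const char *p = *pp;
    int v = 0, n = 0;

    while (n < max_digits && isdigit((unsigned char)*p)) {
        v = v * 10 + (*p - '0');
        p++;
        n++;
    }
    if (n == 0)
        return -1;
    *pp = p;
    *value = v;
    *count = n;
    return 0;
}

static int is_separator(char c)
{
    return c == '-' || c == '/' || c == '.';
}

/* Parses the text between the quotes of @"[YY]YY-MM-DD".
 * Returns 0 and fills *out on success, -1 on a malformed date. */
int parse_alt_date(const char *text, struct ymd *out)
{
    const char *p = text;
    int y, m, d, ylen, n;

    if (text == NULL || out == NULL)
        return -1;
    if (read_digits(&p, 4, &y, &ylen) != 0 || (ylen != 2 && ylen != 4))
        return -1;
    if (!is_separator(*p++))
        return -1;
    if (read_digits(&p, 2, &m, &n) != 0 || m < 1 || m > 12)
        return -1;
    if (!is_separator(*p++))
        return -1;
    if (read_digits(&p, 2, &d, &n) != 0 || d < 1 || d > 31)
        return -1;
    if (*p != '\0')
        return -1;
    if (ylen == 2)
        y = expand_two_digit_year(y);
    out->year = y;
    out->month = m;
    out->day = d;
    return 0;
}
```

With these rules, the text 93/9/23 is read as September 23, 1993, while 49-12-31 is read as December 31, 2049.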
The following are examples of the use of date, time, and timestamp constants.
insert into sales_order(ord_num, ord_date, amount)
values(20001, @"93/9/23", 1550.00);
insert into note
values("HI-PRI", timestamp "1993-9-23 15:22:00", "SKM", "SEA");
select * from sales_order where ord_date >= date "1993-9-1";
insert into event(event_id, event_time)
values("Marathon", time "02:53:44.47");
The set date default statement, shown below, can be used to change the separator character and the order of month, day, and
year.
set_date_default:
set {date default | default date} to {"MM-DD-YYYY" | "YYYY-MM-DD" | "DD-MM-YYYY"}
One of the three date format options must be specified exactly as shown except that the "-" separator can be any special character you choose. This statement will set the date format for both input and output. Note that the specified separator character
will be accepted for date constants in addition to the built-in separator characters hyphen, slash, and period.
System Constants
RDM Server SQL also recognizes three built-in literal constants as described in Table 4-2.
Table 4-2. Literal System Constants
Constant   Value
user       The name of the user who is executing the statement.
today      The current date at the execution time of the statement.
now        The current timestamp at the execution time of the statement.
The following examples illustrate the use of the literal system constants.
.. a statement that could be executed from an extension module or
.. stored procedure that is always executed when a connection is made.
insert into login_log(user_name, login_time) values(user, now);
.. check today's action items
select cust_id, note_text from action_items where tickle_date = today;
5. Administrating an SQL Database
This chapter contains information pertinent to the administration of SQL databases. For complete RDM Server administration details, please refer to the RDM Server Installation and Administration Guide. Many of the capabilities described in this chapter can also be performed by other means. For example, users and devices can be defined outside of SQL through use of the rdsadmin utility. However, it is often convenient (e.g., for regression testing) to be able to perform basic administrative actions through SQL statements. Hence, RDM Server SQL includes a variety of administration-related statements. Note that administrator user privileges are required in order to use the SQL statements described below.
5.1 Device Administration
An RDM Server device specifies a logical name for a file system directory in which the server manages database-related files. A device can be created through SQL using the create device statement with the following syntax.
create_device:
create [readonly] device devname as "directory_path"
This statement creates a device named devname for the specified directory_path, which will usually be a fully qualified path name to an existing directory. Relative path names can also be used. They are interpreted relative to the catalog directory specified by the CATPATH environment variable, as in the following (Windows) example.
create device importdev as ".\impdata";
It is important to repeat that the directory specified in the as clause must already exist. Otherwise, the system will return the error "invalid device: Illegal Physical Path for Device".
A readonly device is one in which the RDM Server managed files are only allowed to be read. Any attempt to write to a file
contained on a readonly device will result in an error.
Note that before any SQL DDL specification can be processed in order to define and use an SQL database, devices will need
to be created for the directories that contain the DDL specification files and that will contain the created database files.
create device sqldev as "c:\rdms\sqlscripts";
create device salesdb as ".\saledb";
Devices can be dropped but only when there are no RDM Server managed files contained in the directory associated with the
device. The syntax for the drop device statement is very simple.
drop_device:
drop device devname
Successful execution of this statement will drop the logical device named devname from the RDM Server system. The directory to which it refers, however, will remain, as will any files in it that are not managed by RDM Server.
You can retrieve a list of all of the devices defined for the RDM Server to which you are connected by executing the predefined stored procedure named ShowDevices, as shown in the example below.
execute ShowDevices;
NAME         TYPE         PATH
catdev       Read/Write   .\
emsamp       Read/Write   ..\examples\em\
importdev    Read/Write   .\impdata\
mlbdev       Read/Write   c:\aspen\mlbdb\
mlbimpdev    Read/Write   c:\aspen\mlbdb\impfiles\
rdsdll       Read/Write   ..\dll\nt_i\
samples      Read/Write   ..\examples\tims.nt_i\
sqldev       Read/Write   ..\sqldb.nt_i\
sqlsamp      Read/Write   ..\examples\emsql\
sysdev       Read/Write   ..\syslog.nt_i\
RDM Server manages disk space so as to protect against a server shutdown when the system runs out of disk space. When this happens, RDM Server automatically switches into read-only mode until sufficient disk space is freed and made available to it. A minimum available space attribute associated with all RDM Server devices gives an administrator some control over this low disk space behavior. SQL provides the set device minimum statement to set the minimum number of bytes of free space that must be available on every device in order for RDM Server to operate in its normal, read-write mode.
The syntax for this statement is as follows.
set_device:
set device minimum to nobytes
where nobytes is the number of bytes of free space that must be available on each RDM Server device.
Note that this overrides the MinFreeDiskSpace parameter in the [Engine] section of rdmserver.ini.
set device minimum to 100000000;
This example sets the device minimum free space threshold to 100 megabytes.
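The rule described above, that every device must have at least the configured number of free bytes for the server to stay in read-write mode, can be modeled in a few lines of Python. This is a sketch for illustration only: server_mode and the free-space figures are hypothetical, and treating a device with exactly the minimum amount of free space as sufficient is an assumption.

```python
def server_mode(free_bytes_by_device, device_minimum):
    """Return the operating mode implied by the free space on each device.

    free_bytes_by_device maps a device name to its free bytes (made-up inputs).
    device_minimum is the value given to 'set device minimum to'.
    """
    if all(free >= device_minimum for free in free_bytes_by_device.values()):
        return "read-write"
    return "read-only"


# Hypothetical free-space figures for two devices.
devices = {"catdev": 250_000_000, "sqldev": 80_000_000}
print(server_mode(devices, 100_000_000))  # sqldev is below the threshold
print(server_mode(devices, 50_000_000))   # both devices satisfy the threshold
```

With the 100000000-byte minimum from the example statement, the model reports read-only mode because one device falls short. Lowering the minimum to 50000000 bytes brings it back to read-write.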
5.2 User Administration
The create user statement creates a new RDM Server user, allowing a user with the specified name to log in to the RDM Server associated with the connection that issues the create user statement. The syntax for create user is shown below.
create_user:
create [admin[istrator]] user username password "password" [with encryption] on devname
The login name for the user is username and the login password is specified as "password" with devname as the user's default
device (the device which will be used with SQL statements for which an optional on devname clause has been omitted). The
with encryption option indicates that an encrypted form of the password is to be stored in the system catalog.
A username is a case-sensitive identifier so that "Sam", "SAM", and "sam" are three different user names.
Administrator users have full access rights to all databases and commands. The access rights for normal (non-administrator)
users must be specified through use of the grant statement (see Chapter 11).
The create user statement can only be executed by administrator users.
The password for an existing user can be changed using the alter user statement.
alter_user:
alter user username {authorization | password} "password"
Normal users can use this statement only to change their own password. Administrator users can use it to change the password for any user.
Administrator users can remove a user from an RDM Server using the drop user statement:
drop_user:
drop user username
IMPORTANT: RDM Server is delivered with some predefined users. Of particular importance is user
"admin" with password "secret". We highly recommend that this user be dropped (or at least the password
changed) once you have defined your own administrator users.
An administrator can get a list of the names of all users of an RDM Server by executing the pre-defined stored procedure
named ShowUsers as shown in the following example.
create user randy password "RJ29j32r34s36k38" on sqldev;
create admin user paul password "SaulOrPaul" with encryption on catdev;
exec ShowUsers;
USER_NAME    RIGHTS    HOME_DEVICE
admin        Admin     catdev
guest        Normal    catdev
paul         Admin     catdev
randy        Normal    sqldev
wayne        Normal    samples
5.3 Database and File Maintenance
5.3.1 Database Initialization
Administrators can initialize a database by issuing the following statement.
initialize_database:
init[ialize] [database] dbname
Before initializing a database, the database must be closed by all users (including you) who currently have the database
opened.
Execution of this statement is unrecoverable. The only way to restore the database is to restore from your last backup. If a
database contains rows that are referenced from another database, initializing the referenced database will invalidate the referential integrity between those databases.
5.3.2 Extension Files
Extension files can be created to allow database files to grow to sizes larger than can be accommodated in a single operating
system file. The feature was first added to RDM Server to overcome what was then the 2 gigabyte maximum size limitation
for files on some operating systems. As that may still be the case on some RDM Server OS installations, extension files are
necessary for those database files where the possible size limitation can be exceeded.
Extension files can also be used to partition the contents of a database file into multiple files contained on separate devices.
The partitioning is defined based strictly on the specified maximum size for the data file (which can be set using the alter
[extension] file statement) and the range of database addresses whose associated record occurrences are stored in a given file.
The syntax for the create extension file statement is given below.
create_extension_file:
create extension file "extname" on extdev for "basename" on basedev
The name of the extension file is specified by extname and must be a legal, non-existent file name for the operating system on
which RDM Server is installed. The extension file will be stored on the RDM Server device named extdev. This file will contain all data associated with the standard database file "basename" located on the device named basedev.
Device extdev can be the same as basedev as long as the extname is not the same as basename.
If more than one extension file is needed, you can issue as many create extension file statements on the basename as necessary. For example, the following statements create two extension files for file "sales.000" in the sales database.
create extension file "sales.0x0" on sqldev for "sales.000" on sqldev;
create extension file "sales.0x1" on sqldev for "sales.000" on sqldev;
The alter [extension] file statement can be used to specify a variety of file sizing options for base and extension files as shown
in the following syntax.
alter_file:
alter [extension] file extno for "basename" on basedev
set {[maxsize=maxsize] | [cresize=cresize] | [extsize=extsize]}...
The specified file size settings apply to the extension file whose number is extno, where an extno of 0 (zero) refers to the base file itself. The maxsize option specifies the maximum file size in bytes. The cresize option specifies the initial size of the file when the file is first created. The extsize option specifies how much additional file space is to be allocated when data is added to the end of the file. It is best that all of these values be integer multiples of the file's page size (see create file statement).
If maxsize is less than the current amount of allocated space on the file, or if the file is fully allocated to its current maximum,
the request is denied. The new extsize value takes effect the next time the file is extended. The new cresize value is used the
next time the file is initialized.
You can use the maxsize value to control partitioning of the data among a set of extension files.
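The sizing rules for alter file can be illustrated with a small Python sketch. It is not RDM Server code: both function names are hypothetical, the 1024-byte figure is the default page size stated elsewhere in this guide, and the acceptance check simply restates the two denial conditions described above.

```python
DEFAULT_PAGE_SIZE = 1024  # default database page size stated in this guide


def round_up_to_page(nbytes, page_size=DEFAULT_PAGE_SIZE):
    """Round a size up to the next integer multiple of the page size."""
    return -(-nbytes // page_size) * page_size


def maxsize_change_allowed(new_maxsize, allocated, current_maxsize):
    """Restate the two conditions under which a maxsize request is denied."""
    if new_maxsize < allocated:
        return False   # new maximum is smaller than the space already allocated
    if allocated >= current_maxsize:
        return False   # file is fully allocated to its current maximum
    return True


print(round_up_to_page(5000))                      # a page-aligned size near 5000 bytes
print(maxsize_change_allowed(8192, 2048, 4096))    # room left, larger maximum
print(maxsize_change_allowed(1024, 2048, 4096))    # smaller than allocated space
```

A script that generates alter file statements could pass its maxsize, cresize, and extsize values through a helper like round_up_to_page so that they stay aligned with the file's page size.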
You can execute the ShowDBFiles predefined procedure to get a complete list of all of the files for a specified database as in
the example below.
exec ShowDBFiles("sales");
FILENO    EXTNO    DEVNAME    FILENAME
0         0        sqldev     sales.000
0         1        sqldev     sales.0x0
0         2        sqldev     sales.0x1
1         0        sqldev     sales.001
2         0        sqldev     sales.002
3         0        sqldev     sales.003
4         0        sqldev     sales.004
5         0        sqldev     sales.005
6         0        sqldev     sales.006
5.3.3 Flushing Inmemory Database Files
RDM Server provides the ability for specific database files to be kept entirely in memory while a database is opened. This is
particularly important for files whose contents are accessed often and which need to have as fast a response time as possible.
RDM Server inmemory files can be specified to be volatile (meaning that the file always starts empty), read (meaning that the
data is initially read from the file but no changes are ever written back), or persistent (meaning that the data file is completely
loaded when the database is first opened and all changes are written back to the database when the database is last closed).
For persistent (and even read) inmemory files, it may be necessary for changes to those files to be written back to the database while the database remains open. The flush database statement provides the ability to do just that.
flush:
flush [database] dbname[, dbname]...
This statement flushes the updated contents of the specified persistent or read inmemory files for the specified databases to the
physical database files. Note that use of the flush database statement is the only way in which changes made to inmemory
read files can be written to the database.
WARNING: The contents written to the database files with a flush database command are permanent and
cannot be rolled back with a transaction rollback.
5.3.4 SQL Optimization Statistics
The RDM Server SQL query optimizer (see Chapter 14) utilizes data distribution statistics to assist its process of determining
the best methods to use to efficiently execute a given select (or update/delete) statement. It is important that these statistics be
kept up to date so that they provide a reasonable estimation of the characteristics of the data stored in the database. These statistics are generated from the current state of the database contents by executing the update statistics statement on the specified database as follows.
update_stats:
update {stats | statistics} on dbname
The statistics are collected and stored in the SQL system catalog by executing the update statistics statement. The histogram
for each column is collected from a sampling of the data files. The other statistics are maintained by the RDM Server runtime
system.
The histogram for each column contains a sampling of values (25 by default, controlled by the rdmserver.ini file
OptHistoSize configuration parameter) and, for each value, a count of the number of times it was found in the sampled
rows (1000 rows by default, controlled by the rdmserver.ini file OptSampleSize configuration parameter). The sampled
values are taken from rows evenly distributed throughout the table.
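To make the idea of a sampled histogram concrete, here is a rough Python sketch. It is not the RDM Server algorithm: the function name is hypothetical, picking evenly spaced row positions is one reading of "evenly distributed", and keeping the most frequent values when more than 25 distinct values are found is purely an assumption.

```python
from collections import Counter


def sample_histogram(column_values, sample_size=1000, histo_size=25):
    """Build (value, count) pairs from an evenly spaced sample of rows.

    sample_size and histo_size mirror the documented defaults for
    OptSampleSize and OptHistoSize; everything else is an assumption.
    """
    total = len(column_values)
    if total == 0:
        return []
    count = min(sample_size, total)
    # Evenly spaced row positions across the whole table.
    positions = [i * total // count for i in range(count)]
    counts = Counter(column_values[pos] for pos in positions)
    # Assumption: keep the histo_size most frequent sampled values.
    return counts.most_common(histo_size)


# A made-up "state" column with a skewed distribution over 2000 rows.
states = ["WA"] * 1200 + ["OR"] * 600 + ["CA"] * 200
for value, found in sample_histogram(states):
    print(value, found)
```

Because only every second row is sampled in this example, the counts describe the 1000 sampled rows rather than the full table. An optimizer would scale such counts to estimate how selective a condition like state = "CA" is.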
When update statistics has not been performed on a database, RDM Server SQL uses default values that assume each table
contains 1000 rows. It is highly recommended that you always execute update statistics on every production database. The
execution time for an update statistics statement is not excessive in RDM Server and does not vary significantly with the size
of the database. Therefore, we suggest regular executions (possibly once per week or month, or following significant changes
to the database).
5.4 Security Logging
RDM Server SQL provides the ability to log all grant and revoke statements that are issued. The SecurityLogging configuration parameter is used to activate (=1) or deactivate this feature (=0). SecurityLogging is by default disabled. When
enabled, RDM Server SQL records the information associated with each grant and revoke statement that is successfully
executed in a row that is stored in the system catalog (syscat) table sysseclog along with a copy of the text of the command
being stored in table systext.
The following example shows a query that can be used to display the security log, showing when each command was issued, the
name of the issuing user, and a copy of the grant/revoke statement.
select issued, user_name, txtln from sysseclog natural join systext;
ISSUED                      USER_NAME    TXTLN
2012-04-26 13:59:27.3160    admin        grant all privileges on item to wayne;
2012-04-26 13:59:16.5520    admin        grant all privileges on product to wayne;
2012-04-26 13:58:59.1110    admin        grant all privileges on outlet to wayne;
5.5 Miscellaneous Administrative Functions
5.5.1 Login/Logout Procedures
Login/logout procedures are stored procedures that the SQL system calls automatically whenever a user connects to or disconnects from a server. Two types of login/logout procedures are available:
• Public login/logout procedures are called whenever any user connects to or disconnects from the server.
• Private login/logout procedures are associated with particular users and are called only when those users connect or disconnect.
If both a public and a private procedure have been defined for a user, both procedures are called; the public procedure is
called before the private procedure.
Login/logout procedures cannot return a result set and cannot have arguments. They are typically used for setting user environment values (e.g., display formats) or for performing specialized security functions. A login or logout procedure can be written either as a standard SQL stored procedure or as a C-based, user-defined procedure (UDP).
Login/logout procedures are registered using the set login procedure statement with the following syntax.
set_login_procedure:
set {login | logout} proc[edure] for {public | username [, username ]...} to {procname | null}
The public option means that the login/logout procedure will be called whenever anyone logs in/out to/from the RDM Server
associated with the connection on which this statement is executed. Otherwise, the username list identifies the specific users
to which the procedure applies.
Only one private login/logout procedure can be associated with a user. Hence, a subsequent set login/logout procedure call
will replace the previous one.
The following example creates a stored procedure called set_germany, which is to be used as a login procedure that defines
the user environment for German users.
create proc set_germany as
set currency to "€";
set date display(12, "yyyy mmm dd");
set decimal to ",";
set thousands to ".";
set decimal display(20, "#.#,##' €'");
end proc;
The following statement registers the set_germany procedure as the login procedure for users Kurt, Wolfgang, Helmut, and
Werner.
set login proc for "Kurt", "Wolfgang", "Helmut", "Werner" to set_germany;
The use of login/logout procedures can be enabled or disabled using the set login statement as follows.
set_login:
set login [to | =] { on | off }
The effect of this statement is system-wide and will persist until the next set login is issued by this or another administrator
user. Use of login procedures is initially turned off.
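The registration and calling rules for login procedures can be summarized with a small Python model. This is a sketch only: the class, its method names, and the procedure names audit_login and set_austria are hypothetical, and treating "to null" as removing a registration is an assumption based on the syntax shown above.

```python
class LoginProcedureRegistry:
    """Toy model of public and private login procedure registration."""

    def __init__(self):
        self.enabled = False   # use of login procedures is initially turned off
        self.public = None
        self.private = {}      # user name -> procedure name (names are case-sensitive)

    def set_login_proc(self, procname, usernames=None):
        """Model 'set login proc for {public | username, ...} to {procname | null}'."""
        if usernames is None:              # the "public" form
            self.public = procname
            return
        for name in usernames:
            if procname is None:           # assumption: "to null" removes the entry
                self.private.pop(name, None)
            else:
                self.private[name] = procname   # replaces any earlier private procedure

    def procedures_called_at_login(self, username):
        if not self.enabled:
            return []
        called = []
        if self.public is not None:
            called.append(self.public)     # the public procedure is called first
        if username in self.private:
            called.append(self.private[username])
        return called


registry = LoginProcedureRegistry()
registry.set_login_proc("set_germany", ["Kurt", "Wolfgang", "Helmut", "Werner"])
registry.set_login_proc("audit_login")               # hypothetical public procedure
print(registry.procedures_called_at_login("Kurt"))   # nothing runs while login is off
registry.enabled = True                               # corresponds to 'set login on'
print(registry.procedures_called_at_login("Kurt"))
```

The model returns the public procedure ahead of the private one, matching the calling order described in this section. A user with no private procedure gets only the public one.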
5.5.2 RDM Server Console Notifications
The notify statement can be used to display a message on the RDM Server console. The syntax is shown below.
notify:
notify {"message" | procvar | trigvar | ?}
A stored procedure variable (procvar) can be specified when the notify statement is executed within a stored procedure. A trigger variable (trigvar) that references an old or new row column value can be specified when the notify statement is executed
within a trigger. If a parameter marker is specified, then the bound parameter value must be of type char or varchar.
The following example shows how the notify statement can be used in a trigger.
create trigger grade_watch before update of grade on course
referencing old row as bc new row as nc
for each row
begin atomic
notify "Grade change made to: "
notify bc.student_course
end;
6. Defining a Database
A poorly designed database can create all kinds of difficulties for the user of a database application. Unfortunately, the blame for those difficulties is often laid at the feet of the database management system which, try as it might, simply cannot use non-existent access paths to quickly get at the needed data. Good database design is as much an art as it is engineering, and a solid understanding of the application requirements is a necessary prerequisite. It is not the purpose of this document to teach you how to produce good database designs. You do need to understand, however, that designing a database is a complex task and that the quality of the application in which it is to be used is highly dependent on the quality of the database design. If you are not experienced in designing databases, it is highly recommended that you first consult one of the many good books on that subject before setting out to develop your RDM Server SQL database.
Information in a relational database is stored in tables. Each table is composed of columns that store a particular type of
information and rows that correspond to a particular record in the table. A simple but effective analogy can be made with a
file cabinet as illustrated in Figure 6-1.
Figure 6-1. A File Cabinet is a Database
A file cabinet contains drawers. Each drawer contains a set of files organized around a common theme. For example, one
drawer might contain customer information while another drawer might contain vendor information. Each drawer holds individual file folders for each customer or vendor, sorted in customer or vendor name order. Each customer file contains specific
information about the customer. The cabinet corresponds to a database, each drawer is like a table, and each folder is like a
row in the table.
Typically, tables are viewed as shown in Figure 6-2, where the basic components of a database table are identified in an
example customer table. Each column of the table has a name that identifies the kind of information it contains. Each row
gives all of the information relating to a particular customer.
Figure 6-2. Definition of a "Table"
Suppose that you want to expand this example further and define a simple sales order database that, initially, keeps track of
salespersons and their customers. Figure 6-3 shows how this information could be stored in a table.
Figure 6-3. Salesperson Accounts Table
There are columns for each salesperson's name and commission rate. Each salesperson has one or more customer accounts. The
customer's company name, city, and state are also stored with the data of the salesperson who services that customer's account.
Note that the salesperson's name and commission are replicated in all of the rows that identify the salesperson's customers.
Such duplicated data is called redundant data. One of the goals in designing a database is to minimize the amount of redundant data that must be stored in a database. A description of how this is done is given below in section 6.3.4.
A database schema is the definition of what kind of data is to be stored and how that data is to be organized in the database.
The Database Definition Language (DDL) consists of the SQL statements that are used to describe a particular database
schema (also called the database definition). Five DDL statements are provided in RDM Server SQL: create database
(schema), create file, create table, create index, and create join. The example below shows the RDM Server SQL DDL specification that corresponds to the TIMS Core API database definition.
create database tims on sqldev
    disable null values
    disable references count;

create file tims_d1;
create file tims_d2;
create file tims_k1;
create file tims_k2;

create table author(
    name            char(31) primary key
) in tims_d2;
create unique index author_key on author(name) in tims_k2;

create table info(
    id_code         char(15) primary key,
    info_title      char(79),
    publisher       char(31),
    pub_date        char(11),
    info_type       smallint,
    name            char(31) references author
) in tims_d2;
create unique index info_key on info(id_code) in tims_k1;
create join has_published order last on info(name);

create table borrower(
    myfriend        char(31),
    date_borrowed   date,
    date_returned   date,
    id_code         char(15) references info
) in tims_d2;
create index borrower_key on borrower(myfriend) in tims_k2;
create join loaned_books order last on borrower(id_code);

create table text(
    line            char(79),
    id_code         char(15) references info
) in tims_d2;
create join abstract order last on text(id_code);

create table keyword(
    word            char(31) primary key
) in tims_d1;
create unique index keyword_key on keyword(word) in tims_k2;

create table intersect(
    info_type       smallint,
    id_code         char(15) references info,
    word            char(31) references keyword
) in tims_d1;
create join key_to_info order last on intersect(word);
create join info_to_key order last on intersect(id_code);
Detailed explanations of each of the statements used in the above example are given in the following sections of this
chapter. Section 6.1 explains the use of the create database (schema) statement, which names the database that will be
defined by the DDL statements that follow it. The create file statement, which can be used to define the files in which database data is stored, is described in section 6.2. The create table statement, described in section 6.3, is used to define the characteristics of a table that will be stored in the database. The create index and create join statements are used to define
methods to quickly access database data and are described in section 6.4 and section 6.5, respectively. Instructions on how to
compile an SQL DDL specification follow in section 6.6. The kinds of changes that can be made to the schema of an existing (and operational) database are described in section 6.7. Finally, the database definitions for the example databases
provided with RDM Server are described in section 6.8.
6.1 Create Database
A complete DDL specification begins with a create database statement that conforms to the following syntax.
create_database:
        create {database | schema [authorization]} dbname db_attributes
    |   create {database | schema} dbname authorization username db_attributes
db_attributes:
        [pagesize bytes]
    |   slotsize {4 | 6 | 8}
    |   on devname
    |   [{enable | disable} null values]
    |   [{enable | disable} reference count]
The name of the database to be created is specified by the dbname identifier which is case-insensitive meaning that "Sales",
"sales", and "SALES" all refer to the same database. The create schema form follows the SQL standard. If the authorization
username clause is specified then the owner of the database will be the user named username. Otherwise the owner is the user
submitting this statement.
The pagesize clause sets the default database file page size to the number of bytes given by the integer constant bytes. It is
recommended that this value be set to a multiple of the standard block size for the file system on which RDM Server is running. The default page size is 1024.
The slotsize clause specifies the number of bytes to be used for the record (row) slot number used in an RDM Server database
address. The slotsize defines the maximum number of rows that can be stored in a database file as the maximum unsigned 4,
6, or 8 byte integer value. The default slotsize is 4.
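The row limits implied by each slotsize follow directly from the size of an unsigned integer of that width. The short Python calculation below works them out. The function name is hypothetical and the code is only an arithmetic illustration.

```python
def max_slot_number(slotsize):
    """Largest unsigned integer that fits in a slot of the given size in bytes."""
    if slotsize not in (4, 6, 8):
        raise ValueError("slotsize must be 4, 6, or 8")
    return 2 ** (8 * slotsize) - 1


for size in (4, 6, 8):
    print(size, max_slot_number(size))
```

For the default slotsize of 4 this gives 4294967295, so a larger slotsize is only needed for files expected to hold more rows than that.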
The on clause is used to specify the default device on which the database files will be stored. The create file statement can be
used to locate database files on separate devices if desired.
RDM Server SQL maintains in each table row a bitmap that keeps track of that row's null column values. The column value is
null when the bit in the bitmap associated with that column is set to 1. One byte is allocated for this bitmap for every 8
columns that are declared in that row's table. These bitmaps are automatically allocated and invisibly maintained by RDM
Server SQL. However, some applications (e.g., those designed for Core API use) do not require the use of SQL null
column values. Hence, the disable null values clause can be specified to disable the allocation and use of the null values bitmap for the database.
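The per-row null bitmap described above can be pictured with the following Python sketch. It only illustrates the idea: the function names are hypothetical, rounding the byte count up for tables whose column count is not a multiple of 8 is an assumption, and so is the choice of which bit within a byte corresponds to which column.

```python
def null_bitmap_bytes(column_count):
    """One byte per 8 declared columns, rounded up (rounding is an assumption)."""
    return (column_count + 7) // 8


def null_bitmap(row):
    """Build a bitmap with a 1 bit for every null (None) column value.

    Bit order within each byte is an assumption made for this sketch.
    """
    bitmap = bytearray(null_bitmap_bytes(len(row)))
    for index, value in enumerate(row):
        if value is None:
            bitmap[index // 8] |= 1 << (index % 8)
    return bytes(bitmap)


# A six-column row whose last column is null (values are made up).
row = ["AustenJ", "Austen, Jane", "F", 1775, 1817, None]
print(null_bitmap_bytes(len(row)))   # bytes needed for six columns
print(null_bitmap(row).hex())        # bit 5 is set for the null sixth column
```

Disabling null values for a database removes this per-row overhead, which is why the clause is useful for databases that never store null column values.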
SQL requires that referential integrity be enforced for foreign and primary key relationships. This means that every row in the
primary key table that is referenced by a foreign key value in a row of the referencing table must exist. This is automatically
handled by SQL for those foreign keys on which a create join has been defined. For the other foreign keys, SQL maintains in
each referenced primary key table row a count of the number of current references to that row. RDM Server SQL enforces referential integrity by only allowing a primary key row to be deleted or its primary key value to be updated when its references
count is zero. The references count value is automatically allocated and invisibly maintained by the SQL system for each row
in the referenced, primary key table. The allocation and use of the references count can be disabled by specifying the disable
references count clause on the create database statement.
NOTE: When disable references count is specified, it will not be possible to delete rows from (or update the
primary key value in) a primary key table that is referenced by a foreign key for which a create join has
not been defined.
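The references count mechanism can be modeled in Python as follows. This is a toy illustration of the rule that a referenced row can be deleted only when its count is zero. The class and method names are hypothetical and say nothing about how RDM Server stores the count internally.

```python
class PrimaryKeyTable:
    """Toy model of a referenced table that keeps a references count per row."""

    def __init__(self):
        self._ref_counts = {}   # primary key value -> references count

    def insert(self, key):
        if key in self._ref_counts:
            raise ValueError("duplicate primary key: %r" % (key,))
        self._ref_counts[key] = 0

    def add_reference(self, key):
        # A foreign key value must match an existing primary key row.
        if key not in self._ref_counts:
            raise KeyError("no primary key row for %r" % (key,))
        self._ref_counts[key] += 1

    def remove_reference(self, key):
        self._ref_counts[key] -= 1

    def delete(self, key):
        # Deleting is only allowed when nothing references the row.
        if self._ref_counts[key] != 0:
            raise ValueError("row %r is still referenced" % (key,))
        del self._ref_counts[key]


authors = PrimaryKeyTable()
authors.insert("AustenJ")
authors.add_reference("AustenJ")       # e.g., a row in another table names this author
try:
    authors.delete("AustenJ")
except ValueError as err:
    print(err)
authors.remove_reference("AustenJ")
authors.delete("AustenJ")              # succeeds once the count is back to zero
```

The first delete attempt is rejected because one reference remains. After the referencing row is removed, the same delete goes through.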
The following example shows a create database statement for the bookshop database with a default page size of 4096 bytes
and located on device booksdev.
create database bookshop
pagesize 4096 on booksdev;
The create database for the RDM Server system catalog database is as follows.
create database syscat on catdev
disable references count
disable null values;
Note that this database is actually a Core API database as RDM Server SQL is itself a Core API application. Hence, the use of
both null values and the references count is disabled.
6.2 Create File
The create file statement defines a logical file that stores the rows of one or more tables, the contents of one or more
indexes, or blob values. The table or index data to be stored in the file is specified using the in clause of a subsequent create table or create index statement. The syntax for create file is as follows.
create_file:
create {file | tablespace} filename [pagesize bytes] [on devname]
The filename is a case-insensitive identifier to be referenced in an in clause of a later DDL statement. The pagesize clause can
be used to specify the page size to be used for this particular file. If not specified, the default page size for the database will
be used. The on clause specifies the name of the RDM Server device on which the file will be located. If not specified, the
file is located on the default device for the database.
Use of the create file statement is not required. However, it must be used when a page size other than the database default is needed or
when a file needs to be located on a device other than the database's default device.
Files referenced in an in clause must be created before the statement that references them is compiled.
A file can contain only one kind of content. In other words, a file can contain either the rows of one or more tables (a
data file), the occurrences of one or more index keys (a key file), the occurrences of a single hash index, or the occurrences of
one or more blob (e.g., long varchar) columns (a blob file).
A portion of the RDM Server system catalog SQL DDL specification is shown in the example below that illustrates the use of
the create file statement.
create database syscat on catdev
    disable references count
    disable null values;
...
/* index files */
create file sysnames;    // all name indexes
create file syspfkeys;   // primary and foreign key column indexes
...
/* table files */
create file systabs;     // systable, ...
create file syspkeys;    // syskey
create file sysdbs;      // sysparms, sysdb, sysindex
...
/* blob files */
create file syscblobs pagesize 128; // long varchar data
...
create table sysdb "database definition"
(
    name char(32) not null unique compare(nocase)
        "name of database",
    ...
) in sysdbs;
create unique index db_name on sysdb(name) in sysnames;
...
create table syskey "primary or unique key definition"
(
) in syspkeys;
create unique index pkey on syskey(cols) in syspfkeys;
...
create table systable "table definition"
(
    table_addr db_addr primary key,
    name char(32) not null compare(nocase)
        "table name",
    dbid integer not null
        "database identifier",
    ...
    defn long varchar in syscblobs
        "definition string",
    ...
) in systabs;
create unique index tab_name on systable(name, dbid) in sysnames;
Note that RDM Server allows both styles of C comments (/* ... */ and //) to be embedded in an SQL script.
6.3 Create Table
An SQL table is the basic container for all data in a database. It consists of a set of rows each comprised of a fixed number of
columns. A simple example of a table declaration and the contents of a table is given below. The example shows the create
table declaration for the author table in the bookshop example database.
create table author(
    last_name   char(13) primary key,
    full_name   char(35),
    gender      char(1),
    yr_born     smallint,
    yr_died     smallint,
    short_bio   varchar(250)
);
The bookshop database contains 67 rows in the author table. Each row has values for each of the 6 columns declared in the
table. Some of the rows from this table are shown below. Note that the short_bio column values are truncated due to the size
of the display window.
LAST_NAME   FULL_NAME                      GENDER  YR_BORN  YR_DIED  SHORT_BIO
AlcottL     Alcott, Louisa May             M          1832     1888  American novelist. She is ...
AustenJ     Austen, Jane                   F          1775     1817  English novelist whose wor ...
BaconF      Bacon, Francis                 M          1561     1626  English philosopher, state ...
BarrieJ     Barrie, J. M. (James Matthew)  M          1860     1937  Scottish author and dramat ...
BaumL       Baum, L. Frank (Lyman Frank)   M          1856     1919  American author of chil-dre ...
BronteC     Bronte, Charlotte              F          1816     1855  English novelist and poet, ...
BronteE     Bronte, Emily                  F          1818     1848  English novelist and poet, ...
BurnsR      Burns, Robert                  M          1759     1796  Scottish poet and a lyr-ici ...
BurroughsE  Burroughs, Edgar Rice          M          1875     1950  American author, best know ...
CarlyleT    Carlyle, Thomas                M          1795     1881  Scottish satirical writer, ...
CarrollL    Carroll, Lewis                 M          1832     1898  (Charles Lutwidge Dodg-son) ...
CatherW     Cather, Willa                  F          1873     1947  a Pulitzer Prize-winning A ...
. . .
TolstoyL    Tolstoy, Leo                   M          1828     1910  Russian writer widely rega ...
TrollopeA   Trollope, Anthony              M          1815     1882  One of the most...re-specte ...
TwainM      Twain, Mark                    M          1835     1910  (Samuel Clemens) American ...
VerneJ      Verne, Jules                   M          1828     1905  French author who helped p ...
WellsH      Wells, H. G. (Herbert George)  M          1866     1946  English author, now best k ...
WhartonE    Wharton, Edith                 F          1862     1937  Pulitzer Prize-winning Ame ...
WhitmanW    Whitman, Walt                  M          1819     1892  American poet, essayist, j ...
WildeO      Wilde, Oscar                   M          1854     1900  Irish writer, poet, and pr ...
WoolfV      Woolf, Virginia                F          1882     1941  English author, essayist, ...
Details on how to properly define a table using the create table statement are provided in the following sections of this
chapter.
6.3.1 Table Declarations
The create table statement is used to define a table and must conform to the following syntax.
create_table:
        create table [dbname.]tabname ["description"]
            (column_defn [, column_defn]... [, table_constraint]...)
            [in filename]
            [inmemory [persistent | volatile | read] [maxpgs = maxpages]]
The table will be contained in the database defined by the most recently executed create database statement.
The name of the table is given by tabname which is an identifier of up to 32 characters in length. It is case-insensitive so that
"salesperson" and "SALESPERSON" both refer to the same table. The table name must be unique—there can be no other table
defined in the database with the same name. An optional "description" can be specified to provide additional descriptive
information about the table which will be stored in the system catalog entry for this table.
The in filename clause specifies a file, previously declared using create file, in which the rows of the table will be stored.
If no in clause is specified, the system will automatically create a file for the table's rows using the database's default
page size and store it on the database's device.
The inmemory clause indicates that all of the rows in the table are to be maintained in the RDM Server computer's memory
while the database containing the table is open. The read, persistent, and volatile options control whether the table's rows are
read from disk when the database is opened (read, persistent), and whether they are written to the disk when the database is
closed (persistent). The default inmemory option is volatile which means that the table is always empty when the database is
first opened. The read option means that all of the table's rows are read from the file when the database is opened; changes to
the data are allowed but are not written back to the file on closing. The persistent option means that the table's changes that
were made while the database was open are written back to the file when the database is closed. The maxpgs parameter is
used to specify the maximum number of database pages allowed for the table. (A database page is the basic unit of file
input/output in RDM Server. A page contains one or more rows from a table. The number of rows per page is computed based
on the physical size of the table's row and the page size defined for the database file in which the table's rows are stored.)
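As an illustration of the in and inmemory clauses, the following sketch declares one table whose rows are stored in an
explicitly created file and kept in memory with the persistent option, and a second table that relies on the default volatile
behavior. The file, table, and column names (ordfile, order_cache, session_tmp) and the maxpgs value are hypothetical. The
statements only follow the create_table syntax given above and are assumed to appear after a create database statement; they
have not been run against an RDM Server installation.

```sql
// Hypothetical names throughout; only the clause layout follows create_table.
create file ordfile;                    // data file declared before it is referenced

create table order_cache "recent orders kept in memory"
(
    ord_num integer primary key,
    cust_id char(3) not null
) in ordfile
  inmemory persistent maxpgs = 500;     // rows read on open, written back on close

create table session_tmp "scratch data"
(
    sess_id integer primary key,
    userid  char(32)
) inmemory;                             // volatile is the default: empty on each open
```

Choosing read instead of persistent would load the rows when the database is opened but discard any changes on closing.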
6.3.2 Table Column Declarations
A table is comprised of one or more column definitions. Each column definition must follow the syntax shown below.
column_defn:
        colname basic_type [default {constant | null | auto}]
            [not null]
            [primary key | unique]
            [references [dbname.]tabname [ (colname[, colname]...) ]]
            [check(cond_expr)]
            [compare({nocase | wnocase | cmdFcnId})]
            ["description"]
    |   colname long {varchar | wvarchar | varbinary}
            [default {constant | null}] [not null] [in filename]
The name of the column is given by colname which is a case-insensitive identifier. There can only be one column declared in
the table with that name but it can be used in other tables. A good practice when naming columns is to use the same names
for the primary and foreign key columns (except, of course, when the foreign key references the same table as in the salesperson table in the sales database example, see section 6.8 below). Keeping all other column names unique across all the
tables in the database will allow you to use the natural join operation in your select statements.
Data Types
Table columns can be declared to contain values of one of the following data types as specified in the syntax below.
basic_type:
        {char | varchar | wchar | wvarchar} [( length )]
    |   {binary | varbinary} [( length )]
    |   {double [precision] | float}
    |   real
    |   tinyint
    |   {smallint | short}
    |   {int | integer | long}
    |   bigint
    |   rowid [ '[' {4 | 6 | 8} ']' ]
    |   decimal [(precision[, scale])]
    |   date | time [(precision)] | timestamp [(precision)]
Descriptions for each of these data types are given in the following table.
Table 6-1. RDM Server SQL Data Types
Data Type            Description
char, varchar        ASCII characters. The length specifies the maximum number of characters that can be stored
                     in the column which will be represented and stored as a null-terminated string. If no length is
                     specified (char only), a single character only is stored.
wchar, wvarchar      Wide character data in which the storage format is operating system dependent. On Windows,
                     wchar is stored as UTF-16 characters. On Linux, they are stored as UCS4 characters. The
                     length specifies the maximum number of characters (not bytes) that can be stored in the column
                     which will be represented and stored as a null-terminated string.
binary, varbinary    Binary data where the length specifies the number of bytes that are stored in the column.
double, float        A 64-bit floating point number.
real                 A 32-bit floating point number.
tinyint              An 8-bit, signed integer.
smallint             A 16-bit, signed integer.
int, integer, long   A 32-bit, signed integer.
bigint               A 64-bit, signed integer.
rowid                A 32-bit or 64-bit (depending on slotsize value) unsigned integer that holds the address of a
                     particular table row in the database.
decimal              A binary-coded decimal in which precision specifies the maximum number of significant
                     digits (default 32) and scale specifies the number of decimal digits (default 16).
date                 Date values are stored as a 32-bit unsigned integer containing the number of elapsed days
                     since Jan 1, 1 A.D.
time                 Time values are stored as a 32-bit unsigned integer containing the elapsed time since midnight
                     (to 4 decimal places => # seconds * 10000).
timestamp            A struct containing a date and time as defined above.
long varchar         A blob data column containing up to 2.1 gigabytes of ASCII character data which will be
                     represented and stored as a null-terminated string.
long wvarchar        A blob data column containing up to 2.1 gigabytes of wide character data which will be
                     represented and stored as a null-terminated string.
long varbinary       A blob data column containing up to 2.1 gigabytes of binary data.
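To show how these types look in a declaration, the sketch below declares a single table that uses one column of most of the
types in Table 6-1. The table and its columns (sensor_reading and so on) are hypothetical and the lengths, precision, and scale
are arbitrary choices; only the type names and their parameter forms are taken from the basic_type syntax above.

```sql
// Hypothetical table; each column illustrates one entry of Table 6-1.
create table sensor_reading "illustration of the supported column types"
(
    reading_id  bigint primary key,     // 64-bit signed integer
    sensor_name varchar(40) not null,   // up to 40 ASCII characters
    label       wvarchar(40),           // up to 40 wide characters (not bytes)
    raw_bytes   varbinary(16),          // up to 16 bytes of binary data
    temperature real,                   // 32-bit floating point
    pressure    double,                 // 64-bit floating point
    status      tinyint,                // 8-bit signed integer
    cost        decimal(10,2),          // 10 significant digits, 2 decimal digits
    taken_on    date,
    taken_at    time,
    notes       long varchar            // blob column of character data
);
```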
Default and Auto-Incremented Values
column_defn:
colname basic_type [default {constant | null | auto}]
The default clause can be used to specify a default value for a column when one has not been provided on an insert statement
for the table. The default is specified as a literal constant that is appropriate for that particular data type (see section 4.3) or it
can be set to null (the default).
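A minimal sketch of the default clause follows. The ticket table and its columns are hypothetical, and the constants are chosen
only to match each column's data type. The now constant is used in the same way as for the ord_date column of the ship_log table
in this section.

```sql
// Hypothetical table showing one default per data type.
create table ticket
(
    ticket_id integer primary key,
    status    char(1) default "N",      // string constant
    priority  smallint default 3,       // numeric constant
    opened    timestamp default now,    // system constant, as in ship_log
    closed_by char(32) default null     // explicit form of the implicit default
);
```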
A column of type integer is designated as an auto-increment column by specifying the default auto clause. This will cause
SQL to automatically assign the next monotonically increasing non-negative integer value to the column when a value is not
specified for the column in an insert statement.
For example, the log_num column of the ship_log table is declared with default auto in the following create table statement.
create table ship_log
(
    LOG_NUM integer default auto primary key,
    ORD_DATE timestamp default now
        "date/time when order was entered",
    ORD_NUM smallint not null
        "order number",
    PROD_ID smallint not null
        "product id number",
    LOC_ID char(3) not null
        "outlet location id",
    QUANTITY integer not null
        "quantity of item to be shipped from loc_id",
    BACKORDERED smallint default 0
        "set to 1 when item is backordered",
    check(OKayToShip(ord_num, prod_id, loc_id, quantity, backordered) = 1)
);
When executing an insert statement, SQL automatically generates a value for log_num if no value has been specified. For
example, in the insert statement below, SQL supplies the value for the log_num column.
insert into ship_log
values(, date "1998-07-02", 3710, 17419, "SEA", 1, 0);
However, if you supply a value, then that value will be stored. In the example below, the log_num value stored will be 12.
insert into ship_log
values(12, date "1998-07-02", 3710, 17419, "SEA", 1, 0);
You should have little reason to assign your own values. But if you do, be sure to assign a value lower than the most recently
auto-generated value.
The automatically generated integer values always increase but do not necessarily increase by exactly 1 each time. If a
table's rows are stored in a file that also contains rows from other tables, the next number might exceed the current
number by more than 1.
Values from deleted rows are not reused.
The use of auto-increment default values does not incur any additional performance cost. RDM Server has implemented them
as part of the standard file header, which uses special high-performance logging and recovery mechanisms.
Column Constraints
Column constraints restrict the values that can be legally stored in a column. The clauses used to do this are shown in the
following syntax.
column_defn:
colname basic_type
[not null]
[primary key | unique]
[references [dbname.]tabname [ (colname[, colname]...) ]]
[check(cond_expr)]
Specifying not null indicates that the column cannot be assigned a null value. This means that either a default clause must be
specified for the column (of course, default null is not allowed) or a value for the column must always be specified in an
insert statement on the table.
Declaring a column to be a primary key or unique means that only one row in the table can have any specific value in that
column. SQL enforces this by creating a unique index that contains a copy of each row's column value. The error "integrity
constraint violation: unique" is returned for any insert or update statement that attempts to assign a column value that is
already being used in another row of the table. Note that primary key and unique columns are automatically treated as not
null columns even when the not null clause is omitted in the column declaration.
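The effect of the primary key and unique constraints can be sketched with a small, hypothetical dept table. The table, its
columns, and the inserted values are illustrative only; the rejection described in the final comment is what the paragraph
above leads one to expect, not captured server output.

```sql
// Hypothetical table with a primary key and a second unique column.
create table dept
(
    dept_code char(4) primary key,          // treated as not null
    dept_name char(30) not null unique
);

insert into dept values("ENG", "Engineering");
insert into dept values("OPS", "Operations");

// Expected to fail with "integrity constraint violation: unique"
// because the dept_name value "Engineering" is already in use.
insert into dept values("RND", "Engineering");
```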
A column that is declared with the references clause is a foreign key column that references the primary key column
of the referenced table, tabname, which can be in a separate database (dbname). This means that there must exist a row in the
referenced table with a primary key value that matches the column value being assigned by the insert or update statement.
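The following sketch shows a column-level references clause at work. The region_office and staff tables and their contents are
hypothetical, and the outcome noted for the last insert is the behavior implied by the paragraph above rather than recorded
output.

```sql
// Hypothetical parent and child tables.
create table region_office
(
    office_code char(3) primary key,
    city        char(17) not null
);

create table staff
(
    staff_id    integer primary key,
    staff_name  char(30) not null,
    office_code char(3) references region_office
);

insert into region_office values("SEA", "Seattle");
insert into staff values(1, "Lovelace, Ada", "SEA");    // matching office row exists
insert into staff values(2, "Boole, George", "XYZ");    // expected to be rejected: no office "XYZ"
```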
The check clause is used to specify a conditional expression that must evaluate to true for every row that is stored in the
table. The specified conditional can only reference this column and will typically check that the column's value belongs to a
certain range or set of values. Built-in or user-defined functions can be called from the conditional expression. Conditional
expressions are specified in the usual way as given in the syntax below.
cond_expr:
        rel_expr [bool_oper rel_expr]...
rel_expr:
        expression [not] rel_oper {expression | [{any | some} | all] (subquery)}
    |   expression [not] between constant and constant
    |   expression [not] in {(constant[, constant]...) | (subquery)}
    |   [tabname.]colname is [not] null
    |   string_expr [not] like "pattern"
    |   not rel_expr
    |   ( cond_expr )
    |   [not] exists (subquery)
    |   [tabname.]colname *= [tabname.]colname
    |   [tabname.]colname =* [tabname.]colname
subquery:
        select {* | expression} from {table_list | path_spec} [where cond_expr]
expression:
        arith_expr | string_expr
arith_expr:
        arith_operand [arith_operator arith_operand]...
arith_operand:
        constant | [tabname.]colname | arith_function | ( arith_expr)
arith_operator:
        + | - | * | /
arith_function:
        {sum | avg | max | min} (arith_expr)
    |   count ({* | [tabname.]colname})
    |   if ( cond_expr, arith_expr, arith_expr)
    |   numeric_function | datetime_function | system_function
    |   user_defined_function
string_expr:
        string_operand [^ string_operand]
string_operand:
        "string" | [tabname.]colname
    |   if ( cond_expr, string_expr, string_expr)
    |   string_function
    |   user_defined_function
rel_oper:
        = | ==
    |   <
    |   >
    |   <=
    |   >=
    |   <> | != | /=
bool_oper:
        & | && | and
    |   "|" | "||" | or
Descriptions of all supported SQL built-in functions can be found in Chapter 5 of the SQL Language Reference.
The following example gives the declaration of the salesperson table in the example sales database.
create table salesperson(
sale_id char(3) primary key,
sale_name char(30) not null,
dob date,
commission decimal(4,3) check(commission between 0.0 and 0.15),
region smallint check(region in (0,1,2,3)),
sales_tot double,
office char(3) references invntory.outlet(loc_id),
mgr_id char(3) references salesperson
);
This table contains a number of column constraint definitions. The sale_id column is defined as the table's primary key. The
sale_name column has the not null constraint, meaning that a salesperson's name must always be specified on an insert
statement. The commission column can only contain values in the range specified in its check clause. The region column must
contain a value equal to 0, 1, 2, or 3. The office column value, if not null (null is okay), must be the same as the loc_id
column of a row from the outlet table in the invntory database. And the mgr_id column, if not null (null is also okay), must be
the same as the sale_id of another row of the salesperson table (a self-referencing table is valid, but note that a row cannot
reference itself).
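As a further sketch of column-level check clauses, the hypothetical part table below combines several of the rel_expr forms
from the cond_expr syntax: a comparison joined with and, a between range, and an in list of string constants. Each condition
references only its own column, as required for a column constraint. The table, columns, and limits are illustrative
assumptions.

```sql
// Hypothetical table; each check uses a different rel_expr form.
create table part
(
    part_no  char(8) primary key,
    qty      integer not null check(qty >= 0 and qty <= 10000),
    discount decimal(4,3) check(discount between 0.0 and 0.5),
    grade    char(1) default "A" check(grade in ("A", "B", "C"))
);
```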
6.3.3 Table Constraint Declarations
Following all column definitions, table constraints can be defined. Table constraints are similar to column constraints and are
used to specify multi-column primary/unique and foreign key definitions and/or a check clause that can be used to specify a
conditional expression involving multiple columns in the table that must be true for all rows in the table. The syntax for specifying table constraints is as follows.
table_constraint:
        {primary key | unique} ( colname[, colname]... )
    |   foreign key ( colname[, colname]... )
            references [dbname.]tabname ( colname[, colname]... )
    |   check ( cond_expr )
The columns that comprise a unique or primary key cannot have null values.
The example below shows the create table statement for the note table in the sales database, which declares a primary
key consisting of three of the table's columns.
create table note(
note_id char(12) not null,
note_date date not null,
sale_id char(3) not null references salesperson,
cust_id char(3) references customer,
primary key(sale_id, note_id, note_date)
);
The note_line table declaration in the example below contains a table constraint that declares a foreign key to the note table
shown above.
create table note_line(
note_id char(12) not null,
note_date date not null,
sale_id char(3) not null,
txtln char(81) not null,
foreign key(sale_id, note_id, note_date) references note
);
Note that no column names are specified in the "references note" clause. The references clause usually references the primary
key of the referenced table but it could reference a unique column(s) declaration too. When the column names are not
provided, the references clause will always refer to the table's primary key.
NOTE: The number and data types of the columns specified in a foreign key must match exactly those of the corresponding referenced primary key (or unique) columns.
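The case in which a foreign key references a unique declaration instead of the primary key can be sketched as follows. The
warehouse and stock_item tables are hypothetical. Because the referenced columns are not the primary key, the column names are
given explicitly in the references clause, and the referencing columns are declared with the same number and data types.

```sql
// Hypothetical tables: the foreign key targets a unique(region, bin_code)
// declaration rather than the primary key wh_id.
create table warehouse
(
    wh_id    integer primary key,
    region   char(3) not null,
    bin_code char(6) not null,
    unique(region, bin_code)
);

create table stock_item
(
    item_id  integer primary key,
    region   char(3) not null,
    bin_code char(6) not null,
    foreign key(region, bin_code) references warehouse(region, bin_code)
);
```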
A portion of the sales_order table declaration is shown below which includes a check clause that ensures that the specified
amount is greater than or equal to the tax.
create table sales_order
(
cust_id char(3) not null references customer,
...
amount double,
tax real default 0.0,
...
check(amount >= tax)
);
A side note needs to be mentioned here. The amount and tax columns are declared as floating point types which may be okay
for this simple example database but is not recommended for columns that are intended to contain monetary values. Floating
point arithmetic is too prone to computational errors to be used for monetary calculations. Instead, always use decimal types.
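A sketch of the same kind of table constraint using decimal columns is shown below. The invoice table is hypothetical, and the
precision and scale (10 significant digits with 2 decimal digits) are an assumption that suits many currencies rather than a
recommendation taken from this guide.

```sql
// Hypothetical table: monetary columns declared as decimal instead of
// double or real, with the same kind of multi-column check clause.
create table invoice
(
    inv_no integer primary key,
    amount decimal(10,2) not null,
    tax    decimal(10,2) default 0.0,
    check(amount >= tax)
);
```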
6.3.4 Primary and Foreign Key Relationships
Consider the create table statement below and its contents as shown in Table 6-2.
create table customer(
sale_name char(30),
comm decimal(4,3),
office char(3),
company varchar(30),
city char(17),
state char(2),
zip char(5)
);
Table 6-2. Example Un-normalized Customer Table
sale_name
comm
Kennedy, Bob
0.075
Kennedy, Bob
0.075
Flores, Bob
0.100
Flores, Bob
0.100
Stouffer, Bill 0.080
Blades, Chris 0.080
Lister, Dave
0.075
Wyman, Eliska 0.075
Wyman, Eliska 0.075
Wyman, Eliska 0.075
Wyman, Eliska 0.075
Wyman, Eliska 0.075
Porter, Greg
0.080
Nash, Gail
0.070
Nash, Gail
0.070
Nash, Gail
0.070
Kirk, James
0.075
McGuire, Sidney
McGuire, Sidney
McGuire, Sidney
Williams, Steve
Williams, Steve
Williams, Steve
Robinson, Stephanie
Robinson, Stephanie
Robinson, Stephanie
Jones, Walter 0.070
Jones, Walter 0.070
Jones, Walter 0.070
Warren, Wayne 0.075
Warren, Wayne 0.075
office
DEN
DEN
SEA
SEA
SEA
SEA
ATL
NYC
NYC
NYC
NYC
NYC
SEA
DAL
DAL
DAL
ATL
0.070
0.070
0.070
0.075
0.075
0.075
0.070
0.070
0.070
CHI
CHI
CHI
MIN
MIN
company city
state
zip
Broncos Air Express
Denver
Cardinals Bookmakers
Phoenix
Seahawks Data Services Seattle
Forty-niners Venture Group
Colts Nuts & Bolts, Inc.
CO
80239
AZ
85021
WA
98121
San Francisco
Baltimore
Browns Kennels Cleveland
OH
Jets Overnight Express New York
Patriots Computer Corp. Foxboro MA
'Bills We Pay' Financial Corp. Buffalo
Giants Garments, Inc.
Jersey City
Lions Motor Company
Detroit MI
Saints Software Support New Orleans
Oilers Gas and Light Co.
Houston
Cowboys Data Services
Dallas TX
44115
NY
2131
NY
NJ
48243
LA
TX
75230
CA
IN
94127
46219
10021
14216
7749
70113
77268
WDC
Steelers National Bank Pittsburgh
PA
WDC
Redskins Outdoor Supply Co.
Arlington
WDC
Eagles Electronics Corp.
Philadelphia
ATL
Dolphins Diving School Miami
FL
33133
ATL
Falcons Microsystems, Inc.
Atlanta GA
ATL
Bucs Data Services
Tampa
FL
33601
LAX
Raiders Development Co. Los Angeles
CA
LAX
Chargers Credit Corp.
San Diego
CA
LAX
Rams Data Processing, Inc.
Los Angeles
Chiefs Management Corporation
Kansas City
MO
Bengels Imports Cincinnati
OH
45241
Bears Market Trends, Inc.
Chicago IL
60603
Vikings Athletic Equipment
Minneapolis
MN
Packers Van Lines
Green Bay
WI
54304
15234
VA
PA
22206
19106
30359
92717
92126
CA
64141
90075
55420
This table shows a customer list for a fictional company. Each customer entry contains information about the salesperson who
services that company. Notice that there are duplicate salesperson entries (the sale_name, comm, and office columns) because
most salespersons manage multiple customer accounts. Those duplicates comprise what is referred to as redundant data.
Conceptually, an entire database can be viewed as a single table in which there is a great deal of redundant data among the rows
of that database. Hence, an important aspect of database design is the need to significantly reduce the amount of redundant data
in order to reduce disk space consumption, which will also result in improved data access performance.
The database design technique that does this is called normalization. Normalization transfers the columns containing the
same redundant data into a separate table and then defines two new columns that will be used to associate the old data in
the new table with its original data in the old one. The new column in the new table is called the primary key. The new
column in the old table is called the foreign key.
For the example above, the create table declarations for the two tables would be as follows.
create table salesperson(
    sale_id char(3) primary key,
    sale_name char(30),
    comm decimal(4,3),
    office char(3)
);
create table customer(
company varchar(30),
city char(17),
state char(2),
zip char(5),
sale_id char(3) references salesperson
);
The sale_id column in the salesperson table is the primary key. Each row of the salesperson table must have a unique sale_id
value. The sale_id column in the customer table is a foreign key that references the specific salesperson row that identifies the
salesperson who services that customer. The amount of redundant data per customer row has been reduced from about 40
down to 3 bytes.
Table 6-3 Example Normalized Customer and Salesperson Tables
Table 6-3 shows the contents of the two tables after normalization. Each customer's salesperson is found from the row in the
salesperson table that has a matching sale_id column value. In order to see the name of the salesperson who services any particular customer you must perform a join (specifically an equi-join) between the two tables. An example of a join between the
salesperson and customer tables is shown in the following select statement which displays the customers and their salespersons for the companies located in California.
select sale_name, company, city, state from salesperson, customer
where salesperson.sale_id = customer.sale_id and state = "CA";
sale_name            company                      city           state
Robinson, Stephanie  Raiders Development Co.      Los Angeles    CA
Robinson, Stephanie  Rams Data Processing, Inc.   Los Angeles    CA
Robinson, Stephanie  Chargers Credit Corp.        San Diego      CA
Flores, Bob          Forty-niners Venture Group   San Francisco  CA
A one-to-many relationship is formed between two tables through primary and foreign key column declarations in which for a
given row in the primary key table there can be many rows in the foreign key table with the same value. It is often very helpful to refer to a graphical representation of a database schema in order to see all of the foreign and primary key relationships
that have been defined between the database tables. There are some very sophisticated standard ways to graphically depict a
database design. We prefer, however, a simpler method using an arrow between the two related tables where the arrow starts
at the primary key table (the "one" side of the one-to-many relationship) and the arrow ends at the foreign key table (the
"many" side of the one-to-many relationship). The arrow is labeled with the name of the foreign key column. The sales database example referred to in this documentation will be described in more detail later but a diagram of the schema showing all
of the foreign and primary key relationships is shown in the figure below.
Figure 6-4. Sales and Inventory Database Schema Diagram
Note that the sales example is actually comprised of two databases named sales and invntory. As you can see in the above
example, foreign and primary key relationships can even be declared between tables defined in separate databases.
It is usually a good design practice for primary and foreign key columns to have the same name. Moreover, while it is possible to declare multicolumn primary and foreign keys, it is better to define single column, unique primary keys. If there is
already data that uniquely identifies each row of a table (e.g., social security number, driver's or vehicle license number, etc.)
then you should make that the primary key. If not, RDM Server provides two easy-to-use methods that automatically assign
primary key values for you when rows are inserted into the table.
6.3.5 System-Assigned Primary Key Values
You can declare an integer column primary key to be auto-generated. The use of auto-generated integer column values was
described earlier in the "Default and Auto-Incremented Values" paragraph in section 6.3.2. In the following create table
statement the log_id column is declared to be an auto-generated, integer primary key.
create table activity_log(
    log_id      integer default auto primary key,
    userid      char(32),
    act_time    timestamp,
    act_code    tinyint,
    act_desc    varchar(256)
);
The example below shows an insert statement into the above table and the select statement that shows the log_id value that
was assigned.
insert into activity_log values(,user,now,1,"created auto-gen primary key example");
select log_id, act_desc from activity_log;
log_id  act_desc
     1  created auto-gen primary key example
Alternatively, you can declare a column to be a rowid primary key. A rowid primary key column uses a row's physical location in its data file to uniquely identify the row. Related tables would contain a rowid column foreign key referencing the
primary key row. This provides for the fastest possible method of locating rows based on the primary key value. Use of rowid
primary keys is much the same as auto-generated primary keys as shown in the example below.
create table activity_log(
    log_id      rowid primary key,
    userid      char(32),
    act_time    timestamp,
    act_code    tinyint,
    act_desc    varchar(256)
);
The example below shows an insert statement into the above table and the select statement that shows the log_id value that
was assigned.
insert into activity_log values(,user,now,1,"created auto-gen primary key example");
select log_id, act_desc from activity_log;
log_id  act_desc
     1  created auto-gen primary key example
A value can be assigned even to a rowid primary key but the value must be for a non-existent row in the database. This
allows you to export a table (in rowid order) including the rowid values so that they can be imported into an empty table
keeping the same rowid primary key column values. This is important because tables that have rowid foreign key references
to the rowid primary key table must also maintain their values for export/import purposes.
The primary difference between the two methods is that an auto-generated integer primary key has an index whereas no index
is needed for the rowid.
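A related table that refers to the rowid version of activity_log might be declared as sketched below. The activity_note table
and its columns are hypothetical. The only point illustrated is that the foreign key column is itself declared as rowid so that
it matches the data type of the referenced rowid primary key.

```sql
// Hypothetical child table holding a rowid foreign key to activity_log.
create table activity_note
(
    log_id   rowid not null references activity_log,
    note_no  smallint not null,
    note_txt varchar(256)
);
```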
6.4 Create Index
An index is a separate file containing the values of one or more table columns that can be used to quickly locate a row or
rows in the table. Two indexing methods are supported in RDM Server. The standard indexing method is a Btree which
organizes the indexed column values so that they are stored in sorted order. This allows fast access to all the rows that match
a specified range of values. It also provides the ability to retrieve the table rows in the column order defined by the create
index statement avoiding the need to do a separate sort when a select statement includes an order by clause for those
columns. The time required to locate a specific row using a Btree access depends on factors such as the size of the index key
and the total number of rows in the database but typically will require from 3 to 5 disk reads.
The second supported index method is called hashing, in which the location of a row is determined by performing a "hash"
of the indexed column value. This method can often locate a particular row in 1 disk access and so can be used to provide
very fast access to a row based on the indexed column value. However, the values are not sorted and, hence, hash indexes are
only used to find rows that match a specific value.
You do not directly use an index from SQL, but indexes are used by the RDM Server SQL optimizer in the selection of an
access plan for retrieving data from the database. More indexes provide the optimizer with more alternatives and can greatly
improve select execution performance. Unfortunately, the cost associated with a large number of indexes is a large amount of
required storage and lower performance of insert, update, and delete statements.
Therefore, your selection of table columns to include in an index requires careful consideration. In general, create an index on
the columns through which the table's rows typically will be accessed or sorted. Do not create an index for every possible sort
or query that may be of interest to a user. SQL can sort when the select statement is processed, so it is unnecessary to create
all indexes in advance. Create indexes on the columns you expect will be used most often in order to speed access to the
rows or to order the result rows.
The syntax for create index is as follows:
create_index:
create [unique | optional] index [dbname.]ndxname ["description"]
[using {btree | hash {(hashsize) | for hashsize rows}}]
on tabname ( colname [asc | desc] [, colname [asc | desc] ]... )
[in filename]
[inmemory [maxpgs = maxpages] ]
Each index declared in the database has a unique name specified by the identifier ndxname. As with all table and column
names, index names are case-insensitive. The dbname qualifier is only specified when the index is being added to a database
that already exists. You can optionally include a "description" of the index which will be stored in the system catalog with
the other index information.
The create unique index statement is used to create an index that cannot contain any duplicate values. In addition, null
column values are not allowed in columns that participate in a unique index.
The create optional index creates an index (always non-unique) that can be deactivated so that the overhead incurred during
the execution of an insert statement can be avoided. Optional indexes that have been activated behave just like any other
index. The values of the columns on which the index is based are stored in the index during the processing of an insert statement. Use of an activated optional index is also taken into consideration by the SQL query optimizer. When an optional
index is deactivated, the index values are not created when new rows are inserted nor does the optimizer use the deactivated
optional index. Note, however, that a delete or update of a row in which an index value has been stored in the optional index
will properly maintain it (i.e., the index value will be deleted/updated) even when the index is deactivated. Hence, the
activated or deactivated state of an optional index only affects the use of the index in the processing of an insert or select
statement.
Optional indexes are useful when the use of the index by the optimizer is important only when executing queries that do not
regularly occur. For example, an accounting system may activate optional indexes to improve the performance of month-end
reporting requirements but then deactivate them at all other times to improve performance of transactions in which new rows
are being added. To enable or disable use of an optional index, use the activate index and deactivate index statements. Execution of an activate index statement will read each row of the table and store an index value only for those rows that had
been inserted since the index was last deactivated. Initially, an optional index is deactivated.
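The month-end scenario described above might be scripted along the following lines. This is a minimal sketch only: it uses the loc_geo optional index from the inventory example database, and it assumes that the activate index and deactivate index statements each take the index name as their argument. Consult the statement reference for the exact syntax.

```sql
/* Assumed form: activate index ndxname / deactivate index ndxname. */

/* Store index values for rows inserted while loc_geo was deactivated. */
activate index loc_geo;

/* A reporting query that the optimizer can now satisfy through loc_geo. */
select loc_id, city from outlet where state = "WA";

/* Deactivate again so that later inserts avoid the index overhead. */
deactivate index loc_geo;
```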
You can specify the indexing method with the using clause. The btree method is the default when the using clause is not specified. For hash indexes the maximum number of rows that will (eventually) be stored in the table must be specified as the
hashsize. Note that this does not need to be exact, as it is unlikely that you can actually know this value in advance. The hash
algorithm relies on this information, so it needs to be sufficiently large to minimize the average number of rows that hash to
the same value. Note that hash indexes must also be unique.
The on clause specifies the table and table columns on which the index is to be created. For btree indexes you can also specify whether an indexed column is to be sorted in the index in either ascending (asc) or descending (desc) order.
Use the in clause to identify the file that contains the index. If not specified, the index will be maintained in a separate file
using the default page size (1024 bytes). For hash indexes, the file specified in the in clause can only be used to store one
hash index. For btree indexes, the file specified in the in clause can contain any number of other btree indexes. However, it is
recommended that each index be contained in its own file, as this will generally produce better performance. Note, though, that
on some embedded operating systems with older (or simpler) file management capabilities, having too many files
can also degrade performance.
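As a sketch of how the using and in clauses combine, the statements below declare a hash index that is stored in its own file. The table part, its columns, and the file name parth0 are hypothetical, and 50000 is only a rough estimate of the eventual row count.

```sql
/* Hypothetical file reserved for a single hash index. */
create file parth0;

create table part
(
    part_id integer primary key,
    part_desc char(40) not null
);

/* Hash indexes must be unique; 50000 is an estimate, not an exact limit. */
create unique index part_hash using hash for 50000 rows
    on part(part_id) in parth0;
```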
The inmemory clause indicates that the index is to be maintained in the RDM Server computer's memory while the database
containing the table is open. The read, persistent, and volatile options control whether the index is read from disk when the
database is opened (read, persistent), and whether it is written to the disk when the database is closed (persistent). The
default inmemory option is volatile, which means that the index is always empty when the database is first opened. The read
option means that the entire index is read from the file when the database is opened; changes to the index are allowed but are not
written back to the file on closing. The persistent option means that changes made to the index while the database
was open are written back to the file when the database is closed. The maxpgs option is used to specify the maximum
number of database pages allowed for the index. (A database page is the basic unit of file input/output in RDM Server. A
page contains one or more keys in the index. The number of keys per page is computed based on the size of the indexed
columns and the page size defined for the database file in which the index is stored.)
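The inmemory options might be used as in the following sketch. It assumes that the option keyword follows inmemory and can itself be followed by maxpgs, as in the create table syntax. The index names and the page limit of 200 are hypothetical.

```sql
/* read: loaded from the file at open; changes stay in memory only
   and are not written back when the database is closed. */
create index outlet_region on outlet(region) inmemory read maxpgs = 200;

/* No option given, so volatile applies: the index is empty each time
   the database is first opened. */
create index outlet_city on outlet(city) inmemory;
```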
All unique and primary key columns (except those of the rowid data type) are indexed. If you do not specify a create index
for a unique or primary key, SQL will automatically create one for you. You only need to specify a create unique index for
unique or primary key table column(s) when 1) you want to use a hash index, 2) some of the columns in the btree index
need to be in desc order, 3) you need to use the in clause to specify the index file where the create file was used to specify a
page size other than the default page size, or 4) the index needs to be specified as an inmemory index.
In the following index example, the outlet table in our inventory database has two indexes. The loc_key index is an
inmemory index for the primary key and loc_geo is an optional index.
create table outlet
(
loc_id char(3) primary key,
city char(17) not null,
state char(2) not null,
region smallint not null
"regional U.S. sales area"
);
create unique index loc_key on outlet(loc_id) inmemory persistent;
create optional index loc_geo on outlet(state, city);
The create table for the sales_order table in the sales database example is shown below along with a multi-column create
index on the ord_date, amount, and ord_time columns.
create table sales_order
(
cust_id char(3) not null references customer,
ord_num smallint primary key,
ord_date date default today,
ord_time time default now,
amount double,
tax real default 0.0,
ship_date timestamp default null,
check(amount >= tax)
) in salesd0;
create index order_ndx on sales_order(ord_date, amount desc, ord_time) in salek1;
6.5 Create Join
Using a create join statement, you can declare predefined joins that RDM Server SQL will automatically maintain on each
insert, update, and delete statement issued by your application. Predefined joins are used to directly link all of the
rows from a referencing table with the referenced primary key row. Thus, queries that include a join (that is, an equijoin) between tables related through foreign and primary key relationships can directly access the related rows. This results in
optimal join performance. Like an index, a join is implicitly used by the RDM Server SQL optimizer in optimizing data
access. This means that no RDM Server SQL data manipulation statement refers directly to the predefined join.
A predefined join provides direct access from the primary key's row to all referencing foreign key rows, as well as from the foreign key rows to the referenced primary key row. Thus, bi-directional direct access is available without the necessity of an
index on the foreign key. This bi-directional access also provides efficient outer-join processing.
Suppose that the salesperson table illustrated in Figure 6-8 contains rows for newly hired salespersons who do not yet
have any customers. An "inner join" results in a virtual table that includes only the salespersons who have at least one customer (new hires are excluded). In this case, Figure 6-8 corresponds to an inner natural join of the salesperson and customer
tables. An "outer join" created for these tables results in a virtual table that includes all salespersons and their customers. New
hires appear in the table with empty (or null) customer column values, as illustrated in Figure 6-5.
Figure 6-5. Example of Outer Join Result
Access from the table row containing a foreign key to another table row containing the corresponding primary key entry is
always available through the primary key index. You can simply index the foreign key column to allow quick access to the
foreign key row from the primary key table. However, doing so can use a large amount of disk storage because many foreign
keys can be associated with a single primary key. If you create a join instead, without indexing the foreign key, RDM Server
uses direct access methods to form the table relationship. This strategy results in better performance and saves considerable
disk storage.
The foreign key columns used in a create join are virtual columns (that is, columns for which RDM Server does not store the
data values). The application can access a value in a virtual column, just as it does any column. For virtual columns, RDM
Server automatically extracts the data value from the primary key column of the referenced row through a pointer to that row
that is associated with the predefined join and maintained by RDM Server.
Since values in a foreign key column come from the corresponding primary key column of the referenced table, no redundant
data is required. However, if an index uses one of the foreign key columns or if you have specified the non_virtual attribute
in your create join, the foreign key column values will be stored. In this case, redundant data is maintained in the referencing
(foreign key) table.
When all foreign keys that reference a particular primary key are virtual, RDM Server allows the primary
key to be modified, even if there are still active foreign keys that reference it. This is the only case where
RDM Server allows a primary key column to be modified with references still active. Thus, changing the
primary key value will instantly change it in all the foreign key rows that reference it.
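For instance, given a salesperson table and a customer table whose sale_id foreign key participates in a join declared with the default virtual attribute, a primary key update such as the following would be permitted. This is an illustration only: the new key value "WHB" is hypothetical, and the sketch assumes that no non-virtual foreign key references salesperson.

```sql
/* Assumes every foreign key that references salesperson is virtual. */
update salesperson set sale_id = "WHB" where sale_id = "WHG";
commit;

/* The customer rows that referenced "WHG" now report the new value,
   because sale_id is read through the join from the salesperson row. */
select cust_id, sale_id from customer where sale_id = "WHB";
```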
Using a join in your schema guarantees that only a single logical disk access is necessary to retrieve a row in the referenced
table. Thus, performance is optimal for referential integrity checking, for select statement processing, and for locating all rows
of the tables with a particular foreign key value. In addition, the database can use either one-to-many or many-to-one data
retrieval.
As with indexes, you should take care in deciding what foreign keys to use in predefined joins. Since a join is implemented
by using database address links stored in each row, RDM Server must use resources to maintain the links during execution of
database modification calls. Therefore, you should only use a join for situations in which the application needs to access
tables by going from primary key to foreign key, as well as from foreign key to primary key. When the access direction will
only be from the foreign key to the table containing the primary key, simply using the primary key index usually achieves
acceptable performance.
The syntax for create join is shown in the following grammar specification:
create_join:
create join joinname order {first | last | next | sorted}
on foreign_key [and foreign_key]...
foreign_key:
[virtual | non_virtual] tabname [ ( colname [, colname]... ) ]
[by colname [asc | desc] [, colname [asc | desc]]... ]
The name of the join is specified by joinname which is a case-insensitive identifier. Even though the join is named, no other
SQL statement refers to a join by name as use of joins is handled automatically by RDM Server SQL.
A join is created on a foreign key declared in table tabname. If only one foreign key is declared in that table, then no colname list needs to be specified. If specified, the colname list must exactly match the colname list in a foreign key declared in table tabname or, if only
one column is specified, the colname column declaration in table tabname must itself have a references clause specified.
Columns of foreign keys on which the create join is specified are by default virtual—meaning that the column value is not
stored in the foreign key table but is always retrieved from its referenced, primary key table row. This reduces the amount of
data redundancy in the database. However, you can declare the join to be non_virtual indicating that the foreign key values
are to also be stored in the foreign key table.
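The two forms can be contrasted as follows, using the sale_id foreign key of a customer table that references salesperson. These are alternative declarations of the same join (the join name is only illustrative), so just one of them would appear in a given schema.

```sql
/* Default (virtual): sale_id values are not stored in customer rows. */
create join salesperson_customers order last on customer(sale_id);

/* Alternative: sale_id values are also stored in customer rows. */
create join salesperson_customers order last on non_virtual customer(sale_id);
```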
RDM Server implements a predefined join by maintaining all of the rows that have the same foreign key values (the referencing rows) in a linked list connected to the referenced primary key row (the referenced row). The order clause specifies
the order in which the referencing rows are maintained in this linked list as follows:
order first     Newly inserted foreign key rows are placed at the front of the list.
order last      Newly inserted foreign key rows are placed at the end of the list.
order next      Newly inserted foreign key rows are placed following the current position in the list.
order sorted    Newly inserted foreign key rows are placed in the order specified in the foreign_key clause.
When you define a join as order sorted, you need to specify the by clause with either asc or desc to describe the sort order
for each column as ascending or descending, respectively. Sort orders of mixed ascending and descending columns can be specified but are not yet supported. Currently, ordering of all sort columns is based on the ordering of the first sort field.
The performance of an insert or update operation involving a joined foreign key will degrade when a large
number of matching foreign key values exist. This is because the linked list implementation of predefined
joins must be scanned to locate the proper insertion place. The larger the list, the longer the time of the
scan.
The and operation allows multiple tables which contain foreign key declarations that reference the same primary key table to
share the same predefined join. This means that rows from each table that reference the same primary key row will be maintained in the join's linked list. If the join is order sorted, then the type and length of the sort columns in each of the and'd
tables must match exactly. Use of and reduces the amount of space allocated to each row of the primary key table needed
to maintain the predefined join lists. However, the cost of accessing related rows from one of the tables will be increased, as an
access cost is incurred for any intervening rows from the other table(s) that are in the linked list.
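A shared join might be declared as sketched below. The tables dept, employee, and contractor are hypothetical and are not part of the example databases.

```sql
create table dept
(
    dept_id char(3) primary key,
    dept_name char(30)
);
create table employee
(
    emp_id integer primary key,
    dept_id char(3) references dept
);
create table contractor
(
    contr_id integer primary key,
    dept_id char(3) references dept
);

/* One linked list per dept row holds both employee and contractor rows. */
create join dept_staff order last on employee(dept_id) and contractor(dept_id);
```

With this declaration, retrieving only the employees of a department also passes over any contractor rows that sit between them in the list.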
The following example shows a portion of the salesperson and customer tables containing their respective primary and foreign
key declarations.
create table salesperson (
sale_id char(3) primary key,
sale_name char(30),
...
);
create table customer (
cust_id char(3) primary key,
company varchar(30),
...
sale_id char(3) references salesperson
);
create join salesperson_customers order last on customer(sale_id);
The order last specification will place a given salesperson's newly inserted rows at the end of the list so that a subsequent
select statement that includes a join of the two tables will return the rows in the same order in which they were inserted.
insert into salesperson values "WHG", "Gates, Bill";
insert into customer values "IBM", "IBM Corporation", "WHG";
insert into customer values "DLL", "Dell, Inc.", "WHG";
insert into customer values "INT", "Intel Corporation", "WHG";
insert into customer values "UW", "University of Washington", "WHG";
commit;
select sale_name, cust_id, company from salesperson, customer
where salesperson.sale_id = customer.sale_id
and sale_id = "WHG";
sale_name      cust_id    company
Gates, Bill    IBM        IBM Corporation
Gates, Bill    DLL        Dell, Inc.
Gates, Bill    INT        Intel Corporation
Gates, Bill    UW         University of Washington
Now consider, on the other hand, that the create join was specified with order sorted as follows.
create join salesperson_customers order sorted on customer(sale_id) by company;
Then the rows from that same select statement would be returned in company name order as shown below.
select sale_name, cust_id, company from salesperson, customer
where salesperson.sale_id = customer.sale_id
and sale_id = "WHG";
sale_name      cust_id    company
Gates, Bill    DLL        Dell, Inc.
Gates, Bill    IBM        IBM Corporation
Gates, Bill    INT        Intel Corporation
Gates, Bill    UW         University of Washington
6.6 Compiling an SQL DDL Specification
There are several ways to compile an SQL DDL specification. Each DDL statement can be individually compiled and
executed (of course in the correct sequence) through whatever method you would typically choose to use (e.g., the rdsadmin utility's SQL Browser). Usually, however, the SQL DDL specification will be contained in a text file as is the case with
all of the RDM Server example database specifications.
The SQL DDL specification text can include C-style comments. Text from an opening "/*" through the closing "*/" is ignored
(such a comment can span multiple lines), as is text from a "//" to the end of the line.
Two command-line utilities are provided that you can use to process an SQL DDL specification file. The sddlp utility is
provided just for that purpose. You can also use the rsql utility's ".r" command to process the DDL file as a script file. If
you do that, be sure to subsequently submit a commit statement (assuming, of course, there were no errors in the DDL specification).
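As a small illustration of the comment styles described above, a DDL file processed by either utility might contain the following. The table and its columns are hypothetical.

```sql
/* Storage bins within a warehouse.
   This comment spans more than one line. */
create table storage_bin
(
    bin_id integer primary key,   // ignored from here to the end of the line
    loc_id char(3) not null
);
```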
The sddlp utility is executed from the command line according to the usage shown below.
sddlp [-?|-h] [-B] [-V] [-2] [-6] [-f] [-L server ; user ; password ] ddlfile
Option                     Description
-?                         Displays this usage information.
-B                         Do not display the start-up banner.
-V                         Display the version information.
-2                         Align records like version RDM Server 2.1.
-6                         Align BLOB files like version RDM Server 6.X.
-f                         Return database header file to client (for core API program use of SQL database).
-L server;user;password    Log in to the RDM Server named server with user name user and password password.
                           Each is separated by a semicolon (;) with no intervening spaces. If not specified,
                           sddlp will attempt to use the values specified in the RDSLOGIN environment variable
                           and, failing that, will issue command-line prompts for the information.
ddlfile                    The name of the text file containing the SQL DDL specification.
6.7 Modifying an SQL DDL Specification
RDM Server allows the schema for an existing (i.e., populated) database to be modified by adding new tables or indexes, dropping existing tables or indexes, or changing the definition of a table. Each of these types of DDL modification is described
in the following sections.
6.7.1 Adding Tables, Indexes, Joins to a Database
You can add a new table or index to a database simply by issuing a create table/index statement with the table/index name
qualified by the database name as indicated in the earlier syntax specifications reproduced below.
create_table:
create table [dbname.]tabname ["description"]
(column_defn [, column_defn]... [, table_constraint]...)
[in filename]
[inmemory [persistent | volatile | read] [maxpgs = maxpages]]
create_index:
create [unique | optional] index [dbname.]ndxname ["description"]
[using {btree | hash {(hashsize) | for hashsize rows}}]
on tabname ( colname [asc | desc] [, colname [asc | desc] ]... )
[in filename]
[inmemory [persistent | volatile | read] [maxpgs = maxpages] ]
The table/index being created will be added to the database named dbname. If dbname is not specified, then the table/index
is added to the most recently opened/accessed database.
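For example, an index could be added to the existing sales database as sketched here. The index name contact_ndx is hypothetical. The commit is included because the change does not take effect until a commit statement is issued.

```sql
/* The dbname qualifier names the existing database that receives the index. */
create index sales.contact_ndx on customer(contact);
commit;
```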
Note that the new table can contain a foreign key declaration that references an existing table. However, it is not possible to
add a create join on that foreign key. A create join can be added only when the join being defined is between two tables that
are also being added in the same transaction. The syntax for the create join statement is reproduced below.
create_join:
create join joinname order {first | last | next | sorted}
on foreign_key [and foreign_key]...
foreign_key:
[virtual | non_virtual] tabname [ ( colname [, colname]... ) ]
[by colname [asc | desc] [, colname [asc | desc]]... ]
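The restriction on adding joins can be illustrated with the following sketch, in which both tables and the join between them are added in one transaction. The territory and territory_note tables are hypothetical, and the use of an alter database transaction to group the statements is an assumption.

```sql
alter database sales;

create table territory
(
    terr_id char(3) primary key,
    terr_name char(30)
);
create table territory_note
(
    terr_id char(3) not null references territory,
    note_txt char(80)
);

/* Allowed because both joined tables are new in this transaction. */
create join territory_notes order last on territory_note(terr_id);

commit;
```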
6.7.2 Dropping Tables and Indexes from a Database
You can use the drop table statement to remove a table from a database as shown in the syntax below.
drop_table:
drop table [dbname.]tabname
The table will be dropped from database dbname. If dbname is not specified, then table tabname will be dropped from the
most recently opened/accessed database that contains a table named tabname.
Tables to which foreign key references exist in other tables cannot be dropped. Nor can tables be dropped that have foreign
keys on which a create join has been declared.
Indexes can be dropped from a table using the drop index statement as follows:
drop_index:
drop index [dbname.]ndxname
The index will be dropped from database dbname. If dbname is not specified, then index ndxname will be dropped from the
most recently opened/accessed database that contains an index named ndxname.
Pre-defined joins (defined by the create join statement) cannot be dropped from a database.
Any create table, create index, drop table or drop index statements that are issued do not take effect until the next commit
statement is issued.
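A short sketch of the drop statements against the sales example database follows. It is an illustration only: ship_log is chosen because no other table references it and it declares no foreign keys.

```sql
drop index sales.cust_order_key;
drop table sales.ship_log;

/* Neither drop takes effect until the commit is issued. */
commit;
```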
6.7.3 Altering Databases and Tables
If you will be making more than one change to a database schema, it is best to encapsulate the changes in an alter database
transaction. The syntax for the alter database statement is shown below.
alter_database:
alter {database | schema} dbname
The alter database statement is followed by a series of create file, create table, create index, drop table, drop index, or
alter table statements that describe the changes you wish to make to the database schema. All the changes will be processed
when a subsequent commit statement is submitted.
For example, the following alter database script will add an index on column contact in the customer table, drop the
cust_order_key index in sales_order, and add a new table called sales_office.
alter database sales;
create file salesd4;
create file salek3;
create index contact_key on customer(contact) in salek3;
drop index cust_order_key;
create table sales_office(
office_id char(3) primary key,
address char(30),
city char(20),
state char(2),
zip char(10),
phone char(12)
);
create unique index office_key on sales_office(office_id) in salek3;
commit;
The alter table statement is used to change the definition of an existing table. It can be used to add, drop, or modify column
definitions, rename the table or its columns, to modify the inmemory maxpgs value, drop a foreign key, or change the table
description string. The syntax for the alter table statement is as follows.
alter_table:
alter table [dbname.]tabname alter_table_action
alter_table_action:
add [column] column_defn [, column_defn]...
| alter [column] column_defn
| drop column colname
| inmemory maxpgs = maxpages
| rename table tabname ["description"]
| rename [column] oldname newname ["description"]
| drop foreign key ( colname [, colname]... )
| "description"
Execution of this statement will modify the definition of the table named tabname in database dbname. If dbname is not specified, then the table must be defined in the database identified in a previously submitted and active alter database statement.
The add column clause is used to add one or more columns to the table. A complete column_defn must be specified for each
one. The added columns will be added at the end of the table in the order specified in the list.
The alter column clause is used to change the definition of an existing column. The column_defn must be complete, that is, it must
still include all of the original column definition entities that are to be retained. If not null is added to the column definition,
a default must be given. If the type or length of the column changes, any index that uses this column and any foreign key references that include this column must have
been previously dropped. The check and compare clauses of the column_defn cannot be changed, added, or removed when
altering a column. Type conversion from double, float, real, numeric, decimal, date, time or timestamp into varchar, char,
wvarchar or wchar will use the default display format for the type as defined by the user (e.g., "set date display(14, "mmm. d,
yyyy")").
The drop column action removes the column from the table. The rename action can be used to change the table name or the
name of a column. The drop foreign key clause removes the foreign key table constraint with the specified column names.
Foreign keys on which a create join has been declared cannot be dropped.
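Several of these actions might be combined in one alter database transaction, as in the following sketch against the sales example database. The column names email and contact_name are hypothetical. The zip column is used for the length change because no index or foreign key in the example schema includes it.

```sql
alter database sales;

/* Add a new nullable column at the end of the table. */
alter table customer add column email varchar(40);

/* Change the length of an existing column; the definition must be complete. */
alter table customer alter column zip char(10);

/* Rename a column: old name first, then the new name. */
alter table customer rename column contact contact_name;

commit;
```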
6.7.4 Schema Versions
All of the DDL statements that have been submitted after the alter database statement which initiated the schema modification take effect upon execution of a commit statement. RDM Server assigns a new version number to the newly changed
database schema. Versioning allows DDL changes to take immediate effect without having to apply those changes to the
existing database data.
All SQL statements that access the database which are submitted after the schema has been changed must
conform to the new DDL specification.
Any C applications or stored procedures that reference changed or dropped DDL tables or columns must be
changed and recompiled.
Database files that contained only tables, indexes, or blob column data that have been dropped are deleted.
Columns that have been added to tables will have null values returned for the table's rows that existed prior to the DDL
changes being put into effect. If the newly added column was specified as not null then the column's default value will be
returned.
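The behavior for a not null column can be sketched as follows. The column acct_status and its default value are hypothetical, and the placement of the default clause before not null is an assumption modeled on the column declarations in the example schemas.

```sql
/* A not null column must be given a default when it is added. */
alter table sales.customer add column acct_status char(1) default "A" not null;
commit;

/* Rows that existed before the change return the default value "A". */
select cust_id, acct_status from customer;
```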
If any tables are dropped or columns are changed or dropped by an alter database transaction, an update
stats statement should be submitted on the database after the DDL changes have been committed.
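For the sales database, the statistics refresh might look like the line below. The exact form of the statement is an assumption here; see the section on SQL optimization statistics for its full syntax.

```sql
/* Assumed form: update stats on dbname. */
update stats on sales;
```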
6.8 Example SQL DDL Specifications
Several example databases are provided with RDM Server. The two example databases that are primarily used in this documentation to illustrate use of RDM Server SQL are for a hypothetical computer components sales company. (Since this
example has been in use in the RDM Server documentation since 1992 perhaps we should call it an antique computer component sales company.) Also provided is an example database for a hypothetical bookshop that only sells high-end, rare antiquarian books. The other example database contains actual data derived from over 130,000 National Science Foundation
(USA) research grants that were awarded during the years 1990 through 2003.
6.8.1 Sales and Inventory Databases
The example inventory database is defined in the following schema. This database consists of three tables. The product table
contains information about each product, including an identification code, description, pricing, and wholesale cost. The outlet
table identifies all company office and warehouse locations. The on_hand table is used to link the other two tables by defining the quantity of a specific product (through prod_id) located at a particular outlet (through loc_id). This specification is
available in the text file named "invntory.sql".
create database invntory on sqldev;
create table product
(
prod_id smallint primary key "product identification code",
prod_desc char(39) not null "product description",
price float
"retail price",
cost float
"wholesale cost"
);
create unique index prod_key on product(prod_id);
create index prod_pricing on product(price desc, prod_id);
create table outlet
(
loc_id char(3) primary key,
city char(17) not null,
state char(2) not null,
region smallint not null "regional U.S. sales area"
);
create unique index loc_key on outlet(loc_id);
create optional index loc_geo on outlet(state, city);
create table on_hand
(
loc_id char(3) not null references outlet(loc_id)
"warehouse containing this product",
prod_id smallint not null references product
"id of product at this warehouse",
quantity integer not null
"units of product on hand at this warehouse",
primary key(loc_id, prod_id)
);
create unique index on_hand_key on on_hand(loc_id, prod_id);
create join inventory order last on on_hand(loc_id);
create join distribution order last on on_hand(prod_id);
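As an illustration of how this schema might be queried, the following select lists the quantity of each product held at each outlet. It is a sketch written in ordinary SQL join form. The optimizer, not the query text, decides whether the inventory and distribution joins are used.

```sql
select city, state, prod_desc, quantity
    from outlet, on_hand, product
    where outlet.loc_id = on_hand.loc_id
      and on_hand.prod_id = product.prod_id;
```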
The example sales database definition given below is more complex than the inventory database. The salesperson table contains specific information about a salesperson, including sales ID code, name, commission rate, etc. The customer table contains standard data identifying each customer. The sale_id column in this table is a foreign key for the salesperson who
services the customer account. Note that sales orders made by a customer are identified through the cust_id foreign key in the
sales_order table. This DDL specification is contained in text file "sales.sql".
create database sales on sqldev;
create file salesd0;
create file salesd1;
create file salesd2;
create file salesd3;
create file salek0;
create file salek1 pagesize 2048;
create file salek2 pagesize 4096;
create table salesperson
(
sale_id char(3) primary key,
sale_name char(30) not null,
dob date "date of birth",
commission decimal(4,3) check(commission between 0.0 and 0.15)
"salesperson's commission rate",
region smallint check(region in (0,1,2,3))
"regional U.S. sales area",
office char(3) references invntory.outlet(loc_id)
"location where salesperson works",
mgr_id char(3) references salesperson
"salesperson id of sales mgr"
) in salesd0;
create unique index sale_key on salesperson(sale_id) in salek0;
create optional index sales_regions on salesperson(region, office) in salek1;
create optional index sales_comms
on salesperson(commission desc, sale_name) in salek2;
create join manages order last on salesperson(mgr_id);
create table customer
(
cust_id char(3) primary key,
company varchar(30) not null,
contact varchar(30),
street char(30),
city char(17),
state char(2),
zip char(5),
sale_id char(3) not null references salesperson
"salesperson who services customer account"
) in salesd0;
create unique index cust_key on customer(cust_id) in salek0;
create optional index cust_geo on customer(state, city) in salek2;
create join accounts order last on non_virtual customer;
create table sales_order
(
cust_id char(3) not null references customer
"customer who placed order",
ord_num smallint primary key
"order number",
ord_date date default today
"date order placed",
ord_time time default now
"time order placed",
amount double
"total base amount of order",
tax real default 0.0
"state/local taxes if appl.",
ship_date timestamp default null,
check(amount >= tax)
) in salesd0;
create unique index order_key on sales_order(ord_num) in salek0;
create index order_ndx on sales_order(ord_date, amount desc, ord_time) in salek1;
create index cust_order_key on sales_order(cust_id) in salek0;
create join purchases order last on sales_order;
create table item
(
ord_num smallint not null references sales_order,
prod_id smallint not null references invntory.product,
loc_id char(3) not null references invntory.outlet,
quantity integer not null
"number of units of product ordered",
check( HaveProduct(ord_num, prod_id, loc_id, quantity) = 1 )
) in salesd1;
create index item_ids on item(prod_id, quantity desc) in salek1;
create join line_items order last on item(ord_num);
create table ship_log
(
log_num integer default auto primary key,
ord_date timestamp default now
"date/time when order was entered",
ord_num smallint not null
"order number",
prod_id smallint not null
"product id number",
loc_id char(3) not null
"outlet location id",
quantity integer not null
"quantity of item to be shipped from loc_id",
backordered smallint default 0
"set to 1 when item is backordered",
check(OKayToShip(ord_num,prod_id,loc_id,quantity,backordered) = 1)
) in salesd0;
create index ship_order_key on ship_log(ord_num, prod_id, loc_id) in salek1;
create table note
(
note_id char(12) not null,
note_date date not null,
sale_id char(3) not null references salesperson,
cust_id char(3) references customer,
primary key(sale_id, note_id, note_date)
) in salesd2;
create unique index note_key on note(sale_id, note_id, note_date) in salek1;
create join tickler order sorted on note(sale_id) by note_date desc;
create join actions order sorted on note(cust_id) by note_date desc;
create table note_line
(
note_id char(12) not null,
note_date date not null,
sale_id char(3) not null,
txtln char(81) not null,
foreign key(sale_id, note_id, note_date) references note
) in salesd3;
create join comments order last on note_line;
For the sales database, the item table contains product and quantity data pertaining to the sales order identified through the
ord_num foreign key. Notes that can serve as a tickler for the salesperson or that indicate actions to be performed for the customer are stored in the note table. Each line of the note text is stored as a row of the note_line table. An additional table,
called ship_log, contains information about sales orders that have been booked but not yet shipped. Your application will create rows in this table through a trigger function, which is a special use of a user-defined function (UDF).
The schema diagram for the sales and inventory databases was given earlier in Figure 6-3 but is also shown below. Recall
that the boxes represent tables. Each arrow represents the foreign and primary key relationship between two tables: the arrow
starts at the primary key table (the "one" side of the one-to-many relationship) and ends at the foreign key
table (the "many" side of the one-to-many relationship). The arrow is labeled with the name of the foreign key column.
Figure 6-6. Sales and Inventory Databases Schema Diagram
6.8.2 Antiquarian Bookshop Database
Our fictional bookshop is located in Hertford, England (a very real and charming town north of London). It occupies a
building constructed around 1735 and has two rather smallish rooms on two floors with floor-to-ceiling bookshelves throughout. Upon entering, one is immediately transported to a much earlier era, quite overwhelmed by the wonderful sight and
odor of the ancient mahogany with which the entire interior is lined and of the rare and ancient books that reside on
the shelves. There is a little bell that announces one's entrance into the shop, but it is not really needed, as the delightfully squeaky
floorboards quite clearly make one's presence known.
In spite of the ancient setting and very old and rare books, this bookshop has a very modern Internet storefront through which
it sells and auctions off its expensive inventory. A computer system contains a database describing the inventory and manages
the sales and auction processes. The database schema for our bookshop is given below. It is contained in text file "bookshop.sql".
create database bookshop on booksdev;
create table author(
    last_name   char(13) primary key,
    full_name   char(35),
    gender      char(1),
    yr_born     smallint,
    yr_died     smallint,
    short_bio   varchar(250)
);
create table genres(
    text        char(31) primary key
);
create table subjects(
    text        char(51) primary key
);
create table book(
    bookid      char(14) primary key,
    last_name   char(13) references author,
    title       varchar(255),
    descr       char(61),
    publisher   char(136),
    publ_year   smallint,
    lc_class    char(33),
    date_acqd   date,
    date_sold   date,
    price       double,
    cost        double
);
create join authors_books order last on book(last_name);
create index year_ndx on book(publ_year);
create table related_name(
    bookid      char(14) references book,
    name        char(61)
);
create join book_names order last on related_name(bookid);
create table genres_books(
    bookid      char(14) references book,
    genre       char(31) references genres
);
create join genre_book_mm order last on genres_books(genre);
create join book_genre_mm order last on genres_books(bookid);
create table subjects_books(
    bookid      char(14) references book,
    subject     char(51) references subjects
);
create join subj_book_mm order last on subjects_books(subject);
create join book_subj_mm order last on subjects_books(bookid);
create table acctmgr(
    mgrid       char(7) primary key,
    name        char(24),
    hire_date   date,
    commission  double
);
create table patron(
    patid       char(3) primary key,
    name        char(30),
    street      char(30),
    city        char(17),
    state       char(2),
    country     char(2),
    pc          char(10),
    email       char(63),
    phone       char(15),
    mgrid       char(7)
);
create index patmgr on patron(mgrid);
create index phone_ndx on patron(phone);
create table note(
    noteid      integer primary key,
    bookid      char(14) references book,
    patid       char(3) references patron
);
create join book_notes order last on note(bookid);
create join patron_notes order last on note(patid);
create table note_line(
    noteid      integer references note,
    text        char(81)
);
create join note_text order last on note_line(noteid);
create table sale(
    bookid      char(14) references book,
    patid       char(3) references patron
);
create join book_sale order last on sale(bookid);
create join book_buyer order last on sale(patid);
create table auction(
    aucid       integer primary key,
    bookid      char(14) references book,
    mgrid       char(7) references acctmgr,
    start_date  date,
    end_date    date,
    reserve     double,
    curr_bid    double
);
create join book_auction order last on auction(bookid);
create join mgd_auctions order last on auction(mgrid);
create table bid(
    aucid       integer references auction,
    patid       char(3) references patron,
    offer       double,
    bid_ts      timestamp
);
create join auction_bids order last on bid(aucid);
create join patron_bids order last on bid(patid);
Descriptions for each of the above tables are given below.
Table 6-5. Bookshop Database Table Descriptions
Table Name      Description
author          Each row contains biographical information about a renowned author.
book            Contains information about each book in the bookshop inventory. The last_name column
                associates the book with its author. Books with a non-null date_sold are no longer available.
genres          Table of genre names (e.g., "Historical fiction") with which particular books are associated via
                the genres_books table.
subjects        Table of subject names (e.g., "Cape Cod") with which particular books are associated via the
                subjects_books table.
related_name    Related names are names of individuals associated with a particular book. The names are
                usually hand-written in the book's front matter or on separate pages that were included with the
                book (e.g., letters) and identify the book's provenance (owners). Only a few books have
                related names. However, their presence can significantly increase the value of the book.
genres_books    Used to create a many-to-many relationship between genres and books.
subjects_books  Used to create a many-to-many relationship between subjects and books.
note            Connects each note_line to its associated book. Notes include edition info and other
                comments (often coded) relating to the book's condition.
note_line       One row for each line of text in a particular note.
acctmgr         Account managers are the bookshop employees responsible for servicing the patrons and
                managing auctions.
patron          Bookshop customers and their contact info. Connected to their purchases/bids through their
                relationship with the sale and auction tables.
sale            Contains one row for each book that has been sold. Connects the book with the patron who
                acquired it through the bookid and patid columns.
auction         Some books are auctioned. Those that have been (or are currently being) auctioned have a row in
                this table that identifies the account manager who oversees the auction. The reserve column
                specifies the minimum acceptable bid; curr_bid contains the current amount bid.
bid             Each row provides the bid history for a particular auction.
A schema diagram depicting the intertable relationships is shown below.
Figure 6-7. Bookshop Database Schema Diagram
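The many-to-many link tables described in Table 6-5 can be tried out with a small sketch. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for RDM Server, keeps just three of the bookshop tables (with most columns omitted), and the sample rows and the helper names build_demo, books_in_genre, and genres_of_book are invented for this sketch.

```python
import sqlite3


def build_demo():
    """Create an in-memory stand-in for three of the bookshop tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table genres(text char(31) primary key);
        create table book(bookid char(14) primary key, title varchar(255));
        create table genres_books(
            bookid char(14) references book,
            genre  char(31) references genres
        );
        -- Hypothetical sample rows, invented for this sketch.
        insert into genres values ('Historical fiction'), ('Travel');
        insert into book values ('bk1', 'Book One'), ('bk2', 'Book Two');
        insert into genres_books values
            ('bk1', 'Historical fiction'), ('bk1', 'Travel'), ('bk2', 'Travel');
        """
    )
    return conn


def books_in_genre(conn, genre):
    """Follow the link table from a genre to its books."""
    rows = conn.execute(
        "select b.title from book b "
        "join genres_books gb on gb.bookid = b.bookid "
        "where gb.genre = ? order by b.title",
        (genre,),
    )
    return [title for (title,) in rows]


def genres_of_book(conn, bookid):
    """Follow the link table the other way, from a book to its genres."""
    rows = conn.execute(
        "select genre from genres_books where bookid = ? order by genre",
        (bookid,),
    )
    return [genre for (genre,) in rows]
```

Each row of genres_books pairs one book with one genre, so a book with two genres simply has two link rows, and the same table can be traversed in either direction.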
6.8.3 National Science Foundation Awards Database
The data used in this example has been extracted from the University of California Irvine Knowledge Discovery in Databases
Archive (http://kdd.ics.uci.edu/). The original source data can be found at http://kdd.ics.uci.edu/databases/nsfabs/nsfawards.html. The data was processed by a Raima-developed RDM SQL program that, in
addition to pulling out the data from each award document, converted all personal names to a "last name, first name, ..."
format and, where possible, identified each person’s gender from the first name. The complete DDL specification for the NSF
awards database is shown below. It is contained in text file "nsfawards.sql".
create database nsfawards on sqldev;
create table person(
    name        char(35) primary key,
    gender      char(1),
    jobclass    char(1)
);
create table sponsor(
    name        char(27) primary key,
    addr        char(35),
    city        char(24),
    state       char(2),
    zip         char(5)
);
create index geo_loc on sponsor(state, city);
create table nsforg(
    orgid       char(3) primary key,
    name        char(40)
);
create table nsfprog(
    progid      char(4) primary key,
    descr       char(40)
);
create table nsfapp(
    appid       char(10) primary key,
    descr       char(30)
);
create table award(
    awardno     integer primary key,
    title       char(182),
    award_date  date,
    instr       char(3),
    start_date  date,
    exp_date    date,
    amount      double,
    prgm_mgr    char(35) references person,
    sponsor_nm  char(27) references sponsor,
    orgid       char(3) references nsforg
);
create join manages order last on award(prgm_mgr);
create join sponsor_awards order last on award(sponsor_nm);
create join org_awards order last on award(orgid);
create index award_date_ndx on award(award_date);
create index exp_date_ndx on award(exp_date);
create index amount_ndx on award(amount);
create table investigator(
    awardno     integer references award,
    name        char(35) references person
);
create join award_invtgrs order last on investigator(awardno);
create join invtgr_awards order last on investigator(name);
create table field_apps(
    awardno     integer references award,
    appid       char(10) references nsfapp
);
create join award_apps order last on field_apps(awardno);
create join app_awards order last on field_apps(appid);
create table progrefs(
    awardno     integer references award,
    progid      char(4) references nsfprog
);
create join award_progs order last on progrefs(awardno);
create join prog_awards order last on progrefs(progid);
Descriptions for each of the tables declared in the nsfawards database are given in the following table.
Table 6-6. NSF Awards Database Table Descriptions
Table Name      Description
person          Contains one row for each investigator or NSF program manager. An investigator (jobclass =
                "I") is a person who is doing the research. The NSF program manager (jobclass = "P")
                oversees the research project on behalf of the NSF. An award can have more than one investigator
                but only one program manager. The gender column is derived from the first name but has
                three values: "M", "F", and "U" for "unknown" when the gender could not be determined
                from the first name (about 13%).
sponsor         Institution that is sponsoring the research. Usually where the principal investigator is
                employed. Each award has a single sponsor.
nsforg          NSF organization. The highest level NSF division or office under which the grant is awarded.
nsfprog         Specific NSF programs responsible for funding research grants.
nsfapp          NSF application areas that the research impacts.
award           Specific data about the research grant. The columns are fairly self-explanatory. For clarity, the
                exp_date column contains the award expiration date (i.e., when the money runs out). The
                amount column contains the total funding amount. The instr column is a code indicating the
                award instrument (e.g., "CTG" = "continuing", "STD" = "standard", etc.).
investigator    The specific investigators responsible for carrying out the research. This table is used to form
                a many-to-many relationship between the person and award tables.
field_apps      NSF application areas for which the research is intended. This table is used to form a
                many-to-many relationship between the nsfapp and award tables.
progrefs        Specific programs under which the research is funded. This table is used to form a
                many-to-many relationship between the nsfprog and award tables.
Note that the interpretations given in the above descriptions are Raima's and may not be completely accurate (e.g., it could be
that NSF programs are not actually responsible for funding research grants). However, our intent is to simply use this data for
the purpose of illustration. A schema diagram for the nsfawards database is shown below.
Figure 6-8. NSF Awards Database Schema Diagram
6.9 Database Instances
A database instance is a database that shares the schema (DDL specification) of another database. There can be any number of
database instances that share the same schema definition. One principal use for database instancing is a situation where mutually exclusive data exists that includes differing archiving requirements. For example, retrieving and deleting all of the information in a database that relates to a particular account or client record can be tedious to program and expensive to
process. However, if the data for each client or account is placed in a separate database instance, it is easy both to archive it (simply
copy the database files) and to delete it (simply reinitialize the database or delete it altogether).
Time oriented applications also can benefit from database instancing. Consider the example of a company that uses a separate
instance for each day of the current year. In this setup, each day's transactions can simply be stored in the instance for that
day.
Instancing is also useful in some replication applications. For example, assume a large corporation has a mainframe computer
that stores all accounts from all its branch offices. Each branch office performs a daily download of the new and modified
accounts into separate database instances, one for each account. For a modified account, the branch can simply reinitialize that account's database instance before receiving the new account information; for a new account, it can create a new instance.
Database instancing requires that the database definition be considered as distinct from the database itself, since there can be
more than one instance of a schema and each instance has a different name. The original instance has the same name as the
schema; subsequent instances have different names. Once a database instance has been created, it can be used in exactly the
same manner as any database.
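The archiving argument above can be illustrated with a rough analogy. The sketch below does not use RDM Server at all: it assumes Python's built-in sqlite3 module, treats each SQLite database file as one "instance" created from a shared schema string, and uses a hypothetical account_txn table and hypothetical client names. It shows only the idea that per-client instances can be archived by copying files and deleted by removing them.

```python
import os
import shutil
import sqlite3
import tempfile

# One schema definition shared by every instance (hypothetical table).
SCHEMA = "create table account_txn(txn_id integer primary key, amount double);"


def create_instance(path):
    """Create a new database file (an "instance") from the shared schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def archive_and_drop(path, archive_dir):
    """Archive an instance by copying its file, then delete the original."""
    archived = shutil.copy(path, archive_dir)
    os.remove(path)
    return archived


def demo():
    """Build one instance per client, then archive and drop one of them."""
    workdir = tempfile.mkdtemp()
    archive_dir = os.path.join(workdir, "archive")
    os.mkdir(archive_dir)
    paths = {}
    for client in ("client_a", "client_b"):  # hypothetical client names
        paths[client] = os.path.join(workdir, client + ".db")
        conn = create_instance(paths[client])
        conn.execute("insert into account_txn(amount) values (100.0)")
        conn.commit()
        conn.close()
    archived = archive_and_drop(paths["client_a"], archive_dir)
    return paths, archived
```

No row-by-row retrieval or deletion is needed for client_a; the other client's data is untouched because it lives in its own instance.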
6.9.1 Creating a Database Instance
When SQL processes an SQL DDL specification, the database name specified in the create database statement names both
the schema and the first instance of the schema, and automatically creates the first instance. Other instances can then be created and dropped.
A new instance of a database is created by a successful execution of the create database instance statement shown below.
create_database_instance:
create database instance newdb from sourcedb
[with data | [no]init[ialize]] [on devname]
New database instances are created from existing databases. The name of the new database is given by newdb, which must be
unique for all databases on the server. The existing database instance from which the new instance is created is sourcedb.
The database device name, devname, must be specified and must be a valid RDM Server database device. In addition, that
device cannot be the one that contains database sourcedb or any other instance of the same schema. All
database files will be stored on that device and, since the file names for all instances are identical, they must be stored in separate database devices. If specified, the with data option opens the source database for exclusive access and causes all database files and optimizer statistics from the source database to be copied into the new database. The init option (default) will
ensure that the database files for the instance are initialized. The noinit option can be specified to defer initialization to some
later time when an initialize database statement will be performed.
The create database instance statement can only be executed by administrators or the owner of the schema (that is, the user
who issued the original create database statement).
The initial instance of a database is created when a database definition is processed. The name of the instance is specified in
the create database statement. Other instances can then be created from the original database. All instances share the same
database definition information from the system catalog. However, database statistics used by the SQL query optimizer collected during execution of the update stats statement are maintained separately for each database instance.
6.9.2 Using Database Instances
Database instances are referenced just as you reference any database. You can explicitly open a database instance using the
open statement or implicitly open one through a qualified table name. For example, assume that wa_sales, ca_sales, and mi_sales
are each instances of the sales database, containing the sales for Washington, California, and Michigan, respectively. The
following example shows how these instances can be created and populated.
create database instance wa_sales from sales on wadev;
insert into wa_sales.customer from file "customrs.wa" on salesdev;
create database instance ca_sales from sales on cadev;
insert into ca_sales.customer from file "customrs.ca" on salesdev;
create database instance mi_sales from sales on midev;
insert into mi_sales.customer from file "customrs.mi" on salesdev;
update stats on wa_sales, ca_sales, mi_sales;
The next example returns the customers from the Michigan instance of sales.
open mi_sales;
select * from customer;
This same query could have been executed using a single statement as follows.
select * from mi_sales.customer;
You can have any number of instances of the same schema opened at a time. An unqualified reference to a table in the
schema will use the most recently opened instance by default. If you are not sure which instance is open, it is best to explicitly qualify the table name with the database name.
An unqualified reference to a table from a schema on which there is more than one instance will use the oldest instance (usually the original) when none have been opened.
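The advice to qualify table names explicitly can be tried out with a small sketch. This is an analogy only: it assumes Python's built-in sqlite3 module, where databases attached under different names play the role of instances, and the helper names open_instances and customers_in are invented here. SQLite's rules for resolving unqualified names differ from those described above, so the sketch always qualifies the table name.

```python
import sqlite3


def open_instances(names):
    """Attach one in-memory database per name, each with the same table."""
    conn = sqlite3.connect(":memory:")
    for name in names:
        # Each attached in-memory database stands in for one instance.
        conn.execute(f"attach database ':memory:' as {name}")
        conn.execute(
            f"create table {name}.customer("
            "cust_id char(3) primary key, state char(2))"
        )
    return conn


def customers_in(conn, instance):
    """Query one instance through a qualified table name."""
    rows = conn.execute(
        f"select cust_id from {instance}.customer order by cust_id"
    )
    return [cust_id for (cust_id,) in rows]


def demo():
    conn = open_instances(["wa_sales", "mi_sales"])
    conn.execute("insert into wa_sales.customer values ('SEA', 'WA')")
    conn.execute("insert into mi_sales.customer values ('DET', 'MI')")
    return conn
```

Because every reference names its instance, the result does not depend on which instance was opened (here, attached) most recently.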
6.9.3 Stored Procedures and Views
Views and stored procedure definitions are maintained based on the schema definition and are not dependent on a particular
database instance except when the database instance is explicitly referenced in the view or stored procedure declaration.
However, the execution plan generated for the view or stored procedure is based on the optimization statistics associated with
whatever database instance was open at the time the view or stored procedure was compiled. Thus, if a view or stored procedure will be used with more than one database instance, it is important that the instance used during compilation contain a
representative set of data on which an update stats has been run.
The example below creates a view called in_state_by_zip that will list the customers in a database instance in zip code order.
The mi_sales database was opened for the create view because it contained a large number of customers. Thus, the optimizer
would be sure to use the index on zip (assuming that in this example zip is indexed). The subsequent open on wa_sales followed by the select of in_state_by_zip will return the results from the wa_sales database.
-- Lots of customers in Michigan, should provide good stats
open mi_sales;
create view in_state_by_zip as
select * from customer order by zip;
open wa_sales;
select * from in_state_by_zip;
Note that for views referenced in a select statement qualified with an instance name, the instance name is used to identify the
schema to which the view is attached. It does not specify which instance to use with any unqualified table names in the view
definition itself. Thus, in the following example, the result set will contain Washington, not Michigan, customers.
open wa_sales;
select * from mi_sales.in_state_by_zip;
6.9.4 Drop Database Instance
The drop database statement can be used to delete database instances. The syntax is shown below.
drop_database:
drop database dbname
This statement can only be executed by administrators or the database owner. Also, database dbname must not be opened by
any other users. The system drops the database instance by removing its instance-specific information from the system catalog.
The database definition information associated with the schema is not deleted.
Dropping the original database after all other database instances based on it have been dropped will remove the database completely from the system, including the schema definition.
6.9.5 Restrictions
A database instance cannot be created for any database that contains explicitly declared foreign key references to a different
database. For example, the example sales database schema provided in RDM Server contains foreign references to the invntory
database. Any attempt to create an instance of either sales or invntory will return an error. This restriction exists because it is
impossible for RDM Server to reliably manage inter-database reference counts for multiple database instances. The reliability
of such operations would be based on the correctness of the application's use of those databases, thus violating the very
concept of DBMS-enforced referential integrity.
Inter-database relationships can still be maintained by the application program by using undeclared foreign keys. Shown
below is an excerpt from sales.sql with the declared foreign keys to the invntory database highlighted. By simply removing
the indicated reference clauses, it is possible to create multiple instances of both sales and invntory. Referential integrity will
not be enforced by SQL but the inter-database relationships can still exist with no effect on how joins between the databases
are processed.
create table salesperson
(
sale_id char(3) primary key,
...
office char(3) references invntory.outlet(loc_id),
mgr_id char(3) references salesperson
);
...
create table item
(
ord_num smallint not null references sales_order,
prod_id smallint not null references invntory.product,
loc_id char(3) not null references invntory.outlet,
...
);
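The trade-off between declared and undeclared foreign keys can be sketched as follows. The example assumes Python's built-in sqlite3 module rather than RDM Server, places both tables in a single database for simplicity (the restriction above concerns references between databases), and uses hypothetical helper names. It shows only the general point: without the references clause nothing stops an orphan value from being stored, yet a join on the same columns still works.

```python
import sqlite3


def build(declare_fk):
    """Create outlet and salesperson, with or without a declared foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces declared foreign keys only when this pragma is on.
    conn.execute("pragma foreign_keys = on")
    conn.execute("create table outlet(loc_id char(3) primary key, city char(17))")
    ref = " references outlet(loc_id)" if declare_fk else ""
    conn.execute(
        "create table salesperson("
        f"sale_id char(3) primary key, office char(3){ref})"
    )
    conn.execute("insert into outlet values ('SEA', 'Seattle')")
    return conn


def try_insert(conn, sale_id, office):
    """Return True if the row was accepted, False if integrity checks refused it."""
    try:
        conn.execute("insert into salesperson values (?, ?)", (sale_id, office))
        return True
    except sqlite3.IntegrityError:
        return False


def offices(conn):
    """Join the two tables on the (declared or undeclared) key columns."""
    return conn.execute(
        "select s.sale_id, o.city from salesperson s "
        "join outlet o on s.office = o.loc_id order by s.sale_id"
    ).fetchall()
```

With the undeclared key, keeping office values valid becomes the application's responsibility, which is exactly the cost noted above.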
7. Retrieving Data from a Database
The reason data is stored in a database is so that it can be later retrieved and looked at. However, in order to do something
intelligent with that data it must first intelligently be retrieved. This is often much easier to say than to do and that is particularly true with a language like SQL.
Data is retrieved from RDM Server databases using the SQL select statement. A completely specified select statement is commonly referred to as a query. The complete set of rows that are returned by a select statement is called the result set.
This chapter will explain how to properly formulate select statements to view data contained in one or more RDM Server databases. We will begin with the simplest and progress to more complex queries. The select statement syntax specification will
be incrementally developed throughout this chapter in order to show only the syntax that is relevant to the select statement
feature being explained.
7.1 Simple Queries
The most basic of queries is to retrieve all of the rows and columns of a table. The easiest way to do this is to use the following statement:
select:
select * from tabname
The "*" indicates that all of the columns declared in tabname are to be returned. Thus, the statement select * from acctmgr
would show all of the account managers in the acctmgr table in the bookshop database.
Similarly, the following statement retrieves all of the rows and columns of the salesperson table in the example sales database.
select * from salesperson;
SALE_ID SALE_NAME           DOB        COMMISSION REGION SALES_TOT OFFICE MGR_ID
BCK     Kennedy, Bob        1957-10-29      0.075      0 736345.32 DEN    BNF
BNF     Flores, Bob         1943-07-17      0.100      0 173102.02 SEA    *NULL*
BPS     Stouffer, Bill      1952-11-21      0.080      2   29053.3 SEA    *NULL*
CMB     Blades, Chris       1958-09-08      0.080      3         0 SEA    *NULL*
DLL     Lister, Dave        1999-08-30      0.075      3         0 ATL    *NULL*
ERW     Wyman, Eliska       1959-05-18      0.075      1 566817.01 NYC    GAP
GAP     Porter, Greg        1949-03-03      0.080      1  439346.5 SEA    *NULL*
GSN     Nash, Gail          1954-10-20      0.070      3 306807.26 DAL    CMB
JTK     Kirk, James         2100-08-30      0.075      3         0 ATL    *NULL*
SKM     McGuire, Sidney     1947-12-02      0.070      1 208432.11 WDC    GAP
SSW     Williams, Steve     1944-08-30      0.075      3 247179.99 ATL    CMB
SWR     Robinson, Stephanie 1968-10-11      0.070      0 374904.47 LAX    BNF
WAJ     Jones, Walter       1960-07-15      0.070      2 422560.55 CHI    BPS
WWW     Warren, Wayne       1953-04-29      0.075      2  212638.5 MIN    BPS
Of course, if you only need to see some but not all of the columns in a table, those columns can be individually listed as
indicated in the following syntax.
select:
select colname[, colname]… from tabname
Each specified colname must identify a column that is declared in tabname. The next example retrieves the salesperson name,
sales total, commission, and region code for each salesperson.
select sale_name, sales_tot, commission, region from salesperson;
SALE_NAME           SALES_TOT COMMISSION REGION
Kennedy, Bob        736345.32      0.075      0
Robinson, Stephanie 374904.47      0.070      0
Flores, Bob         173102.02      0.100      0
Wyman, Eliska       566817.01      0.075      1
Porter, Greg         439346.5      0.080      1
McGuire, Sidney     208432.11      0.070      1
Jones, Walter       422560.55      0.070      2
Warren, Wayne        212638.5      0.075      2
Stouffer, Bill        29053.3      0.080      2
Williams, Steve     247179.99      0.075      3
Kirk, James                 0      0.075      3
Lister, Dave                0      0.075      3
Nash, Gail          306807.26      0.070      3
Blades, Chris               0      0.080      3
7.2 Conditional Row Retrieval
If you need to retrieve only table rows that meet particular selection criteria, you can issue a select statement using the where
clause to specify a condition indicating just the rows you want. The where clause contains a conditional expression consisting of one or more relational expressions separated by operators as specified in the syntax given below.
select:
    select {* | colname[, colname]…} from tabname
        where cond_expr
cond_expr:
    rel_expr [bool_oper rel_expr]...
rel_expr:
        expression [not] rel_oper {expression | [{any | some} | all] (subquery)}
    |   expression [not] between constant and constant
    |   expression [not] in {(constant[, constant]...) | (subquery)}
    |   [tabname.]colname is [not] null
    |   string_expr [not] like "pattern"
    |   not rel_expr
    |   ( cond_expr )
    |   [not] exists (subquery)
    |   [tabname.]colname *= [tabname.]colname
    |   [tabname.]colname =* [tabname.]colname
expression:
    arith_expr | string_expr
arith_expr:
    arith_operand [arith_operator arith_operand]...
arith_operand:
    constant | [tabname.]colname | arith_function | ( arith_expr)
arith_operator:
    + | - | * | /
string_expr:
    string_operand [^ string_operand]
string_operand:
        "string" | [tabname.]colname
    |   if ( cond_expr, string_expr, string_expr)
    |   string_function
    |   user_defined_function
rel_oper:
        = | ==
    |   <
    |   >
    |   <=
    |   >=
    |   <> | != | /=
bool_oper:
        & | && | and
    |   "|" | "||" | or
For example, the following query chooses only customer accounts in the customer table (sales database) that are serviced by
Sidney McGuire (that is, accounts with sale_id equal to "SKM").
select sale_id, cust_id, company, city, state from customer
where sale_id = "SKM";
SALE_ID CUST_ID COMPANY                     CITY         STATE
SKM     PHI     Eagles Electronics Corp.    Philadelphia PA
SKM     PIT     Steelers National Bank      Pittsburgh   PA
SKM     WAS     Redskins Outdoor Supply Co. Arlington    VA
The next query example lists the sales_order rows for those orders that have not yet shipped (indicated by a null in the ship_date
column) and where the amount is more than $50,000.
select cust_id, ord_num, ord_date, amount from sales_order
where ship_date is null and amount > 50000.00;
CUST_ID ORD_NUM ORD_DATE      AMOUNT
BUF        2205 1997-01-03  150871.2
DEN        2207 1997-01-06    274375
GBP        2211 1997-01-10  53634.12
NOS        2218 1997-01-24     81375
DET        2219 1997-01-27   74034.9
HOU        2226 1997-01-30     54875
ATL        2230 1997-02-04     62340
LAA        2234 1997-02-10    124660
DEN        2237 1997-02-12  103874.8
KCC        2241 1997-02-21     82315
DET        2250 1997-03-06  82430.85
PHO        2253 1997-03-16    143375
CIN        2257 1997-03-23     62340
NYJ        2270 1997-04-02     54875
NEP        2281 1997-04-13   66341.5
SFF        2284 1997-04-20  74315.16
DET        2288 1997-04-24    252425
GBP        2292 1997-04-30   77247.5
NOS        2324 1997-07-30  104019.5
Note that "ship_date is null", and not a comparison such as "ship_date != null", is required in order for the query to return the
correct results. The SQL standard specifies that the result of a normal relational comparison with a null value is indeterminate
and that only those rows for which the where clause evaluates to true are returned by a select statement. Since "ship_date !=
null" is, according to standard SQL, indeterminate, no rows would be returned from that select statement.
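The null comparison rule described above is standard SQL behavior and can be observed in other engines as well. The sketch below is an illustration that assumes Python's built-in sqlite3 module rather than RDM Server; the three sample rows and the function name unshipped_counts are hypothetical. It counts the rows selected by each form of the test.

```python
import sqlite3


def unshipped_counts():
    """Count rows matched by 'is null' tests versus comparisons with null."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table sales_order(ord_num smallint primary key, ship_date date)"
    )
    # Hypothetical rows: two orders not yet shipped, one shipped.
    conn.executemany(
        "insert into sales_order values (?, ?)",
        [(1, None), (2, "1997-01-05"), (3, None)],
    )

    def count(where):
        sql = f"select count(*) from sales_order where {where}"
        return conn.execute(sql).fetchone()[0]

    return {
        "is null": count("ship_date is null"),
        "is not null": count("ship_date is not null"),
        "= null": count("ship_date = null"),
        "!= null": count("ship_date != null"),
    }
```

Both comparisons with null select no rows at all, even though the table contains rows with and without a ship date; only the is [not] null forms distinguish them.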
7.2.1 Retrieving Data from a Range
The between operator returns those rows where the left-hand expression evaluates to a value that lies between the two values on the right, with both endpoints included. In the following example, the between operator restricts the select result set to only those sales orders
made from January 1 to January 31, 1997, inclusive.
select cust_id, ord_num, ord_date from sales_order
where ord_date between date "1997-1-1" and date "1997-1-31";
CUST_ID ORD_NUM ORD_DATE
CHI        2201 1997-01-02
MIN        2202 1997-01-02
KCC        2203 1997-01-02
CIN        2204 1997-01-02
BUF        2205 1997-01-03
LAN        2206 1997-01-02
DEN        2207 1997-01-06
PHI        2208 1997-01-07
PHO        2209 1997-01-07
IND        2210 1997-01-09
GBP        2211 1997-01-10
ATL        2212 1997-01-15
NYG        2213 1997-01-16
LAA        2214 1997-01-16
SEA        2215 1997-01-17
KCC        2216 1997-01-21
SDC        2217 1997-01-24
NOS        2218 1997-01-24
DET        2219 1997-01-27
DEN        2220 1997-01-27
NEP        2221 1997-01-27
CLE        2222 1997-01-28
MIN        2223 1997-01-28
TBB        2224 1997-01-28
SEA        2225 1997-01-29
HOU        2226 1997-01-30
IND        2227 1997-01-31
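The inclusive behavior of between can be checked with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for RDM Server and loads three order rows taken from the result sets shown in this chapter; the function names are invented for the sketch. SQLite has no date type and compares these values as text, so the dates must be written in zero-padded YYYY-MM-DD form, which is a property of the stand-in and not of RDM Server.

```python
import sqlite3


def build_orders():
    """Load three sales_order rows copied from this chapter's result sets."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table sales_order(ord_num smallint primary key, ord_date date)"
    )
    conn.executemany(
        "insert into sales_order values (?, ?)",
        [(2201, "1997-01-02"), (2227, "1997-01-31"), (2230, "1997-02-04")],
    )
    return conn


def orders_between(conn, first, last):
    """Orders whose date lies in [first, last], using the between operator."""
    rows = conn.execute(
        "select ord_num from sales_order "
        "where ord_date between ? and ? order by ord_num",
        (first, last),
    )
    return [ord_num for (ord_num,) in rows]


def orders_in_range(conn, first, last):
    """The same range written with explicit >= and <= comparisons."""
    rows = conn.execute(
        "select ord_num from sales_order "
        "where ord_date >= ? and ord_date <= ? order by ord_num",
        (first, last),
    )
    return [ord_num for (ord_num,) in rows]
```

Using the order dates themselves as the bounds still returns both boundary rows, which is what "inclusive" means here.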
7.2.2 Retrieving Data from a List
You can use the in operator to choose only those rows that match one of the column values specified in the list. The example
shows a select statement that retrieves all customers located in Pacific Coast states from the customer table.
select cust_id, company, city, state from customer
where state in ("CA", "OR", "WA");
CUST_ID COMPANY                    CITY          STATE
SEA     Seahawks Data Services     Seattle       WA
SFF     Forty-Niners Venture Group San Francisco CA
LAA     Raiders Development Co.    Los Angeles   CA
LAN     Rams Data Processing, Inc. Los Angeles   CA
SDC     Chargers Credit Corp.      San Diego     CA
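When the list of values comes from a program rather than being typed into the statement, the in list is usually built with one parameter marker per value. The sketch below assumes Python's built-in sqlite3 module as a stand-in for RDM Server (whose own programming interfaces are not shown here), uses four customer rows taken from the result sets in this chapter, and invents the helper names.

```python
import sqlite3


def build_customers():
    """Load a few customer rows copied from this chapter's result sets."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table customer(cust_id char(3) primary key, state char(2))")
    conn.executemany(
        "insert into customer values (?, ?)",
        [("SEA", "WA"), ("SFF", "CA"), ("LAA", "CA"), ("DAL", "TX")],
    )
    return conn


def customers_in_states(conn, states):
    """Select customers whose state matches any value in the list."""
    # One "?" marker per list value; the values are passed separately.
    markers = ", ".join("?" for _ in states)
    sql = (
        "select cust_id from customer "
        f"where state in ({markers}) order by cust_id"
    )
    return [cust_id for (cust_id,) in conn.execute(sql, list(states))]
```

A list value that matches no row, such as "OR" here, is simply ignored.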
7.2.3 Retrieving Data by Wildcard Checking
The where clause can include a like operator to retrieve the rows where a character column's value matches the wildcard pattern
specified in the like string constant.
Two wildcard characters are defined in standard SQL.
Table 7-1. LIKE Operator Wildcard Character Descriptions
Wildcard Character  Description
% (percent)         Matches zero or more characters.
_ (underscore)      Matches any single character.
The next example includes a select statement that retrieves from the customer table all customers who have "Data" as part of
their company name.
select cust_id, company, city, state from customer
where company like "%Data%";
CUST_ID COMPANY                    CITY        STATE
SEA     Seahawks Data Services     Seattle     WA
DAL     Cowboys Data Services      Dallas      TX
TBB     Bucks Data Services        Tampa       FL
LAN     Rams Data Processing, Inc. Los Angeles CA
The application can change these match characters using the set wild statement.
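The two standard wildcard characters can be tried out side by side. The sketch below assumes Python's built-in sqlite3 module as a stand-in for RDM Server and uses customer rows taken from the result sets in this chapter; the function names are invented. One difference in the stand-in is worth stating: SQLite's like ignores the case of ASCII letters by default, which may not match RDM Server's behavior.

```python
import sqlite3


def build_customers():
    """Load a few customer rows copied from this chapter's result sets."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table customer(cust_id char(3) primary key, company char(30))"
    )
    conn.executemany(
        "insert into customer values (?, ?)",
        [
            ("SEA", "Seahawks Data Services"),
            ("DAL", "Cowboys Data Services"),
            ("LAN", "Rams Data Processing, Inc."),
            ("LAA", "Raiders Development Co."),
            ("PIT", "Steelers National Bank"),
        ],
    )
    return conn


def match(conn, column, pattern):
    """Return the cust_id of every row whose column matches the like pattern."""
    # The column name is interpolated, so it must come from trusted code.
    sql = f"select cust_id from customer where {column} like ? order by cust_id"
    return [cust_id for (cust_id,) in conn.execute(sql, (pattern,))]
```

Here "%Data%" matches any company containing "Data" anywhere in its name, while "LA_" matches only three-character identifiers that begin with "LA".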
7.2.4 Retrieving Rows by Rowid
RDM Server SQL provides a feature where rowid primary key columns can be declared in a table. The primary key value is
automatically assigned by the system to the row's location in the database file. This allows rows from that table to be
accessed directly through the primary key column. Even when no rowid primary key column has been declared in the table,
RDM Server SQL exposes the rowid of each row of the table through use of the rowid keyword. All a user needs to do is reference a column called "rowid" in the select statement as shown in the example queries below.
select rowid, sale_id, sale_name, region, office, mgr_id from salesperson;
ROWID SALE_ID SALE_NAME           REGION OFFICE MGR_ID
    6 BCK     Kennedy, Bob             0 DEN    BNF
    1 BNF     Flores, Bob              0 SEA    *NULL*
    3 BPS     Stouffer, Bill           2 SEA    *NULL*
    4 CMB     Blades, Chris            3 SEA    *NULL*
   14 DLL     Lister, Dave             3 ATL    *NULL*
    7 ERW     Wyman, Eliska            1 NYC    GAP
    2 GAP     Porter, Greg             1 SEA    *NULL*
   11 GSN     Nash, Gail               3 DAL    CMB
   13 JTK     Kirk, James              3 ATL    *NULL*
    8 SKM     McGuire, Sidney          1 WDC    GAP
   12 SSW     Williams, Steve          3 ATL    CMB
    5 SWR     Robinson, Stephanie      0 LAX    BNF
    9 WAJ     Jones, Walter            2 CHI    BPS
   10 WWW     Warren, Wayne            2 MIN    BPS
The rowid column should be qualified by a table name if there is more than one table listed in the from clause as shown
below.
select salesperson.rowid, sale_name, customer.rowid, company
from salesperson, customer where salesperson.sale_id = customer.sale_id;
salesperson.rowid  sale_name            customer.rowid  company
6                  Kennedy, Bob         17              Broncos Air Express
6                  Kennedy, Bob         34              Cardinals Bookmakers
1                  Flores, Bob          15              Seahawks Data Services
1                  Flores, Bob          31              Forty-niners Venture Group
3                  Stouffer, Bill       29              Colts Nuts & Bolts, Inc.
7                  Wyman, Eliska        23              Browns Kennels
7                  Wyman, Eliska        26              Jets Overnight Express
7                  Wyman, Eliska        27              Patriots Computer Corp.
7                  Wyman, Eliska        30              'Bills We Pay' Financial Corp.
7                  Wyman, Eliska        32              Giants Garments, Inc.
2                  Porter, Greg         39              Lions Motor Company
11                 Nash, Gail           19              Saints Software Support
11                 Nash, Gail           25              Oilers Gas and Light Co.
11                 Nash, Gail           33              Cowboys Data Services
8                  McGuire, Sidney      24              Steelers National Bank
8                  McGuire, Sidney      35              Redskins Outdoor Supply Co.
8                  McGuire, Sidney      42              Eagles Electronics Corp.
12                 Williams, Steve      28              Dolphins Diving School
12                 Williams, Steve      36              Falcons Microsystems, Inc.
12                 Williams, Steve      41              Bucs Data Services
5                  Robinson, Stephanie  16              Raiders Development Co.
5                  Robinson, Stephanie  18              Chargers Credit Corp.
5                  Robinson, Stephanie  20              Rams Data Processing, Inc.
9                  Jones, Walter        21              Chiefs Management Corporation
9                  Jones, Walter        22              Bengels Imports
9                  Jones, Walter        38              Bears Market Trends, Inc.
10                 Warren, Wayne        37              Vikings Athletic Equipment
10                 Warren, Wayne        40              Packers Van Lines
If more than one table is listed in the from clause and the rowid column is not qualified with a table name, the system will
return the rowid from the first listed table. As with standard column references, the qualifier name should be the correlation
name when a correlation name has been specified, as shown in the example below.
select s.rowid, s.sale_name, c.rowid, c.city, c.state
from salesperson s, customer c where s.sale_id = c.sale_id and s.region = 0;
S.ROWID  S.SALE_NAME          C.ROWID  C.CITY         C.STATE
6        Kennedy, Bob         17       Denver         CO
6        Kennedy, Bob         34       Phoenix        AZ
5        Robinson, Stephanie  16       Los Angeles    CA
5        Robinson, Stephanie  18       San Diego      CA
5        Robinson, Stephanie  20       Los Angeles    CA
1        Flores, Bob          15       Seattle        WA
1        Flores, Bob          31       San Francisco  CA
Direct access retrieval will occur for queries of the following form:
select … from … where [tabname.]rowid = constant
select s.rowid, sale_name, company, city, state
from salesperson s, customer c where s.sale_id = c.sale_id and s.rowid = 7;
S.ROWID  SALE_NAME      COMPANY                         CITY         STATE
7        Wyman, Eliska  Browns Kennels                  Cleveland    OH
7        Wyman, Eliska  Jets Overnight Express          New York     NY
7        Wyman, Eliska  Patriots Computer Corp.         Foxboro      MA
7        Wyman, Eliska  'Bills We Pay' Financial Corp.  Buffalo      NY
7        Wyman, Eliska  Giants Garments, Inc.           Jersey City  NJ
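The rowid queries in this section can be tried out with the short Python sketch below. It uses SQLite (through Python's standard sqlite3 module) as a stand-in, which is an assumption: SQLite also exposes a rowid pseudo-column, but its rowid is an integer key assigned on insert rather than the row's location in an RDM Server database file. The sample rows and the rowid values they receive are illustrative only.

```python
import sqlite3

# SQLite stand-in for the salesperson table (assumption, not RDM Server).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table salesperson (sale_id text primary key, sale_name text)"
)
conn.executemany(
    "insert into salesperson values (?, ?)",
    [
        ("BNF", "Flores, Bob"),
        ("GAP", "Porter, Greg"),
        ("BPS", "Stouffer, Bill"),
    ],
)

# Referencing "rowid" in the select list exposes each row's rowid.
all_rows = conn.execute(
    "select rowid, sale_id from salesperson order by rowid"
).fetchall()

# An equality test against a constant identifies a single row directly.
# The rowid is qualified with the correlation name s.
one_row = conn.execute(
    "select s.rowid, s.sale_name from salesperson s where s.rowid = ?",
    (2,),
).fetchall()

print(all_rows)
print(one_row)
```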
7.3 Retrieving Data from Multiple Tables
A join associates two tables through their common columns. Typically, but not always, the common columns will have the same
names. Join relationships can be explicitly defined between tables in the database definition through the specification of
primary and foreign key clauses. But even where explicit joins have not been defined in the schema, joins between tables
with common columns can still be specified in a select statement.
RDM Server supports two different methods for specifying joins. Old style join specifications are based on the 1989 ANSI
SQL standard in which all of the inter-table join relationships are specified in the select statement’s where clause. Extended
join specifications are based on the join enhancements originally introduced in the 1992 ANSI SQL standard in which the
join relationships are specified in the from clause.
7.3.1 Old Style Join Specifications
Inner Joins
It is often necessary for an application to retrieve data from several related tables using a join. To form a join, issue a select
statement that specifies each table name in the from clause. In the where clause, include an equality comparison of the associated columns (that is, the foreign and primary key columns) from the two tables. This comparison is called a join predicate.
To differentiate between join columns of the same name in the two tables, the select statement must prefix the table names to
the column names in the comparison. An inner join is one in which only those rows from the two tables with matching values
are returned. Join predicates are specified in the where clause as a relational expression according to the following syntax.
rel_expr:
        ...
    |   [tabname.]colname = [tabname.]colname
The example below retrieves and lists the customer accounts (customer table) for each salesperson (salesperson table).
select sale_name, company, city, state from salesperson, customer
where salesperson.sale_id = customer.sale_id;
SALE_NAME            COMPANY                         CITY           STATE
Kennedy, Bob         Broncos Air Express             Denver         CO
Kennedy, Bob         Cardinals Bookmakers            Phoenix        AZ
Flores, Bob          Seahawks Data Services          Seattle        WA
Flores, Bob          Forty-niners Venture Group      San Francisco  CA
Stouffer, Bill       Colts Nuts & Bolts, Inc.        Baltimore      IN
Wyman, Eliska        Browns Kennels                  Cleveland      OH
Wyman, Eliska        Jets Overnight Express          New York       NY
Wyman, Eliska        Patriots Computer Corp.         Foxboro        MA
Wyman, Eliska        'Bills We Pay' Financial Corp.  Buffalo        NY
Wyman, Eliska        Giants Garments, Inc.           Jersey City    NJ
Porter, Greg         Lions Motor Company             Detroit        MI
Nash, Gail           Saints Software Support         New Orleans    LA
Nash, Gail           Oilers Gas and Light Co.        Houston        TX
Nash, Gail           Cowboys Data Services           Dallas         TX
McGuire, Sidney      Steelers National Bank          Pittsburgh     PA
McGuire, Sidney      Redskins Outdoor Supply Co.     Arlington      VA
McGuire, Sidney      Eagles Electronics Corp.        Philadelphia   PA
Williams, Steve      Dolphins Diving School          Miami          FL
Williams, Steve      Falcons Microsystems, Inc.      Atlanta        GA
Williams, Steve      Bucs Data Services              Tampa          FL
Robinson, Stephanie  Raiders Development Co.         Los Angeles    CA
Robinson, Stephanie  Chargers Credit Corp.           San Diego      CA
Robinson, Stephanie  Rams Data Processing, Inc.      Los Angeles    CA
Jones, Walter        Chiefs Management Corporation   Kansas City    MO
Jones, Walter        Bengels Imports                 Cincinnati     OH
Jones, Walter        Bears Market Trends, Inc.       Chicago        IL
Warren, Wayne        Vikings Athletic Equipment      Minneapolis    MN
Warren, Wayne        Packers Van Lines               Green Bay      WI
Your application can join any number of tables using the select statement. The next example illustrates a three-table join from
the sales database that shows the January sales orders booked by Stephanie Robinson ("SWR").
select sale_name, cust_id, ord_date, ord_num, amount
from salesperson, customer, sales_order
where salesperson.sale_id = "SWR" and
salesperson.sale_id = customer.sale_id and
customer.cust_id = sales_order.cust_id and
ord_date between date "1997-1-1" and date "1997-1-31";
SALE_NAME            CUST_ID  ORD_DATE    ORD_NUM  AMOUNT
Robinson, Stephanie  LAN      1997-01-02  2206     15753.190000
Robinson, Stephanie  LAA      1997-01-16  2214     12614.340000
Robinson, Stephanie  SDC      1997-01-24  2217     705.980000
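The three-table join above can be approximated with the following Python sketch, which places every join predicate in the where clause. It runs against SQLite through Python's standard sqlite3 module, not RDM Server, so several details are assumptions: dates are stored as ISO text and compared as strings instead of using date constants, the tables carry only the columns the query needs, and order number 9001 is a made-up row added to show the date filter at work. SQLite also rejects an unqualified cust_id as ambiguous, so the select list qualifies it.

```python
import sqlite3

# Reduced copies of three sales tables in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table salesperson (sale_id text primary key, sale_name text);
create table customer (cust_id text primary key, sale_id text);
create table sales_order (ord_num integer primary key, cust_id text,
                          ord_date text);
insert into salesperson values
    ('SWR', 'Robinson, Stephanie'), ('BNF', 'Flores, Bob');
insert into customer values
    ('LAN', 'SWR'), ('LAA', 'SWR'), ('SDC', 'SWR'), ('SEA', 'BNF');
insert into sales_order values
    (2206, 'LAN', '1997-01-02'),
    (2214, 'LAA', '1997-01-16'),
    (2217, 'SDC', '1997-01-24'),
    (2215, 'SEA', '1997-01-17'),
    (9001, 'LAN', '1997-02-10');  -- hypothetical order outside January
""")

# Old style join: one join predicate per pair of related tables,
# all written in the where clause next to the row filters.
january_orders = conn.execute("""
    select sale_name, customer.cust_id, ord_date, ord_num
    from salesperson, customer, sales_order
    where salesperson.sale_id = 'SWR' and
          salesperson.sale_id = customer.sale_id and
          customer.cust_id = sales_order.cust_id and
          ord_date between '1997-01-01' and '1997-01-31'
    order by ord_num
""").fetchall()

for row in january_orders:
    print(row)
```

Leaving out either join predicate would pair every row of one table with every row of another, so each additional table in the from clause normally needs its own predicate.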
Outer Joins
An outer join between two tables includes those rows in one table that do not have any matching rows from the other table.
A left outer join includes the rows for which the column on the left side of the join predicate does not have matching right-side
column values. A right outer join does just the opposite. RDM Server SQL supports both left outer joins and right outer joins
as specified below.
rel_expr:
        ...
    |   [tabname.]colname *= [tabname.]colname
    |   [tabname.]colname =* [tabname.]colname
Table 7-2. Outer Join Relational Operators
Type of Join      Operator
left outer join   *=
right outer join  =*
The "outer" side column of an outer join predicate must be indexed or be a foreign key column on which a
create join has been declared in order for RDM Server SQL to be able to perform the outer join. Otherwise, a "No access path between outer joined tables" error will be returned by SQL.
The select statement in the following example uses a left outer join operator to retrieve the customers for each salesperson,
whether or not that salesperson has any customers. The result set in this case will contain rows for all salespersons and null
for the customer table columns for those salespersons who do not manage any customer accounts (e.g., salesperson managers).
select sale_name, company, city, state from salesperson, customer
where salesperson.sale_id *= customer.sale_id;
SALE_NAME            COMPANY                         CITY           STATE
Kennedy, Bob         Broncos Air Express             Denver         CO
Kennedy, Bob         Cardinals Bookmakers            Phoenix        AZ
Flores, Bob          Seahawks Data Services          Seattle        WA
Flores, Bob          Forty-niners Venture Group      San Francisco  CA
Stouffer, Bill       Colts Nuts & Bolts, Inc.        Baltimore      IN
Blades, Chris        *NULL*                          *NULL*         *NULL*
Lister, Dave         *NULL*                          *NULL*         *NULL*
Wyman, Eliska        Browns Kennels                  Cleveland      OH
Wyman, Eliska        Jets Overnight Express          New York       NY
Wyman, Eliska        Patriots Computer Corp.         Foxboro        MA
Wyman, Eliska        'Bills We Pay' Financial Corp.  Buffalo        NY
Wyman, Eliska        Giants Garments, Inc.           Jersey City    NJ
Porter, Greg         Lions Motor Company             Detroit        MI
Nash, Gail           Saints Software Support         New Orleans    LA
Nash, Gail           Oilers Gas and Light Co.        Houston        TX
Nash, Gail           Cowboys Data Services           Dallas         TX
Kirk, James          *NULL*                          *NULL*         *NULL*
McGuire, Sidney      Steelers National Bank          Pittsburgh     PA
McGuire, Sidney      Redskins Outdoor Supply Co.     Arlington      VA
McGuire, Sidney      Eagles Electronics Corp.        Philadelphia   PA
Williams, Steve      Dolphins Diving School          Miami          FL
Williams, Steve      Falcons Microsystems, Inc.      Atlanta        GA
Williams, Steve      Bucs Data Services              Tampa          FL
Robinson, Stephanie  Raiders Development Co.         Los Angeles    CA
Robinson, Stephanie  Chargers Credit Corp.           San Diego      CA
Robinson, Stephanie  Rams Data Processing, Inc.      Los Angeles    CA
Jones, Walter        Chiefs Management Corporation   Kansas City    MO
Jones, Walter        Bengels Imports                 Cincinnati     OH
Jones, Walter        Bears Market Trends, Inc.       Chicago        IL
Warren, Wayne        Vikings Athletic Equipment      Minneapolis    MN
Warren, Wayne        Packers Van Lines               Green Bay      WI
As you can see, the outer join result includes rows from the salesperson table that do not have any customers. This is exactly
what the outer join does. Compare with the results from the earlier query.
As long as the table names are unique, you need do nothing different to perform a join between tables in different databases.
The following example retrieves the descriptions of the specific products (product table in invntory database) ordered in the
Stephanie Robinson ("SWR") sales orders.
select cust_id, ord_num, prod_id, prod_desc
from customer, sales_order, item, product
where sale_id = "SWR" and ord_date between @"97-1-1" and @"97-1-31" and
customer.cust_id = sales_order.cust_id and
sales_order.ord_num = item.ord_num and
item.prod_id = product.prod_id;
CUST_ID  ORD_NUM  PROD_ID  PROD_DESC
LAA      2214     13016    RISC 16MB computer
LAA      2214     17419    19in SVGA monitor
LAA      2214     18060    60MB cartridge tape drive
LAA      2214     19100    flat-bed plotter
SDC      2217     22024    1200/2400 baud modem
SDC      2217     23401    track ball
LAN      2206     10450    486/50 computer
LAN      2206     15750    750 MB hard disk drive
LAN      2206     17214    14in VGA monitor
LAN      2206     18120    120MB cartridge tape drive
LAN      2206     18121    120MB tape cartridge
LAN      2206     23200    enhanced keyboard
LAN      2206     23400    mouse
If both databases have a table with the same name, the table names listed in the from clause will need to be qualified with the
database name as indicated by the syntax shown below.
from_clause:
from [dbname.]tabname [, [dbname.]tabname ]...
For example, assume that both the sales database and the invntory database contain a table named "product". In the from
clause of the select statement, the name of the product table is prefixed with the database name "invntory". However, note
that the prod_id column in the select column list is not qualified. RDM Server assumes that an unqualified duplicate column
name is from the first table in the from list that contains a column of that name. Since the prod_id column values from both
tables are the same, it doesn't really matter which column is returned by the select statement.
select cust_id, ord_num, prod_id, prod_desc
from customer, sales_order, item, invntory.product
where sale_id = "SWR" and ord_date between @"97-1-1" and @"97-1-31" and
customer.cust_id = sales_order.cust_id and
sales_order.ord_num = item.ord_num and
item.prod_id = product.prod_id;
CUST_ID  ORD_NUM  PROD_ID  PROD_DESC
LAA      2214     13016    RISC 16MB computer
LAA      2214     17419    19in SVGA monitor
LAA      2214     18060    60MB cartridge tape drive
LAA      2214     19100    flat-bed plotter
SDC      2217     22024    1200/2400 baud modem
SDC      2217     23401    track ball
LAN      2206     10450    486/50 computer
LAN      2206     15750    750 MB hard disk drive
LAN      2206     17214    14in VGA monitor
LAN      2206     18120    120MB cartridge tape drive
LAN      2206     18121    120MB tape cartridge
LAN      2206     23200    enhanced keyboard
LAN      2206     23400    mouse
Correlation Names
Sometimes an application must use the same select statement to reference two tables with the same name from separate databases. In that case, the from clause must include correlation names to distinguish between the two table references. Correlation names are aliased identifiers specified following the table name, as shown in the following from clause syntax.
from_clause:
from [dbname.]tabname [[as] corrname][, [dbname.]tabname [[as] corrname]]...
The correlation name, corrname, is an identifier defined as an alias for the table name that can be used to qualify column
names in that table that are referenced in the select statement.
Suppose that the product table in the invntory database is named item instead of product. Then the information in the
example above would be specified as follows.
select cust_id, ord_num, prod_id, prod_desc
from customer, sales_order, sales.item s_item, invntory.item i_item
where sale_id = "SWR" and
ord_date between @"97-1-1" and @"97-1-31" and
customer.cust_id = sales_order.cust_id and
sales_order.ord_num = s_item.ord_num and
s_item.prod_id = i_item.prod_id;
In this example, the correlation name for the sales database item table is s_item, and the correlation name for the invntory database item table is i_item.
Correlation names are required when processing a self-join. A self-join is a join of a table with itself. The mgr_id column in
the salesperson table is a foreign key to the salesperson table. A self-join can be used to list all salespersons along with their
managers as shown in the following example. Notice how correlation names are used to distinguish between the manager's
row and the salesperson's row.
select emp.sale_name, mgr.sale_name
from salesperson emp, salesperson mgr where emp.mgr_id = mgr.sale_id;
EMP.SALE_NAME        MGR.SALE_NAME
Kennedy, Bob         Flores, Bob
Warren, Wayne        Stouffer, Bill
Williams, Steve      Blades, Chris
Wyman, Eliska        Porter, Greg
Jones, Walter        Stouffer, Bill
McGuire, Sidney      Porter, Greg
Nash, Gail           Blades, Chris
Robinson, Stephanie  Flores, Bob
Column Aliases
The columns specified in a select result column list can be assigned aliases as specified below.
select:
select select_item [, select_item]... from_clause
[where cond_expr]
select_item:
[tabname | corrname.]colname [ identifier | "headingstring" ]
The identifier or "headingstring" will be displayed in the result set heading instead of the column name. The last example is
shown below but using the column aliases "employee" and "manager".
select emp.sale_name employee, mgr.sale_name manager
from salesperson emp, salesperson mgr where emp.mgr_id = mgr.sale_id;
EMPLOYEE             MANAGER
Kennedy, Bob         Flores, Bob
Warren, Wayne        Stouffer, Bill
Williams, Steve      Blades, Chris
Wyman, Eliska        Porter, Greg
Jones, Walter        Stouffer, Bill
McGuire, Sidney      Porter, Greg
Nash, Gail           Blades, Chris
Robinson, Stephanie  Flores, Bob
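The self-join with correlation names and column aliases can be sketched in Python as follows. The example runs against SQLite through the standard sqlite3 module rather than RDM Server, which is an assumption, and the salesperson table is cut down to three columns and five sample rows. SQLite reports the aliases as the result column names in cursor.description, which plays the role of the result set heading here.

```python
import sqlite3

# Reduced salesperson table; mgr_id refers back to sale_id in the same table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table salesperson (sale_id text primary key, sale_name text,
                          mgr_id text);
insert into salesperson values
    ('BNF', 'Flores, Bob', null),
    ('BPS', 'Stouffer, Bill', null),
    ('BCK', 'Kennedy, Bob', 'BNF'),
    ('SWR', 'Robinson, Stephanie', 'BNF'),
    ('WAJ', 'Jones, Walter', 'BPS');
""")

# emp and mgr are correlation names for two references to the same table;
# employee and manager are column aliases for the two result columns.
cursor = conn.execute("""
    select emp.sale_name employee, mgr.sale_name manager
    from salesperson emp, salesperson mgr
    where emp.mgr_id = mgr.sale_id
    order by employee
""")
headings = [column[0] for column in cursor.description]
pairs = cursor.fetchall()

print(headings)
for pair in pairs:
    print(pair)
```

Salespersons with a null mgr_id do not appear, because the inner join predicate never matches for them.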
7.3.2 Extended Join Specifications
The 1992 ANSI SQL standard introduced a new method by which joins between tables can be specified. This new method
separates the information needed to form the joins from the where clause and places it in the from clause of the select statement. In addition, the 1992 standard enhanced join handling to allow the specification of left and right outer joins. The
1989 standard only allowed for inner joins; the "*=" and "=*" outer join operators described in the last section are non-standard.
The enhanced syntax for the select statement from clause that incorporates join specifications is given below.
select:
select {* | select_item [, select_item]...}
from table_ref [, table_ref]...
table_ref:
table_primary | table_join
table_primary:
table_name_spec | ( table_join )
table_name_spec:
[dbname.]tabname [[as] corrname]
table_join:
natural_join | qualified_join | cross_join
natural_join:
table_ref natural [inner | {left | right} [outer]] join table_primary
qualified_join:
table_ref [inner | {left | right} [outer]] join table_primary
{using (colname[, colname]... ) | on cond_expr}
cross_join:
table_ref cross join table_primary
The natural join specification indicates that the join is to be performed based on the common columns (names and types)
from the two tables. RDM Server will perform the join based on the columns from the table (or tables) specified on the left
side of "natural … join" with those columns from the table (or tables) on the right side that have the same name. The example
below gives a natural inner join between the salesperson and customer tables.
select sale_name, company from salesperson natural inner join customer;
SALE_NAME            COMPANY
Kennedy, Bob         Broncos Air Express
Kennedy, Bob         Cardinals Bookmakers
Flores, Bob          Seahawks Data Services
Flores, Bob          Forty-niners Venture Group
Stouffer, Bill       Colts Nuts & Bolts, Inc.
Wyman, Eliska        Browns Kennels
Wyman, Eliska        Jets Overnight Express
Wyman, Eliska        Patriots Computer Corp.
Wyman, Eliska        'Bills We Pay' Financial Corp.
Wyman, Eliska        Giants Garments, Inc.
Porter, Greg         Lions Motor Company
Nash, Gail           Saints Software Support
Nash, Gail           Oilers Gas and Light Co.
Nash, Gail           Cowboys Data Services
McGuire, Sidney      Steelers National Bank
McGuire, Sidney      Redskins Outdoor Supply Co.
McGuire, Sidney      Eagles Electronics Corp.
Williams, Steve      Dolphins Diving School
Williams, Steve      Falcons Microsystems, Inc.
Williams, Steve      Bucs Data Services
Robinson, Stephanie  Raiders Development Co.
Robinson, Stephanie  Chargers Credit Corp.
Robinson, Stephanie  Rams Data Processing, Inc.
Jones, Walter        Chiefs Management Corporation
Jones, Walter        Bengels Imports
Jones, Walter        Bears Market Trends, Inc.
Warren, Wayne        Vikings Athletic Equipment
Warren, Wayne        Packers Van Lines
The common column between the two tables is sale_id so the above natural inner join example is equivalent to the following old style join:
select sale_name, company from salesperson, customer
where salesperson.sale_id = customer.sale_id;
A natural left (right) outer join includes the results of the inner join plus those rows of the left (right) table that do not have
a corresponding matching row in the joined table. This is illustrated below where the last example is changed from a natural
inner join to a natural left outer join.
select sale_name, company from salesperson natural left outer join customer;
SALE_NAME            COMPANY
Kennedy, Bob         Broncos Air Express
Kennedy, Bob         Cardinals Bookmakers
Flores, Bob          Seahawks Data Services
Flores, Bob          Forty-niners Venture Group
Stouffer, Bill       Colts Nuts & Bolts, Inc.
Blades, Chris        *NULL*
Lister, Dave         *NULL*
Wyman, Eliska        Browns Kennels
Wyman, Eliska        Jets Overnight Express
Wyman, Eliska        Patriots Computer Corp.
Wyman, Eliska        'Bills We Pay' Financial Corp.
Wyman, Eliska        Giants Garments, Inc.
Porter, Greg         Lions Motor Company
Nash, Gail           Saints Software Support
Nash, Gail           Oilers Gas and Light Co.
Nash, Gail           Cowboys Data Services
Kirk, James          *NULL*
McGuire, Sidney      Steelers National Bank
McGuire, Sidney      Redskins Outdoor Supply Co.
McGuire, Sidney      Eagles Electronics Corp.
Williams, Steve      Dolphins Diving School
Williams, Steve      Falcons Microsystems, Inc.
Williams, Steve      Bucs Data Services
Robinson, Stephanie  Raiders Development Co.
Robinson, Stephanie  Chargers Credit Corp.
Robinson, Stephanie  Rams Data Processing, Inc.
Jones, Walter        Chiefs Management Corporation
Jones, Walter        Bengels Imports
Jones, Walter        Bears Market Trends, Inc.
Warren, Wayne        Vikings Athletic Equipment
Warren, Wayne        Packers Van Lines
This statement is equivalent to the old style outer join:
select sale_name, company from salesperson, customer
where salesperson.sale_id *= customer.sale_id;
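The relationship between natural inner joins, natural left outer joins, and explicit join predicates can be checked with the Python sketch below. It uses SQLite through the standard sqlite3 module as a stand-in for RDM Server, which is an assumption. SQLite does not accept the non-standard "*=" operator, so the comparison is made against an explicit left outer join with an on clause instead. The tables are reduced to a few sample rows, and the helper name run is hypothetical.

```python
import sqlite3

# Two tables whose only common column name is sale_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table salesperson (sale_id text primary key, sale_name text);
create table customer (company text, sale_id text);
insert into salesperson values
    ('BCK', 'Kennedy, Bob'), ('BNF', 'Flores, Bob'), ('CMB', 'Blades, Chris');
insert into customer values
    ('Broncos Air Express', 'BCK'),
    ('Cardinals Bookmakers', 'BCK'),
    ('Seahawks Data Services', 'BNF');
""")


def run(sql):
    """Execute a join query and return its rows in a stable order."""
    return conn.execute(sql + " order by sale_name, company").fetchall()


natural_inner = run(
    "select sale_name, company from salesperson natural inner join customer"
)
natural_left = run(
    "select sale_name, company "
    "from salesperson natural left outer join customer"
)
explicit_left = run(
    "select sale_name, company from salesperson left outer join customer "
    "on salesperson.sale_id = customer.sale_id"
)

print(natural_inner)
print(natural_left)  # includes ('Blades, Chris', None): no customers
```

The salesperson without customers shows up only in the outer join results, with None (SQL null) in place of the company.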
An inner join is the default so that the specification of "natural join" produces a natural inner join. For outer joins, "outer"
does not need to be specified. The following example requests a natural inner join between salesperson and customer and a
natural left outer join between customer and sales_order.
select sale_name, company, ord_num, ord_date, amount
from salesperson natural join customer natural left join sales_order;
SALE_NAME      COMPANY                     ORD_NUM  ORD_DATE    AMOUNT
Kennedy, Bob   Broncos Air Express         2207     1997-01-06  274375
Kennedy, Bob   Broncos Air Express         2220     1997-01-27  49980
Kennedy, Bob   Broncos Air Express         2237     1997-02-12  103874.8
Kennedy, Bob   Broncos Air Express         2264     1997-04-01  21950
Kennedy, Bob   Broncos Air Express         2282     1997-04-14  21950
Kennedy, Bob   Broncos Air Express         2304     1997-05-26  19995
Kennedy, Bob   Broncos Air Express         2321     1997-06-24  6827.96
Kennedy, Bob   Cardinals Bookmakers        2209     1997-01-07  3715.83
Kennedy, Bob   Cardinals Bookmakers        2253     1997-03-16  143375
Kennedy, Bob   Cardinals Bookmakers        2269     1997-04-02  35119.46
Kennedy, Bob   Cardinals Bookmakers        2301     1997-05-14  16227.27
Kennedy, Bob   Cardinals Bookmakers        2313     1997-06-12  38955
Flores, Bob    Seahawks Data Services      2215     1997-01-17  16892
Flores, Bob    Seahawks Data Services      2225     1997-01-29  2987.5
Flores, Bob    Seahawks Data Services      2229     1997-02-04  8824.56
...
Jones, Walter  Bears Market Trends, Inc.   2249     1997-03-04  28570
Jones, Walter  Bears Market Trends, Inc.   2271     1997-04-03  49584.65
Jones, Walter  Bears Market Trends, Inc.   2295     1997-05-06  31580
Warren, Wayne  Vikings Athletic Equipment  2202     1997-01-02  25915.86
Warren, Wayne  Vikings Athletic Equipment  2223     1997-01-28  408
Warren, Wayne  Vikings Athletic Equipment  2248     1997-02-28  3073.54
Warren, Wayne  Vikings Athletic Equipment  2266     1997-04-01  5190.42
Warren, Wayne  Vikings Athletic Equipment  2296     1997-05-07  2790.99
Warren, Wayne  Vikings Athletic Equipment  2315     1997-06-17  12082.39
Warren, Wayne  Packers Van Lines           2211     1997-01-10  53634.12
Warren, Wayne  Packers Van Lines           2235     1997-02-11  8192.38
Warren, Wayne  Packers Van Lines           2292     1997-04-30  77247.5
Warren, Wayne  Packers Van Lines           2327     1997-06-30  24103.3
Natural joins will form the join based on equal values from all columns in the joined tables that have the same name. In the
examples above, there is only one common column name between salesperson and customer, sale_id, and one between customer and sales_order, cust_id. The customer table in the sales database shares two common columns with the outlet table in
the invntory database: city and state. In the next example, a natural join between the customer and outlet tables will produce
those customers that are located in the same city and state where there is a distribution outlet.
select company, city, state from customer natural join outlet;
COMPANY                         CITY           STATE
Seahawks Data Services          Seattle        WA
Raiders Development Co.         Los Angeles    CA
Rams Data Processing, Inc.      Los Angeles    CA
Chargers Credit Corp.           San Diego      CA
Forty-niners Venture Group      San Francisco  CA
Cowboys Data Services           Dallas         TX
Oilers Gas and Light Co.        Houston        TX
Patriots Computer Corp.         Foxboro        MA
Bears Market Trends, Inc.       Chicago        IL
Chiefs Management Corporation   Kansas City    MO
Chiefs Management Corporation   Kansas City    MO
'Bills We Pay' Financial Corp.  Buffalo        NY
Jets Overnight Express          New York       NY
Falcons Microsystems, Inc.      Atlanta        GA
Vikings Athletic Equipment      Minneapolis    MN
Broncos Air Express             Denver         CO
A qualified join is like a natural join except that it requires that the columns on which the join is to be formed be explicitly
specified. Two specification methods are provided. The using clause requires you to name the common columns
between the joined tables that are to be used to form the join. With the on clause, you specify the join predicates as conditional expressions exactly as they would be specified in the where clause under the old style joins. The on clause is necessary whenever the join is to be performed between columns that do not have the same name.
The using clause allows you to choose only the matching columns on which you want the join formed. So, for example, to
list those customers located in the same state, but not necessarily the same city, as a distribution outlet you would use the following statement:
select company, city, state from customer inner join outlet using(state);
COMPANY                         CITY           STATE
Seahawks Data Services          Seattle        WA
Raiders Development Co.         Los Angeles    CA
Rams Data Processing, Inc.      Los Angeles    CA
Chargers Credit Corp.           San Diego      CA
Forty-niners Venture Group      San Francisco  CA
Cowboys Data Services           Dallas         TX
Oilers Gas and Light Co.        Houston        TX
Patriots Computer Corp.         Foxboro        MA
Bears Market Trends, Inc.       Chicago        IL
Chiefs Management Corporation   Kansas City    MO
Chiefs Management Corporation   Kansas City    MO
'Bills We Pay' Financial Corp.  Buffalo        NY
Jets Overnight Express          New York       NY
Falcons Microsystems, Inc.      Atlanta        GA
Vikings Athletic Equipment      Minneapolis    MN
Broncos Air Express             Denver         CO
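To see how a natural join on every common column differs from a using clause that names only some of them, consider the Python sketch below. It runs on SQLite through the standard sqlite3 module, not RDM Server, and rests on several assumptions: the outlet table is reduced to city and state, the customer and outlet rows are a small sample chosen for illustration, and the city column is qualified in the second query because SQLite treats an unqualified city as ambiguous once it is no longer a join column.

```python
import sqlite3

# customer and outlet share two column names: city and state.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table customer (company text, city text, state text);
create table outlet (city text, state text);
insert into customer values
    ('Seahawks Data Services', 'Seattle', 'WA'),
    ('Chargers Credit Corp.', 'San Diego', 'CA'),
    ('Rams Data Processing, Inc.', 'Los Angeles', 'CA'),
    ('Broncos Air Express', 'Denver', 'CO'),
    ('Lions Motor Company', 'Detroit', 'MI');
insert into outlet values
    ('Seattle', 'WA'), ('Los Angeles', 'CA'), ('Denver', 'CO');
""")

# Natural join: rows must agree on both city and state.
same_city = conn.execute("""
    select company, city, state
    from customer natural join outlet
    order by company
""").fetchall()

# using(state): rows only need to agree on state.
same_state = conn.execute("""
    select company, customer.city, state
    from customer inner join outlet using (state)
    order by company
""").fetchall()

print([row[0] for row in same_city])
print([row[0] for row in same_state])
```

With this sample data, the San Diego customer appears only in the second result, since California has an outlet but San Diego does not.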
It is usually a good database design principle for the columns on which different tables could be joined to have the same
names. Doing so will greatly simplify the select statement join specifications. However, there are situations in which this is
just not possible and a join is needed in which the columns on which the join is to be made cannot have the same name. One
such situation occurs in a self-referencing join, a join that is performed on the same table. For example, the salesperson table’s
primary key is sale_id but salesperson also contains a column named mgr_id that is a foreign key reference to the row in the
salesperson table associated with that salesperson’s manager. The following example gives a select statement that lists all managers along with those salespersons that they manage. Note that correlation names must be specified for the two salesperson
references in the from clause in order to differentiate the manager rows from the employee rows.
select mgr.sale_name, emp.sale_name
from salesperson mgr join salesperson emp
on mgr.sale_id = emp.mgr_id;
MGR.SALE_NAME   EMP.SALE_NAME
Flores, Bob     Robinson, Stephanie
Flores, Bob     Kennedy, Bob
Stouffer, Bill  Jones, Walter
Stouffer, Bill  Warren, Wayne
Blades, Chris   Nash, Gail
Blades, Chris   Williams, Steve
Porter, Greg    Wyman, Eliska
Porter, Greg    McGuire, Sidney
Parentheses are sometimes needed to group joins when more than two tables are involved in the from clause. They
are required when one table needs to be joined with two or more tables. For example, the statement below produces a list of
product orders for those customers who are located in cities where a distribution outlet is also located. A natural join
between the customer table and both the sales_order table (based on the cust_id column) and the outlet table (based on the
city and state columns) will accomplish this.
select company, city, prod_id, quantity
from customer natural join
(sales_order natural join item natural join outlet);
COMPANY                 CITY     PROD_ID  QUANTITY
Seahawks Data Services  Seattle  16311    20
Seahawks Data Services  Seattle  18061    200
Seahawks Data Services  Seattle  18121    500
Seahawks Data Services  Seattle  16511    250
...
Broncos Air Express     Denver   15340    2
Broncos Air Express     Denver   17214    2
Broncos Air Express     Denver   20303    2
Broncos Air Express     Denver   23200    2
Broncos Air Express     Denver   23400    2
By grouping the natural joins between sales_order, item, and outlet together with parentheses, the group is treated like a
single table to which a natural join with customer is then formed. The common columns between customer and sales_order
(cust_id), item (none), and outlet (city and state) become the basis on which the natural join is performed.
There can be no duplicate common column names between the table (or tables) on the left side of a join and
the table (or tables) on the right side of the join.
A cross join is simply a cross product of the two tables where each row of the left table is joined with each row of the right
table so that the cardinality of the result (i.e., the number of result rows) is equal to the product of the cardinalities of the two
tables. An on clause cannot be specified with a cross join. However, there is nothing that restricts including join conditions
in the where clause. In practice, there are very few times when a cross join is needed and since it can be a very expensive
operation that can potentially produce huge result sets, its use should be avoided.
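The cardinality rule for cross joins is easy to confirm with a small Python sketch. The example uses SQLite through the standard sqlite3 module instead of RDM Server, which is an assumption, and the two tables are cut down to one column each with a handful of sample rows.

```python
import sqlite3

# Three salespersons and two outlets: the cross join has 3 * 2 rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table salesperson (sale_id text);
create table outlet (city text);
insert into salesperson values ('BCK'), ('BNF'), ('CMB');
insert into outlet values ('Seattle'), ('Denver');
""")

left_count = conn.execute("select count(*) from salesperson").fetchone()[0]
right_count = conn.execute("select count(*) from outlet").fetchone()[0]
cross_count = conn.execute(
    "select count(*) from salesperson cross join outlet"
).fetchone()[0]

# No on clause is given, but a where clause can still restrict the result.
denver_rows = conn.execute("""
    select sale_id, city
    from salesperson cross join outlet
    where city = 'Denver'
    order by sale_id
""").fetchall()

print(left_count, right_count, cross_count)
print(denver_rows)
```

Because the result size is the product of the two table sizes, the row count grows quickly as either table grows, which is why cross joins on large tables are expensive.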
7.4 Sorting the Rows of the Result Set
You can sort the result set produced by the select statement by using an order by clause that conforms to the following syntax.
select:
select [first | all | distinct] {* | select_item [, select_item]...}
from table_ref [, table_ref]...
[where cond_expr]
[order by {number | colname} [asc | desc] [, {number | colname} [asc | desc]]...]
The order by clause identifies the result set columns which are to be sorted and whether each column's values are to be sorted in
ascending or descending order. Each sort column is identified either by its ordinal position in the select result
column list, beginning with 1, or by the name (or alias) of the column.
For example, the statement shown below sorts the salesperson table in alphabetical order by salesperson name (sale_name
column).
select * from salesperson order by sale_name;
SALE_ID  SALE_NAME            DOB         COMMISSION  REGION  OFFICE  MGR_ID
CMB      Blades, Chris        1958-09-08  0.080       3       SEA     *NULL*
BNF      Flores, Bob          1943-07-17  0.100       0       SEA     *NULL*
WAJ      Jones, Walter        1960-06-15  0.070       2       CHI     BPS
BCK      Kennedy, Bob         1956-10-29  0.075       0       DEN     BNF
JTK      Kirk, James          2100-08-30  0.075       3       ATL     *NULL*
DLL      Lister, Dave         1999-08-30  0.075       3       ATL     *NULL*
SKM      McGuire, Sidney      1947-12-02  0.070       1       WDC     GAP
GSN      Nash, Gail           1954-10-20  0.070       3       DAL     CMB
GAP      Porter, Greg         1949-03-03  0.080       1       SEA     *NULL*
SWR      Robinson, Stephanie  1968-10-11  0.070       0       LAX     BNF
BPS      Stouffer, Bill       1952-11-21  0.080       2       SEA     *NULL*
WWW      Warren, Wayne        1953-04-29  0.075       2       MIN     BPS
SSW      Williams, Steve      1944-08-30  0.075       3       ATL     CMB
ERW      Wyman, Eliska        1959-05-18  0.075       1       NYC     GAP
As noted above, you can specify columns listed in the order by clause by name or by number. The following lists the salesperson names and birth dates in birth date order.
select sale_name, dob from salesperson order by 2;
SALE_NAME            DOB
Flores, Bob          1943-07-17
Porter, Greg         1949-03-03
Stouffer, Bill       1952-11-21
Kennedy, Bob         1956-10-29
Blades, Chris        1958-09-08
Robinson, Stephanie  1968-10-11
You can use the order by clause to sort on more than one column. Additionally, the clause can be used to specify whether
each column is in ascending (the default) or descending order. In the following example, column 1 is the primary sort column.
select commission, sale_name from salesperson order by 1 desc, 2 asc;
COMMISSION  SALE_NAME
0.100       Flores, Bob
0.080       Blades, Chris
0.080       Porter, Greg
0.080       Stouffer, Bill
0.075       Robinson, Stephanie
0.070       Kennedy, Bob
The query below returns the sale total for the sales orders entered on or after 6-1-1997 where the "amount+tax" result column
is assigned alias sale_tot which is referenced in the order by clause.
select ord_num, amount+tax sale_tot from sales_order
where ord_date >= date "1997-06-01"
order by sale_tot desc;
ORD_NUM  SALE_TOT
2324     104019.5
2310     51283.9700292969
2317     49778.7600683594
2313     38955
2323     35582.5
2308     32675.6899902344
2311     32589.6000976563
2319     31602.1500976562
2320     27782
2318     27239.1000976563
2322     25231.98
2327     24103.3
2326     22887.96
2325     21532.0899902344
2314     20780
2309     17388.6600341797
2316     16986.99
2312     16598.0000048828
2315     12940.2399755859
2321     7251.28998657227
2307     4487.76
As you can see, the select statement result columns can be computational as described in the next section.
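A minimal sketch of sorting on an aliased expression, again using Python's sqlite3 module as a stand-in rather than RDM Server. The three rows are hypothetical, dates are stored as ISO text (so the comparison is a plain string comparison), and the date constant is written in SQLite's single-quoted form instead of the date "1997-06-01" form used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales_order (ord_num integer, ord_date text, amount real, tax real)"
)
conn.executemany(
    "insert into sales_order values (?, ?, ?, ?)",
    [
        (1, "1997-05-30", 500.0, 50.0),  # before the cutoff date, filtered out
        (2, "1997-06-02", 100.0, 8.0),
        (3, "1997-06-05", 300.0, 24.0),
    ],
)

# sale_tot names the computed column; order by refers to it by that alias.
rows = conn.execute(
    "select ord_num, amount + tax as sale_tot from sales_order "
    "where ord_date >= '1997-06-01' "
    "order by sale_tot desc"
).fetchall()

print(rows)
```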
7.5 Retrieving Computational Results
Besides retrieving the values of individual columns, a select statement allows you to specify expressions that can perform
arithmetic operations on the columns in a table. The normal arithmetic operators (+, -, *, /) along with a wide range of built-in
functions can be included in a select column expression. The complete syntax for column expressions is given below.
select:
        select [first | all | distinct] {* | select_item [, select_item]...}
            from table_ref [, table_ref]...
            [where cond_expr]
            [order by col_ref [asc | desc] [, col_ref [asc | desc]]...]
select_item:
        {{tabname | corrname}.* | expression} [identifier | "headingstring"]
expression:
        arith_expr | string_expr
arith_expr:
        arith_operand [arith_operator arith_operand]...
arith_operand:
        constant | [tabname.]colname | arith_function | ( arith_expr )
arith_operator:
        + | - | * | /
arith_function:
        numeric_function | datetime_function | system_function
    |   user_defined_function
string_expr:
        string_operand [^ string_operand]
string_operand:
        "string" | [tabname.]colname
    |   if ( cond_expr, string_expr, string_expr )
    |   string_function
    |   user_defined_function
numeric_function:
        See Table 7-3.
datetime_function:
        See Table 7-4.
string_function:
        See Table 7-5.
system_function:
        See Table 7-6.
7.5.1 Simple Expressions
The query example below shows the salespersons' orders with the largest earned commissions. The select statement computes
the commission earned by multiplying the commission rate by the amount of the order. It accesses the name of a salesperson
by using a three-table join and sorts the result in descending order by earned commission.
select sale_name, ord_num, amount*commission
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id
order by 3 desc;
SALE_NAME            ORD_NUM  AMOUNT*COMMISSION
Kennedy, Bob         2207          20578.125000
Porter, Greg         2288          20194.000000
Wyman, Eliska        2205          11315.340000
Kennedy, Bob         2253          10753.125000
Robinson, Stephanie  2234           8726.200000
Kennedy, Bob         2237           7790.610000
Flores, Bob          2284           7431.516000
Nash, Gail           2324           7281.365000
Porter, Greg         2250           6594.468000
Porter, Greg         2219           5922.792000
Warren, Wayne        2292           5793.562500
Jones, Walter        2241           5762.050000
Note that, because column 3 is an expression rather than a simple column name, the order by clause must refer to it by
column number. You could also use a column alias, as shown in the equivalent query below.
select sale_name, ord_num, amount*commission earned
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id
order by earned desc;
SALE_NAME            ORD_NUM  EARNED
Kennedy, Bob         2207     20578.125000
Porter, Greg         2288     20194.000000
Wyman, Eliska        2205     11315.340000
...
In the next example, the select statement retrieves the amount that the company receives from each of the orders shown in the
example above. The amount to the company is simply the order amount minus the commission.
select sale_name, ord_num, amount-amount*commission "NET REVENUE"
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id
order by 3 desc;
SALE_NAME            ORD_NUM  NET REVENUE
Porter, Greg         2288     232231.000451
Kennedy, Bob         2207     255168.749918
Kennedy, Bob         2253     133338.749957
Robinson, Stephanie  2234     115933.799963
Kennedy, Bob         2237      96603.563969
Porter, Greg         2250      75836.382147
Porter, Greg         2219      68112.108132
Arithmetic operators that are specified in an expression are evaluated based on the precedence given in the following table.
Table 7-2. Precedence of Arithmetic Operators
Priority  Operator  Use
Highest   ()        Parenthetical expressions
High      +         Unary plus
High      -         Unary minus
Medium    *         Multiplication
Medium    /         Division
Lowest    +         Addition
Lowest    -         Subtraction
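The precedence rules in Table 7-2 can be checked with a few constant expressions. The sketch below runs them through Python's sqlite3 module as a stand-in engine (an assumption; it is not RDM Server), using integer constants so the results are exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Multiplication and division bind tighter than addition and subtraction;
# parentheses override both, and unary minus applies before multiplication.
row = conn.execute(
    "select 2 + 3 * 4, (2 + 3) * 4, -2 * 3 + 10, 20 - 8 / 4"
).fetchone()

print(row)
```

Without the parentheses, 2 + 3 * 4 is evaluated as 2 + 12; with them, the addition is performed first.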
7.5.2 Built-in (Scalar) Functions
RDM Server SQL provides many built-in functions that can be used in select statement expressions. Four classes of built-in
functions are provided as noted in the select statement syntax shown above: numeric, datetime, system, and string functions.
These functions are called scalar functions because for a given set of argument values they each return a single value. Functions described in the next section are called aggregate functions because they perform computations over a set (group) of
rows.
The built-in numeric functions provided in RDM Server SQL are described in the following table.
Table 7-3. Built-in Numeric Functions
Function                         Description
abs(arith_expr)                  Returns the absolute value of an expression.
acos(arith_expr)                 Returns the arccosine of an expression.
asin(arith_expr)                 Returns the arcsine of an expression.
atan(arith_expr)                 Returns the arctangent of an expression.
atan2(arith_expr)                Returns the arctangent of an x-y coordinate pair.
{ceil | ceiling}(arith_expr)     Finds the upper bound for an expression.
cos(arith_expr)                  Returns the cosine of an angle.
cot(arith_expr)                  Returns the cotangent of an angle.
exp(arith_expr)                  Returns the value of an exponential function.
floor(arith_expr)                Finds the lower bound for an expression.
{ln | log}(arith_expr)           Returns the natural logarithm of an expression.
mod(arith_expr1, arith_expr2)    Returns the remainder of arith_expr1/arith_expr2.
pi()                             Returns the value of pi.
rand(num)                        Returns next random floating-point number. Non-zero num is seed.
sign(arith_expr)                 Returns the sign of an expression (-1, 0, +1).
sin(arith_expr)                  Returns the sine of an angle.
sqrt(arith_expr)                 Returns the square root of an expression.
tan(arith_expr)                  Returns the tangent of an angle.
The example below calls the floor function to truncate the cents portion from the amount column in the sales_order table for
all orders made by Seahawks Data Services.
select ord_num, ord_date, floor(amount) from sales_order
where cust_id = "SEA";
ORD_NUM  ORD_DATE    FLOOR(AMOUNT)
2215     1997-01-17          16892
2225     1997-01-29           2987
2229     1997-02-04           8824
2258     1997-03-23           1365
2273     1997-04-03            650
2311     1997-06-05          30036
Table 7-4. Built-in Date/Time Functions
Function                      Description
age(dt_expr)                  Returns the age (in full years).
{curdate | current_date}()    Returns the current date.
{curtime | current_time}()    Returns the current time.
current_timestamp()           Returns the current date and time.
dayofmonth(dt_expr)           Returns the day of the month.
dayofweek(dt_expr)            Returns the day of the week.
dayofyear(dt_expr)            Returns the day of the year.
hour(dt_expr)                 Returns the hour.
minute(dt_expr)               Returns the minute.
month(dt_expr)                Returns the month.
now()                         Returns the current date and time.
quarter(dt_expr)              Returns the quarter.
second(dt_expr)               Returns the second.
week(dt_expr)                 Returns the week.
year(dt_expr)                 Returns the year.
The next query returns the age for each salesperson on April 19, 2012. As you can see, it is an experienced sales staff except
for Dave Lister who is only 12 and James T. Kirk who will not be born for another 89 years!
select sale_name, dob, curdate(), age(dob) from salesperson;
SALE_NAME            DOB         CURDATE()   AGE(DOB)
Flores, Bob          1943-07-17  2012-04-19        68
Blades, Chris        1958-09-08  2012-04-19        53
Porter, Greg         1949-03-03  2012-04-19        63
Stouffer, Bill       1952-11-21  2012-04-19        59
Kennedy, Bob         1956-10-29  2012-04-19        55
Kirk, James          2100-08-30  2012-04-19       -89
Lister, Dave         1999-08-30  2012-04-19        12
Warren, Wayne        1953-04-29  2012-04-19        58
Williams, Steve      1944-08-30  2012-04-19        67
Wyman, Eliska        1959-05-18  2012-04-19        52
Jones, Walter        1960-06-15  2012-04-19        51
McGuire, Sidney      1947-12-02  2012-04-19        64
Nash, Gail           1954-10-20  2012-04-19        57
Robinson, Stephanie  1968-10-11  2012-04-19        43
Table 7-5. Built-in String Functions
Function                                             Description
ascii(string_expr)                                   Returns the numeric ASCII value of a character
char(num)                                            Returns the ASCII character with numeric value num
concat(string_expr1, string_expr2)                   Concatenates two strings
insert(string_expr1, num1, num2, string_expr2)       Replace num2 chars from string_expr2 in string_expr1 beginning at
                                                     position num1 (1st position is 1 not 0)
lcase(string_expr)                                   Converts a string to lowercase
left(string_expr, num)                               Returns the leftmost num characters from the string
length(string_expr)                                  Returns the length of the string
locate(string_expr1, string_expr2, num)              Locate string_expr1 from position num in string_expr2
ltrim(string_expr)                                   Removes all leading spaces from string
repeat(string_expr, num)                             Repeats string num times
replace(string_expr1, string_expr2, string_expr3)    Replace string_expr2 with string_expr3 in string_expr1
right(string_expr, num)                              Returns the rightmost num characters from string
rtrim(string_expr)                                   Removes all trailing spaces from string
substring(string_expr1, num1, num2)                  Returns num2 characters from string_expr beginning at position num1.
ucase(string_expr)                                   Convert string to uppercase
unicode(string_expr)                                 Returns the numeric Unicode value of a character
wchar(num)                                           Returns a Unicode character with numeric value num.
The next query displays the customer company names and their lengths with the longest listed first.
select company, length(company) from customer order by 2 desc;
COMPANY                         LENGTH(COMPANY)
'Bills We Pay' Financial Corp.               30
Chiefs Management Corporation                29
Redskins Outdoor Supply Co.                  27
Falcons Microsystems, Inc.                   26
Forty-niners Venture Group                   26
Rams Data Processing, Inc.                   26
...
Broncos Air Express                          19
Lions Motor Company                          19
Bucs Data Services                           18
Packers Van Lines                            17
Bengels Imports                              15
Browns Kennels                               14
The built-in system functions provided in RDM Server SQL are described in the table below.
Table 7-6. Built-in System Functions
Function                                              Description
convert(expression, type)                             Converts expression result to specified data type.
convert(expression, {char | wchar}, width, format)    Converts expression to a string of no more than width characters
                                                      according to the specified format.
database()                                            Returns string containing a comma-separated list of the currently
                                                      opened databases.
if(cond_expr, expression1, expression2)               Returns result of expression1 if cond_expr is true, otherwise
                                                      expression2.
ifnull(expression1, expression2)                      Returns result of expression1 if not null, otherwise expression2.
user()                                                Returns user name as a string.
One of the features of the RDM Server select statement is that you can use it as a simple calculator by not specifying a from
(or any other) clause. For example, the user and database functions return values that do not derive from any particular database so the following select simply returns their current values.
select user(), database();
USER()  DATABASE()
admin   invntory,sales
Use of the if and convert functions is described in detail in the next two sections.
7.5.3 Conditional Column Selection
The conditional if function allows you to select an expression result based on a specified condition applied to each result row
for a select statement. The if function syntax is as follows.
if(cond_expr, expression, expression)
For each row in which the conditional expression, cond_expr, evaluates to true, the function returns the result of evaluating the
first expression. If the condition is false, the second expression is evaluated and its result is returned.
The following example uses the if function to identify which customers are located "In-state" or "Out-of-state" where the state
is the beautiful state of Washington located in the great Pacific Northwest of the USA!
select company, if(state = "WA", "In-state", "Out-of-state") location from customer;
COMPANY                         LOCATION
Cardinals Bookmakers            Out-of-state
Raiders Development Co.         Out-of-state
Rams Data Processing, Inc.      Out-of-state
Chargers Credit Corp.           Out-of-state
Forty-niners Venture Group      Out-of-state
Broncos Air Express             Out-of-state
Dolphins Diving School          Out-of-state
Bucs Data Services              Out-of-state
Falcons Microsystems, Inc.      Out-of-state
Bears Market Trends, Inc.       Out-of-state
Colts Nuts & Bolts, Inc.        Out-of-state
Saints Software Support         Out-of-state
Patriots Computer Corp.         Out-of-state
Lions Motor Company             Out-of-state
Vikings Athletic Equipment      Out-of-state
Chiefs Management Corporation   Out-of-state
Giants Garments, Inc.           Out-of-state
'Bills We Pay' Financial Corp.  Out-of-state
Jets Overnight Express          Out-of-state
Bengels Imports                 Out-of-state
Browns Kennels                  Out-of-state
Eagles Electronics Corp.        Out-of-state
Steelers National Bank          Out-of-state
Cowboys Data Services           Out-of-state
Oilers Gas and Light Co.        Out-of-state
Redskins Outdoor Supply Co.     Out-of-state
Seahawks Data Services          In-state
Packers Van Lines               Out-of-state
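For readers who want to try the same per-row selection on another engine, the sketch below uses Python's sqlite3 module as a stand-in. It is an illustration under assumptions, not RDM Server code: the standard SQL case expression plays the role of the if function, string constants use single quotes, and the two customer rows (including the state value for the second company) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table customer (company text, state text)")
conn.executemany(
    "insert into customer values (?, ?)",
    [
        ("Seahawks Data Services", "WA"),
        ("Broncos Air Express", "CO"),  # state value assumed for illustration
    ],
)

# The condition is evaluated once per result row; the first branch is
# returned when it is true, the second branch otherwise.
rows = conn.execute(
    "select company, "
    "case when state = 'WA' then 'In-state' else 'Out-of-state' end as location "
    "from customer order by company"
).fetchall()

for company, location in rows:
    print(f"{company:<26} {location}")
```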
7.5.4 Formatting Column Expression Result Values
The convert function listed in Table 7-6 can be used to do simple type conversions or sophisticated formatting of expression
result values into a char or wchar string. The syntax for the convert function is given below.
convert_function:
        convert(expression, type)
    |   convert(expression, {char | wchar}, width, format)
data_type:
        char | wchar | smallint | integer | real
    |   double | date | time | timestamp | tinyint | bigint
format_spec:
        numeric_format | datetime_format
numeric_format:
        "[<< | >> | ><]['text' | $][- | (][#,]#[.#[#]...][e | E]['text' | $ | %]"
datetime_format:
        "[<< | >> | ><]['text' | spchar | date_code | time_code]..."
date_code:
        m | mm | mmm | mon | mmmm | month
    |   d | dd | ddd | dddd | day
    |   yy | yyyy
time_code:
        h | hh | m | mm | s | ss | .f[f]... | [a/p | am/pm | A/P | AM/PM]
The expression specifies the SQL expression to be converted. In the first convert function form, type specifies the data type to
be returned. It must be a type for which a legal conversion can be performed.
The second form of the convert function will convert the expression result into either a char or a wchar string. The maximum
length of the result string is specified by width which must be an integer constant greater than 1. The result is formatted as
specified by a format string that conforms to the syntax shown above.
The format specifier for numeric values is represented as shown in the table below. The minimum specifier that must be used
for a numeric format is "#". If the display field width is too small to contain a numeric value, the convert function formats the
value in exponential format (for example, 1.759263e08).
Table 7-7. Numeric Format Specifiers
Element              Description
[ << | >> | >< ]     The justification specifier. You can specify left-justified text (<<), right-justified text (>>),
                     or centered text (><). The default for numeric values is right-justified.
[ 'text' | $ ]       A text character or string to use as a prefix for the result string. You must enclose the
                     character or text with single quotation marks unless the prefix is one dollar sign. A set
                     currency statement will change the symbol that is accepted by convert for the $.
[ - | ( ]            The display specifier for negative values. You can show negative values with a minus sign or
                     with parentheses around the value. If parentheses are used, positive values are shown with an
                     ending space to ensure alignment of the decimal point.
[#,]#[.#[#]...]      The numeric format specifier. You can specify whether to show commas every third place before
                     the decimal point. Also, you can specify how many digits (if any) to show after the decimal
                     point. A set thousands or set decimal will change the symbol that is accepted by convert for
                     the "," or the ".".
[e | E]              Whether to use exponential format to show numeric values. If this option is omitted,
                     exponential format is used only when the value is too large or small to be shown otherwise.
                     You can specify display of a lowercase or uppercase exponent indicator.
['text' | $ | %]     A text character or string to use as a suffix for the result string. You must enclose the
                     character or text with single quotation marks unless the suffix is one dollar or percent sign.
                     A set currency statement will change the symbol that is accepted by convert for the $.
The format specifier elements for date/time values are described in the next table. The date/time format specifier can contain
any number of text items or special characters that are interspersed with the date or time codes. You can arrange these items in
any order, but a time specifier must adhere to the ordering rules described in the syntax under "time_code". For the minute
codes to be interpreted as minutes (and not months) they must follow the hour codes. You cannot specify the minutes of a
time value without also specifying the hour. You can specify the hour by itself. Similarly, you cannot specify the seconds
without having specified minutes and you cannot specify fractions of a second without specifying seconds. Thus, the order
"hours, minutes, seconds, fractions" must be preserved.
Table 7-8. Date and Time Format Specifiers
Element                        General Formatting Elements Description
[ << | >> | >< ]               The justification specifier. You can specify left-justified text (<<), right-justified
                               text (>>), or centered text (><). The default for numeric values is right-justified.
[ 'text' | spchar ]            A string or a special character (for example, "-", "/", or ".") to be copied into the
                               result string. The special character is often useful in separating the entities within
                               a date and time.

Element                        Date-Specific Formatting Elements Description
m                              Month number (1-12) without a leading zero.
mm                             Month number with a leading zero.
mmm                            Three-character month abbreviation (e.g., "Jan").
mon                            Same as mmm.
mmmm                           Fully spelled month name (e.g., "January").
month                          Same as mmmm.
d                              Day of month (1-31) without leading zero.
dd                             Day of month with leading zero.
ddd                            Three character day of week abbreviation (e.g., "Wed").
dddd                           Fully spelled day of week (e.g., "Wednesday").
day                            Same as dddd.
yy                             Two-digit year AD with leading zero if year between 1950 and 2049; otherwise same as yyyy.
yyyy                           Year AD up to four digits without leading zero.

Element                        Time-Specific Formatting Elements Description
h                              Hour of day (0-12 or 23) without leading zero.
hh                             Hour of day with leading zero.
m                              Minute of hour (0-59) without leading zero (only after h or hh).
mm                             Minute of hour with leading zero (only after h or hh).
s                              Second of minute (0-59) without leading zero (only after m or mm).
ss                             Second of minute with leading zero (only after m or mm).
.f[f]...                       Fraction of a second: four decimal place accuracy (only after s or ss).
a/p | am/pm | A/P | AM/PM      Hour of day is 0-12; AM or PM indicator will be output to result string (only after
                               last time code element).
The following examples show numeric format specifiers and their results.
Function                                            Result
convert(14773.1234, char, 10, "#.#")                " 14773.1"
convert(736620.3795, char, 12, "#,#.###")           "736,620.380"
convert(736620.3795, char, 12, "$#,#.##")           "$736,620.38"
convert(736620.3795, char, 13, "<<#.######e")       "7.366204e+005"
convert(56.75, char, 10, "#.##%")                   " 56.75%"
convert(56.75, char, 18, "#.##' percent'")          " 56.75 percent"
The examples below show date/time format specifiers and corresponding results. These examples show how the constant
timestamp "1951-10-23 04:40:35" can be returned. The format specifier, rather than the entire function, is shown here in the
left column.
Format Spec.                          Result
mmm dd, yyyy                          Oct 23, 1951
hh' hours on' ddd month dd, yyyy      04 hours on Tue October 23, 1951
dd 'of' month 'of the year' yyyy      23 of October of the year 1951
dddd hh.mm.ss.ffff mm-dd-yyyy         Tuesday 04.42.27.1750 10-23-1951
'date:'yyyy.mm.dd 'at' hh:mm A/P      date:1951.10.23 at 04:42 A
7.6 Performing Aggregate (Grouped) Calculations
All of the select statements shown thus far have produced detail rows where each row of the result set corresponds to a single
row from the table (a base table or table formed from the set of joined tables in the from clause). There are often times when
you want to perform a calculation on one or more columns from a related set of rows returning only a summary row that
includes the calculation result. The set of rows over which the calculations are performed is called the aggregate. The select
statement group by clause is used to identify the column or columns that define each aggregate—those rows that have
identical group by column values. The syntax for the select statement including group by is as follows.
select:
        select [first | all | distinct] {* | select_item [, select_item]...}
            from table_ref [, table_ref]...
            [where cond_expr]
            [group by col_ref [, col_ref]... [having cond_expr]]
            [order by col_ref [asc | desc] [, col_ref [asc | desc]]...]
select_item:
        {{tabname | corrname}.* | expression} [identifier | "headingstring"]
table_ref:
        table_primary | table_join
table_primary:
        table_name_spec | ( table_join )
table_name_spec:
        [dbname.]tabname [[as] corrname]
table_join:
        natural_join | qualified_join | cross_join
natural_join:
        table_ref natural [inner | {left | right} [outer]] join table_primary
qualified_join:
        table_ref [inner | {left | right} [outer]] join table_primary
            {using (colname[, colname]...) | on cond_expr}
cross_join:
        table_ref cross join table_primary
cond_expr:
        rel_expr [bool_oper rel_expr]...
rel_expr:
        expression [not] rel_oper {expression | [{any | some} | all] (subquery)}
    |   expression [not] between constant and constant
    |   expression [not] in {(constant[, constant]...) | (subquery)}
    |   [tabname.]colname is [not] null
    |   string_expr [not] like "pattern"
    |   not rel_expr
    |   ( cond_expr )
    |   [not] exists (subquery)
    |   [tabname.]colname *= [tabname.]colname
    |   [tabname.]colname =* [tabname.]colname
subquery:
        select {* | expression} from {table_list | path_spec} [where cond_expr]
expression:
        arith_expr | string_expr
arith_expr:
        arith_operand [arith_operator arith_operand]...
arith_operand:
        constant | [tabname.]colname | arith_function | ( arith_expr )
arith_operator:
        + | - | * | /
arith_function:
        {sum | avg | max | min} (arith_expr)
    |   count ({* | [tabname.]colname})
    |   if ( cond_expr, arith_expr, arith_expr )
    |   numeric_function | datetime_function | system_function
    |   user_defined_function
string_expr:
        string_operand [^ string_operand]
string_operand:
        "string" | [tabname.]colname
    |   if ( cond_expr, string_expr, string_expr )
    |   string_function
    |   user_defined_function
The five built-in aggregate functions shown in the arith_function syntax rule above are defined in the table below.
Table 7-9. Built-in Aggregate Function Descriptions
Function                                       Description
count( [distinct] {* | [tabname.]colname} )    Returns the number (distinct) of rows in the aggregate.
sum( [distinct] expression )                   Returns the sum of the (distinct) values of expression in the aggregate.
avg( [distinct] expression )                   Returns the average of the (distinct) values of expression in the aggregate.
min( expression )                              Returns the minimum expression value in the aggregate.
max( expression )                              Returns the maximum expression value in the aggregate.
The following example shows how grouped calculations are used to formulate a select statement that produces the year-to-date
earnings for each salesperson. All orders for each salesperson are summarized, the total amount of all orders is computed,
and the total commissions are calculated.
select sale_name, sum(amount), sum(amount*commission)
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id
group by sale_name;
SALE_NAME            SUM(AMOUNT)  SUM(AMOUNT*COMMISSION)
Flores, Bob          173102.02    17310.202
Jones, Walter        422560.55    29579.2385
Kennedy, Bob         736345.32    55225.899
McGuire, Sidney      208432.11    14590.2477
Nash, Gail           306807.26    21476.5082
Porter, Greg         439346.5     35147.72
Robinson, Stephanie  374904.47    26243.3129
Stouffer, Bill       29053.3      2324.264
Warren, Wayne        212638.5     15947.8875
Williams, Steve      247179.99    18538.49925
Wyman, Eliska        566817.01    42511.27575
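The grouping step itself can be reproduced on a small scale. The following sketch uses Python's sqlite3 module as a stand-in engine (not RDM Server) and, to stay short, a single hypothetical acct table that already holds the joined salesperson and order columns instead of the three-table join above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-joined rows: one row per order, with the salesperson's rate.
conn.execute("create table acct (sale_name text, amount real, commission real)")
conn.executemany(
    "insert into acct values (?, ?, ?)",
    [
        ("Flores, Bob", 1000.0, 0.10),
        ("Flores, Bob", 500.0, 0.10),
        ("Jones, Walter", 2000.0, 0.07),
    ],
)

# Rows with the same sale_name form one aggregate; each sum() is computed
# over the detail rows of that aggregate and yields one summary row.
rows = conn.execute(
    "select sale_name, sum(amount), sum(amount * commission) "
    "from acct group by sale_name order by sale_name"
).fetchall()

for name, total, earned in rows:
    print(f"{name:<15} {total:>10,.2f} {earned:>10,.2f}")
```

The two detail rows for the first salesperson collapse into a single summary row, while the second salesperson's single order forms an aggregate of one row.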
The set display statement lets you specify a default display format for a data type. The example below uses the set display statement to specify a fixed-point format with two decimal places for all double and real type columns.
set double display(14, "#,#.##");
set real display(14,"#,#.##");
Re-executing the previous query now produces the following results.
SALE_NAME            SUM(AMOUNT)  SUM(AMOUNT*COMMISSION)
Flores, Bob           173,102.02               17,310.20
Jones, Walter         422,560.55               29,579.24
Kennedy, Bob          736,345.32               55,225.90
McGuire, Sidney       208,432.11               14,590.25
Nash, Gail            306,807.26               21,476.51
Porter, Greg          439,346.50               35,147.72
Robinson, Stephanie   374,904.47               26,243.31
Stouffer, Bill         29,053.30                2,324.26
Warren, Wayne         212,638.50               15,947.89
Williams, Steve       247,179.99               18,538.50
Wyman, Eliska         566,817.01               42,511.28
Most of the remaining examples use the above specified formats (the amount column is double and the tax column is real in
the example databases).
Figure 7-1 illustrates the retrieved aggregate function results. The sales orders for Bob Flores are totaled in the amount and
amount*commission columns.
Figure 7-1. Group By Calculations
If the group by clause is omitted, calculations are performed on all rows as a single aggregate producing a single summary result row. The following example illustrates a select statement that calls the count, min, max, and avg aggregate functions
without the group by clause. The statement retrieves the total number of sales orders, along with the minimum, maximum,
and average order amounts.
select count(*), min(amount), max(amount), avg(amount) from sales_order;
COUNT(*)  MIN(AMOUNT)  MAX(AMOUNT)    AVG(AMOUNT)
127       68.750000    274375.000000  29269.189213
The next example illustrates use of the sum function. The function computes total year-to-date sales for all salespersons in the
sales database.
select sum(amount) from sales_order;
SUM(AMOUNT)
3,717,187.03
The count function is used to calculate the number of detail rows from which the aggregate is comprised. The next query
shows the number of orders placed by each salesperson sorted by the number of orders (most listed first).
select sale_name, count(ord_num) from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id and
customer.cust_id = sales_order.cust_id
group by 1 order by 2 desc;
SALE_NAME            COUNT(ORD_NUM)
Wyman, Eliska                    24
Jones, Walter                    15
Robinson, Stephanie              15
Kennedy, Bob                     12
McGuire, Sidney                  11
Warren, Wayne                    10
Flores, Bob                       9
Nash, Gail                        9
Williams, Steve                   9
Stouffer, Bill                    8
Porter, Greg                      5
The argument in count can be any of the column names. Or, since any column you choose will give the same result, you can
simply write "count(*)".
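One caveat from standard SQL is worth keeping in mind: count(colname) skips rows in which that column is null, so it matches count(*) only when the column contains no nulls. The sketch below demonstrates this on Python's sqlite3 module, used purely as a stand-in with hypothetical on_hand rows. This section does not state how RDM Server treats null values here, so treat the equivalence as something to verify before relying on it with a nullable column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table on_hand (prod_id integer, quantity integer)")
conn.executemany(
    "insert into on_hand values (?, ?)",
    [(1, 10), (2, None), (3, 5)],  # the second row has a null quantity
)

# count(*) counts rows; count(quantity) counts non-null quantity values.
row = conn.execute("select count(*), count(quantity) from on_hand").fetchone()

print(row)
```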
A special form of the count function can also retrieve the total number of rows in a table, as shown below.
select count(*) from on_hand;
COUNT(*)
744
The result returned by "select count (*) from tablename" may include uncommitted records. However, if the
select query contains additional columns or clauses, the returned result set will not include uncommitted
records.
RDM Server SQL maintains on-line statistics that include the total number of rows per table, allowing the above query to
return the result instantly. However, if you did not specify "count(*)" but included a column in this query (as shown below),
RDM Server scans the entire table counting each row, using much more time for the query.
select count(quantity) from on_hand;
COUNT(quantity)
744
If you do not want duplicates included in aggregate calculations, you can specify distinct in an avg, count, or sum function.
Use of distinct is shown in the following query, which retrieves both the total number of items and the total number of distinct products sold by each salesperson.
select sale_name, count(prod_id), count(distinct prod_id)
from salesperson, customer, sales_order, item
where salesperson.sale_id = customer.sale_id and
customer.cust_id = sales_order.cust_id and
sales_order.ord_num = item.ord_num
group by 1;
SALE_NAME            COUNT(PROD_ID)  COUNT(DISTINCT PROD_ID)
Flores, Bob                       2                       24
Jones, Walter                    62                       29
Kennedy, Bob                     40                       27
McGuire, Sidney                  41                       27
Nash, Gail                       20                       16
Porter, Greg                     17                       14
Robinson, Stephanie              67                       34
Stouffer, Bill                   19                       15
Warren, Wayne                    59                       38
Williams, Steve                  25                       17
Wyman, Eliska                    79                       41
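A small sketch of the difference between the two counts, run on Python's sqlite3 module as a stand-in engine rather than RDM Server. The item rows are hypothetical, and the join is replaced by a single table that already carries the salesperson name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table item (sale_name text, prod_id integer)")
conn.executemany(
    "insert into item values (?, ?)",
    [
        ("Nash, Gail", 101),
        ("Nash, Gail", 101),  # same product sold again
        ("Nash, Gail", 202),
        ("Porter, Greg", 303),
    ],
)

# count(prod_id) counts every item row in the group;
# count(distinct prod_id) counts each product only once per group.
rows = conn.execute(
    "select sale_name, count(prod_id), count(distinct prod_id) "
    "from item group by sale_name order by sale_name"
).fetchall()

print(rows)
```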
SQL provides the having clause to restrict result rows based on aggregate functions. The next example uses the having clause
to limit the result set to only those companies with more than five orders for the year.
select company, count(ord_num), sum(amount) from customer natural join sales_order
group by company having count(ord_num) > 5;
COMPANY                     COUNT(ORD_NUM)  SUM(AMOUNT)
Broncos Air Express                      7  $498,952.76
Browns Kennels                           7   $43,284.54
Colts Nuts & Bolts, Inc.                 8   $29,053.30
Patriots Computer Corp.                  6  $120,184.69
Rams Data Processing, Inc.               8  $172,936.31
Seahawks Data Services                   6   $60,756.36
Vikings Athletic Equipment               6   $49,461.20
Note that your application cannot use a where clause in place of the having clause. The where clause restricts detail rows
before they affect the summary calculations, while the having clause restricts aggregate result rows after the calculations are
performed. Consider the following query.
select sale_name, sum(amount) from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id
and sale_id in ("BNF","GAP")
and ord_date between date "1997-06-01" and date "1997-06-30"
group by 1
having sum(amount) > 50000.0;
Figure 7-2 shows the different application of the where and having clauses during the processing of the above query.
Figure 7-2. Use of Where and Having Clauses
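The order of the two filters can be seen in a runnable miniature. This sketch uses Python's sqlite3 module as a stand-in (it is not RDM Server), with hypothetical orders rows, ISO text dates, and a single table in place of the three-table join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (sale_name text, ord_date text, amount real)")
conn.executemany(
    "insert into orders values (?, ?, ?)",
    [
        ("Flores, Bob", "1997-06-03", 40000.0),
        ("Flores, Bob", "1997-06-20", 30000.0),
        ("Flores, Bob", "1997-05-15", 90000.0),   # May order, removed by where
        ("Porter, Greg", "1997-06-10", 20000.0),
        ("Porter, Greg", "1997-05-02", 60000.0),  # May order, removed by where
    ],
)

# where discards the May detail rows before any sum is computed;
# having then discards summary rows whose June total is 50000 or less.
rows = conn.execute(
    "select sale_name, sum(amount) from orders "
    "where ord_date between '1997-06-01' and '1997-06-30' "
    "group by sale_name "
    "having sum(amount) > 50000.0"
).fetchall()

print(rows)
```

The second salesperson's orders total 80000 across both months, but only the 20000 June order survives the where clause, so that group fails the having test.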
The following example uses the ucase and substring functions to retrieve all customers who have a customer identifier (cust_
id) equal to the first three characters of the company name.
select cust_id, company from customer
where cust_id = ucase(substring(company,1,3));
CUST_ID  COMPANY
SEA      Seahawks Data Services
7.7 String Expressions
The concat function is used to concatenate strings. It can be called from the select statement as shown in the following
example.
select sale_name, concat(city, concat(", ", concat(state, concat(" ", zip)))) locality
from accounts;
SALE_NAME            LOCALITY
Flores, Bob          Seattle, WA 98121
Flores, Bob          San Francisco, CA 94127
Porter, Greg         Detroit, MI 48243
Stouffer, Bill       Baltimore, MD 46219
Robinson, Stephanie  Los Angeles, CA 92717
Robinson, Stephanie  San Diego, CA 92126
Robinson, Stephanie  Los Angeles, CA 90075
Kennedy, Bob         Denver, CO 80239
Kennedy, Bob         Phoenix, AZ 85021
In the previous example, the concat function requires several nested calls to construct the customer locality string. Alternatively, your application can use the string concatenation operator ^ as shown below.
select sale_name, city ^ ", " ^ state ^ " " ^ zip from accounts;
The query example below uses the select statement with the dayofweek scalar function, which retrieves the day of the week
(for example, 1 = Sunday, 7 = Saturday). It retrieves the distribution of sales orders, from our sales database, based on the day
of the week when the orders were placed.
select dayofweek(ord_date), count(*) from sales_order group by 1;
DAYOFWEEK(ORD_DATE)  COUNT(*)
                  2        22
                  3        25
                  4        25
                  5        29
                  6        26
In the next example, a select statement calls the month scalar function, which retrieves the month number from the order date.
The sum aggregate function computes the sales totals for each month.
select month(ord_date), sum(amount) from sales_order group by 1;
MONTH(ORD_DATE)  SUM(AMOUNT)
1                $969,467.02
2                $529,401.19
3                $415,894.50
4                $953,985.82
5                $249,299.81
6                $599,138.69
You can use the convert scalar function to change an expression to a character string according to a specified format. Using
this function overrides the default display format.
In the next example, the select statement uses convert to compute the total amount of all orders and the total commissions for
salespersons in the example sales database. Note that identifiers (for example, total_amt) are used to rename the columns.
select sale_name, convert(sum(amount), char, 12, "$#,#.##") total_amt,
    convert(sum(amount*commission), char, 12, "$#,#.##") total_comm
    from acct_sale group by sale_name;
SALE_NAME            TOTAL_AMT    TOTAL_COMM
Flores, Bob          $173,102.02  $17,310.20
Kennedy, Bob         $736,345.32  $51,544.17
Porter, Greg         $439,346.50  $35,147.72
Robinson, Stephanie  $374,904.47  $26,243.31
Stouffer, Bill       $29,053.30   $2,324.26
The application can use the convert function to format a result so that it is easier for users to read. In the following example,
the "dddd" format indicates that the full spelling of the day is to be retrieved.
select convert(ord_date, char, 10, "dddd") "DAY OF ORDER", count(*)
from sales_order group by 1;
DAY OF ORDER  COUNT(*)
Monday        22
Tuesday       25
Wednesday     25
Thursday      29
Friday        26
7.8 Nested Queries (Subqueries)
Subqueries allow SQL statements to restrict where clause results based on the evaluated result of a select statement nested
within the SQL statement. Using its nested query capability, a single SQL select statement can perform a task that may take
many statements in procedural programming languages such as C. Subqueries are specified as a where clause relational expression as defined by the syntax below.
rel_expr:
        expression [not] rel_oper {expression | [{any | some} | all] (subquery)}
    |   expression [not] in {(constant[, constant]...) | (subquery)}
    |   not rel_expr
    |   ( cond_expr )
    |   [not] exists (subquery)
subquery:
select {* | expression} from {table_list | path_spec} [where cond_expr]
rel_oper:
        = | ==
    |   <
    |   >
    |   <=
    |   >=
    |   <> | != | /=
RDM Server SQL can evaluate the following subquery classes.
• Simple, single-value subquery
• Multi-value subquery
• Complex, correlated subquery
• Existence check subquery
Each of these subquery types is described in the following sections.
7.8.1 Single-Value Subqueries
A single-value subquery is the simplest and most often used subquery. This subquery retrieves a single value (often computed
from an aggregate function). A single value subquery has the following form:
select ... from ... where expression rel_oper (select expression from ...)
The subquery's select statement must return only one row.
The following example shows the use of a single value subquery in a select statement that retrieves customer orders with
order amounts larger than the average sales order. The subquery itself retrieves the average sales order amount.
select company, amount from customer, sales_order
where customer.cust_id = sales_order.cust_id and
amount > (select avg(amount) from sales_order);
COMPANY                         AMOUNT
Falcons Microsystems, Inc.      62,340.00
Falcons Microsystems, Inc.      38,750.00
'Bills We Pay' Financial Corp.  150,871.20
'Bills We Pay' Financial Corp.  46,091.44
Bears Market Trends, Inc.       46,740.00
Bears Market Trends, Inc.       49,584.65
. . .
Eagles Electronics Corp.        37,408.52
Eagles Electronics Corp.        47,370.00
Cardinals Bookmakers            143,375.00
Cardinals Bookmakers            35,119.46
Cardinals Bookmakers            38,955.00
Seahawks Data Services          30,036.50
Forty-niners Venture Group      74,315.16
Bucs Data Services              39,675.95
Bucs Data Services              35,582.50
Redskins Outdoor Supply Co.     47,309.94
You can nest subqueries within other subqueries. The next example uses two nested subqueries to retrieve the orders (by salesperson) that are larger than the average amount of the orders placed after the date of the final order from New Jersey.
select sale_name, amount, ord_date from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id and
customer.cust_id = sales_order.cust_id and
amount > (select avg(amount) from sales_order
where ord_date > (select max(ord_date) from
sales_order, customer where state = "NJ" and
sales_order.cust_id = customer.cust_id));
SALE_NAME      AMOUNT      ORD_DATE
Kennedy, Bob   274,375.00  1997-01-06
Kennedy, Bob   49,980.00   1997-01-27
Kennedy, Bob   103,874.80  1997-02-12
Kennedy, Bob   143,375.00  1997-03-16
Kennedy, Bob   35,119.46   1997-04-02
Kennedy, Bob   38,955.00   1997-06-12
Flores, Bob    30,036.50   1997-06-05
Flores, Bob    74,315.16   1997-04-20
Wyman, Eliska  32,925.00   1997-02-06
Wyman, Eliska  54,875.00   1997-04-02
Wyman, Eliska  66,341.50   1997-04-13
. . .
Jones, Walter  46,740.00   1997-01-02
Jones, Walter  28,570.00   1997-03-04
Jones, Walter  49,584.65   1997-04-03
Jones, Walter  31,580.00   1997-05-06
Warren, Wayne  53,634.12   1997-01-10
Warren, Wayne  77,247.50   1997-04-30
7.8.2 Multi-Valued Subqueries
A multi-value subquery retrieves more than one value and has two forms of syntax, as shown below.
select ... from ... where expression rel_oper {{any | some} | all} (select expression from ...)
or
select ... from ... where expression [not] in (select expression from ...)
The any or some qualifier (they are synonyms) indicates that the relational operation is true if there is at least one row from
the subquery's result set for which it is true. The all qualifier indicates that the relational operation is true only when it is true
for every row from the subquery's result set.
The in (subquery) relational operation is true if there is at least one row from the subquery result set that is equal to the value of the left-side expression. If not in is specified, the relational operation is true when the value of the left-side expression does not equal any of the subquery result row values.
Note that,
where expression in (select expression from ...)
is the same as
where expression = some (select expression from ...)
For example, the following query uses a subquery to retrieve the customer orders with amounts larger than all orders booked
in May.
select company, ord_num, ord_date, amount from customer, sales_order
where customer.cust_id = sales_order.cust_id and
amount > all (select amount from sales_order where
ord_date between date "1997-05-01" and date "1997-05-31");
COMPANY                         ORD_NUM  ORD_DATE    AMOUNT
Falcons Microsystems, Inc.      2230     1997-02-04  62,340.00
'Bills We Pay' Financial Corp.  2205     1997-01-03  150,871.20
'Bills We Pay' Financial Corp.  2317     1997-06-18  46,091.44
Bears Market Trends, Inc.       2201     1997-01-02  46,740.00
Bears Market Trends, Inc.       2271     1997-04-03  49,584.65
Bengels Imports                 2257     1997-03-23  62,340.00
Broncos Air Express             2207     1997-01-06  274,375.00
Broncos Air Express             2220     1997-01-27  49,980.00
Broncos Air Express             2237     1997-02-12  103,874.80
Lions Motor Company             2219     1997-01-27  74,034.90
Lions Motor Company             2250     1997-03-06  82,430.85
Lions Motor Company             2288     1997-04-24  252,425.00
Packers Van Lines               2211     1997-01-10  53,634.12
Packers Van Lines               2292     1997-04-30  77,247.50
Oilers Gas and Light Co.        2226     1997-01-30  54,875.00
Chiefs Management Corporation   2241     1997-02-21  82,315.00
Raiders Development Co.         2234     1997-02-10  124,660.00
Patriots Computer Corp.         2281     1997-04-13  66,341.50
Saints Software Support         2218     1997-01-24  81,375.00
Saints Software Support         2324     1997-06-30  104,019.50
Jets Overnight Express          2270     1997-04-02  54,875.00
Eagles Electronics Corp.        2290     1997-04-29  47,370.00
Cardinals Bookmakers            2253     1997-03-16  143,375.00
Forty-niners Venture Group      2284     1997-04-20  74,315.16
Redskins Outdoor Supply Co.     2310     1997-06-04  47,309.94
The next example demonstrates two ways to use multi-value subqueries. The subqueries show customers who are located in
states that also have a sales office.
select company, city, state from customer
where state = any (select state from outlet);
.. or ..
select company, city, state from customer
where state in (select state from outlet);
COMPANY                         CITY           STATE
Raiders Development Co.         Los Angeles    CA
Rams Data Processing, Inc.      Los Angeles    CA
Chargers Credit Corp.           San Diego      CA
Forty-niners Venture Group      San Francisco  CA
Broncos Air Express             Denver         CO
Falcons Microsystems, Inc.      Atlanta        GA
Bears Market Trends, Inc.       Chicago        IL
Patriots Computer Corp.         Foxboro        MA
Vikings Athletic Equipment      Minneapolis    MN
Chiefs Management Corporation   Kansas City    MO
'Bills We Pay' Financial Corp.  Buffalo        NY
Jets Overnight Express          New York       NY
Cowboys Data Services           Dallas         TX
Oilers Gas and Light Co.        Houston        TX
Seahawks Data Services          Seattle        WA
The following example illustrates a select statement using the first form of the multi-value subquery to retrieve companies located in states without a sales office.
select company, city, state from customer
where state <> all (select state from outlet);
COMPANY                      CITY          STATE
Cardinals Bookmakers         Phoenix       AZ
Dolphins Diving School       Miami         FL
Bucs Data Services           Tampa         FL
Colts Nuts & Bolts, Inc.     Baltimore     IN
Saints Software Support      New Orleans   LA
Lions Motor Company          Detroit       MI
Giants Garments, Inc.        Jersey City   NJ
Bengels Imports              Cincinnati    OH
Browns Kennels               Cleveland     OH
Eagles Electronics Corp.     Philadelphia  PA
Steelers National Bank       Pittsburgh    PA
Redskins Outdoor Supply Co.  Arlington     VA
Packers Van Lines            Green Bay     WI
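Because in is the same as = some, the complementary forms can be expected to correspond as well: in standard SQL, not in (subquery) behaves like <> all (subquery) as long as the subquery returns no null values. As a sketch under that assumption, the query above could also be written as follows and would be expected to return the same customers.

```sql
// Customers located in states without a sales office,
// written with "not in" instead of "<> all".
// Assumes outlet.state contains no null values.
select company, city, state from customer
    where state not in (select state from outlet);
```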
7.8.3 Correlated Subqueries
A correlated subquery is one that refers to a column from the outer query, called an outer reference. RDM Server SQL performs a correlated subquery by executing the inner query for each row of the outer query. Processing a subquery of this type
can take some time. An alternative is to create temporary tables and indexes that are then joined using the select statement to
retrieve the desired information.
The following is an example of a correlated subquery used to retrieve the customers who are located in cities that also have
an outlet. Note that the inner query references the state column from the outer query by including the table name shown in
the outer query.
select company, city, state from customer
where city in (select city from outlet where outlet.state = customer.state);
COMPANY                        CITY         STATE
Raiders Development Co.        Los Angeles  CA
Rams Data Processing, Inc.     Los Angeles  CA
Broncos Air Express            Denver       CO
Falcons Microsystems, Inc.     Atlanta      GA
Bears Market Trends, Inc.      Chicago      IL
Vikings Athletic Equipment     Minneapolis  MN
Chiefs Management Corporation  Kansas City  MO
Jets Overnight Express         New York     NY
Cowboys Data Services          Dallas       TX
Seahawks Data Services         Seattle      WA
The query below retrieves the average sales order amounts for each sales manager's department.
select mgr_id, avg(amount)
from salesperson join customer using(sale_id) natural join sales_order
where mgr_id is not null
group by 1;
MGR_ID  AVG(AMOUNT)
BNF     41,157.40
BPS     25,407.96
CMB     30,777.07
GAP     22,149.97
The next example retrieves salespersons' order amounts greater than the average order amount for the department. You can
compare the amounts in the result set with the averages shown above to confirm that the query returned the correct results.
Also note the use of extended join syntax in the from clause.
select sale_name, mgr_id, ord_num, amount
from salesperson sp1 join customer using(sale_id) natural join sales_order
where mgr_id is not null
and amount > (select avg(amount)
from salesperson sp2 join customer using(sale_id) natural join sales_order
where sp2.mgr_id = sp1.mgr_id);
SALE_NAME      MGR_ID  ORD_NUM  AMOUNT
Kennedy, Bob   BNF     2207     274,375.00
Kennedy, Bob   BNF     2220     49,980.00
Kennedy, Bob   BNF     2237     103,874.80
Kennedy, Bob   BNF     2253     143,375.00
Wyman, Eliska  GAP     2231     32,925.00
Wyman, Eliska  GAP     2270     54,875.00
Wyman, Eliska  GAP     2306     25,002.78
Wyman, Eliska  GAP     2281     66,341.50
Wyman, Eliska  GAP     2205     150,871.20
Wyman, Eliska  GAP     2259     24,990.00
Wyman, Eliska  GAP     2317     46,091.44
. . .
Jones, Walter  BPS     2241     82,315.00
Jones, Walter  BPS     2257     62,340.00
Jones, Walter  BPS     2201     46,740.00
Jones, Walter  BPS     2249     28,570.00
Jones, Walter  BPS     2271     49,584.65
Jones, Walter  BPS     2295     31,580.00
Warren, Wayne  BPS     2202     25,915.86
Warren, Wayne  BPS     2211     53,634.12
Warren, Wayne  BPS     2292     77,247.50
Since both queries reference different occurrences of the same table, correlation names must be specified (for example, "sp1"
and "sp2") for each separate salesperson table. This is necessary in order for SQL to determine which of the two salesperson
tables (the inner or outer) the mgr_id refers to.
7.8.4 Existence Check Subqueries
A subquery can also be used to simply check whether a select statement retrieves any row at all. The format of the existence
check subquery is as follows:
select ... from ... where [not] exists (select * from ...)
The existence check subquery does not retrieve any result set; it just returns true if the subquery retrieves at least one row,
and false otherwise.
The following example uses a correlated existence check subquery to return the list of outlets that are warehouses only and
not a sales office.
select * from outlet
where not exists (select * from salesperson where office = loc_id);
LOC_ID  CITY         STATE  REGION
BOS     Boston       MA     1
KCM     Kansas City  MO     2
STL     St. Louis    MO     2
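For contrast, dropping not from the query above should return the outlets that do have at least one salesperson assigned, that is, the sales offices. This is a minimal sketch using the same outlet and salesperson tables; the result rows are not shown because they depend on the example data.

```sql
// Outlets with at least one salesperson assigned to them.
// The subquery is correlated: office is a salesperson column,
// loc_id comes from the current row of the outer outlet table.
select * from outlet
    where exists (select * from salesperson where office = loc_id);
```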
7.9 Using Temporary Tables to Hold Intermediate Results
It is sometimes just not possible to formulate a single select statement to perform a complex query. At those times, the complex query can sometimes be broken into separate, simpler queries in which intermediate results from those simpler queries
can be stored in temporary tables to be joined together in the final query to produce the originally desired results. RDM
Server provides the create temporary table statement just for this purpose with the following syntax.
create_temporary_table:
        create temporary table tabname (temp_col_defn [, temp_col_defn]...)
temp_col_defn:
        colname type_spec [default {constant | null | auto}]
The tabname is a case-insensitive identifier that can be any name except for that of another temporary table already defined in the same connection. The table consists of the specified columns, which can be declared to be any standard RDM Server SQL data type. The default clause can be used to specify a default value for a column.
You can use the create index statement to create an index on a temporary table.
A commit statement must be issued after the create temporary table and the create index statements associated with it
before you can use the temporary table.
You can use initialize table to re-initialize the table to contain other intermediate results (this is much faster than delete from tabname).
Temporary tables are visible only to the connection that creates them. They exist until the connection is terminated. Also, you
must have at least one database open in order to create a temporary table.
Temporary tables can be used as an alternative to a correlated subquery where the performance penalty incurred by the subquery is too great. While that is not really an issue with the following query, the example below shows how this can be done. Suppose you want a list of the salespersons' order amounts greater than the average order amount for the department (this example was given earlier in section 7.8.3). You could solve this by first storing the department averages in a temporary table indexed on mgr_id and then joining the salesperson table with that temporary table to get the desired list. The following SQL script shows how to do this.
set double display as (12,"#,#.##"); // produces cleaner output
create temp table mgravg(mgr_id char(3), avgsale double);
create index mgravgid on mgravg(mgr_id);
commit;
insert into mgravg select mgr_id, avg(amount)
from salesperson join customer using(sale_id) natural join sales_order
where mgr_id is not null group by 1;
select * from mgravg;
MGR_ID  AVGSALE
BNF     41,157.40
BPS     25,407.96
CMB     30,777.07
GAP     22,149.97
select sale_name, mgr_id, ord_num, amount
from salesperson s, mgravg m, customer c, sales_order o
where s.mgr_id = m.mgr_id and s.sale_id = c.sale_id and c.cust_id = o.cust_id
and amount > avgsale;
SALE_NAME      MGR_ID  ORD_NUM  AMOUNT
Kennedy, Bob   BNF     2207     274,375.00
Kennedy, Bob   BNF     2220     49,980.00
Kennedy, Bob   BNF     2237     103,874.80
Kennedy, Bob   BNF     2253     143,375.00
Wyman, Eliska  GAP     2231     32,925.00
Wyman, Eliska  GAP     2270     54,875.00
Wyman, Eliska  GAP     2306     25,002.78
Wyman, Eliska  GAP     2281     66,341.50
. . .
Jones, Walter  BPS     2201     46,740.00
Jones, Walter  BPS     2249     28,570.00
Jones, Walter  BPS     2271     49,584.65
Jones, Walter  BPS     2295     31,580.00
Warren, Wayne  BPS     2202     25,915.86
Warren, Wayne  BPS     2211     53,634.12
Warren, Wayne  BPS     2292     77,247.50
7.10 Other Select Statement Features
There are a few other select statement features that need to be described. These are shown in the select statement syntax grammar below.
select:
select [first | all | distinct] {* | select_item [, select_item]...}
from table_ref [, table_ref]...
[where cond_expr]
[with exclusive lock]
[group by col_ref [, col_ref]... [having cond_expr]]
[order by col_ref [asc | desc] [, col_ref [asc | desc]]...]
select_item:
{tabname | corrname}.* | expression [identifier | "headingstring"]
You can indicate that a select statement is to return just the first row of the result set, all rows of the result set (which is the
default), or only the distinct result set rows in which duplicate rows have been eliminated. Some examples are shown below.
select first * from salesperson;
SALE_ID  SALE_NAME     DOB         COMMISSION  REGION  SALES_TOT   OFFICE  MGR_ID
BCK      Kennedy, Bob  1956-10-29  0.075       0       736,345.32  DEN     BNF
The next query returns the list of salespersons who have at least one customer account. Try it without the distinct and see what you get.
select distinct sale_name
from salesperson join customer using(sale_id);
SALE_NAME
Kennedy, Bob
Flores, Bob
Stouffer, Bill
Wyman, Eliska
Porter, Greg
Nash, Gail
McGuire, Sidney
Williams, Steve
Robinson, Stephanie
Jones, Walter
Note that select distinct usually requires SQL to sort the rows of the result set. This can be expensive for large result sets so
make sure that you really need the distinct rows before using this feature.
As with select * from tabname, you can specify tabname.* to have SQL include all of the columns declared in tabname in the select column list. This is useful when more than one table is listed in the from clause. For example, the following select displays all of the note table entries made by each salesperson.
select sale_name, note.* from salesperson join note using(sale_id);
SALE_NAME      NOTE.NOTE_ID  NOTE.NOTE_DATE  NOTE.SALE_ID  NOTE.CUST_ID
Kennedy, Bob   FOLLCALL1     1996-12-27      BCK           DEN
Kennedy, Bob   FOLLCALL1     1997-02-06      BCK           DEN
Kennedy, Bob   FOLLCALL1     1997-03-05      BCK           PHO
Kennedy, Bob   FOLLCALL1     1997-03-18      BCK           PHO
Kennedy, Bob   FOLLCALL1     1997-04-03      BCK           DEN
Kennedy, Bob   FOLLCALL1     1997-05-08      BCK           PHO
. . .
Warren, Wayne  INITMEET      1996-11-11      WWW           GBP
Warren, Wayne  INITMEET      1996-12-30      WWW           MIN
Warren, Wayne  QUOTE1        1996-12-09      WWW           GBP
Warren, Wayne  QUOTE1        1997-04-08      WWW           GBP
Warren, Wayne  SALESLIT1     1997-03-10      WWW           MIN
Warren, Wayne  SALESLIT1     1997-04-01      WWW           GBP
Warren, Wayne  SALESLIT1     1997-05-01      WWW           MIN
Warren, Wayne  SALESLIT2     1997-03-23      WWW           MIN
The with exclusive lock clause causes SQL to place a write lock (as opposed to its usual read lock) on all of the select statement result rows. It is not usually a good idea to do this, but certain processing requirements call for exclusive access to all of the accessed rows even when only some may actually end up being changed.
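As an illustration of where the clause is placed, the sketch below write-locks every sales order of one customer before the application decides which of them to change. The customer identifier "SEA" is taken from the earlier examples; wrapping the select in a transaction and updating some of the locked rows afterward is an assumption made for this example, not a documented requirement.

```sql
start trans;
// Write-lock all of the customer's orders, including those that
// may turn out not to need any change. The clause follows the
// where clause, as shown in the select syntax above.
select ord_num, ord_date, amount from sales_order
    where cust_id = "SEA"
    with exclusive lock;
// ... change the orders that need changing here ...
commit;
```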
7.11 Unions of Two or More Select Statements
Situations sometimes exist where needed information is stored in different tables or databases. It may be the case that data has
not been normalized so that redundant data co-resides in those separate tables or databases and the easiest way to access that
information is to submit separate queries on each table/database. Ideally, one wants to have the result sets from those separate
queries grouped into a single result set. This can be done through use of temporary tables and using the insert from select
statement to run the results of each query into a single table. But a much easier method is by using the union operator to have
SQL do that work for you. This section describes the use of the union operator to combine the results of separate select statements into a single result set.
7.11.1 Specifying Unions
The result sets of two or more similar select statements can be combined into a single result set through use of the union operator. The syntax for the union of multiple select statements is shown below.
union:
query_expr [order by {colname | num} [asc | desc][, {colname | num} [asc | desc]]...]
query_expr:
query_term | query_expr union [all] query_term
query_term:
query_spec | ( query_expr )
query_spec:
select [first | all | distinct] {* | select_item[, select_item]...}
from tab_ref [, tab_ref]...
[where cond_expr]
All select statements that are involved in each specified union must have the same number of result columns and each of the
corresponding columns must have compatible data types. The results of each of the select statements are combined into a
single result set.
The results from each pair of select statements that are unioned together will have any duplicate rows removed by default. You can specify union all in order to keep any duplicate rows in the result set. Because RDM Server SQL must maintain a separate index in order to locate the duplicate rows to be removed, union all gives the best performance and is the better choice whenever duplicate rows cannot occur or are acceptable in the result.
Unions of more than two select statements are processed in left-to-right order, but the order can be changed by using parentheses.
The size of the final result set can be affected by how the unions are parenthesized if duplicate rows are being eliminated in
some of the unions (i.e., all is specified in some but not all of the union operations).
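The effect of parenthesization can be sketched with the state columns of the customer and outlet tables from the sales database. The two statements below are illustrative only; they follow the union syntax given above, and the actual row counts depend on the data.

```sql
// Processed left to right: the first union removes duplicates,
// then union all appends every outlet row again, so a state that
// has an outlet appears more than once in the result.
select state from customer
union
select state from outlet
union all
select state from outlet;

// The parenthesized union all is evaluated first; the outer union
// then removes all duplicates, so each state appears only once.
select state from customer
union
(select state from outlet union all select state from outlet);
```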
Standard SQL assigns no column headings to the result columns. RDM Server SQL, however, by default assigns the result
column headings based on the column names or headings specified in the first select statement. You can turn this feature on
or off using the following set statement.
set_union_headings:
set union headings {on | off}
7.11.2 Union Examples
The FBI’s National Crime Information Center maintains the national Integrated Automated Fingerprint Identification System
(IAFIS) for use by law enforcement agencies throughout the United States. This database contains the fingerprint records for
over 55 million criminal subjects as well as the civilian subjects many of whom are current or former employees of various
local, state, and federal law enforcement agencies. The FBI also manages the COmbined DNA Index System (CODIS) and the
National DNA Index System containing over 6.7 million offender DNA profiles and almost 260,000 forensic DNA profiles
extracted from crime scenes (as of February, 2009). The CODIS database contains information on convicted felons, arrestees,
and missing persons and their biologically related relatives.
In the investigation of a crime, fingerprints and DNA samples are often found that can lead to the identification and apprehension of the perpetrators of the crime. In the following example, a set of fingerprints and human DNA samples that were
extracted from a hypothetical crime scene are submitted to these various databases in order to identify any individuals from
those databases that match any of the provided fingerprint and DNA codes. All tables contain the name, date of birth (dob),
gender, height, weight, race, hair color, eye color, and distinguishing scars and marks (dsm) of each person in their respective
databases. Each record also contains a unique NCIC identification number. The following query shows how the union operator can be used to return a single result set from these databases containing a list of all persons who match at least one of the
specified fingerprint or DNA codes.
select name, dob, gender, height, weight, race, hair, eye, dsm, ncic, "IAFIS Criminal"
from iafis.criminal where fpid in (fpcode1, fpcode2, ..., fpcodeN)
union all
select name, dob, gender, height, weight, race, hair, eye, dsm, ncic, "IAFIS Civilian"
from iafis.civil where fpid in (fpcode1, fpcode2, ..., fpcodeN)
union all
select name, dob, gender, height, weight, race, hair, eye, dsm, ncic, "CODIS Felon"
from codis.felon where dnacode in (dnacode1, dnacode2, ..., dnacodeN)
union all
select name, dob, gender, height, weight, race, hair, eye, dsm, ncic, "CODIS Arrestee"
from codis.arrestee where dnacode in (dnacode1, dnacode2, ..., dnacodeN)
order by 1, 2;
The union all ensures that SQL will not have to do the extra work to check for duplicate rows. The character literal that is
specified as the last entry in each select column list simply identifies the source from which the matching row was found. The
final result set is sorted by name and date of birth. The ncic identification number is returned in the ncic column which can
then be used to retrieve the entire record for each result row if desired.
Unions can also be used to simplify the kind of select statement to be used to retrieve the desired result. Our sales database
example stores only one address for each customer. But there are often situations where the customer has one address for
billing and another for shipping. One way this can be implemented is to separate customer address data into a separate table
and maintain a billing address foreign key and a shipping address foreign key referencing the address table in the customer
table. The DDL which implements this scheme is given below.
create table address
(
addrid rowid primary key,
address1 char(30),
address2 char(30),
city char(20),
state char(2),
zip char(5)
);
create table customer
(
cust_id char(3) primary key,
company varchar(30) not null,
contact varchar(30),
billingaddr rowid not null references address,
shippingaddr rowid references address
);
The billingaddr column contains the rowid of the address table row that contains the customer’s billing address information.
The shippingaddr column contains the rowid of the address table row that contains the shipping address information. When
the billing and the shipping address are the same, shippingaddr is null. Now suppose that each customer is to be sent a package of promotional material. A list of each customer’s shipping address is to be retrieved and used to produce mailing labels.
One way to do this is shown in the example below.
select company, contact,
if (shippingaddr is null, b.address1, s.address1) address1,
if (shippingaddr is null, b.address2, s.address2) address2,
if (shippingaddr is null, b.city, s.city) city,
if (shippingaddr is null, b.state, s.state) state,
if (shippingaddr is null, b.zip, s.zip) zip
from (customer inner join address b on (billingaddr = b.addrid))
left outer join address s on (shippingaddr = s.addrid)
order by 7;
This works well but is pretty complex. The same result, however, can be achieved using a union with a much simpler construction as follows.
select company, contact, address1, address2, city, state, zip
from customer inner join address on (shippingaddr = addrid)
union all
select company, contact, address1, address2, city, state, zip
from customer inner join address on (billingaddr = addrid)
where shippingaddr is null
order by 7;
8. Inserting, Updating, and Deleting Data in a Database
The SQL insert statement is used to add new data to the database. Data that already exists in the database can be changed using the SQL update statement. You delete data from the database using the SQL delete statement. The use of these three statements is described in detail in this chapter. Changes made by one or more of these statements are not stored in the database until a commit statement is executed. A commit causes all of the database changes made in the current transaction to be safely written to the database. Before describing the SQL statements that change the data stored in the database, it is necessary to first describe the use of transactions.
8.1 Transactions
It is very important that any database management system (DBMS) ensures that the data stored in a database satisfies the ACID criteria: Atomicity, Consistency, Isolation, and Durability. Atomicity means that a set of interrelated database modifications must all be made together at the same time. If one modification from the set fails, then all fail. Consistency means that a database never contains errant data or relationships and that a transaction always transforms the database from one consistent
state into another. Consistency is something that is primarily the responsibility of the application because the database cannot
be certain that all of the necessary modifications have been properly included in any given transaction. In SQL, consistency
rules are specified through DDL foreign and primary key declarations and the check clause and RDM Server SQL does ensure
that all database data adheres to those rules. Isolation means that the changes that are being made during a transaction are
only visible to the user (connection) making them. Not until the transaction’s changes have been committed to the database
are other users (connections) able to see them. Durability refers to the DBMS’s ability to ensure that the changes made by all
transactions that have committed survive any kind of system failure.
The work necessary to ensure that a DBMS supports "ACIDicity" makes it among the most complex of all system software
components. The challenge is to maintain ACIDicity and yet allow the database data to be easily accessed by as many users
as possible, as fast as possible. However, there is an unavoidable and severe negative performance impact caused by the need
to maintain an ACID compliant database. When enforcement of these properties is relaxed, data can be updated and accessed
much more quickly but the consistency and integrity of the data will certainly be impaired should a system failure occur.
A transaction is a group of related database modifications (i.e., a sequence of insert, update, and/or delete statements) that are
written to the database during execution of a commit statement in such a way as to guarantee that either all of the modifications are successfully written to the database or none are in the event of a system failure while the commit is being processed. Should the application detect an error (e.g., invalid user input) or RDM Server SQL detect an integrity error prior to
the commit, a rollback statement can be executed to discard all of the changes made since the start of the transaction.
Transactions are controlled through the use of four SQL statements: start transaction, commit, rollback, and savepoint.
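As a minimal sketch of the all-or-nothing behavior described above, the following script starts a transaction, inserts a row, and then discards the change. It assumes the plain rollback form of the statement and reuses the salesperson row from the commit example in section 8.1.2; after the rollback, the salesperson table should be unchanged.

```sql
start trans;
insert into salesperson
    values "MMB", "Bryant, Mike", date "1960-11-14", 0.05, 0, "SEA", "BNF";
// Suppose the application now detects an error (for example,
// invalid user input): discard everything done since the start
// of the transaction instead of committing it.
rollback;
```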
8.1.1 Transaction Start
The start transaction statement is used to mark the beginning of a new database transaction. Use of this statement is not
strictly necessary as a transaction is implicitly started by SQL on execution of the first insert, update or delete statement that
follows the most recently executed commit (or call to SQLConnect). However, it is best to explicitly start each transaction
as that will clearly delineate transaction boundaries in your application. The syntax for the start transaction is given below.
start_trans:
start trans[action]
The start transaction statement initiates a new database modification transaction in which the changes made by any subsequent
insert, update, or delete statements (as well as changes made by any triggers that have been defined on the modified tables, see
Chapter 9) will be atomically written to the database as a unit upon execution of a commit statement.
In earlier versions of RDM Server the begin transaction statement was used to start (begin) a transaction.
Its syntax is shown below and is still accepted by RDM Server SQL.
begin [trans[action] | work] [trans_id]
The optional trans_id is an identifier that can be used to label the transaction.
8.1.2 Transaction Commit
The commit statement is used to atomically write all of the changes made by insert, update and delete statements executed
since the most recently executed start transaction statement. The syntax for commit is as follows.
commit:
commit
A simple transaction used to insert a single row into the salesperson table is shown in the following example.
start trans;
insert into salesperson
values "MMB", "Bryant, Mike",date "1960-11-14",0.05,0,"SEA","BNF";
commit;
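The all-or-nothing behavior described above can be tried out with a small sketch. It uses Python's standard sqlite3 module purely as a stand-in engine, so the SQL syntax is SQLite's rather than RDM Server's, and the reduced two-column salesperson table and the "XYZ" row are assumptions made for illustration only.

```python
import sqlite3

def run_transaction(conn, rows):
    """Insert all rows in one transaction; discard every change on any error."""
    conn.execute("BEGIN")                      # explicit transaction start
    try:
        for row in rows:
            conn.execute(
                "INSERT INTO salesperson(sale_id, sale_name) VALUES (?, ?)", row)
        conn.execute("COMMIT")                 # all changes written as a unit
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")               # none of the changes survive
        return False

# isolation_level=None lets the code issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE salesperson(sale_id TEXT PRIMARY KEY, sale_name TEXT NOT NULL)")

ok = run_transaction(conn, [("MMB", "Bryant, Mike")])
# The second batch fails on the duplicate key, so "XYZ" must not survive either.
failed = run_transaction(conn, [("XYZ", "Doe, Pat"), ("MMB", "Duplicate")])
count = conn.execute("SELECT COUNT(*) FROM salesperson").fetchone()[0]
```

After both calls the table holds only the row committed by the first transaction.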
8.1.3 Transaction Savepoint
The savepoint statement is used to mark a transaction savepoint identified by savepoint_id that can be the target of a subsequently executed rollback to savepoint savepoint_id statement, which will cause all of the database modifications made
after this savepoint to be discarded while keeping intact all changes made in the transaction prior to this savepoint. The syntax for the savepoint statement is shown below.
savepoint:
savepoint savepoint_id
Of course, this statement requires that a transaction has been started.
Savepoints are discarded through execution of a rollback to a prior savepoint, or a rollback or commit of the transaction.
8.1.4 Transaction Rollback
The rollback statement is used to discard (undo) database modifications made during the current transaction. The syntax for
rollback is shown below.
rollback:
rollback [trans[action]]
| rollback to savepoint savepoint_id
The first form is used to terminate the transaction and discard all of the changes made by all insert, update and delete statements that were executed during the transaction.
The second form is used to discard all of the changes made by all insert, update and delete statements that were executed
after execution of the savepoint statement with a matching savepoint_id. Changes made during the transaction prior to the
savepoint remain in place.
The example below illustrates the use of savepoint and rollback.
start trans;
insert into salesperson ... // new salesperson
savepoint new_customer;
insert into customer... // new customer for new salesperson
insert into customer... // another for the new salesperson
... // discover problem with new customers
rollback to savepoint new_customer;
commit; // commit new salesperson to database
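The same savepoint pattern can be exercised end to end in a minimal sketch. SQLite (through Python's sqlite3 module) is used here only as a stand-in engine; the single-column tables and the key values 'NEW', 'C1', and 'C2' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("CREATE TABLE salesperson(sale_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE customer(cust_id TEXT PRIMARY KEY, sale_id TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO salesperson VALUES ('NEW')")      # new salesperson
conn.execute("SAVEPOINT new_customer")
conn.execute("INSERT INTO customer VALUES ('C1', 'NEW')")   # new customers
conn.execute("INSERT INTO customer VALUES ('C2', 'NEW')")
# A problem is found with the customers: undo only the work after the savepoint.
conn.execute("ROLLBACK TO SAVEPOINT new_customer")
conn.execute("COMMIT")                                      # salesperson is kept

salespersons = conn.execute("SELECT COUNT(*) FROM salesperson").fetchone()[0]
customers = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

The salesperson row inserted before the savepoint is committed, while both customer rows inserted after it are discarded.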
8.2 Inserting Data
The insert statement is used to insert new rows into a table. Three different methods for inserting rows into a table are supported in RDM Server SQL. The insert values statement is the most common and is used to insert a single row into a table.
The insert from select statement can be used to insert the results from a select statement into a table. Finally, the insert from
file statement allows you to insert rows into a table from a comma-delimited text file or from an XML file. Use of each of these
methods is described in the following sections.
8.2.1 Insert Values
The insert values statement is used to insert a row into a table. The syntax for the insert values statement is:
insert_values:
insert into [ dbname.]tabname [ ( colname [, colname ]... ) ] values col_value [, col_value]...
col_value:
constant | null | ? | proc_arg
The insert values statement is used to insert a single row into the table tabname which must identify a table declared in a
database managed by RDM Server. If more than one database has a table named tabname then dbname should be specified to
identify the database containing the desired table.
If a colname list is specified it must include every column which requires that a value be specified (a primary key column or
one which does not have a default value but does have a not null declared). For each column, there must be a value specified
in the same corresponding position in the values list. If no colname list is specified then there must be a value listed for each
column declared in the table in the order in which the columns were declared in the create table statement for tabname.
The values specified in the values list will usually be constants of a data type that is compatible with the data type
of the corresponding column, or null if allowed by the corresponding column definition. However, insert values can also include
parameter marker references (designated by a "?") or, if the insert statement is contained within a create procedure statement,
procedure argument names (proc_arg).
For example, the following statement inserts a new salesperson into the example salesperson table.
insert into salesperson
values "MMB", "Bryant, Mike",date "1960-11-14",0.05,0,"SEA","BNF";
In the salesperson table, the mgr_id column is a foreign key to the row of the salesperson table of the manager. For this
example, if RDM Server finds no salesperson row with "BNF" as the value of sale_id, it rejects the insert statement with a referential integrity violation.
When using the insert values statement, the application does not have to specify all columns for the table. Only columns
declared not null that do not have default values must be specified. For each column omitted from the column list, the insert
statement stores either the column's default value or null. If your application does not include a column list, it must specify
values for all table columns, in the order in which the columns are declared in the create table (or create view) statement.
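The column-list rules and the "?" parameter markers can be illustrated with a short sketch. It runs against SQLite through Python's sqlite3 module as a stand-in, and the four-column salesperson table, with its default of 0.05 for commission, is an assumption made for the example rather than the schema of the RDM Server sample database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE salesperson(
        sale_id    TEXT PRIMARY KEY,
        sale_name  TEXT NOT NULL,       -- required: not null, no default
        commission REAL DEFAULT 0.05,   -- may be omitted: default is stored
        mgr_id     TEXT                 -- may be omitted: null is stored
    )""")

# Only the required columns are listed; values are bound to "?" markers.
conn.execute("INSERT INTO salesperson(sale_id, sale_name) VALUES (?, ?)",
             ("MMB", "Bryant, Mike"))
row = conn.execute(
    "SELECT sale_id, sale_name, commission, mgr_id FROM salesperson").fetchone()

# Omitting a required column is rejected.
try:
    conn.execute("INSERT INTO salesperson(sale_id) VALUES ('BAD')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The omitted commission column receives its default and the omitted mgr_id column receives null, while the insert that leaves out sale_name fails.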
Any SQL program that inserts rows into a table with a rowid primary key must add a place-holder (",,")
for the rowid primary key column in any values list, as well as in any text file that is used for importing
rows into the table. This affects only the use of the insert statement. Rowid primary key values cannot be
modified by using the update statement.
The next example shows the insert statements needed to store a complete sales order in the database.
start trans;
insert into sales_order
values("SEA",2311,date "1997-06-30",time "13:17:00",30036.50,2553.10);
insert into item values(2311,16311,30);
insert into item values(2311,18061,200);
insert into item values(2311,18121,1000);
commit;
The columns in the sales_order table are cust_id, ord_num, ord_date, ord_time, amount, tax, and ship_date respectively. The
columns in the item table are ord_num, prod_id, loc_id and quantity. Note that this is a single transaction, which contains
four insert statements. Hence, there is a single commit statement.
The following example illustrates the use of RDM Server SQL system literal constants in insert statements.
.. a statement that could be executed from an extension module or
.. stored procedure that is always executed when a connection is made.
insert into login_log(user_name, login_time) values(user, now);
.. check today's action items
select cust_id, note_text from action_items where tickle_date = today;
See RDM Server Language Reference for information about specifying constant values including date and
time constants and system literal constants, such as those in the example above.
8.2.2 Insert from Select
You can also insert new rows into a table from another table using the insert from select statement. The syntax for the insert
from select statement is given below. The select statement was described in detail in the last chapter and its use with the
insert statement will show the basics of how the two can be used together.
insert_from_select:
insert into [db_name.]tabname [(colname[, colname]...)] [from] select
The number of result columns returned from the select statement must equal the number of columns specified in the colname
list or, if not specified, the number of columns declared in the table. The data type of each result column must also be compatible with its corresponding table column.
Your application can create a temporary table to hold temporary results from which additional queries can be processed. A
temporary table is visible only to the application session that creates it. It can be queried just like any other table.
To create a temporary table, execute a create temporary table statement that conforms to the following syntax.
create_temporary_table:
create temporary table tabname (temp_col_defn [, temp_col_defn]...)
temp_col_defn:
colname type_spec [default {constant | null | auto}]
The basics of table creation were described earlier in section 6.3. However, no foreign or primary keys or check constraints
can be declared for a temporary table.
Before any rows are inserted into the temporary table, you can create one or more indexes on the temporary table using the
create index statement. Note that the optional attribute and the in clause are not allowed in the create index statement for a
temporary index. Use of the create index statement is described in section 6.4.
Once the temporary table has been created, rows can be inserted into it using any form of the insert statement and it can be
referenced just like any table in other select, update, and delete statements.
The following example uses an insert statement to fill a temporary table called sp_sales with the customer orders processed
by Sidney McGuire ("SKM").
create temporary table sp_sales(
company char(30),
city char(17),
state char(2),
ord_date date,
amount float
);
create index skm_ndx on sp_sales(state);
insert into sp_sales
select company, city, state, ord_date, amount
from customer, sales_order
where customer.sale_id = "SKM" and
customer.cust_id = sales_order.cust_id
order by 1, 4;
The select statement in the insert statement above contains an order by clause that causes the natural ordering of the rows in
sp_sales to be sorted in company, ord_date order. Any select statement issued for sp_sales that does not itself have an order
by clause specified reports its results in the same order.
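A compact sketch of filling a temporary table from a select follows. SQLite (via Python's sqlite3 module) stands in for RDM Server, the tables are cut down to a few columns, and the 'Acme' and 'Bolt' rows are invented sample data. Because SQLite does not promise any natural row order, the sketch reads the rows back with an explicit order by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer(cust_id TEXT PRIMARY KEY, company TEXT, sale_id TEXT);
    CREATE TABLE sales_order(ord_num INTEGER PRIMARY KEY, cust_id TEXT, amount REAL);
    INSERT INTO customer VALUES ('A', 'Acme', 'SKM'), ('B', 'Bolt', 'BNF');
    INSERT INTO sales_order VALUES (1, 'A', 100.0), (2, 'B', 50.0), (3, 'A', 25.0);
    CREATE TEMPORARY TABLE sp_sales(company TEXT, amount REAL);
""")

# The select's result columns line up one-to-one with the columns of sp_sales.
conn.execute("""
    INSERT INTO sp_sales
    SELECT company, amount
      FROM customer, sales_order
     WHERE customer.sale_id = ? AND customer.cust_id = sales_order.cust_id
""", ("SKM",))

rows = conn.execute(
    "SELECT company, amount FROM sp_sales ORDER BY amount").fetchall()
```

Only the two orders belonging to the customer handled by "SKM" end up in the temporary table.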
A temporary table can be reinitialized using the initialize temporary table statement as shown below.
initialize_temporary_table:
init[ialize] temp[orary] table tabname[, tabname]...
Each tabname must be the name of a previously created temporary table.
The following example shows an initialize statement used to reload the temporary table sp_sales with customers for Bob
Flores ("BNF").
init temp sp_sales;
insert into sp_sales
select company, city, state, ord_date, amount
from customer, sales_order
where customer.sale_id = "BNF" and
customer.cust_id = sales_order.cust_id
order by 1, 4;
8.2.3 Importing Data into a Table
Your application can use a single insert from file statement to import multiple rows from a comma-delimited text file into a
table. This statement can be used to perform a bulk load. If any of the rows in the text file violate any of the integrity constraints defined for the table, the load terminates with an error. All rows inserted up to that point are rolled back automatically. The syntax for this form of the insert statement is as follows:
insert_from_file:
insert [with auto commit] into [dbname.]tabname
    [from] [ascii | unicode] file "filename" [, "delimiter"]... [on devname]
| insert [with auto commit] into [db_name.]table_name
    [from] xml file "filename"[, xml_option ...] [on devname]
xml_option:
"blobs={no | yes}"
| "tags={columnnames | numbers}"
| "attribs={no | only}"
| "tabname={no | yes}"
| "nulltags={no | yes}"
| "dateformat={y | m | d}"
The specified file must reside in device devname or, if no device is given, in the user's device. The first form of the insert
from file statement reads each row from one text line in which the column values are delimited by a comma or by the specified
"delimiter" character. Each value is written just as it would be in the values clause of an insert values statement.
The insert from xml file form imports the data from an XML-formatted file. You can specify a variety of options that describe the
format of the xml file to be imported. Note that these options are specified in a string with no spaces allowed between the
option elements. Also note that the default setting is the first option setting specified in the list. Hence, the default blobs
option is no. You can also specify just "y" for "yes" or "n" for "no". The option string is case-insensitive. Each of these
options is described in the following table.
Table 8-1. XML Import Option Descriptions
Option       Description
blobs        Set to "yes" to import the translation string specified for long varbinary column data.
tags         Set to "numbers" when column tags are identified by their ordinal position in the result set rather than by name (e.g., <COLUMN-2>).
attribs      Set to "only" when each row is a single text line with the column values specified as attributes (e.g., <ROW sale_id="BNF" sale_name="Flores, Bob" />).
tabname      Set to "yes" when each row is tagged with its table name (e.g., <salesperson>) rather than <ROW>.
nulltags     Set to "yes" when an empty column entry is given for null-valued columns.
dateformat   Set to "y" when dates are in "YYYY-MM-DD" format (default), to "m" for dates in "MM-DD-YYYY" format, and to "d" for dates in "DD-MM-YYYY" format.
Each text line (ending in a newline character, '\n') in file "filename" corresponds to one row in the table. Each value in the
text line is specified just as it would be in the values clause of an insert values statement. Each column value is separated by
the "delimiter" character which is by default a comma (",").
Any error encountered while processing the insert statement results in an appropriate error return and discards
any rows inserted prior to the occurrence of the error. The with auto commit clause can be specified to indicate that the system is to perform a commit on each row that is inserted into the table from the specified file, which preserves all rows that
were inserted up to the one in which the error was detected.
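The difference between the default all-or-nothing load and a load with auto commit can be sketched as follows. This is an illustration only: Python's csv module parses the delimited text, SQLite stands in for the server, the auto_commit flag is a hypothetical parameter of the helper, and the third input line is a deliberately duplicated key.

```python
import csv
import io
import sqlite3

def new_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE outlet(loc_id TEXT PRIMARY KEY, city TEXT, "
                 "state TEXT, region INTEGER)")
    return conn

def load_rows(conn, text, auto_commit=False):
    """Load delimited lines into outlet; return how many rows survive."""
    if not auto_commit:
        conn.execute("BEGIN")          # one transaction around the whole load
    try:
        for loc_id, city, state, region in csv.reader(io.StringIO(text)):
            # Without BEGIN, each insert is committed on its own.
            conn.execute("INSERT INTO outlet VALUES (?, ?, ?, ?)",
                         (loc_id, city, state, int(region)))
        if not auto_commit:
            conn.execute("COMMIT")
    except sqlite3.IntegrityError:
        if not auto_commit:
            conn.execute("ROLLBACK")   # discard every row loaded so far
    return conn.execute("SELECT COUNT(*) FROM outlet").fetchone()[0]

data = ('"SEA","Seattle","WA",0\n'
        '"LAX","Los Angeles","CA",0\n'
        '"SEA","Duplicate","WA",0\n')   # violates the primary key

all_or_nothing = load_rows(new_db(), data)
per_row = load_rows(new_db(), data, auto_commit=True)
```

With the single transaction no rows remain after the error, whereas with per-row commits the two rows loaded before the bad line are preserved.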
If the number of rows to be inserted is very large, your application should either explicitly open the database in exclusive mode or issue an exclusive table lock on the table being accessed. Otherwise, the server
is forced to maintain a growing number of record locks for the table, which can cause severe performance
degradation on the server.
The following example lists the contents of file "outlet.txt" located in the catdev device.
"SEA","Seattle","WA",0
"LAX","Los Angeles","CA",0
"DAL","Dallas","TX",3
"BOS","Boston","MA",1
"CHI","Chicago","IL",2
"KCM","Kansas City","MO",2
"STL","St. Louis","MO",2
"NYC","New York","NY",1
"ATL","Atlanta","GA",3
"MIN","Minneapolis","MN",2
"DEN","Denver","CO",0
"WDC","Washington","DC",1
All these values can be loaded into our example outlet table using the following insert statement.
insert into outlet file "outlet.txt" on catdev;
commit;
You can ensure the fastest possible import processing by first opening the database in exclusive access mode (no locks
required) with transaction logging turned off (see example below). Of course, the price paid for this performance is the loss of
recoverability in case the server crashes (for example, in a power failure) while the insert statement is being processed. If any
integrity constraints are violated, the insert statement terminates but the rows that have already been inserted cannot be rolled
back. No rollback capability exists at all in this case, because the changes are not logged.
The following code illustrates insert from file statements issued for the product, outlet, and on_hand tables. Notice the use of
the update stats statement following the bulk load. It is always a good practice to execute update stats after making substantial modifications to a database, such as bulk loads. Executing this statement ensures that the SQL query optimizer is generating access plans based on reasonable data usage statistics.
open invntory exclusive with trans off;
insert into product from file "product.txt" on catdev;
insert into outlet from file "outlet.txt" on catdev;
insert into on_hand from file "on_hand.txt" on catdev;
commit;
update stats on invntory;
You can also use the insert from xml file statement to import XML data into a database table. The general format of the XML file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<anytagname>
<anyrowname anyattributename="value" ...>
<anytagname>value</anytagname>
...
</anyrowname>
...
</anytagname>
The first level tags (anyrowname) are assumed to enclose row values. The tag name is ignored. When anytagname or anyattributename matches a column in the table named in the insert statement, the value will be assigned for that column. A row
will be created if at least 1 column is specified, and the resulting insert of the row is valid according to SQL rules, such as
key uniqueness and referential integrity.
Tags nested at deeper levels will be ignored.
If a column is missing, it will be inserted as a null column value. Note that if the column is not nullable, the insert will fail.
If a column is identified more than once in a single row element (one or more attributes with the same name, and/or one or
more elements with the same name), only the first value will be used. The remaining values will be ignored. In the following
example, the sale_id column will have the value "one".
<?xml version="1.0" encoding="UTF-8"?>
<RAIMA-SQL>
<ROW sale_id="one" dob="1954-05-30" sale_id="two">
<SALE_ID>three</SALE_ID>
<SALE_NAME>Flores, Bob</SALE_NAME>
<SALE_ID>four</SALE_ID>
</ROW>
</RAIMA-SQL>
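The matching rules above can be approximated in a few lines. The sketch below is not RDM Server's importer: it assumes that names are matched without regard to case and that attributes are examined before child elements, both inferred from the example. Because a standard XML parser rejects an element that repeats an attribute, the sample document keeps only the first sale_id attribute.

```python
import xml.etree.ElementTree as ET

def row_values(row_elem, columns):
    """Map one first-level row element to {column: value}; first value wins."""
    wanted = {c.lower() for c in columns}
    values = {}
    # Attributes first, then direct children; deeper nesting is never visited.
    candidates = list(row_elem.attrib.items())
    candidates += [(child.tag, child.text) for child in row_elem]
    for name, value in candidates:
        key = name.lower()
        if key in wanted and key not in values:
            values[key] = value        # later duplicates are ignored
    return values

doc = b"""<?xml version="1.0" encoding="UTF-8"?>
<RAIMA-SQL>
  <ROW sale_id="one" dob="1954-05-30">
    <SALE_ID>three</SALE_ID>
    <SALE_NAME>Flores, Bob</SALE_NAME>
    <SALE_ID>four</SALE_ID>
  </ROW>
</RAIMA-SQL>"""

columns = ["sale_id", "sale_name", "dob"]
rows = [row_values(row, columns) for row in ET.fromstring(doc)]
```

Columns that never appear in a row are simply absent from the returned dictionary, which corresponds to inserting them as null.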
8.2.4 Exporting Data from a Table
The insert into file statement can be used to export data from a table (or tables) into either a comma-delimited formatted file
or an XML formatted file. The syntax for this statement is given below.
insert_into_file:
insert into [ascii | unicode] file "filename" [ , "delimiter"] [on devname ] [from] select
| insert into xml file "filename" [ , xml_option]... [on devname ] [from] select
xml_option:
"blobs={no | yes}"
| "header={noversion | version}"
| "tags={columnnames | numbers}"
| "attribs={no | only}"
| "tabname={no | yes}"
| "nulltags={no | yes}"
| "dtd={no | yes}"
| "schema={no | yes}"
| "dateformat={y | m | d}"
This statement will export the result set returned from the specified select statement in the specified format (ascii, unicode, or
xml) into a text file named "filename" which will be stored in the device named devname or in the home device for the user
executing the statement.
The first form (non-xml) of the insert into file statement will store the result rows from the specified select statement in a
comma-delimited file in which the data is stored either as ascii-coded (default) or Unicode-coded (UTF-8) characters. You can
use the "delimiter" clause to change the delimiter from a comma to some other special character (e.g., "|").
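The delimited layout (character values in double quotes, numeric values bare, one row per line) can be reproduced with Python's csv module, as in the following sketch. The export_delimited helper is hypothetical and only mimics the file format; it does not call RDM Server.

```python
import csv
import io

def export_delimited(rows, delimiter=","):
    """Format rows as delimited text: strings quoted, numbers left bare."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter,
                        quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()

rows = [("ATL", "Atlanta", "GA", 3), ("BOS", "Boston", "MA", 1)]
comma_text = export_delimited(rows)
pipe_text = export_delimited(rows, delimiter="|")
```

Passing "|" as the delimiter corresponds to giving "|" in the "delimiter" clause of the statement.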
The xml form of the insert into file statement allows one or more xml control options to be specified. Note that these options
are specified in a string with no spaces allowed between the option elements. Also note that the default setting is the first
option setting specified in the list. Hence, the default blobs option is no. You can also specify just "y" for "yes" or "n" for
"no". The option string is case-insensitive. Each of these options is described in the following table.
Table 8-2. XML Export Option Descriptions
Option       Description
blobs        Set to "yes" to include a translation string for long varbinary column data.
header       Set to "version" to include in the generated xml file a <!-- --> header line containing the version of RDM Server that executed the statement.
tags         Set to "numbers" to have each column tag identified by its ordinal position in the result set rather than by name (e.g., <COLUMN-2>).
attribs      Set to "only" to have each result row output as a single text line with the column values specified as attributes (e.g., <ROW sale_id="BNF" sale_name="Flores, Bob" />).
tabname      Set to "yes" to have each result row tagged with its table name (e.g., <salesperson>) rather than <ROW>.
nulltags     Set to "yes" to output an empty column entry for null-valued columns.
dtd          Set to "yes" to output a DTD (Document Type Description) header for the xml file.
schema       Set to "yes" to output a header containing the schema for the result xml table.
dateformat   Set to "y" to output dates in "YYYY-MM-DD" format (default), to "m" to output dates in "MM-DD-YYYY" format, and to "d" to output dates in "DD-MM-YYYY" format.
Examples of a variety of export options of the outlet table (invntory database) are provided below. The insert statement is followed by the contents of the generated file.
insert into file "outlet.txt" on catdev from select * from outlet;
"ATL","Atlanta","GA",3
"BOS","Boston","MA",1
"CHI","Chicago","IL",2
"DAL","Dallas","TX",3
"DEN","Denver","CO",0
"KCM","Kansas City","MO",2
"LAX","Los Angeles","CA",0
"MIN","Minneapolis","MN",2
"NYC","New York","NY",1
"SEA","Seattle","WA",0
"STL","St. Louis","MO",2
"WDC","Washington","DC",1
insert into file "outlet.txt", "|" on catdev from select * from outlet;
"ATL"|"Atlanta"|"GA"|3
"BOS"|"Boston"|"MA"|1
"CHI"|"Chicago"|"IL"|2
"DAL"|"Dallas"|"TX"|3
"DEN"|"Denver"|"CO"|0
"KCM"|"Kansas City"|"MO"|2
"LAX"|"Los Angeles"|"CA"|0
"MIN"|"Minneapolis"|"MN"|2
"NYC"|"New York"|"NY"|1
"SEA"|"Seattle"|"WA"|0
"STL"|"St. Louis"|"MO"|2
"WDC"|"Washington"|"DC"|1
insert into xml file "outlet.xml" on catdev from select * from outlet;
<?xml version="1.0" encoding="UTF-8"?>
<RAIMA-SQL>
<ROW>
<loc_id>ATL</loc_id>
<city>Atlanta</city>
<state>GA</state>
<region>3</region>
</ROW>
<ROW>
<loc_id>BOS</loc_id>
<city>Boston</city>
<state>MA</state>
<region>1</region>
</ROW>
<ROW>
<loc_id>CHI</loc_id>
<city>Chicago</city>
<state>IL</state>
<region>2</region>
</ROW>
<ROW>
<loc_id>DAL</loc_id>
<city>Dallas</city>
<state>TX</state>
<region>3</region>
</ROW>
<ROW>
<loc_id>DEN</loc_id>
<city>Denver</city>
<state>CO</state>
<region>0</region>
</ROW>
<ROW>
<loc_id>KCM</loc_id>
<city>Kansas City</city>
<state>MO</state>
<region>2</region>
</ROW>
<ROW>
<loc_id>LAX</loc_id>
<city>Los Angeles</city>
<state>CA</state>
<region>0</region>
</ROW>
<ROW>
<loc_id>MIN</loc_id>
<city>Minneapolis</city>
<state>MN</state>
<region>2</region>
</ROW>
<ROW>
<loc_id>NYC</loc_id>
<city>New York</city>
<state>NY</state>
<region>1</region>
</ROW>
<ROW>
<loc_id>SEA</loc_id>
<city>Seattle</city>
<state>WA</state>
<region>0</region>
</ROW>
<ROW>
<loc_id>STL</loc_id>
<city>St. Louis</city>
<state>MO</state>
<region>2</region>
</ROW>
<ROW>
<loc_id>WDC</loc_id>
<city>Washington</city>
<state>DC</state>
<region>1</region>
</ROW>
</RAIMA-SQL>
insert into xml file "outlet.xml","attribs=only","tabname=y" on catdev from select * from outlet;
<?xml version="1.0" encoding="UTF-8"?>
<RAIMA-SQL>
<outlet loc_id="ATL" city="Atlanta" state="GA" region="3" />
<outlet loc_id="BOS" city="Boston" state="MA" region="1" />
<outlet loc_id="CHI" city="Chicago" state="IL" region="2" />
<outlet loc_id="DAL" city="Dallas" state="TX" region="3" />
<outlet loc_id="DEN" city="Denver" state="CO" region="0" />
<outlet loc_id="KCM" city="Kansas City" state="MO" region="2" />
<outlet loc_id="LAX" city="Los Angeles" state="CA" region="0" />
<outlet loc_id="MIN" city="Minneapolis" state="MN" region="2" />
<outlet loc_id="NYC" city="New York" state="NY" region="1" />
<outlet loc_id="SEA" city="Seattle" state="WA" region="0" />
<outlet loc_id="STL" city="St. Louis" state="MO" region="2" />
<outlet loc_id="WDC" city="Washington" state="DC" region="1" />
</RAIMA-SQL>
The last example shown above is an insert into xml file that specified two xml options. Any number of the xml options can
be specified in an insert statement.
8.3 Updating Data
The update statement is used to modify the values of one or more columns of one or more rows in a table.
update:
update [dbname.]tabname
set colname = {expression | null}[, colname = {expression | null}]...
[where cond_expr ]
The value to which each named column in the set clause is assigned is the evaluated result of its specified expression. The
table to be updated is named tabname which, if more than one database has a table of that name, should be qualified with its
dbname. The column values in table tabname referenced by the expressions are the pre-updated column values. The rows that
are updated are those for which the conditional expression is true. If no where clause is specified, every row in table tabname
will be updated. If the update of any of the selected rows results in an integrity constraint violation, the update is aborted and
the changes to the rows that had already been modified are discarded.
Note that you can update a primary key only in those rows for which there are either no foreign key references or for
which a create join has been declared on all of the foreign keys that reference this primary key. Updates of foreign key
columns will be checked to ensure that referential integrity is preserved (i.e., the referenced primary key row exists).
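Two of the rules above lend themselves to a quick experiment: expressions in the set clause see the pre-update column values, and an update that hits a constraint violation part way through leaves no rows changed. The sketch uses SQLite through Python's sqlite3 module as a stand-in, with a hypothetical table named pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE pair(id INTEGER PRIMARY KEY, a INTEGER UNIQUE, b INTEGER)")
conn.executemany("INSERT INTO pair VALUES (?, ?, ?)",
                 [(1, 10, 20), (2, 30, 40)])

# Both right-hand sides use the values from before the update: a and b swap.
conn.execute("UPDATE pair SET a = b, b = a")
swapped = conn.execute("SELECT a, b FROM pair ORDER BY id").fetchall()

# The first row can take a = 99, but the second row then violates the unique
# constraint, so the whole statement is undone, including the first row.
try:
    conn.execute("UPDATE pair SET a = 99")
    aborted = False
except sqlite3.IntegrityError:
    aborted = True
after = conn.execute("SELECT a, b FROM pair ORDER BY id").fetchall()
```

After the failed statement the table still holds exactly the swapped values.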
The following example shows a basic update statement that sets the commission to eight percent for the salesperson with
sale_id "SWR" (Stephanie Robinson). This update modifies only a single row of a table.
start transaction;
update salesperson set commission = 0.08 where sale_id = "SWR";
commit;
The next example gives each non-manager salesperson a 10 percent increase in commission rate.
update salesperson
set commission = commission + 0.10*commission
where mgr_id is null;
commit;
Assume that Rams Data Processing, Inc., has moved to a new address. The next statement modifies the relevant columns in
the customer table of our sales database.
update customer
set address = "17512 SW 123rd St.", city = "Tustin", zip = "90121"
where cust_id = "LAN";
commit;
The statements below illustrate another update example. Eliska Wyman ("ERW") has left the company. Until her replacement
is hired, Eliska's New York and New Jersey customers are to be serviced by Greg Porter ("GAP"), and her other customers will
be handled by Sidney McGuire ("SKM").
start trans;
update customer
set sale_id = "GAP"
where sale_id = "ERW" and state in ("NY", "NJ");
update customer
set sale_id = "SKM"
where sale_id = "ERW" and state not in ("NY", "NJ");
commit;
The following example uses the if column selection function. This function allows the application to do in a single statement
the modifications requiring two update statements in the previous example.
start trans;
update customer
set sale_id = if (state in ("NY","NJ"), "GAP", "SKM")
where sale_id = "ERW";
commit;
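The effect of the single-statement form can be checked with a small sketch. SQLite, used here through Python's sqlite3 module as a stand-in, expresses the same choice with a standard CASE expression rather than the if function; the four customer rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer(cust_id TEXT PRIMARY KEY, state TEXT, sale_id TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [("C1", "NY", "ERW"), ("C2", "CA", "ERW"),
                  ("C3", "NJ", "ERW"), ("C4", "NY", "BNF")])

# One pass over the ERW customers: NY/NJ go to GAP, all others go to SKM.
conn.execute("""
    UPDATE customer
       SET sale_id = CASE WHEN state IN ('NY', 'NJ') THEN 'GAP' ELSE 'SKM' END
     WHERE sale_id = 'ERW'
""")
result = dict(conn.execute("SELECT cust_id, sale_id FROM customer"))
```

Customer C4 is untouched because the where clause restricts the update to rows that were assigned to "ERW".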
8.4 Deleting Data
The delete statement is used to delete one or more rows from a table. The syntax for the delete statement is as follows.
delete:
delete from [dbname.]tabname [where cond_expr ]
The table whose rows are to be deleted is named tabname which, if more than one database has a table of that name, should
be qualified with its dbname. The rows to be deleted from tabname are those for which the conditional expression specified
in the where clause returns true. If no where clause is specified then all of the rows in the table will be deleted. The delete
statement will fail and return an error if it attempts to delete a row that is still referenced by foreign keys in other rows, in which
case no rows will be deleted.
The following example shows how the delete statement is used to try to delete the salesperson row with sale_id equal to
"ERW".
delete from salesperson where sale_id = "ERW";
However, because five customers serviced by this salesperson have not been deleted, the system (in this
case, the rsql utility) returns the following error.
****RSQL Diagnostic 3713: non-zero references on primary/unique key
In the next example, sales manager Chris Blades has left the company and his salespersons are to be reassigned to Bill
Stouffer. Before deleting the salesperson row for Chris Blades (sale_id = "CMB"), an update statement must first be executed
to reassign the salesperson rows with mgr_id = "CMB" to mgr_id "BPS".
start trans;
update salesperson set mgr_id = "BPS" where mgr_id = "CMB";
*** 2 rows affected
delete from salesperson where sale_id = "CMB";
****RSQL Diagnostic 3713: non-zero references on primary/unique key
Oops. There are still some foreign keys somewhere that reference the salesperson row with sale_id = "CMB". There are no customers assigned to Blades since he is a manager, but there are notes. So, the statements below will successfully complete the
transaction. Note that the update salesperson statement is still active in the transaction even though the above delete statement
failed.
delete from note_line where sale_id = "CMB";
*** 29 rows affected
delete from note where sale_id = "CMB";
*** 11 rows affected
delete from salesperson where sale_id = "CMB";
*** 1 rows affected
commit;
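The sequence above (reassign, failed delete, remove the remaining references, successful delete) can be replayed in miniature. The sketch relies on SQLite's foreign key enforcement through Python's sqlite3 module as a stand-in for RDM Server; the tables are reduced to the key columns, the note_line table is left out, and the rows "S1" and "S2" are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE salesperson(sale_id TEXT PRIMARY KEY, "
             "mgr_id TEXT REFERENCES salesperson(sale_id))")
conn.execute("CREATE TABLE note(note_id INTEGER PRIMARY KEY, "
             "sale_id TEXT REFERENCES salesperson(sale_id))")
conn.executemany("INSERT INTO salesperson VALUES (?, ?)",
                 [("BPS", None), ("CMB", None), ("S1", "CMB"), ("S2", "CMB")])
conn.execute("INSERT INTO note VALUES (1, 'CMB')")

conn.execute("BEGIN")
reassigned = conn.execute(
    "UPDATE salesperson SET mgr_id = 'BPS' WHERE mgr_id = 'CMB'").rowcount
try:
    conn.execute("DELETE FROM salesperson WHERE sale_id = 'CMB'")
    first_delete_ok = True
except sqlite3.IntegrityError:
    first_delete_ok = False                # a note row still references CMB
conn.execute("DELETE FROM note WHERE sale_id = 'CMB'")
deleted = conn.execute(
    "DELETE FROM salesperson WHERE sale_id = 'CMB'").rowcount
conn.execute("COMMIT")

remaining = conn.execute(
    "SELECT sale_id, mgr_id FROM salesperson ORDER BY sale_id").fetchall()
```

The reassignment made before the failed delete remains part of the transaction and is committed together with the later deletes.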
9. Database Triggers
A trigger is a procedure associated with a table that is executed (i.e., fired) whenever that table is modified by the execution of
an insert, update, or delete statement. A non-standard trigger mechanism has been available in RDM Server SQL through the
use of a User-Defined Function that gets called via the execution of a check condition that was specified in the create table
statement. The SQL standard now provides the ability for triggers to be specified using SQL statements. This section describes
how standard SQL triggers are implemented in RDM Server SQL.
9.1 Trigger Specification
The create trigger statement is used to create a trigger on a specified database table. The syntax for this statement is given
below.
create_trigger:
create trigger trigname ["description"]
    {before | after} {insert | delete | update [of colname[, colname]...]} on tabname
    [referencing {old | new} [row] [as] corname [{new | old} [row] [as] corname]]
    [for each {row [when (search_condition)] | statement}]
    trigger_stmts
trigger_stmts:
trig_stmt
| begin [atomic]
    trig_stmt...
  end
trig_stmt:
open | close | flush | initialize_database | insert | delete
| update | lock_table | call | initialize_table | notify
The trigname is the unique name of the trigger and must conform to a standard identifier. The tabname is the name of the
table with which the trigger is to be associated. If there is more than one database with a table named tabname then you can
qualify tabname with the name of its database dbname.
An optional string containing a description or comment about the trigger can be specified. This string is stored along with the
trigger definition in the system catalog.
The trigger is defined to be fired either before or after the changes are made by the specified insert, update, or delete (called
the trigger event). The firing of an update trigger can be restricted to occur only when the values of the column names specified in the update of clause are updated. If no columns are specified, then an update trigger will be fired upon the execution
of every update statement on tabname.
Two types of triggers can be created. A statement-level trigger is created by specifying for each statement in the trigger
declaration. If no for each clause is specified, for each statement is the default. A statement-level trigger fires once for each
execution of the insert, update, or delete statement as indicated in the specified trigger event. Thus, for example, an update
statement that modifies 100 rows in a table will execute a statement-level update trigger on the table only once.
A row-level trigger is created by specifying for each row in the trigger declaration. Row-level triggers fire once for each table
row that is changed by the insert, update, or delete statement. Row-level triggers are the more useful of the two types of triggers in that they can reference the old and/or new column values for each row. The referencing clause is used to specify a
correlation name for either the old table row values or the new table row values. This clause can only be specified with row-level triggers. The when clause can be used to specify a condition that must evaluate to true in order for the trigger to fire.
Note that the only table values that can be referenced in the when conditional expression (cond_expr) are those accessed through the referencing old and/or new row correlation names.
The new or old column values of each row can be referenced in the trigger’s SQL statements through the correlation names
specified in the referencing clause. However, references to blob type columns (long varchar/varbinary/wvarchar) are not
allowed. Note that insert triggers only have new column values, delete triggers only have old column values, while update
triggers have both old and new column values.
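The row-level behavior described above can be tried out with a small Python sketch. It uses SQLite (through Python's standard sqlite3 module) as a stand-in, not RDM Server: SQLite has no referencing clause and instead exposes the fixed correlation names OLD and NEW, and the fire_log table and the threshold of 150 are made up for this illustration. The sketch shows a single update statement that changes three rows, with the trigger firing only for the rows that satisfy the when condition.

```python
import sqlite3


def count_row_trigger_firings():
    """Return the rows logged by a row-level update trigger (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales_order (ord_num INTEGER PRIMARY KEY, amount REAL);
        -- fire_log is a hypothetical table used only to observe trigger firings
        CREATE TABLE fire_log (ord_num INTEGER, old_amount REAL, new_amount REAL);

        -- Row-level trigger: evaluated once per changed row, fires only
        -- when the WHEN condition is true for that row.
        CREATE TRIGGER log_increase AFTER UPDATE OF amount ON sales_order
        FOR EACH ROW WHEN NEW.amount > 150
        BEGIN
            INSERT INTO fire_log VALUES (NEW.ord_num, OLD.amount, NEW.amount);
        END;

        INSERT INTO sales_order VALUES (1, 100.0), (2, 200.0), (3, 300.0);
    """)
    # One update statement that modifies all three rows.
    conn.execute("UPDATE sales_order SET amount = amount + 10")
    rows = conn.execute(
        "SELECT ord_num, old_amount, new_amount FROM fire_log ORDER BY ord_num"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Row 1 (new amount 110) does not satisfy the condition; rows 2 and 3 do.
    print(count_row_trigger_firings())
```

Both the old and the new column values are available here because the trigger event is an update; an insert trigger would only have NEW values and a delete trigger only OLD values.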
The SQL statement to be executed when the trigger fires is specified last. If more than one statement is needed, it must be
placed within a begin [atomic] and end block. The SQL standard offers no explanation as to why it chose to include the
word "atomic." It normally is used to mean that a sequence of statements is not interruptible. However, since the execution
of a trigger can cause other data modifications to occur that also have triggers (they can be nested) this cannot be the case
with triggers. We have interpreted it to mean that either all of the SQL statements succeed or if any one fails then the state is
restored to its pre-trigger execution condition. Regardless of why they chose to include this term, it does tend to make one
not want to use triggers for fear of nuking the database!
There are some restrictions on the kinds of SQL statements that can be included in a trigger. No select, DDL, or create statements are allowed in a trigger. A trigger cannot create another trigger. A stored procedure cannot create a trigger. Also, since
it is necessary that any database modifications made by a trigger be included as part of the user’s transaction, no transaction
statements are allowed in a trigger definition. While stored procedures and user-defined procedures can be executed within a
trigger, great care must be exercised to ensure that no harmful side effects occur from the execution of these procedures inside
a trigger.
A trigger takes effect immediately upon the successful execution of the create trigger statement. Thus, create trigger should
be considered more of a DDL than a DML statement: triggers should be created immediately after the DDL statements
that define the database tables with which they are associated. Triggers that are created on an existing database may require that the conditions and data relationships being maintained by the triggers be externally established at trigger creation time. See the "Summary Statistics" section below for an example.
9.2 Trigger Execution
A trigger that has been defined on a table will be executed based on the {before | after} trigger event specification. Any
changes that are made by the SQL statements specified in a before trigger will remain intact even when the triggering data
modification statement fails (e.g., with an integrity violation). The triggered SQL statements defined in an after trigger are
only executed when the triggering data modification statement succeeds.
A before statement-level trigger will execute before any changes are made by the associated (triggering) insert, update, or
delete statement. An after statement-level trigger will execute after all changes have been successfully made by the associated
insert, update, or delete statement.
A before row-level trigger will execute prior to each row modification made by the triggering insert, update, or delete statement. An after row-level trigger executes after each row has been successfully modified by the triggering insert, update, or
delete statement. If a when clause has been specified with a row-level trigger, the trigger will only fire on those rows where
the evaluation of the when's conditional expression (cond_expr) returns true.
All changes made by the SQL statement(s) defined by the trigger are included as part of the user’s transaction. Thus, the
triggered database modifications will be committed when the user subsequently issues a commit statement or they will be
rolled back should the user subsequently execute a rollback statement.
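As a rough illustration of this transactional behavior, the following Python sketch again uses SQLite as a stand-in for RDM Server. The table and trigger names echo the audit trail example in this chapter, but the schema is reduced to the minimum and the function name is hypothetical. It runs the same insert twice, once ending the transaction with commit and once with rollback, and counts the rows the trigger left behind.

```python
import sqlite3


def audit_rows_after(end_with):
    """Insert one order inside a transaction, end it with COMMIT or ROLLBACK,
    and return how many rows the trigger left in orders_log (SQLite stand-in)."""
    # isolation_level=None puts the connection in autocommit mode so that
    # the transaction boundaries below are exactly the ones we issue.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE sales_order (ord_num INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE orders_log (chg_desc TEXT);
        CREATE TRIGGER aft_ord_ins AFTER INSERT ON sales_order
        BEGIN
            INSERT INTO orders_log VALUES ('insert successful');
        END;
    """)
    conn.execute("BEGIN")
    conn.execute("INSERT INTO sales_order VALUES (2400, 10000.0)")
    conn.execute(end_with)  # the trigger's insert shares the fate of the user's insert
    count = conn.execute("SELECT count(*) FROM orders_log").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    print(audit_rows_after("COMMIT"))    # trigger's row is kept
    print(audit_rows_after("ROLLBACK"))  # trigger's row is undone with the insert
```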
There is no limit to the number of triggers that can be defined on a table. There can even be multiple triggers with the same
trigger event specified on a table. Multiple triggers are executed sequentially in the order in which they were defined.
The SQL trigger_stmts can themselves make changes to tables on which other triggers have been defined. Thus, trigger execution can be nested.
Note that any rows that are modified by a trigger remain locked until the user either commits or rolls back the transaction.
Any trigger can be disabled and subsequently re-enabled through use of the alter trigger statement.
alter trigger trigname {enable | disable}
The altered trigger status takes effect immediately upon successful execution of the alter trigger statement.
9.3 Trigger Security
The ability for non-administrator users to create triggers is included in the create database command-level privilege. This can
be set by executing the following grant statement.
grant create database to user_id [, user_id ]...
The create database privilege can be removed by executing the following revoke statement.
revoke create database from user_id [, user_id ]...
A user must either be an administrator or have create database command privilege in order to create, alter or drop triggers.
In addition to having the proper command privilege, a non-administrator user must also have been granted trigger privilege
on any tables on which the user will be creating triggers. Trigger privileges are set using the following grant statement.
grant trigger on [dbname.]tabname to user_id [, user_id ]...
Trigger privilege is required for a user to create, alter, or drop a trigger on the specified table. Trigger privileges can be
revoked by issuing the following statement.
revoke trigger on [dbname.]tabname from user_id [, user_id ]...
Revoking trigger privileges does not affect any triggers that may have already been created by the specified user.
Triggers execute under the authority of the user who created the trigger and not that of the user who executed the original
insert, update, or delete statement that caused the trigger to fire. Thus, the user that issues the create trigger statement must
have the proper security privileges on any table that is to be accessed or modified by the trigger’s SQL statements. Later
changes to the security settings for the user who created the trigger will not affect the execution of the trigger. Please refer to
Chapter 11, "SQL Database Access Security" for details.
A trigger can be dropped by executing the drop trigger statement.
drop trigger trigname
All triggers that have been defined on a particular table are automatically dropped when the table is dropped.
9.4 Trigger Examples
The use of triggers in a database system necessarily means that modifications made to the tables on which triggers have been
defined will have side effects that are hidden from the user who issued the original SQL modification statement. Generally,
side effects are not a good thing to have occur in a software system. Yet, triggers are an important and useful feature for certain kinds of processing requirements. The examples in this section illustrate two such uses. Triggers are particularly useful in
maintaining certain kinds of statistics such as usage or summary stats. Triggers are also very useful in maintaining various kinds
of audit trails.
Summary Statistics
The query below returns the sales totals for each customer in the sales database.
set double display(12, "#,#.##");
select cust_id, sum(amount) from sales_order group by 1;
cust_id    sum(amount)
ATL         113,659.75
BUF         263,030.36
CHI         160,224.65
. . .
SEA          60,756.36
SFF         112,345.66
TBB         104,038.25
WAS          63,039.90
An alternative approach which does not require running a query that scans through the entire sales_order table each time can
be implemented with triggers. A new column named sales_tot of type double is declared in the customer table. The following
three triggers can be defined on the sales_order table to keep the related customer’s sales total amount up to date.
create trigger InsSalesTot after insert on sales_order
referencing new row as new_order
for each row
update customer
set sales_tot = sales_tot + new_order.amount
where cust_id = new_order.cust_id;
create trigger UpdSalesTot after update of amount on sales_order
referencing old row as old_order new row as new_order
for each row
update customer
set sales_tot = sales_tot + (new_order.amount - old_order.amount)
where cust_id = new_order.cust_id;
create trigger DelSalesTot before delete on sales_order
referencing old row as old_order
for each row
update customer
set sales_tot = sales_tot - old_order.amount
where cust_id = old_order.cust_id;
The first trigger, InsSalesTot, executes an update on the customer table after each successful insert on the sales_order table by
adding the new sales_order's amount through the correlation name new_order to the current value of the customer's sales_tot.
The second trigger is fired only when there is an update executed that changes the value of the amount column in the sales_
order table. When that occurs the customer's sales_tot column needs to subtract out the old amount and add in the new one.
The DelSalesTot trigger fires whenever a sales_order row is deleted causing its amount to be subtracted from the customer's
sales_tot.
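To experiment with this pattern without an RDM Server installation, the three triggers can be approximated in SQLite from Python. This is only a sketch under stated assumptions: SQLite uses the built-in OLD and NEW names instead of a referencing clause, requires begin and end around even a single trigger statement, and the two-table schema below is a cut-down, hypothetical version of the sales database.

```python
import sqlite3

# Cut-down schema plus SQLite renderings of the three summary triggers.
SCHEMA = """
CREATE TABLE customer (cust_id TEXT PRIMARY KEY, sales_tot REAL DEFAULT 0);
CREATE TABLE sales_order (ord_num INTEGER PRIMARY KEY, cust_id TEXT, amount REAL);

CREATE TRIGGER InsSalesTot AFTER INSERT ON sales_order FOR EACH ROW
BEGIN
    UPDATE customer SET sales_tot = sales_tot + NEW.amount
     WHERE cust_id = NEW.cust_id;
END;

CREATE TRIGGER UpdSalesTot AFTER UPDATE OF amount ON sales_order FOR EACH ROW
BEGIN
    UPDATE customer SET sales_tot = sales_tot + (NEW.amount - OLD.amount)
     WHERE cust_id = NEW.cust_id;
END;

CREATE TRIGGER DelSalesTot BEFORE DELETE ON sales_order FOR EACH ROW
BEGIN
    UPDATE customer SET sales_tot = sales_tot - OLD.amount
     WHERE cust_id = OLD.cust_id;
END;
"""


def run_sales_total_demo():
    """Apply inserts, an update, and a delete; return ATL's total after each step."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer (cust_id) VALUES ('ATL')")

    def total():
        return conn.execute(
            "SELECT sales_tot FROM customer WHERE cust_id = 'ATL'"
        ).fetchone()[0]

    totals = []
    for stmt in (
        "INSERT INTO sales_order VALUES (1, 'ATL', 100.0)",
        "INSERT INTO sales_order VALUES (2, 'ATL', 50.0)",
        "UPDATE sales_order SET amount = 80.0 WHERE ord_num = 2",
        "DELETE FROM sales_order WHERE ord_num = 1",
    ):
        conn.execute(stmt)
        totals.append(total())
    conn.close()
    return totals


if __name__ == "__main__":
    print(run_sales_total_demo())
```

After every statement the stored total matches what sum(amount) over the customer's orders would return, without scanning sales_order.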
Now suppose you want to also maintain the sales totals for each salesperson in addition to each customer. You can also add a
sales_tot column of type double to the salesperson table and use a trigger to update it as well as the customer sales_tot
column. The simplest way to do this is to modify the above triggers so that they also update the salesperson table row for the salesperson who manages
the account of the customer whose sales_order is being modified, as shown below.
create trigger InsSalesTot after insert on sales_order
referencing new row as new_order
for each row
begin atomic
update customer
set sales_tot = sales_tot + new_order.amount
where cust_id = new_order.cust_id;
update salesperson
set sales_tot = sales_tot + new_order.amount
where sale_id = (select sale_id from customer
where cust_id = new_order.cust_id)
end;
create trigger UpdSalesTot after update of amount on sales_order
referencing old row as old_order new row as new_order
for each row
begin atomic
update customer
set sales_tot = sales_tot + (new_order.amount - old_order.amount)
where customer.cust_id = new_order.cust_id;
update salesperson
set sales_tot = sales_tot + (new_order.amount - old_order.amount)
where sale_id = (select sale_id from customer
where cust_id = new_order.cust_id)
end;
create trigger DelSalesTot before delete on sales_order
referencing old row as old_order
for each row
begin atomic
update customer
set sales_tot = sales_tot - old_order.amount
where customer.cust_id = old_order.cust_id;
update salesperson
set sales_tot = sales_tot - old_order.amount
where sale_id = (select sale_id from customer
where cust_id = old_order.cust_id)
end;
Since each trigger contains two SQL update statements, they must be enclosed between the begin atomic and end pairs. Also
note that the subquery is needed to locate the salesperson row to be updated through the customer row based on the cust_id
column in the sales_order table.
The same result can also be achieved not by modifying the original triggers but by introducing one new trigger that updates
the salesperson's sales_tot whenever a related customer's sales_tot column is updated. Note that the salesperson sales_tot does
not need to be updated when a new customer row is inserted (because the sales_tot is initially zero) or when a customer row
is deleted (because the sales_order rows associated with the customer must first be deleted which causes the customer's sales_
tot to be updated). The trigger definition is as follows.
create trigger UpdSPSalesTot after update of sales_tot on customer
referencing old row as old_cust new row as new_cust
for each row
update salesperson
set sales_tot = sales_tot + (new_cust.sales_tot - old_cust.sales_tot)
where sale_id = new_cust.sale_id;
This trigger fires whenever an update is executed on the sales_tot column in the customer table. That will only occur when
one of the earlier triggers fires due to the execution of an insert, delete, or update of the amount column on the sales_order
table. Thus, this is an example of a nested trigger—a trigger which fires in response to the firing of another trigger.
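The nesting can be observed in a small Python sketch that uses SQLite as a stand-in for RDM Server. The schema is a reduced, hypothetical version of the sales database, and the triggers are written with SQLite's built-in OLD and NEW names. An insert on sales_order fires the first trigger, whose update of customer.sales_tot in turn fires the second trigger on salesperson.

```python
import sqlite3


def nested_trigger_totals():
    """Return (customer totals, salesperson total) after two order inserts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE salesperson (sale_id TEXT PRIMARY KEY, sales_tot REAL DEFAULT 0);
        CREATE TABLE customer (cust_id TEXT PRIMARY KEY, sale_id TEXT,
                               sales_tot REAL DEFAULT 0);
        CREATE TABLE sales_order (ord_num INTEGER PRIMARY KEY, cust_id TEXT, amount REAL);

        -- Outer trigger: fired directly by the user's insert.
        CREATE TRIGGER InsSalesTot AFTER INSERT ON sales_order FOR EACH ROW
        BEGIN
            UPDATE customer SET sales_tot = sales_tot + NEW.amount
             WHERE cust_id = NEW.cust_id;
        END;

        -- Nested trigger: fired by the update that the outer trigger performs.
        CREATE TRIGGER UpdSPSalesTot AFTER UPDATE OF sales_tot ON customer FOR EACH ROW
        BEGIN
            UPDATE salesperson
               SET sales_tot = sales_tot + (NEW.sales_tot - OLD.sales_tot)
             WHERE sale_id = NEW.sale_id;
        END;

        INSERT INTO salesperson (sale_id) VALUES ('BCK');
        INSERT INTO customer (cust_id, sale_id) VALUES ('ATL', 'BCK'), ('BUF', 'BCK');
    """)
    conn.execute("INSERT INTO sales_order VALUES (1, 'ATL', 100.0)")
    conn.execute("INSERT INTO sales_order VALUES (2, 'BUF', 25.0)")
    customers = conn.execute(
        "SELECT cust_id, sales_tot FROM customer ORDER BY cust_id"
    ).fetchall()
    salesperson_total = conn.execute(
        "SELECT sales_tot FROM salesperson WHERE sale_id = 'BCK'"
    ).fetchone()[0]
    conn.close()
    return customers, salesperson_total


if __name__ == "__main__":
    print(nested_trigger_totals())
```

Neither insert statement mentions the salesperson table, yet its total ends up equal to the sum of the two customers' totals.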
The sales database example is delivered with the sales_tot column already declared in the salesperson and customer tables but
without the triggers having been declared. Now, however, you want to create the triggers that will maintain the sales_tot values for each customer and salesperson but data already exists in the database. So, the sales totals somehow need to be initialized at the time the triggers are created. To do this the database should be opened in exclusive access to ensure that no
updates occur between the time the triggers are first installed and the sales_tot values in the customer table are initialized.
The following rsql script shows how this can be done.
open sales exclusive;
set double display(12, "#,#.##");
select sale_id, sales_tot from salesperson;
sale_id    sales_tot
BCK             0.00
BNF             0.00
BPS             0.00
...
WAJ             0.00
WWW             0.00
select cust_id, sales_tot from customer;
cust_id    sales_tot
ATL             0.00
BUF             0.00
CHI             0.00
...
TBB             0.00
WAS             0.00
create trigger InsSalesTot ...
create trigger UpdSalesTot ...
create trigger DelSalesTot ...
create trigger UpdSPSalesTot ...
update customer set sales_tot =
query("select sum(amount) from sales_order where cust_id = ?", cust_id);
*** 28 rows affected
select cust_id, sales_tot from customer;
cust_id     sales_tot
ATL        113,659.75
BUF        263,030.36
CHI        160,224.65
...
TBB        104,038.25
WAS         63,039.90
commit;
select sale_id, sales_tot from salesperson;
sale_id     sales_tot
BCK        237,392.56
BNF        112,345.66
BPS              0.00
...
WAJ        141,535.34
WWW         49,461.20
close sales;
Note that the update statement that sets the sales_tot values for each row in the customer table uses the query system function (a copy has also been included as an example user-defined function called "subquery").
Audit Trails
Audit trails keep track of certain changes that are made to a database along with an identification of the user who initiated
the change and a timestamp as to when the change occurred. Suppose we want to keep track of changes made to the sales_
order table. The following statement creates a table called orders_log that will contain one row per sales_order change and
grants insert (only) privileges on it to all users.
create table sales.orders_log(
chg_desc char(30),
chg_user char(32) default user,
chg_timestamp timestamp default now
);
commit;
grant insert on orders_log to public;
Six statement-level triggers are needed to track all successful and unsuccessful attempts to change the sales_order table: three
before triggers to track all attempts and three after triggers to track only those changes that succeed. Note that should the
transaction that contains the sales_order change statement be rolled back, the changes to orders_log will also be rolled back.
Thus, only unsuccessful change attempts associated with subsequently committed transactions will be logged in the orders_
log table. The declarations of the triggers are given below.
create trigger bef_ord_ins before insert on sales_order
for each statement
insert into orders_log(chg_desc) values "insert attempted";
create trigger bef_ord_upd before update on sales_order
insert into orders_log(chg_desc) values "update attempted";
create trigger bef_ord_del before delete on sales_order
insert into orders_log(chg_desc) values "delete attempted";
create trigger aft_ord_ins after insert on sales_order
insert into orders_log(chg_desc) values "insert successful";
create trigger aft_ord_upd after update on sales_order
insert into orders_log(chg_desc) values "update successful";
create trigger aft_ord_del after delete on sales_order
insert into orders_log(chg_desc) values "delete successful";
By the way, as you can see from the above trigger declarations the for each statement clause is optional and is the default if
no for each clause is specified.
The rsql script below creates a couple of new users who each make several changes to the sales_order table in order to see the
results of the firing of the associated triggers. Note also that the original row-level triggers are still operative.
create user kirk password "tiberius" on sqldev;
grant all commands to kirk;
grant select on orders_log to kirk;
create user jones password "tough" on sqldev;
grant all commands to jones;
grant select on orders_log to jones;
.c 2 server kirk tiberius
insert into sales_order values "IND",2400,today,now,10000.00,0,null;
*** 1 rows affected
commit;
.c 3 server jones tough
update sales_order set amount = 1000.00 where ord_num = 2400;
*** 1 rows affected
delete from sales_order where ord_num = 2210;
****RSQL Diagnostic 3713: non-zero references on primary/unique key
commit;
select * from orders_log;
chg_desc             chg_user    chg_timestamp
insert attempted     kirk        2009-07-27 11:58:17.9460
insert successful    kirk        2009-07-27 11:58:17.9460
update attempted     jones       2009-07-27 11:59:48.2900
update successful    jones       2009-07-27 11:59:48.2900
delete attempted     jones       2009-07-27 12:00:06.3680
.c 2
*** using statement handle 1 of connection 2
delete from sales_order where ord_num = 2400;
*** 1 rows affected
select * from orders_log;
chg_desc             chg_user    chg_timestamp
insert attempted     kirk        2009-07-27 11:58:17.9460
insert successful    kirk        2009-07-27 11:58:17.9460
update attempted     jones       2009-07-27 11:59:48.2900
update successful    jones       2009-07-27 11:59:48.2900
delete attempted     jones       2009-07-27 12:00:06.3680
delete attempted     kirk        2009-07-27 12:05:10.0710
delete attempted     kirk        2009-07-27 12:05:49.9620
delete successful    kirk        2009-07-27 12:05:49.9620
rollback;
select * from orders_log;
chg_desc             chg_user    chg_timestamp
insert attempted     kirk        2009-07-27 11:58:17.9460
insert successful    kirk        2009-07-27 11:58:17.9460
update attempted     jones       2009-07-27 11:59:48.2900
update successful    jones       2009-07-27 11:59:48.2900
delete attempted     jones       2009-07-27 12:00:06.3680
9.5 Accessing Trigger Definitions
Trigger definitions are stored in the system catalog. Two predefined stored procedures are available for accessing trigger definitions. Procedure ShowTrigger will return a result set containing a single char column and one row for each line of text from
the original declaration for the trigger name specified in the procedure argument. Procedure ShowAllTriggers returns two
columns: the trigger name and a line of text from the original declaration. Example calls and their respective result sets are
shown in the example below.
exec ShowTrigger("UpdSalesTot");
TRIGGER DEFINITION
create trigger UpdSalesTot after update of amount on sales_order
referencing old row as old_order new row as new_order
for each row
update customer set sales_tot = sales_tot + (new_order.amount - old_order.amount)
where customer.cust_id = new_order.cust_id;
exec ShowAllTriggers;
NAME             DEFINITION
InsSalesTot      create trigger InsSalesTot after insert on sales_order
InsSalesTot      referencing new row as new_order
InsSalesTot      for each row
InsSalesTot      update customer set sales_tot = sales_tot + new_order.amount
InsSalesTot      where customer.cust_id = new_order.cust_id;
UpdSalesTot      create trigger UpdSalesTot after update of amount on sales_order
UpdSalesTot      referencing old row as old_order new row as new_order
UpdSalesTot      for each row
UpdSalesTot      update customer
UpdSalesTot      set sales_tot = sales_tot + (new_order.amount - old_order.amount
UpdSalesTot      where customer.cust_id = new_order.cust_id;
DelSalesTot      create trigger DelSalesTot before delete on sales_order
DelSalesTot      referencing old row as old_order
DelSalesTot      for each row
DelSalesTot      update customer set sales_tot = sales_tot - old_order.amount
DelSalesTot      where customer.cust_id = old_order.cust_id;
UpdSPSalesTot    create trigger UpdSPSalesTot after update of sales_tot on customer
UpdSPSalesTot    referencing old row as oldc new row as newc
UpdSPSalesTot    for each row
UpdSPSalesTot    update salesperson
UpdSPSalesTot    set sales_tot = sales_tot + (newc.sales_tot - oldc.sales_tot)
UpdSPSalesTot    where sale_id = newc.sale_id;
10. Shared (Multi-User) Database Access
An RDM Server database is designed to be efficiently accessed by multiple, concurrent users. In such a multi-user environment, some method must be used by the DBMS to protect against attempts by multiple users to update the same data at the
same time. RDM Server SQL applies locks on the shared data in order to restrict changes to shared data to one user at a time.
Why locking is needed is explained by the following example.
Table 10-1 shows the sequence of actions of two connections trying to update the same table row at approximately the same
time without using locks. At time t1, connection 1 reads the row from the database. Connection 2 reads the row at t2. Both
connections then update and write the row back to the database, with connection 1 going first for each operation. However,
at the end of the last write, the row copy for connection 2 does not include changes from connection 1 (changes occurred
after connection 2 read the row).
Table 10-1. Multi-User Database Access without Locks
Time    Connection 1    Connection 2
t1      read row
t2                      read row
t3      update row
t4                      update row
t5      write row
t6                      write row
In this case, connection 2 could have seen connection 1's changes only if it had read the row after time t5, so connection 1's changes are lost when connection 2 writes the row at t6. What is necessary is
to provide a lock to serialize updates to the shared data. Table 10-2 illustrates the sequence of operations for the two example
applications synchronized by the use of locks. Note that once the lock request for connection 1 is granted at time t2, connection 2 must wait for the row to be unlocked before continuing. When connection 1 completes its updates, it frees the lock
at time t6. This action triggers RDM Server to grant the lock to connection 2, after which connection 2 can read the row
(including the connection 1 changes) and then make its own changes.
Table 10-2. Multi-User Database Access with Locks
Time    Connection 1        Connection 2
t1      request row lock
t2      lock granted        request row lock
t3      read row
t4      update row
t5      write row
t6      free lock
t7                          read row
t8                          update row
t9                          write row
t10                         free lock
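The two schedules in Tables 10-1 and 10-2 can be replayed in a few lines of Python. This is a toy model and not RDM Server code: the shared row is a dictionary, each connection's private row copy is a second dictionary, a threading.Lock plays the part of the row lock, and the qty column and the +5 and -3 changes are invented for the illustration.

```python
import threading


def run_without_lock():
    """Replay the interleaving of Table 10-1: both connections read first."""
    row = {"qty": 10}
    copy1 = dict(row)      # t1: connection 1 reads the row
    copy2 = dict(row)      # t2: connection 2 reads the same (old) row
    copy1["qty"] += 5      # t3: connection 1 updates its copy
    copy2["qty"] -= 3      # t4: connection 2 updates its copy
    row.update(copy1)      # t5: connection 1 writes
    row.update(copy2)      # t6: connection 2 writes, overwriting the +5
    return row["qty"]


def run_with_lock():
    """Serialize read/update/write per connection, as in Table 10-2."""
    row = {"qty": 10}
    row_lock = threading.Lock()

    def connection(delta):
        with row_lock:             # request lock, wait until granted
            copy = dict(row)       # read row (sees any earlier committed change)
            copy["qty"] += delta   # update row
            row.update(copy)       # write row
        # leaving the with-block frees the lock

    threads = [threading.Thread(target=connection, args=(d,)) for d in (5, -3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return row["qty"]


if __name__ == "__main__":
    print(run_without_lock())  # connection 1's change is lost
    print(run_with_lock())     # both changes are applied
```

Whichever thread obtains the lock first, the locked version applies both changes, because the second connection cannot read the row until the first has written it.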
As the above example illustrates, an important feature of a multi-user DBMS is to provide the shared database access control
necessary to ensure that no data is lost and that the data is logically consistent (that is, the required inter-data relationships
exist). RDM Server SQL automatically manages the needed row-level locking for you. Yet it is important for you to know
how this is done and the features provided by RDM Server SQL that give you control over how SQL manages it. This is the
subject of the rest of this chapter.
10.1 Locking in SQL
10.1.1 Row-Level Locking
Two types of locks are used by RDM Server. A read lock (sometimes called a share or shared lock) is issued by SQL for each
row that is fetched from a select statement. Any number of connections can have a read lock on the same row. A write lock
(sometimes called an exclusive lock) is requested for each row to be modified by an update statement or deleted by a delete
statement. Once the write lock request has been granted to one connection, no other connection can lock (read or write) that
row until the write lock is freed upon execution of a commit or rollback statement. RDM Server SQL implicitly places a
write lock on rows created by execution of an insert statement. Note that SQL will also request and place a write lock on any
existing rows that are referenced by foreign key values in the newly inserted row.
Locks are managed by SQL in conjunction with transactions. All locks that are issued outside of a transaction are read locks.
After a transaction is started, as noted above, write lock requests are issued for the rows that are accessed for each insert,
update or delete statement that is executed as part of the transaction.
Within a transaction, the row-level locks used by a select statement depend on the current transaction mode (see section 10.2
below). Normally, a read lock is placed on the current row (i.e. most recently fetched row) and freed when the next row is
read. This is called cursor stability transaction mode. In read repeatability transaction mode the read locks are kept in place
until the transaction ends. This ensures that a previously fetched row does not change during the transaction.
The with exclusive lock clause of the select statement requests that the system apply write-locks instead of read-locks to the
rows of the result set. If no transaction is active when the statement is executed, SQL will automatically start a transaction.
The select statement that contains the with exclusive lock clause must be updateable which means that it:
• Does not contain a distinct, group by or order by clause, or a subquery,
• Does not contain any column expressions, and
• Has a from clause that refers to only a single table.
The with exclusive lock clause follows the where clause in the specification of a select statement as shown in the following
syntax and example.
select:
select [first | all | distinct] {* | select_item [, select_item]...}
from table_ref [, table_ref]...
[where cond_expr]
[with exclusive lock]
[group by col_ref [, col_ref]... [having cond_expr]]
[order by col_ref [asc | desc] [, col_ref [asc | desc]]...]
select * from salesperson where mgr_id is not null with exclusive lock;
10.1.2 Table-Level Locking
The lock table and unlock table statements allow you to lock an entire table. Two lock modes are provided. Shared mode
allows you (and others) read-only access to the table. Exclusive mode allows you to modify the table while denying all other
users access to it. The syntax for lock table is shown below.
lock_table:
lock table lock_spec[, lock_spec]...
lock_spec:
[dbname.]tabname[, [dbname.]tabname]... [in] {share | exclusive} [mode]
All table locks are automatically freed whenever a transaction is committed or rolled back. Shared mode table locks can be
freed explicitly with the unlock table statement shown below.
unlock_table:
unlock table [dbname.]tabname[, [dbname.]tabname]...
For example, you can issue the following lock table statement to lock the salesperson and customer tables in shared mode and
the sales_order and item tables in exclusive mode.
lock table salesperson, customer share,
sales_order, item exclusive;
A typical use for table locks is to place an exclusive lock on a table in order to do a bulk load, as illustrated in the following
example which loads the inventory database from comma-delimited text files stored in the RDM Server catdev device. By the
way, notice the update stats statement following the bulk load. It is always a good practice to execute update stats after making substantial modifications to a database to ensure that the optimizer is generating access plans based on reasonable usage
statistics.
lock table product, outlet, on_hand exclusive;
insert into product from file "product.txt" on catdev;
insert into outlet from file "outlet.txt" on catdev;
insert into on_hand from file "onhand.txt" on catdev;
commit;
update stats on invntory;
10.1.3 Lock Timeouts and Deadlock
RDM Server SQL issues lock requests that are either granted or denied. Lock requests are normally queued, waiting for the
current lock on the row (or table) to be freed at which time the request at the front of the queue will be granted. Associated
with each connection is a lock timeout value that specifies how long an ungranted lock request can wait on the queue. This
is an important feature to prevent the occurrence of a deadlock in which two connections each hold locks on rows for which
the other connection has a lock request (this is the simplest form of a deadlock—there are many ways in which deadlock can
occur among multiple users). In order to avoid deadlock, when a timeout error is returned from the execution of an insert,
update or delete statement, the proper procedure is to roll back the transaction and start over. This will free that transaction's
locks allowing another connection's competing transaction to proceed.
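The roll back and start over procedure can be sketched in Python. This is a single-threaded toy model, not RDM Server code: two threading.Lock objects stand in for write locks on two rows, the timeout of 0.05 seconds is arbitrary, and the two connections are played out step by step so the outcome is repeatable. Connection 1 holds row A and wants row B, while connection 2 holds row B and wants row A.

```python
import threading


def deadlock_broken_by_timeout(timeout_seconds=0.05):
    """Play out a two-connection deadlock that is resolved by a lock timeout."""
    row_a, row_b = threading.Lock(), threading.Lock()
    events = []

    row_a.acquire()  # connection 1 write-locks row A
    row_b.acquire()  # connection 2 write-locks row B

    # Connection 1 now requests row B, which connection 2 holds.
    if not row_b.acquire(timeout=timeout_seconds):
        events.append("conn1 timeout")
        row_a.release()                  # rollback frees connection 1's locks
        events.append("conn1 rollback")

    # Connection 2 requests row A, which is free after the rollback.
    if row_a.acquire(timeout=timeout_seconds):
        events.append("conn2 commit")
        row_a.release()                  # commit frees connection 2's locks
        row_b.release()

    # Connection 1 starts its transaction over and now gets both locks.
    if row_a.acquire(timeout=timeout_seconds) and row_b.acquire(timeout=timeout_seconds):
        events.append("conn1 retry commit")
        row_b.release()
        row_a.release()

    return events


if __name__ == "__main__":
    for event in deadlock_broken_by_timeout():
        print(event)
```

Had connection 1 waited indefinitely instead of timing out, neither connection could ever have proceeded.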
The set timeout statement can be used to set the lock request timeout value for a connection.
set_timeout:
set timeout [to | =] numseconds
The numseconds is an integer constant that specifies the minimum number of seconds a lock request is to wait. The system
default is 30 seconds. Setting the timeout value to 0 will cause lock requests that cannot be granted to timeout immediately.
Setting the timeout value to -1 will cause lock requests to wait indefinitely.
WARNING: Do not disable timeouts (set timeout = -1) on a deployed/operational database unless you are
absolutely certain that there is no way a deadlock can occur in your application. If you are using row-level
locking it is highly unlikely that you can be certain your application is deadlock free. Disabling timeouts is a
feature intended primarily for diagnosis and testing.
10.2 Transaction Modes
RDM Server SQL automatically controls locking of accessed rows during the processing of select, insert, update, and delete
statements. There are several methods provided by RDM Server SQL that allow you to control the behavior of the locking
operation. The following two set statements are used to establish the desired multi-user operational behavior.
set_read_repeatability:
set read repeatability [to | =] {on | off}
set_transaction:
set trans[action] isolation [to | =] {on | off}
The effects of these two statements are described in table 10-3 below.
Table 10-3. Transactions Control Settings Part I
transaction  repeatable
isolation    reads       description
on           on          This is called read repeatability mode. Changes from other connections (users)
                         are not visible until committed. All rows are locked. Read locks within a
                         transaction are kept until the transaction commits or is rolled back.
on           off         Called cursor stability mode; this is the default mode for RDM Server SQL.
                         Changes are not visible to other connections until committed. A read lock is
                         kept for the current row only. When the cursor is advanced to the next row
                         the current row is freed.
off          on          Allows dirty reads outside of a transaction whereby uncommitted changes from
                         other connections are visible and read locks are not required to read data
                         from the database. Inside a transaction, behavior is identical to read
                         repeatability mode.
off          off         Allows dirty reads outside of a transaction whereby uncommitted changes from
                         other connections are visible and read locks are not required to read data
                         from the database. Inside a transaction, behavior is identical to cursor
                         stability mode.
Regardless of what mode you are in, all rows that are modified through execution of an insert, update, or delete statement are
write-locked and remain so until the transaction is ended through either a commit or a rollback. Because of this, it is a good
idea to keep the sizes of your transactions small. The more rows that are changed within a transaction, the more locks the
server must manage. The overhead associated with this lock management could become excessive. A commit/rollback will
free all of the write locks. Thus, short transactions can increase system throughput.
Read repeatability mode is the strictest form of transaction isolation available. In this mode, every row that is read within a
transaction is read-locked and kept locked until the transaction ends. Thus, rows that are re-fetched inside a transaction are
guaranteed to have the same values.
Cursor stability mode is the RDM Server SQL system's default mode. In this mode, a read lock is placed on each row as it is
fetched. When the next row is fetched, the lock on the current row is freed and the new row is locked. Thus, only the current
row is locked at any time.
So-called "dirty read" mode is useful in situations where the preciseness of the data is not particularly important. It could be
used, for example, when you are looking for a particular row of a table of the "I'll know it when I see it" variety. Its advantage
is that it does not place any locks and, therefore, does not get blocked by any rows in its path that happen to be write-locked
nor will it block other write-lock requests.
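The two set statements can be combined to select any of the modes described above. The following is a minimal sketch that uses only the syntax given in this section; the comments (standard SQL -- form, assumed to be accepted) name the mode that Table 10-3 associates with each combination.

```sql
-- read repeatability mode: isolation on, repeatable reads on
set transaction isolation on;
set read repeatability on;

-- cursor stability mode (the default): isolation on, repeatable reads off
set read repeatability = off;

-- allow dirty reads outside of a transaction: isolation off
set trans isolation to off;
```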
The transaction and read repeatability modes can also be set using the form of the set transaction statement shown below.
set_transaction:
set trans[action] trans_mode[, trans_mode]
trans_mode:
isolation level {read uncommitted | read committed | repeatable read}
This form adheres to standard SQL with the modes set as indicated in the table below.
Table 10-4. Transactions Control Settings Part II
mode setting | transaction isolation | repeatable reads
read uncommitted | off | off
read committed | on | off
repeatable read | on | on
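A short sketch of the standard form follows. Each statement selects one of the three isolation levels named in the syntax above; the abbreviated keyword trans is used in the second statement only to show that it is accepted by the grammar.

```sql
-- select the strictest of the three standard isolation levels
set transaction isolation level repeatable read;

-- select the read committed level, using the abbreviated keyword
set trans isolation level read committed;
```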
11. Stored Procedures and Views
11.1 Stored Procedures
11.1.1 Create a Stored Procedure
An RDM Server SQL stored procedure is a precompiled group of one or more SQL (DML) statements stored in the system
catalog. A stored procedure is defined using the create procedure statement that conforms to the following syntax.
create_procedure:
        create proc[edure] procname ["description"] [(arg_spec[, arg_spec]...)] as
            proc_stmt...
        end proc[edure]
    |   create proc[edure] procname ["description"] in libname on devname
arg_spec:
        argname type_spec [default constant]
proc_stmt:
        open | close | flush | initialize_database | insert | delete
    |   update | select | lock_table | call | initialize_table
    |   begin_trans | start_trans | commit | rollback | mark
    |   set | notify | update_stats
insert:
        insert_values | insert_from_select
The name of the stored procedure is procname and must be unique as stored procedures have system-wide scope. An optional
"description" string can be specified and will be stored with the procedure definition in the system catalog.
Procedures can have arguments. Associated with each argument is a name, data type and optional default value. The argname
is a case-insensitive identifier that can be any name except that it must be unique in the argument list of this procedure. The
data type declared for each argument can be any of the type_spec entries. Note that it is not necessary to specify a
length for a character argument, since the length is determined from the actual value passed to
the procedure at the time it is invoked. The same is true for the precision and scale of decimal type arguments.
One or more SQL statements that comprise the body of the stored procedure are placed in the order in which they will be
executed between the as and the end procedure clauses. They do not need to be separated by semi-colons. Only the specified
SQL statements can be contained in a stored procedure. The syntax for each of those statements is defined elsewhere in this
manual and/or the SQL Language Reference.
The order_report procedure illustrated below is a typical example of how you might use a stored procedure. The arguments
supply the date range over which a standard report is produced. If a value for end_date is not supplied when the procedure is
executed, the end date will be set to the current date, as defined by its default clause.
create procedure order_report(
start_date date,
end_date date default today) as
select sale_name, company, ord_num, ord_date, amount, tax
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id and
customer.cust_id = sales_order.cust_id and
ord_date between start_date and end_date
end procedure;
Assuming that the salesperson's sale_id is the same as the user name, the check_tickle procedure below retrieves all of a salesperson's notes in date order for the specified note_id. Note that you can abbreviate procedure as proc.
create proc check_tickle(id char) as
select note_date, cust_id, textln
from note, note_line
where note_id = id and sale_id = user()
and note.note_id = note_line.note_id
and note.note_date = note_line.note_date
order by 1, 2;
end proc;
The preceding examples contain only one statement, but a stored procedure can contain any number of statements. The
example procedure below, product_summary, uses two select statements. This stored procedure shows the total amount of a
particular product stored at all outlets, followed by the total amount of that same product that has been ordered.
create proc product_summary(pid smallint) as
select prod_id, prod_desc, sum(quantity) total_available
from product, on_hand
where prod_id = pid and product.prod_id = on_hand.prod_id
select prod_id, sum(quantity) total_ordered
from item where prod_id = pid
end proc;
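None of the preceding examples uses the optional description string or a character argument with a default value. The sketch below shows both. The procedure name outlets_in_state, its description text, and the argument name st are hypothetical; the outlet table and its state column are the ones used by the west_outlets view in section 11.2.3.

```sql
-- hypothetical procedure: lists the outlets located in one state
-- (no length is given for the char argument, as described above)
create proc outlets_in_state "Lists outlets in one state" (st char default "CA") as
    select * from outlet where state = st
end proc;
```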
11.1.2 Call (Execute) a Stored Procedure
An RDM Server SQL stored procedure is called through a call (or execute) statement that conforms to the syntax shown below.
call:
{call | exec[ute]} procname [( arg_value [, arg_value]... )]
arg_value:
constant | ? | argname | corname.colname
An argument value, arg_value, can be one of the following:
• a constant that is compatible with its declared argument type,
• a parameter marker (?) but not if the procedure is being called from within another procedure or a trigger,
• the name of an argument of the stored procedure containing this call statement,
• a reference to an old or new column value within a trigger definition.
RDM Server SQL returns control to the calling application after it completes processing. When an error occurs in the execution of any of the statements in the stored procedure, the procedure immediately terminates and returns an error code to the
calling application.
If the stored procedure has arguments, the call statement must specify a value (or a placeholder) for each one and they must
be specified in the same order they were defined in the create procedure statement. The application does not have to supply
a value for an argument that has a default value, but it does need to supply a comma as a placeholder for that parameter, as
the following example illustrates.
call myproc(17002,,1);
The statement in the next example invokes the order_report stored procedure created in the previous section.
call order_report(date "06-01-1997", date "06-30-1997");
Since the next statement is executed on 6/30/97, it will produce the same results as the preceding statement because of the
default value specified for end_date. Note the use of exec (the alternate form of the execute statement) and the comma placeholder for the final default parameter.
exec order_report(date "06-01-1997",);
The following example invokes the check_tickle stored procedure defined in the prior section.
call check_tickle("PROSPECT");
11.2 Views
11.2.1 Create View
A view is a table derived from the results of the select statement which defines the view. A view can be used just like any
table, but it does not contain any rows of its own. Instead, it is solely composed of rows returned from its select statement on
the underlying base tables. The syntax for the create view statement is shown below.
create_view:
create view [dbname.]viewname ["description"] [ (colname [, colname ]...) ]
as select expression[, expression]... from table_ref [, table_ref]...
[where cond_expr]
[group by col_ref [, col_ref ]... [having cond_expr] ]
[with check option]
The table defined by the view is the one that results from executing the specified query. The select expressions constrain the
visible columns in the view. The where clause constrains the rows of the view to only those that satisfy its condition.
If a list of column names is specified, there must be one column name (colname) for each select expression. The value associated with each column is the value of its respective select expression (for example, the value of the fourth column is the result of the fourth expression). If a column name list is not specified, the column names of the view are the same as the column
names in the select statement. If any of the columns in the select statement are expressions or if two columns have the same
name, a column name list must be specified. Note that the select statement defining the view cannot have an order by clause.
In the following example, the create view statement defines a view that provides a summary of the total order amounts per
salesperson per month for the current year. The sales_summary view contains three columns: sales_month (month
in which the order was taken), sale_name, and order_tot (total orders for the salesperson for the month).
create view sales_summary(sales_month, sale_name, order_tot) as
select month(ord_date), sale_name, sum(amount)
from salesperson, customer, sales_order
where year(ord_date) = year(curdate()) and
salesperson.sale_id = customer.sale_id and
customer.cust_id = sales_order.cust_id
group by 1, 2;
11.2.2 Retrieving Data from a View
Use a view in exactly the same way you would use any table. For example, the select statement shown below uses the view
defined in the previous section.
select * from sales_summary where sales_month in (1,2);
SALES_MONTH  SALE_NAME            ORDER_TOT
1            Flores, Bob          19879.5
1            Jones, Walter        76887.87
1            Kennedy, Bob         328070.83
1            McGuire, Sidney      3437.5
1            Nash, Gail           136250
1            Porter, Greg         74034.9
1            Robinson, Stephanie  29073.51
1            Stouffer, Bill       15901.61
1            Warren, Wayne        79957.98
1            Williams, Steve      32094.75
1            Wyman, Eliska        173878.57
2            Flores, Bob          8824.56
2            Jones, Walter        86065
2            Kennedy, Bob         103874.8
2            McGuire, Sidney      9386.25
2            Nash, Gail           3927.9
2            Robinson, Stephanie  164816.47
2            Stouffer, Bill       4049.09
2            Warren, Wayne        11265.92
2            Williams, Steve      62340
2            Wyman, Eliska        74851.2
The next example includes an order by clause. Notice that although the order_tot value is calculated using an aggregate function (sum), the comparison is specified in the where clause and not in a having clause. If the comparison were defined as part
of the view, it would need to be in the having clause of the create view's select statement.
select order_tot, sales_month, sale_name from sales_summary
where order_tot > 10000.0 order by 1 desc;
ORDER_TOT  SALES_MONTH  SALE_NAME
328070.83  1            Kennedy, Bob
252425     4            Porter, Greg
173878.57  1            Wyman, Eliska
164816.47  2            Robinson, Stephanie
143375     3            Kennedy, Bob
137157.05  4            Wyman, Eliska
136250     1            Nash, Gail
104019.5   6            Nash, Gail
103874.8   2            Kennedy, Bob
103076.79  6            Wyman, Eliska
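For comparison, the same restriction can be moved into the view itself. The sketch below is a variant of the sales_summary definition from section 11.2.1 in which the comparison appears in a having clause; the view name big_sales is hypothetical.

```sql
-- hypothetical view: monthly totals per salesperson, restricted to
-- totals above 10000 by a having clause in the defining select
create view big_sales(sales_month, sale_name, order_tot) as
    select month(ord_date), sale_name, sum(amount)
    from salesperson, customer, sales_order
    where year(ord_date) = year(curdate()) and
          salesperson.sale_id = customer.sale_id and
          customer.cust_id = sales_order.cust_id
    group by 1, 2 having sum(amount) > 10000.0;

-- the query against the view no longer needs its own where clause
select order_tot, sales_month, sale_name from big_sales order by 1 desc;
```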
11.2.3 Updateable Views
An updateable view can be the table referenced in an insert, delete or update statement. A view is considered updateable
when the select statement defining the view meets all the following conditions.
• It does not contain a subquery or a distinct, group by or order by clause.
• It does not contain any column expressions.
• It has a from clause referring to only a single table.
A view having a with check option specification must be updateable. When specified, the with check option requires that
any insert or update statements referencing the view must satisfy the where condition of the view's defining select statement.
The following create view defines a view that restricts the outlet table to only those rows located in western region states.
create view west_outlets as
select * from outlet where state in ("CA","CO","OR","WA")
with check option;
If you attempted to insert a row into the west_outlets view with a state value other than one of the states listed in the where
clause, the insert statement would be rejected as in the example below. If the with check option had been omitted in the view
definition, the row would have been stored.
insert into west_outlets values("SAL","Salem","MA",0);
*** integrity constraint violation: check
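The check applies to update statements as well as to insert statements. The two statements below are purely illustrative and use only the state column that appears in the view definition. Under the rule stated above, the first would be expected to succeed and the second to be rejected; the exact diagnostic text is not shown because it is not given in this guide.

```sql
-- new value is still one of the states listed in the view's where clause
update west_outlets set state = "OR" where state = "WA";

-- new value falls outside the view's where clause, so with check option
-- would be expected to reject the update
update west_outlets set state = "MA" where state = "CA";
```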
11.2.4 Drop View
When a view is no longer needed, you can delete the view from the system by issuing a drop view statement.
drop_view:
drop view [dbname.]viewname [cascade | restrict]
The cascade option (the default) causes all other views referencing this view to be automatically dropped by the system. The
restrict option prohibits dropping viewname if any other views exist that reference this view. The example code below creates two views to help illustrate the use of the drop view statement.
create view acct_orders as
select sale_id, sale_name, cust_id, ord_num, ord_date, amount
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id;
create view sales_summary(sales_month, sale_name, order_tot) as
select month(ord_date), sale_name, sum(amount)
from acct_orders where year(ord_date) = year(curdate())
group by 1, 2;
The following statement will be rejected by SQL because a dependent view exists (sales_summary).
drop view acct_orders restrict;
The next statement, however, will drop not only the acct_orders view but sales_summary as well.
drop view acct_orders cascade;
The same result will occur using the next statement because cascade is the default action.
drop view acct_orders;
Thus, when in doubt, always specify restrict. You cannot undo (or roll back) a drop view.
11.2.5 Views and Database Security
One of the more important uses of views is in conjunction with database security. Views can have permissions assigned to
them just as with any table. If it is important to be able to restrict which columns from a base table users can have access to,
you can simply define a view that includes only those columns. The view would be accessible to those users, whereas the
base table would not. For example, the following view could be used to hide personal information about salespersons such as
date of birth and commission rate from unauthorized eyes.
create view sales_staff
as select sale_id, sale_name, region, office, mgr_id from salesperson;
Once the proper permissions have been established, the dob and commission columns would not be accessible to normal
users.
You can also use views to restrict rows from certain users. The west_outlets view defined in section 11.2.3 could be set up so that
those salespersons from the western region could only access information (for example, inventory quantities) from offices located in those particular states.
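As a sketch of how such permissions might be established, the statements below use the grant syntax described in section 12.2.1. The user name Sidney is only an example; no privileges are granted on the salesperson base table, so it remains accessible only to its owner and to administrators.

```sql
-- every user may query the restricted column set exposed by the view
grant select on sales_staff to public;

-- one (example) user may query the western-region outlets only
grant select on west_outlets to Sidney;
```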
12. SQL Database Access Security
Database security provides the ability to restrict user access to database information through restrictions on the database
columns and tables, or on the kinds of statements that a particular user can use.
The RDM Server system has two classes of users. Administrator users have full access rights to all system capabilities and
databases. Normal users have only the access rights granted to them by administrators and database owners. A system will typically have only a single administrator user (often referred to as the system or database administrator). RDM Server does not,
however, require that there be only one administrator.
RDM Server SQL provides two classes of access privileges: command privileges and database access privileges. Command
privileges allow an administrator to specify the kinds of commands that a particular user is allowed to use. Database access
privileges allow an administrator or database owner to specify the database information and operations that a particular user is
allowed to access. User access rights are assigned for an RDM Server SQL database using the grant and revoke statements.
To manipulate an RDM Server SQL database, a user must have both table and command privileges. For example, RDM Server
does not allow a user without delete command privileges to issue a delete statement, even if delete data access privileges on
the table have been granted to that user.
Attempts to execute an SQL statement by a user for which proper access privileges have not been granted
will result in an access rights violation error returned from RDM Server SQL.
Changes to a user's security settings do not take effect until the next time that user logs in to RDM Server.
12.1 Command Access Privileges
12.1.1 Grant Command Access Privileges
Command privileges specify the kinds of RDM Server SQL statements available to a user for database manipulation. The form
of the grant statement that is used to do this is defined by the following syntax.
grant:
        grant cmd_spec to user_id[, user_id]...
cmd_spec:
        all commands [but command [, command]...]
    |   commands command [, command]...
command:
        create {database | proc[edure] | trigger}
    |   insert | update | delete
    |   lock table | unlock table
The user_id is an identifier that is case-sensitive and must exactly match the user id for the desired user. Two methods of
granting command privileges can be used. You can grant all commands but and list only those commands the user cannot
execute or you can grant commands followed by the list of only those commands the user can issue. The specific command
privilege classes that can be granted (or not granted) are given in the table below. All other commands (including select) can
be issued by any user.
Table 12-1. Command Privilege Definitions
Command Class | Description
create database | Allows user to issue any DDL statement or create, alter, or drop trigger statement.
create view | Allows user to define his/her own views.
create procedure | Allows user to define his/her own stored procedures.
create trigger | Allows user to define triggers.
insert, update, or delete | Allows user to issue insert, update, or delete statements.
lock table | Allows user to issue lock and unlock table statements.
The example below grants permission for all users to issue any statements except DDL statements. It allows only the users
George and Martha to create databases.
grant all commands but create database to public;
grant all commands to George, Martha;
The next example restricts the user Jack to issuing select, update, and create view statements.
grant commands create view, update to "Jack";
12.1.2 Revoke Command Access Privileges
To rescind command privileges, an administrator can issue a revoke statement that identifies the specific commands that a user can no
longer issue. As with grant, two methods of specification are allowed as shown below. One form identifies the commands
from the restricted list that the user cannot use. The other form (all but) identifies the commands from the restricted list that
the user can use.
revoke:
        revoke cmd_spec from user_id[, user_id]...
cmd_spec:
        all commands [but command [, command]...]
    |   commands command [, command]...
command:
        create {database | proc[edure] | trigger}
    |   insert | update | delete
    |   lock table | unlock table
The privileges that are being revoked must have been previously granted. The specified privileges can be revoked from all
users (public) or be restricted from only the users listed in the revoke command.
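The following sketch shows the two forms of specification. It assumes that the user list is introduced by the keyword from, as in the table privilege examples in section 12.2.2, and that the named users were previously granted these command privileges; the user names are examples only.

```sql
-- first form: Martha can no longer issue DDL statements
revoke commands create database from Martha;

-- "all but" form: Jack keeps only insert, update, and delete
-- from the restricted list
revoke all commands but insert, update, delete from Jack;
```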
12.2 Database Access Privileges
12.2.1 Grant Table Access Privileges
Database access privileges allow an administrator or database owner to specify the database information and operations that a
particular user is allowed to access. You can assign user access privileges to database tables, views, and columns using the following form of the grant statement.
grant:
grant item_spec to {public | user_id[, user_id]...}
[with grant option] [cascade | restrict]
item_spec:
{privilege[, privilege]... | all [privileges] } on [dbname.]tabname
privilege:
select | delete | insert | update [(colname[, colname]...)] | trigger
The creator of a database (that is, the user who issued the create database statement) is the owner of that database. When a
database is created, only the owner and administrator users are allowed to access that database. The owner can grant other
users certain access privileges to the database. The grant statement is used to assign these access privileges to other users. Particular privileges can be granted to specific users or to all users (public). The with grant option grants the specified users the
right to issue other grant statements on the specified table. The cascade option indicates that the access privilege is to cascade down to the RDM Server core level access rights settings for the user. This only matters where the specified user(s) will
be executing application components that perform core-level access to the SQL database. The restrict option applies only to
SQL usage and is the default.
The types of access privileges are defined in the following table.
Table 12-2. Database Access Privilege Definitions
Privilege | Description
all privileges | Allows user all of the following access privileges on the table.
select | Allows user to issue select statements on the table.
insert | Allows user to insert rows into the table.
delete | Allows user to delete rows from the table.
update | Allows user to update any column of any row in the table.
update (colname [, colname ]...) | Allows user to update only the listed columns of any row in the table.
trigger | Allows user to create, alter, or drop a trigger on the table.
Note that users who are granted a trigger privilege on a table must also have the create database command privilege.
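A column-restricted update privilege and the with grant option can be combined in a single statement. The sketch below uses the salesperson table and its commission column, both mentioned in section 11.2.5; the user name is an example only.

```sql
-- Martha may query salesperson and change only the commission column,
-- and may issue grant statements of her own on this table
grant select, update(commission) on salesperson to Martha with grant option;
```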
In the example below, the system administrator or database owner is allowing all users privileges to issue select statements to
query invntory database tables. Only users George and Martha have permissions to modify the database.
grant select on invntory.product to public;
grant select on invntory.outlet to public;
grant select on invntory.on_hand to public;
grant all on invntory.product to George, Martha;
grant all on invntory.outlet to George, Martha;
grant all privileges on invntory.on_hand to George, Martha;
The following example illustrates how you can use a view to restrict access to a portion of a database table.
create view skk_customers as
select * from customer where sale_id = "SKK"
with check option;
grant all privileges on skk_customers to Sidney;
12.2.2 Revoke Table Access Privileges
The revoke statement is used to rescind a user's database table access privileges that had been previously granted. The syntax
for the revoke statement is shown below.
revoke:
revoke item_spec from {public | user_id[, user_id]...}
[with grant option] [cascade | restrict]
item_spec:
{privilege[, privilege]... | all [privileges] } on [dbname.]tabname
privilege:
select | delete | insert | update [(colname[, colname]...)] | trigger
The specified privileges can be revoked from all users (public) or be restricted from only the users specified in the revoke command. As with grant, the cascade option indicates that the access privilege is to cascade down to the RDM Server core level
access rights settings for the user. This only matters where the specified user(s) will be executing application components that
perform core-level access to the SQL database. The restrict option applies only to SQL usage and is the default.
In the example below, the system administrator or owner is revoking George's access privileges for several tables of the invntory database.
revoke insert, update, delete on product from George;
revoke insert, update, delete on outlet from George;
revoke insert, update, delete on on_hand from George;
The next example shows an rsql script in which user Martha loses access to the home_sales view after her select privilege on
that view is revoked.
.c 1 RDM Server Admin xyzzy
create view home_sales as select sale_name from salesperson where office = "SEA";
grant select on home_sales to Martha;
.c 2 RDM Server Martha HipposAreHip
select * from home_sales;
SALE_NAME
Flores, Bob
Porter, Greg
Stouffer, Bill
Blades, Chris
.d 2
.c 1
revoke select on home_sales from Martha;
.c 2 RDS Martha HipposAreHip
select * from home_sales;
****RSQL Diagnostic 4200: user access rights violation: home_sales
13. Using SQL in a C Application Program
You use the RDM Server SQL system from a C application program by making calls to the RDM Server SQL application programming interface (API) library functions. The RDM Server SQL API is based on the industry standard Open Database Connectivity API specification developed by Microsoft. A complete description of the ODBC standard is available on the Web
at: http://msdn.microsoft.com/en-us/library/ms710252(v=vs.85).aspx.
SQL statements are dynamically compiled and executed and result sets are retrieved from the RDM Server through these function calls. Raima has also included a variety of additional functions in order to support RDM Server specific capabilities.
The RDM Server ODBC functions allow you to connect to one or more RDM Servers on the network as depicted below in
Figure 13-1. A given client application program can have any number of active connections. You can even have more than
one connection from a client to one server. Since each connection has its own individual context, you can simultaneously be
processing active statements in multiple connections.
Figure 13-1. RDM Server Client-Server Application Architecture
SQL statements are compiled and executed using different functions. Once compiled, a statement can be repeatedly executed
without having to be recompiled. Statements can contain parameter markers that serve as placeholders for constant values
that are bound to program variables when the statement is executed. New sets of parameter values are assigned by simply
changing the value of the program variable before re-executing the statement.
The set of select statement result rows (result set) is retrieved a row at a time. The result columns can either be bound to program variables or individually retrieved one column at a time after each result row has been fetched.
Cursors can be defined for a select statement to support positioned updates and deletes. An update or delete statement can
refer to the cursor associated with an active select statement to modify a particular row of a table.
Several functions are provided through which you can interrogate RDM Server SQL about the nature of a compiled statement.
For example, you can find out how many columns are in the result set as well as information about each one (such as the
column name, type, and length). RDM Server SQL also provides additional function calls to utilize RDM Server SQL
enhancements or to simply provide information not included in the standard ODBC API. For example, RDM Server SQL
includes a non-ODBC function that will tell you the type of statement after the statement has been compiled (prepared).
One of the most powerful features of RDM Server SQL is its extensibility provided through its server-based programming capabilities. The RDM Server SQL API that is used in server-based programming has additional functions to support, for
example, User-Defined Function (UDF) and User-Defined Procedure (UDP) implementations as described in Developing SQL
Server Extensions.
13.1 Overview of the RDM Server SQL API
This section contains summary descriptions of all of the RDM Server SQL functions. The functions are organized into tables
from the following usage categories:
• Connecting to RDM Server database servers
• Setting and retrieving RDM Server SQL options
• Preparing (compiling) SQL statements
• Executing SQL statements
• Retrieving result information and data
• Terminating statements and transactions
• Terminating RDM Server connections
• System catalog access functions
• RDM Server SQL support functions
• ODBC support functions
Each client application program accesses RDM Server SQL client interface functions through handles which are initially allocated through a call to function SQLAllocHandle. There are four types of handles used in the ODBC API. An environment
handle is used to keep track of the RDM Server connections utilized by the client application. Each connection has an
associated connection handle that is first allocated and then passed to function SQLConnect to log in to a specified RDM
Server SQL server. All statements that are to be executed on that server are associated with that particular connection handle.
A statement handle is used to keep track of all of the information related to the compilation and execution of an SQL statement. A descriptor handle is used to keep track of information about columns and parameters.
The functions used to establish a connection with an RDM Server are described below.
Table 13-1. Server Connection Functions
Function Name | Purpose
SQLAllocHandle | Allocates an environment, connection, statement, or descriptor handle. Only one environment handle is used by each client program. Each environment handle can support multiple connections. Connection handles manage information related to one RDM Server connection. Statement handles manage information related to one RDM Server SQL statement. Descriptor handles are used to hold information about SQL statement parameters and result columns.
SQLConnect | Connects and logs in to the specified Raima Database Server with the specified user name and password.
SQLDriverConnect | Connects and logs in to the specified Raima Database Server with the specified user name and password. May prompt user for further information.
SQLSessionId | Called by the application to get the RDM Server session id associated with an SQL connection handle. The session id is used with the RDM Server remote procedure call function rpc_emCall to call an RDM Server extension module from a client application.
SQLConnectWith | Called by an extension module, UDF or UDP to get the RDM Server SQL connection handle associated with an RDM Server session id.
The RDM Server SQL system supports a variety of runtime operational control options (attributes). These include four levels
of multi-user locking and transaction isolation control. These options can be set for all of the statements executed on a particular connection or for a single statement and are managed using the following functions.
Table 13-2. SQL Control Attribute Functions
Function Name       Purpose
SQLGetEnvAttr       Returns an environment attribute setting.
SQLSetEnvAttr       Sets an environment attribute.
SQLGetConnectAttr   Returns a current connection attribute setting.
SQLSetConnectAttr   Sets a connection attribute.
SQLSetStmtAttr      Sets a statement attribute.
SQLGetStmtAttr      Returns a current statement attribute setting.
SQL statements are submitted to an RDM Server SQL server as text strings. As such, they need to be compiled into a form
that is suitable for efficient execution. All functions that involve some kind of operation on a specific SQL statement use the
same statement handle which must first be allocated via a call to SQLAllocHandle. The functions that are called before a
statement can be executed are listed below in Table 13-3.
Table 13-3. SQL Statement Preparation Functions
Function Name       Purpose
SQLAllocHandle      Allocates a statement handle.
SQLGetCursorName    Returns the cursor name associated with the statement handle.
SQLSetCursorName    Sets the cursor name for the statement handle.
SQLPrepare          Prepares an RDM Server SQL statement for execution.
SQLBindParameter    Binds a client program variable to a particular SQL parameter marker.
Execution of a previously prepared SQL statement is performed through a call to the SQLExecute function. You can both
prepare and execute a statement with a single call to the SQLExecDirect function. The execution control functions are listed in Table 13-4.
Table 13-4. Statement Execution Functions
Function Name       Purpose
SQLExecute          Executes a previously prepared statement.
SQLExecDirect       Prepares and executes a statement.
SQLNumParams        Returns the number of parameter markers in a statement.
SQLDescribeParam    Returns a description (e.g., data type) associated with a parameter marker.
SQLParamData        Used along with SQLPutData to provide parameter (usually blob data) values.
SQLPutData          Assigns a specific value (or the next chunk of a blob value) to a parameter.
Much of the work performed by an RDM Server SQL application will be associated with the processing of the data actually
retrieved from the database on the server. This work entails making inquiries to RDM Server SQL about the characteristics of
a compiled or executed statement, fetching results, and processing errors. The functions used in this regard are summarized
below.
Table 13-5. Results Processing Functions
Function Name       Purpose
SQLRowCount         Returns the number of rows affected by the last statement (insert, update, or delete).
SQLNumResultCols    Returns the number of columns in the select statement result set.
SQLDescribeStmt     Returns the type of statement that is associated with the specified statement handle.
SQLDescribeCol      Returns a description of a column in the select statement result set.
SQLColAttribute     Returns additional attribute descriptions of a column in the select statement result set.
SQLBindCol          Specifies the location of a client program variable into which a column result is to be stored.
SQLFetch            Retrieves the next row of select statement result.
SQLFetchScroll      Retrieves a rowset of select statement result rows.
SQLSetPos           Sets the cursor position within a static cursor.
SQLGetData          Returns a column value from the result set.
SQLMoreResults      Determines if there are more result sets to be processed and, if so, executes the next statement to initialize the result set.
SQLGetDiagField     Retrieves the current value of a field in the diagnostic record associated with the statement.
SQLGetDiagRec       Retrieves the current value of the diagnostic record associated with the statement.
SQLWhenever         Registers the address of a function in the client program that is to be called by RDM Server SQL whenever the specified error occurs.
SQLError            Returns error or status information.
The processing of a statement is terminated using several functions depending on the desired results. Database modification
statements (that is, insert, update, or delete) are terminated by either committing or rolling back the changes made during a
transaction. When you have finished your use of a statement handle, you should free the handle so that the system can free all
of the memory associated with it. The functions that perform these operations are described below.
Table 13-6. Statement Termination Functions
Function Name       Purpose
SQLFreeStmt         Ends statement processing and closes the associated cursor and discards pending results.
SQLCloseCursor      Closes the cursor on the statement handle.
SQLFreeHandle       Frees statement handle and all resources associated with the statement handle.
SQLCancel           Cancels an SQL statement.
SQLEndTran          Commits or rolls back a transaction.
A client application program ends by disconnecting from all servers that it is connected to and then freeing the connection
handles and the environment handle using the following functions.
Table 13-7. Connection Termination Functions
Function Name       Purpose
SQLDisconnect       Closes the connection.
SQLFreeHandle       Frees the connection or environment handle.
Several functions are provided which allow an SQL application to retrieve database definition information from the SQL system catalog. These functions each automatically execute a system-defined select statement or stored procedure that returns a
result set that can be accessed using SQLFetch.
Table 13-8. Catalog Access Functions
Function Name       Purpose
SQLTables           Retrieves result set of table definitions.
SQLColumns          Retrieves result set of column definitions.
SQLForeignKeys      Retrieves information about a table's foreign key columns.
SQLPrimaryKeys      Retrieves information about a table's primary key columns.
SQLSpecialColumns   Retrieves result set of columns that optimally access table rows.
SQLProcedures       Retrieves result set of available stored procedures.
SQLStatistics       Retrieves result set of statistics about a table and/or indexes.
The RDM Server SQL support functions are provided to facilitate use of the direct access capabilities of RDM Server. These
functions assist in retrieving rowid values that are automatically assigned by RDM Server SQL as well as allowing SQL programs to easily utilize the low-level RDM Server (Core API) function calls when necessary.
Table 13-9. RDM Server SQL Support Functions
Function Name       Purpose
SQLRowId            Returns the rowid of the current row.
SQLRowDba           Returns the RDM Server database address of the current row.
SQLDBHandle         Returns the RDM Server database handle for an open SQL database.
SQLRowIdToDba       Converts SQL rowid to RDM Server database address.
SQLDbaToRowId       Converts RDM Server database address to SQL rowid.
With the Microsoft ODBC specification, third-party front-end tool vendors can call functions that provide information describing the capabilities that are supported by a back-end database. SQLGetInfo, SQLGetTypeInfo, and SQLGetFunctions
can be called to discover the ODBC features that are supported in RDM Server SQL.
Table 13-10. ODBC Support Functions
Function Name       Purpose
SQLNativeSql        Translates ODBC SQL statement into RDM Server SQL.
SQLGetFunctions     Retrieves information about RDM Server SQL-supported functions.
SQLGetInfo          Retrieves information about RDM Server SQL-supported ODBC capabilities.
SQLGetTypeInfo      Retrieves result set of RDM Server SQL data types.
13.2 Programming Guidelines
This section gives an overview of the calling sequences for accessing the RDM Server SQL server through the RDM Server
SQL C API. Most of the function calls must be made in a particular sequence. In most cases, the sequence is quite natural.
The guidelines given below illustrate the calling sequences for several of the standard types of operations. Actual programming examples are provided in subsequent sections.
Figure 13-2 shows the sequence of calls required to connect to a particular RDM Server. The first call to SQLAllocHandle allocates an environment handle, which is then passed to a second call to SQLAllocHandle to allocate a connection handle. The connection handle is passed to SQLConnect, which in turn connects to the specified RDM Server. All of the activity associated with a particular connection is identified by the connection handle. RDM Server SQL allows an application to open any number of connections to any number of RDM Server systems.
Figure 13-2. Connecting to RDM Server Flow Chart
Function SQLDisconnect will close (log out from) the RDM Server connection. An error is returned if any uncommitted
transactions are pending on the connection. Any active statements are automatically freed for the specified connection. Before
calling SQLDisconnect, you should explicitly close (SQLFreeStmt) all active statements for a particular connection.
The connection handle can be reused in another SQLConnect or released by a call to SQLFreeHandle. When all connections have been closed and freed, a final call to SQLFreeHandle is made to free the environment handle.
A flow chart that gives a typical sequence of calls for processing a select statement is shown in Figure 13-3. A statement is
associated with a statement handle allocated by the call to SQLAllocHandle. Any number of statement handles (i.e., separate SQL statements) can be active at a time for a given connection. Statement handles are analogous to cursors when the
SQL statement associated with the statement handle is a select statement. Function SQLPrepare is used to compile (but not
execute) an SQL statement. If SQLPrepare is successful, you can call functions SQLDescribeCol and SQLNumResultCols to get information about the result columns such as the column name, data type and length. This is used so that the
appropriate host variables can be set up (through calls to SQLBindCol) to hold the column values for each result row.
Figure 13-3. Select Statement Processing Flow Chart
SQL statements can have embedded parameter markers. A parameter marker is specified by a '?' in a position that would normally take a literal constant. The host variable for each parameter value must be specified by a call to SQLBindParameter
before the statement is executed by SQLExecute.
Each row of the result set is retrieved one-at-a-time through the call to SQLFetch. When all rows have been fetched,
SQLCloseCursor is called to terminate processing of the select statement. In this example, the handle is then freed by the
call to SQLFreeHandle, so it can no longer be used. Alternatively, the statement could just be closed, terminating the current select statement execution but still allowing the statement to be reexecuted at a later time. Notice in this example that
statement compilation and execution are performed by separate functions. This allows the same statement to be executed multiple times without having to recompile it. For example, you might specify a different set of parameter values for each subsequent execution. The flow chart shown below shows a modified segment of the prior flow chart to indicate how this is
done.
Figure 13-4. Select Statement Re-Execution Flow Chart
Figure 13-5 gives a flow chart showing a sequence of calls that perform a positioned update. A positioned update statement
involves the use of two statement handles. The one associated with the select statement is the cursor. The update statement is
executed through the other statement handle once the cursor has been positioned to the desired row.
The particular cursor on which the update is performed can be specified two ways. In this example, function SQLSetCursorName is called to specify a user-defined cursor name. That name would then need to be referenced in the where current of
clause in the update statement text compiled by the call to SQLPrepare. Alternatively, function SQLGetCursorName
could be called to retrieve a system-generated cursor name which would need to be incorporated into the update statement
string prior to the call to SQLPrepare used to compile it.
When all updates have been completed, function SQLEndTran is called to commit the changed rows to the database. This
could also be done by a call to SQLExecDirect to compile and execute a commit statement.
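The step of incorporating a cursor name into the update statement text can be sketched as follows. This is only an illustration of the string handling: the helper name build_positioned_update is hypothetical, and any table, column and cursor names passed to it are the caller's own. The resulting text would be passed to SQLPrepare on the second statement handle.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build
 *   "update <table> set <assignments> where current of <cursor>"
 * Returns 0 on success, or -1 if the text does not fit in buf.
 */
int build_positioned_update(char *buf, size_t size, const char *table,
                            const char *assignments, const char *cursor)
{
    int n;

    assert(buf != NULL && size > 0);
    assert(table != NULL && assignments != NULL && cursor != NULL);

    n = snprintf(buf, size, "update %s set %s where current of %s",
                 table, assignments, cursor);
    /* snprintf reports the length it needed; reject truncated text */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The same helper works whether the cursor name was set with SQLSetCursorName or retrieved with SQLGetCursorName, since in both cases the name is simply a string placed after the where current of clause.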
Figure 13-5. Positioned Update Flow Chart
13.3 ODBC API Usage Elements
This section describes the basic elements that are used by the RDM Server SQL ODBC API functions. Included are descriptions of the standard header files, data type and constant definitions contained in those header files that are used in the function calls for argument types, indicator and descriptor variables, and status return codes.
13.3.1 Header Files
Your RDM Server SQL C application must include at least one of the three standard header files described below in Table
13-11. The files can be found in the RDM Server include directory.
Table 13-11. RDM Server SQL Header Files
File                Description
sql.h               Standard ODBC Core-level header file. Includes prototypes, data and constant definitions for the ODBC Core-level functions. Automatically included by sqlext.h.
sqlext.h            Microsoft ODBC levels 1 and 2 extensions header file. Includes prototypes and data definitions for the ODBC level 1 and 2 functions. Automatically included by sqlrds.h.
sqlrds.h            RDM Server SQL main header file. Includes prototypes and data definitions for all functions used with RDM Server SQL.
Inclusion of sqlrds.h in your application provides access to all RDM Server SQL capabilities. You can include sql.h
with your application to ensure its conformance to only the ODBC Core-level specification. Include sqlext.h to ensure
conformance to the full ODBC specification. The sqlext.h file automatically includes sql.h. The sqlrds.h file automatically includes sqlext.h.
The sql.h and sqlext.h files in the RDM Server include directory are our own developed versions of the same files
that are part of the Microsoft ODBC SDK.
13.3.2 Data Types
SQL API uses a special set of type definitions. Rather than relying on the base types defined in the ANSI C programming language, data types that map into the standard C data types have been defined. The function arguments have been specified
using these ODBC-defined data types. The application variables that you pass in to the functions must be declared with the
proper data type. Those defined as either int64 or int32 will be int64 on a 64-bit RDM Server installation otherwise
int32. There are also some RDM Server-specific data types that are not included in the table. These are described in the
SQL API Reference Manual in the descriptions of the functions that use them.
Table 13-11. SQL API Data Type Descriptions
Type Name                               Description
SQLHANDLE                               Generic handle ("void *"). Can be any one of the four handle types: SQLHENV, SQLHDBC, SQLHSTMT, SQLHDESC.
SQLHENV                                 Environment handle.
SQLHDBC                                 Connection handle.
SQLHSTMT                                Statement handle.
SQLHDESC                                Descriptor handle.
SQLPOINTER                              Generic pointer variable: "void *".
SQLLEN                                  Signed buffer/string length variable: int64 or int32.
SQLULEN                                 Unsigned buffer/string length variable: int64 or int32.
SQLCHAR                                 Standard character: unsigned char.
SQLWCHAR                                Wide character: usually wchar_t.
SQLSMALLINT                             int16.
SQLUSMALLINT                            uint16.
SQLINTEGER                              int32.
SQLUINTEGER                             uint32.
SQLBIGINT                               int64.
SQLUBIGINT                              uint64.
SQLREAL                                 float.
SQLFLOAT                                double.
SQLDOUBLE                               double.
SQLDECIMAL                              unsigned char (byte array).
SQLNUMERIC                              unsigned char (byte array).
SQLDATE                                 unsigned char (string).
SQLTIME                                 unsigned char (string).
SQLTIMESTAMP                            unsigned char (string).
SQLVARCHAR                              unsigned char (string).
SQLRETURN                               Function return code: int16.
DATE_STRUCT, SQL_DATE_STRUCT            Unpacked date struct.
TIME_STRUCT, SQL_TIME_STRUCT            Unpacked time struct.
TIMESTAMP_STRUCT, SQL_TIMESTAMP_STRUCT  Unpacked timestamp struct.
13.3.3 Use of Handles
The RDM Server SQL application uses several ODBC-defined handles. Introduced in ODBC 3, the SQLAllocHandle function is used to allocate all of the handles. An environment handle (SQLHENV type) is allocated by passing SQL_HANDLE_ENV as the handle type. Although only one environment handle is required, more may be allocated if needed. Before executing any other ODBC function using the environment handle, SQLSetEnvAttr should be called with SQL_ATTR_ODBC_VERSION to set the version of ODBC that the application will use. A connection handle (SQLHDBC type) is allocated by passing SQL_HANDLE_DBC as the handle type. The application can open any number of connections to any number of RDM Servers, with each connection referenced through a separate connection handle. A statement handle (SQLHSTMT type) is allocated by passing SQL_HANDLE_STMT as the handle type. There is no restriction on the number of RDM Server SQL statement handles your application can use. However, to conserve server memory, it is good practice to keep the number of active statement handles to a minimum. Lastly, a descriptor handle (SQLHDESC type) is allocated by passing SQL_HANDLE_DESC as the handle type.
When your RDM Server SQL application needs to call a Core API function or a server-side extension module, it must use a
Core session handle (RDM_SESS type) associated with the active server connection handle. (Each connection corresponds to
a single RDM Server login session.) The application calls SQLSessionId to retrieve the session handle.
For a particular server connection, your RDM Server SQL application might also need to call SQLDBHandle to obtain the
database handle (RDM_DB type) that RDM Server uses for a Core database. With this handle, the application can use the
runtime API (bypassing RDM Server SQL) to access database information.
13.3.4 Buffer Arguments
The usage rules for passing buffer arguments are as follows:
• Each RDM Server function argument pointing to a string or data buffer has an associated length argument.
• An input length argument contains the actual length of a string or buffer. If the application specifies a length value of SQL_NTS (an ODBC-specified negative constant meaning "null-terminated string"), the pointer must address a null-terminated string. A length greater than or equal to zero implies the input string is not null-terminated (for example, if your application is written in Pascal).
• Two length arguments are used for an output buffer. The first argument provides the size (in bytes) of the output buffer. The second argument is a pointer to a variable in which RDM Server returns the number of bytes actually written to the buffer. If the value is null, the function sets the result length to the ODBC negative constant SQL_NULL_DATA.
• All RDM Server API functions that fill output buffers with character data write null-terminated strings. The buffers your application provides to receive this data must be long enough to hold the terminal null byte.
• A null can be passed for the result output argument as long as the database schema does not allow a null for the result field. If a null field is allowed in the database schema, then ODBC 3.51 requires the result output argument be supplied. If not supplied, errNOINDVAR will be returned.
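The length conventions above can be illustrated with a small mock that does not call RDM Server at all. Everything in this sketch is a stand-in: EX_NTS plays the role of SQL_NTS (the real constant comes from sql.h), and input_length and fill_output are hypothetical helpers that only imitate how an API function would interpret an input length and fill an output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_NTS (-3L)   /* stand-in for SQL_NTS; use the sql.h constant in real code */

/* Input rule: EX_NTS means the string is null-terminated, so measure it;
   any length >= 0 is taken as the exact number of bytes supplied. */
size_t input_length(const char *str, long len)
{
    assert(str != NULL);
    assert(len == EX_NTS || len >= 0);
    return (len == EX_NTS) ? strlen(str) : (size_t)len;
}

/* Output rule: write at most buf_size - 1 bytes of the len-byte value src,
   always add the terminating null byte, and report the bytes written
   through out_len when the caller supplied that pointer. */
void fill_output(char *buf, size_t buf_size, const char *src, size_t len,
                 long *out_len)
{
    size_t n;

    assert(buf != NULL && buf_size > 0 && src != NULL);
    n = (len < buf_size - 1) ? len : buf_size - 1;
    memcpy(buf, src, n);
    buf[n] = '\0';
    if (out_len != NULL)
        *out_len = (long)n;
}
```

The point of the mock is the sizing rule: a buffer meant to receive an 8-character value needs at least 9 bytes, because the terminating null byte always takes one.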
13.4 SQL C Application Development
13.4.1 RDM Server SQL and ODBC
RDM Server includes an ODBC driver and related files so that RDM Server can be accessed by Microsoft Windows applications through the ODBC Driver Manager (DM). The driver is installed through the instodbc utility for Microsoft Windows. Connecting and using a driver through the ODBC DM is described in detail in the Microsoft ODBC manual. Details of
driver installation are given where applicable in Installing the Server Software and Installing RDM Server Client Software,
from the RDM Server Installation / Administration Guide.
Note that your application does not need to operate with the ODBC DM if RDM Server SQL is the only type of data source
used. Since the ODBC API is the native API for RDM Server SQL, you can simply link an ODBC-compliant application directly to the RDM Server SQL API in the client library, eliminating the overhead of the ODBC Driver Manager.
13.4.2 Connecting to RDM Server
An application uses the following basic steps to connect to an RDM Server:
1. Call SQLAllocHandle to allocate the environment handle for the program.
2. Call SQLAllocHandle to allocate a connection handle.
3. Call SQLConnect to connect the user to a specific RDM Server.
Perform the following basic steps to end your RDM Server session:
1. Call SQLDisconnect to terminate the client connection to the server.
2. Call SQLFreeHandle to free the connection handle.
3. Call SQLFreeHandle to free the environment handle.
The example below illustrates the use of these function calls. Note that the type definitions for the environment and connection handles are declared in the standard header file sql.h. Also note the use of the constant SQL_NTS for the length
arguments in the call to SQLConnect to indicate that each of the char arguments is a standard C null-terminated string.
#include "sql.h"

char user[15];      /* user name */
char pw[15];        /* password */
SQLHENV eh;         /* environment handle */
SQLHDBC ch;         /* connection handle */
SQLHSTMT sh;        /* statement handle */
SQLRETURN stat;     /* function return code */
...
SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &eh);
SQLAllocHandle(SQL_HANDLE_DBC, eh, &ch);

/* fetch user name and password */
ClientLogin(user, pw);

/* connect to MIS server */
stat = SQLConnect(ch, (SQLCHAR *)"MISserver", SQL_NTS,
                  (SQLCHAR *)user, SQL_NTS, (SQLCHAR *)pw, SQL_NTS);
if (stat != SQL_SUCCESS)
    return ErrHandler();
...
/* run MIS application */

/* log out, then free the connection and environment handles */
SQLDisconnect(ch);
SQLFreeHandle(SQL_HANDLE_DBC, ch);
SQLFreeHandle(SQL_HANDLE_ENV, eh);
You can establish connections to any RDM Server system that is available on the network. RDM Server SQL maintains each
connection in a separate task context. You can even have multiple connections to the same RDM Server system where, for
example, you may need to have separate task contexts to the same server.
13.4.3 Basic SQL Statement Processing
An application often uses the following basic steps in processing RDM Server SQL statements:
• Calls SQLAllocHandle (with SQL_HANDLE_STMT) to allocate a statement handle. This statement handle will be used in all the following steps.
• Calls SQLPrepare to compile the statement.
• If processing a select statement, calls SQLBindCol to bind column results to host program variables.
• Calls SQLBindParameter, if necessary, to associate a host variable with a parameter referenced in the statement.
• Calls SQLExecute to execute the statement. This is usually the end of processing for any statement other than select.
• For select statements, calls SQLFetch or SQLFetchScroll to retrieve the result set.
• When finished with the handle, calls SQLFreeHandle (with SQL_HANDLE_STMT) to free it. If reusing the handle, calls SQLCancel or SQLFreeStmt with the SQL_CLOSE setting. Alternatively, you may call SQLCloseCursor to close an open cursor for reuse.
An application can call SQLExecDirect instead of making separate calls to SQLPrepare and SQLExecute but only
when a single call to SQLExecute would be needed.
The following example shows a simple statement execution sequence that opens the example sales and invntory databases. It
consists of a call to SQLAllocHandle to allocate a statement handle, a call to SQLExecDirect to compile and execute
the open statement, and a call to SQLFreeHandle to drop the statement handle. The OpenDatabases function assumes that
the server connection handle is valid. If it is not valid, SQLAllocHandle returns the SQL_INVALID_HANDLE error code.
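#include "sqlext.h"

SQLRETURN OpenDatabases(SQLHANDLE dbc)
{
    SQLRETURN stat;
    SQLHANDLE hstmt;

    /* allocate the statement handle; this fails if dbc is not valid */
    if ((stat = SQLAllocHandle(SQL_HANDLE_STMT, dbc, &hstmt)) == SQL_SUCCESS) {
        /* compile and execute the open statement, then drop the handle */
        stat = SQLExecDirect(hstmt, (SQLCHAR *)"open sales, invntory", SQL_NTS);
        SQLFreeHandle(SQL_HANDLE_STMT, hstmt);
    }
    return stat;
}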
13.4.4 Using Parameter Markers
To save processing time, your application can compile a statement once, by calling SQLPrepare, and then execute the statement multiple times with calls to SQLExecute. If the statement has embedded parameter markers ("?") in it, different values
can be substituted for these parameters before each statement execution. The application calls the SQLBindParameter function
before statement execution to associate host program variables with parameter markers. Each time the application calls SQLExecute for the statement, the current values from the bound variables are substituted in the statement for the parameter markers.
The next example uses parameter markers with an insert statement to insert rows in the product table from data input by a
user. The input is gathered by the local function GetValues, which sets the bound variables to the appropriate values. The
SQLExecute call then executes the insert statement using the current values in the variables.
In this example, note that the call to the SQLGetDiagRec function retrieves the sqlstate code and error message in the event
that SQLExecute returns an error. The example also illustrates the use of SQLEndTran to commit or roll back database
changes based on the occurrence of an error.
#include <stdio.h>
#include <stdlib.h>
#include "sqlext.h"

char insert[] = "insert into product(prod_id, prod_desc, price, cost) values(?,?,?,?)";

SQLHANDLE henv;
SQLHANDLE hdbc;
SQLHANDLE hstmt;
SQLHANDLE error_handle;

int16 prod_id;
char prod_desc[40];
double price, cost;
int16 handle_type;

/* local functions defined elsewhere in the application */
void ClientLogin(char *user, char *pw);
int GetValues(int16 *prod_id, char *prod_desc, double *price, double *cost);

int main(void)
{
    FILE *txtFile;
    char sqlstate[6], emsg[80];
    char user[15], pw[8];
    int32 lineno = 0;
    SQLRETURN stat;
    SQLUSMALLINT txtype;

    if ((txtFile = fopen("product.txt", "r")) == NULL) {
        fprintf(stderr, "unable to open file\n");
        return EXIT_FAILURE;
    }
    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv);
    SQLSetEnvAttr(henv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, SQL_IS_INTEGER);
    SQLAllocHandle(SQL_HANDLE_DBC, henv, &hdbc);

    /* fetch user name and password */
    ClientLogin(user, pw);

    /* connect to MIS server */
    if ((stat = SQLConnect(hdbc, (SQLCHAR *)"MIS", SQL_NTS, (SQLCHAR *)user, SQL_NTS,
                           (SQLCHAR *)pw, SQL_NTS)) != SQL_SUCCESS) {
        handle_type = SQL_HANDLE_DBC;
        error_handle = hdbc;
        goto quit;
    }
    SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);
    SQLPrepare(hstmt, (SQLCHAR *)insert, SQL_NTS);

    /* bind one host variable to each of the four parameter markers */
    SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_SHORT, SQL_SMALLINT,
                     0, 0, &prod_id, 0, NULL);
    SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR, SQL_CHAR,
                     40, 0, prod_desc, 40, NULL);
    SQLBindParameter(hstmt, 3, SQL_PARAM_INPUT, SQL_C_DOUBLE, SQL_FLOAT,
                     0, 0, &price, 0, NULL);
    SQLBindParameter(hstmt, 4, SQL_PARAM_INPUT, SQL_C_DOUBLE, SQL_FLOAT,
                     0, 0, &cost, 0, NULL);

    handle_type = SQL_HANDLE_STMT;
    error_handle = hstmt;

    /* execute the insert once for each set of input values */
    while (GetValues(&prod_id, prod_desc, &price, &cost)) {
        ++lineno;
        if ((stat = SQLExecute(hstmt)) != SQL_SUCCESS)
            break;
    }

quit:
    if (stat == SQL_SUCCESS)
        txtype = SQL_COMMIT;
    else {
        SQLGetDiagRec(handle_type, error_handle, 1, (SQLCHAR *)sqlstate, NULL,
                      (SQLCHAR *)emsg, 80, NULL);
        printf("***Line %d - ERROR(%s): %s\n", (int)lineno, sqlstate, emsg);
        txtype = SQL_ROLLBACK;
    }
    /* commit or roll back transaction */
    SQLEndTran(SQL_HANDLE_DBC, hdbc, txtype);
    fclose(txtFile);
    return 0;
}
13.4.4 Premature Statement Termination
Your application terminates a database modification statement (insert, update, or delete) by either committing or rolling back
the changes made during the transaction. When the application finishes using a statement handle, it should free the handle so
that RDM Server can free all associated memory.
The application terminates processing of a select statement by a call to SQLFreeStmt, with the SQL_CLOSE option,
SQLCloseCursor, or SQLCancel (see the following example). Any result rows that the application has not fetched are
thrown out at this time.
#include <stdio.h>
#include "sqlext.h"
...
/* Print all rows of a table */
SQLRETURN PrintTable(
    SQLHDBC svr,      /* server connection handle */
    char *tabname)    /* name of table whose rows are to be printed */
{
    char stmt[80];
    SQLHSTMT sh;
    ...
    /* set up and compile select statement (bounded to the size of stmt) */
    snprintf(stmt, sizeof(stmt), "select * from %s", tabname);
    SQLAllocHandle(SQL_HANDLE_STMT, svr, &sh);
    if (SQLExecDirect(sh, (SQLCHAR *)stmt, SQL_NTS) != SQL_SUCCESS)
        return ErrHandler();
    ...
    /* print all rows in table */
    while (SQLFetch(sh) == SQL_SUCCESS) {
        ...
        if (cancelled_by_user) {
            /* discard the unfetched rows and stop fetching */
            SQLCancel(sh);    /* or, SQLFreeStmt(sh, SQL_CLOSE); */
            break;
        }
        ...
    }
    return SQL_SUCCESS;
}
13.4.5 Retrieving Date/Time Values
Your application can access and manipulate RDM Server SQL-specific date and time values at the runtime (d_) level with the
VAL functions found in the RDM Server SQL API. These functions enable you to translate a date or time value from its native packed format (types *_VAL) into the ODBC format (types *_STRUCT), or vice versa.
To use the date or time manipulation functions, your application must include the sqlrds.h file. Error codes that can be
returned by these functions are defined in the valerrs.h header file, which will be automatically included when you
include sqlrds.h. These files are found in the RDM Server include directory.
Caution: Records changed via the RDM Server runtime API ignore SQL constraint checking. Therefore, if your application
defines column constraints, it must validate the values before writing to the database.
Note that the *_VAL data types (DATE_VAL, TIME_VAL, etc.) are always associated with internal storage format, while the
*_STRUCT data types (DATE_STRUCT, etc.) are standard ODBC types. Both types are defined in RDM Server Reference
Manual. Note also that the structure definitions are not currently produced by the ddlproc schema compiler; your application must declare them.
13.4.6 Retrieving Decimal Values
RDM Server SQL provides support for the ODBC date, time, and timestamp data types. Database columns of those types can
be returned in struct variables of type DATE_STRUCT, TIME_STRUCT, or TIMESTAMP_STRUCT. These structure types
are declared in sqlext.h as shown below.
typedef struct tagDATE_STRUCT {
    SQLSMALLINT  year;      /* year (>= 1 A.D., for example, 1993) */
    SQLUSMALLINT month;     /* month number: 1 to 12 */
    SQLUSMALLINT day;       /* day of month: 1 to 31 */
} DATE_STRUCT;

typedef struct tagTIME_STRUCT {
    SQLUSMALLINT hour;      /* hour of day: 0 to 23 */
    SQLUSMALLINT minute;    /* minute of hour: 0 to 59 */
    SQLUSMALLINT second;    /* second of minute: 0 to 59 */
} TIME_STRUCT;

typedef struct tagTIMESTAMP_STRUCT {
    SQLSMALLINT  year;      /* year (>= 1 A.D., for example, 1993) */
    SQLUSMALLINT month;     /* month number: 1 to 12 */
    SQLUSMALLINT day;       /* day of month: 1 to 31 */
    SQLUSMALLINT hour;      /* hour of day: 0 to 23 */
    SQLUSMALLINT minute;    /* minute of hour: 0 to 59 */
    SQLUSMALLINT second;    /* second of minute: 0 to 59 */
    SQLUINTEGER  fraction;  /* billionths of a second: 0 to 999,900,000
                               (RDM Server SQL accurate to 4 places only) */
} TIMESTAMP_STRUCT;

typedef DATE_STRUCT      SQL_DATE_STRUCT;
typedef TIME_STRUCT      SQL_TIME_STRUCT;
typedef TIMESTAMP_STRUCT SQL_TIMESTAMP_STRUCT;
The DATE_STRUCT name has been changed in ODBC 3 to SQL_DATE_STRUCT but either can be used.
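The relationship between the fraction field (billionths of a second) and the four-place accuracy noted in the structure comments can be sketched as follows. This is only an illustration: ExTimestamp is a hypothetical stand-in that mirrors the TIMESTAMP_STRUCT layout with standard C types so that the fragment compiles without sqlext.h, and format_timestamp is an illustrative helper name, not an RDM Server function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical mirror of TIMESTAMP_STRUCT using standard C types. */
typedef struct {
    short          year;
    unsigned short month, day, hour, minute, second;
    unsigned long  fraction;   /* billionths of a second */
} ExTimestamp;

/* Format ts as "YYYY-MM-DD HH:MM:SS.ffff" (24 characters plus '\0').
   Returns the number of characters written, or -1 if a field is out
   of range or buf is too small. */
int format_timestamp(const ExTimestamp *ts, char *buf, size_t size)
{
    int n;

    assert(ts != NULL && buf != NULL);
    if (ts->year < 1 || ts->month < 1 || ts->month > 12 ||
        ts->day < 1 || ts->day > 31 || ts->hour > 23 ||
        ts->minute > 59 || ts->second > 59 || ts->fraction > 999999999UL)
        return -1;

    /* keep 4 decimal places: one ten-thousandth of a second is
       100,000 billionths */
    n = snprintf(buf, size, "%04d-%02u-%02u %02u:%02u:%02u.%04lu",
                 (int)ts->year, (unsigned)ts->month, (unsigned)ts->day,
                 (unsigned)ts->hour, (unsigned)ts->minute,
                 (unsigned)ts->second, ts->fraction / 100000UL);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The range checks follow the comments in the structure declarations; they do not validate the day against the month, which an application would likely want to add.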
Use of date and time data is shown in the example below, which prints the year-to-date sales orders for a particular customer.
In this example, SQLBindCol is called to request the column values in their native data type.
13.4.7 Retrieving Decimal Data
The RDM Server SQL support module stores decimal values in a proprietary BCD format. RDM Server provides a library of
functions (BCD- prefix) your application can call to manipulate values stored in this format. The functions allow the application to convert between a string representation of a decimal value (for example, "123.4567") and the internal RDM Server
BCD format, as well as to perform all of the usual decimal arithmetic.
To call a decimal manipulation function, your application must first allocate a BCD environment handle, specifying the maximum precision and scale for the values you will manipulate, as shown in the code example below. The application passes
this handle to any of the decimal manipulation functions that it calls.
The application can set any BCD value needed. However, if the application must store BCD values directly in the database,
the maximum precision and scale you use must be identical to that specified by the RDM Server SQL support module. The
following code fragment shows how your application can determine from the syscat (system catalog) database what the system values are for these parameters.
...
int16 maxprecision, maxscale, bcd_len;
char *bcd_buf;
BCD_HENV hBcd;
/* determine the max precision and scale on the server */
SQLExecDirect(hStmt, "select maxprecision, maxscale from sysparms", SQL_NTS);
SQLBindCol(hStmt, 1, SQL_SMALLINT, &maxprecision, sizeof(int16), NULL);
SQLBindCol(hStmt, 2, SQL_SMALLINT, &maxscale, sizeof(int16), NULL);
SQLFetch(hStmt);
SQLFreeStmt(hStmt, SQL_CLOSE);
printf("max precision = %hd, max scale = %hd\n", maxprecision, maxscale);
/* allocate a BCD environment corresponding to configuration on server */
BCDAllocEnv((unsigned char)maxprecision, (unsigned char)maxscale, &hBcd);
/* allocate a buffer to contain the decimal string */
bcd_len = maxprecision+3; /* sign, decimal point, and terminating null byte */
bcd_buf = malloc(bcd_len);
...
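The buffer sizing rule used in the fragment above (precision plus three bytes) and the meaning of precision and scale can be sketched in plain C. The helper names decimal_buf_len and decimal_fits are hypothetical; they do not call the BCD library and only illustrate what a string such as "123.4567" must satisfy to fit a given precision and scale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Bytes needed for a decimal string: digits + sign + decimal point + '\0'. */
size_t decimal_buf_len(int precision)
{
    assert(precision > 0);
    return (size_t)precision + 3;
}

/* Return 1 if s is a well-formed decimal string with at most `precision`
   digits in total and at most `scale` digits after the decimal point,
   otherwise 0. Leading zeros are counted as digits (a simplification). */
int decimal_fits(const char *s, int precision, int scale)
{
    int digits = 0, frac = 0, seen_point = 0;

    assert(s != NULL);
    if (*s == '+' || *s == '-')
        ++s;
    for (; *s != '\0'; ++s) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else if (isdigit((unsigned char)*s)) {
            ++digits;
            if (seen_point)
                ++frac;
        } else {
            return 0;   /* unexpected character or second decimal point */
        }
    }
    return digits > 0 && digits <= precision && frac <= scale;
}
```

An application might run a check of this kind on user input before handing the string to the conversion functions, using the maxprecision and maxscale values read from sysparms.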
13.4.8 Status and Error Handling
RDM Server returns to your RDM Server SQL application the codes and messages described in the Return Codes and Error
Messages section. RDM Server SQL API return code constants are defined in sql.h; these return codes are prefixed by
"SQL_".
Each RDM Server SQL API function returns a code indicating the success or failure of the operation. If an error occurs, your
application must call SQLGetDiagField or SQLGetDiagRec for details about the error.
The RDM Server SQL API provides a nonstandard function called SQLSetErrorFcn that your application can call to specify its own error handler. The prototype is shown below. SQLSetErrorFcn can be called any number of times to specify
the same or different handlers for different handles or error codes.
Your application error handler is called by an RDM Server SQL API function that has produced an error. The following is the
prototype for the application error handler function (sqlrds.h file).
int32 REXTERNAL ErrorHandler(int16 handleType, SQLHANDLE handle, int32 code)
where:
handleType   (input)   Specifies the type of the input handle.
handle       (input)   Specifies the input handle.
code         (input)   Specifies the status/error code.
The calling RDM Server SQL function passes into the ErrorHandler the appropriate handle type and handle for the given
error. For example, if SQLExecute detects an error, it will pass into ErrorHandler SQL_HANDLE_STMT as the handle type
and the statement handle associated with the error. The RDM Server SQL API function also provides the status or error code
to the ErrorHandler function; the error handler then can call SQLGetDiagField or SQLGetDiagRec to retrieve detailed
information about the status or error. The return from ErrorHandler becomes the status code returned by the originally called
RDM Server SQL API function. Normally, the return value is simply equal to the value of the code parameter.
The following example shows the simplest use of the SQLSetErrorFcn and ErrorHandler functions. The call to SQLSetErrorFcn passes a valid connection handle and SQL_ERROR as the error code. This causes the automatic calling of the error
handler by RDM Server SQL API functions for all errors associated with the specified connection, including functions that reference statement handles allocated from the connection.
#include <stdio.h>
#include "sqlrds.h"
/* SQLSetErrorFcn is a Birdstep extension */
int32 REXTERNAL ErrHandler(
    int16     hType,
    SQLHANDLE handle,
    int32     code)
{
    SQLINTEGER rsqlcode;
    SQLCHAR    buf[80], sqlstate[6];

    /* retrieve the first diagnostic record for the failing handle */
    SQLGetDiagRec(hType, handle, 1, sqlstate, &rsqlcode, buf, sizeof(buf), NULL);
    printf("****RSQL Error %ld: %s\n", (long)rsqlcode, buf);
    return code;
}
...
SQLSetErrorFcn(SQL_HANDLE_DBC, hdbc, SQL_ERROR, ErrHandler);
...
The next example shows how your application can specify separate error handlers for different errors. The first call to
SQLSetErrorFcn registers the standard error handler (ErrHandler) from the previous example. The second call to
SQLSetErrorFcn registers a handler for a specific error code, errINVCONVERT.
When an error occurs, the RDM Server SQL support module checks to see if there is a handler registered for the associated
error code and statement handle. If not, it then checks for a handler for the code and the connection handle. Then, if a handler
is still not found, the support module checks for a handler for the return code (for example, SQL_ERROR) corresponding to
the statement handle or connection handle.
#include <stdio.h>
#include "sqlrds.h"
/* SQLSetErrorFcn is a Birdstep extension */
/* ================================================================
   Invalid data type conversion
*/
int32 REXTERNAL BadConvert(
    int16     hType,
    SQLHANDLE handle,
    int32     code)
{
    /* My error message is better */
    printf("**** A type conversion specified in SQLBindCol ");
    printf("or SQLBindParameter call is not valid.\n");
    return code;
}
...
/* Register standard error handler for the connection */
SQLSetErrorFcn(SQL_HANDLE_DBC, hdbc, SQL_ERROR, ErrHandler);
/* Register invalid conversion handler for the same connection */
SQLSetErrorFcn(SQL_HANDLE_DBC, hdbc, errINVCONVERT, BadConvert);
...
In this example, an errINVCONVERT error on any statement handle associated with the server connection handle (hdbc) will
result in a call to BadConvert. For any other error on that connection, the ErrHandler function is called.
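The lookup order described above can be modeled in a few lines of plain C. This is a sketch of the documented order only, not the support module's implementation: the ExRegistration table, the handler names, and resolve_handler are all hypothetical, and the choice to try the statement handle before the connection handle in the final return-code step is an assumption.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*ExHandler)(int code);

/* One registration: a handler for a given code on a given handle. */
typedef struct {
    const void *handle;   /* statement or connection handle */
    int         code;     /* specific error code or general return code */
    ExHandler   fcn;
} ExRegistration;

/* Two sample handlers; each simply passes the code through. */
int ex_general_handler(int code) { return code; }
int ex_convert_handler(int code) { return code; }

static ExHandler find_handler(const ExRegistration *reg, size_t n,
                              const void *handle, int code)
{
    size_t i;
    for (i = 0; i < n; ++i)
        if (reg[i].handle == handle && reg[i].code == code)
            return reg[i].fcn;
    return NULL;
}

/* Try (error code, statement), (error code, connection), then the
   general return code on the statement and on the connection. */
ExHandler resolve_handler(const ExRegistration *reg, size_t n,
                          const void *stmt, const void *conn,
                          int error_code, int return_code)
{
    const void *handles[2];
    int codes[2];
    int c, h;

    assert(reg != NULL || n == 0);
    handles[0] = stmt;        handles[1] = conn;
    codes[0]   = error_code;  codes[1]   = return_code;
    for (c = 0; c < 2; ++c) {
        for (h = 0; h < 2; ++h) {
            ExHandler fcn = find_handler(reg, n, handles[h], codes[c]);
            if (fcn != NULL)
                return fcn;
        }
    }
    return NULL;   /* no handler registered */
}
```

With one general handler and one conversion handler registered on the connection, a conversion error raised on any of its statements resolves to the conversion handler, and every other error falls through to the general one.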
Caution: This particular case was created only to illustrate the use of separate error handlers. You should
not define a separate handler for each error code to output a more readable error message. It is far more
efficient to use a table in a single error handler.
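A table-driven lookup of the kind recommended above might look like the following sketch. The numeric codes are placeholders rather than actual RDM Server error codes, the second message is invented for illustration, and lookup_message is a hypothetical helper that a single error handler could call before printing.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int         code;   /* error code (placeholder values below) */
    const char *text;   /* application's preferred message */
} ExMessage;

/* Hypothetical code values; a real table would use the constants
   from the RDM Server header files. */
static const ExMessage ex_messages[] = {
    { 4005, "A type conversion specified in SQLBindCol or "
            "SQLBindParameter call is not valid." },
    { 4010, "The requested table does not exist." }
};

/* Return the application's message for code, or NULL if the table has
   no entry (the caller can then fall back to the diagnostic text). */
const char *lookup_message(int code)
{
    size_t i;
    for (i = 0; i < sizeof(ex_messages) / sizeof(ex_messages[0]); ++i)
        if (ex_messages[i].code == code)
            return ex_messages[i].text;
    return NULL;
}
```

A single handler registered for SQL_ERROR can consult such a table and print the standard diagnostic message when the lookup returns NULL, which avoids registering one handler per error code.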
13.4.9 Select Statement Processing
The application associates a select statement with a statement handle allocated by a call to SQLAllocHandle. After allocating the statement handle, the application calls SQLPrepare to compile (but not to execute) the statement. When compilation is successful, the application can call SQLDescribeCol and SQLNumResultCols to get information about the result columns,
such as the column name and data type. This information can be used in the call to SQLBindCol to set up host variables to
hold the column values for each result row.
Before statement execution, your application can call the SQLBindParameter function to associate the host program variables with parameter markers. These markers are placeholders for constant values in the SQL statement, and are specified with
a question mark (?). The application then calls SQLExecute to run the select statement. During execution, the values from
the host program variables are substituted for the parameter markers. The following example illustrates how parameter markers
are used.
select company, ord_num, ord_date, amount from customer, sales_order
where customer.cust_id = sales_order.cust_id
and ord_date = ?;
If you need to execute the same select statement multiple times without having to recompile it, use separate
calls to SQLPrepare and SQLExecute.
When statement execution is complete, your RDM Server SQL application calls SQLFetch to retrieve the rows of the result
set, one at a time. When all rows have been fetched, SQLFreeHandle is called to free the select statement handle and drop
it so it can no longer be used. Alternatively, the application can call SQLCancel to close the handle, terminating statement
execution but still allowing the statement to be re-executed at a later time. It also can call SQLFetchScroll to retrieve multiple rows in a single call.
The following example illustrates the processing of a select statement summarizing the year-to-date total sales for each salesperson in the sales database. The SalesSummary function is called with an open connection handle to the server containing
the database. This function allocates its own statement handle and calls SQLExecDirect to compile and execute the select
statement.
Calls to SQLBindCol bind the two column results to character buffers sale_name and amount. These function calls pass the
buffer size, as well as SQL_C_CHAR, indicating the result is to be converted to a character string. Note that a buffer size of
SQL_NTS is invalid for these calls, since the buffers are for output only. The last parameter passed to SQLBindCol is the
address of an integer (SQLLEN) variable to contain the output result length. For both calls in this example this parameter is
NULL, indicating that the application does not need the result length.
This example retrieves each row of the result set by calling SQLFetch. Each call retrieves the next row and stores the
column results in the program locations specified in the SQLBindCol calls.
#include <stdio.h>
#include "sql.h"

char stmt[] =
    "select sale_name, sum(amount) from salesperson, customer, sales_order "
    "where salesperson.sale_id = customer.sale_id "
    "and customer.cust_id = sales_order.cust_id "
    "group by sale_name";

SQLRETURN SalesSummary(
    SQLHDBC hdbc)          /* connection handle to sales database server */
{
    char sale_name[31];    /* salesperson name */
    char amount[20];       /* formatted sales order amount */
    SQLHSTMT sh;           /* statement handle */
    SQLRETURN stat;        /* SQL status code */

    if ((stat = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &sh)) != SQL_SUCCESS)
        return stat;
    if ((stat = SQLExecDirect(sh, (SQLCHAR *)stmt, SQL_NTS)) == SQL_SUCCESS) {
        SQLBindCol(sh, 1, SQL_C_CHAR, sale_name, 31, NULL);
        SQLBindCol(sh, 2, SQL_C_CHAR, amount, 20, NULL);
        while ((stat = SQLFetch(sh)) == SQL_SUCCESS)
            printf("Acct manager %s has a total of $%s in orders\n", sale_name, amount);
    }
    SQLFreeHandle(SQL_HANDLE_STMT, sh);   /* drop the statement handle */
    return stat;
}
In the next example, the PrintTable function outputs all columns and rows contained in the specified table. Unlike the prior
example, which has a fixed number of columns in the result set, this example can have a varying number of result columns.
Thus the code calls SQLNumResultCols to get the number of columns in the result set. An array of column result
descriptors (cols) is allocated to contain the definition and result information for each column. Function SQLDescribeCol
is called to retrieve the name, type, and display size for each result set column. The result value buffer is dynamically allocated and bound to its result column through the call to SQLBindCol.
As each row is retrieved by SQLFetch, each column value and its result length are stored in the COL_RESULT container for
that column. The length for a null column value is returned as SQL_NULL_DATA. In this case (see the following example),
the program displays NULL.
#include <stdio.h>
#include <stdlib.h>
#include "sql.h"

/* result container */
typedef struct {
    SQLCHAR     name[33];   /* column name */
    void       *value;      /* column value */
    SQLSMALLINT type;       /* column type */
    SQLLEN      len;        /* result value length */
} COL_RESULT;

/* Print all rows of a table */
SQLRETURN PrintTable(
    SQLHDBC svr,            /* server connection handle */
    char *tabname)          /* name of table whose rows are to be printed */
{
    char stmt[80];
    SQLHSTMT sh;
    SQLSMALLINT tot_cols, i;
    COL_RESULT *cols;
    SQLUINTEGER size;
    int32 row;

    /* set up and compile select statement */
    sprintf(stmt, "select * from %s", tabname);
    SQLAllocHandle(SQL_HANDLE_STMT, svr, &sh);
    if (SQLExecDirect(sh, (SQLCHAR *)stmt, SQL_NTS) != SQL_SUCCESS)
        return ErrHandler(SQL_HANDLE_STMT, sh, SQL_ERROR); /* handler from the earlier example */

    /* allocate column results container */
    SQLNumResultCols(sh, &tot_cols);
    cols = (COL_RESULT *)calloc(tot_cols, sizeof(COL_RESULT));

    /* fetch column names and bind column results */
    for (i = 0; i < tot_cols; ++i) {
        SQLDescribeCol(sh, i+1, cols[i].name, 33, NULL, &cols[i].type, &size, NULL, NULL);
        cols[i].value = malloc(size+1);
        SQLBindCol(sh, i+1, SQL_C_CHAR, cols[i].value, size+1, &cols[i].len);
    }

    /* print all rows in record-oriented format */
    printf("========== %s ==========\n", stmt);
    for (row = 1; SQLFetch(sh) == SQL_SUCCESS; ++row) {
        printf("**** row %ld:\n", (long)row);
        for (i = 0; i < tot_cols; ++i) {
            /* a length of SQL_NULL_DATA marks a null column value */
            printf("   %32.32s: %s\n", cols[i].name,
                   cols[i].len == SQL_NULL_DATA ? "NULL" : (char *)cols[i].value);
        }
    }

    /* drop statement handle and free allocated memory */
    SQLFreeHandle(SQL_HANDLE_STMT, sh);
    for (i = 0; i < tot_cols; ++i)
        free(cols[i].value);
    free((void *)cols);
    return SQL_SUCCESS;
}
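The null test used when printing each column can be isolated into a small helper, sketched here in plain C so that it compiles without the RDM Server headers. EX_NULL_DATA is a stand-in for SQL_NULL_DATA and assumes the ODBC value of -1; format_column is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for SQL_NULL_DATA (assumed to be -1, as in ODBC). */
#define EX_NULL_DATA (-1L)

/* Format one "name: value" line the way PrintTable does, substituting
   the text NULL when the length indicator marks a null column value.
   Returns the value returned by snprintf. */
int format_column(char *buf, size_t size, const char *name,
                  const char *value, long len)
{
    assert(buf != NULL && name != NULL);
    assert(len == EX_NULL_DATA || value != NULL);
    return snprintf(buf, size, "%32.32s: %s", name,
                    len == EX_NULL_DATA ? "NULL" : value);
}
```

Checking the length indicator before touching the value buffer matters because the buffer contents are not meaningful when the column is null.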
The next example presents an application function that invokes the myproc stored procedure mentioned previously. After compiling and executing the initial statement, the application calls SQLNumResultCols to determine if that statement was a
select statement. If there are result columns, the application can call SQLDescribeCol and SQLBindCol to set up processing of the result set. Then the program calls SQLFetch until it returns SQL_NO_DATA.
If there are no result columns, the initial statement was either insert, update, or delete. The application can call
SQLRowCount to count the number of rows affected by the modification statement. After processing the result, the application calls the SQLMoreResults function to determine if there are any more stored procedure statements to be processed
and, if so, to execute the next one.
A stored procedure containing more than one select statement requires that the application call the
SQLMoreResults function after SQLFetch returns SQL_NO_DATA. This call determines if any more
result sets exist and initializes their processing. SQLMoreResults can be called repeatedly to execute
each subsequent statement in the procedure. If the statement is a select statement, SQLFetch can then be
called repeatedly to fetch the latest result set. When there are no more statements in the procedure,
SQLMoreResults returns SQL_NO_DATA.
#include <stdio.h>
#include <stdlib.h>
#include "sqlext.h"

typedef struct col_result {
    char        name[33];
    char       *value;
    SQLUINTEGER prec;
} COL_RESULT;

/* application-defined function (not shown) that displays one fetched row */
extern void DisplayResultRow(COL_RESULT *results, SQLSMALLINT nocols);

void RunProc(SQLHSTMT hstmt)
{
    SQLSMALLINT nocols, col;
    SQLLEN norows;
    SQLRETURN stat;
    COL_RESULT *results;

    stat = SQLExecDirect(hstmt, (SQLCHAR *)"execute myproc()", SQL_NTS);
    while (stat == SQL_SUCCESS) {
        SQLNumResultCols(hstmt, &nocols);
        if (nocols > 0) {
            /* set up and fetch result set */
            results = (COL_RESULT *)calloc(nocols, sizeof(COL_RESULT));
            for (col = 0; col < nocols; ++col) {
                COL_RESULT *rp = &results[col];
                SQLDescribeCol(hstmt, col+1, (SQLCHAR *)rp->name, 33, NULL, NULL, &rp->prec, NULL, NULL);
                rp->value = malloc(rp->prec+1);
                SQLBindCol(hstmt, col+1, SQL_C_CHAR, rp->value, rp->prec+1, NULL);
            }
            /* loop ends when SQLFetch returns SQL_NO_DATA (or an error) */
            while (SQLFetch(hstmt) == SQL_SUCCESS)
                DisplayResultRow(results, nocols);
            /* free results memory */
            for (col = 0; col < nocols; ++col)
                free(results[col].value);
            free(results);
        }
        else {
            /* report number of rows affected */
            SQLRowCount(hstmt, &norows);
            if (norows > 0)
                printf("*** %ld rows affected\n", (long)norows);
        }
        /* advance to the next statement in the procedure, if any */
        stat = SQLMoreResults(hstmt);
    }
}
13.4.10 Positioned Update and Delete
A cursor is a named, updateable select statement where the cursor position is the current row (that is, the row returned from
the most recent call to SQLFetch). An updateable select statement does not include a group by or order by clause and only
refers to a single table in the from clause. If the table is a view, that view must be updateable. Cursors are used in conjunction with positioned updates and deletes to allow the current row from a select statement to be updated or deleted.
The general procedure for a positioned update (or delete) is as follows:
1. Call SQLAllocHandle to allocate the statement handle for the select statement.
2. Call SQLAllocHandle to allocate a statement handle for the update statement.
3. Call SQLPrepare with the first statement handle to compile the select statement.
4. To specify your own cursor name, call SQLSetCursorName using the select statement handle. If necessary, this function can be called before step 3. To use a system-generated cursor name, skip this step.
5. Using the select statement handle, call SQLBindCol and SQLBindParameter as often as necessary and then call
SQLExecute.
6. If the cursor name is system-generated, call SQLGetCursorName to copy the cursor name into the update statement
(where current of clause).
7. Call SQLPrepare to compile the update statement.
8. Call SQLFetch repeatedly with the select statement handle until a row to be modified is retrieved.
9. To perform the update, assign the values to the desired parameters and call SQLExecute using the update statement
handle. Repeat steps 8 and 9 until finished.
10. Call SQLEndTran to commit the changes.
11. Free the statement handles by calling SQLFreeHandle.
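Steps 6 and 7 amount to splicing the cursor name into the text of the update statement before it is compiled. A minimal sketch of that string handling in plain C is shown below; build_positioned_update is a hypothetical helper, and the resulting text would then be passed to SQLPrepare on the update statement handle.

```c
#include <assert.h>
#include <stdio.h>

/* Build "update <table> set <set_clause> where current of <cursor>".
   Returns the statement length, or -1 if buf is too small. */
int build_positioned_update(char *buf, size_t size, const char *table,
                            const char *set_clause, const char *cursor)
{
    int n;

    assert(buf != NULL && table != NULL && set_clause != NULL && cursor != NULL);
    n = snprintf(buf, size, "update %s set %s where current of %s",
                 table, set_clause, cursor);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The names are inserted as given, so this approach suits only trusted identifiers such as a cursor name returned by SQLGetCursorName, not text entered by a user.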
The following example illustrates positioned update processing. The RaiseComm function fetches and displays each row of
the salesperson table so that the user (for example, the sales manager) can raise a salesperson's commission rate by 1 percent.
RaiseComm uses SQLSetCursorName to give the select statement the cursor name "comm_raise". Note that if the statement
associated with the cursor is not an updateable select statement, SQLSetCursorName returns an error code.
#include <stdio.h>
#include "sql.h"

static char SaleSel[] =
    "select sale_id, sale_name, commission, mgr_id from salesperson";
static char SaleUpd[] =
    "update salesperson set commission=commission+0.01 "
    "where current of comm_raise";

/* Raise commission for selected salespersons.
   DisplaySalesperson and UPDATED are application-defined (not shown). */
SQLRETURN RaiseComm(SQLHDBC svr)
{
    char sale_id[4], mgr_id[4], sale_name[31];
    float comm;
    SQLLEN mgrIdInd;
    SQLCHAR sqlstate[6], errmsg[80];
    SQLHSTMT sHdl, uHdl;
    SQLRETURN stat;

    /* step 1: allocate select statement handle */
    if ((stat = SQLAllocHandle(SQL_HANDLE_STMT, svr, &sHdl)) != SQL_SUCCESS)
        return stat; /* this will catch connection handle problems */
    /* step 2: allocate update statement handle */
    SQLAllocHandle(SQL_HANDLE_STMT, svr, &uHdl);
    /* step 3: compile the select statement */
    SQLPrepare(sHdl, (SQLCHAR *)SaleSel, SQL_NTS);
    /* step 4: specify cursor name */
    SQLSetCursorName(sHdl, (SQLCHAR *)"comm_raise", SQL_NTS);
    /* step 5: bind select stmt columns and execute select statement */
    SQLBindCol(sHdl, 1, SQL_C_DEFAULT, sale_id, 4, NULL);
    SQLBindCol(sHdl, 2, SQL_C_DEFAULT, sale_name, 31, NULL);
    SQLBindCol(sHdl, 3, SQL_C_DEFAULT, &comm, sizeof(float), NULL);
    /* mgrIdInd will be SQL_NULL_DATA for managers */
    SQLBindCol(sHdl, 4, SQL_C_DEFAULT, mgr_id, 4, &mgrIdInd);
    if ((stat = SQLExecute(sHdl)) == SQL_SUCCESS) {
        /* step 6 is skipped because the cursor name was set in step 4 */
        /* step 7: compile positioned update statement */
        SQLPrepare(uHdl, (SQLCHAR *)SaleUpd, SQL_NTS);
        /* step 8: fetch each row and display, allowing user to update
           if desired */
        while ((stat = SQLFetch(sHdl)) == SQL_SUCCESS) {
            if (mgrIdInd != SQL_NULL_DATA &&
                DisplaySalesperson(sale_id, sale_name, comm) == UPDATED) {
                /* step 9: this salesperson gets the raise */
                if ((stat = SQLExecute(uHdl)) != SQL_SUCCESS)
                    break;
            }
        }
    }
    if (stat == SQL_ERROR) {
        SQLGetDiagRec(SQL_HANDLE_DBC, svr, 1, sqlstate, NULL, errmsg, sizeof(errmsg), NULL);
        printf("***ERROR(%s): %s\n", sqlstate, errmsg);
        SQLEndTran(SQL_HANDLE_DBC, svr, SQL_ROLLBACK);
    }
    else {
        /* step 10: commit the changes */
        SQLEndTran(SQL_HANDLE_DBC, svr, SQL_COMMIT);
    }
    /* step 11: drop the statement handles */
    SQLFreeHandle(SQL_HANDLE_STMT, sHdl);
    SQLFreeHandle(SQL_HANDLE_STMT, uHdl);
    return stat;
}
Like a positioned update, a positioned delete can be used to delete the current row of a specified cursor. Execution of a positioned delete is identical to a positioned update except that no columns are updated and the row is simply deleted. A positioned delete must first define a select statement cursor by calling SQLGetCursorName or SQLSetCursorName. Then the
application issues a delete statement using the where clause with a current of qualifier to specify the cursor. In this case, the
delete statement removes only the row indicated by the cursor. The following statement deletes the salesperson indicated by
the cursor named "comm_raise", as described in Processing a Positioned Update.
delete salesperson where current of comm_raise;
13.5 Using Cursors and Bookmarks
13.5.1 Using Cursors
In ODBC, a user fetches data from a database by executing an SQL query (through SQLExecDirect or SQLExecute). The
server determines a result set of rows that match the requested query, and creates a cursor that points to a row in this result set.
The user then fetches the data by calling SQLFetch or SQLFetchScroll.
Rowset
If the user calls SQLFetch, RDM Server returns the data for one row; if the user calls SQLFetchScroll, RDM Server
returns a group of rows (a rowset), starting with the row pointed to by the cursor. The number of rows in a rowset is determined by the rowset size, which is set by calling the SQLSetStmtAttr function with the SQL_ROWSET_SIZE option. (The default
is 1.) The user can fetch additional rowsets by calling these fetch functions again.
Types of Cursors
The five types of cursors available in ODBC can be divided into two categories:
• Non-scrollable cursors allow the user to fetch only the next rowset in the result set. When the end of the result set is
reached, the fetch function returns SQL_NO_DATA_FOUND. There is one kind of non-scrollable cursor, the forward-only cursor.
• Scrollable cursors give users the choice of which rowset to fetch (for example, the next rowset, the previous rowset, or
a rowset starting at an absolute row number). Scrollable cursor types include static, dynamic, keyset-driven, and mixed
cursors.
RDM Server supports one kind of scrollable cursor, the static cursor.
13.5.2 Static Cursors
This cursor is called static because the result set's membership is determined when the query is executed and does not change
for the life of the query. Therefore, if another user changes the data in a row that the user has fetched, this change is unseen
by the first user until that user re-executes the query. In essence, a snapshot of the result set is taken when the query is
executed and that snapshot does not change until the query is re-executed.
RDM Server caches result set data on the client side. When the data is requested through SQLFetchScroll, RDM Server
fetches as many rows from the server as necessary to meet the request. Thus, if the user requests the first rowset, RDM Server
only fetches that rowset. But, if the user requests the last rowset, RDM Server must fetch all intervening rows from the server
into the client side cache before it can fetch the requested rowset. If the result set is large, this could take several minutes.
However, once the data is on the client side, any request for a rowset is met quickly. When the cursor is freed (by calling
SQLCloseCursor), the client side cache is cleared.
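The cost difference between requesting the first rowset and requesting the last can be made concrete with a small model. The function below is a sketch under the assumption that the client side cache is filled sequentially from row 1, as the description above implies; rows_to_transfer is a hypothetical name, and a real application usually does not know the total row count in advance.

```c
#include <assert.h>

/* Rows that must be transferred from the server before a rowset that
   starts at start_row (1-based) can be returned from the client side
   cache. `cached` is the number of rows already held (rows 1..cached). */
long rows_to_transfer(long cached, long start_row, long rowset_size,
                      long total_rows)
{
    long last;

    assert(cached >= 0 && start_row >= 1 && rowset_size >= 1);
    assert(total_rows >= cached);

    last = start_row + rowset_size - 1;   /* last row of requested rowset */
    if (last > total_rows)
        last = total_rows;                /* rowset cut short at end of set */
    return last > cached ? last - cached : 0;
}
```

For a 100,000-row result set and a rowset size of 10, the first rowset needs 10 rows transferred, a jump to the last rowset needs the remaining 99,990, and any later request is served entirely from the cache.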
Using Static Cursors
By default, all cursors are forward-only. To implement static cursors the user must, before executing the query, call
SQLSetStmtAttr with the SQL_ATTR_CURSOR_TYPE option set to SQL_CURSOR_STATIC. If the user employs
SQLFetch to retrieve data, the cursor is still restricted to forward-only movement; furthermore, a user cannot mix
SQLFetchScroll and SQLFetch on a given cursor.
However, if a user employs SQLExtendedFetch, the user can fetch any rowset from the result set in any order. (For
instance, the user can fetch the last rowset in the result set or the rowset starting with the fiftieth row.) Once a rowset is
fetched, the user can call SQLSetPos (for static cursors only) to position the cursor at a particular row within the rowset. The
row's data can then be retrieved into variables using the SQLGetData function. Alternatively, the data can be retrieved by
binding columns to arrays of variables, just as with the forward-only cursor.
Limitations on Static Cursors
As explained in "Static Cursors" above, a static cursor cannot reflect changes to database data made after a query has been
executed. Static cursors additionally have the following limitations:
Changing the default display string for a data type affects what the server can retrieve. If you bind a column to SQL_C_
CHAR that has a default display type (as set by the SQL statement set type display), the cursor caches it as a string using the
format string. This means that you cannot subsequently rebind that column as a non-character type and fetch data until the
query is re-executed. Nor can you call SQLGetData to fetch the data in its native form, since this function also retrieves the
information from the client side cache.
Similarly, if you bind a column that has a specified display format as a non-character type, you cannot rebind it (or use
SQLGetData) as a character type during the life of the cursor. This is because the default format information for the type is
stored on the server, while the fetched data might be coming from the client side cache, which has no access to this information. Therefore, RDM Server returns the information as specified when the cursor was first opened (i.e., on the first SQLExtendedFetch call). Note that there is no binding/SQLGetData limitation of this type if no default format string was
specified for the data type.
It is possible in RDM Server to change, between fetches, the default format string for a data type from a result set. RDM
Server, however, freezes the format string (if any) at cursor creation time (i.e., during the first SQLExtendedFetch), so the
query must be re-executed to reflect the change. The new format will not be used for the data type until the query is reexecuted. This rule also applies if a format string is created for a data type that did not have one at query execution time.
The following example demonstrates this limitation. Suppose the user employs static cursors and calls the following statement
on the current connection:
set real display(10, "$#,#.##");
Any columns of data type real that are bound to SQL_C_CHAR variables will be returned using the specified format string.
Suppose the user executes the query and calls SQLExtendedFetch, having bound the only column of type SQL_REAL in
the query to char. The resulting data will be returned in the dollar-sign format specified.
However, if the user tries to call SQLGetData on the field as follows, an error results:
SQLGetData(hstmt, 1, SQL_C_FLOAT, &fval, 0, NULL);
Because of this limitation, the user cannot convert the SQL_REAL column to a SQL_C_FLOAT column for the life of the
static cursor.
Suppose the user re-executes the query, binds the column as SQL_C_FLOAT, calls SQLExtendedFetch, and tries to rebind
the column as SQL_C_CHAR. The user will get another error, because now the column is returned as SQL_C_FLOAT and
the client side cache does not have access to the previously specified dollar sign display format.
BLOB fields are handled differently in static cursor mode, because fetching huge BLOB fields into the client side cache inhibits performance. (Note that this handling method does not meet ODBC specifications.) At cursor creation time (i.e., during the
first SQLExtendedFetch), RDM Server only fetches BLOBs that have been bound (using SQLBindCol). Further, it only
fetches up to the number of BLOB bytes necessary to fill the requested bound buffer.
Thus, if the user binds a BLOB column to a 50-byte field, a maximum of 50 bytes of that particular BLOB will be returned
when SQLExtendedFetch is called. The user then cannot fetch more than those 50 bytes, because the data is not available
on the client side. The user cannot retrieve the data from the server because a static cursor's data is set when the cursor is created. (The BLOB's data might have changed since the cursor was created with the SQLExtendedFetch call.)
To retrieve more data, the user must re-execute the query, binding to a larger buffer on re-execution. Note that an increased
bound buffer size could affect performance, because more data might be sent over from the server during the fetch. Also, if the
user has not bound the BLOB column before the first call to SQLExtendedFetch, no BLOB data is available for that BLOB
for the life of the cursor, unless the user has first used the SQL_FETCH_MAXBLOB option. (For details, see the description
of SQLSetStmtAttr.)
13.5.3 Using Bookmarks
RDM Server supports the ODBC concept of bookmarks, which allows you to mark a row and then return to that specific row
later. Bookmarks are identifiers for a particular row that can be used to re-fetch a given row, provided the statement has been
fetched using static cursors and the SQLExtendedFetch function. Bookmarks are stored in integer buffers.
Activate a Bookmark
To use bookmarks (which are turned off by default), you must activate them on the statement handle with the SQLSetStmtAttr function. Use the SQL_ATTR_USE_BOOKMARKS option with the SQL_UB_ON setting.
Once bookmarks are activated, execute a query, then fetch a rowset using SQLExtendedFetch (bookmarks do not work
with SQLFetch). To set a current row within the rowset, call SQLSetPos. (By default, the first row in the rowset is the current row.)
Turn Off a Bookmark
To turn off bookmarks, use the SQLSetStmtAttr function and the SQL_ATTR_USE_BOOKMARKS option, the same as
for activating, but use the SQL_UB_OFF setting instead. This option only works if the statement has previously been set up
to use static cursors.
Retrieve a Bookmark
To retrieve the bookmark for a row, call SQLGetStmtAttr and specify the SQL_FETCH_BOOKMARK_PTR option. The
bookmark is saved in a four-byte integer buffer. A bookmark can also be retrieved by using SQLBindCol or SQLGetData.
Return to a Bookmark
To return to a bookmark, call SQLFetchScroll with the fetch option set to SQL_FETCH_BOOKMARK and the
rowNum parameter set to the previously fetched bookmark. The returned rowset will start with the row marked by the bookmark. When the statement is freed, the bookmarks become invalid and must be re-fetched for subsequent queries.
13.5.4 Retrieving Blob Data
Like other data types, columns of the long varchar, long wvarchar, or long varbinary type can be retrieved via a select
statement. However, BLOB types cannot be used in where clauses or in any other expressions, except as parameters to a
UDF. The data can be either bound or unbound, and either SQLFetch or SQLFetchScroll can be used to retrieve the data. The
limitations involved with these methods are described below.
You can use the SQLBindCol function to bind a BLOB column to a buffer. If the buffer is not large enough to hold the
entire BLOB, the BLOB will be truncated to fit the buffer. If you have provided an output length variable to SQLBindCol,
this variable will contain the full length of the BLOB before truncation. The only way to retrieve the remainder of the BLOB
is to use the SQLGetData function. Therefore, use SQLBindCol to bind BLOB columns only if you have relatively small
BLOBs or if you only care about the first portion of a BLOB. To retrieve a large number of rows in a rowset with
SQLFetchScroll, you need to call SQLBindCol to bind an array of buffers of rowset size, which will require considerable memory if you are retrieving large BLOBs. As an option, you can allocate less memory for the buffer, which will result in the BLOB data being truncated.
You also can retrieve the BLOB data using SQLGetData. This function can be used to retrieve all the BLOB data in chunks
of any size. However, you must use SQLFetch to retrieve the result set one row at a time, since you cannot use
SQLGetData with SQLFetchScroll. You can call SQLGetData multiple times if necessary; each time it writes into the
provided buffer the number of bytes specified in the buffer length parameter of the function. It will also write into the output
length parameter (if the parameter is provided) the number of bytes remaining to retrieve from the BLOB before the current
call to SQLGetData. The next time you call SQLGetData, the next chunk of the BLOB is returned into the buffer. If truncation has occurred, SQLGetData returns SQL_SUCCESS_WITH_INFO. When it returns the last part of the data,
SQLGetData returns SQL_SUCCESS. If called after this, SQLGetData returns SQL_NO_DATA.
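The chunked retrieval pattern described above can be sketched without a server connection. This is only an illustration under stated assumptions: mock_get_data is a hypothetical stand-in that imitates the return-code behavior described for SQLGetData on character data (it is not the RDM Server function), and the MOCK_* constants, blob_source, and read_all are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes standing in for SQL_SUCCESS,
   SQL_SUCCESS_WITH_INFO and SQL_NO_DATA. */
enum { MOCK_SUCCESS = 0, MOCK_SUCCESS_WITH_INFO = 1, MOCK_NO_DATA = 100 };

/* Hypothetical cursor over one character BLOB value. */
typedef struct {
    const char *data;   /* full value */
    size_t      len;    /* total length in bytes */
    size_t      pos;    /* bytes already returned */
    int         done;   /* set once the last part has been returned */
} blob_source;

/* Imitates the behavior described in the text: copies up to buflen-1
   bytes plus a null terminator, stores in *remaining the bytes that were
   still unread before this call, and returns WITH_INFO while data is
   left, SUCCESS for the last part, and NO_DATA afterwards. */
static int mock_get_data(blob_source *src, char *buf, size_t buflen,
                         size_t *remaining)
{
    size_t left, n;

    assert(buflen > 1);
    if (src->done)
        return MOCK_NO_DATA;
    left = src->len - src->pos;
    n = left < buflen - 1 ? left : buflen - 1;
    memcpy(buf, src->data + src->pos, n);
    buf[n] = '\0';
    src->pos += n;
    *remaining = left;
    if (n < left)
        return MOCK_SUCCESS_WITH_INFO;
    src->done = 1;
    return MOCK_SUCCESS;
}

/* Assembles the whole value in out (capacity outcap, terminator
   included) using a small chunk buffer. Returns the number of bytes
   assembled, or (size_t)-1 if out is too small. */
static size_t read_all(blob_source *src, char *out, size_t outcap)
{
    char   chunk[8];
    size_t remaining, got, total = 0;
    int    status;

    while ((status = mock_get_data(src, chunk, sizeof chunk, &remaining))
           != MOCK_NO_DATA) {
        /* The chunk holds either everything that was left or a full
           buffer minus the terminator. */
        got = remaining < sizeof chunk ? remaining : sizeof chunk - 1;
        if (total + got + 1 > outcap)
            return (size_t)-1;
        memcpy(out + total, chunk, got);
        total += got;
    }
    out[total] = '\0';
    return total;
}
```

The caller derives the size of each chunk from the reported remaining length rather than from the status code, which is the same calculation the manual's own example performs on buflen.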
The following example retrieves all data in the CDAlbum table associated with a specific composer, Beethoven. First, we
execute a statement:
SQLExecDirect(hstmt,
"select * from cdalbum "
"where composer = 'Beethoven, Ludwig Von';", SQL_NTS);
Next, if we are going to bind the column, we call SQLBindCol:
SQLBindCol(hstmt, 6, SQL_C_CHAR, notes, sizeof(notes), NULL);
SQLBindCol(hstmt, 7, SQL_C_BINARY, jacketpic, sizeof(jacketpic), NULL);
Both notes and jacketpic are character arrays; by binding the long varbinary column as SQL_C_BINARY, we eliminate the
need for a null terminator. Next, call SQLFetch or SQLFetchScroll. If we call SQLFetchScroll, the notes and jacketpic buffers must be two dimensional arrays where the first dimension equals the rowset size. For example, if the rowset size
is 50, the notes and jacketpic buffers might be declared as follows.
char notes[50][NOTES_SIZE];
char jacketpic[50][JACKETPIC_SIZE];
After calling the fetch function, the buffer will contain up to sizeof(buffer) bytes of the BLOB. At this
point, if you previously called SQLFetch, you can call SQLGetData to get the remainder of the BLOB data. Note that the
first time you call SQLGetData, it will refetch the data you already have in the buffer (the first bytes), which were bound to
that buffer when you called SQLFetch.
Alternatively, if we only use SQLGetData to retrieve the data, we call SQLFetch to fetch each row. Then call
SQLGetData multiple times to retrieve all the BLOB data. This approach might look like the following:
#define JPIC_SIZE  1000
#define NOTES_SIZE 100

char jpic[JPIC_SIZE], notes[NOTES_SIZE];
SQLINTEGER buflen;
SQLRETURN status;
int32 len, offset;

SQLExecDirect(---); /* as above */
while (SQLFetch(hstmt) == SQL_SUCCESS) {
    offset = 0;
    do {
        status = SQLGetData(hstmt, 6, SQL_C_CHAR, notes, NOTES_SIZE, &buflen);
        if (status == SQL_SUCCESS || status == SQL_SUCCESS_WITH_INFO) {
            /* Copy data elsewhere, as our buffer will be overwritten
               the next time we call SQLGetData. (somebuf is an
               application buffer declared elsewhere.)
            */
            len = (int32)(buflen < sizeof(notes) ? buflen : sizeof(notes)-1);
            memcpy(somebuf+offset, notes, len);
            offset += len;
            status = SQL_SUCCESS;
        }
    } while (status == SQL_SUCCESS);

    /* Then do the same thing for the jacketpic blob. */
    do {
        status = SQLGetData(hstmt, 7, SQL_C_BINARY, jpic, JPIC_SIZE, &buflen);
        if (status == SQL_SUCCESS || status == SQL_SUCCESS_WITH_INFO) {
            /* Same idea as above, except this buffer has no
               null terminator in it.
            */
            ...
        }
        ...
    } while (status == SQL_SUCCESS);
}
For most BLOB values, however, the usual way to insert or update BLOB data is to use parameter markers and the SQLParamData and SQLPutData functions to put the data into the BLOB in chunks (or all at once if you wish). To insert or update
the data this way, first prepare a statement containing a parameter marker for the BLOB column, bind the parameter, call
SQLExecute, call SQLParamData, then repeatedly call SQLPutData to put the data into the BLOB. Finally, call
SQLParamData again to prepare the next BLOB for insert/update, or to complete the modifications if there are no further
BLOBs. For example, if we have an external data file containing a copy of the album's jacket picture, we must prepare an
insert statement:
SQLPrepare (hstmt, "insert into cdalbum values(,'Eine Kleine Nachtmusik',
'Mozart, Wolfgang', 'Classical', ?, null);", SQL_NTS);
Before executing this statement, we first must bind a variable to the parameter marker in the BLOB field. BLOB parameters
must be bound as SQL_DATA_AT_EXEC parameters, meaning data for the parameter will be provided after statement execution. In RDM Server, the SQL_LEN_DATA_AT_EXEC(length) macro indicates that the parameter is DATA_AT_EXEC.
The length parameter of this macro must be non-negative (usually 0) and is ignored by RDM Server. The last parameter in
SQLBindParameter is a pointer to an SQLINTEGER variable equal to the result of this macro.
Our next requirement concerns the variable we bind to the parameter; it must be a 4-byte value. This value can either be a
scalar value or a pointer. For example, we might bind to the BLOB parameter a pointer to a string containing the name of the
file containing the jacket picture, as shown below.
status = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_BINARY,
SQL_LONGVARBINARY, 0, 0, picFileName, 0, &bloblen);
Here, picFileName is a pointer to the string "c:\albums\jackets\mozart733.jpg" containing the album's jacket picture data and bloblen = SQL_LEN_DATA_AT_EXEC(0). With all parameters in place, we can execute the statement:
status = SQLExecute(hstmt);
The value of status after this call will not be SQL_SUCCESS, but SQL_NEED_DATA, indicating that statement execution is
not complete. Thus, our next step is adding the BLOB data to this record. First, call SQLParamData, which takes two parameters (the second is a pointer). RDM Server will return into the pointer the value associated with the first bound DATA_AT_EXEC parameter it finds. The SQL_NEED_DATA status code is returned if any DATA_AT_EXEC parameters are found.
RDM Server first searches for and returns all non-BLOB DATA_AT_EXEC parameters (if any), then returns the BLOB parameters. Each call to SQLParamData returns the next parameter, one parameter for each call. When there are no more to
return, SQLParamData returns SQL_SUCCESS. In our example, we only have one DATA_AT_EXEC parameter. Therefore,
after we call SQLParamData, ptr will point to the file name string ("c:\albums\jackets\mozart733.jpg") that we bound earlier with SQLBindParameter:
status = SQLParamData(hstmt, &ptr);
Next, call SQLPutData as many times as necessary to put all the data into the BLOB field. When finished, call SQLParamData again to move to the next DATA_AT_EXEC parameter. If there is another DATA_AT_EXEC parameter, SQLParamData will return SQL_NEED_DATA. Otherwise, it will return SQL_SUCCESS, indicating the insert is now complete. In
our example, we call fopen using the value in ptr set by SQLParamData, and read the data out of the file. We will send the
data in chunks of 1024 bytes. We call SQLPutData multiple times until all the data is sent, then call SQLParamData
again:
#define BUFSIZE 1024

while ((status = SQLParamData(hstmt, &ptr)) == SQL_NEED_DATA) {
    if ((fn = fopen(ptr, "rb")) != NULL) {
        do {
            /* put next block of data from file */
            buflen = fread(buf, 1, BUFSIZE, fn);
            status = SQLPutData(hstmt, buf, buflen);
        } while (buflen == BUFSIZE && status != SQL_ERROR);
        fclose(fn);
        /* check here if status == SQL_ERROR or SQL_SUCCESS */
        if (status == SQL_ERROR) {
            ...
        }
    }
}
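The inner read-and-put loop can be exercised on its own. The sketch below is a minimal illustration, not RDM Server code: mock_put_data is a hypothetical stand-in for SQLPutData that appends each chunk to an in-memory sink, and blob_sink, send_file, and CHUNK_SIZE are names assumed for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 1024

/* Hypothetical sink standing in for the server side of SQLPutData. */
typedef struct {
    unsigned char data[8192];
    size_t        len;     /* bytes received so far */
    int           calls;   /* number of put calls made */
} blob_sink;

/* Hypothetical stand-in for SQLPutData: appends one chunk to the sink.
   Returns 0 on success and -1 if the sink would overflow. */
static int mock_put_data(blob_sink *sink, const void *buf, size_t n)
{
    if (n > sizeof sink->data - sink->len)
        return -1;
    memcpy(sink->data + sink->len, buf, n);
    sink->len += n;
    sink->calls++;
    return 0;
}

/* Sends the rest of fp to the sink in CHUNK_SIZE pieces. A short read
   marks the final chunk. Returns 0 on success, -1 on a read or put
   error. */
static int send_file(FILE *fp, blob_sink *sink)
{
    unsigned char buf[CHUNK_SIZE];
    size_t        n;
    int           status;

    assert(fp != NULL && sink != NULL);
    do {
        n = fread(buf, 1, sizeof buf, fp);
        status = mock_put_data(sink, buf, n);
    } while (n == sizeof buf && status == 0);
    if (ferror(fp))
        return -1;
    return status;
}
```

As in the manual's loop, a file whose size is an exact multiple of the chunk size ends with one extra put of zero bytes, because only a short read signals the end of the data.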
It is useful to have the pointer bound in SQLBindParameter represent something uniquely identifying the BLOB, particularly if there is more than one BLOB in the record. You must insert the data into the BLOBs in the order requested by
RDM Server (via SQLParamData); RDM Server returns the BLOBs in the order they are placed in the table.
Similarly, in an update statement, you cannot use the BLOB in the where clause to identify which rows to update (unless
you have a UDF that takes BLOB parameters). It is useful to define another field in the record that will uniquely identify
which records you want to update. In our example database, cd_id is a unique primary key that can be used. When the update
occurs, the entire new BLOB must be inserted into the database, completely replacing the BLOB already in the database (if
any). You cannot simply append changes onto the end of a BLOB.
As mentioned earlier, you cannot directly reference columns of the long varchar, long wvarchar, and long varbinary type
in the where clause of a select, update, or delete statement. You can, however, pass a BLOB column as an argument to a user-defined function (UDF).
One of the many uses for a UDF is doing fast low-level database lookups. Inside the UDF, these low-level operations can be
used to manipulate BLOBs. For instance, a UDF could return the BLOB's size, or whether the BLOB is NULL. You might
write a "BLOB grep" function to return whether a supplied string occurs in the BLOB.
You can also pass BLOB data types into UDFs (or UDPs) as parameters. As a simple example, if we write a UDF called blobgrep, we might execute the following select statement to retrieve the names of composers whose biographies contain the
string "violin".
select composer from cdalbum
where blobgrep(notes, "violin") = 1;
The blobgrep function itself could use runtime BLOB functions to search the current BLOB for the requested string, returning
1 if the string is found, 0 if not.
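The search at the heart of such a function might look like the sketch below. It is an assumption-laden illustration: it works on a BLOB that has already been read into memory, whereas a real blobgrep UDF would obtain the data through the runtime BLOB functions, and blob_contains is a name invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the null-terminated string needle occurs anywhere in the
   first len bytes of blob, 0 otherwise. The blob may contain embedded
   zero bytes, so strstr cannot be used directly. */
static int blob_contains(const unsigned char *blob, size_t len,
                         const char *needle)
{
    size_t nlen, i;

    assert(blob != NULL && needle != NULL);
    nlen = strlen(needle);
    if (nlen == 0)
        return 1;               /* the empty string matches everything */
    if (nlen > len)
        return 0;
    for (i = 0; i + nlen <= len; i++) {
        if (blob[i] == (unsigned char)needle[0] &&
            memcmp(blob + i, needle, nlen) == 0)
            return 1;
    }
    return 0;
}
```

The 1/0 result matches the comparison used in the select statement above (blobgrep(notes, "violin") = 1).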
14. Developing SQL Server Extensions
SQL server extensions are application-specific, C language modules that are extensions of RDM Server and are called from
RDM Server SQL. SQL server extensions include the following:
• C-based, user-defined functions (UDFs)
• C-based, user-defined procedures (UDPs)
• Transaction trigger functions
All these modules extend the capabilities of RDM Server SQL. Called by the RDM Server SQL system during the processing
of SQL statements, these modules run in DLLs or shared libraries on the RDM Server. They are easy to develop and provide a
powerful tool for development of high-performance RDM Server SQL database applications.
14.1 User-Defined Functions (UDF)
A UDF is an application-specific function used just like the RDM Server SQL scalar and aggregate functions, but developed
to meet the specific needs of your SQL application. After you have completed development of your UDF you need to register
it with the RDM Server SQL system. This is done using the create function statement, as shown in the following syntax.
create_function:
create [scalar | aggregate] function[s]
fcnname ["description"] [, fcnname ["description"]]...
in libname on devname
A scalar UDF operates on a single row and retrieves a single value. An aggregate function performs computations on sets of
rows that result from a select statement usually specified with the group by clause. For example, the following statements
register three user-defined aggregate functions contained in a DLL called "statpack.dll" on an RDM Server database device
named add_ins. The select statement calls the standard SQL aggregate function, avg, as well as the user-defined aggregate
function, geomean.
create aggregate function
    devsq   "compute sum of the squares of deviations",
    stddev  "compute standard deviation",
    geomean "compute the geometric mean"
    in statpack on add_ins;
select state, avg(amount), geomean(amount) from customer, sales_order
where customer.cust_id = sales_order.cust_id
group by state;
User-defined functions have a variety of uses, such as:
• Translating coded values into easy-to-read strings.
• Performing special-purpose computations.
• Adding new aggregate functionality.
• Doing fast, low-level database lookups (including manipulation of BLOBs).
• Implementing triggers called when tables are updated.
You will find sample UDF code in the examples/udf directory. This code includes a sample module (udf.c) with some
source code, which will be used throughout this section. Instructions for using the sample code are provided in Executing
Your RDM Server Programs.
The sample UDF module defines the six user-defined functions listed in Table 13-1.
Table 13-1. Functions Defined in the Sample UDF Module
Function      Description
HaveProduct   Trigger.
OkayToShip    Trigger.
subquery      Takes a string containing a select statement that retrieves a single-value result.
std           Computes exact standard deviation.
stds          Computes sampled standard deviation.
udfcount      Performs exactly the same operation as the RDM Server SQL built-in count function.
These functions can be compiled using the provided udf.mak makefile. The resulting DLL is called udf.dll. After the
DLL is created, connect to RDM Server SQL and enter the following create function statement. The examples following this
statement illustrate the use of these functions.
create aggregate function
    std      "actual standard deviation",
    stds     "sampled standard deviation",
    udfcount "alternate count function"
scalar function
    subquery "selectable subquery function"
    in udf on rdsdll;
You could show the average sales amounts and their standard deviations per salesperson using the following query.
select sale_name, avg(amount), std(amount)
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id
group by sale_name;
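To make the distinction between an exact and a sampled standard deviation concrete, the sketch below accumulates a running variance. It rests on assumptions: it takes "exact" to mean the population form (divisor n) and "sampled" to mean the sample form (divisor n - 1), it uses Welford's running-update method rather than whatever arithmetic the sample udf.c module uses, and all names are invented for the example. The standard deviation is the square root of each variance.

```c
#include <assert.h>

/* Running accumulator (Welford's method). */
typedef struct {
    long   n;      /* values seen so far */
    double mean;   /* running mean */
    double m2;     /* sum of squared deviations from the mean */
} stat_ctx;

/* Clears the accumulator, as a reset function would do per group. */
static void stat_reset(stat_ctx *c)
{
    c->n = 0;
    c->mean = 0.0;
    c->m2 = 0.0;
}

/* Folds one value into the accumulator, as a per-row call would. */
static void stat_add(stat_ctx *c, double x)
{
    double delta = x - c->mean;

    c->n++;
    c->mean += delta / (double)c->n;
    c->m2 += delta * (x - c->mean);
}

/* Variance over the whole population (divisor n). */
static double variance_population(const stat_ctx *c)
{
    assert(c->n >= 1);
    return c->m2 / (double)c->n;
}

/* Variance estimated from a sample (divisor n - 1). */
static double variance_sample(const stat_ctx *c)
{
    assert(c->n >= 2);
    return c->m2 / (double)(c->n - 1);
}
```

Because the accumulator is updated one value at a time, the same structure suits an aggregate UDF that must produce a running result for every row.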
Both the count and udfcount functions below should return identical results from the following two queries.
select sale_name, count(cust_id), udfcount(cust_id)
from salesperson, customer
where salesperson.sale_id = customer.sale_id
group by sale_name;
select sale_name, count(distinct state), udfcount(distinct state)
from salesperson, customer
where salesperson.sale_id = customer.sale_id
group by sale_name;
The next example uses the subquery function to return a percentage of total values in a single select statement.
select state, 100*sum(amount)/subquery("select sum(amount) from sales_order") pct
from customer, sales_order
where customer.cust_id = sales_order.cust_id
group by state;
The implementation of these UDFs is described in subsequent sections.
14.1.1 UDF Implementation
Keep the following concepts in mind when programming your UDF module.
• The module must include a load function named udfDescribeFcns which identifies all of the UDFs implemented in the module.
• Each UDF can optionally include an initialization function of type UDFINIT. If you define this function, SQL calls it when the UDF begins executing.
• Each UDF must include a function of type UDFCHECK that performs type checking on the UDF's arguments. UDFs can take any number of arguments of any type, and the value returned can be of any data type, except long varchar, long wvarchar, or long varbinary.
• The main UDF function, of type UDFFUNC, performs processing for the UDF.
• An aggregate UDF must include a reset function of type UDFRESET that is called by SQL when the group by value changes in order to reset the aggregate calculations.
• The UDF can optionally include a cleanup function of type UDFCLEANUP. If defined, this function is called by SQL each time UDF execution is completed.
• If the UDF is running on Microsoft Windows, the UDF must include LibMain.
• A scalar UDF must, at a minimum, include a type checking function (UDFCHECK) and the processing function (UDFFUNC) itself.
• Each UDF should declare REXTERNAL in its function definition.
There are other specialized functions that can be used for implementing UDFs. The SQL UDF support functions (SYS prefix)
allow UDFs to perform low-level database operations, associate SQL modification commands with client application transactions, and use the decimal arithmetic capabilities of RDM Server SQL.
UDF implementations also can use the SQL date and time manipulation functions (VAL prefix). By connecting into SQL's
internal arithmetic functions, these functions allow the UDFs to include mixed-mode arithmetic operations. The results of
mixed-mode arithmetic operations follow standard C-language rules.
UDF Module Header Files
Your UDF module code must include the header file named emmain.h. This header addresses platform-specific implementation and also includes all other standard header files (e.g., sqlrds.h and sqlsys.h) that you will need in your UDF
module. In order to use this header, you must precede the #include with two #define declarations. The first #define
specifies the name of the UDF module (in uppercase). The second #define identifies the module type (the emmain.h file
is used for all types of server extensions). The following code fragment shows the use of emmain.h for the sample UDF module.
/* Definitions and header to setup EM ------------ */
/* (all EMs must have a code block like this) --- */
#define EM_NAME UDF     /* the uppercased module name */
#define EMTYPE_UDF      /* EMTYPE_EM, EMTYPE_UDF, EMTYPE_UDP, EMTYPE_IEF */
#include "emmain.h"     /* must follow the above defs and OS #includes */
The header files that contain definitions used in UDF modules are listed in Table 13-2. The files can be found in the
include directory.
Table 13-2. UDF Module Header Files
Header File   Description
emmain.h      RDM Server standard extension module header. Use #include to add it to all extension, UDF, UDP, and IEF modules. Automatically includes the sqlrds.h and sqlsys.h files for UDF, UDP, and IEF modules.
sqlrds.h      RDM Server SQL extensions header file. Includes prototypes and data definitions for the C-language extension module functions used with RDM Server SQL. Provides access to all RDM Server SQL capabilities. This file automatically includes the sqlext.h file.
sqlsys.h      RDM Server SQL UDF header file. Includes UDF function type declarations, UDF specific data type definitions, and SYS function prototypes.
Function udfDescribeFcns
Each UDF library module must contain a function named udfDescribeFcns that has arguments declared as shown in the
prototype specification below. This function is called when the first SQL statement that contains a reference to one of the
functions in the library is compiled. The responsibility of udfDescribeFcns is to return a pointer to a function description
table containing all of the entry point information. In addition, an optional module description string can be returned that will
be displayed on the RDM Server system console indicating that the UDF library module has been loaded.
/* ============================================================
   User function description, called when statement is prepared
*/
void REXTERNAL udfDescribeFcns (
    unsigned short *NumFcns,       /* out: number of functions in module */
    PUDFLOADTABLE  *UDFLoadTable,  /* out: points to UDFLOADTABLE array */
    char          **fcn_descr)     /* out: optional description string */
{
    *NumFcns      = RLEN(UdfTable);
    *UDFLoadTable = UdfTable;
    *fcn_descr    = "Sample of SQL user-defined functions";
}
The UDFLoadTable is a struct array of type UDFLOADTABLE. There must be one entry defined in the array for each
UDF supported by the module. The declaration for UDFLOADTABLE is contained in the header file sqlsys.h and is
shown below.
typedef struct udfloadtable {
    char       udfName[33];   /* name of user function */
    UDFFUNC    udfCall;       /* address of user function */
    UDFCHECK   udfCheck;      /* type checking call */
    UDFINIT    udfInit;       /* initialization for user function */
    UDFCLEANUP udfCleanup;    /* cleanup for user function */
    UDFRESET   udfReset;      /* reset for user function */
} UDFLOADTABLE, *PUDFLOADTABLE;
Each element of the UDFLOADTABLE struct is described in the following table.
Table 13-3. UDFLOADTABLE Struct Element Descriptions
Element       Description
udfName       The name of the function. Must conform to a standard SQL identifier. It is case-insensitive and unique (system-wide).
udfCall       Pointer to the call processing function.
udfCheck      Pointer to the argument type checking function. Assign to NULL if there are no arguments.
udfInit       Pointer to pre-execution initialization function.
udfCleanup    Pointer to post-execution cleanup function.
udfReset      Pointer to function that resets the group calculation values for an aggregate function. Assign to NULL for scalar functions.
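Because function names are case-insensitive, a name registered in one spelling matches any other spelling used in a statement. The sketch below illustrates that kind of lookup in plain C. It is not how RDM Server resolves names internally (that is not documented here); name_equal and find_function are hypothetical helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of two identifiers; returns 1 if equal. */
static int name_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* Returns the index of name in names[0..count-1], or -1 if absent. */
static int find_function(const char *const *names, size_t count,
                         const char *name)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (name_equal(names[i], name))
            return (int)i;
    return -1;
}
```

With a rule like this, an entry registered under one capitalization is found when a statement uses another, while a name that merely shares a prefix is not.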
Each UDFLOADTABLE entry must specify the name of the UDF and the address of at least two functions: the function
(udfCall) that actually performs the operation, and another function (udfCheck) that is called during the compilation of
an SQL statement that uses the UDF to perform type checking. The type of the argument expression(s) is passed into the function that must validate the argument type and return the result type. In addition, you can optionally specify: 1) the address of
a function (udfInit) that is called when the statement is first executed to perform any necessary initialization, and 2) the
address of a function (udfCleanup) that is called after the execution has completed (for example, after SQLFetch returns
SQL_NO_DATA_FOUND). Aggregate functions are also required to provide the address of a function (udfReset) that
resets the accumulator variables when the grouping value changes. Unused function entries should be NULL.
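The calling sequence described above (check, init, one call per row, reset on a group change, cleanup) can be sketched with a simplified function table. Everything here is hypothetical: demo_entry only resembles UDFLOADTABLE, the signatures are much simpler than the real UDFFUNC, UDFCHECK, UDFINIT, UDFCLEANUP, and UDFRESET types, and run_aggregate stands in for the SQL engine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified load-table entry. The real UDFLOADTABLE uses
   RDM Server types and different function signatures. */
typedef struct {
    const char *name;
    void (*call)(void *ctx, int arg, long *result);  /* required */
    int  (*check)(int noargs);                       /* required */
    void (*init)(void *ctx);                         /* optional */
    void (*cleanup)(void *ctx);                      /* optional */
    void (*reset)(void *ctx);                        /* aggregates only */
} demo_entry;

typedef struct {
    long count;     /* running count for the current group */
    int  cleaned;   /* set by the cleanup function */
} demo_count_ctx;

static void cnt_call(void *ctx, int arg, long *result)
{
    demo_count_ctx *c = ctx;

    (void)arg;                  /* a count ignores the argument value */
    *result = ++c->count;       /* running result on every call */
}
static int  cnt_check(int noargs)  { return noargs == 1; }
static void cnt_init(void *ctx)    { demo_count_ctx *c = ctx; c->count = 0; c->cleaned = 0; }
static void cnt_cleanup(void *ctx) { demo_count_ctx *c = ctx; c->cleaned = 1; }
static void cnt_reset(void *ctx)   { demo_count_ctx *c = ctx; c->count = 0; }

static const demo_entry demo_count = {
    "democount", cnt_call, cnt_check, cnt_init, cnt_cleanup, cnt_reset
};

/* Runs an aggregate over rows that are already ordered by group.
   Writes one result per group into out and returns the group count. */
static size_t run_aggregate(const demo_entry *e, void *ctx,
                            const int *groups, const int *args,
                            size_t nrows, long *out)
{
    size_t i, ngroups = 0;
    long   result = 0;

    assert(e->call != NULL && e->check != NULL);
    if (!e->check(1))
        return 0;
    if (e->init != NULL)
        e->init(ctx);
    for (i = 0; i < nrows; i++) {
        if (i > 0 && groups[i] != groups[i - 1]) {
            out[ngroups++] = result;    /* keep the last running value */
            if (e->reset != NULL)
                e->reset(ctx);
        }
        e->call(ctx, args[i], &result);
    }
    if (nrows > 0)
        out[ngroups++] = result;
    if (e->cleanup != NULL)
        e->cleanup(ctx);
    return ngroups;
}
```

The driver keeps only the last running value of each group, which is why the processing function has to return a usable result on every call.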
The code that defines the UDFLOADTABLE and the udfDescribeFcns code for the examples given in the udf.c module is shown below.
/*----------------------------------------------------------------------
   Function prototypes
----------------------------------------------------------------------*/
/* user function for udfcount */
UDFCHECK   CntCheck;
UDFFUNC    CntFunc;
UDFINIT    CntInit;
UDFCLEANUP CntCleanup;
UDFRESET   CntReset;

/* user function for standard deviation */
UDFCHECK   TypeCheck;
UDFFUNC    StdFunc;
UDFINIT    StdInit;
UDFCLEANUP StdCleanup;
UDFRESET   StdReset;

/* user function for sample standard deviation */
UDFFUNC    StdsFunc;

/* user function for subquery function */
UDFCHECK   QueryCheck;
UDFFUNC    QueryFunc;
UDFINIT    QueryInit;
UDFCLEANUP QueryCleanup;

/* user function for HaveProduct trigger */
UDFCHECK   InvCheck;
UDFFUNC    InvFunc;
UDFINIT    InvInit;
UDFCLEANUP InvCleanup;

/* user function for OKayToShip trigger */
UDFCHECK   ShipCheck;
UDFFUNC    ShipFunc;
UDFINIT    ShipInit;
UDFCLEANUP ShipCleanup;
/*----------------------------------------------------------------------
   Table of user-defined functions for this module
----------------------------------------------------------------------*/
/* table of user functions callable from within an sql expression */
static UDFLOADTABLE UdfTable[] = {
    /* name          UDFFUNC    UDFCHECK    UDFINIT    UDFCLEANUP    UDFRESET */
    /* ------------- ---------- ----------- ---------- ------------- -------- */
    {"std",          StdFunc,   TypeCheck,  StdInit,   StdCleanup,   StdReset},
    {"stds",         StdsFunc,  TypeCheck,  StdInit,   StdCleanup,   StdReset},
    {"SubQuery",     QueryFunc, QueryCheck, QueryInit, QueryCleanup, NULL},
    {"udfCount",     CntFunc,   CntCheck,   CntInit,   CntCleanup,   CntReset},
    {"HaveProduct",  InvFunc,   InvCheck,   InvInit,   InvCleanup,   NULL},
    {"OKayToShip",   ShipFunc,  ShipCheck,  ShipInit,  ShipCleanup,  NULL}
};
/* =====================================================================
   User function description, called when statement is prepared
*/
void REXTERNAL udfDescribeFcns(
    uint16         *NumFcns,       /* out: number of functions in module */
    PUDFLOADTABLE  *UDFLoadTable,  /* out: points to UdfTable above */
    char          **fcn_descr)     /* out: optional description string */
{
    *NumFcns      = RLEN(UdfTable);
    *UDFLoadTable = UdfTable;
    *fcn_descr    = "Sample of SQL user-defined functions";
}
SQL Data VALUE Container Description
The VALUE data type that is passed to both the type checking and the processing function is a multi-type value container
declared as shown below. The type field contains the standard SQL_* data type constant (for example, SQL_INTEGER). The
vt union declares a container variable for values of each SQL data type.
typedef struct _value {
    int16 type;                   /* data type of value (SQL_*) */
    int16 cmpfcn;                 /* INTERNAL USE ONLY */
    union {
        int8            tv;       /* SQL_TINYINT | SQL_BIT */
        int16           sv;       /* SQL_SMALLINT */
        int32           lv;       /* SQL_INTEGER | SQL_DATE | SQL_TIME */
        int64           llv;      /* SQL_BIGINT */
        float           fv;       /* SQL_REAL */
        double          dv;       /* SQL_FLOAT */
        const BCD_X    *bv;       /* SQL_DECIMAL/SQL_NUMERIC (unpacked) */
        const BCD_Z    *zv;       /* SQL_DECIMAL/SQL_NUMERIC (packed) */
        BINVAR          xv;       /* SQL_BINARY | SQL_VARBINARY */
        LONGVAR         lvv;      /* SQL_LONGVAR(CHAR|BINARY) */
        TIMESTAMP_VAL   tsv;      /* SQL_TIMESTAMP */
        const char     *cv;       /* SQL_CHAR || SQL_VARCHAR */
        const DB_WCHAR *wcv;      /* SQL_WCHAR || SQL_WVARCHAR */
    } vt;
} VALUE;
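A container like this is read by switching on the type field and then using the matching union member. The sketch below shows the idea on a reduced, hypothetical container: demo_value and the DEMO_* tags are stand-ins for VALUE and the SQL_* constants, and demo_to_double is an invented helper, not an RDM Server function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags; the real container uses the SQL_* constants. */
enum demo_type { DEMO_NULL, DEMO_SMALLINT, DEMO_INTEGER, DEMO_FLOAT, DEMO_CHAR };

/* Simplified container in the style of VALUE. */
typedef struct {
    enum demo_type type;
    union {
        short       sv;     /* DEMO_SMALLINT */
        long        lv;     /* DEMO_INTEGER */
        double      dv;     /* DEMO_FLOAT */
        const char *cv;     /* DEMO_CHAR */
    } vt;
} demo_value;

/* Converts a numeric container to double. Returns 1 on success and
   0 if the value is null or not numeric. */
static int demo_to_double(const demo_value *v, double *out)
{
    assert(v != NULL && out != NULL);
    switch (v->type) {
    case DEMO_SMALLINT: *out = (double)v->vt.sv; return 1;
    case DEMO_INTEGER:  *out = (double)v->vt.lv; return 1;
    case DEMO_FLOAT:    *out = v->vt.dv;         return 1;
    default:            return 0;
    }
}
```

Only the member that corresponds to the type tag holds a meaningful value, so code should never read a union member without checking the tag first.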
Function udfInit
The code for udfCount will be used to explain how you would use each of the five functions. Function CntInit, shown
below, is called to initialize processing of a udfcount reference in a specific SQL statement. Initialization functions are
passed two arguments. The first is the system handle that is used by SQL to identify and maintain the context of the executing statement. The second argument is the address of a void pointer into which you may return a function context pointer that
you allocate. The allocated buffer will be stored by SQL with the statement context associated with hSys. It can contain anything you want. In this example, COUNT_CTX contains the memory allocation tag and a long that will contain the current
count value.
Although you can use the standard malloc and free memory allocation functions, we recommend that you use the RDM
Server resource manager memory allocation function rm_getMemory. An SQL UDF support function called
SYSMemoryTag returns the memory allocation tag that you should use in your calls to rm_getMemory. Memory allocated
with this tag remains active for the life of the statement that contains the call to the UDF. When the statement has terminated,
memory will be automatically freed by SQL. In the rare event that the server should not have enough memory for your rm_getMemory request, SQL will gracefully abort the statement execution and return status SQL_ERROR (errSRVMEMORY)
to the application.
The following example shows the CntInit initialization function for the sample aggregate UDF, udfCount. Note the
COUNT_CTX structure defining the UDF context.
/* used by udfcount */
typedef struct count_cxt {
    RM_MEMTAG mTag;
    int32     count;
} COUNT_CTX;

/* ============================================================
   Initialization function for CntFunc()
*/
int16 REXTERNAL CntInit(
    HSYS   hSys,   /* in: system handle */
    void **cxtp)   /* in: statement context pointer */
{
    COUNT_CTX *cnt;
    RM_MEMTAG  mTag;

    SYSMemoryTag(hSys, &mTag);
    cnt = *cxtp = rm_getMemory(sizeof(COUNT_CTX), mTag);
    cnt->mTag = mTag;
    cnt->count = 0L;
    return SQL_SUCCESS;
}
Function udfCheck
The udfCheck function performs type checking on the argument expression(s) that are passed to the function. Function
CntCheck, shown below, does this for udfCount. In this case, however, the job is quite simple in that the result is independent of the data type of the argument and always returns an integer (int32). You typically need to check both the number of arguments and the type of the arguments required by the function. If either is incorrect the function will return status
SQL_ERROR, and the result will be assigned a character string value with a specific error message to be returned to the user
that submitted the erroneous call. The CntCheck function shown below ensures that only one argument expression has been
passed. If not, the result container is used to return an error message and the function returns status SQL_ERROR indicating
the fault.
UNREF_PARM is an RDM Server macro that references an unused function parameter to meet compiler
requirements. Note the absence of a ";" at the end of the calls to this macro.
int16 REXTERNAL CntCheck (
    HSYS         hSys,     /* in: system handle */
    int16        noargs,   /* in: number of arguments to function */
    const VALUE *args,     /* in: array of arguments */
    VALUE       *result,   /* out: result value */
    int16       *len)      /* out: max length result string */
{
    int16 status = SQL_SUCCESS;

    UNREF_PARM(hSys);
    UNREF_PARM(args);
    UNREF_PARM(len);
    if (noargs != 1) {
        result->type = SQL_CHAR;
        result->vt.cv = "only 1 argument expression is allowed";
        status = SQL_ERROR;
    }
    else
        result->type = SQL_INTEGER;
    return status;
}
Type checking for the subquery UDF involves compiling the statement to ensure that it does not have any errors. If there are
errors, the result value is set to type SQL_SMALLINT and the result value is the RDM Server SQL error code returned from
SQLError (retrieved by the call to SQLGetDiagRec). When SQL_ERROR is returned from a type checking function, if the
result type is SQL_CHAR then SQL understands it to be a descriptive error message. If the result type is SQL_SMALLINT
then SQL understands it to be a specific RDM Server SQL error code (for example, errMISMATCH). In the latter case, this
will be the error code returned to the calling program. You can see that QueryCheck utilizes both methods of UDF error communication.
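The two ways of reporting a type-checking failure can be illustrated with a reduced model. This is a sketch under assumptions, not RDM Server code: chk_value and the CHK_* names are stand-ins for VALUE and the SQL_* constants, CHK_ERR_MISMATCH is an arbitrary number standing in for a real error code such as errMISMATCH, and chk_describe_error only imitates how a caller might interpret the result container.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { CHK_OK = 0, CHK_ERROR = -1 };
enum chk_type { CHK_SMALLINT, CHK_INTEGER, CHK_CHAR };

/* Simplified result/argument container in the style of VALUE. */
typedef struct {
    enum chk_type type;
    union { short sv; long lv; const char *cv; } vt;
} chk_value;

#define CHK_ERR_MISMATCH 1234   /* hypothetical numeric error code */

/* Type check: exactly one argument, which must be an integer. */
static int chk_check(int noargs, const chk_value *args, chk_value *result)
{
    assert(result != NULL);
    if (noargs != 1) {
        /* Method 1: descriptive message in a character result. */
        result->type = CHK_CHAR;
        result->vt.cv = "only 1 argument expression is allowed";
        return CHK_ERROR;
    }
    if (args[0].type != CHK_INTEGER) {
        /* Method 2: specific error code in a smallint result. */
        result->type = CHK_SMALLINT;
        result->vt.sv = CHK_ERR_MISMATCH;
        return CHK_ERROR;
    }
    result->type = CHK_INTEGER;     /* type the function will return */
    return CHK_OK;
}

/* Formats a diagnostic for a failed check, depending on which of the
   two methods the check function used. */
static void chk_describe_error(const chk_value *result, char *buf,
                               size_t buflen)
{
    if (result->type == CHK_CHAR)
        snprintf(buf, buflen, "error: %s", result->vt.cv);
    else
        snprintf(buf, buflen, "error code %d", (int)result->vt.sv);
}
```

On success the same result container carries the return type of the function instead, so the type field serves both purposes.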
In order to call a standard RDM Server SQL API function from a UDF, it is necessary to establish a connection handle that
corresponds to the connection handle of the statement that is executing the subquery reference. Function SYSSessionId
returns the RDS session identifier associated with the SQL system handle. Function SQLConnectWith is then called with
that session handle to return the proper connection handle. All of the SQL functions will be passed this connection handle,
which is identified by the SQL system as the same connection as that of the invoking user.
int16 REXTERNAL QueryCheck (
    HSYS         hSys,     /* in: system handle */
    int16        noargs,   /* in: number of arguments to function */
    const VALUE *args,     /* in: array of arguments */
    VALUE       *result,   /* out: result value */
    int16       *len)      /* out: max length result string */
{
    /* NOTE: The argument to subquery MUST be a string literal in order
       for this to work.
    */
    SQLHENV     hEnv;    /* environment handle for SQL calls */
    SQLHDBC     hDbc;    /* connection handle for SQL calls */
    RDM_SESS    hSess;   /* RDM session id */
    SQLHSTMT    hStmt;
    SQLRETURN   ret;
    SQLSMALLINT colcount, parms;
    int16       status = SQL_SUCCESS;

    UNREF_PARM(noargs);
    UNREF_PARM(len);

    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv);
    SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc);
    SYSSessionId(hSys, &hSess);
    SQLConnectWith(hDbc, hSess);
    SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt);

    if ((ret = SQLPrepare(hStmt, (SQLCHAR *)args[0].vt.cv, SQL_NTS)) != SQL_SUCCESS ) {
        result->type = SQL_SMALLINT;
        SQLGetDiagRec(SQL_HANDLE_STMT, 1, &result->vt.lv, NULL, 0, NULL);
        status = SQL_ERROR;
    }
    else {
        SQLNumResultCols(hStmt, &colcount);
        if (colcount > 1) {
            result->type = SQL_CHAR;
            result->vt.cv = "more than one result column";
            status = SQL_ERROR;
        }
        else {
            SQLNumParams(hStmt, &parms);
            if (parms) {
                result->type = SQL_CHAR;
                result->vt.cv = "no argument markers allowed";
                status = SQL_ERROR;
            }
            else
                SQLDescribeCol(hStmt, 1, NULL, 0, NULL, &result->type, NULL, NULL, NULL);
        }
    }
    SQLFreeHandle(SQL_HANDLE_STMT, hStmt);
    SQLDisconnect(hDbc);
    SQLFreeHandle(SQL_HANDLE_DBC, hDbc);
    SQLFreeHandle(SQL_HANDLE_ENV, hEnv);
    return status;
}
Function udfFunc
The UDF processing function is called by SQL from the udfFunc entry in UDFLOADTABLE during execution of the SQL
statement that references the UDF. It is called once for each row that is retrieved by the statement. The function result is
returned in the VALUE container pointed to by argument result.
The following example illustrates the aggregate UDF processing function, CntFunc, defined for the udfCount UDF. Note that the
result value returns the current count for each row processed, even though only the final aggregate value is used. Aggregate
calculations must return a running result from every call to the processing function, because there is no way of
knowing from within the UDF when RDM Server will call the function for the last time. The result SQL type and value (in
this case, the type is SQL_INTEGER and the value is the current count) are returned in the result output argument, and
SQL_SUCCESS is returned as the function status.
int16 REXTERNAL CntFunc (
    HSYS         hSys,    /* in:  system handle */
    void       **cxtp,    /* in:  statement context pointer */
    int16        noargs,  /* in:  number of arguments to function */
    const VALUE *args,    /* in:  array of arguments */
    VALUE       *result)  /* out: result value */
{
    COUNT_CTX *cnt = *cxtp;

    UNREF_PARM(hSys);
    UNREF_PARM(noargs);

    result->type = SQL_INTEGER;
    if (args[0].type != SQL_NULL)
        result->vt.lv = ++cnt->count;
    else
        result->vt.lv = cnt->count;
    return SQL_SUCCESS;
}
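The running-count behavior can be tried in isolation with a small stand-alone sketch. It does not use the RDM Server headers: MOCK_VALUE, MOCK_COUNT_CTX, mock_cnt_func, mock_count_rows and the MOCK_SQL_* constants are hypothetical stand-ins for VALUE, COUNT_CTX, CntFunc and the SQL type codes, introduced here only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the RDM Server types (not the real API). */
enum { MOCK_SQL_NULL = 0, MOCK_SQL_INTEGER = 4 };

typedef struct {
    int  type;   /* MOCK_SQL_NULL or MOCK_SQL_INTEGER */
    long lv;     /* integer payload */
} MOCK_VALUE;

typedef struct {
    long count;  /* running count for the current group */
} MOCK_COUNT_CTX;

/* Called once per row: reports the running count, skipping NULL arguments. */
static int mock_cnt_func(MOCK_COUNT_CTX *cnt, const MOCK_VALUE *arg,
                         MOCK_VALUE *result)
{
    result->type = MOCK_SQL_INTEGER;
    if (arg->type != MOCK_SQL_NULL)
        result->lv = ++cnt->count;
    else
        result->lv = cnt->count;
    return 0;
}

/* Drives the function over an array of rows, the way a caller that does not
   announce the last row would. Whatever is left in result after the final
   call is the aggregate value. */
static long mock_count_rows(const MOCK_VALUE *rows, size_t nrows)
{
    MOCK_COUNT_CTX ctx = { 0 };
    MOCK_VALUE result = { MOCK_SQL_INTEGER, 0 };
    size_t i;

    for (i = 0; i < nrows; ++i)
        mock_cnt_func(&ctx, &rows[i], &result);
    return result.lv;
}
```

Because every call leaves a valid running value in result, the caller can stop after any row and still have the correct count for the rows seen so far.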
The processing function for the subquery UDF is shown below. Even though QueryCheck (see above) compiled the specified
select statement, QueryFunc needs to compile it as well, because the statement containing the subquery reference may be
contained in a precompiled stored procedure. QueryFunc is therefore called in a (much) different context than
QueryCheck was. The NULL context pointer is the signal to allocate the context and to compile, execute, and fetch
the subquery result. Notice that all of the work occurs on the first call to QueryFunc. All subsequent calls simply return the
subquery's result value.
int16 REXTERNAL QueryFunc (
    HSYS         hSys,    /* in:  system handle */
    void       **cxtp,    /* in:  statement context pointer */
    int16        noargs,  /* in:  number of arguments to function */
    const VALUE *args,    /* in:  array of arguments */
    VALUE       *result)  /* out: result value */
{
    SUBQ_CTX   *sqp = *cxtp;  /* local context */
    SQLHENV     hEnv;         /* environment handle for SQL calls */
    SQLHDBC     hDbc;         /* connection handle for SQL calls */
    RDM_SESS    hSess;        /* RDM session id */
    SQLHSTMT    hStmt;
    SQLUINTEGER prec;
    SQLPOINTER  ptr;
    RM_MEMTAG   mTag;
    int16       status = SQL_SUCCESS;

    UNREF_PARM(noargs);

    if (sqp == NULL) {
        /* first call: allocate the context, then compile, execute and fetch */
        SYSMemoryTag(hSys, &mTag);
        sqp = *cxtp = rm_getMemory(sizeof(SUBQ_CTX), mTag);
        sqp->mTag = mTag;

        SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &hEnv);
        SQLAllocHandle(SQL_HANDLE_DBC, hEnv, &hDbc);
        SYSSessionId(hSys, &hSess);
        SQLConnectWith(hDbc, hSess);
        SQLAllocHandle(SQL_HANDLE_STMT, hDbc, &hStmt);

        SQLPrepare(hStmt, (UCHAR *) args[0].vt.cv, SQL_NTS);
        SQLDescribeCol(hStmt, 1, NULL, 0, NULL, &sqp->result.type, &prec, NULL, NULL);

        /* test the described column type; *result has not been set yet */
        if (sqp->result.type == SQL_CHAR || sqp->result.type == SQL_VARCHAR)
            ptr = sqp->result.vt.cv = rm_getMemory(prec, mTag);
        else
            ptr = &sqp->result.vt;

        SQLBindCol(hStmt, 1, SQL_C_DEFAULT, ptr, prec, NULL);
        SQLExecute(hStmt);
        SQLFetch(hStmt);
        *result = sqp->result;

        if (SQLFetch(hStmt) != SQL_NO_DATA) {
            result->type = SQL_CHAR;
            result->vt.cv = "subquery() must return single row";
            status = SQL_ERROR;
        }
        else
            sqp->result = *result;

        SQLFreeHandle(SQL_HANDLE_STMT, hStmt);
        SQLDisconnect(hDbc);
        SQLFreeHandle(SQL_HANDLE_DBC, hDbc);
        SQLFreeHandle(SQL_HANDLE_ENV, hEnv);
    }
    else
        *result = sqp->result;  /* later calls: return the cached value */

    return status;
}
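The "do all the work on the first call, then return the cached value" pattern used by QueryFunc can be reduced to a few lines of plain C. The sketch below is not RDM Server code: MOCK_SUBQ_CTX, mock_expensive_query, mock_query_func, mock_query_cleanup and the mock_query_runs counter are hypothetical names, and malloc/free stand in for the tagged memory functions.

```c
#include <stdlib.h>

/* Hypothetical context: caches one computed value for the statement. */
typedef struct {
    long cached;   /* result computed on the first call */
} MOCK_SUBQ_CTX;

/* Counts how often the expensive work runs (for demonstration only). */
static int mock_query_runs = 0;

/* Stand-in for "compile, execute and fetch the subquery". */
static long mock_expensive_query(void)
{
    ++mock_query_runs;
    return 42L;
}

/* Per-row entry point. A NULL *cxtp means "first call".
   Returns 0 on success, -1 if the context cannot be allocated. */
static int mock_query_func(void **cxtp, long *result)
{
    MOCK_SUBQ_CTX *sqp = *cxtp;

    if (sqp == NULL) {
        sqp = malloc(sizeof *sqp);
        if (sqp == NULL)
            return -1;
        sqp->cached = mock_expensive_query();
        *cxtp = sqp;               /* remember the context for later calls */
    }
    *result = sqp->cached;
    return 0;
}

/* Releases the context and clears the caller's pointer. */
static void mock_query_cleanup(void **cxtp)
{
    free(*cxtp);
    *cxtp = NULL;
}
```

However many rows the outer statement processes, the expensive step runs once per statement context, which is the property the subquery UDF relies on.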
Function udfReset
The udfReset function is only used in an aggregate UDF to perform a reset after the grouping value changes. In the following example, the CntReset function for the udfCount UDF clears the accumulator variables for the last aggregate, restarting the count for the next group to zero.
int16 REXTERNAL CntReset (
    HSYS   hSys,   /* in: system handle */
    void **cxtp)   /* in: statement context pointer */
{
    COUNT_CTX *cnt = *cxtp;

    UNREF_PARM(hSys);

    cnt->count = 0L;
    return SQL_SUCCESS;
}
Function udfCleanup
Your UDF can optionally include a cleanup function in the udfCleanup entry for each UDF defined in the
UDFLOADTABLE. When SQL statement processing is complete, SQL calls this function to free memory allocated by the
udfInit function, or any memory allocated during statement execution. For the sample UDF, udfCount, the cleanup function is called CntCleanup. As shown below, CntCleanup simply frees the context pointer.
Do not ever call rm_freeTagMemory within udfCleanup using the memory tag you acquired with
SYSMemoryTag. This tag is associated with aspects of the statement's memory that RDM Server uses
after udfCleanup returns. Rather, free the memory "manually" using rm_freeMemory.
void REXTERNAL CntCleanup (
    HSYS   hSys,   /* in: system handle */
    void **cxtp)   /* in: statement context pointer */
{
    COUNT_CTX *cnt = *cxtp;

    UNREF_PARM(hSys);

    rm_freeMemory(cnt, cnt->mTag);
    *cxtp = NULL;
}
14.1.2 Using a UDF as a Trigger
The definition and use of standard SQL triggers was described in Chapter 8, where a trigger was defined as "a procedure
associated with a table that is executed (i.e., fired) whenever that table is modified by the execution of an insert,
update, or delete statement." The standard database triggers described in that chapter are implemented using SQL
statements only. If a trigger requires more complex processing than can be done with SQL statements, then either the
standard trigger must call a user-defined procedure (see section 14.2) to do the work, or the trigger can be implemented
through a UDF used in conjunction with the table's check clause, as described in this section.
In the database schema, you can define a trigger UDF in the check clause of the create table statement for a particular
table. The UDF returns a value (usually 1 for true and 0 for false) that is checked in the check condition. If the result of the
condition is true, SQL allows the modification to occur. If the result is false, the modification is rejected.
The example UDF module (udf.c) includes two trigger UDFs: HaveProduct and OkayToShip. The create table schema statements
that reference them are given below. Note that the prod_id and loc_id columns in the item table of the sales database
reference the corresponding primary keys in the product and outlet tables in the invntory database.
create table item
(
ord_num smallint not null references sales_order,
prod_id smallint not null references invntory.product,
loc_id char(3) not null references invntory.outlet,
quantity integer not null "number of units of product ordered",
check(HaveProduct(ord_num, prod_id, loc_id, quantity) = 1)
) in salesd1;
create table ship_log
(
ord_date timestamp default now "date/time when order was entered",
ord_num smallint not null "order number",
prod_id smallint not null "product id number",
loc_id char(3) not null "outlet location id",
quantity integer not null "quantity of item to be shipped from loc_id",
backordered smallint default 0 "set to 1 when item is backordered",
check(OKayToShip(ord_num,prod_id,loc_id,quantity,backordered) = 1)
) in salesd0;
The HaveProduct UDF automatically manages the invntory database and the ship_log table. When your RDM Server SQL
application executes an insert statement on the item table, HaveProduct looks up the on_hand record for the specified
prod_id and loc_id columns. If there are enough items available, HaveProduct subtracts the ordered amount of the item from
the quantity in the on_hand record and inserts a row in the ship_log table, from which a packing list will be created. If
there are not enough items available, HaveProduct assigns the quantity that is available to the order (that is, it sets the
on_hand quantity to zero) and inserts a row in ship_log for that quantity. It then inserts an additional row in ship_log,
with the backordered flag (used by the OkayToShip UDF) set to 1, for the remaining amount needed to fill the order.
When the RDM Server SQL application uses HaveProduct with a delete statement, the UDF adds the number of items ordered
to the on_hand record and sets quantity in ship_log to 0 for the appropriate rows. The application can delete rows in ship_log
when an order is actually shipped. When the application executes a delete statement for this table, OkayToShip checks the
backordered flag. If the flag is set, the UDF rechecks the inventory to see if there are now enough items from which to fill the
order. If there are still not enough, the trigger UDF rejects the delete request. If enough items are available, OkayToShip
updates the inventory and processes the required number of items.
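The insert-time arithmetic described above can be written as a small pure function. This is only a sketch of the bookkeeping, with no database access: the ALLOCATION type and the allocate_order function are hypothetical names, and non-negative quantities are assumed.

```c
/* Outcome of applying one order line to inventory (hypothetical helper type). */
typedef struct {
    long shipped;      /* quantity logged in ship_log with backordered = 0 */
    long backordered;  /* quantity logged with backordered = 1 (0 if none) */
    long new_on_hand;  /* on_hand quantity after the insert */
} ALLOCATION;

/* Splits an ordered quantity into what can ship now and what is backordered.
   Assumes on_hand >= 0 and ordered >= 0. */
static ALLOCATION allocate_order(long on_hand, long ordered)
{
    ALLOCATION a;

    if (on_hand >= ordered) {
        /* all needed inventory is available */
        a.shipped     = ordered;
        a.backordered = 0;
        a.new_on_hand = on_hand - ordered;
    } else {
        /* use what is there and backorder the rest */
        a.shipped     = on_hand;
        a.backordered = ordered - on_hand;
        a.new_on_hand = 0;
    }
    return a;
}
```

For example, an order of 8 units against 3 on hand ships 3, backorders 5, and leaves the on_hand quantity at zero.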
The HaveProduct and OkayToShip trigger UDFs use the SQL statements shown below, along with the
SALES_CTX structure containing the UDF statement context data. The statements are compiled in
udfInit for each trigger UDF. The needed statements are executed by the udfFunc processing
functions.
/* HaveProduct & OKayToShip SQL statements: */
static char inv_cursor[]=
    "select quantity from on_hand where prod_id=? and loc_id=?;";
static char inv_update[]=
    "update on_hand set quantity = ? where current of inv_cursor";
static char shp_insert[]=
    "insert into ship_log values(now, ?, ?, ?, ?, ?)";
static char shp_update[]=
    "update ship_log set ord_date = now, quantity = 0 where "
    "ord_num=? and prod_id=? and loc_id=?";
static char ord_update[]=
    "update sales_order set ship_date = now where ord_num=?";

/* HaveProduct and OKayToShip context data */
typedef struct sales_ctx {
    RM_MEMTAG mTag;     /* system memory allocation tag */
    int16     stype;    /* statement type (e.g. sqlINSERT) */
    SQLHENV   henv;     /* SQL environment handle */
    SQLHDBC   hdbc;     /* SQL connection handle */
    SQLHSTMT  hInvSel;  /* SQL statement handle for inv_cursor */
    SQLHSTMT  hInvUpd;  /* SQL statement handle for inv_update */
    SQLHSTMT  hShpIns;  /* SQL statement handle for shp_insert */
    SQLHSTMT  hShpUpd;  /* SQL statement handle for shp_update or ord_update */
} SALES_CTX;
The InvInit function shown below is the initialization function (type UDFINIT) for HaveProduct. It calls SQLConnectWith
to use the same connection handle as the calling application, to ensure that the database changes made by the UDF are
included in the transaction of the calling application. Thus, if the application executes a rollback statement for the transaction, the HaveProduct changes will be rolled back as well. Note also the use of SYSDescribeStmt to determine which
type of operation (insert, delete, etc.) the application is performing on the table.
static int16 REXTERNAL InvInit (
    HSYS   hSys,   /* in: system handle */
    void **cxtp)   /* in: statement context pointer */
{
    SALES_CTX *stp;
    RDM_SESS   hsess;
    RM_MEMTAG  mTag;
    int16      status = SQL_SUCCESS;

    SYSMemoryTag(hSys, &mTag);
    stp = *cxtp = rm_getMemory(sizeof(SALES_CTX), mTag);
    stp->mTag = mTag;

    SYSDescribeStmt(hSys, &stp->stype);
    if ( stp->stype == sqlINSERT || stp->stype == sqlDELETE ) {
        /* connect to calling statement's connection */
        SQLAllocHandle(SQL_HANDLE_ENV, NULL, &stp->henv);
        SQLSetEnvAttr(stp->henv, SQL_ATTR_ODBC_VERSION,
                      (SQLPOINTER)SQL_OV_ODBC3, SQL_IS_INTEGER);
        SQLAllocHandle(SQL_HANDLE_DBC, stp->henv, &stp->hdbc);
        SYSSessionId(hSys, &hsess);
        SQLConnectWith(stp->hdbc, hsess);

        SQLAllocHandle(SQL_HANDLE_STMT, stp->hdbc, &stp->hInvSel);
        SQLSetCursorName(stp->hInvSel, inv_cursor_name, SQL_NTS);
        SQLPrepare(stp->hInvSel, inv_cursor, SQL_NTS);

        SQLAllocHandle(SQL_HANDLE_STMT, stp->hdbc, &stp->hInvUpd);
        SQLPrepare(stp->hInvUpd, inv_update, SQL_NTS);

        if ( stp->stype == sqlINSERT ) {
            SQLAllocHandle(SQL_HANDLE_STMT, stp->hdbc, &stp->hShpIns);
            SQLPrepare(stp->hShpIns, shp_insert, SQL_NTS);
        } else {
            SQLAllocHandle(SQL_HANDLE_STMT, stp->hdbc, &stp->hShpUpd);
            SQLPrepare(stp->hShpUpd, shp_update, SQL_NTS);
        }
    }
    return status;
}
The following example illustrates the InvCheck type checking function for HaveProduct. InvCheck verifies that the application is passing the correct number and types of parameters to HaveProduct.
static int16 REXTERNAL InvCheck (
    HSYS         hSys,    /* in:  system handle */
    int16        noargs,  /* in:  number of arguments to function */
    const VALUE *args,    /* in:  array of arguments */
    VALUE       *result,  /* out: result value */
    int16       *len)     /* out: max length result string */
{
    int16 status = SQL_ERROR;

    UNREF_PARM(hSys);
    UNREF_PARM(len);

    /* validate arguments */
    if ( noargs != 4 )
        result->vt.cv = "HaveProduct: requires 4 arguments";
    else if ( args[0].type != SQL_SMALLINT )
        result->vt.cv = "HaveProduct: ord_num must be 1st arg";
    else if ( args[1].type != SQL_SMALLINT )
        result->vt.cv = "HaveProduct: prod_id must be 2nd arg";
    else if ( args[2].type != SQL_CHAR )
        result->vt.cv = "HaveProduct: loc_id must be 3rd arg";
    else if ( args[3].type != SQL_INTEGER )
        result->vt.cv = "HaveProduct: quantity must be 4th arg";
    else {
        result->type = SQL_SMALLINT;
        status = SQL_SUCCESS;
    }
    return status;
}
In the following example, the InvFunc function (type UDFFUNC), which is only used in conjunction with an insert or delete
statement, performs the actual processing for HaveProduct. First, InvFunc opens a cursor to the row in the on_hand table with
the matching prod_id and loc_id values. Then, for an insert statement, InvFunc binds the parameters for the ship_log rows
and inserts one or two rows, depending on the available quantity in inventory. The function also updates the on_hand row. If
processing a delete statement, with quantity set to 0 for the previously entered ship_log rows, InvFunc updates the on_hand
record to include the non-backordered item quantity.
static int16 REXTERNAL InvFunc (
    HSYS         hSys,    /* in:  system handle */
    void       **cxtp,    /* in:  statement context pointer */
    int16        noargs,  /* in:  number of arguments to function */
    const VALUE *args,    /* in:  array of arguments */
    VALUE       *result)  /* out: result value */
{
int16 stat;
int16 backordered;
int32 quantity;
int32 diff;
const SALES_CTX *stp = *cxtp;
UNREF_PARM(hSys);
UNREF_PARM(noargs);
if ( stp->stype != sqlINSERT && stp->stype != sqlDELETE ) {
result->type = SQL_CHAR;
result->vt.cv = "cannot update item table - delete and re-insert";
return SQL_ERROR;
} else {
/* look up on_hand record */
SQLBindCol(stp->hInvSel,1,SQL_C_DEFAULT,&quantity,sizeof(quantity),NULL);
SQLBindParameter(stp->hInvSel,1,SQL_PARAM_INPUT,SQL_C_SHORT,SQL_SMALLINT,
0L,0,(void *)&args[1].vt.sv,0,NULL);
SQLBindParameter(stp->hInvSel,2,SQL_PARAM_INPUT,SQL_C_CHAR,SQL_CHAR,
3L,0,(void *)args[2].vt.cv,0,NULL);
SQLExecute(stp->hInvSel);
stat = SQLFetch(stp->hInvSel);
if ( stat != SQL_SUCCESS ) {
SQLFreeStmt( stp->hInvSel,SQL_CLOSE );
result->type = SQL_CHAR;
result->vt.cv = "missing inventory record";
return SQL_ERROR;
}
/* set up on_hand update parameter */
SQLBindParameter(stp->hInvUpd,1,SQL_PARAM_INPUT,SQL_C_LONG,SQL_INTEGER,
0L,0,&diff,0,NULL);
if ( stp->stype == sqlINSERT ) {
/* set up ship_log insert parameters */
SQLBindParameter(stp->hShpIns,1,SQL_PARAM_INPUT,SQL_C_SHORT,SQL_SMALLINT,
0L,0,(void *)&args[0].vt.sv,0,NULL);
SQLBindParameter(stp->hShpIns,2,SQL_PARAM_INPUT,SQL_C_SHORT,SQL_SMALLINT,
0L,0,(void *)&args[1].vt.sv,0,NULL);
SQLBindParameter(stp->hShpIns,3,SQL_PARAM_INPUT,SQL_C_CHAR,SQL_CHAR,
3L,0,(void *)args[2].vt.cv, 0,NULL);
SQLBindParameter(stp->hShpIns,4,SQL_PARAM_INPUT,SQL_C_LONG,SQL_INTEGER,
0L,0,(void *)&quantity,
0,NULL);
SQLBindParameter(stp->hShpIns,5,SQL_PARAM_INPUT,SQL_C_SHORT,SQL_SMALLINT,
0L,0,(void *)&backordered,
0, NULL);
diff = quantity - args[3].vt.lv;
if ( diff >= 0 ) {
/* all needed inventory is available */
backordered = 0;
quantity = args[3].vt.lv;
SQLExecute(stp->hShpIns); /* insert ship_log row */
SQLExecute(stp->hInvUpd); /* update inventory amount */
} else {
/* there are not enough items available in inventory
-- use what is there and backorder the rest */
/* insert ship_log row of used items (all remaining inventory) */
backordered = 0;
SQLExecute(stp->hShpIns);
/* insert ship_log row of backordered items */
backordered = 1;
quantity = args[3].vt.lv - quantity;
SQLExecute(stp->hShpIns);
/* set inventory amount to zero */
diff = 0;
SQLExecute(stp->hInvUpd);
}
} else {
/* delete item row */
/* put items back into inventory */
diff = args[3].vt.lv + quantity;
SQLExecute(stp->hInvUpd);
/* ship_log.quantity == 0 => order has been changed */
SQLBindParameter(stp->hShpUpd,1,SQL_PARAM_INPUT,SQL_C_SHORT,SQL_SMALLINT,
0L,0,(void *)&args[0].vt.sv, 0, NULL);
SQLBindParameter(stp->hShpUpd,2,SQL_PARAM_INPUT,SQL_C_SHORT,SQL_SMALLINT,
0L,0,(void *)&args[1].vt.sv, 0, NULL);
SQLBindParameter(stp->hShpUpd,3,SQL_PARAM_INPUT,SQL_C_CHAR,SQL_CHAR,
3L,0,(void *)args[2].vt.cv, 0, NULL);
SQLExecute(stp->hShpUpd);
}
}
SQLFreeStmt(stp->hInvSel, SQL_CLOSE);
result->type = SQL_SMALLINT;
result->vt.sv = 1;
return SQL_SUCCESS; /*lint !e438 */
}
The InvCleanup cleanup function is shown below for HaveProduct. InvCleanup frees all RDM Server SQL handles used by
the trigger UDF, as well as the context memory previously allocated.
static void REXTERNAL InvCleanup (
    HSYS   hSys,   /* in: system handle */
    void **cxtp)   /* in: statement context pointer */
{
    const SALES_CTX *stp = *cxtp;

    UNREF_PARM(hSys);

    if ( stp->stype == sqlINSERT || stp->stype == sqlDELETE ) {
        SQLFreeHandle(SQL_HANDLE_STMT, stp->hInvSel);
        SQLFreeHandle(SQL_HANDLE_STMT, stp->hInvUpd);
        if ( stp->stype == sqlINSERT )
            SQLFreeHandle(SQL_HANDLE_STMT, stp->hShpIns);
        else
            SQLFreeHandle(SQL_HANDLE_STMT, stp->hShpUpd);
        SQLDisconnect(stp->hdbc);
        SQLFreeHandle(SQL_HANDLE_DBC, stp->hdbc);
        SQLFreeHandle(SQL_HANDLE_ENV, stp->henv);
    }
    rm_freeMemory(stp, stp->mTag); /*lint !e449 */
    *cxtp = NULL;
}
The OkayToShip UDF is called from the check clause defined on the ship_log table. A delete on the ship_log table is
defined as indicating that the item is to be shipped to the customer.
The OkayToShip initialization function, ShipInit, is shown below. This function allocates the UDF context memory and the
needed SQL handles. It then calls SQLPrepare to compile the SQL statements that execute the desired trigger actions.
static int16 REXTERNAL ShipInit (
    HSYS   hSys,   /* in: system handle */
    void **cxtp)   /* in: statement context pointer */
{
    SALES_CTX *stp;
    RDM_SESS   hsess;
    RM_MEMTAG  mTag;
    int16      status = SQL_SUCCESS;

    SYSMemoryTag(hSys, &mTag);
    stp = *cxtp = rm_getMemory(sizeof(SALES_CTX), mTag);
    stp->mTag = mTag;

    SYSDescribeStmt(hSys, &stp->stype);
    if ( stp->stype == sqlDELETE ) {
        /* connect to calling statement's connection */
        SQLAllocHandle(SQL_HANDLE_ENV, NULL, &stp->henv);
        SQLSetEnvAttr(stp->henv, SQL_ATTR_ODBC_VERSION,
                      (SQLPOINTER)SQL_OV_ODBC3, SQL_IS_INTEGER);
        SQLAllocHandle(SQL_HANDLE_DBC, stp->henv, &stp->hdbc);
        SYSSessionId(hSys, &hsess);
        SQLConnectWith(stp->hdbc, hsess);

        SQLAllocHandle(SQL_HANDLE_STMT, stp->hdbc, &stp->hInvSel);
        SQLPrepare(stp->hInvSel, inv_cursor, SQL_NTS);
        SQLSetCursorName(stp->hInvSel, inv_cursor_name, SQL_NTS);

        SQLAllocHandle(SQL_HANDLE_STMT, stp->hdbc, &stp->hInvUpd);
        SQLPrepare(stp->hInvUpd, inv_update, SQL_NTS);

        SQLAllocHandle(SQL_HANDLE_STMT, stp->hdbc, &stp->hShpUpd);
        SQLPrepare(stp->hShpUpd, ord_update, SQL_NTS);
    }
    return status;
}
The cleanup function for OkayToShip frees the allocated SQL handles.
static void REXTERNAL ShipCleanup (
    HSYS   hSys,   /* in: system handle */
    void **cxtp)   /* in: statement context pointer */
{
    const SALES_CTX *stp = *cxtp;

    UNREF_PARM(hSys);

    if ( stp->stype == sqlDELETE ) {
        SQLFreeHandle(SQL_HANDLE_STMT, stp->hInvSel);
        SQLFreeHandle(SQL_HANDLE_STMT, stp->hInvUpd);
        SQLFreeHandle(SQL_HANDLE_STMT, stp->hShpUpd);
        SQLDisconnect(stp->hdbc);
        SQLFreeHandle(SQL_HANDLE_DBC, stp->hdbc);
        SQLFreeHandle(SQL_HANDLE_ENV, stp->henv);
    }
    rm_freeMemory(stp, stp->mTag); /*lint !e449 */
    *cxtp = NULL;
}
OkayToShip takes all of the ship_log columns except ord_date as arguments. Function ShipCheck, shown below, ensures that
the correct number and types have been specified.
static int16 REXTERNAL ShipCheck(
    HSYS         hSys,    /* in:  system handle */
    int16        noargs,  /* in:  number of arguments to function */
    const VALUE *args,    /* in:  array of arguments */
    VALUE       *result,  /* out: result value */
    int16       *len)     /* out: max length result string */
{
    int16 status = SQL_ERROR;

    UNREF_PARM(hSys);
    UNREF_PARM(len);

    /* validate arguments */
    if ( noargs != 5 )
        result->vt.cv = "OkayToShip: requires 5 arguments";
    else if ( args[0].type != SQL_SMALLINT )
        result->vt.cv = "OkayToShip: ord_num must be 1st arg";
    else if ( args[1].type != SQL_SMALLINT )
        result->vt.cv = "OkayToShip: prod_id must be 2nd arg";
    else if ( args[2].type != SQL_CHAR )
        result->vt.cv = "OkayToShip: loc_id must be 3rd arg";
    else if ( args[3].type != SQL_INTEGER )
        result->vt.cv = "OkayToShip: quantity must be 4th arg";
    else if ( args[4].type != SQL_SMALLINT )
        result->vt.cv = "OkayToShip: backordered must be 5th arg";
    else {
        result->type = SQL_SMALLINT;
        status = SQL_SUCCESS;
    }
    return status;
}
Function ShipFunc performs the OkayToShip trigger operations. The on_hand row associated with the warehouse from which
the item will be shipped is rechecked for backordered items to see if there is now a sufficient quantity for filling the order. If
there is, on_hand.quantity is decremented by the backordered quantity and the delete is allowed. If there is still not enough
inventory, the delete is rejected. If all is okay, the ship_date column in the sales_order table is updated indicating that (at
least part of) the order has been shipped.
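Stripped of the SQL calls, the decision ShipFunc makes can be sketched as one small function. The name mock_okay_to_ship and its parameters are hypothetical and chosen for illustration. The sketch assumes, as the text describes, that inventory for non-backordered rows was already deducted when the order was entered.

```c
/* Hypothetical helper mirroring the OkayToShip decision.
   Returns 1 if the delete (shipment) is allowed, 0 if it is rejected.
   *on_hand is decremented only when a backordered item can now be filled. */
static int mock_okay_to_ship(int backordered, long needed, long *on_hand)
{
    if (!backordered)
        return 1;          /* nothing to recheck: inventory already deducted */
    if (*on_hand < needed)
        return 0;          /* still cannot ship: reject the delete */
    *on_hand -= needed;    /* inventory now has enough: take it and ship */
    return 1;
}
```

A rejected call leaves the on-hand quantity untouched, so the application can retry the delete after inventory has been replenished.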
static int16 REXTERNAL ShipFunc(
    HSYS         hSys,    /* in:  system handle */
    void       **cxtp,    /* in:  statement context pointer */
    int16        noargs,  /* in:  number of arguments to function */
    const VALUE *args,    /* in:  array of arguments */
    VALUE       *result)  /* out: result value */
{
int16 stat;
int32 quantity;
int32 diff;
const SALES_CTX *stp = *cxtp;
UNREF_PARM(hSys);
UNREF_PARM(noargs);
if (stp->stype == sqlDELETE) {
if (args[4].vt.sv == 1) {
/* item was backordered -- see if inventory now has enough */
SQLBindCol(stp->hInvSel,1,SQL_C_DEFAULT,&quantity,sizeof(quantity),NULL);
SQLBindParameter(stp->hInvSel,1,SQL_PARAM_INPUT,SQL_C_SHORT,SQL_SMALLINT,
0L,0,(void *)&args[1].vt.sv,0,NULL);
SQLBindParameter(stp->hInvSel,2,SQL_PARAM_INPUT,SQL_C_CHAR,SQL_CHAR,
3L,0,(void *)args[2].vt.cv,0,NULL);
SQLExecute(stp->hInvSel);
stat = SQLFetch(stp->hInvSel);
if ( stat != SQL_SUCCESS ) {
SQLFreeStmt(stp->hInvSel, SQL_CLOSE);
result->type = SQL_CHAR;
result->vt.cv = "missing inventory record";
return SQL_ERROR;
}
if (quantity >= args[3].vt.lv) {
/* inventory now has enough to ship! */
/* set up on_hand update parameter
*/
SQLBindParameter(stp->hInvUpd,1,SQL_PARAM_INPUT,SQL_C_LONG,
SQL_INTEGER,0L,0,(void *)&diff,0,NULL);
diff = quantity - args[3].vt.lv;
SQLExecute(stp->hInvUpd);
SQLFreeStmt(stp->hInvSel, SQL_CLOSE);
} else {
/* still can't ship */
SQLFreeStmt(stp->hInvSel, SQL_CLOSE);
result->type = SQL_CHAR;
result->vt.cv = "can't delete(i.e. ship) backordered item";
return SQL_ERROR;
}
}
/* update sales_order's ship_date */
SQLBindParameter(stp->hShpUpd, 1, SQL_PARAM_INPUT, SQL_C_SHORT,
SQL_SMALLINT, 0L, 0, (void *) &args[0].vt.sv, 0, NULL);
SQLExecute(stp->hShpUpd);
}
result->type = SQL_SMALLINT;
result->vt.sv = 1;
return SQL_SUCCESS; /*lint !e438 */
}
14.1.3 Invoking a UDF
Before your application can use a UDF, it must register the module using a create function statement, as shown in the
following example. This statement registers the UDF module with the syscat database. The following statement creates three
aggregate UDFs contained in a DLL called statpack on an RDM Server device named add_ins.
create aggregate function
    devsq   "compute sum of the squares of deviations",
    stddev  "compute standard deviation",
    geomean "compute the geometric mean"
    in statpack on add_ins;
Once the module is registered, a UDF can be called from SQL statements, just like the built-in RDM Server SQL-callable functions. Examples are given below for using an aggregate UDF and a scalar UDF.
The following example illustrates entry of statements to call the sample scalar UDF SubQuery and the sample aggregate
UDFs std and stds. The example uses the sales and invntory databases.
create aggregate functions std, stds
scalar function subquery
in udf on sqlsamp;
select sale_name, avg(amount), std(amount), stds(amount)
from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id and
customer.cust_id = sales_order.cust_id group by 1;
SALE_NAME             AVG(AMOUNT)     STD(AMOUNT)     STDS(AMOUNT)
Flores, Bob           19233.557778    21767.832956    23088.273442
Jones, Walter         28170.703333    22055.396667    22829.504456
Kennedy, Bob          61362.110000    75487.487619    78844.109392
McGuire, Sidney       18948.373636    16888.086829    17712.374895
Nash, Gail            34089.695556    35751.014170    37919.676831
Porter, Greg          87869.300000    87370.831661    97683.559422
Robinson, Stephanie   24993.631333    28766.406110    29776.059184
Stouffer, Bill         3631.662500     2731.390470     2919.979236
Warren, Wayne         21263.850000    24150.207498    25456.553886
Williams, Steve       27464.443333    16696.742874    17709.570165
Wyman, Eliska         23617.375417    31511.044841    32188.779254
select state,
100*sum(amount)/subquery("select sum(amount) from sales_order") pct_of_sales
from customer, sales_order
where customer.cust_id = sales_order.cust_id group by state;
STATE   PCT_OF_SALES
AZ          6.386350
CA         13.108034
CO         13.422859
FL          3.591970
GA          3.057682
IL          4.310374
IN          0.781594
LA          4.993924
MA          3.233216
MI         11.819327
MN          1.330608
MO          3.807593
NJ          0.425850
NY         10.425037
OH          4.414228
PA          3.911350
TX          3.259824
VA          1.695903
WA          1.634471
WI          4.389806
Calling an Aggregate UDF
After module registration, as described above, the application can call an aggregate UDF from SQL statements. For example,
the select statement shown below calls the aggregate UDF geomean, defined in the previous section. Note that the code also
calls the built-in aggregate function avg.
select state, avg(amount), geomean(amount) from customer, sales_order
where customer.cust_id = sales_order.cust_id
group by state;
Calling a Scalar UDF
Your RDM Server SQL application calls a scalar UDF from SQL statements, just as it calls the built-in functions. The next
example illustrates the use of the sample UDF SubQuery to retrieve a percentage of total values. Note the power of this UDF,
as shown by the need to use only a single select statement in the application.
select state,
100*sum(amount)/subquery("select sum(amount) from sales_order") pct_of_sales
from customer, sales_order where customer.cust_id = sales_order.cust_id
group by state;
The select statement in the next example calls a scalar UDF called tax_rate, which returns the tax rate for a given city.
select company, city, state, tax_rate(city, state) tax_rate from customer;
This tax_rate UDF looks up the tax rate for a locale in an internal table or a database table. An application can use this UDF
as shown below, to display sales orders with tax amounts that do not correspond to the going rate.
select company, city, state, ord_num, ord_date, amount, tax
from customer, sales_order
where customer.cust_id = sales_order.cust_id and
not equal_float(convert(tax, float), amount * tax_rate(city, state), 0.005);
Note that this example also uses a UDF called equal_float that returns TRUE if two floating-point values differ by less than
the value of the third parameter. Note also the use of the built-in function convert to change the value of the column tax from
type real to type float.
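The source of equal_float is not shown in this section, so the following is only a guess at the comparison it performs, based on the description above: a function that returns true when two values differ by less than a tolerance. The name mock_equal_float is hypothetical, and the sketch takes plain C doubles rather than VALUE arguments.

```c
/* Returns 1 when a and b differ by less than tolerance, otherwise 0.
   If any argument is NaN the comparison is false and 0 is returned. */
static int mock_equal_float(double a, double b, double tolerance)
{
    double diff = a - b;

    if (diff < 0.0)
        diff = -diff;      /* absolute difference without needing math.h */
    return diff < tolerance;
}
```

With the 0.005 tolerance used in the query above, a recorded tax of 8.25 matches a computed tax of 8.2501, while 8.25 against 8.30 is reported as a mismatch.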
14.1.4 UDF Support Library
A library of support functions for SQL user-defined functions has been provided to allow UDFs to perform low-level database
operations, associate SQL modification commands with the client's transactions, and utilize the SQL system's data type arithmetic capabilities. A list of the available functions is provided in the table below.
Table 13-1. UDF Support Library Functions
Function          Description
SYSDBHandle       Retrieve the RDM_DB handle for a specified database
SYSDbaToRowId     Convert an RDM DB_ADDR to an SQL_DBADDR (RowId)
SYSMemoryTag      Return UDF memory allocation tag
SYSRowIdToDba     Convert an SQL_DBADDR for a table to an RDM DB_ADDR
SYSRowDba         Get the DB_ADDR for the current row of the specified table
SYSRowId          Get the RowId (SQL_DBADDR) for the current row of the specified table
SYSSessionId      Get the RDM_SESS (SessionId) associated with a HSYS
SYSDescribeStmt   Get the statement type associated with a HSYS
SYSValChgSign     Change the sign of a VALUE
SYSValCompare     Compare 2 VALUEs
SYSValAdd         Add 2 VALUEs
SYSValSub         Subtract 2 VALUEs
SYSValMult        Multiply 2 VALUEs
SYSValDiv         Divide 2 VALUEs
The SYS-prefixed functions are used to access SQL-maintained statement context information. Most of the SYS functions
have standard SQL-prefixed functions that provide the same functionality for the connection handle (HDBC) instead of the
system handle (HSYS). These functions are provided so that the information associated with the SQL statement that uses a
UDF can be accessed by the UDF.
The SYSVal-prefixed functions are provided so that, if needed, you can do mixed-mode arithmetic in your UDFs. Each of the
SYSVal functions is passed arguments of type VALUE as described earlier. The functions that are provided hook into the
internal SQL arithmetic functions to perform the mixed-mode arithmetic operations. The results of the mixed mode arithmetic
follow the standard C rules.
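As a reminder of what "the standard C rules" imply for mixed operand types, the sketch below uses plain C arithmetic only. It does not call the SYSVal functions, whose signatures are not shown in this section, and the helper names int_div, mixed_div and mixed_add are hypothetical.

```c
/* Both operands are integers: the quotient is truncated toward zero. */
static long int_div(long a, long b)
{
    return a / b;
}

/* One operand is a double: the integer is converted to double first,
   so the fractional part of the quotient is kept. */
static double mixed_div(long a, double b)
{
    return a / b;
}

/* A short added to a double is likewise computed in double. */
static double mixed_add(short a, double b)
{
    return a + b;
}
```

So 7 divided by 2 yields 3 when both operands are integers but 3.5 when either operand is floating point, which is the behavior to expect when values of different SQL types are combined.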
14.2 User-Defined Procedures
A UDP is an application-specific procedure written in C that is architecturally similar to a UDF but is only invoked through
the RDM Server SQL call (execute) statement. UDPs can do anything that RDM Server SQL stored procedures can do (including
retrieve result sets), but UDPs are more flexible than stored procedures because they are written in C and can support
dynamic parameter lists. The example UDP module code (udp.c) described below is included in the examples/udp directory.
Before a UDP can be used from SQL, it must first be registered with the system. This is done using the following
form of the create procedure statement.
create_procedure:
    create proc[edure] procname ["description"] in dllname on devname
The name of the procedure is given by the identifier procname, along with an optional description string. The name of the
UDP module in which this UDP is implemented is given by dllname, which must be located in the RDM Server device
named devname.
The example UDP module defines three UDPs, tims_data, log_login, and log_logout. The tims_data UDP retrieves a result set
consisting of data from the tims database. To illustrate how a UDP can retrieve multiple result sets, tims_data retrieves an additional result set that reports the total number of rows retrieved in the first result set. The sample log_login and log_logout
UDPs are special login and logout procedures which keep a record of all users who have logged in and out of the RDM
Server through the SQLConnect function.
Table 13-5. Procedures Defined in the Sample UDP Module
Function     Description
tims_data    Uses RDM Server core-level (d_) API to retrieve a result set from the TIMS database example.
log_login    System login tracking procedure.
log_logout   System logout tracking procedure.
14.2.1 UDP Implementation
Keep the following concepts in mind when programming your UDP module.
• The UDP module must include a load function named udpDescribeFcns which identifies all of the UDPs implemented in the
  module.
• UDP modules that contain the implementation of a transaction trigger (registered through a call to function
  SQLTransactTrigger) must include both a ModInit and a ModCleanup function.
• Each UDP in the module can optionally include an initialization function of type UDPINIT. This function normally
  allocates and initializes UDP-specific context memory containing statement handles and other operational data. The
  SQL system calls it at the start of the execution of the call statement which invokes the UDP.
• Each UDP that takes procedure arguments must have a type checking function of type UDPCHECK. UDPs can take
  any number of arguments of any type, and the value returned can be of any data type, except long varchar, long
  wvarchar, or long varbinary.
• Each UDP must include a processing function of type UDPEXECUTE. This function executes the procedure and, if
  applicable, returns the first select statement result set.
• Each UDP can optionally include a function of type UDPMORERESULTS to obtain the next result set.
• Each UDP can optionally include a function of type UDPCOLDATA which can be used with UDPs that return result
  sets to return a description of one of the result set columns.
• Each UDP can optionally include a cleanup function of type UDPCLEANUP. If defined, this function is called by
  RDM Server SQL when the procedure's statement is closed, or when the udpMoreResults function returns status
  SQL_NO_DATA (indicating there are no more result sets to be returned).
• In coding your UDP, you must declare all functions exactly as shown in the function references (including the
  REXTERNAL attribute). If a function declaration deviates at all, it will not match the UDP type (for example,
  UDPCHECK) used in the function declaration. This will cause a compilation error.
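The calling sequence implied by the list above can be sketched in plain C. This is only an illustration under stated assumptions: the DEMO_ types, the status values, the demo_run_call driver, and the sample_ callbacks are stand-ins invented for this sketch, not RDM Server declarations. In the real system the check function runs when the statement is prepared and the other functions run as the application executes the statement and asks for more results; the sketch compresses that into a single driver call.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in status codes for this sketch only. */
enum { DEMO_SUCCESS = 0, DEMO_NO_DATA = 100, DEMO_ERROR = -1 };

/* Stand-in for one load table entry: one slot per callback role. */
typedef struct demo_udp {
    int  (*check)(int noargs);     /* optional: argument type checking */
    int  (*init)(void **ctxp);     /* optional: allocate context       */
    int  (*execute)(void **ctxp);  /* required: first result set       */
    int  (*more)(void **ctxp);     /* optional: further result sets    */
    void (*cleanup)(void **ctxp);  /* optional: release context        */
} DEMO_UDP;

static char demo_trace[32];        /* order of calls, one letter each */

static void demo_note(const char *step)
{
    if (strlen(demo_trace) + strlen(step) < sizeof(demo_trace))
        strcat(demo_trace, step);
}

/* Hypothetical driver that plays the role of the SQL system. */
static int demo_run_call(const DEMO_UDP *udp, int noargs)
{
    void *ctx = NULL;
    int   stat;

    demo_trace[0] = '\0';
    if (udp->check != NULL && udp->check(noargs) != DEMO_SUCCESS)
        return DEMO_ERROR;                 /* arguments rejected */
    if (udp->init != NULL && udp->init(&ctx) != DEMO_SUCCESS)
        return DEMO_ERROR;
    stat = udp->execute(&ctx);             /* first result set */
    while (stat == DEMO_SUCCESS && udp->more != NULL)
        stat = udp->more(&ctx);            /* until DEMO_NO_DATA */
    if (udp->cleanup != NULL)
        udp->cleanup(&ctx);
    return stat == DEMO_ERROR ? DEMO_ERROR : DEMO_SUCCESS;
}

/* Sample callbacks: one extra result set, then no more data. */
static int demo_sets_left;

static int sample_check(int noargs)
{
    demo_note("C");
    return noargs <= 2 ? DEMO_SUCCESS : DEMO_ERROR;
}

static int sample_init(void **ctxp)
{
    demo_note("I");
    demo_sets_left = 1;
    *ctxp = &demo_sets_left;               /* context handed to later calls */
    return DEMO_SUCCESS;
}

static int sample_exec(void **ctxp)
{
    (void)ctxp;
    demo_note("E");
    return DEMO_SUCCESS;
}

static int sample_more(void **ctxp)
{
    int *left = *ctxp;

    demo_note("M");
    if (*left == 0)
        return DEMO_NO_DATA;
    --*left;
    return DEMO_SUCCESS;
}

static void sample_cleanup(void **ctxp)
{
    demo_note("X");
    *ctxp = NULL;
}
```

Filling a DEMO_UDP with the five sample callbacks and passing it to demo_run_call leaves the order of calls in demo_trace, which makes it easy to see that cleanup follows the call that reports no more data.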
Function udpDescribeFcns
The UDP module must contain a function with the name udpDescribeFcns (use exact name). This function is called by RDM
Server SQL to fetch the definitions of UDPs contained in the module from the UDPLOADTABLE struct array it returns. A
typical udpDescribeFcns implementation is shown in the following example.
void REXTERNAL udpDescribeFcns (
    uint16         *NumProcs,      /* out: number of procedures in module */
    PUDPLOADTABLE  *UDPLoadTable,  /* out: points to UdpTable above */
    const char    **fcn_descr)     /* out: optional description string */
{
    *NumProcs     = RLEN(UdpTable); /* RLEN computes # of entries in struct array */
    *UDPLoadTable = UdpTable;
    *fcn_descr    = "Sample of SQL C-based procedures";
}
There must be one entry in the UDPLOADTABLE array for each UDP that is implemented in the module. The declaration for
UDPLOADTABLE is contained in header file sqlsys.h (included with emmain.h) and is shown below.
typedef struct udploadtable {
    uint8           version;        /* version of this structure */
    char            udpName[33];    /* name of user procedure */
    PUDPCHECK       udpCheck;       /* type checking */
    PUDPEXECUTE     udpExecute;     /* execute first result set */
    PUDPMORERESULTS udpMoreResults; /* move to next result set */
    PUDPINIT        udpInit;        /* initialization for user procedure */
    PUDPCLEANUP     udpCleanup;     /* cleanup for user procedure */
    PUDPCOLDATA     udpColData;     /* column description data */
} UDPLOADTABLE, *PUDPLOADTABLE;
Each element of the UDPLOADTABLE struct is described in the following table.
Table 13-6. UDPLOADTABLE Struct Element Descriptions
Element          Description
version          Must be assigned to the RDM Server defined macro: UDPTBLVERSION.
udpName          The name of the procedure. Must conform to a standard SQL identifier. It is case-insensitive and unique (system-wide).
udpCheck         Pointer to the argument type checking function. Assign to NULL if there are no arguments.
udpExecute       Pointer to the execution function.
udpMoreResults   Pointer to the function that initializes processing of the next result set. Assign to NULL if no more than 1 result set is returned.
udpInit          Pointer to pre-execution initialization function.
udpCleanup       Pointer to post-execution cleanup function.
udpColData       Pointer to function that returns descriptions of the columns in the result set. Assign to NULL if the UDP does not return a result set.
The udpDescribeFcns function must place an entry in the load table for each function in each UDP. You also can return an
optional string describing the module, which will be printed on the server console when the UDP is loaded. Null can be supplied if there is no string. The following example shows the load table and udpDescribeFcns for the sample UDP module
(udp.c).
#include <stdio.h>
#include <string.h>
/* Definitions and header to setup EM ------------ */
/* (all EMs must have a code block like this) --- */
#define EM_NAME UDP     /* the uppercased module name */
#define EMTYPE_UDP      /* EMTYPE_EM, EMTYPE_UDF, EMTYPE_UDP, EMTYPE_IEF */
#define DEF_ModInit     /* remove if no ModInit in this module */
#define DEF_ModCleanup  /* remove if no ModCleanup in this module */
#include "emmain.h"     /* must follow the above definitions and OS #includes */
...
static UDPCHECK        timsCheck;
static UDPEXECUTE      timsExecute;
static UDPMORERESULTS  timsMoreResults;
static UDPINIT         timsInit;
static UDPCLEANUP      timsCleanup;
static UDPCOLDATA      timsColData;
static UDPINIT         logInit;
static UDPCLEANUP      logCleanup;
static UDPEXECUTE      logLogin;
static UDPEXECUTE      logLogout;
static TRANSACTTRIGGER TransactTrigger;
/*--------------------------------------------------------------------------
   Table of user-defined procedures in this module
---------------------------------------------------------------------------*/
static const UDPLOADTABLE UdpTable[] = {
{UDPTBLVERSION, "tims_data",
timsCheck,
timsExecute,
timsMoreResults,
timsInit,
timsCleanup,
timsColData},
{UDPTBLVERSION, "log_login",
NULL,
logLogin,
NULL,
logInit,
logCleanup,
NULL},
{UDPTBLVERSION, "log_logout",
NULL,
logLogout,
NULL,
logInit,
logCleanup,
NULL}
};
Function ModInit
If your UDP module has a transaction trigger you must include a ModInit (and ModCleanup) function. The server calls
ModInit when the UDP module is loaded, passing a single module handle. Your ModInit function needs to save this handle
in a global variable that will be subsequently used by SQLTransactTrigger to register the transaction trigger function.
The example below shows the ModInit function defined for the sample UDP module.
static HMOD ghMod = NULL;
...
int16 REXTERNAL ModInit(
HMOD hMod) /* in: Module handle, used by SQLTransactTrigger() */
{
ghMod = hMod;
return S_OKAY;
}
Function udpInit
Your UDP can optionally include an initialization function of type UDPINIT. RDM Server SQL calls this function from the
udpInit entry in UDPLOADTABLE to perform any initialization the UDP may require. Common tasks include allocating
memory for any context data and initializations such as handle allocations, connections, and compiling statements to be
executed in udpExecute.
If needed, your udpInit function should allocate memory for the UDP context using an rm_cGetMemory call. The memory
tag (mTag) argument passed to the function should be used in all dynamic memory allocations. RDM Server SQL uses this
tag to free all of the memory allocated by the UDP in case an error occurs outside the UDP.
The err argument is a pointer to a standard RDM Server SQL VALUE structure (See SQL Data VALUE Container
Description). The udpInit function uses this structure to pass error information back to the RDM Server SQL API for reporting
to the application. The udpInit function sets the type field in the structure to either SQL_SMALLINT or SQL_CHAR. If the
field is set to SQL_SMALLINT, the vt.sv field should be set to the RDM Server SQL error code to return to the RDM Server
SQL API. If udpInit sets the type field in the VALUE structure to SQL_CHAR, then it must set the vt.cv field to point to a
static string containing an error message that will be reported under the errGENERAL error code. In addition to setting the err
parameter, the udpInit function should also return SQL_ERROR if an error occurs.
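The two ways of filling in the err argument can be sketched as a pair of small helpers. This is a minimal illustration only: DEMO_VALUE and the DEMO_ constants are stand-ins written for this sketch (the real VALUE structure and the SQL_ constants come from the RDM Server headers), and the helper names are hypothetical.

```c
#include <stddef.h>

/* Stand-in definitions for this sketch only. */
enum { DEMO_SQL_CHAR = 1, DEMO_SQL_SMALLINT = 5 };
enum { DEMO_SQL_SUCCESS = 0, DEMO_SQL_ERROR = -1 };

typedef struct demo_value {
    short type;                 /* DEMO_SQL_SMALLINT or DEMO_SQL_CHAR     */
    union {
        short       sv;         /* error code when type is SMALLINT       */
        const char *cv;         /* static message when type is CHAR       */
    } vt;
} DEMO_VALUE;

/* Report a specific error code to the caller. */
static int fail_with_code(DEMO_VALUE *err, short code)
{
    err->type  = DEMO_SQL_SMALLINT;
    err->vt.sv = code;
    return DEMO_SQL_ERROR;
}

/* Report a message; the string must outlive the call (static storage). */
static int fail_with_text(DEMO_VALUE *err, const char *static_msg)
{
    err->type  = DEMO_SQL_CHAR;
    err->vt.cv = static_msg;
    return DEMO_SQL_ERROR;
}

/* Hypothetical initialization step that fails when a resource is missing. */
static int demo_init(int resource_ok, DEMO_VALUE *err)
{
    if (!resource_ok)
        return fail_with_text(err, "unable to open TIMS");
    return DEMO_SQL_SUCCESS;
}
```

Routing every failure through one of the two helpers keeps the rule "set err and return the error status together" in a single place, so an initialization function cannot set one without the other.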
The timsInit function, shown below for the tims_data UDP, allocates the UDP_CTX structure for the UDP context and opens
the tims database via the Core (d_) API with dirty read mode enabled (no locking). Then the function allocates the necessary
RDM Server SQL statement handles, executes two statements, and stores the information in the allocated UDP context. If any
errors occur, the information is returned to RDM Server in the err argument.
typedef struct udp_ctx {
    SQLHENV  hEnv;
    SQLHDBC  hDbc;
    SQLHSTMT hStmt;
    RDM_SESS hSess;
    RDM_DB   hDb;
    int16    finished;
} UDP_CTX;
static const char TimsCreate[] =
"create temporary table timstab("
"author char(31), "
"id_code char(15), "
"info_title char(48), "
"publisher char(31), "
"pub_date char(11), "
"info_type smallint); ";
static const char TimsInsert[] =
"insert into timstab values(?,?,?,?,?,?)";
...
/* ===================================================================
Initialization function for TIMS DB access
*/
int16 REXTERNAL timsInit(
    void      **ctxp,    /* in: proc context pointer */
    int16       noargs,  /* in: number of arguments passed */
    VALUE      *args,    /* in: arguments, args[noargs-1] */
    RDM_SESS    hSess,   /* in: current session id */
    RM_MEMTAG   mTag,    /* in: memory tag for rm_ memory calls */
    VALUE      *err)     /* out: container for error messages */
{
    int16 stat;
    /* allocate a cleared UDP context memory */
    UDP_CTX *ctx = rm_cGetMemory(sizeof(UDP_CTX), mTag);
    ctx->hSess = hSess;
    if ((stat = d_open("tims", "s", hSess, &ctx->hDb)) != S_OKAY) {
        err->type = SQL_CHAR;
        err->vt.cv = "unable to open TIMS";
        rm_freeMemory(ctx, mTag);
        return SQL_ERROR;
    }
    ...
    /* enable dirty reads */
    d_rdlockmodes(1, 1, ctx->hSess);
    /* allocate and initialize SQL handles */
    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &ctx->hEnv);
    SQLSetEnvAttr(ctx->hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3,
        SQL_IS_UINTEGER);
    SQLAllocHandle(SQL_HANDLE_DBC, ctx->hEnv, &ctx->hDbc);
    SQLConnectWith(ctx->hDbc, hSess);
    SQLAllocHandle(SQL_HANDLE_STMT, ctx->hDbc, &ctx->hStmt);
    SQLExecDirect(ctx->hStmt, TimsCreate, SQL_NTS);
    SQLPrepare(ctx->hStmt, TimsInsert, SQL_NTS);
    *ctxp = ctx;
    return SQL_SUCCESS;
}
The call to SQLExecDirect creates a temporary table (timstab) and the call to SQLPrepare compiles a statement to insert data
into this table. The columns declared in timstab include the author name, which will be retrieved from the author record in
the tims database and one column for each of the fields declared in the info record in tims. The insert statement includes a
parameter marker for each of the columns declared in timstab.
The last statement before the return in timsInit, (*ctxp = ctx;) must be specified so that the context pointer
can be passed by SQL to the other UDP functions.
Function udpCheck
When the application calls SQLPrepare to compile a call (execute) statement referencing a UDP, SQLPrepare calls the function in the udpCheck entry of UDPLOADTABLE to validate that the arguments specified in the execute statement are correct
for the specified UDP. Like the udpInit function, udpCheck uses the err argument to return error information to RDM Server.
In addition, the function returns SQL_ERROR as its return code if an error occurs. Unlike the udfCheck function, the
udpCheck function only uses the err argument for error information.
The following example shows the timsCheck type checking function for tims_data. Remember that the sample login and
logout procedures do not have any arguments and, hence, do not need a type checking function.
int16 REXTERNAL timsCheck(
    int16        noargs,  /* in: number of arguments passed */
    const int16 *types,   /* in: type of each arg., types[noargs-1] */
    VALUE       *err)     /* out: container for error messages */
{
int16 arg;
err->type = SQL_CHAR;
err->vt.cv = "tims_data requires char arguments only";
for (arg = 0; arg < noargs; ++arg) {
if (types[arg] != SQL_CHAR)
return SQL_ERROR;
}
return SQL_SUCCESS;
}
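A type checking function can also fill in the error container only when a check actually fails, which leaves the container untouched on success. The variant below is a sketch under stated assumptions: DEMO_VALUE and the DEMO_ constants are stand-ins for the real VALUE structure and SQL_ constants, demo_check is a hypothetical name, and whether RDM Server cares about the contents of err on success is not stated in this guide.

```c
#include <stddef.h>

/* Stand-in definitions for this sketch only. */
enum { DEMO_SQL_CHAR = 1, DEMO_SQL_SMALLINT = 5 };
enum { DEMO_SQL_SUCCESS = 0, DEMO_SQL_ERROR = -1 };

typedef struct demo_value {
    short type;
    union { short sv; const char *cv; } vt;
} DEMO_VALUE;

/* Accepts zero or more arguments, all of which must be character strings. */
static int demo_check(short noargs, const short *types, DEMO_VALUE *err)
{
    short arg;

    for (arg = 0; arg < noargs; ++arg) {
        if (types[arg] != DEMO_SQL_CHAR) {
            err->type  = DEMO_SQL_CHAR;      /* message form of the error */
            err->vt.cv = "char arguments only";
            return DEMO_SQL_ERROR;
        }
    }
    return DEMO_SQL_SUCCESS;                 /* err is left untouched */
}
```

With zero arguments the loop body never runs, so a call without arguments is accepted, which matches the behavior of timsCheck.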
Function udpExecute
The execution function for the UDP (type UDPEXECUTE) is called by RDM Server SQL from the udpExecute entry in
UDPLOADTABLE when SQLExecute is processing a call (execute) statement that references the UDP (after the udpInit
function and udpCheck functions, if any, have been called). If a result set is generated, then the hstmt referencing this result
set must be returned in the phStmt argument. This allows the client side application to fetch the results by calling SQLFetch
or SQLFetchScroll with the hstmt used in the procedure execution. The timsExecute function for the tims_data UDP is
described below.
Since tims is not an SQL database, timsExecute uses the Core API (d_ functions) to retrieve data from the tims database and
stores the data in a temporary SQL table. The timsExecute function first calls SQLBindParameter several times to bind values to the parameter markers for the insert statements. The execution function then accesses the tims database. The d_setoo
function call sets the current member of the author_list set to null so that the first d_findnm call will return the first member
of the set.
int16 REXTERNAL timsExecute(
    void      **ctxp,    /* in:  proc context pointer */
    int16       noargs,  /* in:  number of arguments to procedure */
    VALUE      *args,    /* in:  array of arguments */
    RM_MEMTAG   mTag,    /* in:  memory tag for rm_ memory calls */
    SQLHSTMT   *phStmt,  /* out: hstmt for result set */
    VALUE      *err)     /* out: container for error messages */
{
    static char errmsg[65];
    char        author[32];
    struct info ir;
    int16       stat, arg;
    char       *author_arg = args->vt.cv;
    UDP_CTX    *ctx = *ctxp;
    RDM_DB      hdb = ctx->hDb;
    SQLHSTMT    hstmt = ctx->hStmt;
    /* set up insert parameter values */
    SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR,
        SQL_CHAR, 31, 0, author, 0, NULL);
    SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR,
        SQL_CHAR, 15, 0, ir.id_code, 0, NULL);
    SQLBindParameter(hstmt, 3, SQL_PARAM_INPUT, SQL_C_CHAR,
        SQL_CHAR, 79, 0, ir.info_title, 0, NULL);
    SQLBindParameter(hstmt, 4, SQL_PARAM_INPUT, SQL_C_CHAR,
        SQL_CHAR, 31, 0, ir.publisher, 0, NULL);
    SQLBindParameter(hstmt, 5, SQL_PARAM_INPUT, SQL_C_CHAR,
        SQL_CHAR, 11, 0, ir.pub_date, 0, NULL);
    SQLBindParameter(hstmt, 6, SQL_PARAM_INPUT, SQL_C_SHORT,
        SQL_SMALLINT, 0, 0, &ir.info_type, 0, NULL);
    /* extract data from tims database & insert into SQL table */
    d_setoo(AUTHOR_LIST, AUTHOR_LIST, hdb);
    for (;;) {
        while ((stat = d_findnm(AUTHOR_LIST, hdb)) == S_OKAY) {
            d_recread(AUTHOR, author, hdb);
            for (arg = 0; arg < noargs; ++arg) {
                /* check for author in argument list */
                char *aname = args[arg].vt.cv;
                if (strncmp(author, aname, strlen(aname)) == 0)
                    break;
            }
            if (noargs == 0 || arg < noargs)
                break;
        }
        if (stat != S_OKAY)
            break;
        d_setor(HAS_PUBLISHED, hdb);
        while ((stat = d_findnm(HAS_PUBLISHED, hdb)) == S_OKAY) {
            d_recread(INFO, &ir, hdb);
            SQLExecute(hstmt);
        }
    }
    SQLFreeStmt(hstmt, SQL_RESET_PARAMS);
    /* return result set from SQL table */
    SQLExecDirect(hstmt, "select * from timstab", SQL_NTS);
    *phStmt = ctx->hStmt;
    return SQL_SUCCESS;
}
As each author record is retrieved from the author_list set, timsExecute compares the name with each string argument passed to
the tims_data UDP. When the function finds a match, it fetches the publications for the author from the has_published set.
The timsExecute function reads each info record and executes the insert statement to store the result in the timstab table.
When all authors have been processed, the timsExecute function executes a select statement that will return the rows stored in
timstab. The function returns the handle of the select statement in phStmt, so that subsequent calls to SQLFetch can retrieve
the results.
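The author selection rule used inside timsExecute can be isolated into a small helper to make its two cases explicit: with no arguments every author is selected, and otherwise an author is selected when any argument is a prefix of the author's name. The helper name author_selected and the sample names used to exercise it are hypothetical; the comparison itself uses only the standard strncmp and strlen calls that appear in the sample.

```c
#include <string.h>

/*
 * Returns 1 when 'author' should be included in the result set.
 * With no names, every author matches; otherwise a name matches when it
 * is a (case-sensitive) prefix of the author string.
 */
static int author_selected(const char *author,
                           const char *const *names, int count)
{
    int i;

    if (count == 0)
        return 1;
    for (i = 0; i < count; ++i) {
        if (strncmp(author, names[i], strlen(names[i])) == 0)
            return 1;
    }
    return 0;
}
```

Because the comparison length is the length of the argument rather than of the author name, an argument that is longer than the stored name never matches it.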
Function udpMoreResults
UDPs that need to return the result sets of more than one select statement must include a udpMoreResults function, which is
called by RDM Server SQL from the udpMoreResults entry in UDPLOADTABLE when the application calls SQLMoreResults on the statement handle associated with the call (execute) statement. (SQLMoreResults is called after the initial
statement execution, which takes place in udpExecute, has occurred.) The udpMoreResults function has the same arguments
and executes the same way as the udpExecute function, also returning the statement handle associated with the currently
executing select statement in phStmt. When there are no more result sets to return, have udpMoreResults return
SQL_NO_DATA, informing SQL that UDP processing is complete and that it should call the UDP's udpCleanup function, if one exists.
In the next example, the first time timsMoreResults is called, the call to SQLExecDirect sets up a result set that retrieves
the total number of rows in the timstab table. The finished flag in the UDP context is set to indicate that the first call has
been made, so that the next time SQLMoreResults is called, the function sees this set flag and returns SQL_NO_DATA.
If a UDP omits the udpMoreResults function, SQLMoreResults automatically returns SQL_NO_DATA.
int16 REXTERNAL timsMoreResults(
    void      **ctxp,    /* in:  proc context pointer */
    int16       noargs,  /* in:  number of arguments to procedure */
    VALUE      *args,    /* in:  array of arguments */
    RM_MEMTAG   mTag,    /* in:  memory tag for rm_ memory calls */
    SQLHSTMT   *phStmt,  /* out: hstmt for result set */
    VALUE      *err)     /* out: container for error messages */
{
UDP_CTX *ctx = *ctxp;
if (ctx->finished)
return SQL_NO_DATA;
ctx->finished = 1;
SQLCloseCursor(ctx->hStmt);
SQLExecDirect(ctx->hStmt,
"select count(*) 'TOTAL ROWS FOUND' from timstab", SQL_NTS);
*phStmt = ctx->hStmt;
return SQL_SUCCESS;
}
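The finished flag used by timsMoreResults handles exactly one additional result set. A UDP that returns several additional result sets could keep a counter in its context instead. The following is a sketch of that idea only: DEMO_CTX, demo_more_results, and the DEMO_ status values are stand-ins invented for this illustration, and the real function would execute a statement and return its handle in phStmt at the point where the sketch hands back an index.

```c
/* Stand-in status codes for this sketch only. */
enum { DEMO_SQL_SUCCESS = 0, DEMO_SQL_NO_DATA = 100 };

typedef struct demo_ctx {
    int next_set;    /* index of the next additional result set */
    int total_sets;  /* number of additional result sets        */
} DEMO_CTX;

/*
 * Hands out the index of the next additional result set, or reports
 * DEMO_SQL_NO_DATA (leaving *set_out unchanged) once all are delivered.
 */
static int demo_more_results(DEMO_CTX *ctx, int *set_out)
{
    if (ctx->next_set >= ctx->total_sets)
        return DEMO_SQL_NO_DATA;
    *set_out = ctx->next_set++;
    return DEMO_SQL_SUCCESS;
}
```

Setting total_sets to 1 reproduces the behavior of the finished flag: the first call succeeds and every later call reports that no data remains.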
Function udpColData
This function, if specified, is called by SQL while processing a call to function SQLProcedureColumns in order to
retrieve descriptions of the columns in the result set returned by the UDP. The function returns via the pColDescr output argument a description of the column specified by argument colno where the first column is 0. The function must return status
SQL_NO_DATA when the specified colno is invalid (either less than zero or greater than or equal to the number of columns
in the UDP result set).
The udpColData entry in the UDPLOADTABLE must be NULL if the UDP does not return a result set and can be NULL
even if the UDP does return a result set.
The pColDescr argument is a pointer to a struct of type UDPPROCCOLDATA which is declared in header file
sqlsys.h as shown below.
/* user defined procedure column data */
typedef struct udpproccoldata {
    char        dbname[33];
    char        tblname[33];
    char        procname[33];
    char        colname[33];
    int16       coltype;
    int32       datatype;
    char        sqltypename[33];
    int32       precision;
    int32       length;
    int16       scale;
    int16       radix;
    int16       nullable;
    const char *remarks;
    char        col_def[33];
    int32       sql_data_type;
    int32       sql_datetime_sub;
    int32       char_octet_len;
    int32       ordinal_pos;
    char        is_nullable[4];
    char        specific_name[33];
} UDPPROCCOLDATA;
Each element of the UDPPROCCOLDATA struct corresponds to a column of the result set returned by the SQLProcedureColumns ODBC function call as described in the following table.
Table 13-7. UDPPROCCOLDATA Struct Element Descriptions
Element            Description
dbname             The name of the database accessed by the UDP or NULL.
tblname            The name of the table accessed by the UDP or NULL.
procname           The name of the UDP.
colname            The name of the result set column.
coltype            The type of column: only 5=result set column is supported.
datatype           SQL data type constant (e.g., SQL_SMALLINT).
sqltypename        RDM Server data type name (e.g., "smallint").
precision          The specified max size of a character column or the precision of a numeric column (e.g., integer, float, double).
length             The maximum length in bytes to contain the column values. For char data this includes the terminating null byte.
scale              For columns of type decimal this contains the number of decimal places in the result. Zero otherwise.
radix              For numeric types either 10 (decimal) or 2 (all others). Zero for non-numeric types.
nullable           Indicates if the column can accept a NULL value.
remarks            Either NULL or a udpColData-allocated (use rm_getMemory with the mTag argument) string containing a description of your choice.
col_def            A string containing the column's default value or "TRUNCATED" if the default value does not fit in the field.
sql_data_type      0 (unused).
sql_datetime_sub   0 (unused).
char_octet_len     Currently, this returns the same value as the length field.
ordinal_pos        The column's ordinal position in the result set beginning with 1.
is_nullable        "YES" if the column can include nulls, "NO" if not, "" if it is unknown.
specific_name      Same as procname.
The tims_data version of udpColData is shown below including its declaration of the UDPPROCCOLDATA table for the result set it returns.
...
static const UDPPROCCOLDATA timsDataCD[] = {
{"","","tims_data","author",0,1,"CHAR",31,31,
-1,10,2,NULL,"",0,0,31,1,"","tims_data"},
{"","","tims_data","id_code",0,1,"CHAR",15,15,
-1,10,2,NULL,"",0,0,15,2,"","tims_data"},
{"","","tims_data","info_title",0,1,"CHAR",48,48,
-1,10,2,NULL,"",0,0,48,3,"","tims_data"},
{"","","tims_data","publisher",0,1,"CHAR",31,31,
-1,10,2,NULL,"",0,0,31,4,"","tims_data"},
{"","","tims_data","pub_date",0,1,"CHAR",11,11,
-1,10,2,NULL,"",0,0,11,5,"","tims_data"},
{"","","tims_data","info_type",0,5,"SMALLINT",5,2,
0,10,2,NULL,"",0,0,0,6,"","tims_data"},
};
#define MAX_TIMSDATA_COLUMNS (sizeof(timsDataCD)/sizeof(UDPPROCCOLDATA))
...
static int16 REXTERNAL timsColData(
    const VALUE     *args,       /* in:  array of arguments */
    UDPPROCCOLDATA  *pColDescr,  /* out: procedure column data pointer */
    int16            colno,      /* in:  column number */
    RM_MEMTAG        mTag,       /* in:  memory tag for rm_ memory calls */
    VALUE           *err)        /* out: container for error messages */
{
const char *procname = args[2].vt.cv;
UNREF_PARM(mTag)
UNREF_PARM(err)
if (stricmp(procname, "tims_data") != 0)
return SQL_NO_DATA;
if ((uint16) colno >= MAX_TIMSDATA_COLUMNS)
return SQL_NO_DATA;
memcpy(pColDescr, &timsDataCD[colno], sizeof(UDPPROCCOLDATA));
return SQL_SUCCESS;
}
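The range check in timsColData relies on a cast to an unsigned type so that a single comparison rejects both negative column numbers and column numbers past the end of the table. The self-contained sketch below shows the same technique on a reduced table; DEMO_COL, demo_cols, demo_col_data, and the DEMO_ status values are stand-ins invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in status codes for this sketch only. */
enum { DEMO_SQL_SUCCESS = 0, DEMO_SQL_NO_DATA = 100 };

/* Reduced column description: just a name and an ordinal position. */
typedef struct demo_col {
    char colname[33];
    int  ordinal_pos;
} DEMO_COL;

static const DEMO_COL demo_cols[] = {
    { "author",     1 },
    { "id_code",    2 },
    { "info_title", 3 }
};
#define DEMO_NUM_COLS (sizeof(demo_cols) / sizeof(demo_cols[0]))

/* Copies the description of column 'colno' (zero-based) into 'out'. */
static int demo_col_data(short colno, DEMO_COL *out)
{
    /* A negative colno converts to a large unsigned value, so this one
       comparison covers "less than zero" and "past the last column". */
    if ((unsigned short) colno >= DEMO_NUM_COLS)
        return DEMO_SQL_NO_DATA;
    memcpy(out, &demo_cols[colno], sizeof(*out));
    return DEMO_SQL_SUCCESS;
}
```

Deriving the column count from the array with sizeof keeps the bound in step with the table when columns are added or removed.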
Function udpCleanup
The udpCleanup function is called by SQL when UDP execution is completed (e.g., when SQL_NO_DATA is returned by
udpExecute or udpMoreResults). It is used to free memory allocated in udpInit and/or udpExecute, close database
connections, and drop temporary tables.
NEVER call rm_freeTagMemory within udpCleanup using the memory tag passed into the function.
Instead, free any UDP-allocated memory using rm_freeMemory.
The following example shows the timsCleanup function from the sample tims_data UDP. This function first closes the active
statement handle, then executes a drop table statement to drop the timstab temporary table. Afterwards, timsCleanup frees
the statement handle, then disconnects and frees the connection and frees the SQL environment handle. Finally, the function closes the
tims database and frees the context memory using rm_freeMemory.
void REXTERNAL timsCleanup(
void **ctxp, /* in: statement udp_ctx pointer */
int16 mTag) /* in: memory tag for rm_ memory calls */
{
UDP_CTX *ctx = *ctxp;
SQLCloseCursor(ctx->hStmt);
SQLExecDirect(ctx->hStmt, "drop table timstab", SQL_NTS);
SQLFreeHandle(SQL_HANDLE_STMT, ctx->hStmt);
SQLDisconnect(ctx->hDbc);
SQLFreeHandle(SQL_HANDLE_DBC, ctx->hDbc);
SQLFreeHandle(SQL_HANDLE_ENV, ctx->hEnv);
d_rdlockmodes(0, 1, ctx->hSess);
d_close(ctx->hDb);
rm_freeMemory(ctx, mTag);
}
Function ModCleanup
If you include ModCleanup in the UDP module, the server calls it when unloading the module. The following example
shows the ModCleanup function for the sample UDP.
int16 REXTERNAL ModCleanup(
HMOD hMod) /* in: Module handle, used by SQLTransactTrigger() */
{
ghMod = NULL;
return S_OKAY;
}
14.2.2 Calling a UDP
Once you have coded and successfully compiled the UDP module, it needs to be registered with the system so that SQL will
know where to find it. The create procedure statement defined at the start of this section is used to do this. The following is an example of this statement issued for tims_data.
create procedure tims_data in "udp" on sqlsamp;
Once the module is registered, the application can call the UDP just like an SQL-based stored procedure. The following
example shows the call statement that executes the tims_data UDP. Each parameter specified for tims_data is the name (or
partial name) of an author for whom the UDP is retrieving applicable publications contained in the tims database. Note that
this output shows the results of both the udpExecute function and the udpMoreResults function.
14.3 Login or Logout UDP Example
Login and logout procedures are stored procedures (or UDPs) that are invoked when all or certain users log in to or log out
from the server. These procedures do not have parameters nor retrieve result sets. An administrator initially creates and activates these procedures, but they are automatically invoked by other users when they log in or log out. Login and logout procedures can be used for setting user environment values (for example, display formats) or for performing specialized security
functions.
Two UDPs are included in the sample UDP module (udp.c) provided with the system. The UDP named log_login is a login
procedure and the one named log_logout is, you guessed it, a logout procedure. As login and logout procedures do not have
arguments nor retrieve results, the udpCheck and udpMoreResults function entries in the UDPLOADTABLE are NULL.
These UDPs both use logInit (type UDPINIT) as their initialization function, as shown in the following code. The logInit
function allocates the memory for the UDP context (LOG_CTX), attaches to the client connection, and allocates and initializes needed SQL handles. It also compiles an insert statement (LogInsert) used to store the record of a login or logout operation. This insert stores a new row in the table called activity_log, which must have been previously created or else
SQLPrepare will fail. If this happens, logInit will create the activity_log database and table and call SQLPrepare again.
The UDP context includes fields to contain the action and label strings. These are associated (by SQLBindParameter) with
the two parameter markers contained in LogInsert. The declarations of the user_name and stamp columns in activity_log specify default values that automatically store the user name and the current timestamp values (the current date and time) in the
table.
...
typedef struct log_ctx {
    HENV     hEnv;
    HDBC     hDbc;
    HSTMT    hStmt;
    RDM_SESS hSess;
    char     action[24];
    char     label[33];
} LOG_CTX;
...
static const char LogCreateDb[] =
"create database activity_log";
static char LogCreate[] =
    "create table activity_log("
    "action    char(23),"
    "label     char(32)  default null,"
    "user_name char(32)  default user,"
    "stamp     timestamp default now)";
static char LogGrant[] = "grant insert on activity_log to public";
static char LogInsert[] = "insert into activity_log(action, label) values(?, ?)";
...
/* ======================================================================
Initialization function for logging functions
*/
int16 REXTERNAL logInit(
    void      **ctxp,    /* in:  proc context pointer */
    int16       noargs,  /* in:  number of arguments passed */
    VALUE      *args,    /* in:  arguments, args[noargs-1] */
    RDM_SESS    hSess,   /* in:  current session id */
    RM_MEMTAG   mTag,    /* in:  memory tag for rm_ memory calls */
    VALUE      *err)     /* out: container for error messages */
{
int16 stat;
HSTMT hstmt;
LOG_CTX *ctx = rm_getMemory(sizeof(LOG_CTX), mTag);
ctx->hSess = hSess;
SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &ctx->hEnv);
SQLAllocHandle(SQL_HANDLE_DBC, ctx->hEnv, &ctx->hDbc);
SQLConnectWith(ctx->hDbc, hSess);
SQLAllocHandle(SQL_HANDLE_STMT, ctx->hDbc, &hstmt);
if ((stat = SQLPrepare(hstmt, LogInsert, SQL_NTS)) != SQL_SUCCESS) {
/* activity_log table has not been created - create it */
SQLExecDirect(hstmt, LogCreateDb, SQL_NTS);
SQLExecDirect(hstmt, LogCreate, SQL_NTS);
SQLExecDirect(hstmt, "commit", SQL_NTS);
SQLExecDirect(hstmt, LogGrant, SQL_NTS);
SQLPrepare(hstmt, LogInsert, SQL_NTS);
}
SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR,
SQL_CHAR, 10, 0, ctx->action, 0, NULL);
SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR,
SQL_CHAR, 20, 0, ctx->label, 0, NULL);
ctx->hStmt = hstmt;
*ctxp = ctx;
return SQL_SUCCESS;
}
The udpExecute function for the log_login UDP called logLogin is given below. All that logLogin must do is copy "login"
to the action string and place the session ID number in the label string. The procedure then calls SQLExecute to insert the
row into activity_log. After this, a call is made to SQLEndTran to commit the new row to the database.
The logLogout function for the log_logout procedure is identical to logLogin except that it copies "logout"
(instead of "login") to the action string.
int16 REXTERNAL logLogin(
    void     **ctxp,    /* in:  proc context pointer */
    int16      noargs,  /* in:  number of arguments to procedure */
    VALUE     *args,    /* in:  array of arguments */
    int16      mTag,    /* in:  memory tag for rm_ memory calls */
    HSTMT     *phStmt,  /* out: hstmt for result set */
    VALUE     *err)     /* out: container for error messages */
{
    LOG_CTX *ctx = *ctxp;
    LOG_CTX *ptr;
    /* record the login */
    strcpy(ctx->action, "login");
    sprintf(ctx->label, "session %d", ctx->hSess);
    SQLExecute(ctx->hStmt);
    SQLEndTran(SQL_HANDLE_DBC, ctx->hDbc, SQL_COMMIT);
    return SQL_SUCCESS;
}
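The action and label buffers in the context have fixed sizes (24 and 33 bytes), so a login or logout procedure that builds longer strings may prefer bounded formatting. The sketch below shows one way to do that with the standard snprintf function. DEMO_LOG_CTX and demo_fill_log are names invented for this illustration, and passing the session identifier as a long is an assumption, since the underlying type of RDM_SESS is not given here.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for the two string fields of the logging context. */
typedef struct demo_log_ctx {
    char action[24];
    char label[33];
} DEMO_LOG_CTX;

/*
 * Fills in the bound parameter buffers. snprintf never writes past the
 * given size and always null-terminates, so over-long input is truncated
 * instead of overrunning the buffer.
 */
static void demo_fill_log(DEMO_LOG_CTX *ctx, const char *action,
                          long session_id)
{
    snprintf(ctx->action, sizeof(ctx->action), "%s", action);
    snprintf(ctx->label, sizeof(ctx->label), "session %ld", session_id);
}
```

The same helper serves both procedures: the login function would pass "login" and the logout function "logout" before executing the prepared insert statement.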
Your login or logout procedure can optionally contain a cleanup function. The sample log_login and log_logout UDPs
include the logCleanup function shown below. This function performs cleanup by freeing and disconnecting the appropriate
SQL handles, and then freeing the context memory.
void REXTERNAL logCleanup(
    void      **ctxp,  /* in: statement udp_ctx pointer */
    RM_MEMTAG   mTag)  /* in: memory tag for rm_ memory calls */
{
LOG_CTX *ctx = *ctxp;
SQLFreeHandle(SQL_HANDLE_STMT, ctx->hStmt);
SQLDisconnect(ctx->hDbc);
SQLFreeHandle(SQL_HANDLE_DBC, ctx->hDbc);
SQLFreeHandle(SQL_HANDLE_ENV, ctx->hEnv);
rm_freeMemory(ctx, mTag);
}
Only an administrative user can set up a login or logout procedure for use. The first step is to issue a create procedure statement similar to the following.
create procedure log_login in udp on rdsdll;
create procedure log_logout in udp on rdsdll;
After the login or logout procedure is set up, the administrator must assign the procedure by using a set login proc statement,
as shown below. The for clause is used to set up the login or logout procedure for use by all users or only for the list of specified user identifiers. Recall that user names (ids) are case-sensitive so that "Sam" and "sam" are considered to be different
users.
set login proc for public to log_login;
set logout proc for public to log_logout;
With the login procedure (and logout procedure if applicable) assigned, the administrator can turn the registered login or
logout procedures on or off as needed. To do this, the administrator issues a set login statement with the on or off clause. The
effect is system wide and persists until the next set login statement is issued by any administrative user. Use of login procedures is initially turned off.
set login on
Now that the sample login and logout procedures are enabled, every login and logout by any user causes a row containing
the action, user ID, time, etc., to be inserted into the activity_log table. After a few log ins and log outs, the administrator can
view the table by issuing the following select statement.
select * from activity_log;
To disable calls to a login or logout procedure, the administrator can issue another set login proc statement. This time, the
administrator specifies null instead of the procedure name in the to clause.
set login proc for public to null;
14.4 Transaction Triggers
A transaction trigger is a server-side function, residing in a UDP module, that RDM Server calls whenever a transaction operation occurs (commit, rollback, savepoint, or rollback to a savepoint). The trigger can be registered to be called either for the
next transaction operation only or for every transaction operation. The function is activated from within a UDP by a call to
the SQL API function, SQLTransactTrigger.
14.4.1 Transaction Trigger Registration
Transaction triggers are registered through a call to SQLTransactTrigger, usually issued from a login UDP. This function
is only available on the server; it cannot be called from the client side. Thus, it must be called from a UDP (C-based procedure). The declaration for SQLTransactTrigger is shown below.
int16 REXTERNAL SQLTransactTrigger(
    HMOD              hMod,
    RDM_SESS          hSess,
    const char       *name,
    PTRANSACTTRIGGER  Trigger,
    void             *ptr,
    int16             mode);
The SQLTransactTrigger function arguments are described in the table below.
Argument    Description
hMod        The handle that uniquely identifies the UDP module. The hMod value is
            originally passed into ModInit in the UDP module when the module is first
            loaded. If you are using transaction triggers in your UDP, you must define
            the ModInit function, so you can get the module handle and save it in a
            global variable for later use in SQLTransactTrigger.
hSess       The session handle of the user activating the transaction trigger. This
            handle is an argument to the udpInit function, which can either call
            SQLTransactTrigger itself or save the session handle in the UDP context so
            that the udpExecute function can call SQLTransactTrigger (see examples in
            this section).
name        A unique name to be associated with this transaction trigger.
Trigger     A pointer to the transaction trigger implementation function to be
            activated.
ptr         A pointer to any context memory needed by the transaction trigger. Note
            that this memory needs to survive as long as the connection (session) is
            open.
mode        Informs the SQL system as to how often the transaction trigger is to fire
            on transactions issued by the connection associated with hSess. Set to
            SYS_COMMIT_EVERY to indicate that the trigger is to fire after every
            transaction. Set to SYS_COMMIT_ONCE to indicate that the trigger is only to
            fire on the next transaction operation, after which the transaction trigger
            is deactivated and a subsequent SQLTransactTrigger call is required in
            order to reactivate the trigger.
Once the transaction trigger is registered, the RDM Server "fires" (calls) the function any time the related connection executes
a transaction commit, savepoint, or rollback. This can happen through a call to either SQLEndTran or SQLExtendedTransact, through a commit, mark, or rollback statement, or (if the session is in auto-commit mode) when RDM
Server automatically issues a transaction commit following an insert, update, or delete.
The code below shows a version of the logLogin function that implements a transaction trigger. After recording the login in
the activity_log table, the function allocates a context of type LOG_CTX, calls SQLConnectWith to associate the invoking client's session handle with an SQL connection, allocates a statement handle, and compiles the LogInsert statement. It
then calls SQLBindParameter to associate the action and label variables in the log context with LogInsert's parameter
markers.
Finally, logLogin calls SQLTransactTrigger, passing the global module handle set in the original call to ModInit. The
call to SQLTransactTrigger also passes the session handle stored in the log_login context by logInit and names the transaction trigger activity_log. Additionally, the call provides the address of the transaction trigger function named TransactTrigger, a pointer to the context for the transaction trigger function, and a constant indicating that the transaction trigger
function is to be called on every commit, savepoint, or rollback operation.
typedef struct log_action {
    int16              type;
    char              *label;
    struct log_action *next;
    struct log_action *prev;
} LOG_ACTION;

typedef struct log_ctx {
    HENV        hEnv;
    HDBC        hDbc;
    HSTMT       hStmt;
    RDM_SESS    hSess;
    RM_MEMTAG   mTag;
    char        action[16];
    char        label[21];
    LOG_ACTION *act_list;
} LOG_CTX;
...
/* ======================================================================
   Main for login procedure
*/
int16 REXTERNAL logLogin(
    void      **ctxp,    /* in: proc context pointer */
    int16       noargs,  /* in: number of arguments to procedure */
    VALUE      *args,    /* in: array of arguments */
    RM_MEMTAG   mTag,    /* in: memory tag for rm_ memory calls */
    HSTMT      *phStmt,  /* out: hstmt for result set */
    VALUE      *err)     /* out: container for error messages */
{
    LOG_CTX *ctx = *ctxp;
    LOG_CTX *ptr;

    /* record the login */
    strcpy(ctx->action, "login");
    sprintf(ctx->label, "session %d", ctx->hSess);
    SQLExecute(ctx->hStmt);
    SQLEndTran(SQL_HANDLE_DBC, ctx->hDbc, SQL_COMMIT);

    /* set the transaction trigger for this connection */
    /* allocate trigger's log context on global tag */
    ptr = rm_getMemory(sizeof(LOG_CTX), 0);
    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &ptr->hEnv);
    SQLAllocHandle(SQL_HANDLE_DBC, ptr->hEnv, &ptr->hDbc);
    SQLConnectWith(ptr->hDbc, ctx->hSess);
    SQLAllocHandle(SQL_HANDLE_STMT, ptr->hDbc, &ptr->hStmt);
    SQLPrepare(ptr->hStmt, LogInsert, SQL_NTS);
    SQLBindParameter(ptr->hStmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR,
                     SQL_CHAR, 10, 0, ptr->action, 0, NULL);
    SQLBindParameter(ptr->hStmt, 2, SQL_PARAM_INPUT, SQL_C_CHAR,
                     SQL_CHAR, 20, 0, ptr->label, 0, NULL);
    ptr->act_list = NULL;
    ptr->mTag = rm_createTag(NULL, 0, NULL, NULL, 0, RM_NOSEM);
    SQLTransactTrigger(ghMod, ctx->hSess, "activity_log",
                       TransactTrigger, ptr, SYS_COMMIT_EVERY);
    return SQL_SUCCESS;
}
14.4.2 Transaction Trigger Implementation
Once SQLTransactTrigger is called, the transaction trigger function will be called by RDM Server when a transaction
operation occurs. The transaction trigger function is a function of type TRANSACTTRIGGER, with a prototype as shown
below. Note that you can name this function whatever you want, as it is identified only by the function pointer passed in to
the SQLTransactTrigger call that activated it.
void REXTERNAL TransactTrigger(
    int16  type,
    char  *label,
    char  *name,
    void  *ptr);
Each of the arguments is described in Table 13-9 below.
Table 13-9. Transaction Trigger Implementation Function Argument Descriptions
Argument    Description
type        See Table 13-10.
label       The transaction identifier specified with the savepoint or rollback.
name        The name of the transaction trigger being fired. Allows more than one
            transaction trigger to share the same implementation function.
ptr         A pointer to the transaction trigger context data passed in to the call to
            SQLTransactTrigger which activated this trigger.
The table below describes each of the possible values for the type function argument.
Table 13-10. Transaction Type Descriptions
Type             Description
SYS_SAVEPOINT    Triggered by a "savepoint label" operation.
SYS_ROLLBACK     Triggered by a "rollback [to savepoint label]" operation. The label
                 argument will be an empty string ("") to indicate that the entire
                 transaction is rolled back.
SYS_COMMIT       Triggered by a "commit" operation.
SYS_REMOVE       The trigger is being deleted. This happens because either the trigger
                 was registered to fire only once and has already fired, or was
                 registered to fire for every transaction and the application is
                 disconnecting from the server. This value indicates that the trigger
                 should perform any necessary cleanup (e.g., freeing the allocated
                 memory pointed to by the ptr argument).
The transaction trigger is always called AFTER the transaction operation has completed. Hence, if type is
SYS_COMMIT the TransactTrigger function is called after the commit has completed writing its changes
to the database.
The following example implements a transaction trigger (function TransactTrigger in udp.c) that is used in conjunction with
login procedures and that records a log of every transaction including the user id of the user issuing the transaction and the
timestamp when it occurred. The log is maintained in the activity_log database/table which is created by the logInit function
for the log_login UDP described earlier in section 14.3.
TransactTrigger operation depends on the value passed in through argument type. If type is SYS_REMOVE then the transaction trigger is being deleted and the trigger function needs to clean up after itself by freeing its previously allocated handles
and memory. Note that the call to SQLDisconnect breaks the association between the connection handle (ctx->hDbc) and
the client session handle that was established by the call to SQLConnectWith issued by function logLogin shown above.
If type is SYS_SAVEPOINT, or SYS_ROLLBACK to a previous savepoint (indicated by the presence of its label in the list of previously issued savepoint labels stored in the ctx->act_list linked list), the transaction action is saved in the activity list
to be written later to the activity_log table, after the application's transaction has been committed (or rolled back). This is
necessary in order to ensure that the rows inserted into the activity_log table are not included in the application's transaction,
as they would get rolled back with an application's rollback, leaving no log of the transaction operation.
The IsASavepoint function determines which type of rollback is in effect. When saving savepoint actions, the savepoint label
is stored in the LOG_ACTION entry. A SYS_ROLLBACK which has a label equal to one of the previously saved savepoint
labels indicates that it is a rollback to savepoint operation that triggered this call to TransactTrigger in which case IsASavepoint will return true. (Note that in RDM Server SQL the non-standard begin transaction statement can specify a transaction
id which can be specified in a subsequent commit or rollback in which case the label passed to TransactTrigger does not correspond to a savepoint.)
When the type is SYS_COMMIT or SYS_ROLLBACK (not associated with a prior savepoint) then a row is inserted into the
activity_log table for each previously saved action as well as for the final commit or rollback itself. As TransactTrigger is called
after the commit or rollback has completed, the inserts into the activity_log table are performed in an independent
transaction which is committed by the call to SQLEndTran at the end of the function.
/* ======================================================================
   Function to find savepoint label in the list
*/
static int16 IsASavepoint(
    LOG_ACTION *lap,
    const char *label)
{
    LOG_ACTION *lgp;

    for (lgp = lap; lgp; lgp = lgp->next) {
        /* entries saved without a label cannot match */
        if (lgp->label && strcmp(lgp->label, label) == 0)
            return 1;
    }
    return 0;
}
/* ======================================================================
   Transact Trigger
*/
static void REXTERNAL TransactTrigger(
    int16       type,
    const char *label,
    const char *name,
    void       *ptr)
{
    LOG_CTX          *ctx = (LOG_CTX *)ptr;
    LOG_ACTION       *lap = NULL;
    const LOG_ACTION *lgp;

    UNREF_PARM(name);

    if (type == SYS_REMOVE) {
        /* trigger is being deleted: free saved actions, handles, and context */
        rm_freeTagMemory(ctx->mTag, 1);
        ctx->mTag = NULL;
        SQLFreeHandle(SQL_HANDLE_STMT, ctx->hStmt);
        SQLDisconnect(ctx->hDbc);
        SQLFreeHandle(SQL_HANDLE_DBC, ctx->hDbc);
        SQLFreeHandle(SQL_HANDLE_ENV, ctx->hEnv);
        rm_freeMemory(ctx, TAG0);
    }
    else {
        if (type == SYS_SAVEPOINT ||
            (type == SYS_ROLLBACK && IsASavepoint(ctx->act_list, label))) {
            /* save the event info, to be committed later to the log */
            lap = (LOG_ACTION *)rm_getMemory(sizeof(LOG_ACTION), ctx->mTag);
            if (lap != NULL) {
                lap->type = type;
                lap->next = ctx->act_list;
                lap->prev = NULL;
                if (ctx->act_list) ctx->act_list->prev = lap;
                lap->label = (label && *label) ? rm_Strdup(label, ctx->mTag) : NULL;
                ctx->act_list = lap;
            }
        }
        else if (type == SYS_COMMIT || type == SYS_ROLLBACK) {
            /* Process stored actions (if any) */
            if (ctx->act_list) {
                /* Find first event (it's last in the list) */
                for (lap = ctx->act_list; lap->next; lap = lap->next)
                    ;
                for (lgp = lap; lgp; lgp = lgp->prev) {
                    switch (lgp->type) {
                        case SYS_SAVEPOINT:
                            strcpy(ctx->action, "savepoint");
                            break;
                        case SYS_ROLLBACK:
                            /* "rollback to savepoint" is longer than
                               action[16], so copy with truncation */
                            strncpy(ctx->action, "rollback to savepoint",
                                    sizeof(ctx->action) - 1);
                            ctx->action[sizeof(ctx->action) - 1] = '\0';
                            break;
                        default:
                            break;
                    }
                    /* a saved action may have no label */
                    strcpy(ctx->label, lgp->label ? lgp->label : "");
                    SQLExecute(ctx->hStmt);
                }
                ctx->act_list = NULL;
                rm_freeTagMemory(ctx->mTag, 0);
            }
            switch (type) {
                case SYS_COMMIT:
                    strcpy(ctx->action, "commit");
                    break;
                case SYS_ROLLBACK:
                    strcpy(ctx->action, "rollback");
                    break;
                default:
                    break;
            }
            strcpy(ctx->label, label);
            SQLExecute(ctx->hStmt);
            /* commit the log rows in their own transaction */
            SQLEndTran(SQL_HANDLE_DBC, ctx->hDbc, SQL_COMMIT);
        }
    }
}
15. Query Optimization
The RDM Server SQL query optimizer is designed to generate efficient execution plans for database queries. The typical
kinds of queries used in an embedded database application environment include standard join, index lookup optimizations,
and grouping and sorting. Some of the more sophisticated optimization techniques (for example, support of on-line analysis
processing) are not included in the RDM Server optimizer. However, RDM Server does provide various access methods
(which support very fast access to individual rows) with the direct access capabilities of predefined joins and rowid primary
keys.
Overview of the Query Optimization Process
In SQL, queries are specified using the select statement, and many methods (or query execution plans) exist for processing a
query. The goal of the optimizer is to discover, among many possible options, which plan will execute in the shortest amount
of time. The only way to guarantee a specific plan as optimal is to execute every possibility and select the fastest one. As this
defeats the purpose of optimization, other methods must be devised.
The query optimizer must resolve two interrelated issues: how it will access each table referenced in the query, and in what
order. To access requested rows in a table, the optimizer can choose from a variety of access methods, such as indexes or predefined joins. It determines the best execution plan by estimating the cost associated with each access method and by factoring in the constraints on these methods imposed by each possible access ordering. Note that the decisions made by the
optimizer are independent of the listed order of the tables in the from clause or the location of the expressions in the where
clause.
Consider the following example query from the sales database.
select company, ord_num, ord_date, amount
from customer, sales_order
where customer.cust_id = sales_order.cust_id and
state = "CO" and ord_date = date "1993-04-01";
Two tables will be accessed: customer and sales_order. The first relational expression in the where clause specifies the join
predicate, which relates the two tables based on their declared foreign and primary keys. The DDL for the sales database (file
sales.sql) contains a create join called purchases on the sales_order foreign key, providing bidirectional direct access
between the two tables. Note that the state column in the customer table is also the first column in the cust_geo index, and
the ord_date column in the sales_order table is the first column in the order_ndx index. Thus the optimizer has choices of
which index to use. All possible execution plans considered by the RDM Server query optimizer for this query are listed in
the following table.
Table 15-1. Possible Execution Plans for Example Query
1. Scan customer table (that is, read all rows) to locate rows where state = "CO", then for each matching customer row,
scan sales_order table to locate rows that match customer's cust_id and have ord_date = 1993-04-01.
2. Scan customer table to locate rows where state = "CO", then for each customer row, read each sales_order row that
is connected through the purchases join, and return only those that have ord_date = 1993-04-01.
3. Use the cust_geo index to find the customer rows where state = "CO", then for each customer row, scan sales_order
table to locate rows that match customer's cust_id and have ord_date = 1993-04-01.
4. Use the cust_geo index to find the customer rows where state = "CO", then for each customer row, read each
sales_order row that is connected through the purchases join, and return only those that have ord_date = 1993-04-01.
5. Scan sales_order table to locate rows where ord_date = 1993-04-01, then for each sales_order row, scan customer
table to locate rows that match sales_order's cust_id and have state = "CO".
6. Scan sales_order table to locate rows where ord_date = 1993-04-01, then for each sales_order row, read the customer row that is connected through the purchases join, and return only those that have state = "CO".
7. Use the order_ndx index to find the sales_order rows where ord_date = 1993-04-01, then for each sales_order row,
scan customer table to locate rows that match sales_order's cust_id and have state = "CO".
8. Use the order_ndx index to find the sales_order rows where ord_date = 1993-04-01, then for each sales_order row,
read the customer row that is connected through the purchases join, and return only those that have state = "CO".
Because the time (based on the number of disk accesses) required to scan an entire table is generally much greater than the
time needed to locate a row through an index, plans 4 and 8 seem the best. However, it is unclear which of the two plans is
optimal. In fact, both are probably good enough to obtain acceptable performance.
Additional information that helps identify the best choice includes the number of rows in each table (28 customers, 127
sales_orders), the number of customers from Colorado (1), and the number of orders for April 1, 1993 (5). With this data we
can deduce that plan 4 is better than plan 8. Plan 4 requires 1 index lookup to find the one customer from Colorado (about 3
reads) plus the average cost to read through an instance of the purchases set to retrieve and check the dates of the related
sales_order records (average number of orders per customer = 127/28, or about 4). Thus, plan 4 uses about 7 disk accesses. Plan 8
will use the order_ndx index to find the 5 sales_order rows dated 1993-04-01 (about 8 reads) plus one additional read per order to
fetch and check the related customer record through the purchases set (5 reads). Hence, plan 8 uses about 13 disk accesses.
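The arithmetic behind this comparison can be written out as a short C sketch. The function names are hypothetical, and the read counts passed in (about 3 reads for the cust_geo lookup, about 8 for order_ndx) are simply the figures quoted above, not values taken from the optimizer itself.

```c
/* Plan 4: index lookup on cust_geo, then follow the purchases join
   from each matching customer to that customer's orders. */
long plan4_cost(long index_reads, long matching_customers,
                long total_customers, long total_orders)
{
    /* integer division reproduces the rounding used in the text: 127/28 = 4 */
    long orders_per_customer = total_orders / total_customers;

    return index_reads + matching_customers * orders_per_customer;
}

/* Plan 8: index lookup on order_ndx, then one read per matching order
   to fetch its customer through the purchases join. */
long plan8_cost(long index_reads, long matching_orders)
{
    return index_reads + matching_orders;
}
```

Calling plan4_cost(3, 1, 28, 127) gives 7 and plan8_cost(8, 5) gives 13, matching the estimates in the text.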
Note that plans 1 and 5 perform what is called a Cartesian or cross-product—for each row of the first table accessed, all rows
of the second table are retrieved. (Thus if the first table contained 500 rows and the second table contained 1000 rows, the
query would read a total of 500,000 rows.) Cross-products are extremely inefficient and will never be considered by the optimizer except when a necessary join predicate has been omitted from the query. In our example, this would occur if the relational expression, "customer.cust_id = sales_order.cust_id" was not specified. Necessary join predicates are often erroneously
omitted when four or more tables are listed in the from clause and/or when multi-column join predicates (for compound foreign and primary keys) are required.
The following diagram shows the basic operational phases of the query optimization process, as illustrated by the previous
example.
Figure 15-1. Query Optimization Process
Using the information in the system catalog, the select statement is parsed, validated, and represented in a set of easily processed query description tables. These tables include a tree representation of the where clause expressions (called the expression tree) and information about the tables, columns, keys, indexes, and joins in the database.
The system then analyzes those tables, and constructs both the access rule table and the expression table. For the referenced
tables, the analysis process uses the system catalog and the distribution statistics (collected by the update statistics statement).
The access rule table contains a rule entry for each possible access method (for example, table scan or index lookup) for each
table referenced in the from clause. The expression table has one entry for each conditional expression specified in the where
clause. These tables drive the actual optimization process.
Finally, the optimizer determines the plan with the lowest total cost. An execution plan consists of a series of steps,
one for each table listed in the from clause, each specifying how the table in that step will be accessed. The possible
access rules that can be applied at a step are sorted by their cost so that the first candidate rule is the cheapest. The optimizer's goal is to select one access rule for each step that minimizes the total cost of the complete execution plan. As the optimizer iterates through the steps, the cost of the candidate plan is updated. As soon as a candidate plan's cost exceeds the cost of
the currently best complete plan, the candidate plan is abandoned at its current step and the next rule for that step is then
tested. Conditional expressions that are incorporated into the plan are deleted from the expression tree so that they are not
redundantly executed.
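The search described above is a form of branch-and-bound. The following C sketch illustrates the idea under a deliberately simplified cost model; it is not the RDM Server implementation. All type and function names (ACCESS_RULE, TABLE_RULES, find_best_plan, sales_example_best) and all cost figures are hypothetical, chosen to resemble the customer/sales_order example.

```c
#include <float.h>
#include <string.h>

#define MAX_TABLES 8
#define MAX_RULES  4

/* One access rule for a table (hypothetical, simplified cost model). */
typedef struct {
    double cost;      /* logical reads per row delivered by the previous step */
    double rows_out;  /* rows produced per row delivered by the previous step */
    int    needs;     /* table that must be accessed earlier, or -1 if none */
} ACCESS_RULE;

typedef struct {
    int         nrules;
    ACCESS_RULE rule[MAX_RULES];
} TABLE_RULES;

typedef struct {
    int                n;                      /* number of tables in the query */
    const TABLE_RULES *tab;
    unsigned           used;                   /* bitmask of tables already placed */
    int                order[MAX_TABLES];      /* candidate access order */
    int                best_order[MAX_TABLES]; /* order of best complete plan */
    double             best_cost;
} PLAN_SEARCH;

static void search_step(PLAN_SEARCH *s, int step, double cost, double rows)
{
    int t, r;

    if (step == s->n) {  /* complete plan; pruning guarantees it is cheaper */
        s->best_cost = cost;
        memcpy(s->best_order, s->order, sizeof(s->order));
        return;
    }
    for (t = 0; t < s->n; ++t) {
        if (s->used & (1u << t))
            continue;                          /* table already in the plan */
        for (r = 0; r < s->tab[t].nrules; ++r) {
            const ACCESS_RULE *ar = &s->tab[t].rule[r];
            double c;

            if (ar->needs >= 0 && !(s->used & (1u << ar->needs)))
                continue;                      /* rule unusable in this ordering */
            c = cost + rows * ar->cost;
            if (c >= s->best_cost)
                continue;                      /* abandon: no better than best plan */
            s->used |= 1u << t;
            s->order[step] = t;
            search_step(s, step + 1, c, rows * ar->rows_out);
            s->used &= ~(1u << t);
        }
    }
}

/* Returns the lowest plan cost and stores the table order in best_order. */
double find_best_plan(const TABLE_RULES *tab, int n, int *best_order)
{
    PLAN_SEARCH s;

    memset(&s, 0, sizeof(s));
    s.n = n;
    s.tab = tab;
    s.best_cost = DBL_MAX;
    search_step(&s, 0, 0.0, 1.0);
    memcpy(best_order, s.best_order, (size_t)n * sizeof(int));
    return s.best_cost;
}

/* Illustrative figures only: table 0 = customer, table 1 = sales_order. */
double sales_example_best(int order[2])
{
    static const TABLE_RULES tab[2] = {
        { 2, { { 3.0, 1.0, -1 },      /* cust_geo index lookup */
               { 1.0, 0.2,  1 } } },  /* via purchases join from sales_order */
        { 2, { { 8.0, 5.0, -1 },      /* order_ndx index lookup */
               { 4.0, 1.0,  0 } } }   /* via purchases join from customer */
    };

    return find_best_plan(tab, 2, order);
}
```

With these figures the search first completes the plan that starts with customer, improves it to a cost of 7 by using the join, and then abandons every plan that starts with sales_order at its first step, because the index lookup alone (8) already costs more than the best complete plan.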
Cost-Based Optimization
The cost to determine the execution plan is the time it takes the optimizer to find the "optimal" plan. An execution plan consists of n steps where n is the number of tables listed in the from clause. Each step of the plan specifies the table to be
accessed and the method to be used to access a row from that table. This cost grows factorially with the number of tables listed in the from clause (n!). Performance impact is noticeable in RDM Server for queries that reference more than about 8
tables. This is due to the increasing number of combinations of access orderings that must be considered (2 tables have 2 possible orderings, 3 have 6, 4 have 24, etc.). The cost to estimate each candidate plan also includes a linear factor of the number
of access methods available at each step in a plan from which the optimizer must choose. More access methods means the
optimizer must do more work, but the odds of finding a good plan improve.
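To make the growth concrete, the small C function below counts the possible access orderings for n tables. The function name is hypothetical and the code only illustrates the n! figure quoted above.

```c
/* Number of possible access orderings for n tables (n!).
   The result fits in 64 bits for n <= 20. */
unsigned long long access_orderings(unsigned n)
{
    unsigned long long count = 1;
    unsigned i;

    for (i = 2; i <= n; ++i)
        count *= i;
    return count;
}
```

For 8 tables this gives 40,320 orderings, which is consistent with the point at which the text says the optimization time becomes noticeable.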
The cost to carry out an execution plan is the amount of I/O time required to read the database information from disk. In
RDM Server, an estimate of this cost is based on an estimate of the total number of logical I/O accesses that will occur during
execution. Because it is extremely difficult to accurately estimate the effects caused by caching performance and diverse database page sizes, physical I/O estimates are not possible. The logical I/O estimates are based on analysis of the logical I/O time
required to access a record occurrence for each access method.
A heuristic optimizer selects an execution plan by using built-in heuristics or general rules about which particular access
method will return the fewest rows. For example, heuristic optimizers automatically assume that a "col = value" condition will
restrict the result set to fewer rows than would a "col > value" condition (which is assumed to restrict, on average, only half
the rows). In a case where 100 rows are equal to "value" and no rows are greater than "value", this assumption breaks down
and the choice would not be optimal.
A cost-based optimizer maintains the data distribution statistics it uses to more quantitatively determine the better of two
access methods. A histogram is maintained for the distribution of most commonly occurring data in each column. The percentage of the file that is covered by a given inequality expression involving an indexed column (for example, "where ord_date
between date '1993-02-01' and date '1993-02-28'") is interpolated from the histogram, providing a more accurate assessment of the better alternative than a built-in heuristic.
The statistics maintained for use by cost-based optimizers are used to: 1) guide the choice between alternative access methods
derived from the relational expressions specified in the where clause, 2) estimate the number of output rows that result from
each plan step, and 3) estimate the number of logical I/O's incurred by each possible access method.
The statistics used by the RDM Server cost-based optimizer include:
• Number of pages in a file
• Number of rows per page in a file
• Number of rows in a table
• Depth of an index's B-tree
• Number of keys per page in an index
• Frequency distribution histogram of most commonly occurring values for each column
Update Statistics
The statistics are collected and stored in the SQL system catalog by executing the update statistics statement. The histogram
for each column is collected from a sampling of the data files. The other statistics are maintained by the RDM Server runtime
system.
The histogram for each column contains a sampling of values (25 by default, controlled by the OptHistoSize configuration
parameter) and, for each value, a count of the number of times it was found among the sampled rows (1000 by default, controlled by the OptSampleSize configuration parameter). The sampled values are taken from rows evenly distributed throughout the table.
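The following C sketch shows one way such a histogram could be built from evenly spaced rows. It is an illustration under stated assumptions, not the algorithm used by update statistics: the names HISTO_ENTRY, HISTOGRAM, and build_histogram are hypothetical, the column is assumed to be an in-memory array of long values, and the two constants simply mirror the default values of OptHistoSize and OptSampleSize.

```c
#include <stddef.h>

#define HISTO_SIZE  25     /* mirrors the OptHistoSize default */
#define SAMPLE_SIZE 1000   /* mirrors the OptSampleSize default */

typedef struct {
    long value;            /* sampled column value */
    long count;            /* times the value occurred in the sample */
} HISTO_ENTRY;

typedef struct {
    size_t      nentries;  /* distinct values tracked so far */
    long        sampled;   /* number of rows sampled */
    HISTO_ENTRY entry[HISTO_SIZE];
} HISTOGRAM;

/* Sample up to SAMPLE_SIZE rows at evenly spaced positions and count the
   occurrences of the first HISTO_SIZE distinct values encountered. */
void build_histogram(const long *column, size_t nrows, HISTOGRAM *h)
{
    size_t nsample = nrows < SAMPLE_SIZE ? nrows : SAMPLE_SIZE;
    size_t i, e;

    h->nentries = 0;
    h->sampled = 0;
    for (i = 0; i < nsample; ++i) {
        /* evenly spaced row positions across the whole table */
        long v = column[i * nrows / nsample];

        ++h->sampled;
        for (e = 0; e < h->nentries && h->entry[e].value != v; ++e)
            ;
        if (e < h->nentries) {
            ++h->entry[e].count;           /* value already tracked */
        }
        else if (h->nentries < HISTO_SIZE) {
            h->entry[e].value = v;         /* first occurrence, room left */
            h->entry[e].count = 1;
            ++h->nentries;
        }
        /* otherwise the histogram is full and the value is not tracked */
    }
}
```

Entries are stored here in the order in which values are first seen. When the table has no more than SAMPLE_SIZE rows, every row is sampled.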
When update statistics has not been performed on a database, RDM Server SQL uses default values that assume each table
contains 1000 rows. It is highly recommended that you always execute update statistics on every production database. The
execution time for an update statistics statement is not excessive in RDM Server and does not vary significantly with the size
of the database. Therefore, we suggest regular executions (possibly once per week or month, or following significant changes
to the database).
Restriction Factors
The histogram values are used to compute a restriction factor associated with a specified conditional expression. The restriction factor estimates the percentage of rows from the table that satisfy the conditional. For example, the restriction factor for a
between conditional is equal to the frequency count total of the histogram values that satisfy the conditional, divided by the
total of all sampled histogram values.
When an equality comparison value is not found in the histogram, the restriction factor is based on the average frequency
count of the five histogram values with the lowest frequency counts. The restriction factor for join predicates is based on the
average frequency count of all histogram values for the foreign/primary key column (this results in an estimate of the average
number of duplicates per value).
The histogram for the prod_id column of the item table is shown below.
Table 15-2. Histogram for ITEM.PROD_ID
Entry #    PROD_ID Value    # of Occurrences
0          10320            4
1          10433            12
2          11333            8
3          11433            14
4          12325            10
5          13032            9
6          14020            6
7          15200            5
8          16300            3
9          16301            11
10         16311            15
11         17110            2
12         17214            23
13         17419            12
14         19100            4
15         19400            5
16         20200            6
17         20308            4
18         20400            9
19         21200            6
20         21500            3
21         23100            11
22         23200            24
23         23400            26
24         24200            17
The item table has 461 rows. All rows were sampled by update stats. The table shows the histogram counts for the first 25 distinct values sampled by update stats from the item table. A total of 249 rows from the item table contained one of those
prod_id values. These values are used by the optimizer to compute restriction factors for prod_id comparisons specified in a
where clause. The following table gives the restriction factor for some example expressions.
Table 15-3. Example Restriction Factors
Conditional Expression                    Restriction Factor    Cardinality Estimate    Actual Count
prod_id = 16311                           0.032538              15                      15
prod_id >= 21200                          0.349398              161                     143
prod_id between 11433 and 20200           0.502008              231                     246
prod_id = 10450                           0.006941              3                       7
prod_id in (15200,20200,21200,24200)      0.073753              34                      34
The restriction factor multiplied by the cardinality of the table (461) gives the cardinality estimate of the conditional expression (i.e., an estimate of the number of rows from the table that satisfy the conditional). The count of the number of actual
matching rows is also listed in the table. The accuracy of the estimate is very good but that is primarily because all table rows
were sampled.
The restriction factor for the prod_id = 16311 conditional is computed from the histogram count for entry 10 of Table 15-2,
divided by the total number of sampled rows (461). Thus, 15/461 = 0.032538. Note that the cardinality estimate equals the
histogram count value because all of the rows in the table were sampled, which will only be true for small tables. If there
were 1000 rows sampled from a 50,000 row table the restriction factor would have been 0.015 and the cardinality estimate
0.015*50,000 = 750 rows.
The restriction factor for inequality conditions is estimated as the percentage of the histogram table that matches the conditional expression. Thus, the restriction factor for prod_id >= 21200 is equal to the sum of the histogram counts for the
prod_id entries >= 21200 divided by the sum of all counts (249) or, 87/249 = 0.349398. Applying the same method to
prod_id between 11433 and 20200 gives us 125/249 = 0.502008.
For equality comparisons with values not in the histogram (or when the comparison is against a parameter marker or stored procedure argument) an estimate of the average number of duplicates per value is computed. The estimate is equal to the average
count of the 5 least frequently occurring values in the histogram. Thus, the restriction factor for prod_id = 10450 is estimated as
((2+3+3+4+4)/5)/461 = 0.006941 (or an average of about 3 rows per value).
The restriction factor computation for the in conditional is simply the sum of the equality comparisons for each of the listed
values. The restriction factor for prod_id in (15200,20200,21200,24200) is (5+6+6+17)/461 = 0.073753.
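The computations described in this section can be reproduced with the C sketch below, which uses the histogram from Table 15-2. The function names (rf_equal, rf_range, rf_in, cardinality_estimate) are hypothetical and the code only follows the rules as stated in the text; it is not the optimizer's actual source.

```c
#include <limits.h>
#include <stddef.h>

typedef struct { long value; long count; } HISTO_ENTRY;

/* Histogram for ITEM.PROD_ID, copied from Table 15-2. */
static const HISTO_ENTRY prod_id_histo[] = {
    {10320,  4}, {10433, 12}, {11333,  8}, {11433, 14}, {12325, 10},
    {13032,  9}, {14020,  6}, {15200,  5}, {16300,  3}, {16301, 11},
    {16311, 15}, {17110,  2}, {17214, 23}, {17419, 12}, {19100,  4},
    {19400,  5}, {20200,  6}, {20308,  4}, {20400,  9}, {21200,  6},
    {21500,  3}, {23100, 11}, {23200, 24}, {23400, 26}, {24200, 17}
};
#define HISTO_N      (sizeof(prod_id_histo) / sizeof(prod_id_histo[0]))
#define SAMPLED_ROWS 461.0   /* all rows of the item table were sampled */

/* Sum of the histogram counts for values in the range lo..hi (inclusive). */
long range_count(long lo, long hi)
{
    long   sum = 0;
    size_t i;

    for (i = 0; i < HISTO_N; ++i)
        if (prod_id_histo[i].value >= lo && prod_id_histo[i].value <= hi)
            sum += prod_id_histo[i].count;
    return sum;
}

/* Inequality or between: matching counts divided by the sum of all counts. */
double rf_range(long lo, long hi)
{
    return (double)range_count(lo, hi) / (double)range_count(LONG_MIN, LONG_MAX);
}

/* Average of the k lowest frequency counts in the histogram. */
double avg_lowest_counts(size_t k)
{
    long   counts[HISTO_N];
    long   sum = 0;
    size_t i, j;

    for (i = 0; i < HISTO_N; ++i)
        counts[i] = prod_id_histo[i].count;
    for (i = 0; i < k && i < HISTO_N; ++i) {   /* partial selection sort */
        size_t min = i;
        long   tmp;

        for (j = i + 1; j < HISTO_N; ++j)
            if (counts[j] < counts[min])
                min = j;
        tmp = counts[i]; counts[i] = counts[min]; counts[min] = tmp;
        sum += counts[i];
    }
    return i ? (double)sum / (double)i : 0.0;
}

/* Equality: the value's count, or the average of the 5 lowest counts
   when the value is not in the histogram, divided by the sampled rows. */
double rf_equal(long value)
{
    long count = range_count(value, value);

    if (count > 0)
        return (double)count / SAMPLED_ROWS;
    return avg_lowest_counts(5) / SAMPLED_ROWS;
}

/* in (...): sum of the equality restriction factors of the listed values. */
double rf_in(const long *values, size_t n)
{
    double rf = 0.0;
    size_t i;

    for (i = 0; i < n; ++i)
        rf += rf_equal(values[i]);
    return rf;
}

/* Cardinality estimate: restriction factor times the rows in the table. */
double cardinality_estimate(double rf, long table_rows)
{
    return rf * (double)table_rows;
}
```

For example, rf_range(21200, LONG_MAX) evaluates 87/249 and rf_equal(10450) falls back to the average of the five lowest counts, reproducing the values in Table 15-3.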
Table Access Methods
RDM Server provides a variety of methods for retrieving the rows in a table. Each of these access methods is described below,
including how cost is estimated for each method. The cost estimates use the above statistics as represented by the following
values.
Table 15-4. Cost Estimate Value Definitions
Value    Definition
P        The number of pages in the file in which the table's rows are stored.
D        The depth of the B-tree index.
C        The cardinality of the table being accessed (that is, the number of rows in
         the table).
Cf       The cardinality of the table containing the referenced foreign key.
Cp       The cardinality of the table containing the referenced primary key.
K        The maximum number of key values per index page.
R        The restriction factor, an estimate (between 0 and 1) of the percentage of
         the rows of the table that satisfy the conditional expression. The
         restriction factor is determined from the frequency distribution histogram
         and the constant values specified in the conditional expression.
Database I/O in RDM Server is performed by reading data and index file pages. A data file page contains at least one (usually
more) table row, so each physical disk read brings all of the rows on that page into the RDM Server cache. An index file page contains many keys, depending on the size of the page and the size of the index values. RDM Server uses a B-tree structure for its indexes, which guarantees that each index page is at least half full. On average, index pages are about 60-70%
full. The depth of a B-tree indicates the number of index pages that must be read to locate a particular key value. Most B-trees
have a depth of 4 to 7 levels.
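As a rough illustration of how depth relates to the other statistics, the C sketch below estimates the depth of a B-tree from the number of keys, the maximum keys per page (K), and an average fill fraction. This is a general textbook approximation offered as an assumption; the function name is hypothetical and RDM Server maintains the actual depth itself rather than computing it this way.

```c
/* Estimate B-tree depth as the smallest d for which fanout^d >= nkeys,
   where fanout = keys_per_page * fill (for example, fill = 0.65).
   Returns -1 when the effective fanout is too small for the model. */
int btree_depth_estimate(double nkeys, double keys_per_page, double fill)
{
    double fanout = keys_per_page * fill;
    double capacity = fanout;     /* keys reachable with a depth of 1 */
    int    depth = 1;

    if (fanout <= 1.0)
        return -1;
    while (capacity < nkeys) {
        capacity *= fanout;       /* each extra level multiplies the reach */
        ++depth;
    }
    return depth;
}
```

Under this model, an index holding one million keys with 100 keys per page at 65% fill has an estimated depth of 4, since 65^3 is less than one million and 65^4 is greater.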
Sequential File Scan
Each row of a table is stored as a record in a file. In RDM Server, a data file can contain the rows from one or more tables.
The most basic access method in RDM Server is to perform a sequential scan of a file where the table's rows are retrieved by
sequentially reading through the file. If the file contains rows from more than one table, only the rows from the needed table
are returned. However, all of the rows from all of the tables stored in the file will be read (the rows are intermixed). Thus, the
cost (measured in logical disk accesses) to perform a sequential scan of a table is equal to the number of pages in the file:
Cost of sequential file scan = P
A sequential file scan is used in queries where the where clause contains no optimizable conditional expressions that reference foreign key, primary key, or indexed columns. See the example below.
select sale_name, dob, region, office from salesperson
where age(dob) >= 40;
Direct Access Retrieval
Direct access retrieval allows retrieval of an individual row based on the value of a rowid primary key. The rowid primary
key value can be specified directly in the query or may result from a join with a table containing a referencing rowid foreign
key. The cost of a direct access retrieval is 1 (since a single file read is all that is needed to retrieve the row based on its
rowid value):
Cost of direct access retrieval = 1
Consider the following table declarations:
create table pktable(
pkid rowid primary key,
pktext char(50)
);
create table fktable(
fkid rowid references pktable,
fktext char(50)
);
The optimizer produces an execution plan that uses direct access retrieval to fetch a particular row from pktable for the following query:
select * from pktable where pkid = 10;
The execution plan for the query below consists of two steps. The first step is a sequential scan of fktable. In the second step,
fkid is used to directly access the related pktable row for each fktable row.
select pkid, pktext, fktext from pktable, fktable where pkid = fkid;
Indexed Access Retrieval
Equality Conditionals
Indexed access retrieval allows retrieval of an individual row or set of matching rows, based on the value of one or more
columns contained in a single index. These values can be specified in the query directly or through a join predicate.
For a unique index, the cost to access a single row is equal to the depth of the index's B-tree (seldom more than 4) + 1 (to
read the row from the data file). For a non-unique index, the cost is based on an estimate of the average number of rows having the same index value derived from the indexed column's histogram. The percentage of the table's rows that match the specified equality constraint is the restriction factor (R). Thus, the estimate of number of matching rows is equal to the cardinality
of the table multiplied by the restriction factor, or:
number of matching rows = C * R
The cost estimate (in logical page reads) of an indexed access retrieval is equal to the number of index pages that must be
accessed plus the number of matching rows (1 logical page read per row), or:
Cost of index access = D + (C * R)/(.7 * K) + (C * R)
This assumes that each index page is an average of 70% full (D = depth of B-tree, K = maximum number of keys per index
page). Note that this formula works for both unique and non-unique indexes (for unique indexes, R = 1/C).
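The formula can be written as a small Python function. This is a minimal sketch: the function and parameter names are hypothetical, and the 0.7 fill factor is the 70% average quoted above.

```python
def index_access_cost(depth, cardinality, restriction, keys_per_page, fill=0.7):
    """Cost = D + (C * R)/(.7 * K) + (C * R), in logical page reads."""
    matching_rows = cardinality * restriction               # C * R
    index_pages = matching_rows / (fill * keys_per_page)    # index pages holding the matches
    return depth + index_pages + matching_rows


# Unique index: R = 1/C, so exactly one matching row is expected.
C, D, K = 558, 2, 72
unique_cost = index_access_cost(D, C, 1 / C, K)
print(round(unique_cost, 2))
```

For a unique index the middle term is a small fraction of a page, so the result is close to D + 1, in line with the description of unique index access above.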
In the following example, the optimizer uses the order_key index on the sales_order table to retrieve the specified row.
select * from sales_order where ord_num = 2310;
In the example below, the optimizer selects indexed access retrieval to find the item rows through the item_ids index and the
related product rows through the prod_key index.
select prod_id, quantity, prod_desc from item, product
where item.prod_id = 17214 and product.prod_id = 17214
and item.prod_id = product.prod_id;
Notice that the where clause contains a redundant expression. Including redundant expressions provides the optimizer with
more access choices. You can set an RDM Server configuration parameter called RedundantExprs to have the optimizer automatically add redundant expressions where appropriate, such as in the above query.
IN Conditionals
When the in operator is used, the restriction factor is equal to the sum of the equality restriction factors for each of the listed
values. Thus, the cost is simply the sum of the costs of the individual values.
Cost of index access for: column in (v1, v2, ..., vn) = SUM(cost(column = vi)) for all i: 1..n
The optimizer will use the order_key index on the sales_order table to retrieve each of the rows specified in:
select * from sales_order where ord_num in (2210, 2215, 2242);
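A sketch of the summation in Python follows. The statistics used for sales_order (cardinality, depth, keys per page) are hypothetical placeholders, and ord_num is assumed to be unique, so that R = 1/C for each listed value.

```python
def index_access_cost(depth, cardinality, restriction, keys_per_page, fill=0.7):
    matching_rows = cardinality * restriction
    return depth + matching_rows / (fill * keys_per_page) + matching_rows


def in_cost(depth, cardinality, restrictions, keys_per_page):
    """Cost of `column in (v1, ..., vn)`: the sum of the per-value equality costs."""
    return sum(index_access_cost(depth, cardinality, r, keys_per_page)
               for r in restrictions)


# Hypothetical statistics for sales_order; ord_num assumed unique (R = 1/C).
C, D, K = 1000, 2, 72
per_value = [1 / C] * 3          # ord_num in (2210, 2215, 2242)
total = in_cost(D, C, per_value, K)
print(round(total, 2))
```

Note that the depth D is counted once per listed value, because each value is located by a separate descent through the index.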
Index Scan
Inequality Conditionals
Index scans use an index to access the rows satisfying an inequality relational expression involving the major column in
the index. The cost of an index scan is estimated in exactly the same way as for the indexed access method. The restriction factor is calculated as the percentage of the column's histogram values that match the specified conditional inequality.
Consider the following query:
select * from ship_log where ord_num between 2250 and 2270;
The ship_log table contains 558 rows. The optimizer computed a restriction factor of .166667, which estimates that when the
between condition is applied, 93 rows (.166667*558) will pass. The cost to perform a sequential scan involves reading all
558 rows (145 pages) and is greater than the cost to use the index (D=2, C=558, R=.166667, K=72 => cost = 95). Thus, in
this example, the optimizer will choose to use the ship_order_key index.
Cost of index scan = D + (C * R)/(.7 * K) + (C * R)
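The comparison in the ship_log example can be sketched in Python using the statistics quoted above (P = 145, D = 2, C = 558, R = .166667, K = 72). The function name is hypothetical. Evaluated literally, the formula gives a value near 97 rather than the 95 quoted in the text, presumably because of rounding or slightly different inputs. Either figure is well below the 145 page reads of the sequential scan.

```python
def index_scan_cost(depth, cardinality, restriction, keys_per_page, fill=0.7):
    """Cost = D + (C * R)/(.7 * K) + (C * R), in logical page reads."""
    matching_rows = cardinality * restriction
    return depth + matching_rows / (fill * keys_per_page) + matching_rows


sequential_cost = 145                               # P: pages in the file
index_cost = index_scan_cost(2, 558, 0.166667, 72)  # ship_order_key index
use_index = index_cost < sequential_cost
print(round(index_cost, 1), use_index)
```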
LIKE Conditionals
Index scans are also used to access rows satisfying like expressions that compare the major column of an index with a literal
string pattern. The restriction factor is calculated from the histogram values that match the specified pattern. If no matches are
found, it is then calculated from the average of the five lowest frequency counts in the histogram.
Two types of scans are employed depending on the position of the first wild card character (for example, "%" or "_") in the
pattern. If the pattern starts with a wild card character, the entire index will be scanned and each key value will be compared
with the specified pattern. Only those keys that match will be returned. The cost of this scan is equal to the cost of reading
each index page plus the cost of reading the row associated with each matching index value as given in the following formula:
Cost of full index like scan = D + (C/(.7 * K)) + (R * C)
The cost typically will be much less when the pattern does not begin with a wild card. This allows the SQL system to position within the index at the values having the same prefix (consisting of all characters up to the first wild card) and to scan only those entries.
Cost of prefixed like index scan = D + (C * R)/(.7 * K) + (C * R)
Note that this is identical to the cost of an equality indexed access (although the restriction factor will be greater in this case).
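The difference between the two like scans can be seen by evaluating both formulas with the same inputs. In this sketch the statistics are hypothetical (the restriction factor of 0.05 in particular is made up), and the function names are illustrative only.

```python
def full_like_scan_cost(depth, cardinality, restriction, keys_per_page, fill=0.7):
    """Pattern starts with a wild card: every index page is read."""
    return depth + cardinality / (fill * keys_per_page) + restriction * cardinality


def prefixed_like_scan_cost(depth, cardinality, restriction, keys_per_page, fill=0.7):
    """Pattern has a literal prefix: only the matching index range is read."""
    matching_rows = cardinality * restriction
    return depth + matching_rows / (fill * keys_per_page) + matching_rows


# Hypothetical statistics: 558 rows, depth 2, 72 keys per page, R = 0.05.
D, C, R, K = 2, 558, 0.05, 72
full = full_like_scan_cost(D, C, R, K)
prefixed = prefixed_like_scan_cost(D, C, R, K)
print(round(full, 2), round(prefixed, 2))
```

The two costs differ only in the index page term: C/(.7 * K) pages for the full scan versus (C * R)/(.7 * K) for the prefixed scan. The saving therefore grows with the size of the index and shrinks as R approaches 1.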
Primary To Foreign Key Join
The use of create join on a foreign key in the DDL establishes a predefined join set relationship between the referenced
tables. Related rows in the two tables are connected using direct access pointers stored and maintained in each row's physical
record storage. All rows from a foreign key table are linked (in a linked list) to the row from the primary key table to which
they refer. Thus, the optimizer can generate execution plans that directly access the related foreign key table rows after having
accessed the primary key row. This access method is only considered by the optimizer when the join predicate (which equates
each foreign key column to its corresponding primary key column) is included in the where clause. The cost of a primary to
foreign key join is equal to the average number of foreign key rows for each primary key row:
Total cost = Cardinality of primary to foreign key join = Cf / Cp
Foreign To Primary Key Join
The foreign to primary key join is also made available through use of the create join. This method allows the optimizer to
generate execution plans that directly access the primary key row referenced from a previously accessed foreign key row.
Again, this access method is only considered by the optimizer when the join predicate that equates each foreign key column
to its corresponding primary key column is included in the where clause.
The cost of a foreign to primary key join = 1
(each foreign key row references a single primary key row)
Note that these latter two access methods are available through the presence of a join predicate in the where clause as in the
following example.
select sale_name, company, city, state from salesperson, customer
where salesperson.sale_id = customer.sale_id;
The optimizer can choose to either access the salesperson table first and then the related rows using the primary to foreign key
join access method based on the accounts join, or it can first access the customer table and then the related salesperson row
using the foreign to primary key join method. The method chosen will depend on the costs involved with first accessing one
or the other of the two tables.
Foreign Thru Indexed/Rowid Primary Key Predefined Join
This method is used to access a foreign keyed table in which the foreign key is used in an equality comparison in the where
clause and for which the primary key table is not referenced. When the foreign key has an associated create join the optimizer can generate a plan that allows access to the matching foreign key rows through the primary key's index (or rowid). Look
at the query below:
select company, city, state from customer where sale_id = "ERW";
The optimizer locates the matching salesperson row using the index on sale_id in the salesperson table, then retrieves the
related customer rows through the accounts predefined join. This is equivalent to the following query:
select company, city, state from customer, salesperson
where salesperson.sale_id = "ERW" and
salesperson.sale_id = customer.sale_id;
By providing access to joined foreign key tables implicitly through the referenced primary key, faster access is achieved in
update statements where a join is not possible. See the following example.
update customer set contact = null where sale_id = "ERW";
The cost of accessing a foreign table through the primary key is equal to the cost of accessing the primary row added to the
cost of accessing the related foreign key rows.
Cost of accessing primary table through index = D + 1 (only one row is located).
Cost of accessing related foreign rows = Cost of Primary to foreign key join (see above).
If the primary key is of type rowid, the cost to access the primary row is 1.
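Putting the two parts together, the cost of this access method can be sketched as follows. The function name and the cardinalities in the example are hypothetical, and the rowid case simply replaces the D + 1 index lookup with a single read.

```python
def foreign_thru_primary_cost(cf, cp, depth=None):
    """Cost of reaching foreign key rows through the primary key.

    cf, cp: cardinalities of the foreign and primary key tables.
    depth:  B-tree depth D of the primary key index, or None when the
            primary key is of type rowid (direct access, cost 1).
    """
    primary_cost = 1 if depth is None else depth + 1
    join_cost = cf / cp            # primary to foreign key join
    return primary_cost + join_cost


# Hypothetical cardinalities: 28 customer rows, 14 salesperson rows.
print(foreign_thru_primary_cost(28, 14, depth=2))   # indexed primary key
print(foreign_thru_primary_cost(28, 14))            # rowid primary key
```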
A summary of the table access methods used by the RDM Server optimizer is shown in Table 15-5.
Table 15-5. Table Access Methods Cost Formulas
Access Method                        Cost Estimate (logical I/Os)
Sequential File Scan                 P
Direct Access                        1
Indexed Access (equality)            D + ((C * R)/(.7 * K)) + (C * R)
Indexed Access (in)                  SUM(Indexed Access Cost(column = vi)) for all i: 1..n
Index Scan (inequality)              D + ((C * R)/(.7 * K)) + (C * R)
Index Scan (like/no prefix)          D + (C/(.7 * K)) + (R * C)
Index Scan (like/with prefix)        D + ((C * R)/(.7 * K)) + (C * R)
Primary to foreign key join          Cf / Cp
Foreign to primary key join          1
Foreign thru indexed primary key     D + 1 + (Cf / Cp)
Foreign thru rowid primary key       1 + (Cf / Cp)
Optimizable Expressions
The RDM Server query optimizer is able to optimize a restricted set of relational expressions that are specified in the where
clause of a select statement. Simple expressions involving a comparison between a simple column and a literal constant value
(or parameter marker or stored procedure argument) can be analyzed by the optimizer to determine if any access methods exist
that can retrieve rows satisfying that particular conditional. Expressions for potential use by the optimizer in an execution
plan are referred to as optimizable. Table 15-6 summarizes the optimizable relational expressions.
Table 15-6. Optimizable Relational Expressions
1. RowidPkCol = constant
2. NdxCol1 = constant [and NdxCol2 = constant]...
3. FkCol1 = constant [and FkCol2 = constant]...
4. FkCol1 = PkCol1 [and FkCol2 = PkCol2]...
5. NdxCol1 = Cola [and NdxCol2 = Colb]...
6. NdxCol1 in (constant[, constant]...)
7. NdxCol1 {> | >= | < | <=} constant
8. NdxCol1 {> | >=} constant [and NdxCol1 {< | <=} constant]
9. NdxCol1 between constant and constant
10. NdxCol1 like "pattern"
The constant is either a literal, a parameter marker ('?'), or a stored procedure argument (if statement is contained in a stored
procedure declaration). The RowidPkCol expression corresponds to a rowid primary key column. The NdxColi's refer to the
i'th declared column in a given index. The FkColi's (PkColi's) refer to the i'th declared column in a foreign (primary) key. An
equality comparison must be provided for all multi-column foreign and primary key columns in order for the optimizer to
recognize a join predicate. Cola, Colb, etc., are columns from the same table that match (in type and length) NdxCol1,
NdxCol2, etc., respectively.
These expressions are all written in the following form: ColumnName relop expression. Note that expressions of the form:
expression relop ColumnName are recognized and reformed by the optimizer so that the ColumnName is always listed on the
left hand side. This transformation may require modification of the relational operator. For example,
select ... from ... where 1000 > colname
would become
select ... from ... where colname < 1000
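The operator change that this transformation requires can be sketched in a few lines of Python. This only illustrates the idea; the function, the operator table, and the use of str.isidentifier as a stand-in for "is a column name" are all assumptions, not a description of the optimizer's internals.

```python
# Operator to use when the two operands of a comparison are swapped.
FLIPPED = {"=": "=", "<>": "<>", "<": ">", "<=": ">=", ">": "<", ">=": "<="}


def normalize(left, relop, right, is_column=str.isidentifier):
    """Rewrite `expression relop ColumnName` as `ColumnName relop' expression`."""
    if is_column(left):
        return (left, relop, right)            # already in the canonical form
    if is_column(right):
        return (right, FLIPPED[relop], left)   # swap operands, flip operator
    return (left, relop, right)                # no column involved


print(normalize("1000", ">", "colname"))
print(normalize("10", "=", "quantity"))
```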
Depending on how the where clause is organized, an expression may or may not be optimizable. Conditional expressions
composed in conjunctive normal form are optimizable. In conjunctive normal form, the where clause is constructed as follows:
C1 and C2 and ... Cn
Each Ci is a conditional expression consisting of a single relational comparison or of several or'ed relational comparisons. Only those Ci's that consist
of a single optimizable relational expression are optimizable. In other words, relational expressions that are sub-branches of an
or'ed conditional expression are not optimizable. The best possible optimization results are obtained when the desired conditions use and. Some or expressions can be rewritten in a form the optimizer can process. For example, because of the or
expression in the following query, the optimizer will not use an index on the state column in the customer table.
select ... from customer
where state = "CA" or state = "WA" or state = "AZ" or state = "OR";
However, for the equivalent query shown below, the optimizer would use the index on state.
select ... from customer where state in ("CA", "WA", "AZ", "OR");
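An application that builds its where clauses dynamically could apply this rewrite itself before submitting the query. The following Python sketch shows one way to do so. The tuple representation of a comparison and the function name are hypothetical.

```python
def or_to_in(comparisons):
    """Collapse or'ed equality tests on one column into an in conditional.

    comparisons: list of (column, operator, literal) tuples that are
    joined by "or". Returns the rewritten text, or None when the
    rewrite does not apply.
    """
    columns = {column for column, _, _ in comparisons}
    if len(columns) != 1 or any(op != "=" for _, op, _ in comparisons):
        return None
    column = columns.pop()
    values = ", ".join(literal for _, _, literal in comparisons)
    return f"{column} in ({values})"


terms = [("state", "=", '"CA"'), ("state", "=", '"WA"'),
         ("state", "=", '"AZ"'), ("state", "=", '"OR"')]
print(or_to_in(terms))
```

The rewrite applies only when every or'ed term is an equality test on the same column. A mixed condition such as the ord_num/ord_date example later in this chapter is left unchanged.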
Examples
These examples are all based on the example sales and invntory databases. Refer to the sales.sql and invntory.sql DDL files
for relevant declarations of the entities referenced below.
The following select statement will locate the salesperson record for a particular salesperson ID code using the sale_key
index.
select * from salesperson where sale_id = "GAP";
The optimizer will use the accounts predefined join to optimize the join predicate in the query below.
select * from salesperson, customer
where salesperson.sale_id = customer.sale_id;
In the next example, those customers serviced by a specific salesperson would be accessed through the accounts predefined
join after locating the specified salesperson's row through the sale_key index.
select * from salesperson, customer
where salesperson.sale_id = customer.sale_id
and sale_id = "GAP";
Note that the optimizer would not use the comments join in the following example.
select note_id, note_date, txtln from note, note_line
where note.note_id = note_line.note_id;
The comments join cannot be used because only one of the three foreign and primary key columns from the join is specified
in the where clause. The note_id column is the second column in the note_key index, so the note_key index cannot be
used either. The optimizer therefore has no good choices for resolving this query: it will be processed by forming the
cross-product of the two tables and returning those rows that have matching note_id values. This result is not what the
user wants.
Note that the query below will produce (efficiently) the result set the user wants.
select sale_id, note_id, note_date, txtln from note, note_line
where note.sale_id = note_line.sale_id
and note.note_id = note_line.note_id
and note.note_date = note_line.note_date;
In all of the following queries, the order_ndx index on sales_order is selected by the optimizer to access the rows that satisfy the specified condition.
select * from sales_order where ord_date = date '1993-02-12';
select * from sales_order where ord_date > date '1993-03-31';
select * from sales_order
where ord_date >= date '1993-04-01' and ord_date < date '1993-05-01';
select * from sales_order
where ord_date in (@'1993-01-02',@'1993-02-03',@'1993-03-04');
select * from sales_order
where ord_date between date '1993-06-01' and date '1993-06-30';
In the following query, the optimizer cannot use either of the relational expressions specified.
select cust_id, ord_num, ord_date, amount from sales_order
where ord_num = 2293 or ord_date = date '1993-06-18';
The or expression prohibits the use of either index, because using the index on ord_num would not retrieve those rows that
have the specified ord_date, and vice-versa. In this case, the optimizer would select an access method that retrieves all rows
in the table (using a file scan or a complete index scan). This query is best performed using a separate query for each part. In
general, when the table is large, a temporary table can be used as shown below.
create temporary table torders(
cid char(4),
onum smallint,
odate date,
oamt double
);
insert into torders
select cust_id, ord_num, ord_date, amount from sales_order
where ord_num = 2293;
insert into torders
select cust_id, ord_num, ord_date, amount from sales_order
where ord_date = date '1993-06-18';
select distinct * from torders;
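The effect of the final select distinct can be checked with a small, self-contained experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in, not RDM Server, and the three sample rows are invented. It only demonstrates that a row matching both conditions is inserted twice and then returned once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sales_order(
        cust_id char(4), ord_num smallint, ord_date date, amount double);
    -- Invented sample rows.
    insert into sales_order values ('AAA', 2293, '1993-06-18', 100.0); -- matches both
    insert into sales_order values ('BBB', 2294, '1993-06-18', 250.0); -- matches date only
    insert into sales_order values ('CCC', 2295, '1993-06-19', 75.0);  -- matches neither

    create temporary table torders(
        cid char(4), onum smallint, odate date, oamt double);
    insert into torders
        select cust_id, ord_num, ord_date, amount from sales_order
        where ord_num = 2293;
    insert into torders
        select cust_id, ord_num, ord_date, amount from sales_order
        where ord_date = '1993-06-18';
""")

inserted = conn.execute("select count(*) from torders").fetchone()[0]
rows = conn.execute("select distinct * from torders order by onum").fetchall()
print(inserted, rows)
```

Three rows are inserted into torders but only two distinct rows are returned, which is the same result the original or query would produce.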
Pattern matching using the SQL like operator can be optimized by using an index on the character column, provided the
column is the first (or only) declared column in an index, and the pattern is a string literal in which the first character is not a
wild-card character. For example, the index on cust_id is used in the following query to quickly select only those customer
rows whose cust_id begins with the letter "S".
select * from customer where cust_id like "S%";
If the query is written differently, as shown below, all of the cust_id values in the index will be checked to find customer
rows with a cust_id ending with the letter "S".
select * from customer where cust_id like "%S";
The conditional is tested using the value from the index before the row is read so that if it does not match, there is no cost of
reading the row from the data file.
How the Optimizer Determines the Access Plan
Selecting Among Alternative Access Methods
Consider the following query.
select * from ship_log
where ord_num = 2284 and prod_id = 23400;
The optimizer can choose to use either the index on ord_num or the index on prod_id to process this query. It will select the
index that it determines will execute the query in the fewest disk accesses. It estimates the required disk accesses using the
data usage statistics accumulated and stored in the system catalog by the update statistics statement. The following table
shows the relevant statistics and calculations used by the optimizer for each of the two relational expressions in the above
query.
Table 15-7. Optimizer Statistics Example #1
optimizer statistic/calculation              ord_num = 2284   prod_id = 23400
number of rows in table (C)                  558              558
depth of B-tree (D)                          2                2
number of keys per page (K)                  72               72
restriction factor (R)                       0.003584         0.046595
estimate of number of result rows (C * R)    2                26
cost estimate                                4                28
The estimate of the number of rows that match each of the expressions is based on the operation (in this case "=") and the
count of histogram matches. If the histogram count is zero, the restriction factor for an equality condition is
equal to the average frequency count of the five lowest histogram entries divided by the cardinality of the table. In this
example, for the "ord_num = 2284" condition, 2284 is not in the histogram. The average of the 5 lowest frequency counts
was 2. Thus, the restriction factor is 2/558 or 0.003584. For the "prod_id = 23400" condition, value 23400 is in the histogram
with a frequency count of 26. The restriction factor is, therefore, 26/558 or 0.046595. The optimizer will choose the ord_num
index because of its lower cost estimate (use the formula from Table 15-5 under "Indexed Access (equality)" to calculate the
costs).
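The figures in Table 15-7 can be reproduced with the equality formula from Table 15-5. In the sketch below the function name is hypothetical, and truncating the result to a whole number of page reads is an assumption that happens to match the table.

```python
def index_access_cost(depth, cardinality, restriction, keys_per_page, fill=0.7):
    """Cost = D + (C * R)/(.7 * K) + (C * R), in logical page reads."""
    matching_rows = cardinality * restriction
    return depth + matching_rows / (fill * keys_per_page) + matching_rows


C, D, K = 558, 2, 72
restriction = {
    "ord_num": 2 / C,    # 2284 not in histogram: average of 5 lowest counts is 2
    "prod_id": 26 / C,   # 23400 found in histogram with a frequency count of 26
}
# Truncate to whole page reads (assumed rounding rule).
costs = {col: int(index_access_cost(D, C, r, K)) for col, r in restriction.items()}
chosen = min(costs, key=costs.get)
print(costs, chosen)
```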
Selecting the Access Order
When a query references more than one table, the optimization process becomes more complex, because the optimizer must
choose between different methods to access each table, and the order in which to access them. Many access methods rely only
on the values specified in the conditional expression for the needed data. However, some access methods (those associated
with join predicates) require that other tables have already been accessed. This places constraints on the possible orderings.
Access methods available at the first step in the plan are those that do not depend on any other tables.
For possible access methods at the first plan step, the optimizer chooses the method with the lowest cost from a list of possible methods sorted by cost. The accessed table is then marked as bound. The access methods available at the next step in
the plan include the choices from the first step for the other tables, plus those methods that depend on the table bound by the
first step. These too are ordered by cost. The optimizer continues in this manner until methods have been chosen for all steps
in the plan. It then selects the method with the next highest cost and recursively evaluates a new plan. At any point in the
process, if the plan being evaluated exceeds the total cost of the current best complete plan, that plan is abandoned and
another is chosen. The entire optimizer algorithm is depicted in Figure 15-2 below.
Figure 15-2. Optimizer Algorithm
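The search described above is a form of branch and bound. The Python sketch below illustrates the general idea only: the Method record, the way step costs are combined (cost per invocation multiplied by the rows produced by earlier steps), and all numbers are assumptions made for the example, not a description of the actual RDM Server implementation.

```python
from typing import NamedTuple


class Method(NamedTuple):
    name: str
    table: str
    cost: float                       # estimated I/Os per invocation
    rows: float                       # estimated rows produced per invocation
    needs: frozenset = frozenset()    # tables that must already be bound


def best_plan(tables, methods):
    best = {"cost": float("inf"), "steps": None}

    def search(bound, steps, cost_so_far, rows_so_far):
        if cost_so_far >= best["cost"]:
            return                    # abandon: already no better than best plan
        if len(bound) == len(tables):
            best["cost"], best["steps"] = cost_so_far, list(steps)
            return
        usable = [m for m in methods
                  if m.table not in bound and m.needs <= bound]
        for m in sorted(usable, key=lambda m: m.cost):   # cheapest first
            steps.append(m.name)
            search(bound | {m.table}, steps,
                   cost_so_far + rows_so_far * m.cost, rows_so_far * m.rows)
            steps.pop()

    search(frozenset(), [], 0.0, 1.0)
    return best["cost"], best["steps"]


# Hypothetical statistics for the salesperson/customer join.
tables = ["salesperson", "customer"]
methods = [
    Method("scan salesperson", "salesperson", cost=5, rows=14),
    Method("scan customer", "customer", cost=10, rows=28),
    Method("accounts: primary to foreign", "customer", cost=2, rows=2,
           needs=frozenset({"salesperson"})),
    Method("accounts: foreign to primary", "salesperson", cost=1, rows=1,
           needs=frozenset({"customer"})),
]
cost, steps = best_plan(tables, methods)
print(cost, steps)
```

With these numbers the first complete plan found (scan salesperson, then follow the accounts join) costs 33, and every alternative is abandoned as soon as its partial cost reaches that bound.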
Sorting and Grouping
For select statements that include a group by or order by specification, the SQL optimizer performs two separate optimization
passes. The first pass restricts the choice of usable access methods to only those that produce or maintain the specified ordering. For example, an index scan retrieves its results in the order specified in the create index declaration. If that order matches
the specified ordering, the index scan is included as a usable access method. This optimization pass is fast because, typically, very few
plans produce the desired ordering without performing an external sort of the result set. Note that ordering clauses can be satisfied through the use of indexes and predefined sorted joins (that is, create join with order sorted).
If a plan is produced by the first pass, it is saved (along with its cost estimate), and a second optimization is performed
without the ordering restriction. An estimate of the cost required to sort the result set, based on the optimizer's estimate of the
result set's size, is added to the cost of the plan produced by the unrestricted pass. The optimizer will then choose the plan
with the lowest cost.
The estimate of the sort cost is based on the optimizer's cardinality estimate, the length of the sort key, and the sort index
page size. The optimizer will calculate the number of I/Os as two times the number of index pages to store the sort index (one
pass to create the page and another to read each page in order) and add the number of result rows.
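The sort cost rule can be sketched as follows. The rule itself (two I/Os per sort index page plus one per result row) is taken from the paragraph above. The way the number of sort index pages is derived from the key length and page size is a simplifying assumption that ignores any per-entry or per-page overhead, so the results will not necessarily match the optimizer's own figures.

```python
import math


def sort_cost(result_rows, sort_key_len, page_size):
    """Estimated I/Os to sort a result set.

    Assumption: a sort index page holds page_size // sort_key_len keys.
    """
    keys_per_page = max(1, page_size // sort_key_len)
    pages = math.ceil(result_rows / keys_per_page)
    return 2 * pages + result_rows    # write and read each page, plus the rows


# Hypothetical inputs: 1000 result rows, 16-byte sort key, 1024-byte pages.
print(sort_cost(1000, 16, 1024))
```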
Note that if both the group by and order by clauses are specified, only the group by ordering can be satisfied by existing
indexes and joins. A separate sort of the result set will always be required for the order by clause. If there is no index to satisfy the specified group by, then two sort passes will be needed. For example, consider the following query on the ship_log
table.
select * from ship_log where ord_num = 2269 order by prod_id;
The table below shows optimizer information on the two optimizer passes for the above query. Pass 1 requires use of the
ship_prod_key index because it is the only method available that returns rows in the specified order. Pass 2 is free to choose
any access method. The cost difference between these choices is large and the optimizer is correct to choose the plan produced by pass 2, even though it will have to perform a sort on the result rows.
Table 15-8. Optimizer Statistics Example #2
Optimizer Statistic/Calculation              Pass 1                               Pass 2
Number of rows in table (C)                  558                                  558
Depth of B-tree (D)                          2                                    2
Number of keys per page (K)                  72                                   72
Restriction factor (R)                       1.0 (prod_id is not used in where)   0.017921
Estimate of number of result rows (C * R)    558                                  10
Cost estimate                                674                                  38 (includes sort cost)
Unfortunately, the sort cost estimate can be inaccurate, because it is based on a cardinality estimate derived from database-wide data distribution statistics that will not hold for some individual cases. RDM Server provides a configuration parameter,
SortLimit, that can influence the sort decision. For cardinality estimates greater than the specified SortLimit number, the optimizer will always choose to use the restricted ordering plan rather than incur the cost of the sort. If SortLimit is zero or the cardinality estimate is less than SortLimit, the optimizer's choice is based on its computed cost estimates. Unless you observe
many instances where sorts are being performed when they should not be (or vice versa), it would be best to leave SortLimit
set to zero.
A user can also force the optimizer to always select the restricted ordering plan by specifying nosort at the end of the order
by or group by clause. Thus, if a restricted order plan exists and nosort is specified, that plan will be executed. If not, an
external sort of the result set will still be performed.
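The interaction between the two optimization passes, SortLimit, and nosort can be summarized as a small decision function. This is a sketch of the rules as described in this section. The function and parameter names are hypothetical, and the handling of an exact tie (a cardinality equal to SortLimit, or equal costs) is an assumption, since the text does not cover it.

```python
def choose_plan(ordered_cost, sorted_cost, cardinality, sort_limit=0, nosort=False):
    """Pick between the restricted-ordering plan and the plan that sorts.

    ordered_cost: cost of the pass 1 plan, or None if no such plan exists.
    sorted_cost:  cost of the pass 2 plan including the estimated sort cost.
    cardinality:  the optimizer's estimate of the result set size.
    """
    if ordered_cost is None:
        return "sort"                 # no ordered plan: an external sort is needed
    if nosort:
        return "ordered"              # user forces the restricted ordering plan
    if sort_limit > 0 and cardinality > sort_limit:
        return "ordered"              # too many rows to sort, per SortLimit
    return "ordered" if ordered_cost <= sorted_cost else "sort"


# Figures from Table 15-8: pass 1 costs 674, pass 2 costs 38 including the sort.
print(choose_plan(674, 38, cardinality=10))                  # cost-based choice
print(choose_plan(674, 38, cardinality=10, sort_limit=5))    # SortLimit exceeded
print(choose_plan(674, 38, cardinality=10, nosort=True))     # nosort specified
```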
The optimizer will only consider orderings involving the actual columns where the sort clauses are declared in the create
index or create join statements. The optimizer does not deduce additional ordering from the presence of join predicates in the
where clause. For example, consider the following schema fragment.
create table A(
a_pk integer primary key,
...
);
create table B(
b_pk integer primary key,
b_fk integer references A,
...
);
create join A_to_B order last on B(b_fk);
create table C(
c_fk integer references B,
c_date date,
...
);
create join B_to_C order sorted on C(c_fk) by c_date;
The optimizer will recognize the join ordering to resolve the following query without performing an external sort:
select * from A,B,C where a_pk = b_fk and b_pk = c_fk
order by b_pk, c_date;
However, in the next query, the optimizer would perform an external sort even though it is possible to deduce from the join
predicates that the sort is unnecessary.
select * from A,B,C where a_pk = b_fk and b_pk = c_fk
order by a_pk, b_pk, c_date;
The optimizer looks ahead of the sort field in the order by clause for use of the primary key column from the referenced table
of a sorted join; it thus recognizes the order produced by the join access rule. To ensure that the predefined join preserves the
sort order imposed by the columns preceding b_pk in the order by clause, the optimizer must know that those columns are
unique. Thus, we derive the following two guidelines:
1. To use sorted joins, always include the referenced table's primary key column(s) prior to the sort columns in the order
by clause.
2. Do not assume that the optimizer is smarter than you.
Outer Join Processing
The optimizer processes outer joins by forcing all outer joins into left outer joins (right outer joins are converted into left
outer joins by simply reversing the order). It then will disable all access paths that require the right hand table to be accessed
before the left hand table. If there is no access path (that is, through an index or predefined join) from the left hand table to
the right hand table, the optimizer will simply perform an inner join (rather than doing a very expensive cross-product).
Returning the Number of Rows in a Table
The row counts for each table in a database are maintained by the RDM Server engine. SQL recognizes queries of the following form:
select count(*) from tablename
For such queries, SQL generates a special execution plan that returns the current row count value for the specified table. No table or index
scan is needed. However, if the query is specified as shown below, the optimizer performs a scan of the table
or index (if columnname is indexed) and counts the rows.
select count(columnname) from tablename
Thus, if you need the row count of the entire table, use the first form and not the second. However, note that the row count
returned from the first form includes uncommitted rows that have been inserted by another user. The second form counts only
the currently committed rows.
Select * From Table
ANSI standard SQL states that when an order by clause is not specified, the ordering of the result rows from a table is implementation dependent. Some notable ODBC-based front-end application development and report writer tools assume that a
"select * from table" returns the rows in primary key order. To work effectively with these products, RDM Server SQL will
return the rows in primary key order (or in the order defined by the first unique index on the table if there is no primary key).
Query Construction Guidelines
Some systems perform a great deal of work to convert poorly written queries into well written queries before submitting the
query to the optimizer. This is particularly useful in systems where ad hoc querying (such as in decision-support environments) is performed by non-technical people. SQL is less user friendly, so often this work is performed by front-end tools.
RDM Server does not perform complex query transformation analysis (it will do simple things such as converting expressions
like "10 = quantity" into "quantity = 10"). Therefore, a thorough understanding of the information provided here will assist
you in formulating queries that can be optimized efficiently by RDM Server SQL. Guidelines for writing efficient RDM
Server SQL queries are listed below.
Formulate where clauses in conjunctive normal form. Avoid using or.
Formulate conditional expressions according to the forms listed in Table 15-6. Use literal constants as often as
possible. The compile-time for most queries is insignificant compared to their execution time. Thus, dynamically constructing and compiling queries containing literal constants (as opposed to parameter markers or
stored procedures) will allow the optimizer to make more intelligent access choices based on the histogram statistics.
Include more (not fewer) conditional expressions in the where clause, and include redundant expressions. For
example, foreign and primary keys exist between tables A and B, B and C, and A and C. Even though it is not
strictly necessary (mathematically) to include a join predicate between A and C, doing so provides the optimizer with additional access path choices. Also, assuming that join predicates exist and a simple conditional is
specified for the primary key, you can include the same conditional on the foreign key as well. Look at the following query:
select ... from A,B where A.pkey = B.fkey and A.pkey = 1000
You can improve this query by adding the conditional shown in an equivalent version below.
select ... from A,B where A.pkey = B.fkey and A.pkey = 1000 and B.fkey = 1000
Make certain join predicates exist for all pairs of referenced tables that are related through foreign and primary
keys.
Avoid sorting queries with large result sets in which no index is available to produce the desired ordering. If
you have heavy report writing requirements, consider using the replication feature to maintain a redundant,
read-only copy of the database on a separate server and run your reports from that. This will allow the online
server to provide the best response to update requests without blocking or being blocked by a high level of
query activity.
In defining your DDL, use create join where you would otherwise (that is, using other SQL systems), for performance reasons, create an index on a foreign key.
Do not include conditional expressions in the having clause that belong in the where clause. Conditional
expressions contained in the having clause should always include an aggregate function reference. Note that
expressions in the having clause are not taken into consideration by the optimizer.
Execute update statistics on your database(s) whenever changes have occurred which could have a significant
effect on the distribution characteristics of the data. When in doubt, run update stats.
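The literal-constant and redundant-conditional guidelines above can be combined when a query is constructed dynamically. The following is a minimal sketch in C, assuming the tables A and B and the columns pkey and fkey from the example above. The helper name build_join_query is hypothetical; it only formats the statement text and calls no RDM Server function.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of RDM Server): formats a two-table join
 * with the key value as a literal constant and repeats the conditional on
 * the foreign key, so the optimizer has more access paths to consider.
 * Returns the length of the statement text, or -1 if buf is too small. */
int build_join_query(char *buf, size_t size, long key)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size,
                 "select * from A,B where A.pkey = B.fkey"
                 " and A.pkey = %ld and B.fkey = %ld",
                 key, key);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The resulting text would then be compiled like any other dynamically built statement, so that the optimizer sees the literal value rather than a parameter marker.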
User Control Over Optimizer Behavior
User-Specified Expression Restriction Factor
The restriction factor is the fraction of a table between 0 and 1 that is returned as a result of the application of a specific
where condition. The lower the value, the greater the likelihood that the access method associated with that condition will be
chosen by the optimizer. This factor is computed by the RDM Server optimizer from the data distribution statistics. Note that
you can override the optimizer's estimate by using a non-standard RDM Server SQL feature. A relational expression, relexpr,
can be written as "(relexpr, factor)", where factor is a decimal fraction between 0 and 1 indicating the fraction of the table's rows
that satisfy relexpr. In the example below, where the optimizer would normally access the data using the index on ord_num,
the user-specified restriction factors cause the optimizer to use the index on ord_date instead.
select * from sales_order where (ord_date > date '1996-05-20',0.00001)
and (ord_num = 2210, 1.0);
When statistics used by the optimizer are not accurate enough for a given query and the result is unsatisfactory, you can use
this feature to override the stats-based restriction factor and substitute your own value. However, a query that uses this feature
no longer responds to future changes in the data distribution statistics.
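If restriction factors are attached by application code, it helps to validate them before the statement text is built. The following is a minimal sketch in C; the helper name format_restricted_expr is hypothetical, and the fixed six-decimal formatting of the factor is an assumption rather than a documented requirement.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of RDM Server): wraps a relational
 * expression in the "(relexpr, factor)" form described above.
 * Factors outside the range 0..1 are rejected.
 * Returns the length of the formatted text, or -1 on error. */
int format_restricted_expr(char *buf, size_t size,
                           const char *relexpr, double factor)
{
    int n;

    assert(buf != NULL && relexpr != NULL);
    if (factor < 0.0 || factor > 1.0)
        return -1;              /* not a valid fraction of the table */
    n = snprintf(buf, size, "(%s, %.6f)", relexpr, factor);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For example, passing "ord_num = 2210" with a factor of 1.0 formats the text (ord_num = 2210, 1.000000).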
User-Specified Index
If a column referenced in an optimizable conditional expression is used in more than one index, the optimizer will generate
an access rule for each index and select the index that it sees as the best choice. If the optimizer makes a poor choice, you can
force its choice by specifying the index name in the column reference using colname@index_name syntax. This is illustrated
in the following example from the diskdir database.
select * from filetab where size@sizendx >= 100000;
Besides the sizendx index, the optimizer could have chosen to use sizenmndx or sizedatendx. When you specify the index name
with the column name in the conditional expression, the optimizer considers only that particular index. Be certain
you know exactly what you are doing when you use this feature (or the restriction factor override described in the previous section).
Optimizer Iteration Threshold (OptLimit)
The time required by the optimizer to determine the optimal execution plan for a query increases factorially with the number
of tables referenced in the from clause. Thus, the time to compile and optimize a query can become noticeable when there are
many (> 8-10) referenced tables. The algorithm used by RDM Server SQL will often (but not always) determine the best
access plan (or a reasonably good one) early in the optimization phase.
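To get a feel for the factorial growth mentioned above, note that n tables can be accessed in n! different orders. The following minimal C sketch computes that count. It is only an illustration of the growth rate; the function name is hypothetical, and the real optimizer also weighs several access methods per table.

```c
#include <assert.h>

/* Number of possible orderings of n tables (n!). Valid for n <= 20,
 * where the result still fits in an unsigned 64-bit integer. */
unsigned long long table_orderings(unsigned n)
{
    unsigned long long result = 1;
    unsigned i;

    assert(n <= 20);
    for (i = 2; i <= n; i++)
        result *= i;
    return result;
}
```

Eight tables give 40,320 orderings and ten tables give 3,628,800, which is why optimization time becomes noticeable in that range.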
The optimizer algorithm includes a failure-to-improve threshold limit based on the number of access plan step iterations.
When the algorithm fails to generate a better access plan within the specified limit, the optimizer stops and uses the best plan
found up to that point. The number of iterations that the algorithm processes depends on the number of tables being accessed
and the number of usable access methods that can be chosen. The OptLimit configuration parameter can be used to specify
this failure-to-improve value. When it is set, the optimizer stops early, after executing OptLimit steps without finding a plan
better than the current best plan. (This is similar to how chess programs work in timed
games.)
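The idea of a failure-to-improve threshold can be sketched in C as follows. This is NOT the RDM Server algorithm, only an illustration under stated assumptions: candidate plans are reduced to an array of costs examined in order, the limit counts consecutive candidates that fail to improve on the best one, and a limit of 0 disables the threshold. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative failure-to-improve search, not the actual optimizer.
 * costs[] holds the cost of each candidate plan in the order examined.
 * The search stops once `limit` consecutive candidates fail to beat the
 * best cost so far; limit == 0 disables the threshold.
 * Returns the index of the best candidate found and stores the number of
 * candidates examined in *examined. */
size_t search_with_limit(const double *costs, size_t count,
                         unsigned limit, size_t *examined)
{
    size_t best = 0, i;
    unsigned since_improved = 0;

    assert(costs != NULL && count > 0 && examined != NULL);
    for (i = 1; i < count; i++) {
        if (costs[i] < costs[best]) {
            best = i;
            since_improved = 0;
        } else if (limit != 0 && ++since_improved >= limit) {
            i++;                /* count this candidate as examined */
            break;
        }
    }
    *examined = i;
    return best;
}
```

With a small limit the search may return a plan that is good but not the best, which is the trade-off the OptLimit parameter controls.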
We recommend that you keep this parameter disabled (OptLimit = 0) unless you have an ongoing need to dynamically compile complex queries in which the optimization time degrades overall system performance. If you need to specify OptLimit,
the value is the number of optimizer iterations (i.e., candidate execution plan steps). You will typically choose a value greater
than 1,000 and less than 10,000. The higher the number, the longer the optimizer will take, but the better the likelihood of
finding the best plan.
An administrator can use the set opt_limit SQL statement to change the value for a particular session. The OptLimit configuration parameter sets the limit for all sessions. Like all configuration parameters in rdmserver.ini, it is read only once, at initial server
startup.
Enabling Automatic Insertion of Redundant Conditionals
A configuration parameter named RedundantExprs can be defined in rdmserver.ini (RedundantExprs=1) that allows the
optimizer to include redundant expressions involving foreign and primary key columns involved in a join predicate.
Checking Optimizer Results
Retrieving the Execution Plan (SQLShowPlan)
You can view the execution plan by calling the SQLShowPlan function from your C/C++ application program. You can
also view the plan from the rsql utility program. SQLShowPlan returns a result set containing one row for each step in the
execution plan (one step per table listed in the from clause). This result set returns columns as shown in Table 15-9.
Table 15-9. SQLShowPlan Result Set Definition
Column Name        Description
STEP_NUMBER        The step number in the execution plan. The first row is step 1, the second is step 2, etc.
DB_NAME            The name of the database in which the table is defined.
TABLE_NAME         The name of the base table being accessed.
ACCESS_METHOD      The method by which the table is accessed (see below).
ACCESS_NAME        The name of the index or predefined join used by the access method, if applicable.
STEP_CARDINALITY   Estimate of the number of rows returned from this step.
PLAN_CARDINALITY   Estimate of the total number of rows returned by the query.
PLAN_COST          Estimate of the total cost (in logical I/Os) to execute the query.
SORT_LEN           Length of the sort record for the order by clause (0 => no sort required).
GROUP_LEN          Length of the sort record for the group by clause (0 => no sort required).
The last four columns return the same values for each row in the result set. The access method is identified using the names in
Table 15-10 below.
Table 15-10. SQLShowPlan Access Methods
Name Used in SQLShowPlan   Access Method
TABLE SCAN                 Sequential file scan
DIRECT                     Direct access
INDEX FIND                 Indexed access (equality)
INDEX LIST                 Indexed access (in)
INDEX SCAN                 Index scan (inequality)
INDEX LIKE                 Index scan (like)
P-TO-F JOIN                Primary to foreign key join
F-TO-P JOIN                Foreign to primary key join
JOINED INDEX               Foreign thru indexed primary key
JOINED DIRECT              Foreign thru rowid primary key
SQLShowPlan is called with two statement handles. The first is the statement handle into which the execution plan result
set will be returned. The second statement handle is for the statement whose execution plan will be retrieved. This second
handle must be at least in the prepared state (that is, the statement must have already been compiled using SQLPrepare or
SQLExecDirect). The prototype for SQLShowPlan is given below.
RETCODE SQLShowPlan(
    HSTMT thisHstmt,   // in: handle for SQLShowPlan result set
    HSTMT thatHstmt);  // in: handle of statement whose plan is to be fetched
SQLShowPlan will return an error if thisHstmt is not in a state compatible with SQLExecDirect (SQLShowPlan calls
SQLExecDirect). An error will also be returned if thatHstmt is not a prepared or executed select, update, or delete statement.
You can view a statement's execution plan from rsql using the ".X" command. You must execute this command under a statement handle separate from the one whose execution plan you are interested in. The following "select office, count" example
illustrates the use of the command.
002 rsql: select office, count(*) from salesperson, customer, sales_order
+ 002 rsql: where salesperson.sale_id = customer.sale_id
+ 002 rsql: and customer.cust_id = sales_order.cust_id
+ 002 rsql: and state in ("AZ","CA",'CO','WA','TX') group by 1 order by 2 desc;
OFFICE    COUNT(*)
LAX             15
DEN             12
SEA              9
DAL              6
004 rsql: .h 2
*** using statement handle 2 of connection 1
004 rsql: .t
*** table mode is off
004 rsql: .X 1
STEP_NUMBER     : 1
DB_NAME         : SALES
TABLE_NAME      : CUSTOMER
ACCESS_METHOD   : INDEX LIST
ACCESS_NAME     : CUST_GEO
STEP_CARDINALITY: 9
PLAN_CARDINALITY: 40.000000
PLAN_COST       : 54.000000
SORT_LEN        : 9
GROUP_LEN       : 24
004 rsql: .n
STEP_NUMBER     : 2
DB_NAME         : SALES
TABLE_NAME      : SALESPERSON
ACCESS_METHOD   : F-TO-P JOIN
ACCESS_NAME     : ACCOUNTS
STEP_CARDINALITY: 9
PLAN_CARDINALITY: 40.000000
PLAN_COST       : 54.000000
SORT_LEN        : 9
GROUP_LEN       : 24
004 rsql: .n
STEP_NUMBER     : 3
DB_NAME         : SALES
TABLE_NAME      : SALES_ORDER
ACCESS_METHOD   : P-TO-F JOIN
ACCESS_NAME     : PURCHASES
STEP_CARDINALITY: 40
PLAN_CARDINALITY: 40.000000
PLAN_COST       : 54.000000
SORT_LEN        : 9
GROUP_LEN       : 24
004 rsql: .n
*** no more rows
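In the output above, PLAN_CARDINALITY, PLAN_COST, SORT_LEN, and GROUP_LEN repeat in every row, as noted under Table 15-9. An application that copies the SQLShowPlan result set into its own structures could check this. The sketch below is in C; the struct and function names are hypothetical and only the field names follow Table 15-9. It does not call SQLShowPlan itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory copy of one SQLShowPlan result row; the field
 * names follow Table 15-9, the struct itself is an assumption. */
struct plan_row {
    int         step_number;
    const char *table_name;
    const char *access_method;
    double      step_cardinality;
    double      plan_cardinality;
    double      plan_cost;
    int         sort_len;
    int         group_len;
};

/* Returns 1 if the four plan-wide columns hold the same values in every
 * row, and 0 otherwise. */
int plan_columns_consistent(const struct plan_row *rows, size_t count)
{
    size_t i;

    assert(rows != NULL && count > 0);
    for (i = 1; i < count; i++) {
        if (rows[i].plan_cardinality != rows[0].plan_cardinality ||
            rows[i].plan_cost != rows[0].plan_cost ||
            rows[i].sort_len != rows[0].sort_len ||
            rows[i].group_len != rows[0].group_len)
            return 0;
    }
    return 1;
}
```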
Using the SqlDebug Configuration Parameter
The ENVIRONMENT section of rdmserver.ini contains a parameter called SqlDebug. This parameter has been implemented
for internal use by Raima, but it can also be used by an SQL developer to discover the execution plan choices made by the
RDM Server optimizer. Use this method only when you need more information than that provided by SQLShowPlan.
WARNING: Enabling the SqlDebug parameter will cause many debug files to be created which can quickly
consume disk space. Do not enable this parameter on a production server; it should be used strictly in a test
environment with one user only.
Debug information for each query is written into a separate debug text file in the current directory within which the RDM
Server is executing. The files are named debug.nnn where nnn is 000 for the first file, 001 for the second, and so forth.
The SqlDebug parameter is a bit mapped value in which each bit setting controls the output of certain SQL internal tables.
RDM Server SQL maintains the debug settings shown in Table 15-11 below. When more than one bit setting is specified,
the output for each is written into a separate debug file.
Table 15-11. SqlDebug Parameter Values
SqlDebug   Debug File Output (setting 2 is currently unused)
1          Formatted dump of compiled statement
4          Formatted dump of query optimizer tables and execution plan
8          Formatted dump of query execution plan
5          1 and 4
9          1 and 8
12         4 and 8
13         1, 4 and 8
Setting 1 is of no interest with respect to query optimization. Setting 8 provides basically the same information as
SQLShowPlan. Setting 4 will produce a formatted dump of the internal tables that drive the optimizer's analysis.
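Because SqlDebug is a bit map, a combined value such as 13 can be taken apart with bit tests. The following is a minimal C sketch using the bit values from Table 15-11. The macro and function names are assumptions made for illustration and are not RDM Server identifiers.

```c
#include <assert.h>

/* Bit values taken from Table 15-11; the macro names are assumptions. */
#define SQLDEBUG_STATEMENT  1u  /* dump of compiled statement */
#define SQLDEBUG_OPTIMIZER  4u  /* dump of optimizer tables and plan */
#define SQLDEBUG_PLAN       8u  /* dump of query execution plan */

/* Returns how many debug outputs a SqlDebug value selects. Bit 2 is
 * unused and is ignored. */
int sqldebug_output_count(unsigned value)
{
    int count = 0;

    if (value & SQLDEBUG_STATEMENT) count++;
    if (value & SQLDEBUG_OPTIMIZER) count++;
    if (value & SQLDEBUG_PLAN)      count++;
    return count;
}

/* Returns 1 if the value enables the optimizer-table dump (setting 4). */
int sqldebug_has_optimizer_dump(unsigned value)
{
    return (value & SQLDEBUG_OPTIMIZER) != 0;
}
```

A value of 13 selects all three outputs, while 9 selects the compiled statement and the execution plan but not the optimizer tables.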
Included at the beginning of the debug file is a copy of the SQL select statement being optimized. A sample for the "select
office, count" query from the Retrieving the Execution Plan section is shown below.
select office, count(*) from salesperson, customer, sales_order
where salesperson.sale_id = customer.sale_id
and customer.cust_id = sales_order.cust_id
and state in ("AZ","CA",'CO','WA','TX') group by 1 order by 2 desc;
---------------------------------------------------------------------------------
FROM Table:
 #  name                    tableid  viewid    # rows  step #
--  ----------------------  -------  -------  --------  ------
 0  SALESPERSON                  36        0        14       1
 1  CUSTOMER                     37        0        28       0
 2  SALES_ORDER                  38        0       127       2
Referenced Column Table:
 #  name             tableno  colno  accmap  ndxname
--  ---------------  -------  -----  ------  -------
 0  OFFICE                 0      6  0x0010
 1  SALE_ID                0      1  0x0004
 2  SALE_ID                1      8  0x0002
 3  CUST_ID                1      1  0x0004
 4  CUST_ID                2      1  0x0002
 5  STATE                  1      6  0x0004
Access Table:
                                      -----ref'd columns-----  binds
 #  type  tableno/name            id   1  2  3  4  5  6  7  8  table  updated
--  ----  -------------------  -----  -----------------------  -----  -------
 0  i     0/SALESPERSON            0   1 -1 -1 -1 -1 -1 -1 -1  no     no
 1  i     0/SALESPERSON            1  -1  0 -1 -1 -1 -1 -1 -1  no     no
 2  i     1/CUSTOMER               0   3 -1 -1 -1 -1 -1 -1 -1  no     no
 3  i     1/CUSTOMER               1   5 -1 -1 -1 -1 -1 -1 -1  no     no
 4  j     1/CUSTOMER               0   2 -1 -1 -1 -1 -1 -1 -1  no     no
 5  j     2/SALES_ORDER            0   4 -1 -1 -1 -1 -1 -1 -1  no     no
Expression Table:
 #  optable  col0  emult0     col1  emult1     join  operation
--  -------  ----  ---------  ----  ---------  ----  ---------
 0        2     1   0.071429     2   0.035714  yes   eq
 1        2     3   0.035714     4   0.036220  yes   eq
 2        2     5   0.321429    -1   0.000000  no    in
Rule Table:
                 binds  uses                                    binds   uses    sort
 #  method       tab #  tab #    # rows    cost     id  expr #s  col #s  col #s  col #s
--  -----------  -----  -----  --------  ------  -----  -------  ------  ------  ------
 0  FILE SCAN        0     -1     14.00     145     -1
 1  FILE SCAN        1     -1     28.00     145     -1
 2  FILE SCAN        2     -1    127.00     145     -1
 3  INDEX SCAN       0     -1     14.00      15      0
 4  INDEX SCAN       0     -1     14.00      15      1
 5  INDEX SCAN       1     -1     28.00      29      2
 6  INDEX SCAN       1     -1     28.00      29      3
 7  INDEX FIND       0      1      1.00       2      0        0       1       2
 8  INDEX FIND       1      2      1.01       2      2        1       3       4
 9  INDEX LIST       1     -1      9.00       9      3        2       5
10  JOIN OWNER       1      2      1.00       1  20002        1       3       4
11  JOIN MEMBER      2      1      4.54       4  20002        1       4       3
12  JOIN OWNER       0      1      1.00       1  20001        0       1       2
13  JOIN MEMBER      1      0      2.00       2  20001        0       2       1
---------------------------------------------------------------------------------
Best Access Plan: cost = 54 i/o's, cardinality estimate = 40 rows
step  rule #       cost    rows in   rows out
----  ------  ---------  ---------  ---------
   0       9          9          9          9
   1      12          9          9          9
   2      11         36         40         40
Number of optimizer iterations: 15
Table 15-12 below lists the tables referenced in the from clause of the statement.
Table 15-12. FROM Table
Column Heading   Description
#                Index into the FROM table; referred to in other tables as the "tableno".
name             The table name, view name, or correlation (alias) name, if specified.
tableid          SQL's permanent ID number for the table as assigned in the system catalog.
viewid           SQL's permanent ID number for the view as assigned in the system catalog.
# rows           The cardinality of the table when the statement was compiled.
step #           Identifies the step number in the best plan where this table is accessed.
Table 15-13 contains one entry for each column that is referenced in the statement.
Table 15-13. Referenced Column Table
Column Heading   Description
#                Referenced column number; used in the other tables to identify a specific column.
name             The base column name (not the alias, if an alias was specified).
tableno          FROM table index of the table where this column is declared.
colno            The column declaration number from its table (1 = first declared column in table).
accmap           The column access type bit map (see Table 15-14 below).
ndxname          Identifies a user-specified index name.
Table 15-14. Access Type Bit Map Values
Bit Map   Description
0x0001    Column is a rowid primary key (direct access)
0x0002    Column is the major (first) column in a joined foreign key
0x0004    Column is the major (first) column in an index
0x0008    Column is a minor (not the first) column in a joined foreign key
0x0010    Column is a minor (not the first) column in an index
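The accmap values in the debug file can be interpreted with simple bit tests. The following is a minimal C sketch based on the bit values in Table 15-14. The macro and function names are assumptions made for illustration.

```c
#include <assert.h>

/* Bit values from Table 15-14; the macro names are assumptions. */
#define ACC_ROWID_PKEY       0x0001u  /* rowid primary key (direct access) */
#define ACC_JOIN_FKEY_MAJOR  0x0002u  /* first column of a joined foreign key */
#define ACC_INDEX_MAJOR      0x0004u  /* first column of an index */
#define ACC_JOIN_FKEY_MINOR  0x0008u  /* later column of a joined foreign key */
#define ACC_INDEX_MINOR      0x0010u  /* later column of an index */

/* Returns 1 if the column appears in at least one index. */
int accmap_in_index(unsigned accmap)
{
    return (accmap & (ACC_INDEX_MAJOR | ACC_INDEX_MINOR)) != 0;
}

/* Returns 1 if the column leads an index or a joined foreign key. */
int accmap_is_major(unsigned accmap)
{
    return (accmap & (ACC_JOIN_FKEY_MAJOR | ACC_INDEX_MAJOR)) != 0;
}
```

In the sample dump, OFFICE has accmap 0x0010 (a minor column of an index), while SALE_ID in CUSTOMER has 0x0002 (the major column of a joined foreign key).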
Table 15-15 contains information about the indexes and joins that potentially can be used by the optimizer.
Table 15-15. Access Table
Column Heading   Description
#                Index number into this table; referenced in the "id" column of the Rule Table.
type             Access type: "i" for an index, "j" for a predefined join.
tableno/name     The FROM table number and name of accessed table.
id               The index, foreign key, or primary/unique key entry for the accessed table (this value is used to
                 index into internal tables attached to the table definition).
ref'd columns    Identifies the columns (up to 8) in the index or foreign key that are referenced in the statement.
                 The non-negative values are indexes into the Referenced Column Table. A -1 indicates an undeclared
                 or unreferenced column.
binds table      Indicates whether all of the columns from the table that are referenced in the query are contained
                 in the index. If yes, SQL will not have to read the row from the data file but can retrieve all the
                 column values from the index key value.
updated          Only used on update statements and indicates if one of the columns in the index is being modified
                 in the statement.
The "Expression Table" has one entry for each optimizable conditional expression.
Table 15-16. Expression Table
Column Heading   Description
#                Index number into this table; it is referenced in the "expr #s" column of the Rule Table.
optable          A value of 2 indicates that at least one efficient access method is associated with the expression.
                 A value of 1 indicates that no access method exists by which rows that satisfy the condition can be
                 retrieved efficiently.
col0             The Referenced Column Table entry corresponding to the (left-hand) column referenced in the
                 conditional.
emult0           The restriction factor multiplier value associated with the col0 expression.
col1             The Referenced Column Table entry corresponding to the (right-hand) column referenced in a join
                 condition.
emult1           The restriction factor multiplier value associated with the col1 join condition. Two restriction
                 factors are needed depending on which table is being accessed.
join             Indicates if the expression is a join condition (that is, col0 = col1).
operation        The relational operator specified in the expression.
The heart of the optimization analysis is driven by the Rule Table (Table 15-17).
Table 15-17. Rule Table
Column Heading   Description
#                Index number into this table; it is referenced in the "rule #" column of the Best Access Plan.
method           The access method associated with this rule. See Table 15-10 for a list of the access methods.
binds tab #      The tableno of the table being accessed by that method. A table becomes "bound" at that step in the
                 access plan where the rule is applied. Prior to that step, the table is unbound.
uses tab #       The tableno of the table that contains column values needed by this rule's access method. A -1 value
                 means that the rule does not depend on values from any other table. Rules that rely on values from
                 another table (through join predicates) can only be used in plan steps that follow the rule that
                 accesses the used table.
# rows           The optimizer's estimate of the number of rows from the table that will be returned by the rule.
                 When a table depends on another table having first been bound (that is, "uses tab #" != -1), then
                 the "# rows" is the average number returned for each row of the dependent table.
cost             Estimate of the number of logical disk reads required for each application of the rule based on the
                 formulas given in Table 15-5.
id               The Access Table entry that contributed to this rule or the internal join identifier (i.e., a
                 core-level d_ API set id constant). A -1 value indicates that it is unused.
expr #s          List of Expression Table entries that contributed to this rule.
binds col #s     The Referenced Columns (from the "binds tab #" table) that are accessed by the rule.
uses col #s      The Referenced Columns (from the "uses tab #" table) that are used by the rule.
sort col #s      The Referenced Columns that specify the sort order in which the rule returns its rows. A negative
                 value indicates that this column is returned in descending order (formed as -colno - 1).
A summary of the optimizer's results follows in Table 15-18. The cost and cardinality estimate of the Best Access Plan are
reported and followed by the plan itself. The plan lists the steps in the order of their execution.
Table 15-18. Best Access Plan
Column Heading   Description
step             Access plan step number. The steps are executed in this order.
rule #           The Rule Table entry for the rule that the optimizer selected for this step.
cost             The cost to apply the rule at this step in the plan is equal to the prior step's "rows out" times the
                 rule's cost from the Rule Table. For step 0, the cost is the rule's cost.
rows in          The number of rows from the prior step that invoke an application of the rule in this step. It is
                 equal to the prior step's "rows out" times the "# rows" value from the Rule Table for the rule being
                 applied at this step. For step 0, "rows in" is the "# rows" value from the Rule Table.
rows out         The optimizer's estimate of the number of the "rows in" rows that satisfy all conditionals from the
                 where clause involving the table accessed at that step in the plan. Computed from the restriction
                 factors of all expressions that contribute to this rule.
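The "cost" and "rows in" formulas in Table 15-18 can be checked against the sample debug file with a short calculation. The following C sketch applies them to one plan step. The struct and function names are hypothetical, and "rows out" is treated as an input because it depends on restriction factors that are not modeled here.

```c
#include <assert.h>

/* Per-step estimates as described in Table 15-18. */
struct step_estimate {
    double cost;     /* cost of applying the rule at this step */
    double rows_in;  /* rows that invoke the rule at this step */
};

/* Illustrative only: applies the formulas from Table 15-18.
 * step            0-based plan step number
 * prior_rows_out  "rows out" of the previous step (ignored for step 0)
 * rule_cost       "cost" of the selected rule in the Rule Table
 * rule_rows       "# rows" of the selected rule in the Rule Table */
struct step_estimate estimate_step(int step, double prior_rows_out,
                                   double rule_cost, double rule_rows)
{
    struct step_estimate e;

    assert(step >= 0);
    if (step == 0) {
        e.cost = rule_cost;
        e.rows_in = rule_rows;
    } else {
        e.cost = prior_rows_out * rule_cost;
        e.rows_in = prior_rows_out * rule_rows;
    }
    return e;
}
```

With rules 9, 12, and 11 from the sample Rule Table, this yields step costs of 9, 9, and 36, matching the Best Access Plan. The computed "rows in" for step 2 is about 40.86, which the debug file reports as 40.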
When a group by or order by clause is specified and the optimizer selects an execution plan that satisfies the desired ordering without requiring a separate sort pass, a statement will be reported similar to the following:
Plan produces target ordering for order by: 3 d 1 a
The numbers here are simply the result column ordinals. When an external sort is required, the sort costs will automatically be
incorporated into the reported plan cost and no notice will be printed. Note that in the example above, an external sort was
required (the "target ordering..." message was not printed). The optimizer's estimate of the cost of the sort can be computed by
subtracting the cost reported in the last step of the Best Access Plan from the plan's total cost. Finally, the total number of
optimizer iterations needed to determine the best access plan is reported.
Limitations
Optimization of View References
Each view in RDM Server SQL is optimized at its creation time and stored in a compiled format. A view referenced in a
select statement is accessed according to its precompiled execution plan. This can cause performance problems if a view is referenced in a query with extra conditionals or is joined with another table. Instead of "unraveling" the view definition and re-optimizing
it along with the extra conditionals, the view's rows are retrieved and the additional constraints are evaluated at
execution time. Thus, it is best to avoid creating joins that involve views (but a view definition can include joins on base
tables). An alternative is to use stored procedures, which are optimized at compile time but can be parameterized; the optimizer does incorporate stored procedure parameter references in its analysis.
Merge-Scan Join Operation is Not Supported
A merge-scan join operation is a join processing technique where indexes on the joined columns are merged and only rows
common to both indexes are returned. Some optimizers even include the cost of creating an index when one of the columns is
not indexed. RDM Server does not include this technique because of its ability to define direct access joins using the create
join statement. Join processing based on these predefined joins is optimal.
Subquery Transformation (Flattening) Unsupported
Some optimizers perform an optimization technique on nested correlated subqueries where the query is "flattened" into an
equivalent query that has replaced the subqueries with joins. This method is not implemented in RDM Server.