A.3 SQL
The language of choice for querying and manipulating databases is
Structured Query Language, usually abbreviated SQL and often
pronounced "sequel." SQL is a
declarative language, as opposed to a procedural language, and it can
take a while to get used to working with a declarative language if
you are used to languages like VB or C#.
Most programmers tend to think in terms of a sequence of steps:
"Find me all the bugs, then get the
reporter's ID, then use that ID to look up that
user's records in People, then get me the email
address." In a declarative language, you declare the
entire query, and the query engine returns a set of results. You are
not thinking about a set of steps; rather, you are thinking about
designing and "shaping" a set of
data. Your goal is to make a single declaration that will return the
right records. You do that by creating temporary
"wide" tables that include all the
fields you need and then filtering for only those records you want.
In effect, you say: "Widen the Bugs table with the People table,
joining the two on the PersonID, then filter for only those that meet
my criteria."
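As a rough sketch of that way of thinking, the step-by-step request quoted earlier collapses into a single declaration. The table and column names are the ones used in this appendix; the BugID value 101 is only an illustrative choice.

```sql
-- Widen Bugs with People, then filter down to the one bug of interest.
select p.emailAddress
from Bugs b
join People p on b.personID = p.personID
where b.BugID = 101
```

Nothing in the statement says in what order to look things up; the query engine decides that for itself.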
The heart of SQL is the query. A query is a statement that returns a
set of records from the database. For
example, you might like to see all of the BugIDs and Bug Descriptions
in the Bugs table whose status is Open. To do so you would write:
Select BugID, BugDescription from Bugs where status = 'open'
SQL is capable of much more powerful queries. For example, suppose
the Quality Assurance manager would like to know the email address
for everyone who has reported a high-priority bug that was resolved
in the past ten days. You might create a query such as:
Select emailAddress from Bugs b
join People p on b.personID = p.personID
where b.priority='high'
and b.status in ('closed', 'fixed','NotABug')
and b.dateModified >= DateAdd(d,-10,GetDate())
GetDate returns the current date, and
DateAdd returns a new date computed by adding or
subtracting an interval from a specified date. In this case, you are
returning the date computed by subtracting ten days from the current
date.
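To see what these two functions return, a standalone statement such as the following can be run on its own in SQL Server. This is only a sketch; the column aliases are hypothetical names chosen for readability.

```sql
-- GetDate() returns the current date and time; DateAdd shifts a date by
-- the given number of units (d = day, m = month).
select GetDate() as Today,
       DateAdd(d, -10, GetDate()) as TenDaysAgo,
       DateAdd(m, 1, GetDate()) as OneMonthAhead
```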
At first glance, you appear to be selecting the email address from
the Bugs table, but that is not possible because the Bugs table does
not have an email address. The key phrase is:
Bugs b join People p on b.personID = p.personID
It is as if the join phrase creates a temporary
table that is the width of both the Bugs table and the People table
joined together. The on keyword dictates how the
tables are joined. In this case, the tables are joined on the
personID: each record in Bugs (represented by the alias b) is joined
to the appropriate record in People (represented by the alias p) when
the personID fields match in both records.
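One way to see this temporary wide table is to select every column from the join. A minimal sketch, assuming only the Bugs and People tables described in this appendix:

```sql
-- Returns one row per matching pair of records, carrying every column of
-- Bugs followed by every column of People (so personID appears twice).
select *
from Bugs b
join People p on b.personID = p.personID
```

The column list and the where clause of the earlier query then narrow this wide result to the fields and records of interest.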
A.3.1 Joining Tables
When you join two tables, you can say either "get every record that
exists in either" (this is called an outer join) or, as we've done
here, "get only those records that exist in both tables" (called an
inner join).
Inner joins are the default, and so writing join is the same as
writing inner join.
The inner join shown above says: get only the records in People that
match the records in Bugs by having the same value in the PersonID
field (on b.PersonID = p.PersonID).
The where clause further constrains the search to those records whose
priority is high, whose status is one of the three that constitute a
resolved Bug (closed, fixed, or not a bug), and that were last
modified within the past ten days.
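The difference between inner and outer joins is easiest to see with a little data. The following sketch uses temporary tables so it can be run without touching the real database; the table contents and email addresses are made up for illustration, and the temporary tables carry only a few of the columns the real Bugs and People tables would have.

```sql
-- Hypothetical sample data in temporary tables.
create table #People (personID int primary key, emailAddress varchar(100));
create table #Bugs (BugID int primary key, BugDescription varchar(200), personID int null);

insert into #People values (1, 'alice@example.com');
insert into #People values (2, 'bob@example.com');
insert into #Bugs values (101, 'Crash on save', 1);
insert into #Bugs values (102, 'Typo in menu', null);

-- Inner join: only bugs that have a matching person (bug 101).
select b.BugID, p.emailAddress
from #Bugs b
inner join #People p on b.personID = p.personID;

-- Left outer join: every bug, with NULL for emailAddress where no person
-- matches (bugs 101 and 102).
select b.BugID, p.emailAddress
from #Bugs b
left outer join #People p on b.personID = p.personID;

-- Full outer join: every bug and every person, matched where possible
-- (bugs 101 and 102, plus person 2 with a NULL BugID).
select b.BugID, p.emailAddress
from #Bugs b
full outer join #People p on b.personID = p.personID;

drop table #Bugs;
drop table #People;
```

A left outer join keeps every record from the table named first. The full outer join corresponds most directly to "every record that exists in either."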
A.3.2 Using SQL to Manipulate the Database
SQL can be used not only for searching for and retrieving data but
also for creating, updating, and deleting tables and generally
managing and manipulating both the content and the structure of the
database. For example, you can update the Priority of a bug in the
Bugs table with this statement:
Update Bugs set priority = 'high' where BugID = 101
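Adding and deleting records, and changing a table's structure, follow the same pattern. The sketch below is illustrative only. The values are made up, the severity column is hypothetical, and the insert assumes that BugID is supplied by the caller rather than generated by the database and that any columns not listed accept NULL or have defaults.

```sql
-- Add a new record to Bugs (illustrative values).
insert into Bugs (BugID, BugDescription, status, priority, personID, dateModified)
values (102, 'Crash on save', 'open', 'high', 1, GetDate());

-- Remove that record again.
delete from Bugs where BugID = 102;

-- Change the structure of the table by adding a (hypothetical) column.
alter table Bugs add severity varchar(20) null;
```

As with update, leaving the where clause off a delete statement applies it to every record in the table, so it is worth checking the clause with a select first.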
For a full explanation of SQL and details on using it well, take a
look at Transact-SQL Programming, by Kevin
Kline, Lee Gould, and Andrew Zanevsky (O'Reilly).