Using Subqueries
Arguably the greatest of all the enhancements to MySQL in version 4.1 will be the ability to use subqueries. Using subqueries allows you to execute one query to obtain a result set, and then use these results as a condition within another queryall within the same line of SQL. In other words, you have two statements in your single query, the second being valid only when used with the results of the first.
Think of times when you may have run into a situation in your programming when you wanted to get results from multiple tables, but did not want to join the tables due to their complexity (or perhaps they weren't fully relational). Your solution may have been to execute the first query and then loop through the results and perform another query, multiple times over. There's nothing wrong with performing the task in that manner and, with compact code, it probably wasn't especially time consuming. However, using a subquery in that same situation would open a whole new world of efficiency.
Subquery Example for Time Tracking
Suppose you have a system in place that allows employees to enter hours spent on particular tasks, on certain dates, for a given client or project. The table creation and data insertion statements that follow are rudimentary examples of what you might find in such a system.
In the tasks table, the names of tasks and their assigned hourly rates are stored:
CREATE TABLE tasks (
task_id tinyint PRIMARY KEY NOT NULL auto_increment,
task_name varchar(50) NOT NULL default '',
task_fee float(4,2) NOT NULL default '0.00'
);
INSERT INTO tasks VALUES ('1', 'Programming', '150.00');
INSERT INTO tasks VALUES ('2', 'Database Design', '200.00');
INSERT INTO tasks VALUES ('3', 'Project Management', '150.00');
INSERT INTO tasks VALUES ('4', 'Data Entry', '75.00');
INSERT INTO tasks VALUES ('5', 'Graphic Design', '150.00');
In the employees table, the names of employees are stored:
CREATE TABLE employees (
e_id tinyint PRIMARY KEY NOT NULL auto_increment,
e_fname varchar(50) NOT NULL default '',
e_lname varchar(50) NOT NULL default ''
);
INSERT INTO employees VALUES ('1', 'John', 'Doe');
INSERT INTO employees VALUES ('2', 'Jane', 'Doe');
INSERT INTO employees VALUES ('3', 'Michael', 'Smith');
INSERT INTO employees VALUES ('4', 'Ralph', 'Jones');
In the clients table, the names of clients are stored:
CREATE TABLE clients (
c_id tinyint PRIMARY KEY NOT NULL auto_increment,
c_name varchar(255) NOT NULL default ''
);
INSERT INTO clients VALUES ('1', 'ABC Company');
INSERT INTO clients VALUES ('2', 'XYZ Company');
INSERT INTO clients VALUES ('3', 'Acme Telesystems');
INSERT INTO clients VALUES ('4', 'GoNutz Inc.');
Finally, in the billable_hours table, you have the individual entries. Note that all entries in the billable_hours table use the appropriate ID fields of the employee, client, and tasks tables in order to relate back to the master record for each.
CREATE TABLE billable_hours (
bh_id int PRIMARY KEY NOT NULL auto_increment,
work_date date,
e_id tinyint,
c_id tinyint,
task_id tinyint,
billable_time float(4,2) NOT NULL default '0.00'
);
INSERT INTO billable_hours VALUES ('', '2004-08-25', '1', '3', '2', '1.75');
INSERT INTO billable_hours VALUES ('', '2004-08-25', '1', '3', '1', '3.50');
INSERT INTO billable_hours VALUES ('', '2004-08-25', '2', '1', '3', '4.25');
INSERT INTO billable_hours VALUES ('', '2004-08-26', '2', '1', '4', '7.25');
INSERT INTO billable_hours VALUES ('', '2004-08-26', '2', '2', '1', '3.50');
INSERT INTO billable_hours VALUES ('', '2004-08-26', '3', '3', '1', '4.00');
INSERT INTO billable_hours VALUES ('', '2004-09-01', '3', '4', '2', '6.75');
INSERT INTO billable_hours VALUES ('', '2004-09-01', '3', '4', '2', '4.25');
INSERT INTO billable_hours VALUES ('', '2004-09-01', '4', '3', '3', '5.00');
INSERT INTO billable_hours VALUES ('', '2004-09-05', '4', '1', '3', '4.50');
Say you wanted to see a list of distinct tasks performed by any employee, for Acme Telesystemsand you weren't using subqueries. First, you'd have to get the ID for Acme Telesystems, and then issue a second query to get the list of tasks:
mysql> select c_id from clients where c_name = 'Acme Telesystems';
+------+
| c_id |
+------+
| 3 |
+------+
1 row in set (0.16 sec)
mysql> select distinct t.task_name as theTasks
-> from billable_hours as bh
-> left join tasks as t on t.task_id = bh.task_id
-> where bh.c_id = 3;
+--------------------+
| theTasks |
+--------------------+
| Database Design |
| Programming |
| Project Management |
+--------------------+
3 rows in set (0.22 sec)
Using subqueries, you can merge these two statements into one:
mysql> select distinct t.task_name as theTasks
-> from billable_hours as bh
-> left join tasks as t on t.task_id = bh.task_id
-> where bh.c_id = (select c_id from clients where c_name = 'Acme Telesystems');
+--------------------+
| theTasks |
+--------------------+
| Database Design |
| Programming |
| Project Management |
+--------------------+
3 rows in set (0.22 sec)
Using subqueries in this case has eliminated the first query, and 0.16 seconds from the whole process.
In general, a subquery is usually a SELECT statement, and this statement must not return more than one field within its results. Because you're trying to match one result from the second query to one condition in the first query, you can see where a multi-column SELECT statement in the second query would cause a mismatch error.
You can nest as many subqueries as you want, as long as the basic rules of queries are followed; that is, they must be valid and well-structured. Take the following example, which looks for work done by Michael Smith for GoNutz Inc.:
mysql> select t.task_name as theTask,
-> bh.billable_time * t.task_fee as theTotal
-> from billable_hours as bh
-> left join tasks as t on t.task_id = bh.task_id
-> where
-> (
-> (bh.c_id = (select c_id from clients where c_name = 'GoNutz Inc.'))
-> and
-> (bh.e_id = (select e_id from employees where (e_fname = 'Michael' and e_lname =
'Smith')))
->);
+-----------------+---------+
| theTasks |theTotal |
+-----------------+---------+
| Database Design | 1350.00 |
| Database Design | 850.00 |
+-----------------+---------+
2 rows in set (0.22 sec)
In the next sections, you will learn about some additional enhancements found in MySQL 4.1, but it is likely that none will have the same effect on your programming style as the ability to use subqueries in your applications!
|