SQL Subqueries

It is often impossible to solve a given task with just one query. In such cases, subqueries are used: internal, nested queries.

· 6 min read

A subquery is a SELECT statement nested within the body of another statement.

The coding of a subquery adheres to the same rules as coding a simple SELECT statement. The outer operator uses the result of the inner operator to determine the final result.

Based on the number of returned values, subqueries are divided into two types:

  • Scalar subqueries, which return a single value;
  • Table subqueries, which return a set of values.

There are two types of subqueries based on their execution method:

  • Simple subqueries;
  • Complex subqueries.

A subquery is considered simple if it can be executed independently of the outer query. The DBMS executes such a subquery once and then places its result in the outer query.

Sponsor

A subquery is called simple if it can be considered independently of the outer query. The DBMS executes such a subquery once and then incorporates its result into the outer query.

A complex subquery cannot be considered independently of the outer query. In this case, the execution of the statement starts with the outer query, which selects each individual row of the table. For each selected row, the DBMS executes the subquery once.

Simple Scalar Subqueries

Let's provide examples of simple scalar subqueries.

The database used in the examples is located in this post.

Example 1.
Determine the names of the parts whose price is higher than the price of the part named 'bolt'.

SELECT dname
FROM D
WHERE dprice > (SELECT dprice
                FROM D
                WHERE dname = ’bolt’)

This subquery belongs to the scalar type, as it returns a single value - the price of the part bolt.

The subquery is simple because it can be considered independently of the outer query. The DBMS first executes the subquery, obtaining the price of the part bolt - a value of 10, and then places this value into the outer query and executes it.

Example 2.
Determine the part numbers whose price is less than the average price of the parts.

SELECT dname
FROM D
WHERE dprice < (SELECT AVG(dprice)
                FROM D)

Example 3.
Determine the supplier number who made the delivery with the minimum volume.

SELECT pnum
FROM PD
WHERE volume = (SELECT min(volume)
                FROM PD)

Example 4.
Determine the part numbers of which more are supplied than parts with number 2.

SELECT pnum
FROM PD
GROUP BY dnum
HAVING sum(volume) > (SELECT sum(volume)
                      FROM PD
                      WHERE dnum = 2)
Subqueries can be used not only in the WHERE clause but also in other clauses of the SELECT statement, for example, in the SELECT clause itself.

Example 5.
Display the following information about the parts: name, price, deviation from the average price.

SELECT dname, dprice, dprice - (SELECT AVG(dprice) FROM PD) AS dif
FROM PD

As a result, we will get a table:

dnamedpricedif
bolt10-10
nut200
screw3010
Random block

Simple Table Subqueries

If a subquery returns a set of values, its result must be processed in a special way. This is where operations like IN, ANY, SOME, and ALL come into play. These operations can be used with subqueries that return a table consisting of a single column of values.

Operation IN

The IN operation checks for the inclusion of a value in a set obtained after the execution of the subquery.

Example 6.
Determine the names of suppliers who supply parts.

SELECT pname
FROM P
WHERE pnum in (SELECT pnum
               FROM PD)

This subquery is classified as table because it returns a set of values. The subquery is simple because it can be considered independently of the outer query.

The DBMS first executes the subquery, resulting in a set of supplier numbers who supply parts. Then, the DBMS checks the number of each supplier in the P table for membership in the obtained set. If a supplier's number is in the set, the supplier's name is placed in the resulting table.

Example 7.
Determine the names of suppliers who do not supply the part with number 2.

SELECT pname
FROM P
WHERE pnum not in (SELECT pnum
                   FROM PD
                   WHERE dnum = 2)

Example 8.
Determine the names of suppliers who supply only part number 1.

SELECT pname
FROM PD
WHERE pnum in (SELECT pnum
               FROM PD
               WHERE dnum = 1) AND pnum not in (SELECT pnum
                                                FROM PD
                                                WHERE dnum <> 1)

Operations ANY, SOME, ALL

If the keyword ANY precedes a subquery, the comparison condition is considered met if it is true for at least one of the values obtained after the execution of the subquery.

If the keyword ALL precedes a subquery, the comparison condition is considered met only if it is true for all of the values obtained after the execution of the subquery.

If the result of the subquery execution is an empty set, then for the ALL operation, the comparison condition will be considered met, but for the keyword ANY, it will be considered not met.

The keyword SOME is a synonym for ANY and is used to enhance the clarity of query texts.

Example 9.
Determine the names of suppliers who supply parts.

SELECT pname
FROM P
WHERE pnum = ANY(SELECT pnum
                 FROM PD)

This subquery is classified as table because it returns a set of values. The subquery is simple because it can be considered independently of the outer query.

The DBMS first executes the subquery, resulting in a set of supplier numbers who supply parts. Then, the DBMS checks if each supplier's number from the P table matches at least one of the numbers from the obtained set. If the condition is met, the supplier's name is placed in the resulting table.

Example 10.
Determine the name of the part with the highest price.

SELECT dname
FROM D
WHERE dprice >= ALL(SELECT dprice
                    FROM PD)

The last example can be solved in the following way:

SELECT dname
FROM D
WHERE dprice = (SELECT max(dprice)
               FROM PD)

Complex Table Subqueries

Operation EXISTS

The result of such operations is either TRUE or FALSE.

For the EXISTS operation, the result is TRUE if the table returned by the subquery contains at least one row. If the resulting table of the subquery is empty, then the EXISTS operation returns FALSE. For the NOT EXISTS operation, the reverse processing rules apply.

Since both operations only check for the presence of rows in the resulting table of the subquery, this table can contain an arbitrary number of columns.

Example 11.
Determine the names of suppliers who supply parts.

SELECT pname
FROM P
WHERE EXISTS(SELECT *
             FROM PD
             WHERE PD.pnum = P.pnum)

This subquery is classified as table because it returns a set of values. The subquery is complex because it cannot be executed independently of the outer query.

In this case, the execution of the statement starts with the outer query, which sequentially selects each individual row of the P table. For each selected row, the DBMS executes the subquery once. Only the names of those suppliers for whom the subquery returns at least one row are placed in the resulting table.

First, a row with information about the supplier Ivanov is selected. In the subquery, instead of P.pnum, the value 1 (the number of the supplier Ivanov) is substituted, after which the subquery is executed.

The subquery returns the first three rows from the PD table, corresponding to the deliveries of Ivanov, so the result of the EXISTS operation is TRUE, and the name Ivanov is placed in the resulting table.

Similarly, results are obtained for suppliers Petrov and Sidorov. When selecting a row with information about the supplier Kuznetsov, the subquery returns an empty set, so the result of the EXISTS operation is FALSE, and the name Kuznetsov is not placed in the resulting table.

Creating Self-Joins

A self-join is a standard SQL join that joins a table to itself. This type of join allows for comparing values stored in the same column of a table.

In self-joins, table aliases are used to differentiate between the joined copies of the table. These aliases are introduced in the FROM clause and are used as regular table names.

Example 12.
Determine the names of suppliers who supply both part number 1 and part number 2.

One of the ways to solve this task using a subquery can be written as follows.

SELECT pnum
FROM PD
WHERE dnum = 1 AND pnum in (SELECT pnum
                            FROM PD
                            WHERE dnum = 2)

The same result can be achieved by joining the PD table with its copy, which we will call PD1, in the following way:

SELECT PD.pnum
FROM (PD INNER JOIN PD AS PD1 ON PD.pnum=PD1.pnum)
INNER JOIN PD AS PD2 ON PD1.pnum=PD2.pnum
WHERE PD.dnum=1 AND PD1.dnum=2 AND PD2.dnum=3

Example 13.
Determine the names of suppliers who supply part number 1, part number 2, and part number 3.

SELECT PD.pnum
FROM (PD INNER JOIN PD AS PD1 ON PD.pnum=PD1.pnum)
INNER JOIN PD AS PD2 ON PD1.pnum=PD2.pnum
WHERE PD.dnum=1 AND PD1.dnum=2 AND PD2.dnum=3

Summing Up

From this article, you learned what a subquery in SQL is. Now you can easily distinguish between a scalar and a table subquery, as well as a simple query from a complex one.

We also examined operations such as IN, ANY, SOME, and ALL through examples.

Struchkov Mark
Struchkov Mark
Ask questions in the comments if something is unclear 👇