If you want to get something meaningful out of data, you’ll almost always need to join multiple tables. In this article, we’ll show how to do that using different types of joins. To achieve that, we’ll combine INNER JOINs and LEFT JOINs. So, let’s start.
The Model
In the picture below you can see our existing model. It consists of 6 tables and we’ve already, more or less, described it in the previous articles.
Still, even without a description, if the database is modeled and presented in a good manner (choosing names wisely, using a naming convention, following the same rules throughout the whole model, lines/relations in the schema not overlapping more than needed), you should be able to conclude where you can find the data you need. This is crucial because before you join multiple tables, you need to identify these tables first.
We’ll talk about naming conventions, and give advice on how to think when you’re writing SQL queries, later in this series. For now, let’s live with the fact that this model is pretty simple and we can identify the tables fairly easily.
What do we know so far?
In this series, we’ve covered:
- The basics of the SQL SELECT statement, and
- A comparison of INNER JOIN and LEFT JOIN
We’ll use the knowledge from both these articles and combine these to write more complex SELECT statements that will join multiple tables.
Join multiple tables using INNER JOIN
The first example we’ll analyze is how to retrieve data from multiple tables using only INNER JOINs. For each example, we’ll go with the definition of the problem we must solve and the query that does the job. So, let’s start with the first problem.
#1 We need to list all calls with their start time and end time. For each call, we want to display the outcome as well as the first and the last name of the employee who made that call. We’ll sort our calls by start time ascending.
Before we write the query, we’ll identify the tables we need to use. To do that, we need to determine which tables contain the data we need and include them. Also, we should include all tables along the way between these tables – tables that don’t contain data needed but serve as a relation between tables that do (that is not the case here).
The query that does the job is given below:
SELECT employee.first_name, employee.last_name, call.start_time, call.end_time, call_outcome.outcome_text
FROM employee
INNER JOIN call ON call.employee_id = employee.id
INNER JOIN call_outcome ON call.call_outcome_id = call_outcome.id
ORDER BY call.start_time ASC;
The query result is given below:
There are a few things I would like to point out here:
- The tables we’ve joined are here because the data we need is located in these 3 tables
- Each time I mention an attribute from a table, I’m using the format table_name.attribute_name (e.g. employee.first_name). While that’s not always required, it’s a good practice, because 2 or more tables in the same query could use the same attribute name, and an unqualified reference to such an attribute would fail with an ambiguity error
- We’ve used INNER JOIN 2 times in order to join 3 tables. As a result, the query returns only rows that have a matching pair in each of the joined tables
- When you’re using only INNER JOINs to join multiple tables, the order of these tables in joins is not important. The only important thing is that you use appropriate join conditions after the “ON” (join using foreign keys)
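The point about qualifying attribute names can be tried out with a small sketch. It is not part of the original example: it uses Python’s built-in sqlite3 module with an in-memory database, a reduced two-table version of the model, and made-up sample rows, all of which are assumptions for illustration. Other database engines report the ambiguity with a different message.

```python
import sqlite3

# Hypothetical two-table subset of the model; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE call (id INTEGER PRIMARY KEY, employee_id INTEGER, start_time TEXT);
INSERT INTO employee VALUES (1, 'Ann', 'Example');
INSERT INTO call VALUES (10, 1, '2020-01-11 09:00:15');
""")

join = "FROM employee INNER JOIN call ON call.employee_id = employee.id"

# Both tables have an "id" column, so an unqualified reference is ambiguous.
try:
    conn.execute(f"SELECT id {join}")
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# The table_name.attribute_name format tells the engine which column we mean.
qualified_rows = conn.execute(f"SELECT employee.id, call.id {join}").fetchall()

print(ambiguous_error)
print(qualified_rows)
```

The first statement is rejected before any row is read, while the qualified version returns one row with both ids.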
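Since all calls have a related employee and call outcome, we would get the same result if we used LEFT JOIN instead of INNER JOIN, provided that every employee has made at least one call (employee is the leftmost table, so an employee without calls would otherwise appear in the result with NULL values).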
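Both the remark about table order and the remark about LEFT JOIN can be checked with a minimal sketch. The table definitions below are a simplified assumption of the employee, call and call_outcome tables, the rows are invented, and Python’s sqlite3 module stands in for whatever database you actually use.

```python
import sqlite3

# Simplified, assumed versions of the three tables with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE call_outcome (id INTEGER PRIMARY KEY, outcome_text TEXT);
CREATE TABLE call (
    id INTEGER PRIMARY KEY,
    employee_id INTEGER REFERENCES employee(id),
    call_outcome_id INTEGER REFERENCES call_outcome(id),
    start_time TEXT,
    end_time TEXT
);
INSERT INTO employee VALUES (1, 'Ann', 'Example'), (2, 'Bob', 'Sample');
INSERT INTO call_outcome VALUES (1, 'finished'), (2, 'declined');
INSERT INTO call VALUES
    (1, 1, 1, '2020-01-11 09:00:15', '2020-01-11 09:12:22'),
    (2, 2, 2, '2020-01-11 09:14:50', '2020-01-11 09:20:01'),
    (3, 1, 1, '2020-01-11 09:24:15', '2020-01-11 09:25:05');
""")

COLUMNS = ("employee.first_name, employee.last_name, call.start_time, "
           "call.end_time, call_outcome.outcome_text")

def fetch(from_clause):
    """Run the SELECT list from example #1 against the given FROM clause."""
    sql = f"SELECT {COLUMNS} FROM {from_clause} ORDER BY call.start_time ASC"
    return conn.execute(sql).fetchall()

inner_rows = fetch(
    "employee "
    "INNER JOIN call ON call.employee_id = employee.id "
    "INNER JOIN call_outcome ON call.call_outcome_id = call_outcome.id"
)
# Same INNER JOINs, with the tables listed in a different order.
reordered_rows = fetch(
    "call_outcome "
    "INNER JOIN call ON call.call_outcome_id = call_outcome.id "
    "INNER JOIN employee ON call.employee_id = employee.id"
)
# LEFT JOINs: identical here only because no row lacks a match.
left_rows = fetch(
    "employee "
    "LEFT JOIN call ON call.employee_id = employee.id "
    "LEFT JOIN call_outcome ON call.call_outcome_id = call_outcome.id"
)

print(len(inner_rows), inner_rows == reordered_rows == left_rows)
```

With this sample data all three variants return the same 3 rows. Adding an employee without any calls would make only the LEFT JOIN variant return an extra row.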
Join multiple tables using LEFT JOIN
Writing queries that use LEFT JOINs doesn’t differ a lot when compared to writing queries using INNER JOINs. The result would, of course, be different (at least in cases when some records don’t have a pair in other tables).
This is the problem we want to solve.
#2 List all countries and customers related to these countries. For each country, display its name in English, the name of the city the customer is located in, as well as the name of that customer. Return even countries without related cities and customers.
The tables containing data we need are in the picture below:
First, let’s quickly check the contents of these 3 tables.
We can notice two important things:
- While each city has a related country, not all countries have related cities (Spain & Russia don’t have them)
- The same holds for the customers. Each customer has the city_id value defined, but only 3 cities are being used (Berlin, Zagreb & New York)
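The two observations above can also be confirmed with queries instead of reading the tables by eye. The sketch below is an illustration under assumptions: the rows are reconstructed to match the counts described in the text (the customer names and the city-to-country assignments are invented), the filter on IS NULL after a LEFT JOIN is a technique not used elsewhere in this article, and Python’s sqlite3 module is used only to make the queries runnable.

```python
import sqlite3

# Sample rows are assumptions chosen to match the counts described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, country_name_eng TEXT);
CREATE TABLE city (id INTEGER PRIMARY KEY, city_name TEXT, country_id INTEGER);
CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_name TEXT, city_id INTEGER);
INSERT INTO country VALUES (1, 'Germany'), (2, 'Serbia'), (3, 'Croatia'),
    (4, 'United States of America'), (5, 'Poland'), (6, 'Spain'), (7, 'Russia');
INSERT INTO city VALUES (1, 'Berlin', 1), (2, 'Belgrade', 2), (3, 'Zagreb', 3),
    (4, 'New York', 4), (5, 'Los Angeles', 4), (6, 'Warsaw', 5);
INSERT INTO customer VALUES (1, 'Customer A', 1), (2, 'Customer B', 1),
    (3, 'Customer C', 3), (4, 'Customer D', 4);
""")

# Countries for which the LEFT JOIN found no city (city columns are NULL).
countries_without_cities = [row[0] for row in conn.execute("""
    SELECT country.country_name_eng
    FROM country
    LEFT JOIN city ON city.country_id = country.id
    WHERE city.id IS NULL
    ORDER BY country.country_name_eng
""")]

# Cities that are referenced by at least one customer.
cities_in_use = [row[0] for row in conn.execute("""
    SELECT DISTINCT city.city_name
    FROM city
    INNER JOIN customer ON customer.city_id = city.id
    ORDER BY city.city_name
""")]

print(countries_without_cities)
print(cities_in_use)
```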
Let’s first write down the query using INNER JOIN:
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM country
INNER JOIN city ON city.country_id = country.id
INNER JOIN customer ON customer.city_id = city.id;
The query result is shown in the picture below:
We have 7 countries and 6 cities in our database, but our query returns only 4 rows. That is because we have only 4 customers in our database. Each of these 4 is related to its city, and the city is related to its country. So, INNER JOIN eliminated all countries and cities without customers. But how do we include these in the result too?
To do that, we’ll use LEFT JOIN. We’ll simply replace all “INNER” with “LEFT” so our query is as follows:
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM country
LEFT JOIN city ON city.country_id = country.id
LEFT JOIN customer ON customer.city_id = city.id;
The result is shown in the picture below:
You can easily notice that now we have all the countries, even those without any related city (Russia & Spain), as well as all cities, even those without customers (Warsaw, Belgrade & Los Angeles). The remaining 4 rows are the same as in the query using INNER JOIN.
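The difference between the two versions of the query can be reproduced with a short sketch. The rows below are assumptions reconstructed from the counts mentioned in the text (7 countries, 6 cities, 4 customers in 3 cities), with invented customer names, and Python’s sqlite3 module is used only as a convenient way to run the SQL.

```python
import sqlite3

# Sample rows are assumptions chosen to match the counts described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, country_name_eng TEXT);
CREATE TABLE city (id INTEGER PRIMARY KEY, city_name TEXT, country_id INTEGER);
CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_name TEXT, city_id INTEGER);
INSERT INTO country VALUES (1, 'Germany'), (2, 'Serbia'), (3, 'Croatia'),
    (4, 'United States of America'), (5, 'Poland'), (6, 'Spain'), (7, 'Russia');
INSERT INTO city VALUES (1, 'Berlin', 1), (2, 'Belgrade', 2), (3, 'Zagreb', 3),
    (4, 'New York', 4), (5, 'Los Angeles', 4), (6, 'Warsaw', 5);
INSERT INTO customer VALUES (1, 'Customer A', 1), (2, 'Customer B', 1),
    (3, 'Customer C', 3), (4, 'Customer D', 4);
""")

# The same query text, with only the join type swapped in.
QUERY = """
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM country
{join} JOIN city ON city.country_id = country.id
{join} JOIN customer ON customer.city_id = city.id
"""

inner_rows = conn.execute(QUERY.format(join="INNER")).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT")).fetchall()

# Rows that only the LEFT JOIN version returns (NULLs arrive as None).
only_in_left = [row for row in left_rows if row not in inner_rows]

print(len(inner_rows), len(left_rows))
for row in only_in_left:
    print(row)
```

With this data the INNER JOIN version returns 4 rows and the LEFT JOIN version returns 9: the same 4, plus 3 cities without customers and 2 countries without cities.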
LEFT JOIN – Tables order matters
While the order of tables isn’t important when using only INNER JOINs, the same doesn’t hold for LEFT JOIN. When we use LEFT JOIN to join multiple tables, it’s important to remember that this join will include all rows from the table on the LEFT side of the JOIN. Let’s rearrange the previous query:
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM customer
LEFT JOIN city ON customer.city_id = city.id
LEFT JOIN country ON city.country_id = country.id;
At first glance, you could easily say that this query and the previous one are the same (this would be true when using INNER JOIN). We’ve used the same tables, LEFT JOINs, and the same join conditions. Let’s take a look at the output first:
So, what happened here? Why do we have 4 rows (the same 4 we had when we used INNER JOIN)?
The answer is simple and it’s related to how LEFT JOIN works. It takes the first table (customer) and joins all its rows (4 of them) to the next table (city). The result of this is 4 rows because each customer belongs to only 1 city. Then we join these 4 rows to the next table (country), and again we have 4 rows because each city belongs to only 1 country.
The reason why we wouldn’t join these 3 tables in this way is given by the text of example #2. A query written in this manner, which returns 4 rows, would be the answer to the following: Return names of all customers as well as the cities and countries they are located in. Return even customers without related cities and countries.
- Note: When you’re using LEFT JOIN, the order of tables in that statement is important and the query will return a different result if you change this order. The order actually depends on what you want to return as a result.
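The effect of table order can be seen side by side in the sketch below. As before, this is an illustration under assumptions: the rows are reconstructed to match the counts in the text, the customer names are invented, and "Customer E" (a customer without a city) is a hypothetical extra row that does not exist in the article’s data. Python’s sqlite3 module is used only to run the SQL.

```python
import sqlite3

# Sample rows are assumptions chosen to match the counts described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, country_name_eng TEXT);
CREATE TABLE city (id INTEGER PRIMARY KEY, city_name TEXT, country_id INTEGER);
CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_name TEXT, city_id INTEGER);
INSERT INTO country VALUES (1, 'Germany'), (2, 'Serbia'), (3, 'Croatia'),
    (4, 'United States of America'), (5, 'Poland'), (6, 'Spain'), (7, 'Russia');
INSERT INTO city VALUES (1, 'Berlin', 1), (2, 'Belgrade', 2), (3, 'Zagreb', 3),
    (4, 'New York', 4), (5, 'Los Angeles', 4), (6, 'Warsaw', 5);
INSERT INTO customer VALUES (1, 'Customer A', 1), (2, 'Customer B', 1),
    (3, 'Customer C', 3), (4, 'Customer D', 4);
""")

COUNTRY_FIRST = """
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM country
LEFT JOIN city ON city.country_id = country.id
LEFT JOIN customer ON customer.city_id = city.id
"""
CUSTOMER_FIRST = """
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM customer
LEFT JOIN city ON customer.city_id = city.id
LEFT JOIN country ON city.country_id = country.id
"""

# Every country is kept vs. every customer is kept.
country_first_rows = conn.execute(COUNTRY_FIRST).fetchall()
customer_first_rows = conn.execute(CUSTOMER_FIRST).fetchall()
print(len(country_first_rows), len(customer_first_rows))

# Hypothetical extra row: a customer that has no city at all.
conn.execute("INSERT INTO customer VALUES (5, 'Customer E', NULL)")
orphan_rows = [row for row in conn.execute(CUSTOMER_FIRST) if row[1] is None]
print(orphan_rows)
```

Starting from country keeps all 7 countries (9 rows in total), while starting from customer keeps only the 4 customers. Once a customer without a city is added, the customer-first query still returns that customer, with NULLs for the city and the country.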
Join multiple tables using both – INNER JOIN & LEFT JOIN
This is also possible. Let’s again go with an example.
#3 Return the list of all countries and cities that form a pair (exclude countries which are not referenced by any city). For such pairs return all customers. Return even pairs not having a single customer.
The query that does the job is:
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM country
INNER JOIN city ON city.country_id = country.id
LEFT JOIN customer ON customer.city_id = city.id;
The result of the query is given in the picture below:
You can easily notice that we don’t have countries without any related city (these were Spain & Russia). The INNER JOIN eliminated these rows. Still, we do have cities without any customers (Belgrade, Los Angeles & Warsaw). This is the result of the fact we used LEFT JOIN between tables city and customer.
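The combined query can be checked against a small dataset as well. The rows below are assumptions that mirror the counts described in the text (customer names are invented), and Python’s sqlite3 module is used only to make the query runnable.

```python
import sqlite3

# Sample rows are assumptions chosen to match the counts described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, country_name_eng TEXT);
CREATE TABLE city (id INTEGER PRIMARY KEY, city_name TEXT, country_id INTEGER);
CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_name TEXT, city_id INTEGER);
INSERT INTO country VALUES (1, 'Germany'), (2, 'Serbia'), (3, 'Croatia'),
    (4, 'United States of America'), (5, 'Poland'), (6, 'Spain'), (7, 'Russia');
INSERT INTO city VALUES (1, 'Berlin', 1), (2, 'Belgrade', 2), (3, 'Zagreb', 3),
    (4, 'New York', 4), (5, 'Los Angeles', 4), (6, 'Warsaw', 5);
INSERT INTO customer VALUES (1, 'Customer A', 1), (2, 'Customer B', 1),
    (3, 'Customer C', 3), (4, 'Customer D', 4);
""")

# INNER JOIN drops countries without cities; LEFT JOIN keeps cities without customers.
mixed_rows = conn.execute("""
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM country
INNER JOIN city ON city.country_id = country.id
LEFT JOIN customer ON customer.city_id = city.id
""").fetchall()

countries_returned = {row[0] for row in mixed_rows}
cities_without_customers = sorted(row[1] for row in mixed_rows if row[2] is None)

print(len(mixed_rows))
print(cities_without_customers)
```

With this data the query returns 7 rows: Spain and Russia are gone, while Belgrade, Los Angeles and Warsaw appear with a NULL customer name.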
Conclusion
When you need to join multiple tables, you have INNER & LEFT JOIN at your disposal (RIGHT JOIN is rarely used and can be easily replaced by LEFT JOIN). Which join you’ll use depends directly on the task you need to solve, and you’ll get a feeling for it along the way. In upcoming articles, we’ll discuss how to think and organize yourself when you need to write more complex queries.