In this tutorial I’m going to guide you through a method of database design which is invaluable when dealing with many different tables or just when dealing with a lot of data that you’d like to be able to keep organised without giving yourself too much of a headache.
Up until now we’ve been working with examples that only use one table of data and have been running SELECT and INSERT operations on a relatively simple dataset. We’re now going to look at an example where it becomes useful to use more than one table at a time while still using the powerful features of MySQL to query this data in a simple, straightforward manner
Consider the following, slightly messy table outlining a series of orders placed at a fictitious company - “Dirigibles, Sprockets & Thingamajigs Inc.”
| Order id | Customer Name | Address | Product Ordered | Product Description | In Stock? |
| 1 | Jim Jeffries | 25 Garden Road | Sprocket | Some sort of widget | 1 |
| 2 | Jim Jeffries | 25 Garden Road | Thingamajig | Something-or-other | 1 |
| 3 | Bob Robertson | 116 Main Street | Dirigible | Flying machine | 0 |
| 4 | Bob Robertson | 116 Main Street | Sprocket | Some sort of widget | 1 |
| 5 | Alan Davies | 2 Bumblebee Avenue | Dirigible | Flying machine | 0 |
| 6 | Peter Piper | 179 Pickled Parkway | Thingamajig | Something-or-other | 1 |
This table represents all the recent orders that have come into the company through their website, and it lets the owners know who has ordered what, where it should be delivered and whether or not the product is in stock. It also stores the product description and the name of the product ordered.
The table contains all the information necessary to process the orders, but it has several major shortfalls, here are the most obvious
We are repeating several pieces of data in multiple records. Two people have ordered two items from the shop, and for each order we’re storing their name and their address. If we needed to update their address for any reason, then we would have to update two seperate records
Also, the product names and descriptions are duplicated. Keeping the product names in the order table like this doesn’t really make much sense either. What if we wanted to change the name of one of the products being sold? We’d then have to modify all entries in the database containing that name.
The “In Stock” value makes things even more complicated. If an item’s stock status changes (i.e. if it sells out) then we would need to update all values of “In stock” for every identical item
You can probably see for yourself that this system is flawed in several respects, and could quickly lead to errors entering the process.
Now, I’m going to show you the same information, but split into three distinct tables - Customers, Orders and Products
Customer Table
| customer_id | customer_name | customer_address | number_of_orders |
| 1 | Jim Jeffries | 25 Garden Road | 2 |
| 2 | Bob Robertson | 116 Main Street | 2 |
| 3 | Alan Davies | 2 Bumblebee Avenue | 1 |
| 4 | Peter Piper | 179 Pickled Parkway | 1 |
Orders Table
| order_id | customer_id | product_id |
| 1 | 2 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 1 |
| 4 | 4 | 3 |
| 5 | 1 | 2 |
| 6 | 1 | 3 |
Products Table
| product_id | product_name | product_description | in_stock |
| 1 | Dirigible | Flying machine | 0 |
| 2 | Sprocket | Some sort of widget | 1 |
| 3 | Thingamajig | Something-or-other | 1 |
At first glance you might be thinking that this makes things even more complicated. After all, we now have three tables to look after, and several extra columns. But, if we look closer at the data, we notice a couple of very interesting points
None of the data is duplicated. We have defined all of the customers who have ordered from the shop in the “customers” table, but instead of referring to the customer by name in the “orders” table, we simply refer to their numerical id. This means that if we needed to update the address of any of the customers, we need only do that for one record, rather than many
Each table now represents a distinct area of operation for the shop. Customers have their own unique table, as do the orders and the shop’s products. Splitting large MySQL tables into seperate tables with their own distinct purpose lets you see more clearly the function of the data. It also allows you to expand upon each of the tables - for example, we could now add in a “password” field for the “customers” table, or a “ordered on this date” field for the “orders” table
Now we’ll move on to see how this example impacts upon our learning process, and how we can query this data in special ways, to make use of the relationships between the tables
First of all, we should construct the preceding tables within MySQL. Do this now, by executing the following MySQL query. This will delete any previous examples of the “customers”, “orders” and “products” tables
DROP TABLE IF EXISTS `customers`; CREATE TABLE IF NOT EXISTS `customers` ( `customer_id` int(11) NOT NULL AUTO_INCREMENT, `customer_name` varchar(64) NOT NULL, `customer_address` text NOT NULL, `number_of_orders` int(11) NOT NULL, PRIMARY KEY (`customer_id`) ) TYPE=InnoDB AUTO_INCREMENT=5 ; INSERT INTO `customers` (`customer_id`, `customer_name`, `customer_address`, `number_of_orders`) VALUES (1, 'Jim Jeffries', '25 Garden Road', 2), (2, 'Bob Robertson', '116 Main Street', 2), (3, 'Alan Davies', '2 Bumblebee Avenue', 1), (4, 'Peter Piper', '179 Pickled Parkway', 1); DROP TABLE IF EXISTS `orders`; CREATE TABLE IF NOT EXISTS `orders` ( `order_id` int(11) NOT NULL AUTO_INCREMENT, `customer_id` int(11) NOT NULL, `product_id` int(11) NOT NULL, PRIMARY KEY (`order_id`), KEY `customer_id` (`customer_id`,`product_id`) ) TYPE=InnoDB AUTO_INCREMENT=7 ; INSERT INTO `orders` (`order_id`, `customer_id`, `product_id`) VALUES (5, 1, 2), (6, 1, 3), (1, 2, 1), (2, 2, 2), (3, 3, 1), (4, 4, 3); DROP TABLE IF EXISTS `products`; CREATE TABLE IF NOT EXISTS `products` ( `product_id` int(11) NOT NULL AUTO_INCREMENT, `product_name` varchar(64) NOT NULL, `product_description` varchar(255) NOT NULL, `in_stock` tinyint(1) NOT NULL, PRIMARY KEY (`product_id`) ) TYPE=InnoDB AUTO_INCREMENT=4 ; INSERT INTO `products` (`product_id`, `product_name`, `product_description`, `in_stock`) VALUES (1, 'Dirigible', 'Flying machine', 0), (2, 'Sprocket', 'Some sort of widget', 1), (3, 'Thingamajig', 'Something-or-other', 1);
We can see that the “orders” table is linked to the other tables via a unique one-to-one relationship. Every record in the orders table has one customer associated with it, and also one product associated with it, via the unique IDs that we’ve assigned to each. Althouth this relationship isn’t explicitly defined anywhere, this practice of using unique IDs to label records allows us to create relationships when definining queries using a powerful feature of MySQL known as the JOIN
A JOIN in MySQL is a way of querying more than one table at a time in a way that allows us to see results as if they were stored within the same table. This powerful feature means we can now have as many seperate tables as we wish, so that we do not need to retain duplicate records of customer records, for example, while still having the ability to perform simple queries to fetch our data
Customers Table (with highlighted Primary Key)
| customer_id | customer_name | customer_address | number_of_orders |
| 2 | Bob Robertson | 116 Main Street | 2 |
Orders Table (with highlighted Foreign Keys)
| order_id | customer_id | product_id |
| 1 | 2 | 1 |
| 2 | 2 | 2 |
As mentioned earlier, we’re going to use the unique IDs inserted into the tables to define our relationships and we will refer to these IDs with the terminology Primary Key and Foreign Key. In the above tables, the Primary Key for the customer table is displayed in red and the Foreign Keys referring to the customer table from the “orders” table is in green
Each table in our example has a primary key, namely the first column in the table, such as “customer_id”, “order_id” and “product_id”. The primary key usually is defined as the AUTOINCREMENT column, and also has a primary index defined on it, to make searching faster. Examples of foreign keys can be found in the “orders” table. The “orders” table refers to rows in both the “customers” and “products” table, and we’re going to see now how to make use of these values to retrieve useful information
The simplest kind of JOIN in MySQL is one in which we don’t necessarily need to use the JOIN keyword. In this example, we’re going to ask the question “Show me all records of orders along with the names of each product ordered”. The following MySQL query will provide us with the results
SELECT order_id, product_name, in_stock FROM orders, products WHERE orders.product_id = products.product_id
Let’s take a quick look at what’s going on here
SELECT order_id, product_name, in_stock FROM orders, products - This is fairly straightforward, and only slightly different from the SELECT examples I showed you in the earlier tutorial, except now we’re selecting from two different tables. This is specified by writing the table names as a comma seperated list
WHERE orders.product_id = products.product_id - Here we are simply defining a condition which states that we wish to retrieve all values from both the “products” table and the “orders” table but on each row, we wish to have values from each table which match based on the condition that the value of product_id in the “orders” table is the same as the value of product_id in the “products” table
Running this query will give us the following results
order_id product_name in_stock 1 Dirigible 0 2 Sprocket 1 3 Dirigible 0 4 Thingamajig 1 5 Sprocket 1 6 Thingamajig 1
Note that I’m not using any PHP code here to show you these examples, I’ll just be using straight MySQL code which you can run directly through your MySQL management software, such as PHPMyAdmin
You can see from the query above how we’ve reconstructed part of the original table from our three seperate tables. This shows the power of JOIN and how it can be used to create reports from many different sources of data.
I mentioned before that this example of a JOIN statement doesn’t explicitly require the use of a JOIN keyword. The following example is exactly the same as the one I’ve just given you, but uses the JOIN keyword to produce the required result
SELECT order_id, product_name, in_stock FROM orders JOIN products ON (orders.product_id = products.product_id)
Let’s quickly see what’s going on here
FROM orders JOIN products ON (orders.product_id = products.product_id) - This is the part of our statement that is different. Here, we make use of the JOIN syntax to state that we wish to search from the “orders” table JOINed with the “products” table based ON the condition that product_id in the “orders” table matches product_id from the “products” table
Again, this will produce the same output as above. When we’re only working with simple JOINs like this, we don’t need to include the JOIN keyword, but it’s a good way to learn and grasp the concept of JOINs, and on the next page I’ll introduce you to some more complex types of JOIN where we must explicity mention the keyword. Let’s now ask the question “Show me all orders along with the product name, and the customer name & address”. Here, we need to make use of all three tables to get our query. The following SQL code will provide what we need
SELECT order_id, customer_name, customer_address, product_name, in_stock FROM orders JOIN products ON (orders.product_id = products.product_id) JOIN customers ON (orders.customer_id = customers.customer_id) ORDER BY order_id
order_id customer_name customer_address product_name in_stock 1 Bob Robertson 116 Main Street Dirigible 0 2 Bob Robertson 116 Main Street Sprocket 1 3 Alan Davies 2 Bumblebee Avenue Dirigible 0 4 Peter Piper 179 Pickled Parkway Thingamajig 1 5 Jim Jeffries 25 Garden Road Sprocket 1 6 Jim Jeffries 25 Garden Road Thingamajig 1
Here you can see how we’ve now pretty much recreated all of our original table containing duplicate records for columns such as “customer_name” and “product_name”, but all of this information is coming from three seperate tables, none of which actually contain this duplicate information!





March 20th, 2009 at 5:54 pm
EXCELLENT INFORMATION