Previously, you have learned about the concept of sql constraints. The Primary key works as an index to find the data we want. The Foreign key works as an index to reference data in the index table based on the primary key of the referenced table.
When we want to create a table that has a foreign key from another table, we must create the referenced table first. This is because we cannot create a table that has a foreign key from a table that does not yet exist.
Primary key
First, we need to create a primary key as a unique identifier for the table. Suppose there is a department table that has the following data:
id
name
1
IT
2
HRD
3
Finance
4
Marketing
Then there is an employee table that has the following data:
id
name
age
address
salary
department_id
1
Rizki
25
Jl. Kebon Jeruk
2000000
2
2
Andi
27
Jl. Kebon Sirih
3000000
3
3
Budi
30
Jl. Kebon Melati
4000000
1
4
Caca
32
Jl. Kebon Anggrek
5000000
1
5
Deni
35
Jl. Kebon Mawar
6000000
1
We can conclude that the employee table has a foreign key from the department table. This is because the department_id column in the employee table references the id column in the department table.
We can create a primary key on the id column because the data is unique by adding PRIMARY KEY as constraints on the id column when creating the department table:
CREATE TABLE department (id INT PRIMARY KEY, -- the `id` column will become the primary key name VARCHAR(255));
When the command is executed, the department table will be created with the id column as a primary key of type INT. Automatically, this data must be unique.
Foreign key
Then we can create a foreign key on the employee table by adding FOREIGN KEY as constraints on the department_id column when creating the employee table:
CREATE TABLE employee (id INT PRIMARY KEY, name VARCHAR(255), age INT, address VARCHAR(255), salary INT, department_id INT FOREIGN KEY REFERENCES department(id)-- the `department_id` column will become the foreign key
With python, let’s create the department and employee tables, and insert the data into these tables.
import sqlite3# Connect to SQLite databaseconn = sqlite3.connect('db/sqlite.db')# Create a cursor objectc = conn.cursor()# Create table 'department'c.execute('''CREATE TABLE department ( id INT PRIMARY KEY, name TEXT);''')# Insert data into the 'department' tabledata = [ (1, 'IT'), (2, 'HRD'), (3, 'Finance'), (4, 'Marketing')]c.executemany('INSERT INTO department VALUES (?,?)', data)# Create table 'employee'c.execute('''CREATE TABLE employee ( id INT PRIMARY KEY, name TEXT, age INT, address TEXT, salary REAL, department_id INT, FOREIGN KEY(department_id) REFERENCES department(id));''')# Insert data into 'employee' tabledata = [ (1, 'Rizki', 25, 'Jl. Kebon Jeruk', 2000000, 2), (2, 'Andi', 27, 'Jl. Kebon Sirih', 3000000, 3), (3, 'Budi', 30, 'Jl. Kebon Melati', 4000000, 1), (4, 'Caca', 32, 'Jl. Kebon Anggrek', 5000000, 1), (5, 'Deni', 35, 'Jl. Kebon Mawar', 6000000, 1)]c.executemany('INSERT INTO employee VALUES (?,?,?,?,?,?)', data)# Commit the transactionconn.commit()
Now, let’s perform a JOIN operation to combine these tables:
import pandas as pdpd.read_sql_query('''SELECT employee.id AS employee_id, employee.name AS employee_name, department.name AS department_name, employee.age, employee.address, employee.salaryFROM employeeINNER JOIN department ON employee.department_id = department.id''', conn)
employee_id
employee_name
department_name
age
address
salary
0
1
Rizki
HRD
25
Jl. Kebon Jeruk
2000000.0
1
2
Andi
Finance
27
Jl. Kebon Sirih
3000000.0
2
3
Budi
IT
30
Jl. Kebon Melati
4000000.0
3
4
Caca
IT
32
Jl. Kebon Anggrek
5000000.0
4
5
Deni
IT
35
Jl. Kebon Mawar
6000000.0
This will display employee details along with their respective department names.
The way to create a foreign key above is the same as creating a primary key. The only difference is we need to add REFERENCES to determine which table will be used as an index. Thus, the employee table has a column that references the data in the department table based on the id column.
SQL Joins
Previously, we have learned how to retrieve data from a table. But, what if we want to retrieve data from two tables? Suppose we want to retrieve data from table A and table B, and then display both data in one table. To do this, we can use the concept of join.
Please note, we can only join tables that have a foreign key to the referenced table
Suppose there are 2 tables, customers and orders. The customers table has the following data:
id
name
address
1
Rizki
Jl. Kebon Jeruk
2
Andi
Jl. Kebon Sirih
3
Budi
Jl. Kebon Melati
4
Caca
Jl. Kebon Anggrek
The orders table has the following data:
id
customer_id
order_date
1001
1
2020-01-01
1002
2
2020-01-02
1003
3
2020-01-03
1004
4
2020-01-04
Notice that the customer_id column in the orders table refers to id in the customers table. The relationship between the two tables above is the customer_id column.
We can join the two tables with a table display like the following:
order_id
customer_name
customer_address
order_date
1001
Rizki
Jl. Kebon Jeruk
2020-01-01
1002
Andi
Jl. Kebon Sirih
2020-01-02
1003
Budi
Jl. Kebon Melati
2020-01-03
1004
Caca
Jl. Kebon Anggrek
2020-01-04
The first column is the id data in the orders table, the second and third columns are the name and address data in the customers table. Finally, the order_date data in the orders table. The data was successfully combined because there is a reference in the orders table.
To perform a join, we can use the SELECT command by adding JOIN and ON:
SELECT * FROM <table1>-- query commandINNER JOIN <table2>ON <table1>.<key>=<table2>.<key>
Where <table1> and <table2> are the names of the tables to be merged using the INNER JOIN command, <key> is the column being referenced (primary key or foreign key) using the ON command and compared with =.
For example, we can join the customers and orders tables using inner join with the following query:
SELECT * FROM ordersINNER JOIN customersON orders.customer_id = customers.id;
The result of the query is as follows:
id
customer_id
order_date
id
name
address
1001
1
2020-01-01
1
Rizki
Jl. Kebon Jeruk
1002
2
2020-01-02
2
Andi
Jl. Kebon Sirih
1003
3
2020-01-03
3
Budi
Jl. Kebon Melati
1004
4
2020-01-04
4
Caca
Jl. Kebon Anggrek
When joining tables, the columns from the joined table will be on the right. All columns from the orders table will be displayed from the left (id, customer_id, order_date) and then display the columns from the customers table (id, name, address). We can tidy up the merged columns by selecting to SELECT only a few columns using the syntax table_name.column_name. For example:
SELECT orders.id AS order_id, customers.name AS customer_name, customers.address AS customer_address, orders.order_dateFROM ordersINNER JOIN customersON orders.customer_id = customers.id;
In the SELECT command, we take 2 columns in the orders table, namely id and order_date, and 2 columns in the customers table, namely name and address. We rename the id column in the orders table to order_id using AS, as well as for name and address in the customers table.
The result of the query will be according to what we want:
order_id
customer_name
customer_address
order_date
1001
Rizki
Jl. Kebon Jeruk
2020-01-01
1002
Andi
Jl. Kebon Sirih
2020-01-02
1003
Budi
Jl. Kebon Melati
2020-01-03
1004
Caca
Jl. Kebon Anggrek
2020-01-04
Finally, remember to close the connection to the database after all operations.
# Close the connectionconn.close()
Database Transaction
What is a transaction?
A Transaction is a set of SQL Commands designed to execute a set of interrelated actions. We can control each action that occurs by determining when the transaction begins and ends. A transaction can be canceled if an error occurs in one of the actions taken.
Why use a transaction?
Let’s take an example with a sales case study. Each sales process will carry out the process of selecting goods, payment, increasing the total goods sold, and recording the transaction with the selling price. And each of these processes is stored in a database with different tables. The data in the database is as follows:
Products table with example data:
id
name
price
stock
1
Product A
10000
10
2
Product B
20000
20
Sales table with example data:
id
product_id
quantity
total_price
1
1
2
20000
2
2
1
20000
Transactions table with example data:
id
date
product_id
amount
1
2020-01-01
1
15000
Suppose there is a customer using our application, by purchasing one Product A for Rp. 15,000. And the customer made the purchase on 2020-01-01.
When a sale occurs, we will reduce the total stock of goods in the products table column stock, then add the total sales in the sales table, and record the sales transaction in the transactions table. So, the command process is as follows:
-- step 1: get the product data to be searchedSELECT*FROM products WHEREid=1;-- step 2: reduce product stockUPDATE products SET stock = stock -1WHEREid=1;-- step 3: increase total salesINSERTINTO sales (product_id, quantity, total_price) VALUES (1, 1, 10000);-- step 4: record the transactionINSERTINTO transactions (date, product_id, amount) VALUES ('2020-01-01', 1, 15000);-- done
There are 4 SQL commands run to perform the sales transaction process: get the purchased product data in the products table, then perform an update by reducing stock by 1 in the products table, add total sales in the sales table, and record the sales transaction in the transactions table. These commands will always be run if there is a sales process in our application.
However, what if when getting the product it turns out the stock is empty, or when recording the transaction an error occurs. This will result in data inconsistency between each table. The stock may have been reduced by 1, and in the sales table there is already an addition of 1 sale, but in the transactions table no transaction has been recorded. This will confuse us to check these data.
To overcome this data inconsistency, we must return or cancel all SQL commands that have previously been performed. Therefore, we need to use a transaction.
With a transaction, we can ensure that all commands within a transaction will always be run simultaneously. If there is an error, then all commands within a transaction will be canceled.
So there are only 2 conditions, that is all commands in the transaction are successfully executed.
Or cancel all commands if there is an error.
SQL Transaction
To use a transaction in SQL, we need to understand the beginning of the process and the end of the process. The beginning of the process is when we start a transaction, and the end of the process is when we end a transaction.
We use the BEGIN TRANSACTION command to start a transaction, and the COMMIT command is used to complete a transaction.
BEGINTRANSACTION;-- SQL commandCOMMIT;
Based on the example above, we just need to insert 4 SQL commands into the transaction.
BEGINTRANSACTION;-- step 1: get the product data to be searchedSELECT*FROM products WHEREid=1;-- step 2: reduce product stockUPDATE products SET stock = stock -1WHEREid=1;-- step 3: increase total salesINSERTINTO sales (product_id, quantity, total_price) VALUES (1, 1, 10000);-- step 4: record the transactionINSERTINTO transactions (date, product_id, amount) VALUES ('2020-01-01', 1, 15000);COMMIT;
We can issue a ROLLBACK command to cancel a transaction if there is an error or unmet condition.
BEGINTRANSACTION;-- SQL commandROLLBACK;
You can apply a transaction in an online SQL compiler or SQLite/DBMS tool that you use.