Experiments on Different Isolation Levels - MySQL
Transaction isolation levels and their guarantees.
There are four levels of transaction isolation: read uncommitted, read committed, repeatable read, and serializable. I have experimented with MySQL's implementation of those transaction isolation levels and summarized the consistency properties each isolation level guarantees. For brevity, I will explore the implementation details of each isolation level in another article.
Read Uncommitted
Let me introduce Alice and Bob, they are my old friends. In the read uncommitted isolation level, Bob can see the updated value of x
.
We are going to check the following things:
- After Alice's write, can Bob observe
x=11
(i.e. READ UNCOMMITTED)? - Does Bob's update operation makes
x=12
(i.e. DIRTY WRITE)?
Preparation
CREATE TABLE ledger1 (x INT);
INSERT INTO ledger1 (x) VALUES (10);
Execution
After Alice's write, can Bob observe x=11
(i.e. READ UNCOMMITTED)?
Yes, he could observe the uncommitted update, x=11
.
Does Bob's update operation makes x=12
(i.e. DIRTY WRITE)?
No, Bob could not obtain a lock for write and timed out.
Isolation Level | Dirty Read | Dirty Write |
---|---|---|
Read Uncommitted | YES | NO |
It was surprising to see that MySQL's Read Uncommitted isolation level prevented dirty writes from happening. However, you do not want to use an isolation level that allows dirty reads, right? Let's proceed to the next isolation level, read committed.
Read Committed
First of all, let's check that no dirty reads happen using the read committed isolation level.
Yes, we could confirm that the read committed isolation level prevented dirty read.
Let us think of the following scenario: Alice asked Bob to move her money from her US bank account to her Japanese bank account (suppose 1 USD= 100 JPY). Bob agreed to do that without telling her when he would conduct the transaction. One day, Alice checked the balance of her bank accounts to see if Bob had already exchanged and transferred her money. Coincidentally, Bob was just doing the operation for her. Seeing zero money in both of her accounts, Alice got confused.
Preparation
Execution
This unlucky scenario happened as the inevitable consequence of using the read committed isolation level. We can say that the read operations by Alice were skewed. This kind of inconsistency is called a read skew.
Isolation Level | Dirty Read | Dirty Write | Read Skew |
---|---|---|---|
Read Committed | NO | NO | YES |
If your use case of the database can tolerate read skews, then you can stop at the read committed isolation level. If you think this is unacceptable and want Alice to read consistent data, you should proceed to the repeatable read isolation level.
Repeatable Read
The repeatable read isolation level is also known as snapshot isolation. Let's check that no read skews occur with the repeatable read isolation level.
Alice could see 10 USD in her US bank account. Combining the fact that she didn't see 1000 JPY in her Japanese bank account, we can say that she was seeing the values as of when she started the transaction, notwithstanding the changes made by another transaction.
Although repeatable read prevents most of the read-related anomalies, it suffers from write-related anomalies. Let's see what they are.
Lost Update Problem
Alice and Bob both conducted a read-modify-update cycle to increment x
in the database. However, the effect of the first write operation was clobbered by the last write operation.
This type of write-related anomaly is called the lost update problem.
Write Skew
Write skew is a write-related anomaly that is caused by multiple write transactions. If transactions write to the same object and cause a consistency problem with it, we specifically call those problems lost update problems.
The following example shows a write skew problem that involves multiple objects. Only one person should be able to book the hotel room. However, Alice and Bob succeeded in making their bookings in the same room. If only Bob had started the transaction after Alice had finished hers, he could not have booked the room... Such records like the one Alice has made in this example are called phantoms.
Preparation
CREATE TABLE hotel_room_calendar (
room_id INT,
date DATE,
booked_by VARCHAR(10)
);
As we have seen, the repeatable read isolation level is not enough to prevent write skews. Some of the problems can be solved by using FOR UPDATE
, but note that it is not a perfect solution. For example, the hotel room booking problem cannot be mitigated with FOR UPDATE
because it is not effective against phantoms.
Isolation Level | Dirty Read | Dirty Write | Read Skew | Lost Update Problem | Write Skew |
---|---|---|---|---|---|
Repeatable Read | NO | NO | NO | YES | YES |
Serializable
Serializable is the strongest isolation level. Let's see if it solves lost update problems and write skews.
Lost Update Problem
The following diagram shows that the database detected a deadlock when it received a write request from Bob, aborted his transaction, and successfully processed Alice's transaction.
Write Skew
The same thing happens when a multi-record write skew is about to be introduced.
In both of the cases, the database maintained the serializability by aborting Bob's transaction and asked him to retry after Alice's transaction finished. When Bob made his second attempt, he read the values updated by Alice.
Isolation Level | Dirty Read | Dirty Write | Read Skew | Lost Update Problem | Write Skew |
---|---|---|---|---|---|
Serializable | NO | NO | NO | NO | NO |
Conclusion
Through experiments, we have obtained the following table showing what kind of anomaly may occur in each isolation level.
Isolation Level | Dirty Read | Dirty Write | Read Skew | Lost Update Problem | Write Skew |
---|---|---|---|---|---|
Read Uncommitted | YES | NO | YES | YES | YES |
Read Committed | NO | NO | YES | YES | YES |
Repeatable Read | NO | NO | NO | YES | YES |
Serializable | NO | NO | NO | NO | NO |
Serializable is achieved by detecting a deadlock and aborting one of the involved transactions.
In a future article, I will explore how these isolation levels are implemented by MySQL.
Appendix
Install MySQL Server
I created an e2-small VM using Google Compute Engine and named it mysql. Then, I sshed into the VM and issued the following commands:
Create a User and a Database
I created my user account for the MySQL server and a database for testing. Then I granted all privileges regarding the newly created database to myself.
mysql -u root -p
SELECT User FROM mysql.user;
# There are four users: mysql.infoschema, mysql.session, mysql.sys, and root.
CREATE USER yuji;
# New user 'yuji' has been created.
SHOW DATABASES;
# There is only one database: information_schema.
CREATE DATABASE test;
# A database in MySQL is actually a directory of files.
GRANT ALL ON test.* to 'yuji'@'%';
# An account name consisting only of a user name is equivalent to 'user_name'@'%'. For example, 'me' is equivalent to 'me'@'%'.
EXIT;
Mitigating Lost Update Problems with FOR UPDATE
FOR UPDATE
takes an exclusive lock as writes do.
Notice that Bob needed to wait for 8.19 seconds until Alice committed, i.e. the lock she obtained when reading x
was released. As we have seen from this example, lost update problems can be prevented even with the read uncommitted isolation level with the help of the FOR UPDATE
keyword.