Experiments on Different Isolation Levels - MySQL

Transaction isolation levels and their guarantees.

Experiments on Different Isolation Levels - MySQL

There are four levels of transaction isolation: read uncommitted, read committed, repeatable read, and serializable. I have experimented with MySQL's implementation of those transaction isolation levels and summarized the consistency properties each isolation level guarantees. For brevity, I will explore the implementation details of each isolation level in another article.

Read Uncommitted

Let me introduce Alice and Bob, they are my old friends. In the read uncommitted isolation level, Bob can see the updated value of x.

I have been playing with Alice and Bob since 2016.

We are going to check the following things:

  • After Alice's write, can Bob observe x=11 (i.e. READ UNCOMMITTED)?
  • Does Bob's update operation makes x=12 (i.e. DIRTY WRITE)?

Preparation

CREATE TABLE ledger1 (x INT);
INSERT INTO ledger1 (x) VALUES (10);

Execution

Alice's console is shown left. Bob's is shown right.
After Alice's write, can Bob observe x=11 (i.e. READ UNCOMMITTED)?

Yes, he could observe the uncommitted update, x=11.

Does Bob's update operation makes x=12 (i.e. DIRTY WRITE)?

No, Bob could not obtain a lock for write and timed out.

Isolation Level Dirty Read Dirty Write
Read Uncommitted YES NO

It was surprising to see that MySQL's Read Uncommitted isolation level prevented dirty writes from happening. However, you do not want to use an isolation level that allows dirty reads, right? Let's proceed to the next isolation level, read committed.

Read Committed

First of all, let's check that no dirty reads happen using the read committed isolation level.

Updated values are only visible to operations in the same transaction that made the changes.
Left is Alice and right is Bob. Bob's SELECT was conducted after Alice's UPDATE operation.

Yes, we could confirm that the read committed isolation level prevented dirty read.

Let us think of the following scenario: Alice asked Bob to move her money from her US bank account to her Japanese bank account (suppose 1 USD= 100 JPY). Bob agreed to do that without telling her when he would conduct the transaction. One day, Alice checked the balance of her bank accounts to see if Bob had already exchanged and transferred her money. Coincidentally, Bob was just doing the operation for her. Seeing zero money in both of her accounts, Alice got confused.

Bob is an earnest guy, he does what he has promised to his friends.

Preparation

CREATE TABLE jpbank (id VARCHAR(10), balance INT);
CREATE TABLE usbank (id VARCHAR(10), balance INT);
INSERT INTO jpbank (id, balance) VALUES ('alice', 0);
INSERT INTO usbank (id, balance) VALUES ('alice', 10);
Alice has 0 JPY in the Japanese Bank account and 10 USD in the US Bank account.

Execution

Left is Alice, and right is Bob.

This unlucky scenario happened as the inevitable consequence of using the read committed isolation level. We can say that the read operations by Alice were skewed. This kind of inconsistency is called a read skew.

Isolation Level Dirty Read Dirty Write Read Skew
Read Committed NO NO YES

If your use case of the database can tolerate read skews, then you can stop at the read committed isolation level. If you think this is unacceptable and want Alice to read consistent data, you should proceed to the repeatable read isolation level.

Repeatable Read

The repeatable read isolation level is also known as snapshot isolation. Let's check that no read skews occur with the repeatable read isolation level.

Two transactions are isolated to the extent that the other's modifications are invisible even after the other transaction is committed.
Left is Alice, and right is Bob.

Alice could see 10 USD in her US bank account. Combining the fact that she didn't see 1000 JPY in her Japanese bank account, we can say that she was seeing the values as of when she started the transaction, notwithstanding the changes made by another transaction.

Although repeatable read prevents most of the read-related anomalies, it suffers from write-related anomalies. Let's see what they are.

Lost Update Problem

Alice and Bob both conducted a read-modify-update cycle to increment x in the database. However, the effect of the first write operation was clobbered by the last write operation.

11 + 1 + 1 should be 13, right?
Left is Alice, and right is Bob.

This type of write-related anomaly is called the lost update problem.

Write Skew

Write skew is a write-related anomaly that is caused by multiple write transactions. If transactions write to the same object and cause a consistency problem with it, we specifically call those problems lost update problems.

The following example shows a write skew problem that involves multiple objects. Only one person should be able to book the hotel room. However, Alice and Bob succeeded in making their bookings in the same room. If only Bob had started the transaction after Alice had finished hers, he could not have booked the room... Such records like the one Alice has made in this example are called phantoms.

It is the hotel's duty to ensure that no two bookings are made on the same day.

Preparation

CREATE TABLE hotel_room_calendar (
  room_id INT,
  date DATE,
  booked_by VARCHAR(10)
);
There are two bookings on the same day for room 5.

As we have seen, the repeatable read isolation level is not enough to prevent write skews. Some of the problems can be solved by using FOR UPDATE, but note that it is not a perfect solution. For example, the hotel room booking problem cannot be mitigated with FOR UPDATE because it is not effective against phantoms.

Isolation Level Dirty Read Dirty Write Read Skew Lost Update Problem Write Skew
Repeatable Read NO NO NO YES YES

Serializable

Serializable is the strongest isolation level. Let's see if it solves lost update problems and write skews.

Lost Update Problem

The following diagram shows that the database detected a deadlock when it received a write request from Bob, aborted his transaction, and successfully processed Alice's transaction.

Alice's transaction and Bob's transaction are serialized, right?
Left is Alice, right is Bob.

Write Skew

The same thing happens when a multi-record write skew is about to be introduced.

Alice's transaction and Bob's transaction are serialized, right?
Left is Alice, and right is Bob.

In both of the cases, the database maintained the serializability by aborting Bob's transaction and asked him to retry after Alice's transaction finished. When Bob made his second attempt, he read the values updated by Alice.

Isolation Level Dirty Read Dirty Write Read Skew Lost Update Problem Write Skew
Serializable NO NO NO NO NO

Conclusion

Through experiments, we have obtained the following table showing what kind of anomaly may occur in each isolation level.

Isolation Level Dirty Read Dirty Write Read Skew Lost Update Problem Write Skew
Read Uncommitted YES NO YES YES YES
Read Committed NO NO YES YES YES
Repeatable Read NO NO NO YES YES
Serializable NO NO NO NO NO

Serializable is achieved by detecting a deadlock and aborting one of the involved transactions.

In a future article, I will explore how these isolation levels are implemented by MySQL.

Appendix

Install MySQL Server

I created an e2-small VM using Google Compute Engine and named it mysql. Then, I sshed into the VM and issued the following commands:

sudo apt update
sudo apt install gnupg wget
sudo wget https://dev.mysql.com/get/mysql-apt-config_0.8.18-1_all.deb
sudo dpkg -i mysql-apt-config_0.8.18-1_all.deb
# A prompt asks you about the configuration. Click Ok with the following values.
# MySQL Server & Cluster: mysql-8.0
# MySQL Tools & Connectort: Enabled
# MySQL Preview Packages: Disabled
sudo apt update
sudo apt-get install mysql-server
# A prompt asks you to set the root password and whether to use the new authentication method.
# Set your password and opt for the strong one.
systemctl status mysql
# Confirm that the MySQL server is up and running!
This is how I installed a MySQL server on my VM.

Create a User and a Database

I created my user account for the MySQL server and a database for testing. Then I granted all privileges regarding the newly created database to myself.

mysql -u root -p
SELECT User FROM mysql.user;
# There are four users: mysql.infoschema, mysql.session, mysql.sys, and root.
CREATE USER yuji;
# New user 'yuji' has been created.
SHOW DATABASES;
# There is only one database: information_schema.
CREATE DATABASE test;
# A database in MySQL is actually a directory of files.
GRANT ALL ON test.* to 'yuji'@'%';
# An account name consisting only of a user name is equivalent to 'user_name'@'%'. For example, 'me' is equivalent to 'me'@'%'.
EXIT;

Mitigating Lost Update Problems with FOR UPDATE

FOR UPDATE takes an exclusive lock as writes do.

Bob cannot even read the value of x until Alice releases the lock.
Left is Alice, and right is Bob.

Notice that Bob needed to wait for 8.19 seconds until Alice committed, i.e. the lock she obtained when reading x was released. As we have seen from this example, lost update problems can be prevented even with the read uncommitted isolation level with the help of the FOR UPDATE keyword.