Dodging phantom reads

Dodge me if you can

October… the month of Halloween. Spooky events. SQL Server has all kinds of Halloween-ish themed activities like Split Brains, Zombies, Ghosts, Phantoms, and of course Halloween itself. In this post, we’re going to mill about with phantom reads a bit.

Phantoms in SQL Server are actually called “Phantom Reads”. This ectoplasmic phenomenon manifests itself when an identical query being run multiple times, in a single connection, returns a different result set for each time it is run. A Phantom Read is one of the transaction isolation level concurrency events. The read uncommitted, read committed and repeatable read transaction isolation levels may exhibit Phantom Reads; the serializable and snapshot transaction isolation levels are not susceptible to this phenomenon.

Let’s set up a demo to see this in action. First, let’s create our sample data:

USE master; 
GO 
 
IF DB_ID('IsolationLevelTest') IS NOT NULL 
BEGIN 
    ALTER DATABASE IsolationLevelTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE; 
    DROP DATABASE IsolationLevelTest; 
END; 
CREATE DATABASE IsolationLevelTest; 
GO 
 
USE IsolationLevelTest; 
GO 
CREATE PROCEDURE dbo.db_reset AS 
IF OBJECT_ID('dbo.IsolationTests','U') IS NOT NULL 
    DROP TABLE dbo.IsolationTests; 
CREATE TABLE dbo.IsolationTests (
    Id   INTEGER IDENTITY,
    ColA CHAR(1)
); 
INSERT INTO dbo.IsolationTests(ColA) 
SELECT 'A' UNION ALL 
SELECT 'A' UNION ALL 
SELECT 'A' UNION ALL 
SELECT 'A' UNION ALL 
SELECT 'A' UNION ALL 
SELECT 'A' UNION ALL 
SELECT 'A'; 
 
IF EXISTS (SELECT 1 FROM sys.databases WHERE name = N'IsolationLevelTest' AND snapshot_isolation_state = 1)
    ALTER DATABASE IsolationLevelTest SET ALLOW_SNAPSHOT_ISOLATION OFF;
GO 
EXECUTE dbo.db_reset; 
GO

Now, let’s test phantom reads in the repeatable read transaction isolation level. Open the following 2 queries in separate query windows. Run Code Script 1, and within 10 seconds run Code Script 2.

/* Execute this script from the first query window */
 
USE IsolationLevelTest;
GO
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT * FROM dbo.IsolationTests;
WAITFOR DELAY '00:00:10';
SELECT * FROM dbo.IsolationTests;
ROLLBACK;
SELECT * FROM dbo.IsolationTests;
-- Run this in query window 2 while Code Script 1 is running
USE IsolationLevelTest;
GO
UPDATE dbo.IsolationTests SET ColA = 'W';
SELECT * FROM dbo.IsolationTests;

The repeatable read transaction isolation level guarantees that there will not be any updates or deletes, so the second query is blocked until the transaction in Code Script 1 has finished. You can see that the second result set is identical to the first one. Once the transaction completes, the update in Code Script 2 can run, and the third result set from Code Script 1 shows that the result set is now changed. However, the repeatable read transaction isolation level still allows inserts, which means that you can still have a phantom read. Let’s see this by changing Code Script 2 to:

-- Run this in query window 2 while Code Script 1 is running
USE IsolationLevelTest;
GO
INSERT INTO dbo.IsolationTests (ColA)
VALUES ('W');
SELECT * FROM dbo.IsolationTests;

Run both code scripts again (Code Script 1 first, and Code Script 2 within 10 seconds). This time, you will see that Code Script 2 completes immediately without being blocked, and when Code Script 1 finishes, it has spawned additional data in its result set. A phantom read in action.

If you want to hide from phantom reads completely, then you’ll need to use either the serializable or snapshot transaction isolation levels. Both of these have the same concurrency effects: No dirty reads, non-repeatable reads, or phantom reads. The difference is in how they are implemented: the serializable transaction isolation level will block all other transactions affecting this data, while the snapshot isolation level utilizes row versions to create connection-specific versions of the table for the transaction – all of these row versions will cause increased activity in the tempdb database. Let’s take a look at how the snapshot isolation level will eradicate the phantom reads. First off, we need to modify the database to accept this isolation level.

ALTER DATABASE IsolationLevelTest SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

To see this in action, we will once again have two queries. Run the code from Code Script 3 in one query window, and within 10 seconds run the code from Code Script 4 in another query window.

-- Run this code in query window 1
USE IsolationLevelTest;
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM dbo.IsolationTests;
WAITFOR DELAY '00:00:10';
SELECT * FROM dbo.IsolationTests;
ROLLBACK;
-- Run this in query window 2 while the 1st query is running
USE IsolationLevelTest;
GO
INSERT INTO dbo.IsolationTests(ColA)
VALUES ('X');
SELECT * FROM dbo.IsolationTests;

Notice that Code Snippet 2 was executed immediately, however the row that it inserted was not returned in the result set from Code Snippet 1.

And now you know how to dodge phantoms.