# Serializability Theory

Consider a database D = (x, y, z), on which we will concurrently perform a series of transactions T1, T2, ..., Tn. We want some way of knowing whether we executed the transactions &quot;correctly.&quot; Formally, we state that an execution is correct if and only if it is equivalent to some serial execution of the transactions. In other words, a correct concurrent execution of the transactions produces the same result we would get if we executed them one at a time.

It may not be clear how executing transactions concurrently can cause problems. Date  identifies the Lost Update problem as a potential problem for concurrently executing transactions. A lost update occurs when two transactions, A and B, both update the same location R in the database. Suppose A reads R, but before A can update R, B reads R. Then, A updates R, and B updates R based on R's previous contents. B has overwritten A's update; A's update, thus, is lost. Note that this would not be a problem has A and B executed sequentially: A would have updated R before B read R.

In this section, we will consider how to use logs to detect whether a concurrent execution was serializable (ie correct).

## Logs

A log is simply a record of all operations which each transaction performed. The log entries are chronological; each entry in the log gives the following information>
• operation type (read or write)
• data on which the operation was performed
• transaction which performed the operation

As we shall see, logs are central to our study of serializability.

Example: Consider the transactions T1, T2, and T3, defined below:

T1 = r1[x] r1[z] w1[x]

T2 = r2[y] r2[z] w2[y]

T3 = w3[x] r3[y] w3[z]

Two possible logs of their execution are:

L1 = w3[x] r1[x] r3[y] r2[y] w3[x] r2[z] w2[y] w1[x]

L2 = w3[x] r3[y] w3[z] r2[y] r2[z] w2[y] r1[x] r1[z] w1[x].

## Serial Logs

A serial log is just a log in which transactions are run one at a time. In a serial log, there is no interleaving of operations from different transactions. L2 above is an example of a serial log, corresponding to the execution {T3, T2, T1}. L1 above is not a serial log, since an operation from transaction 1 (r1) occurs between two operations from transaction 3 (w3 and r3).

## Log Equivalence

Intuitively, two logs are equivalent if the executions that produced them leave the database in the same state. The formal definition of equivalence is:

Two logs are equivalent if and only if:

• Each read operation reads from the same write operation in both logs, and
• Both logs have the same final writes.
We say a read operation j reads from a write operation i if read operation j reads a value most recently written by write operation i. The final write of a log is the log's last write operation.

## Serializable Logs

Since we consider serial logs to be "correct", but we also want to execute transactions concurrently, we want to know whether a concurrent log is serializable; that is, equivalent to a serial log of the same transactions. Knowing this will allow us to execute transactions concurrently, but also verify the correctness of the execution.

## The Serializability Theorem

To determine whether a log is serializable, we construct its serialization graph. This graph is constructed as follows: The transactions {T1, T2, ...,Tn} are nodes in the graph. There is a directed edge from Ti to Tj if and only if, for some x, one of the following hold:
• ri[x]<wj[x],
• wi[x]<rj[x], or
• wi[x]<wj[x].
The Serializability Theorem states that: A log L is serializable if and only if SG(L) (the serialization graph) is acyclic.

Thus, given a transaction log, we can construct its serialization graph and determine whether the log is serializable.

## Summary

In this section, we have determined how to tell whether a concurrent execution of transactions on a database was correct. We first decided that correctness meant that the execution was equivalent to a serial execution of the transactions. Then, we found that, by constructing a serialization graph, we can test its serializability, since serializability is equivalent to the graph's being acyclic.

## Study Questions

1. Consider the following two transactions, and a log of their execution:

T1 = r1[a] r1[b] w1[c]

T2 = w2[b] w2[a] r2[c]

L = r1[z] r1[y] w2[y] w2[z] r2[x] w1[x]

Construct the serializability graph for this log, and show that this execution is not serializable.

2. Rearrange the log L above so that it is serializable. Try to do this without making L a serial log (ie retain concurrent execution of T1 and T2).

## References

 Date, C.J. An Introduction to Database Systems, Addison-Wesley, 1990,

 Singhal, M and N Shivaratri. Advanced Concepts in Operating Systems, McGraw-Hill, 1994, pp. 488-491.