Model Checking in Haskell, Part 2: From Programs to Transition Systems

In the previous post, we introduced transition systems, which are directed graphs that capture properties of the state of a system as it evolves through time. Each state in the graph was labeled with a true-set, the set of all atomic propositions which are true in that state. We explored how to build logical propositions in terms of the atomic propositions of the state labels, and how to check that such a proposition is an invariant of the transition system. By using an off-the-shelf graph search algorithm, we discovered all reachable states and evaluated the proposition at each state.

In this post, we will take a look at how transition systems can be derived from computer programs. We will develop a very simple imperative programming language, and then we will write a function that converts parallel programs written in this language to transition systems. Finally, we’ll use this language to implement Peterson’s algorithm for mutual exclusion, and we’ll use the checkInvariant function from the previous post to ensure that it is correct.

A simple imperative programming language

module ModelChecking2 where

import ModelChecking1
import Control.Applicative (liftA2)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Vector (Vector)
import qualified Data.Vector as Vec

In this section, we’ll define a simple, Turing-complete imperative language with variable assignments and conditional gotos. The language will be implemented as an embedded domain-specific language (eDSL) in Haskell. The embedding will be (mostly) shallow. We will use Haskell functions to represent effects and state predicates. However, we will not allow these functions to modify or query the current line number in the program; that bit will be deeply embedded in the language AST.

Our language will be called MIG, which stands for Modify/IfGoto. In MIG, a program is a sequence of statements. There are two kinds of statements:

  1. Modify: modify the global variable environment (e.g. assign a variable to a value)
  2. IfGoto: test a condition; if it’s true, go to the given line number

The global variable environment, or just the environment, is an assignment of values to a set of variables. The environment is going to be a Map from vars to vals:

type Env var val = Map var val

A statement that modifies the global variable environment is represented as an effect, which is a function taking the old environment to a new one:

type Effect var val = Env var val -> Env var val

We’ll use Int as a sensible type for our line numbers:

type LineNumber = Int

A statement in MIG either modifies the current environment, or conditionally goes to the given line number:

data Stmt var val = Modify (Effect var val)
                  | IfGoto (Predicate (Env var val)) LineNumber

To execute a Modify statement, we simply apply the Effect to the current environment, thus modifying it, and then go to the next line in the program. To execute an IfGoto statement, we first test the Predicate against the current environment: if the predicate evaluates to true, then we go to the LineNumber indicated; if it is not true, then we go to the next line in the program.

We’ll also need an unconditional goto statement. We’ll define it as IfGoto true, where true :: Predicate a (defined in the previous post) is the function that always returns True:

goto :: LineNumber -> Stmt var val
goto lineNum = IfGoto true lineNum

Another occasionally useful statement is noop, which is implemented as a Modify that doesn’t actually do anything:

noop :: Stmt var val
noop = Modify id

A MIG program is just a Vector of statements:

type Prog var val = Vector (Stmt var val)

The Modify statement

In this section and the next section, we will define some helper functions that will make it easier to create readable statements in MIG. This will be a bit of a diversion from the main theme of model checking, but it is always worthwhile to get the details to look good and feel good. In this section, we focus on Modify; in the next, we’ll look at IfGoto.

The Modify constructor takes a single argument, an Effect:

Modify :: Effect var val -> Stmt var val

Recall that an effect is a function that modifies the global variable environment:

type Effect var val = Env var val -> Env var val

In C, we might see a line like this:

int x = y * 4;

The left-hand side of the = is a variable, and the right-hand side is an expression that can be evaluated, given an environment that has a definition for the variable y. In MIG, we could write the corresponding Modify statement like so:

data XY = X | Y deriving (Show, Eq, Ord)

modify_stmt :: Stmt XY Int
modify_stmt = Modify (\env -> Map.insert X (env Map.! Y * 4) env)

The function we passed to Modify took the current environment and modified it by looking up the value of Y, adding 4 to it, and setting X equal to the result. This is okay, but it would be much nicer to write something that looked more like the corresponding C statement.

To accomplish this, we’ll define an assignment operator that works on single variables. The operator will be .=, and the syntax

x .= e

will mean “evaluate the expression e and assign the result to the variable x”. A simple way to represent an expression is as a function from the environment to a particular value:

type Expr var val = Env var val -> val

Then, the assignment operator can be written as

(.=) :: Ord var => var -> Expr var val -> Effect var val
(x .= e) env = Map.insert x (e env) env
infix 2 .=

This allows us to create an Effect, which is a function, without needing to use an explicit lambda expression that calls Map.insert. Now, we can define the modify_stmt statement a bit more nicely:

modify_stmt :: Stmt XY Int
modify_stmt = Modify (X .= (\env -> env Map.! Y * 4))

This is better. However, the Expr we are binding X to is still defined in terms of a lambda expression and an explicit Map.! operator. We can do a bit better still by defining some more functions to build Exprs more cleanly.

If x :: var is a variable, we can create a corresponding expression for x. In our representation, the expression for x will be a function that simply looks up x in the environment and returns its value.

var :: Ord var => var -> Expr var val
var x env = env Map.! x

If c :: val is a constant value, we can create a corresponding expression for c. In our representation, the expression for c will be a function that ignores the current environment and blindly returns the value c.

val :: val -> Expr var val
val c _ = c

If val is a numeric type, we can lift the usual numeric operators to expressions. This can actually be accomplished by providing an orphan Num instance for functions:

instance Num b => Num (a -> b) where
  (+) = liftA2 (+)
  (*) = liftA2 (*)
  (-) = liftA2 (-)
  abs = fmap abs
  signum = fmap signum
  fromInteger = const . fromInteger
  negate = fmap negate

Since an Expr var val is just a function Env var val -> val, this means that if val has a Num instance, we have a corresponding instance for Expr var val that behaves in the way we’d expect.

Now, we can rewrite our int x = y * 4; statement in a much nicer way:

modify_stmt :: Stmt XY Int
modify_stmt = Modify (X .= var Y * 4)

If we have two effects that we’d like to perform, one after another, we can combine them with the following operator:

(>:) :: Effect var val -> Effect var val -> Effect var val
(a >: b) env = b (a env)
infixr 1 >:

If a and b are effects, a >: b is the effect which results from first performing a, then performing b. This is useful when we wish to update two variables atomically, i.e. perform both updates in a single step of the program.

The IfGoto statement

In this section, we’ll define a few helper functions to help us write IfGoto statements in a readable way. IfGoto takes an environment predicate and a line number as arguments:

IfGoto :: Predicate (Env var val) -> LineNumber -> Stmt var val

Recall from the previous post that a predicate is just a single-argument function that returns a Bool:

type Predicate a = a -> Bool

Therefore, a Predicate (Env var val) is a function Env var val -> Bool. If this function evaluates to True in the current environment, the line number should change to the value specified by the second argument of IfGoto, the LineNumber.

In C, we might see a conditional goto statement like this:

if (x == 1 + y) goto label;

Ideally the programmer doesn’t use explicit gotos, but in MIG it will be the only option for affecting control flow in our programs. Let’s assume label refers to a specific line number, namely line 17 of the program. Then we can write the corresponding IfGoto statement like so:

if_goto_stmt :: Stmt XY Int
if_goto_stmt = IfGoto (\env -> env Map.! X == 1 + env Map.! Y) 17

The function we passed to IfGoto took the current environment, looked up the values of X and Y, and tested whether the value of X was equal to 1 plus the value of Y. As with Modify, we’ll write some helper functions to create these environment predicates in a way that more closely resembles the original C code. The first such function will be the equality operator, which evaluates two expressions and determines if the results are equal:

(.==) :: Eq val => Expr var val -> Expr var val -> Predicate (Env var val)
(e1 .== e2) env = e1 env == e2 env
infix 4 .==

The next functions we’ll need are the inequality operators:

(.<=) :: Ord val => Expr var val -> Expr var val -> Predicate (Env var val)
(e1 .<= e2) env = e1 env <= e2 env
infix 4 .<=
(.<) :: Ord val => Expr var val -> Expr var val -> Predicate (Env var val)
(e1 .< e2) env = e1 env < e2 env
infix 4 .<
(.>=) :: Ord val => Expr var val -> Expr var val -> Predicate (Env var val)
(e1 .>= e2) env = e1 env >= e2 env
infix 4 .>=
(.>) :: Ord val => Expr var val -> Expr var val -> Predicate (Env var val)
(e1 .> e2) env = e1 env > e2 env
infix 4 .>

Also note that we can combine predicates using the boolean operators .&, .|, pnot, and .-> as defined in the previous post; these enable us to “build up” larger predicates out of smaller ones.

Now, we can implement the C statement if (x == 1 + y) goto label; in a much nicer way:

if_goto_stmt :: Stmt XY Int
if_goto_stmt = IfGoto (var X .== 1 + var Y) 17

Implementing factorial

To give you a simple example of how to program in MIG, let’s implement factorial. We won’t do any model checking of this program; this is merely to give you (the reader) a concrete sense of how programs can be written in it.

We’re going to hand-translate this C function:

int fact(int n) {
  int i = 2, res = 1;
  while (i <= n) {
    res *= i;
    i += 1;
  }
  return res;
}

into a MIG program. The program has three variables:

data FactVar = N | I | Res deriving (Show, Eq, Ord)

Now we’re ready to write the fact program (line numbers are listed in comments to the left of each statement):

fact :: Prog FactVar Int
fact = Vec.fromList
  {- 0 -} [ Modify (Res .= 1 >: I .= 2)
  {- 1 -} , IfGoto (pnot (var I .<= var N)) 5
  {- 2 -} ,   Modify (Res .= var Res * var I)
  {- 3 -} ,   Modify (I .= var I + 1)
  {- 4 -} ,   goto 1
  {- 5 -} , goto 5 -- halt
          ]

We don’t have a separate Halt statement, so we model that with a goto statement that points to itself, infinitely looping.

Parallel programs

In this section, we will examine the problem of model checking multiple programs running in parallel, which all have access to the same global variable environment. We will use MIG to express each individual program, and then we will define a function to convert a parallel program into a transition system.

A parallel program is just a Vector of sequential programs with a shared environment:

type ParProg var val = Vector (Prog var val)

We will use the term process to denote a sequential program that is part of a larger parallel program. The process id of a particular process is just its index within this vector.

type ProcId = Int

To “run” a parallel program, we first start each process at line 0. Then, we arbitrarily pick which process will execute next, and we execute that process’s current statement, updating the global variable environment and/or that process’s current line number. This type of parallel execution model is called interleaving, because each process advances independently of the other, and the processes can advance in any order.

Example: Peterson’s mutual exclusion algorithm

Consider two processes, P0 and P1, which are running corresponding programs. Suppose that each program has a critical section, and if P0 and P1 are ever both in their critical sections, bad things can happen. The property that both processes cannot execute their critical sections simultaneously is called mutual exclusion.

Peterson’s algorithm is an elegant way to ensure mutual exclusion. It uses three variables: Turn, Wait0, and Wait1.

data PeteVar = Turn | Wait0 | Wait1 deriving (Show, Eq, Ord)

Intuitively, the meaning of each variable is as follows:

We implement Peterson’s algorithm in MIG as a parallel program with two processes. Each process is in an infinite loop, continually attempting to enter its own critical section. When a process wants to enter its critical section, it first sets its own Wait variable to True, and then it (somewhat counterintuitively) signals that it is the other process’s turn. Then, before it enters its own critical section, it busy-waits until either the other process is not waiting, or its own turn has arrived. Finally, when the critical section is complete, the process sets its Wait variable to False, indicating it has exited its critical section, and the other process is free to enter theirs. The process then loops back to the beginning, and again attempts to enter its critical section.

pete_0 :: Prog PeteVar Bool
pete_0 = Vec.fromList
  {- 0 -} [ Modify (Wait0 .= val True)
  {- 1 -} , Modify (Turn .= val True)
  {- 2 -} , IfGoto (var Turn .& var Wait1) 2
  {- 3 -} , noop -- CRITICAL SECTION
  {- 4 -} , Modify (Wait0 .= val False)
  {- 5 -} , goto 0
          ]
pete_1 :: Prog PeteVar Bool
pete_1 = Vec.fromList
  {- 0 -} [ Modify (Wait1 .= val True)
  {- 1 -} , Modify (Turn .= val False)
  {- 2 -} , IfGoto (pnot (var Turn) .& var Wait0) 2
  {- 3 -} , noop -- CRITICAL SECTION
  {- 4 -} , Modify (Wait1 .= val False)
  {- 5 -} , goto 0
          ]
pete :: ParProg PeteVar Bool
pete = Vec.fromList [ pete_0, pete_1 ]

It is possible to reason through why this solution works, and even to write a formal proof that both processes cannot be at line 3 simultaneously. However, by converting this parallel program to a transition system, we can automatically check that Peterson’s algorithm ensures mutual exclusion. In the next section, we discuss how to perform this conversion, and then apply it to this algorithm.

From parallel programs to transition systems

In order to model check a parallel program, we first convert such a program into a transition system. The basic idea for the conversion will be that a state in the transition system will be a pair (Vector LineNumber, Env var val), consisting of the current line number of each process, and the current values of the global variable environment.

type ParProgState var val = (Vector LineNumber, Env var val)

Every state (lineNums, env) will have exactly Vec.length lineNums outgoing transitions, each corresponding to executing the current line of one of the running processes.

What should the set of atomic propositional variables be for the transition system of a program? Instead of choosing this for all programs, we defer this choice to the caller. The caller will provide a list of atomic propositional variables, along with a function which maps each variable to a state predicate, i.e. a Predicate (ParProgState var val). Then, the label of each state will be the set of variables whose corresponding predicate is true about the current state.

The action type will just be a (ProcId, LineNumber) pair, corresponding to “performing the statement at the given line of the given process.” As in the previous post, the action is just a name for each transition, and does not have any semantic content whatsoever.

Let’s define parProgToTS, which converts a parallel program to a transition system:

parProgToTS :: [Env var val]
            -> [ap]
            -> (ap -> Predicate (ParProgState var val))
            -> ParProg var val
            -> TransitionSystem (ParProgState var val) (ProcId, LineNumber) ap
parProgToTS initialEnvs aps apToPred parProg = TransitionSystem

The initialEnvs argument is a list of all initial environments we are considering. aps is the complete list of atomic propositional variables, and apToPred maps each atomic propositional variable to its corresponding state predicate. For each initial environment provided by the caller, there is a corresponding initial state in the transition system with every process starting at line 0:

  { tsInitialStates = [(Vec.replicate (Vec.length parProg) 0, env) | env <- initialEnvs]

States are labeled by the atomic propositional variables whose corresponding predicates they satisfy:

  , tsLabel = \s -> [ p | p <- aps, s |= apToPred p ]

Each state has one outgoing transition for every process in the input program, corresponding to executing the current line of that particular process.

  , tsTransitions = \(lineNums, env) ->
      [ t env lineNums procId | procId <- [0..Vec.length parProg-1] ]
  }
  where t env lineNums procId =
          let lineNum = lineNums Vec.! procId
              process = parProg Vec.! procId
              action = (procId, lineNum)
          in case process Vec.! lineNum of
            Modify effect ->
              (action, (lineNums Vec.// [(procId, lineNum+1)], effect env))
            IfGoto p lineNum'
              | env |= p  -> (action, (lineNums Vec.// [(procId, lineNum' )], env))
              | otherwise -> (action, (lineNums Vec.// [(procId, lineNum+1)], env))

Checking Peterson’s algorithm

The correctness property for Peterson’s algorithm is simple: both processes should never be in their critical sections simultaneously. To ensure that this is true, we use the following type for the atomic propositional variables:

data ProcAtLine = ProcAtLine ProcId LineNumber
  deriving (Show, Eq, Ord)

ProcAtLine procId lineNum represents the predicate that “process procId is currently at line lineNum.” This is codified by the following function:

procAtLinePred :: ProcAtLine -> Predicate (ParProgState var val)
procAtLinePred (ProcAtLine procId lineNum) =
  \(lineNums, _) -> lineNums Vec.! procId == lineNum

In order to call peteTS, we also need to collect all of the atomic propositional variables. These will be ProcAtLine procId lineNum for every valid (ProcId, LineNumber) pair. We can collect all such pairs for an arbitrary ParProg with the following function:

enumProcAtLine :: ParProg var val -> [ProcAtLine]
enumProcAtLine parProg =
  [ ProcAtLine procId lineNum | procId  <- [0..Vec.length parProg - 1]
                              , lineNum <- [0..Vec.length (parProg Vec.! procId) - 1] ]

Now, we can define the transition system for the pete program as follows:

peteTS :: TransitionSystem (ParProgState PeteVar Bool) (ProcId, LineNumber) ProcAtLine
peteTS = parProgToTS [initialEnv] (enumProcAtLine pete) procAtLinePred pete
  where initialEnv = Map.fromList [ (Turn, False), (Wait0, False), (Wait1, False) ]

Note that there is only one initial environment; all three variables are initially False.

The critical section occurs at line 3 in both processes, so the property that both processes are not simultaneously in their critical sections is stated as follows:

pnot (atom (ProcAtLine 0 3) .& atom (ProcAtLine 1 3))

Let’s check that this property is an invariant for peteTS transition system:

  > checkInvariant (pnot (atom (ProcAtLine 0 3) .& atom (ProcAtLine 1 3))) peteTS
  Nothing

Glancing back at each process in Peterson’s algorithm, we see that the assignment to the Wait variable occurs before the assignment to Turn. What happens if we swap the two? Will our property still hold?

bad_pete_0 :: Prog PeteVar Bool
bad_pete_0 = Vec.fromList
  {- 0 -} [ Modify (Turn .= val True)
  {- 1 -} , Modify (Wait0 .= val True)
  {- 2 -} , IfGoto (var Turn .& var Wait1) 2
  {- 3 -} , noop -- CRITICAL SECTION
  {- 4 -} , Modify (Wait0 .= val False)
  {- 5 -} , goto 0
          ]
bad_pete_1 :: Prog PeteVar Bool
bad_pete_1 = Vec.fromList
  {- 0 -} [ Modify (Turn .= val False)
  {- 1 -} , Modify (Wait1 .= val True)
  {- 2 -} , IfGoto (pnot (var Turn) .& var Wait0) 2
  {- 3 -} , noop -- CRITICAL SECTION
  {- 4 -} , Modify (Wait1 .= val False)
  {- 5 -} , goto 0
          ]
bad_pete :: ParProg PeteVar Bool
bad_pete = Vec.fromList [ bad_pete_0, bad_pete_1 ]
bad_peteTS :: TransitionSystem (ParProgState PeteVar Bool) (ProcId, LineNumber) ProcAtLine
bad_peteTS = parProgToTS [initialEnv] (enumProcAtLine bad_pete) procAtLinePred bad_pete
  where initialEnv = Map.fromList [ (Turn, False), (Wait0, False), (Wait1, False) ]

The naming of these programs probably tips you off about the outcome of our little experiment. Let’s try it:

  > checkInvariant (pnot (atom (ProcAtLine 0 3) .& atom (ProcAtLine 1 3))) bad_peteTS
  Just (([3,3],fromList [(Turn,True),(Wait0,True),(Wait1,True)]),Path {pathHead = ([0,0],fromList [(Turn,False),(Wait0,False),(Wait1,False)]), pathTail = [((0,0),([1,0],fromList [(Turn,True),(Wait0,False),(Wait1,False)])),((0,1),([2,0],fromList [(Turn,True),(Wait0,True),(Wait1,False)])),((0,2),([3,0],fromList [(Turn,True),(Wait0,True),(Wait1,False)])),((0,3),([4,0],fromList [(Turn,True),(Wait0,True),(Wait1,False)])),((0,4),([5,0],fromList [(Turn,True),(Wait0,False),(Wait1,False)])),((0,5),([0,0],fromList [(Turn,True),(Wait0,False),(Wait1,False)])),((1,0),([0,1],fromList [(Turn,False),(Wait0,False),(Wait1,False)])),((0,0),([1,1],fromList [(Turn,True),(Wait0,False),(Wait1,False)])),((0,1),([2,1],fromList [(Turn,True),(Wait0,True),(Wait1,False)])),((0,2),([3,1],fromList [(Turn,True),(Wait0,True),(Wait1,False)])),((1,1),([3,2],fromList [(Turn,True),(Wait0,True),(Wait1,True)])),((1,2),([3,3],fromList [(Turn,True),(Wait0,True),(Wait1,True)]))]})

Oh no! Looks like it doesn’t hold. Let’s make the counterexample a bit more readable so we can analyze what went wrong:

  > Just (_, path) = checkInvariant (pnot (atom (ProcAtLine 0 3) .& atom (ProcAtLine 1 3))) bad_peteTS
  > print (pathHead path) >> mapM_ print (snd <$> pathTail path)
  ([0,0],fromList [(Turn,False),(Wait0,False),(Wait1,False)])
  ([1,0],fromList [(Turn,True),(Wait0,False),(Wait1,False)])
  ([2,0],fromList [(Turn,True),(Wait0,True),(Wait1,False)])
  ([3,0],fromList [(Turn,True),(Wait0,True),(Wait1,False)])
  ([4,0],fromList [(Turn,True),(Wait0,True),(Wait1,False)])
  ([5,0],fromList [(Turn,True),(Wait0,False),(Wait1,False)])
  ([0,0],fromList [(Turn,True),(Wait0,False),(Wait1,False)])
  ([0,1],fromList [(Turn,False),(Wait0,False),(Wait1,False)])
  ([1,1],fromList [(Turn,True),(Wait0,False),(Wait1,False)])
  ([2,1],fromList [(Turn,True),(Wait0,True),(Wait1,False)])
  ([3,1],fromList [(Turn,True),(Wait0,True),(Wait1,False)])
  ([3,2],fromList [(Turn,True),(Wait0,True),(Wait1,True)])
  ([3,3],fromList [(Turn,True),(Wait0,True),(Wait1,True)])

Here is the important bit, annotated with letters for discussion:

  ([0,0],fromList [(Turn,True),(Wait0,False),(Wait1,False)])  -- a
  ([0,1],fromList [(Turn,False),(Wait0,False),(Wait1,False)]) -- b
  ([1,1],fromList [(Turn,True),(Wait0,False),(Wait1,False)])  -- c
  ([2,1],fromList [(Turn,True),(Wait0,True),(Wait1,False)])   -- d
  ([3,1],fromList [(Turn,True),(Wait0,True),(Wait1,False)])   -- e
  ([3,2],fromList [(Turn,True),(Wait0,True),(Wait1,True)])    -- f
  ([3,3],fromList [(Turn,True),(Wait0,True),(Wait1,True)])    -- g

At a, both processes are at line 0. First, P1 sets Turn to False, indicating it is P0’s turn to enter the critical section. Then, P0 sets it to True, indicating it is P1’s turn. Then, at c, P0 sets Wait0 to True, indicating it wishes to enter. At line d, P0 enters its critical section even though Turn is True, because Wait1 is still False. Then, P1 is able to enter its critical section, because Turn is True.

We see that it really matters that the Wait variables are updated before the Turn flag is flipped! Otherwise, the algorithm simply does not work.

What’s next?

In this post, we applied model checking to a real-world program and proved something valuable and non-trivial. Peterson’s algorithm isn’t too complicated, but it’s not easy to see why it’s correct at first glance. We saw that by translating it to a transition system, we could exhaustively explore the reachable state space, and found that it was impossible to violate the desired invariant. When we modified the program slightly, we discovered that the invariant failed. Furthermore, the model checking approach discovered a counterexample that helped us to understand why it failed.

So far in this series, the only properties we have explored and checked are invariants. However, there are some properties that cannot be expressed as invariants; for instance, the property that a traffic light will be green infinitely often, or that a yellow light always precedes a red light. In the next post, we will explore a larger class of properties called regular safety properties. We will show how to use nondeterministic finite automata to express such properties, and how to check whether these properties hold.