TLA+

a high-level language for modeling programs and systems—especially concurrent and distributed ones.

It's based on the idea that the best way to describe things precisely is with simple mathematics. TLA+ and its tools are useful for eliminating fundamental design errors, which are hard to find and expensive to correct in code.

What is it?

Fundamental idea

We can model specify behaviors of systems, including computer processes, through descriptions of actions that update states.

Assume a fixed set of variables.
States:
Assignments of values to variables
Behaviors:
Infinite discrete sequences of states, with a starting point, like: s₁, s₂, s₃, …
Behaviors are naturally ordered by “tail of” relation, so s₁, s₂, s₃, … is immediately followed by s₂, s₃, s₄ …
Actions:
Boolean valued expression, expressing relation between the values of variables in two states, intuitively the current and next state.

Examples of actions:

NB: for some reason, “=” is a boolean operator and “≜” is definitional equality. Assignment would correspond to an action.

The Temporal Logic of Actions

TLA is a language that can be used to write statements that could be true or false within (at the starting point of) behaviors.

Intuitively:

  1. “All (non-trivial) transitions conform to this action”:
  2. “This action could occur”
  3. “If this action eventually always could occur, it eventually does”

One nuance. TLA is designed to ensure that behaviors always are allowed to “stutter”, i.e. fail to update anything. There seem to be at least two reasons for this.

Stuttering is supposed to be more realistic.

If behaviors can stutter, then “implementation is implication”

For example, a spec for a clock with an hour hand and a minute hand will be implied by a spec for a clock with an hour, minute, and second hand, in the sense that every behavior satisfying the second spec satisfies the first.

The idea is to ensure that you can leave out temporal detail in higher-level specifications, and then add it in without needing to backtrack.

Some formal results here, Lamport/Cohen reduction theorem.

Predicate ≜ Action with no primed variables
          | ENABLED Action

Form ≜ Predicate   | □[Action]ₜ | ¬Form 
     | Form ∧ Form | □Form
              

Where:

TLA+ extends the above (simple TLA) by adding general quantification, sets, and specific operators for use in actions.

How do you use it?

The main tool for using TLA+ is the tlc model checker.

Given a sufficiently descriptive statement in TLA+, tlc can attempt to generate a representation of all possible behaviors that conform to that statement.

The representation (I think) is called a model

“Sufficiently descriptive” means basically that the statement gives an initial assignment to each declared variable, and gives an update relation that pins down how state changes.

Example:

Init ≜ V = 0       // Nat
     ∧ pc = "Plus" // program counter

Plus ≜ pc = "Plus" ∧ pc' = "Minus" 
     ∧ V' = V + 1

Minus ≜ pc = "Minus" ∧ pc' = "Plus" 
      ∧ V' = V - 1

Next ≜ Plus ∨ Minus

Spec ≜ Init ∧ □[Next]ₜ
                

Models have a general graph structure; kind of like automata, with behaviors corresponding to automata runs.

tlc can also do some extra checks on the model:
  1. Invariants: make sure some predicate is true in every state
  2. Liveness properties: statements that are true somewhere in any behavior.
  3. More: e.g. “if φ is always true, then so is ψ.”
Basically check any TLA+ formula, abort if violation found.

The TLA+ language is pretty simple.

But if you want to model algorithms, writing in pure TLA+ still introduces a layer of indirection.

One possible solution is to use a DSL for representing an algorithm, and then automatically translate to TLA+.

PlusCal:

variable seq = <<1, 2, 3, 2>>;
index = 1;
seen = {};
is_unique = TRUE;

begin
  TheMainLoop:
    while index <= Len(seq) do
      if seq[index] \notin seen then
        seen := seen \union {seq[index]};
      else
        is_unique := FALSE;
      end if;
      index := index + 1;
    end while;
end algorithm;

PlusCal:

variable seq = <<1, 2, 3, 2>>;
index = 1;
seen = {};
is_unique = TRUE;

begin
  TheMainLoop:
    while index <= Len(seq) do
      if seq[index] \notin seen then
        seen := seen \union {seq[index]};
      else
        is_unique := FALSE;
      end if;
      index := index + 1;
    end while;
end algorithm;

Is it any good?

The Good:

  1. Very expressive.
  2. Learnable.
  3. Some neat case studies of industrial applications: https://lamport.azurewebsites.net/tla/industrial-use.html: Intel, Amazon, Microsoft, NASA, OpenComRTOS.
  4. Use might be growing. In April 2023, AWS, Microsoft, and Oracle became founding members of the TLA+ Foundation, launched by the Linux Foundation. The TLA+ Foundation will “promote adoption, provide education and training resources, fund research, develop tools, and build a community of TLA+ practitioners.” Also functions as a language committee

The Bad:

  1. Only concrete models (symbolic model checker, apalache, seems immature). State explosion.
  2. No spec ↔ implementation automation. A few academic experiments (e.g. C2TLA+) translating PlusCal to and from C. Actually, there's PGo, which can translate PlusCal to Go.
  3. No type system (by design), no pattern matching… Awkward recursion.

The Ugly:

  1. TLA-toolbox (main TLA+ IDE) is not great.
  2. Support for modern editors is limited. There's a VSCode plugin, and syntax highlighting for other editors, but that's about it.
  3. TLAPs, proof system for TLA+, is not particularly well documented (defunct "hyperbook", some video lectures). I don't know if it's used.

What did you do with it?

I tried out comparing two algorithms in TLA+.

Basic Method:

  1. Write both algorithms in pluscal
  2. Translate to TLA+
  3. Merge (by hand) the two resultant specifications, so that the update relation updates state for both.
Problem: Stuttering.
  1. Add some checkpoints, where one algorithm will pause and wait for the other to catch up
  2. Write invariants that say certain parts of the state are the same, when the algorithms are in sync.

Basically works, probably wasn't a great idea.

I think the main kind of algorithm comparison that TLA+ is intended for is refinement.

One develops a program through a series of refinements, starting from a high-level algorithm and eventually reaching a low-level program. Just as we went from Program 1 to the finier-grained Program 2… the entire process from specification to C code could in principle be carried out in TLA. “All” we would need is a precise semantics of C, which would allow the translation of any C program into a TLA formula.
The ideal, I think, is refinement-driven-development, not verification of existing software.