About 15 years ago, I was hanging out at the MIT AI Lab, and there was an ongoing seminar on the Coq proof assistant. The idea was that you wouldn’t have to guess whether your programs were correct; you could prove that they worked correctly.

The were just two little problems:

  1. It looked ridiculously intimidating.
  2. Rumor said that it took a grad student all summer to implement and prove the greatest common divisor algorithm, which sounded rather impractical.

So I decided to stick to Lispy languages, which is what I was officially supposed to be hacking on, anyway, and I never did try to sit in on the seminar.

Taking another look

I should have taken a look much sooner. This stuff provides even more twisted fun than Haskell! Also, projects like the CompCert C compiler are impressive: Imagine a C compiler where every optimization has been proven correct.

Even better, we can write code in Coq, prove it correct, then export it to Haskell or several other functional languages.

Here’s an example Coq proof. Let’s start with a basic theorem that says “If we know A is true, and we know B is true, then we know A /\ B (both A and B) is true.”

Theorem basic_conj : forall (A B : Prop),
  A -> B -> A /\ B.
Proof.
  (* Give names to our inputs. *)
  intros A B H_A_True H_B_True.
  (* Specify that we want to prove each half
     of /\ separately. *)
  split.
  - apply H_A_True. (* Prove the left half. *)
  - apply H_B_True. (* Prove the right half. *)
Qed.

But Coq proofs are intended to be read interactively, using a tool like CoqIDE or Emacs Proof General. Let me walk you through how this proof would really look.

Proof.

At this point, the right-hand pane will show the theorem that we’re trying to prove:

1 subgoals, subgoal 1 (ID 1)
  
  ============================
   forall A B : Prop, A -> B -> A /\ B

But there’s a lot of clutter to prove here, so let’s pull some of that up into our assumptions:

  intros A B H_A_True H_B_True.

This gives us the following:

1 subgoals, subgoal 1 (ID 5)

  A : Prop
  B : Prop
  H_A_True : A
  H_B_True : B
  ============================
   A /\ B

Now we only need to prove A /\ B. Here, H_A_True can be read as “Hypothesis: A is true.” But to prove A /\ B, we need to split it into two pieces:

  split.

First we need to prove A:

2 subgoals, subgoal 1 (ID 7)
  
  A : Prop
  B : Prop
  H_A_True : A
  H_B_True : B
  ============================
   A

subgoal 2 (ID 8) is:
 B

We can prove this by applying our one of hypotheses:

  - apply H_A_True.

This leaves us with only one more thing to prove:

1 subgoals, subgoal 1 (ID 8)
  
  A : Prop
  B : Prop
  H_A_True : A
  H_B_True : B
  ============================
   B

And we can prove this trivially:

  - apply H_B_True.

And now we’re done!

No more subgoals.

This is fun for about five minutes, and then it gets incredibly tedious. So Coq offers a large library of “proof tactics” like auto, which can do this whole proof for us:

Theorem basic_conj' : forall (A B : Prop),
  A -> B -> A /\ B.
Proof. auto. Qed.

Functional programming

We can also write functional programs in Coq! Here’s a function that inserts into a sorted list:

Fixpoint insert_sorted (n : nat) (l : list nat) : list nat :=
  match l with
    | [] => [n]
    | n' :: l' =>
      if n <=? n'
      then n :: l
      else n' :: insert_sorted n l'
  end.

(A complete version of this code is available on GitHub.)

Once we’ve written the code, we can provide examples, and prove them correct:

Example test_insert_3_1_2 :
  insert_sorted 2 (insert_sorted 1 (insert_sorted 3 []))
    = [1; 2; 3].
Proof. reflexivity. Qed.

Example is just another name for Theorem, and reflexivity says, “Simplify both sides of the = sign and make sure that they match.” So far, this doesn’t give us anything more than a unit test, but we’ll get to that.

But we really want to prove a general result: When l is an order list, and you run insert_sorted n l, the result should also be ordered. Or, in more formal terms:

Theorem insert_sorted_stays_sorted : forall n l,
  Sorted leb l -> Sorted leb (insert_sorted n l).

Proving this will take a surprising amount of work. It’s not an easy theorem to state rigorously. But Coq’s standard library and proof tools will simplify things considerably.

Checking the standard library: Ordering

Fortunately, Coq has a huge standard library. Here, we’re going to import SfLib (provided by the book Software Foundations), and some libraries related to ordering and sorting:

Require Import SfLib.
Require Import Coq.Structures.Orders.
Require Import Coq.Sorting.Sorting.

Next, we’re going to grab some sample code from the Coq documentation. The first bit allows us to treat true (a regular boolean value) as True (a logical proposition):

(* From Coq.Structures.Orders. *)
Local Coercion is_true : bool >-> Sortclass.
Hint Unfold is_true.

The second bit defines how we want want to order the natural numbers in our list. Yes, Coq really does represent natural numbers using the Peano arithmetic, where O is 0, and S n is “add 1 to n”. It’s a recursive data structure like nil and cons in Lisp, except with no data stored in the list, and the length of the list is represents the number: S (S (S O)) represents 3.

(* From Coq.Sorting.Mergesort NatOrder example. Highest priority is 0. *)
Module Import PriorityOrder <: TotalLeBool.
  Definition t := nat.
  Fixpoint leb x y :=
    match x, y with
    | 0, _ => true
    | _, 0 => false
    | S x', S y' => leb x' y'
    end.
  Infix "<=?" := leb (at level 35).
  Theorem leb_total : forall a1 a2, a1 <=? a2 \/ a2 <=? a1.
  Proof. induction a1; destruct a2; simpl; auto. Qed.
End PriorityOrder.

This part is pretty cool:

Theorem leb_total : forall a1 a2, a1 <=? a2 \/ a2 <=? a1.
Proof. induction a1; destruct a2; simpl; auto. Qed.

This proof says, “Use induction on variable a1. That means we need to show that the base case and the inductive step are correct. For each of these goals, break down each possible case of a2, giving four total goals. Then for each goal, simplify everything you can, and grind through the rest of the proof with auto.” In particular, the ; character means, “Apply the following in parallel over all goals produced by the previous step.”

Checking the standard library: Sorting

Among the code that we import from Coq’s Sorted library, we find the very useful HdRel and Sorted properties:

Variable A : Type.
Variable R : A -> A -> Prop.
  
Inductive HdRel a : list A -> Prop :=
| HdRel_nil : HdRel a []
| HdRel_cons b l : R a b -> HdRel a (b :: l).

Inductive Sorted : list A -> Prop :=
| Sorted_nil : Sorted []
| Sorted_cons a l : Sorted l -> HdRel a l -> Sorted (a :: l).

Here, Sorted takes a list A as an argument, and returns a logical proposition. But the logical proposition is actually pretty weird: it’s represented as an algebraic data type! Specifically, the Sorted_nil contructor acts as “evidence” saying that the empty list is sorted. Sorted_nil says that “the list (a :: l) is Sorted if we can prove that Sorted l and HdRel a l are both true.

We can test Sorted using an Example block:

Example test_Sorted : Sorted leb [1; 2; 3].
Proof. auto. Qed.

We’re finally ready to prove insert_sorted is correct!

First, we need one more helper lemma. Here, Admitted means, “just accept this on faith, please.” (I have proved this, but it’s too tedious for a blog post.)

Lemma flip_not_leb : forall (a b : nat), (a <=? b) = false -> b <=? a.
Admitted.

Also, we need to tell Coq that auto should automatically try to apply the following when it looks like they might help:

Hint Resolve flip_not_leb.
Hint Constructors Sorted.
Hint Constructors HdRel.

And now we’re ready for the proof. This has roughly the same structure as this informal proof, but it makes all the details formal:

(* This would make more sense if you read it interactively with
   Proof General or CoqIDE. *)

Theorem insert_sorted_stays_sorted : forall n l,
  Sorted leb l -> Sorted leb (insert_sorted n l).
Proof.
  intros n l H_sorted_l.

  (* Induction on l, and prove the base case. *)
  induction l as [|n' l']; simpl; auto.

  (* Prove the inductive case. *)
  Case "l = n' :: l'".

    (* The expression (n <=? n') is either true or
       false, so let's look at both (and immediately
       dispose of the easy case). *)
    destruct (n <=? n') eqn:H_n_le_n'; auto.

    (* Now we're going to have to work for it. *)
    SCase "n <=? n' = false".

      (* We know `Sorted leq (n' :: l')`, but let's
         pull some more useful implications out of that. *)
      apply Sorted_inv in H_sorted_l.
      inversion H_sorted_l as [H_sorted_l' HdRel_n'_l'].

      (* We want to prove Sorted_cons, so break it up into
         individual prerequisites. *)
      apply Sorted_cons.

      (* This prerequisite is true via induction. *)
      SSCase "Sorted (insert_sorted n l')". apply IHl'. auto.

      (* ...but this one is going to hurt. *)
      SSCase "HdRel n' (insert_sorted n l')".

        (* Use our induction hypotheses to turn another
           hypotheses into something more useful. *)
        apply IHl' in H_sorted_l'.

        (* Tear the remaining steps apart case by case. *)
        destruct l'; simpl; auto.
        destruct (n <=? n0); auto.
        inversion HdRel_n'_l'. auto.
Qed.

I’m pretty sure the final steps of this proof could be cleaned up a bit by introducing a few useful lemmas. But we’ve done it!

Converting to Haskell code

Of course, it doesn’t do us any good to prove our code correct in Coq, and then hand-translate it to another language, making mistakes along the way. So Coq can actually “extract” Haskell, OCaml or Scheme code semi-automatically. But first, we need to map Coq data types to Haskell data types:

Extraction Language Haskell.
Extract Inductive list => "([])" [ "[]" "(:)" ].
Extract Inductive bool => "Bool" [ "True" "False" ].
Extract Inductive nat => "Int" ["0" "(1+)"].
Extraction insert_sorted.

This outputs the following Haskell code:

insert_sorted :: Int -> (([]) Int) -> ([]) Int
insert_sorted n l =
  case l of {
   [] -> (:) n [];
   (:) n' l' ->
    case leb n n' of {
     True -> (:) n l;
     False -> (:) n' (insert_sorted n l')}}

If we combine this with the following helper code, we’ll get a working Haskell program:

leb :: Int -> Int -> Bool
leb = (<=)

main :: IO () 
main = do
  print $ insert_sorted 5 (insert_sorted 3 [])

Learning more

In recent years, the Coq community has been doing lots of cool stuff. There’s the CompCert C compile compiler, with a provably correct optimizer. There’s a fully-checked proof of the four-color theorem. And there’s a proof of the Feit–Thompson theorem, a major result in group theory that originally took over 250 pages to prove on paper.

There are three introductory books which come highly recommended:

  • Software Foundations, which walks through the basics in considerable detail with lots of exercises. This has also lots of chapters on programming language semantics and how to prove interesting things about them.
  • Certified Programming with Dependent Types, which skips over the low-level details of how Coq works, but which introduces high-level strategies for writing clear, readable proofs.
  • Coq’Art also comes highly recommended, but I haven’t looked at it yet.

Anyway, I hope this is enough to get you interested!

Update: Here are some more proof assistants that people like, with sample sorting proofs:

Also of interest: