4 Lecture 3 - Rocq Programming II
Inductive Formulas, Lists, and Substitution
Below is a consolidated summary of the whole Rocq lecture file PropFormulas.v, together with the extra conceptual points we discussed while revising it.
Defining well-formed formulas as an inductive datatype
The central datatype is:
Inductive Wff :=
P : ℕ → Wff
| Neg : Wff → Wff
| Impl : Wff → Wff → Wff
| Conj : Wff → Wff → Wff
| Disj : Wff → Wff → Wff.
This says that Wff is the type of well-formed propositional formulas.
Each constructor corresponds to a formation rule:
P n is an atomic proposition.
Neg F is the negation of F.
Impl F G is the implication F ⇒ G.
Conj F G is the conjunction F ∧ G.
Disj F G is the disjunction F ∨ G.
So the inductive datatype corresponds directly to the grammar of propositional formulas:
F ::= Pₙ | ¬F | F ⇒ F | F ∧ F | F ∨ F
The important point is:
An inductive type tells Rocq how objects of that type can be constructed.
So Wff is not just a set declared abstractly. It is generated by the constructors P, Neg, Impl, Conj, and Disj.
Constructor types
The declaration defines constructors with types:
P : ℕ → Wff
Neg : Wff → Wff
Impl : Wff → Wff → Wff
Conj : Wff → Wff → Wff
Disj : Wff → Wff → Wff
Examples:
Check P.
Check Neg.
Check Impl.
For atomic propositions:
Check P 0.
Check P 42.
Both have type:
Wff
Composite formulas can be built by applying constructors:
Check Neg (P 0).
Check Impl (P 1) (P 2).
Check Conj (P 1) (P 2).
For example:
Definition F₁ := Neg (P 0).
Definition F₂ := Conj (P 1) (P 2).
Then:
Impl F₁ F₂
is again a formula of type Wff.
Currying and partial application
The constructor:
Impl : Wff → Wff → Wff
is curried.
This does not mean that Impl takes a pair:
Wff × Wff → Wff
Instead it means:
Impl : Wff → (Wff → Wff)
So Impl takes one formula and returns a function waiting for the second formula.
The lecture demonstrates this using:
Section Scratch.
Variable (dummy_formula : Wff).
Check Impl dummy_formula.
End Scratch.
Inside the section:
Variable (dummy_formula : Wff).
means:
Let dummy_formula be an arbitrary formula of type Wff.
Then:
Check Impl dummy_formula.
shows that after applying Impl to one formula, the result has type:
Wff → Wff
So:
Impl dummy_formula : Wff → Wff
Then, after applying a second argument:
Impl dummy_formula (P 0) : Wff
This illustrates partial application.
Sections and local variables
The block:
Section Scratch.
Variable (dummy_formula : Wff).
Check Impl dummy_formula.
End Scratch.
creates a temporary local context.
Inside the section, Rocq knows:
dummy_formula : Wff
After:
End Scratch.
the variable disappears.
So Section ... End is useful for temporary assumptions, local experiments, and definitions that depend on parameters.
We discussed that:
Variable dummy_formula : Wff.
and:
Variable (dummy_formula : Wff).
are essentially equivalent here.
Both mean:
Assume dummy_formula is an arbitrary element of Wff.
This is not an imperative-programming variable. It is not a mutable storage cell. It is more like the mathematical phrase:
Let F be an arbitrary well-formed formula.
We clarified the difference between Variable and Definition.
A declaration like:
Variable F : Wff.
means:
Assume an arbitrary formula F.
It does not construct a concrete formula.
A definition like:
Definition F : Wff := P 0.
means:
Define F to be the concrete formula P 0.
So Variable is an assumption or parameter, while Definition introduces a named term.
You cannot generally do this:
Variable x : A.
Definition x := val.
because x is already a name in the current scope. Also, Variable x : A is not an uninitialized placeholder waiting to be assigned later. It is already an arbitrary but fixed object of type A.
The usual Rocq workflow is not:
declare x, then assign x later
but rather:
work generally with x as a parameter,
then later instantiate definitions or theorems with a concrete value.
Notation declarations
The lecture introduces readable notation:
Notation "a ∧ b" := (Conj a b) (at level 80, right associativity).
Notation "a ∨ b" := (Disj a b) (at level 85, right associativity).
Notation "a ⇒ b" := (Impl a b) (at level 99, right associativity).
Notation "¬ a" := (Neg a) (at level 75, format "¬ a").
These do not create new constructors. They only tell the parser and printer how to read and display expressions.
So:
P 0 ∧ P 1
is notation for:
Conj (P 0) (P 1)
Similarly:
¬ P 0
is notation for:
Neg (P 0)
We also discussed that one cannot type-check an infix notation by writing:
Check /\.
or similarly for Unicode operators.
Notation is not a standalone term. It is a parsing pattern.
Instead, one can check:
Check Conj.
Check (P 0 ∧ P 1).
Check (fun a b => a ∧ b).
The last one treats the notation as a binary operation by eta-expanding it into a function.
Defining formula height with Fixpoint
The first serious recursive function is the height function:
Fixpoint ht (F : Wff) : ℕ :=
match F with
| P i => 0
| Neg F => 1 + ht F
| Impl F F' => 1 + max (ht F) (ht F')
| Conj F F' => 1 + max (ht F) (ht F')
| Disj F F' => 1 + max (ht F) (ht F')
end.
Conceptually:
The height of an atom is 0.
The height of ¬F is 1 + height of F.
The height of a binary formula F ⋄ G is 1 + max(height F, height G).
The important Rocq idea:
Fixpoint is used for recursive definitions.
For ordinary non-recursive functions, one writes:
Definition f (x : A) := ...
But for recursive functions, the function body refers to the function itself, so one uses:
Fixpoint f (x : A) := ...
Rocq only accepts recursive definitions when it can verify termination. In this case, recursive calls are made on structurally smaller subformulas:
ht F
ht F'
where F and F' are subformulas of the original formula.
Why Fixpoint is needed
A recursive function such as height satisfies equations like:
ht (Neg F) = 1 + ht F
ht (Conj F G) = 1 + max (ht F) (ht G)
So the function refers to itself.
A plain Definition cannot introduce such self-reference directly. That is why Rocq has a special recursive definition construct:
Fixpoint
We discussed that Fixpoint is related to the mathematical idea of fixed points.
A recursive definition can be seen abstractly as an equation:
f = T(f)
where T is an operator that takes a candidate function and returns an improved/new function.
For factorial, one can define an operator:
T(g)(0) = 1
T(g)(n + 1) = (n + 1) · g(n)
Then factorial is a fixed point of T:
fact = T(fact)
This operator viewpoint is not necessary for ordinary Rocq programming, but it explains the word “Fixpoint” and connects to recursion theory, lambda calculus, domain theory, and denotational semantics.
In Rocq, however, the practical rule is:
Fixpoint = recursive function accepted only if Rocq can see termination.
Lower-level fix
The lecture then shows that the height function can also be defined using the lower-level fix term:
Definition ht₀ : Wff → ℕ :=
fix ht (F : Wff) : ℕ :=
match F with
| P i => 0
| Neg F => 1 + ht F
| Impl F F' => 1 + max (ht F) (ht F')
| Conj F F' => 1 + max (ht F) (ht F')
| Disj F F' => 1 + max (ht F) (ht F')
end.
Here:
fix ht (F : Wff) : ℕ := ...
is an explicit recursive function term.
The name ht inside the fix expression is the self-reference used for recursive calls.
So:
Fixpoint ht (F : Wff) : ℕ := ...
is a convenient command-level syntax, while:
Definition ht₀ : Wff → ℕ :=
fix ht (F : Wff) : ℕ := ...
uses the lower-level recursive term former.
The lecture then proves:
Goal ht = ht₀.
Proof.
reflexivity.
Qed.
This shows that the two definitions coincide.
Goal versus Theorem
We discussed:
Goal ht = ht₀.
Proof.
reflexivity.
Qed.
Goal starts an anonymous theorem.
It is like:
Theorem ht_eq_ht0 : ht = ht₀.
Proof.
reflexivity.
Qed.
except the result is not given a useful global name.
So Goal is often used for:
small experiments
lecture demonstrations
checking that something is provable
Why reflexivity proves ht = ht₀
The tactic:
reflexivity.
proves equalities whose two sides are equal after computation/unfolding.
It does not only check textual equality. It checks definitional equality.
Here, ht and ht₀ reduce to the same underlying recursive function. Therefore:
reflexivity
can prove:
ht = ht₀
This is an example of equality by computation.
This differs from equalities requiring real reasoning, such as:
forall n : nat, n + 0 = n
which usually needs induction.
Defining a function using tactics
The lecture then defines height again, this time using proof/tactic mode:
Definition ht₁ (F : Wff) : ℕ.
Proof.
induction F.
- exact 0.
- exact (1 + IHF).
- exact (1 + (max IHF1 IHF2)).
- exact (1 + (max IHF1 IHF2)).
- exact (1 + (max IHF1 IHF2)).
Defined.
This looks like a proof, but it defines a function.
The line:
Definition ht₁ (F : Wff) : ℕ.
means:
Define ht₁.
It takes F : Wff.
It must return a natural number.
I will construct the body interactively.
After this, Rocq enters proof mode. The goal is:
ℕ
under the assumption:
F : Wff
In type theory, proving and constructing are the same kind of activity. A goal is a type, and solving the goal means constructing a term of that type.
If the goal is a proposition, the term is a proof.
If the goal is ℕ, the term is a natural number.
So this tactic script constructs a program.
The role of induction F
The tactic:
induction F.
splits into cases according to the constructors of Wff.
Since Wff has five constructors, it produces five cases:
F = P i
F = Neg F
F = Impl F1 F2
F = Conj F1 F2
F = Disj F1 F2
In each case, the goal is to construct a natural number, the height.
For the atom case:
exact 0.
For the negation case, Rocq provides an induction result:
IHF : ℕ
which plays the role of the recursively computed height of the subformula. So:
exact (1 + IHF).
For binary constructors, Rocq provides:
IHF1 : ℕ
IHF2 : ℕ
corresponding to the recursively computed heights of the two subformulas. So:
exact (1 + (max IHF1 IHF2)).
The correspondence is:
Fixpoint recursive call tactic version
ht F IHF
ht F1 IHF1
ht F2 IHF2
So the induction tactic builds the same recursive function.
Why Defined instead of Qed
The tactic-defined function ends with:
Defined.
rather than:
Qed.
The difference:
Qed closes the proof opaquely.
Defined closes it transparently.
For ordinary proofs, Qed is common.
For functions/programs built with tactics, Defined is usually needed if we want Rocq to compute with the definition.
So if ht₁ is ended with Defined, then:
Compute ht₁ some_formula.
can unfold and compute.
If ended with Qed, the definition may become opaque and computation will not unfold it as expected.
The version using all:
The lecture then defines another version:
Definition ht₂ (F : Wff) : ℕ.
Proof.
induction F.
exact 0.
exact (1 + IHF).
all: exact (1 + (max IHF1 IHF2)).
Defined.
This is the same function again.
The only new syntax is:
all: tactic.
It means:
Apply this tactic to all remaining goals.
After:
induction F.
there are five goals.
The first goal is solved by:
exact 0.
The second goal is solved by:
exact (1 + IHF).
The remaining three goals are the binary cases: implication, conjunction, disjunction.
They all have the same shape, with two induction results:
IHF1 : ℕ
IHF2 : ℕ
So the same expression works for all of them:
all: exact (1 + (max IHF1 IHF2)).
This is more compact, but less explicit than the bullet version.
Introduction of lists
The next part introduces lists because they are needed to compute subformulas.
The lecture presents the inductive definition:
Inductive list (A : Type) : Type :=
| nil : list A
| cons : A → list A → list A.
This says:
For every type A, list A is the type of lists whose elements have type A.
Examples:
list nat
list bool
list Wff
The type constructor:
list
has type:
Type → Type
So list itself is not a type of elements; rather, it takes a type and returns a type.
This is analogous to C++ templates:
std::vector<int>
std::vector<bool>
std::vector<Wff>correspond to:
list nat
list bool
list Wff
The analogy is useful at the type-constructor level:
C++: vector<T>
Rocq: list A
But Rocq lists are pure inductive linked-list-like values, not mutable containers with capacity and memory operations.
Why Inductive list (A : Type) : Type
We discussed the difference between:
Inductive Wff := ...
and:
Inductive list (A : Type) : Type := ...
For Wff, Rocq can infer:
Wff : Type
One could explicitly write:
Inductive Wff : Type := ...
For list, the declaration means:
For each type A, list A is a Type.
So:
list : Type → Type
and:
list A : Type
The final : Type says that once A : Type is given, list A is itself a type.
We noted that Rocq can often infer the final : Type, so one could often write:
Inductive list (A : Type) := ...
But the lecture writes it explicitly for clarity.
List constructors
The constructors are:
nil : list A
cons : A → list A → list A
More explicitly, because list is polymorphic:
nil : forall A : Type, list A
cons : forall A : Type, A → list A → list A
So fully explicit uses look like:
nil bool
cons bool true (nil bool)
cons nat 3 (nil nat)
cons Wff (P 0) (nil Wff)
But this is verbose.
Arguments and implicit parameters
The lecture comments out or discusses:
Arguments nil {A}.
Arguments cons {A} a l.
These commands control implicit arguments.
The curly braces mean:
This argument should be implicit; Rocq should infer it when possible.
So:
Arguments nil {A}.
means:
The type parameter A of nil is implicit.
And:
Arguments cons {A} a l.
means:
The type parameter A of cons is implicit.
The element a and tail l remain explicit.
Without implicit arguments, one writes:
cons bool true (nil bool)
With implicit arguments, one can write:
cons true nil
and Rocq infers:
A = bool
from true : bool.
We also discussed the @ notation:
@nil
@cons
The @ forces Rocq to show or use all arguments explicitly, including implicit ones.
So:
cons true nil
is roughly shorthand for:
@cons bool true (@nil bool)
Friendlier list notation
The lecture then introduces standard list notation:
Open Scope list_scope.
and imports list notation:
From Stdlib Require Import List.
Import ListNotations.
This allows syntax such as:
[1; 2; 3]
1 :: [2; 3]
[]
The notation:
x :: xs
means:
cons x xs
So:
1 :: 2 :: 3 :: nil
means:
cons 1 (cons 2 (cons 3 nil))
And:
[1; 2; 3]
is nicer notation for the same list.
Strict subformulas
The lecture then defines a function computing strict subformulas:
Fixpoint strict_sf (F : Wff) : list Wff :=
match F with
| P i => nil
| Neg F => F :: strict_sf F
| Conj F F' => (F :: strict_sf F) ++ (F' :: strict_sf F')
| Disj F F' => (F :: strict_sf F) ++ (F' :: strict_sf F')
| Impl F F' => (cons F (strict_sf F)) ++ (cons F' (strict_sf F'))
end.
The meaning:
An atom has no strict subformulas.
The strict subformulas of ¬F are:
F itself, plus the strict subformulas of F.
The strict subformulas of F ⋄ G are:
F and all strict subformulas of F,
together with G and all strict subformulas of G.
The operator:
++
is list concatenation.
So:
(F :: strict_sf F) ++ (F' :: strict_sf F')
means:
combine the left-subformula list and right-subformula list
Then the full subformula list is defined as:
Definition sf (F : Wff) : list Wff :=
F :: strict_sf F.
So:
sf F = F itself plus all strict subformulas of F.
Substitution in formulas
The lecture then defines formula substitution:
Fixpoint subst_Wff (F : Wff) (I : ℕ → bool) (G : ℕ → Wff) : Wff :=
match F with
| P j => if I j then G j else P j
| ¬F => ¬(subst_Wff F I G)
| F₁ ⇒ F₂ => (subst_Wff F₁ I G) ⇒ (subst_Wff F₂ I G)
| F₁ ∧ F₂ => (subst_Wff F₁ I G) ∧ (subst_Wff F₂ I G)
| F₁ ∨ F₂ => (subst_Wff F₁ I G) ∨ (subst_Wff F₂ I G)
end.
Here:
I : ℕ → bool
is a predicate/characteristic function deciding which atomic propositions should be replaced.
And:
G : ℕ → Wff
is a replacement family. It tells us what formula should replace P j.
In the atomic case:
P j
the function does:
if I j then G j else P j
So:
If j is selected by I, replace P j with G j.
Otherwise leave P j unchanged.
For compound formulas, substitution recursively enters the subformulas and rebuilds the same connective.
For example:
subst(¬F) = ¬ subst(F)
subst(F ∧ G) = subst(F) ∧ subst(G)
So this function is a recursive traversal over the syntax tree.
Example characteristic function
The lecture defines a concrete selection function:
Definition I (n : ℕ) : bool :=
match n with
| 0 => true
| 2026 => true
| 41 => true
| 42 => true
| _ => false
end.
This selects some natural-number indices.
We noticed that if the comment says this represents {0; 2026; 42}, then the code also includes 41, so the actual function selects:
0, 41, 42, 2026
Example replacement families
The file then defines replacement functions such as:
Definition G₁ : ℕ → Wff :=
fun n => P (n + 2) ∧ P (n + 1).
This means selected atoms P n are replaced by:
P (n + 2) ∧ P (n + 1)
Another example:
Definition G₂ : ℕ → Wff :=
fun n => match n with
| 2025 => P 44
| 2026 => P 1 ⇒ P 2
| _ => P n
end.
This replacement function changes only certain indices. But substitution still happens only when I n = true.
So even if G₂ 2025 = P 44, the atom P 2025 is not replaced unless I 2025 = true.
A third example uses evenness:
Definition is_even (n : nat) : bool :=
eqb (n mod 2) 0.
Definition G₃ (n : ℕ) : Wff :=
if is_even n then P (n / 2) else P n.
This replacement family changes even indices by halving them, and leaves odd indices unchanged.
Again, the final substitution is controlled by both:
I decides whether replacement happens.
G decides what replacement is used.
General substitution for indexed families
The lecture abstracts the atomic substitution pattern into a general function:
Definition substFamily {Y} {X} :
(Y → X) → (Y → bool) → (Y → X) → Y → X :=
fun P I G y => if I y then G y else P y.
This is polymorphic in two types:
Y : Type
X : Type
Because they are written in curly braces:
{Y} {X}
they are implicit arguments.
The function takes:
P : Y → X
I : Y → bool
G : Y → X
y : Y
and returns:
if I y then G y else P y
So it says:
Given an original family P,
a selection predicate I,
and a replacement family G,
produce the substituted family.
In the formula case:
Y = ℕ
X = Wff
P : ℕ → Wff
where P n is the atomic formula indexed by n.
Second substitution function
The file then defines another formula substitution function:
Fixpoint subst_Wff' (I : ℕ → bool) (G : ℕ → Wff) (F : Wff) : Wff :=
match F with
| P j => substFamily P I G j
| ¬F => ¬(subst_Wff' I G F)
| F₁ ⇒ F₂ => (subst_Wff' I G F₁) ⇒ (subst_Wff' I G F₂)
| F₁ ∧ F₂ => (subst_Wff' I G F₁) ∧ (subst_Wff' I G F₂)
| F₁ ∨ F₂ => (subst_Wff' I G F₁) ∨ (subst_Wff' I G F₂)
end.
This is essentially the same as subst_Wff.
The difference is that the atomic case uses:
substFamily P I G j
instead of writing directly:
if I j then G j else P j
Also, the argument order is different.
Original:
subst_Wff : Wff → (ℕ → bool) → (ℕ → Wff) → Wff
New version:
subst_Wff' : (ℕ → bool) → (ℕ → Wff) → Wff → Wff
This order is often useful because, after fixing I and G:
subst_Wff' I G
becomes a function:
Wff → Wff
So it is a formula transformer.
Final exercise: adding biconditional
The file ends with an exercise: extend the language with biconditional.
Add a constructor:
Iff : Wff → Wff → Wff
Then define notation, for example:
Notation "a ⇔ b" := (Iff a b) ...
or in ASCII:
Notation "a <=> b" := (Iff a b) ...
Then update all recursive functions:
ht
strict_sf
subst_Wff
because whenever we extend an inductive datatype with a new constructor, every function that pattern-matches on that datatype must handle the new case.
This is an important structural lesson:
Inductive datatype = syntax/grammar.
Recursive functions over it = one case per constructor.
If you add a constructor, you must update the recursive consumers.
The exercise also asks you to think about precedence and associativity, because notation levels determine how expressions are parsed.
30. The central conceptual pattern of the whole lecture
The whole lecture revolves around one fundamental pattern:
Define syntax as an inductive datatype.
Then define recursive functions by pattern matching on that syntax.
For this file:
Wff defines the syntax of formulas.
ht consumes a formula and returns a number.
strict_sf consumes a formula and returns a list of formulas.
subst_Wff consumes a formula and returns a transformed formula.
All of these functions follow the same shape:
Fixpoint f (F : Wff) :=
match F with
| P i => ...
| Neg F => ... f F ...
| Impl F G => ... f F ... f G ...
| Conj F G => ... f F ... f G ...
| Disj F G => ... f F ... f G ...
end.
This is the core programming pattern for inductive syntax trees.
The deeper lessons from our supplementary discussion
Beyond the lecture itself, we clarified several important Rocq/type-theoretic ideas.
Variable is not an imperative variable
Variable x : A.
does not mean “create a mutable slot.”
It means:
Assume x is an arbitrary element of A.
Definition introduces a fixed named term
Definition x : A := val.
means:
Define x to be val.
It is closer to a constant binding than to a mutable variable.
Fixpoint is controlled recursion
Rocq only accepts recursive definitions if it can verify termination, usually by structural recursion.
This is essential because Rocq is a proof assistant. Unrestricted recursion would endanger logical consistency.
fix is the lower-level recursive term
fix f (x : A) : B := ...
is the term-level recursive function construct.
Fixpoint is the usual top-level command syntax.
Goal is an anonymous theorem
Goal P.
Proof.
...
Qed.
starts a proof of P without naming it.
reflexivity proves definitional equality
It works when both sides are the same after computation/unfolding.
Tactics can build programs
A tactic script can construct a proof, but it can also construct ordinary data or functions.
This works because in type theory:
proofs and programs are both terms.
propositions and data types are both types.
Defined keeps the term transparent
For functions built interactively, use:
Defined.
if you want computation to unfold them.
Polymorphic inductive types are type constructors
list : Type → Type
is analogous, loosely, to a C++ template like:
template <typename T>
class vector;Applying it to a type produces a concrete type:
list Wff : Type
Implicit arguments reduce verbosity
Constructors of polymorphic types have hidden type parameters.
With implicit arguments, one writes:
cons true nil
instead of:
@cons bool true (@nil bool)
Best mental model for the lecture
A good final mental model is:
Wff is a syntax tree type.
P, Neg, Impl, Conj, Disj are the constructors for building syntax trees.
Notation makes those trees look like logical formulas.
Fixpoint defines recursive tree traversals.
ht computes a number from a formula tree.
strict_sf collects formula subtrees.
subst_Wff transforms a formula tree by replacing selected atoms.
list is introduced because subformulas naturally form a list.
substFamily abstracts the indexed replacement operation.
So the lecture is not merely about propositional logic. It is really an introduction to the Rocq pattern:
inductive syntax + recursive functions over syntax
This pattern will return constantly in type theory, proof assistants, programming-language semantics, lambda calculus, formal grammars, operational semantics, and verified interpreters.