A pure functional language is one in which everything is immutable. Such a language has the property that every expression is referentially transparent, i.e. it can be replaced with its value without changing the meaning of the program.
Functional purity is not necessary for referential transparency, however, and it forbids useful idioms. A weaker rule that nonetheless guarantees referential tansparency is that no expression is both shared and mutable. Mark Wainwright suggests that a language with this property should be called “lightly soiled”.
As I write this, Rust is the main example of a lightly soiled language. It has a static type system (“borrow checker”) that checks the “shared NAND mutable” property. Its static type system has additional goals too, such as tracking responsibility for freeing memory. A common criticism of Rust’s borrow checker is that it is difficult to learn how to use it.
Here I attempt to make a “lightly soiled lambda calculus” as a prototypical example of a lightly soiled language. It simplifies Rust’s design by lowering its ambitions:
In an ordinary type system, each expression that stands for a value is assigned a type, which describes and restricts the possible values. In addition, the lightly soiled lambda calculus assigns each expression a set of duties, which restrict how the value can be used.
There are two possible duties:
Each of these is a trade-off. Taking on a duty permits behaviour or assumptions which would not otherwise be allowed:
Other duties are conceivable, but not needed here.
Function types are also annotated with duties which restrict how the function uses its free variables. Specifically, a function whose type is marked as constant cannot mutate its free variables.
If variable has certain duties, and stands for a function of a type that does not respect those duties, then the function cannot be called. Specifically, a function that mutates its free variables cannot be called via a variable marked as constant.
It is always safe to treat an expression as having additional duties, but duties can never be removed or otherwise ignored.
It is always safe to treat a function as respecting fewer duties, but never more.
Assume a “ground language”, consisting of things like integers, strings, and operations on them. The ground language also includes all the primitive mutation and I/O operations. The details of the ground language are abstract here, so I will make use of non-terminals <ground-expr>
and <ground-type>
, being respectively the expressions and types that come from the ground language.
Assume a non-terminal <name>
that represents valid variable names.
The grammar uses the following keywords:
λ
to define a function.ᶜ
to indicate constant values.ᵈ
to indicate doomed values.,
to separate list items, and brackets to enclose them.:
to introduce a type annotation.Please read <foo> , ...
as a comma-separated list of zero or more <foo>
s.
Please read <foo>?
as an optional <foo>
.
As in the lambda calculus, an expression (other than a <ground-expr>
or a bracketed expression) must be a variable name, a function application, or a function definition:
<expr> ::=
<ground-expr>
( <expr> )
<name>
<expr> ( <expr> , ... )
λ ( <param> , ... ) <expr>
Note that functions in this language take zero or more arguments, in contrast to the simply-typed lambda calculus in which functions always take one argument. This is to obviate Currying, which is not always possible.
As in the simply-typed lambda calculus, function parameters require explicit type annotations:
<param> ::=
<name> : <duty-type>
In the function definition λ(x₁: u₁, ...) e
the names x₁, ...
must all be distinct.
As in the simply-typed lambda calculus, a type (other than a <ground-type>
or a bracketed type) must be a function type:
<type> ::=
<ground-type>
( <type> )
λ ᶜ? ( <duty-type> , ... ) <duty-type>
Note that two kinds of function type are distinguished, according to the ᶜ
flag.
<duty-type>
adds duties to <type>
:
<duty-type> ::= ᶜ? ᵈ? <type>
As in the simply-typed lambda calculus, the type-checking rules make use of the concept of an <environment>
. The purpose of the <environment>
is to track the free variables of an <expr>
and the way in which they are used. <environment>
s are a concept internal to the type-checker, and never appear in a program.
Define an <environment>
to be a finite partial map from <name>
to <duty-type>
. I will use the syntax { <name> : <duty-type> , ... }
to write literal <environment>
s.
Boolean variables c
, d
stand for duty flags. e
stands for an <expr>
, t
for a <type>
, u
for a <duty-type>
, and x
for a <name>
. The greek letter Γ
stands for an <environment>
.
Define predicates is_constant
and is_doomed
on <duty-type>
in the obvious way.
Erasing all type and duty annotations from a lightly-soiled lambda calculus expression gives an untyped lambda calculus expression, which has roughly its usual semantics. However, the values manipulated by the lambda calculus expression are references into a heap, and can be used to mutate the contents of the heap.
Let <ref>
be a set equipped with an allocate
procedure that produces elements that have never been used before.
A <scope>
is a finite map from <name>
to <ref>
.
A <heap>
is a finite map from <ref>
to <value>
.
A <value>
(other than a <ground-value>
) is a closure:
<value> ::=
<ground-value>
λ <scope> ( <name> , ... ) <expr>
The <value>
λS(x₁, ...) e
is well-formed iff the domain of the <scope>
S
excludes the set of bindings {x₁, ...}
.
Write (e, S, H⁻) → (r, H⁺)
to mean the <expr>
e
in <scope>
S
reduces to the <ref>
r
while mutating the <heap>
H⁻
into H⁺
. The relation →
is defined inductively by the following rules:
Ground: (x, S, H⁻) → (r, H⁺)
if the ground language says so.
Variable: (x, S, H) → (S(x), H)
Function application: (e₀(e₁, ...), S, H⁻) → (r, H⁺)
if:
n
is the number of arguments(e₀, S, H⁻) → (r₀, H₀)
i
from 1
to n
:
(eᵢ, S, Hᵢ₋₁) → (rᵢ, Hᵢ)
Hₙ(r₀) = λ S₀ (x₁, ...) e
S = S₀ ∪ {x₁: r₁, ...}
(e, S, Hₙ) → (r, H⁺)
Function definition: (λc(x₁: u₁, ...) e, S, H⁻) → (r, H⁺)
if:
r
is not in H
, i.e. we generate it using allocate
.H⁺ = H⁻ ∪ {r: λS'(x₁, ...) e}
S'
is S
restricted to the free names of λc(x₁: u₁, ...) e
.Note that the type and duty annotations are ignored. They may be erased from the program before executing it, to save time and memory.
Note that the heap, which is carefully threaded through the evaluation sequence, is not mutated by any rule except “Ground”. The rule “Function definition” extends the heap but does not touch anything that was already in it. If not for the rule “Ground”, we could replace every <ref>
by its corresponding <value>
, to obtain the familiar untyped lambda calculus.
For simplicity, this implementation leaks memory. Frequently, we end up with a <heap>
that records the value of a <ref>
that can never again be reached. Garbage collecting such heap items is harmless, and can be achieved using e.g. reference counting.
The duties induce a subtype relation. Since we are dealing with subtypes anyway, we further allow the ground language to contribute to the subtype relation. Subtyping of functions is defined in the usual way.
The formal definition follows:
A <ground-type>
t⁻
is a subtype of another t⁺
if the ground language says it is.
A function type λc⁺(u₁⁺, ...) u⁻
is a subtype of another λc⁻(u₁⁻, ...) u⁺
iff:
c⁺ ⇐ c⁻
uᵢ⁺
is a supertype of uᵢ⁻
for each i
u⁻
is a subtype of u⁺
A <duty-type>
c⁻d⁻t⁻
is a subtype of another c⁺d⁺t⁺
iff:
c⁻ ⇒ c⁺
d⁻ ⇒ d⁺
t⁻
is a subtype of t⁺
An <environment>
Γ⁻
is a subtype of another Γ⁺
iff:
x
such that Γ⁺(x)
is defined:
Γ⁻(x)
is a subtype of Γ⁺(x)
We define an <expr>
e
to be well-typed with <duty-type>
u
in <environment>
Γ
iff we can derive a “typing judgement” Γ ⊢ e : u
using the following rules:
Subtype: Γ⁻ ⊢ e : u⁺
if:
Γ⁺ ⊢ e : u⁻
u⁻
is a subtype of u⁺
Γ⁻
is a subtype of Γ⁺
Ground: {} ⊢ e : u
if the ground language says so.
Variable: {x: p} ⊢ x : p
Function application: Γ ⊢ e₀(e₁, ...) : u
if:
i
:
Γ
is a subtype of Γᵢ
Γᵢ ⊢ eᵢ : uᵢ
u₀ = cdλc(u₁, ...) u
i < j
, x
such that Γᵢ(x)
and Γⱼ(x)
are both defined:
is_constant(Γᵢ(x))
and is_constant(Γⱼ(x))
:
is_doomed(Γᵢ(x))
¬is_doomed(uᵢ)
Function definition: Γ ⊢ λc(x₁: u₁, ...) e : dλc (u₁, ...) u
if:
Γ(x₁)
, … is definedx
such that Γ(x)
is defined:
c ⇒ is_constant(Γ(x))
is_doomed(Γ(x)) ⇒ d