Page structure. This page is an overview of language features and implementation tasks that contribute, directly or indirectly, towards the goal of adding dependent types to Haskell. The tasks are organized into a graph. Edges point to prerequisites or subtasks. Completed tasks are highlighted with a purple border. Accepted proposals awaiting implementation are marked by an orange border. After the graph you will find a grid of cards with information about each node.
Target audience. The primary purpose of this page is to track current implementation progress and indicate the relationships between various tasks, thus helping implementors and enthusiasts to see the big picture. You may find it difficult to follow some of the task descriptions if you are new to Haskell or unfamiliar with dependent types, but you might be able to find some useful links and resources.
Common questions. Before we proceed to the current state of the project, here are answers to some questions regarding dependent types in the context of Haskell.
Like any other type system feature, dependent types allow the programmer to be more precise about the properties of their code. This has all the usual upsides: less boilerplate, fewer unsafe operations, better correctness guarantees, higher runtime efficiency. For instance, it would be possible to eliminate out-of-bounds array access without the overhead of runtime bounds checking.
The existing language extensions (DataKinds, PolyKinds, TypeFamilies, GADTs) allow us to emulate dependent types with techniques such as singleton types and defunctionalized type families. However, the compile-time and run-time costs of these techniques, as well as the associated accidental complexity, limit their applicability. We envision that proper support for dependent types will result in a language that is more well-rounded, easier to learn, and nicer to use.
At the moment, the best route to understanding dependent types is to learn a language that properly supports them (Agda and Idris are good choices for someone with a Haskell background) and then transfer your understanding back to Haskell. Once we have full support for dependent types in Haskell, it will be possible to create learning resources that do not involve other languages.
In addition to being a good purely functional programming language, Haskell has a wonderful community, a rich collection of libraries, and a high-quality compiler and runtime system. We are lucky to be able to use it in an industrial setting. The initiative to add dependent types to Haskell is motivated by real-world use cases: we spotted plentiful opportunities to apply dependent types to improve the quality of our Haskell code.
We would love to provide an estimate, but it is unclear how long each individual subtask will take. To understand the current state of the project, please consult the task graph below.
GHC is an open source compiler, and its development is inherently collaborative. While Serokell OÜ has been funding its own GHC team since 2018 with the goal of adding dependent types to Haskell, our company has designed and implemented only some of the features described on this page.
We want to acknowledge the numerous contributions by other companies, institutions, and individuals. All further uses of "we" are meant to include all relevant contributors. Please consult the referenced publications, proposals, issues, and commits to determine the authors of each contribution.
Dependent types are a language feature that enables development of correct-by-construction software. Its key ideas are dependent quantification and dependent pattern-matching. As a paradigm, dependently-typed programming also relies on type-level data and type-level computation.
To some degree, these ideas already took root in Haskell in the form of GADTs, TypeFamilies, PolyKinds and DataKinds, but the limitations of those language extensions preclude their widespread adoption. Our goal is to make dependently-typed programming in Haskell ergonomic and practical.
A typesafe list indexing operation:
index :: foreach (xs :: List a) -> Fin (length xs) -> a
index (x : xs) FZ = x
index (x : xs) (FS i) = index xs i
The bounds check is performed statically if the index is a statically known constant. Otherwise, the type checker will ensure that the caller of this function performs a runtime bounds check. Either way, index never throws a runtime exception, unlike (!!).
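For reference, the Fin type used above can already be defined with today's GADTs and DataKinds. The following is a sketch that assumes a Peano-style Nat as the length index; the foreach quantifier itself is planned syntax and not yet valid GHC:
{-# LANGUAGE DataKinds, GADTs, StandaloneKindSignatures #-}
import Data.Kind (Type)

-- Unary (Peano) naturals, promoted to the kind level by DataKinds.
data Nat = Z | S Nat

-- 'Fin n' has exactly n values: the in-bounds indices for a list of length n.
type Fin :: Nat -> Type
data Fin n where
  FZ :: Fin (S n)           -- index 0 is in bounds whenever the length is non-zero
  FS :: Fin n -> Fin (S n)  -- if i is in bounds for length n, i+1 is in bounds for length n+1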
To count Haskell among dependently-typed programming languages, we need to add support for the canonical dependent types: Σ-types and Π-types.
Dependent sums and dependent products derive their names from the mathematical notation for summation (Σ) and product (Π) respectively, so it might be helpful to see how their algebraic interpretation compares to that of existing (non-dependent) types in Haskell:
Math notation | Haskell syntax |
---|---|
0 | Void |
1 | Unit |
A + B | Either A B |
A × B | Tuple2 A B |
A → B | A -> B |
Π (a : A). F a | foreach (a :: A) -> F a |
Σ (a : A). F a | some (a :: A) | F a |
For example, given a type family F :: Bool -> Type, the dependent sum some (b :: Bool) | F b is equivalent to Either (F False) (F True).
From the standpoint of logic, Π-types and Σ-types are closely related to universal and existential quantification respectively. However, an important aspect of foreach and some is the possibility of a dependent pattern match, which makes them different from forall and exists.
To enable pattern matching, foreach/some-quantified variables must be retained at runtime, unlike forall/exists-quantified variables that are erased shortly after type checking. Consequently, the values of Σ-types and Π-types all have a particular runtime representation. Dependent sums are represented as pairs, hence their alternative name: dependent pairs; dependent products are represented as functions, hence their alternative name: dependent functions.
Π-types, also known as dependent products or dependent functions, are functions whose result type depends on the value of the argument.
A dependent function is a hybrid between a function A -> B and universal quantification forall (a :: A). F a, sharing some properties with the former and some with the latter.
The mathematical notation for a dependent function type is Π (a : A). F a, and the planned Haskell notation is foreach (a :: A) -> F a.
Σ-types, also known as dependent sums or dependent pairs, are pairs in which the type of the second component depends on the value of the first component.
More generally, any n-tuple or record type is a Σ-type if the types of some of its components (fields) depend on the values of its other components (fields).
The mathematical notation for a dependent pair type is Σ (a : A). F a, and the planned Haskell notation is some (a :: A) | F a.
As part of its compilation pipeline, GHC translates all Haskell programs to an internal language called Core.
To support dependent types in Haskell, we have two options: encode them using existing Core features or extend Core with dependent types too.
The Core language is the central part of the GHC pipeline that sits between the frontend (parser, renamer, type checker, desugarer) and the backend (STG, Cmm, code generator). The vast majority of optimizations that GHC performs are implemented as Core-to-Core transformations.
Core is a typed language whose type system is separate from that of Haskell. Its type checker is called Core Lint and is used as a sanity test for other parts of the compiler.
The current version of Core is based on System FC, a polymorphic lambda calculus with coercions. It has no support for dependent types, and known encodings based on singleton types are inefficient.
Changes to Core are rare and are not taken lightly. Developing a new formalism for Core that can accommodate both linear types and dependent types is a subject of ongoing research.
The style of dependently-typed programming that we envision for Haskell assumes seamless integration of and significant code reuse across term- and type-level definitions.
Promotion is the idea of making term-level definitions available for use in type-level contexts, and our goal is to maximize the set of data types and functions that can be promoted.
With today's DataKinds, most ADTs can be promoted. For example, the boolean data constructor True can be directly used at the type level: there is no need to introduce a separate kind KBool for type-level booleans. We get to reuse the plain old Bool.
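For instance, a type-level function over the promoted Bool can be written directly with today's extensions (a minimal sketch):
{-# LANGUAGE DataKinds, TypeFamilies #-}

-- 'True and 'False are the ordinary data constructors of Bool,
-- promoted to the type level by DataKinds.
type family Not (b :: Bool) :: Bool where
  Not 'True  = 'False
  Not 'False = 'True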
However, when we look beyond simple ADTs, promotion is often not available. Problematic examples include function types, unboxed types, constraints, and data families. In the singletons library, this issue manifests in the form of the Demote mapping, which does not fully solve the problem and introduces a great deal of complexity that we'd like to avoid.
Overloaded literals that behave as closely as possible to the ones at the term level.
Promote character literals 'x'.
Promote numeric literals 42.
At the moment, any data type that internally makes use of unboxed types cannot be promoted. This includes Integer, Text, ByteArray, and so on. The only exceptions are Char and Natural, which have ad-hoc support in the compiler.
Promote class methods for use at the type level.
Promote term-level functions for use at the type level.
Lift the restriction that data families cannot be promoted.
Closed terms are terms that contain no free variables. Consider:
v :: Vec (length [1,2,3]) Bool
v = True :. False :. True :. VNil
To typecheck this declaration, we need to reduce length [1,2,3] to 3.
GHC is already capable of evaluating closed terms at compile time by means of TemplateHaskell, so we only need to make this functionality available without explicit conversions to and from TH AST.
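For reference, the Vec type in this example is the usual length-indexed vector. Here is a sketch with today's GADTs, assuming a Peano-style Nat index, under which length [1,2,3] would reduce to a type-level 3, i.e. S (S (S Z)):
{-# LANGUAGE DataKinds, GADTs, StandaloneKindSignatures #-}
import Data.Kind (Type)

data Nat = Z | S Nat

-- A length-indexed vector: the length is part of the type.
type Vec :: Nat -> Type -> Type
data Vec n a where
  VNil :: Vec Z a
  (:.) :: a -> Vec n a -> Vec (S n) a
infixr 5 :.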
Open terms are terms that contain free variables. Consider:
x :: forall a. Num a => id a
x = 42
To typecheck this declaration, we need to reduce id a to a without instantiating a to a concrete type.
This requires an evaluator that can deal with free variables or other unknowns.
Support anonymous functions \a b -> e at the type level.
To ensure we are building on a solid foundation, we want to polish the existing type-level programming features in Haskell before we move on to dependent types.
This is a somewhat open-ended goal. The main criterion for including a task or a feature here is whether it would simplify the current practice of singletons-based programming.
We would also like to offer typing rules and a reference type checker implementation for modern Haskell, including all its type system extensions. See Typing rules for Haskell.
Declarations in a Haskell module may appear in any order, so the compiler needs to perform dependency analysis and arrange them in topological order before type checking.
At the moment, GHC partitions the module into type declarations and term declarations, and all type declarations are processed first. This will not work if our goal is to allow mixing types and terms freely.
Furthermore, the existing algorithm has trouble with type family instances.
We need to fix the current bugs in dependency analysis and then extend it to handle type declarations that depend on terms.
Add first-class existential quantification exists a. C a /\ a to the language.
At the moment, GHC does not support partial application of type families or type synonyms. Type families must be fully applied (saturated) at use sites, restricting users to first-order programming.
Consider this definition of the map function for use at the type level:
type Map :: (a -> b) -> List a -> List b
type family Map f xs where
Map _ [] = []
Map f (x:xs) = f x : Map f xs
It is not possible to pass another type family as the first argument to Map because that would require partial (unsaturated) application of the function argument.
The singletons package offers a workaround in the form of defunctionalization. Direct support for unsaturated type families would eliminate the need for that workaround.
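For illustration, here is a minimal defunctionalization sketch in the spirit of what singletons generates; the names TyFun, Apply, and NotSym are illustrative rather than the library's exact API:
{-# LANGUAGE DataKinds, PolyKinds, TypeFamilies, TypeOperators, KindSignatures #-}
import Data.Kind (Type)

-- A defunctionalization symbol: an uninhabited type that stands for a function.
data TyFun a b
type a ~> b = TyFun a b -> Type
infixr 0 ~>

-- 'Apply' interprets a symbol at an argument.
type family Apply (f :: a ~> b) (x :: a) :: b

-- Map takes a symbol rather than a type family, so it never needs an unsaturated application.
type family Map (f :: a ~> b) (xs :: [a]) :: [b] where
  Map _ '[]       = '[]
  Map f (x ': xs) = Apply f x ': Map f xs

-- Example symbol: boolean negation.
type family Not (b :: Bool) :: Bool where
  Not 'True  = 'False
  Not 'False = 'True

data NotSym :: TyFun Bool Bool -> Type
type instance Apply NotSym b = Not b
-- Map NotSym '[ 'True, 'False ]  reduces to  '[ 'False, 'True ]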
The key idea is to annotate the function arrow with matchability, so that generative and non-generative type constructors are assigned different kinds.
Add class constraints to the kinds of associated type families.
One of the never-ending challenges of language design is to come up with user-facing syntax for new features. To ease this problem, we proposed modifiers: generic, multi-purpose syntax of the form %mod.
Modifiers can be attached to arrows, types, expressions, patterns, record fields, data constructors, type declarations, and so on.
For example, LinearTypes use the syntax of modifiers to specify multiplicity:
f :: (a -> b) %1 -> a -> b -- %1 marks a linear arrow
let %1 y = g x -- %1 marks a linear let-binding
We intend to use modifiers for matchability annotations of unsaturated type families, as well as for other annotations that turn out necessary in the context of our work on dependent types.
In development; see the tracking issue for details.
Standalone kind signatures make it possible to fully specify the kind of a type constructor:
type T :: (k -> Type) -> k -> Type -- signature
data T m a = MkT (m a) (T Maybe (m a)) -- declaration
A standalone kind signature can accompany a data, newtype, type, class, type family, or data family declaration.
The kind checker enters either checking mode or inference mode depending on the presence or absence of a standalone kind signature. Polymorphic recursion is only supported in checking mode.
This feature is described in more detail in section 6.4.13.8. Standalone kind signatures and polymorphic recursion of the User's Guide.
Standalone kind signatures are modeled after term-level type signatures. They are the recommended replacement for CUSKs.
Done, except for known issues:
CUSKs (complete user-supplied kinds) are a syntactic heuristic to determine whether the kind checker should operate in checking mode or inference mode. The exact rules are laid out in section 6.4.13.7. Complete user-supplied kind signatures and polymorphic recursion of the User's Guide.
The main use case for CUSKs is to enable polymorphic recursion in type, data, and class declarations, as polymorphic recursion cannot be inferred.
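For comparison, here is what the T from the standalone kind signatures card looks like when written with a CUSK instead (a sketch):
{-# LANGUAGE CUSKs, PolyKinds, KindSignatures #-}
import Data.Kind (Type)

-- Every type variable is annotated in the head, so the declaration has a CUSK
-- and is kind-checked in checking mode, which permits the polymorphic recursion.
data T (m :: k -> Type) (a :: k) = MkT (m a) (T Maybe (m a))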
Like any other heuristic, CUSKs can be fiddly. Standalone kind signatures offer a much better alternative: instead of a complicated set of rules, the choice between checking and inference is determined by the presence or absence of a signature.
The design of standalone kind signatures mirrors that of type signatures for term-level declarations, so we would like to deprecate CUSKs in their favor.
Impact analysis is needed, and a transitional warning is desirable before deprecation takes place. See MR !1602 Disable CUSKs by default.
One of the design aspects of dependent types that we envision for Haskell is that every quantifier must come in two variants: visible and invisible.
For example, universal quantification over an erased type is possible with either the invisible quantifier forall a. or the visible quantifier forall a ->.
Invisible quantification relies primarily on type inference to instantiate type variables behind the scenes. However, it may be occasionally useful to bind or supply an invisible type argument explicitly, and to do that, one must use a visibility override, spelled @ in the Haskell syntax.
At use sites, this is done with the TypeApplications extension; at binding sites, with TypeAbstractions.
To implement Π-types, we also plan to introduce two quantifiers: the invisible quantifier foreach a. and the visible quantifier foreach a ->. We expect to fully reuse the existing machinery to deal with visibility overrides.
The type abstraction syntax can be used to bind type variables in data, newtype, type, class, type family, and data family declarations:
type C :: forall k. k -> Constraint
class C @k a where ... -- @k binds the kind variable
Invisible binders also provide a mechanism to control the arity of type synonyms and type families.
Section 6.4.17.3. Invisible Binders in Type Declarations of the User's Guide describes this feature in more detail.
Feature preview is available since GHC 9.8, and the remaining work is tracked by the following tickets:
The type abstraction syntax can be used in lambdas and function left-hand sides to bring type variables into scope in the function body:
r = higherRank (\ @a x -> ...) -- @a binds a rank-2 type
Invisible binders in functions are the modern and more flexible alternative to ScopedTypeVariables, capable of handling higher-rank scenarios.
Section 6.4.17.2. Type Abstractions in Functions of the User's Guide describes this feature in more detail.
Feature preview is available since GHC 9.10, and the remaining work is tracked by the following tickets:
The type abstraction syntax can be used in patterns to bind existential type variables associated with a GADT data constructor:
data G where { MkG :: forall e. e -> e -> G }
f (MkG @e a b) = ... -- @e binds the existential
Section 6.4.17.1. Type Abstractions in Patterns of the User's Guide describes this feature in more detail.
Most of the time GHC is capable of inferring the instantiations of invisible type variables. For example, id True internally expands to id @Bool True.
However, there are also cases when the programmer might want to visibly supply a type argument to resolve an ambiguity. Consider show . read, where it is not clear what type should be used for the intermediate representation.
With the TypeApplications extension, it is possible to write f @t to instantiate type variables at use sites.
Section 6.4.16. Visible type application of the User's Guide describes this feature in more detail.
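A small sketch of both situations:
{-# LANGUAGE TypeApplications #-}

five :: Int
five = id @Int 5                 -- explicit instantiation where inference would also work

roundTrip :: String -> String
roundTrip = show . read @Double  -- resolves the ambiguous intermediate type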
Currently this syntax works with polymorphic functions of type forall a. t, and we intend to reuse the same mechanism to visibly supply the arguments of dependent functions of type foreach a. t. Note the dot, which indicates that the argument is normally invisible and inferred.
The original implementation of the TypeApplications extension only allowed supplying type arguments at the term level:
x = Nothing @Bool
It was later generalized to allow supplying kind arguments at the type level:
type X = Nothing @Bool
In this example, the inferred kind of X is Maybe Bool.
The syntactic separation between terms and types comes apart at the seams when we allow data constructors in type-level contexts or unadorned types in term-level contexts:
a :: G (Just 10) -- terms in types (DataKinds)
b = f (Int -> Bool) -- types in terms (RequiredTypeArguments)
To better adapt to the new demands of language design, we aim to fully unify the syntax of terms and types.
The goal is to use a single internal representation for terms and types in the compiler frontend (AST, parser, renamer, type checker) and to reuse as much of the grammar and name resolution logic as possible.
Type syntax is available in expressions since GHC 9.12. However:
The use of identical names for type-level and term-level entities is called punning. Haskell makes heavy use of punning in its built-in syntax and common types.
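For example, it is idiomatic to give a data constructor the same name as its type constructor:
-- 'Pair' is punned: it names both the type constructor and the data constructor.
data Pair a b = Pair a b

-- The built-in syntax puns as well: () is both the unit type and the unit value.
unit :: ()
unit = ()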
However, there are various contexts in which an occurrence of a name may refer either to the type namespace or the data namespace, so disambiguation may be necessary:
WARNING, DEPRECATED, and ANN pragmas
The simplest way to avoid namespace ambiguity is to avoid punning entirely. However, given the pervasive use of punning in the Haskell ecosystem, even users who wish to avoid punning will inevitably end up importing modules which make use of it. Thus we need mechanisms to disambiguate the namespace on import or at use sites.
We propose to introduce two new warnings, -Wpuns and -Wpun-bindings, to facilitate writing code as if Haskell had a single namespace.
By making use of those warnings, library authors could ensure their APIs do not require explicit namespace disambiguation to use, whereas users would be able to see where and when implicit namespace disambiguation takes place.
Namespace-specified imports extend ExplicitNamespaces to allow the data and type keywords to be used as part of import or export lists, potentially with a .. wildcard:
import qualified Data.Proxy as T (type ..)
import qualified Data.Proxy as D (data ..)
The namespace specifier filters imported names, keeping only the ones from the requested namespace.
The proposal has been accepted and awaits implementation.
The built-in list syntax [a] could denote either a list of one element or the type of a list. [] could mean either an empty list or the list type constructor. Likewise, (a, b) could be either a tuple type or the tuple itself.
This form of overloading, called punning, is usually resolved based on context, i.e. whether we are in term syntax or type syntax. When we start mixing types and terms (e.g. with DataKinds or RequiredTypeArguments), it becomes a problem.
We proposed and implemented the (No)ListTuplePuns extension. List and tuple types are now available under alternative names List, Unit, Tuple2, Tuple3, and so on.
Constraint tuples and unboxed tuples are also available under alphanumeric names CUnit, CTuple2, CTuple3, Unit#, Tuple2#, Tuple3#, and so on.
Done, except for the Tuple and Constraints type families, which currently cannot be defined due to type inference issues.
Visible dependent quantification forall a -> t allows defining functions that take type arguments that are syntactically indistinguishable from term arguments:
printf "pkg: %s-%d.%d.%d.%d\n" "typelits-printf" 0 3 0 0
In this example, the formatting string is a required type argument passed and processed at compile time. See section 6.4.18. Required type arguments of the User's Guide for a comprehensive overview of this feature.
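A smaller sketch in the style of the User's Guide examples (requires the RequiredTypeArguments extension):
{-# LANGUAGE RequiredTypeArguments #-}

-- The type argument is visible: callers pass it like any other argument.
vshow :: forall a -> Show a => a -> String
vshow t x = show (x :: t)

s1 = vshow Int 42         -- the type Int appears in a term-level position
s2 = vshow Double 3.14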
The type forall a -> t differs from a Π-type in one important way: the type argument is erased. That is, a required type argument does not exist at runtime and does not support dependent pattern-matching.
By implementing RequiredTypeArguments first, we have been able to tackle many challenges related to parsing and name resolution, so that by the time we get to Π-types, we will not get sidetracked by the minutiae of syntax.
Concrete results include the specification of a resolved syntax tree, the t2t-transformation, and types-in-terms.
Feature preview is available since GHC 9.10, and the remaining work is tracked by the following tickets:
Visible forall in pattern synonyms
Reject puns in T2T
Error message hints: use the env instead of the namespace
No -Wbadly-staged-types warning for required type argument
Panic with RequiredTypeArguments and EmptyCase
Haddock's support for VDQ looks incomplete
Visible forall in GADTs
Visible forall in class methods
Partial type signature solution too rigid when using required type arguments
In order to implement required type arguments, we needed to settle the question of whether forall a -> t and forall a. t are equal types in Core.
After a long discussion and playing around with ideas like visibility subsumption, we decided on making those types distinct but related by a coercion.
The Core language has been extended with visibility coercions, which unblocked the work on required type arguments.
At the moment, type variables in GADTs are subject to two restrictions:
We propose to lift both restrictions to enable library authors to design better APIs.
The proposal has been accepted and awaits implementation.
To allow resolving namespace-related conflicts, we introduced the type t syntax in patterns and expressions. The new syntax is guarded behind ExplicitNamespaces and allows embedding types as subexpressions:
idv (type a) (x :: a) = x -- `type t` pattern
n = idv (type Int) 42 -- `type t` expression
This is only useful to disambiguate namespaces in required type arguments. We recommend omitting type whenever possible.
Our long-term goal is to allow freely mixing terms and types, including the ability to use term-level functions and variables in types:
fn x = ... where
g :: x -> x -- `x` used in a type signature
g = ...
Term variable capture is going to become especially useful in combination with function promotion and dependent pattern-matching. In the short term, we can use it to bind required type arguments.
Prior to our work on GHC, the x variable bound in the term namespace could never be mentioned in a type signature.
We proposed and implemented new name resolution rules that allow term variable capture, currently guarded behind the RequiredTypeArguments extension.
Done. Library authors are kindly asked to ensure their code does not trigger the -Wterm-variable-capture warning.
The * notation for the kind of proper types, i.e. types that classify terms, has traditionally been used in kind signatures such as Int :: * or Maybe :: * -> *.
Unfortunately, as types and kinds have been merged, this notation began to conflict with type operators: a * b can be read as either a (*) b or (*) a b. As a temporary workaround, we introduced the StarIsType extension to control this.
The problem becomes more pressing when we attempt to unify terms and types, as * is commonly used to denote multiplication. We had to decide on a single meaning for the star syntax, and multiplication is a clear winner.
We are nearing the end of an 8-year-long migration period. Library authors are advised to switch from * to Type.
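The migration itself is mechanical; a sketch of the new spelling:
{-# LANGUAGE NoStarIsType, KindSignatures #-}
import Data.Kind (Type)

-- Formerly written as:  data App (f :: * -> *) a = App (f a)
data App (f :: Type -> Type) a = App (f a)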
In the past, forall used to be a keyword under two conditions:
The first condition has been dropped to improve error messages, and the second condition is at odds with unification of type and term syntax. As a result, we ended up making forall a proper unconditional keyword.
To soften the impact of this breaking change, we implemented a transitional warning -Wforall-identifier, and also contacted maintainers of all affected packages.
Thanks to this change, we have been able to extend RequiredTypeArguments with direct support for forall-types in argument positions.
One might be surprised to learn that equality constraints a ~ b used to be implemented as built-in syntax rather than a straightforward application of the type-level operator (~).
In fact, (~) was not a syntactically valid operator at all and was rejected in terms:
(a ~ b) = ... -- syntax error
Building upon our whitespace-sensitive operator parsing framework, we reimplemented equality constraints as infix applications of the tilde operator, exported from Data.Type.Equality.
This makes the syntax of types and terms more uniform and simplifies the language, but it is also a breaking change, as (~) is no longer magically in scope.
We are following the usual procedure for language changes: first a proposal, then transitional warnings -Wtype-equality-out-of-scope and -Wtype-equality-requires-operators, and now we are in the waiting period before we can remove the compatibility fallbacks.
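With the new scheme, (~) behaves like any other type operator and can be brought into scope with an ordinary import (a sketch; recent GHC versions):
{-# LANGUAGE GADTs, TypeOperators, ExplicitNamespaces #-}
import Data.Type.Equality (type (~))

-- (~) is an ordinary infix type operator rather than built-in syntax.
cast :: a ~ b => a -> b
cast x = x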
Prior to our work, the function composition operator (.) could not be defined at the type level:
type (f . g) x = f (g x) -- syntax error
We extended the syntax of TypeOperators to allow the dot operator in types, thereby bringing the syntax of types and terms closer together.
To facilitate unification of term- and type-level parsers, we had to find a way to minimize operator-related discrepancies in the syntax. Of particular interest are (~), (!), (@), (-), (.), ($), and ($$), all of which can also act as special syntax depending on enabled extensions.
Our solution is to use surrounding whitespace to classify each operator occurrence into one of four categories:
Example | Classification |
---|---|
a ! b | loose infix |
a!b | tight infix |
a !b | prefix |
a! b | suffix |
Depending on the surrounding whitespace, we can introduce special parsing rules for individual operators. For example, ! could be used either as an ordinary infix operator a ! b or as a bang pattern f !x.
Within this framework of whitespace-sensitive operator parsing, we have been able to reconcile type applications with as-patterns, laziness annotations with equality constraints, bang-patterns with infix operators, and so on.
Done, except for one known issue:
-Woperator-whitespace ignores (*), (.), and (-)
The original implementation of kind polymorphism (PolyKinds) offered no way to explicitly bring a kind variable into scope. Instead, users just mentioned k in an appropriate context and the kind variable came into being.
Because of this legacy, GHC continued to treat kind variables differently from type variables when it came to name resolution. In particular, kind variables were not subject to the forall-or-nothing rule and had unusual interactions with ScopedTypeVariables.
This meant that, for reasons of backwards compatibility, GHC had to maintain a distinction between kind and type variables, even though this distinction was superficial: it had no effect outside implicit quantification.
The discrepancy was eliminated by following the usual procedure: first a proposal to get the changes approved by the committee, then a compatibility warning to help users update their code, and finally a breaking change to simplify the language.
GHC has been accumulating DT-adjacent features for a long time now. For example, GADTs date back as far as GHC 6.4 (March 2005). However, to keep the scope of this page within reasonable limits, we must choose a more recent cut-off point.
The introduction of the Type :: Type axiom in GHC 8.0 (May 2016) is a rather appropriate milestone for that purpose. This change blurred the line between types and kinds to such an extent that a single forall-quantified variable could be used both as a type variable and a kind variable in one signature.
Aside from various anomalies in the surface language, one could say that the TypeInType extension unified types and kinds in GHC and opened up the next frontier: unifying types and terms.