Skip to content
This repository has been archived by the owner on Mar 14, 2024. It is now read-only.

Latest commit

 

History

History
215 lines (163 loc) · 9.1 KB

subtypes.md

File metadata and controls

215 lines (163 loc) · 9.1 KB

A Theory of Subtyping for Go

In Featherweight Go, the subtyping judgement (Δ ⊢ τ <: σ) plays a central role. However, subtyping is notably absent from both the Go specification (which instead relies on assignability) and the Type Parameters draft (which uses a combination of interface implementation and type list matching). Here, I examine a general notion of subtyping for the full Go language, derivable from assignability.

What is a type?

According to the Go specification, “[a] type determines a set of values together with operations and methods specific to those values”.

What is a subtype?

Formal definitions of a “subtype” vary. Fuh and Mishra defined subtyping “based on type embedding or coercion: type t₁ is a subtype of type t₂, written t₁t₂, if we have some way of mapping every value with type t₁ to a value of type t₂.” That would seem to correspond to Go's notion of conversion operation.

However, Reynolds defined subtypes in terms of only implicit conversions: “When there is an implicit conversion from sort ω to sort ω′, we write ω ≤ ω′ and say that ω is a subsort (or subtype) of ω′. Syntactically, this means that a phrase of sort ω can occur in any context which permits a phrase of sort ω′.” This definition maps more closely to Go's assignability, and to the “implements” notion of subtyping used in Featherweight Go, and it is the interpretation I will use here.

Notably, all commonly-used definitions of subtyping are reflexive: a type τ is always a subtype of itself.

What is a subtype in Go?

Go allows implicit conversion of a value x to type T only when x is “assignable to” T. Following Reynolds' definition, I interpret a “context which permits a phrase of sort ω′” in Go as “an operation for which ω′ is assignable to the required operand type”. That leads to the following definition:

In Go, T1 is a subtype of T2 if, for every T3 to which a value of type T2 is assignable, every value x of type T1 is also assignable to T3. (Note that the precondition is “x of type T1”, not “x assignable to T1”. The distinction is subtle, but important.)

Let's examine the assignability cases in the Go specification.

A value x is assignable to a variable of type T ("x is assignable to T") if one of the following conditions applies:


  • x's type is identical to T

This gives the two reflexive rules from the Featherweight Go paper:

----------
Δ ⊢ α <: α
------------
Δ ⊢ τS <: τS

For the full Go language, we can generalize τS to include the built-in non-interface composite types (arrays, structs, pointers, functions, slices, maps, and channels), boolean types, numeric types, and string types, which we collectively call the concrete types.

This part of the assignability definition also allows a (redundant) rule for interface types, which is omitted from Featherweight Go:

------------
Δ ⊢ τI <: τI

This case could in theory make each literal type a subtype of every defined type, provided that the defined type has no methods (and thus cannot be assigned to any interface with a non-empty method set).

literal(τ)   Δ ⊢ underlying(σ) = τ   methodsΔ(σ) = ∅
----------------------------------------------------
                     Δ ⊢ τ <: σ

However, this rule is inadmissibly fragile: if a method were to be added to T, as is allowed by the [Go 1 compatibility guidelines][], then V would cease to be a subtype of T. Although there may be a subtype relationship today between such types, no program should rely on it. (Consider an example in which a defined type gains a trivial String method.)


This leads to the interface-subtyping rule <:I from Featherweight Go:

methodsΔ(τ) ⊇ methodsΔ(τI)
--------------------------
       Δ ⊢ τ <: τI

  • x is a bidirectional channel value, T is a channel type, x's type V and T have identical element types, and at least one of V or T is not a defined type.

This case is analogous to the case for “identical underlying types”. It makes the literal type chan T a subtype of both <-chan T and chan<- T, and by extension a subtype of any defined type with <-chan T or chan<- T as its underlying type.

----------------------
Δ ⊢ chan τ <: <-chan τ
----------------------
Δ ⊢ chan τ <: chan<- τ

Unlike a defined type, a literal channel type with a direction can never acquire methods in the future. However, we cannot exclude the possibility of new interface types themselves: for example, if sum types are ever added to the language, chan T could not reasonably be assignable to a sum type that allows both <-chan T and chan<- T, since we could not determine to which of those types it should decay. However, each of <-chan T and chan<- T could individually be assignable to such a sum.


  • x is the predeclared identifier nil and T is a pointer, function, slice, map, channel, or interface type.

The predeclared identifier nil does not itself have a type, so this case does not contribute any subtypes. (If it did, then the nil type would be a subtype of all pointer, function, slice, map, channel, and interface types.)


As with nil, this case does not contribute any subtypes.

Conclusion

Having examined the above cases, we find:

  • A type T is always a subtype of itself.
  • A type T is a subtype of every interface that it implements.
  • Additional subtype relationships exist, but are too fragile to rely on.
    • A literal composite type is a subtype of any defined type that has it as an underlying type and has no methods, but Go 1 compatibility allows methods to be added to defined types.
    • A bidirectional channel type literal is a subtype of the corresponding directional channel type literals, but only provided that sum types are not added to the language.

Thus, the only useful subtyping relationship in Go is: “T1 is a subtype of T2 if T1 is or implements T2“.

Appendix: Why “of type” instead of “assignable to”?

I could have chosen a different definition of subtyping, based on assignability to type T1 instead of a value of type T1. That would give a rule something like: “T1 is a subtype of T2 if, for every value x assignable to T1, x is also assignable to T2.“ Why not use that definition instead?

Unfortunately, that alternate definition would mean that even interface types do not have subtypes: if defined type T has a literal underlying type U and implements inteface I, then a value of type U is assignable to T, but not to I, so T could not be a subtype of I. A notion of subtyping that does not even capture simple interface-implementation matching does not seem useful.