It would be useful to permit types to say they may only be initialized by their …defining package, for two reasons:
- types that have value-level invariants, e.g an ordered interval between two natural numbers, where start < from
- enums - i.e types which gain their utility from having a specific, limited set of legal values
Currently, it's easy for create values of either of these that violate their invariants, especially via zero-value initialization, and that allows bugs to creep in. Existing ways to attempt to prevent invalid values being created are [limited and unidiomatic](#user-content-issues-with-alternatives).
This would be a backwards-compatible change - only programs that opted a type into this new behaviour would be affected.
This proposal calls this feature 'package created types', or PCTs.
<details>
<summary>
Answers to language-changes.md questions.
</summary>
* Would you consider yourself a novice, intermediate, or experienced Go programmer?
* Experienced - 18 months writing Go at Github, 12 months at Plaid
* What other languages do you have experience with?
* C, JS, TypeScript, Python, Ruby, Erlang, Clojure, Objective C
* Would this change make Go easier or harder to learn, and why?
* Slightly harder, it's one more feature. However the compiler error messages would be able to propose a clear fix: use an exported function or value of the package that exported this type to initialize it. Unmarshalling error messages at runtime would be similar
* Has this idea, or one like it, been proposed before?/If so, how does this proposal differ?
- https://github.com/golang/go/issues/28939 - contrast: that proposal requires to changing the behaviour of existing initializations. This proposal would not change the behaviour of existing code.
- https://github.com/golang/go/issues/32076 - contrast: that proposal attempts to make some function calls implicit, this proposal doesn't
* Who does this proposal help, and why?
* Any codebase that wishes to ensure invariants are fulfilled on the type level, which provides useful guarantees when reasoning through how a piece of the code behaves
* This is especially useful for medium to large teams, where ensuring, for instance, every time a value of a certain type is initialized a constructor is used, becomes extremely hard
* Is this change backward compatible?
* Yes - no existing code would be affected.
* What is the cost of this proposal? (Every language change has a cost).
* How many tools (such as vet, gopls, gofmt, goimports, etc.) would be affected?
* I believe this would only be a change to the compiler. gofmt would depend on how indicating types was implemented - an additional keyword for a type would require an extra, simple, formatting rule
* What is the compile time cost?
* This would be an additional check for each initialisation at compile time, and map or slice type at type-checking time (ensuring they don't have values that are PCTs, or reference a PCT)
* What is the run time cost?
* Standard library unmarshalling code needs an additional check for types that don't implement their own `Unmarshal` function (to error on package created types that lack one)
* Is the goal of this change a performance improvement?
* No
* Does this affect error handling?
* No
* Is this about generics?
* No
Answered by the proposal:
* What is the proposed change?
* Please describe as precisely as possible the change to the language.
* What would change in the language spec?
* Please also describe the change informally, as in a class teaching Go.
* Show example code before and after the change.
* Can you describe a possible implementation?
* Orthogonality: how does this change interact or overlap with existing features?
</details>
## Motivation
Let’s consider an example of a data type that represents a "natural interval" defined as an interval with the invariants `From > 0` and `To > From`.
```go
package interval
type Natural struct {
From int
To int
}
```
Currently, a function accepting an `interval.Natural` can’t guarantee it is valid according to the invariants above. Invalid or zero interval values can be created outside the package and passed in a number of ways:
```go
package otherpkg
import "interval"
func withInterval(intval interval.Natural) {
/* some code that wants to rely on interval.Natural's invariants being respected by values passed as intval */
}
// directly initialized with invalid values
withInterval(interval.Natural{To: -7})
// the following examples are all ways to get a zero value of the type
var s interval.Natural
withInterval(s) // zero value
// referencing an uninitialized - zero - struct field
withInterval(struct{s interval.Natural}{}.s)
nat := new(interval.Natural)
withInterval(*nat) // access the zero-value pointed at by nat
var ms map[string]interval.Natural
withInterval(ms["anything"]) // zero-value
intervals := make([]interval.Natural, 0, 10)
withInterval(intervals[0]) // zero-value
ch := make(chan interval.T)
close(ch)
withInterval(<-ch) // receive zero value
```
The author of `interval` could define a validating constructor, which would only return valid natural ranges. However, there is no way to require it's always used, and thus it doesn't provide any guarantees.
```go
package interval
func New(from, to int) (Natural, error) { /* validation rules */ }
```
If it was possible to know that every `interval.Natural` value had been created by the `interval` package, the `withInterval` method would be able to rely on the `interval` package having been able to enforce its invariants at creation time (if the underlying type is immutable, e.g an integer, the guarantees are stronger still).
### Enums
Enums are generally new types over an underling primitive, e.g an `int` or `string`:
```go
package interval
type Kind int
const (
OrdinalInterval = iota
DayInterval
// ... etc
)
```
Again, this doesn't provide safety if other packages embed, or explicitly create, enum values:
```go
package otherpkg
import "interval"
var invalidOne = interval.Kind(47)
var invalidTwo = struct{
kind interval.Kind
}{
kind: 47,
}.kind
```
Therefore, code working with enums can't be sure they are valid.
## Desired behaviour
In this example, `interval.Natural` and `interval.Kind` are types whose values must be initialized by its package:
```go
package interval
package type Interval struct {}
package type Kind int
```
the examples below assume the following code:
```go
package interval
// since Interval is a PCT, it should export a construct if it wishes other packages
// to be able to create Interval values
func New(from, to int) (Interval, error) { /* ... */ }
// Kind is an enum, so exports a range of values. As it's a PCT, these are the only
// values of Kind that any other package can use, or expect.
const (
OrdinalInterval = iota
DayInterval
// ... etc
)
```
### Initialization <a href="#user-content-initialization" id="initialization">#</a>
PCTs cannot be initialized outside their defining package, which restricts explicit and implicit initialization:
```go
package otherpkg
import "interval"
var one interval.Natural // ERROR: 'interval.Natural' must be created by interval package
two := interval.Natural{} // ERROR: 'interval.Natural' must be created by interval package
three := interval.Natural{5, 10} // ERROR: 'interval.Natural' must be created by interval package
four := new(interval.Natural) // ERROR: 'interval.Natual' must be created by interval package
five := interval.Kind(47) // ERROR: 'interval.Kind' must be created by interval package
```
`new` can only be passed a package-created-type in its defining package.
The examples above in which attempts were made to initialize non-zero values of `Interval` can be achieved via the exported constructor. The attempts to initialize zero-values of PCTs outside their defining package are impossible, an explicit goal of this proposal:
```go
package otherpkg
import "interval"
// one, two and four are now impossible, it's explicitly a goal that we can't create zero interval.Natural values
// outside of the interval package
three, err := interval.New(5,10)
s2, err := interval.New(10, 20)
if err != nil {
// handle
}
// with no exported function returning an interval.Kind, the otherpkg must use one of interval's exported values
five := interval.DayInterval
```
### Structs <a href="#user-content-structs" id="structs">#</a>
Types that have a PCT as a field must explicitly initialized those fields. Thus any struct that contains a package created type will become liable to similar restrictions on its initialization:
```go
package otherpkg
import "interval"
type someType struct {
First interval.Natural
Second interval.Natural
}
var s someType{} // ERROR: someType's First field is a package-created type, interval.Natural, which must be initialized by the interval package
```
### Maps, slices and channels <a href="#user-content-map-slice-chan" id="map-slice-chan">#</a>
Package created types must use pointers to be valid map values, or slice/channel element, i.e `[]*interval.Natural` is okay, but `[]interval.Natural` is not. This ensures the semantics of maps, slices and channels are unaffected by this proposal, while achieving its goal of restricting value initialisation to the defining package.
```go
package otherpkg
import "interval"
// see below for details on map/slice/channels
var intervalMap map[string]interval.Natural // ERROR: interval.Natural is a package-created type, and cannot be a map value (use map[string]*interval.Natural)
intervalSlice := make([]interval.Natural, 0, 10) // ERROR: interval.Natural is a package-created type, and cannot be a slice element (use []*interval.Natural)
ch := make(chan interval.T) // ERROR interval.Natural is a package-created type, and cannot be a channel element (use chan(*interval.Natural))
```
To fix the errors the author would need to do use pointers to the PCT as the element type:
```go
package otherpkg
import "interval"
var intervalMap map[string]*interval.Natural // pointer to type means zero-value is nil, not a zero interval.Natural
var intervalSlice []*interval.Natural
var intervalCh chan *interval.Natural
```
### Operators <a href="#user-content-operators" id="operators">#</a>
Where a package-created-type's underlying type is a type on which operators are valid (e.g, an int, rather than a struct), it will be a compiler error to attempt an operation with a PCT on the [left-hand side](https://golang.org/ref/spec#Arithmetic_operators) (which determines the type of the expression evaluation):
```go
package otherpkg
import "interval"
invalidKind := interval.DayInterval + interval.HourInterval // ERROR interval.Kind is a package-created type, and cannot be created outside of the interval package
unitless := 42 + interval.DayInterval // ok, as it evaluates to an int not an interval.Kind
```
## Mechanics
### Indicating a type is package created
This proposal has no opinion on the best syntax to indicate a given type in a package should have 'package created type' behaviour. A indicative proposal would be prefixing indicating a type was package create by preceeding it with the package keyword:
```go
package interval
// indicates Interval is package created, with the behaviour above
package type Interval struct {/* ... */}
package type Kind int
```
### Unmarshalling
Unmarshalling packages could be made package created types (PCT) aware. They’d use reflection to check whether a given component of a unmarshalling target was a PCT, and return an error if that type did not export an appropriate unmarshalling function.
### Reflect
Reflect would be extended with:
```go
func (v Value) IsPackageType() bool
```
Reflect's `New(typ Type)` function, and others, would be updated to panic on invalid creation of zero values (similar to panicing on accessing unexported struct fields), or of zero values of types that reference PCTs (this would need to be a recursive check, e.g A has a field of B that has a field of C that's a PCT, means you can't `New()` a A).
### Compiler implementation
I don't know enough (anything) about Go's compiler to have any opinions on how this could be implemented. I assume it would involve updating the code that compiled/type-checked initializations.
## Issues with alternatives <a href="#user-content-alternatives" id="alternatives">#</a>
### Unexported types
<a id=issues-with-alternatives>Unexported</a> types cannot be stored in structures, maps, or slices in other packages:
```go
type wantsToStoreInterval {
interval interval.natural // error: Unexported type 'natural' usage
}
var _ map[string]interval.natural // error: Unexported type 'natural' usage
```
For unexported structs with exported methods there is a workaround - an appropriate interface can be defined on the consumer side and used to store the values. However, there is nothing to stop other packages defining methods with the same names, and again breaking the invariants.
Further, interface wrapper and accessors are not idiomatic. In Go, data and enums are usually concrete and access is not mediated via getters, unlike Java etc.
```go
// defining package must export methods to allow unexported `natural` to be stored within interface values in other packages
// which define GetFrom and GetTo
func (n natural) GetFrom() int {
return n.From
}
func (n natural) GetTo() int {
return n.From
}
```
If the defining package of the unexported type doesn't define methods, the consuming package would have to wrap the values in an `interface{}`:
```go
type wantsToStoreInterval {
interval interface{}
}
func (w wantsToStoreInterval) doSomething() {
// note: will panic if any other type of value is stored in w.interval, or it's not initialized
fmt.Println(w.interval.(interval.natural).From)
}
```
This is far from idiomatic, and imposes runtime cost, and risk of panics.
### Interfaces
Interfaces with unexported methods prevent other packages from defining their own implementation. This can get closer to guaranteeing a value was created by its package.
Again, however, this has meant data and enum members have to be wrapped in an interface value. That isn’t idiomatic in Go - data is typically ‘raw’, without indirection. It also incorrectly suggests to readers it is an abstract data type, which is a more common use for interface-wrapped data in Go. And we've now had to define a nearly identical struct and interface.
```go
package interval
type NaturalInterval interface {
GetFrom() int
GetTo() int
private() // prevents otherpkg from fully implementing Interval
}
func NewNatural(from, to int) (NaturalInterval, error) {
// ...
}
```
### Discipline: always use package's constructor
Practically today, many codebases would define `NewNatural(from, to int) (interval.Natural, error)` and trust that `NewNatural` is always used, and that zero or partially initialized `interval.Natural` values are never created.
However, this isn't a solution, it's accepting the status quo. It's equivalent to arguing that Go's memory safety isn't valuable as you can "just always remember to check your bounds". In a large codebase, with many engineers, the probability of such rules being violated somewhere approaches 1.
### Linting
Linting can implement some of the above (though not changes to stdlib marshaling/unmarshaling), but it'd be faster and more widely supported if it was a compiler feature.
## Relation to other proposals
- https://github.com/golang/go/issues/28939 - contrast: that proposal attempts to change the behaviour of existing initializations. This proposal would not change the behaviour of existing code.
- https://github.com/golang/go/issues/32076 - contrast: that proposal attempts to make some function calls implicit, this proposal doesn't