A Typed Programming Language

CommunityNews · 23 March 2022 13:42

ABSTRACT

In the rank-polymorphic programming model, all functions operate on aggregate data of arbitrarily high rank, or number of dimensions. During function application, an argument array is split into cells, the individual components the function expects to consume. For example, an RGB-to-greyscale pixel transform operates on each vector in an arbitrarily large array. The aggregate structure surrounding the cells, called the frame, serves as the iteration space for cell-wise function application. The programming model was first developed by Iverson with the language APL [43], but it struggled with a barrier to efficient compilation: Loop nesting structure is derived from data computed at run time.
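To make the frame/cell split concrete, here is a minimal sketch in plain Python rather than Remora. The helpers `shape_of`, `apply_cellwise`, and `grey` are hypothetical names invented for this illustration, and the channel-averaging formula is just a stand-in for a real greyscale transform. Note that the frame is discovered at run time here, which is exactly the situation the abstract says makes compilation hard.

```python
def shape_of(array):
    """Return the shape of a regular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(array, list):
        shape.append(len(array))
        if not array:
            break                    # empty axis: nothing further to inspect
        array = array[0]
    return tuple(shape)


def apply_cellwise(fn, cell_rank, array):
    """Apply fn to every cell of rank cell_rank; the leading axes form the frame."""
    frame_rank = len(shape_of(array)) - cell_rank
    if frame_rank < 0:
        raise ValueError("argument rank is lower than the expected cell rank")

    def go(sub, depth):
        if depth == 0:
            return fn(sub)           # reached a cell
        return [go(item, depth - 1) for item in sub]

    return go(array, frame_rank)


def grey(rgb):
    """Hypothetical cell-level function: average the three channels of one pixel."""
    r, g, b = rgb
    return (r + g + b) / 3


# A 2x2 frame of 3-vector cells.
image = [[[0, 0, 0], [30, 60, 90]],
         [[255, 255, 255], [10, 20, 30]]]
grey_image = apply_cellwise(grey, 1, image)
```

The same `grey` function works unchanged on a single pixel, a row of pixels, or a stack of images, because only the frame changes.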

This dissertation presents the design and formal semantics of Remora, a higher-order, rank-polymorphic programming language with a static type system which identifies the shape of run-time data. This overview is followed by formal semantics for a core language. Remora’s static semantics ascribes to each expression a type which describes the shape of the resulting array. Quantification over the shape of cells and the type of atoms within an array is explicit, but the polymorphism over frames is entirely implicit. That is, a function’s type only describes its cell-level behavior, while implicit iteration—which is common to all functions—is identified by typing rules. A type-driven dynamic semantics determines the iteration space for functions applied to computed array data, and a type soundness theorem ensures that the types—and shapes—ascribed to expressions match those of their eventual results.

While frame polymorphism is instantiated implicitly in Remora’s formal semantics, explicitly instantiating cell polymorphism is a severe annotation burden. For example, a vector-mean function can be used on a 3×5×4 array with no explanation that the array is a 3×5 frame, but the function must be explicitly instantiated to operate on vectors of length 4. That burden is alleviated by a bidirectional typing system which uses a novel constraint solver for the theory of array shapes to identify implicit dimension and shape arguments. The vector-mean function can then be applied directly to the 3×5×4 array, with bidirectional rules elaborating to code which explicitly instantiates it for 4-vector cells.
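As a rough illustration of what "identifying implicit dimension arguments" means for the vector-mean example, the sketch below matches the trailing axes of an argument shape against a cell pattern containing a dimension variable. This is only a toy suffix match and is not the dissertation's constraint solver; `split_frame` and the string-as-variable encoding are assumptions made for this example.

```python
def split_frame(arg_shape, cell_pattern):
    """Match the trailing axes of arg_shape against cell_pattern.

    cell_pattern mixes fixed sizes (ints) and dimension variables (strings).
    Returns (frame_shape, bindings) or raises ValueError on a mismatch.
    """
    cell_rank = len(cell_pattern)
    if cell_rank > len(arg_shape):
        raise ValueError("argument has too few axes for this cell shape")
    split = len(arg_shape) - cell_rank
    frame, cell = arg_shape[:split], arg_shape[split:]
    bindings = {}
    for want, have in zip(cell_pattern, cell):
        if isinstance(want, str):
            # Dimension variable: bind it, or check it against an earlier binding.
            if bindings.setdefault(want, have) != have:
                raise ValueError(f"conflicting sizes for {want}")
        elif want != have:
            raise ValueError(f"expected axis of size {want}, got {have}")
    return tuple(frame), bindings


# vector-mean consumes a vector of some length n; apply it to a 3x5x4 array.
frame, dims = split_frame((3, 5, 4), ("n",))
```

Here `frame` comes out as `(3, 5)` and `dims` as `{"n": 4}`, mirroring the abstract's reading of the array as a 3×5 frame of 4-vectors.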

Two translation steps link Remora’s formal semantics to conventional rank-monomorphic languages with explicit iteration. While Remora’s dynamic semantics relies heavily on run-time type information, a type erasure pass can change from carrying full type information in dynamically created closures and arrays to describing argument and iteration-space shapes statically at sites. With that shape information at each call site, the program can be translated from using rank-polymorphic function calls to rank-monomorphic explicit iteration.
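For a sense of what the target of that translation might look like, here is a hand-written sketch of vector-mean specialised to one call site where the frame is known to be 3×5 and the cells are 4-vectors. It is not output from any Remora compiler; the function name and the flat row-major layout are assumptions chosen for illustration.

```python
def vector_mean_3x5x4(flat):
    """Rank-monomorphic version: frame (3, 5) and cell length 4 are fixed constants."""
    if len(flat) != 3 * 5 * 4:
        raise ValueError("expected 60 elements in row-major order")
    out = [0.0] * (3 * 5)
    for i in range(3):               # first frame axis
        for j in range(5):           # second frame axis
            base = (i * 5 + j) * 4   # start of this cell in the flat buffer
            total = 0.0
            for k in range(4):       # the cell itself
                total += flat[base + k]
            out[i * 5 + j] = total / 4
    return out


data = [float(x) for x in range(60)]
means = vector_mean_3x5x4(data)
```

Every loop bound is a literal, so no shape inspection happens at run time.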

Read in full here:


OvermindDL1 · 28 March 2022 20:13

I still find it better to define the shape of your data first then write functions to transform between them rather than have the machine try to infer it (like how V8 does)…