TM014 Module Loading

Joe Politz edited this page Jul 13, 2016 · 8 revisions

Module System Goals

Module Locating Control

Instead of being beholden to requirejs's rules for locating modules, which are based on requirejs-specific conventions about the filesystem and relative URLs, we can write custom Pyret/JS code to handle different kinds of imports. This is actually how my-gdrive and shared-gdrive have worked for a while (the module design was about halfway committed as of last summer), but now everything is built on this infrastructure, including bootstrapping the compiler, building CPO, etc.

It's not implemented yet, but one thing this will let us do is have a uniform import style across web and command line. For example, this import:

import path("relative/path/to/lib.arr") as L

can use relative filepaths from the CLI, but use GDrive directory traversal on the web, simply by changing the implementation of a well-specified interface for what it means to be a "path" dependency.
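As a rough illustration of what "changing the implementation of a well-specified interface" could look like, here is a minimal JavaScript sketch. Everything in it is an assumption made for the example: the names makePathLocator, resolve, and read, the "file:" and "gdrive:" URI prefixes, and the in-memory tables standing in for the filesystem and for GDrive are all hypothetical, not Pyret's actual locator API.

```javascript
// Illustrative only: every name here (makePathLocator, resolve, read, the
// "file:" and "gdrive:" URI prefixes) is hypothetical, not Pyret's real API.

const LIB_SOURCE = "provide *\nfun double(x): x * 2 end";

// Resolve `rel` against the directory that contains `fromFile`.
function joinRelative(fromFile, rel) {
  const parts = fromFile.split("/").slice(0, -1); // drop the file name
  for (const seg of rel.split("/")) {
    if (seg === "..") parts.pop();
    else if (seg !== "." && seg !== "") parts.push(seg);
  }
  return parts.join("/");
}

// CLI-style backend: an in-memory table stands in for the filesystem.
const cliFiles = { "/proj/lib/util.arr": LIB_SOURCE };
const cliBackend = {
  resolve: (fromFile, rel) => "file:" + joinRelative(fromFile, rel),
  read: (uri) => cliFiles[uri.slice("file:".length)],
};

// Web-style backend: folders map entry names to ids, mimicking a
// directory traversal over a drive-like store.
const driveFolders = {
  "folder-root": { lib: "folder-lib" },
  "folder-lib": { "util.arr": "file-42" },
};
const driveFiles = { "file-42": LIB_SOURCE };
const driveBackend = {
  resolve(fromFolderId, rel) {
    let current = fromFolderId;
    for (const seg of rel.split("/")) {
      const entries = driveFolders[current];
      if (!entries || !(seg in entries)) {
        throw new Error("Not found: " + rel);
      }
      current = entries[seg];
    }
    return "gdrive:" + current;
  },
  read: (uri) => driveFiles[uri.slice("gdrive:".length)],
};

// The "path" dependency interface: the same import description works with
// any backend that supplies resolve and read.
function makePathLocator(backend) {
  return function locate(importSpec, context) {
    if (importSpec.protocol !== "path") {
      throw new Error("Unsupported import protocol: " + importSpec.protocol);
    }
    const uri = backend.resolve(context, importSpec.args[0]);
    return { uri: uri, getSource: () => backend.read(uri) };
  };
}

// One import description, two environments.
const spec = { protocol: "path", args: ["lib/util.arr"] };
console.log(makePathLocator(cliBackend)(spec, "/proj/main.arr").uri);
console.log(makePathLocator(driveBackend)(spec, "folder-root").uri);
```

The point of the sketch is only that the import description stays the same while the backend changes. How the real locators represent contexts and URIs is a separate design question.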

Pyret (and JS) Modules Outside of pyret-lang

Since we have much better control over locating and identifying modules, it's now much easier to build a standalone JavaScript file in many different contexts. CPO does this now: it builds a standalone JS file that includes requirejs-formatted modules for UI behavior as well as some Pyret files and libraries like d3 that really don't belong in pyret-lang (since they only make sense in the browser right now). This also allows the Pyret standalone for CPO to avoid including things like the command-line "main" program for Pyret that reads directly from the filesystem, which is somewhat nonsensical for CPO to try doing.

There are, of course, a number of design decisions in how we set up the build of pyret-lang vs. CPO, and where modules ought to live. The point is, we have much more flexibility in how we set that up now, and not everything revolves around standalone blobs from pyret-lang linked magically against other JS at load time.

Static Module Specification

Compiled (or handwritten JS) modules now specify their static information purely statically (what a novel idea!). In the past, there was a requirejs instantiation step needed to get at module contents, including static information like the provided identifiers and types of a module. This caused some annoying dependencies on requirejs instantiation in the middle of, say, trying to type-check things.

The specification of static information is in pure JSON, which makes it much easier for many different kinds of tools (pure JS or Pyret) to consume module metadata.

Handwritten JS modules can also specify dependencies on Pyret files statically, which was not possible before. This allows handwritten JS modules to participate in the topological sort of module dependencies, and generally act more like any other Pyret module. A main use case for this is writing a Pyret file that contains some data definitions, then writing a JS file that imports them, does some internal magic, and exports functions that use those data definitions (the plot library is a good example).
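To make the idea concrete, the following JavaScript sketch shows how dependency and export information held as plain data lets a tool order modules without running any of them. The field names (requires, provides, instantiate) and the module names are assumptions for this sketch and may not match Pyret's exact module format.

```javascript
// Illustrative only: the field names (requires, provides, instantiate) and
// module names are assumptions for this sketch, not Pyret's exact format.

let instantiations = 0; // counts how often any module body actually runs

function body() {
  instantiations += 1;
  return {};
}

// Each entry separates static data (requires, provides) from the code that
// would run at load time (instantiate).
const moduleTable = {
  "plot-structs.arr": {
    requires: [],
    provides: { values: ["make-point"], types: ["Point"] },
    instantiate: body,
  },
  // A handwritten JS module that statically depends on a Pyret file.
  "plot.js": {
    requires: ["plot-structs.arr"],
    provides: { values: ["draw-plot"], types: [] },
    instantiate: body,
  },
  "main.arr": {
    requires: ["plot.js", "plot-structs.arr"],
    provides: { values: [], types: [] },
    instantiate: body,
  },
};

// Depth-first topological sort that reads only the static `requires` lists.
function topoSort(table, root) {
  const sorted = [];
  const state = new Map(); // uri -> "visiting" | "done"
  function visit(uri) {
    if (state.get(uri) === "done") return;
    if (state.get(uri) === "visiting") {
      throw new Error("Cyclic dependency at " + uri);
    }
    if (!Object.prototype.hasOwnProperty.call(table, uri)) {
      throw new Error("Unknown module: " + uri);
    }
    state.set(uri, "visiting");
    for (const dep of table[uri].requires) visit(dep);
    state.set(uri, "done");
    sorted.push(uri); // dependencies are pushed before their dependents
  }
  visit(root);
  return sorted;
}

console.log(topoSort(moduleTable, "main.arr"));
console.log(moduleTable["plot.js"].provides.values);
console.log("module bodies run:", instantiations);
```

Because the handwritten JS module declares its Pyret dependency as data, it lands in the load order like any other module, and a type-checker could read its provided names without triggering instantiation.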

Towards Unified Specification of Builtins

The type system has always relied on a file called "type-defaults.arr" to specify types for builtin modules. Since the builtin modules still don't quite type-check, that will remain for a while. However, the functions provided by the runtime itself are all in pure JS, and need types specified manually. This can now be done using the same type specification language used by all the compiled modules, removing a big chunk of type-defaults. Moving forward, more and more of that type specification in code should turn into static specification in data.

More First-order Information

The main feature the module system provided last year was a simple but useful one: enough first-order information about provided types to make "include" work (albeit in somewhat limited circumstances, but enough for it to work for BS:2 on CPO, which was the main thing we wanted).

We have much more first-order information across modules now, which allows for some useful optimizations, even without full type-checking information. For example:

Right now every annotation has to check, at creation time, if any of its pieces contain a refinement, and set a flag if they do. The compiler has to generate code to dynamically check this flag and decide whether to use the stack-saving calling convention or not, since a refinement could blow the stack. (A fun test: write an annotation that succeeds or fails based on the result of a game played with big-bang. Yes, that works.) Now the compiler can statically determine which annotations could contain refinements, and simply not generate the callbacks/stack-safe code at all except when absolutely necessary.

Right now "cases" compilation is hampered by a lack of shape information about the datatypes it dispatches over. Since we have the name of the datatype available in the cases expression, and datatype shape information statically provided from modules, we can begin augmenting cases compilation to check variant arities, incomplete matches, and typo'd cases statically, even with the type-checker off. Performance-wise, we can also turn some dictionary lookups to find the right case branch into arithmetic, since we will know the names of the variants statically.

Every time a program uses an annotation like "A.Foo" for a type, or an expression like "A.x" for identifier lookup (where "A" is a module import binding), that compiles to a dictionary lookup with a check for non-object and field-not-found errors. Now that we have more type information across modules, the compiler can check statically that those names exist, and bind them once at the top of the program.
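The "cases" point above can be sketched in JavaScript as follows. The shape format, the function names checkCases and buildDispatchTable, and the error messages are all hypothetical, chosen only to show how statically known variant names and arities enable both early errors and index-based dispatch. This is not how the Pyret compiler is actually structured.

```javascript
// Illustrative only: the shape format and function names are assumptions.

// Static shape information for a datatype, as a module might provide it.
const listShape = {
  name: "List",
  variants: [
    { name: "empty", arity: 0 },
    { name: "link", arity: 2 },
  ],
};

// Check the branches of a cases expression against the datatype's shape.
// Each branch is { name, args }; hasElse says whether an else branch exists.
function checkCases(shape, branches, hasElse) {
  const errors = [];
  const known = new Map(shape.variants.map((v) => [v.name, v.arity]));
  const seen = new Set();
  for (const b of branches) {
    if (!known.has(b.name)) {
      // Catches typo'd variant names.
      errors.push(b.name + " is not a variant of " + shape.name);
      continue;
    }
    if (seen.has(b.name)) {
      errors.push("duplicate branch for " + b.name);
    }
    if (b.args.length !== known.get(b.name)) {
      errors.push(
        b.name + " expects " + known.get(b.name) +
        " fields, got " + b.args.length
      );
    }
    seen.add(b.name);
  }
  if (!hasElse) {
    for (const v of shape.variants) {
      if (!seen.has(v.name)) errors.push("missing branch for " + v.name);
    }
  }
  return errors;
}

// Map each variant's position in the datatype to the position of its branch
// in the cases expression (-1 means "fall through to else"). At runtime a
// value carrying its variant index can then select a branch by array
// indexing instead of a dictionary lookup on the variant name.
function buildDispatchTable(shape, branches) {
  return shape.variants.map((v) =>
    branches.findIndex((b) => b.name === v.name)
  );
}

const typoBranches = [
  { name: "empty", args: [] },
  { name: "lnk", args: ["f", "r"] },
];
console.log(checkCases(listShape, typoBranches, false));
```

In the example call, the misspelled "lnk" branch is reported as an unknown variant and "link" is reported as missing, all without running the type-checker.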

Distinguishing Native and Pyret Modules

Of course, Pyret can't do much without access to the underlying operating system, done through native Node modules at the CLI and the DOM APIs in CPO. In addition to specifying Pyret-shaped modules, JS-implemented modules can also specify native modules, which are still resolved using requirejs (there needs to be some way to locate and instantiate modules that are in pure JS, and requirejs is a pretty reasonable choice for this). When building a standalone (see below), the native modules are found via a standard requirejs specification of paths, and included in the generated standalone JavaScript blob. System-level modules like fs for the filesystem are handled by requirejs, which will find the appropriate builtin module from node.

This avoids the confusion from before where requirejs define calls mixed Pyret code (which had to be precompiled) with pure JavaScript files, and the ensuing path and instantiation issues.