Implementation Overview

Jukka Lehtosalo edited this page Jun 30, 2022 · 16 revisions

This is a general introduction to the mypy implementation. The linked subpages have more detailed information. The file links point to the mypy git repository.

Structure of the type checker

The main entry point of mypy is mypy/main.py. This file is responsible for parsing the config file, parsing command line options, and handling other related boilerplate.

The flow of execution is then passed off to the Build Manager (mypy/build.py), which is responsible for coordinating a build, managing dependencies between modules, and performing compilation passes in the correct order.

The build manager will then perform several passes over each module to typecheck them (see the process_graph function for more details).

The build manager ensures that all modules imported by a module have been processed before the module itself is processed. If there are cyclic import dependencies, the modules in each such strongly connected component (SCC) are processed together.
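
The ordering rule above can be illustrated with a small, self-contained sketch. This is not mypy's code: the function name, the `imports` mapping, and the module names are hypothetical, and Tarjan's algorithm is used here only as one well-known way to find SCCs in dependency order.

```python
from typing import Dict, List, Set


def strongly_connected_components(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Tarjan's algorithm over a module -> imported-modules mapping.

    Each SCC is emitted only after every SCC it depends on, so the
    returned list is already a valid processing order.
    """
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    result: List[Set[str]] = []

    def visit(node: str) -> None:
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for dep in sorted(graph[node]):
            if dep not in graph:
                continue  # ignore imports that are not part of this build
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:
            # node is the root of an SCC: pop its members off the stack.
            scc: Set[str] = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                scc.add(member)
                if member == node:
                    break
            result.append(scc)

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return result


# Hypothetical import graph: "a" and "b" import each other (a cycle).
imports = {
    "main": {"a", "b"},
    "a": {"b"},
    "b": {"a", "util"},
    "util": set(),
}
order = strongly_connected_components(imports)
for scc in order:
    print(sorted(scc))
```

In this example `util` comes first, the cycle `a`/`b` is handled as a single unit, and `main` comes last because it imports the others.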

Common passes:

  1. Python Parser (mypy/parse.py, mypy/fastparse.py)
    • Uses the stdlib ast module (or typed_ast on Python 3.7 and earlier) to build an abstract syntax tree (AST).
    • The typed_ast library is nearly identical to CPython's "ast" module, except that it supports parsing constructs such as type comments, which the ast module of earlier Python versions does not support.
    • The Python AST is then converted into a mypy-specific AST (parse tree), which is defined in the files mypy/nodes.py and mypy/types.py.
    • To traverse mypy ASTs, we use the visitor pattern (mypy/visitor.py, mypy/traverser.py, mypy/type_visitor.py).
  2. Semantic Analyzer (mypy/semanal_main.py, mypy/semanal.py, mypy/semanal_*.py, mypy/typeanal.py)
    • Binds names to definitions.
    • Performs various consistency checks.
    • To handle forward references, we may run multiple passes of semantic analysis over parts of an AST. In initial passes, forward references are stored using "placeholder nodes" (incomplete references), and these are filled in during later passes.
  3. Type Checker (mypy/checker.py, mypy/checkexpr.py, mypy/checkmember.py)
    • Type checks the program.
    • Performs type inference.
    • Again, if there are forward references, we may need to type check parts of the AST multiple times.
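
To make the parsing step and the visitor pattern more concrete, here is a minimal sketch that uses only the stdlib ast module (Python 3.8 or later). It is not how mypy's own visitors in mypy/visitor.py are written: `ast.NodeVisitor` stands in for them, and the `TypeCommentCollector` class and the sample source are hypothetical.

```python
import ast
from typing import List, Tuple

SOURCE = '''\
def scale(x, factor=2):
    # type: (int, int) -> int
    return x * factor

count = 0  # type: int
'''


class TypeCommentCollector(ast.NodeVisitor):
    """Collects (name, type comment) pairs while walking the tree."""

    def __init__(self) -> None:
        self.found: List[Tuple[str, str]] = []

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        if node.type_comment is not None:
            self.found.append((node.name, node.type_comment))
        self.generic_visit(node)  # keep walking into the function body

    def visit_Assign(self, node: ast.Assign) -> None:
        target = node.targets[0]
        if node.type_comment is not None and isinstance(target, ast.Name):
            self.found.append((target.id, node.type_comment))
        self.generic_visit(node)


# type_comments=True asks the stdlib parser to keep "# type:" comments.
tree = ast.parse(SOURCE, type_comments=True)
collector = TypeCommentCollector()
collector.visit(tree)
for name, comment in collector.found:
    print(f"{name}: {comment}")
```

The visitor dispatches on the node class name (`visit_FunctionDef`, `visit_Assign`), so each kind of node gets its own method and the traversal logic stays in one place.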

Stdlib stubs

Mypy ships with stubs for stdlib modules from typeshed. See Typeshed for more information.

To speed up tests, most test cases don't use typeshed. Instead, they use much smaller test fixtures from test-data/unit/lib-stub (minimal stubs used by default) and test-data/unit/fixtures (individual test cases can opt in to more general stubs from here).

Mypy daemon

The mypy daemon retains ASTs in a long-lived server process to speed up incremental mypy runs. Refer to Mypy Daemon for an overview of the implementation.

Mypyc

The mypyc compiler is used to compile mypy to C extension modules, speeding mypy up significantly. If you are not working on mypyc, understanding mypyc internals is not important.

Refer to mypyc on GitHub and mypyc documentation if you want to learn about using mypyc.

Refer to Introduction for Mypyc Contributors if you want to learn about working on mypyc or mypyc internals.

Mypyc issues are tracked separately from mypy issues here: https://github.com/mypyc/mypyc/issues