This package provides a functional interface for reducing boilerplate in the dodo.py
of the pydoit build system. In short, all tasks are created and managed using a Manager.
Most features are exposed as Python context managers, e.g., grouping tasks.
>>> import doit_interface as di
>>> # Get a default manager (or create your own to use as a context manager).
>>> manager = di.Manager.get_instance()
>>> # Create a single task.
>>> manager(basename="create_foo", actions=["touch foo"], targets=["foo"])
{'basename': 'create_foo', 'actions': ['touch foo'], 'targets': ['foo'], ...}
>>> # Group multiple tasks.
>>> with di.group_tasks("my_group") as my_group:
...     member = manager(basename="member")
>>> my_group
<doit_interface.contexts.group_tasks object at 0x...> named my_group with 1 task
Note
The default manager obtained by calling Manager.get_instance has a number of default contexts enabled:
* SubprocessAction.use_as_default to use SubprocessAction by default for string actions.
* create_target_dirs to create target directories if they are missing.
* normalize_dependencies such that task objects can be used as file and task dependencies.
It also injects a default DOIT_CONFIG configuration variable if the filename is dodo.py.
If you want to override this default behavior, you can create a dedicated manager and call Manager.set_default_instance, or modify the Manager.context_stack of the default manager.
The DoitInterfaceReporter provides more verbose progress reports and points you to the location where a failing task was defined. The corresponding DOIT_CONFIG is used by default if you obtain a Manager by calling Manager.get_instance.
>>> DOIT_CONFIG = {"reporter": DoitInterfaceReporter}
>>> manager(basename="false", actions=["false"])
{'basename': 'false', 'actions': ['false'], 'meta': {'filename': '...', 'lineno': 1}}
$ doit
EXECUTE: false
FAILED: false (declared at ...:1)
...
Use group_tasks to group tasks so that all of them can be executed together. Tasks can be added to a group using a context manager (as shown below) or by calling the group to add an existing task. Groups can be nested arbitrarily.
>>> with group_tasks("vgg16") as vgg16:
...     train = manager(basename="train", actions=[...])
...     validate = manager(basename="validate", actions=[...])
>>> vgg16
<doit_interface.contexts.group_tasks object at 0x...> named vgg16 with 2 tasks
Use create_target_dirs to automatically create directories for each of your targets. This can be particularly useful if you generate nested data structures, e.g., for machine learning results based on different architectures, seeds, optimizers, learning rates, etc.
>>> with create_target_dirs():
...     task = manager(basename="bar", targets=["foo/bar"], actions=[...])
>>> task["actions"]
[(<function create_folder at 0x...>, ['foo']), ...]
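The effect shown above, an extra action placed before the task's own actions, can be sketched in plain Python. This is an illustration under stated assumptions rather than the package's code: `make_parent_dirs` and `with_target_dirs` are hypothetical helpers, and a single action that handles all targets is used here for brevity.

```python
import os


def make_parent_dirs(paths):
    """Create the parent directory of each path if it does not exist yet."""
    for path in paths:
        parent = os.path.dirname(path)
        if parent:
            os.makedirs(parent, exist_ok=True)


def with_target_dirs(task):
    """Return a copy of the task whose first action creates its target directories."""
    targets = list(task.get("targets", []))
    if not targets:
        return dict(task)
    # doit accepts (callable, args) tuples as Python actions.
    create_dirs = (make_parent_dirs, [targets])
    return {**task, "actions": [create_dirs, *task.get("actions", [])]}


task = with_target_dirs(
    {"basename": "bar", "targets": ["foo/bar"], "actions": ["touch foo/bar"]}
)
```

Because the directory-creating action runs first, the original actions can write to nested target paths without any explicit `mkdir` step.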
Use defaults to share default values across tasks, such as file_dep.
>>> with defaults(file_dep=["data.pt"]):
...     train = manager(basename="train", actions=[...])
...     validate = manager(basename="validate", actions=[...])
>>> train["file_dep"]
['data.pt']
>>> validate["file_dep"]
['data.pt']
normalize_dependencies normalizes file and task dependencies such that task objects can be used as dependencies (in addition to file and task names).
>>> with normalize_dependencies():
...     base_task = manager(basename="base", name="output", targets=["output.txt"])
...     file_dep_task = manager(basename="file_dep_task", file_dep=[base_task])
...     task_dep_task = manager(basename="task_dep_task", task_dep=[base_task])
>>> file_dep_task["file_dep"]
['output.txt']
>>> task_dep_task["task_dep"]
['base:output']
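The normalization above can be approximated with a few lines of plain Python. The sketch below is not the package's implementation; `task_id` and `normalize` are hypothetical names, and treating a task used as a file dependency as shorthand for all of its targets is an assumption that is merely consistent with the single-target example above.

```python
def task_id(task):
    """Build the doit task name: 'basename', or 'basename:name' for subtasks."""
    name = task.get("name")
    return f"{task['basename']}:{name}" if name else task["basename"]


def normalize(task):
    """Replace task objects in file_dep and task_dep by plain strings."""
    file_dep = []
    for dep in task.get("file_dep", []):
        # Assumption: a task used as a file dependency stands for all its targets.
        file_dep.extend(dep.get("targets", []) if isinstance(dep, dict) else [dep])
    task_dep = [
        task_id(dep) if isinstance(dep, dict) else dep
        for dep in task.get("task_dep", [])
    ]
    return {**task, "file_dep": file_dep, "task_dep": task_dep}


base = {"basename": "base", "name": "output", "targets": ["output.txt"]}
by_file = normalize({"basename": "file_dep_task", "file_dep": [base, "extra.txt"]})
by_task = normalize({"basename": "task_dep_task", "task_dep": [base, "other"]})
```

Plain strings pass through unchanged, so task objects and file or task names can be mixed freely in the same list.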
Path prefixes can be added using the path_prefix context if file dependencies or targets share common directories. General prefixes are also available using prefix.
>>> with path_prefix(targets="outputs", file_dep="inputs"):
...     manager(basename="task", targets=["out.txt"], file_dep=["in1.txt", "in2.txt"])
{'basename': 'task', 'targets': ['outputs/out.txt'], 'file_dep': ['inputs/in1.txt', 'inputs/in2.txt'], ...}
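As a rough illustration of what prefixing amounts to, the following standalone sketch joins a directory onto every path of the selected task keys. `add_path_prefix` is a hypothetical helper, not part of doit_interface, and the use of forward slashes via `posixpath` is an assumption chosen to match the output shown above.

```python
import posixpath


def add_path_prefix(task, **prefixes):
    """Return a copy of the task with a directory prepended to the given keys."""
    prefixed = dict(task)
    for key, prefix in prefixes.items():
        if key in task:
            # Join with forward slashes regardless of the operating system.
            prefixed[key] = [posixpath.join(prefix, path) for path in task[key]]
    return prefixed


task = add_path_prefix(
    {"basename": "task", "targets": ["out.txt"], "file_dep": ["in1.txt", "in2.txt"]},
    targets="outputs",
    file_dep="inputs",
)
```

Keys that are not named in the call, such as `basename`, are left untouched.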
SubprocessAction lets you spawn subprocesses akin to doit.action.CmdAction, with a few small differences. First, it does not capture the output of the subprocess, which is helpful for development but may add too much noise for deployment. Second, it supports Makefile-style variable substitutions and f-string substitutions for any attribute of the parent task. Third, it allows global environment variables to be set that are shared across all actions, e.g., to limit the number of OpenMP threads. You can use it by default for string actions using the SubprocessAction.use_as_default context.
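The three differences can be illustrated with a small standalone sketch built on the standard library. It is not the SubprocessAction implementation: `substitute` and `run` are hypothetical helpers, and the supported variables are assumed here to follow the usual Makefile meanings (`$@` first target, `$<` first dependency, `$^` all dependencies), which the package may or may not match exactly.

```python
import os
import subprocess


def substitute(command, task):
    """Expand f-string style fields and a few Makefile-style variables."""
    # f-string style: "{basename}" is replaced by the task attribute of that name.
    command = command.format(**task)
    targets = task.get("targets", [])
    file_dep = task.get("file_dep", [])
    # Makefile style (assumed subset for this sketch).
    replacements = {
        "$@": targets[0] if targets else "",
        "$<": file_dep[0] if file_dep else "",
        "$^": " ".join(file_dep),
    }
    for variable, value in replacements.items():
        command = command.replace(variable, value)
    return command


def run(command, task, global_env=None):
    """Run the command without capturing its output; return True on success."""
    # Global variables are layered on top of the inherited environment.
    env = {**os.environ, **(global_env or {})}
    # Output is not captured, so it goes straight to the parent's stdout and stderr.
    completed = subprocess.run(substitute(command, task), shell=True, env=env)
    return completed.returncode == 0


task = {"basename": "copy", "targets": ["out.txt"], "file_dep": ["in1.txt", "in2.txt"]}
command = substitute("echo {basename}: cat $^ > $@", task)
```

With these helpers, a call such as `run("cat $^ > $@", task, global_env={"OMP_NUM_THREADS": "1"})` would concatenate the dependencies into the target while limiting OpenMP to one thread for any program started by the command.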