
Optuna GSoC 2020


Optuna

Optuna is an open source hyperparameter optimization framework to automate hyperparameter search. Optuna provides:

  • Eager search spaces: automated search for optimal hyperparameters using Python conditionals, loops, and syntax
  • State-of-the-art algorithms: efficient search of large spaces and pruning of unpromising trials for faster results
  • Easy parallelization: hyperparameter searches over multiple threads or processes without modifying code
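
A minimal sketch of what an Optuna objective function looks like in practice; the classifier choice, parameter names, and placeholder scores are illustrative, and the suggest method names follow the Optuna 1.x API:

```python
import optuna


def objective(trial):
    # The search space is defined "eagerly", inside the objective itself:
    # ordinary Python conditionals decide which parameters exist for this trial.
    classifier = trial.suggest_categorical("classifier", ["svm", "random_forest"])
    if classifier == "svm":
        c = trial.suggest_loguniform("svm_c", 1e-5, 1e2)
        score = 1.0 / (1.0 + abs(c - 1.0))  # stand-in for a real validation score
    else:
        depth = trial.suggest_int("rf_max_depth", 2, 32)
        score = 1.0 / (1.0 + abs(depth - 8))  # stand-in for a real validation score
    return score


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```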

Optuna is participating in GSoC 2020 as a member of NumFOCUS.

Getting Started

Coding for Optuna requires a solid foundation in Python. Experience working with Git and github.com will also be useful. If you're interested in applying, we recommend you take a look at the Optuna GitHub repository and try your hand at some of the issues labelled Contribution Welcome. We will evaluate applications largely based on their contributions to Optuna and other open source projects.

To contact us, please email optuna@preferred.jp.

Projects

Create Optuna Documentation and Technical Writing

Add Integration Modules for Other Open Source Projects

Implement a GP-BO-based sampler for Optuna

Create Optuna Documentation and Technical Writing

Revise and update the Optuna documentation to make it more helpful for users. Some ideas are:

  • Create a tutorial about user-defined pruners (see the sketch after this list)
  • Create a document about sampling/pruning algorithms
  • Create a document to clarify InMemoryStorage
  • Create documentation in another major language (Mandarin, Spanish, Japanese, Korean…)
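
As a rough illustration of what a user-defined pruner tutorial would cover, the sketch below subclasses optuna.pruners.BasePruner; the class name and the pruning rule (prune a trial whose latest reported value is the worst at that step) are made up for this example, and it assumes the study minimizes its objective:

```python
import optuna
from optuna.pruners import BasePruner


class LastPlacePruner(BasePruner):
    """Prune a trial whose latest intermediate value is the worst seen at that step."""

    def prune(self, study, trial):
        step = trial.last_step
        if step is None:
            return False  # nothing reported yet, keep going
        current = trial.intermediate_values[step]
        # Intermediate values reported by other trials at the same step.
        others = [
            t.intermediate_values[step]
            for t in study.trials
            if t.number != trial.number and step in t.intermediate_values
        ]
        if not others:
            return False
        # Assuming minimization: prune if strictly worse than every other trial.
        return current > max(others)
```

Inside the objective, the usual trial.report(value, step) and trial.should_prune() pattern would then trigger this pruner when the study is created with pruner=LastPlacePruner().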

Mentors

@Crissman

Difficulty and Requirements

Easy to Medium

You need to know:

  • Python
  • Hyperparameter optimization basics
  • Sphinx-docs (optional)

First Steps

Survey the Optuna documentation and compare it with the documentation of other hyperparameter optimization frameworks. Note what you feel are clear explanations, as well as inconsistencies in the Optuna documentation, then improve the Optuna documentation.

Why this is cool

Hyperparameter optimization is critical to many complex algorithms, and many users are not aware that automated programs like Optuna are available. By improving the Optuna documentation, you will help users understand how to improve the performance of their algorithms without doing fiddly manual tuning. Learn to work with self-documenting code in a large open-source project. Build relationships with coders in a top tier startup in Japan.

Add Integration Modules for Other Open Source Projects

Optuna has integration modules for PyTorch, TensorFlow, Keras, scikit-learn, and several other Open Source projects. We would like to increase the number of Open Source projects for which Optuna has a custom module to allow for easy integration. This could include creating pruner integrations similar to the existing modules or making new integrations, such as implementing a callback to export data to another framework like MLflow or TensorBoard.
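
As a sketch of the callback-style integration mentioned above, the snippet below logs each finished trial to MLflow; the run naming and the toy objective are illustrative, it assumes the callbacks argument of Study.optimize is available in the Optuna version in use, and a real integration module would need to be considerably more robust:

```python
import mlflow
import optuna


def mlflow_callback(study, trial):
    # Called by Optuna after each trial finishes; trial is a FrozenTrial.
    with mlflow.start_run(run_name="trial-{}".format(trial.number)):
        mlflow.log_params(trial.params)
        if trial.value is not None:
            mlflow.log_metric("objective_value", trial.value)


def objective(trial):
    x = trial.suggest_uniform("x", -10, 10)
    return (x - 2) ** 2


study = optuna.create_study()
study.optimize(objective, n_trials=20, callbacks=[mlflow_callback])
```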

Mentors

@Crissman

Difficulty and Requirements

Medium to Hard

You need to know:

  • Python
  • Hyperparameter optimization basics
  • Open Source projects

First Steps

Look at the existing Optuna integration modules, and review other popular Open Source projects to see which have hyperparameters or could provide additional functionality to Optuna.

Why this is cool

You’ll work with both Optuna and other Open Source projects, and each integration module you create will make both Optuna and the other project more useful. Build relationships with coders in a top tier startup in Japan.

Implement a GP-BO-based sampler for Optuna

Description

Optuna supports multiple sampling algorithms such as TPE and CMA-ES. Although Bayesian Optimization based on Gaussian Processes (GP-BO), which is one of the most popular algorithms for hyperparameter optimization, is also supported, it internally invokes scikit-optimize. Having Optuna's own implementation of GP-BO instead of wrapping another library would make Optuna more efficient and extensible.
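
For context, this is roughly how a sampler is chosen today; the GP-BO path goes through the scikit-optimize wrapper in optuna.integration, which is what this project would replace with a native implementation (module paths here follow the Optuna 1.x layout):

```python
import optuna

# TPE is the default sampler and has a native implementation:
tpe_study = optuna.create_study(sampler=optuna.samplers.TPESampler())

# GP-BO currently relies on the scikit-optimize wrapper (requires scikit-optimize):
gp_study = optuna.create_study(sampler=optuna.integration.SkoptSampler())
```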

In this project, you will implement a new sampler class based on GP-BO. The sampler should meet the following requirements:

  • the sampler is a subclass of optuna.samplers.BaseSampler (a skeleton sketch follows this list),
  • the sampler shows performance comparable to TPE (the default algorithm of Optuna) on multiple benchmark tasks, without tuning any hyper-hyperparameters,
  • the sampler is compatible with the basic parameter types of Optuna: float (uniform), integer, categorical, discrete float, and log-scale float,
  • the sampler is compatible with distributed optimization,
  • the kernels and acquisition function of the sampler are customizable.
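
A minimal skeleton of how such a sampler could plug into the BaseSampler interface; the class name is made up, the GP fitting and acquisition maximization are left as placeholders, and intersection_search_space is assumed to live under optuna.samplers (its location has changed in later Optuna versions):

```python
import optuna
from optuna.samplers import BaseSampler, RandomSampler


class GPBOSampler(BaseSampler):
    """Skeleton of a native GP-BO sampler; the modelling work is the project itself."""

    def __init__(self):
        self._fallback = RandomSampler()

    def infer_relative_search_space(self, study, trial):
        # Model only the parameters shared by all completed trials.
        return optuna.samplers.intersection_search_space(study)

    def sample_relative(self, study, trial, search_space):
        # The real work goes here: fit a Gaussian process to completed trials
        # and maximize an acquisition function (e.g. expected improvement) over
        # `search_space`. Returning an empty dict defers every parameter to
        # sample_independent, so this skeleton degenerates to random search.
        return {}

    def sample_independent(self, study, trial, param_name, param_distribution):
        # Parameters outside the joint search space fall back to random sampling.
        return self._fallback.sample_independent(
            study, trial, param_name, param_distribution
        )
```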

[Optional] Based on the new sampler class, you are encouraged to implement state-of-the-art Bayesian optimization algorithms such as:

Mentors

TBD

Difficulty and Requirements

Medium to Hard

You need to know:

  • Python
  • Theory of Bayesian optimization

First Steps

Go through the documentation and/or implementation of the samplers and understand how to implement a sampling algorithm. You are encouraged to write an example of a sampler class implementing either of the following algorithms and share it with us. The class need not be production-ready, i.e., it may be incompatible with categorical kernels, custom acquisition functions, or distributed optimization.
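
As a warm-up before writing a full sampler class, the core GP-BO loop on a single continuous parameter could look roughly like this sketch, which uses scikit-learn's GaussianProcessRegressor with an expected-improvement acquisition; the toy objective and all constants are illustrative:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def expected_improvement(gp, x_candidates, best_y):
    # EI for minimization: expected improvement over best_y at each candidate.
    mu, sigma = gp.predict(x_candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)


def toy_objective(x):
    return (x - 2.0) ** 2  # minimum at x = 2


rng = np.random.default_rng(0)
xs = list(rng.uniform(-10, 10, size=3))  # a few random initial points
ys = [toy_objective(x) for x in xs]

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(np.array(xs).reshape(-1, 1), np.array(ys))
    candidates = np.linspace(-10, 10, 1000).reshape(-1, 1)
    ei = expected_improvement(gp, candidates, min(ys))
    next_x = float(candidates[np.argmax(ei), 0])  # most promising candidate
    xs.append(next_x)
    ys.append(toy_objective(next_x))

print("best x:", xs[int(np.argmin(ys))])
```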

Some practice sampler classes you could write:

Why this is cool

This is a challenging project that will require building simple, scalable interfaces for complex Gaussian process and Bayesian optimization mechanisms. You’ll learn how to design interfaces and gain a practical understanding of how Bayesian optimization and Gaussian processes work. Build relationships with coders in a top tier startup in Japan.