
V2: Add CLI and tests for hyperparameter optimization #734

Merged: 71 commits into chemprop:v2/dev on Apr 12, 2024

Conversation

hwpang (Contributor) commented Mar 18, 2024

Description

In this PR, I will implement the hyperparameter optimization CLI for v2. This is currently a work in progress, but comments and reviews are welcome to help shape it.

Example / Current workflow

v2 currently has little CLI support for hyperparameter search.

Bugfix / Desired workflow

Based on the discussion from last week's meeting, we chose Ray Tune as the hyperparameter optimization backend.

I have added the v1 arguments related to hyperparameter search to the Google sheet. We should go over it and discuss which ones to keep or discard.

Questions

  • What do we want to save in v2? Just the best model config? Do we want to save the best model checkpoint as well?
  • There are many options available through Ray Tune. What is the most user-friendly way to expose these to users?
  • I removed ensemble modeling and cross-validation during hyperparameter tuning, as I think they are unnecessary. I believe both are available during hyperparameter tuning in v1, though. What are people's thoughts and experiences?
  • The training function used by hyperopt.py overlaps with the function in train.py. I could write a generic version used by both to reduce redundant code, but I can also see the flexibility in keeping them separate.
  • What do we want to test in the CLI test for hyperopt? Currently I only test that the best config.json file is saved. We should also test that the update_args_with_config function actually updates args correctly (see the sketch below).
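
For illustration, a minimal sketch of such a test, assuming update_args_with_config takes an argparse.Namespace and a config dict and returns the updated namespace (the real signature and import path may differ):

from argparse import Namespace

from chemprop.cli.hpopt import update_args_with_config  # assumed import path


def test_update_args_with_config():
    # hypothetical example: the tuned config should overwrite the defaults
    args = Namespace(depth=3, dropout=0.0)
    config = {"depth": 5, "dropout": 0.2}

    updated = update_args_with_config(args, config)

    assert updated.depth == 5
    assert updated.dropout == 0.2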

Relevant issues

Updated hyperparameter search space: #571

Checklist

  • linted with flake8?
  • (if appropriate) unit tests added?

@hwpang changed the title from "V2: Add CLI for hyperparameter optimization" to "V2: Add CLI and tests for hyperparameter optimization" on Mar 20, 2024
am2145 (Contributor) commented Mar 20, 2024

I'm just leaving a comment here to document that the current release of Ray has an installation issue on Windows with Python 3.11, as described here. From the thread it looks like a fix has already been merged, so hopefully this will be released/resolved soon, but it's broken for the time being.

Review thread on chemprop/cli/hyperopt.py:

logger = logging.getLogger(__name__)

AVAILABLE_SPACES = {
Contributor:

I think this is a really good default. However, it would be good to give the user the option to pass in their own search space as well.

hwpang (Contributor, Author):

Good point! What type of input format do you think we should use?

Contributor:

Just leaving a comment as per our discussion today: we are leaning towards supporting a JSON input containing the dictionary, which we then translate into the hyperparameter search space.

JacksonBurns (Member):

Do we plan to make this available via the CLI?

hwpang (Contributor, Author) commented Apr 5, 2024:

@JacksonBurns We plan to make this available via the CLI, but we haven't figured out an appropriate input format. In a previous dev meeting, people thought JSON was a good format to use. I tried, but functions are not JSON serializable. I tried many ways to make the search space dictionary JSON serializable, but they all feel hacky. I will implement a version of the CLI that accepts the pickle format. Feel free to chime in on this.

Edit: I figured out a way to handle the JSON input format. Trying it right now.

hwpang (Contributor, Author):

So the input in JSON format would look something like this:

{
    "activation": {
        "type": "choice",
        "args": ["relu", "gelu", "tanh"]
    },
    "aggregation": {
        "type": "choice",
        "args": ["mean", "sum", "max"]
    },
    "aggregation_norm": {
        "type": "quniform",
        "kwargs": {"low": 1, "high": 200, "q": 1}
    },
    "batch_size": {
        "type": "quniform",
        "kwargs": {"low": 5, "high": 200, "q": 5}
    },
    "depth": {
        "type": "randint",
        "kwargs": {"low": 2, "high": 6}
    },
    "dropout": {
        "type": "choice",
        "args": [
            {"type": "choice", "args": [0.0]},
            {"type": "quniform", "kwargs": {"low": 0.05, "high": 0.4, "q": 0.05}}
        ]
    },
    "ffn_hidden_size": {
        "type": "quniform",
        "kwargs": {"low": 300, "high": 2400, "q": 100}
    },
    "ffn_num_layers": {
        "type": "randint",
        "kwargs": {"low": 2, "high": 6}
    },
    "final_lr_ratio": {
        "type": "loguniform",
        "kwargs": {"low": 1e-4, "high": 1}
    },
    "hidden_size": {
        "type": "quniform",
        "kwargs": {"low": 300, "high": 2400, "q": 100}
    },
    "init_lr_ratio": {
        "type": "loguniform",
        "kwargs": {"low": 1e-4, "high": 1}
    },
    "linked_hidden_size": {
        "type": "quniform",
        "kwargs": {"low": 300, "high": 2400, "q": 100}
    },
    "max_lr": {
        "type": "loguniform",
        "kwargs": {"low": 1e-6, "high": 1e-2}
    }
}

This JSON corresponds to the following Python dictionary for the search space:

{
    "activation": tune.choice(["relu", "gelu", "tanh"]),
    "aggregation": tune.choice(["mean", "sum", "max"]),
    "aggregation_norm": tune.quniform(lower=1, upper=200, q=1),
    "batch_size": tune.quniform(lower=5, upper=200, q=5),
    "depth": tune.randint(lower=2, upper=6),
    "dropout": tune.choice([tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]),
    "ffn_hidden_size": tune.quniform(lower=300, upper=2400, q=100),
    "ffn_num_layers": tune.randint(lower=2, upper=6),
    "final_lr_ratio": tune.loguniform(lower=1e-4, upper=1),
    "hidden_size": tune.quniform(lower=300, upper=2400, q=100),
    "init_lr_ratio": tune.loguniform(lower=1e-4, upper=1),
    "linked_hidden_size": tune.quniform(lower=300, upper=2400, q=100),
    "max_lr": tune.loguniform(lower=1e-6, upper=1e-2),
}

I prefer the pickle format because:

  • it is more flexible in handling user input types
  • the JSON format requires the same amount of user knowledge of Ray Tune functions as the pickle format
  • we don't have to maintain a custom JSON-to-search-space translator
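
For context, the pickle-based workflow would look roughly like this (the CLI flag name is hypothetical, and whether Ray Tune sample objects pickle cleanly may depend on the Ray version):

import pickle

from ray import tune

# build the search space directly with Ray Tune objects, nesting included
search_space = {
    "depth": tune.randint(lower=2, upper=6),
    "dropout": tune.choice(
        [tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]
    ),
}

# serialize it for the CLI, e.g. `chemprop hpopt --search-space search_space.pkl`
with open("search_space.pkl", "wb") as f:
    pickle.dump(search_space, f)

# the CLI would load it back verbatim; no JSON-to-tune translation layer needed
with open("search_space.pkl", "rb") as f:
    search_space = pickle.load(f)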

davidegraff (Contributor):

I see two questions here:

  1. "How can we accept plaintext input to parameterize the search space for a given hyperparameter?"
  2. "More broadly, can we programmatically define a search space using plaintext input?"

For (1), we'd have to design some sort of factory pattern around the ray.tune search space API. This actually isn't that hard and requires (i) a mapping from a string key to the corresponding function and (ii) a way to pass parameters to that function. Luckily, Ray has already provided (i) for us:

DISTRIBUTIONS = {
    "uniform": tune.uniform,
    "quniform": tune.quniform,
    "loguniform": tune.loguniform,
    "qloguniform": tune.qloguniform,
    "randn": tune.randn,
    "qrandn": tune.qrandn,
    "randint": tune.randint,
    "qrandint": tune.qrandint,
    "lograndint": tune.lograndint,
    "qlograndint": tune.qlograndint,
    "choice": tune.choice,
    "func": tune.sample_from,    # NOTE: not a good idea
    "grid": tune.grid_search,
}

You could also just do getattr(tune, key), but that gives less control over which search spaces we allow. From there, you just pass through the received parameters like so: func(**params). So you would need some sort of plaintext format that enables providing a mapping like this:

{
  "distribution": "uniform",
  "params": {
    "lower": 0.0,
    "upper": 1.0
  }
}

Note

I'm using JSON here, but you can use YAML, TOML, INI, etc.

Then, inside hyperopt.py, you can just create the search space like so:

d = json.load(...)
search_space = DISTRIBUTIONS[d['distribution']](**d['params'])
# or equivalently
search_space = getattr(tune, d['distribution'])(**d['params'])

There would need to be some error checking here to validate the input parameters. This can be somewhat cumbersome because we will need to establish, for each hyperparameter, the allowable search space parameterizations. Ex: "aggregation" can only be "choice" with params["choices"] $\subseteq$ {"max", "norm", "mean", "sum"}, and "batch_size" can be any integer distribution with a lower bound $\geq 1$ or tune.choice where params["choices"] $\subseteq \mathbb{N}^n$.
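
For example, a sketch of that per-hyperparameter validation (names are illustrative; note that tune.choice's keyword argument is categories, so that key is used for params here):

from ray import tune

ALLOWED_AGGREGATIONS = {"max", "norm", "mean", "sum"}


def validate_aggregation(spec: dict) -> None:
    # illustrative check for the "aggregation" hyperparameter only
    if spec["distribution"] != "choice":
        raise ValueError("'aggregation' only supports the 'choice' distribution")
    if not set(spec["params"]["categories"]) <= ALLOWED_AGGREGATIONS:
        raise ValueError(f"unknown aggregation(s): {spec['params']['categories']}")


spec = {"distribution": "choice", "params": {"categories": ["mean", "sum"]}}
validate_aggregation(spec)  # passes
space = tune.choice(**spec["params"])  # equivalent to DISTRIBUTIONS["choice"](**...)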


For (2), we would need to establish a mapping from a given hyperparameter key to its actual hyperparameter. We already have this in the form of our train.py args and the update_args_with_config() method defined below.

hwpang (Contributor, Author):

@davidegraff Thanks for the detailed write-up! Note that using this would not allow a nested search space such as the one below, which is used in our default, unless we make the parsing function parse recursively.

"dropout": tune.choice([tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]),

davidegraff (Contributor):

Yeah that's one of the drawbacks to allowing "fully user-defined search spaces" unless we also define our own text parser. I think that's outside the scope of the chemprop CLI and can thus be limited to (1) our hardcoded search space or (2) a user-implemented hyperoptimization.

hwpang (Contributor, Author):

I've decided to split this part of the changes into a separate PR and focus on getting the base-case hyperparameter optimization in for now.

shihchengli (Contributor) left a comment:

Thanks for making this PR. I've left some comments below. As we switch from Hyperopt to Ray Tune, some v1 arguments might not work. Also, we probably need to support ensemble modeling and cross-validation during hyperparameter tuning when the model is trained on a tiny dataset; in such cases, model performance depends heavily on the splits.

shihchengli (Contributor):
As discussed in the previous meeting, we are considering providing some flexibility to allow users to define their own search space. However, the method for handling parameter expressions has not been fully determined. It largely depends on the level of flexibility we wish to offer. Below are some of my thoughts:

(1) Provide a range list for each search parameter

{
  "learning_rate": [0.001, 0.0001, 0.00001],
  "batch_size": [16, 32, 64],
  "optimizer": ["adam", "sgd"]
}

(2) Establish a stochastic expression in advance and only let the user provide the range:

{
  "learning_rate": {
    "min": 0.00001,
    "max": 0.001
  },
  "batch_size": [16, 32, 64],
  "optimizer": ["adam", "sgd"]
}

from ray import tune


def to_tune_search_space(search_space):
    tune_search_space = {}
    for param, values in search_space.items():
        if isinstance(values, dict) and 'min' in values and 'max' in values:  # continuous parameter
            tune_search_space[param] = tune.uniform(values['min'], values['max'])
        elif isinstance(values, list):  # discrete parameter
            tune_search_space[param] = tune.grid_search(values)
        else:
            raise ValueError(f"Unsupported format for parameter {param}")
    return tune_search_space
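
For instance, feeding option (2)'s example through this helper gives:

search_space = to_tune_search_space({
    "learning_rate": {"min": 0.00001, "max": 0.001},
    "batch_size": [16, 32, 64],
    "optimizer": ["adam", "sgd"],
})
# -> {"learning_rate": tune.uniform(1e-05, 0.001),
#     "batch_size": tune.grid_search([16, 32, 64]),
#     "optimizer": tune.grid_search(["adam", "sgd"])}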

(3) Allow users to define both the stochastic expression and the search range simultaneously

{
  "activation": {
    "choice": ["relu", "sigmoid"]
  },
  "batch_size": {
    "quniform": {
      "lower": 32,
      "upper": 128,
      "q": 32
    }
  },
  "lr": {
    "loguniform": {
      "lower": 1e-5,
      "upper": 1e-3
    }
  }
}

The above cases are based on the JSON format, but I personally prefer YAML because it supports comments and is more succinct.

JacksonBurns (Member) left a comment:

Some small notes, especially about some possibly blocking PRs

@JacksonBurns self-requested a review April 2, 2024 18:43
…e use for dropout. Commenting out relevant code for now.
hwpang (Contributor, Author) commented Apr 12, 2024

I was testing the hyperparameter optimization function. Ray Tune + HyperOpt works fine. With Ray Tune + Optuna, however, it seems that it can't handle a nested distribution search space such as the one below.

"dropout": tune.choice([tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]),

The best config is returned as

{'train_loop_config': {'ffn_hidden_size': 700.0, 'hidden_size': 1700.0, 'ffn_num_layers': 3, 'depth': 2, 'dropout': <ray.tune.search.sample.Categorical object at 0x7fc8932f40d0>}}

while something like this is expected

{'train_loop_config': {'ffn_hidden_size': 700.0, 'hidden_size': 1700.0, 'ffn_num_layers': 3, 'depth': 2, 'dropout': 0}}

I'm inclined to comment out the relevant code for now, given the v2.0 release deadline, and only support the random search and HyperOpt search algorithms. We can keep track of this for the future.
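
For reference, one way to sidestep the nesting entirely would be to flatten dropout into a single expression; this is only an approximation of the default (it gives 0.0 the same weight as every other grid point), not what this PR does:

from ray import tune

# flat alternative: samples dropout from {0.0, 0.05, ..., 0.4} directly,
# so any searcher (including Optuna) sees a single distribution
dropout_space = tune.quniform(lower=0.0, upper=0.4, q=0.05)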

hwpang (Contributor, Author) commented Apr 12, 2024

@KnathanM @JacksonBurns @am2145 This is ready for another review and merge! I would like to get this in today for the v2.0 release. All additional feature requests should be opened as v2.1 issues for feature tracking!

JacksonBurns (Member) commented on lines 83 to 84:

python -m pip install ".[dev,docs,test,hpopt]"
python -m pip install optuna

Suggested change:

-python -m pip install ".[dev,docs,test,hpopt]"
-python -m pip install optuna
+python -m pip install ".[dev,docs,test]"
+python -m pip install ".[hpopt]" optuna

This could fix the CI issue.

…i.yml (Line: 85, Col: 13): Unexpected symbol: '"3'. Located at position 26 within expression: matrix.python-version == "3.11""
Member commented on lines 24 to 39:

from ray import tune
from ray.train import CheckpointConfig, RunConfig, ScalingConfig
from ray.train.lightning import (
    RayDDPStrategy,
    RayLightningEnvironment,
    RayTrainReportCallback,
    prepare_trainer,
)
from ray.train.torch import TorchTrainer
from ray.tune.schedulers import ASHAScheduler

NO_HYPEROPT = False
try:
    from ray.tune.search.hyperopt import HyperOptSearch
except ImportError:
    NO_HYPEROPT = True

Suggested change (move all of the Ray imports inside the try/except):

NO_HYPEROPT = False
try:
    from ray.tune.search.hyperopt import HyperOptSearch
    from ray import tune
    from ray.train import CheckpointConfig, RunConfig, ScalingConfig
    from ray.train.lightning import (
        RayDDPStrategy,
        RayLightningEnvironment,
        RayTrainReportCallback,
        prepare_trainer,
    )
    from ray.train.torch import TorchTrainer
    from ray.tune.schedulers import ASHAScheduler
except ImportError:
    NO_HYPEROPT = True

resolves latest test failure here: https://github.com/chemprop/chemprop/actions/runs/8664591034/job/23761373381?pr=734#step:6:136


Review comment on the DEFAULT_SEARCH_SPACE definition:

logger = logging.getLogger(__name__)

DEFAULT_SEARCH_SPACE = {

Member:

If you have this dict as a global variable, it will be created at import time of this module, and it requires that the hpopt packages are installed. You can fix this by:

  • moving this definition into the try/except block where you do the imports
  • inside the CLI main class, moving the import of the hpopt subcommand into the body of the cli function to avoid importing it until needed

The former is easier.
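
A minimal sketch of the first option (the dict contents are abbreviated, and NO_RAY is a hypothetical flag mirroring the NO_HYPEROPT pattern above):

NO_RAY = False
try:
    from ray import tune

    # defined inside the try block so importing this module succeeds
    # even when the optional hpopt dependencies are missing
    DEFAULT_SEARCH_SPACE = {
        "depth": tune.randint(lower=2, upper=6),
        # ... remaining hyperparameters ...
    }
except ImportError:
    NO_RAY = True
    DEFAULT_SEARCH_SPACE = None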

JacksonBurns (Member) left a comment:

LGTM assuming tests pass. My only ask is that you open an issue to track adding Optuna back in.

shihchengli (Contributor) left a comment:

I only have a few minor comments here, but I will approve it for now as most parts look good to me. Perhaps you could open an issue related to enabling fully customizable distribution types and search ranges for tracking purposes.

Co-authored-by: Shih-Cheng Li <scli@mit.edu>
@KnathanM enabled auto-merge April 12, 2024 18:53
@KnathanM merged commit ec55383 into chemprop:v2/dev Apr 12, 2024
9 checks passed
@shihchengli mentioned this pull request May 28, 2024
Successfully merging this pull request may close these issues:

[TODO]: Implement HPO and change default HPO search space