
V2: Add CLI and tests for hyperparameter optimization #734

Merged: 71 commits into chemprop:v2/dev on Apr 12, 2024

Conversation

hwpang (Contributor) commented Mar 18, 2024

Description

In this PR, I will implement the hyperparameter optimization CLI for v2. This is currently a work in progress, but comments and reviews are welcome to help shape it.

Example / Current workflow

v2 currently has little CLI support for hyperparameter search.

Bugfix / Desired workflow

Based on the discussion from last week's meeting, we chose Ray Tune as the hyperparameter optimization backend.

I have added the v1 arguments related to hyperparameter search to the Google sheet. We should go over it and discuss which ones to keep or discard.

Questions

  • What do we want to save in v2? Just the best model config? Do we want to save the best model checkpoint as well?
  • There are many options available through Ray Tune. What is the most user-friendly way to expose these to users?
  • I removed ensemble modeling and cross-validation during hyperparameter tuning, as I think they are unnecessary. I believe both are available during hyperparameter tuning in v1, though. What are people's thoughts and experiences?
  • The training function used by hyperopt.py overlaps with the function in train.py. I could write a generic version used by both to reduce redundant code, but I can also see the flexibility in keeping them separate.
  • What do we want to test in the CLI test for hyperopt? Currently I only test that the best config.json file is saved. We should also test that the update_args_with_config function actually updates args correctly (see the sketch below).
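
For illustration, a minimal sketch of such a test, assuming update_args_with_config takes an argparse.Namespace and a config dict and returns the updated namespace (the real signature and import path may differ):

from argparse import Namespace

from chemprop.cli.hpopt import update_args_with_config  # assumed import path


def test_update_args_with_config():
    # hypothetical example: the tuned config should overwrite the defaults
    args = Namespace(depth=3, dropout=0.0)
    config = {"depth": 5, "dropout": 0.2}

    updated = update_args_with_config(args, config)

    assert updated.depth == 5
    assert updated.dropout == 0.2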

Relevant issues

Updated hyperparameter search space: #571

Checklist

  • linted with flake8?
  • (if appropriate) unit tests added?

@hwpang changed the title from "V2: Add CLI for hyperparameter optimization" to "V2: Add CLI and tests for hyperparameter optimization" on Mar 20, 2024
am2145 (Contributor) commented Mar 20, 2024

I'm just leaving a comment here to document that the current release of Ray has an installation issue on Windows with Python 3.11, as described here. From the thread it looks like a fix has already been merged, so hopefully this will be released/resolved soon, but it's broken for the time being.

Review thread on chemprop/cli/hyperopt.py:

logger = logging.getLogger(__name__)

AVAILABLE_SPACES = {
Contributor:

I think this is a really good default. However, it would be good to give the user the option to pass in their own search space as well.

hwpang (Contributor, Author):

Good point! What type of input format do you think we should use?

Contributor:

Just leaving a comment as per our discussion today: we are leaning towards supporting a JSON input containing the dictionary, which we then translate into the hyperparameter search space.

JacksonBurns (Member):

Do we plan to make this available via the CLI?

hwpang (Contributor, Author) commented Apr 5, 2024:

@JacksonBurns We plan to make this available via the CLI, but we haven't figured out an appropriate input format. In a previous dev meeting, people thought JSON was a good format to use. I tried, but functions are not JSON serializable. I tried many ways to make the search space dictionary JSON serializable, but they all feel hacky. I will implement a version of the CLI that accepts the pickle format. Feel free to chime in on this.

Edit: I figured out a way to handle the JSON input format. Trying it right now.

hwpang (Contributor, Author):

So the input in JSON format would look something like this:

{
    "activation": {
        "type": "choice",
        "args": ["relu", "gelu", "tanh"]
    },
    "aggregation": {
        "type": "choice",
        "args": ["mean", "sum", "max"]
    },
    "aggregation_norm": {
        "type": "quniform",
        "kwargs": {"low": 1, "high": 200, "q": 1}
    },
    "batch_size": {
        "type": "quniform",
        "kwargs": {"low": 5, "high": 200, "q": 5}
    },
    "depth": {
        "type": "randint",
        "kwargs": {"low": 2, "high": 6}
    },
    "dropout": {
        "type": "choice",
        "args": [
            {"type": "choice", "args": [0.0]},
            {"type": "quniform", "kwargs": {"low": 0.05, "high": 0.4, "q": 0.05}}
        ]
    },
    "ffn_hidden_size": {
        "type": "quniform",
        "kwargs": {"low": 300, "high": 2400, "q": 100}
    },
    "ffn_num_layers": {
        "type": "randint",
        "kwargs": {"low": 2, "high": 6}
    },
    "final_lr_ratio": {
        "type": "loguniform",
        "kwargs": {"low": 1e-4, "high": 1}
    },
    "hidden_size": {
        "type": "quniform",
        "kwargs": {"low": 300, "high": 2400, "q": 100}
    },
    "init_lr_ratio": {
        "type": "loguniform",
        "kwargs": {"low": 1e-4, "high": 1}
    },
    "linked_hidden_size": {
        "type": "quniform",
        "kwargs": {"low": 300, "high": 2400, "q": 100}
    },
    "max_lr": {
        "type": "loguniform",
        "kwargs": {"low": 1e-6, "high": 1e-2}
    }
}

This JSON corresponds to the following Python dictionary for the search space:

{
    "activation": tune.choice(["relu", "gelu", "tanh"]),
    "aggregation": tune.choice(["mean", "sum", "max"]),
    "aggregation_norm": tune.quniform(lower=1, upper=200, q=1),
    "batch_size": tune.quniform(lower=5, upper=200, q=5),
    "depth": tune.randint(lower=2, upper=6),
    "dropout": tune.choice([tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]),
    "ffn_hidden_size": tune.quniform(lower=300, upper=2400, q=100),
    "ffn_num_layers": tune.randint(lower=2, upper=6),
    "final_lr_ratio": tune.loguniform(lower=1e-4, upper=1),
    "hidden_size": tune.quniform(lower=300, upper=2400, q=100),
    "init_lr_ratio": tune.loguniform(lower=1e-4, upper=1),
    "linked_hidden_size": tune.quniform(lower=300, upper=2400, q=100),
    "max_lr": tune.loguniform(lower=1e-6, upper=1e-2),
}

I prefer the pickle format because:

  • it is more flexible in handling user input types
  • the JSON format requires the same amount of user knowledge of Ray Tune functions as the pickle format
  • we don't have to maintain a custom JSON-to-search-space translator
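
For context, the pickle-based workflow would look roughly like this (the CLI flag name is hypothetical, and whether Ray Tune sample objects pickle cleanly may depend on the Ray version):

import pickle

from ray import tune

# build the search space directly with Ray Tune objects, nesting included
search_space = {
    "depth": tune.randint(lower=2, upper=6),
    "dropout": tune.choice(
        [tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]
    ),
}

# serialize it for the CLI, e.g. `chemprop hpopt --search-space search_space.pkl`
with open("search_space.pkl", "wb") as f:
    pickle.dump(search_space, f)

# the CLI would load it back verbatim; no JSON-to-tune translation layer needed
with open("search_space.pkl", "rb") as f:
    search_space = pickle.load(f)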

davidegraff (Contributor):

I see two questions here:

  1. "How can we accept plaintext input to parameterize the search space for a given hyperparameter?"
  2. "More broadly, can we programmatically define a search space using plaintext input?"

For (1), we'd have to design some sort of factory pattern around the ray.tune search space API. This actually isn't that hard and requires (i) a mapping from a string key to the corresponding function and (ii) a way to pass parameters to that function. Luckily, Ray has already provided (i) for us:

DISTRIBUTIONS = {
    "uniform": tune.uniform,
    "quniform": tune.quniform,
    "loguniform": tune.loguniform,
    "qloguniform": tune.qloguniform,
    "randn": tune.randn,
    "qrandn": tune.qrandn,
    "randint": tune.randint,
    "qrandint": tune.qrandint,
    "lograndint": tune.lograndint,
    "qlograndint": tune.qlograndint,
    "choice": tune.choice,
    "func": tune.sample_from,    # NOTE: not a good idea
    "grid": tune.grid_search,
}

You could also just do getattr(tune, key), but that gives less control over which search spaces we allow. From there, you just pass through the received parameters like so: func(**params). So you would need some sort of plaintext format that enables providing a mapping like this:

{
  "distribution": "uniform",
  "params": {
    "lower": 0.0,
    "upper": 1.0
  }
}

Note

I'm using JSON here, but you can use YAML, TOML, INI, etc.

Then, inside hyperopt.py, you can just create the search space like so:

d = json.load(...)
search_space = DISTRIBUTIONS[d['distribution']](**d['params'])
# or equivalently
search_space = getattr(tune, d['distribution'])(**d['params'])

There would need to be some error checking here to validate the input parameters. This can be somewhat cumbersome because we will need to establish, for each hyperparameter, the allowable search space parameterizations. Ex: "aggregation" can only be "choice" with params["choices"] $\subseteq$ {"max", "norm", "mean", "sum"}, and "batch_size" can be any integer distribution with a lower bound $\geq 1$ or tune.choice where params["choices"] $\subseteq \mathbb{N}^n$.
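
For example, a sketch of that per-hyperparameter validation (names are illustrative; note that tune.choice's keyword argument is categories, so that key is used for params here):

from ray import tune

ALLOWED_AGGREGATIONS = {"max", "norm", "mean", "sum"}


def validate_aggregation(spec: dict) -> None:
    # illustrative check for the "aggregation" hyperparameter only
    if spec["distribution"] != "choice":
        raise ValueError("'aggregation' only supports the 'choice' distribution")
    if not set(spec["params"]["categories"]) <= ALLOWED_AGGREGATIONS:
        raise ValueError(f"unknown aggregation(s): {spec['params']['categories']}")


spec = {"distribution": "choice", "params": {"categories": ["mean", "sum"]}}
validate_aggregation(spec)  # passes
space = tune.choice(**spec["params"])  # equivalent to DISTRIBUTIONS["choice"](**...)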


For (2), we would need to establish a mapping from a given hyperparameter key to its actual hyperparameter. We already have this in the form of our train.py args and the update_args_with_config() method defined below.

hwpang (Contributor, Author):

@davidegraff Thanks for the detailed write-up! Note that using this would not allow a nested search space such as the one below, which is used in our default, unless we make the parsing function parse recursively.

"dropout": tune.choice([tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]),

davidegraff (Contributor):

Yeah that's one of the drawbacks to allowing "fully user-defined search spaces" unless we also define our own text parser. I think that's outside the scope of the chemprop CLI and can thus be limited to (1) our hardcoded search space or (2) a user-implemented hyperoptimization.

hwpang (Contributor, Author):

I've decided to split this part of the changes into a separate PR and focus on getting the base-case hyperparameter optimization in for now.

shihchengli (Contributor) left a comment:

Thanks for making this PR. I've left some comments below. As we switch from Hyperopt to Ray Tune, some v1 arguments might not work. Also, we probably need to support ensemble modeling and cross-validation during hyperparameter tuning when the model is trained on a tiny dataset; in such cases, model performance depends heavily on the splits.

shihchengli (Contributor):
As discussed in the previous meeting, we are considering providing some flexibility to allow users to define their own search space. However, the method for handling parameter expressions has not been fully determined. It largely depends on the level of flexibility we wish to offer. Below are some of my thoughts:

(1) Provide a range list for each search parameter

{
  "learning_rate": [0.001, 0.0001, 0.00001],
  "batch_size": [16, 32, 64],
  "optimizer": ["adam", "sgd"]
}

(2) Establish a stochastic expression in advance and only let the user provide the range:

{
  "learning_rate": {
    "min": 0.00001,
    "max": 0.001
  },
  "batch_size": [16, 32, 64],
  "optimizer": ["adam", "sgd"]
}

from ray import tune


def to_tune_search_space(search_space):
    tune_search_space = {}
    for param, values in search_space.items():
        if isinstance(values, dict) and 'min' in values and 'max' in values:  # continuous parameter
            tune_search_space[param] = tune.uniform(values['min'], values['max'])
        elif isinstance(values, list):  # discrete parameter
            tune_search_space[param] = tune.grid_search(values)
        else:
            raise ValueError(f"Unsupported format for parameter {param}")
    return tune_search_space
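
For instance, feeding option (2)'s example through this helper gives:

search_space = to_tune_search_space({
    "learning_rate": {"min": 0.00001, "max": 0.001},
    "batch_size": [16, 32, 64],
    "optimizer": ["adam", "sgd"],
})
# -> {"learning_rate": tune.uniform(1e-05, 0.001),
#     "batch_size": tune.grid_search([16, 32, 64]),
#     "optimizer": tune.grid_search(["adam", "sgd"])}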

(3) Allow users to define both the stochastic expression and the search range simultaneously

{
  "activation": {
    "choice": ["relu", "sigmoid"]
  },
  "batch_size": {
    "quniform": {
      "lower": 32,
      "upper": 128,
      "q": 32
    }
  },
  "lr": {
    "loguniform": {
      "lower": 1e-5,
      "upper": 1e-3
    }
  }
}

The above cases are based on the JSON format, but I personally prefer YAML because it supports comments and is more succinct.

JacksonBurns (Member) left a comment:

Some small notes, especially about some possibly blocking PRs

@JacksonBurns self-requested a review April 2, 2024 18:43
…e use for dropout. Commenting out relevant code for now.
hwpang (Contributor, Author) commented Apr 12, 2024

I was testing the hyperparameter optimization function. Ray Tune + HyperOpt works fine. With Ray Tune + Optuna, however, it seems that it can't handle a nested distribution search space such as the one below.

"dropout": tune.choice([tune.choice([0.0]), tune.quniform(lower=0.05, upper=0.4, q=0.05)]),

The best config is returned as

{'train_loop_config': {'ffn_hidden_size': 700.0, 'hidden_size': 1700.0, 'ffn_num_layers': 3, 'depth': 2, 'dropout': <ray.tune.search.sample.Categorical object at 0x7fc8932f40d0>}}

while something like this is expected

{'train_loop_config': {'ffn_hidden_size': 700.0, 'hidden_size': 1700.0, 'ffn_num_layers': 3, 'depth': 2, 'dropout': 0}}

I'm inclined to comment out the relevant code for now, given the v2.0 release deadline, and only support the random search and HyperOpt search algorithms. We can keep track of this for the future.
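
For reference, one way to sidestep the nesting entirely would be to flatten dropout into a single expression; this is only an approximation of the default (it gives 0.0 the same weight as every other grid point), not what this PR does:

from ray import tune

# flat alternative: samples dropout from {0.0, 0.05, ..., 0.4} directly,
# so any searcher (including Optuna) sees a single distribution
dropout_space = tune.quniform(lower=0.0, upper=0.4, q=0.05)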

hwpang (Contributor, Author) commented Apr 12, 2024

@KnathanM @JacksonBurns @am2145 This is ready for another review and merge! I would like to get this in today for the v2.0 release. All additional feature requests should be opened as v2.1 issues for feature tracking!

JacksonBurns (Member) commented on lines 83 to 84:

python -m pip install ".[dev,docs,test,hpopt]"
python -m pip install optuna

Suggested change:

-python -m pip install ".[dev,docs,test,hpopt]"
-python -m pip install optuna
+python -m pip install ".[dev,docs,test]"
+python -m pip install ".[hpopt]" optuna

This could fix the CI issue.

…i.yml (Line: 85, Col: 13): Unexpected symbol: '"3'. Located at position 26 within expression: matrix.python-version == "3.11""
Member commented on lines 24 to 39:

from ray import tune
from ray.train import CheckpointConfig, RunConfig, ScalingConfig
from ray.train.lightning import (
    RayDDPStrategy,
    RayLightningEnvironment,
    RayTrainReportCallback,
    prepare_trainer,
)
from ray.train.torch import TorchTrainer
from ray.tune.schedulers import ASHAScheduler

NO_HYPEROPT = False
try:
    from ray.tune.search.hyperopt import HyperOptSearch
except ImportError:
    NO_HYPEROPT = True

Suggested change (move all of the Ray imports inside the try/except):

NO_HYPEROPT = False
try:
    from ray.tune.search.hyperopt import HyperOptSearch
    from ray import tune
    from ray.train import CheckpointConfig, RunConfig, ScalingConfig
    from ray.train.lightning import (
        RayDDPStrategy,
        RayLightningEnvironment,
        RayTrainReportCallback,
        prepare_trainer,
    )
    from ray.train.torch import TorchTrainer
    from ray.tune.schedulers import ASHAScheduler
except ImportError:
    NO_HYPEROPT = True

resolves latest test failure here: https://github.com/chemprop/chemprop/actions/runs/8664591034/job/23761373381?pr=734#step:6:136


Review comment on the DEFAULT_SEARCH_SPACE definition:

logger = logging.getLogger(__name__)

DEFAULT_SEARCH_SPACE = {

Member:

If you have this dict as a global variable, it will be created at import time of this module, and it requires that the hpopt packages are installed. You can fix this by:

  • moving this definition into the try/except block where you do the imports
  • inside the CLI main class, moving the import of the hpopt subcommand into the body of the cli function to avoid importing it until needed

The former is easier.
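
A minimal sketch of the first option (the dict contents are abbreviated, and NO_RAY is a hypothetical flag mirroring the NO_HYPEROPT pattern above):

NO_RAY = False
try:
    from ray import tune

    # defined inside the try block so importing this module succeeds
    # even when the optional hpopt dependencies are missing
    DEFAULT_SEARCH_SPACE = {
        "depth": tune.randint(lower=2, upper=6),
        # ... remaining hyperparameters ...
    }
except ImportError:
    NO_RAY = True
    DEFAULT_SEARCH_SPACE = None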

JacksonBurns (Member) left a comment:

LGTM assuming tests pass. My only ask is that you open an issue to track adding Optuna back in.

shihchengli (Contributor) left a comment:

I only have a few minor comments here, but I will approve it for now as most parts look good to me. Perhaps you could open an issue related to enabling fully customizable distribution types and search ranges for tracking purposes.

Co-authored-by: Shih-Cheng Li <scli@mit.edu>
@KnathanM enabled auto-merge April 12, 2024 18:53
@KnathanM merged commit ec55383 into chemprop:v2/dev Apr 12, 2024
9 checks passed
@shihchengli mentioned this pull request May 28, 2024
Successfully merging this pull request may close these issues:

[TODO]: Implement HPO and change default HPO search space