Update the OneCycleLR API design document #132

Merged: 1 commit, May 16, 2022
17 changes: 7 additions & 10 deletions rfcs/APIs/20210312_api_design_for_one_cycle_lr.md
```diff
@@ -155,26 +155,23 @@ def get_lr(self):

 ## Naming and parameter design

-The API is designed as `paddle.optimizer.lr.OneCycleLR(max_learning_rate, total_steps=None, epochs=None, steps_per_epoch=None, pct_start=0.3, anneal_strategy='cos', divide_factor=25., final_divide_factor=1e4, three_phase=False, last_epoch=-1, verbose=False)`
+The API is designed as `paddle.optimizer.lr.OneCycleLR(max_learning_rate, total_steps, divide_factor=25., end_learning_rate=0.0001, phase_pct=0.3, anneal_strategy='cos', three_phase=False, last_epoch=-1, verbose=False)`

-Parameter renames: `div_factor` -> `divide_factor`, `final_div_factor` -> `final_divide_factor`, `max_lr` -> `max_learning_rate`.
+Parameter renames: `div_factor` -> `divide_factor`, `max_lr` -> `max_learning_rate`.

 Parameters:

 - max_learning_rate: the maximum learning rate during training;
-- total_steps: the total number of training steps; if not specified, it is computed as `epochs` * `steps_per_epoch`;
-- epochs: the number of training epochs;
-- steps_per_epoch: the number of training steps per epoch;
-- pct_start: the fraction of total training steps spent increasing the learning rate; default: 0.3;
-- anneal_strategy: the learning-rate annealing strategy, 'cos' or 'linear'; default: 'cos';
+- total_steps: the total number of training steps;
 - divide_factor: the initial learning rate `initial_learning_rate` is determined by `max_learning_rate` / `divide_factor`; default: 25;
-- final_divide_factor: the minimum learning rate `min_learning_rate` is determined by `initial_learning_rate` / `final_divide_factor`; default: 1e4;
-- three_phase: if `True`, a three-phase schedule is used: the learning rate first rises from `initial_learning_rate` to `max_learning_rate`, then falls back to initial_learning_rate, and finally falls to min_learning_rate; if `False`, a two-phase schedule is used: the learning rate first rises from `initial_learning_rate` to `max_learning_rate`, then falls directly to `min_learning_rate`. Default: `False`;
+- end_learning_rate: the minimum learning rate during training; it should be much smaller than the initial learning rate;
+- phase_pct: the fraction of total training steps spent increasing the learning rate; default: 0.3;
+- anneal_strategy: the learning-rate annealing strategy, 'cos' or 'linear'; default: 'cos';
+- three_phase: if `True`, a three-phase schedule is used: the learning rate first rises from `initial_learning_rate` to `max_learning_rate`, then falls back to `initial_learning_rate`, and finally falls to `min_learning_rate`; if `False`, a two-phase schedule is used: the learning rate first rises from `initial_learning_rate` to `max_learning_rate`, then falls directly to `min_learning_rate`. Default: `False`;
 - last_epoch: optional; the epoch count of the previous run, set when resuming training. Default: -1, meaning the initial learning rate is used;
 - verbose: optional; if `True`, a message is printed to stdout on every update. Default: `False`.
```
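The schedule these parameters describe can be sketched in plain Python. This is an illustrative sketch only, not the Paddle implementation; the exact phase-boundary arithmetic here is an assumption, and `one_cycle_lr` is a hypothetical helper, not a proposed API.

```python
import math


def one_cycle_lr(step, max_learning_rate, total_steps,
                 divide_factor=25.0, end_learning_rate=0.0001,
                 phase_pct=0.3, anneal_strategy="cos", three_phase=False):
    """Return the learning rate at 0-indexed `step` of a one-cycle schedule.

    Illustrative sketch of the design above -- not Paddle's implementation.
    """
    initial_lr = max_learning_rate / divide_factor

    def anneal(start, end, pct):
        # Interpolate from `start` to `end` as pct goes 0 -> 1.
        if anneal_strategy == "cos":
            return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1)
        return start + (end - start) * pct  # 'linear'

    if three_phase:
        # Rise to max, fall back to initial, then fall to the end LR.
        boundaries = [phase_pct * total_steps - 1,
                      2 * phase_pct * total_steps - 2,
                      total_steps - 1]
        phases = [(initial_lr, max_learning_rate),
                  (max_learning_rate, initial_lr),
                  (initial_lr, end_learning_rate)]
    else:
        # Rise to max, then fall directly to the end LR.
        boundaries = [phase_pct * total_steps - 1, total_steps - 1]
        phases = [(initial_lr, max_learning_rate),
                  (max_learning_rate, end_learning_rate)]

    phase_start = 0.0
    for boundary, (start, end) in zip(boundaries, phases):
        if step <= boundary:
            pct = (step - phase_start) / (boundary - phase_start)
            return anneal(start, end, pct)
        phase_start = boundary
    return end_learning_rate
```

With the defaults and `total_steps=100`, the rate starts at `max_learning_rate / 25`, peaks at `max_learning_rate` around step 29 (`phase_pct * total_steps`), and cosine-anneals down to `end_learning_rate` by the final step.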



## Low-level OP design

This is implemented purely in Python; no low-level OP needs to be designed.