Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

【Hackathon No.11】为 Paddle 新增 multi_margin_loss API #171

Merged
merged 5 commits into from Aug 8, 2022
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
182 changes: 182 additions & 0 deletions rfcs/APIs/20220707_api_design_for_multi_margin_loss.md
@@ -0,0 +1,182 @@
# paddle.nn.MultiMarginLoss 设计文档


|API名称 | MultiMarginLoss |
|---|---|
|提交作者<input type="checkbox" class="rowselector hidden"> | yangguohao |
|提交时间<input type="checkbox" class="rowselector hidden"> | 2022-07-07 |
|版本号 | v2.2|
|依赖飞桨版本<input type="checkbox" class="rowselector hidden"> | develop版本 |
|文件名 | 20220707_api_design_for_multi_margin_loss.md<br> |


# 一、概述
## 1、相关背景
为了提升飞桨API丰富度,Paddle需要扩充 API paddle.nn.MultiMarginLoss 以及 paddle.functional.multi_margin_loss

## 2、功能目标
paddle.nn.MultiMarginLoss 为多分类问题的 Hinge loss。
loss 按照以下公式计算

$$\text{loss}(x, y) = \frac{\sum_i \max(0, \text{margin} - x[y] + x[i])^p}{\text{x.size}(0)}$$

如果有权重 w 下,公式为

$$\text{loss}(x, y) = \frac{\sum_i \max(0, w[y] * (\text{margin} - x[y] + x[i]))^p}{\text{x.size}(0)}$$

## 3、意义
为 paddle 框架中新增计算损失函数的方法,完善并丰富飞桨深度学习框架的功能及实用性。

# 二、飞桨现状
Paddle 目前没有相关的损失函数方法。


# 三、业内方案调研
Pytorch 中有相关的
`torch.nn.functional.multi_margin_loss(
input: Tensor,
target: Tensor,
p: int = 1,
margin: float = 1.0,
weight: Optional[Tensor] = None,
size_average: Optional[bool] = None,
reduce: Optional[bool] = None,
reduction: str = "mean",
) -> Tensor:`和`torch.nn.MultiMarginLoss(p: int = 1,
margin: float = 1.,
weight: Optional[Tensor] = None,
size_average=None,
reduce=None,
reduction: str = 'mean') -> None:`
其中的size_average和reduce将被弃用,统一使用reduction
在 pytorch 中,介绍为:
```
"""Creates a criterion that optimizes a multi-class classification hinge
loss (margin-based loss) between input :math:`x` (a 2D mini-batch `Tensor`) and
output :math:`y` (which is a 1D tensor of target class indices,
:math:`0 \leq y \leq \text{x.size}(1)-1`):

For each mini-batch sample, the loss in terms of the 1D input :math:`x` and scalar
output :math:`y` is:

.. math::
\text{loss}(x, y) = \frac{\sum_i \max(0, \text{margin} - x[y] + x[i])^p}{\text{x.size}(0)}

where :math:`x \in \left\{0, \; \cdots , \; \text{x.size}(0) - 1\right\}`
and :math:`i \neq y`.

Optionally, you can give non-equal weighting on the classes by passing
a 1D :attr:`weight` tensor into the constructor.

The loss function then becomes:

.. math::
\text{loss}(x, y) = \frac{\sum_i \max(0, w[y] * (\text{margin} - x[y] + x[i]))^p}{\text{x.size}(0)}
"""
```

Pytorch 代码
```
def multi_margin_loss(
input: Tensor,
target: Tensor,
p: int = 1,
margin: float = 1.0,
weight: Optional[Tensor] = None,
size_average: Optional[bool] = None,
reduce: Optional[bool] = None,
reduction: str = "mean",
) -> Tensor:
r"""multi_margin_loss(input, target, p=1, margin=1, weight=None, size_average=None, reduce=None, reduction='mean') -> Tensor
See :class:`~torch.nn.MultiMarginLoss` for details.
"""
if has_torch_function_variadic(input, target, weight):
return handle_torch_function(
multi_margin_loss,
(input, target, weight),
input,
target,
p=p,
margin=margin,
weight=weight,
size_average=size_average,
reduce=reduce,
reduction=reduction,
)
if size_average is not None or reduce is not None:
reduction_enum = _Reduction.legacy_get_enum(size_average, reduce)
else:
reduction_enum = _Reduction.get_enum(reduction)
if p != 1 and p != 2:
raise ValueError("only p == 1 and p == 2 supported")
if weight is not None:
if weight.dim() != 1:
raise ValueError("weight must be one-dimensional")

return torch._C._nn.multi_margin_loss(input, target, p, margin, weight, reduction_enum)
```


tensorflow没有专门的multimarginloss

# 四、对比分析
- 采取使用pytorch的框架方法。


# 五、设计思路与实现方案

## 命名与参数设计
共添加以下两个 API:

- `paddle.nn.functional.multi_margin_loss(
input,
label,
p:int=1,
margin:float=1.0,
weight= None,
reduction: str = "mean",
name:str=None,
) -> Tensor:`
- input:Tensor, 维度为[batchsize,num_classes]
- label:Tensor, 维度为[batchsize,]
- weight: Optional[Tensor],维度为[num_classes,]
- reduction:str,'none','mean','sum
- name (str,可选)
和`
- paddle.nn.MultiMarginLoss(
p:int = 1,
margin:float=1.0,
weight (Tensor,可选),
reduction(str,可选),
) -> None:

## 底层OP设计
## API实现方案

1. 检查参数

1. 检查 reduction 有效性(同其余 functional loss 中的实现)
2. 检查输入的 dtype(含 `input`、`label`、`weight`)(同其余 functional loss 中的实现)
3. 检查输入的`input`、`label`、`weight`维度是否正确。
4. 检查 p, 只支持 1 和 2(同 pytorch 的实现)。

2. 计算

1. 根据公式计算每一个 batchsize 的 loss。

3. 根据 `reduction`,输出 loss(同其余 functional loss 中的实现)
# 六、测试和验收的考量

测试考虑的case如下:
- 1.动态图,静态图,要与numpy计算下的结果输出需要一致。
- 2.检查 p 的合法性。
- 3.检查 margin 的合法性
- 4.各reduction下计算一致
- 5.各参数输入有效。


# 七、可行性分析和排期规划
方案主要依赖现有paddle api组合而成,可以满足在当前版本周期内开发完成。

# 八、影响面
无影响