【Hackathon No.60】refactor unary sparse ops and add sparse sqrt, tanh, sin #41356

Merged
merged 36 commits on May 12, 2022
Changes from 8 commits
Commits
36 commits
97ac270
refactor unary sparse ops and add relu
tiancaishaonvjituizi Apr 2, 2022
26f4662
add test
tiancaishaonvjituizi Apr 2, 2022
a7f3410
fix the bug in generated api code, tests are passed now
tiancaishaonvjituizi Apr 4, 2022
d4310af
Merge branch 'develop' into sparse_relu
tiancaishaonvjituizi Apr 20, 2022
7e5f102
update relu for new sparse api
tiancaishaonvjituizi Apr 20, 2022
71864fd
update test, implement api, fix sqrt grad
tiancaishaonvjituizi Apr 21, 2022
a99a5ba
manually register relu and relu_grad kernel to bypass the restriction
tiancaishaonvjituizi Apr 21, 2022
95aa0b3
polish sqrt docs
tiancaishaonvjituizi Apr 21, 2022
f706dea
reformat
tiancaishaonvjituizi Apr 21, 2022
d898df7
polish docs
tiancaishaonvjituizi Apr 21, 2022
b770f41
remove csr backward api
tiancaishaonvjituizi Apr 21, 2022
f92e8cd
fix test compile error
tiancaishaonvjituizi Apr 21, 2022
394ce5e
use allclose instead of array_equal
tiancaishaonvjituizi Apr 21, 2022
c577f46
move sqrt to math_kernel.cc, implement sin and tanh
tiancaishaonvjituizi Apr 21, 2022
3ad6fba
polish header file
tiancaishaonvjituizi Apr 21, 2022
56fc5da
reformat
tiancaishaonvjituizi Apr 21, 2022
1f18c59
refine
tiancaishaonvjituizi Apr 21, 2022
c606825
fix typo
tiancaishaonvjituizi Apr 22, 2022
5dd4507
fix typo
tiancaishaonvjituizi Apr 22, 2022
f59fa26
add test about error, reformat
tiancaishaonvjituizi Apr 23, 2022
dea61c7
fix test error
tiancaishaonvjituizi Apr 23, 2022
60c7359
fix format
tiancaishaonvjituizi Apr 23, 2022
ad8ceda
fix false positive warning in gcc>=9
tiancaishaonvjituizi Apr 26, 2022
178dd27
use more aggressive way
tiancaishaonvjituizi Apr 26, 2022
1ace46f
Merge remote-tracking branch 'origin/develop' into sparse_relu
tiancaishaonvjituizi Apr 26, 2022
7bb41d7
add api in paddle.sparse namespace
tiancaishaonvjituizi Apr 26, 2022
790cb0d
Merge remote-tracking branch 'tiancaishaonv/variant_fix_gcc9_fp_warni…
tiancaishaonvjituizi Apr 26, 2022
c44ac74
address reviews
tiancaishaonvjituizi Apr 27, 2022
d35e923
Merge remote-tracking branch 'origin/develop' into sparse_relu
tiancaishaonvjituizi Apr 27, 2022
b04ab6c
fix ci error
tiancaishaonvjituizi Apr 29, 2022
fa93d7d
rename to unary_kernel, update name
tiancaishaonvjituizi May 6, 2022
a6d2cd0
Merge remote-tracking branch 'origin/develop' into sparse_relu
tiancaishaonvjituizi May 6, 2022
67d14b4
remove unused files
tiancaishaonvjituizi May 6, 2022
268ac34
rename python files
tiancaishaonvjituizi May 6, 2022
39c9750
fix import path
tiancaishaonvjituizi May 7, 2022
06787c0
reformat
tiancaishaonvjituizi May 9, 2022
2 changes: 1 addition & 1 deletion paddle/phi/api/lib/api_gen_utils.cc
@@ -154,7 +154,7 @@ phi::TensorBase* SetSparseKernelOutput(Tensor* out, TensorType type) {
std::make_shared<phi::SparseCsrTensor>(phi::DenseTensor(),
phi::DenseTensor(),
phi::DenseTensor(),
phi::DDim{-1});
phi::DDim{-1, -1});
Comment from the PR author (tiancaishaonvjituizi):

Paddle's original code here was wrong; it may never have been tested. The SparseCsrTensor constructor checks the length of dims and only allows 2-D and 3-D shapes, so the 1-D dim here made the check fail.
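The constraint the author describes can be sketched in Python (a hypothetical stand-in for the C++ `check_shape` in sparse_csr_tensor.cc, not Paddle's actual API):

```python
def check_csr_shape(dims):
    """Mimic SparseCsrTensor's shape check: only 2-D (matrix) and
    3-D (batched matrix) shapes are valid for a CSR tensor."""
    if len(dims) not in (2, 3):
        raise ValueError(
            f"the SparseCsrTensor only supports 2-D or 3-D Tensor, "
            f"but got {len(dims)}-D Tensor")
    return dims

# The old placeholder shape {-1} is 1-D and fails the check;
# the fixed placeholder {-1, -1} is 2-D and passes.
check_csr_shape((-1, -1))  # ok
```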

Comment from a reviewer (Contributor):

The number of changed files in this PR now exceeds the maximum limit. How about splitting the two bug-fix files, api_gen_utils.cc and sparse_csr_tensor.cc, into a separate PR?

Comment from the PR author (tiancaishaonvjituizi):

Sure~

out->set_impl(sparse_tensor);
return sparse_tensor.get();
} else {
Expand Down
8 changes: 5 additions & 3 deletions paddle/phi/core/sparse_csr_tensor.cc
@@ -27,9 +27,11 @@ SparseCsrTensor::SparseCsrTensor() {
inline void check_shape(const DDim& dims) {
bool valid = dims.size() == 2 || dims.size() == 3;

PADDLE_ENFORCE(valid,
phi::errors::InvalidArgument(
"the SparseCsrTensor only support 2-D Tensor."));
PADDLE_ENFORCE(
valid,
phi::errors::InvalidArgument("the SparseCsrTensor only supports 2-D or "
"3-D Tensor, but got %d-D Tensor",
dims.size()));
}
#define Check(non_zero_crows, non_zero_cols, non_zero_elements, dims) \
{ \
Expand Down
1 change: 1 addition & 0 deletions paddle/phi/kernels/activation_grad_kernel.h
@@ -187,6 +187,7 @@ DECLARE_ACTIVATION_GRAD_KERNEL_DEPX(Log1p);
DECLARE_ACTIVATION_GRAD_KERNEL_DEPOUT(Relu);
DECLARE_ACTIVATION_GRAD_KERNEL_DEPOUT(Tanh);
DECLARE_ACTIVATION_GRAD_KERNEL_DEPOUT(Sigmoid);
DECLARE_ACTIVATION_GRAD_KERNEL_DEPOUT(Sqrt);
Comment from the PR author (tiancaishaonvjituizi):

Previously, the dense tensor's SqrtGrad kernel was not declared in this header file.
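The `DEPOUT` in the declaration above matters: sqrt's gradient is naturally expressed through the forward *output* rather than the input, since for y = sqrt(x), dy/dx = 1/(2·sqrt(x)) = 1/(2y). A minimal sketch of that math (`sqrt_grad` is an illustrative name, not Paddle's signature):

```python
import math

def sqrt_grad(out, grad_out):
    """Backward of y = sqrt(x), written in terms of the forward output:
    dL/dx = dL/dy * 1 / (2 * y). This dependence on `out` rather than
    `x` is what DECLARE_ACTIVATION_GRAD_KERNEL_DEPOUT encodes."""
    return grad_out / (2.0 * out)

y = math.sqrt(4.0)                 # forward: y = 2.0
assert sqrt_grad(y, 1.0) == 0.25   # d sqrt(x)/dx at x = 4 is 1/(2*2)
```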


DECLARE_ACTIVATION_GRAD_KERNEL_NODEP(Round);
DECLARE_ACTIVATION_GRAD_KERNEL_NODEP(Floor);
Expand Down
69 changes: 32 additions & 37 deletions paddle/phi/kernels/sparse/activation_grad_kernel.cc
@@ -13,58 +13,53 @@ See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/phi/kernels/sparse/activation_grad_kernel.h"
#include "paddle/phi/kernels/activation_grad_kernel.h"
#include "paddle/phi/kernels/copy_kernel.h"
#include "paddle/phi/kernels/empty_kernel.h"

#include "paddle/phi/backends/cpu/cpu_context.h"
#include "paddle/phi/backends/gpu/gpu_context.h"
#include "paddle/phi/core/kernel_registry.h"

namespace phi {
namespace sparse {

template <typename T, typename Context>
void SparseReluGradKernel(const Context& dev_ctx,
const SparseCooTensor& x,
const SparseCooTensor& out_grad,
SparseCooTensor* x_grad) {
DenseTensor non_zero_indices =
phi::EmptyLike<T, Context>(dev_ctx, x.non_zero_indices());
DenseTensor non_zero_elements =
phi::EmptyLike<T, Context>(dev_ctx, x.non_zero_elements());
phi::Copy(dev_ctx,
x.non_zero_indices(),
dev_ctx.GetPlace(),
false,
&non_zero_indices);
phi::ReluGradKernel<T, Context>(dev_ctx,
x.non_zero_elements(),
out_grad.non_zero_elements(),
&non_zero_elements);
x_grad->SetMember(non_zero_indices, non_zero_elements, x.dims(), true);
}
#include "paddle/phi/kernels/activation_grad_kernel.h"
#include "paddle/phi/kernels/sparse/utils.h"

} // namespace sparse
} // namespace phi
DEFINE_AND_REGISTER_SPARSE_UNARY_GRAD_KERNEL(sqrt_grad, SqrtGradKernel)

PD_REGISTER_KERNEL(sparse_relu_grad,
// NOTE: the following code is to bypass the restriction of Paddle
// kernel registration mechanism. Do NOT refactor them unless you
// know what you are doing.
// If you want to implement any new kernel, please follow `sqrt_grad` above
// instead of `relu_grad` following
DEFINE_SPARSE_UNARY_GRAD_KERNEL(ReluGradKernel)
PD_REGISTER_KERNEL(sparse_coo_relu_grad,
CPU,
ALL_LAYOUT,
phi::sparse::SparseReluGradKernel,
phi::sparse::SparseCooReluGradKernel,
float,
double) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_COO);
}

PD_REGISTER_KERNEL(sparse_csr_relu_grad,
CPU,
ALL_LAYOUT,
phi::sparse::SparseCsrReluGradKernel,
float,
double) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_CSR);
}
#if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP)
PD_REGISTER_KERNEL(sparse_relu_grad,
PD_REGISTER_KERNEL(sparse_coo_relu_grad,
GPU,
ALL_LAYOUT,
phi::sparse::SparseReluGradKernel,
phi::sparse::SparseCooReluGradKernel,
float,
double,
phi::dtype::float16) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_COO);
}

PD_REGISTER_KERNEL(sparse_csr_relu_grad,
GPU,
ALL_LAYOUT,
phi::sparse::SparseCsrReluGradKernel,
float,
double,
phi::dtype::float16) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_CSR);
}
#endif
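The grad kernels in this file all follow the one pattern that the DEFINE_AND_REGISTER_SPARSE_UNARY_GRAD_KERNEL macro captures: reuse the sparsity structure (indices) of x, and run the dense elementwise backward kernel over only the stored non-zero values. A toy COO model of that pattern (all names here are illustrative stand-ins, not phi's API):

```python
class CooTensor:
    """Minimal stand-in for phi::SparseCooTensor: indices + values + dims."""
    def __init__(self, indices, values, dims):
        self.indices, self.values, self.dims = indices, values, dims

def relu_grad_values(x_vals, gout_vals):
    # dense elementwise ReLU backward: pass the gradient through where x > 0
    return [g if x > 0 else 0.0 for x, g in zip(x_vals, gout_vals)]

def sparse_coo_relu_grad(x, out_grad):
    # copy x's indices; the op only ever touches stored (non-zero) entries,
    # so the output has the same sparsity pattern as the input
    return CooTensor(list(x.indices),
                     relu_grad_values(x.values, out_grad.values),
                     x.dims)
```

For relu the mask x > 0 and out > 0 coincide, so writing it in terms of x or of out gives the same result.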

23 changes: 18 additions & 5 deletions paddle/phi/kernels/sparse/activation_grad_kernel.h
@@ -15,15 +15,28 @@ limitations under the License. */
#pragma once

#include "paddle/phi/core/sparse_coo_tensor.h"
#include "paddle/phi/core/sparse_csr_tensor.h"

namespace phi {
namespace sparse {

template <typename T, typename Context>
void SparseReluGradKernel(const Context& dev_ctx,
const SparseCooTensor& x,
const SparseCooTensor& out_grad,
SparseCooTensor* x_grad);
#define DECLARE_SPARSE_ACTIVATION_GRAD_KERNEL(name) \
template <typename T, typename Context> \
void SparseCoo##name##GradKernel(const Context& dev_ctx, \
const SparseCooTensor& x, \
const SparseCooTensor& out_grad, \
SparseCooTensor* x_grad); \
\
template <typename T, typename Context> \
void SparseCsr##name##GradKernel(const Context& dev_ctx, \
const SparseCsrTensor& x, \
const SparseCsrTensor& out_grad, \
SparseCsrTensor* x_grad);

DECLARE_SPARSE_ACTIVATION_GRAD_KERNEL(Relu)
DECLARE_SPARSE_ACTIVATION_GRAD_KERNEL(Sqrt)

#undef DECLARE_SPARSE_ACTIVATION_GRAD_KERNEL

} // namespace sparse
} // namespace phi
61 changes: 30 additions & 31 deletions paddle/phi/kernels/sparse/activation_kernel.cc
@@ -13,54 +13,53 @@ See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/phi/kernels/sparse/activation_kernel.h"
#include "paddle/phi/kernels/copy_kernel.h"
#include "paddle/phi/kernels/empty_kernel.h"

#include "paddle/phi/backends/cpu/cpu_context.h"
#include "paddle/phi/backends/gpu/gpu_context.h"
#include "paddle/phi/core/kernel_registry.h"
#include "paddle/phi/kernels/sparse/utils.h"

namespace phi {
namespace sparse {
DEFINE_AND_REGISTER_SPARSE_UNARY_KERNEL(sqrt, SqrtKernel)

template <typename T, typename Context>
void SparseReluKernel(const Context& dev_ctx,
const SparseCooTensor& x,
SparseCooTensor* out) {
DenseTensor non_zero_indices =
phi::EmptyLike<T, Context>(dev_ctx, x.non_zero_indices());
DenseTensor non_zero_elements =
phi::EmptyLike<T, Context>(dev_ctx, x.non_zero_elements());
phi::Copy(dev_ctx,
x.non_zero_indices(),
dev_ctx.GetPlace(),
false,
&non_zero_indices);
phi::ReluKernel<T, Context>(
dev_ctx, x.non_zero_elements(), &non_zero_elements);
out->SetMember(non_zero_indices, non_zero_elements, x.dims(), true);
}

} // namespace sparse
} // namespace phi
// NOTE: the following code is to bypass the restriction of Paddle
// kernel registration mechanism. Do NOT refactor them unless you
// know what you are doing.
// If you want to implement any new kernel, please follow `sqrt` above
// instead of `relu` following
Comment from the PR author (tiancaishaonvjituizi):

To avoid the restriction mentioned in #41356 (comment), the relu kernel is registered manually with PD_REGISTER_KERNEL instead of using the DEFINE_AND_REGISTER_SPARSE_UNARY_KERNEL macro, and detailed comments have been added explaining this.

DEFINE_SPARSE_UNARY_KERNEL(ReluKernel)

PD_REGISTER_KERNEL(sparse_relu,
PD_REGISTER_KERNEL(sparse_coo_relu,
CPU,
ALL_LAYOUT,
phi::sparse::SparseReluKernel,
phi::sparse::SparseCooReluKernel,
float,
double) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_COO);
}
PD_REGISTER_KERNEL(sparse_csr_relu,
CPU,
ALL_LAYOUT,
phi::sparse::SparseCsrReluKernel,
float,
double) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_CSR);
}

#if defined(PADDLE_WITH_CUDA) || defined(PADDLE_WITH_HIP)
PD_REGISTER_KERNEL(sparse_relu,
PD_REGISTER_KERNEL(sparse_coo_relu,
GPU,
ALL_LAYOUT,
phi::sparse::SparseReluKernel,
phi::sparse::SparseCooReluKernel,
float,
double,
phi::dtype::float16) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_COO);
}

PD_REGISTER_KERNEL(sparse_csr_relu,
GPU,
ALL_LAYOUT,
phi::sparse::SparseCsrReluKernel,
float,
double,
phi::dtype::float16) {
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_CSR);
}
#endif
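The forward kernels mirror the backward pattern: apply the elementwise op to the stored values and keep the indices unchanged. This is only sound for ops with f(0) = 0 (relu, sqrt, sin, tanh), since the unstored zeros must remain zero; that is presumably why this PR limits itself to such ops. A sketch (hypothetical helper name, not phi's API):

```python
import math

def sparse_coo_unary(indices, values, fn):
    """Apply an elementwise fn over a COO tensor's stored values.
    Valid only when fn(0) == 0, so implicit zeros stay zero and
    the sparsity pattern can be reused as-is."""
    return list(indices), [fn(v) for v in values]

# forward sqrt over two stored entries of a sparse matrix
idx, vals = sparse_coo_unary([(0, 1), (2, 2)], [4.0, 0.25], math.sqrt)
assert vals == [2.0, 0.5]
```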
28 changes: 23 additions & 5 deletions paddle/phi/kernels/sparse/activation_kernel.h
@@ -16,22 +16,40 @@ limitations under the License. */

#include "paddle/phi/core/dense_tensor.h"
#include "paddle/phi/core/sparse_coo_tensor.h"
#include "paddle/phi/core/sparse_csr_tensor.h"
#include "paddle/phi/kernels/activation_kernel.h"
#include "paddle/phi/kernels/empty_kernel.h"

namespace phi {
namespace sparse {

template <typename T, typename Context>
void SparseReluKernel(const Context& dev_ctx,
const SparseCooTensor& x,
SparseCooTensor* out);
#define DECLARE_SPARSE_ACTIVATION_KERNEL(name) \
template <typename T, typename Context> \
void SparseCoo##name##Kernel( \
const Context& dev_ctx, const SparseCooTensor& x, SparseCooTensor* out); \
\
template <typename T, typename Context> \
void SparseCsr##name##Kernel( \
const Context& dev_ctx, const SparseCsrTensor& x, SparseCsrTensor* out);

DECLARE_SPARSE_ACTIVATION_KERNEL(Relu)
DECLARE_SPARSE_ACTIVATION_KERNEL(Sqrt)

#undef DECLARE_SPARSE_ACTIVATION_KERNEL

template <typename T, typename Context>
SparseCooTensor SparseRelu(const Context& dev_ctx, const SparseCooTensor& x) {
DenseTensor indices, values;
SparseCooTensor coo(indices, values, x.dims());
SparseReluKernel<T, Context>(dev_ctx, x, &coo);
SparseCooReluKernel<T, Context>(dev_ctx, x, &coo);
return coo;
}

template <typename T, typename Context>
SparseCooTensor SparseSqrt(const Context& dev_ctx, const SparseCooTensor& x) {
DenseTensor indices, values;
SparseCooTensor coo(indices, values, x.dims());
SparseCooSqrtKernel<T, Context>(dev_ctx, x, &coo);
return coo;
}

Expand Down