Adding prod & prod_dim #1173
Conversation
2. ONNX IR fix ReduceL*
3. ignoring ipynb checkpoints
# Conflicts:
#	burn-book/src/building-blocks/tensor.md
burn-candle/src/ops/int_tensor.rs
Outdated
@@ -311,6 +311,14 @@ impl<F: FloatCandleElement, I: IntCandleElement> IntTensorOps<Self> for Candle<F
        CandleTensor::new(tensor.tensor.sum_keepdim(dim).unwrap())
    }

    fn int_prod<const D: usize>(tensor: IntTensor<Self, D>) -> IntTensor<Self, 1> {
        todo!();
tch, ndarray & candle don't support prod_axis, which is required for this implementation.
QUESTION: should I implement it in the burn layer with some gymnastics, or leave it as panic!() like e.g. burn-candle's int_div_scalar?
I think it's rather bad to offer an operation in the API if it's gonna fail on 75% of our backends. I think if you can work out some default implementation in the burn layer it would be cool, otherwise we should not even offer the operation.
Candle's int_div_scalar is an exception that we should fix eventually.
I agree. I will file issues on tch, ndarray and candle and implement a default.
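For what it's worth, here is a hedged, plain-Rust sketch of the "default in the burn layer" idea (it does not use Burn's actual backend traits): a product reduction can be built from elementwise multiplication alone, which every backend already provides.

    // Illustrative sketch only, not Burn's backend API: a default prod_dim built
    // from elementwise multiplication. A 2-D tensor is modelled as rows, reduced
    // along dim 0 by folding rows with an elementwise mul.
    fn prod_dim0(rows: &[Vec<f32>]) -> Vec<f32> {
        let width = rows.first().map_or(0, |r| r.len());
        rows.iter().fold(vec![1.0; width], |acc, row| {
            acc.iter().zip(row).map(|(a, b)| a * b).collect()
        })
    }

    fn main() {
        let t = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
        println!("{:?}", prod_dim0(&t)); // [3.0, 8.0]
    }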
@@ -0,0 +1,116 @@
use burn_compute::tune::{AutotuneOperation, AutotuneOperationSet};
This is a large copy-pasta from the sum implementation.
What is the recommendation here? Parametrize the core implementation and call it with sum / prod arguments?
I guess we'll want to refactor this so that all reduce operations share some core. The list of reduction operations is growing (there's also this PR #1136 coming up with more autotuned reduce) so we can't afford to have this many duplicates. For now you can keep it as is but I'll open an issue for refactoring this.
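For illustration only, the kind of parametrization being discussed could look like this in plain Rust (this is not the actual burn-wgpu autotune code): a single reduction core that sum and prod instantiate with different identity elements and combine functions.

    // Illustrative sketch only: one shared reduction core, parametrized by the
    // identity element and the combine function, so sum, prod and future
    // reductions avoid duplicating the same skeleton.
    fn reduce<T: Copy>(data: &[T], identity: T, combine: impl Fn(T, T) -> T) -> T {
        data.iter().copied().fold(identity, combine)
    }

    fn main() {
        let xs = [1.0_f32, 2.0, 3.0, 4.0];
        let sum = reduce(&xs, 0.0, |a, b| a + b); // 10.0
        let prod = reduce(&xs, 1.0, |a, b| a * b); // 24.0
        println!("sum = {sum}, prod = {prod}");
    }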
Hi @unrenormalizable
Thanks for the PR draft, it's looking good. See my comments.
            (WORKGROUP_DEFAULT * WORKGROUP_DEFAULT).to_string(),
        )
        .register("initial", 0.0.to_string())
        .register("update", "shared_memory[local_id] += value; ")
I guess you'll want initial to be 1.0 and the update to be *=
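A hedged sketch of that suggestion against the snippet above (the final PR code may differ):

        // Suggested change (illustrative): a product reduction uses the
        // multiplicative identity and a multiplying update.
        .register("initial", 1.0.to_string())
        .register("update", "shared_memory[local_id] *= value; ")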
    workgroupBarrier();

    if id_local == 0u {
        var prod = {{ elem }}(0);
You should start at 1.0 or it won't compute much ;)
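For illustration, the corrected line would start the accumulator at the multiplicative identity (hedged sketch, not the final shader code):

        var prod = {{ elem }}(1);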
burn-autodiff/src/ops/tensor.rs
Outdated
    let ones = B::ones(shape, &B::device(&grad));
    let grad = B::prod_dim(grad, dim);

    B::mul(ones, grad)
Why multiply by one?
burn-autodiff/src/ops/tensor.rs
Outdated
    unary::<B, D, D, _>(ops.parents, ops.node, grads, |grad| {
        let ones = B::ones(shape, &B::device(&grad));
        let grad = B::prod_dim(grad, dim);
As an example, I think the derivative of prod_dim([a, b, c, d], 0) should be
[bcd, acd, abd, abc], then multiplied by the unchanged grad. So you'll need to register the original input as state. Don't hesitate to call me out if you think I'm wrong; I haven't thought about it for too long.
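A small, hedged illustration of that gradient for a 1-D slice in plain Rust (this is not the autodiff code in this PR): each entry gets the product of all other entries, scaled by the unchanged upstream grad.

    // Illustrative only: for y = a*b*c*d, dy/da = b*c*d, dy/db = a*c*d,
    // dy/dc = a*b*d, dy/dd = a*b*c, each multiplied by the upstream gradient.
    fn prod_grad(x: &[f64], upstream: f64) -> Vec<f64> {
        (0..x.len())
            .map(|i| {
                let others: f64 = x
                    .iter()
                    .enumerate()
                    .filter(|&(j, _)| j != i)
                    .map(|(_, &v)| v)
                    .product();
                upstream * others
            })
            .collect()
    }

    fn main() {
        // [3*4*5, 2*4*5, 2*3*5, 2*3*4] = [60, 40, 30, 24]
        println!("{:?}", prod_grad(&[2.0, 3.0, 4.0, 5.0], 1.0));
    }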
Yeah, I am stupid. I will add unit tests for this.
Great! Let me know when you start working on the ONNX part.
Will do & my apologies, it's taking way longer than expected. Overestimated my learning abilities 🤣
No worries. We are here to help.
# Conflicts:
#	burn-autodiff/src/ops/tensor.rs
#	burn-candle/src/lib.rs
#	burn-candle/src/ops/tensor.rs
#	burn-fusion/src/ops/float.rs
#	burn-fusion/src/stream/operation.rs
#	burn-ndarray/src/ops/tensor.rs
#	burn-tch/src/ops/tensor.rs
#	burn-wgpu/src/ops/float_ops.rs
Folks, I am going to pause the prod/prod_dim work until the required dependencies are available. The main blocker is that cumprod is required to implement the autodiff/grad component for prod; this follows the PyTorch implementation, which is a reasonable approach. In addition, I have filed the prod_axis requirements on huggingface/candle/1620 & rust-ndarray/ndarray/1351. The following items from this PR will be taken forward in a separate PR.
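For context, a hedged sketch in plain Rust (not PyTorch's or Burn's actual code) of why cumprod matters for the gradient: exclusive left/right cumulative products give the product of all other elements without dividing by the input, so zero entries are handled correctly.

    // Illustrative only: grad_i = upstream * (product left of i) * (product right of i),
    // computed with exclusive left/right cumulative products, with no division by x_i.
    fn prod_grad_via_cumprod(x: &[f64], upstream: f64) -> Vec<f64> {
        let n = x.len();
        let mut left = vec![1.0; n]; // exclusive cumprod from the left
        let mut right = vec![1.0; n]; // exclusive cumprod from the right
        for i in 1..n {
            left[i] = left[i - 1] * x[i - 1];
            right[n - 1 - i] = right[n - i] * x[n - i];
        }
        (0..n).map(|i| upstream * left[i] * right[i]).collect()
    }

    fn main() {
        // Correct even with a zero in the input: [0*4, 2*4, 2*0] = [0, 8, 0]
        println!("{:?}", prod_grad_via_cumprod(&[2.0, 0.0, 4.0], 1.0));
    }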
Pull Request Template
Checklist
The run-checks all script has been executed.
Related Issues/PRs
n/a
Changes
Adds prod & prod_dim in preparation for adding the ReduceProd ONNX op. Really a mostly useless check-in; doing it to get familiar with the codebase + for fun.
Testing
run-checks all