HAL mul8x8to16 added #25506
modules/core/src/arithm.cpp
Outdated
int res;
res = cv_hal_mul8u16u(/* src1_data */ &ua, /* src1_step */ 1, /* src2_data */ &ub, /* src2_step */ 1,
                      /* dst_data */ &uc, /* dst_step */ 1, /* width */ 1, /* height */ 1, /* scale */ 1);
if (res == CV_HAL_ERROR_NOT_IMPLEMENTED)
The HAL function may return a not-implemented status if the size is too small or the data is not aligned, so a static pre-check does not work.
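To illustrate the point, here is a minimal, self-contained sketch of a per-call fallback. It does not use OpenCV itself: `hal_mul8u16u_sketch`, `ref_mul8u16u`, and `mul8u16u` are hypothetical names, the 16-pixel threshold is invented for the example, and the status constants are local stand-ins assumed to mirror `CV_HAL_ERROR_OK` and `CV_HAL_ERROR_NOT_IMPLEMENTED`.

```cpp
#include <cstdint>

// Local stand-ins for the HAL status codes (assumed values, not taken from this PR).
constexpr int HAL_OK = 0;
constexpr int HAL_NOT_IMPLEMENTED = 1;

// Hypothetical accelerated kernel: it declines rows shorter than 16 pixels,
// so whether it "is implemented" is only known once it is called with real data.
int hal_mul8u16u_sketch(const uint8_t* a, const uint8_t* b, uint16_t* dst, int width)
{
    if (width < 16)
        return HAL_NOT_IMPLEMENTED;
    for (int x = 0; x < width; ++x)
        dst[x] = static_cast<uint16_t>(a[x] * b[x]);
    return HAL_OK;
}

// Portable reference path used whenever the kernel does not report success.
void ref_mul8u16u(const uint8_t* a, const uint8_t* b, uint16_t* dst, int width)
{
    for (int x = 0; x < width; ++x)
        dst[x] = static_cast<uint16_t>(a[x] * b[x]);
}

// Returns true if the accelerated path handled the call, false if the fallback ran.
// Any status other than HAL_OK (not implemented or an error) triggers the fallback.
bool mul8u16u(const uint8_t* a, const uint8_t* b, uint16_t* dst, int width)
{
    if (hal_mul8u16u_sketch(a, b, dst, width) == HAL_OK)
        return true;
    ref_mul8u16u(a, b, dst, width);
    return false;
}
```

With this shape, a one-time probe on a 1x1 buffer would tell nothing about a later call with a different size or alignment; the decision is made on every call from the returned status.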
modules/core/src/arithm.cpp
Outdated
                 void* usrdata)
{
    double scale = *((double*)usrdata);
    cv_hal_mul8s16s((schar*)src1, step1, (schar*)src2, step2, (short*)dst, step, width, height, scale);
The return values of cv_hal_mul8s16s and cv_hal_mul8u16u are not checked. They may return an error or a not-implemented status.
I have an ugly solution for this; please take a look.
modules/core/src/arithm.cpp
Outdated
{
    CV_INSTRUMENT_REGION();

    arithm_op(src1, src2, dst, noArray(), dtype, getMulTab(),
              true, &scale, std::abs(scale - 1.0) < DBL_EPSILON ? OCL_OP_MUL : OCL_OP_MUL_SCALE);
    static bool halMul8to16available = (cv_hal_mul8u16u != hal_ni_mul8u16u) && (cv_hal_mul8s16s != hal_ni_mul8s16s);
What if HAL implements only one of them?
Fixed
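For readers following the thread, the concern about a HAL that implements only one of the two functions can be sketched as follows. This is not the code from the PR: the stubs, the `vendor_mul8u16u` kernel, and the `active_*` pointers are hypothetical stand-ins for what the `cv_hal_*` and `hal_ni_*` names resolve to in a given build, and the status values are assumed.

```cpp
#include <cstdint>

// Local stand-ins for the HAL status codes (assumed values).
constexpr int HAL_OK = 0;
constexpr int HAL_NOT_IMPLEMENTED = 1;

// Hypothetical "not implemented" stubs, playing the role of hal_ni_* functions.
int stub_mul8u16u(const uint8_t*, const uint8_t*, uint16_t*, int) { return HAL_NOT_IMPLEMENTED; }
int stub_mul8s16s(const int8_t*, const int8_t*, int16_t*, int) { return HAL_NOT_IMPLEMENTED; }

// Hypothetical vendor HAL that accelerates only the unsigned variant.
int vendor_mul8u16u(const uint8_t* a, const uint8_t* b, uint16_t* dst, int width)
{
    for (int x = 0; x < width; ++x)
        dst[x] = static_cast<uint16_t>(a[x] * b[x]);
    return HAL_OK;
}

using MulU = int (*)(const uint8_t*, const uint8_t*, uint16_t*, int);
using MulS = int (*)(const int8_t*, const int8_t*, int16_t*, int);

// What each entry point resolves to in this imagined build.
static const MulU active_mul8u16u = vendor_mul8u16u;
static const MulS active_mul8s16s = stub_mul8s16s;

// Combined check (the shape questioned in the review): one missing function
// disables the accelerated path for both types.
bool combinedCheck()
{
    return active_mul8u16u != stub_mul8u16u && active_mul8s16s != stub_mul8s16s;
}

// Per-function checks keep the unsigned path usable on its own.
bool halMulU8Available() { return active_mul8u16u != stub_mul8u16u; }
bool halMulS8Available() { return active_mul8s16s != stub_mul8s16s; }
```

Even a per-function check like this is only a hint, since an implemented function may still decline a particular call; the returned status remains the authoritative signal.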
Can we replace |
modules/core/src/arithm.cpp
Outdated
@@ -617,7 +622,11 @@ static void arithm_op(InputArray _src1, InputArray _src2, OutputArray _dst,

Mat src1 = psrc1->getMat(), src2 = psrc2->getMat(), dst = _dst.getMat();
Size sz = getContinuousSize2D(src1, src2, dst, src1.channels());
tab[depth1](src1.ptr(), src1.step, src2.ptr(), src2.step, dst.ptr(), dst.step, sz.width, sz.height, usrdata);
if (extendedFunc(src1.ptr(), src1.step, src2.ptr(), src2.step,
if (!extendedFunc || extendedFunc() != 0)
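The suggested condition can be read as "run the generic table function when there is no extended function, or when the extended function did not succeed". A minimal sketch of that dispatch, with simplified signatures that are assumptions for illustration only (the real functions also take steps, height, and user data), and with hypothetical helper names:

```cpp
#include <cstdint>

// Simplified, hypothetical signatures for the two kinds of callbacks.
using ExtendedFunc = int (*)(const uint8_t* src1, const uint8_t* src2, uint16_t* dst, int width);
using BinaryFunc = void (*)(const uint8_t* src1, const uint8_t* src2, uint16_t* dst, int width);

// Returns true when the generic table function had to run.
// The null check comes first, so a missing extendedFunc is never called.
bool runWithFallback(ExtendedFunc extendedFunc, BinaryFunc tabFunc,
                     const uint8_t* src1, const uint8_t* src2, uint16_t* dst, int width)
{
    if (!extendedFunc || extendedFunc(src1, src2, dst, width) != 0)
    {
        tabFunc(src1, src2, dst, width);
        return true;
    }
    return false;
}

// Generic path standing in for an entry of the function table.
void genericMul(const uint8_t* a, const uint8_t* b, uint16_t* dst, int width)
{
    for (int x = 0; x < width; ++x)
        dst[x] = static_cast<uint16_t>(a[x] * b[x]);
}

// Extended function that always declines (non-zero status).
int decliningKernel(const uint8_t*, const uint8_t*, uint16_t*, int) { return 1; }

// Extended function that does the work and reports success (zero status).
int workingKernel(const uint8_t* a, const uint8_t* b, uint16_t* dst, int width)
{
    genericMul(a, b, dst, width);
    return 0;
}
```

Calling `runWithFallback(nullptr, genericMul, ...)` or passing `decliningKernel` both end up in `genericMul`, while `workingKernel` short-circuits the fallback.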
modules/core/src/arithm.cpp
Outdated
{
    double scale = *((double*)usrdata);
    int res = cv_hal_mul8u16u(src1, step1, src2, step2, (ushort *)dst, step, width, height, scale);
    if (res == 0 || res == CV_HAL_ERROR_NOT_IMPLEMENTED)
Use a named constant instead of the literal 0.
👍
fixes #25034
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.