Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM IR generation bug when adding metadata for excinfo-related store instruction #9550

Open
dlee992 opened this issue Apr 29, 2024 · 6 comments · May be fixed by #9551
Open

LLVM IR generation bug when adding metadata for excinfo-related store instruction #9550

dlee992 opened this issue Apr 29, 2024 · 6 comments · May be fixed by #9551

Comments

@dlee992
Copy link
Contributor

dlee992 commented Apr 29, 2024

I noticed in a certain case, numba will emit LLVM IR like this:

B10.if:                                           ; preds = %B10.if.loopexit, %B10.if.loopexit10
  %excinfo.2.0 = phi { i8*, i32, i8*, i8*, i32 }* [ @.const.picklebuf.23377073394240, %B10.if.loopexit10 ], [ @.const.picklebuf.23377075691904, %B10.if.loopexit ]
  %2 = bitcast i32* %.9.i to i8*
  %3 = bitcast i32* %.7.i to i8*
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %3)
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %2)
  store { i8*, i32, i8*, i8*, i32 }* %excinfo.2.0, { i8*, i32, i8*, i8*, i32 }** %excinfo, align 8
  br label %common.ret

I think this branch is used for storing excinfo and return. But the store instr doesn't have the metadata of numba_exception_output, which makes the ref_prune pass in llvmlite can't remove some ref pairs.
The source code in llvmlite:

    bool isRaising(const BasicBlock *bb) {
        for (auto &instruction : *bb) {
            if (instruction.getOpcode() == Instruction::Store) {
                auto name = dyn_cast<StoreInst>(&instruction)
                                ->getPointerOperand()
                                ->getName();
                if (name.equals("excinfo") &&
                    instruction.hasMetadata("numba_exception_output")) {
                    return true;
                }
            }
        }
        return false;
    }

Willing to provide a python test case later. Trying to find a fix meanwhile.

@dlee992
Copy link
Contributor Author

dlee992 commented Apr 29, 2024

Yeah, in numba/core/callconv.py, more than one location can generate excinfo-related store instr, but only one location adds the metadata for it.

Can push a PR soon.

@dlee992
Copy link
Contributor Author

dlee992 commented Apr 29, 2024

could be related to #9186, where I observed this excinfo change hurts the performance.

@guilhermeleobas
Copy link
Collaborator

@dlee992, I wasn't aware of the numba_exception_output metadata. On callconv.py::set_dynamic_user_exc, one should set the metadata for the last store instruction as well.

zero = int32_t(0)
exc_fields = (builder.extract_value(struct_gv, PICKLE_BUF_IDX),
builder.extract_value(struct_gv, PICKLE_BUFSZ_IDX),
builder.bitcast(st_ptr, GENERIC_POINTER),
builder.bitcast(unwrap_fn, GENERIC_POINTER),
int32_t(len(struct_type)))
for idx, arg in enumerate(exc_fields):
builder.store(arg, builder.gep(excinfo_p, [zero, int32_t(idx)]))
builder.store(excinfo_p, excinfo_pp)

Can you try that?

@sklam
Copy link
Member

sklam commented Apr 30, 2024

The LLVM IR is from post-optimization and LLVM is allowed to move/remove metadata as it wish. Perhaps, we should stop using metadata but instead detect the store going to %exeinfo which must be one of the argument on the LLVM function.

So, in isRaising, it should detect that the store target is the LLVM argument for the exception pointer.

@guilhermeleobas
Copy link
Collaborator

@sklam, should we go with adding the metadata for the missing case? Or just try to fix the isRaising in llvmlite?

@dlee992
Copy link
Contributor Author

dlee992 commented Apr 30, 2024

@guilhermeleobas, yes, I tried to add more metadata in #9551 . And my internal test case's performance is better with that. ex: w/o this PR, calls of NRT_decref are 3000 times; w/ this PR, the calls go down to 1800 times.

@sklam, yes, changing the detection way in llvmlite side is also a solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants