IllegalStateException: Expected to find a flow element with id 'migrated-variable-2251799826929599' in process with key '2251799825939186' but not found. #8626
Comments
Summary of affected entities so far:
Affected variables:
Affected processes:
Affected process Instances:
Stack traces:
B)
Grafana looks normal. It's a cluster with a single node.
Affected instances are blacklisted. Other than that, Zeebe keeps running normally.
One thing that is peculiar about the affected process instances:
What I don't understand is why processes with an active incident are showing any activity at all.
Perhaps the process instance is being cancelled?
Summary of findings:
Impact for currently known cases: minimal. It is a dev cluster, and the processes were started in December of last year.
Impact in general: unknown at this point.
Timeline of events:
How can I see that in the logs? Or anywhere else?
Not something we log, but it should be available on the log stream. 🤔 I'm not sure if Operate would also already show that the instance is terminating entirely, but perhaps you could see it there as well. EDIT: You can also see it in using
Confirmed that there were cancel requests (we were able to see the requests in Grafana).
Timeline of events:
After discussing with @Zelldon we decided to lower the severity.
Curious what the incident type is. We have an existing issue with cancelling instances that have incidents of type UNHANDLED_ERROR: #8588
The problem was variables that couldn't be resolved. It looked a bit like a typo in the FEEL expression. But tbh, I didn't pay too much attention to this detail.
Leaving in the backlog for now - from my understanding, the impact is mostly on visibility as the instance is blacklisted, which is not really visible for users. However, the net effect is what they wanted - users cannot interact with this instance anymore. It's still confusing, but I think this stems mostly from the blacklisting concept and its lack of visibility. Two things which might cause us to reprioritize:
As someone who uses Zeebe on its own (without Camunda Cloud), this was confusing to me. I searched on docs.camunda.io but couldn't find any references to this. Could you explain more about blacklisting an instance? Is it a feature of Zeebe, or of Camunda Cloud / Operate?
There is an open issue to document this: camunda/camunda-docs#145. Some details can be found in #1988.
Thanks @Zelldon -- the details in #1988 are sparse but I think I get the idea. Hopefully the more formal documentation will be written soon. In the meantime, is there any risk that an instance running on one of our self-hosted Zeebe clusters might get blacklisted? It seems like we would have no way to know about that if it happened.
Normally you should see an error in the log. Furthermore, there are metrics regarding blacklisted instances. I agree it would be good to have some documentation soon, but also be aware that there are plans to replace the blacklisting with a different concept.
Thanks!
It doesn't seem to occur again, so I will close it for now.
Describe the bug
Observed in production log.
A) https://console.cloud.google.com/errors/detail/COfpgtnP4c-E0QE;service=zeebe;time=P7D?project=camunda-cloud-240911
Logs: https://drive.google.com/file/d/1DF-Kl0TXV5LKT6e2Nnk7Gk8YsdE926Cn/view?usp=sharing
and
B) https://console.cloud.google.com/errors/detail/CJmzuouvuYOk5wE;service=zeebe;time=P7D?project=camunda-cloud-240911
Logs: https://drive.google.com/file/d/1bgGiULD6bNzvuY0LapW5BQmbuDZYW2Lr/view?usp=sharing
Log/Stacktrace
Full Stacktrace
Environment: