Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #4408: allowing for users to set the exception handling behavior #4488

Merged
merged 3 commits into from Oct 19, 2022

Conversation

shawkins
Copy link
Contributor

@shawkins shawkins commented Oct 10, 2022

Description

Fix #4408

cc @csviri

I couldn't come up with a good way to deal with this all under the covers, so it seemed best to introduce an explicit handler with a signature that's sufficient to cover our default behavior. This makes the final solution much closer to some of the earlier proposals. For the method:

boolean retryAfterException(boolean isStarted, Throwable t);

our current behavior is:

(b, t) -> b && !(t instanceof WatcherException)

That is we'll retry once started and only if it's not a WatcherException.

The expectation here is that if you use the runnableInformer method, you can change this behavior by calling something like exceptionHandler((b, t) -> true)

In reviewing the go implementation, I see the following differences:

  • Their WatchErrorHandler is doc'd as only informational - however the reflector is passed in, so you can easily kill it if you wanted.
  • They will always retry, so the WatchErrorHandler method does not return anything.
  • The equivalent of a watch onClose exception is not passed to their WatchErrorHandler. I think their assumption is that you'll immediately be trying a fresh list / watch and then you'll experience a problem with one of those calls if something is still wrong.

I've added a helper method to ExceptionHandler to check for deserialization issues - but that is not currently used in any built-in way. If you use something like exceptionHandler((b, t) -> true), but there is a jsonprocessing exception with the list call or with a watch event the result would be that you'll just keep retrying - a fresh list operation, then watch. If you don't want to make an assumption that things will self heal (a missing conversion webhook is installed later), then ExceptionHandler.isDeserializationException could be used - exceptionHandler((b, t) -> !ExceptionHandler.isDeserializationException(t))

With a major release we can then determine if the default behavior should become more generally to just keep retrying.

Along with this is a small breaking change to make the return type for start and stopped to be a CompletionStage rather than a CompletableFuture - I realized that since we don't want the user completing and don't support canceling, we should be using the narrower type instead. Since the stopped method was just introduced we may be able to change that one without too much concern (that would also need backported). If anyone has any qualms with start changing as well, I'll revert that.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Checklist

  • Code contributed by me aligns with current project license: Apache 2.0
  • I Added CHANGELOG entry regarding this change
  • I have implemented unit tests to cover my changes
  • I have added/updated the javadocs and other documentation accordingly
  • No new bugs, code smells, etc. in SonarCloud report
  • I tested my code in Kubernetes
  • I tested my code in OpenShift

@@ -106,11 +107,11 @@ public synchronized <T> SharedIndexInformer<T> getExistingSharedIndexInformer(Cl

@Override
public synchronized Future<Void> startAllRegisteredInformers() {
List<CompletableFuture<Void>> startInformerTasks = new ArrayList<>();
List<CompletionStage<Void>> startInformerTasks = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be compliant with line 128.
I'm going to add a subsequent commit to ensure this won't cause any issue (there shouldn't be any issue because the underlying impl will be CompletableFuture, but the code is wrong nonetheless).

Signed-off-by: Marc Nuri <marc@marcnuri.com>
@sonarcloud
Copy link

sonarcloud bot commented Oct 19, 2022

SonarCloud Quality Gate failed.    Quality Gate failed

Bug B 1 Bug
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

64.4% 64.4% Coverage
0.0% 0.0% Duplication

Copy link
Member

@manusa manusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx!

@manusa manusa merged commit f8dd11b into fabric8io:master Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Informer initial start exception handling
2 participants