Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HIP for obtaining output from hook jobs and pods #301

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

z4ce
Copy link

@z4ce z4ce commented May 3, 2023

This proposes a new annotation to indicate that the output from the job should be displayed to the user. This is mostly already implemented in:

helm/helm-www#1242
helm/helm#10309

@z4ce
Copy link
Author

z4ce commented May 3, 2023

cc @Bez625 @xavpaice

@Bez625
Copy link

Bez625 commented May 4, 2023

Thanks @z4ce for raising this - yes it would be great if we can finally get this merged in! Happy to address any comments, but haven't had any feedback in well over a year

@z4ce z4ce changed the title Add HIP for obtaining output from failed hook jobs and pods Add HIP for obtaining output from hook jobs and pods May 31, 2023
Copy link

@joejulian joejulian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have had many occasions where I could have used this.

@Bez625
Copy link

Bez625 commented Jul 25, 2023

Thanks! I have had many occasions where I could have used this.

Yeah its a shame the PR with the actual changes has been overlooked for the last 21 months. I'm sure a lot of people could have benefited from this.

Are we any closer to getting that merged?

Copy link
Contributor

@mattfarina mattfarina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea. I have a number of comments on the HIP to try and make it more clear and get some detail we're looking for.

hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated

Often it is important to verify that the kubernetes cluster you are deploying a helm chart into has certain properties. You might need to know that the cluster is of a certain version to use various APIs. You might need to know that it has ingress available, a certain amount of ephemeral storage, memory, or CPUs available. You might want to validate the the service key they provided was correct or that that database they entered is reachable. Letting a chart deploy and then finding debugging to see why it failed is a poor user experience. These things can all be done with preflight checks enabled by the hook proposed in this HIP.

In general, allowing chart developers to run jobs and present that feedback directly to the users could also open up additional use cases beyond just the preflight use case that motivated this HIP. I could imagine scenarios where maybe CVE warnings are presented or specific upgrade feedback is presented instead of just helm install failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presenting information is interesting and useful. But, the Helm Client isn't the only case where charts will be installed. We cannot assume that the thing installing a chart is a person using the Helm CLI. It could be happening in CI or happening via a system like Flux. For those cases, we need to think about how those tools can get to the relevant information to present.

I think it's also important to consider the role various users are in this process. The one who creates a chart is different from the one who installs it. This includes people at different organizations with different rules. For example, the person creating a chart may be at an organization with different policies around CVEs from the one installing the chart. Making a CVE report part of a hook doesn't really make sense. An image scanner on the part of the consumer does because they can apply their own policies there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bez625 @xavpaice any thoughts on this one? My first thought is it should be pretty similar to how notes are processed, but I don't think I've ever seen chart notes in flux or argo, but now I should go look.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My motivation for implementing the feature was to get hook output back to the client (NOT specifically the user). This was more to get feedback in the form of logs to any CD controllers executing helm, but it is also useful to see this if a user is calling helm manually.

My take is that "preflight" checks are a usecase for hooks. Adding output of hooks to the client is a nice feature to add to hooks. If the issue here is a disagreement with the examples can we not just remove them and stick with something more general.

E.g. "There are many usecases for hooks and it would improve the user experience if it were possible to output the logs of those hooks when the client executes them. The feedback could be direct to a user, part of a CI/CD pipeline or maybe even consumed from the helm library."

Focusing this on preflight checks could be a bit misleading as I'm pretty sure the implementation works for pre and post hooks, so it really is a new feature for hooks rather than specific to pre-install hooks or the preflight usecase.

@mattfarina @z4ce does this help or have I missed the point?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattfarina can chime in here some more, but I think the real point of this comment is about how would Flux/Argo be able to make use of the output that helm gathered?

When an API user gets a Release ( https://pkg.go.dev/helm.sh/helm/v3@v3.12.3/pkg/release#Release ) back, how should they then get the relevant logs?

My first thought was that we should put it in something like Release.Info.HookOutput[], but that object gets stored in the secret that helm manages and putting more data into there doesn't seem desirable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattfarina can chime in here some more, but I think the real point of this comment is about how would Flux/Argo be able to make use of the output that helm gathered?

My reading of the comment was that IF you accept the usecase presented (i.e. specifically for preflight checks) then you need to accommodate for that in any way that those preflight checks could be called. My view is that there was never an intention on my part to implement this specifically for preflight checks - the purpose was to give feedback as a convenience for users of the client (which could be called via a CD process that has its own log handling and was the case we wanted it for).

I think it best @mattfarina add clarity to his comment at this stage as it seems we aren't aligned on the intent of his feedback.

When an API user gets a Release ( https://pkg.go.dev/helm.sh/helm/v3@v3.12.3/pkg/release#Release ) back, how should they then get the relevant logs?

I strongly feel they shouldnt. The Release describes a helm deployment and the log output from a job or pod is way too fine a level of detail for that struct. E.g. if a kubernetes Deployment failed with CrashLoopBackoff you wouldnt expect the logs of multiple pods to appear on the Release struct.

I think if the desire was to implement a preflight checks feature you would opt for something much more opinionated - arguably a first class citizen of the codebase. It would likely not be an extension of hooks, but it's own concept and have well defined constructs for declaring each check and the results format it should conform to. An array of []PreFlightChecks would make more sense on the Release struct.

Personally, I feel outputting of hook logs is a strong enough usecase on it's own and there is clearly a lot of support for it in the community. Since this PR has been sat around for nearly 2 years I'd like to try and get it merged in. If the delay is simply a disagreement in the Motivation then let's change the motivation to what it was intended for, i.e. a feature addition to hooks that adds a convenience for users of the CLI.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hiddeco @mattfarina @Bez625 I went ahead and took the liberty of making a PR to support the method of integration I spoke about above. Let me know if this looks right and I'll write it up for the HIP

Bez625/helm@1d7566a

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @z4ce comment in the call that we should follow the YAGNI principle and avoid an SDK specific integration until there is a usecase.

There is a use case, as there are existing SDK dependents. As I mentioned, having access to the job name instead of a stream of log lines is much more valuable to any SDK user.

From the previous comments it wasn't clear to me that there was a usecase for the hook output for SDK users - only that they could potentially access them via the reference implementation, but it might not be the easiest to work with. If you can let me know which interfaces would be best to adapt this for the SDK users (or if you agree with @z4ce suggestion) I'm happy to include that in the reference implementation.

Cutting corners by applying a YAGNI principle most of the time results in code which isn't flexible enough to be changed at a later stage without introducing breaking changes.

You are correct - as a general rule if software development practices are applied incorrectly the result will be poor software. In this specific case no one is trying to cut corners. The intention was - and always has been - to facilitate hook output to CLI users. If there is a way we can incorporate this change to work with an SDK specific usecase I'm more than happy to include it.

What I'm trying to avoid is extending the scope of this by including a lot of changes to support possible SDK integrations that may not ever be used. If there are slight tweaks we can make to the implementation that will allow future integrations much easier to do (e.g. include job names etc) i'm happy to do that.

Which is also why it works best in combination with "continuous refactoring", a practice which can not be applied easily to software tied to semantic versioning, and has historically not been applied to Helm.

Based on this experience, I agree that the Helm code lifecycle is not a good fit for "continuous refactoring".

@z4ce I had a quick look at the change and it looks good to me. I'd like to review the implementation in more detail, but let's confirm @hiddeco is happy with the proposed changes on this thread first.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this section to the HIP https://github.com/helm/community/pull/301/files#diff-b5f8adc4453184fee3729ff0b5387b3e6c11b174019d59f76e50b87bf9ca73e2R44 I think that this resolves this but before I close this thread I'll give it a couple of days. I'd love to hear from @hiddeco that this interface would work well for him.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need a bit of time to do a more extensive review on how this is woven into #10309, but it shouldn't take too long.

Copy link

@hiddeco hiddeco Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reviewing the code of the PR while taking into account the note around c.Log, I think this approach is fine and gives SDK users sufficient things to work with. It's also better than the previous reliance on log.Writer(), which would cause the logging to not be configurable at all due to Go preventing any changes to be made to the std instance. As noted in Bez625/helm@1d7566a#r127777881 however, it does change the behavior of the PR as instead of os.Stderr things will be written to os.Stdout.

As I has missed this subtle difference while writing #301 (comment), it would also have meant that without this change we would be doomed due to it not making use of the logger callback.

In general, my advise would still be to organize better means than relying on a hard-coded (albeit configurable) os.Stdout in a code base where appropriate loggers appear to be wired through. But in your defense, there is now at least a way for this to be avoided.

hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
@z4ce z4ce force-pushed the joboutput_hip branch 3 times, most recently from 1d64e1f to aa526fc Compare September 19, 2023 15:40
Copy link

@hiddeco hiddeco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variety of nits to address styling issues, spelling errors and other improvements to readability.

hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
hips/hip-9999.md Outdated Show resolved Hide resolved
@hiddeco
Copy link

hiddeco commented Sep 19, 2023

Note that #301 (comment) included a question.

Additionally, a new user flag should be created `--no-log-output` that would skip the output of logs.

Additionally, there will be a new item added to the action SDK configuration to allow SDK consumers to get the output.
By default this output will be written to stdout, but an SDK consumer can overwrite the HookOutputFunc to provide a custom writer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the SDK should write to StdOut. StdErr is where logs usually go. Consider a service, like Flux, running this update. How can things be low impact on them? What about writing to io.Discard by default. They set a writer if they want to capture the output. This is how they can opt-in to getting that output.


## Backwards compatibility

The only backwards compatibility concern would be that scripts parsing `helm install` output would see some additional text in the case of logs being output. The fact that notes already make the output unstructured should mitigate any concern here. Since we already are trusting chart developers to provide output in the form of notes, this is a logical extension of that that allows the developer to provide more dynamic output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the Helm client wrote them to StdErr instead? This is where logs like this are usually written. You can even see it in default Go logger.

Signed-off-by: Ian Zink <zforce@gmail.com>
@z4ce
Copy link
Author

z4ce commented Nov 15, 2023

I updated the HIP with feedback. I switched the CLI output to stderr and the default SDK behavior to "discard." Does this look good to everyone now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants