Add HIP for obtaining output from hook jobs and pods #301

z4ce · 2023-05-03T22:08:41Z

This proposes a new annotation to indicate that the output from the job should be displayed to the user. This is mostly already implemented in:

helm/helm-www#1242
helm/helm#10309

z4ce · 2023-05-03T22:12:28Z

cc @Bez625 @xavpaice

Bez625 · 2023-05-04T07:17:12Z

Thanks @z4ce for raising this - yes it would be great if we can finally get this merged in! Happy to address any comments, but haven't had any feedback in well over a year

joejulian

Thanks! I have had many occasions where I could have used this.

Bez625 · 2023-07-25T04:15:42Z

Thanks! I have had many occasions where I could have used this.

Yeah its a shame the PR with the actual changes has been overlooked for the last 21 months. I'm sure a lot of people could have benefited from this.

Are we any closer to getting that merged?

mattfarina

I like this idea. I have a number of comments on the HIP to try and make it more clear and get some detail we're looking for.

hips/hip-9999.md

mattfarina · 2023-08-31T21:39:44Z

hips/hip-9999.md

+
+Often it is important to verify that the kubernetes cluster you are deploying a helm chart into has certain properties. You might need to know that the cluster is of a certain version to use various APIs. You might need to know that it has ingress available, a certain amount of ephemeral storage, memory, or CPUs available. You might want to validate the the service key they provided was correct or that that database they entered is reachable. Letting a chart deploy and then finding debugging to see why it failed is a poor user experience. These things can all be done with preflight checks enabled by the hook proposed in this HIP.
+
+In general, allowing chart developers to run jobs and present that feedback directly to the users could also open up additional use cases beyond just the preflight use case that motivated this HIP. I could imagine scenarios where maybe CVE warnings are presented or specific upgrade feedback is presented instead of just helm install failure.


Presenting information is interesting and useful. But, the Helm Client isn't the only case where charts will be installed. We cannot assume that the thing installing a chart is a person using the Helm CLI. It could be happening in CI or happening via a system like Flux. For those cases, we need to think about how those tools can get to the relevant information to present.

I think it's also important to consider the role various users are in this process. The one who creates a chart is different from the one who installs it. This includes people at different organizations with different rules. For example, the person creating a chart may be at an organization with different policies around CVEs from the one installing the chart. Making a CVE report part of a hook doesn't really make sense. An image scanner on the part of the consumer does because they can apply their own policies there.

@Bez625 @xavpaice any thoughts on this one? My first thought is it should be pretty similar to how notes are processed, but I don't think I've ever seen chart notes in flux or argo, but now I should go look.

My motivation for implementing the feature was to get hook output back to the client (NOT specifically the user). This was more to get feedback in the form of logs to any CD controllers executing helm, but it is also useful to see this if a user is calling helm manually.

My take is that "preflight" checks are a usecase for hooks. Adding output of hooks to the client is a nice feature to add to hooks. If the issue here is a disagreement with the examples can we not just remove them and stick with something more general.

E.g. "There are many usecases for hooks and it would improve the user experience if it were possible to output the logs of those hooks when the client executes them. The feedback could be direct to a user, part of a CI/CD pipeline or maybe even consumed from the helm library."

Focusing this on preflight checks could be a bit misleading as I'm pretty sure the implementation works for pre and post hooks, so it really is a new feature for hooks rather than specific to pre-install hooks or the preflight usecase.

@mattfarina @z4ce does this help or have I missed the point?

@mattfarina can chime in here some more, but I think the real point of this comment is about how would Flux/Argo be able to make use of the output that helm gathered?

When an API user gets a Release ( https://pkg.go.dev/helm.sh/helm/v3@v3.12.3/pkg/release#Release ) back, how should they then get the relevant logs?

My first thought was that we should put it in something like Release.Info.HookOutput[], but that object gets stored in the secret that helm manages and putting more data into there doesn't seem desirable.

@mattfarina can chime in here some more, but I think the real point of this comment is about how would Flux/Argo be able to make use of the output that helm gathered?

My reading of the comment was that IF you accept the usecase presented (i.e. specifically for preflight checks) then you need to accommodate for that in any way that those preflight checks could be called. My view is that there was never an intention on my part to implement this specifically for preflight checks - the purpose was to give feedback as a convenience for users of the client (which could be called via a CD process that has its own log handling and was the case we wanted it for).

I think it best @mattfarina add clarity to his comment at this stage as it seems we aren't aligned on the intent of his feedback.

When an API user gets a Release ( https://pkg.go.dev/helm.sh/helm/v3@v3.12.3/pkg/release#Release ) back, how should they then get the relevant logs?

I strongly feel they shouldnt. The Release describes a helm deployment and the log output from a job or pod is way too fine a level of detail for that struct. E.g. if a kubernetes Deployment failed with CrashLoopBackoff you wouldnt expect the logs of multiple pods to appear on the Release struct.

I think if the desire was to implement a preflight checks feature you would opt for something much more opinionated - arguably a first class citizen of the codebase. It would likely not be an extension of hooks, but it's own concept and have well defined constructs for declaring each check and the results format it should conform to. An array of []PreFlightChecks would make more sense on the Release struct.

Personally, I feel outputting of hook logs is a strong enough usecase on it's own and there is clearly a lot of support for it in the community. Since this PR has been sat around for nearly 2 years I'd like to try and get it merged in. If the delay is simply a disagreement in the Motivation then let's change the motivation to what it was intended for, i.e. a feature addition to hooks that adds a convenience for users of the CLI.

@hiddeco @mattfarina @Bez625 I went ahead and took the liberty of making a PR to support the method of integration I spoke about above. Let me know if this looks right and I'll write it up for the HIP

Bez625/helm@1d7566a

I agree with @z4ce comment in the call that we should follow the YAGNI principle and avoid an SDK specific integration until there is a usecase.

There is a use case, as there are existing SDK dependents. As I mentioned, having access to the job name instead of a stream of log lines is much more valuable to any SDK user.

From the previous comments it wasn't clear to me that there was a usecase for the hook output for SDK users - only that they could potentially access them via the reference implementation, but it might not be the easiest to work with. If you can let me know which interfaces would be best to adapt this for the SDK users (or if you agree with @z4ce suggestion) I'm happy to include that in the reference implementation.

Cutting corners by applying a YAGNI principle most of the time results in code which isn't flexible enough to be changed at a later stage without introducing breaking changes.

You are correct - as a general rule if software development practices are applied incorrectly the result will be poor software. In this specific case no one is trying to cut corners. The intention was - and always has been - to facilitate hook output to CLI users. If there is a way we can incorporate this change to work with an SDK specific usecase I'm more than happy to include it.

What I'm trying to avoid is extending the scope of this by including a lot of changes to support possible SDK integrations that may not ever be used. If there are slight tweaks we can make to the implementation that will allow future integrations much easier to do (e.g. include job names etc) i'm happy to do that.

Which is also why it works best in combination with "continuous refactoring", a practice which can not be applied easily to software tied to semantic versioning, and has historically not been applied to Helm.

Based on this experience, I agree that the Helm code lifecycle is not a good fit for "continuous refactoring".

@z4ce I had a quick look at the change and it looks good to me. I'd like to review the implementation in more detail, but let's confirm @hiddeco is happy with the proposed changes on this thread first.

I added this section to the HIP https://github.com/helm/community/pull/301/files#diff-b5f8adc4453184fee3729ff0b5387b3e6c11b174019d59f76e50b87bf9ca73e2R44 I think that this resolves this but before I close this thread I'll give it a couple of days. I'd love to hear from @hiddeco that this interface would work well for him.

I will need a bit of time to do a more extensive review on how this is woven into #10309, but it shouldn't take too long.

After reviewing the code of the PR while taking into account the note around c.Log, I think this approach is fine and gives SDK users sufficient things to work with. It's also better than the previous reliance on log.Writer(), which would cause the logging to not be configurable at all due to Go preventing any changes to be made to the std instance. As noted in Bez625/helm@1d7566a#r127777881 however, it does change the behavior of the PR as instead of os.Stderr things will be written to os.Stdout.

As I has missed this subtle difference while writing #301 (comment), it would also have meant that without this change we would be doomed due to it not making use of the logger callback.

In general, my advise would still be to organize better means than relying on a hard-coded (albeit configurable) os.Stdout in a code base where appropriate loggers appear to be wired through. But in your defense, there is now at least a way for this to be avoided.

hips/hip-9999.md

hiddeco

Variety of nits to address styling issues, spelling errors and other improvements to readability.

hips/hip-9999.md

hiddeco · 2023-09-19T17:16:15Z

Note that #301 (comment) included a question.

mattfarina · 2023-09-21T19:06:27Z

hips/hip-9999.md

+Additionally, a new user flag should be created `--no-log-output`  that would skip the output of logs.
+
+Additionally, there will be a new item added to the action SDK configuration to allow SDK consumers to get the output. 
+By default this output will be written to stdout, but an SDK consumer can overwrite the HookOutputFunc to provide a custom writer.


I don't think the SDK should write to StdOut. StdErr is where logs usually go. Consider a service, like Flux, running this update. How can things be low impact on them? What about writing to io.Discard by default. They set a writer if they want to capture the output. This is how they can opt-in to getting that output.

mattfarina · 2023-09-21T19:08:54Z

hips/hip-9999.md

+
+## Backwards compatibility
+
+The only backwards compatibility concern would be that scripts parsing `helm install` output would see some additional text in the case of logs being output. The fact that notes already make the output unstructured should mitigate any concern here. Since we already are trusting chart developers to provide output in the form of notes, this is a logical extension of that that allows the developer to provide more dynamic output. 


What if the Helm client wrote them to StdErr instead? This is where logs like this are usually written. You can even see it in default Go logger.

Signed-off-by: Ian Zink <zforce@gmail.com>

z4ce · 2023-11-15T18:23:05Z

I updated the HIP with feedback. I switched the CLI output to stderr and the default SDK behavior to "discard." Does this look good to everyone now?

pull-request-size bot added the size/L label May 3, 2023

xavpaice mentioned this pull request May 8, 2023

Preflight to search stdin for secrets containing preflight specs replicatedhq/troubleshoot#1146

Closed

z4ce force-pushed the joboutput_hip branch from f957b32 to 13c5331 Compare May 11, 2023 16:13

z4ce changed the title ~~Add HIP for obtaining output from failed hook jobs and pods~~ Add HIP for obtaining output from hook jobs and pods May 31, 2023

joejulian approved these changes Jul 24, 2023

View reviewed changes

mattfarina reviewed Aug 31, 2023

View reviewed changes

z4ce force-pushed the joboutput_hip branch 3 times, most recently from 1d64e1f to aa526fc Compare September 19, 2023 15:40

hiddeco reviewed Sep 19, 2023

View reviewed changes

z4ce force-pushed the joboutput_hip branch from 1992d07 to 99c5879 Compare September 21, 2023 18:37

mattfarina reviewed Sep 21, 2023

View reviewed changes

gjenkins8 mentioned this pull request Nov 1, 2023

How to make helm upgrade --install my_release and helm test my_release atomic helm/helm#12534

Closed

Add HIP for obtaining output from failed hook jobs and pods

b65ebd2

Signed-off-by: Ian Zink <zforce@gmail.com>

z4ce force-pushed the joboutput_hip branch from 99c5879 to b65ebd2 Compare November 14, 2023 22:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add HIP for obtaining output from hook jobs and pods #301

Add HIP for obtaining output from hook jobs and pods #301

z4ce commented May 3, 2023

z4ce commented May 3, 2023

Bez625 commented May 4, 2023

joejulian left a comment

Bez625 commented Jul 25, 2023

mattfarina left a comment

mattfarina Aug 31, 2023

z4ce Sep 6, 2023

Bez625 Sep 7, 2023

z4ce Sep 7, 2023

Bez625 Sep 8, 2023

z4ce Sep 18, 2023

Bez625 Sep 19, 2023

z4ce Sep 19, 2023

hiddeco Sep 19, 2023

hiddeco Sep 19, 2023 •

edited

hiddeco left a comment

hiddeco commented Sep 19, 2023

mattfarina Sep 21, 2023

mattfarina Sep 21, 2023

z4ce commented Nov 15, 2023


		Often it is important to verify that the kubernetes cluster you are deploying a helm chart into has certain properties. You might need to know that the cluster is of a certain version to use various APIs. You might need to know that it has ingress available, a certain amount of ephemeral storage, memory, or CPUs available. You might want to validate the the service key they provided was correct or that that database they entered is reachable. Letting a chart deploy and then finding debugging to see why it failed is a poor user experience. These things can all be done with preflight checks enabled by the hook proposed in this HIP.

		In general, allowing chart developers to run jobs and present that feedback directly to the users could also open up additional use cases beyond just the preflight use case that motivated this HIP. I could imagine scenarios where maybe CVE warnings are presented or specific upgrade feedback is presented instead of just helm install failure.


		## Backwards compatibility

		The only backwards compatibility concern would be that scripts parsing `helm install` output would see some additional text in the case of logs being output. The fact that notes already make the output unstructured should mitigate any concern here. Since we already are trusting chart developers to provide output in the form of notes, this is a logical extension of that that allows the developer to provide more dynamic output.

Add HIP for obtaining output from hook jobs and pods #301

Are you sure you want to change the base?

Add HIP for obtaining output from hook jobs and pods #301

Conversation

z4ce commented May 3, 2023

z4ce commented May 3, 2023

Bez625 commented May 4, 2023

joejulian left a comment

Choose a reason for hiding this comment

Bez625 commented Jul 25, 2023

mattfarina left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hiddeco Sep 19, 2023 • edited

Choose a reason for hiding this comment

hiddeco left a comment

Choose a reason for hiding this comment

hiddeco commented Sep 19, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

z4ce commented Nov 15, 2023

hiddeco Sep 19, 2023 •

edited