Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple 'runs' for Periscope #196

Merged
merged 12 commits into from Aug 31, 2022

Conversation

peterbom
Copy link
Collaborator

@peterbom peterbom commented Aug 26, 2022

This addresses #157, in particular this clarification. Rather than replacing the DaemonSet with some other resource type (which we've discussed extensively, but not found anything suitable), we make the existing DaemonSet behave more like a DaemonSet. I.e., rather than running once and doing nothing, it continues listening for a configuration change that triggers it to run again.

The specific configuration change it is looking for is the DIAGNOSTIC_RUN_ID setting. This is now what determines the name of the container that logs are exported to, so each time this value is changed, Periscope will run again and export to a new container. This means we can now remove the select {} 'hack' that was introduced to prevent the DaemonSet going into CrashLoopBackoff.

It also fixes #165, which describes a couple of bugs that were side-effects of Periscope trying to determine its 'run ID' by itself rather than having it specified in configuration.

Some implementation notes:

  • Configuration (both ConfigMap and Secret resources) are now mounted as file volumes in the container, rather than environment variables. The reason for this is that environment variables are not updated when the underlying ConfigMap variables change. For consistency (and because other settings such as SAS token will also change between runs), this was done for all configuration from ConfigMaps and Secrets.
  • There is a new FileContentWatcher type which is responsible for detecting changes in these mounted volumes (currently it's only used to monitor the DIAGNOSTIC_RUN_ID value). This has a naive polling implementation because there are no cross-platform mechanisms like inotify, and the most prominent golang library for cross-platform file watching was archived when I last checked (it has since been restored, but we can manage without the dependency). There is not much point in trying to respond immediately to file content changes, because there can be a two-minute gap between updating the ConfigMap and file content changing within the pod.
  • The OS identifier is moved out of the other runtime info settings, because (a) it'll never change for a single pod, and (b) runtime info values now depend on file paths, which depend on OS. It is now based on a OSIdentifier type, rather than a string.
  • The helpers.go file was cleaned up (there were a few unused functions in there).
  • A new GetContent function was added to helpers.go, to avoid duplication of code that read strings from functions returning an io.ReadCloser.
  • The linter was complaining about use of the deprecated io/ioutil package, so all usages have been replaced by the equivalent functions in io/os.

@codecov-commenter
Copy link

codecov-commenter commented Aug 26, 2022

Codecov Report

Merging #196 (4cddaad) into master (8dd3720) will increase coverage by 0.25%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #196      +/-   ##
==========================================
+ Coverage   80.25%   80.50%   +0.25%     
==========================================
  Files          14       14              
  Lines         785      785              
==========================================
+ Hits          630      632       +2     
+ Misses         95       94       -1     
+ Partials       60       59       -1     
Impacted Files Coverage Δ
pkg/collector/dns_collector.go 100.00% <100.00%> (+6.66%) ⬆️
pkg/collector/iptables_collector.go 80.95% <100.00%> (+0.95%) ⬆️
pkg/collector/kubeletcmd_collector.go 77.27% <100.00%> (+1.08%) ⬆️
pkg/collector/pods_containerlogs_collector.go 82.35% <100.00%> (ø)
pkg/collector/systemlogs_collector.go 82.60% <100.00%> (+0.79%) ⬆️
pkg/collector/windowslogs_collector.go 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@Tatsinnit Tatsinnit added the enhancement 🏎 New feature or request label Aug 27, 2022
Copy link
Member

@Tatsinnit Tatsinnit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️🙏☕️ THankyou so much for adding me to the review, looks awesome and especially enabling the polling mechanism rather then empty run is awesome.

as we discussed we will updated one scenario where if user is able to duplicate the Run_ID that could lead to side effect and we can get over it by documenting it.

Thanks heaps,

@peterbom peterbom merged commit 3082e38 into Azure:master Aug 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🏎 New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Periscope will only run correctly in the aks-periscope namespace
3 participants