Allow multiple 'runs' for Periscope #196

peterbom · 2022-08-26T02:01:11Z

This addresses #157, in particular this clarification. Rather than replacing the DaemonSet with some other resource type (which we've discussed extensively, but not found anything suitable), we make the existing DaemonSet behave more like a DaemonSet. I.e., rather than running once and doing nothing, it continues listening for a configuration change that triggers it to run again.

The specific configuration change it is looking for is the DIAGNOSTIC_RUN_ID setting. This is now what determines the name of the container that logs are exported to, so each time this value is changed, Periscope will run again and export to a new container. This means we can now remove the select {} 'hack' that was introduced to prevent the DaemonSet going into CrashLoopBackOff.

It also fixes #165, which describes a couple of bugs that were side-effects of Periscope trying to determine its 'run ID' by itself rather than having it specified in configuration.

Some implementation notes:

Configuration (both ConfigMap and Secret resources) is now mounted as file volumes in the container, rather than exposed as environment variables. The reason for this is that environment variables are not updated when the underlying ConfigMap values change. For consistency (and because other settings such as the SAS token will also change between runs), this was done for all configuration from ConfigMaps and Secrets.
There is a new FileContentWatcher type which is responsible for detecting changes in these mounted volumes (currently it's only used to monitor the DIAGNOSTIC_RUN_ID value). This has a naive polling implementation because there are no cross-platform mechanisms like inotify, and the most prominent golang library for cross-platform file watching was archived when I last checked (it has since been restored, but we can manage without the dependency). There is not much point in trying to respond immediately to file content changes, because there can be a two-minute gap between updating the ConfigMap and file content changing within the pod.
The OS identifier is moved out of the other runtime info settings, because (a) it'll never change for a single pod, and (b) runtime info values now depend on file paths, which depend on OS. It is now based on an OSIdentifier type, rather than a string.
The helpers.go file was cleaned up (there were a few unused functions in there).
A new GetContent function was added to helpers.go, to avoid duplicating code that reads strings from functions returning an io.ReadCloser.
The linter was complaining about use of the deprecated io/ioutil package, so all usages have been replaced by the equivalent functions in the io and os packages.

codecov-commenter · 2022-08-26T02:12:08Z

Codecov Report

Merging #196 (4cddaad) into master (8dd3720) will increase coverage by 0.25%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #196      +/-   ##
==========================================
+ Coverage   80.25%   80.50%   +0.25%     
==========================================
  Files          14       14              
  Lines         785      785              
==========================================
+ Hits          630      632       +2     
+ Misses         95       94       -1     
+ Partials       60       59       -1

Impacted Files	Coverage Δ
pkg/collector/dns_collector.go	`100.00% <100.00%> (+6.66%)`	⬆️
pkg/collector/iptables_collector.go	`80.95% <100.00%> (+0.95%)`	⬆️
pkg/collector/kubeletcmd_collector.go	`77.27% <100.00%> (+1.08%)`	⬆️
pkg/collector/pods_containerlogs_collector.go	`82.35% <100.00%> (ø)`
pkg/collector/systemlogs_collector.go	`82.60% <100.00%> (+0.79%)`	⬆️
pkg/collector/windowslogs_collector.go	`100.00% <100.00%> (ø)`


Tatsinnit

❤️🙏☕️ Thank you so much for adding me to the review. This looks awesome, especially the polling mechanism in place of the empty run.

As we discussed, there is one scenario still to cover: if a user duplicates the Run_ID, that could lead to side effects. We can handle it by documenting it.

Thanks heaps,

peterbom added 11 commits August 22, 2022 16:13

Rely on the configured run ID for running the application

6c169bb

add logging for each run

9ef7af7

Add logging to watcher

5d89ecd

move all configuration to file system

27c7fb7

fix spec error for secret volume

53c6a6b

move node name back to env var as it's not supported as a volume

4f06000

check existence of config files before reading

2108e73

remove unused helper functions and generalize GetContent helper

9c8e441

fix linting errors about deprecated ioutil package use

5d70949

add tests for file system watcher

6bd40ee

fix data races

5a47b31

peterbom requested a review from Tatsinnit August 26, 2022 02:01

Tatsinnit assigned peterbom Aug 27, 2022

Tatsinnit added the enhancement 🏎 New feature or request label Aug 27, 2022

update readme

4cddaad

Tatsinnit approved these changes Aug 30, 2022


peterbom merged commit 3082e38 into Azure:master Aug 31, 2022
