Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding bats_pipe Helper Function (originally suggested adding --output-processor Flag to run Helper). #663

Merged
merged 30 commits into from
Jul 13, 2023

Conversation

qec-pconner
Copy link
Contributor

@qec-pconner qec-pconner commented Oct 8, 2022

This PR allows for easily processing output from the command being run, and properly propagate status.

When looking up how to use pipes with run, a number of "work arounds" are mentioned, although it seems that none fully handle the core intention of processing output.

One mentioned work around (e.g. run grep "item" <(command_to_run)) would seem to suggest that the testing focus is on the functionality of grep and not the command. If that is any such tests purpose, that's great. However, if one wanted to focus on the functionality of command_to_run, with grep as an aside, there is no "easy" way to do this.

A different alternative shows the usage of bash -c to handle piping (run bash -c 'command_to_run | grep "item"'). However, this swallows the exit code of the command of interest, command_to_run. Proper preservation would require pulling it from PIPESTATUS (e.g. run bash -c 'command_to_run | grep "item" ; exit "${PIPESTATUS[0]}"'). This would properly set the global status variable to the exit code from command_to_run. One could wrap this functionality in a function, but its still quite bloated. Arguably, requiring each test to manually handle this is brittle and complicates the test -- I have not actually seen this suggested in any of the threads that I have been searching through (even though those sample tests are still asserting on status, which seems brittle/misleading/lack of coverage).

As per discussion below, this PR adds a new helper bats_pipe, which is able to parse \| to allow piping of multiple commands.

A simple test to illustrate this is as follows:

@test "run bats_pipe" {
  run bats_pipe returns_status_5 \| returns_status_0

  [ "$status" -eq 5 ]
}

This functionality is especially helpful when testing an executable/script/etc which outputs binary data into stdout. This can be shown as follows:

@test "run using output processor on binary data" {
  command_under_test() {
    printf '\x00\xDE\xAD\xF0\x0D'
  }

  run bats_pipe command_under_test \| hexdump -v -e "1/1 \"0x%02X \""

  [ $status -eq 0 ]
  [ "$output" = "0x00 0xDE 0xAD 0xF0 0x0D " ]
}

New tests have been added to cover a wide variety of usage for this new helper function, see tests/bats_pipe.bats.

Original text of this PR (which adding `--output-processor` flag to `run`). No longer applicable, but kept for context. In an effort to encapsulate this into the `run` command directly, this PR adds a new flag, `--output-processor` which allows specifying an "output processing command" to transform output as the author sees fit. It handles the command of interest separately from this extra output processing. I implemented this in such a way to allow the test author to handle/process output however was best for them, and not be too prescriptive/presumptive for how this functionality best be done exactly.

A simple test to illustrate this is as follows:

@test "run using output processor on ascii data" {
  command_under_test() {
    printf 'some string'
  }

  run --output-processor "xargs -0 printf '~%s~'" command_under_test

  [ $status -eq 0 ]
  [ "$output" = "~some string~" ]
}

This functionality is especially helpful when testing an executable/script/etc which outputs binary data into stdout. This can be shown as follows:

@test "run output incomplete with binary data" {
  command_under_test() {
    printf '\x41\x42\x0\x43'
  }

  run command_under_test

  [ $status -eq 0 ]
  # Note that the command actually printed 4 bytes, where 1 gets ignored.
  [ "${#output}" -eq 3 ]
  [ "$output" = "ABC" ]

  # Other bytes would not be so tame and would lead to:
  # warning: command substitution: ignored null byte in input
}

@test "run using output processor on binary data" {
  command_under_test() {
    printf '\x00\xDE\xAD\xF0\x0D'
  }

  run --output-processor "hexdump -v --format '1/1 \"0x%02X \"'" command_under_test

  [ $status -eq 0 ]
  [ "$output" = "0x00 0xDE 0xAD 0xF0 0x0D " ]
}

These tests are included in this PR (added to 'tests/run.bats').

It is important to note that, if --output-processor is not supplied, command execution in run is completely unchanged (aside from the check for output_processor).


This expansion of functionality of run is also very helpful when using bats extension libraries, like bats-assert, which make run very helpful and does a great job of making straight forward, readable, and powerful tests.

Files changed in the PR were based mostly on the changes in f69c26d. If there's something else that needs to be updated, I'm happy to.

If you have any questions, suggestions, or other feedback, please do let me know. I would also like to thank you all for creating/maintaining this repo. I've made great use of it over a number of years, and am very glad to have another great tool to easily write tests. Thanks!

Related resources: sstephenson/bats#10, https://bats-core.readthedocs.io/en/stable/writing-tests.html, https://bats-core.readthedocs.io/en/stable/gotchas.html#my-piped-command-does-not-work-under-run

@qec-pconner qec-pconner requested a review from a team as a code owner October 8, 2022 00:36
@martin-schulze-vireso
Copy link
Member

Thanks for your contribution. I think a better approach would be a helper function that encapsulates the piping. This would be composible with run without complicating its code. You already had to wade through some of the combinatorial complexity that run is starting to accumulate with all those flags.

Here is a short example of how I would envision this:

run bats_pipe echo hello \| grep world

The bats_pipe helper would translate the \| back to the actual pipe and could also allow for flags that steer which part of the pipe should determine the exit code. By default I would give it set -o pipefail.

@qec-pconner
Copy link
Contributor Author

qec-pconner commented Oct 10, 2022

bats_pipe sounds fine. This would allow for more robust piping, calling commands instead of using strings, and would allow for multiple nested pipes. Will likely require a bit of usage of arrays to properly capture arguments given between |'s. I'll try putting this together, updating the PR, and try and address some of the test issues (which mostly seem like some hexdump impls only like -e and not --format).

Thanks for the quick review!

@martin-schulze-vireso
Copy link
Member

I think allowing for arbitrary number of pipes is easiest to solve via recursion. For a first try we can do with a single pipe that can be nested manually:

bats_pipe echo hello \| bats_pipe tr a b \| grep world

@qec-pconner qec-pconner changed the title Adding --output-processor Flag to run Helper. Adding bats_pipe Helper Function (formerly --output-processor Flag to run Helper). Oct 13, 2022
@qec-pconner qec-pconner changed the title Adding bats_pipe Helper Function (formerly --output-processor Flag to run Helper). Adding bats_pipe Helper Function (originally suggested adding --output-processor Flag to run Helper). Oct 13, 2022
@qec-pconner
Copy link
Contributor Author

qec-pconner commented Oct 13, 2022

I've updated this PR (code, documentation, and PR description). Now implemented using separate bats_pipe command, instead of extra flag on run.

This implementation does support any number of piped commands, e.g.: bats_pipe command0 \| command1 \| command2 (etc).

lib/bats-core/test_functions.bash Outdated Show resolved Hide resolved
lib/bats-core/test_functions.bash Outdated Show resolved Hide resolved
lib/bats-core/test_functions.bash Outdated Show resolved Hide resolved
man/bats.7.ronn Outdated Show resolved Hide resolved
@martin-schulze-vireso
Copy link
Member

After thinking a bit more about this, I noticed that there might be an even easier solution: eval.

Passing the parameters including pipes straight into eval should work. The only problems I see are

  1. We have to capture pipestatus in eval
  2. The other (non-pipe-symbol) parameters may need quoting

@qec-pconner
Copy link
Contributor Author

29s

IMO, eval is not a good solution. It's prone issues when it does not parse as expected. I think processing the input and handling it guarantees expected functionality.

@martin-schulze-vireso
Copy link
Member

martin-schulze-vireso commented Jan 17, 2023

29s

Huh?

IMO, eval is not a good solution. It's prone issues when it does not parse as expected. I think processing the input and handling it guarantees expected functionality.

Maybe I haven't been bitten by eval enough but I think the quoting should be quite easy. Take this example:

bats_pipe () {
	args=()
	for arg in "$@"; do
		if [[ $arg != '|' ]]; then
			printf -v arg "%q" "$arg"
		fi
		args+=("$arg")
	done
	eval "${args[@]}"
}


# prints $bla, not its contents, as expected!
bats_pipe echo '$bla'

# bash: line 9: echo $bla: command not found
bats_pipe 'echo $bla'

# bash: eval: line 9: syntax error near unexpected token `|'
# bash: eval: line 9: `echo | | true'
bats_pipe echo \| \| true

I see the following advantages:

  1. very short, readable code
  2. built in error checks by bash
  3. easily accessible pipestatus/pipefail behavior

@qec-pconner
Copy link
Contributor Author

29s

Huh?

This is how GitHub wanted to format a reply. I do not know. 🤷

@qec-pconner
Copy link
Contributor Author

To comment here on eval again, I think it would mostly help with the invocation of the command(s) given -- which is much of the smallest part here.

I think that a testing library should help authors write good tests, prevent them from writing bad tests (and be clear why), and help them diagnose when a test fails. If I were to write a test and I saw bash: eval: line 9: syntax error near unexpected token '|' and bash: eval: line 9: 'echo | | true', I would not find that helpful.

I would have to dig through the test library (bats), not even my tests, to figure out what I did wrong. This sort of thing costs developer time and is a good way to get them to not write tests (or use any given library). I'd much favor keeping the explicit checks that are in place to give human-readable feedback, e.g. Usage error: Cannot have consecutive `\|`. Found at argument position '4'. (maybe I should even add that its a bats_pipe usage error).

If this explicit checking is done, as well as the PIPESTATUS functionality, I don't see much of a win from using eval (aside from the execution itself).

With all that said, I can try and implement this eval version as well -- just to have a fair assessment of all the options. Will get that ready when I have the chance.

Thanks!

@martin-schulze-vireso
Copy link
Member

To comment here on eval again, I think it would mostly help with the invocation of the command(s) given -- which is much of the smallest part here.

I think that a testing library should help authors write good tests, prevent them from writing bad tests (and be clear why), and help them diagnose when a test fails. If I were to write a test and I saw bash: eval: line 9: syntax error near unexpected token '|' and bash: eval: line 9: 'echo | | true', I would not find that helpful.

Fair enough. Maybe we can replace the eval part of this message with bats_pipe and be done. Most of the relevant context should come from run's backtrace anyway.

If this explicit checking is done, as well as the PIPESTATUS functionality, I don't see much of a win from using eval (aside from the execution itself).

The pipestatus should be pretty lightwheight, something like:

eval "${command[@]}" ";" 'BATS_PIPESTATUS=("${PIPESTATUS[@]}")'

With all that said, I can try and implement this eval version as well -- just to have a fair assessment of all the options. Will get that ready when I have the chance.

Thanks!

Thanks for going the extra mile. :)

@martin-schulze-vireso
Copy link
Member

I did not fix those shellcheck messages that stem from unreachable code due to the different implemetations for bats_pipe.

@qec-pconner
Copy link
Contributor Author

Hey, I am very sorry about this loooong delay. 😅 I've been on paternity leave and spending time with my family (first kid).

I went ahead and flushed out the eval implementation (we pretty much were thinking the same thing). I also went ahead and removed the other implementations. Let's go ahead and get this PR finally merged in, I've taken long enough (again, really sorry).

FWIW, while I was gone, I thought more about eval and realized that it makes the best sense to use. There could be a lot of functionality that a test author could put into bats, and giving consistency with the rest of bash is a great priority -- and not reinventing it ourselves (via usage of eval) is the best way to go.

Feel free to give any other feedback on the current version of the PR and I'll be sure to update on Monday (or asap otherwise). Sorry again and thank you for your patience and friendliness! 😄

@martin-schulze-vireso martin-schulze-vireso force-pushed the addOutputProcessor branch 2 times, most recently from d5a7d1f to 2aaee72 Compare May 25, 2023 23:09
Copy link
Member

@martin-schulze-vireso martin-schulze-vireso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I was absent myself.

My suggestions are mainly cosmetic/minor changes. Good job so far, especially on the documentation and extensive tests.

local -a escaped_args=()
local pipe_count=0
local previous_pipe_index=-1
for (( index = 0; index < $#; index++ )); do

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (( index = 0; index < $#; index++ )); do
local -i index=0
for (( index = 0; index < $#; index++ )); do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also updated other integer type declarations.

printf "Usage error: Cannot have leading \`\\|\`.\n" >&2
return 1
fi
if [ "$(( previous_pipe_index + 1 ))" -ge "$index" ]; then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if [ "$(( previous_pipe_index + 1 ))" -ge "$index" ]; then
if (( previous_pipe_index + 1 >= index )); then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

escaped_args+=("$escaped_arg")
done

if [ "$previous_pipe_index" -gt 0 ] && [ "$previous_pipe_index" -eq "$(( $# - 1 ))" ]; then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if [ "$previous_pipe_index" -gt 0 ] && [ "$previous_pipe_index" -eq "$(( $# - 1 ))" ]; then
if (( previous_pipe_index > 0 && previous_pipe_index == $# - 1 )); then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return 1
fi

if [ "$pipe_count" -eq 0 ]; then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if [ "$pipe_count" -eq 0 ]; then
if (( pipe_count == 0 )); then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return 1
fi

if [ "$pipestatus_position" -gt "$pipe_count" ]; then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if [ "$pipestatus_position" -gt "$pipe_count" ]; then
# there will be pipe_count + 1 entries in PIPE_STATUS
# valid indices are [0; pipe_count] and [-pipe_count-1; -1]
if (( pipestatus_position > pipe_count || pipestatus_position < -pipe_count - 1 )); then

Comment on lines 266 to 301
if (( pipestatus_position < 0 )); then
# if we are performing default "last failure" behavior,
# iterate backwards through pipe_status to find the last error.
result_status=0
for index in "${!__bats_pipe_eval_pipe_status[@]}"; do
# OSX bash doesn't support negative indexing.
local backward_iter_index="$((${#__bats_pipe_eval_pipe_status[@]} - index - 1))"
local status_at_backward_iter_index="${__bats_pipe_eval_pipe_status[$backward_iter_index]}"
if (( status_at_backward_iter_index != 0 )); then
result_status="$status_at_backward_iter_index"
break;
fi
done
else
result_status="${__bats_pipe_eval_pipe_status[$pipestatus_position]}"
fi

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (( pipestatus_position < 0 )); then
# if we are performing default "last failure" behavior,
# iterate backwards through pipe_status to find the last error.
result_status=0
for index in "${!__bats_pipe_eval_pipe_status[@]}"; do
# OSX bash doesn't support negative indexing.
local backward_iter_index="$((${#__bats_pipe_eval_pipe_status[@]} - index - 1))"
local status_at_backward_iter_index="${__bats_pipe_eval_pipe_status[$backward_iter_index]}"
if (( status_at_backward_iter_index != 0 )); then
result_status="$status_at_backward_iter_index"
break;
fi
done
else
result_status="${__bats_pipe_eval_pipe_status[$pipestatus_position]}"
fi
if [[ -n $pipestatus_position ]]; then
if (( pipestatus_position < 0 )); then
pipestatus_position=$((pipestatus_position + ${#__bats_pipe_eval_pipe_status[@]} ))
fi
result_status="${__bats_pipe_eval_pipe_status[$pipestatus_position]}"
else
result_status=0
# use last non zero value
for val in ${__bats_pipe_eval_pipe_status[@]}; do
if (( val != 0)); then
result_status=$val
fi
done
fi

# Supplying -N (e.g. -0) will instead always use the exit code of the command
# at that position in the chain.

local pipestatus_position=-1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding the possibility to use negative values like --1 to always use the last status, --2 next to last status, and so on

Suggested change
local pipestatus_position=-1
local pipestatus_position=

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose negative indexing (as used in linux's bash, python, etc) would be an interesting option. But the suggested syntax is quite peculiar. I don't think I've seen negative values used outside args to other parameters (e.g. --num -5, -X=-100, etc) or as a positional argument (e.g. some_command -a -f -- -2 -50 -100). Using negative values as optional arguments confuses the meaning of - and becomes more confusing when considering unix style arguments (where -- is for long parameters).

WDYT about adding in a separate argument that allows for negative values? Something like --returned-status=N (where N is a positive or negative value). The -N syntax would just be a shorthand (for positive values).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone ahead and added in --returned-status (which supports positive and negative values). I've also added in tests for this. Let me know what you think.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add this to the online documentation in the .md files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated documentation.

#########

@test "run bats_pipe with no commands" {
run bats_pipe

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some general hints for all tests here:

  1. use the -1 switch to do inline testing
  2. avoid echo $output, there is --print-output-on-failure for that and if run fails due to an exit code check, it should also print $output (in future releases)
  • also, stderr and stdout is not differentiated in bats output, so no need to redirect
  1. (maybe) place the line count check last, to get more localized error messages first

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use the -1 switch to do inline testing

No problem with changing [ "$status" -eq 1 ] to instead use run -1 blah.

... but I'm a little curious around intended flow and interplay with extension libs. My usage of [ "$status" -eq 1 ] to to mimic the functionality of bats-assert's assert_failure command (or assert_success).

I happen to use bats-assert for my projects as I find it as a very nice addition to bats. Obviously, I'm not using it here as it would present a project circular dependency. However, there is this overlap of functionality with run -N. I'm curious about intention over that. If it's as simple as "authors could use whichever" and "run -N is less verbose than [ "$status" -eq N ]", then that's cool.

FWIW, I like [ "$status" -eq N ] or assert_failure N as it's easier to read what the test is doing (run -N is a bit more opaque when reading).

Sorry for asking and dragging out conversation. 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid echo $output

Done.

[ "${lines[0]}" = 'Will return status 4.' ]
}

@test "run bats_pipe for Nth error status too large" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we added support for negative indices above, please add tests for the range check and actual return code position here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests for --returned-status as discussed in other comment thread.

Allows for easily processing output from the command being run, and
properly propagate status.
  * Working out a few issues and clarifying values.
  * Adding comments to clarify flow.
* Leaving previous functions for the time being for comparison.
* Adding comment to detail unquoted var with disable.
…debug.

* Adding test cases for parameters with spaces.
* Switching out hexdump for the (more widely available) od.
  * Updating assertions to match od output.
* Cleaning up / consolidating / simplifying bats_pipe function.
* Moving if conditions to use c-style parenthesis construct.
  * Using explicit parenthesis to not rely upon operator precedence.
* Changing eval array expansion to use `[@]` instead of `[*]`.
  * Likely not a big impact in this exact use case, but this is the typical syntax for desired behavior -- less likely to have undesired surprises.
  * Supports positive and negative values.
  * Negative values count from the end in reverse, similar to (linux's) bash and python.
  * Can be used as either `--return-status N` or `--return-status=N`.
* Adding tests to validate `--return-status` functionality.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants