Race condition causes ScpTest.testScpNativeOnMultipleFiles() to fail (flaky test) #266

Closed
tomaswolf opened this issue Nov 3, 2022 · 0 comments · Fixed by #265
@tomaswolf

The ultimate problem is a general one concerning ChannelExecs:

If

  • there is a communication protocol implemented between the client and the remote command via the stdin/stdout streams,
  • and if the protocol is such that the remote command can return an error code in this application protocol,
  • and the client uses the default inverted output stream to read that error code,
  • and the client then decides to close the channel (even gracefully),

then it is possible that the AbstractClientChannel's out stream is closed between lines 436 and 437, so that the subsequent flush() call on the ChannelPipedOutputStream fails, causing the whole session to go down.

This can happen in the ScpTest at line 343, which tests an scp upload command that should fail. The scp server sends back an ERROR ack, which the client receives (on an I/O thread) and forwards in AbstractClientChannel. The client reads this through the ChannelPipedInputStream (on the main thread) and throws an exception, which then causes the channel to be closed at the end of DefaultScpClient.runUpload() (still on the main thread). Only then does the I/O thread call flush(), which fails with an exception because the stream is already closed.

So in short we have

  1. I/O thread: receive data
  2. I/O thread: write data to ChannelPipedOutputStream
  3. Main thread: read data from ChannelPipedInputStream
  4. Main thread: close channel
  5. I/O thread: call ChannelPipedOutputStream.flush() and fail.
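
The five steps above can be replayed deterministically. The following is only a minimal sketch, not the actual Apache MINA SSHD code: `StrictPipedOutput` is a hypothetical stand-in for a stream whose flush() rejects a closed stream, and the two threads are replaced by sequential calls in the problematic order so that the outcome does not depend on timing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class FlushRaceSketch {

    /** Hypothetical stream: hands data to its sink immediately, but flush() fails once closed. */
    static final class StrictPipedOutput extends OutputStream {
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private boolean closed;

        @Override
        public void write(int b) throws IOException {
            if (closed) {
                throw new IOException("write on closed stream");
            }
            sink.write(b); // the reader can see the byte right away
        }

        @Override
        public void flush() throws IOException {
            if (closed) {
                throw new IOException("flush on closed stream");
            }
            // nothing is buffered, so there is nothing to do here
        }

        @Override
        public void close() {
            closed = true;
        }

        byte[] received() {
            return sink.toByteArray();
        }
    }

    /** Replays steps 1 to 5 in the order described in the issue. */
    static String replay() {
        StrictPipedOutput out = new StrictPipedOutput();
        try {
            out.write(2);                  // steps 1-2: "I/O thread" receives and writes the data
            byte[] ack = out.received();   // step 3: "main thread" reads the data
            if (ack.length != 1) {
                return "unexpected: data not delivered";
            }
            out.close();                   // step 4: "main thread" closes the channel
            out.flush();                   // step 5: "I/O thread" flushes too late
            return "ok";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints "failed: flush on closed stream"
    }
}
```

Note that the data had already been delivered to the reader by the time flush() failed; the exception carries no useful information.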

This is all the more infuriating since ChannelPipedOutputStream.flush() is essentially a no-op. Moreover, an exception raised while a channel forwards data it has received should take down only that channel, not the whole session.
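
To illustrate the second point, here is a rough sketch of per-channel failure isolation. All names (`Channel`, `Session`, `RecordingChannel`, `dispatch`) are hypothetical and chosen for illustration only; this is not how the library is structured, just the general idea that a forwarding failure closes the affected channel and leaves the session and its other channels alone.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ChannelIsolationSketch {

    /** Hypothetical minimal channel: forwards incoming data and can be closed on its own. */
    interface Channel {
        void forward(byte[] data) throws IOException;

        void close();
    }

    /** Hypothetical session owning several channels. */
    static final class Session {
        final List<Channel> open = new ArrayList<>();
        boolean sessionClosed;

        /** A forwarding failure closes only the affected channel, never the session. */
        void dispatch(Channel channel, byte[] data) {
            try {
                channel.forward(data);
            } catch (IOException e) {
                channel.close();
                open.remove(channel);
            }
        }
    }

    /** Test double that either counts forwarded bytes or always fails. */
    static final class RecordingChannel implements Channel {
        final boolean failing;
        boolean closed;
        int bytes;

        RecordingChannel(boolean failing) {
            this.failing = failing;
        }

        @Override
        public void forward(byte[] data) throws IOException {
            if (failing) {
                throw new IOException("forwarding failed");
            }
            bytes += data.length;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    static String demo() {
        Session session = new Session();
        RecordingChannel good = new RecordingChannel(false);
        RecordingChannel bad = new RecordingChannel(true);
        session.open.add(good);
        session.open.add(bad);

        session.dispatch(bad, new byte[] {1});     // fails, closes only 'bad'
        session.dispatch(good, new byte[] {1, 2}); // unaffected

        return "sessionClosed=" + session.sessionClosed
                + ", open=" + session.open.size()
                + ", goodBytes=" + good.bytes
                + ", badClosed=" + bad.closed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```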

@tomaswolf tomaswolf added the bug An issue describing a bug in the code label Nov 3, 2022
@tomaswolf tomaswolf added this to the 2.9.2 milestone Nov 3, 2022
@tomaswolf tomaswolf self-assigned this Nov 3, 2022
tomaswolf added a commit to tomaswolf/mina-sshd that referenced this issue Nov 4, 2022
ChannelPipedOutputStream passes on data immediately to its sink.
Flushing such a stream is thus a no-op, and must not throw an
exception even when the stream is already closed. Otherwise there
may be spurious failures if the reader of the sink decides to
close the whole channel before the channel has flushed the
ChannelPipedOutputStream.

Fixes apache#266.

Bug: apache#266
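
The idea in the commit message can be sketched as follows. This is an illustration under assumptions, not the code from the referenced commit: `ImmediatePipedOutput` is a hypothetical class that, like the stream described above, passes data straight to its sink. Because nothing is ever buffered, flush() has no work to do and therefore does not check the closed state, while write() still rejects a closed stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class LenientFlushSketch {

    /** Hypothetical stream that passes every byte to its sink immediately. */
    static final class ImmediatePipedOutput extends OutputStream {
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private boolean closed;

        @Override
        public void write(int b) throws IOException {
            if (closed) {
                throw new IOException("write on closed stream");
            }
            sink.write(b);
        }

        @Override
        public void flush() {
            // Deliberately a no-op, even after close(): there is never pending data.
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Returns true if flushing a closed stream completes without an exception. */
    static boolean flushAfterCloseSucceeds() {
        ImmediatePipedOutput out = new ImmediatePipedOutput();
        out.close();
        out.flush();
        return true;
    }

    /** Returns true if writing to a closed stream is still rejected. */
    static boolean writeAfterCloseFails() {
        ImmediatePipedOutput out = new ImmediatePipedOutput();
        out.close();
        try {
            out.write(1);
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(flushAfterCloseSucceeds()); // prints "true"
        System.out.println(writeAfterCloseFails());    // prints "true"
    }
}
```

With this behavior, a reader that closes the channel between the write and the flush no longer triggers a spurious failure, and genuine misuse (writing after close) is still reported.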