Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client does not work the same when using the async api as with the synchronous api #426

Open
alexmherrmann opened this issue Oct 24, 2023 · 1 comment

Comments

@alexmherrmann
Copy link

Version

2.11.0

Bug description

I have not found any examples of what I am trying to do in official documentation.

I have two tests here (in kotlin):

package minatest
import com.google.common.truth.Truth
import org.apache.sshd.client.SshClient
import org.junit.jupiter.api.MethodOrderer.OrderAnnotation
import org.junit.jupiter.api.Order
import org.junit.jupiter.api.Test
import org.junit.jupiter.api.TestMethodOrder
import org.junit.jupiter.api.Timeout
import org.slf4j.LoggerFactory
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.boot.test.context.SpringBootTest
import org.springframework.test.context.ActiveProfiles
import reactor.kotlin.test.test
import java.time.Duration.ofSeconds
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit.SECONDS

	val client = SshClient.setUpDefaultClient().apply {
		start()
	}

	@Test
	fun sanityTest() {
		val instance = requestor.getExistingInstance(existingInstance).block(ofSeconds(5))

		val ipv4 = instance.getPublicIpv4Address()

		val connection = client.connect(
			"root",
			ipv4,
			22
		).verify(ofSeconds(5))

		val output = connection
			.session
			.run {
				auth().verify(ofSeconds(5))
				executeRemoteCommand("uptime")
			}

		logger.info("Output: $output")

	}

	@Test
	fun secondSanityNowAsync() {
		val instance = requestor
			.getExistingInstance(existingInstance)
			.block(ofSeconds(10))
		val connected = client.connect("root", instance.getPublicIpv4Address(), 22)

		val blocker = CountDownLatch(1)
		connected.addListener { conn ->
			logger.info("Connected: $conn")
			val sesh = conn.session
			val auth = sesh.auth().addListener { authed ->
				conn.verify(0)
				authed.verify(0)
				logger.info("Authed: $authed")
				val output = sesh.executeRemoteCommand("uptime")
				logger.info("Output: $output")
				Truth.assertThat(output).contains("load")
				// NOTIFY OF COMPLETION
				blocker.countDown()
			}
		}

		assert(blocker.await(10, SECONDS))

	}

The synchronous one works just fine. I call verify twice and then I am set up to go. However I'm going for an async architecture and would rather do something like the second as it will eventually be integrated into a project reactor application.

The first one runs just fine, the second one hangs forever on the connected.session.executeRemoteCommand(...). Setting a breakpoint it makes it all the way to the second listener (i.e. I get the "Authed: ..." log message) and then never makes it to the "Output: ".

Actual behavior

The test fails to complete in any amount of time.

Expected behavior

The executeRemoteCommand should execute

Relevant log output

From the run where it hangs:

2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.s.AbstractClientSession: createExecChannel(ClientSessionImpl[root@/142.93.54.117:22])[uptime] created id=0 - PTY=null
2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.c.AbstractClientChannel: open(ChannelExec[id=0, recipient=-1]-ClientSessionImpl[root@/142.93.54.117:22]) Send SSH_MSG_CHANNEL_OPEN - type=session
2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.s.h.AbstractSession: encode(ClientSessionImpl[root@/142.93.54.117:22]) packet #7 sending command=90[SSH_MSG_CHANNEL_OPEN] len=24
2023-10-24T14:27:51,113 DEBUG [sshd-SshClient[15fdd1f2]-nio2-thread-9] o.a.s.c.i.n.Nio2Session: writeBuffer(Nio2Session[local=/192.168.86.247:63735, remote=/142.93.54.117:22]) writing 60 bytes

java.lang.AssertionError: Assertion failed (on the countdownlatch block)

The last few logs from when it runs successfully:

2023-10-24T14:15:39,701 DEBUG [main] o.a.s.c.s.AbstractClientSession: createExecChannel(ClientSessionImpl[root@/142.93.54.117:22])[uptime] created id=0 - PTY=null
2023-10-24T14:15:39,702 DEBUG [main] o.a.s.c.c.AbstractClientChannel: open(ChannelExec[id=0, recipient=-1]-ClientSessionImpl[root@/142.93.54.117:22]) Send SSH_MSG_CHANNEL_OPEN - type=session
2023-10-24T14:15:39,702 DEBUG [main] o.a.s.c.s.h.AbstractSession: encode(ClientSessionImpl[root@/142.93.54.117:22]) packet #7 sending command=90[SSH_MSG_CHANNEL_OPEN] len=24
2023-10-24T14:15:39,702 DEBUG [main] o.a.s.c.i.n.Nio2Session: writeBuffer(Nio2Session[local=/192.168.86.247:63598, remote=/142.93.54.117:22]) writing 60 bytes
2023-10-24T14:15:40,374 DEBUG [sshd-SshClient[15fb7a32]-nio2-thread-11] o.a.s.c.s.h.AbstractSession: doHandleMessage(ClientSessionImpl[root@/142.93.54.117:22]) process #8 SSH_MSG_GLOBAL_REQUEST
2023-10-24T14:15:40,375 DEBUG [sshd-SshClient[15fb7a32]-nio2-thread-11] o.a.s.c.s.h.AbstractConnectionService: globalRequest(ClientConnectionService[ClientSessionImpl[root@/142.93.54.117:22]]) received SSH_MSG_GLOBAL_REQUEST hostkeys-00@openssh.com want-reply=false

Other information

No response

@tomaswolf
Copy link
Member

The basic problem here is a mixture of asynchronous mechanisms (futures, and listeners) and then a listener making a synchronous executeRemoteCommand call.

The immediate cause for blocking is that internally a lock is held, which prevents any more incoming messages for that session being handled until that listener is done. "Fixing" this in a transparent and scalable way inside the library looks very difficult.

A quick work-around for your case would be to make that listener use a different thread for everything after logger.info("Authed: $authed").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants