Data packets are misaligned when an SFTP timeout is caught by outside code and downloading continues #1959
Comments
I have considered several approaches to resolving this issue, but none of them is perfect, so I'm commenting here to discuss them. Initially, I focused on the unconsumed packets left over after a timeout. One possible solution is to send a cleanup flag packet with a random ID, using some harmless packet type. However, SFTP is a stateful protocol, and simply discarding the packets corresponding to timed-out OPEN requests would result in resource leaks on the server side, so this may not be a good approach. Later, I considered actively closing the channel when a timeout occurs. However, the RFC does not require the server to clean up open files when a channel is closed, and when I tried sending SSH_MSG_CHANNEL_CLOSE to my test SFTP server, the server abruptly closed the entire TCP connection. The last solution I can think of is to call …
First, nice sleuthing!
I think the best fix would be for phpseclib to support multiple channels, so that each SFTP transfer would be on its own dedicated channel. If the server times out, as you described, it wouldn't matter, because the next download or upload would be on a whole new channel. This would also enable multiple concurrent file downloads. Unfortunately, that would be a pretty significant rewrite of SSH2.php and SFTP.php, would only be included in a major release, and it's very doubtful the next major release will happen even next year.
I could always make it public. Although I feel like just creating a new SFTP object would be sufficient as well?
It sounds like the next major release is unlikely to happen soon. Since I am using phpseclib indirectly, I am not able to access the SFTP instance held internally by flysystem. I submitted a PR to thephpleague/flysystem to catch the exception.
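For reference, the "just create a new SFTP object" workaround generalises to a simple pattern: on timeout, discard the whole client (and any half-consumed responses with it) and reconnect. The following is a hypothetical Python sketch of that pattern — `StaleConnectionError`, `FlakyClient`, and `download_all` are invented names for illustration, not flysystem or phpseclib APIs:

```python
class StaleConnectionError(Exception):
    """Raised when the transport times out mid-request."""

class FlakyClient:
    """Toy client: the first attempt to fetch 'b' times out, simulating a
    network stall that leaves unread responses queued on the wire."""
    def __init__(self):
        self.stalled = False

    def download(self, path):
        if path == "b" and not self.stalled:
            self.stalled = True
            raise StaleConnectionError(path)
        return f"data:{path}"

def download_all(paths, make_client):
    """Download each path; on a timeout, throw the client away and
    reconnect, so stale responses cannot be matched to later requests."""
    client = make_client()
    results = {}
    for path in paths:
        try:
            results[path] = client.download(path)
        except StaleConnectionError:
            client = make_client()   # fresh connection, clean packet state
            results[path] = None     # record the failure and move on
    return results

results = download_all(["a", "b", "c"], FlakyClient)
```

The key design point is that the failed client is never reused: any response still in flight dies with the old connection instead of being misattributed to a later request.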
I believe I have identified a bug. I am downloading multiple files in a loop. The code is based on Laravel and uses league/flysystem-sftp-v3 as the underlying storage driver. The relevant part is as follows:

I use a try-catch block to catch exceptions that occur while downloading a remote file and then continue with the next file. However, I've noticed that sometimes, after a network error occurs, the downloaded file size does not match the size from the metadata.
This issue occurs sporadically, and over several months I only managed to reproduce it a few times. I enabled `NET_SFTP_LOGGING` and added some extra information to the SFTP logs for debugging. Finally, I obtained enough log information to reconstruct the entire sequence of events. Here are the key logs: sftp.log
Everything begins with a `\phpseclib3\Net\SFTP::close_handle` call in which `$this->get_sftp_packet()` times out. Since the `get_sftp_packet` method returns `false` on timeout, `$response` does not equal `NET_SFTP_STATUS` in this scenario, so an UnexpectedValueException is thrown. As shown in line 69 of the log, the `close_handle` method throws an UnexpectedValueException when it times out waiting for a `NET_SFTP_STATUS` packet.

The try-catch block in the loop catches the exception, records the message, and tries to download the next file. Since the network is still stalled, the `NET_SFTP_OPEN` also times out and throws another UnexpectedValueException, as shown in line 72 of the log. The loop keeps moving on to the next file, and the client keeps sending OPEN packet after OPEN packet, until at some point the stalled network resumes delivering responses.
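The core of this failure can be reduced to a toy model (plain Python, not phpseclib code; the packet names and ids below are invented for illustration): once a request is abandoned on timeout, its response is still in flight, and a reader that simply takes the next packet off the wire will pair that stale response with a later, unrelated request.

```python
from collections import deque

def next_response(wire):
    """A naive reader: whatever packet arrives next is treated as the
    answer to the request we are currently waiting on."""
    return wire.popleft() if wire else None  # None models a timeout

# The client sends CLOSE (id 5); the network stalls and the client gives
# up -- but the server still answers eventually, so the STATUS response
# stays queued on the wire.
wire = deque([("STATUS", 5)])

# After the network recovers, the client sends OPEN (id 6) and the server
# appends the matching HANDLE behind the stale STATUS.
wire.append(("HANDLE", 6))

# The reader now hands the stale STATUS (for CLOSE id 5) to the code that
# is waiting on OPEN id 6 -- wrong packet type and wrong id.
kind, packet_id = next_response(wire)
```

From this point on, every request is answered by the response of the previous one, which is exactly the one-packet skew the logs below show.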
As shown in line 147 of the log, the network resumes and a `NET_SFTP_STATUS` packet arrives. It should have corresponded to the `NET_SFTP_CLOSE` packet (log line 68) sent before the network stalled. But the `get` method has no idea about this and returns `false`, so league/flysystem-sftp-v3 throws an UnableToReadFileException, recorded in line 148 of the log. The loop then catches it again and tries to download the next file.

Coincidentally, the next `NET_SFTP_OPEN` packet gets a `NET_SFTP_HANDLE` packet "as expected" (log line 151), which should actually have corresponded to the first `NET_SFTP_OPEN` (log line 71) sent during the network outage. But the `get` method again has no idea about this and happily sends 32 × `NET_SFTP_READ` packets to download the file.

When `get_sftp_packet(0)` is called to fetch the response data for the first `NET_SFTP_READ` (with index 0), the responses to the previously timed-out `NET_SFTP_OPEN`s arrive with packet_id `1` (log lines 184-209), and the `get_sftp_packet` method keeps overwriting the buffer for packet_id `1` with them. When the `NET_SFTP_DATA` with packet_id `0` is finally received (log line 210), its 32k data block is written to the local file.

Then, calling `get_sftp_packet(1)` returns the buffered `NET_SFTP_STATUS` packet (log line 211), which should have corresponded to the last `NET_SFTP_OPEN` sent during the network outage. But the `get` method has no idea about this either, and it starts clearing the remaining responses by reading them without handling them. In the process, when `get_sftp_packet(2)` is called, a `NET_SFTP_DATA` with packet_id `1`, which corresponds to the recent `NET_SFTP_READ`, is placed into the buffer for packet_id `1` (log line 212).

The rest of the response packets are read and dropped (lines 213-242), and the file can be closed successfully (lines 243-244). However, since the `NET_SFTP_HANDLE` is misaligned with the `NET_SFTP_OPEN`, the downloaded file content is not what was expected; the downloaded file also only contains the first 32k of data.

At this point, all packet buffers except the one for packet_id `1` are empty. The leftover packet_id `1` buffer affects subsequent file downloads: every call to `get_sftp_packet(1)` returns the previously buffered packet, and every call to `get_sftp_packet(2)` fills the packet_id `1` buffer with new data. As a result, downloaded files contain wrong content between 32k and 64k.